GPU Model Training Roadmap

This document outlines the plan for training production-quality neural models to replace the lightweight test models shipped with Artefex v1.0.

Current State

Artefex v1.0 ships with test ONNX models - small Conv-ReLU-Conv networks with random weights. These models exercise the full neural pipeline (loading, tiling, padding, inference, output) but do not produce meaningful restoration improvements over the classical methods.
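To make the tiling and padding stages concrete, here is a minimal sketch of how an image can be split into overlapping tiles. It is not taken from src/artefex/neural.py: the function name tile_boxes, the 256-pixel tile size, and the 16-pixel overlap are all assumptions chosen for illustration.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def tile_boxes(width: int, height: int, tile: int = 256, overlap: int = 16) -> Iterator[Box]:
    """Yield boxes that together cover a width x height image.

    Hypothetical helper: tile size and overlap are illustrative defaults,
    not values read from the Artefex neural engine.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap
    # Stop once the remaining strip is already covered by the previous tile.
    for top in range(0, max(height - overlap, 1), stride):
        for left in range(0, max(width - overlap, 1), stride):
            # Clip tiles at the right and bottom edges of the image.
            yield left, top, min(left + tile, width), min(top + tile, height)
```

Edge tiles come out smaller than the nominal tile size. An engine would typically pad them up to the full tile before inference and crop the result afterwards.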

The classical restoration pipeline (deblocking, denoising, color correction, sharpening) works well for moderate degradation. GPU-trained models are expected to give better results on heavy degradation.

Training Infrastructure (Already Built)

Component	File	Status
Data generator	`train/generate_data.py`	Ready
Deblock trainer	`train/deblock_train.py`	Ready (U-Net, 1ch, L1 loss)
Denoise trainer	`train/denoise_train.py`	Ready (U-Net, 3ch, hybrid L1+MSE)
Test model generator	`train/create_test_models.py`	Ready
Model registry	`src/artefex/models_registry.py`	Ready (4 model slots)
Neural engine	`src/artefex/neural.py`	Ready (tiling, padding, CUDA)
Download system	`src/artefex/models_registry.py`	Ready (SHA-256 verification)

Models to Train

Phase 1: Core Models (v1.1)

1. deblock-v1 - JPEG Artifact Removal

Architecture: U-Net (64-128-256-512 channels), 1-channel (grayscale)
Training data: Clean images degraded with JPEG quality 10-70
Loss: L1 (already configured in deblock_train.py)
Target: PSNR improvement of 2-4 dB over classical deblocking
Estimated training: 4-8 hours on RTX 3060 or equivalent
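The PSNR target is a difference between two measurements taken against the same clean reference: one for the classical output and one for the model output. A minimal sketch of that arithmetic follows, assuming 8-bit pixel values held in flat sequences. The function names psnr and psnr_gain are illustrative and not part of the Artefex code base.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(clean: Sequence[float], classical: Sequence[float], neural: Sequence[float]) -> float:
    """dB gained by the neural output over the classical output (hypothetical helper)."""
    return psnr(clean, neural) - psnr(clean, classical)
```

Halving the per-pixel error raises PSNR by about 6 dB, so a 2-4 dB gain corresponds to cutting the RMS error by roughly 20% to 37%.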

2. denoise-v1 - Adaptive Denoising

Architecture: U-Net (48-96-192-384 channels), 3-channel (RGB)
Training data: Clean images with Gaussian noise (sigma 10-50)
Loss: 0.7 * L1 + 0.3 * MSE (already configured in denoise_train.py)
Target: PSNR improvement of 3-5 dB over classical median filter
Estimated training: 4-8 hours on RTX 3060 or equivalent
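The hybrid loss is a weighted sum of the mean absolute error and the mean squared error. The sketch below shows only the arithmetic on plain Python sequences. denoise_train.py presumably computes the same quantity on tensors, and the function name hybrid_loss is an assumption made for this example.

```python
from typing import Sequence


def hybrid_loss(
    pred: Sequence[float],
    target: Sequence[float],
    l1_weight: float = 0.7,
    mse_weight: float = 0.3,
) -> float:
    """Return l1_weight * L1 + mse_weight * MSE, both averaged over all elements."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(pred)
    l1 = sum(abs(p - t) for p, t in zip(pred, target)) / n
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    return l1_weight * l1 + mse_weight * mse
```

The L1 term is less sensitive to occasional large errors, while the MSE term penalizes them more strongly. The 0.7/0.3 weights are the ones stated above.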

3. sharpen-v1 - Detail Recovery

Architecture: U-Net, 3-channel (RGB)
Training data: Clean images degraded with Gaussian blur + downscale/upscale
Loss: L1 + perceptual loss (VGG feature matching)
Needs: New training script based on denoise_train.py template
Estimated training: 6-10 hours

4. color-correct-v1 - Color Correction

Architecture: U-Net, 3-channel (RGB)
Training data: Clean images with random channel shifts, white balance errors
Loss: L1 + color histogram loss
Needs: New training script based on denoise_train.py template
Estimated training: 4-6 hours
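One simple way to simulate the channel shifts and white balance errors in the training data is to scale each RGB channel by its own random gain. The sketch below is an assumption about how such a degradation could look, not the behaviour of generate_data.py. The names apply_channel_gains and random_color_cast and the 20% maximum shift are hypothetical.

```python
import random
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def apply_channel_gains(pixels: List[Pixel], gains: Tuple[float, float, float]) -> List[Pixel]:
    """Scale R, G and B by separate gains and clamp the result to the 0-255 range."""
    return [
        tuple(min(255, max(0, round(c * g))) for c, g in zip(px, gains))
        for px in pixels
    ]


def random_color_cast(pixels: List[Pixel], rng: random.Random, max_shift: float = 0.2):
    """Return (degraded pixels, gains) using gains drawn from 1 +/- max_shift (assumed range)."""
    gains = tuple(1.0 + rng.uniform(-max_shift, max_shift) for _ in range(3))
    return apply_channel_gains(pixels, gains), gains
```

Returning the gains alongside the degraded pixels makes it easy to log or reproduce the exact cast applied to each training pair.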

Phase 2: Extended Models (v1.3)

5. Super-resolution (2x/4x upscaling)

Architecture: ESRGAN or SwinIR variant
Training data: DIV2K/Flickr2K dataset pairs
Estimated training: 24-48 hours on A100 or equivalent

6. Inpainting (watermark/object removal)

Architecture: Partial convolution U-Net
Training data: Clean images with synthetic masks
Estimated training: 12-24 hours

7. Dehazing/defogging

Architecture: FFA-Net variant
Training data: RESIDE dataset or synthetic haze
Estimated training: 8-12 hours

Quick Start (One Command)

On your PC (RTX 3060 12GB, Ryzen 9 5900XT, 64GB RAM), the entire process is automated:

# 1. Install PyTorch with CUDA (one-time setup)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# 2. Get training images (any folder of 500+ high-quality photos works)
#    Option A: Use your own photos
#    Option B: Download DIV2K dataset (~3.3 GB)

# 3. Run everything - generates data, trains all 4 models, validates, imports
python train/train_all.py --source /path/to/clean/images --epochs 100

# That's it. Expected time: 12-16 hours on RTX 3060.
# You can leave it running overnight.

What Happens During Training

The train_all.py script runs 5 stages automatically:

1. Generate training pairs (~30 min) - creates degraded/clean image pairs for each model type (JPEG compression, noise, blur, color shift)
2. Train deblock-v1 (~3-4 hrs) - JPEG artifact removal (grayscale U-Net)
3. Train denoise-v1 (~3-4 hrs) - noise reduction (RGB U-Net)
4. Train sharpen-v1 (~3-4 hrs) - detail recovery (RGB U-Net)
5. Train color-correct-v1 (~2-3 hrs) - color correction (RGB U-Net)

After each model trains, it is validated (must improve PSNR by >1 dB) and imported into the artefex registry.
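The validation gate can be pictured as a threshold on the average PSNR gain over a held-out set. The sketch below is an assumption about its shape rather than the logic in train_all.py. In particular, which baseline the gain is measured against (the degraded input or the classical output) is not specified here, and the function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def mean_psnr_gain(baseline_db: Sequence[float], model_db: Sequence[float]) -> float:
    """Average per-image PSNR gain in dB of the model over the baseline."""
    if len(baseline_db) != len(model_db) or not baseline_db:
        raise ValueError("need one baseline and one model score per image")
    return mean(m - b for m, b in zip(model_db, baseline_db))


def passes_gate(baseline_db: Sequence[float], model_db: Sequence[float], min_gain_db: float = 1.0) -> bool:
    """True only if the mean gain strictly exceeds min_gain_db (the >1 dB rule)."""
    return mean_psnr_gain(baseline_db, model_db) > min_gain_db
```

Averaging per-image gains, rather than comparing two pooled averages, keeps one unusually easy or hard image from dominating the decision.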

Manual Training (Step by Step)

If you prefer to train one model at a time:

Step 1: Gather Training Data

# Generate degraded/clean training pairs for all model types
python train/generate_data.py --source /path/to/photos --output ./training_data --type deblock
python train/generate_data.py --source /path/to/photos --output ./training_data --type denoise
python train/generate_data.py --source /path/to/photos --output ./training_data --type sharpen
python train/generate_data.py --source /path/to/photos --output ./training_data --type color
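As an illustration of what a degraded/clean pair is, the sketch below builds one for the denoise case using the Gaussian noise range given for denoise-v1 (sigma 10-50). It works on a flat list of 8-bit pixel values and is not the code in generate_data.py. The function names are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def add_gaussian_noise(pixels: Sequence[int], sigma: float, rng: random.Random) -> List[int]:
    """Add zero-mean Gaussian noise to each pixel and clamp to the 0-255 range."""
    return [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in pixels]


def make_denoise_pair(
    clean: Sequence[int],
    rng: random.Random,
    sigma_range: Tuple[float, float] = (10.0, 50.0),
) -> Tuple[List[int], List[int], float]:
    """Return (noisy, clean, sigma) with sigma drawn uniformly from sigma_range."""
    sigma = rng.uniform(*sigma_range)
    return add_gaussian_noise(clean, sigma, rng), list(clean), sigma
```

Passing in a seeded random.Random instance makes the generated pairs reproducible between runs.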

Step 2: Train Models

# Train each model individually
python train/deblock_train.py --data ./training_data/deblock --epochs 100 --output ./models
python train/denoise_train.py --data ./training_data/denoise --epochs 100 --output ./models
python train/sharpen_train.py --data ./training_data/sharpen --epochs 100 --output ./models
python train/color_train.py --data ./training_data/color --epochs 100 --output ./models

Step 3: Import Models

artefex models import deblock-v1 ./models/deblock_v1.onnx
artefex models import denoise-v1 ./models/denoise_v1.onnx
artefex models import sharpen-v1 ./models/sharpen_v1.onnx
artefex models import color-correct-v1 ./models/color_correct_v1.onnx

Step 4: Validate

# Run the automated validation tests
pytest tests/test_model_validation.py -v

# Manual check
artefex restore degraded_photo.jpg restored.png
artefex compare degraded_photo.jpg restored.png

Hardware Requirements

Option	GPU	Training Time (all 4 models)	Cost
Your PC (RTX 3060 12GB)	RTX 3060	12-16 hours	Electricity only
Google Colab (free)	T4	24-32 hours	Free
Google Colab Pro	A100	8-12 hours	~$10
Local RTX 4090	RTX 4090	4-8 hours	Electricity only

Your RTX 3060 12GB is ideal - 12GB VRAM handles batch size 8 with room to spare. You can increase to --batch-size 16 for faster training.

Acceptance Criteria

A trained model is ready for release when:

PSNR improves by more than 1 dB over the classical method
No visual artifacts introduced on clean images
Model file size is under 50 MB
Inference time is under 500ms for a 256x256 patch on CPU
All 246 existing tests pass with the new model
The 10 model validation tests in test_model_validation.py pass
Manual visual inspection of 10+ diverse test images shows no visible defects
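The size and latency criteria are easy to check mechanically. The sketch below assumes that "50 MB" means 50 * 1024 * 1024 bytes and that the inference call is wrapped in a zero-argument callable. Both are assumptions, and none of these helper names exist in Artefex.

```python
import os
import time
from typing import Callable

MAX_MODEL_BYTES = 50 * 1024 * 1024  # assumed interpretation of "under 50 MB"
MAX_LATENCY_S = 0.5                 # 500 ms budget for one 256x256 patch on CPU


def size_ok(num_bytes: int) -> bool:
    """True if a model of num_bytes is strictly under the size limit."""
    return num_bytes < MAX_MODEL_BYTES


def model_size_ok(path: str) -> bool:
    """Check the size of the model file at path."""
    return size_ok(os.path.getsize(path))


def median_latency_s(run_once: Callable[[], object], warmup: int = 2, repeats: int = 10) -> float:
    """Median wall-clock time of run_once over several runs, after a short warm-up."""
    for _ in range(warmup):
        run_once()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    timings.sort()
    return timings[len(timings) // 2]


def latency_ok(run_once: Callable[[], object]) -> bool:
    """True if the median latency is under the 500 ms budget."""
    return median_latency_s(run_once) < MAX_LATENCY_S
```

Using the median after a warm-up keeps one-time costs such as model loading from distorting the measurement.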

After Training

Once models are trained and validated:

1. Run pytest tests/test_model_validation.py -v to confirm all 10 tests pass
2. Run pytest tests/ -v to confirm all 246 tests still pass
3. Commit the updated SHA-256 checksums in models_registry.py
4. Create a GitHub Release with the ONNX model files attached
5. Update the download URLs in models_registry.py that the artefex models download command uses

Once these steps are done, users who install artefex get neural restoration automatically.
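The checksums to commit can be computed with the Python standard library. This is a minimal sketch: the function name sha256_of_file is illustrative, and how models_registry.py stores the resulting value is not shown here.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat regardless of model size. For example, sha256_of_file("./models/deblock_v1.onnx") would give the value to record for deblock-v1.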