Join our Discord
neosr is an open-source framework for training super-resolution models. It provides a comprehensive and reproducible environment for achieving state-of-the-art image restoration results, making it suitable for the enthusiast community, professionals and machine learning academic researchers alike. It serves as a versatile platform and aims to bridge the gap between practical application and academic research in the field.
- Accessible: implements a wide range of the latest advancements in single-image super-resolution networks, losses, optimizers and augmentations. Users can easily explore, adapt and experiment with various configurations for their specific needs, even without coding skills.
- Efficient: optimized for faster training iterations, quicker convergence and low GPU requirements, making it the most efficient choice for both research and practical use cases.
- Practical: focuses on the real-world use of super-resolution to realistically restore degraded images in various domains, including photos, anime/cartoons, illustrations and more. It's also suitable for critical applications like medical imaging, forensics, geospatial and others (although caution should be taken in those cases).
- Reproducible: this framework emphasizes the importance of reproducible research. It provides deterministic training environments that can create bit-exact reproducible models (on the same platform), ensuring predictable and reliable results, which are essential for maintaining consistency in academic validation.
- Simple: features are easy to implement or modify. Code is written in readable Python, no fancy styling. All code is validated and formatted by ruff, mypy and torchfix.
For more information see our wiki.
Requires Python 3.12 and CUDA >=12.4.
Clone the repository and install via poetry:
git clone https://github.com/neosr-project/neosr
cd neosr
poetry install --sync
See the detailed Installation Instructions for more information.
Start training by running:
python train.py -opt options.toml
Where options.toml is a configuration file. Templates can be found in options.
Tip
Please read the wiki Configuration Walkthrough for an explanation of each option.
arch | option |
---|---|
Real-ESRGAN | esrgan |
SRVGGNetCompact | compact |
SwinIR | swinir_small , swinir_medium |
HAT | hat_s , hat_m , hat_l |
OmniSR | omnisr |
SRFormer | srformer_light , srformer_medium |
DAT | dat_small , dat_medium , dat_2 |
DITN | ditn |
DCTLSA | dctlsa |
SPAN | span |
Real-CUGAN | cugan |
CRAFT | craft |
SAFMN | safmn , safmn_l |
RGT | rgt , rgt_s |
ATD | atd , atd_light |
PLKSR | plksr , plksr_tiny |
RealPLKSR | realplksr , realplksr_s |
DRCT | drct , drct_l , drct_s |
MSDAN | msdan |
SPANPlus | spanplus , spanplus_sts , spanplus_s , spanplus_st |
HiT-SRF | hit_srf , hit_srf_medium , hit_srf_large |
HMA | hma , hma_medium , hma_large |
MAN | man , man_tiny , man_light |
light-SAFMN++ | light_safmnpp |
MoSR | mosr , mosr_t |
GRFormer | grformer , grformer_medium , grformer_large |
EIMN | eimn , eimn_a , eimn_l |
LMLT | lmlt , lmlt_tiny , lmlt_large |
DCT | dct |
KRGN | krgn |
PlainUSR | plainusr , plainusr_ultra , plainusr_large |
HASN | hasn |
FlexNet | flexnet , metaflexnet |
CFSR | cfsr |
Sebica | sebica , sebica_mini |
Note
For all arch-specific parameters, read the wiki.
arch | option |
---|---|
NinaSR | ninasr , ninasr_b0 , ninasr_b2 |
net | option |
---|---|
U-Net w/ SN | unet |
PatchGAN w/ SN | patchgan |
EA2FPN (bespoke, based on A2-FPN) | ea2fpn |
DUnet | dunet |
MetaGan | metagan |
optimizer | option |
---|---|
Adam | Adam or adam |
AdamW | AdamW or adamw |
NAdam | NAdam or nadam |
Adan | Adan or adan |
AdamW Win2 | AdamW_Win or adamw_win |
ECO strategy | eco , eco_iters |
AdamW Schedule-Free | adamw_sf |
Adan Schedule-Free | adan_sf |
F-SAM | fsam , FSAM |
SOAP | soap |
loss | option |
---|---|
L1 Loss | L1Loss , l1_loss |
L2 Loss | MSELoss , mse_loss |
Huber Loss | HuberLoss , huber_loss |
CHC (Clipped Huber with Cosine Similarity Loss) | chc_loss |
NCC (Normalized Cross-Correlation) | ncc_opt , ncc_loss |
Perceptual Loss | perceptual_opt , vgg_perceptual_loss |
GAN | gan_opt , gan_loss |
MS-SSIM | mssim_opt , mssim_loss |
LDL Loss | ldl_opt , ldl_loss |
Focal Frequency | ff_opt , ff_loss |
DISTS | dists_opt , dists_loss |
Wavelet Guided | wavelet_guided |
Perceptual Patch Loss | perceptual_opt , patchloss , ipk |
Consistency Loss (Oklab and CIE L*) | consistency_opt , consistency_loss |
KL Divergence | kl_opt , kl_loss |
MS-SWD | msswd_opt , msswd_loss |
FDL | fdl_opt , fdl_loss |
augmentation | option |
---|---|
Rotation | use_rot |
Flip | use_hflip |
MixUp | mixup |
CutMix | cutmix |
ResizeMix | resizemix |
CutBlur | cutblur |
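For paired training data, geometric augmentations such as flip and rotation only make sense if the low-quality (LQ) and ground-truth (GT) images receive the same transform. The sketch below illustrates that idea on tiny nested-list "images" using only the standard library. The option names `use_hflip` and `use_rot` come from the table above; the function names, the 0.5 probability and the clockwise rotation are assumptions for illustration, not a description of how neosr implements these augmentations.

```python
import random

Image = list[list[int]]  # tiny stand-in for an H x W image


def hflip(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rot90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment_pair(
    lq: Image,
    gt: Image,
    rng: random.Random,
    use_hflip: bool = True,
    use_rot: bool = True,
) -> tuple[Image, Image]:
    """Apply the same random flip and rotation to both images of a pair."""
    if use_hflip and rng.random() < 0.5:
        lq, gt = hflip(lq), hflip(gt)
    if use_rot and rng.random() < 0.5:
        lq, gt = rot90(lq), rot90(gt)
    return lq, gt


rng = random.Random(0)  # fixed seed so the demo is repeatable
lq = [[1, 2], [3, 4]]
gt = [[1, 2], [3, 4]]  # same content here, so alignment is easy to inspect
aug_lq, aug_gt = augment_pair(lq, gt, rng)
print(aug_lq == aug_gt)
```

Drawing the random decisions once per pair, rather than once per image, is what keeps the LQ and GT images aligned.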
model | description | option |
---|---|---|
Image | Base model for SISR, supports both Generator and Discriminator | image |
OTF | Builds on top of image , adding Real-ESRGAN on-the-fly degradations | otf |
loader | option |
---|---|
Paired datasets | paired |
Single datasets (for inference, no GT required) | single |
Real-ESRGAN on-the-fly degradation | otf |
As part of neosr, I have released a dataset series called Nomos. The purpose of these datasets is to distill only the best images from the academic and community datasets. A total of 14 datasets were manually reviewed and processed, including: Adobe-MIT-5k, RAISE, LSDIR, LIU4k-v2, KONIQ-10k, Nikon LL RAW, DIV8k, FFHQ, Flickr2k, ModernAnimation1080_v2, Rawsamples, SignatureEdits, Hasselblad raw samples and Unsplash.
- Nomos-v2 (recommended): contains 6000 images, multipurpose. Data distribution:
pie
title Nomos-v2 distribution
"Animal / fur" : 439
"Interiors" : 280
"Exteriors / misc" : 696
"Architecture / geometric" : 1470
"Drawing / painting / anime" : 1076
"Humans" : 598
"Mountain / Rocks" : 317
"Text" : 102
"Textures" : 439
"Vegetation" : 574
- nomos_uni (recommended for lightweight networks): contains 2989 images, multipurpose. Meant to be used on lightweight networks (<800k parameters).
- hfa2k: contains 2568 anime images.
dataset download | sha256 |
---|---|
nomosv2 (3GB) | sha256 |
nomosv2.lmdb (3GB) | sha256 |
nomosv2_lq_4x (187MB) | sha256 |
nomosv2_lq_4x.lmdb (187MB) | sha256 |
nomos_uni (1.3GB) | sha256 |
nomos_uni_lq_4x | sha256 |
hfa2k | sha256 |
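The sha256 column lets a download be checked for corruption before training. A minimal sketch of such a check with Python's standard `hashlib` follows. The helper names `sha256_of` and `matches` are hypothetical, and the demo hashes a temporary file rather than a real archive; in practice the expected value would be the sha256 string published next to the download.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Chunked reads keep memory use flat even for multi-gigabyte archives.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a throwaway file; a real check would point at the downloaded archive.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"abc")
    expected = hashlib.sha256(b"abc").hexdigest()
    demo_ok = matches(demo, expected)
    demo_bad = matches(demo, "0" * 64)

print(demo_ok, demo_bad)
```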
Datasets made by the upscaling community. More info can be found in each author's repository.
- DF2k-BHI: a curated version of the classic DF2k dataset, made by @Phhofm. Read more about it here.
- 4xNomosRealWeb Dataset: realistically degraded LQs for the Nomos-v2 dataset (from @Phhofm).
- FaceUp: curated version of FFHQ.
- SSDIR: curated version of LSDIR.
- ArtFaces: curated version of MetFaces.
- Nature Dataset: curated version of iNaturalist.
- digital_art_v2: digital art dataset from @umzi2.
dataset | download |
---|---|
@Phhofm | HuggingFace |
@Phhofm 4xNomosRealWeb | Release page |
@Phhofm FaceUp | GDrive (4GB) |
@Phhofm SSDIR | GDrive (4.5GB) |
@Phhofm ArtFaces | Release page |
@Phhofm Nature Dataset | Release page |
@umzi2 Digital Art (v2) | Release page |
- Training Guide from @Sirosky
- Philip's YouTube channel
- OpenModelDB
- chaiNNer
Released under the Apache license. All licenses are listed in license/readme. This code was originally based on BasicSR.
Thanks to victorca25/traiNNer, styler00dollar/Colab-traiNNer and timm for providing helpful insights into some problems.
Thanks to active contributors @Phhofm, @Sirosky, and @umzi2 for helping with tests and bug reporting.