A deep residual auto-encoding approach to colorize images.

priyavrat-misra/image-colorizer

Image Colorization

Overview

This project is a Deep Convolutional Neural Network approach to solve the task of image colorization. The goal is to produce a colored image given a grayscale image.
At its heart, it uses a convolutional auto-encoder. The first few layers of the ResNet-18 model are used as the encoder, and the decoder consists of a series of deconvolution layers (i.e., upsample layers followed by convolutions) and residual connections.
The model is trained on a subset of the MIT Places365 dataset, consisting of 41,000 images of landscapes and scenes.

Approach

The images in the dataset are in the RGB colorspace. When they are loaded, they are converted to the LAB colorspace, which encodes the same information as RGB but separates brightness from color.
It has 3 channels: Lightness, A and B. The lightness channel can be used as the grayscale equivalent of a colored image, and the remaining 2 channels (A and B) contain the color information.
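
To make the channel split concrete, here is a minimal sketch of the standard sRGB to CIELAB conversion for a single pixel, using only the Python standard library. It assumes 8-bit sRGB input and the D65 reference white; the function name `srgb_to_lab` is hypothetical, and the project itself may well rely on an image library for this step instead.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to CIELAB (assumes the D65 white point)."""

    # 1. Undo the sRGB gamma curve to get linear RGB in [0, 1].
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # 2. Linear RGB -> CIE XYZ (standard sRGB/D65 matrix).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # 3. Normalize by the reference white and apply the LAB nonlinearity.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116 * fy - 16   # 0 (black) to 100 (white)
    a = 500 * (fx - fy)         # green (-) to red (+)
    b_chan = 200 * (fy - fz)    # blue (-) to yellow (+)
    return lightness, a, b_chan


# A neutral gray carries no color, so its A and B values are close to 0.
gray_l, gray_a, gray_b = srgb_to_lab(128, 128, 128)
```

Because a gray pixel has A and B near zero, the lightness channel alone describes it, which is why L works as the grayscale input.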

In a nutshell, the training process follows these steps:

  1. The lightness channel is separated from the other 2 channels and used as the model's input.
  2. The model predicts the A and B channels (or 'AB' for short).
  3. The loss is calculated by comparing the predicted AB and the corresponding original AB of the input image.
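
The three steps can be sketched in plain Python on a tiny "image" of two LAB pixels. This is only an illustration under stated assumptions: `split_lab`, `mse` and `dummy_model` are hypothetical names, the dummy model stands in for the real network, and the loss shown is Mean Squared Error, the loss this README names for training.

```python
def split_lab(lab_pixels):
    """Split a list of (L, A, B) pixels into the model input and the target."""
    lightness = [p[0] for p in lab_pixels]    # step 1: model input
    ab = [(p[1], p[2]) for p in lab_pixels]   # ground-truth color channels
    return lightness, ab


def dummy_model(lightness):
    """Hypothetical stand-in for the network: always predicts A = B = 0."""
    return [(0.0, 0.0) for _ in lightness]


def mse(predicted_ab, target_ab):
    """Mean squared error over every A and B value."""
    squared = [
        (pv - tv) ** 2
        for pred, target in zip(predicted_ab, target_ab)
        for pv, tv in zip(pred, target)
    ]
    return sum(squared) / len(squared)


lab_image = [(50.0, 10.0, -10.0), (70.0, 0.0, 20.0)]
l_channel, ab_target = split_lab(lab_image)   # step 1
ab_pred = dummy_model(l_channel)              # step 2
loss = mse(ab_pred, ab_target)                # step 3
```

In real training the loss would be backpropagated through the network; here it only shows which tensors are compared.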

More about the training process can be found here.

Steps

  1. Defining a model architecture:
    • The model follows an auto-encoder architecture, i.e., it has an encoder part and a decoder part.
    • The encoder extracts features from an image.
    • The decoder upsamples those features. In other words, it increases the spatial resolution.
    • Here, the encoder layers are taken from the ResNet-18 model, with the first conv layer modified to take a single channel as input (i.e., grayscale or lightness) rather than 3 channels.
    • The decoder uses nearest neighbor upsampling (for increasing the spatial resolution), followed by convolutional layers (for dealing with the depth).
    • A more detailed visualization of the model architecture can be seen here.
  2. Defining a custom dataloader:
    • When loading the images, it converts them to LAB and returns L and AB separately.
    • It also does a few data processing tasks, such as applying transforms and normalization.
  3. Training the model:
    • The model is trained for 64 epochs with Adam Optimization.
    • For calculating the loss between the predicted AB and the original AB, Mean Squared Error is used.
  4. Inference:
    • Inference is done with unseen images and the results look promising, or should I say "natural"? :)
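
As an illustration of the nearest neighbor upsampling mentioned in step 1, here is a minimal pure-Python sketch on a 2x2 feature map. The function name `upsample_nearest` and the scale factor of 2 are assumptions for the example; the actual decoder would use a deep learning framework's upsampling layer rather than nested lists.

```python
def upsample_nearest(feature_map, scale=2):
    """Repeat every value `scale` times along both height and width."""
    upsampled = []
    for row in feature_map:
        # Widen the row by repeating each value `scale` times.
        wide_row = [value for value in row for _ in range(scale)]
        # Repeat the widened row `scale` times to grow the height.
        upsampled.extend(list(wide_row) for _ in range(scale))
    return upsampled


small = [[1, 2],
         [3, 4]]
large = upsample_nearest(small)
# large is now 4x4: each input value fills a 2x2 block.
```

Upsampling of this kind adds no new information by itself, which is why the decoder follows it with convolutional layers.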

Results

More colorized examples can be found here.

TL;DR

Given an image, the model can colorize it.

Setup

  • Clone and change directory:
git clone "https://github.com/priyavrat-misra/image-colorization.git"
cd image-colorization/
  • Dependencies:
pip install -r requirements.txt

Usage

python colorize.py --img-path <path/to/image.jpg> --out-path <path/to/output.jpg> --res 360
# or, using the short flags:
python colorize.py -i <path/to/image.jpg> -o <path/to/output.jpg> -r 360

Note:

  • As the model is trained on 224x224 images, it gives the best results when --res is set to a lower resolution (<=480) and okay-ish results at around 720.
  • Setting --res higher than the input image's resolution won't increase the output's quality.
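
For reference, the flags shown under Usage could be declared with `argparse` roughly as follows. This is a sketch, not the actual contents of colorize.py: the flag names come from the commands above, while `build_parser`, the help strings, the `required=True` settings and the default of 360 are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the usage examples."""
    parser = argparse.ArgumentParser(description="Colorize a grayscale image.")
    # Marking both paths as required is an assumption.
    parser.add_argument("-i", "--img-path", required=True,
                        help="path to the input image")
    parser.add_argument("-o", "--out-path", required=True,
                        help="path for the colorized output")
    # The default of 360 is an assumption borrowed from the usage example.
    parser.add_argument("-r", "--res", type=int, default=360,
                        help="output resolution")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["-i", "in.jpg", "-o", "out.jpg", "-r", "360"])
```

argparse turns the dashes in `--img-path` and `--out-path` into underscores, so the values are read as `args.img_path` and `args.out_path`.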

Todo

  • define & train a model architecture
  • add argparse support
  • define a more residual architecture
  • use pretrained resnet-18 params for the layers used in the encoder & train the model
  • check how the colorization effect varies with image resolution
  • separate the model from the checkpoint file to a different file
  • complete README.md
  • deploy with flask
  • after that, host it maybe?

For any queries, feel free to reach out to me on LinkedIn.