This repository contains the code for training and generating the SOF (semantic occupancy field) used in the TOG submission *SofGAN: A Portrait Image Generator with Dynamic Styling*.
Clone the main SofGAN repo with `git clone --recursive /~https://github.com/apchenstu/softgan_test.git`. This repo will be automatically included in `softgan_test/modules`.
Create a root directory (e.g. `data`), and for each instance (e.g. `00000`) create a folder containing the seg images and calibrated camera poses. The folder structure looks like:
```
└── data
    ├── 00000                  # instance id
    │   ├── cam2world.npy      # camera extrinsics
    │   ├── cameras.npy
    │   ├── intrinsic.npy      # camera intrinsics
    │   ├── zRange.npy         # optional, only needed when training with depth
    │   ├── 00000.png          # seg maps
    │   ...
    │   └── 00029.png
    ├── 00001
    │   └── ...
    ...
    └── xxxxx
        └── ...
```
Download the example data from here. We provide a notebook for data preprocessing.
Ideally, SOF can be trained with your own datasets of multi-view face segmentation maps. Similar to SRNs, we use an "OpenCV"-style camera coordinate system, where the Y-axis points downwards (the up-vector points in the negative Y-direction), the X-axis points right, and the Z-axis points into the image plane. Camera poses are assumed to be in "camera2world" format, i.e., they denote the matrix transform that maps camera coordinates to world coordinates. Please specify `--orthogonal` during training if you are using orthogonal projection for your own data. Please also note that you might need to change the `sample_instances_*` and `sample_observations_*` parameters according to the number of instances and views in your own dataset.
As the accuracy of the camera parameters can strongly affect training, you can specify `--opt_cam` during training to optimize the camera parameters automatically.
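For reference, here is a minimal sketch of how the per-instance camera files could be written with NumPy. The array shapes (one 4×4 camera-to-world matrix per view, a single shared 3×3 intrinsic matrix) and the example values are assumptions for illustration; please follow the provided preprocessing notebook for the exact format expected by the code.

```python
import os
import numpy as np

# Illustrative only: shapes and values are assumptions, not the repo's exact spec.
num_views = 30

# One 4x4 camera-to-world matrix per view, in OpenCV convention:
# camera X points right, Y points down, Z points into the image plane.
cam2world = np.tile(np.eye(4, dtype=np.float32), (num_views, 1, 1))
cam2world[:, 2, 3] = -2.0   # e.g. place every camera at z = -2, looking towards the origin along +Z

# A shared 3x3 pinhole intrinsic matrix (fx, fy in pixels; cx, cy at the image center).
intrinsic = np.array([[512.0,   0.0, 256.0],
                      [  0.0, 512.0, 256.0],
                      [  0.0,   0.0,   1.0]], dtype=np.float32)

os.makedirs('data/00000', exist_ok=True)
np.save('data/00000/cam2world.npy', cam2world)
np.save('data/00000/intrinsic.npy', intrinsic)
```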
The training is done in two phases. First, train the network parameters with multi-view segmaps:

```
python train.py --config_filepath=./configs/face_seg_real.yml
```

Training might take 1 to 3 days depending on the dataset size and quality.
We use inverse rendering to expand the trained geometric sampling space with single-view segmaps collected from CelebAMask-HQ. An example config file is provided in `./configs/face_seg_single_view.yml`; notice that we set `--overwrite_embeddings` and `--freeze_networks` to `True`, and specify `--checkpoint_path` as the checkpoint trained in STEP 1. After training, you can access the corresponding latent code for each portrait by loading the checkpoint.
```
python train.py --config_filepath=./configs/face_seg_single_view.yml
```
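As a rough illustration of how the per-portrait latent codes might be retrieved, the sketch below loads a checkpoint with PyTorch and prints any parameters that look like embedding tables. The checkpoint path and key names here are assumptions; inspect your own checkpoint's state dict for the actual names.

```python
import torch

# Hypothetical path; point this at the checkpoint produced in STEP 2.
ckpt = torch.load('./checkpoints/face_seg_single_view.pth', map_location='cpu')
state_dict = ckpt.get('model', ckpt)  # some checkpoints nest the weights under a 'model' key

# Print every tensor whose name hints at per-instance latent codes / embeddings.
for name, value in state_dict.items():
    if torch.is_tensor(value) and ('latent' in name or 'embedding' in name):
        print(name, tuple(value.shape))  # e.g. (num_instances, latent_dim)
```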
A similar process can be used to back-project in-the-wild portrait images into a latent vector in the SOF geometric sampling space, which can then be used for multi-view portrait generation.
Please download the pre-trained checkpoint from either GoogleDrive or BaiduDisk (password: k0b8) and save it to `./checkpoints`.
Please follow `renderer.ipynb` in the SofGAN repo for free-view portrait generation.
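Free-view rendering only requires a set of camera-to-world poses in the convention described above. Below is an illustrative sketch that builds a circular orbit of such poses with NumPy; the orbit radius, the world up direction, and the output filename are assumptions, not values used by `renderer.ipynb`.

```python
import numpy as np

def look_at_cam2world(cam_pos, target, world_up):
    """Build a 4x4 camera-to-world matrix in OpenCV convention (X right, Y down, Z forward)."""
    forward = target - cam_pos
    forward = forward / np.linalg.norm(forward)   # camera +Z: into the scene
    right = np.cross(forward, world_up)
    right = right / np.linalg.norm(right)         # camera +X: image right
    down = np.cross(forward, right)               # camera +Y: image down
    c2w = np.eye(4, dtype=np.float32)
    c2w[:3, 0] = right
    c2w[:3, 1] = down
    c2w[:3, 2] = forward
    c2w[:3, 3] = cam_pos
    return c2w

# Orbit the camera around the origin on a horizontal circle (radius and up axis are assumptions).
world_up = np.array([0.0, 1.0, 0.0])
radius, num_views = 2.0, 30
poses = [look_at_cam2world(np.array([radius * np.sin(t), 0.0, -radius * np.cos(t)]),
                           target=np.zeros(3), world_up=world_up)
         for t in np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False)]
np.save('cam2world_orbit.npy', np.stack(poses))
```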
Once trained, SOF can be used to generate free-view segmentation maps for arbitrary instances in the geometric space. The inference code is provided as notebooks in `scripts`:
- Most testing code is included in `scripts/TestAll.ipynb`, e.g. generating multi-view images, modifying attributes, visualizing depth layers, and building a depth prior with marching cubes.
- To sample free-view portrait segmentations from the geometry space, please refer to `scripts/Test_MV_Inference.ipynb`.
- To visualize a trained SOF volume as in Fig. 5, please use `scripts/Test_Slicing.ipynb`.
- To calculate mIoU during SOF training (Fig. 9), please modify the model checkpoint directory and run `scripts/Test_mIoU.ipynb`; a generic mIoU sketch is given after this list.
- We also provide `scripts/Test_GMM.ipynb` for miscellaneous tasks such as fitting a GMM to the geometric space.
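For reference, mIoU between two segmentation label maps can be computed with a few lines of NumPy. The sketch below is a generic implementation, not the exact code in `scripts/Test_mIoU.ipynb`; the file names and the number of classes in the commented usage are assumptions.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union between two integer label maps of the same shape."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:                 # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Hypothetical usage: compare a rendered segmap against the ground-truth one.
# from PIL import Image
# pred = np.array(Image.open('rendered_00000.png'))
# gt   = np.array(Image.open('data/00000/00000.png'))
# print(mean_iou(pred, gt, num_classes=20))
```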
Thanks to vsitzmann for sharing the awesome idea of SRNs, which greatly inspired our design of SOF.