Our code was tested on Ubuntu 18.04 with Python 3.9, PyTorch 1.12, and CUDA 11.3. Follow these steps to reproduce our environment and results.
git clone https://github.com/xucao-42/mvas.git
cd mvas
conda env create -f environment.yml
conda activate mvas
mkdir ./data
cd ./code
Train on DiLiGenT-MV data (280 MB)
Download the data from Google Drive and extract it under the data folder.
Run
python exp_runner.py --config configs/diligent_mv.conf
Train on SymPS data (5.7 GB)
Download the data from Google Drive and extract it under the data folder.
Run
python exp_runner.py --config configs/symps_gargoyle.conf
# or you can try symps_house.conf and symps_moai.conf
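To train on all three SymPS objects in sequence, the command above can be generated per config file. The following is a minimal sketch, not part of the repository: `build_commands` and `run_all` are hypothetical helper names, and it assumes the script is launched from the code directory with the mvas environment active.

```python
import subprocess
import sys

# Config file names taken from the commands in this README.
SYMPS_CONFIGS = ["symps_gargoyle.conf", "symps_house.conf", "symps_moai.conf"]


def build_commands(config_names):
    """Build one exp_runner.py command (as an argument list) per config name."""
    return [
        [sys.executable, "exp_runner.py", "--config", f"configs/{name}"]
        for name in config_names
    ]


def run_all(commands):
    """Run the commands one after another and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_all(build_commands(SYMPS_CONFIGS))` starts the three training runs back to back; `check=True` raises an exception as soon as one of them exits with a non-zero status.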
Train on PANDORA data (2.7 GB)
Download the data from Google Drive and extract it under the data folder.
Run
python exp_runner.py --config configs/pandora.conf
Results will be saved in results/$obj_name/$exp_time.
DiLiGenT-MV
input_azimuth_maps: These are 16-bit RGBA images where the alpha channel represents the object mask and the RGB channels are identical. Each RGB channel can be converted to azimuth angles within [0, π] by multiplying it by π/65535. The azimuth angle is measured clockwise from the x-axis, which points to the right, and is consistent with OpenCV convention (x-axis to the right, y-axis downward). The azimuth maps do not need to be stored in the range [0, 2π], as our method is π-invariant.
vis_azimuth_maps: These are for visualization purposes only and are not used during training.
normal_maps: These are the normal maps used to create the input azimuth maps. We applied SDPS-Net independently in each view to obtain the normal maps.
params.json: This file is from PS-NeRF preprocessing and contains the camera intrinsic parameters, as the normal and azimuth maps are cropped to 400 x 400.
Calib_Results.mat: This file is from the original DiLiGenT-MV dataset and provides the camera extrinsic information.
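The pixel-to-angle conversion described for input_azimuth_maps can be sketched as follows. This is an illustration rather than the repository's loader: `decode_rgba_pixel` and `azimuth_to_direction` are hypothetical names, reading the image file is left out, and treating any non-zero alpha as "inside the mask" is an assumption.

```python
import math

MAX_UINT16 = 65535  # largest value of a 16-bit channel


def decode_rgba_pixel(pixel):
    """Return (azimuth in radians, inside_mask) for one 16-bit RGBA pixel.

    Assumption: any non-zero alpha marks an object pixel.
    """
    r, g, b, a = pixel
    # The RGB channels are identical, so any one of them carries the angle.
    azimuth = r * math.pi / MAX_UINT16
    return azimuth, a > 0


def azimuth_to_direction(azimuth):
    """Unit vector in image coordinates (x to the right, y downward)."""
    return math.cos(azimuth), math.sin(azimuth)


# Example: a pixel whose channels are at half of the 16-bit range.
azimuth, inside = decode_rgba_pixel((32768, 32768, 32768, 65535))
direction = azimuth_to_direction(azimuth)
```

Because the y-axis points downward, a positive angle applied as (cos θ, sin θ) in image coordinates appears as a clockwise rotation on screen, which matches the convention stated above.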
SymPS
input_azimuth_maps: These are 16-bit gray-scale images. The pixel values can be converted to azimuth angles within [0, π] by multiplying them by π/65535. The azimuth angle is measured clockwise from the x-axis, which points to the right, and is consistent with OpenCV convention (x-axis to the right, y-axis downward). The azimuth maps do not need to be stored in the range [0, 2π], as our method is π-invariant.
mask: Binary masks indicating the object silhouettes.
sparse: This folder contains Colmap-calibrated camera intrinsic and extrinsic information.
images_SfM: These are images used for structure from motion in Colmap.
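Since the method is π-invariant, an azimuth θ and θ + π carry the same information, which is why a range of [0, π] is enough for storage. The sketch below shows one way to fold an arbitrary angle into that range and quantize it to 16 bits, for example when preparing your own azimuth maps. The function names are hypothetical and the rounding scheme is an assumption, not necessarily how the provided maps were generated.

```python
import math

MAX_UINT16 = 65535  # largest value of a 16-bit pixel


def fold_to_half_circle(angle):
    """Map an angle in radians to the equivalent one in [0, pi), up to rounding."""
    return angle % math.pi


def encode_azimuth(angle):
    """Quantize an angle to a 16-bit value (inverse of value * pi / 65535)."""
    return round(fold_to_half_circle(angle) * MAX_UINT16 / math.pi)


def decode_azimuth(value):
    """Convert a 16-bit pixel value back to an angle in radians."""
    return value * math.pi / MAX_UINT16


# theta and theta + pi map to the same stored value.
theta = math.pi / 4
same = encode_azimuth(theta) == encode_azimuth(theta + math.pi)
```

The quantization step is π/65535 radians, so the round trip through `encode_azimuth` and `decode_azimuth` changes an angle by at most half of that step.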
PANDORA
input_azimuth_maps: These are 16-bit RGBA images where the alpha channel represents the object mask and the RGB channels are identical. Each RGB channel can be converted to azimuth angles within [0, π] by multiplying it by π/65535. The azimuth angle is measured clockwise from the x-axis, which points to the right, and is consistent with OpenCV convention (x-axis to the right, y-axis downward). The azimuth maps do not need to be stored in the range [0, 2π], as our method is π-invariant. Note that since PANDORA is a polarization image dataset, the azimuth maps have a π/2 ambiguity.
vis_azimuth_maps: These are for visualization purposes only and are not used during training.
sparse: This folder is from PANDORA and contains Colmap-calibrated camera intrinsic and extrinsic information.
images: These are for reference purposes only and are not used in training.
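To make the ambiguity in the PANDORA azimuth maps concrete: a stored angle φ is compatible with both φ and φ + π/2, each taken modulo π. The sketch below only enumerates the two candidates. The function name is hypothetical, and it does not show how the training code handles the ambiguity.

```python
import math


def candidate_azimuths(measured):
    """Return the two azimuths (radians, modulo pi) compatible with a measured angle."""
    first = measured % math.pi
    second = (measured + math.pi / 2) % math.pi
    return first, second


# A stored angle of pi/4 is compatible with pi/4 and 3*pi/4.
candidates = candidate_azimuths(math.pi / 4)
```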
Our implementation is built upon IDR and benefits from PS-NeRF and PANDORA.
@inproceedings{mvas2023cao,
title = {Multi-View Azimuth Stereo via Tangent Space Consistency},
author = {Cao, Xu and Santo, Hiroaki and Okura, Fumio and Matsushita, Yasuyuki},
year = {2023},
booktitle = CVPR,
}