This is the official code implementation for the paper "Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds" (ICCV 2021).
- SimCLR
- Shape Classification
- Semantic Segmentation
- Indoor Object Detection
- Outdoor Object Detection
The code was tested with the following environment: Ubuntu 18.04, Python 3.7, PyTorch 1.7.1, torchvision 0.8.2, and CUDA 11.1.
To set up the code for self-supervised pre-training, run the following commands:
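To confirm that a local setup matches the tested one, a small check like the following may help. It is a minimal sketch: the `TESTED` table only restates the package versions listed above, and the helper names `installed_version` and `version_status` are illustrative and not part of this repository.

```python
import importlib

# Package versions this README reports as tested.
TESTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def version_status(expected, found):
    """Classify a found version as 'ok', 'mismatch', or 'missing'."""
    if found is None:
        return "missing"
    # Ignore local build tags such as '+cu110' when comparing.
    return "ok" if found.split("+")[0] == expected else "mismatch"


if __name__ == "__main__":
    for name, expected in TESTED.items():
        found = installed_version(name)
        status = version_status(expected, found)
        print(f"{name}: expected {expected}, found {found} -> {status}")
```

A "mismatch" does not necessarily mean the code will fail; it only flags a difference from the tested environment.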
git clone /~
pip install -r requirements.txt
For downstream tasks, please refer to the Downstream Tasks section.
Please download the datasets from the following links:
ScanNet (subset): Please follow the instructions on the official website. The 25k-frame subset is sufficient for our model. You may also need to download the preprocessed data for evaluation here.
Make sure to put the files in the following structure:
|-- ROOT
| |-- BYOL
| |-- data
| |-- modelnet40_normal_resampled_cache
| |-- shapenet57448xyzonly.npz
| |-- scannet
| |-- scannet_frames_25k
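Before launching pre-training, it can be useful to check that the dataset files from the tree above are in place. The following is a minimal sketch, not part of this repository: `missing_entries` is a hypothetical helper, the paths are written relative to whichever directory you use as `data`, and it assumes `scannet_frames_25k` sits inside `scannet`.

```python
from pathlib import Path

# Entries from the tree above, relative to the data directory.
# Assumption: scannet_frames_25k is nested inside scannet.
EXPECTED = (
    "modelnet40_normal_resampled_cache",
    "shapenet57448xyzonly.npz",
    "scannet/scannet_frames_25k",
)


def missing_entries(data_dir, expected=EXPECTED):
    """Return the expected entries that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in expected if not (root / rel).exists()]
```

Calling `missing_entries` with the path to your own `data` directory returns an empty list when every entry is present, and otherwise lists what still needs to be downloaded or moved.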
Please run the following command:
python BYOL/
To switch between backbone architectures, edit the config file BYOL/config/config.yaml. The currently supported options are BYOL-pointnet-cls, BYOL-dgcnn-cls, BYOL-dgcnn-semseg, and BYOL-votenet-detection.
You can find the checkpoints of the pre-training and downstream tasks in our Google Drive.
For PointNet or DGCNN classification backbones, you can evaluate the learned representation with a linear SVM classifier by running the following commands:
For PointNet:
python BYOL/ -w /path/to/your/pre-trained/checkpoints
For DGCNN:
python BYOL/ -w /path/to/your/pre-trained/checkpoints
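For readers unfamiliar with the linear SVM protocol, the idea is to freeze the pre-trained encoder, extract one feature vector per shape, and fit a linear classifier on those vectors. The sketch below illustrates only that last step with scikit-learn's `LinearSVC`. It is not the repository's evaluation script: `linear_svm_accuracy` and `toy_features` are hypothetical names, and the synthetic clusters stand in for real encoder outputs.

```python
import numpy as np
from sklearn.svm import LinearSVC


def linear_svm_accuracy(train_x, train_y, test_x, test_y, c=1.0):
    """Fit a linear SVM on frozen features and return test accuracy."""
    clf = LinearSVC(C=c, max_iter=10000)
    clf.fit(train_x, train_y)
    return float(clf.score(test_x, test_y))


def toy_features(n_per_class, dim, seed):
    """Synthetic stand-in for encoder outputs: two well-separated clusters."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(loc=-2.0, scale=0.5, size=(n_per_class, dim))
    x1 = rng.normal(loc=2.0, scale=0.5, size=(n_per_class, dim))
    x = np.concatenate([x0, x1])
    y = np.concatenate([
        np.zeros(n_per_class, dtype=int),
        np.ones(n_per_class, dtype=int),
    ])
    return x, y


if __name__ == "__main__":
    train_x, train_y = toy_features(50, 8, seed=0)
    test_x, test_y = toy_features(20, 8, seed=1)
    print(linear_svm_accuracy(train_x, train_y, test_x, test_y))
```

In a real evaluation, `train_x` and `test_x` would be the features produced by the pre-trained backbone on the training and test splits, with the encoder weights left unchanged.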
You can convert the pre-trained checkpoints for use in different downstream tasks by running:
For VoteNet:
python BYOL/ --input_path /path/to/your/pre-trained/checkpoints --output_path /path/to/the/transformed/checkpoints
For other backbones:
python BYOL/ --input_path /path/to/your/pre-trained/checkpoints --output_path /path/to/the/transformed/checkpoints
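Checkpoint conversion of this kind often comes down to selecting the backbone weights and renaming their keys so that the downstream model can load them. The sketch below shows that general idea on a plain dictionary. It is an assumption about what such a conversion typically involves, not a description of the scripts above: `remap_keys` and the prefixes `online_network.` and `backbone.` are hypothetical.

```python
def remap_keys(state_dict, old_prefix, new_prefix=""):
    """Keep entries whose key starts with old_prefix and swap that prefix."""
    return {
        new_prefix + key[len(old_prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(old_prefix)
    }


if __name__ == "__main__":
    # Hypothetical key names, for illustration only.
    pretrained = {
        "online_network.conv1.weight": "w",
        "online_network.conv1.bias": "b",
        "predictor.fc.weight": "p",
    }
    print(remap_keys(pretrained, "online_network.", "backbone."))
```

With PyTorch, the input dictionary would typically come from `torch.load` and the remapped result would be written back with `torch.save`; the actual key names depend on the checkpoint and should be inspected before choosing prefixes.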
For fine-tuning and evaluation on downstream tasks, please refer to the corresponding repositories listed below. We sincerely thank all these authors for their nice work!
- Classification: WangYueFt/dgcnn
- Semantic Segmentation: AnTao97/dgcnn.pytorch
- Indoor Object Detection: facebookresearch/votenet
If you find our paper or code useful for your research, please cite the following paper:
title={Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds},
author={Huang, Siyuan and Xie, Yichen and Zhu, Song-Chun and Zhu, Yixin},
journal={arXiv preprint arXiv:2109.00179},