ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language with Graph, Attention and BRNet
In this work, we improve the previous state-of-the-art method, ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language, by adding graph and attention mechanisms. ScanRefer is a neural network architecture that localizes objects in 3D point clouds given natural language descriptions of the target objects. We improve the object detection module of ScanRefer by replacing VoteNet with the Back-tracing Representative Points Network (BRNet). We also propose a method for scene-language understanding and object-relationship understanding that combines a graph neural network, language self-attention, and a cross-modal attention mechanism.
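As a rough illustration of the cross-modal attention step described above, here is a minimal PyTorch sketch in which object proposal features attend over language token features. The module, names, and dimensions are illustrative assumptions, not the actual code in this repository.

```python
# Minimal sketch of cross-modal attention between object proposals and
# language tokens. All names and dimensions are illustrative assumptions,
# not the actual modules in this repository.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, d_model=128, num_heads=4):
        super().__init__()
        # Proposals act as queries; language tokens provide keys/values.
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, proposal_feats, lang_feats, lang_padding_mask=None):
        # proposal_feats: (B, num_proposals, d_model) from the detector (e.g. BRNet)
        # lang_feats:     (B, num_tokens, d_model) from the language encoder
        # lang_padding_mask: (B, num_tokens), True where a token is padding
        fused, weights = self.attn(
            query=proposal_feats,
            key=lang_feats,
            value=lang_feats,
            key_padding_mask=lang_padding_mask,
        )
        # Residual connection keeps the original geometric features.
        return self.norm(proposal_feats + fused), weights

# Toy usage with random features
B, P, T, D = 2, 256, 30, 128
module = CrossModalAttention(d_model=D)
fused, attn_weights = module(torch.randn(B, P, D), torch.randn(B, T, D))
print(fused.shape, attn_weights.shape)  # (2, 256, 128) (2, 256, 30)
```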
Please refer to the ScanRefer project website for setup instructions.
To train the ScanRefer-GAB model with RGB values:
python scripts/train.py --use_color --use_brnet --use_self_attn --use_dgcnn --use_cross_attn
For more training options (like using preprocessed multiview features), please run scripts/train.py -h.
To evaluate the trained ScanRefer-GAB models, please find the folder under outputs/ named with the timestamp of the training run and run:
python scripts/eval.py --folder <folder_name> --reference --use_color --use_brnet --use_self_attn --use_dgcnn --use_cross_attn --no_nms --force --repeat 5
To visualize the localization results predicted by the trained ScanRefer-GAB model in a specific scene, please find the corresponding folder under outputs/ named with the timestamp of the training run and run:
python scripts/visualize.py --folder <folder_name> --scene_id <scene_id> --use_color --use_brnet --use_self_attn --use_dgcnn --use_cross_attn
To visualize the attention weights, run:
python scripts/visualize_selfattention.py --folder <folder_name> --scene_id <scene_id> --use_color --use_brnet --use_self_attn --use_dgcnn --use_cross_attn
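If you want to inspect the weights by hand instead, the following is a minimal sketch of rendering a language self-attention matrix as a heatmap. The token list and weight matrix are random placeholders; the real values would come from the outputs of scripts/visualize_selfattention.py, whose exact output format is defined by the script itself.

```python
# Minimal sketch: render a language self-attention matrix as a heatmap.
# The tokens and weights below are random placeholders, not real outputs.
import numpy as np
import matplotlib.pyplot as plt

tokens = ["the", "brown", "chair", "next", "to", "the", "table"]
attn = np.random.rand(len(tokens), len(tokens))    # placeholder weights
attn = attn / attn.sum(axis=-1, keepdims=True)     # normalize each row

fig, ax = plt.subplots(figsize=(5, 5))
im = ax.imshow(attn, cmap="viridis")
ax.set_xticks(range(len(tokens)))
ax.set_xticklabels(tokens, rotation=45)
ax.set_yticks(range(len(tokens)))
ax.set_yticklabels(tokens)
fig.colorbar(im, ax=ax, label="attention weight")
plt.tight_layout()
plt.show()
```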
We would like to thank Dave Zhenyu Chen for the ScanRefer codebase.