Topological Semantic Graph Memory for Image Goal Navigation (CoRL 2022 oral)

rllab-snu/TopologicalSemanticGraphMemory

TSGM: Topological Semantic Graph Memory

This repository contains a PyTorch implementation of our CoRL 2022 oral paper:

Nuri Kim, Obin Kwon, Hwiyeon Yoo, Yunho Choi, Jeongho Park, Songhwai Oh
Seoul National University

Project website: https://bareblackfoot.github.io/TopologicalSemanticGraphMemory

Abstract

This work proposes an approach to incrementally collect a landmark-based semantic graph memory and use the collected memory for image goal navigation. Given a target image to search, an embodied robot utilizes the semantic memory to find the target in an unknown environment. We present a method for incorporating object graphs into topological graphs, called Topological Semantic Graph Memory (TSGM). Although TSGM does not use position information, it can estimate 3D spatial topological information about objects.

TSGM consists of
(1) Graph builder that takes the observed RGB-D image to construct a topological semantic graph.
(2) Cross graph mixer that takes the collected memory to get contextual information.
(3) Memory decoder that takes the contextual memory as an input to find an action to the target.
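To make the idea of a topological semantic graph more concrete, here is a minimal sketch of a memory that holds image nodes, object nodes, and the edges between them. It is an illustration only, not the authors' graph builder: the class name `SemanticGraphMemory`, the cosine-similarity test, and the `sim_threshold` value are all assumptions, and embeddings are plain Python lists rather than network features.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class SemanticGraphMemory:
    """Hypothetical container: image nodes, object nodes, and their edges."""
    sim_threshold: float = 0.9                        # assumed value
    image_nodes: list = field(default_factory=list)   # image embeddings
    object_nodes: list = field(default_factory=list)  # (embedding, label)
    image_edges: set = field(default_factory=set)     # image-image links
    cross_edges: set = field(default_factory=set)     # image-object links
    current: int = -1                                 # last visited image node

    def update(self, image_emb, objects):
        """Add one observation; return the index of the image node it maps to."""
        # Find the most similar image node already in memory.
        best, best_sim = -1, -1.0
        for i, emb in enumerate(self.image_nodes):
            sim = cosine(image_emb, emb)
            if sim > best_sim:
                best, best_sim = i, sim
        if best_sim >= self.sim_threshold:
            node = best                               # revisit an existing node
        else:
            self.image_nodes.append(list(image_emb))  # create a new node
            node = len(self.image_nodes) - 1
        # Link consecutive, distinct image nodes.
        if self.current >= 0 and self.current != node:
            self.image_edges.add((min(self.current, node), max(self.current, node)))
        # Attach every detected object to the image node (no de-duplication here).
        for emb, label in objects:
            self.object_nodes.append((list(emb), label))
            self.cross_edges.add((node, len(self.object_nodes) - 1))
        self.current = node
        return node
```

Note that the sketch uses no position information, only embedding similarity, which mirrors the claim in the abstract. Everything else about it is a simplification.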

On the image goal navigation task, TSGM significantly outperforms competitive baselines by +5.0-9.0% on success rate and +7.0-23.5% on SPL, which indicates that TSGM finds efficient paths.
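For readers unfamiliar with the two metrics, the sketch below computes success rate (SR) and SPL (success weighted by path length) using the standard definition, where each successful episode contributes the ratio of the shortest path length to the longer of the shortest and actual path lengths. The function name and the tuple layout are assumptions made for illustration; the repository's own evaluation code may organize this differently.

```python
def success_rate_and_spl(episodes):
    """Compute (SR, SPL) from (success, shortest_len, agent_len) tuples.

    SPL = (1/N) * sum_i S_i * l_i / max(p_i, l_i), where l_i is the
    shortest path length and p_i is the length of the path the agent took.
    """
    episodes = list(episodes)
    if not episodes:
        raise ValueError("at least one episode is required")
    n = len(episodes)
    sr = sum(1 for success, _, _ in episodes if success) / n
    spl = 0.0
    for success, shortest_len, agent_len in episodes:
        longest = max(agent_len, shortest_len)
        if success and longest > 0:
            spl += shortest_len / longest
    return sr, spl / n


# Two successes (one optimal, one twice as long as needed) and one failure.
sr, spl = success_rate_and_spl([(True, 10.0, 10.0), (True, 10.0, 20.0), (False, 10.0, 5.0)])
```

A higher SPL at a similar SR means the agent reaches the goal along paths closer to the shortest one.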

Demonstration

To visualize TSGM generation, run the Jupyter notebook build_tsgm_demo. The notebook shows online TSGM generation while you control the agent in the simulator with the w/a/s/d keys. The rendering window shows the generated TSGM together with the observations. Note that the top-down map and pose information are used only for visualization, not for graph generation.

To check the effectiveness of the object encoder, run the Jupyter notebook object_encoder.

Installation

The source code is developed and tested in the following setting.

  • Python 3.7
  • pytorch 1.10
  • detectron2
  • habitat-sim 0.2.1
  • habitat 0.2.1

Please refer to habitat-sim and habitat-lab for installation.

To start, we recommend creating the environment using conda:

conda env create -f environment.yml
conda activate tsgm
conda install habitat-sim==0.2.1 withbullet headless -c conda-forge -c aihabitat
cd 
mkdir programs
cd programs
git clone --branch stable https://github.com/facebookresearch/habitat-lab.git habitat-lab-v21
cd habitat-lab-v21
git checkout tags/v0.2.1
pip install -e .
conda activate tsgm
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html

Gibson Env Setup

Most of the scripts in this repository build the environments assuming that the Gibson dataset is in the habitat-lab/data/ folder.

The recommended folder structure of habitat-lab:

habitat-lab 
  └── data
      └── datasets
      │   └── pointnav
      │       └── gibson
      │           └── v1
      │               └── train
      │               └── val
      └── scene_datasets
          └── gibson
              └── *.glb, *.navmesh
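Before launching the scripts, it can help to confirm that the layout above is in place. The following is a small sketch that checks for the directories shown in the tree; the helper name `missing_paths` and the `EXPECTED_DIRS` list are illustrative and not part of the repository.

```python
from pathlib import Path

# Directories taken from the recommended folder structure above.
EXPECTED_DIRS = [
    "data/datasets/pointnav/gibson/v1/train",
    "data/datasets/pointnav/gibson/v1/val",
    "data/scene_datasets/gibson",
]


def missing_paths(habitat_lab_root):
    """Return the expected directories that do not exist under the given root."""
    root = Path(habitat_lab_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    # "habitat-lab" is a placeholder; point it at your own checkout.
    for rel in missing_paths("habitat-lab"):
        print(f"missing: {rel}")
```

An empty result only means the directories exist; it does not verify that the scene files inside them are complete.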

Download Data

You can download the whole data here.

Place the data like this:

TopologicalSemanticGraphMemory
  └── data
      └── assets
      └── episodes
      └── noise_models
      └── scene_info
      └── pretrained_models
      └── detector
      └── graph
        └── gibson
            └── Img_encoder.pth.tar
            └── Obj_encoder.pth.tar

Creating Datasets

  1. Data Generation for Imitation Learning

    python collect_il_data.py --ep-per-env 200 --num-procs 4 --split train --data-dir IL_data/gibson
    

    This generates the data for imitation learning and takes around 24 hours. You can find some examples of the collected data in the IL_data/gibson folder and inspect them with show_IL_data.ipynb. You can also download the collected IL data from here.

  2. Collect Topological Semantic Graph for Imitation Learning

    python collect_graph.py ./configs/TSGM.yaml --data-dir IL_data/gibson --record-dir IL_data/gibson_graph --split train --num-procs 16
    

    This generates the graph data for training the TSGM model and takes around 3 hours. You can find some examples of the collected graph data in the IL_data/gibson_graph folder and inspect them with show_graph_data.ipynb. You can also download the collected graph data from here.

Training

  1. Imitation Learning

    python train_il.py --policy TSGMPolicy --config configs/TSGM.yaml --version exp_name --data-dir IL_data/gibson --prebuild-path IL_data/gibson_graph
    

    This will train the imitation learning model. The model will be saved in ./checkpoints/exp_name.

  2. Reinforcement Learning

    The reinforcement learning code is largely based on habitat-lab/habitat_baselines. To train the agent with reinforcement learning (PPO), run:

    python train_rl.py --policy TSGMPolicy --config configs/TSGM.yaml --version exp_name --diff hard --use-detector --strict-stop --task imggoalnav --gpu 0,1
    

    This trains the agent starting from the imitation learning model in ./checkpoints/exp_name. The trained model is also saved in ./checkpoints/exp_name.

Evaluation

To evaluate the trained model, run:

python evaluate.py --config configs/TSGM.yaml --version version_name --diff hard --gpu 0

This will evaluate the trained model in ./data/checkpoints/{$version_name}_{$task}.

Or, you can evaluate the pretrained model with:

python evaluate.py --config configs/TSGM.yaml --version version_name --diff hard --eval-ckpt data/pretrained_models/tsgm_hard.pth --gpu 0 --policy TSGMPolicy
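If you want to evaluate several difficulty levels in a row, the commands can be assembled programmatically. The sketch below only builds the argument lists and prints them. It assumes that `--diff` also accepts `easy` and `medium` (only `hard` appears in the commands above, though the results table reports all three levels), and the helper name `build_eval_command` is made up for this example.

```python
def build_eval_command(difficulty, version="version_name", gpu="0", ckpt=None):
    """Build the evaluate.py argument list for one difficulty level."""
    cmd = [
        "python", "evaluate.py",
        "--config", "configs/TSGM.yaml",
        "--version", version,
        "--diff", difficulty,
        "--gpu", gpu,
    ]
    if ckpt is not None:
        # Mirrors the pretrained-model command: checkpoint plus explicit policy.
        cmd += ["--eval-ckpt", ckpt, "--policy", "TSGMPolicy"]
    return cmd


if __name__ == "__main__":
    # "easy" and "medium" are assumed values for --diff.
    for difficulty in ("easy", "medium", "hard"):
        print(" ".join(build_eval_command(difficulty)))
```

Each printed line can be run in the shell, or each list can be passed to `subprocess.run` once you have confirmed the flag values against `evaluate.py`.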

Results

Expected results for TSGM from running the code

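| Model   | Test set      | Easy (SR) | Easy (SPL) | Medium (SR) | Medium (SPL) | Hard (SR) | Hard (SPL) | Overall (SR) | Overall (SPL) |
|---------|---------------|-----------|------------|-------------|--------------|-----------|------------|--------------|---------------|
| TSGM-IL | VGM           | 76.76     | 59.54      | 72.99       | 53.67        | 63.16     | 45.21      | 70.97        | 52.81         |
| TSGM-RL | VGM           | 91.86     | 85.25      | 82.03       | 68.10        | 70.30     | 49.98      | 81.40        | 67.78         |
| TSGM-RL | NRNS-straight | 94.40     | 92.19      | 92.60       | 84.35        | 70.35     | 62.85      | 85.78        | 79.80         |
| TSGM-RL | NRNS-curved   | 93.60     | 91.04      | 89.70       | 77.88        | 64.20     | 55.03      | 82.50        | 74.13         |
| TSGM-RL | Meta          | 90.79     | 86.93      | 85.43       | 72.89        | 63.43     | 56.28      | 79.88        | 72.03         |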
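When comparing your own runs against these numbers, note that for the two VGM rows the Overall values match the unweighted mean of the Easy, Medium, and Hard values. This is inferred from the reported figures rather than stated by the authors, so treat it as an assumption. A short check:

```python
from statistics import mean

# (easy, medium, hard) values copied from the two VGM rows of the results table.
ROWS = {
    "TSGM-IL / VGM": {"sr": (76.76, 72.99, 63.16), "spl": (59.54, 53.67, 45.21)},
    "TSGM-RL / VGM": {"sr": (91.86, 82.03, 70.30), "spl": (85.25, 68.10, 49.98)},
}


def overall(levels):
    """Unweighted mean over the three difficulty levels, rounded to 2 decimals."""
    return round(mean(levels), 2)


if __name__ == "__main__":
    for name, metrics in ROWS.items():
        print(name, overall(metrics["sr"]), overall(metrics["spl"]))
```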

Visualize the Results

To visualize the TSGM from the output recorded during evaluation (run the evaluation with --record 3), run the following command:

python visualize_tsgm.py --config-file configs/TSGM.yaml --eval-ckpt <checkpoint_path>

We release pre-trained models from the experiments in our paper:

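| Method            | Train                                       | Checkpoints         |
|-------------------|---------------------------------------------|---------------------|
| TSGM              | Imitation Learning                          | tsgm_il.pth         |
| TSGM              | Imitation Learning + Reinforcement Learning | tsgm_rl.pth         |
| Image Classifier  | Self-supervised Clustering                  | Img_encoder.pth.tar |
| Object Classifier | Supervised Clustering                       | Obj_encoder.pth.tar |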

Citation

If you find this code useful for your research, please consider citing:

@inproceedings{TSGM,
      title={{Topological Semantic Graph Memory for Image Goal Navigation}},
      author={Nuri Kim and Obin Kwon and Hwiyeon Yoo and Yunho Choi and Jeongho Park and Songhwai Oh},
      year={2022},
      booktitle={CoRL}
}

Acknowledgements

In our work, we used and extended parts of the VGM and Habitat-Lab repositories.

License

This project is released under the MIT license, as found in the LICENSE file.