ashish-kmr/vmsr
Learning Navigation Subroutines from Egocentric Videos
Ashish Kumar, Saurabh Gupta, Jitendra Malik
Conference on Robot Learning (CoRL) 2019.
Citing
If you find this code base and models useful in your research, please consider citing the following paper:
@article{kumar2019learning,
  title={Learning navigation subroutines by watching videos},
  author={Kumar, Ashish and Gupta, Saurabh and Malik, Jitendra},
  journal={arXiv preprint arXiv:1905.12612},
  year={2019}
}
Requirements: software
1. Python Virtual Env Setup: All code is implemented in Python but depends on a small number of Python packages and a couple of C libraries. We recommend using a virtual environment to install these Python packages and the Python bindings for these C libraries.
VENV_DIR=venv
pip install virtualenv
virtualenv $VENV_DIR
source $VENV_DIR/bin/activate
You may need to upgrade pip to install opencv-python.
pip install --upgrade pip
Install simple dependencies.
pip install -r requirements.txt
Patch bugs in dependencies.
sh patches/apply_patches.sh
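Once the requirements are installed, a quick smoke test (a sketch; it only relies on the `VIRTUAL_ENV` variable that virtualenv's `activate` script exports) confirms the environment is actually active:

```shell
# virtualenv's activate script exports VIRTUAL_ENV; an empty value
# means no environment is active and pip would install system-wide.
if [ -n "${VIRTUAL_ENV:-}" ]; then
  echo "active virtualenv: $VIRTUAL_ENV"
else
  echo "no virtualenv active"
fi
```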
2. Swiftshader: We use SwiftShader, a CPU-based renderer, to render the meshes. It is possible to use other renderers: replace SwiftshaderRenderer in render/swiftshader_renderer.py with bindings to your renderer.
apt-get install libxext-dev libx11-dev
mkdir -p deps
git clone --recursive https://github.com/google/swiftshader.git deps/swiftshader-src
cd deps/swiftshader-src && git checkout 91da6b00584afd7dcaed66da88e2b617429b3950
wget https://chromium.googlesource.com/native_client/pnacl-subzero/+archive/a018d6e2dc9b3f0b1a48d1deade8160e44589189.tar.gz
tar xvfz a018d6e2dc9b3f0b1a48d1deade8160e44589189.tar.gz -C third_party/pnacl-subzero/
mkdir build && cd build && cmake .. && make -j 16 libEGL libGLESv2
cd ../../../
cp deps/swiftshader-src/build/libEGL* libEGL.so.1
cp deps/swiftshader-src/build/libGLESv2* libGLESv2.so.2
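The two `cp` lines rename SwiftShader's build outputs to the shared-library names the renderer loads. A small check (a sketch; it only assumes the copies landed in the current directory) reports which of the two are present:

```shell
# Count how many of the renamed SwiftShader libraries are present in the
# current directory; after the cp step above, both should be found.
missing=0
for lib in libEGL.so.1 libGLESv2.so.2; do
  if [ -f "$lib" ]; then
    echo "$lib: found"
  else
    echo "$lib: missing"
    missing=$((missing + 1))
  fi
done
echo "missing: $missing"
```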
3. PyAssimp: We use PyAssimp to load meshes. It is possible to use other libraries to load meshes: replace Shape in render/swiftshader_renderer.py with bindings to your library for loading meshes.
mkdir -p deps
git clone https://github.com/assimp/assimp.git deps/assimp-src
cd deps/assimp-src
git checkout 2afeddd5cb63d14bc77b53740b38a54a97d94ee8
cmake CMakeLists.txt -G 'Unix Makefiles' && make -j 16
cd port/PyAssimp && python setup.py install
cd ../../../..
cp deps/assimp-src/lib/libassimp* .
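A quick import check (a sketch; the `try/except` keeps it safe to run even if the install step was skipped) confirms Python can see PyAssimp:

```shell
# Smoke-test the PyAssimp install; prints a status line either way.
python - <<'EOF'
try:
    import pyassimp
    print("pyassimp: importable")
except ImportError:
    print("pyassimp: not installed")
EOF
```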
Requirements: data
- Download the Stanford 3D Indoor Spaces Dataset (S3DIS) and the Matterport 3D (MP3D) dataset. The expected data layout is ../data/mp3d/meshes, ../data/mp3d/class-maps, and ../data/mp3d/room-dimension, where the mp3d folder contains the data from both MP3D and S3DIS. For S3DIS, you can follow the instructions at https://github.com/tensorflow/models/tree/master/research/cognitive_mapping_and_planning/data. The data splits for inverse-model training and vmsr training are defined in env-data/splits/mp3d/.
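The expected layout can be created up front so the downloads have a place to land. The sketch below uses a temporary root instead of ../data so it is side-effect free; point DATA_ROOT at ../data for the real setup:

```shell
# Create the directory skeleton the code expects under $DATA_ROOT/mp3d.
DATA_ROOT=$(mktemp -d)   # use DATA_ROOT=../data for the real setup
for sub in meshes class-maps room-dimension; do
  mkdir -p "$DATA_ROOT/mp3d/$sub"
done
ls "$DATA_ROOT/mp3d"
```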
Train and Evaluate Models
You can download the pretrained models by running the script scripts/download_pretrained_models.sh.
- To train an inverse model, run the script scripts/train_inverse_model.sh (pretrained model included in the download script).
- To train a vmsr policy, run the script scripts/train_vmsr.sh (pretrained model included in the download script).
- To evaluate the pretrained model on exploration, run the script scripts/evaluate_exploration.sh.
- To evaluate the pretrained model on downstream RL for point goal and area goal, run the script scripts/downstream_rl.sh. This script contains four settings corresponding to the area goal and point goal tasks, each with dense and sparse rewards.
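Put together, a typical evaluation session chains the scripts above. Each call in this sketch is guarded by a file-existence check so it degrades gracefully when run outside the repository root (script names are from this README; none are assumed to take arguments):

```shell
# Download pretrained models, then evaluate exploration and downstream RL.
for script in scripts/download_pretrained_models.sh \
              scripts/evaluate_exploration.sh \
              scripts/downstream_rl.sh; do
  if [ -f "$script" ]; then
    sh "$script"
  else
    echo "not found (run from the repo root): $script"
  fi
done
```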
Credits
This code was written by Ashish Kumar and Saurabh Gupta. The RL code was adapted from Ilya Kostrikov's PyTorch Implementations of Reinforcement Learning Algorithms (GitHub, 2018).