This repository is the official PyTorch implementation of the work:
RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images, CVPR Workshop DLGC, 2024
[2024/07] Updated the requirements.txt and fixed some bugs.
[2024/06] Our paper was awarded Best Paper at the sixth DLGC. Congratulations!
- Python 3.9.12, CUDA 12.2, and the packages in requirements.txt
- Install detectron2 from source
- sh scripts/install_deps.sh
- Compile the cpp extension for farthest points sampling (fps): sh core/csrc/compile.sh
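For readers unfamiliar with farthest point sampling, the idea is to pick a well-spread subset of a point set by repeatedly choosing the point farthest from those already selected. The following is a minimal pure-Python sketch of that greedy procedure. It is only an illustration and is not the repository's compiled extension; the function name `farthest_point_sampling` and the default start index are assumptions made for this example.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Return indices of `num_samples` points chosen greedily to be far apart.

    Illustrative only: the function name and the fixed start index are
    assumptions, not the interface of the repository's compiled extension.
    """
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")

    selected = [start_index]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[start_index]) for p in points]

    while len(selected) < num_samples:
        # Pick the point that is farthest from the current selection.
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        # Refresh nearest-selected distances using the newly added point.
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_index]))
    return selected
```

This version costs O(N * K) distance evaluations for N points and K samples, which is one reason such a routine is typically compiled rather than run in pure Python.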
Download the 6D pose datasets (LM, LM-O, YCB-V) from the BOP website and the MP6D dataset from MP6D.
Please also download the metadata from [Metadata].
The structure of the datasets folder should look like this:
# recommend using soft links (ln -sf)
datasets/
├── lm_imgn
├── VOCdevkit
├── BOP_DATASETS
    ├──lm
        ├──lm
        ├──train
        ├──train_pbr
            ├──xyz_crop
            ├──......
        ├──test
            ├──xyz_crop
            ├──......
        ├──image_set
        ├──models
        ├──models_eval
        ├──test_targets_bop19.json
    ├──lmo
        ├──train_pbr
            ├──xyz_crop
            ├──......
        ├──test
            ├──test_bboxes
            ├──......
        ├──image_set
        ├──models
        ├──models_eval
        ├──lmo
        ├──test_targets_all.json
        ├──test_targets_bop19.json
    ├──ycbv
        ├──train_real
            ├──xyz_crop
            ├──......
        ├──train_pbr
            ├──xyz_crop
            ├──......
        ├──test
            ├──test_bboxes
            ├──......
        ├──image_set
        ├──models
        ├──models_eval
        ├──models_fine
        ├──ycbv
        ├──test_targets_bop19.json
        ├──test_targets_keyframe.json
    ├──mp6d
        ├──data
        ├──data_syn_1
        ├──data_syn_2
        ├──image_set
        ├──models_cad
        ├──models_eval
        ├──xyz_crop
        ├──mp6d_keyframe.json
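Before training, it can help to confirm that the expected folders are in place. The sketch below reports which entries are missing under a dataset root. The helper name `find_missing_entries` and the `EXPECTED_ENTRIES` list are hypothetical; the list covers only a few top-level entries from the tree above, and the nesting of the dataset folders under BOP_DATASETS is an assumption that should be adjusted to the actual layout.

```python
from pathlib import Path

# Hypothetical subset of entries taken from the tree above; the nesting under
# BOP_DATASETS is an assumption and may need adjusting.
EXPECTED_ENTRIES = [
    "lm_imgn",
    "VOCdevkit",
    "BOP_DATASETS/lm",
    "BOP_DATASETS/lmo",
    "BOP_DATASETS/ycbv",
    "BOP_DATASETS/mp6d",
]


def find_missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected relative paths that do not exist under `root`."""
    root = Path(root)
    # Path.exists() follows symlinks, so soft-linked datasets count as present.
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    missing = find_missing_entries("datasets")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("All expected entries found.")
```

Because the check follows symbolic links, it works the same whether the datasets were copied into place or linked with ln -sf as recommended.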
Training:
./core/gdrn_modeling/train_gdrn.sh <config_path> <gpu_ids> (other args)
Example:
./core/gdrn_modeling/train_gdrn.sh configs/gdrn/lm/a6_cPnP_lm13.py 0 # multiple gpus: 0,1,2,3
# add --resume if you want to resume from an interrupted experiment.
Testing:
./core/gdrn_modeling/test_gdrn.sh <config_path> <gpu_ids> <ckpt_path> (other args)
Example:
./core/gdrn_modeling/test_gdrn.sh configs/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e.py 0 output/gdrn/lmo/a6_cPnP_AugAAETrunc_BG0.5_lmo_real_pbr0.1_40e/gdrn_lmo_real_pbr.pth
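When launching several experiments, it may be convenient to assemble these command lines from Python. The sketch below builds the argument lists for the two scripts shown above. The script paths, the positional argument order, the comma-separated GPU id format, and the --resume flag come from the usage lines above; the helper names `build_train_command`, `build_test_command`, and `run_from_repo_root` are hypothetical and not part of the repository.

```python
import subprocess

TRAIN_SCRIPT = "./core/gdrn_modeling/train_gdrn.sh"
TEST_SCRIPT = "./core/gdrn_modeling/test_gdrn.sh"


def _format_gpu_ids(gpu_ids):
    """Join GPU ids into the comma-separated form used above, e.g. "0,1,2,3"."""
    return ",".join(str(g) for g in gpu_ids)


def build_train_command(config_path, gpu_ids, resume=False, extra_args=()):
    """Build the argument list for the training script (hypothetical helper)."""
    cmd = [TRAIN_SCRIPT, config_path, _format_gpu_ids(gpu_ids)]
    if resume:
        # --resume continues an interrupted experiment, as noted above.
        cmd.append("--resume")
    cmd.extend(extra_args)
    return cmd


def build_test_command(config_path, gpu_ids, ckpt_path, extra_args=()):
    """Build the argument list for the testing script (hypothetical helper)."""
    return [TEST_SCRIPT, config_path, _format_gpu_ids(gpu_ids), ckpt_path, *extra_args]


def run_from_repo_root(cmd):
    """Run a command built above; assumes the working directory is the repo root."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only prints the command; call run_from_repo_root(cmd) to execute it.
    cmd = build_train_command("configs/gdrn/lm/a6_cPnP_lm13.py", [0])
    print(" ".join(cmd))
```

Passing the command as a list avoids shell quoting issues with long config and checkpoint paths.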
Our trained RDPN models can be found here: [Linemod] [Linemod-Occluded] [YCBV] [MP6D]
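Evaluation results on the LineMOD dataset (ADD(-S)):

| Input | Method | MEAN |
|---|---|---|
| RGB | PVNet | 86.3 |
| RGB | CDPN | 89.9 |
| RGB | DPODv2 | 99.7 |
| RGB-D | PointFusion | 73.7 |
| RGB-D | DenseFusion(iterative) | 94.3 |
| RGB-D | G2L-Net | 98.7 |
| RGB-D | PVN3D | 99.4 |
| RGB-D | FFB6D | 99.7 |
| RGB-D | RCVPose | 99.43 |
| RGB-D | DFTr | 99.8 |
| RGB-D | RDPN6D (Ours) | 99.97 |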
Evaluation results on the Linemod-Occluded dataset (ADD(-S)):

| Method | ALL |
|---|---|
| PVN3D | 63.2 |
| FFB6D | 66.2 |
| RCVPose | 70.2 |
| Uni6D | 30.7 |
| Uni6Dv2 | 40.2 |
| DFTr | 77.7 |
| RDPN6D (Ours) | 79.5 |
Evaluation results without any post-refinement on the YCB-Video dataset (ADD-S AUC and ADD(-S) AUC), over all objects (ALL):

| Method | ADD-S | ADD(-S) |
|---|---|---|
| PVN3D | 95.5 | 91.8 |
| FFB6D | 96.6 | 92.7 |
| RCVPose | 96.6 | 95.2 |
| ES6D | 93.6 | 89.0 |
| Uni6D | 95.2 | 88.8 |
| DFTr | 96.7 | 94.4 |
| RDPN6D (Ours) | 98.4 | 94.6 |
Evaluation results on the MP6D dataset (ADD-S AUC):

| Method | ALL |
|---|---|
| PVN3D | 85.42 |
| FFB6D | 86.29 |
| DFTr | 93.01 |
| RDPN6D (Ours) | 95.9 |
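The results above are reported with the ADD and ADD-S metrics, where ADD(-S) is commonly read as ADD for asymmetric objects and ADD-S for symmetric ones. The sketch below shows how the two per-pose errors are usually defined: ADD averages the distance between corresponding model points under the estimated and ground-truth poses, while ADD-S averages the distance from each ground-truth point to its closest estimated point. This is a pure-Python illustration under those standard definitions, not the repository's evaluation code; the function names are hypothetical, and the thresholding and AUC steps are omitted.

```python
import math


def transform(rotation, translation, point):
    """Apply a rigid transform: 3x3 rotation (nested sequences), then translation."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def add_error(model_points, rot_est, trans_est, rot_gt, trans_gt):
    """ADD: mean distance between corresponding model points under both poses."""
    dists = [
        math.dist(transform(rot_est, trans_est, p), transform(rot_gt, trans_gt, p))
        for p in model_points
    ]
    return sum(dists) / len(dists)


def adds_error(model_points, rot_est, trans_est, rot_gt, trans_gt):
    """ADD-S: mean distance from each ground-truth point to its closest estimated point."""
    est = [transform(rot_est, trans_est, p) for p in model_points]
    gt = [transform(rot_gt, trans_gt, p) for p in model_points]
    return sum(min(math.dist(g, e) for e in est) for g in gt) / len(gt)
```

For a two-point model that is symmetric under a half turn, an estimate that is off by exactly that half turn yields a large ADD error but a zero ADD-S error, which is why the symmetric variant is used for symmetric objects.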
If you find this useful in your research, please cite our paper:
@InProceedings{Hong_2024_CVPR,
author = {Hong, Zong-Wei and Hung, Yen-Yang and Chen, Chu-Song},
title = {RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
month = {June},
year = {2024},
pages = {5251-5260}
}
This work could not have been completed without the following reference; many thanks to the authors for their contribution: