We present ROCA, a novel end-to-end approach that retrieves and aligns 3D CAD models from a shape database to a single input image. This enables 3D perception of an observed scene from a 2D RGB observation, characterized as a lightweight, compact, clean CAD representation. Core to our approach is our differentiable alignment optimization based on dense 2D-3D object correspondences and Procrustes alignment. ROCA can thus provide a robust CAD alignment while simultaneously informing CAD retrieval by leveraging the 2D-3D correspondences to learn geometrically similar CAD models. Experiments on challenging, real-world imagery from ScanNet show that ROCA significantly improves on the state of the art, raising retrieval-aware CAD alignment accuracy from 9.5% to 17.6%.
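To make the Procrustes step concrete, the following is a minimal sketch of classical rigid Procrustes (Kabsch) alignment in NumPy. It is not ROCA's implementation: it assumes the 3D correspondences are already given as two matched point sets, ignores scale and per-point weights, and is not written to be differentiable. The function name `procrustes_align` and its arguments are hypothetical and chosen only for illustration.

```python
import numpy as np


def procrustes_align(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of matched 3D points (hypothetical inputs;
    in a setting like ROCA's they would come from predicted correspondences).
    """
    src_mean = src.mean(axis=0)
    dst_mean = dst.mean(axis=0)

    # Center both point sets so that only the rotation remains to be solved.
    src_c = src - src_mean
    dst_c = dst - dst_mean

    # 3x3 cross-covariance matrix between the centered point sets.
    H = src_c.T @ dst_c
    U, _, Vt = np.linalg.svd(H)

    # Correct the sign of the last axis so R is a proper rotation (det = +1),
    # not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Translation that maps the rotated source centroid onto the target centroid.
    t = dst_mean - R @ src_mean
    return R, t
```

Given noise-free correspondences, `R, t = procrustes_align(src, dst)` recovers the rigid transform exactly up to floating-point error; with noisy correspondences it returns the least-squares fit.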
@inproceedings{gumeli2022roca,
title={ROCA: Robust CAD Model Retrieval and Alignment from a Single Image},
author={G{\"u}meli, Can and Dai, Angela and Nie{\ss}ner, Matthias},
booktitle={Proc. Computer Vision and Pattern Recognition (CVPR), IEEE},
year={2022}
}