Blocks World Revisited
"the perception of solid objects is a process which can be based on the properties of three-dimensional transformations and the laws of nature"
-Larry Roberts (1965)
Idea
Since most current scene understanding approaches operate either on the 2D image or on a surface-based representation, they do not allow reasoning about the physical constraints within the 3D scene. Inspired by the "Blocks World" work of the 1960s, we present a qualitative physical representation of an outdoor scene where objects have volume and mass, and relationships describe 3D structure and mechanical configurations. Our representation allows us to apply powerful global geometric constraints between 3D volumes as well as the laws of statics in a qualitative manner. We also present a novel iterative "interpretation-by-synthesis" approach where, starting from an empty ground plane, we progressively "build up" a physically plausible 3D interpretation of the image. For surface layout estimation, our method improves on the state-of-the-art geometric context algorithm. More importantly, our approach automatically generates 3D parse graphs which describe qualitative geometric and mechanical properties of objects and the relationships between objects within an image.
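As an illustration only, the "interpretation-by-synthesis" idea of starting from an empty scene and adding one block at a time can be sketched as a greedy loop. This is a minimal sketch under stated assumptions, not the released code: the names `interpret_by_synthesis`, `is_plausible`, `score` and `max_blocks` are hypothetical, and the search strategy and cost functions actually used in the paper may differ.

```python
from typing import Callable, List, Sequence, TypeVar

B = TypeVar("B")  # a candidate block hypothesis (representation is up to the caller)


def interpret_by_synthesis(
    candidates: Sequence[B],
    is_plausible: Callable[[List[B], B], bool],
    score: Callable[[List[B]], float],
    max_blocks: int = 10,
) -> List[B]:
    """Greedily build a scene, starting from an empty ground plane.

    At each step, try every remaining candidate block, discard those that
    violate the (caller-supplied) physical constraints, and keep the one
    that improves the scene score the most. Stop when nothing helps.
    """
    scene: List[B] = []              # empty ground plane: no blocks yet
    remaining = list(candidates)
    current = score(scene)

    while remaining and len(scene) < max_blocks:
        best_gain, best = 0.0, None
        for block in remaining:
            if not is_plausible(scene, block):
                continue             # physically implausible: skip
            gain = score(scene + [block]) - current
            if gain > best_gain:
                best_gain, best = gain, block
        if best is None:
            break                    # no plausible block improves the score
        scene.append(best)
        remaining.remove(best)
        current += best_gain
    return scene
```

Here `is_plausible` stands in for the global constraints and `score` for whatever image evidence is available; both are placeholders supplied by the caller.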
Global Constraints
- Static Equilibrium: Under the static world assumption, the forces and torques acting on a block should cancel out (Newton's first law).
- Support Force Constraint: A supporting object should have enough strength to provide contact reaction forces on the objects it supports.
- Volumetric Constraints: All the objects in the world must have finite volumes and cannot inter-penetrate each other.
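The three constraints above can be made concrete with a toy example. The following is a simplified sketch, not the paper's formulation: it assumes axis-aligned boxes with uniform mass, a hypothetical ordinal `density` class (0 light, 1 medium, 2 heavy) as a stand-in for strength, and a centroid-over-support test as a qualitative proxy for torque balance. The class `Box` and all function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned block: min corner (x, y, z), size (w, d, h); z points up."""
    x: float
    y: float
    z: float
    w: float
    d: float
    h: float
    density: int = 1  # hypothetical ordinal class: 0 light, 1 medium, 2 heavy


def has_finite_volume(b: Box) -> bool:
    """Volumetric constraint, part 1: every side is positive and finite."""
    return all(0.0 < s < float("inf") for s in (b.w, b.d, b.h))


def interpenetrate(a: Box, b: Box) -> bool:
    """Volumetric constraint, part 2: True if the interiors overlap.

    Boxes that merely touch along a face do not count as overlapping.
    """
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.d and b.y < a.y + a.d and
            a.z < b.z + b.h and b.z < a.z + a.h)


def is_balanced_on(top: Box, support: Box) -> bool:
    """Static-equilibrium proxy for a box resting on a flat support.

    Gravity acts through the centroid of a uniform box, so the support can
    cancel the resulting torque only if the centroid projects onto the
    support's footprint.
    """
    cx = top.x + top.w / 2
    cy = top.y + top.d / 2
    return (support.x <= cx <= support.x + support.w and
            support.y <= cy <= support.y + support.d)


def can_support(top: Box, support: Box) -> bool:
    """Support-force proxy: the support must be at least as dense as its load."""
    return support.density >= top.density
```

For example, a medium-density box centred on a heavy slab passes all four checks, while the same box shifted so that its centroid overhangs the slab fails `is_balanced_on`.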
Results
3D parse graphs automatically generated by our system for all 250 test images are available in the 3D Parse Graphs Gallery.
Downloads
Dataset
We used the Geometric Context dataset. This dataset can be downloaded from here. The ground-truth segmentations can also be downloaded from here.
Code
Download the blocks world code. Please cite the paper if you are using the code.
Citation
Abhinav Gupta, Alexei A. Efros and Martial Hebert, Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics, European Conference on Computer Vision, 2010. (PDF)
Bibtex Reference
@inproceedings{GuptaEfrosHebert_ECCV10,
author="Abhinav Gupta and Alexei A. Efros and Martial Hebert",
title="Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics",
booktitle="European Conference on Computer Vision (ECCV)",
year="2010",
}
Acknowledgements
This research was supported by NSF Grant IIS-0905402 and a Guggenheim Fellowship to Alexei Efros.