Qixing Huang - Academia.edu (original) (raw)
Papers by Qixing Huang
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Monocular 3D prediction is one of the fundamental problems in 3D vision. Recent deep learning-bas... more Monocular 3D prediction is one of the fundamental problems in 3D vision. Recent deep learning-based approaches have brought us exciting progress on this problem. However, existing approaches have predominantly focused on end-to-end depth and normal predictions, which do not fully utilize the underlying 3D environment's geometric structures. This paper introduces StruMonoNet, which detects and enforces a planar structure to enhance pixelwise predictions. StruMonoNet innovates in leveraging a hybrid representation that combines visual feature and a surfel representation for plane prediction. This formulation allows us to combine the power of visual feature learning and the flexibility of geometric representations in incorporating geometric relations. As a result, StruMonoNet can detect relations between planes such as adjacent planes, perpendicular planes, and parallel planes, all of which are beneficial for dense 3D prediction. Experimental results show that StruMonoNet considerably outperforms state-ofthe-art approaches on NYUv2 and ScanNet.
ACM Transactions on Graphics, 2014
Near-regular structures are common in manmade and natural objects. Algorithmic detection of such ... more Near-regular structures are common in manmade and natural objects. Algorithmic detection of such regularity greatly facilitates our understanding of shape structures, leads to compact encoding of input geometries, and enables efficient generation and manipulation of complex patterns on both acquired and synthesized objects. Such regularity manifests itself both in the repetition of certain geometric elements, as well as in the structured arrangement of the elements. We cast the regularity detection problem as an optimization and efficiently solve it using linear programming techniques. Our optimization has a discrete aspect, that is, the connectivity relationships among the elements, as well as a continuous aspect, namely the locations of the elements of interest. Both these aspects are captured by our near-regular structure extraction framework, which alternates between discrete and continuous optimizations. We demonstrate the effectiveness of our framework on a variety of problems...
SIGGRAPH Asia 2013 Courses, 2013
Shape structure is about the arrangement and relations between shape parts. Structure-aware shape... more Shape structure is about the arrangement and relations between shape parts. Structure-aware shape processing goes beyond local geometry and low level processing, and analyzes and processes shapes at a high level. It focuses more on the global inter and intra semantic relations among the parts of shape rather than on their local geometry. With recent developments in easy shape acquisition, access to vast repositories of 3D models, and simple-to-use desktop fabrication possibilities, the study of structure in shapes has become a central research topic in shape analysis, editing, and modeling. A whole new line of structure-aware shape processing algorithms has emerged that base their operation on an attempt to understand such structure in shapes. The algorithms broadly consist of two key phases: an analysis phase, which extracts structural information from input data; and a (smart) processing phase, which utilizes the extracted information for exploration, editing, and synthesis of novel shapes. In this survey paper, we organize, summarize, and present the key concepts and methodological approaches towards efficient structure-aware shape processing. We discuss common models of structure, their implementation in terms of mathematical formalism and algorithms, and explain the key principles in the context of a number of state-ofthe-art approaches. Further, we attempt to list the key open problems and challenges, both at the technical and at the conceptual level, to make it easier for new researchers to better explore and contribute to this topic. Our goal is to both give the practitioner an overview of available structure-aware shape processing techniques, as well as identify future research questions in this important, emerging, and fascinating research area.
ACM Transactions on Graphics, 2013
In this paper we consider the problem of classifying shapes within a given category (e.g., chairs... more In this paper we consider the problem of classifying shapes within a given category (e.g., chairs) into finer-grained classes (e.g., chairs with arms, rocking chairs, swivel chairs). We introduce a multi-label (i.e., shapes can belong to multiple classes) semi-supervised approach that takes as input a large shape collection of a given category with associated sparse and noisy labels, and outputs cleaned and complete labels for each shape. The key idea of the proposed approach is to jointly learn a distance metric for each class which captures the underlying geometric similarity within that class, e.g., the distance metric for swivel chairs evaluates the global geometric resemblance of chair bases. We show how to achieve this objective by first geometrically aligning the input shapes, and then learning the class-specific distance metrics by exploiting the feature consistency provided by this alignment. The learning objectives consider both labeled data and the mutual relations betwee...
ACM Transactions on Graphics, 2011
We present an approach to segmenting shapes in a heterogenous shape database. Our approach segmen... more We present an approach to segmenting shapes in a heterogenous shape database. Our approach segments the shapes jointly, utilizing features from multiple shapes to improve the segmentation of each. The approach is entirely unsupervised and is based on an integer quadratic programming formulation of the joint segmentation problem. The program optimizes over possible segmentations of individual shapes as well as over possible correspondences between segments from multiple shapes. The integer quadratic program is solved via a linear programming relaxation, using a block coordinate descent procedure that makes the optimization feasible for large databases. We evaluate the presented approach on the Princeton segmentation benchmark and show that joint shape segmentation significantly outperforms single-shape segmentation techniques.
ACM Transactions on Graphics, 2012
We introduce a novel approach for computing high quality point-to-point maps among a collection o... more We introduce a novel approach for computing high quality point-to-point maps among a collection of related shapes. The proposed approach takes as input a sparse set of imperfect initial maps between pairs of shapes and builds a compact data structure which implicitly encodes an improved set of maps between all pairs of shapes. These maps align well with point correspondences selected from initial maps; they map neighboring points to neighboring points; and they provide cycle-consistency, so that map compositions along cycles approximate the identity map. The proposed approach is motivated by the fact that a complete set of maps between all pairs of shapes that admits nearly perfect cycle-consistency are highly redundant and can be represented by compositions of maps through a single base shape. In general, multiple base shapes are needed to adequately cover a diverse collection. Our algorithm sequentially extracts such a small collection of base shapes and creates correspondences fr...
2017 International Conference on 3D Vision (3DV), 2017
In this paper, we introduce a new method for classifying 3D objects. Our main idea is to project ... more In this paper, we introduce a new method for classifying 3D objects. Our main idea is to project a 3D object onto a spherical domain centered around its barycenter and develop neural network to classify the spherical projection. We introduce two complementary projections. The first captures depth variations of a 3D object, and the second captures contour-information viewed from different angles. Spherical projections combine key advantages of two mainstream 3D classification methods: image-based and 3D-based. Specifically, spherical projections are locally planar, allowing us to use massive image datasets (e.g, ImageNet) for pre-training. Also spherical projections are similar to voxel-based methods, as they encode complete information of a 3D object in a single neural network capturing dependencies across different views. Our novel network design can fully utilize these advantages. Experimental results on ModelNet40 and ShapeNetCore show that our method is superior to prior methods.
Computer Vision – ECCV 2020, 2020
We introduce a large-scale annotated mechanical components benchmark for classification and retri... more We introduce a large-scale annotated mechanical components benchmark for classification and retrieval tasks named Mechanical Components Benchmark (MCB): a large-scale dataset of 3D objects of mechanical components. The dataset enables data-driven feature learning for mechanical components. Exploring the shape descriptor for mechanical components is essential to computer vision and manufacturing applications. However, not much attention has been given on creating annotated mechanical components datasets on a large scale. This is because acquiring 3D models is challenging and annotating mechanical components requires engineering knowledge. Our main contributions are the creation of a large-scale annotated mechanical component benchmark, defining hierarchy taxonomy of mechanical components, and benchmarking the effectiveness of deep learning shape classifiers on the mechanical components. We created an annotated dataset and benchmarked seven state-of-the-art deep learning classification methods in three categories, namely: (1) point clouds, (2) volumetric representation in voxel grids, and (3) view-based representation.
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
3D shape models are naturally parameterized using vertices and faces, i.e., composed of polygons ... more 3D shape models are naturally parameterized using vertices and faces, i.e., composed of polygons forming a surface. However, current 3D learning paradigms for predictive and generative tasks using convolutional neural networks focus on a voxelized representation of the object. Lifting convolution operators from the traditional 2D to 3D results in high computational overhead with little additional benefit as most of the geometry information is contained on the surface boundary. Here we study the problem of directly generating the 3D shape surface of rigid and non-rigid shapes using deep convolutional neural networks. We develop a procedure to create consistent 'geometry images' representing the shape surface of a category of 3D objects. We then use this consistent representation for category-specific shape surface generation from a parametric representation or an image by developing novel extensions of deep residual networks for the task of geometry image generation. Our experiments indicate that our network learns a meaningful representation of shape surfaces allowing it to interpolate between shape orientations and poses, invent new shape surfaces and reconstruct 3D shape surfaces from previously unseen images 1 .
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
Monocular 3D prediction is one of the fundamental problems in 3D vision. Recent deep learning-bas... more Monocular 3D prediction is one of the fundamental problems in 3D vision. Recent deep learning-based approaches have brought us exciting progress on this problem. However, existing approaches have predominantly focused on end-to-end depth and normal predictions, which do not fully utilize the underlying 3D environment's geometric structures. This paper introduces StruMonoNet, which detects and enforces a planar structure to enhance pixelwise predictions. StruMonoNet innovates in leveraging a hybrid representation that combines visual feature and a surfel representation for plane prediction. This formulation allows us to combine the power of visual feature learning and the flexibility of geometric representations in incorporating geometric relations. As a result, StruMonoNet can detect relations between planes such as adjacent planes, perpendicular planes, and parallel planes, all of which are beneficial for dense 3D prediction. Experimental results show that StruMonoNet considerably outperforms state-ofthe-art approaches on NYUv2 and ScanNet.
ACM Transactions on Graphics, 2014
Near-regular structures are common in manmade and natural objects. Algorithmic detection of such ... more Near-regular structures are common in manmade and natural objects. Algorithmic detection of such regularity greatly facilitates our understanding of shape structures, leads to compact encoding of input geometries, and enables efficient generation and manipulation of complex patterns on both acquired and synthesized objects. Such regularity manifests itself both in the repetition of certain geometric elements, as well as in the structured arrangement of the elements. We cast the regularity detection problem as an optimization and efficiently solve it using linear programming techniques. Our optimization has a discrete aspect, that is, the connectivity relationships among the elements, as well as a continuous aspect, namely the locations of the elements of interest. Both these aspects are captured by our near-regular structure extraction framework, which alternates between discrete and continuous optimizations. We demonstrate the effectiveness of our framework on a variety of problems...
SIGGRAPH Asia 2013 Courses, 2013
Shape structure is about the arrangement and relations between shape parts. Structure-aware shape... more Shape structure is about the arrangement and relations between shape parts. Structure-aware shape processing goes beyond local geometry and low level processing, and analyzes and processes shapes at a high level. It focuses more on the global inter and intra semantic relations among the parts of shape rather than on their local geometry. With recent developments in easy shape acquisition, access to vast repositories of 3D models, and simple-to-use desktop fabrication possibilities, the study of structure in shapes has become a central research topic in shape analysis, editing, and modeling. A whole new line of structure-aware shape processing algorithms has emerged that base their operation on an attempt to understand such structure in shapes. The algorithms broadly consist of two key phases: an analysis phase, which extracts structural information from input data; and a (smart) processing phase, which utilizes the extracted information for exploration, editing, and synthesis of novel shapes. In this survey paper, we organize, summarize, and present the key concepts and methodological approaches towards efficient structure-aware shape processing. We discuss common models of structure, their implementation in terms of mathematical formalism and algorithms, and explain the key principles in the context of a number of state-ofthe-art approaches. Further, we attempt to list the key open problems and challenges, both at the technical and at the conceptual level, to make it easier for new researchers to better explore and contribute to this topic. Our goal is to both give the practitioner an overview of available structure-aware shape processing techniques, as well as identify future research questions in this important, emerging, and fascinating research area.
ACM Transactions on Graphics, 2013
In this paper we consider the problem of classifying shapes within a given category (e.g., chairs... more In this paper we consider the problem of classifying shapes within a given category (e.g., chairs) into finer-grained classes (e.g., chairs with arms, rocking chairs, swivel chairs). We introduce a multi-label (i.e., shapes can belong to multiple classes) semi-supervised approach that takes as input a large shape collection of a given category with associated sparse and noisy labels, and outputs cleaned and complete labels for each shape. The key idea of the proposed approach is to jointly learn a distance metric for each class which captures the underlying geometric similarity within that class, e.g., the distance metric for swivel chairs evaluates the global geometric resemblance of chair bases. We show how to achieve this objective by first geometrically aligning the input shapes, and then learning the class-specific distance metrics by exploiting the feature consistency provided by this alignment. The learning objectives consider both labeled data and the mutual relations betwee...
ACM Transactions on Graphics, 2011
We present an approach to segmenting shapes in a heterogenous shape database. Our approach segmen... more We present an approach to segmenting shapes in a heterogenous shape database. Our approach segments the shapes jointly, utilizing features from multiple shapes to improve the segmentation of each. The approach is entirely unsupervised and is based on an integer quadratic programming formulation of the joint segmentation problem. The program optimizes over possible segmentations of individual shapes as well as over possible correspondences between segments from multiple shapes. The integer quadratic program is solved via a linear programming relaxation, using a block coordinate descent procedure that makes the optimization feasible for large databases. We evaluate the presented approach on the Princeton segmentation benchmark and show that joint shape segmentation significantly outperforms single-shape segmentation techniques.
ACM Transactions on Graphics, 2012
We introduce a novel approach for computing high quality point-to-point maps among a collection o... more We introduce a novel approach for computing high quality point-to-point maps among a collection of related shapes. The proposed approach takes as input a sparse set of imperfect initial maps between pairs of shapes and builds a compact data structure which implicitly encodes an improved set of maps between all pairs of shapes. These maps align well with point correspondences selected from initial maps; they map neighboring points to neighboring points; and they provide cycle-consistency, so that map compositions along cycles approximate the identity map. The proposed approach is motivated by the fact that a complete set of maps between all pairs of shapes that admits nearly perfect cycle-consistency are highly redundant and can be represented by compositions of maps through a single base shape. In general, multiple base shapes are needed to adequately cover a diverse collection. Our algorithm sequentially extracts such a small collection of base shapes and creates correspondences fr...
2017 International Conference on 3D Vision (3DV), 2017
In this paper, we introduce a new method for classifying 3D objects. Our main idea is to project ... more In this paper, we introduce a new method for classifying 3D objects. Our main idea is to project a 3D object onto a spherical domain centered around its barycenter and develop neural network to classify the spherical projection. We introduce two complementary projections. The first captures depth variations of a 3D object, and the second captures contour-information viewed from different angles. Spherical projections combine key advantages of two mainstream 3D classification methods: image-based and 3D-based. Specifically, spherical projections are locally planar, allowing us to use massive image datasets (e.g, ImageNet) for pre-training. Also spherical projections are similar to voxel-based methods, as they encode complete information of a 3D object in a single neural network capturing dependencies across different views. Our novel network design can fully utilize these advantages. Experimental results on ModelNet40 and ShapeNetCore show that our method is superior to prior methods.
Computer Vision – ECCV 2020, 2020
We introduce a large-scale annotated mechanical components benchmark for classification and retri... more We introduce a large-scale annotated mechanical components benchmark for classification and retrieval tasks named Mechanical Components Benchmark (MCB): a large-scale dataset of 3D objects of mechanical components. The dataset enables data-driven feature learning for mechanical components. Exploring the shape descriptor for mechanical components is essential to computer vision and manufacturing applications. However, not much attention has been given on creating annotated mechanical components datasets on a large scale. This is because acquiring 3D models is challenging and annotating mechanical components requires engineering knowledge. Our main contributions are the creation of a large-scale annotated mechanical component benchmark, defining hierarchy taxonomy of mechanical components, and benchmarking the effectiveness of deep learning shape classifiers on the mechanical components. We created an annotated dataset and benchmarked seven state-of-the-art deep learning classification methods in three categories, namely: (1) point clouds, (2) volumetric representation in voxel grids, and (3) view-based representation.
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
3D shape models are naturally parameterized using vertices and faces, i.e., composed of polygons ... more 3D shape models are naturally parameterized using vertices and faces, i.e., composed of polygons forming a surface. However, current 3D learning paradigms for predictive and generative tasks using convolutional neural networks focus on a voxelized representation of the object. Lifting convolution operators from the traditional 2D to 3D results in high computational overhead with little additional benefit as most of the geometry information is contained on the surface boundary. Here we study the problem of directly generating the 3D shape surface of rigid and non-rigid shapes using deep convolutional neural networks. We develop a procedure to create consistent 'geometry images' representing the shape surface of a category of 3D objects. We then use this consistent representation for category-specific shape surface generation from a parametric representation or an image by developing novel extensions of deep residual networks for the task of geometry image generation. Our experiments indicate that our network learns a meaningful representation of shape surfaces allowing it to interpolate between shape orientations and poses, invent new shape surfaces and reconstruct 3D shape surfaces from previously unseen images 1 .