Progressive-X+: Clustering in the Consensus Space

Accurate Motion Estimation through Random Sample Aggregated Consensus

arXiv (Cornell University), 2017

We reconsider the classic problem of accurately estimating a 2D transformation from point matches between images containing outliers. RANSAC discriminates outliers by randomly generating hypotheses from minimal samples and verifying their consensus over the input data. Its response is based on the single hypothesis that obtained the largest inlier support. In this article we show that the resulting accuracy can be improved by aggregating all generated hypotheses. This yields RANSAAC, a framework that systematically improves over RANSAC and its state-of-the-art variants by statistically aggregating hypotheses. To this end, we introduce a simple strategy for rapidly averaging 2D transformations, which adds an almost negligible computational cost. We give practical applications to projective transforms and homography+distortion models and demonstrate a significant performance gain in both cases.
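
The contrast between keeping the single best hypothesis and aggregating several can be sketched as follows. This is only an illustration under stated assumptions: the model is a pure 2D translation (so one match is a minimal sample), and the aggregation is an inlier-count-weighted mean over well-supported hypotheses. The abstract does not specify the averaging scheme, and the names `score` and `ransac_and_aggregate` are hypothetical.

```python
import random

def score(t, matches, thr):
    """Count matches (p, q) whose residual |p + t - q| is below thr."""
    tx, ty = t
    return sum(
        1 for (px, py), (qx, qy) in matches
        if (px + tx - qx) ** 2 + (py + ty - qy) ** 2 < thr ** 2
    )

def ransac_and_aggregate(matches, thr=1.0, iters=200, seed=0):
    """Return (best single hypothesis, aggregated hypothesis)."""
    rng = random.Random(seed)
    hyps = []
    for _ in range(iters):
        # Minimal sample for a translation: a single match.
        (px, py), (qx, qy) = rng.choice(matches)
        t = (qx - px, qy - py)
        hyps.append((score(t, matches, thr), t))

    # Classic RANSAC answer: the hypothesis with the largest support.
    best_score, best_t = max(hyps, key=lambda h: h[0])

    # Aggregation (assumption, not the paper's scheme): weighted mean of
    # hypotheses whose support is at least half of the best support.
    kept = [(s, t) for s, t in hyps if s >= 0.5 * best_score]
    w = sum(s for s, _ in kept)
    agg_t = (sum(s * t[0] for s, t in kept) / w,
             sum(s * t[1] for s, t in kept) / w)
    return best_t, agg_t
```

Given a list of `((px, py), (qx, qy))` matches, `ransac_and_aggregate(matches)` returns both estimates so they can be compared against ground truth.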

Effective Sampling: Fast Segmentation Using Robust Geometric Model Fitting

IEEE Transactions on Image Processing

Identifying the underlying models in a set of data points contaminated by noise and outliers leads to a highly complex multi-model fitting problem. This problem can be posed as a clustering problem by projecting higher-order affinities between data points into a graph, which can then be clustered using spectral clustering. Calculating all possible higher-order affinities is computationally expensive, so in most cases only a subset is used. In this paper, we propose an effective sampling method to obtain a highly accurate approximation of the full graph required to solve multi-structural model fitting problems in computer vision. The proposed method is based on the observation that the usefulness of a graph for segmentation improves as the distribution of hypotheses (used to build the graph) approaches the distribution of actual parameters for the given data. We approximate this actual parameter distribution using a cost function based on k-th order statistics, and the samples are generated using a greedy algorithm coupled with a data sub-sampling strategy. The experimental analysis shows that the proposed method is both accurate and computationally efficient compared to state-of-the-art robust multi-model fitting techniques.
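
A cost based on k-th order statistics can be illustrated with the common "least k-th order statistic" form: a hypothesis is scored by its k-th smallest squared residual. This is a minimal sketch assuming a 2D line model `a*x + b*y + c = 0`; the abstract does not give the exact cost used, and the greedy sampling and data sub-sampling steps are not shown. The function name is hypothetical.

```python
def kth_order_cost(line, points, k):
    """k-th smallest squared point-to-line distance (1-indexed k).

    line is (a, b, c) for a*x + b*y + c = 0; points is a list of (x, y).
    A low value means at least k points lie close to the line.
    """
    a, b, c = line
    norm = (a * a + b * b) ** 0.5
    sq_res = sorted(((a * x + b * y + c) / norm) ** 2 for x, y in points)
    return sq_res[k - 1]
```

Because only the k closest points determine the score, a hypothesis that explains one structure well is not penalized by points belonging to other structures or by gross outliers.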

Motion segmentation by consensus

2008 19th International Conference on Pattern Recognition, 2008

We present a method for merging multiple partitions into a single partition, by minimising the ratio of pairwise agreements and contradictions between the equivalence relations corresponding to the partitions. The number of equivalence classes is determined automatically. This method is advantageous when merging segmentations obtained independently. We propose using this consensus approach to merge segmentations of features tracked on video. Each segmentation is obtained by clustering on the basis of mean velocity during a particular time interval.

Higher-Order Multicuts for Geometric Model Fitting and Motion Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence

The minimum cost lifted multicut problem is a generalization of the multicut problem (also known as correlation clustering) and is a means of optimizing a decomposition of a graph w.r.t. both positive and negative edge costs. It has been shown to be useful in a large variety of applications in computer vision because multicut-based formulations do not require the number of components to be given a priori; instead, it is deduced from the solution. However, the standard multicut cost function is limited to pairwise relationships between nodes, while several important applications either require or can benefit from a higher-order cost function, i.e. hyper-edges. In this paper, we propose a pseudo-boolean formulation for a multiple model fitting problem. It is based on a formulation of any-order minimum cost lifted multicuts, which allows partitioning an undirected graph with pairwise connectivity so as to minimize costs defined over any set of hyper-edges. As the proposed formulation is NP-hard and the branch-and-bound algorithm (as well as obtaining lower bounds) is too slow in practice, we propose an efficient local search algorithm for inference in the resulting problems. We demonstrate the versatility and effectiveness of our approach in several applications: 1) We define a geometric multiple model fitting problem, more specifically a line fitting problem on all triplets of points, and group together points that belong to the same line. 2) We formulate homography and motion estimation as a geometric model fitting problem where the task is to find groups of points that can be explained by the same geometrical transformation. 3) In motion segmentation, our model allows going from modeling translational motion to Euclidean or affine transformations, which improves the segmentation quality in terms of F-measure.
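
For the line fitting application on triplets, a third-order (hyper-edge) measurement can be sketched as the height of the triangle spanned by three points, which is zero when they are collinear. This is an assumed illustration: the abstract does not state how hyper-edge costs are derived, and the function name `triplet_heights` is hypothetical.

```python
from itertools import combinations
from math import dist

def triplet_heights(points):
    """Map each index triplet (i, j, k) to the smallest triangle height.

    The height is twice the triangle area divided by the longest side,
    i.e. the distance of the third point from the line through the two
    farthest-apart points. Collinear triplets give 0.
    """
    out = {}
    for i, j, k in combinations(range(len(points)), 3):
        p, q, r = points[i], points[j], points[k]
        # Twice the triangle area via the 2D cross product.
        area2 = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))
        longest = max(dist(p, q), dist(q, r), dist(p, r))
        out[(i, j, k)] = area2 / longest if longest > 0 else 0.0
    return out
```

A value like this could then be turned into a signed hyper-edge cost (for example by comparing it with a noise threshold), so that near-collinear triplets are encouraged to share a component.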

Multi-Bernoulli sample consensus for simultaneous robust fitting of multiple structures in machine vision

Signal, Image and Video Processing, 2014

In many image processing applications, such as parametric range and motion segmentation, multiple instances of a model are fitted to data points. The most common robust fitting method, RANSAC, and its extensions are normally devised to segment the structures sequentially, treating the points belonging to other structures as outliers. Thus, the ratio of inliers is small and successful fitting requires a very large number of random samples, incurring cumbersome computation. This paper presents a new method to simultaneously fit multiple structures to data points in a single run. We model the parameters of multiple structures as a random finite set with a multi-Bernoulli distribution. Simultaneous search for all structure parameters is performed by Bayesian update of the multi-Bernoulli parameters. Experiments involving segmentation of numerous structures show that our method outperforms well-known methods in terms of estimation error and computational cost. The fast convergence and high accuracy of our method make it an excellent choice for real-time estimation and segmentation of multiple structures in image processing applications.

Parallel Structure from Motion from Local Increment to Global Averaging

arXiv: Computer Vision and Pattern Recognition, 2017

In this paper, we tackle accurate and consistent Structure from Motion (SfM), in particular camera registration, for problems that far exceed the memory of a single computer, by solving them in parallel. Unlike previous methods, which drastically simplify the parameters of SfM and sacrifice the accuracy of the final reconstruction, we try to preserve the connectivity among cameras by proposing a camera clustering algorithm that divides a large SfM problem into smaller sub-problems in terms of overlapping camera clusters. We then exploit a hybrid formulation that applies the relative poses from local incremental SfM to a global motion averaging framework and produces accurate and consistent global camera poses. Our scalable formulation in terms of camera clusters is highly applicable to the whole SfM pipeline, including track generation, local SfM, 3D point triangulation and bundle adjustment. We are even able to reconstruct the camera poses of a city-scale data-set containing more than o...

Simultaneous Segmentation and 3D Reconstruction of Monocular Image Sequences

2007 IEEE 11th International Conference on Computer Vision, 2007

When trying to extract 3D scene information and camera motion from an image sequence alone, it is often necessary to cope with independently moving objects. Recent research has unveiled some of the mathematical foundations of the problem, but a general and practical algorithm, which can handle long, realistic sequences, is still missing. In this paper, we identify the necessary parts of such an algorithm, highlight both unexplored theoretical issues and practical challenges, and propose solutions. Theoretical issues include proper handling of different situations, in which the number of independent motions changes: objects can enter the scene, objects previously moving together can split and follow independent trajectories, or independently moving objects can merge into one common motion. We derive model scoring criteria to handle these changes in the number of segments. A further theoretical issue is the resolution of the relative scale ambiguity between such changes. Practical issues include robust 3D reconstruction of freely moving foreground objects, which often have few and short feature tracks. The proposed framework simultaneously tracks features, groups them into rigidly moving segments, and reconstructs all segments in 3D. Such an online approach, as opposed to batch processing techniques, which first track features, and then perform segmentation and reconstruction, is vital in order to handle small foreground objects.

Efficient Structure from Motion by Graph Optimization

Lecture Notes in Computer Science, 2010

We present an efficient structure from motion algorithm that can deal with large image collections in a fraction of the time and effort of previous approaches while providing comparable quality of the scene and camera reconstruction. First, we employ fast image indexing using large image vocabularies to measure the visual overlap of images without running actual image matching. Then, we select a small subset of the input images by computing an approximate minimal connected dominating set with a fast polynomial algorithm. Finally, we use task prioritization to avoid spending too much time on a few difficult matching problems instead of exploring other, easier options. Thus we avoid wasting time on image pairs with a low chance of success and avoid matching highly redundant images of landmarks. We present results for several challenging sets of thousands of perspective as well as omnidirectional images.
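
The image selection step can be illustrated with a standard greedy heuristic for a connected dominating set, where nodes stand for images and edges for sufficient visual overlap. This is a sketch of a generic heuristic, not the algorithm of the paper (which the abstract does not describe); it assumes a connected undirected graph given as a hypothetical `adj` dictionary.

```python
def greedy_connected_dominating_set(adj):
    """Greedy connected dominating set of a connected undirected graph.

    adj maps each node to the set of its neighbours. Every node ends up
    either chosen or adjacent to a chosen node, and the chosen nodes
    form a connected subgraph.
    """
    # Start from a node of maximum degree.
    start = max(adj, key=lambda v: len(adj[v]))
    chosen = {start}
    covered = {start} | adj[start]
    while len(covered) < len(adj):
        # Covered but unchosen nodes are all adjacent to the chosen set,
        # so adding any of them keeps the set connected.
        frontier = covered - chosen
        # Pick the one that covers the most new nodes.
        best = max(frontier, key=lambda v: len(adj[v] - covered))
        chosen.add(best)
        covered |= adj[best]
    return chosen
```

For a chain of five images `0-1-2-3-4`, the heuristic keeps the three interior images, since each end image overlaps with a kept neighbour.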

Global motion estimation from point matches

2012

Multiview structure recovery from a collection of images requires the recovery of the positions and orientations of the cameras relative to a global coordinate system. Our approach recovers camera motion as a sequence of two global optimizations. First, pairwise Essential Matrices are used to recover the global rotations by applying robust optimization using either spectral or semidefinite programming relaxations. Then, we directly employ feature correspondences across images to recover the global translation vectors using a linear algorithm based on a novel decomposition of the Essential Matrix. Our method is efficient and, as demonstrated in our experiments, achieves highly accurate results on collections of real images for which ground truth measurements are available.