Rotation Averaging with Attention Graph Neural Networks

NeuRoRA: Neural Robust Rotation Averaging

Computer Vision – ECCV 2020, 2020

Multiple rotation averaging is an essential task for structure from motion, mapping, and robot navigation. The task is to estimate the absolute orientations of several cameras given some of their noisy relative orientation measurements. The conventional methods for this task seek parameters of the absolute orientations that agree best with the observed noisy measurements according to a robust cost function. These robust cost functions are highly nonlinear and are designed based on certain assumptions about the noise and outlier distributions. In this work, we aim to build a neural network that learns the noise patterns from the data and predicts/regresses the model parameters from the noisy relative orientations. The proposed network is a combination of two networks: (1) a view-graph cleaning network, which detects outlier edges in the view-graph and rectifies noisy measurements; and (2) a fine-tuning network, which fine-tunes an initialization of absolute orientations bootstrapped from the cleaned graph, in a single step. The proposed combined network is very fast; moreover, being trained on a large number of synthetic graphs, it is more accurate than the conventional iterative optimization methods. Although the idea of replacing robust optimization methods by a graph-based network is demonstrated only for multiple rotation averaging, it could easily be extended to other graph-based geometric problems, for example, pose-graph optimization.
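
For orientation, the conventional robust formulation that the abstract refers to is often written as below. This is a generic sketch rather than the paper's own notation: the symbols $\tilde{R}_{ij}$ (measured relative rotation on edge $(i,j)$ of the view-graph $\mathcal{E}$), $d$ (a distance on $SO(3)$), $\rho$ (a robust loss) and the convention $R_{ij} = R_j R_i^{-1}$ are all assumptions made here for illustration.

```latex
\hat{R}_1,\dots,\hat{R}_N
  = \operatorname*{arg\,min}_{R_1,\dots,R_N \in SO(3)}
    \sum_{(i,j)\in\mathcal{E}}
      \rho\!\left( d\!\left( \tilde{R}_{ij},\; R_j R_i^{-1} \right) \right)
```

Under this convention the minimizer is only defined up to a common global rotation, since replacing every $R_i$ by $R_i Q$ leaves each $R_j R_i^{-1}$ unchanged.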

Robust Single Rotation Averaging

ArXiv, 2020

We propose a novel method for single rotation averaging using the Weiszfeld algorithm. Our contribution is threefold: First, we propose a robust initialization based on the elementwise median of the input rotation matrices. Our initial solution is more accurate and robust than the commonly used chordal $L_2$-mean. Second, we propose an outlier rejection scheme that can be incorporated in the Weiszfeld algorithm to improve the robustness of $L_1$ rotation averaging. Third, we propose a method for approximating the chordal $L_1$-mean using the Weiszfeld algorithm. An extensive evaluation shows that both our method and the state of the art perform equally well with the proposed outlier rejection scheme, but ours is 2-4 times faster.
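
The two ingredients named in this abstract, an elementwise-median initialization and Weiszfeld iterations, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it uses the standard geodesic L1 Weiszfeld update in the tangent space, omits the paper's outlier rejection scheme and chordal approximation, and all function names (`project_to_so3`, `log_so3`, `exp_so3`, `l1_rotation_average`) are hypothetical.

```python
import numpy as np

def project_to_so3(m):
    """Nearest rotation matrix to m in the Frobenius sense (via SVD)."""
    u, _, vt = np.linalg.svd(m)
    d = np.sign(np.linalg.det(u @ vt))
    return u @ np.diag([1.0, 1.0, d]) @ vt

def log_so3(r):
    """Rotation matrix -> rotation vector (axis * angle).
    Sketch only: not numerically robust for angles very close to pi."""
    cos_theta = np.clip((np.trace(r) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros(3)
    w = np.array([r[2, 1] - r[1, 2], r[0, 2] - r[2, 0], r[1, 0] - r[0, 1]])
    return theta / (2.0 * np.sin(theta)) * w

def exp_so3(v):
    """Rotation vector -> rotation matrix (Rodrigues' formula)."""
    theta = np.linalg.norm(v)
    if theta < 1e-12:
        return np.eye(3)
    k = v / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def l1_rotation_average(rotations, iters=50, tol=1e-8):
    """Geodesic L1 average of 3x3 rotation matrices via Weiszfeld iterations."""
    # Initialization: elementwise median of the inputs, projected onto SO(3).
    r = project_to_so3(np.median(np.stack(rotations), axis=0))
    for _ in range(iters):
        # Residual of each input relative to the current estimate.
        vs = [log_so3(ri @ r.T) for ri in rotations]
        # Clamp norms to avoid division by zero when an input equals the estimate.
        norms = np.array([max(np.linalg.norm(v), 1e-9) for v in vs])
        # Weiszfeld step: inverse-distance weighted mean of the residuals.
        delta = sum(v / n for v, n in zip(vs, norms)) / np.sum(1.0 / norms)
        r = exp_so3(delta) @ r
        if np.linalg.norm(delta) < tol:
            break
    return r
```

Calling `l1_rotation_average(list_of_3x3_arrays)` returns a single 3x3 rotation matrix. The inverse-distance weights are what give the estimate its median-like resistance to a minority of grossly wrong inputs.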

Combining Deep Learning and Robust Estimation for Outlier-Resilient Underwater Visual Graph SLAM

Journal of Marine Science and Engineering

Visual Loop Detection (VLD) is a core component of any Visual Simultaneous Localization and Mapping (SLAM) system, and its goal is to determine if the robot has returned to a previously visited region by comparing images obtained at different time steps. This paper presents a new approach to visual Graph-SLAM for underwater robots that goes one step beyond current techniques. The proposal, which focuses on designing a robust VLD algorithm aimed at reducing the number of false loops that enter the pose graph optimizer, operates in three steps. In the first step, an easily trainable Neural Network performs a fast selection of image pairs that are likely to close loops. The second step carefully confirms or rejects these candidate loops by means of a robust image matcher. During the third step, all the loops accepted in the second step are subject to a geometric consistency verification process, and those that do not pass it are rejected. The accepted loops ar...

CRIN: Rotation-Invariant Point Cloud Analysis and Rotation Estimation via Centrifugal Reference Frame

Proceedings of the AAAI Conference on Artificial Intelligence

Various recent methods attempt to implement rotation-invariant 3D deep learning by replacing the input coordinates of points with relative distances and angles. Due to the incompleteness of these low-level features, these methods lose global information. In this paper, we propose CRIN, the Centrifugal Rotation-Invariant Network. CRIN directly takes the coordinates of points as input and transforms local points into rotation-invariant representations via centrifugal reference frames. Aided by centrifugal reference frames, each point corresponds to a discrete rotation so that the information of rotations can be implicitly stored in point features. Unfortunately, discrete points are far from describing the whole rotation space. We further introduce a continuous distribution for 3D rotations based on points. Furthermore, we propose an attention-based down-sampling strategy to sample points invariant to rotations. A relation module is adopted at last for ...

RED-NN: Rotation-Equivariant Deep Neural Network for Classification and Prediction of Rotation

2019

In this work, we propose a new Convolutional Neural Network (CNN) for classification of rotated objects. This architecture is built around an ordered ensemble of oriented edge detectors to create a roto-translational space that transforms the input rotation into translation. This space allows the subsequent predictor to learn the internal spatial and angular relations of the objects regardless of their orientation. No data augmentation is needed and the model remains significantly smaller. It presents a self-organization capability and learns to predict the class and the rotation angle without requiring an angle-labeled dataset. We present the results of training with both upright and randomly rotated datasets. The accuracy outperforms the current state of the art on upright oriented training.

Hierarchical Attention and Graph Neural Networks: Toward Drift-Free Pose Estimation

arXiv (Cornell University), 2023

The most commonly used method for addressing 3D geometric registration is the iterative closest point algorithm; this approach is incremental and prone to drift over multiple consecutive frames. The common strategy to address the drift is pose graph optimization subsequent to frame-to-frame registration, incorporating a loop closure process that identifies previously visited places. In this paper, we explore a framework that replaces traditional geometric registration and pose graph optimization with a learned model utilizing hierarchical attention mechanisms and graph neural networks. We propose a strategy to condense the data flow, preserving essential information required for the precise estimation of rigid poses. Our results, derived from tests on the KITTI Odometry dataset, demonstrate a significant improvement in pose estimation accuracy. This improvement is especially notable in determining rotational components when compared with results obtained through conventional multi-way registration via pose graph optimization. The code will be made available upon completion of the review process.

Eliminating topological errors in neural network rotation estimation using self-selecting ensembles

ACM Transactions on Graphics

Many problems in computer graphics and computer vision applications involve inferring a rotation from a variety of different forms of inputs. With the increasing use of deep learning, neural networks have been employed to solve such problems. However, the traditional representations for 3D rotations, the quaternions and Euler angles, are found to be problematic for neural networks in practice, producing seemingly unavoidable large estimation errors. Previous research has identified the discontinuity of the mapping from SO(3) to the quaternions or Euler angles as the source of such errors, and to solve it, embeddings of SO(3) have been proposed as the output representation of rotation estimation networks instead. In this paper, we argue that the argument against quaternions and Euler angles from local discontinuities of the mappings from SO(3) is flawed, and instead provide a different argument from the global topological properties of SO(3) that also establishes the lower bound o...

Deep Neural Networks for Rotation-Invariance Approximation and Learning

Analysis and Applications

Based on the tree architecture, the objective of this paper is to design deep neural networks with two or more hidden layers (called deep nets) for realization of radial functions so as to enable rotational invariance for near-optimal function approximation in an arbitrarily high-dimensional Euclidean space. It is shown that deep nets have much better performance than shallow nets (with only one hidden layer) in terms of approximation accuracy and learning capabilities. In particular, for learning radial functions, it is shown that a near-optimal rate can be achieved by deep nets but not by shallow nets. Our results illustrate the necessity of depth in neural network design for realization of rotation-invariant target functions.

ROR: rejection of outliers by rotations

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001

We address the problem of rejecting false matches of points between two perspective views. The two views are taken from two arbitrary, unknown positions and orientations. Even the best algorithms for image matching make some mistakes and output some false matches. We present an algorithm for identification of the false matches between the views. The algorithm exploits the possibility of rotating one of the images to achieve some common behavior of the correct matches. Those matches that deviate from this common behavior turn out to be false matches. Our algorithm does not, in any way, use the image characteristics of the matched features. In particular, it avoids problems that cause the false matches in the first place. The algorithm works even in cases where the percentage of false matches is as high as 85 percent. The algorithm may be run as a postprocessing step on output from any point matching algorithm. Use of the algorithm may significantly improve the ratio of correct matches to incorrect matches. For robust estimation algorithms which are later employed, this is a very desirable quality since it reduces significantly their computational cost. We present the algorithm, identify the conditions under which it works, and present results of testing it on both synthetic and real images. The code for the algorithm is available through the World Wide Web.

Accurate Motion Estimation through Random Sample Aggregated Consensus

arXiv (Cornell University), 2017

We reconsider the classic problem of accurately estimating a 2D transformation from point matches between images containing outliers. RANSAC discriminates outliers by randomly generating hypotheses from minimal samples and verifying their consensus over the input data. Its response is based on the single hypothesis that obtained the largest inlier support. In this article we show that the resulting accuracy can be improved by aggregating all generated hypotheses. This yields RANSAAC, a framework that improves systematically over RANSAC and its state-of-the-art variants by statistically aggregating hypotheses. To this end, we introduce a simple strategy for rapidly averaging 2D transformations, leading to an almost negligible extra computational cost. We give practical applications on projective transforms and homography+distortion models and demonstrate a significant performance gain in both cases.
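
The contrast between keeping the single best hypothesis and aggregating several can be illustrated with a toy sketch for the simplest 2D transformation, a pure translation. This is not the RANSAAC algorithm as published: the inlier-count weighting, the `keep_ratio` cutoff, and the function name `ransac_translation` are assumptions chosen for illustration only.

```python
import random

def ransac_translation(matches, n_hyp=200, thresh=1.0, keep_ratio=0.8, seed=0):
    """Estimate a 2D translation from point matches ((x, y), (u, v)).

    Returns (best_t, agg_t): the single best-scoring hypothesis, as in plain
    RANSAC, and a score-weighted mean of the well-supported hypotheses.
    """
    rng = random.Random(seed)
    hyps = []
    for _ in range(n_hyp):
        # Minimal sample: one match fully determines a translation.
        (x, y), (u, v) = rng.choice(matches)
        t = (u - x, v - y)
        # Consensus: count matches whose residual is within the threshold.
        score = sum(
            1 for (a, b), (c, d) in matches
            if (c - a - t[0]) ** 2 + (d - b - t[1]) ** 2 <= thresh ** 2
        )
        hyps.append((score, t))

    # Plain RANSAC answer: the hypothesis with the largest inlier support.
    best_score, best_t = max(hyps, key=lambda h: h[0])

    # Aggregated answer (assumed scheme): weighted mean over hypotheses
    # whose support is close to the best one.
    kept = [(s, t) for s, t in hyps if s >= keep_ratio * best_score]
    total = sum(s for s, _ in kept)
    agg_t = (
        sum(s * t[0] for s, t in kept) / total,
        sum(s * t[1] for s, t in kept) / total,
    )
    return best_t, agg_t
```

Each single-match hypothesis carries the full noise of that one match, whereas the aggregated estimate averages that noise over many well-supported hypotheses, which is the intuition behind the accuracy gain the abstract describes.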