Peng Yin - Academia.edu
Videos by Peng Yin
We present a method for localizing a single camera with respect to a point cloud map in indoor and outdoor scenes. The problem is challenging because correspondences of local invariant features are inconsistent across the image and 3D domains. It is even more challenging because the method must handle varying environmental conditions such as illumination, weather, and seasonal changes. Our method matches equirectangular images to 3D range projections by extracting cross-domain symmetric place descriptors.
Our key insight is to retain condition-invariant 3D geometry features from limited data samples while eliminating condition-related features with a purpose-designed Generative Adversarial Network. Based on these features, we further design a spherical convolution network to learn viewpoint-invariant symmetric place descriptors.
Papers by Peng Yin
ArXiv, 2018
Recent literature in the robotics community has focused on learning robot behaviors that abstract out lower-level details of robot control. To fully leverage such behaviors, it is necessary to select and sequence them to achieve a given task. In this paper, we present an approach to both learn and sequence robot behaviors, applied to the problem of visual navigation of mobile robots. We construct a layered representation of control policies composed of low-level behaviors and a meta-level policy. The low-level behaviors enable the robot to locomote in a particular environment while avoiding obstacles, and the meta-level policy actively selects the low-level behavior most appropriate for the current situation based purely on visual feedback. We demonstrate the effectiveness of our method on three simulated robot navigation tasks: a legged hexapod robot which must successfully traverse varying terrain, a wheeled robot which must navigate a maze-like course while avoid...
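The layered scheme described above can be sketched as follows. This is a toy illustration, not the paper's implementation: the behavior names, the visual-feature interface, and the hand-set scoring rules are all hypothetical stand-ins for what the paper learns from data.

```python
# Toy sketch of a layered navigation policy: a meta-level policy picks,
# from visual features, which low-level behavior to run at each step.
# Behavior names and the scoring rules are illustrative only.

def behavior_go_straight(state):
    return {"v": 1.0, "w": 0.0}   # forward velocity, no turn

def behavior_turn_left(state):
    return {"v": 0.3, "w": 0.8}

def behavior_turn_right(state):
    return {"v": 0.3, "w": -0.8}

BEHAVIORS = {
    "straight": behavior_go_straight,
    "left": behavior_turn_left,
    "right": behavior_turn_right,
}

def meta_policy(visual_features):
    """Select the behavior with the highest score.

    visual_features: dict with (hypothetical) obstacle evidence in
    [0, 1] for the left and right halves of the image.
    """
    scores = {
        "straight": 1.0 - max(visual_features["obstacle_left"],
                              visual_features["obstacle_right"]),
        "left": visual_features["obstacle_right"],
        "right": visual_features["obstacle_left"],
    }
    return max(scores, key=scores.get)

def step(visual_features, state=None):
    name = meta_policy(visual_features)
    return name, BEHAVIORS[name](state)

# An obstacle on the right should trigger the left-turn behavior.
name, cmd = step({"obstacle_left": 0.1, "obstacle_right": 0.9})
print(name)  # prints: left
```

The point of the layering is visible even in this toy: the low-level behaviors know nothing about the task, and the meta-policy knows nothing about motor control.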
IEEE Robotics and Automation Letters, 2021
Real-time 3D place recognition is a crucial technology for recovering from localization failure in applications like autonomous driving, last-mile delivery, and service robots. However, it is challenging for 3D place retrieval methods to be simultaneously accurate, efficient, and robust to viewpoint differences. In this letter, we propose FusionVLAD, a fusion-based network that encodes a multiview representation of sparse 3D point clouds into viewpoint-free global descriptors. The system consists of two parallel branches: a spherical-view branch for orientation-invariant feature extraction, and a top-down-view branch for translation-insensitive feature extraction. Furthermore, we design a parallel fusion module to enhance region-wise feature connections between the two branches. Experiments on two public datasets and two generated datasets show that our method outperforms the state of the art in both place recognition accuracy and inference time. Moreover, FusionVLAD requires limited computational resources, making it well suited to long-term place recognition on low-cost robots.
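The two-branch idea can be sketched in miniature. The learned CNN branches are replaced here by plain occupancy histograms, and the fusion module by concatenation; bin counts, extents, and function names are illustrative assumptions, not FusionVLAD's actual architecture.

```python
import math

# Minimal sketch of two parallel views of a point cloud, in the spirit
# of FusionVLAD: a spherical projection (orientation-sensitive layout)
# and a top-down grid (translation-sensitive layout), fused into one
# descriptor. Histograms stand in for the learned branch features.

def spherical_view(points, n_az=8, n_el=4):
    """Histogram points over azimuth/elevation bins (spherical branch)."""
    hist = [0.0] * (n_az * n_el)
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0:
            continue
        az = (math.atan2(y, x) + math.pi) / (2 * math.pi)   # in [0, 1]
        el = (math.asin(z / r) + math.pi / 2) / math.pi     # in [0, 1]
        i = min(int(az * n_az), n_az - 1)
        j = min(int(el * n_el), n_el - 1)
        hist[j * n_az + i] += 1.0
    return hist

def top_down_view(points, n=4, extent=10.0):
    """Histogram points over an x-y grid (top-down branch)."""
    hist = [0.0] * (n * n)
    for x, y, _ in points:
        i = int((x + extent) / (2 * extent) * n)
        j = int((y + extent) / (2 * extent) * n)
        if 0 <= i < n and 0 <= j < n:   # ignore points outside the grid
            hist[j * n + i] += 1.0
    return hist

def fused_descriptor(points):
    """Concatenate the two branch features and L2-normalize."""
    feat = spherical_view(points) + top_down_view(points)
    norm = math.sqrt(sum(v * v for v in feat)) or 1.0
    return [v / norm for v in feat]

cloud = [(1.0, 2.0, 0.5), (-3.0, 0.5, 1.0), (2.0, -2.0, -0.5)]
desc = fused_descriptor(cloud)
print(len(desc))  # 8*4 spherical bins + 4*4 top-down bins = 48
```

Two such descriptors can then be compared with any vector distance for place retrieval.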
ArXiv, 2021
Recent years have witnessed the increasing application of place recognition in various environments, such as city roads, large buildings, and mixed indoor-outdoor places. The task remains challenging, however, due to the limitations of individual sensors and the changing appearance of environments. Current works either consider only a single sensor or simply combine different sensors, ignoring the fact that the importance of each sensor varies as the environment changes. In this paper, an adaptive weighting visual-LiDAR fusion method, named AdaFusion, is proposed to learn weights for both image and point cloud features, so that the two modalities contribute differently according to the current environmental situation. The learning of weights is achieved by an attention branch of the network, which is fused with the multi-modality feature extraction branch. Furthermore, to better exploit the potential relationship between images and point clouds, we design a two-stage fusion approach to combine the 2D and 3D attention. Our method is tested on two public datasets, and experiments show that the adaptive weights help improve recognition accuracy and system robustness to varying environments.
ArXiv, 2021
We present a method for localizing a single camera with respect to a point cloud map in indoor and outdoor scenes. The problem is challenging because correspondences of local invariant features are inconsistent across the image and 3D domains. It is even more challenging because the method must handle varying environmental conditions such as illumination, weather, and seasonal changes. Our method matches equirectangular images to 3D range projections by extracting cross-domain symmetric place descriptors. Our key insight is to retain condition-invariant 3D geometry features from limited data samples while eliminating condition-related features with a purpose-designed Generative Adversarial Network. Based on these features, we further design a spherical convolution network to learn viewpoint-invariant symmetric place descriptors. We evaluate our method on extensive self-collected datasets, which involve long-term (varying appearance conditions), large-scale (up to 2km stru...
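Once images and 3D range projections are embedded into a shared descriptor space, localization reduces to nearest-neighbor retrieval. A minimal sketch of that retrieval step follows; the 4-D descriptors here are made-up stand-ins for the learned cross-domain descriptors.

```python
import math

# Illustrative retrieval step: compare a query descriptor (from the
# camera image) against map descriptors (from 3D range projections)
# under cosine similarity, and return the best-matching place index.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def localize(query_desc, map_descs):
    """Return the index of the best-matching mapped place."""
    sims = [cosine(query_desc, d) for d in map_descs]
    return max(range(len(sims)), key=sims.__getitem__)

# Hypothetical 4-D descriptors for three mapped places and one query.
map_descs = [[1, 0, 0, 0], [0, 1, 0, 0], [0.9, 0.1, 0, 0]]
query = [0.88, 0.12, 0, 0]
print(localize(query, map_descs))  # prints: 2
```

In practice the map side is precomputed offline, so only the query embedding and the similarity search run at localization time.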
IEEE Transactions on Intelligent Transportation Systems, 2021
Accurate localization is essential for the autonomy and driving safety of autonomous vehicles, especially on complex urban streets and in search-and-rescue subterranean environments where high-accuracy GPS is unavailable. However, without robust global localization, current odometry estimation can drift over long-term navigation. The main challenges involve scene divergence under the interference of dynamic environments, and effective perception of observation and object-layout variance from different viewpoints. To tackle these challenges, we present PSE-Match, a viewpoint-free place recognition method based on parallel semantic analysis of isolated semantic attributes from 3D point-cloud models. Compared with the original point cloud, the observed variance of semantic attributes is smaller. PSE-Match incorporates a divergence place learning network to capture different semantic attributes in parallel through the spherical harmonics domain. Using both existing b...
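The rotation invariance that PSE-Match obtains from the spherical harmonics domain has a simple 1-D analogue that is easy to demonstrate: the DFT magnitude spectrum of an azimuth histogram is unchanged by yaw rotation, just as spherical-harmonic power spectra are unchanged by 3D rotation. The sketch below shows that analogue only; it is not the paper's descriptor.

```python
import cmath
import math

# 1-D analogue of a spherical-harmonics descriptor: a yaw rotation
# circularly shifts the azimuth histogram, and the DFT magnitude
# spectrum is invariant to circular shifts.

def azimuth_histogram(points, n=8):
    hist = [0.0] * n
    for x, y, _ in points:
        a = (math.atan2(y, x) + math.pi) / (2 * math.pi)  # in [0, 1]
        hist[min(int(a * n), n - 1)] += 1.0
    return hist

def dft_magnitudes(hist):
    n = len(hist)
    return [abs(sum(hist[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n)))
            for m in range(n)]

def rotate_yaw(points, angle):
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

cloud = [(0.96, 0.30, 0.0), (0.72, 1.86, 0.0), (-0.42, -0.91, 0.0)]
d1 = dft_magnitudes(azimuth_histogram(cloud))
# Rotate by exactly one bin width (pi/4 for 8 bins).
d2 = dft_magnitudes(azimuth_histogram(rotate_yaw(cloud, math.pi / 4)))
print(max(abs(a - b) for a, b in zip(d1, d2)))  # ~0: descriptor unchanged
```

The same principle, lifted from the circle to the sphere, is what makes spherical-harmonic power spectra attractive for viewpoint-free place recognition.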
IEEE TIE, 2021
Recent years have witnessed the increasing application of place recognition in various environments, such as city roads, large buildings, and mixed indoor-outdoor places. The task remains challenging, however, due to the limitations of individual sensors and the changing appearance of environments. Current works either consider only a single sensor or simply combine different sensors, ignoring the fact that the importance of each sensor varies as the environment changes. In this paper, an adaptive weighting visual-LiDAR fusion method, named AdaFusion, is proposed to learn weights for both image and point cloud features, so that the two modalities contribute differently according to the current environmental situation. The learning of weights is achieved by an attention branch of the network, which is fused with the multi-modality feature extraction branch. Furthermore, to better exploit the potential relationship between images and point clouds, we design a two-stage fusion approach to combine the 2D and 3D attention. Our method is tested on two public datasets, and experiments show that the adaptive weights help improve recognition accuracy and system robustness to varying environments.
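The adaptive-weighting idea can be sketched in a few lines. In the real network the attention branch learns the per-modality scores; here they are hand-set to mimic, say, a nighttime scene, and the features are toy vectors, so everything below is an illustrative assumption rather than AdaFusion's architecture.

```python
import math

# Sketch of adaptive modality weighting in the spirit of AdaFusion:
# per-modality attention scores pass through a softmax, and each
# modality's feature vector is scaled by its weight before fusion.
# Scores here are hand-set; the real network learns them.

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def adaptive_fusion(img_feat, pcd_feat, img_score, pcd_score):
    w_img, w_pcd = softmax([img_score, pcd_score])
    return [w_img * v for v in img_feat] + [w_pcd * v for v in pcd_feat]

# At night the image attention score drops, so the point cloud part
# dominates the fused descriptor (scores here are made up).
day   = adaptive_fusion([1.0, 1.0], [1.0, 1.0], img_score=2.0, pcd_score=2.0)
night = adaptive_fusion([1.0, 1.0], [1.0, 1.0], img_score=0.5, pcd_score=2.0)
print(day)    # equal weights: [0.5, 0.5, 0.5, 0.5]
print(night)  # point cloud entries outweigh the image entries
```

Because the weights sum to one, the fused descriptor stays on a comparable scale whichever modality dominates.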
ACS Nano, 2014
Kinetically controlled isothermal growth is fundamental to biological development, yet it remains challenging to rationally design molecular systems that self-assemble isothermally into complex geometries via prescribed assembly and disassembly pathways. By exploiting the programmable chemistry of base pairing, sophisticated spatial and temporal control has been demonstrated in DNA self-assembly, but largely as separate pursuits. By integrating temporal with spatial control, here we demonstrate the "developmental" self-assembly of a DNA tetrahedron, where a prescriptive molecular program orchestrates the kinetic pathways by which DNA molecules isothermally self-assemble into a well-defined three-dimensional wireframe geometry. In this reaction, nine DNA reactants initially coexist metastably but, upon catalysis by a DNA initiator molecule, navigate 24 individually characterizable intermediate states via prescribed assembly pathways, organized both in series and in parallel, to arrive at the tetrahedral final product. In contrast to previous work on dynamic DNA nanotechnology, this developmental program coordinates the growth of ringed substructures into a three-dimensional wireframe superstructure, taking a step toward the goal of kinetically controlled isothermal growth of complex three-dimensional geometries.
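The qualitative behavior of an initiator-triggered metastable system can be illustrated with a toy mass-action model: reactants sit unchanged until a catalytic initiator opens the assembly pathway. The two-reaction scheme, rate constants, and species below are illustrative only and bear no relation to the paper's 24-state pathway.

```python
# Toy mass-action model of initiator-catalyzed isothermal assembly:
# a metastable reactant pool R converts to product P only via a
# catalyst-bound intermediate X.  Scheme:  R + C -> X,  X -> P + C.
# All rates and species are illustrative, not fitted to experiment.

def simulate(r0, c0, k1=1.0, k2=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of the two-reaction scheme."""
    r, c, x, p = r0, c0, 0.0, 0.0
    for _ in range(steps):
        v1 = k1 * r * c   # binding of reactant to free catalyst
        v2 = k2 * x       # turnover releasing product and catalyst
        r -= v1 * dt
        c += (v2 - v1) * dt
        x += (v1 - v2) * dt
        p += v2 * dt
    return r, c, x, p

# Without the initiator the reactants are metastable: no product forms.
_, _, _, p_off = simulate(r0=1.0, c0=0.0)
# A substoichiometric amount of initiator catalytically drives assembly.
_, _, _, p_on = simulate(r0=1.0, c0=0.1)
print(round(p_off, 3), round(p_on, 3))
```

Even in this caricature, the hallmark of the developmental program is visible: product formation is gated entirely by the presence of the initiator, not by a change in temperature.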