Han-Pang Chiu - Academia.edu
Papers by Han-Pang Chiu
... To cut down the complexity, Mikolajczyk and Schmid [13] apply scale selection [14] to select characteristic points. ... 1. For example, Mikolajczyk and Schmid [15] detect affine invariant points with associated affine invariant regions. ...
Video content management has attracted increasing attention in recent years. We have proposed a new spatio-temporal knowledge structure, called 3D C-string, to represent the spatio-temporal relations between the objects in a video and to keep track of the motions ...
... Han-Pang Chiu, Leslie Pack Kaelbling, Tomás Lozano-Pérez ... The shape models, defined as a collection of oriented primitive shapes centered at fixed 3D positions, can be learned from a few labeled images for each class. ...
Proceedings of the British Machine Vision Conference 2014, 2014
2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010
In this paper, we present a unified approach for a camera tracking system based on an error-state Kalman filter algorithm. The filter uses relative (local) measurements obtained from image-based motion estimation through visual odometry, as well as global measurements produced by landmark matching through a pre-built visual landmark database and range measurements obtained from radio frequency (RF) ranging radios. We evaluate the system by using its output camera poses to render views from a 3D graphical model built in the same coordinate frame as the landmark database, which also defines the global coordinate system, and comparing the rendered views to the actual video images. These results demonstrate both the long-term stability and the overall accuracy of our algorithm, which is intended as a solution to the GPS-denied ubiquitous camera tracking problem under both vision-aided and vision-impaired conditions.
18th International Conference on Pattern Recognition (ICPR'06), 2006
2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010
Han-Pang Chiu et al. "Class-specific grasping of 3D objects from a single 2D image."
2007 IEEE Conference on Computer Vision and Pattern Recognition, 2007
Our goal is to circumvent one of the roadblocks to using existing approaches for single-view recognition for achieving multi-view recognition, namely, the need for sufficient training data for many viewpoints. We show how to construct virtual training examples for multi-view recognition using a simple model of objects (nearly planar facades centered at fixed 3D positions). We also show how the models can be learned from a few labeled images for each class.
2013 IEEE International Conference on Robotics and Automation, 2013
This paper proposes a navigation algorithm that provides a low-latency solution while estimating the full nonlinear navigation state. Our approach uses Sliding-Window Factor Graphs, which extend existing incremental smoothing methods to operate on the subset of measurements and states that exist inside a sliding time window. We split the estimation into a fast short-term smoother, a slower but fully global smoother, and a shared map of 3D landmarks. A novel three-stage visual feature model is presented that takes advantage of both smoothers to optimize the 3D landmark map, while minimizing the computation required for processing tracked features in the short-term smoother. This three-stage model is formulated based on the maturity of the estimation of the 3D location of the underlying landmark in the map. Long-range associations are used as global measurements from matured landmarks in the short-term smoother and loop closure constraints in the long-term smoother. Experimental results demonstrate our approach provides highly accurate solutions on large-scale real data sets using multiple sensors in GPS-denied settings.
2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014
2011 IEEE Virtual Reality Conference, 2011
In this paper, we present a unified approach for a drift-free and jitter-reduced vision-aided navigation system. This approach is based on an error-state Kalman filter algorithm using both relative (local) measurements obtained from image-based motion estimation through visual odometry, and global measurements obtained from landmark matching through a pre-built visual landmark database. To improve the accuracy of pose estimation for augmented reality applications, we capture the 3D local reconstruction uncertainty of each landmark point as a covariance matrix and implicitly rely more on closer points in the filter. We conduct a number of experiments aimed at evaluating different aspects of our Kalman filter framework, and show that our approach can provide a highly accurate and stable pose both indoors and outdoors over large areas. The results demonstrate both the long-term stability and the overall accuracy of our algorithm, which is intended as a solution to the camera tracking problem in augmented reality applications.
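The idea of relying more on low-uncertainty landmark points can be illustrated with a plain scalar Kalman measurement update. This is only a minimal sketch and not the error-state formulation the abstract refers to: the function name `kalman_update`, the one-dimensional state, and the variance values are all assumptions chosen for illustration.

```python
def kalman_update(x, p, z, r):
    """One scalar Kalman measurement update (illustrative only).

    x, p: prior state estimate and its variance
    z, r: measurement and its variance (hypothetical landmark uncertainty)
    Returns the posterior estimate, posterior variance, and the gain.
    """
    k = p / (p + r)           # gain approaches 1 as the measurement variance shrinks
    x_new = x + k * (z - x)   # move toward the measurement by a fraction k
    p_new = (1.0 - k) * p     # posterior variance is never larger than the prior
    return x_new, p_new, k


prior_x, prior_p = 0.0, 1.0

# Hypothetical values: a nearby landmark has small reconstruction variance,
# a distant one has large variance.
near = kalman_update(prior_x, prior_p, z=1.0, r=0.1)
far = kalman_update(prior_x, prior_p, z=1.0, r=10.0)

print("gain for near landmark:", near[2])
print("gain for far landmark:", far[2])
```

With the same prior, the low-variance measurement receives the larger gain, so it pulls the estimate further toward itself. No explicit distance weighting is needed, which is the sense in which the weighting is implicit.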
CVPR 2011, 2011
Visual landmark matching with a pre-built landmark database is a popular technique for localization. Traditionally, the landmark database was built with a visual odometry system, and the 3D information of each visual landmark was reconstructed from video. Due to the drift of the visual odometry system, a globally consistent landmark database is difficult to build, and the inaccuracy of each 3D landmark limits the performance of landmark matching. In this paper, we demonstrate that with the use of precise 3D Lidar range data, we are able to build a globally consistent database of high-precision 3D visual landmarks, which improves the landmark matching accuracy dramatically. To further improve accuracy and robustness, landmark matching is fused with a multi-stereo visual odometry system to estimate the camera pose in two ways. First, a consistency check based on the local visual odometry trajectory is performed to reject bad landmark matches or those with large errors, and then Kalman filtering is used to further smooth out remaining landmark matching errors. Finally, a disk-cache mechanism is proposed to maintain real-time performance as the landmark database grows for a large-scale area. Week-long real-time live marine training experiments have demonstrated the high precision and robustness of our proposed system.
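A trajectory-based consistency check of the kind described above might look roughly like the following sketch. The abstract does not give the actual test, so everything here is an assumption: the 2D positions, the function name `is_consistent`, and the tolerance `tol` are hypothetical, and the check simply compares a landmark-derived position against the position predicted from the previous pose plus the visual odometry displacement.

```python
import math


def is_consistent(prev_position, landmark_position, vo_delta, tol=0.5):
    """Accept a landmark-matching result only if it agrees with visual odometry.

    prev_position:     last accepted (x, y) position
    landmark_position: (x, y) position implied by the new landmark match
    vo_delta:          (dx, dy) displacement reported by visual odometry
    tol:               hypothetical acceptance radius, in the same units
    """
    predicted = (prev_position[0] + vo_delta[0],
                 prev_position[1] + vo_delta[1])
    return math.dist(predicted, landmark_position) <= tol


# Visual odometry reports a move of 1.0 along x from the origin.
good_match = is_consistent((0.0, 0.0), (1.1, 0.0), (1.0, 0.0))
bad_match = is_consistent((0.0, 0.0), (3.0, 0.0), (1.0, 0.0))

print("good match accepted:", good_match)
print("bad match accepted:", bad_match)
```

Matches that pass such a gate would then be passed to the filter for smoothing. Rejecting gross outliers first keeps a single bad match from dragging the filtered pose.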
Pattern Recognition Letters, 2003
Pattern Recognition Letters, 2005
Pattern Recognition, 2002
Journal of Visual Communication and Image Representation, 2005
Computer Vision and Image Understanding, 2009
Multi-view object class recognition can be achieved using existing approaches for single-view object class recognition, by treating different views as entirely independent classes. This strategy requires a large amount of training data for many viewpoints, which can be costly to obtain. We describe a method for constructing a weak three-dimensional model from as few as two views of an object of the target class, and using that model to transform images of objects from one view to several other views, effectively multiplying their value for class recognition. Our approach can be coupled with any 2D image-based recognition system. We show that automatically transformed images dramatically decrease the data requirements for multi-view object class recognition.
2014 IEEE International Conference on Robotics and Automation (ICRA), 2014