Zhe Lin - Academia.edu
Papers by Zhe Lin
We describe an approach to segmenting foreground regions corresponding to a group of people into individual humans. Given background subtraction and ground plane homography, hierarchical part-template matching is employed to determine a reliable set of human detection hypotheses, and progressive greedy optimization is performed to estimate the best configuration of humans under a Bayesian MAP framework. Then, appearance models and segmentations are simultaneously estimated in an iterative sampling-expectation paradigm. Each human appearance is represented by a nonparametric kernel density estimator in a joint spatial-color space, and a recursive probability update scheme is employed for soft segmentation at each iteration. Additionally, an automatic occlusion reasoning method is used to determine the layered occlusion status between humans. The approach is evaluated on a number of images and videos, and also applied to human appearance matching using a symmetric distance measure derived from the Kullback-Leibler divergence.
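The abstract does not state which symmetrization of the Kullback-Leibler divergence is used for appearance matching. As a minimal sketch, assuming the common choice of summing the two directed divergences between discrete (histogram) distributions, the idea could look like the following. The function names and the `eps` floor are illustrative assumptions, not taken from the paper, and the paper's actual measure may differ.

```python
import math


def normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def kl_divergence(p, q, eps=1e-12):
    """Directed divergence D(p || q) for two discrete distributions.

    Bins where p is zero contribute nothing; q is floored at eps
    (an assumption of this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


def symmetric_kl(p, q, eps=1e-12):
    """Assumed symmetrization: D(p || q) + D(q || p)."""
    return kl_divergence(p, q, eps) + kl_divergence(q, p, eps)
```

Unlike the directed divergence, this quantity does not depend on the order of its arguments, which is what makes it usable as a matching score between two appearance models.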
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010
We present a learning-based, sliding window-style approach for the problem of detecting humans in still images. Instead of traditional concatenation-style image location-based feature encoding, a global descriptor more invariant to pose variation is introduced. Specifically, we propose a principled approach to learning and classifying human/non-human image patterns by simultaneously segmenting human shapes and poses, and extracting articulation-insensitive features. The shapes and poses are segmented by an efficient, probabilistic hierarchical part-template matching algorithm, and the features are collected in the context of poses by tracing around the estimated shape boundaries. Histograms of oriented gradients are used as a source of low-level features from which our pose-invariant descriptors are computed, and kernel SVMs are adopted as the test classifiers. We evaluate our detection and segmentation approach on two public pedestrian datasets.
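The sliding-window part of such a detector can be sketched generically. The snippet below only enumerates candidate windows and filters them by a caller-supplied score. The 64x128 window, the stride of 8, and the `score_window` callback are assumptions made for illustration; the descriptor and kernel SVM described in the abstract are not implemented here.

```python
def sliding_windows(image_width, image_height, win_width, win_height, stride):
    """Yield (x, y, win_width, win_height) boxes in raster order.

    Only windows that fit entirely inside the image are produced.
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    for y in range(0, image_height - win_height + 1, stride):
        for x in range(0, image_width - win_width + 1, stride):
            yield (x, y, win_width, win_height)


def detect(image_width, image_height, score_window, threshold,
           win_width=64, win_height=128, stride=8):
    """Keep every window whose score reaches the threshold.

    score_window is a hypothetical stand-in for "compute a descriptor
    for this window and evaluate a trained classifier on it".
    """
    windows = sliding_windows(image_width, image_height,
                              win_width, win_height, stride)
    return [box for box in windows if score_window(box) >= threshold]
```

A full detector would typically repeat this over several image scales and merge overlapping detections; both steps are omitted from this sketch.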
Training discriminative classifiers for a large number of classes is a challenging problem due to increased ambiguities between classes. In order to better handle the ambiguities and to improve the scalability of classifiers to larger numbers of categories, we learn pairwise dissimilarity profiles (functions of spatial location) between categories and adapt them into nearest neighbor classification. We introduce a dissimilarity distance measure and linearly or nonlinearly combine it with direct distances. We illustrate and demonstrate the approach mainly in the context of appearance-based person recognition.
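The linear combination of a direct distance with a second, dissimilarity-based distance inside nearest neighbor classification can be sketched as below. The abstract does not define the dissimilarity distance or how the profiles are learned, so `weighted_l1` with fixed per-location weights is only a hypothetical stand-in, and the mixing weight `alpha` and all function names are likewise assumptions of this sketch.

```python
import math


def euclidean(a, b):
    """Direct distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_l1(weights):
    """Build a stand-in dissimilarity distance with per-location weights.

    In the described approach the weighting would come from learned
    pairwise dissimilarity profiles; here the weights are simply given.
    """
    def dist(a, b):
        return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))
    return dist


def combined_distance(a, b, direct_dist, dissim_dist, alpha):
    """Linear mix: (1 - alpha) * direct + alpha * dissimilarity."""
    return (1.0 - alpha) * direct_dist(a, b) + alpha * dissim_dist(a, b)


def nearest_neighbor(query, gallery, direct_dist, dissim_dist, alpha=0.5):
    """Return the label of the gallery entry closest to the query.

    gallery is a list of (label, feature_vector) pairs.
    """
    if not gallery:
        raise ValueError("gallery must not be empty")
    label, _ = min(
        gallery,
        key=lambda item: combined_distance(
            query, item[1], direct_dist, dissim_dist, alpha),
    )
    return label
```

With `alpha=0` this reduces to plain nearest neighbor on the direct distance, so the effect of the second term can be checked by varying `alpha` alone.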