Jianchao Yang - Academia.edu (original) (raw)
Papers by Jianchao Yang
2013 IEEE International Conference on Multimedia and Expo (ICME), 2013
A novel problem of object recognition with dynamically allocated sensing resources is considered ... more A novel problem of object recognition with dynamically allocated sensing resources is considered in this paper. We call this problem opportunistic sensing since prior knowledge about the correlation between class label and signal distribution is exploited as early as in data acquisition. Two forms of sensing parameters -discrete sensor index and continuous linear measurement vector -are optimized within the same maximum negative entropy framework. The computationally intractable expected entropy is approximated using unscented transform for Gaussian models, and we solve the problem using a gradient-based method. Our formulation is theoretically shown to be closely related to the maximum mutual information criterion for sensor selection and linear feature extraction techniques such as PCA, LDA, and CCA. The proposed approach is validated on multi-view vehicle classification and face recognition datasets, and remarkable improvement over baseline methods is demonstrated in the experiments.
Asian and Pacific Conference on Synthetic Aperture Radar, 2007
The purpose of this paper is to evaluate the blur extent from motion blurred SAR images. Combined... more The purpose of this paper is to evaluate the blur extent from motion blurred SAR images. Combined with the averaging operation, a simple adaptive filter was proposed to identify the motion blur extent by using the autocorrelation method, which is expected to be the measure of the motion blurred SAR images.
With the popularity of digital cameras and camera phones, it is common for different people, who ... more With the popularity of digital cameras and camera phones, it is common for different people, who may or may not know each other, to attend the same event and take pictures and videos from different spatial or personal perspectives. Within the realm of social media, it is desirable to enable these people to share their pictures and videos in order
International Conference on Multimedia Computing and Systems/International Conference on Multimedia and Expo, 2011
In this work, we propose a novel supervised matrix factorization method used directly as a multi-... more In this work, we propose a novel supervised matrix factorization method used directly as a multi-class classifier. The coefficient matrix of the factorization is enforced to be sparse by ℓ1-norm regularization. The basis matrix is composed of atom dictionaries from different classes, which are trained in a jointly supervised manner by penalizing inhomogeneous representations given the labeled data samples. The
Computer Vision and Pattern Recognition, 2009
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image cl... more Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n2 ~ n3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algorithms to handle more than thousands of training images. In this paper we
European Conference on Computer Vision, 2010
Sparse coding of sensory data has recently attracted notable attention in research of learning us... more Sparse coding of sensory data has recently attracted notable attention in research of learning useful features from the unlabeled data. Empirical studies show that mapping the data into a signicantly higher- dimensional space with sparse coding can lead to superior classication performance. However, computationally it is challenging to learn a set of highly over-complete dictionary bases and to encode the
IEEE Transactions on Image Processing, 2009
In this paper, our contributions to the subspace learning problem are two-fold. We first justify ... more In this paper, our contributions to the subspace learning problem are two-fold. We first justify that most popular subspace learning algorithms, unsupervised or supervised, can be unitedly explained as instances of a ubiquitously supervised prototype. They all essentially minimize the intraclass compact- ness and at the same time maximize the interclass separability, yet with specialized labeling approaches, such as ground
2011 International Conference on Computer Vision, 2011
Most previous visual recognition systems simply assume ideal inputs without real-world degradatio... more Most previous visual recognition systems simply assume ideal inputs without real-world degradations, such as low resolution, motion blur and out-of-focus blur. In presence of such unknown degradations, the conventional approach first resorts to blind image restoration and then feeds the restored image into a classifier. Treating restoration and recognition separately, such a straightforward approach, however, suffers greatly from the defective output of the illposed blind image restoration. In this paper, we present a joint blind image restoration and recognition method based on the sparse representation prior to handle the challenging problem of face recognition from low-quality images, where the degradation model is realistic and totally unknown. The sparse representation prior states that the degraded input image, if correctly restored, will have a good sparse representation in terms of the training set, which indicates the identity of the test image. The proposed algorithm achieves simultaneous restoration and recognition by iteratively solving the blind image restoration in pursuit of the sparest representation for recognition. Based on such a sparse representation prior, we demonstrate that the image restoration task and the recognition task can benefit greatly from each other. Extensive experiments on face datasets under various degradations are carried out and the results of our joint model shows significant improvements over conventional methods of treating the two tasks independently.
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010
The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to ach... more The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to achieve good image classification performance. This paper presents a simple but effective coding scheme called Locality-constrained Linear Coding (LLC) in place of the VQ coding in traditional SPM. LLC utilizes the locality constraints to project each descriptor into its local-coordinate system, and the projected coordinates are integrated by max pooling to generate the final representation. With linear classifier, the proposed approach performs remarkably better than the traditional nonlinear SPM, achieving state-of-the-art performance on several benchmarks.
2008 15th IEEE International Conference on Image Processing, 2008
In this paper, we address the problem of hallucinating a high resolution face given a low resolut... more In this paper, we address the problem of hallucinating a high resolution face given a low resolution input face. The problem is approached through sparse coding. To exploit the facial structure, Non-negative Matrix Factorization (NMF) [1] is first employed to learn a localized part-based subspace. This subspace is effective for super-resolving the incoming low resolution face under reconstruction constraints. To further enhance the detailed facial information, we propose a local patch method based on sparse representation with respect to coupled overcomplete patch dictionaries, which can be fast solved through linear programming. Experiments demonstrate that our approach can hallucinate high quality super-resolution faces.
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010
In this paper, we propose a novel supervised hierarchical sparse coding model based on local imag... more In this paper, we propose a novel supervised hierarchical sparse coding model based on local image descriptors for classification tasks. The supervised dictionary training is performed via back-projection, by minimizing the training error of classifying the image level features, which are extracted by max pooling over the sparse codes within a spatial pyramid. Such a max pooling procedure across multiple spatial scales offer the model translation invariant properties, similar to the Convolutional Neural Network (CNN). Experiments show that our supervised dictionary improves the performance of the proposed model significantly over the unsupervised dictionary, leading to state-of-the-art performance on diverse image databases. Further more, our supervised model targets learning linear features, implying its great potential in handling large scale datasets in real applications.
2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image cl... more Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n 2 ∼ n 3 ) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.
2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008
... Techniques for non-negative and sparse representation have been well studied in recent years ... more ... Techniques for non-negative and sparse representation have been well studied in recent years to ... NMF; also matrix-based NMF has been ex-tended to Non-negative Tensor Factorization ... [18] proposed a su-pervised algorithm for learning semantic localized patterns with binary ...
Proceedings of the 3rd ACM SIGMM international workshop on Social media - WSM '11, 2011
With the popularity of digital cameras and camera phones, it is common for different people, who ... more With the popularity of digital cameras and camera phones, it is common for different people, who may or may not know each other, to attend the same event and take pictures and videos from different spatial or personal perspectives. Within the realm of social media, it is desirable to enable these people to share their pictures and videos in order
Rare Metal Materials and Engineering, 2008
The electron beam cold hearth remelting (EBCHR) process has emerged as new melting technology. It... more The electron beam cold hearth remelting (EBCHR) process has emerged as new melting technology. It is used to produce premium quality titanium alloys for critical rotating parts of the aero-engine and to recycle the titanium scraps. In the paper, the relationships between the melting rate and the chemical composition of Ti64 alloy the TiN dissolution ability were studied. The results
2013 IEEE International Conference on Computer Vision, 2013
2011 18th IEEE International Conference on Image Processing, 2011
Abstract In this paper, we propose an extension of the Non-Local Kernel Regression (NL-KR) method... more Abstract In this paper, we propose an extension of the Non-Local Kernel Regression (NL-KR) method and apply it to super-resolution (SR) tasks. The proposed method extends NL-KR via generalizing the self-similarity from single-scale to multi-scale, and propose an effective SR algorithm using the proposed multi-scale NL-KR model. Experimental results on both synthetic and real images demonstrate the effectiveness of the proposed method.
IEEE Transactions on Image Processing, 2000
The graph construction procedure essentially determines the potentials of those graph-oriented le... more The graph construction procedure essentially determines the potentials of those graph-oriented learning algorithms for image analysis. In this paper, we propose a process to build the so-called directed 1 -graph, in which the vertices involve all the samples and the ingoing edge weights to each vertex describe its 1 -norm driven reconstruction from the remaining samples and the noise. Then, a series of new algorithms for various machine learning tasks, e.g., data clustering, subspace learning, and semisupervised learning, are derived upon the 1 -graphs. Compared with the conventional -nearest-neighbor graph and -ball graph, the 1 -graph possesses the advantages: 1) greater robustness to data noise, 2) automatic sparsity, and 3) adaptive neighborhood for individual datum. Extensive experiments on three real-world datasets show the consistent superiority of 1 -graph over those classic graphs in data clustering, subspace learning, and semi-supervised learning tasks.
IEEE Transactions on Image Processing, 2000
This paper presents a new approach to single-image superresolution, based on sparse signal repres... more This paper presents a new approach to single-image superresolution, based on sparse signal representation. Research on image statistics suggests that image patches can be wellrepresented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low-and high-resolution image patches, we can enforce the similarity of sparse representations between the low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low resolution image patch can be applied with the high resolution image patch dictionary to generate a high resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches, which simply sample a large amount of image patch pairs [1], reducing the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image superresolution and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, and therefore the proposed algorithm can handle super-resolution with noisy inputs in a more unified framework.
2013 IEEE International Conference on Multimedia and Expo (ICME), 2013
A novel problem of object recognition with dynamically allocated sensing resources is considered ... more A novel problem of object recognition with dynamically allocated sensing resources is considered in this paper. We call this problem opportunistic sensing since prior knowledge about the correlation between class label and signal distribution is exploited as early as in data acquisition. Two forms of sensing parameters -discrete sensor index and continuous linear measurement vector -are optimized within the same maximum negative entropy framework. The computationally intractable expected entropy is approximated using unscented transform for Gaussian models, and we solve the problem using a gradient-based method. Our formulation is theoretically shown to be closely related to the maximum mutual information criterion for sensor selection and linear feature extraction techniques such as PCA, LDA, and CCA. The proposed approach is validated on multi-view vehicle classification and face recognition datasets, and remarkable improvement over baseline methods is demonstrated in the experiments.
Asian and Pacific Conference on Synthetic Aperture Radar, 2007
The purpose of this paper is to evaluate the blur extent from motion blurred SAR images. Combined... more The purpose of this paper is to evaluate the blur extent from motion blurred SAR images. Combined with the averaging operation, a simple adaptive filter was proposed to identify the motion blur extent by using the autocorrelation method, which is expected to be the measure of the motion blurred SAR images.
With the popularity of digital cameras and camera phones, it is common for different people, who ... more With the popularity of digital cameras and camera phones, it is common for different people, who may or may not know each other, to attend the same event and take pictures and videos from different spatial or personal perspectives. Within the realm of social media, it is desirable to enable these people to share their pictures and videos in order
International Conference on Multimedia Computing and Systems/International Conference on Multimedia and Expo, 2011
In this work, we propose a novel supervised matrix factorization method used directly as a multi-... more In this work, we propose a novel supervised matrix factorization method used directly as a multi-class classifier. The coefficient matrix of the factorization is enforced to be sparse by ℓ1-norm regularization. The basis matrix is composed of atom dictionaries from different classes, which are trained in a jointly supervised manner by penalizing inhomogeneous representations given the labeled data samples. The
Computer Vision and Pattern Recognition, 2009
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image cl... more Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n2 ~ n3) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algorithms to handle more than thousands of training images. In this paper we
European Conference on Computer Vision, 2010
Sparse coding of sensory data has recently attracted notable attention in research of learning us... more Sparse coding of sensory data has recently attracted notable attention in research of learning useful features from the unlabeled data. Empirical studies show that mapping the data into a signicantly higher- dimensional space with sparse coding can lead to superior classication performance. However, computationally it is challenging to learn a set of highly over-complete dictionary bases and to encode the
IEEE Transactions on Image Processing, 2009
In this paper, our contributions to the subspace learning problem are two-fold. We first justify ... more In this paper, our contributions to the subspace learning problem are two-fold. We first justify that most popular subspace learning algorithms, unsupervised or supervised, can be unitedly explained as instances of a ubiquitously supervised prototype. They all essentially minimize the intraclass compact- ness and at the same time maximize the interclass separability, yet with specialized labeling approaches, such as ground
2011 International Conference on Computer Vision, 2011
Most previous visual recognition systems simply assume ideal inputs without real-world degradatio... more Most previous visual recognition systems simply assume ideal inputs without real-world degradations, such as low resolution, motion blur and out-of-focus blur. In presence of such unknown degradations, the conventional approach first resorts to blind image restoration and then feeds the restored image into a classifier. Treating restoration and recognition separately, such a straightforward approach, however, suffers greatly from the defective output of the illposed blind image restoration. In this paper, we present a joint blind image restoration and recognition method based on the sparse representation prior to handle the challenging problem of face recognition from low-quality images, where the degradation model is realistic and totally unknown. The sparse representation prior states that the degraded input image, if correctly restored, will have a good sparse representation in terms of the training set, which indicates the identity of the test image. The proposed algorithm achieves simultaneous restoration and recognition by iteratively solving the blind image restoration in pursuit of the sparest representation for recognition. Based on such a sparse representation prior, we demonstrate that the image restoration task and the recognition task can benefit greatly from each other. Extensive experiments on face datasets under various degradations are carried out and the results of our joint model shows significant improvements over conventional methods of treating the two tasks independently.
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010
The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to ach... more The traditional SPM approach based on bag-of-features (BoF) requires nonlinear classifiers to achieve good image classification performance. This paper presents a simple but effective coding scheme called Locality-constrained Linear Coding (LLC) in place of the VQ coding in traditional SPM. LLC utilizes the locality constraints to project each descriptor into its local-coordinate system, and the projected coordinates are integrated by max pooling to generate the final representation. With linear classifier, the proposed approach performs remarkably better than the traditional nonlinear SPM, achieving state-of-the-art performance on several benchmarks.
2008 15th IEEE International Conference on Image Processing, 2008
In this paper, we address the problem of hallucinating a high resolution face given a low resolut... more In this paper, we address the problem of hallucinating a high resolution face given a low resolution input face. The problem is approached through sparse coding. To exploit the facial structure, Non-negative Matrix Factorization (NMF) [1] is first employed to learn a localized part-based subspace. This subspace is effective for super-resolving the incoming low resolution face under reconstruction constraints. To further enhance the detailed facial information, we propose a local patch method based on sparse representation with respect to coupled overcomplete patch dictionaries, which can be fast solved through linear programming. Experiments demonstrate that our approach can hallucinate high quality super-resolution faces.
2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010
In this paper, we propose a novel supervised hierarchical sparse coding model based on local imag... more In this paper, we propose a novel supervised hierarchical sparse coding model based on local image descriptors for classification tasks. The supervised dictionary training is performed via back-projection, by minimizing the training error of classifying the image level features, which are extracted by max pooling over the sparse codes within a spatial pyramid. Such a max pooling procedure across multiple spatial scales offer the model translation invariant properties, similar to the Convolutional Neural Network (CNN). Experiments show that our supervised dictionary improves the performance of the proposed model significantly over the unsupervised dictionary, leading to state-of-the-art performance on diverse image databases. Further more, our supervised model targets learning linear features, implying its great potential in handling large scale datasets in real applications.
2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009
Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image cl... more Recently SVMs using spatial pyramid matching (SPM) kernel have been highly successful in image classification. Despite its popularity, these nonlinear SVMs have a complexity O(n 2 ∼ n 3 ) in training and O(n) in testing, where n is the training size, implying that it is nontrivial to scaleup the algorithms to handle more than thousands of training images. In this paper we develop an extension of the SPM method, by generalizing vector quantization to sparse coding followed by multi-scale spatial max pooling, and propose a linear SPM kernel based on SIFT sparse codes. This new approach remarkably reduces the complexity of SVMs to O(n) in training and a constant in testing. In a number of image categorization experiments, we find that, in terms of classification accuracy, the suggested linear SPM based on sparse coding of SIFT descriptors always significantly outperforms the linear SPM kernel on histograms, and is even better than the nonlinear SPM kernels, leading to state-of-the-art performance on several benchmarks by using a single type of descriptors.
2008 IEEE Conference on Computer Vision and Pattern Recognition, 2008
... Techniques for non-negative and sparse representation have been well studied in recent years ... more ... Techniques for non-negative and sparse representation have been well studied in recent years to ... NMF; also matrix-based NMF has been ex-tended to Non-negative Tensor Factorization ... [18] proposed a su-pervised algorithm for learning semantic localized patterns with binary ...
Proceedings of the 3rd ACM SIGMM international workshop on Social media - WSM '11, 2011
With the popularity of digital cameras and camera phones, it is common for different people, who ... more With the popularity of digital cameras and camera phones, it is common for different people, who may or may not know each other, to attend the same event and take pictures and videos from different spatial or personal perspectives. Within the realm of social media, it is desirable to enable these people to share their pictures and videos in order
Rare Metal Materials and Engineering, 2008
The electron beam cold hearth remelting (EBCHR) process has emerged as new melting technology. It... more The electron beam cold hearth remelting (EBCHR) process has emerged as new melting technology. It is used to produce premium quality titanium alloys for critical rotating parts of the aero-engine and to recycle the titanium scraps. In the paper, the relationships between the melting rate and the chemical composition of Ti64 alloy the TiN dissolution ability were studied. The results
2013 IEEE International Conference on Computer Vision, 2013
2011 18th IEEE International Conference on Image Processing, 2011
Abstract In this paper, we propose an extension of the Non-Local Kernel Regression (NL-KR) method... more Abstract In this paper, we propose an extension of the Non-Local Kernel Regression (NL-KR) method and apply it to super-resolution (SR) tasks. The proposed method extends NL-KR via generalizing the self-similarity from single-scale to multi-scale, and propose an effective SR algorithm using the proposed multi-scale NL-KR model. Experimental results on both synthetic and real images demonstrate the effectiveness of the proposed method.
IEEE Transactions on Image Processing, 2000
The graph construction procedure essentially determines the potentials of those graph-oriented le... more The graph construction procedure essentially determines the potentials of those graph-oriented learning algorithms for image analysis. In this paper, we propose a process to build the so-called directed 1 -graph, in which the vertices involve all the samples and the ingoing edge weights to each vertex describe its 1 -norm driven reconstruction from the remaining samples and the noise. Then, a series of new algorithms for various machine learning tasks, e.g., data clustering, subspace learning, and semisupervised learning, are derived upon the 1 -graphs. Compared with the conventional -nearest-neighbor graph and -ball graph, the 1 -graph possesses the advantages: 1) greater robustness to data noise, 2) automatic sparsity, and 3) adaptive neighborhood for individual datum. Extensive experiments on three real-world datasets show the consistent superiority of 1 -graph over those classic graphs in data clustering, subspace learning, and semi-supervised learning tasks.
IEEE Transactions on Image Processing, 2000
This paper presents a new approach to single-image superresolution, based on sparse signal repres... more This paper presents a new approach to single-image superresolution, based on sparse signal representation. Research on image statistics suggests that image patches can be wellrepresented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low-and high-resolution image patches, we can enforce the similarity of sparse representations between the low resolution and high resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low resolution image patch can be applied with the high resolution image patch dictionary to generate a high resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches, which simply sample a large amount of image patch pairs [1], reducing the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image superresolution and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, and therefore the proposed algorithm can handle super-resolution with noisy inputs in a more unified framework.