Erik Rodner | University of California, Berkeley
Papers by Erik Rodner
Active learning is an essential tool to reduce manual annotation costs in the presence of large amounts of unsupervised data. In this paper, we introduce new active learning methods based on measuring the impact of a new example on the current model. This is done by deriving model changes of Gaussian process models in closed form. Furthermore, we study typical pitfalls in active learning and show that our methods automatically balance the exploration and exploitation trade-off. Experiments are performed with established benchmark datasets for visual object recognition and show that our new active learning techniques are able to outperform state-of-the-art methods.
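To make the model-change criterion concrete, the following is a minimal sketch (not the authors' implementation) of scoring candidate queries for GP regression by the expected change they would induce on the predictions over a reference pool. The RBF kernel, noise level and the rank-one update derivation are illustrative assumptions.

```python
# Hedged sketch: expected model change of GP regression predictions when adding a candidate.
# The rank-one update follows from the block inverse of the augmented kernel matrix.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def expected_model_change(X_train, y_train, X_cand, X_ref, noise=0.1, gamma=1.0):
    """Return the index of the candidate whose labeling would change the model the most."""
    K = rbf_kernel(X_train, X_train, gamma) + noise * np.eye(len(X_train))
    K_inv = np.linalg.inv(K)
    K_ref = rbf_kernel(X_ref, X_train, gamma)        # reference pool vs. training set
    scores = []
    for x in X_cand:
        k_star = rbf_kernel(X_train, x[None, :], gamma).ravel()
        v = K_inv @ k_star
        s = (1.0 + noise) - k_star @ v               # Schur complement = predictive variance of y*
        # Adding (x, y*) changes the predictive mean at any z by
        # (y* - mu(x)) / s * (k(z, x) - k(z, X) @ v); take E|y* - mu(x)| under the GP
        # predictive distribution, i.e. sqrt(2/pi) * predictive std.
        direction = rbf_kernel(X_ref, x[None, :], gamma).ravel() - K_ref @ v
        exp_abs_residual = np.sqrt(2.0 / np.pi) * np.sqrt(max(s, 1e-12))
        scores.append(exp_abs_residual / s * np.abs(direction).mean())
    return int(np.argmax(scores))
```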
An important advantage of Gaussian processes is the ability to directly estimate classification uncertainties in a Bayesian manner. In this paper, we develop techniques that allow for estimating these uncertainties with a runtime linear or even constant with respect to the number of training examples. Our approach makes use of all training data without any sparse approximation technique while needing only a linear amount of memory. To incorporate new information over time, we further derive online learning methods leading to significant speed-ups and allowing for hyperparameter optimization on-the-fly. We conduct several experiments on public image datasets for the tasks of one-class classification and active learning, where computing the uncertainty is an essential task. The experimental results highlight that we are able to compute classification uncertainties within microseconds even for large-scale datasets with tens of thousands of training examples.
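As a reminder of the quantity being accelerated, here is a minimal sketch of the standard (cubic-time) GP predictive variance; the paper's contribution is computing such uncertainties in linear or constant time for suitable kernels, which this naive version does not attempt. Kernel choice and the noise term are assumptions.

```python
# Hedged sketch of the exact GP predictive variance:
# sigma*^2 = k(x*, x*) - k_*^T (K + s^2 I)^{-1} k_*
import numpy as np

def gp_predictive_uncertainty(K, k_star, k_star_star, noise=0.1):
    """K: (n, n) train kernel, k_star: (n,) cross kernel, k_star_star: scalar self-kernel."""
    L = np.linalg.cholesky(K + noise * np.eye(K.shape[0]))
    v = np.linalg.solve(L, k_star)        # v^T v = k_*^T (K + s^2 I)^{-1} k_*
    return k_star_star - v @ v
```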
We present how to perform exact large-scale multi-class Gaussian process classification with parameterized histogram intersection kernels. In contrast to previous approaches, we use a full Bayesian model without any sparse approximation techniques, which allows for learning in sub-quadratic time and classification in constant time. To handle the additional model flexibility induced by parameterized kernels, our approach is able to optimize the parameters with large-scale training data. A key ingredient of this optimization is a new efficient upper bound of the negative Gaussian process log-likelihood. Experiments with image categorization tasks exhibit high performance gains with flexible kernels as well as learning within a few minutes and classification in microseconds for databases where exact Gaussian process inference was not possible before.
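A minimal sketch of one possible parameterization of the histogram intersection kernel, k(x, y) = Σ_d min(x_d^β, y_d^β); the exponent β stands in for the kernel parameters optimized from large-scale data, though the exact parameterization used in the paper may differ.

```python
# Hedged sketch of a parameterized histogram intersection kernel; the resulting Gram matrix
# can be used as the covariance of a GP classifier.
import numpy as np

def param_hik(X, Y, beta=1.0):
    """X: (n, d), Y: (m, d) non-negative histograms; returns the (n, m) kernel matrix."""
    Xb, Yb = X ** beta, Y ** beta
    return np.minimum(Xb[:, None, :], Yb[None, :, :]).sum(axis=-1)
```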
Semantic interpretation and understanding of images is an important goal of visual recognition research and offers a large variety of possible applications. One step towards this goal is semantic segmentation, which aims for automatic labeling of image regions and pixels with category names. Since typical images contain several million pixels, the use of kernel-based methods for the task of semantic segmentation is limited due to the involved computation times. In this paper, we overcome this drawback by exploiting efficient kernel calculations using the histogram intersection kernel for fast and exact Gaussian process classification. Our results show that non-parametric Bayesian methods can be utilized for semantic segmentation without sparse approximation techniques. Furthermore, in experiments, we show a significant benefit in terms of classification accuracy compared to state-of-the-art methods.
Transfer learning techniques are important to handle small training sets and to allow for quick generalization even from only a few examples. The following paper comprises the introduction and the literature overview of my thesis on transfer learning for visual recognition problems.
In this paper, we tackle the problem of visual categorization of dog breeds, which is a surprisingly challenging task due to simultaneously low inter-class distances and high intra-class variances. Our approach combines several techniques well known in our community but often not utilized for fine-grained recognition:
Images seen during test time are often not from the same distribution as images used for learning. This problem, known as domain shift, occurs when training classifiers from object-centric internet image databases and trying to apply them directly to scene understanding tasks. The consequence is often severe performance degradation and is one of the major barriers for the application of classifiers in real-world systems. In this paper, we show how to learn transform-based domain adaptation classifiers in a scalable manner. The key idea is to exploit an implicit rank constraint, originating from a max-margin domain adaptation formulation, to make optimization tractable. Experiments show that the transformation between domains can be learned very efficiently from data and easily applied to new categories. This begins to bridge the gap between large-scale internet image collections and object images captured in everyday life environments.
We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of training the classifier. We optimize both the transformation and classifier parameters jointly, and introduce an efficient cost function based on misclassification loss. Our method combines several features previously unavailable in a single algorithm: multi-class adaptation through representation learning, the ability to map across heterogeneous feature spaces, and scalability to large datasets. We present experiments on several image datasets that demonstrate improved accuracy and computational advantages compared to previous approaches.
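A minimal sketch of the overall idea, assuming a logistic loss and plain gradient descent (both simplifications of the published formulation): a transform W from target to source space and a linear classifier are updated in alternation on the joint objective.

```python
# Hedged sketch (not the published implementation) of jointly learning a target-to-source
# transform W and a linear classifier by alternating gradient steps on a logistic loss.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adapt(Xs, ys, Xt, yt, n_iter=200, lr=0.1):
    """Xs/ys: source features and {0,1} labels; Xt/yt: a few labeled target examples."""
    d = Xs.shape[1]
    w, b = np.zeros(d), 0.0
    W = np.eye(d)                                   # transform: target -> source space
    for _ in range(n_iter):
        # Classifier step on source data plus transformed target data.
        X = np.vstack([Xs, Xt @ W.T])
        y = np.concatenate([ys, yt])
        p = sigmoid(X @ w + b)
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * (p - y).mean()
        # Transform step: move transformed target points toward correct classification.
        pt = sigmoid((Xt @ W.T) @ w + b)
        W -= lr * np.outer(w, (pt - yt) @ Xt) / len(yt)
    return w, b, W
```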
I affirm that I completed this work without outside assistance and without using any sources other than those indicated, and that the work has not been submitted in the same or a similar form to any other examination board and accepted by it as part of an examination. All passages that were adopted verbatim or in substance from other sources are marked as such.
Detecting instances of unknown categories is an important task for a multitude of problems such as object recognition, event detection, and defect localization. This paper investigates the use of Gaussian process (GP) priors for this area of research. Focusing on the task of one-class classification for visual object recognition, we analyze different measures derived from GP regression and approximate GP classification. Experiments are performed using a large set of categories and different image kernel functions. Our findings show that the well-known Support Vector Data Description is significantly outperformed by at least two GP measures, which indicates the high potential of Gaussian processes for one-class classification.
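A minimal sketch in the spirit of these measures: fit GP regression to the target class with constant labels y = 1 and score test points by the predictive mean and variance; the kernel, noise level and the specific combinations of measures are assumptions.

```python
# Hedged sketch of GP-based one-class scores: high predictive mean means "looks like the class",
# high predictive variance means "far from the training data".
import numpy as np

def gp_one_class_scores(K_train, K_cross, k_test_diag, noise=0.1):
    """K_train: (n, n), K_cross: (m, n) test-vs-train, k_test_diag: (m,) self-kernel values."""
    n = K_train.shape[0]
    L = np.linalg.cholesky(K_train + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, np.ones(n)))   # regression targets are all 1
    mean = K_cross @ alpha
    V = np.linalg.solve(L, K_cross.T)
    var = k_test_diag - (V ** 2).sum(axis=0)
    return {"mean": mean, "neg_var": -var, "mean_minus_var": mean - var}
```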
Pattern Recognition Letters, 2011
The human visual system is often able to learn to recognize difficult object categories from only a single view, whereas automatic object recognition with few training examples is still a challenging task. This is mainly due to the human ability to transfer knowledge from related classes. Therefore, an extension to Randomized Decision Trees is introduced for learning with very few examples by exploiting interclass relationships. The approach consists of a maximum a posteriori estimation of classifier parameters using a prior distribution learned from similar object categories. Experiments on binary and multiclass classification tasks show significant performance gains.
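One simple way to realize such a MAP estimate is to smooth the sparse leaf counts of a decision tree with pseudo-counts derived from a related category, as in the sketch below; the prior strength and the exact form of the prior are illustrative assumptions rather than the estimator of the paper.

```python
# Hedged sketch: MAP-smoothed leaf posterior combining few observed counts with a
# Dirichlet/pseudo-count prior transferred from a related category.
import numpy as np

def map_leaf_posterior(counts, prior_probs, prior_strength=5.0):
    """counts: (C,) class counts in a leaf; prior_probs: (C,) distribution from a related category."""
    counts = np.asarray(counts, dtype=float)
    pseudo = prior_strength * np.asarray(prior_probs, dtype=float)
    return (counts + pseudo) / (counts.sum() + prior_strength)

# e.g. one example of the new class observed, prior transferred from a similar category
print(map_leaf_posterior([1, 0, 0], [0.6, 0.3, 0.1]))
```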
A robust segmentation is the most important part of an automatic character recognition system (e.g., document processing or license plate recognition). In our contribution we present an efficient segmentation framework using a preprocessing step for shadow suppression combined with a local thresholding technique. The method is based on a combination of difference-of-boxes filters and a new ternary segmentation, which are both simple low-level image operations.
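A minimal sketch of the two ingredients, a difference-of-boxes filter (suppressing slowly varying shadows and illumination) followed by a ternary thresholding; filter sizes and thresholds are illustrative assumptions.

```python
# Hedged sketch of shadow suppression via a difference-of-boxes filter plus ternary thresholding.
import numpy as np
from scipy.ndimage import uniform_filter

def difference_of_boxes(img, small=3, large=31):
    """Subtract a large box mean (slowly varying shadow/illumination) from a small box mean."""
    img = img.astype(float)
    return uniform_filter(img, small) - uniform_filter(img, large)

def ternary_segmentation(response, t=10.0):
    """Label each pixel as dark stroke (-1), uncertain background (0) or bright region (+1)."""
    labels = np.zeros_like(response, dtype=int)
    labels[response < -t] = -1
    labels[response > t] = 1
    return labels
```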
Knowledge transfer from related object categories is a key concept to allow learning with few training examples. We present how to use dependent Gaussian processes for transferring knowledge from a related category in a non-parametric Bayesian way. Our method is able to select this category automatically using efficient model selection techniques. We show how to optionally incorporate semantic similarities obtained from the hierarchical lexical database WordNet [1] into the selection process. The framework is applied to image categorization tasks using state-of-the-art image-based kernel functions. A large-scale evaluation shows the benefits of our approach compared to independent learning and an SVM-based approach.
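A minimal sketch of coupling the target task with one support category through a shared base kernel and a task-correlation parameter rho, as in standard multi-task/dependent GP regression; the automatic selection of the support category and of rho (possibly from WordNet similarities) is not shown, and kernel, noise and rho are assumptions.

```python
# Hedged sketch of dependent GP regression with one support task, coupled by rho.
import numpy as np

def dependent_gp_predict(K_tt, K_ts, K_ss, k_t_star, k_s_star, y_t, y_s, rho=0.5, noise=0.1):
    """t: few target-class examples, s: examples of the related support class.
    K_* are base-kernel blocks; rho scales the cross-task covariances."""
    K = np.block([[K_tt,         rho * K_ts],
                  [rho * K_ts.T, K_ss      ]]) + noise * np.eye(len(y_t) + len(y_s))
    k_star = np.concatenate([k_t_star, rho * k_s_star])   # test point belongs to the target task
    alpha = np.linalg.solve(K, np.concatenate([y_t, y_s]))
    return k_star @ alpha
```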
Pattern Recognition and Image Analysis, 2011
Gaussian Processes are powerful tools in machine learning which offer wide applicability in regression and classification problems due to their non-parametric and non-linear behavior. However, one of their main drawbacks is the training time complexity, which scales cubically with the number of samples. Our work addresses this issue by combining Gaussian Processes with Randomized Decision Forests to enable fast learning. An important advantage of our method is its simplicity and the ability to directly control the trade-off between classification performance and computation speed. Experiments on an indoor place recognition task show that our method can handle large training sets in reasonable time while retaining good classification accuracy.
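Purely as an illustration of the speed/accuracy trade-off such a combination enables, here is a generic sketch that partitions the data with a decision tree and fits a small GP per leaf; the paper's actual coupling of random decision forests and Gaussian Processes may differ in detail.

```python
# Hedged sketch: tree-gated GPs. The tree depth controls leaf sizes and therefore the
# trade-off between GP training cost (cubic per leaf) and accuracy.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.gaussian_process import GaussianProcessRegressor

class TreeGatedGP:
    def __init__(self, max_depth=3):
        self.tree = DecisionTreeClassifier(max_depth=max_depth)
        self.leaf_models = {}

    def fit(self, X, y):
        """X: (n, d) features, y: (n,) labels in {0, 1}."""
        self.tree.fit(X, y)
        leaves = self.tree.apply(X)
        for leaf in np.unique(leaves):
            idx = leaves == leaf
            if len(np.unique(y[idx])) == 1:          # pure leaf: constant prediction suffices
                self.leaf_models[leaf] = float(y[idx][0])
            else:
                self.leaf_models[leaf] = GaussianProcessRegressor().fit(X[idx], y[idx])
        return self

    def predict_proba(self, X):
        out = np.empty(len(X))
        for i, (x, leaf) in enumerate(zip(X, self.tree.apply(X))):
            m = self.leaf_models[leaf]
            out[i] = m if isinstance(m, float) else m.predict(x[None, :])[0]
        return np.clip(out, 0.0, 1.0)
```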
Facade classification is an important subtask for automatically building large 3D city models. In the following we present an approach for pixelwise labeling of facade images using an efficient Randomized Decision Forest classifier and robust local color features. Experiments are performed with a popular facade dataset and a new, demanding dataset of pixelwise labeled images from the LabelMe project. Our method achieves high recognition rates and is significantly faster for training and testing than other methods based on costly feature transformation techniques.
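A minimal sketch of pixelwise labeling with a random forest on simple local color features (per-pixel RGB plus a local mean); the features and forest configuration used in the paper are richer, so this is only an illustrative stand-in.

```python
# Hedged sketch: per-pixel color features fed into a random forest for pixelwise labeling.
import numpy as np
from scipy.ndimage import uniform_filter
from sklearn.ensemble import RandomForestClassifier

def pixel_features(img):
    """img: (H, W, 3) float RGB image -> (H*W, 6) features: raw color + 9x9 local mean color."""
    local_mean = np.stack([uniform_filter(img[..., c], size=9) for c in range(3)], axis=-1)
    return np.concatenate([img, local_mean], axis=-1).reshape(-1, 6)

def train_pixel_classifier(images, label_maps):
    X = np.vstack([pixel_features(img) for img in images])
    y = np.concatenate([lbl.ravel() for lbl in label_maps])
    return RandomForestClassifier(n_estimators=50, n_jobs=-1).fit(X, y)
```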
We present an approach to generic object recognition with range information obtained using a Time-of-Flight camera and colour images from a visual sensor. Multiple sensor information is fused with Bayesian kernel combination using Gaussian processes (GP) and hyper-parameter optimisation. We study the suitability of approximate GP classification methods for such tasks and present and evaluate different image kernel functions for range and colour images. Experiments on a challenging dataset show that our approach significantly outperforms previous work, boosting the recognition rate from 78% to 88%.
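A minimal sketch of one way to realize such a kernel combination: a convex combination K = a * K_range + (1 - a) * K_color whose weight is chosen by the GP marginal likelihood on the training labels (using regression on +/-1 labels as a stand-in for GP classification); both the weighting scheme and this approximation are assumptions.

```python
# Hedged sketch: choose a sensor-fusion weight by GP log marginal likelihood.
import numpy as np

def log_marginal_likelihood(K, y, noise=0.1):
    n = len(y)
    L = np.linalg.cholesky(K + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

def best_fusion_weight(K_range, K_color, y, weights=np.linspace(0.0, 1.0, 11)):
    scores = [log_marginal_likelihood(a * K_range + (1 - a) * K_color, y) for a in weights]
    return weights[int(np.argmax(scores))]
```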
Due to the massive amounts of (labeled) data available on the web, a tremendous interest in large-scale machine learning methods has emerged in recent years. While most of the work in this new area of research has focused on fast and efficient classification algorithms, we show in this paper how other aspects of learning can also be covered using massive datasets.