Peter Tino - Academia.edu

Papers by Peter Tino

A Framework for Population-Based Stochastic Optimization on Abstract Riemannian Manifolds

We present Extended Riemannian Stochastic Derivative-Free Optimization (Extended RSDFO), a novel population-based stochastic optimization algorithm on Riemannian manifolds that addresses the locality and implicit assumptions of manifold optimization in the literature. We begin by investigating the Information Geometrical structure of statistical models over Riemannian manifolds. This establishes a geometrical framework for Extended RSDFO using both the statistical geometry of the decision space and the Riemannian geometry of the search space. We construct locally inherited probability distributions via an orientation-preserving diffeomorphic bundle morphism, and then extend the information geometrical structure to mixture densities over totally bounded subsets of manifolds. The former relates the information geometry of the decision space and the local point estimations on the search space manifold. The latter overcomes the locality of parametric probability distributions on Riemannian...

Feature relevance determination for ordinal regression in the context of feature redundancies and privileged information

Sparsification of core set models in non-metric supervised learning

Pattern Recognition Letters, 2019

Functional brain networks for learning predictive statistics

Cortex, 2017

Making predictions about future events relies on interpreting streams of information that may initially appear incomprehensible. This skill relies on extracting regular patterns in space and time by mere exposure to the environment (i.e. without explicit feedback). Yet, we know little about the functional brain networks that mediate this type of statistical learning. Here, we test whether changes in the processing and connectivity of functional brain networks due to training relate to our ability to learn temporal regularities. By combining behavioral training and functional brain connectivity analysis, we demonstrate that individuals adapt to the environment's statistics as they change over time from simple repetition to probabilistic combinations. Further, we show that individual learning of temporal structures relates to response strategy. Our fMRI results demonstrate that learning-dependent changes in fMRI activation within and functional connectivity between brain networks relate to individual variability in strategy. In particular, extracting the exact sequence statistics (i.e. matching) relates to changes in brain networks known to be involved in memory and stimulus-response associations, while selecting the most probable outcomes in a given context (i.e. maximizing) relates to changes in frontal and striatal networks. Thus, our findings provide evidence that dissociable brain networks mediate individual ability in learning behaviorally-relevant statistics.

Personalized Medication Response Prediction for Attention-Deficit Hyperactivity Disorder: Learning in the Model Space vs. Learning in the Data Space

Frontiers in Physiology, 2017

Attention-Deficit Hyperactivity Disorder (ADHD) is one of the most common mental health disorders amongst school-aged children with an estimated prevalence of 5% in the global population (American Psychiatric Association, 2013). Stimulants, particularly methylphenidate (MPH), are the first-line option in the treatment of ADHD (Reeves and Schweitzer, 2004; Dopheide and Pliszka, 2009) and are prescribed to an increasing number of children and adolescents in the US and the UK every year (Safer et al., 1996; McCarthy et al., 2009), though recent studies suggest that this is tailing off, e.g., Holden et al. (2013). Around 70% of children demonstrate a clinically significant treatment response to stimulant medication (Spencer et al., 1996; Schachter et al., 2001; Swanson et al., 2001; Barbaresi et al., 2006). However, it is unclear which patient characteristics may moderate treatment effectiveness. As such, most existing research has focused on investigating univariate or multivariate corre...

Unravelling socio-motor biomarkers in schizophrenia

Classifying Cognitive Profiles Using Machine Learning with Privileged Information in Mild Cognitive Impairment

Frontiers in Computational Neuroscience, 2016

Early diagnosis of dementia is critical for assessing disease progression and potential treatment. State-of-the-art machine learning techniques have been increasingly employed to take on this diagnostic task. In this study, we employed Generalized Matrix Learning Vector Quantization (GMLVQ) classifiers to discriminate patients with Mild Cognitive Impairment (MCI) from healthy controls based on their cognitive skills. Further, we adopted a "Learning with privileged information" approach to combine cognitive and fMRI data for the classification task. The resulting classifier operates solely on the cognitive data while it incorporates the fMRI data as privileged information (PI) during training. This novel classifier is of practical use as the collection of brain imaging data is not always possible with patients and older participants. MCI patients and healthy age-matched controls were trained to extract structure from temporal sequences. We ask whether machine learning class...

Does Money Matter in Inflation Forecasting?

Federal Reserve Bank of St. Louis, Research Division, Working Paper Series

This paper provides the most comprehensive evidence to date on whether or not monetary aggregates are valuable for forecasting US inflation in the early to mid 2000s. We explore a wide range of different definitions of money, including different methods of aggregation and different collections of included monetary assets. In our forecasting experiment we use two non-linear techniques, namely recurrent neural networks and kernel recursive least squares regression, techniques that are new to macroeconomics. Recurrent neural networks operate with potentially unbounded input memory, while the kernel regression technique is a finite memory predictor. The two methodologies compete to find the best fitting US inflation forecasting models and are then compared to forecasts from a naive random walk model. The best models were non-linear autoregressive models based on kernel methods. Our findings do not provide much support for the usefulness of monetary aggregates in forecasting inflation. Beyond its economic findings, our study is in the tradition of physicists' long-standing interest in the interconnections among statistical mechanics, neural networks, and related nonparametric statistical methods, and suggests potential avenues of extension for such studies.

Intelligent Data Engineering and Automated Learning - IDEAL 2007, 8th International Conference, Birmingham, UK, December 16-19, 2007, Proceedings

Volatility Trading via Temporal Pattern Recognition in Quantized Financial Time Series

We investigate the potential of the analysis of noisy non-stationary time series by quantizing it into streams of discrete symbols and applying finite-memory symbolic predictors. Careful quantization can reduce the noise in the time series to make model estimation more amenable. We apply the quantization strategy in a realistic setting involving financial forecasting and trading. In particular, using historical data, we simulate the trading of straddles on the financial indexes DAX and FTSE 100 on a daily basis, based on predictions of the daily volatility differences in the underlying indexes.
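
As a rough illustration of the quantize-then-predict idea in this abstract, the sketch below maps a real-valued series to discrete symbols and fits a fixed-order Markov model by counting. It is only a minimal sketch under stated assumptions: the function names (`quantize`, `fit_markov`, `predict`), the single cut point at zero, the order-1 context and the toy data are all hypothetical choices for illustration, not the quantization scheme or predictors used in the paper.

```python
from bisect import bisect_right
from collections import Counter, defaultdict


def quantize(values, cut_points):
    """Map each real value to a symbol index in 0..len(cut_points)."""
    return [bisect_right(cut_points, v) for v in values]


def fit_markov(symbols, order):
    """Count which symbol follows each context of length `order`."""
    counts = defaultdict(Counter)
    for i in range(order, len(symbols)):
        context = tuple(symbols[i - order:i])
        counts[context][symbols[i]] += 1
    return counts


def predict(counts, context):
    """Return the most frequent successor of `context`, or None if unseen."""
    successors = counts.get(tuple(context))
    if not successors:
        return None
    return successors.most_common(1)[0][0]


# Toy data (made up): day-to-day changes in some volatility estimate.
changes = [0.4, -0.3, 0.5, -0.2, 0.6, -0.4, 0.3, -0.5]
symbols = quantize(changes, cut_points=[0.0])  # 0 = decrease, 1 = increase
model = fit_markov(symbols, order=1)
```

With a finite alphabet and a short context, the model is just a table of counts, which is one way to read the abstract's remark that quantization makes model estimation more amenable.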

Large Scale Indefinite Kernel Fisher Discriminant

Similarity-Based Pattern Recognition, 2015

Indefinite similarity measures can be frequently found in bio-informatics by means of alignment scores. Lacking an underlying vector space, the data are given as pairwise similarities only. Indefinite Kernel Fisher Discriminant (iKFD) is a very effective classifier for this type of data but has cubic complexity and does not scale to larger problems. Here we propose an extension of iKFD such that linear runtime and memory complexity is achieved for low rank indefinite kernels. Evaluation on several larger similarity data sets from various domains shows that the proposed method provides similar generalization capabilities while being substantially faster for large scale data.

Oversampling the Minority Class in the Feature Space

IEEE Transactions on Neural Networks and Learning Systems, 2016

Comparison of Echo State Networks with Simple Recurrent Networks and Variable-Length Markov Models on Symbolic Sequences

Lecture Notes in Computer Science, 2007

Introducing a Star Topology into Latent Class Models for Collaborative Filtering

IFIP International Federation for Information Processing, 2004

Latent class models (LCM) represent the high dimensional data in a smaller dimensional space in terms of latent variables. They are able to automatically discover the patterns from the data. We present a topographic version of two LCMs for collaborative filtering and apply the models to a large collection of user ratings for films. Latent classes are topologically organized on a "star-like" structure. This makes orientation in rating patterns captured by latent classes easier and more systematic. The variation in film rating patterns is modelled by multinomial and binomial distributions with varying independence assumptions.

Keywords: collaborative filtering, latent variable models, visualization

Towards Self-Aware Service Composition

2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), 2014

Service-based applications are typically composed of web services, which are selected from a pool of services and/or a cloud market. Selection and composition decisions tend to encounter numerous uncertainties: service consumers and applications have little control over these services and tend to be uncertain about their level of support for the desired functionalities and non-functionalities. We contribute to an "intelligent" framework for selecting and composing services. The framework is grounded on the premise of computational self-awareness to inform decisions of selecting and composing the services to meet both behavioral and functional requirements. The framework provides the primitives for fine-grained representation of knowledge and levels of self-awareness for time, goal, interaction and stimuli. We have used volunteer service computing as an example to demonstrate the benefits that self-awareness can introduce to self-adaptation.

The Benefits of Modeling Slack Variables in SVMs

Neural Computation, 2015

In this letter, we explore the idea of modeling slack variables in support vector machine (SVM) approaches. The study is motivated by SVM+, which models the slacks through a smooth correcting function that is determined by additional (privileged) information about the training examples not available in the test phase. We take a closer look at the meaning and consequences of smooth modeling of slacks, as opposed to determining them in an unconstrained manner through the SVM optimization program. To better understand this difference we only allow the determination and modeling of slack values on the same information, that is, using the same training input in the original input space. We also explore whether it is possible to improve classification performance by combining (in a convex combination) the original SVM slacks with the modeled ones. We show experimentally that this approach not only leads to improved generalization performance but also yields more compact, lower-complexity m...

Learning the deterministically constructed Echo State Networks

2014 International Joint Conference on Neural Networks (IJCNN), 2014

Echo State Networks (ESNs) have shown great promise in applications of non-linear time series processing because of their powerful computational ability and efficient training strategy. However, the randomization in the structure of the reservoir causes it to be poorly understood and leaves room for further improvements for specific problems. A deterministically constructed reservoir model, Cycle Reservoir with Jumps (CRJ), shows superior generalization performance to the standard ESN. However, the weights that govern the structure of the reservoir (reservoir weights) in the CRJ model are obtained through an exhaustive grid search, which is computationally very intensive. In this paper, we propose to learn the reservoir weights together with the linear readout weights using a hybrid optimization strategy. The reservoir weights are trained through nonlinear optimization techniques while the linear readout weights are obtained through linear algorithms. The experimental results demonstrate that the proposed strategy of training the CRJ network tremendously improves the computational efficiency without jeopardizing the generalization performance, sometimes even with better generalization performance.
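
To make the phrase "deterministically constructed reservoir" more concrete, here is a minimal sketch of a cycle-with-jumps weight matrix governed by just two shared weights, which is what makes learning them (rather than grid-searching) plausible. The abstract does not spell out the construction, so the layout below is an assumption: a unidirectional cycle plus bidirectional jumps between units `jump_size` apart. The names `crj_reservoir`, `step`, `cycle_weight`, `jump_weight` and `jump_size`, and the tanh state update, are hypothetical choices for illustration and may differ from the paper's exact definition.

```python
import math


def crj_reservoir(n_units, cycle_weight, jump_weight, jump_size):
    """Build W (n_units x n_units); W[i][j] is the weight from unit j to unit i.

    Assumes jump_size > 1 so that jump links do not overwrite cycle links.
    """
    W = [[0.0] * n_units for _ in range(n_units)]
    # Unidirectional cycle: unit i feeds unit i + 1, wrapping around.
    for i in range(n_units):
        W[(i + 1) % n_units][i] = cycle_weight
    # Bidirectional jumps between units jump_size apart (assumed layout).
    for k in range(n_units // jump_size):
        a = k * jump_size
        b = (a + jump_size) % n_units
        W[a][b] = jump_weight
        W[b][a] = jump_weight
    return W


def step(W, input_weights, state, u):
    """One reservoir update x <- tanh(W x + v u) for a scalar input u."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, state)) + v * u)
        for row, v in zip(W, input_weights)
    ]


W = crj_reservoir(n_units=6, cycle_weight=0.7, jump_weight=0.4, jump_size=2)
state = step(W, input_weights=[0.5] * 6, state=[0.0] * 6, u=1.0)
```

Because every cycle link shares one weight and every jump link shares another, the whole reservoir is described by two scalars, so a nonlinear optimizer has a very small search space, while the readout can still be fitted by a linear method on the collected states.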

Recent Trends in Learning of structured and non-standard data

In many application domains data are not given in a classical vector space but occur in the form of structural, sequential, relational characteristics or other non-standard formats. These data are often represented as graphs or by means of proximity matrices. Often these data sets are also huge and mathematically complicated to treat, calling for new efficient analysis algorithms, which are the focus of this tutorial.

Modeling Complex Symbolic Sequences with Neural Based Systems

Artificial Neural Nets and Genetic Algorithms, 1998

A Kernel-Based Approach to Estimating Phase Shifts Between Irregularly Sampled Time Series: An Application to Gravitational Lenses

Lecture Notes in Computer Science, 2006
