Jorge Díez | University of Oviedo / Universidad de Oviedo
Papers by Jorge Díez
Neural Computing and Applications
Navigation through large volumes of images is a complex and tedious task that requires tools to facilitate the exploration and discovery of visual information. Photo summaries are one of these tools, which consist of selecting a reduced set of images that best represent the original data source. However, creating photo summaries in the context of recommender systems poses several challenges: How to select the most relevant images for each item? How to encode each image? How to evaluate the quality of the generated summary? In this manuscript, we propose a clustering-based method to create a visual summary in the context of a restaurant recommender system, which includes the photos taken by users who visited the restaurants (items) in a given city. These photos are encoded using a deep neural network that takes into account not only their content but also the relationships between users and restaurants. This encoding will allow us to create a visual summary that captures the essence ...
Recommender systems have proven their usefulness both for companies and customers. The former increase their sales and the latter get a more satisfying shopping experience. These systems can benefit from the advent of explainable artificial intelligence, since a well-explained recommendation will be more convincing and may broaden the customer’s purchasing options. Many approaches offer justifications for their recommendations based on the similarity (in some sense) between users, past purchases, etc., which require some knowledge of the users. In this paper we present a recommender system with explanatory capabilities which is able to deal with the so-called cold-start problem, since it does not require any previous knowledge of the user. Our method learns the relationship between the products and some relevant words appearing in the textual reviews written by previous customers for those products. Then, starting from the textual query of a user’s request for recommendation, our ap...
The reduction of energy consumption in buildings is one of the goals to improve energy efficiency. One way to achieve energy savings in buildings is to develop intelligent control strategies for heating systems that are able to reduce power consumption without affecting the thermal comfort. An intelligent control system must be able to predict the temperature of the building in order to manage the heating system. In this paper, we present a rule-based model that is able to predict the indoor temperature for different values of k (hours ahead in time). The model has been learned with FRULER, a genetic fuzzy system that generates accurate and simple knowledge bases. Our approach has been validated with real data from a residential college.
Learning tasks where the set Y of classes has an ordering relation arise in a number of important application fields. In this context, the loss function may be defined in different ways, ranging from multiclass classification to ordinal or metric regression. However, to consider only the ordered structure of Y, a measure of goodness of a hypothesis h has to be related to the number of pairs whose relative ordering is swapped by h. In this paper, we present a method, based on the use of a multivariate version of Support Vector Machines (SVM), that learns to order by minimizing the number of swapped pairs. Finally, using benchmark datasets, we compare the scores so achieved with those found by other alternative approaches.
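The "number of swapped pairs" criterion mentioned in this abstract can be illustrated with a short sketch. This is not the authors' SVM method, only a plain counter for the quantity being minimized. The function name `swapped_pairs` is hypothetical, and treating score ties as "not swapped" is an assumption made here for simplicity.

```python
from itertools import combinations


def swapped_pairs(y_true, scores):
    """Count pairs whose order under `scores` contradicts their order in `y_true`.

    Returns (swapped, comparable), where `comparable` is the number of pairs
    with different true classes. Score ties are not counted as swapped
    (an assumption of this sketch).
    """
    swapped = 0
    comparable = 0
    for i, j in combinations(range(len(y_true)), 2):
        if y_true[i] == y_true[j]:
            continue  # same class: there is no ordering to violate
        comparable += 1
        # Opposite signs mean the hypothesis ranks the pair the wrong way round.
        if (y_true[i] - y_true[j]) * (scores[i] - scores[j]) < 0:
            swapped += 1
    return swapped, comparable


# Four objects with ordered classes 1 < 2 < 3 and the scores given by some hypothesis h.
result = swapped_pairs([1, 2, 3, 3], [0.1, 0.5, 0.4, 0.9])
```

Dividing the first value by the second gives a pairwise error rate that depends only on the ordering of the classes, not on their numeric labels.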
Progress in Artificial Intelligence, 2012
The goal of multilabel (ML) classification is to induce models able to tag objects with the labels that better describe them. The main baseline for ML classification is binary relevance (BR), which is commonly criticized in the literature because of its label independence assumption. Despite this fact, this paper discusses some interesting properties of BR, mainly that it produces optimal models for several ML loss functions. Additionally, we present an analytical study of ML benchmark datasets and point out some shortcomings. As a result, this paper proposes the use of synthetic datasets to better analyze the behavior of ML methods in domains with different characteristics. To support this claim, we perform some experiments using synthetic data proving the competitive performance of BR with respect to a more complex method in difficult problems with many labels, a conclusion which was not stated by previous studies.
Lecture Notes in Computer Science, 2004
The quality of food can be assessed from different points of view. In this paper, we deal with those aspects that can be appreciated through sensory impressions. When we are aiming to induce a function that maps object descriptions into ratings, we must consider that consumers' ratings are just a way to express their preferences about the products presented in the same testing session. Therefore, we propose learning from consumers' preference judgments instead of using an approach based on regression. This requires the use of special purpose kernels and feature subset selection methods. We illustrate the benefits of our approach in two families of real-world databases.
Lecture Notes in Computer Science, 2008
We present nondeterministic hypotheses learned from an ordinal regression task. They try to predict the true rank for an entry, but when the classification is uncertain the hypotheses predict a set of consecutive ranks (an interval). The aim is to keep the set of ranks as small as possible, while still containing the true rank. The justification for learning such a hypothesis is based on a real world problem that arose in beef cattle breeding. After defining a family of loss functions inspired by Information Retrieval, we derive an algorithm for minimizing them. The algorithm is based on posterior probabilities of ranks given an entry. A couple of implementations are compared: one based on a multiclass SVM and another based on Gaussian processes designed to minimize the linear loss in ordinal regression tasks.
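To make the idea of predicting an interval of consecutive ranks concrete, here is a minimal sketch. It does not reproduce the loss functions or the algorithm of the paper; it substitutes a simpler rule, chosen for illustration only: return the shortest run of consecutive ranks whose posterior mass reaches a threshold. The function name `smallest_interval` and the `coverage` parameter are assumptions of this sketch.

```python
def smallest_interval(posteriors, coverage=0.75):
    """Return (lo, hi), the shortest run of consecutive rank indices whose
    total posterior probability is at least `coverage`.

    Among runs of equal length, the one with the largest mass is chosen.
    If no run reaches the threshold, the full range of ranks is returned.
    """
    n = len(posteriors)
    for width in range(1, n + 1):
        best = None
        for lo in range(n - width + 1):
            mass = sum(posteriors[lo:lo + width])
            if mass >= coverage and (best is None or mass > best[0]):
                best = (mass, lo, lo + width - 1)
        if best is not None:
            return best[1], best[2]
    return 0, n - 1


# A confident posterior yields a single rank; an uncertain one yields an interval.
confident = smallest_interval([0.02, 0.9, 0.08])
uncertain = smallest_interval([0.05, 0.1, 0.5, 0.3, 0.05])
```

Raising `coverage` widens the predicted intervals, which trades precision (small sets) for the chance of containing the true rank.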
Twenty-first international conference on Machine learning - ICML '04, 2004
In this paper we tackle a real world problem, the search for a function to evaluate the merits of beef cattle as meat producers. The independent variables represent a set of live animals' measurements, while the outputs cannot be captured with a single number, since the available experts tend to assess each animal in a relative way, comparing animals with the other partners in the same batch. Therefore, this problem cannot be solved by means of regression methods; our approach is to learn the preferences of the experts when they order small groups of animals. Thus, the problem can be reduced to a binary classification, and can be dealt with using a Support Vector Machine (SVM) improved with the use of a feature subset selection (FSS) method. We develop a method based on Recursive Feature Elimination (RFE) that employs an adaptation of a metric based method devised for model selection (ADJ). Finally, we discuss the extension of the resulting method to more general settings, and provide a comparison with other possible alternatives.
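The reduction of preference learning to binary classification can be sketched as follows: each judgment "v is preferred to u" becomes the difference vector v - u labeled +1 and its negation labeled -1, and a linear classifier w on those differences yields a utility function f(x) = w · x. The sketch below uses a plain perceptron instead of the SVM with feature selection described in the abstract, purely to stay dependency-free; the function names and the toy data are hypothetical.

```python
def pairs_to_binary(preferences):
    """Turn (better, worse) pairs into labeled difference vectors."""
    examples = []
    for better, worse in preferences:
        diff = [b - w for b, w in zip(better, worse)]
        examples.append((diff, +1))
        examples.append(([-d for d in diff], -1))
    return examples


def train_perceptron(examples, epochs=100):
    """Fit a linear separator through the origin (a stand-in for the SVM)."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        mistakes = 0
        for x, y in examples:
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break  # every preference pair is classified correctly
    return w


def utility(w, x):
    """Score an object; higher means more preferred."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Toy judgments (made up for illustration): each tuple is (preferred, other).
preferences = [([3, 1], [1, 1]), ([2, 0], [2, 2]), ([4, 3], [1, 2])]
w = train_perceptron(pairs_to_binary(preferences))
```

Because the classifier acts on differences, the learned `utility` can then rank any set of objects, including ones that never appeared in a judgment.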
Lecture Notes in Computer Science, 2009
From a multi-class learning task, in addition to a classifier, it is possible to infer some useful knowledge about the relationship between the classes involved. In this paper we propose a method to learn a hierarchical clustering of the set of classes. The usefulness of such clusterings has been exploited in bio-medical applications to find out relations between diseases or populations of animals. The method proposed here defines a distance between classes based on the margin maximization principle, and then builds the hierarchy using a linkage procedure. Moreover, to quantify the goodness of the hierarchies we define a measure. Finally, we present a set of experiments comparing the scores achieved by our approach with other methods.
investigaciones …, 2007
In this work we build a model for determining the factors that influence the future profitability of a company through a preference-based approach. This approach overcomes the limitations of regression models and classification systems. The results obtained on a sample of 1,745 companies indicate that it is more effective to adopt strategies that increase the margin, mainly through price increases, than to increase asset turnover. It is worth noting that the resulting model achieves much higher accuracy than others based on regression.
Trends in Food Science & Technology, 2007
In this paper we discuss how to model preferences from a collection of ratings provided by a panel of consumers of some kind of food product. We emphasize the role of tasting sessions, since the ratings tend to be relative to each session and hence regression methods are unable to capture consumer preferences. The method proposed is based on the use of Support Vector Machines (SVM) and provides both linear and nonlinear models. To illustrate the performance of the approach, we report the experimental results obtained with a couple of real world datasets.
Trends in Food Science & Technology, 2001
In this paper we advocate the application of Artificial Intelligence techniques to quality assessment of food products. Machine Learning algorithms can help us to: a) extract operative human knowledge from a set of examples; b) conclude interpretable rules for classifying samples regardless of the non-linearity of the human behaviour or process; and c) ascertain the degree of influence of each objective attribute of the assessed food on the final decision of an expert. We illustrate these topics with an example of how it is possible to clone the behaviour of bovine carcass classifiers, leading to possible further industrial applications.
When we are learning people’s preferences the training material can be expressed as in regression problems: the description of each object is then followed by a number that assesses the degree of satisfaction. Alternatively, training examples can be represented by preference judgments: pairs of vectors (v, u) where someone expresses that he or she prefers v to u. Usually, obtaining preference information may be easier and more natural than obtaining the labels needed for a classification or regression approach. Moreover, this type of information is more accurate, since people tend to rate their preferences in a relative way, comparing objects with the other partners in the same batch.
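The two representations described here can be related with a small sketch: numeric ratings are converted into preference judgments (v, u), but only between objects rated in the same batch, since ratings from different batches are not assumed to be comparable. The function name `session_ratings_to_pairs`, the data layout (a list of sessions, each a list of `(description, rating)` tuples), and the choice to skip ties are assumptions of this illustration.

```python
from itertools import combinations


def session_ratings_to_pairs(sessions):
    """Build preference judgments (v, u), meaning "v is preferred to u".

    Pairs are formed only within a session; tied ratings express no
    preference and are skipped.
    """
    pairs = []
    for ratings in sessions:
        for (x_a, r_a), (x_b, r_b) in combinations(ratings, 2):
            if r_a > r_b:
                pairs.append((x_a, x_b))
            elif r_b > r_a:
                pairs.append((x_b, x_a))
    return pairs


# Two batches with made-up one-feature descriptions and ratings.
sessions = [
    [((1.0,), 7), ((2.0,), 5)],
    [((3.0,), 3), ((4.0,), 3), ((5.0,), 4)],
]
pairs = session_ratings_to_pairs(sessions)
```

Note that the object rated 5 in the first batch is never compared with the object rated 4 in the second one: the numbers are only meaningful relative to the other objects in the same batch.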