Rodolfo Bojorque | Universidad Politécnica de Madrid
Papers by Rodolfo Bojorque
Information is now an essential asset for organizations. In Ecuador, companies of all kinds have begun to make various types of investments to protect it. However, investment does not always go hand in hand with security, because most solutions tend to be ad hoc initiatives that do not reflect the reality, policies, mission, and vision of the organization. It is therefore essential to understand information security as a system in which the solution as a whole is always greater than the sum of the solutions to its parts.
The expansion of recommender systems to the commercial and industrial level has allowed a rapid evolution of techniques, methods, and algorithms. Initially, research focused on improving the quality of predictions; however, significant challenges remain, such as generating models able to work with large volumes of information. Data sparsity is a challenge for recommender systems. Clustering and explanation of recommendations are growing research fields in the recommender systems area. Model-based recommender systems provide more accurate predictions and recommendations and more scalable results, and they better address the problem of sparsity. The model most widely adopted by modern recommender systems is matrix factorization and its derived techniques. This thesis presents a comprehensive study of the state-of-the-art research and proposes a Bayesian non-negative matrix factorization (BNMF) method to improve the current clustering results in the collaborative filtering area. We also provide an innovative pre-clustering algorithm adapted to the proposed probabilistic method. Results obtained using several open datasets show: 1) a conclusive improvement in clustering quality when BNMF is used, compared to classical matrix factorization or to the improved k-means results, 2) higher prediction accuracy using matrix factorization based methods than using improved k-means, and 3) better BNMF execution times compared to those of classical matrix factorization, and an additional improvement when using the proposed pre-clustering algorithm.
Advances in intelligent systems and computing, Jun 11, 2019
Random walk sampling is an important method for analyzing any kind of network; it allows the network's state to be known at any time, independently of the node from which the random walk starts. In this work, we implemented a random walk of this type on a Markov chain network using the Metropolis-Hastings Random Walk algorithm. This algorithm is an efficient sampling method because it ensures that all nodes can be sampled with uniform probability. We determined the number of rounds a random walk requires to reach the steady state of the network system. We concluded that, to determine the correct number of rounds with which the system reaches the steady state, it is necessary to start the random walk from different nodes, selected analytically, looking especially for nodes that may produce critical random walks.
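The uniform-sampling property mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `mhrw`, the toy graph `adj`, and the number of steps are assumptions chosen for illustration. The sketch uses the standard Metropolis-Hastings acceptance rule for a uniform target on an undirected graph, where a move from node u to a neighbor v is accepted with probability min(1, deg(u)/deg(v)).

```python
import random
from collections import Counter


def mhrw(adj, start, steps, rng):
    """Metropolis-Hastings random walk whose target distribution is uniform over nodes.

    adj   -- dict mapping each node to the list of its neighbors (undirected graph)
    start -- node where the walk begins
    steps -- number of rounds to simulate
    rng   -- random.Random instance, so that runs are reproducible
    """
    current = start
    visits = Counter()
    for _ in range(steps):
        candidate = rng.choice(adj[current])
        # Accept the move with probability min(1, deg(current) / deg(candidate));
        # otherwise stay put. Rejections correct the bias of a plain random walk
        # towards high-degree nodes.
        accept = min(1.0, len(adj[current]) / len(adj[candidate]))
        if rng.random() < accept:
            current = candidate
        visits[current] += 1
    return visits


# Hypothetical toy graph (connected, undirected) with degrees 3, 2, 2 and 1.
adj = {
    0: [1, 2, 3],
    1: [0, 2],
    2: [0, 1],
    3: [0],
}

if __name__ == "__main__":
    steps = 100_000
    visits = mhrw(adj, start=3, steps=steps, rng=random.Random(0))
    for node in sorted(adj):
        print(node, round(visits[node] / steps, 3))
```

With enough rounds, the visit frequencies approach 1/4 for every node regardless of degree. Repeating the run from different start nodes is one simple way to probe how many rounds are needed before the frequencies stabilize.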
Ingenius, 2021
Human talent management is a fundamental factor in the success of organizations. The inclusion of people with disabilities in the workplace has helped to strengthen their qualities and make use of their talent. Many human talent management systems lack guidelines for recruiting and selecting a person with a disability. For this reason, this work presents a study of these two processes, indicating the factors that influence whether or not a position is assigned. For each candidate, the level and type of disability, level of education, experience, and training are considered, among other aspects. The work focuses on applying supervised learning techniques to classify a candidate with a disability as suitable or not for a job, and unsupervised learning techniques such as clustering, which helps to identify hidden patterns in the data, if any. The result obtained from the study ...
Advances in Intelligent Systems and Computing
This work demonstrates how curricular design processes at a graduate level, that is, those aimed at the academic offer leading to a third-level professional degree, are positively affected by adapting and implementing agile methodologies that are generally applied to the software product design process. This represents a considerable reduction of time and improves the sequential effectiveness of the process. The study considers 712 undergraduate programs from 30 higher education institutions in Ecuador that, based on the applicable legislation, had to redesign their entire academic offer within a set period. As a fundamental contribution, the methodological model of agile curricular design adopted by the Politecnica Salesiana University of Ecuador is described. Its results show that 96% of its programs achieved this goal in the established period, higher than the average effectiveness rate of the other higher education institutions, which was 69.85%.
This book is a compilation of research topics in the field of Computer Science that address current issues in Ecuador, providing strategies for solving problems identified in three fundamental areas: Geographic Information Systems, Information Security Management Systems, and General Information Systems. During the research process, an important comparison of open-source ETL (Extract-Transform-Load) tools is carried out, together with an analysis of their advantages and disadvantages; the different project management process models are represented using the Software & Systems Process Engineering Metamodel (SPEM) 2.0 specification. The second part of this work focuses on topics related to information security. It is based on the importance of a company's information as a fundamental asset...
Scientific documentation research leads to the computation of large amounts of information from published works of the scientific community. It is necessary to explain these processes and create application frameworks. This paper provides the following: a) an information system designed to extract scientific information from published papers, b) accurate explanations of the main processing stages, including data mining, natural language processing, and machine learning, and c) categorized and explained results from the Artificial Intelligence case study. The results in this paper include the following: a) topic and research area rankings and b) quantity versus quality comparisons of topics and research areas.
Quality management systems are a challenge for higher education centers. There are now different management systems (for quality, environment, information security, and so on) that can be applied to education centers, but implementing all of them does not guarantee educational quality, because the educational process is very complex. In recent years, quality management systems for higher education centers have been gaining importance, especially in Europe and North America, although in Latin America the field remains unexplored. Quality in higher education centers is a complex problem because it is difficult to measure: there are many academic processes, such as enrollment, matriculation, and teaching-learning; many stakeholders, such as students, teachers, authorities, and even society; and many locations, such as campuses, buildings, and laboratories, with different resources. Each process generates many records and documents. This information has a varied na...
Nowadays, for companies and organizations, having a “web page”, “website”, “web portal”, or whatever you prefer to call it, is no longer a purely informational matter; it is now an institutional matter of “image” and “prestige”.
This work shows the behavior of similarity metrics on sparse data for recommender systems (RS). Clustering in RS is an important technique for forming groups of users or items in order to personalize and optimize recommendations. Most clustering techniques try to minimize the Euclidean distance between the samples and their centroid, but this approach has a drawback on sparse data because it treats a missing value as zero. We propose a comparative analysis of similarity metrics such as Pearson Correlation, Jaccard, Mean Square Difference, Jaccard Mean Square Difference, and Mean Jaccard Difference as an alternative to Euclidean distance. Our work shows results for the FilmTrust and MovieLens 100K datasets, both of which are free, public, and highly sparse. We show that using similarity measures gives better accuracy in terms of Mean Absolute Error and Within-Cluster on sparse data.
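The drawback of zero-filled Euclidean distance, and the way co-rated similarity measures avoid it, can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names, the 1 to 5 rating scale, and the two example users are assumptions. The JMSD form used here, Jaccard multiplied by (1 - MSD) on ratings normalized to [0, 1], is the commonly cited definition and is assumed rather than taken from the abstract.

```python
import math


def jaccard(u, v):
    """Share of items rated by both users among the items rated by either user."""
    union = u.keys() | v.keys()
    return len(u.keys() & v.keys()) / len(union) if union else 0.0


def msd(u, v, r_min=1.0, r_max=5.0):
    """Mean squared difference over co-rated items, with ratings normalized to [0, 1].

    Returns None when the users have no item in common.
    """
    common = u.keys() & v.keys()
    if not common:
        return None
    span = r_max - r_min
    return sum(((u[i] - v[i]) / span) ** 2 for i in common) / len(common)


def jmsd(u, v, r_min=1.0, r_max=5.0):
    """Assumed JMSD form: Jaccard * (1 - MSD); 0.0 when there are no co-rated items."""
    d = msd(u, v, r_min, r_max)
    return 0.0 if d is None else jaccard(u, v) * (1.0 - d)


def euclidean_zero_filled(u, v):
    """Euclidean distance when every missing rating is treated as 0."""
    items = u.keys() | v.keys()
    return math.sqrt(sum((u.get(i, 0.0) - v.get(i, 0.0)) ** 2 for i in items))


# Hypothetical users: they agree exactly on the two items they both rated.
alice = {"item_a": 5.0, "item_b": 3.0}
bob = {"item_a": 5.0, "item_b": 3.0, "item_c": 4.0, "item_d": 4.0}

if __name__ == "__main__":
    print("zero-filled Euclidean distance:", euclidean_zero_filled(alice, bob))
    print("Jaccard:", jaccard(alice, bob))
    print("MSD on co-rated items:", msd(alice, bob))
    print("JMSD:", jmsd(alice, bob))
```

In this example the two users agree on every co-rated item, so MSD is 0, yet the zero-filled Euclidean distance is large because the unrated items count as ratings of 0. Jaccard separately captures how much the rated item sets overlap.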
IEEE Access
Recommendation to a group of users is a major challenge for collaborative filtering. Recommendations to groups of users arise from the convenience of being able to recommend products or services that satisfy the entire group. In this paper, we propose the similarity measure SMGU, tailored for collaborative filtering recommendations to groups of users. This similarity measure combines both numerical and non-numerical information. Numerical information is weighted according to the rating singularity of the group members. This paper relies on the assumption that the singularity of the ratings cast by the users of the group is relevant information for finding suitable neighbors. For each item, we consider a rating to be singular for a group or for a user when that rating differs from the majority of the ratings cast by the other users. Non-numerical structural information can be considered valuable for matching group preferences with neighbors' preferences. Experiments have been run using open recommender systems data sets. Compared with representative baselines, results show accuracy improvements when the proposed method is used. Additionally, this paper provides a section devoted to the reproducibility of the experiments. Finally, this paper opens opportunities to face new challenges in recommendation to a group of users: explanation of recommendations, determination of reliability measures, and improvement of accuracy, novelty, and diversity results.
INDEX TERMS Recommendation to groups, group of users, collaborative filtering, recommender systems, singularity.
I. INTRODUCTION This section is divided into three subsections: 1) Fundamental concepts of RS: recommendation to individual users, 2) Recommendation to groups of users: objectives and particularities, and 3) General explanation and motivation of the proposed method for recommending to groups of users.
A. RECOMMENDATIONS TO INDIVIDUAL USERS Recommender Systems (RS) [1], [2] help to mitigate part of the Internet information overload problem. From the point of view of an RS user, the system automatically recommends, based on the user's past preferences, a series of items (movies, books, music, electronics, clothing, etc.) that are available and that the user has not consumed. An RS can make recommendations based on various types of information sources; the most common ones are content-based, demographic, collaborative, social, and context-aware.
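The notion of a singular rating stated in the abstract (a rating that differs from the majority of the ratings cast by the other users for the same item) can be sketched in a few lines. This is only an illustration of that one sentence, not the SMGU measure itself: the function names and the example ratings are hypothetical, and how SMGU weights singularity is not reproduced here.

```python
def differing_fraction(rating, other_ratings):
    """Fraction of the other users' ratings for an item that differ from `rating`."""
    if not other_ratings:
        return 0.0
    return sum(1 for r in other_ratings if r != rating) / len(other_ratings)


def is_singular(rating, other_ratings):
    """True when `rating` differs from more than half of the other ratings."""
    return differing_fraction(rating, other_ratings) > 0.5


# Hypothetical ratings cast by the other users for one item.
others = [5, 5, 4, 5]

if __name__ == "__main__":
    print(is_singular(1, others))  # a 1 differs from all four other ratings
    print(is_singular(5, others))  # a 5 differs from only one of them
```

A rating of 1 against these other ratings is singular under this reading, whereas a rating of 5 is not, since it matches three of the four.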
IEEE Access
Recommender systems present a high level of sparsity in their ratings matrices. The sparsity of collaborative filtering data makes it difficult to: 1) compare elements using memory-based solutions; 2) obtain precise models using model-based solutions; 3) get accurate predictions; and 4) properly cluster elements. We propose the use of a Bayesian non-negative matrix factorization (BNMF) method to improve the current clustering results in the collaborative filtering area. We also provide an original pre-clustering algorithm adapted to the proposed probabilistic method. Results obtained using several open data sets show: 1) a conclusive improvement in clustering quality when BNMF is used, compared with classical matrix factorization or with the improved KMeans results; 2) higher prediction accuracy using matrix factorization-based methods than using improved KMeans; and 3) better BNMF execution times compared with those of classical matrix factorization, and an additional improvement when using the proposed pre-clustering algorithm.
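The sparsity of a ratings matrix referred to in this abstract is usually quantified as the fraction of user-item pairs that have no rating. A minimal sketch, assuming ratings are stored as a dictionary keyed by (user, item); the function name and the toy data are hypothetical:

```python
def sparsity(ratings, n_users, n_items):
    """Fraction of the n_users x n_items ratings matrix that is empty.

    ratings -- dict mapping (user, item) pairs to a rating value
    """
    return 1.0 - len(ratings) / (n_users * n_items)


# Hypothetical toy data: 3 users, 4 items, and only 3 known ratings.
ratings = {
    ("u1", "i1"): 4.0,
    ("u2", "i3"): 2.0,
    ("u3", "i4"): 5.0,
}

if __name__ == "__main__":
    print(sparsity(ratings, n_users=3, n_items=4))
```

Here 3 of the 12 cells are filled, so the sparsity is 0.75.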
Revista española de Documentación Científica
Research in the field of scientific documentation leads to the automatic processing of large amounts of information from the works published by the scientific community. It is necessary to explain these processes and to create systems that carry them out. This article provides: a) an information system designed to extract scientific information from the text of published articles, b) explanations of the fundamental processing stages: data mining, natural language processing, and machine learning, and c) categorized and explained results from our case study, the Artificial Intelligence area. The results of this article include: a) a ranking of topics and a ranking of research areas, and b) a comparison between quantity and quality of the topics and research areas.