Fátima Rodrigues - Academia.edu

Papers by Fátima Rodrigues

ChatBot for student service based on RASA framework

Research Square, Apr 5, 2023

Face-to-face attendance at the School's Administrative Services for Students is limited to a single schedule, which may prevent the timely clarification of students' questions and lower their satisfaction. To solve this problem, a conversational agent was designed, consisting of a Portuguese language interpretation module that uses natural language processing and machine learning techniques. To keep the system abstracted from any technical dependency, a web service that manages the agent's knowledge base was developed. In the evaluation of the solution, the performance of several learning models was compared, and the results emphasize the superiority of Google's BERT language model combined with the DIET classifier, which obtained an F1-score of 0.965. The system was implemented as a prototype and, for a total of 256 questions, around 70% of the responses were correct, with a positive average satisfaction rating of 4.20 on a 0-5 scale.
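The abstract reports an F1-score but does not say how it was averaged across intents. As a minimal sketch, assuming a macro average over hypothetical intent labels (`enrol`, `fees`, `exams` are invented for illustration), the metric could be computed like this:

```python
def f1_for_class(true_labels, predicted_labels, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    classes = sorted(set(true_labels))
    scores = [f1_for_class(true_labels, predicted_labels, c) for c in classes]
    return sum(scores) / len(scores)


# Hypothetical intents for six student questions.
true_intents = ["enrol", "enrol", "fees", "fees", "exams", "exams"]
pred_intents = ["enrol", "fees", "fees", "fees", "exams", "exams"]
print(round(macro_f1(true_intents, pred_intents), 4))
```

A weighted or micro average would give a different number on the same predictions, so the reported 0.965 should not be read against this sketch.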

A Deep Learning Approach to Monitoring Workers’ Stress at Office

Lecture Notes in Networks and Systems, 2023

I dedicate this thesis to my parents, Cleide and René, who supported and motivated me even in the toughest moments. To my late grandfather, Carlos Marchetti, who inspired my pursuit of science. To my supervisor, Fatima Rodrigues, for her commitment and support, which contributed in no small part to the project's success. Finally, I express my sincere gratitude to everyone who has crossed my path and helped me to become what I am today.

Use of Data Mining Techniques to Characterize MV Consumers and to Support the Consumer-Supplier Relationship

This paper characterizes the electric power profiles of medium voltage (MV) consumers, based on the knowledge discovery in databases process. Data mining techniques were used to support the agents of the electric power retail markets in obtaining specific knowledge of their customers' consumption habits. A hierarchical clustering algorithm is used to form the different customer classes and to find a set of representative consumption patterns. A classification model was built that, when applied to new consumers, assigns them to one of the obtained classes. New tariff options were also defined, taking into consideration the typical consumption profile of the class to which the customers belong. Finally, the consequences of these results for the interaction between customers and electric power suppliers are analyzed.
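The abstract does not say which hierarchical algorithm or linkage was used. As a rough sketch, assuming average-linkage agglomerative clustering and invented four-point normalised load profiles, forming customer classes and their representative consumption patterns might look like this:

```python
from itertools import combinations


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_linkage(c1, c2, data):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(data[i], data[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(data, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(data))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ab: average_linkage(clusters[ab[0]], clusters[ab[1]], data),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


def representative_profile(cluster, data):
    """Point-wise mean of the load profiles in one cluster."""
    n = len(cluster)
    return [sum(data[i][k] for i in cluster) / n for k in range(len(data[0]))]


# Hypothetical normalised load at four times of day (not data from the paper).
profiles = [
    [0.2, 0.9, 1.0, 0.3],  # daytime-heavy consumer
    [0.3, 1.0, 0.9, 0.2],  # daytime-heavy consumer
    [0.9, 0.3, 0.2, 1.0],  # night-heavy consumer
    [1.0, 0.2, 0.3, 0.9],  # night-heavy consumer
]
classes = agglomerative(profiles, n_clusters=2)
patterns = [representative_profile(c, profiles) for c in classes]
```

Each entry of `patterns` plays the role of a class's typical consumption profile; the real study would work with full daily load diagrams rather than four points.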

An automated approach for binary classification on imbalanced data

Research Square, Jun 6, 2023

Imbalanced data is present in various business areas and must be handled with appropriate resampling techniques and classification algorithms. However, there is a multitude of possible combinations of resampling and learning methods for handling imbalanced data, and using them correctly requires specialised knowledge. In this paper, several approaches in the domains of data resampling and cost-sensitive techniques, ranging from more accessible to more advanced, are considered for handling imbalanced data. The application developed recommends the most suitable combinations of techniques for a specific dataset by extracting the dataset's meta-feature values and comparing them with those recorded in a knowledge base. It facilitates classification and automates part of the machine learning pipeline, with results comparable to or better than a state-of-the-art solution and a much shorter execution time.
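The abstract does not list the resampling methods the application chooses between. To make the idea of resampling concrete, here is a minimal sketch of one common technique, random oversampling of the minority class; the function name and the toy data are illustrative assumptions:

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy binary dataset: six negatives, two positives.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only; resampling before the train/test split leaks duplicated rows into the evaluation.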

Original Article (Artigo Original)

Rev Port Med Int, 2010

... Deaths from pandemic influenza A (H1N1) 2009 in Portugal, April 2009 to March 2010. Filipe Froes*, António Diniz*, Isabel Falcão*, Baltazar Nunes**, Judite Catarino* ... 25. Nogueira PJ, Nunes B, Machado A, Rodrigues E, Gómez V, Sousa L, Falcão JM. ...

2 DI/gEPL – Languages Specification and Processing Group

In today’s society, the exploration of one or more databases to extract information or knowledge to support management is a critical success factor for an organization. However, it is well known that several problems can affect data quality. These problems have a negative effect on the results extracted from data, influencing their correctness and validity. In this context, it is important to understand these data problems both theoretically and in practice. This paper presents a taxonomy of data quality problems, derived from real-world databases. The taxonomy organizes the problems at different levels of abstraction. Methods to detect data quality problems, represented as binary trees, are also proposed for each abstraction level. The paper also compares this taxonomy with others already proposed in the literature.

SmartClean: uma ferramenta para a limpeza incremental de dados (SmartClean: a tool for incremental data cleaning)

This paper presents SmartClean, a tool for detecting and correcting data quality problems. Compared with currently existing tools, SmartClean has the advantage of not requiring the user to specify the order in which the operations are executed. To that end, a sequence was devised according to which the problems are handled (i.e., detected and corrected). The existence of this sequence also supports the incremental execution of the operations. The paper presents the architecture underlying the tool and details its components. The validity of the tool and, consequently, of the architecture is demonstrated through the case study carried out. Although SmartClean can also clean data at other levels (e.g., relation), the paper describes only its capabilities at the level of the individual attribute value.

Cluster Ensemble Selection - Using Average Cluster Consistency

Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, 2009

In order to combine multiple data partitions into a more robust data partition, several approaches to produce the cluster ensemble and various consensus functions have been proposed. This range of possibilities for combining multiple data partitions raises a new problem: which of the existing approaches, both for producing the cluster ensemble's data partitions and for combining them, best fits a given data set. In this paper, we address the cluster ensemble selection problem. We propose a new measure to select the best consensus data partition among a variety of consensus partitions, based on a notion of average cluster consistency between each data partition that belongs to the cluster ensemble and a given consensus partition. We compared the proposed measure with other measures for cluster ensemble selection, using 9 different data sets, and the experimental results show that the consensus partitions selected by our approach were usually of better quality than those selected by the other measures used in our experiments.
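The abstract does not give the formula for average cluster consistency, so the sketch below does not reproduce the paper's measure. It uses a simple overlap-based stand-in (an assumption) only to show the selection procedure described: score each candidate consensus partition against every partition of the ensemble, average the scores, and keep the best candidate. All names and data are illustrative.

```python
from collections import Counter


def directed_overlap(part_a, part_b):
    """Fraction of objects that lie in the best-matching cluster of part_b, summed over clusters of part_a."""
    joint = Counter(zip(part_a, part_b))
    best = Counter()
    for (a, _b), count in joint.items():
        best[a] = max(best[a], count)
    return sum(best.values()) / len(part_a)


def consistency(partition, consensus):
    """Stand-in score: the weaker of the two directed overlaps, so a single giant cluster is not rewarded."""
    return min(directed_overlap(partition, consensus), directed_overlap(consensus, partition))


def average_consistency(ensemble, consensus):
    return sum(consistency(p, consensus) for p in ensemble) / len(ensemble)


def select_consensus(ensemble, candidates):
    """Return the name of the candidate consensus partition with the highest average score."""
    return max(candidates, key=lambda name: average_consistency(ensemble, candidates[name]))


# Three partitions of six objects (cluster labels per object) and two candidate consensus partitions.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
candidates = {"A": [0, 0, 0, 1, 1, 1], "B": [0, 1, 0, 1, 0, 1]}
print(select_consensus(ensemble, candidates))
```

Candidate `A` agrees with the ensemble far more than `B` and is selected; any other pairwise agreement score could be swapped into `consistency` without changing the selection loop.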

A Data Mining Framework for Response Modelling in Direct Marketing

Advances in Intelligent Systems and Computing, 2021

Impact of comorbidities in pulmonary rehabilitation outcomes in patients with COPD

Revista Portuguesa de Pneumologia (English Edition), 2013

Background: Chronic obstructive pulmonary disease (COPD) represents an increasing burden worldwide. COPD can no longer be considered a disease that involves only the lungs; its systemic consequences make it an important risk factor for other chronic comorbidities. Aim: To determine the frequency of comorbidities in patients with COPD undergoing a pulmonary rehabilitation program (PRP) and to evaluate the influence of baseline characteristics as well as comorbidities on the outcomes of PRP. Methods: The present study included all COPD patients who were admitted to a PRP in our unit. The response to PR was measured by the improvement in exercise tolerance (6 min walk test), dyspnea (Mahler's Dyspnea Index) and health status (St. George's Respiratory Questionnaire). Results: 114 patients with COPD were included. Most patients (96.5%) had at least one comorbidity. Metabolic diseases (71.1%), cardiovascular diseases (67.5%), other respiratory conditions (57.9%) and anxiety/depression (21.1%) were the most prevalent ones. 64.9%, 64.9% and 51.1% of the patients improved in terms of exercise tolerance, quality of life and dyspnea, respectively. The overall results were similar in all levels of the disease and in all comorbid subgroups. Logistic regression analysis showed that respiratory failure and ischemic heart disease negatively influenced improvement in health status, and that anxiety/depression predicted lower improvement in dyspnea. Please cite this article as: Carreiro A, et al. Impacto das comorbilidades num programa de reabilitação respiratória em doentes com DPOC. Rev Port Pneumol. 2013.

A Web & Mobile City Maintenance Reporting Solution

Procedia Technology, 2013

In an era marked by the growing dominance of mobile computing, the systematic use of a smartphone or tablet to perform daily life tasks creates new business opportunities and drives innovation and the redesign of organizations' traditional work methods. With recent technological advances, space emerges for civic-oriented systems, in which citizens can promote the growth of their community through small acts of citizenship, allowing local management authorities to save human and financial resources. Urban problem solving is one of those areas, and the arrival of mobile technologies has revolutionized the way citizens report non-emergency situations to the authorities responsible for the local community, providing ideal conditions for the local government to respond quickly and effectively. This paper describes a proposed solution for a system capable of reporting and managing notifications of non-urgent urban situations, to foster the active participation of citizens in the community. The solution allows citizens to report a particular set of problems via a Web browser or mobile device, in a location-based environment, and to document the submitted reports with any type of multimedia information. It can also automatically identify and highlight the most common problems spotted in images.

An Electric Energy Consumer Characterization Framework Based on Data Mining Techniques

IEEE Transactions on Power Systems, 2005

This paper presents an electricity consumer characterization framework based on a knowledge discovery in databases (KDD) procedure, supported by data mining (DM) techniques applied at the different stages of the process. The core of this framework is a data mining model based on a combination of unsupervised and supervised learning techniques. Two main modules compose this framework: the load profiling module and the classification module. The load profiling module uses a clustering operation to create a set of consumer classes and the representative load profile of each class. The classification module uses this knowledge to build a classification model able to assign different consumers to the existing classes. The quality of this framework is illustrated with a case study concerning a real database of LV consumers from the Portuguese distribution company.
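The abstract does not say what kind of classification model the second module builds. As a simplified stand-in (an assumption, not the framework's model), the sketch below assigns a new consumer to the class whose representative load profile is closest to the consumer's peak-normalised load curve. Class names, profiles and measurements are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def normalise(curve):
    """Scale a load curve by its peak so that only its shape is compared."""
    peak = max(curve)
    return [value / peak for value in curve]


def assign_class(load_curve, class_profiles):
    """Return the name of the class whose representative profile is nearest to the curve's shape."""
    shape = normalise(load_curve)
    return min(class_profiles, key=lambda name: squared_distance(shape, class_profiles[name]))


# Hypothetical representative profiles, as a load profiling module might output them.
class_profiles = {
    "daytime": [0.25, 0.95, 0.95, 0.25],
    "night": [0.95, 0.25, 0.25, 0.95],
}
new_consumer_kw = [12.0, 40.0, 38.0, 10.0]  # measured load at four times of day
print(assign_class(new_consumer_kw, class_profiles))
```

A rule-based or tree-based classifier trained on consumer attributes would let new consumers be classified without measuring a full load curve, which this nearest-profile shortcut cannot do.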

Extracting Structure, Text and Entities from PDF Documents of the Portuguese Legislation

Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, 2012

This paper presents an approach for text processing of PDF documents with a well-defined layout structure. The approach explores the font structure of PDF documents, using perceptual grouping. It consists of extracting text objects from the content stream of the documents and grouping them according to a set criterion, also making use of geometric-based regions in order to achieve the correct reading order. The developed approach processes the PDF documents using logical and structural rules to extract the entities present in them, and returns an optimized XML representation of the PDF document, useful for re-use, for example in text categorization. The system was trained and tested with Portuguese legislation PDF documents extracted from the electronic Republic's Diary (Diário da República Electrónico). Evaluation results show that our approach presents good results.

Resampling Approaches to Improve News Importance Prediction

Lecture Notes in Computer Science, 2014

Automatic Assessment of Short Free Text Answers

Assessment plays a central role in any educational process, because it is a common way to evaluate the students’ knowledge regarding the concepts related to learning objectives. Computer assisted assessment is a research branch established to study how computers can be used to automatically evaluate students’ answers. Computer assisted assessment systems developed so far are based on a multitude of different techniques, such as Latent Semantic Analysis, Natural Language Processing and Artificial Intelligence, among others. These approaches require a reasonable corpus to start with and, depending on the domain, the corpus may require regular updates. In this paper we address the assessment of short free text answers by developing a system that captures the way the teacher evaluates the answer. For that, the system first classifies the teacher's question by type. Then, depending on the type of question, the system lets the teacher define scores associated with subparts of the answer. F...

An integrated system to support electricity tariff contract definition

Frontiers in Artificial Intelligence and Applications, 2010

This paper presents an integrated system that helps both retail companies and electricity consumers in defining the best retail contracts and tariffs. This integrated system is composed of a Decision Support System (DSS) based on a Consumer Characterization Framework (CCF). The CCF is based on data mining techniques, applied to obtain useful knowledge about electricity consumers from large amounts of consumption data. This knowledge is acquired following an innovative and systematic approach able to identify different consumer classes, each represented by a load profile, and to characterize them using decision trees. The framework generates inputs for the knowledge base and the database of the DSS. The rule sets derived from the decision trees are integrated in the knowledge base of the DSS. The load profiles, together with the information about contracts and electricity prices, form the database of the DSS. The DSS is able to classify different consumers, present their load profiles and test different electricity tariffs and contracts. The final outputs of the DSS are a comparative economic analysis of different contracts and advice about the most economical contract for each consumer class. The presentation of the DSS is completed with an application example using a real database of consumers from the Portuguese distribution company.
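The comparative economic analysis can be illustrated with a small worked example. The sketch below is an assumption about how such a comparison might be set up, not the DSS itself: it prices one day of a class's hourly load profile under two invented tariffs and reports the cheaper contract. All prices, charges and the load profile are hypothetical.

```python
def daily_cost(load_profile_kw, tariff):
    """Cost of one day for an hourly load profile (one kW value per hour, i.e. kWh per slot)."""
    energy_cost = sum(
        kw * tariff["price_per_kwh"][hour] for hour, kw in enumerate(load_profile_kw)
    )
    return energy_cost + tariff["daily_fixed_charge"]


def cheapest_contract(load_profile_kw, tariffs):
    """Name of the tariff with the lowest daily cost for this profile."""
    return min(tariffs, key=lambda name: daily_cost(load_profile_kw, tariffs[name]))


OFF_PEAK_HOURS = set(range(0, 8)) | {22, 23}

# Hypothetical tariffs: prices per kWh for each hour of the day plus a fixed daily charge.
tariffs = {
    "flat": {"price_per_kwh": [0.15] * 24, "daily_fixed_charge": 0.30},
    "time_of_use": {
        "price_per_kwh": [0.10 if h in OFF_PEAK_HOURS else 0.20 for h in range(24)],
        "daily_fixed_charge": 0.30,
    },
}

# Hypothetical class load profile: 2.0 kW during off-peak hours, 0.5 kW otherwise.
night_heavy_profile = [2.0 if h in OFF_PEAK_HOURS else 0.5 for h in range(24)]

for name, tariff in tariffs.items():
    print(name, round(daily_cost(night_heavy_profile, tariff), 2))
print(cheapest_contract(night_heavy_profile, tariffs))
```

For this night-heavy profile the time-of-use tariff costs 3.70 per day against 4.35 for the flat tariff, so the time-of-use contract would be advised; a daytime-heavy class could reverse the ranking.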

A new approach for multi-agent coalition formation and management in the scope of electricity markets

Energy, 2011

This paper presents a new methodology for the creation and management of coalitions in electricity markets. The approach is tested using the multi-agent market simulator MASCEM, taking advantage of its ability to model and simulate VPPs (Virtual Power Producers). VPPs are represented as coalitions of agents that can negotiate both in the market and internally with their members, in order to reconcile the members' individual characteristics and goals with the strategy and objectives of the VPP itself. The new features include individual facilitators that manage the communications among the members of each coalition independently from the rest of the simulation, and mechanisms for classifying the agents that are candidates to join the coalition. In addition, a global study of the results of the Iberian Electricity Market is performed, to compare and analyze different approaches for defining consistent and adequate strategies to integrate into the agents of MASCEM. This, combined with the application of learning and prediction techniques, gives the agents the ability to learn and adapt by adjusting their actions to the continually evolving states of the world they are playing in.

A multi-agent simulator for testing agent market strategies

… on Modelling and …, 2005

GECAD – Knowledge Engineering and Decision Support Group Institute of Engineering, Polytechnic of Porto, Portugal E-mail: {viamonte,csr,fr}@dei.isep.ipp.pt ... Department of Electrical Engineering, University of Trás-os-Montes e Alto Douro, Vila Real, Portugal E-mail: ...

Knowledge extraction from medium voltage load diagrams to support the definition of electrical tariffs

Engineering Intelligent Systems for Electrical Engineering and Communications, 2007

With the liberalization of the electricity market, distribution and retail companies are looking for better market strategies based on adequate information on the consumption patterns of their electricity customers. In this environment all consumers are free to choose their electricity supplier. A fair insight into the customers' behaviour will permit the definition of specific contract aspects based on the different consumption patterns. In this paper, Data Mining (DM) techniques are applied to electricity consumption data from a utility client's database. To form the different customer classes and find a set of representative consumption patterns, we have used the Two-Step algorithm, which is a hierarchical clustering algorithm. Each consumer class is represented by its load profile resulting from the clustering operation. Next, to characterize each consumer class, a classification model is constructed with the C5.0 classification algorithm.

Research paper thumbnail of ChatBot for student service based on RASA framework

Research Square (Research Square), Apr 5, 2023

The availability of face-to-face attendance at the School's Administrative Services for Students ... more The availability of face-to-face attendance at the School's Administrative Services for Students is limited to one schedule, which may prevent the timely clarification of students' questions, causing a decrease in their level of satisfaction. To solve this problem, a conversational agent was designed, consisting of a Portuguese language interpretation module using natural language processing and machine learning techniques. To keep the system abstracted from any technical dependency, a web service that manages the agent's knowledge base was developed. In the evaluation of the solution, the performance of several learning models was compared, and the results emphasize the superiority of BERT language model of Google, combined with the DIET classifier, obtaining a F1-Score of 0.965. The system was implemented through a prototype and, for a total of 256 questions, around 70% of correct responses were obtained, with a positive average satisfaction rating of 4.20 on a 0-5 scale.

Research paper thumbnail of A Deep Learning Approach to Monitoring Workers’ Stress at Office

Lecture notes in networks and systems, 2023

I dedicate this thesis to my parents, Cleide and René, who supported and motivated me even in the... more I dedicate this thesis to my parents, Cleide and René, who supported and motivated me even in the toughest moments. To my late grandfather, Carlos Marchetti, who inspired my pursuit in science. To my supervisor, Fatima Rodrigues, for her commitment and support which contributed in no small part to the project's success. Finally, I express my sincere gratitude to who has crossed my path and helped me to become what I am today.

Research paper thumbnail of Use of Data Mining Techniques to Characterize MV Consumers and to Support the Consumer- Supplier Relationship

This paper consists in the characterization of electric power profiles of medium voltage (MV) con... more This paper consists in the characterization of electric power profiles of medium voltage (MV) consumers, based on the data base knowledge discovery process. Data Mining techniques were used as support for the agents of the electric power retail markets, with the purpose of obtaining specific knowledge of their customers' consumption habits. A hierarchical clustering algorithm is used in order to form the different customers' classes and to find a set of representative consumption patterns. A classification model was built that, when applied to new consumers, allows classifying them in one of the obtained classes. New tariff options were also defined, taking into consideration the typical consumption profile of the class to which the customers belong. With the results obtained, the consequences that these will have in the interaction between customer and electric power suppliers are analyzed.

Research paper thumbnail of An automated approach for binary classification on imbalanced data

Research Square (Research Square), Jun 6, 2023

Imbalanced data is present in various business areas and must be dealt with the appropriate resam... more Imbalanced data is present in various business areas and must be dealt with the appropriate resampling techniques and classification algorithms. However, there is a magnitude of multiple combinations of resampling and learning methods to handle imbalanced data that require specialised knowledge to be used correctly. In this paper, several approaches, ranging from more accessible and more advanced in the domains of data resampling and cost-sensitive techniques, will be considered to handle imbalanced data. The application developed delivers recommendations of the most suited combinations of techniques for a specific dataset, by extracting and comparing dataset meta-features values recorded in a knowledge base. It facilitates effortless classification and automates part of the machine learning pipeline with comparable or better results to a state-of-the-art solution and with a much smaller execution time.

Research paper thumbnail of Artigo Original Original Article

Rev Port Med Int, 2010

... Óbitos por gripe pandémica A (H1N1) 2009 em Portugal Período de Abril de 2009 a Março de 2010... more ... Óbitos por gripe pandémica A (H1N1) 2009 em Portugal Período de Abril de 2009 a Março de 2010 Filipe Froes*, António Diniz*, Isabel Falcão*, Baltazar Nunes**, Judite Catarino* ... 25. Nogueira PJ, Nunes B, Machado A, Rodrigues E, Gómez V, Sousa L, Falcão JM. ...

Research paper thumbnail of 2 DI/gEPL – Languages Specification and Processing Group

Abstract. In today’s society the exploration of one or more databases to extract information or k... more Abstract. In today’s society the exploration of one or more databases to extract information or knowledge to support management is a critical success factor for an organization. However, it is well known that several problems can affect data quality. These problems have a negative effect in the results extracted from data, influencing their correction and validity. In this context, it is quite important to understand theoretically and in practice these data problems. This paper presents a taxonomy of data quality problems, derived from real-world databases. The taxonomy organizes the problems at different levels of abstraction. Methods to detect data quality problems represented as binary trees are also proposed for each abstraction level. The paper also compares this taxonomy with others already proposed in the literature. 1.

Research paper thumbnail of SmartClean: uma ferramenta para a limpeza incremental de dados

Neste artigo apresenta-se a ferramenta SmartClean, destinada à detecção e correcção de problemas ... more Neste artigo apresenta-se a ferramenta SmartClean, destinada à detecção e correcção de problemas de qualidade dos dados. Comparativamente às ferramentas actualmente existentes, o SmartClean possui a mais-valia de não obrigar a que a sequência de execução das operações seja especificada pelo utilizador. Para tal, foi concebida uma sequência segundo a qual os problemas são manipulados (i.e., detectados e corrigidos). A existência da sequência suporta ainda a execução incremental das operações. No artigo, a arquitectura subjacente à ferramenta é exposta, sendo detalhados os seus componentes. A validade da ferramenta e, consequentemente, da arquitectura é comprovada através da apresentação do caso de estudo efectuado. Apesar do SmartClean possuir potencialidades de limpeza de dados noutros níveis (e.g., relação), no artigo apenas são descritas as relativas ao nível do valor individual do atributo.

Research paper thumbnail of CLUSTER ENSEMBLE SELECTION - Using Average Cluster Consistency

Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, 2009

In order to combine multiple data partitions into a more robust data partition, several approache... more In order to combine multiple data partitions into a more robust data partition, several approaches to produce the cluster ensemble and various consensus functions have been proposed. This range of possibilities in the multiple data partitions combination raises a new problem: which of the existing approaches, to produce the cluster ensembles' data partitions and to combine these partitions, best fits a given data set. In this paper, we address the cluster ensemble selection problem. We proposed a new measure to select the best consensus data partition, among a variety of consensus partitions, based on a notion of average cluster consistency between each data partition that belongs to the cluster ensemble and a given consensus partition. We compared the proposed measure with other measures for cluster ensemble selection, using 9 different data sets, and the experimental results shown that the consensus partitions selected by our approach usually were of better quality in comparison with the consensus partitions selected by other measures used in our experiments.

Research paper thumbnail of A Data Mining Framework for Response Modelling in Direct Marketing

Advances in Intelligent Systems and Computing, 2021

Research paper thumbnail of Impact of comorbidities in pulmonary rehabilitation outcomes in patients with COPD

Revista Portuguesa de Pneumologia (English Edition), 2013

Background: Chronic obstructive pulmonary disease (COPD) represents an increasing burden worldwid... more Background: Chronic obstructive pulmonary disease (COPD) represents an increasing burden worldwide. COPD can no longer be considered a disease which only involves the lungs, its systemic consequences make it an important risk factor for other chronic comorbidities. Aim: To determine the frequency of comorbidities in patients with COPD undergoing a pulmonary rehabilitation program (PRP) and to evaluate the influence of baseline characteristics as well as comorbidities on the outcomes of PRP. Methods: The present study included all COPD patients who were admitted to a PRP in our unit. The response to PR was measured by the improvement in exercise tolerance (6 min walk test), dyspnea (Mahler's Dyspnea Index) and health status (St. George's Respiratory Questionnaire). Results: 114 patients with COPD were included. Most patients (96.5%) had at least one comorbidity. Metabolic diseases (71.1%), cardiovascular diseases (67.5%), other respiratory conditions (57.9%) and anxiety/depression (21.1%) were the most prevalent ones. 64.9%, 64.9% and 51.1% of the patients improved in terms of exercise tolerance, quality of life and dyspnea, respectively. The overall results were similar in all levels of the disease and in all comorbid subgroups. Logistic regression analysis showed that respiratory failure and ischemic heart disease negatively influenced improvement in health status and anxiety/depression predicted lower improvement in dyspnea. ଝ Please cite this article as: Carreiro A, et al. Impacto das comorbilidades num programa de reabilitação respiratória em doentes com DPOC. Rev Port Pneumol. 2013.

Research paper thumbnail of A Web & Mobile City Maintenance Reporting Solution

Procedia Technology, 2013

In an era marked by the growing dominance of mobile computing, the systematic use of a smartphone or tablet to perform daily life tasks creates new business opportunities and drives innovation and the redesign of organizations' traditional work methods. With recent technological advances, space emerges for civic-oriented systems, in which citizens can promote the growth of their community through small acts of citizenship, allowing local management authorities to save human and financial resources. Urban problem solving is one of those areas, and the arrival of mobile technologies has revolutionized the way citizens report non-emergency situations to the authorities responsible for the local community, providing ideal conditions for the local government to respond quickly and effectively. This paper describes a proposed solution for a system capable of reporting and managing notifications of non-urgent urban situations, to foster the active participation of citizens in the community. The solution allows citizens to report a particular set of problems via a Web browser or mobile device, in a location-based environment, and to document the submitted reports with any type of multimedia information, with the added capability of automatically identifying and highlighting the most common problems spotted in images.

Research paper thumbnail of An Electric Energy Consumer Characterization Framework Based on Data Mining Techniques

IEEE Transactions on Power Systems, 2005

This paper presents an electricity consumer characterization framework based on a knowledge discovery in databases (KDD) procedure, supported by data mining (DM) techniques applied at the different stages of the process. The core of this framework is a data mining model based on a combination of unsupervised and supervised learning techniques. Two main modules compose this framework: the load profiling module and the classification module. The load profiling module creates a set of consumer classes using a clustering operation and the representative load profiles for each class. The classification module uses this knowledge to build a classification model able to assign different consumers to the existing classes. The quality of this framework is illustrated with a case study concerning a real database of LV consumers from the Portuguese distribution company.
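The two-module structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering or classification algorithms, so plain k-means and a nearest-profile rule are used here as stand-ins, and the four-point daily load diagrams and all function names are hypothetical.

```python
from math import dist

def normalize(profile):
    """Scale a load diagram by its peak so that shape, not magnitude, is compared."""
    peak = max(profile)
    return [v / peak for v in profile]

def mean_profile(profiles):
    """Point-wise average of a group: the representative load profile of a class."""
    return [sum(col) / len(col) for col in zip(*profiles)]

def kmeans(profiles, k, iterations=20):
    """Load profiling stand-in: plain k-means seeded with the first k profiles."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assign each profile to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in profiles]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = mean_profile(members)
    return labels, centroids

def classify(profile, centroids):
    """Classification stand-in: assign a new consumer to the nearest class profile."""
    p = normalize(profile)
    return min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))

# Hypothetical consumption at four times of day (night, morning, afternoon, evening).
raw = [
    [2, 8, 10, 3],   # daytime-heavy
    [3, 2, 3, 10],   # evening-heavy
    [1, 9, 10, 2],   # daytime-heavy
    [2, 1, 2, 8],    # evening-heavy
    [4, 16, 20, 6],  # daytime-heavy, larger consumer
    [6, 5, 6, 20],   # evening-heavy, larger consumer
]
labels, centroids = kmeans([normalize(p) for p in raw], k=2)
new_daytime_class = classify([5, 18, 20, 4], centroids)
new_evening_class = classify([1, 1, 2, 9], centroids)
```

Normalizing by the peak is a design choice of this sketch: it groups consumers by the shape of their daily diagram, so a small and a large consumer with the same habits fall into the same class.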

Research paper thumbnail of Extracting Structure, Text and Entities from PDF Documents of the Portuguese Legislation

Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, 2012

This paper presents an approach for text processing of PDF documents with well-defined layout structure. The scope of the approach is to explore the font structure of PDF documents, using perceptual grouping. It consists of the extraction of text objects from the content stream of the documents and their grouping according to a set criterion, also making use of geometric-based regions in order to achieve the correct reading order. The developed approach processes the PDF documents using logical and structural rules to extract the entities present in them, and returns an optimized XML representation of the PDF document, useful for re-use, for example in text categorization. The system was trained and tested with Portuguese Legislation PDF documents extracted from the electronic Republic's Diary. Evaluation results show that our approach presents good results.

Research paper thumbnail of Resampling Approaches to Improve News Importance Prediction

Lecture Notes in Computer Science, 2014

Research paper thumbnail of Use of Data Mining Techniques to Characterize MV Consumers and to Support the Consumer-Supplier Relationship

This paper characterizes the electric power profiles of medium voltage (MV) consumers, based on the knowledge discovery in databases process. Data Mining techniques were used as support for the agents of the electric power retail markets, with the purpose of obtaining specific knowledge of their customers' consumption habits. A hierarchical clustering algorithm is used in order to form the different customers' classes and to find a set of representative consumption patterns. A classification model was built that, when applied to new consumers, allows classifying them into one of the obtained classes. New tariff options were also defined, taking into consideration the typical consumption profile of the class to which the customers belong. With the results obtained, the consequences that these will have in the interaction between customer and electric power suppliers are analyzed.

Research paper thumbnail of Automatic Assessment of Short Free Text Answers

Assessment plays a central role in any educational process, because it is a common way to evaluate the students’ knowledge regarding the concepts related to learning objectives. Computer assisted assessment is a research branch established to study how computers can be used to automatically evaluate students’ answers. Computer assisted assessment systems developed so far are based on a multitude of different techniques, such as Latent Semantic Analysis, Natural Language Processing and Artificial Intelligence, among others. These approaches require a reasonable corpus to start with, and depending on the domain, the corpus may require regular updates. In this paper we address the assessment of short free text answers by developing a system that captures the way the teacher evaluates the answer. For that, the system first classifies the teacher's question by type. Then, depending on the type of question, the system lets the teacher define scores associated with subparts of the answer. F...

Research paper thumbnail of An integrated system to support electricity tariff contract definition

Frontiers in Artificial Intelligence and Applications, 2010

This paper presents an integrated system that helps both retail companies and electricity consumers in the definition of the best retail contracts and tariffs. This integrated system is composed of a Decision Support System (DSS) based on a Consumer Characterization Framework (CCF). The CCF is based on data mining techniques, applied to obtain useful knowledge about electricity consumers from large amounts of consumption data. This knowledge is acquired following an innovative and systematic approach able to identify different consumers' classes, each represented by a load profile, and to characterize them using decision trees. The framework generates inputs to use in the knowledge base and in the database of the DSS. The rule sets derived from the decision trees are integrated in the knowledge base of the DSS. The load profiles together with the information about contracts and electricity prices form the database of the DSS. This DSS is able to perform the classification of different consumers, present their load profiles and test different electricity tariffs and contracts. The final outputs of the DSS are a comparative economic analysis between different contracts and advice about the most economic contract for each consumer class. The presentation of the DSS is completed with an application example using a real database of consumers from the Portuguese distribution company.

Research paper thumbnail of A new approach for multi-agent coalition formation and management in the scope of electricity markets

Energy, 2011

This paper presents a new methodology for the creation and management of coalitions in Electricity Markets. This approach is tested using the multi-agent market simulator MASCEM, taking advantage of its ability to provide the means to model and simulate VPPs (Virtual Power Producers). VPPs are represented as coalitions of agents, with the capability of negotiating both in the market and internally, with their members, in order to combine and manage their individual specific characteristics and goals with the strategy and objectives of the VPP itself. The new features include the development of particular individual facilitators to manage the communications amongst the members of each coalition independently from the rest of the simulation, and also the mechanisms for the classification of the agents that are candidates to join the coalition. In addition, a global study on the results of the Iberian Electricity Market is performed, to compare and analyze different approaches for defining consistent and adequate strategies to integrate into the agents of MASCEM. This, combined with the application of learning and prediction techniques, provides the agents with the ability to learn and adapt themselves, by adjusting their actions to the continuously evolving states of the world they are playing in.

Research paper thumbnail of A multi-agent simulator for testing agent market strategies

… on Modelling and …, 2005

GECAD – Knowledge Engineering and Decision Support Group Institute of Engineering, Polytechnic of Porto, Portugal E-mail: {viamonte,csr,fr}@dei.isep.ipp.pt ... Department of Electrical Engineering, University of Trás-os-Montes e Alto Douro, Vila Real, Portugal E-mail: ...

Research paper thumbnail of Knowledge extraction from medium voltage load diagrams to support the definition of electrical tariffs

Engineering Intelligent Systems for Electrical Engineering and Communications, 2007

With the electricity market liberalization, distribution and retail companies are looking for better market strategies based on adequate information on the consumption patterns of their electricity customers. In this environment all consumers are free to choose their electricity supplier. A fair insight into the customers' behaviour will permit the definition of specific contract aspects based on the different consumption patterns. In this paper Data Mining (DM) techniques are applied to electricity consumption data from a utility client's database. To form the different customers' classes, and find a set of representative consumption patterns, we have used the Two-Step algorithm, which is a hierarchical clustering algorithm. Each consumer class will be represented by its load profile resulting from the clustering operation. Next, to characterize each consumer class, a classification model will be constructed with the C5.0 classification algorithm.