Evangelos Theodoridis - Academia.edu

Papers by Evangelos Theodoridis

On Topic Categorization of PubMed Query Results

IFIP Advances in Information and Communication Technology, 2012

Nowadays, people frequently use search engines to find the information they need on the Web. Web search in particular is a basic tool used by millions of researchers in their everyday work. A very popular indexing engine for life sciences and biomedical research is PubMed. PubMed is a free database that primarily accesses the MEDLINE database of references and abstracts on life sciences and biomedical topics. Present search engines usually return search results in a single global ranking, making it difficult for users to browse the different topics or subtopics of their query. Because results belonging to different topics are mixed together, average users spend a lot of time finding the Web pages that best match their query. In this paper, we propose a novel system to address this problem. We present and evaluate a methodology that exploits semantic text clustering techniques to group biomedical document collections into homogeneous topics. To provide more accurate clustering results, we utilize various biomedical ontologies, such as MeSH and GeneOntology. Finally, we embed the proposed methodology in an online system that post-processes results from the PubMed online database and presents the retrieved results to users according to well-formed topics.

A PubMed Meta Search Engine Based on Biomedical Entity Mining

2014 25th International Workshop on Database and Expert Systems Applications, 2014

Biomedical knowledge stored on the web is increasing significantly, as most biomedical research papers are published online. Biomedical entity extraction is a crucial procedure for efficient text analysis and retrieval. PubMed is a very popular indexing engine for life sciences and biomedical research. A free database, it primarily accesses the MEDLINE database of references and abstracts on life sciences and biomedical topics. In this work, we propose a meta-search engine over PubMed, which classifies PubMed results according to their specific topic and the extracted biomedical entities. This method helps researchers browse and search the retrieved results. To provide more accurate clustering results, we utilize the MeSH biomedical ontology as well as RxNorm, a tool for supporting semantic interoperation between drug terminologies and pharmacy knowledge base systems. Finally, we embed the proposed methodology in an online system.

Topic Categorization of Biomedical Abstracts

International Journal on Artificial Intelligence Tools, 2015

Nowadays, people frequently use search engines to find the information they need on the Web. Web search in particular is a basic tool used by millions of researchers in their everyday work. A very popular indexing engine for life sciences and biomedical research is PubMed. PubMed is a free database that primarily accesses the MEDLINE database of references and abstracts on life sciences and biomedical topics. Present search engines usually return search results in a single global ranking, making it difficult for users to browse different topics or subtopics.

Post-processing in wireless sensor networks: Benchmarking sensor trace files

PE-WASUN'10 - Proceedings of the 7th ACM Symposium on Performance Evaluation of Wireless Ad Hoc, Sensor, and Ubiquitous Networks, Co-located with MSWiM'10, 2010

Wireless sensor network research usually focuses on the reliable and efficient collection of data. Here, we address the next step in the traces' lifetime: we aim to investigate and evaluate, by qualitative and quantitative means, data repositories of already collected measurements. We propose a set of new metrics that enable reliable evaluation of algorithms using traces (both in average cases and in "stressful" setups), removing the need to run algorithms in a real testbed, at least in the development stage.

A Data Mining Methodology for Evaluating Maintainability according to ISO/IEC9126 Software Engineering-Product Quality Standard

This paper presents ongoing work on using data mining to evaluate a software system's maintainability according to the ISO/IEC-9126 quality standard. More specifically, it proposes a methodology for knowledge acquisition by integrating data from source code with the expertise of a software system's evaluators. A process for the extraction of elements from source code and Analytical Hierarchical Processing for assigning

Extracting Knowledge from Web Search Engine Using Wikipedia

Communications in Computer and Information Science, 2013

Nowadays, search engines are a dominant tool for finding information on the web. However, web search engines usually return web page references in a single global ranking, making it difficult for users to browse the different topics captured in the result set. Recently, meta-search engine systems have appeared that discover knowledge in these web search results, giving the user the possibility to browse the different topics contained in the result set. In this paper, we focus on the problem of determining the different thematic groups in the results that existing web search engines provide. We propose a novel system that exploits semantic entities of Wikipedia to group the result set into different topic groups, according to the various meanings of the provided query. The proposed method utilizes a number of semantic annotation techniques using knowledge bases, such as WordNet and Wikipedia, in order to perceive the different senses of each query term. Finally, the method annotates the extracted topics using information derived from the clusters, which are subsequently presented to the end user.

A web page usage prediction scheme using sequence indexing and clustering techniques

In this paper we consider the problem of web page usage prediction in a web site by modeling users' navigation history and web page content with weighted suffix trees. This prediction of a user's navigation can be exploited either in an on-line recommendation system in a web site or in a web page cache system. The proposed method has the advantage that it demands a constant amount of computational effort per user action and consumes a relatively small amount of extra memory space. These features make the method ideal for an on-line working environment. Finally, we have evaluated the proposed scheme with experiments on various web site log files and web pages, and we have found that its performance is fairly good and, in many cases, superior.

Indexing Textual Information

... Dissent, Protest and Transformative Action: An Exploratory Study of Staff Reactions to Electronic Monitoring and Control of E-mail Systems in One Company Based in Ireland Aidan Duane, and Patrick Finnegan (2007). Information Resources Management Journal (pp. 1-13). ...

Association Rules Mining for Retail Organizations

A web services-oriented architecture for integrating small programmable objects in the web of things

Proceedings - 3rd International Conference on Developments in eSystems Engineering, DeSE 2010, 2010

In this work, we present a concrete framework, based on a web services-oriented architecture, for integrating small programmable objects in the web of things. Functionality and data gathered by the Small Programmable Objects (SPO) are exposed using Web Services. Based on this, and by exploiting XML encoding, SPO can be understood by any web application. The proposed architecture focuses on providing secure and efficient interoperability between SPO and the web. Additionally, it provides management capabilities for deploying, maintaining and operating SPO applications across multiple networks. We present the multilayer architecture of our system and its implementation, which uses a combination of Java Standard and Micro Editions. Finally, we present a case study demonstrating our implementation. In this application we use SunSPOTs, which are wireless network motes developed by Sun Microsystems.

Code Quality Evaluation Methodology Using The ISO/IEC 9126 Standard

International Journal of Software Engineering & Applications, 2010

This work proposes a methodology for source code quality and static behaviour evaluation of a software system, based on the standard ISO/IEC-9126. It uses elements automatically derived from source code enhanced with expert knowledge in the form of quality characteristic rankings, allowing software engineers to assign weights to source code attributes. It is flexible in terms of the set of metrics and source code attributes employed, even in terms of the ISO/IEC-9126 characteristics to be assessed. We applied the methodology to two case studies, involving five open source and one proprietary system. Results demonstrated that the methodology can capture software quality trends and express expert perceptions concerning system quality in a quantitative and systematic manner.

Locating Maximal Multirepeats in Multiple Strings Under Various Constraints

The Computer Journal, 2006

A multirepeat in a string is a substring (factor) that appears a predefined number of times. A multirepeat is maximal if it cannot be extended either to the right or to the left and produce a multirepeat. In this paper, we present algorithms for two different versions of the problem of finding maximal multirepeats in a set of strings. In the case of arbitrary gaps, we propose an algorithm with O(σN²n + α) time complexity. When the gap is bounded in a small range c, we propose an algorithm with O((c² + σ²)mN²n log(Nn) + α) time complexity. Here, N is the number of strings, n the mean length of each string, m the multiplicity of the multirepeat and α the number of reported occurrences. Our results extend previous work by considering sets of strings as well as by generalizing pairs to multirepeats.
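
The definition of a maximal multirepeat in the abstract above can be illustrated with a small brute-force sketch. It is not the paper's algorithm: it handles a single string, ignores gap constraints, and reads "a predefined number of times" as "at least m times", all of which are assumptions made for illustration. The function name `maximal_multirepeats` is hypothetical.

```python
from collections import defaultdict


def maximal_multirepeats(text: str, m: int) -> dict[str, list[int]]:
    """Brute-force illustration (assumed reading of the definition).

    A substring counts as a multirepeat if it occurs at least m times
    (overlaps allowed). It is reported as maximal if no one-character
    extension to the left or right is itself a multirepeat.
    """
    n = len(text)
    # Record the start position of every occurrence of every substring.
    occurrences: dict[str, list[int]] = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n + 1):
            occurrences[text[i:j]].append(i)

    repeats = {w for w, pos in occurrences.items() if len(pos) >= m}
    alphabet = set(text)

    result: dict[str, list[int]] = {}
    for w in repeats:
        extendable = any(
            (c + w) in repeats or (w + c) in repeats for c in alphabet
        )
        if not extendable:
            result[w] = occurrences[w]
    return result


print(maximal_multirepeats("abcabcab", 2))  # {'abcab': [0, 3]}
print(maximal_multirepeats("abcabcab", 3))  # {'ab': [0, 3, 6]}
```

Enumerating all substrings takes quadratic space and is only meant to make the definition concrete; the complexity bounds quoted in the abstract refer to the paper's own algorithms, which this sketch does not reproduce.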

Clustering for Monitoring Software Systems Maintainability Evolution

Electronic Notes in Theoretical Computer Science, 2009

This paper presents ongoing work on using data mining clustering to support the evaluation of software systems’ maintainability. As input for our analysis we employ software measurement data extracted from Java source code. We propose a two-step clustering process which facilitates the assessment of a system’s maintainability at first, and subsequently an in-cluster analysis in order to study the evolution

SmartSantander: IoT experimentation over a smart city testbed

Computer Networks, 2014

This paper describes the deployment and experimentation architecture of the Internet of Things experimentation facility being deployed in the city of Santander. The facility is implemented within the SmartSantander project, one of the projects of the Future Internet Research and Experimentation initiative of the European Commission, and represents a unique city-scale experimental research facility. Additionally, the facility supports typical applications and services of a smart city. Tangible results are expected to influence the definition and specification of Future Internet architecture design from the viewpoints of the Internet of Things and the Internet of Services. The facility comprises a large number of Internet of Things devices deployed in several urban scenarios, which will be federated into a single testbed. This paper describes the deployment being carried out at the main location, namely the city of Santander. Besides the current deployment, it presents the main insights into the architectural design of a large-scale IoT testbed. Furthermore, the solutions adopted for implementing the different components that address the required testbed functionalities are sketched out. The IoT experimentation facility described in this paper is conceived to provide a suitable platform for large-scale experimentation and evaluation of IoT concepts under real-life conditions.

A WebPage Usage Prediction Scheme Using Weighted Suffix Trees

String Processing and Information Retrieval, 2007

In this paper we consider the problem of web page usage prediction in a web site by modeling users’ navigation history with weighted suffix trees. This user’s navigation prediction can be exploited either in an on-line recommendation system in a website or in a web-page cache system. The method proposed has the advantage that it demands a constant amount of
