Ivan Stankov | Cardiff University

Papers by Ivan Stankov

Semantically enhanced document clustering

I would like to thank the supervisors of my studies, Professor Rossi Setchi and Dr Yulia Hicks, for their invaluable guidance and support throughout my work. All members of the KES research group from the School of Engineering at Cardiff University are thanked for their friendship and help. My deepest gratitude is to my family, who have given me continuous support and encouragement.

Enhanced cross-domain document clustering with a semantically enhanced text stemmer (SETS)

International Journal of Knowledge-based and Intelligent Engineering Systems, May 13, 2013

The aim of document clustering is to produce coherent clusters of similar documents. Clustering algorithms rely on text normalisation techniques to represent and cluster documents. Although most document clustering algorithms perform well in specific knowledge domains, processing cross-domain document repositories is still a challenge. This paper attempts to address this challenge. It investigates the performance of the sk-means clustering algorithm across domains by comparing the cluster coherence produced with semantic-based and traditional TF-IDF-based document representations. The evaluation is conducted on 20 different generic sub-domains of a thousand documents each, randomly selected from the Reuters-21578 corpus. The experimental results demonstrate improved coherence of clusters produced by using a semantically enhanced text stemmer (SETS), when compared to the text normalisation obtained with the Porter stemmer. In addition, semantic-based text normalisation is shown to be resistant to noise, which is often introduced in the index aggregation stage, the stage that acquires features to represent documents.
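The TF-IDF baseline named in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy documents, the function names `tfidf_vectors` and `cosine`, and the particular weighting (relative term frequency times log inverse document frequency) are assumptions chosen for illustration. Vectors are scaled to unit length so that a dot product equals cosine similarity, which is the kind of comparison a cosine-based clusterer relies on.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one unit-length TF-IDF vector (a dict) per tokenised document."""
    n = len(docs)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Unit-normalise so that a dot product equals cosine similarity.
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


# Hypothetical toy corpus, already tokenised (no stemming applied).
docs = [
    "oil price rise".split(),
    "oil price fall".split(),
    "wheat harvest report".split(),
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In this toy corpus the two oil documents share weighted terms and so score above zero, while the wheat document shares none with them. A stemmer changes which surface forms collapse into the same term before this weighting is applied.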

Semantic-based information retrieval in support of concept design

Advanced Engineering Informatics, Apr 1, 2011

This research is motivated by the realisation that semantic technology can be used to develop computational tools in support of designers’ creativity by focusing on the inspirational stage of design. The paper describes a semantic-based image retrieval tool developed for the needs of concept car designers from two renowned European companies. It is created to help them find and interpret ...

Semantically Enhanced Text Stemmer (SETS) for Document Clustering

KES, 2012

This paper focuses on processing cross-domain document repositories, which is challenged by word ambiguity and by the fact that monosemic words are more domain-oriented than polysemic ones. The paper describes a semantically enhanced text normalization algorithm (SETS) aimed at improving document clustering and investigates the performance of the sk-means clustering algorithm across domains by comparing the cluster coherence produced with semantic-based and traditional (TF-IDF-based) document representations. The evaluation is conducted on 20 generic sub-domains of a thousand documents each, randomly selected from the Reuters-21578 corpus. The experimental results demonstrate improved coherence of the clusters produced by SETS compared to the text normalization obtained with the Porter stemmer. In addition, semantic-based text normalization is shown to be resistant to noise, which is often introduced in the index aggregation stage.

Semantically Enhanced Text Stemmer (SETS) for Cross-Domain Document Clustering

Lecture Notes in Computer Science, 2013

This paper focuses on processing cross-domain document repositories, which is challenged by word ambiguity and by the fact that monosemic words are more domain-oriented than polysemic ones. The paper describes a semantically enhanced text normalization algorithm (SETS) aimed at improving document clustering and investigates the performance of the sk-means clustering algorithm across domains by comparing the cluster coherence produced with semantic-based and traditional (TF-IDF-based) document representations. The evaluation is conducted on 20 generic sub-domains of a thousand documents each, randomly selected from the Reuters-21578 corpus. The experimental results demonstrate improved coherence of the clusters produced by SETS compared to the text normalization obtained with the Porter stemmer. In addition, semantic-based text normalization is shown to be resistant to noise, which is often introduced in the index aggregation stage.

Development of a System for “Home Control on Fingertips”

Nowadays, advanced technology devices are all around us. We take them for granted and use them in all spheres of social life. We decided to combine the fast development of technology with daily human habits in a centralized system that we called "Home Control on Fingertips" (HOCFIT). The system is designed as a hierarchical three-layer model: master, sub-masters, and end-devices. Each layer has its own features and suitable interfaces for communicating with the others. The master accepts commands from the authorized user and transmits them to the designated sub-master, and the sub-master manipulates the end-device. The communication software is based on the TCP/IP protocol stack. The advantage of this system lies in its universal design, which allows easy and fast future development when connecting new devices for control or data acquisition, as well as expansion of the system's functionality. The extendibility and flexibility of the system are assured by using popular technologies such as Bluetooth, integrated in GSM phones, PDAs, notebooks, etc., and IrDA, used in the remote controls that almost every new domestic appliance has today.

Discussion of Microkernel and Monolithic Kernel Approaches

The paper compares the monolithic and microkernel approaches, both used in operating systems. It presents the most critical features of operating systems concerning system architecture, communication model, process priority, memory management, and the handling of exceptions and interrupts. Interrupt handling is a crucial factor for real-time operating systems (RTOS): interrupts are expected to be handled in real time, within a predictable time. The paper points out the benefits and weak points of each approach. Finally, a conclusion is drawn about the appropriate deployment of each type of operating system, based on facts and results as well as on predicted functionality. The paper also discusses embedded systems as one of the fastest-developing classes of system and the most suitable platform for an RTOS installation or other operating system (OS) functionality.

On the Use of XML for Data Access in Networks of Embedded Devices

The paper deals with embedded networking and distributed embedded systems, which have been a key area of research in recent years. It presents a custom protocol for data access in embedded devices, CNDEP (Controller Network Data Extracting Protocol). The presented work is based on the presumption that all data exchanged should be encoded in a universal, platform-independent format, XML. The XML-based version of CNDEP should provide interoperation between embedded devices, considered as components. The underlying communication is UDP/IP/Ethernet, which ensures good integration of embedded devices with other information systems. CNDEP is designed as part of a Multi-tier Client/Server System for Distributed Measurement and Control, where it is used for data transfer in the data-producing tier. The transfer delays for the protocol are measured and compared to those of the non-XML implementation in order to evaluate the effectiveness and efficiency of the realization.

Discussion of Microkernel and Monolithic Kernel Approaches

dsnet.bjr-labs.com

Nowadays more and more resources are devoted to improving the performance of existing software/hardware systems in order to produce new ones more powerful than ever. In the world of embedded systems and real-time systems everything is crucial – software applications, involving ...

Semantically enhanced document clustering

Cardiff University, 2013

I would like to thank the supervisors of my studies, Professor Rossi Setchi and Dr Yulia Hicks, for their invaluable guidance and support throughout my work. All members of the KES research group from the School of Engineering at Cardiff University are thanked for their friendship and help. My deepest gratitude is to my family, who have given me continuous support and encouragement.

Study the Time of Access to Data in Heterogeneous Databases

This article assesses the performance of two of the most popular databases, ORACLE and MS SQL, with and without indexing data. It begins with a brief overview of the opportunities that allowed us to compare the two databases and to create a heterogeneous database. The main objective is to examine the execution time of queries when searching for information in various databases and to decide how the times can be improved.

Controller Network Data Extracting Protocol – Design and Implementation

International Conference on Computer Systems and Technologies - CompSysTech’06

The paper presents the design and implementation of a UDP-based protocol for Distributed Automation Systems. It is based on client/server interactions. The protocol specification is given together with its syntax, grammar and semantics. Message formats, protocol vocabulary and communication rules are described. The possible applications of the protocol are discussed, and sample implementations of the server and client are shown in the paper. Initial tests of the effectiveness of the protocol are made. The experiments are test-bed experiments, carried out in the experimental network of the “Distributed Systems and Computer Networks Lab” at the Technical University of Sofia, Plovdiv branch

Data exchange technologies in sensor networks

2020 12th Electrical Engineering Faculty Conference (BulEF), 2020

The present article is based on the premise that there is a real need for reading, storing and reliably transferring information from sensors. The goal is to present, analyze and evaluate the widely accessible and easily exploited LoRa technology for data exchange in a variety of industrial sensor networks. A comparative analysis of ZigBee, 6LoWPAN, Z-Wave and LoRa has been performed.

Controller Network Data Extracting Protocol - Design and Implementation

The paper presents the design and implementation of a UDP-based protocol for Distributed Automation Systems. It is based on client/server interactions. The protocol specification is given together with its syntax, grammar and semantics. Message formats, protocol vocabulary and communication rules are described. The possible applications of the protocol are discussed, and sample implementations of the server and client are shown in the paper. Initial tests of the effectiveness of the protocol are made. The experiments are test-bed experiments, carried out in the experimental network of the "Distributed Systems and Computer Networks Lab" at the Technical University of Sofia, Plovdiv branch (http://net-lab.tu-plovdiv.bg/). They include an evaluation of the communication capacity of the protocol. The minimum, maximum and average response times are calculated from the experimental results.
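The response-time evaluation described in this abstract can be sketched roughly as follows. This is an illustrative outline only, not the paper's test harness: `measure_response_times`, `summarise` and `fake_request` are hypothetical names, and `fake_request` merely stands in for whatever call would send one protocol request and wait for its reply.

```python
import statistics
import time


def measure_response_times(request, repetitions=100):
    """Call `request` repeatedly; return each round-trip time in milliseconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        request()  # assumed to block until the reply has arrived
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarise(samples):
    """Minimum, maximum and average of the measured response times."""
    return {
        "min": min(samples),
        "max": max(samples),
        "avg": statistics.mean(samples),
    }


def fake_request():
    # Hypothetical stand-in for a real request/response exchange.
    time.sleep(0.001)


stats = summarise(measure_response_times(fake_request, repetitions=20))
print(stats)
```

Passing the request as a callable keeps the timing loop independent of the transport, so the same loop could wrap a UDP client call without changes.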

Web Services and Data Integration in Distributed Automation and Information Systems in Internet Environment

International Journal on Information Technology, Jun 30, 2014

Development of a System for “Home Control on Fingertips”

tu-sofia.bg

Nowadays, advanced technology devices are all around us. We take them for granted and use them in all spheres of social life. We decided to combine the fast development of technology with daily human habits in a centralized system that we called “Home Control ...

Wireless Real-Time Gateway (WRTG) for Embedded Devices

ecad.tu-sofia.bg

This paper discusses the hierarchical manufacturing approach and the distributed model used by embedded systems for monitoring particular environmental parameters. A set of sensors for different kinds of measurements could be attached to the Wireless Real-Time ...

Web based application for distributed remote measurement viewing

Proceedings of ICEST, Sofia, 2006

In recent years, automation systems have become more complex and widespread, with various applications, owing to the ubiquitous use of communication technologies, especially the Internet [3]. This has led the IT market to huge growth and to increasing familiarity with devices such as pocket PCs, PDAs, ...
