Asanee Kawtrakul | Kasetsart University
Papers by Asanee Kawtrakul
Proceedings of the 7th International Conference on Management of computational and collective intElligence in Digital EcoSystems, 2015
In this paper, we present a personal warning system model as one of the main functions of the RICE WATCH system, which supports farmers in rice pest management. The model combines time-based notification, triggered by a crop calendar, with situation-based notification using a BUS model. The system also includes "What to Do Next," a knowledge integration module that advises on how to prevent or treat pests appropriately. Warnings and notifications are delivered to farmers through multiple communication channels such as SMS, e-mail, web browser and mobile application. The recommendations are generated by an inference engine that deduces disease-prevention tasks and/or disease treatments. Based on estimates by the Pest Forecasting and Early Warning Group, appropriate management could reduce the risk from pests by 80% and the cost of pest management by 50%.
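The time-based notification idea can be sketched as a lookup from days since planting to a reminder. This is only an illustration under stated assumptions: the name `due_notifications`, the calendar offsets and the reminder texts are hypothetical placeholders, not taken from the RICE WATCH system.

```python
from datetime import date

# Hypothetical crop calendar: days after planting -> reminder text.
# Offsets and messages are illustrative placeholders, not agronomic advice.
CROP_CALENDAR = {
    0: "Planting day: record the field and variety.",
    20: "Scout the field for early pest signs.",
    60: "Check the field again before flowering.",
}


def due_notifications(planting_date, today, calendar=CROP_CALENDAR):
    """Return the reminders scheduled for exactly `today`."""
    days_elapsed = (today - planting_date).days
    message = calendar.get(days_elapsed)
    return [message] if message is not None else []


# A field planted on 1 June has a reminder due 20 days later, on 21 June.
reminders = due_notifications(date(2015, 6, 1), date(2015, 6, 21))
```

A situation-based trigger would need live field data and is not covered by this sketch.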
We present the concept of a Question-Answering System for providing knowledge services. The system is based on a textual database of rice production and rice disease, which has been structured according to a number of ontological conceptual functions and associated annotations. Rich lexical knowledge is used to identify semantic roles in a question and to connect them with the domain knowledge base, in ontology and text formats, in order to respond to the question.
Advances in Computer Science and Technology, 2006
Mining Know-Why, or explanation knowledge, yields reasoning knowledge that is useful for everyday diagnosis. This framework discovers causality between causative antecedent and effective consequent discourse units. There are two main problems in causality extraction: cause-effect identification and cause-effect boundary determination. Cause-effect identification can be solved by learning verb pairs and the lexico-syntactic pattern (NP1 V NP2) from an annotated corpus, using a Naïve Bayes classifier. Cause-effect boundary determination can be solved by using centering theory and cue phrases (causality links), which include discourse markers and verb phrases. Our causality extraction model achieves precision and recall of 86% and 70% respectively, evaluated against expert results.
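As a rough illustration of classifying verb pairs with Naïve Bayes, the sketch below trains a small categorical model with Laplace smoothing on a toy set. The class name `VerbPairNaiveBayes`, the verb pairs and the labels are all hypothetical; the abstract does not specify the paper's features or training data, so this is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


class VerbPairNaiveBayes:
    """Categorical Naive Bayes over two features: one verb per discourse unit."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant
        self.class_counts = Counter()
        # feature_counts[label][slot][verb] -> count
        self.feature_counts = defaultdict(lambda: [Counter(), Counter()])
        self.vocab = [set(), set()]

    def fit(self, pairs, labels):
        for (v1, v2), label in zip(pairs, labels):
            self.class_counts[label] += 1
            for slot, verb in enumerate((v1, v2)):
                self.feature_counts[label][slot][verb] += 1
                self.vocab[slot].add(verb)
        return self

    def log_scores(self, pair):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            for slot, verb in enumerate(pair):
                count = self.feature_counts[label][slot][verb]
                # The extra +1 reserves probability mass for unseen verbs.
                denom = n + self.alpha * (len(self.vocab[slot]) + 1)
                score += math.log((count + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, pair):
        scores = self.log_scores(pair)
        return max(scores, key=scores.get)


# Toy, hand-made training data (hypothetical, not from the annotated corpus).
train_pairs = [
    ("infect", "wilt"), ("attack", "die"), ("infect", "die"),
    ("grow", "flower"), ("plant", "harvest"), ("water", "grow"),
]
train_labels = ["causal"] * 3 + ["non-causal"] * 3

model = VerbPairNaiveBayes().fit(train_pairs, train_labels)
prediction = model.predict(("infect", "die"))
```

Boundary determination with centering theory and cue phrases is a separate step that this sketch does not attempt.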
Summary: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between laymen’s terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens, as well as geographical locations with their latitudes and longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and event recognition. Higher-order event analysis is used to detect more precisely specified warning signals, which can then be sent to registered users via email alerts. Evaluation of the system for topic recog...
With the development of the Internet and the World Wide Web, the enormous amount of knowledge resources has become an obstacle that keeps knowledge consumers from accessing the information they need effectively and efficiently. Knowledge fusion is one solution to this problem. This paper introduces CyberBrain, a framework that combines approaches from Knowledge Engineering and Language Engineering to provide an effective knowledge service. CyberBrain is a dynamic structure that interconnects organizations and communities. It behaves like a natural ecosystem: self-organizing, emergent and adaptive in acquiring, collecting, extracting and aggregating related knowledge. With CyberBrain, appropriate and personalized knowledge services will be provided to support problem solving, decision making and early warning. At its current state, the framework is demonstrated with the Rice Knowledge Portal using PMM (Problem-Methods-Man) map generation. In addition, AGROVOC concept Server h...
The critical issue in ontology construction is to extract concepts and identify ontological relations, both taxonomic and other semantic relations. In large and varied domains, this task can be time-consuming and costly. In this paper, we propose a methodology for discovering semantic relations embedded in Thai NPs in order to enrich existing domain ontologies, using machine learning techniques to learn the common ancestral concept of an NP’s head and modifier. However, Thai has no knowledge base like WordNet for identifying the ancestral concept of a term, so we used a Thai-English general dictionary and a Thai-English thesaurus to translate Thai words into English, and defined each word class from WordNet using heuristic rules and partial decisions from an expert. The presented system exhibits performance comparison, in total average, between two machine learning algorithms: SVM and C4.5 are 84.10% and 78.82% of precision and 76.92% and 73.26% of recall re...
This paper proposes a method for expanding Thai Lexie from a monolingual dictionary to a bilingual dictionary by utilizing multiple resources. The existing 100,000 Thai words were used as queries to search web pages and extract word pairs from the Internet. In addition, a Thai Romanization model was developed to render Thai word pronunciation in the Roman alphabet, and a Forward Transliteration model was developed to create loan words. Examples of word usage were also provided to add value to the dictionary, and the dictionary database can be used by other applications. The dictionary is stored in XML format, since XML is flexible enough to be used by applications as well as by the Papillon Dictionary. In an experiment with 120 Thai word queries, the accuracy of the word pairs is 69.89%.
Improvements in hardware, communication technology and databases have led to an explosion of multimedia information repositories. To provide quality of information retrieval and quality of service, it is necessary to consider both retrieval techniques and database architecture. This paper presents a project named VLSHDS (Very Large Scale Hypermedia Delivery System). The quality of textual information search is enhanced by using NLP techniques. The quality of service over a
This paper presents an efficient, yet fine-grained, approach to parsing Thai texts. The approach is intended to resolve omission problems and sentential-NP grouping for Thai-English machine translation. The omission problems are zero anaphora, the absence of explicit tense and number, and the absence of explicit topic markers. An augmented state transducer was used to resolve noun grouping, and lexical functional grammar (LFG) was applied to identify omissions. The experiment showed that the augmented state transducer could properly resolve sentential-noun grouping, while most omissions could be identified by the lexical functional grammar. On average, the parser yields 80.72% accuracy, and the number of trees produced is 30.36% lower than that of the original LFG.
English-Thai MT systems are currently restricted by incomplete vocabularies and translation knowledge. Users must consequently accept a single translation result that is sometimes semantically divergent or ungrammatical. For this reason, we propose novel Internet-based translation assistant software to facilitate document translation from English to Thai. In this project, we use the structural transfer model as the mechanism. The project differs from current English-Thai MT systems in that it lets users manually select the most appropriate translation from every possibility and, if necessary, manually train new translation rules into the system. With the applied model, we overcome four translation problems: lexicon rearrangement, structural ambiguity, phrase translation, and classifier generation. Finally, we started the system evaluation with 322 randomly selected sentences on the Future Magazine bilingual corpus and the system yie...
Connected-digit speech recognition is important in many applications such as voice-dialing telephones, automated banking systems, automatic data entry and PIN entry. This research presents a speaker-independent speech recognition system for Thai connected digits. The system employs mel frequency cepstrum coefficients (MFCC), delta MFCC, delta-delta MFCC, delta energy and delta-delta energy as features, and applies a continuous density hidden Markov model (CDHMM) in the recognition process. The Viterbi beam search algorithm is used in the decoding process. The training set consists of 2000 utterances from 100 speakers (50 female, 50 male) aged 20-28. The test set uses 50 speakers (25 female, 25 male). The average recognition rate is 75.25% for known-length strings and 70.33% for unknown-length strings.
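Delta and delta-delta features are usually derived from the static features by a regression over neighbouring frames. The sketch below uses one commonly used regression formula with edge frames repeated at the boundaries; the abstract does not say which formula or window size the system used, so the function name `delta` and the `window=2` default are assumptions.

```python
def delta(frames, window=2):
    """Regression-based delta features.

    frames: list of equal-length feature vectors, one per time step.
    Frames beyond either end are replaced by the nearest edge frame.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        deltas.append(row)
    return deltas


# Toy one-dimensional "feature" rising by 1 per frame (not real MFCC values).
ramp = [[0.0], [1.0], [2.0], [3.0], [4.0]]
ramp_delta = delta(ramp)              # first-order (delta) features
ramp_delta_delta = delta(ramp_delta)  # second-order (delta-delta) features
```

Applying the same function to the energy track would give delta energy and delta-delta energy in the same way.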
A thesaurus is one of the most important components of an information retrieval (IR) system. It provides a precise and controlled vocabulary that coordinates document indexing and retrieval, and thereby improves retrieval effectiveness. However, building a thesaurus manually is labor-intensive, and therefore expensive, and a manual thesaurus is hard to update in a timely manner. Consequently, this paper proposes an approach to constructing a Thai thesaurus automatically, called a Thai association thesaurus, based on statistical and natural language processing techniques.
This paper introduces a new project called STREDEO: The Development of Multimedia-Multilingual Document Storage, Retrieval and Delivery System for E-Organization. STREDEO aims to provide a system for multimedia, multilingual document management consisting of storage, retrieval and delivery. The project can be divided into seven subprojects, which are: The
In this paper, a Web-Based Multilingual Technical Dictionary Development Project is introduced. The project provides a system that supports the development of vocabulary meanings in many different fields on the Internet. The project has two phases. Phase 1 is the development of a bilingual English-Thai dictionary, with the objective of keeping the Thai meanings up to date via the Internet. In Phase 2, we will extend the project to a multilingual dictionary by using English as a hub. In this paper, we focus on Phase 1 and on the technical dictionary only.
Problem-solving is one important form of intelligent behavior, where the goal is to find a solution that satisfies certain criteria. Lessons learned from past problem solving (e.g. how to prevent a disease, how to control a disease) and valuable information tracked from previous events and their recurrence (e.g. disease outbreaks) are very important for knowledge sharing. This valuable knowledge is distributed over several websites, among various kinds of sources, in heterogeneous and unstructured formats. To reduce the time users spend accessing it, an information space that is constructed, linked and accompanied by digested information should be developed. This paper presents a framework for constructing a knowledge map from websites spread throughout the Internet, focusing on semantic links between Problems, problem-solving Methods and problem-solver Man (PMM map). Based on specific-task ontology as a representation of specific-domain conceptualization, a kno...