Sung-Hyon Myaeng | KAIST - Academia.edu
Papers by Sung-Hyon Myaeng
The KIPS Transactions:PartB, 2012
Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion, 2014
Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion, 2014
In this paper, we propose the Ω-machine model for social machines. By introducing a cluster of "oracles" to a traditional Turing machine, the Ω-machine is capable of describing the interaction between human participants and mechanical machines. We also give two examples of social machines, collective intelligence and rumor spreading, and demonstrate how the general Ω-machine model could be used to simulate their computations.
International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06), 2000
Wrappers play an important role in extracting specified information from various sources. The wrapper rules by which information is extracted are often created from domain-specific knowledge. Domain-specific knowledge helps recognize the meaning of the text representing various entities and values, and helps detect their formats. However, such domain knowledge becomes powerless when value-representing data are not labeled with appropriate textual descriptions, or when there is nothing but a hyperlink where certain text labels or values are expected. To alleviate these problems, we propose a probabilistic method for recognizing the entity type, i.e. generating wrapper rules, when there is no label associated with value-representing text. In addition, we have devised a method for using the information reachable by following hyperlinks when textual data are not immediately available on the target Web page. Our experimental work shows that the proposed methods help increase the precision of the resulting wrapper, particularly in extracting the title information, the most important entity on a Web page. The proposed methods can be useful in making a more efficient and correct information extraction system for various sources of information without user intervention.
... The low acceptance rates witness the success of this year's conference. ... Tokunaga Tee Kiah Chia Mu Li Hsin-Min Wang Lee-Feng Chien Hongqiao Li Haifeng Wang Fabio Crestani Chin-Yew Lin Ross Wilkinson Edward A. Fox Robert Luk Kam-Fai Wong Jianfeng Gao Wei ...
As the number of social networking services (SNS) and their users grows, so does the complexity of individual networks as well as the amount of information to be consumed by the users. Reducing this complexity and information overload is a necessity, and we have embarked on exploring topical aspects of SNS to form refined topic-based semantic social networks. Our current work focuses on conversational aspects of SNS and attempts to utilize the notions of topic diversity and topic purity between two users sharing conversations. This topic-based analysis of SNS makes it possible to show different types of users and their conversational characteristics. It also shows the possibility of breaking down a huge "syntactic" social network into topic-based ones based on different interaction types, so that the resulting semantic social networks can be useful in designing various targeted services on online social networks.
Recherche d'Information Assistee par Ordinateur, 1997
Automatic opinion analysis has been a focus of attention in recent years in many domains, such as online product reviews. Opinion analysis of online news articles is especially promising, since newspapers are the most powerful medium for disseminating people's opinions. We introduce a lexical-information-based approach to this task, grounded in a quantitative analysis of opinions in news articles. The method comprises semi-supervised subjectivity classification, gloss-based sentiment classification, and a rule-based opinion holder finder. The method is notable in that a number of the lexical clues we discovered proved effective for this task. The experimental results show that our system achieves 45% performance in extracting opinionated sentences and 35% performance in identifying opinion holders.
This book constitutes the refereed proceedings of the 6th International Conference on Asian Digital Libraries, ICADL 2003, held in Kuala Lumpur, Malaysia in December 2003. The 68 revised full papers presented together with 15 poster abstracts and 3 invited papers were carefully reviewed from numerous submissions. The papers are organized in topical sections on information retrieval techniques, multimedia digital libraries, data mining and digital libraries, machine architecture and organization, human resources and training, ...
Automatic annotation of key phrases with their semantic categories can help improve the effectiveness of a variety of text-based systems, including information retrieval, summarization, and question answering. In this paper, we exploit semantic annotations for patent retrieval (i.e., patent invalidity search). We first annotate key phrases in a patent document with two semantic categories, PROBLEM (e.g. "pattern matching") and SOLUTION (e.g. "dynamic programming"), which together constitute a particular technology. Semantic clusters are formed by grouping patent documents with the same PROBLEM or SOLUTION tag. A language modelling approach to information retrieval is extended to consider the semantically oriented clusters as well as document models. Our retrieval evaluation of the proposed approach using a collection of United States patent documents shows a 22% improvement over the baseline, a smoothed language modelling approach that does not use the semantic annotations.
Lecture Notes in Computer Science, 2016
Inferring or predicting actions to be taken by the user of a system is essential to the development of intelligent user interfaces. We employ ConceptNet, a large-scale commonsense knowledge base, and its derivative called EventNet to predict events associated with a given event. To validate the usefulness of inferred events, we developed a contextual ad placement system that accepts news articles and retrieves relevant ads. By predicting what actions interested readers would take and what objects or entities are related to those actions, we attempt to get at the needs and intents of potential beneficiaries of ads, moving one step beyond the topical similarities used in modern ad placement systems. Related objects to be used for ad searching are found by using ConceptNet and WordNet. Preliminary experiments show that the proposed approach based on events and predicted actions retrieves unique sets of relevant ads that are not retrieved at all by either keyword-based or taxonomy-...