Sung-Hyon Myaeng | KAIST - Academia.edu

Papers by Sung-Hyon Myaeng

Analysis of Sentence-Level Paraphrase Types and Errors Using Predicate-Argument Tuple-Based Approximate Alignment (술어-논항 튜플 기반 근사 정렬을 이용한 문장 단위 바꿔쓰기표현 유형 및 오류 분석)

The KIPS Transactions: Part B, 2012

Topic-based place semantics discovered from microblogging text messages

Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion, 2014

Exploring the user-generated content (UGC) uploading behavior on YouTube

Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion, 2014

Introducing the omega-machine

Proceedings of the 23rd International Conference on World Wide Web - WWW '14 Companion, 2014

In this paper, we propose the Ω-machine model for social machines. By introducing a cluster of "oracles" to a traditional Turing machine, the Ω-machine is capable of describing the interaction between human participants and mechanical machines. We also give two examples of social machines, collective intelligence and rumor spreading, and demonstrate how the general Ω-machine model could be used to simulate their computations.

Improving Rule Generation Precision for Domain Knowledge based Wrappers

International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06), 2000

Wrappers play an important role in extracting specified information from various sources. The wrapper rules by which information is extracted are often created from domain-specific knowledge. Domain-specific knowledge helps in recognizing the meaning of the text representing various entities and values and in detecting their formats. However, such domain knowledge becomes powerless when value-representing data are not labeled with appropriate textual descriptions, or when there is nothing but a hyperlink where certain text labels or values are expected. To alleviate these problems, we propose a probabilistic method for recognizing the entity type, i.e. generating wrapper rules, when there is no label associated with value-representing text. In addition, we have devised a method for using the information reachable by following hyperlinks when textual data are not immediately available on the target Web page. Our experimental work shows that the proposed methods help increase the precision of the resulting wrapper, particularly in extracting the title information, the most important entity on a Web page. The proposed methods can be useful in making a more efficient and correct information extraction system for various sources of information without user intervention.

Introduction to the special issue on computer processing of oriental languages

Information Retrieval Technology, Asia Information Retrieval Symposium, AIRS 2004, Beijing, China, October 18-20, 2004, Revised Selected Papers

... The low acceptance rates witness the success of this year's conference. ... Tokunaga Tee Kiah Chia Mu Li Hsin-Min Wang Lee-Feng Chien Hongqiao Li Haifeng Wang Fabio Crestani Chin-Yew Lin Ross Wilkinson Edward A. Fox Robert Luk Kam-Fai Wong Jianfeng Gao Wei ...

DR-LINK: A System Update for TREC-2

Exploring Social Networks with Topical Analysis

As the number of social networking services (SNS) and their users grows, so does the complexity of individual networks as well as the amount of information to be consumed by the users. It is essential to reduce this complexity and information overload, and we have embarked on exploring topical aspects of SNS to form refined topic-based semantic social networks. Our current work focuses on conversational aspects of SNS and attempts to utilize the notions of topic diversity and topic purity between two users sharing conversations. This topic-based analysis of SNS makes it possible to show different types of users and their conversational characteristics. It also shows the possibility of breaking down a huge "syntactic" social network into topic-based ones based on different interaction types, so that the resulting semantic social networks can be useful in designing various targeted services on online social networks.

Development of a Document Summarization System for Effective Information Services

Recherche d'Information Assistee par Ordinateur, 1997

Opinion Analysis based on Lexical Clues and their Expansion

The challenge of automatic opinion analysis has been the focus of attention in recent years in many domains, such as online product reviews. Opinion analysis has especially good prospects in online news articles, since newspapers are the most powerful medium for disseminating people's opinions. We introduce an approach to this task that exploits lexical information, based on a quantitative analysis of opinions in news articles. The method comprises semi-supervised subjectivity classification, gloss-based sentiment classification, and a rule-based opinion holder finder. The method we present is notable in that a number of the lexical clues we discovered were effective for this task. The experimental results show that our system achieves 45% performance in extracting opinionated sentences and 35% performance in identifying opinion holders.

Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access, 6th International Conference on Asian Digital Libraries, ICADL 2003, Kuala Lumpur, Malaysia, December 8-12, 2003, Proceedings

This book constitutes the refereed proceedings of the 6th International Conference on Asian Digital Libraries, ICADL 2003, held in Kuala Lumpur, Malaysia in December 2003. The 68 revised full papers presented together with 15 poster abstracts and 3 invited papers were carefully reviewed from numerous submissions. The papers are organized in topical sections on information retrieval techniques, multimedia digital libraries, data mining and digital libraries, machine architecture and organization, human resources and training, ...

Multi-perspective User Preference Learning in a Chatting Domain

Topic and Topic Change Detection in Instance Messaging

Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2008, Singapore, July 20-24, 2008

A Patent Retrieval Method using Semantic Annotations

Automatic annotation of key phrases for their semantic categories can help improve the effectiveness of a variety of text-based systems, including information retrieval, summarization, and question answering. In this paper, we exploit semantic annotations for patent retrieval (i.e., patent invalidity search). We first annotated key phrases for two semantic categories, PROBLEM (e.g. "pattern matching") and SOLUTION (e.g. "dynamic programming"), in a patent document, which together constitute a particular technology. Semantic clusters are formed by grouping patent documents with the same PROBLEM or SOLUTION tag. A language modelling approach to information retrieval is extended to consider the semantically oriented clusters as well as document models. Our retrieval evaluation of the proposed approach using a collection of United States patent documents shows a 22% improvement over the baseline, a smoothed language modelling approach without the semantic annotations.

A Topic Transition Map for Query Expansion: A Semantic Analysis of Click-Through Data and Test Collections

Lecture Notes in Computer Science, 2016

Generating Domain-Specific Clues Using News Corpus for Sentiment Classification

Visualization of Retrieved Documents using a Presentation Server

Predicting Actions with Commonsense Knowledge: an Application in Event-Driven Ad Placement

Inferring or predicting actions to be taken by the user of a system is essential to the development of intelligent user interfaces. We employ ConceptNet, a large-scale commonsense knowledge base, and its derivative called EventNet to predict events associated with a given event. To validate the usefulness of inferred events, we developed a contextual ad placement system that accepts news articles and retrieves relevant ads. By predicting what actions interested readers would take and what objects or entities are related to those actions, we attempt to get at the needs and intents of potential beneficiaries of ads, moving one step beyond the topical similarities used in modern ad placement systems. Related objects to be used for ad searching are found by using ConceptNet and WordNet. Preliminary experiments show that the proposed approach based on events and predicted actions retrieves unique sets of relevant ads that are not retrieved at all by either keyword-based or taxonomy-...
