Towards a Lexicon-grammar based Framework for NLP: an Opinion Mining Application

A Linguistic Approach to Opinion Mining

Reviews are used every day by ordinary people and by companies that need to make decisions. This volume of social data can be used to analyze the present and to predict near-future needs or probable changes. Mining opinions and comments is a way to extract knowledge from previous experience and from the feedback received. In this chapter we propose an automatic linguistic approach to Opinion Mining by means of a semantic analysis of textual resources, based on FreeWordNet, a newly developed linguistic resource. FreeWordNet has been defined by enriching the meanings expressed by adjectives and adverbs in WordNet with a set of properties and a polarity orientation. These properties are involved in the steps of distinguishing and identifying subjective, objective or factual sentences with polarity valence, and they contribute fundamentally to the task of feature contextualization.

A holistic lexicon-based approach to opinion mining

2008

One of the important types of information on the Web is the opinions expressed in user-generated content, e.g., customer reviews of products, forum posts, and blogs. In this paper, we focus on customer reviews of products. In particular, we study the problem of determining the semantic orientations (positive, negative or neutral) of opinions expressed on product features in reviews. This problem has many applications, e.g., opinion mining, summarization and search. Most existing techniques utilize a list of opinion (bearing) words (also called an opinion lexicon) for the purpose. Opinion words are words that express desirable (e.g., great, amazing, etc.) or undesirable (e.g., bad, poor, etc.) states. These approaches, however, all have some major shortcomings. In this paper, we propose a holistic lexicon-based approach to solving the problem by exploiting external evidence and linguistic conventions of natural language expressions. This approach allows the system to handle opinion words that are context dependent, which cause major difficulties for existing algorithms. It also deals with many special words, phrases and language constructs which have an impact on opinions based on their linguistic patterns. It also has an effective function for aggregating multiple conflicting opinion words in a sentence. A system, called Opinion Observer, based on the proposed technique has been implemented. Experimental results using a benchmark product review data set and some additional reviews show that the proposed technique is highly effective. It outperforms existing methods significantly.
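The aggregation step mentioned above can be illustrated with a minimal sketch. The abstract does not state the aggregation function, so the code below assumes one plausible form: each opinion word votes for or against a product feature, with its vote divided by its token distance to the feature. The lexicon, the negation list and the function name `feature_orientation` are all hypothetical and are not taken from Opinion Observer.

```python
import re

# Toy opinion lexicon (assumed for illustration): +1 desirable, -1 undesirable.
OPINION_LEXICON = {"great": 1, "amazing": 1, "bad": -1, "poor": -1}
# Assumed negation words; a real system would handle many more constructs.
NEGATIONS = {"not", "no", "never"}


def feature_orientation(sentence, feature):
    """Return 'positive', 'negative' or 'neutral' for `feature` in `sentence`."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if feature not in tokens:
        return "neutral"
    f_pos = tokens.index(feature)
    score = 0.0
    for i, tok in enumerate(tokens):
        if i == f_pos or tok not in OPINION_LEXICON:
            continue
        polarity = OPINION_LEXICON[tok]
        # Flip the polarity when the preceding token is a negation word.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            polarity = -polarity
        # Opinion words closer to the feature contribute more to its score.
        score += polarity / abs(i - f_pos)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "The battery is great but the screen is poor"
print(feature_orientation(review, "battery"))
print(feature_orientation(review, "screen"))
```

In this sketch the two conflicting opinion words in one sentence are resolved per feature: each feature is dominated by the opinion word nearest to it.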

A Subjectivity Detection Method for Opinion Mining Based on Lexical Resources

WWW/INTERNET 2011, 2011

Extraction and monitoring of opinions about a product or service is highly valuable to users of social media and to companies, both as a feedback mechanism and as a source of information for defining marketing campaigns. Opinion Mining is a subfield of Data Mining which aims to automatically classify opinions with respect to their polarity, i.e., to infer whether the author of an opinion has a positive or negative sentiment about some subject (e.g., a movie, a car model, an appliance, etc.). A common approach in the opinion mining task is to preprocess the collection of opinion texts used as the training set in order to remove the sentences of an opinion that do not contain subjective content. These so-called subjective extracts, one for each opinionated text, are then used in a second step to train a polarity classifier, which is used to predict the orientation of the original opinion (positive or negative). In this paper, we propose a new method for the subjectivity detection step of the opinion mining task. Our method is based on Part-of-Speech (POS) tagging each sentence of an opinionated text, and on the use of lexical resources to better generate the corresponding subjective extract. We take advantage of WordNet and SentiWordNet, two publicly available lexical resources, to calculate the association degrees between sentences of an opinionated document in the subjectivity detection step. We use well-known movie review datasets (from the Internet Movie Database) to provide comparative experiments, and we show a statistically significant increase of up to 9% in the classification accuracy of the resulting opinion mining system.
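A much simplified sketch of the subjectivity detection step follows. It is not the method of the paper: it does not compute association degrees between sentences, and it replaces the POS tagger and the SentiWordNet lookup with a small hand-written table (`TOY_LEXICON`, hypothetical). It only illustrates the general idea of keeping the sentences whose adjectives and adverbs carry sentiment scores, with subjectivity taken as the sum of a word's positive and negative scores.

```python
import re

# Hypothetical stand-in for POS tagging plus a SentiWordNet lookup:
# word -> (POS tag, positive score, negative score). Values are made up.
TOY_LEXICON = {
    "brilliant": ("ADJ", 0.75, 0.0),
    "dull": ("ADJ", 0.0, 0.625),
    "badly": ("ADV", 0.0, 0.5),
    "film": ("NOUN", 0.0, 0.0),
    "long": ("ADJ", 0.0, 0.0),
}
# Only adjectives and adverbs are counted in this sketch (an assumption).
SUBJECTIVE_TAGS = {"ADJ", "ADV"}


def sentence_subjectivity(sentence):
    """Average subjectivity per token, counting adjectives and adverbs only."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    if not tokens:
        return 0.0
    total = 0.0
    for tok in tokens:
        tag, pos, neg = TOY_LEXICON.get(tok, ("OTHER", 0.0, 0.0))
        if tag in SUBJECTIVE_TAGS:
            total += pos + neg
    return total / len(tokens)


def subjective_extract(text, threshold=0.05):
    """Keep the sentences whose subjectivity reaches the (assumed) threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if sentence_subjectivity(s) >= threshold]


review = ("The film runs for two hours. The acting is brilliant. "
          "The plot is dull and badly paced.")
print(subjective_extract(review))
```

The resulting list plays the role of the subjective extract that would be passed on to a polarity classifier in the second step.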

Tools and Techniques for Lexicon Driven Sentiment Analysis: A Review

User-generated content has grown on microblogging platforms such as Facebook, Twitter and Blogger in the form of client reviews, comments and opinions. This bulk of useful data is difficult and time-consuming to analyze. An intelligent text mining system is therefore needed that automatically analyzes such vast data and categorizes it into positive or negative classes. Such systems are difficult to design because the data is noisy: it suffers from spelling mistakes, grammatical errors and improper punctuation. Opinion mining is a useful tool for monitoring consumer feedback and public mood about a product in terms of negativity or positivity. For example, customer relations management can use this feedback to improve products in light of the complaints. Lexical tools are among the well-known and useful techniques for sentiment classification, and many extensions and modifications of these tools are now available. The purpose of this research is to study the available lexical tools and techniques and to raise interest in this research area.

Corpus-Based Techniques for Sentiment Lexicon Generation: A Review

2019

State-of-the-art sentiment analysis systems rely on a sentiment lexicon, which is the most essential feature that drives their performance. This resource is indispensable for, and greatly contributes to, sentiment analysis tasks. This is evident in the emergence of a large volume of research devoted to the development of automated sentiment lexicon generation algorithms. The task of tagging subjective words with a semantic orientation comprises two core approaches: dictionary-based and corpus-based. The former involves making use of an online dictionary to tag words, while the latter relies on co-occurrence statistics or syntactic patterns embedded in text corpora. The end result is a linguistic resource comprising a priori information about words, across the semantic dimension of sentiment. This paper provides a survey on the most prominent research works that utilize corpus-based techniques for sentiment lexicon generation. We also conduct a comparative analysis on the performance of state-of-the-art algorithms proposed for this task, and shed light on the current progress and challenges in this area.
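As one concrete instance of the co-occurrence statistics mentioned above, the sketch below scores a word by how much more often it shares a document with positive seed words than with negative ones, using a difference of pointwise mutual information (PMI) values. This is a generic illustration, not an algorithm taken from the survey. The seed sets, the additive smoothing constant and the function name `so_pmi` are assumptions.

```python
import math

# Assumed seed words with known orientation.
POSITIVE_SEEDS = {"good", "excellent"}
NEGATIVE_SEEDS = {"bad", "poor"}


def so_pmi(word, documents, smoothing=0.01):
    """Semantic orientation of `word`: PMI(word, positive) - PMI(word, negative).

    Co-occurrence is counted at document level. The difference of the two
    PMI terms simplifies to a single log ratio of four counts, because the
    corpus size and the count of `word` cancel out.
    """
    docs = [set(d.lower().split()) for d in documents]
    with_word = [d for d in docs if word in d]
    near_pos = sum(1 for d in with_word if d & POSITIVE_SEEDS)
    near_neg = sum(1 for d in with_word if d & NEGATIVE_SEEDS)
    all_pos = sum(1 for d in docs if d & POSITIVE_SEEDS)
    all_neg = sum(1 for d in docs if d & NEGATIVE_SEEDS)
    # Smoothing avoids division by zero and log(0) for unseen pairs.
    numerator = (near_pos + smoothing) * (all_neg + smoothing)
    denominator = (near_neg + smoothing) * (all_pos + smoothing)
    return math.log2(numerator / denominator)


corpus = [
    "sturdy and good build",
    "excellent sturdy case",
    "flimsy and bad hinge",
    "poor flimsy strap",
    "good price",
    "bad service",
]
for candidate in ("sturdy", "flimsy"):
    print(candidate, round(so_pmi(candidate, corpus), 2))
```

A positive score suggests a positive orientation and a negative score a negative one. A word that never appears gets a score of zero whenever the two seed sets are equally frequent in the corpus, as they are in this toy example.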

A verb lexicon model for deep sentiment analysis and opinion mining applications

2011

This paper presents a lexicon model for subjectivity description of Dutch verbs that offers a framework for the development of sentiment analysis and opinion mining applications based on a deep syntactic-semantic approach. The model aims to describe the detailed subjectivity relations that exist between the participants of the verbs, expressing multiple attitudes for each verb sense. Validation is provided by an annotation study that shows that these subtle subjectivity relations are reliably identifiable by human annotators.

Generating, Refining and Using Sentiment Lexicons

Theory and Applications of Natural Language Processing, 2012

In this chapter, which is based on [7-9], we report on work on the generation, refinement and use of sentiment lexicons that was carried out within the DuOMAn project. The project was focused on the development of language technology to support online media analysis. In the area of media analysis, one of the key tasks is collecting detailed information about opinions and attitudes toward specific topics from various sources, both offline (traditional newspapers, archives) and online (news sites, blogs, forums). Specifically, media analysis concerns the following system task: given a topic and list of documents (discussing the topic), find all instances of attitudes toward the topic (e.g., positive/negative sentiments, or, if the topic is an organisation or person, support/criticism of this entity). For every such instance, one should identify the source of the sentiment, the polarity and, possibly, subtopics that this attitude relates to (e.g., specific targets of criticism or support). Subsequently, a (human) media analyst must be able to aggregate the extracted information by source, polarity or subtopics, allowing him to build support/criticism networks etc. [1]. Recent advances in language technology, especially in sentiment analysis, promise to (partially) automate this task. Sentiment analysis is often considered in the context of the following two tasks:

“Expresses-an-opinion-about”: using corpus statistics in an information extraction approach to opinion mining

2010

We present a technique for identifying the sources and targets of opinions without actually identifying the opinions themselves. We are able to use an information extraction approach that treats opinion mining as relation mining; we identify instances of a binary "expresses-an-opinion-about" relation. We find that we can classify source-target pairs as belonging to the relation at a performance level significantly higher than two relevant baselines. This technique is particularly suited to emerging approaches in corpus-based social science which focus on aggregating interactions between sources to determine their effects on socio-economically significant targets. Our application is the analysis of information technology (IT) innovations. This is an example of a more general problem where opinion is expressed using either sub- or supersets of expressive words found in newswire. We present an annotation scheme and an SVM-based technique that uses the local context as well as...

An Overview of Lexicon-Based Approach For Sentiment Analysis

2018

Sentiment Analysis is the extraction of thoughts, attitudes and subjectivity from text to identify polarity, i.e., positive, negative or neutral. Three methods are available for sentiment analysis: supervised, lexicon-based and hybrid. The supervised method outperforms the lexicon-based method, and the hybrid method combines both. The performance of the supervised method depends heavily on the quality and size of the training data. On the other hand, several lexical items appear positive in the text of one domain while appearing negative in another; lexicon-based analysis therefore does not yet achieve high accuracy, and optimizing it remains an interesting research topic in Sentiment Analysis. This paper provides a comprehensive overview of the latest developments in lexicon-based sentiment analysis along with their limitations and also shows our own methods’ comparison of results for bi...
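The domain-dependence problem described above can be made concrete with a small sketch. It assumes a general-purpose lexicon plus per-domain overrides. The entries, the domain names and the function `classify` are hypothetical and serve only to show why a single fixed lexicon misclassifies some words.

```python
import re

# Assumed general-purpose polarities: +1 positive, -1 negative.
GENERAL_LEXICON = {"excellent": 1, "terrible": -1, "unpredictable": -1}
# Assumed domain-specific corrections that take precedence over the general lexicon.
DOMAIN_OVERRIDES = {"movies": {"unpredictable": 1}}


def classify(text, domain=None):
    """Sum word polarities, letting the domain lexicon override general entries."""
    lexicon = {**GENERAL_LEXICON, **DOMAIN_OVERRIDES.get(domain, {})}
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(lexicon.get(tok, 0) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(classify("The steering is unpredictable", domain="cars"))
print(classify("The plot is unpredictable", domain="movies"))
```

The same word receives opposite labels in the two domains, which is the behaviour a domain-independent lexicon cannot reproduce.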