Detection of salient events in a large corpus by combining NLP and data mining techniques
Related papers
EVENT EXTRACTION FROM NATURAL LANGUAGE TEXT
The extraction and representation of events play an important role in many natural language processing applications, such as question answering systems, named entity recognition, and text summarization. Events are defined as happenings or situations that occur in the real world. Several methods have been defined to annotate events manually. This paper aims to provide a framework that automatically extracts and represents the events that occur in natural language text. Experiments were conducted on the TimeBank corpus, which consists of news articles. Our method extracted most of the events, and compared with other event extraction methods its results were found to be encouraging.
Event extraction from textual data
The Journal of Computer Science and Its Application, 2019
Many text mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively extract and use attributes from unstructured data is still an open research issue. Event attribute extraction is a challenging research area with broad application in data mining and related fields, because decisions often depend on the hidden knowledge and patterns discovered in textual data. In crime detection, for example, events are extracted from an eyewitness report to concisely identify what happened during a crime. In this work, we present our approach to extracting these events based on the dependency parse tree relations of the text and its part-of-speech (POS) tags. The proposed method uses a machine learning algorithm to predict events from a text. Preliminary results of an experiment run with the WEKA tool show that more than 90% of events can be predicted based on the POS tags and dependency relations (DepR) of a sentence.
A Hybrid Approach for Event Extraction
Event extraction is a popular and interesting research field in the area of Natural Language Processing (NLP). In this paper, we propose a hybrid approach for event extraction within the TimeML framework. Initially, we develop a machine learning system based on Conditional Random Fields (CRF). However, most deverbal event nouns are not correctly identified by this machine learning approach. From this observation, we arrived at a hybrid approach in which several strategies are used in conjunction with machine learning. These strategies are based on semantic role labeling, WordNet and handcrafted rules. Evaluation results on the TempEval-2010 datasets yield precision, recall and F-measure values of approximately 93.00%, 96.00% and 94.47%, respectively. This is an approximately 12% higher F-measure than that of the best performing system of SemEval-2010.
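As a quick check of the figures reported in this abstract, the F-measure can be recomputed from precision and recall. The sketch below assumes the abstract reports the balanced (F1) variant, which it does not state explicitly; the function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 gives the harmonic mean of P and R."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Rounded precision and recall from the abstract (assumed to be F1 inputs).
score = f_measure(0.93, 0.96)
print(f"{score:.4f}")
```

With the rounded inputs this gives roughly 0.9448, close to the reported 94.47%; the small gap is consistent with precision and recall being stated only approximately.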
Event Time Relationship in Natural Language Text
International Journal of Recent Contributions from Engineering, Science & IT (iJES), 2019
Because of the many information needs it serves, the retrieval of events from natural language text is essential. From a natural language processing (NLP) perspective, "events" are situations, occurrences, real-world entities or facts. Extracting events and arranging them on a timeline is helpful in various NLP applications, such as summarizing news articles, processing health records, and Question Answering (QA) systems. This paper presents a framework for identifying the events and times in a given document and representing them using a graph data structure. As a result, a graph is derived to show event-time relationships in the given text. Events form the nodes of the graph, and edges represent the temporal relations among the nodes. The time of an event occurrence exists in two forms, namely qualitative (like before, after, during, etc.) and quantitative (exact time points/periods). To build the event-time-event structure quantitative time is normalized to qualitative...
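The graph representation described in this abstract can be illustrated with a small sketch. It is not the authors' framework: the class `EventGraph`, its method names, the relation labels and the sample events are all hypothetical, and the normalization step simply orders dated events and links neighbours with "before" or "simultaneous".

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class EventGraph:
    """Events are nodes; directed edges carry a qualitative temporal relation."""
    times: Dict[str, Optional[date]] = field(default_factory=dict)
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_event(self, name: str, when: Optional[date] = None) -> None:
        # `when` is the quantitative time, if the text gives one.
        self.times[name] = when

    def add_relation(self, src: str, relation: str, dst: str) -> None:
        # Qualitative relation stated in the text, e.g. "before" or "after".
        self.edges.append((src, relation, dst))

    def normalize_quantitative(self) -> None:
        # Convert dated events into qualitative edges between neighbours
        # in chronological order.
        dated = sorted(
            ((n, t) for n, t in self.times.items() if t is not None),
            key=lambda pair: pair[1],
        )
        for (a, ta), (b, tb) in zip(dated, dated[1:]):
            relation = "simultaneous" if ta == tb else "before"
            self.edges.append((a, relation, b))


# Hypothetical sample events, for illustration only.
g = EventGraph()
g.add_event("earthquake", date(2019, 3, 1))
g.add_event("rescue", date(2019, 3, 2))
g.add_event("press conference")  # no explicit date in the text
g.add_relation("press conference", "after", "rescue")
g.normalize_quantitative()
for edge in g.edges:
    print(edge)
```

Storing edges as plain (source, relation, target) triples keeps the sketch short; a real system would likely need consistency checks across relations.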
Extracting Events and Temporal Expressions from Text
2010 IEEE Fourth International Conference on Semantic Computing, 2010
Extracting temporal information from raw text is fundamental for deep language understanding, and key to many applications like question answering, information extraction, and document summarization. Our long-term goal is to build the complete temporal structure of documents and apply it in other applications such as textual entailment, question answering, and dialog systems. In this paper, we present a first step: a system for extracting events, event features, and temporal expressions with their normalized values from raw text.
Identifying temporal relations between main events in new articles
2013 ACS International Conference on Computer Systems and Applications (AICCSA), 2013
With the expansion of Web 2.0, a huge amount of data, notably news articles, is produced every day. This content needs to be exploited in order to extract relevant information and to build knowledge databases. In this context, processing the temporal dimension of language and extracting temporal information from electronic news articles is becoming a prominent task. We propose an approach for identifying inter-sentential temporal relations between main events in news articles. Our approach is based on a complete linguistic analysis of texts and supervised learning models.
EEQuest: An Event Extraction and Query System
We present EEQuest, an application that extracts events from text using natural language processing (NLP) and supervised machine-learning techniques, and provides a system to query events extracted from a text corpus. We provide a use case for the application in which we extract business-related events from news articles. The extracted events are then categorized by the business organization/company that they relate to. Finally, the events are added to a knowledge base, on which a query system is built. The system can be used to display events related to a particular organization or a group of organizations. Although we use the system to extract business-related events, the event extraction mechanism can be applied more generally to any available textual data, to extract any kind of event whose structure answers the question: Who did what, when and where?
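The "who did what, when and where" structure and the per-organization query described in this abstract can be sketched as follows. This is not EEQuest's implementation: the names `Event` and `EventKnowledgeBase`, the case-insensitive lookup, and the sample companies are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    who: str    # organization the event is attributed to
    what: str   # the action or happening
    when: str   # kept as free text; no date normalization in this sketch
    where: str


class EventKnowledgeBase:
    """Stores events grouped by organization and answers simple queries."""

    def __init__(self) -> None:
        self._by_org: Dict[str, List[Event]] = defaultdict(list)

    def add(self, event: Event) -> None:
        # Index by lowercased organization name for case-insensitive lookup.
        self._by_org[event.who.lower()].append(event)

    def query(self, organizations: Iterable[str]) -> List[Event]:
        # Return events for one organization or a group of organizations.
        results: List[Event] = []
        for org in organizations:
            results.extend(self._by_org.get(org.lower(), []))
        return results


# Hypothetical sample data, not taken from the paper.
kb = EventKnowledgeBase()
kb.add(Event("Acme Corp", "acquired a supplier", "Monday", "Berlin"))
kb.add(Event("Globex", "opened a new office", "last week", "Madrid"))
kb.add(Event("Acme Corp", "announced quarterly results", "Friday", "London"))

for event in kb.query(["acme corp"]):
    print(event.what, "|", event.when, "|", event.where)
```

Querying with a list lets the same call serve both cases mentioned in the abstract: a single organization or a group of them.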
2008
Dating content is relevant to many advanced Natural Language Processing (NLP) applications, such as Information Retrieval or Question Answering. These could be improved by techniques that take the temporal dimension into account. To achieve this, temporal expressions in the data sources must first be detected accurately and then represented in an appropriate standard format that captures the time value of each expression once resolved and allows unambiguous reasoning, in order to increase the range of search and the quality of the results returned. These tasks are essential for NLP applications if efficient temporal reasoning is expected afterwards. This work presents a typology of time expressions based on an empirical inductive approach, both from a structural perspective and from the point of view of their resolution. Furthermore, a method for the automatic recognition and resolution of temporal expressions in Spanish content is provided, obtaining promising results when tested on an evaluation corpus.
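To make "recognition and resolution" concrete, here is a deliberately small sketch that finds two kinds of Spanish temporal expressions and resolves them to ISO 8601 dates. It does not reproduce the typology or method of this work: the patterns, the tiny deictic vocabulary, the function name `resolve` and the sample sentence are all assumptions, and ambiguous words (for example "mañana", which can also mean "morning") are not disambiguated.

```python
import re
from datetime import date, timedelta
from typing import List, Tuple

MONTHS = {
    "enero": 1, "febrero": 2, "marzo": 3, "abril": 4, "mayo": 5, "junio": 6,
    "julio": 7, "agosto": 8, "septiembre": 9, "octubre": 10,
    "noviembre": 11, "diciembre": 12,
}
# Deictic expressions resolved relative to a reference date (offset in days).
DEICTIC = {"anteayer": -2, "ayer": -1, "hoy": 0, "mañana": 1}

EXPLICIT = re.compile(
    r"\b(\d{1,2}) de (" + "|".join(MONTHS) + r") de (\d{4})\b", re.IGNORECASE
)
RELATIVE = re.compile(r"\b(" + "|".join(DEICTIC) + r")\b", re.IGNORECASE)


def resolve(text: str, reference: date) -> List[Tuple[str, str]]:
    """Return (surface form, ISO date) pairs in order of appearance."""
    found = []
    for m in EXPLICIT.finditer(text):
        day, month, year = int(m.group(1)), MONTHS[m.group(2).lower()], int(m.group(3))
        try:
            found.append((m.start(), m.group(0), date(year, month, day).isoformat()))
        except ValueError:
            continue  # recognized but not a valid calendar date
    for m in RELATIVE.finditer(text):
        offset = DEICTIC[m.group(1).lower()]
        resolved = reference + timedelta(days=offset)
        found.append((m.start(), m.group(0), resolved.isoformat()))
    found.sort(key=lambda item: item[0])
    return [(surface, iso) for _, surface, iso in found]


# Hypothetical sentence; the reference date stands in for the document date.
sample = "Ayer llovió; la reunión fue el 12 de marzo de 2008."
print(resolve(sample, date(2008, 3, 20)))
```

Separating recognition (the regular expressions) from resolution (the date arithmetic) mirrors the two stages named in the abstract, even though a realistic system would cover far more expression types.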
Finding salient dates for building thematic timelines
We present an approach for detecting salient (important) dates in texts in order to automatically build event timelines from a search query (e.g. the name of an event or person, etc.). This work was carried out on a corpus of newswire texts in English provided by the Agence France Presse (AFP). In order to extract salient dates that warrant inclusion in an event timeline, we first recognize and normalize temporal expressions in texts and then use a machine-learning approach to extract salient dates that relate to a particular topic. We focused only on extracting the dates and not the events to which they are related.