Doina Tatar | Babes-Bolyai University

Papers by Doina Tatar

Text entailment verification with text similarities

Proceedings of KEPT2007, Knowledge …, 2007

... This identification is completed with a set of heuristics for recognizing false entailment. ... Monotonicity supposes that if a text entails another text, then the entailment relation still holds after more text is added to the first [7]. The heuristics are represented by the condition below ...

Text Summarization by Formal Concept Analysis Approach

Association for Computational Linguistics UBB system at Senseval3

It is known that whenever a system's actions depend on the meaning of the text being processed, disambiguation is beneficial or even necessary. The Senseval contest is an international framework in which research in this important field is validated in a hierarchical manner. In this paper we present our system, which participated for the first time in the Senseval-3 contest on WSD, held in March-April 2004. We also present our plans for improving the system, which arose from studying the results.

An Improved Algorithm on Word Sense Disambiguation

International Joint Conferences on Security and Intelligent Information Systems, 2003

The task of disambiguation is to determine which of the senses of an ambiguous word is invoked in a particular use of the word [4]. Starting from the algorithm of Yarowsky [6,5,9,10] and the Naive Bayes Classifier (NBC) algorithm, in this paper we propose an original two-step algorithm which combines their elements. This algorithm combines the advantages of Yarowsky's principles (one sense per discourse and one sense per collocation) with the known high performance of NBC algorithms. We design an Intelligent Agent which learns (based on the algorithm mentioned above) to find the correct sense of an ambiguous word in given contexts.
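As an illustration of the Naive Bayes Classifier component only, a minimal sketch follows. It does not reproduce the two-step algorithm of the paper and does not implement Yarowsky's principles. The function names `train_nbc` and `classify`, the sense labels, the toy training contexts, and the add-one smoothing are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: context words observed around the ambiguous word "bank".
TRAIN = [
    (["river", "water", "shore"], "bank.river"),
    (["money", "loan", "deposit"], "bank.finance"),
    (["water", "fishing"], "bank.river"),
    (["loan", "interest", "money"], "bank.finance"),
]


def train_nbc(examples):
    """Count senses and per-sense context words from (words, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def classify(words, model):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Add-one smoothing (an assumption) so unseen words do not zero the product.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in words:
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


model = train_nbc(TRAIN)
river_guess = classify(["water", "boat"], model)
finance_guess = classify(["money"], model)
```

With this toy data, a context containing "water" is assigned the river sense and a context containing "money" the financial sense.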

Training Probabilistic Context-Free Grammars As Hidden Markov Models

The use of methods from mathematical statistics is currently considered a leading topic in natural language processing (NLP). Statistical methods were first applied in the "speech-recognition" area. While the Hidden Markov Model (HMM) is unanimously accepted as a mathematical tool in this area, its advantages have been less exploited in natural language understanding. In this paper we propose a method for associating an HMM with a context-free grammar (CFG). In this way, learning a CFG with a correct parse tree is realized by learning an HMM.

Quantitative Analysis of Poetic Texts

Quantitative Analysis of Poetic Texts, 2015

Text Entailment as directional relation

UBB system at Senseval-3

International Workshop on Semantic Evaluation, 2004

It is known that whenever a system's actions depend on the meaning of the text being processed, disambiguation is beneficial or even necessary. The Senseval contest is an international framework in which research in this important field is validated in a hierarchical manner. In this paper we present our system, which participated for the first time in the Senseval-3 contest on WSD, held in March-April 2004. We also present our plans for improving the system, which arose from studying the results.

Hreb-like analysis of Eminescu's poems

Textual Entailment Recognizing by Theorem Proving Approach

In this paper we present two original methods for recognizing textual inference. The first is a modified resolution method in which some linguistic considerations are introduced in the unification of two atoms. The approach is possible due to recent methods of transforming texts into logic formulas. The second is based on semantic relations in text, as presented in WordNet. Some similarities between these two methods are noted. Key words: unification, resolution, textual inference, WordNet.

A New Algorithm For Word Sense Disambiguation

The task of disambiguation is to determine which of the senses of an ambiguous word is invoked in a particular use of the word [3]. Starting from the algorithm of Yarowsky [5, 4] and the Naive Bayes Classifier (NBC) algorithm, in this paper we propose an original algorithm which combines their elements. This algorithm combines the advantages of Yarowsky's principles (one sense per discourse and one sense per collocation) with the known high performance of the NBC algorithm. Moreover, an agent implementing this algorithm is constructed.

Phrase Generation In Lexical Functional Grammars And Unification Grammars

In this paper we compare the process of deriving a phrase structure in a lexical functional grammar with the process of obtaining the feature structure for the symbol S of a unification grammar. If the c-structure (D, C, e) generates the feature structure F, then F is the feature structure obtained as MGSat( ), where is a conjunction of a set of descriptions from Desc.

Some Remarks about Feature Selection in Word Sense Discrimination for Romanian Language

The problem of feature selection in Word Sense Discrimination (a subtask of Word Sense Disambiguation) is crucial for the accuracy of results. The paper proposes the length of words as a new feature (1). Some combinations of this feature with other commonly used features are studied and presented.

Textual Entailment as a Directional Relation

J. Res. Pract. Inf. Technol., 2009

This paper presents three methods for solving the problem of textual entailment, obtained from an equal number of text-to-text similarity metrics. The first method starts with the directional measure of text-to-text similarity presented in Corley and Mihalcea (2005), and integrates word sense disambiguation and several heuristics. The second method exploits the relations between the cosine directional measures of similarity as a means to identify textual entailment. Finally, the third method relies on the directional variant of Levenshtein distance between two words. Each “word” in this method is a string consisting of all the words concatenated. In all these methods the decision about an entailment relation depends on the relation established between these measures of similarity. The methods are applied and evaluated on the whole set of text-hypothesis pairs included in the PASCAL RTE-1 development dataset (RTE-1, 2005). The corresponding accuracy and statistics are presented for eac...
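To make the idea of treating a whole text as one concatenated "word" concrete, here is a minimal sketch using the classic (symmetric) Levenshtein distance. The directional variant used in the paper is not defined in this abstract, so it is not reproduced here. The helper `as_single_word`, its lowercasing and whitespace splitting, and the example sentences are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def as_single_word(text: str) -> str:
    """Concatenate all words of a text into one string (hypothetical normalization)."""
    return "".join(text.lower().split())


# Hypothetical text-hypothesis pair.
text = "A cat sat on the mat"
hypothesis = "A cat sat"
distance = levenshtein(as_single_word(text), as_single_word(hypothesis))
```

Here the hypothesis string is a prefix of the text string, so the distance equals the number of extra characters in the text.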

Textual Entailment Recognizing by Theorem Proving Approach

ArXiv, 2008

In this paper we present two original methods for recognizing textual inference. The first is a modified resolution method in which some linguistic considerations are introduced in the unification of two atoms. The approach is possible due to recent methods of transforming texts into logic formulas. The second is based on semantic relations in text, as presented in WordNet. Some similarities between these two methods are noted.

A chain dictionary method for Word Sense Disambiguation and applications

ArXiv, 2008

A large class of unsupervised algorithms for Word Sense Disambiguation (WSD) is that of dictionary-based methods. Various algorithms have Lesk's algorithm as their root, which exploits the sense definitions in the dictionary directly. Our approach uses the lexical base WordNet for a new algorithm derived from Lesk's, namely the "chain algorithm for disambiguation of all words", CHAD. We show how translation from one language into another and also text entailment verification could be accomplished by this disambiguation.
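For readers unfamiliar with Lesk-style methods, the sketch below shows a simplified Lesk procedure: pick the sense whose dictionary definition shares the most words with the context. It is not the CHAD algorithm and does not use WordNet. The toy glosses, the sense labels, the stopword list, and the function name `simplified_lesk` are assumptions made for this example.

```python
# Hypothetical toy dictionary: sense label -> definition (gloss).
TOY_GLOSSES = {
    "bank": {
        "bank.finance": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a body of water such as a river",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = frozenset({"a", "an", "the", "of", "and", "on", "that", "as", "such"})


def simplified_lesk(word, context, glosses=TOY_GLOSSES, stopwords=STOPWORDS):
    """Return the sense of `word` whose gloss overlaps most with the context words."""
    context_words = {
        w for w in context.lower().split() if w not in stopwords and w != word
    }
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses[word].items():
        gloss_words = set(gloss.lower().split()) - stopwords
        overlap = len(context_words & gloss_words)
        if overlap > best_overlap:  # ties keep the first sense listed
            best_sense, best_overlap = sense, overlap
    return best_sense


river_sense = simplified_lesk("bank", "she sat on the bank of the river and watched the water")
finance_sense = simplified_lesk("bank", "the bank makes loans and accepts deposits")
```

The overlap is computed on exact word forms, so a real system would normally add lemmatization and a fuller dictionary.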

Intensional Logic Translation For Quantitative Natural Language Sentences

The performance of some natural language processing tasks improves if semantic processing is involved. Moreover, some tasks (database query) cannot be carried out at all without semantic processing. The first semantic description was developed by Montague, and all later approaches to semantics in the frame of discourse representation theory follow Montague in using a more powerful logic language. The present paper is a contribution to the treatment of quantitative natural language sentences.

Learning Taxonomy for Text Segmentation by Formal Concept Analysis

ArXiv, 2010

In this paper the problems of deriving a taxonomy from a text and of concept-oriented text segmentation are approached. The Formal Concept Analysis (FCA) method is applied to solve both of these linguistic problems. The proposed segmentation method offers a conceptual view of text segmentation, using a context-driven clustering of sentences. The Concept-oriented Clustering Segmentation algorithm (COCS) is based on k-means linear clustering of the sentences. Experimental results obtained using the COCS algorithm are presented.

UBB system at Senseval-3

It is known that whenever a system's actions depend on the meaning of the text being processed, disambiguation is beneficial or even necessary. The Senseval contest is an international framework in which research in this important field is validated in a hierarchical manner. In this paper we present our system, which participated for the first time in the Senseval-3 contest on WSD, held in March-April 2004. We also present our plans for improving the system, which arose from studying the results.

Bootstrapping in Word Sense Disambiguation

The task of disambiguation is to determine which of the senses of an ambiguous word is invoked in a particular use of the word: Allen (1995), Manning, Schutze (1999), Tătar (2005). In this paper we present one algorithm (Tătar, Şerban 2001) which combines Yarowsky's principles (Yarowsky 1999) (one sense per discourse and one sense per collocation) and the Naive Bayes Classifier method.
