Daisuke Bekki | Ochanomizu University
Papers by Daisuke Bekki
Lecture Notes in Computer Science, 2009
The medical field produces many clinical texts, such as electronic medical records, and research on Japanese natural language processing using these texts has been conducted. One such line of research involves Recognizing Textual Entailment (RTE) in clinical texts using ccg2lambda, a semantic analysis and logical inference system. However, it is difficult for existing inference systems to correctly determine entailment relations if the input sentence contains medical-domain-specific paraphrases such as disease names. In this study, we propose a method to supplement the equivalence relations of disease names as axioms by identifying paraphrase candidates that are missing during theorem proving. Paraphrase candidates are identified using a model for the NER task for disease names and a disease name dictionary. We also construct an inference test set that requires knowledge injection of disease names and evaluate our inference system. Experiments showed that our inference system was able to infer correctly on 106 out of 149 inference test sets.
IEICE Technical Report; IEICE Tech. Rep., Jul 1, 2021
arXiv (Cornell University), Nov 15, 2018
In logic-based approaches to reasoning tasks such as Recognizing Textual Entailment (RTE), it is important for a system to have a large amount of knowledge data. However, there is a tradeoff between adding more knowledge data for improved RTE performance and maintaining an efficient RTE system, as such a large database is problematic in terms of memory usage and computational complexity. In this work, we show that the processing time of a state-of-the-art logic-based RTE system can be significantly reduced by replacing its search-based axiom injection (abduction) mechanism with one based on Knowledge Base Completion (KBC). We integrate this mechanism in a Coq plugin that provides a proof automation tactic for natural language inference. Additionally, we show empirically that adding new knowledge data contributes to better RTE performance while not harming the processing speed in this framework.
The present paper proposes a method for translating the outputs of a robust HPSG parser into semantic representations of Typed Dynamic Logic (TDL), a dynamic plural semantics defined in typed lambda calculus. With its higher-order representations of contexts, TDL analyzes and describes the inherently inter-sentential nature of quantification and anaphora in a strictly lexicalized and compositional manner. The present study shows that the proposed translation method successfully combines robustness with the descriptive adequacy of contemporary semantics. The present implementation achieves high coverage, approximately 90%, for the real text of the Penn Treebank corpus.
Lecture Notes in Computer Science, 2023
Springer eBooks, Aug 22, 2012
Lecture Notes in Computer Science, 2011
This paper aims to provide a logical background for Combinatory Categorial Grammar (CCG) and its typological discussions. Based on the Curry-Howard correspondence between Gentzen-style proof systems and Lambek Lambda Calculi, and that between Hilbert-style proof systems and substructural BCWK-logic, I define a new class of logic, called Subdirectional Combinatory Logic, which provides subclasses for each of the substructural combinatory logics, and propose that CCG is a subdirectional logic of a relevance logic (Combinatory Hypothesis). This hypothesis allows us to discuss typological parameters in universal grammar in terms of the presence or absence of certain directional combinators.
Comparative constructions pose a challenge in Natural Language Inference (NLI), which is the task of determining whether a text entails a hypothesis. Comparatives are structurally complex in that they interact with other linguistic phenomena such as quantifiers, numerals, and lexical antonyms. In formal semantics, there is a rich body of work on comparatives and gradable expressions using the notion of degree. However, a logical inference system for comparatives has not been sufficiently developed for use in the NLI task. In this paper, we present a compositional semantics that maps various comparative constructions in English to semantic representations via Combinatory Categorial Grammar (CCG) parsers and combine it with an inference system based on automated theorem proving. We evaluate our system on three NLI datasets that contain complex logical inferences with comparatives, generalized quantifiers, and numerals. We show that the system outperforms previous logic-based systems as well as recent deep learning-based models.
Journal of Language Modelling, Jan 5, 2023
This paper presents a computational framework for Natural Language Inference (NLI) using logic-based semantic representations and theorem-proving. We focus on logical inferences with comparatives and other related constructions in English, which are known for their structural complexity and difficulty in performing efficient reasoning. Using the so-called A-not-A analysis of comparatives, we implement a fully automated system to map various comparative constructions to semantic representations in typed first-order logic via Combinatory Categorial Grammar parsers and to prove entailment relations via a theorem prover. We evaluate the system on a variety of NLI benchmarks that contain challenging inferences, in comparison with other recent logic-based systems and neural NLI models.
arXiv (Cornell University), Jun 5, 2019
We propose a new domain adaptation method for Combinatory Categorial Grammar (CCG) parsing, based on the idea of automatically generating CCG corpora by exploiting cheaper resources of dependency trees. Our solution is conceptually simple and does not rely on a specific parser architecture, making it applicable to the current best-performing parsers. We conduct extensive parsing experiments with detailed discussion; on top of existing benchmark datasets on (1) biomedical texts and (2) question sentences, we create experimental datasets of (3) speech conversation and (4) math problems. When the proposed method is applied, an off-the-shelf CCG parser shows significant performance gains, improving from 90.7% to 96.6% on speech conversation, and from 88.5% to 96.8% on math problems.
arXiv (Cornell University), Jul 27, 2017
Determining semantic textual similarity is a core research subject in natural language processing. Since vector-based models for sentence representation often use shallow information, capturing accurate semantics is difficult. By contrast, logical semantic representations capture deeper levels of sentence semantics, but their symbolic nature does not offer graded notions of textual similarity. We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs. For the natural deduction proofs, we use ccg2lambda, a higher-order automatic inference system, which converts Combinatory Categorial Grammar (CCG) derivation trees into semantic representations and conducts natural deduction proofs. Experiments show that our system was able to outperform other logic-based systems and that features derived from the proofs are effective for learning textual similarity.
Lecture Notes in Computer Science, 2017
arXiv (Cornell University), Oct 2, 2019
Comparative constructions play an important role in natural language inference. However, attempts to study semantic representations and logical inferences for comparatives from the computational perspective are not well developed, due to the complexity of their syntactic structures and inference patterns. In this study, using a framework based on Combinatory Categorial Grammar (CCG), we present a compositional semantics that maps various comparative constructions in English to semantic representations, and we introduce an inference system that effectively handles logical inference with comparatives, including those involving numeral adjectives, antonyms, and quantification. We evaluate the performance of our system on the FraCaS test suite and show that the system can handle a variety of complex logical inferences with comparatives.
Shizen gengo shori, 2023
Natural language inference is a core natural language understanding task for determining whether a hypothesis is true (entailment), false (contradiction), or neither (neutral) when a set of premises is true. Logical entailment and implicature can differ when an inference contains numeral expressions. Embedding numeral expressions in contexts such as negation and conditionals can reverse the entailment relation between a premise and a hypothesis relative to that in general contexts. Furthermore, numeral expressions in Japanese are characterized by the flexibility of quantifier positions, the variety of numeral suffixes, and their usages. However,