EAT2seq: A generic framework for controlled sentence transformation without task-specific training
Related papers
EAT: a simple and versatile semantic representation format for multi-purpose NLP
arXiv (Cornell University), 2019
Semantic representations are central in many NLP tasks that require human-interpretable data. The conjunctivist theoretical framework, developed primarily by Pietroski (2005, 2018), obtains expressive representations with only a few basic semantic types and relations systematically linked to syntactic positions. While representational simplicity is crucial for computational applications, such findings have not yet had a major influence on NLP. We present the first generic semantic representation format for NLP directly based on these insights. We name the format EAT due to its basis in the Event, Agent, and Theme arguments of Neo-Davidsonian logical forms. It builds on the idea that similar tripartite argument relations are ubiquitous across categories and can be constructed from grammatical structure without additional lexical information. We present a detailed exposition of EAT and how it relates to other prevalent formats used in prior work, such as Abstract Meaning Representation (AMR) and Minimal Recursion Semantics (MRS). EAT stands out in two respects: simplicity and versatility. Uniquely, EAT discards semantic metapredicates and instead represents semantic roles entirely via positional encoding. This is made possible by limiting the number of roles to only three, a major decrease from the many dozens recognized in e.g. AMR and MRS. EAT's simplicity makes it exceptionally versatile in application. First, we show that drastically reducing semantic roles based on EAT benefits text generation from MRS in the test settings of Hajdik et al. (2019). Second, we implement the derivation of EAT from a syntactic parse, and apply this to parallel corpus generation between grammatical classes. Third, we train an encoder-decoder LSTM network to map EAT to English. Finally, we use both the encoder-decoder network and a rule-based alternative to conduct grammatical transformation from EAT input. Our experiments illustrate EAT's ability to retain semantic information despite its simplicity.
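To make the positional-encoding idea concrete, here is a minimal sketch (not the authors' implementation) of an EAT-style representation in Python: a clause is a nested (event, agent, theme) triple, so roles are read off position rather than marked by role metapredicates. The example sentences and the helper function are purely illustrative.

```python
from typing import NamedTuple, Union

class EAT(NamedTuple):
    """A clause as an (Event, Agent, Theme) triple; roles are purely positional."""
    event: str
    agent: Union[str, "EAT"]
    theme: Union[str, "EAT"]

# "The dog chased the cat."
simple = EAT(event="chase", agent="dog", theme="cat")

# Embedding is just nesting a triple inside an argument slot:
# "Mary said that the dog chased the cat."
nested = EAT(event="say", agent="Mary", theme=simple)

def roles(e: EAT) -> dict:
    """Semantic roles are recovered from position alone; no role labels are stored."""
    return {"Event": e.event, "Agent": e.agent, "Theme": e.theme}

print(roles(nested))
```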
Universal Sentence Encoder
2018
We present models for encoding sentences into embedding vectors that specifically target transfer learning to other NLP tasks. The models are efficient and result in accurate performance on diverse transfer tasks. Two variants of the encoding models allow for trade-offs between accuracy and compute resources. For both variants, we investigate and report the relationship between model complexity, resource consumption, the availability of transfer task training data, and task performance. Comparisons are made with baselines that use word-level transfer learning via pretrained word embeddings as well as baselines that do not use any transfer learning. We find that transfer learning using sentence embeddings tends to outperform word-level transfer. With transfer learning via sentence embeddings, we observe surprisingly good performance with minimal amounts of supervised training data for a transfer task. We obtain encouraging results on Word Embedding Association Tests (WEAT) targeted at det...
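As a rough illustration of sentence-level transfer with minimal labelled data, the sketch below fits a linear classifier on frozen embeddings from a pretrained sentence encoder. It assumes tensorflow_hub and scikit-learn are installed; the TF Hub handle is the released Universal Sentence Encoder module, though the version suffix may differ in a given setup.

```python
# Sketch: transfer learning on top of frozen sentence embeddings with very
# little labelled data (the toy texts and labels below are illustrative).
import tensorflow_hub as hub
from sklearn.linear_model import LogisticRegression

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

train_texts = ["great movie, loved it", "utterly boring",
               "what a delight", "a waste of time"]
train_labels = [1, 0, 1, 0]                    # 1 = positive, 0 = negative

X_train = embed(train_texts).numpy()           # frozen 512-dim sentence embeddings
clf = LogisticRegression().fit(X_train, train_labels)

X_test = embed(["I really enjoyed this"]).numpy()
print(clf.predict(X_test))                     # expected: [1]
```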
TreeNet: Learning Sentence Representations with Unconstrained Tree Structure
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018
The recursive neural network (RvNN) has proven to be an effective and promising tool for learning sentence representations by explicitly exploiting sentence structure. However, most existing work can only exploit simple tree structures, e.g., binary trees, or ignores the order of nodes, which yields suboptimal performance. In this paper, we propose a novel neural network, namely TreeNet, to capture sentences structurally over raw, unconstrained constituency trees, where the number of child nodes can be arbitrary. In TreeNet, each node learns from its left sibling and right child in a bottom-up, left-to-right order, thus enabling the network to learn over any tree. Furthermore, multiple soft gates and a memory cell are employed in TreeNet to determine to what extent it should learn, remember, and output, which proves to be a simple and efficient mechanism for semantic synthesis. Moreover, TreeNet significantly outperforms convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) with fewer parameters, improving classification accuracy by 2%-5% with 42% of the best CNN's parameters or 94% of a standard LSTM's. Extensive experiments demonstrate that TreeNet achieves state-of-the-art performance on all four typical text classification tasks.
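The following is a toy sketch of the general recipe the abstract describes: a gated, bottom-up, left-to-right recursion over unconstrained n-ary trees in which each node combines its left sibling's state with its rightmost child's state. It is not the paper's exact cell; the gating equation, dimensions, and initialisation below are simplifications chosen for illustration.

```python
import numpy as np

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.h = None                     # hidden state filled in during encoding

class TreeEncoderSketch:
    """Toy gated recursive encoder over unconstrained n-ary trees (illustrative only)."""
    def __init__(self, vocab, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.dim = dim
        self.emb = {w: rng.normal(scale=0.1, size=dim) for w in vocab}
        self.W_g = rng.normal(scale=0.1, size=(dim, 3 * dim))   # soft gate
        self.W_c = rng.normal(scale=0.1, size=(dim, 3 * dim))   # candidate state

    def _cell(self, x, h_sib, h_child):
        z = np.concatenate([x, h_sib, h_child])
        gate = 1.0 / (1.0 + np.exp(-self.W_g @ z))     # soft gate in (0, 1)
        cand = np.tanh(self.W_c @ z)                    # candidate state
        return gate * cand + (1.0 - gate) * h_child     # blend new and child info

    def encode(self, node, left_sibling_h=None):
        h_sib = np.zeros(self.dim) if left_sibling_h is None else left_sibling_h
        # bottom-up: encode children left-to-right, threading sibling states
        prev = None
        for child in node.children:
            prev = self.encode(child, prev)
        h_child = prev if prev is not None else np.zeros(self.dim)
        x = self.emb.get(node.label, np.zeros(self.dim))
        node.h = self._cell(x, h_sib, h_child)
        return node.h

# usage: encode the toy tree (S (NP the dog) (VP barks))
tree = Node("S", [Node("NP", [Node("the"), Node("dog")]), Node("VP", [Node("barks")])])
enc = TreeEncoderSketch(vocab=["the", "dog", "barks", "S", "NP", "VP"])
print(enc.encode(tree).shape)   # (16,) sentence representation at the root
```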
Universal Sentence Encoder for English
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
We present easy-to-use TensorFlow Hub sentence embedding models with good task transfer performance. Model variants allow for trade-offs between accuracy and compute resources. We report the relationship between model complexity, resources, and transfer performance. Comparisons are made with baselines that do not use transfer learning and with baselines that incorporate word-level transfer. Transfer learning using sentence-level embeddings is shown to outperform models without transfer learning and often those that use only word-level transfer. We show good transfer task performance with minimal training data and obtain encouraging results on word embedding association tests (WEAT) of model bias.
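For a sense of what a WEAT-style bias check over such embeddings looks like, here is a small sketch computing the Caliskan et al. (2017) effect size from cosine associations. The word lists are illustrative placeholders, and the TF Hub handle is the released Universal Sentence Encoder module (version may differ).

```python
# Sketch: a WEAT-style association test over embeddings from a TF Hub encoder.
import numpy as np
import tensorflow_hub as hub

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def assoc(w, A, B):
    """s(w, A, B): mean cosine to attribute set A minus mean cosine to set B."""
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    X, Y, A, B = (embed(words).numpy() for words in (X, Y, A, B))
    s_x = [assoc(x, A, B) for x in X]
    s_y = [assoc(y, A, B) for y in Y]
    return (np.mean(s_x) - np.mean(s_y)) / np.std(s_x + s_y)

# illustrative target and attribute word lists
flowers = ["rose", "tulip", "daisy"]
insects = ["spider", "wasp", "moth"]
pleasant = ["love", "peace", "wonderful"]
unpleasant = ["hatred", "ugly", "terrible"]
print(weat_effect_size(flowers, insects, pleasant, unpleasant))
```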
Representation biases in sentence transformers
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023
Variants of the BERT architecture specialised for producing full-sentence representations often achieve better performance on downstream tasks than sentence embeddings extracted from vanilla BERT. However, there is still little understanding of what properties of inputs determine the properties of such representations. In this study, we construct several sets of sentences with pre-defined lexical and syntactic structures and show that SOTA sentence transformers have a strong nominal-participant-set bias: cosine similarities between pairs of sentences are more strongly determined by the overlap in the set of their noun participants than by having the same predicates, lengthy nominal modifiers, or adjuncts. At the same time, the precise syntactic-thematic functions of the participants are largely irrelevant.
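A probe in this spirit can be run in a few lines with the sentence-transformers library, as sketched below; the sentence pairs are illustrative rather than the authors' stimuli, and all-MiniLM-L6-v2 is just an example checkpoint, not necessarily one of the models studied.

```python
# Sketch: does sharing noun participants drive cosine similarity more than
# sharing the predicate? Illustrative sentences, not the paper's stimuli.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

base = "The lawyer questioned the witness."
same_nouns_diff_pred = "The witness thanked the lawyer."
same_pred_diff_nouns = "The teacher questioned the student."

emb = model.encode([base, same_nouns_diff_pred, same_pred_diff_nouns],
                   convert_to_tensor=True)
print("shared nouns:     ", util.cos_sim(emb[0], emb[1]).item())
print("shared predicate: ", util.cos_sim(emb[0], emb[2]).item())
```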
Neural sentence generation from formal semantics
Proceedings of the 11th International Conference on Natural Language Generation, 2018
Sequence-to-sequence models have shown strong performance in a wide range of NLP tasks, yet their applications to sentence generation from logical representations are underdeveloped. In this paper, we present a sequence-to-sequence model for generating sentences from logical meaning representations based on event semantics. We use a semantic parsing system based on Combinatory Categorial Grammar (CCG) to obtain data annotated with logical formulas. We augment our sequence-to-sequence model with masking for predicates to constrain output sentences. We also propose a novel evaluation method for generation using Recognizing Textual Entailment (RTE). Combining parsing and generation, we test whether or not the output sentence entails the original text and vice versa. Experiments showed that our model outperformed a baseline with respect to both BLEU scores and accuracies in RTE.
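The bidirectional entailment check can be sketched with an off-the-shelf NLI model in place of the RTE system used in the paper; the checkpoint name and the 0.5 probability threshold below are illustrative assumptions.

```python
# Sketch of the bidirectional-entailment evaluation idea: accept a generated
# sentence only if it entails the original and the original entails it.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"
tok = AutoTokenizer.from_pretrained(name)
nli = AutoModelForSequenceClassification.from_pretrained(name)

def entailment_prob(premise: str, hypothesis: str) -> float:
    inputs = tok(premise, hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = nli(**inputs).logits
    # label order for this checkpoint: contradiction, neutral, entailment
    return torch.softmax(logits, dim=-1)[0, 2].item()

def bidirectional_match(original: str, generated: str, threshold: float = 0.5) -> bool:
    return (entailment_prob(original, generated) > threshold
            and entailment_prob(generated, original) > threshold)

print(bidirectional_match("The dog chased the cat in the garden.",
                          "In the garden, the cat was chased by the dog."))
```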
Simulating human language understanding on the computer is a great challenge. One way to approach it is to represent natural language meanings in logic and to use logical provers to determine what does and does not follow from a text. Which logic is best to use, and how natural language meanings are best represented in it, are far from trivial questions. This thesis focuses on semantic representation in deep parsing. It describes the Delilah parser and generator for Dutch, which computes semantic representations for sentences, discussing several issues and proposing some further improvements to the system. A style of logical form is developed that is optimized for inference in mainly two ways. One is the implementation of event semantics for verbs and nominalizations, with underlying states for intersective adjectives and their corresponding abstract nouns; this makes many entailments follow straightforwardly. The second is the introduction of Flat Logical Form, as an alternative to ...
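A small illustration of why flat, conjunctive event-semantic forms make many entailments straightforward: if a logical form is a set of atomic conjuncts, dropping modifiers preserves truth, so this class of entailments reduces to a superset check. The predicates below are hypothetical and much simpler than the thesis's actual logical forms.

```python
# "John buttered the toast slowly in the kitchen."
premise = {
    ("butter", "e1"), ("agent", "e1", "john"), ("theme", "e1", "toast"),
    ("slowly", "e1"), ("in", "e1", "kitchen"),
}

# "John buttered the toast."
hypothesis = {
    ("butter", "e1"), ("agent", "e1", "john"), ("theme", "e1", "toast"),
}

def entails(p: set, h: set) -> bool:
    """Conjunct dropping: p entails h if every conjunct of h appears in p."""
    return h <= p

print(entails(premise, hypothesis))   # True: the modifiers can be dropped
print(entails(hypothesis, premise))   # False: the richer claim does not follow
```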
Encoder-Decoder Shift-Reduce Syntactic Parsing
2017
Encoder-decoder neural networks have been used for many NLP tasks, such as neural machine translation. They have also been applied to constituent parsing by using bracketed tree structures as a target language, translating input sentences into syntactic trees. A more commonly used method to linearize syntactic trees is the shift-reduce system, which uses a sequence of transition-actions to build trees. We empirically investigate the effectiveness of applying the encoder-decoder network to transition-based parsing. On standard benchmarks, our system gives comparable results to the stack LSTM parser for dependency parsing, and significantly better results compared to the aforementioned parser for constituent parsing, which uses bracketed tree formats.
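For concreteness, the sketch below linearizes a constituency tree into a shift-reduce action sequence of the kind that can serve as a seq2seq target. It uses a simplified bottom-up scheme (SHIFT for words, REDUCE-k-X to build constituent X from the top k stack items), not the exact transition system of the paper.

```python
def to_actions(tree):
    """tree is either a word (str) or a (label, [children]) pair."""
    if isinstance(tree, str):
        return ["SHIFT"]
    label, children = tree
    actions = []
    for child in children:
        actions += to_actions(child)
    actions.append(f"REDUCE-{len(children)}-{label}")
    return actions

# (S (NP the dog) (VP barks))
tree = ("S", [("NP", ["the", "dog"]), ("VP", ["barks"])])
print(to_actions(tree))
# ['SHIFT', 'SHIFT', 'REDUCE-2-NP', 'SHIFT', 'REDUCE-1-VP', 'REDUCE-2-S']
```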
Grammar as a Foreign Language
2014
Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades. As a result, the most accurate parsers are domain specific, complex, and inefficient. In this paper we show that a domain-agnostic, attention-enhanced sequence-to-sequence model achieves state-of-the-art results on the most widely used syntactic constituency parsing dataset when trained on a large synthetic corpus that was annotated using existing parsers. It also matches the performance of standard parsers when trained only on a small human-annotated dataset, which shows that this model is highly data-efficient, in contrast to sequence-to-sequence models without the attention mechanism. Our parser is also fast, processing over a hundred sentences per second with an unoptimized CPU implementation.
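The alternative target-side encoding, a depth-first bracketed linearization of the tree, can be sketched as follows; details such as replacing words with POS tags and labelling closing brackets vary between systems, so this is illustrative rather than the exact format of the paper.

```python
def linearize(tree):
    """tree is (label, children) for constituents and (tag, word) for leaves."""
    label, rest = tree
    if isinstance(rest, str):                 # preterminal: keep only the POS tag
        return label
    inner = " ".join(linearize(child) for child in rest)
    return f"({label} {inner} ){label}"

# "John sleeps" with a toy parse
tree = ("S", [("NP", [("NNP", "John")]), ("VP", [("VBZ", "sleeps")])])
print(linearize(tree))   # (S (NP NNP )NP (VP VBZ )VP )S
```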
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
arXiv (Cornell University), 2022
How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce SUPER-NATURALINSTRUCTIONS, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and text composition. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions: training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones. Furthermore, we build Tk-INSTRUCT, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). Our experiments show that Tk-INSTRUCT outperforms existing instruction-following models such as InstructGPT by over 9% on our benchmark despite being an order of magnitude smaller. We further analyze generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances per task, and model sizes. We hope our dataset and model facilitate future progress towards more general-purpose NLP models. SUPER-NATURALINSTRUCTIONS is a supersized expansion of NATURALINSTRUCTIONS (Mishra et al., 2022b), which had 61 tasks. The dataset, models, and a leaderboard can be found at https://instructions.apps.allenai.org.
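A minimal sketch of querying an instruction-tuned seq2seq checkpoint with a plain-language task definition plus an instance is given below. The Hugging Face model id (allenai/tk-instruct-3b-def) and the prompt template are assumptions based on the publicly released Tk-Instruct checkpoints; the project page documents the exact format.

```python
# Sketch: prompting an instruction-tuned seq2seq model with a task definition
# and one instance. Model id and prompt template are assumptions; verify them
# against the released checkpoints before relying on this.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

name = "allenai/tk-instruct-3b-def"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

prompt = (
    "Definition: In this task, you are given a sentence. "
    "Rewrite it in the past tense.\n"
    "Now complete the following example -\n"
    "Input: The dog chases the cat.\n"
    "Output:"
)
ids = tok(prompt, return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```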