Software Requirements: A new Domain for Semantic Parsers
Related papers
Software requirements as an application domain for natural language processing
Language Resources and Evaluation, 2017
Mapping functional requirements first to specifications and then to code is one of the most challenging tasks in software development. Since requirements are commonly written in natural language, they can be prone to ambiguity, incompleteness and inconsistency. Structured semantic representations allow requirements to be translated to formal models, which can be used to detect problems at an early stage of the development process through validation. Storing and querying such models can also facilitate software reuse. Several approaches constrain the input format of requirements to produce specifications; however, they usually require considerable human effort to adopt domain-specific heuristics and/or controlled languages. We propose a mechanism that automates the mapping of requirements to formal representations using semantic role labeling. We describe the first publicly available dataset for this task, employ a hierarchical framework that allows requirements concepts to be annotated, and discuss how semantic role labeling can be adapted for parsing software requirements.
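To make the idea concrete, here is a hypothetical sketch of the kind of output a semantic role labeler adapted to requirements might produce: a requirement sentence represented as a predicate-argument frame, flattened into triples that could populate a formal model. The role names and the example frame are illustrative assumptions, not taken from the paper's dataset.

```python
def frame_to_triples(frame):
    """Flatten a predicate-argument frame into (predicate, role, filler) triples."""
    predicate = frame["predicate"]
    return [(predicate, role, filler) for role, filler in frame["arguments"].items()]

# Hypothetical frame for: "The system shall send a notification to the administrator."
requirement_frame = {
    "predicate": "send",
    "arguments": {
        "Agent": "the system",
        "Theme": "a notification",
        "Recipient": "the administrator",
        "Modality": "shall",  # obligation marker common in requirements text
    },
}

triples = frame_to_triples(requirement_frame)
for triple in triples:
    print(triple)
```

Triples in this shape can be stored and queried directly, which is what makes the structured representation useful for validation and reuse.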
LASR: A tool for large scale annotation of software requirements
2012 Second IEEE International Workshop on Empirical Requirements Engineering (EmpiRE), 2012
Annotation of software requirements documents is performed by experts during the requirements analysis phase to extract crucial knowledge from informally written textual requirements. Different annotation tasks target the extraction of different types of information and require the availability of experts specialized in the field. Large-scale annotation tasks require multiple experts, and the limited number of available experts can make such tasks overwhelming and very costly without proper tool support. In this paper, we present our annotation tool, LASR, which can aid requirements analysis by attaining more accurate annotations. Our evaluation of the tool demonstrates that the annotation data collected by LASR from trained non-experts can help compute gold-standard annotations that strongly agree with the true gold standards set by the experts, thereby eliminating the need to conduct costly adjudication sessions for large-scale annotation work.
Lecture Notes in Computer Science, 2005
Software requirements specification is a critical activity of the software process, as errors at this stage inevitably lead to problems later in system design and implementation. Requirements are typically written in natural language, with the potential for ambiguity, contradiction or misunderstanding, or simply an inability of developers to deal with a large amount of information. This paper proposes a methodology for the natural language processing of textual requirements written in unrestricted natural language and their automatic mapping to an object-oriented analysis model.
Automatically Extracting Requirements Specifications from Natural Language
Natural language (supplemented with diagrams and some mathematical notations) is convenient for succinct communication of technical descriptions between the various stakeholders (e.g., customers, designers, implementers) involved in the design of software systems. However, natural language descriptions can be informal, incomplete, imprecise and ambiguous, and cannot be processed easily by design and analysis tools. Formal languages, on the other hand, formulate design requirements in a precise and unambiguous mathematical notation, but are more difficult to master and use. We propose a methodology for connecting semi-formal requirements with formal descriptions through an intermediate representation. We have implemented this methodology in a research prototype called ARSENAL with the goal of constructing a robust, scalable, and trainable framework for bridging the gap between natural language requirements and formal tools. The main novelty of ARSENAL lies in its automated generation...
Semantic Interpretation of Requirements through Cognitive Grammar and Configuration
Lecture Notes in Computer Science, 2014
Many attempts have been made to apply Natural Language Processing to requirements specifications. However, typical approaches rely on shallow parsing to identify object-oriented elements of the specifications (e.g. classes, attributes, and methods). As a result, the models produced are often incomplete, imprecise, and require manual revision and validation. In contrast, we propose a deep Natural Language Understanding approach to create complete and precise formal models of requirements specifications. We combine three main elements to achieve this: (1) acquisition of lexicon from a user-supplied glossary requiring little specialised prior knowledge; (2) flexible syntactic analysis based purely on word order; and (3) Knowledge-based Configuration, which unifies several semantic analysis tasks and allows ambiguities and errors to be handled. Moreover, we provide feedback to the user, allowing the refinement of specifications into a precise and unambiguous form. We demonstrate the benefits of our approach on an example from the PROMISE requirements corpus.
Resolving Ambiguities in Natural Language Software Requirements: A Comprehensive Survey
Requirements Engineering is one of the most vital activities in the entire Software Development Life Cycle. The success of software is largely dependent on how well the users' requirements have been understood and converted into appropriate functionalities. Typically, users convey their requirements in natural language statements that initially appear easy to state. However, being stated in natural language, requirements often suffer from misinterpretation and imprecise inference, and the requirements thus specified may lead to ambiguities in the software specifications. One can indeed find numerous approaches that aim to ensure precise requirement specifications. An obvious way to deal with ambiguities in natural language software specifications is to eliminate them altogether, i.e. to use formal specifications. However, formal methods have been observed to be cost-effective largely for the development of mission-critical software; due to the technical sophistication required, they are yet to be accepted in the mainstream. The alternative is to let the ambiguities remain in the natural language requirements but deal with them using proven techniques, such as approaches based on machine learning, knowledge bases and ontologies. One can indeed find numerous automated and semi-automated tools to resolve specific types of natural language requirement ambiguities. However, to the best of our knowledge, no published literature attempts to compare and contrast the prevalent approaches to dealing with ambiguities in natural language software requirements. Hence, in this paper, we survey and analyze the prevalent approaches to resolving such ambiguities, presenting a state-of-the-art survey of the currently available tools for ambiguity resolution.
The objective of this paper is to disseminate, dissect and analyze the research published in this area, identify metrics for a comparative evaluation, and carry out that evaluation. Finally, we identify open research issues with the aim of sparking new interest and developments in this field.
Semantic Parsing Using Content and Context: A Case Study from Requirements Elicitation
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014
We present a model for the automatic semantic analysis of requirements elicitation documents. Our target semantic representation employs live sequence charts, a multi-modal visual language for scenario-based programming, which can be directly translated into executable code. The architecture we propose integrates sentence-level and discourse-level processing in a generative probabilistic framework for the analysis and disambiguation of individual sentences in context. We show empirically that the discourse-based model consistently outperforms the sentence-based model when constructing a system that reflects all the static (entities, properties) and dynamic (behavioral scenarios) requirements in the document.
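The core idea of combining sentence-level and discourse-level evidence can be sketched as follows: among candidate semantic analyses of a sentence, pick the one that maximizes the product of a local (sentence) score and a contextual (discourse) score. The candidate analyses and the scores below are made up for illustration; they do not come from the paper's model.

```python
def best_analysis(candidates, sentence_score, context_score):
    """Pick the candidate maximizing P(analysis | sentence) * P(analysis | context)."""
    return max(candidates, key=lambda a: sentence_score[a] * context_score[a])

# Two readings of an ambiguous requirement sentence (hypothetical).
candidates = ["press(user, button)", "press(button, user)"]

# Locally the sentence is ambiguous; the surrounding discourse is not.
sentence_score = {"press(user, button)": 0.5, "press(button, user)": 0.5}
context_score = {"press(user, button)": 0.9, "press(button, user)": 0.1}

chosen = best_analysis(candidates, sentence_score, context_score)
print(chosen)
```

This is exactly the situation where a discourse-based model can outperform a sentence-based one: the local scores tie, and only context breaks the tie.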
From Natural Language Requirements to Formal Specification Using an Ontology
2013 IEEE 25th International Conference on Tools with Artificial Intelligence, 2013
In order to check requirement specifications written in natural language, we have chosen to model domain knowledge through an ontology and to formally represent user requirements by its population. Our approach of ontology population focuses on instance property identification from texts. We do so using extraction rules automatically acquired from a training corpus and a bootstrapping terminology. These rules aim at identifying instance property mentions represented by triples of terms, using lexical, syntactic and semantic levels of analysis. They are generated from recurrent syntactic paths between terms denoting instances of concepts and properties. We show how focusing on instance property identification allows us to precisely identify concept instances explicitly or implicitly mentioned in texts.
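A minimal sketch of the kind of extraction rule described above: a lexical-syntactic pattern that recognizes an instance-property mention of the form "X has a Y of Z" and emits an (instance, property, value) triple for ontology population. The pattern and sentence are illustrative assumptions; in the approach above, such rules are acquired automatically from a training corpus rather than written by hand.

```python
import re

# Hypothetical hand-written rule standing in for an automatically
# acquired one: matches "<instance> has a <property> of <value>".
RULE = re.compile(r"(?P<instance>[\w ]+?) has a (?P<property>[\w ]+?) of (?P<value>[\w.]+)")

def extract_triples(sentence):
    """Return (instance, property, value) triples matched by the rule."""
    match = RULE.search(sentence)
    if not match:
        return []
    return [(match.group("instance").strip(),
             match.group("property").strip(),
             match.group("value"))]

print(extract_triples("The pump has a flow rate of 30"))
```

Each emitted triple would then be checked against the domain ontology, which is how implicit or erroneous requirement statements can be detected.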
ARSENAL: Automatic Requirements Specification Extraction from Natural Language
Lecture Notes in Computer Science, 2016
Requirements are informal and semi-formal descriptions of the expected behavior of a complex system from the viewpoints of its stakeholders (customers, users, operators, designers, and engineers). However, for the purpose of design, testing, and verification for critical systems, we can transform requirements into formal models that can be analyzed automatically. ARSENAL is a framework and methodology for systematically transforming natural language (NL) requirements into analyzable formal models and logic specifications. These models can be analyzed for consistency and implementability. The ARSENAL methodology is specialized to individual domains, but the approach is general enough to be adapted to new domains.