Prompts Matter: Insights and Strategies for Prompt Engineering in Automated Software Traceability
Related papers
Exploiting Parts-of-Speech for Effective Automated Requirements Traceability
Context: Requirement traceability (RT) is defined as the ability to describe and follow the life of a requirement. RT helps developers ensure that relevant requirements are implemented and that the source code is consistent with its requirements with respect to a set of traceability links called trace links. Previous work leverages Parts Of Speech (POS) tagging of software artifacts to recover trace links among them. These studies work on the premise that discarding one or more POS tags improves the accuracy of Information Retrieval (IR) techniques. Objective: First, we show empirically that excluding one or more POS tags could negatively impact the accuracy of existing IR-based traceability approaches, namely the Vector Space Model (VSM) and the Jensen-Shannon Model (JSM). Second, we propose a method that improves the accuracy of IR-based traceability approaches. Method: We developed an approach, called ConPOS, to recover trace links using constraint-based pruning. ConPOS uses major POS categories and applies constraints to the recovered trace links, pruning them as a filtering step to significantly improve the effectiveness of IR-based techniques. We conducted an experiment to provide evidence that removing POS tags does not improve the accuracy of IR techniques. Furthermore, we conducted two empirical studies to evaluate the effectiveness of ConPOS in recovering trace links compared to existing peer RT approaches. Results: The results of the first empirical study show that removing one or more POS tags negatively impacts the accuracy of VSM and JSM. Furthermore, the results from the other empirical studies show that ConPOS provides 11%-107%, 8%-64%, and 15%-170% higher precision, recall, and mean average precision (MAP), respectively, than VSM and JSM. Conclusion: We showed that ConPOS outperforms existing IR-based RT approaches that discard some POS tags from the input documents.
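The abstract above names JSM as a baseline without defining it. As a rough illustration only, the sketch below assumes the common formulation in which each artifact is treated as a probability distribution over its terms and a pair is scored by one minus the Jensen-Shannon divergence. The function names are hypothetical, and this is not the ConPOS implementation.

```python
import math
from collections import Counter


def term_distribution(tokens):
    """Turn a token list into a {term: probability} dict."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    # Mixture distribution over the union of both vocabularies.
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in set(p) | set(q)}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture m.
        return sum(pa * math.log2(pa / m[t]) for t, pa in a.items() if pa > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def js_similarity(tokens_a, tokens_b):
    """Assumed JSM-style score: 1 minus the divergence of the term distributions."""
    return 1.0 - js_divergence(term_distribution(tokens_a),
                               term_distribution(tokens_b))
```

Under this assumption, two artifacts with identical term distributions score 1 and two artifacts with no shared terms score 0, so candidate links can be ranked by the score.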
Support for traceability management of software artefacts using Natural Language Processing
2016 Moratuwa Engineering Research Conference (MERCon), 2016
One of the major problems in the software development process is managing software artefacts. As software evolves, inconsistencies between the artefacts evolve as well. To resolve these inconsistencies in change management, a tool named “Software Artefacts Traceability Analyzer (SAT-Analyzer)” was introduced in the previous work of this research. Changes in software artefacts such as requirement specifications, Unified Modelling Language (UML) diagrams and source code can be tracked with the help of Natural Language Processing (NLP) by creating a structured format of those documents. Therefore, in this research we aim to add NLP support as an extension to SAT-Analyzer. Because of artefact inconsistencies, enhancing the traceability links created in the SAT-Analyzer tool is another focus. This paper includes the research methodology and the relevant research carried out in applying NLP for improved traceability management. Tool evaluation with multiple scenarios resulted in an average precision of 72.22%, recall of 88.89% and F1 measure of 78.89%, suggesting high accuracy for the domain.
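The precision, recall and F1 figures reported above are standard set-based measures over recovered and true trace links. A minimal sketch of how they are computed for one scenario follows; the link pairs and the function name are made up for illustration and are not taken from the SAT-Analyzer evaluation.

```python
def link_metrics(recovered, truth):
    """Precision, recall and F1 for a set of recovered trace links.

    Both arguments are sets of (source_artefact, target_artefact) pairs.
    """
    true_positives = len(recovered & truth)
    precision = true_positives / len(recovered) if recovered else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical links between requirements and classes.
recovered = {("R1", "Login"), ("R1", "Report"), ("R2", "Cart")}
truth = {("R1", "Login"), ("R2", "Cart"), ("R2", "Order")}
precision, recall, f1 = link_metrics(recovered, truth)
```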
Improving the identification of traceability links between source code and requirements
Software developers are interested in requirement traceability, e.g., to verify whether all requirements are covered by a system design specification. Based on the assumption that related artifacts contain related terms, researchers have developed, used, and extended algorithms that identify related terms and subsequently infer which artifacts are related (i.e., there is a traceability link between them). Source code is not as verbose as a natural language description, which reduces the applicability of algorithms that rely precisely on such a commonality. This paper extends the Vector Space Model using tf*idf term weights to improve the identification of traceability links between source code and requirements. To this end, we modify how requirements are identified and include user feedback. We show that the inclusion of user feedback significantly improved the number of correctly identified requirements.
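The baseline that this abstract extends, a Vector Space Model with tf*idf weights, can be sketched as follows. This is a minimal illustration of the plain baseline, not the paper's extension: the tokenizer, the smoothed idf formula, the helper names and the sample artifacts are all assumptions, and user feedback is not modelled.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-letters, which also splits snake_case names."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Return one {term: tf*idf} dict per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed idf (an assumed variant), so common terms keep a small weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(tokens).items()}
            for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def rank_links(requirement, code_artifacts):
    """Rank code artifacts by similarity to one requirement, best first."""
    names = list(code_artifacts)
    vectors = tfidf_vectors([requirement] + [code_artifacts[n] for n in names])
    query, rest = vectors[0], vectors[1:]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, rest)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical requirement and source files.
ranked = rank_links(
    "The user shall be able to reset the password",
    {
        "auth.py": "def reset_password(user): send reset email to user",
        "report.py": "def render_report(rows): format table rows",
    },
)
```

Candidate links are usually cut off at a similarity threshold or a top-k rank. The short, identifier-heavy vocabulary of source code is what makes such term overlap sparse, which is the limitation the abstract points to.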
2014
In system and source code development, source code is produced while documentation is written mainly in natural language. Continuous and frequent development requires proper requirements change management, and traceability is essential for managing change and analyzing its impact. This research paper presents a technique in the domain of traceability. We believe that the application-domain knowledge that programmers process when writing the code is often captured by the mnemonics used for identifiers; analyzing these mnemonics can help to associate high-level concepts with program concepts and vice versa. We propose a method based on information retrieval for recovering traceability links between source code and free-text documents. We use information retrieval techniques to establish links between source code and requirements and documentation, and latent semantic indexing is used to automatically identify traceability links from system code. Traceability is the most important f...