GitHub - jolemon/FineLocator: An approach to method-level bug localization by query expansion. (original) (raw)

Description

Baseline technique of method-level bug localization implemented for BLESER.

FineLocator is an approach to method-level bug localization. This technique retrieve information from input of bug report and source code and output a list of suspicious buggy method relevant to the bug report.

Each words in bug report and methods of source code are regarded as documents and transferred into numeric vectors by word embedding (word2vec). The vector representation of a document is calculated as the product of vectors of all words in the document and their corresponding TFIDF values, considering the influence of each single word.

To address the representation sparseness problem caused by short-length methods, the methods are augmented by each other using 3 weighted scores of query expansion, including semantic similarity, temporal proximity and call dependency. The semantic similarity is described as the cosine similarity of method vectors. The temporal proximity is described as the time difference between the latest modified time of methods. The call dependency is described as length of the shortest call path of methods. At last, suspicious buggy methods are ranked by the cosine similarity of vectors of bug report and vectors of augmented methods.

Requirement

On MacOS:

Instructions

  1. mvn package to make jar for "pt" (bug report preprocessor) and "word2vec" (embedding model) if not exist.
    Problem: the "word2vec" jar may contain a lot of redundant dependency component for various OS.
  2. Edit "input.properties" and "run.sh" for your specific configuration.
  3. run.sh ${proj} to run FineLocator.
  4. Get result in "${expResDir}/final/${proj}".