Pasquale Ardimento - Academia.edu
Papers by Pasquale Ardimento
Lecture Notes in Computer Science, 2016
In modern software development, finding and fixing bugs is a vital part of development and quality assurance. Once a bug is reported, it is typically recorded in the Bug Tracking System and assigned to a developer to resolve (bug triage). Current bug triage practice is largely a manual, collaborative process that is often time-consuming and error-prone. Predicting the time to fix a newly reported bug from past data has been shown to be an important target for supporting the whole triage process. Many researchers have therefore proposed methods for automated bug-fix time prediction, largely based on statistical prediction models that exploit the attributes of bug reports. However, existing algorithms often fail to validate on multiple large projects widely used in bug studies, mostly as a consequence of inappropriate attribute selection [2]. In this paper, instead of focusing on attribute subset selection, we explore a promising alternative approach that uses all available textual information. The problem of bug-fix time estimation is then mapped to a text categorization problem. We consider a multi-topic Supervised Latent Dirichlet Allocation (SLDA) model, which adds to Latent Dirichlet Allocation a response variable consisting of an unordered binary target variable, denoting time to resolution discretized into FAST (negative class) and SLOW (positive class) labels. We evaluated SLDA on four large-scale open source projects and show that the proposed model greatly improves recall compared to standard single-topic algorithms.
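The labelling step and the recall metric mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, and the use of the sample median as the FAST/SLOW cut-off is an assumption, since the abstract does not state the threshold. The SLDA model itself is not shown.

```python
from statistics import median


def label_fix_times(days_to_fix, threshold=None):
    """Discretize bug-fix times into FAST (negative) and SLOW (positive).

    Assumption: when no threshold is given, the sample median is used
    as the cut-off; the abstract does not specify one.
    """
    if threshold is None:
        threshold = median(days_to_fix)
    return ["SLOW" if d > threshold else "FAST" for d in days_to_fix]


def recall(true_labels, predicted_labels, positive="SLOW"):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    labels = label_fix_times([1, 2, 10, 30])
    print(labels)
    print(recall(["SLOW", "SLOW", "FAST"], ["SLOW", "FAST", "FAST"]))
```

Recall is the natural metric here because SLOW is the positive class: it measures how many of the truly slow-to-fix bugs a classifier flags.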
Communications in Computer and Information Science, 2023
Communications in Computer and Information Science, 2023
Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering
Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering: Companion Proceedings
The computing education community has shown a long-standing interest in how to analyze the Object-Oriented (OO) source code developed by students to provide them with useful formative tips. In this paper, we propose and evaluate an approach to analyze how students use Java and its language constructs. The approach is implemented through a cloud-based integrated development environment (IDE) and is based on the analysis of the most common violations of the OO paradigm in student source code. Moreover, the IDE supports the automatic generation of reports about students' mistakes and misconceptions that instructors can use to improve the course design. The paper discusses the preliminary results of an experiment performed in a class of a Programming II course to investigate the effects of the provided reports on coding ability (concerning the correctness of the produced code).
Journal of Systems and Software
2020 International Joint Conference on Neural Networks (IJCNN)
In recent years, mobile devices have become an essential tool for daily activities. However, they have also become the target of continuous malware attacks, usually from new malware obtained as a variant of existing malware. For this reason, we suppose that comparing the behavior of a new application with that of known malware applications makes it possible to classify it as malicious or trusted. Accordingly, this study proposes an approach based on a data-aware declarative process mining technique to identify similarities and recurring patterns in the system call traces generated by a set of malicious mobile applications. The resulting characterization, represented by a set of declarative constraints with their data attributes, can be considered a run-time fingerprint of a malware, useful for evaluating whether a new application belongs to a given malware family. The proposed approach is empirically validated on a dataset of more than 1200 trusted and malicious applications from eight malware families, and the results show very good discrimination ability.
Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing
Proceedings of the 17th International Conference on Evaluation of Novel Approaches to Software Engineering
Higher Education Learning Methodologies and Technologies Online, 2022
Electronics, 2022
Capturing and analyzing interaction data in real time from development environments can help in understanding how programmers handle coding activities. We propose the use of process mining to learn coding behavior from event logs captured from a customized Integrated Development Environment, concerning interactions with both the environment and a Version Control System. In particular, by using an incremental approach, the discovered model can be refined after every single development session, which avoids the need for the model to learn from scratch from previous sessions. It would also allow one to provide the programmer with timely suggestions to improve their performance. In this paper, we applied off-line incremental behavior, so as to be able to analyze it at several levels of depth and at different moments. As a preliminary evaluation of our approach, we investigated the coding activities of six novice students of a Java academic programming course working on a programming c...
2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 2019
Understanding how students and developers approach software development, and what specific hurdles they face, has strong potential to better support the coding workflow. In this paper, we present the CodingMiner environment, which generates event logs from IDE usage and enables the adoption of fuzzy-based process mining techniques to model and study the developers' coding process. The logs from the development sessions have been analyzed using the fuzzy miner to highlight emergent and interesting behaviors of developers and students during coding. The mined processes show different IDE usage patterns for students with different skills and performance. To validate our approach, we describe the results of a study in which the CodingMiner environment is used to investigate the coding activities of twenty students of a CS2 course performing a given programming task during four assignments. Results also demonstrate that fuzzy-based process mining techniques can be effectively exploited to understand the behavior of students and developers during programming tasks, providing useful insights to improve the way they code.
2020 IEEE Conference on Evolving and Adaptive Intelligent Systems (EAIS), 2020
Mobile phones are currently the main targets of continuous malware attacks. Usually, new malicious code is generated by suitably modifying existing code. It is therefore very useful to identify new approaches for the analysis of malware phylogeny. This paper proposes a data-aware process mining approach that performs dynamic malware analysis. The process mining uses a multiperspective declarative approach that models a malware family as a set of constraints (with their data attributes) among the system call traces gathered from infected applications. The models are used to detect execution patterns or other relationships among families. The obtained models can be used to verify whether a checked malware is a potential member of a known malware family and how it differs from other malware variants of the family. The approach is implemented and applied to a dataset composed of 5648 trusted and malicious applications across 39 malware families. The obtained results show strong performance in malware phylogeny generation.
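The idea of modelling a malware family as declarative constraints over system call traces can be illustrated with a minimal sketch. It checks only the control-flow side of one well-known Declare template, response(a, b), and ignores the data attributes that the paper's data-aware approach also uses. The function names, the example system calls, and the "fraction of satisfied constraints" score are assumptions for illustration, not the authors' method.

```python
def satisfies_response(trace, a, b):
    """Declare 'response(a, b)': every occurrence of a is eventually
    followed by an occurrence of b. Assumes a != b."""
    pending = False
    for event in trace:
        if event == a:
            pending = True      # an 'a' is waiting for a later 'b'
        elif event == b:
            pending = False     # all earlier 'a' events are now answered
    return not pending


def fraction_satisfied(trace, constraints):
    """Hypothetical similarity score: share of a family's response
    constraints that hold on the given system call trace."""
    if not constraints:
        return 0.0
    held = sum(1 for a, b in constraints if satisfies_response(trace, a, b))
    return held / len(constraints)


if __name__ == "__main__":
    # Hypothetical family model and trace.
    family_model = [("open", "close"), ("read", "write")]
    trace = ["open", "read", "close"]
    print(fraction_satisfied(trace, family_model))
```

A trace with no occurrence of `a` satisfies response(a, b) vacuously, which is the standard Declare semantics.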
Higher Education Learning Methodologies and Technologies Online, 2019
Analyzing the object-oriented (OO) source code developed by students provides useful formative tips to instructors. It is therefore essential to understand students' real difficulties so that instructors can shape effective courses. To provide run-time feedback to students and to study and analyze the evolution of their performance offline and over time, we designed a framework and developed a tool. It identifies students' misconceptions by analyzing source code and automatically creates personalized student reports. In this paper, we present an empirical study, conducted using our toolchain, that involves 1627 projects extracted from the multi-institution Blackbox dataset. We identified a violation model for Java language constructs based on established results in the computing education community. Afterwards, we grouped these violations into categories and analyzed the relations among them. Our contributions might be helpful in delivering formative feedback and supporting instructors who teach Java and object-oriented programming in general.
Neural Computing and Applications, 2021
Software maintenance and evolution can introduce defects in software systems. For this reason, there is great interest in identifying defect prediction and estimation techniques. Recent research proposes just-in-time techniques that predict defective changes at the commit level, allowing developers to fix a defect when it is introduced. However, the performance of existing just-in-time defect prediction models still needs improvement. This paper proposes a new approach based on a large feature set containing product and process software metrics extracted from the commits of software projects, along with their evolution. The approach also introduces a variant of deep temporal convolutional networks based on hierarchical attention layers to perform the fault prediction. The proposed approach is evaluated on a large dataset composed of data gathered from six Java open-source systems. The obtained results show the effectiveness of the proposed approach in the timely prediction of the defect proneness of code components.
Communications in Computer and Information Science, 2021
IEEE Access, 2020
The computing education community has shown a long-standing interest in how to analyze the Object-Oriented (OO) source code developed by students to provide them with useful formative tips. Instructors need to understand students' difficulties to provide precise feedback on the most frequent mistakes and to shape, design, and effectively drive the course. This paper proposes and evaluates an approach for analyzing students' source code and automatically generating feedback about the most common violations in the produced code. The approach is implemented through a cloud-based tool that monitors how students use language constructs, based on the analysis of the most common violations of the Object-Oriented paradigm in student source code. Moreover, the tool supports the generation of reports about students' mistakes and misconceptions that can be used to improve the students' education. The paper reports the results of a quasi-experiment performed in a class of a CS1 course to investigate the effects of the provided reports on coding ability (concerning the correctness and the quality of the produced source code). Results show that, after the course, the treatment group obtained higher scores and produced better source code than the control group following the feedback provided by the teachers.
Proceedings of the 16th International Conference on Software Technologies, 2021
This paper investigates whether the adoption of a transfer learning approach can be effective for just-in-time design smell prediction. The approach uses a variant of Temporal Convolutional Networks to predict design smells and a carefully selected set of fine-grained process and product metrics. The validation is performed on a dataset composed of three open-source systems and includes a comparison between transfer and direct learning. The hypothesis we want to verify is that the proposed transfer learning approach can transfer the knowledge gained on mature systems to the system of interest, making reliable predictions even at the beginning of development, when the available historical data is limited. The obtained results show that, when the class imbalance is high, transfer learning provides F1-scores very close to those obtained by direct learning.
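The F1-score used to compare transfer and direct learning in this abstract can be computed as in the minimal sketch below. The function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def f1_score(true_labels, predicted_labels, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy, imbalanced example: 2 smelly changes (1) out of 5.
    y_true = [1, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0]
    print(f1_score(y_true, y_pred))
```

F1 ignores true negatives, which is why it is commonly preferred over accuracy when the positive class is rare, as in the high-imbalance setting the abstract mentions.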
Journal of e-learning and knowledge society, 2011
Practitioners must continually update their skills to align their professional profile with the needs of the market and of the social organizations in which they live, both characterized by extreme variability and volatility. In this scenario, universities, the traditional institutions for knowledge transfer, assume the role of institutions dedicated to lifelong learning. However, lifelong learning raises several issues that make it ill-suited to university instructional models. To address this problem, the authors propose a Learning Network model integrating a Knowledge Base Experience (Prometheus) to support the distribution of content and to enhance knowledge transfer. The results of an empirical experimentation encourage their adoption in real contexts.