Text-to-text semantic similarity for automatic short answer grading

Automated Short-Answer Grading using Semantic Similarity based on Word Embedding

International Journal of Technology, 2021

Automatic short-answer grading (ASAG) aims to speed up the assessment process without an instructor's intervention. Previous research successfully built an ASAG system whose performance reached a correlation of 0.66 and a mean absolute error (MAE) of 0.94 against a conventionally graded set. However, that study had the weakness of requiring more than one reference answer for each question; it measured sentence similarity with string-based equation methods and a keyword-matching process to produce an assessment rubric. Our study therefore aimed to build a more concise automatic short-answer scoring system that uses a single reference answer. The mechanism applies a semantic similarity measurement approach based on word embedding techniques and syntactic analysis to assess the learner's accuracy. In our experiments, the semantic similarity approach achieved a correlation of 0.70 and an MAE of 0.70 against the grading reference.
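As a rough illustration of the single-reference, embedding-based approach this abstract describes, the sketch below averages pre-trained word vectors into sentence vectors and maps their cosine similarity onto a grade scale. The toy embedding table, tokenization, and score mapping are assumptions for illustration, not the paper's implementation.

```python
# Minimal sketch of embedding-based semantic similarity for ASAG.
# Assumes pre-trained word vectors are available as a {word: vector} dict
# (e.g. loaded from GloVe or word2vec); the toy vectors below are placeholders.
import numpy as np

def sentence_vector(tokens, embeddings, dim=50):
    """Average the word vectors of the tokens found in the embedding table."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(np.dot(u, v) / denom) if denom else 0.0

def grade(student_answer, reference_answer, embeddings, max_score=5.0):
    """Map semantic similarity against a single reference answer to a score."""
    s = sentence_vector(student_answer.lower().split(), embeddings)
    r = sentence_vector(reference_answer.lower().split(), embeddings)
    return cosine(s, r) * max_score

# Toy embedding table used only for illustration.
rng = np.random.default_rng(0)
toy_embeddings = {w: rng.normal(size=50) for w in
                  "photosynthesis converts light energy into chemical energy plants".split()}

print(grade("plants convert light into chemical energy",
            "photosynthesis converts light energy into chemical energy",
            toy_embeddings))
```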

Short Answer Grading Using String Similarity And Corpus-Based Similarity

International Journal of Advanced Computer Science and Applications, 2012

Most automatic scoring systems use pattern-based approaches that require a great deal of hard and tedious work. These systems work in a supervised manner, where predefined patterns and scoring rules must be generated. This paper presents a different, unsupervised approach that deals with students' answers holistically using text-to-text similarity. Different string-based and corpus-based similarity measures were tested separately and then combined, achieving a maximum correlation value of 0.504. This is the best correlation achieved by an unsupervised bag-of-words (BOW) approach when compared to previous work.
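The following sketch illustrates the general idea of combining a string-based measure with a corpus-based measure into one unsupervised score. The specific measures used here (SequenceMatcher ratio and TF-IDF cosine) and the equal weighting are illustrative assumptions rather than the measures evaluated in the paper.

```python
# Hedged sketch of combining a string-based and a corpus-based similarity
# into a single unsupervised score; weights and measures are assumptions.
from difflib import SequenceMatcher
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def string_similarity(a, b):
    """Character-level similarity in [0, 1], standing in for string-based measures."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def corpus_similarity(a, b):
    """TF-IDF cosine similarity as a simple stand-in for corpus-based measures."""
    tfidf = TfidfVectorizer().fit_transform([a, b])
    return float(cosine_similarity(tfidf[0], tfidf[1])[0, 0])

def combined_similarity(student, reference, w_string=0.5, w_corpus=0.5):
    return w_string * string_similarity(student, reference) + \
           w_corpus * corpus_similarity(student, reference)

print(combined_similarity("the heart pumps blood through the body",
                          "blood is pumped around the body by the heart"))
```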

Evaluating Short Answer Using Text Similarity Measures

Our study aims to develop a semi-automatic correction tool for evaluating students in online exams. The problem is the comparison of two texts: one is the student's response and the other is the teacher's model answer. The texts are entered on a computer, tablet, or smartphone, and the results are computed in real time on a remote server, so the results must be relevant and obtained very quickly. Our line of study is the similarity between texts.

Design and Development of a Framework for an Automatic Answer Evaluation System Based on Similarity Measures

Journal of Intelligent Systems, 2017

The assessment of answers is an important process that requires great effort from evaluators. This assessment process requires high concentration without any fluctuations in mood, which substantiates the need to automate answer-script evaluation. For text answer evaluation, sentence similarity measures have been widely used to compare student-written answers with reference texts. In this paper, we propose an automated answer evaluation system that uses our proposed cosine-based sentence similarity measures to evaluate the answers. Cosine measures have proved effective for comparing free-text student answers with reference texts. Here we propose a set of novel cosine-based sentence similarity measures with varied approaches to creating the document vector space. In addition, we propose a novel synset-based word similarity measure for computing document vectors, coupled with varied dimensionality-reduction approaches for reducing the vector space dimensions. ...
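A minimal sketch of this cosine-based idea follows, assuming the reference-answer vocabulary serves as the reduced dimension set and a crude surface-similarity function stands in for the paper's synset-based word measure (e.g. a WordNet path or Wu-Palmer similarity).

```python
# Illustrative sketch of cosine sentence similarity over a reduced vector
# space: dimensions are the reference-answer terms, and each entry of a
# document vector is the best word-to-word similarity against that term.
import numpy as np

def word_sim(w1, w2):
    # Assumption: crude surface similarity as a placeholder for a
    # synset-based word similarity measure.
    if w1 == w2:
        return 1.0
    return 0.5 if w1[:4] == w2[:4] else 0.0

def doc_vector(tokens, dimensions):
    """Soft term-matching document vector over the reduced dimension set."""
    return np.array([max((word_sim(d, t) for t in tokens), default=0.0)
                     for d in dimensions])

def cosine_sentence_similarity(student, reference):
    ref_tokens = reference.lower().split()
    stu_tokens = student.lower().split()
    dims = sorted(set(ref_tokens))   # dimensionality reduction: reference vocabulary only
    u = doc_vector(stu_tokens, dims)
    v = doc_vector(ref_tokens, dims)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0

print(cosine_sentence_similarity("cells divide by mitosis",
                                 "mitosis is the process by which cells divide"))
```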

A scoring rubric for automatic short answer grading system

TELKOMNIKA Telecommunication Computing Electronics and Control, 2019

During the past decades, research on automatic grading has become an interesting issue. These studies focus on how to make machines able to help humans assess students' learning outcomes. Automatic grading enables teachers to assess students' answers more objectively, consistently, and quickly. The essay model in particular has two types: the long essay and the short answer. Most previous research developed automatic essay grading (AEG) rather than automatic short answer grading (ASAG). This study aims to assess the sentence similarity of short answers to the questions and reference answers in Indonesian without any semantic language tool. The research uses pre-processing steps consisting of case folding, tokenization, stemming, and stopword removal. The proposed approach is a scoring rubric obtained by measuring sentence similarity with string-based similarity methods and a keyword-matching process. The dataset used in this study consists of 7 questions, 34 alternative reference answers, and 224 student answers. The experiment results show that the proposed approach achieves Pearson correlation values between 0.65419 and 0.66383, with mean absolute error (MAE) values between 0.94994 and 1.24295. The proposed approach also raises the correlation value and decreases the error value for each method.
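The sketch below illustrates a rubric of this general shape: the best string similarity over the alternative reference answers is combined with a keyword-coverage term after simple pre-processing. The weights, score scale, and reduced pre-processing (no stemming or stopword removal) are simplifying assumptions, not the paper's exact rubric.

```python
# Rough sketch of a rubric score built from string similarity against
# alternative reference answers plus keyword matching; all weights and
# the example data are illustrative assumptions.
from difflib import SequenceMatcher

def preprocess(text):
    # Case folding + tokenization; stemming and stopword removal omitted for brevity.
    return text.lower().split()

def keyword_coverage(tokens, keywords):
    return sum(1 for k in keywords if k in tokens) / len(keywords) if keywords else 0.0

def rubric_score(student, references, keywords, max_score=4.0):
    tokens = preprocess(student)
    best_string = max(SequenceMatcher(None, " ".join(tokens),
                                      " ".join(preprocess(r))).ratio()
                      for r in references)
    coverage = keyword_coverage(tokens, keywords)
    return round((0.6 * best_string + 0.4 * coverage) * max_score, 2)

print(rubric_score("fotosintesis mengubah cahaya menjadi energi",
                   ["fotosintesis adalah proses mengubah energi cahaya menjadi energi kimia"],
                   ["fotosintesis", "cahaya", "energi"]))
```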

Presentation of an Efficient Automatic Short Answer Grading Model Based on Combination of Pseudo Relevance Feedback and Semantic Relatedness Measures

2019

Automatic short answer grading (ASAG) is the automated process of assessing natural-language answers using computational methods and machine learning algorithms. The development of large-scale smart education systems on one hand, and the importance of assessment as a key factor in the learning process along with its challenges on the other, have significantly increased the need for a highly flexible automated system for assessing text-based exams. Generally, ASAG methods can be categorized into supervised and unsupervised approaches. Supervised approaches, such as machine learning and especially deep learning methods, require manually constructed patterns. On the other hand, since in the assessment process a student's answer is compared to an ideal response and scoring is based on their similarity, semantic relatedness and similarity measures can be considered unsupervised approaches for this aim. Whereas unsupervised approaches do not require labeled ...
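One plausible reading of combining pseudo relevance feedback with a relatedness measure is sketched below: the reference answer is expanded with frequent terms from the student answers most related to it, and all answers are then re-scored against the expanded reference. TF-IDF cosine stands in for the paper's semantic relatedness measures, and all parameters are illustrative.

```python
# Conceptual sketch of pseudo relevance feedback for ASAG; the relatedness
# measure, top_k, and expansion size are illustrative assumptions.
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def relatedness(texts, reference):
    """Cosine relatedness of each text to the reference (TF-IDF stand-in)."""
    tfidf = TfidfVectorizer().fit_transform([reference] + list(texts))
    return cosine_similarity(tfidf[0], tfidf[1:]).ravel()

def grade_with_prf(reference, answers, top_k=2, expand_terms=5):
    # First pass: rank answers by relatedness to the original reference.
    first = relatedness(answers, reference)
    top = [answers[i] for i in first.argsort()[::-1][:top_k]]
    # Feedback step: expand the reference with frequent terms of the top answers.
    counts = Counter(w for a in top for w in a.lower().split())
    expansion = " ".join(w for w, _ in counts.most_common(expand_terms))
    # Second pass: score all answers against the expanded reference.
    return relatedness(answers, reference + " " + expansion)

answers = ["mitosis produces two identical daughter cells",
           "cells split into two identical cells during mitosis",
           "meiosis makes gametes"]
print(grade_with_prf("mitosis is cell division yielding two identical daughter cells", answers))
```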

Exploring Distinct Features for Automatic Short Answer Grading

Anais do XV Encontro Nacional de Inteligência Artificial e Computacional (ENIAC 2018)

Automatic short answer grading is the field of study that addresses the assessment of students' answers to questions in natural language. Grading the answers is generally treated as a typical supervised classification task. To stimulate research in the field, two datasets were publicly released in the SemEval 2013 competition task "Student Response Analysis". Since then, several works have been developed to improve the results. In this context, the goal of this work is to tackle that task by implementing lessons learned from the literature in an effective way and to report results for both datasets and all of their scenarios. The proposed method obtained better results in most scenarios of the competition task and, therefore, higher overall scores when compared to recent works.
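As a minimal sketch of this supervised-classification framing (not the features or models used in the paper), each student answer is mapped to a small vector of similarity features against the reference answer, and a classifier predicts the label, as in the SemEval 2013 "Student Response Analysis" correct/incorrect setting.

```python
# Illustrative feature-based classification for short answer grading;
# the two features, tiny training set, and labels are made up for the example.
from difflib import SequenceMatcher
from sklearn.linear_model import LogisticRegression

def features(student, reference):
    s, r = set(student.lower().split()), set(reference.lower().split())
    jaccard = len(s & r) / len(s | r) if s | r else 0.0
    seq = SequenceMatcher(None, student.lower(), reference.lower()).ratio()
    return [jaccard, seq]

reference = "the circuit is closed so current flows"
train_answers = ["current flows because the circuit is closed",
                 "the bulb is broken",
                 "a closed circuit lets current flow",
                 "there is no battery"]
train_labels = [1, 0, 1, 0]   # 1 = correct, 0 = incorrect

X = [features(a, reference) for a in train_answers]
clf = LogisticRegression().fit(X, train_labels)
print(clf.predict([features("current can flow since the circuit is closed", reference)]))
```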

An Automated Assessment System for Evaluation of Students' Answers Using Novel Similarity Measures

Research Journal of Applied Sciences, Engineering and Technology, 2016

Artificial intelligence has many applications, and automating human behavior with machines is one of the important research activities currently in progress. This paper proposes an automated assessment system that uses two novel similarity measures to evaluate students' short and long answers, and compares them with the cosine similarity measure and the n-gram similarity measure. The proposed system evaluates answers of the information recall and comprehension types in Bloom's taxonomy. The comparison shows that the proposed system, which uses the two novel similarity measures, outperforms the n-gram and cosine similarity measures for information recall and comprehension questions. The system-generated scores are also compared with human scores, and the system scores correlate with human scores under Pearson's and Spearman's correlation.
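The evaluation step mentioned at the end of this abstract, comparing system-generated scores with human scores via Pearson's and Spearman's correlation, can be sketched with SciPy's correlation functions. The score lists below are made-up illustrative data, not the paper's results.

```python
# Correlating system scores with human scores; the data are illustrative only.
from scipy.stats import pearsonr, spearmanr

human_scores  = [5.0, 3.5, 4.0, 2.0, 1.0, 4.5]
system_scores = [4.8, 3.0, 4.2, 2.5, 1.5, 4.0]

pearson, _ = pearsonr(human_scores, system_scores)
spearman, _ = spearmanr(human_scores, system_scores)
print(f"Pearson r = {pearson:.3f}, Spearman rho = {spearman:.3f}")
```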