Pre-Training With Scientific Text Improves Educational Question Generation

Scalable Educational Question Generation with Pre-trained Language Models

arXiv (Cornell University), 2023

The automatic generation of educational questions will play a key role in scaling online education, enabling self-assessment at scale as a global population navigates personalised learning journeys. We develop EduQG, a novel educational question generation model built by adapting a large language model. Our extensive experiments demonstrate that EduQG can produce superior educational questions by further pre-training and fine-tuning a pre-trained language model on scientific text and science question data.
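The two-stage recipe the abstract describes (continued pre-training on in-domain text, then fine-tuning on question data) can be sketched with standard libraries. The following is a minimal, hypothetical sketch; `t5-small` and the toy text pairs are placeholders, not the paper's actual model or corpora, and the pre-training objective is simplified.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")  # assumed base model
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

def train_step(source: str, target: str) -> float:
    """One gradient step on a (source -> target) text pair."""
    batch = tokenizer(source, return_tensors="pt", truncation=True)
    labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Stage 1: further pre-training on scientific text (T5's span-denoising
# objective is simplified here to plain text reconstruction).
train_step("Photosynthesis converts light energy into chemical energy.",
           "Photosynthesis converts light energy into chemical energy.")

# Stage 2: fine-tuning on science question data (context -> question).
train_step("generate question: Photosynthesis converts light energy "
           "into chemical energy.",
           "What does photosynthesis convert light energy into?")
```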

A Feasibility Study of Answer-Agnostic Question Generation for Education

ArXiv, 2022

We conduct a feasibility study into the applicability of answer-agnostic question generation models to textbook passages. We show that a significant portion of errors in such systems arise from asking irrelevant or uninterpretable questions and that such errors can be ameliorated by providing summarized input. We find that giving these models human-written summaries instead of the original text results in a significant increase in the acceptability of generated questions (33% → 83%) as determined by expert annotators. We also find that, in the absence of human-written summaries, automatic summarization can serve as a good middle ground.
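The "summarize, then ask" pipeline the study evaluates is straightforward to prototype. In the sketch below, the off-the-shelf checkpoints are stand-ins for the paper's models, and the instruction prompt is an assumption rather than the study's exact setup.

```python
from transformers import pipeline

# Off-the-shelf stand-ins, not the systems evaluated in the study.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
qg = pipeline("text2text-generation", model="google/flan-t5-base")

passage = (
    "The mitochondrion is an organelle found in most eukaryotic cells. "
    "It generates most of the cell's supply of adenosine triphosphate "
    "(ATP), which is used as a source of chemical energy. Mitochondria "
    "also play roles in signaling, cell differentiation, and cell death."
)

# Step 1: compress the passage so the generator sees only salient content.
summary = summarizer(passage, max_length=40, min_length=10,
                     do_sample=False)[0]["summary_text"]

# Step 2: ask about the summary rather than the raw passage.
question = qg("Write a question about this text: " + summary,
              max_new_tokens=32)[0]["generated_text"]
print(question)
```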

Towards Automatic Generation of Questions from Long Answers

2020

Automatic question generation (AQG) has broad applicability in domains such as tutoring systems, conversational agents, healthcare literacy, and information retrieval. Existing efforts at AQG have been limited to short answer lengths of up to two or three sentences. However, several real-world applications require question generation from answers that span several sentences. Therefore, we propose a novel evaluation benchmark to assess the performance of existing AQG systems for long-text answers. We leverage the large-scale open-source Google Natural Questions dataset to create this long-answer AQG benchmark. We empirically demonstrate that the performance of existing AQG methods significantly degrades as the length of the answer increases. Transformer-based methods outperform other existing AQG methods on long answers in terms of automatic as well as human evaluation. However, we still observe degradation in the performance of our best-performing models with increasing answer length.
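The benchmark construction the paper describes amounts to filtering a QA corpus down to examples whose answers run longer than a couple of sentences. The records below are toy stand-ins for Natural Questions, and the two-sentence threshold is an assumption for illustration.

```python
import re

def sentence_count(text: str) -> int:
    """Crude sentence count via terminal punctuation."""
    return len(re.findall(r"[.!?]+", text)) or 1

corpus = [
    {"question": "what is photosynthesis",
     "answer": "Photosynthesis is a process used by plants. It converts "
               "light energy into chemical energy. The energy is stored "
               "in sugars."},
    {"question": "who wrote hamlet", "answer": "William Shakespeare."},
]

# Keep only long answers (more than two sentences), the regime in which
# the paper reports existing AQG systems degrading.
long_answer_benchmark = [ex for ex in corpus
                         if sentence_count(ex["answer"]) > 2]
print(len(long_answer_benchmark))  # -> 1
```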

Automatic Question Generation for Vocabulary Assessment

2005

In the REAP system, users are automatically provided with texts to read, targeted to their individual reading levels. To find appropriate texts, the user's vocabulary knowledge must be assessed. We describe an approach to automatically generating questions for vocabulary assessment. Traditionally, these assessments have been hand-written. Using data from WordNet, we generate six types of vocabulary questions. They can have several forms, including wordbank and multiple-choice. We present experimental results suggesting that these automatically generated questions give a measure of vocabulary skill that correlates well with subject performance on independently developed human-written questions. In addition, strong correlations with standardized vocabulary tests point to the validity of our approach to automatic assessment of word knowledge.
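A hypothetical sketch in the spirit of the paper's WordNet-based items follows. The paper generates six question types; this shows only one, a definition multiple-choice item with distractors drawn from co-hyponyms (sibling synsets under a shared hypernym), which is an assumed distractor strategy rather than the paper's exact one.

```python
import random
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def definition_mcq(word: str, n_distractors: int = 3):
    synset = wn.synsets(word)[0]  # take the most frequent sense
    correct = synset.definition()
    # Distractors: definitions of co-hyponyms, plausible but wrong.
    siblings = [s for h in synset.hypernyms() for s in h.hyponyms()
                if s != synset]
    distractors = [s.definition()
                   for s in random.sample(siblings,
                                          min(n_distractors, len(siblings)))]
    options = distractors + [correct]
    random.shuffle(options)
    return f"Which is the best definition of '{word}'?", options, correct

question, options, answer = definition_mcq("telescope")
print(question, options, answer, sep="\n")
```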

Selecting Better Samples from Pre-trained LLMs: A Case Study on Question Generation

arXiv (Cornell University), 2022

Large Language Models (LLMs) have in recent years demonstrated impressive prowess in natural language generation. A common practice to improve generation diversity is to sample multiple outputs from the model. However, there is no simple and robust way of selecting the best output from these stochastic samples. As a case study framed in the context of question generation, we propose two prompt-based approaches to selecting high-quality questions from a set of LLM-generated candidates. Our method works under the constraints of 1) a black-box (non-modifiable) question generation model and 2) lack of access to human-annotated references, both of which are realistic limitations for real-world deployment of LLMs. With automatic as well as human evaluations, we empirically demonstrate that our approach can effectively select questions of higher quality than greedy generation. All code and annotated data are open-sourced on GitHub.
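Under the paper's constraints (black-box generator, no references), sampled candidates can still be ranked by a separate scoring model. The sketch below is an illustrative likelihood-based ranking with GPT-2 as the scorer, not the paper's exact prompt-based method; the context and candidates are invented.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")

def log_likelihood(text: str) -> float:
    """Negative mean cross-entropy the LM assigns to `text`."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss  # mean negative log-likelihood
    return -loss.item()

context = "Context: Water boils at 100 degrees Celsius at sea level."
candidates = [
    "At what temperature does water boil at sea level?",
    "What what boil water?",   # a degenerate sample
    "Why is the sky blue?",    # an off-topic sample
]
# Rank each candidate by how plausible it is *given* the context.
best = max(candidates,
           key=lambda q: log_likelihood(f"{context}\nQuestion: {q}"))
print(best)
```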

Few-Shot Question Generation for Personalized Feedback in Intelligent Tutoring Systems

PAIS 2022

Existing work on generating hints in Intelligent Tutoring Systems (ITS) focuses mostly on manual and non-personalized feedback. In this work, we explore automatically generated questions as personalized feedback in an ITS. Our personalized feedback can pinpoint correct and incorrect or missing phrases in student answers and guide students towards the correct answer by asking a question in natural language. Our approach combines cause–effect analysis for breaking down student answers with text-similarity-based Transformer models to identify correct and incorrect or missing parts. We train few-shot Neural Question Generation and Question Re-ranking models to show questions addressing components missing from the student's answer, which steers students towards the correct answer. Our model vastly outperforms both simple and strong baselines in terms of student learning gains, by 45% and 35% respectively, when tested in a real dialogue-based ITS. Finally, we show that our personalized corr...
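The similarity-based gap detection this approach builds on can be sketched with sentence embeddings: compare each reference-answer component against the student answer and flag low-similarity components as missing, then target those with generated questions. The model name, threshold, and examples below are illustrative, not the paper's configuration.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

reference_parts = ["the heart pumps blood",
                   "blood carries oxygen to the body"]
student_answer = "The heart moves blood around."

emb_parts = model.encode(reference_parts, convert_to_tensor=True)
emb_student = model.encode(student_answer, convert_to_tensor=True)
scores = util.cos_sim(emb_parts, emb_student).squeeze(1)

# Components below the (arbitrary) threshold are treated as missing and
# would each be handed to a QG model, e.g. "What does blood carry?"
missing = [p for p, s in zip(reference_parts, scores) if s < 0.5]
print(missing)
```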

Investigating Educational and Noneducational Answer Selection for Educational Question Generation

IEEE Access

Educational automatic question generation (AQG) is often unable to realize its full potential in educational applications due to insufficient training data. For this reason, current research relies on noneducational question answering datasets for system training and evaluation. However, noneducational training data may comprise different language patterns than educational data. Consequently, the research question arises of whether models trained on noneducational datasets transfer well to the educational AQG task. In this work, we investigate the AQG subtask of answer selection, which aims to extract meaningful answers for the questions to be generated. We train and evaluate six modern and well-established BERT-based machine learning model architectures on two widely used noneducational datasets. Furthermore, we introduce a novel, midsized educational dataset for answer selection called TQA-A, which we use to investigate how well the noneducational models transfer to the educational domain. In terms of phrase-level evaluation metrics, noneducational models perform similarly to models trained directly on the novel educational TQA-A dataset, even though they are trained on considerably more data. Moreover, models trained directly on TQA-A select fewer named-entity-based and more verb-based answers than noneducational models. This provides evidence for differences between noneducational and educational answer selection tasks.
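A minimal baseline for the answer-selection subtask (not the paper's BERT models) is to propose entity-, noun-phrase-, and verb-based candidates, mirroring the answer categories the paper compares across domains. The sketch assumes the `en_core_web_sm` spaCy model is installed.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed to be installed
doc = nlp("Gregor Mendel discovered the basic principles of heredity "
          "by breeding pea plants in controlled experiments.")

candidates = [ent.text for ent in doc.ents]               # entity-based
candidates += [chunk.text for chunk in doc.noun_chunks]   # phrase-based
candidates += [tok.text for tok in doc if tok.pos_ == "VERB"]  # verb-based
print(candidates)
```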

I Do Not Understand What I Cannot Define: Automatic Question Generation With Pedagogically-Driven Content Selection

2021

Most learners fail to develop deep text comprehension when reading textbooks passively. Posing questions about what learners have read is a well-established way of fostering their text comprehension. However, many textbooks lack self-assessment questions because authoring them is time-consuming and expensive. Automatic question generators may alleviate this scarcity by generating sound pedagogical questions. However, generating questions automatically poses linguistic and pedagogical challenges. What should we ask? And how do we phrase the question automatically? We address those challenges with an automatic question generator grounded in learning theory. The paper introduces a novel pedagogically meaningful content selection mechanism to find question-worthy sentences and answers in arbitrary textbook contents. We conducted an empirical evaluation study with educational experts, annotating 150 generated questions in six different domains. Results indicate a high linguistic quality ...
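As the title suggests, definitions are one natural notion of question-worthiness. The toy sketch below picks sentences matching an "X is a Y" pattern and turns the defined term into the answer; the regex is a stand-in for the paper's actual pedagogically-driven selection mechanism.

```python
import re

# Matches e.g. "A mitochondrion is an organelle that produces ATP."
PATTERN = re.compile(r"^(?:A|An|The)\s+(.+?)\s+is\s+(?:a|an|the)\s+(.+)\.$",
                     re.IGNORECASE)

def select_definition(sentence: str):
    """Return (question, answer) for definition sentences, else None."""
    m = PATTERN.match(sentence.strip())
    if m is None:
        return None
    term, _definition = m.groups()
    return f"What is {term.strip().lower()}?", term.strip()

for s in ["A mitochondrion is an organelle that produces ATP.",
          "Cells were first observed in 1665."]:
    print(select_definition(s))
```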

Automatic Question Generation: A Systematic Review

SSRN Electronic Journal, 2019

Today's educational systems need efficient tools for competently assessing students on the major concepts they have learnt from study material. Preparing a set of questions for assessment can be time-consuming for teachers, while questions from external sources such as assessment books or question banks might not be relevant to the content students have studied. Automatic Question Generation (AQG) is the technique of generating a suitable set of questions from content such as text. AQG is an important yet challenging problem in NLP, defined as the task of generating syntactically sound, semantically correct and relevant questions from several input formats such as text, a structured database or a knowledge base. Question generation can be naturally applied in many domains such as MOOCs, automated help systems, search engines, chatbot systems (e.g. for customer interaction), and healthcare for analyzing mental health. Despite its usefulness, manually creating meaningful and relevant questions is a time-consuming and challenging task. For example, while evaluating students on reading comprehension, it is tedious for a teacher to manually create questions, find answers to those questions, and thereafter evaluate answers. Traditional approaches have either used a linguistically motivated set of transformation rules for transforming a given sentence into a question or a set of manually created templates with slot fillers to generate questions. Recently, neural network-based techniques such as sequence-to-sequence (Seq2Seq) learning have achieved remarkable success in various NLP tasks, including question generation: one line of work generates questions from a (subject, relation, object) triple, another applies Seq2Seq models with attention to generate questions from passages, and a further approach generates questions and answers from a corpus using pointer networks. AQG has received immense attention from researchers in the field of computational linguistics, and this review focuses on recent and ongoing NLP research on generating questions automatically from text through various methods.
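A toy instance of the rule-based transformation line this review describes: turn a subject-verb-object sentence into a who-question. Real rule-based systems operate on full syntactic parses with many rules; this single-rule regex version, with an invented verb list, only illustrates the idea.

```python
import re

def rule_based_question(sentence: str):
    """Transform "<Subject> discovered/invented/wrote <Object>."
    into a who-question, or return None if the rule does not apply."""
    m = re.match(r"^(.+?)\s+(discovered|invented|wrote)\s+(.+)\.$", sentence)
    if m is None:
        return None
    _subject, verb, obj = m.groups()
    return f"Who {verb} {obj}?"

print(rule_based_question("Marie Curie discovered radium."))
# -> Who discovered radium?
```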

Question Generation: Past, Present & Future

2024

Question Generation (QG) is an essential area in Natural Language Processing (NLP) that aims to create questions from a given text. This paper reviews the evolution of QG methods, from early rule-based systems to contemporary deep learning techniques, and explores potential future advancements. By examining the strengths and weaknesses of each approach, we provide a comprehensive understanding of the progress in QG and propose directions for future research.