Linguistic Profiles of Students Interacting with Conversation-Based Assessment Systems
Related papers
Evaluating English language learners’ conversations: Man vs. Machine
Computer Assisted Language Learning, 2018
This study investigated a new conversation-based assessment for English language learners. In the assessment, students converse with animated agents in natural language to assess various aspects of English language learning. In a between-subjects design, 31 students (N = 31) were asked questions by either the newly created system or human interviewers. Results revealed no difference between conditions in how students' answers reflected the constructs under investigation. There were, however, other differences, including higher word counts and potentially more complex language use when speaking to a human. Furthermore, students liked interacting with the system. These results suggest that the newly created system can assess students' English capabilities comparably to human interviewers and that implementing such a formative assessment is feasible.
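As a minimal illustration of the word-count comparison described above (not the study's actual analysis pipeline), the sketch below contrasts response lengths across the two interviewer conditions with a Welch's t-test; the response lists are hypothetical stand-ins for transcribed student answers.

```python
# A minimal sketch, assuming hypothetical response transcripts: compare
# per-response word counts between the system and human conditions.
from scipy import stats

system_responses = [
    "I like reading books",
    "My favorite subject is science",
]
human_responses = [
    "I really enjoy reading adventure books because they are exciting",
    "My favorite subject is science since we do experiments in class",
]

def word_counts(responses):
    """Count whitespace-delimited tokens in each response."""
    return [len(r.split()) for r in responses]

sys_wc = word_counts(system_responses)
hum_wc = word_counts(human_responses)

# Welch's t-test: does mean word count differ between conditions?
t_stat, p_value = stats.ttest_ind(hum_wc, sys_wc, equal_var=False)
print(f"system mean={sum(sys_wc)/len(sys_wc):.1f}, "
      f"human mean={sum(hum_wc)/len(hum_wc):.1f}, "
      f"t={t_stat:.2f}, p={p_value:.3f}")
```

With real transcript data, the same comparison extends naturally to other complexity measures (e.g., mean sentence length or type-token ratio).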
Utilizing Natural Language Processing for Automated Assessment of Classroom Discussion
arXiv (Cornell University), 2023
Rigorous and interactive class discussions that support students in high-level thinking and reasoning are essential to learning and are a central component of most teaching interventions. However, formally assessing discussion quality at scale is expensive and infeasible for most researchers. In this work, we experimented with various modern natural language processing (NLP) techniques to automatically generate rubric scores for individual dimensions of classroom text discussion quality. Specifically, we worked with a dataset of 90 classroom discussion transcripts consisting of over 18,000 turns annotated with fine-grained Analyzing Teaching Moves (ATM) codes, and we focused on four Instructional Quality Assessment (IQA) rubrics. Despite the limited amount of data, our work shows encouraging results on some of the rubrics while suggesting room for improvement on the others. We also found that certain NLP approaches work better for certain rubrics.
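As a concrete, much simpler illustration of automatic rubric scoring, the sketch below trains a TF-IDF plus logistic regression baseline to predict a rubric score from transcript text. This is not the paper's reported model, and the transcripts and scores are hypothetical placeholders.

```python
# An illustrative baseline, assuming hypothetical transcripts and scores:
# predict an IQA-style rubric score for a discussion from its raw text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

transcripts = [
    "teacher: why do you think the character changed? student: because ...",
    "teacher: read the next paragraph aloud. student: okay.",
    "teacher: can you support that claim with evidence from the text?",
    "teacher: copy the definition from the board.",
]
rubric_scores = [4, 1, 4, 1]  # hypothetical scores on one IQA dimension

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),  # word and bigram features
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(transcripts, rubric_scores)

new_transcript = ["teacher: what evidence supports your reasoning?"]
print(model.predict(new_transcript))  # predicted rubric score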
Using Formative Conversation-based Assessments to Support Students' English Language Development
2018
In this article, we discuss the use of prototype formative conversation-based assessments designed to measure English learners' language skills. Conversation-based assessments are technology enhanced assessment systems that simulate interactions between a test taker and one or more virtual agents. We discuss preliminary findings from two studies exploring the use of conversation-based assessments to gather evidence of students' English language skills. The findings suggest that conversation-based assessments have the potential to provide useful information to teachers and students about the students' language skills and can be used to enhance and support English language development.
… in honor of Lyle Bourne, Walter …, 2005
What distinguishes the honorees of this Festschrift, including Lyle Bourne, is that they have attempted to solve a three-body problem. Specifically, they have attempted to productively coordinate science, computation, and application. It takes considerable depth, breadth, intelligence, and creativity to solve the three-body problem, much more than is possessed by nearly all of our colleagues in experimental psychology, cognitive science, and discourse processing. Their contributions in the arena of computation have included analytical models in mathematical psychology, statistical algorithms, and computer models with diverse computational architectures. They have designed, implemented, and tested several pioneering applications, including computerized learning systems, automated essay graders, and human-computer interfaces that can be used effectively by humans. The interdisciplinary vision of the three honorees of this Festschrift has profoundly inspired our research agenda, as will be made apparent in this chapter.
Improving the Measurement of Cognitive Skills Through Automated Conversations
Journal of Research on Technology in Education, 2018
Open-ended, short-answer questions, referred to as constructed responses (CRs), allow students to express knowledge and skills in their own words. While CRs can reduce the likelihood of guessing correct answers, they also allow students to provide errant responses due to a lack of knowledge or a misunderstanding of the question. Conversation-based assessments (CBAs) require constructing responses to open-ended prompts (similar to CRs) but also leverage natural language processing to provide adaptive follow-up prompts that target particular information. This work describes an investigation into the potential benefits of CBAs compared with a CR approach. Results from 632 middle school students indicate that, compared with CRs, the CBA items allowed 41% of students to provide a more complete response and improve their score.
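The adaptive follow-up idea can be illustrated with a deliberately simplified sketch. The operational CBA relies on natural language processing to analyze responses; here a plain keyword check stands in for that analysis, and the expected concepts and prompts are hypothetical.

```python
# A minimal sketch, assuming a keyword-based stand-in for the CBA's NLP:
# issue a targeted follow-up for the first expected concept a student's
# constructed response fails to mention. Concepts/prompts are hypothetical.
EXPECTED_CONCEPTS = {
    "evaporation": "You mentioned heat. What happens to the water itself?",
    "condensation": "Where does the water vapor go after it rises?",
}

def follow_up_prompt(response: str) -> str | None:
    """Return a follow-up for the first missing concept, or None if the
    response already covers every expected concept."""
    text = response.lower()
    for concept, prompt in EXPECTED_CONCEPTS.items():
        if concept not in text:
            return prompt
    return None

print(follow_up_prompt("The sun heats the lake and evaporation happens."))
# -> asks about condensation, the concept still missing from the answer
```

A real system would replace the keyword test with semantic matching of the response against the targeted information, but the control flow (detect a gap, prompt for it) is the same.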
Simulated Dialogues with Virtual Agents: Effects of Agent Features in Conversation-Based Assessments
Lecture Notes in Computer Science, 2018
Pedagogical agents are widely employed in intelligent tutoring systems and game-based learning environments, but research suggests that learning benefits from virtual agents vary as a function of agent features (e.g., the form or register of agent dialogue) and student characteristics (e.g., prior knowledge). Students' responses to agent-based conversations provide useful evidence of students' knowledge and skills for assessment purposes; however, it is unclear whether these agent design features and student characteristics similarly influence students' interactions with agents in assessment (versus learning) contexts. In this paper, we explore relationships between agent features and student characteristics within a conversation-based assessment of science inquiry skills. We examined the effects of virtual agents' knowledge status (low vs. high) and language use (comparative vs. argumentative question framing) on agent perceptions (ratings) in a conversation-based assessment of scientific reasoning (i.e., using data to predict the weather). Preliminary results show that the effects of these features on students' perceptions of agents varied as a function of students' background characteristics, consistent with research on learning from agents. Implications for designing agent-based assessments will be discussed.
Intonational cues to student questions in tutoring dialogs
Interspeech 2006
Successful Intelligent Tutoring Systems (ITSs) must be able to recognize when their students are asking a question. They must identify question form as well as function in order to respond appropriately. Our study examines whether intonational features, specifically F0 height and rise range, are useful cues to student question type in a corpus of 643 American English questions. Results show a quantitative effect of both form and function. In addition, among clarification-seeking questions, we observed differences based on the type of clarification being sought.
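For readers who want to experiment with such cues, the sketch below extracts rough analogues of the two features the study examines, F0 height and rise range, using librosa's pYIN pitch tracker. The file name and the utterance-final heuristic are assumptions for illustration, not the study's measurement procedure.

```python
# A rough sketch, assuming a recorded question in "question.wav":
# estimate F0 height and a crude utterance-final rise range.
import numpy as np
import librosa

y, sr = librosa.load("question.wav", sr=None)
f0, voiced_flag, _ = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
f0 = f0[voiced_flag]  # keep voiced frames only

# F0 height: overall pitch level of the utterance.
f0_height = np.nanmean(f0)

# Rise range: F0 excursion over the last 20% of voiced frames, a crude
# proxy for the size of a final rise (an assumption, not the paper's cue).
final = f0[int(0.8 * len(f0)):]
rise_range = np.nanmax(final) - np.nanmin(final)

print(f"F0 height: {f0_height:.1f} Hz, final rise range: {rise_range:.1f} Hz")
```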
Bridging the Gap: Empowering and Educating Today’s Learners in Statistics. Proceedings of the Eleventh International Conference on Teaching Statistics, 2022
Research suggests that "write-to-learn" tasks improve learning outcomes, yet constructed-response methods of formative assessment become unwieldy with large class sizes. This study evaluates natural language processing algorithms to assist this aim. Six short-answer tasks completed by 1,935 students were scored by several human raters using a detailed rubric and by an algorithm. Results indicate substantial inter-rater agreement, measured by quadratic weighted kappa for rater pairs (each QWK > 0.74) and by group consensus (Fleiss' kappa = 0.68). Additionally, intra-rater agreement was estimated for one rater who had scored 178 responses seven years prior (QWK = 0.88). Given this compelling rater agreement, the study then pilots cluster analysis of response text so that instructors can ascribe meaning to clusters, a step toward scalable formative assessment.
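The agreement statistics reported above can be computed on one's own rating data with standard libraries. The sketch below calculates quadratic weighted kappa for a rater pair and Fleiss' kappa for a rater group; the score vectors are hypothetical, standing in for real rubric scores.

```python
# A brief sketch, assuming hypothetical rubric scores on a 0-3 scale:
# compute pairwise QWK and group-level Fleiss' kappa.
import numpy as np
from sklearn.metrics import cohen_kappa_score
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rater_a = [0, 1, 2, 2, 3, 1, 0, 2]   # one rater's rubric scores
rater_b = [0, 1, 2, 3, 3, 1, 1, 2]   # a second rater, same responses

# Quadratic weighted kappa penalizes larger disagreements more heavily.
qwk = cohen_kappa_score(rater_a, rater_b, weights="quadratic")

# Fleiss' kappa: responses x raters matrix -> responses x categories counts.
ratings = np.array([rater_a, rater_b, [0, 1, 2, 2, 3, 1, 0, 3]]).T
table, _ = aggregate_raters(ratings)
fk = fleiss_kappa(table, method="fleiss")

print(f"QWK = {qwk:.2f}, Fleiss' kappa = {fk:.2f}")
```

With real data, the same calls scale directly to the full set of 1,935 students' responses and any number of raters.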