John Stamper | Carnegie Mellon University (original) (raw)

Papers by John Stamper

Proceedings of the Ninth ACM Conference on Learning @ Scale

The first annual workshop on Learnersourcing: Student-generated Content @ Scale is taking place a... more The first annual workshop on Learnersourcing: Student-generated Content @ Scale is taking place at Learning @ Scale 2022. This hybrid workshop will expose attendees to the ample opportunities in the learnersourcing space, including instructors, researchers, learning engineers, and many other roles. We believe participants from a wide range of backgrounds and prior knowledge on learnersourcing can both benefit and contribute to this workshop, as learnersourcing draws on work from education, crowdsourcing, learning analytics, data mining, ML/NLP, and many more fields. Additionally, as the learnersourcing process involves many stakeholders (students, instructors, researchers, instructional designers, etc.), multiple viewpoints can help to inform what future student-generated content might be useful, new and better ways to assess the quality of the content and spark potential collaboration efforts between attendees. We ultimately want to show how everyone can make use of learnersourcing and have participants gain hands-on experience using existing tools, create their own learnersourcing activities using them or their own platforms, and take part in discussing the next challenges and opportunities in the learnersourcing space. Our hope is to attract attendees interested in scaling the generation of instructional and assessment content and those interested in the use of online learning platforms.

Proceedings of the Eighth ACM Conference on Learning @ Scale, 2021

While generating multiple-choice questions has been shown to promote deep learning, students ofte... more While generating multiple-choice questions has been shown to promote deep learning, students often fail to realize this benefit and do not willingly participate in this activity. Additionally, the quality of the student-generated questions may be influenced by both their level of engagement and familiarity with the learning materials. Towards better understanding how students can generate high quality questions, we designed and deployed a multiple-choice question generation activity in seven college-level online chemistry courses. From these courses, we collected data on student interactions and their contribution to the questiongeneration task. A total of 201 students enrolled in the courses and 57 of them elected to generate a multiple-choice question. Our results indicated that students were able to contribute quality questions, with 67% of them being evaluated by experts as acceptable for use. We further identified several student behaviors in the online courses that are correlated to their participation in the task and the quality of their contribution. Our findings can help teachers and students better understand the benefits of student-generated questions and effectively implement future learnersourcing activities.

The number of students that can be helped in a given class period is limited by the time constrai... more The number of students that can be helped in a given class period is limited by the time constraints of the class and the number of agents available for providing help. We use a classroom-replay of previously collected data to evaluate a data-driven method for increasing the number of students that can be helped. We use a machine learning model to identify students who need help in real-time, and an interaction network to group students who need similar help together using approach maps. By assigning these groups of struggling students to peer tutors (as well the instructor), we were able to more than double the number of students helped.

Lecture Notes in Computer Science, 2019

In this research, we explore how expertise is shown in both humans and AI agents. Human experts f... more In this research, we explore how expertise is shown in both humans and AI agents. Human experts follow sets of strategies to complete domain specific tasks while AI agents follow a policy. We compare machine generated policies to human strategies in two game domains, using these examples we show how human strategies can be seen in agents. We believe this work can help lead to a better understanding of human strategies and expertise, while also leading to improved human-centered machine learning approaches. Finally, we hypothesize how a continuous improvement system of humans teaching agents who then teach humans could be created in future intelligent tutoring systems.

We demonstrate that, by using a small set of hand-graded students, we can automatically generate ... more We demonstrate that, by using a small set of hand-graded students, we can automatically generate rubric parameters with a high degree of validity, and that a predictive model incorporating these rubric parameters is more accurate than a previously reported model. We present this method as one approach to addressing the often challenging problem of grading assignments in programming environments. A classic solution is creating unit-tests that the student-generated program must pass, but the rigid, structured nature of unit-tests is suboptimal for assessing more open-ended assignments. Furthermore, the creation of unit-tests requires predicting the various ways a student might correctly solve a problem – a challenging and time-intensive process. The current study proposes an alternative, semi-automated method for generating rubric parameters using low-level data from the Alice programming environment.

Determining the impact of belief bias on everyday reasoning is critical for understanding how our... more Determining the impact of belief bias on everyday reasoning is critical for understanding how our beliefs can influence how we judge arguments. We examined the impact of belief bias on the user’s ability to identify logical fallacies in political arguments. We found that participants had more difficulty identifying logical fallacies in arguments that aligned with their own political beliefs. Interestingly, this effect diminishes with practice. These results suggest that while belief bias is a potential barrier to correctly evaluating everyday arguments, interventions focused on activating rational engagement may mitigate its impact.

Educational data mining is inherently falls into the category of the so-called secondary data ana... more Educational data mining is inherently falls into the category of the so-called secondary data analysis. It is common that data that have been collected for administrative or some other purposes at some point is considered as valuable for other (research) purpose. Collection of the student generated, student behavior and student performance related data on a massive scale in MOOCs, ITSs, LMS and other learning platforms raises various ethical and privacy concerns among researches, policy makers and the general public. This panel is aimed to discuss major challenges in ethics and privacy in EDM and how they are addressed now or should be addressed in the future to prevent any possible harm to the learners. Several experts are invited to discuss the potential and challenges of privacy-preserving EDM, ethics-aware predictive learning analytics, and availability of public benchmark datasets for EDM among others. Proceedings of the 8th International Conference on Educational Data Mining 13

Bayesian Knowledge Tracing [1], Performance Factors Analysis [6], MOOC activity analysis [3], and... more Bayesian Knowledge Tracing [1], Performance Factors Analysis [6], MOOC activity analysis [3], and others) or that have been uploaded to LearnSphere as a custom workflow, and (3) sharing their own analysis workflows with the community of researchers. Without any prior programming experience, researchers can use LearnSphere’s drag-and-drop interface to compare, across alternative analysis methods and across many different datasets, model fit metrics like AIC, BIC, and cross validation as well as parameter estimates themselves.

This mixed panel of different professionals working in EDM will be a conversation about increasin... more This mixed panel of different professionals working in EDM will be a conversation about increasing the connection between research and real-world applications. What’s going on now to scale techniques for use ”out there”in the field? What should researchers, funders, regulators, publishers, trainers, schools/universities and others be doing to get ready for practical work? What’s in the way that we can usefully start work to address? We’ll ask the audience to engage in this conversation as well what’s in your way to moving work from research environments to practically help learners at scale and to generate more useable data at scale? Proceedings of the 8th International Conference on Educational Data Mining 11

A great deal of learning analytics research has focused on what can be achieved by analyzing log ... more A great deal of learning analytics research has focused on what can be achieved by analyzing log data, which can yield important insights about how students learn in online systems. Log data cannot capture all important learning phenomena, especially in open-ended, collaborative, or project-based environments. Collecting and processing/analyzing additional multimodal data streams, however, present many methodological challenges. We describe two datasets from similar collaborative-learning oriented educational technologies deployed in classrooms but with different streams of multimodal data collected. We discuss the differing insights that have resulted from each study, due largely to the specific streams of multimodal data collected. We review the challenges that remain. Finally, we present methods we’ve developed to streamline the temporal alignment and linkage across multiple data streams.

Previous work has demonstrated that in the context of Massively Open Online Courses (MOOCs), doin... more Previous work has demonstrated that in the context of Massively Open Online Courses (MOOCs), doing activities is more predictive of learning than reading text or watching videos (Koedinger et al., 2015). This paper breaks down the general behaviors of reading and watching into finer behaviors, and considers how these finer behaviors may provide evidence for active learning as well. By characterizing learner strategies through patterns in their data, we can evaluate which strategies (or measures of them) are predictive of learning outcomes. We investigated strategies such as page re-reading (active reading) and video watching in response to an incorrect attempt (active watching) and found that they add predictive power beyond mere counts of the amount of doing, reading, and watching.

Proceedings of the 8th International Conference on Learning Analytics and Knowledge, 2018

We demonstrate that, by using a small set of hand-graded student work, we can automatically gener... more We demonstrate that, by using a small set of hand-graded student work, we can automatically generate rubric criteria with a high degree of validity, and that a predictive model incorporating these rubric criteria is more accurate than a previously reported model. We present this method as one approach to addressing the often challenging problem of grading assignments in programming environments. A classic solution is creating unit-tests that the studentgenerated program must pass, but the rigid, structured nature of unit-tests is suboptimal for assessing the more open-ended assignments students encounter in introductory programming environments like Alice. Furthermore, the creation of unit-tests requires predicting the various ways a student might correctly solve a problem-a challenging and time-intensive process. The current study proposes an alternative, semi-automated method for generating rubric criteria using low-level data from the Alice programming environment. CCS CONCEPTS • Applied computing → Computer-assisted instruction; Computermanaged instruction;

Proceedings of the Seventh International Learning Analytics & Knowledge Conference, 2017

K-12 classrooms use block-based programming environments (BBPEs) for teaching computer science an... more K-12 classrooms use block-based programming environments (BBPEs) for teaching computer science and computational thinking (CT). To support assessment of student learning in BBPEs, we propose a learning analytics framework that combines hypothesis-and data-driven approaches to discern students' programming strategies from BBPE log data. We use a principled approach to design assessment tasks to elicit evidence of specific CT skills. Piloting these tasks in high school classrooms enabled us to analyze student programs and video recordings of students as they built their programs. We discuss a priori patterns derived from this analysis to support data-driven analysis of log data in order to better assess understanding and use of CT in BBPEs.

Lecture Notes in Computer Science, 2012

Using the online educational game Battleship Numberline, we have collected over 8 million number ... more Using the online educational game Battleship Numberline, we have collected over 8 million number line estimates from hundreds of thousands of players. Using random assignment, we evaluate the effects of various adaptive sequencing algorithms on player engagement and learning.

In this age of fake news and alternative facts, the need for a citizenry capable of critical thin... more In this age of fake news and alternative facts, the need for a citizenry capable of critical thinking has never been greater. While teaching critical thinking skills in the classroom remains an enduring challenge, research on an ill-defined domain like critical thinking in the educational technology space is even more scarce. We propose a difficulty factors assessment (DFA) to explore two factors that may make learning to identify fallacies more difficult: type of instruction and belief bias. This study will allow us to make two key contributions. First, we will better understand the relationship between sense-making and induction when learning to identify informal fallacies. Second, we will contribute to the limited work examining the impact of belief bias on informal (rather than formal) reasoning. The results of this DFA will also be used to improve the next iteration of our fallacy tutor, which may ultimately contribute to a computational model of informal fallacies.

Lecture Notes in Computer Science, 2019

When students are given agency in playing and learning from a digital learning game, how do their... more When students are given agency in playing and learning from a digital learning game, how do their decisions about sequence of gameplay impact learning and enjoyment? We explored this question in the context of Decimal Point, a math learning game that teaches decimals to middle-school students. Our analysis is based on students in a high-agency condition, those who can choose the order of gameplay, as well as when to stop. By clustering student mini-game sequences by edit distance-the number of edit operations to turn one sequence into another-we found that, among students who stopped early, those who deviated more from a canonical game sequence reported higher enjoyment than those who did not. However, there were no differences in learning gains. Our results suggest that students who can self-regulate and exercise agency will enjoy the game, but the type and number of choices may also have an impact on enjoyment factors. At the same time, more investigation into the amount and means of delivering instruction to maximize learning efficiency within the game is necessary. We conclude by discussing digital learning game design lessons to create a game that more closely aligns with students' learning needs and affective states.

Lecture Notes in Computer Science, 2017

Bayesian Knowledge Tracing (BKT) has been employed successfully in intelligent learning environme... more Bayesian Knowledge Tracing (BKT) has been employed successfully in intelligent learning environments to individualize curriculum sequencing and help messages. Standard BKT employs four parameters, which are estimated separately for individual knowledge components, but not for individual students. Studies have shown that individualizing the parameter estimates for students based on existing data logs improves goodness of fit and leads to substantially different practice recommendations. This study investigates how well BKT parameters in a tutor lesson can be individualized ahead of time, based on learners' prior activities, including reading text and completing prior tutor lessons. We find that directly applying best-fitting individualized parameter estimates from prior tutor lessons does not appreciably improve BKT goodness of fit for a later tutor lesson, but that individual differences in the later lesson can be effectively predicted from measures of learners' behaviors in reading text and in completing the prior tutor lessons.

In recent years, educational data mining has emerged as a burgeoning new area for scientific inve... more In recent years, educational data mining has emerged as a burgeoning new area for scientific investigation. One reason for the emerging excitement about educational data mining is the increasing availability of fine-grained, extensive, and longitudinal data on student learning. These data come from many sources, including standardized tests combined with student

Proceedings of the Ninth ACM Conference on Learning @ Scale

Proceedings of the Eighth ACM Conference on Learning @ Scale, 2021

Lecture Notes in Computer Science, 2019

Proceedings of the 8th International Conference on Learning Analytics and Knowledge, 2018

Proceedings of the Seventh International Learning Analytics & Knowledge Conference, 2017

Lecture Notes in Computer Science, 2012

Lecture Notes in Computer Science, 2019

Lecture Notes in Computer Science, 2017