Domain Model Discovery from Textbooks for Computer Programming Intelligent Tutors
Related papers
International Journal of Computational Linguistics and Applications, 2014
The modern digital world offers an enormous amount of data on the Web, easily accessible anywhere and anytime. This ease of access also creates new paradigms of education and learning. Modern-day learners have access to a great many learning materials, including some of the best created anywhere in the world. However, despite the abundant availability of material, we still lack appropriate systems that can automatically identify the learning needs of a user and present them with the most relevant (and best-quality) material to pursue. This paper presents our algorithmic design towards this goal. We propose a text processing-based system that works in three phases: (a) identifying the learning needs of a learner; (b) retrieving relevant materials and ranking them; and (c) presenting material to the learner and monitoring the learning process. We use know-how from text processing, information retrieval, recommender systems and educational psychology to present useful and relevant learning material (including slides, videos, articles etc.) to a learner in a focused subject domain. Our initial experiments have produced promising results. We are working towards a Web-scale deployment of the system.
Extracting learning concepts from educational texts in intelligent tutoring systems automatically
Expert Systems with Applications, 2010
This paper argues that educational support systems can assign semantic meaning to educational content, and it seeks an answer to the question of "what should be taught to students" in the field of intelligent tutoring systems. With this aim, a system that automatically detects the concepts to be learned by students has been designed. The developed system uses statistical language models together with conceptual map modeling as a student model to extract the minimal set of learning concepts within an educational content. In the study, ten corpora have been generated as a learning domain, covering two different subjects in mathematics. For each subject, five distinct chapters have been taken from books written by various authors. After extracting the candidate concepts from the given content, the system checks in a postprocessing step whether these candidates appear in a dictionary. The dictionary consists of approximately 9500 technical terms related to the learning domain. The system performance has also been analyzed using Recall, Precision and F-measure scores. The results indicate that the postprocessing step increases precision with a small loss of recall.
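The trade-off reported in the abstract above (higher precision, slightly lower recall after dictionary filtering) can be illustrated with a minimal sketch. The function names and the toy concept lists are hypothetical and are not taken from the paper; the sketch only shows how the three scores are conventionally computed over sets of extracted concepts.

```python
def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for two collections of concept strings."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def filter_by_dictionary(candidates, dictionary):
    """Postprocessing step: keep only candidates found in the term dictionary."""
    return [c for c in candidates if c in dictionary]


# Hypothetical toy data (assumption, for illustration only).
gold = {"derivative", "limit", "integral", "continuity"}
candidates = ["derivative", "limit", "integral", "continuity", "chapter", "example"]
# The dictionary lacks "continuity", so filtering drops one correct concept.
dictionary = {"derivative", "limit", "integral", "matrix"}

before = precision_recall_f1(candidates, gold)
after = precision_recall_f1(filter_by_dictionary(candidates, dictionary), gold)
print("before filtering:", before)
print("after filtering: ", after)
```

In this toy case the filter removes the two noise terms, which raises precision, but it also removes one correct concept that the dictionary does not cover, which lowers recall.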
Automatic Concept Extraction for Domain and Student Modeling in Adaptive Textbooks
International Journal of Artificial Intelligence in Education, 2020
The increasing popularity of digital textbooks as a new learning medium has resulted in a growing interest in developing a new generation of adaptive textbooks that can help readers to learn better by adapting to the readers' learning goals and current state of knowledge. These adaptive textbooks are most frequently powered by internal knowledge models, which associate a list of unique domain knowledge concepts with each section of the textbook. With this kind of concept-level knowledge representation, a number of intelligent operations can be performed, including student modeling, adaptive navigation support, and content recommendation. However, manual indexing of each textbook section with concepts is challenging, time-consuming, and prone to errors. Modern research in the area of natural language processing offers an attractive alternative, called automatic keyphrase extraction. While a range of keyphrase and concept extraction methods have been developed over the last twenty years, few of the known approaches have been applied and evaluated in a textbook context. In this paper, we present FACE, a supervised feature-based machine learning method for automatic concept extraction from digital textbooks. This method has been created for building domain and student models that form the core of intelligent textbooks. We evaluated FACE on a newly constructed full-scale dataset by assessing how well it approximates concept annotations produced by human experts and how well it supports the needs of student modeling. The results show that FACE outperforms several state-of-the-art keyphrase extraction methods.
Mining Relevant Examples for Learning in ITS Student Models
2014 IEEE International Conference on Computer and Information Technology, 2014
An Intelligent Tutoring System (ITS) provides direct customized instruction or feedback to students while they perform a task, without the intervention of a human. One of the main functions of an ITS is to present its students with the course materials, examples being one such material, that are most appropriate to their current knowledge of domain concepts. ITSs typically compare and analyze student model (SM) components to determine the student's current knowledge of the concepts (main topics, e.g. scanf in C programming) that are required to understand the next example (e.g. code for scanf) suitable for learning a task (e.g. write C code to read 2 integers from the keyboard). Existing systems such as NavEx and PADS perform an exhaustive matching of student knowledge level with all examples in the database. This research proposes a task-based technique for managing and classifying examples for more effective retrieval of relevant examples for learning a task. We propose a system called EASK for translating task and example solutions into concepts for similarity matching, which is more readily available, easily extensible and adaptable to other domains. Examples and tasks are represented as vectors of weights, computed with the TF-IDF term-weighting measure, that signify the importance of a concept for an example. Examples most similar to a task are found using the k-NN classification method, which measures the closeness between objects such as examples and tasks with the cosine similarity measure and selects the k objects (examples) with the highest similarity scores. As a by-product, k-NN also predicts the class label (difficulty level) of the task. Our proposed model achieves this prediction with 89% accuracy.
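The retrieval pipeline described in the abstract above (TF-IDF vectors, cosine similarity, k-NN with a majority-vote label) can be sketched in a few lines. This is not the EASK implementation: the function names, the smoothed IDF formula, and the toy examples with their difficulty labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def inverse_document_frequency(documents):
    """Smoothed IDF per term (assumed variant: log((1 + N) / (1 + df)) + 1)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Sparse TF-IDF vector for one list of concept tokens."""
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items() if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def k_nearest(task_tokens, examples, k=3):
    """Return (indices of the k most similar examples, majority-vote label)."""
    idf = inverse_document_frequency([tokens for tokens, _ in examples])
    task_vec = tfidf(task_tokens, idf)
    scored = sorted(
        ((cosine(task_vec, tfidf(tokens, idf)), i)
         for i, (tokens, _) in enumerate(examples)),
        reverse=True,
    )
    top = [i for _, i in scored[:k]]
    label = Counter(examples[i][1] for i in top).most_common(1)[0][0]
    return top, label


# Hypothetical examples: (concepts in the solution, difficulty label).
examples = [
    (["scanf", "int", "printf"], "easy"),
    (["scanf", "int", "int", "printf"], "easy"),
    (["for", "array", "printf"], "medium"),
    (["pointer", "malloc", "struct"], "hard"),
]
# Task: read 2 integers from the keyboard.
task = ["scanf", "int", "int"]

neighbours, predicted_label = k_nearest(task, examples, k=2)
print(neighbours, predicted_label)
```

The two examples that share scanf and int with the task rank highest, and their common label becomes the predicted difficulty of the task.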
Automatic extraction of notions from course material
2008
Formally defining the knowledge units taught in a course helps instructors ensure a sound coverage of topics and provides an objective basis for comparing the content of two courses. The main issue is to list and define the course concepts, down to basic knowledge units. Ontology learning techniques can help partially automate the process by extracting information from existing materials such as slides and textbooks. The TrucStudio course planning tool, discussed in this article, provides such support and relies on Text2Onto to extract concepts from course material. We conducted experiments on two different programming courses to assess the quality of the results.
Using Large Language Models to Automatically Identify Programming Concepts in Code Snippets
Proceedings of the 2023 ACM Conference on International Computing Education Research - Volume 2
Curating course material that aligns with students' learning goals is a challenging and time-consuming task that instructors undergo when preparing their curricula. For instance, it is a challenge to find multiple-choice questions or example codes that demonstrate recursion in an unlabeled question bank or repository. Recently, Large Language Models (LLMs) have demonstrated the capability to generate high-quality learning materials at scale. In this poster, we use LLMs to identify programming concepts found within code snippets, allowing instructors to quickly curate their course materials. We compare programming concepts generated by LLMs with concepts generated by experts to see the extent to which they agree. The agreement was calculated using Cohen's Kappa.
CCS Concepts: Social and professional topics → Computing education; Computing methodologies → Natural language generation.
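For readers unfamiliar with the agreement statistic named in the abstract above, Cohen's Kappa corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch follows; the label lists are hypothetical and do not come from the poster.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two equal-length lists of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels: does each of 8 snippets demonstrate recursion?
expert = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
llm = ["yes", "yes", "no", "no", "no", "no", "yes", "no"]

kappa = cohens_kappa(expert, llm)
print(round(kappa, 3))
```

Here the raters agree on 6 of 8 snippets (0.75), but because both label most snippets "no", chance agreement is already about 0.53, so the kappa value is considerably lower than the raw agreement.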
Zenodo (CERN European Organization for Nuclear Research), 2022
Domain modeling is a central component in education technologies as it represents the target domain students are supposed to train on and eventually master. Automatically generating domain models can lead to substantial cost and scalability benefits. Automatically extracting key concepts or knowledge components from, for instance, textbooks can enable the development of automatic or semi-automatic processes for creating domain models. We explore in this work the use of transformer-based pre-trained models for the task of keyphrase extraction. Specifically, we investigate and evaluate four different variants of BERT, a pre-trained transformer-based architecture, that vary in terms of training data, training objective, or training strategy to extract knowledge components from textbooks for the domain of intro-to-programming. We report results obtained using the following BERT-based models: BERT, CodeBERT, SciBERT and RoBERTa.
Edu-APCCM: Automatic Programming Code Constructs Mining from Learning Content
International Journal of Engineering and Advanced Technology
The current education ecosystem is moving towards centralized online blended learning. Online learning repositories have replaced traditional libraries. Learning repositories contain learning materials, which can be located with the help of associated metadata. Associating metadata with the content (definition, program, example, figure, and table) of an individual learning concept (topic) in the learning material also leads to better search. If a student knows the prerequisites of the topic s/he wants to learn, then the study of the current topic will be more fruitful. The prerequisites of a computer science topic can be obtained from its explanation and the programming code snippet used for its implementation. This paper proposes a metadata element, "code construct as a prerequisite of a code snippet". For example, "recursion and function call are prerequisites to understand the recursive module of binary tree traversal". It also proposes the framework to automatically identify, extract and present th...
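The idea of treating code constructs as prerequisites of a snippet can be illustrated with a small sketch. This is not the Edu-APCCM framework, whose details are not given in the abstract above; it is an assumed stand-in that uses Python's standard `ast` module to list a few constructs in a Python snippet. The construct names and the `code_constructs` helper are hypothetical, and recursion is detected only in the simple case of a function calling itself directly by name.

```python
import ast


def code_constructs(source):
    """Return the set of construct names found in a Python snippet."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            found.add("function definition")
            # Direct recursion only: the function calls itself by name.
            for inner in ast.walk(node):
                if (isinstance(inner, ast.Call)
                        and isinstance(inner.func, ast.Name)
                        and inner.func.id == node.name):
                    found.add("recursion")
        elif isinstance(node, ast.Call):
            found.add("function call")
        elif isinstance(node, (ast.For, ast.While)):
            found.add("loop")
        elif isinstance(node, ast.If):
            found.add("conditional")
    return found


# Hypothetical snippet: in-order traversal of a binary tree.
snippet = """
def inorder(node):
    if node is None:
        return
    inorder(node.left)
    print(node.value)
    inorder(node.right)
"""

constructs = code_constructs(snippet)
print(sorted(constructs))
```

For the traversal snippet, the detected constructs include recursion and function call, matching the prerequisite example quoted in the abstract.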
Domain model relations discovering in educational texts based on user created annotations
2011
A domain model with its metadata is an essential part of every adaptive educational system. At the same time, it is often a bottleneck, as quality metadata is an essential requirement and its manual creation is difficult or, to some extent, even impossible. Any effort towards automated acquisition of metadata is therefore crucial for effective learning supported by an educational system. In this paper we propose a method of discovering relations in educational texts using annotations created by users (learners).