AI-Assisted Co-Creation
Authors
- Stanislav Pozdniakov The University of Queensland https://orcid.org/0000-0003-4451-9181
- Jonathan Brazil The University of Queensland https://orcid.org/0000-0002-6669-2076
- Mehrnoush Mohammadi The University of Queensland https://orcid.org/0000-0001-9596-7414
- Mollie Dollinger Curtin University https://orcid.org/0000-0003-1105-9051
- Shazia Sadiq The University of Queensland https://orcid.org/0000-0001-6739-4145
- Hassan Khosravi The University of Queensland https://orcid.org/0000-0001-8664-6117
DOI: https://doi.org/10.18608/jla.2025.8601
Keywords: co-creation, feedback, generative AI, learning analytics, research paper
Abstract
Engaging students in creating high-quality, novel content, such as educational resources, promotes deep and higher-order learning. However, students often lack the training or knowledge needed to produce such content. To address this gap, this paper explores the potential of incorporating generative AI (GenAI) to review students’ work and provide real-time feedback and assistance during content creation. Specifically, we use RiPPLE, which enables students to create bite-sized learning resources and incorporates instant GenAI feedback that highlights strengths and suggests improvements to enhance quality. The AI reviews each resource and provides feedback comprising three main components: a summary of the resource, a list of strengths, and suggestions for improvement. We evaluate this approach by analyzing log data from 1063 student-created multiple-choice questions (MCQs) and the corresponding AI feedback, aiming to understand the depth, scope, and tone of the feedback as well as how students engage with and use it during content creation. Additionally, we examined the perceived helpfulness of the GenAI feedback through 3324 student ratings and thematically analyzed 601 comments students provided about the feedback. Our findings demonstrate the potential value of AI-generated feedback for students when it is integrated into pedagogical design. Our analysis suggests not only that AI-generated feedback can offer students a breadth of feedback to improve their writing and/or discipline-specific content knowledge, but also that it is largely well received by students for both its clarity and its positive tone. Despite challenges in ensuring the accuracy of AI-generated feedback, this study shows how such feedback can enable students to make actionable improvements to their work.