Neural Networks or Linguistic Features? - Comparing Different Machine-Learning Approaches for Automated Assessment of Text Quality Traits Among L1- and L2-Learners’ Argumentative Essays
Julian F. Lohmann, Fynn Junge, Jens Möller, Johanna Fleckenstein, Ruth Trüb, et al.
International Journal of Artificial Intelligence in Education, 2025
Recent investigations in automated essay scoring research imply that hybrid models, which combine feature engineering and the powerful tools of deep neural networks (DNNs), reach state-of-the-art performance. However, most of these findings are from holistic scoring tasks. In the present study, we use a total of four prompts from two different corpora consisting of both L1 and L2 learner essays annotated with trait scores (e.g., content, organization, and language quality). In our main experiments, we compare three variants of trait-specific models using different inputs: (1) models based on 220 linguistic features, (2) models using essay-level contextual embeddings from the distilled version of the pre-trained transformer BERT (DistilBERT), and (3) a hybrid model using both types of features. Results imply that when trait-specific models are trained based on a single resource, the feature-based models slightly outperform the embedding-based models. These differences are most prominent for the organization traits. The hybrid models outperform the single-resource models, indicating that linguistic features and embeddings indeed capture partially different aspects relevant for the assessment of essay traits. To gain more insights into the interplay between both feature types, we run addition and ablation tests for individual feature groups. Trait-specific addition tests across prompts indicate that the embedding-based models can most consistently be enhanced in content assessment when combined with morphological complexity features. Most consistent performance gains in the organization traits are achieved when embeddings are combined with length features, and most consistent performance gains in the assessment of the language traits when combined with lexical complexity, error, and occurrence features. Cross-prompt scoring again reveals slight advantages for the feature-based models.
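As a rough illustration of the three input variants and of the addition and ablation tests described in the abstract above, the following minimal sketch builds the model inputs by plain concatenation. It is one plausible reading, not the authors' implementation: the function names, the group names, and the toy 4-dimensional embedding are all hypothetical, and the sketch stops before any model is trained.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def build_input(embedding: Sequence[float],
                feature_groups: Dict[str, Sequence[float]],
                groups: Sequence[str]) -> Vector:
    """Concatenate an essay embedding with the selected feature groups."""
    vec = list(embedding)
    for name in groups:
        vec.extend(feature_groups[name])
    return vec


def addition_inputs(embedding: Sequence[float],
                    feature_groups: Dict[str, Sequence[float]]) -> Dict[str, Vector]:
    """Addition test (assumed form): embedding plus exactly one feature group."""
    return {name: build_input(embedding, feature_groups, [name])
            for name in feature_groups}


def ablation_inputs(embedding: Sequence[float],
                    feature_groups: Dict[str, Sequence[float]]) -> Dict[str, Vector]:
    """Ablation test (assumed form): full hybrid input minus one feature group."""
    names = list(feature_groups)
    return {left_out: build_input(embedding, feature_groups,
                                  [n for n in names if n != left_out])
            for left_out in names}


# Toy values for a single essay; a real embedding and feature set would be far larger.
embedding = [0.1, -0.3, 0.7, 0.2]
feature_groups = {
    "length": [412.0, 18.0],
    "lexical_complexity": [0.61],
    "error": [7.0, 2.0, 1.0],
}

feature_only = build_input([], feature_groups, list(feature_groups))    # variant (1)
embedding_only = build_input(embedding, feature_groups, [])             # variant (2)
hybrid = build_input(embedding, feature_groups, list(feature_groups))   # variant (3)
added = addition_inputs(embedding, feature_groups)
ablated = ablation_inputs(embedding, feature_groups)
```

Each resulting vector would then be passed to a trait-specific scoring model; comparing scores across the `added` and `ablated` variants is what isolates the contribution of one feature group.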
(De)motivating Zero-Performing Students With Negative Feedback: Does the Salience of Performance Information Matter?
Marlene Steinbach, Johanna Fleckenstein, Livia Kuklick, Jennifer Meyer
Journal of Computer Assisted Learning, 2025
Background: Providing students with information on their current performance could help them improve by stimulating their reflection, but negative feedback that saliently mirrors task‐related failure can harm motivation. In the context of automated scoring based on artificial intelligence, we explored how feedback on written texts might be designed to be least detrimental for zero‐performing students who are likely to receive negative feedback frequently and might suffer from its motivational consequences.
Objectives: This experiment set out to investigate whether making the negative performance information in automated feedback messages less salient reduces the potential threat of negative feedback for zero‐performing students' task‐specific self‐concept, intrinsic value, and performance.
Methods: A sample of 105 (M_age = 13.97 years) zero‐performing students received negative feedback with either more or less salient performance information after completing an English writing task. We used regression analysis to examine pre–post effects and group differences in self‐concept, intrinsic value, and performance.
Results and Conclusions: The analyses showed that zero‐performing students' performance improved but their self‐concept and intrinsic value declined over the course of two writing tasks, with feedback provided after the initial task. Contrary to expectations, our findings showed that students' task‐specific self‐concept and intrinsic value declined more in the condition with less salient performance information (i.e., without a red cross as a salient visual performance cue). Our findings highlight the motivational potential of performance information and are discussed in terms of the need for further research into how negative feedback can be designed to effectively motivate and support zero‐performing learners.
Sequence Tagging in EFL Email Texts as Feedback for Language Learners
Proceedings of the 12th Workshop on Natural Language Processing for Computer Assisted Language Learning (NLP4CALL 2023), 2023
Measuring Task-Level Behavioral Learning Engagement During Text Revision R Schiller, J Fleckenstein, U Mertens, J Meyer Computers & Education, 105656, 2026
The Future of Feedback: How Can AI Help Transform Feedback to Be More Engaging, Effective, and Scalable? J Meyer, O Köller, T Jansen, J Fleckenstein, MW Asher, S Bichler, ... arXiv preprint arXiv:2603.12463, 2026
On the role of engagement in automated feedback effectiveness: Insights from keystroke logging R Schiller, J Fleckenstein, L Höft, A Horbach, J Meyer Computers & Education 238, 105386, 2025 Citations: 7
Self-assessment accuracy in the age of artificial intelligence: Differential effects of LLM-generated feedback LW Liebenow, FTC Schmidt, J Meyer, J Fleckenstein Computers & Education 237, 105385, 2025 Citations: 15
Data extraction by generative artificial intelligence: Assessing determinants of accuracy using human-extracted data from systematic review databases. T Jansen, LW Liebenow, U Mertens, FTC Schmidt, JF Lohmann, ... Psychological Bulletin 151 (10), 1280, 2025 Citations: 15
Neural networks or linguistic features? - Comparing different machine-learning approaches for automated assessment of text quality traits among L1- and L2-learners’ argumentative … JF Lohmann, F Junge, J Möller, J Fleckenstein, R Trüb, S Keller, T Jansen, ... International Journal of Artificial Intelligence in Education 35 (3), 1178-1217, 2025 Citations: 9
Testing teacher judgments comprehensively: Accuracy, halo, frame of reference, strategy, and personality effects in holistic and analytic assessments of student essays. JF Lohmann, F Lötscher, F Junge, S Keller, T Jansen, J Fleckenstein, ... Journal of Educational Psychology, 2025 Citations: 2
“Can (A) I do this task?” The role of AI as a socializer of students' self-beliefs of their abilities T Jansen, J Meyer, J Fleckenstein, A Wigfield, J Möller Learning and Individual Differences 122, 102731, 2025 Citations: 8
(De)motivating Zero‐Performing Students With Negative Feedback: Does the Salience of Performance Information Matter? M Steinbach, J Fleckenstein, L Kuklick, J Meyer Journal of Computer Assisted Learning 41 (4), e70070, 2025 Citations: 2
Nonengagement and unsuccessful engagement with feedback in lower secondary education: The role of student characteristics J Meyer, T Jansen, J Fleckenstein Contemporary Educational Psychology 81, 102363, 2025 Citations: 22
Understanding individual differences in students’ responses to technology-based feedback on a writing task: the role of achievement motives and initial task performance J Meyer, T Jansen, M Daumiller, J Fleckenstein Journal of Research on Technology in Education, 1-31, 2025 Citations: 8
LLM feedback for academic writing: Effects on students’ performance and engagement R Glüsing, J Fleckenstein, F Schmidt, J Möller Available at SSRN 5445319, 2025 Citations: 3
Negative Feedback: Does the Salience of Performance Information Matter? M Steinbach, J Fleckenstein, L Kuklick, J Meyer, 2025
Understanding the effectiveness of automated feedback: Using process data to uncover the role of behavioral engagement R Schiller, J Fleckenstein, U Mertens, A Horbach, J Meyer Computers & Education 223, 105163, 2024 Citations: 30
How am I going? Behavioral engagement mediates the effect of individual feedback on writing performance J Fleckenstein, T Jansen, J Meyer, R Trüb, EE Raubach, SD Keller Learning and Instruction 93, 101977, 2024 Citations: 22
Language quality, content, structure: What analytic ratings tell us about EFL writing skills at upper secondary school level in Germany and Switzerland SD Keller, J Lohmann, R Trüb, J Fleckenstein, J Meyer, T Jansen, J Möller Journal of Second Language Writing 65, 101129, 2024 Citations: 19
Two-way immersion promotes additional language learning: performance of bilingual sixth-grade students in English as a third language S Preusler, J Fleckenstein, S Zitzmann, J Baumert, J Möller International Journal of Bilingual Education and Bilingualism 27 (7), 910-922, 2024 Citations: 7
Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays J Fleckenstein, J Meyer, T Jansen, SD Keller, O Köller, J Möller Computers and Education: Artificial Intelligence 6, 100209, 2024 Citations: 225
Using LLMs to bring evidence-based feedback into the classroom: AI-generated feedback increases secondary students’ text revision, motivation, and positive emotions J Meyer, T Jansen, R Schiller, LW Liebenow, M Steinbach, A Horbach, ... Computers and Education: Artificial Intelligence 6, 100199, 2024 Citations: 486
Empirische Arbeit: Comparing generative AI and expert feedback to students’ writing: insights from student teachers T Jansen, L Höft, L Bahr, J Fleckenstein, J Möller, O Köller, J Meyer Psychologie in Erziehung und Unterricht 71 (2), 80-92, 2024 Citations: 67
MOST CITED SCHOLAR PUBLICATIONS
Using LLMs to bring evidence-based feedback into the classroom: AI-generated feedback increases secondary students’ text revision, motivation, and positive emotions J Meyer, T Jansen, R Schiller, LW Liebenow, M Steinbach, A Horbach, ... Computers and Education: Artificial Intelligence 6, 100199, 2024 Citations: 486
Measuring grit FTC Schmidt, J Fleckenstein, J Retelsdorf, L Eskreis-Winkler, J Möller European Journal of Psychological Assessment, 2017 Citations: 306
Same same, but different? Relations between facets of conscientiousness and grit FTC Schmidt, G Nagy, J Fleckenstein, J Möller, J Retelsdorf European Journal of Personality 32 (6), 705-720, 2018 Citations: 227
Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays J Fleckenstein, J Meyer, T Jansen, SD Keller, O Köller, J Möller Computers and Education: Artificial Intelligence 6, 100209, 2024 Citations: 225
Expectancy value interactions and academic achievement: Differential relationships with achievement measures J Meyer, J Fleckenstein, O Köller Contemporary Educational Psychology 58, 58-74, 2019 Citations: 204
Automated feedback and writing: a multi-level meta-analysis of effects on students' performance J Fleckenstein, L Liebenow, J Meyer Frontiers in Artificial Intelligence 6, 2023 Citations: 171
The relationship of personality traits and different measures of domain-specific achievement in upper secondary education J Meyer, J Fleckenstein, J Retelsdorf, O Köller Learning and Individual Differences 69, 45-59, 2019 Citations: 126
The long‐term proficiency of early, middle, and late starters learning English as a foreign language at school: A narrative review and empirical study J Baumert, J Fleckenstein, M Leucht, O Köller, J Möller Language Learning 70 (4), 1091-1135, 2020 Citations: 94
Linking TOEFL iBT® writing rubrics to CEFR levels: Cut scores and validity evidence from a standard setting study J Fleckenstein, S Keller, M Krüger, RJ Tannenbaum, O Köller Assessing Writing 43, 100420, 2020 Citations: 80
Erfolgreich integrieren - die Staatliche Europa-Schule Berlin J Möller, F Hohenstein, J Fleckenstein, O Köller, J Baumert Waxmann Verlag, 2017 Citations: 73
Empirische Arbeit: Comparing generative AI and expert feedback to students’ writing: insights from student teachers T Jansen, L Höft, L Bahr, J Fleckenstein, J Möller, O Köller, J Meyer Psychologie in Erziehung und Unterricht 71 (2), 80-92, 2024 Citations: 67
Is a long essay always a good essay? The effect of text length on writing assessment J Fleckenstein, J Meyer, T Jansen, S Keller, O Köller Frontiers in Psychology 11, 562462, 2020 Citations: 67
English writing skills of students in upper secondary education: Results from an empirical study in Switzerland and Germany SD Keller, J Fleckenstein, M Krüger, O Köller, AA Rupp Journal of Second Language Writing 48, 100700, 2020 Citations: 63
Pädagogische und didaktische Anforderungen an die häusliche Aufgabenbearbeitung O Köller, J Fleckenstein, K Guill, J Meyer „Langsam vermisse ich die Schule…“. Schule während und nach der Corona …, 2020 Citations: 48
Conscientiousness and cognitive ability as predictors of academic achievement: Evidence of synergistic effects from integrative data analysis J Meyer, O Lüdtke, FTC Schmidt, J Fleckenstein, U Trautwein, O Köller European Journal of Personality 38 (1), 36-52, 2024 Citations: 46
Teachers’ judgement accuracy concerning CEFR levels of prospective university students J Fleckenstein, M Leucht, O Köller Language Assessment Quarterly 15 (1), 90-101, 2018 Citations: 40
Wer hat Biss? Beharrlichkeit und beständiges Interesse von Lehramtsstudierenden J Fleckenstein, FTC Schmidt, J Möller Psychologie in Erziehung und Unterricht 61 (4), 281-286, 2014 Citations: 40
Mehrsprachigkeit als Ressource J Fleckenstein, J Möller, J Baumert Zeitschrift für Erziehungswissenschaft 21 (1), 97-120, 2018 Citations: 38
Proficient beyond borders: assessing non-native speakers in a native speakers’ framework J Fleckenstein, M Leucht, HA Pant, O Köller Large-scale Assessments in Education 4 (1), 19, 2016 Citations: 38
Promoting mathematics achievement in one-way immersion: Performance development over four years of elementary school J Fleckenstein, SK Gebauer, J Möller Contemporary Educational Psychology 56, 228-235, 2019 Citations: 37