Generating Explanations in Medical Question-Answering by Expectation Maximization Inference over Evidence Wei Sun, Mingxiao Li, Damien Sileo, Jesse Davis, Marie-Francine Moens ACM Transactions on Computing for Healthcare, 2025 Medical Question Answering (medical QA) systems play an essential role in assisting healthcare workers in finding answers to their questions. However, it is not sufficient for medical QA systems to merely provide answers, because users might want explanations, that is, more analytic statements in natural language that describe the elements and context that support the answer. To this end, we propose a novel approach for generating natural language explanations for answers predicted by medical QA systems. As high-quality medical explanations require additional medical knowledge, our system extracts knowledge from medical textbooks to enhance the quality of explanations during the explanation generation process. Concretely, we designed an Expectation-Maximization approach that makes inferences about the evidence found in these texts, offering an efficient way to focus attention on lengthy evidence passages. Experimental results on two datasets, MQAE-diag and MQAE, demonstrate the effectiveness of our framework for reasoning with textual evidence. Our approach outperforms state-of-the-art models, achieving significant improvements of 6.13 and 5.47 percentage points on the ROUGE-L score and 6.49 and 5.28 percentage points on the BLEU-4 score on the MQAE-diag and MQAE datasets, respectively.
Bridging the Data Provenance Gap Across Text, Speech, and Video 13th International Conference on Learning Representations, ICLR 2025, 2025
A large-scale audit of dataset licensing and attribution in AI Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, et al. Nature Machine Intelligence, 2024 The race to train language models on vast, diverse and inconsistently documented datasets raises pressing legal and ethical concerns. To improve data transparency and understanding, we convene a multi-disciplinary effort between legal and machine learning experts to systematically audit and trace more than 1,800 text datasets. We develop tools and standards to trace the lineage of these datasets, including their source, creators, licences and subsequent use. Our landscape analysis highlights sharp divides in the composition and focus of data licenced for commercial use. Important categories including low-resource languages, creative tasks and new synthetic data all tend to be restrictively licenced. We observe frequent miscategorization of licences on popular dataset hosting sites, with licence omission rates of more than 70% and error rates of more than 50%. This highlights a crisis in misattribution and informed use of popular datasets driving many recent breakthroughs. Our analysis of data sources also explains the application of copyright law and fair use to finetuning data. As a contribution to continuing improvements in dataset transparency and responsible use, we release our audit, with an interactive user interface, the Data Provenance Explorer, to enable practitioners to trace and filter on data provenance for the most popular finetuning data collections: www.dataprovenance.org.
tasksource: A Large Collection of NLP tasks with a Structured Dataset Preprocessing Framework 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024, Main Conference Proceedings, 2024
Generating Multiple-Choice Questions for Medical QA with Distractors and Cue-Masking 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024, Main Conference Proceedings, 2024
DISRPT: A Multilingual, Multi-domain, Cross-framework Benchmark for Discourse Processing 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024, Main Conference Proceedings, 2024
Consent in Crisis: The Rapid Decline of the AI Data Commons Advances in Neural Information Processing Systems, 2024
Analysis and Prediction of NLP models via Task Embeddings 2022 Language Resources and Evaluation Conference, LREC 2022, 2022
Zero-Shot Recommendation as Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2022
A Pragmatics-Centered Evaluation Framework for Natural Language Understanding 2022 Language Resources and Evaluation Conference, LREC 2022, 2022
DiscSense: Automated semantic analysis of discourse markers LREC 2020, 12th International Conference on Language Resources and Evaluation, Conference Proceedings, 2020
Mining discourse markers for unsupervised sentence representation learning Damien Sileo, Tim Van De Cruys, Camille Pradel, Philippe Muller NAACL-HLT 2019, 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference, 2019
Semantic role analysis for automatic summarization Extraction et Gestion des Connaissances, EGC 2018, 2018
RECENT SCHOLAR PUBLICATIONS
Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models G Loiseau, D Sileo, D Riquet, M Meyer, M Tommasi arXiv preprint arXiv:2603.29497, 2026
Reasoning Core: A Scalable Procedural Data Generation Suite for Symbolic Pre-training and Post-Training V Lacombe, V Quesnel, D Sileo arXiv preprint arXiv:2603.02208, 2026 Citations: 3
Logic Haystacks: Probing LLMs’ Long-Context Logical Reasoning (Without Easily Identifiable Unrelated Padding) D Sileo Proceedings of the 19th Conference of the European Chapter of the …, 2026 Citations: 1
Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization G Loiseau, D Sileo, D Riquet, M Meyer, M Tommasi arXiv preprint arXiv:2602.20743, 2026
A benchmark of expert-level academic questions to assess AI capabilities Center for AI Safety Phan Long agibenchmark@safe.ai 1 Gatti Alice 1 Li ... Nature 649 (8099), 1139-1146, 2026 Citations: 513
MortalMATH: Evaluating the Conflict Between Reasoning Objectives and Emergency Contexts E Lanzeray, S Meilliez, M Ruelle, D Sileo arXiv preprint arXiv:2601.18790, 2026
Attention Overflow: Language Model Input Blur during Long-Context Missing Items Identification D Sileo Proceedings of the 14th International Joint Conference on Natural Language …, 2025 Citations: 6
Tau-Eval: A Unified Evaluation Framework for Useful and Private Text Anonymization G Loiseau, D Sileo, D Riquet, M Meyer, M Tommasi Proceedings of the 2025 Conference on Empirical Methods in Natural Language …, 2025 Citations: 7
Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem V Quesnel, D Sileo arXiv preprint arXiv:2509.06809, 2025 Citations: 1
Bridging the data provenance gap across text, speech, and video S Longpre, N Singh, M Cherep, K Tiwary, J Materzynska, W Brannon, ... International Conference on Learning Representations 2025, 60592-60670, 2025 Citations: 27
Generating Explanations in Medical Question-Answering by Expectation Maximization Inference over Evidence W Sun, M Li, D Sileo, J Davis, MF Moens ACM Transactions on Computing for Healthcare 6 (2), 1-23, 2025 Citations: 6
Tarot: Task-oriented authorship obfuscation using policy optimization methods G Loiseau, D Sileo, D Riquet, M Meyer, M Tommasi Proceedings of the Sixth Workshop on Privacy in Natural Language Processing …, 2025 Citations: 5
Recipient Profiling: Predicting Characteristics from Messages M Borquez, M Keller, M Perrot, D Sileo arXiv preprint arXiv:2412.12954, 2024 Citations: 1
Consent in crisis: The rapid decline of the ai data commons S Longpre, R Mahari, A Lee, C Lund, H Oderinwale, W Brannon, ... Advances in Neural Information Processing Systems 37, 108042-108087, 2024 Citations: 130
Scaling synthetic logical reasoning datasets with context-sensitive declarative grammars D Sileo Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 Citations: 6
A large-scale audit of dataset licensing and attribution in AI S Longpre, R Mahari, A Chen, N Obeng-Marnu, D Sileo, W Brannon, ... Nature Machine Intelligence 6 (8), 975-987, 2024 Citations: 218
DISRPT: A multilingual, multi-domain, cross-framework benchmark for discourse processing C Braud, A Zeldes, L Rivière, YJ Liu, P Muller, D Sileo, T Aoyama Proceedings of the 2024 Joint International Conference on Computational …, 2024 Citations: 18
tasksource: A Large Collection of NLP tasks with a Structured Dataset Preprocessing Framework D Sileo Proceedings of the 2024 Joint International Conference on Computational …, 2024 Citations: 81
Generating multiple-choice questions for medical question answering with distractors and cue-masking D Sileo, K Uma, MF Moens Proceedings of the 2024 Joint International Conference on Computational …, 2024 Citations: 9
Consent in crisis: The rapid decline of the ai data commons, 2024 S Longpre, R Mahari, A Lee, C Lund, H Oderinwale, W Brannon, ... URL https://arxiv.org/abs/2407.14933, 2024 Citations: 8
MOST CITED SCHOLAR PUBLICATIONS
Beyond the imitation game: Quantifying and extrapolating the capabilities of language models A Srivastava, A Rastogi, A Rao, AAM Shoeb, A Abid, A Fisch, AR Brown, ... Transactions on machine learning research, 2023 Citations: 2653
A benchmark of expert-level academic questions to assess AI capabilities Center for AI Safety Phan Long agibenchmark@safe.ai 1 Gatti Alice 1 Li ... Nature 649 (8099), 1139-1146, 2026 Citations: 513
A large-scale audit of dataset licensing and attribution in AI S Longpre, R Mahari, A Chen, N Obeng-Marnu, D Sileo, W Brannon, ... Nature Machine Intelligence 6 (8), 975-987, 2024 Citations: 218
Consent in crisis: The rapid decline of the ai data commons S Longpre, R Mahari, A Lee, C Lund, H Oderinwale, W Brannon, ... Advances in Neural Information Processing Systems 37, 108042-108087, 2024 Citations: 130
Nl-augmenter: A framework for task-sensitive natural language augmentation K Dhole, V Gangal, S Gehrmann, A Gupta, Z Li, S Mahamood, ... Northern European Journal of Language Technology 9, 2023 Citations: 106
Zero-Shot Recommendation as Language Modeling D Sileo, W Vossen, R Raymaekers European Conference on Information Retrieval, 223-230, 2022 Citations: 94
Mining Discourse Markers for Unsupervised Sentence Representation Learning D Sileo, T Van-De-Cruys, C Pradel, P Muller Proceedings of the 2019 Conference of the North American Chapter of the …, 2019 Citations: 83
tasksource: A Large Collection of NLP tasks with a Structured Dataset Preprocessing Framework D Sileo Proceedings of the 2024 Joint International Conference on Computational …, 2024 Citations: 81
MindGames: Targeting Theory of Mind in Large Language Models with Dynamic Epistemic Modal Logic D Sileo, A Lernould Findings of the Association for Computational Linguistics: EMNLP 2023, 4570–4577, 2023 Citations: 41
Bridging the data provenance gap across text, speech, and video S Longpre, N Singh, M Cherep, K Tiwary, J Materzynska, W Brannon, ... International Conference on Learning Representations 2025, 60592-60670, 2025 Citations: 27
The data provenance initiative: A large scale audit of dataset licensing & attribution in ai, 2023 S Longpre, R Mahari, A Chen, N Obeng-Marnu, D Sileo, W Brannon, ... URL https://arxiv.org/abs/2310.16787, 2023 Citations: 21
A Pragmatics-Centered Evaluation Framework for Natural Language Understanding D Sileo, P Muller, T Van de Cruys, C Pradel Proceedings of the Thirteenth Language Resources and Evaluation Conference …, 2022 Citations: 19
DISRPT: A multilingual, multi-domain, cross-framework benchmark for discourse processing C Braud, A Zeldes, L Rivière, YJ Liu, P Muller, D Sileo, T Aoyama Proceedings of the 2024 Joint International Conference on Computational …, 2024 Citations: 18
DiscSense: Automated Semantic Analysis of Discourse Markers D Sileo, T Van de Cruys, C Pradel, P Muller Proceedings of The 12th Language Resources and Evaluation Conference, 991-999, 2020 Citations: 18
Probing neural language models for understanding of words of estimative probability D Sileo, MF Moens Proceedings of the 12th Joint Conference on Lexical and Computational …, 2023 Citations: 16
Composition of Embeddings: Lessons from Statistical Relational Learning D Sileo, T Van de Cruys, C Pradel, P Muller 8th Joint Conference on Lexical and Computational Semantics (SEM 2019), 33-43, 2019 Citations: 10
Generating multiple-choice questions for medical question answering with distractors and cue-masking D Sileo, K Uma, MF Moens Proceedings of the 2024 Joint International Conference on Computational …, 2024 Citations: 9
Visual Grounding Strategies for Text-Only Natural Language Processing D Sileo Proceedings of the Third Workshop on Beyond Vision and LANguage: inTEgrating …, 2021 Citations: 9
Consent in crisis: The rapid decline of the ai data commons, 2024 S Longpre, R Mahari, A Lee, C Lund, H Oderinwale, W Brannon, ... URL https://arxiv.org/abs/2407.14933, 2024 Citations: 8
Analysis and Prediction of NLP Models Via Task Embeddings D Sileo, MF Moens Proceedings of The 13th Language Resources and Evaluation Conference, LREC 2022, 2022 Citations: 8