Artificial intelligence for surgical scene understanding: a systematic review and reporting quality meta-analysis Matthias Carstens, Shubha Vasisht, Zheyuan Zhang, Iulia Barbur, Annika Reinke, et al. npj Digital Medicine, 2026 Surgical scene understanding (SSU) uses artificial intelligence (AI) to interpret visual data from surgeries, such as laparoscopic videos. Despite promising foundational research on instrument and anatomy recognition, clinical adoption remains minimal. This systematic review and meta-analysis (PROSPERO: CRD420251005301) evaluates current SSU research in minimally invasive abdominal surgery, focusing on data curation, model design, validation, reporting standards, and clinical relevance. A total of 188 studies were reviewed. Most relied on small, single-center datasets (70.7%), primarily laparoscopic cholecystectomies (59.0%), reflecting an overall narrow topical breadth. Validation practices were often weak, rarely involving external datasets (10.1%) or clinical experts. Few studies addressed clinical translation (5.9%), model performance variability estimation (38.3%), or made code available (29.8%). Overall, limited progress toward clinical integration has been made over the past decade. Our findings highlight the need for diverse, multi-institutional datasets, robust validation practices, and clinically driven development to unlock the full potential of SSU in surgical practice.
Artificial Intelligence-Based Analysis of Laparoscopic Imaging for Intraoperative Surgical Decision Support Zheyuan Zhang, Pietro Mascagni, Annika Reinke, Pieter De Backer, Wouter Bogaert, et al. Annual Review of Biomedical Engineering, 2026 Across surgical specialties, minimally invasive (laparoscopic) surgery has become a standard technique, as it is associated with less trauma, reduced postoperative complication rates, and quicker recovery for patients as compared with open surgery. Due to the limited field of view and limited haptic feedback, the surgical decision-making process in laparoscopic surgery is currently guided solely by surgeons’ visual interpretation of the laparoscopic video stream. Modern artificial intelligence (AI) methods excel at the interpretation of visual data and find applications in clinical routine in fields such as radiology and endoscopy. AI methods could help augment laparoscopic surgery through objective real-time analysis of the laparoscopic video stream. Research studies have demonstrated the feasibility of AI-based surgical scene and process understanding. This review provides an overview of these AI applications, focusing on approaches that could, in the next decade, be translated into intraoperative surgical decision support tools for increased surgical quality and patient safety.
Translational challenges and clinical potential of artificial intelligence in minimally invasive surgery Matthias Carstens, Micha Pfeiffer, Stefanie Speidel, Marius Distler, Jürgen Weitz, et al. Chirurgie (Heidelberg, Germany), 2025 Artificial intelligence (AI) offers enormous potential for surgery. Fields of application range from interdisciplinary treatment stratification and support for surgical planning to decision support in the operating room, which is the focus of this article. Artificial neural networks for the analysis of surgical videos can improve surgical safety, efficiency, and predictability. This requires high-quality, diverse (meta)data, and the annotation of these data as well as model training and validation pose complex requirements. Despite technical advances, clinical implementation to date often fails because of a lack of data standardization, inadequate infrastructure, regulatory hurdles, and ethical uncertainties. Many models remain black boxes, which hampers acceptance and trust. Systems must also be robust, transparent, and practical to integrate into clinical workflows. To advance the clinical translation of AI in surgery, consistent data collection strategies, privacy-preserving learning methods, explainable AI, and human-in-the-loop approaches are crucial. Regulatory frameworks such as the EU Medical Device Regulation, the German Medical Devices Law Implementation Act (Medizinprodukterecht-Durchführungsgesetz), and the EU AI Act must also be further developed in an AI-specific manner for the medical and, in particular, the interventional domain in order to enable safe, interdisciplinary assistance technologies in the operating room that meaningfully complement everyday surgical practice.
Postoperative complication management: How do large language models measure up to human expertise? Sophie-Caroline Schwarzkopf, Jean-Paul Bereuter, Mark Enrik Geissler, Jürgen Weitz, Marius Distler, et al. PLOS Digital Health, 2025 Managing postoperative complications is an essential part of surgical care and largely depends on the medical team’s experience. Large Language Models (LLMs) have demonstrated immense potential in supporting medical professionals. To evaluate the potential of LLMs in surgical patient care, we compared the performance of three state-of-the-art LLMs in managing postoperative complications to that of a panel of medical professionals based on six postsurgical patient cases. Six realistic postoperative patient cases were queried using GPT-3, GPT-4, and Gemini-Advanced and presented to human surgical caregivers. Humans and LLMs provided a triage assessment, an initial suspected diagnosis, and an acute management plan, including initial diagnostic and therapeutic measures. Responses were compared based on medical contextual correctness, coherence, and completeness. In comparison to human caregivers, GPT-3 and GPT-4 possess considerable competencies in correctly identifying postoperative complications (humans: 76.3% vs. GPT-3: 75.0% vs. GPT-4: 96.7%, p = 0.47) as well as triaging patients accordingly (humans: 84.8% vs. GPT-3: 50% vs. GPT-4: 38.3%, p = 0.19). With regard to diagnostic and therapeutic management of postoperative complications, GPT-3 and GPT-4 provided comprehensive management plans. Gemini-Advanced often provided no diagnostic or therapeutic recommendations and censored its outputs. In summary, LLMs can accurately interpret postoperative care scenarios and provide comprehensive management recommendations. These results showcase the improvements in LLM performance with regard to postoperative surgical use cases and provide evidence for their potential value to support and augment surgical routine care.
Artificial intelligence in pancreatic intraductal papillary mucinous neoplasm imaging: A systematic review Muhammad Ibtsaam Qadir, Jackson A. Baril, Michele T. Yip-Schneider, Duane Schonlau, Thi Thanh Thoa Tran, et al. PLOS Digital Health, 2025 Based on the Fukuoka and Kyoto international consensus guidelines, the current clinical management of intraductal papillary mucinous neoplasm (IPMN) largely depends on imaging features. While these criteria are highly sensitive in detecting high-risk IPMN, they lack specificity, resulting in surgical overtreatment. Artificial Intelligence (AI)-based medical image analysis has the potential to augment the clinical management of IPMNs by improving diagnostic accuracy. Based on a systematic review of the academic literature on AI in IPMN imaging, 1041 publications were identified, of which 25 published studies were included in the analysis. The studies were stratified based on prediction target, underlying data type and imaging modality, patient cohort size, and stage of clinical translation and were subsequently analyzed to identify trends and gaps in the field. Research on AI in IPMN imaging has been increasing in recent years. The majority of studies utilized CT imaging to train computational models. Most studies presented computational models developed on single-center datasets (n = 11, 44%) and included fewer than 250 patients (n = 18, 72%). Methodologically, convolutional neural network (CNN)-based algorithms were most commonly used. Thematically, most studies reported models augmenting differential diagnosis (n = 9, 36%) or risk stratification (n = 10, 40%) rather than IPMN detection (n = 5, 20%) or IPMN segmentation (n = 2, 8%). This systematic review provides a comprehensive overview of the research landscape of AI in IPMN imaging. Computational models have the potential to enhance the accurate and precise stratification of patients with IPMN. Multicenter collaboration and datasets comprising various modalities are necessary to fully utilize this potential, alongside concerted efforts towards clinical translation.
Importance of the Data in the Surgical Environment Dominik Rivoir, Martin Wagner, Sebastian Bodenstedt, Keno März, Fiona Kolbinger, et al. Artificial Intelligence and the Perspective of Autonomous Surgery, 2024
Artificial Intelligence–Based Analysis of Laparoscopic Imaging for Intraoperative Surgical Decision Support Z Zhang, P Mascagni, A Reinke, P De Backer, W Bogaert, M Mezzina, ... Annual Review of Biomedical Engineering 28 (1), 135-162, 2026. Citations: 1
Adaptive-CaRe: Adaptive Causal Regularization for Robust Outcome Prediction N Bhasker, FR Kolbinger, S Hu, G Kutyniok, S Speidel arXiv preprint arXiv:2602.06611, 2026
Field strength-dependent performance variability in deep learning-based analysis of magnetic resonance imaging MI Qadir, D Schonlau, U Dydak, FR Kolbinger arXiv preprint arXiv:2512.22176, 2025
Artificial intelligence for surgical scene understanding: a systematic review and reporting quality meta-analysis M Carstens, S Vasisht, Z Zhang, I Barbur, A Reinke, L Maier-Hein, ... npj Digital Medicine, 2025. Citations: 9
6 Fingers, 1 Kidney: Natural Adversarial Medical Images Reveal Critical Weaknesses of Vision-Language Models L Mayer, P Kalinowski, C Ebersbach, M Knopp, T Rädsch, ... arXiv preprint arXiv:2512.04238, 2025. Citations: 2
A Decade of C3 glomerulopathy-a nationwide cohort study RH Overwijk, F Kolbinger, M Eijgelsheim, O Dekkers, A Kronbichler, ... Kidney International Reports, 2025
Current validation practice undermines surgical AI development A Reinke, ZO Li, MD Tizabi, P André, M Knopp, MM Rother, IP Machado, ... arXiv preprint arXiv:2511.03769, 2025. Citations: 3
Translationale Herausforderungen und klinisches Potenzial von künstlicher Intelligenz in der minimal-invasiven Chirurgie M Carstens, M Pfeiffer, S Speidel, M Distler, J Weitz, FR Kolbinger Die Chirurgie 96 (11), 901-906, 2025. Citations: 1
Vision-language models for automated video analysis and documentation in laparoscopic surgery: a proof-of-concept study EH Stueker, FR Kolbinger, OL Saldanha, D Digomann, S Pistorius, ... International Journal of Surgery 111 (11), 7777-7786, 2025. Citations: 9
Federated Learning for Surgical Vision in Appendicitis Classification: Results of the FedSurg EndoVis 2024 Challenge M Kirchner, H Hoffmann, AC Jenke, OL Saldanha, K Pfeiffer, W Kanjo, ... arXiv preprint arXiv:2510.04772, 2025
#2442 Ten years of C3 glomerulopathy: a nationwide cohort study R Overwijk, F Kolbinger, M Eijgelsheim, O Dekkers, A Kronbichler, ... Nephrology Dialysis Transplantation 40 (Supplement_3), gfaf116.1281, 2025
Decentralized, privacy-preserving surgical video analysis with Swarm Learning OL Saldanha, K Pfeiffer, S Bodenstedt, M Kirchner, AC Jenke, C Barata, ... medRxiv, 2025.10.02.25337106, 2025. Citations: 1
Mission balance: Generating under-represented class samples using video diffusion models DK Venkatesh, I Funke, M Pfeiffer, F Kolbinger, HM Schmeiser, M Distler, ... International Conference on Medical Image Computing and Computer-Assisted …, 2025. Citations: 4
Histopathological evaluation of abdominal aortic aneurysms with deep learning FR Kolbinger, OSM El Nahhas, MC Nackenhorst, C Brostjan, W Eilenberg, ... Diagnostic Pathology 20 (1), 104, 2025. Citations: 2
Appendix300: A multi-institutional laparoscopic appendectomy video dataset for computational modeling tasks FR Kolbinger, M Kirchner, K Pfeiffer, S Bodenstedt, AC Jenke, J Barthel, ... medRxiv, 2025.09.05.25335174, 2025. Citations: 2
Translational challenges and clinical potential of artificial intelligence in minimally invasive surgery M Carstens, M Pfeiffer, S Speidel, M Distler, J Weitz, FR Kolbinger Chirurgie (Heidelberg, Germany), 2025
Postoperative complication management: How do large language models measure up to human expertise? SC Schwarzkopf, JP Bereuter, ME Geissler, J Weitz, M Distler, ... PLOS Digital Health 4 (8), e0000933, 2025
Pancreatoduodenectomy versus total pancreatectomy and simultaneous intraportal islet autotransplantation for periampullary cancer at high-risk of postoperative pancreatic … S Hempel, FR Kolbinger, F Oehme, O Radulova-Mauersberger, J Schmid, ... PLOS ONE 20 (7), e0327949, 2025
PIVOTS: Aligning unseen Structures using Preoperative to Intraoperative Volume-To-Surface Registration for Liver Navigation P Liu, B Güttner, Y Su, C Li, J Xu, M Liu, Z Min, A Zhylka, J Smit, K Olthof, ... arXiv preprint arXiv:2507.20337, 2025
MOST CITED SCHOLAR PUBLICATIONS
The future landscape of large language models in medicine J Clusmann, FR Kolbinger, HS Muti, ZI Carrero, JN Eckardt, NG Laleh, ... Communications Medicine 3 (1), 141, 2023. Citations: 1272
Reporting guidelines in medical artificial intelligence: a systematic review and meta-analysis FR Kolbinger, GP Veldhuizen, J Zhu, D Truhn, JN Kather Communications Medicine 4 (1), 71, 2024. Citations: 143
Regression-based Deep-Learning predicts molecular biomarkers from pathology slides OSM El Nahhas, CML Loeffler, ZI Carrero, M van Treeck, FR Kolbinger, ... Nature Communications 15 (1), 1253, 2024. Citations: 132
Targeting histone deacetylase 8 as a therapeutic approach to cancer and neurodegenerative diseases A Chakrabarti, J Melesina, FR Kolbinger, I Oehme, J Senger, O Witt, ... Future Medicinal Chemistry 8 (13), 1609-1634, 2016. Citations: 115
The Dresden surgical anatomy dataset for abdominal organ segmentation in surgical data science M Carstens, FM Rinner, S Bodenstedt, AC Jenke, J Weitz, M Distler, ... Scientific Data 10 (1), 1-8, 2023. Citations: 109
Structure-based design and biological characterization of selective histone deacetylase 8 (HDAC8) inhibitors with anti-neuroblastoma activity T Heimburg, FR Kolbinger, P Zeyen, E Ghazy, D Herp, K Schmidtkunz, ... Journal of Medicinal Chemistry 60 (24), 10188-10204, 2017. Citations: 103
Dual role of HDAC10 in lysosomal exocytosis and DNA repair promotes neuroblastoma chemoresistance J Ridinger, E Koeneke, FR Kolbinger, K Koerholz, S Mahboobi, L Hellweg, ... Scientific Reports 8 (1), 10039, 2018. Citations: 65
Artificial intelligence in colorectal cancer surgery: present and future perspectives G Quero, P Mascagni, FR Kolbinger, C Fiorillo, D De Sio, F Longo, ... Cancers 14 (15), 3803, 2022. Citations: 61
Anatomy segmentation in laparoscopic surgery: comparison of machine learning and human expertise–an experimental study FR Kolbinger, FM Rinner, AC Jenke, M Carstens, S Krell, S Leger, ... International Journal of Surgery 109 (10), 2962-2974, 2023. Citations: 58
Artificial Intelligence for context-aware surgical guidance in complex robot-assisted oncological procedures: An exploratory feasibility study FR Kolbinger, S Bodenstedt, M Carstens, S Leger, S Krell, FM Rinner, ... European Journal of Surgical Oncology, 106996, 2023. Citations: 57
The HDAC6/8/10 inhibitor TH34 induces DNA damage-mediated cell death in human high-grade neuroblastoma cell lines FR Kolbinger, E Koeneke, J Ridinger, T Heimburg, M Müller, T Bayer, ... Archives of Toxicology 92 (8), 2649-2664, 2018. Citations: 51
Long-term temporally consistent unpaired video translation from simulated surgical 3d data D Rivoir, M Pfeiffer, R Docea, F Kolbinger, C Riediger, J Weitz, S Speidel Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2021. Citations: 49
The use and future perspective of Artificial Intelligence—A survey among German surgeons M Pecqueux, C Riediger, M Distler, F Oehme, U Bork, FR Kolbinger, ... Frontiers in Public Health 10, 982335, 2022. Citations: 47
A kinome-wide RNAi screen identifies ALK as a target to sensitize neuroblastoma cells for HDAC8-inhibitor treatment J Shen, S Najafi, S Stäble, J Fabian, E Koeneke, FR Kolbinger, JK Wrobel, ... Cell Death & Differentiation 25 (12), 2053-2070, 2018. Citations: 43
Direct prediction of genetic aberrations from pathology images in gastric cancer with swarm learning OL Saldanha, HS Muti, HI Grabsch, R Langer, B Dislich, M Kohlruss, ... Gastric Cancer 26 (2), 264-274, 2023. Citations: 36
Surgomics: personalized prediction of morbidity, mortality and long-term outcome in surgery using machine learning on multimodal data M Wagner, JM Brandenburg, S Bodenstedt, A Schulze, AC Jenke, A Stern, ... Surgical Endoscopy 36 (11), 8568-8591, 2022. Citations: 35
The importance of machine learning in autonomous actions for surgical decision making M Wagner, S Bodenstedt, M Daum, A Schulze, R Younis, J Brandenburg, ... Artificial Intelligence Surgery 2 (2), 64-79, 2022. Citations: 30
More is more? Total pancreatectomy for periampullary cancer as an alternative in patients with high-risk pancreatic anastomosis: a propensity score-matched analysis S Hempel, F Oehme, E Tahirukaj, FR Kolbinger, B Müssle, T Welsch, ... Annals of Surgical Oncology 28 (13), 8309-8317, 2021. Citations: 24
Why do people cycle (a lot)? A multivariate approach on mental health, personality traits and motivation as determinants for cycling ambition JS Kesenheimer, C Sagioglou, A Kronbichler, P Gauckler, FR Kolbinger Journal of Applied Sport Psychology 35 (6), 1005-1025, 2023. Citations: 19
Active learning for extracting surgomic features in robot-assisted minimally invasive esophagectomy: a prospective annotation study JM Brandenburg, AC Jenke, A Stern, MTJ Daum, A Schulze, R Younis, ... Surgical Endoscopy 37 (11), 8577-8593, 2023. Citations: 19