Giuseppe Jurman

Scopus Publications

The Venus score for the assessment of the quality and trustworthiness of biomedical datasets
Davide Chicco, Alessandro Fabris, and Giuseppe Jurman
Springer Science and Business Media LLC
Abstract: Biomedical datasets are the mainstays of computational biology and health informatics projects, and can be found on multiple data platforms online or obtained from wet-lab biologists and physicians. The quality and the trustworthiness of these datasets, however, can sometimes be poor, producing bad results in turn, which can harm patients and data subjects. To address this problem, policy-makers, researchers, and consortia have proposed diverse regulations, guidelines, and scores to assess the quality and increase the reliability of datasets. Although generally useful, they are often incomplete and impractical. The guidelines of Datasheets for Datasets, in particular, are too numerous; the requirements of the Kaggle Dataset Usability Score focus on non-scientific requisites (for example, including a cover image); and the European Union Artificial Intelligence Act (EU AI Act) sets forth sparse and general data governance requirements, which we tailored to datasets for biomedical AI. Against this backdrop, we introduce our new Venus score to assess the data quality and trustworthiness of biomedical datasets. Our score ranges from 0 to 10 and consists of ten questions that anyone developing a bioinformatics, medical informatics, or cheminformatics dataset should answer before the release. In this study, we first describe the EU AI Act, Datasheets for Datasets, and the Kaggle Dataset Usability Score, presenting their requirements and their drawbacks. To do so, we reverse-engineer the weights of the influential Kaggle Score for the first time and report them in this study. We distill the most important data governance requirements into ten questions tailored to the biomedical domain, comprising the Venus score. We apply the Venus score to twelve datasets from multiple subdomains, including electronic health records, medical imaging, microarray and bulk RNA-seq gene expression, cheminformatics, physiologic electrogram signals, and medical text. Analyzing the results, we surface fine-grained strengths and weaknesses of popular datasets, as well as aggregate trends. Most notably, we find a widespread tendency to gloss over sources of data inaccuracy and noise, which may hinder the reliable exploitation of data and, consequently, research results. Overall, our results confirm the applicability and utility of the Venus score to assess the trustworthiness of biomedical data.

Neuropsychological tests and machine learning: identifying predictors of MCI and dementia progression
Carlotta Cazzolli, Marco Chierici, Monica Dallabona, Chiara Guella, and Giuseppe Jurman
Springer Science and Business Media LLC
Background: Early prediction of progression in dementia is of major importance for providing patients with adequate clinical care, with considerable impact on the organization of the whole healthcare system. Aims: The main task is tailoring robust and consolidated machine learning models to detect which neuropsychological tests are more effective in predicting a patient’s mental status. In a translational medicine perspective, such an identification tool should find its place in the clinician’s toolbox as a support throughout their daily diagnostic routine. A second objective involves predicting the patient’s diagnosis based on the results of the cognitive assessment. Methods: 281 patients with MCI or dementia diagnosis were assessed through 14 commonly administered neuropsychological tests designed to evaluate different cognitive domains. A suite of machine learning models, trained on different subsets of data, was used to detect the most informative tests and to predict the patient’s diagnosis. Two external validation datasets containing MMSE and FAB tests were involved in this second task. Results: The tests qualitatively and statistically associated with cognitive decline are MMSE, FAB, BSTR, AM, and VSF, of which at least three were also considered the most informative by machine learning. An average accuracy of 73% was obtained in the diagnosis prediction on three subsets of original and external data. Discussion: Detecting the most informative tests could reduce the duration of visits and prevent the cognitive assessment from being biased by external factors. The predictions of machine learning models represent a useful baseline for the clinician’s actual diagnosis and a reliable insight into the future development of the patient’s cognitive status.

Explainable deep neural networks for predicting sample phenotypes from single-cell transcriptomics
Jordi Martorell-Marugán, Raúl López-Domínguez, Juan Antonio Villatoro-García, Daniel Toro-Domínguez, Marco Chierici, Giuseppe Jurman, and Pedro Carmona-Sáez
Oxford University Press (OUP)
Abstract: Recent advances in single-cell RNA-Sequencing (scRNA-Seq) technologies have revolutionized our ability to gather molecular insights into different phenotypes at the level of individual cells. The analysis of the resulting data poses significant challenges, and proper statistical methods are required to analyze and extract information from scRNA-Seq datasets. Sample classification based on gene expression data has proven effective and valuable for precision medicine applications. However, standard classification schemas are often not suitable for scRNA-Seq data due to their unique characteristics, and new algorithms are required to effectively analyze and classify samples at the single-cell level. Furthermore, existing methods for this purpose have limitations in their usability. These reasons motivated us to develop singleDeep, an end-to-end pipeline that streamlines the analysis of scRNA-Seq data by training deep neural networks, enabling robust prediction and characterization of sample phenotypes. We used singleDeep to make predictions on scRNA-Seq datasets from different conditions, including systemic lupus erythematosus, Alzheimer’s disease and coronavirus disease 2019. Our results demonstrate strong diagnostic performance, validated both internally and externally. Moreover, singleDeep outperformed traditional machine learning methods and alternative single-cell approaches. In addition to prediction accuracy, singleDeep provides valuable insights into cell types and gene importance estimation for phenotypic characterization. This functionality provided additional and valuable information in our use cases. For instance, we corroborated that some interferon signature genes are consistently relevant for autoimmunity across all immune cell types in lupus. On the other hand, we discovered that genes linked to dementia have relevant roles in specific brain cell populations, such as APOE in astrocytes.

Session-by-Session Prediction of Anti-Endothelial Growth Factor Injection Needs in Neovascular Age-Related Macular Degeneration Using Optical-Coherence-Tomography-Derived Features and Machine Learning
Flavio Ragni, Stefano Bovo, Andrea Zen, Diego Sona, K. De Nadai, G. Adamo, Marco Pellegrini, Francesco Nasini, Chiara Vivarelli, Marco Tavolato, et al.

Background/Objectives: Neovascular age-related macular degeneration (nAMD) is a retinal disorder leading to irreversible central vision loss. The pro-re-nata (PRN) treatment for nAMD involves frequent intravitreal injections of anti-VEGF medications, placing a burden on patients and healthcare systems. Predicting injections needs at each monitoring session could optimize treatment outcomes and reduce unnecessary interventions. Methods: To achieve these aims, machine learning (ML) models were evaluated using different combinations of clinical variables, including retinal thickness and volume, best-corrected visual acuity, and features derived from macular optical coherence tomography (OCT). A “Leave Some Subjects Out” (LSSO) nested cross-validation approach ensured robust evaluation. Moreover, the SHapley Additive exPlanations (SHAP) analysis was employed to quantify the contribution of each feature to model predictions. Results: Results demonstrated that models incorporating both structural and functional features achieved high classification accuracy in predicting injection necessity (AUC = 0.747 ± 0.046, MCC = 0.541 ± 0.073). Moreover, the explainability analysis identified as key predictors both subretinal and intraretinal fluid, alongside central retinal thickness. Conclusions: These findings suggest that session-by-session prediction of injection needs in nAMD patients is feasible, even without processing the entire OCT image. The proposed ML framework has the potential to be integrated into routine clinical workflows, thereby optimizing nAMD therapeutic management.
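The subject-wise evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that "Leave Some Subjects Out" means that all monitoring sessions of a given patient fall on the same side of every train/test split; the abstract does not describe how folds were built, so the round-robin assignment and the function name `leave_some_subjects_out` are hypothetical, and the nested (inner) loop for hyperparameter tuning is omitted.

```python
def leave_some_subjects_out(subject_ids, n_folds):
    """Build subject-wise train/test index splits.

    subject_ids: one subject label per session (row).
    Returns a list of (train_indices, test_indices) pairs in which
    no subject contributes sessions to both sides of a split.
    Fold assignment is round-robin over sorted subjects (an assumption).
    """
    subjects = sorted(set(subject_ids))
    folds = []
    for k in range(n_folds):
        # Every n_folds-th subject, starting at k, is held out in fold k.
        held_out = set(subjects[k::n_folds])
        test_idx = [i for i, s in enumerate(subject_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
        folds.append((train_idx, test_idx))
    return folds


if __name__ == "__main__":
    # Toy data: seven sessions belonging to four subjects.
    sessions = ["a", "a", "b", "c", "c", "c", "d"]
    for train_idx, test_idx in leave_some_subjects_out(sessions, n_folds=2):
        print("train:", train_idx, "test:", test_idx)
```

Splitting by subject rather than by session avoids scoring a model on sessions from patients it has already seen during training, which would tend to inflate the reported performance.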

Generating and evaluating synthetic data in digital pathology through diffusion models
Matteo Pozzi, Shahryar Noei, Erich Robbi, Luca Cima, Monica Moroni, Enrico Munari, Evelin Torresani, and Giuseppe Jurman
Springer Science and Business Media LLC

Correction to: AI models for automated segmentation of engineered polycystic kidney tubules (Scientific Reports, (2024), 14, 1, (2847), 10.1038/s41598-024-52677-1)
Simone Monaco, Nicole Bussola, Sara Buttò, Diego Sona, Flavio Giobergia, Giuseppe Jurman, Christodoulos Xinaris, and Daniele Apiletti
Springer Science and Business Media LLC

AI models for automated segmentation of engineered polycystic kidney tubules
Simone Monaco, Nicole Bussola, Sara Buttò, Diego Sona, Flavio Giobergia, Giuseppe Jurman, Christodoulos Xinaris, and Daniele Apiletti
Springer Science and Business Media LLC
Abstract: Autosomal dominant polycystic kidney disease (ADPKD) is a monogenic, rare disease, characterized by the formation of multiple cysts that grow out of the renal tubules. Despite intensive attempts to develop new drugs or repurpose existing ones, there is currently no definitive cure for ADPKD. This is primarily due to the complex and variable pathogenesis of the disease and the lack of models that can faithfully reproduce the human phenotype. Therefore, the development of models that allow automated detection of cysts’ growth directly on human kidney tissue is a crucial step in the search for efficient therapeutic solutions. Artificial Intelligence methods, and deep learning algorithms in particular, can provide powerful and effective solutions to such tasks, and indeed various architectures have been proposed in the literature in recent years. Here, we comparatively review state-of-the-art deep learning segmentation models, using as a testbed a set of sequential RGB immunofluorescence images from 4 in vitro experiments with 32 engineered polycystic kidney tubules. To gain a deeper understanding of the detection process, we implemented both pixel-wise and cyst-wise performance metrics to evaluate the algorithms. Overall, two models stand out as the best performing, namely UNet++ and UACANet: the latter uses a self-attention mechanism introducing some explainability aspects that can be further exploited in future developments, thus making it the most promising algorithm to build upon towards a more refined cyst-detection platform. The UACANet model achieves a cyst-wise Intersection over Union of 0.83, 0.91 for Recall, and 0.92 for Precision when applied to detect large-size cysts. On all-size cysts, UACANet averages at 0.624 pixel-wise Intersection over Union. The code to reproduce all results is freely available in a public GitHub repository.
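For readers unfamiliar with the pixel-wise metric quoted above, the following is a minimal sketch of Intersection over Union on binary segmentation masks. It is not taken from the authors' repository: the function name `pixel_iou`, the nested-list mask representation, and the convention of returning 1.0 for two empty masks are assumptions made for illustration. The cyst-wise variant requires a rule for matching predicted and reference cysts, which the abstract does not specify, so it is not sketched here.

```python
def pixel_iou(pred_mask, true_mask):
    """Pixel-wise Intersection over Union of two binary masks.

    Masks are equally sized 2D lists where a truthy value marks a
    foreground (cyst) pixel. Returns 1.0 when both masks are empty
    (a convention assumed here, not stated in the source).
    """
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    # Two pixels overlap and four pixels are foreground in either mask.
    print(pixel_iou(predicted, reference))  # 0.5
```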

Forecasting daily total pollen concentrations on a global scale
László Makra, Luca Coviello, Andrea Gobbi, Giuseppe Jurman, Cesare Furlanello, Mauro Brunato, Lewis H. Ziska, Jeremy J. Hess, Athanasios Damialis, Maria Pilar Plaza Garcia, et al.
Wiley
Background: There is evidence that global anthropogenic climate change may be impacting floral phenology and the temporal and spatial characteristics of aero‐allergenic pollen. Given the extent of current and future climate uncertainty, there is a need to strengthen predictive pollen forecasts. Methods: The study aims to use CatBoost (CB) and deep learning (DL) models for predicting the daily total pollen concentration up to 14 days in advance for 23 cities, covering all five continents. The model includes the projected environmental parameters, recent concentrations (1, 2 and 4 weeks), and the past environmental explanatory variables, and their future values. Results: The best pollen forecasts for the 7th forecast day include Mexico City (R²(DL_7) ≈ .7) and Santiago (R²(DL_7) ≈ .8), while the weakest are made for Brisbane (R²(DL_7) ≈ .4) and Seoul (R²(DL_7) ≈ .1). The global order of the five most important environmental variables in determining the daily total pollen concentrations is, in decreasing order: the past daily total pollen concentration, future 2 m temperature, past 2 m temperature, past soil temperature in 28–100 cm depth, and past soil temperature in 0–7 cm depth. City‐related clusters of the most similar distribution of feature importance values of the environmental variables only slightly change on consecutive forecast days for Caxias do Sul, Cape Town, Brisbane, and Mexico City, while they often change for Sydney, Santiago, and Busan. Conclusions: This new knowledge of the ecological relationships underlying the importance of the most remarkable variables for pollen forecast models, according to clusters, cities and forecast days, is important for developing and improving the accuracy of airborne pollen forecasts.

Artificial intelligence of imaging and clinical neurological data for predictive, preventive and personalized (P3) medicine for Parkinson Disease: The NeuroArtP3 protocol for a multi-center research study
Maria Chiara Malaguti, Lorenzo Gios, Bruno Giometto, Chiara Longo, Marianna Riello, Donatella Ottaviani, Maria Pellegrini, Raffaella Di Giacopo, Davide Donner, Umberto Rozzanigo, et al.
Public Library of Science (PLoS)
Background: The burden of Parkinson Disease (PD) represents a key public health issue, and it is essential to develop innovative and cost-effective approaches to promote sustainable diagnostic and therapeutic interventions. In this perspective, the adoption of a P3 (predictive, preventive and personalized) medicine approach seems to be pivotal. The NeuroArtP3 (NET-2018-12366666) is a four-year multi-site project co-funded by the Italian Ministry of Health, bringing together clinical and computational centers operating in the field of neurology, including PD. Objective: The core objectives of the project are: i) to harmonize the collection of data across the participating centers, ii) to structure standardized disease-specific datasets and iii) to advance knowledge on disease trajectories through machine learning analysis. Methods: The four-year study combines two consecutive research components: i) a multi-center retrospective observational phase; ii) a multi-center prospective observational phase. The retrospective phase aims at collecting data of the patients admitted at the participating clinical centers, whereas the prospective phase aims at collecting the same variables of the retrospective study in newly diagnosed patients who will be enrolled at the same centers. Results: The participating clinical centers are the Provincial Health Services (APSS) of Trento (Italy) as the center responsible for the PD study and the IRCCS San Martino Hospital of Genoa (Italy) as the promoter center of the NeuroArtP3 project. The computational centers responsible for data analysis are the Bruno Kessler Foundation of Trento (Italy) with TrentinoSalute4.0 – Competence Center for Digital Health of the Province of Trento (Italy) and the LISCOMPlab University of Genoa (Italy). Conclusions: The work behind this observational study protocol shows how it is possible and viable to systematize data collection procedures in order to feed research and to advance the implementation of a P3 approach into clinical practice through the use of AI models.

Scoring Tumor-Infiltrating Lymphocytes in breast DCIS: A guideline-driven artificial intelligence approach

A temporally and spatially explicit, data-driven estimation of airborne ragweed pollen concentrations across Europe
László Makra, István Matyasovszky, Gábor Tusnády, Lewis H. Ziska, Jeremy J. Hess, László G. Nyúl, Daniel S. Chapman, Luca Coviello, Andrea Gobbi, Giuseppe Jurman, et al.
Elsevier BV

Endoscopy-based IBD identification by a quantized deep learning pipeline
Massimiliano Datres, Elisa Paolazzi, Marco Chierici, Matteo Pozzi, Antonio Colangelo, Marcello Dorian Donzella, and Giuseppe Jurman
Springer Science and Business Media LLC
Background: Discrimination between patients affected by inflammatory bowel diseases and healthy controls on the basis of endoscopic imaging is a challenging problem for machine learning models. This task is used here as the testbed for a novel deep learning classification pipeline, powered by a set of solutions enhancing characterising elements such as reproducibility, interpretability, reduced computational workload, bias-free modeling and careful image preprocessing. Results: First, an automatic preprocessing procedure is devised to remove artifacts from clinical data; the resulting images are then fed to an aggregated per-patient model that mimics the clinicians’ decision process. The predictions are based on multiple snapshots obtained through resampling, reducing the risk of misleading outcomes by removing the low-confidence predictions. Each patient’s outcome is explained by returning the images the prediction is based upon, supporting clinicians in verifying diagnoses without the need for evaluating the full set of endoscopic images. As a major theoretical contribution, quantization is employed to reduce the complexity and the computational cost of the model, allowing its deployment on low-power devices with an almost negligible 3% performance degradation. This quantization procedure is relevant not only in the context of per-patient models but also for assessing its feasibility in providing real-time support to clinicians even in low-resource environments. The pipeline is demonstrated on a private dataset of endoscopic images of 758 IBD patients and 601 healthy controls, achieving a Matthews Correlation Coefficient of 0.9 as top performance on the test set. Conclusion: We highlighted how a comprehensive pre-processing pipeline plays a crucial role in identifying and removing artifacts from data, solving one of the principal challenges encountered when working with clinical data. Furthermore, we constructively showed how it is possible to emulate the clinicians’ decision process and how doing so offers significant advantages, particularly in terms of explainability and trust within the healthcare context. Last but not least, we showed that quantization can be a useful tool to reduce time and resource consumption with an acceptable degradation of model performance. The quantization study proposed in this work points to the potential development of real-time quantized algorithms as valuable tools to support clinicians during endoscopy procedures.

Signature literature review reveals AHCY, DPYSL3, and NME1 as the most recurrent prognostic genes for neuroblastoma
Davide Chicco, Tiziana Sanavia, and Giuseppe Jurman
Springer Science and Business Media LLC
Abstract: Neuroblastoma is a childhood neurological tumor which affects hundreds of thousands of children worldwide, and information about its prognosis can be pivotal for patients, their families, and clinicians. One of the main goals in the related bioinformatics analyses is to provide stable genetic signatures able to include genes whose expression levels can be effective to predict the prognosis of the patients. In this study, we collected the prognostic signatures for neuroblastoma published in the biomedical literature, and noticed that the most frequent genes present among them were three: AHCY, DPYSL3, and NME1. We therefore investigated the prognostic power of these three genes by performing a survival analysis and a binary classification on multiple gene expression datasets of different groups of patients diagnosed with neuroblastoma. Finally, we discussed the main studies in the literature associating these three genes with neuroblastoma. Our results, in each of these three steps of validation, confirm the prognostic capability of AHCY, DPYSL3, and NME1, and highlight their key role in neuroblastoma prognosis. Our results can have an impact on neuroblastoma genetics research: biologists and medical researchers can pay more attention to the regulation and expression of these three genes in patients having neuroblastoma, and therefore can develop better cures and treatments which can save patients’ lives.

Ten simple rules for providing bioinformatics support within a hospital
Davide Chicco and Giuseppe Jurman
Springer Science and Business Media LLC
Abstract: Bioinformatics has become a key aspect of the biomedical research programmes of many hospitals’ scientific centres, and the establishment of bioinformatics facilities within hospitals has become a common practice worldwide. Bioinformaticians working in these facilities provide computational biology support to medical doctors and principal investigators who deal daily with patient data to analyze. These bioinformatics analysts, although pivotal, usually do not receive formal training for this job. We therefore propose these ten simple rules to guide these bioinformaticians in their work: ten pieces of advice on how to provide bioinformatics support to medical doctors in hospitals. We believe these simple rules can help bioinformatics facility analysts produce better scientific results and work in a serene and fruitful environment.

The Matthews correlation coefficient (MCC) should replace the ROC AUC as the standard metric for assessing binary classification
Davide Chicco and Giuseppe Jurman
Springer Science and Business Media LLC
Abstract: Binary classification is a common task for which machine learning and computational statistics are used, and the area under the receiver operating characteristic curve (ROC AUC) has become the common standard metric to evaluate binary classifications in most scientific fields. The ROC curve has true positive rate (also called sensitivity or recall) on the y axis and false positive rate on the x axis, and the ROC AUC can range from 0 (worst result) to 1 (perfect result). The ROC AUC, however, has several flaws and drawbacks. This score is generated including predictions that obtained insufficient sensitivity and specificity, and moreover it does not say anything about positive predictive value (also known as precision) nor negative predictive value (NPV) obtained by the classifier, therefore potentially generating inflated overoptimistic results. Since it is common to include ROC AUC alone without precision and negative predictive value, a researcher might erroneously conclude that their classification was successful. Furthermore, a given point in the ROC space does not identify a single confusion matrix nor a group of matrices sharing the same MCC value. Indeed, a given (sensitivity, specificity) pair can cover a broad MCC range, which casts doubts on the reliability of ROC AUC as a performance measure. In contrast, the Matthews correlation coefficient (MCC) generates a high score in its [-1, +1] interval only if the classifier scored a high value for all the four basic rates of the confusion matrix: sensitivity, specificity, precision, and negative predictive value. A high MCC (for example, MCC = 0.9), moreover, always corresponds to a high ROC AUC, and not vice versa. In this short study, we explain why the Matthews correlation coefficient should replace the ROC AUC as standard statistic in all the scientific studies involving a binary classification, in all scientific fields.
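The statement that one (sensitivity, specificity) pair can cover a broad MCC range can be checked with a small worked example. The sketch below uses the standard MCC formula on two confusion matrices that share sensitivity = specificity = 0.9 but differ in class balance; the counts are invented for illustration and are not taken from the paper, and returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero (assumed convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def basic_rates(tp, fp, fn, tn):
    """Sensitivity, specificity, precision and NPV (nonzero denominators assumed)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


if __name__ == "__main__":
    # Both cases have sensitivity = specificity = 0.9.
    balanced = dict(tp=90, fn=10, tn=90, fp=10)        # 100 positives, 100 negatives
    imbalanced = dict(tp=90, fn=10, tn=9000, fp=1000)  # 100 positives, 10000 negatives
    for name, counts in (("balanced", balanced), ("imbalanced", imbalanced)):
        print(name, basic_rates(**counts), "MCC =", round(mcc(**counts), 3))
```

In the balanced case the MCC is 0.8, whereas in the imbalanced case it drops to roughly 0.26 because precision falls to about 0.08, even though the corresponding point in ROC space is the same.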

A statistical comparison between Matthews correlation coefficient (MCC), prevalence threshold, and Fowlkes–Mallows index
Davide Chicco and Giuseppe Jurman
Elsevier BV

Machine Learning Applications in the Study of Parkinson’s Disease: A Systematic Review
Jordi Martorell-Marugán, Marco Chierici, Sara Bandres-Ciga, Giuseppe Jurman, and Pedro Carmona-Sáez
Bentham Science Publishers Ltd.
Background: Parkinson’s disease is a common neurodegenerative disorder that has been studied from multiple perspectives using several data modalities. Given the size and complexity of these data, machine learning emerged as a useful approach to analyze them for different purposes. These methods have been successfully applied in a broad range of applications, including the diagnosis of Parkinson’s disease or the assessment of its severity. In recent years, the number of published articles that used machine learning methodologies to analyze data derived from Parkinson’s disease patients has grown substantially. Objective: Our goal was to perform a comprehensive systematic review of the studies that applied machine learning to Parkinson’s disease data. Methods: We extracted published articles in PubMed, SCOPUS and Web of Science until March 15, 2022. After selection, we included 255 articles in this review. Results: We classified the articles by data type and we summarized their characteristics, such as outcomes of interest, main algorithms, sample size, sources of data and model performance. Conclusion: This review summarizes the main advances in the use of Machine Learning methodologies for the study of Parkinson’s disease, as well as the increasing interest of the research community in this area.

Differential diagnosis of systemic lupus erythematosus and Sjögren's syndrome using machine learning and multi-omics data
Jordi Martorell-Marugán, Marco Chierici, Giuseppe Jurman, Marta E. Alarcón-Riquelme, and Pedro Carmona-Sáez
Elsevier BV

Genetic predisposition to lung adenocarcinoma outcome is a feature already present in patients' noninvolved lung tissue
Francesca Minnai, Sara Noci, Marco Chierici, Chiara Elisabetta Cotroneo, Barbara Bartolini, Matteo Incarbone, Davide Tosi, Giovanni Mattioni, Giuseppe Jurman, Tommaso A. Dragani, et al.
Wiley
Abstract: Emerging evidence suggests that the prognosis of patients with lung adenocarcinoma can be determined from germline variants and transcript levels in nontumoral lung tissue. Gene expression data from noninvolved lung tissue of 483 lung adenocarcinoma patients were tested for correlation with overall survival using multivariable Cox proportional hazard and multivariate machine learning models. For genes whose transcript levels are associated with survival, we used genotype data from 414 patients to identify germline variants acting as cis‐expression quantitative trait loci (eQTLs). Associations of eQTL variant genotypes with gene expression and survival were tested. Levels of four transcripts were inversely associated with survival by Cox analysis (CLCF1, hazard ratio [HR] = 1.53; CNTNAP1, HR = 2.17; DUSP14, HR = 1.78; and MT1F, HR = 1.40). Machine learning analysis identified a signature of transcripts associated with lung adenocarcinoma outcome that was largely overlapping with the transcripts identified by Cox analysis, including the three most significant genes (CLCF1, CNTNAP1, and DUSP14). Pathway analysis indicated that the signature is enriched for ECM components. We identified 32 cis‐eQTLs for CNTNAP1, including 6 with an inverse correlation and 26 with a direct correlation between the number of minor alleles and transcript levels. Of these, all but one were prognostic: the six with an inverse correlation were associated with better prognosis (HR < 1) while the others were associated with worse prognosis. Our findings provide supportive evidence that genetic predisposition to lung adenocarcinoma outcome is a feature already present in patients' noninvolved lung tissue.

histolab: A Python library for reproducible Digital Pathology preprocessing with automated testing
Alessia Marcolini, Nicole Bussola, Ernesto Arbitrio, Mohamed Amgad, Giuseppe Jurman, and Cesare Furlanello
Elsevier BV

Towards a potential pan-cancer prognostic signature for gene expression based on probesets and ensemble machine learning
Davide Chicco, Abbas Alameer, Sara Rahmati, and Giuseppe Jurman
Springer Science and Business Media LLC
Abstract: Cancer is one of the leading causes of death worldwide and can be caused by environmental aspects (for example, exposure to asbestos), by human behavior (such as smoking), or by genetic factors. To understand which genes might be involved in patients’ survival, researchers have invented prognostic genetic signatures: lists of genes that can be used in scientific analyses to predict if a patient will survive or not. In this study, we joined together five different prognostic signatures, each of them related to a specific cancer type, to generate a unique pan-cancer prognostic signature that contains 207 unique probesets related to 187 unique gene symbols, with one particular probeset present in two cancer type-specific signatures (203072_at related to the MYO1E gene). We applied our proposed pan-cancer signature with the Random Forests machine learning method to 57 microarray gene expression datasets of 12 different cancer types, and analyzed the results. We also compared the performance of our pan-cancer signature with the performances of two alternative prognostic signatures, and with the performances of each cancer type-specific signature on their corresponding cancer type-specific datasets. Our results confirmed the effectiveness of our prognostic pan-cancer signature. Moreover, we performed a pathway enrichment analysis, which indicated an association between the signature genes, and a protein-protein interaction analysis, which highlighted PIK3R2 and FN1 as key genes having a fundamental relevance in our signature, suggesting an important role in pan-cancer prognosis for both of them.

Author Correction: Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network (Nature Communications, (2021), 12, 1, (3297), 10.1038/s41467-021-23143-7)
Mathys Grapotte, Manu Saraswat, Chloé Bessière, Christophe Menichelli, Jordan A. Ramilowski, Jessica Severin, Yoshihide Hayashizaki, Masayoshi Itoh, Michihira Tagami, Mitsuyoshi Murata, et al.
Springer Science and Business Media LLC

Automatically detecting Crohn’s disease and Ulcerative Colitis from endoscopic imaging
Marco Chierici, Nicolae Puica, Matteo Pozzi, Antonello Capistrano, Marcello Dorian Donzella, Antonio Colangelo, Venet Osmani, and Giuseppe Jurman
Springer Science and Business Media LLC
Background: The SI-CURA project (Soluzioni Innovative per la gestione del paziente e il follow up terapeutico della Colite UlceRosA) is an Italian initiative aimed at the development of artificial intelligence solutions to discriminate pathologies of different nature, including inflammatory bowel disease (IBD), namely Ulcerative Colitis (UC) and Crohn’s disease (CD), based on endoscopic imaging of patients (P) and healthy controls (N). Methods: In this study we develop a deep learning (DL) prototype to identify disease patterns through three binary classification tasks, namely (1) discrimination between positive (pathological) samples and negative (healthy) samples (P vs N); (2) discrimination between Ulcerative Colitis and Crohn’s Disease samples (UC vs CD); and (3) discrimination between Ulcerative Colitis and negative (healthy) samples (UC vs N). Results: The model derived from our approach achieves a high performance of Matthews correlation coefficient (MCC) > 0.9 on the test set for P versus N and UC versus N, and MCC > 0.6 on the test set for UC versus CD. Conclusion: Our DL model effectively discriminates between pathological and negative samples, as well as between IBD subgroups, providing further evidence of its potential as a decision support tool for endoscopy-based diagnosis.

The ABC recommendations for validation of supervised machine learning results in biomedical sciences
Davide Chicco and Giuseppe Jurman
Frontiers Media SA
COPYRIGHT © 2022 Chicco and Jurman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

An Invitation to Greater Use of Matthews Correlation Coefficient in Robotics and Artificial Intelligence
Davide Chicco and Giuseppe Jurman
Frontiers Media SA
A binary classification is a computational procedure that labels data elements as members of one or another category. In machine learning and computational statistics, input data elements which are part of two classes are usually encoded as 0’s or –1’s (negatives) and 1’s (positives). During a binary classification, a method assigns each data element to one of the two categories, usually after a machine learning phase. A typical evaluation procedure then creates a 2 × 2 contingency table called confusion matrix, where the positive elements correctly predicted positive are called true positives (TP), the negative elements correctly predicted negative are called true negatives (TN), the positive elements wrongly labeled as negatives are called false negatives (FN), and the negative elements wrongly labeled as positives are called false positives (FP). Since it would be difficult to always analyze the four categories of the confusion matrix for each test, scientists defined statistical rates that summarize TP, FP, FN, and TN in one value. Accuracy (Eq. 1), for example, is a rate that indicates the ratio of correct positives and negatives (Zliobaite, 2015), while F1 score (Eq. 2), is the harmonic mean of positive predictive value and true positive rate (Lipton et al., 2014; Huang et al., 2015).
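The definitions in the paragraph above can be made concrete with a minimal sketch that counts TP, FP, FN and TN from label lists and then computes accuracy and F1 score. The equations referenced as Eq. 1 and Eq. 2 are not reproduced in this excerpt, so the code uses the standard definitions; the function names and the toy labels are illustrative assumptions. Positives are encoded as 1 and any other value (0 or -1) is treated as negative, following the encoding described in the text.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN); label 1 is positive, anything else negative."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1          # positive correctly predicted positive
        elif truth != 1 and pred == 1:
            fp += 1          # negative wrongly labeled positive
        elif truth == 1 and pred != 1:
            fn += 1          # positive wrongly labeled negative
        else:
            tn += 1          # negative correctly predicted negative
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Ratio of correct predictions over all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp, fp, fn, tn):
    """Harmonic mean of precision (PPV) and recall (TPR): 2TP / (2TP + FP + FN)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


if __name__ == "__main__":
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    counts = confusion_counts(y_true, y_pred)
    print(counts)             # (3, 1, 1, 5)
    print(accuracy(*counts))  # 0.8
    print(f1_score(*counts))  # 0.75
```

Note that F1 does not use TN at all, which is one reason a single summary rate can hide part of the confusion matrix.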
