Machine learning approaches for predicting breast cancer recurrence using clinical and histopathological data
Mohd Abas Bhat, Mushtaq Ahmad Mir, R. Vijaya Lakshmi, Tejaswini Pradhan, G. V. V. Jagannadha Rao, Ghanshyam G. Tejani, Syed Abid Hussain
Clinical and Experimental Medicine, 2026
Breast cancer remains the most common malignancy among women worldwide, with recurrence representing a major clinical challenge. Although significant progress has been made in early detection and treatment, recurrence affects up to 40% of patients in Brazil, influencing survival outcomes and therapeutic decisions. In this context, Machine Learning offers valuable potential for enhancing recurrence prediction by enabling data-driven risk assessment and personalized patient care. In the present study, clinical and histopathological information was extracted from unstructured medical records of breast cancer patients. A clustering technique (K-Means) was applied to identify patient subgroups with varying tumor aggressiveness profiles. Survival outcomes were further analyzed using the Cox proportional hazards model. Two distinct subgroups were identified: for less aggressive tumors, Quadratic Discriminant Analysis achieved a remarkably high recall of 0.9872, while for more aggressive tumors, Random Forest provided the most favorable trade-off between recall (0.7296) and precision (0.6811). Future research should explore validation across multiple institutions, incorporate molecular biomarkers, and leverage deep learning approaches to enhance predictive performance.

A Nature-Inspired Framework for Dimensionality Reduction and Cancer Diagnosis From Gene Expression Profiles
Abrar Yaqoob, Khawaja T. Tasneem, Mushtaq Ahmad Mir, R. Vijaya Lakshmi, Tejaswini Pradhan, G. V. V. Jagannadha Rao, Mohd Asif Shah
Concurrency and Computation: Practice and Experience, 2026
High‐dimensional gene expression datasets pose significant challenges for cancer classification due to the presence of redundant and irrelevant features. To address this issue, we propose a hybrid framework that integrates the flower pollination algorithm (FPA) with support vector machines (SVM) for effective feature selection and classification. The FPA, inspired by the global and local pollination processes of flowering plants, is adapted into a binary variant using a sigmoid transfer function to select informative subsets of genes. The objective function balances classification accuracy with feature subset sparsity, thereby reducing dimensionality while preserving discriminative power. The selected gene subsets are subsequently evaluated using SVM, which provides robust classification in small‐sample, high‐dimensional scenarios. The proposed FPA‐SVM framework was tested on multiple benchmark cancer datasets, including colon tumor, CNS, ALL‐AML, breast cancer, lung cancer, ovarian cancer, lymphoma, MLL, and SRBCT. Experimental results demonstrate superior performance, with accuracy levels exceeding 98% for most binary‐class datasets and competitive results for multiclass datasets, achieving up to 88.3% accuracy. These findings highlight the effectiveness of the proposed method in enhancing cancer classification, reducing dimensionality, and identifying potential biomarkers for precision medicine.

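The abstract above mentions a sigmoid transfer function for the binary FPA variant and an objective that trades accuracy against subset sparsity, but it does not give formulas. The following is a minimal sketch of how such a step is commonly written, not the authors' implementation. The function names, the weighting parameter `alpha`, and the weighted-sum form of the objective are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 gene mask.

    Gene j is selected with probability sigmoid(position[j]).
    This is one common sigmoid transfer rule (assumed here).
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


def fitness(accuracy, mask, alpha=0.9):
    """Hypothetical objective to minimize (lower is better).

    Combines classification error with the fraction of genes kept;
    alpha is an assumed weight, not a value from the paper.
    """
    selected_fraction = sum(mask) / len(mask)
    return alpha * (1.0 - accuracy) + (1.0 - alpha) * selected_fraction


if __name__ == "__main__":
    rng = random.Random(0)
    mask = binarize([2.0, -2.0, 0.5, -0.5], rng)
    # The accuracy would come from a classifier (e.g. an SVM) trained on
    # the selected genes; 0.95 is a placeholder value.
    print(mask, fitness(0.95, mask))
```

In a full pipeline, the `accuracy` argument would be obtained by cross-validating a classifier on the genes where the mask is 1, and the optimizer would keep the masks with the lowest fitness.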
Optimizing accuracy and dimensionality: a swarm intelligence strategy for robust cancer genomics classification
Abrar Yaqoob, Mushtaq Ahmad Mir, R. Vijaya Lakshmi, Tejaswini Pradhan, G. V. V. Jagannadha Rao, Ghanshyam G. Tejani, Mohd Asif Shah
BioData Mining, 2025
High-dimensional gene expression datasets pose a major challenge in cancer classification due to redundancy, noise, and the risk of overfitting. To address these issues, this study proposes a hybrid framework that integrates the Dung Beetle Optimizer (DBO) for feature selection with Support Vector Machines (SVM) for classification. DBO, a recently developed nature-inspired algorithm, effectively identifies informative and non-redundant subsets of genes by simulating dung beetles' foraging, rolling, obstacle avoidance, stealing, and breeding behaviors. The selected features are then classified using SVM with Radial Basis Function (RBF) kernels, which provide robust decision boundaries even in high-dimensional spaces. Extensive experiments were conducted on publicly available cancer-related gene expression datasets, covering binary, ternary, and quaternary classification tasks. Results show that the proposed DBO-SVM framework achieves 97.4-98.0% accuracy on binary datasets and 84-88% accuracy on multiclass datasets, with balanced Precision, Recall, and F1-scores. These findings highlight the method's ability to enhance classification performance while reducing computational cost and improving biological interpretability. The proposed hybrid model demonstrates strong potential as an efficient and reliable tool for precision medicine and biomedical data analysis.

Transforming Cancer Classification: The Role of Advanced Gene Selection
Abrar Yaqoob, Mushtaq Ahmad Mir, G. V. V. Jagannadha Rao, Ghanshyam G. Tejani
Diagnostics, 2024
Background/Objectives: Accurate classification in cancer research is vital for devising effective treatment strategies. Precise cancer classification depends significantly on selecting the most informative genes from high-dimensional datasets, a task made complex by the extensive data involved. This study introduces the Two-stage MI-PSA Gene Selection algorithm, a novel approach designed to enhance cancer classification accuracy through robust gene selection methods. Methods: The proposed method integrates Mutual Information (MI) and Particle Swarm Optimization (PSO) for gene selection. In the first stage, MI acts as an initial filter, identifying genes rich in cancer-related information. In the second stage, PSO refines this selection to pinpoint an optimal subset of genes for accurate classification. Results: The experimental findings reveal that the MI-PSA method achieves a best classification accuracy of 99.01% with a selected subset of 19 genes, substantially outperforming the MI and SVM methods, which attain best accuracies of 93.44% and 91.26%, respectively, for the same gene count. Furthermore, MI-PSA demonstrates superior performance in terms of average and worst-case accuracy, underscoring its robustness and reliability. Conclusions: The MI-PSA algorithm presents a powerful approach for identifying critical genes essential for precise cancer classification, advancing both our understanding and management of this complex disease.

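The first stage described in the abstract above, an MI filter that ranks genes before a second optimizer refines the subset, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code. It assumes expression values have already been discretized (for example into low/high bins), and the names `mutual_information` and `top_k_genes` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


def top_k_genes(expression, labels, k):
    """Rank genes by MI with the class labels and keep the top k.

    expression: dict mapping gene name -> list of discretized values,
    one value per sample, in the same order as labels.
    """
    scores = {gene: mutual_information(values, labels)
              for gene, values in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    expression = {
        "gene_a": [0, 0, 1, 1],  # tracks the label exactly
        "gene_b": [0, 1, 0, 1],  # independent of the label
    }
    print(top_k_genes(expression, labels, k=1))
```

The genes returned by the filter would then form the reduced search space for the second-stage optimizer.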
A Pragmatic Review of Learning Models Used for Unsupervised Analysis of Existing Cyber Physical Deployments from an Empirical Perspective
International Journal of Intelligent Systems and Applications in Engineering, 2024