Advanced gesture recognition in Indian sign language using a synergistic combination of YOLOv10 with Swin Transformer model Umang Rastogi, Rajendra Prasad Mahapatra, Sushil Kumar Scientific Reports, 2025 Communication between deaf or mute individuals and hearing persons is often hindered by the lack of mutual understanding of sign or vocal language. To bridge this gap, Indian Sign Language Recognition (ISLR) systems are essential. This paper proposes a real-time ISLR framework based on the YOLOv10-ST model, which integrates the Swin Transformer into the YOLOv10 architecture for enhanced feature extraction. The model also incorporates Mish activation to improve gradient flow and detection accuracy. A custom dataset comprising 15,000 static images (1,000 per sign for 15 signs) and 35 dynamic videos (covering 7 sign classes) was used for training and evaluation. Experimental results demonstrate high performance, with the model achieving 97.50% precision, 98.10% recall, and 96.58% F1-score for image-based sign recognition, and 95.24% precision, 96.00% recall, and 95.87% F1-score for video-based gestures. The model also achieves a mean Average Precision (mAP) of 97.62% and real-time inference speeds of 48.7 FPS. Ablation studies validate the contributions of Swin Transformer and Mish activation, while paired t-tests confirm statistical significance ($p < 0.005$). The experimental findings demonstrate that the YOLOv10-ST model efficiently recognizes static and dynamic ISL in real time with minimal computational overhead.
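The abstract above names Mish as the activation function but does not define it. For reference, Mish is commonly defined as mish(x) = x · tanh(softplus(x)), with softplus(x) = ln(1 + e^x). The following is a minimal scalar sketch of that standard definition, not the authors' implementation; the function names `softplus` and `mish` are chosen here for illustration.

```python
import math


def softplus(x: float) -> float:
    """Softplus ln(1 + e^x), written in a form that avoids overflow.

    Uses the identity ln(1 + e^x) = max(x, 0) + ln(1 + e^(-|x|)).
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x)).

    Smooth and non-monotonic; close to x for large positive inputs and
    close to 0 (slightly negative) for large negative inputs.
    """
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):+.6f}")
```

Unlike ReLU, Mish lets small negative values pass through with a non-zero gradient, which is the usual argument for its effect on gradient flow.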
Progressing Alzheimer's Diagnosis with Ensemble CNN Model Vivek Yadav, Vaibhav, Rohit, Sushil Kumar, Umang Rastogi ICDT 2025 3rd International Conference on Disruptive Technologies, 2025 This paper proposes an Alzheimer's disease prediction model using a CNN, implemented on a system with AMD Radeon Vega 8 Graphics to improve computational tractability. The results are encouraging: the model reached 97% accuracy in predicting the disease from patients' MRI images. The integration of the Vega 8 GPU allows the system to process large datasets and perform computations at a significantly faster rate, thereby reducing training time. The MRI dataset used in this study is highly imbalanced, with four classes: Non-Demented (Non-D), Very Mild Dementia (V.M.D), Mild Dementia (M.D), and Moderate Demented (M.D.A). The CNN was chosen for its capability in feature extraction and learning from MRI images without requiring human intervention. The model was run on binary and multiclass datasets, and it achieved 97% accuracy. A systematic analysis of the test results shows that the system can diagnose Alzheimer's at the initial stage of disease development. This approach shows how CNNs, with the aid of AMD Radeon Vega 8 Graphics, perform well in automating Alzheimer's diagnosis while removing the barriers associated with manual feature extraction and the conventional dependence on experts.
Book Recommendation System Using Hybrid Content and Collaborative Filtering Techniques Urvi Gupta, Tripti Singh, Vidyush Singh, Umang Rastogi, Sushil Kumar ICDT 2025 3rd International Conference on Disruptive Technologies, 2025 This study describes the development of a book recommendation system that finds books to provide personalized suggestions according to user preferences and behavior. The system increases user satisfaction by tailoring recommendations to users' tastes, improving the overall reading experience, and enhancing engagement and retention on platforms. It uses collaborative filtering to track user interactions, content-based filtering to study book characteristics such as genre and author, and hybrid methods that combine both techniques to improve accuracy and mitigate limitations such as the cold-start problem. Machine learning models, data pre-processing algorithms, and accuracy and recall measures ensure the effectiveness of the system in providing individualized and relevant recommendations that are 85% accurate. The findings demonstrate the system's capacity to generate good suggestions, improving user experience and engagement. Future improvements could include real-time recommendations and broader feedback integration to further optimize accuracy and usability.
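The abstract above does not say how the content-based and collaborative signals are combined. One common way to build such a hybrid is a weighted blend that falls back to the content score when no interaction data exists (the cold-start case). The sketch below illustrates that idea under stated assumptions: the Jaccard similarity on attribute sets, the weight `alpha`, and the function names are all hypothetical choices, not the paper's method.

```python
from typing import Optional, Set


def content_similarity(attrs_a: Set[str], attrs_b: Set[str]) -> float:
    """Jaccard overlap between two books' attribute sets (e.g. genre, author)."""
    union = attrs_a | attrs_b
    return len(attrs_a & attrs_b) / len(union) if union else 0.0


def hybrid_score(content: float, collab: Optional[float], alpha: float = 0.5) -> float:
    """Blend a content-based score with a collaborative score.

    `collab` is None when there is no interaction data for the user or
    book (cold start); in that case only the content score is used.
    `alpha` is an assumed mixing weight in [0, 1].
    """
    if collab is None:
        return content
    return alpha * content + (1.0 - alpha) * collab


if __name__ == "__main__":
    liked = {"fantasy", "tolkien"}
    candidate = {"fantasy", "rowling"}
    c = content_similarity(liked, candidate)
    print("content only (cold start):", hybrid_score(c, None))
    print("blended with collaborative score 0.8:", hybrid_score(c, 0.8))
```

Both input scores are assumed to lie in [0, 1] so that the blend stays on the same scale.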
A Hybrid Approach based on Haar Cascade, Softmax, and CNN for Human Face Recognition Pancham Singh, Mrignainy Kansal, Rajeev Kumar Singh, Sushil Kumar, Chelsi Sen Journal of Scientific and Industrial Research, 2024 Face recognition has long been studied, but it remains an important and active research field in deep learning, computer vision, and forensics. There are several applications, such as group action systems, human-machine interaction, and security systems, where face recognition is of vital importance. Algorithms based on Deep Learning (DL) have shown higher performance in terms of accuracy and processing speed than traditional machine learning algorithms. As the dominant methodology in deep learning, the Convolutional Neural Network (CNN) has contributed immensely to face recognition. In this paper, a novel hybrid deep learning algorithm containing Haar Cascade, SoftMax, and CNN components is proposed. It provides promising results for applications based on the recognition of human faces. In the experiments, this hybrid algorithm achieves an accuracy of 99.95%, which is significantly higher than that of the existing Viola-Jones and Principal Component Analysis (PCA) methods, which have accuracy rates of 74.38% and 81.81% respectively. The accuracy of the proposed algorithm is closer to that of Linear Discriminant Analysis (LDA), at 95.45%, and of SoftMax and CNN, at 94%. The proposed hybrid deep learning algorithm thus improves recognition performance relative to the existing face recognition techniques it is compared against.
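The abstract above lists SoftMax as one component of the hybrid but does not describe how it is wired in. In most CNN classifiers, softmax converts the network's final raw scores (logits) into a probability distribution over identities. The sketch below shows only that standard function, assuming it plays this usual role here; the example logits and the three-identity setup are invented for illustration.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to 1.

    Subtracting the maximum logit first leaves the result unchanged
    but prevents overflow in exp() for large scores.
    """
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical CNN outputs for three candidate identities.
    scores = [2.0, 0.5, -1.0]
    probs = softmax(scores)
    predicted = max(range(len(probs)), key=probs.__getitem__)
    print("probabilities:", [round(p, 4) for p in probs])
    print("predicted class index:", predicted)
```

The predicted identity is the index with the highest probability, which is always the index of the largest logit.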