Assistant Professor in ECE Dept., PACE INSTITUTE OF TECHNOLOGY & SCIENCES. My research interests are computer vision and image processing.
EDUCATION
I received my Ph.D. in electronics and communication engineering, with a specialization in machine learning, from Koneru Lakshmaiah Education Foundation, India, in September 2020, supervised by Prof. A. S. C. S. Sastry and Prof. P. V. V. Kishore. I received my M.Tech. in electronics and communication engineering, with a specialization in signal processing, from Koneru Lakshmaiah Education Foundation, India, in June 2016, supervised by Prof. P. V. V. Kishore, and my B.Tech. in electronics and communication engineering from Guntur Engineering College (JNTU Kakinada), India, in 2014.
RESEARCH INTERESTS
Computer vision, Machine Learning, Deep Learning, Gesture Recognition.
Scopus Publications: 73
Scholar Citations: 1297
Scholar h-index: 21
Scholar i10-index: 31
Scopus Publications
Classification of cardiac MRI images by multi resolution residual feature forged convolutional vision transformer (MrRf2-CViT) P V V Kishore, D Anil Kumar Engineering Research Express, 2026 Vision Transformers (ViT) have displayed unprecedented success in the segmentation of Left Ventricle (LV), right ventricle (RV) and myocardial (MYO) functions from cardiac MRI (CMRI) images that helped in the classification of cardiovascular diseases (CVDs). ViT based segmentation is a computationally intensive and data extensive training operation. Classification of CMRI data without segmentation modules is challenging because of the relatively less spatial dimensionality and movement artifacts of the cardiac function. To overcome this, we investigated convolutional feature projection with ViT encoders (CViT) for CMRI CVD classification which has been influenced by scale, shift and deformations of features during the training process. To improve CViT classification of CVDs through CMRI data, we propose Multi Resolution Feature Forged Convolutional Vision Transformer (MrRf2-CViT). Contrasting to traditional CViT’s positional encoding and convolutional projection of patched image features, the MrRf2-CNN generates relational features at different scales on a pre-defined set of image sequences. These consecutively extracted image features are flattened, tokenized, and are linearly projected with ViT’s encoder for classification. This MrRf2-CViT model leverages the transformer’s attention and context association capabilities to accomplish shift and scale invariance in sequential CMRI data. Specifically, the MrRf2-CViT learns the patterns within dynamic heart movements represented spatially by convolutional features that are effectively localized with ViT’s attention mechanism for detecting the CVDs. Sunnybrook Cardiac (SCM) and Automated Cardiac Diagnosis Challenge (ACDC) MRI data has been applied to the proposed MrRf2-CViT for evaluation. The experiments conducted show an increase in performance of the proposed approach (MrRf2-CViT) over state-of-the-art MRI image classification methods.
Stretched image resolution augmented attention for classification of paediatric cough sounds using smart phone microphonic recordings P V V Kishore, D Anil Kumar, K Dikshitha, Soans Santosh, Gnane Swarnadh Satapathi Engineering Research Express, 2026 Cough-sound analysis offers a low-cost, rapid screening pathway for acute respiratory infections (ARIs), but deep-learning models trained on fixed-resolution Mel-spectrogram images often generalize poorly because cough events exhibit substantial within-class spatio-temporal variability across subjects and recording conditions. To address this, we propose Stretched Image Resolution Augmented Attention (SIRAA), a multi-stream convolutional framework that learns complementary representations from multi-resolution log-Mel spectrogram images generated by systematically varying two MFCC design parameters: the number of Mel filters and FFT size. SIRAA processes three resolutions in parallel and performs resolution-conditioned feature fusion via an augmented attention mechanism, injecting high-resolution local cues and low-resolution global structure into a central stream used for final classification. We introduce IndiCough2024, a clinician-labelled pediatric cough dataset collected using smartphones in a hospital setting, comprising 503 children (1–6 years) and five diagnostic classes: asthma, pneumonia, upper respiratory tract infection, lower respiratory tract infection, and croup. On IndiCough2024, SIRAA achieves up to 95.7% top-1 accuracy and yields greater than 18 percentage-point improvement over comparable single-resolution training pipelines. Additional benchmarking on public cough datasets further supports the robustness of the proposed multi-resolution augmented attention strategy for cough-based multi-class respiratory screening.
Optimizing continuous sign language recognition through motion selective sparse spatial feature extraction V Prathyusha, P V V Kishore, D Anil Kumar, G V K Murthy Engineering Research Express, 2025 Continuous sign language recognition (CSLR) faces challenges due to variations in motion on spatial features across consecutive frames, affecting the performance of visual feature extraction models. This paper proposes Motion Selective Sparse Spatial Features (MS3F) to represent motion-rich visual information across video frames. MS3F extracts contextual spatial features by optimizing cross-modal loss between visual and text features, computes frame differencing to construct motion feature vectors, and uses a gated recurrent unit (GRU) to learn motion-selective sparse spatial features. A bidirectional long-short term memory network (Bi-LSTM) is then trained to learn temporal dependencies in MS3F features for tokenized targeted glosses. Experiments on the novel Doordarshan Continuous Indian Sign Language (DC-ISL) dataset, along with established benchmarks RWTH-Phoenix-2014 and Chinese-CSL (CCSL), demonstrate the effectiveness of MS3F, achieving word error rates (WERs) of 20.1%, 19.8%, and 19.9%, respectively. Compared to dense baseline models, MS3F improves WER by 2.9% to 3.4% across datasets while reducing inference time by 30% to 34% and GPU memory consumption by 22% to 27%. The motion-selective sparsity mechanism achieves approximately 30% feature reduction, processing only motion-rich frames and enabling real-time performance on standard GPU hardware. This work demonstrates that MS3F effectively captures visual information containing significant motion content while maintaining computational efficiency, advancing practical CSLR technology for real-world deployment.
Acute Respiratory Infections Identification With Cough Sounds and Overlapping Patch Modulated Vision Transformers P. V. V. Kishore, D. Anil Kumar, Pasupuleti Sasikiran, Kaja Krishna Mohan, P. Praveen Kumar, Mogadala Vinod Kumar IEEE Access, 2025 Rapid detection of Acute Respiratory Infections (ARI) is crucial to reduce breathing difficulties and severe life-threatening conditions. Automatic cough identification is being conducted using speech frequency analysis and machine learning models. Learning models trained on Mel frequency spectrum (MFCC) features of cough sounds represented as images have recorded an average binary classification accuracy of 68%. Variable cough sound vs silent intervals between samples of a class in MFCC spectral images has shown to influence training algorithms to learn meaningful patterns for classification. To learn all possible local patterns in the MFCC cough images using a vision transformer model (ViT), we propose an image patch overlapping vision transformer IPO-ViT. The patch overlapping factor k controls the quantity of common pixels between them. The IPO-ViT patch encoder computes all possible local pixel pattern relationships by breaking the image into overlapping patches and equating them across all classes making a balanced augmented dataset. The IPO-ViT is evaluated on our own 511-sound cough dataset (IndiCough_2024) with 5 classes captured at AJ Institute of Medical Sciences, paediatric division along with benchmarks EPFL COUGH VID, Coswara for COVID-19 Diagnosis and Covid19-Cough. The IPO-ViT achieved higher accuracies of around 92.33% over the state-of-the-art cough sound-based disease identification networks.
An Efficient Proposal for Deep Learning-Based Diabetes Prediction D Baswaraj, Ch V V Narasimha Raju, Pundru Chandra Shaker Reddy, Ajmeera Kiran, Mohammad Khaja Shaik, D Anil Kumar 2nd IEEE International Conference on Networks Multimedia and Information Technology Nmitcon 2024, 2024
3-Dimensional Indian Dance Pose Classification using Convolution-al Neural Network D. Anil Kumar, T. Suresh Babu, E. Sai Gowtham, M. Anusha Chandana, G. V. Vineelka, K. Narendra Reddy 2023 IEEE International Conference on Research Methodologies in Knowledge Management Artificial Intelligence and Telecommunication Engineering Rmkmate 2023, 2023
A quad joint relational feature for 3D skeletal action recognition with circular CNNs Proceedings IEEE International Symposium on Circuits and Systems, 2020
DSLR-net a depth based sign language recognition using two stream convents International Journal of Innovative Technology and Exploring Engineering, 2019
Machine learning based 2D pose estimation model for human action recognition using geometrical maps International Journal of Innovative Technology and Exploring Engineering, 2019
Multi modal Rgb D data based Cnn training with uni modal Rgb data testing for real time sign language recognition International Journal of Recent Technology and Engineering, 2019
Depth based 3D indian sign language recognition using adaptive kernels International Journal of Innovative Technology and Exploring Engineering, 2019
Fusing spatio-temporal joint features for adequate skeleton based action recognition using global alignment kernel International Journal of Engineering and Advanced Technology, 2019
Training granular convolution neural network with depth motion maps along with joint angular displacement maps for kinect based human action recognition Journal of Advanced Research in Dynamical and Control Systems, 2019
SWIFT cognitive behavioral assessment model built on cognitive analytics of empirical mode internet of things International Journal of Engineering and Technology Uae, 2018
Spatial Joint features for 3D human skeletal action recognition system using spatial graph kernels International Journal of Engineering and Technology Uae, 2018
Fire detection using computer vision models in surveillance videos Arpn Journal of Engineering and Applied Sciences, 2017
Computer vision based dance posture extraction using slic Journal of Theoretical and Applied Information Technology, 2017
Indian sign language recognition: A comparison between ANN and FIS Journal of Theoretical and Applied Information Technology, 2016
Indian sign language recognition system using new fusion based edge operator Journal of Theoretical and Applied Information Technology, 2016
Edge and texture preserving hybrid algorithm for denoising infield ultrasound medical images Journal of Theoretical and Applied Information Technology, 2016
Medical image watermarking: Run through review Arpn Journal of Engineering and Applied Sciences, 2016
Classification of cardiac MRI images by multi resolution residual feature forged convolutional vision transformer (MrRf2-CViT) PVV Kishore, DA Kumar Engineering Research Express 8 (7), 075203, 2026
Stretched image resolution augmented attention for classification of paediatric cough sounds using smart phone microphonic recordings PVV Kishore, DA Kumar, K Dikshitha, S Santosh, GS Satapathi Engineering Research Express 8 (3), 035213, 2026
Optimizing continuous sign language recognition through motion selective sparse spatial feature extraction V Prathyusha, PVV Kishore, D Anil Kumar, GVK Murthy Engineering Research Express 7 (4), 045288, 2025
Ppent: a pose embedding refinement framework aligning estimated and motion-captured skeletons for real-time word-level sign language recognition PVV Kishore, GH Bindu, B Prasad, DA Kumar, PP Kumar, M Suneetha, ... International Journal of Information Technology, 1-19, 2025
Wavelet convolutional vision transformer (WCViT) for Indian classical dance identification PVV Kishore, D Anil Kumar, G Hima Bindu, B Prasad, P Praveen Kumar, ... International Journal of Information Technology, 1-19, 2025. Citations: 1
Acute Respiratory Infections Identification With Cough Sounds and Overlapping Patch Modulated Vision Transformers PVV Kishore, DA Kumar, P Sasikiran, KK Mohan, PP Kumar, MV Kumar IEEE Access 13, 77507-77521, 2025
Alternating wavelet channel and spatial attention mechanism for online video-based Indian classical dance recognition PVV Kishore, DA Kumar, PP Kumar, GH Bindu International Journal of Information Technology, 1-19, 2024. Citations: 5
An Efficient Proposal for Deep Learning-Based Diabetes Prediction D Baswaraj, CVVN Raju, PCS Reddy, A Kiran, MK Shaik, DA Kumar 2024 Second International Conference on Networks, Multimedia and Information …, 2024. Citations: 6
Deep bharatanatyam pose recognition: a wavelet multi head progressive attention DA Kumar, PVV Kishore, K Sravani Pattern Analysis and Applications 27 (2), 53, 2024. Citations: 8
Machine interpretation of ballet dance: alternating wavelet spatial and channel attention based learning model PVV Kishore, DA Kumar, PP Kumar, D Srihari, N Sasikala, L Divyasree IEEE Access 12, 55264-55280, 2024. Citations: 4
Sign language recognition (slr): a brisk paired deep metric attention learning (bpdmal) model for video data applications PVV Kishore, D Anil Kumar, K Srinivasa Rao SN Computer Science 5 (4), 419, 2024. Citations: 1
Smart water metering system EK Kumar, DA Kumar, T Manwitha, GY Sai AIP Conference Proceedings 2512 (1), 020073, 2024
Three stream human action recognition using Kinect EK Kumar, DA Kumar, K Murali, PS Kiran, MTK Kumar AIP Conference Proceedings 2512 (1), 020077, 2024. Citations: 3
A deep learning based approach to recognize the gestures used for controlling smart wheelchair EK Kumar, BP Kumar, L Rajasekhar, KS Chandana, DA Kumar AIP Conference Proceedings 2512 (1), 020060, 2024. Citations: 1
Human action recognition from depth sensor via skeletal joint and shape trajectories with a time-series graph matching DA Kumar, EK Kumar, M Suneetha, L Rajasekhar AIP Conference Proceedings 2512 (1), 020029, 2024. Citations: 6
Joint motion affinity maps (JMAM) and their impact on deep learning models for 3D sign language recognition PVV Kishore, DA Kumar, RC Tanguturi, K Srinivasarao, PP Kumar, ... IEEE Access 12, 11258-11275, 2024. Citations: 14
Joint Motion Affinity Maps (JMAM) and Their Impact on Deep Learning Models for 3D Sign Language Recognition (vol 12, pg 11258, 2024) PVV Kishore, DA Kumar, RC Tanguturi, K Srinivasarao, PP Kumar, ... IEEE Access 12, 162929-162929, 2024
Ensemble nonlinear machine learning model for chronic kidney diseases prediction S Sampath, ML Prasad, MM Hussain, R Parameswari, DA Kumar, ... 2023 IEEE 3rd Mysore Sub Section International Conference (MysuruCon), 1-6, 2023. Citations: 21
3-Dimensional Indian Dance Pose Classification using Convolution-al Neural Network DA Kumar, TS Babu, ES Gowtham, MA Chandana, GV Vineelka, ... 2023 International Conference on Research Methodologies in Knowledge …, 2023
View invariant human action recognition using surface maps via convolutional networks DA Kumar, PVV Kishore, GVK Murthy, TR Chaitanya, SK Subhani 2023 International Conference on Research Methodologies in Knowledge …, 2023. Citations: 2
MOST CITED SCHOLAR PUBLICATIONS
Motionlets matching with adaptive kernels for 3-d indian sign language recognition PVV Kishore, DA Kumar, ASCS Sastry, EK Kumar IEEE Sensors Journal 18 (8), 3327-3337, 2018. Citations: 112
Indian classical dance action identification and classification with convolutional neural networks PVV Kishore, KVV Kumar, E Kiran Kumar, A Sastry, M Teja Kiran, ... Advances in Multimedia 2018 (1), 5141402, 2018. Citations: 99
Training CNNs for 3-D sign language recognition with color texture coded joint angular displacement maps EK Kumar, PVV Kishore, A Sastry, MTK Kumar, DA Kumar IEEE Signal Processing Letters 25 (5), 645-649, 2018. Citations: 98
Optical flow hand tracking and active contour hand shape features for continuous sign language recognition with artificial neural networks PVV Kishore, MVD Prasad, DA Kumar, A Sastry 2016 IEEE 6th international conference on advanced computing (IACC), 346-351, 2016. Citations: 79
3D sign language recognition with joint distance and angular coded color topographical descriptor on a 2–stream CNN EK Kumar, PVV Kishore, MTK Kumar, DA Kumar Neurocomputing 372, 40-54, 2020. Citations: 73
Yoganet: 3-d yoga asana recognition using joint angular displacement maps with convnets TKK Maddala, PVV Kishore, KK Eepuri, AK Dande IEEE Transactions on Multimedia 21 (10), 2492-2503, 2019. Citations: 65
Three-dimensional sign language recognition with angular velocity maps and connived feature resnet EK Kumar, PVV Kishore, MTK Kumar, DA Kumar, A Sastry IEEE Signal Processing Letters 25 (12), 1860-1864, 2018. Citations: 55
Multi modal spatio temporal co-trained CNNs with single modal testing on RGB–D based sign language gesture recognition S Ravi, M Suman, PVV Kishore, K Kumar, A Kumar Journal of Computer Languages 52, 88-102, 2019. Citations: 54
Indian Classical Dance Classification with Adaboost Multiclass Classifier on Multi Feature Fusion KVV Kumar, PVV Kishore, DA Kumar, 2017. Citations: 52
Indian sign language recognition using graph matching on 3D motion captured signs DA Kumar, A Sastry, PVV Kishore, EK Kumar Multimedia Tools and Applications 77 (24), 32063-32091, 2018. Citations: 37
Selfie continuous sign language recognition with neural network classifier G Anantha Rao, PVV Kishore, A Sastry, D Anil Kumar, E Kiran Kumar Proceedings of 2nd International Conference on Micro-Electronics …, 2017. Citations: 35
Selfie sign language recognition with convolutional neural networks PVV Kishore, GA Rao, EK Kumar, MTK Kumar, DA Kumar International Journal of Intelligent Systems and Applications 11 (10), 63, 2018. Citations: 34
3D sign language recognition using spatio temporal graph kernels DA Kumar, A Sastry, PVV Kishore, EK Kumar Journal of King Saud University-Computer and Information Sciences 34 (2 …, 2022. Citations: 33
S3DRGF: Spatial 3-D relational geometric features for 3-D sign language representation and recognition DA Kumar, A Sastry, PVV Kishore, EK Kumar, MTK Kumar IEEE Signal Processing Letters 26 (1), 169-173, 2018. Citations: 32
Neural network classifier for continuous sign language recognition with selfie video GA Rao, PVV Kishore, DA Kumar, A Sastry Far East Journal of Electronics and Communications 17 (1), 49, 2017. Citations: 32
A four-stream ConvNet based on spatial and depth flow for human action classification using RGB-D data D Srihari, PVV Kishore, EK Kumar, DA Kumar, MTK Kumar, MVD Prasad, ... Multimedia Tools and Applications 79 (17), 11723-11746, 2020. Citations: 28
INDIAN SIGN LANGUAGE RECOGNITION SYSTEM USING NEW FUSION BASED EDGE OPERATOR. MVD Prasad, PVV Kishore, EK Kumar, DA Kumar Journal of Theoretical & Applied Information Technology 88 (3), 2016. Citations: 26
Continuous sign language recognition from tracking and shape features using fuzzy inference engine PVV Kishore, DA Kumar, M Manikanta 2016 International Conference on Wireless Communications, Signal Processing …, 2016. Citations: 24
Nutritive and feeding value of cottonseed meal in broilers. A review G Thirumalaisamy, MR Purushothaman, PV Kumar, P Selvaraj, ... Adv. Anim. Vet. Sci 4 (8), 398-404, 2016. Citations: 23
Early estimation model for 3D-discrete indian sign language recognition using graph matching EK Kumar, PVV Kishore, DA Kumar, MTK Kumar Journal of King Saud University-Computer and Information Sciences 33 (7 …, 2021. Citations: 22