Alwin Poulose

@en.knu.ac.kr

Researcher at Center for ICT & Automobile Convergence (CITAC)
Kyungpook National University, Daegu, South Korea

https://researchid.co/alwinpoulosepalatty

Dr. Alwin Poulose was born in Manjapra, Kerala, India, in 1992. He received the B.Sc. degree in computer maintenance and electronics from Union Christian College, Aluva, Kerala, India, in 2012, the M.Sc. degree in electronics from MES College Marampally, Kerala, India, in 2014, the M.Tech. degree in communication systems from Christ University, Bangalore, India, in 2017, and the Ph.D. degree in electronics and electrical engineering from Kyungpook National University, Daegu, South Korea, in 2021. His research interests include indoor localization, human activity recognition, facial emotion recognition, and human behavior prediction. He is a reviewer for prominent international engineering and science journals and has served as a technical program committee member at several international conferences. He is currently a researcher at the Center for ICT & Automobile Convergence (CITAC), Kyungpook National University, Daegu, South Korea.

EDUCATION

2017/08/21 – 2021/08/31: Ph.D. Degree in Electronic and Electrical Engineering, Kyungpook National University, Daegu, South Korea.

2015/06/01 – 2017/05/21: M.Tech. Degree in Electronics and Communication Engineering, Christ University, Bangalore, India.

2014/08/05 – 2015/04/25: Trainee for International English Language Testing System (IELTS), Newman's Academy, Angamaly, Kerala, India.

2012/07/02 – 2014/07/31: M.Sc. Degree in Electronics, MES College, Marampally, Kerala, India.

2009/07/01 – 2012/04/30: B.Sc. Degree in Computer Maintenance and Electronics, Union Christian College, Aluva, Kerala, India.

RESEARCH INTERESTS

Indoor localization, human activity recognition, facial emotion recognition, and human behavior prediction

Scopus Publications: 32
Scholar Citations: 1146
Scholar h-index: 18
Scholar i10-index: 22

Scopus Publications

  • CVGG-19: Customized Visual Geometry Group Deep Learning Architecture for Facial Emotion Recognition
    Jung Hwan Kim, Alwin Poulose, and Dong Seog Han

    Institute of Electrical and Electronics Engineers (IEEE)
    Facial emotion recognition (FER) detects a user’s facial expression with camera sensors and responds according to the user’s emotions. FER can be applied to entertainment, security, and traffic safety. The FER system requires a highly accurate and efficient algorithm to classify the driver’s emotions. State-of-the-art architectures for FER, such as the visual geometry group (VGG), Inception-V1, ResNet, and Xception, offer reasonable classification performance. Nevertheless, the original VGG architectures suffer from the vanishing gradient problem, limited performance gains, and expensive computational cost. In this paper, we propose the customized visual geometry group-19 (CVGG-19), which adopts designs from VGG, Inception-V1, ResNet, and Xception. Our proposed CVGG-19 architecture outperforms the conventional VGG-19 architecture by 59.29% while reducing the computational cost by 89.5%. Moreover, the CVGG-19 architecture’s F1-score, which represents real-time classification performance, is superior to the Inception-V1, ResNet50, and Xception architectures by 3.86% on average.
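
    A minimal PyTorch sketch of the underlying design idea, a VGG-style double-3x3 block with a ResNet-style shortcut to ease gradient flow, is shown below; the layer sizes are illustrative assumptions, not the published CVGG-19 configuration.

      import torch
      import torch.nn as nn

      class ResidualVGGBlock(nn.Module):
          # Two stacked 3x3 convolutions as in VGG, plus a shortcut path as in
          # ResNet; a 1x1 convolution matches channel counts when they differ.
          def __init__(self, c_in, c_out):
              super().__init__()
              self.body = nn.Sequential(
                  nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU(),
                  nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out))
              self.skip = nn.Conv2d(c_in, c_out, 1) if c_in != c_out else nn.Identity()

          def forward(self, x):
              return torch.relu(self.body(x) + self.skip(x))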

  • Quantification of golgi dispersal and classification using machine learning models
    Rutika Sansaria, Krishanu Dey Das, and Alwin Poulose

    Elsevier BV

  • Dispersive Modeling of Normal and Cancerous Cervical Cell Responses to Nanosecond Electric Fields in Reversible Electroporation Using a Drift-Step Rectifier Diode Generator
    Mayank Kumar, Sachin Kumar, Shubhro Chakrabartty, Alwin Poulose, Hala Mostafa, and Bhawna Goyal

    MDPI AG
    This paper creates an approximate three-dimensional model for normal and cancerous cervical cells using image processing and computer-aided design (CAD) tools. The model is then exposed to low-frequency electric pulses to verify the work with experimental data. The transmembrane potential, pore density, and pore radius evolution are analyzed. This work adds a study of the electrodeformation of cells under an electric field to investigate cytoskeleton integrity. The Maxwell stress tensor is calculated for the dispersive bi-lipid layer plasma membrane. The solid displacement is calculated under electric stress to observe cytoskeleton integrity. After verifying the results with previous experiments, the cells are exposed to a nanosecond pulsed electric field. The nanosecond pulse is applied using a drift-step rectifier diode (DSRD)-based generator circuit. The cells’ transmembrane voltage (TMV), pore density, pore radius evolution, displacement of the membrane under electric stress, and strain energy are calculated. A thermal analysis of the cells under a nanosecond pulse is also carried out to prove that it constitutes a non-thermal process. The results showed differences in normal and cancerous cell responses to electric pulses due to changes in morphology and differences in the cells’ electrical and mechanical properties. This work is a model-driven microdosimetry method that could be used for diagnostic and therapeutic purposes.
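
    The quantity at the center of this analysis, the induced transmembrane voltage, has a classic first-order (Schwan) approximation for a spherical cell in a uniform field; the short numeric sketch below uses generic textbook-scale values as assumptions, not the paper's dispersive cell model.

      import numpy as np

      # Schwan approximation: dV(theta, t) = 1.5*E*R*cos(theta)*(1 - exp(-t/tau)).
      # E, R, and tau below are generic assumptions for illustration only.
      E = 1e6        # applied field, V/m (nanosecond-pulse regime)
      R = 7.5e-6     # cell radius, m
      tau = 100e-9   # membrane charging time constant, s
      t = np.linspace(0, 500e-9, 6)                    # time samples, s
      dV_pole = 1.5 * E * R * (1 - np.exp(-t / tau))   # cos(theta) = 1 at the pole
      print(dV_pole)  # rises toward ~1.5*E*R (about 11 V here) as t >> tau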

  • Versatility Investigation of Grown Titanium Dioxide Nanoparticles and Their Comparative Charge Storage for Memristor Devices
    Shubhro Chakrabartty, Abdulkarem H. M. Almawgani, Sachin Kumar, Mayank Kumar, Suvojit Acharjee, Alaaddin Al-Shidaifat, Alwin Poulose, and Turki Alsuwian

    MDPI AG
    Memristive devices have garnered significant attention in the field of electronics over the past few decades. The reason behind this immense interest lies in the ubiquitous nature of memristive dynamics within nanoscale devices, offering the potential for revolutionary applications. These applications span from energy-efficient memories to the development of physical neural networks and neuromorphic computing platforms. In this research article, the angle toppling technique (ATT) was employed to fabricate titanium dioxide (TiO2) nanoparticles with an estimated size of around 10 nm. The nanoparticles were deposited onto a 50 nm SiOx thin film (TF), which was situated on an n-type Si substrate. Subsequently, the samples underwent annealing processes at temperatures of 550 °C and 950 °C. Structural studies of the samples were carried out with a field emission gun scanning electron microscope (FEG-SEM; JEOL JSM-7600F). The as-fabricated sample exhibited noticeable clusters of nanoparticles, which were less prominent in the samples annealed at 550 °C and 950 °C. The elemental composition revealed the presence of titanium (Ti), oxygen (O), and silicon (Si) from the substrate within the samples. X-ray diffraction (XRD) analysis revealed that the as-fabricated sample predominantly consisted of the rutile phase. Comparative studies of the charge storage and endurance of the as-deposited, 550 °C, and 950 °C annealed devices were carried out, where the as-grown device showed promising responses for brain-computing applications. Furthermore, the teaching-learning-based optimization (TLBO) technique was used to conduct further comparisons of the results.

  • Enhancing Animal Welfare with Interaction Recognition: A Deep Dive into Pig Interaction Using Xception Architecture and SSPD-PIR Method
    Jung Hwan Kim, Alwin Poulose, Savina Jassica Colaco, Suresh Neethirajan, and Dong Seog Han

    MDPI AG
    The advent of artificial intelligence (AI) in animal husbandry, particularly in pig interaction recognition (PIR), offers a transformative approach to enhancing animal welfare, promoting sustainability, and bolstering climate resilience. This innovative methodology not only mitigates labor costs but also significantly reduces stress levels among domestic pigs, thereby diminishing the necessity for constant human intervention. However, the raw PIR datasets often encompass irrelevant porcine features, which pose a challenge for the accurate interpretation and application of these datasets in real-world scenarios. The majority of these datasets are derived from sequential pig imagery captured from video recordings, and an unregulated shuffling of data often leads to an overlap of data samples between training and testing groups, resulting in skewed experimental evaluations. To circumvent these obstacles, we introduced a groundbreaking solution—the Semi-Shuffle-Pig Detector (SSPD) for PIR datasets. This novel approach ensures a less biased experimental output by maintaining the distinctiveness of testing data samples from the training datasets and systematically discarding superfluous information from raw images. Our optimized method significantly enhances the true performance of classification, providing unbiased experimental evaluations. Remarkably, our approach has led to a substantial improvement in the isolation after feeding (IAF) metric by 20.2% and achieved higher accuracy in segregating IAF and paired after feeding (PAF) classifications exceeding 92%. This methodology, therefore, ensures the preservation of pertinent data within the PIR system and eliminates potential biases in experimental evaluations. As a result, it enhances the accuracy and reliability of real-world PIR applications, contributing to improved animal welfare management, elevated food safety standards, and a more sustainable and climate-resilient livestock industry.
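
    A minimal sketch of the splitting idea, assuming contiguous blocks of video frames are assigned wholesale to train or test so that near-identical neighbouring frames never straddle the split; this is a re-creation of the concept, not the published SSPD code.

      import numpy as np

      def semi_shuffle_split(n_frames, test_ratio=0.2, block=200, seed=0):
          # Cut the frame sequence into contiguous blocks, shuffle block order,
          # and assign whole blocks to the test set: adjacent (near-duplicate)
          # frames stay on the same side of the train/test boundary.
          blocks = [np.arange(s, min(s + block, n_frames))
                    for s in range(0, n_frames, block)]
          order = np.random.default_rng(seed).permutation(len(blocks))
          n_test = max(1, int(len(blocks) * test_ratio))
          test = np.concatenate([blocks[i] for i in order[:n_test]])
          train = np.concatenate([blocks[i] for i in order[n_test:]])
          return train, test

      train_idx, test_idx = semi_shuffle_split(10_000)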

  • DISubNet: Depthwise Separable Inception Subnetwork for Pig Treatment Classification Using Thermal Data
    Savina Jassica Colaco, Jung Hwan Kim, Alwin Poulose, Suresh Neethirajan, and Dong Seog Han

    MDPI AG
    Thermal imaging is increasingly used in poultry, swine, and dairy animal husbandry to detect disease and distress. In intensive pig production systems, early detection of health and welfare issues is crucial for timely intervention. Using thermal imaging for pig treatment classification can improve animal welfare and promote sustainable pig production. In this paper, we present a depthwise separable inception subnetwork (DISubNet), a lightweight model for classifying four pig treatments. Based on the modified model architecture, we propose two DISubNet versions: DISubNetV1 and DISubNetV2. Our proposed models are compared to other deep learning models commonly employed for image classification. The thermal dataset captured by a forward-looking infrared (FLIR) camera is used to train these models. The experimental results demonstrate that the proposed models for thermal images of various pig treatments outperform other models. In addition, both proposed models achieve approximately 99.96–99.98% classification accuracy with fewer parameters.
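
    The building block named in the title is standard and easy to sketch: below is a minimal PyTorch depthwise separable convolution (a per-channel 3x3 followed by a channel-mixing 1x1), which is what gives such models their small parameter count. Channel sizes are illustrative, not the DISubNet configuration.

      import torch.nn as nn

      class DepthwiseSeparableConv(nn.Module):
          # Depthwise 3x3 (groups=c_in gives one filter per channel), then a
          # pointwise 1x1 that mixes channels: roughly c_in*9 + c_in*c_out
          # parameters versus c_in*c_out*9 for a dense 3x3 convolution.
          def __init__(self, c_in, c_out):
              super().__init__()
              self.depthwise = nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in)
              self.pointwise = nn.Conv2d(c_in, c_out, 1)

          def forward(self, x):
              return self.pointwise(self.depthwise(x))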

  • Deep Learning Approaches for Bimodal Speech Emotion Recognition: Advancements, Challenges, and a Multi-Learning Model
    Samuel Kakuba, Alwin Poulose, and Dong Seog Han

    Institute of Electrical and Electronics Engineers (IEEE)
    Though acoustic speech emotion recognition has been studied for a while, bimodal speech emotion recognition using both acoustics and text has gained momentum, since speech emotion recognition does not involve the acoustic modality alone. However, there is little review work on the available bimodal speech emotion recognition (SER) research. The available review works mostly concentrate on the use of convolutional neural networks (CNNs) and recurrent neural networks (RNNs). However, recent deep learning techniques like attention mechanisms and fusion strategies have shaped bimodal SER research without explicit analysis of their significance when used singly or in combination with the traditional deep learning techniques. In this paper, we therefore review the recently published literature that involves these deep learning techniques to ascertain the current trends and the challenges that have kept bimodal SER from full deployment in natural environments for off-the-shelf applications. In addition, we carried out experiments to ascertain the optimal combination of acoustic features and the significance of the attention mechanisms, alone and in combination with the traditional deep learning techniques. We propose a multi-technique model called the deep learning-based multi-learning model for emotion recognition (DBMER) that operates with the multi-learning capabilities of CNNs, RNNs, and multi-head attention mechanisms. We noted that attention mechanisms play a pivotal role in the performance of bimodal dyadic SER systems. However, the scarcity of publicly available datasets, the difficulty of acquiring bimodal SER data, and cross-corpus and multilingual studies remain open problems in bimodal SER research. Our experiments on the proposed DBMER model showed that though each of the deep learning techniques benefits the task, the results are more accurate and robust when they are used in careful combination with multi-level fusion approaches.

  • Simulation of an Indoor Visible Light Communication System Using Optisystem
    Alwin Poulose

    MDPI AG
    Visible light communication (VLC) is an emerging research area in wireless communication. The system works in the same way as optical fiber-based communication systems; however, the VLC system uses free space as its transmission medium. The invention of the light-emitting diode (LED) significantly updated the technologies used in modern communication systems. In VLC, the LED acts as a transmitter and sends data in the form of light when the receiver is in the line of sight (LOS) condition. The VLC system sends data by blinking the light at a speed too high for human eyes to perceive. The detector receives the flashing light at high speed and decodes the transmitted data. One significant advantage of the VLC system over other communication systems is that it is easy to implement using an LED and a photodiode or phototransistor. The system is compact, inexpensive, and low-power; it prevents radio interference and eliminates the need for broadcast rights and buried cables. In this paper, we investigate the performance of an indoor VLC system using Optisystem simulation software. We simulated an indoor VLC system using LOS and non-line-of-sight (NLOS) propagation models. Our simulation analyzes the LOS propagation model by considering the direct path with a single LED as a transmitter. The NLOS propagation model-based VLC system analyses two scenarios, considering single and dual LEDs as its transmitter. The effects of incidence and irradiance angles in the LOS propagation model and eye diagrams of the LOS/NLOS models are investigated to identify the signal distortion. We also analyzed the impact of the field of view (FOV) in the NLOS propagation model using a single LED as a transmitter and estimated the bit rate (Rb). Our theoretical results show that the system simulated in this paper achieved bit rates in the range of 2.1208×10^7 to 4.2147×10^7 bits/s as the FOV changes from 30° to 90°. A VLC hardware design is further considered for real-time implementations. Our VLC hardware system achieved an average 70% data recovery rate in the LOS propagation model and a 40% data recovery rate in the NLOS propagation model. This paper’s analysis shows that our simulated VLC results are technically beneficial for real-world VLC systems.
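
    The LOS channel the paper simulates is conventionally modeled with a Lambertian radiation pattern; a minimal sketch follows, where the transmit power, angles, detector area, and FOV are generic assumptions rather than the Optisystem settings.

      import numpy as np

      def los_received_power(Pt, half_angle_deg, phi_deg, psi_deg, d, area_m2, fov_deg):
          # Lambertian LOS gain: H = (m+1)*A / (2*pi*d^2) * cos^m(phi) * cos(psi),
          # zero outside the receiver field of view.
          if psi_deg > fov_deg:
              return 0.0
          m = -np.log(2) / np.log(np.cos(np.radians(half_angle_deg)))  # Lambertian order
          H = ((m + 1) * area_m2 / (2 * np.pi * d ** 2)
               * np.cos(np.radians(phi_deg)) ** m * np.cos(np.radians(psi_deg)))
          return Pt * H

      # 1 W LED, 60-degree half-power angle, 2 m link, 1 cm^2 photodiode, 70-degree FOV
      print(los_received_power(1.0, 60, 30, 30, 2.0, 1e-4, 70))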

  • Verilog Design, Synthesis, and Netlisting of IoT-Based Arithmetic Logic and Compression Unit for 32 nm HVT Cells
    Raj Mouli Jujjavarapu and Alwin Poulose

    MDPI AG
    Microprocessor designs have become a revolutionary technology in almost every industry, bringing about automation and modern electronic gadgets. While trying to improve these hardware modules to handle heavy computational loads, designers have substantially reached limits in size, power efficiency, and similar avenues. Due to these constraints, many manufacturers and corporate entities are exploring ways to optimize these processors. One such approach is to design microprocessors tailored to a specific operating system, an approach that came to the limelight when several companies launched their own microprocessors. In this paper, we look into one method of using an arithmetic logic unit (ALU) module for internet of things (IoT)-enabled devices. A specific set of operations is added to the classical ALU to speed up computational processes in IoT-specific programs. We integrated a compression module and a fast multiplier based on the Vedic algorithm into the 16-bit ALU module. The designed ALU module is also synthesized under a 32 nm HVT cell library from the Synopsys database to generate an overview of the areal efficiency, logic levels, and layout of the designed module, and to produce a netlist from this database. The synthesis provides a complete overview of how the module would be manufactured if sent to a foundry.
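
    The fast multiplier mentioned here follows the Vedic Urdhva Tiryagbhyam (vertical and crosswise) scheme; since the actual design is Verilog RTL, the sketch below only illustrates the column-wise partial-product-and-carry idea in Python.

      def urdhva_multiply(a, b, n=16):
          # Urdhva Tiryagbhyam: for each result column k, sum all bit products
          # a_i*b_j with i+j = k, keep the LSB, and carry the rest forward.
          A = [(a >> i) & 1 for i in range(n)]
          B = [(b >> i) & 1 for i in range(n)]
          result, carry = 0, 0
          for k in range(2 * n - 1):
              col = carry + sum(A[i] * B[k - i]
                                for i in range(max(0, k - n + 1), min(k, n - 1) + 1))
              result |= (col & 1) << k
              carry = col >> 1
          return result | (carry << (2 * n - 1))

      assert urdhva_multiply(51234, 60001) == 51234 * 60001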

  • Deep Learning-Based Speech Emotion Recognition using Multi-Level Fusion of Concurrent Features
    Samuel Kakuba, Alwin Poulose, and Dong Seog Han

    Institute of Electrical and Electronics Engineers (IEEE)
    The detection and classification of emotional states in speech involves the analysis of audio signals and text transcriptions. There are complex relationships between the extracted features at different time intervals which ought to be analyzed to infer the emotions in speech. These relationships can be represented as spatial, temporal and semantic tendency features. In addition to the emotional features that exist in each modality, the text modality consists of semantic and grammatical tendencies in the uttered sentences. Spatial and temporal features have been extracted sequentially in deep learning-based models using convolutional neural networks (CNN) followed by recurrent neural networks (RNN), which may be weak not only at detecting the separate spatial-temporal feature representations but also the semantic tendencies in speech. In this paper, we propose a deep learning-based model named the concurrent spatial-temporal and grammatical (CoSTGA) model that concurrently learns spatial, temporal and semantic representations in the local feature learning block (LFLB), which are fused as a latent vector to form an input to the global feature learning block (GFLB). We also investigate the performance of multi-level feature fusion compared to single-level fusion using the multi-level transformer encoder model (MLTED) that we also propose in this paper. The proposed CoSTGA model uses multi-level fusion, first at the LFLB level where similar features (spatial or temporal) are separately extracted from a modality, and secondly at the GFLB level where the spatial-temporal features are fused with the semantic tendency features. The proposed CoSTGA model uses a combination of dilated causal convolutions (DCC), bidirectional long short-term memory (BiLSTM), transformer encoders (TE), and multi-head and self-attention mechanisms. Acoustic and lexical features were extracted from the interactive emotional dyadic motion capture (IEMOCAP) dataset. The proposed model achieves weighted and unweighted accuracies of 75.50% and 75.82%, and recall and F1 scores of 75.32% and 75.57%, respectively. These results imply that concurrently learned spatial-temporal features, together with semantic tendencies learned in a multi-level approach, improve the model’s effectiveness and robustness.

  • Attention-Based Multi-Learning Approach for Speech Emotion Recognition With Dilated Convolution
    Samuel Kakuba, Alwin Poulose, and Dong Seog Han

    Institute of Electrical and Electronics Engineers (IEEE)
    The success of deep learning in speech emotion recognition has led to its application in resource-constrained devices. It has been applied in human-to-machine interaction applications like social living assistance, authentication, health monitoring and alertness systems. In order to ensure a good user experience, robust, accurate and computationally efficient deep learning models are necessary. Recurrent neural networks (RNN) like long short-term memory (LSTM), gated recurrent units (GRU) and their variants that operate sequentially are often used to learn time series sequences of the signal, analyze long-term dependencies and the contexts of the utterances in the speech signal. However, due to their sequential operation, they encounter problems in convergence and sluggish training that uses a lot of memory resources and suffers from the vanishing gradient problem. In addition, they do not consider spatial cues that may exist in the speech signal. Therefore, we propose an attention-based multi-learning model (ABMD) that uses residual dilated causal convolution (RDCC) blocks and dilated convolution (DC) layers with multi-head attention. The proposed ABMD model achieves comparable performance while capturing globally contextualized long-term dependencies between features in parallel, using a large receptive field with little increase in the number of parameters relative to the number of layers, and considers spatial cues among the speech features. Spectral and voice quality features extracted from the raw speech signals are used as inputs. The proposed ABMD model obtained a recognition accuracy and F1 score of 93.75% and 92.50% on the SAVEE dataset, 85.89% and 85.34% on the RAVDESS dataset and 95.93% and 95.83% on the EMODB dataset. The model’s robustness, in terms of the confusion ratio of the individual discrete emotions, especially happiness, which is often confused with emotions that belong to the same dimensional plane, also improved when validated on the same datasets.
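
    A minimal sketch of the two ingredients the ABMD model combines: a dilated causal 1-D convolution (left-padded so no output sees future frames) feeding multi-head self-attention. Dimensions are illustrative assumptions, not the published configuration.

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class DilatedCausalAttention(nn.Module):
          def __init__(self, channels=64, kernel=3, dilation=2, heads=4):
              super().__init__()
              self.pad = (kernel - 1) * dilation      # left padding keeps causality
              self.conv = nn.Conv1d(channels, channels, kernel, dilation=dilation)
              self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

          def forward(self, x):                       # x: (batch, time, channels)
              h = F.pad(x.transpose(1, 2), (self.pad, 0))   # pad on the left only
              h = torch.relu(self.conv(h)).transpose(1, 2)
              out, _ = self.attn(h, h, h)             # global context in parallel
              return out

      y = DilatedCausalAttention()(torch.randn(8, 100, 64))   # -> (8, 100, 64)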

  • Point Cloud Map Generation and Localization for Autonomous Vehicles Using 3D Lidar Scans
    Alwin Poulose, Minjin Baek, and Dong Seog Han

    IEEE
    Autonomous vehicles are expected to reduce the need for human drivers, improve traffic efficiency, avoid collisions, and become the ideal city vehicles of the future. To achieve this goal, vehicle manufacturers have started to work in this field to harness the potential and solve the current challenges. In this sense, the first challenge is transforming conventional vehicles into autonomous ones that meet users’ expectations. The evolution of conventional vehicles into autonomous vehicles includes the adoption and improvement of different technologies and computer algorithms. The essential task affecting an autonomous vehicle’s performance, apart from perception, path planning, and control, is localization, and the accuracy and efficiency of localization play a crucial role in autonomous driving. In this paper, we describe the implementation of map-based localization using point cloud matching for autonomous vehicles. The Robot Operating System (ROS), along with Autoware, an open-source software platform for autonomous vehicles, is utilized for the implementation of the vehicle localization system presented in this paper. Point cloud maps are generated from 3D lidar points, and a normal distributions transform (NDT) matching algorithm is used to localize the test vehicle by matching real-time lidar measurements against the pre-built point cloud maps. The experiment results show that the map-based localization system using 3D lidar scans enables real-time localization performance that is sufficiently accurate and efficient for autonomous driving in a campus environment. The paper covers the methods used for point cloud map generation and vehicle localization, as well as the step-by-step procedure for the implementation with a ROS-based system for autonomous driving.
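
    For intuition about the matching step, below is a stripped-down numpy illustration of the NDT idea: represent the map as per-voxel Gaussians and score a candidate pose by the likelihood of the transformed scan points. The real pipeline (Autoware's NDT matching) also runs Newton-style pose optimization, which is omitted here.

      import numpy as np

      def build_ndt_map(map_pts, voxel=1.0):
          # Group map points into voxels and fit a Gaussian (mean, covariance)
          # to each sufficiently populated voxel.
          cells = {}
          for key, p in zip(map(tuple, np.floor(map_pts / voxel).astype(int)), map_pts):
              cells.setdefault(key, []).append(p)
          return {k: (np.mean(v, axis=0), np.cov(np.array(v).T) + 1e-6 * np.eye(3))
                  for k, v in cells.items() if len(v) >= 5}

      def ndt_score(scan_pts, R, t, ndt, voxel=1.0):
          # Higher score = transformed scan agrees better with the map Gaussians.
          score = 0.0
          for p in scan_pts @ R.T + t:
              cell = ndt.get(tuple(np.floor(p / voxel).astype(int)))
              if cell is not None:
                  mu, cov = cell
                  d = p - mu
                  score += np.exp(-0.5 * d @ np.linalg.solve(cov, d))
          return score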

  • HIT HAR: Human Image Threshing Machine for Human Activity Recognition Using Deep Learning Models
    Alwin Poulose, Jung Hwan Kim, and Dong Seog Han

    Hindawi Limited
    In recent years, research in human activity recognition (HAR) has played a significant role in healthcare systems. The accurate activity classification results from HAR enhance the performance of the healthcare system with broad applications. HAR results are useful in monitoring a person’s health, and the system predicts abnormal activities based on user movements. The HAR system’s abnormal activity predictions provide better healthcare monitoring and reduce users’ health issues. Conventional HAR systems use wearable sensors, such as inertial measurement units (IMUs) and stretch sensors, for activity recognition. These approaches show remarkable performance for the user’s basic activities such as sitting, standing, and walking. However, when the user performs complex activities, such as running, jumping, and lying, sensor-based HAR systems show a higher degree of misclassification due to reading errors from the sensors. These sensor errors reduce the overall performance of the HAR system. Similarly, radio-frequency or vision-based HAR systems are not free from classification errors when used in real time. In this paper, we address some of the existing challenges of HAR systems by proposing a human image threshing (HIT) machine-based HAR system that uses an image dataset from a smartphone camera for activity recognition. The HIT machine effectively uses a mask region-based convolutional neural network (R-CNN) for human body detection, a facial image threshing machine (FIT) for image cropping and resizing, and a deep learning model for activity classification. We demonstrated the effectiveness of our proposed HIT machine-based HAR system through extensive experiments and results. The proposed HIT machine achieved 98.53% accuracy when the ResNet architecture was used as its deep learning model.
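
    A sketch of the HIT front end under stated assumptions (an off-the-shelf torchvision Mask R-CNN for person detection and a fixed 224x224 crop; the paper's FIT resizing details are not reproduced here):

      import torch
      from PIL import Image
      from torchvision.models.detection import maskrcnn_resnet50_fpn
      from torchvision.transforms.functional import to_tensor, resized_crop

      detector = maskrcnn_resnet50_fpn(pretrained=True).eval()
      img = to_tensor(Image.open("frame.jpg").convert("RGB"))
      with torch.no_grad():
          det = detector([img])[0]
      person = (det["labels"] == 1) & (det["scores"] > 0.8)  # COCO label 1 = person
      if person.any():
          x1, y1, x2, y2 = det["boxes"][person][0].round().int().tolist()
          # Crop the detected human and resize it for the activity classifier.
          crop = resized_crop(img, top=y1, left=x1, height=y2 - y1,
                              width=x2 - x1, size=[224, 224])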

  • Pig Treatment Classification on Thermal Image Data using Deep Learning
    Savina Jassica Colaco, Jung Hwan Kim, Alwin Poulose, Sanne van Zutphen, Suresh Neethirajan, and Dong Seog Han

    IEEE
    Recently, image classification has gained recognition in several applications like self-driving cars, security surveillance systems, face detection, etc. Conventional methods have been overtaken by deep learning methods, which can detect and classify objects in complex scenarios. In this paper, we propose a simple CNN model for pig treatment classification on thermal images. The proposed model is compared with different deep learning models which are widely used for image classification. The models are evaluated on our own thermal dataset collected using a FLIR camera. The experimental results show that the thermal images of different pig treatments are better classified with the proposed model, which achieves 99.96% accuracy with relatively few parameters.

  • Foreground Extraction Based Facial Emotion Recognition Using Deep Learning Xception Model
    Alwin Poulose, Chinthala Sreya Reddy, Jung Hwan Kim, and Dong Seog Han

    IEEE
    The facial emotion recognition (FER) system has a very significant role in the autonomous driving system (ADS). In the ADS, the FER system identifies the driver's emotions and provides the driver's current mental status for safe driving. The driver's mental status determines the safety of the vehicle and prevents the chances of road accidents. In FER, the system identifies the driver's emotions such as happy, sad, angry, surprise, disgust, fear, and neutral. To identify these emotions, the FER system needs to be trained with large FER datasets, and the system's performance depends entirely on the type of FER dataset used in model training. Recent FER systems use publicly available datasets such as FER 2013, extended Cohn-Kanade (CK+), AffectNet, JAFFE, etc. for model training. However, models trained with these datasets have some major flaws when the system tries to extract the FER features from the data. To address the feature extraction problem in FER systems, we propose in this paper a foreground extraction technique to identify user emotions. The proposed foreground extraction-based FER approach accurately extracts the FER features, and the deep learning model used in the system effectively utilizes these features for model training. Model training with our FER approach shows more accurate classification results than the conventional FER approach. To validate our proposed FER approach, we collected user emotions from 9 people and used the Xception architecture as the deep learning model. From the FER experiments and result analysis, the proposed foreground extraction-based approach reduces the classification error that exists in the conventional FER approach, improving model accuracy by 3.33%.

  • Feature-Based Deep LSTM Network for Indoor Localization Using UWB Measurements
    Alwin Poulose and Dong Seog Han

    IEEE
    Indoor localization using ultra-wideband (UWB) measurements is an effective localization approach when the localization system operates in non-line-of-sight (NLOS) conditions within the indoor experiment area. In UWB-based indoor localization, the system estimates the user’s distance information using anchor-tag communication. The user’s distance information in the UWB system is a deciding factor for localization performance. A deep learning-based localization system uses the raw distance information for model training and testing, and the model predicts the user’s current positions. Recently developed deep learning-based UWB localization approaches achieve the best localization results compared to conventional approaches. However, when deep learning models use raw distance information, the system lacks sufficient features for training, and this is reflected in the model’s performance. To solve this problem, we propose a feature-based localization approach for UWB localization. The proposed approach uses a deep long short-term memory (DLSTM) network for training and testing. Using features extracted from the user’s distance information gives better model performance than raw distance data, and the DLSTM network is capable of encoding temporal dependencies and learning high-level representations from the extracted feature data. The simulation results show that the proposed feature-based DLSTM localization system achieved a 5 cm mean localization error, compared to conventional UWB localization approaches.
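
    A small numpy sketch of the kind of feature extraction the abstract describes, turning a stream of raw anchor-tag distances into per-window statistics for the LSTM; the exact feature set used in the paper is not specified here, so these seven statistics are assumptions.

      import numpy as np

      def window_features(distances, win=10):
          # Per-window statistics replace raw ranges as the LSTM input.
          feats = [[w.mean(), w.std(), w.min(), w.max(),
                    np.ptp(w), np.median(w), w[-1] - w[0]]
                   for w in (distances[s:s + win]
                             for s in range(len(distances) - win + 1))]
          return np.asarray(feats)        # shape: (timesteps, 7)

      rng = np.random.default_rng(1)
      ranges = 3.0 + 0.05 * rng.standard_normal(200)   # synthetic ranges, metres
      print(window_features(ranges).shape)             # (191, 7)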

  • The extensive usage of the facial image threshing machine for facial emotion recognition performance
    Jung Hwan Kim, Alwin Poulose, and Dong Seog Han

    MDPI AG
    Facial emotion recognition (FER) systems play a significant role in identifying driver emotions. Accurate facial emotion recognition of drivers in autonomous vehicles reduces road rage. However, training even an advanced FER model without proper datasets causes poor performance in real-time testing. FER system performance is affected more heavily by the quality of the datasets than by the quality of the algorithms. To improve FER system performance for autonomous vehicles, we propose a facial image threshing (FIT) machine that uses advanced features of pre-trained facial recognition and training from the Xception algorithm. The FIT machine involves removing irrelevant facial images, collecting facial images, correcting misplaced face data, and merging original datasets on a massive scale, in addition to the data-augmentation technique. The final FER results of the proposed method improved the validation accuracy by 16.95% over the conventional approach with the FER 2013 dataset. The confusion matrix evaluation based on the unseen private dataset shows a 5% improvement over the original approach with the FER 2013 dataset, confirming the real-time testing results.

  • Feature Vector Extraction Technique for Facial Emotion Recognition Using Facial Landmarks
    Alwin Poulose, Jung Hwan Kim, and Dong Seog Han

    IEEE
    The facial emotion recognition (FER) system classifies the driver's emotions, and these results are crucial in the autonomous driving system (ADS). The ADS effectively utilizes the features from FER and increases its safety by preventing road accidents. In FER, the system classifies the driver's emotions into different categories such as happy, sad, angry, surprise, disgust, fear, and neutral. These emotions determine the driver's mental condition, and the driver's current mental status can give us valuable information to predict the occurrence of road accidents. Conventional FER systems use raw facial image pixel values as their input, and these pixel values provide a limited number of features for training the model. The limited number of features from facial images degrades the performance of the system and gives a higher degree of classification error. To address this problem in conventional FER systems, we propose a feature vector extraction technique that combines the facial image pixel values with facial landmarks, and the deep learning model uses these combined features as its input. Our experiments and results show that the proposed feature vector extraction-based FER approach reduces the classification error for emotion recognition and enhances the performance of the system. The proposed FER approach achieved a classification accuracy of 99.96% and a 0.095 model loss with the ResNet architecture.

  • ISPLInception: An Inception-ResNet Deep Learning Architecture for Human Activity Recognition
    Mutegeki Ronald, Alwin Poulose, and Dong Seog Han

    Institute of Electrical and Electronics Engineers (IEEE)
    Advances in deep learning (DL) model design have pushed the boundaries of the areas in which it can be applied. Fields with an immense availability of complex big data have been big beneficiaries of these advances. One such field is human activity recognition (HAR). HAR is a popular area of research in a connected world because internet-of-things (IoT) devices and smartphones are becoming more prevalent. A major goal of recent research work has been to improve predictive accuracy for devices with limited computational resources. In this paper, we propose iSPLInception, a DL model motivated by Google's Inception-ResNet architecture, that not only achieves high predictive accuracy but also uses fewer device resources. We evaluate the proposed model's performance on four public HAR datasets from the University of California, Irvine (UCI) machine learning repository. The proposed model's performance is compared to that of existing DL architectures that have been proposed in the recent past to solve the HAR problem. The proposed model outperforms these approaches on several metrics of accuracy, cross-entropy loss, and F1 score on all four datasets. The performance of the proposed iSPLInception model is validated on the UCI HAR using smartphones dataset, the Opportunity activity recognition dataset, the Daphnet freezing of gait dataset, and the PAMAP2 physical activity monitoring dataset. The experiments and result analysis indicate that the proposed iSPLInception model achieves remarkable performance for HAR applications.

  • Hybrid deep learning model based indoor positioning using wi-fi RSSI heat maps for autonomous applications
    Alwin Poulose and Dong Seog Han

    MDPI AG
    Positioning using Wi-Fi received signal strength indication (RSSI) signals is an effective method for identifying user positions in an indoor scenario. Wi-Fi RSSI signals in an autonomous system can easily be used for vehicle tracking in underground parking. In Wi-Fi RSSI signal based positioning, the positioning system estimates the signal strength from the access points (APs) to the receiver and identifies the user's indoor position. Existing Wi-Fi RSSI based positioning systems use raw RSSI signals obtained from APs to estimate the user positions. These raw RSSI signals fluctuate easily and are subject to interference from the indoor channel conditions. This signal interference reduces the localization performance of the existing Wi-Fi RSSI signal based positioning systems. To enhance their performance and reduce the positioning error, we propose a hybrid deep learning model (HDLM) based indoor positioning system. The proposed HDLM based positioning system uses RSSI heat maps instead of raw RSSI signals from APs, which results in better localization performance for Wi-Fi RSSI signal based positioning systems. When compared to existing Wi-Fi RSSI based positioning technologies such as fingerprinting, trilateration, and Wi-Fi fusion approaches, the proposed approach achieves reasonably better positioning results for indoor localization. The experiment results show that the combination of a convolutional neural network and a long short-term memory network (CNN-LSTM) used in the proposed HDLM outperforms other deep learning models and gives a smaller localization error than conventional Wi-Fi RSSI signal based localization approaches. From the experiment result analysis, the proposed system can be easily implemented for autonomous applications.
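
    A minimal PyTorch sketch of a CNN-LSTM of the kind the HDLM uses: a small CNN encodes each RSSI heat map, an LSTM tracks the sequence, and a linear head regresses the 2-D position. All sizes are illustrative assumptions.

      import torch
      import torch.nn as nn

      class CnnLstmLocalizer(nn.Module):
          def __init__(self, hidden=64):
              super().__init__()
              self.cnn = nn.Sequential(
                  nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                  nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(4), nn.Flatten())      # -> 32*4*4 = 512
              self.lstm = nn.LSTM(512, hidden, batch_first=True)
              self.head = nn.Linear(hidden, 2)                # (x, y) position

          def forward(self, maps):                            # (batch, time, 1, H, W)
              b, t = maps.shape[:2]
              feats = self.cnn(maps.flatten(0, 1)).view(b, t, -1)
              out, _ = self.lstm(feats)
              return self.head(out[:, -1])                    # position at last step

      print(CnnLstmLocalizer()(torch.randn(8, 5, 1, 32, 32)).shape)   # (8, 2)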

  • An Accurate Indoor User Position Estimator for Multiple Anchor UWB Localization
    Alwin Poulose, Ziga Emersic, Odongo Steven Eyobu, and Dong Seog Han

    IEEE
    UWB-based positioning systems have been proven to provide a significantly high level of accuracy, hence offering huge potential for a variety of indoor applications. However, the major challenges related to UWB localization are multipath effects, excess delay, clock drift, signal interference, and the system computational time to estimate the user position. To compensate for these challenges, the UWB system uses multiple anchors in the experiment area, and this gives accurate position results with minimum localization errors. However, using multiple anchors in the UWB system means processing large amounts of data in the system controller for localization, which leads to a high computational time to estimate the current user position. To reduce the complexity of UWB systems, we propose a position estimator for multiple anchor indoor localization that uses the extended Kalman filter (EKF). The proposed UWB-EKF estimator was mathematically analysed, and the simulation results were compared with classical localization algorithms in terms of mean localization errors. In the simulation, three classical localization algorithms were used for performance comparison: linearized least square estimation (LLSE), weighted centroid estimation (WCE) and maximum likelihood estimation (MLE). The extensive simulations done in this study demonstrate the effectiveness of the proposed UWB-EKF estimator for multiple anchor UWB indoor localization.
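
    A compact numpy sketch of the estimator's core, an EKF whose measurement model is the range to each anchor; the anchor layout and noise levels below are illustrative assumptions.

      import numpy as np

      anchors = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], float)
      Q = np.eye(2) * 0.01                 # process noise (static-user model)
      R = np.eye(len(anchors)) * 0.05**2   # 5 cm range-noise standard deviation

      def ekf_step(x, P, z):
          P = P + Q                                    # predict: constant position
          d = np.linalg.norm(x - anchors, axis=1)      # predicted anchor ranges
          H = (x - anchors) / d[:, None]               # Jacobian of the range model
          S = H @ P @ H.T + R
          K = P @ H.T @ np.linalg.inv(S)               # Kalman gain
          return x + K @ (z - d), (np.eye(2) - K @ H) @ P

      true = np.array([5.0, 6.0])
      z = np.linalg.norm(true - anchors, axis=1)       # noiseless ranges for demo
      x, P = np.array([1.0, 1.0]), np.eye(2)
      for _ in range(15):
          x, P = ekf_step(x, P, z)
      print(x)                                         # converges near (5, 6)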

  • UWB indoor localization using deep learning LSTM networks
    Alwin Poulose and Dong Seog Han

    MDPI AG
    Localization using ultra-wideband (UWB) signals gives accurate position results for indoor localization. The penetrating characteristics of UWB pulses reduce multipath effects and identify the user position with precise accuracy. In UWB-based localization, the localization accuracy depends on the distance estimation between the anchor nodes (ANs) and the UWB tag, based on the time of arrival (TOA) of UWB pulses. TOA errors in the UWB system reduce the distance estimation accuracy from the ANs to the UWB tag and add localization error to the system. The position accuracy of a UWB system also depends on the line of sight (LOS) conditions between the UWB anchors and tag, and on the computational complexity of the localization algorithms used in the UWB system. To overcome these UWB system challenges for indoor localization, we propose a deep learning approach for UWB localization. The proposed deep learning model uses a long short-term memory (LSTM) network for predicting the user position. The proposed LSTM model receives the distance values from the TOA-distance model of the UWB system and predicts the current user position. The performance of the proposed LSTM model-based UWB localization system is analyzed in terms of learning rate, optimizer, loss function, batch size, number of hidden nodes, and timesteps, and we also compared the mean localization accuracy of the system with different deep learning models and conventional UWB localization approaches. The simulation results show that the proposed UWB localization approach achieved a 7 cm mean localization error, compared to conventional UWB localization approaches.

  • Performance Analysis of Fingerprint Matching Algorithms for Indoor Localization
    Alwin Poulose and Dong Seog Han

    IEEE
    Localization using Wi-Fi received signal strength indication (RSSI) signals gives accurate user position results for indoor localization when the RSSI signals from Wi-Fi access points (APs) cover the entire localization area. The most popular algorithm used in Wi-Fi RSSI signal based localization systems is Wi-Fi fingerprinting, which uses different fingerprint matching algorithms for user position estimation. In this paper, we present a comparative analysis of different fingerprint matching algorithms for Wi-Fi RSSI signal based localization systems. In the analysis, we used the nearest neighbour (NN), k-nearest neighbours (kNN), weighted k-nearest neighbour (wkNN) and Bayesian fingerprint matching algorithms for user position estimation. The performance of these fingerprint matching algorithms is discussed in terms of average localization error and the probability distribution of localization error. The experiment results show that the wkNN fingerprint matching algorithm gives high position accuracy compared to the other fingerprint matching algorithms. The results from the NN fingerprint matching algorithm have high localization error, making it unsuitable for Wi-Fi RSSI signal based localization systems.
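
    For concreteness, a minimal numpy version of the best-performing matcher (wkNN): the position estimate is the inverse-distance-weighted mean of the k closest fingerprints. The tiny database here is a made-up illustration.

      import numpy as np

      def wknn(db_rssi, db_pos, query, k=3):
          # Signal-space distances to every fingerprint, then an
          # inverse-distance-weighted average of the k nearest positions.
          d = np.linalg.norm(db_rssi - query, axis=1)
          idx = np.argsort(d)[:k]
          w = 1.0 / (d[idx] + 1e-9)
          return (w[:, None] * db_pos[idx]).sum(axis=0) / w.sum()

      db_rssi = np.array([[-40, -60, -70], [-45, -55, -72],
                          [-60, -48, -66], [-65, -42, -60]], float)   # 3 APs
      db_pos = np.array([[0, 0], [0, 2], [2, 0], [2, 2]], float)      # metres
      print(wknn(db_rssi, db_pos, np.array([-44.0, -56.0, -71.0])))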

  • Performance Analysis of Sensor Fusion Techniques for Heading Estimation Using Smartphone Sensors
    Alwin Poulose, Benaoumeur Senouci, and Dong Seog Han

    Institute of Electrical and Electronics Engineers (IEEE)
    Efficient indoor positioning requires accurate heading and step length estimation algorithms. Therefore, in order to improve indoor position accuracy, it is necessary to estimate both the user heading and step length with minimal error. The error sources include the accelerometer, magnetometer and gyroscope of the smartphone sensors. Fusing different sensor data has a high impact on improving heading accuracy. In this paper, we present a comparative analysis of different sensor fusion techniques for heading estimation using smartphone sensors. The performance of the different sensor fusion techniques is discussed in terms of the root mean square error and cumulative distribution functions of heading errors. Five sensor fusion techniques were analyzed: the linear Kalman filter (LKF), extended Kalman filter (EKF), unscented Kalman filter (UKF), particle filter (PF) and complementary filter (CF). The UKF fusion algorithm shows better results than the EKF and LKF fusion algorithms, and the EKF approach is better than the LKF and CF approaches. The experimental results show that the PF fusion technique has poor performance for heading estimation.
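
    Of the five filters compared, the complementary filter is the simplest to show; a sketch under the usual assumptions (gyro yaw rate is smooth but drifts, magnetometer heading is noisy but drift-free, angle wrapping ignored; alpha is a tuning assumption):

      import numpy as np

      def cf_heading(gyro_z, mag_heading, dt=0.01, alpha=0.98):
          # Blend integrated gyro rate (high-pass path) with magnetometer
          # heading (low-pass path).
          h = mag_heading[0]
          out = []
          for w, m in zip(gyro_z, mag_heading):
              h = alpha * (h + w * dt) + (1 - alpha) * m
              out.append(h)
          return np.array(out)

      t = np.arange(0, 10, 0.01)
      true = 0.5 * t                                   # rad, steady turn
      gyro = np.full(t.size, 0.5 + 0.02)               # rate with constant bias
      mag = true + np.random.default_rng(0).normal(0, 0.05, t.size)  # noisy, unbiased
      est = cf_heading(gyro, mag)
      print(np.abs(est - true)[-1])                    # small residual error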

  • Hybrid indoor localization using IMU sensors and smartphone camera
    Alwin Poulose and Dong Seog Han

    MDPI AG
    Smartphone camera or inertial measurement unit (IMU) sensor-based systems can be independently used to provide accurate indoor positioning results. However, the accuracy of an IMU-based localization system depends on the magnitude of the sensor errors that are caused by external electromagnetic noise or sensor drift. Smartphone camera based positioning systems depend on the experimental floor map and the camera poses. The challenge in smartphone camera-based localization is that the accuracy depends on how rapidly the user's direction changes. In order to minimize the positioning errors in both the smartphone camera and IMU-based localization systems, we propose hybrid systems that combine the camera-based and IMU sensor-based approaches for indoor localization. In this paper, an indoor experiment scenario is designed to analyse the performance of the IMU-based localization system, the smartphone camera-based localization system and the proposed hybrid indoor localization system. The experiment results demonstrate the effectiveness of the proposed hybrid systems, which exhibit significant position accuracy when compared to the IMU and smartphone camera-based localization systems. The performance is analysed in terms of mean, maximum and minimum localization error, standard deviation of error, and the probability distributions of localization errors. The experiment results show that the proposed oriented FAST and rotated BRIEF (ORB) based simultaneous localization and mapping (ORB-SLAM) with IMU sensor hybrid system has a mean localization error of 0.1398 m, and the proposed hybrid system based on simultaneous localization and mapping by fusion of keypoints and squared planar markers (UcoSLAM) with the IMU sensor has a 0.0690 m mean localization error.

RECENT SCHOLAR PUBLICATIONS

  • CVGG-19: Customized Visual Geometry Group Deep Learning Architecture for Facial Emotion Recognition
    JH Kim, A Poulose, DS Han
    IEEE Access 12, 41557-41578 2024

  • Agricultural Object Detection with You Look Only Once (YOLO) Algorithm: A Bibliometric and Systematic Literature Review
    CM Badgujar, A Poulose, H Gan
    arXiv preprint arXiv:2401.10379 2024

  • Quantification of golgi dispersal and classification using machine learning models
    R Sansaria, KD Das, A Poulose
    Micron 176, 103547 2024

  • Dispersive Modeling of Normal and Cancerous Cervical Cell Responses to Nanosecond Electric Fields in Reversible Electroporation Using a Drift-Step Rectifier Diode Generator
    M Kumar, S Kumar, S Chakrabartty, A Poulose, H Mostafa, B Goyal
    Micromachines 14 (12), 2136 2023

  • Deep Learning Approaches for Bimodal Speech Emotion Recognition: Advancements, Challenges, and a Multi-Learning Model
    S Kakuba, A Poulose, DS Han
    IEEE Access 2023

  • Versatility Investigation of Grown Titanium Dioxide Nanoparticles and Their Comparative Charge Storage for Memristor Devices
    S Chakrabartty, AHM Almawgani, S Kumar, M Kumar, S Acharjee, ...
    Micromachines 14 (8), 1616 2023

  • Enhancing animal welfare with interaction recognition: A deep dive into pig interaction using Xception architecture and SSPD-PIR method
    JH Kim, A Poulose, SJ Colaco, S Neethirajan, DS Han
    Agriculture 13 (8), 1522 2023

  • Advancing Pig Welfare Assessment: Introducing the SSPD-PER Method for Objective and Reliable Pig Emotion Recognition
    JH Kim, SJ Colaco, A Poulose, S Neethirajan, DS Han
    Preprints 2023

  • DISubNet: depthwise separable inception subnetwork for pig treatment classification using thermal data
    SJ Colaco, JH Kim, A Poulose, S Neethirajan, DS Han
    Animals 13 (7), 1184 2023

  • Deep learning-based speech emotion recognition using multi-level fusion of concurrent features
    S Kakuba, A Poulose, DS Han
    IEEE Access 10, 125538-125551 2022

  • Attention-based multi-learning approach for speech emotion recognition with dilated convolution
    S Kakuba, A Poulose, DS Han
    IEEE Access 10, 122302-122313 2022

  • Simulation of an Indoor Visible Light Communication System Using Optisystem
    A Poulose
    Signals 3 (4), 765-793 2022

  • Point cloud map generation and localization for autonomous vehicles using 3D lidar scans
    A Poulose, M Baek, DS Han
    2022 27th Asia Pacific Conference on Communications (APCC), 336-341 2022

  • Verilog design, synthesis, and netlisting of IoT-based arithmetic logic and compression unit for 32 nm HVT cells
    RM Jujjavarapu, A Poulose
    Signals 3 (3), 620-641 2022

  • Location estimation apparatus and method using heat map of received signal strength, and recording medium on which a program for performing the same
    A Poulose, DS Han
    KR Patent KR20220112334A 2022

  • Pig Treatment Classification on Thermal Image Data using Deep Learning
    SJ Colaco, JH Kim, A Poulose, S van Zutphen, S Neethirajan, DS Han
    2022 Thirteenth International Conference on Ubiquitous and Future Networks 2022

  • Medication recommender system for healthcare solutions
    A Poulose, AP Valappil, J Sebastian
    Journal of Information and Optimization Sciences 43 (5), 1073-1080 2022

  • Music recommender system via deep learning
    A Poulose, CS Reddy, S Dash, BJR Sahu
    Journal of Information and Optimization Sciences 43 (5), 1081-1088 2022

  • Facial Landmark Extractor for Facial Emotion Recognition
    JH Kim, A Poulose, DS Han
    Proceedings of the Korea Institute of Communications and Information Sciences (KICS) Conference, 1456-1457 2022

MOST CITED SCHOLAR PUBLICATIONS

  • An indoor position-estimation algorithm using smartphone IMU sensor data
    A Poulose, OS Eyobu, DS Han
    IEEE Access 7, 11165-11177 2019
    Citations: 151

  • UWB indoor localization using deep learning LSTM networks
    A Poulose, DS Han
    Applied Sciences 10 (18), 6290 2020
    Citations: 125

  • iSPLInception: an Inception-ResNet deep learning architecture for human activity recognition
    M Ronald, A Poulose, DS Han
    IEEE Access 9, 68985-69001 2021
    Citations: 112

  • Hybrid indoor localization using IMU sensors and smartphone camera
    A Poulose, DS Han
    Sensors 19 (23), 5084 2019
    Citations: 89

  • A sensor fusion framework for indoor localization using smartphone sensors and Wi-Fi RSSI measurements
    A Poulose, J Kim, DS Han
    Applied Sciences 9 (20), 4379 2019
    Citations: 85

  • The extensive usage of the facial image threshing machine for facial emotion recognition performance
    JH Kim, A Poulose, DS Han
    Sensors 21 (6), 2026 2021
    Citations: 57

  • Hybrid deep learning model based indoor positioning using Wi-Fi RSSI heat maps for autonomous applications
    A Poulose, DS Han
    Electronics 10 (1), 2 2020
    Citations: 55

  • Localization error analysis of indoor positioning system based on UWB measurements
    A Poulose, OS Eyobu, M Kim, DS Han
    2019 Eleventh International Conference on Ubiquitous and Future Networks 2019
    Citations: 49

  • Performance analysis of sensor fusion techniques for heading estimation using smartphone sensors
    A Poulose, B Senouci, DS Han
    IEEE Sensors Journal 19 (24), 12369-12380 2019
    Citations: 47

  • An accurate indoor user position estimator for multiple anchor UWB localization
    A Poulose, Ž Emeršič, OS Eyobu, DS Han
    2020 International Conference on Information and Communication Technology Convergence (ICTC) 2020
    Citations: 44

  • Performance analysis of fingerprint matching algorithms for indoor localization
    A Poulose, DS Han
    2020 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) 2020
    Citations: 35

  • Indoor localization using PDR with Wi-Fi weighted path loss algorithm
    A Poulose, DS Han
    2019 International Conference on Information and Communication Technology Convergence (ICTC) 2019
    Citations: 32

  • A combined PDR and Wi-Fi trilateration algorithm for indoor localization
    A Poulose, OS Eyobu, DS Han
    2019 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) 2019
    Citations: 30

  • Foreground extraction based facial emotion recognition using deep learning Xception model
    A Poulose, CS Reddy, JH Kim, DS Han
    2021 Twelfth International Conference on Ubiquitous and Future Networks 2021
    Citations: 23

  • Feature-based deep LSTM network for indoor localization using UWB measurements
    A Poulose, DS Han
    2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC) 2021
    Citations: 22

  • Deep learning-based speech emotion recognition using multi-level fusion of concurrent features
    S Kakuba, A Poulose, DS Han
    IEEE Access 10, 125538-125551 2022
    Citations: 20

  • Attention-based multi-learning approach for speech emotion recognition with dilated convolution
    S Kakuba, A Poulose, DS Han
    IEEE Access 10, 122302-122313 2022
    Citations: 20

  • Indoor localization with smartphones: Magnetometer calibration
    A Poulose, J Kim, DS Han
    2019 IEEE International Conference on Consumer Electronics (ICCE), 1-3 2019
    Citations: 19

  • HIT HAR: Human Image Threshing Machine for Human Activity Recognition Using Deep Learning Models
    A Poulose, JH Kim, DS Han
    Computational Intelligence and Neuroscience 2022 (Article ID 1808990), 21 2022
    Citations: 18

  • Feature vector extraction technique for facial emotion recognition using facial landmarks
    A Poulose, JH Kim, DS Han
    2021 International Conference on Information and Communication Technology Convergence (ICTC) 2021
    Citations: 14