@usm.my
Associate Professor Dr.
School of Electrical & Electronic Engineering
Digital Signal Processing, Biometrics
Scopus Publications
Norsalina Hassan and Dzati Athiar Ramli
Universiti Putra Malaysia
Bioacoustic signals have been used as a modality in environmental monitoring and biodiversity research. These signals also carry species- or individual-specific information, thus allowing species and individuals to be recognised from their vocals. Nevertheless, vocal communication in a crowded social environment is a challenging problem for automated bioacoustic recogniser systems because concurrent signals from multiple individuals interfere with one another. To address this issue, the bioacoustic sources are separated from the mixtures of multiple individual signals using a technique known as blind source separation (BSS). In this work, we explored the BSS of an underdetermined mixture based on a two-stage sparse component analysis (SCA) approach that consisted of (1) mixing matrix estimation and (2) source estimation. The key point of our procedure was to investigate the algorithm’s robustness to noise and the effect of increasing the number of sources. Using the two-stage SCA technique, the performances of the estimated mixing matrix and the estimated sources were evaluated and discussed at various signal-to-noise ratios (SNRs). The algorithm was also validated with different numbers of sources. Given its robustness, the SCA algorithm presented a stable and reliable performance in a noisy environment, with only small changes in error as the noise level increased.
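The underdetermined mixing setup studied in this abstract can be sketched numerically; a minimal NumPy example, assuming an illustrative 2-mixture, 4-source configuration with sparse sources and an additive-noise SNR control (all values are placeholders, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_src, n_mix, T = 4, 2, 1000

# Sparse sources: each sample is active only ~10% of the time
S = rng.standard_normal((n_src, T)) * (rng.random((n_src, T)) < 0.1)
A = rng.standard_normal((n_mix, n_src))   # unknown mixing matrix
X = A @ S                                 # underdetermined: 2 mixtures of 4 sources

# Add white noise scaled to a target SNR (in dB)
snr_db = 10
p_sig = np.mean(X**2)
noise = rng.standard_normal(X.shape)
noise *= np.sqrt(p_sig / (np.mean(noise**2) * 10**(snr_db / 10)))
X_noisy = X + noise
```

The BSS task is then to recover `A` and `S` from `X_noisy` alone, which is exactly the two-stage SCA problem the abstract describes.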
Norsalina Hassan and Dzati Athiar Ramli
MDPI AG
Blind source separation (BSS) recovers source signals from observations without knowing the mixing process or source signals. Underdetermined blind source separation (UBSS) occurs when there are fewer mixtures than source signals. Sparse component analysis (SCA) is a general UBSS solution that benefits from sparse source signals and consists of (1) mixing matrix estimation and (2) source recovery estimation. The first stage of SCA is crucial, as it will have an impact on the recovery of the sources. Single-source points (SSPs) were detected and clustered during the process of mixing matrix estimation. Adaptive time–frequency thresholding (ATFT) was introduced to increase the accuracy of the mixing matrix estimation. ATFT only used significant TF coefficients to detect the SSPs. After identifying the SSPs, hierarchical clustering approximates the mixing matrix. The second stage of SCA estimated the recovered sources using least squares methods. The mixing matrix and source recovery estimations were evaluated using the error rate and mean squared error (MSE) metrics. The experimental results on four bioacoustics signals using ATFT demonstrated that the proposed technique outperformed the baseline method, Zhen’s method, and three state-of-the-art methods over a wide range of signal-to-noise ratios (SNRs) while consuming less time.
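The two-stage pipeline described above (SSP detection, hierarchical clustering of mixture directions, then per-point least-squares recovery) can be sketched on an idealised toy problem; this is a simplified illustration with perfectly disjoint source supports, not the paper's ATFT method:

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata

rng = np.random.default_rng(0)
n_src, T = 3, 300

# Toy sparse sources with disjoint activity blocks (idealised SSP-rich case)
S = np.zeros((n_src, T))
for j in range(n_src):
    S[j, j*100:(j+1)*100] = rng.standard_normal(100)

# Unit-norm mixing columns at illustrative angles 20°, 80°, 140°
A = np.column_stack([[np.cos(a), np.sin(a)] for a in np.deg2rad([20, 80, 140])])
X = A @ S                                   # 2 mixtures of 3 sources

# Stage 1a: keep significant observations as candidate single-source points
e = np.linalg.norm(X, axis=0)
ssp = X[:, e > 0.1 * e.max()]
d = ssp / np.linalg.norm(ssp, axis=0)       # normalised mixture directions
d[:, d[0] < 0] *= -1                        # fold the sign ambiguity

# Stage 1b: hierarchical clustering of directions -> estimated mixing columns
labels = fclusterdata(d.T, t=n_src, criterion='maxclust')
A_hat = np.column_stack([d[:, labels == k].mean(axis=1) for k in range(1, n_src + 1)])
A_hat /= np.linalg.norm(A_hat, axis=0)

# Stage 2: assuming one active source per sample, the least-squares
# coefficient for a unit-norm column a is simply a^T x
S_hat = np.zeros((n_src, T))
for t in range(T):
    j = np.argmax(np.abs(A_hat.T @ X[:, t]))
    S_hat[j, t] = A_hat[:, j] @ X[:, t]
```

In the real algorithm the SSPs are detected in the time–frequency domain and thresholded adaptively; the disjoint-support construction here just makes every significant sample a single-source point.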
Qaedah Ali Musaeed Naji Mahdi, Maria Ali, Muhammad Nouman Atta, Abdullah Khan, Saima Anwar Lashari, and Dzati Athiar Ramli
Elsevier BV
Maria Ali, Muhammad Daniyal Liaquat, Muhammad Nouman Atta, Abdullah Khan, Saima Anwar Lashari, and Dzati Athiar Ramli
Elsevier BV
Yangyang Li and Dzati Athiar Ramli
Institute of Electrical and Electronics Engineers (IEEE)
Blind source separation (BSS) is a critical task in untangling non-stationary signals without prior information. This paper extensively explores diverse time-frequency analysis (TFA) methods within BSS systems over the past decade. It underscores the pivotal role of TFA in dealing with non-stationary signals by characterizing their attributes across time and frequency domains. This approach provides a comprehensive understanding of signal dynamics that surpasses conventional techniques focusing solely on temporal or spectral domains. The paper delves into various TFA methods, investigating their influencing factors and aiding researchers in selecting relevant techniques aligned with their objectives. Furthermore, it comprehensively reviews contemporary research, categorizing BSS algorithms into three classes. The role of commonly used TFA methods in each class is systematically evaluated, identifying their strengths and limitations during different separation stages. The paper addresses challenges in implementing BSS algorithms, particularly in under-determined systems with fewer mixing channels than source signals. It highlights the central role of TFA in overcoming these challenges and enhancing separation outcomes.
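A tiny example of why TFA suits non-stationary signals: the STFT of a linear chirp places its energy along a rising frequency ridge, something neither a purely temporal nor a purely spectral view reveals (parameters here are illustrative):

```python
import numpy as np
from scipy.signal import chirp, stft

fs = 1000
t = np.arange(0, 2, 1/fs)
x = chirp(t, f0=10, t1=2, f1=100, method='linear')   # frequency sweeps 10 -> 100 Hz

# Short-time Fourier transform: energy localised in both time and frequency
f, frames, Z = stft(x, fs=fs, nperseg=256)
ridge = f[np.abs(Z).argmax(axis=0)]   # dominant frequency in each time frame
```

For a non-stationary signal the ridge moves with time, which is exactly the time–frequency characterisation the survey discusses.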
Yangyang Li and Dzati Athiar Ramli
MDPI AG
The estimation accuracy of the mixing matrix is very important to the performance of an underdetermined blind source separation (UBSS) system. Improving this accuracy requires the mixed signal to be sparse. A novel fractional-domain time–frequency plane is obtained by rotating the time–frequency plane produced by the short-time Fourier transform; this plane represents the fine characteristics of the mixed signal in both the time and frequency domains. The rotation angle is determined by a global search for the minimum L1 norm, which makes the mixed signal sufficiently sparse. The resulting time–frequency points do not require single-source-point detection, reducing the computational load of the original algorithm, and the insensitivity to noise in the fractional domain improves the robustness of the algorithm in noisy environments. The simulation results show that both the sparsity of the mixed signal and the estimation accuracy of the mixing matrix are improved. Compared with existing mixing matrix estimation algorithms, the proposed method is effective.
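The angle-selection step described above can be written compactly. Assuming the standard fractional Fourier transform (FrFT) definition with rotation angle $\alpha$ (a sketch of the usual kernel, not necessarily the paper's exact notation), the angle is chosen by a global search for the sparsest representation:

$$
X_\alpha(u) = \int_{-\infty}^{\infty} x(t)\,K_\alpha(t,u)\,dt,
\qquad
K_\alpha(t,u) = \sqrt{\frac{1 - j\cot\alpha}{2\pi}}\,
\exp\!\Big(j\,\frac{t^2 + u^2}{2}\cot\alpha \;-\; j\,t u \csc\alpha\Big),
$$

$$
\alpha^{*} = \arg\min_{\alpha}\;\lVert X_\alpha \rVert_{1}.
$$

Minimising the $\ell_1$ norm over $\alpha$ picks the rotation at which the mixed signal is most concentrated, which is what makes the subsequent mixing matrix estimation accurate.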
Muhammad Imran, Hafeez Anwar, Muhammad Tufail, Abdullah Khan, Murad Khan, and Dzati Athiar Ramli
Computers, Materials and Continua (Tech Science Press)
Evelyn Siao Yung Ern and Dzati Athiar Ramli
Springer International Publishing
Kai Jye Chee and Dzati Athiar Ramli
MDPI AG
The existing electrocardiogram (ECG) biometrics do not perform well when the ECG changes after the enrollment phase, because the feature extraction is unable to relate the ECG collected during enrollment to the ECG collected during classification. In this research, we propose the sequence pair feature extractor, inspired by Bidirectional Encoder Representations from Transformers (BERT)’s sentence pair task, to obtain a dynamic representation of a pair of ECGs. We also propose using the self-attention mechanism of the transformer to draw an inter-identity relationship when performing ECG identification tasks. The model was trained once with datasets built from 10 ECG databases and then applied to six other ECG databases without retraining. We emphasize the significance of the time separation between enrollment and classification when presenting the results. The model scored 96.20%, 100.0%, 99.91%, 96.09%, 96.35%, and 98.10% identification accuracy on MIT-BIH Atrial Fibrillation Database (AFDB), Combined measurement of ECG, Breathing and Seismocardiograms (CEBSDB), MIT-BIH Normal Sinus Rhythm Database (NSRDB), MIT-BIH ST Change Database (STDB), ECG-ID Database (ECGIDDB), and PTB Diagnostic ECG Database (PTBDB), respectively, over a short time separation. The model scored 92.70% and 64.16% identification accuracy on ECGIDDB and PTBDB, respectively, over a long time separation, a significant improvement compared to state-of-the-art methods.
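The BERT-style pairing and self-attention mentioned above can be illustrated in a few lines; this is a minimal NumPy sketch of scaled dot-product self-attention over a sequence pair (the embeddings, separator token, and sizes are illustrative stand-ins, not the paper's trained model):

```python
import numpy as np

def self_attention(X):
    """Single-head scaled dot-product self-attention.
    For illustration only: Q = K = V = X (no learned projections)."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)      # softmax attention weights per row
    return w @ X

rng = np.random.default_rng(1)
enroll = rng.standard_normal((5, 8))        # embedded segments of the enrollment ECG
probe  = rng.standard_normal((5, 8))        # embedded segments of the probe ECG
sep    = np.zeros((1, 8))                   # separator token, as in BERT's sentence pair
pair   = np.vstack([enroll, sep, probe])
out = self_attention(pair)                  # every probe position can attend to enrollment
```

The point of the pairing is visible in the attention pattern: probe positions attend directly to enrollment positions, which is how the model relates the two recordings.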
Junaid Ur Rahman, Asfandyar Khan, Javed Iqbal Bangash, Abdullah Khan, Dzati Athiar Ramli, and Shafiullah Khan
Elsevier BV
Maria Ali, Muhammad Nasim Haider, Saima Anwar Lashari, Wareesa Sharif, Abdullah Khan, and Dzati Athiar Ramli
Elsevier BV
Mohd Nizam Mohd Najib and Dzati Athiar Ramli
Springer Singapore
Fatin Izzati Mohamad Abdul Hadi, Dzati Athiar Ramli, and Ahmad Saiful Azhar
Springer Singapore
Najah Ghazali and Dzati Athiar Ramli
Elsevier BV
Barkat Ali, Saima Anwar Lashari, Wareesa Sharif, Abdullah Khan, Kamran Ullah, and Dzati Athiar Ramli
Elsevier BV
Fatin Izzati MA Hadi, Dzati Athiar Ramli, and Norsalina Hassan
Elsevier BV
Teo Wil Son, Dzati Athiar Ramli, and Azniza Abd Aziz
Elsevier BV
Chen Shan Wei, Shir Li Wang, Ng Theam Foo, and Dzati Athiar Ramli
Elsevier BV
The Covid-19 pandemic has caused a paradigm shift in education, from face-to-face learning to e-learning. E-learning leads to an escalation in the digitalization of handwritten documents because it requires homework and assignments to be submitted online. To help teachers check digitalized handwritten homework, this paper proposes an automatic checking system based on a convolutional neural network (CNN) for handwritten numeral recognition. The CNN is used to recognize the four arithmetic operations in mathematical questions: addition, subtraction, multiplication and division. The performance of the CNN in handwritten numeral recognition has been optimized in terms of the activation function and the gradient descent algorithm. The proposed CNN is trained and tested with the MNIST handwritten data set. The experimental results show that the recognition accuracy of the improved CNN improves to a certain extent compared with that before optimization.
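The forward path of such a CNN can be sketched briefly; a toy NumPy/SciPy example of one convolution-ReLU-pooling-softmax pass on an MNIST-sized input (random weights, so an illustration of the architecture, not a trained recognizer):

```python
import numpy as np
from scipy.signal import correlate2d

def conv_relu(img, kernels):
    """One convolutional layer with ReLU activation (valid padding)."""
    return np.maximum(0, np.stack([correlate2d(img, k, mode='valid') for k in kernels]))

rng = np.random.default_rng(0)
digit = rng.random((28, 28))                 # stand-in for one MNIST image
kernels = rng.standard_normal((4, 3, 3))     # 4 untrained 3x3 filters
fmap = conv_relu(digit, kernels)             # (4, 26, 26) feature maps

# 2x2 max pooling, then a fully connected layer into 10 digit classes
pooled = fmap.reshape(4, 13, 2, 13, 2).max(axis=(2, 4))
logits = pooled.reshape(-1) @ rng.standard_normal((4 * 13 * 13, 10))
probs = np.exp(logits - logits.max())
probs /= probs.sum()                         # softmax over the 10 digits
```

Swapping the ReLU for other activations or changing the update rule in training is the kind of optimization the abstract refers to.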
Tahira Khalil, Javed Iqbal Bangash, Abdul Waheed Khan, Saima Anwar Lashari, Abdullah Khan, and Dzati Athiar Ramli
Elsevier BV
Anthony Ngozichukwuka Uwaechia and Dzati Athiar Ramli
Institute of Electrical and Electronics Engineers (IEEE)
Electrocardiogram (ECG) has extremely discriminative characteristics in the biometric field and has recently received significant interest as a promising biometric trait. However, ECG signals are susceptible to several types of noises, such as baseline wander, powerline interference, and high/low-frequency noises, making it challenging to realize biometric identification systems precisely and robustly. Therefore, ECG signal denoising is a major preprocessing step and plays a crucial role in ECG-based biometric human identification. ECG signal analysis for biometric recognition can combine several steps, such as preprocessing, feature extraction, feature selection, feature transformation, and classification, which is a very challenging task. Moreover, the employed success measures and the appropriate constitution of the ECG signal database also play significant roles in biometric system analysis, considering that publicly available databases are essential for the research community to evaluate the performance of proposed algorithms. In this survey, we review most of the techniques that employ the ECG as a biometric for human authentication. Firstly, we present an overview and discussion of ECG signal preprocessing, feature extraction, feature selection, and feature transformation for ECG-based biometric systems. Secondly, we present a survey of the available ECG databases to evaluate and compare the acquisition protocol, acquisition hardware, and acquisition resolution (bits) for ECG-based biometric systems. Thirdly, we present a survey of different techniques, including deep learning methods: deep supervised learning, deep semi-supervised learning, and deep unsupervised learning, for ECG signal classification. Lastly, we present the state-of-the-art approaches to information fusion in multimodal biometric systems.
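The two noise sources named above, baseline wander and powerline interference, are commonly removed with a high-pass filter and a notch filter respectively; a minimal SciPy sketch on a synthetic signal (the sinusoid standing in for ECG content, the cutoffs, and the 50 Hz powerline frequency are all illustrative choices, not the survey's prescription):

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

fs = 360.0                                   # common ECG sampling rate
t = np.arange(0, 10, 1 / fs)
clean = np.sin(2 * np.pi * 8 * t)            # stand-in for the ECG content
noisy = (clean
         + 0.6 * np.sin(2 * np.pi * 0.3 * t)   # baseline wander
         + 0.4 * np.sin(2 * np.pi * 50 * t))   # powerline interference

# Zero-phase high-pass at 0.5 Hz removes the slow baseline drift
b_hp, a_hp = butter(2, 0.5 / (fs / 2), btype='highpass')
x = filtfilt(b_hp, a_hp, noisy)

# Narrow notch at 50 Hz removes the powerline component
b_n, a_n = iirnotch(50.0, Q=30.0, fs=fs)
denoised = filtfilt(b_n, a_n, x)
```

`filtfilt` applies each filter forward and backward, so the denoised trace keeps the morphology (no phase distortion), which matters for the fiducial features used later in the pipeline.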
Imran Uddin, Dzati A. Ramli, Abdullah Khan, Javed Iqbal Bangash, Nosheen Fayyaz, Asfandyar Khan, and Mahwish Kundi
Hindawi Limited
In the area of machine learning, different techniques are used to train machines to perform tasks such as computer vision, data analysis, natural language processing, and speech recognition. Computer vision is one of the main branches where machine learning and deep learning techniques are applied. Optical character recognition (OCR) is the ability of a machine to recognize the characters of a language. Pashto is one of the most ancient and historical languages of the world, spoken in Afghanistan and Pakistan. OCR applications have been developed for various cursive languages such as Urdu, Chinese, and Japanese, but very little work has been done on the recognition of the Pashto language. When it comes to handwritten character recognition, it becomes more difficult for OCR to recognize the characters, as every handwritten character’s shape is influenced by the writer’s hand motion dynamics. The reason for the lack of research on Pashto handwritten character data compared to other languages is that no benchmark dataset is available for experimental purposes. This study focuses on the creation of such a dataset, and then, for evaluation purposes, a machine is trained to correctly recognize unseen Pashto handwritten characters. To achieve this objective, a dataset of 43,000 images was created. Three feed-forward neural network models with the backpropagation algorithm, using different Rectified Linear Unit (ReLU) layer configurations (Model 1 with 1 ReLU layer, Model 2 with 2 ReLU layers, and Model 3 with 3 ReLU layers), were trained and tested with this dataset. The simulations show that Model 1 achieved accuracy of up to 87.6% on unseen data, while Model 2 and Model 3 achieved accuracies of 81.60% and 3%, respectively. Similarly, the loss (cross-entropy) was lowest for Model 1, with 0.15 and 3.17 for training and testing, followed by Model 2 with 0.7 and 4.2 for training and testing, while Model 3 was last with loss values of 6.4 and 3.69.
The precision, recall, and F-measure values of Model 1 were better than those of both Model 2 and Model 3. Based on these results, Model 1 (with 1 ReLU activation layer) is found to be the most efficient of the three models in terms of accuracy in recognizing Pashto handwritten characters.
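A feed-forward network with a single ReLU layer trained by backpropagation, the shape of Model 1 above, can be sketched in plain NumPy (toy XOR data, a tiny hidden layer, and arbitrary hyperparameters stand in for the Pashto dataset and the paper's actual models):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR toy task

W1 = rng.standard_normal((2, 8)) * 0.5; b1 = np.zeros(8)   # 1 ReLU hidden layer
W2 = rng.standard_normal((8, 1)) * 0.5; b2 = np.zeros(1)

lr, losses = 0.3, []
for _ in range(3000):
    # Forward pass: ReLU hidden layer, sigmoid output
    h = np.maximum(0, X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    losses.append(float(np.mean((p - y) ** 2)))

    # Backpropagation of the MSE loss through sigmoid and ReLU
    dp = 2 * (p - y) / len(X) * p * (1 - p)
    dW2, db2 = h.T @ dp, dp.sum(0)
    dh = (dp @ W2.T) * (h > 0)
    dW1, db1 = X.T @ dh, dh.sum(0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

Adding more ReLU layers (Models 2 and 3) means repeating the hidden-layer forward and backward steps per layer; as the abstract reports, deeper is not automatically better on this task.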
D A Ramli, Z X Wan, and H Ibrahim
IOP Publishing
N S Ibrahim and D A Ramli
IOP Publishing
i-vector subspace modelling is one of the recent methods that has become attractive in the sound-based biometric recognition domain. This method provides the benefit of modelling both intra-domain and inter-domain variability in one low-dimensional space. This paper focuses on the analysis of i-vector channel compensation techniques for the purpose of improving i-vector sound-based biometric recognition performance. This work was mainly motivated by the need to quantify the impact of different compensation techniques on i-vector performance, specifically for the fusion compensation approach. The performances of six channel compensation techniques: (a) whitening, (b) Within Class Covariance Normalization (WCCN), (c) Linear Discriminant Analysis (LDA), (d) whitening and WCCN, (e) whitening and LDA and (f) WCCN and LDA have been investigated in this study. 2656 syllables of bio-acoustic sounds are used as experimental data, and the parameters of the system are initially tuned with different GMM component sizes, i.e. 16, 32, 64 and 128 Gaussians. To this end, we assess the effect of the tuned parameters and observe the recognition rate. Experimental results reveal that the accuracy of the i-vector with the fusion of WCCN and LDA compensation outperforms the other compensation approaches with a result of 92.00%. Consequently, these findings allow a better understanding of the compensation approaches, in particular, the fundamental concept of the compensation procedure that leads to the success of the i-vector paradigm.
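Among the compensation techniques listed, WCCN has a particularly compact form: project the i-vectors so that the average within-class covariance becomes the identity. A NumPy sketch on toy i-vectors (the dimensions, class counts, and data are illustrative, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_cls, per = 4, 5, 20
means = rng.standard_normal((n_cls, dim)) * 3
iv = np.vstack([m + rng.standard_normal((per, dim)) for m in means])  # toy i-vectors
labels = np.repeat(np.arange(n_cls), per)

# Within-class covariance, averaged over classes
W = np.mean([np.cov(iv[labels == c].T, bias=True) for c in range(n_cls)], axis=0)

# WCCN: B satisfies B B^T = W^{-1}; project with y = B^T x
B = np.linalg.cholesky(np.linalg.inv(W))
proj = iv @ B    # rows are i-vectors, so right-multiplying by B applies B^T x
```

After the projection the within-class scatter is whitened, so cosine or LDA scoring is no longer dominated by session/channel directions; chaining this with LDA gives the fused WCCN+LDA variant that performed best in the study.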
R X Tiong, D A Ramli, and N S Ibrahim
IOP Publishing
Dealing with signal and session variability is a common problem in biometric recognition systems, since biometric signals are frequently inconsistent over time. Health, aging, emotional conditions and different recording settings are some of the factors that contribute to the variability issue. This causes two samples of the same subject to differ from each other, giving a mismatch effect between the enrolment and testing conditions. Over the years, solving the variability problem through subspace representation has become prevalent. Hence, it motivates us to validate a recognition algorithm based on a factor analysis perspective, and we use the electrocardiogram (ECG) signal as our experimental data, as it is subject to change over time and sensitive to different sensors. We first model each supervector extracted from a Gaussian Mixture Model (GMM) into two different factors, namely subject and session independent supervectors, based on the Joint Factor Analysis (JFA) algorithm. For the second model, which is based on the i-vector approach, the supervectors extracted from the GMM are first modelled as a single total factor, and a compensation method is then employed to compensate for the variability effect. Three compensation methods for the i-vector are employed: Probabilistic Linear Discriminant Analysis (PLDA), Linear Discriminant Analysis (LDA) and Within Class Covariance Normalization (WCCN). The ECG-ID database obtained from the PhysioNet repository, consisting of 90 subjects with a total of 310 ECG recordings, each recorded for 20 seconds, is used in this study. Experimental results reveal the robustness of the i-vector PLDA approach, which gives Equal Error Rates (EERs) of 2.156% and 2.155% for protocols 1 and 2, respectively.
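The Equal Error Rate reported above is the operating point at which the false acceptance rate equals the false rejection rate; a minimal sketch of its computation (the genuine/impostor scores here are made up for illustration, not the paper's data):

```python
import numpy as np

def eer(genuine, impostor):
    """Equal Error Rate: sweep thresholds, return the point where FAR ~= FRR."""
    thr = np.unique(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thr])  # false accepts
    frr = np.array([(genuine < t).mean() for t in thr])    # false rejects
    i = np.argmin(np.abs(far - frr))
    return (far[i] + frr[i]) / 2

gen = np.array([0.91, 0.85, 0.78, 0.95])   # same-subject match scores
imp = np.array([0.12, 0.35, 0.40, 0.22])   # cross-subject match scores
print(eer(gen, imp))                        # fully separated scores -> 0.0
```

With overlapping score distributions the EER rises toward 0.5; an EER around 2.16%, as reported for i-vector PLDA, means the two distributions are almost fully separated.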
Xu Yang Gan, Haidi Ibrahim, and Dzati Athiar Ramli
IOP Publishing