@acecr.ac.ir
Communications
Iranian Research Institute for Electrical Engineering, ACECR
Mohsen Mohammadi and Hamid Reza Sadegh Mohammadi
Springer Science and Business Media LLC
Fateme Mostajer Kheirkhah, Hamid Reza Sadegh Mohammadi, and Abdolhossein Shahverdi
Institution of Engineering and Technology (IET)
Sperm motility analysis is an important factor in male fertility diagnosis. This article presents a hybrid segmentation method to detect sperm cells, which is robust to variation in cell density across the image sequences. In addition, a preprocessing scheme is employed to remove fixed sperm cells and debris, which facilitates and speeds up the cell-tracking stage. The article also proposes an automated sperm-tracking algorithm for image sequences of semen samples. It is a multi-step tracking scheme, an enhanced version of the adaptive window average speed (AWAS) tracking algorithm. It retrieves sperm cells lost during the tracking stage in adjacent frames and alleviates the cell-collision problem. The proposed tracking algorithm provides both higher accuracy and higher speed than the other competitive algorithms for image sequences, regardless of their particle densities.
Mohsen Mohammadi and Hamid Reza Sadegh Mohammadi
IEEE
Speaker recognition is one of the most common and user-friendly methods for identifying people from biological signals. Speaker verification based on factor analysis and the i-vector space has greatly improved the performance of these systems. In this paper, a method is proposed for weighting the model and test vectors, which utilizes the statistical characteristics of the target training vectors. The effect of the weighted vectors on the accuracy of scoring and on the performance of the entire speaker verification system was evaluated for Mel-frequency cepstral coefficients (MFCC) and power-normalized cepstral coefficients (PNCC) feature vectors, and for two scoring methods, i.e., the cosine distance and probabilistic linear discriminant analysis (PLDA). The TIMIT database was used in the evaluation of the system. The test results indicate that the use of the proposed weighted vectors significantly reduces the error rate of the speaker verification system.
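As a rough illustration of cosine scoring with weighted vectors, the sketch below scales each dimension of the model and test vectors by a weight before computing the cosine similarity. The per-dimension weights and the function names are assumptions made for this example; the abstract does not state how the weights are derived from the training vectors, so this is not the paper's weighting scheme.

```python
import math

def cosine_score(model_vec, test_vec):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(model_vec, test_vec))
    norm_m = math.sqrt(sum(a * a for a in model_vec))
    norm_t = math.sqrt(sum(b * b for b in test_vec))
    return dot / (norm_m * norm_t)

def weighted_cosine_score(model_vec, test_vec, weights):
    """Scale each dimension by a weight, then score.

    The weights are hypothetical placeholders; in practice they would be
    estimated from statistics of the target speaker's training vectors.
    """
    weighted_model = [w * a for w, a in zip(weights, model_vec)]
    weighted_test = [w * b for w, b in zip(weights, test_vec)]
    return cosine_score(weighted_model, weighted_test)
```

With all weights equal to one, the weighted score reduces to the plain cosine score, which makes the unweighted system a natural baseline.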
Mohsen Mohammadi and Hamid Reza Sadegh Mohammadi
IEEE
Identity vectors (i-vectors) are the state-of-the-art feature vectors for speaker recognition applications. One of the most important advantages of the i-vector is that it allows channel and noise compensation methods such as linear discriminant analysis (LDA) to be applied. The motivation for LDA is to find new orthogonal axes that achieve superior discrimination between different classes. The axes should maximize the inter-class variance and minimize the intra-class variance. The conventional method for computing the LDA transform assumes a Gaussian distribution and uses parametric representations for both the intra- and inter-speaker scatter matrices. However, the actual distribution of i-vectors may not be Gaussian. In this paper, we investigate the performance of LDA and three nonparametric techniques, i.e., NDA, GDA, and SVDA, separately and in combination with LDA. Experiments were conducted on the TIMIT and NIST SRE 2008 datasets with MFCC and PNCC feature vectors. The results show that combining parametric and nonparametric methods can lead to better results.
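For reference, the parametric LDA criterion described in this abstract can be written in the standard textbook form below. The symbols are assumptions introduced here, not notation from the paper: $w_{s,i}$ is the $i$-th i-vector of speaker $s$, $n_s$ the number of i-vectors for that speaker, $\bar{w}_s$ the speaker mean, and $\bar{w}$ the global mean.

```latex
S_b = \sum_{s=1}^{S} n_s \,(\bar{w}_s - \bar{w})(\bar{w}_s - \bar{w})^{\top},
\qquad
S_w = \sum_{s=1}^{S} \sum_{i=1}^{n_s} (w_{s,i} - \bar{w}_s)(w_{s,i} - \bar{w}_s)^{\top}

J(v) = \frac{v^{\top} S_b \, v}{v^{\top} S_w \, v},
\qquad
S_b \, v = \lambda \, S_w \, v
```

The projection axes are the eigenvectors $v$ with the largest eigenvalues $\lambda$ of the generalized eigenproblem on the right. The nonparametric variants named in the abstract change how the scatter matrices are estimated.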
Fateme Mostajer Kheirkhah, Hamid Reza Sadegh Mohammadi, and Abdolhossein Shahverdi
Elsevier BV
Proper recognition and tracking of microscopic sperm cells in video images are vital steps in the diagnosis and treatment of male infertility. The segmentation and detection of sperm cells in microscopic image analysis is a complicated process because of their small sizes, fast movements, and frequent collisions. Histogram-based thresholding schemes are very popular for this purpose, since they are quite fast and provide almost acceptable results. This paper proposes a combined method for sperm cell detection, which consists of a non-linear pre-processing stage, a histogram-based thresholding algorithm, and a tracking method based on an adaptive distance scheme. The results of the conducted experiments verify the superiority of the proposed scheme, when it incorporates the Kittler algorithm, over the other competitive methods in the majority of cases.
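The histogram-based thresholding step can be illustrated with a minimal sketch of Kittler and Illingworth's minimum-error thresholding, which is assumed here to be the "Kittler algorithm" the abstract refers to. The function name and the plain-list histogram input are choices made for this example; the paper's non-linear pre-processing and tracking stages are not reproduced.

```python
import math

def kittler_threshold(hist):
    """Return the grey level t minimising the minimum-error criterion.

    hist[g] is the pixel count at grey level g. Pixels with level <= t
    form one class and the rest form the other.
    """
    total = sum(hist)
    p = [h / total for h in hist]
    levels = range(len(p))
    best_t, best_j = None, math.inf
    for t in range(len(p) - 1):
        p1 = sum(p[: t + 1])
        p2 = sum(p[t + 1:])
        if p1 <= 0.0 or p2 <= 0.0:
            continue  # one class is empty
        mu1 = sum(g * p[g] for g in levels if g <= t) / p1
        mu2 = sum(g * p[g] for g in levels if g > t) / p2
        var1 = sum((g - mu1) ** 2 * p[g] for g in levels if g <= t) / p1
        var2 = sum((g - mu2) ** 2 * p[g] for g in levels if g > t) / p2
        if var1 <= 0.0 or var2 <= 0.0:
            continue  # log of the variance is undefined
        # J(t) = 1 + 2[P1 ln(s1) + P2 ln(s2)] - 2[P1 ln(P1) + P2 ln(P2)],
        # written with ln(variance) = 2 ln(standard deviation).
        j = (1.0 + p1 * math.log(var1) + p2 * math.log(var2)
             - 2.0 * (p1 * math.log(p1) + p2 * math.log(p2)))
        if j < best_j:
            best_t, best_j = t, j
    return best_t
```

For a clearly bimodal histogram the returned level falls in the valley between the two modes. The function returns `None` when no candidate level yields two classes with non-zero variance.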
Mohsen Mohammadi and Hamid Reza Sadegh Mohammadi
IEEE
Many methods that provide good results have been proposed for speaker verification, but their performance degrades in real noisy environments. A common approach to partially alleviate this problem is the fusion of several methods. In this paper, four systems based on different speech features, i.e., MFCC, IMFCC, LFCC, and PNCC, were combined at the score level to improve verification accuracy under clean and noisy speech conditions. Pairwise and four-way fusion of the features was evaluated in a speaker verification system based on speaker modeling through the Gaussian mixture model (GMM). The TIMIT and NOISEX92 databases were used as the speech and noise datasets, respectively. The experimental results show that the score-level fusion of different feature vectors enhances the accuracy of the speaker verification system, reducing the equal error rate by up to 44% in some cases.
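A minimal sketch of score-level fusion and of the equal error rate (EER) used to assess it is given below. The z-score normalisation, the weighted-sum fusion rule, and all function names are assumptions for illustration; the abstract does not specify the fusion rule used in the paper.

```python
import math

def zscore(scores):
    """Normalise one system's scores to zero mean and unit variance."""
    mean = sum(scores) / len(scores)
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    std = math.sqrt(var) or 1.0  # guard against constant scores
    return [(s - mean) / std for s in scores]

def fuse(score_lists, weights):
    """Weighted sum of normalised scores, one list per feature type."""
    normed = [zscore(scores) for scores in score_lists]
    n_trials = len(normed[0])
    return [sum(w * col[i] for w, col in zip(weights, normed))
            for i in range(n_trials)]

def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: the point where false accepts equal false rejects."""
    best_gap, eer = math.inf, 1.0
    for thr in sorted(set(target_scores) | set(impostor_scores)):
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer
```

Comparing `equal_error_rate` on the fused scores against each single-feature system gives the kind of relative EER reduction reported in the abstract.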
Mohsen Mohammadi and Hamid Reza Sadegh Mohammadi
IEEE
This paper presents a comparative study and evaluation of the performance of four speech feature vectors, i.e., MFCC, IMFCC, LFCC, and PNCC, in a speaker verification system based on speaker modeling through the Gaussian mixture model (GMM) under clean and noisy speech conditions. The TIMIT and NOISEX92 datasets were used in the tests for the speech signal and noise, respectively. The evaluation results show that IMFCC and PNCC provide superior performance in the presence of noise. To enhance the performance of the system under noisy conditions, the application of a spectral subtraction algorithm as a pre-processing stage was investigated. It improved the performance only for speech contaminated with white noise.
F. Mostajer Kheirkhah, H. R. Sadegh Mohammadi, and A. Shahverdi
IEEE
Proper recognition of microscopic sperm cells in video images is an important step in the diagnosis and treatment of male infertility. The small size of the sperm cells makes their segmentation and detection an important stage in microscopic image analysis. Histogram-based thresholding schemes are a common approach for this purpose. This paper proposes a non-linear amplitude compression transform applied as a pre-processing stage for histogram-based thresholding algorithms. The results of the conducted experiments verify that, in most cases for this application, the proposed scheme performs better with the Kittler method than with the other competitive algorithms.
Rahim Saeidi, Tomi Kinnunen, Hamid Reza Sadegh Mohammadi, Robert Rodman, and Pasi Franti
IEEE
Gaussian selection is a technique applied in the GMM-UBM framework to accelerate score calculation. We have recently introduced a novel Gaussian selection method known as sorted GMM (SGMM). SGMM uses scalar indexing of the universal background model mean vectors to achieve a fast search for the top-scoring Gaussians. In the present work we extend this method by using 2-dimensional indexing, which leads to simultaneous frame and Gaussian selection. Our results on the NIST 2002 speaker recognition evaluation corpus indicate that both the 1- and 2-dimensional SGMMs outperform frame decimation and temporal tracking of top-scoring Gaussians by a wide margin (in terms of Gaussian computations relative to GMM-UBM as baseline).
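The scalar-indexing idea can be sketched as follows: the UBM mean vectors are sorted once by a scalar key, and for each frame only the Gaussians whose keys lie near the frame's key are evaluated. The sum-of-components key, the fixed search width, and all function names are assumptions for this illustration rather than the exact SGMM formulation.

```python
import bisect
import math

def sort_key(vec):
    """Hypothetical scalar sorting function: the sum of the components."""
    return sum(vec)

def build_sorted_index(means):
    """Return Gaussian indices ordered by key, plus the sorted keys."""
    order = sorted(range(len(means)), key=lambda i: sort_key(means[i]))
    keys = [sort_key(means[i]) for i in order]
    return order, keys

def candidate_gaussians(frame, order, keys, width):
    """Pick up to `width` Gaussians whose keys neighbour the frame's key."""
    centre = bisect.bisect_left(keys, sort_key(frame))
    lo = max(0, centre - width // 2)
    hi = min(len(order), lo + width)
    lo = max(0, hi - width)  # keep the window full at the upper edge
    return [order[i] for i in range(lo, hi)]

def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian."""
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (a - m) ** 2 / v
                      for a, m, v in zip(x, mean, var))

def best_gaussian(frame, means, variances, order, keys, width):
    """Evaluate only the candidate Gaussians and return the top scorer."""
    cands = candidate_gaussians(frame, order, keys, width)
    return max(cands, key=lambda i: log_gauss_diag(frame, means[i], variances[i]))
```

Only `width` Gaussians are scored per frame instead of the full mixture, which is where the speed-up comes from. The price is that the true top-scoring Gaussian may fall outside the window.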
R. Saeidi, H.R.S. Mohammadi, T. Ganchev, and R.D. Rodman
Institute of Electrical and Electronics Engineers (IEEE)
Recently, we introduced the sorted Gaussian mixture models (SGMMs) algorithm, which provides the means to trade off performance for operational speed and thus permits the speed-up of GMM-based classification schemes. The performance of the SGMM algorithm depends on the proper choice of the sorting function and the proper adjustment of its parameters. In the present work, we employ particle swarm optimization (PSO) and an appropriate fitness function to find the most advantageous parameters of the sorting function. We evaluate the practical significance of our approach on the text-independent speaker verification task utilizing the NIST 2002 speaker recognition evaluation (SRE) database while following the NIST SRE experimental protocol. The experimental results demonstrate a superior performance of the SGMM algorithm using PSO when compared to the original SGMM. For comprehensiveness we also compared these results with those from a baseline Gaussian mixture model-universal background model (GMM-UBM) system. The experimental results suggest that the performance loss due to speed-up is partially mitigated using PSO-derived weights in a sorted GMM-based scheme.
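For readers unfamiliar with PSO, a minimal generic sketch is given below. The fitness function here is a stand-in supplied by the caller. The paper's fitness function, the sorting-function parameters being tuned, and the hyperparameter values (inertia and acceleration constants) are not given in the abstract, so those used here are assumptions.

```python
import random

def pso_minimize(fitness, dim, bounds, n_particles=20, iterations=50,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimise `fitness` over a box; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Pull towards the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_global * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In the setting of the abstract, each particle position would hold the candidate sorting-function parameters, and `fitness` would return a cost derived from verification performance on development data.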
Rahim Saeidi, Hamid Reza Sadegh Mohammadi, Todor Ganchev, and Robert D. Rodman
Springer Berlin Heidelberg
In this paper we evaluate the performance of the sorted Gaussian Mixture Model (GMM) system for Text Independent Speaker Verification under feature domain normalization conditions. Sorted GMM is a speed-up algorithm proposed for GMM-based systems. Cepstral Mean Subtraction (CMS) and Dynamic Range Normalization (DRN) are the normalization schemes studied for the sorted GMM system. The effectiveness of these normalizations has been proven in speaker recognition systems, while their effectiveness in the speed-up of GMM-based speaker verification is shown in this study. The baseline system is a universal background model–Gaussian mixture model (UBM-GMM) system, and evaluations were performed on the NIST 2002 speaker recognition evaluation database with NIST SRE rules. It is shown that CMS and DRN normalizations enhance the performance of both the baseline system and the sorted GMM system. In other words, the performance loss due to reducing the computational load is mitigated by applying CMS and DRN.
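Cepstral mean subtraction itself is simple to state: the per-coefficient mean over an utterance is subtracted from every frame. A minimal sketch, assuming the features are given as a list of equal-length frames (the function name is chosen for this example):

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-dimension utterance mean from every frame."""
    n_frames = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]
    return [[frame[d] - means[d] for d in range(dim)] for frame in frames]
```

After this step each cepstral coefficient has zero mean over the utterance, which removes a stationary convolutional channel offset in the cepstral domain.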
A. A. Lotfi Neyestanak, M. Jahanbakht, H. R. Sadegh Mohammadi, and A. Graeeli
Informa UK Limited
In this paper a high power inductor has been designed, analyzed, and then fabricated to work with high power capacitive loads in different industrial tests. This inductor has also been used as a part of resonant generators alongside an isolated transformer, a voltage regulator, and low/high power filters. Moreover, the magnetic field of the proposed inductor has been simulated using the ANSYS software. The measured inductance of the implemented variable inductor matches the simulation results.
R. Saeidi, H. R. Sadegh Mohammadi, T. Ganchev, and R. D. Rodman
IEEE
In this paper, we propose a hierarchical mixture clustering method and investigate its application to complexity reduction of a GMM-based speaker identification system. We show that by using GMM-HMC one can cluster speakers more accurately than with a sorted GMM at the same acceleration rate. The system was tested on a universal background model-Gaussian mixture model with KL-divergence as the distance measure. While the proposed system's performance is slightly inferior to the baseline system, its comparatively smaller computational load provides the potential to develop systems with higher performance.
R. Saeidi, T. Ganchev, and H. R. Sadegh Mohammadi
IEEE
In this work we study an enhanced sorting function for the recently developed sorted GMM, which is a computationally efficient method for implementing the Gaussian mixture model universal background model (GMM-UBM) scheme. The sorted GMM employs partial search and thus has lower computational complexity and relaxed memory requirements when compared to the well-known tree-structured GMM of the same model order. Experimental evaluation of the sorted GMM and its enhanced version was performed on two databases: (1) clean speech in Farsi recorded from TV broadcasts, and (2) telephone quality speech in English (NIST 2002 SRE one-speaker detection data). The enhanced sorting scheme outperformed the original one, primarily for cases where very high acceleration rates were targeted, in scenarios where there was a match between training and testing conditions. However, in mismatched train-test conditions the original sorted GMM performed better. Finally, the sorted GMM proved 14 times faster than the baseline system at the cost of only 0.43 increase in equal error rate.
H. R. Sadegh Mohammadi and R. Saeidi
IEEE
In this paper the application of a Gaussian mixture model (GMM) classifier is investigated as an efficient post-processing method to enhance the performance of GMM-based speaker identification systems, such as the Gaussian mixture model universal background model (GMM-UBM) scheme. The proposed classifier presents outstanding performance while its computational complexity is almost negligible compared to the main GMM system. Moreover, the effect of the model order of the GMM classifier is studied experimentally. Experimental results verify the superior performance obtained by applying the GMM post-processor, and show that the proper selection of the model order for this GMM has a great impact on the overall performance of the system.
H. R. Sadegh Mohammadi, R. Saeidi, M. R. Rohani, and R. D. Rodman
IEEE
In this paper a new inter-frame fast scoring scheme is proposed for Gaussian mixture model universal background model (GMM-UBM) speaker verification systems. It is combined with a recently introduced intra-frame efficient scoring method called the sorted Gaussian mixture model (SGMM) classifier, which itself uses a sorted UBM known as the sorted background model (SBM). To enhance the performance of the system, a GMM identifier is applied as a post-processing block. Experimental results show that the performance of this combined method compares favorably with the baseline GMM-UBM system, while the computational load of the proposed system is much lower than that of the baseline system.
R. Saeidi, H. R. Sadegh Mohammadi, R. D. Rodman, and T. Kinnunen
IEEE
In this paper we propose a new segmentation algorithm called delta MFCC based speech segmentation (DMFCC-SS), with application to speaker recognition systems. We show that DMFCC-SS can separate the regions of speech that result from similar likelihood scores using models such as a Gaussian mixture model (GMM), and can therefore be used to identify the regions of speech between two transitional states in a speech signal. By combining this segmentation algorithm with the discriminative power of transient frames in speaker recognition, we can investigate the tradeoff in speed-up rates that result from DMFCC-SS, with speaker verification equal error rates that result from representatives of each segment. We use a universal background model Gaussian mixture model (UBM-GMM) as a baseline system. The proposed speed-up algorithm, working in the pre-processing stage, performs well while having no computational load compared to the main GMM system. Experimental results show the superior performance of this pre-processing method in comparison with other algorithms working in the pre-processing stage of a UBM-GMM system.