Hussain Md Abu Nyeem

@mist.ac.bd

Professor, Department of EECE
Military Institute of Science and Technology




https://researchid.co/h.nyeem

RESEARCH INTERESTS

Image computing and processing
Digital image analysis
Pattern recognition
Data hiding

51 Scopus Publications

SCOPUS PUBLICATIONS


  • DCENSnet: A new deep convolutional ensemble network for skin cancer classification
    Dibaloke Chanda, Md. Saif Hassan Onim, Hussain Nyeem, Tareque Bashar Ovi, and Sauda Suara Naba

    Elsevier BV


  • Unleashing the power of generative adversarial networks: A novel machine learning approach for vehicle detection and localisation in the dark
    Md Saif Hassan Onim, Hussain Nyeem, Md. Wahiduzzaman Khan Arnob, and Arunima Dey Pooja

    Institution of Engineering and Technology (IET)
    Machine vision in low-light conditions is a critical requirement for object detection in road transportation, particularly for assisted and autonomous driving scenarios. Existing vision-based techniques are limited to daylight traffic scenarios due to their reliance on adequate lighting and high frame rates. This paper presents a novel approach to tackle this problem by investigating Vehicle Detection and Localisation (VDL) in extremely low-light conditions by using a new machine learning model. Specifically, the proposed model employs two customised generative adversarial networks, based on Pix2PixGAN and CycleGAN, to enhance dark images for input into a YOLOv4-based VDL algorithm. The model's performance is thoroughly analysed and compared against the prominent models. Our findings validate that the proposed model detects and localises vehicles accurately in extremely dark images, with an additional run-time of approximately 11 ms and an accuracy improvement of 10%-50% compared to the other models. Moreover, our model demonstrates a 4%-8% increase in Intersection over Union (IoU) at a mean frame rate of 9 fps, which underscores its potential for broader applications in ubiquitous road-object detection. The results demonstrate the significance of the proposed model as an early step to overcoming the challenges of low-light vision in road-object detection and autonomous driving, paving the way for safer and more efficient transportation systems.
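
The paper's localisation gains are reported as Intersection over Union (IoU). As a point of reference only (not code from the paper), a standard axis-aligned bounding-box IoU can be computed as follows; the (x1, y1, x2, y2) box format is an assumption.

```python
# Standard bounding-box IoU; the (x1, y1, x2, y2) box format is assumed.
def box_iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(box_iou((10, 10, 50, 50), (30, 30, 70, 70)))  # ~0.143
```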

  • Enhancing Mitochondrial Image Segmentation with Extended U-Net
    Sakib Abdul Ahad and Hussain Nyeem

    IEEE
    Accurate segmentation of mitochondria from electron microscopy data holds crucial significance in unraveling intricate cellular structures and functions. In this paper, we present a comprehensive approach to enhance the semantic segmentation of mitochondria through the integration of an extended U-Net model. Our proposed model introduces an additional layer, encompassing both encoder and decoder components, optimized for capturing intricate mitochondrial features. Moreover, we delve into the impact of employing diverse loss functions, training epochs, and dataset augmentation to assess their effects on segmentation performance. The Mean Intersection over Union (Mean IoU) and Dice Coefficient (DSC) metrics serve as our indicators of segmentation quality. Our experimental findings highlight the effectiveness of the extended model and underscore the importance of tailored loss functions and training strategies, culminating in substantial advancements in mitochondria segmentation accuracy.
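
For readers unfamiliar with the two metrics quoted above, a minimal sketch of the standard IoU and Dice coefficient for binary segmentation masks is given below; it is the textbook definition, not the authors' evaluation code.

```python
import numpy as np

def iou_and_dice(pred, target, eps=1e-7):
    """Standard IoU and Dice coefficient for binary masks (arrays of 0/1)."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = (inter + eps) / (union + eps)
    dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
    return float(iou), float(dice)
```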

  • Multilevel Fusion with Dual Stream 3DCNN-LSTM for Advancing Dynamic Hand Gesture Recognition
    Bristy Chanda and Hussain Nyeem

    IEEE
    This paper introduces a novel method for identifying dynamic hand gestures in multimedia applications, tackling the task of analysing time-varying characteristics within video streams. Our proposed method utilizes a multi-level fusion network that combines RGB and depth modalities using Long Short-Term Memory (LSTM) and a 3D Convolutional Neural Network (3DCNN). Unlike existing methods, we incorporate a U-Net-based semantic segmentation technique to extract depth features via a gesture-specific mask. The 3DCNN and LSTM models are employed to extract spectral and spatial features from both RGB and depth images. We evaluate our approach using the 20BN-Jester dataset (ten classes) and the Sebastien Marcel Dynamic Hand Posture dataset (four classes). Our results indicate that our late fusion model, which combines depth and RGB data, achieves an average validation accuracy of 97.8% on the 20BN-Jester dataset and 98.5% on the Sebastien Marcel Dynamic Hand Posture dataset, demonstrating its effectiveness in dynamic hand gesture recognition.
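
A minimal sketch of the late-fusion step described above is shown below; equal-weight averaging of the per-class probabilities from the RGB and depth streams is an assumption, and the paper's actual fusion rule may differ.

```python
import numpy as np

def late_fusion(rgb_probs, depth_probs, w_rgb=0.5):
    """Combine per-class probabilities from the RGB and depth streams."""
    fused = w_rgb * np.asarray(rgb_probs) + (1.0 - w_rgb) * np.asarray(depth_probs)
    return fused.argmax(axis=-1)  # predicted gesture class per clip
```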

  • Pixel Standardization Meets U²Net: Advancing Polyp Segmentation in Colonoscopy Images
    Tareque Bashar Ovi, Nomaiya Bashree, Saleh Ahmed, Hussain Nyeem, and Md Abdul Wahed

    IEEE
    Accurate segmentation of polyps is crucial for the automatic detection and removal of colorectal polyps, which can progress into cancerous tumours. In this paper, we propose a new U²Net framework with image standardization to address the limitations of the classic UNet and its variants in capturing contextual information related to the diverse sizes, shapes, and textures of polyps. By employing Residual U-blocks in a dual nested layer architecture, the U²Net model facilitates the combination of receptive fields of different scales, enabling the gathering of more comprehensive contextual information. Besides, image standardization in the proposed framework further enhances the potential of U²Net to achieve highly accurate automatic segmentation of polyp areas from colonoscopy images. We demonstrate promising preliminary results on the Kvasir-SEG dataset, with mean scores of 87.8% for dice score, 81.3% for intersection over union (IoU), 91.7% for precision, and 88.1% for recall. These results highlight the effectiveness of our proposed framework in accurately segmenting polyps, leading to the development of more efficient solutions for colorectal screening.
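
A minimal sketch of what "image standardization" usually means, zero-mean unit-variance scaling per image, is shown below; the paper's exact recipe (e.g. per-channel statistics) is not reproduced here.

```python
import numpy as np

def standardize(image, eps=1e-7):
    """Per-image standardization: zero mean, unit variance."""
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + eps)
```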

  • Context-Aware Skin Lesion Segmentation with U²Net and Image Standardization
    Nomaiya Bashree, Tareque Bashar Ovi, Saleh Ahmed, Md Abdul Wahed, and Hussain Nyeem

    IEEE
    Skin cancer is a major global cause of mortality. This paper introduces a new deep learning (DL) framework using U²-Net with image standardization for early detection of malignant skin lesions to reduce mortality rates. The proposed approach employs a two-level nested U-structure, which differs from the conventional U-Net and transfer learning architectures. By integrating receptive fields of various sizes within Residual U-blocks, the U²-Net model captures contextual information from multiple scales. Additionally, when combined with image standardization, the model generalizes global features more effectively. Early experimental results demonstrate the potential of the proposed framework for highly accurate automatic skin lesion segmentation on the HAM10000 dataset, with mean scores of 97.50% for accuracy, 94.62% for dice score, 89.63% for IoU, and 95.70% for recall. The proposed U²-Net framework advances the field of skin cancer detection, with the promise of reducing its mortality rate.

  • BIDInet: Advancing Dark Image Enhancement with Burst-Feeding and Invertible Blocks
    Trisha Dash Mou and Hussain Nyeem

    IEEE
    This paper presents a novel approach to enhancing dark images using burst photography, developing a burst-fed invertible dark image enhancement network (BIDInet). BIDInet combines multiple invertible blocks with the pyramid structure of VGG-19. With burst-fed image-frames, BIDInet produces a single enhanced image with reduced artefacts. Trained on the See-in-Dark (SID) dataset, BIDInet achieves a PSNR value of 33.39 dB and an SSIM value of 0.66 for image enhancement, as well as a PSNR value of 23.43 dB and an SSIM value of 0.76 for image denoising, with 97.86% denoising accuracy. Comparisons with existing models demonstrate the effectiveness of using multiple invertible modules based on the VGG-19 network's pyramid block for enhancing and denoising extremely dark images in various applications.
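
The quoted 33.39 dB and 23.43 dB figures are peak signal-to-noise ratios. A minimal sketch of the standard PSNR computation for 8-bit images is given below; it is the common definition, not the authors' evaluation code.

```python
import numpy as np

def psnr(reference, estimate, max_val=255.0):
    """Peak signal-to-noise ratio between two images of the same shape."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```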

  • BLPnet: A new DNN model and Bengali OCR engine for Automatic Licence Plate Recognition
    Md. Saif Hassan Onim, Hussain Nyeem, Koushik Roy, Mahmudul Hasan, Abtahi Ishmam, Md. Akiful Hoque Akif, and Tareque Bashar Ovi

    Elsevier BV

  • Reversible data hiding with dual pixel-value-ordering and minimum prediction error expansion
    Md. Abdul Wahed and Hussain Nyeem

    Public Library of Science (PLoS)
    Pixel Value Ordering (PVO) holds an impressive property for high fidelity Reversible Data Hiding (RDH). In this paper, we introduce a dual PVO (dPVO) for Prediction Error Expansion (PEE), and thereby develop a new RDH scheme to offer a better rate-distortion performance. Particularly, we propose to embed in two phases: forward and backward. In the forward phase, PVO with classic PEE is applied to every non-overlapping image block of size 1 × 3. In the backward phase, a minimum set and a maximum set of pixels are determined from the pixels predicted in the forward phase. The minimum set contains only the lowest predicted pixels, and the maximum set contains the largest predicted pixels of each image block. The proposed dPVO with PEE is then applied to both sets, so that the pixel values of the minimum set are increased and those of the maximum set are decreased by a unit value. Thereby, the pixels predicted in the forward embedding can partially be restored to their original values, resulting in both better embedded image quality and a higher embedding rate. Experimental results have recorded a promising rate-distortion performance of our scheme, with a significant improvement in embedded image quality at higher embedding rates compared to popular and state-of-the-art PVO-based RDH schemes.
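
A highly simplified sketch of the forward-phase idea, classic PVO with PEE on a single 1×3 block (largest-pixel side only), is given below. It illustrates the expand-or-shift rule; the dual-PVO backward phase, the minimum-pixel side, and overflow handling are omitted, and the details may differ from the paper.

```python
import numpy as np

def pvo_embed_max(block, bit):
    """Embed one bit in the largest pixel of a block using PVO prediction."""
    block = list(block)
    order = np.argsort(block, kind="stable")     # ascending pixel order
    i_max, i_2nd = order[-1], order[-2]
    e = block[i_max] - block[i_2nd]              # prediction error of the largest pixel
    if e == 1:
        block[i_max] += bit                      # expandable: carries one data bit
    elif e > 1:
        block[i_max] += 1                        # shifted: carries no data
    return block                                 # e == 0 is left unchanged here

print(pvo_embed_max([52, 50, 51], 1))            # -> [53, 50, 51]
```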

  • Modelling Lips State Detection Using CNN for Non-verbal Communications
    Abtahi Ishmam, Mahmudul Hasan, Md. Saif Hassan Onim, Koushik Roy, Md. Akiful Hoque Akif, and Hussain Nyeem

    Springer Nature Singapore

  • Automatic Hand Gesture Recognition with Semantic Segmentation and Deep Learning
    Bristy Chanda and Hussain Nyeem

    IEEE
    Automatic Hand Gesture Recognition is a key requirement for a variety of applications, including Sign Language translation, Human-Computer Interaction (HCI), and ubiquitous vision-based systems. Due to the lighting variance and complicated backgrounds in the input image set of gestures, meeting this requirement remains a challenge. This paper introduces semantic segmentation into a deep learning-based hand gesture recognition system for sign language translation. Building on the U-Net architecture, the proposed model obtains the semantically segmented mask of the input image, which is then fed to convolutional neural networks (CNNs) for multiclass classification. The proposed model is trained and tested for four different depths of the CNN architecture and compared with pre-trained CNN architectures such as Inception V3, VGG16, VGG19, and ResNet50. The proposed model is evaluated on the National University of Singapore (NUS) hand posture dataset II (subset A), which contains 2000 images in 10 classes. A significant recognition rate of 97.15% is achieved, outperforming a set of prominent models and demonstrating the model's promise for sign language translation.
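
The hand-off between the two stages can be pictured as masking out background pixels before classification. The sketch below is only an illustration of that idea, with shapes assumed, not the paper's implementation.

```python
import numpy as np

def apply_gesture_mask(image, mask):
    """image: (H, W, 3) array; mask: (H, W) array of 0/1 from the segmentation net."""
    return image * mask[..., np.newaxis].astype(image.dtype)
```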

  • Developing BrutNet: A New Deep CNN Model with GRU for Realtime Violence Detection
    Mahmudul Haque, Syma Afsha, and Hussain Nyeem

    IEEE
    Computer vision with deep learning has recently emerged for Automatic Violence Detection and Classification (AVDC) with enormous potential. This paper reports an early development of a new Deep Convolutional Neural Network (DCNN) model that we call BrutNet. Building on the Gated Recurrent Unit (GRU), BrutNet is designed to operate on the patterns within multiple frames of a video or video clips of shape 160×90 with a duration of at least 3 seconds. To obtain the image-feature set and the pattern of each frame, convolutional layers are applied to each frame through a time-distributed layer. The model thus encodes the data from 4D to 2D to obtain a 512-feature set for each frame. The temporal nature of these frames is then extracted by the GRU layer as a 1D vector, which is processed by several dense layers. A binary classification is thereby performed, labelling the content as violent or non-violent. Dropout layers with a dropping rate of 0.25 were added to avoid overfitting the model. Besides, ReLU and sigmoid activation functions were used in the hidden and output layers, respectively. Trained with a recent high-resolution AVDC video dataset and appropriate hyper-parameters on the NVIDIA Tesla K80 GPU of Google Colab, the initial testing and validation of the model recorded a test accuracy of 90.00%, outperforming the earlier LSTM-based ResNet50 model.
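
A minimal Keras sketch of the kind of time-distributed CNN + GRU stack described above is given below. The per-frame 512-feature encoding, 0.25 dropout, and ReLU/sigmoid activations follow the abstract; the number of frames, convolutional filter counts, and GRU width are illustrative assumptions, not the published BrutNet configuration.

```python
from tensorflow.keras import layers, models

def build_clip_classifier(frames=30, height=90, width=160, channels=3):
    inputs = layers.Input(shape=(frames, height, width, channels))
    # Per-frame convolutional feature extraction applied across time
    x = layers.TimeDistributed(layers.Conv2D(32, 3, activation="relu"))(inputs)
    x = layers.TimeDistributed(layers.MaxPooling2D())(x)
    x = layers.TimeDistributed(layers.Conv2D(64, 3, activation="relu"))(x)
    x = layers.TimeDistributed(layers.GlobalAveragePooling2D())(x)
    x = layers.TimeDistributed(layers.Dense(512, activation="relu"))(x)  # 512 features per frame
    x = layers.Dropout(0.25)(x)
    x = layers.GRU(128)(x)                       # temporal pattern across frames -> 1D vector
    x = layers.Dropout(0.25)(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # violent vs non-violent
    return models.Model(inputs, outputs)

model = build_clip_classifier()
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```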

  • Machine Learning Models for Content Classification in Film Censorship and Rating
    Syma Afsha, Mahmudul Haque, and Hussain Nyeem

    IEEE
    Automated Film Censorship and Rating (AFCR) has recently turned out to be a major research area of Machine Learning (ML). The production and streaming of films, including movies, TV series, animations, and other audio-visual content, have expanded widely, making their manual censorship and rating an increasingly exhausting task. ML-based methods have thus been emerging for the design of an AFCR system. However, the initial ad-hoc efforts to develop AFCR systems demand a “complete” conceptual model of the system with its potential classes and their criteria. This paper primarily attempts to determine both the general and contextual classes of content, and their criteria, for an AFCR system. Besides, the state-of-the-art AFCR systems are systematically reviewed to identify their underlying ML models, advantages, and limitations. With a comparative analysis of the existing ML models, we demonstrate the effectiveness of sequential and multimodal analysis in the development of an efficient AFCR system.

  • Data-driven Embedding with Pixel Repetition for High Capacity Reversible Data Hiding
    Mohammad Mahruf Mahdi and Hussain Nyeem

    IEEE
    This paper introduces a new data-driven paradigm of Pixel-Repetition (PR) based embedding for high capacity Reversible Data Hiding (RDH). A simplified up-sampling with PR is applied to an input image, repeating each input pixel to create an image-block of size 2×2 in the up-sampled image. Data is then embedded in the least significant bits (LSBs) of each up-sampled pixel, leaving the fewest possible most significant bits (MSBs) untouched. This consideration significantly maximises the embedding capacity, but it also increases the distance between the original and embedded up-sampled pixels. To minimise these high variations in the embedded pixels, we also propose to determine the closer version of an embedded pixel from its original and (2’s) complement versions. A flag-bit is reserved at the LSB of each up-sampled pixel to track which version is used during embedding, for extraction of the embedded data-bits. For pixel values of 2 bits or fewer, however, direct LSB embedding is used without any flag bit. The proposed data-driven embedding ensures better utilization of the redundancy, as observed with an impressive embedding rate-distortion performance having more than 8% improvement in both the embedding rate and embedded image quality over the prominent PR-based schemes.
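
Two of the basic ingredients described above, pixel-repetition up-sampling and k-LSB substitution, are sketched below as an illustration only; the flag-bit and 2's-complement selection steps of the actual scheme are omitted.

```python
import numpy as np

def pr_upsample(image):
    """Pixel repetition: every pixel becomes a 2x2 block in the up-sampled image."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def embed_k_lsbs(pixels, payload, k):
    """Overwrite the k LSBs of each pixel; payload holds one k-bit value per pixel."""
    mask = np.uint8((0xFF >> k) << k)            # keep only the top (8 - k) bits
    return (pixels & mask) | payload.astype(np.uint8)

img = np.array([[10, 20], [30, 40]], dtype=np.uint8)
up = pr_upsample(img)                                        # 4x4 image of repeated pixels
stego = embed_k_lsbs(up, np.random.randint(0, 2**3, up.shape), k=3)
```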

  • Improving Automatic Sign Language Translation with Image Binarisation and Deep Learning
    Mahmudul Haque, Syma Afsha, Tareque Bashar Ovi, and Hussain Nyeem

    IEEE
    Sign Language Translation (SLT) has been widely investigated to provide a futuristic solution to tackle human speech and hearing disability. Recent deep learning-based SLT models have redefined computer vision-based detection and classification to automatically translate hand-gesture-based sign language (SL) into natural language (NL) with higher accuracy. Unlike the existing models that directly learn from natural image-sets, in this paper, we propose a 2D Convolutional Neural Network (CNN) model with customised hyper-parameters to be trained with binary SL image-sets. We thus introduce a binarisation step to preprocess the images of size 28×28 to feed the model. Preliminary results of our model trained with the binarised image-set demonstrate its potential with an impressive classification accuracy of 99.99% on the NVIDIA Tesla K80 GPU environment (Google Colab) for an automatic SLT system.
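
A minimal sketch of the binarisation step described above is shown below, using OpenCV; Otsu thresholding and the grayscale input are assumptions, since the abstract does not specify the exact binarisation rule.

```python
import cv2
import numpy as np

def binarise_for_slt(gray_image):
    """gray_image: 8-bit single-channel sign image; returns a 28x28 binary input."""
    _, binary = cv2.threshold(gray_image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    small = cv2.resize(binary, (28, 28), interpolation=cv2.INTER_AREA)
    return (small > 127).astype(np.float32)      # {0, 1} values for the CNN
```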

  • Revisiting Deep Learning Models for Road Lane Detection
    Raiyan Ibne Hafiz, Toaha Bin Faruq, and Hussain Nyeem

    IEEE
    This paper revisits the prominent deep-learning based lane-detection (DLLD) models and their datasets. Deep learning has redefined the limits of the recent computer-vision based models for the emerging intelligent transportation system with assisted/self-driving. However, these models are not readily applicable to the irregular and complex lanes in developing and underdeveloped countries. In support of the development and validation of the DLLD models, a balanced mix of varying road and scene conditions is determined, including the rural, suburban and urban areas of developing and underdeveloped countries, different traffic, weather and lighting conditions, moving artefacts (i.e., motion blurring), and irregular and unmarked lanes. In light of these conditions, we then analyse the merits and limitations of promising DLLD models and available datasets. Finally, the avenues and needs for the future development of those models and datasets are suggested to ensure their real-time and universal applicability to unstructured and complex road-lane scenarios, for safer roads and better traffic.

  • Pixel grouping of digital images for reversible data hiding
    Sultan Abdul Hasib and Hussain Md Abu Nyeem

    University of Szeged
    Pixel Grouping (PG) of digital images has been a key consideration in the recent development of Reversible Data Hiding (RDH) schemes. While a PG kernel with neighborhood pixels helps compute image groups for better embedding rate-distortion performance, only a horizontal neighborhood pixel group of size 1×3 has so far been considered. In this paper, we formulate PG kernels of sizes 3×1, 2×3 and 3×2 and investigate their effect on the rate-distortion performance of a prominent PG-based RDH scheme. Specifically, a kernel of size 3×2 (or 2×3) creates a pair of pixel-trios of triangular shape and offers greater possible correlation among the pixels. This kernel can thus be better utilized to improve a PG-based RDH scheme. Considering this, we develop and present an improved PG-based RDH scheme and the computational models of its key processes. Experimental results demonstrate that our proposed RDH scheme offers reasonably better embedding rate-distortion performance than the original scheme.
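
Grouping an image into the non-overlapping 3×2 kernels discussed above can be done with a simple reshape, as sketched below for illustration; border pixels that do not fill a complete block are cropped here, which may differ from the paper's handling.

```python
import numpy as np

def group_3x2(image):
    """Split a 2-D image into non-overlapping 3x2 pixel groups."""
    h, w = image.shape[:2]
    h, w = h - h % 3, w - w % 2                  # crop rows/columns that do not fill a block
    blocks = image[:h, :w].reshape(h // 3, 3, w // 2, 2).swapaxes(1, 2)
    return blocks.reshape(-1, 3, 2)              # one row per pixel group

print(group_3x2(np.arange(36).reshape(6, 6)).shape)  # (6, 3, 2)
```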

  • High Fidelity Embedding of Electronic Patient Record in Medical Images
    Mohammad Ali Kawser and Hussain Nyeem

    IEEE
    In support of easier access and effective management of the increasing electronic patient record (EPR) and medical images, an improved Pixel Repetition (PR) based embedding scheme is investigated in this paper for hiding EPR in medical images, ensuring higher-rate embedding without any noticeable visual degradation. Particularly, a medical image is up-sampled using PR, which creates a block of four pixels from each original pixel (called the seed). Each block embeds 4 bits of EPR-data using PR-embedding conditions, followed by the embedding of k bits of the data in the LSBs (Least Significant Bits) of the seed pixels. Furthermore, a pixel correction technique (2^k-bit correction) is applied to enhance the quality of the embedded medical image. The proposed scheme demonstrates significantly higher embedding capacity in comparison to other state-of-the-art techniques while maintaining good visual quality of the embedded medical image.

  • Relative Entropy Pre-Fitting Model for Noisy and Intensity Inhomogeneous Image Segmentation
    Chowdhury Mohammad Abid Rahman and Hussain Nyeem

    IEEE
    This paper reports an improved Active Contour Model (ACM) for image segmentation. Despite the significant development of ACMs, their performance for noisy and intensity-inhomogeneous images is still deficient. To tackle intensity-inhomogeneity and noise in segmentation, a new construction of local pre-fitting for evolving active contours is proposed. Two locally pre-fitted images are constructed from the local intensity estimation, and relative entropy measures are used to define the local energy functional, which relates the statistical information of these images to the original image. Thereby, the desired contour evolution of the proposed model is expedited, and its noise-immunity is increased. The proposed model has demonstrated more initialization robustness, faster contour evolution and higher segmentation accuracy over the prominent ACMs for all the test images, both with and without noise and intensity-inhomogeneity.
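
The relative entropy referred to above is the Kullback-Leibler divergence. A minimal sketch of computing it between two normalised local intensity histograms is given below; how the histograms are formed from the pre-fitted images is not reproduced here.

```python
import numpy as np

def relative_entropy(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two histograms."""
    p = np.asarray(p, dtype=np.float64) + eps
    q = np.asarray(q, dtype=np.float64) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```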

  • Image Quality Enhancement with 2^k-bit Correction in Pixel Repetition Embedding
    Mohammad Ali Kawser and Hussain Nyeem

    IEEE
    While a higher embedding capacity of an RDH scheme is a timely need for recent imaging-based applications, it proportionately degrades the visual quality of an embedded image. Existing RDH schemes have no separate consideration for improving the embedded image quality. This paper presents a new approach to improving the embedded image quality of a high capacity RDH scheme by utilizing a simple 2^k-bit correction. Pixel-Repetition (PR) is used to create an image block of size 2×2 for each original pixel of the image (which we call a seed). Unlike the previous PR-based schemes, the embedding conditions are redefined to regenerate the seed pixel from the other 2 pixels of the block, allowing additional k bits to be embedded in the seed pixel, with the 2^k-bit correction applied to ensure better embedded image quality. The preliminary results demonstrate that the proposed pixel correction significantly improves the embedded image quality at higher embedding rates compared to other promising RDH schemes.
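
The correction idea can be illustrated with a small sketch: after k data bits overwrite a pixel's LSBs, adjusting the result by ±2^k leaves the embedded bits intact while moving the pixel closer to its original value. This generic optimal-adjustment form is an illustration only; the paper's exact correction rule may differ.

```python
def correct_embedded_pixel(original, embedded, k):
    """Pick the closest of embedded, embedded + 2^k, embedded - 2^k (LSBs unchanged)."""
    step = 1 << k
    candidates = [c for c in (embedded, embedded + step, embedded - step) if 0 <= c <= 255]
    return min(candidates, key=lambda c: abs(c - original))

print(correct_embedded_pixel(200, 207, 3))  # -> 199, still carrying the same 3 LSBs
```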

  • Embedding with pixel repetition for reversible data hiding
    Mohammad Ali Kawser and Hussain Nyeem

    IEEE
    This paper presents a new Pixel-Repetition (PR) based embedding scheme to better utilize the redundancy of image data for a high-capacity Reversible Data Hiding (RDH) scheme. Each pixel of an input image is repeated to create an image-block of size 2×2 in the up-sampled image. The proposed embedding scheme maps the up-sampled pixels using a set of unique conditions to embed 4 bits of data in each 2×2 block. The mapping conditions also ensure the regeneration of the first pixel of a block from the other pixels in the block, which allows embedding of additional data-bits in the first pixel of every block. Unlike the existing schemes, the proposed scheme, with the modified mapping conditions and embedding in the first pixel of each block, ensures better utilization of the redundancy. Thus, the proposed scheme demonstrates a significant improvement in embedding rate-distortion performance over the relevant RDH schemes.

  • A New Weighted Relative Entropy Pre-Fitting for Active Contour based Image Segmentation
    Chowdhury Mohammad Abid Rahman and Hussain Nyeem

    IEEE
    The Active Contour Model (ACM) has demonstrated its promise for image segmentation. This paper reports a new construction of local fitting energy with a weighted relative entropy for developing an ACM. With a level set method based on local pre-fitted images and local-dispersion based scaling, a weighted energy functional is formulated. To improve segmentation performance, the new energy functional relies on the relative entropy to include local similarity estimations for better curve evolution. Locally pre-fitted images are constructed to provide initialization robustness and noise endurance, and to reduce computational complexity. Besides, the energy is scaled with the local dispersion to infuse edge mapping into the energy functional, improving accuracy and reducing the effect of inhomogeneity. The proposed model thereby demonstrates improvements in initialization, curve evolution, time efficiency, computational complexity, and noise endurance.

  • Active Contour based Segmentation of ROIs in Medical Images
    Chowdhury M Abid Rahman and Hussain Nyeem

    IEEE
    Despite its impressive performance, the Active Contour Model (ACM) is yet to be verified for automatic segmentation of the Region-of-Interest (ROI) in medical images (MIs). MIs of different modalities have unique properties and requirements that make automatic ROI segmentation even more challenging than for natural images. In this paper, we investigate the performance of popular ACM-based segmentation methods for ROIs in multi-modality MIs. Three different ACM-based segmentation methods that rely on an object's regions and edges are examined. The performance of the methods is evaluated and compared in terms of popular evaluation metrics on commonly used MI modalities. Our experimental results demonstrate that the region-based ACM generally has the best segmentation performance and computational efficiency over the edge-based ACMs for all MI modalities. Region-based segmentation thus can be promising for automatic segmentation of ROIs in MI-based understanding, detection and recognition applications.