Masoumeh Zareapoor

Scopus Publications

BiMAC: Bidirectional Multimodal Alignment in Contrastive Learning
Masoumeh Zareapoor, Pourya Shamsolmoali, Yue Lu
Proceedings of the AAAI Conference on Artificial Intelligence, 2025
Achieving robust performance in vision-language tasks requires strong multimodal alignment, where textual and visual data interact seamlessly. Existing frameworks often combine contrastive learning with image captioning to unify visual and textual representations. However, reliance on global representations and unidirectional information flow from images to text limits their ability to reconstruct visual content accurately from textual descriptions. To address this limitation, we propose BiMAC, a novel framework that enables bidirectional interactions between images and text at both global and local levels. BiMAC employs advanced components to simultaneously reconstruct visual content from textual cues and generate textual descriptions guided by visual features. By integrating a text-region alignment mechanism, BiMAC identifies and selects relevant image patches for precise cross-modal interaction, reducing information noise and enhancing mapping accuracy. BiMAC achieves state-of-the-art performance across diverse vision-language tasks, including image-text retrieval, captioning, and classification.
Rethinking Fast Adversarial Training: A Splitting Technique to Overcome Catastrophic Overfitting
Masoumeh Zareapoor, Pourya Shamsolmoali
Lecture Notes in Computer Science, 2025
ClusVPR: Efficient Visual Place Recognition With Clustering-Based Weighted Transformer
Yifan Xu, Pourya Shamsolmoali, Masoumeh Zareapoor, Jie Yang
IEEE Transactions on Artificial Intelligence, 2025
Visual place recognition (VPR) is a highly challenging task that has a wide range of applications, including robot navigation and self-driving vehicles. VPR is a difficult task due to duplicate regions and insufficient attention to small objects in complex scenes, resulting in recognition deviations. In this article, we present ClusVPR, a novel approach that tackles the specific issues of redundant information in duplicate regions and representations of small objects. Different from existing methods that rely on convolutional neural networks (CNNs) for feature map generation, ClusVPR introduces a unique paradigm called clustering-based weighted transformer network (CWTNet). CWTNet uses the power of clustering-based weighted feature maps and integrates global dependencies to effectively address visual deviations encountered in large-scale VPR problems. We also introduce the optimized-VLAD (OptLAD) layer, which significantly reduces the number of parameters and enhances model efficiency. This layer is specifically designed to aggregate the information obtained from scale-wise image patches. Additionally, our pyramid self-supervised strategy focuses on extracting representative and diverse features from scale-wise image patches rather than from entire images. This approach is essential for capturing a broader range of information required for robust VPR. Extensive experiments on four VPR datasets show our model's superior performance compared to existing models while being less complex.
ShapeMorph: 3D Shape Completion via Blockwise Discrete Diffusion
Jiahui Li, Pourya Shamsolmoali, Yue Lu, Masoumeh Zareapoor
Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025, 2025
We introduce ShapeMorph, a diffusion-based method specifically designed for generating precise and diverse 3D shape completions. By integrating an irregular discrete representation with a novel blockwise discrete diffusion model, ShapeMorph can produce multiple, high-quality shape completions while maintaining fidelity to the input. In particular, each 3D shape is encoded into a compact sequence of irregularly distributed discrete variables, ensuring an accurate capture of the object's topological details. We then propose a blockwise discrete diffusion model to precisely learn the shape completion distribution based on various incompleteness. We also introduce a Flow transformer into our diffusion process, serving as a denoising network, to enhance the modeling adaptability and flexibility. ShapeMorph addresses common challenges in existing methods, such as poor completion, limited diversity, and misalignment with the input. Results show ShapeMorph outperforms state-of-the-art methods and effectively processes a variety of input types and levels of incompleteness.
Hybrid Gromov-Wasserstein Embedding for Capsule Learning
Pourya Shamsolmoali, Masoumeh Zareapoor, Swagatam Das, Eric Granger, Salvador García
IEEE Transactions on Neural Networks and Learning Systems, 2025
Capsule networks (CapsNets) aim to parse images into a hierarchy of objects, parts, and their relationships using a two-step process involving part-whole transformation and hierarchical component routing. However, this hierarchical relationship modeling is computationally expensive, which has limited the wider use of CapsNet despite its potential advantages. The current state of CapsNet models primarily focuses on comparing their performance with capsule baselines, falling short of achieving the same level of proficiency as deep convolutional neural network (CNN) variants in intricate tasks. To address this limitation, we present an efficient approach for learning capsules that surpasses canonical baseline models and even demonstrates superior performance compared with high-performing convolution models. Our contribution can be outlined in two aspects: first, we introduce a group of subcapsules onto which an input vector is projected. Subsequently, we present the hybrid Gromov-Wasserstein (HGW) framework, which initially quantifies the dissimilarity between the input and the components modeled by the subcapsules, followed by determining their alignment degree through optimal transport (OT). This innovative mechanism capitalizes on new insights into defining alignment between the input and subcapsules, based on the similarity of their respective component distributions. This approach enhances CapsNets' capacity to learn from intricate, high-dimensional data while retaining their interpretability and hierarchical structure. Our proposed model offers two distinct advantages: 1) its lightweight nature facilitates the application of capsules to more intricate vision tasks, including object detection; and 2) it outperforms baseline approaches in these demanding tasks. Our empirical findings illustrate that HGW capsules (HGWCapsules) exhibit enhanced robustness against affine transformations, scale effectively to larger datasets, and surpass CNN and CapsNet models across various vision tasks.
From Missing Pieces to Masterpieces: Image Completion With Context-Adaptive Diffusion
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Michael Felsberg, Dacheng Tao, Xuelong Li
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
Image completion is a challenging task, particularly when ensuring that generated content seamlessly integrates with existing parts of an image. While recent diffusion models have shown promise, they often struggle with maintaining coherence between known and unknown (missing) regions. This issue arises from the lack of explicit spatial and semantic alignment during the diffusion process, resulting in content that does not smoothly integrate with the original image. Additionally, diffusion models typically rely on global learned distributions rather than localized features, leading to inconsistencies between the generated and existing image parts. In this work, we propose ConFill, a novel framework that introduces a Context-Adaptive Discrepancy (CAD) model to ensure that intermediate distributions of known and unknown regions are closely aligned throughout the diffusion process. By incorporating CAD, our model progressively reduces discrepancies between generated and original images at each diffusion step, leading to contextually aligned completion. Moreover, ConFill uses a new Dynamic Sampling mechanism that adaptively increases the sampling rate in regions with high reconstruction complexity. This approach enables precise adjustments, enhancing detail and integration in restored areas. Extensive experiments demonstrate that ConFill outperforms current methods, setting a new benchmark in image completion.
Fractional Correspondence Framework in Detection Transformer
Masoumeh Zareapoor, Pourya Shamsolmoali, Huiyu Zhou, Yue Lu, Salvador García
MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia, 2024
The Detection Transformer (DETR), by incorporating the Hungarian algorithm, has significantly simplified the matching process in object detection tasks. This algorithm facilitates optimal one-to-one matching of predicted bounding boxes to ground-truth annotations during training. While effective, this strict matching process does not inherently account for the varying densities and distributions of objects, leading to suboptimal correspondences such as failing to handle multiple detections of the same object or missing small objects. To address this, we propose the Regularized Transport Plan (RTP). RTP introduces a flexible matching strategy that captures the cost of aligning predictions with ground truths to find the most accurate correspondences between these sets. By utilizing the differentiable Sinkhorn algorithm, RTP allows for soft, fractional matching rather than strict one-to-one assignments. This approach enhances the model's capability to manage varying object densities and distributions effectively. Our extensive evaluations on the MS-COCO and VOC benchmarks demonstrate the effectiveness of our approach. RTP-DETR surpasses the performance of Deform-DETR and the recently introduced DINO-DETR, achieving absolute gains in mAP of +3.8% and +1.7%, respectively.
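The difference between strict one-to-one assignment and the soft, fractional matching mentioned in the abstract above can be illustrated with a minimal Sinkhorn sketch. This is not the authors' RTP implementation: the function name `sinkhorn_plan`, the toy cost matrix, the uniform marginals, and the values of `epsilon` and `n_iters` are all assumptions chosen for illustration.

```python
import math

def sinkhorn_plan(cost, epsilon=0.5, n_iters=200):
    """Entropy-regularized transport plan between uniform marginals.

    cost[i][j] is a hypothetical matching cost between prediction i and
    ground truth j. Returns a matrix of fractional match weights.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # assumed uniform mass on predictions
    b = [1.0 / m] * m  # assumed uniform mass on ground truths
    # Gibbs kernel: a low cost becomes a large affinity.
    K = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns toward the target marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy example: three predictions, two ground-truth boxes. Predictions 0
# and 1 both lie close to ground truth 0 (a duplicate detection).
cost = [
    [0.1, 0.9],
    [0.2, 0.8],
    [0.9, 0.1],
]
plan = sinkhorn_plan(cost)
for row in plan:
    print([round(p, 3) for p in row])
```

Every entry of the resulting plan is strictly positive, so each prediction shares its mass across both ground truths instead of being forced into a single hard assignment, while the rows and columns still sum to the chosen marginals.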
SeTformer Is What You Need for Vision and Language
Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Michael Felsberg
Proceedings of the AAAI Conference on Artificial Intelligence, 2024
The dot product self-attention (DPSA) is a fundamental component of transformers. However, scaling transformers to long sequences, like documents or high-resolution images, becomes prohibitively expensive due to the quadratic time and memory complexities arising from the softmax operation. Kernel methods are employed to simplify computations by approximating softmax but often lead to performance drops compared to softmax attention. We propose SeTformer, a novel transformer where DPSA is entirely replaced by Self-optimal Transport (SeT) for achieving better performance and computational efficiency. SeT is based on two essential softmax properties: maintaining a non-negative attention matrix and using a nonlinear reweighting mechanism to emphasize important tokens in input sequences. By introducing a kernel cost function for optimal transport, SeTformer effectively satisfies these properties. In particular, with small and base-sized models, SeTformer achieves impressive top-1 accuracies of 84.7% and 86.2% on ImageNet-1K. In object detection, SeTformer-base outperforms the FocalNet counterpart by +2.2 mAP, using 38% fewer parameters and 29% fewer FLOPs. In semantic segmentation, our base-size model surpasses NAT by +3.5 mIoU with 33% fewer parameters. SeTformer also achieves state-of-the-art results in language modeling on the GLUE benchmark. These findings highlight SeTformer's applicability for vision and language tasks.
Distance-based Weighted Transformer Network for image completion
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Xuelong Li, Yue Lu
Pattern Recognition, 2024
Training Mixture-of-Experts: A Focus on Expert-Token Matching
2nd Tiny Papers Track at ICLR 2024, Tiny Papers @ ICLR 2024, 2024
Self-organized design of virtual reality simulator for identification and optimization of healthcare software components
Amit Kumar Srivastava, Shishir Kumar, Masoumeh Zareapoor
Journal of Ambient Intelligence and Humanized Computing, 2024
Efficient Routing in Sparse Mixture-of-Experts
Masoumeh Zareapoor, Pourya Shamsolmoali, Fateme Vesaghati
Proceedings of the International Joint Conference on Neural Networks, 2024
What influences news learning and sharing on mobile platforms? An analysis of multi-level informational factors
Jianmei Wang, Masoumeh Zareapoor, Yeh-Cheng Chen, Pourya Shamsolmoali, Jinwen Xie
Library Hi Tech, 2023
GEN: Generative Equivariant Networks for Diverse Image-to-Image Translation
Pourya Shamsolmoali, Masoumeh Zareapoor, Swagatam Das, Salvador Garcia, Eric Granger, Jie Yang
IEEE Transactions on Cybernetics, 2023
Entropy Transformer Networks: A Learning Approach via Tangent Bundle Data Manifold
Pourya Shamsolmoali, Masoumeh Zareapoor
Proceedings of the International Joint Conference on Neural Networks, 2023
TransInpaint: Transformer-based Image Inpainting with Context Adaptation
Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger
Proceedings - 2023 IEEE/CVF International Conference on Computer Vision Workshops, ICCVW 2023, 2023
Image Completion Via Dual-Path Cooperative Filtering
Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger
ICASSP IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, 2023
VTAE: Variational Transformer Autoencoder With Manifolds Learning
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Dacheng Tao, Xuelong Li
IEEE Transactions on Image Processing, 2023
Asymmetric Correlation Quantization Hashing for Cross-Modal Retrieval
Lu Wang, Masoumeh Zareapoor, Jie Yang, Zhonglong Zheng
IEEE Transactions on Multimedia, 2022
Salient Skin Lesion Segmentation via Dilated Scale-Wise Feature Fusion Network
Pourya Shamsolmoali, Masoumeh Zareapoor, Jie Yang, Eric Granger, Huiyu Zhou
Proceedings International Conference on Pattern Recognition, 2022
Multipatch Feature Pyramid Network for Weakly Supervised Object Detection in Optical Remote Sensing Images
Pourya Shamsolmoali, Jocelyn Chanussot, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang
IEEE Transactions on Geoscience and Remote Sensing, 2022
Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery
Pourya Shamsolmoali, Masoumeh Zareapoor, Jocelyn Chanussot, Huiyu Zhou, Jie Yang
IEEE Transactions on Geoscience and Remote Sensing, 2022
Enhanced Single-Shot Detector for Small Object Detection in Remote Sensing Images
Pourya Shamsolmoali, Masoumeh Zareapoor, Jie Yang, Eric Granger, Jocelyn Chanussot
International Geoscience and Remote Sensing Symposium IGARSS, 2022
Imbalanced data learning by minority class augmentation using capsule adversarial networks
Pourya Shamsolmoali, Masoumeh Zareapoor, Linlin Shen, Abdul Hamid Sadka, Jie Yang
Neurocomputing, 2021
Image synthesis with adversarial networks: A comprehensive survey and case studies
Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger, Huiyu Zhou, Ruili Wang, M. Emre Celebi, Jie Yang
Information Fusion, 2021
Equivariant Adversarial Network for Image-to-image Translation
Masoumeh Zareapoor, Jie Yang
ACM Transactions on Multimedia Computing Communications and Applications, 2021
Road Segmentation for Remote Sensing Images Using Adversarial Spatial Pyramid Networks
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Ruili Wang, Jie Yang
IEEE Transactions on Geoscience and Remote Sensing, 2021
Cluster-wise unsupervised hashing for cross-modal similarity search
Lu Wang, Jie Yang, Masoumeh Zareapoor, Zhonglong Zheng
Pattern Recognition, 2021
Oversampling adversarial network for class-imbalanced fault diagnosis
Masoumeh Zareapoor, Pourya Shamsolmoali, Jie Yang
Mechanical Systems and Signal Processing, 2021
Multimodal image fusion based on point-wise mutual information
Donghao Shen, Masoumeh Zareapoor, Jie Yang
Image and Vision Computing, 2021
Infrared and visible image fusion via global variable consensus
Donghao Shen, Masoumeh Zareapoor, Jie Yang
Image and Vision Computing, 2020
Infrared and visible image fusion based on dilated residual attention network
Hafiz Tayyab Mustafa, Jie Yang, Hamza Mustafa, Masoumeh Zareapoor
Optik, 2020
A Hybrid Model for Container-code Detection
Cai Sun, Kuikun Liu, Haoyuan Chi, Masoumeh Zareapoor
Proceedings - 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics, CISP-BMEI 2020, 2020
Perceptual image quality using dual generative adversarial network
Masoumeh Zareapoor, Huiyu Zhou, Jie Yang
Neural Computing and Applications, 2020
GAN-Poser: an improvised bidirectional GAN model for human motion prediction
Deepak Kumar Jain, Masoumeh Zareapoor, Rachna Jain, Abhishek Kathuria, Shivam Bachhety
Neural Computing and Applications, 2020
MLDNet: Multi-level dense network for multi-focus image fusion
Hafiz Tayyab Mustafa, Masoumeh Zareapoor, Jie Yang
Signal Processing Image Communication, 2020
AMIL: Adversarial multi-instance learning for human pose estimation
Pourya Shamsolmoali, Masoumeh Zareapoor, Huiyu Zhou, Jie Yang
ACM Transactions on Multimedia Computing Communications and Applications, 2020
Deep-Learning-Based Small Surface Defect Detection via an Exaggerated Local Variation-Based Generative Adversarial Network
Jian Lian, Weikuan Jia, Masoumeh Zareapoor, Yuanjie Zheng, Rong Luo, Deepak Kumar Jain, Neeraj Kumar
IEEE Transactions on Industrial Informatics, 2020
G-GANISR: Gradual generative adversarial network for image super resolution
Pourya Shamsolmoali, Masoumeh Zareapoor, Ruili Wang, Deepak Kumar Jain, Jie Yang
Neurocomputing, 2019
Towards realistic image via function learning
Masoumeh Zareapoor, Junhao Zhang, Jie Yang
Multimedia Tools and Applications, 2019
Deep convolution network for surveillance records super-resolution
Pourya Shamsolmoali, Masoumeh Zareapoor, Deepak Kumar Jain, Vinay Kumar Jain, Jie Yang
Multimedia Tools and Applications, 2019
High-dimensional multimedia classification using deep CNN and extended residual units
Pourya Shamsolmoali, Deepak Kumar Jain, Masoumeh Zareapoor, Jie Yang, M. Afshar Alam
Multimedia Tools and Applications, 2019
Deep semantic preserving hashing for large scale image retrieval
Masoumeh Zareapoor, Jie Yang, Deepak Kumar Jain, Pourya Shamsolmoali, Neha Jain, Surya Kant
Multimedia Tools and Applications, 2019
A Novel Deep Structure U-Net for Sea-Land Segmentation in Remote Sensing Images
Pourya Shamsolmoali, Masoumeh Zareapoor, Ruili Wang, Huiyu Zhou, Jie Yang
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019
Image super resolution by dilated dense progressive network
Pourya Shamsolmoali, Masoumeh Zareapoor, Junhao Zhang, Jie Yang
Image and Vision Computing, 2019
Diverse adversarial network for image super-resolution
Masoumeh Zareapoor, M. Emre Celebi, Jie Yang
Signal Processing Image Communication, 2019
Multi-scale convolutional neural network for multi-focus image fusion
Hafiz Tayyab Mustafa, Jie Yang, Masoumeh Zareapoor
Image and Vision Computing, 2019
Convolutional neural network in network (CNNiN): Hyperspectral image classification and dimensionality reduction
Pourya Shamsolmoali, Masoumeh Zareapoor, Jie Yang
IET Image Processing, 2019
Data Mining for Secure Online Payment Transaction
Masoumeh Zareapoor, Pourya Shamsolmoali, M. Afshar Alam
Digital Currency Breakthroughs in Research and Practice, 2019
Multi-Aspect DDOS Detection System for Securing Cloud Network
Pourya Shamsolmoali, Masoumeh Zareapoor, M. Afshar Alam
Better Security and Encryption within Cloud Computing Systems, 2019
Learning depth super-resolution by using multi-scale convolutional neural network
Masoumeh Zareapoor, Pourya Shamsolmoali, Jie Yang
Journal of Intelligent and Fuzzy Systems, 2019
Local spatial information for image super-resolution
Masoumeh Zareapoor, Deepak Kumar Jain, Jie Yang
Cognitive Systems Research, 2018
Kernelized support vector machine with deep learning: An efficient approach for extreme multiclass dataset
Masoumeh Zareapoor, Pourya Shamsolmoali, Deepak Kumar Jain, Haoxiang Wang, Jie Yang
Pattern Recognition Letters, 2018
Hybrid deep neural networks for face emotion recognition
Neha Jain, Shishir Kumar, Amit Kumar, Pourya Shamsolmoali, Masoumeh Zareapoor
Pattern Recognition Letters, 2018
Mutual information based multi-modal remote sensing image registration using adaptive feature weight
Junhao Zhang, Masoumeh Zareapoor, Xiangjian He, Donghao Shen, Deying Feng, Jie Yang
Remote Sensing Letters, 2018
A Novel Strategy for Mining Highly Imbalanced Data in Credit Card Transactions
Masoumeh Zareapoor, Jie Yang
Intelligent Automation and Soft Computing, 2018
Boosting prediction performance on imbalanced dataset
Masoumeh Zareapoor, Pourya Shamsolmoali
International Journal of Information and Communication Technology, 2018
Advance DDOS detection and mitigation technique for securing cloud
Masoumeh Zareapoor, Pourya Shamsolmoali, M. Afshar Alam
International Journal of Computational Science and Engineering, 2018
Deep supervised auto-encoder hashing for image retrieval
Sanli Tang, Haoyuan Chi, Jie Yang, Xiaolin Huang, Masoumeh Zareapoor
Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics, 2018
Unsupervised feature selection algorithm based on sparse representation
Guoqing Cui, Jie Yang, Masoumeh Zareapoor, Jiechen Wang
2016 3rd International Conference on Systems and Informatics, ICSAI 2016, 2017
Multi-aspect DDOS detection system for securing cloud network
Pourya Shamsolmoali, Masoumeh Zareapoor, M. Afshar Alam
Handbook of Research on End to End Cloud Computing Architecture Design, 2016
Text Mining for Phishing E-mail Detection
Masoumeh Zareapoor, K. R. Seeja
Advances in Intelligent Systems and Computing, 2015
Application of credit card fraud detection: Based on bagging ensemble classifier
Masoumeh Zareapoor, Pourya Shamsolmoali
Procedia Computer Science, 2015
Highly discriminative features for phishing email classification by SVD
Masoumeh Zareapoor, Pourya Shamsolmoali, M. Afshar Alam
Advances in Intelligent Systems and Computing, 2015
Statistical-based filtering system against DDOS attacks in cloud computing
Pourya Shamsolmoali, Masoumeh Zareapoor
Proceedings of the 2014 International Conference on Advances in Computing, Communications and Informatics, ICACCI 2014, 2014
FraudMiner: A novel credit card fraud detection model based on frequent itemset mining
K. R. Seeja, Masoumeh Zareapoor
Scientific World Journal, 2014
