Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content
Abdulmotaleb El Saddik, Jamil Ahmad, Mustaqeem Khan, Saad Abouzahir, Wail Gueaieb
ACM Transactions on Multimedia Computing, Communications and Applications, 2025
The metaverse presents an emerging creative expression and collaboration frontier where generative artificial intelligence (GenAI) can play a pivotal role with its ability to generate multimodal content from simple prompts. These prompts allow the metaverse to interact with GenAI, where context information, instructions, input data, or even output indications constituting the prompt can come from within the metaverse. However, their integration poses challenges regarding interoperability, lack of standards, scalability, and maintaining a high-quality user experience. This article explores how GenAI can productively assist in enhancing creativity within the contexts of the metaverse and unlock new opportunities. We provide a technical, in-depth overview of the different generative models for image, video, audio, and 3D content within the metaverse environments. We also explore the bottlenecks, opportunities, and innovative applications of GenAI from the perspectives of end users, developers, service providers, and AI researchers. This survey commences by highlighting the potential of GenAI for enhancing the metaverse experience through dynamic content generation to populate massive virtual worlds. Subsequently, we shed light on the ongoing research practices and trends in multimodal content generation, enhancing realism and creativity and alleviating bottlenecks related to standardization, computational cost, privacy, and safety. Last, we share insights into promising research directions toward the integration of GenAI with the metaverse for creative enhancement, improved immersion, and innovative interactive applications.
ST-GCVA: Hierarchical graph-based spatio-temporal reasoning for robust violence detection M Khan, J Alkilani, W Abusafia, J Ahmad, F Ullah, N Dilshad Intelligent Systems with Applications, 200676, 2026
FocusSDF: Boundary-Aware Learning for Medical Image Segmentation via Signed Distance Supervision M Shafique, N Rahim, J Ahmad, MR Siadat, K Malik, G Malik 2026 IEEE 23rd International Symposium on Biomedical Imaging (ISBI), 1-5, 2026 Citations: 1
AI-Driven Digital Twin Models for Wireless Endoscopic Gastrointestinal Monitoring M Khan, J Ahmad, Y Mahmood 2026 7th International Conference on Advancements in Computational Sciences …, 2026
ViolenceNet: Multi-Scale Transformer with Joint Features Understanding M Khan, W Abusafia, J Alkilani, M Mohamed, A Ez-Zyn, H Zokrait, ... 2026 IEEE International Conference on Consumer Electronics (ICCE), 1-6, 2026
Distilling Knowledge to Efficient Transformer for Semi-Supervised Citrus Maturity Detection using Consumer UAVs J Ahmad, M Khan, W Gueaieb, A El Saddik, G De Masi, F Karray IEEE Transactions on Consumer Electronics, 2026 Citations: 1
AG-CLIP: Attribute-guided CLIP for Zero-shot Fine-grained Recognition J Ahmad, M Khan, W Gueaieb, A El Saddik, G De Masi, F Karray IEEE Open Journal of the Computer Society, 2026 Citations: 1
Leveraging model explainability and fine-grained cutmix augmentation for robust detection of apricot diseases in UAV images J Ahmad, W Gueaieb, A El Saddik, G De Masi, F Karray Expert Systems with Applications 296, 128946, 2026 Citations: 3
Context-Aware Detection and Grading of Intracranial Aneurysms in DSA Images J Ahmad, K Malik, F Ullah, K Ahmad, M Khan, G Malik 2025 Computing, Communications and IoT Applications (ComComAp), 324-329, 2025
Residual-Enhanced YOLO with Motion-Blur Augmentation for UAV-Based Weed Detection J Ahmad, S Mahmoud, M Mahmoud, L Elfateh, M Khan 2025 IEEE 19th International Conference on Application of Information and …, 2025
Unleashing Creativity in the Metaverse: Generative AI and Multimodal Content A El Saddik, J Ahmad, M Khan, S Abouzahir, W Gueaieb ACM Transactions on Multimedia Computing, Communications and Applications 21 …, 2025 Citations: 37
Joint multi-scale multimodal transformer for emotion using consumer devices M Khan, J Ahmad, W Gueaieb, G De Masi, F Karray, A El Saddik IEEE Transactions on Consumer Electronics 71 (1), 1092-1101, 2025 Citations: 31
Knowledge-infused learning for fine-grained plant disease recognition J Ahmad, W Gueaieb, A El Saddik, G De Masi, F Karray 2024 IEEE International Conference on Image Processing (ICIP), 395-401, 2024 Citations: 1
Yield estimation and health assessment of temperate fruits: A modular framework J Ahmad, W Gueaieb, A El Saddik, G De Masi, F Karray Engineering Applications of Artificial Intelligence 136, 108871, 2024 Citations: 12
Artificial Intelligence-based intrusion detection system for V2V communication in vehicular adhoc networks A Khalil, H Farman, MM Nasralla, B Jan, J Ahmad Ain Shams Engineering Journal 15 (4), 102616, 2024 Citations: 44
Enabling consumer UAVs for precision agriculture applications: A case study of yield estimation J Ahmad, W Gueaieb, A El Saddik, G De Masi, F Karray 2024 IEEE International Conference on Consumer Electronics (ICCE), 1-6, 2024 Citations: 5
Skin-former: mobile-friendly transformer for skin lesion diagnosis M Khan, J Ahmad, A El Saddik, W Gueaieb 2024 IEEE International Conference on Consumer Electronics (ICCE), 1-6, 2024 Citations: 14
Drone-HAT: Hybrid attention transformer for complex action recognition in drone surveillance videos M Khan, J Ahmad, A El Saddik, W Gueaieb, G De Masi, F Karray Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 Citations: 32
StrokeNet: An automated approach for segmentation and rupture risk prediction of intracranial aneurysm M Irfan, KM Malik, J Ahmad, G Malik Computerized Medical Imaging and Graphics 108, 102271, 2023 Citations: 22
Prognosis prediction in COVID-19 patients through deep feature space reasoning J Ahmad, AKJ Saudagar, KM Malik, MB Khan, A AlTameem, ... Diagnostics 13 (8), 1387, 2023 Citations: 4
Enabling automation and edge intelligence over resource constraint IoT devices for smart home M Nasir, K Muhammad, A Ullah, J Ahmad, SW Baik, M Sajjad Neurocomputing 491, 494-506, 2022 Citations: 95
MOST CITED SCHOLAR PUBLICATIONS
Action recognition in video sequences using deep bi-directional LSTM with CNN features A Ullah, J Ahmad, K Muhammad, M Sajjad, SW Baik IEEE Access 6, 1155-1166, 2017 Citations: 986
Convolutional neural networks based fire detection in surveillance videos K Muhammad, J Ahmad, I Mehmood, S Rho, SW Baik IEEE Access 6, 18174-18183, 2018 Citations: 702
Efficient deep CNN-based fire detection and localization in video surveillance applications K Muhammad, J Ahmad, Z Lv, P Bellavista, P Yang, SW Baik IEEE Transactions on Systems, Man, and Cybernetics: Systems 49 (7), 1419-1434, 2018 Citations: 658
Early fire detection using convolutional neural networks during surveillance for effective disaster management K Muhammad, J Ahmad, SW Baik Neurocomputing 288, 30-42, 2018 Citations: 640
Speech emotion recognition from spectrograms with deep convolutional neural network AM Badshah, J Ahmad, N Rahim, SW Baik 2017 International Conference on Platform Technology and Service (PlatCon), 1-5, 2017 Citations: 588
Secure surveillance framework for IoT systems using probabilistic image encryption K Muhammad, R Hamza, J Ahmad, J Lloret, H Wang, SW Baik IEEE Transactions on Industrial Informatics 14 (8), 3679-3689, 2018 Citations: 347
Deep learning methods and applications J Ahmad, H Farman, Z Jan Deep learning: convergence to big data analytics, 31-42, 2019 Citations: 265
Attention induced multi-head convolutional neural network for human activity recognition ZN Khan, J Ahmad Applied Soft Computing 110, 107671, 2021 Citations: 236
Deep features-based speech emotion recognition for smart affective services AM Badshah, N Rahim, N Ullah, J Ahmad, K Muhammad, MY Lee, ... Multimedia Tools and Applications 78 (5), 5571-5589, 2019 Citations: 230
CISSKA-LSB: color image steganography using stego key-directed adaptive LSB substitution method K Muhammad, J Ahmad, NU Rehman, Z Jan, M Sajjad Multimedia Tools and Applications 76 (6), 8597-8626, 2017 Citations: 168
A Secure Method for Color Image Steganography using Gray-Level Modification and Multi-level Encryption K Muhammad, J Ahmad, H Farman, Z Jan, M Sajjad, SW Baik KSII Transactions on Internet and Information Systems 9 (5), 1938-1962, 2015 Citations: 113
Internet of energy: Opportunities, applications, architectures and challenges in smart industries Y Shahzad, H Javed, H Farman, J Ahmad, B Jan, M Zubair Computers & Electrical Engineering 86, 106739, 2020 Citations: 108
Visual features based boosted classification of weeds for real-time selective herbicide sprayer systems J Ahmad, K Muhammad, I Ahmad, W Ahmad, ML Smith, LN Smith, ... Computers in Industry 98, 23-33, 2018 Citations: 99
Enabling automation and edge intelligence over resource constraint IoT devices for smart home M Nasir, K Muhammad, A Ullah, J Ahmad, SW Baik, M Sajjad Neurocomputing 491, 494-506, 2022 Citations: 95
Image steganography for authenticity of visual contents in social networks K Muhammad, J Ahmad, S Rho, SW Baik Multimedia Tools and Applications 76 (18), 18985-19004, 2017 Citations: 90
Disease detection in plum using convolutional neural network under true field conditions J Ahmad, B Jan, H Farman, W Ahmad, A Ullah Sensors 20 (19), 5569, 2020 Citations: 84
Analytical network process based optimum cluster head selection in wireless sensor network H Farman, H Javed, B Jan, J Ahmad, S Ali, FN Khalil, M Khan PLoS One 12 (7), e0180848, 2017 Citations: 68
Medical image retrieval with compact binary codes generated in frequency domain using highly reactive convolutional features J Ahmad, K Muhammad, SW Baik Journal of Medical Systems 42 (2), 24, 2018 Citations: 66
A novel image steganographic approach for hiding text in color images using HSI color model K Muhammad, J Ahmad, H Farman, M Zubair arXiv preprint arXiv:1503.00388, 2015 Citations: 66
Endoscopic image classification and retrieval using clustered convolutional features J Ahmad, K Muhammad, MY Lee, SW Baik Journal of Medical Systems 41 (12), 196, 2017 Citations: 64