@uaeu.ac.ae
United Arab Emirates University
Mubarak Albarka Umar, Ali Nawaz, and Tariq Qayyum
IEEE
Cancer causes over 10 million deaths worldwide and is the second leading cause of death after cardiovascular disease. Cancer also has significant effects on the socioeconomic status of a family, and several studies have examined the link between socioeconomic status and cancer. This work first explores the relationships between socioeconomic status and cancer mortality rate through statistical analysis of disparate open-source data. The data initially consist of 34 features, which are reduced to the 13 most relevant features using the backward selection method. Second, based on the cancer data, we build a linear regression model to predict the cancer death rate. Several candidate models were built and linear regression diagnostics were performed on them to check for assumption violations; the most appropriate model was then selected and fine-tuned to provide optimized results. Evaluated with R² and RMSE, the model achieved an R² of 81.12% and an RMSE of 12.23 on test data. Our work also highlights the importance of checking regression assumptions in linear regression modeling.
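The two metrics named in this abstract, R² and RMSE, can be illustrated with a minimal sketch. It assumes a single-feature least-squares fit on made-up numbers; the function names and the data are hypothetical and are not taken from the paper, whose model uses 13 features.

```python
import math


def fit_simple_ols(x, y):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    squared = [(yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data (not from the paper): one predictor, one target.
x_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_train = [3.1, 4.9, 7.2, 8.8, 11.0]
slope, intercept = fit_simple_ols(x_train, y_train)

# Score on held-out points, mirroring the abstract's "on test data".
x_test = [6.0, 7.0, 8.0]
y_test = [13.2, 14.8, 17.1]
y_hat = [slope * xi + intercept for xi in x_test]
print(f"R^2 = {r_squared(y_test, y_hat):.4f}, RMSE = {rmse(y_test, y_hat):.4f}")
```

R² is unitless, while RMSE is expressed in the units of the target, which is why abstracts often report both together.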
Muhammad Danish Waseem, Ali Nawaz, Uzair Rasheed, Abir Raza, and Mubarak Omar Albarka
IEEE
Dengue is a viral disease spread by the mosquito species Aedes aegypti. According to the WHO, 100-400 million cases of dengue infection are reported worldwide every year. The dengue mosquito inhabits tropical regions and proliferates in wet climate conditions. Since it is impossible to clear those regions of the mosquito completely, analyzing the relationship between climatic factors and dengue spread is important for forecasting the number of cases, so that precautionary measures can be taken in advance to minimize the spread of the disease. Specifically, to predict the spread we employed two prominent time series models, SARIMA and SARIMAX, on the publicly available DengAI dataset. The performance of the models is evaluated using Mean Absolute Error (MAE); SARIMA and SARIMAX achieved MAE scores of 27.39 and 25.52 respectively, which reveals that our proposed methodology outperformed other existing machine learning methods.
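As a rough illustration of how a forecast is scored with MAE, the sketch below pairs the metric with a seasonal-naive baseline. This is deliberately not SARIMA or SARIMAX: the baseline simply repeats the last observed season and stands in for any forecaster. The function names and the case counts are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and forecast values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def seasonal_naive_forecast(history, season_length, steps):
    """Forecast by repeating the last full season of the history."""
    if len(history) < season_length:
        raise ValueError("history must cover at least one full season")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(steps)]


# Hypothetical case counts with a season of length 4 (not DengAI data).
history = [12, 30, 55, 20, 14, 33, 60, 22]
observed_next = [15, 35, 58, 25]

forecast = seasonal_naive_forecast(history, season_length=4, steps=4)
print("forecast:", forecast)
print("MAE:", mean_absolute_error(observed_next, forecast))
```

A baseline like this is a common sanity check: a seasonal model is only worth its complexity if its MAE is lower than that of simply repeating the previous season.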
Mubarak Albarka Umar, Chen Zhanfang, and Yan Liu
ACM
One of the key challenges of machine learning (ML) based intrusion detection systems (IDS) is high computation time, which is largely caused by the redundant, incomplete, and unrelated features contained in IDS datasets. To overcome this challenge and build efficient, more accurate IDS models, many researchers apply preprocessing techniques such as normalization and feature selection, typically within a hybrid modeling approach. In this work, we propose a hybrid IDS modeling approach that uses one algorithm for feature selection (FS) and another for building the IDS. The FS method is a wrapper-based FS with a decision tree as the feature evaluator. Five selected ML algorithms are each combined with the proposed FS method to build five IDS models using the UNSW-NB15 dataset. As a baseline, five more IDS models are built with a single modeling approach, using the full feature set of the dataset. We evaluate the effectiveness of our proposed method by comparing it with the baseline models and with state-of-the-art works. Our method achieves a best detection rate (DR) of 97.95% and proves quite effective in comparison to state-of-the-art works. We therefore recommend its use, especially in IDS modeling with the UNSW-NB15 dataset.
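The general idea of wrapper-based feature selection can be sketched as a search loop that scores candidate feature subsets with an evaluator. The sketch below assumes a greedy forward search, which the abstract does not specify, and takes the evaluator as a plain callable; in the setting described above that callable would return a decision tree's validation score for the subset. All names and the toy scores are hypothetical.

```python
def forward_wrapper_selection(features, evaluate, min_gain=1e-9):
    """Greedily add the feature that most improves evaluate(subset).

    `evaluate` is any callable mapping a list of feature names to a score
    (higher is better). The search stops when no remaining feature improves
    the score by more than `min_gain`.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(scored, key=lambda pair: pair[0])
        if top_score - best_score <= min_gain:
            break  # no candidate helps enough; keep the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected, best_score


# Toy evaluator (an assumption for illustration): each feature contributes
# a fixed amount to the score. A real wrapper would train and validate a model.
toy_contribution = {"duration": 5, "bytes_sent": 3, "constant_flag": 0, "noise": -1}


def toy_evaluate(subset):
    return sum(toy_contribution[name] for name in subset)


chosen, score = forward_wrapper_selection(list(toy_contribution), toy_evaluate)
print(chosen, score)
```

Because the evaluator is retrained for every candidate subset, wrapper methods cost more than filter methods, but they select features according to how the downstream model actually performs.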
Le Cui, Libo Cheng, Xiaoming Jiang, Zhanfang Chen, and Albarka
IOS Press