@unilorin.edu.ng
Lecturer, Faculty of Physical Sciences
University of Ilorin
Oyebayo Ridwan OLANIRAN (PhD) is a lecturer in the Department of Statistics, Faculty of Physical Sciences, University of Ilorin. He has over nine years of university teaching, research, and administrative experience.
He has many publications in reputable outlets, including journals and edited conference proceedings, and has successfully supervised several undergraduate projects as well as postgraduate diploma and master's dissertations.
Dr. Olaniran is a member of professional bodies within and outside Nigeria, including the Nigerian Mathematical Society (NMS), the International Society for Clinical Biostatistics (ISCB), the International Biometric Society, Group Nigeria (IBS-GNi), the International Society for Bayesian Analysis (ISBA), and the American Society of Clinical Oncology (ASCO).
EDUCATION
Universiti Tun Hussein Onn Malaysia, Batu Pahat, Johor, Malaysia. Sept. 2016 – Oct. 2019
PhD in Science, Statistics.
University of Ilorin, Ilorin, Kwara State, Nigeria. Oct. 2014 – April 2016
Master of Science, Statistics, 79.63/100, Distinction.
University of Ilorin, Ilorin, Kwara State, Nigeria. Oct. 2009 – June 2013
Bachelor of Science, Statistics, 4.82/5.00, First Class.
Research areas: Statistics, Probability and Uncertainty; Statistics and Probability
Scopus Publications
Oyebayo Ridwan Olaniran, Saidat Fehintola Olaniran, Jeza Allohibi, Abdulmajeed Atiah Alharbi, and Nada MohammedSaeed Alharbi
Springer Science and Business Media LLC
Oyebayo Ridwan Olaniran, Fatimah M. Alghamdi, Nada MohammedSaeed Alharbi, Gamal A. Abd-Elmougod, Samirah Alzubaidi, and Abdisalam Hassan Muse
Springer Science and Business Media LLC
Oyebayo Ridwan Olaniran, Saidat Fehintola Olaniran, Ali Rashash R. Alzahrani, Nada MohammedSaeed Alharbi, and Asma Ahmad Alzahrani
MDPI AG
The analysis of high-dimensional count data presents a unique set of challenges, including overdispersion, zero-inflation, and complex nonlinear relationships that traditional generalized linear models and standard machine learning approaches often fail to adequately address. This study introduces and validates a novel Random Forest framework specifically developed for high-dimensional Poisson and Negative Binomial regression, designed to overcome the limitations of existing methods. Through comprehensive simulations and a real-world genomic application to the Norwegian Mother and Child Cohort Study, we demonstrate that the proposed methods achieve superior predictive accuracy, quantified by lower root mean squared error and deviance, and, critically, produce exceptionally stable and interpretable feature selections. Our theoretical and empirical results show that these distribution-optimized ensembles significantly outperform both penalized-likelihood techniques and naive-transformation-based ensembles in balancing statistical robustness with biological interpretability. The study concludes that the proposed frameworks provide a crucial methodological advancement, offering a powerful and reliable tool for extracting meaningful insights from complex count data in fields ranging from genomics to public health.
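As a minimal sketch of the idea, and not the paper's method, scikit-learn's built-in Poisson split criterion offers a publicly available analogue of a count-aware forest; the simulated data and settings below are illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_poisson_deviance

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))              # 20 covariates, only 2 informative
mu = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1])  # log-linear Poisson mean
y = rng.poisson(mu)                         # simulated count response

# Poisson-criterion forest: splits minimize Poisson deviance rather than MSE.
rf = RandomForestRegressor(criterion="poisson", n_estimators=200, random_state=0)
rf.fit(X, y)
pred = np.clip(rf.predict(X), 1e-6, None)   # deviance needs strictly positive means
print("in-sample Poisson deviance:", mean_poisson_deviance(y, pred))
```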
Hafsat Jalo Suleiman, Isredza Rahmi A. Hamid, and Oyebayo Ridwan Olaniran
Engineering, Technology & Applied Science Research
Cloud computing enables access to various resources online, supporting services across numerous sectors. However, meeting real-time demands in IoT-based computing is challenging due to high latency. This is particularly problematic for low-latency applications, such as health monitoring and traffic surveillance, which require fast processing of large datasets. Performance drops when data moves between central databases and cloud data centers. Edge and fog computing have emerged as solutions to address this: these models place computing resources closer to users, significantly reducing latency and energy consumption while improving data processing efficiency. This paper presents a prediction system utilizing a fog-cloud framework, combining machine learning and deep learning with wearable IoT devices for real-time cardiovascular disease prediction. The system is trained using cardiovascular data from Gombe State, Nigeria, and evaluated on energy consumption, precision, accuracy, recall, F1 score, and AUC. The proposed Optimized Naïve Bayes Random Forest (ONBRF) model offers a reliable and energy-efficient approach to predicting heart disease.
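One plausible minimal reading of a Naïve Bayes / Random Forest hybrid, sketched with scikit-learn's soft-voting ensemble; the ONBRF optimization details and the fog-cloud deployment layer are beyond this sketch, and the synthetic data is illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

# Stand-in for the wearable-sensor cardiovascular dataset.
X, y = make_classification(n_samples=600, n_features=12, random_state=0)

hybrid = VotingClassifier(
    estimators=[("nb", GaussianNB()), ("rf", RandomForestClassifier(random_state=0))],
    voting="soft",  # average predicted probabilities from both base learners
)
print("CV accuracy:", cross_val_score(hybrid, X, y, cv=5).mean())
```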
Oyebayo Ridwan Olaniran, Saidat Fehintola Olaniran, Ali Rashash R. Alzahrani, Nada MohammedSaeed Alharbi, and Asma Ahmad Alzahrani
MDPI AG
Fractional cointegration has been extensively examined in time series analysis, but its extension to heterogeneous panel data with unobserved heterogeneity and cross-sectional dependence remains underdeveloped. This paper develops a robust framework for testing fractional cointegration in heterogeneous panel data, where unobserved heterogeneity, cross-sectional dependence, and persistent shocks complicate traditional approaches. We propose the Bayesian Tapered Narrowband Least Squares (BTNBLS) estimator, which addresses three critical challenges: (1) spectral leakage in long-memory processes, mitigated via tapered periodograms; (2) precision loss in fractional parameter estimation, resolved through narrowband least squares; and (3) unobserved heterogeneity in cointegrating vectors (θᵢ) and memory parameters (ν, δ), modeled via hierarchical Bayesian priors. Monte Carlo simulations demonstrate that BTNBLS outperforms conventional estimators (OLS, NBLS, TNBLS), achieving minimal bias (0.041–0.256), near-nominal coverage probabilities (0.87–0.94), and robust control of Type I errors (0.01–0.07) under high cross-sectional dependence (ρ=0.8), while the Bayesian Chen–Hurvich test attains near-perfect power (up to 1.00) in finite samples. Applied to Purchasing Power Parity (PPP) in 18 fragile Sub-Saharan African economies, BTNBLS reveals statistically significant fractional cointegration between exchange rates and food price ratios in 15 countries (p<0.05), with a pooled estimate (θ̂=0.33, p<0.001) indicating moderate but resilient long-run equilibrium adjustment. These results underscore the importance of Bayesian shrinkage and spectral tapering in panel cointegration analysis, offering policymakers a reliable tool to assess persistence of shocks in institutionally fragmented markets.
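A minimal sketch of the frequentist core of such an estimator, tapered narrowband least squares on the lowest Fourier frequencies; the hierarchical Bayesian layer over heterogeneous panel units is omitted, and the taper and bandwidth choices are illustrative:

```python
import numpy as np

def tnbls(x, y, m):
    """Tapered narrowband least-squares slope using the first m Fourier frequencies."""
    n = len(x)
    taper = 0.5 * (1 - np.cos(2 * np.pi * np.arange(n) / n))  # cosine-bell taper
    wx = np.fft.fft(taper * x)
    wy = np.fft.fft(taper * y)
    j = np.arange(1, m + 1)                    # low frequencies only (narrowband)
    ixy = wx[j] * np.conj(wy[j])               # tapered cross-periodogram
    ixx = np.abs(wx[j]) ** 2                   # tapered periodogram of x
    return float(np.real(ixy.sum()) / ixx.sum())

rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=1000))           # persistent regressor (I(1) for simplicity)
y = 0.33 * x + rng.normal(size=1000)           # cointegrated with slope 0.33
print("TNBLS slope:", tnbls(x, y, m=25))       # should be close to 0.33
```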
Nur Fazlin Ibrahim, Mohd Asrul Affendi Abdullah, and Oyebayo Ridwan Olaniran
Engineering, Technology & Applied Science Research
This study examines the influence of climate variables on paddy production in Malaysia, focusing on historical data from 1980 to 2016. The methodology incorporates Multiple Linear Regression (MLR) to identify the critical predictors, Johansen cointegration tests to explore the long-term relationships, and Vector Error Correction Models (VECMs) alongside Granger causality tests to analyze the dynamic interactions among variables. The analysis reveals consistent patterns in mean rainy days and rainfall amounts, indicating a relatively stable climate. In contrast, mean 24-hour temperatures show an upward trend, while mean 24-hour relative humidity exhibits a decline. The findings identify the mean rainfall amount and 24-hour relative humidity as significant predictors of paddy production. The advanced analytical techniques confirm two long-term cointegrating relationships among the variables. Granger causality tests reveal a bidirectional relationship between the mean rainfall amount and paddy production, suggesting mutual predictability. Conversely, the mean 24-hour relative humidity exhibits a unidirectional relationship, predicting paddy production but not vice versa. These findings underscore the critical role of climate variables, particularly rainfall and humidity, in shaping paddy cultivation outcomes in Malaysia.
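This workflow maps closely onto statsmodels; a minimal sketch with simulated stand-ins for the 1980–2016 annual series (variable names, deterministic terms, and lag choices are illustrative, not the paper's):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests
from statsmodels.tsa.vector_ar.vecm import VECM, coint_johansen

rng = np.random.default_rng(0)
trend = np.cumsum(rng.normal(size=37))                 # shared stochastic trend
df = pd.DataFrame({
    "paddy": trend + rng.normal(size=37),              # stand-ins for the annual series
    "rain": trend + rng.normal(size=37),
    "humidity": 0.5 * trend + rng.normal(size=37),
})

jres = coint_johansen(df, det_order=0, k_ar_diff=1)
print("trace statistics:", jres.lr1)                   # compare against jres.cvt
vecm = VECM(df, k_ar_diff=1, coint_rank=2).fit()       # two cointegrating relations
print("adjustment coefficients:\n", vecm.alpha)
gc = grangercausalitytests(df[["paddy", "rain"]], maxlag=2)  # does rain predict paddy?
```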
Oyebayo Ridwan Olaniran, Ali Rashash R. Alzahrani, Nada MohammedSaeed Alharbi, and Asma Ahmad Alzahrani
MDPI AG
Ensemble methods have proven highly effective in enhancing predictive performance by combining multiple models. We introduce a novel ensemble approach, the Random Generalized Additive Logistic Forest (RGALF), which integrates generalized additive models (GAMs) within a random forest framework to improve binary classification tasks. Unlike traditional random forests, which rely on piecewise constant predictions in terminal nodes, RGALF fits GAM logistic regression (LR) models to the data in each terminal node, enabling it to capture complex nonlinear relationships and interactions among predictors. By aggregating these node-specific GAMs, RGALF addresses multicollinearity, enhances interpretability, and achieves superior bias–variance tradeoffs, particularly in nonlinear settings. Theoretical analysis confirms that RGALF achieves Stone's optimal rates for additive models, O(n^(−2k/(2k+d))), under appropriate conditions, outperforming the slower convergence of traditional random forests, O(n^(−2/3)). Furthermore, empirical results demonstrate RGALF's effectiveness across both simulated and real-world datasets. In simulations, RGALF demonstrates superior performance over random forests (RFs), reducing variance by up to 69% and bias by 19% in nonlinear settings, with significant MSE improvements (0.032 vs. RF's 0.054 at n=1000), while achieving optimal convergence rates (O(n^(−0.48)) vs. RF's O(n^(−0.29))). On real-world medical datasets, RGALF attains near-perfect accuracy and AUC: 100% accuracy/AUC for Heart Failure and Hepatitis C (HCV) prediction, 99% accuracy/100% AUC for Pima Diabetes, and 98.8% accuracy/100% AUC for Indian Liver Patient (ILPD), outperforming state-of-the-art methods. Notably, RGALF captures complex biomarker interactions (BMI–insulin in diabetes) missed by traditional models.
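A minimal sketch of the "models in terminal nodes" construction; for brevity, plain logistic regressions stand in for the paper's terminal-node GAMs, and both classes are assumed to appear in every bootstrap sample:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def fit_model_forest(X, y, n_trees=25, seed=0):
    """Bootstrap trees whose leaves carry their own logistic models (X, y: numpy arrays)."""
    rng = np.random.default_rng(seed)
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), len(X))               # bootstrap sample
        tree = DecisionTreeClassifier(min_samples_leaf=40).fit(X[idx], y[idx])
        leaves = tree.apply(X[idx])
        models = {}
        for leaf in np.unique(leaves):
            mask = leaves == leaf
            if len(np.unique(y[idx][mask])) == 2:           # need both classes to fit
                models[leaf] = LogisticRegression(max_iter=1000).fit(
                    X[idx][mask], y[idx][mask])
        forest.append((tree, models))
    return forest

def predict_proba(forest, X):
    probs = []
    for tree, models in forest:
        leaves = tree.apply(X)
        p = tree.predict_proba(X)[:, 1]                     # fallback: leaf class proportion
        for leaf, model in models.items():
            mask = leaves == leaf
            if mask.any():                                  # leaf model overrides constant
                p[mask] = model.predict_proba(X[mask])[:, 1]
        probs.append(p)
    return np.mean(probs, axis=0)                           # aggregate across trees
```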
Oyebayo Ridwan Olaniran and Ali Rashash R. Alzahrani
MDPI AG
The pervasive challenge of missing data in scientific research forces a critical trade-off: discarding incomplete observations risks significant information loss, while conventional imputation methods struggle to maintain accuracy in high-dimensional settings. Although approaches like multiple imputation (MI) and random forest (RF) proximity-based imputation offer improvements over naive deletion, they exhibit limitations in complex missing data scenarios or sparse high-dimensional settings. To address these gaps, we propose a novel integration of Multiple Imputation by Chained Equations (MICE) with Bayesian Random Forest (BRF), leveraging MICE's iterative flexibility and BRF's probabilistic robustness to enhance imputation accuracy and downstream predictive performance. Our hybrid framework, BRF-MICE, uniquely combines the efficiency of MICE's chained equations with BRF's ability to quantify uncertainty through Bayesian tree ensembles, providing stable parameter estimates even under extreme missingness. We empirically validate this approach using synthetic datasets with controlled missingness mechanisms (MCAR, MAR, MNAR) and dimensionality, contrasting it against established methods, including RF and Bayesian Additive Regression Trees (BART). The results demonstrate that BRF-MICE achieves superior performance in classification and regression tasks, with 15–20% lower error under varying missingness conditions compared to RF and BART, while maintaining computational scalability. The method's iterative Bayesian updates effectively propagate imputation uncertainty, reducing overconfidence in high-dimensional predictions, a key weakness of frequentist alternatives.
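A minimal sketch of the chained-equations side, using scikit-learn's IterativeImputer with a (frequentist) random forest standing in for the Bayesian Random Forest component, which is not publicly available in scikit-learn:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 8))
X[rng.random(X.shape) < 0.2] = np.nan                      # 20% MCAR missingness

imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    sample_posterior=False,   # a BRF would instead draw imputations from a posterior
    max_iter=10,
    random_state=0,
)
X_imputed = imputer.fit_transform(X)
print("remaining NaNs:", np.isnan(X_imputed).sum())        # 0: all entries filled
```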
Alabi W. Banjoko, Waheed B. Yahya, and Oyebayo R. Olaniran
Elsevier BV
Oyebayo Ridwan Olaniran, Aliu Omotayo Sikiru, Jeza Allohibi, Abdulmajeed Atiah Alharbi, and Nada MohammedSaeed Alharbi
MDPI AG
This paper proposes a novel two-stage ensemble framework combining Long Short-Term Memory (LSTM) and Bidirectional LSTM (BiLSTM) networks with randomized feature selection to enhance diabetes prediction accuracy and calibration. The method first trains multiple LSTM/BiLSTM base models on dynamically sampled feature subsets to promote diversity, followed by a meta-learner that integrates their predictions into a final robust output. A systematic simulation study reveals that the feature selection proportion critically impacts generalization: mid-range values (0.5–0.8 for LSTM; 0.6–0.8 for BiLSTM) optimize performance, while values close to 1 induce overfitting. Furthermore, evaluation on three real-life benchmark datasets—Pima Indian Diabetes, Diabetic Retinopathy Debrecen, and Early Stage Diabetes Risk Prediction—reveals that the framework achieves state-of-the-art results, surpassing conventional methods (random forest, support vector machine) and recent hybrid frameworks with an accuracy of up to 100%, AUC of 99.1–100%, and superior calibration (Brier score: 0.006–0.023). Notably, the BiLSTM variant consistently outperforms unidirectional LSTM in the proposed framework, particularly in sensitivity (98.4% vs. 97.0% on the retinopathy data), highlighting its strength in capturing temporal dependencies.
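A minimal sketch of the two-stage design, BiLSTM base learners trained on random feature subsets followed by a logistic meta-learner; the architecture sizes, epochs, and the 0.7 selection proportion are illustrative rather than the paper's tuned values:

```python
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

def base_model(n_feats):
    # Treat the selected features as a length-n_feats sequence of scalars.
    return tf.keras.Sequential([
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(16), input_shape=(n_feats, 1)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

def fit_ensemble(X, y, n_base=5, prop=0.7, seed=0):
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(n_base):
        feats = rng.choice(X.shape[1], int(prop * X.shape[1]), replace=False)
        m = base_model(len(feats))                          # random feature subset
        m.compile(optimizer="adam", loss="binary_crossentropy")
        m.fit(X[:, feats, None], y, epochs=10, verbose=0)
        members.append((feats, m))
    meta_X = np.column_stack([m.predict(X[:, f, None], verbose=0).ravel()
                              for f, m in members])
    meta = LogisticRegression().fit(meta_X, y)              # stage-two meta-learner
    return members, meta
```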
Mohd Asrul Affendi Abdullah, Lai Jesintha, Gopal Pillay Khuneswari, Siti Afiqah Muhamad Jamil, and Oyebayo Ridwan Olaniran
Engineering, Technology & Applied Science Research
Model construction is of significant importance for the extraction of information from datasets and the prediction of responses based on predictor variables. The objective of this study is to compare the Multiple Regression (MR) and model averaging approaches in the context of missing data and to validate the effectiveness of the Multiple Imputation (MI) method used to address missing data issues. A comparison was performed between the results obtained from the multiple-imputed data and those derived from the Complete Case (CC) data, using a diabetes dataset from Hospital Besar Alor Setar. Prior to the application of MI and model building, k-fold cross-validation was employed to partition the dataset into a 90% training set, in which covariates were incomplete, and a 10% testing set with complete covariates. Subsequently, MI was applied to the 90% training dataset. Model M115, derived from the multiple-imputed data, was identified as the optimal model for MR. In the model averaging approach, two models were identified as optimal: Model 1 (without interaction variables) and Model 2 (with interaction variables). Model 1 exhibited the lowest values of Mean Square Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). These results indicate that model averaging, specifically Model 1, is the superior model-building approach for this study, demonstrating improved performance compared to MR and validating the effectiveness of the MI method.
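A minimal sketch of the reported order of operations: partition first, keep the complete cases for testing, and fit the imputer on the training portion only. A single imputation stands in for the study's multiple imputations, and the model-averaging step is omitted; data and names are illustrative:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(size=200)
X[rng.random(X.shape) < 0.15] = np.nan                     # scattered missingness

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)
complete = ~np.isnan(X_te).any(axis=1)                     # test on complete cases only
X_te, y_te = X_te[complete], y_te[complete]

imputer = IterativeImputer(random_state=0).fit(X_tr)       # imputation model: train only
model = LinearRegression().fit(imputer.transform(X_tr), y_tr)
print("test MSE:", np.mean((model.predict(X_te) - y_te) ** 2))
```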
Saidat Fehintola Olaniran, Oyebayo Ridwan Olaniran, Jeza Allohibi, and Abdulmajeed Atiah Alharbi
MDPI AG
Fractional cointegration in time series data has been explored by several authors, but panel data applications have been largely neglected. A previous study of ours found that, of the six tests considered, the Chen and Hurvich fractional cointegration test for time series was fairly robust to a moderate degree of heterogeneity across cross-sections. Therefore, this paper advances a customized version of the Chen and Hurvich methodology to detect cointegrating connections in panels with unobserved fixed effects. Specifically, we develop a test statistic that accommodates variation in the long-term cointegrating vectors and fractional cointegration parameters across observational units. The behavior of the proposed test is examined through extensive Monte Carlo experiments under various data-generating processes and circumstances. The findings reveal that our modified test performs quite well comparatively and can successfully identify fractional cointegrating relationships in panels, even in the presence of idiosyncratic disturbances unique to each cross-sectional unit. Furthermore, the proposed modified test procedure established the presence of a long-run equilibrium between the exchange rate and labor wage in the agricultural markets of 36 countries.
Oyebayo Ridwan Olaniran, Ali Rashash R. Alzahrani, and Mohammed R. Alzahrani
MDPI AG
This paper examines the distribution of eigenvalues of a 2×2 random confusion matrix used in machine learning evaluation. We also analyze the distributions of the matrix's trace and of the difference between the traces of two random confusion matrices. Furthermore, we demonstrate how these distributions can be applied to calculate the superiority probability of machine learning models. By way of example, we use the superiority probability to compare the accuracy of machine learning models on four disease-outcome prediction tasks.
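For reference, the closed form for a 2×2 matrix makes the connection between the trace and accuracy explicit; this is a standard linear-algebra identity, not a result specific to the paper:

```latex
M = \begin{pmatrix} \mathrm{TP} & \mathrm{FP} \\ \mathrm{FN} & \mathrm{TN} \end{pmatrix},
\qquad
\lambda_{\pm} = \frac{\operatorname{tr}(M) \pm \sqrt{\operatorname{tr}(M)^2 - 4\det(M)}}{2},
```

where tr(M) = TP + TN and det(M) = TP·TN − FP·FN. Since accuracy = tr(M)/n, the difference between the traces of two models' confusion matrices is n times their accuracy difference, which is the quantity the trace-difference distribution and the superiority probability summarize.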
Saidat Fehintola Olaniran, Oyebayo Ridwan Olaniran, Jeza Allohibi, Abdulmajeed Atiah Alharbi, and Mohd Tahir Ismail
MDPI AG
Asymptotic theories for fractional cointegration have been extensively studied in the context of time series data, with numerous empirical studies and tests having been developed. However, most previously developed testing procedures for fractional cointegration are designed primarily for time series data. This paper proposes a generalized residual-based test for fractionally cointegrated panels with fixed effects. The test's development is based on a bivariate panel series with the regressor assumed to be fixed across cross-sectional units. The proposed test procedure accommodates any integration order within [0,1] and is asymptotically normal under the null hypothesis. Monte Carlo experiments demonstrate that the test exhibits better size and power than a similar residual-based test across varying sample sizes.
Sena Alaeikhanehshir, Taiwo Ajayi, Frederieke H. Duijnhoven, Coralie Poncet, Ridwan O. Olaniran, Esther H. Lips, Laura J. van 't Veer, Suzette Delaloge, Isabel T. Rubio, Alastair M. Thompson, et al.
American Society of Clinical Oncology (ASCO)
PURPOSE A number of studies are currently investigating de-escalation of radiation therapy in patients with a low risk of in-breast relapse on the basis of clinicopathologic factors and molecular tests. We evaluated whether the 70-gene risk score is associated with risk of locoregional recurrence (LRR) and estimated 8-year cumulative incidences of LRR in patients with early-stage breast cancer treated with breast conservation. METHODS In this exploratory substudy of the European Organisation for Research and Treatment of Cancer 10041/BIG 03-04 MINDACT trial, we evaluated women with a known clinical and genomic 70-gene risk score test result who had breast-conserving surgery (BCS). The primary end point was LRR at 8 years, estimated by cumulative incidences. Distant metastasis and death were considered competing risks. RESULTS Among 6,693 enrolled patients, 5,470 (81.7%) underwent BCS, of whom 98% received radiotherapy. At 8-year follow-up, 189 patients had experienced an LRR, resulting in an 8-year cumulative incidence of 3.2% (95% CI, 2.7 to 3.7). In patients with a low-risk 70-gene signature, the 8-year LRR incidence was 2.7% (95% CI, 2.1 to 3.3). In univariable analysis, adjusted for chemotherapy, five of 12 variables were associated with LRR, including the 70-gene signature. In multivariable modeling, adjuvant endocrine therapy and, to a lesser extent, tumor size and grade remained significantly associated with LRR. CONCLUSION This exploratory analysis of the MINDACT trial estimated a low 8-year LRR rate of 3.2% after BCS. The 70-gene signature was not independently predictive of LRR, perhaps because of the low number of events observed, and currently cannot be used in clinical decision making regarding LRR. The overall low number of events does provide an opportunity to design trials toward de-escalation of local therapy.
Oyebayo Ridwan Olaniran and Ali Rashash R. Alzahrani
MDPI AG
Random forest (RF) is a widely used data prediction and variable selection technique. However, the variable selection aspect of RF can become unreliable when there are more irrelevant variables than relevant ones. In response, we introduced the Bayesian random forest (BRF) method, specifically designed for high-dimensional datasets with a sparse covariate structure. Our research demonstrates that BRF possesses the oracle property: it achieves strong selection consistency without compromising efficiency or unbiasedness.
Oyebayo Ridwan Olaniran and Mohd Asrul A. Abdullah
Elsevier BV
Jibril Abubakar, Mohd Asrul Affendi Abdullah, and Oyebayo Ridwan Olaniran
International Academic Press
The exponential, Weibull, log-logistic, and lognormal distributions represent the class of light- and heavy-tailed distributions often used in modelling time-to-event data. The exponential distribution is typically applied when the hazard is constant, while the log-logistic and lognormal distributions are mainly used for modelling unimodal hazard functions. The Weibull distribution, on the other hand, is well known for modelling monotonic hazard rates. In practice, however, survival data often exhibit both monotone and non-monotone hazards. This gap has necessitated the introduction of the Exponentiated Weibull Distribution (EWD), which can accommodate both monotonic and non-monotonic hazard functions, including unimodal and bathtub shapes. Estimating the parameters of the EWD poses a further problem, as this flexibility comes at the cost of an additional parameter. Maximum likelihood estimation has no closed-form solution, so approximation techniques such as the Newton–Raphson method are often used. In this paper, we therefore introduce an alternative estimation technique, the Variational Bayes (VB) approach. We considered the case of the accelerated failure time (AFT) regression model with covariates. The AFT model was developed using two comparative studies based on real-life and simulated data sets. The results from the experiments reveal that the VB approach outperforms both the competing Metropolis–Hastings algorithm and the reference maximum likelihood estimates.
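For context, the EWD adds an exponentiation parameter θ to the Weibull; this is the standard parameterization, and the paper's notation may differ:

```latex
F(t) = \left[1 - e^{-(t/\sigma)^{\alpha}}\right]^{\theta}, \qquad
h(t) = \frac{f(t)}{1 - F(t)}, \qquad t > 0, \ \alpha, \sigma, \theta > 0,
```

with θ = 1 recovering the ordinary Weibull. Depending on the values of α and θ, the hazard h(t) can be monotone increasing or decreasing, unimodal, or bathtub-shaped, which is the flexibility the abstract refers to.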
Oyebayo Ridwan Olaniran, Saidat Fehintola Olaniran, and Jumoke Popoola
Springer International Publishing
M. B. Mohammed, H. S. Zulkafli, N. Ali, O. R. Olaniran, and H. Ahmed
Informa UK Limited
Oyebayo Ridwan Olaniran
AIP Publishing
O R Olaniran and M A A Abdullah
IOP Publishing
In this study, the Variational Bayes (VB) approach was hybridized with the bootstrap prior procedure to improve the accuracy of subset selection and to reduce computation time when modelling high-dimensional genomic data with an inherently sparse structure. The new hybrid VB approach is shown to yield a minimal sufficient statistic which, under mild regularity conditions, converges to the true sparse structure. Simulation and real-life high-dimensional genomic data experiments revealed empirical performance comparable to competing frequentist and Bayesian methods. In addition, a new fast algorithm illustrating the procedure was developed and implemented as the R package "VBbootprior".
Jumoke Popoola, Waheed Babatunde Yahya, Olusogo Popoola, and Oyebayo Ridwan Olaniran
International Academic Press
Internet traffic data, such as the number of transmitted packets and the time spent transmitting Internet protocol (IP) packets, have been shown to exhibit self-similarity, which can include the long-memory property, particularly under heavy traffic. Simulating this type of dataset is an important aspect of delay-avoidance planning, especially when trying to mimic real-life processing of packets on the Internet. Most existing procedures assume the process follows a Gaussian distribution, and thus long-memory processes such as Fractional Brownian Motion (FBM) and Fractional Gaussian Noise (FGN), among others, are used. These approaches often result in estimation errors arising from the use of an inappropriate distribution, as it has been established that the distributions of Internet processes are heavy-tailed. Therefore, in this paper, a new method capable of generating heavy-tailed self-similar traffic is proposed based on the first-order autoregressive, AR(1), process. The proposed method is compared with some existing methods at varying values of the self-similarity index and sample sizes. The imposed self-similarity indices were estimated using the rescaled-range (R/S) statistic. Performance was assessed using absolute percentage errors. The results showed that the proposed method has a lower average error than the competing methods.
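A minimal sketch of the generating idea, an AR(1) recursion driven by heavy-tailed (Pareto) innovations with self-similarity gauged by the R/S statistic; the paper's exact generator and calibration are not reproduced, and all parameter values are illustrative:

```python
import numpy as np

def heavy_tailed_ar1(n, phi=0.9, alpha=1.5, seed=0):
    """AR(1) series with heavy-tailed Pareto innovations."""
    rng = np.random.default_rng(seed)
    eps = rng.pareto(alpha, n)            # heavy-tailed innovations
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + eps[t]
    return x

def rs_statistic(x):
    """Classical rescaled-range R/S for a single block."""
    dev = np.cumsum(x - x.mean())
    return (dev.max() - dev.min()) / x.std()

x = heavy_tailed_ar1(4096)
# Crude Hurst estimate: slope of log R/S against log block size.
sizes = [64, 128, 256, 512, 1024]
rs = [np.mean([rs_statistic(x[i:i + s]) for i in range(0, len(x) - s, s)])
      for s in sizes]
H = np.polyfit(np.log(sizes), np.log(rs), 1)[0]
print("Hurst estimate:", H)
```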
Oyebayo Ridwan Olaniran and Mohd Asrul Affendi Abdullah
Austrian Statistical Society
In this paper, the one-way ANOVA model and its application to Bayesian multi-class variable selection are considered. A full Bayesian bootstrap prior ANOVA test function is developed within the framework of parametric empirical Bayes. The test function is then used for variable screening in a multi-class classification scenario. Performance comparison between the proposed method and the existing classical ANOVA method was carried out using simulated and real-life gene expression datasets. The results revealed a lower false-positive rate and higher sensitivity for the proposed method.
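For orientation, the classical baseline here is per-gene one-way ANOVA screening; a minimal sketch follows, with the Bayesian bootstrap prior test function itself not reproduced and the simulated data purely illustrative:

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
n_genes = 1000
labels = np.repeat([0, 1, 2], 20)                          # three classes, 20 samples each
X = rng.normal(size=(60, n_genes))
X[labels == 2, :10] += 1.5                                 # 10 truly informative genes

# Classical screen: one-way ANOVA F-test per gene across the three classes.
pvals = np.array([
    f_oneway(*(X[labels == g, j] for g in range(3))).pvalue
    for j in range(n_genes)
])
selected = np.where(pvals < 0.05 / n_genes)[0]             # Bonferroni-corrected screen
print("selected genes:", selected)
```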