@astanait.edu.kz
Department of Computational and Data Science
Astana IT University
Information Systems, Multidisciplinary, Computer Science, Artificial Intelligence
Scopus Publications
Gaukhar Toktagulova and Aigul Mimenbayeva
IEEE
This study presents a comparative analysis of machine learning algorithms for forecasting harvest data in Kazakhstan. Leveraging extensive datasets spanning over three decades, our research evaluates the performance and predictive accuracy of various algorithms in forecasting potato yields—the predominant crop in the region. Through empirical analysis, Linear Regression, Lasso Regression, and Ridge Regression emerge as top-performing models, exhibiting R^2 scores exceeding 0.95. These linear models effectively capture the complex relationship between weather conditions and agricultural outcomes, offering valuable insights for agricultural planning and decision-making. Additionally, Gradient Boosting and ElasticNet algorithms demonstrate competitive performance, highlighting the potential of ensemble learning techniques in agricultural forecasting. Conversely, Support Vector Regression (SVR) exhibits poor performance in this context, emphasizing the importance of selecting appropriate algorithms tailored to the specific characteristics of agricultural datasets. Overall, our findings underscore the significance of employing advanced machine learning techniques to enhance the accuracy and reliability of harvest forecasts, thereby empowering stakeholders in the agricultural sector to make informed decisions and ensure food security in Kazakhstan.
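As an illustration of how R^2 scores like those quoted in the abstract above are computed, here is a minimal sketch of a single-feature least-squares fit and a ridge fit. It is not the study's pipeline: the rainfall and yield numbers are synthetic, and the names `fit_line`, `r2_score`, and `ridge_alpha` are hypothetical.

```python
# Hypothetical sketch: one-feature least squares and ridge fits scored with R^2.
# The data are synthetic and are NOT taken from the study.

def fit_line(x, y, ridge_alpha=0.0):
    """Fit y ~ intercept + slope * x; ridge_alpha > 0 shrinks the slope (ridge)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / (sxx + ridge_alpha)  # the intercept is left unpenalized
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot

# Assumed inputs: seasonal rainfall (mm) and yield (t/ha), invented for illustration.
rainfall = [210, 250, 190, 300, 280, 230, 260, 320]
yield_t_ha = [12.8, 14.3, 11.6, 16.6, 16.2, 13.5, 14.9, 18.1]

b0, b1 = fit_line(rainfall, yield_t_ha)
r2_ols = r2_score(yield_t_ha, [b0 + b1 * x for x in rainfall])

r0, r1 = fit_line(rainfall, yield_t_ha, ridge_alpha=10.0)
r2_ridge = r2_score(yield_t_ha, [r0 + r1 * x for x in rainfall])

print(f"OLS R^2 = {r2_ols:.3f}, ridge R^2 = {r2_ridge:.3f}")
```

On the training data the unpenalized fit can never score below the ridge fit, so a fair comparison between such models would use held-out data.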
Aigul Mimenbayeva, Maylen Omirtay, Rozamgul Niyazova, Gulmira Bekmagambetova, Raya Suleimenova, and Ainur Tursumbayeva
IEEE
The paper analyzes grain crop yields at LLP “North-Kazakhstan Agricultural Experimental Station” over the last 30 years and predicts future yields. A class of artificial neural networks, the multilayer perceptron (MLP), was used as the research tool. A set of 30 networks was trained on a 70% training and 30% control split of the samples. Based on an analysis of the residuals histogram and the scatter plot of target versus output values, the best neural network models were selected: MLP-1-7-1 (p=0.92), MLP-1-8-1 (p=0.97), and MLP-1-5-1 (p=0.82). The best network, MLP 1-8-1, was then tested on the given data. Using time series projection, predicted grain yield values for the coming years were plotted and tabulated. The absolute error showed that the best network, MLP 1-8-1, forecasts grain crop yield with high accuracy. The resulting neural network model can be applied in research on and monitoring of agricultural production development.
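The 1-8-1 architecture and the 70/30 split described above can be sketched as follows. This is a sketch under stated assumptions, not the authors' setup: the abstract does not name the activation function or training algorithm, so tanh hidden units and plain stochastic gradient descent are assumed here, and the 30-point series is synthetic.

```python
import math
import random

random.seed(0)

# Synthetic 30-point series scaled to [0, 1]; NOT the station's data.
years = [i / 29 for i in range(30)]
yields = [0.3 + 0.4 * t + 0.1 * math.sin(6 * t) for t in years]

# 70% training / 30% control split, as described in the abstract.
indices = list(range(30))
random.shuffle(indices)
train_idx, control_idx = indices[:21], indices[21:]

HIDDEN = 8  # MLP 1-8-1: one input, eight hidden units, one output
w1 = [random.uniform(-1, 1) for _ in range(HIDDEN)]  # input -> hidden weights
b1 = [0.0] * HIDDEN
w2 = [random.uniform(-1, 1) for _ in range(HIDDEN)]  # hidden -> output weights
b2 = 0.0

def forward(x):
    """Return hidden activations (tanh, assumed) and the linear output."""
    hidden = [math.tanh(w1[j] * x + b1[j]) for j in range(HIDDEN)]
    return hidden, sum(w2[j] * hidden[j] for j in range(HIDDEN)) + b2

def mse(idx):
    """Mean squared error over the samples selected by idx."""
    return sum((forward(years[i])[1] - yields[i]) ** 2 for i in idx) / len(idx)

initial_train_mse = mse(train_idx)

LEARNING_RATE = 0.05  # assumed value
for epoch in range(2000):
    for i in train_idx:
        x, target = years[i], yields[i]
        hidden, out = forward(x)
        err = out - target
        for j in range(HIDDEN):
            # Backpropagate through tanh before updating w2[j].
            grad_hidden = err * w2[j] * (1 - hidden[j] ** 2)
            w2[j] -= LEARNING_RATE * err * hidden[j]
            w1[j] -= LEARNING_RATE * grad_hidden * x
            b1[j] -= LEARNING_RATE * grad_hidden
        b2 -= LEARNING_RATE * err

final_train_mse = mse(train_idx)
control_mse = mse(control_idx)
print(f"train MSE {initial_train_mse:.4f} -> {final_train_mse:.4f}, control MSE {control_mse:.4f}")
```

Comparing the control error across candidate architectures such as 1-5-1, 1-7-1, and 1-8-1 is one way a best network could be chosen.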
Aigul Mimenbayeva, Samat Artykbayev, Raya Suleimenova, Gulnar Abdygalikova, Akgul Naizagarayeva, and Aisulu Ismailova
Private Company Technology Center
The clustering of normalized vegetation indices was studied in five areas of the North Kazakhstan region with a total area of 2565 hectares. A methodological approach to organizing the clustering process is proposed using the vegetation indices NDVI, MSAVI, ReCI, NDWI and NDRE, taking into account individual characteristics in the three main phases of spring wheat development. As a result of the research, vegetation indices were grouped into 3 classes using the k-means clustering method. The first cluster contained the maximum values of the vegetation indices, which occupied about 33.98 % of the total study area. NDVImax in the first cluster was found to be positively correlated with the soil-corrected vegetation index MSAVI and the crop moisture index NDMI (R^2 = 0.92). The second cluster is characterized by minimum values of NDVImax at the germination, tillering and ripening phases (from 0.53 to 0.55). The lowest values of vegetation indices occupied 35.9 % of the total field area in the germination phase, 37.9 % in the tillering phase, and 40.1 % in the ripening phase. The third cluster is characterized by average values of vegetation indices in all three phases. A correlation matrix was also constructed to assess the closeness of the relationship between actual yield and the NDVI vegetation index. The maximum coefficient was obtained at the germination phase, R = 0.94, with a minimum significance level of p = 0.018. The approach used in this study can be useful in the analysis of satellite data, as it can improve the sensitivity of the clustering procedure. From a practical point of view, the results make it possible to assess the condition of agricultural crops in the early stages of the growing season and, based on the results of cluster analysis, to improve their productivity.
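The grouping of index values into three classes described above can be sketched with the standard NDVI formula and a one-dimensional k-means. This is only an illustration: the reflectance pairs are invented, the helper `kmeans_1d` and its quantile-based seeding are assumptions, and the study clustered several indices rather than NDVI alone.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)

def kmeans_1d(values, k=3, iterations=50):
    """Lloyd's k-means on scalars; centroids are seeded at evenly spaced quantiles."""
    ordered = sorted(values)
    centroids = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return labels, centroids

# Hypothetical (NIR, Red) reflectance pairs for nine pixels; not the study's data.
pixels = [(0.45, 0.08), (0.50, 0.07), (0.48, 0.09),
          (0.30, 0.12), (0.32, 0.11), (0.28, 0.13),
          (0.20, 0.15), (0.22, 0.16), (0.18, 0.14)]

ndvi_values = [ndvi(nir, red) for nir, red in pixels]
labels, centroids = kmeans_1d(ndvi_values, k=3)

# Share of pixels per cluster, analogous to the area percentages in the abstract.
shares = [labels.count(c) / len(labels) for c in range(3)]
print(labels, [round(c, 3) for c in centroids], shares)
```

Because the centroids are seeded from sorted quantiles, cluster 0 holds the lowest NDVI values and cluster 2 the highest, which makes the labels directly readable as low, medium, and high vegetation classes.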
Aigul Mimenbayeva and Tamara Zhukabayeva
ACM
The article reviews the most popular free online resources, according to Earth Observing System research, that can be used to solve specific analytical problems in the study of geospatial images. The functions of services such as Earth Explorer, Land Viewer, EO Browser, Sentinel Playground, Copernicus, and INPE were analyzed, and 1488 hectares of meadows in the Irtysh area of Pavlodar region were surveyed on the EOS Land Viewer and EOS Crop Monitoring platforms.