JASIC Volume 4, Issue 1 (2023)

Contributor(s)

Yuma Erick, Chinecheremu Umezuruike, Nasasira Jossy, Balyejusa Gusite
 

Keywords

Machine Learning; Kakira Sugar Limited; Random Forest Regression; Decision Tree Regression; Machine Learning Algorithm; Multiple Linear Regression
 


Development of a machine learning regression model for accurate sugarcane crop yield prediction, Jinja – Uganda

Abstract: Sugarcane is one of the key crops grown worldwide and is used for sugar processing, food, alcohol, biogas, fertilizer, and other products. Sugarcane yields are often not predicted accurately, and this inaccuracy in turn affects yields. This research identifies the methods used for yield prediction and covers the design, development, and evaluation of three machine learning regression models for predicting sugarcane yields in Uganda. The research followed a data science methodology, and several machine learning algorithms for yield prediction were analyzed on the dataset. The collected dataset had one output (dependent) variable and eight independent variables. The prediction models were developed using the Multiple Linear Regression, Decision Tree Regression, and Random Forest Regression algorithms. Data from three years (2019, 2020, and 2022) were merged and split 80% to 20% for training and testing. After training, testing, and evaluation, the accuracies of the individual models were compared. The Multiple Linear Regression model achieved an accuracy of 76.5%, the Decision Tree Regression model 89.2%, and the Random Forest Regression model 94.6%, making Random Forest the best-performing model. The Random Forest model is reported to have a percentage improvement of 60.4%. Future research could develop a web-based machine learning model, apply deep learning methods to improve the model, and use more data to improve accuracy.
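
The comparison described in the abstract (three regression models, eight independent variables, an 80%/20% train/test split) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: it assumes scikit-learn, uses randomly generated synthetic data in place of the Kakira dataset, and assumes the reported "accuracy" corresponds to the R² score, which the abstract does not state. The helper names `make_synthetic_data` and `compare_models` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def make_synthetic_data(n_rows=600, n_features=8, seed=0):
    """Build a stand-in dataset: 8 independent variables, 1 dependent variable.

    ASSUMPTION: the data are synthetic and have no relation to the
    real sugarcane records used in the paper.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_rows, n_features))
    # A partly nonlinear target, so that the models can differ in fit.
    y = (
        50.0
        + 30.0 * X[:, 0]
        + 20.0 * X[:, 1] * X[:, 2]
        + 10.0 * np.sin(6.0 * X[:, 3])
        + rng.normal(0.0, 1.0, size=n_rows)
    )
    return X, y


def compare_models(X, y, seed=0):
    """Fit the three regressors on an 80/20 split and return test R^2 scores.

    ASSUMPTION: R^2 is used as the evaluation metric; the abstract only
    says "accuracy".
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    models = {
        "Multiple Linear Regression": LinearRegression(),
        "Decision Tree Regression": DecisionTreeRegressor(random_state=seed),
        "Random Forest Regression": RandomForestRegressor(
            n_estimators=100, random_state=seed
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


X, y = make_synthetic_data()
scores = compare_models(X, y)
for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

The scores printed by this sketch depend entirely on the synthetic data and should not be compared with the 76.5%, 89.2%, and 94.6% figures reported above.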