Predicting poverty using regression

Poverty reduction is the first Sustainable Development Goal set by the United Nations, to be achieved by 2030, but current data indicate that progress is insufficient. The diverse factors that influence poverty across nations make it difficult to develop effective predictive models. This paper evaluates several regression models for predicting poverty rates using a comprehensive dataset of 111 variables from sources such as the UN and the World Bank. The data, which span domains such as political stability, education, and economic conditions, were preprocessed and transformed to create auxiliary features and interactions. Among the models evaluated, Ridge regression yielded the best results, achieving a Root Mean Square Error (RMSE) of 3.6, which indicates high predictive accuracy on a global scale. This study highlights the importance of addressing multicollinearity and of incorporating a wide range of features to improve the generalizability of poverty prediction models. Future research should explore more complex methods, such as neural networks, and refine model hyperparameters for enhanced performance.
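
To make the two technical ingredients named in the abstract concrete, the following is a minimal sketch of Ridge regression and the RMSE metric. It is not the paper's pipeline: the data are synthetic, the helper names (`fit_ridge`, `predict`, `rmse`), the penalty strength `alpha=1.0`, and the train/test split are all assumptions made for illustration. One feature is deliberately a near-copy of another to mimic the multicollinearity the abstract refers to.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge fit on standardized features (hypothetical helper).

    The intercept is handled by centering y, so it is not penalized.
    """
    mean, std = X.mean(axis=0), X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Xs = (X - mean) / std
    y_mean = y.mean()
    # Solve (Xs^T Xs + alpha * I) w = Xs^T (y - y_mean)
    gram = Xs.T @ Xs + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xs.T @ (y - y_mean))
    return {"mean": mean, "std": std, "coef": coef, "intercept": y_mean}

def predict(model, X):
    """Apply the stored standardization, then the linear model."""
    Xs = (X - model["mean"]) / model["std"]
    return Xs @ model["coef"] + model["intercept"]

def rmse(y_true, y_pred):
    """Root Mean Square Error: sqrt of the mean squared residual."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic stand-in data (NOT the paper's 111-variable dataset).
rng = np.random.default_rng(0)
n = 200
base = rng.normal(size=(n, 3))
# The fourth column is almost identical to the first: multicollinearity.
X = np.column_stack([base, base[:, 0] + 0.01 * rng.normal(size=n)])
y = 20 + 5 * base[:, 0] - 3 * base[:, 1] + rng.normal(scale=2.0, size=n)

# Simple hold-out split (an assumption; the paper's protocol is not given here).
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

model = fit_ridge(X_train, y_train, alpha=1.0)
test_rmse = rmse(y_test, predict(model, X_test))
print(f"held-out RMSE: {test_rmse:.2f}")
```

The penalty term `alpha * I` keeps the linear system well conditioned even when columns are nearly collinear, which is the usual motivation for preferring Ridge over ordinary least squares in this setting. RMSE is expressed in the same units as the target, so a reported value is only interpretable alongside the scale of the poverty measure being predicted.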