Resampling Approaches for Handling Imbalanced Regression Tasks

Imbalanced classification tasks have long been studied by the research community. Numerous problems have been identified with standard approaches, and new proposals have been put forward to address these tasks. Surprisingly, the same attention has not been given to predictive tasks with a numeric target variable, i.e. regression. Yet similar problems occur in these domains when the end user's goal is performance on a subset of rare values of the target variable. As in classification, standard evaluation metrics fail, and new approaches are required to bias the learning algorithms towards the end user's goals. In this talk we will present resampling approaches to these problems. Their main advantage is that they can be used together with any existing regression tool while still focusing on the goals of the end user, i.e. performance on values poorly represented in the training data.
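
As a rough illustration of the general idea, the sketch below shows one simple form of resampling for regression: random under-sampling of cases whose target value is common, keeping every rare case. It is not necessarily one of the methods presented in the talk. The function name `undersample_common`, the `is_rare` predicate, the threshold of 5.0 and the `keep_ratio` parameter are all hypothetical choices made for this example.

```python
import random


def undersample_common(rows, targets, is_rare, keep_ratio=0.5, seed=0):
    """Keep every rare case and a random share of the common ones.

    `is_rare` is a user-supplied predicate on the target value
    (an assumption of this sketch, standing in for the end user's goals).
    """
    rng = random.Random(seed)
    rare = [i for i, y in enumerate(targets) if is_rare(y)]
    common = [i for i, y in enumerate(targets) if not is_rare(y)]
    n_keep = round(len(common) * keep_ratio)
    # Sort the kept indices so the original case order is preserved.
    kept = sorted(rare + rng.sample(common, n_keep))
    return [rows[i] for i in kept], [targets[i] for i in kept]


# Toy data: most targets are near 1.0, two are rare high values.
targets = [1.0, 1.1, 0.9, 1.2, 1.0, 9.5, 1.05, 0.95, 10.2, 1.1]
rows = [[float(i)] for i in range(len(targets))]

# Hypothetical notion of rarity: target values above 5.0.
new_rows, new_targets = undersample_common(
    rows, targets, is_rare=lambda y: y > 5.0, keep_ratio=0.5
)
```

The resampled `new_rows` and `new_targets` can then be passed to any regression learner unchanged, which is the property the abstract highlights: the change happens in the training data, not in the learning algorithm.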