In this paper, we discuss the problem of feature selection for supervised learning from the standpoint of statistical machine learning. We ask which subset of features will lead to the best classifi