Averaging Support Vector Machines for Processing Large Data Sets
Support vector machines (SVMs) (Vapnik, 1998) with a nonlinear kernel handle large data sets poorly, because the numerical solution techniques for the underlying optimisation problem scale superlinearly with the size of the data set. The problem is particularly severe when the kernel matrix no longer fits into main memory, so that kernel evaluations on the data points have to be recomputed repeatedly. We investigate a simple approach that allows larger data sets to be processed: we split the large data set into a number of smaller ones, each small enough for its kernel matrix to be cached, and train a support vector machine on each of them. To evaluate a data point, we then simply average the results of the different SVMs.
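The split-train-average scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the names `rbf_kernel`, `train_svm`, `split_into_subsets` and `train_averaged_svm` are hypothetical, the solver is a plain dual coordinate ascent for a bias-free SVM with a Gaussian kernel, the subsets are formed by a random split, and "averaging the results" is read here as averaging the real-valued decision values.

```python
import math
import random


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def train_svm(X, y, C=1.0, gamma=1.0, epochs=50):
    """Train a bias-free kernel SVM on one subset (assumed solver: dual coordinate ascent).

    The full kernel matrix of the subset is computed once and cached,
    which is feasible only because the subset is small.
    """
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            # Decision value at training point i under the current dual variables.
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact maximiser of the dual along coordinate i, clipped to [0, C].
            alpha[i] = min(C, max(0.0, alpha[i] + (1.0 - y[i] * f_i) / K[i][i]))
    # Only points with a positive dual variable (support vectors) are kept.
    support = [(alpha[i] * y[i], X[i]) for i in range(n) if alpha[i] > 0.0]

    def decision(x):
        return sum(w * rbf_kernel(sv, x, gamma) for w, sv in support)

    return decision


def split_into_subsets(X, y, n_subsets, seed=0):
    """Randomly partition the data into n_subsets parts of nearly equal size."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    parts = [idx[k::n_subsets] for k in range(n_subsets)]
    return [([X[i] for i in p], [y[i] for i in p]) for p in parts]


def train_averaged_svm(X, y, n_subsets, **svm_params):
    """Train one SVM per subset and return the average of their decision functions."""
    machines = [train_svm(Xs, ys, **svm_params)
                for Xs, ys in split_into_subsets(X, y, n_subsets)]

    def decision(x):
        return sum(m(x) for m in machines) / len(machines)

    return decision


if __name__ == "__main__":
    # Toy data: two well-separated Gaussian blobs with labels +1 and -1.
    rng = random.Random(1)
    X = [(rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)) for _ in range(30)]
    X += [(rng.gauss(-2.0, 0.5), rng.gauss(-2.0, 0.5)) for _ in range(30)]
    y = [1] * 30 + [-1] * 30
    f = train_averaged_svm(X, y, n_subsets=3, C=1.0, gamma=0.5, epochs=20)
    for point in [(2.0, 2.0), (-2.0, -2.0)]:
        print(point, "->", 1 if f(point) > 0 else -1)
```

In this sketch each SVM only ever touches the kernel matrix of its own subset, so with m subsets of a data set of size n the largest matrix held in memory has (n/m)² entries rather than n². The predicted class of a point is the sign of the averaged decision value.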