The influence of weighting the k-occurrences on hubness-aware classification methods

Hubness is a phenomenon present in many high-dimensional data sets. It is related to the skewness in the distribution of k-occurrences, i.e., occurrences of data points in the k-neighbor sets of other data points. Several hubness-aware methods that exploit this phenomenon have recently been proposed. In this paper, we examine the potential impact of weighting the k-occurrences by the distance between the respective data points on hubness-aware nearest-neighbor methods, specifically hw-kNN, h-FNN and HIKNN. We show that such distance-based weighting can be both advantageous and detrimental, and that it influences different methods in different ways.
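To make the notion of k-occurrences concrete, the following is a minimal Python sketch that counts how often each point appears in the k-neighbor sets of the other points, with an optional distance-weighted variant. The function name `k_occurrences`, the use of Euclidean distance, and the weight 1 / (1 + d) are illustrative assumptions; the abstract does not specify the weighting scheme actually used in the paper.

```python
import numpy as np


def k_occurrences(X, k, weighted=False):
    """Return the k-occurrence score of every point in X.

    Unweighted: the number of other points that have this point among
    their k nearest neighbors.
    Weighted: each such occurrence contributes 1 / (1 + d), where d is the
    Euclidean distance between the two points. This weight is an
    illustrative assumption, not the scheme used in the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise Euclidean distances; a point is never its own neighbor.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)

    # Indices of the k nearest neighbors of each point.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    scores = np.zeros(n)
    for i in range(n):
        for j in neighbors[i]:
            scores[j] += 1.0 / (1.0 + dist[i, j]) if weighted else 1.0
    return scores


if __name__ == "__main__":
    # Toy one-dimensional data set, chosen so that no distances tie.
    X = [[0.0], [1.0], [2.5], [10.0]]
    print(k_occurrences(X, k=1))
    print(k_occurrences(X, k=1, weighted=True))
```

In the unweighted case the scores always sum to n times k, so a skewed distribution means that a few points (hubs) absorb many occurrences while others receive none. With weighting, occurrences in the neighbor lists of distant points contribute less to the score.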