Empirical Analysis of Ranking Models for an Adaptable Dataset Search

Currently available datasets still have large unexplored potential for interlinking. Ranking techniques contribute to this task by scoring datasets according to the likelihood of finding entities related to those of a target dataset. Ranked datasets can either be manually selected for standalone link discovery tasks or automatically inspected by programs that traverse the ranking looking for entity links. In the first case, users typically choose the datasets that seem most appropriate among those at the top of the ranking and rarely examine the entire ranking exhaustively. Automated processes, on the other hand, scan every dataset in a whole slice at the top of the ranking. Metrics such as nDCG better capture how well rankings meet users' expectations of finding the most relevant datasets at the very top of the ranking. Automated processes, in contrast, benefit most from rankings with greater recall of datasets with related entities throughout the entire slice traversed. In this case, Recall at Position k better discriminates between ranking models. This work presents empirical comparisons between different ranking models and argues that different algorithms could be used depending on whether the ranking is manually or automatically handled and also on the available metadata of the datasets. Experiments indicate that the ranking algorithms that performed best with nDCG do not always have the best Recall at Position k at high recall levels. Under the automatic perspective, the best algorithms may find the same number of datasets with related entities by inspecting a slice of the ranking at least 40\% smaller. Under the manual perspective, the best algorithms may increase nDCG by 5-20\%, depending on the set of features.
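To make the contrast between the two metrics concrete, the following minimal sketch computes nDCG and Recall at Position k for two toy rankings. It assumes binary relevance (1 if a dataset contains related entities, 0 otherwise) and the common linear-gain, log2-discount form of DCG; the exact variants used in the experiments are not stated here. The function names and the two example rankings are hypothetical and chosen only for illustration.

```python
import math


def dcg(gains):
    """Discounted cumulative gain with linear gain and log2 discount (assumed variant)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))


def ndcg_at_k(gains, k):
    """DCG of the top-k slice divided by the DCG of the ideal top-k ordering."""
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0


def recall_at_k(gains, k):
    """Fraction of all relevant datasets that appear in the top-k slice."""
    total = sum(1 for g in gains if g > 0)
    found = sum(1 for g in gains[:k] if g > 0)
    return found / total if total else 0.0


# Hypothetical rankings over the same 8 candidate datasets, 3 of them relevant.
ranking_a = [1, 1, 0, 0, 0, 0, 0, 1]  # two hits at the very top, one at the bottom
ranking_b = [0, 1, 1, 1, 0, 0, 0, 0]  # misses position 1, but all hits within the top 4

for name, ranking in (("A", ranking_a), ("B", ranking_b)):
    print(name, "nDCG@8:", round(ndcg_at_k(ranking, 8), 3),
          "Recall@4:", round(recall_at_k(ranking, 4), 3))
```

On this toy input, ranking A scores higher on nDCG because its hits sit at the very top, while ranking B reaches full recall within the first four positions and ranking A needs all eight. This mirrors the distinction drawn above: a user picking from the top would prefer A, whereas an automated process scanning a fixed slice would prefer B.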