Enriching a Thesaurus to Improve Retrieval of Audiovisual Documents
In many archives of audiovisual documents, annotation and retrieval rely on metadata drawn from a structured vocabulary or thesaurus. In practice, many of these thesauri have limited or no structure. This paper investigates whether retrieval of audiovisual resources from a collection indexed with an in-house thesaurus can be improved by anchoring that thesaurus to an external, semantically richer resource. We propose a method to enrich the structure of a thesaurus and investigate its added value for retrieval. We first anchor the thesaurus to an external resource, WordNet. From this anchoring we infer relations between pairs of thesaurus terms that were previously unrelated. We then employ the enriched thesaurus in a retrieval experiment on a TRECVID 2007 data set. The results are promising: with simple techniques we are able to enrich a thesaurus in such a way that it improves retrieval performance.
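The two enrichment steps named in the abstract, anchoring and relation inference, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the miniature lexicon `LEMMA_TO_SYNSETS` and the `HYPERNYMS` map are a hand-made stand-in for WordNet, anchoring is reduced to a case-insensitive exact lemma match, and a broader/narrower relation is inferred whenever one anchored term's synset lies on the hypernym path of another's. The paper's actual anchoring and inference rules may differ.

```python
from itertools import permutations

# Hypothetical miniature stand-in for WordNet (not the real resource):
# lemma -> synset ids, and synset id -> direct hypernym synset ids.
LEMMA_TO_SYNSETS = {
    "dog": ["dog.n.01"],
    "canine": ["canine.n.02"],
    "animal": ["animal.n.01"],
    "car": ["car.n.01"],
    "vehicle": ["vehicle.n.01"],
}
HYPERNYMS = {
    "dog.n.01": ["canine.n.02"],
    "canine.n.02": ["animal.n.01"],
    "car.n.01": ["vehicle.n.01"],
}


def anchor(terms):
    """Anchor thesaurus terms to synsets by case-insensitive exact match (assumed rule)."""
    anchored = {}
    for term in terms:
        synsets = LEMMA_TO_SYNSETS.get(term.lower())
        if synsets:
            anchored[term] = set(synsets)
    return anchored


def ancestors(synset):
    """Return all synsets reachable from `synset` by following hypernym links."""
    seen = set()
    stack = [synset]
    while stack:
        for parent in HYPERNYMS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_broader(anchored):
    """Infer (narrower, broader) pairs between anchored thesaurus terms."""
    relations = set()
    for narrow, broad in permutations(anchored, 2):
        # `broad` is broader if one of its synsets is an ancestor of a synset of `narrow`.
        if any(anchored[broad] & ancestors(s) for s in anchored[narrow]):
            relations.add((narrow, broad))
    return relations


# Hypothetical flat in-house thesaurus with no relations between its terms.
thesaurus_terms = ["Dog", "Animal", "Car", "Vehicle", "Newsreader"]
anchored = anchor(thesaurus_terms)
relations = infer_broader(anchored)
print(sorted(relations))
```

In this toy setup "Newsreader" finds no anchor and therefore gains no relations, while "Dog" is linked to "Animal" even though the intermediate concept "canine" is not a thesaurus term. That shows how relations between previously unrelated terms can be derived through the external resource alone.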