Data Structures in IR (DSIR)
The course presents an overview of theoretical and practical approaches to the implementation of information retrieval (IR) systems. It focuses mainly on classic large-scale search problems but also includes a brief description of structures applicable to other IR tasks. The course covers a wide range of questions, from a high-level theoretical view of data structure design to specific implementation issues. It addresses important practical problems that are poorly covered in the available educational literature, such as parallelization, lossy compression techniques, and relevant features of modern hardware. The course also discusses the implementations of well-known open source and commercial systems. Some of the examples considered are based on the lecturer’s practical experience from IR system development projects. The course may interest students who want to learn the details of IR system implementation or how to tailor existing systems to a specific data scale or IR task. It was presented at internal seminars for employees at Ask.com in 2007 and 2008.