Clustering Data: Does Theory Help?
The problem of clustering data has received attention in many fields. Theoretical Computer Science has brought powerful ideas to bear on finding nearly optimal clusterings. In Statistics, mixture models have been useful for understanding the structure of data and for developing algorithms. In practice, many heuristics (e.g., dimension reduction, the k-means algorithm) are widely used. The talk will describe some aspects of the first two and attempt to answer the question: Is there a happy marriage of these with practice?