We consider the problem of jointly training structured models for extraction from multiple web sources whose records enjoy partial content overlap. This has important applications in open-domain extra