Data variety is a key challenge of big data. This is particularly evident in drug discovery, where data not only stems from multiple sources but is also of multiple heterogeneous types (e.g. d