Processing Linked Data at Warp Speed

Oct 19, 2014 · 1841 views

The Web of Data has grown immensely over the past years. From only one dataset in 2007, the linked portion of the Open Data Cloud grew to over 31 billion triples by 2011. That is the part usually shown in the diagrams; a plethora of open datasets published by individuals, organizations and governments all over the world is usually not shown. Given this immense growth, the question arises of how to process these data. Even at 10,000 triples per second, processing the whole cloud would still take more than 861 hours (31 billion triples / 10,000 triples per second = 3.1 million seconds), so algorithms that travel (or traverse) the linked data cloud using conventional methods are going to be slow. In this talk I will present two methods for processing large numbers of triples. First, I will introduce the distributed graph-processing framework Signal/Collect, which can process billions of edges in seconds, and highlight its usefulness in three application scenarios. Second, I will briefly touch upon the needs and challenges of processing large graphs as data streams, where the data itself is not stored and only the portions necessary for processing are kept.
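
To give a rough feel for the vertex-centric style of graph processing the abstract refers to, here is a minimal single-machine sketch in Python. It is not the framework's API (the real framework is distributed and its interface differs). It only illustrates the general idea, assumed here, that vertices send signals along their edges and each vertex collects incoming signals to update its own state. The function name `sssp_signal_collect`, the `(src, dst, weight)` edge format and the example graph are all hypothetical, and single-source shortest paths was picked purely as a familiar illustration.

```python
import math
from collections import defaultdict


def sssp_signal_collect(edges, source, max_steps=100):
    """Single-source shortest paths in a signal/collect style (illustrative only).

    edges: iterable of (src, dst, weight) tuples with non-negative weights.
    Returns a dict mapping each vertex to its distance from `source`.
    """
    out_edges = defaultdict(list)
    state = {}
    for src, dst, weight in edges:
        out_edges[src].append((dst, weight))
        state.setdefault(src, math.inf)
        state.setdefault(dst, math.inf)
    state[source] = 0

    # Only vertices whose state changed need to signal in the next step.
    active = {source}
    for _ in range(max_steps):
        if not active:
            break
        # Signal phase: each active vertex sends (its state + edge weight)
        # along every outgoing edge.
        inbox = defaultdict(list)
        for vertex in active:
            for dst, weight in out_edges[vertex]:
                inbox[dst].append(state[vertex] + weight)
        # Collect phase: each receiving vertex keeps the smallest value seen.
        active = set()
        for vertex, signals in inbox.items():
            new_state = min(state[vertex], min(signals))
            if new_state < state[vertex]:
                state[vertex] = new_state
                active.add(vertex)
    return state


# Hypothetical example graph; "e" has no incoming path from "a".
example_edges = [
    ("a", "b", 1),
    ("b", "c", 2),
    ("a", "c", 5),
    ("c", "d", 1),
    ("e", "a", 1),
]
distances = sssp_signal_collect(example_edges, "a")
```

Because each vertex only needs its own state and the signals it receives, the two phases can in principle be spread across many workers, which is what makes this style attractive for graphs with billions of edges.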

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.