Retrieve (and Leverage) the Inner Graph Behind the Data

Nov 13, 2024

Session Chair: Gianluca Demartini

Many challenging data integration problems, particularly in data journalism, involve heterogeneity at the level of both the schema and the data model. To overcome this heterogeneity, we have shown how data from many (semi)structured models can be converted into fine-grained graphs, then enriched and densified with the help of information extraction. Such fine-grained graphs, however, are hard for non-technical users to grasp. To help them get acquainted with a dataset, we devised an abstraction method that identifies, in fine-grained data graphs, structured objects with an internal structure, and the relationships between them. Given a semistructured dataset, we automatically produce an Entity-Relationship-style diagram; in contrast with traditional E-R models, our entities may feature deep nesting, reflecting the nested and possibly recursive structure present in some data models. We thus obtain an automated way of "rescuing" the conceptual model, which we argue is best viewed as a graph, behind any application dataset. We then describe automatic techniques for finding the most interesting paths connecting entities in a dataset.
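To make the idea of a fine-grained graph concrete, here is a minimal Python sketch. It is not the speakers' system. It assumes a JSON-like input, gives every object and array its own node, and lets equal leaf values share a single node so that records connect through common values. The function names `to_graph` and `shortest_path` are hypothetical, and a plain breadth-first shortest path stands in for the "most interesting paths" techniques mentioned in the abstract, which the abstract does not detail.

```python
from collections import deque
from itertools import count


def to_graph(data):
    """Turn nested dicts/lists into an undirected, edge-labelled adjacency map.

    Objects and arrays get fresh ids ("n0", "n1", ...). Leaves are keyed by
    their string value, so equal values share one node (an assumption of
    this sketch, not something stated in the abstract).
    """
    adj = {}
    ids = count()

    def link(a, b, label):
        # Store the edge in both directions so paths can be walked either way.
        adj.setdefault(a, []).append((b, label))
        adj.setdefault(b, []).append((a, label))

    def visit(value):
        if isinstance(value, dict):
            node = f"n{next(ids)}"
            adj.setdefault(node, [])
            for key, child in value.items():
                link(node, visit(child), key)
            return node
        if isinstance(value, list):
            node = f"n{next(ids)}"
            adj.setdefault(node, [])
            for child in value:
                link(node, visit(child), "item")
            return node
        leaf = f"v:{value}"
        adj.setdefault(leaf, [])
        return leaf

    root = visit(data)
    return root, adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the list of nodes on a path, or None."""
    if start not in adj or goal not in adj:
        return None
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour, _label in adj[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None
```

Calling `to_graph` on a small nested record and then `shortest_path` between two leaf nodes (for example `"v:Budget vote"` and `"v:Tax reform"`) returns the chain of object and value nodes that connects them. A real abstraction step would then group such nodes into entities and relationships, which this sketch does not attempt.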

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.