Talks @ DI

Generating and Ranking Candidate Data Models from Background Knowledge

Broadcast via videoconference

By Daniela Oliveira.

Abstract: Knowledge Graphs have emerged as a core technology to aggregate and publish knowledge on the Web. However, integrating knowledge from different sources that were not specifically designed to be interoperable is not a trivial task. Finding the right ontologies to model a dataset is a challenge, since several valid data models exist and there is no clear agreement between them. In this presentation, I will describe a framework that facilitates the selection of a data model by generating and ranking candidates for entity types and for the properties associated with these types. These candidates are obtained by aggregating freely available RDF datasets in a Knowledge Graph and then enriching it using ontology matching techniques. The entity type candidates are obtained by exploiting the content and graph structure of this Knowledge Graph to compute a score that considers both the accuracy and the interoperability of the candidates. Matching properties are predicted with a Random Forest model, trained on the Knowledge Graph properties, that generates candidates from its predictions and ranks them according to different measures. I will present experiments on two use cases, libraries and biomedicine, and show that the methodology can produce meaningful candidate data models that can be adapted depending on the use case.

Bio: Daniela Oliveira is currently an Invited Assistant Professor in the Department of Informatics of the Faculty of Sciences at the University of Lisbon. Her background includes a BSc in health sciences and an MSc in bioinformatics. Daniela's current research focuses on general data integration to enable the publication of meaningful knowledge on the Web, with a special interest in the automation of health data to generate new knowledge in the domain.

Zoom: https://videoconf-colibri.zoom.us/j/85733155604

14h30
Departamento de Informática