Virtual Statistics Knowledge Graph Generation from CSV files

  1. Chaves-Fraga, David
  2. Priyatna, Freddy
  3. Santana-Perez, Idafen
  4. Corcho, Oscar
Book:
Emerging Topics in Semantic Technologies. Vol. 36

Publisher: IOS Press

Year of publication: 2018

Pages: 235-244

Type: Book chapter

DOI: 10.3233/978-1-61499-894-5-235 (open access via the publisher)

Abstract

Statistical data is often published in tabular form by statistics offices and governmental agencies. In recent years, many of these institutions have sought to release their data in an interoperable way by means of semantic technologies. Existing approaches normally employ ad hoc techniques to transform this tabular data into a Statistics Knowledge Graph (SKG) and materialize it. This approach requires periodic maintenance to keep the dataset and the transformation result synchronized. With R2RML, the W3C-recommended mapping language, a virtual SKG can be generated instead, thanks to the capabilities of R2RML processors. However, because the size of an R2RML mapping document depends on the number of columns in the tabular data and the number of dimensions to be generated, the document may become prohibitively large, which hinders its maintenance. In this paper we propose an approach to reduce the size of the mapping document by extending RMLC, a mapping language for tabular data. We provide a mapping translator from RMLC to R2RML and a comparative analysis over two real statistics datasets.
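
As a rough illustration of why the mapping size depends on the number of columns, the following Python sketch emits one R2RML TriplesMap per measure column of a wide table. It is not the authors' translator and does not use RMLC syntax. The table name `STATS`, the `region` dimension column, the year columns, and the `http://example.org/` namespace are all hypothetical; only the `rr:` (R2RML) and `qb:` (RDF Data Cube) terms come from the published vocabularies.

```python
# Hypothetical wide statistics table: one dimension column ("region")
# plus one measure column per year. All names here are illustrative assumptions.
MEASURE_COLUMNS = [f"y{year}" for year in range(2010, 2018)]

PREFIXES = (
    "@prefix rr: <http://www.w3.org/ns/r2rml#> .\n"
    "@prefix qb: <http://purl.org/linked-data/cube#> .\n"
    "@prefix ex: <http://example.org/> .\n"
)


def triples_map(column: str, table: str = "STATS") -> str:
    """Return one R2RML TriplesMap (as Turtle text) for a single measure column."""
    return (
        f"<#Obs_{column}> a rr:TriplesMap ;\n"
        f'  rr:logicalTable [ rr:tableName "{table}" ] ;\n'
        f"  rr:subjectMap [\n"
        # {{region}} is an R2RML template placeholder filled per row at query time.
        f'    rr:template "http://example.org/obs/{{region}}/{column}" ;\n'
        f"    rr:class qb:Observation\n"
        f"  ] ;\n"
        f"  rr:predicateObjectMap [\n"
        f"    rr:predicate ex:value ;\n"
        f'    rr:objectMap [ rr:column "{column}" ]\n'
        f"  ] .\n"
    )


def build_mapping(columns) -> str:
    """Concatenate the prefixes and one TriplesMap per column."""
    return PREFIXES + "\n" + "\n".join(triples_map(c) for c in columns)


mapping = build_mapping(MEASURE_COLUMNS)
triples_map_count = mapping.count("a rr:TriplesMap")
print(triples_map_count, "TriplesMaps,", len(mapping.splitlines()), "lines")
```

In this sketch every additional column adds another near-identical TriplesMap, so the document grows linearly with table width. That repetition is the kind of redundancy a more compact mapping language would aim to factor out.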