Automating Relational Database Schema Design for Very Large Semantic Datasets

Abstract

Many semantic (RDF) datasets are very large but have no predefined data structure. Triple stores are commonly used as RDF databases, yet they cannot achieve good query performance on large datasets owing to excessive self-joins. Recent research has proposed storing RDF data in column-based databases, yet some studies have shown that this approach does not scale with the number of predicates. A third common approach is to organize an RDF dataset into different tables in a relational database. Multiple “correlated” predicates are maintained in the same table, called a property table, so that table joins are not needed for queries that involve only the predicates within that table. The main challenge for the property table approach is that it is infeasible to manually design good schemas for the property tables of a very large RDF dataset. We propose a novel data-mining technique called Attribute Clustering by Table Load (ACTL) that clusters a given set of attributes into corre
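The contrast between a triple store and a property table can be illustrated with a small sketch. It is not taken from the paper: the table names (`triples`, `book_properties`), the predicates (`title`, `author`), and the sample rows are all hypothetical, and SQLite is used only because it ships with Python. The sketch shows that retrieving two predicates of the same subject requires a self-join in the triple layout but none in the property table layout.

```python
import sqlite3

# Hypothetical toy data; names and values are assumptions for illustration only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Layout 1: a triple store keeps every statement as one (subject, predicate, object) row.
cur.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
cur.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("book1", "title", "Dune"),
        ("book1", "author", "Herbert"),
        ("book2", "title", "Emma"),
        ("book2", "author", "Austen"),
    ],
)

# Fetching two predicates per subject needs one self-join per extra predicate.
triple_rows = cur.execute(
    """
    SELECT t1.object, t2.object
    FROM triples AS t1
    JOIN triples AS t2 ON t1.subject = t2.subject
    WHERE t1.predicate = 'title' AND t2.predicate = 'author'
    ORDER BY t1.subject
    """
).fetchall()

# Layout 2: a property table stores correlated predicates as columns of one row.
cur.execute(
    "CREATE TABLE book_properties (subject TEXT PRIMARY KEY, title TEXT, author TEXT)"
)
cur.executemany(
    "INSERT INTO book_properties VALUES (?, ?, ?)",
    [("book1", "Dune", "Herbert"), ("book2", "Emma", "Austen")],
)

# The same question is answered by a single-table scan with no join.
property_rows = cur.execute(
    "SELECT title, author FROM book_properties ORDER BY subject"
).fetchall()

print(triple_rows)
print(property_rows)
```

Both queries return the same pairs. The difference lies in the join count, which grows with the number of predicates in the triple layout and stays at zero for predicates that share a property table.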

Keywords

Computer science, Joins, RDF, Data mining, Table (database), Relational database, Scalability, Cluster analysis
