TDB/Datasets

From Jena Wiki


An RDF Dataset is a collection of one unnamed default graph and zero or more named graphs. In a SPARQL query, a query pattern is matched against the default graph unless the GRAPH keyword is applied to the pattern.
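
As an illustration of the difference, the following sketch matches the same triple pattern once against the default graph and once against every named graph. The foaf:name property and the variable names are assumptions chosen for the example and are not taken from any particular store.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?g ?person ?name
WHERE {
  # No GRAPH keyword: matched against the default graph; ?g stays unbound.
  { ?person foaf:name ?name }
  UNION
  # GRAPH keyword: matched against each named graph in turn; ?g is bound to the graph name.
  { GRAPH ?g { ?person foaf:name ?name } }
}
```

Rows that come from the default graph have no value for ?g, which makes it easy to see where each match came from.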

Dataset Storage

One file location (directory) is used to store one RDF dataset. The unnamed graph of the dataset is held as a single graph while all the named graphs are held in a collection of quad indexes.

Every dataset obtained via TDBFactory.createDataset(Location) for a given location within a JVM is the same dataset. If a model is obtained via TDBFactory.createModel(Location), there is a hidden, shared dataset and the appropriate model is returned.

Dataset Query

(TDB version 0.7.0 and later)

There is full support for SPARQL query over named graphs in a TDB-backed dataset.

All the named graphs can be treated as a single graph which is the union (RDF merge) of all the named graphs. This is given the special graph name <urn:x-arq:UnionGraph> in a GRAPH pattern.

When querying the RDF merge of named graphs, the default graph in the store is not included. This feature applies to queries only. It does not affect the storage nor does it change loading.
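
A minimal sketch of a query over the union graph, assuming the named graphs hold FOAF data (the foaf:name property is an assumption for the example):

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person ?name
WHERE {
  # The special graph name selects the RDF merge of all named graphs;
  # triples in the stored default graph are not matched here.
  GRAPH <urn:x-arq:UnionGraph> {
    ?person foaf:name ?name .
  }
}
```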

Alternatively, if the symbol tdb:unionDefaultGraph (see TDB Configuration) is set, the unnamed graph for the query is the union of all the named graphs in the dataset. The stored default graph is ignored and is not part of the data of the union graph, although it is accessible by the special name <urn:x-arq:DefaultGraph> in a GRAPH pattern.

Special Graph Names

URI                      Meaning
urn:x-arq:UnionGraph     The RDF merge of all the named graphs in the dataset of the query.
urn:x-arq:DefaultGraph   The default graph of the dataset, used when the default graph of the query is the union graph.
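
The two special names can be combined in one query. The sketch below assumes that tdb:unionDefaultGraph has been set and that the data uses FOAF properties; both the property names and the shape of the data are assumptions for illustration.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?friend ?name
WHERE {
  # With tdb:unionDefaultGraph set, a pattern outside GRAPH is matched
  # against the union of all the named graphs.
  ?person foaf:knows ?friend .
  # The stored default graph is still reachable through its special name.
  GRAPH <urn:x-arq:DefaultGraph> {
    ?friend foaf:name ?name .
  }
}
```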

Note that setting tdb:unionDefaultGraph does not affect the default graph or default model obtained with dataset.getDefaultModel().

The RDF merge of all the named graphs can be accessed as the named graph urn:x-arq:UnionGraph using Dataset.getNamedModel("urn:x-arq:UnionGraph").

Dynamic Datasets

(TDB version 0.8.5 and later)

SPARQL has the concept of a dataset description. In a query string, the clauses for FROM and FROM NAMED specify the dataset. The FROM clauses define the graphs that are merged to form the default graph, and the FROM NAMED clauses identify the graphs to be included as named graphs.

Normally, ARQ interprets these as coming from the web; that is, the graphs are read using HTTP GET. TDB modifies this behavior: instead of the universe of graphs being the web, the universe of graphs is the TDB data store. FROM and FROM NAMED describe a dataset with graphs drawn only from the TDB data store.

  • Just using one or more FROM clauses, with no FROM NAMED in a query, leaves the named graphs as all the named graphs in the data store.
  • Just using one or more FROM NAMED, with no FROM in a query, causes an empty default graph to be used.
  • If the symbol TDB.symUnionDefaultGraph is also set, then the default graph is the set union of all the named graphs (FROM NAMED) and the graphs already used for the default graph via FROM.
  • urn:x-arq:UnionGraph and urn:x-arq:DefaultGraph explicitly name the union of the named graphs (FROM NAMED) and the described default graph (the union of the FROM graphs) directly.

# Follow a foaf:knows path across both Alice's and Bob's FOAF data
# where the data is in the data store as named graphs.
BASE <http://example>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?zName
FROM <alice-foaf>
FROM <bob-foaf>
WHERE {
   <http://example/Alice#me> foaf:knows ?y .
   ?y foaf:knows ?z .
   ?z foaf:name ?zName .
}
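
For comparison, a sketch of the FROM NAMED case described in the list above. It reuses the graph names from the previous query; the foaf:name pattern is an assumption for the example.

```sparql
BASE <http://example>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?g ?name
FROM NAMED <alice-foaf>
FROM NAMED <bob-foaf>
WHERE {
  # Only FROM NAMED is given, so the default graph of the query is empty
  # and a pattern outside GRAPH would match nothing.
  GRAPH ?g {
    ?person foaf:name ?name .
  }
}
```

Here ?g can only be bound to one of the two graphs listed in the FROM NAMED clauses, even if the data store holds other named graphs.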