Chris Bizer, Freie Universität Berlin
Richard Cyganiak, Freie Universität Berlin
Olaf Hartig, Humboldt-Universität zu Berlin

NG4J - Named Graphs API for Jena

The Named Graphs API for Jena (NG4J) is an extension to the Jena Semantic Web framework for parsing, manipulating and serializing sets of Named Graphs.

News

Features

NG4J extends Jena with:


Contents

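    1. Introduction
        1.1 Named Graphs
        1.2 NG4J - Named Graph API for Jena
    2. Usage Examples
        2.1 Operations on GraphSet and Model Level
        2.2 Querying Named Graphs with SPARQL
        2.3 Using Database Persistence
        2.4 Semantic Web Publishing (SWP) API and Digital Signatures
        2.5 Semantic Web Client Library
    3. Download
    4. Acknowledgements
    5. Support and Feedback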

1. Introduction

This section introduces the ideas behind Named Graphs and the NG4J API.

1.1 Named Graphs

The Semantic Web can be seen as a collection of RDF graphs. The RDF recommendation explains the meaning of any one graph, and how to merge a set of graphs into one, but does not provide suitable mechanisms for talking about graphs or relations between graphs. But the ability to express metainformation about graphs is required for:


RDF reification has well-known problems in addressing these use cases. To avoid these problems, several authors propose quads, consisting of an RDF triple and a further URIref, blank node or ID. The proposals vary widely in the semantics of the fourth element, using it to refer to information sources, to model IDs or statement IDs, or more generally to "contexts". Named Graphs propose a general and simple variation on RDF, using sets of named RDF graphs.

A set of Named Graphs is a collection of RDF graphs. Each one is named with a URIref.

The name of a graph may occur either in the graph itself, in other graphs, or not at all. Graphs may share URIrefs but not blank nodes. Named Graphs can be seen as a reformulation of quads in which the fourth element's distinct syntactic and semantic properties are clearly distinguished, and the relationship to RDF's triples, abstract syntax and semantics is clearer.
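To make the data model concrete, here is a minimal sketch in plain Java that does not use NG4J or Jena. It assumes that graph names and triples can be represented as plain strings; the class NamedGraphSetSketch and all of its methods are hypothetical names chosen for illustration. It shows that the same triple can live in several named graphs, and that merging the graphs into a union graph discards the graph names.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative model only (not the NG4J API): names and triples are plain strings. */
final class NamedGraphSetSketch {
    // graph name (a URIref) -> triples of that graph, each written as "s p o"
    private final Map<String, Set<String>> graphs = new LinkedHashMap<>();

    /** Adds a triple to the named graph, creating the graph if it does not exist yet. */
    void add(String graphName, String s, String p, String o) {
        graphs.computeIfAbsent(graphName, k -> new LinkedHashSet<>())
              .add(s + " " + p + " " + o);
    }

    int countGraphs() {
        return graphs.size();
    }

    /** Merges all graphs into one set of triples; the graph names are lost. */
    Set<String> unionGraph() {
        Set<String> union = new TreeSet<>();
        for (Set<String> triples : graphs.values()) {
            union.addAll(triples);
        }
        return union;
    }

    /** Returns the names of all graphs that contain the given triple. */
    Set<String> graphsContaining(String s, String p, String o) {
        String triple = s + " " + p + " " + o;
        Set<String> names = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : graphs.entrySet()) {
            if (e.getValue().contains(triple)) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    /** Two graphs that share one triple; the second graph adds another. */
    static NamedGraphSetSketch example() {
        NamedGraphSetSketch set = new NamedGraphSetSketch();
        set.add("http://example.org/g1", "ex:richard", "foaf:name", "\"Richard Cyganiak\"");
        set.add("http://example.org/g2", "ex:richard", "foaf:name", "\"Richard Cyganiak\"");
        set.add("http://example.org/g2", "ex:richard", "foaf:mbox", "<mailto:richard@cyganiak.de>");
        return set;
    }

    public static void main(String[] args) {
        NamedGraphSetSketch set = example();
        System.out.println("Graphs: " + set.countGraphs());
        System.out.println("Union graph: " + set.unionGraph());
        System.out.println("Graphs stating the name: "
                + set.graphsContaining("ex:richard", "foaf:name", "\"Richard Cyganiak\""));
    }
}
```

The lookup in graphsContaining is the kind of provenance question that a single merged RDF graph can no longer answer.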

Further details about Named Graphs are found in:

 

1.2 NG4J - Named Graph API for Jena

The Named Graphs API for Jena (NG4J) is an extension to the Jena Semantic Web toolkit for parsing, manipulating and serializing sets of Named Graphs. NG4J is an experimental implementation of the new syntaxes (TriX, TriG) developed within the Semantic Web Interest Group. Its purpose is to provide a playground for experimenting with these new technologies.

Jena and the Model and Graph layers

Jena is a two-layered framework. Developers usually work with the Model layer, using types like Model, Statement and Resource. The Model layer is designed to be convenient to use for application programmers.

Beneath lies the Graph layer, whose most important types are Graph, Triple and Node. It offers functionality similar to the Model layer but is less convenient for application programmers; it is better suited to building Jena system components such as triple stores or inference engines.

NG4J lies between those two layers, using parts of both. Only a part of its functionality is exposed at the Model layer, by the NamedGraphModel class. The graph-centric and quad-centric methods are exposed directly on the Graph Layer.

Working with NamedGraphs and Quads

The basic idea of the NG4J API is to have a NamedGraphSet which represents a collection of Named Graphs. It can be manipulated by adding and removing entire Graphs, or by working with individual Quads. The NamedGraph object wraps existing Jena graphs, meaning that GraphMem, GraphRDB and GraphD2RQ can be reused.

Working with Models and Resources

You can get a Jena model view on a NamedGraphSet, which can be used like a normal Jena model. The Jena Statements returned by this model or its resources can be cast to NamedGraphStatements. NamedGraphStatements can provide provenance information about the Named Graphs in which they are contained.

API Overview

The tables below give an overview of NG4J's main interfaces and their most important methods. See the JavaDoc for details about all methods.

NamedGraph
A collection of RDF triples which is named by a URI. A NamedGraph can be created by wrapping existing Jena graphs like GraphMem, GraphRDB or GraphD2RQ.
getGraphName() Returns the URI of the named graph.
add(Triple) Adds a triple to the NamedGraph. (inherited from Jena's Graph type)
find(TripleMatch) Returns an iterator over the query results. (inherited from Jena's Graph type)
More ...

 

NamedGraphSet
A set of Named Graphs is a collection of RDF graphs where each one is named with a URIref. The collection can be accessed and modified by adding and removing NamedGraph instances, by adding, removing and finding Quads (RDF triples with an additional graph name) and through its union graph (the RDF graph containing all statements from all graphs in the set), which is accessible as a Jena Graph and as a Jena Model. NamedGraphSets can be serialized using TriX and TriG. We provide a NamedGraphSet implementation based on Jena graphs. Other implementations might represent information as quads or build on a quad-based store like RDF Gateway.
addGraph(namedgraph) Adds a NamedGraph to the set.
addQuad(quad) Adds a quad to the NamedGraphSet. A new NamedGraph will automatically be created if necessary.
findQuads(quad) Finds Quads that match a quad pattern.
read(url, lang) Reads NamedGraphs from a URL into the NamedGraphSet. Supported RDF serialization languages are TriX, TriG, RDF/XML, N-Triples and N3.
write(writer, lang, baseURI) Writes a serialized representation of the NamedGraphSet to a Writer. Supported RDF serialization languages are TriX, TriG, RDF/XML, N-Triples and N3. If the specified serialization language doesn't support Named Graphs, then the union graph will be serialized, and knowledge about graph names is lost. Only TriX and TriG support graph naming.
asJenaGraph(defaultGraphForAdding) Returns a Jena Graph view on the NamedGraphSet, equivalent to the union graph of all graphs in the graph set.
asJenaModel(defaultGraphForAdding) Returns a Jena Model view on the NamedGraphSet, equivalent to the union graph of all graphs in the graph set. All Statements returned by this NamedGraphModel can be cast to NamedGraphStatements in order to access provenance information about the graphs they are contained in.
More ...
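The idea of matching quads against a quad pattern can be sketched in plain Java without NG4J. The sketch assumes that a quad is a four-element string array (graph name, subject, predicate, object) and that null in a pattern acts as a wildcard, in the spirit of Node.ANY. The class QuadPatternSketch and its methods are hypothetical and are not part of the NG4J API.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: quads are String[4] = {graphName, subject, predicate, object}. */
final class QuadPatternSketch {

    /** A null position in the pattern matches any value in that position. */
    static boolean matches(String[] pattern, String[] quad) {
        for (int i = 0; i < 4; i++) {
            if (pattern[i] != null && !pattern[i].equals(quad[i])) {
                return false;
            }
        }
        return true;
    }

    /** Returns all quads that match the pattern, in their original order. */
    static List<String[]> findQuads(List<String[]> quads, String[] pattern) {
        List<String[]> result = new ArrayList<>();
        for (String[] quad : quads) {
            if (matches(pattern, quad)) {
                result.add(quad);
            }
        }
        return result;
    }

    /** Three sample quads spread over two graphs. */
    static List<String[]> sampleQuads() {
        List<String[]> quads = new ArrayList<>();
        quads.add(new String[] {"ex:g1", "ex:richard", "foaf:name", "\"Richard Cyganiak\""});
        quads.add(new String[] {"ex:g2", "ex:richard", "foaf:mbox", "<mailto:richard@cyganiak.de>"});
        quads.add(new String[] {"ex:g2", "ex:chris", "foaf:name", "\"Chris Bizer\""});
        return quads;
    }

    public static void main(String[] args) {
        // Everything said about ex:richard, in any graph
        for (String[] q : findQuads(sampleQuads(), new String[] {null, "ex:richard", null, null})) {
            System.out.println(q[0] + " { " + q[1] + " " + q[2] + " " + q[3] + " . }");
        }
    }
}
```

Leaving the graph position open while fixing the subject corresponds to asking "what is said about this resource, and by which graph?".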

 

NamedGraphModel implements Jena Model
NamedGraphModel provides a resource-centric view on a NamedGraphSet. It behaves like a normal Jena model based on the union graph of the NamedGraphSet. Statements returned by the NamedGraphModel can be cast to NamedGraphStatements, which are able to provide provenance information. Reading RDF files into the NamedGraphModel replaces statements previously loaded from the same source or URL.
getResource() Returns a Jena resource. Statements returned by this resource can be cast to NamedGraphStatements.
Supports all other Jena Model methods. More ...

 

NamedGraphStatement implements Jena Statement
A NamedGraphStatement is a Statement which can provide provenance information about the NamedGraphs in which it is contained.
listGraphs() List all NamedGraphs which contain the statement.
listQuads() List all quads which contain the triple of the statement.
getGraphName() Returns the name of a graph containing the statement. If several graphs contain the statement, one will be chosen arbitrarily. Useful if it is known that there is only one.
More ...

 


2. Usage Examples

2.1 Operations on GraphSet and Model Level

////////////////////////////////////////////////
//         Operations on GraphSet Level
////////////////////////////////////////////////

// Create a new graphset
NamedGraphSet graphset = new NamedGraphSetImpl();

// Create a new NamedGraph in the NamedGraphSet
NamedGraph graph = graphset.createGraph("http://example.org/persons/123");

// Add information to the NamedGraph
graph.add(new Triple(Node.createURI("http://richard.cyganiak.de/foaf.rdf#RichardCyganiak"),
        Node.createURI("http://xmlns.com/foaf/0.1/name") ,
        Node.createLiteral("Richard Cyganiak", null, null)));

// Create a quad
Quad quad = new Quad(Node.createURI("http://www.bizer.de/InformationAboutRichard"),
        Node.createURI("http://richard.cyganiak.de/foaf.rdf#RichardCyganiak"),
        Node.createURI("http://xmlns.com/foaf/0.1/mbox") ,
        Node.createURI("mailto:richard@cyganiak.de"));

// Add the quad to the graphset. This will create a new NamedGraph in the
// graphset.
graphset.addQuad(quad);

// Find information about Richard across all graphs in the graphset
Iterator it = graphset.findQuads( 
        Node.ANY, 
        Node.createURI("http://richard.cyganiak.de/foaf.rdf#RichardCyganiak"),
        Node.ANY,
        Node.ANY);

while (it.hasNext()) {
    Quad q = (Quad) it.next();
    System.out.println("Source: " + q.getGraphName());
    System.out.println("Statement: " + q.getTriple());
    // (This will output the two statements created above)
}

// Count all graphs in the graphset (2)
System.out.println("The graphset contains " + graphset.countGraphs() + " graphs.");

// Serialize the graphset to System.out, using the TriX syntax
graphset.write(System.out, "TRIX", null);

////////////////////////////////////////////////
//         Operations on Model Level
////////////////////////////////////////////////

// Get a Jena Model view on the GraphSet
Model model = graphset.asJenaModel("http://example.org/defaultgraph");

// Add provenance information about a graph
Resource informationAboutRichard = model.getResource("http://www.bizer.de/InformationAboutRichard");
informationAboutRichard.addProperty(model.createProperty("http://purl.org/dc/elements/1.1/author"), "Chris Bizer");
informationAboutRichard.addProperty(model.createProperty("http://purl.org/dc/elements/1.1/date"), "09/15/2004");

// Get a Jena resource and statement
Resource richard = model.getResource("http://richard.cyganiak.de/foaf.rdf#RichardCyganiak");

NamedGraphStatement mboxStmt = 
        (NamedGraphStatement) richard.getProperty(model.getProperty("http://xmlns.com/foaf/0.1/mbox"));

// Get an iterator over all graphs which contain the statement.
it = mboxStmt.listGraphNames();

// So who has published my email address all over the Web??!?
while (it.hasNext()) {
    Resource g = (Resource) it.next();
    System.out.println();
    System.out.println("GraphName: " + g.toString());
    System.out.println("Author: " + 
        g.getProperty(model.getProperty("http://purl.org/dc/elements/1.1/author")).getString());
    System.out.println("Date: " + 
        g.getProperty(model.getProperty("http://purl.org/dc/elements/1.1/date")).getString());
}

// Serialize the model to System.out, using the TriG syntax
model.write(System.out, "TRIG", "http://richard.cyganiak.de/foaf.rdf");

When run, the example code will write this output to System.out.

 

2.2 Querying Named Graphs with SPARQL

The SPARQL RDF query language supports queries on sets of named graphs. The GRAPH keyword can be used to select a specific graph, or to bind to a variable the names of all graphs whose contents match a query pattern:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?otherFoafFile
WHERE {
  GRAPH <http://example.org/my-foaf.rdf> {
    ?me foaf:mbox <mailto:me@example.org> .
  }
  GRAPH ?otherFoafFile {
    ?otherFoafFile foaf:maker ?someone .
    ?someone foaf:knows ?me .
  }
}

The first part reads my FOAF file and selects the resource representing me (by matching my email address). The second part finds and returns all FOAF files whose makers claim to know me. Note that all the files in question must have been loaded into the graph set before the query is executed.

As an alternative approach, NG4J also includes a Semantic Web Client that tries to load files transparently from the Web as required to answer a query.

Here's the code for running a SPARQL query against a NamedGraphSet:

Query sparql = QueryFactory.create(
        "SELECT * WHERE { GRAPH ?graph { ?s ?p ?o } }");
QueryExecution qe = QueryExecutionFactory.create(
        sparql, new NamedGraphDataset(set));
ResultSet results = qe.execSelect();
while (results.hasNext()) {
    QuerySolution result = results.nextSolution();
    RDFNode graph = result.get("graph");
    RDFNode s = result.get("s");
    RDFNode p = result.get("p");
    RDFNode o = result.get("o");
    System.out.println(graph + " { " + s + " " + p + " " + o + " . }");
}

This will run the query and print the results.

 

2.3 Using Database Persistence

NamedGraphSets can be stored in a relational database. As of NG4J v0.9, HSQLDB, MySQL, Oracle, PostgreSQL, and Apache Derby are supported. To create a persistent NamedGraphSet, you have to set up a java.sql database connection and create a new NamedGraphSetDB instance:

String URL = "jdbc:mysql://localhost/db_name";
String USER = "username";
String PW = "mypassword";
Class.forName("com.mysql.jdbc.Driver");
Connection connection = DriverManager.getConnection(URL, USER, PW);
NamedGraphSet set = new NamedGraphSetDB(connection);
// ...
set.close();

The resulting NamedGraphSet can be used just like a normal in-memory NamedGraphSet.

Optionally, you can specify a table prefix that will be prepended to all database table names. This allows multiple NamedGraphSets to coexist in the same database. There's also a method for deleting sets from the database.

NamedGraphSet set = new NamedGraphSetDB(connection, "myapp");
// ...
NamedGraphSetDB.delete(connection, "myapp");

 

2.4 Semantic Web Publishing (SWP) API and Digital Signatures

Working with Digital Signatures and Named Graphs

The Semantic Web Publishing Vocabulary (SWP) is used for publishing Named Graphs on the Web. It allows authorities to indicate whether they assert or quote a graph and to sign graphs using digital signatures. More information about the SWP vocabulary is available on the TriQL.P Trust Architecture page.

The SWP API in NG4J builds on the NamedGraphSet interface. Named Graphs are asserted or quoted by an SWPAuthority, a service or person that is able to publish information. The authority must have a valid X.509 certificate in order to be able to sign Named Graphs.

Signing Named Graphs

The general algorithm for signing and recording the digital signature of a Named Graph is as follows:

Verifying Signatures

Named Graph digital signatures can be verified in the following manner:

SWPNamedGraphSet
An extension to NamedGraphSet that handles various types of digital signature; options include signing each graph in the set or the NamedGraphSet as a whole.
swpAssert(SWPAuthority authority) Assert all graphs in the graphset with the given Authority, but don't sign the resulting Warrant.
swpQuote(SWPAuthority authority) Quote all graphs in the graphset with the given Authority, but don't sign the resulting Warrant.
assertWithSignature(SWPAuthority authority, Node signatureMethod, Node digestMethod, ArrayList listOfAuthorityProperties, String keystore, String password) Assert all graphs in the graphset with the given Authority and sign the asserted graph with a digital signature according to the specified signatureMethod.
quoteWithSignature(SWPAuthority authority, Node signatureMethod, Node digestMethod, ArrayList listOfAuthorityProperties, String keystore, String password) Quote all graphs in the graphset with the given Authority and sign the quoted graph with a digital signature according to the specified signatureMethod.
verifyAllSignatures() Verify all signatures and graph digests in the NamedGraphSet.
More ...

SWPAuthority
Represents a person or service that is able to publish information.
setID(Node id) Sets the ID of the authority. Authorities can be identified using a URIref or a bNode.
setLabel(String label) Sets the Label / Name of the authority. Will be serialized using rdfs:label
setEmail(String email) Sets the eMail address of the authority. Will be serialized using foaf:mbox
setPublicKey(PublicKey key) Sets the public key of the authority. Will be serialized using swp:hasKey
setCertificate(X509Certificate cert) Sets the X.509 certificate of the authority.
addProperty(Node predicate, Node object) Adds an additional property of the authority.
addDescriptionToGraph(NamedGraph graph, ArrayList listOfAuthorityProperties) Given a NamedGraph and an arraylist of properties, add properties and Authority properties to the NamedGraph.
More ...

Usage Example

SWPNamedGraphSet set = new SWPNamedGraphSetImpl();
	
//Create an SWPAuthority so we can assert/quote graphs
SWPAuthority auth = new SWPAuthorityImpl();

// set the email addr of the authority (FOAF vocabulary)
auth.setEmail("mailto:rowland@grid.cx");

// set the ID of the authority
auth.setID(Node.createURI("http://grid.cx/rowland"));

// path to the user's PKCS #12 keystore and its password
// (declared before they are used below)
String keystore = "/path/to/your/pkcs12/keystore";
String password = "passwd";

// set the X.509 certificate using the user's PKCS #12 keystore
Certificate[] chain = de.fuberlin.wiwiss.ng4j.swp.utils.PKCS12Utils.getCertChain(keystore, password);
auth.setCertificate((X509Certificate) chain[0]);

// Assert all graphs in the NamedGraphSet
// We provide an Authority, a Node specifying the signature algorithm (SHA1WithRSA),
// a Node specifying the digest method (SHA-1), a null list of properties,
// a path to a PKCS #12 keystore, and its password.
set.assertWithSignature(auth, 
			SWP.JjcRdfC14N_rsa_sha1, 
			SWP.JjcRdfC14N_sha1, 
			null, 
			keystore, 
			password);

// Verify all signatures in the NamedGraphSet
set.verifyAllSignatures();

// print out the resulting NamedGraphSet
set.write(System.out, "TRIG", "");

The output might look like this.
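The cryptographic step behind signing and verifying can be illustrated with the standard java.security API alone. The sketch below is not NG4J's implementation: it assumes a graph is a collection of triple strings, and it replaces real RDF canonicalization with a simple stand-in (sorting the triple strings) so that the same set of triples always yields the same bytes. The class GraphSignatureSketch and its methods are hypothetical. SHA1withRSA is used only because the example above names that algorithm.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

/** Illustrative only: signs a sorted text form of a graph, not a true RDF canonical form. */
final class GraphSignatureSketch {

    /** Stand-in for canonicalization (an assumption): sort the triples, one per line. */
    static byte[] canonicalBytes(Collection<String> triples) {
        StringBuilder sb = new StringBuilder();
        for (String triple : new TreeSet<>(triples)) {
            sb.append(triple).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    static byte[] sign(Collection<String> triples, PrivateKey key) throws GeneralSecurityException {
        Signature signer = Signature.getInstance("SHA1withRSA");
        signer.initSign(key);
        signer.update(canonicalBytes(triples));
        return signer.sign();
    }

    static boolean verify(Collection<String> triples, byte[] signature, PublicKey key)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance("SHA1withRSA");
        verifier.initVerify(key);
        verifier.update(canonicalBytes(triples));
        return verifier.verify(signature);
    }

    /** Signs a small graph, optionally adds a triple afterwards, then verifies. */
    static boolean roundTrip(boolean tamper) {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            KeyPair keys = generator.generateKeyPair();

            List<String> graph = new ArrayList<>();
            graph.add("ex:richard foaf:name \"Richard Cyganiak\" .");
            graph.add("ex:richard foaf:mbox <mailto:richard@cyganiak.de> .");
            byte[] signature = sign(graph, keys.getPrivate());

            if (tamper) {
                graph.add("ex:richard foaf:knows ex:mallory .");
            }
            return verify(graph, signature, keys.getPublic());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Untouched graph verifies: " + roundTrip(false));
        System.out.println("Modified graph verifies: " + roundTrip(true));
    }
}
```

In a real deployment the private key would come from the authority's keystore rather than being generated on the fly, and the verifier would obtain the public key from the authority's certificate.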

 

Manipulating PKCS #12 keystores

The SWP Framework relies heavily on the PKI (Public Key Infrastructure) to provide reliable cryptographic digital signatures, and non-repudiation of signatures.

Generating PKCS #12 keystores

Unless you are willing to write certificate management tools yourself, using Bouncy Castle, Cryptix or IAIK, you will most likely want to use OpenSSL to generate and manage your certificates.

While OpenSSL provides a wide range of options for generating DSA and RSA keys with other features, it can be very cumbersome to use. Each OpenSSL distribution should contain a Perl script called CA.pl which makes life a little bit easier.

Before you can generate your own certificates, you need to have a Certificate Authority (CA) that can be used to vouch for the veracity of your certificate. Unless you already have a CA, you will want to make your own.

Generating the CA: The CA can be generated with the CA.pl script using the following command:

./CA.pl -newca

User certificates can be generated by first creating a certificate request and then having it signed by the CA:

./CA.pl -newreq
./CA.pl -sign

Once you have your certificate request signed, you can then package it into a PKCS #12 keystore. The following command will achieve this:

./CA.pl -pkcs12

You should now have a file called newcert.p12. This is the file that you should use as input to any of the methods in SWPNamedGraphSet.
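Before handing the keystore to NG4J, it can be useful to check that the file loads with the expected password. The sketch below uses only the standard java.security.KeyStore API and is independent of NG4J. The class Pkcs12InspectSketch and its methods are hypothetical, and the file path and password are expected as command-line arguments.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: loads a PKCS #12 keystore and lists the aliases of its entries. */
final class Pkcs12InspectSketch {

    /** Loads the keystore from the stream; fails if the password or format is wrong. */
    static List<String> listAliases(InputStream in, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore keystore = KeyStore.getInstance("PKCS12");
        keystore.load(in, password);
        return new ArrayList<>(Collections.list(keystore.aliases()));
    }

    /** Self-check without a real certificate: an empty keystore round-trips to no aliases. */
    static int aliasCountOfEmptyKeystore() {
        try {
            char[] password = "passwd".toCharArray();
            KeyStore empty = KeyStore.getInstance("PKCS12");
            empty.load(null, null); // initialize a new, empty keystore
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            empty.store(out, password);
            return listAliases(new ByteArrayInputStream(out.toByteArray()), password).size();
        } catch (IOException | GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            // No file given: run the in-memory self-check instead
            System.out.println("Aliases in an empty keystore: " + aliasCountOfEmptyKeystore());
            return;
        }
        // args[0]: path to the keystore (for example newcert.p12), args[1]: its password
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println("Aliases: " + listAliases(in, args[1].toCharArray()));
        }
    }
}
```

If the keystore loads and lists at least one alias, the same path and password should be usable as the keystore and password arguments shown in the usage example above.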

 

2.5 Semantic Web Client Library

NG4J's Semantic Web Client Library represents the complete Semantic Web as a single set of named graphs. When queried, the SemanticWebClient dynamically retrieves information from the Semantic Web by dereferencing HTTP URIs, by following rdfs:seeAlso links, and by querying the Sindice search engine. It is described in a separate document: Semantic Web Client Library.


 

3. Download

Version | Comment | Release Date
V0.9.3 | Upgrade to Jena 2.6.2. | 2010-02-25
V0.9.2 | Upgrade to the latest versions of Jena (2.6.0) and ARQ (2.8.0); addition of RDFa support to the Semantic Web Client Library; a few code restructurings; minor bug fix in the TriG parser. | 2009-06-14
V0.9.1 | Upgrade to the latest versions of Bouncy Castle libraries (1.43) and Axis (1.4); addition of a SingleNamedGraphModel which is a Jena model that wraps a single NamedGraph; several bug fixes (mainly concurrency issues and memory leaks in the Semantic Web Client Library; TriG writer); a few code cleanups; deprecated GRDDL support in the Semantic Web Client Library. | 2009-06-14
V0.9 | Support for Apache Derby database added; Java 1.4 compatibility dropped; TriQL removed; bugfixes and code cleanups. | 2009-02-20
V0.8 | Upgrade to the latest versions of Jena and ARQ (Jena 2.5.6, ARQ 2.4); support for different output formats in the command line tool of the Semantic Web Client Library; bugfixes; deprecated TriQL (please use SPARQL queries instead). | 2008-11-17
V0.7 | Support for URI based search with the Sindice search engine and management of 303 redirects in the Semantic Web Client Library, database persistence with Oracle supported, Bouncy Castle libraries updated to version 138, code refactoring and bugfixes. | 2008-09-04
V0.6 | Support for GRDDL in the Semantic Web Client Library, bugfixes. | 2007-04-19
V0.5 | Semantic Web Client Library, SPARQL support, PostgreSQL and HSQLDB support, bugfixes. | 2006-10-09
V0.4 | SWP and X.509 signatures support, TriQL improvements, bugfixes. | 2005-02-24
V0.3 | TriG support, TriX syntactic extensions support, Jena 2.2 compatibility, bugfixes and minor improvements. | 2004-12-17
V0.2 | TriQL support, MySQL persistence added. | 2004-11-02
V0.1 | Initial release. | 2004-09-14



4. Acknowledgements

NG4J is developed by:

Lots of thanks to those who have contributed to the project:


5. Support and Feedback

Additional information about NG4J is available on the project's wiki. See the MediaWiki Hosted Application at the NG4J SourceForge site http://sourceforge.net/apps/mediawiki/ng4j/.

We are interested in hearing your opinions on and experience with NG4J. Please send comments and bug reports to the NG4J-namedgraphs mailing list:

ng4j-namedgraphs@lists.sourceforge.net

The archives of the list are found at http://sourceforge.net/mailarchive/forum.php?forum=ng4j-namedgraphs
You can subscribe to the list at http://lists.sourceforge.net/lists/listinfo/ng4j-namedgraphs.

A maven generated Web site about the NG4J project is available at http://paneris.net/ng4j/index.html.


$Date: 2010/02/25 14:27:58 $
$Id: index.html,v 1.42 2010/02/25 14:27:58 hartig Exp $