Berlin SPARQL Benchmark (BSBM)

The SPARQL Query Language for RDF and the SPARQL Protocol for RDF are implemented by a growing number of storage systems and are used within enterprise and open web settings. As SPARQL is taken up by the community, there is a growing need for benchmarks that compare the performance of storage systems exposing SPARQL endpoints via the SPARQL protocol. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources.

The Berlin SPARQL Benchmark (BSBM) defines a suite of benchmarks for comparing the performance of these systems across architectures. The benchmark is built around an e-commerce use case in which a set of products is offered by different vendors and consumers have posted reviews about products. The benchmark query mix illustrates the search and navigation pattern of a consumer looking for a product.
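As a rough illustration of what a single benchmark-style request against a SPARQL endpoint might look like, the sketch below builds a "query via GET" URL as defined by the SPARQL protocol. The endpoint address, the ex: vocabulary and the query itself are assumptions made up for this example; they are not taken from the BSBM specification or its query mix.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and vocabulary, used for illustration only.
ENDPOINT = "http://localhost:8890/sparql"
QUERY = """
PREFIX ex: <http://example.org/shop#>
SELECT ?product ?label
WHERE {
  ?product a ex:Product ;
           ex:label ?label ;
           ex:feature ex:Feature1 .
}
ORDER BY ?label
LIMIT 10
"""


def build_query_url(endpoint: str, query: str) -> str:
    """Return a SPARQL protocol 'query via GET' URL for the given query."""
    # The protocol passes the query text in the 'query' URL parameter.
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)

# Decoding the query string recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

A client would send an HTTP GET to this URL and time the response. The same query text could in principle be sent to any system that exposes a SPARQL endpoint, which is what makes comparisons across architectures possible.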

News

11/12/2013: New benchmark experiment using the BSBM and the DBpedia benchmark to compare the performance of several NoSQL databases for RDF processing (10 million to 1 billion triples).
04/26/2013: Results of the April 2013 BSBM V3.1 Experiment released, benchmarking Virtuoso, BigOWLIM, BigData and Jena TDB with datasets ranging from 10 million to 150 billion (!) triples within the Explore and Business Intelligence use cases. See also the blog post about the experience gained while conducting the experiment.
08/26/2011: Berlin SPARQL Benchmark (BSBM) Specification - V3.1 released.
08/21/2011: Peter Boncz released a call for participation in the BSBM V3.1 benchmark, which is part of the LOD2 benchmarking activities. The benchmark will be run on an impressive cluster consisting of 4880 cores and 12 TB of memory.
02/22/2011: Results of the February 2011 BSBM V3 Experiment released, benchmarking Virtuoso, BigOWLIM, 4store, BigData and Jena TDB with 100 million and 200 million triple datasets within the Explore and Update use cases.
11/29/2010: Berlin SPARQL Benchmark (BSBM) Specification - V3.0 released.
11/30/2009: Results of the November 2009 BSBM V2 Experiment released, benchmarking Virtuoso, BigOWLIM and Jena TDB with 100 million and 200 million triple datasets.
04/16/2009: Additional data generator for producing Linked Data style named graph sets released by Olaf Hartig from Humboldt University.
03/24/2009: Results of the March 2009 BSBM V2 Experiment released, benchmarking four RDF stores, two RDB-to-RDF wrappers and two SQL databases with datasets up to 100 million triples.
09/17/2008: Results from running BSBM Version 2 against three RDF stores, two RDB-to-RDF wrappers and two SQL databases released.
09/12/2008: Version 2 of the BSBM benchmark specification, data generator and test driver released. The new version includes 2 additional benchmark queries, fine-tuned data generation rules, a relational representation of the dataset, and a SQL version of the query mix. The test driver is now multi-threaded so that it can simulate multiple clients working against a server simultaneously.
07/30/2008: Results from running BSBM Version 1 with 100 million triple datasets on Virtuoso and D2R Server added.
07/23/2008: New benchmark results for D2R Server added.
07/19/2008: Initial benchmark results for Virtuoso, Sesame and SDB published.
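
The 09/12/2008 entry above mentions a multi-threaded test driver that simulates multiple clients working against a server at the same time. The following is a minimal sketch of that idea, not the BSBM test driver itself: the function names, the placeholder query names and the stand-in execute function are all hypothetical, and a real driver would send each query to a SPARQL endpoint over HTTP.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def run_client(execute_query: Callable[[str], object],
               query_mix: Sequence[str], mixes: int) -> float:
    """Run `mixes` query mixes one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(mixes):
        for query in query_mix:
            execute_query(query)
    return time.perf_counter() - start


def run_clients(execute_query: Callable[[str], object],
                query_mix: Sequence[str],
                clients: int, mixes: int) -> List[float]:
    """Run `clients` simulated clients concurrently, one thread per client."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(run_client, execute_query, query_mix, mixes)
                   for _ in range(clients)]
        return [future.result() for future in futures]


# Stand-in for a real SPARQL request (hypothetical): it only records the call.
calls: List[str] = []
calls_lock = threading.Lock()


def fake_execute(query: str) -> None:
    with calls_lock:
        calls.append(query)


# Four simulated clients, each running the three-query mix twice.
timings = run_clients(fake_execute, ["Q1", "Q2", "Q3"], clients=4, mixes=2)
```

Each entry in timings is the wall-clock time one simulated client needed for its query mixes. Replacing fake_execute with a function that issues real requests would turn the sketch into a simple load generator.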

Please send comments and feedback about the benchmark to Chris Bizer and Andreas Schultz.

Publications

Christian Bizer, Andreas Schultz: The Berlin SPARQL Benchmark. In: International Journal on Semantic Web & Information Systems, Vol. 5, Issue 2, Pages 1-24, 2009.
Christian Bizer, Andreas Schultz: Benchmarking the Performance of Storage Systems that expose SPARQL Endpoints. In: Proceedings of the 4th International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS2008).

More information about benchmarking RDF stores can be found at http://www.w3.org/wiki/RdfStoreBenchmarking