Document Version: 1.0
Publication Date: 11/30/2009
1. Introduction
The Berlin SPARQL Benchmark (BSBM) is a benchmark for comparing the performance of storage systems that expose SPARQL endpoints. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.
This document presents the results of a November 2009 BSBM experiment in which the Berlin SPARQL Benchmark was used to measure the performance of three RDF stores: Jena TDB, BigOWLIM, and Virtuoso.
The stores were benchmarked with datasets of 100 million and 200 million triples.
This November 2009 BSBM experiment was run in addition to the March 2009 BSBM experiment, which compared triple stores, relational database-to-RDF wrappers, and SQL database management systems. The results of the March 2009 experiment are found here. Within the November 2009 experiment, BigOWLIM was tested for the first time; TDB was measured again because of significant speed-ups since March 2009; and Virtuoso was included for comparison, as it was the fastest triple store in the previous BSBM experiment.
2. Benchmark Datasets
We ran the benchmark using the Triple version of the BSBM dataset (benchmark scenario NTR). The benchmark was run for different dataset sizes. The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification.
Details about the benchmark datasets are summarized in the following table:
| | 100M Triples | 200M Triples |
|---|---|---|
| Number of Products | 284,826 | 570,000 |
| Number of Producers | 5,618 | 11,240 |
| Number of Product Features | 47,884 | 94,259 |
| Number of Product Types | 2,011 | 3,949 |
| Number of Vendors | 2,854 | 5,710 |
| Number of Offers | 5,696,520 | 11,400,000 |
| Number of Reviewers | 146,054 | 292,271 |
| Number of Reviews | 2,848,260 | 5,700,000 |
| Total Number of Instances | 9,034,027 | 18,077,429 |
| Exact Total Number of Triples | 100,000,112 | 200,031,413 |
| File Size Turtle (unzipped) | 8.5 GB | 18 GB |
Note: All datasets were generated with the -fc option for forward chaining.
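As a quick plausibility check of these figures, the offer and review counts of the 100M dataset are exact multiples of the product count. The following minimal sketch (illustrative only; the per-product ratios are simply read off the table above) verifies this:

```python
# Plausibility check: offers and reviews in the 100M dataset scale
# linearly with the number of products (figures taken from the table above).
products, offers, reviews = 284_826, 5_696_520, 2_848_260

assert offers == 20 * products    # 20 offers per product
assert reviews == 10 * products   # 10 reviews per product
print("100M dataset ratios are consistent")
```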
There are an RDF triple representation and a relational representation of the benchmark datasets. Both representations can be downloaded below:
Download Turtle Representation of the Benchmark Datasets
- 100M Benchmark Dataset (Turtle, gzipped size: 2.2 GB)
(If you generate the datasets yourself, the Test Driver data is generated automatically in the directory "td_data".)
Download Test Driver data
3. Benchmark Machine
The benchmarks were run on the same machine as the March 2009 experiments. This machine has the following specification:
- Hardware:
- Processors: Intel Core 2 Quad Q9450, 2.66 GHz, FSB 1333 MHz, L1 cache 256 KB, shared L2 cache 12,288 KB overall
- Memory: 8GB DDR2 667 (4 x 2GB)
- Hard Disks: 160GB (10,000 rpm) SATA2, 750GB (7,200 rpm) SATA2
- Software:
- Operating System: Ubuntu 8.04 64-bit, Kernel 2.6.24-24
- Filesystem: ext3
- Separate partitions for application data (on the 7,200 rpm HDD) and databases (on the 10,000 rpm HDD).
- Java Version and JVM: Version 1.6.0_16-b01, HotSpot(TM) 64-Bit Server VM (build 14.2-b01).
4. Benchmark Results
This section reports the results of running the BSBM benchmark against three RDF stores.
The load performance of the systems was measured by loading the Turtle representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.
The query performance of the systems was measured by running 500 BSBM query mixes (altogether 12,500 queries) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to measure sustainable performance of the SUTs, a ramp-up period is executed before the actual test runs.
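For illustration, the following minimal sketch shows how a single query can be sent to a SPARQL endpoint over HTTP and timed, roughly in the spirit of the test driver; the endpoint URL and the trivial query are placeholders, not the actual BSBM test driver code or query mix.

```python
# Minimal sketch: send one SPARQL query over the SPARQL protocol (HTTP GET)
# and measure its execution time. Endpoint URL and query are placeholders,
# not the actual BSBM queries or the test driver implementation.
import time
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:2020/sparql"              # hypothetical SUT endpoint
QUERY = "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }"   # placeholder query

def timed_query(endpoint, query):
    params = urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+xml"},
    )
    start = time.time()
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return time.time() - start, body

elapsed, _ = timed_query(ENDPOINT, QUERY)
print(f"query answered in {elapsed:.3f} s")
```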
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store, clear OS caches, restart the store.
- Run ramp-up.
- Execute single-client test run (500 query mixes performance measurement, randomizer seed: 808080).
- Execute multiple-client test runs (4 clients, 500 query mixes, randomizer seed: 863528).
- Execute test run with reduced query mix (repeat steps 2 to 4 with the reduced query mix and a different randomizer seed: 919191).
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
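The effect of the distinct seeds can be illustrated with a small sketch; the parameter pool and the way parameters are drawn are hypothetical simplifications, as the real test driver selects its parameters from the generated dataset.

```python
# Minimal sketch: distinct randomizer seeds lead to distinctly parameterized
# queries, which makes query/result caching in the store less effective.
# The parameter pool below is a hypothetical stand-in for dataset values.
import random

products = [f"Product{i}" for i in range(1, 284_827)]

def pick_parameters(seed, count=10):
    rng = random.Random(seed)                  # deterministic per seed
    return [rng.choice(products) for _ in range(count)]

print(pick_parameters(808080)[:3])   # single-client run
print(pick_parameters(863528)[:3])   # multiple-client runs
print(pick_parameters(919191)[:3])   # reduced query mix run
```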
An overview of the load times for the SUTs and the different datasets is given in the following table (in hh:mm:ss):
| SUT | 100M | 200M |
|---|---|---|
| Jena TDB | 01:42:45 | 06:14:41 |
| BigOWLIM | 00:33:47 | 01:18:18 |
| VirtuosoTS | 07:43:39 | 48:41:11 |
4.1 TDB over Joseki
4.1.1 Configuration
The following changes were made to the default configuration of the software:
- TDB: Version 0.83 (SVN rev. 6745)
The statistics-based BGP optimizer was enabled by generating the stats.opt file and copying it to the database location.
- Joseki: Version 3.4 (CVS 2009-09-29)
All Log levels set to WARN
4.1.2 Load Time
The table below summarizes the load times of the Turtle files (in hh:mm:ss):

| 100M | 200M |
|---|---|
| 1:42:45 | 6:14:41 |
4.1.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 query mix runs (in QpS):
| | 100M | 200M |
|---|---|---|
| Query 1 | 69.4 | 29.2 |
| Query 2 | 46.7 | 32.3 |
| Query 3 | 62.1 | 30.6 |
| Query 4 | 44.1 | 20.0 |
| Query 5 | 1.2 | 0.7 |
| Query 6 | 0.3 | 0.1 |
| Query 7 | 5.8 | 3.0 |
| Query 8 | 9.9 | 6.1 |
| Query 9 | 1.9 | 1.0 |
| Query 10 | 10.6 | 5.9 |
| Query 11 | 20.0 | 15.1 |
| Query 12 | 1.9 | 1.0 |
4.1.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran test runs with one client and with four clients. The results are given in Query Mixes per Hour (QMpH), meaning that larger numbers are better.
| Dataset | 1 client | 4 clients |
|---|---|---|
| 100M | 407 | 524 |
| 200M | 210 | 250 |
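To relate QMpH to raw query throughput: a BSBM query mix contains 25 queries (500 mixes correspond to 12,500 queries, see above), so a QMpH figure can be converted into an approximate overall queries-per-second rate as in the following back-of-the-envelope sketch (the official per-query QpS figures are computed by the test driver from the measured execution times, not in this way):

```python
# Back-of-the-envelope sketch: convert a QMpH figure into an approximate
# overall queries-per-second rate. 500 query mixes = 12,500 queries,
# i.e. 25 queries per mix (see Section 4).
QUERIES_PER_MIX = 12_500 / 500   # = 25

def overall_qps(qmph):
    """Approximate overall queries/second implied by a QMpH figure."""
    return qmph * QUERIES_PER_MIX / 3600.0

# Example: TDB on the 100M dataset with a single client reached 407 QMpH,
# i.e. roughly 2.8 queries per second overall.
print(f"{overall_qps(407):.1f} queries/s")
```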
4.1.5 Result Summaries
- TDB 100M: result summaries for 1 and 4 clients (download links)
- TDB 200M: result summaries for 1 and 4 clients (download links)
4.1.6 Run Logs (detailed information)
- TDB run logs for 100M: 1 and 4 clients (download links)
- TDB run logs for 200M: 1 and 4 clients (download links)
4.2 BigOWLIM 3.1
4.2.1 Configuration
The following changes were made to the default configuration of the software:
- BigOWLIM: Version 3.1
Store Config File
- Tomcat: Version 5.5.25.5ubuntu
JAVA_OPTS = ... -Xmx6144m ...
4.2.2 Load Time
The table below summarizes the load times of the Turtle files (in hh:mm:ss):

| 100M | 200M |
|---|---|
| 33:47 | 1:18:18 |
4.2.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 query mix runs (in QpS):