Chris Bizer

Andreas Schultz

Contents

  1. Introduction
  2. Benchmark Datasets
  3. Benchmark Machine
  4. Benchmark Results
    1. Jena TDB
    2. BigOWLIM
    3. Virtuoso - Triple Store
  5. Store Comparison
  6. Thanks


Document Version: 1.0
Publication Date: 11/30/2009


 

1. Introduction

The Berlin SPARQL Benchmark (BSBM) is a benchmark for comparing the performance of storage systems that expose SPARQL endpoints. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.

This document presents the results of a November 2009 BSBM experiment in which the Berlin SPARQL Benchmark was used to measure the performance of three RDF stores: Jena TDB (over Joseki), BigOWLIM 3.1, and the Virtuoso Open-Source Edition v5.0.11 triple store.

The stores were benchmarked with datasets of 100 million and 200 million triples.

This November 2009 BSBM experiment complements the March 2009 BSBM experiment, which compared triple stores, relational database-to-RDF wrappers, and SQL database management systems. The results of the March 2009 experiment are found here. In the November 2009 experiment, BigOWLIM was tested for the first time; TDB was measured again because of significant speed-ups since March 2009; and Virtuoso was included for comparison, having been the fastest triple store in the previous BSBM experiment.

 

 

2. Benchmark Datasets

We ran the benchmark using the Triple version of the BSBM dataset (benchmark scenario NTR). The benchmark was run for different dataset sizes. The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification.

Details about the benchmark datasets are summarized in the following table:

                                 Number of Triples
                                 100M          200M
Number of Products               284,826       570,000
Number of Producers              5,618         11,240
Number of Product Features       47,884        94,259
Number of Product Types          2,011         3,949
Number of Vendors                2,854         5,710
Number of Offers                 5,696,520     11,400,000
Number of Reviewers              146,054       292,271
Number of Reviews                2,848,260     5,700,000
Total Number of Instances        9,034,027     18,077,429
Exact Total Number of Triples    100,000,112   200,031,413
File Size Turtle (unzipped)      8.5 GB        18 GB

Note: All datasets were generated with the -fc option for forward chaining.

There is an RDF triple representation and a relational representation of the benchmark datasets. Both representations can be downloaded below:

Download Turtle Representation of the Benchmark Datasets

Important: Test Driver data for all datasets:
(If you generate the datasets yourself, the Test Driver data is generated automatically in the directory "td_data".)

        Download Test Driver data

 


 

3. Benchmark Machine

The benchmarks were run on the same machine as the March 2009 experiments. This machine has the following specification:

 


 

4. Benchmark Results

This section reports the results of running the BSBM benchmark against three RDF stores.

Test Procedure

The load performance of the systems was measured by loading the Turtle representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.

The query performance of the systems was measured by running 500 BSBM query mixes (12,500 queries altogether, as each mix contains 25 queries) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) ran on the same machine in order to reduce the influence of network latency. In order to measure the sustainable performance of the SUTs, a ramp-up period was executed before the actual test runs.

We applied the following test procedure to each store:

  1. Load data into the store.
  2. Shut down the store, clear OS caches, restart the store.
  3. Run ramp-up.
  4. Execute single-client test run (500 query mixes, randomizer seed 808080).
  5. Execute multiple-client test run (4 clients, 500 query mixes, randomizer seed 863528).
  6. Execute test run with the reduced query mix (repeat steps 2 to 4 with the reduced query mix and a different randomizer seed, 919191).

The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
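To illustrate the measurement setup, the following minimal sketch times a single-client run over the SPARQL protocol. It is not the BSBM test driver; the endpoint URL, the placeholder query mix, and the omission of error handling are assumptions for illustration only (Java 11+).

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class SparqlTimingSketch {

    // Placeholder endpoint URL; each SUT exposes its own SPARQL endpoint.
    private static final String ENDPOINT = "http://localhost:8890/sparql";

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();

        // Placeholder "query mix"; the real BSBM mix contains 25 parameterized
        // queries drawn from the 12 query templates.
        List<String> queryMix = List.of(
                "SELECT * WHERE { ?s ?p ?o } LIMIT 10");

        int mixes = 500;   // number of query mixes, as in the benchmark runs
        long start = System.nanoTime();
        for (int i = 0; i < mixes; i++) {
            for (String query : queryMix) {
                String url = ENDPOINT + "?query="
                        + URLEncoder.encode(query, StandardCharsets.UTF_8);
                HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                        .header("Accept", "application/sparql-results+xml")
                        .GET()
                        .build();
                // Responses are read but not inspected; only timing matters here.
                client.send(request, HttpResponse.BodyHandlers.ofString());
            }
        }
        double hours = (System.nanoTime() - start) / 3.6e12;
        System.out.printf("QMpH: %.1f%n", mixes / hours);
    }
}

The actual test driver additionally draws query parameters using the randomizer seed and, for the multiple-client runs, issues query mixes from several clients in parallel.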

An overview of the load times for the SUTs and the different datasets is given in the following table (in hh:mm:ss):

SUT          100M       200M
Jena TDB     01:42:45   06:14:41
BigOWLIM     00:33:47   01:18:18
VirtuosoTS   07:43:39   48:41:11


4.1 TDB over Joseki


Jena TDB homepage

4.1.1 Configuration

The following changes were made to the default configuration of the software:


4.1.2 Load Time

The table below summarizes the load times of the Turtle files (in hh:mm:ss):

100M       200M
01:42:45   06:14:41
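As a rough sketch (not necessarily the loading procedure used in the experiment), a Turtle dump can be bulk-loaded into a TDB dataset through the Jena API. Directory and file names below are placeholders, and package names follow current Apache Jena releases rather than the 2009 TDB distribution.

import org.apache.jena.query.Dataset;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.tdb.TDBFactory;

public class TdbLoadSketch {
    public static void main(String[] args) {
        // Placeholder directory for the persistent TDB store.
        Dataset dataset = TDBFactory.createDataset("/data/tdb-bsbm");
        long start = System.currentTimeMillis();
        // Placeholder file name for the generated BSBM Turtle dump.
        RDFDataMgr.read(dataset.getDefaultModel(), "dataset_100m.ttl");
        dataset.close();
        System.out.printf("Load time: %d s%n",
                (System.currentTimeMillis() - start) / 1000);
    }
}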



4.1.3 Benchmark Query results: QpS (Queries per Second)

The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):


  100m 200m
Query 1 69.4 29.2
Query 2 46.7 32.3
Query 3 62.1 30.6
Query 4 44.1 20.0
Query 5 1.2 0.7
Query 6 0.3 0.1
Query 7 5.8 3.0
Query 8 9.9 6.1
Query 9 1.9 1.0
Query 10 10.6 5.9
Query 11 20.0 15.1
Query 12 1.9 1.0

 4.1.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs

For the 100M and 200M datasets we ran test runs with one and with four clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.
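For interpretation: QMpH counts how many complete query mixes the store processes per hour of wall-clock time, roughly QMpH = 3600 * (query mixes completed by all clients) / (total runtime in seconds); the exact accounting (ramp-up exclusion, aggregation across clients) follows the BSBM specification. As a rough cross-check against the table below, a single client completing 500 mixes in about 74 minutes corresponds to 500 / (74/60) ≈ 405 QMpH, consistent with the 100M single-client figure.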


Clients   1     4
100M      407   524
200M      210   250

4.1.5 Result Summaries


4.1.6 Run Logs (detailed information)

 

4.2 BigOWLIM 3.1


BigOWLIM homepage

4.2.1 Configuration

The following changes were made to the default configuration of the software:

Store Config File
JAVA_OPTS = ... -Xmx6144m ...
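For reference, -Xmx6144m raises the maximum Java heap available to BigOWLIM to 6144 MB (6 GB).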

 

4.2.2 Load Time

The table below summarizes the load times of the Turtle files (in hh:mm:ss):

100M       200M
00:33:47   01:18:18

 

4.2.3 Benchmark Query results: QpS (Queries per Second)

The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):


  100m 200m
Query 1 42.4 14.2
Query 2 71.0 29.0
Query 3 52.5 17.6
Query 4 31.6 12.2
Query 5 1.8 1.0
Query 6 0.4 0.2
Query 7 6.7 3.7
Query 8 7.8 4.2
Query 9 32.0 11.9
Query 10 19.9 9.2
Query 11 13.6 10.5
Query 12 21.7 15.2

4.2.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs

For the 100M and 200M datasets we ran test runs with one and with four clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.


Clients   1     4
100M      835   1486
200M      416   709

 

4.2.5 Result Summaries

4.2.6 Run Logs (detailed information)

 

4.3 Virtuoso Open-Source Edition v5.0.11 (Triple Store)


Virtuoso homepage


4.3.1 Configuration

The following changes were made to the default configuration of the software:

MaxCheckpointRemap = 1000000
NumberOfBuffers = 520000
MaxMemPoolSize = 0
StopCompilerWhenXOverRunTime = 1
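For orientation, Virtuoso caches the database in 8 KB page buffers, so NumberOfBuffers = 520000 corresponds to roughly 520,000 x 8 KB ≈ 4 GB of RAM dedicated to the page cache.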

4.3.2 Load Time

The table below summarizes the load times of the Turtle files (in hh:mm:ss):

100M       200M
07:43:39   48:41:11

4.3.3 Benchmark Query results: QpS (Queries per Second)

The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):


  100m 200m
Query 1 144.6 104.8
Query 2 42.4 35.2
Query 3 136.7 104.4
Query 4 49.6 40.0
Query 5 6.0 3.4
Query 6 0.5 0.3
Query 7 4.3 2.3
Query 8 11.6 5.8
Query 9 53.6 31.0
Query 10 6.4 3.8
Query 11 37.8 26.9
Query 12 33.2 23.4

4.3.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs

For the 100M and 200M datasets we ran test runs with one and with four clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.


Clients   1     4
100M      936   1914
200M      495   914


4.3.5 Result Summaries


4.3.6 Run Logs (detailed information)  

 

5. Store Comparison

This section compares the SPARQL query performance of the different stores.

5.1 Query Mixes per Hour

Running 500 query mixes against the different stores resulted in the following performance numbers (in QMpH). The best performance figure for each dataset size is set in bold in the tables.


5.1.1 QMpH: Complete Query Mix

The complete query mix is given here.

  Jena TDB BigOWLIM VirtuosoTS
100m 406.9 834.9 936.4
200m 209.5 416.2 495.9

A much more detailed view of the results for the complete query mix is given under Detailed Results For The Complete-Query-Mix Benchmark Run.

5.1.2 QMpH: Reduced Query Mix

The reduced query mix consists of the same query sequence as the complete mix but without queries 5 and 6. The two queries were excluded as they alone consumed a large portion of the overall query execution time for bigger dataset sizes.

  Jena TDB BigOWLIM VirtuosoTS
100m 968.5 2822.4 1957.1
200m 431.2 1397.9 1122.2

A much more detailed view of the results for the reduced query mix is given under Detailed Results For The Reduced-Query-Mix Benchmark Run.

5.2 Detailed Results For The Complete-Query-Mix Benchmark Run

The details of running the complete query mix are given here. There are two different views:

5.2.1 Queries per Second by Query and Dataset Size

Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set in bold in the tables.


Query 1

  Jena TDB BigOWLIM VirtuosoTS
100m 69.4 42.4 144.6
200m 29.2 14.2 104.8

Query 2

  Jena TDB BigOWLIM VirtuosoTS
100m 46.7 71.0 42.4
200m 32.3 29.0 35.2

Query 3

  Jena TDB BigOWLIM VirtuosoTS
100m 62.1 52.5 136.7
200m 30.6 17.6 104.4

Query 4

  Jena TDB BigOWLIM VirtuosoTS
100m 44.1 31.6 49.6
200m 20.0 12.2 40.0

Query 5

  Jena TDB BigOWLIM VirtuosoTS
100m 1.2 1.8 6.0
200m 0.7 1.0 3.4

Query 6

  Jena TDB BigOWLIM VirtuosoTS
100m 0.3 0.4 0.5
200m 0.1 0.2 0.3

Query 7

  Jena TDB BigOWLIM VirtuosoTS
100m 5.8 6.7 4.3
200m 3.0 3.7 2.3

Query 8

  Jena TDB BigOWLIM VirtuosoTS
100m 9.9 7.8 11.6
200m 6.1 4.2 5.8

Query 9

  Jena TDB BigOWLIM VirtuosoTS
100m 1.9 32.0 53.6
200m 1.0 11.9 31.0

Query 10

  Jena TDB BigOWLIM VirtuosoTS
100m 10.6 19.9 6.4
200m 5.9 9.2 3.8

Query 11

  Jena TDB BigOWLIM VirtuosoTS
100m 20.0 13.6 37.8
200m 15.1 10.5 26.9

Query 12

  Jena TDB BigOWLIM VirtuosoTS
100m 1.9 21.7 33.2
200m 1.0 15.2 23.4

5.2.2 Queries per Second by Dataset Size and Query

Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set in bold in the tables.

 

100M Triples Dataset

  Jena TDB BigOWLIM VirtuosoTS
Query 1 69.4 42.4 144.6
Query 2 46.7 71.0 42.4
Query 3 62.1 52.5 136.7
Query 4 44.1 31.6 49.6
Query 5 1.2 1.8 6.0
Query 6 0.3 0.4 0.5
Query 7 5.8 6.7 4.3
Query 8 9.9 7.8 11.6
Query 9 1.9 32.0 53.6
Query 10 10.6 19.9 6.4
Query 11 20.0 13.6 37.8
Query 12 1.9 21.7 33.2

200M Triples Dataset

  Jena TDB BigOWLIM VirtuosoTS
Query 1 29.2 14.2 104.8
Query 2 32.3 29.0 35.2
Query 3 30.6 17.6 104.4
Query 4 20.0 12.2 40.0
Query 5 0.7 1.0 3.4
Query 6 0.1 0.2 0.3
Query 7 3.0 3.7 2.3
Query 8 6.1 4.2 5.8
Query 9 1.0 11.9 31.0
Query 10 5.9 9.2 3.8
Query 11 15.1 10.5 26.9
Query 12 1.0 15.2 23.4

 

5.3 Detailed Results For The Reduced-Query-Mix Benchmark Run

The details of running the reduced query mix are given here. There are two different views:

5.3.1 Queries per Second by Query and Dataset Size (reduced query mix)

Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set in bold in the tables.

Query 1

  Jena TDB BigOWLIM VirtuosoTS
100m 21.9 21.2 37.2
200m 8.4 8.1 14.4

Query 2

  Jena TDB BigOWLIM VirtuosoTS
100m 40.9 58.1 34.9
200m 19.7 25.5 23.9

Query 3

  Jena TDB BigOWLIM VirtuosoTS
100m 24.1 30.8 33.5
200m 7.7 10.4 6.7

Query 4

  Jena TDB BigOWLIM VirtuosoTS
100m 12.7 15.8 13.8
200m 5.0 5.6 5.5

Query 5

Not executed.

Query 6

Not executed.

Query 7

  Jena TDB BigOWLIM VirtuosoTS
100m 6.4 7.9 4.4
200m 3.1 4.2 2.6

Query 8

  Jena TDB BigOWLIM VirtuosoTS
100m 11.2 9.1 12.0
200m 6.4 5.5 6.3

Query 9

  Jena TDB BigOWLIM VirtuosoTS
100m 2.1 42.1 55.8
200m 0.9 14.3 34.4

Query 10

  Jena TDB BigOWLIM VirtuosoTS
100m 12.0 21.6 6.0
200m 6.5 11.1 4.1

Query 11

  Jena TDB BigOWLIM VirtuosoTS
100m 20.2 14.4 39.3
200m 12.4 10.5 30.5

Query 12

  Jena TDB BigOWLIM VirtuosoTS
100m 2.0 22.8 31.3
200m 0.9 14.3 20.7

5.3.2 Queries per Second by Dataset Size and Query (reduced query mix)

Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set in bold in the tables.

100m

  Jena TDB BigOWLIM VirtuosoTS
Query 1 21.9 21.2 37.2
Query 2 40.9 58.1 34.9
Query 3 24.1 30.8 33.5
Query 4 12.7 15.8 13.8
Query 5 not executed not executed not executed
Query 6 not executed not executed not executed
Query 7 6.4 7.9 4.4
Query 8 11.2 9.1 12.0
Query 9 2.1 42.1 55.8
Query 10 12.0 21.6 6.0
Query 11 20.2 14.4 39.3
Query 12 2.0 22.8 31.3

200m

  Jena TDB BigOWLIM VirtuosoTS
Query 1 8.4 8.1 14.4
Query 2 19.7 25.5 23.9
Query 3 7.7 10.4 6.7
Query 4 5.0 5.6 5.5
Query 5 not executed not executed not executed
Query 6 not executed not executed not executed
Query 7 3.1 4.2 2.6
Query 8 6.4 5.5 6.3
Query 9 0.9 14.3 34.4
Query 10 6.5 11.1 4.1
Query 11 12.4 10.5 30.5
Query 12 0.9 14.3 20.7


 


6. Thanks

Lots of thanks to

Please send comments and feedback about the benchmark to Chris Bizer and Andreas Schultz.