Contents
- Introduction
- Benchmark Datasets
- Benchmark Machine
- Benchmark Results for the Explore Use Case
- Benchmark Results for the Explore and Update Use Case
- Store Comparison
- Experiences with the Business Intelligence use case
- Thanks
Document Version: 1.0
Publication Date: 02/22/2011
1. Introduction
The Berlin SPARQL Benchmark (BSBM) is a benchmark for comparing the performance of storage systems that expose SPARQL endpoints. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.
This document presents the results of a February 2011 BSBM experiment in which the Berlin SPARQL Benchmark Version 3 was used to measure the performance of:
- 4store (version 1.1.2)
- BigData (rev. 4169)
- BigOwlim (version 3.4.3129)
- TDB (version 0.8.9)
- Virtuoso (version 7.00.3200-pthreads for Linux as of Jan 25 2011)
The stores were benchmarked with datasets of 100 million and 200 million triples.
2. Benchmark Datasets
We ran the benchmark using the Triple version of the BSBM dataset. The benchmark was run for different dataset sizes. The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification.
Details about the benchmark datasets are summarized in the following table:

Characteristic | 100M | 200M |
---|---|---|
Number of Products | 284,826 | 570,000 |
Number of Producers | 5,618 | 11,240 |
Number of Product Features | 47,884 | 94,259 |
Number of Product Types | 2,011 | 3,949 |
Number of Vendors | 2,896 | 5,758 |
Number of Offers | 5,696,520 | 11,400,000 |
Number of Reviewers | 146,093 | 292,095 |
Number of Reviews | 2,848,260 | 5,700,000 |
Total Number of Instances | 9,034,108 | 18,077,301 |
Exact Total Number of Triples | 100,000,748 | 200,031,975 |
File Size Turtle (unzipped) | 8.7 GB | 18 GB |
Note: All datasets were generated with the -fc option for forward chaining.
The BSBM dataset generator and test driver can be downloaded from SourceForge.
The RDF representation of the benchmark datasets can be generated in the following way. To generate the 100m dataset as a Turtle file, type the following command in the BSBM directory:
./generate -fc -s ttl -fn dataset_100m -pc 284826
Variations:
* generate N-Triples instead of Turtle:
use -s nt instead of -s ttl
* generate update dataset for the Explore and Update use case:
add -ud
* Generate multiple files instead of one, for example 100 files:
add -nof 100
* Write test driver data to a different directory (default is td_data), for example for the 100m dataset:
add -dir td_data_100m
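Combining several of these options, the 200m dataset could, for example, be generated as N-Triples split over 200 files, together with the update dataset and its own test driver data directory. The output file name here is illustrative, and the product count is taken from the dataset table above:
./generate -fc -s nt -fn dataset_200m -pc 570000 -ud -nof 200 -dir td_data_200m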
3. Benchmark Machine
We used a machine with the following specification for the benchmark experiment:
- Hardware:
- Processors: Intel i7 950, 3.07GHz (4 cores)
- Memory: 24GB
- Hard Disks: 2 x 1.8TB (7,200 rpm) SATA2
- Software:
- Operating System: Ubuntu 10.04 64-bit, Kernel 2.6.32-24-generic
- Filesystem: ext4
- Separate partitions for application data and databases.
- Java Version and JVM: Version 1.6.0_20, OpenJDK 64-Bit Server VM (build 19.0-b09).
- BSBM generator and test driver version: bsbmtools-v0.2
4. Benchmark Results for the Explore Use Case
This section reports the results of running the Explore use case of the BSBM benchmark against:
- 4store (version 1.1.2)
- BigData (rev. 4169)
- BigOwlim (version 3.4.3129)
- TDB (version 0.8.9)
- Virtuoso (version 7.00.3200-pthreads for Linux as of Jan 25 2011)
The load performance of the systems was measured by loading the Turtle representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.
The query performance of the systems was measured by running 500 BSBM query mixes (altogether 12,500 queries for the Explore use case) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to measure sustainable performance of the SUTs, a ramp-up period is executed before the actual test runs.
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store (optionally clear OS caches and swap), restart the store.
- Run ramp-up (randomizer seed: 1212123, at least 8000 query mixes).
- Execute single-client test run (500 mixes performance measurement, randomizer seed: 9834533)
- Execute multi-client runs (4, 8, and 64 clients; randomizer seeds: 8188326, 9175932, and 4187411). For each run, the number of warm-up query mixes is twice the number of clients.
The ramp-up and single-client runs were executed with:
./testdriver -rampup -seed 1212123 http://sparql-endpoint
./testdriver -seed 9834533 http://sparql-endpoint
For example, for a run with 4 clients, execute:
./testdriver -seed 8188326 -w 8 -mt 4 -o results_4clients.xml http://sparql-endpoint
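Following the same pattern (a warm-up of twice the number of clients, and the seeds listed above), the 8- and 64-client runs would presumably be invoked as follows; the output file names are illustrative:
./testdriver -seed 9175932 -w 16 -mt 8 -o results_8clients.xml http://sparql-endpoint
./testdriver -seed 4187411 -w 128 -mt 64 -o results_64clients.xml http://sparql-endpoint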
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
An overview of the load times for the SUTs and the different datasets is given in the following table (in hh:mm:ss):
SUT | 100M | 200M |
---|---|---|
4store | 26:42* | 1:12:04* |
BigData | 1:03:47 | 3:24:25 |
BigOwlim | 17:22 | 38:36 |
TDB | 1:14:48 | 2:45:13 |
Virtuoso | 1:49:26** | 3:59:38** |
* The N-Triples version of the dataset was used.
** The dataset was split into 100 and 200 Turtle files, respectively, and loaded consecutively with the DB.DBA.TTLP function.
4.1 4store
4.1.1 Configuration
The following changes were made to the default configuration of the software:
- 4store: Version 1.1.2
Command to setup the database:
4s-backend-setup --node 0 --cluster 1 --segments 8 bsbm
Command to start the SPARQL endpoint:
4s-httpd -p 8000 -s -1 bsbm
Raptor2 version: 2.0.0
Rasqal version: 0.9.24
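The report does not show the data import command for 4store. Assuming the standard 4s-import tool and the N-Triples split mentioned in the load-time overview, the import presumably looked roughly like this (the file name is illustrative):
# assumption: the backend for the kb "bsbm" has been set up and started
4s-import bsbm dataset_100m.nt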
4.1.2 Load Time
The table below summarizes the load times for the N-Triples version of the dataset (in hh:mm:ss):

100M | 200M |
---|---|
26:42 | 1:12:04 |
4.1.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 117.6 | 145.3 |
Query 2 | 49.0 | 55.7 |
Query 3 | 102.4 | 122.9 |
Query 4 | 43.4 | 62.9 |
Query 5 | 7.8 | 5.0 |
Query 6 | not executed | not executed |
Query 7 | 41.3 | 49.3 |
Query 8 | 49.1 | 57.1 |
Query 9 | 233.0 | 117.3 |
Query 10 | 49.2 | 52.8 |
Query 11 | 145.3 | 33.3 |
Query 12 | 46.5 | 40.4 |
4.1.4 Benchmark Overall results: QMpH for the 100M and 200M datasets
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients; because of technical problems when testing 4store with multiple clients (see Section 6.2), only the single-client results are reported here. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | Single client |
---|---|
100M | 5589 |
200M | 4593 |
4.1.5 Result Summaries
- 4store 100M: result summary for the single-client run
- 4store 200M: result summary for the single-client run
4.1.6 Run Logs (detailed information)
- 4store run logs for 100M: single-client run
- 4store run logs for 200M: single-client run
4.2 BigData
4.2.1 Configuration
The following changes were made to the default configuration of the software:
- BigData: Version rev. 4169
For loading and starting the server the ANT script in the directory "bigdata-perf/bsbm3" was used.
4.2.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
1:03:47 | 3:24:25 |
4.2.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 64.2 | 64.4 |
Query 2 | 33.6 | 35.3 |
Query 3 | 12.4 | 14.4 |
Query 4 | 38.4 | 40.0 |
Query 5 | 2.3 | 1.7 |
Query 6 | not executed | not executed |
Query 7 | 31.3 | 21.9 |
Query 8 | 48.5 | 14.1 |
Query 9 | 54.8 | 44.6 |
Query 10 | 61.6 | 28.8 |
Query 11 | 43.8 | 26.7 |
Query 12 | 54.8 | 31.1 |
4.2.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 2428 | 4153 | 4286 | 4136 |
200M | 1795 | 3040 | 3167 | 2689 |
4.2.5 Result Summaries
- BigData 100M: result summaries for the single-client and multi-client runs
- BigData 200M: result summaries for the single-client and multi-client runs
4.2.6 Run Logs (detailed information)
- BigData run logs for 100M: single-client and multi-client runs
- BigData run logs for 200M: single-client and multi-client runs
4.3 BigOwlim
4.3.1 Configuration
The following changes were made to the default configuration of the software:
- BigOwlim: Version 3.4.3129
- Tomcat: Version 6.0.24
Modified heap size:
CATALINA_OPTS="-Xms512m -Xmx8G"
Modified database location:
JAVA_OPTS='-Dinfo.aduna.platform.appdata.basedir=/database/tomcat'
- Sesame: Version 2.3.2
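As a minimal sketch (not taken from the report) of where the Tomcat settings above could live, assuming they are placed in Tomcat's bin/setenv.sh:
# bin/setenv.sh -- hypothetical placement; values copied from the configuration above
CATALINA_OPTS="-Xms512m -Xmx8G"
JAVA_OPTS='-Dinfo.aduna.platform.appdata.basedir=/database/tomcat'
export CATALINA_OPTS JAVA_OPTS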
4.3.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
17:22 | 38:36 |
4.3.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 112.5 | 45.0 |
Query 2 | 159.3 | 111.6 |
Query 3 | 125.0 | 48.1 |
Query 4 | 97.9 | 37.8 |
Query 5 | 3.0 | 1.8 |
Query 6 | not executed | not executed |
Query 7 | 32.6 | 11.0 |
Query 8 | 38.0 | 12.6 |
Query 9 | 141.8 | 67.9 |
Query 10 | 48.5 | 22.4 |
Query 11 | 51.3 | 18.7 |
Query 12 | 65.4 | 29.1 |
4.3.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 3534 | 9349 | 12798 | 15285 |
200M | 1795 | 3713 | 4041 | 3622 |
4.3.5 Result Summaries
- BigOwlim 100M: result summaries for the single-client and multi-client runs
- BigOwlim 200M: result summaries for the single-client and multi-client runs
4.3.6 Run Logs (detailed information)
- BigOwlim run logs for 100M: single-client and multi-client runs
- BigOwlim run logs for 200M: single-client and multi-client runs
4.4 TDB
4.4.1 Configuration
The following changes were made to the default configuration of the software:
- TDB: Version 0.8.9
Loading was done with tdbloader2.
Statistics for the BGP optimizer were generated with the "tdbconfig stats" command and copied into the database directory (a command sketch is given after this list).
- Fuseki: Version 0.1.0
Started server with: ./fuseki-server --loc /database/tdb /bsbm
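The exact command lines are not given in the report; under the stated setup (tdbloader2, "tdbconfig stats", database directory /database/tdb), the loading and statistics steps presumably looked roughly like this (the dataset file name is illustrative):
./tdbloader2 --loc /database/tdb dataset_100m.ttl
./tdbconfig stats --loc /database/tdb > /database/tdb/stats.opt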
4.4.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
1:14:48 | 2:45:13 |
4.4.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 75.1 | 62.2 |
Query 2 | 41.0 | 44.1 |
Query 3 | 82.2 | 66.1 |
Query 4 | 62.1 | 47.3 |
Query 5 | 2.0 | 1.2 |
Query 6 | not executed | not executed |
Query 7 | 22.6 | 15.0 |
Query 8 | 24.4 | 15.9 |
Query 9 | 124.6 | 97.1 |
Query 10 | 33.5 | 26.4 |
Query 11 | 30.0 | 23.4 |
Query 12 | 33.3 | 28.3 |
4.4.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 2274 | 4065 | 3035 | 2242 |
200M | 1443 | 2206 | 1474 | * |

* TDB crashed for the 200m dataset with 64 clients because of a bug that has since been fixed.
4.4.5 Result Summaries
- TDB 100M: result summaries for the single-client and multi-client runs
- TDB 200M: result summaries for the single-client and multi-client runs
4.4.6 Run Logs (detailed information)
- TDB run logs for 100M: single-client and multi-client runs
- TDB run logs for 200M: single-client and multi-client runs
4.5 Virtuoso
4.5.1 Configuration
The following changes were made to the default configuration of the software:
- Virtuoso: Version 7.00.3200-pthreads for Linux as of Jan 25 2011
Loading of datasets:
The loading was done with the TTLP (single-threaded) function by loading splits of the dataset consecutively (a sketch is given below).
For the 100m dataset 100 files were generated; for the 200m dataset, 200.
For the configuration see the "virtuoso.ini" file.
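As a rough sketch (not taken from the report) of this loading procedure, each split can be loaded through isql with the built-in DB.DBA.TTLP function; the file and graph names below are illustrative:
-- repeat for each of the 100 (resp. 200) split files
DB.DBA.TTLP (file_to_string_output ('dataset_100m_part001.ttl'), '', 'http://example.org/bsbm');
checkpoint;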
4.5.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
1:49:26 | 3:59:38 |
4.5.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 200.7 | 163.0 |
Query 2 | 71.1 | 73.8 |
Query 3 | 201.4 | 195.4 |
Query 4 | 103.9 | 94.8 |
Query 5 | 15.2 | 9.3 |
Query 6 | not executed | not executed |
Query 7 | 24.9 | 15.4 |
Query 8 | 54.0 | 22.5 |
Query 9 | 379.1 | 160.1 |
Query 10 | 113.7 | 69.9 |
Query 11 | 73.6 | 39.5 |
Query 12 | 68.0 | 39.8 |
4.5.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 7352 | 25194 | 36269 | 18008 |
200M | 4669 | 13265 | 18264 | 16564 |
4.5.5 Result Summaries
- Virtuoso 100M: result summaries for the single-client and multi-client runs
- Virtuoso 200M: result summaries for the single-client and multi-client runs
4.5.6 Run Logs (detailed information)
- Virtuoso run logs for 100M: single-client and multi-client runs
- Virtuoso run logs for 200M: single-client and multi-client runs
5. Benchmark Results for the Explore and Update Use Case
This section reports the results of running the Explore and Update use case of the BSBM benchmark against:
- 4store (version 1.1.2)
- BigOwlim (version 3.4.3129)
- TDB (version 0.8.9)
BigData and Virtuoso are not listed here for different reasons: BigData does not yet provide all SPARQL features that are required to run the update query mix, and when executing the update mix on Virtuoso we ran into technical problems that we are still working on solving together with the OpenLink team.
The load performance of the systems was measured by loading the Turtle or N-Triples representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.
The query performance of the systems was measured by running 500 BSBM query mixes (altogether 15,000 queries from the Explore and Update query mixes) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to measure sustainable performance of the SUTs, a ramp-up period is executed before the actual test runs.
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store, clear OS caches and swap, restart the store.
- Run ramp-up (randomizer seed: 1212123, at least 8000 query mixes).
- Execute single-client test run (500 mixes performance measurement, randomizer seed: 9834533)
The ramp-up and measurement runs were executed with:
./testdriver -ucf usecases/exploreAndUpdate/sparql.txt -rampup -seed 1212123 http://sparql-endpoint
./testdriver -ucf usecases/exploreAndUpdate/sparql.txt -udataset dataset_update.nt -seed 9834533 -u http://sparql-endpoint-update-service http://sparql-endpoint
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
5.1 4store
5.1.1 Configuration
The following changes were made to the default configuration of the software:
- 4store: Version 1.1.2
Command to setup the database:
4s-backend-setup --node 0 --cluster 1 --segments 8 bsbm
Command to start the SPARQL endpoint:
4s-httpd -p 8000 -s -1 bsbm
Raptor2 version: 2.0.0
Rasqal version: 0.9.24
- Queries:
Update Query 2 in queries/update/query2.txt was changed to the following because the DELETE WHERE syntax was not supported:
DELETE
{ %Offer% ?p ?o }
WHERE
{ %Offer% ?p ?o }
5.1.2 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m |
---|---|
Upd. Query 1 | 20.3 |
Upd. Query 2 | 70.4 |
Exp. Query 1 | 116.4 |
Exp. Query 2 | 66.7 |
Exp. Query 3 | 141.7 |
Exp. Query 4 | 59.6 |
Exp. Query 5 | 8.1 |
Exp. Query 6 | not executed |
Exp. Query 7 | 53.0 |
Exp. Query 8 | 67.1 |
Exp. Query 9 | 351.5 |
Exp. Query 10 | 68.8 |
Exp. Query 11 | 192.6 |
Exp. Query 12 | 62.1 |
5.1.3 Benchmark Overall result: QMpH for the 100M dataset
The result is given in Query Mixes per Hour (QMpH); larger numbers are better. For the 100M dataset, 4store achieved 5311 QMpH (see Section 6.1.2).
5.1.4 Result Summaries
- 4store 100M: result summary for the single-client run
5.1.5 Run Logs (detailed information)
- 4store run logs for 100M: single-client run
5.2 BigOwlim
5.2.1 Configuration
The following changes were made to the default configuration of the software:
- BigOwlim: Version 3.4.3129
- Tomcat: Version 6.0.24
Modified heap size:
CATALINA_OPTS="-Xms512m -Xmx8G"
Modified database location:
JAVA_OPTS='-Dinfo.aduna.platform.appdata.basedir=/database/tomcat'
- Joseki: Version 3.4.3
For configuration see the joseki-config.ttl file.
Changes to bin/rdfserver:
JAVA_ARGS="-server -Xmx8G -Dcache-memory=5G -Druleset=empty"
- Test driver:
Add the following to the parameter list for the testdriver execution (a complete invocation is sketched below):
-uqp request
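Putting this together with the test driver invocation shown at the beginning of Section 5, the single-client measurement run would then presumably be started along these lines (endpoint URLs are placeholders, as above):
./testdriver -ucf usecases/exploreAndUpdate/sparql.txt -udataset dataset_update.nt -uqp request -seed 9834533 -u http://sparql-endpoint-update-service http://sparql-endpoint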
5.2.2 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m |
---|---|
Upd. Query 1 | 19.4 |
Upd. Query 2 | 30.5 |
Exp. Query 1 | 222.0 |
Exp. Query 2 | 57.8 |
Exp. Query 3 | 278.6 |
Exp. Query 4 | 196.2 |
Exp. Query 5 | 2.6 |
Exp. Query 6 | not executed |
Exp. Query 7 | 44.3 |
Exp. Query 8 | 51.6 |
Exp. Query 9 | 292.6 |
Exp. Query 10 | 65.6 |
Exp. Query 11 | 64.4 |
Exp. Query 12 | 63.6 |
5.2.3 Benchmark Overall result: QMpH for the 100M dataset
The result is given in Query Mixes per Hour (QMpH); larger numbers are better. For the 100M dataset, BigOwlim achieved 2809 QMpH (see Section 6.1.2).
5.2.4 Result Summaries
- BigOwlim 100M: result summary for the single-client run
5.2.5 Run Logs (detailed information)
- BigOwlim run logs for 100M: single-client run
5.3 TDB
5.3.1 Configuration
The following changes were made to the default configuration of the software:
- TDB: Version 0.8.9
Loading was done with bin/tdbloader2
Statistics for the BGP optimizer were generated with the "bin/tdbconfig stats" command. The resulting "stats.opt" file has to be copied into the database directory.
- Fuseki: Version 0.1.0
Started server with: ./fuseki-server --update --loc /database/tdb /bsbm
- Test driver:
Add the following to the parameter list for the testdriver execution:
-uqp request
5.3.2 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m |
---|---|
Upd. Query 1 | 0.7 |
Upd. Query 2 | 2.4 |
Exp. Query 1 | 55.5 |
Exp. Query 2 | 39.1 |
Exp. Query 3 | 84.5 |
Exp. Query 4 | 56.7 |
Exp. Query 5 | 2.0 |
Exp. Query 6 | not executed |
Exp. Query 7 | 68.9 |
Exp. Query 8 | 82.3 |
Exp. Query 9 | 141.0 |
Exp. Query 10 | 125.3 |
Exp. Query 11 | 78.5 |
Exp. Query 12 | 112.6 |
5.3.3 Benchmark Overall result: QMpH for the 100M dataset
The result is given in Query Mixes per Hour (QMpH); larger numbers are better. For the 100M dataset, TDB achieved 680 QMpH (see Section 6.1.2).
5.3.4 Result Summaries
- TDB 100M: result summary for the single-client run
5.3.5 Run Logs (detailed information)
- TDB run logs for 100M: single-client run
6. Store Comparison
This section compares the SPARQL query performance of the different stores.
6.1 Query Mixes per Hour for Single Clients
Running 500 query mixes against the different stores resulted in the following performance numbers (in QMpH). The best performance figure for each dataset size is set bold in the tables.
6.1.1 QMpH: Explore use case
The complete Explore query mix is defined in the BSBM specification.
Store | 100m | 200m |
---|---|---|
4store | 5589 | 4593 |
BigData | 2428 | 1795 |
BigOwlim | 3534 | 1795 |
TDB | 2274 | 1443 |
Virtuoso | 7352 | 4669 |
A much more detailed view of the results for the Explore use case is given under Detailed Results For The Explore-Query-Mix Benchmark Run.
6.1.2 QMpH: Explore and Update use case
The Explore and Update query mix consists of the Update query mix (queries 1 and 2) and the Explore query mix (queries 3 to 14).
Store | 100m |
---|---|
4store | 5311 |
BigOwlim | 2809 |
TDB | 680 |
A much more detailed view of the results for the Explore and Update use case is given under Detailed Results For The Explore-And-Update-Query-Mix Benchmark Run.
6.2 Query Mixes per Hour for Multiple Clients (Explore query mix only)
Dataset Size 100M (columns: number of clients):

Store | 1 | 4 | 8 | 64 |
---|---|---|---|---|
4store | 5589 | * | * | * |
BigData | 2428 | 4153 | 4286 | 4136 |
BigOwlim | 3534 | 9349 | 12798 | 15285 |
TDB | 2274 | 4065 | 3035 | 2242 |
Virtuoso | 7352 | 25194 | 36269 | 18008 |

Dataset Size 200M (columns: number of clients):

Store | 1 | 4 | 8 | 64 |
---|---|---|---|---|
4store | 4593 | * | * | * |
BigData | 1795 | 3040 | 3167 | 2689 |
BigOwlim | 1795 | 3713 | 4041 | 3622 |
TDB | 1443 | 2206 | 1474 | ** |
Virtuoso | 4669 | 13265 | 18264 | 16564 |
* We ran into technical problems while testing with multiple clients.
** TDB crashed for the 200m dataset with 64 clients because of a bug that has since been fixed.
6.3 Detailed Results For The Explore-Query-Mix Benchmark Run
The details of running the Explore query mix are given below in two different views:
6.3.1 Queries per Second by Query and Dataset Size
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set bold in the tables.
Query 1
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 117.6 | 64.2 | 112.5 | 75.1 | 200.7 |
200m | 145.3 | 64.4 | 45.0 | 62.2 | 163.0 |
Query 2
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 49.0 | 33.6 | 159.3 | 41.0 | 71.1 |
200m | 55.7 | 35.3 | 111.6 | 44.1 | 73.8 |
Query 3
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 102.4 | 12.4 | 125.0 | 82.2 | 201.4 |
200m | 122.9 | 14.4 | 48.1 | 66.1 | 195.4 |
Query 4
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 43.4 | 38.4 | 97.9 | 62.1 | 103.9 |
200m | 62.9 | 40.0 | 37.8 | 47.3 | 94.8 |
Query 5
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 7.8 | 2.3 | 3.0 | 2.0 | 15.2 |
200m | 5.0 | 1.7 | 1.8 | 1.2 | 9.3 |
Query 6
Not executed.
Query 7
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 41.3 | 31.3 | 32.6 | 22.6 | 24.9 |
200m | 49.3 | 21.9 | 11.0 | 15.0 | 15.4 |
Query 8
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 49.1 | 48.5 | 38.0 | 24.4 | 54.0 |
200m | 57.1 | 14.1 | 12.6 | 15.9 | 22.5 |
Query 9
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 233.0 | 54.8 | 141.8 | 124.6 | 379.1 |
200m | 117.3 | 44.6 | 67.9 | 97.1 | 160.1 |
Query 10
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 49.2 | 61.6 | 48.5 | 33.5 | 113.7 |
200m | 52.8 | 28.8 | 22.4 | 26.4 | 69.9 |
Query 11
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 145.3 | 43.8 | 51.3 | 30.0 | 73.6 |
200m | 33.3 | 26.7 | 18.7 | 23.4 | 39.5 |
Query 12
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 46.5 | 54.8 | 65.4 | 33.3 | 68.0 |
200m | 40.4 | 31.1 | 29.1 | 28.3 | 39.8 |
6.3.2 Queries per Second by Dataset Size and Query
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set bold in the tables.
100m
Query | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
Query 1 | 117.6 | 64.2 | 112.5 | 75.1 | 200.7 |
Query 2 | 49.0 | 33.6 | 159.3 | 41.0 | 71.1 |
Query 3 | 102.4 | 12.4 | 125.0 | 82.2 | 201.4 |
Query 4 | 43.4 | 38.4 | 97.9 | 62.1 | 103.9 |
Query 5 | 7.8 | 2.3 | 3.0 | 2.0 | 15.2 |
Query 6 | not executed | not executed | not executed | not executed | not executed |
Query 7 | 41.3 | 31.3 | 32.6 | 22.6 | 24.9 |
Query 8 | 49.1 | 48.5 | 38.0 | 24.4 | 54.0 |
Query 9 | 233.0 | 54.8 | 141.8 | 124.6 | 379.1 |
Query 10 | 49.2 | 61.6 | 48.5 | 33.5 | 113.7 |
Query 11 | 145.3 | 43.8 | 51.3 | 30.0 | 73.6 |
Query 12 | 46.5 | 54.8 | 65.4 | 33.3 | 68.0 |
200m
Query | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
Query 1 | 145.3 | 64.4 | 45.0 | 62.2 | 163.0 |
Query 2 | 55.7 | 35.3 | 111.6 | 44.1 | 73.8 |
Query 3 | 122.9 | 14.4 | 48.1 | 66.1 | 195.4 |
Query 4 | 62.9 | 40.0 | 37.8 | 47.3 | 94.8 |
Query 5 | 5.0 | 1.7 | 1.8 | 1.2 | 9.3 |
Query 6 | not executed | not executed | not executed | not executed | not executed |
Query 7 | 49.3 | 21.9 | 11.0 | 15.0 | 15.4 |
Query 8 | 57.1 | 14.1 | 12.6 | 15.9 | 22.5 |
Query 9 | 117.3 | 44.6 | 67.9 | 97.1 | 160.1 |
Query 10 | 52.8 | 28.8 | 22.4 | 26.4 | 69.9 |
Query 11 | 33.3 | 26.7 | 18.7 | 23.4 | 39.5 |
Query 12 | 40.4 | 31.1 | 29.1 | 28.3 | 39.8 |
6.4 Detailed Results For The Explore-and-Update-Query-Mix Benchmark Run
The Update and Explore parts of the Explore and Update query mix are defined in the BSBM specification. The results are given below in two different views:
6.4.1 Queries per Second by Query and Dataset Size (Explore and Update query mix)
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set bold in the tables.
Upd. Query 1
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 20.3 | 19.4 | 0.7 |
Upd. Query 2
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 70.4 | 30.5 | 2.4 |
Exp. Query 1
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 116.4 | 222.0 | 55.5 |
Exp. Query 2
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 66.7 | 57.8 | 39.1 |
Exp. Query 3
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 141.7 | 278.6 | 84.5 |
Exp. Query 4
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 59.6 | 196.2 | 56.7 |
Exp. Query 5
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 8.1 | 2.6 | 2.0 |
Exp. Query 6
Not executed.
Exp. Query 7
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 53.0 | 44.3 | 68.9 |
Exp. Query 8
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 67.1 | 51.6 | 82.3 |
Exp. Query 9
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 351.5 | 292.6 | 141.0 |
Exp. Query 10
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 68.8 | 65.6 | 125.3 |
Exp. Query 11
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 192.6 | 64.4 | 78.5 |
Exp. Query 12
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 62.1 | 63.6 | 112.6 |
6.4.2 Queries per Second by Dataset Size and Query (Explore and Update query mix)
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set bold in the tables.
100m
Query | 4store | BigOwlim | TDB |
---|---|---|---|
Upd. Query 1 | 20.3 | 19.4 | 0.7 |
Upd. Query 2 | 70.4 | 30.5 | 2.4 |
Exp. Query 1 | 116.4 | 222.0 | 55.5 |
Exp. Query 2 | 66.7 | 57.8 | 39.1 |
Exp. Query 3 | 141.7 | 278.6 | 84.5 |
Exp. Query 4 | 59.6 | 196.2 | 56.7 |
Exp. Query 5 | 8.1 | 2.6 | 2.0 |
Exp. Query 6 | not executed | not executed | not executed |
Exp. Query 7 | 53.0 | 44.3 | 68.9 |
Exp. Query 8 | 67.1 | 51.6 | 82.3 |
Exp. Query 9 | 351.5 | 292.6 | 141.0 |
Exp. Query 10 | 68.8 | 65.6 | 125.3 |
Exp. Query 11 | 192.6 | 64.4 | 78.5 |
Exp. Query 12 | 62.1 | 63.6 | 112.6 |
7. Experiences with the Business Intelligence use case
BigData and 4store currently do not provide all SPARQL features that are required to run the BI query mix. We thus tried to run the Business Intelligence use case of the Berlin SPARQL Benchmark only against Virtuoso, TDB, and BigOwlim. However, we ran into several technical problems that prevented us from finishing the tests and from publishing meaningful results. We thus decided to give the store vendors more time to fix and optimize their stores and will run the BI query mix experiment again in about four months (July 2011). For the next test runs, we will also modify query 4, because its quadratic complexity dominates the benchmark for larger datasets. We will circulate the updated BI query mix specification via the SPARQL developers mailing list in May 2011 and will ask the vendors for feedback on the specification.
8. Thanks
Thanks a lot to Orri Erling for proposing the Business Intelligence use case and for providing initial queries for the query mix. Lots of thanks also go to Ivan Mikhailov for his in-depth review of the Business Intelligence query mix and for finding several bugs in the queries. We also want to thank Peter Boncz and Hugh Williams for feedback on the new version of the BSBM benchmark.
We want to thank the store vendors and implementers for helping us to set up and configure their stores for the experiment. Lots of thanks to Andy Seaborne, Ivan Mikhailov, Hugh Williams, Zdravko Tashev, Atanas Kiryakov, Barry Bishop, Bryan Thompson, Mike Personick, and Steve Harris.
The work on the BSBM Benchmark Version 3 is funded through the LOD2 - Creating Knowledge out of Linked Data project.
Please send comments and feedback about the benchmark to Chris Bizer and Andreas Schultz.