Contents
- Introduction
- Benchmark Datasets
- Benchmark Machine
- Benchmark Results for the Explore Use Case
- Benchmark Results for the Explore and Update Use Case
- Store Comparison
- Experiences with the Business Intelligence use case
- Thanks
Document Version: 1.0
Publication Date: 02/22/2011
1. Introduction
The Berlin SPARQL Benchmark (BSBM) is a benchmark for comparing the performance of storage systems that expose SPARQL endpoints. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.
This document presents the results of a February 2011 BSBM experiment in which the Berlin SPARQL Benchmark Version 3 was used to measure the performance of:
- 4store (version 1.1.2)
- BigData (rev. 4169)
- BigOwlim (version 3.4.3129)
- TDB (version 0.8.9)
- Virtuoso (version 7.00.3200-pthreads for Linux as of Jan 25 2011)
The stores were benchmarked with datasets of 100 million and 200 million triples.
2. Benchmark Datasets
We ran the benchmark using the Triple version of the BSBM dataset. The benchmark was run for different dataset sizes. The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification.
Details about the benchmark datasets are summarized in the following table:

Characteristic | 100M | 200M |
---|---|---|
Number of Products | 284,826 | 570,000 |
Number of Producers | 5,618 | 11,240 |
Number of Product Features | 47,884 | 94,259 |
Number of Product Types | 2,011 | 3,949 |
Number of Vendors | 2,896 | 5,758 |
Number of Offers | 5,696,520 | 11,400,000 |
Number of Reviewers | 146,093 | 292,095 |
Number of Reviews | 2,848,260 | 5,700,000 |
Total Number of Instances | 9,034,108 | 18,077,301 |
Exact Total Number of Triples | 100,000,748 | 200,031,975 |
File Size Turtle (unzipped) | 8.7 GB | 18 GB |
Note: All datasets were generated with the -fc option for forward chaining.
The BSBM dataset generator and test driver can be downloaded from SourceForge.
The RDF representation of the benchmark datasets can be generated in the following way. To generate the 100m dataset as a Turtle file, type the following command in the BSBM directory:
./generate -fc -s ttl -fn dataset_100m -pc 284826
Variations:
* generate N-Triples instead of Turtle:
use -s nt instead of -s ttl
* generate update dataset for the Explore and Update use case:
add -ud
* Generate multiple files instead of one, for example 100 files:
add -nof 100
* Write test driver data to a different directory (default is td_data), for example for the 100m dataset:
add -dir td_data_100m
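Combining several of these options, the 200m dataset could, for example, be generated as N-Triples split over 200 files, together with the update dataset and its own test driver data directory. The output file name here is illustrative, and the product count is taken from the dataset table above:
./generate -fc -s nt -fn dataset_200m -pc 570000 -ud -nof 200 -dir td_data_200m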
3. Benchmark Machine
We used a machine with the following specification for the benchmark experiment:
- Hardware:
- Processors: Intel i7 950, 3.07GHz (4 cores)
- Memory: 24GB
- Hard Disks: 2 x 1.8TB (7,200 rpm) SATA2
- Software:
- Operating System: Ubuntu 10.04 64-bit, Kernel 2.6.32-24-generic
- Filesystem: ext4
- Separate partitions for application data and databases.
- Java Version and JVM: Version 1.6.0_20, OpenJDK 64-Bit Server VM (build 19.0-b09).
- BSBM generator and test driver version: bsbmtools-v0.2
4. Benchmark Results for the Explore Use Case
This section reports the results of running the Explore use case of the BSBM benchmark against:
- 4store (version 1.1.2)
- BigData (rev. 4169)
- BigOwlim (version 3.4.3129)
- TDB (version 0.8.9)
- Virtuoso (version 7.00.3200-pthreads for Linux as of Jan 25 2011)
The load performance of the systems was measured by loading the Turtle representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.
The query performance of the systems was measured by running 500 BSBM query mixes (altogether 12,500 queries for the Explore use case) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to measure sustainable performance of the SUTs, a ramp-up period is executed before the actual test runs.
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store (optionally clear OS caches and swap), restart the store.
- Run ramp-up (randomizer seed: 1212123, at least 8000 query mixes).
- Execute single-client test run (500 mixes performance measurement, randomizer seed: 9834533)
- Execute multi-client runs (4, 8, and 64 clients; randomizer seeds: 8188326, 9175932, and 4187411). For each run, the number of warm-up query mixes is twice the number of clients.
The ramp-up and single-client runs were executed with:
./testdriver -rampup -seed 1212123 http://sparql-endpoint
./testdriver -seed 9834533 http://sparql-endpoint
For example, for a run with 4 clients, execute:
./testdriver -seed 8188326 -w 8 -mt 4 -o results_4clients.xml http://sparql-endpoint
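Following the same pattern (a warm-up of twice the number of clients, and the seeds listed above), the 8- and 64-client runs would presumably be invoked as follows; the output file names are illustrative:
./testdriver -seed 9175932 -w 16 -mt 8 -o results_8clients.xml http://sparql-endpoint
./testdriver -seed 4187411 -w 128 -mt 64 -o results_64clients.xml http://sparql-endpoint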
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
An overview of the load times for the SUTs and the different datasets is given in the following table (in hh:mm:ss):
SUT | 100M | 200M |
---|---|---|
4store | 26:42* | 1:12:04* |
BigData | 1:03:47 | 3:24:25 |
BigOwlim | 17:22 | 38:36 |
TDB | 1:14:48 | 2:45:13 |
Virtuoso | 1:49:26** | 3:59:38** |
* The N-Triples version of the dataset was used.
** The dataset was split into 100 and 200 Turtle files, respectively, and loaded consecutively with the DB.DBA.TTLP function.
4.1 4store
4.1.1 Configuration
The following changes were made to the default configuration of the software:
- 4store: Version 1.1.2
Command to setup the database:
4s-backend-setup --node 0 --cluster 1 --segments 8 bsbm
Command to start the SPARQL endpoint:
4s-httpd -p 8000 -s -1 bsbm
Raptor2 version: 2.0.0
Rasqal version: 0.9.24
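The report does not show the data import command for 4store. Assuming the standard 4s-import tool and the N-Triples split mentioned in the load-time overview, the import presumably looked roughly like this (the file name is illustrative):
# assumption: the backend for the kb "bsbm" has been set up and started
4s-import bsbm dataset_100m.nt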
4.1.2 Load Time
The table below summarizes the load times for the N-Triples version of the dataset (in hh:mm:ss):

100M | 200M |
---|---|
26:42 | 1:12:04 |
4.1.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 117.6 | 145.3 |
Query 2 | 49.0 | 55.7 |
Query 3 | 102.4 | 122.9 |
Query 4 | 43.4 | 62.9 |
Query 5 | 7.8 | 5.0 |
Query 6 | not executed | not executed |
Query 7 | 41.3 | 49.3 |
Query 8 | 49.1 | 57.1 |
Query 9 | 233.0 | 117.3 |
Query 10 | 49.2 | 52.8 |
Query 11 | 145.3 | 33.3 |
Query 12 | 46.5 | 40.4 |
4.1.4 Benchmark Overall results: QMpH for the 100M and 200M datasets
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients; because of technical problems when testing 4store with multiple clients (see Section 6.2), only the single-client results are reported here. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | Single client |
---|---|
100M | 5589 |
200M | 4593 |
4.1.5 Result Summaries
- 4store 100M: result summary for the single-client run
- 4store 200M: result summary for the single-client run
4.1.6 Run Logs (detailed information)
- 4store run logs for 100M: single-client run
- 4store run logs for 200M: single-client run
4.2 BigData
4.2.1 Configuration
The following changes were made to the default configuration of the software:
- BigData: Version rev. 4169
For loading and starting the server the ANT script in the directory "bigdata-perf/bsbm3" was used.
4.2.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
1:03:47 | 3:24:25 |
4.2.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 64.2 | 64.4 |
Query 2 | 33.6 | 35.3 |
Query 3 | 12.4 | 14.4 |
Query 4 | 38.4 | 40.0 |
Query 5 | 2.3 | 1.7 |
Query 6 | not executed | not executed |
Query 7 | 31.3 | 21.9 |
Query 8 | 48.5 | 14.1 |
Query 9 | 54.8 | 44.6 |
Query 10 | 61.6 | 28.8 |
Query 11 | 43.8 | 26.7 |
Query 12 | 54.8 | 31.1 |
4.2.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 2428 | 4153 | 4286 | 4136 |
200M | 1795 | 3040 | 3167 | 2689 |
4.2.5 Result Summaries
- BigData 100M: result summaries for the single-client and multi-client runs
- BigData 200M: result summaries for the single-client and multi-client runs
4.2.6 Run Logs (detailed information)
- BigData run logs for 100M: single-client and multi-client runs
- BigData run logs for 200M: single-client and multi-client runs
4.3 BigOwlim
4.3.1 Configuration
The following changes were made to the default configuration of the software:
- BigOwlim: Version 3.4.3129
- Tomcat: Version 6.0.24
Modified heap size:
CATALINA_OPTS="-Xms512m -Xmx8G"
Modified database location:
JAVA_OPTS='-Dinfo.aduna.platform.appdata.basedir=/database/tomcat'
- Sesame: Version 2.3.2
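As a minimal sketch (not taken from the report) of where the Tomcat settings above could live, assuming they are placed in Tomcat's bin/setenv.sh:
# bin/setenv.sh -- hypothetical placement; values copied from the configuration above
CATALINA_OPTS="-Xms512m -Xmx8G"
JAVA_OPTS='-Dinfo.aduna.platform.appdata.basedir=/database/tomcat'
export CATALINA_OPTS JAVA_OPTS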
4.3.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
17:22 | 38:36 |
4.3.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 112.5 | 45.0 |
Query 2 | 159.3 | 111.6 |
Query 3 | 125.0 | 48.1 |
Query 4 | 97.9 | 37.8 |
Query 5 | 3.0 | 1.8 |
Query 6 | not executed | not executed |
Query 7 | 32.6 | 11.0 |
Query 8 | 38.0 | 12.6 |
Query 9 | 141.8 | 67.9 |
Query 10 | 48.5 | 22.4 |
Query 11 | 51.3 | 18.7 |
Query 12 | 65.4 | 29.1 |
4.3.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 3534 | 9349 | 12798 | 15285 |
200M | 1795 | 3713 | 4041 | 3622 |
4.3.5 Result Summaries
- BigOwlim 100M: result summaries for the single-client and multi-client runs
- BigOwlim 200M: result summaries for the single-client and multi-client runs
4.3.6 Run Logs (detailed information)
- BigOwlim run logs for 100M: single-client and multi-client runs
- BigOwlim run logs for 200M: single-client and multi-client runs
4.4 TDB
4.4.1 Configuration
The following changes were made to the default configuration of the software:
- TDB: Version 0.8.9
Loading was done with tdbloader2.
Statistics for the BGP optimizer were generated with the "tdbconfig stats" command and copied into the database directory (a command sketch is given after this list).
- Fuseki: Version 0.1.0
Started server with: ./fuseki-server --loc /database/tdb /bsbm
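The exact command lines are not given in the report; under the stated setup (tdbloader2, "tdbconfig stats", database directory /database/tdb), the loading and statistics steps presumably looked roughly like this (the dataset file name is illustrative):
./tdbloader2 --loc /database/tdb dataset_100m.ttl
./tdbconfig stats --loc /database/tdb > /database/tdb/stats.opt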
4.4.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
1:14:48 | 2:45:13 |
4.4.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 75.1 | 62.2 |
Query 2 | 41.0 | 44.1 |
Query 3 | 82.2 | 66.1 |
Query 4 | 62.1 | 47.3 |
Query 5 | 2.0 | 1.2 |
Query 6 | not executed | not executed |
Query 7 | 22.6 | 15.0 |
Query 8 | 24.4 | 15.9 |
Query 9 | 124.6 | 97.1 |
Query 10 | 33.5 | 26.4 |
Query 11 | 30.0 | 23.4 |
Query 12 | 33.3 | 28.3 |
4.4.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 2274 | 4065 | 3035 | 2242 |
200M | 1443 | 2206 | 1474 | * |

* TDB crashed for the 200m dataset with 64 clients because of a bug that has since been fixed.
4.4.5 Result Summaries
- TDB 100M: result summaries for the single-client and multi-client runs
- TDB 200M: result summaries for the single-client and multi-client runs
4.4.6 Run Logs (detailed information)
- TDB run logs for 100M: single-client and multi-client runs
- TDB run logs for 200M: single-client and multi-client runs
4.5 Virtuoso
4.5.1 Configuration
The following changes were made to the default configuration of the software:
- Virtuoso: Version 7.00.3200-pthreads for Linux as of Jan 25 2011
Loading of datasets:
The loading was done with the TTLP (single-threaded) function by loading splits of the dataset consecutively (a sketch is given below).
For the 100m dataset 100 files were generated; for the 200m dataset, 200.
For the configuration see the "virtuoso.ini" file.
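As a rough sketch (not taken from the report) of this loading procedure, each split can be loaded through isql with the built-in DB.DBA.TTLP function; the file and graph names below are illustrative:
-- repeat for each of the 100 (resp. 200) split files
DB.DBA.TTLP (file_to_string_output ('dataset_100m_part001.ttl'), '', 'http://example.org/bsbm');
checkpoint;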
4.5.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):

100M | 200M |
---|---|
1:49:26 | 3:59:38 |
4.5.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m | 200m |
---|---|---|
Query 1 | 200.7 | 163.0 |
Query 2 | 71.1 | 73.8 |
Query 3 | 201.4 | 195.4 |
Query 4 | 103.9 | 94.8 |
Query 5 | 15.2 | 9.3 |
Query 6 | not executed | not executed |
Query 7 | 24.9 | 15.4 |
Query 8 | 54.0 | 22.5 |
Query 9 | 379.1 | 160.1 |
Query 10 | 113.7 | 69.9 |
Query 11 | 73.6 | 39.5 |
Query 12 | 68.0 | 39.8 |
4.5.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran tests with 1, 4, 8, and 64 clients. The results are given in Query Mixes per Hour (QMpH); larger numbers are better.

Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 7352 | 25194 | 36269 | 18008 |
200M | 4669 | 13265 | 18264 | 16564 |
4.5.5 Result Summaries
- Virtuoso 100M: result summaries for the single-client and multi-client runs
- Virtuoso 200M: result summaries for the single-client and multi-client runs
4.5.6 Run Logs (detailed information)
- Virtuoso run logs for 100M: single-client and multi-client runs
- Virtuoso run logs for 200M: single-client and multi-client runs
5. Benchmark Results for the Explore and Update Use Case
This section reports the results of running the Explore and Update use case of the BSBM benchmark against:
- 4store (version 1.1.2)
- BigOwlim (version 3.4.3129)
- TDB (version 0.8.9)
BigData and Virtuoso are not listed here for different reasons: BigData does not yet provide all SPARQL features that are required to run the update query mix, and when executing the update mix on Virtuoso we ran into technical problems that we are still working on solving together with the OpenLink team.
The load performance of the systems was measured by loading the Turtle or N-Triples representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.
The query performance of the systems was measured by running 500 BSBM query mixes (altogether 15,000 queries from the Explore and Update query mixes) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to measure sustainable performance of the SUTs, a ramp-up period is executed before the actual test runs.
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store, clear OS caches and swap, restart the store.
- Run ramp-up (randomizer seed: 1212123, at least 8000 query mixes).
- Execute single-client test run (500 mixes performance measurement, randomizer seed: 9834533)
The ramp-up and measurement runs were executed with:
./testdriver -ucf usecases/exploreAndUpdate/sparql.txt -rampup -seed 1212123 http://sparql-endpoint
./testdriver -ucf usecases/exploreAndUpdate/sparql.txt -udataset dataset_update.nt -seed 9834533 -u http://sparql-endpoint-update-service http://sparql-endpoint
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
5.1 4store
5.1.1 Configuration
The following changes were made to the default configuration of the software:
- 4store: Version 1.1.2
Command to setup the database:
4s-backend-setup --node 0 --cluster 1 --segments 8 bsbm
Command to start the SPARQL endpoint:
4s-httpd -p 8000 -s -1 bsbm
Raptor2 version: 2.0.0
Rasqal version: 0.9.24
- Queries:
Update Query 2 in queries/update/query2.txt was changed to the following because the DELETE WHERE syntax was not supported:
DELETE
{ %Offer% ?p ?o }
WHERE
{ %Offer% ?p ?o }
5.1.2 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m |
---|---|
Upd. Query 1 | 20.3 |
Upd. Query 2 | 70.4 |
Exp. Query 1 | 116.4 |
Exp. Query 2 | 66.7 |
Exp. Query 3 | 141.7 |
Exp. Query 4 | 59.6 |
Exp. Query 5 | 8.1 |
Exp. Query 6 | not executed |
Exp. Query 7 | 53.0 |
Exp. Query 8 | 67.1 |
Exp. Query 9 | 351.5 |
Exp. Query 10 | 68.8 |
Exp. Query 11 | 192.6 |
Exp. Query 12 | 62.1 |
5.1.3 Benchmark Overall result: QMpH for the 100M dataset
The result is given in Query Mixes per Hour (QMpH); larger numbers are better. For the 100M dataset, 4store achieved 5311 QMpH (see Section 6.1.2).
5.1.4 Result Summaries
- 4store 100M: result summary for the single-client run
5.1.5 Run Logs (detailed information)
- 4store run logs for 100M: single-client run
5.2 BigOwlim
5.2.1 Configuration
The following changes were made to the default configuration of the software:
- BigOwlim: Version 3.4.3129
- Tomcat: Version 6.0.24
Modified heap size:
CATALINA_OPTS="-Xms512m -Xmx8G"
Modified database location:
JAVA_OPTS='-Dinfo.aduna.platform.appdata.basedir=/database/tomcat'
- Joseki: Version 3.4.3
For configuration see the joseki-config.ttl file.
Changes to bin/rdfserver:
JAVA_ARGS="-server -Xmx8G -Dcache-memory=5G -Druleset=empty"
- Test driver:
Add the following to the parameter list for the testdriver execution (a complete invocation is sketched below):
-uqp request
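Putting this together with the test driver invocation shown at the beginning of Section 5, the single-client measurement run would then presumably be started along these lines (endpoint URLs are placeholders, as above):
./testdriver -ucf usecases/exploreAndUpdate/sparql.txt -udataset dataset_update.nt -uqp request -seed 9834533 -u http://sparql-endpoint-update-service http://sparql-endpoint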
5.2.2 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m |
---|---|
Upd. Query 1 | 19.4 |
Upd. Query 2 | 30.5 |
Exp. Query 1 | 222.0 |
Exp. Query 2 | 57.8 |
Exp. Query 3 | 278.6 |
Exp. Query 4 | 196.2 |
Exp. Query 5 | 2.6 |
Exp. Query 6 | not executed |
Exp. Query 7 | 44.3 |
Exp. Query 8 | 51.6 |
Exp. Query 9 | 292.6 |
Exp. Query 10 | 65.6 |
Exp. Query 11 | 64.4 |
Exp. Query 12 | 63.6 |
5.2.3 Benchmark Overall result: QMpH for the 100M dataset
The result is given in Query Mixes per Hour (QMpH); larger numbers are better. For the 100M dataset, BigOwlim achieved 2809 QMpH (see Section 6.1.2).
5.2.4 Result Summaries
- BigOwlim 100M: result summary for the single-client run
5.2.5 Run Logs (detailed information)
- BigOwlim run logs for 100M: single-client run
5.3 TDB
5.3.1 Configuration
The following changes were made to the default configuration of the software:
- TDB: Version 0.8.9
Loading was done with bin/tdbloader2
Statistics for the BGP optimizer were generated with the "bin/tdbconfig stats" command. The resulting "stats.opt" file has to be copied into the database directory.
- Fuseki: Version 0.1.0
Started server with: ./fuseki-server --update --loc /database/tdb /bsbm
- Test driver:
Add the following to the parameter list for the testdriver execution:
-uqp request
5.3.2 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100m |
---|---|
Upd. Query 1 | 0.7 |
Upd. Query 2 | 2.4 |
Exp. Query 1 | 55.5 |
Exp. Query 2 | 39.1 |
Exp. Query 3 | 84.5 |
Exp. Query 4 | 56.7 |
Exp. Query 5 | 2.0 |
Exp. Query 6 | not executed |
Exp. Query 7 | 68.9 |
Exp. Query 8 | 82.3 |
Exp. Query 9 | 141.0 |
Exp. Query 10 | 125.3 |
Exp. Query 11 | 78.5 |
Exp. Query 12 | 112.6 |
5.3.3 Benchmark Overall result: QMpH for the 100M dataset
The result is given in Query Mixes per Hour (QMpH); larger numbers are better. For the 100M dataset, TDB achieved 680 QMpH (see Section 6.1.2).
5.3.4 Result Summaries
- TDB 100M: result summary for the single-client run
5.3.5 Run Logs (detailed information)
- TDB run logs for 100M: single-client run
6. Store Comparison
This section compares the SPARQL query performance of the different stores.
6.1 Query Mixes per Hour for Single Clients
Running 500 query mixes against the different stores resulted in the following performance numbers (in QMpH). The best performance figure for each dataset size is set bold in the tables.
6.1.1 QMpH: Explore use case
The complete Explore query mix is defined in the BSBM specification.
Store | 100m | 200m |
---|---|---|
4store | 5589 | 4593 |
BigData | 2428 | 1795 |
BigOwlim | 3534 | 1795 |
TDB | 2274 | 1443 |
Virtuoso | 7352 | 4669 |
A much more detailed view of the results for the Explore use case is given under Detailed Results For The Explore-Query-Mix Benchmark Run.
6.1.2 QMpH: Explore and Update use case
The Explore and Update query mix consists of the Update query mix (queries 1 and 2) and the Explore query mix (queries 3 to 14).
Store | 100m |
---|---|
4store | 5311 |
BigOwlim | 2809 |
TDB | 680 |
A much more detailed view of the results for the Explore and Update use case is given under Detailed Results For The Explore-And-Update-Query-Mix Benchmark Run.
6.2 Query Mixes per Hour for Multiple Clients (Explore query mix only)
Dataset Size 100M (columns: number of clients):

Store | 1 | 4 | 8 | 64 |
---|---|---|---|---|
4store | 5589 | * | * | * |
BigData | 2428 | 4153 | 4286 | 4136 |
BigOwlim | 3534 | 9349 | 12798 | 15285 |
TDB | 2274 | 4065 | 3035 | 2242 |
Virtuoso | 7352 | 25194 | 36269 | 18008 |

Dataset Size 200M (columns: number of clients):

Store | 1 | 4 | 8 | 64 |
---|---|---|---|---|
4store | 4593 | * | * | * |
BigData | 1795 | 3040 | 3167 | 2689 |
BigOwlim | 1795 | 3713 | 4041 | 3622 |
TDB | 1443 | 2206 | 1474 | ** |
Virtuoso | 4669 | 13265 | 18264 | 16564 |
* We ran into technical problems while testing with multiple clients.
** TDB crashed for the 200m dataset with 64 clients because of a bug that has since been fixed.
6.3 Detailed Results For The Explore-Query-Mix Benchmark Run
The details of running the Explore query mix are given below in two different views:
6.3.1 Queries per Second by Query and Dataset Size
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set bold in the tables.
Query 1
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 117.6 | 64.2 | 112.5 | 75.1 | 200.7 |
200m | 145.3 | 64.4 | 45.0 | 62.2 | 163.0 |
Query 2
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 49.0 | 33.6 | 159.3 | 41.0 | 71.1 |
200m | 55.7 | 35.3 | 111.6 | 44.1 | 73.8 |
Query 3
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 102.4 | 12.4 | 125.0 | 82.2 | 201.4 |
200m | 122.9 | 14.4 | 48.1 | 66.1 | 195.4 |
Query 4
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 43.4 | 38.4 | 97.9 | 62.1 | 103.9 |
200m | 62.9 | 40.0 | 37.8 | 47.3 | 94.8 |
Query 5
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 7.8 | 2.3 | 3.0 | 2.0 | 15.2 |
200m | 5.0 | 1.7 | 1.8 | 1.2 | 9.3 |
Query 6
Not executed.
Query 7
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 41.3 | 31.3 | 32.6 | 22.6 | 24.9 |
200m | 49.3 | 21.9 | 11.0 | 15.0 | 15.4 |
Query 8
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 49.1 | 48.5 | 38.0 | 24.4 | 54.0 |
200m | 57.1 | 14.1 | 12.6 | 15.9 | 22.5 |
Query 9
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 233.0 | 54.8 | 141.8 | 124.6 | 379.1 |
200m | 117.3 | 44.6 | 67.9 | 97.1 | 160.1 |
Query 10
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 49.2 | 61.6 | 48.5 | 33.5 | 113.7 |
200m | 52.8 | 28.8 | 22.4 | 26.4 | 69.9 |
Query 11
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 145.3 | 43.8 | 51.3 | 30.0 | 73.6 |
200m | 33.3 | 26.7 | 18.7 | 23.4 | 39.5 |
Query 12
Dataset | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
100m | 46.5 | 54.8 | 65.4 | 33.3 | 68.0 |
200m | 40.4 | 31.1 | 29.1 | 28.3 | 39.8 |
6.3.2 Queries per Second by Dataset Size and Query
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set bold in the tables.
100m
Query | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
Query 1 | 117.6 | 64.2 | 112.5 | 75.1 | 200.7 |
Query 2 | 49.0 | 33.6 | 159.3 | 41.0 | 71.1 |
Query 3 | 102.4 | 12.4 | 125.0 | 82.2 | 201.4 |
Query 4 | 43.4 | 38.4 | 97.9 | 62.1 | 103.9 |
Query 5 | 7.8 | 2.3 | 3.0 | 2.0 | 15.2 |
Query 6 | not executed | not executed | not executed | not executed | not executed |
Query 7 | 41.3 | 31.3 | 32.6 | 22.6 | 24.9 |
Query 8 | 49.1 | 48.5 | 38.0 | 24.4 | 54.0 |
Query 9 | 233.0 | 54.8 | 141.8 | 124.6 | 379.1 |
Query 10 | 49.2 | 61.6 | 48.5 | 33.5 | 113.7 |
Query 11 | 145.3 | 43.8 | 51.3 | 30.0 | 73.6 |
Query 12 | 46.5 | 54.8 | 65.4 | 33.3 | 68.0 |
200m
Query | 4store | BigData | BigOwlim | TDB | Virtuoso |
---|---|---|---|---|---|
Query 1 | 145.3 | 64.4 | 45.0 | 62.2 | 163.0 |
Query 2 | 55.7 | 35.3 | 111.6 | 44.1 | 73.8 |
Query 3 | 122.9 | 14.4 | 48.1 | 66.1 | 195.4 |
Query 4 | 62.9 | 40.0 | 37.8 | 47.3 | 94.8 |
Query 5 | 5.0 | 1.7 | 1.8 | 1.2 | 9.3 |
Query 6 | not executed | not executed | not executed | not executed | not executed |
Query 7 | 49.3 | 21.9 | 11.0 | 15.0 | 15.4 |
Query 8 | 57.1 | 14.1 | 12.6 | 15.9 | 22.5 |
Query 9 | 117.3 | 44.6 | 67.9 | 97.1 | 160.1 |
Query 10 | 52.8 | 28.8 | 22.4 | 26.4 | 69.9 |
Query 11 | 33.3 | 26.7 | 18.7 | 23.4 | 39.5 |
Query 12 | 40.4 | 31.1 | 29.1 | 28.3 | 39.8 |
6.4 Detailed Results For The Explore-and-Update-Query-Mix Benchmark Run
The Update and Explore parts of the Explore and Update query mix are defined in the BSBM specification. The results are given below in two different views:
6.4.1 Queries per Second by Query and Dataset Size (Explore and Update query mix)
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set bold in the tables.
Upd. Query 1
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 20.3 | 19.4 | 0.7 |
Upd. Query 2
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 70.4 | 30.5 | 2.4 |
Exp. Query 1
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 116.4 | 222.0 | 55.5 |
Exp. Query 2
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 66.7 | 57.8 | 39.1 |
Exp. Query 3
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 141.7 | 278.6 | 84.5 |
Exp. Query 4
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 59.6 | 196.2 | 56.7 |
Exp. Query 5
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 8.1 | 2.6 | 2.0 |
Exp. Query 6
Not executed.
Exp. Query 7
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 53.0 | 44.3 | 68.9 |
Exp. Query 8
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 67.1 | 51.6 | 82.3 |
Exp. Query 9
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 351.5 | 292.6 | 141.0 |
Exp. Query 10
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 68.8 | 65.6 | 125.3 |
Exp. Query 11
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 192.6 | 64.4 | 78.5 |
Exp. Query 12
Dataset | 4store | BigOwlim | TDB |
---|---|---|---|
100m | 62.1 | 63.6 | 112.6 |
6.4.2 Queries per Second by Dataset Size and Query (Explore and Update query mix)
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set bold in the tables.
100m
Query | 4store | BigOwlim | TDB |
---|---|---|---|
Upd. Query 1 | 20.3 | 19.4 | 0.7 |
Upd. Query 2 | 70.4 | 30.5 | 2.4 |
Exp. Query 1 | 116.4 | 222.0 | 55.5 |
Exp. Query 2 | 66.7 | 57.8 | 39.1 |
Exp. Query 3 | 141.7 | 278.6 | 84.5 |
Exp. Query 4 | 59.6 | 196.2 | 56.7 |
Exp. Query 5 | 8.1 | 2.6 | 2.0 |
Exp. Query 6 | not executed | not executed | not executed |
Exp. Query 7 | 53.0 | 44.3 | 68.9 |
Exp. Query 8 | 67.1 | 51.6 | 82.3 |
Exp. Query 9 | 351.5 | 292.6 | 141.0 |
Exp. Query 10 | 68.8 | 65.6 | 125.3 |
Exp. Query 11 | 192.6 | 64.4 | 78.5 |
Exp. Query 12 | 62.1 | 63.6 | 112.6 |
7. Experiences with the Business Intelligence use case
BigData and 4store currently do not provide all SPARQL features that are required to run the BI query mix. We thus tried to run the Business Intelligence use case of the Berlin SPARQL Benchmark only against Virtuoso, TDB, and BigOwlim. However, we ran into several technical problems that prevented us from finishing the tests and from publishing meaningful results. We thus decided to give the store vendors more time to fix and optimize their stores and will run the BI query mix experiment again in about four months (July 2011). For the next test runs, we will also modify query 4, because its quadratic complexity dominates the benchmark for larger datasets. We will circulate the updated BI query mix specification via the SPARQL developers mailing list in May 2011 and will ask the vendors for feedback on the specification.
8. Thanks
Thanks a lot to Orri Erling for proposing the Business Intelligence use case and for providing initial queries for the query mix. Lots of thanks also go to Ivan Mikhailov for his in-depth review of the Business Intelligence query mix and for finding several bugs in the queries. We also want to thank Peter Boncz and Hugh Williams for feedback on the new version of the BSBM benchmark.
We want to thank the store vendors and implementers for helping us to set up and configure their stores for the experiment. Lots of thanks to Andy Seaborne, Ivan Mikhailov, Hugh Williams, Zdravko Tashev, Atanas Kiryakov, Barry Bishop, Bryan Thompson, Mike Personick, and Steve Harris.
The work on the BSBM Benchmark Version 3 is funded through the LOD2 - Creating Knowledge out of Linked Data project.
Please send comments and feedback about the benchmark to Chris Bizer and Andreas Schultz.