Berlin SPARQL Benchmark V1 Results

Intro
Benchmark Dataset
Benchmark Machine
Benchmark Results

Jena SDB
Sesame
Virtuoso
D2R

Store Comparison

Date: 07/30/2008

1. Intro

This document presents initial results of running the Berlin SPARQL Benchmark (Version 1) against the RDF stores Virtuoso v5.0.6 and v5.0.7, Sesame Version 2.2-Beta2, and Jena SDB Version 1.1, as well as against D2R Server, a relational database-to-RDF wrapper. The stores were benchmarked with datasets ranging from 50,000 triples to 100,000,000 triples.

Note that this document has been superseded by the new results document for BSBM Version 3.

2. Benchmark Dataset

We benchmarked using the triple version of the BSBM dataset (benchmark scenario NTR). The benchmark was run for different dataset sizes. The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification.

Details about the datasets are summarized in the following table:

Number of Triples

	50K	250K	1M	5M	25M	100M
Number of Products	91	477	1,915	9,609	48,172	194,207
Number of Producers	2	10	41	199	974	3,844
Number of Reviews	2,275	11,925	47,875	240,225	1,204,300	4,855,175
Number of Offers	1,820	9,540	38,300	192,180	963,440	3,884,140
Number of Vendors	2	10	39	196	961	3,912
Number of Reviewers	116	622	2,452	12,351	61,862	248,730
Number of Product Features	580	580	1,390	3,307	3,307	10,931
Number of Product Types	13	13	31	73	73	411
Total Number of Instances	4,899	23,177	92,043	458,140	2,283,089	9,201,350
Exact Total Number of Triples	50,116	250,492	1,000,226	5,000,453	25,000,557	100,001,402
File Size N-Triple (unzipped)	14 MB	70.7 MB	284.4 MB	1.4 GB	7 GB	28.2 GB
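
The figures in the table can be cross-checked against each other. The following is a minimal Python sketch, not part of the BSBM tooling, that sums the per-class instance counts for each dataset size and compares the result with the "Total Number of Instances" row. The names `COUNTS`, `column_sums` and `triples_per_instance` are illustrative only; all numbers are copied from the table above.

```python
# Instance counts per dataset size, copied from the dataset table.
SIZES = ["50K", "250K", "1M", "5M", "25M", "100M"]

COUNTS = {
    "Products":         [91, 477, 1_915, 9_609, 48_172, 194_207],
    "Producers":        [2, 10, 41, 199, 974, 3_844],
    "Reviews":          [2_275, 11_925, 47_875, 240_225, 1_204_300, 4_855_175],
    "Offers":           [1_820, 9_540, 38_300, 192_180, 963_440, 3_884_140],
    "Vendors":          [2, 10, 39, 196, 961, 3_912],
    "Reviewers":        [116, 622, 2_452, 12_351, 61_862, 248_730],
    "Product Features": [580, 580, 1_390, 3_307, 3_307, 10_931],
    "Product Types":    [13, 13, 31, 73, 73, 411],
}

TOTAL_INSTANCES = [4_899, 23_177, 92_043, 458_140, 2_283_089, 9_201_350]
TOTAL_TRIPLES = [50_116, 250_492, 1_000_226, 5_000_453, 25_000_557, 100_001_402]


def column_sums(counts):
    """Sum the instance counts of all classes for each dataset size."""
    return [sum(column) for column in zip(*counts.values())]


def triples_per_instance(triples, instances):
    """Average number of triples per instance for each dataset size."""
    return [t / i for t, i in zip(triples, instances)]


if __name__ == "__main__":
    ratios = triples_per_instance(TOTAL_TRIPLES, TOTAL_INSTANCES)
    for size, total, expected, ratio in zip(
        SIZES, column_sums(COUNTS), TOTAL_INSTANCES, ratios
    ):
        status = "ok" if total == expected else "MISMATCH"
        print(f"{size:>5}: {total:>9} instances ({status}), "
              f"{ratio:.2f} triples per instance")
```

With the values above, the per-class counts add up to the reported totals for every dataset size, and each instance accounts for roughly ten to eleven triples on average.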

The benchmark datasets are available in an RDF triple representation and a relational representation. Both representations can be downloaded below:

Download N-Triples Representation of the Benchmark Datasets

50K Benchmark Dataset (N-Triples, zipped size: 2.4 MB)
250K Benchmark Dataset (N-Triples, zipped size: 12.4 MB)
1M Benchmark Dataset (N-Triples, zipped size: 49.8 MB)
5M Benchmark Dataset (N-Triples, zipped size: 249.8 MB)
25M Benchmark Dataset (N-Triples, gzipped size: 1.2 GB)
100M Benchmark Dataset (N-Triples, gzipped size: 5.1 GB)

Download MySQL dump of the Benchmark Datasets

50K Benchmark Dataset (SQL dump, zipped size: 2.0 MB)
250K Benchmark Dataset (SQL dump, zipped size: 10.3 MB)
1M Benchmark Dataset (SQL dump, zipped size: 41.4 MB)
5M Benchmark Dataset (SQL dump, zipped size: 212.4 MB)
25M Benchmark Dataset (SQL dump, gzipped size: 1.06 GB)
100M Benchmark Dataset (SQL dump, gzipped size: 3.2 GB)

3. Benchmark Machine

The benchmark was run on a machine with the following specification:

Hardware:

Processors: Intel Core 2 Quad Q9450 2.66GHz, FSB 1333MHz, L1 256KB, shared L2: overall 12,288KB
Memory: 8GB DDR2 667 (4 x 2GB)
Hard Disks: 160GB (10,000 rpm) SATA2, 750GB (7,200 rpm) SATA2

Software:

Operating System: Ubuntu 8.04 64-bit, Kernel Linux 2.6.24-16-generic
Java Runtime: VM 1.6.0

4. Benchmark Results

The load performance of the systems was measured by loading the N-Triple representation of the BSBM datasets into the triple stores and by loading the relational representation in the form of MySQL dumps into the RDBMS behind D2R Server. The loaded datasets were forward-chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.

The query performance of the systems was measured by running 50 BSBM query mixes (altogether 1250 queries) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to enable the SUTs to load parts of the working set into main memory and to take advantage of query result and query execution plan caching, 10 BSBM query mixes (altogether 250 queries) were executed for warm-up before the actual times were measured.
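
The measurement procedure described above (warm-up mixes first, then timed mixes) can be sketched as follows. This is a simplified illustration in Python and not the actual BSBM test driver: `run_benchmark`, `execute` and `EXAMPLE_MIX` are hypothetical names, and the composition of `EXAMPLE_MIX` is a placeholder. Only its length of 25 follows from the text (1250 queries / 50 mixes); the real order and frequency of queries within a mix is defined by the BSBM specification.

```python
import time
from collections import defaultdict

# Placeholder mix of 25 query ids drawn from queries 1..10.
# Assumption: the real BSBM mix has a different composition.
EXAMPLE_MIX = [(i % 10) + 1 for i in range(25)]


def run_benchmark(execute, query_mix, warmup_mixes=10, measured_mixes=50,
                  clock=time.perf_counter):
    """Run warm-up mixes untimed, then time each query of the measured mixes.

    `execute(query_id)` is a caller-supplied function that sends one query
    to the system under test (for example over the SPARQL protocol).
    Returns the average query execution time (AQET) in seconds per query id.
    """
    # Warm-up phase: lets the store fill its caches; nothing is recorded.
    for _ in range(warmup_mixes):
        for query_id in query_mix:
            execute(query_id)

    # Measurement phase: accumulate wall-clock time per query id.
    total_seconds = defaultdict(float)
    executions = defaultdict(int)
    for _ in range(measured_mixes):
        for query_id in query_mix:
            start = clock()
            execute(query_id)
            total_seconds[query_id] += clock() - start
            executions[query_id] += 1

    return {qid: total_seconds[qid] / executions[qid] for qid in total_seconds}


if __name__ == "__main__":
    # Dummy system under test that does nothing, just to show the call shape.
    aqet = run_benchmark(lambda query_id: None, EXAMPLE_MIX)
    for query_id in sorted(aqet):
        print(f"Query {query_id}: {aqet[query_id]:.6f} s")
```

Passing `execute` in as a function keeps the timing loop independent of how queries are actually sent, so the same loop could in principle be pointed at any of the stores.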

4.1 SDB (Hash) with MySQL over Joseki3

Jena SDB homepage

4.1.1 Configuration

The following changes were made to the default configuration of the software:

SDB: Version 1.1

Store Layout: layout2/hash InnoDB

MySQL: Version 5.0.51a-3Ubuntu5.1

Configuration file changes:

innodb_buffer_pool_size = 2800M
bulk_insert_buffer_size = 32M
query_cache_size = 512M

Indexes:

- product(producer)

- offer(product)
- offer(vendor)
- review(product)

Joseki: Version 3.2 (CVS)

4.1.2 Load Time

The table below summarizes the load times of the N-Triple files (in seconds):

50K	250K	1M	5M	25M	100M
5.343	23.992	116.7	1,053.0	13,306.7	144,989.0
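
To make the load times easier to compare across dataset sizes, they can be converted into a load throughput in triples per second. The sketch below is an illustration only: `load_throughput` is a hypothetical helper, the triple counts are the exact totals from the dataset table in section 2, and the load times are the SDB figures above.

```python
# Exact triple counts per dataset (section 2) and SDB load times in seconds.
TRIPLES = {
    "50K": 50_116, "250K": 250_492, "1M": 1_000_226,
    "5M": 5_000_453, "25M": 25_000_557, "100M": 100_001_402,
}
SDB_LOAD_SECONDS = {
    "50K": 5.343, "250K": 23.992, "1M": 116.7,
    "5M": 1053.0, "25M": 13306.7, "100M": 144989.0,
}


def load_throughput(triples, seconds):
    """Return triples loaded per second for every dataset size in `seconds`."""
    return {size: triples[size] / seconds[size] for size in seconds}


if __name__ == "__main__":
    for size, rate in load_throughput(TRIPLES, SDB_LOAD_SECONDS).items():
        print(f"{size:>5}: {rate:,.0f} triples/s")
```

By this measure, the SDB load rate drops by more than an order of magnitude between the 50K and the 100M dataset.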

4.1.3 Benchmark Query results (seconds): AQET (Average Query Execution Time)

	50K	250K	1M	5M	25M
Query 1	0.002559	0.004955	0.012592	0.057347	0.328271
Query 2	0.021276	0.024708	0.042374	0.071731	0.446813
Query 3	0.003968	0.006671	0.014379	0.059678	0.317215
Query 4	0.004618	0.008284	0.023286	0.112666	0.679248
Query 5	0.115628	0.509083	1.741927	8.199694	43.878071
Query 6	0.002521	0.005291	0.014744	0.254799	1.197798
Query 7	0.016953	0.078490	0.402448	2.193056	13.129511
Query 8	0.040807	0.159153	0.660130	3.535463	20.466382
Query 9	0.004357	0.004545	0.004461	0.004560	0.018882
Query 10	0.007880	0.033867	0.102852	1.028677	3.141508

We cannot report average query execution times for the 100M dataset yet, as SDB froze at query 7 after 3 hours while executing the first warm-up query mix.

4.1.4 Result Summaries

SDB (hash) 50K [xml / txt]
SDB (hash) 250K [xml / txt]
SDB (hash) 1M [xml / txt]
SDB (hash) 5M [xml / txt]
SDB (hash) 25M [xml / txt]

4.1.5 Run Logs (detailed information)

4.2 Sesame over Tomcat

Sesame homepage

4.2.1 Configuration

The following changes were made to the default configuration of the software:

Sesame: Version 2.2-Beta2

Store Type: Native
Indexes: spoc, posc, psoc

Tomcat: Version 5.5.25.5ubunt

JAVA_OPTS = ... -Xmx4096m ...

4.2.2 Load Time

The table below summarizes the load times of the N-Triple files (in seconds):

50K	250K	1M	5M	25M
3.269	17.706	132.743	1,988.839	27,674.091

When loading the 100M dataset, Sesame hit our 36-hour load timeout. Therefore, we cannot report query results for the 100M dataset.

4.2.3 Average Query Execution Time

The table below summarizes the average query execution time for each type of query over all 50 runs (in seconds):

	50K	250K	1M	5M	25M
Query 1	0.002106	0.002913	0.003904	0.005051	0.035575
Query 2	0.005274	0.004904	0.005523	0.006115	0.017859
Query 3	0.002364	0.003793	0.004681	0.010618	0.049690
Query 4	0.002595	0.004023	0.005491	0.009803	0.071493
Query 5	0.057673	0.310220	1.313046	6.688869	32.895117
Query 6	0.004993	0.014792	0.064516	0.290293	1.483277
Query 7	0.005265	0.005510	0.010948	0.007205	0.505330
Query 8	0.005111	0.005286	0.005770	0.006150	0.315689
Query 9	0.010526	0.040371	0.202310	1.071040	5.728123
Query 10	0.002722	0.002589	0.003027	0.002814	0.179580

4.3 Virtuoso Open-Source Edition v5.0.6 and v5.0.7

We ran the benchmark for the 50K to 25M dataset on Virtuoso v5.0.6 about two weeks ago. After the release v5.0.7, we ran the benchmark for the 100M dataset on v5.0.7.

Also note that the Virtuoso team is currently working on a new version which will be released shortly and is likely to perform much better on the benchmark. We will run the benchmark for Virtuoso again after this version is publicly released.

The table below summarizes the load times of the N-Triple files (in seconds):

The table below summarizes the average query execution time for each type of query over all 50 runs (in seconds):

	50K	250K	1M	5M	25M	100M
Query 1	0.017650	0.017195	0.015160	0.020225	0.082935	0.033174
Query 2	0.071556	0.069347	0.069178	0.077868	0.085244	0.069636
Query 3	0.009790	0.008394	0.008904	0.011443	0.055770	0.028237
Query 4	0.016189	0.012962	0.013824	0.020450	0.100784	0.045437
Query 5	0.221498	0.380958	0.846915	3.344963	1.623720	6.312297
Query 6	0.003639	0.007368	0.020584	0.109199	0.536659	2.033258
Query 7	0.125126	0.107109	0.125771	0.310843	2.290610	1.116620
Query 8	0.026703	0.023340	0.024814	0.028546	4.535478	0.586523
Query 9	0.039994	0.039972	0.039744	0.042302	0.106834	0.096961
Query 10	0.585728	0.569584	0.642199	1.338361	6.708351	1.055452

Note that some queries are faster for the 100M dataset than for the 25M dataset. This is likely because we used Virtuoso v5.0.7 for the 100M run and v5.0.6 for the smaller datasets.

4.4 D2R Server 0.4

D2R Server is a database-to-RDF wrapper which rewrites SPARQL queries into SQL queries against an application-specific relational schema based on a mapping.

The table below summarizes the average query execution time for each type of query over all 50 runs (in seconds):

Query 5 over the 25M and 100M datasets was excluded from the run after hitting our 45-second timeout.

5. Store Comparison

5.1 Overall Query Execution Time

Running 50 query mixes against the different stores took the following overall times (in seconds). The best performance figure for each dataset size is set bold in the tables.

It turns out that Sesame is the fastest RDF store for small dataset sizes. Virtuoso is the fastest RDF store for big datasets, but is rather slow for small ones. According to feedback from the Virtuoso team, this is due to Virtuoso spending an equal amount of time on compiling the query and deciding on the query execution plan for all dataset sizes. Compared to the RDF stores, D2R Server showed an inferior overall performance. When running the query mix against the 25M and 100M datasets, query 5 hit the 45-second timeout and was excluded from the run. Thus there is no overall run time figure for D2R Server at these dataset sizes.

	Sesame	Virtuoso	SDB	D2R
50 K	9.410	162.036	23.436	66.486
250 K	28.597	162.803	72.964	153.445
1 M	115.198	201.396	268.000	484.430
5 M	569.058	476.806	1406.686	2188.072
25 M	3038.205	2089.118	7623.958	not applicable
100 M	load time out	906.679*	not applicable	not applicable
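
The best figure per dataset size can also be derived directly from the numbers. The following Python sketch is illustrative only: `OVERALL_SECONDS` and `fastest_store` are hypothetical names, the values are copied from the table above, and cells without a figure are represented as `None`. The 100M row is left out because only one store has a run time there.

```python
# Overall run time of 50 query mixes in seconds, per dataset size and store.
# None marks a cell for which the table reports no figure.
OVERALL_SECONDS = {
    "50K":  {"Sesame": 9.410, "Virtuoso": 162.036, "SDB": 23.436, "D2R": 66.486},
    "250K": {"Sesame": 28.597, "Virtuoso": 162.803, "SDB": 72.964, "D2R": 153.445},
    "1M":   {"Sesame": 115.198, "Virtuoso": 201.396, "SDB": 268.000, "D2R": 484.430},
    "5M":   {"Sesame": 569.058, "Virtuoso": 476.806, "SDB": 1406.686, "D2R": 2188.072},
    "25M":  {"Sesame": 3038.205, "Virtuoso": 2089.118, "SDB": 7623.958, "D2R": None},
}


def fastest_store(row):
    """Return the store with the lowest run time, ignoring missing figures."""
    timed = {store: seconds for store, seconds in row.items() if seconds is not None}
    return min(timed, key=timed.get)


if __name__ == "__main__":
    for size, row in OVERALL_SECONDS.items():
        winner = fastest_store(row)
        print(f"{size:>5}: {winner} ({row[winner]} s)")
```

On these figures, Sesame has the lowest overall run time up to the 1M dataset and Virtuoso from the 5M dataset onwards, which matches the summary given above the table.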

5.2 Average Query Execution Time by Query and Dataset Size

Running 50 query mixes against the different stores led to the following average query execution times for different queries (in seconds). The best performance figure for each dataset size is set bold in the tables.

Query 1

	50K	250K	1M	5M	25M	100M
Sesame	0.002106	0.002913	0.003904	0.005051	0.035575	load time out
Virtuoso	0.017650	0.017195	0.015160	0.020225	0.082935	0.033174
SDB	0.002559	0.004955	0.012592	0.057347	0.328271	query time out
D2R	0.004810	0.011643	0.038524	0.195950	0.968688	4.067863

Query 2

	50K	250K	1M	5M	25M	100M
Sesame	0.005274	0.004904	0.005523	0.006115	0.017859	load time out
Virtuoso	0.071556	0.069347	0.069178	0.077868	0.085244	0.069636
SDB	0.023058	0.024708	0.042374	0.071731	0.446813	query time out
D2R	0.038768	0.041448	0.047844	0.056018	0.041420	0.031567

Query 3

	50K	250K	1M	5M	25M	100M
Sesame	0.002364	0.003793	0.004681	0.010618	0.049690	load time out
Virtuoso	0.009790	0.008394	0.008904	0.011443	0.055770	0.028237
SDB	0.005324	0.006671	0.014379	0.059678	0.317215	query time out
D2R	0.007125	0.021041	0.054744	0.226272	1.080856	4.140888

Query 4

	50K	250K	1M	5M	25M	100M
Sesame	0.002595	0.004023	0.005491	0.009803	0.071493	load time out
Virtuoso	0.016189	0.012962	0.013824	0.020450	0.100784	0.045437
SDB	0.005694	0.008284	0.023286	0.112666	0.679248	query time out
D2R	0.008209	0.021871	0.075406	0.390003	1.935566	8.056995

Query 5

	50K	250K	1M	5M	25M	100M
Sesame	0.057673	0.310220	1.313046	6.688869	32.895117	load time out
Virtuoso	0.221498	0.380958	0.846915	3.344963	1.623720	6.312297
SDB	0.129683	0.509083	1.741927	8.199694	43.878071	query time out
D2R	0.718670	2.373025	8.639514	41.823326	query time out	query time out

Query 6

	50K	250K	1M	5M	25M	100M
Sesame	0.004993	0.014792	0.064516	0.290293	1.483277	load time out
Virtuoso	0.003639	0.007368	0.020584	0.109199	0.536659	2.033258
SDB	0.003332	0.005291	0.014744	0.254799	1.197798	query time out
D2R	0.009314	0.013181	0.038695	0.144180	0.383284	1.607238

Query 7

	50K	250K	1M	5M	25M	100M
Sesame	0.005265	0.005510	0.010948	0.007205	0.505330	load time out
Virtuoso	0.125126	0.107109	0.125771	0.310843	2.290610	1.116620
SDB	0.020116	0.078490	0.402448	2.193056	13.129511	query time out
D2R	0.028352	0.030421	0.068561	0.069431	0.055678	0.366317

Query 8

	50K	250K	1M	5M	25M	100M
Sesame	0.005111	0.005286	0.005770	0.006150	0.315689	load time out
Virtuoso	0.026703	0.023340	0.024814	0.028546	4.535478	0.586523
SDB	0.046099	0.159153	0.660130	3.535463	20.466382	query time out
D2R	0.058440	0.069917	0.078460	0.094175	0.066114	0.366338

Query 9

	50K	250K	1M	5M	25M	100M
Sesame	0.010526	0.040371	0.202310	1.071040	5.728123	load time out
Virtuoso	0.039994	0.039972	0.039744	0.042302	0.106834	0.096961
SDB	0.006450	0.004545	0.004461	0.004560	0.018882	query time out
D2R	0.016164	0.015216	0.015215	0.026702	0.017857	0.030908

Query 10

	50K	250K	1M	5M	25M	100M
Sesame	0.002722	0.002589	0.003027	0.002814	0.179580	load time out
Virtuoso	0.585728	0.569584	0.642199	1.338361	6.708351	1.055452
SDB	0.009257	0.033867	0.102852	1.028677	3.141508	query time out
D2R	0.005088	0.005209	0.004926	0.005565	0.017070	0.005422

It turned out that no store is superior for all queries. Sesame is the fastest store for queries 1 to 4 at all dataset sizes. Sesame performs poorly for queries 5 and 9 against large datasets, which prevents the store from having the best overall performance for all dataset sizes. D2R Server, which showed an inferior overall runtime, is the fastest store for queries 6 to 10 against the 25M dataset. SDB turns out to be the fastest store for query 9 up to the 5M dataset, while Virtuoso is good at query 5 and is the only RDF store that managed to execute the benchmark for the 100M dataset.

5.3 Average Query Execution Time by Dataset Size and Query

Running 50 query mixes against the different stores led to the following average query execution times for different dataset sizes (in seconds):

50K Triple Dataset

	Sesame	Virtuoso	SDB	D2R
Query 1	0.002106	0.017650	0.002559	0.004810
Query 2	0.005274	0.071556	0.021276	0.038768
Query 3	0.002364	0.009790	0.003968	0.007125
Query 4	0.002595	0.016189	0.004618	0.008209
Query 5	0.057673	0.221498	0.115628	0.718670
Query 6	0.004993	0.003639	0.002521	0.009314
Query 7	0.005265	0.125126	0.016953	0.028352
Query 8	0.005111	0.026703	0.040807	0.058440
Query 9	0.010526	0.039994	0.004357	0.016164
Query 10	0.002722	0.585728	0.007880	0.005088

250K Triple Dataset

	Sesame	Virtuoso	SDB	D2R
Query 1	0.002913	0.017195	0.004955	0.011643
Query 2	0.004904	0.069347	0.024708	0.041448
Query 3	0.003793	0.008394	0.006671	0.021041
Query 4	0.004023	0.012962	0.008284	0.021871
Query 5	0.310220	0.380958	0.509083	2.373025
Query 6	0.014792	0.007368	0.005291	0.013181
Query 7	0.005510	0.107109	0.078490	0.030421
Query 8	0.005286	0.023340	0.159153	0.069917
Query 9	0.040371	0.039972	0.004545	0.015216
Query 10	0.002589	0.569584	0.033867	0.005209

1M Triple Dataset

	Sesame	Virtuoso	SDB	D2R
Query 1	0.003904	0.015160	0.012592	0.038524
Query 2	0.005523	0.069178	0.042374	0.047844
Query 3	0.004681	0.008904	0.014379	0.054744
Query 4	0.005491	0.013824	0.023286	0.075406
Query 5	1.313046	0.846915	1.741927	8.639514
Query 6	0.064516	0.020584	0.014744	0.038695
Query 7	0.010948	0.125771	0.402448	0.068561
Query 8	0.005770	0.024814	0.660130	0.078460
Query 9	0.202310	0.039744	0.004461	0.015215
Query 10	0.003027	0.642199	0.102852	0.004926

5M Triple Dataset

	Sesame	Virtuoso	SDB	D2R
Query 1	0.005051	0.020225	0.057347	0.195950
Query 2	0.006115	0.077868	0.071731	0.056018
Query 3	0.010618	0.011443	0.059678	0.226272
Query 4	0.009803	0.020450	0.112666	0.390003
Query 5	6.688869	3.344963	8.199694	41.823326
Query 6	0.290293	0.109199	0.254799	0.144180
Query 7	0.007205	0.310843	2.193056	0.069431
Query 8	0.006150	0.028546	3.535463	0.094175
Query 9	1.071040	0.042302	0.004560	0.026702
Query 10	0.002814	1.338361	1.028677	0.005565

25M Triple Dataset

	Sesame	Virtuoso	SDB	D2R
Query 1	0.035575	0.082935	0.328271	0.968688
Query 2	0.017859	0.085244	0.446813	0.041420
Query 3	0.049690	0.055770	0.317215	1.080856
Query 4	0.071493	0.100784	0.679248	1.935566
Query 5	32.895117	1.623720	43.878071	timed out
Query 6	1.483277	0.536659	1.197798	0.383284
Query 7	0.505330	2.290610	13.129511	0.055678
Query 8	0.315689	4.535478	20.466382	0.066114
Query 9	5.728123	0.106834	0.018882	0.017857
Query 10	0.179580	6.708351	3.141508	0.017070
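
As an illustration of how the per-dataset figures can be read, the following Python sketch determines the fastest store for each query on the 25M dataset. It is not part of the benchmark tooling: `AQET_25M` and `fastest_per_query` are hypothetical names, the values are copied from the 25M figures above, and the timed-out D2R run for query 5 is represented as `None`.

```python
# Average query execution times (seconds) on the 25M dataset.
# None marks the D2R run of query 5, which timed out.
AQET_25M = {
    1:  {"Sesame": 0.035575, "Virtuoso": 0.082935, "SDB": 0.328271, "D2R": 0.968688},
    2:  {"Sesame": 0.017859, "Virtuoso": 0.085244, "SDB": 0.446813, "D2R": 0.041420},
    3:  {"Sesame": 0.049690, "Virtuoso": 0.055770, "SDB": 0.317215, "D2R": 1.080856},
    4:  {"Sesame": 0.071493, "Virtuoso": 0.100784, "SDB": 0.679248, "D2R": 1.935566},
    5:  {"Sesame": 32.895117, "Virtuoso": 1.623720, "SDB": 43.878071, "D2R": None},
    6:  {"Sesame": 1.483277, "Virtuoso": 0.536659, "SDB": 1.197798, "D2R": 0.383284},
    7:  {"Sesame": 0.505330, "Virtuoso": 2.290610, "SDB": 13.129511, "D2R": 0.055678},
    8:  {"Sesame": 0.315689, "Virtuoso": 4.535478, "SDB": 20.466382, "D2R": 0.066114},
    9:  {"Sesame": 5.728123, "Virtuoso": 0.106834, "SDB": 0.018882, "D2R": 0.017857},
    10: {"Sesame": 0.179580, "Virtuoso": 6.708351, "SDB": 3.141508, "D2R": 0.017070},
}


def fastest_per_query(table):
    """Map each query id to the store with the lowest average execution time."""
    result = {}
    for query_id, row in table.items():
        timed = {store: t for store, t in row.items() if t is not None}
        result[query_id] = min(timed, key=timed.get)
    return result


if __name__ == "__main__":
    for query_id, store in fastest_per_query(AQET_25M).items():
        print(f"Query {query_id}: {store}")
```

On these figures, Sesame is fastest for queries 1 to 4, Virtuoso for query 5, and D2R Server for queries 6 to 10.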

This section contains initial results of benchmarking D2R Server and Virtuoso against a 100 million triple dataset.
Sesame did not manage to load the dataset within our 36 hour time limit. Jena SDB did not manage to execute the warm-up query mixes against the 100M dataset.
