Chris Bizer

Andreas Schultz

Contents

  1. Intro
  2. Benchmark Dataset
  3. Benchmark Machine
  4. Benchmark Results
    1. Jena SDB
    2. Sesame
    3. Virtuoso
    4. D2R
  5. Store Comparison


Date: 07/30/2008


1. Intro

This document presents initial results of running the Berlin SPARQL Benchmark (Version 1) against the RDF stores Virtuoso v5.0.6 and v5.0.7, Sesame Version 2.2-Beta2, and Jena SDB Version 1.1, and against D2R Server, a relational database-to-RDF wrapper. The stores were benchmarked with datasets ranging from 50,000 triples to 100,000,000 triples.

Note that this document has been superseded by the new results document for BSBM Version 3.

 


2. Benchmark Dataset

We benchmarked using the triple version of the BSBM dataset (benchmark scenario NTR). The benchmark was run for different dataset sizes. The datasets were generated using the BSBM data generator and fulfill the characteristics described in the BSBM specification.

Details about the datasets are summarized in the following table:

Number of Triples

50K 250K 1M 5M 25M 100M
Number of Products 91 477 1,915 9,609 48,172 194,207
Number of Producers 2 10 41 199 974 3,844
Number of Reviews 2,275 11,925 47,875 240,225 1,204,300 4,855,175
Number of Offers 1,820 9,540 38,300 192,180 963,440 3,884,140
Number of Vendors 2 10 39 196 961 3,912
Number of Reviewers 116 622 2,452 12,351 61,862 248,730
Number of Product Features 580 580 1,390 3,307 3,307 10,931
Number of Product Types 13 13 31 73 73 411
Total Number of Instances 4,899 23,177 92,043 458,140 2,283,089 9,201,350
Exact Total Number of Triples 50,116 250,492 1,000,226 5,000,453 25,000,557 100,001,402
File Size N-Triple (unzipped) 14 MB 70.7 MB 284.4 MB 1.4 GB 7 GB 28.2 GB

The benchmark datasets are available in an RDF triple representation and a relational representation. Both representations can be downloaded below:

Download N-Triples Representation of the Benchmark Datasets

Download MySQL dump of the Benchmark Datasets

 


3. Benchmark Machine

The benchmark was run on a machine with the following specification:

 


4. Benchmark Results


The load performance of the systems was measured by loading the N-Triple representation of the BSBM datasets into the triple stores and by loading the relational representation, in the form of MySQL dumps, into the RDBMS behind D2R Server. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.

The query performance of the systems was measured by running 50 BSBM query mixes (altogether 1250 queries) against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to enable the SUTs to load parts of the working set into main memory and to take advantage of query result and query execution plan caching, 10 BSBM query mixes (altogether 250 queries) were executed for warm-up before the actual times were measured.
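The measurement procedure described above can be sketched as follows. This is a minimal illustration rather than the actual BSBM test driver: `run_benchmark`, `execute_query`, and the two-query mix are hypothetical names introduced here, and a real mix would contain the 25 queries implied by the figures above (1250 queries in 50 mixes).

```python
import time
from statistics import mean


def run_benchmark(execute_query, query_mix, warmup_mixes=10, measured_mixes=50):
    """Return the average query execution time (seconds) per query type.

    execute_query: hypothetical callable that sends one query to the SUT.
    query_mix: list of (query_type, query_text) pairs forming one mix.
    """
    # Warm-up phase: run the mixes but discard the timings.
    for _ in range(warmup_mixes):
        for _, query in query_mix:
            execute_query(query)

    # Measured phase: record the wall-clock time of every query.
    timings = {}
    for _ in range(measured_mixes):
        for query_type, query in query_mix:
            start = time.perf_counter()
            execute_query(query)
            elapsed = time.perf_counter() - start
            timings.setdefault(query_type, []).append(elapsed)

    return {query_type: mean(values) for query_type, values in timings.items()}


# Stand-in for a real SPARQL protocol call, used only to keep the sketch runnable.
calls = []


def fake_execute(query):
    calls.append(query)


mix = [(1, "SELECT ..."), (2, "DESCRIBE ...")]
aqet = run_benchmark(fake_execute, mix, warmup_mixes=2, measured_mixes=3)
```

In a real run, `execute_query` would issue an HTTP request to the store's SPARQL endpoint, and the defaults of 10 warm-up and 50 measured mixes would apply.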

4.1 SDB (Hash) with MySQL over Joseki3


Jena SDB homepage

4.1.1 Configuration

The following changes were made to the default configuration of the software:


4.1.2 Load Time

The table below summarizes the load times of the N-Triple files (in seconds):

50K 250K 1M 5M 25M 100M
5.343 23.992 116.7 1,053.0 13,306.7 144,989.0


4.1.3 Benchmark Query results (seconds): AQET (Average Query Execution Time)


50K 250K 1M 5M 25M
Query 1 0.002559 0.004955 0.012592 0.057347 0.328271
Query 2 0.021276 0.024708 0.042374 0.071731 0.446813
Query 3 0.003968 0.006671 0.014379 0.059678 0.317215
Query 4 0.004618 0.008284 0.023286 0.112666 0.679248
Query 5 0.115628 0.509083 1.741927 8.199694 43.878071
Query 6 0.002521 0.005291 0.014744 0.254799 1.197798
Query 7 0.016953 0.078490 0.402448 2.193056 13.129511
Query 8 0.040807 0.159153 0.660130 3.535463 20.466382
Query 9 0.004357 0.004545 0.004461 0.004560 0.018882
Query 10 0.007880 0.033867 0.102852 1.028677 3.141508

We cannot yet report average query execution times for the 100M dataset, as SDB froze at query 7 after 3 hours while executing the first warm-up query mix.


4.1.4 Result Summaries


4.1.5 Run Logs (detailed information)

 

4.2 Sesame over Tomcat


Sesame homepage

4.2.1 Configuration

The following changes were made to the default configuration of the software:

Store Type: Native
Indexes: spoc, posc, psoc
JAVA_OPTS = ... -Xmx4096m ...

4.2.2 Load Time

The table below summarizes the load times of the N-Triple files (in seconds):

50K 250K 1M 5M 25M
3.269 17.706 132.743 1,988.839 27,674.091

When loading the 100M dataset, Sesame hit our 36-hour load time-out. Therefore, we cannot report query results for the 100M dataset.


4.2.3 Average Query Execution Time

The table below summarizes the average query execution time for each type of query over all 50 runs (in seconds):

50K 250K 1M 5M 25M
Query 1 0.002106 0.002913 0.003904 0.005051 0.035575
Query 2 0.005274 0.004904 0.005523 0.006115 0.017859
Query 3 0.002364 0.003793 0.004681 0.010618 0.049690
Query 4 0.002595 0.004023 0.005491 0.009803 0.071493
Query 5 0.057673 0.310220 1.313046 6.688869 32.895117
Query 6 0.004993 0.014792 0.064516 0.290293 1.483277
Query 7 0.005265 0.005510 0.010948 0.007205 0.505330
Query 8 0.005111 0.005286 0.005770 0.006150 0.315689
Query 9 0.010526 0.040371 0.202310 1.071040 5.728123
Query 10 0.002722 0.002589 0.003027 0.002814 0.179580

4.2.4 Result Summaries


4.2.5 Run Logs (detailed information)

4.3 Virtuoso Open-Source Edition v5.0.6 and v5.0.7


Virtuoso homepage

We ran the benchmark for the 50K to 25M datasets on Virtuoso v5.0.6 about two weeks ago. After the release of v5.0.7, we ran the benchmark for the 100M dataset on v5.0.7.

Also note that the Virtuoso team is currently working on a new version which will be released shortly and is likely to perform much better on the benchmark. We will run the benchmark for Virtuoso again after this version is publicly released.


4.3.1 Configuration

The following changes were made to the default configuration of the software:

NumberOfBuffers = 360000
MaxDirtyBuffers = 220000
POSG, PSOG, SPOG

 

MaxCheckpointRemap = 917504
NumberOfBuffers = 420000
MaxDirtyBuffers = 240000
TransactionAfterImageLimit = 28000000000
No supplementary indexes.

4.3.2 Load Time

The table below summarizes the load times of the N-Triple files (in seconds):

50K 250K 1M 5M 25M 100M
2 33 87 609 49,096.0 16,321.0

4.3.3 Average Query Execution Time

The table below summarizes the average query execution time for each type of query over all 50 runs (in seconds):


50K 250K 1M 5M 25M 100M
Query 1 0.017650 0.017195 0.015160 0.020225 0.082935 0.033174
Query 2 0.071556 0.069347 0.069178 0.077868 0.085244 0.069636
Query 3 0.009790 0.008394 0.008904 0.011443 0.055770 0.028237
Query 4 0.016189 0.012962 0.013824 0.020450 0.100784 0.045437
Query 5 0.221498 0.380958 0.846915 3.344963 1.623720 6.312297
Query 6 0.003639 0.007368 0.020584 0.109199 0.536659 2.033258
Query 7 0.125126 0.107109 0.125771 0.310843 2.290610 1.116620
Query 8 0.026703 0.023340 0.024814 0.028546 4.535478 0.586523
Query 9 0.039994 0.039972 0.039744 0.042302 0.106834 0.096961
Query 10 0.585728 0.569584 0.642199 1.338361 6.708351 1.055452

Note that some queries are faster for the 100M dataset than for the 25M dataset. This is likely because we used Virtuoso v5.0.7 for the 100M run.

4.3.4 Result Summaries


4.3.5 Run Logs (detailed information)

 

4.4 D2R Server 0.4


D2R Server is a database-to-RDF wrapper which rewrites SPARQL queries into SQL queries against an application-specific relational schema based on a mapping.


4.4.1 Configuration

The following changes were made to the default configuration of the software:

4.4.2 Load Time

The table below summarizes the load times of the SQL dump (in seconds):

50K 250K 1M 5M 25M 100M
0.4 2 6 38 202 1,177

4.4.3 Average Query Execution Time

The table below summarizes the average query execution time for each type of query over all 50 runs (in seconds):

50K 250K 1M 5M 25M 100M
Query 1 0.004810 0.011643 0.038524 0.195950 0.968688 4.067863
Query 2 0.038768 0.041448 0.047844 0.056018 0.041420 0.031567
Query 3 0.007125 0.021041 0.054744 0.226272 1.080856 4.140888
Query 4 0.008209 0.021871 0.075406 0.390003 1.935566 8.056995
Query 5 0.718670 2.373025 8.639514 41.823326 time out time out
Query 6 0.009314 0.013181 0.038695 0.144180 0.383284 1.607238
Query 7 0.028352 0.030421 0.068561 0.069431 0.055678 0.366317
Query 8 0.058440 0.069917 0.078460 0.094175 0.066114 0.366338
Query 9 0.016164 0.015216 0.015215 0.026702 0.017857 0.030908
Query 10 0.005088 0.005209 0.004926 0.005565 0.017070 0.005422


Query 5 was excluded from the runs over the 25M and 100M datasets after hitting our 45-second time-out.

4.4.4 Result Summaries


4.4.5 Run Logs (detailed information)

 


5. Store Comparison

This section provides a comparison of the benchmark results of the four stores.

5.1 Overall Query Execution Time

Running 50 query mixes against the different stores took the following overall times (in seconds). The best performance figure for each dataset size is set in bold in the tables.

Sesame Virtuoso SDB D2R
50K 9.410 162.036 23.436 66.486
250K 28.597 162.803 72.964 153.445
1M 115.198 201.396 268.000 484.430
5M 569.058 476.806 1406.686 2188.072
25M 3038.205 2089.118 7623.958 not applicable
100M load time out 906.679* not applicable not applicable

* Virtuoso Version 5.0.7. The other Virtuoso results are from Version 5.0.6.

It turns out that Sesame is the fastest RDF store for small dataset sizes. Virtuoso is the fastest RDF store for big datasets, but is rather slow for small ones. According to feedback from the Virtuoso team, this is because Virtuoso spends an equal amount of time on compiling the query and deciding on the query execution plan for all dataset sizes. Compared to the RDF stores, D2R Server showed an inferior overall performance. When running the query mix against the 25M and 100M datasets, query 5 hit the 45-second time-out and was excluded from the run. Thus there is no overall run time figure for D2R Server at these dataset sizes.
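As a quick check of this reading, the overall run times from the table in this section can be reduced to the fastest store per dataset size. The sketch below simply transcribes the reported figures. The dictionary layout and variable names are illustrative choices, and the 100M row is omitted because only Virtuoso completed it.

```python
# Overall run times in seconds for 50 query mixes, transcribed from the table above.
overall_runtime = {
    "50K": {"Sesame": 9.410, "Virtuoso": 162.036, "SDB": 23.436, "D2R": 66.486},
    "250K": {"Sesame": 28.597, "Virtuoso": 162.803, "SDB": 72.964, "D2R": 153.445},
    "1M": {"Sesame": 115.198, "Virtuoso": 201.396, "SDB": 268.000, "D2R": 484.430},
    "5M": {"Sesame": 569.058, "Virtuoso": 476.806, "SDB": 1406.686, "D2R": 2188.072},
    # D2R is "not applicable" at 25M because query 5 was excluded from the run.
    "25M": {"Sesame": 3038.205, "Virtuoso": 2089.118, "SDB": 7623.958},
}

# For each dataset size, pick the store with the smallest overall run time.
fastest = {size: min(times, key=times.get) for size, times in overall_runtime.items()}

for size, store in fastest.items():
    print(f"{size}: {store} ({overall_runtime[size][store]:.3f} s)")
```

With these figures, Sesame leads up to the 1M dataset and Virtuoso leads at 5M and 25M.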

5.2 Average Query Execution Time by Query and Dataset Size

Running 50 query mixes against the different stores led to the following average query execution times for different queries (in seconds). The best performance figure for each dataset size is set in bold in the tables.

Query 1

50K 250K 1M 5M 25M 100M
Sesame 0.002106 0.002913 0.003904 0.005051 0.035575 load time out
Virtuoso 0.017650 0.017195 0.015160 0.020225 0.082935 0.033174
SDB 0.002559 0.004955 0.012592 0.057347 0.328271 query time out
D2R 0.004810 0.011643 0.038524 0.195950 0.968688 4.067863

Query 2

50K 250K 1M 5M 25M 100M
Sesame 0.005274 0.004904 0.005523 0.006115 0.017859 load time out
Virtuoso 0.071556 0.069347 0.069178 0.077868 0.085244 0.069636
SDB 0.023058 0.024708 0.042374 0.071731 0.446813 query time out
D2R 0.038768 0.041448 0.047844 0.056018 0.041420 0.031567

Query 3

50K 250K 1M 5M 25M 100M
Sesame 0.002364 0.003793 0.004681 0.010618 0.049690 load time out
Virtuoso 0.009790 0.008394 0.008904 0.011443 0.055770 0.028237
SDB 0.005324 0.006671 0.014379 0.059678 0.317215 query time out
D2R 0.007125 0.021041 0.054744 0.226272 1.080856 4.140888

Query 4

50K 250K 1M 5M 25M 100M
Sesame 0.002595 0.004023 0.005491 0.009803 0.071493 load time out
Virtuoso 0.016189 0.012962 0.013824 0.020450 0.100784 0.045437
SDB 0.005694 0.008284 0.023286 0.112666 0.679248 query time out
D2R 0.008209 0.021871 0.075406 0.390003 1.935566 8.056995

Query 5

50K 250K 1M 5M 25M 100M
Sesame 0.057673 0.310220 1.313046 6.688869 32.895117 load time out
Virtuoso 0.221498 0.380958 0.846915 3.344963 1.623720 6.312297
SDB 0.129683 0.509083 1.741927 8.199694 43.878071 query time out
D2R 0.718670 2.373025 8.639514 41.823326 query time out query time out

Query 6

50K 250K 1M 5M 25M 100M
Sesame 0.004993 0.014792 0.064516 0.290293 1.483277 load time out
Virtuoso 0.003639 0.007368 0.020584 0.109199 0.536659 2.033258
SDB 0.003332 0.005291 0.014744 0.254799 1.197798 query time out
D2R 0.009314 0.013181 0.038695 0.144180 0.383284 1.607238

Query 7

50K 250K 1M 5M 25M 100M
Sesame 0.005265 0.005510 0.010948 0.007205 0.505330 load time out
Virtuoso 0.125126 0.107109 0.125771 0.310843 2.290610 1.116620
SDB 0.020116 0.078490 0.402448 2.193056 13.129511 query time out
D2R 0.028352 0.030421 0.068561 0.069431 0.055678 0.366317

Query 8

50K 250K 1M 5M 25M 100M
Sesame 0.005111 0.005286 0.005770 0.006150 0.315689 load time out
Virtuoso 0.026703 0.023340 0.024814 0.028546 4.535478 0.586523
SDB 0.046099 0.159153 0.660130 3.535463 20.466382 query time out
D2R 0.058440 0.069917 0.078460 0.094175 0.066114 0.366338

Query 9

50K 250K 1M 5M 25M 100M
Sesame 0.010526 0.040371 0.202310 1.071040 5.728123 load time out
Virtuoso 0.039994 0.039972 0.039744 0.042302 0.106834 0.096961
SDB 0.006450 0.004545 0.004461 0.004560 0.018882 query time out
D2R 0.016164 0.015216 0.015215 0.026702 0.017857 0.030908

Query 10

50K 250K 1M 5M 25M 100M
Sesame 0.002722 0.002589 0.003027 0.002814 0.179580 load time out
Virtuoso 0.585728 0.569584 0.642199 1.338361 6.708351 1.055452
SDB 0.009257 0.033867 0.102852 1.028677 3.141508 query time out
D2R 0.005088 0.005209 0.004926 0.005565 0.017070 0.005422

It turned out that no store is superior for all queries. Sesame is the fastest store for queries 1 to 4 at all dataset sizes it was able to load. Sesame performs poorly for queries 5 and 9 against large datasets, which prevents the store from having the best overall performance for all dataset sizes. D2R Server, which showed an inferior overall runtime, is the fastest store for queries 6 to 10 against the 25M dataset. SDB turns out to be the fastest store for query 9 up to the 5M dataset, while Virtuoso is good at query 5 and is the only RDF store that managed to execute the benchmark for the 100M dataset.
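The per-query observations for the 25M dataset can be reproduced from the reported averages by grouping the queries by their fastest store. The following sketch uses the 25M figures given in this document; the data structure and names are illustrative only.

```python
# Average query execution times (seconds) for the 25M dataset, as reported here.
# D2R has no entry for query 5 because that query timed out.
aqet_25m = {
    1: {"Sesame": 0.035575, "Virtuoso": 0.082935, "SDB": 0.328271, "D2R": 0.968688},
    2: {"Sesame": 0.017859, "Virtuoso": 0.085244, "SDB": 0.446813, "D2R": 0.041420},
    3: {"Sesame": 0.049690, "Virtuoso": 0.055770, "SDB": 0.317215, "D2R": 1.080856},
    4: {"Sesame": 0.071493, "Virtuoso": 0.100784, "SDB": 0.679248, "D2R": 1.935566},
    5: {"Sesame": 32.895117, "Virtuoso": 1.623720, "SDB": 43.878071},
    6: {"Sesame": 1.483277, "Virtuoso": 0.536659, "SDB": 1.197798, "D2R": 0.383284},
    7: {"Sesame": 0.505330, "Virtuoso": 2.290610, "SDB": 13.129511, "D2R": 0.055678},
    8: {"Sesame": 0.315689, "Virtuoso": 4.535478, "SDB": 20.466382, "D2R": 0.066114},
    9: {"Sesame": 5.728123, "Virtuoso": 0.106834, "SDB": 0.018882, "D2R": 0.017857},
    10: {"Sesame": 0.179580, "Virtuoso": 6.708351, "SDB": 3.141508, "D2R": 0.017070},
}

# Group query numbers by the store with the lowest average execution time.
queries_won = {}
for query, times in aqet_25m.items():
    winner = min(times, key=times.get)
    queries_won.setdefault(winner, []).append(query)

print(queries_won)
```

On this data the grouping yields queries 1 to 4 for Sesame, query 5 for Virtuoso, and queries 6 to 10 for D2R Server.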

5.3 Average Query Execution Time by Dataset Size and Query

Running 50 query mixes against the different stores led to the following average query execution times for different dataset sizes (in seconds):

50K Triple Dataset

Sesame Virtuoso SDB D2R
Query 1 0.002106 0.017650 0.002559 0.004810
Query 2 0.005274 0.071556 0.021276 0.038768
Query 3 0.002364 0.009790 0.003968 0.007125
Query 4 0.002595 0.016189 0.004618 0.008209
Query 5 0.057673 0.221498 0.115628 0.718670
Query 6 0.004993 0.003639 0.002521 0.009314
Query 7 0.005265 0.125126 0.016953 0.028352
Query 8 0.005111 0.026703 0.040807 0.058440
Query 9 0.010526 0.039994 0.004357 0.016164
Query 10 0.002722 0.585728 0.007880 0.005088

250K Triple Dataset

Sesame Virtuoso SDB D2R
Query 1 0.002913 0.017195 0.004955 0.011643
Query 2 0.004904 0.069347 0.024708 0.041448
Query 3 0.003793 0.008394 0.006671 0.021041
Query 4 0.004023 0.012962 0.008284 0.021871
Query 5 0.310220 0.380958 0.509083 2.373025
Query 6 0.014792 0.007368 0.005291 0.013181
Query 7 0.005510 0.107109 0.078490 0.030421
Query 8 0.005286 0.023340 0.159153 0.069917
Query 9 0.040371 0.039972 0.004545 0.015216
Query 10 0.002589 0.569584 0.033867 0.005209

1M Triple Dataset

Sesame Virtuoso SDB D2R
Query 1 0.003904 0.015160 0.012592 0.038524
Query 2 0.005523 0.069178 0.042374 0.047844
Query 3 0.004681 0.008904 0.014379 0.054744
Query 4 0.005491 0.013824 0.023286 0.075406
Query 5 1.313046 0.846915 1.741927 8.639514
Query 6 0.064516 0.020584 0.014744 0.038695
Query 7 0.010948 0.125771 0.402448 0.068561
Query 8 0.005770 0.024814 0.660130 0.078460
Query 9 0.202310 0.039744 0.004461 0.015215
Query 10 0.003027 0.642199 0.102852 0.004926

5M Triple Dataset

Sesame Virtuoso SDB D2R
Query 1 0.005051 0.020225 0.057347 0.195950
Query 2 0.006115 0.077868 0.071731 0.056018
Query 3 0.010618 0.011443 0.059678 0.226272
Query 4 0.009803 0.020450 0.112666 0.390003
Query 5 6.688869 3.344963 8.199694 41.823326
Query 6 0.290293 0.109199 0.254799 0.144180
Query 7 0.007205 0.310843 2.193056 0.069431
Query 8 0.006150 0.028546 3.535463 0.094175
Query 9 1.071040 0.042302 0.004560 0.026702
Query 10 0.002814 1.338361 1.028677 0.005565

25M Triple Dataset

Sesame Virtuoso SDB D2R
Query 1 0.035575 0.082935 0.328271 0.968688
Query 2 0.017859 0.085244 0.446813 0.041420
Query 3 0.049690 0.055770 0.317215 1.080856
Query 4 0.071493 0.100784 0.679248 1.935566
Query 5 32.895117 1.623720 43.878071 timed out
Query 6 1.483277 0.536659 1.197798 0.383284
Query 7 0.505330 2.290610 13.129511 0.055678
Query 8 0.315689 4.535478 20.466382 0.066114
Query 9 5.728123 0.106834 0.018882 0.017857
Query 10 0.179580 6.708351 3.141508 0.017070

100M Triple Dataset

This section contains initial results of benchmarking D2R Server and Virtuoso against a 100 million triple dataset.
Sesame did not manage to load the dataset within our 36 hour time limit. Jena SDB did not manage to execute the warm-up query mixes against the 100M dataset.

D2R Virtuoso
Query 1 4.067863 0.033174
Query 2 0.031567 0.069636
Query 3 4.140888 0.028237
Query 4 8.056995 0.045437
Query 5 not executed 6.312297
Query 6 1.607238 2.033258
Query 7 0.366317 1.116620
Query 8 0.366338 0.586523
Query 9 0.030908 0.096961
Query 10 0.005422 1.055452