Contents
- Introduction
- Benchmark Datasets
- Benchmark Machine
- Benchmark Results for the Explore Use Case
- Benchmark Results for the BI Use Case
- Benchmark Results for the Cluster Edition
- Store Comparison
- Thanks
Document Version: 0.9
Publication Date: 04/22/2013
1. Introduction
The Berlin SPARQL Benchmark (BSBM) is a benchmark for comparing the performance of storage systems that expose SPARQL endpoints. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and consumers have posted reviews about products.
We note that the data generator and test driver used here come from an updated version of the original BSBM tools (http://sf.net/projects/bibm), which provides several modifications to the test driver and the data generator. These changes have been adopted in the official V3.1 BSBM benchmark definition. The changes are as follows:
- The test driver reports additional and more detailed metrics, including "power" and "throughput" scores.
- The test driver has a drill-down mode that starts at a broad product category and zooms into smaller categories in subsequent queries. Previously, the product category query parameter was picked randomly for each query; if this was a broad category, the query would be very slow, and if it was a very specific category, it would run very fast. This made it hard to compare individual query runs and also introduced large variation in the overall result metric. The drill-down mode makes the metric more stable and also tests a query pattern (drill down) that is common in practice.
- One query (BI Q6), whose result grows quadratically with the dataset size, was removed. This query would become very expensive in the 1B and larger tests, so its performance would dominate the result.
- The text data in the generated strings is more realistic. This means you can do (more) sensible keyword queries on it.
- The new generator was adapted to enable parallel data generation. Specifically, one can let it generate a subset of the data files. By starting multiple data generators on multiple machines, one can thus hand-parallelize data generation. This is quite handy for the larger-scale tests, which would otherwise take weeks.
As in the original BSBM benchmark, the test driver can perform single-client or multi-client runs.
This document presents the results of an April 2013 BSBM experiment in which the Berlin SPARQL Benchmark Version 3.1 was used to measure the performance of:
- BigData (rev. 6528 as of July 02 2012)
- BigOwlim (version 5.2.5524); BigOwlim (version 5.3.5777) for the cluster edition
- TDB (version 0.9.4)
- Virtuoso (06.04.3132-pthreads for Linux as of May 14 2012)
- Virtuoso (07.00.3202-pthreads for Linux as of Jan 1 2013)
The stores were benchmarked with datasets of up to 150 billion triples. Details about the dataset sizes are shown in the following table.
Use Cases | Explore (single machine) | BI (single machine) | Explore & BI (cluster) |
---|---|---|---|
Datasets (million triples) | 100, 200, 1000 | 10, 100, 1000 | 10000, 50000, 150000 |
These results extend the state-of-the-art in various dimensions:
- scale: this is the first time that RDF store benchmark results at such a large size have been published. The previously published BSBM results were on 200M triples; the 150B experiments thus mark a 750x increase in scale.
- workload: this is the first time that results on the Business Intelligence (BI) workload are published. In contrast to the Explore workload, which features short-running "transactional" queries, the BI workload consists of queries that go through possibly billions of tuples, grouping and aggregating them (using the respective functionality, new in SPARQL 1.1). In contrast to one year ago, we find that the majority of the RDF stores are now able to run the BI workload.
- architecture: this is the first time that RDF store technology with cluster functionality has been publicly benchmarked. These experiments include tests using the Virtuoso7 cluster edition as well as the BigOwlim 5.3 cluster edition.
2. Benchmark Datasets
We ran the benchmark using the Triple version of the BSBM dataset. The benchmark was run for different dataset sizes. The datasets were generated using the BIBM data generator and fulfill the characteristics described in the BSBM specification.
Details about the benchmark datasets are summarized in the following table:
Number of Triples | 10M | 100M | 200M | 1B | 10B | 50B | 150B |
---|---|---|---|---|---|---|---|
Number of Products | 28480 | 284800 | 569600 | 2848000 | 28480000 | 142400000 | 427200000 |
Number of Producers | 559 | 5623 | 11232 | 56288 | 563142 | 2815554 | 8446788 |
Number of Product Features | 19180 | 47531 | 93876 | 167836 | 423832 | 796470 | 1593390 |
Number of Product Types | 585 | 2011 | 3949 | 7021 | 22527 | 42129 | 84259 |
Number of Vendors | 284 | 2838 | 5675 | 28439 | 284610 | 1421729 | 4264028 |
Number of Offers | 569600 | 5696000 | 11392000 | 56960000 | 569600000 | 2848000000 | 8544000000 |
Number of Reviewers | 14613 | 145961 | 291923 | 1459584 | 14599162 | 72989573 | 218974622 |
Number of Reviews | 284800 | 2848000 | 5696000 | 28480000 | 284800000 | 1424000000 | 4272000000 |
Exact Total Number of Triples* | 10119864 | 100062249 | 199945456 | 999700717 | 9967546016 | 49853640808 | 149513009920 |
File Size Turtle (unzipped) | 467 MB | 4.6 GB | 9.2 GB | 48 GB | 568 GB | 2.8 TB | 8.6 TB |
(*: As the 10B, 50B, and 150B datasets were generated in parallel on 8 machines, the total number of triples is approximated by multiplying the number of triples generated on one machine by 8.)
Note: All datasets were generated with the -fc option for forward chaining.
The BSBM dataset generator and test driver can be downloaded from SourceForge.
The RDF representation of the benchmark datasets can be generated in the following way:
To generate the 100M dataset as Turtle file type the following command in the BSBM directory:
./generate -fc -s ttl -fn dataset_100M -pc 284826 -pareto
To generate the 150B dataset in 1000 Turtle files in multiple machines (e.g., 8 machines, each machine has 125 files),
type the following command in the BSBM directory:
./generate -fc -s ttl -fn dataset150000m -pc 427200000 -nof 1000 -nom 8 -mId <machineID> -pareto
(The <machineID> is 1, 2, 3, …, 8, depending on which of the 8 machines the command is run on.)
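For illustration only, the eight per-machine invocations could be launched from one control script along the following lines; the node names (bricks1 to bricks8) and the working directory are placeholders for whatever the local cluster uses, not values taken from the report:
#!/bin/bash
# Hypothetical launcher: start one data generator per machine over ssh.
# The "bricksN" host names and the ~/bsbm directory are assumptions.
for id in 1 2 3 4 5 6 7 8; do
  ssh bricks$id "cd ~/bsbm && nohup ./generate -fc -s ttl -fn dataset150000m -pc 427200000 -nof 1000 -nom 8 -mId $id -pareto > generate_$id.log 2>&1 &"
done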
Variations:
* To generate N-Triples instead of Turtle: use -s nt instead of -s ttl
* To generate the update dataset for the Explore and Update use case: add -ud
* To generate multiple files instead of one (for example, 100 files): add -nof 100
* To write the test driver data to a different directory (the default is td_data), for example for the 100M dataset: add -dir td_data_100M
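As an illustrative example (this exact command is not from the report, and it assumes the options compose freely), the variations above can be combined in a single invocation; the following would generate the 100M dataset as N-Triples in 100 files, together with the update dataset, writing the test driver data to a separate directory:
./generate -fc -s nt -fn dataset_100M -pc 284826 -nof 100 -ud -dir td_data_100M -pareto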
3. Benchmark Machine
We used the CWI Scilens (www.scilens.org) cluster for the benchmark experiment. This cluster is designed for high I/O bandwidth and consists of multiple layers of machines. In order to get large amounts of RAM, we used only the "bricks" layer, which contains its most powerful machines.
The machines were connected by Mellanox MCX353A-QCBT ConnectX3 VPI HCA cards (QDR IB 40Gb/s and 10GigE) through an InfiniScale IV QDR InfiniBand switch (Mellanox MIS5025Q).
Each machine has the following specification.
- Hardware: (8 machines)
- Processors: 2 x Intel(R) Xeon(R) CPU E5-2650, 2.00GHz (8 cores & hyperthreading), Sandy Bridge architecture
- Memory: 256GB
- Hard Disks: 3 x 1.8TB (7,200 rpm) SATA in RAID 0 (180MB/s sequential throughput).
- Software:
- Operating System: Linux version 3.3.4-3.fc16.x86_64
- Filesystem: ext4
- Java Version and JVM: Version 1.6.0_31, 64-Bit Server VM (build 20.6-b01).
- BSBM generator and test driver version: bibm-0.7.8
The total cost of this configuration was EUR 70,000 when acquired in 2012.
4. Benchmark Results for the Explore Use Case
This section reports the results of running the Explore use case of the BSBM benchmark against:
- BigData (rev. 6528)
- BigOwlim (version 5.2.5524)
- TDB (version 0.9.4)
- Virtuoso6 (06.04.3132-pthreads for Linux as of May 14 2012)
- Virtuoso7 (07.00.3202-pthreads for Linux as of Jan 1 2013)
The load performance of the systems was measured by loading the Turtle representation of the BSBM datasets into the triple stores. The loaded datasets were forward chained and contained all rdf:type statements for product types. Thus the systems under test did not have to do any inferencing.
The query performance of the systems was measured by running 500 BSBM query mixes against the systems over the SPARQL protocol. The test driver and the system under test (SUT) were running on the same machine in order to reduce the influence of network latency. In order to measure the sustainable performance of the SUTs, a large number of warm-up runs were executed before the actual single-client test runs (as a ramp-up period). Drill-down mode was used for all tests.
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store (optionally clearing OS caches and swap), then restart the store.
- Execute a single-client test run (500 query mixes for the performance measurement, randomizer seed: 9834533) with 2000 warm-up runs.
- Execute multi-client runs (4, 8 and 64 clients; randomizer seeds: 8188326, 9175932 and 4187411). For each run, the number of warm-up query mixes is two times the number of clients.
./testdriver -seed 9834533 -w 2000 -runs 500 -drill -o result_single.xml http://sparql-endpoint
For example, for a run with 4 clients, execute:
./testdriver -seed 8188326 -w 8 -mt 4 -drill -o results_4clients.xml http://sparql-endpoint
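Analogously, sketches of the 8-client and 64-client invocations (derived from the seeds and the two-warm-up-mixes-per-client rule listed above; not commands quoted from the report) would be:
./testdriver -seed 9175932 -w 16 -mt 8 -drill -o results_8clients.xml http://sparql-endpoint
./testdriver -seed 4187411 -w 128 -mt 64 -drill -o results_64clients.xml http://sparql-endpoint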
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
An overview of the load times for the SUTs and the different datasets is given in the following table (in hh:mm:ss):
SUT | 10M | 100M | 200M | 1B |
---|---|---|---|---|
BigData | 00:02:39 | 00:25:35 | 00:59:25 | - |
BigOwlim | 00:02:31 | 00:22:47 | 00:47:19 | 04:09:39 |
TDB | 00:09:41 | 01:37:55 | 03:34:59 | - |
Virtuoso6 | 00:07:06 | 00:19:26 | 00:31:30 | 01:10:30 |
Virtuoso7 | - | 00:03:09 | - | 00:27:11 |
(*: The 10M, 100M, 200M and 1B datasets were split into 1, 10, 20 and 100 Turtle files, respectively.)
(-: We did not test/load this dataset with this SUT.)
4.1 BigData
4.1.1 Configuration
The following changes were made to the default configuration
of the software:
- BigData: Version rev. 6528
The bibm3 directory was copied into the bigdata-perf directory.
For loading and starting the server the ANT script in the directory "bigdata-perf/bibm3" was used.
4.1.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
100M | 200M |
---|---|
00:25:35 | 00:59:25 |
4.1.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100M | 200M |
---|---|---|
Query 1 | 49.955 | 49.520 |
Query 2 | 42.769 | 43.713 |
Query 3 | 37.280 | 38.355 |
Query 4 | 36.846 | 36.830 |
Query 5 | 2.684 | 1.799 |
Query 7 | 16.172 | 16.548 |
Query 8 | 37.498 | 38.721 |
Query 9 | 59.524 | 61.476 |
Query 10 | 41.326 | 42.427 |
Query 11 | 62.375 | 63.784 |
Query 12 | 50.989 | 52.094 |
4.1.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH) meaning that larger numbers are better.
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 12512.278 | 17949.632 | 19574.007 | 20422.626 |
200M | 10059.940 | 9762.856 | 11572.433 | 12935.595 |
4.1.5 Result Summaries
4.2 BigOwlim
4.2.1 Configuration
The following changes were made to the default configuration
of the software:
- BigOwlim: Version 5.2.5524
- Tomcat: Version 7.0.30
Modified heap size:
JAVA_OPTS="-Dinfo.aduna.platform.appdata.basedir=`pwd`/data -Xmx200G "
- Sesame: Version 2.6.8
- Config files:
4.2.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
100M | 200M | 1B |
---|---|---|
00:22:47 | 00:47:19 | 04:09:39 |
4.2.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100M | 200M | 1B |
---|---|---|---|
Query 1 | 93.773 | 65.385 | 25.128 |
Query 2 | 115.960 | 65.158 | 34.181 |
Query 3 | 170.242 | 61.155 | 26.042 |
Query 4 | 140.607 | 127.747 | 12.868 |
Query 5 | 1.868 | 1.199 | 0.198 |
Query 7 | 75.746 | 98.357 | 32.593 |
Query 8 | 93.467 | 193.087 | 60.702 |
Query 9 | 202.041 | 105.759 | 38.391 |
Query 10 | 146.327 | 69.411 | 60.357 |
Query 11 | 368.732 | 74.074 | 65.428 |
Query 12 | 244.738 | 197.239 | 61.418 |
4.2.4 Benchmark Overall results: QMpH for the 100M, 200M, 1B datasets for all runs
For the 100M, 200M, 1B datasets we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH) meaning that larger numbers are better.
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 14029.453 | 17184.314 | 11677.860 | 8321.202 |
200M | 9170.083 | 8130.137 | 5614.489 | 5150.768 |
1B | 1669.899 | 2246.865 | 1081.508 | 912.518 |
4.2.5 Result Summaries
4.3 TDB
4.3.1 Configuration
The following changes were made to the default configuration
of the software:
- TDB: Version 0.9.4
Loading was done with tdbloader2.
Statistics for the BGP optimizer were generated with the "tdbconfig stats" command and copied into the database directory; a sketch of the full sequence is given after this list.
- Fuseki: Version 0.2.5
Started server with: ./fuseki-server --loc /database/tdb /bsbm
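As a rough sketch of the TDB setup (the dataset file name and database location are placeholders, and the exact statistics invocation and output file name are assumptions, since the report only names the tools):
# build the TDB database from the generated Turtle data (placeholder paths)
./tdbloader2 --loc /database/tdb dataset_100M.ttl
# generate BGP optimizer statistics and place them in the database directory (assumed invocation)
./tdbconfig stats --loc /database/tdb > /database/tdb/stats.opt
# start the SPARQL endpoint
./fuseki-server --loc /database/tdb /bsbm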
4.3.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
100M | 200M |
---|---|
01:37:55 | 03:34:59 |
4.3.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
Query | 100M | 200M |
---|---|---|
Query 1 | 119.048 | 94.877 |
Query 2 | 158.755 | 151.883 |
Query 3 | 84.660 | 70.492 |
Query 4 | 70.912 | 52.759 |
Query 5 | 1.959 | 1.308 |
Query 7 | 196.754 | 184.349 |
Query 8 | 228.258 | 199.362 |
Query 9 | 355.999 | 319.489 |
Query 10 | 297.619 | 267.094 |
Query 11 | 483.092 | 450.045 |
Query 12 | 204.834 | 192.901 |
4.3.4 Benchmark Overall results: QMpH for the 100M and 200M datasets for all runs
For the 100M and 200M datasets we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH) meaning that larger numbers are better.
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 15381.857 | 19036.097 | 24646.705 | 14838.483 |
200M | 10573.858 | 9540.452 | 18610.896 | 8265.151 |
4.3.5 Result Summaries
4.4 Virtuoso6 & Virtuoso7
4.4.1 Configuration
The following changes were made to the default configuration
of the software:
- Virtuoso6: Version 06.04.3132-pthreads for Linux as of May 14 2012
- Virtuoso7: Version 07.00.3202-pthreads for Linux as of Jan 1 2013
Loading of datasets:
The loading was done by running multiple loading processes (calling the rdf_loader_run() function); a sketch is given after this list.
For the 100M, 200M and 1B datasets, 10, 20 and 100 files were generated, respectively.
For the configuration see the "virtuoso.ini" file.
- Config files:
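As an illustrative sketch only (the report does not list the exact commands; the file path, graph IRI, port and credentials below are placeholders), a Virtuoso bulk load of this kind typically registers the generated files and then starts several loader processes via isql:
# register the generated files for bulk loading (placeholder path and graph IRI)
isql 1111 dba dba exec="ld_dir ('/data/bsbm/ttl', '*.ttl', 'http://bsbm.example.org');"
# start loader processes; running this in several isql sessions loads files in parallel
isql 1111 dba dba exec="rdf_loader_run ();" &
isql 1111 dba dba exec="rdf_loader_run ();" &
wait
# make the loaded data durable
isql 1111 dba dba exec="checkpoint;"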
4.4.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
- Virtuoso6
100M | 200M | 1B |
---|---|---|
00:19:26 | 00:31:30 | 01:10:30 |
- Virtuoso7
100M | 1B |
---|---|
00:03:09 | 00:27:11 |
4.4.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 500 runs (in QpS):
- Virtuoso6
Query | 100M | 200M | 1B |
---|---|---|---|
Query 1 | 232.234 | 217.865 | 87.245 |
Query 2 | 109.445 | 110.019 | 79.791 |
Query 3 | 180.245 | 174.216 | 119.104 |
Query 4 | 116.604 | 111.732 | 42.586 |
Query 5 | 9.976 | 7.168 | 1.201 |
Query 7 | 30.001 | 32.918 | 31.840 |
Query 8 | 117.247 | 124.502 | 127.698 |
Query 9 | 397.456 | 363.042 | 132.459 |
Query 10 | 122.926 | 123.487 | 99.433 |
Query 11 | 539.957 | 493.583 | 500.501 |
Query 12 | 220.167 | 215.424 | 207.641 |
- Virtuoso7
Query | 100M | 1B |
---|---|---|
Query 1 | 125.786 | 75.324 |
Query 2 | 68.929 | 68.820 |
Query 3 | 117.426 | 62.243 |
Query 4 | 58.514 | 30.473 |
Query 5 | 21.182 | 6.064 |
Query 7 | 54.484 | 55.356 |
Query 8 | 93.336 | 97.248 |
Query 9 | 173.898 | 176.772 |
Query 10 | 107.968 | 101.678 |
Query 11 | 214.133 | 225.124 |
Query 12 | 126.743 | 137.287 |
4.4.4 Benchmark Overall results: QMpH for the 100M, 200M, 1B datasets for all runs
For the 100M, 200M, 1B datasets we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH) meaning that larger numbers are better.
- Virtuoso6
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 37678.319 | 64885.747 | 112388.811 | 20647.413 |
200M | 32969.006 | 31387.107 | 77224.941 | 14480.812 |
1B | 8984.789 | 15637.439 | 14343.728 | 2800.053 |
- Virtuoso7
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 47178.820 | 91505.200 | 188632.144 | 216118.852 |
1B | 27933.682 | 56714.875 | 79261.626 | 132685.957 |
4.4.5 Result Summaries
5. Benchmark Results for the BI Use Case
This section reports the results of running the BI use case of the BSBM benchmark against:
- BigData (rev. 6528)
- BigOwlim (version 5.2.5524)
- TDB (version 0.9.4)
- Virtuoso6 (06.04.3132-pthreads for Linux as of May 14 2012)
- Virtuoso7 (07.00.3202-pthreads for Linux as of Jan 1 2013)
The load process is the same as for the Explore use case. (See section 4)
The test procedure is similar to that for the Explore use case; however, for the single-client run we only use 25 warm-up runs. Since running a BI query mix touches most of the data, a few warm-up runs are enough to warm up the SUTs so that they reach sustainable performance.
We applied the following test procedure to each store:
- Load data into the store.
- Shut down the store (optionally clearing OS caches and swap), then restart the store.
- Execute a single-client test run (10 query mixes for the performance measurement, randomizer seed: 9834533) with 25 warm-up runs.
- Execute multi-client runs (4, 8 and 64 clients; randomizer seeds: 8188326, 9175932 and 4187411). For each run, the number of warm-up query mixes is two times the number of clients.
./testdriver -seed 9834533 -uc bsbm/bi -w 25 -runs 10 -drill -o result_single.xml http://sparql-endpoint
For example, for a run with 4 clients, execute:
./testdriver -seed 8188326 -uc bsbm/bi -w 8 -mt 4 -drill -o results_4clients.xml http://sparql-endpoint
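Analogously, sketches of the 8-client and 64-client BI invocations (derived from the remaining seeds and the two-warm-up-mixes-per-client rule; not commands quoted from the report) would be:
./testdriver -seed 9175932 -uc bsbm/bi -w 16 -mt 8 -drill -o results_8clients.xml http://sparql-endpoint
./testdriver -seed 4187411 -uc bsbm/bi -w 128 -mt 64 -drill -o results_64clients.xml http://sparql-endpoint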
The different runs use distinct randomizer seeds for choosing query parameters. This ensures that the test driver produces distinctly parameterized queries over all runs and makes it harder for the stores to apply query caching.
5.1 BigData
5.1.1 Configuration
The following changes were made to the default configuration
of the software:
- BigData: Version rev. 6528
The bibm3 directory was copied into the bigdata-perf directory.
For loading and starting the server the ANT script in the directory "bigdata-perf/bibm3" was used.
5.1.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
10M |
---|
00:02:39 |
5.1.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 10 runs (in QpS):
Query | 10M |
---|---|
Query 1 | 0.453 |
Query 2 | 0.445 |
Query 3 | 0.300 |
Query 4 | 0.167 |
Query 5 | 1.992 |
Query 6 | 9.917 |
Query 7 | 0.006 |
Query 8 | 0.568 |
5.1.4 Benchmark Overall results: QMpH for the 10M dataset for all runs
For the 10M dataset we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH) meaning that larger numbers are better.
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
10M | 7.290 | 16.222 | 16.439 | 18.812 |
5.1.5 Result Summaries
5.2 BigOwlim
5.2.1 Configuration
The following changes were made to the default configuration
of the software:
- BigOwlim: Version 5.2.5524
- Tomcat: Version 7.0.30
Modified heap size:
JAVA_OPTS="-Dinfo.aduna.platform.appdata.basedir=`pwd`/data -Xmx200G "
- Sesame: Version 2.6.8
- Config files:
5.2.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
10M | 100M | 1B |
---|---|---|
00:02:31 | 00:22:47 | 04:09:39 |
5.2.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 10 runs (in QpS):
Query | 10M | 100M | 1B |
---|---|---|---|
Query 1 | 1.426 | 0.176 | 0.016 |
Query 2 | 0.069 | 0.009 | 0.001 |
Query 3 | 2.540 | 0.105 | 0.002 |
Query 4 | 0.150 | 0.027 | 0.003 |
Query 5 | 1.923 | 0.240 | 0.020 |
Query 6 | 23.923 | 15.538 | 13.951 |
Query 7 | 2.232 | 0.369 | 0.040 |
Query 8 | 1.395 | 0.191 | 0.016 |
5.2.4 Benchmark Overall results: QMpH for the 10M, 100M, 1B datasets for all runs
For the 10M, 100M and 1B datasets we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH), meaning that larger numbers are better.
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
10M | 121.841 | 265.294 | 177.338 | 218.678 |
100M | 15.512 | 33.986 | 20.263 | 15.076 |
1B | 1.400 | 3.465 | 2.323 | * |
(*: No error was found, but this 64-client run was stopped after it had been running for more than 2 days.)
5.2.5 Result Summaries
5.3 TDB
5.3.1 Configuration
The following changes were made to the default configuration
of the software:
- TDB: Version 0.9.4
Loading was done with tdbloader2.
Statistics for the BGP optimizer were generated with the "tdbconfig stats" command and copied into the database directory.
- Fuseki: Version 0.2.5
Started server with: ./fuseki-server --loc /database/tdb /bsbm
5.3.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
10M |
---|
00:09:41 |
5.3.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 10 runs (in QpS):
Query | 10M |
---|---|
Query 1 | 0.488 |
Query 2 | 0.023 |
Query 3 | 0.018 |
Query 4 | 0.140 |
Query 5 | 0.008 |
Query 6 | 16.202 |
Query 7 | 0.849 |
Query 8 | 0.018 |
5.3.4 Benchmark Overall results: QMpH for the 10M dataset for all runs
For the 10M dataset we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH) meaning that larger numbers are better.
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
10M | 7.468 | 17.698 | 9.503 | 8.414 |
5.3.5 Result Summaries
5.4 Virtuoso6 & Virtuoso7
5.4.1 Configuration
The following changes were made to the default configuration
of the software:
- Virtuoso6: Version 06.04.3132-pthreads for Linux as of May 14 2012
- Virtuoso7: Version 07.00.3202-pthreads for Linux as of Jan 1 2013
Loading of datasets:
The loading was done by running multiple loading processes (calling the rdf_loader_run() function).
For the 100M, 200M and 1B datasets, 10, 20 and 100 files were generated, respectively.
For the configuration see the "virtuoso.ini" file.
- Config files:
5.4.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
- Virtuoso6
10M | 100M | 1B |
---|---|---|
00:07:06 | 00:19:26 | 01:10:30 |
- Virtuoso7
100M | 1B |
---|---|
00:03:09 | 00:27:11 |
5.4.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query over all 10 runs (in QpS):
- Virtuoso6
Query | 10M | 100M | 1B |
---|---|---|---|
Query 1 | 1.469 | 0.118 | 0.009 |
Query 2 | 37.707 | 7.931 | 0.635 |
Query 3 | 0.768 | 0.090 | 0.007 |
Query 4 | 1.183 | 0.216 | 0.020 |
Query 5 | 1.920 | 0.240 | 0.009 |
Query 6 | 14.988 | 10.767 | 3.726 |
Query 7 | 9.849 | 1.466 | 0.122 |
Query 8 | 0.592 | 0.048 | 0.003 |
- Virtuoso7
Query | 100M | 1B |
---|---|---|
Query 1 | 11.558 | 0.462 |
Query 2 | 28.969 | 2.409 |
Query 3 | 0.886 | 0.035 |
Query 4 | 3.773 | 0.644 |
Query 5 | 5.496 | 0.468 |
Query 6 | 18.997 | 10.517 |
Query 7 | 14.816 | 1.912 |
Query 8 | 2.512 | 0.215 |
5.4.4 Benchmark Overall results: QMpH for the 10M, 100M, 1B datasets for all runs
For the 10M, 100M and 1B datasets we ran a test with 1, 4, 8 and 64 clients. The results are in Query Mixes per Hour (QMpH), meaning that larger numbers are better.
- Virtuoso6
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
10M | 431.465 | 2667.657 | 3915.854 | 1401.186 |
100M | 35.342 | 191.431 | 268.428 | 99.321 |
1B | 2.383 | 17.777 | 21.457 | 8.355 |
- Virtuoso7
Dataset | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
100M | 996.795 | 5644.323 | 6402.190 | 7132.212 |
1B | 75.236 | 348.666 | 361.205 | 134.459 |
5.4.5 Result Summaries
6. Benchmark Results for the Cluster Edition
This section reports the results of running the Explore and BI use cases of the BSBM benchmark with the cluster editions of:
- BigOwlim (version 5.2.5524)
- Virtuoso7 (07.00.3202-pthreads for Linux as of Jan 1 2013)
For the 10B triples dataset, we applied the following test procedure to each store:
- Load data into the store.
- Shut down the store (optionally clearing OS caches and swap), then restart the store.
- Execute a single-client test run of the Explore use case (100 query mixes for the performance measurement, randomizer seed: 9834533) with 100 warm-up runs.
- Execute a multi-client run of the Explore use case with 8 clients (randomizer seed: 9175932) and 16 warm-up query mixes.
./testdriver -seed 9834533 -uc bsbm/explore -w 100 -runs 100 -drill -o result_single.xml http://sparql-endpoint
./testdriver -seed 8188326 -uc bsbm/explore -w 16 -mt 8 -drill -o results_8clients.xml http://sparql-endpoint
- Execute a single-client test run of the BI use case (1 query mix for the performance measurement, randomizer seed: 9834533) with no warm-up runs.
- Execute a multi-client run of the BI use case with 8 clients (randomizer seed: 9175932), with no warm-up runs.
./testdriver -seed 9834533 -uc bsbm/bi -runs 1 -drill -o result_single_bi.xml http://sparql-endpoint
./testdriver -seed 8188326 -uc bsbm/bi -mt 8 -drill -o results_8clients_bi.xml http://sparql-endpoint
6.1 BigOwlim
6.1.1 Configuration
The following changes were made to the default configuration
of the software:
- BigOwlim: Version 5.3.5777
Modified heap size and cache-memory in example.sh
-Xmx200G -Xms160G -Dcache-memory=100G
- Tomcat: Version 7.0.30
- Sesame: Version 2.6.10
- Config files:
6.1.2 Load Time
We used the application in the getting-started directory for loading the data. The dataset was first generated into 100 .nt files (~100 million triples per file), and then copied to getting-started/preload for loading. For BigOwlim, the data generator was also modified so that it writes the first 100 million triples to the first file, the next 100 million triples to the second file, and so on. (Note: the original data generator writes triples to the 100 files in round-robin fashion, i.e., the first triple goes to the first file, the second triple to the second file, ..., the 100th triple to the 100th file, the 101st triple to the first file, and so on.)
However, since we had to stop and resume the loading process many times to tune parameters and to solve problems that occurred during loading, it is hard to calculate the loading time.
After the getting-started application had finished the loading process, the built database was manually copied to each worker node. With 8 machines in the cluster, we thus have 8 replicas.
6.1.3 Benchmark Query results: QpS (Queries per Second)
The table below summarizes the query throughput for each type of query in single-client runs (in QpS):
- Explore use case
Query | 10B |
---|---|
Query 1 | 1.262 |
Query 2 | 8.771 |
Query 3 | 0.230 |
Query 4 | 0.685 |
Query 5 | 0.010 |
Query 7 | 0.864 |
Query 8 | 2.811 |
Query 9 | 20.222 |
Query 10 | 1.763 |
Query 11 | 20.325 |
Query 12 | 35.461 |
- BI use case
Query | 10B |
---|---|
Query 1 | 0.000076 |
Query 2 | 0.00005 |
Query 3 | 0.002 |
Query 4 | 0.0003 |
Query 5 | 0.0003 |
Query 6 | 0.036 |
Query 7 | 0.001 |
Query 8 | 0.001 |
6.1.4 Benchmark Overall results: QMpH for the 10B dataset for all runs
For the 10B dataset we ran tests with 1 and 8 clients. The results are in Query Mixes per Hour (QMpH), meaning that larger numbers are better.
- Explore use case
Dataset | 1 client | 8 clients |
---|---|---|
10B | 16.506 | 257.399 |
- BI use case
Dataset | 1 client | 8 clients |
---|---|---|
10B | 0.044 | 0.120 |
6.1.5 Result Summaries
- BigOwlim 10B Explore use case: result XML files for the single-client and 8-client runs.
6.2 Virtuoso7 cluster
6.2.1 Configuration
The following changes were made to the default configuration
of the software:
- Virtuoso7: Version 07.00.3202-pthreads for Linux as of Jan 1 2013
- Loading of datasets:
The loading was done by executing multiple loading processes on all cluster nodes (calling cl_exec('rdf_ld_srv()')); a sketch is given after this list.
For all datasets, 1000 files were generated (125 files on each node).
This means that multiple files are read at the same time by the multiple cores of each CPU.
- The best performance was obtained with 7 loading threads per server process.
Hence, with two server processes per machine and 8 machines, 112 files were being read at the same time.
- Config files:
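As a rough sketch (the report only names the call itself; the port and credentials below are placeholders, and the exact invocation is an assumption), starting the per-node loader procedure on all cluster nodes and persisting the result could look like:
# run the server-side loader procedure on every node of the cluster
isql 1111 dba dba exec="cl_exec ('rdf_ld_srv ()');"
# make the loaded data durable
isql 1111 dba dba exec="checkpoint;"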
6.2.2 Load Time
The table below summarizes the load times for the Turtle files (in hh:mm:ss):
10B | 50B | 150B |
---|---|---|
01:05:00 | 06:28:00 | * |
*: The largest load (150B) was slowed down by one machine showing markedly lower disk write throughput than the others. On the slowest machine, iostat showed continuous disk activity of about 700 device transactions per second, writing anything from 1 to 3 MB of data per second. On the other machines, disks were mostly idle, with occasional flushing of database buffers to disk producing up to 2000 device transactions per second and 100 MB/s write throughput. Since the data is evenly divided and 2 of the 16 processes were not runnable because the OS had accumulated too many buffered disk writes, this could stall the whole cluster for up to several minutes at a stretch. Our theory is that these problems were caused by a hardware malfunction. To complete the 150B load, we interrupted the stalling server processes, moved their data directories to different drives, and resumed the loading. The need for manual intervention, and the prior period of very slow progress, make it hard to calculate the total time the 150B load took.
6.2.3 Benchmark Query results: QpS (Queries per Second)
We configured the BSBM driver to use 4 SPARQL endpoints for these query tests, so that not all clients connect through the same machine.
The table below summarizes the query throughput for each type of query in single-client runs (in QpS):
- Explore use case
Query | 10B | 50B | 150B |
---|---|---|---|
Query 1 | 15.530 | 15.049 | 8.843 |
Query 2 | 23.725 | 22.005 | 15.107 |
Query 3 | 15.349 | 8.921 | 9.016 |
Query 4 | 10.335 | 6.416 | 3.244 |
Query 5 | 0.959 | 0.267 | 0.124 |
Query 7 | 16.496 | 6.451 | 3.873 |
Query 8 | 22.121 | 10.035 | 5.314 |
Query 9 | 86.843 | 92.400 | 87.527 |
Query 10 | 35.663 | 6.823 | 4.987 |
Query 11 | 204.918 | 198.413 | 177.936 |
Query 12 | 65.402 | 73.046 | 75.075 |
- BI use case
Query | 10B | 50B | 150B |
---|---|---|---|
Query 1 | 0.111 | 0.002 | 0.001 |
Query 2 | 1.511 | 0.005 | 0.005 |
Query 3 | 0.385 | 0.003 | 0.001 |
Query 4 | 0.088 | 0.055 | 0.005 |
Query 5 | 0.061 | 0.005 | 0.001 |
Query 6 | 3.802 | 0.021 | 0.041 |
Query 7 | 0.514 | 0.027 | 0.017 |
Query 8 | 0.045 | 0.004 | 0.001 |
6.2.4 Benchmark Overall results: QMpH for the 10B, 50B, 150B datasets for all runs
For the 10B dataset we ran tests with 1 and 8 clients. For the 50B and 150B datasets, we ran tests with 1 and 4 clients. The results are in Query Mixes per Hour (QMpH), meaning that larger numbers are better.
- Explore use case
Dataset | 1 client | 4 clients | 8 clients |
---|---|---|---|
10B | 2360.210 | - | 4978.511 |
50B | 4253.157 | 2837.285 | - |
150B | 2090.574 | 1471.032 | - |
- BI use case
Dataset | 1 client | 4 clients | 8 clients |
---|---|---|---|
10B | 13.078 | - | 20.554 |
50B | 0.964 | 1.588 | - |
150B | 0.285 | 0.480 | - |
6.2.5 Result Summaries
- Virtuoso7 10B Explore use case: result XML files for the single-client and 8-client runs.
- Virtuoso7 10B BI use case: result XML files for the single-client and 8-client runs.
- Virtuoso7 50B Explore use case: result XML files for the single-client and 4-client runs.
- Virtuoso7 50B BI use case: result XML files for the single-client and 4-client runs.
- Virtuoso7 150B Explore use case: result XML files for the single-client and 4-client runs.
- Virtuoso7 150B BI use case: result XML files for the single-client and 4-client runs.
7. Store Comparison
This section compares the SPARQL query performance of the different stores.
7.1 Query Mixes per Hour for Single Clients
Running 500 query mixes against the different stores resulted in the following performance numbers (in QMpH). The best performance figure for each dataset size is set bold in the tables.
7.1.1 QMpH: Explore use case
The complete query mix is given here.
Store | 100M | 200M | 1B |
---|---|---|---|
BigData | 12512.278 | 10059.940 | - |
BigOwlim | 14029.453 | 9170.083 | 1669.899 |
TDB | 15381.857 | 10573.858 | - |
Virtuoso6 | 37678.319 | 32969.006 | 8984.789 |
Virtuoso7 | 47178.820 | - | 27933.682 |
A much more detailed view of the results for the Explore use case is given under Detailed Results For The Explore-Query-Mix Benchmark Run.
7.1.2 QMpH: BI use case
Store | 10M | 100M | 1B |
---|---|---|---|
BigData | 7.290 | - | - |
BigOwlim | 121.841 | 15.512 | 1.400 |
TDB | 7.468 | - | - |
Virtuoso6 | 431.465 | 35.342 | 2.383 |
Virtuoso7 | - | 996.795 | 75.236 |
A much more detailed view of the results for the BI use case is given under Detailed Results For The BI-Query-Mix Benchmark Run.
7.1.3 QMpH: Cluster edition
- Explore use case
Store | 10B | 50B | 150B |
---|---|---|---|
BigOwlim | 16.506 | - | - |
Virtuoso7 | 2360.210 | 4253.157 | 2090.574 |
- BI use case
Store | 10B | 50B | 150B |
---|---|---|---|
BigOwlim | 0.044 | - | - |
Virtuoso7 | 13.078 | 0.964 | 0.285 |
7.2 Query Mixes per Hour for Multiple Clients
- Explore use case
Dataset Size 100M | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
BigData | 12512.278 | 17949.632 | 19574.007 | 20422.626 |
BigOwlim | 14029.453 | 17184.314 | 11677.860 | 8321.202 |
TDB | 15381.857 | 19036.097 | 24646.705 | 14838.483 |
Virtuoso6 | 37678.319 | 64885.747 | 112388.811 | 20647.413 |
Virtuoso7 | 47178.820 | 91505.200 | 188632.144 | 216118.852 |
Dataset Size 200M | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
BigData | 10059.940 | 9762.856 | 11572.433 | 12935.595 |
BigOwlim | 9170.083 | 8130.137 | 5614.489 | 5150.768 |
TDB | 10573.858 | 9540.452 | 18610.896 | 8265.151 |
Virtuoso6 | 32969.006 | 31387.107 | 77224.941 | 14480.812 |
Dataset Size 1B | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
BigOwlim | 1669.899 | 2246.865 | 1081.508 | 912.518 |
Virtuoso6 | 8984.789 | 15637.439 | 14343.728 | 2800.053 |
Virtuoso7 | 27933.682 | 56714.875 | 79261.626 | 132685.957 |
- BI use case
Dataset Size 10M | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
BigData | 7.290 | 16.222 | 16.439 | 18.812 |
BigOwlim | 121.841 | 265.294 | 177.338 | 218.678 |
TDB | 7.468 | 17.698 | 9.503 | 8.414 |
Virtuoso6 | 431.465 | 2667.657 | 3915.854 | 1401.186 |
Dataset Size 100M | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
BigOwlim | 15.512 | 33.986 | 20.263 | 15.076 |
Virtuoso6 | 35.342 | 191.431 | 268.428 | 99.321 |
Virtuoso7 | 996.795 | 5644.323 | 6402.190 | 7132.212 |
Dataset Size 1B | 1 client | 4 clients | 8 clients | 64 clients |
---|---|---|---|---|
BigOwlim | 1.400 | 3.465 | 2.323 | - |
Virtuoso6 | 2.383 | 17.777 | 21.457 | 8.355 |
Virtuoso7 | 75.236 | 348.666 | 361.205 | 134.459 |
- Cluster - Explore use case (10B only)
Dataset Size 10B | 1 client | 8 clients |
---|---|---|
BigOwlim | 16.506 | 257.399 |
Virtuoso7 | 2360.210 | 4978.511 |
- Cluster - BI use case (10B only)
Dataset Size 10B | 1 client | 8 clients |
---|---|---|
BigOwlim | 0.044 | 0.120 |
Virtuoso7 | 13.078 | 20.554 |
7.3 Detailed Results For The Explore-Query-Mix Benchmark Run
The details of running the Explore query mix are given here. There are two different views:
7.3.1 Queries per Second by Query and Dataset Size
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each dataset size is set bold in the tables.
Query 1
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 49.955 | 93.773 | 119.048 | 232.234 | 125.786 |
200M | 49.520 | 52.094 | 94.877 | 217.865 | - |
1B | - | 25.128 | - | 87.245 | 75.324 |
Query 2
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 42.769 | 115.960 | 158.755 | 109.445 | 68.929 |
200M | 43.713 | 65.158 | 151.883 | 110.019 | - |
1B | - | 34.181 | - | 79.791 | 68.820 |
Query 3
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 37.280 | 170.242 | 84.660 | 180.245 | 117.426 |
200M | 38.355 | 61.155 | 70.492 | 174.216 | - |
1B | - | 26.042 | - | 119.104 | 62.243 |
Query 4
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 36.846 | 140.607 | 70.912 | 116.604 | 58.514 |
200M | 36.830 | 127.747 | 52.759 | 111.732 | - |
1B | - | 12.868 | - | 42.586 | 30.473 |
Query 5
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 2.684 | 1.868 | 1.959 | 9.976 | 21.182 |
200M | 1.799 | 1.199 | 1.308 | 7.168 | - |
1B | - | 0.198 | - | 1.201 | 6.064 |
Query 6
Removed.
Query 7
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 16.172 | 75.746 | 196.754 | 30.001 | 54.484 |
200M | 16.548 | 98.357 | 184.349 | 32.918 | - |
1B | - | 32.593 | - | 31.840 | 55.356 |
Query 8
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 37.498 | 93.467 | 228.258 | 117.247 | 93.336 |
200M | 38.721 | 193.087 | 199.362 | 124.502 | - |
1B | - | 60.702 | - | 127.698 | 97.248 |
Query 9
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 59.524 | 202.041 | 355.999 | 397.456 | 173.898 |
200M | 61.476 | 105.759 | 319.489 | 363.042 | - |
1B | - | 38.391 | - | 132.459 | 176.772 |
Query 10
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 41.326 | 146.327 | 297.619 | 122.926 | 107.968 |
200M | 42.427 | 69.411 | 267.094 | 123.487 | - |
1B | - | 60.357 | - | 99.433 | 101.678 |
Query 11
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 62.375 | 368.732 | 483.092 | 539.957 | 214.133 |
200M | 63.784 | 74.074 | 450.045 | 493.583 | - |
1B | - | 65.428 | - | 500.501 | 225.124 |
Query 12
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
100M | 50.989 | 244.738 | 204.834 | 220.167 | 126.743 |
200M | 52.094 | 197.239 | 192.901 | 215.424 | - |
1B | - | 61.418 | - | 207.641 | 137.287 |
7.3.2 Queries per Second by Dataset Size and Query
Running 500 query mixes against the different stores led to the following query throughput for each type of query over all 500 runs (in Queries per Second). The best performance figure for each query is set bold in the tables.
100M
Query | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
Query 1 | 49.955 | 93.773 | 119.048 | 232.234 | 125.786 |
Query 2 | 42.769 | 115.960 | 158.755 | 109.445 | 68.929 |
Query 3 | 37.280 | 170.242 | 84.660 | 180.245 | 117.426 |
Query 4 | 36.846 | 140.607 | 70.912 | 116.604 | 58.514 |
Query 5 | 2.684 | 1.868 | 1.959 | 9.976 | 21.182 |
Query 7 | 16.172 | 75.746 | 196.754 | 30.001 | 54.484 |
Query 8 | 37.498 | 93.467 | 228.258 | 117.247 | 93.336 |
Query 9 | 59.524 | 202.041 | 355.999 | 397.456 | 173.898 |
Query 10 | 41.326 | 146.327 | 297.619 | 122.926 | 107.968 |
Query 11 | 62.375 | 368.732 | 483.092 | 539.957 | 214.133 |
Query 12 | 50.989 | 244.738 | 204.834 | 220.167 | 126.743 |
200M
Query | BigData | BigOwlim | TDB | Virtuoso6 |
---|---|---|---|---|
Query 1 | 49.520 | 52.094 | 94.877 | 217.865 |
Query 2 | 43.713 | 65.158 | 151.883 | 110.019 |
Query 3 | 38.355 | 61.155 | 70.492 | 174.216 |
Query 4 | 36.830 | 127.747 | 52.759 | 111.732 |
Query 5 | 1.799 | 1.199 | 1.308 | 7.168 |
Query 7 | 16.548 | 98.357 | 184.349 | 32.918 |
Query 8 | 38.721 | 193.087 | 199.362 | 124.502 |
Query 9 | 61.476 | 105.759 | 319.489 | 363.042 |
Query 10 | 42.427 | 69.411 | 267.094 | 123.487 |
Query 11 | 63.784 | 74.074 | 450.045 | 493.583 |
Query 12 | 52.094 | 197.239 | 192.901 | 215.424 |
1B
Query | BigOwlim | Virtuoso6 | Virtuoso7 |
---|---|---|---|
Query 1 | 25.128 | 87.245 | 75.324 |
Query 2 | 34.181 | 79.791 | 68.820 |
Query 3 | 26.042 | 119.104 | 62.243 |
Query 4 | 12.868 | 42.586 | 30.473 |
Query 5 | 0.198 | 1.201 | 6.064 |
Query 7 | 32.593 | 31.840 | 55.356 |
Query 8 | 60.702 | 127.698 | 97.248 |
Query 9 | 38.391 | 132.459 | 176.772 |
Query 10 | 60.357 | 99.433 | 101.678 |
Query 11 | 65.428 | 500.501 | 225.124 |
Query 12 | 61.418 | 207.641 | 137.287 |
7.4 Detailed Results For The BI-Query-Mix Benchmark Run
The details of running the BI query mix are given here. There are two different views:
7.4.1 Queries per Second by Query and Dataset Size
Running 10 query mixes against the different stores led to the following query throughput for each type of query over all 10 runs (in Queries per Second). The best performance figure for each dataset size is set bold in the tables.
Query 1
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 0.453 | 1.426 | 0.488 | 1.469 | - |
100M | - | 0.176 | - | 0.118 | 11.558 |
1B | - | 0.016 | - | 0.009 | 0.462 |
Query 2
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 0.445 | 0.069 | 0.023 | 37.707 | - |
100M | - | 0.009 | - | 7.931 | 28.969 |
1B | - | 0.001 | - | 0.635 | 2.409 |
Query 3
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 0.300 | 2.540 | 0.018 | 0.768 | - |
100M | - | 0.105 | - | 0.090 | 0.886 |
1B | - | 0.002 | - | 0.007 | 0.035 |
Query 4
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 0.167 | 0.150 | 0.140 | 1.183 | - |
100M | - | 0.027 | - | 0.216 | 3.773 |
1B | - | 0.003 | - | 0.020 | 0.644 |
Query 5
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 1.992 | 1.923 | 0.008 | 1.920 | - |
100M | - | 0.240 | - | 0.240 | 5.496 |
1B | - | 0.020 | - | 0.009 | 0.468 |
Query 6
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 9.917 | 23.923 | 16.202 | 14.988 | - |
100M | - | 15.538 | - | 10.767 | 18.997 |
1B | - | 13.951 | - | 3.726 | 10.517 |
Query 7
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 0.006 | 2.232 | 0.849 | 9.849 | - |
100M | - | 0.369 | - | 1.466 | 14.816 |
1B | - | 0.040 | - | 0.122 | 1.912 |
Query 8
Dataset | BigData | BigOwlim | TDB | Virtuoso6 | Virtuoso7 |
---|---|---|---|---|---|
10M | 0.568 | 1.395 | 0.018 | 0.592 | - |
100M | - | 0.191 | - | 0.048 | 2.512 |
1B | - | 0.016 | - | 0.003 | 0.215 |
7.4.2 Queries per Second by Dataset Size and Query
Running 10 query mixes against the different stores led to the following query throughput for each type of query over all 10 runs (in Queries per Second). The best performance figure for each query is set bold in the tables.
10M
Query | BigData | BigOwlim | TDB | Virtuoso6 |
---|---|---|---|---|
Query 1 | 0.453 | 1.426 | 0.488 | 1.469 |
Query 2 | 0.445 | 0.069 | 0.023 | 37.707 |
Query 3 | 0.300 | 2.540 | 0.018 | 0.768 |
Query 4 | 0.167 | 0.150 | 0.140 | 1.183 |
Query 5 | 1.992 | 1.923 | 0.008 | 1.920 |
Query 6 | 9.917 | 23.923 | 16.202 | 14.988 |
Query 7 | 0.006 | 2.232 | 0.849 | 9.849 |
Query 8 | 0.568 | 1.395 | 0.018 | 0.592 |
100M
Query | BigOwlim | Virtuoso6 | Virtuoso7 |
---|---|---|---|
Query 1 | 0.176 | 0.118 | 11.558 |
Query 2 | 0.009 | 7.931 | 28.969 |
Query 3 | 0.105 | 0.090 | 0.886 |
Query 4 | 0.027 | 0.216 | 3.773 |
Query 5 | 0.240 | 0.240 | 5.496 |
Query 6 | 15.538 | 10.767 | 18.997 |
Query 7 | 0.369 | 1.466 | 14.816 |
Query 8 | 0.191 | 0.048 | 2.512 |
1B
Query | BigOwlim | Virtuoso6 | Virtuoso7 |
---|---|---|---|
Query 1 | 0.016 | 0.009 | 0.462 |
Query 2 | 0.001 | 0.635 | 2.409 |
Query 3 | 0.002 | 0.007 | 0.035 |
Query 4 | 0.003 | 0.020 | 0.644 |
Query 5 | 0.020 | 0.009 | 0.468 |
Query 6 | 13.951 | 3.726 | 10.517 |
Query 7 | 0.040 | 0.122 | 1.912 |
Query 8 | 0.016 | 0.003 | 0.215 |
8. Thanks
Thanks a lot to BSBM authors Chris Bizer and Andreas Schultz for providing instructions and sharing the software/scripts at the very beginning of our benchmark experiment.
We want to thank the store vendors and implementors for helping us to set up and configure their stores for the experiment. Many thanks to Orri Erling, Ivan Mikhailov, Mitko Iliev, Hugh Williams, Alexei Kaigorodov, Zdravko Tashev, Barry Bishop, Bryan Thompson, and Mike Personick.
The work on the BSBM Benchmark Version 3 is funded through the LOD2 - Creating Knowledge out of Linked Data project.
Please send comments and feedback about the benchmark to Peter Boncz and Minh-Duc Pham.