This document defines the Explore use case of the Berlin SPARQL Benchmark (BSBM) for measuring the performance of storage systems that expose SPARQL endpoints. The benchmark is built around an e-commerce use case, where a set of products is offered by different vendors and different consumers have posted reviews about products. The query mix of the Explore use case illustrates the search and navigation pattern of a consumer looking for a product.
The SPARQL Query Language for RDF and the SPARQL Protocol for RDF are implemented by a growing number of storage systems and are used within enterprise and open web settings. As SPARQL is taken up by the community, there is a growing need for benchmarks to compare the performance of storage systems that expose SPARQL endpoints via the SPARQL protocol. Such systems include native RDF stores, Named Graph stores, systems that map relational databases into RDF, and SPARQL wrappers around other kinds of data sources.
The Berlin SPARQL Benchmark (BSBM) defines a suite of
benchmarks for comparing the performance of these systems across
architectures. The benchmark is built around an e-commerce use case in
which a set of products is offered by different vendors and consumers
have posted reviews about products. The benchmark query mix of the Explore use case illustrates
the search and navigation pattern of a consumer looking for a product.
All queries conform to the SPARQL 1.0 standard.
The rest of this document is structured as follows: Section 2 defines the schema of the benchmark dataset and describes the rules that the data generator uses to populate the dataset according to the chosen scale factor. Section 3 defines the benchmark queries. Section 4 defines how a system under test is verified against the qualification dataset.
The benchmark dataset is described in the BSBM dataset document.
This section defines a suite of benchmark queries and a query mix.
The benchmark queries are designed to emulate the search and navigation pattern of a consumer looking for a product. A product search includes the following steps:
There are three representations of the benchmark query set: one for the Triple data model, one for the Named Graphs data model, and a pure SQL version for the relational representation given in section 2.2.4. All query sets have the same semantics.
The complete query mix consists of 25 queries that simulate a product search by a single consumer. The query sequence is given below:
Each query is defined by the following components:
Use Case Motivation: A consumer is looking for a product and has a general idea about what he wants.
SPARQL Query:
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?product ?label
WHERE {
?product rdfs:label ?label .
?product a %ProductType% .
?product bsbm:productFeature %ProductFeature1% .
?product bsbm:productFeature %ProductFeature2% .
?product bsbm:productPropertyNumeric1 ?value1 .
FILTER (?value1 > %x%)
}
ORDER BY ?label
LIMIT 10
Parameters:
Parameter | Description |
---|---|
%ProductType% | A randomly selected Class URI from the class hierarchy (one level above leaf level). |
%ProductFeature1% %ProductFeature2% | Two different, randomly selected feature URIs that correspond to the chosen product type. |
%x% | A number between 1 and 500 |
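The %...% placeholders are replaced with concrete values before a query is sent to the endpoint. The following Python sketch shows one way such a substitution step could look. The function name `instantiate_query` and the example URI are hypothetical and are not taken from the BSBM test driver.

```python
import re

def instantiate_query(template: str, params: dict) -> str:
    """Replace every %Name% placeholder with the value from params.

    Raises KeyError if the template uses a placeholder that params lacks.
    """
    return re.sub(r"%(\w+)%", lambda m: str(params[m.group(1)]), template)

# Hypothetical values; a real run would draw them from the generated dataset.
template = "?product a %ProductType% . FILTER (?value1 > %x%)"
query = instantiate_query(template, {
    "ProductType": "<http://example.org/ProductType7>",
    "x": 120,
})
print(query)
```

Failing on a missing parameter, rather than leaving the placeholder in place, avoids sending a syntactically invalid query to the system under test.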
Query Properties:
Use Case Motivation: The consumer wants to view basic information about products found by query 1.
SPARQL Query:
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?label ?comment ?producer ?productFeature ?propertyTextual1 ?propertyTextual2 ?propertyTextual3
?propertyNumeric1 ?propertyNumeric2 ?propertyTextual4 ?propertyTextual5 ?propertyNumeric4
WHERE {
%ProductXYZ% rdfs:label ?label .
%ProductXYZ% rdfs:comment ?comment .
%ProductXYZ% bsbm:producer ?p .
?p rdfs:label ?producer .
%ProductXYZ% dc:publisher ?p .
%ProductXYZ% bsbm:productFeature ?f .
?f rdfs:label ?productFeature .
%ProductXYZ% bsbm:productPropertyTextual1 ?propertyTextual1 .
%ProductXYZ% bsbm:productPropertyTextual2 ?propertyTextual2 .
%ProductXYZ% bsbm:productPropertyTextual3 ?propertyTextual3 .
%ProductXYZ% bsbm:productPropertyNumeric1 ?propertyNumeric1 .
%ProductXYZ% bsbm:productPropertyNumeric2 ?propertyNumeric2 .
OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual4 ?propertyTextual4 }
OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual5 ?propertyTextual5 }
OPTIONAL { %ProductXYZ% bsbm:productPropertyNumeric4 ?propertyNumeric4 }
}
Parameters:
Parameter | Description |
---|---|
%ProductXYZ% | A product URI (randomly selected) |
Query Properties:
Use Case Motivation: After looking at information about some products, the consumer has a more specific idea of what he wants. Therefore, he asks for products having several features but not having a specific other feature.
SPARQL Query:
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?product ?label
WHERE {
?product rdfs:label ?label .
?product a %ProductType% .
?product bsbm:productFeature %ProductFeature1% .
?product bsbm:productPropertyNumeric1 ?p1 .
FILTER ( ?p1 > %x% )
?product bsbm:productPropertyNumeric3 ?p3 .
FILTER (?p3 < %y% )
OPTIONAL {
?product bsbm:productFeature %ProductFeature2% .
?product rdfs:label ?testVar }
FILTER (!bound(?testVar))
}
ORDER BY ?label
LIMIT 10
Parameters:
Parameter | Description |
---|---|
%ProductType% | A randomly selected Class URI from the class hierarchy (leaf level). |
%ProductFeature1% %ProductFeature2% | Two different, randomly selected product feature URIs that correspond to the chosen product type. |
%x% %y% | Two random numbers between 1 and 500 |
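SPARQL 1.0 has no dedicated negation operator, so Query 3 expresses "does not have feature 2" with an OPTIONAL block followed by FILTER (!bound(?testVar)). The Python sketch below mirrors that logic over a few hypothetical in-memory product records. The record layout and the function name `query3` are assumptions made for illustration only.

```python
def query3(products, product_type, feature1, feature2, x, y):
    """Return up to 10 (id, label) pairs, ordered by label, for products that
    have feature1, lack feature2, and satisfy numeric1 > x and numeric3 < y."""
    hits = []
    for p in products:
        if product_type not in p["types"]:
            continue
        if feature1 not in p["features"]:
            continue
        if not (p["numeric1"] > x and p["numeric3"] < y):
            continue
        # The OPTIONAL block binds ?testVar only when feature2 is present;
        # FILTER (!bound(?testVar)) keeps the solutions where it is absent.
        test_var_bound = feature2 in p["features"]
        if not test_var_bound:
            hits.append((p["id"], p["label"]))
    return sorted(hits, key=lambda h: h[1])[:10]

# Hypothetical sample data.
products = [
    {"id": "p1", "label": "alpha", "types": {"T"}, "features": {"f1"},
     "numeric1": 300, "numeric3": 50},
    {"id": "p2", "label": "beta", "types": {"T"}, "features": {"f1", "f2"},
     "numeric1": 300, "numeric3": 50},
    {"id": "p3", "label": "gamma", "types": {"T"}, "features": {"f1"},
     "numeric1": 10, "numeric3": 50},
]
result = query3(products, "T", "f1", "f2", x=100, y=200)
print(result)
```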
Query Properties:
Use Case Motivation: After looking at information about some products, the consumer has a more specific idea of what he wants. Therefore, he asks for products matching either one set of features or another set.
SPARQL Query:
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?product ?label ?propertyTextual
WHERE {
{
?product rdfs:label ?label .
?product rdf:type %ProductType% .
?product bsbm:productFeature %ProductFeature1% .
?product bsbm:productFeature %ProductFeature2% .
?product bsbm:productPropertyTextual1 ?propertyTextual .
?product bsbm:productPropertyNumeric1 ?p1 .
FILTER ( ?p1 > %x% )
} UNION {
?product rdfs:label ?label .
?product rdf:type %ProductType% .
?product bsbm:productFeature %ProductFeature1% .
?product bsbm:productFeature %ProductFeature3% .
?product bsbm:productPropertyTextual1 ?propertyTextual .
?product bsbm:productPropertyNumeric2 ?p2 .
FILTER ( ?p2 > %y% )
}
}
ORDER BY ?label
OFFSET 5
LIMIT 10
Parameters:
Parameter | Description |
---|---|
%ProductType% | A randomly selected Class URI from the class hierarchy (leaf level). |
%ProductFeature1% %ProductFeature2% %ProductFeature3% | Three different, randomly selected product feature URIs that correspond to the chosen product type. |
%x% %y% | Two random numbers between 1 and 500 |
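Query 4 combines two alternative feature sets with UNION, removes duplicates, orders by label, and then returns the second "page" of results through OFFSET 5 and LIMIT 10. A rough Python equivalent over hypothetical in-memory records is sketched below. The record layout and the name `query4` are assumptions.

```python
def query4(products, product_type, f1, f2, f3, x, y):
    """Return rows 6 to 15 (OFFSET 5, LIMIT 10) of the label-ordered union."""
    def branch_a(p):  # first UNION branch: f1, f2 and numeric1 > x
        return {f1, f2} <= p["features"] and p["numeric1"] > x

    def branch_b(p):  # second UNION branch: f1, f3 and numeric2 > y
        return {f1, f3} <= p["features"] and p["numeric2"] > y

    # A set of tuples plays the role of SELECT DISTINCT.
    rows = {
        (p["id"], p["label"], p["textual1"])
        for p in products
        if product_type in p["types"] and (branch_a(p) or branch_b(p))
    }
    ordered = sorted(rows, key=lambda r: r[1])
    return ordered[5:15]

# Hypothetical sample data: 20 products that all match one of the branches.
products = [
    {"id": f"p{i}", "label": f"label{i:02d}", "types": {"T"},
     "features": {"f1", "f2"} if i % 2 == 0 else {"f1", "f3"},
     "textual1": "text", "numeric1": 400, "numeric2": 400}
    for i in range(20)
]
page = query4(products, "T", "f1", "f2", "f3", x=100, y=100)
print([row[1] for row in page])
```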
Query Properties:
Use Case Motivation: The consumer has found a product that fulfills his requirements. He now wants to find products with similar features.
SPARQL Query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
SELECT DISTINCT ?product ?productLabel
WHERE {
?product rdfs:label ?productLabel .
FILTER (%ProductXYZ% != ?product)
%ProductXYZ% bsbm:productFeature ?prodFeature .
?product bsbm:productFeature ?prodFeature .
%ProductXYZ% bsbm:productPropertyNumeric1 ?origProperty1 .
?product bsbm:productPropertyNumeric1 ?simProperty1 .
FILTER (?simProperty1 < (?origProperty1 + 120) && ?simProperty1 > (?origProperty1 - 120))
%ProductXYZ% bsbm:productPropertyNumeric2 ?origProperty2 .
?product bsbm:productPropertyNumeric2 ?simProperty2 .
FILTER (?simProperty2 < (?origProperty2 + 170) && ?simProperty2 > (?origProperty2 - 170))
}
ORDER BY ?productLabel
LIMIT 5
Parameters:
Parameter | Description |
---|---|
%ProductXYZ% | A product URI (randomly selected) |
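Query 5 treats a product as similar when it shares at least one feature with the original product and both numeric properties fall strictly inside a window of 120 and 170 around the original values. The predicate below is a minimal Python sketch of that test. The record layout and the name `is_similar` are hypothetical.

```python
def is_similar(orig, cand):
    """True if cand shares a feature with orig and both numeric properties lie
    strictly inside the windows used by Query 5 (120 and 170 either side)."""
    if cand["id"] == orig["id"]:
        return False  # FILTER (%ProductXYZ% != ?product)
    if not (orig["features"] & cand["features"]):
        return False  # no shared ?prodFeature
    near1 = orig["numeric1"] - 120 < cand["numeric1"] < orig["numeric1"] + 120
    near2 = orig["numeric2"] - 170 < cand["numeric2"] < orig["numeric2"] + 170
    return near1 and near2

# Hypothetical sample data.
orig = {"id": "p0", "features": {"f1", "f2"}, "numeric1": 500, "numeric2": 500}
close = {"id": "p1", "features": {"f2"}, "numeric1": 619, "numeric2": 400}
edge = {"id": "p2", "features": {"f2"}, "numeric1": 620, "numeric2": 400}
print(is_similar(orig, close), is_similar(orig, edge))
```

Because the comparisons are strict, a candidate whose value differs by exactly 120 (or 170) does not qualify.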
Query Properties:
Use Case Motivation: The consumer remembers parts of a product name from former searches. He wants to find the product again by searching for the parts of the name that he remembers.
SPARQL Query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
SELECT ?product ?label
WHERE {
?product rdfs:label ?label .
?product rdf:type bsbm:Product .
FILTER regex(?label, "%word1%")
}
Parameters:
Parameter | Description |
---|---|
%word1% | A word from the list of words that were used in the dataset generation. |
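The regex() filter in Query 6 is used without flags, so it matches the word anywhere in the label and is case-sensitive. The Python sketch below reproduces that behaviour on hypothetical records. It assumes the word is a plain word without regular-expression metacharacters, and the name `query6` is illustrative.

```python
import re

def query6(products, word):
    """Return (id, label) for products whose label contains word.

    Like SPARQL regex() without flags, pattern.search is case-sensitive
    and matches anywhere in the string.
    """
    pattern = re.compile(word)
    return [(p["id"], p["label"]) for p in products if pattern.search(p["label"])]

# Hypothetical sample data.
products = [
    {"id": "p1", "label": "waterproof hiking boot"},
    {"id": "p2", "label": "Waterproof jacket"},
]
matches = query6(products, "water")
print(matches)
```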
Query Properties:
Use Case Motivation: The consumer has found a product that fulfills his requirements. Now he wants in-depth information about this product, including offers from German vendors and product reviews, if any exist.
SPARQL Query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?productLabel ?offer ?price ?vendor ?vendorTitle ?review ?revTitle
?reviewer ?revName ?rating1 ?rating2
WHERE {
%ProductXYZ% rdfs:label ?productLabel .
OPTIONAL {
?offer bsbm:product %ProductXYZ% .
?offer bsbm:price ?price .
?offer bsbm:vendor ?vendor .
?vendor rdfs:label ?vendorTitle .
?vendor bsbm:country <http://downlode.org/rdf/iso-3166/countries#DE> .
?offer dc:publisher ?vendor .
?offer bsbm:validTo ?date .
FILTER (?date > %currentDate% )
}
OPTIONAL {
?review bsbm:reviewFor %ProductXYZ% .
?review rev:reviewer ?reviewer .
?reviewer foaf:name ?revName .
?review dc:title ?revTitle .
OPTIONAL { ?review bsbm:rating1 ?rating1 . }
OPTIONAL { ?review bsbm:rating2 ?rating2 . }
}
}
Parameters:
Parameter | Description |
---|---|
%ProductXYZ% | A product URI (randomly selected) |
%currentDate% | A date within the validFrom-validTo range of the offers (same date for all queries within a run). |
Query Properties:
Use Case Motivation: The consumer wants to read the 20 most recent English language reviews about a specific product.
SPARQL Query:
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?title ?text ?reviewDate ?reviewer ?reviewerName ?rating1 ?rating2 ?rating3 ?rating4
WHERE {
?review bsbm:reviewFor %ProductXYZ% .
?review dc:title ?title .
?review rev:text ?text .
FILTER langMatches( lang(?text), "EN" )
?review bsbm:reviewDate ?reviewDate .
?review rev:reviewer ?reviewer .
?reviewer foaf:name ?reviewerName .
OPTIONAL { ?review bsbm:rating1 ?rating1 . }
OPTIONAL { ?review bsbm:rating2 ?rating2 . }
OPTIONAL { ?review bsbm:rating3 ?rating3 . }
OPTIONAL { ?review bsbm:rating4 ?rating4 . }
}
ORDER BY DESC(?reviewDate)
LIMIT 20
Parameters:
Parameter | Description |
---|---|
%ProductXYZ% | A product URI (randomly selected) |
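Query 8 relies on langMatches(), which compares language tags case-insensitively and lets the range "EN" also match tags with subtags such as "en-GB". The Python sketch below shows that matching rule together with the descending sort and the limit of 20. The review record layout and the function names are assumptions.

```python
from datetime import date

def lang_matches(tag: str, lang_range: str) -> bool:
    """Basic language-range matching in the style of SPARQL langMatches()."""
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")

def query8(reviews, product):
    """Return the 20 most recent English-language reviews of product."""
    english = [r for r in reviews
               if r["product"] == product and lang_matches(r["lang"], "EN")]
    english.sort(key=lambda r: r["date"], reverse=True)  # ORDER BY DESC(?reviewDate)
    return english[:20]                                  # LIMIT 20

# Hypothetical sample data.
reviews = [
    {"product": "p1", "title": "old", "lang": "en", "date": date(2008, 1, 5)},
    {"product": "p1", "title": "new", "lang": "en-GB", "date": date(2008, 6, 1)},
    {"product": "p1", "title": "deutsch", "lang": "de", "date": date(2008, 7, 1)},
    {"product": "p2", "title": "other", "lang": "en", "date": date(2008, 8, 1)},
]
titles = [r["title"] for r in query8(reviews, "p1")]
print(titles)
```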
Query Properties:
Use Case Motivation: In order to decide whether to trust a review, the consumer asks for any kind of information that is available about the reviewer.
SPARQL Query:
PREFIX rev: <http://purl.org/stuff/rev#>
DESCRIBE ?x
WHERE { %ReviewXYZ% rev:reviewer ?x }
Parameters:
Parameter | Description |
---|---|
%ReviewXYZ% | A review URI (randomly selected) |
Query Properties:
Use Case Motivation: The consumer wants to buy from a vendor in the United States that is able to deliver within 3 days and is looking for the cheapest offer that fulfills these requirements.
SPARQL Query:
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT DISTINCT ?offer ?price
WHERE {
?offer bsbm:product %ProductXYZ% .
?offer bsbm:vendor ?vendor .
?offer dc:publisher ?vendor .
?vendor bsbm:country <http://downlode.org/rdf/iso-3166/countries#US> .
?offer bsbm:deliveryDays ?deliveryDays .
FILTER (?deliveryDays <= 3)
?offer bsbm:price ?price .
?offer bsbm:validTo ?date .
FILTER (?date > %currentDate% )
}
ORDER BY xsd:double(str(?price))
LIMIT 10
Parameters:
Parameter | Description |
---|---|
%ProductXYZ% | A product URI (randomly selected) |
%currentDate% | A date within the validFrom-validTo range of the offers (same date for all queries within a run). |
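Query 10 orders by xsd:double(str(?price)) so that prices are compared as numbers rather than as strings. The Python sketch below applies the same filters to hypothetical offer records and shows why the cast matters. The record layout and the name `query10` are assumptions, and the dc:publisher check is left out for brevity.

```python
from datetime import date

def query10(offers, vendors, product, current_date):
    """Return up to 10 distinct (offer, price) pairs, cheapest first."""
    rows = {
        (o["id"], o["price"])
        for o in offers
        if o["product"] == product
        and vendors[o["vendor"]]["country"] == "US"
        and o["delivery_days"] <= 3
        and o["valid_to"] > current_date
    }
    # Numeric ordering: as strings, "100.00" would sort before "99.50".
    return sorted(rows, key=lambda r: float(r[1]))[:10]

# Hypothetical sample data.
vendors = {"v1": {"country": "US"}, "v2": {"country": "DE"}}
offers = [
    {"id": "o1", "product": "p1", "vendor": "v1", "price": "100.00",
     "delivery_days": 2, "valid_to": date(2008, 12, 31)},
    {"id": "o2", "product": "p1", "vendor": "v1", "price": "99.50",
     "delivery_days": 3, "valid_to": date(2008, 12, 31)},
    {"id": "o3", "product": "p1", "vendor": "v2", "price": "10.00",
     "delivery_days": 1, "valid_to": date(2008, 12, 31)},
    {"id": "o4", "product": "p1", "vendor": "v1", "price": "5.00",
     "delivery_days": 7, "valid_to": date(2008, 12, 31)},
]
cheapest = query10(offers, vendors, "p1", date(2008, 6, 20))
print(cheapest)
```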
Query Properties:
Use Case Motivation: After deciding on a specific offer, the consumer wants to get all information that is directly related to this offer.
SPARQL Query:
SELECT ?property ?hasValue ?isValueOf
WHERE {
{ %OfferXYZ% ?property ?hasValue }
UNION
{ ?isValueOf ?property %OfferXYZ% }
}
Parameters:
Parameter | Description |
---|---|
%OfferXYZ% | An offer URI (randomly selected) |
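Query 11 returns every triple in which the offer appears as subject or as object. Each UNION branch leaves one of the three result variables unbound. The Python sketch below illustrates this over a hypothetical list of (subject, property, object) tuples, with None standing in for an unbound variable.

```python
def query11(triples, offer):
    """Return one row per triple that has offer as its subject or object."""
    rows = []
    for s, p, o in triples:
        if s == offer:  # { %OfferXYZ% ?property ?hasValue }
            rows.append({"property": p, "hasValue": o, "isValueOf": None})
        if o == offer:  # { ?isValueOf ?property %OfferXYZ% }
            rows.append({"property": p, "hasValue": None, "isValueOf": s})
    return rows

# Hypothetical sample data.
triples = [
    ("offer1", "price", "99.50"),
    ("graph1", "contains", "offer1"),
    ("offer2", "price", "10.00"),
]
rows = query11(triples, "offer1")
print(rows)
```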
Query Properties:
Use Case Motivation: After deciding on a specific offer, the consumer wants to save information about this offer on his local machine using a different RDF schema.
SPARQL Query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX bsbm-export: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/export/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
CONSTRUCT { %OfferXYZ% bsbm-export:product ?productURI .
%OfferXYZ% bsbm-export:productlabel ?productlabel .
%OfferXYZ% bsbm-export:vendor ?vendorname .
%OfferXYZ% bsbm-export:vendorhomepage ?vendorhomepage .
%OfferXYZ% bsbm-export:offerURL ?offerURL .
%OfferXYZ% bsbm-export:price ?price .
%OfferXYZ% bsbm-export:deliveryDays ?deliveryDays .
%OfferXYZ% bsbm-export:validuntil ?validTo }
WHERE { %OfferXYZ% bsbm:product ?productURI .
?productURI rdfs:label ?productlabel .
%OfferXYZ% bsbm:vendor ?vendorURI .
?vendorURI rdfs:label ?vendorname .
?vendorURI foaf:homepage ?vendorhomepage .
%OfferXYZ% bsbm:offerWebpage ?offerURL .
%OfferXYZ% bsbm:price ?price .
%OfferXYZ% bsbm:deliveryDays ?deliveryDays .
%OfferXYZ% bsbm:validTo ?validTo }
Parameters:
Parameter | Description |
---|---|
%OfferXYZ% | An offer URI (randomly selected) |
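The CONSTRUCT template of Query 12 maps each variable bound in the WHERE clause to a property in the export vocabulary. The Python sketch below instantiates that template for a single solution. The function name `construct` and the sample bindings are hypothetical.

```python
EXPORT = "http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/export/"

# Export property name -> variable bound in the WHERE clause of Query 12.
TEMPLATE = {
    "product": "productURI",
    "productlabel": "productlabel",
    "vendor": "vendorname",
    "vendorhomepage": "vendorhomepage",
    "offerURL": "offerURL",
    "price": "price",
    "deliveryDays": "deliveryDays",
    "validuntil": "validTo",
}

def construct(offer_uri, solution):
    """Instantiate the CONSTRUCT template for one solution (a dict of bindings)."""
    return [(offer_uri, EXPORT + prop, solution[var]) for prop, var in TEMPLATE.items()]

# Hypothetical bindings for one offer.
solution = {
    "productURI": "http://example.org/Product1",
    "productlabel": "alpha",
    "vendorname": "Vendor A",
    "vendorhomepage": "http://example.org/vendorA",
    "offerURL": "http://example.org/offer1.html",
    "price": "99.50",
    "deliveryDays": 3,
    "validTo": "2008-12-31",
}
exported = construct("http://example.org/Offer1", solution)
print(len(exported))
```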
Query Properties:
The queries for the Named Graphs data model have the same semantics as the queries for the triple data model. The queries do not specify the IRIs of the named graphs in the RDF Dataset using the FROM NAMED clause, but assume that the query is executed against the complete RDF Dataset.
This is still work in progress.
Todo: Rewrite all queries for Named Graphs. Two examples are given below:
SPARQL Query:
PREFIX bsbm-inst: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/instances/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?label ?comment ?producer ?productFeature ?propertyTextual1 ?propertyTextual2
?propertyNumeric1 ?propertyNumeric2 ?propertyTextual4 ?propertyTextual5 ?propertyNumeric4
WHERE {
GRAPH ?graph {
%ProductXYZ% rdfs:label ?label .
%ProductXYZ% rdfs:comment ?comment .
%ProductXYZ% bsbm:producer ?p .
?p rdfs:label ?producer .
%ProductXYZ% bsbm:productFeature ?f .
?f rdfs:label ?productFeature .
%ProductXYZ% bsbm:productPropertyTextual1 ?propertyTextual1 .
%ProductXYZ% bsbm:productPropertyTextual2 ?propertyTextual2 .
%ProductXYZ% bsbm:productPropertyNumeric1 ?propertyNumeric1 .
%ProductXYZ% bsbm:productPropertyNumeric2 ?propertyNumeric2 .
OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual4 ?propertyTextual4 }
OPTIONAL { %ProductXYZ% bsbm:productPropertyTextual5 ?propertyTextual5 }
OPTIONAL { %ProductXYZ% bsbm:productPropertyNumeric4 ?propertyNumeric4 }
}
GRAPH localhost:provenanceData {
?graph dc:publisher ?p .
}
}
SPARQL Query:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rev: <http://purl.org/stuff/rev#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX bsbm: <http://www4.wiwiss.fu-berlin.de/bizer/bsbm/v01/vocabulary/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?productLabel ?offer ?price ?vendor ?vendorTitle ?review ?revTitle
?reviewer ?revName ?rating1 ?rating2
WHERE {
GRAPH ?producerGraph {
%ProductXYZ% rdfs:label ?productLabel .
}
OPTIONAL {
GRAPH ?vendorGraph {
?offer bsbm:product %ProductXYZ% .
?offer bsbm:price ?price .
?offer bsbm:vendor ?vendor .
?vendor rdfs:label ?vendorTitle .
?offer bsbm:validTo ?date .
FILTER (?date > %currentDate% )
}
}
OPTIONAL {
GRAPH ?ratingSiteGraph {
?review bsbm:reviewFor %ProductXYZ% .
?review rev:reviewer ?reviewer .
?reviewer foaf:name ?revName .
?review dc:title ?revTitle .
OPTIONAL { ?review bsbm:rating1 ?rating1 . }
OPTIONAL { ?review bsbm:rating2 ?rating2 . }
}
}
GRAPH localhost:provenanceData {
?vendorGraph dc:publisher ?vendor .
}
}
This section will contain a SQL representation of the benchmark queries, so that the performance of stores that expose SPARQL endpoints can be compared to the performance of classic SQL-based RDBMS. Since some SPARQL-specific query forms such as DESCRIBE have no exact counterpart in SQL, the SQL queries are not fully semantically equivalent.
Use Case Motivation: A consumer is looking for a product and has a general idea about what he wants.
SQL Query:
SELECT distinct nr, label
FROM product p, producttypeproduct ptp
WHERE p.nr = ptp.product AND ptp.productType=@ProductType@
AND propertyNum1 > @x@
AND p.nr IN (SELECT distinct product FROM productfeatureproduct WHERE productFeature=@ProductFeature1@)
AND p.nr IN (SELECT distinct product FROM productfeatureproduct WHERE productFeature=@ProductFeature2@)
ORDER BY label
LIMIT 10;
Parameters:
Parameter | Description |
---|---|
@ProductType@ | A randomly selected Class ID from the class hierarchy (one level above leaf level). |
@ProductFeature1@ @ProductFeature2@ | Two different, randomly selected feature IDs that correspond to the chosen product type. |
@x@ | A number between 1 and 500 |
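To try the SQL form of Query 1 without a full BSBM database, it can be run against a tiny in-memory SQLite database from Python. The sketch below is only an illustration: the table and column names are taken from the query text, while the column types, the sample rows, and the use of named parameters in place of the @...@ placeholders are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Minimal hypothetical schema holding only the columns that Query 1 touches.
conn.executescript("""
    CREATE TABLE product (nr INTEGER PRIMARY KEY, label TEXT, propertyNum1 INTEGER);
    CREATE TABLE producttypeproduct (product INTEGER, productType INTEGER);
    CREATE TABLE productfeatureproduct (product INTEGER, productFeature INTEGER);
    INSERT INTO product VALUES (1, 'alpha', 300), (2, 'beta', 50), (3, 'gamma', 400);
    INSERT INTO producttypeproduct VALUES (1, 7), (2, 7), (3, 7);
    INSERT INTO productfeatureproduct VALUES (1, 10), (1, 11), (2, 10), (2, 11), (3, 10);
""")

QUERY_1 = """
    SELECT DISTINCT nr, label
    FROM product p, producttypeproduct ptp
    WHERE p.nr = ptp.product AND ptp.productType = :ProductType
      AND propertyNum1 > :x
      AND p.nr IN (SELECT DISTINCT product FROM productfeatureproduct
                   WHERE productFeature = :ProductFeature1)
      AND p.nr IN (SELECT DISTINCT product FROM productfeatureproduct
                   WHERE productFeature = :ProductFeature2)
    ORDER BY label
    LIMIT 10
"""

params = {"ProductType": 7, "x": 100, "ProductFeature1": 10, "ProductFeature2": 11}
rows = conn.execute(QUERY_1, params).fetchall()
print(rows)
```

Binding the values as named parameters instead of splicing them into the SQL string keeps the query text constant across runs.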
Use Case Motivation: The consumer wants to view basic information about products found by query 1.
SQL Query:
SELECT pt.label, pt.comment, pt.producer, productFeature, propertyTex1, propertyTex2, propertyTex3,
propertyNum1, propertyNum2, propertyTex4, propertyTex5, propertyNum4
FROM product pt, producer pr, productfeatureproduct pfp
WHERE pt.nr=@ProductXYZ@ AND pt.nr=pfp.product AND pt.producer=pr.nr;
Parameters:
Parameter | Description |
---|---|
@ProductXYZ@ | A product ID (randomly selected) |
Use Case Motivation: After looking at information about some products, the consumer has a more specific idea of what he wants. Therefore, he asks for products having several features but not having a specific other feature.
SQL Query:
SELECT p.nr, p.label
FROM product p, producttypeproduct ptp
WHERE p.nr=ptp.product
AND productType=@ProductType@
AND propertyNum1>@x@
AND propertyNum3<@y@
AND @ProductFeature1@ IN (SELECT productFeature FROM productfeatureproduct WHERE product=p.nr)
AND @ProductFeature2@ NOT IN (SELECT productFeature FROM productfeatureproduct WHERE product=p.nr)
ORDER BY p.label
LIMIT 10;
Parameters:
Parameter | Description |
---|---|
@ProductType@ | A randomly selected Class ID from the class hierarchy (leaf level). |
@ProductFeature1@ @ProductFeature2@ | Two different, randomly selected product feature IDs that correspond to the chosen product type. |
@x@ @y@ | Two random numbers between 1 and 500 |
Use Case Motivation: After looking at information about some products, the consumer has a more specific idea of what he wants. Therefore, he asks for products matching either one set of features or another set.
SQL Query:
SELECT distinct p.nr, p.label, p.propertyTex1
FROM product p, producttypeproduct ptp
WHERE p.nr=ptp.product AND ptp.productType=@ProductType@
AND p.nr IN (SELECT distinct product FROM productfeatureproduct WHERE productFeature=@ProductFeature1@)
AND ((propertyNum1>@x@ AND p.nr IN (SELECT distinct product FROM productfeatureproduct WHERE productFeature=@ProductFeature2@)
) OR (propertyNum2>@y@ AND p.nr IN (SELECT distinct product FROM productfeatureproduct WHERE productFeature=@ProductFeature3@)))
ORDER BY label
LIMIT 10
OFFSET 5;
Parameters:
Parameter | Description |
---|---|
@ProductType@ | A randomly selected Class ID from the class hierarchy (leaf level). |
@ProductFeature1@ @ProductFeature2@ @ProductFeature3@ | Three different, randomly selected product feature IDs that correspond to the chosen product type. |
@x@ @y@ | Two random numbers between 1 and 500 |
Use Case Motivation: The consumer has found a product that fulfills his requirements. He now wants to find products with similar features.
SQL Query:
SELECT distinct p.nr, p.label
FROM product p, product po,
(Select distinct pfp1.product FROM productfeatureproduct pfp1, (SELECT productFeature FROM productfeatureproduct WHERE product=@ProductXYZ@) pfp2 WHERE pfp2.productFeature=pfp1.productFeature) pfp
WHERE p.nr=pfp.product AND po.nr=@ProductXYZ@ AND p.nr!=po.nr
AND p.propertyNum1<(po.propertyNum1+120) AND p.propertyNum1>(po.propertyNum1-120)
AND p.propertyNum2<(po.propertyNum2+170) AND p.propertyNum2>(po.propertyNum2-170)
ORDER BY label
LIMIT 5;
Parameters:
Parameter | Description |
---|---|
@ProductXYZ@ | A product ID (randomly selected) |
Use Case Motivation: The consumer remembers parts of a product name from former searches. He wants to find the product again by searching for the parts of the name that he remembers.
SQL Query:
SELECT nr, label
FROM product
WHERE label like "%@word1@%";
Parameters:
Parameter | Description |
---|---|
@word1@ | A word from the list of words that were used in the dataset generation. |
Use Case Motivation: The consumer has found a product that fulfills his requirements. Now he wants in-depth information about this product, including offers from German vendors and product reviews, if any exist.
SQL Query:
SELECT *
FROM (select label from product where nr=@ProductXYZ@) p left join
((select o.nr as onr, o.price, v.nr as vnr, v.label from offer o, vendor v where @ProductXYZ@=o.product AND
o.vendor=v.nr AND v.country='DE' AND o.validTo>'@currentDate@') ov right join
(select r.nr as rnr, r.title, pn.nr as pnnr, pn.name, r.rating1, r.rating2 from review r, person pn where r.product=@ProductXYZ@ AND
r.person=pn.nr) rpn on (1=1)) on (1=1);
Parameters:
Parameter | Description |
---|---|
@ProductXYZ@ | A product ID (randomly selected) |
@currentDate@ | A date within the validFrom-validTo range of the offers (same date for all queries within a run). |
Use Case Motivation: The consumer wants to read the 20 most recent English language reviews about a specific product.
SQL Query:
SELECT r.title, r.text, r.reviewDate, p.nr, p.name, r.rating1, r.rating2, r.rating3, r.rating4
FROM review r, person p
WHERE r.product=@ProductXYZ@ AND r.person=p.nr
AND r.language='en'
ORDER BY r.reviewDate desc
LIMIT 20;
Parameters:
Parameter | Description |
---|---|
@ProductXYZ@ | A product ID (randomly selected) |
Use Case Motivation: In order to decide whether to trust a review, the consumer asks for any kind of information that is available about the reviewer.
SQL Query:
SELECT p.nr, p.name, p.mbox_sha1sum, p.country, r2.nr, r2.product, r2.title
FROM review r, person p, review r2
WHERE r.nr=@ReviewXYZ@ AND r.person=p.nr AND r2.person=p.nr;
Parameters:
Parameter | Description |
---|---|
@ReviewXYZ@ | A review ID (randomly selected) |
Use Case Motivation: The consumer wants to buy from a vendor in the United States that is able to deliver within 3 days and is looking for the cheapest offer that fulfills these requirements.
SQL Query:
SELECT distinct o.nr, o.price
FROM offer o, vendor v
WHERE o.product=@ProductXYZ@
AND o.deliveryDays<=3 AND v.country='US'
AND o.validTo>'@currentDate@' AND o.vendor=v.nr
Order BY o.price
LIMIT 10;
Parameters:
Parameter | Description |
---|---|
@ProductXYZ@ | A product ID (randomly selected) |
@currentDate@ | A date within the validFrom-validTo range of the offers (same date for all queries within a run). |
Use Case Motivation: After deciding on a specific offer, the consumer wants to get all information that is directly related to this offer.
SQL Query:
Select product, producer, vendor, price, validFrom, validTo, deliveryDays, offerWebpage, publisher, publishDate
from offer
where nr=@OfferXYZ@;
Parameters:
Parameter | Description |
---|---|
@OfferXYZ@ | An offer ID (randomly selected) |
Use Case Motivation: After deciding on a specific offer, the consumer wants to save information about this offer on his local machine using a different RDF schema.
SQL Query:
Select p.nr As productNr, p.label As productlabel, v.label As vendorname, v.homepage As vendorhomepage,
o.offerWebpage As offerURL, o.price As price, o.deliveryDays As deliveryDays, o.validTo As validTo
From offer o, product p, vendor v
Where o.nr=@OfferXYZ@ AND o.product=p.nr AND o.vendor=v.nr;
Parameters:
Parameter | Description |
---|---|
@OfferXYZ@ | An offer ID (randomly selected) |
Before the performance of a system under test (SUT) is measured, it has to be verified that the SUT returns correct results for the benchmark queries.
For testing whether a SUT returns correct results, the BSBM
benchmark provides a qualification dataset and a qualification tool
which compares the query results of a SUT with the correct query
results. At the moment, the qualification tool verifies only the
results of SELECT queries. The results of DESCRIBE and CONSTRUCT
queries (queries 9 and 12) are not checked.
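The idea of comparing a SUT's SELECT results with reference results can be illustrated with a short Python sketch. This is not the logic of the BSBM qualification tool: the function name `results_match`, the representation of results as lists of row tuples, and the decision to ignore row order are all assumptions. The `count_only` switch mirrors the idea behind the -rc option listed for the qualification tool in this section.

```python
from collections import Counter

def results_match(expected, actual, count_only=False):
    """Compare two SELECT result sets given as lists of row tuples.

    With count_only=True only the number of rows is compared.
    Otherwise the rows are compared as multisets, ignoring order.
    """
    if count_only:
        return len(expected) == len(actual)
    return Counter(expected) == Counter(actual)

# Hypothetical result rows.
reference = [(1, "alpha"), (2, "beta")]
from_sut = [(2, "beta"), (1, "alpha")]
print(results_match(reference, from_sut))
```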
A BSBM qualification test is conducted in the two-step procedure
described below:
$ java -cp bin:lib/* benchmark.testdriver.TestDriver -q http://SUT/sparql
where http://SUT/sparql specifies the SPARQL endpoint.
This will create a qualification file named "run.qual" (a different file name can be specified with the "-qf" parameter), which is used in step 2. In addition, run.log (if logging is set to "ALL" in the log4j.xml file) contains all queries with their full result text, so individual queries can be examined later on.
Option | Description |
---|---|
-rc | Check only the number of results returned, not the result content. |
-ql <qualification log file name> | Specify the file name to write the qualification test results into. |
$ ./generate -fc -pc 284826 (Unix, CygWin)
or
java -cp bin;lib\* -Xmx256M benchmark.generator.Generator -fc -pc 284826 (Windows)
This will also generate the test driver data in the "td_data" directory, which needs to be in place when the test driver is run against the 100M dataset. If there is a need to also qualify against smaller datasets, please contact us and we will gladly add more options.
$ java -cp bin:lib/* benchmark.qualification.Qualification correct_100m.qual run.qual
where run.qual is the qualification file generated by the Test Driver in qualification mode.
This generates by default a log file called "qual.log" with the following content:
For more information about RDF and SPARQL Benchmarks please refer to:
The work on the BSBM Benchmark Version 3 is funded through the LOD2 project