SPARQL

In the Semantic Web, SPARQL is a query language for retrieving information from RDF graphs.

  • SPARQL = SPARQL Protocol and RDF Query Language
  • it matches graph patterns, so it is also a graph matching language
  • its triple-pattern syntax is based on Turtle, adapted for querying
  • variables are denoted by ?

Formal foundation


Versions

SPARQL 1.0

  • Basic features: graph patterns, FILTER, OPTIONAL, UNION, ORDER BY / LIMIT / OFFSET

SPARQL 1.1

  • Aggregation
  • Negation
  • Nested queries (subqueries)
  • Transitive properties (property paths)


Query Types

There are 4 types

  • SELECT query
    results are in a table format.
  • CONSTRUCT query
    results are translated into valid RDF
  • ASK query
    Just TRUE/FALSE
  • DESCRIBE query
    results are RDF graphs

All of them take a WHERE clause; only DESCRIBE may omit it


Structure of a query

Generally, each query follows this structure

  • Prefix declarations, for abbreviating URIs
  • Dataset definition, stating what RDF graph to query
    • FROM ...
  • Result clause, identifying what information to return from the query
    • SELECT ... , ASK ..., CONSTRUCT ... or DESCRIBE ...
  • Query pattern, specifying what to query for, in the underlying dataset
    • WHERE { ... }
  • Query modifiers, slicing, ordering, and otherwise rearranging query results
    • ORDER BY, etc


Formally,

  • A SPARQL query is a tuple $\langle P, G, D, S, R \rangle$:
  • $P$ stands for the prefix declarations section
  • $G$ is a graph pattern (pattern of the query)
  • $D$ is a set of RDF data ("dataset" : database)
  • $S$ is a "result transformer": Projection, Distinct, Order, Limit, Offset
  • $R$ is the type of the result: SELECT, CONSTRUCT, DESCRIBE, ASK


Select Queries

Example 1

An example with prefix

PREFIX foaf: <http://xmlns.com/foaf/0.1/> 
SELECT ?name WHERE { 
  ?x foaf:name ?name
} 

result $\to$

name
Bob
Alice


Example 2

Consider this RDF graph [1]

  • rdf-graph-ex3-sparql.png
SELECT ?player ?club
WHERE {
  ?player :position :striker .
  ?player :playsFor ?club .
  ?club :region :Barcelona 
}


the query itself is also a graph

  • rdf-graph-ex3-q.png
  • now we match this graph with the data graph
  • this query will select only Messi, because he is the only striker who plays for a club in the Barcelona region


In SQL it would be

SELECT
  A.subject, A.object
FROM
  triples AS A, triples AS B, triples AS C
WHERE
  B.predicate = 'position' AND B.object = 'striker'
  AND B.subject = A.subject AND A.predicate = 'playsFor'
  AND A.object = C.subject AND C.predicate = 'region'
  AND C.object = 'Barcelona';


Also, this query can be translated to the following Conjunctive Query

  • $\text{query}(p, c) \equiv \text{Position}(p, \text{"striker"}), \text{PlaysFor}(p, c), \text{Region}(c, \text{"Barcelona"})$


Querying for Property

Can also query for a predicate

  • e.g. what do we know about James Dean?
SELECT ?property ?value 
WHERE {
  :JamesDean ?property ?value
}

We can also use the DISTINCT keyword to remove duplicate rows

SELECT DISTINCT ?property 
WHERE {
  :JamesDean ?property ?value
}

Suppose we want to know which properties are used by instances of the class :Actor

SELECT DISTINCT ?property 
WHERE {
  ?q0 a :Actor . 
  ?q0 ?property ?object .
}


Where part

This is the part where matching happens

  • Generally, it is the same idea as in Conjunctive Queries
  • There are existential variables (those that do not appear in the head of the query)
    • they are matched with some data in the database and assigned some value
  • as we saw, the patterns form a graph, which is matched against the data graph
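
The matching described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are plain tuples, variables are strings starting with ?, and both the data and the helper names (match_triple, match_bgp) are hypothetical, not part of any SPARQL engine.

```python
# Hypothetical data; the names are illustrative and not read from the figure.
TRIPLES = {
    ("Messi", "position", "striker"),
    ("Messi", "playsFor", "FCBarcelona"),
    ("FCBarcelona", "region", "Barcelona"),
    ("Xavi", "position", "midfielder"),
    ("Xavi", "playsFor", "FCBarcelona"),
    ("Ronaldo", "position", "striker"),
    ("Ronaldo", "playsFor", "RealMadrid"),
    ("RealMadrid", "region", "Madrid"),
}


def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on failure."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable: bind it, or check it against its existing value.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None  # a constant that does not match
    return extended


def match_bgp(patterns, triples):
    """Return every binding that satisfies all triple patterns (a conjunction)."""
    solutions = [{}]  # start with one empty binding
    for pattern in patterns:
        extended_solutions = []
        for binding in solutions:
            for triple in triples:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    extended_solutions.append(extended)
        solutions = extended_solutions
    return solutions


query = [
    ("?player", "position", "striker"),
    ("?player", "playsFor", "?club"),
    ("?club", "region", "Barcelona"),
]
# Projection: keep only the variables named in the SELECT clause.
rows = [(s["?player"], s["?club"]) for s in match_bgp(query, TRIPLES)]
print(rows)
```

Real engines use indexes and join ordering instead of nested loops; the sketch only shows that variables shared between patterns act as join conditions.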


Filter: Value constraints

Boolean tests - not graph patterns

  • Logical: !, &&, ||
  • Arithmetic: +, -, *, /
  • Comparison: =, !=, >, <, ...
  • SPARQL tests: isURI, isBlank, isLiteral, bound
  • SPARQL accessors: str, lang, datatype
  • Other: sameTerm, langMatches, regex ...


Examples:

PREFIX dc: <http://purl.org/dc/elements/1.1/> 
PREFIX ns: <http://example.org/ns#> 
SELECT ?title ?price 
WHERE { 
  ?x ns:price ?price . 
  FILTER (?price < 30) . 
  ?x dc:title ?title . 
} 


SELECT ?actor 
WHERE { 
  ?actor :playedIn :Giant . 
  ?actor :diedOn ?deathdate . 
  FILTER(?deathdate > "1961-11-24"^^xsd:date)
}


We can also have several filters

SELECT ?person 
WHERE{
  ?person a :Person . 
  ?person :bornOn ?birthday . 
  FILTER(?birthday > "1960-01-01"^^xsd:date) 
  FILTER(?birthday < "1969-12-31"^^xsd:date)
}
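
A filter is a boolean test applied to each candidate solution after the graph pattern has matched. The Python sketch below mimics the two date filters; the solution rows and person names are hypothetical, and dates are written in ISO form (YYYY-MM-DD), which is the lexical form xsd:date expects.

```python
from datetime import date

# Hypothetical solutions produced by the graph pattern (?person, ?birthday).
SOLUTIONS = [
    {"person": "p1", "birthday": date.fromisoformat("1955-03-02")},
    {"person": "p2", "birthday": date.fromisoformat("1964-07-19")},
    {"person": "p3", "birthday": date.fromisoformat("1972-11-05")},
]

lower = date(1960, 1, 1)
upper = date(1969, 12, 31)

# Several FILTERs in one group behave like a conjunction of their tests.
kept = [
    s["person"]
    for s in SOLUTIONS
    if s["birthday"] > lower and s["birthday"] < upper
]
print(kept)
```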


Optional

Use OPTIONAL when a solution should not be dropped just because some data is missing

Example:

  • every title is returned; ?price is bound only if a price exists and is below 30
PREFIX dc: <http://purl.org/dc/elements/1.1/> 
PREFIX ns: <http://example.org/ns#> 
SELECT ?title ?price 
WHERE { 
  ?x dc:title ?title . 
  OPTIONAL { 
    ?x ns:price ?price . 
    FILTER (?price < 30) 
  }
}

Also, can have several optionals

PREFIX foaf: <http://xmlns.com/foaf/0.1/> 
SELECT ?name ?mbox ?hpage
WHERE { 
  ?x foaf:name ?name . 
  OPTIONAL { ?x foaf:mbox ?mbox } . 
  OPTIONAL { ?x foaf:homepage ?hpage} 
} 
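
OPTIONAL behaves like a left outer join: the mandatory part decides which rows exist, and the optional part only adds bindings where it can. A minimal Python sketch, assuming hypothetical data in which one person has a mailbox and the other does not:

```python
# Hypothetical data: alice has a mailbox, bob does not.
TRIPLES = [
    ("alice", "name", "Alice"),
    ("alice", "mbox", "mailto:alice@example.org"),
    ("bob", "name", "Bob"),
]


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


rows = []
for s, p, name in TRIPLES:
    if p != "name":
        continue  # mandatory part: ?x name ?name
    mboxes = objects(s, "mbox")  # optional part: ?x mbox ?mbox
    if mboxes:
        rows.extend((name, mbox) for mbox in mboxes)
    else:
        rows.append((name, None))  # keep the row, leave ?mbox unbound

print(rows)
```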


Union

PREFIX dc10: <http://purl.org/dc/elements/1.0/> 
PREFIX dc11: <http://purl.org/dc/elements/1.1/> 
SELECT ?x ?y 
WHERE {
  { ?book dc10:title ?x } 
  UNION
  { ?book dc11:title ?y } 
}

Can have several unions:

CONSTRUCT { ?s :hasParent ?o } 
WHERE { 
  { ?s :hasMother ?o } 
  UNION
  { ?s :hasFather ?o } 
  UNION
  { ?o :hasSon ?s } 
  UNION 
  { ?o :hasDaughter ?s }
} 


Negation (SPARQL 1.1)

Negation is achieved with FILTER NOT EXISTS (UNSAID, used in some older texts, is not a SPARQL 1.1 keyword)

  • it introduces a subgraph pattern
  • the overall graph pattern matches only if the NOT EXISTS pattern does not match

This query will return all actors who played in "Giant" and have no :diedOn record

SELECT ?actor 
WHERE {
  ?actor :playedIn :Giant . 
  FILTER NOT EXISTS { ?actor :diedOn ?deathdate . } 
} 
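
The negated pattern can be read as "keep a solution only if the inner pattern has no match for it". A small Python sketch of that reading; the data is hypothetical apart from the identifiers already used in the query, and SomeActor is an invented placeholder:

```python
# Hypothetical data; SomeActor is a placeholder, not a real record.
TRIPLES = {
    ("JamesDean", "playedIn", "Giant"),
    ("JamesDean", "diedOn", "1955-09-30"),
    ("SomeActor", "playedIn", "Giant"),
}


def has_death_record(actor):
    """True if the inner pattern (?actor diedOn ?deathdate) matches."""
    return any(s == actor and p == "diedOn" for s, p, _ in TRIPLES)


actors = sorted(
    s
    for s, p, o in TRIPLES
    if p == "playedIn" and o == "Giant" and not has_death_record(s)
)
print(actors)
```

Note that this only reports missing records: under the open-world assumption, no :diedOn triple does not prove the actor is alive.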


Transitive Queries

Suppose we want to select Joe's children

  • and then children of his children
SELECT ?member 
WHERE { ?member :hasParent :Joe } 

SELECT ?member 
WHERE {
  ?c :hasParent :Joe . 
  ?member :hasParent ?c .
} 

sparql-transitive-1.png

But what if we want to follow :hasParent as long as it's there?

  • sparql-transitive-2.png
  • use * for that (only SPARQL 1.1)
SELECT ?member 
WHERE { ?member :hasParent* :Joe .}

But in this case it will also include :Joe

  • i.e. it includes zero-length chains as well
  • to avoid it, use + instead (like in Regular Expressions)
SELECT ?member 
WHERE { ?member :hasParent+ :Joe .}
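
The difference between * and + can be sketched in Python as a reachability computation. The family data and the function name descendants are hypothetical; the loop follows :hasParent edges backwards until no new members appear.

```python
# Hypothetical family: child -> set of parents.
HAS_PARENT = {
    "Ann": {"Joe"},
    "Bill": {"Ann"},
    "Cara": {"Bill"},
    "Dan": {"Eve"},
}


def descendants(person, has_parent):
    """Everyone linked to `person` by one or more hasParent steps (the + operator)."""
    found = set()
    frontier = {person}
    while frontier:
        next_frontier = {
            child
            for child, parents in has_parent.items()
            if parents & frontier and child not in found
        }
        found |= next_frontier
        frontier = next_frontier
    return found


plus = descendants("Joe", HAS_PARENT)  # like :hasParent+
star = plus | {"Joe"}                  # like :hasParent*, adds the zero-length chain
print(sorted(plus), sorted(star))
```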


Ordering

SELECT ?title ?date 
WHERE {
  :JamesDean :playedIn ?movie. 
  ?movie rdfs:label ?title . 
  ?movie dc:date ?date . 
} ORDER BY ?date

# with limit
SELECT ?title 
WHERE {
  :JamesDean :playedIn ?m. 
  ?m rdfs:label ?title . 
  ?m dc:date ?date . 
} ORDER BY ?date 
LIMIT 1 

# descending order
SELECT ?last 
WHERE { 
  ?who :playedIn :RebelWithoutaCause .
  ?who rdfs:label ?last . 
  ?who :diedOn ?date
} ORDER BY DESC(?date) 
LIMIT 1 


Aggregating and Grouping (SPARQL 1.1)

SELECT (COUNT(?movie) AS ?howmany) 
WHERE { :JamesDean :playedIn ?movie . }

SELECT (SUM (?val) AS ?total)
WHERE {
  ?s a :Sale . 
  ?s :amount ?val 
}

SELECT ?year (SUM (?val) AS ?total)
WHERE {
  ?s a :Sale . 
  ?s :amount ?val . 
  ?s :year ?year 
} GROUP BY ?year


SELECT ?year ?company (SUM(?val) AS ?total) 
WHERE {
  ?s a :Sale . 
  ?s :amount ?val . 
  ?s :year ?year . 
  ?s :company ?company . 
} GROUP BY ?year ?company 

# with filtering on the aggregated value
SELECT ?year ?company (SUM(?val) AS ?total) 
WHERE {
  ?s a :Sale . 
  ?s :amount ?val . 
  ?s :year ?year . 
  ?s :company ?company . 
} GROUP BY ?year ?company HAVING (SUM(?val) > 5000) 
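
What GROUP BY, SUM and HAVING compute can be sketched in Python: group the rows by key, aggregate per group, then filter the groups. The sales figures and company names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales: (year, company, amount).
SALES = [
    (2009, "ACME", 4000),
    (2009, "ACME", 2500),
    (2009, "Initech", 3000),
    (2010, "Initech", 5500),
]

# GROUP BY ?year ?company with SUM(?val)
totals = defaultdict(int)
for year, company, amount in SALES:
    totals[(year, company)] += amount

# HAVING: a filter over groups, applied after aggregation
result = {key: total for key, total in totals.items() if total > 5000}
print(result)
```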


Subqueries (SPARQL 1.1)

SELECT ?company 
WHERE { 
  {
    SELECT ?company ((SUM(?val)) AS ?total09) 
    WHERE { 
      ?s a :Sale . 
      ?s :amount ?val . 
      ?s :company ?company . 
      ?s :year 2009 . 
    } GROUP BY ?company 
  } . 
  {
    SELECT ?company ((SUM(?val)) AS ?total10) 
    WHERE { 
      ?s a :Sale . 
      ?s :amount ?val . 
      ?s :company ?company . 
      ?s :year 2010 .
    } GROUP BY ?company 
  } . 
  FILTER(?total10 > ?total09) . 
} 
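
The query above joins two aggregated subquery results on ?company and then compares the totals. A rough Python equivalent, with made-up sales data and a hypothetical helper totals_for:

```python
from collections import Counter

# Hypothetical sales: (company, year, amount).
SALES = [
    ("ACME", 2009, 100),
    ("ACME", 2010, 150),
    ("Initech", 2009, 300),
    ("Initech", 2010, 200),
    ("Globex", 2010, 500),  # no 2009 sales, so it drops out of the join
]


def totals_for(year):
    """One subquery: total amount per company for the given year."""
    totals = Counter()
    for company, sale_year, amount in SALES:
        if sale_year == year:
            totals[company] += amount
    return totals


total09 = totals_for(2009)
total10 = totals_for(2010)

# Join on ?company, then apply FILTER(?total10 > ?total09).
growing = sorted(
    company
    for company in total09.keys() & total10.keys()
    if total10[company] > total09[company]
)
print(growing)
```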


Other Types of Queries

Yes/No Queries

Use when we need just TRUE/FALSE

Example: do we have any :diedOn record for :ElizabethTaylor?

ASK WHERE {:ElizabethTaylor :diedOn ?any}

Or we can use negation for that:

  • e.g. is there no :diedOn record for :ElizabethTaylor?
ASK WHERE { FILTER NOT EXISTS { :ElizabethTaylor :diedOn ?any } }


CONSTRUCT Queries

"Select" queries

  • are run on an RDF graph, but return a table
  • so they have no closure property: the result is not an RDF graph
  • with CONSTRUCT we build a valid RDF graph from the result instead
CONSTRUCT {
  ?d rdf:type :Director . 
  ?d rdfs:label ?name . 
}
WHERE {
  ?any :directedBy ?d . 
  ?d rdfs:label ?name . 
} 

sparql-construct-ex1.png


The result of a CONSTRUCT query can be

  • inserted back into this RDF store or another RDF store
  • serialized to RDF/XML


Rules

Construct queries provide a convenient way of specifying rules

  • based on patterns found in your RDF graph - i.e. facts stored in the database

Examples:

  • from the previous example: if something is directed by ?d, then ?d must be a :Director
  • these rules say "whenever you see this, conclude that"
  • can be used to express inference rules in Inference in Semantic Web

Types of Rules

  • completeness rules
    • John's father is Joe $\to$ Joe's son is John
  • logical rules
    • Socrates is a man, all men are mortal $\to$ Socrates is mortal
  • definitions
    • Ted's sister is Maria's mother $\to$ Ted is Maria's uncle
  • business rules
    • customers who spent more than 5000 USD are preferred customers
CONSTRUCT { ?q1 :hasSon ?q2 . } 
WHERE {
  ?q2 a :Man . 
  ?q2 :hasFather ?q1
} 

CONSTRUCT { ?q1 a :Mortal } 
WHERE { ?q1 a :Man } 

CONSTRUCT { ?q1 :hasUncle ?q2 } 
WHERE {
  ?q2 :hasSibling ?parent . 
  ?q2 a :Man . 
  ?q1 :hasParent ?parent 
} 

CONSTRUCT { ?c a :PreferredCustomer } 
WHERE {
  ?c :totalBusiness ?tb . 
  FILTER(?tb > 5000) 
} 
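
Reading a CONSTRUCT query as a rule means: match the WHERE pattern, emit the template triples, and optionally add them back to the graph. A minimal Python sketch of two of the rules above, assuming hypothetical data and hypothetical function names:

```python
# Hypothetical facts, using plain strings instead of URIs.
TRIPLES = {
    ("Socrates", "a", "Man"),
    ("John", "a", "Man"),
    ("John", "hasFather", "Joe"),
}


def construct_mortal(triples):
    """Logical rule: every :Man is :Mortal."""
    return {(s, "a", "Mortal") for s, p, o in triples if p == "a" and o == "Man"}


def construct_has_son(triples):
    """Completeness rule: if a man has father F, then F has him as a son."""
    men = {s for s, p, o in triples if p == "a" and o == "Man"}
    return {
        (o, "hasSon", s)
        for s, p, o in triples
        if p == "hasFather" and s in men
    }


inferred = construct_mortal(TRIPLES) | construct_has_son(TRIPLES)
enriched = TRIPLES | inferred  # inserting the result back into the store
print(sorted(inferred))
```

Because the output is again a set of triples, rules can be applied repeatedly until nothing new is produced.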


Protocol

SPARQL is not only a query language, but also a protocol

  • so a query engine can be a web service
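
In the protocol, a query can be sent over HTTP GET with the query text URL-encoded in a parameter named query. The Python sketch below only builds such a URL and decodes it again; the endpoint address and the function name build_query_url are hypothetical, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL


def build_query_url(endpoint, query):
    """Return the GET URL carrying `query` in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


query = "ASK WHERE { ?s ?p ?o }"
url = build_query_url(ENDPOINT, query)

# Round trip: what the service would see after decoding the URL.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded)
```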

SPARQL endpoints


Implementations


Sources
