Motivation
Data integration
- suppose we have a distributed database across many servers
- each row is some entity, a column represents some property of this entity, and the cell contains a value described by this property
- inside a cell we can refer to another entity, and the meaning of the relationship is described by the name of the column
- so each cell of this database can be seen as a triple
row column value
- row = resource/subject
- column = predicate
- value = object
- since the database is distributed, how do we know whether a resource on one server is the same as a resource on another?
- describe resources with a global ID - URI (uniform resource identifier)
- this is the main idea of RDF
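The table-to-triple reading above can be sketched in a few lines of Python. This is only an illustration: the table contents, the URIs under `http://www.example.com/`, and the helper name `table_to_triples` are assumptions, not part of any RDF library.

```python
# Hypothetical table: each row is an entity keyed by a global ID (URI),
# each column is a property, and each cell holds a value.
table = {
    "http://www.example.com/people#Fabien": {
        "name": "Fabien",
        # a cell may refer to another entity by its URI
        "wrote": "http://www.example.com/docs#doc.html",
    },
    "http://www.example.com/docs#doc.html": {
        "about": "music",
    },
}


def table_to_triples(rows):
    """Turn every (row, column, cell) into a (subject, predicate, object) triple."""
    triples = []
    for subject, columns in rows.items():
        for predicate, obj in columns.items():
            triples.append((subject, predicate, obj))
    return triples


triples = table_to_triples(table)
for triple in triples:
    print(triple)
```

Because every subject is a URI, triples produced on different servers can be merged into one list without ambiguity about which rows describe the same resource.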
RDF
RDF - Resource Description Framework, a way to represent knowledge for the Semantic Web
- knowledge representation based on triples $\langle \text{subject}, \ \text{predicate}, \ \text{object} \rangle$
- the triples can form a graph
- nodes - resources
- edges - predicates
- both represented with URIs
- there’s a strong link between RDF and logic
- a set of RDF triples can be interpreted as a conjunction of positive literals
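One way to write this logical reading down, assuming each predicate is treated as a binary relation symbol (the notation $p_i(s_i, o_i)$ is introduced here only for illustration): a set of $n$ triples $\langle s_i, \ p_i, \ o_i \rangle$ corresponds to the formula

$$\bigwedge_{i=1}^{n} p_i(s_i, o_i)$$

For instance, two triples $\langle \text{doc.html}, \ \text{isWrittenBy}, \ \text{fabien} \rangle$ and $\langle \text{doc.html}, \ \text{about}, \ \text{music} \rangle$ would read as

$$\text{isWrittenBy}(\text{doc.html}, \text{fabien}) \land \text{about}(\text{doc.html}, \text{music})$$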
Namespaces
one word can have several meanings
- e.g. Washington - state, city, person
- how to tell them apart?
- use namespaces
namespaces are typically URIs (like in XML)
- e.g.
http://www.example.com/states#Washington
http://www.example.com/cities#Washington
http://www.example.com/people#Washington
- and as in XML, it’s possible to use qnames - URI abbreviations for local use
- qnames have 2 parts: namespace and id
- e.g. states - http://www.example.com/states#
- so use states:Washington to refer to Washington state
- default namespace in this case is empty
- use :Washington for things in the default namespace
- use
Default namespaces in RDF
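The qname abbreviation described above amounts to a prefix lookup followed by string concatenation. A minimal Python sketch, assuming a hand-written prefix table: the function name `expand_qname` and the URI chosen for the default (empty) prefix are hypothetical.

```python
# Prefix table; the empty prefix stands for the default namespace.
# The default-namespace URI below is an assumption made for illustration.
PREFIXES = {
    "states": "http://www.example.com/states#",
    "cities": "http://www.example.com/cities#",
    "people": "http://www.example.com/people#",
    "": "http://www.example.com/default#",
}


def expand_qname(qname, prefixes=PREFIXES):
    """Expand 'prefix:id' into a full URI using the prefix table."""
    prefix, sep, local_id = qname.partition(":")
    if not sep:
        raise ValueError(f"not a qname: {qname!r}")
    if prefix not in prefixes:
        raise KeyError(f"unknown namespace prefix: {prefix!r}")
    return prefixes[prefix] + local_id


print(expand_qname("states:Washington"))
print(expand_qname("cities:Washington"))
print(expand_qname(":Washington"))
```

The three calls yield three different URIs for the same local id, which is how namespaces tell the state, the city and anything in the default namespace apart.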
Examples
Example 1
- suppose we have these statements
doc.html is written by Fabien
doc.html is about music
- so we have these triples
doc.html isWrittenBy fabien
doc.html about music
- it can be represented by the following graph
- every edge in this graph is an RDF triple
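The graph view of these two triples can be sketched in plain Python as an adjacency structure. The helper name `build_graph` is hypothetical, and the short names stand in for full URIs.

```python
from collections import defaultdict

# The two triples from Example 1 (short names instead of full URIs).
triples = [
    ("doc.html", "isWrittenBy", "fabien"),
    ("doc.html", "about", "music"),
]


def build_graph(triples):
    """Map each subject node to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)

# Nodes are all subjects and objects; edges are labelled by predicates.
nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}

for subject, edges in graph.items():
    for predicate, obj in edges:
        print(f"{subject} --{predicate}--> {obj}")
```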
Example 2: Modeling with RDF
- suppose you need to make an RDF statement from the following sentence:
- “a flower which is red and has a round shape”
- In RDF triples it can be
flower color red
flower shape round
- first, you need to find some definition of a flower
- ideally it should be some resource you trust
- e.g. http://botanie.example.org/type/fleur
- then you look for relations and their definitions
- has color - http://concept.example.org/couleur
- has shape - http://concept.example.org/forme
- finally, you find appropriate instances for the color and the shape
- http://colors.example.org/rouge - red
- http://shapes.example.org/ronde - round
- so you have (shortened)
:fleur :couleur :rouge
:fleur :forme :ronde
- graphically, it’s
Types and Properties
the rdf:type predicate provides a basic typing system
- e.g.
geo:Washington rdf:type geo:USState
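To illustrate how such typing triples might be used, here is a small Python sketch that collects all resources declared to be of a given type. The helper `instances_of` and the two extra triples are assumptions added for the example.

```python
RDF_TYPE = "rdf:type"

triples = [
    ("geo:Washington", RDF_TYPE, "geo:USState"),
    ("geo:Oregon", RDF_TYPE, "geo:USState"),  # hypothetical extra triple
    ("geo:Seattle", RDF_TYPE, "geo:City"),    # hypothetical extra triple
]


def instances_of(cls, triples):
    """Return the sorted subjects that have an rdf:type edge to `cls`."""
    return sorted(s for s, p, o in triples if p == RDF_TYPE and o == cls)


print(instances_of("geo:USState", triples))
```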
Blank Nodes
RDF allows resources to have no id at all
- Sometimes we know that something exists
- And even know something about it
- but don’t know its identity
For example,
- we know that Shakespeare had a mistress, but we don’t know who she was
- and that she was the source of the inspiration for one of his works
- try to model as follows
"unknown" rdf:type bio:Woman
"unknown" bio:livedIn geo:England
lit:Sonnet79 lit:hasInspiration "unknown"
We should interpret it as
- there exists a woman who lived in England and is the source of inspiration for “Sonnet 79”
- so blank nodes are interpreted as existential variables
In Turtle it’s
lit:Sonnet79 lit:hasInspiration [a bio:Woman; bio:livedIn geo:England]
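The existential reading can be made concrete with a small Python sketch: a pattern containing a blank node is satisfied if some term in a set of facts can take the blank node's place. Everything here is illustrative: the `_:x` label, the `ex:w1` and `ex:w2` resources, and the helper `exists_binding` are assumptions, and the sketch only handles a pattern with a single blank node.

```python
# Pattern with one blank node, written with the conventional "_:" prefix.
pattern = [
    ("_:x", "rdf:type", "bio:Woman"),
    ("_:x", "bio:livedIn", "geo:England"),
    ("lit:Sonnet79", "lit:hasInspiration", "_:x"),
]

# Hypothetical facts in which the woman happens to have an identifier.
facts = [
    ("ex:w1", "rdf:type", "bio:Woman"),
    ("ex:w1", "bio:livedIn", "geo:England"),
    ("lit:Sonnet79", "lit:hasInspiration", "ex:w1"),
    ("ex:w2", "rdf:type", "bio:Woman"),
]


def is_blank(term):
    return term.startswith("_:")


def exists_binding(pattern, facts):
    """True if the single blank node in `pattern` can be replaced by some
    term from `facts` so that every pattern triple appears in `facts`."""
    fact_set = set(facts)
    candidates = {term for s, _, o in facts for term in (s, o)}
    for candidate in candidates:
        bound = {
            tuple(candidate if is_blank(term) else term for term in triple)
            for triple in pattern
        }
        if bound <= fact_set:
            return True
    return False


print(exists_binding(pattern, facts))
```

The pattern never names the woman; it only asserts that some resource fills the role, which is what "there exists" means here.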
Semantic Web
RDF is a basis for the Semantic Web
- RDFS is a schema language for RDF that allows some basic inference
- RDFS-Plus is an extension of RDFS and a subset of OWL
- OWL - Web Ontology Language
All of them use RDF to express the language constructs
Querying
- SPARQL is used for querying RDF graphs
RDF Serialization
The default form is a plain list of triples - not very compact or user-friendly
- need different representation
There are several: