# ML Wiki

## Motivation

Data integration

• suppose we have a distributed database across many servers
• each row is some entity, a column represents some property of this entity, and the cell contains a value described by this property
• inside a cell we can refer to another entity, and the meaning of the relationship is described by the name of the column
• so each cell of this database can be seen as a triple (row, column, value)
• row = resource/subject
• column = predicate
• value = object
• since the database is distributed, how do we know whether a resource on one server is the same as a resource on another?
• describe resources with a global ID - a URI (Uniform Resource Identifier)
• this is the main idea of RDF
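The cell-to-triple idea above can be sketched in a few lines of Python. This is only an illustration: the table contents, the `BASE` namespace and the helper name `cells_to_triples` are assumptions, not part of any RDF library.

```python
# Hypothetical table: row IDs, column names and values are made up for illustration.
BASE = "http://www.example.com/"  # assumed global namespace

table = {
    "doc1": {"author": "fabien", "topic": "music"},
    "doc2": {"author": "fabien", "topic": "rdf"},
}

def cells_to_triples(table, base=BASE):
    """Turn every cell into a (subject, predicate, object) triple of URIs."""
    triples = []
    for row_id, row in table.items():
        for column, value in row.items():
            # row = subject, column = predicate, value = object
            triples.append((base + row_id, base + column, base + value))
    return triples

triples = cells_to_triples(table)
for triple in triples:
    print(triple)
```

Because every part of a triple is a global URI, triples produced on different servers can be merged into one set without ID clashes.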

## RDF

RDF - Resource Description Framework, a way to represent knowledge for the Semantic Web

• knowledge representation based on triples $\langle \text{subject}, \ \text{predicate}, \ \text{object} \rangle$
• the triples can form a graph
• nodes - resources
• edges - predicates
• both represented with URIs
• there's a strong link between RDF and logic
• a set of RDF triples can be interpreted as a conjunction of positive literals

### Namespaces

one word can have several meanings

• e.g. Washington - state, city, person
• how to tell them apart?
• use namespaces

namespaces are typically URIs (like in XML)

• e.g.
• http://www.example.com/states#Washington
• http://www.example.com/cities#Washington
• http://www.example.com/people#Washington
• and as in XML, it's possible to use qnames - URI abbreviations for local use
• qnames have 2 parts: namespace and id
• states - http://www.example.com/states#
• so use states:Washington to refer to Washington state
• the default namespace has an empty prefix
• use :Washington for things in the default namespace
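A minimal sketch of how a qname could be expanded into a full URI. The prefix table reuses the example URIs above; the URI chosen for the default (empty) prefix and the function name `expand` are assumptions made for illustration.

```python
# Prefix table: prefix -> namespace URI.
PREFIXES = {
    "states": "http://www.example.com/states#",
    "cities": "http://www.example.com/cities#",
    "people": "http://www.example.com/people#",
    "": "http://www.example.com/default#",  # assumed default namespace
}

def expand(qname, prefixes=PREFIXES):
    """Expand 'prefix:id' into a full URI; an empty prefix means the default namespace."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        raise ValueError("not a qname: " + qname)
    return prefixes[prefix] + local

print(expand("states:Washington"))
print(expand("cities:Washington"))
print(expand(":Washington"))
```

The three calls give three different URIs, so the state, the city and the default-namespace Washington stay distinct.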

Standard namespace prefixes in RDF

• xsd: for XML Schema datatypes
• rdf: for the core RDF vocabulary (e.g. rdf:type)
• rdfs: for RDFS
• owl: for OWL

### Examples

#### Example 1

• suppose we have these statements
• doc.html is written by Fabien
• doc.html is about music
• so we have these triples
• doc.html isWrittenBy fabien
• doc.html about music
• it can be represented by the following graph
• every edge in this graph is an RDF triple
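As a rough sketch, the two triples of this example can be stored as a labelled directed graph using an adjacency list. The plain names stand in for full URIs, and `to_graph` is a hypothetical helper.

```python
from collections import defaultdict

# The two triples from the example; plain names stand in for full URIs.
triples = [
    ("doc.html", "isWrittenBy", "fabien"),
    ("doc.html", "about", "music"),
]

def to_graph(triples):
    """Build an adjacency list: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph

graph = to_graph(triples)
for predicate, obj in graph["doc.html"]:
    print("doc.html --" + predicate + "--> " + obj)
```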

#### Example 2: Modeling with RDF

• suppose you need to make an RDF statement from the following sentence:
• "a flower which is red and has a round shape"
• In RDF triples it can be
• flower color red
• flower shape round
• first, you need to find some definition of a flower
• then you look for relations and their definitions
• finally, you find appropriate instances for colors and shape
• so you have (shortened)
• :fleur :couleur :rouge
• :fleur :forme :ronde
• graphically, it's

### Types and Properties

The rdf:type predicate provides a basic typing system

• e.g. geo:Washington rdf:type geo:USState
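A small sketch of how rdf:type triples can be used to list the instances of a class. Only the geo:Washington triple comes from the text; the other two triples and the helper `instances_of` are assumptions added so the filter has something to exclude.

```python
triples = [
    ("geo:Washington", "rdf:type", "geo:USState"),
    ("geo:Oregon", "rdf:type", "geo:USState"),  # assumed extra data
    ("geo:Seattle", "rdf:type", "geo:City"),    # assumed extra data
]

def instances_of(cls, triples):
    """Return all subjects that are declared to be of the given class."""
    return [s for s, p, o in triples if p == "rdf:type" and o == cls]

print(instances_of("geo:USState", triples))
```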

### Blank Nodes

RDF allows resources to have no id at all

• sometimes we know that something exists
• and even know something about it
• but don't know its identity

For example,

• we know that Shakespeare had a mistress, but we don't know who she was
• and that she was the source of the inspiration for one of his works
• try to model as follows
"unknown" rdf:type bio:Woman
"unknown" bio:livedIn geo:England
lit:Sonnet79 lit:hasInspiration "unknown"


We should interpret it as

• there exists a woman who lived in England and is the source of inspiration for "Sonnet 79"
• so blank nodes are interpreted as existential variables

In Turtle it's

• lit:Sonnet79 lit:hasInspiration [a bio:Woman; bio:livedIn geo:England]
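The existential reading can be illustrated with a small Python sketch. The label `_:b0` follows the usual convention of writing blank nodes with a `_:` prefix; the function `exists_inspiration` is a hypothetical helper that checks whether some node, named or not, satisfies all three statements.

```python
# "_:b0" is a blank node label: it is local to this graph and carries no global identity.
triples = {
    ("_:b0", "rdf:type", "bio:Woman"),
    ("_:b0", "bio:livedIn", "geo:England"),
    ("lit:Sonnet79", "lit:hasInspiration", "_:b0"),
}

def exists_inspiration(work, triples):
    """True if there exists some x such that:
    work hasInspiration x, x is a Woman, and x lived in England."""
    candidates = {o for s, p, o in triples if s == work and p == "lit:hasInspiration"}
    return any(
        (x, "rdf:type", "bio:Woman") in triples
        and (x, "bio:livedIn", "geo:England") in triples
        for x in candidates
    )

print(exists_inspiration("lit:Sonnet79", triples))
```

The check never needs to know who `_:b0` is, only that such a node exists.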

## Semantic Web

RDF is a basis for the Semantic Web

• RDFS is a schema language for RDF that allows some basic inference
• RDFS-Plus is an extension of RDFS and a subset of OWL
• OWL - Web Ontology Language

All of them use RDF to express the language constructs

### Querying

• SPARQL is used for querying RDF graphs
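SPARQL queries are built from triple patterns in which some positions are variables. The sketch below is not SPARQL and not a real engine; it only illustrates, in plain Python, how a single pattern such as `?doc about music` could be matched against a set of triples. The function name `match` and the data are assumptions.

```python
triples = [
    ("doc.html", "isWrittenBy", "fabien"),
    ("doc.html", "about", "music"),
]

def match(pattern, triples):
    """Yield variable bindings for one triple pattern.
    Terms starting with '?' are variables, everything else must match exactly."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # a variable used twice must bind to the same value
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding

results = list(match(("?doc", "about", "music"), triples))
print(results)
```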

## RDF Serialization

The default representation is a plain list of triples, which is not very compact or user-friendly

• so a different representation is needed

There are several: