ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

RDF

Motivation

Data integration

  • suppose we have a distributed database across many servers
  • each row is some entity, a column represents some property of this entity, and the cell contains a value described by this property
  • inside a cell we can refer to another entity, and the meaning of the relationship is described by the name of the column
  • so each cell of this database can be seen as a triple (row, column, value)
    • row = resource/subject
    • column = predicate
    • value = object
  • since the database is distributed, how do we know whether a resource on one server is the same as a resource on another?
    • describe resources with a global ID - a URI (uniform resource identifier)
  • this is the main idea of RDF
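The mapping from table cells to triples can be sketched as follows. This is only an illustration under stated assumptions: the base URI, the table contents, and the function name `table_to_triples` are hypothetical, and every cell value is treated as a reference to another entity.

```python
# Hypothetical base URI; any globally unique prefix would do.
BASE = "http://www.example.com/db#"

# One table: row key -> {column name -> cell value}.
# The rows and columns are made up for illustration.
table = {
    "doc1": {"author": "fabien", "topic": "music"},
    "fabien": {"livesIn": "france"},
}


def table_to_triples(rows, base=BASE):
    """Turn every cell into a (subject, predicate, object) triple of URIs."""
    triples = []
    for row_id, columns in rows.items():
        for column, value in columns.items():
            # row -> subject, column -> predicate, cell value -> object
            triples.append((base + row_id, base + column, base + value))
    return triples


triples = table_to_triples(table)
for triple in triples:
    print(triple)
```

Because every identifier carries the same global prefix, two servers that use the same URI are talking about the same resource.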

RDF

RDF (Resource Description Framework) is a way to represent knowledge for the Semantic Web

  • knowledge representation based on triples $\langle \text{subject}, \ \text{predicate}, \ \text{object} \rangle$
  • the triples can form a graph
    • nodes - resources
    • edges - predicates
    • both represented with URIs
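A minimal sketch of how a set of triples forms a graph, assuming plain Python tuples rather than any RDF library. The `ex:` names and the `build_graph` helper are placeholders chosen for illustration.

```python
from collections import defaultdict

# Placeholder triples; the "ex:" names are assumptions for illustration.
triples = [
    ("ex:doc.html", "ex:isWrittenBy", "ex:fabien"),
    ("ex:doc.html", "ex:about", "ex:music"),
]


def build_graph(triples):
    """Map each subject node to its outgoing (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)
for node, edges in graph.items():
    for predicate, target in edges:
        print(f"{node} --{predicate}--> {target}")
```

Each triple contributes exactly one labeled edge, so the graph and the triple set carry the same information.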

Description Logic

  • there’s a strong link between RDF and logic
  • a set of RDF triples can be interpreted as a conjunction of positive literals

Namespaces

one word can have several meanings

  • e.g. Washington - state, city, person
  • how to tell them apart?
  • use namespaces

namespaces are typically URIs (like in XML)

  • e.g.
    • http://www.example.com/states#Washington
    • http://www.example.com/cities#Washington
    • http://www.example.com/people#Washington
  • and as in XML, it’s possible to use qnames - URI abbreviations for local use
    • qnames have 2 parts: namespace and id
    • states - http://www.example.com/states#
    • so use states:Washington to refer to Washington state
  • default namespace in this case is empty
    • use :Washington for things in the default namespace
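Expanding a qname back to a full URI could look like the sketch below. The prefix table reuses the example URIs above; the URI bound to the empty (default) prefix and the `expand` helper are assumptions made for illustration.

```python
# Prefix table: the first three entries come from the examples above,
# the default-namespace URI is a hypothetical placeholder.
PREFIXES = {
    "states": "http://www.example.com/states#",
    "cities": "http://www.example.com/cities#",
    "people": "http://www.example.com/people#",
    "": "http://www.example.com/default#",
}


def expand(qname, prefixes=PREFIXES):
    """Expand 'prefix:id' into a full URI using the prefix table."""
    if ":" not in qname:
        raise ValueError(f"not a qname: {qname!r}")
    prefix, _, local_id = qname.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown namespace prefix: {prefix!r}")
    return prefixes[prefix] + local_id


print(expand("states:Washington"))
print(expand("cities:Washington"))
print(expand(":Washington"))
```

The three calls yield three different URIs, which is how namespaces keep the different meanings of "Washington" apart.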

Default namespaces in RDF

  • xsd: for primitive XML types
  • rdf: for default things in rdf
  • rdfs: for RDFS
  • owl: for OWL

Examples

Example 1

  • suppose we have these statements
    • doc.html is written by Fabien
    • doc.html is about music
  • so we have these triples
    • doc.html isWrittenBy fabien
    • doc.html about music
  • it can be represented by the following graph
    • Image
    • every edge in this graph is an RDF triple

Example 2: Modeling with RDF

  • suppose you need to make an RDF statement from the following sentence:
    • “a flower which is red and has a round shape”
  • In RDF triples it can be
    • flower color red
    • flower shape round
  • first, you need to find some definition of a flower
    • ideally it should be some resource you trust
    • e.g. http://botanie.example.org/type/fleur
  • then you look for relations and their definitions
    • has color - http://concept.example.org/couleur
    • has shape - http://concept.example.org/forme
  • finally, you find appropriate instances for colors and shape
    • http://colors.example.org/rouge - red
    • http://shapes.example.org/ronde - round
  • so you have (shortened)
    • :fleur :couleur :rouge
    • :fleur :forme :ronde
  • graphically, it’s
    • Image

Types and Properties

The rdf:type predicate provides a basic typing system

  • e.g. geo:Washington rdf:type geo:USState

Blank Nodes

RDF allows resources to have no id at all

  • sometimes we know that something exists
  • and even know something about it
  • but don’t know its identity

For example,

  • we know that Shakespeare had a mistress, but we don’t know who she was
  • and we know that she was the source of inspiration for one of his works
  • we can try to model this as follows
"unknown" rdf:type bio:Woman
"unknown" bio:livedIn geo:England
lit:Sonnet79 lit:hasInspiration "unknown"

We should interpret it as

  • there exists a woman who lived in England and is the source of inspiration for “Sonnet 79”
  • so blank nodes are interpreted as existential variables
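Written as a formula, this reading treats each predicate as a binary relation and the blank node as an existentially quantified variable $x$ (the variable name is an arbitrary choice):

$$
\exists x.\ \text{rdf:type}(x, \text{bio:Woman}) \ \land \ \text{bio:livedIn}(x, \text{geo:England}) \ \land \ \text{lit:hasInspiration}(\text{lit:Sonnet79}, x)
$$

All three triples share the same $x$, so they describe one unnamed individual rather than three unrelated ones.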

In Turtle it’s

  • lit:Sonnet79 lit:hasInspiration [a bio:Woman; bio:livedIn geo:England] .

Semantic Web

RDF is a basis for the Semantic Web

  • RDFS is a schema language for RDF that allows some basic inference
  • RDFS-Plus is an extension of RDFS and a subset of OWL
  • OWL - Web Ontology Language

All of them use RDF to express the language constructs

Querying

  • SPARQL is used for querying RDF graphs
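The core idea behind querying a set of triples is pattern matching with variables. The sketch below is not SPARQL and uses no RDF library; it only mimics a single triple pattern in plain Python, with `None` standing in for a variable. The `match` helper and the `geo:Oregon`, `geo:Seattle`, and `geo:City` names are hypothetical.

```python
# Sample data; only geo:Washington and geo:USState come from the text,
# the other names are made up for illustration.
triples = [
    ("geo:Washington", "rdf:type", "geo:USState"),
    ("geo:Oregon", "rdf:type", "geo:USState"),
    ("geo:Seattle", "rdf:type", "geo:City"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every non-None position."""
    pattern = (s, p, o)
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# Roughly the pattern "?s rdf:type geo:USState": find all subjects of that type.
states = [s for s, _, _ in match(triples, p="rdf:type", o="geo:USState")]
print(states)  # ['geo:Washington', 'geo:Oregon']
```

A real query language combines many such patterns and joins them on shared variables; this sketch covers only the single-pattern case.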

RDF Serialization

The default representation is a plain list of triples, which is neither compact nor user-friendly

  • Image
  • so we need a different representation

There are several:
