Semantic Web

The Semantic Web is a web used to represent knowledge in a machine-processable form.

  • the Semantic Web is part of Knowledge Representation, a branch of AI
  • principle: be able to describe the things in a document in a machine-processable way
  • RDF is the main tool for representing knowledge in the Semantic Web

What we currently have on the WWW:

  • mostly links of the form $a \ \text{href} \ b$
  • what we want: $A \ \text{dependsOn} \ a$, $a \ \text{isDescribedBy} \ b$, etc.
    • i.e. we want links to carry some meaning
    • so we can use the Semantic Web and Ontologies for that
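
The difference between a plain hyperlink and a typed link can be sketched in a few lines of Python. This is only an illustration: the `example.org` URIs and the `EX` vocabulary namespace are hypothetical, and `targets` is a helper made up for this sketch.

```python
# A plain hyperlink only records that a points to b.
href_links = [("http://example.org/a", "http://example.org/b")]

# A typed link also names the relation: (subject, predicate, object).
EX = "http://example.org/vocab#"  # hypothetical vocabulary namespace
typed_links = [
    ("http://example.org/A", EX + "dependsOn", "http://example.org/a"),
    ("http://example.org/a", EX + "isDescribedBy", "http://example.org/b"),
]


def targets(links, subject, predicate):
    """Return all objects linked from `subject` via `predicate`."""
    return [o for s, p, o in links if s == subject and p == predicate]


# With typed links we can ask a question that href links cannot answer.
described_by = targets(typed_links, "http://example.org/a", EX + "isDescribedBy")
```

With only `href_links` there is no way to tell a dependency from a description; the predicate is what carries the meaning.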


DIKW

DIKW [1]: data $\to$ information $\to$ knowledge $\to$ wisdom $\Rightarrow$ decision

  • D - just collecting data: somebody enters data into a web app, and these are just values
  • I - databases (RDBs, XML, etc.): now the data has some structure
    • you also know when it was collected, by whom, etc., i.e. it comes with some metadata
  • K - reports, analysis - to facilitate decision making
  • W - to increase effectiveness
  • see also [2]



Smart Web of Linked Data

So the goal is to have machine-readable linked data. We want a "Smart Web" that is linked and consistent.

  • fundamental issue of the web: how to manage distributed data


Motivation: Integration

Data integration and distribution

  • suppose that two servers share the same tables
  • but the tables have different schemas
  • how do we know that a column in the first database corresponds to a column in the second?
  • so we need some coordination between the servers, such as a global reference
  • so represent each cell of these tables with 3 values:
    • global reference for the row
    • global reference for the column
    • global reference for the value in the cell
  • such cells can be stored on any of these servers
  • this is the basic idea of RDF
  • and the global references are URIs
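
A minimal sketch of this idea, assuming a table is given as a list of Python dicts. The `BASE` namespace, the URI layout, and the function `table_to_triples` are hypothetical choices made for illustration. For brevity, cell values are kept as plain literals here; a value that refers to another row would itself be a URI.

```python
BASE = "http://example.org/"  # hypothetical namespace for global references


def table_to_triples(table_name, key_column, rows):
    """Turn each cell into a (row URI, column URI, value) triple."""
    triples = []
    for row in rows:
        # Global reference for the row, built from its primary key.
        row_uri = f"{BASE}{table_name}/{row[key_column]}"
        for column, value in row.items():
            if column == key_column:
                continue  # the key is already encoded in the row URI
            # Global reference for the column.
            column_uri = f"{BASE}{table_name}#{column}"
            triples.append((row_uri, column_uri, value))
    return triples


products = [
    {"id": "p1", "name": "Widget", "price": 10},
    {"id": "p2", "name": "Gadget", "price": 25},
]
triples = table_to_triples("product", "id", products)
```

Each triple is self-describing, so it no longer matters which server stores it or how the original table was laid out.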


Smart Managing of Data

  • rows are "things" (or entities or individuals)
  • columns specify properties of these things
  • if a cell references some other "thing", the meaning of this relation can be understood from the name of the column
  • this can be expressed in a more meaningful form: with a reference to where this meaning is described
  • so the "things" are resources that can relate to other resources
  • to describe these things and relations, use URIs
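
As a rough illustration, the sketch below keeps triples about related resources on two separate "servers" (here just two Python sets) and follows a reference from one resource to another after merging them. The `EX` namespace, the property names, and both helper functions are assumptions made up for this example.

```python
EX = "http://example.org/"  # hypothetical namespace

# Triples held by two different servers; the author is a resource, not a string.
server_a = {
    (EX + "book/1", EX + "vocab#title", "Dune"),
    (EX + "book/1", EX + "vocab#author", EX + "person/7"),
}
server_b = {
    (EX + "person/7", EX + "vocab#name", "Frank Herbert"),
}

# Because every reference is global, merging is a plain set union.
graph = server_a | server_b


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def author_names(graph, book):
    """Follow book -> author -> name across the merged graph."""
    names = set()
    for author in objects(graph, book, EX + "vocab#author"):
        names |= objects(graph, author, EX + "vocab#name")
    return names


names = author_names(graph, EX + "book/1")
```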


Linked Data

Linked Open Data: a giant graph


Other Sources:

  • Freebase, UMBEL, YAGO2, OpenCyc
  • Geography: Geonames, LinkedGeoData; EuroStat, World Factbook, US Census, Ordnance Survey

  • Media: BBC (/programmes, /music), WildlifeFinder, New York Times, Thomson Reuters: Open Calais (Named Entities extracted from text)
  • Social Media: Open Graph Protocol (Facebook), Internet Movie Database
  • Libraries: American Library of Congress, German National Library of Economics, LIBRIS, Swedish National Union Catalogue, Open Library
  • Scholarly articles (journals, conferences): DBLP, ACM, RKBExplorer, Semantic Web Dogfood Server
  • Many others - see http://linkeddata.org and http://lod-cloud.net/


Linked data principles

These principles are recommendations (best practices):

  • use URIs to name things
  • prefer HTTP URIs, so that people can look these names up
  • when somebody looks up a URI, use standards (RDF, SPARQL) to describe the thing
  • include links to other resources

Links

  • relationship links - point to related resources (inside or outside the data source)
  • identity links - point to other resources that describe the same concept (in OWL: owl:sameAs)
  • vocabulary links - definition of used terms
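
Identity links can be used to group URIs that denote the same thing. The sketch below is one possible way to do it, using a small union-find structure over `owl:sameAs` triples; the example URIs and the function name `same_as_groups` are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_groups(triples):
    """Group URIs connected (directly or transitively) by owl:sameAs."""
    parent = {}

    def find(x):
        # Find the representative of x, halving the path on the way up.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == OWL_SAME_AS:
            parent[find(s)] = find(o)  # merge the two groups

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Hypothetical data sets that describe the same city under different URIs.
links = [
    ("http://a.example/Berlin", OWL_SAME_AS, "http://b.example/city/berlin"),
    ("http://b.example/city/berlin", OWL_SAME_AS, "http://c.example/id/42"),
    ("http://a.example/Paris", OWL_SAME_AS, "http://b.example/city/paris"),
    ("http://a.example/Berlin", "http://a.example/vocab#country", "http://a.example/Germany"),
]
groups = same_as_groups(links)
```

Only the `owl:sameAs` triples contribute to the grouping; the relationship link in the last triple is ignored.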


Main Assumptions

AAA Slogan

Main slogan for the web and the semantic web:

  • AAA - Anyone can say Anything about Any topic
  • consequence: there can always be something more that somebody can say

Open World Assumption

The Open World Assumption:

  • a consequence of the AAA slogan
  • at any time some new information can appear
  • so the absence of a statement does not mean that the statement is false
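
The practical effect on querying can be sketched as follows. This is a toy contrast, not how any particular triple store works: the `ex:` names are shorthand for hypothetical URIs, and `None` is used here to stand for "unknown".

```python
facts = {("ex:alice", "ex:knows", "ex:bob")}


def ask_closed_world(facts, triple):
    """Closed world: anything not stated is taken to be false."""
    return triple in facts


def ask_open_world(facts, triple):
    """Open world: anything not stated is unknown (None), not false."""
    return True if triple in facts else None


query = ("ex:alice", "ex:knows", "ex:carol")
closed_answer = ask_closed_world(facts, query)  # False
open_answer = ask_open_world(facts, query)      # None, i.e. unknown

# New information may appear at any time and turn "unknown" into "true".
facts.add(query)
later_answer = ask_open_world(facts, query)     # True
```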


Semantic Modeling

How to model data so that it works at web scale:

  • need to explain things in an understandable way
  • and then be able to reuse the model
  • need to be formal, so that machines can understand it and logical inference is possible
  • result of modeling: Ontologies


The Semantic Web provides a number of modeling languages with different degrees of expressivity:

  • RDF - Resource Description Framework
    • the basic mechanism for making statements about anything
  • RDFS - RDF Schema, expresses classes, subclasses and properties
  • RDFS-Plus - a subset of OWL, more expressive than RDFS, less complex than OWL
  • OWL - the logic of the Semantic Web, models detailed constraints between classes, properties and entities

Formal foundation for RDFS and OWL:


Logical Inference

Main Article: Inference in Semantic Web

RDFS and OWL allow new triples to be inferred from the facts asserted in the database.
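
A minimal sketch of such inference, restricted to two RDFS rules: subclass transitivity and propagation of `rdf:type` along `rdfs:subClassOf` (rules rdfs11 and rdfs9 in the RDF Semantics specification). The prefixed names are shorthand strings, the `ex:` resources are hypothetical, and the naive fixed-point loop is chosen for clarity, not efficiency.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Repeatedly apply the two rules until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                # s subClassOf o, o subClassOf o2  =>  s subClassOf o2
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # s2 type s, s subClassOf o  =>  s2 type o
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if new <= inferred:
            return inferred
        inferred |= new


asserted = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}
closure = rdfs_closure(asserted)
derived = closure - asserted
```

Here `derived` holds the triples that were never asserted but follow from the schema, for example that `ex:rex` is an `ex:Animal`.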


Semantic Web/Application Architecture

How to use the SW in your applications?

  • tools, storage, parsers/serializers, etc


Links


Sources
