Semantic Web
The Semantic Web is a web that is used to represent knowledge
- the Semantic Web is part of Knowledge Representation, AI
- principle: be able to describe the things in a document in a machine-processable way
- RDF is the main tool for representing knowledge in Semantic Web
What we currently have in WWW:
- mostly have links of the form $a \ \text{href} \ b$
- what we want: $A \ \text{dependsOn} \ a$, $a \ \text{isDescribedBy} \ b$, etc
- i.e. we want links to have some meaning behind them
- so we can use the Semantic Web and Ontologies for that
DIKW
DIKW [1]: data $\to$ information $\to$ knowledge $\to$ wisdom $\Rightarrow$ decision
- D - just collecting data: somebody enters data into a web app - just values
- I - databases (RDBs, XML, etc) - now you have some structure
- you also know when the data was collected, by whom, etc - i.e. it comes with some metadata
- K - reports, analysis - to facilitate decision making
- W - to increase effectiveness
- see also [2]
Smart Web of Linked Data
So the goal is to have machine-readable linked data. We want to have "Smart Web" - linked and consistent.
- fundamental issue of the web: how to manage distributed data
Motivation: Integration
Data integration and distribution
- suppose that two servers share the same tables
- but tables have different schemas
- how do we know that a column in the first db corresponds to a column in the second?
- so we need some coordination between the servers, like global reference
- so represent each cell of these tables with 3 values
- global reference for row
- global reference for column
- global reference for the value in the cell
- such cells can be stored on any of these servers
- this is the basic idea of RDF
- and global references are URIs
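The idea above can be sketched in a few lines of Python. This is only an illustration, not an RDF library: the `http://example.org/` base URI, the table contents and the helper `table_to_triples` are all made up for the example, and cell values are kept as plain literals for simplicity.

```python
# Hypothetical base URI, used only for this illustration.
BASE = "http://example.org/"

def table_to_triples(rows, key, column_uris):
    """Turn every cell into a (row URI, column URI, value) triple."""
    triples = set()
    for row in rows:
        subject = BASE + "person/" + str(row[key])  # global reference for the row
        for column, value in row.items():
            if column == key:
                continue
            # column_uris maps a local column name to its global reference
            triples.add((subject, column_uris[column], value))
    return triples

# Two servers hold the same kind of table under different local schemas.
server_a = [{"id": 1, "name": "Alice"}]
server_b = [{"pid": 1, "full_name": "Alice", "city": "Berlin"}]

# Both servers agree on global references for the columns.
NAME = BASE + "prop/name"
CITY = BASE + "prop/city"

triples_a = table_to_triples(server_a, "id", {"name": NAME})
triples_b = table_to_triples(server_b, "pid", {"full_name": NAME, "city": CITY})

# Once cells are triples, integration is just a set union:
# the shared "name" cell collapses into a single triple.
merged = triples_a | triples_b
for triple in sorted(merged):
    print(triple)
```

Because both local column names map to the same global reference, the merged set can live on either server without further coordination.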
Smart Managing of Data
- rows are "things" (or entities or individuals)
- columns specify properties of these things
- if a cell references some other "thing", the meaning of this relation can be understood from the name of the column
- can express this in a more meaningful form - with reference where this meaning is described
- so the "things" are resources that can relate to other resources
- to describe these things and relations, use URIs
Linked Data
Linked Open Data: a giant graph
- all these sources provide RDF data
- in the LOD cloud diagram, every circle is a source of data; the bigger the circle, the more articles it has
- the bigger the arrow, the more links there are from one source to another
- http://dbpedia.org - the main hub in LOD
Other Sources:
- Freebase, UMBEL, YAGO2, OpenCyc
- Geography: Geonames, LinkedGeoData; EuroStat, World Factbook, US Census, Ordnance Survey
- Media: BBC (/programmes, /music), WildlifeFinder, New York Times, Thomson Reuters: Open Calais (Named Entities extracted from text)
- Social Media: Open Graph Protocol (Facebook), Internet Movie Database
- Libraries: American Lib. of Congress, German National Lib. of Economics, LIBRIS, Swedish Nat. Union Catalogue, OpenLibrary
- Scholarly articles (journals, conferences): DBLP, ACM, RKBexplorer, Semantic Web Dogfood Server
- Many others - see http://linkeddata.org and http://lod-cloud.net/
Linked data principles
These principles are recommendations (best practices):
- use URIs to talk about things
- HTTP URIs are better so people can access them
- when somebody uses this URI, make use of standards (RDF, SPARQL) to describe things
- include links to other resources
Links
- relationship links - point to related resources (inside or outside the data source)
- identity links - point to other resources that describe the same concept (in OWL: owl:sameAs)
- vocabulary links - definition of used terms
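As a rough illustration of what identity links buy us, the sketch below merges the descriptions of two URIs that are connected by `owl:sameAs`. The prefixes `ex:` and `other:`, the sample triples and the helpers `build_find` and `describe` are hypothetical; a real application would use an RDF library and a reasoner instead.

```python
SAME_AS = "owl:sameAs"

# Made-up triples from two hypothetical sources describing the same city.
triples = [
    ("ex:Berlin", "ex:label", "Berlin"),
    ("other:Berlin_City", "other:country", "other:Germany"),
    ("ex:Berlin", SAME_AS, "other:Berlin_City"),  # identity link
]

def build_find(triples):
    """Return a function mapping a URI to a representative of its sameAs group."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            uri = parent[uri]
        return uri

    for s, p, o in triples:
        if p == SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                parent[root_s] = root_o  # merge the two identity groups
    return find

def describe(uri, triples):
    """Collect (property, value) pairs stated about uri or any of its aliases."""
    find = build_find(triples)
    target = find(uri)
    return {(p, o) for s, p, o in triples
            if p != SAME_AS and find(s) == target}

print(sorted(describe("ex:Berlin", triples)))
```

Asking about either URI returns the statements made about both, which is the practical effect of an identity link.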
Main Assumptions
AAA Slogan
Main slogan for the web and the semantic web:
- AAA - Anyone can say Anything about Any topic
- consequence: there can always be something else that somebody can say
Open World Assumption
Open World assumption
- a consequence of the AAA slogan
- at any time some new information can appear
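One way to picture the difference from a closed-world database is the toy sketch below. The triples and the functions `ask_closed` and `ask_open` are invented for the illustration; `None` stands for "unknown".

```python
# Facts currently known to us (made-up example data).
known = {("ex:Alice", "ex:knows", "ex:Bob")}

def ask_closed(triple, facts):
    """Closed world: anything not stated is taken to be false."""
    return triple in facts

def ask_open(triple, facts):
    """Open world: anything not stated is unknown (None), not false."""
    return True if triple in facts else None

question = ("ex:Alice", "ex:knows", "ex:Carol")
print(ask_closed(question, known))  # the closed-world answer is a definite "no"
print(ask_open(question, known))    # the open-world answer is "don't know"

# Somebody else says something new: the open-world answer changes
# without contradicting anything that was concluded before.
known.add(question)
print(ask_open(question, known))
```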
Semantic Modeling
How to model data so that it works at web scale:
- need to explain things in understandable way
- and then be able to reuse it
- need to be formal so machines can understand it, and logical inference is possible
- Result of modeling: Ontologies
Semantic Web provides a number of modeling languages with different degree of expressivity:
- RDF - Resource Description Framework
- the basic mechanism to make basic statements about anything
- RDFS - schema for RDF, expresses classes, subclasses and properties
- RDFS-Plus - a subset of OWL, more expressive than RDFS, less complex than OWL
- OWL - logics of Semantic Web, models detailed constraints between classes, properties and entities
Formal foundation for RDFS and OWL:
Logical Inference
- Main Article: Inference in Semantic Web
RDFS and OWL allow new triples to be derived from the facts asserted in the database
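A minimal sketch of such inference, assuming only two RDFS rules (type propagation along `rdfs:subClassOf` and transitivity of `rdfs:subClassOf`). The `ex:` names and the function `infer` are hypothetical, and a real reasoner implements many more rules.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def infer(triples):
    """Apply the two rules repeatedly until no new triples appear."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # only pair a triple with "o subClassOf o2"
                if p2 != SUBCLASS or o != s2:
                    continue
                if p == TYPE:        # s is an o, o is a subclass of o2
                    new.add((s, TYPE, o2))
                elif p == SUBCLASS:  # subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
        if new <= facts:             # fixpoint reached
            return facts
        facts |= new

# Asserted facts (made-up example data).
asserted = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:Rex", TYPE, "ex:Dog"),
}

derived = infer(asserted) - asserted
for triple in sorted(derived):
    print(triple)
```

None of the derived triples were entered by hand; they follow from the asserted facts and the schema.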
How to use the SW in your applications?
- tools, storage, parsers/serializers, etc
Links
Sources