# ML Wiki

## Semantic Web

The Semantic Web is a web that is used to represent knowledge.

• Semantic Web - part of Knowledge Representation, AI
• principle: be able to describe the things in a document in a machine-processable way
• RDF is the main tool for representing knowledge in the Semantic Web

What we currently have in WWW:

• mostly untyped links of the form $a \ \text{href} \ b$
• what we want: $A \ \text{dependsOn} \ a$, $a \ \text{isDescribedBy} \ b$, etc.
• i.e. we want links to carry some meaning
• the Semantic Web and Ontologies can be used for that
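The difference between plain and typed links can be sketched with Python tuples. The `example.org` URIs and the property names `dependsOn` and `isDescribedBy` are hypothetical illustrations, not terms from a real vocabulary.

```python
# Untyped web links: we only know that one page points to another.
href_links = [
    ("http://example.org/A", "http://example.org/a"),
    ("http://example.org/a", "http://example.org/b"),
]

# Typed links: each link carries a property that names the relation.
DEPENDS_ON = "http://example.org/vocab#dependsOn"            # hypothetical property
IS_DESCRIBED_BY = "http://example.org/vocab#isDescribedBy"   # hypothetical property

typed_links = [
    ("http://example.org/A", DEPENDS_ON, "http://example.org/a"),
    ("http://example.org/a", IS_DESCRIBED_BY, "http://example.org/b"),
]


def targets(links, source, relation):
    """Return the targets of all links from `source` that have the given relation."""
    return [o for s, p, o in links if s == source and p == relation]


print(targets(typed_links, "http://example.org/A", DEPENDS_ON))
```

With `href_links` the question "what does A depend on?" cannot be asked at all, because the relation is not recorded.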

### DIKW

DIKW [1]: data $\to$ information $\to$ knowledge $\to$ wisdom $\Rightarrow$ decision

• D - just collecting data, e.g. somebody enters data into a web app - just values
• I - databases (RDBs, XML, etc) - now you have some structure
• but also know when it was collected, by whom, etc - i.e. with some metadata
• K - reports, analysis - to facilitate decision making
• W - to increase effectiveness

## Smart Web of Linked Data

So the goal is to have machine-readable linked data. We want to have a "Smart Web" - linked and consistent.

• fundamental issue of the web: how to manage distributed data

### Motivation: Integration

Data integration and distribution

• suppose that two servers share the same tables
• but tables have different schemas
• how do we know that a column in the first db corresponds to a column in the second?
• so we need some coordination between the servers, like global reference
• so represent each cell of these tables with 3 values
• global reference for row
• global reference for column
• global reference for the value in the cell
• such cells can be stored on any of these servers
• this is the basic idea of RDF
• and global references are URIs
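A minimal sketch of this idea, assuming two hypothetical tables and a hypothetical `http://example.org/` namespace for the global references. Cell values are kept as plain strings for brevity, although they could be URIs as well.

```python
EX = "http://example.org/"  # hypothetical namespace for global references

# Two servers hold data about the same thing under different local schemas.
server_1 = {"columns": ["id", "name"], "rows": [["p1", "Alice"]]}
server_2 = {"columns": ["person", "city"], "rows": [["p1", "Berlin"]]}

# The coordination step: every local column name maps to a global reference.
column_uris_1 = {"name": EX + "name"}
column_uris_2 = {"city": EX + "livesIn"}


def table_to_triples(table, key_column, column_uris):
    """Turn every mapped cell into a (row URI, column URI, value) triple."""
    key_index = table["columns"].index(key_column)
    triples = set()
    for row in table["rows"]:
        subject = EX + row[key_index]  # global reference for the row
        for column, value in zip(table["columns"], row):
            if column in column_uris:
                triples.add((subject, column_uris[column], value))
    return triples


# Once cells are triples, data from both servers is merged by a set union.
merged = (table_to_triples(server_1, "id", column_uris_1)
          | table_to_triples(server_2, "person", column_uris_2))
for triple in sorted(merged):
    print(triple)
```

Because each triple is self-describing, it does not matter which server stores it.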

Smart Managing of Data

• rows are "things" (or entities or individuals)
• columns specify properties of these things
• if a cell references some other "thing", the meaning of this relation can be understood from the name of the column
• this can be expressed in a more meaningful form - with a reference to where this meaning is described
• so the "things" are resources that can relate to other resources
• to describe these things and relations, use URIs

Linked Open Data: a giant graph

Other Sources:

• Freebase, UMBEL, YAGO2, OpenCyc
• Geography: Geonames, LinkedGeoData; EuroStat, World Factbook, US Census, Ordnance Survey

• Media: BBC (/programmes, /music), WildlifeFinder, New York Times, Thomson Reuters: Open Calais (Named Entities extracted from text)
• Social Media: Open Graph Protocol (Facebook), Internet Movie Database
• Libraries: American Lib. of Congress, German National Lib. of Economics, LIBRIS, Swedish Nat. Union Catalogue, OpenLibrary
• Scholarly articles (journals, conferences): DBLP, ACM, RKBexplorer, Semantic Web Dogfood Server
• Many others - see http://linkeddata.org and http://lod-cloud.net/

The Linked Data principles are recommendations - best practices

• use URIs to talk about things
• HTTP URIs are better so people can access them
• when somebody looks up such a URI, use standards (RDF, SPARQL) to describe the thing
• include links to other resources

• identity links - point to other resources that describe the same concept (in OWL: owl:sameAs)
• vocabulary links - definition of used terms
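Identity links can be followed to collect what different sources say about the same thing. The sketch below assumes a small in-memory set of triples; the `example.org` and `other.example.com` URIs and the `label` and `country` properties are hypothetical, only the `owl:sameAs` URI is the real OWL term.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
LABEL = "http://example.org/vocab#label"      # hypothetical property
COUNTRY = "http://example.org/vocab#country"  # hypothetical property

triples = {
    ("http://example.org/berlin", LABEL, "Berlin"),
    ("http://example.org/berlin", OWL_SAME_AS, "http://other.example.com/city/Berlin"),
    ("http://other.example.com/city/Berlin", COUNTRY, "Germany"),
}


def aliases(triples, uri):
    """Return every URI connected to `uri` by owl:sameAs links, in either direction."""
    seen = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            if s == current and o not in seen:
                seen.add(o)
                frontier.append(o)
            elif o == current and s not in seen:
                seen.add(s)
                frontier.append(s)
    return seen


def describe(triples, uri):
    """Collect (property, value) pairs stated about `uri` or any of its aliases."""
    same = aliases(triples, uri)
    return {(p, o) for s, p, o in triples if s in same and p != OWL_SAME_AS}


print(sorted(describe(triples, "http://example.org/berlin")))
```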

### Main Assumptions

#### AAA Slogan

Main slogan for the web and the semantic web:

• AAA - Anyone can say Anything about Any topic
• consequence: there can always be something else that somebody can say

#### Open World Assumption

Open World assumption

• a consequence of the AAA slogan
• at any time some new information can appear
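In practice this changes how a missing statement is read. The sketch below contrasts a closed-world lookup with an open-world one; the function names and the `knows` property are hypothetical, and `None` is used here to stand for "unknown".

```python
KNOWS = "http://example.org/vocab#knows"  # hypothetical property

known = {
    ("http://example.org/alice", KNOWS, "http://example.org/bob"),
}


def closed_world_ask(triples, triple):
    """Closed world: anything that is not stated is taken to be false."""
    return triple in triples


def open_world_ask(triples, triple):
    """Open world: a missing statement is unknown (None), not false."""
    return True if triple in triples else None


question = ("http://example.org/alice", KNOWS, "http://example.org/carol")
print(closed_world_ask(known, question))  # False
print(open_world_ask(known, question))    # None

# New information may appear at any time and turn "unknown" into "true".
known.add(question)
print(open_world_ask(known, question))    # True
```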

## Semantic Modeling

How to model data in such a way that it works at web scale

• need to explain things in an understandable way
• and then be able to reuse them
• need to be formal so that machines can understand it and logical inference is possible
• Result of modeling: Ontologies

The Semantic Web provides a number of modeling languages with different degrees of expressivity:

• RDF - Resource Description Framework
• the basic mechanism to make basic statements about anything
• RDFS - schema for RDF, expresses classes, subclasses and properties
• RDFS-Plus - a subset of OWL, more expressive than RDFS, less complex than OWL
• OWL - logics of Semantic Web, models detailed constraints between classes, properties and entities

Formal foundation for RDFS and OWL:

### Logical Inference

Main Article: Inference in Semantic Web

RDFS and OWL allow new triples to be inferred from the facts asserted in the database
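As an illustration, here is a minimal sketch of two RDFS-style inference rules: transitivity of `rdfs:subClassOf` and propagation of `rdf:type` along `rdfs:subClassOf`. The `rdfs_closure` function and the `example.org` classes are hypothetical; a real reasoner covers many more rules.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/"  # hypothetical namespace

asserted = {
    (EX + "rex", RDF_TYPE, EX + "Dog"),
    (EX + "Dog", RDFS_SUBCLASS_OF, EX + "Mammal"),
    (EX + "Mammal", RDFS_SUBCLASS_OF, EX + "Animal"),
}


def rdfs_closure(triples):
    """Apply the two rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p != RDFS_SUBCLASS_OF:
                continue
            for s2, p2, o2 in inferred:
                # (s subClassOf o) and (o subClassOf o2) => (s subClassOf o2)
                if p2 == RDFS_SUBCLASS_OF and s2 == o:
                    new.add((s, RDFS_SUBCLASS_OF, o2))
                # (s2 type s) and (s subClassOf o) => (s2 type o)
                if p2 == RDF_TYPE and o2 == s:
                    new.add((s2, RDF_TYPE, o))
        if new <= inferred:
            return inferred
        inferred |= new


for triple in sorted(rdfs_closure(asserted) - asserted):
    print(triple)
```

The printed triples are the ones that were never asserted but follow from the asserted facts, e.g. that `rex` is also an `Animal`.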

## Semantic Web/Application Architecture

How to use the SW in your applications?

• tools, storage, parsers/serializers, etc
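To make the storage and serializer parts concrete, here is a toy sketch: an in-memory set of triples, a pattern-matching lookup, and a serializer that writes N-Triples-style lines. The `Literal`, `match` and `to_ntriples` names are hypothetical, and the escaping covers only the most common cases; a real application would normally use an existing RDF library and triple store.

```python
class Literal(str):
    """Marks a value as a plain literal rather than a URI."""


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def term(value):
    """Write a URI as <...> and a literal as a quoted, escaped string."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        return '"' + escaped + '"'
    return "<" + value + ">"


def to_ntriples(triples):
    """Serialize triples one per line, each terminated by ' .'."""
    return "\n".join(" ".join(term(v) for v in t) + " ." for t in sorted(triples))


store = {
    ("http://example.org/alice", "http://example.org/vocab#name", Literal("Alice")),
    ("http://example.org/alice", "http://example.org/vocab#knows", "http://example.org/bob"),
}

print(match(store, p="http://example.org/vocab#knows"))
print(to_ntriples(store))
```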