Latest revision as of 15:46, 23 November 2015
Mediator
Mediation is an approach to Data Integration (the opposite of Data Warehousing)
- data remains in the data sources (so it's sometimes called "virtual data integration")
- also better if you want to access "fresh" data
- but much harder to implement: data must be transformed at query time

Architecture
Global Schema
- start by designing a global (mediated) schema - a unique entry point for all the queries
- need to have semantic mappings between the mediated schema and data sources
Querying:
- user queries the global schema
- based on the mappings, queries are converted to local queries for the data sources
- all queries are executed
- then the results are combined (e.g. using some Ontologies - which is why this approach is useful for OBDA)
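The querying steps above can be sketched in a few lines of Python. This is only a minimal illustration, assuming in-memory sources and trivial mappings; the names `SOURCES`, `MAPPINGS`, `rewrite`, `execute`, `combine` and `answer`, as well as the data, are hypothetical and not taken from the book.

```python
# Hypothetical in-memory "data sources": each exposes one local relation
# as a list of dict rows. Names and data are made up for illustration.
SOURCES = {
    "S1": [{"title": "Alien", "year": 1979}],
    "S2": [{"title": "Brazil", "year": 1985}, {"title": "Alien", "year": 1979}],
}

# Mappings: each global relation is answered by a list of local queries,
# here simply (source name, row filter) pairs.
MAPPINGS = {
    "Movie": [("S1", lambda row: True), ("S2", lambda row: True)],
}

def rewrite(global_relation):
    """Convert a query on the global schema into local queries."""
    return MAPPINGS[global_relation]

def execute(local_query):
    """Run one local query against its data source."""
    source, keep = local_query
    return [row for row in SOURCES[source] if keep(row)]

def combine(results):
    """Merge local results, dropping duplicate rows."""
    seen, merged = set(), []
    for rows in results:
        for row in rows:
            key = tuple(sorted(row.items()))
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged

def answer(global_relation):
    """User query on the global schema -> rewrite -> execute -> combine."""
    local_queries = rewrite(global_relation)
    return combine(execute(q) for q in local_queries)

movies = answer("Movie")
```

Here the data stays in `SOURCES` and is only read when `answer` is called, which is the "virtual" part of the approach. A real mediator would rewrite arbitrary queries, not just look up a relation name.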
Semantic Mapping
Let $S_1, ..., S_n$ be local schemas
- assume that each $S_i$ has only one relation, also denoted $S_i$
- these $S_1, ..., S_n$ are local relations
Global schema $G$
- consists of global relations $G_1, ..., G_m$
Goal of Semantic Mappings:
- specify mappings between $\{ S_i \}$ and $\{ G_i \}$ - relationships between local and global schemas
- examples of such relationships:
- $G_1 \equiv S_1$ - equality
- $G_2 \equiv S_1 \cup S_2$
- $G_3 \equiv S_1 \Join S_3$
- so global $G_j$ can be seen as views of local relations (example of GAV Mediation)
But it's better to use containment instead of equality
- this makes it possible to express that several sources contribute to the same global relation
- example (GAV Mediation)
- $G_3 \supseteq S_1 \Join S_3$
- $G_3 \supseteq \sigma_{A = \text{yes}} ( S_4 )$
- example (LAV Mediation)
- $S_4 \subseteq G_1 \Join G_3$
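To make the difference between equality and containment concrete, here is a small sketch that models relations as Python sets of tuples. The attribute layout (`id` first, `A` last), the helper names `join` and `select_a_yes`, and all the data are assumptions made up for the illustration.

```python
# Relations as sets of tuples; attribute order is fixed by convention.
# All data below is made up for illustration.
S1 = {(1, "a"), (2, "b")}                                  # S1(id, x)
S2 = {(3, "c")}                                            # S2(id, x)
S3 = {(1, "yes"), (2, "no")}                               # S3(id, A)
S4 = {(1, "a", "yes"), (9, "z", "yes"), (8, "q", "no")}    # S4(id, x, A)

def join(r, s):
    """Natural join of r(id, x) and s(id, A) on id."""
    return {(rid, x, a) for (rid, x) in r for (sid, a) in s if rid == sid}

def select_a_yes(r):
    """sigma_{A = yes}: keep rows whose last attribute is 'yes'."""
    return {row for row in r if row[-1] == "yes"}

# Equality mappings fix the global relation completely.
G1 = set(S1)          # G1 = S1
G2 = S1 | S2          # G2 = S1 union S2

# Containment mappings only give lower bounds on G3. The smallest G3
# satisfying both constraints is the union of the two right-hand sides.
G3 = join(S1, S3) | select_a_yes(S4)

# Both containment constraints hold for this G3.
assert join(S1, S3) <= G3
assert select_a_yes(S4) <= G3
```

With equality, the second constraint on $G_3$ could not be added without contradicting the first one; with containment, each new source just adds more tuples.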
Notation
- $v(S_1, ..., S_n)$ - local view (a view/query built on local schemas)
- $v(G_1, ..., G_m)$ - global view (on global schemas)
Mediation Approaches
GAV (Global-As-View)
- the global schema is constrained by views of the local relations
GAV mappings have the following form
- $v_i(S_1, ..., S_n) \subseteq G_i$
- or, equivalently, $G_i \supseteq v_i(S_1, ..., S_n)$
LAV (Local-As-View)
- the contribution of each data source $S_i$ is specified independently of the other data sources
- typical for Semantic Web based systems
LAV mappings have the following form
- $S_i \subseteq v_i(G_1, ..., G_m)$
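A mapping of the form $S_i \subseteq v_i(G_1, ..., G_m)$ describes a source in terms of the global relations, so it can be read as a constraint that any global instance must satisfy. The sketch below checks such a constraint for the earlier example $S_4 \subseteq G_1 \Join G_3$. The function names `view` and `satisfies_mapping`, the attribute layout and the data are hypothetical.

```python
# Hypothetical global instance: G1(id, x) and G3(id, A).
G1 = {(1, "a"), (2, "b")}
G3 = {(1, "yes"), (2, "no")}

def view(g1, g3):
    """v(G1, G3) = G1 join G3 on id, producing (id, x, A) tuples."""
    return {(i, x, a) for (i, x) in g1 for (j, a) in g3 if i == j}

def satisfies_mapping(source, g1, g3):
    """Check the mapping: source ⊆ v(G1, G3)."""
    return source <= view(g1, g3)

# A source may hold only part of the view and still satisfy the mapping.
S4 = {(1, "a", "yes")}

# A source tuple that the view cannot produce violates the mapping.
S4_bad = {(3, "c", "yes")}

ok = satisfies_mapping(S4, G1, G3)
bad = satisfies_mapping(S4_bad, G1, G3)
```

Note that the check only involves one source and the global relations, so adding another source means adding another such constraint without touching the existing ones.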
Main algorithms for query rewriting in LAV Mediation:
Discussion
- all these algorithms have the same complexity
- but experiments (from the book) show that Minicon outperforms the others
- no algorithm handles additional knowledge (ontologies)
Ontology Based Data Access
- Typically LAV is used along with OBDA

Sources
- Web Data Management (book), http://webdam.inria.fr/Jorge