Mediator

Mediation is an approach to Data Integration (the opposite of Data Warehousing)

  • data remains in the data sources (so it's sometimes called "virtual data integration")
  • a better fit when you need access to "fresh" data
  • but much harder to implement: the data has to be transformed at query time
  • (figure: architecture-mediator.png)


Architecture

Global Schema

  • start by designing a global (mediated) schema: a unique entry point for all the queries
  • need to have semantic mappings between the mediated schema and data sources


Querying:

  • user queries the global schema
  • based on the mappings, the query is converted to local queries for the data sources
  • the local queries are executed at the sources
  • then the results are combined (e.g. with the help of ontologies, which is why this approach is useful for OBDA)
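
The four querying steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the relation name `Person`, the sources `S1` and `S2`, their rows, and the helper `query_global` are all hypothetical, and "combining" is reduced to a plain set union.

```python
from typing import Callable, Dict, List, Set, Tuple

Row = Tuple[str, ...]
Sources = Dict[str, Set[Row]]

# Hypothetical local sources: each one exposes a single relation as a set of rows.
SOURCES: Sources = {
    "S1": {("alice", "paris"), ("bob", "rome")},
    "S2": {("carol", "oslo"), ("alice", "paris")},
}

# Hypothetical mappings: for each global relation, the local queries that feed it.
MAPPINGS: Dict[str, List[Callable[[Sources], Set[Row]]]] = {
    "Person": [lambda src: src["S1"], lambda src: src["S2"]],
}


def query_global(relation: str, predicate: Callable[[Row], bool]) -> Set[Row]:
    """Answer a selection query posed against the global schema."""
    local_queries = MAPPINGS[relation]                 # rewrite using the mappings
    partial = [run(SOURCES) for run in local_queries]  # execute the local queries
    combined = set().union(*partial)                   # combine the results
    return {row for row in combined if predicate(row)}


# The user only ever refers to the global relation "Person".
in_paris = query_global("Person", lambda row: row[1] == "paris")
print(in_paris)
```

Note that the duplicate row stored in both sources appears only once in the answer, because the results are combined as a set.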


Semantic Mapping

Let $S_1, ..., S_n$ be local schemas

  • assume that each $S_i$ has only one relation, also denoted $S_i$
  • these $S_1, ..., S_n$ are local relations

Global schema $G$

  • consists of global relations $G_1, ..., G_m$

Goal of Semantic Mappings:

  • specify mappings between $\{ S_i \}$ and $\{ G_j \}$, i.e. the relationships between local and global schemas
  • examples of such relationships:
    • $G_1 \equiv S_1$ - equality
    • $G_2 \equiv S_1 \cup S_2$
    • $G_3 \equiv S_1 \Join S_3$
  • so each global relation $G_j$ can be seen as a view over the local relations (an example of GAV Mediation)


But it is better to use containment instead of equality

  • this lets several sources contribute to the same global relation
  • example (GAV Mediation)
    • $G_3 \supseteq S_1 \Join S_3$
    • $G_3 \supseteq \sigma_{A = \text{yes}} ( S_4 )$
  • example (LAV Mediation)
    • $S_4 \subseteq G_1 \Join G_3$
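
To make the difference between equality and containment concrete, here is a minimal sketch that models relations as Python sets of tuples. The rows, the attribute layout, and the helpers `join` and `select` are assumptions made for the illustration; in particular, $S_4$ is given three attributes so that its rows fit the same global relation as the join result.

```python
# Hypothetical local relations, modelled as sets of tuples.
S1 = {(1, "a"), (2, "b")}
S2 = {(3, "c")}
S3 = {(1, "x"), (3, "z")}
S4 = {(4, "d", "yes"), (5, "e", "no")}  # third attribute plays the role of A


def join(left, right):
    """Join two binary relations on their first attribute."""
    return {(k1, v1, v2) for (k1, v1) in left for (k2, v2) in right if k1 == k2}


def select(relation, predicate):
    """Keep only the rows that satisfy the predicate."""
    return {row for row in relation if predicate(row)}


# Equality mappings: the global relation is exactly the view.
G1 = set(S1)   # G_1 = S_1
G2 = S1 | S2   # G_2 = S_1 union S_2

# Containment mappings: each view only contributes rows to G_3.
contribution_join = join(S1, S3)
contribution_s4 = select(S4, lambda row: row[2] == "yes")

# The smallest G_3 that satisfies both containments is their union.
G3 = contribution_join | contribution_s4
assert contribution_join <= G3 and contribution_s4 <= G3
print(G3)
```

With equality, a second mapping for $G_3$ would contradict the first one unless both views returned the same rows; with containment, both can hold at once.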


Notation

  • $v(S_1, ..., S_n)$ - local view (a view/query built on the local schemas)
  • $v(G_1, ..., G_m)$ - global view (a view/query built on the global schema)


Mediation Approaches

GAV Mediation - Global-as-View

The global schema is constrained by views over the local relations

GAV mappings have the following form

  • $v_i(S_1, ..., S_n) \subseteq G_i$
  • or, equivalently, $G_i \supseteq v_i(S_1, ..., S_n)$
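
With GAV mappings, a query over the global schema can be rewritten by replacing each global relation with the union of the local views that contribute to it (often called unfolding). The sketch below does this symbolically. The expression encoding as nested tuples, the `GAV` dictionary, and the function `unfold` are hypothetical choices for this illustration; the view definitions mirror the examples from the Semantic Mapping section.

```python
# Expressions are nested tuples:
#   ("rel", name), ("select", condition, expr),
#   ("join", left, right), ("union", left, right)

# Hypothetical GAV mappings: global relation -> local views contained in it.
GAV = {
    "G2": [("rel", "S1"), ("rel", "S2")],
    "G3": [
        ("join", ("rel", "S1"), ("rel", "S3")),
        ("select", "A = yes", ("rel", "S4")),
    ],
}


def unfold(expr):
    """Replace every global relation by the union of its local views."""
    op = expr[0]
    if op == "rel":
        name = expr[1]
        if name not in GAV:
            return expr  # already a local relation
        views = GAV[name]
        result = views[0]
        for view in views[1:]:
            result = ("union", result, view)
        return result
    if op == "select":
        return ("select", expr[1], unfold(expr[2]))
    # "join" and "union" are binary: unfold both operands
    return (op, unfold(expr[1]), unfold(expr[2]))


query = ("select", "B = 1", ("rel", "G2"))
print(unfold(query))
```

The unfolded expression mentions only local relations, so it can be split into local queries and sent to the sources.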


LAV Mediation - Local-as-View

  • the contribution of each data source $S_i$ is specified independently of the other data sources
  • typical for Semantic Web based systems

LAV mappings have the following form

  • $S_i \subseteq v_i(G_1, ..., G_m)$
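
A small worked example of how such a mapping is used for answering queries. The mapping is the LAV example from the Semantic Mapping section; the user query $q$ and the rewriting $q'$ are hypothetical and chosen to keep the reasoning to one step.

```latex
% LAV mapping: the source S_4 is contained in a view over the global schema
S_4 \subseteq G_1 \Join G_3

% Hypothetical user query over the global schema
q = G_1 \Join G_3

% Candidate rewriting that uses only local relations
q' = S_4

% Soundness follows directly from the mapping
q' = S_4 \subseteq G_1 \Join G_3 = q
```

So every tuple returned by $q'$ is an answer to $q$, but $q'$ may miss answers that are not stored in $S_4$. In general the rewriting has to be searched for, since the mappings describe sources in terms of the global schema and not the other way round.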


Main algorithms for query rewriting in LAV Mediation:

Discussion

  • all these algorithms have the same complexity
  • but experiments (from the book) show that MiniCon outperforms the others
  • no algorithm handles additional knowledge (ontologies)


Ontology Based Data Access

  • Typically LAV is used along with OBDA
  • (figure: semantic-web-data-access.png)

