Latest revision as of 15:48, 23 November 2015
LAV Mediation
There are two main approaches to mediation in Data Integration:
- GAV Mediation - defining global relations in terms of local ones
- LAV Mediation - defining local relations in terms of global ones
LAV - Local-as-View Mediation
- local relations are defined as views (queries) over global relations
- goal: define the global schema in such a way that the individual source definitions don't change when new data sources are added or old ones are removed
- See some notation in Mediator (Data Integration)
LAV Mapping
- a LAV mapping has the form $S \subseteq Q$, where $S$ is a local (source) relation and $Q$ is a Conjunctive Query $Q(\vec{x}) \leftarrow A_1(\vec{u}_1), \ ..., \ A_k(\vec{u}_k)$ over the global relations
- this gives loose-coupling between global and local schemas
FOL Semantics:
- $\forall x_1, ..., x_n \Big[ S(x_1, ..., x_n) \Rightarrow \exists \ y_1, ..., y_m \ : \ A_1(\vec{u}_1) \ \land \ ... \ \land \ A_k(\vec{u}_k) \Big]$
- $S(x_1, ..., x_n)$ - head of a view
- $y_1, ..., y_m$ - existential variables
- $A_1(\vec{u}_1) \ \land \ ... \ \land \ A_k(\vec{u}_k)$ - body
Example
Suppose we have this global schema
- Student(studentName),
- EuropeanStudent(studentName),
- University(uniName),
- NonEuropeanStudent(studentName),
- FrenchUniversity(uniName),
- EuropeanUniversity(uniName),
- NonEuropeanUniversity(uniName),
- Program(title),
- MasterProgram(title),
- EnrolledInProgram(studentName, title),
- Course(code),
- EnrolledInCourse(studentName, code),
- PartOf(code, title),
- RegisteredTo(studentName, uniName),
- OfferedBy(title, uniName).
Data sources from the previous examples
- S1.Catalogue(nomUniv, programme). - programs in French universities
- S2.Erasmus(student, course, univ). - European Erasmus students
- S3.CampusFr(student, program, university). - foreign students in France
- S4.Mundus(program, course). - international master programs
LAV Mappings:
- $m_1$: S1.Catalogue(U, P) $\subseteq$ FrenchUniversity(U), Program(P), OfferedBy(P, U), OfferedBy(P', U), MasterProgram(P')
- $m_2$: S2.Erasmus(S, C, U) $\subseteq$ Student(S), EnrolledInCourse(S, C), PartOf(C, P), OfferedBy(P, U), EuropeanUniversity(U), EuropeanUniversity(U'), RegisteredTo(S, U'), U $\neq$ U'
- $m_3$: S3.CampusFr(S, P, U) $\subseteq$ NonEuropeanStudent(S), Program(P), EnrolledInProgram(S, P), OfferedBy(P, U), FrenchUniversity(U), RegisteredTo(S, U)
- $m_4$: S4.Mundus(P, C) $\subseteq$ MasterProgram(P), OfferedBy(P, U), OfferedBy(P, U'), EuropeanUniversity(U), NonEuropeanUniversity(U'), PartOf(C, P)
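As a worked instance of the FOL semantics given earlier, mapping $m_4$ can be unfolded as follows. This is only a direct application of the general formula: the head variables $P, C$ are universally quantified, and the variables that occur only in the body ($U$ and $U'$) become existential.

```latex
\forall P, C \ \Big[ \text{S4.Mundus}(P, C) \Rightarrow \exists \ U, U' \ : \
  \text{MasterProgram}(P) \ \land \ \text{OfferedBy}(P, U) \ \land \ \text{OfferedBy}(P, U') \ \land \
  \text{EuropeanUniversity}(U) \ \land \ \text{NonEuropeanUniversity}(U') \ \land \ \text{PartOf}(C, P) \Big]
```

In words: every (program, course) pair stored in S4.Mundus must correspond to a master program offered by at least one European and at least one non-European university, with the course being part of that program.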
So,
- a LAV mapping can be seen as a description of the data source in terms of the global schema
- for example, Erasmus students ($m_2$) are
- students registered at a European university (their home university)
- enrolled in a course of a program offered by a European university
- that European university is different from their home university
- they remain registered in their home university
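To make the "source description" reading concrete, here is a minimal sketch of how one could check that a source instance is consistent with a LAV mapping over a given global instance. It assumes a hypothetical encoding (atoms as `(predicate, arguments)` pairs, the global instance as a dict of tuple sets) and made-up sample data; the function name `satisfies` is not from the source.

```python
# Body of mapping m_4, encoded as (predicate, variables) pairs (hypothetical encoding).
M4_BODY = [
    ("MasterProgram", ("P",)),
    ("OfferedBy", ("P", "U")),
    ("OfferedBy", ("P", "U'")),
    ("EuropeanUniversity", ("U",)),
    ("NonEuropeanUniversity", ("U'",)),
    ("PartOf", ("C", "P")),
]

# Made-up global instance, for illustration only.
GLOBAL_DB = {
    "MasterProgram": {("DataSci",)},
    "OfferedBy": {("DataSci", "ParisU"), ("DataSci", "TokyoU")},
    "EuropeanUniversity": {("ParisU",)},
    "NonEuropeanUniversity": {("TokyoU",)},
    "PartOf": {("DB101", "DataSci")},
}


def satisfies(source_tuples, head_vars, body, global_db):
    """Return True if every source tuple can be extended to a match of the
    whole body in global_db; variables not in head_vars act as existentials."""

    def extend(binding, atoms):
        if not atoms:
            return True  # all body atoms matched
        (pred, args), rest = atoms[0], atoms[1:]
        for fact in global_db.get(pred, set()):
            new = dict(binding)
            # Bind unbound variables, reject facts that clash with bound ones.
            if all(new.setdefault(a, v) == v for a, v in zip(args, fact)):
                if extend(new, rest):
                    return True
        return False

    return all(extend(dict(zip(head_vars, t)), body) for t in source_tuples)


print(satisfies({("DataSci", "DB101")}, ("P", "C"), M4_BODY, GLOBAL_DB))  # True
print(satisfies({("DataSci", "ML202")}, ("P", "C"), M4_BODY, GLOBAL_DB))  # False
```

The second call fails because the global instance has no `PartOf` fact for the course `ML202`, so the tuple has no witness for the body of $m_4$.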
Loose-Coupling
- This gives loose-coupling between local and global relations
- which is important when participating data sources change frequently
Query Answering
Suppose we're interested in Master students
- define the following query
- $\text{MasterStudent}(E) \leftarrow \text{Student}(E), \text{EnrolledInProgram}(E, M), \text{MasterProgram}(M).$
- how to find which data sources to query?
- the rewriting process is more complex than for GAV, because the sources are not defined directly in terms of the query's relations
Algorithms to do that
Sources
- Web Data Management (book), http://webdam.inria.fr/Jorge