Data Integration
Goal of Data Integration - provide uniform access to heterogeneous data sources in some domain.
Main approaches
Data Warehousing
- data from all data sources is extracted, transformed, and loaded (ETL) into one central warehouse
- queries are issued against this central store, not against the original sources
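
The warehousing approach can be sketched roughly as follows. This is a minimal illustration, not a real ETL pipeline: the two sources, their schemas, the `person` table, and the function names `etl_into_warehouse` and `query_warehouse` are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Two hypothetical sources with different schemas (illustrative data only).
SOURCE_A = [{"name": "Ada", "city": "London"}]
SOURCE_B = [("Grace", "New York")]  # (full_name, location) tuples


def etl_into_warehouse(conn):
    """Extract rows from each source, transform them to one schema, load them."""
    conn.execute("CREATE TABLE person (name TEXT, city TEXT, origin TEXT)")
    rows = [(r["name"], r["city"], "A") for r in SOURCE_A]
    rows += [(full_name, location, "B") for full_name, location in SOURCE_B]
    conn.executemany("INSERT INTO person VALUES (?, ?, ?)", rows)
    conn.commit()


def query_warehouse(conn, city):
    """Queries touch only the warehouse, never the original sources."""
    cur = conn.execute(
        "SELECT name FROM person WHERE city = ? ORDER BY name", (city,)
    )
    return [row[0] for row in cur.fetchall()]


warehouse = sqlite3.connect(":memory:")  # stand-in for the central warehouse
etl_into_warehouse(warehouse)
london_names = query_warehouse(warehouse, "London")
```

Because the data is copied once, later changes in a source are invisible to queries until the ETL step runs again.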
Mediator
- data remains in the data sources
- sometimes called “virtual data integration”
- better suited to the Web: there are many databases, and we want to find an answer regardless of which database provides it
- so it can be the preferred approach for Ontology-Based Data Access (OBDA)
- also better if you want to access “fresh” data
- but considerably harder to implement: data must be transformed at query time
- this transformation relies on ontologies rather than ETL jobs
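
The mediator idea can be sketched as follows. This is a simplified illustration under stated assumptions: the sources, the per-source wrapper functions `wrapper_a` and `wrapper_b`, and `mediator_query` are hypothetical names, and hand-written wrappers stand in for the ontology-based mappings a real system would use.

```python
# Two hypothetical sources that keep their own schemas (illustrative data only).
SOURCE_A = [{"name": "Ada", "city": "London"}]
SOURCE_B = [("Grace", "New York")]  # (full_name, location) tuples


def wrapper_a(city):
    """Answer the global query 'people in city' against source A's schema."""
    return [r["name"] for r in SOURCE_A if r["city"] == city]


def wrapper_b(city):
    """Answer the same global query against source B's tuple layout."""
    return [full_name for full_name, location in SOURCE_B if location == city]


WRAPPERS = [wrapper_a, wrapper_b]


def mediator_query(city):
    """Fan the query out to every source at query time and merge the answers."""
    results = []
    for wrapper in WRAPPERS:
        results.extend(wrapper(city))
    return sorted(results)


before = mediator_query("London")
SOURCE_B.append(("Alan", "London"))  # a source changes after setup
after = mediator_query("London")     # the change is visible immediately
```

No data is copied, so the second query sees the new row without any reload step. The cost is that every query pays for the per-source translation.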
See Also
Links
- http://en.wikipedia.org/wiki/Data_integration
Source
- Web Data Management book [http://webdam.inria.fr/Jorge]