ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Data Integration

Goal of data integration: provide uniform access to heterogeneous data sources in some domain.

Main approaches

Data Warehousing

  • data from all data sources is extracted, transformed, and loaded (ETL) into one central warehouse
  • queries are issued against this consolidated store
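
As a rough illustration of the warehousing approach, here is a minimal sketch. It assumes two made-up sources (`SOURCE_A`, `SOURCE_B`) with different schemas and uses an in-memory SQLite database as a stand-in for the warehouse; the table name, column names and helper functions are hypothetical, not taken from the book.

```python
import sqlite3

# Two hypothetical sources describing books with different schemas.
SOURCE_A = [{"title": "Dune", "price_usd": 10.0},
            {"title": "Emma", "price_usd": 7.5}]
SOURCE_B = [("Ulysses", 1200)]  # (name, price in cents)


def extract_transform():
    """Extract rows from both sources and map them to the
    warehouse schema: (title, price_usd, source)."""
    for rec in SOURCE_A:
        yield rec["title"], rec["price_usd"], "A"
    for name, cents in SOURCE_B:
        yield name, cents / 100.0, "B"  # convert cents to dollars


def build_warehouse():
    """Load the transformed rows into one central store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (title TEXT, price_usd REAL, source TEXT)"
    )
    conn.executemany("INSERT INTO books VALUES (?, ?, ?)",
                     extract_transform())
    conn.commit()
    return conn


def cheap_books(conn, max_price):
    """Query the warehouse only; the sources are not contacted."""
    rows = conn.execute(
        "SELECT title FROM books WHERE price_usd <= ? ORDER BY title",
        (max_price,),
    )
    return [title for (title,) in rows]


warehouse = build_warehouse()
print(cheap_books(warehouse, 10.0))
```

In this sketch the transformation happens once, at load time, so queries are simple. The trade-off is that changes made to a source after loading are not visible until the ETL step runs again.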

Mediator

  • data remains in the data sources
  • sometimes called “virtual data integration”
  • better suited to the Web: there are many databases, and we want to find an answer no matter which one provides it
  • also better when you need access to “fresh” data
  • but much harder to implement: data has to be transformed at query time
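
The mediator approach can be sketched in a similar way. This is a minimal illustration under assumed names: the sources `SOURCE_A` and `SOURCE_B`, the per-source wrapper functions, and `mediator_query` are all hypothetical. The point is only that the data stays in the sources and is translated to a common schema each time a query runs.

```python
# Hypothetical sources; the data stays here and may change at any time.
SOURCE_A = [{"title": "Dune", "price_usd": 10.0}]
SOURCE_B = [("Ulysses", 1200)]  # (name, price in cents)


def wrapper_a(max_price):
    """Answer the query against source A in the mediated schema
    (title, price_usd)."""
    return [(r["title"], r["price_usd"])
            for r in SOURCE_A if r["price_usd"] <= max_price]


def wrapper_b(max_price):
    """Rewrite the query into source B's units (cents), then
    translate the matching rows back to the mediated schema."""
    max_cents = max_price * 100
    return [(name, cents / 100.0)
            for name, cents in SOURCE_B if cents <= max_cents]


WRAPPERS = [wrapper_a, wrapper_b]


def mediator_query(max_price):
    """Forward the query to every wrapper at query time and
    merge the answers; nothing is copied or stored centrally."""
    results = []
    for wrapper in WRAPPERS:
        results.extend(wrapper(max_price))
    return sorted(results)


print(mediator_query(12.0))

# A later change in a source is visible to the next query.
SOURCE_A.append({"title": "Emma", "price_usd": 7.5})
print(mediator_query(8.0))
```

Here every query pays the cost of contacting and translating each source, which is where the extra implementation difficulty comes from, but answers always reflect the current state of the sources.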

See Also

  • http://en.wikipedia.org/wiki/Data_integration

Source

  • Web Data Management book [http://webdam.inria.fr/Jorge]