Data Integration

The goal of data integration is to provide uniform access to heterogeneous data sources in some domain.

Main approaches

Data Warehousing

  • data from all data sources is extracted, transformed and loaded into one central warehouse (using ETL processes)
  • queries are issued against this central storage, not against the original sources
  • architecture-dwh.png
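The warehousing approach above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two in-memory sources, their schemas, the `books` table and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Two hypothetical sources with different schemas (assumed for illustration).
source_a = [{"title": "Dune", "price_usd": 9.99}]
source_b = [("Neuromancer", 750)]  # (name, price in cents)

def extract_transform():
    """Extract records from each source and map them to one schema."""
    for row in source_a:
        yield row["title"], row["price_usd"], "a"
    for name, cents in source_b:
        yield name, cents / 100, "b"  # convert cents to dollars

def build_warehouse():
    """Load: materialize the unified records in one store, before query time."""
    wh = sqlite3.connect(":memory:")
    wh.execute("CREATE TABLE books (title TEXT, price_usd REAL, source TEXT)")
    wh.executemany("INSERT INTO books VALUES (?, ?, ?)", extract_transform())
    wh.commit()
    return wh

warehouse = build_warehouse()

# Queries touch only the warehouse, never the original sources.
cheap = warehouse.execute(
    "SELECT title FROM books WHERE price_usd < 8 ORDER BY title"
).fetchall()
print(cheap)
```

In this sketch, later changes to the sources are not visible in the warehouse until the load step is run again.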


Mediation

  • data remains in the data sources
  • sometimes called "virtual data integration"
  • better suited to the Web: there are many databases, and we would like to find something no matter which database provides it
  • also better when access to "fresh" data is needed, since the sources are queried directly
  • but considerably harder to implement: queries and data have to be translated between schemas at query time
  • architecture-mediator.png
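The virtual approach above can be sketched in the same way. This is a minimal illustration under stated assumptions, not a full mediator: the sources, the wrapper functions and `mediator_query` are hypothetical names, and real sources would typically be remote databases or web services.

```python
# Hypothetical sources with different schemas (assumed for illustration).
source_a = [{"title": "Dune", "price_usd": 9.99}]
source_b = [("Neuromancer", 750)]  # (name, price in cents)

def wrapper_a(max_price_usd):
    """Answer the query against source A, returning rows in the global schema."""
    return [
        {"title": r["title"], "price_usd": r["price_usd"]}
        for r in source_a
        if r["price_usd"] <= max_price_usd
    ]

def wrapper_b(max_price_usd):
    """Rewrite the query into source B's terms (cents), then convert the results."""
    max_cents = max_price_usd * 100
    return [
        {"title": name, "price_usd": cents / 100}
        for name, cents in source_b
        if cents <= max_cents
    ]

def mediator_query(max_price_usd, wrappers=(wrapper_a, wrapper_b)):
    """Send the query to every wrapper at query time and merge the answers."""
    results = []
    for wrapper in wrappers:
        results.extend(wrapper(max_price_usd))
    return sorted(results, key=lambda r: r["title"])

before = mediator_query(8)
source_b.append(("Snow Crash", 600))  # a source changes...
after = mediator_query(8)             # ...and the next query already sees it
print([r["title"] for r in before], [r["title"] for r in after])
```

No data is copied ahead of time here, so results are always fresh, but every query pays the cost of translation and of contacting each source.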

See Also



  • Web Data Management book [1]