Data Integration

The goal of data integration is to provide uniform access to heterogeneous data sources in some domain.


Main approaches

Data Warehousing

  • data from all sources is extracted, transformed, and loaded (ETL) into one central warehouse
  • queries are issued against this consolidated storage, not against the original sources
  • figure: data warehouse architecture (architecture-dwh.png)
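
The warehousing approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sources, their schemas, the `books` table, and the helper names (`extract_transform`, `load`, `build_warehouse`, `cheap_books`) are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Two hypothetical sources describing the same domain (books)
# with different schemas and units.
SOURCE_A = [{"title": "Dune", "price_usd": 9.99}]
SOURCE_B = [("Neuromancer", 850)]  # (name, price in cents)


def extract_transform():
    """Extract rows from each source and map them to one unified schema."""
    for row in SOURCE_A:
        yield row["title"], row["price_usd"], "A"
    for name, cents in SOURCE_B:
        yield name, cents / 100, "B"  # convert cents to USD


def load(conn):
    """Load the unified rows into the warehouse table."""
    conn.execute("CREATE TABLE books (title TEXT, price_usd REAL, source TEXT)")
    conn.executemany("INSERT INTO books VALUES (?, ?, ?)", extract_transform())
    conn.commit()


def build_warehouse():
    """Run the whole ETL once and return a connection to the warehouse."""
    conn = sqlite3.connect(":memory:")
    load(conn)
    return conn


def cheap_books(conn, max_price):
    """Query the warehouse only; the original sources are not touched."""
    rows = conn.execute(
        "SELECT title FROM books WHERE price_usd <= ? ORDER BY title",
        (max_price,),
    )
    return [title for (title,) in rows]
```

Note that any change made to a source after `build_warehouse()` has run is invisible to `cheap_books` until the ETL is run again; this is the freshness trade-off of the approach.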


Mediator

  • data remains in the original sources; nothing is copied in advance
  • sometimes called "virtual data integration"
  • better suited to the Web: there are many databases, and we want to find an answer no matter which one provides it
  • also better when you need "fresh" data, since queries always hit the live sources
  • but much harder to implement: each query must be translated for every source, and the results transformed and combined, at query time
  • figure: mediator architecture (architecture-mediator.png)
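
A minimal sketch of the mediator idea, again in Python. The wrapper classes, the `Mediator` class, and the example sources are hypothetical names chosen for illustration; a real mediator would also need query planning and schema mappings that are out of scope here.

```python
class SourceAWrapper:
    """Wraps a hypothetical source that stores books as dicts with USD prices."""

    def __init__(self, rows):
        self.rows = rows

    def query(self, max_price_usd):
        return [
            (r["title"], r["price_usd"])
            for r in self.rows
            if r["price_usd"] <= max_price_usd
        ]


class SourceBWrapper:
    """Wraps a hypothetical source that stores (name, price_in_cents) tuples."""

    def __init__(self, rows):
        self.rows = rows

    def query(self, max_price_usd):
        # Translate the query into the source's own units at query time,
        # then translate the results back into the unified schema.
        max_cents = round(max_price_usd * 100)
        return [
            (name, cents / 100)
            for name, cents in self.rows
            if cents <= max_cents
        ]


class Mediator:
    """Exposes one unified query and fans it out to every wrapped source."""

    def __init__(self, wrappers):
        self.wrappers = wrappers

    def books_cheaper_than(self, max_price_usd):
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.query(max_price_usd))
        return sorted(results)
```

Because no data is copied, a row added to a source is visible on the very next call to `books_cheaper_than`. The cost is that every query pays for translation and for contacting each source.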

