ML Wiki
Machine Learning Wiki - A collection of ML concepts, algorithms, and resources.

Pipelining

Sometimes the output of one physical operator can be used directly as the input of another operator. This technique is called ‘‘pipelining’’.

  • output of an operator is stored in a buffer that serves as input for the next operator
  • results are computed as early as possible, that is, as soon as enough data is available
  • no need to wait until the previous operator finishes its work
  • can dramatically speed up the execution process
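The points above can be sketched with Python generators, where each generator plays the role of a one-row buffer between two operators. This is only an illustrative sketch: the operator names `scan` and `select`, the dictionary-based row format, and the `log` list used to record work are all assumptions made for the example, not part of any particular database system.

```python
from typing import Callable, Iterable, Iterator

Row = dict  # hypothetical row representation: column name -> value


def scan(rows: Iterable[Row], log: list) -> Iterator[Row]:
    """Hypothetical scan operator: hands rows downstream one at a time."""
    for row in rows:
        log.append(("scan", row["id"]))
        yield row


def select(pred: Callable[[Row], bool], child: Iterator[Row], log: list) -> Iterator[Row]:
    """Selection: emits each qualifying row as soon as it arrives."""
    for row in child:
        if pred(row):
            log.append(("select", row["id"]))
            yield row


def first_result():
    """Pull a single row from the plan and report what work was done for it."""
    log = []
    table = [{"id": 1, "x": 5}, {"id": 2, "x": 20}, {"id": 3, "x": 30}]
    plan = select(lambda r: r["x"] > 10, scan(table, log), log)
    first = next(plan)  # only as much input is read as this one result needs
    return first, log
```

Calling `first_result()` yields the row with `id` 2 while the row with `id` 3 has not been scanned at all: the selection produced its first output without waiting for the scan to finish.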

Operators

Operators that can usually be pipelined

  • projections
  • selections
  • renaming
  • bag-based union
  • merge joins whose inputs are known to be sorted

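An operator that cannot be pipelined is called ‘‘blocking’’: it has to consume its entire input before it can produce any output. Sorting is a typical example.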
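The difference between a pipelined and a blocking operator can be made visible by counting how many input rows are read before the first output row appears. The sketch below uses a sort as the blocking operator, which is a standard example but is chosen here for illustration; all function names are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def counting_source(values: Iterable[int], consumed: List[int]) -> Iterator[int]:
    """Yield values one by one and record each value that was actually read."""
    for v in values:
        consumed.append(v)
        yield v


def pipelined_select(child: Iterator[int], pred: Callable[[int], bool]) -> Iterator[int]:
    """Pipelined: forwards every qualifying value immediately."""
    for v in child:
        if pred(v):
            yield v


def blocking_sort(child: Iterator[int]) -> Iterator[int]:
    """Blocking: the smallest value may arrive last, so all input is drained first."""
    buffered = sorted(child)
    for v in buffered:
        yield v


def rows_read_before_first_output(make_op, values: Iterable[int]) -> int:
    """Build an operator over a counting source and pull exactly one result."""
    consumed: List[int] = []
    op = make_op(counting_source(values, consumed))
    next(op)
    return len(consumed)
```

With the input `[3, 1, 2]`, the selection reads one row before producing output, while the sort reads all three.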

Example

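[Image: query plan in which an index scan on $R$ feeds a filter, the filter feeds a union, and the union feeds a projection]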

  • output from index scan on $R$ can be pipelined to filter
  • filter output can be pipelined to union
  • union result can be pipelined to projection
  • all of this assumes that enough memory buffers are available
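The chain described in this example can be sketched as nested Python generators. Since the figure itself is not reproduced, several details are assumptions made purely for illustration: the second input of the union (called `S` here), the column names `a` and `b`, the filter predicate, and the representation of the index as a dictionary from key to rows.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = dict  # hypothetical row representation: column name -> value


def index_scan(index: Dict[int, List[Row]], lo: int, hi: int) -> Iterator[Row]:
    """Assumed index scan: yield rows whose key lies in [lo, hi], in key order."""
    for key in sorted(index):
        if lo <= key <= hi:
            yield from index[key]


def filter_rows(child: Iterable[Row], pred: Callable[[Row], bool]) -> Iterator[Row]:
    """Filter: pass on each row that satisfies the predicate."""
    for row in child:
        if pred(row):
            yield row


def bag_union(left: Iterable[Row], right: Iterable[Row]) -> Iterator[Row]:
    """Bag-based union: concatenate the inputs, keeping duplicates."""
    return chain(left, right)


def project(child: Iterable[Row], cols: List[str]) -> Iterator[Tuple]:
    """Projection: keep only the requested columns of each row."""
    for row in child:
        yield tuple(row[c] for c in cols)


def build_plan() -> Iterator[Tuple]:
    """Wire the operators together; no row moves until the result is consumed."""
    r_index = {1: [{"a": 1, "b": "x"}], 2: [{"a": 2, "b": "y"}], 5: [{"a": 5, "b": "z"}]}
    s_rows = [{"a": 7, "b": "y"}]  # assumed second input of the union
    filtered = filter_rows(index_scan(r_index, 1, 3), lambda r: r["b"] != "x")
    return project(bag_union(filtered, s_rows), ["b"])
```

Consuming `build_plan()` pulls rows through all four operators one at a time, and no operator holds a complete intermediate result. The bag-based union keeps duplicates, which is what allows it to forward rows without remembering earlier ones.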

Materialization

When we cannot pipeline, we have to ‘‘materialize’’ the intermediate results, that is, write every intermediate sub-result to disk.

  • the next operator cannot start working until its entire input has been materialized
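For contrast with pipelining, materialization can be sketched as writing an operator's complete output to a temporary file and only then letting the next operator read it back. The JSON-lines file format and the function names below are assumptions chosen for the sketch; a real system would use its own page-based storage.

```python
import json
import os
import tempfile
from typing import Callable, Iterable, Iterator, List

Row = dict  # hypothetical row representation: column name -> value


def materialize(rows: Iterable[Row]) -> str:
    """Write every intermediate row to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    with os.fdopen(fd, "w") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    return path


def read_materialized(path: str) -> Iterator[Row]:
    """Read the materialized rows back from disk, one per line."""
    with open(path) as f:
        for line in f:
            yield json.loads(line)


def run_materialized(rows: Iterable[Row], pred: Callable[[Row], bool]) -> List[int]:
    """Selection, then a projection on 'id', with a disk round trip in between."""
    # The selection runs to completion here, before the projection sees any row.
    path = materialize(r for r in rows if pred(r))
    try:
        return [r["id"] for r in read_materialized(path)]
    finally:
        os.remove(path)
```

The projection in `run_materialized` cannot begin until `materialize` has returned, so the whole intermediate result is written and read once more than in the pipelined case.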

Sources