# ML Wiki

## Pipelining

Sometimes the output of one physical operator can be used directly as input for another operator. This technique is called pipelining.

• output of an operator is stored in a buffer that serves as input for the next operator
• results are computed as early as possible, that is, as soon as enough data is available
• no need to wait until the previous operator finishes its work
• can speed up execution considerably, since intermediate results do not have to be written to disk and read back
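The points above can be sketched with Python generators, where each operator pulls rows from the previous one on demand. This is only an illustration, not how any particular database engine implements it: the operator names `scan` and `select`, the `log` list, and the sample table are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, int]
Log = List[Tuple[str, int]]


def scan(rows: Iterable[Row], log: Log) -> Iterator[Row]:
    """Hypothetical leaf operator: yields rows one at a time and records each read."""
    for row in rows:
        log.append(("scan", row["id"]))
        yield row


def select(rows: Iterable[Row], predicate: Callable[[Row], bool], log: Log) -> Iterator[Row]:
    """Hypothetical selection: passes a matching row on as soon as it arrives."""
    for row in rows:
        if predicate(row):
            log.append(("select", row["id"]))
            yield row


# Made-up sample data for the illustration.
table = [{"id": 1, "x": 5}, {"id": 2, "x": 50}, {"id": 3, "x": 70}]
log: Log = []

# Building the plan does no work yet; rows flow only when the consumer asks.
plan = select(scan(table, log), lambda r: r["x"] > 10, log)

# Asking for one result reads only as much input as is needed to produce it.
first = next(plan)
```

After `next(plan)`, the log shows that the scan has read rows 1 and 2 only: the selection emitted its first result without waiting for the scan to reach row 3.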

### Operators

Operators that can usually be pipelined:

• projections
• selections
• renaming
• bag-based union
• merge joins whose inputs are known to be sorted

An operator that cannot be pipelined is called blocking.
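To make the difference concrete, here is a small sketch that contrasts a pipelined operator with a blocking one. Sorting is used as the blocking operator; this is an assumption for the example, since the list above does not name one, but a sort has to see its whole input before it can emit its first row. All function names are hypothetical.

```python
from typing import Iterable, Iterator, List


def tracked_source(values: List[int], consumed: List[int]) -> Iterator[int]:
    """Yields values one by one and records which ones have been read so far."""
    for v in values:
        consumed.append(v)
        yield v


def pipelined_double(rows: Iterable[int]) -> Iterator[int]:
    """Pipelined: each output depends on a single input row."""
    for v in rows:
        yield v * 2


def blocking_sort(rows: Iterable[int]) -> Iterator[int]:
    """Blocking: the smallest row could be the last one, so all input is buffered first."""
    buffered = sorted(rows)
    yield from buffered


consumed_by_double: List[int] = []
first_doubled = next(pipelined_double(tracked_source([3, 1, 2], consumed_by_double)))

consumed_by_sort: List[int] = []
first_sorted = next(blocking_sort(tracked_source([3, 1, 2], consumed_by_sort)))
```

Producing the first output costs the pipelined operator one input row, while the sort has consumed all three rows by the time it returns anything.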

### Example

• output from index scan on $R$ can be pipelined to filter
• filter output can be pipelined to union
• union result can be pipelined to projection
• (given we have enough memory buffers available)
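The plan above can be sketched as a chain of Python generators. Several details are assumptions made only for this illustration: the index is simulated with a sorted key list and `bisect`, the second union input `S` is hypothetical (the example does not say what $R$ is unioned with), and the column names and predicate are made up.

```python
import bisect
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]


def index_scan(sorted_keys: List[int], rows_by_key: Dict[int, Row],
               low: int, high: int) -> Iterator[Row]:
    """Simulated index range scan: yields rows with low <= key <= high, in key order."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    for i in range(start, end):
        yield rows_by_key[sorted_keys[i]]


def filter_rows(rows: Iterable[Row], predicate: Callable[[Row], bool]) -> Iterator[Row]:
    """Selection: forwards each matching row immediately."""
    return (row for row in rows if predicate(row))


def bag_union(*inputs: Iterable[Row]) -> Iterator[Row]:
    """Bag-based union: concatenates inputs without removing duplicates."""
    return chain(*inputs)


def project(rows: Iterable[Row], columns: List[str]) -> Iterator[Tuple[object, ...]]:
    """Projection: keeps only the requested columns of each row."""
    for row in rows:
        yield tuple(row[c] for c in columns)


# Made-up relation R, indexed on "id", and a hypothetical second input S.
R = {
    1: {"id": 1, "name": "a", "age": 17},
    2: {"id": 2, "name": "b", "age": 25},
    3: {"id": 3, "name": "c", "age": 40},
    4: {"id": 4, "name": "d", "age": 31},
}
S = [{"id": 9, "name": "z", "age": 50}]

scanned = index_scan(sorted(R), R, low=2, high=4)
filtered = filter_rows(scanned, lambda r: r["age"] > 30)
unioned = bag_union(filtered, S)
result = list(project(unioned, ["id", "name"]))
```

No operator in this chain builds a full intermediate list; each row travels from the scan to the projection before the next one is read. Only the final `list(...)` call collects the output.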

## Materialization

When we cannot pipeline, we have to materialize the intermediate sub-results, that is, write all of them to disk.

• the next operator also cannot start working until everything is materialized
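A minimal sketch of materialization, assuming intermediate rows are written as JSON lines to a temporary file. The file format, the helper names `materialize` and `read_materialized`, and the sample rows are all assumptions for illustration; a real engine would use its own page format.

```python
import json
import os
import tempfile
from typing import Dict, Iterable, Iterator

Row = Dict[str, int]


def materialize(rows: Iterable[Row], path: str) -> int:
    """Writes every intermediate row to disk and returns how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
            count += 1
    return count


def read_materialized(path: str) -> Iterator[Row]:
    """Reads the stored intermediate result back for the next operator."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "intermediate.jsonl")

    # Step 1: the producing operator runs to completion and its output goes to disk.
    written = materialize(({"id": i, "sq": i * i} for i in range(5)), path)

    # Step 2: only now can the next operator start, reading everything back.
    result = [row["sq"] for row in read_materialized(path) if row["sq"] % 2 == 0]
```

Compared with pipelining, every row is written once and read once more, and the consumer stays idle for the whole of step 1.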
