Loading [MathJax]/jax/element/mml/optable/MathOperators.js

Relational Algebra

  • Relational Algebra repesents the operations on relations for Relational Databases
  • Relational Algebra is algebra that consists of operations for constructing new relations from given relations
    • (it's closed, i.e. each operation always produces another relation)
  • RA is not used as a query language, but usually SQL is translated to it in RDBMS


Relational Algebra Operators

Operations of traditional RA:

  • usual set operations (union, intersection, difference)
  • operations that remove some parts of the relation
selections eliminate tuples (rows)
projections eliminate attributes (columns)
  • operations to combine tuples of two relations: Cartesian product, joins, etc
  • renaming

Binary

  • Union U
  • Intersection
  • Difference

Unary

  • Projection
  • Join

Extended RA

  • bags semantic
  • duplicate elimination: d
  • Grouping and aggregation: γ
  • Sorting: t


Traditional RA Operations

Set-Based Union

R1R2

  • Both R1 and R2 must have the same schema
  • And the result is a relation with the same schema
  • The result contains elements that are in R1 or in R2

SQL:

SELECT * FROM R1
UNION
SELECT * FROM R2

Example:

$\begin{array}{ c | c }

 A & B \\
 \hline
 \hline
 a_1 & b_1 \\ 
 a_2 & b_1 \\

\end{array} \cup \begin{array}{ c | c }

 A & B \\
 \hline
 \hline
 a_1 & b_1 \\ 
 a_3 & b_3 \\

\end{array} = \begin{array}{ c | c }

 A & B \\
 \hline
 \hline
 a_1 & b_1 \\ 
 a_2 & b_1 \\
 a_3 & b_3 \\

\end{array} $


Set-Based Difference

R1R2

  • Both R1 and R2 must have the same schema
  • And the result is a relation with the same schema
  • The result contains elements that are in R1 but not in R2


SELECT * FROM R1
EXCEPT 
SELECT * FROM R2


Example:

  • $\begin{array}{ c | c }
 A & B \\
 \hline \hline
 a_1 & b_1 \\ 
 a_2 & b_1 \\

\end{array} \cup \begin{array}{ c | c }

 A & B \\
 \hline \hline
 a_1 & b_1 \\
 a_3 & b_3 \\

\end{array} = \begin{array}{ c | c }

 A & B \\
 \hline \hline
 a_2 & b_1 \\

\end{array} $


Set-Based Intersection

R1R2

  • Both R1 and R2 must have the same schema
  • And the result is a relation with the same schema
  • The result contains elements that are in R1 and in R2


SQL

SELECT * FROM R1
INTERSECT 
SELECT * FROM R2


Example

$\begin{array}{ c | c }

 A & B \\
 \hline
 \hline
 a_1 & b_1 \\ 
 a_2 & b_1 \\

\end{array} \cup \begin{array}{ c | c }

 A & B \\
 \hline
 \hline
 a_1 & b_1 \\ 
 a_3 & b_3 \\

\end{array} = \begin{array}{ c | c }

 A & B \\
 \hline
 \hline
 a_1 & b_1 \\ 

\end{array} $


Can be expressed via Union and Difference

  • R1R2=R1(R1R2)
  • R1R2=R1R2


Selection

A binary operator

  • Takes a relation R
  • Outputs all tuples of R that satisfy a certain condition θ
  • Attributes that take part in θ must be present in the relation R

in SQL:

SELECT ... FROM R WHERE {condition}

For example

  • σθ(Relation)
  • $\sigma_{A \geqslant 3} \left (

\begin{array}{ c | c }

 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array} \right) = \begin{array}{ c | c }

 A & B \\
 \hline \hline
 3 & 4 \\

\end{array} $


A condition can be anything

  • salary > 40000 (inequality)
  • name = "Smith" (equation)
  • etc


Set-Based Projection

πA1,...,An(R)

A binary operator

  • Takes a relation R
  • Outputs only specified attributes A1,...,An of R
  • All A1,...,An must be present in R

SQL:

SELECT A1, ..., An FROM R

example

  • πSSN,name(Employee)


Note that for set-based projection there are no duplicated in the output:

  • $\pi_{A, C} \left(

\begin{array}{ c | c | c | c}

 A & B & C & D \\
 \hline \hline
 1 & 2 & 3 & 4 \\ 
 1 & 2 & 3 & 5 \\
 3 & 4 & 5 & 6 \\
 5 & 6 & 3 & 4 \\

\end{array} \right) = \begin{array}{ c | c}

 A & C \\
 \hline \hline
 1 & 3 \\ 
 3 & 5 \\
 5 & 3 \\

\end{array}$


Cartesian Product

Sometimes also called "Cross-Product"

R1×R2

  • Result of R1×R2 is a new relation
  • in which each tuple in R1 concatenated with each tuple in R2
  • i.e. it outputs all possible combinations of tuples
  • R1 and R2 must have disjoint schema

SQL

SELECT * FROM R1, R2

Example:

  • $\begin{array}{ c | c}
 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array} \times \begin{array}{ c | c}

 C & D \\
 \hline \hline
 2 & 6 \\ 
 3 & 7 \\
 4 & 9 \\

\end{array} = \begin{array}{ c | c | c | c}

 A & B & C & D \\
 \hline \hline
 1 & 2 & 2 & 6 \\
 1 & 2 & 3 & 7 \\
 1 & 2 & 4 & 9 \\
 3 & 4 & 2 & 6 \\
 3 & 4 & 3 & 7 \\
 3 & 4 & 4 & 9 \\

\end{array} $


Natural Join

Or "Equi-Join"

RS

  • no requirements for schema for R and S
  • if they have one or more attributes in common, in the output tuples with same values will be matched
  • if they don't have attributes in common - the result is the same as R×S (Cartesian Product)

Example:

  • $\begin{array}{ c | c}
 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array} \Join \begin{array}{ c | c}

 B & D \\
 \hline \hline
 2 & 6 \\ 
 3 & 7 \\
 4 & 9 \\

\end{array} = \begin{array}{ c | c | c}

 A & B & D \\
 \hline \hline
 1 & 2 & 6 \\
 3 & 4 & 9 \\

\end{array} $

  • $\begin{array}{ c | c}
 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array} \Join \begin{array}{ c | c}

 C & D \\
 \hline \hline
 2 & 6 \\ 
 3 & 7 \\
 4 & 9 \\

\end{array} = \begin{array}{ c | c | c | c}

 A & B & C & D \\
 \hline \hline
 1 & 2 & 2 & 6 \\
 1 & 2 & 3 & 7 \\
 1 & 2 & 4 & 9 \\
 3 & 4 & 2 & 6 \\
 3 & 4 & 3 & 7 \\
 3 & 4 & 4 & 9 \\

\end{array} $

Theta Join

R1θR2

  • A join that involves some predicate θ
  • For all combinations of tuples from R1θR2, a tuple is output if θ holds for the combination
  • essentially is the same as Cartesian Product plus Selection

Examples

  • R1θR2=σΘR1×R2
  • $\begin{array}{ c | c}
 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array} \Join_{B = C} \begin{array}{ c | c}

 C & D \\
 \hline \hline
 2 & 6 \\ 
 3 & 7 \\
 4 & 9 \\

\end{array} = \begin{array}{ c | c | c | c}

 A & B & C & D \\
 \hline \hline
 1 & 2 & 2 & 6 \\
 3 & 4 & 4 & 9 \\

\end{array} $


Anti-Join

R ˉ S

  • if for a tuple in R there is a match in S, we do not output this tuple from R
  • R ˉ SR(RS)
  • The result is only tuples from R (the resulting schema is also the same as in R)


Difference between R ˉ S and RS (Difference):

  • for Difference RS both R and S need to have the same schema
  • for R ˉ S - any schema
  • if R and S have the same schema, then R ˉ SRS

Example

  • $\begin{array}{ c | c}
 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\
 5 & 6 \\
 6 & 7 \\

\end{array} \AntiJoin \begin{array}{ c | c | c }

 B & C & D \\
 \hline \hline
 2 & 5 & 6 \\ 
 6 & 4 & 2 \\

\end{array} = \begin{array}{ c | c }

 A & B \\
 \hline \hline
 3 & 4 \\
 6 & 7 \\

\end{array} $


Renaming

ρprefix(R)

  • takes all attributes of R and
  • produces a new relation with prefix. appended to all of them
  • so the resulting relation is the same, but with changed schema

SQL

SELECT * FROM Relation R

Example

  • $\rho_T \left(

\begin{array}{ c | c}

 A & B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array} \right) = \begin{array}{ c | c}

 T.A & T.B \\
 \hline \hline
 1 & 2 \\ 
 3 & 4 \\

\end{array}$


Examples

find all hospitals within 5 ms of a school


SELECT DISTINCT h.name 
FROM Hospital h, School s
WHERE distance(h.location, s.location) < 5

πname(ρh(Hospital)h.location = s.locationρs(School))


Outer Joins

...


Semi Joins

R

Extended RA

  • Adds additional operations to Transitional RA
  • Allows Bag semantics for operations

Sets vs Bags

  • Set of tuples: no duplicates allowed
  • Bag of tuples: there can be duplicates
  • In theory set semantics is usually assumed
  • But in implementation - bag semantics

Duplicates

  • Practically we don't care about duplicates
  • We remove them only when required (duplicate elimination: d)

RA has two semantics:

  • set semantics = traditional RA
  • bag semantics = extended RA

All set-based operations are straightforwardly extended to bags


Bag-Based Intersection, Difference, Union

Intersection

  • If the same tuple occurs twice in one relation,
  • It must also occur twice in the second relation
  • Then the result will also contain 2 tuples

Same idea with Difference and Union


Grouping

\gamma_{\text{grouping_attribute}, \ \text{func}(A) \ \to \ \text{name}}(R)

  • unary relation that takes R as input
  • first parameter (\text{grouping_attribute}) is attribute on which R will be grouped
  • function \text{func} is applied to each group, and the result is written to attribute \text{name}
  • NB: all other (non-mentioned) attributes are not output to the result!

Example

  • $\gamma_{A, \ \text{min}(B) \ \to \ D} \left(

\begin{array}{ c | c | c }

 A & B & C \\
 \hline \hline
 1 & 2 & a \\
 1 & 3 & b \\
 2 & 3 & c \\
 2 & 4 & a \\
 2 & 5 & d \\

\end{array} \right) = \begin{array}{ c | c | c }

 A & D \\
 \hline \hline
 1 & 2 \\
 2 & 3 \\

\end{array} $


Projection


In Extended RA we also can allow renaming in projection

  • $\pi_{A, C \to D} \left(

\begin{array}{ c | c | c | c}

 A & B & C & D \\
 \hline \hline
 1 & 2 & 2 & 6 \\
 1 & 2 & 2 & 7 \\
 1 & 2 & 2 & 9 \\
 3 & 4 & 3 & 6 \\
 3 & 4 & 3 & 7 \\
 3 & 4 & 3 & 9 \\

\end{array} \right) = \begin{array}{ c | c}

 A & D \\
 \hline \hline
 1 & 2 \\
 1 & 2 \\
 1 & 2 \\
 3 & 3 \\
 3 & 3 \\
 3 & 3 \\

\end{array} $ (bag semantics)

  • $\pi_{A, C \to D} \left(

\begin{array}{ c | c | c | c}

 A & B & C & D \\
 \hline \hline
 1 & 2 & 2 & 6 \\
 1 & 2 & 2 & 7 \\
 1 & 2 & 2 & 9 \\
 3 & 4 & 3 & 6 \\
 3 & 4 & 3 & 7 \\
 3 & 4 & 3 & 9 \\

\end{array} \right) = \begin{array}{ c | c}

 A & D \\
 \hline \hline
 1 & 2 \\
 3 & 3 \\

\end{array} $ (set semantics)

  • C is renamed to D

Translating SQL to RA

Main Article: Translating SQL to Relational Algebra


Sources