Secondary Index


What if we want to support efficient search by some other attribute (one that is not already indexed)?

- then the file is not sorted by this other attribute (it’s sorted on the primary key)
- we could make a copy of the entire table sorted by that attribute, but that is too expensive

Index structures that support such searches are called Secondary Indexes

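A Sparse Index doesn’t make sense here: the file is not ordered by this attribute, so an entry for one record per block says nothing about the other records in that block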

Idea:

Duplicates

Suppose we use Dense Index as our secondary index

Same as Option 1 from Dense Index (note that Option 2 will not work here - the file is not ordered by this key)
a duplicated key is stored once per occurrence (e.g. if 10 occurs 3 times, it appears 3 times in the index) - may lead to waste of space
may look innocent for integers, but often keys are strings
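A minimal sketch of this option, assuming a toy file modelled as a list of blocks of `(pk, attr)` records. The names `blocks`, `build_dense_secondary_index` and `lookup_dense` are made up for illustration and are not part of the original notes. It shows that every occurrence of a key gets its own index entry:

```python
from bisect import bisect_left

# Toy data file (an assumption for illustration): a list of blocks,
# each holding (pk, attr) records. The file is ordered by pk,
# so the attr values appear in no particular order.
blocks = [
    [(1, 20), (2, 10)],
    [(3, 30), (4, 10)],
    [(5, 20), (6, 10)],
]

def build_dense_secondary_index(blocks):
    """One (key, (block_no, slot)) entry per record, sorted by key."""
    entries = [
        (attr, (block_no, slot))
        for block_no, block in enumerate(blocks)
        for slot, (_pk, attr) in enumerate(block)
    ]
    entries.sort()
    return entries

def lookup_dense(index, blocks, key):
    """Return all records whose attribute equals key."""
    keys = [k for k, _ in index]  # rebuilt on every call; fine for a sketch
    i = bisect_left(keys, key)    # first index entry with this key, if any
    result = []
    while i < len(index) and index[i][0] == key:
        block_no, slot = index[i][1]
        result.append(blocks[block_no][slot])
        i += 1
    return result

index = build_dense_secondary_index(blocks)
```

Here the key 10 is stored three times in `index`, once per matching record, which is exactly the space overhead described above.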

Another way is to use variable-length records
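One way to picture variable-length index records, as a rough sketch under the same toy assumptions (a file modelled as a list of blocks of `(pk, attr)` records; `build_variable_length_index` is a hypothetical name): each key is stored once, followed by as many pointers as it has occurrences.

```python
# Toy data file (an assumption for illustration), ordered by pk.
blocks = [
    [(1, 20), (2, 10)],
    [(3, 30), (4, 10)],
    [(5, 20), (6, 10)],
]

def build_variable_length_index(blocks):
    """Return index records of the form (key, [pointer, pointer, ...])."""
    pointers = {}
    for block_no, block in enumerate(blocks):
        for slot, (_pk, attr) in enumerate(block):
            pointers.setdefault(attr, []).append((block_no, slot))
    # Sorted by key; each key appears exactly once, but the records
    # differ in length because the pointer lists differ in length.
    return sorted(pointers.items())

var_index = build_variable_length_index(blocks)
```

The key is no longer repeated, but the index records now have different sizes, which makes them harder to lay out in fixed-size slots.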

We add one more level of indirection

So now we have

Dense Index where each value is stored once
"Bucket list" where we have multiple occurrences
- pointers to actual values
- should be sequential: i.e. ordered by the key
Actual blocks

Also saves space
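The three layers can be sketched roughly as follows, again assuming a toy file modelled as a list of blocks of `(pk, attr)` records. The names `build_bucket_index` and `lookup_buckets`, and the choice to store a start offset per key, are assumptions made for illustration rather than the layout from the notes:

```python
from bisect import bisect_left

# Toy data file (an assumption for illustration), ordered by pk.
blocks = [
    [(1, 20), (2, 10)],
    [(3, 30), (4, 10)],
    [(5, 20), (6, 10)],
]

def build_bucket_index(blocks):
    """Return (dense_index, bucket_list).

    dense_index: (key, start offset into bucket_list), each key once.
    bucket_list: record pointers (block_no, slot), ordered by key.
    """
    entries = sorted(
        (attr, (block_no, slot))
        for block_no, block in enumerate(blocks)
        for slot, (_pk, attr) in enumerate(block)
    )
    dense_index, bucket_list = [], []
    for key, pointer in entries:
        if not dense_index or dense_index[-1][0] != key:
            # First occurrence of this key: remember where its run starts.
            dense_index.append((key, len(bucket_list)))
        bucket_list.append(pointer)
    return dense_index, bucket_list

def lookup_buckets(dense_index, bucket_list, blocks, key):
    """Follow index -> bucket list -> data blocks for the given key."""
    keys = [k for k, _ in dense_index]
    i = bisect_left(keys, key)
    if i == len(dense_index) or dense_index[i][0] != key:
        return []
    start = dense_index[i][1]
    # The run for this key ends where the next key's run begins.
    end = dense_index[i + 1][1] if i + 1 < len(dense_index) else len(bucket_list)
    return [blocks[b][s] for b, s in bucket_list[start:end]]

dense_index, bucket_list = build_bucket_index(blocks)
```

Each key is stored once in `dense_index` regardless of how many records share it; only the short pointers are repeated in `bucket_list`, which is where the space saving comes from when keys are long strings.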

This idea is used in Buckets of Pointers
