SSTable

SSTable stands for Sorted String Table.

An SSTable is an immutable, sorted on-disk map of key -> value records.

SSTables are a core building block in LSM-tree based systems like Bigtable, Cassandra, HBase, LevelDB, and RocksDB.

When an in-memory memtable crosses a threshold, it is flushed to disk as a new SSTable.


Why SSTables Exist

A B-tree-style update performs random in-place writes. SSTables avoid this by converting random writes into sequential appends plus background merge work.

SSTables optimize for:

  1. Fast sequential writes
  2. High write throughput
  3. Ordered range scans (key1..key2)
  4. Deterministic cleanup via compaction

Trade-off:

  1. A read may check multiple files
  2. Background compaction is mandatory

End-to-End Data Flow

For every mutation (key, value, timestamp):

  1. Append to WAL for durability
  2. Insert into memtable (sorted structure in memory)
  3. ACK client after WAL + memtable update
  4. Flush memtable to new SSTable when memory threshold is reached
  5. Compact SSTables in background to remove stale versions

SSTables are immutable after creation. New writes never modify an existing SSTable.
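The five steps above can be sketched as a toy write path. `MiniEngine`, the flush threshold, and the array-backed WAL are illustrative assumptions, not any real engine's API; a real memtable is a sorted structure (skip list or tree), so a plain Map plus a sort at flush time stands in for it here.

```typescript
// Toy write path for the five steps above. All names are illustrative.
type Entry = { key: string; value: string | null; ts: number };

export class MiniEngine {
  private wal: Entry[] = [];           // stands in for an append-only log file
  private memtable = new Map<string, Entry>();
  readonly sstables: Entry[][] = [];   // each flush emits one sorted "file"

  constructor(private flushThreshold = 4) {}

  put(key: string, value: string | null, ts: number): void {
    const e: Entry = { key, value, ts };
    this.wal.push(e);                  // 1. durability first
    this.memtable.set(key, e);         // 2. then the in-memory map
    // 3. the client would be acknowledged here
    if (this.memtable.size >= this.flushThreshold) this.flush(); // 4.
  }

  private flush(): void {
    // 5. write a new immutable SSTable in key order, then release the WAL
    const rows = [...this.memtable.values()].sort((a, b) => a.key.localeCompare(b.key));
    this.sstables.push(rows);
    this.memtable.clear();
    this.wal = [];
  }
}
```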


SSTable File Layout

At a practical level, one SSTable usually contains:

  1. Data blocks (sorted records, often compressed block-by-block)
  2. Primary index (block key boundaries -> block offsets)
  3. Index summary (sampled primary index entries kept in memory)
  4. Bloom filter (fast negative lookup)
  5. Metadata (min/max timestamp, tombstone stats, compression/checksum info)

For hot point reads, many systems also use:

  1. Key cache: key -> data offset
  2. Chunk/block cache: recently read data blocks in memory

Each data block stores records in sorted key order. A common per-record shape is:

  1. key_length
  2. value_length (or tombstone marker)
  3. timestamp/sequence
  4. key bytes
  5. value bytes

Implementations usually apply prefix compression inside blocks:

  1. store shared prefix length with previous key
  2. store only non-shared suffix
  3. place restart points every N keys for binary search
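The prefix-compression scheme above can be sketched as a small encoder/decoder. This is a simplification of what real block formats do on raw bytes; `restartInterval` plays the role of the restart points mentioned in step 3.

```typescript
// Prefix compression inside a block: each record stores how many characters
// it shares with the previous key, plus only the non-shared suffix. Every
// `restartInterval` keys a full key is stored so binary search can
// re-synchronize from that record.
type PrefixRecord = { shared: number; suffix: string };

export function encodeBlock(keys: string[], restartInterval = 16): PrefixRecord[] {
  const out: PrefixRecord[] = [];
  let prev = "";
  keys.forEach((key, i) => {
    let shared = 0;
    if (i % restartInterval !== 0) {   // restart points keep the full key
      const max = Math.min(prev.length, key.length);
      while (shared < max && prev[shared] === key[shared]) shared++;
    }
    out.push({ shared, suffix: key.slice(shared) });
    prev = key;
  });
  return out;
}

export function decodeBlock(records: PrefixRecord[]): string[] {
  const keys: string[] = [];
  let prev = "";
  for (const r of records) {
    const key = prev.slice(0, r.shared) + r.suffix;   // rebuild from prefix
    keys.push(key);
    prev = key;
  }
  return keys;
}
```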

Read Path

For a point lookup of key k:

  1. Check memtable
  2. Check immutable memtables waiting for flush
  3. Search SSTables in recency order (newest first):
    1. Test Bloom filter
    2. If Bloom says no, skip this SSTable
    3. If Bloom says maybe, probe index summary
    4. Read target section of primary index
    5. Seek target data block and binary search inside block
  4. Stop at first visible version for the read timestamp/snapshot

If the latest version is a tombstone, the key is deleted from the reader's perspective.
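The lookup order can be sketched as follows. This is a simplified model, not a real engine: a Set stands in for the Bloom filter (a real Bloom filter also returns false positives), and a Map per file replaces the index-plus-block machinery.

```typescript
// Point-lookup order: memtable first, then SSTables newest-first,
// pruning files whose Bloom filter definitively says "not here".
type Version = { value: string | null; ts: number };

type TableFile = {
  bloom: Set<string>;                 // toy stand-in for a Bloom filter
  rows: Map<string, Version>;
};

export function pointRead(
  memtable: Map<string, Version>,
  sstablesNewestFirst: TableFile[],
  key: string,
): string | null {
  // 1. the memtable holds the newest state if present
  const m = memtable.get(key);
  if (m) return m.value;              // null here means tombstone => deleted

  // 2. otherwise scan files newest-first; the first hit wins
  for (const t of sstablesNewestFirst) {
    if (!t.bloom.has(key)) continue;  // Bloom "no" is definitive: skip file
    const v = t.rows.get(key);
    if (v) return v.value;            // a tombstone (null) also ends the search
  }
  return null;                        // not found
}
```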


Range Scans

Range reads (startKey..endKey) use iterators from:

  1. memtable
  2. immutable memtables
  3. every overlapping SSTable

These iterators are merged with a min-heap by key, while applying:

  1. latest-version wins
  2. tombstone suppression
  3. snapshot timestamp visibility rules

WAL, Flush, and Recovery Semantics

The WAL provides durability for memtable state that has not yet been flushed.

A typical sequence:

  1. Append mutation to WAL segment
  2. Insert mutation in memtable
  3. Rotate memtable when full
  4. Flush immutable memtable to SSTable
  5. Mark old WAL segments as recyclable after all their data is persisted

Crash recovery:

  1. Load existing SSTables
  2. Replay non-obsolete WAL segments
  3. Rebuild memtable in key order
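Step 2 of recovery can be sketched as a replay loop; the record shape is illustrative, and segments are assumed to arrive oldest-first.

```typescript
// Crash-recovery sketch: replay surviving WAL segments into a fresh
// memtable, keeping only the newest version per key.
type WalRecord = { key: string; value: string | null; ts: number };

export function replayWal(segments: WalRecord[][]): Map<string, WalRecord> {
  const memtable = new Map<string, WalRecord>();
  for (const segment of segments) {       // segments replayed oldest-first
    for (const rec of segment) {
      const cur = memtable.get(rec.key);
      if (!cur || rec.ts >= cur.ts) memtable.set(rec.key, rec); // newest wins
    }
  }
  return memtable;
}
```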

Deletes, Tombstones, and TTL

SSTables are immutable, so a delete is represented as a tombstone record.

A tombstone can only be removed during compaction when it is safe:

  1. all older values for that key are known to be covered by compaction scope
  2. grace period or replication constraints are satisfied
  3. snapshot readers that still need old versions are not violated

TTL expiration is also enforced lazily:

  1. reads can ignore expired cells
  2. compaction physically drops expired records
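The three safety conditions can be sketched as a single predicate. `gcGraceSeconds` mirrors the grace-period idea (Cassandra calls it gc_grace_seconds); the other inputs abstract away what a real engine derives from compaction scope and open snapshots.

```typescript
// "Is it safe to physically drop this tombstone during compaction?"
export function canDropTombstone(opts: {
  tombstoneTs: number;                // seconds since epoch
  nowTs: number;
  gcGraceSeconds: number;
  allOlderVersionsInScope: boolean;   // compaction sees every older value
  oldestSnapshotTs: number | null;    // null => no open snapshots
}): boolean {
  const graceOver = opts.nowTs - opts.tombstoneTs >= opts.gcGraceSeconds;
  const noSnapshotNeedsIt =
    opts.oldestSnapshotTs === null || opts.oldestSnapshotTs > opts.tombstoneTs;
  return graceOver && opts.allOlderVersionsInScope && noSnapshotNeedsIt;
}
```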

Compaction Internals

Compaction is a sorted merge of SSTable inputs into new SSTable outputs.

During merge, compaction:

  1. Reads multiple sorted iterators
  2. Resolves same-key conflicts by timestamp/sequence number
  3. Drops shadowed values and obsolete tombstones
  4. Rebuilds new Bloom filters and indexes
  5. Atomically swaps old files with compacted files

Compaction policy affects amplification:

  1. Size-tiered (STCS): fewer compaction rewrites, higher read amplification
  2. Leveled (LCS): stricter level size limits, lower read amplification, higher write amplification
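Size-tiered selection can be sketched as grouping files into size buckets and compacting any bucket with enough similar-sized members. Bucketing by log2 of the file size is a simplification of real STCS heuristics, used here only to make the idea concrete.

```typescript
// Size-tiered compaction picking, sketched: compact any tier that has
// accumulated at least `minThreshold` similar-sized SSTables.
type FileInfo = { name: string; bytes: number };

export function pickSizeTieredCandidates(files: FileInfo[], minThreshold = 4): FileInfo[] | null {
  const buckets = new Map<number, FileInfo[]>();
  for (const f of files) {
    const tier = Math.floor(Math.log2(Math.max(f.bytes, 1))); // similar-size bucket
    const b = buckets.get(tier) ?? [];
    b.push(f);
    buckets.set(tier, b);
  }
  for (const b of buckets.values()) {
    if (b.length >= minThreshold) return b;  // merge this tier together
  }
  return null;                               // no compaction needed yet
}
```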

Read, Write, and Space Amplification

Three metrics control production behavior:

  1. Read amplification: number of files/blocks touched per read
  2. Write amplification: bytes rewritten by compaction per byte ingested
  3. Space amplification: extra disk due to multiple versions and tombstones

You reduce one by increasing another. There is no free configuration.
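Using the definitions above, the three metrics can be computed from engine counters over an observation window. The field names are illustrative; they correspond to stats most engines expose in some form.

```typescript
// Back-of-envelope amplification numbers for a window of operation.
export function amplification(stats: {
  bytesIngested: number;        // user writes accepted
  bytesFlushed: number;         // memtable -> new SSTables
  bytesCompacted: number;       // bytes rewritten by compaction
  liveDataBytes: number;        // logical live data
  diskBytes: number;            // actual disk usage
  filesTouchedPerRead: number;  // average per point read
}) {
  return {
    writeAmp: (stats.bytesFlushed + stats.bytesCompacted) / stats.bytesIngested,
    spaceAmp: stats.diskBytes / stats.liveDataBytes,
    readAmp: stats.filesTouchedPerRead,
  };
}
```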


Worked Example

Assume key user:42 has the following versions:

  1. SSTable-5 (newest): user:42 -> tombstone @ ts=120
  2. SSTable-4: user:42 -> "gold" @ ts=110
  3. SSTable-2 (oldest): user:42 -> "silver" @ ts=90

Point read for user:42:

  1. memtable miss
  2. check SSTable-5 Bloom -> maybe
  3. probe index summary + primary index -> locate block
  4. read block, find tombstone at ts=120
  5. stop search, return not found

Even though older SSTables contain values, they are shadowed by the tombstone.

Now assume compaction merges SSTable-5, SSTable-4, and SSTable-2:

  1. compare records by key and timestamp
  2. keep only newest logical state
  3. if tombstone is still within grace window, keep tombstone in output
  4. if grace window is over and all older data is covered, drop tombstone and older values

Post-compaction possibilities:

  1. output keeps tombstone (safe but uses extra space)
  2. output removes all versions of user:42 (fully reclaimed)

The second case improves both read and space amplification, but only when correctness constraints are met.
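The user:42 outcome can be expressed as a tiny version-resolution function. The `graceOver` flag abstracts the grace-window, coverage, and snapshot checks a real engine performs (see the tombstone rules earlier).

```typescript
// Resolve all versions of one key to its post-compaction state:
// keep the newest version, or reclaim the key entirely when the
// surviving record is a tombstone that is safe to drop.
type Cell = { value: string | null; ts: number };

export function compactVersions(versions: Cell[], graceOver: boolean): Cell | null {
  const newest = versions.reduce((a, b) => (a.ts >= b.ts ? a : b));
  if (newest.value === null && graceOver) return null; // fully reclaimed
  return newest;       // newest value, or the tombstone kept for safety
}
```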


Worked Example: Range Scan Merge

Range query: user:40 to user:45

Streams visible to query:

  1. memtable: (user:41, "A", ts=130), (user:44, null, ts=131)
  2. SSTable-6: (user:40, "X", ts=120), (user:44, "M", ts=100)
  3. SSTable-3: (user:42, "B", ts=90), (user:45, "C", ts=95)

Merge steps:

  1. heap picks smallest key among stream heads
  2. for same key, highest timestamp wins
  3. tombstone winner suppresses older values
  4. advance only streams that contributed the emitted/suppressed key

Final scan output:

  1. user:40 -> X
  2. user:41 -> A
  3. user:42 -> B
  4. user:44 is deleted (tombstone at ts=131 wins)
  5. user:45 -> C

This is the same logical rule as point read, but applied continuously across ordered iterators.
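The merge above can be reproduced end to end in a self-contained sketch. Materializing and sorting all cells replaces the streaming min-heap for brevity; the visibility rules (newest timestamp wins, tombstone suppresses) are the same.

```typescript
// Range-scan merge: sort all cells by (key asc, ts desc), then for each key
// keep only its first occurrence (the newest version); a tombstone winner
// emits nothing for that key.
type Cell = { key: string; value: string | null; ts: number };

export function mergeScan(streams: Cell[][]): Array<[string, string]> {
  const all = streams.flat().sort((a, b) =>
    a.key === b.key ? b.ts - a.ts : a.key.localeCompare(b.key));
  const out: Array<[string, string]> = [];
  let prevKey: string | null = null;
  for (const c of all) {
    if (c.key === prevKey) continue;   // shadowed older version
    prevKey = c.key;
    if (c.value !== null) out.push([c.key, c.value]);
  }
  return out;
}

const memtable: Cell[] = [
  { key: "user:41", value: "A", ts: 130 },
  { key: "user:44", value: null, ts: 131 },  // tombstone
];
const sstable6: Cell[] = [
  { key: "user:40", value: "X", ts: 120 },
  { key: "user:44", value: "M", ts: 100 },
];
const sstable3: Cell[] = [
  { key: "user:42", value: "B", ts: 90 },
  { key: "user:45", value: "C", ts: 95 },
];

// => [["user:40","X"], ["user:41","A"], ["user:42","B"], ["user:45","C"]]
const scanResult = mergeScan([memtable, sstable6, sstable3]);
```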


Operational Failure Modes

Common production issues around SSTables:

  1. Compaction debt: flush produces SSTables faster than compaction can merge them
  2. Bloom filter under-provisioning: too many false positives increase disk seeks
  3. Large partition/hot key: one key range dominates IO and compaction cost
  4. Tombstone accumulation: delayed cleanup hurts reads and disk usage
  5. Write stalls: memory and flush queues fill, forcing foreground backpressure

Signals to watch:

  1. pending compaction bytes/tasks
  2. read p99 with Bloom false-positive rate
  3. number of SSTables touched per read
  4. tombstone scan/drop metrics
  5. flush latency and stall time

TypeScript Reference Snippets

Sparse index lookup in one SSTable

type IndexEntry = {
  firstKey: string;
  lastKey: string;
  blockOffset: number;
};

// Binary search over the sorted, non-overlapping index entries; returns the
// offset of the data block whose [firstKey, lastKey] range covers `key`,
// or null if no block can contain it.
export function findCandidateBlock(index: IndexEntry[], key: string): number | null {
  let lo = 0;
  let hi = index.length - 1;

  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const e = index[mid];
    if (key < e.firstKey) hi = mid - 1;
    else if (key > e.lastKey) lo = mid + 1;
    else return e.blockOffset;
  }
  return null;
}

Compaction merge with tombstone handling

type Row = {
  key: string;
  value: string | null; // null => tombstone
  ts: number; // larger is newer
};

// Assumes each input has unique keys in sorted order, and that this is a
// full compaction (all older versions are in scope), so every surviving
// tombstone is safe to drop.
export function compactSortedStreams(a: Row[], b: Row[]): Row[] {
  const out: Row[] = [];
  let i = 0;
  let j = 0;

  const emitNewest = (x: Row, y: Row) => {
    const newest = x.ts >= y.ts ? x : y;
    if (newest.value !== null) out.push(newest);
  };

  while (i < a.length && j < b.length) {
    if (a[i].key < b[j].key) {
      if (a[i].value !== null) out.push(a[i]);
      i++;
      continue;
    }
    if (a[i].key > b[j].key) {
      if (b[j].value !== null) out.push(b[j]);
      j++;
      continue;
    }

    emitNewest(a[i], b[j]);
    i++;
    j++;
  }

  while (i < a.length) {
    if (a[i].value !== null) out.push(a[i]);
    i++;
  }
  while (j < b.length) {
    if (b[j].value !== null) out.push(b[j]);
    j++;
  }

  return out;
}

K-way merge iterator shape for range scan

type KV = { key: string; value: string | null; ts: number };
type Stream = Iterator<KV>;

export function* mergeStreams(streams: Stream[]): Generator<KV> {
  const heads: Array<{ i: number; v: KV }> = [];

  for (let i = 0; i < streams.length; i++) {
    const n = streams[i].next();
    if (!n.done) heads.push({ i, v: n.value });
  }

  let prevKey: string | null = null;

  while (heads.length > 0) {
    // smallest key first; among equal keys, the newest timestamp comes first
    heads.sort((a, b) => (a.v.key === b.v.key ? b.v.ts - a.v.ts : a.v.key.localeCompare(b.v.key)));
    const top = heads.shift()!;

    // the first time a key is popped it is the newest version; later pops of
    // the same key are shadowed older versions and must be skipped, or they
    // would resurrect overwritten or deleted values
    if (top.v.key !== prevKey) {
      prevKey = top.v.key;
      if (top.v.value !== null) yield top.v; // tombstone suppresses the key
    }

    const next = streams[top.i].next();
    if (!next.done) heads.push({ i: top.i, v: next.value });
  }
}

Tuning in Practice

  1. Memtable size: larger memtables reduce flush frequency but increase recovery replay time
  2. Bloom filter bits/key: lower false-positive rate improves read latency but costs RAM
  3. Block size: larger blocks improve scan throughput; smaller blocks reduce wasted bytes per point lookup
  4. Compression: reduces IO and storage; increases CPU
  5. Compaction concurrency: too low causes backlog, too high competes with foreground traffic
  6. Tombstone grace/TTL: aggressive cleanup reduces space, but must respect replication/read-repair guarantees
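For bullet 2, the standard Bloom-filter approximation makes the RAM/latency trade-off concrete: with m/n bits per key and an optimal hash count k ≈ (m/n)·ln 2, the false-positive rate is about (1 − e^(−kn/m))^k.

```typescript
// Rule-of-thumb Bloom sizing: false-positive rate for a given bits-per-key
// budget, assuming the optimal number of hash functions.
export function bloomFalsePositiveRate(bitsPerKey: number): number {
  const k = Math.max(1, Math.round(bitsPerKey * Math.LN2)); // optimal hash count
  return Math.pow(1 - Math.exp(-k / bitsPerKey), k);
}
```

At the commonly used 10 bits per key this works out to roughly a 1% false-positive rate; halving the RAM budget raises it substantially.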

Summary

SSTables are immutable sorted files designed for LSM storage engines. Their performance comes from sequential writes, indexed block reads, Bloom filter pruning, and continuous compaction. Most production tuning is the art of balancing read, write, and space amplification against workload shape.


References

  1. SSTable and Log Structured Storage (Ilya Grigorik)
  2. Bigtable: A Distributed Storage System for Structured Data (Google paper)
