SSTable
Sorted String Tables are immutable files that contain arbitrary, sorted key-value pairs, persisted on the disk. When Memtables are flushed from memory to disk, they are stored as SSTables.
SSTables support fast sequential read/writes. As SSTables are sorted sequences of key-value pairs, an index file is maintained for fast retrieval. To retrieve data from SSTable, a binary search can be performed on the index to locate the offset of the value. The index partition is also stored in the disk.

Example:

For fast read, a partition index summary file is maintained which maps the key range to the offsets.

Recently accessed SSTables offsets are stored in cache:key cache.
It maps Key \(\rightarrow\) Byte offset
Reads

- When a read request arrives, it is first checked against the
Memtable, if found the value is returned - Lookup in
key cache, if found in key cache the disk is read for the specified offset, and value is returned. - Perform binary search lookup in
primary index summaryto find the offset. The offset will give the location of theindex file. From theindex file, the disk offset can be found. The disk is read for that offset and the value is returned.
Compactions

As SSTables are immutable, they can grow in size to a very huge amount. Compaction helps compress the SSTables.
Compaction is the operation of merging multiple related SSTables into a single new one. During compaction, the data in SSTables are merged:
- The keys are merged
- columns are combined
- obsolete values are discarded
- a new index is created
After compaction is performed the data is written into a new SSTable and the older SSTables are deleted from the disk.
