In the problem of load balancing, a hash function is used to assign the incoming BLOB to a server. A general approach would be to calculate the hash of the BLOB object and take modulus with the number of servers (n).

We assume

\(\beta\): hash of the BLOB object
\(n\): number of available servers
\(\zeta\): assigned server to the BLOB

\[\zeta = \beta \ \%\ n\]

The problem with this approach is that if a server is removed, the \(n\) changes and all the BLOBs need to be resassigned.


Consistent Hashing

Figure: Blue disks denote the machines and red boxes denote the items. Arrows also show the machine to which each item is assigned.


The central idea is, we use a hash function that randomly maps both the BLOB and servers to a unit circle, usually \(2\pi\) radians.

For example, \(\zeta =\Phi \ \%\ 360\) (where \(\Phi\) is hash of a BLOB or server’s identifier, like IP address or UUID). Each BLOB is then assigned to the next server that appears on the right of the BLOB on the circle.

Failure

If a server fails and is removed from the circle, only the BLOBs that are mapped to the failed server need to be reassigned to the next server in the clockwise order. Likewise if the server is added to the unit circle, only the BLOBs mapped to that server need to be reassigned.


Implementation

A Binary Serch Tree (BST) is maintained with the server ID as nodes. To find the server for the BLOBs, tree traversal is performend \(O(log \ N)\).

Let

\(h_b(x)\): Hash function used for BLOB
\(h_s(x)\): Hash function user for server’s identity\

Let \(\beta\) be the hash value of the blob such that:
\(\zeta = h_b(x) = \beta \ \% \ 360\)

Let \(\phi\) be the hash value of the server such that:
\(\theta = h_s(x) = \phi \ \% \ 360\)

Inserting \(x\) into the cluster:

  • To insert \(x\), find the successor of \(\zeta\) in the BST.
  • If \(\zeta\) is larger than all the Server IDs, then place the BLOB in the server with the smallest Server ID value.

Deleting \(x\) from the cluster:

  • Find the successor of \(\zeta\) in the BST, remove the BLOB from the returned Server ID.
  • If \(\zeta\) has no successor, remove the BLOB from the server with the smallest Server ID.

Insert a server into cluster

  • Move all the BLOBs from the server whose Server ID is successor of \(\theta\).
  • If \(\theta\) is the largest of the Server ID, move the BLOBs from the smallest of the Server IDs to \(\theta\)

Delete a server from cluster

  • Find the successor of \(\theta\) in the BST, move the BLOBs from \(\theta\) into te successor server.
  • If \(\theta\) doesn’t have a successor, move the BLOBs into the smallest of the server ID

Resources

  1. https://en.wikipedia.org/wiki/Consistent_hashing
  2. MIT