Functional Requirements

  1. Tweet
  2. Re-Tweet
  3. Follow
  4. Search

Capacity Estimation

  • 150 Million Daily Active Users
  • 350 Million Monthly Active Users
  • 1.5 Billion User Accounts
  • 500 Million Tweets/day
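
As a rough sanity check, the figures above can be turned into average rates. The sketch below assumes tweets are spread evenly across the day, which real traffic is not, so peak rates would be noticeably higher than the average it prints.

```python
# Back-of-the-envelope rates derived from the capacity figures above.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

daily_active_users = 150_000_000
tweets_per_day = 500_000_000

# Average write rate, assuming an even spread over the day (an assumption).
avg_tweets_per_second = tweets_per_day / SECONDS_PER_DAY

# Average number of tweets posted per daily active user.
tweets_per_dau = tweets_per_day / daily_active_users

print(f"average tweets/second: {avg_tweets_per_second:,.0f}")      # 5,787
print(f"tweets per daily active user: {tweets_per_dau:.2f}")       # 3.33
```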

User Categorization

  • famous: millions of followers
  • live: currently accessing Twitter
  • active: have accessed Twitter recently, e.g. within the last 3 hours
  • passive: have not accessed Twitter recently
  • inactive: deleted account
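
The categories above could be derived from a few per-user fields. This is a minimal sketch: the field names, the 1,000,000 follower threshold, and the 3 hour window are illustrative assumptions, and "famous" is treated as a separate flag because a famous user can also be live, active, or passive.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed thresholds; the notes only say "millions of followers" and "e.g. 3 hours".
FAMOUS_FOLLOWER_THRESHOLD = 1_000_000
ACTIVE_WINDOW_SECONDS = 3 * 60 * 60


@dataclass
class UserState:
    follower_count: int
    is_connected: bool               # currently has an open session
    last_access_ts: Optional[float]  # Unix seconds of the last access, if any
    is_deleted: bool = False


def is_famous(user: UserState) -> bool:
    return user.follower_count >= FAMOUS_FOLLOWER_THRESHOLD


def activity_level(user: UserState, now: float) -> str:
    """Return one of: inactive, live, active, passive."""
    if user.is_deleted:
        return "inactive"
    if user.is_connected:
        return "live"
    if user.last_access_ts is not None and now - user.last_access_ts <= ACTIVE_WINDOW_SECONDS:
        return "active"
    return "passive"


now = 1_000_000.0
celebrity = UserState(follower_count=5_000_000, is_connected=True, last_access_ts=now)
recent = UserState(follower_count=10, is_connected=False, last_access_ts=now - 3600)
stale = UserState(follower_count=10, is_connected=False, last_access_ts=now - 86_400)
deleted = UserState(follower_count=10, is_connected=False, last_access_ts=None, is_deleted=True)
```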

1. User Onboarding



User Service

  • User Service handles the onboarding of new users
  • User information is stored in an RDBMS cluster
  • Information is also cached in Redis.
    • When a GET request for a user's info by id arrives, it is served from Redis
    • If the information is not found in Redis, the RDBMS is queried and Redis is updated
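
The read path above is the usual cache-aside pattern. The sketch below uses plain dictionaries as stand-ins for Redis and the RDBMS cluster; the class and method names are hypothetical and only illustrate the lookup order.

```python
from typing import Dict, Optional


class UserService:
    def __init__(self, db: Dict[int, dict]):
        self.db = db                      # stand-in for the RDBMS cluster
        self.cache: Dict[int, dict] = {}  # stand-in for Redis
        self.db_reads = 0                 # counts how often the RDBMS is hit

    def get_user(self, user_id: int) -> Optional[dict]:
        # 1. Try the cache first.
        cached = self.cache.get(user_id)
        if cached is not None:
            return cached
        # 2. On a miss, fall back to the database.
        self.db_reads += 1
        row = self.db.get(user_id)
        # 3. Populate the cache so the next read is served from it.
        if row is not None:
            self.cache[user_id] = row
        return row


service = UserService(db={1: {"id": 1, "name": "alice"}})
first = service.get_user(1)    # miss: reads the database, fills the cache
second = service.get_user(1)   # hit: served from the cache
missing = service.get_user(2)  # unknown id: nothing is cached
```

In a real deployment the cached entry would normally carry an expiry and be invalidated when the user record changes; both are left out here.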

2. Follower-Follow



Graph Service

  • generates the graph of users and their followers
  • the graph is stored in the RDBMS
    • table: user(A) → users who follow (A)
    • table: user(A) → users (A) is following
  • information is cached in Redis
    • When a GET request arrives for the list of followers, the information in Redis is looked up.
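
The two tables above can be pictured as two mappings that are kept in sync on every follow. This is an in-memory sketch with hypothetical names, not the actual schema.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set


class FollowGraph:
    def __init__(self) -> None:
        # user -> users who follow that user
        self.followers: DefaultDict[str, Set[str]] = defaultdict(set)
        # user -> users that user is following
        self.following: DefaultDict[str, Set[str]] = defaultdict(set)

    def follow(self, follower: str, followee: str) -> None:
        if follower == followee:
            return  # assumption: users cannot follow themselves
        self.followers[followee].add(follower)
        self.following[follower].add(followee)

    def unfollow(self, follower: str, followee: str) -> None:
        self.followers[followee].discard(follower)
        self.following[follower].discard(followee)

    def get_followers(self, user: str) -> List[str]:
        # .get avoids creating empty entries for unknown users
        return sorted(self.followers.get(user, ()))

    def get_following(self, user: str) -> List[str]:
        return sorted(self.following.get(user, ()))


graph = FollowGraph()
for fan in ("U2", "U3", "U4"):
    graph.follow(fan, "U1")
graph.unfollow("U4", "U1")
```

Storing both directions doubles the writes but lets either lookup (followers of A, or users A follows) be answered with a single key read.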

3. Live Websocket Notification



Live Websocket Notification

  • Users who are live are connected over a WebSocket.
  • Events are pushed to Kafka as they happen.
  • The Live Websocket Notification Service consumes messages from Kafka and notifies all connected live users.
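
The consume-and-notify loop can be sketched without any network code. Here a `queue.Queue` stands in for the Kafka topic and a plain callable stands in for each WebSocket connection; the assumption that every event carries a `recipients` list is mine, not something the notes specify.

```python
import queue
from typing import Callable, Dict


class LiveNotificationService:
    def __init__(self, events: queue.Queue):
        self.events = events  # stand-in for a Kafka topic
        # user id -> send function of that user's open connection
        self.connections: Dict[str, Callable[[dict], None]] = {}

    def connect(self, user_id: str, send: Callable[[dict], None]) -> None:
        self.connections[user_id] = send

    def disconnect(self, user_id: str) -> None:
        self.connections.pop(user_id, None)

    def drain(self) -> int:
        """Consume all pending events and push them to connected recipients."""
        delivered = 0
        while True:
            try:
                event = self.events.get_nowait()
            except queue.Empty:
                return delivered
            for user_id in event["recipients"]:
                send = self.connections.get(user_id)
                if send is not None:  # only live users have a connection
                    send(event)
                    delivered += 1


topic = queue.Queue()
notifier = LiveNotificationService(topic)
u2_inbox = []
notifier.connect("U2", u2_inbox.append)  # U2 is live, U3 is not connected
topic.put({"type": "tweet", "tweet_id": 105, "recipients": ["U2", "U3"]})
delivered_count = notifier.drain()
```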

4. Tweet Service



User Timeline

  • the tweets that the user has posted

Home Timeline

  • the combined view of tweets of the users that the user follows

Tweet Ingestion Service

  • User tweets are sent to the Ingestion Service
  • It is responsible only for tweet writes, not reads
  • If media is associated with the tweet:
    • it contacts the Short URL Service, which provides a unique URL
    • the media, along with the short URL, is forwarded to the Asset Service, which stores the media on the CDN
  • Tweets are forwarded to Kafka. Live users connected over WebSockets will receive the tweet.
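
The write path above can be sketched as three collaborating objects. All class names, the `example.com` URL format, and the event fields are hypothetical; a list stands in for the Kafka topic and a dictionary for CDN storage.

```python
import itertools
from typing import Dict, List, Optional


class ShortUrlService:
    def __init__(self) -> None:
        self._ids = itertools.count(1)

    def new_url(self) -> str:
        # Hypothetical URL scheme; uniqueness comes from the counter.
        return f"https://example.com/m/{next(self._ids)}"


class AssetService:
    def __init__(self) -> None:
        self.stored: Dict[str, bytes] = {}  # stand-in for CDN storage

    def store(self, short_url: str, media: bytes) -> None:
        self.stored[short_url] = media


class IngestionService:
    def __init__(self, short_urls: ShortUrlService, assets: AssetService, topic: List[dict]):
        self.short_urls = short_urls
        self.assets = assets
        self.topic = topic  # stand-in for a Kafka topic

    def post_tweet(self, tweet_id: int, author_id: str, text: str,
                   media: Optional[bytes] = None) -> dict:
        media_url = None
        if media is not None:
            # Media goes to the Asset Service under a freshly issued short URL.
            media_url = self.short_urls.new_url()
            self.assets.store(media_url, media)
        event = {"tweet_id": tweet_id, "author_id": author_id,
                 "text": text, "media_url": media_url}
        self.topic.append(event)  # forward the tweet for downstream consumers
        return event


topic: List[dict] = []
assets = AssetService()
ingestion = IngestionService(ShortUrlService(), assets, topic)
plain = ingestion.post_tweet(104, "U1", "hello")
with_media = ingestion.post_tweet(105, "U1", "look at this", media=b"\x89PNG")
```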

Tweet Service

  • provides APIs to read tweets

Tweet Processor

  • For active users, the Tweet Processor service caches the timeline in the Redis cluster
  • To generate the user timeline:
    • It talks to Graph Service to get the ids of all the followers
    • It talks to User Service to get the user details
    • It consumes tweets from Kafka and inserts them into the user timeline
  • Finally, the user timeline is cached in the Redis cluster
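
The steps above amount to a fan-out on write: each new tweet is pushed into the cached timeline of every active follower of its author. The sketch below is one way to picture that, using a dictionary of bounded deques in place of the Redis cluster; the cap of 3 entries is an arbitrary assumption for the demo, and the User Service lookup is omitted.

```python
from collections import deque
from typing import Deque, Dict, Iterable, List, Set

TIMELINE_LIMIT = 3  # assumed cap on cached entries per timeline (demo value)


def fan_out(events: Iterable[dict],
            followers: Dict[str, List[str]],
            active_users: Set[str],
            timelines: Dict[str, Deque[int]]) -> None:
    """Insert each tweet id into the cached timeline of every active follower."""
    for event in events:
        for follower_id in followers.get(event["author_id"], ()):
            if follower_id not in active_users:
                continue  # timelines are only cached for active users
            timeline = timelines.setdefault(follower_id, deque(maxlen=TIMELINE_LIMIT))
            # Newest first; once the cap is reached the oldest id drops off the end.
            timeline.appendleft(event["tweet_id"])


followers = {"U1": ["U2", "U3", "U4"]}  # what Graph Service would return
active_users = {"U2", "U3"}             # U4 has no cached timeline
timelines: Dict[str, Deque[int]] = {}
events = [{"tweet_id": t, "author_id": "U1"} for t in (102, 103, 104, 105)]
fan_out(events, followers, active_users, timelines)
```

Fan-out on write keeps reads cheap for active users, at the cost of one cache write per follower for every tweet.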

Timeline Service

  • Generates the timeline for passive users, i.e. users whose timeline is not cached
  • For a user timeline:
    • the list of all users that the user is following is fetched from Graph Service
    • User Service provides details about those users
    • if any media is associated with a tweet, Asset Service is contacted to retrieve that media
    • Tweet Service provides the tweets of the users that the user is following
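
For users without a cached timeline, the timeline is assembled at read time. The sketch below merges the per-author tweet lists into one newest-first list; the dictionaries stand in for Graph Service and Tweet Service, the `created_at` field and the `limit` parameter are assumptions, and the User Service and Asset Service lookups are left out.

```python
import heapq
from itertools import islice
from typing import Dict, List


def build_timeline(user_id: str,
                   following: Dict[str, List[str]],
                   tweets_by_author: Dict[str, List[dict]],
                   limit: int = 20) -> List[dict]:
    """Merge the tweets of everyone `user_id` follows, newest first.

    Each author's list is assumed to be sorted newest first already.
    """
    streams = [tweets_by_author.get(author, [])
               for author in following.get(user_id, [])]
    merged = heapq.merge(*streams, key=lambda t: t["created_at"], reverse=True)
    return list(islice(merged, limit))


following = {"U5": ["U1", "U6"]}
tweets_by_author = {
    "U1": [{"tweet_id": 105, "created_at": 300}, {"tweet_id": 90, "created_at": 100}],
    "U6": [{"tweet_id": 101, "created_at": 200}],
}
timeline = build_timeline("U5", following, tweets_by_author, limit=2)
timeline_ids = [t["tweet_id"] for t in timeline]
```

Because the merge is lazy, only as many tweets as `limit` requires are pulled from the per-author lists.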

Tweet Read Flow

Active Users

  • Let U1 be followed by U2, U3, and U4.
  • U1 tweets t1 with tweet id t_id: 105
  • Tweet Processor Service queries Redis
    • t1 is inserted in the timelines of U2, U3, and U4