Design a news feed system

Step 1 - Understand the problem and establish design scope

Is this a mobile app? Or a web app? Or both?
Both

What are the important features?
Publish a post and see friends’ posts

Is the news feed sorted by reverse chronological order?
Yes

How many friends can a user have?
5000

What is the traffic volume?
10 million DAU

Can the feed contain images, videos, or just text?
All

Step 2 - Propose high-level design and get buy-in

Feed publishing

When a user publishes a post, the corresponding data is written to the cache and the database. The post is then pushed to her friends’ news feeds.

POST /v1/me/feed

Params

content: the text of the post

auth_token: it is used to authenticate API requests.
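
The publishing call can be sketched roughly as follows. This is a minimal in-memory illustration, not the actual service: the dictionaries standing in for the post DB and post cache, the VALID_TOKENS table, and the publish_post helper are all assumptions made for the example.

```python
import itertools
import time

# Hypothetical stand-ins for the real stores; a production system would use
# a database and a distributed cache instead of dictionaries.
post_db = {}
post_cache = {}
VALID_TOKENS = {"token-alice": "alice"}  # auth_token -> user ID (assumed)
_next_post_id = itertools.count(1)


def publish_post(auth_token, content):
    """Handle POST /v1/me/feed: authenticate, then persist the post."""
    user_id = VALID_TOKENS.get(auth_token)
    if user_id is None:
        raise PermissionError("invalid auth_token")

    post = {
        "post_id": next(_next_post_id),
        "author_id": user_id,
        "content": content,
        "created_at": time.time(),
    }
    # Write to the database first, then to the cache.
    post_db[post["post_id"]] = post
    post_cache[post["post_id"]] = post
    return post
```

Propagating the post to friends’ feeds is a separate step and is deliberately left out of this sketch.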

Newsfeed building

The news feed is built by aggregating friends’ posts in reverse chronological order

GET /v1/me/feed

Params

auth_token: it is used to authenticate API requests.
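
One way to picture the aggregation behind this endpoint is a merge of per-friend timelines. The sketch below assumes each friend's timeline is already sorted newest first and is represented as (created_at, post_id) tuples; the build_feed name and the limit parameter are made up for illustration.

```python
import heapq
import itertools


def build_feed(friend_timelines, limit=20):
    """Merge per-friend timelines into one reverse-chronological feed.

    Each timeline is a list of (created_at, post_id) tuples that is already
    sorted newest first.
    """
    merged = heapq.merge(*friend_timelines, key=lambda p: p[0], reverse=True)
    return [post_id for _, post_id in itertools.islice(merged, limit)]


timelines = [
    [(105, "b2"), (101, "b1")],               # friend B
    [(110, "c3"), (103, "c2"), (100, "c1")],  # friend C
]
print(build_feed(timelines, limit=4))  # ['c3', 'b2', 'c2', 'b1']
```

Because heapq.merge is lazy, only as many posts as the limit requires are pulled from the timelines.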

Components

User: Web browser and Mobile app

Load Balancer: Distribute traffic to web servers

Web servers: handle requests and call internal services

Post service: persist post in the database and cache

Fanout service: push new content to friends’ news feed

Notification service: inform friends that new content is available and send out push notifications.

Newsfeed service: fetch the news feed from the cache

Post Cache and Post DB: store posts

News Feed Cache: store news feed IDs needed to render the news feed

Step 3 - Design deep dive

Web servers

Feed publishing deep dive

Detailed components

Message Queue

Fanout Workers

Graph DB

User Cache and DB

Fanout on write

The news feed is pre-computed during write time

Pros

Real-time

Fetching is fast

Cons

Hotspot (hot key) problem when a user has many friends

Waste for inactive users
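
The push model can be sketched as below. This is a simplified illustration under stated assumptions: the news feed cache is modelled as an in-memory mapping from user ID to a bounded list of post IDs, and MAX_FEED_LENGTH, fanout_on_write, and read_feed are hypothetical names.

```python
from collections import defaultdict, deque

MAX_FEED_LENGTH = 3  # assumed cap, kept tiny for the example

# user ID -> deque of post IDs, newest first (stand-in for the news feed cache)
news_feed_cache = defaultdict(lambda: deque(maxlen=MAX_FEED_LENGTH))


def fanout_on_write(post_id, friend_ids):
    """Push a new post ID onto the cached feed of every friend."""
    for friend_id in friend_ids:
        news_feed_cache[friend_id].appendleft(post_id)


def read_feed(user_id):
    """Reading is a single cache lookup; no aggregation happens here."""
    return list(news_feed_cache[user_id])


for pid in ["p1", "p2", "p3", "p4"]:
    fanout_on_write(pid, ["bob", "carol"])
print(read_feed("bob"))  # ['p4', 'p3', 'p2']
```

The loop runs once per friend, which is why a user with many friends makes every post expensive, and why entries written for friends who never open the app are wasted work.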

Fanout on read

The news feed is generated during read time

Pros

No waste for inactive users

No hotspot

Cons

Fetching is slow

Fanout workers fetch data from the message queue and store news feed data in the news feed cache
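
A rough sketch of that worker step, assuming Python's standard queue.Queue as a stand-in for the message queue and a plain dictionary for the news feed cache. The task shape and the enqueue_fanout and run_worker_once helpers are invented for the example.

```python
import queue
from collections import defaultdict

fanout_queue = queue.Queue()         # stand-in for the message queue
news_feed_cache = defaultdict(list)  # user ID -> post IDs, newest first


def enqueue_fanout(post_id, friend_ids):
    """Producer side: one task per post, carrying the friend list."""
    fanout_queue.put({"post_id": post_id, "friend_ids": list(friend_ids)})


def run_worker_once():
    """Consumer side: drain the queue and update the news feed cache."""
    processed = 0
    while True:
        try:
            task = fanout_queue.get_nowait()
        except queue.Empty:
            return processed
        for friend_id in task["friend_ids"]:
            news_feed_cache[friend_id].insert(0, task["post_id"])
        fanout_queue.task_done()
        processed += 1
```

Putting a queue between publishing and fanout means the publish request can return before every friend's feed has been updated.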

Newsfeed retrieval deep dive

The news feed service fetches the complete user and post objects from caches (user cache and post cache) to construct the fully hydrated news feed
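
The hydration step might look like the following sketch. It assumes the news feed cache yields a list of post IDs and that the post cache and user cache behave like dictionaries; the field names (author_id, content, name) and the hydrate_feed helper are assumptions for illustration.

```python
def hydrate_feed(post_ids, post_cache, user_cache):
    """Turn a list of post IDs into fully populated feed entries."""
    feed = []
    for post_id in post_ids:
        post = post_cache.get(post_id)
        if post is None:
            continue  # cache miss; a real service would fall back to the post DB
        author = user_cache.get(post["author_id"], {})
        feed.append({
            "post_id": post_id,
            "content": post["content"],
            "author_name": author.get("name", "unknown"),
        })
    return feed


posts = {"p1": {"author_id": "u1", "content": "hello"}}
users = {"u1": {"name": "Alice"}}
print(hydrate_feed(["p1", "p9"], posts, users))
```

Keeping only IDs in the news feed cache keeps each feed entry small; the full objects are looked up once per read.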

Cache Architecture

Authentication and rate limiting

News Feed

Content

Social Graph

Action

Counters

Step 4 - Wrap up

Scaling

SQL vs NoSQL

Master-slave replication

Read replicas

Consistency models

Database sharding

Vertical scaling vs Horizontal scaling

Other talking points

Keep web tier stateless

Cache data as much as you can

Support multiple data centers

Loosely couple components with message queues

Monitor key metrics. For instance, QPS during peak hours and latency while users refresh their news feed are interesting to monitor.