28 Comments
Tanner Barcelos

This was quite amazing and super super helpful. I am an experienced coder but have not had many opportunities to design high scale systems at work or in practice so this is going to help me a ton.

Ashish Pratap Singh

Really happy to hear this, thank you so much!

Vikas

Nice one Ashish 👍🏻

Raul Junco

Excellent case study, Ashish!

Srinu Nampalli

Impressive! You are truly a gem. Thank you for dedicating your time and sharing such valuable content with crystal-clear clarity.

Ashish Pratap Singh

Thank you 😊

Love to hear this.

Grend

Great content brother

Ashish Pratap Singh

Thank you!

Deepak Katariya

Thanks for sharing.

Ashish Pratap Singh

You are welcome!

Himanshu Sharma

I'm really interested to know how you would scale the consumers. With an average of 17,000 notifications/second, let's assume 10k are being consumed at the same time. How will the pollers/consumers that pick notifications from the topic scale, and how does that affect the database choice and the I/O needed to fetch data from the DB?
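
A rough sizing sketch for the question above, assuming the topic is split into partitions and each consumer instance sustains a fixed throughput. The per-consumer rate, the headroom factor, and the function names are hypothetical and not taken from the article; only the 17,000 notifications/second figure comes from the comment.

```python
import math


def consumers_needed(arrival_rate: float, per_consumer_rate: float,
                     headroom: float = 0.3) -> int:
    """Estimate how many consumer instances keep up with arrival_rate.

    headroom is spare capacity for bursts (0.3 means 30% extra). Assumed value.
    """
    if per_consumer_rate <= 0:
        raise ValueError("per_consumer_rate must be positive")
    return math.ceil(arrival_rate * (1 + headroom) / per_consumer_rate)


def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Spread partitions over consumers round-robin (one owner per partition)."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment: dict[str, list[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Hypothetical: each consumer handles 500 notifications/second.
needed = consumers_needed(17_000, 500, headroom=0.0)
plan = assign_partitions(6, ["c1", "c2", "c3"])
```

Under these assumptions, throughput grows by adding partitions and consumer instances rather than by making a single poller faster. The database side would need a similar estimate for read I/O.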

Sayan

You can use different queues for each channel. For example, an EmailNotification can use an EmailMessagingQueue, and so on. This way, the load will be distributed across multiple queues, preventing a single point of failure. At extremely large scale, we should also create separate services such as an EmailNotificationService and a PushNotificationService.
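
A minimal in-memory sketch of the per-channel queue idea, assuming each notification carries a channel name. `ChannelRouter`, `Notification`, and the channel names are hypothetical; a real deployment would use a message broker rather than `queue.Queue`.

```python
import queue
from dataclasses import dataclass


@dataclass
class Notification:
    user_id: str
    channel: str  # e.g. "email" or "push" (assumed channel names)
    body: str


class ChannelRouter:
    """Routes each notification to a queue dedicated to its channel."""

    def __init__(self, channels: list[str]) -> None:
        self._queues = {name: queue.Queue() for name in channels}

    def publish(self, notification: Notification) -> None:
        try:
            target = self._queues[notification.channel]
        except KeyError:
            raise ValueError(f"unknown channel: {notification.channel}") from None
        target.put(notification)

    def drain(self, channel: str) -> list[Notification]:
        """Return everything currently waiting on one channel's queue."""
        target = self._queues[channel]
        items = []
        while True:
            try:
                items.append(target.get_nowait())
            except queue.Empty:
                return items


router = ChannelRouter(["email", "push"])
router.publish(Notification("u1", "email", "Welcome"))
router.publish(Notification("u1", "push", "You have a new follower"))
```

Because each channel has its own queue, a slow email provider would back up only the email queue while push notifications keep flowing.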

Deepak Kumar

Let’s talk about implementing the notification service with the best design, architecture, and patterns.

I would like to do it in Rust.

Anyone interested in tagging along?

I am thinking of open-sourcing such components so that many others can benefit.

Obayd

Let me know if you want to do it with Java.

Sumeet Chandra

Ashish,

This is awesome content. Just a few points to add, if you agree:

1) Syncing mechanisms between the cache and the User Preferences NoSQL DB to ensure consistency

2) What the cache payload looks like, along with the key

3) Cache TTL, eviction strategy, invalidation strategy, cache writing strategy, and how to deal with a data-level conflict (a discrepancy between the current cache entry and the NoSQL DB entry)

4) Deduplication of notifications so that users don’t get the same notification again
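
One possible shape for points 1 to 3, sketched as a cache-aside lookup with a TTL and invalidate-on-write. The key format `prefs:<user_id>`, the 300-second TTL, and the `PreferenceCache` class are assumptions for illustration, and a plain dict stands in for the NoSQL DB.

```python
import time
from typing import Callable, Optional


class PreferenceCache:
    """Cache-aside reads with a TTL; writes go to the DB and invalidate the cache."""

    def __init__(self, db: dict, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._db = db            # stand-in for the User Preferences DB
        self._ttl = ttl_seconds  # assumed TTL, tune per workload
        self._clock = clock
        self._entries: dict[str, tuple[float, dict]] = {}  # key -> (expires_at, payload)

    @staticmethod
    def key(user_id: str) -> str:
        return f"prefs:{user_id}"  # hypothetical key format

    def get(self, user_id: str) -> Optional[dict]:
        key = self.key(user_id)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # fresh cache hit
        prefs = self._db.get(user_id)            # miss or expired: read the DB
        if prefs is None:
            self._entries.pop(key, None)
            return None
        payload = dict(prefs)                    # copy so later DB edits don't leak in
        self._entries[key] = (now + self._ttl, payload)
        return payload

    def update(self, user_id: str, prefs: dict) -> None:
        self._db[user_id] = dict(prefs)              # write the source of truth first
        self._entries.pop(self.key(user_id), None)   # then drop the stale cache entry
```

With this approach, a conflict between cache and DB can last at most one TTL if a write bypasses `update`, and the DB value always wins on the next reload.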

Sayan

Great content, Ashish. Really enjoyed the article.

Abhigna Ogirala

I just started learning system design. One question I have: is Publish/Subscribe used here?

For example, once the notification has been sent to the channel processor, can we send it from there to the client using Publish/Subscribe?
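
For readers new to the pattern, here is a minimal in-process sketch of Publish/Subscribe. It only illustrates the mechanism and does not claim that the article's design uses it between the channel processor and the client. `Broker` and the topic name are hypothetical.

```python
from collections import defaultdict
from typing import Callable


class Broker:
    """Delivers each published message to every handler subscribed to the topic."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Publish a message and return how many subscribers received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
inbox_a: list[str] = []
inbox_b: list[str] = []
broker.subscribe("push.user-42", inbox_a.append)  # hypothetical topic name
broker.subscribe("push.user-42", inbox_b.append)
delivered = broker.publish("push.user-42", "Your order has shipped")
```

The publisher does not know who the subscribers are, which is the main difference from pushing directly to a single known recipient.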

Saurabh Singh

Which DB is used for what is unclear to me. For example, I don't know whether the scheduler service fetches from SQL or NoSQL.

Pradeep Kumar

Very nicely explained, especially the details of the request message, the types of DB usage, and the way the notification service crafts the message for each individual channel. Learning a lot from you.

vishalyadav

How rate limiting based on user preferences is performed is not clearly described in the above design. The flow in which the service checks the number of notifications sent to a user and compares it with that user's preferences is absent, as are the bottlenecks related to it.
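
One way the missing flow could look, sketched as a per-user sliding-window counter checked against a limit taken from the user's preferences. The window length, the limit, and `UserRateLimiter` are assumptions. A shared store would be needed once several service instances run in parallel, which is where a bottleneck could appear.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class UserRateLimiter:
    """Allows at most max_per_window notifications per user in a sliding window."""

    def __init__(self, window_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._window = window_seconds
        self._clock = clock
        self._sent: dict[str, deque] = defaultdict(deque)  # user_id -> send timestamps

    def allow(self, user_id: str, max_per_window: int) -> bool:
        """Record and allow a send if the user is under their preferred limit."""
        now = self._clock()
        sent = self._sent[user_id]
        # Drop timestamps that have fallen out of the window.
        while sent and sent[0] <= now - self._window:
            sent.popleft()
        if len(sent) >= max_per_window:
            return False
        sent.append(now)
        return True


# Hypothetical preference: at most 2 notifications per hour for this user.
limiter = UserRateLimiter(window_seconds=3600.0)
first = limiter.allow("u1", max_per_window=2)
second = limiter.allow("u1", max_per_window=2)
third = limiter.allow("u1", max_per_window=2)
```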

Harvey

That's what I was thinking: if a failure occurs, will we lose all the data/events that were supposed to be picked up by the Notification Service? Can anyone help here?
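
A common way to avoid losing events is for the queue to keep a message until the consumer acknowledges it. Below is a minimal in-memory sketch of that acknowledgement flow, assuming at-least-once delivery is acceptable. `AckQueue` is hypothetical and holds data only in memory, whereas a real broker would also persist messages to disk.

```python
from collections import deque
from typing import Optional


class AckQueue:
    """A message leaves the queue only after the consumer acknowledges it."""

    def __init__(self) -> None:
        self._pending: deque = deque()          # waiting to be delivered
        self._in_flight: dict[int, str] = {}    # delivered but not yet acked
        self._next_id = 0

    def publish(self, payload: str) -> int:
        msg_id = self._next_id
        self._next_id += 1
        self._pending.append((msg_id, payload))
        return msg_id

    def receive(self) -> Optional[tuple[int, str]]:
        if not self._pending:
            return None
        msg_id, payload = self._pending.popleft()
        self._in_flight[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id: int) -> None:
        """Processing succeeded: forget the message."""
        self._in_flight.pop(msg_id, None)

    def nack(self, msg_id: int) -> None:
        """Processing failed: put the message back at the front for redelivery."""
        payload = self._in_flight.pop(msg_id)
        self._pending.appendleft((msg_id, payload))
```

If the consumer crashes between `receive` and `ack`, the message is still tracked and can be redelivered. The trade-off is possible duplicates, so handlers should be idempotent.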

saiteja

I learnt a lot. One query: if the channel processor fails to deliver the notification for some reason, will it take responsibility for retrying, or can it just update the scheduler database, setting the scheduled time to the next retry time using an exponential backoff strategy, so that the scheduler picks it up for the next retry? What would be the expected approach?

Thanks.
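
The second option in the question (rescheduling through the scheduler database) could be sketched as below. The record fields, the 2-second base delay, the one-hour cap, and the limit of 5 attempts are all assumptions for illustration, not values from the article.

```python
def next_retry_delay(attempt: int, base_seconds: float = 2.0,
                     cap_seconds: float = 3600.0) -> float:
    """Exponential backoff: base, 2*base, 4*base, ... capped at cap_seconds."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    return min(cap_seconds, base_seconds * 2 ** (attempt - 1))


def reschedule(record: dict, now: float, max_attempts: int = 5) -> dict:
    """Return the updated scheduler row after a failed delivery attempt."""
    attempts = record["attempts"] + 1
    if attempts >= max_attempts:
        # Give up: mark as failed so the scheduler stops picking it up.
        return {**record, "attempts": attempts, "status": "failed",
                "scheduled_at": None}
    return {**record, "attempts": attempts, "status": "scheduled",
            "scheduled_at": now + next_retry_delay(attempts)}


# Hypothetical scheduler row for a notification whose first delivery failed.
row = {"id": "n-1", "attempts": 0, "status": "scheduled", "scheduled_at": 100.0}
row = reschedule(row, now=100.0)
```

Keeping the retry state in the scheduler database means a crashed channel processor does not lose the retry. Adding random jitter to the delay is a common refinement to avoid synchronized retries.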
