25 Comments
Tanner Barcelos's avatar

This was quite amazing and super super helpful. I am an experienced coder but have not had many opportunities to design high scale systems at work or in practice so this is going to help me a ton.

Ashish Pratap Singh's avatar

Really happy to hear this, thank you so much!

Vikas's avatar

Nice one Ashish 👍🏻

Raul Junco's avatar

Excellent study case, Ashish!

Srinu Nampalli's avatar

Impressive! You are truly a gem. Thank you for dedicating your time and sharing such valuable content with crystal-clear clarity.

Ashish Pratap Singh's avatar

Thank you 😊

Love to hear this.

Grend's avatar

Great content brother

Ashish Pratap Singh's avatar

thank you!

Deepak Katariya's avatar

Thanks for sharing.

Ashish Pratap Singh's avatar

you are welcome!

Deepak Kumar's avatar

Let’s talk about implementing the notification service with the best design, architecture, and patterns.

I would like to do it in Rust.

Is anyone interested in tagging along?

I am thinking of open-sourcing such components so that many others can benefit.

Obayd's avatar

Let me know if you want to do it with Java.

Sumeet Chandra's avatar

Ashish,

This is awesome content. Just a few points to add, if you agree:

1) Syncing mechanisms between the cache and the user preferences NoSQL DB to ensure consistency

2) What the cache payload looks like, along with the key

3) Cache TTL, eviction strategy, invalidation strategy, cache write strategy, and handling data-level conflicts (a discrepancy between the current cache entry and the NoSQL DB entry)

4) Deduplication of notifications so that users don’t get the same notification again
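
As an illustration of point 4 only, here is a minimal in-memory sketch of deduplication by key. It is not taken from the article. The class name `Deduplicator`, the key shape `(user_id, event_id, channel)`, and the TTL value are all assumptions, and a real deployment would more likely keep these keys in a shared store with expiry.

```python
import time


class Deduplicator:
    """Remembers recently seen notification keys for ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so the behaviour can be tested
        self._seen = {}             # key -> expiry timestamp

    def should_send(self, user_id, event_id, channel):
        """Return True the first time a key is seen within the TTL, False for repeats."""
        now = self.clock()
        # Drop expired keys so the map does not grow without bound.
        self._seen = {k: exp for k, exp in self._seen.items() if exp > now}
        key = (user_id, event_id, channel)  # assumed dedup key
        if key in self._seen:
            return False
        self._seen[key] = now + self.ttl_seconds
        return True
```

The same event can still go out on a different channel, because the channel is part of the key. Whether that is desirable depends on the product.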

Saurabh Singh's avatar

Which DB is used for what is unclear to me. For example, I don't know whether the scheduler service fetches from SQL or NoSQL.

Himanshu Sharma's avatar

I'm really interested to know how you would scale the consumers, given an average of 17,000 notifications/second, and let's assume 10k are being consumed at the same time. How will the poller/consumer that picks notifications from the topic scale, and how does that affect the database choice and the I/O for fetching data from the DB?

Pradeep Kumar's avatar

Very nicely explained, especially the details of the request message, the types of DB usage, and the way the notification service crafts the message for each individual channel. Learning a lot from you.

vishalyadav's avatar

How rate limiting based on user preferences is performed is not clearly described in the design above. The flow in which the service checks the number of notifications sent to a user and compares it with that user's preferences is absent, as are the bottlenecks related to it.
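
One possible shape for that check, as a sketch under assumptions rather than the article's design: a sliding-window counter per user, compared against a limit read from the user's preferences. The names `PerUserRateLimiter`, `allow`, and `max_per_window` are hypothetical, and the in-memory map stands in for whatever shared store a real system would use.

```python
import time
from collections import defaultdict, deque


class PerUserRateLimiter:
    """Sliding-window limiter keyed by user (hypothetical helper)."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock                  # injectable for testing
        self._sent = defaultdict(deque)     # user_id -> timestamps of recent sends

    def allow(self, user_id, max_per_window):
        """Record and allow a send unless the user already hit max_per_window."""
        now = self.clock()
        sent = self._sent[user_id]
        # Forget sends that have fallen out of the window.
        while sent and sent[0] <= now - self.window_seconds:
            sent.popleft()
        if len(sent) >= max_per_window:
            return False
        sent.append(now)
        return True
```

Here `max_per_window` would come from the user's stored preferences. The bottleneck question still stands, since every send now needs a read and a write against the counter store.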

Harvey's avatar

That's what I think too. If a failure occurs, will we lose all the data/events that were supposed to be picked up by the Notification Service? Can anyone help here?

saiteja's avatar

I learnt a lot. One query: if the channel processor fails to deliver the notification for some reason, will it take responsibility for retrying, or can it just update the scheduler database, setting the scheduled time to the next attempt using an exponential backoff strategy, so that the scheduler picks it up for the next retry? What would be the expected approach?

Thanks.
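
For reference, the second option in this question (write a new scheduled time and let the scheduler retry) could be sketched as below. The function names, the 2-second base, the 300-second cap, and the limit of 5 attempts are illustrative assumptions, not values from the article.

```python
def next_retry_delay(attempt, base_seconds=2.0, cap_seconds=300.0):
    """Delay before retry number `attempt` (1-based), doubling each time up to a cap."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(cap_seconds, base_seconds * 2 ** (attempt - 1))


def reschedule(now, attempt, max_attempts=5):
    """Return the next scheduled time, or None once retries are exhausted."""
    if attempt > max_attempts:
        return None  # caller could move the notification to a dead-letter store
    return now + next_retry_delay(attempt)
```

Production systems usually add random jitter to the delay so that many failed notifications do not retry at the same instant. It is left out here to keep the sketch deterministic.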

Obayd's avatar

Thank you so much for this amazing explanation. Can this notification system design fit my situation, where I have 30k IoT devices and users can create custom conditions? For example, user1 has device1 and device2 and may create condition1 (for example, notify me if the temperature > 89) for device1 and condition2 (temperature > 60) for device2.
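
To make the example concrete, the per-device conditions could be modelled roughly as follows. This is a sketch under assumptions: `Condition`, `matching_conditions`, and the field names are hypothetical, and a matching condition would then be handed to the notification system as an ordinary event.

```python
import operator
from dataclasses import dataclass

# Supported comparison operators (assumed set).
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


@dataclass(frozen=True)
class Condition:
    user_id: str
    device_id: str
    metric: str
    op: str
    threshold: float


def matching_conditions(conditions, device_id, metric, value):
    """Return the conditions that a single device reading satisfies."""
    return [
        c for c in conditions
        if c.device_id == device_id
        and c.metric == metric
        and OPS[c.op](value, c.threshold)
    ]
```

With 30k devices, a linear scan over all conditions per reading is wasteful, so indexing conditions by `device_id` would be a natural next step.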

Pradeep Kumar's avatar

Yes, this will surely be a way to implement this use case. Try starting with a prototype for a few clients and then scale up to 30k.

Slava's avatar

Great post Ashish, well articulated main components and flow.

I noticed that message content generation is done by the Notification Service. Would it be more resilient and robust to put it behind an inbound queue, to manage spikes in load on the Notification Service? Each request to the Notification Service to generate a message will also hold a connection from the web server until the generated message is placed in the queue. What if an error occurs while generating the message content? Will the request be dropped?
