AlgoMaster Newsletter

System Design: What is Availability?

#22 - Availability

Ashish Pratap Singh
Jul 24, 2024
In this blog, we'll explore the concept of availability, availability tiers, strategies to improve availability, and best practices for achieving high availability.

What is Availability?

Availability refers to the proportion of time a system is operational and accessible when required.

It is usually expressed as a percentage, indicating the system's uptime over a specific period.

The formal definition of availability is:

Availability = Uptime / (Uptime + Downtime)

Uptime: The period during which a system is functional and accessible.

Downtime: The period during which a system is unavailable due to failures, maintenance, or other issues.
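As a quick worked example, the formula above can be computed directly. This is a minimal sketch; the function name `availability` and the 30-day month with 43.2 minutes of downtime are illustrative assumptions, not figures from a real system.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return availability as a fraction between 0 and 1."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


# Hypothetical 30-day month (720 hours) with 43.2 minutes of downtime.
downtime = 43.2 / 60       # hours of downtime
uptime = 720 - downtime    # remaining hours counted as uptime

# Prints the availability as a percentage (roughly 99.9%).
print(f"{availability(uptime, downtime):.4%}")
```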




Availability Tiers

Availability is often expressed in "nines". The higher the availability, the less downtime there is.

Each additional "nine" cuts the allowed downtime by a factor of ten.

Example: 99.99% availability allows roughly one-tenth the downtime of 99.9% (about 52.6 minutes per year versus about 8.76 hours per year).
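To make the tiers concrete, here is a small sketch that converts an availability percentage into the downtime it allows per year. The helper name `allowed_downtime_minutes` is hypothetical, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Downtime budget per year, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for tier in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(tier)
    print(f"{tier}% -> {minutes:,.1f} minutes/year ({minutes / 60:,.2f} hours)")
```

Each step down the list shrinks the yearly downtime budget by a factor of ten: about 3.65 days at two nines, 8.76 hours at three, 52.6 minutes at four, and 5.3 minutes at five.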


Strategies for Improving Availability

1. Redundancy

Redundancy involves having backup components that can take over when primary components fail.

Techniques:

  • Server Redundancy: Deploying multiple servers to handle requests, ensuring that if one server fails, others can continue to provide service.

  • Database Redundancy: Creating a replica database that can take over if the primary database fails.

  • Geographic Redundancy: Distributing resources across multiple geographic locations to mitigate the impact of regional failures.
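A rough way to see why redundancy helps is to estimate the combined availability of several replicas. The sketch below assumes that replicas fail independently and that one healthy replica is enough to serve traffic; real systems often share failure causes (power, network, bad deploys), so treat the result as an upper bound. The function name is hypothetical.

```python
def parallel_availability(component_availability: float, replicas: int) -> float:
    """Availability of N redundant replicas.

    Assumes independent failures and that a single healthy replica
    can serve all traffic.
    """
    if replicas < 1:
        raise ValueError("need at least one replica")
    # The system is down only if every replica is down at the same time.
    return 1 - (1 - component_availability) ** replicas


for n in (1, 2, 3):
    print(f"{n} replica(s) at 99% each -> {parallel_availability(0.99, n):.6%}")
```

Under these assumptions, two 99% servers together reach about 99.99%, and three reach about 99.9999%.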

2. Load Balancing

Load balancing distributes incoming network traffic across multiple servers to ensure no single server becomes a bottleneck, enhancing both performance and availability.

Techniques:

  • Hardware Load Balancers: Physical devices that distribute traffic based on pre-configured rules.

  • Software Load Balancers: Software solutions that manage traffic distribution, such as HAProxy, Nginx, or cloud-based solutions like AWS Elastic Load Balancer.
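The core idea can be sketched as a round-robin balancer that skips servers marked unhealthy. This is an illustrative toy, not how HAProxy, Nginx, or AWS Elastic Load Balancer are implemented; the class name, method names, and server names are all assumptions.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Rotate through servers in order, skipping any marked as down."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Try each server at most once per call before giving up.
        for _ in range(len(self._servers)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
balancer.mark_down("app-2")
print([balancer.next_server() for _ in range(4)])
# ['app-1', 'app-3', 'app-1', 'app-3']
```

Because the unhealthy server is skipped rather than removed, it rejoins the rotation as soon as `mark_up` is called.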

3. Failover Mechanisms

Failover mechanisms automatically switch to a redundant system when a failure is detected.

Techniques:

  • Active-Passive Failover: A primary active component is backed by a passive standby component that takes over upon failure.

  • Active-Active Failover: All components are active and share the load. If one fails, the remaining components absorb its traffic, provided they have enough spare capacity.
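Active-passive failover can be sketched in a few lines. The example below is a simplified in-process model: the `Node` and `ActivePassivePair` classes and the node names are hypothetical, and failure detection is reduced to catching a `ConnectionError` on the request itself.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class ActivePassivePair:
    """Send traffic to the active node; promote the standby on failure."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby

    def handle(self, request):
        try:
            return self.active.handle(request)
        except ConnectionError:
            # Failover: swap roles so the standby becomes the active node.
            self.active, self.standby = self.standby, self.active
            # If the promoted node is also down, the error propagates.
            return self.active.handle(request)


pair = ActivePassivePair(Node("db-primary"), Node("db-standby"))
print(pair.handle("read"))    # db-primary handled read
pair.active.healthy = False   # simulate a primary failure
print(pair.handle("read"))    # db-standby handled read
```

In production, failover is usually triggered by a separate health monitor rather than by a failed request, so that clients do not have to hit the failure first.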

4. Data Replication

Data replication involves copying data from one location to another to ensure that data is available even if one location fails.

Techniques:

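  • Synchronous Replication: A write is acknowledged only after the replica has also received it. This keeps locations consistent at the cost of higher write latency.

  • Asynchronous Replication: A write is acknowledged immediately and shipped to the replica afterwards. This is faster, but the replica can lag behind, and the most recent writes may be lost if the primary fails before they are replicated.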
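The difference between the two modes is easiest to see in a toy model. The sketch below uses in-memory dictionaries in place of real databases; the `Primary` and `Replica` classes and the explicit `flush` step are assumptions made for illustration.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet shipped (async mode only)

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: the write returns only after the replica has it.
            self.replica.apply(key, value)
        else:
            # Asynchronous: queue the change and return immediately.
            self.pending.append((key, value))

    def flush(self):
        """Ship queued changes to the replica (stands in for a background job)."""
        while self.pending:
            self.replica.apply(*self.pending.popleft())


replica = Replica()
primary = Primary(replica, synchronous=False)
primary.write("user:1", "alice")
print(replica.data)   # {} (the replica is lagging)
primary.flush()
print(replica.data)   # {'user:1': 'alice'}
```

If the primary crashed between `write` and `flush`, the queued change would never reach the replica, which is the data-loss window that asynchronous replication accepts in exchange for lower write latency.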

5. Monitoring and Alerts

Continuous health monitoring involves checking the status of system components to detect failures early and trigger alerts for immediate action.

Techniques:

  • Heartbeat Signals: Regular signals sent between components to check their status.

  • Health Checks: Automated scripts or tools that perform regular health checks on components.

  • Alerting Systems: Tools like PagerDuty or OpsGenie that notify administrators of detected issues.
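A heartbeat check boils down to remembering when each component was last heard from and flagging any that have been silent for too long. The sketch below is a minimal in-memory version; the `HeartbeatMonitor` class, the component names, and the 5-second timeout are assumptions, and a real setup would forward the result to an alerting tool.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat per component and report the silent ones."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout = timeout_seconds
        self.clock = clock          # injectable so the example can fake time
        self.last_seen = {}

    def record_heartbeat(self, component):
        self.last_seen[component] = self.clock()

    def unhealthy_components(self):
        now = self.clock()
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


# Use a fake clock so the example runs instantly and deterministically.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=5, clock=lambda: fake_now[0])
monitor.record_heartbeat("api")
monitor.record_heartbeat("db")

fake_now[0] = 4.0
monitor.record_heartbeat("api")   # "api" keeps reporting in

fake_now[0] = 8.0
print(monitor.unhealthy_components())  # ['db']
```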

Best Practices for High Availability

  1. Design for Failure: Assume that any component of your system can fail at any time and design your system accordingly.

  2. Implement Health Checks: Regular health checks allow you to detect and respond to issues before they become critical failures.

  3. Use Multiple Availability Zones: Distribute your system across different data centers so that a localized failure does not take down the whole system.

  4. Practice Chaos Engineering: Intentionally introduce failures to test system resilience.

  5. Implement Circuit Breakers: Prevent cascading failures by quickly cutting off problematic services.

  6. Use Caching Wisely: Caching can improve availability by reducing load on backend systems.

  7. Plan for Capacity: Ensure your system can handle both expected and unexpected load increases.
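Of the practices above, the circuit breaker is the one that maps most directly to code. The following is a minimal sketch, assuming a simple consecutive-failure threshold and a fixed reset timeout; the class names and default values are hypothetical, and production libraries typically add per-call timeouts, metrics, and thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of calling the struggling service.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success: close the circuit and reset the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile` function, and treat `CircuitOpenError` as a signal to return a fallback response.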

Availability is a critical aspect of system design that ensures users can access services reliably and continuously.

By implementing strategies like redundancy, load balancing, failover mechanisms, data replication, and monitoring, you can design highly available systems.


