Distributed Systems
Service Discovery
Service discovery is a mechanism that automatically determines the location and accessibility of services in a distributed system (e.g., microservices architectures, cloud environments, or Kubernetes). It is essential in dynamic environments where addresses and the number of instances change frequently and static configuration would be impractical. Benefits (a minimal sketch follows the list):
- Automatic discovery and connection of services
- Load balancing and scaling
- Increased reliability and fault tolerance
- No manual configuration of IPs and ports required
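As a rough illustration, the core of a registry might look like the following sketch, assuming a simple in-memory store (class and method names are made up; real registries add health checks, TTLs, and replication):

import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical in-memory registry: services register their instances,
// clients look up a current address instead of using static configuration.
public class ServiceRegistry {
    private final Map<String, List<String>> instances = new ConcurrentHashMap<>();

    // Called by a service instance on startup (and periodically as a heartbeat).
    public void register(String serviceName, String address) {
        instances.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    // Called when an instance shuts down or fails its health check.
    public void deregister(String serviceName, String address) {
        List<String> list = instances.get(serviceName);
        if (list != null) {
            list.remove(address);
        }
    }

    // Clients resolve a service name to the live addresses at request time.
    public List<String> lookup(String serviceName) {
        return instances.getOrDefault(serviceName, List.of());
    }
}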
Further information
Link: Service Discovery
Load Balancer
A system that distributes incoming requests across multiple servers. This spreads the load as evenly as possible, and a failed server can quickly be taken out of the rotation. Note that this flexibility is limited if, for example, sticky sessions are used.
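A minimal sketch of a round-robin strategy (names are illustrative; it assumes a non-empty, fixed server list and ignores health checks):

import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Round robin: requests are spread evenly over the configured servers;
// a failed server can simply be removed from the list.
public class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    public String nextServer() {
        // floorMod keeps the index valid even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}

With sticky sessions, the index would instead be derived from a session ID, which is exactly what limits how freely requests can be redistributed.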
Further information
Link: Load Balancer
Circuit Breaker
When multiple services interact with each other, a single slow or broken service can bring the entire application to its knees. A circuit breaker is a mechanism that temporarily blocks calls to a malfunctioning service so that failures do not cascade. In principle, it works like a switch with three states (sketched in code after the list):
- Open: Incoming requests are blocked.
- Closed: Requests are passed through to the service as normal.
- Half open: To test whether the service has recovered, a limited number of requests is let through.
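A minimal sketch of this switch as a state machine (threshold and cool-down values are assumptions; production libraries add failure-rate windows and metrics):

import java.time.Duration;
import java.time.Instant;

public class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private State state = State.CLOSED;
    private int failures = 0;
    private Instant openedAt;

    private final int failureThreshold;
    private final Duration retryAfter;

    public CircuitBreaker(int failureThreshold, Duration retryAfter) {
        this.failureThreshold = failureThreshold;
        this.retryAfter = retryAfter;
    }

    public synchronized boolean allowRequest() {
        if (state == State.OPEN) {
            // After the cool-down period, let a trial request through (half open).
            if (Instant.now().isAfter(openedAt.plus(retryAfter))) {
                state = State.HALF_OPEN;
                return true;
            }
            return false; // open: the incoming request is blocked
        }
        return true; // closed or half open: the request passes through
    }

    public synchronized void recordSuccess() {
        failures = 0;
        state = State.CLOSED; // the trial call succeeded: close the circuit again
    }

    public synchronized void recordFailure() {
        failures++;
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN; // too many failures: block further calls
            openedAt = Instant.now();
        }
    }
}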
Further information
Link: Circuit Breaker
Retry Mechanism
Automatically retries a failed request before throwing an error, since networks are not 100% reliable.
e.g. Spring Retry
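A hedged sketch of how this looks with Spring Retry (the service, method, and exception details are made up; it also requires @EnableRetry on a configuration class and spring-retry plus AOP on the classpath):

import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

@Service
public class InventoryClient { // hypothetical service name

    // Retry up to 3 times with a 1 second pause between attempts
    // before the failure is propagated.
    @Retryable(maxAttempts = 3, backoff = @Backoff(delay = 1000))
    public String fetchStock(String productId) {
        return remoteCall("/stock/" + productId);
    }

    // Fallback that is invoked once all retry attempts are exhausted.
    @Recover
    public String fallback(Exception e, String productId) {
        return "unknown";
    }

    private String remoteCall(String path) {
        // placeholder for a remote call that may fail transiently
        throw new IllegalStateException("network error");
    }
}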
Further information
Link: Spring Retry
API Gateway
Instead of a client having to connect to multiple services directly, communication can be routed centrally through an API gateway. The client connects to the gateway, which forwards each request to the appropriate service. This has the following advantages (a routing sketch follows the list):
- The gateway can aggregate multiple responses from the services.
- Caching can be done centrally.
- Error handling can take place in a single location.
- Authorization and monitoring can be performed more easily at the gateway.
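A minimal sketch of the routing core of such a gateway (the service names and URLs are invented; aggregation, caching, and authorization are only marked as comments):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;

public class GatewayRouter {
    private final HttpClient client = HttpClient.newHttpClient();

    // Routing table: which service owns which path prefix (example values).
    private final Map<String, String> routes = Map.of(
            "/orders", "http://orders-service:8080",
            "/users", "http://users-service:8080");

    public String forward(String path) throws Exception {
        String backend = routes.entrySet().stream()
                .filter(e -> path.startsWith(e.getKey()))
                .map(Map.Entry::getValue)
                .findFirst()
                .orElseThrow(() -> new IllegalArgumentException("no route for " + path));

        // Single place for cross-cutting concerns: authorization checks,
        // caching, and monitoring could be inserted before this call.
        HttpRequest request = HttpRequest.newBuilder(URI.create(backend + path)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}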
Further information
Link: API Gateway
Rate Limiting
Mechanism that limits the number of requests per unit of time to protect systems from overload, abuse, or DDoS attacks.
Further information
Link: Rate Limiting
Useful for
- Protects APIs and servers from overload
- Prevents abuse (e.g., brute-force attacks)
- Ensures fairness (no single user dominates the system)
- Saves resources and costs in cloud environments
How does rate limiting work?
- A user or service sends a request.
- The rate limiter checks whether the request exceeds the limit.
- If the limit is exceeded: Error message (429 Too Many Requests)
- If not: The request is processed.
Algorithms
Fixed Window
Divides time into fixed intervals (e.g., 100 requests per minute). Problem: around a window boundary, nearly twice the limit can get through in a short time (the end of one window plus the start of the next).
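A minimal fixed-window counter as a sketch (parameter names are assumptions; a false return would map to the 429 Too Many Requests response in the flow above):

public class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = System.currentTimeMillis();
    private int count = 0;

    public FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window begins: reset the counter
            count = 0;
        }
        return ++count <= limit;
    }
}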
Sliding Window
Counts requests over a sliding period (e.g., a maximum of 100 requests within the last 60 seconds). Advantage: more even load distribution than a fixed window.
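A sketch of the sliding-window-log variant (it stores one timestamp per request, so memory grows with the limit):

import java.util.ArrayDeque;
import java.util.Deque;

public class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean allow() {
        long now = System.currentTimeMillis();
        // Drop timestamps that have left the sliding window.
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= limit) {
            return false; // limit reached within the last windowMillis
        }
        timestamps.addLast(now);
        return true;
    }
}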
Token Bucket
A bucket is continuously refilled with tokens at a specific rate. Each request consumes one token. When the bucket is empty, requests are blocked. Advantage: allows short-term bursts (e.g., sudden traffic spikes).
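A sketch of a token bucket (rates and names are assumptions; starting with a full bucket is what permits the initial burst):

public class TokenBucketLimiter {
    private final double capacity;
    private final double refillPerSecond;
    private double tokens;
    private long lastRefill = System.nanoTime();

    public TokenBucketLimiter(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // start full, so short bursts are allowed
    }

    public synchronized boolean allow() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefill) / 1_000_000_000.0;
        // Refill continuously, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0; // consume one token for this request
            return true;
        }
        return false; // bucket empty: block the request
    }
}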
Leaky Bucket
Works like a token bucket, but requests are drained at a fixed rate, which smooths out large peak loads.
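A sketch of a leaky bucket modeled as a meter (capacity and rate are assumptions; the level drains at the fixed processing rate, and anything that would overflow is rejected):

public class LeakyBucketLimiter {
    private final double capacity;
    private final double leakPerSecond; // fixed processing rate
    private double level = 0.0;
    private long lastLeak = System.nanoTime();

    public LeakyBucketLimiter(double capacity, double leakPerSecond) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
    }

    public synchronized boolean allow() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastLeak) / 1_000_000_000.0;
        // The bucket drains at a constant rate, never below empty.
        level = Math.max(0.0, level - elapsedSeconds * leakPerSecond);
        lastLeak = now;
        if (level + 1.0 <= capacity) {
            level += 1.0; // the request fits into the bucket
            return true;
        }
        return false; // bucket full: the burst exceeds what can drain
    }
}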