The previous article covered the Bulkhead pattern. This one covers the sliding window.
It is the core internal logic of the CircuitBreaker in Resilience4j.
Most developers use @CircuitBreaker but don’t understand how the sliding window actually calculates the failure rate.
Let’s break it down clearly.
🔥 What is Sliding Window in Resilience4j?
The sliding window is the statistical window the CircuitBreaker uses to decide:
Should we OPEN the circuit or keep it CLOSED?
It calculates:
- Failure rate %
- Slow call rate %
- Total call count
These are based on the last N calls or the last N seconds.
Two Types of Sliding Windows
1️⃣ COUNT_BASED Sliding Window
Based on number of calls.
Example:
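import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig.SlidingWindowType;
CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slidingWindowType(SlidingWindowType.COUNT_BASED) // evaluate the last N calls
    .slidingWindowSize(10)                            // N = 10 calls
    .failureRateThreshold(50)                         // threshold in percent
    .build();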
Meaning:
- Observe the last 10 calls
- If 50% or more of them fail
- Circuit goes OPEN
Example Scenario:
Last 10 calls:
S F S F F S F F S F
Failures = 6
Failure rate = 60%
If threshold = 50% → Circuit OPEN
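The arithmetic in this scenario can be checked with a few lines of plain Java. This is only an illustrative sketch: `FailureRateExample`, `failureRate` and `shouldOpen` are hypothetical names and not part of Resilience4j, and the `>=` comparison is an assumption based on the library's documented "equal or greater than the threshold" rule.

```java
// Hypothetical helper, not part of Resilience4j.
// 'S' marks a successful call and 'F' a failed call.
final class FailureRateExample {

    /** Returns the failure rate in percent for outcomes such as "S F S F". */
    static float failureRate(String outcomes) {
        int total = 0;
        int failures = 0;
        for (char c : outcomes.toCharArray()) {
            if (c == 'S' || c == 'F') {   // ignore the separating spaces
                total++;
                if (c == 'F') {
                    failures++;
                }
            }
        }
        return total == 0 ? 0f : failures * 100f / total;
    }

    /** Assumption: the circuit opens when the rate is at or above the threshold. */
    static boolean shouldOpen(float failureRatePercent, float thresholdPercent) {
        return failureRatePercent >= thresholdPercent;
    }

    public static void main(String[] args) {
        float rate = failureRate("S F S F F S F F S F");
        System.out.println("Failure rate = " + rate + "%");           // Failure rate = 60.0%
        System.out.println("Open circuit? " + shouldOpen(rate, 50f)); // Open circuit? true
    }
}
```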
2️⃣ TIME_BASED Sliding Window
Based on time duration.
Example:
.slidingWindowType(SlidingWindowType.TIME_BASED)
.slidingWindowSize(10)
Meaning:
- Observe calls in the last 10 seconds
- Calculate the failure rate
- If the threshold is reached → OPEN
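To make the time-based idea concrete, here is a minimal sketch with one bucket per second in a circular array. It is not the Resilience4j implementation: `TimeBasedWindowSketch` and its methods are hypothetical, timestamps are passed in as plain non-negative, non-decreasing second counters so the example stays deterministic, and the rate is summed on read for simplicity instead of being kept pre-aggregated.

```java
import java.util.Arrays;

// Hypothetical sketch of a time-based window, not Resilience4j code.
final class TimeBasedWindowSketch {
    private final int windowSeconds;
    private final long[] bucketSecond; // which second each bucket currently holds
    private final int[] calls;         // calls recorded in that second
    private final int[] failures;      // failed calls recorded in that second

    TimeBasedWindowSketch(int windowSeconds) {
        this.windowSeconds = windowSeconds;
        this.bucketSecond = new long[windowSeconds];
        this.calls = new int[windowSeconds];
        this.failures = new int[windowSeconds];
        Arrays.fill(bucketSecond, -1L); // -1 marks an unused bucket
    }

    /** Records one call outcome at the given second. */
    void record(long nowSecond, boolean failed) {
        int i = (int) (nowSecond % windowSeconds);
        if (bucketSecond[i] != nowSecond) { // bucket holds an older second: reuse it
            bucketSecond[i] = nowSecond;
            calls[i] = 0;
            failures[i] = 0;
        }
        calls[i]++;
        if (failed) {
            failures[i]++;
        }
    }

    /** Failure rate in percent over the last windowSeconds, or -1 if no calls are in the window. */
    float failureRate(long nowSecond) {
        int total = 0;
        int failed = 0;
        for (int i = 0; i < windowSeconds; i++) {
            boolean inWindow = bucketSecond[i] >= 0 && bucketSecond[i] > nowSecond - windowSeconds;
            if (inWindow) {
                total += calls[i];
                failed += failures[i];
            }
        }
        return total == 0 ? -1f : failed * 100f / total;
    }
}
```

With a 10-second window, a failure recorded at second 0 still counts at second 9 but has dropped out by second 10, which is the "sliding" behaviour described above.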
🧠 How Sliding Window Internally Works
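Internally it maintains:
- A circular array (ring buffer) holding the outcomes of the last N calls, for count-based windows
- Per-second buckets (partial aggregations), for time-based windows
- Running totals of calls, failures and slow calls
On every new call:
- The oldest data expires
- The new result is added
- The failure rate is recalculated
- The OPEN/CLOSED decision is made
Each update is O(1): only the evicted entry and the new entry touch the running totals, so the window is never rescanned.
That makes it very efficient.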
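The expire, add and recalculate steps can be sketched for the count-based case as a ring buffer with a running failure total. This is an illustration under simplifying assumptions, not the library's code: `CountBasedWindowSketch` is a hypothetical class, it tracks only success or failure (no durations), and it uses `synchronized` for thread safety.

```java
// Hypothetical sketch of a count-based window, not Resilience4j code.
final class CountBasedWindowSketch {
    private final boolean[] failed; // ring buffer: true means the call failed
    private int next;               // slot that will be overwritten next (the oldest one)
    private int recorded;           // filled slots, capped at the window size
    private int failures;           // running total of failures inside the window

    CountBasedWindowSketch(int windowSize) {
        this.failed = new boolean[windowSize];
    }

    /** Records one outcome in O(1): evict the oldest entry, add the new one. */
    synchronized void record(boolean callFailed) {
        if (recorded == failed.length) {
            if (failed[next]) {
                failures--;          // the oldest entry leaves the window
            }
        } else {
            recorded++;
        }
        failed[next] = callFailed;
        if (callFailed) {
            failures++;
        }
        next = (next + 1) % failed.length;
    }

    /** Failure rate in percent, or -1 if nothing has been recorded yet. */
    synchronized float failureRate() {
        return recorded == 0 ? -1f : failures * 100f / recorded;
    }
}
```

Because the running total is adjusted on every write, reading the failure rate never has to walk the array.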
🎯 Important Configurations (Architect Level)
🔹 Minimum Number of Calls
.minimumNumberOfCalls(5)
The circuit will not evaluate the failure rate until at least 5 calls have been recorded.
This avoids false positives in low-traffic systems.
🔹 Failure Rate Threshold
.failureRateThreshold(50)
If failure % ≥ threshold → OPEN
🔹 Slow Call Rate Threshold
.slowCallRateThreshold(60)
.slowCallDurationThreshold(Duration.ofSeconds(2))
If at least 60% of calls take longer than 2 seconds → OPEN
This protects against latency spikes.
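The slow call rate is computed the same way as the failure rate, just with a duration check instead of an error check. The sketch below uses only the JDK. `SlowCallRateExample` and `slowCallRate` are hypothetical names, and it assumes a call counts as slow only when its duration is strictly greater than the configured threshold.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical helper, not part of Resilience4j.
final class SlowCallRateExample {

    /** Percentage of calls whose duration exceeds slowThreshold (assumed strict comparison). */
    static float slowCallRate(List<Duration> durations, Duration slowThreshold) {
        if (durations.isEmpty()) {
            return 0f;
        }
        long slow = durations.stream()
                .filter(d -> d.compareTo(slowThreshold) > 0)
                .count();
        return slow * 100f / durations.size();
    }

    public static void main(String[] args) {
        List<Duration> durations = List.of(
                Duration.ofMillis(500),
                Duration.ofMillis(2500),
                Duration.ofMillis(3000),
                Duration.ofMillis(2100),
                Duration.ofMillis(1000));
        float rate = slowCallRate(durations, Duration.ofSeconds(2));
        System.out.println("Slow call rate = " + rate + "%"); // Slow call rate = 60.0%
        System.out.println("Open circuit? " + (rate >= 60f)); // Open circuit? true
    }
}
```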
🏦 Real Banking Example (APS Context)
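Let’s say the Loan SOR is configured with:
- Sliding window size = 20 calls
- Failure threshold = 40%
- Minimum calls = 10
If the last 20 calls contain:
- 7 failures
- Failure rate = 35%
Circuit remains CLOSED.
With 8 failures:
- Failure rate = 40%
- Circuit OPEN, because Resilience4j opens when the rate is equal to or greater than the threshold
With 9 failures:
- Failure rate = 45%
- Circuit OPEN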
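The Loan SOR settings can be turned into a small decision function to see how the minimum-calls gate and the threshold interact. This is a sketch, not library code: `LoanSorThresholdExample`, `Decision` and `evaluate` are hypothetical, and the `>=` comparison reflects the assumption that Resilience4j opens when the rate is equal to or greater than the threshold.

```java
// Hypothetical decision helper mirroring the Loan SOR settings; not Resilience4j code.
final class LoanSorThresholdExample {
    static final int MINIMUM_CALLS = 10;
    static final float FAILURE_THRESHOLD_PERCENT = 40f;

    enum Decision { NOT_EVALUATED, STAY_CLOSED, OPEN }

    /** recordedCalls is the number of calls currently in the window (at most 20 here). */
    static Decision evaluate(int recordedCalls, int failures) {
        if (recordedCalls < MINIMUM_CALLS) {
            return Decision.NOT_EVALUATED; // too little data to judge
        }
        float rate = failures * 100f / recordedCalls;
        // Assumption: reaching the threshold exactly is enough to open.
        return rate >= FAILURE_THRESHOLD_PERCENT ? Decision.OPEN : Decision.STAY_CLOSED;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(9, 9));  // NOT_EVALUATED (below minimum calls)
        System.out.println(evaluate(20, 7)); // STAY_CLOSED (35%)
        System.out.println(evaluate(20, 9)); // OPEN (45%)
    }
}
```

Note that nine failures out of nine calls still does not open the circuit here, because the minimum of 10 calls has not been reached.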
Difference Between Count vs Time Based
| Feature | COUNT_BASED | TIME_BASED |
|---|---|---|
| Best For | Stable traffic | Variable traffic |
| Banking Core APIs | ✅ Good | ⚠️ Depends |
| High burst systems | ❌ Risky | ✅ Better |
| Predictability | High | Medium |