Sunday, 15 February 2026

Resilience4j - Sliding Window Protocol with CircuitBreaker

The previous article covered the Bulkhead pattern. This one looks at the sliding window.

The sliding window is the core internal logic of the CircuitBreaker in Resilience4j.

Most developers use @CircuitBreaker but don’t understand how the sliding window actually calculates the failure rate.

Let’s break it down clearly.



🔥 What is Sliding Window in Resilience4j?

The sliding window is the statistical window the CircuitBreaker uses to decide:

Should we OPEN the circuit or keep it CLOSED?

It calculates:

  • Failure rate %

  • Slow call rate %

  • Total calls count

These are calculated over the last N calls or the last N seconds.


📌 Two Types of Sliding Windows

1️⃣ COUNT_BASED Sliding Window

Based on number of calls.

Example:

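import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig.SlidingWindowType;

CircuitBreakerConfig config = CircuitBreakerConfig.custom()
    .slidingWindowType(SlidingWindowType.COUNT_BASED)
    .slidingWindowSize(10)       // evaluate the last 10 calls
    .failureRateThreshold(50)    // percent
    .build();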

Meaning:

  • Observe the last 10 calls

  • If 50% or more of them fail

  • The circuit goes OPEN


Example Scenario:

Last 10 calls:

S F S F F S F F S F

Failures = 6
Failure rate = 60%

If threshold = 50% → Circuit OPEN
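
The arithmetic in this scenario can be checked with a few lines of plain Java. This is only an illustrative sketch: the class and method names (FailureRateExample, failureRate) are made up for this post and are not part of Resilience4j, and the "S"/"F" string is just the notation used above.

```java
final class FailureRateExample {

    // Percentage of failed calls in a window written as "S F S ..." (S = success, F = failure).
    static float failureRate(String outcomes) {
        String[] calls = outcomes.trim().split("\\s+");
        int failures = 0;
        for (String call : calls) {
            if (call.equals("F")) {
                failures++;
            }
        }
        return failures * 100.0f / calls.length;
    }

    public static void main(String[] args) {
        float rate = failureRate("S F S F F S F F S F");
        float threshold = 50.0f;
        System.out.println("Failure rate = " + rate + "%");        // Failure rate = 60.0%
        // Assumption: the circuit trips when the rate is equal to or greater than the threshold.
        System.out.println(rate >= threshold ? "OPEN" : "CLOSED"); // OPEN
    }
}
```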


2️⃣ TIME_BASED Sliding Window

Based on time duration.

Example:

.slidingWindowType(SlidingWindowType.TIME_BASED)
.slidingWindowSize(10)

Meaning:

  • Observe the calls made in the last 10 seconds

  • Calculate the failure rate over those calls

  • If the threshold is reached → OPEN
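
To make the time-based idea concrete, here is a rough plain-Java sketch that keeps one bucket per second and only counts buckets that fall inside the window. It is not Resilience4j's actual code: the class TimeBasedWindowSketch and its methods are hypothetical, timestamps are passed in as non-negative epoch seconds so the behaviour is easy to follow, and the rate is summed on read for simplicity rather than kept as a running total.

```java
final class TimeBasedWindowSketch {
    private final int windowSeconds;
    private final long[] bucketSecond;   // epoch second each bucket currently holds
    private final int[] bucketCalls;
    private final int[] bucketFailures;

    TimeBasedWindowSketch(int windowSeconds) {
        if (windowSeconds < 1) {
            throw new IllegalArgumentException("windowSeconds must be >= 1");
        }
        this.windowSeconds = windowSeconds;
        this.bucketSecond = new long[windowSeconds];
        this.bucketCalls = new int[windowSeconds];
        this.bucketFailures = new int[windowSeconds];
        java.util.Arrays.fill(bucketSecond, -1L); // -1 marks an unused bucket
    }

    // Record one call outcome that happened at the given epoch second.
    void record(long epochSecond, boolean failed) {
        int i = (int) (epochSecond % windowSeconds);
        if (bucketSecond[i] != epochSecond) {
            // The slot still holds an older second: reset it before reuse.
            bucketSecond[i] = epochSecond;
            bucketCalls[i] = 0;
            bucketFailures[i] = 0;
        }
        bucketCalls[i]++;
        if (failed) {
            bucketFailures[i]++;
        }
    }

    // Failure rate in percent over the interval (now - window, now]; 0 if no calls.
    float failureRate(long nowSecond) {
        int calls = 0;
        int failures = 0;
        for (int i = 0; i < windowSeconds; i++) {
            boolean inWindow = bucketSecond[i] > nowSecond - windowSeconds
                    && bucketSecond[i] <= nowSecond;
            if (inWindow) {
                calls += bucketCalls[i];
                failures += bucketFailures[i];
            }
        }
        return calls == 0 ? 0f : failures * 100f / calls;
    }
}
```

With a 10-second window, calls recorded at second 100 stop counting once the clock reaches second 110, even if no new call arrives in between.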


🧠 How Sliding Window Internally Works

Internally it maintains:

  • Circular array (ring buffer) of the last N call outcomes for count-based

  • Per-second buckets (partial aggregations) for time-based

  • Atomic counters

Every new call:

  1. Old data expires

  2. New result added

  3. Failure rate recalculated

  4. Decision made

This is O(1) time complexity per update.

Very efficient.
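
The four steps above can be sketched for the count-based case with a small ring buffer and a running failure counter. Treat this as a simplified illustration and not the library's source: CountBasedWindowSketch and its members are names invented here, and it is not thread-safe. The point is that each update only touches one slot and one counter, which is where the O(1) cost comes from.

```java
final class CountBasedWindowSketch {
    private final boolean[] failed; // ring buffer of the last N outcomes
    private int next;               // slot the next outcome will overwrite
    private int recorded;           // outcomes currently held, capped at N
    private int failures;           // running total, updated on every write

    CountBasedWindowSketch(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        this.failed = new boolean[size];
    }

    void record(boolean callFailed) {
        if (recorded == failed.length) {
            // Step 1: the oldest outcome is evicted, so subtract it from the total.
            if (failed[next]) {
                failures--;
            }
        } else {
            recorded++;
        }
        // Step 2: the new result takes the freed slot.
        failed[next] = callFailed;
        if (callFailed) {
            failures++;
        }
        next = (next + 1) % failed.length;
    }

    // Step 3: the rate comes straight from the running totals, no rescan needed.
    float failureRate() {
        return recorded == 0 ? 0f : failures * 100f / recorded;
    }

    // Step 4: the decision, assuming "equal to or greater than" trips the circuit.
    boolean shouldOpen(float thresholdPercent) {
        return failureRate() >= thresholdPercent;
    }
}
```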


🎯 Important Configurations (Architect Level)

🔹 Minimum Number of Calls

.minimumNumberOfCalls(5)

The circuit will not evaluate the failure rate until at least 5 calls have been recorded in the window.

This avoids false positives in low-traffic systems.
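
Here is a small plain-Java sketch of that gating rule. The names (MinimumCallsGate, NOT_EVALUATED) are hypothetical. As far as I know, Resilience4j's metrics report -1 for the failure rate until enough calls have been recorded, and the sketch mirrors that idea.

```java
final class MinimumCallsGate {
    // Marker value meaning "too few calls to compute a meaningful rate".
    static final float NOT_EVALUATED = -1f;

    static float failureRate(int totalCalls, int failedCalls, int minimumNumberOfCalls) {
        if (totalCalls < minimumNumberOfCalls) {
            return NOT_EVALUATED;
        }
        return failedCalls * 100f / totalCalls;
    }

    static boolean shouldOpen(int totalCalls, int failedCalls,
                              int minimumNumberOfCalls, float thresholdPercent) {
        float rate = failureRate(totalCalls, failedCalls, minimumNumberOfCalls);
        // Assumption: the circuit trips when the rate is equal to or greater than the threshold.
        return rate != NOT_EVALUATED && rate >= thresholdPercent;
    }
}
```

With minimumNumberOfCalls = 5, four failures out of four calls do not open the circuit, while three failures out of five calls (60%) do open it at a 50% threshold.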


🔹 Failure Rate Threshold

.failureRateThreshold(50)

If failure % ≥ threshold → OPEN


🔹 Slow Call Rate Threshold

.slowCallRateThreshold(60)
.slowCallDurationThreshold(Duration.ofSeconds(2))

If 60% or more of the calls take > 2 seconds → OPEN

This protects against latency spikes.
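
The slow call rate works like the failure rate, except that a call is counted when its duration is above the configured threshold. A minimal sketch, with made-up names (SlowCallRateSketch, slowCallRate, shouldOpen) and durations passed in as a plain list instead of being measured:

```java
import java.time.Duration;
import java.util.List;

final class SlowCallRateSketch {

    // Percentage of calls whose duration is strictly above the slow-call threshold.
    static float slowCallRate(List<Duration> callDurations, Duration slowCallDurationThreshold) {
        if (callDurations.isEmpty()) {
            return 0f;
        }
        int slow = 0;
        for (Duration d : callDurations) {
            if (d.compareTo(slowCallDurationThreshold) > 0) {
                slow++;
            }
        }
        return slow * 100f / callDurations.size();
    }

    // Assumption: the circuit trips when the rate is equal to or greater than the threshold.
    static boolean shouldOpen(float slowCallRatePercent, float slowCallRateThreshold) {
        return slowCallRatePercent >= slowCallRateThreshold;
    }
}
```

For five calls taking 0.5 s, 3 s, 2 s, 2.5 s and 4 s with a 2-second threshold, three calls count as slow (the call that took exactly 2 s does not), giving a 60% slow call rate.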


🏦 Real Banking Example (APS Context)

Let’s say:

Loan SOR:

  • Sliding window size = 20 calls

  • Failure threshold = 40%

  • Minimum calls = 10

If the last 20 calls contain:

  • 7 failures

  • Failure rate = 35%

Circuit remains CLOSED.

But with 8 failures:

  • Failure rate = 40%

  • Threshold reached (equal to or greater than 40%), circuit OPEN
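
The Loan SOR settings above could be expressed as a Resilience4j configuration roughly like this. The class name LoanSorCircuitBreaker and the instance name "loanSor" are placeholders chosen for this example. The builder and registry calls are from the resilience4j-circuitbreaker module, which has to be on the classpath.

```java
import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig.SlidingWindowType;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

final class LoanSorCircuitBreaker {

    static CircuitBreaker create() {
        CircuitBreakerConfig config = CircuitBreakerConfig.custom()
            .slidingWindowType(SlidingWindowType.COUNT_BASED)
            .slidingWindowSize(20)       // evaluate the last 20 calls
            .failureRateThreshold(40)    // percent
            .minimumNumberOfCalls(10)    // no evaluation before 10 recorded calls
            .build();

        // "loanSor" is a hypothetical instance name used only for this example.
        return CircuitBreakerRegistry.of(config).circuitBreaker("loanSor");
    }
}
```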


🔄 Difference Between Count vs Time Based

Feature | COUNT_BASED | TIME_BASED
Best For | Stable traffic | Variable traffic
Banking Core APIs | ✅ Good | ⚠️ Depends
High burst systems | ❌ Risky | ✅ Better
Predictability | High | Medium
