Saturday, 20 June 2026

Blocking IO vs Non-Blocking IO Concepts

1. CPU Internal Model & Thread Context Switching

How CPU Executes Instructions

A CPU continuously performs:

+---------+    +---------+    +---------+
| Fetch   | -> | Decode  | -> | Execute |
+---------+    +---------+    +---------+

Fetch

CPU fetches instructions from memory/cache.

Decode

CPU understands what operation must be performed.

Execute

CPU executes the instruction.


What is a Thread Context Switch?

Assume CPU is executing Thread-A.

CPU
 |
 +--> Thread-A Running

Suddenly a higher-priority thread becomes runnable (or Thread-A's time slice expires).

CPU
 |
 +--> Save Thread-A State
 |
 +--> Load Thread-B State
 |
 +--> Execute Thread-B

The CPU must save:

  • Program Counter (PC)

  • Registers

  • Stack Pointer

  • Thread State

Then load another thread's state.

Cost of Context Switching

Context Switch

Save Current Thread
        +
Load New Thread
        +
CPU Cache Disturbance
        +
Scheduler Overhead

Result:

More Threads
      ↓
More Context Switches
      ↓
CPU Wastage
      ↓
Lower Throughput

This is why high-performance systems try to minimize unnecessary threads.
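The effect can be observed with a rough experiment. The sketch below is only an illustration under assumptions: the class name `ContextSwitchSketch`, the task count, and the pool sizes are hypothetical, and the timings it prints depend heavily on the machine, the JVM, and the OS scheduler. It runs the same tiny tasks on pools of different sizes; the work is identical, so any extra time with many threads comes from scheduling and contention rather than from the tasks themselves.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class ContextSwitchSketch {

    // Runs `tasks` tiny increments on a pool of `threads` threads
    // and returns how many increments completed.
    static long runTasks(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong counter = new AtomicLong();
        for (int i = 0; i < tasks; i++) {
            pool.execute(counter::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        int tasks = 200_000; // hypothetical workload size
        for (int threads : new int[] {1, 8, 256}) {
            long start = System.nanoTime();
            long done = runTasks(threads, tasks);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " threads: " + done
                    + " tasks in " + elapsedMs + " ms");
        }
    }
}
```

This is not a precise benchmark (there is no warm-up, and thread creation time is included), but it usually shows that adding threads beyond the core count does not make tiny tasks finish faster.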


2. Why Redis Uses Single Thread

Redis is famous for using a mostly single-threaded event loop.

Traditional Multi-thread Model

Request-1 --> Thread-1
Request-2 --> Thread-2
Request-3 --> Thread-3
Request-4 --> Thread-4

Problem:

Many Threads
      ↓
Many Context Switches
      ↓
CPU Overhead

Redis Model

Request-1
Request-2
Request-3
Request-4
      |
      v
+----------------+
| Single Event   |
| Loop Thread    |
+----------------+

Benefits:

  • No thread synchronization

  • Minimal context switching

  • Predictable latency

  • Better CPU cache utilization
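The "no thread synchronization" benefit can be sketched in Java. This is not how Redis is implemented (Redis is written in C); it is a minimal illustration, assuming a hypothetical `SingleThreadStore` class, of the idea that when exactly one thread executes every command, a plain `HashMap` needs no locks even if many client threads send commands at once.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleThreadStore {
    // Plain HashMap: safe only because one thread ever touches it.
    private final Map<String, String> data = new HashMap<>();
    // A single thread drains the command queue in arrival order.
    private final ExecutorService loop = Executors.newSingleThreadExecutor();

    // Queues a command for the loop thread and waits for its result.
    private <T> T runOnLoop(Callable<T> command) {
        try {
            return loop.submit(command).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public String get(String key) {
        return runOnLoop(() -> data.get(key));
    }

    // Read-modify-write without locks: commands never interleave.
    public long incr(String key) {
        return runOnLoop(() -> {
            long next = Long.parseLong(data.getOrDefault(key, "0")) + 1;
            data.put(key, Long.toString(next));
            return next;
        });
    }

    public void shutdown() {
        loop.shutdown();
    }

    // Four client threads each send 1000 increments to the same key.
    static long demo() {
        SingleThreadStore store = new SingleThreadStore();
        Thread[] clients = new Thread[4];
        for (int i = 0; i < clients.length; i++) {
            clients[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    store.incr("hits");
                }
            });
            clients[i].start();
        }
        try {
            for (Thread t : clients) {
                t.join();
            }
            return Long.parseLong(store.get("hits"));
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            store.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("hits = " + demo()); // prints "hits = 4000"
    }
}
```

No increment is lost, because the read-modify-write inside `incr` always runs to completion on the loop thread before the next command starts.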


CPU Cache Locality

Linked List

Node-A --> Node-B --> Node-C --> Node-D

Memory:

A ----- far ----- B ----- far ----- C

The CPU keeps jumping around RAM, so each node access is likely to be a cache miss.


Array-Based Structure

[A][B][C][D][E]

Memory is contiguous.

CPU Fetch
     ↓
Cache Line Loaded
     ↓
Multiple Elements Available

Advantages:

  • Better cache hit rate

  • Less RAM access

  • Faster execution

This principle is used heavily in Redis internals. (rameshvanka.blogspot.com)
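A small Java comparison illustrates the locality difference. It is a sketch under assumptions: the class name `LocalitySketch` and the element count are hypothetical, and the measured gap varies by JVM and hardware (freshly allocated list nodes may even sit close together on the heap). Both methods compute the same sum; the array version walks contiguous memory, while the `LinkedList` version follows a pointer to reach every node.

```java
import java.util.LinkedList;
import java.util.List;

public class LocalitySketch {

    // Contiguous memory: one cache line brings in several neighbors.
    static long sumArray(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    // Pointer chasing: each node (and boxed Integer) may live elsewhere.
    static long sumList(List<Integer> values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 2_000_000; // hypothetical size
        int[] array = new int[n];
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            array[i] = i;
            linked.add(i);
        }

        long t0 = System.nanoTime();
        long arraySum = sumArray(array);
        long t1 = System.nanoTime();
        long listSum = sumList(linked);
        long t2 = System.nanoTime();

        System.out.println("array sum " + arraySum + " in " + (t1 - t0) / 1_000 + " us");
        System.out.println("list  sum " + listSum + " in " + (t2 - t1) / 1_000 + " us");
    }
}
```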


3. Blocking I/O Architecture

What is Blocking I/O?

A thread waits until data becomes available.

Flow

Client
   |
   v
Socket Created
   |
   v
Dedicated Thread Assigned
   |
   v
Waiting For Data
   |
   v
Thread Blocked
   |
   v
Data Arrives
   |
   v
Process Request
   |
   v
Response

Example

Suppose 10,000 clients connect.

10,000 Clients
      ↓
10,000 Sockets
      ↓
10,000 Threads

Most threads are doing:

Waiting...
Waiting...
Waiting...
Waiting...

The CPU is mostly idle.

Memory is wasted, because every blocked thread still holds its own stack.
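A back-of-the-envelope calculation shows the scale of the memory cost. The sketch assumes a 1 MiB stack per thread, which is a common default on 64-bit JVMs and can be changed with `-Xss`; the class and method names are hypothetical. The figure is reserved address space, so the memory actually resident is usually lower.

```java
public class ThreadMemoryEstimate {

    // Stack space reserved by `threads` threads, in MiB.
    static long reservedStackMiB(long threads, long stackKiBPerThread) {
        return threads * stackKiBPerThread / 1024;
    }

    public static void main(String[] args) {
        long threads = 10_000;
        long stackKiB = 1024; // assumption: 1 MiB per thread
        System.out.println(threads + " threads x " + stackKiB + " KiB = "
                + reservedStackMiB(threads, stackKiB) + " MiB reserved for stacks");
        // prints "10000 threads x 1024 KiB = 10000 MiB reserved for stacks"
    }
}
```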


Blocking I/O Diagram

Client-1 ---> Thread-1 ---> Waiting
Client-2 ---> Thread-2 ---> Waiting
Client-3 ---> Thread-3 ---> Waiting
Client-4 ---> Thread-4 ---> Waiting

Problems:

  • High memory usage

  • Context switching overhead

  • Limited scalability

The traditional Tomcat thread-per-request model largely follows this pattern. (rameshvanka.blogspot.com)
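The thread-per-socket flow can be sketched with plain `java.net` sockets. This is a minimal echo server for illustration, not Tomcat's actual code; the class name `BlockingEchoServer` and the `echo: ` reply format are hypothetical. Both `accept()` and `readLine()` block the calling thread, which is why every client needs a thread of its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class BlockingEchoServer {

    // Accept loop: every new connection gets a dedicated thread.
    static void serve(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept(); // blocks until a client connects
                Thread worker = new Thread(() -> handle(client));
                worker.setDaemon(true);
                worker.start();
            } catch (IOException e) {
                return; // server socket was closed
            }
        }
    }

    // Runs on the client's own thread and blocks while the client is idle.
    static void handle(Socket client) {
        try (Socket socket = client;
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()));
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) { // blocked waiting for data
                out.println("echo: " + line);
            }
        } catch (IOException ignored) {
            // connection dropped; the thread simply ends
        }
    }

    // Starts a server on a free port, sends one line, returns the reply.
    static String roundTrip(String message) {
        try (ServerSocket server = new ServerSocket(0)) { // 0 = any free port
            Thread acceptor = new Thread(() -> serve(server));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket socket = new Socket(
                         InetAddress.getLoopbackAddress(), server.getLocalPort());
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(socket.getInputStream()))) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "echo: hello"
    }
}
```

With 10,000 mostly idle clients, this design would keep 10,000 threads parked inside `readLine()`.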


4. Non-Blocking I/O Architecture

Core Idea

Don't dedicate a thread per socket.

Instead:

One Thread
      ↓
Monitor Many Sockets
      ↓
Process Only Ready Sockets

Event Loop Model

            +----------------+
Socket-1 -->|                |
Socket-2 -->| Event Loop     |
Socket-3 -->| (Poller)       |
Socket-4 -->|                |
            +----------------+
                     |
                     v
          Ready Socket Found
                     |
                     v
             Worker Executes

Detailed Flow

Client Request
       |
       v
Socket Registered
       |
       v
Selector/Poller
       |
       v
Data Available?
   |
   +-- No --> Continue Monitoring
   |
   +-- Yes
          |
          v
    Worker Thread
          |
          v
    Business Logic
          |
          v
      Response

5. Selector Pattern (Java NIO)

Java NIO introduced:

Selector
Channel
Buffer

Architecture:

SocketChannel-1
SocketChannel-2
SocketChannel-3
SocketChannel-4
       |
       v
    Selector
       |
       v
Ready Events
       |
       v
Worker Pool

One selector can monitor thousands of connections.
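The pattern can be sketched with the standard `java.nio.channels` API. The example below is a simplified echo server, assuming a hypothetical class name `SelectorEchoServer`; it handles everything on the selector thread instead of a worker pool, and it omits partial-write handling and robust error handling that a production server would need.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.CancelledKeyException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class SelectorEchoServer {

    // One thread, one Selector, any number of registered channels.
    static void eventLoop(Selector selector) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        while (selector.isOpen()) {
            selector.select(); // blocks until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (!key.isValid()) {
                    continue;
                }
                if (key.isAcceptable()) {
                    SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    if (client.read(buffer) < 0) { // peer closed the connection
                        client.close();
                        continue;
                    }
                    buffer.flip();
                    while (buffer.hasRemaining()) {
                        client.write(buffer); // simplification: spins if the socket is full
                    }
                }
            }
        }
    }

    // Starts the loop on a free port, sends one message, returns the echo.
    static String roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(loopback, 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();

            Thread loop = new Thread(() -> {
                try {
                    eventLoop(selector);
                } catch (IOException | ClosedSelectorException | CancelledKeyException e) {
                    // loop ends when the selector is closed
                }
            });
            loop.setDaemon(true);
            loop.start();

            try (SocketChannel client = SocketChannel.open(new InetSocketAddress(loopback, port))) {
                byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
                client.write(ByteBuffer.wrap(bytes));
                ByteBuffer reply = ByteBuffer.allocate(bytes.length);
                while (reply.hasRemaining() && client.read(reply) >= 0) {
                    // keep reading until the full echo has arrived
                }
                return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello")); // prints "hello"
    }
}
```

The key difference from the blocking version is that `select()` is the only blocking call, and it waits on all registered channels at once.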


6. Blocking vs Non-Blocking Comparison

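+---------------------+----------------------------+------------------------+
| Feature             | Blocking IO                | Non-Blocking IO        |
+---------------------+----------------------------+------------------------+
| Thread per socket   | Yes                        | No                     |
| Memory usage        | High                       | Low                    |
| Context switching   | High                       | Low                    |
| Scalability         | Limited                    | Very High              |
| Idle thread wastage | High                       | Very Low               |
| Suitable for        | Small systems              | Large-scale systems    |
| Example             | Traditional Servlet/Tomcat | Netty, Node.js, Vert.x |
+---------------------+----------------------------+------------------------+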

7. Where Non-Blocking IO Fails


Non-blocking IO is excellent for:

IO Bound Work

Examples:

  • Database calls

  • Network calls

  • API calls

  • Messaging


Problem: CPU Intensive Tasks

Image Processing
Video Encoding
AI Inference
Complex Calculations
Encryption

If a single event-loop thread does this:

Event Loop
     |
     +--> Heavy CPU Task

Then:

Event Loop Blocked
      ↓
Cannot Accept New Requests
      ↓
Performance Collapse

Correct Modern Architecture

            Event Loop
                 |
                 v
          Ready Request
                 |
                 v
        Worker Thread Pool
                 |
                 v
         CPU Intensive Work
                 |
                 v
             Response

Frameworks such as Netty, Spring WebFlux, Vert.x, and Node.js broadly follow this pattern: keep the event loop free and hand CPU-heavy or blocking work to separate worker threads.
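The hand-off can be sketched with `CompletableFuture` from the JDK. This is an illustration under assumptions, not the API of any of the frameworks named above: the class name `OffloadSketch`, the `handle` method, and the naive Fibonacci function standing in for CPU-intensive work are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class OffloadSketch {

    // Stand-in for CPU-intensive work: deliberately naive Fibonacci.
    static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    // The event loop does only the cheap steps; workers do the heavy step.
    static CompletableFuture<Long> handle(int n,
                                          ExecutorService eventLoop,
                                          ExecutorService workers) {
        return CompletableFuture
                .supplyAsync(() -> n, eventLoop)             // cheap: accept the request
                .thenApplyAsync(OffloadSketch::fib, workers) // heavy: runs off the loop
                .thenApplyAsync(result -> result, eventLoop); // cheap: build the response
    }

    // Sends one heavy and one light request through the same event loop.
    static long[] demo() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        ExecutorService workers = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors());
        try {
            CompletableFuture<Long> heavy = handle(30, eventLoop, workers);
            CompletableFuture<Long> light = handle(1, eventLoop, workers);
            return new long[] {heavy.join(), light.join()};
        } finally {
            eventLoop.shutdown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] results = demo();
        System.out.println("heavy = " + results[0] + ", light = " + results[1]);
        // prints "heavy = 832040, light = 1"
    }
}
```

Because `fib(30)` runs on the worker pool, the single event-loop thread is free to accept the light request immediately instead of waiting for the heavy one to finish.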


Interview Summary (One-Line Version)

Blocking IO:
One Socket -> One Thread -> Wait For Data

Non-Blocking IO:
Many Sockets -> One Event Loop -> Process Only Ready Events

Blocking IO optimizes programming simplicity.

Non-Blocking IO optimizes scalability and resource utilization.


In a multi-threaded environment, thread context switches cost the CPU extra time.

   

For I/O-bound work, a single thread can be faster than many threads, because it avoids the time spent on thread context switches.

Redis

Internally, Redis favors array-like data structures that store elements side by side in memory, rather than linked lists. When the CPU reads one element, the whole cache line containing its neighbors is loaded into the CPU cache. The next elements are then served from the cache instead of RAM, which saves CPU cycles.
------------


The blocking I/O diagram above shows what happens when a request arrives: a socket is created, and Tomcat assigns a thread to that socket. The thread waits until the socket has data and stays blocked until then. As a result, blocked threads occupy memory without doing useful work.



In the single-threaded model with non-blocking I/O and an event loop, one thread reads only from sockets that already have data, so it can serve many requests. A blocking Tomcat thread, by contrast, serves only one request at a time.


Note: The single-threaded I/O model breaks down for CPU-intensive tasks, because one long computation blocks the event loop for every other request.
