1. CPU Internal Model & Thread Context Switching
How CPU Executes Instructions
A CPU continuously performs:
+---------+ +---------+ +---------+
| Fetch | -> | Decode | -> | Execute |
+---------+ +---------+ +---------+
Fetch
CPU fetches instructions from memory/cache.
Decode
CPU understands what operation must be performed.
Execute
CPU executes the instruction.
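The cycle above can be sketched as a toy interpreter. This is only an illustration, not how real hardware is built: the class `ToyCpu`, its three opcodes, and the two-int instruction layout are all made up for this example.

```java
class ToyCpu {

    // Hypothetical opcodes for the toy machine.
    static final int LOAD = 0;
    static final int ADD = 1;
    static final int HALT = 2;

    // Each instruction occupies two ints in memory: {opcode, operand}.
    static int run(int[] memory) {
        int pc = 0;   // program counter: index of the next instruction
        int acc = 0;  // a single accumulator register
        while (true) {
            // Fetch: read the instruction the program counter points at.
            int opcode = memory[pc];
            int operand = memory[pc + 1];
            pc += 2;
            // Decode: work out which operation the opcode stands for.
            switch (opcode) {
                // Execute: apply the operation to the register.
                case LOAD:
                    acc = operand;
                    break;
                case ADD:
                    acc += operand;
                    break;
                case HALT:
                    return acc;
                default:
                    throw new IllegalArgumentException("Unknown opcode: " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = {LOAD, 2, ADD, 3, HALT, 0};
        System.out.println(run(program));
    }
}
```

The local variables `pc` and `acc` are this toy machine's entire execution state, which is the kind of per-thread state that has to be saved and restored when a CPU moves from one thread to another.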
What is a Thread Context Switch?
Assume CPU is executing Thread-A.
CPU
|
+--> Thread-A Running
Suddenly a higher-priority thread becomes runnable. (A switch can also happen when Thread-A's time slice expires or when it blocks on I/O.)
CPU
|
+--> Save Thread-A State
|
+--> Load Thread-B State
|
+--> Execute Thread-B
The CPU must save:
Program Counter (PC)
Registers
Stack Pointer
Thread State
Then load another thread's state.
Cost of Context Switching
Context Switch
Save Current Thread
+
Load New Thread
+
CPU Cache Disturbance
+
Scheduler Overhead
Result:
More Threads
↓
More Context Switches
↓
CPU Wastage
↓
Lower Throughput
This is why high-performance systems try to minimize unnecessary threads.
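One common way to avoid unnecessary threads is to run many small tasks on a fixed pool sized to the core count instead of starting one thread per task. A minimal sketch, assuming a CPU-bound workload; `BoundedPoolDemo` and `sumOfSquares` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BoundedPoolDemo {

    // Runs `tasks` small jobs on a pool with one worker thread per core,
    // so thousands of tasks never turn into thousands of threads.
    static long sumOfSquares(int tasks) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 1; i <= tasks; i++) {
                final long n = i;
                results.add(pool.submit(() -> n * n));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();  // wait for each task to finish
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10_000));
    }
}
```

Here 10,000 tasks share a handful of threads, so the scheduler has far fewer threads to switch between than in a thread-per-task design.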
2. Why Redis Uses a Single Thread
Redis is famous for using a mostly single-threaded event loop.
Traditional Multi-thread Model
Request-1 --> Thread-1
Request-2 --> Thread-2
Request-3 --> Thread-3
Request-4 --> Thread-4
Problem:
Many Threads
↓
Many Context Switches
↓
CPU Overhead
Redis Model
Request-1
Request-2
Request-3
Request-4
|
v
+----------------+
| Single Event |
| Loop Thread |
+----------------+
Benefits:
No thread synchronization
Minimal context switching
Predictable latency
Better CPU cache utilization
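The benefits above can be illustrated with a small Java sketch of the same idea. This is not Redis code (Redis is written in C); `SingleThreadStoreDemo` and its methods are hypothetical. Several client threads submit commands, but only one loop thread ever touches the map, so a plain `HashMap` needs no locks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SingleThreadStoreDemo {

    // Plain HashMap: safe only because the loop thread is its sole user.
    private final Map<String, Long> store = new HashMap<>();
    private final ExecutorService loop = Executors.newSingleThreadExecutor();

    // Any client thread may call this; the command is queued and later
    // executed by the loop thread, one command at a time.
    void incr(String key) {
        loop.execute(() -> store.merge(key, 1L, Long::sum));
    }

    // Reads are queued too, so they run after all earlier commands.
    long get(String key) {
        Callable<Long> read = () -> store.getOrDefault(key, 0L);
        try {
            return loop.submit(read).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    // Demo helper: `clients` threads each send `perClient` increments.
    static long runClients(int clients, int perClient) {
        SingleThreadStoreDemo db = new SingleThreadStoreDemo();
        Thread[] threads = new Thread[clients];
        for (int i = 0; i < clients; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < perClient; j++) {
                    db.incr("hits");
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            return db.get("hits");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            db.loop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runClients(4, 1_000));
    }
}
```

No increment is lost even though the map is unsynchronized, because commands are serialized through one queue instead of being guarded by locks.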
CPU Cache Locality
Linked List
Node-A --> Node-B --> Node-C --> Node-D
Memory:
A ----- far ----- B ----- far ----- C
CPU keeps jumping in RAM.
Array-Based Structure
[A][B][C][D][E]
Memory is contiguous.
CPU Fetch
↓
Cache Line Loaded
↓
Multiple Elements Available
Advantages:
Better cache hit rate
Less RAM access
Faster execution
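Redis applies this principle in its internals: small lists, hashes, and sorted sets are stored in compact, contiguous encodings (listpack, formerly ziplist) rather than pointer-heavy structures. (rameshvanka.blogspot.com)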
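A rough way to see the locality effect in Java is to sum the same values from an `int[]` and from a `LinkedList<Integer>`. The names here are hypothetical, and the timings printed by `main` depend on hardware, JVM, and JIT warm-up, so treat them as indicative only.

```java
import java.util.LinkedList;
import java.util.List;

class LocalityDemo {

    // Contiguous storage: neighbouring elements share cache lines.
    static long sumArray(int[] data) {
        long sum = 0;
        for (int v : data) {
            sum += v;
        }
        return sum;
    }

    // Node-based storage: each element is reached by following a reference.
    static long sumList(List<Integer> data) {
        long sum = 0;
        for (int v : data) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] array = new int[n];
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            array[i] = i;
            list.add(i);
        }
        long t0 = System.nanoTime();
        long a = sumArray(array);
        long t1 = System.nanoTime();
        long b = sumList(list);
        long t2 = System.nanoTime();
        System.out.printf("array: sum=%d in %d us%n", a, (t1 - t0) / 1_000);
        System.out.printf("list:  sum=%d in %d us%n", b, (t2 - t1) / 1_000);
    }
}
```

Both methods compute the same sum; only the memory layout they walk over differs.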
3. Blocking I/O Architecture
What is Blocking I/O?
A thread waits until data becomes available.
Flow
Client
|
v
Socket Created
|
v
Dedicated Thread Assigned
|
v
Waiting For Data
|
v
Thread Blocked
|
v
Data Arrives
|
v
Process Request
|
v
Response
Example
Suppose 10,000 clients connect.
10,000 Clients
↓
10,000 Sockets
↓
10,000 Threads
Most threads are doing:
Waiting...
Waiting...
Waiting...
Waiting...
CPU is not busy.
Memory is wasted: every idle thread still holds its own stack.
Blocking I/O Diagram
Client-1 ---> Thread-1 ---> Waiting
Client-2 ---> Thread-2 ---> Waiting
Client-3 ---> Thread-3 ---> Waiting
Client-4 ---> Thread-4 ---> Waiting
Problems:
High memory usage
Context switching overhead
Limited scalability
Tomcat's traditional thread-per-request model, especially with the legacy blocking (BIO) connector, largely follows this pattern. (rameshvanka.blogspot.com)
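The thread-per-socket flow can be sketched with plain `java.net` sockets. This is a minimal echo server written for illustration, not Tomcat's implementation; `BlockingEchoServer` and the `roundTrip` helper are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class BlockingEchoServer {

    // Accept loop: every accepted socket gets its own dedicated thread.
    static void serve(ServerSocket server) {
        try {
            while (true) {
                Socket client = server.accept();           // blocks until a client connects
                new Thread(() -> handle(client)).start();  // one thread per socket
            }
        } catch (IOException e) {
            // accept() fails once the server socket is closed: treat as shutdown
        }
    }

    // Runs on the dedicated thread and echoes each line back.
    static void handle(Socket client) {
        try (Socket s = client;
             BufferedReader in = reader(s);
             PrintWriter out = writer(s)) {
            String line;
            while ((line = in.readLine()) != null) {       // thread is blocked here while the client is idle
                out.println(line);
            }
        } catch (IOException e) {
            // client went away; try-with-resources closes the socket
        }
    }

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Demo helper: starts the server on an ephemeral loopback port,
    // sends one line and returns the echoed reply.
    static String roundTrip(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            Thread acceptor = new Thread(() -> serve(server));
            acceptor.setDaemon(true);
            acceptor.start();
            try (Socket c = new Socket(loopback, server.getLocalPort());
                 PrintWriter out = writer(c);
                 BufferedReader in = reader(c)) {
                out.println(message);
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Each connected client costs one thread for as long as the connection stays open, even when that thread spends almost all of its time blocked in `readLine()`.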
4. Non-Blocking I/O Architecture
Core Idea
Don't dedicate a thread per socket.
Instead:
One Thread
↓
Monitor Many Sockets
↓
Process Only Ready Sockets
Event Loop Model
+----------------+
Socket-1 -->| |
Socket-2 -->| Event Loop |
Socket-3 -->| (Poller) |
Socket-4 -->| |
+----------------+
|
v
Ready Socket Found
|
v
Worker Executes
Detailed Flow
Client Request
|
v
Socket Registered
|
v
Selector/Poller
|
v
Data Available?
|
+-- No --> Continue Monitoring
|
+-- Yes
|
v
Worker Thread
|
v
Business Logic
|
v
Response
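The "Data Available?" decision in the flow above comes down to a read call that returns immediately instead of blocking. A minimal sketch using a `java.nio.channels.Pipe` in place of a real socket; `NonBlockingReadDemo` and `probe` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class NonBlockingReadDemo {

    // Returns {bytes read before any data was written, bytes read after the write}.
    static int[] probe(String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                ByteBuffer buf = ByteBuffer.allocate(Math.max(16, bytes.length));

                // "No" branch: nothing to read, so the call returns 0 at once
                // and the thread stays free for other work.
                int before = source.read(buf);

                sink.write(ByteBuffer.wrap(bytes));

                // "Yes" branch: keep polling until the whole payload has arrived.
                int after = 0;
                while (after < bytes.length) {
                    after += source.read(buf);
                }
                return new int[] {before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = probe("ping");
        System.out.println("before=" + result[0] + ", after=" + result[1]);
    }
}
```

Polling in a loop like this burns CPU, which is why real servers hand the "is it ready yet?" question to a selector or poller and sleep until the operating system reports a ready channel.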
5. Selector Pattern (Java NIO)
Java NIO introduced:
Selector
Channel
Buffer
Architecture:
SocketChannel-1
SocketChannel-2
SocketChannel-3
SocketChannel-4
|
v
Selector
|
v
Ready Events
|
v
Worker Pool
One selector can monitor thousands of connections.
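The architecture above can be sketched as a small echo server built on `Selector`, `ServerSocketChannel`, and `SocketChannel`. It is a simplified illustration under stated assumptions: the class `SelectorEchoServer` and the `roundTrip` helper are hypothetical, requests are handled inline on the selector thread rather than in a worker pool, and writes are not guarded with `OP_WRITE` as a production server would do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorEchoServer {

    // A single thread runs this loop and serves every registered channel.
    static void eventLoop(Selector selector) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        while (selector.isOpen()) {
            selector.select();  // sleeps until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (!key.isValid()) {
                    continue;
                }
                if (key.isAcceptable()) {
                    SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    echo((SocketChannel) key.channel(), buf);
                }
            }
        }
    }

    // Reads whatever is available and writes it straight back.
    static void echo(SocketChannel client, ByteBuffer buf) throws IOException {
        buf.clear();
        if (client.read(buf) < 0) {  // peer closed the connection
            client.close();
            return;
        }
        buf.flip();
        while (buf.hasRemaining()) {
            client.write(buf);  // demo shortcut; a real server would wait for OP_WRITE
        }
    }

    // Demo helper: one blocking client talks to the selector-driven server.
    static String roundTrip(String message) {
        byte[] bytes = message.getBytes(StandardCharsets.UTF_8);
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            Thread loop = new Thread(() -> {
                try {
                    eventLoop(selector);
                } catch (IOException | ClosedSelectorException e) {
                    // selector or channel closed: treat as shutdown
                }
            });
            loop.setDaemon(true);
            loop.start();

            try (SocketChannel client = SocketChannel.open(server.getLocalAddress())) {
                client.write(ByteBuffer.wrap(bytes));
                ByteBuffer reply = ByteBuffer.allocate(bytes.length);
                while (reply.hasRemaining() && client.read(reply) >= 0) {
                    // keep reading until the full echo has arrived
                }
                return new String(reply.array(), 0, reply.position(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The thread count stays at one no matter how many channels are registered; only the set of keys held by the selector grows.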
6. Blocking vs Non-Blocking Comparison
| Feature | Blocking IO | Non-Blocking IO |
|---|---|---|
| Thread per socket | Yes | No |
| Memory usage | High | Low |
| Context switching | High | Low |
| Scalability | Limited | Very High |
| Idle thread wastage | High | Very Low |
| Suitable for | Small systems | Large-scale systems |
| Example | Traditional Servlet/Tomcat | Netty, Node.js, Vert.x |
7. Where Non-Blocking IO Fails
Non-blocking IO is excellent for:
IO Bound Work
Examples:
Database calls
Network calls
API calls
Messaging
Problem: CPU Intensive Tasks
Image Processing
Video Encoding
AI Inference
Complex Calculations
Encryption
If a single event-loop thread does this:
Event Loop
|
+--> Heavy CPU Task
Then:
Event Loop Blocked
↓
Cannot Accept New Requests
↓
Performance Collapse
Correct Modern Architecture
Event Loop
|
v
Ready Request
|
v
Worker Thread Pool
|
v
CPU Intensive Work
|
v
Response
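Frameworks such as Netty, Vert.x, and Spring WebFlux follow this pattern: event-loop threads stay free, and blocking or CPU-heavy work is moved to separate worker pools. Node.js offers worker threads for the same purpose.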
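The hand-off from event loop to worker pool and back can be sketched with `CompletableFuture`. This is an illustration only, not how any of the frameworks named here implement it: `OffloadDemo`, the thread names, and the recursive `fib` stand-in for heavy work are all assumptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OffloadDemo {

    // Deliberately slow recursive Fibonacci standing in for CPU-heavy work.
    static long fib(int n) {
        return n < 2 ? n : fib(n - 1) + fib(n - 2);
    }

    // Returns {result, thread that did the heavy work, thread that built the response}.
    static String[] handle(int n) {
        ExecutorService eventLoop =
                Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService workers =
                Executors.newFixedThreadPool(2, r -> new Thread(r, "worker"));
        try {
            return CompletableFuture
                    // 1. The request is picked up on the event-loop thread.
                    .supplyAsync(() -> n, eventLoop)
                    // 2. The heavy computation runs on the worker pool,
                    //    so the event loop is free for other requests.
                    .thenApplyAsync(x -> new String[] {
                            Long.toString(fib(x)),
                            Thread.currentThread().getName()}, workers)
                    // 3. The response is assembled back on the event loop.
                    .thenApplyAsync(res -> new String[] {
                            res[0], res[1],
                            Thread.currentThread().getName()}, eventLoop)
                    .join();
        } finally {
            eventLoop.shutdown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] out = handle(20);
        System.out.println(out[0] + " computed on " + out[1] + ", answered on " + out[2]);
    }
}
```

While step 2 runs, the single event-loop thread is idle and could be accepting or dispatching other requests, which is the point of the design.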
Interview Summary (One-Line Version)
Blocking IO:
One Socket -> One Thread -> Wait For Data
Non-Blocking IO:
Many Sockets -> One Event Loop -> Process Only Ready Events
Blocking IO optimizes programming simplicity.
Non-Blocking IO optimizes scalability and resource utilization.
In a multi-threaded environment, thread context switches consume CPU time that could otherwise go to useful work.
Reference:
The diagram above shows that when a request arrives, a socket is created and Tomcat assigns a thread to that socket. The thread waits until the socket has data and remains blocked until the data has fully arrived. Because of this blocking behavior, idle threads occupy user-space memory without doing useful work.
In the single-threaded model with non-blocking I/O and an event loop, one thread reads from whichever sockets are ready and can therefore handle many requests, whereas a blocked Tomcat thread can serve only the one socket it is assigned to.