Concurrency

Concurrency lets your program make progress on multiple tasks at once, either by interleaving them on one core or by running them in parallel on several. It’s essential for modern applications that need to use multi-core processors and serve many users.

1. Threads vs. Processes

  • Process: An executing program (e.g., the JVM itself). It has its own isolated memory space.
  • Thread: A lightweight unit of execution within a process. Threads share the same memory space.
Think of a Process like a house, and Threads like the people living in it. They share the kitchen (memory), but do different tasks.

Creating a Thread

// 1. Extend Thread (Not recommended)
class MyThread extends Thread {
    public void run() {
        System.out.println("Running in thread");
    }
}
new MyThread().start();

// 2. Implement Runnable (Preferred)
// Decouples the task from the thread mechanism
Runnable task = () -> System.out.println("Running task");
new Thread(task).start();

2. Executor Framework

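Manually creating threads (new Thread()) is expensive and error-prone. Each platform thread reserves its own stack, so creating 10,000 of them can exhaust memory or hit the OS thread limit (typically surfacing as java.lang.OutOfMemoryError: unable to create native thread). Executors manage a pool of threads for you. You just submit tasks, and the pool handles the execution.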
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Create a pool with 10 fixed threads
ExecutorService executor = Executors.newFixedThreadPool(10);

for (int i = 0; i < 100; i++) {
    executor.submit(() -> {
        System.out.println("Processing task on " + Thread.currentThread().getName());
    });
}

// Shutdown when done (otherwise the app keeps running)
executor.shutdown();
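
shutdown() only stops the pool from accepting new tasks; it does not wait for queued work. A minimal sketch of a shutdown that waits, with a forced stop as a fallback, might look like the following. The class name, method name, and the 10-second timeout are illustrative assumptions, not part of the original example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class GracefulShutdownDemo {

    // Runs taskCount trivial tasks on a fixed pool and returns how many completed.
    static int runAndShutdown(int poolSize, int taskCount) {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            executor.submit(() -> { completed.incrementAndGet(); });
        }
        executor.shutdown(); // stop accepting new tasks; queued tasks still run
        try {
            // Assumed timeout: pick one that matches your longest expected task.
            if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
                executor.shutdownNow(); // interrupt whatever is still running
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runAndShutdown(10, 100));
    }
}
```

The two-phase pattern (shutdown, then awaitTermination, then shutdownNow) follows the usage suggested in the ExecutorService documentation.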

Types of Pools

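  • newFixedThreadPool(n): Fixed number of threads. Good for predictable loads. Extra tasks wait in an unbounded queue.
  • newCachedThreadPool(): Creates threads as needed, reuses idle ones. Good for many short-lived tasks, but it has no upper bound on thread count, so a burst of slow tasks can create a very large number of threads.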
  • newSingleThreadExecutor(): One thread. Ensures tasks run sequentially.

3. Callable & Future

Runnable returns void. What if you want a result? Use Callable. A Future represents the result of an asynchronous computation. It’s a placeholder for a value that will arrive later.
import java.util.concurrent.*;

Callable<Integer> task = () -> {
    Thread.sleep(1000); // Simulate work
    return 42;
};

ExecutorService executor = Executors.newSingleThreadExecutor();
Future<Integer> future = executor.submit(task);

// Do other work here...

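// Get result (blocks until ready; throws InterruptedException or ExecutionException)
Integer result = future.get();
System.out.println(result); // 42

executor.shutdown(); // Otherwise the pool's non-daemon thread keeps the JVM alive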
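
When there are several Callables, invokeAll submits them together and returns one Future per task. The sketch below is one way to collect the results; the class name, method name, and pool size of 4 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllDemo {

    // Squares each input on a small pool and returns the sum of the results.
    static int sumOfSquares(List<Integer> inputs) {
        ExecutorService executor = Executors.newFixedThreadPool(4);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int n : inputs) {
                tasks.add(() -> n * n);
            }
            int sum = 0;
            // invokeAll blocks until every task has completed
            for (Future<Integer> f : executor.invokeAll(tasks)) {
                sum += f.get(); // the task is already done, so this does not wait
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            // A task threw; its exception is available as the cause
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(Arrays.asList(1, 2, 3, 4)));
    }
}
```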

4. Synchronization

When multiple threads access shared data (like a counter) without coordination, race conditions can occur. An increment is really three steps (read, add, write), so two threads might both read “5”, increment it, and both write “6”, losing one increment.

synchronized Keyword

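Ensures only one thread at a time can execute code guarded by the same lock (for a synchronized instance method, the lock is the object itself). It’s like a lock on a door. It also guarantees that changes made by one thread are visible to the next thread that acquires the lock.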
class Counter {
    private int count = 0;

    public synchronized void increment() {
        count++;
    }
    
    public synchronized int getCount() {
        return count;
    }
}

Atomic Classes

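For simple variables, AtomicInteger is lock-free (it uses compare-and-swap instead of a monitor) and is usually faster than synchronized for single-variable updates.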
import java.util.concurrent.atomic.AtomicInteger;

AtomicInteger count = new AtomicInteger(0);
count.incrementAndGet(); // Thread-safe increment
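
To see both approaches under contention, the sketch below runs several threads against a synchronized counter and an AtomicInteger and checks that no increments are lost. SyncCounter mirrors the Counter class above; the class names, thread counts, and 30-second timeout are illustrative assumptions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {

    // Same shape as the Counter class: every access goes through the monitor.
    static class SyncCounter {
        private int count = 0;
        synchronized void increment() { count++; }
        synchronized int getCount() { return count; }
    }

    // Runs `threads` workers that each increment both counters `perThread` times.
    // Returns {synchronizedTotal, atomicTotal}.
    static int[] run(int threads, int perThread) {
        SyncCounter sync = new SyncCounter();
        AtomicInteger atomic = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    sync.increment();
                    atomic.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { sync.getCount(), atomic.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 10_000);
        System.out.println(totals[0] + " " + totals[1]);
    }
}
```

Replacing either counter with a plain unsynchronized int would typically produce a total below threads × perThread, which is the lost-update problem described earlier.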

5. CompletableFuture (Java 8+)

Future.get() blocks the calling thread until the result is ready. CompletableFuture lets you attach follow-up steps that run when the result arrives, so you can build non-blocking, reactive pipelines (similar to Promises in JavaScript).
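import java.util.concurrent.CompletableFuture;

CompletableFuture.supplyAsync(() -> "Hello")
    .thenApply(s -> s + " World")
    .thenAccept(System.out::println) // Prints "Hello World"
    .join(); // Wait here only so a short-lived main() does not exit first

// Combine two futures
CompletableFuture<String> f1 = CompletableFuture.supplyAsync(() -> "A");
CompletableFuture<String> f2 = CompletableFuture.supplyAsync(() -> "B");

f1.thenCombine(f2, (a, b) -> a + b)
  .thenAccept(System.out::println) // "AB"
  .join(); // Async stages run on daemon threads, which do not keep the JVM alive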
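
Pipelines also need a failure path. The sketch below shows one way to recover from an exception inside a pipeline using exceptionally; the class name, method name, and the fallback value of -1 are assumptions chosen for illustration.

```java
import java.util.concurrent.CompletableFuture;

class CompletableFutureErrors {

    // Parses and doubles the number asynchronously; falls back to -1 on failure.
    static int parseAndDouble(String text) {
        return CompletableFuture
                .supplyAsync(() -> Integer.parseInt(text)) // may throw NumberFormatException
                .thenApply(n -> n * 2)                     // skipped if the stage above failed
                .exceptionally(ex -> -1)                   // recover with a fallback value
                .join();                                   // wait for the pipeline to finish
    }

    public static void main(String[] args) {
        System.out.println(parseAndDouble("21"));
        System.out.println(parseAndDouble("oops"));
    }
}
```

join() is used here only to return a plain value from the demo method; in a fully asynchronous design you would return the CompletableFuture itself.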

6. Virtual Threads (Java 21)

Game Changer: Traditional (platform) threads are mapped 1:1 to OS threads. They are heavy (each reserves its own stack, about 1 MB by default on typical 64-bit JVMs), so you can realistically run only a few thousand. Virtual Threads are lightweight threads managed by the JVM. You can create millions of them.
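import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

// Create a virtual thread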
Thread.startVirtualThread(() -> {
    System.out.println("Running on virtual thread");
});

// Executor for virtual threads
try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    IntStream.range(0, 10_000).forEach(i -> {
        executor.submit(() -> {
            Thread.sleep(Duration.ofSeconds(1));
            return i;
        });
    });
} // Auto-closes and waits for all tasks
Why Virtual Threads? They make blocking I/O cheap. You can write simple, synchronous code (e.g., read from DB) that scales like complex asynchronous code.

Summary

  • Threads: Basic unit of concurrency.
  • Executors: Manage thread pools.
  • Synchronization: Protect shared state from race conditions.
  • CompletableFuture: Compose async tasks.
  • Virtual Threads: High-throughput concurrency for Java 21+.
Next, we’ll wrap up with Modern Java Features that make the language expressive and concise.

Interview Deep-Dive

Strong Answer:
  • The Java Memory Model (JMM, defined in JSR-133 and the JLS Chapter 17) specifies how threads interact through memory. The core problem it solves: modern CPUs have per-core caches, store buffers, and instruction reordering optimizations. Without a memory model, a write by thread A might not be visible to thread B — ever — because B reads from its CPU cache, not from main memory. The JMM defines the rules for when writes by one thread are guaranteed to be visible to reads by another thread.
  • The central concept is “happens-before.” If action A happens-before action B, then A’s effects are guaranteed to be visible to B. Key happens-before relationships: unlocking a monitor happens-before subsequent locking of the same monitor (synchronized); writing a volatile variable happens-before subsequent reads of the same variable; Thread.start() happens-before any action in the started thread; all actions in a thread happen-before any other thread successfully returns from Thread.join() on that thread.
  • Why this matters practically: without volatile or synchronized, the JVM and CPU are free to reorder instructions and cache values. A classic bug: boolean running = true; in a shared field, one thread sets running = false, another thread loops while (running). Without volatile, the JIT compiler may hoist the read of running out of the loop (since it is not modified within the loop from the compiler’s perspective), creating an infinite loop. I have seen this exact bug in production — a shutdown hook set a flag, but the worker thread never saw the update because the JIT optimized away the repeated read. Adding volatile fixed it.
  • A senior-level nuance: volatile guarantees visibility and prevents reordering around the volatile access, but it does not provide atomicity for compound operations. volatile int count; count++ is not thread-safe because the increment is a read-modify-write sequence that can be interleaved. For compound atomicity, you need AtomicInteger, synchronized, or a Lock.
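
The stop-flag bug described above can be sketched as follows. The class and method names are hypothetical, and the 100 ms pause and 5-second join timeout are arbitrary values chosen for the demo.

```java
class VolatileFlagDemo {

    // Without volatile, the JIT may hoist the read of this field out of the loop.
    private static volatile boolean running = true;

    // Starts a spinning worker, clears the flag, and reports whether the worker exited.
    static boolean stopWorker() {
        running = true;
        Thread worker = new Thread(() -> {
            long spins = 0;
            while (running) { // volatile read on every iteration
                spins++;
            }
        });
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            Thread.sleep(100);  // let the worker spin for a moment
            running = false;    // volatile write, visible to the worker's later reads
            worker.join(5_000); // bounded wait so a bug cannot hang the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopWorker());
    }
}
```

Removing volatile does not guarantee a hang; it only removes the guarantee that the worker ever sees the update, which is why this bug tends to show up only after JIT compilation under load.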
Follow-up: Explain what ‘false sharing’ is and how it can silently destroy performance in concurrent Java code.
  • False sharing occurs when two threads modify independent variables that happen to reside on the same CPU cache line (typically 64 bytes on x86). Even though the variables are logically independent, the CPU’s cache coherency protocol (MESI or similar) forces the entire cache line to be invalidated and reloaded every time either variable is modified. Both threads repeatedly invalidate each other’s cached copy, causing continuous cache-line bouncing between cores.
  • In Java, this commonly happens with adjacent fields in an object or adjacent elements in an array that are written by different threads. For example, two AtomicLong counters in the same object — if they end up on the same cache line, incrementing one counter from thread A invalidates the cache line for thread B reading the other counter. This can cause 10-100x performance degradation compared to the expected throughput.
  • The fix is cache line padding: ensure the hot variables are separated by at least 64 bytes. Java 8 introduced @Contended (internal annotation sun.misc.Contended, requires -XX:-RestrictContended to use outside the JDK). This annotation tells the JVM to pad the field to its own cache line. The JDK itself uses @Contended on Thread.threadLocalRandomSeed and on ForkJoinPool worker queues. In Java 9+ modules, it is jdk.internal.vm.annotation.Contended.
  • The Disruptor library (LMAX) is the canonical example of false sharing awareness in Java. Their ring buffer is designed to avoid false sharing between the producer’s write position and the consumer’s read position, which is a key reason it achieves millions of operations per second with sub-microsecond latency.
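
Since @Contended is not available to application code by default, one portable approximation is to space hot counters apart inside an array. The sketch below assumes 64-byte cache lines (8 longs); the class name and stride are illustrative, and it demonstrates only the layout, not a measured speedup.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class StridedCounters {

    // Assumption: 64-byte cache lines, so 8 longs per line.
    private static final int STRIDE = 8;

    private final AtomicLongArray slots;

    StridedCounters(int counters) {
        // Only every 8th element is used; the elements in between are padding.
        slots = new AtomicLongArray(counters * STRIDE);
    }

    void increment(int counter) { slots.incrementAndGet(counter * STRIDE); }

    long get(int counter) { return slots.get(counter * STRIDE); }

    // Two threads, each incrementing its own counter. Returns both totals.
    static long[] run(int perThread) {
        StridedCounters c = new StridedCounters(2);
        Thread a = new Thread(() -> { for (int i = 0; i < perThread; i++) c.increment(0); });
        Thread b = new Thread(() -> { for (int i = 0; i < perThread; i++) c.increment(1); });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { c.get(0), c.get(1) };
    }

    public static void main(String[] args) {
        long[] totals = run(1_000_000);
        System.out.println(totals[0] + " " + totals[1]);
    }
}
```

Because the two used elements are 64 bytes apart, they cannot fall on the same 64-byte line. Whether that helps in a given workload should be confirmed with a benchmark harness such as JMH rather than assumed.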
Strong Answer:
  • synchronized is the built-in monitor lock. It is simple and scoped (it releases automatically on block exit, even with exceptions), and the JVM optimizes it with techniques such as thin locking and lock inflation (biased locking, an older optimization, was deprecated and disabled by default in Java 15 and removed later). Use it for simple mutual exclusion where you do not need advanced features. It has been aggressively optimized over 25 years and is the right default choice for most cases.
  • ReentrantLock (from java.util.concurrent.locks) provides everything synchronized does plus: tryLock() (non-blocking attempt to acquire), tryLock(timeout) (timed attempt), lockInterruptibly() (can be interrupted while waiting), fairness policy (optional FIFO ordering of waiting threads), and the ability to bind multiple Condition objects (equivalent to multiple wait-sets per lock). Use it when you need any of these features — the most common reason is tryLock for avoiding deadlocks in lock-ordering protocols.
  • StampedLock (Java 8) is a more advanced lock optimized for read-heavy workloads. It supports three modes: write lock (exclusive), read lock (shared, like ReadWriteLock), and optimistic read (lock-free). The optimistic read returns a stamp, you read your data, then validate the stamp. If validation succeeds, no locking occurred at all — pure throughput. If it fails (a writer intervened), you fall back to a pessimistic read lock. This is transformative for data structures with 95%+ read access patterns.
  • The decision matrix: 90% of the time, use synchronized — it is simpler and the JVM optimizes it heavily. Use ReentrantLock when you need tryLock, timed lock, interruptible lock, or multiple conditions. Use StampedLock for read-dominated data structures where you have benchmarked that the optimistic read path provides measurable improvement. Never use StampedLock as a general-purpose lock — it is not reentrant, which means calling a method that acquires the same lock will deadlock.
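
The optimistic-read flow described above (take a stamp, read, validate, fall back to a read lock) can be sketched like this. It is adapted from the pattern shown in the StampedLock documentation; the class name StampedPoint is an assumption.

```java
import java.util.concurrent.locks.StampedLock;

class StampedPoint {

    private final StampedLock lock = new StampedLock();
    private double x, y;

    void move(double dx, double dy) {
        long stamp = lock.writeLock(); // exclusive
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    double distanceFromOrigin() {
        long stamp = lock.tryOptimisticRead(); // no lock is actually taken
        double cx = x, cy = y;                 // read into locals first
        if (!lock.validate(stamp)) {           // a writer intervened: fall back
            stamp = lock.readLock();
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return Math.hypot(cx, cy);
    }

    public static void main(String[] args) {
        StampedPoint p = new StampedPoint();
        p.move(3, 4);
        System.out.println(p.distanceFromOrigin());
    }
}
```

Note that the values are copied into locals before validate is called; using the fields directly after validation would reintroduce the race.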
Follow-up: You mentioned deadlocks. How do you detect and prevent deadlocks in a production Java application?
  • Detection: jstack <pid> dumps all thread stacks and automatically detects deadlocks (it prints “Found one Java-level deadlock” with the threads and locks involved). In production, JMX exposes ThreadMXBean.findDeadlockedThreads() which can be called programmatically or via monitoring tools. Java Flight Recorder (JFR) captures lock contention events that help identify near-deadlock conditions.
  • Prevention follows a simple discipline: always acquire locks in a consistent global order. If thread A needs locks L1 and L2, and thread B also needs L1 and L2, both must acquire L1 first, then L2. Deadlock occurs when A holds L1 and waits for L2 while B holds L2 and waits for L1. Enforcing a total ordering eliminates this. In practice, the ordering is often by object hash or a sequence number assigned at creation time.
  • tryLock with a timeout is the pragmatic defense. Instead of blocking forever, the thread tries to acquire the lock for a bounded time and backs off on failure. This does not prevent deadlocks but converts them from hangs (infinite wait) to retriable failures (tryLock returns false, so the caller can release what it holds and retry or report an error). This is the pattern used in database systems (lock timeout) and is appropriate when strict lock ordering is impractical.
  • In my experience, the most common production deadlock in Java is not between explicit locks but between synchronized methods in two classes that call each other. Class A’s synchronized method M1 calls class B’s synchronized method M2, and vice versa. Code review rarely catches this because the cycle spans multiple files. The fix is reducing the scope of synchronization — lock only the critical section, not the entire method — or switching to lock-free data structures.
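
A minimal sketch of the consistent-ordering discipline follows, using a hypothetical Account type whose numeric id defines the global lock order. Two threads transfer in opposite directions, which would risk deadlock if each locked its "from" account first.

```java
import java.util.concurrent.locks.ReentrantLock;

class OrderedTransfer {

    static class Account {
        final int id; // hypothetical unique id, used as the global lock order
        final ReentrantLock lock = new ReentrantLock();
        long balance;

        Account(int id, long balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    // Always locks the account with the smaller id first, regardless of direction.
    static void transfer(Account from, Account to, long amount) {
        Account first = from.id < to.id ? from : to;
        Account second = first == from ? to : from;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                from.balance -= amount;
                to.balance += amount;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    // Runs opposing transfers concurrently and returns the final balances.
    static long[] run(int transfers) {
        Account a = new Account(1, 1_000);
        Account b = new Account(2, 1_000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < transfers; i++) transfer(a, b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < transfers; i++) transfer(b, a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[] { a.balance, b.balance };
    }

    public static void main(String[] args) {
        long[] balances = run(10_000);
        System.out.println(balances[0] + " " + balances[1]);
    }
}
```

When no natural id exists, System.identityHashCode is a common substitute, with an extra tie-breaker lock for the rare case where two objects hash the same.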
Strong Answer:
  • Platform threads (traditional Java threads) are thin wrappers around OS threads. Each platform thread has a 1:1 mapping to a kernel thread, consumes ~1MB of stack memory, and context-switching between them involves a kernel-mode transition. You can create a few thousand before the OS starts rejecting new threads or the system becomes unstable from context-switch overhead.
  • Virtual threads are user-mode threads managed entirely by the JVM. They are scheduled onto a small pool of carrier threads (platform threads managed by a ForkJoinPool). When a virtual thread blocks on an operation the JDK can suspend (socket read, Thread.sleep, Lock.lock), the JVM unmounts it from the carrier thread and mounts a different virtual thread, so the carrier keeps doing useful work instead of waiting. This is conceptually similar to goroutines in Go or fibers in other runtimes. Most file I/O is an exception: it still blocks the carrier, and the scheduler compensates by temporarily adding carrier threads.
  • Virtual threads have very small stacks (they start at a few hundred bytes and grow dynamically, stored on the heap, not in native memory). You can create millions of them. The startup cost is negligible — creating a virtual thread is roughly the cost of creating a small object, compared to the microseconds required to create a platform thread (which involves a kernel syscall).
  • The workloads that benefit most are I/O-bound applications with high concurrency: web servers handling thousands of simultaneous requests that each make database calls, HTTP calls to downstream services, or file reads. Before virtual threads, these applications needed reactive frameworks (WebFlux, Vert.x) with callback-based or Mono/Flux-style APIs to avoid blocking threads. Virtual threads let you write simple, sequential, blocking code (resultSet.next(), httpClient.send()) that scales like reactive code because the JVM handles the thread-switching transparently.
  • The workloads that do NOT benefit: CPU-bound computation (virtual threads still run on the same number of CPU cores — they do not create parallelism, they create concurrency), applications that already use async I/O efficiently, and code that holds native locks or calls JNI methods while blocking (these “pin” the virtual thread to its carrier, negating the benefit).
Follow-up: What is ‘pinning’ in the context of virtual threads, and why is synchronized a problem?
  • Pinning occurs when a virtual thread cannot be unmounted from its carrier thread during a blocking operation. When this happens, the carrier thread is blocked along with the virtual thread, which defeats the entire purpose of virtual threads (the carrier pool has fewer available threads, reducing throughput).
  • In Java 21, the primary cause of pinning is synchronized blocks and methods. When a virtual thread enters a synchronized block and then blocks (e.g., calls Thread.sleep or performs I/O inside the synchronized block), the JVM cannot unmount it because the monitor is tracked against the carrier (platform) thread. Unmounting would mean the lock is “held” by a thread that is no longer running, which breaks the monitor semantics.
  • The fix on Java 21 is straightforward: replace synchronized with ReentrantLock around blocking sections in code that runs on virtual threads. ReentrantLock is implemented in Java (not via native monitor instructions), so the JVM can unmount a virtual thread that blocks while holding a ReentrantLock. This is the single most important migration step when adopting virtual threads in an existing codebase on that release. JDK 24 (JEP 491) lets virtual threads block inside synchronized without pinning, so the rewrite is mainly needed on Java 21 through 23.
  • The JVM provides a diagnostic flag -Djdk.tracePinnedThreads=full (or short) that prints a stack trace whenever a virtual thread is pinned. In a migration, you run the application with this flag under load and fix the pinning hotspots. In practice, most pinning comes from third-party libraries (JDBC drivers, connection pools, logging frameworks) that use synchronized internally. Library authors are actively migrating to ReentrantLock — for example, HikariCP (the most popular connection pool) has been updated for virtual thread compatibility.
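
The synchronized-to-ReentrantLock change can be sketched as below. The class, the serve method, and the 1 ms sleep standing in for blocking I/O are all hypothetical. To keep the example runnable on older JDKs it takes any ExecutorService; on Java 21+ you could pass Executors.newVirtualThreadPerTaskExecutor() instead of the fixed pool used in main.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockInsteadOfSynchronized {

    private final ReentrantLock lock = new ReentrantLock();
    private int served = 0;

    // Was: synchronized void serve(). Blocks while holding the lock.
    void serve() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(1); // stand-in for blocking I/O inside the critical section
            served++;
        } finally {
            lock.unlock(); // always release, even if the body throws
        }
    }

    int served() {
        lock.lock();
        try {
            return served;
        } finally {
            lock.unlock();
        }
    }

    // Submits `tasks` calls to serve() and returns how many completed.
    static int run(ExecutorService executor, int tasks) {
        LockInsteadOfSynchronized resource = new LockInsteadOfSynchronized();
        for (int i = 0; i < tasks; i++) {
            executor.submit(() -> { resource.serve(); return null; }); // Callable form
        }
        executor.shutdown();
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return resource.served();
    }

    public static void main(String[] args) {
        System.out.println(run(Executors.newFixedThreadPool(8), 50));
    }
}
```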
Strong Answer:
  • Thread starvation means tasks are queued but not making progress because all threads in the pool are occupied. The symptoms are: increasing response latency, growing task queue depth, eventually timeouts and rejections. The first diagnostic step is capturing a thread dump (jstack <pid>, kill -3 <pid> on Linux, or JFR/JMX). I look at what the threads are doing — if all threads in the pool are in BLOCKED or WAITING state, something is holding them up.
  • The most common cause I have seen: a thread pool executing tasks that themselves make blocking calls to a slow downstream service. If you have a pool of 10 threads and each task makes an HTTP call to a service that occasionally takes 30 seconds to respond (due to GC pauses, network issues, or the downstream service being overloaded), it takes only 10 slow requests to exhaust the entire pool. All subsequent tasks queue up behind the slow ones. The fix is: (1) add timeouts to all downstream calls (connect timeout + read timeout), (2) use a circuit breaker (Resilience4j, Hystrix) to fail fast when the downstream is unhealthy, and (3) consider separate thread pools for different downstream services (bulkhead pattern) so one slow service cannot starve the others.
  • Another common cause: lock contention. If every task acquires a shared lock and one task holds it for a long time (e.g., a database transaction under load), all other threads block waiting for the lock. The thread dump will show many threads in BLOCKED state on the same monitor. The fix depends on the situation: reduce the scope of the lock, switch to a read-write lock if most operations are reads, use lock-free data structures, or remove the shared state entirely.
  • A less obvious cause: the common ForkJoinPool being saturated by parallel streams or CompletableFuture tasks. By default, CompletableFuture.supplyAsync() runs on the common ForkJoinPool, which has Runtime.getRuntime().availableProcessors() - 1 threads. If many tasks use this pool and some block, the entire application’s parallel processing stalls. The fix is providing a dedicated Executor to supplyAsync(task, myExecutor) and never relying on the common pool for blocking work.
  • My diagnostic sequence: (1) thread dump to identify what threads are doing, (2) check pool metrics (queue depth, active count, task rejection count), (3) correlate with downstream latency metrics, (4) check for lock contention in the thread dump, (5) verify timeouts are configured on all I/O operations.
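
As a sketch of the dedicated-executor fix, the example below routes a blocking call through its own named pool instead of the common ForkJoinPool. The pool name "io-pool", its size, and the 50 ms sleep standing in for a slow downstream call are assumptions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class DedicatedExecutorDemo {

    // Hypothetical pool reserved for blocking downstream calls.
    static ExecutorService newIoPool(int size) {
        AtomicInteger seq = new AtomicInteger();
        ThreadFactory factory = r -> {
            // Named threads make thread dumps much easier to read
            Thread t = new Thread(r, "io-pool-" + seq.incrementAndGet());
            t.setDaemon(true);
            return t;
        };
        return Executors.newFixedThreadPool(size, factory);
    }

    // Runs a stand-in blocking call on the dedicated pool and
    // returns the name of the thread that executed it.
    static String runBlockingCall() {
        ExecutorService ioPool = newIoPool(4);
        try {
            return CompletableFuture
                    .supplyAsync(() -> {
                        sleepQuietly(50); // stand-in for a slow HTTP or JDBC call
                        return Thread.currentThread().getName();
                    }, ioPool) // second argument keeps this off the common pool
                    .join();
        } finally {
            ioPool.shutdown();
        }
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(runBlockingCall());
    }
}
```

Giving each downstream dependency its own pool like this is the bulkhead pattern mentioned above: a slow service can exhaust only its own threads.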
Follow-up: How would you size a thread pool correctly? What formula or approach do you use?
  • The classic formula from Brian Goetz’s “Java Concurrency in Practice” is: threads = N_cpu * U_cpu * (1 + W/C), where N_cpu is the number of available processors, U_cpu is the target CPU utilization (usually 0.5 to 1.0), W is the average time a task spends waiting (I/O, locks), and C is the average time a task spends computing. For CPU-bound tasks (W/C near 0), you want roughly N_cpu threads. For I/O-bound tasks (W/C is large), you want many more threads.
  • In practice, I do not trust the formula blindly. I start with the formula as a baseline, then load test the application with a realistic traffic pattern and observe: CPU utilization (should be high but not saturated — 70-80% is healthy), response latency percentiles (p50, p95, p99), thread pool queue depth (should be near zero in steady state), and GC pause behavior (too many threads create too many objects, increasing GC pressure). I adjust the pool size iteratively based on these observations.
  • A common mistake: setting the pool size based on peak load. If you configure 200 threads for a traffic spike that happens once a day, you waste memory and OS resources for the other 23 hours. Use a dynamic pool (ThreadPoolExecutor with core and max pool sizes) that scales up under load and shrinks during quiet periods. Set the core size for average load and the max size for peak load, with a reasonable keep-alive time (60 seconds is what Executors.newCachedThreadPool() uses). Note that ThreadPoolExecutor only grows beyond the core size once its work queue is full, so pair the max size with a bounded queue; with an unbounded queue the extra threads are never created.
  • With virtual threads (Java 21), the pool sizing question largely disappears for I/O-bound work. You create one virtual thread per task and let the JVM manage carrier thread scheduling. The JVM’s ForkJoinPool for carriers is sized to the number of processors, which is optimal for CPU utilization. This is one of the most compelling reasons to migrate to virtual threads.
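
For platform-thread pools, the sizing formula above can be turned into a small helper. In the sketch below, the class and method names, the sample numbers (8 cores, 80% target utilization, 90 ms waiting per 10 ms computing), and the queue capacity of 100 are illustrative assumptions, and the result is a starting point for load testing rather than a final answer.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolSizing {

    // threads = N_cpu * U_cpu * (1 + W/C), rounded to a whole thread (minimum 1).
    static int poolSize(int cpus, double targetUtilization,
                        double waitMillis, double computeMillis) {
        double raw = cpus * targetUtilization * (1 + waitMillis / computeMillis);
        return Math.max(1, (int) Math.round(raw));
    }

    // Bounded queue so the pool can actually grow from core to max under load.
    static ThreadPoolExecutor newPool(int core, int max, int queueCapacity) {
        return new ThreadPoolExecutor(
                core, max,
                60, TimeUnit.SECONDS, // idle threads above core exit after 60 s
                new ArrayBlockingQueue<>(queueCapacity));
    }

    public static void main(String[] args) {
        // 8 * 0.8 * (1 + 90/10) = 64
        int size = poolSize(8, 0.8, 90, 10);
        ThreadPoolExecutor pool = newPool(size / 2, size, 100);
        System.out.println("max threads: " + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

When the queue is full and the pool is already at its maximum size, ThreadPoolExecutor rejects new tasks with its default AbortPolicy, so callers need a plan for RejectedExecutionException.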