Skip to main content

Documentation Index

Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt

Use this file to discover all available pages before exploring further.

C Thread Synchronization

Concurrency & Threading

Modern systems demand concurrent programming. Let’s master POSIX threads and synchronization primitives. A useful mental model: threads are like multiple cooks in a shared kitchen. They can all work simultaneously (parallelism), but if two cooks try to use the same knife at the same time, someone gets cut (data corruption). Synchronization primitives are the rules that prevent collisions — “only one person uses the stove at a time” (mutex), “wait until the ingredients are prepped before cooking” (condition variable), “anyone can read the recipe, but only one person can write changes” (read-write lock).

POSIX Threads Basics

Threads are “lightweight processes”. Unlike fork(), which creates a copy of the process, threads share the same memory space (heap, data segments, file descriptors) but have their own stack and registers. Key differences:
  • Shared Memory: All threads can access global variables and the heap.
  • Independent Execution: Each thread runs independently (scheduled by the kernel).
  • Low Overhead: Creating a thread is much faster than creating a process.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

void *thread_function(void *arg) {
    int thread_num = *(int*)arg;
    printf("Thread %d starting\n", thread_num);
    
    sleep(1);  // Simulate work
    
    printf("Thread %d finishing\n", thread_num);
    
    // Return value (will be collected by pthread_join)
    int *result = malloc(sizeof(int));
    *result = thread_num * 10;
    return result;
}

int main(void) {
    pthread_t threads[5];
    int thread_args[5];
    
    // Create threads
    for (int i = 0; i < 5; i++) {
        thread_args[i] = i;
        int ret = pthread_create(&threads[i], NULL, thread_function, &thread_args[i]);
        if (ret != 0) {
            fprintf(stderr, "pthread_create failed: %d\n", ret);
            return 1;
        }
    }
    
    // Wait for threads to complete
    for (int i = 0; i < 5; i++) {
        void *result;
        pthread_join(threads[i], &result);
        printf("Thread %d returned: %d\n", i, *(int*)result);
        free(result);
    }
    
    return 0;
}

// Compile with: gcc -pthread program.c -o program

Thread Attributes

#include <pthread.h>

void create_with_attributes(void) {
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    
    // Set stack size
    pthread_attr_setstacksize(&attr, 2 * 1024 * 1024);  // 2 MB
    
    // Set detached state (no join needed)
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    
    // Set scheduling policy
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    
    pthread_t thread;
    pthread_create(&thread, &attr, thread_function, NULL);
    
    pthread_attr_destroy(&attr);
}

// Detach a thread (can't join after this)
void detach_example(void) {
    pthread_t thread;
    pthread_create(&thread, NULL, thread_function, NULL);
    pthread_detach(thread);  // Thread cleans itself up
}

Why Synchronization Matters

When multiple threads access shared data concurrently, race conditions occur. Without proper synchronization, the final state of your data becomes unpredictable.

The Problem: Race Conditions

Consider two threads incrementing a counter:
// Shared variable
int counter = 100;

// Thread A and Thread B both execute this:
counter++;  // Looks atomic, but it's NOT!
What actually happens (assembly level):
  1. Thread A reads counter (value: 100)
  2. Thread B reads counter (value: 100)
  3. Thread A increments to 101, writes back
  4. Thread B increments to 101, writes back
  5. Final value: 101 (should be 102!)
This is called a lost update - one thread’s work was silently discarded.

The Solution: Synchronization Primitives

Thread Synchronization Use synchronization primitives (mutexes, condition variables, atomics) to ensure only one thread accesses shared data at a time. The diagram above shows a classic Producer-Consumer pattern where:
  • Mutex provides mutual exclusion (only one thread in critical section)
  • Condition variables allow threads to wait for specific conditions
  • Bounded queue is the shared resource protected by synchronization
Key Insight: Synchronization trades performance (threads must wait) for correctness (data remains consistent).

Mutexes

#include <stdio.h>
#include <pthread.h>

// Shared data
int counter = 0;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void *increment_thread(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&mutex);
        counter++;
        pthread_mutex_unlock(&mutex);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    
    pthread_create(&t1, NULL, increment_thread, NULL);
    pthread_create(&t2, NULL, increment_thread, NULL);
    
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    
    printf("Counter: %d (expected: 200000)\n", counter);
    
    pthread_mutex_destroy(&mutex);
    return 0;
}

Mutex Types

#include <pthread.h>

void mutex_types_demo(void) {
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    
    pthread_mutexattr_init(&attr);
    
    // Normal mutex (default) - undefined behavior on recursive lock
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_NORMAL);
    
    // Error checking - returns error on recursive lock
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    
    // Recursive mutex - allows same thread to lock multiple times
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    
    // Recursive lock example
    pthread_mutex_lock(&mutex);
    pthread_mutex_lock(&mutex);    // OK with RECURSIVE
    pthread_mutex_unlock(&mutex);
    pthread_mutex_unlock(&mutex);  // Must unlock same number of times
}

// Try-lock (non-blocking)
void trylock_example(pthread_mutex_t *mutex) {
    if (pthread_mutex_trylock(mutex) == 0) {
        // Got the lock
        // ... critical section ...
        pthread_mutex_unlock(mutex);
    } else {
        // Couldn't get lock, do something else
    }
}

// Timed lock
#include <time.h>

void timedlock_example(pthread_mutex_t *mutex) {
    struct timespec timeout;
    clock_gettime(CLOCK_REALTIME, &timeout);
    timeout.tv_sec += 1;  // 1 second timeout
    
    int ret = pthread_mutex_timedlock(mutex, &timeout);
    if (ret == 0) {
        // Got the lock
        pthread_mutex_unlock(mutex);
    } else if (ret == ETIMEDOUT) {
        // Timeout
    }
}

Condition Variables

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_SIZE 10

typedef struct {
    int items[QUEUE_SIZE];
    int front, rear, count;
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
} BoundedQueue;

void queue_init(BoundedQueue *q) {
    q->front = q->rear = q->count = 0;
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

void queue_destroy(BoundedQueue *q) {
    pthread_mutex_destroy(&q->mutex);
    pthread_cond_destroy(&q->not_empty);
    pthread_cond_destroy(&q->not_full);
}

void queue_push(BoundedQueue *q, int item) {
    pthread_mutex_lock(&q->mutex);
    
    // Wait while full
    while (q->count == QUEUE_SIZE) {
        pthread_cond_wait(&q->not_full, &q->mutex);
    }
    
    q->items[q->rear] = item;
    q->rear = (q->rear + 1) % QUEUE_SIZE;
    q->count++;
    
    // Signal that queue is not empty
    pthread_cond_signal(&q->not_empty);
    
    pthread_mutex_unlock(&q->mutex);
}

int queue_pop(BoundedQueue *q) {
    pthread_mutex_lock(&q->mutex);
    
    // Wait while empty
    while (q->count == 0) {
        pthread_cond_wait(&q->not_empty, &q->mutex);
    }
    
    int item = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_SIZE;
    q->count--;
    
    // Signal that queue is not full
    pthread_cond_signal(&q->not_full);
    
    pthread_mutex_unlock(&q->mutex);
    return item;
}

// Producer-Consumer example
BoundedQueue queue;

void *producer(void *arg) {
    int id = *(int*)arg;
    for (int i = 0; i < 20; i++) {
        int item = id * 100 + i;
        queue_push(&queue, item);
        printf("Producer %d: pushed %d\n", id, item);
    }
    return NULL;
}

void *consumer(void *arg) {
    int id = *(int*)arg;
    for (int i = 0; i < 20; i++) {
        int item = queue_pop(&queue);
        printf("Consumer %d: popped %d\n", id, item);
    }
    return NULL;
}
Always use a while loop with condition variables, not if! Spurious wakeups can occur, where the thread wakes up even though no signal was sent.

Read-Write Locks

#include <stdio.h>
#include <pthread.h>

typedef struct {
    int data;
    pthread_rwlock_t rwlock;
} SharedData;

SharedData shared = {0, PTHREAD_RWLOCK_INITIALIZER};

void *reader(void *arg) {
    for (int i = 0; i < 10; i++) {
        pthread_rwlock_rdlock(&shared.rwlock);
        printf("Reader %ld: data = %d\n", (long)arg, shared.data);
        pthread_rwlock_unlock(&shared.rwlock);
    }
    return NULL;
}

void *writer(void *arg) {
    for (int i = 0; i < 5; i++) {
        pthread_rwlock_wrlock(&shared.rwlock);
        shared.data++;
        printf("Writer %ld: data = %d\n", (long)arg, shared.data);
        pthread_rwlock_unlock(&shared.rwlock);
    }
    return NULL;
}

// Multiple readers can hold the lock simultaneously
// Writers have exclusive access

Thread-Local Storage

#include <stdio.h>
#include <pthread.h>

// C11 thread-local storage
_Thread_local int thread_var = 0;

// POSIX thread-specific data
pthread_key_t key;

void destructor(void *data) {
    printf("Destroying thread data: %d\n", *(int*)data);
    free(data);
}

void *thread_func(void *arg) {
    // C11 TLS
    thread_var = (int)(long)arg;
    printf("Thread %ld: thread_var = %d\n", (long)arg, thread_var);
    
    // POSIX thread-specific data
    int *data = malloc(sizeof(int));
    *data = (int)(long)arg * 10;
    pthread_setspecific(key, data);
    
    int *retrieved = pthread_getspecific(key);
    printf("Thread %ld: specific data = %d\n", (long)arg, *retrieved);
    
    return NULL;
}

int main(void) {
    pthread_key_create(&key, destructor);
    
    pthread_t threads[3];
    for (long i = 0; i < 3; i++) {
        pthread_create(&threads[i], NULL, thread_func, (void*)i);
    }
    
    for (int i = 0; i < 3; i++) {
        pthread_join(threads[i], NULL);
    }
    
    pthread_key_delete(key);
    return 0;
}

Thread Pool

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct Task {
    void (*function)(void*);
    void *arg;
    struct Task *next;
} Task;

typedef struct {
    Task *head, *tail;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    pthread_t *threads;
    int num_threads;
    bool shutdown;
} ThreadPool;

void *worker(void *arg) {
    ThreadPool *pool = arg;
    
    while (1) {
        pthread_mutex_lock(&pool->mutex);
        
        while (pool->head == NULL && !pool->shutdown) {
            pthread_cond_wait(&pool->cond, &pool->mutex);
        }
        
        if (pool->shutdown && pool->head == NULL) {
            pthread_mutex_unlock(&pool->mutex);
            break;
        }
        
        Task *task = pool->head;
        pool->head = task->next;
        if (pool->head == NULL) {
            pool->tail = NULL;
        }
        
        pthread_mutex_unlock(&pool->mutex);
        
        task->function(task->arg);
        free(task);
    }
    
    return NULL;
}

ThreadPool *threadpool_create(int num_threads) {
    ThreadPool *pool = calloc(1, sizeof(ThreadPool));
    pool->num_threads = num_threads;
    pool->threads = malloc(num_threads * sizeof(pthread_t));
    
    pthread_mutex_init(&pool->mutex, NULL);
    pthread_cond_init(&pool->cond, NULL);
    
    for (int i = 0; i < num_threads; i++) {
        pthread_create(&pool->threads[i], NULL, worker, pool);
    }
    
    return pool;
}

void threadpool_submit(ThreadPool *pool, void (*func)(void*), void *arg) {
    Task *task = malloc(sizeof(Task));
    task->function = func;
    task->arg = arg;
    task->next = NULL;
    
    pthread_mutex_lock(&pool->mutex);
    
    if (pool->tail) {
        pool->tail->next = task;
    } else {
        pool->head = task;
    }
    pool->tail = task;
    
    pthread_cond_signal(&pool->cond);
    pthread_mutex_unlock(&pool->mutex);
}

void threadpool_destroy(ThreadPool *pool) {
    pthread_mutex_lock(&pool->mutex);
    pool->shutdown = true;
    // broadcast, not signal: we need ALL worker threads to wake up and see
    // the shutdown flag. signal only wakes one, leaving others blocked forever.
    pthread_cond_broadcast(&pool->cond);
    pthread_mutex_unlock(&pool->mutex);
    
    // Join every thread to ensure clean shutdown. Without this, the process
    // could exit while worker threads are still running, causing undefined
    // behavior if they access freed memory.
    for (int i = 0; i < pool->num_threads; i++) {
        pthread_join(pool->threads[i], NULL);
    }
    
    // PITFALL: Any tasks still in the queue are leaked here. A production
    // thread pool should either drain the queue before shutting down, or
    // provide a task cleanup callback so callers can free task arguments.
    pthread_mutex_destroy(&pool->mutex);
    pthread_cond_destroy(&pool->cond);
    free(pool->threads);
    free(pool);
}

Atomics (C11)

Atomics provide lock-free thread safety for individual variables. A common misconception: volatile does NOT provide thread safety in C. volatile only prevents the compiler from caching the value in a register — it says nothing about CPU caches, memory ordering, or atomicity. For thread-safe shared variables, you need _Atomic (or explicit mutexes). This is a frequent source of subtle bugs in code ported from single-core embedded systems to multi-core servers.
#include <stdio.h>
#include <stdatomic.h>
#include <pthread.h>

atomic_int counter = 0;

void *atomic_increment(void *arg) {
    for (int i = 0; i < 100000; i++) {
        atomic_fetch_add(&counter, 1);
    }
    return NULL;
}

// Lock-free stack
typedef struct Node {
    int data;
    struct Node *next;
} Node;

typedef struct {
    _Atomic(Node*) head;
} LockFreeStack;

void lfs_push(LockFreeStack *stack, int value) {
    Node *node = malloc(sizeof(Node));
    node->data = value;
    
    Node *old_head = atomic_load(&stack->head);
    do {
        node->next = old_head;
    } while (!atomic_compare_exchange_weak(&stack->head, &old_head, node));
}

int lfs_pop(LockFreeStack *stack, int *value) {
    Node *old_head = atomic_load(&stack->head);
    Node *new_head;
    
    do {
        if (old_head == NULL) return 0;  // Empty
        new_head = old_head->next;
    } while (!atomic_compare_exchange_weak(&stack->head, &old_head, new_head));
    
    *value = old_head->data;
    free(old_head);  // DANGER: This is a classic ABA problem / use-after-free.
    // Between loading old_head and the successful CAS, another thread could have
    // popped this node, freed it, and pushed a NEW node at the same address.
    // The CAS succeeds (same pointer value) but old_head->next is now garbage.
    // Production lock-free code uses hazard pointers, epoch-based reclamation,
    // or tagged pointers to solve this. Do not use this pattern in real systems.
    return 1;
}

// Memory ordering
void memory_ordering_example(void) {
    atomic_int x = 0, y = 0;
    
    // Relaxed (weakest)
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    
    // Release (writes before are visible)
    atomic_store_explicit(&x, 1, memory_order_release);
    
    // Acquire (reads after see previous writes)
    int val = atomic_load_explicit(&x, memory_order_acquire);
    
    // Sequentially consistent (strongest, default)
    atomic_store(&x, 1);  // Uses memory_order_seq_cst
}

Common Pitfalls

Deadlock

pthread_mutex_t mutex_a, mutex_b;

// Thread 1
void *thread1(void *arg) {
    pthread_mutex_lock(&mutex_a);
    sleep(1);
    pthread_mutex_lock(&mutex_b);  // Waits for thread2
    // ...
    pthread_mutex_unlock(&mutex_b);
    pthread_mutex_unlock(&mutex_a);
    return NULL;
}

// Thread 2
void *thread2(void *arg) {
    pthread_mutex_lock(&mutex_b);
    sleep(1);
    pthread_mutex_lock(&mutex_a);  // Waits for thread1 - DEADLOCK!
    // ...
    pthread_mutex_unlock(&mutex_a);
    pthread_mutex_unlock(&mutex_b);
    return NULL;
}

// Solution: Always acquire locks in the same order
void *thread1_fixed(void *arg) {
    pthread_mutex_lock(&mutex_a);
    pthread_mutex_lock(&mutex_b);
    // ...
    pthread_mutex_unlock(&mutex_b);
    pthread_mutex_unlock(&mutex_a);
    return NULL;
}

Preventing Deadlocks

Deadlock Visualization Deadlocks occur when threads wait for each other in a circular dependency. The diagram above shows the classic scenario where Thread 1 holds Mutex A and wants Mutex B, while Thread 2 holds Mutex B and wants Mutex A.

Four Conditions for Deadlock (Coffman Conditions)

A deadlock can only occur if ALL four conditions are present:
  1. Mutual Exclusion: Resources cannot be shared (mutexes by definition)
  2. Hold and Wait: Thread holds resources while waiting for more
  3. No Preemption: Resources cannot be forcibly taken from threads
  4. Circular Wait: Circular chain of threads waiting for resources
Prevention Strategy: Break at least one of these conditions.

Deadlock Prevention Techniques

1. Lock Ordering (Most Common) Always acquire locks in the same global order across all threads - this breaks the circular wait condition. 2. Lock Timeout Use pthread_mutex_timedlock() and retry:
struct timespec timeout;
clock_gettime(CLOCK_REALTIME, &timeout);
timeout.tv_sec += 1;

if (pthread_mutex_timedlock(&mutex_b, &timeout) != 0) {
    // Couldn't get lock, release what we have
    pthread_mutex_unlock(&mutex_a);
    // Retry or handle error
}
3. Try-Lock and Backoff
pthread_mutex_lock(&mutex_a);

if (pthread_mutex_trylock(&mutex_b) != 0) {
    // Couldn't get mutex_b, release mutex_a
    pthread_mutex_unlock(&mutex_a);
    usleep(rand() % 1000);  // Random backoff
    // Retry
}
Best Practice: Design your system to minimize the number of locks needed simultaneously. Consider using lock-free data structures or message passing instead of shared memory.

Race Condition (TOCTOU — Time of Check to Time of Use)

The check-then-act pattern is one of the most common concurrency bugs. Between the moment you check a condition and the moment you act on it, another thread can change the state, making your action invalid. This class of bug is called TOCTOU (Time Of Check to Time Of Use) and it appears not just in threading but also in file system operations (checking if a file exists, then opening it — another process can delete it in between).
// Dangerous: check-then-act without lock
void dangerous_update(void) {
    if (counter > 0) {      // Thread 1 checks: counter is 1
        // <-- Thread 2 runs here and decrements counter to 0
        counter--;          // Thread 1 decrements: counter is now -1 (BUG!)
    }
}

// Safe: check and act atomically
void safe_update(pthread_mutex_t *mutex) {
    pthread_mutex_lock(mutex);
    if (counter > 0) {
        counter--;
    }
    pthread_mutex_unlock(mutex);
}

Exercises

1

Dining Philosophers

Implement the dining philosophers problem with proper deadlock prevention.
2

Reader-Writer Problem

Implement a solution that doesn’t starve either readers or writers.
3

Barrier

Implement a reusable barrier that allows N threads to synchronize.
4

Thread-Safe Queue

Implement a lock-free MPSC (multi-producer, single-consumer) queue.

Next Up

Network Programming

Build network applications with sockets