Concurrency & Threading
Modern systems demand concurrent programming. Let’s master POSIX threads and synchronization primitives.
POSIX Threads Basics
Threads are “lightweight processes”. Unlike fork(), which creates a copy of the process, threads share the same memory space (heap, data segments, file descriptors) but have their own stack and registers.
Key differences :
Shared Memory : All threads can access global variables and the heap.
Independent Execution : Each thread runs independently (scheduled by the kernel).
Low Overhead : Creating a thread is much faster than creating a process.
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
void * thread_function ( void * arg ) {
int thread_num = * ( int * )arg;
printf ( "Thread %d starting \n " , thread_num);
sleep ( 1 ); // Simulate work
printf ( "Thread %d finishing \n " , thread_num);
// Return value (will be collected by pthread_join)
int * result = malloc ( sizeof ( int ));
* result = thread_num * 10 ;
return result;
}
int main ( void ) {
pthread_t threads [ 5 ];
int thread_args [ 5 ];
// Create threads
for ( int i = 0 ; i < 5 ; i ++ ) {
thread_args [i] = i;
int ret = pthread_create ( & threads [i], NULL , thread_function, & thread_args [i]);
if (ret != 0 ) {
fprintf (stderr, "pthread_create failed: %d \n " , ret);
return 1 ;
}
}
// Wait for threads to complete
for ( int i = 0 ; i < 5 ; i ++ ) {
void * result;
pthread_join ( threads [i], & result);
printf ( "Thread %d returned: %d \n " , i, * ( int * )result);
free (result);
}
return 0 ;
}
// Compile with: gcc -pthread program.c -o program
Thread Attributes
#include <pthread.h>
void create_with_attributes ( void ) {
pthread_attr_t attr;
pthread_attr_init ( & attr);
// Set stack size
pthread_attr_setstacksize ( & attr, 2 * 1024 * 1024 ); // 2 MB
// Set detached state (no join needed)
pthread_attr_setdetachstate ( & attr, PTHREAD_CREATE_DETACHED);
// Set scheduling policy
pthread_attr_setschedpolicy ( & attr, SCHED_FIFO);
pthread_t thread;
pthread_create ( & thread, & attr, thread_function, NULL );
pthread_attr_destroy ( & attr);
}
// Detach a thread (can't join after this)
void detach_example ( void ) {
pthread_t thread;
pthread_create ( & thread, NULL , thread_function, NULL );
pthread_detach (thread); // Thread cleans itself up
}
Why Synchronization Matters
When multiple threads access shared data concurrently, race conditions occur. Without proper synchronization, the final state of your data becomes unpredictable.
The Problem: Race Conditions
Consider two threads incrementing a counter:
// Shared variable
int counter = 100 ;
// Thread A and Thread B both execute this:
counter ++ ; // Looks atomic, but it's NOT!
What actually happens (assembly level):
Thread A reads counter (value: 100)
Thread B reads counter (value: 100)
Thread A increments to 101, writes back
Thread B increments to 101, writes back
Final value: 101 (should be 102!)
This is called a lost update - one thread’s work was silently discarded.
The Solution: Synchronization Primitives
Use synchronization primitives (mutexes, condition variables, atomics) to ensure only one thread accesses shared data at a time. The diagram above shows a classic Producer-Consumer pattern where:
Mutex provides mutual exclusion (only one thread in critical section)
Condition variables allow threads to wait for specific conditions
Bounded queue is the shared resource protected by synchronization
Key Insight : Synchronization trades performance (threads must wait) for correctness (data remains consistent).
Mutexes
#include <stdio.h>
#include <pthread.h>
// Shared data
int counter = 0 ;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
void * increment_thread ( void * arg ) {
for ( int i = 0 ; i < 100000 ; i ++ ) {
pthread_mutex_lock ( & mutex);
counter ++ ;
pthread_mutex_unlock ( & mutex);
}
return NULL ;
}
int main ( void ) {
pthread_t t1, t2;
pthread_create ( & t1, NULL , increment_thread, NULL );
pthread_create ( & t2, NULL , increment_thread, NULL );
pthread_join (t1, NULL );
pthread_join (t2, NULL );
printf ( "Counter: %d (expected: 200000) \n " , counter);
pthread_mutex_destroy ( & mutex);
return 0 ;
}
Mutex Types
#include <pthread.h>
void mutex_types_demo ( void ) {
pthread_mutexattr_t attr;
pthread_mutex_t mutex;
pthread_mutexattr_init ( & attr);
// Normal mutex (default) - undefined behavior on recursive lock
pthread_mutexattr_settype ( & attr, PTHREAD_MUTEX_NORMAL);
// Error checking - returns error on recursive lock
pthread_mutexattr_settype ( & attr, PTHREAD_MUTEX_ERRORCHECK);
// Recursive mutex - allows same thread to lock multiple times
pthread_mutexattr_settype ( & attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init ( & mutex, & attr);
pthread_mutexattr_destroy ( & attr);
// Recursive lock example
pthread_mutex_lock ( & mutex);
pthread_mutex_lock ( & mutex); // OK with RECURSIVE
pthread_mutex_unlock ( & mutex);
pthread_mutex_unlock ( & mutex); // Must unlock same number of times
}
// Try-lock (non-blocking)
void trylock_example ( pthread_mutex_t * mutex ) {
if ( pthread_mutex_trylock (mutex) == 0 ) {
// Got the lock
// ... critical section ...
pthread_mutex_unlock (mutex);
} else {
// Couldn't get lock, do something else
}
}
// Timed lock
#include <time.h>
void timedlock_example ( pthread_mutex_t * mutex ) {
struct timespec timeout;
clock_gettime (CLOCK_REALTIME, & timeout);
timeout . tv_sec += 1 ; // 1 second timeout
int ret = pthread_mutex_timedlock (mutex, & timeout);
if (ret == 0 ) {
// Got the lock
pthread_mutex_unlock (mutex);
} else if (ret == ETIMEDOUT) {
// Timeout
}
}
Condition Variables
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdbool.h>
#define QUEUE_SIZE 10
typedef struct {
int items [QUEUE_SIZE];
int front, rear, count;
pthread_mutex_t mutex;
pthread_cond_t not_empty;
pthread_cond_t not_full;
} BoundedQueue;
void queue_init (BoundedQueue * q ) {
q -> front = q -> rear = q -> count = 0 ;
pthread_mutex_init ( & q -> mutex , NULL );
pthread_cond_init ( & q -> not_empty , NULL );
pthread_cond_init ( & q -> not_full , NULL );
}
void queue_destroy (BoundedQueue * q ) {
pthread_mutex_destroy ( & q -> mutex );
pthread_cond_destroy ( & q -> not_empty );
pthread_cond_destroy ( & q -> not_full );
}
void queue_push (BoundedQueue * q , int item ) {
pthread_mutex_lock ( & q -> mutex );
// Wait while full
while ( q -> count == QUEUE_SIZE) {
pthread_cond_wait ( & q -> not_full , & q -> mutex );
}
q -> items [ q -> rear ] = item;
q -> rear = ( q -> rear + 1 ) % QUEUE_SIZE;
q -> count ++ ;
// Signal that queue is not empty
pthread_cond_signal ( & q -> not_empty );
pthread_mutex_unlock ( & q -> mutex );
}
int queue_pop (BoundedQueue * q ) {
pthread_mutex_lock ( & q -> mutex );
// Wait while empty
while ( q -> count == 0 ) {
pthread_cond_wait ( & q -> not_empty , & q -> mutex );
}
int item = q -> items [ q -> front ];
q -> front = ( q -> front + 1 ) % QUEUE_SIZE;
q -> count -- ;
// Signal that queue is not full
pthread_cond_signal ( & q -> not_full );
pthread_mutex_unlock ( & q -> mutex );
return item;
}
// Producer-Consumer example
BoundedQueue queue;
void * producer ( void * arg ) {
int id = * ( int * )arg;
for ( int i = 0 ; i < 20 ; i ++ ) {
int item = id * 100 + i;
queue_push ( & queue, item);
printf ( "Producer %d : pushed %d \n " , id, item);
}
return NULL ;
}
void * consumer ( void * arg ) {
int id = * ( int * )arg;
for ( int i = 0 ; i < 20 ; i ++ ) {
int item = queue_pop ( & queue);
printf ( "Consumer %d : popped %d \n " , id, item);
}
return NULL ;
}
Always use a while loop with condition variables, not if! Spurious wakeups can occur, where the thread wakes up even though no signal was sent.
Read-Write Locks
#include <stdio.h>
#include <pthread.h>
typedef struct {
int data;
pthread_rwlock_t rwlock;
} SharedData;
SharedData shared = { 0 , PTHREAD_RWLOCK_INITIALIZER};
void * reader ( void * arg ) {
for ( int i = 0 ; i < 10 ; i ++ ) {
pthread_rwlock_rdlock ( & shared . rwlock );
printf ( "Reader %ld : data = %d \n " , ( long )arg, shared . data );
pthread_rwlock_unlock ( & shared . rwlock );
}
return NULL ;
}
void * writer ( void * arg ) {
for ( int i = 0 ; i < 5 ; i ++ ) {
pthread_rwlock_wrlock ( & shared . rwlock );
shared . data ++ ;
printf ( "Writer %ld : data = %d \n " , ( long )arg, shared . data );
pthread_rwlock_unlock ( & shared . rwlock );
}
return NULL ;
}
// Multiple readers can hold the lock simultaneously
// Writers have exclusive access
Thread-Local Storage
#include <stdio.h>
#include <pthread.h>
// C11 thread-local storage
_Thread_local int thread_var = 0 ;
// POSIX thread-specific data
pthread_key_t key;
void destructor ( void * data ) {
printf ( "Destroying thread data: %d \n " , * ( int * )data);
free (data);
}
void * thread_func ( void * arg ) {
// C11 TLS
thread_var = ( int )( long )arg;
printf ( "Thread %ld : thread_var = %d \n " , ( long )arg, thread_var);
// POSIX thread-specific data
int * data = malloc ( sizeof ( int ));
* data = ( int )( long )arg * 10 ;
pthread_setspecific (key, data);
int * retrieved = pthread_getspecific (key);
printf ( "Thread %ld : specific data = %d \n " , ( long )arg, * retrieved);
return NULL ;
}
int main ( void ) {
pthread_key_create ( & key, destructor);
pthread_t threads [ 3 ];
for ( long i = 0 ; i < 3 ; i ++ ) {
pthread_create ( & threads [i], NULL , thread_func, ( void * )i);
}
for ( int i = 0 ; i < 3 ; i ++ ) {
pthread_join ( threads [i], NULL );
}
pthread_key_delete (key);
return 0 ;
}
Thread Pool
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <stdbool.h>
typedef struct Task {
void ( * function)( void * );
void * arg;
struct Task * next;
} Task;
typedef struct {
Task * head, * tail;
pthread_mutex_t mutex;
pthread_cond_t cond;
pthread_t * threads;
int num_threads;
bool shutdown;
} ThreadPool;
void * worker ( void * arg ) {
ThreadPool * pool = arg;
while ( 1 ) {
pthread_mutex_lock ( & pool -> mutex );
while ( pool -> head == NULL && ! pool -> shutdown ) {
pthread_cond_wait ( & pool -> cond , & pool -> mutex );
}
if ( pool -> shutdown && pool -> head == NULL ) {
pthread_mutex_unlock ( & pool -> mutex );
break ;
}
Task * task = pool -> head ;
pool -> head = task -> next ;
if ( pool -> head == NULL ) {
pool -> tail = NULL ;
}
pthread_mutex_unlock ( & pool -> mutex );
task -> function ( task -> arg );
free (task);
}
return NULL ;
}
ThreadPool * threadpool_create ( int num_threads ) {
ThreadPool * pool = calloc ( 1 , sizeof (ThreadPool));
pool -> num_threads = num_threads;
pool -> threads = malloc (num_threads * sizeof ( pthread_t ));
pthread_mutex_init ( & pool -> mutex , NULL );
pthread_cond_init ( & pool -> cond , NULL );
for ( int i = 0 ; i < num_threads; i ++ ) {
pthread_create ( & pool -> threads [i], NULL , worker, pool);
}
return pool;
}
void threadpool_submit (ThreadPool * pool , void ( * func)( void * ), void * arg ) {
Task * task = malloc ( sizeof (Task));
task -> function = func;
task -> arg = arg;
task -> next = NULL ;
pthread_mutex_lock ( & pool -> mutex );
if ( pool -> tail ) {
pool -> tail -> next = task;
} else {
pool -> head = task;
}
pool -> tail = task;
pthread_cond_signal ( & pool -> cond );
pthread_mutex_unlock ( & pool -> mutex );
}
void threadpool_destroy (ThreadPool * pool ) {
pthread_mutex_lock ( & pool -> mutex );
pool -> shutdown = true ;
pthread_cond_broadcast ( & pool -> cond );
pthread_mutex_unlock ( & pool -> mutex );
for ( int i = 0 ; i < pool -> num_threads ; i ++ ) {
pthread_join ( pool -> threads [i], NULL );
}
pthread_mutex_destroy ( & pool -> mutex );
pthread_cond_destroy ( & pool -> cond );
free ( pool -> threads );
free (pool);
}
Atomics (C11)
#include <stdio.h>
#include <stdatomic.h>
#include <pthread.h>
atomic_int counter = 0 ;
void * atomic_increment ( void * arg ) {
for ( int i = 0 ; i < 100000 ; i ++ ) {
atomic_fetch_add ( & counter, 1 );
}
return NULL ;
}
// Lock-free stack
typedef struct Node {
int data;
struct Node * next;
} Node;
typedef struct {
_Atomic (Node * ) head;
} LockFreeStack;
void lfs_push (LockFreeStack * stack , int value ) {
Node * node = malloc ( sizeof (Node));
node -> data = value;
Node * old_head = atomic_load ( & stack -> head );
do {
node -> next = old_head;
} while ( ! atomic_compare_exchange_weak ( & stack -> head , & old_head, node));
}
int lfs_pop (LockFreeStack * stack , int * value ) {
Node * old_head = atomic_load ( & stack -> head );
Node * new_head;
do {
if (old_head == NULL ) return 0 ; // Empty
new_head = old_head -> next ;
} while ( ! atomic_compare_exchange_weak ( & stack -> head , & old_head, new_head));
* value = old_head -> data ;
free (old_head); // Warning: potential use-after-free in real code!
return 1 ;
}
// Memory ordering
void memory_ordering_example ( void ) {
atomic_int x = 0 , y = 0 ;
// Relaxed (weakest)
atomic_store_explicit ( & x, 1 , memory_order_relaxed);
// Release (writes before are visible)
atomic_store_explicit ( & x, 1 , memory_order_release);
// Acquire (reads after see previous writes)
int val = atomic_load_explicit ( & x, memory_order_acquire);
// Sequentially consistent (strongest, default)
atomic_store ( & x, 1 ); // Uses memory_order_seq_cst
}
Common Pitfalls
Deadlock
pthread_mutex_t mutex_a, mutex_b;
// Thread 1
void * thread1 ( void * arg ) {
pthread_mutex_lock ( & mutex_a);
sleep ( 1 );
pthread_mutex_lock ( & mutex_b); // Waits for thread2
// ...
pthread_mutex_unlock ( & mutex_b);
pthread_mutex_unlock ( & mutex_a);
return NULL ;
}
// Thread 2
void * thread2 ( void * arg ) {
pthread_mutex_lock ( & mutex_b);
sleep ( 1 );
pthread_mutex_lock ( & mutex_a); // Waits for thread1 - DEADLOCK!
// ...
pthread_mutex_unlock ( & mutex_a);
pthread_mutex_unlock ( & mutex_b);
return NULL ;
}
// Solution: Always acquire locks in the same order
void * thread1_fixed ( void * arg ) {
pthread_mutex_lock ( & mutex_a);
pthread_mutex_lock ( & mutex_b);
// ...
pthread_mutex_unlock ( & mutex_b);
pthread_mutex_unlock ( & mutex_a);
return NULL ;
}
Preventing Deadlocks
Deadlocks occur when threads wait for each other in a circular dependency. The diagram above shows the classic scenario where Thread 1 holds Mutex A and wants Mutex B, while Thread 2 holds Mutex B and wants Mutex A.
Four Conditions for Deadlock (Coffman Conditions)
A deadlock can only occur if ALL four conditions are present:
Mutual Exclusion : Resources cannot be shared (mutexes by definition)
Hold and Wait : Thread holds resources while waiting for more
No Preemption : Resources cannot be forcibly taken from threads
Circular Wait : Circular chain of threads waiting for resources
Prevention Strategy : Break at least one of these conditions.
Deadlock Prevention Techniques
1. Lock Ordering (Most Common)
Always acquire locks in the same global order across all threads - this breaks the circular wait condition.
2. Lock Timeout
Use pthread_mutex_timedlock() and retry:
struct timespec timeout;
clock_gettime (CLOCK_REALTIME, & timeout );
timeout.tv_sec += 1 ;
if ( pthread_mutex_timedlock ( & mutex_b , & timeout ) != 0 ) {
// Couldn't get lock, release what we have
pthread_mutex_unlock ( & mutex_a);
// Retry or handle error
}
3. Try-Lock and Backoff
pthread_mutex_lock ( & mutex_a );
if ( pthread_mutex_trylock ( & mutex_b ) != 0 ) {
// Couldn't get mutex_b, release mutex_a
pthread_mutex_unlock ( & mutex_a);
usleep ( rand () % 1000 ); // Random backoff
// Retry
}
Best Practice : Design your system to minimize the number of locks needed simultaneously. Consider using lock-free data structures or message passing instead of shared memory.
Race Condition
// Dangerous: check-then-act without lock
void dangerous_update ( void ) {
if (counter > 0 ) { // Thread 1 checks
counter -- ; // Another thread may have modified!
}
}
// Safe: check and act atomically
void safe_update ( pthread_mutex_t * mutex ) {
pthread_mutex_lock (mutex);
if (counter > 0 ) {
counter -- ;
}
pthread_mutex_unlock (mutex);
}
Exercises
Dining Philosophers
Implement the dining philosophers problem with proper deadlock prevention.
Reader-Writer Problem
Implement a solution that doesn’t starve either readers or writers.
Barrier
Implement a reusable barrier that allows N threads to synchronize.
Thread-Safe Queue
Implement a lock-free MPSC (multi-producer, single-consumer) queue.
Next Up
Network Programming Build network applications with sockets