Memory Layout & Segments
Understanding how a C program uses memory is fundamental to systems programming. It’s the difference between writing code that “just works” and code that is efficient, secure, and robust.
Why is Memory Segmented?
The Design Rationale
Before examining the memory layout, understand why it’s organized this way:
The problem: A flat memory model would be inefficient and insecure. If code and data were mixed, a bug could overwrite instructions, or a hacker could inject malicious code.
The solution: Separate segments with different properties:
- Text segment (read-only, executable):
  - Prevents code modification (security)
  - Allows sharing between processes (efficiency)
  - Example: 100 Chrome tabs share one copy of Chrome code
- Data/BSS segments (read-write, fixed size):
  - Globals live here (known at compile time)
  - BSS is zero-initialized automatically (efficiency: no need to store zeros in binary)
  - Example: Global configuration, static lookup tables
- Heap (read-write, grows up):
  - Dynamic allocation for variable-sized data
  - Grows toward higher addresses
  - Example: malloc’d data, runtime-sized structures
- Stack (read-write, grows down):
  - Fast allocation (just move stack pointer)
  - Automatic cleanup (pop on return)
  - Grows toward lower addresses
  - Example: Local variables, function call frames
Process Memory Layout
This diagram shows how a C program looks in memory (virtual address space) when it’s running.
Examining Memory Segments
To understand the code below, you need to know what lives where:
- Text Segment: Read-only code instructions. Shared between processes running the same binary.
- Data Segment: Initialized global and static variables (e.g., int x = 10;).
- BSS Segment: Uninitialized global and static variables. Automatically zeroed by the kernel on startup.
- Heap: Dynamic memory (malloc). Grows upward (towards higher addresses).
- Stack: Local variables and function call frames. Grows downward (towards lower addresses).
Viewing with Linux Tools
The Stack
The stack is the most critical segment for program flow. It manages function calls, local variables, and return addresses.
Stack vs Heap: When to Use Which?
Before diving into stack mechanics, understand when to use stack vs heap:
Use Stack when:
- Size is known at compile time
- Size is small (under a few KB per object, and typically under 1MB in total)
- Lifetime matches function scope
- Performance is critical (allocating on the stack only adjusts the stack pointer, which is far cheaper than a call into malloc)
Use Heap when:
- Size is determined at runtime
- Size is large (> few KB)
- Lifetime extends beyond function
- Sharing data between functions
- Building dynamic data structures
Function Call Mechanism
When a function is called, a new “frame” is pushed onto the stack. This frame contains everything the function needs to run.
Stack Frame Structure
A stack frame (or activation record) typically contains:
- Function Arguments: Parameters passed to the function. On x86-64, the first few are usually passed in registers rather than on the stack.
- Return Address: Where to jump back to when the function finishes.
- Saved Base Pointer (RBP): To restore the caller’s stack frame.
- Local Variables: Variables declared inside the function.
Stack in Action
Stack Overflow
The stack has a limited size (8MB by default for the main thread on Linux, adjustable with ulimit -s). If you recurse too deep or allocate huge arrays, you’ll hit the limit and the process is terminated with SIGSEGV.
Variable-Length Arrays (VLAs)
The Heap
How malloc Works (Simplified)
The heap is managed by the C standard library (glibc on most Linux systems). It requests memory from the OS using brk (for small allocations) or mmap (for large ones).
Heap Fragmentation
Fragmentation is like a parking lot where cars of different sizes come and go. After a while, you might have 20 empty spots scattered across the lot, but none of them are adjacent, so a bus (large allocation) cannot park even though there is plenty of total free space. This is external fragmentation, and it is one of the biggest problems with long-running C programs like servers and databases.
Memory Alignment
CPUs access memory most efficiently when data is aligned to its size (e.g., 4-byte integers on 4-byte boundaries).
Why Alignment Matters
Unaligned access can be slow or even crash the program on some architectures (like ARM).
Static and Thread-Local Storage
Memory-Mapped Files
Memory mapping (mmap) allows you to treat a file on disk as if it were in memory. This is how the OS loads executables and libraries.
Exercises
Memory Map Explorer
Write a program that prints the address of variables in each segment (text, data, bss, heap, stack) and verifies they’re in the expected order.
Stack Size Probe
Write a recursive function that measures roughly how much stack space is available before a stack overflow.
Alignment Checker
Write a function bool is_aligned(void *ptr, size_t alignment) that checks if a pointer is properly aligned.
Next Up
Dynamic Memory Management
Master malloc, custom allocators, and memory patterns
Interview Deep-Dive
A process on a 64-bit Linux system has 4GB of heap allocated but RSS (Resident Set Size) shows only 200MB. How is this possible?
Strong Answer:
- Virtual memory and physical memory are not the same thing. When malloc requests memory via mmap, the kernel records the new virtual address range but does not allocate physical pages. Physical pages are only assigned when a page is first touched (demand paging). The 4GB is virtual address space commitment; the 200MB is the actual physical pages that have been touched.
- Additionally, the kernel may have swapped some pages to disk. Pages that were once resident but have not been accessed recently get evicted to swap, reducing RSS without reducing virtual size.
- calloc is a special case: large requests are served from fresh anonymous mappings, which the kernel guarantees to be zero-filled, so the allocator does not need to write the zeros itself. Pages that are only read can be backed by a single shared zero page; only when you write to a calloc’d page does the kernel allocate a real physical page (copy-on-write). A 1GB calloc might consume almost zero physical memory until you write to it.
- This is also why overcommit exists on Linux: the kernel allows processes to allocate more virtual memory than physically available, betting that most of it will never be touched. This is configurable via /proc/sys/vm/overcommit_memory.
- ASLR randomizes the base addresses of the stack, heap, shared libraries, and the executable’s text segment on each program execution. Without ASLR, an attacker who discovers a buffer overflow can hardcode the address of their shellcode or a useful gadget (like system()). With ASLR, those addresses change every run, turning a deterministic exploit into a probabilistic one. On 64-bit systems, the randomization entropy is large enough (28-40 bits depending on the region) that brute-force guessing is impractical. Randomizing the executable’s own code and data requires the binary to be built as a Position Independent Executable (PIE) with -fPIE -pie; the stack, heap, and shared libraries are randomized either way.
Why does the stack grow downward and the heap grow upward? What happens when they collide?
Strong Answer:
- This is a deliberate design choice to maximize available space. With the stack starting at a high address and growing downward, and the heap starting at a low address and growing upward, the two regions grow toward each other, using the entire gap between them as available space. If they grew in the same direction, one would hit the other’s starting address much sooner.
- In a modern 64-bit system with virtual memory, the two regions will never literally collide because the virtual address space is enormous (48-bit or 57-bit). The stack has a fixed limit (typically 8MB, configurable via ulimit -s), and exceeding it triggers a SIGSEGV (stack overflow). The heap can grow until the system runs out of virtual address space or the kernel denies the mmap/brk request.
- On 32-bit embedded systems without an MMU, the collision is a real risk. The stack and heap share a flat memory space, and overflow in either direction silently corrupts the other, causing mysterious crashes far from the actual bug. This is why embedded systems often forbid heap allocation entirely and size the stack conservatively at compile time.
- The kernel places a guard page (a page with no permissions) at the bottom of the stack. If the stack grows into it, the hardware triggers a page fault that becomes SIGSEGV. You can still run a SIGSEGV handler by registering an alternate stack with sigaltstack and installing the handler with the SA_ONSTACK flag (the handler needs its own stack, since the main stack is blown). GCC’s -fstack-protector-strong places canary values between local variables and the return address; if a buffer overflow overwrites the canary, the runtime detects it and aborts. For deeper monitoring, use getrlimit(RLIMIT_STACK) to check limits and log warnings when recursion depth approaches critical levels.
Explain how mmap works and when you would use it instead of malloc for large allocations.
Strong Answer:
- mmap asks the kernel to map a region of virtual address space. For file-backed mappings, the file’s contents are lazily loaded into memory on first access (page faults trigger disk reads). For anonymous mappings (MAP_ANONYMOUS), the kernel provides zero-filled pages on demand. In both cases, the actual memory consumption is proportional to the pages you touch, not the size you request.
- malloc already uses mmap internally for large allocations (typically above 128KB in glibc). The advantage of calling mmap directly is control: you get page-aligned memory, can use MAP_HUGETLB for huge pages (reducing TLB misses for very large datasets), use MAP_POPULATE to pre-fault all pages (avoiding latency spikes during access), and use madvise to tell the kernel about your access pattern (MADV_SEQUENTIAL for sequential scans, MADV_RANDOM for random access).
- The key difference from malloc: munmap immediately returns memory to the OS. With malloc/free, glibc may hold onto freed memory in its free list for future allocations, so RSS may not decrease even after freeing. For a process that does a large computation and then wants to release that memory, mmap/munmap gives you that guarantee.
- Map the entire file with mmap(NULL, file_size, PROT_READ, MAP_PRIVATE, fd, 0). The OS pages data in and out transparently using the page cache. Use madvise(addr, len, MADV_RANDOM) for B-tree traversals (random access pattern) or MADV_SEQUENTIAL for full table scans. Pitfalls: (1) On 32-bit systems, you cannot map files larger than ~3GB. (2) Page faults on cold pages stall the thread for milliseconds (disk I/O), making latency unpredictable; this is why some databases like InnoDB manage their own buffer pool instead of relying on mmap. (3) The kernel’s page eviction policy may not match your workload: it might evict hot index pages to make room for a sequential scan. (4) Error handling is awkward: a disk read error during a page fault delivers SIGBUS, not an errno. You need a SIGBUS handler to recover gracefully.