Linux Kernel Architecture
The Famous Torvalds-Tanenbaum Debate
Kernel Source Tree Organization
Key Files to Know
Boot Process Deep Dive
start_kernel() - The Heart of Boot
Key Boot Parameters
Kernel Address Space Layout
KASLR (Kernel Address Space Layout Randomization)
Loadable Kernel Modules
Module Structure
Module Loading Process
Module Management Commands
Module Parameters
Kernel Threads
Important Kernel Threads
Lab Exercises
Interview Questions
Key Takeaways
Further Reading

Linux Kernel Architecture

Understanding the architecture of the Linux kernel is the foundation for everything else in this course. This module covers how the kernel is organized, why certain design decisions were made, and how to navigate the massive codebase.

Interview Frequency: Very High
Key Topics: Monolithic design, kernel source navigation, boot process, modules
Time to Master: 10-12 hours

Interview Insight: Linux uses a “modular monolithic” approach: a monolithic core extended by loadable modules. This keeps the performance of a monolithic kernel while gaining some of the flexibility of a microkernel.

The Famous Torvalds-Tanenbaum Debate

In 1992, Linus Torvalds and Andrew Tanenbaum debated kernel design:

Tanenbaum: Microkernels are the future; monolithic is obsolete
Torvalds: Performance matters; Linux’s approach is pragmatic

Outcome: Linux became the dominant OS kernel, though microkernels power some embedded systems (QNX in cars, L4 in phones).

Kernel Source Tree Organization

The Linux kernel source is massive (30+ million lines), but well-organized:

linux/
├── arch/           # Architecture-specific code (x86, arm64, riscv)
│   ├── x86/
│   │   ├── boot/       # Boot code
│   │   ├── kernel/     # x86-specific kernel code
│   │   ├── mm/         # x86 memory management
│   │   └── entry/      # Syscall entry points
│   └── arm64/
│
├── block/          # Block layer, I/O scheduling
├── certs/          # Signing certificates
├── crypto/         # Cryptographic API and algorithms
├── Documentation/  # Kernel documentation
│
├── drivers/        # Device drivers (largest directory)
│   ├── net/            # Network drivers
│   ├── block/          # Block device drivers
│   ├── gpu/            # GPU drivers (including drm)
│   ├── nvme/           # NVMe drivers
│   └── ...
│
├── fs/             # Filesystems
│   ├── ext4/           # ext4 filesystem
│   ├── xfs/            # XFS filesystem
│   ├── btrfs/          # Btrfs filesystem
│   ├── proc/           # procfs
│   └── ...
│
├── include/        # Header files
│   ├── linux/          # Kernel-internal headers
│   ├── uapi/           # User-space API headers
│   └── asm-generic/    # Generic assembly headers
│
├── init/           # Kernel initialization
│   └── main.c          # start_kernel() lives here
│
├── ipc/            # Inter-process communication
├── kernel/         # Core kernel code
│   ├── sched/          # Scheduler
│   ├── locking/        # Locks, mutexes
│   ├── trace/          # Tracing infrastructure
│   └── bpf/            # BPF subsystem
│
├── lib/            # Kernel libraries
├── mm/             # Memory management
│   ├── slub.c          # Slab allocator (SLUB; mm/slab.c was removed in v6.8)
│   ├── page_alloc.c    # Page allocator
│   ├── mmap.c          # Memory mapping
│   └── ...
│
├── net/            # Networking
│   ├── core/           # Core networking
│   ├── ipv4/           # IPv4 stack
│   ├── ipv6/           # IPv6 stack
│   ├── netfilter/      # Packet filtering
│   └── ...
│
├── scripts/        # Build and helper scripts
├── security/       # Security modules (SELinux, AppArmor)
├── sound/          # Sound subsystem
├── tools/          # Userspace tools (perf, bpf)
│   ├── perf/           # perf tool
│   ├── bpf/            # BPF tools
│   └── ...
│
├── Kconfig         # Build configuration
├── Makefile        # Main makefile
└── MAINTAINERS     # Who maintains what

Key Files to Know

File	Purpose
`init/main.c`	Kernel entry point (`start_kernel()`)
`arch/x86/entry/entry_64.S`	System call entry point
`kernel/sched/core.c`	Scheduler core
`mm/page_alloc.c`	Page allocator (buddy system)
`mm/slub.c`	Slab allocator (SLUB; `mm/slab.c` existed only before v6.8)
`fs/read_write.c`	read/write system calls
`net/core/dev.c`	Core networking

Navigation Tip: Use tools like cscope, ctags, or online browsers like elixir.bootlin.com to navigate the source.

Boot Process Deep Dive

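Understanding how Linux boots is essential for systems engineers. The firmware (BIOS or UEFI) hands control to a bootloader such as GRUB, which loads the kernel image and initramfs; architecture-specific setup code then calls start_kernel(), which ends by launching the first user process (init, PID 1).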

start_kernel() - The Heart of Boot

// init/main.c - heavily simplified; the real function makes many more calls
asmlinkage __visible void __init start_kernel(void)
{
    char *command_line;

    // Very early setup
    set_task_stack_end_magic(&init_task);

    // Memory management initialization
    setup_arch(&command_line);      // Arch-specific setup
    mm_init();                      // Memory subsystem (renamed mm_core_init() in v6.4)
    
    // Core subsystems
    sched_init();                   // Scheduler
    rcu_init();                     // RCU
    
    // Interrupts and timers
    init_IRQ();
    tick_init();
    
    // Various subsystems
    vfs_caches_init();              // VFS
    signals_init();                 // Signals
    
    // Start init process
    rest_init();                    // Spawns kernel_init (becomes PID 1) and kthreadd (PID 2)
}

Key Boot Parameters

Parameter	Purpose	Example
`root=`	Root filesystem device	`root=/dev/sda1`
`init=`	First user process	`init=/bin/bash`
`quiet`	Suppress boot messages	`quiet`
`debug`	Enable debug messages	`debug`
`nokaslr`	Disable KASLR	`nokaslr`
`isolcpus=`	Isolate CPUs from scheduler	`isolcpus=2,3`
`nosmp`	Disable SMP	`nosmp`

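Debug Tip: Boot with init=/bin/bash to have the kernel start a shell as PID 1 in place of the normal init. This is useful for recovery when init itself is broken; the root filesystem is usually still mounted read-only at that point.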
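
The kernel receives all of these parameters as one space-separated string, which is visible afterwards in /proc/cmdline. As a rough sketch of how a script might pick individual parameters out of that string, the helpers below take the command line as an argument so they can be tried on a sample. The names `cmdline_value` and `cmdline_has_flag` are invented for this example, the sample string is made up, and quoted values containing spaces are not handled.

```shell
# Print the value of the first KEY=value parameter named $1 in the string $2.
# Exit status is 1 if the parameter is absent. Hypothetical helper, not a standard tool.
cmdline_value() {
    printf '%s\n' "$2" | awk -v key="$1" '
        { for (i = 1; i <= NF; i++) {
              n = index($i, "=")
              if (n && substr($i, 1, n - 1) == key) {
                  print substr($i, n + 1); found = 1; exit
              }
          } }
        END { exit !found }'
}

# Succeed if a bare flag such as "quiet" or "nokaslr" appears in the string $2.
cmdline_has_flag() {
    printf '%s\n' "$2" | awk -v flag="$1" '
        { for (i = 1; i <= NF; i++) if ($i == flag) found = 1 }
        END { exit !found }'
}

# Made-up sample command line for illustration
sample='BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet isolcpus=2,3'
cmdline_value root "$sample"        # prints /dev/sda1
cmdline_value isolcpus "$sample"    # prints 2,3
cmdline_has_flag quiet "$sample" && echo "quiet is set"
```

On a running system the same helpers can be pointed at the real command line, for example `cmdline_value root "$(cat /proc/cmdline)"`.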

Kernel Address Space Layout

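On x86-64 with 4-level paging (48-bit virtual addresses), the virtual address space is split between user and kernel. The kernel-region base addresses shown are the defaults without KASLR; 5-level paging uses a different layout: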

┌─────────────────────────────────────────────────────────────────────────────┐
│                    x86-64 VIRTUAL ADDRESS SPACE (48-bit)                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│  0xFFFFFFFFFFFFFFFF ┌────────────────────────────────────────────────────┐  │
│                     │                                                     │  │
│                     │              KERNEL SPACE (128 TB)                  │  │
│                     │                                                     │  │
│                     │  0xFFFFFFFF80000000 - Kernel text (code)           │  │
│                     │  0xFFFFEA0000000000 - Virtual memory map           │  │
│                     │  0xFFFFC90000000000 - vmalloc area                 │  │
│                     │  0xFFFF888000000000 - Direct physical mapping      │  │
│                     │                                                     │  │
│  0xFFFF800000000000 ├────────────────────────────────────────────────────┤  │
│                     │              NON-CANONICAL HOLE                     │  │
│                     │         (addresses that cause fault)                │  │
│  0x0000800000000000 ├────────────────────────────────────────────────────┤  │
│                     │                                                     │  │
│                     │              USER SPACE (128 TB)                    │  │
│                     │                                                     │  │
│                     │  Stack (grows down from near top)                  │  │
│                     │  mmap region (shared libs, anonymous maps)         │  │
│                     │  Heap (grows up from end of data)                  │  │
│                     │  BSS, Data, Text (program sections)                │  │
│                     │                                                     │  │
│  0x0000000000000000 └────────────────────────────────────────────────────┘  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
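
The split in the diagram follows from the canonical-address rule: with 48-bit addressing, bits 63 through 47 must all be equal. A minimal sketch of that rule, assuming 4-level paging, is shown below. `classify_addr` is a name invented for this example; it works on the hex string itself, so it does not depend on the shell's integer width.

```shell
# Classify a 64-bit virtual address (hex, with or without a 0x prefix).
# Hypothetical helper, assuming 4-level paging (48-bit virtual addresses).
classify_addr() {
    # Normalize to exactly 16 lowercase hex digits so patterns match by position.
    hex=$(printf '%s' "${1#0x}" | tr 'A-F' 'a-f')
    hex=$(printf '%16s' "$hex" | tr ' ' '0')
    case $hex in
        *[!0-9a-f]*) echo invalid; return 1 ;;
    esac
    case $hex in
        0000[0-7]???????????)   echo user ;;          # bits 63..47 all zero
        ffff[89a-f]???????????) echo kernel ;;        # bits 63..47 all one
        *)                      echo non-canonical ;; # falls in the hole
    esac
}

classify_addr 0x00007ffff7dd1000   # prints user
classify_addr 0xffffffff81000000   # prints kernel
classify_addr 0x0000800000000000   # prints non-canonical
```

The first four hex digits cover bits 63 to 48 and the top bit of the fifth digit is bit 47, which is why the patterns only need to inspect five digits.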

KASLR (Kernel Address Space Layout Randomization)

KASLR randomizes kernel addresses at boot for security:

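# Check whether KASLR was disabled on the kernel command line
# (without nokaslr, KASLR is only active if the kernel was built with CONFIG_RANDOMIZE_BASE=y)
grep -qw nokaslr /proc/cmdline && echo "KASLR disabled" || echo "KASLR enabled"

# See kernel text base (differs on each boot with KASLR; without root the addresses read as zeros)
sudo grep -m1 ' _text$' /proc/kallsyms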
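
Because every kernel symbol is shifted by the same amount, the random slide can be estimated from the `_text` address alone. The sketch below is illustrative only: `kaslr_offset` is an invented name, and it assumes the non-randomized x86-64 text base is 0xffffffff81000000 (the default 16 MiB CONFIG_PHYSICAL_START), which other configurations and architectures do not share.

```shell
# Estimate the KASLR slide from the address of _text (hex, optional 0x prefix).
# ASSUMPTION: default x86-64 text base of 0xffffffff81000000.
kaslr_offset() {
    addr=$(printf '%s' "${1#0x}" | tr 'A-F' 'a-f')
    case $addr in
        0000000000000000)
            echo "address hidden: read /proc/kallsyms as root" >&2; return 1 ;;
        ffffffff????????) ;;    # kernel text lives in the top 2 GiB
        *)
            echo "unexpected _text address: $addr" >&2; return 1 ;;
    esac
    low=${addr#ffffffff}        # the low 32 bits are enough for the subtraction
    printf '0x%x\n' $(( 0x$low - 0x81000000 ))
}

kaslr_offset ffffffff81000000    # prints 0x0 (not randomized)
kaslr_offset ffffffff9e000000    # prints 0x1d000000
```

On a live system the input could come from `sudo awk '$3 == "_text" { print $1; exit }' /proc/kallsyms`.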

Loadable Kernel Modules

Modules allow extending the kernel without recompiling:

Module Structure

// hello_module.c - Simple kernel module
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Your Name");
MODULE_DESCRIPTION("A simple Hello World module");
MODULE_VERSION("1.0");

static int __init hello_init(void)
{
    printk(KERN_INFO "Hello, Kernel!\n");
    return 0;  // 0 = success
}

static void __exit hello_exit(void)
{
    printk(KERN_INFO "Goodbye, Kernel!\n");
}

module_init(hello_init);
module_exit(hello_exit);

Module Loading Process

modprobe resolves dependencies from modules.dep and loads each .ko file through the finit_module() system call; insmod loads a single file the same way but without dependency resolution. The kernel validates the ELF image, resolves its undefined symbols against the exported kernel symbols, applies relocations, and finally calls the function registered with module_init().

Module Management Commands

# List loaded modules
lsmod

# Module info
modinfo ext4

# Load module
sudo modprobe ext4

# Load with parameters
sudo modprobe loop max_loop=64

# Remove module
sudo rmmod loop

# Show module dependencies
modprobe --show-depends ext4

# Show module parameters
systool -vm ext4
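
lsmod is a formatted view of /proc/modules, where each line holds the module name, size, reference count, the modules that use it, its state, and its load address. As a small sketch of reading that table directly, the helper below lists the users of a module, which is the reason rmmod refuses to unload it. `module_users` is a name invented for this example, and the sample table is made-up data in the /proc/modules format.

```shell
# Print the modules that currently use module $1, one per line.
# $2 is the table to read; it defaults to the live /proc/modules.
# Exit status is 1 if the module is not loaded. Hypothetical helper.
module_users() {
    awk -v name="$1" '
        $1 == name {
            found = 1
            if ($4 != "-") {
                n = split($4, users, ",")   # the field ends with a trailing comma
                for (i = 1; i <= n; i++)
                    if (users[i] != "" && users[i] !~ /^\[/)  # skip markers like [permanent]
                        print users[i]
            }
        }
        END { exit !found }' "${2:-/proc/modules}"
}

# Made-up sample table for illustration
sample=$(mktemp)
cat > "$sample" <<'EOF'
ext4 1114112 2 - Live 0xffffffffc0a00000
jbd2 196608 1 ext4, Live 0xffffffffc09c0000
crc16 16384 2 bluetooth,ext4, Live 0xffffffffc0990000
EOF
module_users crc16 "$sample"    # prints bluetooth, then ext4
rm -f "$sample"
```

Called with only a module name, for example `module_users jbd2`, it reads the live table.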

Module Parameters

// Module with parameters
static int buffer_size = 1024;
module_param(buffer_size, int, 0644);  // Read-write in sysfs
MODULE_PARM_DESC(buffer_size, "Size of internal buffer");

static char *device_name = "mydev";
module_param(device_name, charp, 0444);  // Read-only
MODULE_PARM_DESC(device_name, "Device name");
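
The permission argument of module_param() decides whether the parameter appears as a file under /sys/module/<module>/parameters/ (a value of 0 hides it). A minimal sketch of reading those files follows. `module_params` and the module name `mymod` are hypothetical, and the demo builds a mock directory tree mirroring the two parameters above rather than touching the real sysfs.

```shell
# List the sysfs-visible parameters of module $1 as name=value lines.
# $2 overrides the sysfs module directory (default /sys/module), mainly for testing.
module_params() {
    dir=${2:-/sys/module}/$1/parameters
    [ -d "$dir" ] || { echo "no visible parameters for $1" >&2; return 1; }
    for f in "$dir"/*; do
        [ -r "$f" ] || continue     # skips write-only files and an empty directory
        printf '%s=%s\n' "${f##*/}" "$(cat "$f")"
    done
}

# Mock tree standing in for /sys/module (mymod is a made-up module name)
fake_sys=$(mktemp -d)
mkdir -p "$fake_sys/mymod/parameters"
echo 1024  > "$fake_sys/mymod/parameters/buffer_size"
echo mydev > "$fake_sys/mymod/parameters/device_name"
module_params mymod "$fake_sys"     # prints buffer_size=1024, then device_name=mydev
rm -rf "$fake_sys"
```

For a loaded module, a parameter declared with mode 0644 can also be changed at runtime by writing to its file as root, while a 0444 parameter such as device_name can only be read.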

Kernel Threads

Kernel threads (kthreads) are processes that run entirely in kernel mode:

// Creating a kernel thread
#include <linux/kthread.h>

static struct task_struct *my_thread;

static int thread_function(void *data)
{
    while (!kthread_should_stop()) {
        // Do work
        schedule_timeout_interruptible(HZ);  // Sleep 1 second
    }
    return 0;
}

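// In module init:
my_thread = kthread_run(thread_function, NULL, "my_kthread");
if (IS_ERR(my_thread))
    return PTR_ERR(my_thread);  // kthread_run() returns ERR_PTR() on failure, never NULL

// In module exit:
kthread_stop(my_thread);  // Wakes the thread and waits for thread_function() to return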

Important Kernel Threads

# View kernel threads: kthreadd (PID 2) and its children
# (ps shows their names in brackets)
ps -ef | awk 'NR == 1 || $2 == 2 || $3 == 2'

# Common kernel threads:
# [kthreadd]     - Parent of all kernel threads
# [ksoftirqd/N]  - Soft IRQ handling for CPU N
# [kworker/N:M]  - Workqueue workers
# [kswapd0]      - Memory reclaim
# [jbd2/sda1-8]  - Journal block device (ext4 journaling)
# [kcompactd0]   - Memory compaction
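
Matching on brackets or on the parent PID works in practice, but the kernel also marks every kernel thread with the PF_KTHREAD flag (0x00200000 in include/linux/sched.h), which is exposed as the flags field of /proc/<pid>/stat. The sketch below checks that bit. `is_kthread_stat` is a name invented for this example, and the sample stat lines are shortened, made-up data.

```shell
PF_KTHREAD=$((0x00200000))   # task flag set on kernel threads

# Succeed if the /proc/<pid>/stat line passed in $1 belongs to a kernel thread.
# Hypothetical helper; expects at least the first nine fields of the stat line.
is_kthread_stat() {
    # comm may contain spaces and parentheses, so cut after the LAST ") ".
    rest=$(printf '%s\n' "$1" | sed 's/^.*) //')
    set -- $rest                 # now $1=state $2=ppid ... $7=flags
    [ $# -ge 7 ] || return 2     # malformed input
    [ $(( $7 & $PF_KTHREAD )) -ne 0 ]
}

# Shortened sample lines for illustration
is_kthread_stat '2 (kthreadd) S 0 0 0 0 -1 2129984 0 0' && echo "kthreadd: kernel thread"
is_kthread_stat '812 (tmux: server) S 1 812 812 0 -1 4194368 0 0' || echo "tmux: user process"
```

To scan a live system, the function can be applied to each file in turn, for example `is_kthread_stat "$(cat /proc/2/stat)"`.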

Lab Exercises

Lab 1: Navigate Kernel Source

Objective: Get comfortable with kernel source tree

# Clone kernel source
git clone --depth=1 https://github.com/torvalds/linux.git
cd linux

# Find start_kernel (recent kernels put its attributes and its name on separate lines)
grep -n "start_kernel(void)" init/main.c

# Find syscall table (x86-64)
find arch/x86 -name "*syscall*"

# Count lines in different subsystems
wc -l mm/*.c       # Memory management
wc -l kernel/*.c   # Core kernel
wc -l fs/*.c       # Filesystems

Lab 2: Build and Load Module

Objective: Write, compile, and load a kernel module

// Save as hello.c
#include <linux/init.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");

static int __init hello_init(void)
{
    pr_info("Hello from kernel module!\n");
    return 0;
}

static void __exit hello_exit(void)
{
    pr_info("Goodbye from kernel module!\n");
}

module_init(hello_init);
module_exit(hello_exit);

# Makefile
obj-m := hello.o

KDIR := /lib/modules/$(shell uname -r)/build

.PHONY: all clean

all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules

clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean

# Build and test (reading the kernel log requires root on many distributions)
make
sudo insmod hello.ko
sudo dmesg | tail
sudo rmmod hello
sudo dmesg | tail

Lab 3: Analyze Boot Process

Objective: Understand boot timing and initialization

# View boot messages
dmesg | head -100

# Boot timing analysis
systemd-analyze
systemd-analyze blame
systemd-analyze critical-chain

# Kernel command line used
cat /proc/cmdline

# Initramfs contents (Debian/Ubuntu; on Fedora/RHEL use: lsinitrd /boot/initramfs-$(uname -r).img)
lsinitramfs /boot/initrd.img-$(uname -r) | head -50

Interview Questions

Q1: Why is Linux monolithic, and what are the trade-offs?

Answer: Linux chose a monolithic design for performance:

Direct function calls between subsystems (no IPC overhead)
Single address space eliminates context switch on internal calls
Simpler data sharing between components

Trade-offs:

A bug in any component can crash the entire kernel
Larger attack surface (all code runs privileged)
More complex codebase to maintain

Mitigations in Linux:

Loadable modules for flexibility
Namespaces and cgroups for isolation
Seccomp for syscall filtering
Strong code review process

Q2: Walk through what happens when you type 'ls' and press Enter

Answer (kernel perspective):

Shell process:
- fork() → creates child process (clone syscall)
- execve("/bin/ls") → replaces process image
execve processing:
- Kernel opens ELF binary
- Maps code/data sections into memory
- Sets up stack with arguments/environment
- Loads dynamic linker (ld.so)
ls execution:
- Dynamic linker loads libc
- ls calls opendir() → getdents64 syscall
- Kernel reads directory entries from filesystem
- ls calls write() → output to terminal
Termination:
- ls calls exit() → exit_group syscall
- Kernel cleans up resources
- Parent shell’s wait() returns
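
The shell's side of this sequence (fork, exec in the child, wait in the parent) can be sketched in a few lines of shell. `run_child` is a name invented for this example, and the syscall names in the comments describe what a typical Linux shell does rather than anything this snippet measures.

```shell
# Run a command the way an interactive shell does: fork a child, let the child
# exec the program, then block until it exits and return its status.
run_child() {
    "$@" &           # fork; the child execve()s the external program
    child=$!
    wait "$child"    # parent sleeps in wait4() until the child calls exit_group()
}

run_child ls /
echo "ls exited with status $?"
```

If strace is installed, running `strace -f -e trace=execve,getdents64,exit_group ls` shows the kernel-side calls from the walkthrough as they happen.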

Q3: How do kernel modules differ from user-space shared libraries?

Answer:

Aspect	Kernel Module	Shared Library
Privilege	Runs in kernel mode	Runs in user mode
Address space	Kernel address space	Process address space
Fault impact	Can crash system	Crashes only that process
Symbol resolution	Kernel symbol table	User-space linker
Memory	Uses kmalloc, vmalloc	Uses malloc, mmap
Loading	insmod, modprobe	ld.so, dlopen

Key insight: Modules are essentially kernel code with a defined entry/exit point, while shared libraries are user-space code loaded by the dynamic linker.

Q4: Explain KASLR and its security benefits

Answer: KASLR (Kernel Address Space Layout Randomization):

Randomizes kernel base address at each boot
Makes it harder to exploit memory corruption vulnerabilities
Attacker can’t hardcode kernel addresses

Implementation:

Random offset chosen during early boot
All kernel symbols shifted by this offset
/proc/kallsyms shows randomized addresses

Limitations:

Information leaks can reveal base address
Side-channel attacks (Meltdown- and Spectre-class) can reveal the kernel layout and bypass KASLR
Doesn’t protect against local attackers with kernel memory access

Related protections:

SMEP (Supervisor Mode Execution Prevention)
SMAP (Supervisor Mode Access Prevention)
Stack canaries

Key Takeaways

Architecture Choice

Linux’s monolithic design prioritizes performance while modules add flexibility

Source Organization

Understanding source tree layout is essential for kernel development and debugging

Boot Process

From firmware to init, each stage has specific responsibilities and debugging points

Module System

Modules extend kernel functionality at runtime without recompilation
