OS security protects system resources from unauthorized access and malicious attacks. Understanding security principles is essential for building robust systems.
Interview Frequency: Medium-High
Key Topics: Access control, capabilities, sandboxing, containers
Time to Master: 12-15 hours
// User-to-kernel transition triggers:

// 1. System call (intentional)
long result = syscall(SYS_read, fd, buffer, size);

// 2. Exception (page fault, divide by zero)
volatile int zero = 0;
int crash = 1 / zero;  // Integer divide by zero traps into the kernel (SIGFPE on x86)

// 3. Hardware interrupt (timer, I/O)
//    The CPU vectors to the kernel's handler; the process takes no action

// Kernel-to-user transition:
//   - Return from system call/interrupt
//   - sigreturn
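As a minimal sketch of trigger 1, the helpers below issue system calls through the generic `syscall(2)` wrapper instead of the usual libc functions. The names `raw_getpid` and `raw_write` are hypothetical, and the sketch assumes Linux with glibc-style `SYS_*` constants.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our PID through the generic syscall(2) entry point. */
long raw_getpid(void) {
    return syscall(SYS_getpid);
}

/* Write len bytes to fd with the raw write system call.
 * Returns the number of bytes written, or -1 with errno set. */
long raw_write(int fd, const void *buf, size_t len) {
    assert(buf != NULL);
    return syscall(SYS_write, fd, buf, len);
}
```

Each call enters kernel mode, runs the kernel's handler, and returns to user mode with the result. `raw_getpid()` should return the same value as `getpid()`.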
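// Each thread has several capability sets; the three classic ones are:
//   Permitted (P):   Maximum capabilities the process can use
//   Effective (E):   Currently active capabilities, checked by the kernel
//   Inheritable (I): Capabilities that can be preserved across execve()

// Drop capabilities after setup (link with -lcap)
#include <sys/capability.h>

int drop_caps(void) {
    cap_t caps = cap_get_proc();
    if (caps == NULL)
        return -1;

    // Keep only what we need
    cap_value_t keep[] = {CAP_NET_BIND_SERVICE};
    int rc = -1;

    if (cap_clear(caps) == 0 &&
        cap_set_flag(caps, CAP_PERMITTED, 1, keep, CAP_SET) == 0 &&
        cap_set_flag(caps, CAP_EFFECTIVE, 1, keep, CAP_SET) == 0)
        rc = cap_set_proc(caps);  // Apply the reduced sets to this process

    cap_free(caps);
    return rc;  // 0 on success, -1 on failure
}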
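One way to check that a drop took effect is to read the effective mask back from `/proc/self/status`. The sketch below does that without libcap. The helper names and the locally defined constant `DEMO_CAP_NET_BIND_SERVICE` are assumptions for illustration; the value 10 is the Linux capability number for `CAP_NET_BIND_SERVICE`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Capability number as listed in capabilities(7); defined locally
 * (hypothetical name) so the sketch does not depend on libcap headers. */
enum { DEMO_CAP_NET_BIND_SERVICE = 10 };

/* True if bit `cap` is set in a capability bitmask. */
bool cap_mask_has(uint64_t mask, int cap) {
    assert(cap >= 0 && cap < 64);
    return ((mask >> cap) & 1u) != 0;
}

/* Read the calling process's effective capability mask from the
 * "CapEff:" line of /proc/self/status. Returns true on success. */
bool read_effective_caps(uint64_t *mask_out) {
    assert(mask_out != NULL);
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return false;

    char line[256];
    bool found = false;
    while (fgets(line, sizeof line, f) != NULL) {
        if (sscanf(line, "CapEff: %" SCNx64, mask_out) == 1) {
            found = true;
            break;
        }
    }
    fclose(f);
    return found;
}
```

After a successful drop that keeps only `CAP_NET_BIND_SERVICE`, the mask read back should have exactly that one bit set.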
#define _GNU_SOURCE   // Required for clone() and the CLONE_* flags
#include <sched.h>
#include <signal.h>   // SIGCHLD

// Create new namespaces
int clone_flags = CLONE_NEWPID  |  // New PID namespace
                  CLONE_NEWNET  |  // New network namespace
                  CLONE_NEWNS   |  // New mount namespace
                  CLONE_NEWUTS  |  // New UTS namespace (hostname)
                  CLONE_NEWIPC  |  // New IPC namespace
                  CLONE_NEWUSER;   // New user namespace

// SIGCHLD lets the parent wait for the child with waitpid()
int pid = clone(child_func, stack_top, clone_flags | SIGCHLD, NULL);

// Or using unshare:
unshare(CLONE_NEWNS);  // New mount namespace
Stack canaries are ONE layer. Always combine:

✓ Stack canaries → Detect buffer overflow
✓ ASLR → Randomize addresses
✓ NX/DEP → Prevent code execution
✓ RELRO → Protect GOT/PLT
✓ PIE → Position independent executable
✓ Seccomp → Limit syscalls
✓ Safe coding → No gets(), strcpy(), sprintf()

Modern systems use ALL of these together!
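For the "safe coding" layer, one common pattern is to replace `strcpy()` and `sprintf()` with `snprintf()` and check for truncation. The sketch below is one possible helper; the name `copy_bounded` is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Copy src into dst, which holds dst_size bytes. The result is always
 * NUL-terminated. Returns true if the whole string fit, false if it
 * had to be truncated. */
bool copy_bounded(char *dst, size_t dst_size, const char *src) {
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf writes at most dst_size - 1 characters plus the NUL and
     * returns the length the full output would have needed. */
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}
```

A caller can treat a `false` return as an error instead of silently working with truncated data.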
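# Check if KASLR is enabled
$ grep -o nokaslr /proc/cmdline
# Empty output = KASLR was not disabled on the kernel command line

# Kernel symbols are randomized each boot
# (run as root; unprivileged reads typically show zeroed addresses)
$ sudo cat /proc/kallsyms
ffffffffc0123456 t some_kernel_function
# The address is different on the next boot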
# /etc/sudoers (use visudo to edit!)

# User alice can run any command as any user, including root
alice ALL=(ALL) ALL

# User bob can only restart nginx
bob ALL=(root) /usr/bin/systemctl restart nginx

# Group wheel can sudo without password (dangerous!)
%wheel ALL=(ALL) NOPASSWD: ALL

# Logging
Defaults logfile=/var/log/sudo.log
Defaults log_input, log_output
# Run as non-root
docker run --user 1000:1000 myapp

# Read-only filesystem
docker run --read-only myapp

# Drop capabilities
docker run --cap-drop ALL --cap-add NET_BIND_SERVICE myapp

# Use seccomp profile
docker run --security-opt seccomp=profile.json myapp
Q3: Explain a buffer overflow attack and protections
Answer:

Attack:
void vulnerable() {
    char buffer[64];
    gets(buffer);  // No bounds checking!
    // Attacker sends 100 bytes, overwrites return address
}

// Stack before overflow:
[buffer (64 bytes)] [saved_rbp] [return_addr]

// After overflow:
[shellcode..............................] [jmp_to_buf]
                                           ↑
                                           Now points to shellcode
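The first protection is to remove the unbounded read. A minimal sketch of a replacement for the `gets()` call, using `fgets()` with an explicit size, might look like this. The helper name `read_line_bounded` is hypothetical, and rejecting over-long lines is one design choice among several.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf, storing at most size - 1 characters.
 * The trailing newline is stripped. If the line does not fit, the rest of
 * it is discarded and false is returned, so an attacker cannot write past
 * the end of buf. Also returns false on EOF or read error. */
bool read_line_bounded(FILE *in, char *buf, size_t size) {
    assert(in != NULL && buf != NULL && size > 1);

    if (fgets(buf, (int)size, in) == NULL) {
        buf[0] = '\0';
        return false;               /* EOF or read error */
    }

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';        /* complete line, fits in buf */
        return true;
    }

    /* No newline stored: drain the rest of the line and note whether
     * anything was actually cut off. */
    bool truncated = false;
    int c;
    while ((c = fgetc(in)) != '\n' && c != EOF)
        truncated = true;
    return !truncated;
}
```

With `char buffer[64]`, a 100-byte input line leaves 63 characters in the buffer and returns `false`; the saved return address is never touched.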
httpd_t (Apache process type)
  │
  ├── CAN access httpd_sys_content_t (web files)
  ├── CAN access httpd_log_t (logs)
  ├── CANNOT access user_home_t (home directories)
  └── CANNOT access shadow_t (password hashes in /etc/shadow)
Policy Rules:
# Allow httpd to read content
allow httpd_t httpd_sys_content_t:file { read getattr };

# Allow httpd to write logs
allow httpd_t httpd_log_t:file { write append };

# Deny by default - anything not allowed is blocked
Workflow when Apache accesses a file:
1. Apache (httpd_t) tries to read /var/www/index.html
2. File has context httpd_sys_content_t
3. SELinux checks policy: httpd_t → httpd_sys_content_t:file:read
4. Policy allows → access granted

If Apache tries to read /etc/shadow:

1. File has context shadow_t
2. SELinux checks: httpd_t → shadow_t:file:read
3. No policy allows this → DENIED
4. Even if DAC allows (unlikely), SELinux blocks
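The decision logic in this workflow can be sketched as a toy lookup in C. This is an illustrative model only, not how the kernel's SELinux implementation is written: the struct, the function names, and the hard-coded rule table are all assumptions, with the rules taken from the policy shown earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One "allow source target:class perm;" rule (toy representation). */
struct allow_rule {
    const char *source;  /* domain of the process, e.g. httpd_t */
    const char *target;  /* type of the object being accessed   */
    const char *tclass;  /* object class, e.g. file             */
    const char *perm;    /* requested permission, e.g. read     */
};

static const struct allow_rule policy[] = {
    {"httpd_t", "httpd_sys_content_t", "file", "read"},
    {"httpd_t", "httpd_sys_content_t", "file", "getattr"},
    {"httpd_t", "httpd_log_t",         "file", "write"},
    {"httpd_t", "httpd_log_t",         "file", "append"},
};

/* Type-enforcement check: allowed only if an explicit rule matches. */
bool te_allowed(const char *source, const char *target,
                const char *tclass, const char *perm) {
    assert(source && target && tclass && perm);
    for (size_t i = 0; i < sizeof policy / sizeof policy[0]; i++) {
        const struct allow_rule *r = &policy[i];
        if (strcmp(r->source, source) == 0 &&
            strcmp(r->target, target) == 0 &&
            strcmp(r->tclass, tclass) == 0 &&
            strcmp(r->perm, perm) == 0)
            return true;
    }
    return false;  /* deny by default */
}

/* Access is granted only if BOTH the DAC check and the MAC check pass. */
bool access_granted(bool dac_ok, const char *source, const char *target,
                    const char *tclass, const char *perm) {
    return dac_ok && te_allowed(source, target, tclass, perm);
}
```

Calling `access_granted(true, "httpd_t", "shadow_t", "file", "read")` returns `false`: even when file permissions would allow the read, no rule matches, so the request is denied.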
Troubleshooting:
# Check for denials
$ ausearch -m AVC -ts recent

# Generate policy module for denial
$ audit2allow -a -M mymodule
$ semodule -i mymodule.pp

# Temporarily set permissive (logs only)
$ setenforce 0
Q5: Design a secure multi-tenant system
Answer:

Requirements:
Multiple customers share infrastructure
Complete isolation between tenants
Resource limits per tenant
Audit logging
Architecture:

Isolation Mechanisms:
Network Level:
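# Kubernetes NetworkPolicy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tenant-a-isolation  # hypothetical name; metadata.name is required
spec:
  podSelector:
    matchLabels:
      tenant: A
  ingress:
  - from:
    - podSelector:
        matchLabels:
          tenant: A
# Tenant A pods accept ingress only from other Tenant A pods in the same namespace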
Data Level:
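-- Row-level security
-- Policies only take effect once RLS is enabled on the table
ALTER TABLE data ENABLE ROW LEVEL SECURITY;

-- current_setting() returns text; add a cast if tenant_id is not a text column
CREATE POLICY tenant_isolation ON data
    USING (tenant_id = current_setting('app.tenant_id'));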