Understanding power management is essential for infrastructure engineers. Whether optimizing cloud costs, managing thermal throttling, or debugging performance issues, these concepts matter at scale.
Interview Frequency: Medium (important for infrastructure/performance roles)
Key Topics: cpufreq, cpuidle, thermal throttling, power governors
Time to Master: 6-8 hours
┌─────────────────────────────────────────────────────────────────────────────┐
│                      CPU P-STATES (Performance States)                      │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  P-State    Frequency    Voltage    Power    Use Case                       │
│  ────────────────────────────────────────────────────────────────────────   │
│  P0         3.6 GHz      1.2 V      ~100W    Max performance                │
│  P1         3.2 GHz      1.1 V      ~75W     High performance               │
│  P2         2.8 GHz      1.0 V      ~50W     Normal use                     │
│  P3         2.4 GHz      0.9 V      ~35W     Power saving                   │
│  P4         2.0 GHz      0.85 V     ~25W     Low power                      │
│  ...        ...          ...        ...      ...                            │
│  Pn         800 MHz      0.7 V      ~5W      Minimum                        │
│                                                                             │
│  Turbo Boost (above base frequency):                                        │
│  • Single-core turbo: Up to 4.8 GHz                                         │
│  • All-core turbo: Up to 4.2 GHz                                            │
│  • Depends on: Temperature, power budget, active cores                      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
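The power column in the table above roughly tracks the common dynamic-power approximation, in which power scales with frequency times voltage squared. The following is a minimal sketch under that assumption only (it ignores leakage and uncore power), and `scale_power` is a hypothetical helper name, not an existing tool:

```shell
# scale_power REF_WATTS REF_GHZ REF_VOLTS NEW_GHZ NEW_VOLTS
# Estimates power at a new operating point using P ~ f * V^2.
# Hypothetical helper; a rough approximation that ignores static power.
scale_power() {
    awk -v p="$1" -v f0="$2" -v v0="$3" -v f1="$4" -v v1="$5" \
        'BEGIN { printf "%.1f\n", p * (f1 / f0) * (v1 / v0) ^ 2 }'
}
```

Starting from the P0 row, `scale_power 100 3.6 1.2 2.8 1.0` prints 54.0, which is in the same range as the ~50W listed for P2. Real parts deviate from this estimate, so treat it as intuition rather than a measurement.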
# Set governor for all CPUs
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    echo performance > "$cpu/cpufreq/scaling_governor"
done

# Using cpupower
cpupower frequency-set -g performance

# Set specific frequency (userspace governor)
cpupower frequency-set -g userspace
cpupower frequency-set -f 2.4GHz

# Set min/max frequencies (values in kHz)
echo 2400000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq
echo 3600000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
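Not every driver offers every governor, so a write to scaling_governor can fail with an I/O error. As a hedged sketch, the loop above could first check scaling_available_governors. `set_governor` and its optional root argument are hypothetical names introduced here so the function can be tried against a copy of the sysfs tree:

```shell
# set_governor GOVERNOR [SYSFS_CPU_ROOT]
# Writes GOVERNOR to each CPU's scaling_governor, but only when that CPU
# lists it in scaling_available_governors.
# SYSFS_CPU_ROOT is a hypothetical override (default: /sys/devices/system/cpu).
set_governor() {
    local gov="$1" root="${2:-/sys/devices/system/cpu}" dir rc=0
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        if grep -qw -- "$gov" "$dir/scaling_available_governors" 2>/dev/null; then
            echo "$gov" > "$dir/scaling_governor" || rc=1
        else
            echo "skip ${dir%/cpufreq}: '$gov' not available" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Run as root, `set_governor performance` behaves like the loop above but reports CPUs where the governor is missing and returns a non-zero status in that case.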
┌─────────────────────────────────────────────────────────────────────────────┐
│                         CPU C-STATES (Idle States)                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  C-State    Name            Power       Exit Latency   Description          │
│  ────────────────────────────────────────────────────────────────────────   │
│  C0         Active          High        0 μs           CPU executing        │
│  C1         Halt            Low         1-2 μs         Clock gated          │
│  C1E        Enhanced Halt   Lower       2-5 μs         + voltage reduction  │
│  C3         Sleep           Very Low    50-100 μs      L1/L2 flushed        │
│  C6         Deep Sleep      Minimal     100-200 μs     Core off             │
│  C7/C8/C9   Deeper          Ultra Low   200-500 μs     Package states       │
│                                                                             │
│  Trade-off:                                                                 │
│  • Deeper states = more power savings                                       │
│  • Deeper states = higher wake-up latency                                   │
│  • Must balance power vs. latency requirements                              │
│                                                                             │
│  For latency-sensitive apps: Limit to C1/C1E                                │
│  For power efficiency: Allow all C-states                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
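The trade-off above can be turned into a concrete rule: pick a latency budget and disable every idle state whose exit latency exceeds it. The sketch below is a dry run that only prints the plan. `plan_cstates` and its directory argument are hypothetical names; the name and latency files are the standard per-state cpuidle sysfs attributes.

```shell
# plan_cstates MAX_LATENCY_US [CPUIDLE_DIR]
# Prints "<name> <exit latency in us> <disable value>" for each idle state:
# 1 if the state's exit latency exceeds MAX_LATENCY_US, otherwise 0.
# Dry run only, nothing is written. CPUIDLE_DIR is a hypothetical override.
plan_cstates() {
    local budget="$1" dir="${2:-/sys/devices/system/cpu/cpu0/cpuidle}" s lat
    for s in "$dir"/state[0-9]*; do
        [ -r "$s/latency" ] || continue
        lat=$(cat "$s/latency")
        if [ "$lat" -gt "$budget" ]; then
            echo "$(cat "$s/name") $lat 1"
        else
            echo "$(cat "$s/name") $lat 0"
        fi
    done
}
```

With a 10 microsecond budget, states such as C1 and C1E would typically stay enabled while C6 and deeper would be marked for disabling. The exact latencies depend on the CPU and idle driver.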
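# Disable deeper C-states (state 2 and beyond) on every CPU
for state in /sys/devices/system/cpu/cpu*/cpuidle/state[2-9]; do
    echo 1 > "$state/disable"
done

# Kernel parameters to limit C-states
# Add to GRUB_CMDLINE_LINUX:
intel_idle.max_cstate=1 processor.max_cstate=1

# Using PM QoS to set a latency constraint
# The request lasts only while the file descriptor stays open, so a plain
# "echo 10 > /dev/cpu_dma_latency" is dropped as soon as echo exits.
# Text written to the device is parsed as hexadecimal (0xa = 10).
exec 3> /dev/cpu_dma_latency
echo 0x0000000a >&3    # request max 10 microsecond latency
# ... run the latency-sensitive work ...
exec 3>&-              # closing the descriptor releases the request

# Programmatically (C):
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

int fd = open("/dev/cpu_dma_latency", O_RDWR);
int32_t latency = 10;  // microseconds, written as a binary 32-bit value
write(fd, &latency, sizeof(latency));
// Keep fd open as long as low latency is needed; close(fd) releases the request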
# List cooling devices
ls /sys/class/thermal/cooling_device*/

# View cooling device type
cat /sys/class/thermal/cooling_device0/type
# intel_powerclamp
# Processor
# Fan

# Current and max cooling state
cat /sys/class/thermal/cooling_device0/cur_state   # 0
cat /sys/class/thermal/cooling_device0/max_state   # 10

# Which zone is this device bound to?
cat /sys/class/thermal/thermal_zone0/cdev0/type
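Checking each cooling device by hand gets tedious on machines with many of them. A small sketch that prints one summary line per device, using only the type, cur_state, and max_state files shown above. `list_cooling` and its root argument are hypothetical names:

```shell
# list_cooling [THERMAL_ROOT]
# Prints "<device> <type> <cur_state>/<max_state>" for each cooling device.
# THERMAL_ROOT is a hypothetical override (default: /sys/class/thermal).
list_cooling() {
    local root="${1:-/sys/class/thermal}" d
    for d in "$root"/cooling_device[0-9]*; do
        [ -r "$d/type" ] || continue
        printf '%s %s %s/%s\n' "${d##*/}" "$(cat "$d/type")" \
            "$(cat "$d/cur_state")" "$(cat "$d/max_state")"
    done
}
```

A device whose current state sits at or near its maximum is actively cooling, which is a hint that thermal limits are in play.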
# View power zones
ls /sys/class/powercap/intel-rapl/

# Package power
cat /sys/class/powercap/intel-rapl/intel-rapl:0/name        # package-0
cat /sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj   # Cumulative energy (microjoules)

# Subzone power (core, uncore, or dram; numbering varies by platform)
cat /sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:0/name   # e.g. dram

# Power constraints (limits)
cat /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
cat /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us

# Using perf for power events
perf stat -e power/energy-pkg/,power/energy-cores/,power/energy-ram/ ./myapp

# Using turbostat
sudo turbostat --show PkgWatt,CorWatt,RAMWatt --interval 1
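Because energy_uj is a cumulative counter, power has to be derived from two samples: the energy difference divided by the elapsed time. A minimal sketch of that arithmetic follows. `avg_watts` is a hypothetical helper, and the wraparound handling assumes the counter wrapped at most once and that the wrap point is the value in the zone's max_energy_range_uj file:

```shell
# avg_watts E_START_UJ E_END_UJ SECONDS [MAX_RANGE_UJ]
# Average power in watts between two energy_uj samples.
# If the counter wrapped (end < start), MAX_RANGE_UJ is added once;
# pass the contents of max_energy_range_uj for that zone.
avg_watts() {
    awk -v a="$1" -v b="$2" -v t="$3" -v max="${4:-0}" 'BEGIN {
        d = b - a
        if (d < 0) d += max          # assume a single wrap between samples
        printf "%.2f\n", d / 1e6 / t # microjoules -> joules -> watts
    }'
}
```

In practice the two readings would come from running cat on energy_uj before and after a sleep of known length. For example, `avg_watts 1000000 66000000 1` prints 65.00.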
# Set power limit (microwatts; 65000000 = 65 W)
echo 65000000 > /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw

# Set time window (microseconds; 1000000 = 1 s)
echo 1000000 > /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_time_window_us
# Maximum performance configuration
# /etc/tuned/profiles/latency-performance/

# 1. Set performance governor
cpupower frequency-set -g performance

# 2. Disable turbo (for consistent performance)
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

# 3. Disable deep C-states
for state in /sys/devices/system/cpu/cpu*/cpuidle/state[2-9]; do
    echo 1 > "$state/disable"
done

# 4. Disable frequency scaling (lock to max)
for cpu in /sys/devices/system/cpu/cpu*/cpufreq; do
    cat "$cpu/cpuinfo_max_freq" > "$cpu/scaling_min_freq"
done

# 5. Isolate CPUs for application
# GRUB: isolcpus=2-7 nohz_full=2-7
# Power saving configuration

# 1. Set schedutil governor (balances power/performance)
cpupower frequency-set -g schedutil

# 2. Enable all C-states
for state in /sys/devices/system/cpu/cpu*/cpuidle/state*; do
    echo 0 > "$state/disable" 2>/dev/null
done

# 3. Enable turbo (uses power when needed, saves when not)
echo 0 > /sys/devices/system/cpu/intel_pstate/no_turbo

# 4. Set power cap if needed (microwatts; 35000000 = 35 W)
echo 35000000 > /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw
# Install tuned
yum install tuned   # or apt install tuned

# List available profiles
tuned-adm list

# Common profiles:
# balanced               - Default, balance power and performance
# throughput-performance - Maximum throughput
# latency-performance    - Minimum latency
# powersave              - Maximum power savings
# virtual-guest          - Optimized for VMs
# virtual-host           - Optimized for hypervisors

# Set profile
tuned-adm profile latency-performance

# Check current profile
tuned-adm active
┌─────────────────────────────────────────────────────────────────────────────┐
│                         CLOUD POWER CONSIDERATIONS                          │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Instance Type         Power Profile         Notes                          │
│  ────────────────────────────────────────────────────────────────────────   │
│  General Purpose       Balanced              C-states enabled, schedutil    │
│  (m5, n2)                                                                   │
│                                                                             │
│  Compute Optimized     Max Performance       Often performance governor     │
│  (c5, c2)                                    Higher power draw              │
│                                                                             │
│  Memory Optimized      Balanced              Large memory, moderate CPU     │
│  (r5, m2)                                                                   │
│                                                                             │
│  Burstable             Power efficient       Credits for bursts             │
│  (t3, e2-micro)                              Deep C-states                  │
│                                                                             │
│  High Frequency        Max single-thread     Turbo boost, high power        │
│  (z1d, c6i.metal)                                                           │
│                                                                             │
│  Spot/Preemptible      Variable              Reclaimed for capacity         │
│                                                                             │
│  Key: Match instance to workload power profile                              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
# 1. Check if throttling due to temperature
cat /sys/devices/system/cpu/cpu0/thermal_throttle/core_throttle_count

# 2. Check power limits
cat /sys/class/powercap/intel-rapl/intel-rapl:0/constraint_0_power_limit_uw

# 3. Check governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

# 4. Check if turbo is disabled
cat /sys/devices/system/cpu/intel_pstate/no_turbo

# 5. Monitor in real-time
sudo turbostat --show Core,CPU,Avg_MHz,Busy%,Bzy_MHz,TSC_MHz,PkgTmp --interval 1
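Step 1 above inspects only cpu0, while throttling can hit individual cores. As a rough sketch, the same counter can be scanned across all CPUs so that only those with a non-zero count are listed. `report_throttled` and its root argument are hypothetical names:

```shell
# report_throttled [SYSFS_CPU_ROOT]
# Prints "<cpu> <count>" for every CPU whose core_throttle_count is non-zero.
# SYSFS_CPU_ROOT is a hypothetical override (default: /sys/devices/system/cpu).
report_throttled() {
    local root="${1:-/sys/devices/system/cpu}" f n cpu
    for f in "$root"/cpu[0-9]*/thermal_throttle/core_throttle_count; do
        [ -r "$f" ] || continue
        n=$(cat "$f")
        cpu=${f#"$root"/}      # strip the root prefix
        cpu=${cpu%%/*}         # keep only the cpuN component
        [ "$n" -gt 0 ] && echo "$cpu $n"
    done
    return 0
}
```

The counters are cumulative since boot, so running the function twice a few seconds apart and comparing the output shows whether throttling is still happening.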
# 1. Monitor power draw
sudo turbostat --show PkgWatt,CorWatt,RAMWatt --interval 1

# 2. Check which processes are using CPU
perf top

# 3. Check for inefficient C-state usage
cat /sys/devices/system/cpu/cpu*/cpuidle/state*/usage

# 4. Check for unnecessary activity
sudo perf record -g -a sleep 10
sudo perf report
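The raw usage counters in step 3 are entry counts, which are hard to judge on their own. Each state directory also exposes a time file (microseconds spent in that state), and the share of idle time per state is often easier to read. A sketch under those assumptions, with `cstate_share` and its directory argument as hypothetical names:

```shell
# cstate_share [CPUIDLE_DIR]
# Prints each idle state's share of the total time recorded in the
# per-state "time" files, as "<name> <percent>%".
# CPUIDLE_DIR is a hypothetical override (default: cpu0's cpuidle directory).
cstate_share() {
    local dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}" s
    for s in "$dir"/state[0-9]*; do
        [ -r "$s/time" ] || continue
        echo "$(cat "$s/name") $(cat "$s/time")"
    done | awk '{ name[NR] = $1; t[NR] = $2; total += $2 }
        END { for (i = 1; i <= NR; i++)
                  printf "%s %.1f%%\n", name[i], (total ? 100 * t[i] / total : 0) }'
}
```

A mostly idle machine that spends nearly all of its idle time in shallow states may have deep states disabled or may be woken too frequently to reach them.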
# Monitor all at once
sudo turbostat --interval 1

# Set power/performance profile
tuned-adm profile <profile-name>

# CPU frequency control
cpupower frequency-set -g performance

# View temperatures
sensors

# Power statistics
perf stat -e power/energy-pkg/ ./myapp