Python Fundamentals

Python is an interpreted, high-level, dynamically typed language. It emphasizes code readability with its use of significant indentation. Think of Python like writing a recipe: you describe the steps in plain language, in order, and anyone can follow along. Languages like C++ are more like writing assembly instructions for a specific machine — precise but harder for humans to parse at a glance. This is not just a metaphor — Python was explicitly designed to be read by humans first and machines second, which is why Guido van Rossum made whitespace significant and kept the syntax minimal.

1. How Python Works: Under the Hood

Unlike compiled languages (C++, Go, Rust) that produce standalone executables, Python uses a two-stage process: compilation to bytecode, then interpretation.

The Python Execution Pipeline

| Stage | What Happens | Can You See It? |
| --- | --- | --- |
| Lexing | Source code → tokens (keywords, identifiers, operators) | Internal |
| Parsing | Tokens → Abstract Syntax Tree (AST) | ast.parse() |
| Compiling | AST → Bytecode instructions | dis.dis() |
| Interpreting | PVM executes bytecode | Your program runs! |
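As a rough illustration of the stages in the table, the standard-library ast and dis modules can be used to look at the parse, compile, and execute steps. The source string and the "<pipeline-demo>" filename are made up for this sketch.

```python
import ast
import dis

source = "total = 1 + 2"

# Parsing: source text -> Abstract Syntax Tree
tree = ast.parse(source)
print(ast.dump(tree, indent=2))

# Compiling: AST -> code object that holds the bytecode
code = compile(tree, filename="<pipeline-demo>", mode="exec")
dis.dis(code)  # exact opcodes vary between Python versions

# Interpreting: the virtual machine executes the bytecode
namespace = {}
exec(code, namespace)
print(namespace["total"])  # 3
```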

Why is Python “Interpreted”?

Python is actually both compiled and interpreted:
  1. Compilation (happens automatically): Your .py → bytecode (.pyc)
  2. Interpretation: The Python Virtual Machine (PVM) executes bytecode
# You can see the bytecode!
import dis

def add(a, b):
    return a + b

dis.dis(add)
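# Output on Python 3.10 and earlier (opcodes and offsets vary by version;
# Python 3.11+ shows BINARY_OP in place of BINARY_ADD):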
#   2           0 LOAD_FAST                0 (a)
#               2 LOAD_FAST                1 (b)
#               4 BINARY_ADD
#               6 RETURN_VALUE

The __pycache__ Folder

When you import a module, Python saves the compiled bytecode:
myproject/
├── main.py
├── utils.py
└── __pycache__/
    └── utils.cpython-311.pyc  ← Compiled bytecode
  • cpython-311 = CPython interpreter, Python 3.11
  • Bytecode is cached for faster imports (no recompilation if source unchanged)
  • Delete safely: Python will regenerate if needed
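To check where the running interpreter would cache a module's bytecode, a small sketch using the standard library might look like this. The file name utils.py is just the example name from the tree above; the file does not need to exist.

```python
import importlib.util
import sys

# Cache tag of the running interpreter, e.g. "cpython-311" on CPython 3.11
print(sys.implementation.cache_tag)

# Path where the bytecode for a source file named utils.py would be cached
pyc_path = importlib.util.cache_from_source("utils.py")
print(pyc_path)
```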

CPython vs Other Implementations

| Implementation | Description | Use Case |
| --- | --- | --- |
| CPython | Reference implementation (C) | Default, most libraries |
| PyPy | JIT-compiled Python (faster) | Performance-critical |
| Jython | Python on JVM (Java bytecode) | Java integration |
| MicroPython | Tiny Python for microcontrollers | IoT, embedded |
Performance Tip: Python is slower than compiled languages because the PVM interprets bytecode at runtime. For CPU-intensive tasks, consider:
  • NumPy/Pandas: Uses C under the hood
  • Cython: Compile Python to C
  • PyPy: JIT compilation for speedups

The GIL (Global Interpreter Lock)

CPython has a Global Interpreter Lock (GIL) — a mutex that allows only one thread to execute Python bytecode at a time. Think of it like a single-lane bridge: no matter how many cars (threads) are waiting, only one can cross at a time. This means multi-threaded Python code does not achieve true parallelism for CPU-bound work. Two threads crunching numbers will take roughly the same time as running them sequentially because they take turns holding the GIL. However, the GIL is released during I/O operations (network calls, file reads, database queries). So threading still works well for I/O-bound tasks — while one thread waits for a network response, another can run.
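# GIL impact: threading does NOT speed up CPU-bound work
import threading
from multiprocessing import Pool

def cpu_work(n=10_000_000):
    return sum(range(n))  # CPU-bound -- the GIL is held while this runs

if __name__ == "__main__":
    # Two threads take turns holding the GIL, so this is no faster than
    # calling cpu_work() twice in a row.
    threads = [threading.Thread(target=cpu_work) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Use multiprocessing instead for true CPU parallelism.
    # Each process gets its own GIL. The __main__ guard is needed on
    # platforms that start workers by spawning (Windows, macOS).
    with Pool(4) as p:
        results = p.map(cpu_work, [10_000_000] * 4)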
Python 3.13 introduced an experimental “free-threaded” build (PEP 703) that removes the GIL. This is opt-in and still maturing, but it signals the long-term direction of the language. For now, use multiprocessing for CPU-bound parallelism and asyncio or threading for I/O-bound concurrency.
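For the I/O-bound side of that advice, here is a minimal sketch in which time.sleep stands in for a blocking network call. The helper name fake_io and the 0.2 second delay are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(delay):
    time.sleep(delay)  # stand-in for a network call; sleeping releases the GIL
    return delay

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_io, [0.2] * 4))
elapsed = time.perf_counter() - start

# The four waits overlap, so the total is close to one delay, not four.
print(f"{len(results)} tasks finished in {elapsed:.2f}s")
```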

2. Variables & Types

Python is dynamically typed. You don’t declare types (like int x), but types definitely exist. Think of variables in Python as name tags, not boxes. In C, a variable is a box with a fixed size — you put an integer in an int box. In Python, a variable is a sticky note you attach to an object. You can peel the note off and stick it on a completely different object. The object carries its own type; the name tag does not.
x = 10          # x points to an int object with value 10
price = 19.99   # price points to a float object
name = "Alice"  # name points to a str object
is_active = True # is_active points to a bool object
nothing = None  # nothing points to the singleton None object

# The "name tag" analogy in action:
x = 10          # x is attached to int(10)
x = "hello"     # x is now attached to str("hello") -- the int object is untouched.
                # The name moved, the objects did not. An object with no remaining
                # references becomes eligible for garbage collection.

Type Hints (Python 3.5+)

While Python doesn’t enforce types at runtime, you can (and should) use Type Hints. They act as documentation and allow tools (like VS Code or mypy) to catch errors.
age: int = 25
name: str = "Bob"

# This is valid Python code (no runtime error), but a type checker such as mypy will flag it.
age = "Twenty"
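A short sketch of what "not enforced at runtime" means in practice: the hints are stored on the function object, and a call with the wrong types still runs. The function add here is purely illustrative.

```python
def add(a: int, b: int) -> int:
    return a + b

# Hints are stored as metadata on the function object...
print(add.__annotations__)

# ...but they are not checked when the function is called.
result = add("x", "y")
print(result)  # xy
```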

Mutable vs. Immutable

This is a critical concept in Python that trips up beginners. Think of it this way: immutable objects are like printed books — once printed, you cannot change the text on page 5. You can only print a new edition. Mutable objects are like whiteboards — you can erase and rewrite freely.
  • Immutable (Cannot change): int, float, str, tuple, bool.
  • Mutable (Can change): list, dict, set.
# Immutable -- strings cannot be modified in place
s = "hello"
# s[0] = "H"  # TypeError! You cannot change a string. You must create a new one.
s = "H" + s[1:]  # Creates a NEW string "Hello". The old "hello" is garbage collected.

# Mutable -- lists can be modified in place
nums = [1, 2, 3]
nums[0] = 100  # OK. The list object is modified in place. No new list is created.

# Why this matters: function arguments
def add_to_list(items, value):
    items.append(value)  # Mutates the ORIGINAL list -- the caller's list changes!

def try_to_change_string(text):
    text = text.upper()  # Creates a new string. The caller's variable is unaffected.

my_list = [1, 2]
add_to_list(my_list, 3)
print(my_list)  # [1, 2, 3] -- the original was modified

my_str = "hello"
try_to_change_string(my_str)
print(my_str)   # hello -- unchanged: upper() built a new string and only the local name was rebound
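The difference that matters for function arguments is rebinding a name versus mutating an object, and it applies to lists too. The helper names rebind and mutate are hypothetical and exist only for this sketch.

```python
def rebind(items):
    items = items + [99]   # builds a NEW list and binds the local name to it
    return items

def mutate(items):
    items.append(99)       # changes the object the caller also refers to
    return items

original = [1, 2]
rebound = rebind(original)
print(original)             # [1, 2]

mutated = mutate(original)
print(original)             # [1, 2, 99]
print(mutated is original)  # True
```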
The Aliasing Trap: Because mutable objects can change in place, two variables can point to the same object. Changing one affects the other.
a = [1, 2, 3]
b = a          # b is NOT a copy -- it's the SAME list object
b.append(4)
print(a)       # [1, 2, 3, 4] -- surprise! a changed too

# Fix: Make an explicit copy
b = a.copy()   # Now b is an independent shallow copy
b = a[:]       # Slicing also creates a shallow copy
b = list(a)    # Constructor also creates a shallow copy
Shallow Copy vs Deep Copy: The .copy() method creates a shallow copy — it copies the outer container but not the nested objects inside it. If your list contains other mutable objects (lists, dicts), those inner objects are still shared.
import copy

# Shallow copy -- inner lists are still shared
original = [[1, 2], [3, 4]]
shallow = original.copy()
shallow[0].append(99)
print(original)  # [[1, 2, 99], [3, 4]] -- original was affected!

# Deep copy -- everything is fully independent
original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)
deep[0].append(99)
print(original)  # [[1, 2], [3, 4]] -- original is safe

# Rule of thumb: use .copy() for flat structures (list of ints, strings).
# Use copy.deepcopy() when you have nested mutable objects.

3. Input & Output

# Output
print("Hello", "World") # Prints "Hello World" (space separated by default)
print("Hello", "World", sep="-") # Prints "Hello-World"

# Input
# input() ALWAYS returns a string. You must convert it if you need a number.
name = input("Enter your name: ")
age_str = input("Enter your age: ")
age = int(age_str) # Convert to integer
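Because int() raises ValueError on text that is not a whole number, user input is usually wrapped in a small validation step. The helper parse_age below is a hypothetical sketch, not part of the original example.

```python
def parse_age(text):
    """Return a non-negative int parsed from text, or None if it is not valid."""
    try:
        age = int(text.strip())
    except ValueError:
        return None
    return age if age >= 0 else None

print(parse_age(" 42 "))  # 42
print(parse_age("abc"))   # None
print(parse_age("-3"))    # None
```

In an interactive script you could call parse_age(input("Enter your age: ")) in a loop until it returns something other than None.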

f-Strings (Python 3.6+)

The “Formatted String Literal” is the modern way to insert variables into strings. It’s fast and readable. You can put any valid Python expression inside the curly braces.
name = "Alice"
age = 30
print(f"Hello, {name}. You are {age} years old.")

# You can embed expressions, not just variables
print(f"In 5 years, you'll be {age + 5}.")

# Format numbers with precision
pi = 3.14159265
print(f"Pi is approximately {pi:.2f}")  # "Pi is approximately 3.14"

# Debugging trick (Python 3.8+): Add = to see variable name AND value
x = 42
print(f"{x = }")  # "x = 42" -- extremely useful for quick debugging

4. Control Flow

Python uses indentation (whitespace) to define blocks of code. There are no curly braces {} around blocks, and statements do not need a terminating semicolon ;.

If-Elif-Else

score = 85

if score >= 90:
    print("A")
elif score >= 80:
    print("B")
else:
    print("C")

Pythonic Conditionals

Python has several idiomatic patterns for conditionals that differ from other languages:
# Ternary expression (Python's one-line if-else)
status = "adult" if age >= 18 else "minor"

# Truthiness -- Python treats these as False (called "falsy"):
#   None, 0, 0.0, "", [], {}, set(), False
# Everything else is "truthy". Use this to write cleaner checks:

# Non-Pythonic
if len(my_list) > 0:
    process(my_list)

# Pythonic -- rely on truthiness
if my_list:
    process(my_list)

# Chained comparisons (rare among mainstream languages, reads like math)
if 0 < x < 100:      # Equivalent to: 0 < x and x < 100
    print("In range")
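One caveat with truthiness checks: 0 and empty containers are falsy, so when they are valid values, test for None explicitly. The function describe_stock is a made-up example to show the difference.

```python
def describe_stock(count=None):
    # `if not count` would treat 0 the same as None, so check for None explicitly
    if count is None:
        return "unknown"
    if count == 0:
        return "sold out"
    return f"{count} in stock"

print(describe_stock())   # unknown
print(describe_stock(0))  # sold out
print(describe_stock(3))  # 3 in stock
```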

Loops

For Loop: Python’s for loop is actually a “for-each” loop. It iterates over any iterable: a sequence, a file, a generator, or any object that implements the iterator protocol.
# range(5) generates: 0, 1, 2, 3, 4
for i in range(5):
    print(i)

# Iterate over a list directly -- no indexing needed
names = ["Alice", "Bob"]
for name in names:
    print(name)

# Need the index AND the value? Use enumerate() instead of range(len(...))
for i, name in enumerate(names):
    print(f"Index {i}: {name}")

# Iterating over two lists in parallel? Use zip()
scores = [85, 92]
for name, score in zip(names, scores):
    print(f"{name}: {score}")
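The iterator protocol mentioned above can be spelled out by hand. The sketch below is roughly what a for loop does behind the scenes, using the built-in iter() and next().

```python
names = ["Alice", "Bob"]

iterator = iter(names)          # ask the iterable for an iterator
visited = []
while True:
    try:
        name = next(iterator)   # fetch the next item
    except StopIteration:       # raised when the iterator is exhausted
        break
    visited.append(name)

print(visited)  # ['Alice', 'Bob']
```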
Pitfall: Modifying a list while iterating over it. This leads to skipped elements and subtle bugs.
# WRONG -- modifying during iteration causes skipped items
nums = [1, 2, 4, 5, 6]
for n in nums:
    if n % 2 == 0:
        nums.remove(n)  # Later items shift left, so the loop skips the next element
print(nums)  # [1, 4, 5] -- 4 was skipped and never removed

# CORRECT -- build a new list with a comprehension
nums = [1, 2, 4, 5, 6]
nums = [n for n in nums if n % 2 != 0]
print(nums)  # [1, 5]

# Also correct -- iterate over a copy
nums = [1, 2, 4, 5, 6]
for n in nums[:]:  # [:] creates a shallow copy to iterate over
    if n % 2 == 0:
        nums.remove(n)
print(nums)  # [1, 5]
While Loop: Runs as long as the condition is true.
count = 0
while count < 5:
    print(count)
    count += 1

5. Functions

Functions are defined using def.
# Type hints are optional but recommended
def add(a: int, b: int) -> int:
    """
    Returns the sum of two numbers.
    This is a docstring - used for documentation.
    """
    return a + b

result = add(5, 3)

Default Arguments

You can provide default values for parameters.
def greet(name="Guest"):
    print(f"Hello, {name}")

greet()         # Hello, Guest
greet("Alice")  # Hello, Alice
The Mutable Default Argument Trap — This is Python’s most infamous gotcha. Never use a mutable object (list, dict, set) as a default argument value. The default is created once when the function is defined, not each time it is called. Every call shares the same object.
# WRONG -- the same list is reused across calls
def add_item(item, items=[]):
    items.append(item)
    return items

print(add_item("a"))  # ['a']
print(add_item("b"))  # ['a', 'b'] -- expected ['b']!

# CORRECT -- use None as a sentinel, create a new list inside
def add_item(item, items=None):
    if items is None:
        items = []    # Fresh list created on each call
    items.append(item)
    return items
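You can watch the shared default accumulate state by inspecting the function object. The sketch below reuses the buggy version on purpose, for inspection only.

```python
def add_item(item, items=[]):   # deliberately buggy, for inspection only
    items.append(item)
    return items

print(add_item.__defaults__)    # ([],)
add_item("a")
add_item("b")
print(add_item.__defaults__)    # (['a', 'b'],)
```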

*args and **kwargs

These allow functions to accept an arbitrary number of arguments. Think of *args as a catch-all bucket for extra positional arguments and **kwargs as a catch-all bucket for extra keyword arguments.
  • *args: Collects positional arguments into a Tuple.
  • **kwargs: Collects keyword arguments into a Dictionary.
def log(message, *args, **kwargs):
    print(f"MSG: {message}")
    print(f"Extra Args: {args}")     # Tuple of extra positional args
    print(f"Config: {kwargs}")       # Dict of extra keyword args

log("Error", 1, 2, user="admin", code=500)
# Output:
# MSG: Error
# Extra Args: (1, 2)
# Config: {'user': 'admin', 'code': 500}

Keyword-Only Arguments

Using * as a separator forces callers to use keyword arguments. This prevents subtle bugs where argument order is confused.
# Everything after * must be passed by name
def connect(host, port, *, timeout=30, retries=3):
    print(f"Connecting to {host}:{port}, timeout={timeout}")

connect("localhost", 5432, timeout=10)    # OK
# connect("localhost", 5432, 10)          # TypeError! timeout is keyword-only

# This pattern is extremely common in well-designed Python APIs.
# It makes call sites self-documenting and prevents positional mixups.

Lambda Functions

Lambdas are anonymous, single-expression functions. They are useful as short callbacks, not as replacements for def.
# Lambda: good for short, one-off operations
users = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]
users.sort(key=lambda u: u["age"])  # Sort by age

# Equivalent with def (use this if the logic is complex)
def get_age(user):
    return user["age"]
users.sort(key=get_age)

# Anti-pattern: assigning a lambda to a variable. Just use def instead.
# Bad:  square = lambda x: x ** 2
# Good: def square(x): return x ** 2

Common Pitfalls Cheat Sheet

Pitfalls by experience level:
Beginner:
  • Using == to compare to None instead of is None. None is a singleton — always use is.
  • Forgetting that input() always returns a string. "5" + "3" gives "53", not 8.
  • Using = (assignment) when you mean == (comparison) inside conditions.
Intermediate:
  • Mutable default arguments (covered above): one of the most common Python gotchas to reach production code.
  • Shallow copies of nested structures: .copy() does not copy inner lists or dicts.
  • Catching bare except: instead of except Exception:. Bare except catches KeyboardInterrupt and SystemExit, making your program impossible to kill with Ctrl+C.
Senior:
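  • Relying on __del__ (destructor) for cleanup. __del__ runs only when the object is actually collected: promptly under CPython’s reference counting in simple cases, but at an unpredictable time for objects caught in reference cycles or on implementations without reference counting (such as PyPy), and it is not guaranteed to run at all at interpreter exit. Use context managers instead.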
  • Assuming the GIL makes your code thread-safe. The GIL prevents parallel execution of bytecode, but operations that span multiple bytecodes (like dict[key] += 1) are not atomic and can still cause race conditions.
  • Over-using isinstance() checks instead of leveraging duck typing and protocols. If it quacks like a duck, let it quack.
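As a sketch of the context-manager alternative to __del__, the example below guarantees cleanup even when the body raises. The name managed_resource and the events list are assumptions made up for the illustration.

```python
from contextlib import contextmanager

events = []

@contextmanager
def managed_resource(name):
    events.append(f"open {name}")
    try:
        yield name
    finally:
        # Runs whether the with-block finishes normally or raises
        events.append(f"close {name}")

try:
    with managed_resource("db") as resource:
        raise RuntimeError("boom")
except RuntimeError:
    pass

print(events)  # ['open db', 'close db']
```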

Summary

  • Dynamic Typing: Variables are name tags, not boxes. Types live on objects, not variables.
  • Indentation: Whitespace is syntactically significant.
  • Immutability: Strings and numbers cannot be changed in place; lists can. This affects function arguments.
  • The GIL: CPython allows only one thread to run Python bytecode at a time. Use multiprocessing for CPU parallelism.
  • Shallow vs Deep Copy: .copy() only copies the top layer. Use copy.deepcopy() for nested structures.
  • f-Strings: The modern, readable way to embed values and expressions in strings.
Next, we’ll explore Python’s powerful built-in Data Structures.

Interview Deep-Dive

Strong Answer:
  • The Global Interpreter Lock is a mutex in CPython that ensures only one thread executes Python bytecode at any given time. Even on a 64-core machine, only one thread is running Python instructions. The other threads are either waiting for the GIL or blocked on I/O.
  • The GIL exists because CPython’s memory management relies on reference counting, and reference counts are not thread-safe. Without the GIL, two threads could simultaneously increment and decrement the refcount of the same object, causing a race condition that corrupts memory. The GIL is the simplest solution: serialize all bytecode execution so refcounts are always consistent.
  • The key nuance that separates senior candidates: the GIL only affects CPU-bound work. For I/O-bound programs — web servers, API clients, database-heavy applications — threading works fine because the GIL is released while waiting on I/O. A thread calling socket.recv() or file.read() drops the GIL so other threads can run. This is why libraries like requests with concurrent.futures.ThreadPoolExecutor achieve real concurrency.
  • For CPU-bound parallelism, you have three main options. First, multiprocessing — each process gets its own Python interpreter and its own GIL, so you get true parallelism. The cost is inter-process communication overhead (serialization via pickle). Second, C extensions that explicitly release the GIL — NumPy does this, which is why NumPy matrix multiplication can saturate all cores even from a single Python thread. Third, alternative runtimes like PyPy (which still has a GIL but is faster overall) or the new free-threaded CPython build in 3.13+ (PEP 703) which removes the GIL entirely as an opt-in experiment.
  • In practice, the GIL is rarely the actual bottleneck. I have seen teams spend weeks trying to work around the GIL when the real problem was an unindexed database query or an O(n^2) algorithm. Profile first, then decide if the GIL is actually your constraint.
Follow-up: If the GIL is released during I/O, can you still have race conditions in a multithreaded Python program?
  • Absolutely, and this is a common misconception. The GIL protects CPython’s internal state (refcounts, interpreter data structures), not your application’s state. If two threads read a shared variable, do some computation, and write back the result, the GIL does not prevent interleaving between those operations.
  • For example, counter += 1 is not atomic in Python. It compiles to multiple bytecode instructions: a load, an add, and a store. A thread switch can happen between them (CPython asks the running thread to give up the GIL after a switch interval of 5 ms by default, configurable via sys.setswitchinterval). So two threads can both read counter = 5, both compute 6, and both write 6, losing an increment.
  • You still need threading.Lock, queue.Queue, or other synchronization primitives for shared mutable state. The GIL is an implementation detail, not a concurrency guarantee for your code. Treating it as one is a bug waiting to happen.
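A minimal sketch of protecting a shared counter with threading.Lock is shown below. The thread count and iteration count are arbitrary choices, and the state dict is only a convenient place to hold the shared value.

```python
import threading

state = {"count": 0}
lock = threading.Lock()

def increment(times):
    for _ in range(times):
        with lock:               # read-modify-write happens as one critical section
            state["count"] += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["count"])  # 40000
```

Without the lock the final count may come up short, although how often that happens depends on the interpreter version and timing.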
Strong Answer:
  • Dynamic typing means type checking happens at runtime, not compile time. You do not declare int x = 5 — you write x = 5 and Python infers the type. You can reassign x = "hello" and Python will not complain. The type lives on the object, not the variable.
  • Strong typing means Python does not silently coerce types. In JavaScript (weakly typed), "5" + 3 gives you "53" (string concatenation). In Python, "5" + 3 raises a TypeError. Python forces you to be explicit: int("5") + 3 or "5" + str(3). This catches an enormous number of bugs that slip through in weakly typed languages.
  • The combination creates an interesting trade-off. You get rapid prototyping speed (no type boilerplate) with a safety net against the worst type coercion bugs, but you lose compile-time guarantees. A function that expects a dict but receives a list will only fail when that code path actually executes, which could be in production at 3 AM.
  • For large codebases, this is managed with type hints and static analysis. Type hints (PEP 484) let you annotate function signatures: def process(items: list[str]) -> int. Tools like mypy, pyright, or pytype then check these annotations statically, effectively giving you compile-time checking as an opt-in layer. Large Python shops run these checkers in CI (Dropbox adopted mypy across its codebase, and Google develops pytype) so that type errors are caught before code reaches production.
  • The practical implication is a spectrum: small scripts and data exploration benefit from dynamic typing’s flexibility. Large services with multiple contributors benefit from strict type annotations enforced by CI. The senior move is knowing where you are on that spectrum and adjusting accordingly — not dogmatically applying one approach everywhere.
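The strong-typing point can be checked directly. This short sketch shows the TypeError and the two explicit conversions mentioned above.

```python
error = None
try:
    "5" + 3                      # no silent coercion between str and int
except TypeError as exc:
    error = str(exc)
    print(f"TypeError: {error}")

as_number = int("5") + 3         # explicit: treat the text as a number
as_text = "5" + str(3)           # explicit: treat the number as text
print(as_number, as_text)        # 8 53
```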
Follow-up: What is the difference between mypy strict mode and Python’s runtime behavior, and where can they diverge?
  • mypy --strict turns on a bundle of stricter checks: it requires annotations on function definitions and rejects most implicit uses of Any (explicit Any is still permitted). But critically, type hints have zero runtime effect in standard CPython. They are metadata stored in __annotations__, and Python does not check them during execution. So you can have a function annotated def add(a: int, b: int) -> int and call it with add("x", "y") and Python will happily concatenate the strings.
  • This means your CI can be green with mypy and your code can still crash with a type error in production if there is a gap between what mypy analyzed and what actually runs. Common sources of divergence: dynamic code paths that mypy cannot follow (getattr, eval, metaclass magic), third-party libraries without type stubs, and cast() calls that lie to mypy.
  • Libraries like pydantic and beartype add runtime type checking. Pydantic validates data at model boundaries (API inputs, config loading), which is where type mismatches most commonly originate. The senior approach is to combine static checking (mypy in CI) with runtime validation at system boundaries (pydantic for API schemas) and skip runtime checking in hot inner loops where the overhead is not justified.
Strong Answer:
  • When you write a = [1, 2, 3], Python allocates a list object on the heap. The list object internally holds an array of pointers to three integer objects (1, 2, 3). The variable a is a reference (think pointer) to that list object. So a is in the local namespace, and it points to the list on the heap.
  • b = a does not copy anything. It creates a second reference to the exact same list object. a and b now point to the same memory address. You can verify with a is b returning True, or id(a) == id(b). Any mutation through b — like b.append(4) — is visible through a because there is only one list.
  • b = a[:] creates a shallow copy. A new list object is allocated, and the new list’s internal pointer array is populated with copies of the pointers from a. The key word is “shallow” — the new list has its own identity (a is b returns False), but its elements still point to the same underlying objects. For immutable elements like integers and strings, this distinction does not matter. But for nested mutable objects, it does.
  • The gotcha with shallow copies: a = [[1, 2], [3, 4]] followed by b = a[:]. Now b is a different list, but b[0] is the same inner list as a[0]. Mutating b[0].append(99) changes what you see through a[0] as well. For true independence, you need copy.deepcopy(a), which recursively copies every nested object.
  • This matters in production constantly. Passing a list to a function and having the function mutate it is a common source of bugs. Returning a mutable internal attribute from a method exposes your object’s state to external mutation. Defensive copying is the fix, but deep copies are expensive for large nested structures. The trade-off is correctness versus performance, and the decision depends on whether the caller is trusted code or an external API boundary.
Follow-up: Python caches small integers and interns short strings. How does this interact with the is operator, and why is using is for value comparison a bug?
  • CPython caches integers in the range -5 to 256 as singletons. When you write x = 100, Python does not allocate a new integer — it returns a reference to the pre-existing int(100) object. So a = 100; b = 100; a is b returns True because both point to the cached singleton. But a = 1000; b = 1000; a is b might return False because 1000 is outside the cache range and two separate objects are created.
  • String interning is similar. Short strings that look like identifiers are automatically interned (same object reused). a = "hello"; b = "hello"; a is b is True. But a = "hello world!"; b = "hello world!"; a is b may be False.
  • Using is for value comparison is a bug because it checks identity (same object in memory), not equality (same value). It happens to work for cached integers and interned strings, making tests pass in development but fail unpredictably in production when values exceed the cache range. Always use == for value comparison. The only legitimate uses of is are is None, is True, is False, and checking sentinel objects — cases where you genuinely care about identity, not value.
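To make the identity versus equality distinction concrete without relying on caching details, here is a sketch using lists plus a sentinel object. The names _MISSING and get_setting are hypothetical.

```python
a = [1, 2, 3]
b = [1, 2, 3]
print(a == b)  # True  -- same value
print(a is b)  # False -- two distinct objects

_MISSING = object()  # hypothetical sentinel: unique, so identity is the right test

def get_setting(settings, key, default=_MISSING):
    value = settings.get(key, _MISSING)
    if value is _MISSING:
        if default is _MISSING:
            raise KeyError(key)
        return default
    return value

print(get_setting({"debug": None}, "debug"))  # None is a real stored value here
print(get_setting({}, "retries", 3))          # 3
```

The sentinel lets the function tell "key is missing" apart from "key is present with value None", which an `is None` check alone could not do.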
Strong Answer:
  • The trap is that default argument values are evaluated once, at function definition time, not each time the function is called. So def append_to(item, target=[]) creates a single list object when the def statement executes, and that same list is used as the default for every subsequent call. Successive calls accumulate state.
  • At the implementation level, when Python compiles the function, it evaluates the default expressions and stores them in the function object’s __defaults__ tuple. You can inspect this: append_to.__defaults__ will show you the single list that all calls share. This is not a bug — it is a direct consequence of functions being first-class objects that are created at definition time.
  • The idiomatic fix is to use None as a sentinel and create the mutable object inside the function body: def append_to(item, target=None): if target is None: target = []. This ensures a fresh list on every call that does not supply an explicit argument.
  • Interestingly, this “trap” can be used intentionally as a feature. A common technique is memoization using a mutable default dict: def fib(n, cache={}). The cache persists across calls because it is the same dict object. This is a hack (functools.lru_cache is the proper tool), but it illustrates that the behavior is consistent and well-defined, just surprising to newcomers.
  • This question tests whether a candidate understands Python’s object model at a fundamental level: that def is an executable statement, that default arguments are part of the function object, and that assignment in Python creates references, not copies.
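The memoization trick described above, next to the functools.lru_cache version, might look like the following sketch. The function names fib_default and fib_cached are made up for the comparison.

```python
from functools import lru_cache

def fib_default(n, cache={}):       # shared dict: the "trap" used on purpose
    if n < 2:
        return n
    if n not in cache:
        cache[n] = fib_default(n - 1) + fib_default(n - 2)
    return cache[n]

@lru_cache(maxsize=None)            # the standard-library way to memoize
def fib_cached(n):
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)

print(fib_default(30), fib_cached(30))  # 832040 832040
```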
Follow-up: How does functools.lru_cache work internally, and what gotchas should you watch for when using it in production?
  • lru_cache is a decorator that memoizes function results in a dictionary keyed by the function’s arguments. It uses a doubly-linked list to track access order and evicts the least recently used entry when the cache exceeds maxsize. Arguments must be hashable to serve as dict keys, which means you cannot cache functions that take lists or dicts as parameters.
  • Production gotchas: First, unbounded caches (maxsize=None) can consume all available memory if the function is called with many distinct arguments. Second, cached results hold references to return values, preventing garbage collection, so if your function returns large objects, those stay in memory. Third, in multithreaded code, lru_cache keeps its internal structure coherent under concurrent updates, but the wrapped function may run more than once for the same arguments if a second thread calls before the first result is cached, and the internal lock can become a contention point under heavy concurrent access. Fourth, for methods on class instances, self is part of the cache key, so the cache holds a reference to every instance it has seen and prevents those instances from being garbage collected. For argument-free computed attributes, use functools.cached_property, which stores the value on the instance instead.
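A small sketch of the eviction and hashability behavior described above follows. The function square, the call_log list, and maxsize=2 are arbitrary choices for the demonstration.

```python
from functools import lru_cache

call_log = []

@lru_cache(maxsize=2)
def square(n):
    call_log.append(n)      # records every real (uncached) call
    return n * n

square(2)   # miss
square(2)   # hit
square(3)   # miss
square(4)   # miss; the cache is full, so the least recently used key (2) is evicted
info = square.cache_info()
print(info)

square(2)   # miss again, because 2 was evicted
print(call_log)  # [2, 3, 4, 2]

try:
    square([1, 2])          # lists are unhashable, so they cannot be cache keys
except TypeError as exc:
    unhashable_error = str(exc)
```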
Strong Answer:
  • *args collects extra positional arguments into a tuple. **kwargs collects extra keyword arguments into a dictionary. They allow functions to accept an arbitrary number of arguments, which is essential for writing flexible APIs, decorators, and wrapper functions.
  • Python’s full argument resolution order, from left to right in the function signature, is: (1) regular positional arguments, (2) *args, (3) keyword-only arguments (anything after *args or a bare *), (4) **kwargs. In the call site, positional arguments fill the regular parameters first, overflow goes to *args, and any keyword arguments that do not match a named parameter go to **kwargs.
  • A concrete example: def f(a, b, *args, debug=False, **kwargs). Here a and b are required positional, args catches overflow positional, debug is keyword-only (cannot be passed positionally because it is after *args), and kwargs catches overflow keyword arguments. Calling f(1, 2, 3, 4, debug=True, user="admin") gives a=1, b=2, args=(3,4), debug=True, kwargs={"user": "admin"}.
  • The keyword-only argument pattern (using bare *) is underappreciated: def connect(host, port, *, timeout=30, retries=3). The * forces timeout and retries to be passed by name only. This prevents positional ambiguity and makes call sites self-documenting: connect("localhost", 5432, timeout=10) is clear, whereas connect("localhost", 5432, 10, 3) is not.
  • In practice, *args and **kwargs are most valuable in decorator patterns and inheritance hierarchies where you need to forward arguments you do not care about. The wrapper function in a decorator typically has the signature def wrapper(*args, **kwargs) so it can wrap any function regardless of that function’s signature.
Follow-up: What is the difference between * used in a function definition versus in a function call?
  • In a function definition, *args packs extra positional arguments into a tuple, and a bare * marks the boundary between positional and keyword-only parameters. In a function call, * unpacks an iterable into separate positional arguments: f(*[1, 2, 3]) is equivalent to f(1, 2, 3). Similarly, ** in a definition packs keyword arguments into a dict, while ** in a call unpacks a dict into keyword arguments: f(**{"a": 1, "b": 2}) is equivalent to f(a=1, b=2).
  • This symmetry is elegant and useful. A common pattern is merging configuration dicts: config = {**defaults, **overrides}. The second dict wins on key collisions. Another pattern is forwarding: def wrapper(*args, **kwargs): return original(*args, **kwargs) — pack on the way in, unpack on the way out. This is the backbone of every decorator in Python.
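Putting the packing and unpacking forms together, here is a sketch of a forwarding decorator combined with a dict merge. The decorator name logged and its calls attribute are hypothetical; connect mirrors the signature used earlier.

```python
import functools

def logged(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):            # pack whatever the caller passed
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)         # unpack and forward unchanged
    wrapper.calls = []
    return wrapper

@logged
def connect(host, port, *, timeout=30, retries=3):
    return f"{host}:{port} timeout={timeout} retries={retries}"

defaults = {"timeout": 30, "retries": 3}
overrides = {"timeout": 10}
config = {**defaults, **overrides}           # later dict wins on key collisions

result = connect("localhost", 5432, **config)
print(result)         # localhost:5432 timeout=10 retries=3
print(connect.calls)  # [(('localhost', 5432), {'timeout': 10, 'retries': 3})]
```

Using functools.wraps keeps the wrapped function's name and docstring, which helps when debugging decorated code.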