Python Fundamentals
Python is an interpreted, high-level, dynamically typed language. It emphasizes code readability with its use of significant indentation. Think of Python like writing a recipe: you describe the steps in plain language, in order, and anyone can follow along. Languages like C++ are more like writing assembly instructions for a specific machine: precise, but harder for humans to parse at a glance. This is not just a metaphor. Python was explicitly designed to be read by humans first and machines second, which is why Guido van Rossum made whitespace significant and kept the syntax minimal.
1. How Python Works: Under the Hood
Unlike compiled languages (C++, Go, Rust) that produce standalone executables, Python uses a two-stage process: compilation to bytecode, then interpretation.
The Python Execution Pipeline
| Stage | What Happens | Can You See It? |
|---|---|---|
| Lexing | Source code → tokens (keywords, identifiers, operators) | Internal |
| Parsing | Tokens → Abstract Syntax Tree (AST) | ast.parse() |
| Compiling | AST → Bytecode instructions | dis.dis() |
| Interpreting | PVM executes bytecode | Your program runs! |
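The stages in the table can be observed from Python itself. The following is a minimal sketch using the standard `ast` and `dis` modules; the one-line `source` program and the `<example>` filename are hypothetical.

```python
import ast
import dis

source = "total = 1 + 2"  # hypothetical one-line program

# Parsing: source text -> Abstract Syntax Tree
tree = ast.parse(source)
print(ast.dump(tree))

# Compiling: AST -> code object that holds the bytecode
code_obj = compile(tree, filename="<example>", mode="exec")
dis.dis(code_obj)  # prints the bytecode instructions

# Interpreting: the virtual machine executes the bytecode
namespace = {}
exec(code_obj, namespace)
print(namespace["total"])
```

The exact bytecode printed by `dis.dis()` varies between Python versions, so treat the output as illustrative rather than stable.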
Why is Python “Interpreted”?
Python is actually both compiled and interpreted:
- Compilation (happens automatically): your `.py` source → bytecode (`.pyc`)
- Interpretation: the Python Virtual Machine (PVM) executes the bytecode
The __pycache__ Folder
When you import a module, Python saves the compiled bytecode in a `__pycache__` folder, in a `.pyc` file whose name includes a tag such as `cpython-311`:
- `cpython-311` = CPython interpreter, Python 3.11
- Bytecode is cached for faster imports (no recompilation if the source is unchanged)
- Delete safely: Python will regenerate it if needed
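To see which tag and cache path your own interpreter would use, a short sketch along these lines may help. The file name `example.py` is hypothetical and does not need to exist.

```python
import importlib.util
import sys

# Tag identifying the interpreter and version, e.g. "cpython-311" on CPython 3.11
print(sys.implementation.cache_tag)

# Path where the bytecode for a (hypothetical) module file would be cached
pyc_path = importlib.util.cache_from_source("example.py")
print(pyc_path)
```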
CPython vs Other Implementations
| Implementation | Description | Use Case |
|---|---|---|
| CPython | Reference implementation (C) | Default, most libraries |
| PyPy | JIT-compiled Python (faster) | Performance-critical |
| Jython | Python on JVM (Java bytecode) | Java integration |
| MicroPython | Tiny Python for microcontrollers | IoT, embedded |
- NumPy/Pandas: Uses C under the hood
- Cython: Compile Python to C
- PyPy: JIT compilation for speedups
The GIL (Global Interpreter Lock)
CPython has a Global Interpreter Lock (GIL), a mutex that allows only one thread to execute Python bytecode at a time. Think of it like a single-lane bridge: no matter how many cars (threads) are waiting, only one can cross at a time. This means multi-threaded Python code does not achieve true parallelism for CPU-bound work. Two threads crunching numbers will take roughly the same time as running them sequentially because they take turns holding the GIL. However, the GIL is released during I/O operations (network calls, file reads, database queries), so threading still works well for I/O-bound tasks: while one thread waits for a network response, another can run.
Use `multiprocessing` for CPU-bound parallelism and `asyncio` or `threading` for I/O-bound concurrency.
2. Variables & Types
Python is dynamically typed. You don’t declare types (like `int x`), but types definitely exist. Think of variables in Python as name tags, not boxes. In C, a variable is a box with a fixed size: you put an integer in an `int` box. In Python, a variable is a sticky note you attach to an object. You can peel the note off and stick it on a completely different object. The object carries its own type; the name tag does not.
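The name-tag idea can be sketched in a few lines. The variable names here are hypothetical.

```python
x = 5
print(type(x))  # <class 'int'>

# Re-attach the same name to a completely different object
x = "hello"
print(type(x))  # <class 'str'>

# Two names can be attached to one object
numbers = [1, 2, 3]
alias = numbers
print(alias is numbers)  # True: both names refer to the same list
```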
Type Hints (Python 3.5+)
While Python doesn’t enforce types at runtime, you can (and should) use type hints. They act as documentation and allow tools (like VS Code or `mypy`) to catch errors.
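As a minimal sketch, here is what an annotated function might look like. The function `greet` and its parameters are hypothetical.

```python
def greet(name: str, times: int = 1) -> str:
    """Return a greeting repeated `times` times."""
    return ", ".join([f"Hello {name}"] * times)


print(greet("Ada", 2))        # Hello Ada, Hello Ada
print(greet.__annotations__)  # the hints are stored as plain metadata
```

A static checker such as `mypy` would flag a call like `greet(42)`, while the interpreter itself does not check the annotation.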
Mutable vs. Immutable
This is a critical concept in Python that trips up beginners. Think of it this way: immutable objects are like printed books. Once printed, you cannot change the text on page 5; you can only print a new edition. Mutable objects are like whiteboards: you can erase and rewrite freely.
- Immutable (cannot change): `int`, `float`, `str`, `tuple`, `bool`.
- Mutable (can change): `list`, `dict`, `set`.
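The difference shows up as soon as you try to modify an object in place. This is a small sketch with hypothetical values.

```python
s = "hello"
try:
    s[0] = "J"  # strings are immutable: item assignment fails
except TypeError as exc:
    print(exc)

shouted = s.upper()  # "changing" a string really builds a new object
print(s, shouted)    # hello HELLO

nums = [1, 2, 3]
alias = nums
alias.append(4)      # lists are mutable: the change is visible via both names
print(nums)          # [1, 2, 3, 4]
```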
3. Input & Output
f-Strings (Python 3.6+)
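A brief illustration of f-strings follows. The variable names are hypothetical, and the `=` specifier in the last line assumes Python 3.8 or newer.

```python
name = "Ada"
price = 1234.5

greeting = f"Hello, {name}!"
total = f"Total: {price:,.2f}"  # format spec: thousands separator, 2 decimals
debug = f"{2 + 3 = }"           # echoes the expression and its value

print(greeting)  # Hello, Ada!
print(total)     # Total: 1,234.50
print(debug)     # 2 + 3 = 5
```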
The “Formatted String Literal” is the modern way to insert variables into strings. It’s fast and readable. You can put any valid Python expression inside the curly braces.
4. Control Flow
Python uses indentation (whitespace) to define blocks of code. Blocks are not wrapped in curly braces `{}`, and statements do not need a terminating semicolon `;`.
If-Elif-Else
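A minimal sketch of an `if`/`elif`/`else` chain, where indentation alone marks each block. The function `grade` and its thresholds are hypothetical.

```python
def grade(score):
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    else:
        return "C or below"


print(grade(95))  # A
print(grade(85))  # B
print(grade(40))  # C or below
```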
Pythonic Conditionals
Python has several idiomatic patterns for conditionals that differ from other languages.
Loops
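As a small sketch of looping in Python, the following iterates over a list directly, with `enumerate`, over a `range`, and finally by driving the iterator protocol by hand. The list `fruits` is hypothetical.

```python
fruits = ["apple", "banana", "cherry"]

# Iterate over the items directly; no index variable is needed
for fruit in fruits:
    print(fruit)

# enumerate() supplies a counter when you do need one
for position, fruit in enumerate(fruits, start=1):
    print(position, fruit)

# range() is an iterable of integers
squares = []
for n in range(5):
    squares.append(n * n)
print(squares)  # [0, 1, 4, 9, 16]

# What a loop does under the hood: ask for an iterator, then call next()
iterator = iter(fruits)
first = next(iterator)
print(first)  # apple
```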
For Loop
Python’s `for` loop is actually a “for-each” loop. It iterates over any iterable: a sequence, a file, a generator, or any object that implements the iterator protocol.
5. Functions
Functions are defined using `def`.
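A minimal sketch of a function definition and call. The function `area` is hypothetical.

```python
def area(width, height):
    """Return the area of a rectangle."""
    return width * height


print(area(3, 4))  # 12

# Functions are objects, so another name can refer to the same function
compute = area
print(compute(2, 5))  # 10
```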
Default Arguments
You can provide default values for parameters.
`*args` and `**kwargs`
These allow functions to accept an arbitrary number of arguments. Think of `*args` as a catch-all bucket for extra positional arguments and `**kwargs` as a catch-all bucket for extra keyword arguments.
- `*args`: Collects positional arguments into a tuple.
- `**kwargs`: Collects keyword arguments into a dictionary.
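The two buckets can be inspected directly. In this sketch the functions `describe` and `total` are hypothetical.

```python
def describe(*args, **kwargs):
    # args is a tuple, kwargs is a dict
    return args, kwargs


positional, keyword = describe(1, 2, name="Ada")
print(positional)  # (1, 2)
print(keyword)     # {'name': 'Ada'}


def total(*numbers):
    # Accepts any number of positional arguments
    return sum(numbers)


print(total(1, 2, 3))  # 6
```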
Keyword-Only Arguments
Using `*` as a separator forces callers to pass every parameter after it by keyword. This prevents subtle bugs where argument order is confused.
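A short sketch of the bare `*` separator follows. The function `resize` and its parameters are hypothetical.

```python
def resize(path, *, width, height):
    # width and height can only be passed by keyword
    return f"{path}: {width}x{height}"


ok = resize("photo.png", width=800, height=600)
print(ok)  # photo.png: 800x600

error = None
try:
    resize("photo.png", 800, 600)  # positional use is rejected
except TypeError as exc:
    error = str(exc)
print(error)
```

With two integers side by side, it is easy to swap width and height by accident; requiring keywords makes that mistake visible at the call site.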
Lambda Functions
Lambdas are anonymous, single-expression functions. They are useful as short callbacks, not as replacements for `def`.
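A typical use is a short sort key. In this sketch the `people` data is hypothetical.

```python
people = [("Ada", 36), ("Linus", 28), ("Grace", 45)]

# The lambda is a throwaway callback that picks the sort key
by_age = sorted(people, key=lambda person: person[1])
print(by_age)  # [('Linus', 28), ('Ada', 36), ('Grace', 45)]


# Anything longer than one expression reads better as a named function
def age_of(person):
    return person[1]


oldest = max(people, key=age_of)
print(oldest)  # ('Grace', 45)
```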
Common Pitfalls Cheat Sheet
Summary
- Dynamic Typing: Variables are name tags, not boxes. Types live on objects, not variables.
- Indentation: Whitespace is syntactically significant.
- Immutability: Strings and numbers cannot be changed in place; lists can. This affects function arguments.
- The GIL: CPython allows only one thread to run Python bytecode at a time. Use `multiprocessing` for CPU parallelism.
- Shallow vs Deep Copy: `.copy()` only copies the top layer. Use `copy.deepcopy()` for nested structures.
- f-Strings: The best way to format text.
Interview Deep-Dive
Explain the GIL. Why does it exist, how does it affect multithreaded Python programs, and what are the alternatives?
- The Global Interpreter Lock is a mutex in CPython that ensures only one thread executes Python bytecode at any given time. Even on a 64-core machine, only one thread is running Python instructions. The other threads are either waiting for the GIL or blocked on I/O.
- The GIL exists because CPython’s memory management relies on reference counting, and reference counts are not thread-safe. Without the GIL, two threads could simultaneously increment and decrement the refcount of the same object, causing a race condition that corrupts memory. The GIL is the simplest solution: serialize all bytecode execution so refcounts are always consistent.
- The key nuance that separates senior candidates: the GIL mainly limits CPU-bound work. For I/O-bound programs (web servers, API clients, database-heavy applications), threading works fine because the GIL is released while waiting on I/O. A thread calling `socket.recv()` or `file.read()` drops the GIL so other threads can run. This is why libraries like `requests` with `concurrent.futures.ThreadPoolExecutor` achieve real concurrency.
- For CPU-bound parallelism, you have three main options. First, `multiprocessing`: each process gets its own Python interpreter and its own GIL, so you get true parallelism. The cost is inter-process communication overhead (serialization via pickle). Second, C extensions that explicitly release the GIL. NumPy does this for many operations and hands matrix multiplication to a BLAS library that is typically multithreaded, which is why NumPy matrix multiplication can saturate all cores even from a single Python thread. Third, alternative runtimes like PyPy (which still has a GIL but is faster overall) or the free-threaded CPython build in 3.13+ (PEP 703), which removes the GIL as an experimental, opt-in build.
- In practice, the GIL is rarely the actual bottleneck. I have seen teams spend weeks trying to work around the GIL when the real problem was an unindexed database query or an O(n^2) algorithm. Profile first, then decide if the GIL is actually your constraint.
- Race conditions are still possible under the GIL, and assuming otherwise is a common misconception. The GIL protects CPython’s internal state (refcounts, interpreter data structures), not your application’s state. If two threads read a shared variable, do some computation, and write back the result, the GIL does not prevent interleaving between those operations.
- For example, `counter += 1` is not atomic in Python. It compiles to multiple bytecode instructions: a load, an add, and a store. A thread switch can happen between them (CPython asks the running thread to give up the GIL every 5 ms by default, an interval you can change with `sys.setswitchinterval`). So two threads can both read `counter = 5`, both compute `6`, and both write `6`, losing an increment.
- You still need `threading.Lock`, `queue.Queue`, or other synchronization primitives for shared mutable state. The GIL is an implementation detail, not a concurrency guarantee for your code. Treating it as one is a bug waiting to happen.
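A minimal sketch of protecting a shared counter with `threading.Lock` follows. The `state` dictionary, the thread count, and the iteration count are hypothetical choices for illustration.

```python
import threading

state = {"counter": 0}   # shared mutable state (hypothetical)
lock = threading.Lock()


def increment(times):
    for _ in range(times):
        # The lock makes the read-modify-write sequence atomic
        # with respect to the other threads
        with lock:
            state["counter"] += 1


threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(state["counter"])  # 40000
```

Without the `with lock:` line the final count is not guaranteed. Whether lost updates actually show up depends on the interpreter version and timing, which is exactly why the bug is hard to reproduce.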
Python is dynamically typed but strongly typed. What does that distinction mean, and what are the real-world implications for large codebases?
- Dynamic typing means type checking happens at runtime, not compile time. You do not declare `int x = 5`. You write `x = 5` and the object carries its own type. You can reassign `x = "hello"` and Python will not complain. The type lives on the object, not the variable.
- Strong typing means Python does not silently coerce types. In JavaScript (weakly typed), `"5" + 3` gives you `"53"` (string concatenation). In Python, `"5" + 3` raises a `TypeError`. Python forces you to be explicit: `int("5") + 3` or `"5" + str(3)`. This catches an enormous number of bugs that slip through in weakly typed languages.
- The combination creates an interesting trade-off. You get rapid prototyping speed (no type boilerplate) with a safety net against the worst type coercion bugs, but you lose compile-time guarantees. A function that expects a `dict` but receives a `list` will only fail when that code path actually executes, which could be in production at 3 AM.
- For large codebases, this is managed with type hints and static analysis. Type hints (PEP 484) let you annotate function signatures: `def process(items: list[str]) -> int`. Tools like `mypy`, `pyright`, or `pytype` then check these annotations statically, effectively giving you compile-time checking as an opt-in layer. At large companies such as Dropbox (mypy) and Google (pytype), static type checkers run in CI pipelines and catch many bugs before code reaches production.
- The practical implication is a spectrum: small scripts and data exploration benefit from dynamic typing’s flexibility. Large services with multiple contributors benefit from strict type annotations enforced by CI. The senior move is knowing where you are on that spectrum and adjusting accordingly, not dogmatically applying one approach everywhere.
What is the difference between `mypy` strict mode and Python’s runtime behavior, and where can they diverge?
- `mypy --strict` requires annotations on function definitions and flags most implicit `Any` types. But critically, type hints are not enforced at runtime in standard CPython. They are metadata stored in `__annotations__`, and Python does not check them during execution. So you can have a function annotated `def add(a: int, b: int) -> int` and call it with `add("x", "y")`, and Python will happily concatenate the strings.
- This means your CI can be green with mypy and your code can still crash with a type error in production if there is a gap between what mypy analyzed and what actually runs. Common sources of divergence: dynamic code paths that mypy cannot follow (`getattr`, `eval`, metaclass magic), third-party libraries without type stubs, and `cast()` calls that lie to mypy.
- Libraries like `pydantic` and `beartype` add runtime type checking. Pydantic validates data at model boundaries (API inputs, config loading), which is where type mismatches most commonly originate. The senior approach is to combine static checking (mypy in CI) with runtime validation at system boundaries (pydantic for API schemas) and skip runtime checking in hot inner loops where the overhead is not justified.
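The gap between annotations and runtime behavior can be sketched as follows. The function `add_checked` is a hypothetical, hand-written stand-in for the kind of validation a runtime-checking library performs; it is not how `pydantic` or `beartype` are implemented.

```python
def add(a: int, b: int) -> int:
    return a + b


# The annotations say int, but nothing checks them at runtime
result = add("x", "y")
print(result)  # xy


def add_checked(a: int, b: int) -> int:
    # Hypothetical manual check, for illustration only
    if not isinstance(a, int) or not isinstance(b, int):
        raise TypeError("add_checked expects two integers")
    return a + b


try:
    add_checked("x", "y")
except TypeError as exc:
    print(exc)  # add_checked expects two integers
```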
Walk me through what happens in memory when you write `a = [1, 2, 3]` followed by `b = a` versus `b = a[:]`. Why does this matter?
- When you write `a = [1, 2, 3]`, Python allocates a list object on the heap. The list object internally holds an array of pointers to three integer objects (1, 2, 3). The variable `a` is a reference (think pointer) to that list object. So `a` is in the local namespace, and it points to the list on the heap.
- `b = a` does not copy anything. It creates a second reference to the exact same list object. `a` and `b` now point to the same memory address. You can verify with `a is b` returning `True`, or `id(a) == id(b)`. Any mutation through `b`, like `b.append(4)`, is visible through `a` because there is only one list.
- `b = a[:]` creates a shallow copy. A new list object is allocated, and the new list’s internal pointer array is populated with copies of the pointers from `a`. The key word is “shallow”: the new list has its own identity (`a is b` returns `False`), but its elements still point to the same underlying objects. For immutable elements like integers and strings, this distinction does not matter. But for nested mutable objects, it does.
- The gotcha with shallow copies: `a = [[1, 2], [3, 4]]` followed by `b = a[:]`. Now `b` is a different list, but `b[0]` is the same inner list as `a[0]`. Mutating `b[0].append(99)` changes what you see through `a[0]` as well. For true independence, you need `copy.deepcopy(a)`, which recursively copies every nested object.
- This matters in production constantly. Passing a list to a function and having the function mutate it is a common source of bugs. Returning a mutable internal attribute from a method exposes your object’s state to external mutation. Defensive copying is the fix, but deep copies are expensive for large nested structures. The trade-off is correctness versus performance, and the decision depends on whether the caller is trusted code or an external API boundary.
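The three cases (alias, shallow copy, deep copy) can be compared side by side. The variable names in this sketch are hypothetical.

```python
import copy

a = [[1, 2], [3, 4]]

alias = a                # second reference, no copy
shallow = a[:]           # new outer list, shared inner lists
deep = copy.deepcopy(a)  # fully independent copy

shallow[0].append(99)    # mutates the inner list shared with a

print(alias is a)         # True
print(shallow is a)       # False
print(shallow[0] is a[0]) # True: the inner list is shared
print(a[0])               # [1, 2, 99]
print(deep[0])            # [1, 2]
```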
How does CPython’s object caching affect the `is` operator, and why is using `is` for value comparison a bug?
- CPython caches integers in the range -5 to 256 as singletons. When you write `x = 100`, Python does not allocate a new integer; it returns a reference to the pre-existing `int(100)` object. So `a = 100; b = 100; a is b` returns `True` because both point to the cached singleton. But `a = 1000; b = 1000; a is b` might return `False` because 1000 is outside the cache range and two separate objects may be created.
- String interning is similar. Short strings that look like identifiers are automatically interned (same object reused). `a = "hello"; b = "hello"; a is b` is `True`. But `a = "hello world!"; b = "hello world!"; a is b` may be `False`.
- Using `is` for value comparison is a bug because it checks identity (same object in memory), not equality (same value). It happens to work for cached integers and interned strings, making tests pass in development but fail unpredictably in production when values exceed the cache range. Always use `==` for value comparison. The only legitimate uses of `is` are `is None`, `is True`, `is False`, and checking sentinel objects, which are cases where you genuinely care about identity, not value.
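A short sketch of equality versus identity follows. The integers are built at runtime with `int("1000")` so the compiler cannot fold them into one shared constant. The result of `a is b` is a CPython implementation detail, so no particular output is promised for that line. The names `_MISSING` and `lookup` are hypothetical.

```python
a = int("1000")
b = int("1000")

print(a == b)  # True: same value
print(a is b)  # identity check; the result is an implementation detail

# Legitimate identity checks: None and sentinel objects
_MISSING = object()  # hypothetical sentinel


def lookup(mapping, key, default=_MISSING):
    value = mapping.get(key, _MISSING)
    if value is _MISSING:
        if default is _MISSING:
            raise KeyError(key)
        return default
    return value


print(lookup({"x": None}, "x"))     # None is a real value here, not "missing"
print(lookup({"x": None}, "y", 0))  # 0
```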
Explain Python's mutable default argument trap. Why does it happen at the implementation level, and what is the idiomatic fix?
- The trap is that default argument values are evaluated once, at function definition time, not each time the function is called. So `def append_to(item, target=[])` creates a single list object when the `def` statement executes, and that same list is used as the default for every subsequent call. Successive calls accumulate state.
- At the implementation level, when the `def` statement executes, Python evaluates the default expressions and stores the resulting objects in the function object’s `__defaults__` tuple. You can inspect this: `append_to.__defaults__` will show you the single list that all calls share. This is not a bug. It is a direct consequence of functions being first-class objects that are created at definition time.
- The idiomatic fix is to use `None` as a sentinel and create the mutable object inside the function body: `def append_to(item, target=None): if target is None: target = []`. This ensures a fresh list on every call that does not supply an explicit argument.
- Interestingly, this “trap” can be used intentionally as a feature. A common technique is memoization using a mutable default dict: `def fib(n, cache={})`. The cache persists across calls because it is the same dict object. This is a hack, and `functools.lru_cache` is the proper tool, but it illustrates that the behavior is consistent and well-defined, just surprising to newcomers.
- This question tests whether a candidate understands Python’s object model at a fundamental level: that `def` is an executable statement, that default arguments are part of the function object, and that assignment in Python creates references, not copies.
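The trap and its fix can be reproduced in a few lines. The name `append_to_buggy` is hypothetical and is used only to keep both versions side by side.

```python
def append_to_buggy(item, target=[]):
    # The default list is created once, when the def statement runs
    target.append(item)
    return target


first = append_to_buggy(1)
second = append_to_buggy(2)
print(second)                        # [1, 2]: state leaked between calls
print(first is second)               # True: both calls returned the same list
print(append_to_buggy.__defaults__)  # ([1, 2],)


def append_to(item, target=None):
    # None is the sentinel; a fresh list is created per call
    if target is None:
        target = []
    target.append(item)
    return target


print(append_to(1))  # [1]
print(append_to(2))  # [2]
```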
How does `functools.lru_cache` work internally, and what gotchas should you watch for when using it in production?
- `lru_cache` is a decorator that memoizes function results in a dictionary keyed by the function’s arguments. It uses a doubly-linked list to track access order and evicts the least recently used entry when the cache exceeds `maxsize`. Arguments must be hashable to serve as dict keys, which means you cannot cache functions that take lists or dicts as parameters.
- Production gotchas: First, unbounded caches (`maxsize=None`) can consume all available memory if the function is called with many distinct arguments. Second, cached results hold references to return values, preventing garbage collection. If your function returns large objects, those stay in memory. Third, in multithreaded code, `lru_cache` keeps its internal state consistent, but the wrapped function can still run more than once for the same arguments if a second thread calls it before the first result has been stored. Fourth, for methods on class instances, the cache lives on the function and is shared by all instances, with `self` as part of each key. The cache therefore holds a strong reference to every instance it has seen, which can cause memory leaks because those instances cannot be garbage collected. For argument-free computed attributes, `functools.cached_property` gives instance-level caching instead.
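A minimal sketch of `lru_cache` in use follows, including the hashability requirement. The `fib` function and the `maxsize` value are hypothetical choices.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))           # 832040
info = fib.cache_info()  # hits, misses, maxsize, currsize
print(info)

# Arguments are used as dict keys, so they must be hashable
try:
    fib([30])
except TypeError as exc:
    print(exc)

fib.cache_clear()        # drop all cached entries and their references
```

`cache_info()` is a cheap way to check whether a cache is actually being hit, and `cache_clear()` releases the references the cache holds.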
What are `*args` and `**kwargs`, and how does Python's argument resolution order work when you combine positional, keyword, default, `*args`, and `**kwargs`?
- `*args` collects extra positional arguments into a tuple. `**kwargs` collects extra keyword arguments into a dictionary. They allow functions to accept an arbitrary number of arguments, which is essential for writing flexible APIs, decorators, and wrapper functions.
- Python’s full argument resolution order, from left to right in the function signature, is: (1) regular positional arguments, (2) `*args`, (3) keyword-only arguments (anything after `*args` or a bare `*`), (4) `**kwargs`. In the call site, positional arguments fill the regular parameters first, overflow goes to `*args`, and any keyword arguments that do not match a named parameter go to `**kwargs`.
- A concrete example: `def f(a, b, *args, debug=False, **kwargs)`. Here `a` and `b` are required positional, `args` catches overflow positional, `debug` is keyword-only (cannot be passed positionally because it is after `*args`), and `kwargs` catches overflow keyword arguments. Calling `f(1, 2, 3, 4, debug=True, user="admin")` gives `a=1, b=2, args=(3, 4), debug=True, kwargs={"user": "admin"}`.
- The keyword-only argument pattern (using bare `*`) is underappreciated: `def connect(host, port, *, timeout=30, retries=3)`. The `*` forces `timeout` and `retries` to be passed by name only. This prevents positional ambiguity and makes call sites self-documenting: `connect("localhost", 5432, timeout=10)` is clear, whereas `connect("localhost", 5432, 10, 3)` is not.
- In practice, `*args` and `**kwargs` are most valuable in decorator patterns and inheritance hierarchies where you need to forward arguments you do not care about. The wrapper function in a decorator typically has the signature `def wrapper(*args, **kwargs)` so it can wrap any function regardless of that function’s signature.
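The concrete example from the answer can be run directly. The function body here simply returns what each parameter received, which is an assumption made for illustration.

```python
def f(a, b, *args, debug=False, **kwargs):
    # Return every binding so the resolution is visible
    return a, b, args, debug, kwargs


result = f(1, 2, 3, 4, debug=True, user="admin")
print(result)  # (1, 2, (3, 4), True, {'user': 'admin'})

# debug is keyword-only: a fifth positional value lands in args instead
print(f(1, 2, 3, 4, True))  # (1, 2, (3, 4, True), False, {})
```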
What is the difference between `*` used in a function definition versus in a function call?
- In a function definition, `*args` packs extra positional arguments into a tuple, and a bare `*` marks the boundary between positional and keyword-only parameters. In a function call, `*` unpacks an iterable into separate positional arguments: `f(*[1, 2, 3])` is equivalent to `f(1, 2, 3)`. Similarly, `**` in a definition packs keyword arguments into a dict, while `**` in a call unpacks a dict into keyword arguments: `f(**{"a": 1, "b": 2})` is equivalent to `f(a=1, b=2)`.
- This symmetry is elegant and useful. A common pattern is merging configuration dicts: `config = {**defaults, **overrides}`. The second dict wins on key collisions. Another pattern is forwarding: `def wrapper(*args, **kwargs): return original(*args, **kwargs)`, which packs on the way in and unpacks on the way out. This is the backbone of most decorators in Python.
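Packing and unpacking can be sketched together. The names `add3`, `logged`, `defaults`, and `overrides` are hypothetical.

```python
import functools


def add3(a, b, c):
    return a + b + c


# Unpacking at the call site
print(add3(*[1, 2, 3]))                      # 6
print(add3(**{"a": 1, "b": 2, "c": 3}))      # 6

# Merging dicts: the later dict wins on key collisions
defaults = {"timeout": 30, "retries": 3}
overrides = {"timeout": 10}
config = {**defaults, **overrides}
print(config)                                # {'timeout': 10, 'retries': 3}


def logged(func):
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):            # pack on the way in
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)         # unpack on the way out
    return wrapper


logged_add3 = logged(add3)
print(logged_add3(1, 2, c=3))                # 6
```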