Modules & Packages

Real-world Python projects aren’t single files. They are split across multiple files (modules) and directories (packages). Understanding how to organize code and manage dependencies is crucial. Think of modules as individual LEGO bricks — each one does something specific. Packages are the organized trays that hold related bricks together. And pip is the store where you buy bricks other people made. Without this organization system, every Python project would be a single 10,000-line file, and collaboration would be impossible.

1. Modules

A Module is simply a Python file (.py). Any Python file can be imported by another.
# math_utils.py
def add(a, b):
    return a + b

PI = 3.14159

Importing

You can import the whole module or specific parts.
# main.py

# Option 1: Import entire module
import math_utils
print(math_utils.add(1, 2))

# Option 2: Import specific items (Cleaner)
from math_utils import add, PI
print(add(1, 2))

# Option 3: Alias (Common for libraries like pandas as pd)
import math_utils as mu
print(mu.PI)

__name__ == "__main__"

This is a common idiom. It checks if the file is being run directly (like python main.py) or imported as a module.
# utils.py
def helper():
    return "I am helpful"

if __name__ == "__main__":
    # This block runs ONLY when you execute: python utils.py
    # It does NOT run when another file does: import utils
    print(helper())
    print("Running tests or demos here...")
This pattern is useful for putting test code, demos, or CLI entry points at the bottom of a module without affecting importers. You will see it in virtually every well-structured Python project.
Import Pitfall: Circular Imports — If module A imports module B, and module B imports module A, you get a circular import error (or worse, silently incomplete modules). This is one of the most common structural bugs in growing Python projects.
# BROKEN: circular dependency
# models.py
from services import process_user  # imports services, which imports models...

# services.py
from models import User            # imports models, which imports services...

# FIX 1: Import inside the function (deferred import)
# services.py
def handle_request():
    from models import User  # Import happens at call time, not module load time
    return User()

# FIX 2: Restructure to break the cycle (better long-term solution)
# Extract shared code into a third module that both can import.
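
The deferred-import fix can be tried end to end with the sketch below. It writes two throwaway modules to a temporary directory; the names demo_models, demo_services, describe_user, and summary are hypothetical and exist only for this illustration.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical module sources (names invented for this demo).
MODELS = textwrap.dedent('''
    from demo_services import describe_user  # top-level import of the other module

    class User:
        name = "alice"

    def summary():
        return describe_user()
''')

SERVICES = textwrap.dedent('''
    def describe_user():
        # Deferred import: resolved when the function is called,
        # by which time demo_models has finished loading.
        from demo_models import User
        return f"user={User.name}"
''')

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_models.py").write_text(MODELS, encoding="utf-8")
    Path(tmp, "demo_services.py").write_text(SERVICES, encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        models = importlib.import_module("demo_models")
        result = models.summary()
    finally:
        # Clean up so the demo leaves no trace in the interpreter state.
        sys.path.remove(tmp)
        sys.modules.pop("demo_models", None)
        sys.modules.pop("demo_services", None)

print(result)  # user=alice
```

Moving the import in demo_services back to the top of the file would recreate the cycle, because demo_models would still be only partially initialized when demo_services asked for User.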

2. Packages

A Package is a directory containing Python modules, conventionally marked by an __init__.py file. Since Python 3.3 the file is technically optional (a directory without it is treated as a “namespace package”), but including it is still strongly recommended: it is the explicit signal that “this directory is a Python package.”
my_package/
    __init__.py       # Marks this folder as a package (can be empty)
    module1.py
    module2.py
    sub_package/
        __init__.py
        helpers.py
# Absolute imports (preferred -- always unambiguous)
from my_package import module1
from my_package.module2 import some_function
from my_package.sub_package.helpers import utility

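# Relative imports (use within a package to refer to siblings)
# In my_package/module2.py:
from .module1 import something       # . means "current package"
from .sub_package import helpers      # Import from a sub-package

# In my_package/sub_package/helpers.py:
# .. means "parent package" (use sparingly). It only works from inside a
# sub-package; from my_package/module2.py it would point above the top-level
# package and raise ImportError.
from ..module1 import something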
Best Practice: Prefer absolute imports over relative imports. Absolute imports are always unambiguous about where a module lives. Relative imports can be confusing and break if you reorganize your package structure. The main exception is within tightly coupled sub-packages where relative imports clarify “this is an internal dependency.”

What goes in __init__.py?

The __init__.py file runs when the package is imported. Use it to define the package’s public API:
# my_package/__init__.py
from .module1 import PublicClass
from .module2 import public_function

# Now users can do: from my_package import PublicClass
# Instead of:       from my_package.module1 import PublicClass

# Control what "from my_package import *" exports:
__all__ = ["PublicClass", "public_function"]

3. The Standard Library

Python is famous for being “Batteries Included”. It has a massive standard library built-in.

pathlib (Modern File Paths)

Stop using os.path.join. Use pathlib. It treats file paths as objects rather than strings, which makes path manipulation safer and more readable. It works on Windows, Mac, and Linux seamlessly — no more worrying about / vs \.
from pathlib import Path

# Create paths using the / operator (reads naturally)
p = Path("data") / "subdir" / "file.txt"

# Read and write files directly (no open() needed for simple cases)
if p.exists():
    content = p.read_text(encoding="utf-8")  # Always specify encoding!
    p.write_text("new content", encoding="utf-8")

# Find files with glob patterns
for py_file in Path(".").glob("**/*.py"):  # ** means recursive
    print(py_file)

# Useful properties
print(p.name)      # "file.txt"
print(p.stem)      # "file" (name without extension)
print(p.suffix)    # ".txt"
print(p.parent)    # Path("data/subdir")
print(p.resolve()) # Absolute path

json (Data Serialization)

JSON is the language of the web. Python handles it natively.
import json

data = {"name": "Alice", "age": 30}

# Serialize to String
json_str = json.dumps(data)

# Save to File
with open("data.json", "w", encoding="utf-8") as f:
    json.dump(data, f)
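
The example above only serializes. As a minimal sketch of the reverse direction, the snippet below parses a JSON string back into Python objects and shows the error raised on malformed input. The variable names are illustrative.

```python
import json

data = {"name": "Alice", "age": 30}

# Serialize with indentation for human-readable output
json_str = json.dumps(data, indent=2)

# Deserialize: parse the string back into Python objects
restored = json.loads(json_str)
print(restored["name"])   # Alice
print(restored == data)   # True

# Malformed input raises json.JSONDecodeError (a subclass of ValueError)
try:
    json.loads("{not valid json}")
except json.JSONDecodeError as exc:
    error_message = f"Invalid JSON: {exc.msg}"
    print(error_message)
```

For files, json.load(f) is the counterpart of json.dump(data, f).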

datetime (Dates & Times)

from datetime import datetime, timedelta

now = datetime.now()
tomorrow = now + timedelta(days=1)

print(now.strftime("%Y-%m-%d"))
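
A few more operations tend to come up alongside strftime: parsing strings, ISO 8601 round trips, and timezone-aware timestamps. The sketch below uses a fixed, made-up date so the output is predictable.

```python
from datetime import datetime, timedelta, timezone

# Timezone-aware "now" in UTC (datetime.now() without an argument is naive)
now_utc = datetime.now(timezone.utc)

# Parse a string into a datetime with strptime (the inverse of strftime)
parsed = datetime.strptime("2024-03-15", "%Y-%m-%d")
next_week = parsed + timedelta(weeks=1)
print(next_week.strftime("%Y-%m-%d"))  # 2024-03-22

# ISO 8601 round trip
iso = parsed.isoformat()
print(iso)  # 2024-03-15T00:00:00
round_tripped = datetime.fromisoformat(iso)
```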

4. Virtual Environments (venv)

The Golden Rule of Python: Never install packages globally. Always use a virtual environment. A virtual environment creates an isolated folder for your project’s dependencies. Think of it as giving each project its own private Python environment with its own set of libraries. This prevents “Dependency Hell” where Project A needs requests==2.28 and Project B needs requests==2.31, and installing one breaks the other.

Setup

# 1. Create the environment (run once per project)
# This creates a folder named '.venv' (dot prefix keeps it hidden on Unix)
python -m venv .venv

# 2. Activate it (run each time you open a new terminal)
# Windows:
.venv\Scripts\activate
# Mac/Linux:
source .venv/bin/activate
Once activated, your terminal prompt will change (e.g., (.venv) C:\Project>). Now, when you run pip install, packages go into this folder, not your system Python.
Virtual Environment Pitfalls:
  • Always add .venv/ (or venv/) to your .gitignore. Virtual environments are machine-specific and should never be committed to version control.
  • If you rename or move your project folder, the virtual environment may break because it contains hardcoded absolute paths. The fix is to delete it and recreate: python -m venv .venv.
  • On some systems, python points to Python 2. Use python3 -m venv .venv explicitly to ensure you get Python 3.
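
Besides the changed prompt, one way to confirm from inside Python that a virtual environment is active is to compare sys.prefix with sys.base_prefix. The helper name in_virtualenv below is a hypothetical name chosen for this sketch.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment folder while
    # sys.base_prefix points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints False after you thought you had activated the environment, the terminal is most likely still using the system interpreter.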

Modern Alternative: uv

The Python packaging ecosystem is evolving. uv (from Astral, the makers of ruff) is a fast, Rust-based tool that can replace pip, venv, and pip-tools. It creates virtual environments, resolves dependencies, and installs packages; Astral's benchmarks report speedups of 10-100x over pip.
# Install uv
pip install uv

# Create a venv and install dependencies in one step
uv venv
uv pip install requests fastapi

# uv can also compile loose requirements into a fully pinned file for reproducible installs
uv pip compile requirements.in -o requirements.txt

5. Package Management (pip)

pip is the package installer for Python. It fetches packages from PyPI (Python Package Index).
# Install a package
pip install requests

# List installed packages
pip list

# Save your dependencies to a file
pip freeze > requirements.txt

# Install dependencies from a file (Crucial for collaboration)
pip install -r requirements.txt
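
The same information that pip list prints is also available from Python through the standard library's importlib.metadata. The sketch below assumes a hypothetical helper named installed_version; the package names passed to it are only examples.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Results depend on what is installed in the active environment.
print("requests:", installed_version("requests"))
missing = installed_version("definitely-not-a-real-package-xyz")
print("missing package:", missing)
```

This can be handy in a startup check that fails early with a clear message when a required dependency is absent.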

Example: Using requests

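requests is one of the most widely used third-party Python libraries. It makes HTTP requests simple.
import requests

# Always set a timeout; without one, a stalled server can hang the call indefinitely
response = requests.get("https://api.github.com", timeout=10)
response.raise_for_status()  # Raise an exception for 4xx/5xx responses
print(response.status_code)
print(response.json())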

Summary

  • Modules: .py files.
  • Packages: Folders with __init__.py.
  • Standard Library: Learn pathlib, json, datetime.
  • venv: Always isolate your projects.
  • pip: The tool to install external libraries.
Next, we’ll tackle Advanced Python concepts like decorators and async programming.

Interview Deep-Dive

Strong Answer:
  • When you write import foo, Python executes a multi-step process. First, it checks sys.modules — a cache of all previously imported modules. If foo is already there, it returns the cached module object immediately. No file is read, no code is executed. This is why importing the same module in 50 files does not run the module code 50 times.
  • If the module is not cached, Python searches for it using sys.path — an ordered list of directories. sys.path includes: the directory of the script being run, directories set in the PYTHONPATH environment variable, the standard library paths, and the site-packages directory (where pip installs packages). Python searches these in order, and the first match wins.
  • Once found, Python compiles the module to bytecode (if a cached .pyc is not up to date), executes the module’s top-level code (all statements at the module level run during import), and stores the resulting module object in sys.modules. This is why putting side effects (print statements, database connections, API calls) at the module level is dangerous — they execute on import, which might be at test collection time, CI startup, or other unexpected moments.
  • Absolute imports (from package.sub import module) use the full path from the project root. Relative imports (from .sub import module or from ..sibling import func) use dots to navigate relative to the current package. Relative imports only work inside packages (not in top-level scripts). One dot means “current package,” two dots mean “parent package.”
  • A common production issue: circular imports. Module A imports module B, and module B imports module A. This does not always fail — Python handles it by returning a partially-initialized module from sys.modules. But if B tries to access a name from A that has not been defined yet (because A’s top-level code has not finished executing), you get an ImportError or AttributeError. The fix is to restructure the code (extract shared code to a third module), use lazy imports (import inside the function that needs it), or use TYPE_CHECKING blocks for type-hint-only imports.
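
The caching behavior described above can be observed directly. In this minimal sketch, a throwaway module (hypothetically named cache_demo) is written to a temporary directory and imported twice; its top-level print runs only on the first import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module with a visible side effect at import time.
SOURCE = 'print("top-level code running")\nVALUE = 42\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "cache_demo.py").write_text(SOURCE, encoding="utf-8")
    sys.path.insert(0, tmp)          # make the directory searchable
    importlib.invalidate_caches()
    try:
        first = importlib.import_module("cache_demo")   # executes the file
        second = importlib.import_module("cache_demo")  # served from sys.modules
        same_object = first is second
        cached = "cache_demo" in sys.modules
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("cache_demo", None)

print(same_object)  # True: both names refer to one module object
print(cached)       # True: the module was stored in sys.modules
```

The message "top-level code running" appears once even though the module is imported twice, which is why module-level side effects run at the first import and never again.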
Follow-up: What is if __name__ == "__main__" actually doing, and what is the __name__ variable?
  • Every Python module has a __name__ attribute. When a file is run directly (python script.py), __name__ is set to the string "__main__". When the same file is imported as a module (import script), __name__ is set to the module’s qualified name (e.g., "script" or "package.script").
  • The if __name__ == "__main__" guard prevents code from running when the module is imported. Without it, any top-level code (test runs, demo output, server startup) would execute on import, which breaks test discovery, IDE introspection, and module reuse.
  • This pattern also makes your module both a library and a script. The functions and classes are importable by other code, and the __main__ block provides a command-line entry point. This is a foundational Python pattern that every production module should use for any executable behavior.
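
To see both values of __name__ without opening a second terminal, the standard library's runpy can execute a file under a chosen name. The file name name_demo.py and the flag RAN_AS_SCRIPT are hypothetical names used only in this sketch.

```python
import runpy
import tempfile
from pathlib import Path

SOURCE = '''
def helper():
    return "I am helpful"

RAN_AS_SCRIPT = False
if __name__ == "__main__":
    RAN_AS_SCRIPT = True
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp, "name_demo.py")
    script.write_text(SOURCE, encoding="utf-8")
    # run_name="__main__" mimics executing: python name_demo.py
    as_script = runpy.run_path(str(script), run_name="__main__")
    # Any other run_name behaves like an import as far as the guard is concerned
    as_module = runpy.run_path(str(script), run_name="name_demo")

print(as_script["RAN_AS_SCRIPT"])  # True
print(as_module["RAN_AS_SCRIPT"])  # False
print(as_module["__name__"])       # name_demo
```

runpy.run_path returns the module's global namespace as a dictionary, which is what makes the flag inspectable here.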
Strong Answer:
  • pip is the standard package installer that ships with Python. It installs packages from PyPI and supports requirements.txt for reproducibility. Since version 20.3 pip has a backtracking dependency resolver, so it refuses to install a set of packages whose constraints conflict (for example, package A needs requests>=2.30 while package B needs requests<2.25); older versions could silently install an incompatible version. Its remaining weaknesses are that it has no native lockfile and does not distinguish between direct and transitive dependencies, which makes a pip freeze-style requirements.txt a flat dump that is hard to audit.
  • poetry provides a full project management experience: dependency resolution, lockfiles (poetry.lock), virtual environment management, and package building/publishing. It uses pyproject.toml for configuration (historically under its own [tool.poetry] table; Poetry 2.0 added support for the standard PEP 621 [project] table). Its dependency resolver is deterministic: it computes a locked set of versions that satisfy all constraints and records them. The downside is that Poetry is opinionated (it manages your virtualenv for you, which can conflict with Docker workflows) and its resolution can be slow for complex dependency trees.
  • pipenv was an earlier attempt at combining pip and virtualenv with a Pipfile/Pipfile.lock workflow. It fell out of favor due to slow resolution, inconsistent maintenance, and confusing behavior around lock file generation. Most teams have migrated to Poetry or uv.
  • uv is the newest contender, built in Rust by Astral, the creators of ruff. It is designed as a drop-in replacement for pip and pip-tools, and Astral's benchmarks report it as 10-100x faster. It handles dependency resolution, virtual environment creation (uv venv), and lockfile generation (uv lock). It supports pyproject.toml and is an increasingly common recommendation for new projects due to its speed and compatibility.
  • For a new production project today, I would use uv for speed and simplicity, with pyproject.toml for project metadata. For teams already on Poetry with established workflows, there is no urgency to migrate. The key principle is: always have a lockfile (deterministic builds), always separate direct dependencies from transitive ones, and always use a virtual environment.
Follow-up: What is the difference between requirements.txt and a lockfile, and why does it matter for production deployments?
  • A hand-maintained requirements.txt with pinned versions (requests==2.31.0) specifies direct dependencies but not the exact versions of their transitive dependencies. If requests depends on urllib3, and you do not pin urllib3, you might get different versions on different machines or at different times, leading to “works on my machine” problems.
  • A lockfile (Poetry’s poetry.lock, uv’s uv.lock) records the exact version of every package in the dependency tree, direct and transitive, along with content hashes. This means that uv sync on your laptop, in CI, and on the production server installs the same package versions, verified against the same hashes (for a given platform and Python version). Without a lockfile, you are one pip install away from a surprise breaking change in a transitive dependency.
  • The practical workflow: pyproject.toml declares what you need (loose constraints), the lockfile records what you got (exact versions). Developers update constraints in pyproject.toml, regenerate the lockfile, test, and commit both. Production deployments install from the lockfile only.
Strong Answer:
  • Historically, __init__.py was required to mark a directory as a Python package. Without it, Python would not recognize the directory as importable. The file is executed when the package is first imported, and the resulting module object becomes the package itself. So import mypackage runs mypackage/__init__.py and gives you access to whatever names are defined there.
  • __init__.py serves several practical purposes: it controls the public API of the package (by importing select names from submodules), it can run package initialization code (setting up logging, loading configuration), and it defines __all__ (which controls what from package import * exports). A common pattern is to import the most-used classes into __init__.py so users can write from mypackage import MyClass instead of from mypackage.submodule import MyClass.
  • Python 3.3 introduced namespace packages (PEP 420), which allow packages without __init__.py. The motivation was to allow a single logical package to be split across multiple directories or distributions. For example, the google namespace package lets google-cloud-storage and google-auth both install into the google/ directory without conflicting __init__.py files.
  • In practice, most application code should still use __init__.py. Namespace packages are primarily for large library ecosystems (Google Cloud, Azure SDK) where multiple independent teams publish sub-packages under a shared namespace. Omitting __init__.py in application code causes confusion and breaks some tooling (test discovery, IDE imports, some linters).
  • A production best practice: keep __init__.py files minimal. Heavy initialization code (database connections, config parsing) should not live there because it runs on import. If importing a package triggers a database connection, you cannot import it in tests, scripts, or type-checking contexts without that side effect. Lazy initialization patterns or explicit init() functions are better.
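
One possible shape for the lazy initialization mentioned above is a cached accessor function. In this sketch, Connection is a stand-in class and "db://localhost/app" is a made-up URL; a real project would substitute its own client and configuration.

```python
from functools import lru_cache


class Connection:
    """Stand-in for an expensive resource (hypothetical)."""

    instances = 0  # counts how many times a connection was created

    def __init__(self, url: str) -> None:
        Connection.instances += 1
        self.url = url


@lru_cache(maxsize=None)
def get_connection() -> Connection:
    # Runs on the first call only; later calls return the cached object.
    return Connection("db://localhost/app")


# Importing this module creates nothing; the resource appears on first use.
before = Connection.instances
conn_a = get_connection()
conn_b = get_connection()

print(before)                 # 0
print(conn_a is conn_b)       # True
print(Connection.instances)   # 1
```

Because nothing is created at import time, tests and type checkers can import the package freely, and tests can replace get_connection with a fake.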
Follow-up: What is the __all__ variable and how does it affect imports?
  • __all__ is a list of strings that defines the public API of a module. It controls two things: what from module import * exports, and what tools like mypy and IDEs consider “public.”
  • Without __all__, from module import * imports every name that does not start with an underscore. With __all__ = ["ClassA", "function_b"], only those specific names are imported. This is important for packages with many internal helper functions that should not leak into the user’s namespace.
  • __all__ does not prevent direct access — from module import _private_func still works. It is a convention, not an access control mechanism. But it is respected by documentation generators (Sphinx), linters, and IDE autocompletion, making it a valuable tool for API design.
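
The effect of __all__ on star imports can be checked with a small experiment. The module all_demo and its function names are hypothetical; exec is used only because from ... import * is easiest to inspect when it targets a separate namespace dictionary.

```python
import importlib
import sys
import tempfile
from pathlib import Path

SOURCE = '''\
__all__ = ["public_function"]

def public_function():
    return "public"

def helper():
    return "omitted from __all__"

def _private():
    return "private"
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "all_demo.py").write_text(SOURCE, encoding="utf-8")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        namespace = {}
        exec("from all_demo import *", namespace)
        # Ignore dunder entries such as __builtins__ that exec adds.
        star_names = sorted(n for n in namespace if not n.startswith("__"))

        # Names left out of __all__ remain reachable by explicit access.
        module = importlib.import_module("all_demo")
        private_result = module._private()
        helper_result = module.helper()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("all_demo", None)

print(star_names)      # ['public_function']
print(private_result)  # private
```

Note that helper has no leading underscore, yet the star import skips it because __all__ is defined and does not list it.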
Strong Answer:
  • Step one: understand the current state. Run pip list on the production machine (or on whichever machine has the “working” environment) to see what is installed, then save the exact versions with pip freeze > requirements-snapshot.txt. This is your baseline: it captures the exact versions that are known to work, including transitive dependencies.
  • Step two: create a virtual environment on a development machine, using the same Python version as production. Run python -m venv .venv, activate it, and install from the snapshot: pip install -r requirements-snapshot.txt. Run the full test suite and verify the application works identically. If there are no tests, run the application manually and verify key flows.
  • Step three: separate direct dependencies from transitive ones. Go through the snapshot and identify which packages the code actually imports (grep the codebase for import statements). Create a pyproject.toml or requirements.in with only the direct dependencies and their version constraints. Use pip-compile (from pip-tools) or uv pip compile to regenerate a lockfile from the direct dependencies. Verify the resolved versions match the working snapshot.
  • Step four: set up the virtual environment in CI/CD. The build pipeline should create a fresh venv and install from the lockfile. If the deployment is containerized (Docker), the Dockerfile should create a venv inside the container — even in Docker, a venv keeps system Python clean and makes it clear what is application code versus OS dependencies.
  • Step five: document the process and add it to the contribution guide. The most common reason projects end up in this state is that the setup process was never documented, so each developer improvised. A Makefile or justfile with targets like make setup, make test, make lock prevents regression.
  • The key principle: do not change anything in production until the new setup is proven equivalent. Freeze the current state first, reproduce it in isolation, and only then start improving.
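
For step three, grepping for import statements works, but parsing the source with the standard library's ast module avoids false matches in comments and strings. The function name top_level_imports and the sample source are assumptions made for this sketch.

```python
import ast


def top_level_imports(source: str) -> set:
    """Collect the top-level package names imported by a piece of source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 means a relative import, which is never a dependency
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


sample = (
    "import os.path\n"
    "import requests\n"
    "from flask import Flask\n"
    "from . import sibling\n"
    "# import commented_out\n"
)
found = top_level_imports(sample)
print(sorted(found))  # ['flask', 'os', 'requests']
```

The result still needs manual review: standard library modules such as os have to be filtered out, and an import name does not always match the distribution name on PyPI (for example, yaml is provided by PyYAML).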
Follow-up: What is the difference between pip freeze output and a proper lockfile?
  • pip freeze outputs every installed package at its current version, but it does not record which packages are direct dependencies versus transitive, it does not include content hashes (so you cannot verify package integrity), and it does not record the Python version or platform constraints. It also captures whatever happens to be installed in the environment, including packages nothing depends on anymore. By contrast, a hand-maintained requirements.txt that pins only direct dependencies leaves transitive versions open, so two runs of pip install -r requirements.txt at different times can resolve them differently.
  • A proper lockfile (from pip-tools, Poetry, or uv) records the full resolved dependency set, can include content hashes (pip-tools adds them with --generate-hashes), traces each transitive dependency back to what required it, and is deterministic: installing from it produces the same environment every time. The lockfile is what you deploy; the direct dependency list is what you edit.
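
The role of content hashes can be illustrated with hashlib. This is a conceptual sketch only: the byte strings stand in for downloaded package files, and real installers perform this check internally rather than through code like this.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of some bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hash: str) -> bool:
    """Check downloaded bytes against the hash recorded at lock time."""
    return sha256_hex(data) == expected_hash


# Pretend artifact contents (made up for the demo).
artifact = b"pretend wheel contents v1"
locked_hash = sha256_hex(artifact)  # what a lockfile would record

ok = verify(artifact, locked_hash)
tampered_ok = verify(b"pretend wheel contents v1 (modified)", locked_hash)

print(ok)           # True: the bytes match the recorded hash
print(tampered_ok)  # False: any change to the bytes is detected
```

A plain pip freeze file records only name==version, so a re-uploaded or altered file with the same version number would pass unnoticed.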