# Python Crash Course

> Essential Python for AI Engineers - from basics to advanced patterns

> **Note:** This is a quick-reference Python guide focused on AI/ML workflows. For a comprehensive Python course, see our [Complete Python Crash Course](/courses/python-crash-course/overview).

## Getting Started

### 1. Install Python

Download Python 3.11+ from [python.org](https://python.org). Verify installation:

```bash theme={null}
python --version  # Should show 3.11 or higher
```

### 2. Set Up Virtual Environment

Virtual environments are like separate toolboxes for each project. Without them, installing a package for Project A might break Project B because they need different versions of the same library. This is not a theoretical concern -- it will happen to you within your first week of AI development, because LLM libraries update frequently and often have conflicting dependencies.

```bash theme={null}
# Create a virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (macOS/Linux)
source venv/bin/activate
```

### 3. Install AI Packages

```bash theme={null}
pip install openai anthropic langchain chromadb pydantic python-dotenv
```

### 4. Manage Dependencies

```bash theme={null}
# Save current packages
pip freeze > requirements.txt

# Install from requirements
pip install -r requirements.txt
```

***

## Python Core Syntax (AI Context)

These are the Python fundamentals you will use daily in AI engineering. We focus on what matters for working with LLM APIs, data processing, and async workflows -- not the full breadth of Python.

### Variables & Types

```python theme={null}
# Basic types -- you will use all four of these in every LLM API call
name = "Claude"  # str: model names, prompts, responses
temperature = 0.7  # float: model parameters, scores, costs
max_tokens = 1000  # int: token limits, counts, retry attempts
is_streaming = True  # bool: feature flags, configuration

# Lists (ordered, mutable)
messages = ["Hello", "How are you?"]
messages.append("Great!")

# Dictionaries -- the most important data structure for AI work.
# Every API request and response is essentially a dict.
config = {
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 1000
}

# Tuples (immutable)
coordinates = (40.7128, -74.0060)
```

### Functions

```python theme={null}
def generate_prompt(context: str, question: str) -> str:
    """Generate a prompt with context and question."""
    return f"Context: {context}\n\nQuestion: {question}"

# With default args
def call_api(prompt: str, temperature: float = 0.7, max_tokens: int = 1000):
    # API call logic here
    pass

# Lambda (anonymous functions)
multiply = lambda x, y: x * y
result = multiply(5, 3)  # 15
```

### Control Flow

```python theme={null}
# If-else
if temperature < 0.3:
    style = "deterministic"
elif temperature < 0.7:
    style = "balanced"
else:
    style = "creative"

# For loops
for message in messages:
    print(message)

# List comprehension (faster, more Pythonic)
lengths = [len(msg) for msg in messages]
filtered = [msg for msg in messages if len(msg) > 10]

# While loop
retry_count = 0
while retry_count < 3:
    try:
        # Try API call
        break
    except Exception:
        retry_count += 1
```

### Error Handling

Error handling is not optional in AI engineering -- LLM APIs fail regularly due to rate limits, network issues, and content policy violations. Every API call should be wrapped in try/except. The pattern below catches errors from most specific to most general, which is important because Python matches the first except block that fits.

```python theme={null}
try:
    response = api_call(prompt)
except APIError as e:
    # Catch specific API errors first (rate limits, invalid requests, etc.)
    print(f"API error: {e}")
except TimeoutError:
    # Network timeouts are common with LLM APIs (long generation times)
    print("Request timed out")
finally:
    # Always runs, even if an exception was raised -- use for cleanup
    close_connection()
```
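
To see the ordering rule in action, here is a minimal self-contained sketch. `APIError`, `RateLimitError`, `api_call`, and `safe_call` are hypothetical stand-ins invented for this example, not names from any real SDK; real client libraries define their own exception hierarchies.

```python
class APIError(Exception):
    """Hypothetical base class for provider errors (an assumption for this sketch)."""


class RateLimitError(APIError):
    """Hypothetical subclass: the provider answered with HTTP 429."""


def api_call(prompt: str) -> str:
    """Stand-in for a real LLM call; fails in different ways depending on the prompt."""
    if prompt == "too fast":
        raise RateLimitError("429: slow down")
    if prompt == "bad":
        raise APIError("400: invalid request")
    if prompt == "slow":
        raise TimeoutError("no response in time")
    return f"Response to: {prompt}"


def safe_call(prompt: str) -> str:
    """Handle failures from most specific to most general."""
    try:
        return api_call(prompt)
    except RateLimitError as e:
        # Must come before APIError: a RateLimitError is also an APIError,
        # and Python runs the first except block that matches.
        return f"rate limited: {e}"
    except APIError as e:
        return f"API error: {e}"
    except TimeoutError:
        return "timed out"


for prompt in ["hello", "too fast", "bad", "slow"]:
    print(safe_call(prompt))
```

If the `except APIError` block were listed first, the `RateLimitError` branch would never run, because the broader handler would already have matched.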

***

## Data Structures for AI

### Working with JSON

JSON is the lingua franca of LLM APIs. Every request you send is JSON, every response you receive is JSON, and structured outputs are JSON. Mastering `json.loads()` and `json.dumps()` is as fundamental to AI engineering as knowing how to read and write.

```python theme={null}
import json

# Parse JSON string -- turns a string into a Python dict
data = json.loads('{"model": "gpt-4", "temp": 0.7}')

# Convert to JSON string
json_str = json.dumps({"result": "success"})

# Read from file
with open("config.json") as f:
    config = json.load(f)

# Write to file
with open("output.json", "w") as f:
    json.dump(results, f, indent=2)
```
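
Model output that is supposed to be JSON is not always valid JSON, so it can help to parse defensively. The sketch below is one possible approach; `parse_model_json` is a hypothetical helper name, and the sample strings are made up for illustration.

```python
import json
from typing import Optional


def parse_model_json(raw: str) -> Optional[dict]:
    """Return the parsed object, or None if the text is not a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # json.loads also accepts lists, numbers, and strings; keep only dicts here.
    return data if isinstance(data, dict) else None


good = parse_model_json('{"sentiment": "positive", "score": 0.9}')
bad = parse_model_json("Sure! Here is the JSON you asked for...")

print(good)
print(bad)

# Round trip: dict -> JSON string -> dict gives back an equal dict.
payload = {"model": "gpt-4", "temperature": 0.7}
assert json.loads(json.dumps(payload)) == payload
```

Returning `None` keeps the caller in control of what happens next (retry, fall back, or log); raising a custom exception would be an equally reasonable design.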

### List Operations

```python theme={null}
# Slicing
messages[0]      # First item
messages[-1]     # Last item
messages[1:3]    # Items 1-2
messages[:2]     # First 2 items
messages[2:]     # From item 2 to end

# Common operations
len(messages)           # Length
messages.extend([...])  # Add multiple items
messages.remove(item)   # Remove specific item
messages.pop()          # Remove & return last item

# Sorting
sorted_list = sorted(numbers)
messages.sort(key=lambda x: len(x))  # Sort by length
```

### Dictionary Operations

```python theme={null}
# Access
value = config["model"]
value = config.get("model", "default")  # Safe access

# Check existence
if "model" in config:
    print(config["model"])

# Iteration
for key, value in config.items():
    print(f"{key}: {value}")

# Merge dictionaries
combined = {**config1, **config2}

# Dictionary comprehension
squares = {x: x**2 for x in range(5)}
```

***

## Object-Oriented Python for AI

### Classes & Dataclasses

```python theme={null}
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Message:
    role: str
    content: str
    tokens: Optional[int] = None

class ChatBot:
    def __init__(self, model: str = "gpt-4"):
        self.model = model
        self.messages: List[Message] = []
    
    def add_message(self, role: str, content: str) -> None:
        msg = Message(role=role, content=content)
        self.messages.append(msg)
    
    def get_history(self) -> list[dict]:
        return [
            {"role": m.role, "content": m.content}
            for m in self.messages
        ]
    
    def clear(self) -> None:
        self.messages = []

# Usage
bot = ChatBot(model="gpt-4o")
bot.add_message("user", "Hello!")
bot.add_message("assistant", "Hi there!")
print(bot.get_history())
```

**Why dataclasses?** They reduce boilerplate for data objects by generating `__init__`, `__repr__`, and `__eq__` for you. Perfect for API responses, configuration objects, and structured data.
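
As a small illustration of what dataclasses provide, the sketch below uses `dataclasses.asdict` to turn nested dataclasses into plain dicts (handy for building request payloads) and `field(default_factory=list)` for a mutable default. The `Conversation` class is a hypothetical addition for this example.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Message:
    role: str
    content: str
    tokens: Optional[int] = None


@dataclass
class Conversation:
    model: str = "gpt-4"
    # default_factory gives each instance its own list (a bare [] default is rejected)
    messages: list[Message] = field(default_factory=list)


convo = Conversation()
convo.messages.append(Message(role="user", content="Hello!"))

# The generated __eq__ compares instances field by field.
assert Message("user", "Hello!") == Message(role="user", content="Hello!")

# asdict recurses into nested dataclasses and lists.
print(asdict(convo))
```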

### Type Hints (Modern Python)

```python theme={null}
from typing import Any, List, Dict, Optional, Union

def process_batch(
    items: List[str],
    config: Dict[str, Any],
    timeout: Optional[float] = None
) -> List[Dict[str, Union[str, int]]]:
    """Process a batch of items with config."""
    results = []
    for item in items:
        result = {"text": item, "length": len(item)}
        results.append(result)
    return results
```

**Why type hints?** Better IDE support, earlier bug detection when paired with a type checker, and self-documenting code.

***

## Dependency Management: pip vs. Poetry vs. uv

Choosing the right tool for managing Python packages will save you hours of debugging dependency conflicts -- a common occurrence in AI projects because libraries like `langchain`, `transformers`, and `torch` have deep and sometimes conflicting dependency trees.

| Tool                       | Speed                   | Lock File                               | Best For                                             | Watch Out For                                                                             |
| -------------------------- | ----------------------- | --------------------------------------- | ---------------------------------------------------- | ----------------------------------------------------------------------------------------- |
| **pip + requirements.txt** | Slow                    | Manual (`pip freeze`)                   | Simple projects, tutorials                           | No lock file by default; `pip freeze` captures everything including transitive deps      |
| **pip + pip-tools**        | Moderate                | `requirements.in` -> `requirements.txt` | Production projects needing reproducibility          | Extra step to compile lock file                                                           |
| **Poetry**                 | Moderate                | `poetry.lock` (automatic)               | Libraries, projects needing publishing               | Slow resolver, can conflict with conda                                                    |
| **uv**                     | Very fast (10-100x pip) | `uv.lock` (automatic)                   | New projects in 2025+, fast iteration                | Newer tool, some edge cases with exotic packages                                          |
| **conda**                  | Slow                    | `environment.yml`                       | Data science, GPU/CUDA deps, cross-platform binaries | Heavy, dependency resolution can be painfully slow                                        |

**AI-specific recommendation**: For AI projects that need PyTorch or CUDA, start with `uv` (fast, modern) and fall back to `conda` only if you need binary packages that pip cannot install (e.g., specific CUDA toolkit versions). For everything else, `uv` or `pip-tools` gives you speed and reproducibility.

## Advanced Patterns for AI Engineering

### Decorators (Reusable Logic)

Decorators are functions that wrap other functions to add behavior -- think of them as "middleware for functions." They are everywhere in AI engineering: `@retry` for handling flaky API calls, `@timer` for profiling, `@cache` for avoiding redundant LLM calls, and `@observe` for tracing. If you understand decorators, you can read (and write) production AI code. If you do not, they will look like magic.

```python theme={null}
import functools
import time
from typing import Callable

def timer(func: Callable) -> Callable:
    """Measure execution time"""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        duration = time.time() - start
        print(f"{func.__name__} took {duration:.2f}s")
        return result
    return wrapper

def retry(max_attempts: int = 3, delay: float = 1.0):
    """Retry decorator with exponential backoff"""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    if attempt == max_attempts - 1:
                        raise
                    wait_time = delay * (2 ** attempt)
                    print(f"Retry {attempt + 1}/{max_attempts} in {wait_time}s")
                    time.sleep(wait_time)
        return wrapper
    return decorator

def cache_result(func: Callable) -> Callable:
    """Simple caching decorator"""
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

# Usage
@retry(max_attempts=3, delay=2.0)
@timer
def call_llm(prompt: str) -> str:
    # Simulated API call
    return "response"
```

**Use cases:**

* `@timer` - Profile slow functions
* `@retry` - Handle flaky API calls
* `@cache_result` - Avoid redundant LLM calls
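
For caching specifically, the standard library already ships a decorator: `functools.cache` (Python 3.9+). The sketch below counts how often the wrapped function body really runs; `fake_llm_call` and `call_count` are hypothetical names used only for this demonstration.

```python
import functools

call_count = 0  # Tracks how many times the underlying function body runs


@functools.cache
def fake_llm_call(prompt: str) -> str:
    """Hypothetical stand-in for an expensive LLM request."""
    global call_count
    call_count += 1
    return f"Response to: {prompt}"


fake_llm_call("What is RAG?")
fake_llm_call("What is RAG?")   # Served from the cache; the body does not run again
fake_llm_call("What is MCP?")   # New argument, so the body runs

print(call_count)                       # 2
print(fake_llm_call.cache_info().hits)  # 1
```

Two caveats: arguments must be hashable, so a list of messages would need converting to a tuple first, and caching only makes sense when the same prompt should always yield the same answer.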

### Context Managers (Resource Management)

Context managers ensure resources are properly managed—files closed, connections released, timers stopped.

```python theme={null}
from contextlib import contextmanager
import time

@contextmanager
def timer_context(label: str):
    """Time a block of code"""
    start = time.time()
    try:
        yield
    finally:
        duration = time.time() - start
        print(f"{label}: {duration:.2f}s")

@contextmanager
def temporary_config(new_config: dict):
    """Temporarily change config, then restore"""
    old_config = config.copy()
    config.update(new_config)
    try:
        yield
    finally:
        config.clear()
        config.update(old_config)

# Usage
with timer_context("Embedding generation"):
    embeddings = generate_embeddings(texts)

with open("data.txt") as f:
    content = f.read()
```

**Use cases:**

* File I/O
* Database connections
* Timing code blocks
* Temporary state changes
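
The "temporary state changes" case can be sketched in a self-contained way by passing the dict in explicitly instead of relying on a global. The names `override` and `settings` are assumptions made for this example.

```python
from contextlib import contextmanager

settings = {"model": "gpt-4", "temperature": 0.7}  # Hypothetical config dict


@contextmanager
def override(target: dict, **changes):
    """Apply changes to target for the duration of the with block, then restore."""
    saved = target.copy()  # Shallow copy: nested values are not snapshotted
    target.update(changes)
    try:
        yield target
    finally:
        # Runs even if the body raises, so the original values always come back.
        target.clear()
        target.update(saved)


try:
    with override(settings, temperature=0.0):
        assert settings["temperature"] == 0.0
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(settings)
```

The `try`/`finally` around `yield` is what guarantees the restore step; without it, an exception inside the `with` block would leave the overridden values in place.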

### Async/Await (Concurrency)

Async is the single most important advanced Python pattern for AI engineering. Here is why: a typical LLM API call takes 1-5 seconds, and during that time your program is just waiting for a network response. Without async, processing 10 prompts one after another takes 10-50 seconds. With async, all 10 requests are in flight at once, so the batch finishes in roughly the time of the slowest call: 1-5 seconds. That is up to a 10x speedup for this batch, provided the provider's rate limits allow 10 concurrent requests.

The mental model: `async def` declares a function that can pause (at `await` points) and let other tasks run while it waits. `asyncio.gather` runs multiple async tasks concurrently.

```python theme={null}
import asyncio
from typing import List

async def fetch_completion(prompt: str) -> str:
    """Simulated async API call -- the await lets other tasks run while waiting"""
    await asyncio.sleep(1)  # In real code, this is the LLM API call
    return f"Response to: {prompt}"

async def process_batch(prompts: List[str]) -> List[str]:
    """Process multiple prompts concurrently.
    
    asyncio.gather runs all tasks at once and waits for all to complete.
    Order is preserved: results[0] corresponds to prompts[0].
    """
    tasks = [fetch_completion(p) for p in prompts]
    results = await asyncio.gather(*tasks)
    return results

# asyncio.run() is the entry point -- call it once from synchronous code
prompts = ["Question 1", "Question 2", "Question 3"]
results = asyncio.run(process_batch(prompts))
```

**Why async?** Process multiple API calls concurrently. 10 sequential 1-second calls = 10 seconds. 10 concurrent = \~1 second. For batch processing, this is not a nice-to-have -- it is the difference between a feature that ships and one that times out.
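
Launching every request at once can run into provider rate limits, so a common refinement is to cap how many calls are in flight. The sketch below does this with `asyncio.Semaphore`; `bounded_fetch`, `process_all`, and `max_concurrent` are hypothetical names, and the 0.1-second sleep stands in for a real API call.

```python
import asyncio
import time


async def fetch_completion(prompt: str) -> str:
    """Simulated API call; the sleep stands in for network latency."""
    await asyncio.sleep(0.1)
    return f"Response to: {prompt}"


async def bounded_fetch(prompt: str, limit: asyncio.Semaphore) -> str:
    # At most max_concurrent coroutines hold the semaphore at the same time.
    async with limit:
        return await fetch_completion(prompt)


async def process_all(prompts: list[str], max_concurrent: int = 5) -> list[str]:
    limit = asyncio.Semaphore(max_concurrent)
    tasks = [bounded_fetch(p, limit) for p in prompts]
    return await asyncio.gather(*tasks)  # Results come back in input order


prompts = [f"Question {i}" for i in range(10)]
start = time.perf_counter()
results = asyncio.run(process_all(prompts))
elapsed = time.perf_counter() - start

print(results[0])
print(f"{elapsed:.2f}s")  # Roughly 0.2s: two waves of five, not ten sequential calls
```

Raising `max_concurrent` trades safety margin against throughput; the right value depends on your provider's limits.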

***

## File Operations

```python theme={null}
# Read entire file
with open("data.txt") as f:
    content = f.read()

# Read line by line (memory efficient)
with open("large_file.txt") as f:
    for line in f:
        process(line.strip())

# Write to file
with open("output.txt", "w") as f:
    f.write("Hello, world!\n")

# Append to file
with open("log.txt", "a") as f:
    f.write(f"{timestamp}: Event\n")

# Check if file exists
from pathlib import Path
if Path("config.json").exists():
    # Load config
    pass
```

***

## Environment Variables (.env)

API keys are the crown jewels of your AI application. A leaked OpenAI key can rack up thousands of dollars in charges before you notice. The `.env` pattern keeps secrets out of your code and out of git history. This is not a suggestion -- it is a hard requirement for any project that will ever be shared, deployed, or committed to a repository.

```python theme={null}
from dotenv import load_dotenv
import os

# Load from .env file -- call this once at startup
load_dotenv()

# Access variables -- os.getenv returns None if not found (no crash)
api_key = os.getenv("OPENAI_API_KEY")
db_url = os.getenv("DATABASE_URL", "default_url")  # Second arg is fallback
```

**.env file:**

```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
DATABASE_URL=postgresql://...
```

**Never commit `.env` files!** Add `.env` to your `.gitignore` immediately when you create a new project -- before you make your first commit. If you accidentally commit a key, rotate it immediately; removing it from git history is difficult and unreliable.

***

## Essential Libraries for AI

### HTTP Requests

```python theme={null}
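import requests

# GET request -- always set a timeout; requests waits indefinitely by default
response = requests.get("https://api.example.com/data", timeout=10)
response.raise_for_status()  # Raise on 4xx/5xx instead of parsing an error body
data = response.json()

# POST request (api_key is assumed to be loaded already, e.g. via os.getenv)
response = requests.post(
    "https://api.example.com/generate",
    json={"prompt": "Hello", "max_tokens": 100},
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)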

### Data Manipulation (Pandas)

```python theme={null}
import pandas as pd

# Read CSV
df = pd.read_csv("data.csv")

# Basic operations
df.head()           # First 5 rows
df.describe()       # Statistics
df["column"].mean() # Column average

# Filter
filtered = df[df["score"] > 0.8]

# Group by
grouped = df.groupby("category")["score"].mean()
```

### Date & Time

```python theme={null}
from datetime import datetime, timedelta

now = datetime.now()
timestamp = now.isoformat()

# Add time
tomorrow = now + timedelta(days=1)
hour_ago = now - timedelta(hours=1)

# Parse string
dt = datetime.fromisoformat("2024-01-15T10:30:00")
```

***

## Common AI Patterns

These patterns appear in virtually every AI application. They are worth memorizing because you will use them dozens of times.

### Loading Environment Variables

```python theme={null}
from dotenv import load_dotenv
import os

load_dotenv()
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY not set")
```

### Building Prompts

```python theme={null}
def build_rag_prompt(context: str, question: str) -> str:
    """Build RAG prompt with context."""
    return f"""Use the following context to answer the question.

Context:
{context}

Question: {question}

Answer:"""

# Template with f-strings
system_prompt = f"""You are an AI assistant with expertise in {domain}.
Your responses should be {tone} and {length}."""
```

### Batching Requests

Batching is essential when you have hundreds or thousands of items to process. Sending them all at once will hit rate limits; sending them one at a time is painfully slow. Batching gives you the best of both worlds: controlled throughput that stays within API limits while processing efficiently. The `yield` keyword makes this a generator, which means it processes one batch at a time and does not load all results into memory.

```python theme={null}
from typing import List

def batch_process(items: List[str], batch_size: int = 10):
    """Process items in batches to stay within rate limits."""
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        results = process_batch(batch)  # process_batch: your own synchronous per-batch function
        yield results  # yield returns results one batch at a time (memory-efficient)
```
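
To make the generator behaviour concrete, here is a runnable sketch that separates slicing from processing. `batched`, the toy `process_batch`, and the `docs` list are assumptions for this example; on Python 3.12+ the standard library's `itertools.batched` covers the slicing part.

```python
from typing import Iterator


def batched(items: list[str], batch_size: int = 10) -> Iterator[list[str]]:
    """Yield successive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


def process_batch(batch: list[str]) -> list[int]:
    """Hypothetical per-batch work; here it just measures each string."""
    return [len(text) for text in batch]


docs = [f"doc-{n}" for n in range(7)]

gen = batched(docs, batch_size=3)  # Nothing has run yet: generators are lazy
sizes = [len(batch) for batch in gen]
print(sizes)  # [3, 3, 1]

# Flatten the per-batch results back into one list.
all_lengths = [n for batch in batched(docs, 3) for n in process_batch(batch)]
print(all_lengths)
```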

### Rate Limiting

OpenAI, Anthropic, and other providers enforce rate limits (typically measured in requests per minute and tokens per minute). Exceeding them results in 429 errors and temporary bans. This simple rate limiter adds a fixed delay between calls to keep you under the limit. For production use, consider the `tenacity` library for more sophisticated retry-with-backoff patterns.

```python theme={null}
import time

def rate_limited_call(func, calls_per_minute: int = 60):
    """Rate limit function calls.
    
    Simple but effective: adds a fixed delay between calls.
    For 60 calls/min, that is 1 second between each call.
    """
    delay = 60.0 / calls_per_minute
    
    def wrapper(*args, **kwargs):
        time.sleep(delay)  # Pause before each call to stay under the limit
        return func(*args, **kwargs)
    
    return wrapper
```
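
One possible variation on the limiter above: instead of sleeping a full interval before every call, track when the previous call happened and wait only for the remainder. This is a sketch under stated assumptions, not a production limiter; `min_interval` and `ping` are hypothetical names, and the wrapper is not thread-safe.

```python
import functools
import time


def min_interval(calls_per_minute: int = 60):
    """Decorator factory: space calls at least 60/calls_per_minute seconds apart."""
    interval = 60.0 / calls_per_minute

    def decorator(func):
        last_call = None  # Time of the previous call, or None before the first

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal last_call
            if last_call is not None:
                remaining = interval - (time.monotonic() - last_call)
                if remaining > 0:
                    time.sleep(remaining)  # Wait only for the unused part of the interval
            last_call = time.monotonic()
            return func(*args, **kwargs)

        return wrapper

    return decorator


@min_interval(calls_per_minute=1200)  # 0.05s between calls, small so the demo is quick
def ping(n: int) -> int:
    return n


start = time.monotonic()
values = [ping(n) for n in range(3)]
elapsed = time.monotonic() - start
print(values, f"{elapsed:.2f}s")  # The first call is immediate; the next two each wait
```

Using `time.monotonic()` rather than `time.time()` avoids surprises if the system clock is adjusted while the program runs.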

### Sync vs. Async: When to Use What

| Scenario               | Use Sync | Use Async       | Why                                                                                                      |
| ---------------------- | -------- | --------------- | -------------------------------------------------------------------------------------------------------- |
| Single LLM call        | Yes      | No              | No concurrency benefit; simpler code                                                                     |
| Batch of 10+ LLM calls | No       | Yes             | 10x speedup from concurrent I/O                                                                          |
| FastAPI endpoint       | Either   | Preferred       | FastAPI is async-native; a blocking call inside an `async def` endpoint stalls the event loop (plain `def` endpoints run in a thread pool) |
| Jupyter notebook       | Yes      | Tricky          | Notebooks already run an event loop; `asyncio.run()` will fail -- use `await` directly or `nest_asyncio` |
| CLI script             | Yes      | Yes (for batch) | Sync is simpler; use async only if you have batch processing                                             |
| Streaming response     | Either   | Preferred       | Async generators (`async for`) integrate cleanly with streaming APIs                                     |

**Edge case -- mixing sync and async**: If you call a sync function (like `requests.get()`) from inside an async function, it blocks the entire event loop. Use `httpx` (async HTTP) instead of `requests`, or wrap sync calls in `asyncio.to_thread()` to run them in a thread pool without blocking.
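
The `asyncio.to_thread()` option can be sketched as follows. `blocking_fetch` is a hypothetical function whose `time.sleep` stands in for a blocking call such as `requests.get()`; no network access happens in this example.

```python
import asyncio
import time


def blocking_fetch(url: str) -> str:
    """Hypothetical sync call; time.sleep stands in for a blocking HTTP request."""
    time.sleep(0.2)
    return f"fetched {url}"


async def fetch_all(urls: list[str]) -> list[str]:
    # Each blocking call runs in a worker thread, so the event loop stays free.
    tasks = [asyncio.to_thread(blocking_fetch, url) for url in urls]
    return await asyncio.gather(*tasks)


urls = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]
start = time.perf_counter()
pages = asyncio.run(fetch_all(urls))
elapsed = time.perf_counter() - start

print(pages)
print(f"{elapsed:.2f}s")  # Close to 0.2s rather than 0.6s, since the sleeps overlap
```

Threads are a reasonable bridge for a handful of legacy sync calls; for heavy traffic, a natively async client usually scales better.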

***

## Next Steps

* [**Complete Python Course**](/courses/python-crash-course/overview) - Deep dive into Python fundamentals, data structures, OOP, and more
* [**FastAPI Crash Course**](/ai-engineering/fastapi-crash-course) - Build production APIs for AI applications
* [**Async Patterns**](/ai-engineering/async-patterns) - Master concurrent programming for AI workloads
* [**LLM Fundamentals**](/ai-engineering/llm-fundamentals) - Start building with language models

***

## Quick Reference

### Common Commands

```bash theme={null}
# Python version
python --version

# Install package
pip install package-name

# Install from requirements
pip install -r requirements.txt

# Create requirements file
pip freeze > requirements.txt

# Create virtual environment
python -m venv venv

# Activate venv (Windows)
venv\Scripts\activate

# Activate venv (macOS/Linux)
source venv/bin/activate

# Deactivate venv
deactivate

# Run Python file
python script.py

# Interactive Python shell
python

# Install specific version
pip install package-name==1.2.3
```

### Style Guidelines (PEP 8)

```python theme={null}
# Naming conventions
class MyClass:          # PascalCase for classes
    pass

def my_function():      # snake_case for functions/variables
    pass

CONSTANT_VALUE = 100    # UPPER_CASE for constants

# Line length: 79 characters per PEP 8 (the Black formatter defaults to 88)
# Imports at top, grouped: standard lib, third-party, local
# Use 4 spaces for indentation (not tabs)
# Two blank lines between top-level functions/classes
# One blank line between methods
```

### Type Hints Quick Reference

```python theme={null}
from typing import List, Dict, Optional, Union, Callable, Any

def func(
    text: str,                          # String
    count: int,                         # Integer
    ratio: float,                       # Float
    is_valid: bool,                     # Boolean
    items: List[str],                   # List of strings
    config: Dict[str, int],             # Dict with str keys, int values
    callback: Callable[[str], int],     # Function: str -> int
    either: Union[str, int],            # Can be str OR int
    anything: Any,                      # Any type
    optional: Optional[str] = None,     # Can be str or None (defaults must come last)
) -> tuple[str, int]:                   # Returns tuple
    return ("result", 42)
```

***

## Common Python Gotchas in AI Work

These are the mistakes that burn hours of debugging time specifically in AI engineering contexts:

| Gotcha                                   | What Happens                                                                                            | Fix                                                                                |
| ---------------------------------------- | ------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
| **Mutable default arguments**            | `def f(items=[])` shares the same list across all calls -- appended items persist                       | Use `def f(items=None):` and start the body with `if items is None: items = []`    |
| **Assignment is not a copy**             | `config2 = config1` means both names point to the same dict; changing one changes both                  | Use `config2 = config1.copy()` (shallow) or `copy.deepcopy(config1)` for nested dicts |
| **f-string with dicts**                  | `f"value: {d["key"]}"` is a `SyntaxError` before Python 3.12 because the inner double quotes end the string early | Use the other quote style inside the braces: `f"value: {d['key']}"`, or assign to a variable first |
| **JSON `dumps` vs `dump`**               | `json.dumps()` returns a string; `json.dump()` writes to a file. Mixing them up produces cryptic errors | Remember: the `s` stands for "string"                                              |
| **`asyncio.run()` in Jupyter**           | Raises `RuntimeError: asyncio.run() cannot be called from a running event loop`                         | Use `await` directly in cells, or install `nest_asyncio` and call `nest_asyncio.apply()` |
| **Float precision in cost calculations** | `0.1 + 0.2` evaluates to `0.30000000000000004`, so `0.1 + 0.2 == 0.3` is `False`                        | Use `round()` or `Decimal` for financial calculations                              |
| **Forgetting `await`**                   | `result = async_func()` returns a coroutine object, not the result                                      | Always `result = await async_func()` -- type checkers catch this if you use type hints, and Python emits a `RuntimeWarning` that the coroutine was never awaited |
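
Two of these gotchas are easy to reproduce in a few lines. The sketch below demonstrates the mutable-default trap and float precision; `append_bad` and `append_good` are hypothetical function names for illustration.

```python
from decimal import Decimal


def append_bad(item, items=[]):
    """The default list is created once, at definition time, and reused."""
    items.append(item)
    return items


def append_good(item, items=None):
    """A fresh list is created on each call that omits items."""
    if items is None:
        items = []
    items.append(item)
    return items


append_bad("a")
print(append_bad("b"))    # ['a', 'b'] -- the first call's item is still there
append_good("a")
print(append_good("b"))   # ['b']

# Binary floats cannot represent 0.1 exactly; Decimal built from strings can.
print(0.1 + 0.2 == 0.3)                                   # False
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True
```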

***

## Troubleshooting

### "Module not found" error

```bash theme={null}
# Make sure venv is activated
# Then reinstall
pip install -r requirements.txt
```

### "pip: command not found"

```bash theme={null}
# Use python -m pip instead
python -m pip install package-name
```

### Import errors in VS Code

1. Select correct Python interpreter: `Ctrl+Shift+P` → "Python: Select Interpreter"
2. Choose the one in your `venv` folder

### Slow pip installs

```bash theme={null}
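# Point pip at a mirror closer to you. https://pypi.org/simple is the default
# index, so replace it with your mirror's URL.
pip install --index-url https://pypi.org/simple package-name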
```

***

> **Pro Tips:**
>
> * Use virtual environments for EVERY project
> * Pin package versions in production (`package==1.2.3`)
> * Use type hints with a type checker: together they catch bugs before runtime
> * Learn list/dict comprehensions: they are concise, idiomatic, and usually faster than the equivalent loop
> * Use `python-dotenv` for API keys and secrets