# Object-Oriented Programming

> Classes, Magic Methods, and Dataclasses

<img src="https://mintcdn.com/devweeekends/X0Fp4X8lMl-ZftoO/images/courses/python-crash-course/python-oop.svg?fit=max&auto=format&n=X0Fp4X8lMl-ZftoO&q=85&s=368b2fe61a2bcfc837faa2881c652ff9" alt="Python OOP" width="1080" height="1080" data-path="images/courses/python-crash-course/python-oop.svg" />

In Python, everything is an object. Functions are objects, numbers are objects, even classes themselves are objects (they are instances of `type`). OOP allows you to create your own custom types to model your problem domain.

A useful analogy: a class is like a blueprint for a house. The blueprint defines the layout (attributes) and what the house can do (methods like "open garage door"). Each house built from that blueprint is an object (instance). The blueprint is shared, but each house has its own address, paint color, and residents.

***

## 1. Classes & Objects

A **Class** is the blueprint. An **Object** is the instance.

```python theme={null}
class Dog:
    # Class Attribute: Shared by ALL instances of Dog
    species = "Canis familiaris"

    # Initializer (Constructor)
    # 'self' refers to the specific object being created
    def __init__(self, name, age):
        # Instance Attributes: Unique to each instance
        self.name = name
        self.age = age

    # Instance Method
    def bark(self):
        return f"{self.name} says Woof!"

# Usage
buddy = Dog("Buddy", 5)
print(buddy.bark())
```
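
To make the class-attribute comment concrete, the sketch below re-declares a trimmed-down `Dog` and shows that `species` lives on the class until an instance assigns its own value. The second dog, `max_`, is a hypothetical name added for illustration.

```python
class Dog:
    species = "Canis familiaris"  # Class attribute: stored once, on the class

    def __init__(self, name):
        self.name = name  # Instance attribute: stored on each object

buddy = Dog("Buddy")
max_ = Dog("Max")  # Hypothetical second instance

# Both instances read the same class-level value
print(buddy.species == max_.species)  # True

# Assigning through an instance creates an instance attribute that shadows
# the class attribute for that one object only
buddy.species = "Canis lupus"
print(max_.species)  # Canis familiaris
print(Dog.species)   # Canis familiaris
```

If the goal is to change the value for every instance, assign on the class (`Dog.species = ...`) rather than on one object.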

### The `self` Parameter

In Python, you must explicitly define `self` as the first parameter of instance methods. It is how the method knows *which* object it is operating on. (It is similar to `this` in Java/C++, but explicit -- Python's "explicit is better than implicit" philosophy in action).

<Note>
  `self` is just a **convention**, not a keyword. You could call it `this` or `me` and Python would not care. But deviating from `self` is a surefire way to confuse every Python developer who reads your code. Follow the convention.
</Note>
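
One way to see what `self` does: calling a method on an instance is roughly shorthand for calling the function on the class and passing the instance yourself. A minimal sketch, reusing a stripped-down `Dog`:

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def bark(self):
        return f"{self.name} says Woof!"

buddy = Dog("Buddy")

# These two calls are equivalent: Python passes `buddy` as `self`
via_instance = buddy.bark()
via_class = Dog.bark(buddy)

print(via_instance)               # Buddy says Woof!
print(via_instance == via_class)  # True
```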

***

## 2. Inheritance

Inheritance allows you to create specialized versions of existing classes. Think of it as an "is-a" relationship: a `Cat` **is an** `Animal`, so it inherits everything `Animal` provides and can override or extend behaviors.

```python theme={null}
class Animal:
    def speak(self):
        raise NotImplementedError("Subclasses must implement speak()")

class Cat(Animal):
    def speak(self):
        return "Meow"

class Dog(Animal):
    def speak(self):
        return "Woof"

# Polymorphism: treat different types uniformly
animals = [Cat(), Dog(), Cat()]
for animal in animals:
    print(animal.speak())  # Each calls its own version of speak()
```

### `super()`

Use `super()` to call methods from the parent class. This is how you **extend** behavior (add to it) rather than **replace** it.

```python theme={null}
class GoldenRetriever(Dog):
    def speak(self):
        # Call parent's speak(), then add to it
        return super().speak() + " (Golden Style)"
```
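
`super()` is most often used inside `__init__`, so the subclass can reuse the parent's setup before adding its own state. A minimal sketch; `Employee`, `Manager`, and their fields are hypothetical names chosen for illustration:

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

    def describe(self):
        return f"{self.name} earns {self.salary}"

class Manager(Employee):
    def __init__(self, name, salary, reports):
        super().__init__(name, salary)  # Let Employee set name and salary
        self.reports = reports          # Then add Manager-specific state

    def describe(self):
        # Extend the parent's description instead of rewriting it
        return f"{super().describe()} and manages {len(self.reports)} people"

boss = Manager("Ada", 150_000, ["Linus", "Grace"])
print(boss.describe())  # Ada earns 150000 and manages 2 people
```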

### The MRO (Method Resolution Order)

Python supports **multiple inheritance** -- a class can inherit from more than one parent. When it does, Python uses the **C3 linearization algorithm** to determine which method to call. This order is called the MRO.

```python theme={null}
class A:
    def greet(self):
        return "A"

class B(A):
    def greet(self):
        return "B"

class C(A):
    def greet(self):
        return "C"

class D(B, C):
    pass  # Inherits from both B and C

print(D().greet())   # "B" -- B comes before C in the MRO
print(D.__mro__)     # (D, B, C, A, object) -- the full resolution order
```
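
The same diamond becomes more interesting when each class calls `super()`. In that case `super()` follows the MRO of the instance rather than jumping straight to the textual parent, so every class runs exactly once. A sketch with hypothetical class names (`Base`, `Left`, `Right`, `Child`):

```python
class Base:
    def greet(self):
        return ["Base"]

class Left(Base):
    def greet(self):
        # For a Child instance, super() here resolves to Right, not Base
        return ["Left"] + super().greet()

class Right(Base):
    def greet(self):
        return ["Right"] + super().greet()

class Child(Left, Right):
    def greet(self):
        return ["Child"] + super().greet()

print(Child().greet())  # ['Child', 'Left', 'Right', 'Base']
print(Left().greet())   # ['Left', 'Base']
print([cls.__name__ for cls in Child.__mro__])
# ['Child', 'Left', 'Right', 'Base', 'object']
```

Note that `Left.greet()` reaches `Right` only when the instance is a `Child`; on a plain `Left` instance the same `super()` call goes directly to `Base`.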

<Warning>
  **Inheritance Pitfall**: Favor **composition over inheritance** for code reuse. Deep inheritance hierarchies (more than 2-3 levels) become fragile and hard to reason about. If you find yourself inheriting just to reuse a few methods, consider passing the shared behavior as a collaborator object instead.

  ```python theme={null}
  # Fragile: deep inheritance for code reuse
  class Animal: ...
  class Pet(Animal): ...
  class DomesticDog(Pet): ...
  class GoldenRetriever(DomesticDog): ...  # 4 levels deep -- changes anywhere break things

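  # Better: composition -- "has a" instead of "is a"
  class TrickSet:  # Hypothetical collaborator that holds the reusable behavior
      ...

  class Dog:
      def __init__(self, breed, tricks=None):
          self.breed = breed
          # Dog HAS a TrickSet, not IS a TrickSet
          self.tricks = tricks if tricks is not None else TrickSet()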
  ```
</Warning>

***

## 3. Magic Methods (Dunder Methods)

Python classes can integrate tightly with language syntax using "Magic Methods" (Double UNDERscore methods, hence "dunder"). This is Python's protocol system -- by implementing specific dunder methods, your objects can behave like built-in types. Think of it as a contract: if you implement `__add__`, Python will call it whenever someone uses `+` on your object.

The most commonly used dunder methods:

| Method                 | Triggered By                  | Purpose                                  |
| ---------------------- | ----------------------------- | ---------------------------------------- |
| `__init__`             | `MyClass()`                   | Initialize a new instance                |
| `__str__`              | `print(obj)`, `str(obj)`      | Human-readable string                    |
| `__repr__`             | `repr(obj)`, debugger display | Unambiguous string (ideally `eval`-able) |
| `__len__`              | `len(obj)`                    | Return the length                        |
| `__getitem__`          | `obj[key]`                    | Enable indexing and slicing              |
| `__eq__`               | `obj1 == obj2`                | Equality comparison                      |
| `__hash__`             | `hash(obj)`, dict key         | Make object hashable                     |
| `__add__`              | `obj1 + obj2`                 | Addition operator                        |
| `__enter__`/`__exit__` | `with obj:`                   | Context manager protocol                 |
| `__iter__`/`__next__`  | `for x in obj:`               | Iterator protocol                        |

```python theme={null}
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # __repr__ should be unambiguous -- ideally you could copy-paste to recreate
        return f"Vector({self.x}, {self.y})"

    def __str__(self):
        # __str__ is for human-friendly display
        return f"({self.x}, {self.y})"

    def __add__(self, other):
        # Called when you do v1 + v2
        return Vector(self.x + other.x, self.y + other.y)

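    def __eq__(self, other):
        # Called when you do v1 == v2
        if not isinstance(other, Vector):
            return NotImplemented  # Let Python try the other operand's __eq__
        return self.x == other.x and self.y == other.y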

v1 = Vector(1, 2)
v2 = Vector(3, 4)
v3 = v1 + v2

print(v3)        # Output: (4, 6)      -- calls __str__
print(repr(v3))  # Output: Vector(4, 6) -- calls __repr__
print(v1 == Vector(1, 2))  # True       -- calls __eq__
```

<Warning>
  **Dunder Pitfall: `__str__` vs `__repr__`** -- If you only implement one, implement `__repr__`. Python falls back to `__repr__` when `__str__` is not defined, but not the other way around. In practice, `__repr__` is what you see in the debugger, in logs, and in list displays (`[Vector(1, 2), Vector(3, 4)]`). It matters more than `__str__` for day-to-day development.
</Warning>
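
The table above also lists container and iteration hooks that the `Vector` example does not exercise. As a rough sketch, a hypothetical `Playlist` class can support `len()`, indexing, slicing, and `for` loops by delegating to an internal list:

```python
class Playlist:
    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # Called by len(playlist)
        return len(self._songs)

    def __getitem__(self, index):
        # Called by playlist[0] and playlist[1:]; the list handles both cases
        return self._songs[index]

    def __iter__(self):
        # Called by `for song in playlist`; reuse the list's iterator
        return iter(self._songs)

playlist = Playlist(["Intro", "Verse", "Outro"])

print(len(playlist))                  # 3
print(playlist[0])                    # Intro
print(playlist[1:])                   # ['Verse', 'Outro']
print([s.upper() for s in playlist])  # ['INTRO', 'VERSE', 'OUTRO']
```

Returning `iter(self._songs)` from `__iter__` avoids writing `__next__` by hand; a separate iterator class is only needed when the iteration carries custom state.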

***

## 4. Properties

In Java, you write `getVariable()` and `setVariable()`. In Python, we prefer direct access (`obj.variable`). But what if you need validation?

Use the `@property` decorator. It lets you use a method as if it were an attribute.

```python theme={null}
class Circle:
    def __init__(self, radius):
        self.radius = radius  # Goes through the setter, so validation also runs here

    @property
    def radius(self):
        return self._radius  # Convention: _variable means "internal use only"

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("Radius cannot be negative")
        self._radius = value

c = Circle(5)
c.radius = 10 # Calls the setter method!
# c.radius = -1 # Raises ValueError
```
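
A property without a setter is read-only, which suits values derived from other attributes. A minimal sketch that adds a hypothetical `area` property to the same `Circle` idea:

```python
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius  # Routed through the setter below

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("Radius cannot be negative")
        self._radius = value

    @property
    def area(self):
        # Computed on every access, so it always reflects the current radius
        return math.pi * self._radius ** 2

c = Circle(2)
print(round(c.area, 2))  # 12.57

c.radius = 3
print(round(c.area, 2))  # 28.27

try:
    c.area = 10  # No setter defined, so assignment fails
except AttributeError:
    print("area is read-only")
```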

***

## 5. Dataclasses (Python 3.7+)

If you are writing a class just to hold data (like a struct in C or a record in Java), standard classes are verbose -- you write `__init__`, `__repr__`, `__eq__` by hand every time. **Dataclasses** automate this boilerplate while still giving you full control.

```python theme={null}
from dataclasses import dataclass, field

@dataclass
class Point:
    x: int
    y: int
    z: int = 0  # Default value

p1 = Point(1, 2)
p2 = Point(1, 2)

print(p1)        # Output: Point(x=1, y=2, z=0) -- auto-generated __repr__
print(p1 == p2)  # Output: True -- auto-generated __eq__ compares field values
```

### Frozen Dataclasses (Immutable)

```python theme={null}
@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool = False

config = Config("localhost", 8080)
# config.port = 9090  # FrozenInstanceError! Cannot modify.
# Frozen dataclasses are hashable -- can be used as dict keys or set members
```
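
A short sketch of what `frozen=True` buys you in practice, reusing the `Config` shape from above. The `connections` dict is a hypothetical example of using an instance as a key:

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool = False

config = Config("localhost", 8080)

try:
    config.port = 9090
except FrozenInstanceError as exc:
    print(f"Blocked: {exc}")

# Equal field values produce equal hashes, so an equal instance finds the key
connections = {config: "primary"}
print(connections[Config("localhost", 8080)])  # primary
```

Hashing only works if every field value is itself hashable, so a frozen dataclass holding a list would still fail when used as a key.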

<Warning>
  **Dataclass Mutable Default Pitfall** -- The same mutable default trap from functions applies here. Never use a mutable object as a default directly. Use `field(default_factory=...)` instead.

  ```python theme={null}
  from dataclasses import dataclass, field

  # WRONG -- the dataclass decorator raises a ValueError to protect you
  # @dataclass
  # class Team:
  #     members: list = []  # ValueError: mutable default not allowed

  # CORRECT -- use field with default_factory
  @dataclass
  class Team:
      name: str
      members: list = field(default_factory=list)  # Fresh list per instance
  ```
</Warning>
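
To see both halves of this pitfall, the sketch below triggers the error deliberately and then checks that `default_factory` gives each instance its own list. `BrokenTeam` is a hypothetical name used only to provoke the error:

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class BrokenTeam:
        members: list = []  # Rejected when the decorator runs
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)

red = Team("Red")
blue = Team("Blue")
red.members.append("Ana")

print(red.members)   # ['Ana']
print(blue.members)  # []
```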

### `__slots__` for Memory Optimization

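By default, Python objects store their attributes in a per-instance `__dict__` dictionary. For classes with many instances, that per-object overhead adds up. `__slots__` replaces the dict with fixed attribute storage, which often cuts per-instance memory by roughly 30-40% for small objects, though the exact saving depends on the Python version and the number of attributes.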

```python theme={null}
class PointWithSlots:
    __slots__ = ("x", "y")  # Only these attributes are allowed

    def __init__(self, x, y):
        self.x = x
        self.y = y
        # self.z = 0  # AttributeError! z is not in __slots__

# In Python 3.10+, dataclasses support slots directly:
from dataclasses import dataclass

@dataclass(slots=True)
class Point:
    x: float
    y: float
```
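
The sketch below makes the difference observable without quoting byte counts, which vary across Python versions. `RegularPoint` and `SlottedPoint` are hypothetical names:

```python
class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

regular = RegularPoint(1, 2)
slotted = SlottedPoint(1, 2)

print(hasattr(regular, "__dict__"))  # True
print(hasattr(slotted, "__dict__"))  # False

regular.z = 3  # Allowed: stored in the instance __dict__
try:
    slotted.z = 3  # Rejected: "z" is not declared in __slots__
except AttributeError:
    print("slotted instances reject undeclared attributes")
```

To measure the saving on your own interpreter, a tool such as the standard library's `tracemalloc` gives a more honest picture than a fixed percentage.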

***

## 6. Duck Typing and Protocols

Python follows the principle of **duck typing**: "If it walks like a duck and quacks like a duck, it is a duck." You do not check what an object *is* -- you check what it *can do*.

```python theme={null}
# This function works with ANY object that has a .read() method --
# a file, a StringIO, a network socket, a mock in tests.
def process_stream(stream):
    data = stream.read()
    return data.upper()

# Since Python 3.8, you can formalize duck typing with Protocol:
from typing import Protocol

class Readable(Protocol):
    def read(self) -> str: ...

def process_stream(stream: Readable) -> str:  # Type checker enforces the contract
    return stream.read().upper()
```

<Info>
  Protocols are Python's answer to Go interfaces -- they define what an object must be able to do without requiring inheritance. Use them for type safety without coupling. This is the Pythonic alternative to Java-style abstract base classes in most situations.
</Info>
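
Protocols are checked by static type checkers by default. If a runtime check is also wanted, `typing.runtime_checkable` enables `isinstance()` against a protocol. A minimal sketch; `FakeSensor` is a hypothetical class that never inherits from `Readable`:

```python
import io
from typing import Protocol, runtime_checkable

@runtime_checkable
class Readable(Protocol):
    def read(self) -> str: ...

class FakeSensor:
    # No inheritance from Readable: having read() is enough
    def read(self) -> str:
        return "21.5c"

def process_stream(stream: Readable) -> str:
    return stream.read().upper()

print(process_stream(FakeSensor()))          # 21.5C
print(process_stream(io.StringIO("hello")))  # HELLO
print(isinstance(FakeSensor(), Readable))    # True
print(isinstance(42, Readable))              # False
```

The runtime check only confirms that a `read` attribute exists; it does not verify the signature or return type, so it complements a type checker rather than replacing one.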

## Summary

* **Classes**: Encapsulate data and behavior. Use `self` explicitly.
* **Inheritance**: Prefer composition over inheritance. Check the MRO with `__mro__`.
* **Magic Methods**: Implement `__repr__` first, then `__str__`. Make your objects behave like built-in types.
* **Properties**: Add validation logic without changing the API (`@property`).
* **Dataclasses**: The modern way to define data containers. Use `frozen=True` for immutability, `field(default_factory=...)` for mutable defaults, and `slots=True` for memory efficiency.
* **Duck Typing**: Check capabilities, not types. Use `Protocol` for type-safe duck typing.

Next, we'll learn how to organize code into **Modules and Packages**.

***

## Interview Deep-Dive

<AccordionGroup>
  <Accordion title="What are metaclasses in Python? When would you actually use one in production, and what are the alternatives?">
    **Strong Answer:**

    * A metaclass is the class of a class. Just as an object is an instance of a class, a class is an instance of a metaclass. By default, all classes in Python are instances of `type`. When you write `class Foo: pass`, Python effectively calls `type("Foo", (), namespace)` to create the class object, with `object` added as the implicit base. A custom metaclass lets you intercept and modify this class creation process.
    * You define a metaclass by subclassing `type` and overriding `__new__` or `__init__`. `__new__` is called before the class object is created (you can modify the class name, bases, or namespace). `__init__` is called after the class object exists (you can modify the class in place). `__init_subclass__` (Python 3.6+) is a lighter-weight alternative that hooks into subclass creation without a full metaclass.
    * Real production use cases are narrow but powerful. ORMs like Django's Model and SQLAlchemy's declarative base use metaclasses to inspect class attributes (field definitions) at class creation time and build database table mappings. API frameworks use them to automatically register endpoint classes. Serialization libraries use them to generate schema validation code when the class is defined rather than at runtime.
    * The important nuance: metaclasses are almost always overkill. Python 3.6 introduced `__init_subclass__`, which covers the large majority of use cases that previously required metaclasses (validating subclass attributes, auto-registering subclasses, injecting behavior). Class decorators cover most of the rest. As a rough rule of thumb, actual metaclasses are only needed for the small remainder -- when you need to control the class namespace before the class body executes (using `__prepare__`), or when you need metaclass inheritance to propagate behavior automatically through a class hierarchy.
    * The senior answer to "should I use a metaclass?" is almost always "no, use `__init_subclass__` or a class decorator first." Metaclasses add cognitive overhead, make debugging harder (stack traces go through `type.__new__`), and create composability problems (you cannot easily combine two metaclasses). They are a power tool for framework authors, not application code.

    **Follow-up: What is `__init_subclass__` and how does it replace metaclasses for common patterns like auto-registration?**

    * `__init_subclass__` is an implicit class method (no `@classmethod` decorator needed) that is defined on a parent class and called automatically whenever that class is subclassed. It receives the new subclass as `cls`, plus any keyword arguments passed in the class definition (for example, `class Child(Base, flag=True)`).
    * For auto-registration: `class Plugin: _registry = {}` then define `def __init_subclass__(cls, **kwargs): super().__init_subclass__(**kwargs); Plugin._registry[cls.__name__] = cls`. Now any class that inherits from `Plugin` is automatically registered. No metaclass needed, no decorator needed, and it works transparently with multiple inheritance.
    * The key advantage over metaclasses is composability. Multiple parent classes can each define `__init_subclass__`, and they all get called via the MRO (as long as they call `super().__init_subclass__(**kwargs)`). With metaclasses, having two parent classes with different metaclasses causes a `TypeError` unless you manually create a combined metaclass.
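
    The auto-registration pattern described above can be sketched as follows. `CsvExporter` and `JsonExporter` are hypothetical subclasses used only to show the registry filling up:

    ```python
    class Plugin:
        _registry = {}

        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            # `cls` is the newly created subclass, not Plugin itself
            Plugin._registry[cls.__name__] = cls

    class CsvExporter(Plugin):
        pass

    class JsonExporter(Plugin):
        pass

    print(sorted(Plugin._registry))  # ['CsvExporter', 'JsonExporter']
    ```

    Keying the registry by `cls.__name__` is a simplification; two subclasses with the same name in different modules would overwrite each other, so a real registry might use a qualified name instead.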
  </Accordion>

  <Accordion title="Explain Python's Method Resolution Order (MRO) and the diamond problem. How does `super()` actually work?">
    **Strong Answer:**

    * The MRO is the order in which Python searches for methods in a class hierarchy. For single inheritance, it is straightforward: child, parent, grandparent, ..., object. For multiple inheritance, it gets complex because a class can appear in multiple inheritance chains.
    * Python uses the C3 linearization algorithm to compute the MRO. C3 guarantees three properties: (1) subclasses come before their parents, (2) if a class inherits from A then B, A comes before B in the MRO, and (3) the order is monotonic -- a class's MRO preserves the relative order found in each of its parents' MROs. If no ordering satisfies all three, Python raises a `TypeError` when the class is created. You can inspect any class's MRO with `MyClass.__mro__` or `MyClass.mro()`.
    * The diamond problem occurs when a class D inherits from B and C, and both B and C inherit from A. Without a proper MRO, A's methods could be called twice. C3 linearization ensures A appears only once in the MRO, at the end: `D -> B -> C -> A -> object`.
    * `super()` does not mean "call the parent class." It means "call the next class in the MRO." This distinction is critical for cooperative multiple inheritance. When `B.method()` calls `super().method()`, it does not call `A.method()` (B's parent). It calls the next class in the MRO of the actual instance, which might be `C.method()` if the instance is of type D. This is how all classes in the diamond get called exactly once.
    * The practical implication: if you use `super()`, all cooperating classes must follow the same protocol -- same method signature (or use `*args, **kwargs` to forward unknown arguments) and all must call `super()` in turn. If any class breaks the chain by not calling `super()`, downstream classes in the MRO get skipped. This is cooperative multiple inheritance, and it requires cooperation from all participants.
    * In production, deep multiple inheritance hierarchies are rare and usually a design smell. Mixins (small, focused classes that add a single behavior) are the common pattern: `class APIView(AuthMixin, LoggingMixin, View)`. Each mixin adds one capability, and `super()` chains them correctly through the MRO.

    **Follow-up: What happens if two parent classes define the same method and neither calls `super()`? How do you debug MRO-related issues?**

    * If neither calls `super()`, only the first class in the MRO wins. The second class's method is completely shadowed. This is not an error -- Python silently resolves the conflict by MRO order. This can cause subtle bugs where a class thinks its method is being called but it never is.
    * To debug: first, print the MRO with `MyClass.__mro__` to see the exact resolution order. Second, use `super()` explicitly with two arguments to call a specific class's version: `super(SpecificClass, self).method()` calls the next class after `SpecificClass` in the MRO. Third, for complex hierarchies, the `inspect` module's `getmro()` function can help, and you can trace method calls by adding logging to each class's method to see which ones actually execute.
  </Accordion>

  <Accordion title="What is the difference between `__str__` and `__repr__`, and why should every production class implement at least `__repr__`?">
    **Strong Answer:**

    * `__repr__` is for developers. It should return an unambiguous string representation that ideally could recreate the object: `repr(Point(1, 2))` should return `"Point(1, 2)"`. It is used in the REPL, in debugger displays, in log messages, and as a fallback when `__str__` is not defined. The convention is that `eval(repr(obj))` should produce an equivalent object when possible.
    * `__str__` is for end users. It returns a human-readable, "pretty" string: `str(datetime.now())` returns something like `"2024-01-15 10:30:00.123456"`, not the full constructor call. `print()` calls `__str__`. If `__str__` is not defined, Python falls back to `__repr__`.
    * Every production class should implement `__repr__` because when something goes wrong at 3 AM and you are reading logs, seeing `<MyObject object at 0x7f...>` is useless. Seeing `MyObject(id=42, status='pending', retries=3)` immediately tells you what state the object was in when the error occurred. This is not a nice-to-have -- it is the difference between a 10-minute debugging session and a 2-hour one.
    * Dataclasses and named tuples give you `__repr__` for free, which is another reason to prefer them for data-holding classes. For classes with complex internal state, implement `__repr__` to show the most diagnostically useful fields, not every internal detail.
    * A subtle best practice: `__repr__` output should always include the class name, and that name should not be hardcoded. If `SubClass` inherits from `BaseClass` and `BaseClass.__repr__` hardcodes its own name, the output will say "BaseClass(...)" even for `SubClass` instances. Use `type(self).__name__` in your `__repr__` to get the actual class name dynamically.

    **Follow-up: What are the most important dunder methods to implement for a class that will be used as a dictionary key or stored in a set?**

    * You must implement `__eq__` and `__hash__` together. Defining `__eq__` alone sets `__hash__` to `None`, which makes instances unhashable. The contract is: if `a == b`, then `hash(a) == hash(b)`. The reverse does not need to hold (hash collisions are fine). If you break this contract, dict lookups can silently fail to find keys that exist.
    * For `__hash__`, a common pattern is to hash a tuple of the fields that determine equality: `def __hash__(self): return hash((self.x, self.y))`. Use only immutable fields -- if a field can change, the hash changes, and the object becomes "lost" in the hash table.
    * You should also implement `__repr__` for debuggability, and consider `__lt__` if you want the objects to be sortable (needed for use with `sorted()`, `min()`, `max()` with multiple objects).
    * If you use `@dataclass(frozen=True)`, you get `__hash__`, `__eq__`, and `__repr__` for free, and the `frozen` flag prevents mutation, which guarantees hash stability. This is the recommended approach for value objects that serve as dict keys.
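
    For cases where a dataclass is not an option, the hand-written version of the contract above might look like this. It is a minimal sketch; the underscore-prefixed fields signal that they are not meant to change after construction:

    ```python
    class Point:
        def __init__(self, x, y):
            self._x = x
            self._y = y

        def __eq__(self, other):
            if not isinstance(other, Point):
                return NotImplemented
            return (self._x, self._y) == (other._x, other._y)

        def __hash__(self):
            # Hash the same fields that __eq__ compares
            return hash((self._x, self._y))

        def __repr__(self):
            return f"{type(self).__name__}({self._x}, {self._y})"

    visited = {Point(1, 2)}
    print(Point(1, 2) in visited)  # True
    print(Point(2, 1) in visited)  # False
    print(visited)                 # {Point(1, 2)}
    ```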
  </Accordion>

  <Accordion title="Compare `dataclass`, `NamedTuple`, `TypedDict`, and plain `dict` for representing structured data. When do you reach for each?">
    **Strong Answer:**

    * `dataclass` is the go-to for most structured data. It generates `__init__`, `__repr__`, `__eq__`, and optionally `__hash__` (with `frozen=True`). It supports default values, field-level metadata, post-init processing (`__post_init__`), and inheritance. Use it when you need a proper class with methods, validation, or behavior attached to the data.
    * `NamedTuple` (from `typing`) creates an immutable tuple subclass with named fields. It is lighter weight than a dataclass -- instances use less memory because they are backed by tuples, not dicts. It is hashable by default. Use it for simple, immutable records where you want tuple unpacking: `x, y = point`. The downside is no mutable fields and limited customization.
    * `TypedDict` (from `typing`) is for typing existing dictionary patterns. Values annotated with it are ordinary `dict` objects at runtime -- the definition only tells type checkers such as mypy what keys a dict should have and what types they should be. Use it when you are working with JSON data or APIs that return plain dicts and you want type safety without converting to a class. Because the values are plain dicts, `TypedDict` adds no per-instance runtime overhead, and it also performs no runtime validation.
    * Plain `dict` is for genuinely dynamic key-value data where the keys are not known at definition time: configuration from files, JSON payloads being passed through, aggregation results. If you find yourself accessing `d["name"]` and `d["age"]` repeatedly with known keys, that is a signal to use a dataclass or NamedTuple instead -- you gain autocompletion, type checking, and `AttributeError` instead of `KeyError`.
    * The decision framework: Is the data immutable and simple? `NamedTuple`. Is the data mutable, has methods, or needs validation? `dataclass`. Is it a JSON blob from an external API you want to type-check but not convert? `TypedDict`. Is the shape truly dynamic? Plain `dict`. In practice, `dataclass` is the most common answer.
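
    A side-by-side sketch of the three typed options may help when choosing. All class and field names here (`UserRecord`, `Coordinate`, `UserPayload`) are hypothetical:

    ```python
    from dataclasses import dataclass
    from typing import NamedTuple, TypedDict

    @dataclass
    class UserRecord:  # Mutable class with generated __init__/__repr__/__eq__
        name: str
        age: int

    class Coordinate(NamedTuple):  # Immutable, supports tuple unpacking
        lat: float
        lon: float

    class UserPayload(TypedDict):  # Type hint only; values are plain dicts
        name: str
        age: int

    user = UserRecord("Ada", 36)
    user.age += 1  # Dataclass fields are mutable by default

    lat, lon = Coordinate(52.5, 13.4)  # Unpacks like a tuple

    payload: UserPayload = {"name": "Ada", "age": 36}
    print(type(payload) is dict)  # True

    print(user)  # UserRecord(name='Ada', age=37)
    print(lat)   # 52.5
    ```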

    **Follow-up: What is `@dataclass(slots=True)` and why would you use it?**

    * Python 3.10 added `slots=True` to dataclasses, which automatically generates `__slots__` for the class. Normally, Python objects store their attributes in a per-instance `__dict__` (a dictionary), which can add on the order of 100-200 bytes of overhead per instance, depending on the Python version and the number of attributes. With `__slots__`, attributes are stored in fixed per-instance storage, eliminating the `__dict__` entirely.
    * The benefits are twofold: memory savings (often quoted as roughly 30-40% per instance for small objects) and slightly faster attribute access (direct slot access vs. dictionary lookup). For a service holding 1 million user objects in memory, that per-instance saving can add up to something like 100-200MB of RAM; measure on your own workload before relying on a specific figure.
    * The trade-off: slotted classes cannot have arbitrary attributes added dynamically (`obj.new_attr = value` raises `AttributeError`). This also means libraries that rely on `__dict__` (some serialization libraries, some debugging tools) may not work. But for data containers, the restriction is actually a feature -- it prevents accidental attribute creation from typos, like `user.nane = "Alice"` silently succeeding on a regular object but raising an error on a slotted one.
  </Accordion>
</AccordionGroup>
