Object-Oriented Programming (OOP)

In Python, everything is an object. Functions are objects, numbers are objects, even classes themselves are objects (they are instances of type). OOP allows you to create your own custom types to model your problem domain. A useful analogy: a class is like a blueprint for a house. The blueprint defines the layout (attributes) and what the house can do (methods like “open garage door”). Each house built from that blueprint is an object (instance). The blueprint is shared, but each house has its own address, paint color, and residents.

1. Classes & Objects

A Class is the blueprint. An Object is the instance.
class Dog:
    # Class Attribute: Shared by ALL instances of Dog
    species = "Canis familiaris"

    # Initializer (Constructor)
    # 'self' refers to the specific object being created
    def __init__(self, name, age):
        # Instance Attributes: Unique to each instance
        self.name = name
        self.age = age

    # Instance Method
    def bark(self):
        return f"{self.name} says Woof!"

# Usage
buddy = Dog("Buddy", 5)
print(buddy.bark())

The self Parameter

In Python, you must explicitly define self as the first parameter of instance methods. It is how the method knows which object it is operating on. (It is similar to this in Java/C++, but explicit — Python’s “explicit is better than implicit” philosophy in action).
self is just a convention, not a keyword. You could call it this or me and Python would not care. But deviating from self is a surefire way to confuse every Python developer who reads your code. Follow the convention.
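
To make the role of self concrete, here is a minimal sketch (the Dog class is restated in trimmed form so the snippet runs on its own). It shows that calling a method on an instance is the same as calling the function on the class and passing the instance explicitly.

```python
class Dog:
    def __init__(self, name):
        self.name = name

    def bark(self):
        return f"{self.name} says Woof!"

buddy = Dog("Buddy")

# These two calls are equivalent: Python passes `buddy` as `self` for you.
via_instance = buddy.bark()
via_class = Dog.bark(buddy)

print(via_instance)               # Buddy says Woof!
print(via_instance == via_class)  # True
```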

2. Inheritance

Inheritance allows you to create specialized versions of existing classes. Think of it as an “is-a” relationship: a Cat is an Animal, so it inherits everything Animal provides and can override or extend behaviors.
class Animal:
    def speak(self):
        raise NotImplementedError("Subclasses must implement speak()")

class Cat(Animal):
    def speak(self):
        return "Meow"

class Dog(Animal):
    def speak(self):
        return "Woof"

# Polymorphism: treat different types uniformly
animals = [Cat(), Dog(), Cat()]
for animal in animals:
    print(animal.speak())  # Each calls its own version of speak()

super()

Use super() to call methods from the parent class. This is how you extend behavior (add to it) rather than replace it.
class GoldenRetriever(Dog):
    def speak(self):
        # Call parent's speak(), then add to it
        return super().speak() + " (Golden Style)"

The MRO (Method Resolution Order)

Python supports multiple inheritance — a class can inherit from more than one parent. When it does, Python uses the C3 linearization algorithm to determine which method to call. This order is called the MRO.
class A:
    def greet(self):
        return "A"

class B(A):
    def greet(self):
        return "B"

class C(A):
    def greet(self):
        return "C"

class D(B, C):
    pass  # Inherits from both B and C

print(D().greet())   # "B" -- B comes before C in the MRO
print(D.__mro__)     # (D, B, C, A, object) -- the full resolution order
Inheritance Pitfall: Favor composition over inheritance for code reuse. Deep inheritance hierarchies (more than 2-3 levels) become fragile and hard to reason about. If you find yourself inheriting just to reuse a few methods, consider passing the shared behavior as a collaborator object instead.
# Fragile: deep inheritance for code reuse
class Animal: ...
class Pet(Animal): ...
class DomesticDog(Pet): ...
class GoldenRetriever(DomesticDog): ...  # 4 levels deep -- changes anywhere break things

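# Better: composition -- "has a" instead of "is a"
class TrickSet:
    """Placeholder collaborator that holds the tricks a dog knows."""
    def __init__(self, tricks=()):
        self.tricks = list(tricks)

class Dog:
    def __init__(self, breed, tricks=None):
        self.breed = breed
        # Dog HAS a TrickSet, not IS a TrickSet
        self.tricks = tricks if tricks is not None else TrickSet()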

3. Magic Methods (Dunder Methods)

Python classes can integrate tightly with language syntax using “Magic Methods” (Double UNDERscore methods, hence “dunder”). This is Python’s protocol system — by implementing specific dunder methods, your objects can behave like built-in types. Think of it as a contract: if you implement __add__, Python will call it whenever someone uses + on your object. The most commonly used dunder methods:
| Method | Triggered By | Purpose |
| --- | --- | --- |
| __init__ | MyClass() | Initialize a new instance |
| __str__ | print(obj), str(obj) | Human-readable string |
| __repr__ | repr(obj), debugger display | Unambiguous string (ideally eval-able) |
| __len__ | len(obj) | Return the length |
| __getitem__ | obj[key] | Enable indexing and slicing |
| __eq__ | obj1 == obj2 | Equality comparison |
| __hash__ | hash(obj), dict key | Make object hashable |
| __add__ | obj1 + obj2 | Addition operator |
| __enter__/__exit__ | with obj: | Context manager protocol |
| __iter__/__next__ | for x in obj: | Iterator protocol |
class Vector:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # __repr__ should be unambiguous -- ideally you could copy-paste to recreate
        return f"Vector({self.x}, {self.y})"

    def __str__(self):
        # __str__ is for human-friendly display
        return f"({self.x}, {self.y})"

    def __add__(self, other):
        # Called when you do v1 + v2
        return Vector(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called when you do v1 == v2
        if not isinstance(other, Vector):
            return NotImplemented  # Let Python try the other operand
        return self.x == other.x and self.y == other.y

v1 = Vector(1, 2)
v2 = Vector(3, 4)
v3 = v1 + v2

print(v3)        # Output: (4, 6)      -- calls __str__
print(repr(v3))  # Output: Vector(4, 6) -- calls __repr__
print(v1 == Vector(1, 2))  # True       -- calls __eq__
Dunder Pitfall: __str__ vs __repr__ — If you only implement one, implement __repr__. Python falls back to __repr__ when __str__ is not defined, but not the other way around. In practice, __repr__ is what you see in the debugger, in logs, and in list displays ([Vector(1, 2), Vector(3, 4)]). It matters more than __str__ for day-to-day development.
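
The fallback direction is easy to check. In this small sketch, OnlyRepr and OnlyStr are hypothetical classes that each define just one of the two methods.

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"

class OnlyStr:
    def __str__(self):
        return "a friendly string"

print(str(OnlyRepr()))   # OnlyRepr() -- str() falls back to __repr__
print([OnlyRepr()])      # [OnlyRepr()] -- containers always use __repr__
print(str(OnlyStr()))    # a friendly string

# repr() does NOT fall back to __str__; it uses object's default <... object at 0x...>
print(repr(OnlyStr()).startswith("<"))  # True
```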

4. Properties

In Java, you write getVariable() and setVariable(). In Python, we prefer direct access (obj.variable). But what if you need validation? Use the @property decorator. It lets you use a method as if it were an attribute.
class Circle:
    def __init__(self, radius):
        self._radius = radius # Convention: _variable means "internal use only"

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("Radius cannot be negative")
        self._radius = value

c = Circle(5)
c.radius = 10 # Calls the setter method!
# c.radius = -1 # Raises ValueError
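
One detail worth noting: the Circle class above assigns self._radius directly in __init__, so Circle(-1) is not validated. The sketch below is one possible variation, not the only way to do it. It routes the initializer through the setter and adds a read-only area property (area is an illustrative addition).

```python
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius  # Goes through the setter, so validation also runs here

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("Radius cannot be negative")
        self._radius = value

    @property
    def area(self):
        # Read-only: no setter is defined, so assignment raises AttributeError
        return math.pi * self._radius ** 2

c = Circle(2)
print(round(c.area, 2))  # 12.57

try:
    Circle(-1)
except ValueError as exc:
    print(exc)  # Radius cannot be negative

try:
    c.area = 10
except AttributeError:
    print("area is read-only")
```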

5. Dataclasses (Python 3.7+)

If you are writing a class just to hold data (like a struct in C or a record in Java), standard classes are verbose — you write __init__, __repr__, __eq__ by hand every time. Dataclasses automate this boilerplate while still giving you full control.
from dataclasses import dataclass, field

@dataclass
class Point:
    x: int
    y: int
    z: int = 0  # Default value

p1 = Point(1, 2)
p2 = Point(1, 2)

print(p1)        # Output: Point(x=1, y=2, z=0) -- auto-generated __repr__
print(p1 == p2)  # Output: True -- auto-generated __eq__ compares field values

Frozen Dataclasses (Immutable)

@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool = False

config = Config("localhost", 8080)
# config.port = 9090  # FrozenInstanceError! Cannot modify.
# Frozen dataclasses are hashable -- can be used as dict keys or set members
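
As a runnable illustration of both claims, this sketch catches the error on assignment and then uses a Config as a dict key. The connections dict is a made-up example.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool = False

config = Config("localhost", 8080)

try:
    config.port = 9090
except FrozenInstanceError:
    print("Config is immutable")

# Equal field values give equal hashes, so an equal instance finds the same entry
connections = {config: "primary"}
print(connections[Config("localhost", 8080)])  # primary
```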
Dataclass Mutable Default Pitfall — The same mutable default trap from functions applies here. Never use a mutable object as a default directly. Use field(default_factory=...) instead.
from dataclasses import dataclass, field

# WRONG -- the dataclass decorator raises a ValueError to protect you
# @dataclass
# class Team:
#     members: list = []  # ValueError: mutable default <class 'list'> for field members is not allowed: use default_factory

# CORRECT -- use field with default_factory
@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)  # Fresh list per instance
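
A quick check, under the same Team definition, that each instance really gets its own list. The BrokenTeam class is a hypothetical counter-example that shows which exception the decorator raises.

```python
from dataclasses import dataclass, field

@dataclass
class Team:
    name: str
    members: list = field(default_factory=list)

red = Team("Red")
blue = Team("Blue")
red.members.append("Ada")

print(red.members)   # ['Ada']
print(blue.members)  # [] -- not shared with red

error_name = None
try:
    @dataclass
    class BrokenTeam:
        members: list = []  # rejected when the decorator runs
except ValueError as exc:
    error_name = type(exc).__name__

print(error_name)  # ValueError
```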

__slots__ for Memory Optimization

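By default, Python objects store their attributes in a per-instance __dict__ dictionary. For classes with many instances, this overhead adds up. __slots__ replaces the dict with fixed storage for the named attributes, which typically reduces per-instance memory. Figures of 30-40% are commonly cited for small objects, but the exact saving depends on the Python version and the number of attributes.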
class PointWithSlots:
    __slots__ = ("x", "y")  # Only these attributes are allowed

    def __init__(self, x, y):
        self.x = x
        self.y = y
        # self.z = 0  # AttributeError! z is not in __slots__

# In Python 3.10+, dataclasses support slots directly:
@dataclass(slots=True)
class Point:
    x: float
    y: float
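
The behavioral difference can be observed directly. This sketch compares a regular class with a slotted one; PointWithDict is a hypothetical counterpart added for the comparison. It uses a hand-written __slots__ so it runs on any Python 3 version.

```python
class PointWithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class PointWithSlots:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

regular = PointWithDict(1, 2)
slotted = PointWithSlots(1, 2)

print(hasattr(regular, "__dict__"))  # True
print(hasattr(slotted, "__dict__"))  # False

regular.z = 3          # Fine: stored in the instance __dict__
try:
    slotted.z = 3      # Not listed in __slots__
except AttributeError:
    print("cannot add z to a slotted instance")
```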

6. Duck Typing and Protocols

Python follows the principle of duck typing: “If it walks like a duck and quacks like a duck, it is a duck.” You do not check what an object is — you check what it can do.
# This function works with ANY object that has a .read() method --
# a file, a StringIO, a network socket, a mock in tests.
def process_stream(stream):
    data = stream.read()
    return data.upper()

# Since Python 3.8, you can formalize duck typing with Protocol:
from typing import Protocol

class Readable(Protocol):
    def read(self) -> str: ...

def process_stream(stream: Readable) -> str:  # Type checker enforces the contract
    return stream.read().upper()
Protocols are Python’s answer to Go interfaces — they define what an object must be able to do without requiring inheritance. Use them for type safety without coupling. This is the Pythonic alternative to Java-style abstract base classes in most situations.
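
A small sketch of structural typing at work. FakeSensor is a hypothetical class with no relation to files, and the optional runtime_checkable decorator is added here so that isinstance() can be used. Note that the runtime check only verifies that a read attribute exists, not its signature.

```python
import io
from typing import Protocol, runtime_checkable

@runtime_checkable
class Readable(Protocol):
    def read(self) -> str: ...

class FakeSensor:
    """Hypothetical class: it never inherits from Readable, but it has read()."""
    def read(self) -> str:
        return "21.5c"

def process_stream(stream: Readable) -> str:
    return stream.read().upper()

print(process_stream(io.StringIO("hello")))  # HELLO
print(process_stream(FakeSensor()))          # 21.5C
print(isinstance(FakeSensor(), Readable))    # True
print(isinstance(42, Readable))              # False
```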

Summary

  • Classes: Encapsulate data and behavior. Use self explicitly.
  • Inheritance: Prefer composition over inheritance. Check the MRO with __mro__.
  • Magic Methods: Implement __repr__ first, then __str__. Make your objects behave like built-in types.
  • Properties: Add validation logic without changing the API (@property).
  • Dataclasses: The modern way to define data containers. Use frozen=True for immutability, field(default_factory=...) for mutable defaults, and slots=True for memory efficiency.
  • Duck Typing: Check capabilities, not types. Use Protocol for type-safe duck typing.
Next, we’ll learn how to organize code into Modules and Packages.

Interview Deep-Dive

Strong Answer:
  • A metaclass is the class of a class. Just as an object is an instance of a class, a class is an instance of a metaclass. By default, all classes in Python are instances of type. When you write class Foo: pass, Python in effect calls type("Foo", (), namespace) to create the class object, with object added as the implicit base. A custom metaclass lets you intercept and modify this class creation process.
  • You define a metaclass by subclassing type and overriding __new__ or __init__. __new__ is called before the class object is created (you can modify the class name, bases, or namespace). __init__ is called after the class object exists (you can modify the class in place). __init_subclass__ (Python 3.6+) is a lighter-weight alternative that hooks into subclass creation without a full metaclass.
  • Real production use cases are narrow but powerful. ORMs like Django’s Model and SQLAlchemy’s declarative base use metaclasses to inspect class attributes (field definitions) at class creation time and build database table mappings. API frameworks use them to automatically register endpoint classes. Serialization libraries use them to generate schema validation code when the class is defined rather than at runtime.
  • The important nuance: metaclasses are almost always overkill. Python 3.6+ introduced __init_subclass__, which handles 80% of the use cases that previously required metaclasses (validating subclass attributes, auto-registering subclasses, injecting behavior). Class decorators handle another 15%. Actual metaclasses are the remaining 5% — when you need to control the class namespace before the class body executes (using __prepare__), or when you need metaclass inheritance to propagate behavior automatically through a class hierarchy.
  • The senior answer to “should I use a metaclass?” is almost always “no, use __init_subclass__ or a class decorator first.” Metaclasses add cognitive overhead, make debugging harder (stack traces go through type.__new__), and create composability problems (you cannot easily combine two metaclasses). They are a power tool for framework authors, not application code.
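
For reference, a minimal metaclass sketch. RegistryMeta, Handler, JsonHandler, and XmlHandler are hypothetical names chosen for illustration, not taken from any framework. It intercepts class creation in __new__ and records each subclass.

```python
class RegistryMeta(type):
    """Hypothetical metaclass that records every subclass it creates."""
    registry = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:  # skip the root class, which declares no bases
            mcls.registry[name] = cls
        return cls

class Handler(metaclass=RegistryMeta):
    pass

class JsonHandler(Handler):
    pass

class XmlHandler(Handler):
    pass

print(type(Handler) is RegistryMeta)  # True -- the class is an instance of its metaclass
print(sorted(RegistryMeta.registry))  # ['JsonHandler', 'XmlHandler']

# A class statement is roughly equivalent to calling the metaclass directly
Dynamic = type("Dynamic", (), {"greet": lambda self: "hi"})
print(Dynamic().greet())              # hi
```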
Follow-up: What is __init_subclass__ and how does it replace metaclasses for common patterns like auto-registration?
  • __init_subclass__ is an implicit class method that is called automatically whenever the class is subclassed. It receives the new subclass as cls, plus any keyword arguments passed in the class definition.
  • For auto-registration: class Plugin: _registry = {} then define def __init_subclass__(cls, **kwargs): super().__init_subclass__(**kwargs); Plugin._registry[cls.__name__] = cls. Now any class that inherits from Plugin is automatically registered. No metaclass needed, no decorator needed, and it works transparently with multiple inheritance.
  • The key advantage over metaclasses is composability. Multiple parent classes can each define __init_subclass__, and they all get called via the MRO (as long as they call super().__init_subclass__(**kwargs)). With metaclasses, having two parent classes with different metaclasses causes a TypeError unless you manually create a combined metaclass.
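
The auto-registration pattern described above, written out as a runnable sketch. The key keyword argument and the exporter class names are illustrative assumptions. They show how keyword arguments in the class statement reach __init_subclass__.

```python
class Plugin:
    _registry = {}

    def __init_subclass__(cls, key=None, **kwargs):
        # Forward unknown keywords so other classes in the MRO can cooperate
        super().__init_subclass__(**kwargs)
        Plugin._registry[key or cls.__name__] = cls

class CsvExporter(Plugin, key="csv"):  # registered under the explicit key
    pass

class PdfExporter(Plugin):             # registered under its class name
    pass

print(sorted(Plugin._registry))               # ['PdfExporter', 'csv']
print(Plugin._registry["csv"] is CsvExporter)  # True
```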
Strong Answer:
  • The MRO is the order in which Python searches for methods in a class hierarchy. For single inheritance, it is straightforward: child, parent, grandparent, …, object. For multiple inheritance, it gets complex because a class can appear in multiple inheritance chains.
  • Python uses the C3 linearization algorithm to compute the MRO. C3 guarantees three properties: (1) subclasses come before their parents, (2) if a class inherits from A then B, A comes before B in the MRO, and (3) the order is consistent — the same class always appears in the same relative position. You can inspect any class’s MRO with MyClass.__mro__ or MyClass.mro().
  • The diamond problem occurs when a class D inherits from B and C, and both B and C inherit from A. Without a proper MRO, A’s methods could be called twice. C3 linearization ensures A appears only once in the MRO, at the end: D -> B -> C -> A -> object.
  • super() does not mean “call the parent class.” It means “call the next class in the MRO.” This distinction is critical for cooperative multiple inheritance. When B.method() calls super().method(), it does not call A.method() (B’s parent). It calls the next class in the MRO of the actual instance, which might be C.method() if the instance is of type D. This is how all classes in the diamond get called exactly once.
  • The practical implication: if you use super(), all cooperating classes must follow the same protocol — same method signature (or use *args, **kwargs to forward unknown arguments) and all must call super() in turn. If any class breaks the chain by not calling super(), downstream classes in the MRO get skipped. This is cooperative multiple inheritance, and it requires cooperation from all participants.
  • In production, deep multiple inheritance hierarchies are rare and usually a design smell. Mixins (small, focused classes that add a single behavior) are the common pattern: class APIView(AuthMixin, LoggingMixin, View). Each mixin adds one capability, and super() chains them correctly through the MRO.
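
The "next class in the MRO" behavior of super() can be seen in a small diamond. This is a sketch with a hypothetical build() method that records the order in which classes are visited.

```python
class A:
    def build(self):
        return ["A"]

class B(A):
    def build(self):
        return ["B"] + super().build()

class C(A):
    def build(self):
        return ["C"] + super().build()

class D(B, C):
    def build(self):
        return ["D"] + super().build()

print([cls.__name__ for cls in D.__mro__])  # ['D', 'B', 'C', 'A', 'object']
print(D().build())  # ['D', 'B', 'C', 'A'] -- B's super() reached C, not A
print(B().build())  # ['B', 'A'] -- for a plain B instance, the next class is A
```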
Follow-up: What happens if two parent classes define the same method and neither calls super()? How do you debug MRO-related issues?
  • If neither calls super(), only the first class in the MRO wins. The second class’s method is completely shadowed. This is not an error — Python silently resolves the conflict by MRO order. This can cause subtle bugs where a class thinks its method is being called but it never is.
  • To debug: first, print the MRO with MyClass.__mro__ to see the exact resolution order. Second, use super() explicitly with two arguments to call a specific class’s version: super(SpecificClass, self).method() calls the next class after SpecificClass in the MRO. Third, for complex hierarchies, the inspect module’s getmro() function can help, and you can trace method calls by adding logging to each class’s method to see which ones actually execute.
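
A short sketch of the shadowing case and the debugging steps. JsonMixin, XmlMixin, and Report are hypothetical classes in which neither mixin calls super().

```python
import inspect

class JsonMixin:
    def export(self):
        return "json"

class XmlMixin:
    def export(self):
        return "xml"

class Report(JsonMixin, XmlMixin):
    pass

report = Report()
print(report.export())  # json -- XmlMixin.export is silently shadowed

print([cls.__name__ for cls in inspect.getmro(Report)])
# ['Report', 'JsonMixin', 'XmlMixin', 'object']

# Two-argument super(): start the lookup *after* JsonMixin in Report's MRO
print(super(JsonMixin, report).export())  # xml
```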
Strong Answer:
  • __repr__ is for developers. It should return an unambiguous string representation that ideally could recreate the object: repr(Point(1, 2)) should return "Point(1, 2)". It is used in the REPL, in debugger displays, in log messages, and as a fallback when __str__ is not defined. The convention is that eval(repr(obj)) should produce an equivalent object when possible.
  • __str__ is for end users. It returns a human-readable, “pretty” string: str(datetime.now()) returns something like "2024-01-15 10:30:00.123456", not the constructor-style datetime.datetime(2024, 1, 15, 10, 30, 0, 123456) that repr() gives. print() calls __str__. If __str__ is not defined, Python falls back to __repr__.
  • Every production class should implement __repr__ because when something goes wrong at 3 AM and you are reading logs, seeing <MyObject object at 0x7f...> is useless. Seeing MyObject(id=42, status='pending', retries=3) immediately tells you what state the object was in when the error occurred. This is not a nice-to-have — it is the difference between a 10-minute debugging session and a 2-hour one.
  • Dataclasses and named tuples give you __repr__ for free, which is another reason to prefer them for data-holding classes. For classes with complex internal state, implement __repr__ to show the most diagnostically useful fields, not every internal detail.
  • A subtle best practice: __repr__ output should always include the class name, and the name should not be hard-coded. If BaseClass's __repr__ returns a literal “BaseClass(…)” and SubClass inherits it, the output will say “BaseClass(…)” even for SubClass instances. Use type(self).__name__ in your __repr__ to get the actual class name dynamically.
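
A minimal sketch of a subclass-friendly __repr__. Task and UrgentTask are hypothetical classes, and the fields shown are placeholders for whatever is most useful when debugging.

```python
class Task:
    def __init__(self, task_id, status="pending"):
        self.task_id = task_id
        self.status = status

    def __repr__(self):
        # type(self).__name__ picks up the subclass name automatically;
        # !r applies repr() to each field so strings keep their quotes
        return f"{type(self).__name__}(task_id={self.task_id!r}, status={self.status!r})"

class UrgentTask(Task):
    pass

print(repr(Task(42)))          # Task(task_id=42, status='pending')
print(repr(UrgentTask(7)))     # UrgentTask(task_id=7, status='pending')
print([Task(1, "done")])       # [Task(task_id=1, status='done')]
```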
Follow-up: What are the most important dunder methods to implement for a class that will be used as a dictionary key or stored in a set?
  • You must implement __hash__ and __eq__. The contract is: if a == b, then hash(a) == hash(b). The reverse does not need to hold (hash collisions are fine). If you break this contract, dict lookups will silently fail to find keys that exist.
  • For __hash__, a common pattern is to hash a tuple of the fields that determine equality: def __hash__(self): return hash((self.x, self.y)). Use only immutable fields — if a field can change, the hash changes, and the object becomes “lost” in the hash table.
  • You should also implement __repr__ for debuggability, and consider __lt__ if you want the objects to be sortable (needed for use with sorted(), min(), max() with multiple objects).
  • If you use @dataclass(frozen=True), you get __hash__, __eq__, and __repr__ for free, and the frozen flag prevents mutation, which guarantees hash stability. This is the recommended approach for value objects that serve as dict keys.
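
The hash/equality contract can be sketched with a hand-written class. Point and NoHash are illustrative names, and the underscore-prefixed fields signal that they should not be mutated after construction. The second class shows what Python does when __eq__ is defined without __hash__.

```python
class Point:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares
        return hash((self._x, self._y))

    def __repr__(self):
        return f"Point({self._x}, {self._y})"

visited = {Point(1, 2)}
print(Point(1, 2) in visited)                        # True
print(len({Point(1, 2), Point(1, 2), Point(3, 4)}))  # 2

class NoHash:
    def __eq__(self, other):
        return isinstance(other, NoHash)

# Defining __eq__ without __hash__ makes instances unhashable
print(NoHash.__hash__ is None)  # True
```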
Strong Answer:
  • dataclass is the go-to for most structured data. It generates __init__, __repr__, __eq__, and optionally __hash__ (with frozen=True). It supports default values, field-level metadata, post-init processing (__post_init__), and inheritance. Use it when you need a proper class with methods, validation, or behavior attached to the data.
  • NamedTuple (from typing) creates an immutable tuple subclass with named fields. It is lighter weight than a dataclass — instances use less memory because they are backed by tuples, not dicts. It is hashable by default. Use it for simple, immutable records where you want tuple unpacking: x, y = point. The downside is no mutable fields and limited customization.
  • TypedDict (from typing) is for typing existing dictionary patterns. It does not create a new class — it is purely a type hint that tells mypy what keys a dict should have and what types they should be. Use it when you are working with JSON data or APIs that return plain dicts and you want type safety without converting to a class. TypedDict has zero runtime overhead.
  • Plain dict is for genuinely dynamic key-value data where the keys are not known at definition time: configuration from files, JSON payloads being passed through, aggregation results. If you find yourself accessing d["name"] and d["age"] repeatedly with known keys, that is a signal to use a dataclass or NamedTuple instead — you gain autocompletion, type checking, and AttributeError instead of KeyError.
  • The decision framework: Is the data immutable and simple? NamedTuple. Is the data mutable, has methods, or needs validation? dataclass. Is it a JSON blob from an external API you want to type-check but not convert? TypedDict. Is the shape truly dynamic? Plain dict. In practice, about 70% of the time the answer is dataclass.
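
To make the comparison concrete, here is the same kind of record expressed three ways. User, Coordinate, and UserPayload are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass
from typing import NamedTuple, TypedDict

@dataclass
class User:                      # mutable, can carry behavior
    name: str
    age: int

    def birthday(self):
        self.age += 1

class Coordinate(NamedTuple):    # immutable, unpackable, hashable
    lat: float
    lon: float

class UserPayload(TypedDict):    # type hint only; the value is a plain dict
    name: str
    age: int

user = User("Ada", 36)
user.birthday()
print(user)                      # User(name='Ada', age=37)

lat, lon = Coordinate(52.5, 13.4)
print(lat, lon)                  # 52.5 13.4
print(len({Coordinate(1.0, 2.0), Coordinate(1.0, 2.0)}))  # 1

payload: UserPayload = {"name": "Ada", "age": 36}
print(type(payload) is dict)     # True
```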
Follow-up: What is @dataclass(slots=True) and why would you use it?
  • Python 3.10 added slots=True to dataclasses, which automatically generates __slots__ for the class. Normally, Python objects store their attributes in a per-instance __dict__ (a dictionary), which uses about 100-200 bytes of overhead per instance. With __slots__, attributes are stored in a fixed-size array, eliminating the __dict__ entirely.
  • The benefits are twofold: memory savings (roughly 30-40% per instance for small objects) and slightly faster attribute access (direct array indexing vs. hash table lookup). For a service holding 1 million user objects in memory, slots=True can save 100-200MB of RAM.
  • The trade-off: slotted classes cannot have arbitrary attributes added dynamically (obj.new_attr = value raises AttributeError). This also means libraries that rely on __dict__ (some serialization libraries, some debugging tools) may not work. But for data containers, the restriction is actually a feature — it prevents accidental attribute creation from typos, like user.nane = "Alice" silently succeeding on a regular object but raising an error on a slotted one.