Chapter 5: Branches & Checkout
Branches are one of Git’s most powerful features. In this chapter, we’ll implement branch management and the checkout command, completing our core Git implementation. Most developers think of branches as copies of the codebase. They are not. A branch in Git is a 41-byte text file (40 hex characters plus a newline) that contains a commit hash. Creating a branch is as cheap as writing a Post-it note: you are just recording which commit the branch name should point to. This is why a Git repository can hold thousands of branches with negligible overhead, whereas older version control systems like Subversion model a branch as a copy of an entire directory tree.
Prerequisites: Completed Chapter 4: Commits & History
Time: 2-3 hours
Outcome: Working branch and checkout commands
How Branches Work
A branch in Git is just a file containing a 40-character commit hash. That’s it!
The key insight: Creating a branch is just creating a file. Switching branches is just changing HEAD and updating the working directory. The “cost” of a branch is literally 41 bytes on disk. This is why Git workflows like Git Flow, GitHub Flow, and trunk-based development with short-lived feature branches are practical: branching is essentially free.
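To make the 41-byte claim concrete, here is a minimal sketch that creates a branch by hand in a throwaway directory. The directory layout mirrors .git/refs/heads, but the hash is a made-up placeholder rather than a real commit, and the variable names are illustrative only.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Throwaway directory laid out like a minimal .git folder (not a real repository).
const gitDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mygit-demo-'));
fs.mkdirSync(path.join(gitDir, 'refs', 'heads'), { recursive: true });

// Placeholder 40-character hash standing in for a real commit.
const commitHash = 'a'.repeat(40);

// "Creating a branch" is a single small file write: hash plus newline.
const branchFile = path.join(gitDir, 'refs', 'heads', 'feature');
fs.writeFileSync(branchFile, commitHash + '\n');

const branchSize = fs.statSync(branchFile).size;
console.log(branchSize); // 41
```

Nothing else in the repository changes when the file is written, which is why the operation takes the same time regardless of how large the project is.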
Implementation
Step 1: Implement the Branch Command
src/commands/branch.js
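The chapter's own listing is not reproduced here. As a rough sketch of what a branch command needs to do, the following assumes the standard layout (HEAD and refs/heads inside the git directory). The function names resolveHead, currentBranch, createBranch, and listBranches are hypothetical and may not match your earlier chapters.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Resolve HEAD to a commit hash, or null if the current branch has no commits yet.
function resolveHead(gitDir) {
  const head = fs.readFileSync(path.join(gitDir, 'HEAD'), 'utf8').trim();
  if (!head.startsWith('ref: ')) return head; // detached HEAD holds a raw hash
  const refFile = path.join(gitDir, head.slice('ref: '.length));
  return fs.existsSync(refFile) ? fs.readFileSync(refFile, 'utf8').trim() : null;
}

// Name of the branch HEAD points at, or null when HEAD is detached.
function currentBranch(gitDir) {
  const head = fs.readFileSync(path.join(gitDir, 'HEAD'), 'utf8').trim();
  const prefix = 'ref: refs/heads/';
  return head.startsWith(prefix) ? head.slice(prefix.length) : null;
}

// Create a branch pointing at the commit HEAD currently resolves to.
function createBranch(gitDir, name) {
  const hash = resolveHead(gitDir);
  if (!hash) throw new Error('cannot create a branch before the first commit');
  const file = path.join(gitDir, 'refs', 'heads', name);
  if (fs.existsSync(file)) throw new Error(`branch '${name}' already exists`);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, hash + '\n');
  return hash;
}

// List branches, marking the current one with "*".
// Flat listing only: nested names such as feature/x would need a recursive walk.
function listBranches(gitDir) {
  const current = currentBranch(gitDir);
  return fs.readdirSync(path.join(gitDir, 'refs', 'heads'), { withFileTypes: true })
    .filter((entry) => entry.isFile())
    .map((entry) => entry.name)
    .sort()
    .map((name) => (name === current ? `* ${name}` : `  ${name}`));
}
```

A real implementation would also validate branch names and support deletion; both are left out to keep the sketch short.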
Step 2: Implement the Checkout Command
The checkout command does two things:
- Updates HEAD to point to the target branch/commit
- Updates the working directory to match
src/commands/checkout.js
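The two responsibilities of checkout can be sketched as follows. This is a simplified illustration, not the chapter's listing: readSnapshot is a hypothetical callback standing in for the tree and blob readers from earlier chapters (it returns a Map of relative path to file content for a commit hash). The sketch skips the safety check for uncommitted changes and does not update the index.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Commit hash HEAD currently resolves to, or null before the first commit.
function readHeadCommit(gitDir) {
  const head = fs.readFileSync(path.join(gitDir, 'HEAD'), 'utf8').trim();
  if (!head.startsWith('ref: ')) return head;
  const refFile = path.join(gitDir, head.slice('ref: '.length));
  return fs.existsSync(refFile) ? fs.readFileSync(refFile, 'utf8').trim() : null;
}

// readSnapshot(hash) is an assumed helper: Map of relative path -> file content.
function checkout(repoRoot, target, readSnapshot) {
  const gitDir = path.join(repoRoot, '.git');
  const branchFile = path.join(gitDir, 'refs', 'heads', target);
  const isBranch = fs.existsSync(branchFile);
  if (!isBranch && !/^[0-9a-f]{40}$/.test(target)) {
    throw new Error(`unknown branch or commit: ${target}`);
  }
  const targetHash = isBranch ? fs.readFileSync(branchFile, 'utf8').trim() : target;

  const currentHash = readHeadCommit(gitDir);
  const before = currentHash ? readSnapshot(currentHash) : new Map();
  const after = readSnapshot(targetHash);

  // Remove tracked files that the target commit does not contain.
  for (const file of before.keys()) {
    if (!after.has(file)) fs.rmSync(path.join(repoRoot, file), { force: true });
  }
  // Write every file from the target commit (a real version would skip unchanged ones).
  for (const [file, content] of after) {
    const fullPath = path.join(repoRoot, file);
    fs.mkdirSync(path.dirname(fullPath), { recursive: true });
    fs.writeFileSync(fullPath, content);
  }

  // A branch name produces a symbolic ref; a raw hash produces a detached HEAD.
  const headContent = isBranch ? `ref: refs/heads/${target}\n` : `${targetHash}\n`;
  fs.writeFileSync(path.join(gitDir, 'HEAD'), headContent);
  return { branch: isBranch ? target : null, commit: targetHash };
}
```

Updating the working directory before HEAD means a failure partway through leaves HEAD still pointing at the old commit, which is easier to recover from than the reverse.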
Step 3: Update CLI
src/mygit.js
Testing Your Implementation
Understanding Detached HEAD
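Whether HEAD is attached or detached comes down to the contents of the HEAD file. The following sketch classifies those contents; parseHead is a hypothetical helper name, and it only recognizes the two forms used in this chapter (a symbolic ref to a branch, or a raw 40-character hash).

```javascript
// Classify the contents of the HEAD file.
function parseHead(content) {
  const text = content.trim();
  const symbolic = /^ref: refs\/heads\/(.+)$/.exec(text);
  if (symbolic) return { detached: false, branch: symbolic[1] };
  if (/^[0-9a-f]{40}$/.test(text)) return { detached: true, commit: text };
  throw new Error(`unrecognized HEAD content: ${text}`);
}

console.log(parseHead('ref: refs/heads/main\n').detached); // false
console.log(parseHead('b'.repeat(40) + '\n').detached);    // true
```

A status command could use the same check to print either the branch name or a "HEAD detached at" message.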
Exercises
Exercise 1: Implement checkout for files
Allow checking out individual files from a commit:
Exercise 2: Implement switch command
Modern Git has a separate switch command (safer than checkout):
Exercise 3: Add branch tracking
Track which branch HEAD is on in status:
Complete Git Clone!
Congratulations! You’ve built a working Git implementation. Take a moment to appreciate what you’ve done: you’ve implemented the same fundamental architecture that manages billions of lines of code across millions of repositories worldwide. The patterns you’ve learned (content-addressable storage, DAG-based history, cheap branching via pointer files) are not Git-specific; they appear in databases, distributed systems, and blockchain technology. Here is what you’ve built:
init
Initialize repositories
hash-object
Hash and store files
cat-file
Read stored objects
add
Stage changes
status
Show working tree
commit
Create commits
log
View history
branch
Manage branches
checkout
Switch branches
What You’ve Learned
Content-Addressable Storage
Files stored by their SHA-1 hash, enabling deduplication and integrity
Object Model
Blobs (files), trees (directories), and commits (snapshots)
The Index
Staging area as a binary file tracking what will be committed
Branches are Pointers
Just files containing commit hashes - incredibly simple!
Further Challenges
Ready for more? Try implementing:
- Merge: Combine branches with three-way merge
- Rebase: Replay commits on a different base
- Diff: Show file differences
- Remote: Push and pull from other repositories
- Pack files: Delta compression for efficiency
Further Reading
DSA: Graph Algorithms
Essential for understanding commit graphs
Distributed Systems
How Git enables distributed version control
Next Project
Ready for a bigger challenge? Move on to:
Build Your Own Redis
Master networking and data structures by building Redis from scratch
Interview Deep-Dive
What actually happens to the working directory when you run git checkout to switch branches? Walk through the steps.
Strong Answer:
- First, Git checks for uncommitted changes that would be overwritten by the checkout. It compares the current index against the working directory to find modified files, then checks if those files differ between the current branch’s tree and the target branch’s tree. If a modified file would be overwritten, Git aborts with a safety error.
- If safe, Git reads the target commit’s root tree and compares it to the current commit’s root tree. Files that differ are updated: Git reads the new blob from the object store and writes it to the working directory. Files that are identical are untouched (no I/O needed thanks to tree-level deduplication).
- Files that exist in the current tree but not in the target tree are deleted. Files that exist in the target tree but not the current tree are created. Empty directories left behind are cleaned up.
- The index is updated to match the target commit’s tree, with fresh stat cache entries (mtime, size, inode) from the newly written files.
- Finally, HEAD is updated: if switching to a branch, HEAD is written as ref: refs/heads/<branch>; if checking out a commit hash, HEAD is written as the raw hash (detached HEAD).
- The operation is optimized to touch the minimum number of files. For two branches that differ by one file in a 10,000-file repository, checkout reads one blob and writes one file.
git switch, which is smarter about this: it checks whether the working directory changes can be preserved across the switch. The safety logic is intentionally conservative because losing uncommitted work is worse than a false refusal: the user can always stash, switch, and pop.
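The tree comparison and safety check from the answer above can be sketched as a pure function over flattened trees. This assumes each tree has already been expanded into a Map of path to blob hash; planCheckout and blockedBy are hypothetical names, and real Git compares trees recursively so it can skip identical subtrees.

```javascript
// Compare two flattened trees (Map of path -> blob hash) and decide what to touch.
function planCheckout(currentTree, targetTree) {
  const plan = { create: [], update: [], remove: [] };
  for (const [file, hash] of targetTree) {
    if (!currentTree.has(file)) plan.create.push(file);
    else if (currentTree.get(file) !== hash) plan.update.push(file);
    // Identical hashes mean identical content: the file is left alone.
  }
  for (const file of currentTree.keys()) {
    if (!targetTree.has(file)) plan.remove.push(file);
  }
  return plan;
}

// Locally modified files that the plan would overwrite or delete.
// A non-empty result means the checkout should abort.
function blockedBy(plan, modifiedFiles) {
  const touched = new Set([...plan.update, ...plan.remove]);
  return modifiedFiles.filter((file) => touched.has(file));
}
```

Note that a locally modified file that is identical in both commits does not block the switch, because the plan never touches it.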
Explain detached HEAD state. Why does it exist, when is it useful, and what is the risk?
Strong Answer:
- Detached HEAD means HEAD contains a raw commit hash instead of a symbolic reference to a branch. You enter this state by checking out a specific commit, a tag, or a remote tracking branch directly.
- It exists because sometimes you need to inspect or build from a specific historical point without affecting any branch. CI/CD systems often check out a specific commit hash for reproducible builds. git bisect uses detached HEAD internally as it navigates the commit graph looking for a bug.
- The risk is that commits made in detached HEAD state are not referenced by any branch. They are reachable only through HEAD and the reflog. If you switch to a branch, HEAD updates to point to the branch, and the detached commits become unreachable from any branch. They become eligible for garbage collection once their reflog entries expire (default 30 days for unreachable entries, 90 days for reachable ones).
- The recovery is simple if you realize in time: git branch <name> while still in detached HEAD creates a branch pointing to your current commit. If you already left, git reflog shows recent HEAD positions, and you can recover with git branch <name> <hash>.
The reflog is a per-ref log stored under .git/logs/. Every time a ref (HEAD, branch, etc.) changes, Git appends an entry with the old hash, new hash, timestamp, and the operation that caused the change. git reflog shows this log for HEAD, letting you see every commit HEAD has pointed to, even ones that are no longer reachable from any branch. This is the “undo history” for Git itself. The reflog is local-only (not pushed or fetched), expires after a configurable period, and is the mechanism behind git reset --hard HEAD@{1} (go back to where HEAD was one move ago). It is the last line of defense because it records state transitions that the commit DAG does not: specifically, the sequence of HEAD movements that led to the current state.
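Each reflog entry is one line of text: the old hash, the new hash, the identity and timestamp of whoever moved the ref, then a tab and a message. A minimal parser, with hypothetical function names, might look like this. It assumes 40-character SHA-1 hashes and a file ordered oldest entry first, which is how Git appends them.

```javascript
// Parse one line: "<old> <new> <name> <email> <timestamp> <tz>\t<message>".
function parseReflogLine(line) {
  const tab = line.indexOf('\t');
  const meta = tab === -1 ? line : line.slice(0, tab);
  const message = tab === -1 ? '' : line.slice(tab + 1);
  const match = /^([0-9a-f]{40}) ([0-9a-f]{40}) (.*) <(.*)> (\d+) ([+-]\d{4})$/.exec(meta);
  if (!match) throw new Error(`malformed reflog line: ${line}`);
  const [, oldHash, newHash, name, email, seconds, timezone] = match;
  return { oldHash, newHash, name, email, timestamp: Number(seconds), timezone, message };
}

// Parse a whole reflog file (oldest entry first, as stored on disk).
function parseReflog(text) {
  return text.split('\n').filter((line) => line !== '').map(parseReflogLine);
}

// Resolve HEAD@{n}: n = 0 is the current position, n = 1 the one before, and so on.
function reflogLookup(entries, n) {
  const entry = entries[entries.length - 1 - n];
  if (!entry) throw new Error(`reflog has no entry @{${n}}`);
  return entry.newHash;
}
```

Because the newest entry is last on disk, lookups count backwards from the end of the list, which is also the order git reflog prints.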
How would you implement a basic git merge? What is three-way merge, and why is it better than two-way merge?
Strong Answer:
- A two-way merge compares only the two branch tips. If a line differs between them, the merge cannot tell which side changed it — maybe one side added it, or maybe one side deleted it, or both modified it differently. Two-way merge produces many false conflicts.
- Three-way merge uses a common ancestor (the merge base) as the reference point. For each line, it compares: (1) base vs. branch A, (2) base vs. branch B. If only one side changed a line, the change is accepted automatically. If both sides changed the same line differently, that is a true conflict requiring manual resolution. If both sides made the same change, it is accepted once.
- To implement it: find the merge base with a lowest common ancestor algorithm on the commit DAG. Read the three trees (base, ours, theirs). For each file, compare the blob hashes. If the file changed on only one side, take that version. If it changed on both sides, diff the content and merge line by line using the three-way algorithm. Write the result as a new tree, create a merge commit with two parents, and update the branch pointer.
- The merge base is found using git merge-base, which walks both branches’ ancestor chains until it finds a common commit. For multiple merge bases (criss-cross merges), Git’s recursive strategy (and ort, its successor and the current default) merges the merge bases first to create a virtual ancestor, then uses that as the base for the actual merge.
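The file-level part of the three-way rule described above can be sketched as follows. It assumes the three trees are already flattened into Maps of path to blob hash, treats a missing path as "deleted", and reports conflicts instead of attempting a line-by-line content merge. mergeTrees is a hypothetical name, not an API from earlier chapters.

```javascript
// Three-way merge at file granularity. Each tree is a Map of path -> blob hash.
function mergeTrees(base, ours, theirs) {
  const merged = new Map();
  const conflicts = [];
  const paths = new Set([...base.keys(), ...ours.keys(), ...theirs.keys()]);
  for (const file of [...paths].sort()) {
    const b = base.get(file);   // undefined means the file is absent on that side
    const o = ours.get(file);
    const t = theirs.get(file);
    let pick;
    if (o === t) pick = o;        // both sides agree (same change, or both untouched)
    else if (o === b) pick = t;   // only theirs changed the file
    else if (t === b) pick = o;   // only ours changed the file
    else {
      conflicts.push(file);       // both changed it differently: needs a content merge
      continue;
    }
    if (pick !== undefined) merged.set(file, pick); // undefined means deleted
  }
  return { merged, conflicts };
}
```

A fuller implementation would run the same three comparisons per line (or per hunk) on each conflicting file before giving up and writing conflict markers.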
git merge --no-ff forces a merge commit even when fast-forward is possible, which some teams prefer because merge commits explicitly mark where feature branches were integrated. The choice between fast-forward and merge commits is a workflow decision, not a technical one — both produce correct results.
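Both the merge-base lookup and the fast-forward test are questions about ancestry in the commit DAG. The sketch below models the graph as a Map from commit hash to an array of parent hashes, an assumption made for illustration; a real implementation would read parents from commit objects. It favors clarity over speed by recomputing ancestor sets, and the function names are hypothetical.

```javascript
// All commits reachable from start (including start itself).
function ancestors(graph, start) {
  const seen = new Set();
  const stack = [start];
  while (stack.length > 0) {
    const hash = stack.pop();
    if (seen.has(hash)) continue;
    seen.add(hash);
    stack.push(...(graph.get(hash) || []));
  }
  return seen;
}

// Best common ancestors of a and b. More than one result means a criss-cross history.
function mergeBases(graph, a, b) {
  const fromB = ancestors(graph, b);
  const common = [...ancestors(graph, a)].filter((hash) => fromB.has(hash));
  // Drop any common ancestor that is itself an ancestor of another common ancestor.
  return common.filter((candidate) =>
    !common.some((other) => other !== candidate && ancestors(graph, other).has(candidate)));
}

// A fast-forward is possible when our tip is already an ancestor of theirs.
function canFastForward(graph, ours, theirs) {
  return ancestors(graph, theirs).has(ours);
}
```

When canFastForward returns true, the merge can simply move the branch pointer forward; forcing a merge commit in that case is the behavior --no-ff selects.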