# Build Your Own Git

> Understand version control by building Git from scratch — SHA-1 hashing, trees, commits, and branching

<img src="https://mintcdn.com/devweeekends/AEOaWh79Ur7CdHHv/images/courses/build-your-own-x/git-objects.svg?fit=max&auto=format&n=AEOaWh79Ur7CdHHv&q=85&s=dc3491653633dc7971b903f6a9a55006" alt="Git Object Model" width="1080" height="1080" data-path="images/courses/build-your-own-x/git-objects.svg" />

**Target Audience**: Students and Junior Engineers\
**Language**: JavaScript (with Java & Go alternatives)\
**Duration**: 2-3 weeks\
**Difficulty**: ⭐⭐⭐☆☆

***

## Why Build Git?

Git is one of the most widely used tools in software development, yet many developers treat it as magic. They memorize commands like incantations without understanding the machinery underneath. By building your own Git, you'll:

* **Understand content-addressable storage** -- the same idea that powers IPFS, Nix, and blockchain Merkle trees
* **Master hashing and cryptography basics** -- SHA-1 in practice, and why "same content = same hash" is so powerful
* **Learn tree data structures** -- directories as trees, commits as a DAG (Directed Acyclic Graph)
* **Build a real CLI tool** -- practical software engineering, argument parsing, and error handling
* **Debug Git problems from first principles** -- when `git rebase` goes wrong, you'll understand *why* at the object level
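
To make the "same content = same hash" idea concrete before any Git specifics, here is a minimal sketch of a content-addressed store. The `ContentStore` class and its in-memory `Map` are hypothetical names used only for illustration; they are not part of the course code, and the sketch hashes raw bytes without Git's object header.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory content-addressed store (illustration only).
class ContentStore {
    constructor() {
        this.objects = new Map(); // key (hex SHA-1) -> Buffer
    }

    // The key is derived from the bytes themselves, never chosen by the caller.
    put(content) {
        const bytes = Buffer.from(content);
        const key = crypto.createHash('sha1').update(bytes).digest('hex');
        if (!this.objects.has(key)) {
            this.objects.set(key, bytes); // identical content is stored once
        }
        return key;
    }

    get(key) {
        return this.objects.get(key);
    }
}

const store = new ContentStore();
const first = store.put('hello');
const second = store.put('hello');  // identical content -> identical key
const third = store.put('hello!');  // one extra byte -> unrelated key

console.log(first === second);      // true
console.log(first === third);       // false
console.log(store.objects.size);    // 2
```

Because the key is a function of the content, deduplication and integrity checking come for free: storing the same bytes twice is a no-op, and a corrupted object no longer matches its own name.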

<Warning>
  This is NOT a tutorial on using Git. This is about understanding Git's internals deeply enough to reimplement them. You should already be comfortable with basic Git usage (commit, branch, merge) before starting.
</Warning>

***

## Git's Beautiful Architecture

<Frame>
  <img src="https://mintcdn.com/devweeekends/emzPt-9B_R8UKdqm/images/courses/git-object-model.svg?fit=max&auto=format&n=emzPt-9B_R8UKdqm&q=85&s=975a7026bf6591682f646c0648db547a" alt="Git Object Model" width="1080" height="1080" data-path="images/courses/git-object-model.svg" />
</Frame>

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                           GIT OBJECT MODEL                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                              │
│   BLOB                    TREE                      COMMIT                   │
│   ─────                   ─────                     ──────                   │
│   File contents           Directory listing         Snapshot + metadata      │
│   SHA-1 of content        Points to blobs/trees     Points to tree + parent │
│                                                                              │
│   ┌───────────┐          ┌───────────────────┐     ┌───────────────────┐    │
│   │ Hello     │          │ 100644 hello.txt  │     │ tree abc123       │    │
│   │ World     │          │ 040000 src/       │     │ parent def456     │    │
│   └───────────┘          └───────────────────┘     │ author John       │    │
│        │                        │                   │ message: Initial  │    │
│        └────────────────────────┴───────────────────┘                        │
│                                                                              │
│   ALL OBJECTS ARE CONTENT-ADDRESSED:                                        │
│   SHA-1(type + size + content) → 40-character hex string                    │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```
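
The hashing rule in the diagram can be checked in a few lines. This is a small sketch, assuming the header layout `"{type} {size}\0"` followed by the content; `gitObjectId` is a hypothetical helper name, not a function from the course code. The three values in the comments are the IDs Git assigns to these well-known objects, so you can compare against `git hash-object` on your own machine.

```javascript
const crypto = require('crypto');

// Hash content the way the diagram describes: header, NUL byte, then content.
// The size in the header is a byte count, so we measure the Buffer.
function gitObjectId(type, content) {
    const body = Buffer.from(content);
    const header = Buffer.from(`${type} ${body.length}\0`);
    return crypto
        .createHash('sha1')
        .update(Buffer.concat([header, body]))
        .digest('hex');
}

console.log(gitObjectId('blob', 'hello world\n'));
// 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
console.log(gitObjectId('blob', ''));
// e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
console.log(gitObjectId('tree', ''));
// 4b825dc642cb6eb9a060e54bf8d69288fbee4904 (the empty tree)
```

Note that the type is part of the hashed bytes: an empty blob and an empty tree have the same content but different IDs.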

***

## What You'll Build

### Core Commands

| Command       | Description              | Concepts Learned                 |
| ------------- | ------------------------ | -------------------------------- |
| `init`        | Initialize repository    | File structure, `.git` directory |
| `hash-object` | Hash and store a file    | SHA-1, zlib compression          |
| `cat-file`    | Read object contents     | Object types, decompression      |
| `add`         | Stage files              | Index file format                |
| `commit`      | Create commit object     | Tree building, commit linking    |
| `log`         | Show commit history      | DAG traversal                    |
| `status`      | Show working tree status | Diff against index               |
| `branch`      | Manage branches          | Refs, symbolic refs              |
| `checkout`    | Switch branches          | Updating HEAD, working tree      |
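
The "DAG traversal" listed for `log` is worth seeing in isolation, because a merge commit has more than one parent. The sketch below walks a hypothetical in-memory history (the short IDs, the `commits` map, and `walkHistory` are made up for illustration) and visits each reachable commit exactly once. It uses a plain breadth-first order; real Git orders `log` output by commit date, which this sketch does not attempt.

```javascript
// Hypothetical in-memory history: id -> { parents, message }.
const commits = new Map([
    ['a1', { parents: [], message: 'Initial commit' }],
    ['b2', { parents: ['a1'], message: 'Add parser' }],
    ['c3', { parents: ['a1'], message: 'Add docs' }],
    ['d4', { parents: ['b2', 'c3'], message: 'Merge docs into parser' }],
]);

// Visit every commit reachable from `start` exactly once (breadth-first).
// `getParents` abstracts over where commits live (a Map here, objects on disk later).
function walkHistory(start, getParents) {
    const seen = new Set([start]);
    const queue = [start];
    const order = [];

    while (queue.length > 0) {
        const id = queue.shift();
        order.push(id);
        for (const parent of getParents(id)) {
            if (!seen.has(parent)) {   // shared ancestors are queued only once
                seen.add(parent);
                queue.push(parent);
            }
        }
    }
    return order;
}

const order = walkHistory('d4', id => commits.get(id).parents);
console.log(order); // [ 'd4', 'b2', 'c3', 'a1' ]
```

The `seen` set is what makes this a graph walk rather than a tree walk: without it, `a1` would be printed twice, once through each side of the merge.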

***

## Implementation: JavaScript

### Project Structure

```
mygit/
├── src/
│   ├── commands/
│   │   ├── init.js
│   │   ├── hashObject.js
│   │   ├── catFile.js
│   │   ├── add.js
│   │   ├── commit.js
│   │   ├── log.js
│   │   ├── status.js
│   │   ├── branch.js
│   │   └── checkout.js
│   ├── objects/
│   │   ├── blob.js
│   │   ├── tree.js
│   │   └── commit.js
│   ├── utils/
│   │   ├── hash.js
│   │   ├── compression.js
│   │   ├── index.js
│   │   └── refs.js
│   └── mygit.js
├── package.json
└── README.md
```

### Core Implementation

<CodeGroup>
  ```javascript src/utils/hash.js theme={null}
  const crypto = require('crypto');
  const zlib = require('zlib');
  const fs = require('fs');
  const path = require('path');

  /**
   * Compute SHA-1 hash of content with Git's format
   * Git format: "{type} {size}\0{content}"
   */
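  function hashObject(content, type = 'blob') {
      // Normalize to a Buffer first: the size in the header is a byte count,
      // and string.length would be wrong for non-ASCII content.
      const body = Buffer.isBuffer(content) ? content : Buffer.from(content);
      const header = `${type} ${body.length}\0`;
      const store = Buffer.concat([Buffer.from(header), body]);
      const hash = crypto.createHash('sha1').update(store).digest('hex');
      return { hash, store };
  }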

  /**
   * Write object to .git/objects/{hash[0:2]}/{hash[2:40]}
   */
  function writeObject(gitDir, hash, store) {
      const objectDir = path.join(gitDir, 'objects', hash.slice(0, 2));
      const objectPath = path.join(objectDir, hash.slice(2));
      
      if (!fs.existsSync(objectDir)) {
          fs.mkdirSync(objectDir, { recursive: true });
      }
      
      // Git stores objects compressed with zlib
      const compressed = zlib.deflateSync(store);
      fs.writeFileSync(objectPath, compressed);
      
      return hash;
  }

  /**
   * Read object from .git/objects
   */
  function readObject(gitDir, hash) {
      const objectPath = path.join(
          gitDir, 'objects', 
          hash.slice(0, 2), 
          hash.slice(2)
      );
      
      if (!fs.existsSync(objectPath)) {
          throw new Error(`Object ${hash} not found`);
      }
      
      const compressed = fs.readFileSync(objectPath);
      const store = zlib.inflateSync(compressed);
      
      // Parse header: "{type} {size}\0{content}"
      const nullIndex = store.indexOf(0);
      const header = store.slice(0, nullIndex).toString();
      const [type, size] = header.split(' ');
      const content = store.slice(nullIndex + 1);
      
      return { type, size: parseInt(size), content };
  }

  module.exports = { hashObject, writeObject, readObject };
  ```

  ```javascript src/commands/init.js theme={null}
  const fs = require('fs');
  const path = require('path');

  /**
   * Initialize a new Git repository
   * Creates .git directory structure
   */
  function init(directory = '.') {
      const gitDir = path.join(directory, '.git');
      
      // Create directory structure
      const dirs = [
          gitDir,
          path.join(gitDir, 'objects'),
          path.join(gitDir, 'refs', 'heads'),
          path.join(gitDir, 'refs', 'tags')
      ];
      
      dirs.forEach(dir => {
          if (!fs.existsSync(dir)) {
              fs.mkdirSync(dir, { recursive: true });
          }
      });
      
      // Create HEAD pointing to master
      const headContent = 'ref: refs/heads/master\n';
      fs.writeFileSync(path.join(gitDir, 'HEAD'), headContent);
      
      // Create config file
      const configContent = `[core]
      repositoryformatversion = 0
      filemode = false
      bare = false
  `;
      fs.writeFileSync(path.join(gitDir, 'config'), configContent);
      
      console.log(`Initialized empty Git repository in ${path.resolve(gitDir)}`);
      return gitDir;
  }

  module.exports = { init };
  ```

  ```javascript src/objects/blob.js theme={null}
  const { hashObject, writeObject } = require('../utils/hash');
  const fs = require('fs');

  /**
   * Create a blob object from file content
   * Blobs store file content, nothing more
   */
  class Blob {
      constructor(content) {
          this.content = Buffer.isBuffer(content) ? content : Buffer.from(content);
          this.type = 'blob';
          
          const { hash, store } = hashObject(this.content, this.type);
          this.hash = hash;
          this.store = store;
      }
      
      static fromFile(filePath) {
          const content = fs.readFileSync(filePath);
          return new Blob(content);
      }
      
      write(gitDir) {
          return writeObject(gitDir, this.hash, this.store);
      }
  }

  module.exports = { Blob };
  ```

  ```javascript src/objects/tree.js theme={null}
  const { hashObject, writeObject, readObject } = require('../utils/hash');

  /**
   * Tree object - represents a directory
   * Format: "{mode} {name}\0{20-byte SHA}"
   */
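  class Tree {
      constructor(entries = []) {
          // entries: [{ mode, name, hash }]
          this.entries = [...entries].sort(Tree.compareEntries);
          this.type = 'tree';
      }
      
      // Git orders tree entries by raw bytes, comparing directory names as if
      // they ended in '/'. localeCompare() is locale-dependent and can yield an
      // order (and therefore a tree hash) that real Git would not produce.
      static compareEntries(a, b) {
          const key = e => Buffer.from(/^0?40000$/.test(e.mode) ? `${e.name}/` : e.name);
          return Buffer.compare(key(a), key(b));
      }
      
      addEntry(mode, name, hash) {
          this.entries.push({ mode, name, hash });
          this.entries.sort(Tree.compareEntries);
      }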
      
      serialize() {
          const parts = [];
          
          for (const entry of this.entries) {
              // Format: "{mode} {name}\0{20-byte binary SHA}"
              const modeAndName = Buffer.from(`${entry.mode} ${entry.name}\0`);
              const hashBytes = Buffer.from(entry.hash, 'hex');
              parts.push(modeAndName, hashBytes);
          }
          
          return Buffer.concat(parts);
      }
      
      computeHash() {
          const content = this.serialize();
          const { hash, store } = hashObject(content, 'tree');
          this.hash = hash;
          this.store = store;
          return hash;
      }
      
      write(gitDir) {
          if (!this.hash) this.computeHash();
          return writeObject(gitDir, this.hash, this.store);
      }
      
      static parse(buffer) {
          const entries = [];
          let offset = 0;
          
          while (offset < buffer.length) {
              // Find the null byte that separates mode+name from hash
              const nullIndex = buffer.indexOf(0, offset);
              const modeAndName = buffer.slice(offset, nullIndex).toString();
              
              const spaceIndex = modeAndName.indexOf(' ');
              const mode = modeAndName.slice(0, spaceIndex);
              const name = modeAndName.slice(spaceIndex + 1);
              
              // Next 20 bytes are the SHA-1 hash
              const hashBytes = buffer.slice(nullIndex + 1, nullIndex + 21);
              const hash = hashBytes.toString('hex');
              
              entries.push({ mode, name, hash });
              offset = nullIndex + 21;
          }
          
          return new Tree(entries);
      }
  }

  module.exports = { Tree };
  ```

  ```javascript src/objects/commit.js theme={null}
  const { hashObject, writeObject } = require('../utils/hash');

  /**
   * Commit object - a snapshot in time
   * Points to a tree, has parent commits, author info, message
   */
  class Commit {
      constructor({ tree, parents = [], author, committer, message }) {
          this.tree = tree;
          this.parents = parents;
          this.author = author;
          this.committer = committer || author;
          this.message = message;
          this.type = 'commit';
      }
      
      serialize() {
          const lines = [];
          
          lines.push(`tree ${this.tree}`);
          
          for (const parent of this.parents) {
              lines.push(`parent ${parent}`);
          }
          
          // Format: "name <email> timestamp timezone"
          const timestamp = Math.floor(Date.now() / 1000);
          const timezone = '+0000';
          
          lines.push(`author ${this.author.name} <${this.author.email}> ${timestamp} ${timezone}`);
          lines.push(`committer ${this.committer.name} <${this.committer.email}> ${timestamp} ${timezone}`);
          lines.push('');
          lines.push(this.message);
          
          return Buffer.from(lines.join('\n'));
      }
      
      computeHash() {
          const content = this.serialize();
          const { hash, store } = hashObject(content, 'commit');
          this.hash = hash;
          this.store = store;
          return hash;
      }
      
      write(gitDir) {
          if (!this.hash) this.computeHash();
          return writeObject(gitDir, this.hash, this.store);
      }
      
      static parse(content) {
          const lines = content.toString().split('\n');
          const commit = {
              parents: []
          };
          
          let i = 0;
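          // Stop at the blank line that ends the headers; the bounds check
          // guards against malformed commits that have no blank line at all
          while (i < lines.length && lines[i] !== '') {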
              const line = lines[i];
              
              if (line.startsWith('tree ')) {
                  commit.tree = line.slice(5);
              } else if (line.startsWith('parent ')) {
                  commit.parents.push(line.slice(7));
              } else if (line.startsWith('author ')) {
                  commit.author = parseAuthor(line.slice(7));
              } else if (line.startsWith('committer ')) {
                  commit.committer = parseAuthor(line.slice(10));
              }
              i++;
          }
          
          commit.message = lines.slice(i + 1).join('\n');
          return commit;
      }
  }

  function parseAuthor(str) {
      // Format: "name <email> timestamp timezone"
      const match = str.match(/^(.+) <(.+)> (\d+) ([+-]\d{4})$/);
      if (match) {
          return {
              name: match[1],
              email: match[2],
              timestamp: parseInt(match[3]),
              timezone: match[4]
          };
      }
      return { name: str, email: '', timestamp: 0, timezone: '+0000' };
  }

  module.exports = { Commit };
  ```

  ```javascript src/utils/index.js theme={null}
  const fs = require('fs');
  const path = require('path');
  const crypto = require('crypto');

  /**
   * Git Index (staging area) implementation
   * The index is a binary file that caches tree information
   */
  class Index {
      constructor(gitDir) {
          this.gitDir = gitDir;
          this.indexPath = path.join(gitDir, 'index');
          this.entries = new Map();
      }
      
      /**
       * Add a file to the index
       */
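      add(filePath, hash, mode = '100644') {
          // Index paths are relative to the working tree (the parent of .git),
          // so resolve against it rather than the process's current directory
          const stat = fs.statSync(path.resolve(path.dirname(this.gitDir), filePath));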
          
          this.entries.set(filePath, {
              mode,
              hash,
              flags: filePath.length,
              name: filePath,
              size: stat.size,
              mtime: stat.mtimeMs,
              ctime: stat.ctimeMs
          });
      }
      
      /**
       * Write index to disk
       * Format: 12-byte header + entries + 20-byte SHA-1
       */
      write() {
          const header = Buffer.alloc(12);
          header.write('DIRC', 0);  // Signature
          header.writeUInt32BE(2, 4);  // Version
          header.writeUInt32BE(this.entries.size, 8);  // Entry count
          
          const entryBuffers = [];
          
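          // Git expects index entries sorted by path in byte order,
          // not in the order they happened to be added
          const sorted = [...this.entries.values()].sort((a, b) =>
              Buffer.compare(Buffer.from(a.name), Buffer.from(b.name)));
          
          for (const entry of sorted) {
              entryBuffers.push(this.serializeEntry(entry));
          }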
          
          const content = Buffer.concat([header, ...entryBuffers]);
          
          // Append SHA-1 checksum
          const checksum = crypto.createHash('sha1').update(content).digest();
          const final = Buffer.concat([content, checksum]);
          
          fs.writeFileSync(this.indexPath, final);
      }
      
      serializeEntry(entry) {
          // Entry format is complex, simplified version here
          const nameBytes = Buffer.from(entry.name);
          const entryLength = 62 + nameBytes.length;
          const padding = 8 - (entryLength % 8);
          
          const buffer = Buffer.alloc(entryLength + padding);
          let offset = 0;
          
          // ctime, mtime (seconds + nanoseconds)
          buffer.writeUInt32BE(Math.floor(entry.ctime / 1000), offset); offset += 4;
          buffer.writeUInt32BE(0, offset); offset += 4;
          buffer.writeUInt32BE(Math.floor(entry.mtime / 1000), offset); offset += 4;
          buffer.writeUInt32BE(0, offset); offset += 4;
          
          // dev, ino (not used on Windows)
          buffer.writeUInt32BE(0, offset); offset += 4;
          buffer.writeUInt32BE(0, offset); offset += 4;
          
          // mode
          buffer.writeUInt32BE(parseInt(entry.mode, 8), offset); offset += 4;
          
          // uid, gid
          buffer.writeUInt32BE(0, offset); offset += 4;
          buffer.writeUInt32BE(0, offset); offset += 4;
          
          // file size
          buffer.writeUInt32BE(entry.size, offset); offset += 4;
          
          // SHA-1 hash (20 bytes)
          Buffer.from(entry.hash, 'hex').copy(buffer, offset); offset += 20;
          
          // flags: assume-valid, extended and stage bits are left at 0; the low
          // 12 bits hold the name length in bytes, capped at 0xFFF for long names
          buffer.writeUInt16BE(Math.min(nameBytes.length, 0xFFF), offset); offset += 2;
          
          // name
          nameBytes.copy(buffer, offset);
          
          return buffer;
      }
      
      /**
       * Read index from disk
       */
      read() {
          if (!fs.existsSync(this.indexPath)) {
              return;
          }
          
          const buffer = fs.readFileSync(this.indexPath);
          
          // Verify signature
          const signature = buffer.slice(0, 4).toString();
          if (signature !== 'DIRC') {
              throw new Error('Invalid index file');
          }
          
          const version = buffer.readUInt32BE(4);
          const entryCount = buffer.readUInt32BE(8);
          
          let offset = 12;
          
          for (let i = 0; i < entryCount; i++) {
              const entry = this.parseEntry(buffer, offset);
              this.entries.set(entry.name, entry);
              
              // Calculate next entry offset (entries are padded to 8 bytes)
              // Use the byte length so non-ASCII names match what serializeEntry wrote
              const entryLength = 62 + Buffer.byteLength(entry.name);
              const padding = 8 - (entryLength % 8);
              offset += entryLength + padding;
          }
      }
      
      parseEntry(buffer, offset) {
          const ctimeSec = buffer.readUInt32BE(offset);
          const mtimeSec = buffer.readUInt32BE(offset + 8);
          const mode = buffer.readUInt32BE(offset + 24).toString(8);
          const size = buffer.readUInt32BE(offset + 36);
          const hash = buffer.slice(offset + 40, offset + 60).toString('hex');
          const flags = buffer.readUInt16BE(offset + 60);
          const nameLength = flags & 0xFFF;
          const name = buffer.slice(offset + 62, offset + 62 + nameLength).toString();
          
          return {
              ctime: ctimeSec * 1000,
              mtime: mtimeSec * 1000,
              mode,
              size,
              hash,
              flags,
              name
          };
      }
  }

  module.exports = { Index };
  ```

  ```javascript src/commands/add.js theme={null}
  const fs = require('fs');
  const path = require('path');
  const { Blob } = require('../objects/blob');
  const { Index } = require('../utils/index');

  /**
   * Add files to the staging area (index)
   */
  function add(gitDir, files) {
      const workDir = path.dirname(gitDir);
      const index = new Index(gitDir);
      
      // Read existing index
      index.read();
      
      for (const file of files) {
          const filePath = path.isAbsolute(file) 
              ? file 
              : path.join(workDir, file);
          
          if (!fs.existsSync(filePath)) {
              console.error(`fatal: pathspec '${file}' did not match any files`);
              continue;
          }
          
          const stat = fs.statSync(filePath);
          
          if (stat.isDirectory()) {
              // Recursively add directory contents
              const addDir = (dir, relativePath = '') => {
                  const entries = fs.readdirSync(dir);
                  for (const entry of entries) {
                      if (entry === '.git') continue;
                      
                      const entryPath = path.join(dir, entry);
                      const entryRelative = path.join(relativePath, entry);
                      const entryStat = fs.statSync(entryPath);
                      
                      if (entryStat.isDirectory()) {
                          addDir(entryPath, entryRelative);
                      } else {
                          addFile(entryPath, entryRelative);
                      }
                  }
              };
              
              const relativePath = path.relative(workDir, filePath);
              addDir(filePath, relativePath);
          } else {
              const relativePath = path.relative(workDir, filePath);
              addFile(filePath, relativePath);
          }
      }
      
      function addFile(absolutePath, relativePath) {
          // Create blob object
          const blob = Blob.fromFile(absolutePath);
          blob.write(gitDir);
          
          // Add to index
          const mode = fs.statSync(absolutePath).mode & 0o111 
              ? '100755' 
              : '100644';
          
          index.add(relativePath.replace(/\\/g, '/'), blob.hash, mode);
          console.log(`add '${relativePath}'`);
      }
      
      // Write updated index
      index.write();
  }

  module.exports = { add };
  ```

  ```javascript src/commands/commit.js theme={null}
  const path = require('path');
  const fs = require('fs');
  const { Index } = require('../utils/index');
  const { Tree } = require('../objects/tree');
  const { Commit } = require('../objects/commit');
  const { readRef, writeRef, resolveHead } = require('../utils/refs');

  /**
   * Create a commit from the current index
   */
  function commit(gitDir, message) {
      const index = new Index(gitDir);
      index.read();
      
      if (index.entries.size === 0) {
          console.error('nothing to commit (create/copy files and use "git add" to track)');
          return;
      }
      
      // Build tree from index
      const tree = buildTreeFromIndex(index, gitDir);
      tree.write(gitDir);
      
      // Get parent commit (if any)
      const parentHash = resolveHead(gitDir);
      const parents = parentHash ? [parentHash] : [];
      
      // Create commit object
      const author = {
          name: process.env.GIT_AUTHOR_NAME || 'Your Name',
          email: process.env.GIT_AUTHOR_EMAIL || 'you@example.com'
      };
      
      const commitObj = new Commit({
          tree: tree.hash,
          parents,
          author,
          message
      });
      
      commitObj.write(gitDir);
      
      // Update HEAD reference
      const headContent = fs.readFileSync(path.join(gitDir, 'HEAD'), 'utf-8').trim();
      
      if (headContent.startsWith('ref: ')) {
          // HEAD points to a branch
          const refPath = headContent.slice(5);
          const refFile = path.join(gitDir, refPath);
          const refDir = path.dirname(refFile);
          
          if (!fs.existsSync(refDir)) {
              fs.mkdirSync(refDir, { recursive: true });
          }
          
          fs.writeFileSync(refFile, commitObj.hash + '\n');
      } else {
          // Detached HEAD
          fs.writeFileSync(path.join(gitDir, 'HEAD'), commitObj.hash + '\n');
      }
      
      const branch = getCurrentBranch(gitDir);
      const shortHash = commitObj.hash.slice(0, 7);
      const firstLine = message.split('\n')[0];
      
      console.log(`[${branch} ${shortHash}] ${firstLine}`);
      // This counts every file in the snapshot, not just the ones that changed
      console.log(` ${index.entries.size} file(s) in snapshot`);
      
      return commitObj.hash;
  }

  /**
   * Build a tree structure from the flat index
   */
  function buildTreeFromIndex(index, gitDir) {
      // Group entries by directory
      const root = { entries: [], children: new Map() };
      
      for (const [name, entry] of index.entries) {
          const parts = name.split('/');
          let current = root;
          
          for (let i = 0; i < parts.length - 1; i++) {
              const part = parts[i];
              if (!current.children.has(part)) {
                  current.children.set(part, { entries: [], children: new Map() });
              }
              current = current.children.get(part);
          }
          
          current.entries.push({
              mode: entry.mode,
              name: parts[parts.length - 1],
              hash: entry.hash
          });
      }
      
      // Recursively create tree objects
      function createTree(node) {
          const tree = new Tree();
          
          // Add file entries
          for (const entry of node.entries) {
              tree.addEntry(entry.mode, entry.name, entry.hash);
          }
          
          // Add subtree entries
          for (const [name, child] of node.children) {
              const subtree = createTree(child);
              subtree.write(gitDir);
              tree.addEntry('40000', name, subtree.hash);
          }
          
          tree.computeHash();
          return tree;
      }
      
      return createTree(root);
  }

  function getCurrentBranch(gitDir) {
      const headContent = fs.readFileSync(path.join(gitDir, 'HEAD'), 'utf-8').trim();
      
      if (headContent.startsWith('ref: refs/heads/')) {
          return headContent.slice(16);
      }
      
      return 'HEAD';
  }

  module.exports = { commit };
  ```

  ```javascript src/commands/log.js theme={null}
  const { readObject } = require('../utils/hash');
  const { Commit } = require('../objects/commit');
  const { resolveHead } = require('../utils/refs');

  /**
   * Show commit history
   */
  function log(gitDir, options = {}) {
      let currentHash = resolveHead(gitDir);
      
      if (!currentHash) {
          console.error('fatal: your current branch does not have any commits yet');
          return;
      }
      
      const limit = options.limit || Infinity;
      let count = 0;
      
      while (currentHash && count < limit) {
          const { content } = readObject(gitDir, currentHash);
          const commit = Commit.parse(content);
          
          // Print commit info
          console.log(`\x1b[33mcommit ${currentHash}\x1b[0m`);
          console.log(`Author: ${commit.author.name} <${commit.author.email}>`);
          console.log(`Date:   ${formatDate(commit.author.timestamp, commit.author.timezone)}`);
          console.log();
          
          // Indent message
          const messageLines = commit.message.split('\n');
          for (const line of messageLines) {
              console.log(`    ${line}`);
          }
          console.log();
          
          // Move to parent
          currentHash = commit.parents[0] || null;
          count++;
      }
  }

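  function formatDate(timestamp, timezone) {
      // Shift the UTC timestamp by the commit's recorded offset, then format in
      // UTC, so the printed wall-clock time agrees with the timezone suffix
      // instead of silently using the local machine's timezone.
      const sign = timezone[0] === '-' ? -1 : 1;
      const offsetMinutes = sign *
          (parseInt(timezone.slice(1, 3), 10) * 60 + parseInt(timezone.slice(3, 5), 10));
      const date = new Date((timestamp + offsetMinutes * 60) * 1000);
      const options = {
          weekday: 'short',
          year: 'numeric',
          month: 'short',
          day: 'numeric',
          hour: '2-digit',
          minute: '2-digit',
          second: '2-digit',
          hour12: false,
          timeZone: 'UTC'
      };
      
      return date.toLocaleString('en-US', options) + ' ' + timezone;
  }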

  module.exports = { log };
  ```

  ```javascript src/utils/refs.js theme={null}
  const fs = require('fs');
  const path = require('path');

  /**
   * Read a reference (branch) and return the commit hash
   */
  function readRef(gitDir, refPath) {
      const fullPath = path.join(gitDir, refPath);
      
      if (!fs.existsSync(fullPath)) {
          return null;
      }
      
      return fs.readFileSync(fullPath, 'utf-8').trim();
  }

  /**
   * Write a reference
   */
  function writeRef(gitDir, refPath, hash) {
      const fullPath = path.join(gitDir, refPath);
      const dir = path.dirname(fullPath);
      
      if (!fs.existsSync(dir)) {
          fs.mkdirSync(dir, { recursive: true });
      }
      
      fs.writeFileSync(fullPath, hash + '\n');
  }

  /**
   * Resolve HEAD to a commit hash
   */
  function resolveHead(gitDir) {
      const headPath = path.join(gitDir, 'HEAD');
      
      if (!fs.existsSync(headPath)) {
          return null;
      }
      
      let content = fs.readFileSync(headPath, 'utf-8').trim();
      
      // Follow symbolic references
      while (content.startsWith('ref: ')) {
          const refPath = content.slice(5);
          const refFullPath = path.join(gitDir, refPath);
          
          if (!fs.existsSync(refFullPath)) {
              return null;
          }
          
          content = fs.readFileSync(refFullPath, 'utf-8').trim();
      }
      
      return content;
  }

  /**
   * List all branches
   */
  function listBranches(gitDir) {
      const headsDir = path.join(gitDir, 'refs', 'heads');
      const branches = [];
      
      if (!fs.existsSync(headsDir)) {
          return branches;
      }
      
      const entries = fs.readdirSync(headsDir);
      for (const entry of entries) {
          const stat = fs.statSync(path.join(headsDir, entry));
          if (stat.isFile()) {
              branches.push(entry);
          }
      }
      
      return branches;
  }

  /**
   * Get current branch name
   */
  function getCurrentBranch(gitDir) {
      const headContent = fs.readFileSync(path.join(gitDir, 'HEAD'), 'utf-8').trim();
      
      if (headContent.startsWith('ref: refs/heads/')) {
          return headContent.slice(16);
      }
      
      return null; // Detached HEAD
  }

  module.exports = { readRef, writeRef, resolveHead, listBranches, getCurrentBranch };
  ```

  ```javascript src/commands/branch.js theme={null}
  const fs = require('fs');
  const path = require('path');
  const { resolveHead, listBranches, getCurrentBranch, writeRef } = require('../utils/refs');

  /**
   * List, create, or delete branches
   */
  function branch(gitDir, args = {}) {
      const { name, delete: deleteBranch, list } = args;
      
      if (deleteBranch) {
          // Delete branch
          const branchPath = path.join(gitDir, 'refs', 'heads', deleteBranch);
          
          if (!fs.existsSync(branchPath)) {
              console.error(`error: branch '${deleteBranch}' not found`);
              return;
          }
          
          const current = getCurrentBranch(gitDir);
          if (current === deleteBranch) {
              console.error(`error: Cannot delete branch '${deleteBranch}' checked out`);
              return;
          }
          
          fs.unlinkSync(branchPath);
          console.log(`Deleted branch ${deleteBranch}`);
          return;
      }
      
      if (name) {
          // Create new branch
          const currentHash = resolveHead(gitDir);
          
          if (!currentHash) {
              console.error('fatal: Not a valid object name: HEAD');
              return;
          }
          
          const branchPath = path.join(gitDir, 'refs', 'heads', name);
          
          if (fs.existsSync(branchPath)) {
              console.error(`fatal: A branch named '${name}' already exists`);
              return;
          }
          
          writeRef(gitDir, `refs/heads/${name}`, currentHash);
          console.log(`Created branch ${name}`);
          return;
      }
      
      // List branches
      const branches = listBranches(gitDir);
      const current = getCurrentBranch(gitDir);
      
      for (const b of branches) {
          if (b === current) {
              console.log(`* \x1b[32m${b}\x1b[0m`);
          } else {
              console.log(`  ${b}`);
          }
      }
  }

  module.exports = { branch };
  ```

  ```javascript src/commands/checkout.js theme={null}
  const fs = require('fs');
  const path = require('path');
  const { readRef } = require('../utils/refs');
  const { readObject } = require('../utils/hash');
  const { Tree } = require('../objects/tree');
  const { Index } = require('../utils/index');

  /**
   * Switch branches or restore working tree files
   */
  function checkout(gitDir, target) {
      const workDir = path.dirname(gitDir);
      
      // Check if target is a branch
      const branchRef = `refs/heads/${target}`;
      let targetHash = readRef(gitDir, branchRef);
      let isBranch = true;
      
      if (!targetHash) {
          // Try as a commit hash
          if (target.length >= 7 && /^[0-9a-f]+$/.test(target)) {
              targetHash = resolveFullHash(gitDir, target);
              isBranch = false;
          }
          
          if (!targetHash) {
              console.error(`error: pathspec '${target}' did not match any known refs`);
              return;
          }
      }
      
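      // Get the tree from the target commit
      const { type, content } = readObject(gitDir, targetHash);
      if (type !== 'commit') {
          // An abbreviated hash can also match a blob or tree
          console.error(`error: '${target}' is a ${type}, not a commit`);
          return;
      }
      const commitLines = content.toString().split('\n');
      const treeLine = commitLines.find(l => l.startsWith('tree '));
      const treeHash = treeLine.slice(5);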
      
      // Recursively extract tree to working directory
      extractTree(gitDir, workDir, treeHash, '');
      
      // Update HEAD
      if (isBranch) {
          fs.writeFileSync(
              path.join(gitDir, 'HEAD'),
              `ref: refs/heads/${target}\n`
          );
          console.log(`Switched to branch '${target}'`);
      } else {
          fs.writeFileSync(
              path.join(gitDir, 'HEAD'),
              targetHash + '\n'
          );
          console.log(`HEAD is now at ${target.slice(0, 7)}`);
      }
      
      // Update index
      const index = new Index(gitDir);
      buildIndexFromTree(gitDir, treeHash, '', index);
      index.write();
  }

  function extractTree(gitDir, workDir, treeHash, prefix) {
      const { content } = readObject(gitDir, treeHash);
      const tree = Tree.parse(content);
      
      for (const entry of tree.entries) {
          const entryPath = path.join(workDir, prefix, entry.name);
          
          if (entry.mode === '40000') {
              // Directory - create and recurse
              if (!fs.existsSync(entryPath)) {
                  fs.mkdirSync(entryPath, { recursive: true });
              }
              extractTree(gitDir, workDir, entry.hash, path.join(prefix, entry.name));
          } else {
              // File - extract blob content
              const { content: blobContent } = readObject(gitDir, entry.hash);
              
              const dir = path.dirname(entryPath);
              if (!fs.existsSync(dir)) {
                  fs.mkdirSync(dir, { recursive: true });
              }
              
              fs.writeFileSync(entryPath, blobContent);
              
              // Set executable bit if needed
              if (entry.mode === '100755') {
                  fs.chmodSync(entryPath, 0o755);
              }
          }
      }
  }

  function buildIndexFromTree(gitDir, treeHash, prefix, index) {
      const { content } = readObject(gitDir, treeHash);
      const tree = Tree.parse(content);
      
      for (const entry of tree.entries) {
          const entryPath = prefix ? `${prefix}/${entry.name}` : entry.name;
          
          if (entry.mode === '40000') {
              buildIndexFromTree(gitDir, entry.hash, entryPath, index);
          } else {
              index.add(entryPath, entry.hash, entry.mode);
          }
      }
  }

  function resolveFullHash(gitDir, partial) {
      const objectsDir = path.join(gitDir, 'objects');
      const prefix = partial.slice(0, 2);
      const prefixDir = path.join(objectsDir, prefix);
      
      if (!fs.existsSync(prefixDir)) {
          return null;
      }
      
      const suffix = partial.slice(2);
      const entries = fs.readdirSync(prefixDir);
      
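      // Accept the abbreviation only if exactly one object matches it;
      // zero matches means unknown, several matches means ambiguous
      const matches = entries.filter(entry => entry.startsWith(suffix));
      
      return matches.length === 1 ? prefix + matches[0] : null;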
  }

  module.exports = { checkout };
  ```

  ```javascript src/mygit.js theme={null}
  #!/usr/bin/env node

  const path = require('path');
  const fs = require('fs');

  const { init } = require('./commands/init');
  const { add } = require('./commands/add');
  const { commit } = require('./commands/commit');
  const { log } = require('./commands/log');
  const { branch } = require('./commands/branch');
  const { checkout } = require('./commands/checkout');
  const { hashObject, readObject } = require('./utils/hash');
  const { Blob } = require('./objects/blob');

  // Find .git directory by walking up from startPath to the filesystem root
  function findGitDir(startPath = process.cwd()) {
      let current = path.resolve(startPath);
      
      while (true) {
          const gitDir = path.join(current, '.git');
          if (fs.existsSync(gitDir)) {
              return gitDir;
          }
          
          const parent = path.dirname(current);
          if (parent === current) {
              return null; // Reached the root, which has now been checked too
          }
          current = parent;
      }
  }

  // Parse command line arguments
  const args = process.argv.slice(2);
  const command = args[0];

  switch (command) {
      case 'init':
          init(args[1] || '.');
          break;
          
      case 'hash-object': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
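          const write = args.includes('-w');
          // Skip args[0] (the command name itself) when looking for the file
          const file = args.slice(1).find(a => !a.startsWith('-'));
          
          if (!file) {
              console.error('usage: mygit hash-object [-w] <file>');
              process.exit(1);
          }
          if (!fs.existsSync(file)) {
              console.error(`fatal: could not open '${file}' for reading`);
              process.exit(1);
          }
          
          const blob = Blob.fromFile(file);
          if (write) {
              blob.write(gitDir);
          }
          console.log(blob.hash);
          break;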
      }
      
      case 'cat-file': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
          const flag = args[1]; // -p, -t, or -s
          const hash = args[2];
          
          if (!hash) {
              console.error('usage: mygit cat-file (-p | -t | -s) <object>');
              process.exit(1);
          }
          
          try {
              const { type, size, content } = readObject(gitDir, hash);
              
              switch (flag) {
                  case '-p':
                      // Write the content as-is; console.log would append an extra newline
                      process.stdout.write(content);
                      break;
                  case '-t':
                      console.log(type);
                      break;
                  case '-s':
                      console.log(size);
                      break;
                  default:
                      console.error(`error: unknown option '${flag}'`);
                      process.exit(1);
              }
          } catch (err) {
              console.error(`fatal: Not a valid object name ${hash}`);
              process.exit(1);
          }
          break;
      }
      
      case 'add': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
          const files = args.slice(1);
          if (files.length === 0) {
              console.error('Nothing specified, nothing added.');
              process.exit(1);
          }
          
          add(gitDir, files);
          break;
      }
      
      case 'commit': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
          let message = '';
          const mIndex = args.indexOf('-m');
          if (mIndex !== -1 && args[mIndex + 1]) {
              message = args[mIndex + 1];
          } else {
              console.error('error: empty commit message');
              process.exit(1);
          }
          
          commit(gitDir, message);
          break;
      }
      
      case 'log': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
          log(gitDir);
          break;
      }
      
      case 'branch': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
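          const dIndex = args.indexOf('-d');
          if (dIndex !== -1) {
              const target = args[dIndex + 1];
              if (!target) {
                  // Without this check, `branch -d` would silently list branches
                  console.error('fatal: branch name required');
                  process.exit(1);
              }
              branch(gitDir, { delete: target });
          } else if (args[1]) {
              branch(gitDir, { name: args[1] });
          } else {
              branch(gitDir, { list: true });
          }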
          break;
      }
      
      case 'checkout': {
          const gitDir = findGitDir();
          if (!gitDir) {
              console.error('fatal: not a git repository');
              process.exit(1);
          }
          
          if (!args[1]) {
              console.error('error: you need to specify a branch or commit');
              process.exit(1);
          }
          
          checkout(gitDir, args[1]);
          break;
      }
      
      default:
          console.log(`mygit - A Git implementation in JavaScript

  Commands:
    init                    Initialize a new repository
    hash-object [-w] <file> Compute hash of a file (optionally write to objects)
    cat-file <flag> <hash>  Display object contents (-p, -t, -s)
    add <files...>          Add files to the staging area
    commit -m <message>     Create a commit
    log                     Show commit history
    branch [name]           List or create branches
    branch -d <name>        Delete a branch
    checkout <target>       Switch to a branch or commit
  `);
  }
  ```
</CodeGroup>

***

## Exercises

### Level 1: Basic Understanding

1. Initialize a repository and create a blob manually
2. Run `hash-object` on two files with identical content and explain why the hashes match (content-addressable storage)
3. Create a commit and inspect its structure with `cat-file`

### Level 2: Core Implementation

1. Implement the `status` command (compare index to working tree)
2. Add support for `.gitignore` patterns
3. Implement `diff` to show changes between commits

### Level 3: Advanced Features

1. Implement merge (fast-forward and three-way)
2. Add remote repository support (fetch, push)
3. Implement pack files for efficient storage

***

## What You've Learned

<Check>Content-addressable storage using SHA-1 hashing</Check>
<Check>Tree data structures for representing directories</Check>
<Check>DAG (Directed Acyclic Graph) for commit history</Check>
<Check>Binary file formats (index file)</Check>
<Check>CLI tool development in JavaScript</Check>

***

## Next Steps

<CardGroup cols={2}>
  <Card title="Go Implementation" icon="golang" href="/courses/build-your-own-x/git-go">
    See how the same concepts translate to Go
  </Card>

  <Card title="Java Implementation" icon="java" href="/courses/build-your-own-x/git-java">
    Enterprise-grade implementation with strong typing
  </Card>

  <Card title="Build Redis" icon="database" href="/courses/build-your-own-x/redis-overview">
    Ready for the next challenge? Build a Redis clone
  </Card>
</CardGroup>

***

## Interview Deep-Dive

<AccordionGroup>
  <Accordion title="Git is often called a 'content-addressable filesystem.' What does that mean, and how does it differ from a traditional filesystem?">
    **Strong Answer:**

    * In a traditional filesystem, you choose a filename and the OS stores the content at that location. The name is arbitrary and independent of the content. In Git's object store, the "name" (address) is derived from the content itself via SHA-1 hashing. The file is stored at `.git/objects/<first-2-chars>/<remaining-38-chars>` of the hash.
    * This gives Git three properties for free: deduplication (identical content always produces the same hash, so it is stored once regardless of how many files or commits reference it), integrity verification (if a single bit flips on disk, the hash will not match and Git will detect corruption), and immutability (you cannot change an object's content without changing its address, which means any object you have retrieved is guaranteed to be the same object that was originally stored).
    * This is the same principle behind IPFS, Nix, and blockchain Merkle trees. Content-addressable storage shows up wherever integrity and deduplication matter, and Git is one of its most widely deployed examples.
    * The practical implication for developers: Git repositories are much smaller than you would expect because of deduplication. Renaming a file costs almost nothing (the blob is the same, only the tree changes). And `git fsck` can verify the entire history's integrity by re-hashing every object and comparing it to its stored address.
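
    The addressing scheme described above can be sketched in a few lines. This is a minimal illustration, not Git's actual implementation: `gitObjectHash` and `looseObjectPath` are hypothetical helper names, and the sketch skips the zlib compression Git applies before writing an object to disk.

    ```javascript
    const crypto = require('crypto');

    // Hash content the way Git does: SHA-1 over "<type> <size>\0" followed by the bytes.
    function gitObjectHash(type, content) {
        const body = Buffer.from(content);
        const header = Buffer.from(`${type} ${body.length}\0`);
        return crypto.createHash('sha1')
            .update(Buffer.concat([header, body]))
            .digest('hex');
    }

    // Loose objects live under a directory named after the first two hex characters.
    function looseObjectPath(hash) {
        return `.git/objects/${hash.slice(0, 2)}/${hash.slice(2)}`;
    }

    const a = gitObjectHash('blob', 'hello\n');
    const b = gitObjectHash('blob', 'hello\n');   // same bytes, e.g. under another filename
    const c = gitObjectHash('blob', 'hello!\n');  // one character added

    console.log(a === b);            // true: identical content, identical address
    console.log(a === c);            // false: any change yields a new address
    console.log(looseObjectPath(a)); // .git/objects/ce/013625030ba8dba906f756967f9e9ca394464a
    ```

    Because the filename never enters the hash, two files with the same bytes collapse into one object, which is where the deduplication comes from.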

    **Follow-up: If SHA-1 has known collision vulnerabilities, why does Git still use it, and what is being done about it?**

    A SHA-1 collision attack (such as SHAttered, published by CWI Amsterdam and Google in 2017) computes two distinct inputs that produce the same hash. Git responded within months by adopting a "hardened SHA-1" implementation that detects the known collision patterns. More fundamentally, Git is transitioning to SHA-256: since Git 2.29 you can create a SHA-256 repository with `git init --object-format=sha256`, which records the choice in the `extensions.objectFormat` config. The transition is not finished, though: a repository uses exactly one object format, and interoperability between SHA-1 and SHA-256 repositories is still under development. In practice, exploiting a SHA-1 collision in Git requires a targeted attack against a specific repository, which is operationally much harder than the academic attack. Still, the consensus is that SHA-256 is the correct long-term direction, and Git's content-addressable design makes the hash algorithm swappable because the architecture does not depend on SHA-1 specifically -- it depends on the *property* of content addressing.
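
    To make "swappable" concrete, here is a small sketch in which the hash algorithm is just a parameter. `objectId` is a hypothetical helper, not part of Git or of the code above; it assumes the object framing (`type`, size, NUL byte, content) stays the same and only the digest function changes.

    ```javascript
    const crypto = require('crypto');

    // Hypothetical helper: same object framing, pluggable digest algorithm.
    function objectId(algorithm, type, content) {
        const body = Buffer.from(content);
        const header = Buffer.from(`${type} ${body.length}\0`);
        return crypto.createHash(algorithm)
            .update(Buffer.concat([header, body]))
            .digest('hex');
    }

    const sha1Id = objectId('sha1', 'blob', 'hello\n');     // 40 hex characters
    const sha256Id = objectId('sha256', 'blob', 'hello\n'); // 64 hex characters

    console.log(sha1Id.length, sha256Id.length); // 40 64
    ```

    Everything that refers to an object by ID (trees, commits, refs) only needs to cope with a longer identifier, which is why the object model itself does not change.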
  </Accordion>

  <Accordion title="Explain Git's object model: blobs, trees, and commits. How do they compose to represent a repository's history?">
    **Strong Answer:**

    * A **blob** stores raw file content with no metadata (no filename, no permissions). It is just the bytes of the file, prefixed with a header (`blob <size>\0`), then SHA-1 hashed. Two files with identical content across the entire repository (or across different commits) share the same blob object.
    * A **tree** represents a directory. It contains entries, each with a mode (permissions), a name (filename), and a pointer (SHA-1 hash) to either a blob (file) or another tree (subdirectory). Trees are recursive -- a root tree can point to sub-trees, which point to sub-sub-trees, mirroring the directory hierarchy.
    * A **commit** is a metadata object that points to a root tree (the complete directory snapshot at that point in time), zero or more parent commits (previous commits in history), an author, a committer, a timestamp, and a message. The chain of parent pointers forms the commit DAG (directed acyclic graph) -- the history.
    * The elegance is that commits are snapshots, not diffs. Each commit has a complete tree that represents the entire project state. Git does not store what changed between commits; it stores the full state. Diffs are computed on-the-fly by comparing two trees. This makes checkout extremely fast (just materialize one tree) and makes operations like blame, log, and bisect possible without reconstructing incremental changes.
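
    The composition above can be sketched by serializing one object of each kind and hashing it. This is an illustrative sketch under assumptions: the helper names are hypothetical, the author identity and timestamp are made up, and nothing is compressed or written to disk.

    ```javascript
    const crypto = require('crypto');

    function hashObject(type, body) {
        const header = Buffer.from(`${type} ${body.length}\0`);
        return crypto.createHash('sha1').update(Buffer.concat([header, body])).digest('hex');
    }

    // Tree entry: "<mode> <name>\0" followed by the 20 raw bytes of the child's hash.
    function serializeTree(entries) {
        return Buffer.concat(entries.map(({ mode, name, hash }) =>
            Buffer.concat([Buffer.from(`${mode} ${name}\0`), Buffer.from(hash, 'hex')])
        ));
    }

    // Commit: plain-text headers, a blank line, then the message.
    function serializeCommit({ tree, parents, author, message }) {
        const lines = [`tree ${tree}`];
        for (const parent of parents) lines.push(`parent ${parent}`);
        lines.push(`author ${author}`, `committer ${author}`, '', message, '');
        return Buffer.from(lines.join('\n'));
    }

    // blob <- tree <- commit <- commit
    const blobHash = hashObject('blob', Buffer.from('hello\n'));
    const treeHash = hashObject('tree', serializeTree([
        { mode: '100644', name: 'hello.txt', hash: blobHash },
    ]));
    const author = 'Ada Example <ada@example.com> 1700000000 +0000'; // made-up identity
    const first = hashObject('commit', serializeCommit({
        tree: treeHash, parents: [], author, message: 'Initial commit',
    }));
    const second = hashObject('commit', serializeCommit({
        tree: treeHash, parents: [first], author, message: 'Second commit',
    }));

    console.log({ blobHash, treeHash, first, second });
    ```

    Both commits point at the same tree hash, so the second commit adds only one small commit object; the parent pointer is the only thing that links it into history.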

    **Follow-up: If every commit stores a full snapshot, why are Git repositories not enormous?**

    Two reasons: deduplication and pack files. Deduplication at the blob level means unchanged files across commits point to the same blob -- no duplication. Deduplication at the tree level means unchanged directories share tree objects. For a commit that changes one file in a 10,000-file project, only one new blob and a chain of new tree objects (from the changed file's directory up to the root) are created. The other 9,999 blobs and every unaffected tree are shared. On top of this, `git gc` periodically packs loose objects into `.pack` files using delta compression -- similar files are stored as a base object plus a binary diff. This is why a Git repo with years of history is usually far smaller than the sum of its snapshots, and can even be smaller than a single checkout of the working directory.
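
    The deduplication argument can be checked with a small in-memory simulation. This is a sketch, not the tutorial's object store: a `Map` stands in for `.git/objects`, `put` and `writeTree` are hypothetical helpers, and entries are sorted by plain name, which only approximates Git's real tree ordering.

    ```javascript
    const crypto = require('crypto');

    const store = new Map(); // hash -> object bytes; stands in for .git/objects

    function put(type, body) {
        const data = Buffer.concat([Buffer.from(`${type} ${body.length}\0`), body]);
        const hash = crypto.createHash('sha1').update(data).digest('hex');
        store.set(hash, data); // storing identical content again changes nothing
        return hash;
    }

    // `dir` is a plain object: string values are file contents, object values are subdirectories.
    function writeTree(dir) {
        const entries = Object.keys(dir).sort().map((name) => {
            const value = dir[name];
            const isDir = typeof value === 'object';
            const hash = isDir ? writeTree(value) : put('blob', Buffer.from(value));
            const mode = isDir ? '40000' : '100644';
            return Buffer.concat([Buffer.from(`${mode} ${name}\0`), Buffer.from(hash, 'hex')]);
        });
        return put('tree', Buffer.concat(entries));
    }

    const v1 = {
        'README.md': 'hello\n',
        docs: { 'guide.md': 'guide\n' },
        src: { 'a.js': 'one\n', 'b.js': 'two\n' },
    };
    const root1 = writeTree(v1);
    const objectsAfterFirst = store.size;

    // Second snapshot: only src/a.js changes.
    const v2 = { ...v1, src: { ...v1.src, 'a.js': 'one, edited\n' } };
    const root2 = writeTree(v2);
    const newObjects = store.size - objectsAfterFirst;

    console.log(objectsAfterFirst); // 7: four blobs + three trees (docs, src, root)
    console.log(newObjects);        // 3: one blob + a new src tree + a new root tree
    ```

    The `docs` tree and the unchanged blobs hash to the same IDs in both snapshots, so the second snapshot costs three objects no matter how many other files the project contains.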
  </Accordion>

  <Accordion title="A junior developer says 'branches in Git are expensive because they copy the codebase.' Correct their misunderstanding and explain what a branch actually is.">
    **Strong Answer:**

    * A branch in Git is a 41-byte text file stored at `.git/refs/heads/<branchname>` containing a 40-character commit hash and a newline (in a SHA-1 repository; Git may later fold the ref into a single line of `.git/packed-refs`). Creating a branch literally writes 41 bytes to disk. There is no copying of code, no duplication of files, no additional storage proportional to repository size.
    * When you run `git branch feature`, Git creates the file `.git/refs/heads/feature` containing the same commit hash that HEAD currently points to. Both branches now point to the same commit object, which points to the same tree, which points to the same blobs. Nothing is duplicated.
    * As you make commits on the feature branch, the branch pointer advances to new commits. The old commits are still shared with main. Only new blobs (for changed files), new trees (for changed directories), and new commit objects are created. The cost is proportional to what changed, not to the size of the repository.
    * This is fundamentally different from systems like Subversion, where a branch is a copy of a directory inside the repository. Subversion makes this a "cheap copy" (the repository shares unchanged data rather than duplicating files), but it is still conceptually a copy that lives at its own path. Git's model makes branching a constant-time, constant-space operation, which is why Git workflows can use dozens or hundreds of short-lived branches with negligible overhead.
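
    A quick way to convince the junior developer is to create a branch by hand and measure it. The sketch below is illustrative: `createBranch` is a hypothetical helper, a temporary directory stands in for `.git`, and the commit hash is a placeholder rather than a real object.

    ```javascript
    const fs = require('fs');
    const os = require('os');
    const path = require('path');

    // Hypothetical helper: creating a branch is one small file write.
    function createBranch(gitDir, name, commitHash) {
        const refPath = path.join(gitDir, 'refs', 'heads', name);
        fs.mkdirSync(path.dirname(refPath), { recursive: true });
        fs.writeFileSync(refPath, commitHash + '\n');
        return refPath;
    }

    // A throwaway directory stands in for .git; the hash is a placeholder.
    const gitDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mygit-demo-'));
    const commitHash = '0123456789abcdef0123456789abcdef01234567';

    const mainRef = createBranch(gitDir, 'main', commitHash);
    const featureRef = createBranch(gitDir, 'feature', commitHash);

    console.log(fs.statSync(featureRef).size); // 41: 40 hex characters plus a newline
    console.log(fs.readFileSync(mainRef, 'utf-8') === fs.readFileSync(featureRef, 'utf-8')); // true
    ```

    Both ref files hold the same hash, so the two branches share every commit, tree, and blob until one of them moves.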

    **Follow-up: What is HEAD, and why is the distinction between HEAD pointing to a branch vs. pointing to a commit important?**

    HEAD is a reference stored in `.git/HEAD`. Normally it is symbolic: it contains `ref: refs/heads/main`, meaning "I am on branch main." When you commit, Git reads HEAD, follows the ref to the branch file, reads the current commit hash, creates the new commit with that as the parent, and updates the branch file with the new commit hash. HEAD itself does not change. When HEAD contains a raw commit hash (detached HEAD), commits still work, but the new hash is written into HEAD itself and no branch pointer advances. If you then switch away without creating a branch, those commits become "orphaned" -- reachable only through the reflog -- and will eventually be garbage collected. This is why `git checkout <commit>` prints a warning: it is not dangerous, but it creates a workflow where work can be silently lost.
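
    The two cases can be sketched as a single function that decides where a new commit hash gets recorded. This is a simplified illustration: `advanceHead` is a hypothetical helper, a temporary directory stands in for `.git`, and the hashes are placeholders.

    ```javascript
    const fs = require('fs');
    const os = require('os');
    const path = require('path');

    // Hypothetical helper: record a new commit hash wherever HEAD says it belongs.
    function advanceHead(gitDir, newCommitHash) {
        const headPath = path.join(gitDir, 'HEAD');
        const head = fs.readFileSync(headPath, 'utf-8').trim();

        if (head.startsWith('ref: ')) {
            // Symbolic HEAD: update the branch file and leave HEAD untouched.
            const refPath = path.join(gitDir, head.slice(5));
            fs.mkdirSync(path.dirname(refPath), { recursive: true });
            fs.writeFileSync(refPath, newCommitHash + '\n');
            return 'branch';
        }

        // Detached HEAD: the hash goes into HEAD itself; no branch moves.
        fs.writeFileSync(headPath, newCommitHash + '\n');
        return 'detached';
    }

    const repo = fs.mkdtempSync(path.join(os.tmpdir(), 'mygit-head-'));
    const headFile = path.join(repo, 'HEAD');
    const mainFile = path.join(repo, 'refs', 'heads', 'main');
    const hashA = 'a'.repeat(40); // placeholder hashes
    const hashB = 'b'.repeat(40);

    fs.writeFileSync(headFile, 'ref: refs/heads/main\n');
    const onBranch = advanceHead(repo, hashA);
    const headAfterBranchCommit = fs.readFileSync(headFile, 'utf-8');

    fs.writeFileSync(headFile, hashA + '\n'); // simulate checking out a commit directly
    const whileDetached = advanceHead(repo, hashB);
    const mainAfterDetachedCommit = fs.readFileSync(mainFile, 'utf-8');

    console.log(onBranch, whileDetached); // branch detached
    ```

    After the detached commit, `main` still holds the first hash while HEAD holds the second, so nothing except HEAD (and, in real Git, the reflog) remembers the newer commit.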
  </Accordion>
</AccordionGroup>
