> ## Documentation Index
> Fetch the complete documentation index at: https://resources.devweekends.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Neo4j Architecture & Native Graph Storage

> Master Neo4j's native graph storage engine, index structures, transaction management, and performance optimization

# Neo4j Architecture & Native Graph Storage

<Info>
  **Module Duration**: 6-8 hours
  **Learning Style**: Deep Technical + Implementation Details + Performance Analysis
  **Outcome**: Understand how Neo4j achieves O(1) traversals and ACID guarantees at the storage layer
</Info>

## Introduction: What Makes Neo4j "Native"?

Most databases were built for **tables**, then retrofitted for graphs. Neo4j was designed **from day one** for graphs.

**Key Difference**: **Index-free adjacency**

```
Traditional (PostgreSQL with graph extension):
MATCH (a:Person)-[:KNOWS]->(b)
→ Index scan for Person nodes
→ Index scan for KNOWS relationships
→ JOIN on IDs
→ O(log N) per hop

Neo4j (Native Graph):
MATCH (a:Person)-[:KNOWS]->(b)
→ Follow pointer from node to relationship to node
→ O(1) per hop (constant time, regardless of graph size!)
```

**Performance**: 1000x faster for multi-hop traversals!

This module reveals **how** Neo4j achieves this at the storage level.

***

## Part 1: High-Level Architecture

### System Layering

```
┌─────────────────────────────────────────────────────┐
│            Client Applications                       │
│  (Browser, Drivers: Java, Python, Go, JS)          │
└────────────────────┬────────────────────────────────┘
                     │ Bolt Protocol (Binary)
┌────────────────────▼────────────────────────────────┐
│               Query Engine                           │
│  ┌─────────────────────────────────────────┐       │
│  │  Cypher Parser                           │       │
│  │  (Convert Cypher → AST)                  │       │
│  └─────────────────┬───────────────────────┘       │
│                    │                                 │
│  ┌─────────────────▼───────────────────────┐       │
│  │  Query Planner                           │       │
│  │  (Optimize: cost-based, rule-based)      │       │
│  └─────────────────┬───────────────────────┘       │
│                    │                                 │
│  ┌─────────────────▼───────────────────────┐       │
│  │  Execution Engine                        │       │
│  │  (Execute plan: operators, pipelining)   │       │
│  └─────────────────┬───────────────────────┘       │
└────────────────────┼────────────────────────────────┘
                     │ Kernel API
┌────────────────────▼────────────────────────────────┐
│              Transaction Manager                     │
│  • ACID guarantees                                   │
│  • Write-Ahead Logging (WAL)                         │
│  • Locking (pessimistic, read/write locks)          │
└────────────────────┬────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────┐
│               Storage Engine                         │
│  ┌──────────────────────────────────────────┐      │
│  │  Page Cache (in-memory buffer)           │      │
│  │  • LRU eviction                           │      │
│  │  • Memory-mapped files                    │      │
│  └──────────────────┬───────────────────────┘      │
│                     │                                │
│  ┌──────────────────▼───────────────────────┐      │
│  │  Store Files (disk)                      │      │
│  │  • neostore.nodestore.db                 │      │
│  │  • neostore.relationshipstore.db         │      │
│  │  • neostore.propertystore.db             │      │
│  │  • neostore.labeltokenstore.db           │      │
│  └──────────────────────────────────────────┘      │
└─────────────────────────────────────────────────────┘
```

### Core Components

**1. Cypher Parser**

* Converts Cypher query string → Abstract Syntax Tree (AST)
* Validates syntax

**2. Query Planner**

* Generates execution plan (like SQL EXPLAIN)
* Cost-based optimization (estimates row counts, cardinality)
* Rule-based optimization (predicate push-down, etc.)

**3. Execution Engine**

* Executes query plan using operators (Scan, Filter, Expand, etc.)
* Pipelined execution (streaming results)

**4. Transaction Manager**

* Ensures ACID properties
* Write-Ahead Log (WAL) for durability
* Locking for isolation

**5. Storage Engine**

* Native graph storage (nodes, relationships, properties)
* Page cache (in-memory buffer pool)
* Indexes (for lookups)

***

## Part 2: Storage Layout

### Store Files

Neo4j stores graph data in **multiple fixed-size record stores**:

```bash theme={null}
# In data/databases/neo4j/
neostore.nodestore.db              # Node records
neostore.relationshipstore.db      # Relationship records
neostore.propertystore.db          # Property records
neostore.propertystore.db.strings  # String values
neostore.propertystore.db.arrays   # Array values
neostore.relationshipgroupstore.db # Relationship type groups (dense nodes)
neostore.labeltokenstore.db        # Label names
neostore.relationshiptypestore.db  # Relationship type names
neostore.schemastore.db            # Indexes, constraints
```

**Key Principle**: **Fixed-size records** enable O(1) access via record ID.

### Node Store

**Node Record Format** (15 bytes in Neo4j 4.x):

```
┌────────┬────────────┬────────────┬────────────┬────────┐
│ In Use │ Next Rel   │ Next Prop  │ Labels     │ Extra  │
│ 1 bit  │ 35 bits    │ 36 bits    │ 5 bytes    │ ...    │
└────────┴────────────┴────────────┴────────────┴────────┘
```

**Fields**:

* **In Use** (1 bit): Is this record active? (0 = deleted, 1 = active)
* **Next Rel** (35 bits): Pointer to first relationship (ID)
* **Next Prop** (36 bits): Pointer to first property (ID)
* **Labels** (5 bytes): Inline label storage or pointer to label array

**Example**:

Node with ID = 42:

```
Record at offset: 42 × 15 bytes = 630 bytes into neostore.nodestore.db

┌─────┬─────────┬─────────┬──────────────┐
│  1  │  10050  │  20010  │  0x0001      │  (Person label)
└─────┴─────────┴─────────┴──────────────┘
```

**Accessing Node 42**:

```java theme={null}
// Pseudocode
long nodeId = 42;
long offset = nodeId * NODE_RECORD_SIZE;  // 42 * 15 = 630
byte[] record = readFromFile("neostore.nodestore.db", offset, 15);

boolean inUse = getBit(record, 0);
long firstRelId = getBits(record, 1, 35);
long firstPropId = getBits(record, 36, 36);
```

**Time Complexity**: O(1)—direct calculation, no scanning!

### Relationship Store

**Relationship Record Format** (34 bytes in Neo4j 4.x):

```
┌────────┬────────┬────────┬─────────┬──────────┬──────────┬──────────┬──────────┬────────┐
│ In Use │ First  │ Second │ Rel     │ First    │ First    │ Second   │ Second   │ Next   │
│        │ Node   │ Node   │ Type    │ Prev Rel │ Next Rel │ Prev Rel │ Next Rel │ Prop   │
│ 1 bit  │ 35 bits│ 35 bits│ 16 bits │ 35 bits  │ 35 bits  │ 35 bits  │ 35 bits  │ 36 bits│
└────────┴────────┴────────┴─────────┴──────────┴──────────┴──────────┴──────────┴────────┘
```

**Fields**:

* **First Node**: Source node ID
* **Second Node**: Target node ID
* **Rel Type**: Relationship type ID (index into relationship type store)
* **First Prev/Next Rel**: Doubly-linked list for source node's relationships
* **Second Prev/Next Rel**: Doubly-linked list for target node's relationships
* **Next Prop**: Pointer to first property

**Why Doubly-Linked Lists?**

Each node maintains a **linked list of its relationships**:

```
Node A (ID=10):
  ↓ Next Rel = 100
Relationship 100: (A)-[:KNOWS]->(B)
  ├─ First Node = 10 (A)
  ├─ Second Node = 20 (B)
  ├─ First Next Rel = 101  ← Next relationship for A
  └─ Second Next Rel = 200 ← Next relationship for B

Relationship 101: (A)-[:LIKES]->(C)
  ├─ First Node = 10 (A)
  ├─ Second Node = 30 (C)
  └─ First Next Rel = NULL  ← End of A's relationships
```

**Traversal**:

To find all relationships for Node A:

1. Read Node A's record → get `Next Rel = 100`
2. Read Relationship 100 → check if `First Node == A` or `Second Node == A`
3. Follow appropriate `Next Rel` pointer (100 → 101 → NULL)

**Time Complexity**: O(D), where D = degree of node (number of relationships)

**Critical Insight**: **Independent of total graph size!**

* In relational DB: O(log N) per relationship (index scan across all edges)
* In Neo4j: O(D) (only scan node's own relationships)

**Example**:

* Graph with 1 billion relationships
* Node A has 10 relationships
* Relational: O(log 10⁹) ≈ 30 operations
* Neo4j: O(10) = 10 operations

**Speed-up**: 3x, and scales better!

### Property Store

**Property Record Format** (25 bytes):

```
┌────────┬─────────┬─────────┬──────────┬───────────┐
│ In Use │ Type    │ Key ID  │ Value    │ Next Prop │
│ 1 bit  │ 4 bits  │ 24 bits │ 64 bits  │ 36 bits   │
└────────┴─────────┴─────────┴──────────┴───────────┘
```

**Property Types**:

| Type           | Value Storage                        |
| -------------- | ------------------------------------ |
| `bool`         | Inline (1 bit in value field)        |
| `int`          | Inline (up to 64 bits)               |
| `float`        | Inline (IEEE 754, 64 bits)           |
| `short string` | Inline (up to 12 characters encoded) |
| `long string`  | Pointer to string store              |
| `array`        | Pointer to array store               |

**Property Chain**:

Properties form a **singly-linked list**:

```
Node A:
  Next Prop = 500

Property 500: {name: "Alice"}
  ├─ Key ID = 1 (points to "name" in property key store)
  ├─ Value = pointer to string store → "Alice"
  └─ Next Prop = 501

Property 501: {age: 30}
  ├─ Key ID = 2 ("age")
  ├─ Value = 30 (inline int)
  └─ Next Prop = NULL
```

**Reading All Properties**:

```java theme={null}
long propId = node.nextProp;  // 500
List<Property> props = new ArrayList<>();

while (propId != NULL) {
    PropertyRecord prop = readProperty(propId);
    props.add(new Property(
        getKeyName(prop.keyId),
        decodeValue(prop.type, prop.value)
    ));
    propId = prop.nextProp;
}
```

### String Store

Stores strings > 12 characters:

**String Record** (128 bytes, stores 120 chars + metadata):

```
┌────────┬──────────┬────────────────────────────────┐
│ In Use │ Length   │ Data (120 chars)               │
│ 1 byte │ 4 bytes  │ 120 bytes                      │
└────────┴──────────┴────────────────────────────────┘
```

**Long Strings** (> 120 chars):

Stored across **multiple records** forming a linked list:

```
String: "This is a very long string..." (200 chars)

Record 1: chars 0-119   + Next Record = 5001
Record 2: chars 120-199 + Next Record = NULL
```

### Dense Node Optimization

**Problem**: Nodes with **many relationships** (high degree) slow down traversals.

**Example**: Celebrity node with 1M followers

```
Without optimization:
MATCH (celeb)-[:FOLLOWED_BY]->(follower)
→ Scan 1M relationship records to find FOLLOWED_BY relationships!
```

**Solution**: **Relationship Group Store** (introduced Neo4j 2.1)

When a node exceeds a threshold (default: 50 relationships), Neo4j creates **relationship type groups**:

```
Node (Celebrity):
  Next Rel = NULL
  Next Rel Group = 7000  ← Points to group store

Relationship Group 7000:
  ├─ Type = FOLLOWED_BY
  ├─ First Outgoing = 8000
  ├─ First Incoming = 9000
  ├─ First Loop = NULL
  └─ Next Group = 7001

Relationship Group 7001:
  ├─ Type = FOLLOWS
  ├─ First Outgoing = 10000
  ├─ ...
```

**Traversal**:

```cypher theme={null}
MATCH (celeb)-[:FOLLOWED_BY]->(follower)
```

**Process**:

1. Read celeb node → Get `Next Rel Group = 7000`
2. Scan relationship groups for `FOLLOWED_BY` (typically \< 10 types)
3. Follow `First Outgoing` pointer
4. Scan only FOLLOWED\_BY relationships (not all 1M!)

**Performance**: From O(1M) to O(types) + O(followers of type) = much faster!

***

## Part 3: Indexes

### Label Indexes (Default)

When you create a label, Neo4j automatically creates an index for lookups:

```cypher theme={null}
CREATE (p:Person {name: "Alice"})
```

**Without Index**:

```cypher theme={null}
MATCH (p:Person {name: "Alice"})
→ Scan ALL nodes (O(N))
```

**With Label Index**:

```cypher theme={null}
MATCH (p:Person {name: "Alice"})
→ Scan only Person nodes (O(P), where P = # of Person nodes)
```

**Storage**:

Labels are stored in the **Node Record** (5 bytes for labels):

```
Node 42:
  Labels = 0x0001  → Label ID 1 = "Person"
```

Label IDs map to names in `neostore.labeltokenstore.db`.

### Property Indexes

**Create Index**:

```cypher theme={null}
CREATE INDEX person_name FOR (p:Person) ON (p.name)
```

**Index Structure** (Neo4j 4.x): **Native Index** (custom B+tree variant)

**Index File**: `schema/index/lucene-*/1/` (despite "lucene" in path, it's native in 4.x+)

**B+Tree Structure**:

```
                Root
               /    \
          [Alice]  [Charlie]
           /   \      /    \
     [Alice] [Bob] [Charlie] [David]
       ↓      ↓       ↓        ↓
    Node 10 Node 20 Node 30  Node 40
```

**Query**:

```cypher theme={null}
MATCH (p:Person {name: "Bob"})
```

**Process**:

1. Look up "Bob" in index → O(log P)
2. Get node ID (20)
3. Read node 20 → O(1)

**Total**: O(log P) \<\< O(P) (full scan)

### Composite Indexes

**Create**:

```cypher theme={null}
CREATE INDEX person_name_age FOR (p:Person) ON (p.name, p.age)
```

**Use Case**:

```cypher theme={null}
MATCH (p:Person {name: "Alice", age: 30})
```

**Index Key**: Composite key `(name, age)` → `("Alice", 30)`

**Benefit**: Single index lookup instead of two separate lookups + intersection.

### Full-Text Indexes

**Create**:

```cypher theme={null}
CREATE FULLTEXT INDEX person_description FOR (p:Person) ON EACH [p.name, p.bio]
```

**Query**:

```cypher theme={null}
CALL db.index.fulltext.queryNodes("person_description", "software engineer")
YIELD node, score
RETURN node.name, score
```

**Backend**: Apache Lucene (inverted index)

**Use Cases**:

* Fuzzy search
* Tokenization (split "software engineer" → \["software", "engineer"])
* Relevance scoring (TF-IDF)

### Vector Indexes (Neo4j 5.x)

**Create**:

```cypher theme={null}
CREATE VECTOR INDEX user_embedding FOR (u:User) ON (u.embedding)
OPTIONS {dimension: 256, similarity: 'cosine'}
```

**Query** (Nearest neighbor search):

```cypher theme={null}
MATCH (u:User)
WHERE u.embedding SIMILAR TO $query_vector
RETURN u.name, vector.similarity(u.embedding, $query_vector) AS score
ORDER BY score DESC
LIMIT 10
```

**Backend**: HNSW (Hierarchical Navigable Small World) graph for approximate nearest neighbor (ANN) search

**Use Cases**:

* Recommendation systems (user/item embeddings)
* Semantic search (document embeddings)
* Image similarity (CNN features)

***

## Part 4: Transaction Management

### ACID Guarantees

Neo4j provides **full ACID** transactions:

* **Atomicity**: All-or-nothing (commit or rollback)
* **Consistency**: Constraints enforced (uniqueness, existence)
* **Isolation**: Transactions don't interfere (locking)
* **Durability**: Committed transactions survive crashes (WAL)

### Write-Ahead Log (WAL)

**Purpose**: Durability (survive crashes)

**Location**: `neostore.transaction.db.0`, `neostore.transaction.db.1`, ...

**Process**:

```
1. Transaction begins
   ↓
2. Modifications stored in memory
   ↓
3. Transaction commits
   ↓
4. Write transaction log entry to WAL (fsync to disk)
   ↓
5. Respond "SUCCESS" to client
   ↓
6. Apply changes to store files (asynchronous)
```

**Log Entry Format**:

```
┌──────────┬─────────┬──────────────────────────────┐
│ TX ID    │ Command │ Data                         │
│ 8 bytes  │ 1 byte  │ Variable                     │
└──────────┴─────────┴──────────────────────────────┘
```

**Commands**:

* `CREATE_NODE`: Node ID, labels, properties
* `CREATE_RELATIONSHIP`: Rel ID, type, start/end nodes, properties
* `SET_PROPERTY`: Node/Rel ID, key, value
* `DELETE_NODE`: Node ID

**Crash Recovery**:

```
On startup:
1. Read last committed transaction ID from store
2. Replay WAL entries from that point forward
3. Restore to consistent state
```

**Checkpointing**:

Periodically, Neo4j flushes in-memory changes to store files and **truncates WAL**:

```
┌─────────────────────────────────────────────────────┐
│                  Timeline                            │
├─────────────────────────────────────────────────────┤
│ t=0:   Last checkpoint (TX ID = 1000)               │
│ t=10:  TX 1001, 1002, 1003 (in WAL)                 │
│ t=20:  TX 1004, 1005 (in WAL)                       │
│ t=30:  Checkpoint! (flush 1001-1005 to store)       │
│        → Truncate WAL (safe to discard 1001-1005)   │
│ t=40:  TX 1006, 1007 (new WAL)                      │
└─────────────────────────────────────────────────────┘
```

**Configuration**:

```conf theme={null}
# neo4j.conf
db.checkpoint.interval.time=15m   # Checkpoint every 15 minutes
db.checkpoint.interval.tx=100000  # Or every 100K transactions
```

### Locking

Neo4j uses **pessimistic locking** (lock before modify).

**Lock Types**:

1. **Read Locks** (Shared locks):
   * Multiple transactions can hold read locks simultaneously
   * Prevents writes while reading

2. **Write Locks** (Exclusive locks):
   * Only one transaction can hold a write lock
   * Blocks reads and writes

**Lock Granularity**:

Neo4j locks at the **node/relationship level**:

```cypher theme={null}
// Transaction 1:
MATCH (p:Person {name: "Alice"})
SET p.age = 31  ← Acquires WRITE lock on Node(Alice)

// Transaction 2 (concurrent):
MATCH (p:Person {name: "Alice"})
RETURN p.age  ← Blocks until TX1 commits (released lock)
```

**Deadlock Detection**:

```cypher theme={null}
// Transaction 1:
MATCH (a:Person {name: "Alice"})
MATCH (b:Person {name: "Bob"})
SET a.age = 31, b.age = 32

// Transaction 2 (concurrent):
MATCH (b:Person {name: "Bob"})
MATCH (a:Person {name: "Alice"})
SET b.age = 33, a.age = 34
```

**Deadlock**:

* TX1 locks Alice, waits for Bob
* TX2 locks Bob, waits for Alice
* **Deadlock!**

**Resolution**: Neo4j detects deadlock and aborts one transaction:

```
Neo4jException: DeadlockDetected
Transaction aborted. Please retry.
```

**Best Practice**: Always acquire locks in **consistent order**:

```cypher theme={null}
// Both transactions lock in order: Alice → Bob
MATCH (a:Person {name: "Alice"})
MATCH (b:Person {name: "Bob"})
SET a.age = ..., b.age = ...
```

### Isolation Levels

Neo4j provides **Read Committed** isolation by default:

**Behavior**:

* Reads see committed data only (no dirty reads)
* Phantom reads possible (another TX inserts/deletes between reads)

**Example**:

```cypher theme={null}
// Transaction 1:
BEGIN
MATCH (p:Person) RETURN count(p)  → 100

// Transaction 2 (concurrent):
CREATE (p:Person {name: "New"})
COMMIT

// Transaction 1 (continues):
MATCH (p:Person) RETURN count(p)  → 101 (phantom read!)
COMMIT
```

**Higher Isolation** (Serializable):

Neo4j doesn't support serializable isolation natively. Use explicit locking:

```cypher theme={null}
// Pessimistic locking (manual):
MATCH (counter:Counter {id: 1})
SET counter.value = counter.value + 1  ← Acquires write lock
RETURN counter.value
```

***

## Part 5: Page Cache

### Memory Architecture

**Page Cache** = Neo4j's in-memory buffer pool (like PostgreSQL's shared buffers).

**Purpose**:

* Cache frequently accessed pages (node, relationship, property records)
* Reduce disk I/O

**Configuration**:

```conf theme={null}
# neo4j.conf
db.memory.pagecache.size=4g  # 4GB for page cache
```

**Sizing Rule**: `pagecache.size = 0.5 × (RAM - heap - OS)`

**Example** (64GB RAM):

```
Heap (JVM): 8GB
OS: 8GB
Available for page cache: 64 - 8 - 8 = 48GB
Recommended: 24-32GB
```

### Page Cache Structure

**Page Size**: 8KB (default, configurable)

**Cache Entry**:

```
┌─────────────────────────────────────────┐
│  File: neostore.nodestore.db            │
│  Offset: 630 (Node ID 42)               │
│  Data: [15 bytes of node record]        │
│  LRU: Last accessed timestamp           │
└─────────────────────────────────────────┘
```

**Page Eviction**: LRU (Least Recently Used)

When cache is full:

1. Find least recently used page
2. If dirty (modified), flush to disk
3. Evict page
4. Load new page into cache

### Monitoring Page Cache

```cypher theme={null}
CALL dbms.queryJmx("org.neo4j:instance=kernel#0,name=Page cache")
YIELD attributes
RETURN attributes.hits, attributes.faults, attributes.evictions
```

**Metrics**:

* **Hits**: Requests served from cache (fast!)
* **Faults**: Requests requiring disk read (slow)
* **Evictions**: Pages evicted to make room

**Hit Rate**: `hits / (hits + faults)`

**Target**: > 95% hit rate

**Example**:

```
Hits: 950,000
Faults: 50,000
Hit Rate: 950,000 / 1,000,000 = 95%  ✓ Good!
```

**If hit rate \< 90%**: Increase page cache size!

***

## Part 6: Query Execution

### Explain and Profile

**Explain** (planning only, no execution):

```cypher theme={null}
EXPLAIN
MATCH (p:Person {name: "Alice"})-[:KNOWS]->(friend)
RETURN friend.name
```

**Output**:

```
+---------------------------+
| Operator         | Rows  |
+---------------------------+
| ProduceResults   | 10    |
| Projection       | 10    |
| Expand(All)      | 10    |
| NodeIndexSeek    | 1     |
+---------------------------+
```

**Profile** (execute and collect statistics):

```cypher theme={null}
PROFILE
MATCH (p:Person {name: "Alice"})-[:KNOWS]->(friend)
RETURN friend.name
```

**Output**:

```
+----------------------------------------+
| Operator         | Rows | DB Hits     |
+----------------------------------------+
| ProduceResults   | 10   | 0           |
| Projection       | 10   | 20          |
| Expand(All)      | 10   | 21          |
| NodeIndexSeek    | 1    | 2           |
+----------------------------------------+
Total DB Hits: 43
```

**DB Hits**: Number of page cache accesses (lower = better!)

### Query Operators

**NodeIndexSeek**:

```
Look up node by indexed property
Cost: O(log N)
DB Hits: 1-3 (index lookup + node fetch)
```

**NodeByLabelScan**:

```
Scan all nodes with a specific label
Cost: O(L), where L = # of labeled nodes
DB Hits: L
```

**Expand(All)**:

```
Traverse relationships from a node
Cost: O(D), where D = degree
DB Hits: D (read relationship records)
```

**Filter**:

```
Apply predicate (WHERE clause)
Cost: O(R), where R = input rows
DB Hits: R × P (P = properties checked per row)
```

**Sort**:

```
Sort results (ORDER BY)
Cost: O(R log R)
DB Hits: 0 (in-memory sort)
```

**Aggregation** (COUNT, SUM, etc.):

```
Group and aggregate
Cost: O(R)
DB Hits: 0 (in-memory)
```

### Example Query Analysis

**Query**:

```cypher theme={null}
MATCH (p:Person {name: "Alice"})-[:KNOWS]->(friend:Person)
WHERE friend.age > 25
RETURN friend.name, friend.age
ORDER BY friend.age DESC
```

**Plan**:

```
ProduceResults
  ↑
Sort (friend.age DESC)
  ↑
Projection (friend.name, friend.age)
  ↑
Filter (friend.age > 25)
  ↑
Expand(All) [:KNOWS]
  ↑
NodeIndexSeek (Person.name = "Alice")
```

**Step-by-Step**:

1. **NodeIndexSeek**: Look up Alice in `Person.name` index
   * Cost: O(log P)
   * DB Hits: 2
   * Rows: 1

2. **Expand**: Traverse `:KNOWS` relationships
   * Cost: O(D), D = Alice's degree (e.g., 10 friends)
   * DB Hits: 10 (relationship records)
   * Rows: 10

3. **Filter**: Check `friend.age > 25`
   * Cost: O(10)
   * DB Hits: 10 (read `age` property)
   * Rows: 6 (assume 6 friends > 25)

4. **Projection**: Extract `name` and `age`
   * Cost: O(6)
   * DB Hits: 6 (read `name` property)
   * Rows: 6

5. **Sort**: Order by `age DESC`
   * Cost: O(6 log 6) ≈ 15
   * DB Hits: 0 (in-memory)
   * Rows: 6

6. **ProduceResults**: Return to client
   * Rows: 6

**Total DB Hits**: 2 + 10 + 10 + 6 = 28

### Optimization Tips

**1. Use Indexes**:

```cypher theme={null}
-- Bad (full scan)
MATCH (p:Person)
WHERE p.name = "Alice"

-- Good (index seek)
CREATE INDEX FOR (p:Person) ON (p.name)
MATCH (p:Person {name: "Alice"})
```

**2. Filter Early**:

```cypher theme={null}
-- Bad (expand then filter)
MATCH (p:Person {name: "Alice"})-[:KNOWS]->(friend)
WHERE friend.age > 25

-- Good (filter during expand, if possible)
MATCH (p:Person {name: "Alice"})-[:KNOWS]->(friend:Person)
WHERE friend.age > 25
```

**3. Limit Results**:

```cypher theme={null}
-- Add LIMIT to avoid processing unnecessary rows
MATCH (p:Person)-[:KNOWS]->(friend)
RETURN friend.name
LIMIT 10
```

**4. Use EXPLAIN/PROFILE**:

Always check query plans for:

* Missing indexes (NodeByLabelScan instead of NodeIndexSeek)
* High DB Hits
* Cartesian products (avoid!)

***

## Part 7: Performance Characteristics

### Time Complexity

| Operation                       | Neo4j    | Relational DB |
| ------------------------------- | -------- | ------------- |
| Find node by ID                 | O(1)     | O(1)          |
| Find node by indexed property   | O(log N) | O(log N)      |
| Find node by unindexed property | O(N)     | O(N)          |
| Get node's relationships        | O(D)     | O(log N)      |
| Traverse 2 hops                 | O(D²)    | O(N²)         |
| Traverse 3 hops                 | O(D³)    | O(N³)         |
| Shortest path (BFS, depth L)    | O(D^L)   | O(N^L)        |

**Typical Values**:

* N = 1,000,000 (total nodes)
* D = 50 (average degree)

**3-hop traversal**:

* Neo4j: 50³ = 125,000
* Relational: (10⁶)³ = 10¹⁸ (not feasible!)

### Benchmark: Social Network Queries

**Setup**:

* 1M users
* 50M friendships (average 50 friends/user)

**Query 1: Direct friends**

```cypher theme={null}
MATCH (u:User {id: 12345})-[:FRIEND]->(friend)
RETURN friend.name
```

| Database   | Time |
| ---------- | ---- |
| Neo4j      | 5ms  |
| PostgreSQL | 20ms |

**Query 2: Friends of friends**

```cypher theme={null}
MATCH (u:User {id: 12345})-[:FRIEND*2]->(fof)
RETURN fof.name
```

| Database   | Time    |
| ---------- | ------- |
| Neo4j      | 50ms    |
| PostgreSQL | 5,000ms |

**Query 3: Friends up to 3 hops**

```cypher theme={null}
MATCH (u:User {id: 12345})-[:FRIEND*1..3]->(friend)
RETURN DISTINCT friend.name
```

| Database   | Time            |
| ---------- | --------------- |
| Neo4j      | 200ms           |
| PostgreSQL | Timeout (> 60s) |

**Key Takeaway**: Neo4j shines for **deep traversals** (2+ hops).

***

## Part 8: Hands-On Exercises

### Exercise 1: Explore Store Files

**Task**: Inspect Neo4j store files and calculate sizes

```bash theme={null}
# Navigate to Neo4j data directory
cd <NEO4J_HOME>/data/databases/neo4j/

# List store files
ls -lh neostore.*

# Example output:
# -rw-r--r-- neostore.nodestore.db          150M
# -rw-r--r-- neostore.relationshipstore.db  2.1G
# -rw-r--r-- neostore.propertystore.db      500M
```

**Questions**:

1. How many nodes? `150M / 15 bytes = 10M nodes`
2. How many relationships? `2.1G / 34 bytes ≈ 61.7M relationships`
3. Average degree? `61.7M × 2 / 10M ≈ 12.3`

### Exercise 2: Analyze Query Plans

**Query**:

```cypher theme={null}
PROFILE
MATCH (p:Person {name: "Alice"})-[:KNOWS*2]->(fof)
RETURN fof.name
```

**Tasks**:

1. Identify bottleneck operator (highest DB Hits)
2. Check if index is used (NodeIndexSeek vs NodeByLabelScan)
3. Optimize query (add indexes, limit results)

### Exercise 3: Measure Page Cache Hit Rate

```cypher theme={null}
// Run query multiple times
MATCH (p:Person)-[:KNOWS]->(friend)
RETURN count(friend)

// Check hit rate
CALL dbms.queryJmx("org.neo4j:instance=kernel#0,name=Page cache")
YIELD attributes
RETURN attributes.hits, attributes.faults,
       (toFloat(attributes.hits) / (attributes.hits + attributes.faults)) AS hit_rate
```

**Expected**: Hit rate increases on subsequent runs (data cached)

***

## Summary

**Native Graph Storage**:

* Fixed-size records (O(1) access by ID)
* Pointers for relationships (no JOINs!)
* Index-free adjacency (O(D) traversal, not O(N))

**Indexes**:

* Label indexes (filter by node type)
* Property indexes (B+tree, O(log N))
* Full-text indexes (Lucene)
* Vector indexes (HNSW for ANN)

**Transactions**:

* Full ACID guarantees
* Write-Ahead Log (durability)
* Pessimistic locking (read/write locks)
* Deadlock detection

**Performance**:

* 10-1000x faster than relational for traversals
* Scales with degree (D), not graph size (N)
* Page cache critical for performance (aim > 95% hit rate)

***

## What's Next?

<Card title="Module 4: Cypher Query Language Mastery" icon="code" href="/distributed-systems-tools/neo4j-cypher-mastery">
  Master Cypher syntax, pattern matching, aggregations, and advanced query techniques
</Card>
