Neo4j/Cypher Mastery
Comprehensive guide to Neo4j graph database and Cypher query language. Covers fundamental concepts, common patterns, performance optimization, schema design, and integration with PostgreSQL/Supabase.
When This Skill Applies
Use this skill when:
- •Writing Cypher queries
- •Designing graph schemas
- •Optimizing graph traversals
- •Building recommendation systems
- •Modeling hierarchies or networks
- •Integrating Neo4j with relational databases
- •Questions about graph database patterns
Core Concepts
Nodes, Relationships, Properties
Nodes - Entities (nouns):
// Simple node
CREATE (u:User)
// Node with properties
CREATE (u:User {
id: 'user-123',
name: 'Alice',
email: 'alice@example.com'
})
// Multiple labels
CREATE (p:Person:Developer {name: 'Bob'})
Relationships - Connections (verbs):
// Simple relationship
CREATE (a)-[:FOLLOWS]->(b)
// Relationship with properties
CREATE (a)-[:FOLLOWS {since: date(), strength: 'strong'}]->(b)
// Relationship types are UPPERCASE by convention
CREATE (a)-[:MEMBER_OF {role: 'admin'}]->(org)
Properties - Attributes (key-value pairs):
// Node properties
{
id: 'user-123',
name: 'Alice',
age: 30,
verified: true,
createdAt: datetime()
}
// Relationship properties
{
since: date(),
weight: 0.85,
type: 'professional'
}
Graph Thinking
Relational mindset:
-- Joins and foreign keys SELECT * FROM users u JOIN follows f ON f.follower_id = u.id JOIN users u2 ON f.followed_id = u2.id WHERE u.id = '123';
Graph mindset:
// Pattern matching
MATCH (u:User {id: '123'})-[:FOLLOWS]->(friend)
RETURN friend;
Key difference: Relationships are first-class citizens in graphs.
Cypher Fundamentals
MATCH - Finding Patterns
Basic pattern:
// Find all users
MATCH (u:User)
RETURN u;
// Find users with specific property
MATCH (u:User {name: 'Alice'})
RETURN u;
// Find users matching condition
MATCH (u:User)
WHERE u.age > 25
RETURN u;
Relationship patterns:
// Outgoing relationship MATCH (a)-[:FOLLOWS]->(b) RETURN a, b; // Incoming relationship MATCH (a)<-[:FOLLOWS]-(b) RETURN a, b; // Any direction MATCH (a)-[:FOLLOWS]-(b) RETURN a, b; // Multiple relationships MATCH (a)-[:FOLLOWS]->(b)-[:FOLLOWS]->(c) RETURN a, b, c; // Variable length MATCH (a)-[:FOLLOWS*1..3]->(b) RETURN a, b;
CREATE - Adding Data
Create nodes:
// Single node
CREATE (u:User {id: 'user-123', name: 'Alice'})
RETURN u;
// Multiple nodes
CREATE
(a:User {name: 'Alice'}),
(b:User {name: 'Bob'}),
(c:User {name: 'Charlie'});
Create relationships:
// Find existing nodes, create relationship
MATCH (a:User {name: 'Alice'})
MATCH (b:User {name: 'Bob'})
CREATE (a)-[:FOLLOWS]->(b);
// Create nodes and relationships together
CREATE (a:User {name: 'Alice'})-[:FOLLOWS]->(b:User {name: 'Bob'});
MERGE - Create or Match
Create if not exists:
// Create user only if doesn't exist
MERGE (u:User {id: 'user-123'})
ON CREATE SET u.name = 'Alice', u.createdAt = datetime()
ON MATCH SET u.lastSeen = datetime()
RETURN u;
// Create relationship only if doesn't exist
MATCH (a:User {id: 'user-123'})
MATCH (b:User {id: 'user-456'})
MERGE (a)-[r:FOLLOWS]->(b)
ON CREATE SET r.since = datetime()
RETURN r;
Important: MERGE matches on entire pattern:
// This matches on ALL properties
MERGE (u:User {id: 'user-123', name: 'Alice'})
// Better: Match on unique constraint only
MERGE (u:User {id: 'user-123'})
SET u.name = 'Alice'
SET - Updating Properties
// Set single property
MATCH (u:User {id: 'user-123'})
SET u.name = 'Alicia'
RETURN u;
// Set multiple properties
MATCH (u:User {id: 'user-123'})
SET u.name = 'Alicia', u.verified = true
RETURN u;
// Set properties from map
MATCH (u:User {id: 'user-123'})
SET u += {name: 'Alicia', age: 31}
RETURN u;
// Add label
MATCH (u:User {id: 'user-123'})
SET u:Verified
RETURN u;
DELETE - Removing Data
// Delete node (only if no relationships)
MATCH (u:User {id: 'user-123'})
DELETE u;
// Delete node and all relationships
MATCH (u:User {id: 'user-123'})
DETACH DELETE u;
// Delete relationship only
MATCH (a:User)-[r:FOLLOWS]->(b:User)
WHERE a.id = 'user-123' AND b.id = 'user-456'
DELETE r;
// Delete properties
MATCH (u:User {id: 'user-123'})
REMOVE u.age, u.verified
RETURN u;
RETURN - Formatting Results
// Return nodes MATCH (u:User) RETURN u; // Return specific properties MATCH (u:User) RETURN u.id, u.name; // Alias properties MATCH (u:User) RETURN u.name AS userName, u.email AS userEmail; // Return count MATCH (u:User) RETURN count(u) AS totalUsers; // Return distinct MATCH (u:User)-[:FOLLOWS]->(friend) RETURN DISTINCT friend.name;
Common Patterns
Social Graph Patterns
Followers/Following:
// Get user's followers
MATCH (follower:User)-[:FOLLOWS]->(u:User {id: $userId})
RETURN follower;
// Get who user follows
MATCH (u:User {id: $userId})-[:FOLLOWS]->(following)
RETURN following;
// Mutual follows (friends)
MATCH (a:User {id: $userId})-[:FOLLOWS]->(b:User)
MATCH (b)-[:FOLLOWS]->(a)
RETURN b AS friend;
// Follow suggestions (friends of friends, not already following)
MATCH (u:User {id: $userId})-[:FOLLOWS]->()-[:FOLLOWS]->(suggestion)
WHERE NOT (u)-[:FOLLOWS]->(suggestion)
AND u <> suggestion
RETURN DISTINCT suggestion
LIMIT 10;
Blocking:
// Create block relationship
MATCH (a:User {id: $userId})
MATCH (b:User {id: $blockUserId})
MERGE (a)-[:BLOCKED]->(b);
// Get all users except blocked
MATCH (u:User)
WHERE NOT (:User {id: $currentUserId})-[:BLOCKED]->(u)
AND NOT (u)-[:BLOCKED]->(:User {id: $currentUserId})
RETURN u;
Hierarchy Patterns
Organizational structure:
// Find all reports (direct and indirect)
MATCH (manager:Person {id: $managerId})-[:MANAGES*]->(report:Person)
RETURN report;
// Find direct reports only
MATCH (manager:Person {id: $managerId})-[:MANAGES]->(report:Person)
RETURN report;
// Find manager chain up to CEO
MATCH path = (person:Person {id: $personId})-[:REPORTS_TO*]->(ceo:Person)
WHERE NOT (ceo)-[:REPORTS_TO]->()
RETURN nodes(path);
// Find all people in same department
MATCH (person:Person {id: $personId})-[:MEMBER_OF]->(dept:Department)
MATCH (colleague:Person)-[:MEMBER_OF]->(dept)
WHERE person <> colleague
RETURN colleague;
Category hierarchies:
// Find all subcategories
MATCH (parent:Category {id: $categoryId})-[:PARENT_OF*]->(child:Category)
RETURN child;
// Find path to root category
MATCH path = (cat:Category {id: $categoryId})-[:CHILD_OF*]->(root:Category)
WHERE NOT (root)-[:CHILD_OF]->()
RETURN nodes(path);
Recommendation Patterns
Collaborative filtering:
// Users who liked similar items
MATCH (u:User {id: $userId})-[:LIKED]->(item:Item)
MATCH (item)<-[:LIKED]-(other:User)
MATCH (other)-[:LIKED]->(recommendation:Item)
WHERE NOT (u)-[:LIKED]->(recommendation)
RETURN recommendation, count(*) AS score
ORDER BY score DESC
LIMIT 10;
// Weighted recommendations
MATCH (u:User {id: $userId})-[r1:LIKED]->(item:Item)
MATCH (item)<-[r2:LIKED]-(other:User)
MATCH (other)-[r3:LIKED]->(recommendation:Item)
WHERE NOT (u)-[:LIKED]->(recommendation)
WITH recommendation,
sum(r1.weight * r2.weight * r3.weight) AS score
RETURN recommendation
ORDER BY score DESC
LIMIT 10;
Content-based filtering:
// Items similar to liked items
MATCH (u:User {id: $userId})-[:LIKED]->(item:Item)
MATCH (item)-[:HAS_TAG]->(tag:Tag)
MATCH (tag)<-[:HAS_TAG]-(similar:Item)
WHERE NOT (u)-[:LIKED]->(similar)
AND item <> similar
RETURN similar, count(tag) AS commonTags
ORDER BY commonTags DESC
LIMIT 10;
Path Finding
Shortest path:
// Shortest path between two users
MATCH path = shortestPath(
(a:User {id: $userId1})-[:FOLLOWS*]-(b:User {id: $userId2})
)
RETURN path, length(path);
// All shortest paths
MATCH path = allShortestPaths(
(a:User {id: $userId1})-[:FOLLOWS*]-(b:User {id: $userId2})
)
RETURN path;
Dijkstra's algorithm (weighted paths):
// Find cheapest route
CALL gds.shortestPath.dijkstra.stream('graph', {
sourceNode: $startNodeId,
targetNode: $endNodeId,
relationshipWeightProperty: 'cost'
})
YIELD path, totalCost
RETURN path, totalCost;
Access Control
Permission hierarchies:
// Check if user has permission
MATCH (u:User {id: $userId})-[:HAS_ROLE]->(role:Role)
MATCH (role)-[:HAS_PERMISSION*0..]->(permission:Permission {name: $permissionName})
RETURN count(permission) > 0 AS hasPermission;
// Get all user permissions (including inherited)
MATCH (u:User {id: $userId})-[:HAS_ROLE]->(role:Role)
MATCH (role)-[:HAS_PERMISSION*0..]->(permission:Permission)
RETURN DISTINCT permission;
Performance Optimization
Indexes and Constraints
Unique constraints (automatically create index):
// Unique user ID CREATE CONSTRAINT user_id_unique FOR (u:User) REQUIRE u.id IS UNIQUE; // Unique email CREATE CONSTRAINT user_email_unique FOR (u:User) REQUIRE u.email IS UNIQUE;
Regular indexes:
// Index on property CREATE INDEX user_name_index FOR (u:User) ON (u.name); // Composite index CREATE INDEX user_location_index FOR (u:User) ON (u.city, u.country); // Full-text search CREATE FULLTEXT INDEX user_search_index FOR (u:User) ON EACH [u.name, u.bio, u.email];
Use indexes:
// Full-text search
CALL db.index.fulltext.queryNodes('user_search_index', 'Alice')
YIELD node, score
RETURN node, score;
Query Optimization
Use PROFILE to analyze:
PROFILE
MATCH (u:User {id: $userId})-[:FOLLOWS*1..3]->(friend)
RETURN friend;
Optimization tips:
1. Start with most specific nodes:
// ✗ Bad: Starts with all users
MATCH (u:User)-[:FOLLOWS]->(friend:User {id: $friendId})
RETURN u;
// ✓ Good: Starts with specific user
MATCH (u:User)-[:FOLLOWS]->(friend:User)
WHERE friend.id = $friendId
RETURN u;
2. Limit relationship depth:
// ✗ Bad: Unbounded traversal
MATCH (u:User {id: $userId})-[:FOLLOWS*]->(friend)
RETURN friend;
// ✓ Good: Bounded traversal
MATCH (u:User {id: $userId})-[:FOLLOWS*1..3]->(friend)
RETURN friend;
3. Use LIMIT early:
// ✓ Good: Limit before expensive operations MATCH (u:User) RETURN u ORDER BY u.createdAt DESC LIMIT 10;
4. Avoid Cartesian products:
// ✗ Bad: Creates cartesian product MATCH (a:User), (b:User) WHERE a.city = b.city RETURN a, b; // ✓ Good: Connect via relationship or property MATCH (a:User)-[:LIVES_IN]->(city:City)<-[:LIVES_IN]-(b:User) RETURN a, b;
Batch Operations
Bulk create:
// Create many nodes efficiently
UNWIND $users AS userData
MERGE (u:User {id: userData.id})
SET u.name = userData.name, u.email = userData.email;
// Create many relationships
UNWIND $follows AS follow
MATCH (a:User {id: follow.followerId})
MATCH (b:User {id: follow.followedId})
MERGE (a)-[:FOLLOWS {since: follow.since}]->(b);
Use APOC for batching:
// Process in batches of 1000
CALL apoc.periodic.iterate(
"MATCH (u:User) RETURN u",
"SET u.processed = true",
{batchSize: 1000}
);
Schema Design
Modeling Guidelines
Nodes: Represent entities
(:User) (:Post) (:Comment) (:Tag)
Relationships: Represent connections
(:User)-[:POSTED]->(:Post) (:User)-[:COMMENTED]->(:Comment) (:Comment)-[:ON]->(:Post) (:Post)-[:TAGGED]->(:Tag)
Properties: Store attributes
// On nodes
User {id, name, email, createdAt}
// On relationships
FOLLOWS {since, strength}
LIKED {rating, timestamp}
When to Use Relationships vs Properties
Use relationship when:
- •Connection between entities
- •Need to query traversals
- •Connection has properties
- •Many-to-many relationship
// ✓ Good: Relationship
(user:User)-[:LIKED {rating: 5}]->(post:Post)
Use property when:
- •Simple value
- •Doesn't need traversal
- •One-to-one relationship
- •Rarely queried independently
// ✓ Good: Property
(:User {email: 'alice@example.com'})
Multiple Labels
Use multiple labels for:
- •Shared behaviour
- •Polymorphism
- •Categorization
// User can be both Person and Developer
CREATE (p:Person:Developer {name: 'Alice'})
// Query specific type
MATCH (d:Developer)
RETURN d;
// Query any person
MATCH (p:Person)
RETURN p;
Integration with PostgreSQL/Supabase
Shared Primary Keys
Use same UUIDs:
// Create in PostgreSQL
const { data: user } = await supabase
.from('users')
.insert({
id: userId,
email: 'alice@example.com',
name: 'Alice'
});
// Create in Neo4j
await neo4j.run(`
CREATE (u:User {
id: $userId,
name: $name
})
`, { userId, name: user.name });
Data Synchronization
Event-driven sync:
// PostgreSQL trigger → Sync to Neo4j
supabase
.from('users')
.on('INSERT', async (payload) => {
await neo4j.run(`
MERGE (u:User {id: $id})
SET u.name = $name, u.email = $email
`, payload.record);
})
.subscribe();
Batch sync:
// Bulk sync from PostgreSQL to Neo4j
const { data: users } = await supabase
.from('users')
.select('*');
await neo4j.run(`
UNWIND $users AS userData
MERGE (u:User {id: userData.id})
SET u.name = userData.name,
u.email = userData.email
`, { users });
Query Patterns
Hybrid queries:
// Get user from PostgreSQL
const { data: user } = await supabase
.from('users')
.select('*')
.eq('id', userId)
.single();
// Get social graph from Neo4j
const result = await neo4j.run(`
MATCH (u:User {id: $userId})
MATCH (u)-[:FOLLOWS]->(following:User)
MATCH (follower:User)-[:FOLLOWS]->(u)
RETURN
count(DISTINCT following) as followingCount,
count(DISTINCT follower) as followerCount
`, { userId });
// Combine results
return {
...user,
social: {
followingCount: result.records[0].get('followingCount'),
followerCount: result.records[0].get('followerCount')
}
};
Portfolio Evidence
KSBs Demonstrated:
- •S6: Design and Implement Database Systems (graph modeling)
- •S1: Analyse Requirements (choosing graph for relationships)
- •S8: Create Analysis Artefacts (query optimization)
How to Document:
- •Schema diagrams showing graph structure
- •Query examples with performance comparisons
- •Explain why graph chosen over relational
- •Document traversal patterns
- •Show integration with other databases
Success Criteria
Neo4j implementation is successful when:
- •Queries leverage graph traversal strengths
- •Indexes on frequently queried properties
- •Bounded traversals (not unbounded
*) - •Clear distinction between nodes/relationships/properties
- •Integration with relational database clean
- •Performance acceptable for use case
- •Schema supports future requirements