X Retrieval Systems
Expertise in X's multi-stage retrieval architecture, including Earlybird search indexing and Phoenix-based ANN similarity search.
Context
Retrieval at X is split between In-Network (content from accounts you follow) and Out-of-Network (discovery). In-Network retrieval relies on Earlybird, a Lucene-based search index. Out-of-Network retrieval uses Phoenix, a Two-Tower embedding model, together with Approximate Nearest Neighbor (ANN) search algorithms such as HNSW.
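To make the Out-of-Network path concrete, here is a minimal sketch of embedding-based retrieval. It assumes the user tower and the candidate tower have already produced vectors, and it scores every candidate exactly by cosine similarity; an ANN index such as HNSW would approximate the same top-k without scanning the full set. The function names, post IDs, and vectors are hypothetical and are not taken from Phoenix.

```python
import heapq
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(user_embedding, candidate_embeddings, k):
    """Exact top-k by cosine similarity (brute force, not ANN)."""
    query = normalize(user_embedding)
    scored = (
        (dot(query, normalize(vec)), cid)
        for cid, vec in candidate_embeddings.items()
    )
    # nlargest keeps only the k best (score, id) pairs.
    return [(cid, score) for score, cid in heapq.nlargest(k, scored)]


# Hypothetical candidate-tower outputs keyed by post id.
candidates = {
    "post_a": [0.9, 0.1, 0.0],
    "post_b": [0.0, 1.0, 0.0],
    "post_c": [0.7, 0.7, 0.0],
}
user = [1.0, 0.2, 0.0]  # hypothetical user-tower output

top = retrieve_top_k(user, candidates, k=2)
print([cid for cid, _ in top])
```

The brute-force scan is linear in the number of candidates, which is why a graph index like HNSW is typically used once the candidate pool reaches hundreds of millions of items.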
What it does
- Decodes In-Network Sourcing: Explains how Earlybird shards the index into Realtime, Protected, and Archive clusters.
- Explains Discovery Logic: Details how Two-Tower models enable "semantic" search over content from accounts you don't follow.
- Analyzes Latency: Breaks down the single-writer/multi-reader concurrency model that allows for sub-second global retrieval.
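The single-writer/multi-reader model mentioned above can be illustrated with a toy in-memory segment. This is a sketch under stated assumptions, not Earlybird's implementation: the class `AppendOnlySegment` and its methods are hypothetical, and the example relies on CPython's global interpreter lock to make list appends and integer assignments atomic, where a production index would use explicit memory barriers. The idea shown is that one writer appends postings and then publishes a visibility bound, while readers snapshot that bound and ignore anything newer, so no read lock is needed.

```python
import threading


class AppendOnlySegment:
    """Toy index segment: one writer appends, many readers scan without locks."""

    def __init__(self):
        self._postings = {}    # term -> append-only list of doc ids
        self._published = 0    # doc ids below this bound are visible to readers
        self._next_doc_id = 0

    def add_document(self, terms):
        # Must be called from the single writer thread only.
        doc_id = self._next_doc_id
        for term in set(terms):
            self._postings.setdefault(term, []).append(doc_id)
        self._next_doc_id = doc_id + 1
        # Publish last, so readers never see a partially indexed document.
        self._published = self._next_doc_id
        return doc_id

    def search(self, term, limit=10):
        # Snapshot the bound first; anything at or above it is ignored.
        visible = self._published
        postings = self._postings.get(term, [])
        hits = []
        # Walk newest-first (an assumption of this sketch).
        for i in range(len(postings) - 1, -1, -1):
            doc_id = postings[i]
            if doc_id < visible:
                hits.append(doc_id)
                if len(hits) == limit:
                    break
        return hits


segment = AppendOnlySegment()
checks = []


def writer():
    for i in range(1000):
        segment.add_document(["post", "even" if i % 2 == 0 else "odd"])


def reader():
    for _ in range(200):
        hits = segment.search("post", limit=5)
        # Every result list should be strictly newest-first.
        checks.append(all(a > b for a, b in zip(hits, hits[1:])))


threads = [threading.Thread(target=writer)]
threads += [threading.Thread(target=reader) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_hits = segment.search("even", limit=3)
print(final_hits, all(checks))
```

Because readers never block the writer, indexing latency stays low even under heavy query load; the trade-off is that a reader may miss a document that was published just after it took its snapshot.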
Example Trigger Prompts
- "/find-candidates how Earlybird shards real-time index"
- "/find-candidates retrieving 1,500 candidates from 500M tweets"
- "Role of HNSW in embedding-based discovery"
- "In-Network (Thunder) vs Out-of-Network (Phoenix) retrieval"
- "Trace 'Discovery' request: User Embedding → Candidate Source"