FAISS - Efficient Similarity Search
Facebook AI's library for billion-scale vector similarity search.
When to use FAISS
Use FAISS when:
- Need fast similarity search on large vector datasets (millions/billions)
- GPU acceleration required
- Pure vector similarity (no metadata filtering needed)
- High throughput, low latency critical
- Offline/batch processing of embeddings
Metrics:
- 31,700+ GitHub stars
- Meta/Facebook AI Research
- Handles billions of vectors
- C++ with Python bindings
Use alternatives instead:
- Chroma/Pinecone: Need metadata filtering
- Weaviate: Need full database features
- Annoy: Simpler, fewer features
Quick start
Installation
bash
# CPU only
pip install faiss-cpu
# GPU support
pip install faiss-gpu
Basic usage
python
import faiss
import numpy as np
# Create sample data (1000 vectors, 128 dimensions)
d = 128
nb = 1000
vectors = np.random.random((nb, d)).astype('float32')
# Create index
index = faiss.IndexFlatL2(d) # L2 distance
index.add(vectors) # Add vectors
# Search
k = 5 # Find 5 nearest neighbors
query = np.random.random((1, d)).astype('float32')
distances, indices = index.search(query, k)
print(f"Nearest neighbors: {indices}")
print(f"Distances: {distances}")
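As a reference for what the exact search above computes, here is a minimal NumPy-only sketch. `IndexFlatL2` reports squared Euclidean distances (no square root is taken), so the sketch does the same. The helper name `brute_force_l2` is hypothetical and not part of FAISS.

```python
import numpy as np

def brute_force_l2(database, queries, k):
    """Exact k-NN by squared L2 distance, computed with plain NumPy."""
    # Pairwise squared distances ||q - x||^2 for every query/database pair
    diffs = queries[:, None, :] - database[None, :, :]
    sq_dists = np.einsum("qnd,qnd->qn", diffs, diffs)
    # Sort each row and keep the k closest database rows
    order = np.argsort(sq_dists, axis=1)[:, :k]
    return np.take_along_axis(sq_dists, order, axis=1), order

rng = np.random.default_rng(0)
database = rng.random((1000, 128), dtype=np.float32)
queries = database[:3].copy()  # query with vectors known to be in the database
dists, ids = brute_force_l2(database, queries, k=5)
```

Because each query is a copy of a database row, its first neighbor is itself at distance zero, which makes this a handy sanity check when comparing against an index.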
Index types
1. Flat (exact search)
python
# L2 (Euclidean) distance
index = faiss.IndexFlatL2(d)
# Inner product (cosine similarity if normalized)
index = faiss.IndexFlatIP(d)
# Slowest, most accurate
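The "cosine similarity if normalized" remark can be checked with a short NumPy sketch: once every vector has unit L2 norm, the inner product equals the cosine similarity. The helper `l2_normalize` is a hypothetical stand-in written for this example; FAISS itself provides `faiss.normalize_L2`, which normalizes a float32 array in place.

```python
import numpy as np

def l2_normalize(x):
    """Return a copy of x with each row scaled to unit L2 norm."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, 1e-12)  # guard against zero vectors

rng = np.random.default_rng(0)
a = rng.random((4, 128)).astype("float32")
b = rng.random((6, 128)).astype("float32")

# Cosine similarity computed directly from its definition
cosine = (a @ b.T) / (
    np.linalg.norm(a, axis=1)[:, None] * np.linalg.norm(b, axis=1)[None, :]
)

# Inner product of normalized vectors: the score an inner-product index would rank by
inner = l2_normalize(a) @ l2_normalize(b).T
```

Both the stored vectors and the queries need to be normalized for the equivalence to hold.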
2. IVF (inverted file) - Fast approximate
python
# Create quantizer
quantizer = faiss.IndexFlatL2(d)
# IVF index with 100 clusters
nlist = 100
index = faiss.IndexIVFFlat(quantizer, d, nlist)
# Train on data
index.train(vectors)
# Add vectors
index.add(vectors)
# Search (nprobe = clusters to search)
index.nprobe = 10
distances, indices = index.search(query, k)
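To make the roles of `nlist` and `nprobe` concrete, the following is a toy NumPy sketch of the inverted-file idea: cluster the vectors, then search only the clusters nearest to the query. It is an illustration under simplifying assumptions, not FAISS's implementation, and `nearest`, `kmeans`, and `ivf_search` are hypothetical helpers.

```python
import numpy as np

def nearest(x, points, k):
    """Indices of the k rows of `points` closest to each row of x (squared L2)."""
    d2 = ((x[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.argsort(d2, axis=1)[:, :k]

def kmeans(x, nlist, iters=10, seed=0):
    """Tiny Lloyd's k-means; returns (centroids, cluster id of each row)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), nlist, replace=False)].copy()
    for _ in range(iters):
        assign = nearest(x, centroids, 1)[:, 0]
        for c in range(nlist):
            members = x[assign == c]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[c] = members.mean(axis=0)
    return centroids, nearest(x, centroids, 1)[:, 0]

def ivf_search(x, centroids, assign, query, k, nprobe):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    probe = nearest(query[None, :], centroids, nprobe)[0]
    candidates = np.flatnonzero(np.isin(assign, probe))
    local = nearest(query[None, :], x[candidates], k)[0]
    return candidates[local]

rng = np.random.default_rng(0)
x = rng.random((1000, 32)).astype("float32")
query = rng.random(32).astype("float32")
centroids, assign = kmeans(x, nlist=20)

exact = nearest(query[None, :], x, 5)[0]                          # full scan
approx = ivf_search(x, centroids, assign, query, k=5, nprobe=2)   # 2 of 20 clusters
full = ivf_search(x, centroids, assign, query, k=5, nprobe=20)    # all clusters
```

Probing every cluster reproduces the exact result, while a small `nprobe` scans fewer vectors and may miss neighbors that fell into unprobed clusters. That is the speed/accuracy trade-off the parameter controls.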
3. HNSW (Hierarchical NSW) - Best quality/speed
python
# HNSW index
M = 32 # Number of connections per layer
index = faiss.IndexHNSWFlat(d, M)
# No training needed
index.add(vectors)
# Search
distances, indices = index.search(query, k)
4. Product Quantization - Memory efficient
python
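# PQ stores m * nbits / 8 bytes per vector
# (8 bytes here, versus 512 bytes for a raw 128-dim float32 vector)
m = 8 # Number of subquantizers (d must be divisible by m)
nbits = 8 # Bits per subquantizer code
index = faiss.IndexPQ(d, m, nbits)
# Train and add
index.train(vectors)
index.add(vectors)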
Save and load
python
# Save index
faiss.write_index(index, "large.index")
# Load index
index = faiss.read_index("large.index")
# Continue using
distances, indices = index.search(query, k)
GPU acceleration
python
# Single GPU
res = faiss.StandardGpuResources()
index_cpu = faiss.IndexFlatL2(d)
index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu) # GPU 0
# Multi-GPU
index_gpu = faiss.index_cpu_to_all_gpus(index_cpu)
# 10-100× faster than CPU
LangChain integration
python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
# Create FAISS vector store
vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
# Save
vectorstore.save_local("faiss_index")
# Load
vectorstore = FAISS.load_local(
    "faiss_index",
    OpenAIEmbeddings(),
    # Loading unpickles stored data: enable only for index files you created and trust
    allow_dangerous_deserialization=True
)
# Search
results = vectorstore.similarity_search("query", k=5)
LlamaIndex integration
python
from llama_index.vector_stores.faiss import FaissVectorStore
import faiss
# Create FAISS index
d = 1536
faiss_index = faiss.IndexFlatL2(d)
vector_store = FaissVectorStore(faiss_index=faiss_index)
Best practices
- Choose the right index type - Flat for <10K vectors, IVF for 10K-1M, HNSW for quality
- Normalize for cosine - Use IndexFlatIP with normalized vectors
- Use GPU for large datasets - 10-100× faster
- Save trained indices - Training is expensive
- Tune nprobe (IVF) and efSearch (HNSW) - Balance speed/accuracy
- Monitor memory - PQ for large datasets
- Batch queries - Better GPU utilization
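Balancing speed against accuracy needs an accuracy number to tune against. A common choice is recall@k: the fraction of the exact top-k neighbors (for example from a Flat index) that the approximate index also returns. The sketch below uses a hypothetical helper `recall_at_k` and hand-written id arrays in place of real search results.

```python
import numpy as np

def recall_at_k(approx_ids, exact_ids):
    """Mean fraction of the exact top-k neighbors found by the approximate search."""
    hits = [
        len(set(a.tolist()) & set(e.tolist())) / len(e)
        for a, e in zip(approx_ids, exact_ids)
    ]
    return float(np.mean(hits))

# Stand-ins for the `indices` arrays returned by two searches over the same queries
exact = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])
approx = np.array([[0, 1, 2, 9], [4, 5, 8, 9]])
score = recall_at_k(approx, exact)  # (3/4 + 2/4) / 2
```

In practice one would run the same queries against an exact index and the approximate one, then raise the search parameter until the recall is acceptable for the application.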
Performance
| Index Type | Build Time | Search Time | Memory | Accuracy |
|---|---|---|---|---|
| Flat | Fast | Slow | High | 100% |
| IVF | Medium | Fast | Medium | 95-99% |
| HNSW | Slow | Fastest | High | 99% |
| PQ | Medium | Fast | Low | 90-95% |
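The Memory column can be made more concrete with rough per-vector arithmetic. The sketch below assumes float32 storage, takes "IVF" to mean `IndexIVFFlat` (uncompressed vectors plus a 64-bit id), uses the commonly quoted `4*d + M*2*4` estimate for HNSW, and ignores fixed overheads such as centroids and codebooks. `bytes_per_vector` is a hypothetical helper, and the figures are estimates rather than measurements.

```python
import math

def bytes_per_vector(index_type, d, M=32, m=8, nbits=8):
    """Approximate bytes stored per vector; fixed overheads are ignored."""
    if index_type == "Flat":
        return 4 * d                     # raw float32 components
    if index_type == "IVF":
        return 4 * d + 8                 # raw vector plus a 64-bit id
    if index_type == "HNSW":
        return 4 * d + M * 2 * 4         # raw vector plus graph links
    if index_type == "PQ":
        return math.ceil(m * nbits / 8)  # one nbits-wide code per subquantizer
    raise ValueError(f"unknown index type: {index_type}")

d, n = 128, 1_000_000
estimates_mb = {
    t: bytes_per_vector(t, d) * n / 1e6 for t in ("Flat", "IVF", "HNSW", "PQ")
}
```

Under these assumptions, one million 128-dimensional vectors take roughly 512 MB as Flat, 520 MB as IVF, 768 MB as HNSW, and 8 MB as PQ.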
Resources
- GitHub: https://github.com/facebookresearch/faiss ⭐ 31,700+
- Wiki: https://github.com/facebookresearch/faiss/wiki
- License: MIT