Ollama Setup for GrepAI
This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.
When to Use This Skill
- Setting up GrepAI with local, private embeddings
- Installing Ollama for the first time
- Choosing and downloading embedding models
- Troubleshooting Ollama connection issues
Why Ollama?
| Benefit | Description |
|---|---|
| 🔒 Privacy | Code never leaves your machine |
| 💰 Free | No API costs |
| ⚡ Fast | Local processing, no network latency |
| 🔌 Offline | Works without internet |
Installation
macOS (Homebrew)
# Install Ollama
brew install ollama
# Start the Ollama service
ollama serve
macOS (Direct Download)
- Download from ollama.com
- Open the .dmg and drag to Applications
- Launch Ollama from Applications
Linux
# One-line installer
curl -fsSL https://ollama.com/install.sh | sh
# Start the service
ollama serve
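After either install path on macOS or Linux, it can help to confirm that the `ollama` binary is actually on your `PATH` before going further. The following is a minimal sketch; `have_cmd` is a hypothetical helper name chosen for illustration, not part of Ollama or GrepAI.

```shell
# have_cmd NAME: succeed if NAME resolves to an executable, function, or builtin.
# Hypothetical helper, for illustration only.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd ollama; then
  # Print the installed version as a sanity check.
  ollama --version
else
  echo "ollama not found on PATH; re-run the installer or open a new shell" >&2
fi
```

If the check fails right after installing, opening a fresh terminal is often enough for the updated `PATH` to take effect.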
Windows
- Download installer from ollama.com
- Run the installer
- Ollama starts automatically as a service
Downloading Embedding Models
GrepAI requires an embedding model to convert code into vectors.
Recommended Model: nomic-embed-text
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
Specifications:
- Dimensions: 768
- Size: ~274 MB
- Performance: Excellent for code search
- Language: English-optimized
Alternative Models
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe
# Larger, more accurate
ollama pull bge-m3
# Maximum quality
ollama pull mxbai-embed-large
| Model | Dimensions | Size | Best For |
|---|---|---|---|
| nomic-embed-text | 768 | 274 MB | General code search |
| nomic-embed-text-v2-moe | 768 | 500 MB | Multilingual codebases |
| bge-m3 | 1024 | 1.2 GB | Large codebases |
| mxbai-embed-large | 1024 | 670 MB | Maximum accuracy |
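If you script model selection, the dimensions column of the table can be encoded as a small lookup. This is only a sketch: `model_dims` is a hypothetical helper, and its values are copied from the table above rather than queried from Ollama.

```shell
# model_dims MODEL: print the embedding dimension listed for MODEL.
# Values are copied from this document's model table, not read from Ollama.
model_dims() {
  case "$1" in
    nomic-embed-text|nomic-embed-text-v2-moe) echo 768 ;;
    bge-m3|mxbai-embed-large)                 echo 1024 ;;
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}
```

For example, `model_dims bge-m3` prints `1024`, and an unlisted name returns a non-zero status so a calling script can stop early.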
Verifying Installation
Check Ollama is Running
# Check if Ollama server is responding
curl http://localhost:11434/api/tags
# Expected output: JSON with available models
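Right after `ollama serve` is launched, the API may need a moment before it answers. One way to handle that in a setup script is to poll the same `/api/tags` endpoint with a bounded number of attempts. The sketch below assumes `curl` is installed; the function name `wait_for_ollama` and its defaults (10 attempts, 1 second apart) are illustrative choices.

```shell
# wait_for_ollama [URL] [ATTEMPTS] [DELAY_SECONDS]
# Polls URL/api/tags until it answers or the attempts run out.
# Name and defaults are illustrative, not part of Ollama or GrepAI.
wait_for_ollama() {
  wfo_url="${1:-http://localhost:11434}"
  wfo_left="${2:-10}"
  wfo_delay="${3:-1}"
  while [ "$wfo_left" -gt 0 ]; do
    # -s: quiet, -f: fail on HTTP errors, --max-time: never hang on one probe
    if curl -sf --max-time 2 "$wfo_url/api/tags" >/dev/null 2>&1; then
      return 0
    fi
    wfo_left=$((wfo_left - 1))
    if [ "$wfo_left" -gt 0 ]; then
      sleep "$wfo_delay"
    fi
  done
  return 1
}

# Typical use:
# wait_for_ollama || echo "Ollama did not respond on port 11434" >&2
```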
List Downloaded Models
ollama list
# Output:
# NAME                       ID           SIZE      MODIFIED
# nomic-embed-text:latest    abc123...    274 MB    2 hours ago
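In a script, the same listing can be used to pull a model only when it is missing. The sketch below parses `ollama list` output from standard input, assuming the column layout shown above (a header row, then the model name in the first column); `has_model` is a hypothetical helper name.

```shell
# has_model MODEL: read `ollama list` output on stdin and succeed if MODEL
# appears in the NAME column, with or without a ":tag" suffix.
# Assumes a header row followed by one model per line.
has_model() {
  awk -v want="$1" '
    NR > 1 {                   # skip the header row
      name = $1
      sub(/:.*/, "", name)     # strip ":latest" or any other tag
      if ($1 == want || name == want) found = 1
    }
    END { exit found ? 0 : 1 }
  '
}

# Typical use: pull only when the model is not yet downloaded.
# ollama list | has_model nomic-embed-text || ollama pull nomic-embed-text
```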
Test Embedding Generation
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
"model": "nomic-embed-text",
"prompt": "function hello() { return world; }"
}'
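The response to this request is a JSON object whose `embedding` field holds the vector. A quick way to confirm the model is the one you expect is to count the entries, which should match the dimensions listed for the model (768 for nomic-embed-text). The sketch below is deliberately crude: `count_dims` is a hypothetical helper that assumes the response contains one flat array on a single line, and a real JSON parser such as `jq` is the better tool for anything beyond a sanity check.

```shell
# count_dims: read an /api/embeddings response on stdin and print how many
# numbers the "embedding" array holds.
# Assumes a single flat array on one line; this is not a JSON parser.
count_dims() {
  sed -n 's/.*"embedding":[[:space:]]*\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' |
    grep -c '[0-9]'
}

# Typical use:
# curl -s http://localhost:11434/api/embeddings \
#   -d '{"model": "nomic-embed-text", "prompt": "hello"}' | count_dims
```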
Configuring GrepAI for Ollama
After installing Ollama, configure GrepAI to use it:
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
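When a script needs to probe the same endpoint that GrepAI is configured to use, it can read the value back out of the config file instead of hard-coding it. The following sketch uses a hypothetical `config_value` helper that matches lines textually; it is not a YAML parser and ignores nesting, quoting, and trailing comments, so it only suits simple keys like the ones shown above.

```shell
# config_value FILE KEY: print the first "KEY: value" found in FILE.
# Line-based and deliberately naive: ignores nesting, quotes, and comments.
config_value() {
  sed -n "s/^[[:space:]]*$2:[[:space:]]*//p" "$1" | head -n 1
}

# Typical use: check that the configured endpoint is reachable.
# curl -s "$(config_value .grepai/config.yaml endpoint)/api/tags"
```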
This is the default configuration when you run grepai init, so no changes are needed if using nomic-embed-text.
Running Ollama
Foreground (Development)
# Run in current terminal (see logs)
ollama serve
Background (macOS/Linux)
# Using nohup
nohup ollama serve &
# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
Check Status
# Check if running
pgrep -f ollama
# Or test the API
curl -s http://localhost:11434/api/tags | head -1
Resource Considerations
Memory Usage
Embedding models load into RAM:
- nomic-embed-text: ~500 MB RAM
- bge-m3: ~1.5 GB RAM
- mxbai-embed-large: ~1 GB RAM
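On Linux, one way to check ahead of time whether a model is likely to fit is to compare these approximate figures with the `MemAvailable` value in `/proc/meminfo`. The sketch below is Linux-only, and both helper names are hypothetical; macOS and Windows expose memory statistics through other tools.

```shell
# mem_available_mb [FILE]: print MemAvailable in MB from a meminfo-style file.
# Linux only; defaults to /proc/meminfo, where values are reported in kB.
mem_available_mb() {
  awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# enough_ram NEED_MB [FILE]: succeed if at least NEED_MB is available.
enough_ram() {
  [ "$(mem_available_mb "$2")" -ge "$1" ]
}

# Typical use, with the approximate bge-m3 figure from the list above:
# enough_ram 1500 || echo "bge-m3 may not fit in available RAM" >&2
```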
CPU vs GPU
Ollama runs on the CPU when it does not detect a supported GPU. For faster embeddings:
- macOS: Uses Metal (Apple Silicon) automatically
- Linux/Windows: Install the NVIDIA driver with CUDA support for NVIDIA GPUs
Common Issues
❌ Problem: connection refused to localhost:11434
✅ Solution: Start Ollama:
ollama serve
❌ Problem: Model not found
✅ Solution: Pull the model first:
ollama pull nomic-embed-text
❌ Problem: Slow embedding generation
✅ Solution:
- Use a smaller model
- Ensure Ollama is using GPU (check ollama ps)
- Close other memory-intensive applications
❌ Problem: Out of memory
✅ Solution: Use a smaller model or increase system RAM
Best Practices
- Start Ollama before GrepAI: Ensure ollama serve is running
- Use recommended model: nomic-embed-text offers the best balance of size and accuracy
- Keep Ollama running: Leave it as a background service
- Update periodically: Re-run ollama pull nomic-embed-text to fetch model updates
Output Format
After successful setup:
✅ Ollama Setup Complete
Ollama Version: 0.1.x
Endpoint: http://localhost:11434
Model: nomic-embed-text (768 dimensions)
Status: Running
GrepAI is ready to use with local embeddings. Your code will never leave your machine.