GrepAI Embeddings with LM Studio
This skill covers using LM Studio as the embedding provider for GrepAI, offering a user-friendly GUI for managing local models.
When to Use This Skill
- •Want local embeddings with a graphical interface
- •Already using LM Studio for other AI tasks
- •Prefer visual model management over CLI
- •Need to easily switch between models
What is LM Studio?
LM Studio is a desktop application for running local LLMs with:
- •🖥️ Graphical user interface
- •📦 Easy model downloading
- •🔌 OpenAI-compatible API
- •🔒 100% private, local processing
Prerequisites
- •Download LM Studio from lmstudio.ai
- •Install and launch the application
- •Download an embedding model
Installation
Step 1: Download LM Studio
Visit lmstudio.ai and download for your platform:
- •macOS (Intel or Apple Silicon)
- •Windows
- •Linux
Step 2: Launch and Download a Model
- •Open LM Studio
- •Go to the Search tab
- •Search for an embedding model:
- •
nomic-embed-text-v1.5 - •
bge-small-en-v1.5 - •
bge-large-en-v1.5
- •
- •Click Download
Step 3: Start the Local Server
- •Go to the Local Server tab
- •Select your embedding model
- •Click Start Server
- •Note the endpoint (default:
http://localhost:1234)
Configuration
Basic Configuration
# .grepai/config.yaml embedder: provider: lmstudio model: nomic-embed-text-v1.5 endpoint: http://localhost:1234
With Custom Port
embedder: provider: lmstudio model: nomic-embed-text-v1.5 endpoint: http://localhost:8080
With Explicit Dimensions
embedder: provider: lmstudio model: nomic-embed-text-v1.5 endpoint: http://localhost:1234 dimensions: 768
Available Models
nomic-embed-text-v1.5 (Recommended)
| Property | Value |
|---|---|
| Dimensions | 768 |
| Size | ~260 MB |
| Quality | Excellent |
| Speed | Fast |
embedder: provider: lmstudio model: nomic-embed-text-v1.5
bge-small-en-v1.5
| Property | Value |
|---|---|
| Dimensions | 384 |
| Size | ~130 MB |
| Quality | Good |
| Speed | Very fast |
Best for: Smaller codebases, faster indexing.
embedder: provider: lmstudio model: bge-small-en-v1.5 dimensions: 384
bge-large-en-v1.5
| Property | Value |
|---|---|
| Dimensions | 1024 |
| Size | ~1.3 GB |
| Quality | Very high |
| Speed | Slower |
Best for: Maximum accuracy.
embedder: provider: lmstudio model: bge-large-en-v1.5 dimensions: 1024
Model Comparison
| Model | Dims | Size | Speed | Quality |
|---|---|---|---|---|
bge-small-en-v1.5 | 384 | 130MB | ⚡⚡⚡ | ⭐⭐⭐ |
nomic-embed-text-v1.5 | 768 | 260MB | ⚡⚡ | ⭐⭐⭐⭐ |
bge-large-en-v1.5 | 1024 | 1.3GB | ⚡ | ⭐⭐⭐⭐⭐ |
LM Studio Server Setup
Starting the Server
- •Open LM Studio
- •Navigate to Local Server tab (left sidebar)
- •Select an embedding model from the dropdown
- •Configure settings:
- •Port:
1234(default) - •Enable Embedding Endpoint
- •Port:
- •Click Start Server
Server Status
Look for the green indicator showing the server is running.
Verifying the Server
# Check server is responding
curl http://localhost:1234/v1/models
# Test embedding
curl http://localhost:1234/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "nomic-embed-text-v1.5",
"input": "function authenticate(user)"
}'
LM Studio Settings
Recommended Settings
In LM Studio's Local Server tab:
| Setting | Recommended Value |
|---|---|
| Port | 1234 |
| Enable CORS | Yes |
| Context Length | Auto |
| GPU Layers | Max (for speed) |
GPU Acceleration
LM Studio automatically uses:
- •macOS: Metal (Apple Silicon)
- •Windows/Linux: CUDA (NVIDIA)
Adjust GPU layers in settings for memory/speed balance.
Running LM Studio Headless
For server environments, LM Studio supports CLI mode:
# Start server without GUI (check LM Studio docs for exact syntax) lmstudio server start --model nomic-embed-text-v1.5 --port 1234
Common Issues
❌ Problem: Connection refused ✅ Solution: Ensure LM Studio server is running:
- •Open LM Studio
- •Go to Local Server tab
- •Click Start Server
❌ Problem: Model not found ✅ Solution:
- •Download the model in LM Studio's Search tab
- •Select it in the Local Server dropdown
❌ Problem: Slow embedding generation ✅ Solutions:
- •Enable GPU acceleration in LM Studio settings
- •Use a smaller model (bge-small-en-v1.5)
- •Close other GPU-intensive applications
❌ Problem: Port already in use ✅ Solution: Change port in LM Studio settings:
embedder: endpoint: http://localhost:8080 # Different port
❌ Problem: LM Studio closes and server stops ✅ Solution: Keep LM Studio running in the background, or consider using Ollama which runs as a system service
LM Studio vs Ollama
| Feature | LM Studio | Ollama |
|---|---|---|
| GUI | ✅ Yes | ❌ CLI only |
| System service | ❌ App must run | ✅ Background service |
| Model management | ✅ Visual | ✅ CLI |
| Ease of use | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Server reliability | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
Recommendation: Use LM Studio if you prefer a GUI, Ollama for always-on background service.
Migrating from LM Studio to Ollama
If you need a more reliable background service:
- •Install Ollama:
brew install ollama ollama serve & ollama pull nomic-embed-text
- •Update config:
embedder: provider: ollama model: nomic-embed-text endpoint: http://localhost:11434
- •Re-index:
rm .grepai/index.gob grepai watch
Best Practices
- •Keep LM Studio running: Server stops when app closes
- •Use recommended model:
nomic-embed-text-v1.5for best balance - •Enable GPU: Faster embeddings with hardware acceleration
- •Check server before indexing: Ensure green status indicator
- •Consider Ollama for production: More reliable as background service
Output Format
Successful LM Studio configuration:
✅ LM Studio Embedding Provider Configured Provider: LM Studio Model: nomic-embed-text-v1.5 Endpoint: http://localhost:1234 Dimensions: 768 (auto-detected) Status: Connected Note: Keep LM Studio running for embeddings to work.