Vector Search Skill

Overview

This project uses Qdrant for vector storage and similarity search. The service is in /apps/agent-api/src/services/qdrant.ts.

Environment Configuration

From /apps/agent-api/src/env.ts:

• QDRANT_HOST - Qdrant server host
• QDRANT_PORT - Qdrant server port
• QDRANT_USE_HTTPS - Whether to use HTTPS
• QDRANT_API_KEY - API key for authentication
• QDRANT_TIMEOUT_MS - Request timeout, in milliseconds
• QDRANT_RETRY_MAX - Max retry attempts
• QDRANT_RETRY_BASE_MS - Base delay between retries, in milliseconds
• QDRANT_COLLECTION - Collection name
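
As a rough illustration of how these variables fit together, the sketch below assembles a base URL and request headers from them. The `QdrantEnv` interface and the `buildQdrantConnection` function are hypothetical names, not taken from env.ts, and the real types there may differ. The `api-key` header name is the one Qdrant's REST API uses.

```typescript
// Hypothetical shape mirroring the variables listed above; the real
// definitions live in env.ts and may use different types.
interface QdrantEnv {
  QDRANT_HOST: string;
  QDRANT_PORT: number;
  QDRANT_USE_HTTPS: boolean;
  QDRANT_API_KEY?: string;
}

interface QdrantConnection {
  baseUrl: string;
  headers: Record<string, string>;
}

// Builds the base URL and default headers for REST calls to Qdrant.
function buildQdrantConnection(env: QdrantEnv): QdrantConnection {
  const scheme = env.QDRANT_USE_HTTPS ? 'https' : 'http';
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  // Only send the API key header when a key is actually configured.
  if (env.QDRANT_API_KEY) {
    headers['api-key'] = env.QDRANT_API_KEY;
  }
  return { baseUrl: `${scheme}://${env.QDRANT_HOST}:${env.QDRANT_PORT}`, headers };
}
```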

Vector Dimensions

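Important: Vector dimension is 1536 (OpenAI text-embedding-ada-002 embeddings). The collection's vector size and every embedding must use this same dimension; Qdrant rejects vectors whose length does not match the collection.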
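
A dimension mismatch is easier to diagnose when it is caught before the request leaves the process. The guard below is a minimal sketch of that idea; `VECTOR_DIMENSION` and `assertVectorDimension` are hypothetical names and are not part of the existing service.

```typescript
// Assumed constant matching the dimension stated above.
const VECTOR_DIMENSION = 1536;

// Throws if the vector has the wrong length or contains non-finite values.
function assertVectorDimension(vector: number[], expected: number = VECTOR_DIMENSION): void {
  if (vector.length !== expected) {
    throw new Error(`Expected a ${expected}-dimensional vector, got ${vector.length}`);
  }
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new Error('Vector contains NaN or infinite values');
  }
}
```

Calling it right before an upsert or search turns a remote rejection into a local error with a clear message.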

Tenant Isolation

Always include workspaceId in filters for multi-tenant isolation:

typescript

const results = await qdrantService.search({
  vector: queryEmbedding,
  filter: {
    must: [{ key: 'workspaceId', match: { value: workspaceId } }],
  },
  limit: 10,
});
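
One way to make the workspaceId condition hard to forget is to route every filter through a small helper. The sketch below assumes a simplified filter shape declared locally (only the `must`, `should`, and `must_not` clauses with match conditions); `withWorkspaceFilter` is a hypothetical helper, not an existing export of the service.

```typescript
// Simplified subset of Qdrant's filter format, declared locally for the sketch.
interface MatchCondition {
  key: string;
  match: { value: string | number | boolean };
}

interface QdrantFilter {
  must?: MatchCondition[];
  should?: MatchCondition[];
  must_not?: MatchCondition[];
}

// Returns a copy of the filter that always requires the given workspaceId.
function withWorkspaceFilter(workspaceId: string, filter: QdrantFilter = {}): QdrantFilter {
  if (!workspaceId) {
    throw new Error('workspaceId is required for tenant isolation');
  }
  const tenantCondition: MatchCondition = { key: 'workspaceId', match: { value: workspaceId } };
  // Drop any caller-supplied workspaceId condition so it cannot conflict with ours.
  const otherMust = (filter.must ?? []).filter((c) => c.key !== 'workspaceId');
  return { ...filter, must: [tenantCondition, ...otherMust] };
}
```

A caller would then pass `filter: withWorkspaceFilter(workspaceId, extraFilter)` instead of building the `must` array by hand.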

Creating Embeddings

Use the embed function from the LLM service:

typescript

import { embed } from '../services/llm';

const embedding = await embed(text);
// Returns: number[] of length 1536

Basic Operations

Ensure Collection Exists

typescript

import { ensureCollection } from '../services/qdrant';

await ensureCollection('my-collection', 1536);

Upsert Points

typescript

await qdrantService.upsert({
  collection: 'documents',
  points: [
    {
      id: documentId,
      vector: embedding,
      payload: {
        workspaceId,
        content: text,
        metadata: { /* ... */ },
      }
    }
  ]
});
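
For larger imports, the batching guideline under Best Practices (100-500 points per upsert) can be applied with a small chunking helper. This is a sketch only: `chunk` and `upsertInBatches` are hypothetical helpers, and the `upsertBatch` callback stands in for whatever call actually writes one batch.

```typescript
// Splits an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Sends the batches one after another and returns how many were sent.
async function upsertInBatches<P>(
  points: P[],
  upsertBatch: (batch: P[]) => Promise<unknown>,
  batchSize = 100,
): Promise<number> {
  let sent = 0;
  for (const batch of chunk(points, batchSize)) {
    await upsertBatch(batch); // sequential on purpose, to avoid flooding the server
    sent += 1;
  }
  return sent;
}
```

Assuming `qdrantService.upsert` returns a promise, usage could look like `await upsertInBatches(points, (batch) => qdrantService.upsert({ collection: 'documents', points: batch }))`.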

Similarity Search

typescript

const results = await qdrantService.search({
  collection: 'documents',
  vector: queryEmbedding,
  filter: {
    must: [{ key: 'workspaceId', match: { value: workspaceId } }],
  },
  limit: 10,
  scoreThreshold: 0.7,
});

Delete Points

typescript

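await qdrantService.delete({
  collection: 'documents',
  filter: {
    must: [
      // Scope deletes to the tenant too, so a documentId can never match another workspace's points
      { key: 'workspaceId', match: { value: workspaceId } },
      // Matches the payload field documentId, so that field must be stored at upsert time
      { key: 'documentId', match: { value: documentId } },
    ],
  },
});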

Error Handling

Use custom error classes:

typescript

import {
  QdrantError,
  QdrantTimeout,
  QdrantUnavailable,
  QdrantBadRequest
} from '../errors/qdrant';

try {
  await qdrantService.search(...);
} catch (error) {
  if (error instanceof QdrantTimeout) {
    // Handle timeout - maybe retry with smaller limit
  } else if (error instanceof QdrantUnavailable) {
    // Service is down - graceful degradation
  } else if (error instanceof QdrantBadRequest) {
    // Invalid request - check parameters
  }
  throw error;
}

Retry Pattern

Use fetchWithRetry for resilience:

typescript

import { fetchWithRetry } from '../utils/fetch';

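const response = await fetchWithRetry(`${qdrantUrl}/collections/${collection}/points/search`, {
  method: 'POST',
  // Qdrant's REST API expects a JSON body; also send the 'api-key' header when QDRANT_API_KEY is set
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(searchRequest),
  retries: 3, // prefer the QDRANT_RETRY_MAX setting over a hard-coded value where it is available
  backoff: 1000, // prefer the QDRANT_RETRY_BASE_MS setting where it is available
});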
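
The internals of fetchWithRetry are not shown here. As a sketch of the general technique, assuming exponential backoff (which may not be what the project's helper does), a retry loop driven by a maximum attempt count and a base delay could look like this. `backoffDelayMs` and `withRetry` are hypothetical names.

```typescript
// Delay before the next attempt: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Runs `operation`, retrying up to `retries` extra times on failure.
async function withRetry<T>(
  operation: () => Promise<T>,
  retries: number,
  baseMs: number,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < retries) {
        await sleep(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

In a real helper it is usually worth retrying only transient failures (timeouts, 5xx responses) and failing fast on bad requests.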

Health Check

Include Qdrant in health checks:

typescript

import { pingQdrant } from '../services/qdrant';

// In health endpoint
const qdrantHealthy = await pingQdrant();

Best Practices

• Always filter by workspaceId - Never expose cross-tenant data
• Use score thresholds - Filter out low-relevance results (0.7+ recommended as a starting point; tune it against your own data)
• Batch operations - Upsert in batches of 100-500 points
• Handle timeouts gracefully - Searches over large collections can exceed QDRANT_TIMEOUT_MS
• Log search metrics - Track latency and result counts
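
For the metrics item, a minimal sketch of what could be logged per search is shown below. `summarizeSearch` and its types are hypothetical; the only assumption about the results is that each one carries a numeric `score`, as Qdrant search results do.

```typescript
interface ScoredResult {
  score: number;
}

interface SearchMetrics {
  latencyMs: number;
  resultCount: number;
  topScore: number | null; // null when the search returned nothing
}

// Summarizes one search from its results and start/finish timestamps (in ms).
function summarizeSearch(results: ScoredResult[], startedAtMs: number, finishedAtMs: number): SearchMetrics {
  return {
    latencyMs: finishedAtMs - startedAtMs,
    resultCount: results.length,
    topScore: results.length > 0 ? Math.max(...results.map((r) => r.score)) : null,
  };
}
```

Capture `Date.now()` before and after the search call and pass both values in, then hand the returned object to whatever logger the service already uses.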

Related Files

• Qdrant service: /apps/agent-api/src/services/qdrant.ts
• LLM/Embeddings: /apps/agent-api/src/services/llm.ts
• Environment: /apps/agent-api/src/env.ts
• Error classes: /apps/agent-api/src/errors/