perplexity-performance-tuning

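Optimize Perplexity API performance with caching, batching, and connection pooling. Use when experiencing slow API responses, implementing caching strategies, or optimizing request throughput for Perplexity integrations. Trigger with phrases like "perplexity performance", "optimize perplexity", "perplexity latency", "perplexity caching", "perplexity slow", "perplexity batch".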

SKILL.md
--- frontmatter
name: perplexity-performance-tuning
description: |
  Optimize Perplexity API performance with caching, batching, and connection pooling.
  Use when experiencing slow API responses, implementing caching strategies,
  or optimizing request throughput for Perplexity integrations.
  Trigger with phrases like "perplexity performance", "optimize perplexity",
  "perplexity latency", "perplexity caching", "perplexity slow", "perplexity batch".
allowed-tools: Read, Write, Edit
version: 1.0.0
license: MIT
author: Jeremy Longshore <jeremy@intentsolutions.io>

Perplexity Performance Tuning

Overview

Optimize Perplexity API performance with caching, batching, and connection pooling.

Prerequisites

  • Perplexity SDK installed
  • Understanding of async patterns
  • Redis or in-memory cache available (optional)
  • Performance monitoring in place

Latency Benchmarks

Operation | P50   | P95   | P99
Read      | 50ms  | 150ms | 300ms
Write     | 100ms | 250ms | 500ms
List      | 75ms  | 200ms | 400ms

Caching Strategy

Response Caching

typescript
import { LRUCache } from 'lru-cache';

const cache = new LRUCache<string, any>({
  max: 1000,
  ttl: 60000, // 1 minute
  updateAgeOnGet: true,
});

async function cachedPerplexityRequest<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttl?: number
): Promise<T> {
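  const cached = cache.get(key);
  // Compare against undefined so cached falsy values (0, '', false) still count as hits
  if (cached !== undefined) return cached as T;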

  const result = await fetcher();
  cache.set(key, result, { ttl });
  return result;
}
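
The wrapper above leaves the choice of `key` to the caller. One possible way to derive it, sketched here under assumptions, is to hash the request fields that influence the response. The `ChatRequest` shape and the `buildCacheKey` helper are hypothetical names introduced for illustration and are not part of any SDK; adjust the fields to match what your client actually sends.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical request shape (assumption): a model name plus chat messages.
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
}

// Derive a deterministic cache key from the fields that affect the answer.
function buildCacheKey(req: ChatRequest): string {
  // Trim message content so trailing whitespace does not cause needless misses.
  const canonical = JSON.stringify({
    model: req.model,
    messages: req.messages.map(m => [m.role, m.content.trim()]),
  });
  const digest = createHash('sha256').update(canonical).digest('hex');
  return `perplexity:${req.model}:${digest}`;
}
```

Because search-backed answers go stale quickly, a key like this probably works best with a short TTL such as the one-minute default shown earlier.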

Redis Caching (Distributed)

typescript
import Redis from 'ioredis';

// REDIS_URL must be set; the non-null assertion keeps strict TypeScript happy
const redis = new Redis(process.env.REDIS_URL!);

async function cachedWithRedis<T>(
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds = 60
): Promise<T> {
  const cached = await redis.get(key);
  // redis.get resolves to null on a miss
  if (cached !== null) return JSON.parse(cached) as T;

  const result = await fetcher();
  await redis.setex(key, ttlSeconds, JSON.stringify(result));
  return result;
}

Request Batching

typescript
import DataLoader from 'dataloader';

const perplexityLoader = new DataLoader<string, any>(
  async (ids) => {
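    // Batch fetch from Perplexity. `batchGet` stands in for whatever batch-capable
    // call your client exposes; it is not a confirmed Perplexity SDK method.
    const results = await perplexityClient.batchGet(ids);
    // DataLoader requires one result per key, in the same order as `ids`
    return ids.map(id => results.find(r => r.id === id) || null);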
  },
  {
    maxBatchSize: 100,
    batchScheduleFn: callback => setTimeout(callback, 10),
  }
);

// Usage - automatically batched
const [item1, item2, item3] = await Promise.all([
  perplexityLoader.load('id-1'),
  perplexityLoader.load('id-2'),
  perplexityLoader.load('id-3'),
]);

Connection Optimization

typescript
import { Agent } from 'https';

// Keep-alive connection pooling
const agent = new Agent({
  keepAlive: true,
  maxSockets: 10,
  maxFreeSockets: 5,
  timeout: 30000,
});

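// `PerplexityClient` and its `httpAgent` option are placeholders; check your SDK
// for the equivalent way to pass a custom agent
const client = new PerplexityClient({
  apiKey: process.env.PERPLEXITY_API_KEY!,
  httpAgent: agent,
});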

Pagination Optimization

typescript
async function* paginatedPerplexityList<T>(
  fetcher: (cursor?: string) => Promise<{ data: T[]; nextCursor?: string }>
): AsyncGenerator<T> {
  let cursor: string | undefined;

  do {
    const { data, nextCursor } = await fetcher(cursor);
    for (const item of data) {
      yield item;
    }
    cursor = nextCursor;
  } while (cursor);
}

// Usage
for await (const item of paginatedPerplexityList(cursor =>
  perplexityClient.list({ cursor, limit: 100 })
)) {
  await handleItem(item); // your own per-item handler; named to avoid clashing with Node's global `process`
}

Performance Monitoring

typescript
async function measuredPerplexityCall<T>(
  operation: string,
  fn: () => Promise<T>
): Promise<T> {
  const start = performance.now();
  try {
    const result = await fn();
    const duration = performance.now() - start;
    console.log({ operation, duration, status: 'success' });
    return result;
  } catch (error) {
    const duration = performance.now() - start;
    console.error({ operation, duration, status: 'error', error });
    throw error;
  }
}

Instructions

Step 1: Establish Baseline

Measure current latency for critical Perplexity operations.
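
A baseline is easier to compare against the P50/P95/P99 targets when the raw durations are reduced to those same percentiles. The following is a minimal sketch using the nearest-rank method; `percentile` and `summarize` are hypothetical helper names, and the samples are assumed to be durations in milliseconds collected by something like `measuredPerplexityCall`.

```typescript
// Nearest-rank percentile over latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);
  // Multiply before dividing to keep the rank computation in integers where possible.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1] as number;
}

// Reduce a set of samples to the three percentiles used in the benchmark table.
function summarize(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Collect enough samples per operation before trusting the tail values: with fewer than 100 samples, P99 is simply the slowest call observed.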

Step 2: Implement Caching

Add response caching for frequently accessed data.

Step 3: Enable Batching

Use DataLoader or similar for automatic request batching.
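
For cases where pulling in DataLoader is not an option, the "or similar" can be a small hand-rolled batcher. This is a sketch under assumptions, not a drop-in replacement: `createBatcher` is a hypothetical helper, and `batchFn` is assumed to return results in the same order as the keys it receives.

```typescript
type Pending<K, V> = {
  key: K;
  resolve: (value: V) => void;
  reject: (err: unknown) => void;
};

// Collects keys requested within `windowMs` and resolves them from shared batch calls.
function createBatcher<K, V>(
  batchFn: (keys: K[]) => Promise<V[]>,
  windowMs = 10,
  maxBatchSize = 100
): (key: K) => Promise<V> {
  let queue: Pending<K, V>[] = [];
  let timer: ReturnType<typeof setTimeout> | undefined;

  async function flush(): Promise<void> {
    timer = undefined;
    const pending = queue;
    queue = [];
    // Split into chunks so no single call exceeds maxBatchSize.
    for (let i = 0; i < pending.length; i += maxBatchSize) {
      const chunk = pending.slice(i, i + maxBatchSize);
      try {
        const values = await batchFn(chunk.map(p => p.key));
        chunk.forEach((p, idx) => p.resolve(values[idx] as V));
      } catch (err) {
        // A failed batch call rejects every caller waiting on that chunk.
        chunk.forEach(p => p.reject(err));
      }
    }
  }

  return (key: K) =>
    new Promise<V>((resolve, reject) => {
      queue.push({ key, resolve, reject });
      // The first request in a window starts the timer; later ones join the queue.
      if (timer === undefined) timer = setTimeout(flush, windowMs);
    });
}
```

Unlike DataLoader, this sketch does not de-duplicate repeated keys or cache results, so it only addresses the batching half of the problem.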

Step 4: Optimize Connections

Configure connection pooling with keep-alive.

Output

  • Reduced API latency
  • Caching layer implemented
  • Request batching enabled
  • Connection pooling configured

Error Handling

Issue                | Cause           | Solution
Cache miss storm     | TTL expired     | Use stale-while-revalidate
Batch timeout        | Too many items  | Reduce batch size
Connection exhausted | No pooling      | Configure max sockets
Memory pressure      | Cache too large | Set max cache entries
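
The stale-while-revalidate remedy for cache miss storms is not shown elsewhere in this skill, so here is one way it might look. This is a minimal in-memory sketch under assumptions: `createSwrCache` is a hypothetical helper, entries are served without waiting for `freshMs`, then served stale for a further `staleMs` while a single background refresh runs. The underlying `Map` is unbounded, so pair it with a size limit in real use.

```typescript
interface SwrEntry<T> {
  value: T;
  freshUntil: number;
  staleUntil: number;
}

function createSwrCache<T>(freshMs: number, staleMs: number, now: () => number = Date.now) {
  const entries = new Map<string, SwrEntry<T>>();
  const inFlight = new Map<string, Promise<T>>();

  // Fetch at most once per key at a time, even if many callers ask together.
  function refresh(key: string, fetcher: () => Promise<T>): Promise<T> {
    const existing = inFlight.get(key);
    if (existing) return existing;
    const p = fetcher()
      .then(value => {
        const t = now();
        entries.set(key, { value, freshUntil: t + freshMs, staleUntil: t + freshMs + staleMs });
        return value;
      })
      .finally(() => inFlight.delete(key));
    inFlight.set(key, p);
    return p;
  }

  return async function get(key: string, fetcher: () => Promise<T>): Promise<T> {
    const entry = entries.get(key);
    const t = now();
    if (entry && t < entry.freshUntil) return entry.value; // fresh hit
    if (entry && t < entry.staleUntil) {
      // Serve the stale value now and refresh in the background.
      refresh(key, fetcher).catch(() => {});
      return entry.value;
    }
    return refresh(key, fetcher); // miss, or too stale to serve: wait for the fetch
  };
}
```

Sharing the in-flight promise is what keeps an expired hot key from fanning out into many simultaneous API calls.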

Examples

Quick Performance Wrapper

typescript
const withPerformance = <T>(name: string, fn: () => Promise<T>) =>
  measuredPerplexityCall(name, () =>
    cachedPerplexityRequest(`cache:${name}`, fn)
  );

Next Steps

For cost optimization, see perplexity-cost-tuning.