Optimizing Gemini Models

This skill ensures the SaaS uses the most efficient, cost-effective, and capable Gemini model for each use case, based on the latest available versions.

When to Use This Skill

•When a new Gemini model is announced or released.
•When existing AI features require optimization (latency, cost, accuracy).
•When resolving production errors related to AI model availability (e.g., 404 or 500 errors).

•Read access to the official Gemini model documentation: https://ai.google.dev/gemini-api/docs/models.
•Write access to api/ai.ts and associated configuration files.

Identify the available models and their specifically optimized use cases.

•Reference URL: https://ai.google.dev/gemini-api/docs/models
•Selection Criteria:
- Flash models: Best for high-volume, low-latency tasks like transcription and simple chat.
- Pro models: Best for complex reasoning, multi-step analysis, and high-precision chat.
- Stable tags: Prefer model names with a specific version suffix (e.g., -001, -002); use the -latest alias only if its stability is confirmed.
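The stable-tag rule can be checked mechanically before a name goes into the configuration. The following is a minimal sketch, assuming model names end in either a three-digit version suffix or -latest; the helper classifyModelName and the example model names are illustrative and not part of api/ai.ts. Confirm current names against the official documentation.

```typescript
// Hypothetical helper (not from api/ai.ts): classify a model name by its suffix.
type ModelTag = "pinned" | "latest-alias" | "unversioned";

function classifyModelName(name: string): ModelTag {
  // A three-digit suffix such as -001 or -002 marks a pinned version.
  if (/-\d{3}$/.test(name)) return "pinned";
  // The -latest alias can move between versions; use it only once stability is confirmed.
  if (name.endsWith("-latest")) return "latest-alias";
  // Anything else carries no explicit version information.
  return "unversioned";
}

// Example names for illustration only; they may be deprecated by the time you read this.
const candidates = ["gemini-1.5-flash-002", "gemini-1.5-pro-latest", "gemini-1.5-pro"];
for (const name of candidates) {
  console.log(`${name}: ${classifyModelName(name)}`);
}
```

A check like this could run in a unit test so that an unversioned name is flagged during review rather than in production.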

Map the models to specific Diktalo actions in api/ai.ts:

Update the GEMINI_CONFIG object in api/ai.ts.
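As a rough illustration of what such a mapping might look like, here is a sketch of a GEMINI_CONFIG object keyed by action. The real shape of GEMINI_CONFIG in api/ai.ts is not shown in this document, so the DiktaloAction type, the ModelChoice interface, the modelsFor helper, and the model names below are all assumptions made for the example.

```typescript
// Assumed action names, taken from the tasks mentioned in this skill.
type DiktaloAction = "transcription" | "summary" | "chat";

// Assumed shape: one primary model plus an ordered list of fallbacks.
interface ModelChoice {
  primary: string;
  fallbacks: string[];
}

// Illustrative values only; replace with current names from the official docs.
const GEMINI_CONFIG: Record<DiktaloAction, ModelChoice> = {
  // High-volume, low-latency work goes to a Flash model.
  transcription: { primary: "gemini-1.5-flash-002", fallbacks: ["gemini-1.5-flash-001"] },
  summary: { primary: "gemini-1.5-flash-002", fallbacks: ["gemini-1.5-pro-002"] },
  // High-precision chat goes to a Pro model, with Flash as a cheaper fallback.
  chat: { primary: "gemini-1.5-pro-002", fallbacks: ["gemini-1.5-flash-002"] },
};

// Returns the models to try for an action, primary first.
function modelsFor(action: DiktaloAction): string[] {
  const { primary, fallbacks } = GEMINI_CONFIG[action];
  return [primary, ...fallbacks];
}

console.log(modelsFor("chat"));
```

Keeping the mapping in one typed object means a model rename touches a single file, and the compiler rejects an action that has no entry.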

•Build: Run npm run build to ensure no syntax errors.
•Test Request: Trigger a small AI request (Summary or Chat) and monitor the terminal logs for the used model name.

Error	Cause	Resolution
404 (Model not found)	Using a non-existent or deprecated model name.	Cross-check with official docs and use the latest stable name.
429 (Rate Limit)	The chosen model has a lower quota on the current tier.	Switch to a model with a higher quota on the current tier, or implement retry logic with backoff.
500 (Internal Error)	Temporary API instability or payload mismatch.	Check `runWithFallback` logic and ensure payload follows model specs.
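The resolutions in the table above can be combined into a single retry-then-fallback loop. The sketch below is one possible approach, not the actual runWithFallback implementation, which is not shown in this document. The names decide, backoffDelayMs, ApiError, and callWithFallback are hypothetical, and the sketch assumes the API client surfaces the HTTP status on the thrown error.

```typescript
type Decision = "retry" | "next-model" | "give-up";

// Maps an HTTP status to the resolution suggested in the table above.
function decide(status: number, attempt: number, maxRetries: number): Decision {
  if (status === 404) return "next-model"; // model name is wrong or deprecated
  if (status === 429 || status === 500) {
    // Possibly transient: retry a few times, then move to the next model.
    return attempt < maxRetries ? "retry" : "next-model";
  }
  return "give-up"; // any other status is not handled here
}

// Exponential backoff with a cap: 500 ms, 1000 ms, 2000 ms, ... up to capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical error type; a real client may expose the status differently.
class ApiError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Tries each model in order, retrying transient failures with backoff.
async function callWithFallback<T>(
  models: string[],
  call: (model: string) => Promise<T>,
  maxRetries = 2,
): Promise<T> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    for (let attempt = 0; ; attempt++) {
      try {
        return await call(model);
      } catch (err) {
        lastError = err;
        const status = err instanceof ApiError ? err.status : 0;
        const decision = decide(status, attempt, maxRetries);
        if (decision === "give-up") throw err;
        if (decision === "next-model") break;
        await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
      }
    }
  }
  throw lastError;
}
```

Separating the decision logic from the async loop keeps the status handling easy to unit test without making network calls.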

•examples/model-config-template.ts - Reference for a robust configuration setup.
•ADVANCED.md - Deep dive into model comparison for SaaS workflows.