DSPy AI Applications with Gemini
This skill guides you through building robust, self-optimizing AI applications using DSPy (Declarative Self-improving Python) and the Gemini API. It emphasizes Intent-Oriented Programming: defining what you want (Signatures) rather than how to prompt it.
[!WARNING] STRICT RATE LIMITS: The gemini-3-flash-preview model has extremely strict rate limits:
- 5 Requests Per Minute (RPM)
- 20 Requests Per Day (RPD)
- 250,000 Tokens Per Minute (TPM)
You MUST enable caching and use "Dry Runs" to avoid exhaustion.
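One way to respect these quotas locally is to check a request budget before every call. The sketch below is a minimal sliding-window guard using only the standard library; the class name `RequestBudget` and its methods are hypothetical, and it assumes the daily quota behaves like a rolling 24-hour window (the real quota may reset at a fixed time instead, in which case this guard is simply conservative). It does not track tokens per minute.

```python
import time
from collections import deque


class RequestBudget:
    """Sliding-window guard for per-minute and per-day request quotas."""

    def __init__(self, rpm=5, rpd=20, clock=time.monotonic):
        self.rpm = rpm
        self.rpd = rpd
        self.clock = clock      # injectable so the guard can be tested without sleeping
        self.stamps = deque()   # timestamps of requests made in the last 24 hours

    def _prune(self, now):
        # Forget requests older than one day; they no longer count toward RPD.
        while self.stamps and now - self.stamps[0] >= 86_400:
            self.stamps.popleft()

    def try_acquire(self):
        """Record a request and return True, or return False if a quota would be exceeded."""
        now = self.clock()
        self._prune(now)
        last_minute = sum(1 for t in self.stamps if now - t < 60)
        if len(self.stamps) >= self.rpd or last_minute >= self.rpm:
            return False
        self.stamps.append(now)
        return True
```

Call `try_acquire()` before each model call and skip (or wait) when it returns False, so a "Dry Run" can count how many requests a script would make without sending any.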
1. Core Philosophy: Intent-Oriented Programming
DSPy shifts focus from "hand-crafting prompts" to "programming architectures".
- Decomposition: Break complex tasks (Translation, RAG, Agents) into small, specific, optimizable programs. Do not build monolithic "God Prompts".
- Signatures: Declare input/output specs (including types like list[dict], float, Pydantic).
- Optimization: Use data (synthetic or real) to compile programs into effective prompts automatically.
- Modularity: Swap backends (e.g., Gemini -> Local) or optimization strategies without rewriting code.
2. Quick Start (Safe Mode)
To start a new project without hitting rate limits immediately:
- Install Dependencies:
  pip install dspy-ai google-generativeai
- Use the Safe Boilerplate: Always start with the provided boilerplate, which includes caching and rate limit handling. See assets/boilerplate.py.
- Configure Environment: Ensure GOOGLE_API_KEY is set in your environment variables.
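Checking the key up front avoids spending one of the few daily requests on a call that is bound to fail. A minimal sketch using only the standard library; the helper name `require_api_key` is hypothetical and is not part of the boilerplate.

```python
import os


def require_api_key(env=None, name="GOOGLE_API_KEY"):
    """Return the API key from the environment or fail before any request is sent."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the program")
    return key
```

The returned value can then be handed to the LM client, for example through the `api_key` argument that `dspy.LM` forwards to its backend.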
3. Workflow
Step 1: Define Typed Signatures
Define explicit intents. Be specific with types. See references/cheat_sheet.md for examples (Invoice Parsing, Entity Extraction).
Step 2: Build Modular Programs
Connect modules like dspy.Predict, dspy.ChainOfThought, or dspy.ReAct.
For Agents, expose these programs as tools.
Step 3: Run Once & Cache
Run your module on a single example.
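- ALWAYS run with file-based caching enabled, using dspy.configure(experimental=True) or standard settings. In recent DSPy releases, LM calls are cached on disk by default (cache=True on dspy.LM), so standard settings are usually enough as long as the cache is not switched off.
- Verify the cache file was created before proceeding.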
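The verification step can be scripted. The sketch below assumes DSPy's disk cache lives in the directory named by the `DSPY_CACHEDIR` environment variable, falling back to `~/.dspy_cache`; confirm that location for your installed version. Both helper names are hypothetical, and a non-empty directory is only a coarse signal, so repeating the same call and confirming it returns instantly is a stronger check.

```python
import os
from pathlib import Path


def cache_dir():
    """Guess the DSPy disk-cache location (assumed: DSPY_CACHEDIR, else ~/.dspy_cache)."""
    return Path(os.environ.get("DSPY_CACHEDIR") or Path.home() / ".dspy_cache")


def cache_has_entries(path):
    """Return True if the directory exists and contains at least one file."""
    path = Path(path)
    return path.is_dir() and any(p.is_file() for p in path.rglob("*"))
```

Run `cache_has_entries(cache_dir())` after the single-example run and before any loop over more data.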
Step 4: Synthetic Data & Optimization (Advanced)
CRITICAL WARNING: Optimization loops (BootstrapFewShot, MIPRO) call the model many times and can exhaust the daily request quota (RPD) in a single run.
- Strategy: Use a "Teacher" model (stronger/different quota) to generate synthetic data if possible.
- Micro-Optimization: If you must optimize on Gemini Flash, use a tiny trainset (2-3 examples) and BootstrapFewShot with max_bootstrapped_demos=1.
4. Advanced Patterns
A. Agents & MCP
DSPy programs can be exposed as MCP (Model Context Protocol) tools.
- Define a dspy.Signature for the tool.
- Wrap it in a dspy.Predict.
- Serve it via an MCP server (e.g., using mcp2py).
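The sketch below covers only the wrapping step: adapting a DSPy-style predictor into a plain, typed function that a tool server could register. Registration with an MCP server is deliberately left out, because it depends on the server library you use. The names `make_summarize_tool` and `fake_predictor` are hypothetical, and the stand-in predictor replaces something like `dspy.Predict("text -> summary")` so the example runs without an API call.

```python
from types import SimpleNamespace


def make_summarize_tool(predictor):
    """Adapt a predictor with a `text -> summary` interface into a typed tool function."""

    def summarize(text: str) -> str:
        """Summarize the given text."""
        # DSPy modules take keyword inputs and return an object with output attributes.
        return predictor(text=text).summary

    return summarize


# Stand-in for a real DSPy predictor; it just truncates the input.
def fake_predictor(text):
    return SimpleNamespace(summary=text[:20])


summarize = make_summarize_tool(fake_predictor)
```

Keeping explicit type hints and a docstring on the inner function matters because tool servers typically derive the tool schema and description from them.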
B. Fine-Tuning Flow
- Define Signature.
- Optimize a Teacher program (strong model) to get high-quality traces.
- Generate synthetic data.
- Fine-tune a smaller Student model (e.g., Gemma 2B) using BootstrapFinetune.
C. RAG (Retrieval Augmented Generation)
Combine dspy.Retrieve (e.g., ColBERTv2, VectorDB) with dspy.ChainOfThought.
Optimize the entire pipeline to improve retrieval queries and answer generation simultaneously.
5. Troubleshooting
- 429 Errors: You hit the rate limit. Stop immediately.
- Empty Responses: Check API key and safety settings.
- "Context too long": Use dspy.Retrieve or decompose the task.
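"Stop immediately" can be enforced in code rather than left to discipline. The sketch below wraps any callable and refuses all further calls once a rate-limit error has been seen. The class names and the `is_rate_limit` predicate are hypothetical; how a 429 surfaces (exception type, message) depends on the client library, so supply a predicate that matches yours.

```python
class RateLimitHit(Exception):
    """Raised once a 429 has been seen; no further calls should be attempted."""


class StopOn429:
    """Wrap a callable and refuse every call after the first rate-limit error."""

    def __init__(self, call, is_rate_limit):
        self.call = call
        self.is_rate_limit = is_rate_limit  # hypothetical predicate: exception -> bool
        self.tripped = False

    def __call__(self, *args, **kwargs):
        if self.tripped:
            raise RateLimitHit("rate limit already hit; not sending more requests")
        try:
            return self.call(*args, **kwargs)
        except Exception as exc:
            if self.is_rate_limit(exc):
                self.tripped = True
                raise RateLimitHit(str(exc)) from exc
            raise  # unrelated errors pass through unchanged
```

Unlike retry-with-backoff, this never resends a request, which suits a quota of 20 requests per day where retries themselves consume the budget.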
6. Artifacts & Resources
- assets/boilerplate.py: MANDATORY starting point (includes Pydantic/Typed examples).
- references/cheat_sheet.md: Signatures for Invoice Parser, RAG, Agents, Entities.