OpenAI Agents SDK + Groq Integration Expert
Overview
This skill documents the proven, production-ready pattern for integrating the OpenAI Agents SDK with Groq's ultra-fast LLM inference API. This pattern was successfully implemented in the Islamic Todo application to create an AI-powered productivity assistant with function calling capabilities.
Key Achievement: Successfully replaced OpenAI's GPT models with Groq's llama-3.1-8b-instant while maintaining full compatibility with the OpenAI Agents SDK's tool calling and conversation management features.
🎯 Core Pattern: Groq as OpenAI-Compatible Provider
The Challenge
The OpenAI Agents SDK is designed for OpenAI's API, but Groq offers:
- •✅ 10x faster inference (sub-second responses)
- •✅ OpenAI-compatible API endpoints
- •✅ Free tier with generous limits
- •✅ Support for Llama 3.1 models with native function calling
The Solution
Use an AsyncOpenAI client with a custom base_url pointing to Groq's OpenAI-compatible endpoint, then hand that client to the SDK's OpenAIChatCompletionsModel.
📋 Implementation Checklist
Step 1: Environment Configuration
import os
from openai import AsyncOpenAI
from agents import OpenAIChatCompletionsModel, set_tracing_disabled

# your_groq_api_key: your Groq API key, loaded from configuration or a secret store
# Critical: Set environment variables for SDK compatibility
os.environ["OPENAI_API_KEY"] = your_groq_api_key
os.environ["OPENAI_BASE_URL"] = "https://api.groq.com/openai/v1"

# Disable tracing to avoid OpenAI-specific telemetry
set_tracing_disabled(disabled=True)
Why this works: The OpenAI Python client that the Agents SDK builds on reads OPENAI_API_KEY and OPENAI_BASE_URL when no explicit client is supplied. Pointing OPENAI_BASE_URL at Groq means any default client the SDK creates is routed to Groq's infrastructure instead of OpenAI's.
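A minimal sketch of failing fast when the key is missing. It assumes the key lives in an environment variable named GROQ_API_KEY, which is a hypothetical name (the original does not say where your_groq_api_key comes from), and configure_groq_env is an illustrative helper, not part of the SDK.

```python
import os

GROQ_BASE_URL = "https://api.groq.com/openai/v1"


def configure_groq_env(env=None):
    """Copy the Groq key into the variables the OpenAI client reads.

    GROQ_API_KEY is an assumed variable name, not something the SDK defines.
    """
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GROQ_API_KEY is not set")
    env["OPENAI_API_KEY"] = key
    env["OPENAI_BASE_URL"] = GROQ_BASE_URL
    return {"api_key": key, "base_url": GROQ_BASE_URL}
```

Calling this once at startup surfaces a missing key as a clear error, rather than as a failed request later.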
Step 2: Client Initialization
# Create AsyncOpenAI client with Groq endpoint
client = AsyncOpenAI(
    base_url="https://api.groq.com/openai/v1",
    api_key=your_groq_api_key
)

# Wrap in OpenAI Agents SDK model
model = OpenAIChatCompletionsModel(
    model="llama-3.1-8b-instant",  # Groq's fastest model with function calling
    openai_client=client
)
Model Selection Guide (confirm each ID against Groq's current model list, since model IDs are retired over time):
- •llama-3.1-8b-instant: Best for speed, good for simple tasks
- •llama-3.1-70b-versatile: Best for complex reasoning and function calling
- •llama-3.3-70b-versatile: Latest, most capable model
Step 3: Agent Creation with Tools
from typing import Optional
from agents import Agent, function_tool

# Define async tool functions
async def add_task(title: str, description: Optional[str] = None, priority: str = "medium"):
    """Add a new task to the user's todo list."""
    # Your implementation here
    return {"success": True, "task_id": "123"}

async def list_tasks(status: Optional[str] = None, limit: int = 10):
    """List user tasks with optional filtering."""
    # Your implementation here
    return {"tasks": [...]}

# Create agent with tools
agent = Agent(
    name="Islamic Assistant",
    model=model,
    instructions="You are an Islamic productivity assistant...",
    tools=[
        function_tool(add_task, name_override="add_task", strict_mode=False),
        function_tool(list_tasks, name_override="list_tasks", strict_mode=False)
    ]
)
Critical Settings:
- •strict_mode=False: Required for Groq compatibility (strict mode uses OpenAI-specific features)
- •name_override: Explicitly set tool names to avoid SDK auto-generation issues
- •Prefer async tools for I/O: function_tool accepts both sync and async functions, but any tool that awaits a database or network call must itself be declared async
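To make the strict_mode setting more concrete, here is a simplified illustration of how a strict parameter schema differs from a non-strict one. tool_schema is a hypothetical helper written for this sketch; the SDK's real schema generation is more thorough, and this only mirrors the general rule that strict schemas list every parameter as required and forbid extra properties.

```python
import inspect

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func, strict=False):
    """Build a simplified JSON schema for a tool's parameters.

    Illustration only: not the SDK's actual implementation.
    """
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Non-strict: only parameters without defaults are required.
        # Strict: every parameter must be listed as required.
        if strict or param.default is inspect.Parameter.empty:
            required.append(name)
    schema = {"type": "object", "properties": properties, "required": required}
    if strict:
        schema["additionalProperties"] = False
    return schema


async def add_task(title: str, priority: str = "medium"):
    """Add a new task to the user's todo list."""
    return {"success": True}
```

With strict=False, priority stays optional and the model may omit it. With strict=True, the schema demands both fields, which is the stricter contract this guide reports Groq does not support.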
Step 4: Execution with Runner
from agents import Runner

# Process user message (inside an async function)
result = await Runner.run(agent, user_message)

# Extract response
if hasattr(result, 'final_output'):
    response = result.final_output
else:
    response = str(result)
🏗️ Production Architecture Pattern
Wrapper Class Design
from typing import Any, Dict, List, Optional

class IslamicTodoAgent:
    """
    Wrapper for OpenAI Agents SDK with Groq provider.
    Uses composition instead of inheritance for cleaner integration.
    """

    def __init__(self):
        # Setup Groq client
        self.client = AsyncOpenAI(
            base_url="https://api.groq.com/openai/v1",
            api_key=settings.groq_api_key
        )
        self.model = OpenAIChatCompletionsModel(
            model="llama-3.1-8b-instant",
            openai_client=self.client
        )
        # State management
        self.current_session = None
        self.current_user_id = None

    def _create_agent(self) -> Agent:
        """Create fresh agent instance with context-aware tools."""
        # Define tools with closure over self.current_user_id
        async def add_task(title: str, ...):
            # Access self.current_user_id for user isolation
            return await self.task_tools.add_task(
                user_id=self.current_user_id,
                title=title
            )

        return Agent(
            name="Islamic Assistant",
            model=self.model,
            instructions=self._get_system_prompt(),
            tools=[function_tool(add_task, ...)]
        )

    async def process_message(
        self,
        user_id: str,
        message: str,
        history: Optional[List[Dict]] = None
    ) -> Dict[str, Any]:
        """Process message with user context."""
        try:
            # Setup session and user context
            self.current_user_id = user_id
            self.current_session = get_db_session()
            # Create agent (fresh instance per request)
            agent = self._create_agent()
            # Fold recent history into the prompt (see Conversation History Integration)
            context_message = message
            if history:
                h_text = "\n".join(f"{m['role']}: {m['content']}" for m in history[-5:])
                context_message = f"Conversation History:\n{h_text}\n\nCurrent Request: {message}"
            result = await Runner.run(agent, context_message)
            return {"content": result.final_output}
        finally:
            # Cleanup
            if self.current_session:
                self.current_session.close()
                self.current_session = None
🔧 Critical Implementation Details
1. User Isolation Pattern
# ❌ WRONG: Tools can't access user context
async def add_task(title: str):
    # How do we know which user?
    pass

# ✅ CORRECT: Use closure to capture user_id
def _create_agent(self):
    user_id = self.current_user_id  # Captured in closure at creation time

    async def add_task(title: str):
        return await self.task_tools.add_task(
            user_id=user_id,  # User context available
            title=title
        )

    return Agent(
        name="Islamic Assistant",
        model=self.model,
        tools=[function_tool(add_task, strict_mode=False)]
    )
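The closure idea can be tried without the SDK. The sketch below uses a hypothetical in-memory TASKS store in place of the real task database, and make_add_task is an illustrative factory name. It shows that each tool keeps the user_id it was created with, even when calls for different users are interleaved.

```python
import asyncio
from collections import defaultdict

# Hypothetical in-memory store standing in for the real task database.
TASKS = defaultdict(list)


def make_add_task(user_id: str):
    """Return an add_task tool bound to one user."""

    async def add_task(title: str):
        """Add a new task to the user's todo list."""
        TASKS[user_id].append(title)
        return {"success": True, "count": len(TASKS[user_id])}

    return add_task


async def demo():
    alice_tool = make_add_task("alice")
    bob_tool = make_add_task("bob")
    # Interleaved calls still land in the right user's list.
    await asyncio.gather(
        alice_tool("Pray Fajr"),
        bob_tool("Buy dates"),
        alice_tool("Read"),
    )
    return dict(TASKS)
```

Because user_id is a local variable of the factory rather than an attribute on a shared object, a second request cannot overwrite it while the first is still running.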
2. Session Management
# Always use try/finally for cleanup
try:
    self.current_session = get_db_session()
    result = await Runner.run(agent, message)
finally:
    if self.current_session:
        self.current_session.close()
        self.current_session = None
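The same cleanup guarantee can be packaged as a context manager so each request handler does not repeat the try/finally. This is a sketch under assumptions: FakeSession is a stand-in for whatever get_db_session() returns, and db_session is a hypothetical helper name.

```python
from contextlib import contextmanager


class FakeSession:
    """Stand-in for a database session; only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


@contextmanager
def db_session(factory=FakeSession):
    """Open a session and guarantee close(), even if the body raises."""
    session = factory()
    try:
        yield session
    finally:
        session.close()
```

Usage would look like `with db_session(get_db_session) as session: ...`, which also keeps the session in a local variable instead of on the shared agent object.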
3. Conversation History Integration
# Mix history into context message
if history:
    h_text = "\n".join([f"{m['role']}: {m['content']}" for m in history[-5:]])
    context_message = f"Conversation History:\n{h_text}\n\nCurrent Request: {message}"
else:
    context_message = message

result = await Runner.run(agent, context_message)
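Pulling the history formatting into a pure function makes it easy to unit test without calling the model. build_context_message and its max_turns parameter are names introduced for this sketch; the default of 5 mirrors the history[-5:] slice above.

```python
from typing import Dict, List, Optional


def build_context_message(
    message: str,
    history: Optional[List[Dict[str, str]]] = None,
    max_turns: int = 5,
) -> str:
    """Prefix the current request with the last few history entries."""
    if not history or max_turns <= 0:
        return message
    recent = history[-max_turns:]
    h_text = "\n".join(f"{m['role']}: {m['content']}" for m in recent)
    return f"Conversation History:\n{h_text}\n\nCurrent Request: {message}"
```

Capping the number of entries keeps the prompt size bounded, which matters more on smaller models with shorter effective context.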
🚨 Common Pitfalls & Solutions
Pitfall 1: Tool Functions Not Async
Error: TypeError: object NoneType can't be used in 'await' expression
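Solution: Declare a tool async whenever its body awaits something. The SDK also accepts plain sync functions, but they cannot await async database or network calls.
# ❌ WRONG: db.add is async here, so its result is never awaited
def add_task(title: str):
    return db.add(...)
# ✅ CORRECT
async def add_task(title: str):
    return await db.add(...)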
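When the underlying database call is synchronous and blocking, one option is to keep the tool async and push the blocking work to a worker thread. The sketch below uses asyncio.to_thread from the standard library (Python 3.9+); blocking_insert is a hypothetical stand-in for a real synchronous driver call.

```python
import asyncio
import time


def blocking_insert(title: str) -> dict:
    """Hypothetical synchronous database call."""
    time.sleep(0.01)  # simulates blocking I/O
    return {"success": True, "title": title}


async def add_task(title: str) -> dict:
    """Add a new task to the user's todo list."""
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_insert, title)
```

This keeps other requests responsive while the insert runs, at the cost of one thread per in-flight blocking call.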
Pitfall 2: Strict Mode Enabled
Error: Groq API doesn't support strict mode
Solution: Always set strict_mode=False
function_tool(add_task, strict_mode=False)
Pitfall 3: Reusing Agent Instances
Error: Stale state, memory leaks
Solution: Create fresh agent per request
# ❌ WRONG: Reuse agent
self.agent = Agent(...)  # Created once
result = await Runner.run(self.agent, message)
# ✅ CORRECT: Fresh agent per request
agent = self._create_agent()  # New instance
result = await Runner.run(agent, message)
Pitfall 4: Missing Environment Variables
Error: SDK tries to use default OpenAI endpoint
Solution: Always set both env vars
os.environ["OPENAI_API_KEY"] = groq_key
os.environ["OPENAI_BASE_URL"] = "https://api.groq.com/openai/v1"
📊 Performance Characteristics
Groq vs OpenAI Response Times
- •Groq (llama-3.1-8b-instant): 200-500ms average
- •OpenAI (gpt-3.5-turbo): 1-3 seconds average
- •Groq (llama-3.1-70b): 500-1000ms average
- •OpenAI (gpt-4): 3-8 seconds average
Function Calling Reliability
- •Groq llama-3.1-70b: ~95% accuracy on well-defined schemas
- •Groq llama-3.1-8b: ~85% accuracy (use for simple tasks)
- •Tip: Use llama-3.1-70b-versatile for production function calling
🎓 Best Practices
1. Model Selection Strategy
# Development: Fast iteration
model = "llama-3.1-8b-instant"
# Production: Reliability
model = "llama-3.1-70b-versatile"
# Cost-sensitive: Balance
model = "llama-3.3-70b-versatile"  # Latest, best value
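One way to apply this strategy without editing code per deployment is to pick the model from configuration. In this sketch, APP_ENV and GROQ_MODEL are hypothetical environment variable names and pick_model is an illustrative helper; the model IDs are the ones listed in this guide and should be checked against Groq's current model list.

```python
import os

# Model IDs copied from this guide; availability on Groq may change.
MODELS_BY_ENV = {
    "development": "llama-3.1-8b-instant",
    "production": "llama-3.1-70b-versatile",
}


def pick_model(env=None) -> str:
    """Choose a model ID from environment settings.

    GROQ_MODEL and APP_ENV are assumed variable names for this sketch.
    """
    env = os.environ if env is None else env
    override = env.get("GROQ_MODEL")
    if override:
        return override  # explicit override wins, e.g. to try a newer model
    app_env = env.get("APP_ENV", "development")
    return MODELS_BY_ENV.get(app_env, MODELS_BY_ENV["development"])
```

An explicit override makes it possible to switch models, for example to llama-3.3-70b-versatile, without a code change.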
2. Error Handling
try:
    result = await Runner.run(agent, message)
    return {"content": result.final_output}
except Exception as e:
    logger.error(f"Agent Error: {str(e)}", exc_info=True)
    return {"content": f"I encountered an error: {str(e)}", "error": str(e)}
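A variation worth considering: returning str(e) to the user can expose internal details such as connection strings or stack context. The sketch below logs the full traceback but returns only a generic message plus a short reference id. safe_run and error_id are names invented for this example, and run_agent stands for any async callable that would wrap Runner.run in the real application.

```python
import asyncio
import logging
import uuid

logger = logging.getLogger("agent")


async def safe_run(run_agent, message: str) -> dict:
    """Await run_agent(message) and turn failures into a generic reply.

    run_agent is any async callable; in the real app it would wrap Runner.run.
    """
    try:
        return {"content": await run_agent(message)}
    except Exception:
        error_id = uuid.uuid4().hex[:8]
        # Full traceback goes to the log; the user only sees a reference id.
        logger.exception("Agent error %s", error_id)
        return {
            "content": "Sorry, something went wrong. Please try again.",
            "error_id": error_id,
        }
```

The reference id lets support staff find the matching log entry without the response itself carrying any exception text.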
3. Logging & Debugging
# Add debug prints to track tool calls
async def add_task(title: str, ...):
    print(f"SIGNAL: SDK called add_task(title='{title}')")
    result = await self.task_tools.add_task(...)
    print(f"RESULT: {result}")
    return result
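If several tools need the same tracing, a decorator avoids repeating the print calls. This is a sketch using the standard logging module; log_tool_call is a hypothetical name. functools.wraps copies the original name and docstring onto the wrapper, so code that introspects the tool can still see them.

```python
import asyncio
import functools
import logging

logger = logging.getLogger("agent.tools")


def log_tool_call(func):
    """Log the arguments and result of an async tool."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        logger.debug("tool call %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = await func(*args, **kwargs)
        logger.debug("tool result %s -> %r", func.__name__, result)
        return result

    return wrapper


@log_tool_call
async def add_task(title: str, priority: str = "medium"):
    """Add a new task to the user's todo list."""
    return {"success": True, "title": title, "priority": priority}
```

Using logger.debug instead of print means the tracing can be switched off in production through logging configuration alone.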
4. System Prompt Engineering
instructions = """You are an Islamic productivity assistant.

CAPABILITIES:
- Add, list, update, and complete tasks
- Provide Islamic reminders and motivation
- Help users stay organized

CONSTRAINTS:
- Always use tools for task operations
- Respond in a warm, encouraging tone
- Include relevant Islamic wisdom when appropriate

TOOL USAGE:
- Call add_task when user wants to create a task
- Call list_tasks to show current tasks
- Call complete_task when user finishes something
"""
🔗 Integration with FastAPI
Route Handler Pattern
from fastapi import APIRouter, Depends
from .agents.todo_agent import IslamicTodoAgent

# get_current_user_id is the project's auth dependency (not shown here)
router = APIRouter()
agent = IslamicTodoAgent()

@router.post("/chat")
async def chat(
    message: str,
    user_id: str = Depends(get_current_user_id)
):
    result = await agent.process_message(
        user_id=user_id,
        message=message
    )
    return {"response": result["content"]}
📚 Reference Implementation
Source: backend/src/agents/todo_agent.py in the Islamic Todo project
Key Files:
- •todo_agent.py: Main agent wrapper
- •config.py: System prompt configuration
- •task_tools.py: MCP-style tool implementations
✅ Success Criteria
You've successfully implemented this pattern when:
- •✅ Agent responds in <500ms with Groq
- •✅ Tools are called correctly with proper parameters
- •✅ User context is isolated (no cross-user data leaks)
- •✅ Conversation history is maintained across messages
- •✅ Error handling prevents crashes
- •✅ Database sessions are properly cleaned up
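The latency criterion can be checked with a small timing helper. In this sketch, timed is a hypothetical helper and fake_agent_call stands in for agent.process_message, sleeping instead of calling Groq, so the numbers it produces say nothing about real Groq latency.

```python
import asyncio
import time


async def timed(awaitable):
    """Await something and return (result, elapsed_milliseconds)."""
    start = time.perf_counter()
    result = await awaitable
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


async def fake_agent_call(message: str) -> str:
    """Stand-in for agent.process_message; sleeps instead of calling Groq."""
    await asyncio.sleep(0.05)
    return f"echo: {message}"
```

In a real check, wrap the actual call, for example `await timed(agent.process_message(user_id=..., message=...))`, and compare elapsed_ms against the 500 ms target over several requests rather than a single one.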
🚀 Quick Start Template
import asyncio
import os
from openai import AsyncOpenAI
from agents import Agent, Runner, OpenAIChatCompletionsModel, function_tool, set_tracing_disabled

class MyAgent:
    def __init__(self, groq_api_key: str):
        os.environ["OPENAI_API_KEY"] = groq_api_key
        os.environ["OPENAI_BASE_URL"] = "https://api.groq.com/openai/v1"
        set_tracing_disabled(disabled=True)
        self.client = AsyncOpenAI(
            base_url="https://api.groq.com/openai/v1",
            api_key=groq_api_key
        )
        self.model = OpenAIChatCompletionsModel(
            model="llama-3.1-70b-versatile",
            openai_client=self.client
        )

    def _create_agent(self):
        async def my_tool(param: str):
            """Tool description."""
            return f"Processed: {param}"

        return Agent(
            name="My Agent",
            model=self.model,
            instructions="You are a helpful assistant.",
            tools=[function_tool(my_tool, strict_mode=False)]
        )

    async def chat(self, message: str):
        agent = self._create_agent()
        result = await Runner.run(agent, message)
        return result.final_output

# Usage: await only works inside a coroutine, so drive it with asyncio.run
async def main():
    agent = MyAgent(groq_api_key="your-key")
    response = await agent.chat("Hello!")
    print(response)

asyncio.run(main())
📖 Additional Resources
- •Groq API Docs: https://console.groq.com/docs
- •OpenAI Agents SDK: https://github.com/openai/openai-agents-python
- •Llama 3.1 Function Calling: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1
Last Updated: 2026-01-28
Tested With: OpenAI Agents SDK v0.1.x, Groq API v1