Research Agent Optimization
Scope
- •Project root:
/home/bender/classwork/Thesis - •Backend:
backend/news_research_agent.py,backend/app/api/routes/research.py,backend/app/services/news_research.py - •Frontend:
frontend/app/search/page.tsx,frontend/lib/api.ts - •Configuration:
backend/app/core/config.py
Problem Statement
- •Rate Limiting: Gemini API hits 429 quota exceeded errors during research and article analysis
- •Web Search: DuckDuckGo tool integration has naming issues (not properly initialized)
- •Unclear Progress: Research streaming shows generic "Still working..." instead of specific tool calls
- •JSON in Response: Results show raw JSON blocks instead of formatted source cards
- •Redundant API Calls: Multiple internal search calls without caching/deduplication
Required Outcomes
- •Graceful rate limit handling with exponential backoff and quota monitoring
- •Working web search tool with proper DuckDuckGo initialization
- •Verbose streaming events showing real tool execution (web_search, news_search, internal_news_search)
- •Research results rendered with inline source cards (not JSON blocks)
- •Optimized API calls: batch searches, cache semantic results, reuse internal knowledge base
- •Clear error messages when quota is exceeded
Workflow
1. API Call Optimization
- •Implement request batching in
search_internal_newstool - •Add caching layer for semantic search results (avoid duplicate queries within 5min window)
- •Combine web_search + news_search into single result set
- •Track API call counts per session and warn before quota exhaustion
- •Add exponential backoff retry logic (1s, 2s, 4s, 8s max)
Files:
- •
backend/news_research_agent.py- tools and caching - •
backend/app/services/news_research.py- request batching helpers
2. Rate Limit & Quota Handling
- •Add try/catch wrapper around Gemini calls
- •Detect 429 errors and return user-friendly message ("API Rate Limit: ...please wait a moment...")
- •Add optional
--skip-gemini-analysismode for article analysis when quota is low - •Log quota usage and remaining tokens
- •Set model to
gemini-2.0-flash(faster, lower token cost) instead ofgemini-2.0-flash-exp
Files:
- •
backend/app/core/config.py- error handling wrapper, model selection - •
backend/app/api/routes/research.py- HTTP error responses - •
backend/news_research_agent.py- LLM call error handling
3. Web Search Tool Fix
- •Verify DuckDuckGo import:
from duckduckgo_search import DDGS(notddgsorDuckDuckGo) - •Ensure
web_searchandnews_searchtools are properly bound to LLM - •Add fallback to internal search if web search fails
- •Log tool execution with query and result count
Files:
- •
backend/news_research_agent.py- tool definitions and error handling - •Use
exa-codeto verify current DuckDuckGo API patterns
4. Streaming Progress Clarity
- •Expand SSE event types:
tool_startincludes tool name + query parameters - •Map tool events to user-friendly messages:
- •
web_search("climate change")→ "Searching web for: climate change..." - •
news_search(keywords="COP30")→ "Searching news for: COP30..." - •
search_internal_news(query)→ "Searching internal knowledge base..." - •
fetch_article_content(url)→ "Reading article: [title/domain]..."
- •
- •Add timestamps and tool execution duration
- •Emit status updates every 3-5 seconds if no tool activity
Files:
- •
backend/news_research_agent.py- streaming generator - •
backend/app/api/routes/research.py- SSE formatting
5. Frontend Result Rendering
- •Remove JSON blocks from response text
- •Render referenced articles in a "Sources" section below the answer
- •Use article cards: title, source, date, image thumbnail
- •Make cards clickable to open article detail modal
- •Group sources by retrieval method (semantic, web search, internal)
Files:
- •
frontend/app/search/page.tsx- message rendering and sources grid - •
frontend/lib/api.ts- response parsing
6. Error Handling & User Feedback
- •Detect and handle:
- •429 quota exceeded → "API Rate Limit: The AI service has reached its rate limit. Please wait a moment and try again."
- •Connection timeout → "Request Timeout: The research took too long. Try a simpler query."
- •Tool execution failure → "Tool [name] failed: [reason]. Continuing with alternative search..."
- •Add retry prompt on error (not automatic, user chooses)
- •Log all errors with request ID for debugging
Files:
- •
backend/app/api/routes/research.py- error formatting - •
frontend/app/search/page.tsx- error UI and retry logic
Checks
API Optimization
- •Verify semantic search results are cached (no duplicate calls)
- •Check web_search and news_search return results (not empty)
- •Confirm tool execution logs show cache hits for repeated queries
Rate Limit Handling
- •Trigger 429 error and verify graceful fallback message displays
- •Confirm no stack traces shown to user
- •Check logs show quota status and retry timing
Web Search
- •Query "climate change" and verify web_search returns 5+ results
- •Confirm DuckDuckGo DDGS class is properly instantiated
- •Check news_search returns recent news articles
Streaming Clarity
- •Monitor SSE events for tool_start with query details
- •Verify timestamps increment correctly
- •Confirm "Still working..." message only shows after 30s inactivity
Frontend Rendering
- •Verify research answer is plain text (no JSON)
- •Check "Sources" section appears with article cards
- •Confirm card click opens article detail modal
- •Verify no duplicate sources (de-duplication working)
Error Scenarios
- •Submit invalid query and verify doesn't crash
- •Test with network disconnect and check timeout message
- •Simulate quota exceeded (403) and verify user sees rate limit message
Implementation Checklist
- • Add retry decorator with exponential backoff to Gemini client
- • Implement request cache in
search_internal_newswith 5min TTL - • Fix DuckDuckGo tool initialization (verify DDGS import)
- • Update
research_stream()to emit granular tool start/result events - • Map tool events to human-readable status messages in API endpoint
- • Remove JSON block from final answer text
- • Add "Sources" section with article cards to frontend
- • Update error handling for 429 quota exceeded
- • Add streaming status animation to UI
- • Write tests for quota handling and web search integration