LangGraph Application Fine-Tuning Skill
A skill for iteratively optimizing prompts and processing logic in each node of a LangGraph application based on evaluation criteria.
📋 Overview
This skill executes the following process to improve the performance of existing LangGraph applications:
1. Load Objectives: Retrieve optimization goals and evaluation criteria from .langgraph-master/fine-tune.md (if this file doesn't exist, help the user create it based on their requirements)
2. Identify Optimization Targets: Extract nodes containing LLM prompts using Serena MCP (if Serena MCP is unavailable, investigate the codebase using ls, read, etc.)
3. Baseline Evaluation: Measure current performance through multiple runs
4. Implement Improvements: Identify the most effective improvement areas and optimize prompts and processing logic
5. Re-evaluation: Measure performance after improvements
6. Iteration: Repeat steps 4-5 until goals are achieved
Important Constraint: Optimize only the prompts and processing logic inside each node; do not modify the graph structure (node and edge configuration).
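One way to guard this constraint during iteration is to snapshot the graph's topology before editing and compare it afterwards. The sketch below uses plain Python data only. `GraphShape` and `assert_same_structure` are hypothetical helpers, not LangGraph APIs, and how the node names and edges are extracted from your own graph is left as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphShape:
    """Hypothetical snapshot of a graph's topology (not a LangGraph type)."""
    nodes: frozenset
    edges: frozenset  # set of (source, target) tuples


def assert_same_structure(before: GraphShape, after: GraphShape) -> None:
    """Raise ValueError if nodes or edges differ between two snapshots."""
    if before.nodes != after.nodes:
        raise ValueError(f"nodes changed: {sorted(before.nodes ^ after.nodes)}")
    if before.edges != after.edges:
        raise ValueError(f"edges changed: {sorted(before.edges ^ after.edges)}")


# Illustrative node names; a real project would fill these in from its graph.
baseline_shape = GraphShape(
    nodes=frozenset({"classify", "respond"}),
    edges=frozenset({("classify", "respond")}),
)
# After editing only prompts and in-node logic, the snapshot should be identical.
tuned_shape = GraphShape(
    nodes=frozenset({"classify", "respond"}),
    edges=frozenset({("classify", "respond")}),
)
assert_same_structure(baseline_shape, tuned_shape)
```

Running such a check after every iteration makes an accidental structural change fail fast instead of silently skewing the evaluation.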
🎯 When to Use This Skill
Use this skill in the following situations:
- •When performance improvement of existing applications is needed
  - •Want to improve LLM output quality
  - •Want to improve response speed
  - •Want to reduce error rate
- •When evaluation criteria are clear
  - •Optimization goals are defined in .langgraph-master/fine-tune.md
  - •Quantitative evaluation methods are established
- •When improvements through prompt engineering are expected
  - •Improvements are likely with clearer LLM instructions
  - •Adding few-shot examples would be effective
  - •Output format adjustment is needed
📖 Fine-Tuning Workflow Overview
Phase 1: Preparation and Analysis
Purpose: Understand optimization targets and current state
Main Steps:
1. Load objective setting file (.langgraph-master/fine-tune.md)
2. Identify optimization targets (Serena MCP or manual code investigation)
3. Create optimization target list (evaluate improvement potential for each node)
→ See workflow.md for details
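The first step of this phase can be sketched as a small loader. This is a minimal illustration, assuming the objectives file sits under the project root at the path named above. `load_objectives` is a hypothetical helper name, and what happens when the file is missing (here, a printed reminder) is up to the skill user.

```python
from pathlib import Path
from typing import Optional

# Path taken from this document; the project root is an assumption.
OBJECTIVES_PATH = Path(".langgraph-master") / "fine-tune.md"


def load_objectives(project_root: Path) -> Optional[str]:
    """Return the objectives file text, or None if it has not been created yet."""
    path = project_root / OBJECTIVES_PATH
    if not path.is_file():
        return None
    return path.read_text(encoding="utf-8")


objectives = load_objectives(Path.cwd())
if objectives is None:
    # Fall back to drafting the file together with the user.
    print("No objectives file found; draft one with the user first.")
```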
Phase 2: Baseline Evaluation
Purpose: Quantitatively measure current performance
Main Steps:
4. Prepare evaluation environment (test cases, evaluation scripts)
5. Baseline measurement (recommended: 3-5 runs)
6. Analyze baseline results (identify problems)
Important: When evaluation programs are needed, create the evaluation code in a dedicated subdirectory (the user may specify which one).
→ See workflow.md and evaluation.md for details
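The baseline measurement step can be sketched as follows. This is a minimal example, assuming each evaluation run returns a single numeric score. `measure_baseline` and the stand-in score list are hypothetical; a real run would call your own evaluation script instead.

```python
import statistics
from typing import Callable, Dict, List


def measure_baseline(run_once: Callable[[], float], runs: int = 5) -> Dict[str, float]:
    """Call the evaluation `runs` times and summarize the scores."""
    scores: List[float] = [run_once() for _ in range(runs)]
    return {
        "mean": statistics.mean(scores),
        # Sample standard deviation needs at least two runs.
        "stdev": statistics.stdev(scores) if runs > 1 else 0.0,
        "min": min(scores),
        "max": max(scores),
    }


# Stand-in for real evaluation runs; the numbers are illustrative only.
_fake_scores = iter([0.70, 0.74, 0.72, 0.71, 0.73])
summary = measure_baseline(lambda: next(_fake_scores), runs=5)
print(summary)
```

Keeping the spread (standard deviation, min, max) alongside the mean makes it easier to judge later whether an improvement exceeds run-to-run noise.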
Phase 3: Iterative Improvement
Purpose: Data-driven incremental improvement
Main Steps:
7. Prioritization (select the most impactful improvement area)
8. Implement improvements (prompt optimization, parameter tuning)
9. Post-improvement evaluation (re-evaluate under the same conditions)
10. Compare and analyze results (measure improvement effects)
11. Decide whether to continue iteration (repeat until goals are achieved)
→ See workflow.md and prompt_optimization.md for details
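The compare-and-decide steps of this phase can be expressed as a small decision rule. The sketch below is illustrative only: `IterationResult`, `next_action`, the `min_gain` threshold, and the "higher is better" score are all assumptions, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass
class IterationResult:
    label: str
    score: float  # higher is better in this sketch


def next_action(baseline: IterationResult, candidate: IterationResult,
                target: float, min_gain: float = 0.01) -> str:
    """Return 'stop', 'keep', or 'revert' for one improvement iteration."""
    if candidate.score >= target:
        return "stop"    # goal reached; move on to documentation
    if candidate.score - baseline.score >= min_gain:
        return "keep"    # meaningful gain; candidate becomes the new baseline
    return "revert"      # no meaningful gain; roll the change back


decision = next_action(
    IterationResult("baseline", 0.72),
    IterationResult("few-shot prompt", 0.80),
    target=0.90,
)
print(decision)
```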
Phase 4: Completion and Documentation
Purpose: Record achievements and provide future recommendations
Main Steps:
12. Create final evaluation report (improvement content, results, recommendations)
13. Code commit and documentation update
→ See workflow.md for details
🔧 Tools and Technologies Used
MCP Server Utilization
- •Serena MCP: Codebase analysis and optimization target identification
  - •find_symbol: Search for LLM clients
  - •find_referencing_symbols: Identify prompt construction locations
  - •get_symbols_overview: Understand node structure
- •Sequential MCP: Complex analysis and decision making
  - •Determine improvement priorities
  - •Analyze evaluation results
  - •Plan next actions
Key Optimization Techniques
- •Few-Shot Examples: Accuracy +10-20%
- •Structured Output Format: Parsing errors -90%
- •Temperature/Max Tokens Adjustment: Cost -20-40%
- •Model Selection Optimization: Cost -40-60%
- •Prompt Caching: Cost -50-90% (on cache hit)
→ See prompt_optimization.md for details
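Two of the techniques above, few-shot examples and a structured output format, can be combined in one prompt builder. The sketch below uses only the standard library. The classification task, the example messages, and the helper names `build_prompt` and `parse_answer` are hypothetical, and the actual LLM call is deliberately left out.

```python
import json
from typing import Dict, List, Tuple

# Illustrative few-shot pairs: (input message, expected label).
FEW_SHOT_EXAMPLES: List[Tuple[str, str]] = [
    ("Where is my order?", "shipping"),
    ("I was charged twice.", "billing"),
]


def build_prompt(user_input: str) -> str:
    """Assemble instructions, few-shot examples, and the new input."""
    lines = [
        "Classify the message. Reply with JSON only, for example:",
        '{"category": "<label>"}',
        "",
    ]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Message: {text}")
        lines.append("Answer: " + json.dumps({"category": label}))
        lines.append("")
    lines.append(f"Message: {user_input}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_answer(raw: str) -> Dict[str, str]:
    """Parse the model reply; raise ValueError if it is not the expected JSON."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or "category" not in data:
        raise ValueError(f"unexpected reply: {raw!r}")
    return data


print(build_prompt("Can I get a refund?"))
```

Because the examples already show the exact JSON shape, the same `parse_answer` check can double as a parsing-error metric during evaluation.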
📚 Related Documentation
Detailed guidelines and best practices:
- •workflow.md - Fine-tuning workflow details (execution procedures and code examples for each phase)
- •evaluation.md - Evaluation methods and best practices (metric calculation, statistical analysis, test case design)
- •prompt_optimization.md - Prompt optimization techniques (10 practical methods and priorities)
- •examples.md - Practical examples collection (copy-and-paste ready code examples and template collection)
⚠️ Important Notes
- •Preserve Graph Structure
  - •Do not add or remove nodes or edges
  - •Do not change data flow between nodes
  - •Maintain state schema
- •Evaluation Consistency
  - •Use the same test cases
  - •Measure with the same evaluation metrics
  - •Run multiple times to confirm statistically significant improvements
- •Cost Management
  - •Consider evaluation execution costs
  - •Adjust sample size as needed
  - •Be mindful of API rate limits
- •Version Control
  - •Git commit each iteration's changes
  - •Maintain rollback-capable state
  - •Record evaluation results
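To check whether an improvement is statistically significant with only a handful of runs, one option is an exact permutation test on the mean score. The source does not prescribe a specific test, so this choice, the helper name `permutation_p_value`, and the sample scores are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean
from typing import Sequence


def permutation_p_value(baseline: Sequence[float], improved: Sequence[float]) -> float:
    """One-sided exact permutation test on the difference in mean score."""
    observed = mean(improved) - mean(baseline)
    pooled = list(baseline) + list(improved)
    k = len(improved)
    total = 0
    at_least_as_large = 0
    # Enumerate every way of relabeling k of the pooled scores as "improved".
    for idx in combinations(range(len(pooled)), k):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(group_a) - mean(group_b) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total


# Illustrative scores from five runs each, before and after a change.
baseline_scores = [0.70, 0.74, 0.72, 0.71, 0.73]
improved_scores = [0.80, 0.82, 0.79, 0.81, 0.83]
p_value = permutation_p_value(baseline_scores, improved_scores)
print(f"p = {p_value:.4f}")
```

Exhaustive enumeration is only practical for small run counts (five runs per side gives 252 relabelings); for larger samples a randomized permutation test or a standard t-test would be the usual substitute.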
🎓 Fine-Tuning Best Practices
- •Start Small: Begin with the single most impactful node
- •Measurement-Driven: Always perform quantitative evaluation before and after improvements
- •Incremental Improvement: Validate one change at a time rather than several at once
- •Documentation: Record reasons and results for each change
- •Iteration: Continuously improve until goals are achieved