Tool Parser Generator Skill
Add support for new models' tool calling formats by analyzing their chat templates and generating appropriate parser implementations for dynamo.
When to Use This Skill
- •User asks to add tool calling support for a specific HuggingFace model
- •User wants to understand how a model structures tool calls
- •User needs to extend dynamo's parser library with new formats
Workflow
Follow this systematic workflow when the user provides a HuggingFace model name.
Phase 1: Fetch and Extract Chat Template
- •
Fetch tokenizer config from HuggingFace Hub:
codeURL: https://huggingface.co/{model_id}/resolve/main/tokenizer_config.json - •
Extract chat template:
- •Parse the JSON response
- •Look for
chat_templatefield - •Handle two formats:
- •String: Single template
- •Array: List of templates with
nameandtemplatefields- •Prefer
tool_usetemplate if available - •Fall back to
defaulttemplate
- •Prefer
- •
Extract special tokens (if relevant):
- •
bos_token,eos_token,unk_token - •
additional_special_tokens - •Any tool-specific tokens in the config
- •
Phase 2: Analyze Chat Template
The chat template is a Jinja template. Analyze it to identify tool call patterns:
- •
Find tool-related sections:
- •Look for conditional blocks with keywords:
tools,tool_call,function,available_tools - •Extract content within
{% if tools %}...{% endif %}blocks - •Find
{% for tool in tools %}loops
- •Look for conditional blocks with keywords:
- •
Identify markers and format:
- •Start markers: Tokens/strings before tool calls
- •Examples:
<tool_call>,[TOOL_CALLS],<|python_tag|>,<|tool▁call▁begin|>
- •Examples:
- •End markers: Tokens/strings after tool calls
- •Examples:
</tool_call>,[/TOOL_CALLS],<|tool▁call▁end|>
- •Examples:
- •Special tokens: Unicode or encoded tokens (DeepSeek, Harmony)
- •Format type:
- •JSON: Look for
tojsonfilter,{}brackets - •XML: Look for
<function=,<parameter=patterns - •Pythonic: Look for
function(arg=val)patterns - •DSML: Look for
<|DSML|tokens
- •JSON: Look for
- •Start markers: Tokens/strings before tool calls
- •
Identify JSON structure (if JSON format):
- •Name key: Usually
nameorfunction - •Arguments key: Usually
argumentsorparameters - •Array vs single object
- •Multiple calls handling
- •Name key: Usually
Phase 3: Compare with Existing Parsers
Read existing parser implementations in /lib/parsers/src/tool_calling/:
- •
Check JSON parsers (
json/directory):- •
base_json_parser.rs- Generic JSON with markers - •
deepseek_v3_parser.rs- DeepSeek V3 format - •
deepseek_v3_1_parser.rs- DeepSeek V3.1 format
- •
- •
Check XML parsers (
xml/directory):- •
parser.rs- Qwen3 Coder XML format
- •
- •
Check other formats:
- •
pythonic/pythonic_parser.rs- Python syntax - •
harmony/harmony_parser.rs- Harmony protocol - •
dsml/parser.rs- DeepSeek V3.2 DSML
- •
- •
Review config presets in
config.rs:- •Look at
ToolCallConfig::hermes(),mistral(),llama3_json(), etc. - •Each preset defines start/end tokens, key names, parser type
- •Look at
- •
Check parser registry in
parsers.rs:- •See how parsers are registered in
get_tool_parser_map() - •Understand the
ParserTypeenum and routing logic
- •See how parsers are registered in
Match the analyzed format:
- •If start/end tokens and format match existing parser → Use existing parser with config
- •If similar but different tokens → Adapt existing parser config
- •If completely different format → Generate new parser
Phase 4: Generate or Configure Parser
Option A: Use Existing Parser (Preferred)
If a match is found, create a configuration preset:
- •
Add a new preset function to
/lib/parsers/src/tool_calling/config.rs:rustimpl ToolCallConfig { pub fn new_model_name() -> Self { Self { config: ParserConfig::Json(JsonParserConfig { start_token: Some("<marker>".to_string()), end_token: Some("</marker>".to_string()), function_name_key: Some("name".to_string()), function_arguments_key: Some("arguments".to_string()), parser_type: JsonParserType::Basic, }), } } } - •
Register in parser map in
/lib/parsers/src/tool_calling/parsers.rs - •
Create tests to verify the configuration works
Option B: Generate New Parser (If Needed)
If no existing parser fits, generate new parser code:
- •
Choose parser template based on format:
- •JSON format → Use
base_json_parser.rsas template - •XML format → Use
xml/parser.rsas template - •Custom format → Implement three core functions
- •JSON format → Use
- •
Implement required functions:
rust// Detection pub fn detect_tool_call_start_[name](chunk: &str, config: &Config) -> bool // Parsing pub fn try_tool_call_parse_[name]( message: &str, config: &Config, tools: Option<&[ToolDefinition]>, ) -> Result<(Vec<ToolCallResponse>, Option<String>)> // End detection (for streaming) pub fn find_tool_call_end_position_[name](chunk: &str, config: &Config) -> usize - •
Use regex for token matching:
- •Use
OnceLock<Regex>for compiled regexes - •Escape special characters properly
- •Handle partial tokens for streaming
- •Use
- •
Parse JSON/XML content:
- •Use
serde_jsonfor JSON parsing - •Use regex for XML extraction (or XML parser if complex)
- •Build
ToolCallResponsestructs
- •Use
- •
Add to appropriate directory:
- •JSON variants →
json/directory - •XML variants →
xml/directory - •New format → Create new subdirectory
- •JSON variants →
Phase 5: Generate Tests
For any new parser or configuration, generate comprehensive tests:
- •
Basic tests:
- •Detection of start markers
- •Parsing single tool call
- •Parsing multiple tool calls
- •Normal text extraction
- •
Edge cases:
- •Empty arguments
- •Missing fields
- •Malformed JSON/XML
- •Partial tokens (streaming)
- •
Integration tests:
- •End-to-end with real model outputs (if available)
- •Tool validation (if tools list provided)
- •
Add tests to appropriate location:
- •Inline in parser file (in
#[cfg(test)]module) - •Or in
/lib/parsers/src/tool_calling/tests.rs
- •Inline in parser file (in
Phase 6: Integration
- •
Update module exports:
- •Add
moddeclaration in parentmod.rs - •Export functions as needed
- •Add
- •
Register parser in
parsers.rsif new parser:- •Add to
get_tool_parser_map()function - •CRITICAL: Update
test_get_available_tool_parsers()test - •Add your new parser name to the
available_parsersarray in the test
- •Add to
- •
Document the parser:
- •Add doc comments explaining format
- •Include example input/output
- •Reference model family
- •
Run tests:
bashcd lib/parsers cargo test tool_calling
- •
Verify with dynamo:
- •Test with actual model if possible
- •Verify streaming behavior
- •Check error handling
Key Reference Files
Dynamo Codebase:
- •
/lib/parsers/src/tool_calling/- All tool call parsers - •
/lib/parsers/src/tool_calling/config.rs- Configuration presets - •
/lib/parsers/src/tool_calling/parsers.rs- Parser registry - •
/lib/llm/src/preprocessor/prompt/template/tokcfg.rs- Chat template structures - •
/lib/llm/src/preprocessor/prompt/template.rs- Template loading
Reference Implementations:
- •sglang: https://github.com/sgl-project/sglang/tree/main/python/sglang/srt/function_call
- •Look at detector pattern (base_format_detector.py)
- •Model-specific detectors (qwen25_detector.py, deepseekv3_detector.py, etc.)
- •vLLM: https://github.com/vllm-project/vllm/tree/main/vllm/tool_parsers
- •Look at abstract_tool_parser.py
- •Model-specific parsers (llama_tool_parser.py, qwen3xml_tool_parser.py, etc.)
- •HuggingFace: https://huggingface.co/docs/transformers/chat_templating
Example: Adding Support for a New Model
User: "Add tool calling support for Qwen/Qwen2.5-72B-Instruct"
Step 1: Fetch tokenizer config
- •Use WebFetch to get
https://huggingface.co/Qwen/Qwen2.5-72B-Instruct/resolve/main/tokenizer_config.json
Step 2: Analyze chat template
- •Extract
chat_templatefield - •Identify
{% if tools %}block - •Find markers: Likely
<tool_call>and</tool_call> - •Identify format: Check for JSON with
tojsonfilter
Step 3: Compare with existing parsers
- •Read
/lib/parsers/src/tool_calling/config.rs - •Check
ToolCallConfig::hermes()- uses<tool_call>markers - •Check if Qwen format matches hermes format
Step 4: Use or adapt existing parser
- •If matches hermes: Create
qwen2_5()config preset - •If different: Generate new parser or adapt base_json_parser
Step 5: Generate tests
- •Create test cases with example Qwen tool calls
- •Test detection, parsing, and edge cases
Step 6: Integrate
- •Add config preset to
config.rs - •Register in parser map (
get_tool_parser_map()) - •Update
test_get_available_tool_parsers()test - •Run tests
- •Document
Tips
- •Always prefer existing parsers: Most models can use existing parsers with different configs
- •Read reference implementations: sglang and vLLM often have parsers for popular models
- •Use WebFetch for HF models: Don't assume - always fetch actual tokenizer config
- •Test with real outputs: If possible, get actual model outputs to test against
- •Keep it simple: Prefer straightforward regex over complex parsing when possible
- •Document well: Future you (or others) will thank you
Common Patterns
JSON with Brackets
[TOOL_CALLS] [{"name": "func", "arguments": {}}]
→ Use base_json_parser with bracket markers
JSON with XML Tags
<tool_call>
{"name": "func", "arguments": {}}
</tool_call>
→ Use base_json_parser with XML-style markers
XML Structure
<tool_call> <function=name> <parameter=key>value</parameter> </function> </tool_call>
→ Use xml/parser.rs or create variant
Nested Tokens
<|tool▁call▁begin|>name<|tool▁sep|>args<|tool▁call▁end|>
→ Create specialized parser (see DeepSeek parsers)
Minimal Changes Philosophy
- •First: Try existing parser with new config
- •Second: Adapt existing parser with minor tweaks
- •Last resort: Create entirely new parser
Most models (>80%) can use existing parsers with appropriate configuration.