BAML Code Generation

Name: baml-codegen
Rating: 92
Author: openclaw

Generate type-safe LLM extraction code. Use when creating structured outputs, classification, RAG, or agent workflows.

Golden Rules

•NEVER edit baml_client/ - 100% generated, overwritten on every baml-cli generate; check baml_src/generators.baml for output_type (python, typescript, ruby, go)
•ALWAYS edit baml_src/ - Source of truth for all BAML code
•Run baml-cli generate after changes - Regenerates typed client code for target language

•Schema Is The Prompt - Define data models first, compiler injects types
•Types Over Strings - Use enums/classes/unions, not string parsing
•Fuzzy Parsing Is BAML's Job - BAML extracts valid JSON from messy LLM output
•Transpiler Not Library - Write .baml → generate native code (Python/TypeScript/Ruby/Go), no runtime dependency
•Test-Driven Prompting - Use VS Code playground or baml-cli test to iterate

code

Analyze → Pattern Match (MCP) → Validate → Generate → Test → Deliver
         ↓ [IF ERRORS] Error Recovery (MCP) → Retry

Element	Example
Class	`class Invoice { total float @description("Amount") @assert(this > 0) @alias("amt") }`
Enum	`enum Category { Tech @alias("technology") @description("Tech sector"), Finance, Other }`
Function	`function Extract(text: string, img: image?) -> Invoice { client GPT5 prompt #"{{ text }} {{ img }} {{ ctx.output_format }}"# }`
Client	`client<llm> GPT5 { provider openai options { model gpt-5 } retry_policy Exponential }`
Fallback	`client<llm> Resilient { provider fallback options { strategy [FastModel, SlowModel] } }`

•Primitives: string, int, float, bool | Multimodal: image, audio
•Containers: Type[] (array), Type? (optional), map<string, Type> (key-value)
•Composite: Type1 | Type2 (union), nested classes
•Annotations: @description("..."), @assert(condition), @alias("json_name"), @check(name, condition)

openai, anthropic, gemini, vertex, bedrock, ollama + any OpenAI-compatible via openai-generic

Pattern	Use Case	Model	Framework Markers
Extraction	Unstructured → structured	GPT-5	fastapi, next.js
Classification	Categorization	GPT-5-mini	any
RAG	Answers with citations	GPT-5	langgraph
Agents	Multi-step reasoning	GPT-5	langgraph
Vision	Image/audio data extraction	GPT-5-Vision	multimodal

•retry_policy: retry_policy Exp { max_retries 3 strategy { type exponential_backoff } }
•fallback client: Chain models [FastCheap, SlowReliable] for cost/reliability tradeoff

•Found patterns from baml-examples | Validated against BoundaryML/baml | Fixed errors using docs | MCP unavailable, using fallback

•BAML Code - Complete .baml files (types, functions, clients, retry_policy)
•Tests - pytest/Jest with 100% function coverage
•Integration - Framework-specific client code (LangGraph nodes, FastAPI endpoints, Next.js API routes)
•Metadata - Pattern used, token count, cost estimate

•providers.md - OpenAI, Anthropic, Google, Ollama, Azure, Bedrock, openai-generic
•types-and-schemas.md - Full type system, classes, enums, unions, map, image, audio
•validation.md - @assert, @check, @alias, block-level @@assert
•patterns.md - Pattern library with code examples
•philosophy.md - BAML principles, golden rules
•mcp-interface.md - Query workflow, caching
•languages-python.md - Python/Pydantic, async
•languages-typescript.md - TypeScript, React/Next.js
•frameworks-langgraph.md - LangGraph integration