Opik Optimizer
Purpose
Design, run, and interpret Opik Optimizer workflows for prompts, tools, and model parameters with consistent dataset/metric wiring and reproducible evaluation.
When to use
Use this skill when a user asks for:
- •Choosing and configuring Opik Optimizer algorithms for prompt/agent optimization.
- •Writing ChatPrompt-based optimization runs and custom metric functions.
- •Optimizing with tools (function calling or MCP), selected prompt roles, or prompt segments.
- •Tuning LLM call parameters with optimize_parameter.
- •Comparing optimizer outputs and interpreting OptimizationResult.
Workflow
- •Select optimizer strategy (MetaPromptOptimizer, FewShotBayesianOptimizer, HRPO, etc.) based on the target optimization goal.
- •Build prompt/dataset/metric wiring and validate placeholder-field alignment.
- •Run prompt, tool, or parameter optimization with explicit controls (n_threads, n_samples, max_trials, seed).
- •Inspect OptimizationResult and compare score deltas against initial baselines.
- •Summarize recommendations, risks, and next experiments.
Inputs
- •Target optimization objective (prompt/tool/parameter) and success metric.
- •Dataset source and expected schema fields.
- •Model/provider constraints and runtime limits.
- •Optional scope constraints (optimize_prompts segments, tool fields, project names).
Outputs
- •Optimizer run configuration and rationale.
- •Result interpretation (score, initial_score, history trends).
- •Recommended next changes and follow-up experiment plan.
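The result interpretation listed above can be sketched without the library. The helper below is a hypothetical illustration: summarize_run and its arguments are not part of Opik Optimizer, and it assumes you have already read score, initial_score, and a flat list of per-trial scores out of an OptimizationResult (the exact shape of history is not specified in this skill).

```python
from typing import Optional, Sequence


def summarize_run(
    initial_score: float,
    score: float,
    trial_scores: Sequence[float],
) -> dict:
    """Hypothetical helper: compare a final score with its baseline."""
    delta = score - initial_score
    # Relative gain is undefined for a zero baseline, so report None there.
    relative: Optional[float] = delta / initial_score if initial_score else None
    # Index of the best trial (first occurrence), or None when no trials ran.
    best_trial: Optional[int] = None
    if trial_scores:
        best_trial = max(range(len(trial_scores)), key=lambda i: trial_scores[i])
    return {
        "delta": delta,
        "relative_gain": relative,
        "improved": delta > 0,
        "best_trial": best_trial,
    }


summary = summarize_run(initial_score=0.5, score=0.75, trial_scores=[0.5, 0.75, 0.625])
print(summary)
```

Reporting the delta alongside the baseline keeps a high absolute score from hiding a run that did not actually improve on the initial prompt.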
Use the reference files in this skill for details before implementing code:
- •references/algorithms.md
- •references/prompt_agent_workflow.md
- •references/example_patterns.md
Opik Optimizer quickstart
- •Install and import:
bash
pip install opik-optimizer
python
from opik_optimizer import ChatPrompt, MetaPromptOptimizer, HRPO, FewShotBayesianOptimizer
from opik_optimizer import datasets
- •Build a prompt and metric:
python
from opik.evaluation.metrics import LevenshteinRatio
prompt = ChatPrompt(
    system="You are a concise answerer.",
    user="{question}",
)

def metric(dataset_item: dict, output: str) -> float:
    return LevenshteinRatio().score(
        reference=dataset_item["answer"],
        output=output,
    ).value
- •Load dataset and run:
python
dataset = datasets.hotpot(count=30)
result = MetaPromptOptimizer(model="openai/gpt-5-nano").optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=metric,
    n_samples=20,
    max_trials=10,
)
result.display()
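When checking the wiring before a paid run, it can help to exercise a metric with the same (dataset_item, output) -> float shape but no external dependencies. The sketch below is a stand-in, not the LevenshteinRatio metric used in the quickstart: it uses difflib from the standard library, the name similarity_metric is hypothetical, and it assumes the dataset exposes an answer field as in the quickstart.

```python
from difflib import SequenceMatcher


def similarity_metric(dataset_item: dict, llm_output: str) -> float:
    """Stand-in metric: character-level similarity in [0.0, 1.0]."""
    # Normalize case and surrounding whitespace so trivial differences
    # do not lower the score.
    reference = str(dataset_item.get("answer", "")).strip().lower()
    candidate = (llm_output or "").strip().lower()
    return SequenceMatcher(None, reference, candidate).ratio()


exact = similarity_metric({"answer": "Paris"}, "paris")
empty = similarity_metric({"answer": "Paris"}, "")
print(exact, empty)
```

Calling the metric by hand on one or two dataset items is a cheap way to confirm it returns a float in the expected range before handing it to an optimizer.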
Core workflow you should follow
- •Pick optimizer class:
  - •Few-shot examples + Bayesian selection: FewShotBayesianOptimizer
  - •LLM meta-reasoning: MetaPromptOptimizer
  - •Genetic + MOO / LLM crossover: EvolutionaryOptimizer
  - •Hierarchical reflective diagnostics: HierarchicalReflectiveOptimizer (HRPO)
  - •Pareto-based genetic strategy: GepaOptimizer
  - •Parameter tuning only: ParameterOptimizer
- •Define a single ChatPrompt (or dict of prompts for multi-prompt cases).
- •Provide a dataset from opik_optimizer.datasets.
- •Provide a metric callable with signature (dataset_item, llm_output) -> float (or ScoreResult / list of ScoreResult).
- •Set optimizer controls (n_threads, n_samples, max_trials, seed, etc.).
- •Run one of:
  - •optimize_prompt(...) for prompt/system behavior changes.
  - •optimize_parameter(...) for model-call hyperparameters.
- •Inspect OptimizationResult (score, initial_score, history, optimization_id, get_optimized_parameters).
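The optimizer choice in the first step can be captured as a small lookup so the rationale is recorded next to the run configuration. This is only a sketch: the goal labels (the dictionary keys) and both helper functions are hypothetical, while the class names are the ones listed in this skill. It deliberately returns names as strings and does not import the library.

```python
# Keys are hypothetical goal labels; values are the class names listed above.
OPTIMIZER_BY_GOAL = {
    "few_shot": "FewShotBayesianOptimizer",
    "meta_reasoning": "MetaPromptOptimizer",
    "evolutionary": "EvolutionaryOptimizer",
    "reflective": "HierarchicalReflectiveOptimizer",
    "pareto": "GepaOptimizer",
    "parameters": "ParameterOptimizer",
}


def pick_optimizer(goal: str) -> str:
    """Return the optimizer class name for a goal label."""
    try:
        return OPTIMIZER_BY_GOAL[goal]
    except KeyError:
        options = ", ".join(sorted(OPTIMIZER_BY_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {options}") from None


def entry_point(optimizer_name: str) -> str:
    """Parameter tuning goes through optimize_parameter, the rest through optimize_prompt."""
    if optimizer_name == "ParameterOptimizer":
        return "optimize_parameter"
    return "optimize_prompt"


choice = pick_optimizer("parameters")
print(choice, entry_point(choice))
```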
Key execution details to enforce
- •Prefer explicit project_name for Opik tracking if you are using org-level observability.
- •Keep placeholders in prompts aligned with dataset fields (for example {question}).
- •Start with optimize_prompts="system" or "user" when scope should be constrained.
- •Keep model names in MetaPrompt/reasoning calls provider-compatible for your account.
- •Validate multimodal input payloads by preserving non-empty content segments only.
- •For small datasets, use n_samples and n_samples_strategy carefully; over-allocation automatically falls back to the full set.
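Placeholder alignment can be checked before any optimizer call. The sketch below uses only the standard library; placeholder_fields and missing_fields are hypothetical helper names, and it assumes prompts use simple {name} placeholders as in the {question} example. Templates containing literal braces (for example embedded JSON) would need escaping first, since string.Formatter raises ValueError on unbalanced braces.

```python
from string import Formatter
from typing import Iterable, Set


def placeholder_fields(template: str) -> Set[str]:
    """Return the named {placeholders} found in a template string."""
    # Formatter.parse yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for pure literal text.
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_fields(templates: Iterable[str], dataset_item: dict) -> Set[str]:
    """Return placeholders that have no matching key in the dataset item."""
    needed: Set[str] = set()
    for template in templates:
        needed |= placeholder_fields(template)
    return needed - set(dataset_item)


templates = ["You are a concise answerer.", "{question}"]
item = {"question": "Who wrote Dune?", "answer": "Frank Herbert"}
aligned = missing_fields(templates, item)
misaligned = missing_fields(["{query}"], item)
print(aligned, misaligned)
```

Running this against the first dataset item catches a renamed or misspelled field before any trials are spent.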
Tooling and segment-based control
- •Tools can be optimized with MCP/function schema fields, not only by changing prompt wording.
- •For fine-grained text updates, use optimize_prompts values and helper functions from prompt_segments:
  - •extract_prompt_segments(ChatPrompt) to inspect stable segment IDs.
  - •apply_segment_updates(ChatPrompt, updates) for deterministic edits.
- •Tool optimization is distinct from prompt optimization.
Runnable examples live upstream in the Opik repo.
If you need local runnable scripts, vendor the upstream examples into a scripts/ folder and keep references one level deep.
Common mistakes to avoid
- •Passing empty dataset or mismatched placeholder names.
- •Mixing deprecated constructor arg num_threads with n_threads.
- •Assuming tool optimization is the same as agent function-calling optimization.
- •Running ParameterOptimizer.optimize_prompt (it raises and should not be used).
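One way to avoid the num_threads / n_threads mix-up is to normalize keyword arguments before constructing an optimizer. The guard below is a hypothetical helper, not a library function; it only assumes what this skill states, namely that num_threads is deprecated in favor of n_threads.

```python
def normalize_thread_kwargs(kwargs: dict) -> dict:
    """Hypothetical guard: map deprecated num_threads onto n_threads."""
    cleaned = dict(kwargs)  # copy so the caller's dict is left untouched
    if "num_threads" in cleaned:
        if "n_threads" in cleaned:
            # Both spellings present: refuse rather than guess which one wins.
            raise ValueError("Pass n_threads only; num_threads is deprecated.")
        cleaned["n_threads"] = cleaned.pop("num_threads")
    return cleaned


normalized = normalize_thread_kwargs({"num_threads": 4, "seed": 42})
print(normalized)
```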
Next actions
- •For in-depth behavior and per-class parameter tables: references/algorithms.md
- •For exact optimize_prompt signatures, prompts, tool constraints, and result usage: references/prompt_agent_workflow.md
- •For pattern examples and source-backed workflows: references/example_patterns.md