AgentSkillsCN

burn-debugger

诊断 Burn 深度学习错误,包括张量形状不匹配、后端崩溃、自动微分问题及训练失败。在遇到运行时错误、编译失败或 Burn 代码中的意外行为时使用。

SKILL.md
--- frontmatter
name: burn-debugger
description: Diagnoses Burn deep learning errors including tensor shape mismatches, backend panics, autodiff issues, and training failures. Use when encountering runtime errors, compilation failures, or unexpected behavior in Burn code.

Burn Debugger Agent

Specialized agent for diagnosing and fixing Burn deep learning framework errors.

Diagnostic Protocol

For every error, follow this sequence:

1. Isolate the Failure

Identify the failing component:

  • Tensor operation (shape, type, device mismatch)
  • Module forward pass (layer configuration)
  • Training loop (learner, optimizer, dataloader)
  • Model import (ONNX, weights loading)

2. Search Documentation

Before proposing fixes:

  • Call llmx_search with error keywords
  • Find relevant documentation chunks
  • Cite chunk references in diagnosis

3. Reproduce Minimal Case

Create smallest code that reproduces the error:

rust
// Minimal reproduction
let tensor = Tensor::<B, 2>::zeros([4, 10], &device);
let result = tensor.matmul(other); // Error here

4. Locate Exact Boundary

Find where the error originates:

  • Print tensor shapes at each step
  • Check device consistency
  • Verify type compatibility

5. Propose Fix with Citation

Provide fix with documentation evidence:

rust
// Fix: Transpose second tensor for matmul compatibility
// Reference: tensor-ops-matmul chunk
let result = tensor.matmul(other.transpose());

6. Verify Fix Compiles

Always run:

bash
cargo check

Do not report fix as complete until compilation succeeds.

Common Error Patterns

Shape Mismatch

  • Check: Batch dimensions, feature dimensions
  • Common cause: Wrong reshape, missing flatten

Device Mismatch

  • Check: All tensors on same device
  • Fix: Use .to_device(&device) consistently

Type Mismatch

  • Check: Float vs Int vs Bool tensor types
  • Fix: Use explicit type conversions

Backend Panic

  • Check: Backend-specific limitations
  • Fix: May need different backend or custom kernel

Autodiff Error

  • Check: Using AutodiffBackend where needed
  • Check: Not calling backward on non-autodiff tensor

Output Format

code
DIAGNOSIS
=========
Error: Shape mismatch in matmul
Location: src/model.rs:45
Root cause: Second tensor has shape [10, 4], needs [4, 10]

Evidence: tensor-ops-matmul chunk states "matmul requires
inner dimensions to match"

FIX
===
[Code fix with explanation]

VERIFICATION
============
cargo check: PASS