wings-engine-patch

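A comprehensive framework for creating and managing runtime patches for inference engines (vllm, vllm_ascend). Use this skill when Claude needs to implement dynamic monkey patches for AI inference engines, build feature-based patch management systems, work with wrapt-based import hooks, implement version-controlled runtime modifications, design non-intrusive patch frameworks, or manage patch dependencies and propagation logic.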

SKILL.md
--- frontmatter
name: wings-engine-patch
description: Comprehensive framework for creating and managing runtime patches for inference engines (vllm, vllm_ascend). Use when Claude needs to implement dynamic monkey patches for AI inference engines, create feature-based patch management systems, work with wrapt-based import hooks, implement version-controlled runtime modifications, design non-intrusive patch frameworks, or manage patch dependencies and propagation logic.

Wings Engine Patch Skill

Overview

This skill provides a comprehensive framework for creating runtime patches for inference engines (vllm, vllm_ascend) using Python's import hooks and wrapt. The framework enables non-intrusive, version-controlled, feature-based patching at runtime without modifying the original package installation.

Core Concepts

1. Non-Intrusive Runtime Patching

  • Patches are applied at runtime through Python import hooks (wrapt.register_post_import_hook)
  • Original package installation remains pristine
  • No source code modification required
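
A minimal sketch of the import-hook idea, using the standard-library module colorsys as a stand-in target so it runs without an inference engine installed. The patch function, the `install` helper, and the `applied` list are hypothetical names chosen for illustration; only `wrapt.register_post_import_hook(hook, name)` is the real wrapt call. If wrapt is not installed, the sketch falls back to importing the module and applying the hook directly.

```python
import importlib

try:
    import wrapt  # third-party; provides register_post_import_hook
except ImportError:  # keep the sketch runnable without wrapt
    wrapt = None

applied = []  # records which modules were patched (illustration only)


def patch_colorsys(module):
    """Hypothetical patch: clamp the inputs of rgb_to_hsv to [0, 1]."""
    original = module.rgb_to_hsv

    def clamp(x):
        return min(1.0, max(0.0, x))

    def rgb_to_hsv_clamped(r, g, b):
        return original(clamp(r), clamp(g), clamp(b))

    # Only the in-memory module object changes; files on disk are untouched.
    module.rgb_to_hsv = rgb_to_hsv_clamped
    applied.append(module.__name__)


def install(hook, module_name):
    """Run hook(module) once module_name has been imported."""
    if wrapt is not None:
        # Fires right away if the module is already imported,
        # otherwise when the import eventually happens.
        wrapt.register_post_import_hook(hook, module_name)
    else:
        hook(importlib.import_module(module_name))


install(patch_colorsys, "colorsys")

import colorsys  # the hook has run by the time this import returns

print(colorsys.rgb_to_hsv(2.0, 0.0, 0.0))  # out-of-range red is clamped first
```

For a real engine the same pattern would target a module such as one inside vllm_ascend, with the hook replacing or wrapping the attribute that needs to change.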

2. Feature-Based Management

  • Users enable features (e.g., soft_fp8, soft_fp4), not individual patches
  • Features group related patches together
  • Configuration via environment variable: WINGS_ENGINE_PATCH_OPTIONS
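
The sketch below shows one way feature selection could be read from WINGS_ENGINE_PATCH_OPTIONS. The value format is not specified here, so the JSON layout (engine name mapped to a `version` and a list of `features`) is purely an assumption, as are the function names.

```python
import json
import os


def parse_patch_options(raw):
    """Parse an ASSUMED JSON layout:
    {"<engine>": {"version": "<version>", "features": ["<feature>", ...]}}
    """
    if not raw:
        return {}
    parsed = {}
    for engine, spec in json.loads(raw).items():
        parsed[engine] = {
            "version": spec.get("version"),
            "features": list(spec.get("features", [])),
        }
    return parsed


def load_patch_options(environ=os.environ):
    """Read the options from the environment (empty dict if unset)."""
    return parse_patch_options(environ.get("WINGS_ENGINE_PATCH_OPTIONS"))


# Example value a user might export (hypothetical format).
example_raw = json.dumps(
    {"vllm_ascend": {"version": "0.12.0rc1", "features": ["soft_fp8"]}}
)
enabled = parse_patch_options(example_raw)
print(enabled["vllm_ascend"]["features"])
```

The point of the design is that users name features only; which patch functions a feature maps to stays an internal detail of the framework.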

3. Version Control

  • Patches are strictly scoped to specific engine versions (e.g., vllm_ascend==0.12.0rc1)
  • Automatic version matching and validation
  • Fallback to default version if configured
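
Version matching with an optional fallback can be sketched as follows. The registry contents, the function names, and the exact-string matching rule are assumptions for illustration; `importlib.metadata.version` is the standard-library call for reading an installed distribution's version.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def select_patch_set(registry, version, default_version=None):
    """Pick the patch set for an exact version, else the configured default."""
    if version in registry:
        return version, registry[version]
    if default_version is not None and default_version in registry:
        return default_version, registry[default_version]
    raise LookupError(f"no patches registered for version {version!r}")


# Hypothetical registry: engine version -> features available for it.
PATCH_REGISTRY = {"0.12.0rc1": ["soft_fp8", "soft_fp4"]}

matched = select_patch_set(PATCH_REGISTRY, "0.12.0rc1")
fallback = select_patch_set(PATCH_REGISTRY, "0.13.0", default_version="0.12.0rc1")
print(matched[0], fallback[0])
```

Raising when neither the exact version nor a default matches keeps a patch written for one release from being applied silently to another.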

4. Intelligent Dependency Resolution

  • Shared Patches: Multiple features can reference the same patch function
  • Propagation: Enabling one feature automatically enables every other feature that shares a patch with it
  • Deduplication: Each patch function executes exactly once, regardless of how many features reference it
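
The three rules above can be sketched in a few lines. The feature table, the patch functions, and the choice to propagate until no further feature is added are assumptions made for illustration, not the framework's confirmed behavior.

```python
calls = []  # records patch execution order (illustration only)


def patch_quant_common():
    calls.append("quant_common")


def patch_fp8_kernels():
    calls.append("fp8_kernels")


def patch_fp4_kernels():
    calls.append("fp4_kernels")


# Hypothetical feature table: both features share patch_quant_common.
FEATURES = {
    "soft_fp8": [patch_quant_common, patch_fp8_kernels],
    "soft_fp4": [patch_quant_common, patch_fp4_kernels],
}


def resolve(features, requested):
    """Return (enabled feature names, unique patch functions in table order)."""
    enabled = set(requested)
    changed = True
    while changed:  # propagate until nothing new is enabled
        changed = False
        active = {p for name in enabled for p in features[name]}
        for name, patches in features.items():
            if name not in enabled and active.intersection(patches):
                enabled.add(name)
                changed = True
    ordered, seen = [], set()
    for name, patches in features.items():
        if name in enabled:
            for patch in patches:
                if patch not in seen:  # deduplicate shared patches
                    seen.add(patch)
                    ordered.append(patch)
    return enabled, ordered


enabled_features, patches_to_run = resolve(FEATURES, ["soft_fp8"])
for patch in patches_to_run:
    patch()
print(sorted(enabled_features), calls)
```

Here requesting only soft_fp8 also enables soft_fp4 because they share patch_quant_common, and that shared function still runs a single time.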

When to Use This Skill

✅ Use when you need to:

  • Patch vllm/vllm_ascend with runtime modifications
  • Implement feature-based patch management
  • Version-control patches for specific engine versions
  • Handle complex shared patch dependencies
  • Modify third-party packages without touching their installed files

❌ Don't use for:

  • Simple scripts without version management needs
  • Modifications possible via official APIs
  • Changes that are better contributed upstream as a proper fix
  • Non-Python codebases

References