
fabric-rti-perf-remediate

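Diagnose and resolve performance issues in Microsoft Fabric Real-Time Intelligence, including Eventhouse, KQL databases, Eventstream, and ingestion pipelines. Use when asked to troubleshoot slow KQL queries, high Eventhouse CPU or memory usage, ingestion latency or failures, Eventstream throughput problems, capacity throttling (HTTP 430), cache policy tuning, materialized view lag, or Always-On configuration. Covers workspace monitoring, the Fabric Capacity Metrics app, query optimization, and streaming diagnostics.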

SKILL.md
--- frontmatter
name: fabric-rti-perf-remediate
description: Diagnose and resolve performance issues in Microsoft Fabric Real-Time Intelligence including Eventhouse, KQL databases, Eventstream, and ingestion pipelines. Use when asked to troubleshoot slow KQL queries, high Eventhouse CPU or memory, ingestion latency or failures, Eventstream throughput problems, capacity throttling (HTTP 430), cache policy tuning, materialized view lag, or Always-On configuration. Covers workspace monitoring, Fabric Capacity Metrics app, query optimization, and streaming diagnostics.
license: Complete terms in LICENSE.txt

Fabric Real-Time Intelligence Performance Remediation

Systematic toolkit for diagnosing and resolving performance issues across the Microsoft Fabric Real-Time Intelligence stack: Eventhouse, KQL databases, Eventstream, ingestion pipelines, and capacity management.

When to Use This Skill

  • Eventhouse queries running slowly or timing out
  • Ingestion latency or failures into KQL databases
  • Eventstream throughput bottlenecks or backlog growth
  • Capacity throttling errors (HTTP 430, TooManyRequestsForCapacity)
  • High CPU, memory, or cache utilization on Eventhouse
  • Materialized view lag or freshness issues
  • Always-On and minimum consumption sizing decisions
  • Workspace monitoring setup and dashboard interpretation
  • KQL query optimization for Real-Time Intelligence workloads

Prerequisites

  • Microsoft Fabric workspace with Contributor or higher permissions
  • Workspace monitoring enabled (for query/ingestion logs)
  • Fabric Capacity Metrics app installed (for capacity-level analysis)
  • KQL Queryset or Eventhouse query editor access

Step-by-Step Workflows

Workflow 1: Diagnose Slow KQL Queries

  1. Enable workspace monitoring if it is not already active. See workspace-monitoring.md.
  2. Identify expensive queries by running the diagnostic script diagnose-slow-queries.kql against the monitoring Eventhouse
  3. Analyze query patterns — filter by Top CPU Time, Top Duration, or Memory Peak
  4. Apply KQL optimization rules from kql-optimization.md
  5. Validate improvement by re-running the query and comparing duration/CPU metrics
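
As a rough illustration of steps 2 and 3, the query below ranks recent queries by total CPU time. It is a minimal sketch, not the contents of diagnose-slow-queries.kql. The table name EventhouseQueryLogs and the columns Timestamp, ItemName, EventText, DurationMs, CpuTimeMs, and MemoryPeakBytes are assumptions about the workspace monitoring schema; confirm them against your monitoring Eventhouse before relying on the output.

```kql
// Assumed schema: EventhouseQueryLogs with Timestamp, ItemName, EventText,
// DurationMs, CpuTimeMs, MemoryPeakBytes. Verify the names in your monitoring database.
EventhouseQueryLogs
| where Timestamp > ago(1d)                                    // filter on time first to limit the scan
| summarize Executions = count(),
            AvgDurationMs = avg(DurationMs),
            TotalCpuMs = sum(CpuTimeMs),
            MaxMemoryPeakBytes = max(MemoryPeakBytes)
    by ItemName, QueryText = substring(EventText, 0, 200)      // group by the first 200 characters of the query text
| top 20 by TotalCpuMs desc                                    // swap in AvgDurationMs or MaxMemoryPeakBytes to change the ranking
```

Running the same query again after an optimization, with the time window narrowed to the period after the change, gives a like-for-like comparison for step 5.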

Workflow 2: Troubleshoot Ingestion Issues

  1. Check ingestion results logs using diagnose-ingestion.kql
  2. Review Eventstream data insights — check IncomingMessages, OutgoingMessages, BackloggedInputEvents, and WatermarkDelay metrics
  3. Identify failure patterns — deserialization errors, schema mismatches, throttling
  4. Apply throughput tuning per ingestion-remediate.md
  5. Validate pipeline health by monitoring runtime logs on source and destination nodes
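
To make step 3 concrete, the sketch below groups recent ingestion failures by table and error code so that recurring patterns stand out. It uses the `.show ingestion failures` management command rather than the workspace monitoring logs, and assumes the command is available in your Eventhouse as it is in Azure Data Explorer. It is an illustration only and does not reproduce diagnose-ingestion.kql.

```kql
// Run against the destination KQL database, not the monitoring Eventhouse.
.show ingestion failures
| where FailedOn > ago(1d)                       // limit to the last day of failures
| summarize Failures = count(),
            LastSeen = max(FailedOn),
            SampleDetails = take_any(Details)    // keep one example message per group
    by Table, ErrorCode, ShouldRetry
| order by Failures desc
```

Groups where ShouldRetry is false usually point at permanent problems such as schema or format mismatches, while retryable groups are more likely transient or throttling related.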

Workflow 3: Resolve Capacity Throttling

  1. Open the Fabric Capacity Metrics app — filter to your capacity and workspace
  2. Check Eventhouse UpTime CU consumption — identify if a single Eventhouse dominates
  3. Run capacity diagnostics using diagnose-capacity.kql
  4. Evaluate sizing options: Always-On minimum consumption, cache policy adjustments, or SKU upgrade
  5. Apply recommendations from capacity-and-sizing.md
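
For the cache policy option in step 4, the commands below show how a table's hot-cache window can be inspected and changed. This is a sketch under stated assumptions: MyTable and the 30-day window are hypothetical placeholders, and the right value depends on the time range your queries actually cover.

```kql
// Run each command separately; management commands cannot be batched in one request.
// MyTable and 30d are hypothetical placeholders.

// 1. Inspect the current hot-cache window for the table.
.show table MyTable policy caching

// 2. Keep the most recent 30 days of data in hot cache.
.alter table MyTable policy caching hot = 30d
```

A wider hot window reduces cold storage scans for queries inside that range but is likely to increase cache storage consumption, so weigh it against the sizing guidance in capacity-and-sizing.md.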

Remediation Quick Reference

| Symptom | First Check | Script or Reference |
|---|---|---|
| Slow queries | Workspace Monitoring → EH Queries tab | diagnose-slow-queries.kql |
| Query throttling (HTTP 430) | Capacity Metrics app → CU utilization | diagnose-capacity.kql |
| Ingestion failures | Eventstream → Runtime logs tab | diagnose-ingestion.kql |
| High ingestion latency | Eventstream → Data insights → WatermarkDelay | diagnose-ingestion.kql |
| Materialized view stale | .show materialized-views command | diagnose-slow-queries.kql |
| Cold storage scans | Cache policy vs query time range | diagnose-capacity.kql |
| Eventhouse wake-up latency | Always-On setting disabled | capacity-and-sizing.md |
| Eventstream backlog growing | Throughput setting mismatch | ingestion-remediate.md |
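
For the stale materialized view row, a first check might look like the sketch below. It lists every materialized view in the database and computes how far each one trails the current time. The Lag column is a name introduced here for illustration; the other columns come from the output of `.show materialized-views`.

```kql
// Run against the KQL database that owns the materialized views.
.show materialized-views
| extend Lag = now() - MaterializedTo      // how far materialization trails the current time
| project Name, SourceTable, IsEnabled, IsHealthy, MaterializedTo, Lag, LastRun, LastRunResult
| order by Lag desc
```

A view with IsHealthy set to false or a steadily growing Lag is the one to investigate first.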

References