AgentSkillsCN

fabric-lakehouse-views-perf-remediate

排查 Microsoft Fabric 物化湖视图(MLV)的性能问题,包括刷新缓慢、增量刷新失败、全量刷新回退、Spark 作业失败、血缘执行错误、数据质量约束违规,以及最佳刷新配置等问题。适用于在诊断 MLV 刷新时长、在 Monitor Hub 中监控 MLV 运行、分析 Spark 日志以定位 MLV 失败原因、启用变更数据馈送(CDF)、解决 Delta 表未找到错误,或为增量刷新的适用性优化 MLV 查询定义时使用。涵盖 Spark SQL 语法、TBLPROPERTIES、PARTITIONED BY、CONSTRAINT CHECK ON MISMATCH DROP/FAIL,以及自定义环境配置。

SKILL.md
--- frontmatter
name: fabric-lakehouse-views-perf-remediate
description: Troubleshoot Microsoft Fabric materialized lake views (MLV) performance issues including slow refresh, incremental refresh failures, full refresh fallback, Spark job failures, lineage execution errors, data quality constraint violations, and optimal refresh configuration. Use when diagnosing MLV refresh duration, monitoring MLV runs in Monitor Hub, analyzing Spark logs for MLV failures, enabling change data feed (CDF), resolving delta table not found errors, or optimizing MLV query definitions for incremental refresh eligibility. Covers Spark SQL syntax, TBLPROPERTIES, PARTITIONED BY, CONSTRAINT CHECK ON MISMATCH DROP/FAIL, and custom environment configuration.
license: Complete terms in LICENSE.txt

Fabric Materialized Lake Views Performance remediate

Diagnose and resolve performance issues with materialized lake views (MLVs) in Microsoft Fabric lakehouses. This skill covers refresh optimization, Spark job diagnostics, data quality constraint tuning, and lineage execution remediate.

When to Use This Skill

  • MLV refresh runs are taking longer than expected
  • Incremental refresh is falling back to full refresh unexpectedly
  • MLV lineage execution shows Failed or Skipped nodes
  • Spark jobs for MLV refresh are failing with errors
  • "Delta table not found" errors during MLV creation or refresh
  • Data quality constraints causing unexpected pipeline failures
  • Need to enable or verify optimal refresh configuration
  • Custom Spark environment tuning for MLV workloads
  • Monitoring and interpreting MLV run history

Prerequisites

  • Microsoft Fabric workspace with Lakehouse items
  • Schema-enabled lakehouse (recommended for MLV support)
  • Fabric notebook for executing Spark SQL commands
  • Workspace Admin or Contributor role for scheduling and monitoring
  • Access to Monitor Hub for viewing MLV run details

Quick Diagnostics Checklist

Run through these checks in order when remediate MLV performance:

StepCheckAction
1Identify refresh modeVerify optimal refresh toggle is enabled in lineage view
2Check CDF statusConfirm delta.enableChangeDataFeed=true on ALL source tables
3Review query patternsEnsure only supported SQL constructs are used (see supported expressions)
4Inspect run historyOpen lineage view dropdown to see last 25 runs and their states
5Check node failuresClick failed nodes in lineage to view error messages
6Review Spark logsFollow Detailed Logs link to Monitor Hub for Spark error logs
7Validate data qualityCheck if FAIL constraints are causing "delta table not found" errors
8Assess source dataDetermine if source has updates/deletes (forces full refresh)

Step-by-Step Workflows

Workflow 1: Diagnose Slow Refresh

  1. Determine current refresh strategy - Navigate to Manage materialized lake views in the lakehouse ribbon. Check if Optimal refresh toggle is ON.
  2. Verify change data feed - Run the diagnostic script to check CDF status on all source tables. See scripts/check-cdf-status.sql.
  3. Check for unsupported expressions - Review the MLV definition for constructs that force full refresh. See Supported Expressions.
  4. Inspect source data patterns - If source tables have UPDATE or DELETE operations, Fabric always performs full refresh regardless of CDF status.
  5. Review partition strategy - Consider adding PARTITIONED BY to large MLVs to improve refresh parallelism.
  6. Attach custom environment - Configure a custom Spark environment with optimized compute for heavy workloads. See references/custom-environment-guide.md.

Workflow 2: Resolve Failed Refresh Runs

  1. Open lineage view - Navigate to Manage materialized lake views, select the failed run from the dropdown (last 25 runs available).
  2. Identify failed node - Click the failed node in the lineage graph. Review the error message in the right-side panel.
  3. Access Spark logs - Click the Detailed Logs link to navigate to Monitor Hub. Review Apache Spark error logs.
  4. Common failure patterns - See references/common-failure-patterns.md for resolution steps.
  5. Retry or recreate - For transient Spark failures, retry the run. For persistent errors, drop and recreate the MLV.

Workflow 3: Enable Optimal Refresh

  1. Enable CDF on all source tables:
sql
ALTER TABLE bronze.customers SET TBLPROPERTIES (delta.enableChangeDataFeed = true);
ALTER TABLE bronze.orders SET TBLPROPERTIES (delta.enableChangeDataFeed = true);
  1. Verify CDF is enabled:
sql
DESCRIBE DETAIL bronze.customers;
-- Check properties column for delta.enableChangeDataFeed=true
  1. Create MLV with CDF property:
sql
CREATE OR REPLACE MATERIALIZED LAKE VIEW silver.customer_orders
TBLPROPERTIES (delta.enableChangeDataFeed=true)
AS
SELECT 
    c.customerID,
    c.customerName,
    c.region,
    o.orderDate,
    o.orderAmount
FROM bronze.customers c INNER JOIN bronze.orders o
ON c.customerID = o.customerID;
  1. Enable optimal refresh toggle - Navigate to Manage materialized lake views and verify the Optimal refresh toggle is ON (enabled by default).

Supported Expressions for Incremental Refresh

MLVs using only these constructs qualify for incremental refresh:

SQL ConstructNotes
SELECTOnly deterministic built-in functions. Non-deterministic and window functions force full refresh.
FROMStandard table references
WHEREOnly deterministic built-in functions
INNER JOINSupported for incremental
WITH (CTE)Common table expressions supported
UNION ALLSupported
CONSTRAINT ... CHECKOnly deterministic built-in functions in constraint conditions

Unsupported constructs that force full refresh:

  • LEFT JOIN, RIGHT JOIN, FULL OUTER JOIN
  • Window functions (ROW_NUMBER, RANK, LAG, LEAD, etc.)
  • Non-deterministic functions (current_timestamp(), rand(), etc.)
  • Subqueries in SELECT or WHERE
  • GROUP BY with HAVING
  • DISTINCT
  • User-defined functions (UDFs)

Key Spark SQL Reference

List All MLVs

sql
SHOW MATERIALIZED LAKE VIEWS IN silver;

View MLV Definition

sql
SHOW CREATE MATERIALIZED LAKE VIEW silver.customer_orders;

Force Full Refresh

sql
REFRESH MATERIALIZED LAKE VIEW silver.customer_orders FULL;

Drop and Recreate

sql
DROP MATERIALIZED LAKE VIEW silver.customer_orders;

CREATE OR REPLACE MATERIALIZED LAKE VIEW silver.customer_orders
TBLPROPERTIES (delta.enableChangeDataFeed=true)
AS
SELECT ...;

Known Issues and Limitations

IssueImpactWorkaround
FAIL constraint + "delta table not found"MLV creation or refresh failsRecreate MLV using DROP action instead of FAIL
Schema names with ALL CAPSMLV creation failsUse lowercase or mixed-case schema names
Session-level Spark propertiesNot applied during scheduled refreshSet properties in custom environment instead
Delta time travel in MLV definitionNot supportedRemove VERSION AS OF or TIMESTAMP AS OF from queries
DML statements on MLVsNot supportedMLVs are read-only; modify source tables instead
UDFs in SELECTNot supportedUse built-in Spark SQL functions
Temporary views as MLV sourceNot supportedReference base tables or other MLVs directly
Cross-lakehouse lineageNot supportedKeep all related MLVs within same lakehouse
Updating data quality constraintsNot supportedDrop and recreate the MLV with new constraints
LIKE/regex in constraint conditionsNot supportedUse simple comparison operators in constraints
Service principal authenticationNot supported for MLV APIsUse Microsoft Entra user authentication
Single schedule per lineageUI instability if exceededMaintain only one active schedule per lineage
South Central US regionFeature not availableUse a workspace in a different region

Run States Reference

StateMeaningAction
CompletedAll nodes executed successfullyNo action needed
FailedOne or more nodes failed; child nodes skippedReview failed node error, check Spark logs
SkippedPrevious run still in progressWait for current run to complete, or cancel it
In ProgressRun started, not yet terminalMonitor progress in lineage view
CanceledManually canceled from Monitor HubRe-trigger if needed

remediate

SymptomLikely CauseResolution
Refresh always full, never incrementalCDF not enabled on source tablesRun ALTER TABLE ... SET TBLPROPERTIES (delta.enableChangeDataFeed = true) on ALL sources
Refresh always full despite CDF enabledUnsupported SQL constructs in MLVReview definition for window functions, LEFT JOIN, non-deterministic functions
Refresh always full despite CDF and supported SQLSource has UPDATE/DELETE operationsIncremental only supports append-only; redesign ETL to be append-only or accept full refresh
"Delta table not found" on createFAIL constraint issueRecreate MLV without FAIL; use DROP action instead
Node shows Skipped stateParent node failedFix the parent node failure first
Schedule not runningPrevious run still in progressCancel the stuck run from Monitor Hub, or wait
Environment not accessibleUser lacks access to selected environmentSelect an accessible environment from dropdown
Deleted environment errorAssociated environment was removedChoose a new environment in lineage settings
MLV names changed to lowercaseExpected behaviorMLV names are case-insensitive and stored as lowercase
Workspace names with spaces cause errorsBacktick syntax requiredUse backtick notation:`My Workspace`.lakehouse.schema.view_name

References