SKILL.md
--- frontmatter
name: direct-lake-operations
description: Guide for working with Direct Lake semantic models. Use this when implementing Direct Lake-related features or troubleshooting.

Direct Lake Operations

This skill covers working with Direct Lake semantic models in Semantic Link Labs.

When to Use This Skill

Use this skill when you need to:

  • Migrate models to Direct Lake
  • Check and fix Direct Lake fallback issues
  • Sync schema between lakehouse and model
  • Work with Direct Lake guardrails
  • Warm Direct Lake cache

Overview

Direct Lake is a storage mode for Power BI semantic models that reads data directly from Delta tables in OneLake, providing fast query performance without data import.

Key Files

| File | Purpose |
| --- | --- |
| src/sempy_labs/directlake/ | Direct Lake submodule |
| _dl_helper.py | Core Direct Lake utilities |
| _guardrails.py | Guardrail checking |
| _directlake_schema_sync.py | Schema synchronization |
| _warm_cache.py | Cache warming utilities |

Direct Lake Functions

Core Functions

| Function | Purpose |
| --- | --- |
| check_fallback_reason | Check why model falls back to DirectQuery |
| get_direct_lake_lakehouse | Get lakehouse connected to model |
| get_direct_lake_source | Get SQL endpoint and lakehouse info |
| generate_direct_lake_semantic_model | Create new Direct Lake model |

Schema Management

| Function | Purpose |
| --- | --- |
| direct_lake_schema_compare | Compare model schema with lakehouse |
| direct_lake_schema_sync | Sync model schema from lakehouse |

Guardrails

| Function | Purpose |
| --- | --- |
| get_direct_lake_guardrails | Get guardrail limits for model |
| get_sku_size | Get SKU capacity info |
| get_directlake_guardrails_for_sku | Get guardrails for specific SKU |

Cache Management

| Function | Purpose |
| --- | --- |
| warm_direct_lake_cache_isresident | Warm cache using IsResident |
| warm_direct_lake_cache_perspective | Warm cache using perspective |

Connection Updates

| Function | Purpose |
| --- | --- |
| update_direct_lake_model_connection | Update lakehouse connection |
| update_direct_lake_model_lakehouse_connection | Legacy connection update |

Checking Fallback Reasons

Direct Lake models can fall back to DirectQuery mode. Use check_fallback_reason to diagnose:

python
from sempy_labs.directlake import check_fallback_reason

# Check specific model
result = check_fallback_reason(
    dataset="My Direct Lake Model",
    workspace="My Workspace"
)

# Returns DataFrame with fallback reasons per table
print(result)

Common Fallback Reasons

| Reason | Cause | Solution |
| --- | --- | --- |
| ColumnNotInPartition | Column missing from Delta table | Add column to lakehouse table |
| ParquetTypeMismatch | Data type mismatch | Update lakehouse table schema |
| UnsupportedFilter | DAX filter not supported | Modify DAX query |
| Guardrail | Exceeded capacity limits | Upgrade capacity or reduce data |
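
The table above can be turned into a small lookup so that a script prints a suggested fix next to each fallback reason. This is a minimal sketch: the reason strings are copied from the table, `FALLBACK_FIXES` and `suggest_fix` are hypothetical helper names, and the exact strings that `check_fallback_reason` returns may differ between library versions.

```python
# Hypothetical helper, not part of sempy_labs.
# Reason strings and fixes are copied from the table above.
FALLBACK_FIXES = {
    "ColumnNotInPartition": "Add column to lakehouse table",
    "ParquetTypeMismatch": "Update lakehouse table schema",
    "UnsupportedFilter": "Modify DAX query",
    "Guardrail": "Upgrade capacity or reduce data",
}


def suggest_fix(reason: str) -> str:
    """Return the suggested fix for a fallback reason, or a generic hint."""
    return FALLBACK_FIXES.get(
        reason.strip(), "Unknown reason: inspect the model manually"
    )


# Demo with one known and one unknown reason
for reason in ("ParquetTypeMismatch", "SomethingElse"):
    print(f"{reason}: {suggest_fix(reason)}")
```

Unknown reasons fall through to a generic hint instead of raising, so the lookup stays usable when new reasons appear.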

Schema Synchronization

Compare Schemas

python
from sempy_labs.directlake import direct_lake_schema_compare

# Compare model schema with lakehouse tables.
# The differences are printed by the function itself,
# so there is no return value to capture.
direct_lake_schema_compare(
    dataset="My Direct Lake Model",
    workspace="My Workspace"
)

Sync Schema

python
from sempy_labs.directlake import direct_lake_schema_sync

# Sync model schema from lakehouse
direct_lake_schema_sync(
    dataset="My Direct Lake Model",
    workspace="My Workspace",
    add_to_model=True,        # Add new lakehouse columns to the model
    remove_from_model=False,  # Keep model columns not in lakehouse
)
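
Conceptually, comparing and syncing come down to a per-table set difference between the columns the model knows about and the columns the lakehouse exposes. The sketch below illustrates that idea on plain dictionaries. It is not the library's implementation, and `diff_schema` plus the sample table and column names are hypothetical.

```python
from typing import Dict, Set


def diff_schema(
    model: Dict[str, Set[str]], lakehouse: Dict[str, Set[str]]
) -> Dict[str, Dict[str, Set[str]]]:
    """Per table, report columns missing on either side; matching tables are omitted."""
    result: Dict[str, Dict[str, Set[str]]] = {}
    for table in sorted(set(model) | set(lakehouse)):
        model_cols = model.get(table, set())
        lake_cols = lakehouse.get(table, set())
        missing_in_model = lake_cols - model_cols      # candidates to add
        missing_in_lakehouse = model_cols - lake_cols  # candidates to remove
        if missing_in_model or missing_in_lakehouse:
            result[table] = {
                "missing_in_model": missing_in_model,
                "missing_in_lakehouse": missing_in_lakehouse,
            }
    return result


# Hypothetical schemas for illustration
model_schema = {"Sales": {"Id", "Amount"}, "Customer": {"Id", "Name"}}
lake_schema = {"Sales": {"Id", "Amount", "Region"}, "Customer": {"Id", "Name"}}
print(diff_schema(model_schema, lake_schema))
```

Columns listed under `missing_in_model` are the ones a sync would add. Columns under `missing_in_lakehouse` are the ones that can trigger fallback until they are removed from the model or restored in the lakehouse.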

Guardrails

Direct Lake has limits based on capacity SKU:

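python
from sempy_labs.directlake import (
    get_direct_lake_guardrails,
    get_directlake_guardrails_for_sku,
    get_sku_size,
)

# Guardrail limits for every SKU (this function takes no arguments)
guardrails = get_direct_lake_guardrails()
print(guardrails)

# Limits that apply to the capacity behind a specific workspace
sku = get_sku_size(workspace="My Workspace")
print(get_directlake_guardrails_for_sku(sku_size=sku))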

Guardrail Limits

| Guardrail | Description |
| --- | --- |
| Max Rows Per Table | Maximum rows per table |
| Max Size Per Table | Maximum table size in GB |
| Max Columns Per Table | Maximum columns per table |
| Max Model Size | Maximum total model size |
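
To make the limits above concrete, the following sketch checks a set of table statistics against the four guardrails in the table. Everything here is an assumption for illustration: `GuardrailLimits`, `TableStats`, and `guardrail_violations` are hypothetical names, the numbers are placeholders rather than real SKU limits, and model size is approximated as the sum of table sizes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GuardrailLimits:
    max_rows_per_table: int
    max_size_per_table_gb: float
    max_columns_per_table: int
    max_model_size_gb: float


@dataclass(frozen=True)
class TableStats:
    name: str
    rows: int
    size_gb: float
    columns: int


def guardrail_violations(
    tables: List[TableStats], limits: GuardrailLimits
) -> List[str]:
    """Return one message per exceeded limit; an empty list means all checks pass."""
    problems: List[str] = []
    for t in tables:
        if t.rows > limits.max_rows_per_table:
            problems.append(f"{t.name}: rows {t.rows} > {limits.max_rows_per_table}")
        if t.size_gb > limits.max_size_per_table_gb:
            problems.append(f"{t.name}: size {t.size_gb} GB > {limits.max_size_per_table_gb} GB")
        if t.columns > limits.max_columns_per_table:
            problems.append(f"{t.name}: columns {t.columns} > {limits.max_columns_per_table}")
    # Assumption: total model size is the sum of the table sizes
    total_gb = sum(t.size_gb for t in tables)
    if total_gb > limits.max_model_size_gb:
        problems.append(f"model: size {total_gb} GB > {limits.max_model_size_gb} GB")
    return problems


# Placeholder numbers for illustration only, not real SKU limits
limits = GuardrailLimits(1_000_000, 10.0, 500, 15.0)
tables = [
    TableStats("Sales", 2_000_000, 8.0, 40),
    TableStats("Customer", 50_000, 8.0, 20),
]
for problem in guardrail_violations(tables, limits):
    print(problem)
```

With the placeholder values, `Sales` exceeds the row limit and the two tables together exceed the model size limit, so two messages are printed.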

Generating Direct Lake Models

Create a new Direct Lake model from lakehouse tables:

python
from sempy_labs.directlake import generate_direct_lake_semantic_model

# Generate model from lakehouse
generate_direct_lake_semantic_model(
    dataset="New Direct Lake Model",
    lakehouse_tables=["Sales", "Customer"],  # Example table names; replace with your own
    lakehouse="My Lakehouse",
    workspace="My Workspace",
    lakehouse_workspace="Lakehouse Workspace",  # Optional
)

Updating Connections

Change the lakehouse a Direct Lake model points to:

python
from sempy_labs.directlake import update_direct_lake_model_connection

# Point the model at a different lakehouse
update_direct_lake_model_connection(
    dataset="My Direct Lake Model",
    workspace="My Workspace",
    source="New Lakehouse",
    source_type="Lakehouse",
    source_workspace="Target Workspace",
)

Cache Warming

Warm the Direct Lake cache after model refresh:

Using IsResident

python
from sempy_labs.directlake import warm_direct_lake_cache_isresident

# Warm columns that were previously in memory
warm_direct_lake_cache_isresident(
    dataset="My Direct Lake Model",
    workspace="My Workspace"
)

Using Perspective

python
from sempy_labs.directlake import warm_direct_lake_cache_perspective

# Warm columns defined in a perspective
warm_direct_lake_cache_perspective(
    dataset="My Direct Lake Model",
    perspective="Cache Warming",
    workspace="My Workspace"
)

Migration to Direct Lake

The library provides migration tools in src/sempy_labs/migration/.

Migration Workflow

  1. Migrate calculated tables to lakehouse as Delta tables
  2. Create Direct Lake model with same structure
  3. Migrate measures and other objects
  4. Validate and test

Example Migration

python
from sempy_labs.migration import (
    migrate_calc_tables_to_lakehouse,
    migrate_model_objects_to_semantic_model,
)

# Step 1: Migrate calculated tables to lakehouse
migrate_calc_tables_to_lakehouse(
    dataset="Source Import Model",
    new_dataset="Target Direct Lake Model",
    workspace="My Workspace",
    new_dataset_workspace="My Workspace",
    lakehouse="Target Lakehouse",
    lakehouse_workspace="Lakehouse Workspace",
)

# Step 2: Migrate other objects (measures, etc.)
migrate_model_objects_to_semantic_model(
    dataset="Source Import Model",
    new_dataset="Target Direct Lake Model",
    workspace="My Workspace",
    new_dataset_workspace="My Workspace",
)
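
For the final workflow step (validate and test), one simple check is to confirm that every object in the source model also exists in the target. The sketch below does this on plain name lists. `missing_objects` and the sample names are hypothetical, and collecting the names from each model is left out.

```python
from typing import Dict, Iterable, List


def missing_objects(
    source: Dict[str, Iterable[str]], target: Dict[str, Iterable[str]]
) -> Dict[str, List[str]]:
    """Per object type, list names present in the source model but absent from the target."""
    gaps: Dict[str, List[str]] = {}
    for object_type, names in source.items():
        absent = sorted(set(names) - set(target.get(object_type, ())))
        if absent:
            gaps[object_type] = absent
    return gaps


# Hypothetical object inventories for illustration
source_model = {"measures": ["Total Sales", "Margin %"], "tables": ["Sales", "Customer"]}
target_model = {"measures": ["Total Sales"], "tables": ["Sales", "Customer"]}
print(missing_objects(source_model, target_model))
```

An empty result means every source object has a counterpart in the target. Any entry marks an object that still needs to be migrated or recreated.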

Best Practices

Do's

  • ✅ Always check guardrails before deploying
  • ✅ Use schema sync after lakehouse changes
  • ✅ Warm cache after scheduled refreshes
  • ✅ Monitor fallback reasons regularly

Don'ts

  • ❌ Don't ignore fallback warnings
  • ❌ Don't exceed guardrail limits
  • ❌ Don't modify lakehouse schema without syncing model
  • ❌ Don't use unsupported DAX patterns

Troubleshooting

Model Falls Back to DirectQuery

  1. Run check_fallback_reason() to identify cause
  2. Check column mappings with direct_lake_schema_compare()
  3. Verify data types match between model and lakehouse
  4. Check guardrails with get_direct_lake_guardrails()
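
The steps above can be wrapped in a small runner that executes each check in order and keeps going if one fails. This is a sketch under stated assumptions: `run_fallback_diagnostics` is a hypothetical helper, and the checks are passed in as callables so the example runs without a Fabric connection. The stubs below stand in for the real library calls.

```python
from typing import Any, Callable, Dict

# Each check receives (dataset, workspace) and returns any result object
Check = Callable[[str, str], Any]


def run_fallback_diagnostics(
    dataset: str, workspace: str, checks: Dict[str, Check]
) -> Dict[str, Any]:
    """Run each check in insertion order; record its result or the error it raised."""
    report: Dict[str, Any] = {}
    for name, check in checks.items():
        try:
            report[name] = check(dataset, workspace)
        except Exception as exc:  # keep going so one failure does not hide the rest
            report[name] = f"failed: {exc}"
    return report


def failing_check(dataset: str, workspace: str) -> Any:
    """Stub that simulates a check raising an error."""
    raise RuntimeError("lakehouse unreachable")


# Stub checks for illustration; replace with wrappers around the real functions
stub_checks: Dict[str, Check] = {
    "fallback_reason": lambda d, w: ["ParquetTypeMismatch"],
    "schema_compare": failing_check,
}
report = run_fallback_diagnostics("My Direct Lake Model", "My Workspace", stub_checks)
for name, outcome in report.items():
    print(f"{name}: {outcome}")
```

In a Fabric notebook, each entry would wrap the corresponding library function, for example `lambda d, w: check_fallback_reason(dataset=d, workspace=w)`.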

Schema Mismatch

  1. Run direct_lake_schema_compare() to see differences
  2. Run direct_lake_schema_sync() to fix mismatches
  3. Verify partition entity mappings

Performance Issues

  1. Re-warm previously resident columns with warm_direct_lake_cache_isresident()
  2. Verify partitions are properly configured
  3. Review query patterns for unsupported filters

Related Resources

| Resource | URL |
| --- | --- |
| Direct Lake Overview | Microsoft Docs |
| Guardrails | Microsoft Docs |
| Migration Notebook | Notebook |
| API Docs | ReadTheDocs |