troubleshooting

Use when debugging failed jobs, diagnosing errors, or resolving common Dataiku issues.

SKILL.md
--- frontmatter
name: troubleshooting
description: "Use when debugging failed jobs, diagnosing errors, or resolving common Dataiku issues"

Dataiku Troubleshooting Guide

Debugging Checklist

  1. Environment activated? which python should print a path inside dataiku-env
  2. Variables set? echo $DSS_URL should print your instance URL
  3. Can connect? Run scripts/bootstrap.py
  4. Recipe saved? Check for settings.save()
  5. Job ran? Check for recipe.run()
  6. Job succeeded? Check job.get_status()
  7. Schema correct? Run autodetect_settings()
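
The first two checklist items can also be checked from Python before touching the API. The following is a minimal sketch, not part of the skill's own scripts: the function name `preflight` and its parameters are hypothetical, and it assumes the virtual environment directory is literally named dataiku-env.

```python
import importlib.util
import os
import sys


def preflight(env=None, prefix=None):
    """Return a list of problems for checklist steps 1-2 (empty list means OK)."""
    env = os.environ if env is None else env
    prefix = sys.prefix if prefix is None else prefix
    problems = []

    # Step 1: the interpreter should live inside the dataiku-env virtualenv.
    if "dataiku-env" not in prefix:
        problems.append(f"python prefix {prefix!r} is not inside dataiku-env")

    # Step 2: DSS_URL must be set and non-empty.
    if not env.get("DSS_URL"):
        problems.append("DSS_URL is not set")

    # The client package must be importable (see ModuleNotFoundError: dataikuapi).
    if importlib.util.find_spec("dataikuapi") is None:
        problems.append("dataikuapi is not importable")

    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("FAIL:", problem)
```

Passing `env` and `prefix` explicitly makes the check easy to exercise without changing the real environment; with no arguments it inspects the current process.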

Top-10 Error Quick Reference

| Error | Cause | Solution |
|---|---|---|
| Connection refused | Wrong DSS_URL or instance down | Verify URL, check instance status |
| 401 Unauthorized | Invalid or expired API key | Regenerate key in Dataiku UI |
| Project not found | Wrong project key or no access | Run client.list_project_keys() to verify |
| Settings not saved | Missing settings.save() | Always call settings.save() after changes |
| Recipe ran but no data | Filter/join removed all rows | Check inputs, join keys, filters |
| Job failed | Schema mismatch, missing inputs | Inspect job status and logs |
| invalid identifier (quoted) | Lowercase column names in SQL schema | Normalize schema to UPPERCASE |
| table does not exist | Upstream dataset not built | Build datasets in dependency order |
| Insert value list mismatch | Output schema doesn't match recipe output | Run recipe.compute_schema_updates() and apply |
| ModuleNotFoundError: dataikuapi | Virtual environment not activated | source ~/dataiku-env/bin/activate |
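
For the "invalid identifier (quoted)" row, normalizing a schema to UPPERCASE might look like the sketch below. It assumes the schema is a dict with a "columns" list whose entries carry a "name" key; the helper name `uppercase_schema` and the example columns are hypothetical.

```python
import copy


def uppercase_schema(schema):
    """Return a copy of a schema dict with every column name upper-cased."""
    normalized = copy.deepcopy(schema)  # leave the caller's dict untouched
    for column in normalized.get("columns", []):
        column["name"] = column["name"].upper()
    return normalized


# Example with a hand-written schema (illustrative, not real project data).
example = {"columns": [{"name": "order_id", "type": "bigint"},
                       {"name": "Amount", "type": "double"}]}
print([c["name"] for c in uppercase_schema(example)["columns"]])
# prints ['ORDER_ID', 'AMOUNT']
```

With a dataset handle, the result would typically be written back with something like dataset.set_schema(uppercase_schema(dataset.get_schema())); check downstream recipes afterwards, since they may still reference the old lowercase names.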

Job Failure Investigation Pattern

python
# Get the most recent job and extract error details.
# Assumes `project` is an existing project handle, e.g. client.get_project(...).
jobs = project.list_jobs()
if not jobs:
    raise SystemExit("No jobs found for this project")
job = project.get_job(jobs[0]['def']['id'])
status = job.get_status()
state = status.get("baseStatus", {}).get("state")  # e.g. "DONE" or "FAILED"

if state == "FAILED":
    # Print the first failure message of each failed activity
    activities = status.get("baseStatus", {}).get("activities", {})
    for name, info in activities.items():
        if info.get("firstFailure"):
            print(f"Error in {name}: {info['firstFailure'].get('message')}")

    # Or get the full log
    print(job.get_log())

Important: recipe.run() already waits for completion internally. Use recipe.run(no_fail=True) to prevent exceptions on failure, then inspect the returned job object.
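
A minimal sketch of that pattern follows, assuming the status dict has the baseStatus / activities / firstFailure layout used in the investigation code above. The helper names `failure_messages` and `run_and_report` are hypothetical, and `recipe` is assumed to be an existing recipe handle.

```python
def failure_messages(status):
    """Map each failed activity name to its first failure message."""
    base = status.get("baseStatus", {})
    messages = {}
    for name, info in base.get("activities", {}).items():
        failure = info.get("firstFailure")
        if failure:
            messages[name] = failure.get("message")
    return messages


def run_and_report(recipe):
    """Run a recipe without raising on failure and summarize the outcome."""
    # no_fail=True returns the job object even when the job fails.
    job = recipe.run(no_fail=True)
    status = job.get_status()
    state = status.get("baseStatus", {}).get("state")
    return state, failure_messages(status)
```

A caller can then branch on the returned state and print the messages, instead of wrapping recipe.run() in a try/except.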

Detailed Error References

For full details on each error category including causes, code examples, and solutions:

Scripts