Dataiku Troubleshooting Guide
Debugging Checklist
- Environment activated? `which python` should show dataiku-env
- Variables set? `echo $DSS_URL`
- Can connect? Run `scripts/bootstrap.py`
- Recipe saved? Check for `settings.save()`
- Job ran? Check for `recipe.run()`
- Job succeeded? Check `job.get_status()`
- Schema correct? Run `autodetect_settings()`
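The first two checklist items can be checked from Python before any API call is made. The following is a minimal sketch, assuming the virtual environment directory is named dataiku-env and that `DSS_URL` is the only variable to verify; the helper name `preflight_problems` is hypothetical and not part of any Dataiku package.

```python
import os
import sys


def preflight_problems(env=None, python_path=None):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper: `env` and `python_path` default to the live
    process environment and interpreter, and can be overridden for testing.
    """
    env = os.environ if env is None else env
    python_path = sys.executable if python_path is None else python_path

    problems = []
    # Checklist item 1: the interpreter should live inside dataiku-env.
    if "dataiku-env" not in python_path:
        problems.append(f"Python is {python_path}; dataiku-env is not activated")
    # Checklist item 2: DSS_URL must be set and non-empty.
    if not env.get("DSS_URL"):
        problems.append("DSS_URL is not set")
    return problems
```

Calling `preflight_problems()` with no arguments inspects the current process; an empty list means both checks passed.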
Top-10 Error Quick Reference
| Error | Cause | Solution |
|---|---|---|
| Connection refused | Wrong `DSS_URL` or instance down | Verify URL, check instance status |
| 401 Unauthorized | Invalid or expired API key | Regenerate key in Dataiku UI |
| Project not found | Wrong project key or no access | Run `client.list_project_keys()` to verify |
| Settings not saved | Missing `settings.save()` | Always call `settings.save()` after changes |
| Recipe ran but no data | Filter/join removed all rows | Check inputs, join keys, filters |
| Job failed | Schema mismatch, missing inputs | Inspect job status and logs |
| invalid identifier (quoted) | Lowercase column names in SQL schema | Normalize schema to UPPERCASE |
| table does not exist | Upstream dataset not built | Build datasets in dependency order |
| Insert value list mismatch | Output schema doesn't match recipe output | Run `recipe.compute_schema_updates()` and apply |
| ModuleNotFoundError: dataikuapi | Virtual environment not activated | `source ~/dataiku-env/bin/activate` |
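
The "Normalize schema to UPPERCASE" fix for the quoted invalid identifier error can be sketched as a pure function over a schema dictionary. This is an illustration only: it assumes the schema has the shape `{"columns": [{"name": ..., "type": ...}]}`, and the helper name `uppercase_schema` is hypothetical.

```python
import copy


def uppercase_schema(schema):
    """Return a copy of a schema dict with every column name upper-cased.

    Assumes the shape {"columns": [{"name": ..., "type": ...}, ...]}.
    Raises ValueError if two names collide after upper-casing.
    """
    result = copy.deepcopy(schema)  # leave the caller's schema untouched
    seen = set()
    for column in result.get("columns", []):
        upper = column["name"].upper()
        if upper in seen:
            raise ValueError(f"Duplicate column after upper-casing: {upper}")
        seen.add(upper)
        column["name"] = upper
    return result
```

With a dataset handle, this would likely sit between `dataset.get_schema()` and `dataset.set_schema(...)`; check both calls against the dataikuapi version in use before relying on them.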
Job Failure Investigation Pattern
```python
# Assumes `project` is an existing dataikuapi project handle.
# Get the most recent job and extract error details
jobs = project.list_jobs()
job = project.get_job(jobs[0]['def']['id'])
status = job.get_status()
state = status.get("baseStatus", {}).get("state")  # "DONE" or "FAILED"

if state == "FAILED":
    activities = status.get("baseStatus", {}).get("activities", {})
    for name, info in activities.items():
        if info.get("firstFailure"):
            print(f"Error: {info['firstFailure'].get('message')}")

    # Or get full log
    print(job.get_log())
```
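
When the failure details are needed by other code rather than printed, the same traversal can be wrapped in a helper that returns them. This is a sketch under the assumption that the status dictionary has the `baseStatus` / `activities` / `firstFailure` layout used in the pattern above; the name `collect_failures` is hypothetical.

```python
def collect_failures(status):
    """Return (state, [(activity_name, message), ...]) from a job status dict.

    Assumes the layout used in the investigation pattern:
    status["baseStatus"]["state"] and status["baseStatus"]["activities"].
    """
    base = status.get("baseStatus", {})
    failures = []
    for name, info in base.get("activities", {}).items():
        first_failure = info.get("firstFailure")
        if first_failure:
            failures.append((name, first_failure.get("message")))
    return base.get("state"), failures
```

A typical call would be `state, failures = collect_failures(job.get_status())`, followed by a check on `state == "FAILED"`.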
Important: `recipe.run()` already waits for completion internally. Use `recipe.run(no_fail=True)` to prevent exceptions on failure, then inspect the returned job object.
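
A minimal sketch of that run-then-inspect flow follows. The function `run_and_check` and the `FakeJob` / `FakeRecipe` stand-ins are hypothetical; the stand-ins exist only so the sketch runs without a DSS instance, and they assume the job status exposes `baseStatus.state` as in the investigation pattern.

```python
def run_and_check(recipe):
    """Run a recipe without raising on failure; return (succeeded, job).

    Assumes `recipe` exposes run(no_fail=True) returning a job whose
    get_status() reports baseStatus.state.
    """
    job = recipe.run(no_fail=True)  # returns the job even when it failed
    state = job.get_status().get("baseStatus", {}).get("state")
    return state == "DONE", job


# Stand-in objects so the sketch runs without a DSS instance.
class FakeJob:
    def __init__(self, state):
        self._state = state

    def get_status(self):
        return {"baseStatus": {"state": self._state}}


class FakeRecipe:
    def __init__(self, state):
        self._state = state

    def run(self, no_fail=False):
        return FakeJob(self._state)
```

Returning the job alongside the boolean keeps the log and status available to the caller, so a failed run can be passed straight into the investigation steps.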
Detailed Error References
For full details on each error category including causes, code examples, and solutions:
- references/connection-errors.md: Connection refused, 401 Unauthorized, Project not found
- references/recipe-errors.md: Settings not saved, empty output, job failures, job API usage patterns
- references/sql-errors.md: Invalid identifier (quoted/general), table does not exist, pre-join computed columns, insert value list mismatch
- references/environment-errors.md: ModuleNotFoundError, missing env vars, getting more help
Scripts
- scripts/debug_job.py: Standalone script to debug the most recent failed job