AgentSkillsCN

06 Code Python

06 代码Python

SKILL.md

Python Code Node Mastery

Write powerful Python code in n8n - INCLUDING external libraries with Task Runners

Overview

This skill covers Python patterns for n8n Code nodes, including data access, return formats, standard library usage, external library setup (pandas, numpy, requests), Task Runner configuration, and v2.0+ migration guidance.


Part 1: Python vs JavaScript - Key Differences

Syntax Comparison

FeatureJavaScriptPython
Variables prefix$ (dollar sign)_ (underscore)
Property access$json.field or $json["field"]_json["field"] (bracket only)
Input all items$input.all()_input.all()
First item$input.first()_input.first()
Current item$input.item_input.item or _item
Node reference$node["Name"]_node["Name"]
Environment vars$env.VAR_env["VAR"]

Critical Syntax Rules

python
# PYTHON REQUIRES BRACKET NOTATION - NO DOT ACCESS

# WRONG - Will fail in native Python
data = _json.name  # Dot notation not supported

# CORRECT - Use bracket notation
data = _json["name"]

# CORRECT - Nested access
email = _json["body"]["user"]["email"]

v2.0+ Breaking Change

Native Python (v2.0+) does NOT support dot access like the legacy Pyodide version:

python
# Legacy Pyodide (v1.x) - DON'T USE IN v2.0+
item.json.myField  # Worked in Pyodide

# Native Python (v2.0+) - REQUIRED
item["json"]["myField"]  # Bracket notation only

Part 2: Data Access Patterns

Pattern 1: _input.all() - All Items

python
# Get all items as list
items = _input.all()

# Process each item
results = []
for item in items:
    name = item["json"]["name"]
    email = item["json"]["email"]

    results.append({
        "json": {
            "name": name.upper(),
            "email": email.lower(),
            "processed": True
        }
    })

return results

Pattern 2: _input.first() - First Item

python
# Get first item only
first = _input.first()
data = first["json"]

return [{
    "json": {
        "result": data["value"] * 2,
        "source": "first_item"
    }
}]

Pattern 3: _json - Shortcut to First Item

python
# Direct access to first item's JSON (shortcut)
name = _json["name"]
email = _json["email"]

return [{"json": {"name": name, "email": email}}]

Pattern 4: _node - Cross-Node Reference

python
# Access output from specific nodes by name
http_result = _node["HTTP Request"]["json"]
webhook_data = _node["Webhook"]["json"]["body"]

return [{
    "json": {
        "api_data": http_result,
        "webhook_payload": webhook_data
    }
}]

Pattern 5: _items - All Items Shortcut

python
# Alternative way to access all items (Run Once for All Items mode)
for item in _items:
    print(item["json"])

CRITICAL: Webhook Data Access

Webhook data is nested under ["body"]:

python
# WRONG - Returns None or KeyError
data = _json["name"]
email = _json["email"]

# CORRECT - Webhook data under body
data = _json["body"]["name"]
email = _json["body"]["email"]

# SAFE - With .get() fallback
webhook_body = _json.get("body", {})
username = webhook_body.get("username", "unknown")
email = webhook_body.get("email", "")
items = webhook_body.get("items", [])

Part 3: Built-in Variables Reference

VariableDescriptionExample
_inputInput data accessor_input.all(), _input.first()
_jsonFirst item's JSON (shortcut)_json["field"]
_itemsAll items (Run Once for All Items)for item in _items
_itemCurrent item (Run Once for Each Item)_item["json"]["field"]
_nodeAccess specific node's output_node["HTTP Request"]["json"]
_envEnvironment variables_env["API_KEY"]

Environment Variables

python
# Access environment variables
api_key = _env["API_KEY"]
base_url = _env.get("BASE_URL", "https://api.default.com")

# Use in API calls
headers = {
    "Authorization": f"Bearer {api_key}"
}

Part 4: Return Format Requirements

CRITICAL Rule

Always return a list of dictionaries with "json" key:

python
# CORRECT - List of dicts with 'json' key
return [
    {"json": {"name": "Alice", "processed": True}},
    {"json": {"name": "Bob", "processed": True}}
]

# CORRECT - Single result
return [{"json": {"result": "success", "count": 42}}]

# CORRECT - Empty result
return []

# WRONG - Missing 'json' wrapper
return [{"name": "Alice"}]  # Will fail!

# WRONG - Not a list
return {"json": {"name": "Alice"}}  # Will fail!

Processing Multiple Items

python
items = _input.all()
results = []

for item in items:
    processed = {
        "original": item["json"]["value"],
        "doubled": item["json"]["value"] * 2,
        "category": "high" if item["json"]["value"] > 100 else "low"
    }
    results.append({"json": processed})

return results

Returning Binary Data

python
import base64

# For files like PDFs, images, CSVs
pdf_content = b"..."  # Binary content

return [{
    "json": {"filename": "report.pdf", "size": len(pdf_content)},
    "binary": {
        "data": {
            "data": base64.b64encode(pdf_content).decode(),
            "mimeType": "application/pdf",
            "fileName": "report.pdf"
        }
    }
}]

Part 5: Standard Library (Always Available)

These modules work without Task Runner setup:

CategoryModules
Datajson, csv, pickle, sqlite3, xml
Textre, string, textwrap, difflib
DateTimedatetime, calendar, time, zoneinfo
Mathmath, statistics, random, decimal, fractions
Cryptohashlib, hmac, secrets
Encodingbase64, binascii, codecs
Networkurllib, http, socket, ssl, email
Filesos, pathlib, shutil, glob, tempfile, gzip, zipfile
Systemsys, subprocess, threading, multiprocessing, asyncio
Utilscollections, itertools, functools, copy, typing

DateTime Example

python
from datetime import datetime, timedelta
import json

now = datetime.now()
items = _input.all()

results = []
for item in items:
    # Parse ISO date
    created_str = item["json"]["created_at"]
    created = datetime.fromisoformat(created_str.replace("Z", "+00:00"))

    # Calculate age
    age_days = (now - created).days

    results.append({"json": {
        **item["json"],
        "age_days": age_days,
        "is_recent": age_days < 7,
        "processed_at": now.isoformat()
    }})

return results

Regex Example

python
import re

items = _input.all()
email_pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
phone_pattern = r"\+?[\d\s\-\(\)]{10,}"

results = []
for item in items:
    email = item["json"].get("email", "")
    phone = item["json"].get("phone", "")

    results.append({"json": {
        "email": email,
        "email_valid": bool(re.match(email_pattern, email)),
        "phone": phone,
        "phone_valid": bool(re.search(phone_pattern, phone))
    }})

return results

JSON Operations

python
import json

# Parse JSON string
data = _json["json_string"]
parsed = json.loads(data)

# Create JSON string
output = {"key": "value", "numbers": [1, 2, 3]}
json_string = json.dumps(output, indent=2)

return [{"json": {"parsed": parsed, "serialized": json_string}}]

Hashing Example

python
import hashlib

items = _input.all()

results = []
for item in items:
    data = item["json"]["data"]

    # Create hashes
    md5_hash = hashlib.md5(data.encode()).hexdigest()
    sha256_hash = hashlib.sha256(data.encode()).hexdigest()

    results.append({"json": {
        "original": data,
        "md5": md5_hash,
        "sha256": sha256_hash
    }})

return results

Part 6: External Libraries (Requires Task Runner)

What Requires Setup

These libraries need Task Runner configuration:

  • Data Analysis: pandas, numpy, scipy
  • HTTP: requests, httpx, aiohttp
  • Machine Learning: scikit-learn, tensorflow, pytorch
  • Web Scraping: beautifulsoup4, lxml, selenium
  • Excel/Spreadsheets: openpyxl, xlrd, xlsxwriter
  • Any pip-installable package

Using pandas (With Task Runner)

python
import pandas as pd

# Convert input to DataFrame
items = _input.all()
df = pd.DataFrame([item["json"] for item in items])

# Data manipulation
df["total"] = df["quantity"] * df["price"]
df["category"] = df["total"].apply(lambda x: "high" if x > 100 else "low")

# Filter
high_value = df[df["category"] == "high"]

# Aggregations
summary = {
    "total_revenue": float(df["total"].sum()),
    "avg_order": float(df["total"].mean()),
    "order_count": len(df)
}

# Return as n8n format
records = [{"json": row} for row in high_value.to_dict("records")]
records.append({"json": {"summary": summary}})

return records

Using numpy (With Task Runner)

python
import numpy as np

items = _input.all()
values = [item["json"]["value"] for item in items]

arr = np.array(values)

stats = {
    "mean": float(np.mean(arr)),
    "std": float(np.std(arr)),
    "min": float(np.min(arr)),
    "max": float(np.max(arr)),
    "median": float(np.median(arr)),
    "sum": float(np.sum(arr)),
    "percentile_25": float(np.percentile(arr, 25)),
    "percentile_75": float(np.percentile(arr, 75))
}

return [{"json": stats}]

Using requests (With Task Runner)

python
import requests

# GET request
response = requests.get(
    "https://api.example.com/data",
    headers={"Authorization": f"Bearer {_env['API_TOKEN']}"},
    params={"limit": 100},
    timeout=30
)

if response.status_code == 200:
    data = response.json()
    return [{"json": item} for item in data.get("results", [])]
else:
    return [{"json": {
        "error": response.text,
        "status": response.status_code
    }}]

POST with JSON (With Task Runner)

python
import requests

payload = {
    "name": _json["body"]["name"],
    "email": _json["body"]["email"],
    "timestamp": datetime.now().isoformat()
}

response = requests.post(
    "https://api.example.com/users",
    json=payload,
    headers={
        "Authorization": f"Bearer {_env['API_TOKEN']}",
        "Content-Type": "application/json"
    },
    timeout=30
)

return [{"json": {
    "response": response.json(),
    "status": response.status_code,
    "success": response.status_code < 400
}}]

Part 7: Task Runner Setup (v2.0+)

Why Task Runners?

n8n v2.0+ removed bundled Python (Pyodide). External Task Runners:

  • Run in isolated containers for security
  • Enable full Python library access
  • Separate code execution from n8n core

Quick Setup Overview

Step 1: docker-compose.yml

yaml
services:
  n8n:
    image: n8nio/n8n
    ports:
      - "5678:5678"
    environment:
      # Task Runner Settings
      - N8N_RUNNERS_MODE=external
      - N8N_RUNNERS_BROKER_LISTEN_ADDRESS=0.0.0.0
      - N8N_RUNNERS_AUTH_TOKEN=${RUNNERS_AUTH_TOKEN}
    depends_on:
      - n8n-runner

  n8n-runner:
    image: n8nio/runners:latest
    volumes:
      - ./n8n-task-runners.json:/etc/n8n-task-runners.json:ro
    environment:
      - N8N_RUNNERS_TASK_BROKER_URI=http://n8n:5679
      - N8N_RUNNERS_AUTH_TOKEN=${RUNNERS_AUTH_TOKEN}
    depends_on:
      - n8n

Step 2: n8n-task-runners.json

json
{
  "task-runners": [
    {
      "runner-type": "python",
      "workdir": "/home/runner",
      "command": "/opt/runners/task-runner-python/.venv/bin/python",
      "args": ["-m", "src.main"],
      "health-check-server-port": "5682",
      "env-overrides": {
        "PYTHONPATH": "/opt/runners/task-runner-python",
        "N8N_RUNNERS_STDLIB_ALLOW": "*",
        "N8N_RUNNERS_EXTERNAL_ALLOW": "*"
      }
    }
  ]
}

Step 3: Generate Auth Token

bash
openssl rand -hex 32

Use the same token in both N8N_RUNNERS_AUTH_TOKEN settings.

Installing External Libraries

Method 1: Live Installation (Lost on Restart)

bash
# Install to venv site-packages (MUST run as root)
docker exec -u root n8n-runner pip install \
    --target=/opt/runners/task-runner-python/.venv/lib/python3.13/site-packages/ \
    pandas numpy requests

# Verify installation
docker exec n8n-runner \
    /opt/runners/task-runner-python/.venv/bin/python \
    -c "import pandas; print(f'pandas {pandas.__version__}')"

Method 2: Custom Dockerfile (Persistent)

dockerfile
FROM n8nio/runners:latest

# Install packages to venv site-packages
RUN pip install \
    --target=/opt/runners/task-runner-python/.venv/lib/python3.13/site-packages/ \
    pandas \
    numpy \
    requests \
    scikit-learn \
    beautifulsoup4 \
    openpyxl

Update docker-compose.yml:

yaml
n8n-runner:
  build: ./custom-runner
  # ... rest of config

Environment Variables Reference

VariablePurpose
N8N_RUNNERS_MODE=externalEnable external task runners
N8N_RUNNERS_BROKER_LISTEN_ADDRESS=0.0.0.0Allow runner connections
N8N_RUNNERS_AUTH_TOKENSecure runner communication
N8N_RUNNERS_TASK_BROKER_URIRunner connects to n8n broker
N8N_RUNNERS_STDLIB_ALLOWAllowed Python stdlib modules
N8N_RUNNERS_EXTERNAL_ALLOWAllowed external pip packages

Part 8: Common Errors & Fixes

Error: "No module named 'X'"

Cause 1: Library installed in wrong location

bash
# WRONG - System Python (won't work)
docker exec n8n-runner pip install numpy

# CORRECT - venv site-packages
docker exec -u root n8n-runner pip install \
    --target=/opt/runners/task-runner-python/.venv/lib/python3.13/site-packages/ \
    numpy

Cause 2: External libraries not allowed

json
// Add to n8n-task-runners.json
"N8N_RUNNERS_EXTERNAL_ALLOW": "*"

Error: "Task request timed out"

Cause: Broker listening only to localhost

yaml
# FIX: Set broker to listen on all interfaces
N8N_RUNNERS_BROKER_LISTEN_ADDRESS=0.0.0.0

Error: "Permission denied during install"

bash
# Add -u root flag
docker exec -u root n8n-runner pip install ...

Error: "Return value not iterable"

Cause: Not returning list of dicts

python
# WRONG
return {"json": {"result": "ok"}}

# CORRECT
return [{"json": {"result": "ok"}}]

Error: "'NoneType' object is not subscriptable"

Cause: Accessing missing key

python
# WRONG
value = item["json"]["missing_field"]

# CORRECT - Use .get() with default
value = item["json"].get("missing_field", "default")

Error: "Library lost after container restart"

Cause: Live installation is not persistent

Fix: Use custom Dockerfile (Method 2 above)

Error: "Python runner unavailable"

Cause: Runner container not running or not connected

bash
# Check runner status
docker-compose logs n8n-runner

# Verify auth tokens match in both services

Part 9: Best Practices

Do

  • Validate input data before processing
  • Use try/except for external API calls
  • Use .get() with defaults for safe dictionary access
  • Return early on errors with descriptive messages
  • Keep code focused - one task per Code node
  • Use standard library when external libs aren't needed
  • Log errors with print() for debugging

Don't

  • Don't use dot notation - Python requires bracket access
  • Don't forget return statement - always return list
  • Don't hardcode credentials - use _env["VAR"]
  • Don't skip webhook body - data is under ["body"]
  • Don't install to system Python - use venv path

Error Handling Pattern

python
import traceback

try:
    items = _input.all()

    if not items:
        return [{"json": {"error": "No input items", "success": False}}]

    results = []
    for item in items:
        try:
            # Process individual item
            value = item["json"].get("value", 0)
            result = {"original": value, "processed": value * 2}
            results.append({"json": {**result, "success": True}})
        except Exception as e:
            results.append({"json": {
                "error": str(e),
                "item_id": item["json"].get("id"),
                "success": False
            }})

    return results

except Exception as e:
    return [{"json": {
        "error": str(e),
        "traceback": traceback.format_exc(),
        "success": False
    }}]

Part 10: When to Use Python vs JavaScript

Use Python When

ScenarioWhy
Data analysispandas is faster than JS alternatives
Machine learningscikit-learn, tensorflow, pytorch
Heavy data processing10k+ items, complex transformations
Statisticsnumpy, scipy for numerical computing
Scientific computingSpecialized Python libraries
Web scrapingBeautifulSoup, lxml support

Use JavaScript When

ScenarioWhy
Simple transformsFaster startup, no runner overhead
String manipulationJS string methods are excellent
DateTime operationsLuxon built-in, powerful API
Quick operationsInstant execution, no network latency
Small datasets< 1000 items
$helpers.httpRequestBuilt-in, handles auth well

Performance Comparison

FactorJavaScriptPython (Task Runner)
StartupInstant~10-50ms overhead
Simple opsFastestSlower startup
Large dataSlows downOptimized (pandas)
External libsLimitedFull pip access
MemorySharedIsolated container

Quick Reference Checklist

Before deploying Python Code nodes:

  • Bracket notation used: No dot access (["field"] not .field)
  • Return format correct: [{"json": {...}}] list structure
  • Webhook data accessed: Via ["body"] key
  • Input validated: Check for empty/missing data
  • Error handling: try/except for external calls
  • Safe access: .get() with defaults for optional fields
  • Task Runner setup: If using external libraries
  • No hardcoded secrets: Use _env["VAR"]

Sources & Further Reading


Related Skills

SkillWhen to Use
05-code-javascriptJavaScript patterns and helpers
07-expression-syntaxExpression vs code differences
03-node-configurationCode node settings
08-validation-expertDebug code errors

See community-fixes/python-external-libs/ for complete Task Runner setup files.