Test Execution Patterns
Proper test execution using Taskfile with required environment setup
Critical Rules for Running Tests
Rule 1: ALWAYS use Taskfile for Tests
NEVER run tests directly - CI failures should NOT be debugged by re-running tests locally in ad-hoc ways.
# ❌ WRONG - Direct test execution
go test ./...
pytest tests/
npm test

# ✅ CORRECT - Use Task
task go:test
task python:test
task typescript:test
Why? The Taskfile ensures:
- Correct environment variables are set
- Required dependencies are downloaded (PDFium, ONNX Runtime)
- Library paths are configured properly (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, PATH)
- FFI bindings are built and accessible
- Platform-specific quirks are handled
Rule 2: Environment Setup is Automated
The test scripts automatically configure:
For Go (scripts/go/test.sh):
# Automatically sources:
source "${REPO_ROOT}/scripts/lib/common.sh"
source "${REPO_ROOT}/scripts/lib/library-paths.sh"
# Then calls:
setup_go_paths "$REPO_ROOT" # Sets CGO flags, PKG_CONFIG_PATH
setup_pdfium_paths # Configures PDFium library location
setup_onnx_paths # Configures ONNX Runtime if ORT_LIB_LOCATION set
For Python (scripts/python/test.sh):
# Sets up virtual environment with uv
# Configures ONNX Runtime paths
# Downloads PDFium runtime
For TypeScript (scripts/typescript/test.sh):
# Sets up pnpm dependencies
# Configures native module paths
# Handles NAPI-RS bindings
ONNX Runtime Configuration
macOS (Homebrew)
ONNX Runtime must be installed and configured:
# Install if not present
brew install onnxruntime

# Set environment variable for tests
export ORT_LIB_LOCATION=/opt/homebrew/opt/onnxruntime/lib

# Now run tests
task go:test
Location: Brew installs to /opt/homebrew/opt/onnxruntime/lib (Apple Silicon) or /usr/local/opt/onnxruntime/lib (Intel)
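Because the Homebrew prefix differs between Apple Silicon and Intel, the choice of directory can be scripted. The following is a minimal sketch, not part of the repository: `brew_ort_lib_dir` is a hypothetical helper name, and it assumes the two default Homebrew prefixes listed above.

```shell
# Hypothetical helper (not in the repo): map a CPU architecture to the
# default Homebrew ONNX Runtime lib directory described above.
brew_ort_lib_dir() {
  # Use the first argument if given, otherwise ask the machine.
  local arch="${1:-$(uname -m)}"
  case "$arch" in
    arm64|aarch64) echo "/opt/homebrew/opt/onnxruntime/lib" ;;  # Apple Silicon
    x86_64)        echo "/usr/local/opt/onnxruntime/lib" ;;     # Intel
    *)             echo "unsupported architecture: $arch" >&2; return 1 ;;
  esac
}
```

Usage would look like `export ORT_LIB_LOCATION="$(brew_ort_lib_dir)"`. If Homebrew lives in a non-default prefix, `$(brew --prefix onnxruntime)/lib` is the more reliable source.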
Linux
# Download and extract ONNX Runtime
export ORT_LIB_LOCATION=/path/to/onnxruntime/lib
export LD_LIBRARY_PATH="$ORT_LIB_LOCATION:$LD_LIBRARY_PATH"
task go:test
Windows
# Download ONNX Runtime binaries
$env:ORT_LIB_LOCATION="C:\path\to\onnxruntime\lib"
$env:PATH="$env:ORT_LIB_LOCATION;$env:PATH"
task go:test
CI Configuration
In GitHub Actions, ONNX Runtime is set up via:
- name: Setup ONNX Runtime
  uses: ./.github/actions/setup-onnx-runtime
  with:
    ort-version: ${{ env.ORT_VERSION }}
This action:
- Downloads the correct ONNX Runtime version for the platform
- Sets the ORT_LIB_LOCATION environment variable
- Adds it to the library search paths (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, PATH)
- Sets ORT_DYLIB_PATH for direct binary reference
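The action's implementation is not shown here, but a step that wants later steps to see these variables typically appends `NAME=value` lines to the file named by `GITHUB_ENV`. A rough sketch under that assumption follows; `export_ort_env` and its arguments are hypothetical names chosen for illustration.

```shell
# Hypothetical sketch (not the real action): persist ONNX Runtime variables
# for later workflow steps by appending NAME=value lines to $GITHUB_ENV.
export_ort_env() {
  local ort_lib="$1"      # directory containing the ONNX Runtime library
  local dylib_name="$2"   # library file name, e.g. libonnxruntime.so
  # Abort with a message if GITHUB_ENV is not provided by the runner.
  local env_file="${GITHUB_ENV:?GITHUB_ENV must point to a writable file}"
  {
    echo "ORT_LIB_LOCATION=${ort_lib}"
    echo "ORT_DYLIB_PATH=${ort_lib}/${dylib_name}"
  } >> "$env_file"
}
```

Outside GitHub Actions the same function can be exercised by pointing `GITHUB_ENV` at a temporary file.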
Library Path Setup
How setup_go_paths Works
Location: scripts/lib/library-paths.sh
setup_go_paths() {
  local repo_root="${1:-${REPO_ROOT:-}}"

  # Generate pkg-config file if missing
  export PKG_CONFIG_PATH="${repo_root}/crates/kreuzberg-ffi:${PKG_CONFIG_PATH:-}"

  # Enable CGO
  export CGO_ENABLED=1
  export CGO_CFLAGS="-I${repo_root}/crates/kreuzberg-ffi/include"

  # Platform-specific library paths
  # ($platform is expected to be set before this point; not shown in this excerpt)
  case "$platform" in
    Linux)
      export LD_LIBRARY_PATH="${repo_root}/target/release:${LD_LIBRARY_PATH:-}"
      export CGO_LDFLAGS="-L${repo_root}/target/release -lkreuzberg_ffi -Wl,-rpath,${repo_root}/target/release"
      ;;
    macOS | Darwin)
      export DYLD_LIBRARY_PATH="${repo_root}/target/release:${DYLD_LIBRARY_PATH:-}"
      export DYLD_FALLBACK_LIBRARY_PATH="${repo_root}/target/release:${DYLD_FALLBACK_LIBRARY_PATH:-}"
      export CGO_LDFLAGS="-L${repo_root}/target/release -lkreuzberg_ffi -Wl,-rpath,${repo_root}/target/release"
      ;;
    Windows)
      export CGO_LDFLAGS="-L${repo_root}/target/x86_64-pc-windows-gnu/release -L${repo_root}/target/release"
      ;;
  esac
}
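The excerpt branches on a `platform` value without showing where it comes from. One plausible way to derive it is to normalize the output of `uname -s`. This is only a sketch: `normalize_platform` is a hypothetical name, and the real script may detect the platform differently.

```shell
# Hypothetical sketch: turn `uname -s` output into the labels used by the
# case statements above (Linux, macOS, Windows).
normalize_platform() {
  # Accept an explicit value for testing, otherwise query the system.
  local raw="${1:-$(uname -s)}"
  case "$raw" in
    Linux*)               echo "Linux" ;;
    Darwin*)              echo "macOS" ;;
    MINGW*|MSYS*|CYGWIN*) echo "Windows" ;;  # Git Bash, MSYS2, Cygwin
    *)                    echo "unknown" ;;
  esac
}
```

With this in place, `platform="$(normalize_platform)"` would give the `case` statement a value to match on.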
How setup_onnx_paths Works
setup_onnx_paths() {
  local ort_lib="${ORT_LIB_LOCATION:-}"
  [ -z "$ort_lib" ] && return 0 # Skip if not set

  case "$platform" in
    Linux)
      export LD_LIBRARY_PATH="${ort_lib}:${LD_LIBRARY_PATH:-}"
      ;;
    macOS | Darwin)
      export DYLD_LIBRARY_PATH="${ort_lib}:${DYLD_LIBRARY_PATH:-}"
      export DYLD_FALLBACK_LIBRARY_PATH="${ort_lib}:${DYLD_FALLBACK_LIBRARY_PATH:-}"
      ;;
    Windows)
      export PATH="${ort_lib};${PATH:-}"
      ;;
  esac
}
Key Insight: If ORT_LIB_LOCATION is not set, setup_onnx_paths returns early without touching any library path, and ONNX Runtime tests will fail with:
libonnxruntime.dylib: cannot open shared object file
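Since the setup function skips silently, a preflight check can surface the problem before any test starts. The sketch below is an illustration only: `check_ort_lib` is a hypothetical helper, and the file name patterns are assumptions based on the library names used elsewhere in this document.

```shell
# Hypothetical preflight check: fail early if ORT_LIB_LOCATION is unset or
# does not contain an ONNX Runtime shared library.
check_ort_lib() {
  local dir="${ORT_LIB_LOCATION:-}"
  if [ -z "$dir" ]; then
    echo "ORT_LIB_LOCATION is not set" >&2
    return 1
  fi
  local f
  # Unmatched globs stay literal, so the -e test simply fails for them.
  for f in "$dir"/libonnxruntime*.dylib "$dir"/libonnxruntime*.so* "$dir"/onnxruntime.dll; do
    [ -e "$f" ] && return 0
  done
  echo "no ONNX Runtime library found in $dir" >&2
  return 1
}
```

Running `check_ort_lib && task go:test` would then stop with a readable message instead of a loader error deep inside the test run.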
Test Execution Workflow
Standard Test Flow
graph TD
A[task go:test] --> B[scripts/go/test.sh]
B --> C[source scripts/lib/library-paths.sh]
C --> D[setup_go_paths]
C --> E[setup_pdfium_paths]
C --> F[setup_onnx_paths]
D --> G[go test with env configured]
E --> G
F --> G
Required Build Order
# 1. Build Rust FFI library
cargo build --release --package kreuzberg-ffi

# 2. Download PDFium runtime (automatic in test scripts)
scripts/download_pdfium_runtime.sh

# 3. Set up ONNX Runtime (if needed for embedding tests)
export ORT_LIB_LOCATION=$(brew --prefix onnxruntime)/lib

# 4. Run tests via Task
task go:test
Language-Specific Patterns
Go Tests
# Standard test run
task go:test

# Verbose output
task go:test:verbose

# CI mode (with enhanced debugging)
task go:test:ci

# Debug mode (with custom options)
task go:test:debug

# E2E tests only
task go:e2e:test
Files:
- Task definition: .task/languages/go.yml
- Test script: scripts/go/test.sh
- Library paths: scripts/lib/library-paths.sh
Python Tests
# Standard test run
task python:test

# CI mode (with coverage)
task python:test:ci

# Verbose output
task python:test:verbose
Files:
- Task definition: .task/languages/python.yml
- Test script: scripts/python/test.sh
TypeScript Tests
# Standard test run
task typescript:test

# CI mode
task typescript:test:ci

# Watch mode
task typescript:test:watch
Files:
- Task definition: .task/languages/typescript.yml
- Test script: scripts/typescript/test.sh
Java Tests
task java:test
task java:test:ci
Ruby Tests
task ruby:test
task ruby:test:ci
C# Tests
task csharp:test
task csharp:test:ci
PHP Tests
task php:test
task php:test:ci
Elixir Tests
task elixir:test
task elixir:test:ci
Debugging Test Failures
Pattern 1: Check Environment Setup
# Set verbose mode to see environment details
export CI=true
export VERBOSE_MODE=true
task go:test
This shows:
- Go version
- Working directory
- LD_LIBRARY_PATH / DYLD_LIBRARY_PATH
- CGO_ENABLED
- CGO_CFLAGS
- CGO_LDFLAGS
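When the verbose output is not available, for example in an interactive shell that has sourced the library-path helpers, the same variables can be printed by hand. This is a small sketch, not the script's own logging; `dump_test_env` is a hypothetical name and it covers only the environment variables from the list above.

```shell
# Hypothetical helper: print the CGO and loader variables listed above,
# marking the ones that are unset or empty.
dump_test_env() {
  local name value
  for name in LD_LIBRARY_PATH DYLD_LIBRARY_PATH CGO_ENABLED CGO_CFLAGS CGO_LDFLAGS; do
    # Indirect lookup: read the variable whose name is stored in $name.
    eval "value=\${$name-}"
    printf '%s=%s\n' "$name" "${value:-(unset)}"
  done
}
```

Comparing this output between a passing and a failing environment is often the quickest way to spot a missing path.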
Pattern 2: Verify FFI Library Exists
# Check if FFI library is built
ls -lh target/release/libkreuzberg_ffi.*

# macOS
ls target/release/libkreuzberg_ffi.dylib

# Linux
ls target/release/libkreuzberg_ffi.so

# Windows
ls target/release/kreuzberg_ffi.dll
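The per-platform file names above can be folded into one check that a script can call. The following is a sketch only: `ffi_lib_filename` and `ffi_lib_exists` are hypothetical helpers, and the platform labels are assumed to match those used in scripts/lib/library-paths.sh.

```shell
# Hypothetical helpers: resolve the FFI library file name for a platform
# and test whether it exists under <repo_root>/target/release.
ffi_lib_filename() {
  case "$1" in
    Linux)        echo "libkreuzberg_ffi.so" ;;
    macOS|Darwin) echo "libkreuzberg_ffi.dylib" ;;
    Windows)      echo "kreuzberg_ffi.dll" ;;
    *)            echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

ffi_lib_exists() {
  local repo_root="$1" platform="$2" name
  name="$(ffi_lib_filename "$platform")" || return 1
  [ -f "${repo_root}/target/release/${name}" ]
}
```

For example, `ffi_lib_exists "$PWD" Linux || cargo build --release --package kreuzberg-ffi` rebuilds only when the library is missing.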
Pattern 3: Verify ONNX Runtime Setup
# macOS
echo $ORT_LIB_LOCATION
ls -lh $ORT_LIB_LOCATION/libonnxruntime*.dylib

# Linux
echo $ORT_LIB_LOCATION
ls -lh $ORT_LIB_LOCATION/libonnxruntime*.so

# Windows
echo $env:ORT_LIB_LOCATION
ls $env:ORT_LIB_LOCATION\onnxruntime.dll
Pattern 4: Check PDFium Download
# PDFium is auto-downloaded by test scripts
# Verify it exists:
ls -lh target/release/libpdfium.dylib  # macOS
ls -lh target/release/libpdfium.so     # Linux
ls target/release/pdfium.dll           # Windows
Common Test Failure Patterns
Failure: "Package 'kreuzberg-ffi' not found"
Cause: PKG_CONFIG_PATH not set or kreuzberg-ffi.pc missing
Fix:
# Let the test script generate it
task go:test

# Or manually
export PKG_CONFIG_PATH="$PWD/crates/kreuzberg-ffi:$PKG_CONFIG_PATH"
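To tell the two causes apart, it helps to check whether any directory on PKG_CONFIG_PATH actually contains kreuzberg-ffi.pc. The sketch below does that without requiring pkg-config itself; `find_pc_file` is a hypothetical helper written for illustration.

```shell
# Hypothetical helper: search each PKG_CONFIG_PATH entry for <pkg>.pc and
# print the first match. The body runs in a subshell so that the IFS and
# globbing changes do not leak into the caller.
find_pc_file() (
  pkg="$1"
  IFS=':'
  set -f   # disable globbing while the path list is split on ':'
  for dir in ${PKG_CONFIG_PATH:-}; do
    if [ -n "$dir" ] && [ -f "$dir/$pkg.pc" ]; then
      echo "$dir/$pkg.pc"
      exit 0
    fi
  done
  exit 1
)
```

If `find_pc_file kreuzberg-ffi` prints nothing, either the variable is missing the crates/kreuzberg-ffi directory or the .pc file has not been generated yet. Where pkg-config is installed, `pkg-config --exists kreuzberg-ffi` answers the same question.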
Failure: "libonnxruntime.dylib: cannot open shared object file"
Cause: ORT_LIB_LOCATION not set
Fix:
# macOS
export ORT_LIB_LOCATION=/opt/homebrew/opt/onnxruntime/lib

# Linux (download from GitHub releases)
export ORT_LIB_LOCATION=/path/to/onnxruntime-linux-x64-1.23.2/lib

# Run tests
task go:test
Failure: "undefined reference to kreuzberg_extract_file_sync"
Cause: FFI library not built or not in library path
Fix:
# Build FFI library
cargo build --release --package kreuzberg-ffi

# Put the library on the loader path (macOS shown; use LD_LIBRARY_PATH on Linux)
export DYLD_LIBRARY_PATH="$PWD/target/release:$DYLD_LIBRARY_PATH"

# Run tests
task go:test
Failure: Segmentation fault
Cause: Multiple possible causes - version mismatch, memory corruption, threading issue
Debug:
# Enable full backtrace
export RUST_BACKTRACE=full
export RUST_LIB_BACKTRACE=1

# Run single test
cd packages/go/v4
go test -run TestNameHere -v

# Check for race conditions
go test -race -run TestNameHere
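Segfaults caused by threading issues are often intermittent, so a single passing run proves little. A simple way to hunt for them is to repeat the same command until it fails. The helper below is a generic sketch; `repeat_until_fail` is a hypothetical name and is not part of the repository's scripts.

```shell
# Hypothetical helper: run a command up to N times and stop at the first
# failure, reporting which iteration failed.
repeat_until_fail() {
  local max="$1" i=1
  shift
  while [ "$i" -le "$max" ]; do
    if ! "$@"; then
      echo "failed on run $i of $max" >&2
      return 1
    fi
    i=$((i + 1))
  done
}
```

For example, `repeat_until_fail 20 go test -race -run TestNameHere -v` runs the test up to twenty times in the already configured environment. Go's own `-count` flag is an alternative when stopping at the first failure is not required.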
CI vs Local Differences
CI Environment
- Runs on clean containers (no cached dependencies)
- Uses pre-built artifacts from build job
- Downloads PDFium and ONNX Runtime automatically
- Sets all environment variables via GitHub Actions
Local Environment
- May have stale build artifacts
- Requires manual ONNX Runtime installation (Homebrew)
- FFI library must be built locally
- Environment variables may persist from previous runs
Best Practice: Test locally with task BEFORE pushing to CI to catch issues early.
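One local-only hazard is that repeated manual exports keep growing LD_LIBRARY_PATH or DYLD_LIBRARY_PATH within a long-lived shell. A minimal sketch of an idempotent prepend follows; `prepend_path_once` is a hypothetical helper, not something the repository's scripts are known to provide.

```shell
# Hypothetical helper: prepend a directory to a colon-separated path list
# only if it is not already present. Prints the resulting list.
prepend_path_once() {
  # $1 = directory to add, $2 = current colon-separated value (may be empty)
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;                # already present: unchanged
    *)        printf '%s\n' "${1}${2:+:$2}" ;;     # add, without a trailing colon
  esac
}
```

Usage: `export LD_LIBRARY_PATH="$(prepend_path_once "$PWD/target/release" "${LD_LIBRARY_PATH:-}")"`. Avoiding the trailing colon also avoids an empty path entry, which the Linux dynamic loader treats as the current directory.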
Test Isolation
Go Test Mutex Pattern
Location: packages/go/v4/ffi.go
// Serialize FFI calls to prevent concurrent PDFium access
var ffiMutex sync.Mutex

func ExtractFileSync(path string, config *ExtractionConfig) (*ExtractionResult, error) {
	ffiMutex.Lock()
	defer ffiMutex.Unlock()
	// ... FFI call
}
Why? PDFium is not thread-safe. The mutex prevents concurrent access and potential segfaults.
Python Test Isolation
Tests use @pytest.mark.asyncio and pytest-asyncio for proper async isolation.
TypeScript Test Isolation
Uses Jest's parallel workers, so test files that exercise the FFI bindings run in separate workers.
Performance Considerations
Test Timeouts
- Go: 10 minutes (-timeout 10m in scripts/go/test.sh)
- Python: 120 seconds per test (configurable in pytest.ini)
- TypeScript: 30 seconds per test (set in the Jest configuration; Jest's built-in default is 5 seconds)
Parallel Execution
- Go: Default parallel execution (can override with -p 1)
- Python: pytest-xdist for parallel tests (pytest -n auto)
- TypeScript: Jest workers (default CPU count)
Related Skills
- extraction-pipeline-patterns - Understanding what tests validate
- ocr-backend-management - ONNX Runtime and Tesseract setup for OCR tests
- feature-flag-strategy - Conditional test execution based on available backends
- mcp-protocol-integration - Testing MCP server implementations
Summary Checklist
Before running tests:
- Built FFI library (cargo build --release --package kreuzberg-ffi)
- Set ORT_LIB_LOCATION if testing embeddings
- Using task <language>:test (NOT direct test commands)
- Verified environment setup (run with VERBOSE_MODE=true if unsure)
When tests fail in CI:
- Reproduce locally with task <language>:test
- Check environment variables match CI setup
- Verify dependencies are installed (ONNX Runtime, PDFium)
- Review test script logs for environment details
- Run single failing test in isolation for debugging