Galaxy Tool Wrapping Expert
Expert knowledge for developing Galaxy tool wrappers. Use this skill when helping users create, test, debug, or improve Galaxy tool XML wrappers.
Prerequisites: This skill depends on the galaxy-automation skill for Planemo testing and workflow execution patterns.
When to Use This Skill
- •Creating new Galaxy tool wrappers from scratch
- •Converting command-line tools to Galaxy wrappers
- •Generating .shed.yml files for Tool Shed submission
- •Debugging XML syntax and validation errors
- •Writing Planemo tests for tools
- •Implementing conditional parameters and data types
- •Handling tool dependencies (conda, containers)
- •Creating tool collections and suites
- •Optimizing tool performance and resource allocation
- •Understanding Galaxy datatypes and formats
- •Implementing proper error handling
Core Concepts
Galaxy Tool XML Structure
A Galaxy tool wrapper consists of:
- •<tool>: root element with id, name, and version
- •<description>: brief tool description
- •<requirements>: dependencies (conda packages, containers)
- •<command>: the actual command-line execution
- •<inputs>: parameter definitions
- •<outputs>: output file specifications
- •<tests>: automated tests
- •<help>: documentation in reStructuredText
- •<citations>: DOI references
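As a quick illustration of the structure listed above, the following Python sketch parses a wrapper and reports which of those top-level sections are absent. The helper name `missing_sections` and the inline sample wrapper are hypothetical, and the check does not expand macros, so `planemo lint` remains the authoritative validation.

```python
import xml.etree.ElementTree as ET

# Child elements of <tool> listed above; the root itself is checked separately.
EXPECTED_SECTIONS = [
    "description", "requirements", "command", "inputs",
    "outputs", "tests", "help", "citations",
]


def missing_sections(tool_xml: str) -> list[str]:
    """Return the expected child elements that are absent from a <tool> document."""
    root = ET.fromstring(tool_xml)
    if root.tag != "tool":
        raise ValueError(f"root element is <{root.tag}>, expected <tool>")
    present = {child.tag for child in root}
    return [name for name in EXPECTED_SECTIONS if name not in present]


# Hypothetical, deliberately incomplete wrapper used only for illustration.
SAMPLE_TOOL = """<tool id="demo" name="Demo" version="1.0.0">
    <description>demo tool</description>
    <command><![CDATA[echo hello > '$output']]></command>
    <inputs/>
    <outputs><data name="output" format="txt"/></outputs>
</tool>"""

print(missing_sections(SAMPLE_TOOL))
```

For the sample above, the reported sections are requirements, tests, help, and citations.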
Tool Shed Metadata (.shed.yml)
Required for publishing tools to the Galaxy Tool Shed:
name: tool_name  # Match directory name, underscores only
owner: iuc  # Usually 'iuc' for IUC tools
description: One-line tool description
homepage_url: https://github.com/tool/repo
long_description: |
  Multi-line detailed description.
  Can include features, use cases, and tool suite contents.
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/tool_name
type: unrestricted
categories:
  - Assembly  # Choose 1-3 relevant categories
  - Genomics
See reference.md for comprehensive .shed.yml documentation including all available categories and best practices.
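The conventions in the example above can be checked mechanically before submission. The sketch below is a minimal illustration that works on an already-parsed mapping (for instance the result of loading the YAML file); the function name `shed_problems` is hypothetical, and the lowercase-only rule for `name` is an assumption about Tool Shed naming rather than something stated here.

```python
import re


def shed_problems(meta: dict) -> list[str]:
    """Check an already-parsed .shed.yml mapping against the conventions above."""
    problems = []
    name = meta.get("name", "")
    # Assumption: lowercase letters, digits, and underscores only (no hyphens).
    if not re.fullmatch(r"[a-z0-9_]+", name):
        problems.append(f"name {name!r} should use lowercase letters, digits, and underscores")
    for key in ("owner", "description"):
        if not meta.get(key):
            problems.append(f"missing {key}")
    categories = meta.get("categories") or []
    if not 1 <= len(categories) <= 3:
        problems.append("choose 1-3 categories")
    return problems


good = {"name": "tool_name", "owner": "iuc",
        "description": "One-line tool description",
        "categories": ["Assembly", "Genomics"]}
bad = {"name": "tool-name", "owner": "iuc",
       "description": "One-line tool description", "categories": []}

print(shed_problems(good))
print(shed_problems(bad))
```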
Key Components
Command Block:
- •Use Cheetah templating: $variable_name or ${variable_name}
- •Conditional logic: #if $param ... #end if
- •Loop constructs: #for $item in $collection ... #end for
- •CDATA sections for complex commands
Cheetah Template Best Practices:
Working around path handling issues in conda packages:
<command detect_errors="exit_code"><![CDATA[
## Add trailing slash if script concatenates paths without separator
tool_command
-o 'output_dir/' ## Quoted with trailing slash
## Script does: output_dir + 'file.txt' → 'output_dir/file.txt' ✓
## Without slash: output_dir + 'file.txt' → 'output_dirfile.txt' ✗
]]></command>
When to use quotes in Cheetah:
- •Always quote user inputs: '$input_file'
- •Quote literal strings with special chars: 'output_dir/'
- •Use bare variables for simple references: $variable
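To see why quoting matters, the sketch below uses Python's `shlex.split`, which approximates POSIX shell word splitting, on a rendered command line. The path `my reads.fa` is a hypothetical value containing a space; it is not taken from any real tool.

```python
import shlex

# Hypothetical rendered value containing a space.
path = "my reads.fa"

# The same command line as the shell would see it, without and with quotes.
unquoted = f"tool_command --input {path}"
quoted = f"tool_command --input '{path}'"

unquoted_args = shlex.split(unquoted)  # the path is split into two arguments
quoted_args = shlex.split(quoted)      # the path stays a single argument

print(unquoted_args)
print(quoted_args)
```

Without quotes the tool receives `my` and `reads.fa` as separate arguments; with single quotes it receives one argument.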
Input Parameters:
- •<param> elements with type, name, label
- •Types: text, integer, float, boolean, select, data, data_collection
- •Optional vs required parameters
- •Validators and sanitizers
- •Conditional parameter display
Outputs:
- •<data> elements for output files
- •Dynamic output naming with label and name
- •Format discovery and conversion
- •Filters for conditional outputs
- •Collections for multiple outputs
Tests:
- •Input parameters and files
- •Expected output files or assertions
- •Test data location and organization
Best Practices
- •Always include tests - Planemo won't pass without them
- •Use semantic versioning - Increment tool version on changes
- •Specify exact dependencies - Pin conda package versions
- •Add clear help text - Document all parameters
- •Handle errors gracefully - Check exit codes, validate inputs
- •Use collections - For multiple related files
- •Follow IUC standards - If contributing to the Intergalactic Utilities Commission (IUC)
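The "specify exact dependencies" item lends itself to an automated check. The following sketch lists package requirements that carry no version attribute; the function name `unpinned_requirements` and the sample wrapper (including `other_package`) are hypothetical.

```python
import xml.etree.ElementTree as ET


def unpinned_requirements(tool_xml: str) -> list[str]:
    """Return package names whose <requirement> lacks a version attribute."""
    root = ET.fromstring(tool_xml)
    return [
        (req.text or "").strip()
        for req in root.iter("requirement")
        if req.get("type") == "package" and not req.get("version")
    ]


# Hypothetical wrapper fragment: the second requirement is not pinned.
SAMPLE_TOOL = """<tool id="demo" name="Demo" version="1.0.0">
    <requirements>
        <requirement type="package" version="1.0">package_name</requirement>
        <requirement type="package">other_package</requirement>
    </requirements>
</tool>"""

print(unpinned_requirements(SAMPLE_TOOL))
```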
Common Planemo Commands
# Test tool locally
planemo test tool.xml
# Serve tool in local Galaxy
planemo serve tool.xml
# Lint tool for best practices
planemo lint tool.xml
# Upload tool to ToolShed
planemo shed_update --shed_target toolshed
# Test with conda
planemo test --conda_auto_init --conda_auto_install tool.xml
Testing Tools
Regenerating Expected Test Outputs
When test files don't match but the tool runs correctly:
# Run the tool manually with test inputs
mkdir -p output_dir
/path/to/conda/env/bin/tool_command \
-i test-data/input.fa \
-o output_dir
# Copy to expected output
cp output_dir/output.fa test-data/expected_output.fa
# Clean up
rm -rf output_dir
Verifying before regenerating:
- •Check that tool exit code is 0 (successful)
- •Inspect the actual output to ensure it's correct
- •Compare line counts: wc -l expected.fa actual.fa
- •Review diffs to understand what changed
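The line-count and diff checks above can be combined into one helper. This is a minimal sketch assuming FASTA-style text, where records start with `>`; the function name `summarize_difference` and the sample sequences are hypothetical.

```python
import difflib


def summarize_difference(expected: str, actual: str, max_diff_lines: int = 20) -> dict:
    """Compare line counts, FASTA record counts, and a truncated unified diff."""
    exp_lines = expected.splitlines()
    act_lines = actual.splitlines()
    diff = list(difflib.unified_diff(exp_lines, act_lines,
                                     "expected", "actual", lineterm=""))
    return {
        "expected_lines": len(exp_lines),
        "actual_lines": len(act_lines),
        # Assumption: a line starting with '>' opens a FASTA record.
        "expected_records": sum(line.startswith(">") for line in exp_lines),
        "actual_records": sum(line.startswith(">") for line in act_lines),
        "diff": diff[:max_diff_lines],
    }


# Hypothetical case: the expected file holds only a subset of the sequences.
expected_text = ">seq1\nACGT\n"
actual_text = ">seq1\nACGT\n>seq2\nGGCC\n"

summary = summarize_difference(expected_text, actual_text)
print(summary["expected_records"], summary["actual_records"])
print("\n".join(summary["diff"]))
```

A lower record count in the expected file than in the actual output points to the "subset of sequences" case listed under the reasons to regenerate.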
Common reasons to regenerate:
- •Test was created before tool updates
- •Expected file only has subset of sequences (bug in test creation)
- •Format changes in newer tool versions
Common Issues and Solutions
Issue: "Command not found"
- •Check <requirements> section has correct package
- •Verify conda package name and version
- •Test command availability: planemo conda_install tool.xml
Issue: "Output file not found"
- •Verify command actually creates the file
- •Check output file path matches <data name="output" from_work_dir="...">
- •Use discover_datasets for dynamic outputs
Issue: "Test failed"
- •Compare expected vs actual output
- •Check for whitespace/newline differences
- •Use compare="sim_size" (with a delta) for approximate size matching
- •Add lines_diff to allow a fixed number of lines to differ
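Before loosening a test, it helps to know whether a mismatch is only whitespace. The sketch below classifies a mismatch as identical, whitespace-only, or a real content change. The function name `classify_mismatch` is hypothetical, and the normalisation rules are an illustrative choice, not how Galaxy compares files.

```python
def classify_mismatch(expected: str, actual: str) -> str:
    """Classify a mismatch as 'identical', 'whitespace-only', or 'content'."""
    if expected == actual:
        return "identical"

    def normalise(text: str) -> list[str]:
        # Unify line endings, strip trailing whitespace, drop trailing blank lines.
        lines = [line.rstrip() for line in text.replace("\r\n", "\n").split("\n")]
        while lines and lines[-1] == "":
            lines.pop()
        return lines

    if normalise(expected) == normalise(actual):
        return "whitespace-only"
    return "content"


print(classify_mismatch("a\nb\n", "a\nb"))        # missing final newline
print(classify_mismatch("a\r\nb\r\n", "a\nb\n"))  # Windows vs Unix line endings
print(classify_mismatch("a\nb\n", "a\nc\n"))      # genuine content change
```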
Issue: "Invalid XML"
- •Run planemo lint tool.xml
- •Check closing tags match opening tags
- •Validate CDATA sections for command blocks
- •Ensure proper escaping of special characters
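For a fast well-formedness check that reports where parsing stops, Python's standard XML parser is enough. The sketch below is illustrative only: `xml_error` is a hypothetical helper, and the broken sample shows the common case of an unescaped `&` in a command block that is not wrapped in CDATA.

```python
import xml.etree.ElementTree as ET


def xml_error(tool_xml: str):
    """Return (line, column, message) for the first well-formedness error, or None."""
    try:
        ET.fromstring(tool_xml)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


# Hypothetical broken wrapper: bare '&&' outside a CDATA section.
BROKEN = """<tool id="demo" name="Demo" version="1.0.0">
    <command>tool_a input.txt && tool_b</command>
</tool>"""

# Same command wrapped in CDATA, which makes the document well-formed.
FIXED = """<tool id="demo" name="Demo" version="1.0.0">
    <command><![CDATA[tool_a input.txt && tool_b]]></command>
</tool>"""

print(xml_error(BROKEN))
print(xml_error(FIXED))
```

This only checks that the XML parses; it does not validate the file against the Galaxy tool schema, which is what `planemo lint` is for.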
Debugging Tool Test Failures
General Workflow
- •Read the test output JSON first
  cat tool_test_output.json
  Look for:
  - •Exit codes and error messages in stderr/stdout
  - •output_problems array for test assertion failures
  - •Actual vs expected output differences
- •Never copy/modify conda package scripts
  - •Tool wrappers should ALWAYS use conda packages
  - •If there are bugs in the conda package scripts, work around them in the XML wrapper
  - •Common workaround: Add trailing slashes to paths if script concatenates without separators
- •Wrong test expectations vs bugs
  - •If tests fail but the tool runs successfully (exit code 0), check if expected test files are wrong
  - •Regenerate expected outputs by running the tool manually with test inputs
  - •Update expect_num_outputs if optional outputs are created
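Reading the JSON by eye gets tedious for tools with many tests, so the first step of the workflow above can be scripted. The sketch below assumes a report layout with a top-level `tests` list whose entries hold `data.status`, `data.job` (with `exit_code` and `stderr`), and `data.output_problems`. That layout is an assumption to verify against your own tool_test_output.json, and the helper names and sample report are hypothetical.

```python
import json


def load_report(path: str) -> dict:
    """Load a Planemo test report from disk."""
    with open(path) as handle:
        return json.load(handle)


def failing_tests(report: dict) -> list[dict]:
    """Collect status, exit code, stderr tail, and output_problems for non-passing tests."""
    failures = []
    for test in report.get("tests", []):
        data = test.get("data", {})
        if data.get("status") == "success":
            continue
        job = data.get("job", {})
        failures.append({
            "id": test.get("id"),
            "status": data.get("status"),
            "exit_code": job.get("exit_code"),
            # Keep only the last few stderr lines, where the error usually is.
            "stderr_tail": (job.get("stderr") or "").strip().splitlines()[-5:],
            "output_problems": data.get("output_problems", []),
        })
    return failures


# Hypothetical report with one passing and one failing test (assumed layout).
SAMPLE_REPORT = {
    "tests": [
        {"id": "tool_id-0", "data": {"status": "success", "job": {"exit_code": 0}}},
        {"id": "tool_id-1", "data": {
            "status": "failure",
            "job": {"exit_code": 1, "stderr": "Traceback\nFileNotFoundError: output_dirfile.txt\n"},
            "output_problems": ["Output output: file not found"],
        }},
    ]
}

for failure in failing_tests(SAMPLE_REPORT):
    print(failure["id"], failure["exit_code"], failure["output_problems"])
```

With a real report, `failing_tests(load_report("tool_test_output.json"))` gives the same summary.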
Common Issues and Fixes
Path concatenation bugs in Python scripts:
<!-- If script does: args.output_dir + 'file.txt' without '/' -->
<!-- Fix in wrapper with trailing slash: -->
-o 'output_dir/'  <!-- instead of -o output_dir -->
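The behaviour behind this workaround is easy to reproduce. The sketch below imitates a script that concatenates strings to build its output path and contrasts it with `os.path.join`; both function names are hypothetical and do not come from any particular conda package.

```python
import os


def buggy_output_path(output_dir: str, filename: str) -> str:
    # Mirrors the pattern described above: plain concatenation, no separator.
    return output_dir + filename


def safe_output_path(output_dir: str, filename: str) -> str:
    # What the upstream script would ideally do instead.
    return os.path.join(output_dir, filename)


print(buggy_output_path("output_dir", "file.txt"))   # separator is lost
print(buggy_output_path("output_dir/", "file.txt"))  # trailing slash rescues it
print(safe_output_path("output_dir/", "file.txt"))   # join also accepts the slash
```

Because the wrapper cannot change the packaged script, passing the directory with a trailing slash is the least invasive fix, and it stays correct if upstream later switches to a proper join.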
Wrong number of expected outputs:
<!-- Check if optional outputs are always created -->
<test expect_num_outputs="3">  <!-- Update count -->
Output has extra sequences/data:
- •First check if this is expected behavior
- •Regenerate expected test files from actual tool output
- •Don't add post-processing filters unless absolutely necessary
XML Template Example
<tool id="tool_id" name="Tool Name" version="1.0.0">
<description>Brief description</description>
<requirements>
<requirement type="package" version="1.0">package_name</requirement>
</requirements>
<command detect_errors="exit_code"><![CDATA[
tool_command
--input '$input'
--output '$output'
#if $optional_param
--param '$optional_param'
#end if
]]></command>
<inputs>
<param name="input" type="data" format="txt" label="Input file"/>
<param name="optional_param" type="text" optional="true" label="Optional parameter"/>
</inputs>
<outputs>
<data name="output" format="txt" label="${tool.name} on ${on_string}"/>
</outputs>
<tests>
<test>
<param name="input" value="test_input.txt"/>
<output name="output" file="expected_output.txt"/>
</test>
</tests>
<help><![CDATA[
**What it does**

Describe what the tool does.

**Inputs**

- Input file: description

**Outputs**

- Output file: description
]]></help>
<citations>
<citation type="doi">10.1234/example.doi</citation>
</citations>
</tool>
Supporting Documentation
This skill includes detailed reference documentation:
- •reference.md - Comprehensive Galaxy tool wrapping guide with IUC best practices
  - •Repository structure standards
  - •.shed.yml configuration
  - •Complete XML structure reference
  - •Advanced features and patterns
- •troubleshooting.md - Practical troubleshooting guide
  - •Reading tool_test_output.json
  - •Common exit codes and their meanings
  - •Solutions for frequent issues
  - •Test failure diagnosis
- •dependency-debugging.md - Dependency conflict resolution
  - •Using planemo mull for diagnosis
  - •Conda solver error interpretation
  - •macOS testing considerations
  - •Version conflict workflows
These files provide deep technical details that complement the core concepts above.
Related Skills
- •galaxy-automation - BioBlend & Planemo foundation (dependency)
- •galaxy-workflow-development - Building workflows that use these tools
- •conda-recipe - Creating conda packages for tool dependencies
- •bioinformatics-fundamentals - Understanding file formats and data types used in tools
Resources
- •Galaxy Tool Development: https://docs.galaxyproject.org/en/latest/dev/
- •Planemo Documentation: https://planemo.readthedocs.io/
- •IUC Standards: https://galaxy-iuc-standards.readthedocs.io/
- •Galaxy Training: https://training.galaxyproject.org/