Lobster AI Development Guide
Lobster AI is a multi-agent bioinformatics platform using LangGraph for orchestration. This skill teaches you how to work with, extend, and contribute to the codebase.
Quick Navigation
| Task | Documentation |
|---|---|
| Planning new capabilities | references/planning-workflow.md |
| Domain knowledge (bioSkills) | references/bioskills-bridge.md |
| Architecture overview | references/architecture.md |
| Plugin architecture (omics types, providers, adapters) | references/plugin-architecture.md |
| Creating new agents | references/creating-agents.md |
| Creating new services | references/creating-services.md |
| Code layout & finding files | references/code-layout.md |
| Testing patterns | references/testing.md |
| CLI reference | references/cli.md |
Before You Build
STOP. Before creating any new agent, service, or package, follow this workflow.
| Phase | Purpose |
|---|---|
| 1. Understand Need | Structured Q&A -- what domain, workflow, tools, data formats |
| 2. Check What Exists | Dynamically scan Lobster packages + core services for overlap |
| 3. Find Domain Knowledge | Dynamically discover relevant bioSkills for the domain |
| 4. Present Findings | Show developer what exists vs. what's missing |
| 5. Recommend Approach | Extend existing vs. new package vs. service-only vs. not Lobster |
| 6. Build & Test | Apply lobster-dev patterns with domain knowledge |
Full workflow details: references/planning-workflow.md
Skip this if: fixing a bug, adding a tool to an existing agent, or working on core infrastructure. This workflow is for NEW capabilities only.
Critical Rules
- ComponentRegistry is truth: agents are discovered via entry points, NOT hardcoded
- AGENT_CONFIG at module top: define it before heavy imports for <50ms discovery
- Services return a 3-tuple: always (AnnData, Dict, AnalysisStep)
- Always pass ir: log_tool_usage(..., ir=ir) for reproducibility
- No lobster/__init__.py: PEP 420 namespace package
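The first rule can be illustrated with the standard library alone. The sketch below lists whatever is registered under the lobster.agents entry-point group in the current environment. It is not ComponentRegistry's actual implementation, and the helper name discover_agent_entry_points is hypothetical.

```python
from importlib.metadata import entry_points


def discover_agent_entry_points(group: str = "lobster.agents") -> dict:
    """Map entry-point name -> 'module:attr' target for one group.

    Hypothetical helper: it only reads installed package metadata and
    does not import any agent module.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        selected = eps.select(group=group)  # Python 3.10+ selection API
    else:
        selected = eps.get(group, [])       # Python 3.8/3.9 dict-style API
    return {ep.name: ep.value for ep in selected}


if __name__ == "__main__":
    for name, target in sorted(discover_agent_entry_points().items()):
        print(f"{name} -> {target}")
```

Because nothing is imported at this stage, listing agents stays cheap; an agent module would only be loaded when its entry point is actually resolved.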
Package Structure
lobster/
├── packages/                       # Agent packages (PEP 420)
│   ├── lobster-transcriptomics/    # transcriptomics_expert, annotation_expert, de_analysis_expert
│   ├── lobster-research/           # research_agent, data_expert_agent
│   ├── lobster-visualization/      # visualization_expert
│   ├── lobster-metadata/           # metadata_assistant
│   ├── lobster-structural-viz/     # protein_structure_visualization_expert
│   ├── lobster-genomics/           # genomics_expert
│   ├── lobster-proteomics/         # proteomics_expert
│   └── lobster-ml/                 # machine_learning_expert
└── lobster/                        # Core SDK
    ├── agents/supervisor.py        # Supervisor (stays in core)
    ├── agents/graph.py             # LangGraph builder
    ├── core/                       # Infrastructure (registry, data_manager, provenance)
    ├── services/                   # Analysis services
    └── tools/                      # Agent tools
Quick Commands
# Setup (development)
make dev-install   # Full dev setup with editable install
make test          # Run all tests
make format        # black + isort

# Setup (end-user testing via uv tool)
uv tool install 'lobster-ai[full,anthropic]'   # Install as users see it
uv tool upgrade lobster-ai                     # Upgrade to latest

# Running
lobster chat                   # Interactive mode
lobster query "your request"   # Single-turn

# Testing
pytest tests/unit/          # Fast unit tests
pytest tests/integration/   # Integration tests
Service Pattern (Essential)
All services return a 3-tuple:
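def analyze(self, adata, **params) -> Tuple[AnnData, Dict, AnalysisStep]:
    # Your analysis logic; work on a copy so the caller's object is left untouched
    processed_adata = adata.copy()
    stats = {"n_cells": processed_adata.n_obs, "status": "complete"}
    # Record what ran so the step can be reproduced later
    ir = AnalysisStep(
        activity_type="analyze",
        inputs={"n_obs": adata.n_obs},
        outputs=stats,
        params=params,
    )
    return processed_adata, stats, ir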
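To try the 3-tuple contract without installing Lobster or anndata, the following sketch uses stand-in types. AnalysisStepSketch, FakeAnnData, ThresholdService, and assert_three_tuple are all hypothetical names; the step fields simply mirror the four shown in the example above and may not match the real AnalysisStep.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class AnalysisStepSketch:
    """Stand-in for AnalysisStep; fields mirror the example above."""
    activity_type: str
    inputs: Dict[str, Any]
    outputs: Dict[str, Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class FakeAnnData:
    """Stand-in for anndata.AnnData that exposes only n_obs."""
    n_obs: int


class ThresholdService:
    """Hypothetical service that keeps at most `max_cells` observations."""

    def analyze(
        self, adata: FakeAnnData, **params: Any
    ) -> Tuple[FakeAnnData, Dict[str, Any], AnalysisStepSketch]:
        max_cells = params.get("max_cells", adata.n_obs)
        processed = FakeAnnData(n_obs=min(adata.n_obs, max_cells))
        stats = {"n_cells": processed.n_obs, "status": "complete"}
        ir = AnalysisStepSketch(
            activity_type="analyze",
            inputs={"n_obs": adata.n_obs},
            outputs=stats,
            params=params,
        )
        return processed, stats, ir


def assert_three_tuple(result: Any) -> None:
    """Fail loudly if a service result breaks the 3-tuple contract."""
    assert isinstance(result, tuple) and len(result) == 3
    _, stats, ir = result
    assert isinstance(stats, dict)
    assert isinstance(ir, AnalysisStepSketch)
```

A check like assert_three_tuple could sit in a unit test so that a service returning only the data object fails early.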
Tools wrap services:
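@tool
def analyze_modality(modality_name: str, **params) -> str:
    # `service`, `data_manager`, and `adata` (the data loaded for
    # `modality_name`) are assumed to come from the enclosing agent factory.
    result, stats, ir = service.analyze(adata, **params)
    data_manager.log_tool_usage("analyze", params, stats, ir=ir)  # IR mandatory!
    return f"Complete: {stats}"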
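One way to see why the ir argument matters is a stand-in data manager that refuses to log without it. This is only a sketch: SketchDataManager and run_tool are hypothetical, and making ir keyword-only is a choice made here for illustration, not a statement about the real log_tool_usage signature.

```python
from typing import Any, Dict, List


class SketchDataManager:
    """Stand-in data manager; records calls in memory instead of persisting them."""

    def __init__(self) -> None:
        self.provenance: List[Dict[str, Any]] = []

    def log_tool_usage(
        self,
        tool_name: str,
        params: Dict[str, Any],
        stats: Dict[str, Any],
        *,
        ir: Any,
    ) -> None:
        # `ir` is keyword-only with no default, so omitting it raises TypeError.
        if ir is None:
            raise ValueError("ir is required for reproducibility")
        self.provenance.append(
            {"tool": tool_name, "params": params, "stats": stats, "ir": ir}
        )


def run_tool(data_manager: SketchDataManager, n_obs: int, **params: Any) -> str:
    """Hypothetical tool body: compute stats, log them with ir, return a string."""
    stats = {"n_cells": n_obs, "status": "complete"}
    ir = {
        "activity_type": "analyze",
        "inputs": {"n_obs": n_obs},
        "outputs": stats,
        "params": params,
    }
    data_manager.log_tool_usage("analyze", params, stats, ir=ir)
    return f"Complete: {stats}"
```

With this shape, every entry in provenance carries the step description needed to replay it, and a call that forgets ir fails immediately rather than silently producing an unreproducible log.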
Agent Registration (Entry Points)
Agents register via pyproject.toml:
[project.entry-points."lobster.agents"]
my_agent = "lobster.agents.mydomain.my_agent:AGENT_CONFIG"
AGENT_CONFIG must be defined at the top of the module, before any heavy imports; only the lightweight AgentRegistryConfig import comes first:
# lobster/agents/mydomain/my_agent.py
from lobster.config.agent_registry import AgentRegistryConfig
AGENT_CONFIG = AgentRegistryConfig(
    name="my_agent",
    display_name="My Expert Agent",
    description="What this agent does",
    factory_function="lobster.agents.mydomain.my_agent.my_agent",
    handoff_tool_name="handoff_to_my_agent",
    handoff_tool_description="Assign tasks for my domain analysis",
    tier_requirement="free",  # All official agents are free
)
# Heavy imports AFTER config
from lobster.core.data_manager_v2 import DataManagerV2
# ... rest of implementation
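Since factory_function is given as a dotted string rather than a callable, the factory can be imported only when the agent is actually needed. The sketch below shows one generic way to resolve such a path on demand. resolve_factory is a hypothetical helper, and whether Lobster resolves the string this way is an assumption.

```python
import importlib
from typing import Any, Callable


def resolve_factory(dotted_path: str) -> Callable[..., Any]:
    """Import 'pkg.module.attr' lazily and return the named attribute."""
    module_path, _, attr_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.attr', got {dotted_path!r}")
    # The module (and its heavy imports) is loaded here, not at discovery time.
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)


if __name__ == "__main__":
    # Demonstrated with a standard-library function so it runs anywhere.
    dumps = resolve_factory("json.dumps")
    print(dumps({"agent": "my_agent"}))
```

Keeping the config as plain strings is what lets discovery read AGENT_CONFIG cheaply while deferring the expensive imports to the first real use.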
Key Files
| File | Purpose |
|---|---|
| lobster/agents/graph.py | LangGraph orchestration |
| lobster/core/component_registry.py | Agent + plugin discovery (7 entry point groups) |
| lobster/core/omics_registry.py | Omics type metadata, DataTypeDetector |
| lobster/core/data_manager_v2.py | Data/workspace management |
| lobster/core/provenance.py | W3C-PROV tracking |
| lobster/cli.py | CLI implementation |
Online Documentation
Full documentation at docs.omics-os.com (or local docs-site/):
- Getting Started: docs/getting-started/
- Core SDK: docs/core/
- Agents: docs/agents/
- Developer Guide: docs/developer/
- API Reference: docs/api-reference/
Common Tasks
Adding a New Agent
- Create package: packages/lobster-mydomain/
- Define AGENT_CONFIG at top of agent file
- Register entry point in pyproject.toml
- Implement agent with tools
- Add tests in tests/unit/agents/
See references/creating-agents.md for full guide.
Adding a New Service
- Create service class in appropriate package
- Implement 3-tuple return pattern
- Wrap in tool with log_tool_usage
- Add unit tests
See references/creating-services.md for full guide.
Adding a New Omics Type (Plugin)
- Define OmicsTypeConfig with detection keywords, preferred databases, QC thresholds
- Create adapter factory functions → register via lobster.adapters entry point
- Create provider class (if new database) → register via lobster.providers entry point
- Create download service + queue preparer → register via entry points
- Register OmicsTypeConfig → lobster.omics_types entry point
- Zero core changes needed: everything goes through pyproject.toml entry points
See references/plugin-architecture.md for full guide with code examples.
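To make the idea of detection keywords concrete, here is a toy keyword-scoring detector. It is purely illustrative: SketchOmicsType and detect_omics_type are hypothetical and do not reflect the real OmicsTypeConfig fields or how DataTypeDetector works.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class SketchOmicsType:
    """Hypothetical, stripped-down omics type: a name plus detection keywords."""
    name: str
    detection_keywords: Tuple[str, ...]


def detect_omics_type(text: str, types: List[SketchOmicsType]) -> Optional[str]:
    """Return the type whose keywords occur in `text` most often, or None."""
    lowered = text.lower()
    # Score each type by how many of its keywords appear as substrings.
    scores: Dict[str, int] = {
        t.name: sum(kw.lower() in lowered for kw in t.detection_keywords)
        for t in types
    }
    best = max(scores, key=scores.get, default=None)
    return best if best is not None and scores[best] > 0 else None


if __name__ == "__main__":
    registered = [
        SketchOmicsType("transcriptomics", ("rna-seq", "scrna", "transcriptome")),
        SketchOmicsType("proteomics", ("mass spec", "peptide", "proteome")),
    ]
    print(detect_omics_type("Bulk RNA-seq of mouse liver", registered))
```

In a plugin setup, the list of registered types would come from entry points rather than being written inline, which is what allows a new omics type to be added without touching the core.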
Understanding Data Flow
User Query → CLI → LobsterClientAdapter → AgentClient
                                              ↓
                              LangGraph (supervisor → agents)
                                              ↓
                                  Services → DataManagerV2
                                              ↓
                                    Results + Provenance
Testing
# Unit tests (fast, no external deps)
pytest tests/unit/ -v

# Integration tests (may need env vars)
pytest tests/integration/ -v

# Specific test
pytest tests/unit/test_my_feature.py -v

# With coverage
pytest --cov=lobster tests/
Contributing
- Fork the repository
- Create feature branch: git checkout -b feature/my-feature
- Make changes following patterns above
- Run tests: make test
- Format code: make format
- Submit PR with clear description