BERDL Database Discovery Skill
This skill introspects BERDL databases via API and generates documentation modules for use with the berdl skill.
Discovery Workflow
When discovering a new database, follow these steps in order:
Step 1: Authenticate
AUTH_TOKEN=$(grep "KB_AUTH_TOKEN" .env | cut -d'"' -f2)
Step 2: List All Databases
curl -s -X POST \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"use_hms": true, "filter_by_namespace": true}' \
https://hub.berdl.kbase.us/apis/mcp/delta/databases/list | python3 -m json.tool
Step 3: List Tables in Target Database
curl -s -X POST \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"database": "DATABASE_NAME", "use_hms": true}' \
https://hub.berdl.kbase.us/apis/mcp/delta/databases/tables/list | python3 -m json.tool
Step 4: Get Schema for Each Table
curl -s -X POST \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"database": "DATABASE_NAME", "table": "TABLE_NAME"}' \
https://hub.berdl.kbase.us/apis/mcp/delta/databases/tables/schema | python3 -m json.tool
Step 5: Get Row Counts
curl -s -X POST \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"database": "DATABASE_NAME", "table": "TABLE_NAME"}' \
https://hub.berdl.kbase.us/apis/mcp/delta/tables/count | python3 -m json.tool
Step 6: Sample Data
curl -s -X POST \
-H "Authorization: Bearer $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{"database": "DATABASE_NAME", "table": "TABLE_NAME", "limit": 5}' \
https://hub.berdl.kbase.us/apis/mcp/delta/tables/sample | python3 -m json.tool
Step 7: Identify Relationships
Look for foreign key patterns:
- •Columns ending in
_idthat match other table names - •Columns with consistent naming across tables
- •Sample data to confirm relationship patterns
Output: Module File Template
After discovery, generate a module file at .claude/skills/berdl/modules/{database_short_name}.md:
# {Database Name} Module
## Overview
{Brief description of what this database contains}
**Database**: `{database_name}`
**Generated**: {YYYY-MM-DD}
## Tables
| Table | Rows | Description |
|-------|------|-------------|
| `table_name` | {count} | {description inferred from columns/data} |
## Key Table Schemas
### {table_name}
| Column | Type | Description |
|--------|------|-------------|
| `column_name` | {type} | {description} |
## Table Relationships
{Describe foreign key relationships discovered}
- `table1.column` -> `table2.column`
## Common Query Patterns
### {Pattern Name}
{Brief description of what this query does}
```sql
SELECT ...
FROM {database}.{table}
...
Pitfalls
{Any gotchas, NULL handling, performance notes discovered during exploration}
## Output: collections.yaml Entry Template
If the user wants to add the database to the UI, offer to update `ui/config/collections.yaml`:
```yaml
{database_short_name}:
name: "{Human Readable Name}"
description: "{Description}"
database: "{database_name}"
tables:
- name: "{table_name}"
description: "{description}"
Instructions for Claude
When the user invokes /berdl-discover:
- •Ask which database to discover (or list available databases if unknown)
- •Execute discovery workflow steps 1-7, collecting:
- •Table list
- •Schema for each table
- •Row counts for key tables
- •Sample data (2-5 rows per table)
- •Analyze relationships by:
- •Identifying
*_idcolumns - •Matching column names across tables
- •Confirming with sample data
- •Identifying
- •Generate module file using the template above
- •Write the file to
.claude/skills/berdl/modules/{name}.md - •Offer to update
ui/config/collections.yamlif applicable
Example Session
User: "Discover the kescience_fitnessbrowser database" Claude: 1. Lists tables in kescience_fitnessbrowser -> finds: experiments, genes, fitness_scores, conditions 2. Gets schema for each table 3. Gets row counts: experiments(500), genes(50000), fitness_scores(2M), conditions(1000) 4. Samples data to understand structure 5. Identifies relationships: fitness_scores.gene_id -> genes.id, fitness_scores.experiment_id -> experiments.id 6. Generates .claude/skills/berdl/modules/fitness.md with: - Table overview - Schema details - Query patterns for fitness analysis - Pitfalls (e.g., "fitness_scores table is large, always filter by experiment_id") 7. Asks: "Would you like me to add this to ui/config/collections.yaml?"
Error Handling
- •Authentication failure: Check
.envfile exists and contains validKB_AUTH_TOKEN - •Database not found: List available databases and confirm spelling
- •Timeout on large tables: Skip row counts for tables > 100M rows, note in pitfalls
- •Schema unavailable: Mark table as "schema pending" and note in output
Pitfall Detection
When you encounter errors, unexpected results, retry cycles, performance issues, or data surprises during this task, follow the pitfall-capture protocol. Read .claude/skills/pitfall-capture/SKILL.md and follow its instructions to determine whether the issue should be added to docs/pitfalls.md.