Text-to-SQL Skill
Convert natural language questions into SQL queries and execute them against SQL databases.
Phase 1: Project Setup
Step 1: Ask about database connection
Ask user which database type they want to use:
Option A: SQLite (file-based, no credentials needed)
- •User provides path to
.sqliteor.dbfile - •Or places file in
database/folder
Option B: Server database (PostgreSQL, MySQL, MariaDB, etc.)
- •User creates
.envfile with connection details - •Supported: PostgreSQL, MySQL, MariaDB, and other SQL databases
Step 2: Initialize project structure
Run the init script OR manually create structure:
Option A: Use init script
python scripts/init_project.py --target /path/to/project
Option B: Manual setup
mkdir -p database output/queries output/reports
Copy from skill folders to project root:
- •
scripts/*.py→ project root (db_extractor.py, query_runner.py, list_databases.py, sql_helper.py) - •
assets/example.env→ project root - •
assets/requirements.txt→ project root - •
assets/.gitignore→ project root
Install dependencies:
pip install -r requirements.txt
Step 3: Configure connection
For SQLite:
# Place database file cp /path/to/database.sqlite database/ # Extract schema python db_extractor.py --sqlite database/YOUR_DB.sqlite
For server databases (PostgreSQL, MySQL, etc.):
Copy and edit the template:
cp example.env .env # Edit .env with actual credentials
The example.env template contains:
DB_TYPE=postgresql # postgresql, mysql, mariadb DB_HOST=localhost DB_PORT=5432 # 5432 for PostgreSQL, 3306 for MySQL DB_USER=your_username DB_PASSWORD=your_password DB_NAME=your_database_name
Then extract schema:
python db_extractor.py --database your_database_name
Step 4: Verify setup
After extraction, these files should exist in output/:
- •
connection.json- current connection config - •
text_to_sql_context.md- schema for LLM queries - •
schema_info.json- full schema data - •
database_documentation.md- human-readable docs
Phase 2: Query Workflow
When user asks a data question:
Step 1: Read schema context
Read output/text_to_sql_context.md to understand:
- •Available tables and columns
- •Data types and relationships
- •Enum values for filtering
Step 2: Generate and save SQL
Create SQL file based on user question. See sql_patterns.md for common query patterns.
# Save to output/queries/descriptive_name.sql
Step 3: Execute query
Get run command from output/connection.json, then:
# SQLite example python query_runner.py --sqlite database/DB.sqlite -f output/queries/query.sql -o result.csv # MySQL example python query_runner.py -f output/queries/query.sql -o result.csv
Step 4: Report results
Tell user: "Results saved to output/reports/result.csv"
Quick Reference
Commands
# List databases python list_databases.py # Extract schema (SQLite) python db_extractor.py --sqlite database/file.sqlite # Extract schema (MySQL) python db_extractor.py --database db_name # Run query (SQLite) python query_runner.py --sqlite database/file.sqlite "SELECT * FROM table LIMIT 10" python query_runner.py --sqlite database/file.sqlite -f query.sql -o result.csv # Run query (MySQL) python query_runner.py "SELECT * FROM table LIMIT 10" python query_runner.py -f query.sql -o result.csv # Output formats --format csv # default --format xlsx # Excel --format json # JSON --format md # Markdown
Project Structure
project/ ├── .env # MySQL credentials (if using MySQL) ├── database/ # SQLite files go here │ └── your_db.sqlite ├── output/ │ ├── connection.json # Current DB connection │ ├── text_to_sql_context.md # Schema for LLM │ ├── queries/ # Saved SQL queries │ └── reports/ # Query results (CSV, XLSX, JSON) ├── db_extractor.py ├── query_runner.py ├── list_databases.py └── sql_helper.py
Example Workflow
User: "I have a SQLite database with e-commerce data. Help me analyze it."
Setup:
- •Ask user for SQLite file path
- •Copy file to
database/ - •Run
python db_extractor.py --sqlite database/file.sqlite - •Read generated
output/text_to_sql_context.md
User: "Show me top 10 sellers by revenue"
Query:
- •Read schema from
output/text_to_sql_context.md - •Generate SQL:
sql
SELECT seller_id, SUM(price) as revenue FROM order_items GROUP BY seller_id ORDER BY revenue DESC LIMIT 10;
- •Save to
output/queries/top_sellers.sql - •Execute:
python query_runner.py --sqlite database/file.sqlite -f output/queries/top_sellers.sql -o top_sellers.csv - •Report: "Results saved to
output/reports/top_sellers.csv"