Data Source Connect
Replace the demo unicorns data with user's own data source. Supports two modes:
- •Database URL - Connect to an existing PostgreSQL database
- •CSV File - Import a CSV file and create a new table
Mode 1: Database URL
When user provides a PostgreSQL connection URL:
Step 1: Update Environment
Update .env file:
POSTGRES_URL="<user-provided-connection-string>"
Step 2: Discover Schema
Connect to the database and retrieve schema information:
SELECT table_name, column_name, data_type, is_nullable FROM information_schema.columns WHERE table_schema = 'public' ORDER BY table_name, ordinal_position;
Step 3: Update Actions
Edit app/actions.ts - update the schema in these locations:
- •
generateQueryfunction (~line 18): Replace the unicorns schema in the system prompt with the new table schema. Include:- •Table name and column definitions
- •Data type hints (e.g., "valuation is in billions")
- •Available enum values for categorical fields
- •Query patterns specific to the data
- •
explainQueryfunction (~line 115): Replace the unicorns schema in the system prompt. - •
runGenerateSQLQueryfunction (~line 93): Update the table name in error handling.
Step 4: Generate Sample Prompts
Create 3-5 sample natural language queries for the new data to help users understand what they can ask.
Mode 2: CSV File
When user provides a CSV file:
Step 1: Analyze CSV Structure
Read the CSV file and analyze:
- •Column names (header row)
- •Data types (infer from sample values)
- •Null/empty value handling requirements
Step 2: Generate Schema
Create SQL schema based on CSV structure. Type inference rules:
- •Numbers with decimals ->
DECIMAL(10, 2) - •Integers ->
INTEGER - •Dates (various formats) ->
DATE - •Boolean values ->
BOOLEAN - •Everything else ->
VARCHAR(255)orTEXTfor long content
Step 3: Update seed.ts
Edit lib/seed.ts:
- •Update
CREATE TABLEstatement with new schema - •Update CSV parsing logic for new column names
- •Add type conversion for each column
- •Handle missing/null values with appropriate defaults or skip logic
Example seed.ts structure:
export async function seed() {
await pool.query(`
CREATE TABLE IF NOT EXISTS <table_name> (
id SERIAL PRIMARY KEY,
<column1> <type1>,
<column2> <type2>,
...
);
`);
// Parse CSV
const results: any[] = [];
const csvFilePath = path.join(process.cwd(), '<filename>.csv');
await new Promise((resolve, reject) => {
fs.createReadStream(csvFilePath)
.pipe(csv())
.on('data', (data) => results.push(data))
.on('end', resolve)
.on('error', reject);
});
// Insert with null handling
for (const row of results) {
// Skip rows with critical missing data or use defaults
const value1 = row['ColumnName'] || null;
await pool.query(
`INSERT INTO <table_name> (...) VALUES (...) ON CONFLICT DO NOTHING`,
[value1, ...]
);
}
}
Step 4: Update Actions
Same as Database URL Mode Step 3 - update schema references in app/actions.ts.
Step 5: Run Seed
pnpm run seed
Step 6: Generate Sample Prompts
Create 3-5 sample natural language queries for the new data to help users understand what they can ask. Examples should cover:
- •Basic filtering (e.g., "Show all records where X > Y")
- •Aggregations (e.g., "What is the total/average X by Y?")
- •Time-based queries if applicable (e.g., "Show trends over time")
- •Top/bottom queries (e.g., "Top 10 by X")
Critical Files to Update
| File | What to Update |
|---|---|
.env | POSTGRES_URL connection string |
lib/seed.ts | Table schema, CSV parsing, insert logic |
app/actions.ts | Schema in generateQuery and explainQuery prompts |
Schema Prompt Template
When updating the schema in app/actions.ts, use this template:
system: `You are a SQL (postgres) and data visualization expert. Your job is to help the user write a SQL query to retrieve the data they need. The table schema is as follows: <table_name> ( id SERIAL PRIMARY KEY, <column_name> <data_type> <constraints>, ... ); Only retrieval queries are allowed. [Add data-specific hints here, e.g.:] - Use ILIKE for case-insensitive text search - <field_name> values include: value1, value2, value3 - <numeric_field> is in <units> (e.g., thousands, millions) EVERY QUERY SHOULD RETURN QUANTITATIVE DATA THAT CAN BE PLOTTED ON A CHART! `,
Null Value Handling
For CSV imports with missing data:
- •Skip row: When critical identifier is missing
- •Default value: Use sensible defaults (0 for numbers, 'Unknown' for strings)
- •Allow null: For truly optional fields