DuckDB Parquet Lab Workflow
Purpose
Standardize the pattern of loading Parquet files into DuckDB, inspecting schema, running SQL joins, and converting results to pandas DataFrames.
Usage
- "load Parquet with DuckDB and join tables"
- "describe DuckDB table schema"
- "convert DuckDB query to pandas"
Instructions
- Read Parquet data with `duckdb.query` or `duckdb.sql` using SQL strings.
- Inspect schema using `DESCRIBE SELECT * FROM <table>` and display with `.show()`.
- Use explicit joins with clear `LEFT` or `RIGHT` semantics to preserve row counts.
- Convert results to pandas with `.to_df()` for downstream modeling.
- Use `./templates/duckdb_snippets.md` for the standard SQL patterns.