Databricks SQL API Skill
Expert knowledge for building .NET applications that integrate with the Databricks SQL Statement Execution API.
API Patterns
Statement Execution Flow
- POST `/api/2.0/sql/statements` with SQL query, warehouse ID, and `ARROW_STREAM` format
- Poll the statement status until `SUCCEEDED`, `FAILED`, or `CANCELED`
- Download Arrow IPC chunks from external links in parallel
- Deserialize `RecordBatch` objects with `ArrowStreamReader`
Request format
```json
{
  "warehouse_id": "<warehouse-id>",
  "statement": "SELECT * FROM catalog.schema.table",
  "format": "ARROW_STREAM",
  "disposition": "EXTERNAL_LINKS",
  "wait_timeout": "0s"
}
```
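The request body above could be assembled in C# roughly as follows. This is a minimal sketch using `System.Text.Json.Nodes`; `BuildStatementRequest` is a hypothetical helper name, not part of any Databricks SDK.

```csharp
using System;
using System.Text.Json.Nodes;

// Builds the JSON body for POST /api/2.0/sql/statements.
// BuildStatementRequest is a hypothetical helper used only for illustration.
static string BuildStatementRequest(string warehouseId, string sql)
{
    var body = new JsonObject
    {
        ["warehouse_id"] = warehouseId,
        ["statement"] = sql,
        ["format"] = "ARROW_STREAM",
        ["disposition"] = "EXTERNAL_LINKS",
        ["wait_timeout"] = "0s" // return immediately; the caller polls for status
    };
    return body.ToJsonString();
}

string requestJson = BuildStatementRequest("<warehouse-id>", "SELECT * FROM catalog.schema.table");
Console.WriteLine(requestJson);
```

Building the body from a `JsonObject` lets the serializer handle string escaping, so quotes or backslashes inside the SQL text cannot break the JSON.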
Polling with exponential backoff
- Start at 500ms delay
- Multiply by 1.5 each iteration
- Cap at 5s max delay
- Always pass `CancellationToken`
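The backoff rules above can be sketched as a delay function plus a polling loop. `PollDelay` and `PollUntilTerminalAsync` are hypothetical names, and `getState` is an assumed delegate standing in for whatever call fetches the statement status (for example a GET on `/api/2.0/sql/statements/{statement_id}`).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Delay before poll attempt n (0-based): 500 ms * 1.5^n, capped at 5 s.
static TimeSpan PollDelay(int attempt)
{
    double ms = 500 * Math.Pow(1.5, attempt);
    return TimeSpan.FromMilliseconds(Math.Min(ms, 5000));
}

// Polls until getState reports a terminal state.
// getState is a hypothetical delegate; a real app would call the status endpoint.
static async Task<string> PollUntilTerminalAsync(
    Func<CancellationToken, Task<string>> getState,
    CancellationToken cancellationToken)
{
    for (int attempt = 0; ; attempt++)
    {
        string state = await getState(cancellationToken);
        if (state is "SUCCEEDED" or "FAILED" or "CANCELED")
            return state;

        // Task.Delay throws OperationCanceledException if the token is canceled.
        await Task.Delay(PollDelay(attempt), cancellationToken);
    }
}
```

With these constants the delays run 500 ms, 750 ms, 1125 ms and so on, and reach the 5 s cap at the seventh wait (attempt index 6).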
Arrow IPC streaming
- Use `ArrowStreamReader` to deserialize downloaded chunks
- Extract column values via `ArrowUtils.GetValue()`, which handles all Arrow types without reflection
- Stream rows to display via `Channel<T>` for real-time table rendering
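The `Channel<T>` hand-off between download and rendering might look like the sketch below. To stay self-contained it uses `string[]` rows with hard-coded placeholder values instead of rows extracted from real `RecordBatch` objects; the capacity of 256 is an arbitrary assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Bounded channel: the producer waits when the renderer falls behind.
var channel = Channel.CreateBounded<string[]>(capacity: 256);

// Producer: in a real app these rows would come from deserialized RecordBatch
// objects; here they are placeholders.
var producer = Task.Run(async () =>
{
    try
    {
        for (int i = 0; i < 3; i++)
            await channel.Writer.WriteAsync(new[] { i.ToString(), $"row-{i}" });
    }
    finally
    {
        channel.Writer.Complete(); // tells the consumer that no more rows follow
    }
});

// Consumer: render each row as soon as it arrives.
var rendered = new List<string>();
await foreach (string[] row in channel.Reader.ReadAllAsync())
{
    rendered.Add(string.Join(" | ", row));
    Console.WriteLine(rendered[^1]);
}

await producer; // surfaces any exception thrown by the producer
```

Completing the writer in a `finally` block matters: without it, a producer failure would leave the `await foreach` loop waiting forever.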
Security rules
- Never log or display access tokens
- Always validate SQL identifiers with regex `^[\w][\w.]*$` before interpolating into queries
- Use masked input (`*`) for token prompts
- Dispose `HttpClient` instances properly when connection changes
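The identifier rule above could be enforced with a small guard like this one. `IsValidIdentifier` and `BuildSelect` are hypothetical names. One deliberate deviation, stated as such: in .NET, `$` also matches just before a trailing newline, so this sketch anchors the end with `\z` to reject input such as `"table\n"`.

```csharp
using System;
using System.Text.RegularExpressions;

// Same character rules as ^[\w][\w.]*$, but anchored with \z because
// .NET's $ would also accept a single trailing newline.
static bool IsValidIdentifier(string identifier) =>
    Regex.IsMatch(identifier, @"^[\w][\w.]*\z");

// Interpolates the identifier only after it passes validation.
static string BuildSelect(string table) =>
    IsValidIdentifier(table)
        ? $"SELECT * FROM {table}"
        : throw new ArgumentException("Invalid SQL identifier.", nameof(table));

Console.WriteLine(BuildSelect("catalog.schema.table"));
```

Note that `\w` in .NET matches Unicode letters and digits as well as ASCII ones; tighten the character class if only ASCII names should pass.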
Error handling
- Surface user-friendly errors via `AnsiConsole.MarkupLine("[red]...[/]")`
- Handle HTTP 429 (rate limiting) with retry
- Handle network timeouts gracefully
- Provide statement ID in error messages for debugging
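One way to handle HTTP 429 is to retry while honoring the `Retry-After` header, as in the sketch below. `SendWithRetryAsync` is a hypothetical helper; the `send` delegate, the `maxAttempts` limit, and the 1 s fallback delay are illustrative assumptions rather than documented Databricks behavior.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Retries while the server answers 429, waiting for Retry-After when present.
// send must create a fresh request on each call, because an HttpRequestMessage
// cannot be sent twice.
static async Task<HttpResponseMessage> SendWithRetryAsync(
    Func<CancellationToken, Task<HttpResponseMessage>> send,
    int maxAttempts,
    CancellationToken cancellationToken)
{
    for (int attempt = 1; ; attempt++)
    {
        HttpResponseMessage response = await send(cancellationToken);
        if (response.StatusCode != HttpStatusCode.TooManyRequests || attempt >= maxAttempts)
            return response; // success, a non-429 error, or out of attempts

        // Fallback of 1 s is an assumption for when the header is missing.
        TimeSpan delay = response.Headers.RetryAfter?.Delta ?? TimeSpan.FromSeconds(1);
        response.Dispose();
        await Task.Delay(delay, cancellationToken);
    }
}
```

A caller would typically pass something like `ct => httpClient.SendAsync(CreateRequest(), ct)`, where `CreateRequest` is its own factory method, and include the statement ID in any error message built from the final response.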
Performance tips
- Use `EXTERNAL_LINKS` disposition for large results (enables parallel chunk download)
- Download Arrow chunks with `Task.WhenAll` for parallelism
- Use `Channel<RecordBatch>` to stream results as they arrive
- Prefer `JsonSerializer` with source-generated `JsonContext` for zero-reflection serialization
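Parallel chunk download with `Task.WhenAll` can be sketched as follows. `DownloadChunksAsync` is a hypothetical helper, and `downloadChunk` is an assumed delegate (for example a wrapper around `HttpClient.GetByteArrayAsync`) so the example runs without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Starts one download per external link and awaits them together.
// downloadChunk is a hypothetical delegate supplied by the caller.
static async Task<byte[][]> DownloadChunksAsync(
    IReadOnlyList<string> externalLinks,
    Func<string, CancellationToken, Task<byte[]>> downloadChunk,
    CancellationToken cancellationToken)
{
    IEnumerable<Task<byte[]>> tasks =
        externalLinks.Select(link => downloadChunk(link, cancellationToken));

    // Task.WhenAll returns results in the same order as the input tasks,
    // so chunk order is preserved even if downloads finish out of order.
    return await Task.WhenAll(tasks);
}

// Usage with a stand-in downloader that returns a buffer sized by link length.
string[] links = { "a", "bb", "ccc" };
byte[][] chunks = await DownloadChunksAsync(
    links,
    (link, _) => Task.FromResult(new byte[link.Length]),
    CancellationToken.None);
Console.WriteLine($"Downloaded {chunks.Length} chunks");
```

This version starts every download at once; for results with many chunks, a concurrency limit may be worth adding.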