AgentSkillsCN

idasql

通过idasql CLI对IDA Pro数据库执行SQL查询——查询函数、交叉引用、字符串,以及反编译代码。

SKILL.md
--- frontmatter
name: idasql
description: Execute SQL queries against IDA Pro databases using idasql CLI - query functions, xrefs, strings, decompile code
allowed-tools: Bash, Read, Glob, Grep

CRITICAL: Execution Behavior

You are an EXECUTOR. Your job is to RUN COMMANDS, not discuss them.

Rules

  1. EXECUTE IMMEDIATELY - Never say "I would run..." or "You could try..."
  2. NO CONFIRMATION - Don't ask "Should I run this?" Just run it.
  3. TRUST THE ENVIRONMENT - Commands will work. Run them.
  4. ASSUME idasql IN PATH - The user has idasql installed and accessible

Examples of What NOT to Do

BAD: "I would run this command to count functions..." BAD: "Let me suggest running..." BAD: "Should I execute this query?" BAD: "This command might work..."

Examples of What TO Do

GOOD: [Execute Bash tool with the command] GOOD: "There are 88 functions in the binary." (after running command)


IDASQL - SQL Interface to IDA Databases

When to Use This Skill

Use this skill when the user wants to:

  • Analyze executables or IDA databases (.exe, .dll, .so, .i64, .idb)
  • Query functions, segments, cross-references, strings
  • Decompile code or search for byte patterns
  • Compare multiple binaries or databases

Prerequisites

The user must have:

  1. IDA Pro installed with ida.exe directory in PATH
  2. idasql.exe placed next to ida.exe (same directory)

Command Pattern

bash
idasql -s "<database>" -q "<SQL>"

Windows Note

On Windows, use forward slashes in paths:

bash
idasql -s "C:/path/to/database.i64" -q "SELECT ..."

Direct CLI Mode (One-off Queries)

For simple queries, run idasql directly without starting a server:

bash
# Query a database
idasql -s database.i64 -q "SELECT COUNT(*) FROM funcs"

# Query an existing IDA database
idasql -s database.i64 -q "SELECT name, start_ea FROM funcs LIMIT 10"

# Multiple queries in one session
idasql -s program.exe -q "SELECT COUNT(*) FROM funcs" -q "SELECT COUNT(*) FROM strings"

Use direct CLI mode when:

  • Running a single query or a few queries
  • Analyzing a file for the first time
  • No need to keep the database open

HTTP Server Mode (Persistent Queries)

Use HTTP mode when:

  • Running many queries against the same database
  • Comparing multiple databases simultaneously (up to 6 on ports 17200-17205)
  • Keeping analysis results cached between queries

Starting a Server

bash
# Start server for a single database
idasql -s /path/to/database.i64 --http 17200

# With authentication token
idasql -s database.i64 --http 17200 --token mysecret

# Bind to all interfaces (for remote access)
idasql -s database.i64 --http 17200 --bind 0.0.0.0

Querying via curl

bash
# Execute SQL query
curl -X POST http://localhost:17200/query -d "SELECT name, size FROM funcs LIMIT 5"

# With authentication
curl -X POST http://localhost:17200/query \
     -H "Authorization: Bearer mysecret" \
     -d "SELECT * FROM funcs"

# Check server status
curl http://localhost:17200/status

Response Format

json
{"success": true, "columns": ["name", "size"], "rows": [["main", "500"]], "row_count": 1}
json
{"success": false, "error": "no such table: bad_table"}

Multiple Databases (Ports 17200-17205)

Start up to 6 databases on different ports:

bash
# Terminal/background processes
idasql -s first.i64 --http 17200 &
idasql -s second.i64 --http 17201 &
idasql -s third.i64 --http 17202 &

Query each database by port:

bash
curl -X POST http://localhost:17200/query -d "SELECT name FROM funcs"
curl -X POST http://localhost:17201/query -d "SELECT name FROM funcs"

Shutdown Servers

bash
curl -X POST http://localhost:17200/shutdown
curl -X POST http://localhost:17201/shutdown

HTTP Endpoints

EndpointMethodDescription
/GETWelcome message
/helpGETAPI documentation
/queryPOSTExecute SQL (body = raw SQL)
/statusGETHealth check with stats
/shutdownPOSTStop server gracefully

IDASQL Skill Guide

A comprehensive reference for using IDASQL - an SQL interface for reverse engineering binary analysis with IDA Pro.


What is IDA and Why SQL?

IDA Pro is the industry-standard disassembler and reverse engineering tool. It analyzes compiled binaries (executables, DLLs, firmware) and produces:

  • Disassembly - Human-readable assembly code
  • Functions - Detected code boundaries with names
  • Cross-references - Who calls what, who references what data
  • Types - Structures, enums, function prototypes
  • Decompilation - C-like pseudocode (with Hex-Rays plugin)

IDASQL exposes all this analysis data through SQL virtual tables, enabling:

  • Complex queries across multiple data types (JOINs)
  • Aggregations and statistics (COUNT, GROUP BY)
  • Pattern detection across the entire binary
  • Scriptable analysis without writing IDA plugins or IDAPython scripts

Core Concepts for Binary Analysis

Addresses (ea_t)

Everything in a binary has an address - a memory location where code or data lives. IDA uses ea_t (effective address) as unsigned 64-bit integers. SQL shows these as integers; use printf('0x%X', address) for hex display.

Functions

IDA groups code into functions with:

  • address / start_ea - Where the function begins
  • end_ea - Where it ends
  • name - Assigned or auto-generated name (e.g., main, sub_401000)
  • size - Total bytes in the function

Cross-References (xrefs)

Binary analysis is about understanding relationships:

  • Code xrefs - Function calls, jumps between code
  • Data xrefs - Code reading/writing data locations
  • from_eato_ea represents "address X references address Y"

Segments

Memory is divided into segments with different purposes:

  • .text - Executable code (typically)
  • .data - Initialized global data
  • .rdata - Read-only data (strings, constants)
  • .bss - Uninitialized data

Of course, segment names and types can vary. You may query the segments table to understand memory layout.

Basic Blocks

Within a function, basic blocks are straight-line code sequences:

  • No branches in the middle
  • Single entry, single exit
  • Useful for control flow analysis

Decompilation (Hex-Rays)

The Hex-Rays decompiler converts assembly to C-like pseudocode:

  • ctree - The Abstract Syntax Tree of decompiled code
  • lvars - Local variables detected by the decompiler
  • Much easier to analyze than raw assembly

Command-Line Interface

IDASQL provides SQL access to IDA databases via command line or as a server.

Invocation Modes

1. Single Query (Local)

bash
idasql -s database.i64 -q "SELECT * FROM funcs LIMIT 10"
idasql -s database.i64 -c "SELECT COUNT(*) FROM funcs"  # -c is alias for -q

2. SQL File Execution

bash
idasql -s database.i64 -f analysis.sql

3. Interactive REPL

bash
idasql -s database.i64 -i

4. Remote Mode (connect to running server)

bash
idasql --remote localhost:8080 -q "SELECT * FROM funcs"
idasql --remote localhost:8080 -i  # Remote interactive

5. HTTP Server Mode

bash
idasql -s database.i64 --http 8080
# Then query via: curl -X POST http://localhost:8080/query -d "SELECT * FROM funcs"

6. Export Mode

bash
idasql -s database.i64 --export dump.sql
idasql -s database.i64 --export dump.sql --export-tables=funcs,segments

CLI Options

OptionDescription
-s <file>IDA database file (.idb/.i64)
--remote <host:port>Connect to IDASQL server
--token <token>Auth token for remote/server mode
-q <sql>Execute single SQL query
-c <sql>Alias for -q (Python-style)
-f <file>Execute SQL from file
-iInteractive REPL mode
-w, --writeSave database changes on exit
--export <file>Export tables to SQL file
--export-tables=XTables to export: * (all) or table1,table2,...
--http [port]Start HTTP REST server (default: 8080)
--bind <addr>Bind address for server (default: 127.0.0.1)
-h, --helpShow help

REPL Commands

CommandDescription
.tablesList all virtual tables
.schema [table]Show table schema
.infoShow database metadata
.clearClear session
.quit / .exitExit REPL
.helpShow available commands
.http startStart HTTP server on random port
.http stopStop HTTP server
.http statusShow HTTP server status
.agentStart AI agent mode

Performance Strategy

Single queries: Use -q directly.

bash
idasql -s database.i64 -q "SELECT COUNT(*) FROM funcs"

Multiple queries / exploration: Start a server once, then query as a client.

Opening an IDA database has startup overhead (idalib initialization, auto-analysis). If you plan to run many queries—exploring the database, experimenting with different queries, or iterating on analysis—avoid re-opening the database each time.

Recommended workflow for iterative analysis:

bash
# Terminal 1: Start server (opens database once)
idasql -s database.i64 --http 8080

# Terminal 2: Query repeatedly via remote client (instant responses)
idasql --remote localhost:8080 -q "SELECT * FROM funcs LIMIT 5"
idasql --remote localhost:8080 -q "SELECT * FROM strings WHERE content LIKE '%error%'"
idasql --remote localhost:8080 -q "SELECT name, size FROM funcs ORDER BY size DESC"
# ... as many queries as needed, no startup cost

Or use interactive mode on the remote connection:

bash
idasql --remote localhost:8080 -i
idasql> SELECT COUNT(*) FROM funcs;
idasql> SELECT * FROM xrefs WHERE to_ea = 0x401000;
idasql> .quit

This approach is significantly faster for iterative analysis since the database remains open and queries go directly through the already-initialized session.


Tables Reference

Entity Tables (Read-Only)

funcs

All detected functions in the binary with prototype information.

ColumnTypeDescription
addressINTFunction start address
nameTEXTFunction name
sizeINTFunction size in bytes
end_eaINTFunction end address
flagsINTFunction flags

Prototype columns (populated when type info available):

ColumnTypeDescription
return_typeTEXTReturn type string (e.g., "int", "void *")
return_is_ptrINT1 if return type is pointer
return_is_intINT1 if return type is exactly int
return_is_integralINT1 if return type is int-like (int, long, DWORD, BOOL)
return_is_voidINT1 if return type is void
arg_countINTNumber of function arguments
calling_convTEXTCalling convention (cdecl, stdcall, fastcall, etc.)
sql
-- 10 largest functions
SELECT name, size FROM funcs ORDER BY size DESC LIMIT 10;

-- Functions starting with "sub_" (auto-named, not analyzed)
SELECT name, printf('0x%X', address) as addr FROM funcs WHERE name LIKE 'sub_%';

-- Functions returning integers with 3+ arguments
SELECT name, return_type, arg_count FROM funcs
WHERE return_is_integral = 1 AND arg_count >= 3;

-- Void functions (side effects, callbacks)
SELECT name, arg_count FROM funcs WHERE return_is_void = 1;

-- Pointer-returning functions (factories, allocators)
SELECT name, return_type FROM funcs WHERE return_is_ptr = 1;

-- Simple getter functions (no args, returns value)
SELECT name, return_type FROM funcs
WHERE arg_count = 0 AND return_is_void = 0;

-- Functions by calling convention
SELECT calling_conv, COUNT(*) as count FROM funcs
WHERE calling_conv IS NOT NULL AND calling_conv != ''
GROUP BY calling_conv ORDER BY count DESC;

segments

Memory segments.

ColumnTypeDescription
start_eaINTSegment start
end_eaINTSegment end
nameTEXTSegment name (.text, .data, etc.)
classTEXTSegment class (CODE, DATA)
permINTPermissions (R=4, W=2, X=1)
sql
-- Find executable segments
SELECT name, printf('0x%X', start_ea) as start FROM segments WHERE perm & 1 = 1;

names

All named locations (functions, labels, data).

ColumnTypeDescription
addressINTAddress
nameTEXTName

entries

Entry points (exports, program entry).

ColumnTypeDescription
ordinalINTExport ordinal
addressINTEntry address
nameTEXTEntry name

imports

Imported functions from external libraries.

ColumnTypeDescription
addressINTImport address (IAT entry)
nameTEXTImport name
moduleTEXTModule/DLL name
ordinalINTImport ordinal
sql
-- Imports from kernel32.dll
SELECT name FROM imports WHERE module LIKE '%kernel32%';

strings

String literals found in the binary. IDA maintains a cached string list that can be configured.

ColumnTypeDescription
addressINTString address
lengthINTString length
typeINTString type (raw encoding bits)
type_nameTEXTType name: ascii, utf16, utf32
widthINTChar width (0=1-byte, 1=2-byte, 2=4-byte)
width_nameTEXTWidth name: 1-byte, 2-byte, 4-byte
layoutINTString layout (0=null-terminated, 1-3=pascal)
layout_nameTEXTLayout name: termchr, pascal1, pascal2, pascal4
encodingINTEncoding index (0=default)
contentTEXTString content

String Type Encoding: IDA stores string type as a 32-bit value:

  • Bits 0-1: Width (0=1B/ASCII, 1=2B/UTF-16, 2=4B/UTF-32)
  • Bits 2-7: Layout (0=TERMCHR, 1=PASCAL1, 2=PASCAL2, 3=PASCAL4)
  • Bits 8-15: term1 (first termination character)
  • Bits 16-23: term2 (second termination character)
  • Bits 24-31: encoding index
sql
-- Find error messages
SELECT content, printf('0x%X', address) as addr FROM strings WHERE content LIKE '%error%';

-- ASCII strings only
SELECT * FROM strings WHERE type_name = 'ascii';

-- UTF-16 strings (common in Windows)
SELECT * FROM strings WHERE type_name = 'utf16';

-- Count strings by type
SELECT type_name, layout_name, COUNT(*) as count
FROM strings GROUP BY type_name, layout_name ORDER BY count DESC;

Important: For new analysis (exe/dll), strings are auto-built. For existing databases (i64/idb), strings are already saved. If you see 0 strings unexpectedly, run SELECT rebuild_strings() once to rebuild the list. See String List Functions section below.

xrefs

Cross-references - the most important table for understanding code relationships.

ColumnTypeDescription
from_eaINTSource address (who references)
to_eaINTTarget address (what is referenced)
typeINTXref type code
is_codeINT1=code xref (call/jump), 0=data xref
sql
-- Who calls function at 0x401000?
SELECT printf('0x%X', from_ea) as caller FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;

-- What does function at 0x401000 reference?
SELECT printf('0x%X', to_ea) as target FROM xrefs WHERE from_ea >= 0x401000 AND from_ea < 0x401100;

blocks

Basic blocks within functions. Use func_ea constraint for performance.

ColumnTypeDescription
func_eaINTContaining function
start_eaINTBlock start
end_eaINTBlock end
sizeINTBlock size
sql
-- Blocks in a specific function (FAST - uses constraint pushdown)
SELECT * FROM blocks WHERE func_ea = 0x401000;

-- Functions with most basic blocks
SELECT func_at(func_ea) as name, COUNT(*) as blocks
FROM blocks GROUP BY func_ea ORDER BY blocks DESC LIMIT 10;

Convenience Views

Pre-built views for common xref analysis patterns. These simplify caller/callee queries.

callers

Who calls each function. Use this instead of manual xref JOINs.

ColumnTypeDescription
func_addrINTTarget function address
caller_addrINTXref source address
caller_nameTEXTCalling function name
caller_func_addrINTCalling function start
sql
-- Who calls function at 0x401000?
SELECT caller_name, printf('0x%X', caller_addr) as from_addr
FROM callers WHERE func_addr = 0x401000;

-- Most called functions
SELECT printf('0x%X', func_addr) as addr, COUNT(*) as callers
FROM callers GROUP BY func_addr ORDER BY callers DESC LIMIT 10;

callees

What each function calls. Inverse of callers view.

ColumnTypeDescription
func_addrINTCalling function address
func_nameTEXTCalling function name
callee_addrINTCalled address
callee_nameTEXTCalled function/symbol name
sql
-- What does main call?
SELECT callee_name, printf('0x%X', callee_addr) as addr
FROM callees WHERE func_name LIKE '%main%';

-- Functions making most calls
SELECT func_name, COUNT(*) as call_count
FROM callees GROUP BY func_addr ORDER BY call_count DESC LIMIT 10;

string_refs

Which functions reference which strings. Great for finding functions by string content.

ColumnTypeDescription
string_addrINTString address
string_valueTEXTString content
string_lengthINTString length
ref_addrINTReference address
func_addrINTReferencing function
func_nameTEXTFunction name
sql
-- Find functions using error strings
SELECT func_name, string_value
FROM string_refs
WHERE string_value LIKE '%error%' OR string_value LIKE '%fail%';

-- Functions with most string references
SELECT func_name, COUNT(*) as string_count
FROM string_refs WHERE func_name IS NOT NULL
GROUP BY func_addr ORDER BY string_count DESC LIMIT 10;

Instruction Tables

instructions

Decoded instructions. Always filter by func_addr for performance.

ColumnTypeDescription
addressINTInstruction address
func_addrINTContaining function
itypeINTInstruction type (architecture-specific)
mnemonicTEXTInstruction mnemonic
sizeINTInstruction size
operand0TEXTFirst operand
operand1TEXTSecond operand
disasmTEXTFull disassembly line
sql
-- Instruction profile of a function (FAST)
SELECT mnemonic, COUNT(*) as count
FROM instructions WHERE func_addr = 0x401330
GROUP BY mnemonic ORDER BY count DESC;

-- Find all call instructions in a function
SELECT address, disasm FROM instructions
WHERE func_addr = 0x401000 AND mnemonic = 'call';

Performance: WHERE func_addr = X uses O(function_size) iteration. Without this constraint, it scans the entire database - SLOW.

disasm_calls

All call instructions with resolved targets.

ColumnTypeDescription
func_addrINTFunction containing the call
eaINTCall instruction address
callee_addrINTTarget address (0 if unknown)
callee_nameTEXTTarget name
sql
-- Functions that call malloc
SELECT DISTINCT func_at(func_addr) as caller
FROM disasm_calls WHERE callee_name LIKE '%malloc%';

Database Modification

The following tables support SQL UPDATE and DELETE:

TableUPDATE columnsDELETE
funcsnameNo
namesnameYes
commentscomment, rep_commentYes
bookmarksdescriptionYes
ctree_lvarsname, typeNo

Examples:

sql
-- Rename a function
UPDATE funcs SET name = 'my_main' WHERE address = 0x401000;

-- Rename any named address
UPDATE names SET name = 'my_global' WHERE address = 0x404000;

-- Add/update comment
UPDATE comments SET comment = 'Check return value' WHERE address = 0x401050;

-- Add repeatable comment
UPDATE comments SET rep_comment = 'Global config' WHERE address = 0x404000;

-- Delete a name
DELETE FROM names WHERE address = 0x401000;

Decompiler local variables (requires Hex-Rays):

sql
-- Rename a local variable
UPDATE ctree_lvars SET name = 'buffer_size'
WHERE func_addr = 0x401000 AND name = 'v1';

-- Change variable type
UPDATE ctree_lvars SET type = 'char *'
WHERE func_addr = 0x401000 AND idx = 2;

Persisting Changes

Changes to the database (UPDATE, set_name, etc.) are held in memory by default.

To persist changes:

sql
-- Explicit save (recommended for scripts)
SELECT save_database();  -- Returns 1 on success, 0 on failure

CLI flag for auto-save:

bash
# Auto-save on exit (use with caution)
idasql -s db.i64 -q "UPDATE funcs SET name='main' WHERE address=0x401000" -w

Best practice for batch operations:

sql
-- Make multiple changes
UPDATE funcs SET name = 'init_config' WHERE address = 0x401000;
UPDATE names SET name = 'g_settings' WHERE address = 0x402000;
-- Persist once at the end
SELECT save_database();

Without save_database() or -w, changes are lost when the session ends.

Decompiler Tables (Hex-Rays Required)

CRITICAL: Always filter by func_addr. Without constraint, these tables will decompile EVERY function - extremely slow!

pseudocode

Decompiled C-like code lines. Use decompile(addr) function instead for simple decompilation!

ColumnTypeDescription
func_addrINTFunction address
line_numINTLine number
lineTEXTPseudocode text
eaINTCorresponding assembly address
sql
-- PREFERRED: Use decompile() function for full pseudocode
SELECT decompile(0x401000);

-- Only use pseudocode table when you need line-level details (ea mapping, etc.)
SELECT line_num, line, ea FROM pseudocode WHERE func_addr = 0x401000;

ctree

Full Abstract Syntax Tree of decompiled code.

ColumnTypeDescription
func_addrINTFunction address
item_idINTUnique node ID
is_exprINT1=expression, 0=statement
op_nameTEXTNode type (cot_call, cit_if, etc.)
eaINTAddress in binary
parent_idINTParent node ID
depthINTTree depth
x_id, y_id, z_idINTChild node IDs
var_idxINTLocal variable index
var_nameTEXTVariable name
obj_eaINTTarget address
obj_nameTEXTSymbol name
num_valueINTNumeric literal
str_valueTEXTString literal

ctree_lvars

Local variables from decompilation.

ColumnTypeDescription
func_addrINTFunction address
idxINTVariable index
nameTEXTVariable name
typeTEXTType string
sizeINTSize in bytes
is_argINT1=function argument
is_stk_varINT1=stack variable
stkoffINTStack offset

ctree_call_args

Flattened call arguments for easy querying.

ColumnTypeDescription
func_addrINTFunction address
call_item_idINTCall node ID
arg_idxINTArgument index (0-based)
arg_opTEXTArgument type
arg_var_nameTEXTVariable name if applicable
arg_var_is_stkINT1=stack variable
arg_num_valueINTNumeric value
arg_str_valueTEXTString value

Decompiler Views

Pre-built views for common patterns:

ViewPurpose
ctree_v_callsFunction calls with callee info
ctree_v_loopsfor/while/do loops
ctree_v_ifsif statements
ctree_v_comparisonsComparisons with operands
ctree_v_assignmentsAssignments with operands
ctree_v_derefsPointer dereferences
ctree_v_returnsReturn statements with value details
ctree_v_calls_in_loopsCalls inside loops (recursive)
ctree_v_calls_in_ifsCalls inside if branches (recursive)
ctree_v_leaf_funcsFunctions with no outgoing calls
ctree_v_call_chainsCall chain paths up to depth 10

ctree_v_returns

Return statements with details about what's being returned.

ColumnTypeDescription
func_addrINTFunction address
item_idINTReturn statement item_id
eaINTAddress of return
return_opTEXTReturn value opcode (cot_num, cot_var, cot_call, etc.)
return_numINTNumeric value (if cot_num)
return_strTEXTString value (if cot_str)
return_varTEXTVariable name (if cot_var)
returns_argINT1 if returning a function argument
returns_call_resultINT1 if returning result of another call
sql
-- Functions that return 0
SELECT DISTINCT func_at(func_addr) as name FROM ctree_v_returns
WHERE return_op = 'cot_num' AND return_num = 0;

-- Functions that return -1 (error sentinel)
SELECT DISTINCT func_at(func_addr) as name FROM ctree_v_returns
WHERE return_op = 'cot_num' AND return_num = -1;

-- Functions that return their argument (pass-through)
SELECT DISTINCT func_at(func_addr) as name FROM ctree_v_returns
WHERE returns_arg = 1;

Type Tables

types

All local type definitions.

ColumnTypeDescription
ordinalINTType ordinal
nameTEXTType name
sizeINTSize in bytes
kindTEXTstruct/union/enum/typedef/func
is_structINT1=struct
is_unionINT1=union
is_enumINT1=enum

types_members

Structure and union members.

ColumnTypeDescription
type_ordinalINTParent type ordinal
type_nameTEXTParent type name
member_nameTEXTMember name
offsetINTByte offset
sizeINTMember size
member_typeTEXTType string
mt_is_ptrINT1=pointer
mt_is_arrayINT1=array
mt_is_structINT1=embedded struct

types_enum_values

Enum constant values.

ColumnTypeDescription
type_ordinalINTEnum type ordinal
type_nameTEXTEnum name
value_nameTEXTConstant name
valueINTConstant value

types_func_args

Function prototype arguments with type classification.

ColumnTypeDescription
type_ordinalINTFunction type ordinal
type_nameTEXTFunction type name
arg_indexINTArgument index (-1 = return type, 0+ = args)
arg_nameTEXTArgument name
arg_typeTEXTArgument type string
calling_convTEXTCalling convention (on return row only)

Surface-level type classification (literal type as written):

ColumnTypeDescription
is_ptrINT1 if pointer type
is_intINT1 if exactly int type
is_integralINT1 if int-like (int, long, short, char, bool)
is_floatINT1 if float/double
is_voidINT1 if void
is_structINT1 if struct/union
is_arrayINT1 if array
ptr_depthINTPointer depth (int** = 2)
base_typeTEXTType with pointers stripped

Resolved type classification (after typedef resolution):

ColumnTypeDescription
is_ptr_resolvedINT1 if resolved type is pointer
is_int_resolvedINT1 if resolved type is exactly int
is_integral_resolvedINT1 if resolved type is int-like
is_float_resolvedINT1 if resolved type is float/double
is_void_resolvedINT1 if resolved type is void
ptr_depth_resolvedINTPointer depth after resolution
base_type_resolvedTEXTResolved type with pointers stripped
sql
-- Functions returning integers (strict: exactly int)
SELECT type_name FROM types_func_args
WHERE arg_index = -1 AND is_int = 1;

-- Functions returning integers (loose: includes BOOL, DWORD, LONG)
SELECT type_name FROM types_func_args
WHERE arg_index = -1 AND is_integral_resolved = 1;

-- Functions taking 4 pointer arguments
SELECT type_name, COUNT(*) as ptr_args FROM types_func_args
WHERE arg_index >= 0 AND is_ptr = 1
GROUP BY type_ordinal HAVING ptr_args = 4;

-- Typedefs that hide pointers (HANDLE, etc.)
SELECT type_name, arg_type FROM types_func_args
WHERE is_ptr = 0 AND is_ptr_resolved = 1;

Type Views

Convenience views for filtering types:

ViewDescription
types_v_structsSELECT * FROM types WHERE is_struct = 1
types_v_unionsSELECT * FROM types WHERE is_union = 1
types_v_enumsSELECT * FROM types WHERE is_enum = 1
types_v_typedefsSELECT * FROM types WHERE is_typedef = 1
types_v_funcsSELECT * FROM types WHERE is_func = 1
local_typesLegacy compatibility view

Extended Tables

bookmarks

User-defined bookmarks/marked positions.

ColumnTypeDescription
indexINTBookmark index
addressINTBookmarked address
descriptionTEXTBookmark description
sql
-- List all bookmarks
SELECT printf('0x%X', address) as addr, description FROM bookmarks;

heads

All defined items (code/data heads) in the database.

ColumnTypeDescription
addressINTHead address
sizeINTItem size
flagsINTIDA flags

Performance: This table can be very large. Always use address range filters.

fixups

Relocation and fixup information.

ColumnTypeDescription
addressINTFixup address
typeINTFixup type
targetINTTarget address

hidden_ranges

Collapsed/hidden code regions in IDA.

ColumnTypeDescription
start_eaINTRange start
end_eaINTRange end
descriptionTEXTDescription
visibleINTVisibility state

problems

IDA analysis problems and warnings.

ColumnTypeDescription
addressINTProblem address
typeINTProblem type code
descriptionTEXTProblem description
sql
-- Find all analysis problems
SELECT printf('0x%X', address) as addr, description FROM problems;

fchunks

Function chunks (for functions with non-contiguous code, like exception handlers).

ColumnTypeDescription
func_addrINTParent function
start_eaINTChunk start
end_eaINTChunk end
sizeINTChunk size
sql
-- Functions with multiple chunks (complex control flow)
SELECT func_at(func_addr) as name, COUNT(*) as chunks
FROM fchunks GROUP BY func_addr HAVING chunks > 1;

signatures

FLIRT signature matches.

ColumnTypeDescription
addressINTMatched address
nameTEXTSignature name
libraryTEXTLibrary name

mappings

Memory mappings for debugging.

ColumnTypeDescription
from_eaINTMapped from
to_eaINTMapped to
sizeINTMapping size

Metadata Tables

db_info

Database-level metadata.

ColumnTypeDescription
keyTEXTMetadata key
valueTEXTMetadata value
sql
-- Get database info
SELECT * FROM db_info;

ida_info

IDA processor and analysis info.

ColumnTypeDescription
keyTEXTInfo key
valueTEXTInfo value
sql
-- Get processor type
SELECT value FROM ida_info WHERE key = 'procname';

Disassembly Tables

disasm_loops

Detected loops in disassembly.

ColumnTypeDescription
func_addrINTFunction address
loop_startINTLoop header address
loop_endINTLoop end address

Disassembly Views

Views for disassembly-level analysis (no Hex-Rays required):

ViewDescription
disasm_v_leaf_funcsFunctions with no outgoing calls
disasm_v_call_chainsCall chain paths (recursive CTE)
disasm_v_calls_in_loopsCalls inside loop bodies
disasm_v_funcs_with_loopsFunctions containing loops
sql
-- Find functions that don't call anything
SELECT * FROM disasm_v_leaf_funcs LIMIT 10;

-- Find hotspot calls (inside loops)
SELECT func_at(func_addr) as func, callee_name
FROM disasm_v_calls_in_loops;

SQL Functions

Disassembly

FunctionDescription
disasm(addr)Disassembly line at address
disasm(addr, n)Multiple lines from address
bytes(addr, n)Bytes as hex string
bytes_raw(addr, n)Raw bytes as BLOB
mnemonic(addr)Instruction mnemonic only
operand(addr, n)Operand text (n=0-5)

Binary Search

FunctionDescription
search_bytes(pattern)Find all matches, returns JSON array
search_bytes(pattern, start, end)Search within address range
search_first(pattern)First match address (or NULL)
search_first(pattern, start, end)First match in range

Pattern syntax (IDA native):

  • "48 8B 05" - Exact bytes (hex, space-separated)
  • "48 ? 05" or "48 ?? 05" - ? = any byte wildcard (whole byte only)
  • "(01 02 03)" - Alternatives (match any of these bytes)

Note: Unlike Binary Ninja, IDA does NOT support nibble wildcards or regex.

Example:

sql
-- Find all matches for a pattern
SELECT search_bytes('48 8B ? 00');

-- Parse JSON results
SELECT json_extract(value, '$.address') as addr
FROM json_each(search_bytes('48 89 ?'))
LIMIT 10;

-- First match only
SELECT printf('0x%llX', search_first('CC CC CC'));

-- Search with alternatives
SELECT search_bytes('E8 (01 02 03 04)');

Optimization Pattern: Find functions using specific instruction

To answer "How many functions use RDTSC instruction?" efficiently:

sql
-- Count unique functions containing RDTSC (opcode: 0F 31)
SELECT COUNT(DISTINCT func_start(json_extract(value, '$.address'))) as count
FROM json_each(search_bytes('0F 31'))
WHERE func_start(json_extract(value, '$.address')) IS NOT NULL;

-- List those functions with names
SELECT DISTINCT
    func_start(json_extract(value, '$.address')) as func_ea,
    name_at(func_start(json_extract(value, '$.address'))) as func_name
FROM json_each(search_bytes('0F 31'))
WHERE func_start(json_extract(value, '$.address')) IS NOT NULL;

This is much faster than scanning all disassembly lines because:

  • search_bytes() uses native binary search
  • func_start() is O(1) lookup in IDA's function index

Names & Functions

FunctionDescription
name_at(addr)Name at address
func_at(addr)Function name containing address
func_start(addr)Start of containing function
func_end(addr)End of containing function
func_qty()Total function count
func_at_index(n)Function address at index (O(1))

Cross-References

FunctionDescription
xrefs_to(addr)JSON array of xrefs TO address
xrefs_from(addr)JSON array of xrefs FROM address

Navigation

FunctionDescription
next_head(addr)Next defined item
prev_head(addr)Previous defined item
segment_at(addr)Segment name at address
hex(val)Format as hex string

Comments

FunctionDescription
comment_at(addr)Get comment at address
set_comment(addr, text)Set regular comment
set_comment(addr, text, 1)Set repeatable comment

Modification

FunctionDescription
set_name(addr, name)Set name at address

Item Analysis

FunctionDescription
item_type(addr)Item type flags at address
item_size(addr)Item size at address
is_code(addr)Returns 1 if address is code
is_data(addr)Returns 1 if address is data
flags_at(addr)Raw IDA flags at address

Instruction Details

FunctionDescription
itype(addr)Instruction type code (processor-specific)
decode_insn(addr)Full instruction info as JSON
operand_type(addr, n)Operand type code (o_void, o_reg, etc.)
operand_value(addr, n)Operand value (register num, immediate, etc.)
sql
-- Get instruction type for filtering
SELECT address, itype(address) as itype, mnemonic(address)
FROM heads WHERE is_code(address) = 1 LIMIT 10;

-- Decode full instruction
SELECT decode_insn(0x401000);

Decompilation

FunctionDescription
decompile(addr)PREFERRED - Full pseudocode as single text (requires Hex-Rays)
list_lvars(addr)List local variables as JSON
rename_lvar(addr, old, new)Rename a local variable

IMPORTANT: To decompile a function, use decompile(addr) - NOT the pseudocode table!

sql
-- CORRECT: Get full decompilation in one call
SELECT decompile(0x401000);

-- WRONG: Inefficient line-by-line approach (avoid this!)
-- SELECT line FROM pseudocode WHERE func_addr = 0x401000 ORDER BY line_num;

-- Get all local variables in a function
SELECT list_lvars(0x401000);

-- Rename a variable
SELECT rename_lvar(0x401000, 'v1', 'buffer_size');

File Generation

FunctionDescription
gen_asm_file(start, end, path)Generate ASM file
gen_lst_file(start, end, path)Generate listing file
gen_map_file(path)Generate MAP file
gen_idc_file(start, end, path)Generate IDC script
gen_html_file(start, end, path)Generate HTML file
sql
-- Export function as ASM
SELECT gen_asm_file(0x401000, 0x401100, '/tmp/func.asm');

-- Generate MAP file
SELECT gen_map_file('/tmp/binary.map');

Graph Generation

FunctionDescription
gen_cfg_dot(addr)Generate CFG as DOT graph string
gen_cfg_dot_file(addr, path)Write CFG DOT to file
gen_schema_dot()Generate database schema as DOT
sql
-- Get CFG for a function as DOT format
SELECT gen_cfg_dot(0x401000);

-- Export schema visualization
SELECT gen_schema_dot();

Entity Search ("Jump to Anything")

FunctionDescription
jump_search(pattern, mode, limit, offset)Search entities, returns JSON array
jump_query(pattern, mode, limit, offset)Returns the generated SQL string
sql
-- Search for functions/types/labels starting with 'sub'
SELECT jump_search('sub', 'prefix', 10, 0);

-- Search for anything containing 'main'
SELECT jump_search('main', 'contains', 10, 0);

String List Functions

IDA maintains a cached list of strings. Use rebuild_strings() to detect and cache strings.

FunctionDescription
rebuild_strings()Rebuild with ASCII + UTF-16, minlen 5 (default)
rebuild_strings(minlen)Rebuild with custom minimum length
rebuild_strings(minlen, types)Rebuild with custom length and type mask
string_count()Get current string count (no rebuild)

Type mask values:

  • 1 = ASCII only (STRTYPE_C)
  • 2 = UTF-16 only (STRTYPE_C_16)
  • 4 = UTF-32 only (STRTYPE_C_32)
  • 3 = ASCII + UTF-16 (default)
  • 7 = All types
sql
-- Check current string count
SELECT string_count();

-- Rebuild with defaults (ASCII + UTF-16, minlen 5)
SELECT rebuild_strings();

-- Rebuild with shorter minimum length
SELECT rebuild_strings(4);

-- Rebuild with specific types
SELECT rebuild_strings(5, 1);   -- ASCII only
SELECT rebuild_strings(5, 7);   -- All types (ASCII + UTF-16 + UTF-32)

-- Typical workflow: rebuild then query
SELECT rebuild_strings();
SELECT * FROM strings WHERE content LIKE '%error%';

IMPORTANT - String Query Behavior: When the user asks about strings (e.g., "show me the strings", "what strings are in this binary"):

  1. First run SELECT rebuild_strings() to ensure strings are detected
  2. Then query the strings table

The rebuild_strings() function configures IDA's string detection with sensible defaults (ASCII + UTF-16, minimum length 5) and rebuilds the string list. This ensures the user gets results even if the database had no prior string analysis.


Entity Search Table (jump_entities)

A table-valued function for unified entity search with full SQL composability.

Usage

sql
-- Basic search (function-call syntax)
SELECT * FROM jump_entities('sub', 'prefix') LIMIT 10;

-- Filter by kind
SELECT * FROM jump_entities('EH', 'prefix') WHERE kind = 'struct';

-- JOIN with other tables
SELECT j.name, f.size
FROM jump_entities('sub', 'prefix') j
LEFT JOIN funcs f ON j.address = f.address
WHERE j.kind = 'function';

Parameters

ParameterDescription
patternSearch pattern (required)
mode'prefix' or 'contains'

Columns

ColumnTypeDescription
nameTEXTEntity name
kindTEXTfunction/label/segment/struct/union/enum/member/enum_member
addressINTAddress (for functions, labels, segments)
ordinalINTType ordinal (for types, members)
parent_nameTEXTParent type (for members)
full_nameTEXTFully qualified name

Use Case: Implement "Jump to Anything" with virtual scrolling - lazy cursor respects LIMIT.


Performance Rules

CRITICAL: Constraint Pushdown

Some tables have optimized filters that use efficient IDA SDK APIs:

TableOptimized FilterWithout Filter
instructionsfunc_addr = XO(all instructions) - SLOW
blocksfunc_ea = XO(all blocks)
xrefsto_ea = X or from_ea = XO(all xrefs)
pseudocodefunc_addr = XDecompiles ALL functions
ctree*func_addr = XDecompiles ALL functions

Always filter decompiler tables by func_addr!

Use Integer Comparisons

sql
-- SLOW: String comparison
WHERE mnemonic = 'call'

-- FAST: Integer comparison
WHERE itype IN (16, 18)  -- x86 call opcodes

O(1) Random Access

sql
-- SLOW: O(n) - sorts all rows
SELECT address FROM funcs ORDER BY RANDOM() LIMIT 1;

-- FAST: O(1) - direct index access
SELECT func_at_index(ABS(RANDOM()) % func_qty());

Common Query Patterns

Find Most Called Functions

sql
SELECT f.name, COUNT(*) as callers
FROM funcs f
JOIN xrefs x ON f.address = x.to_ea
WHERE x.is_code = 1
GROUP BY f.address
ORDER BY callers DESC
LIMIT 10;

Find Functions Calling a Specific API

sql
SELECT DISTINCT func_at(from_ea) as caller
FROM xrefs
WHERE to_ea = (SELECT address FROM imports WHERE name = 'CreateFileW');

String Cross-Reference Analysis

sql
SELECT s.content, func_at(x.from_ea) as used_by
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%password%';

Function Complexity (by Block Count)

sql
SELECT func_at(func_ea) as name, COUNT(*) as block_count
FROM blocks
GROUP BY func_ea
ORDER BY block_count DESC
LIMIT 10;

Find Leaf Functions (No Outgoing Calls)

sql
SELECT f.name, f.size
FROM funcs f
LEFT JOIN disasm_calls c ON c.func_addr = f.address
GROUP BY f.address
HAVING COUNT(c.ea) = 0
ORDER BY f.size DESC;

Functions with Deep Call Chains

sql
SELECT f.name, MAX(cc.depth) as max_depth
FROM disasm_v_call_chains cc
JOIN funcs f ON f.address = cc.root_func
GROUP BY cc.root_func
ORDER BY max_depth DESC
LIMIT 10;

Security: Dangerous Function Calls with Stack Buffers

sql
SELECT f.name, c.callee_name, printf('0x%X', c.ea) as address
FROM funcs f
JOIN ctree_v_calls c ON c.func_addr = f.address
JOIN ctree_call_args a ON a.func_addr = c.func_addr AND a.call_item_id = c.item_id
WHERE c.callee_name IN ('strcpy', 'strcat', 'sprintf', 'gets', 'memcpy')
  AND a.arg_idx = 0 AND a.arg_var_is_stk = 1
ORDER BY f.name;

Find Zero Comparisons (Potential Error Checks)

sql
SELECT func_at(func_addr) as func, printf('0x%X', ea) as addr
FROM ctree_v_comparisons
WHERE op_name = 'cot_eq' AND rhs_op = 'cot_num' AND rhs_num = 0;

Calls Inside Loops (Performance Hotspots)

sql
SELECT f.name, l.callee_name, l.loop_op
FROM ctree_v_calls_in_loops l
JOIN funcs f ON f.address = l.func_addr
ORDER BY f.name;

malloc with Constant Size

sql
SELECT func_at(c.func_addr) as func, a.arg_num_value as size
FROM ctree_v_calls c
JOIN ctree_call_args a ON a.func_addr = c.func_addr AND a.call_item_id = c.item_id
WHERE c.callee_name LIKE '%malloc%'
  AND a.arg_idx = 0 AND a.arg_op = 'cot_num'
ORDER BY a.arg_num_value DESC;

Largest Structures

sql
SELECT name, size, alignment
FROM types
WHERE is_struct = 1 AND size > 0
ORDER BY size DESC
LIMIT 10;

Instruction Profile for a Function

sql
SELECT mnemonic, COUNT(*) as count
FROM instructions
WHERE func_addr = 0x401330
GROUP BY mnemonic
ORDER BY count DESC;

Import Dependency Map

sql
-- Which modules does each function depend on?
SELECT f.name as func_name, i.module, COUNT(*) as api_count
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
JOIN imports i ON dc.callee_addr = i.address
GROUP BY f.address, i.module
ORDER BY f.name, api_count DESC;

Find Indirect Calls (Potential Virtual Functions/Callbacks)

sql
-- Functions with indirect calls (call through register/memory)
SELECT f.name, COUNT(*) as indirect_calls
FROM funcs f
JOIN disasm_calls dc ON dc.func_addr = f.address
WHERE dc.callee_addr = 0  -- Unresolved target = indirect
GROUP BY f.address
ORDER BY indirect_calls DESC
LIMIT 20;

String Format Audit (printf-style Vulnerabilities)

sql
-- Format string usage with variable formats (potential vuln)
SELECT f.name, c.callee_name, printf('0x%X', c.ea) as addr
FROM funcs f
JOIN ctree_v_calls c ON c.func_addr = f.address
JOIN ctree_call_args a ON a.func_addr = c.func_addr AND a.call_item_id = c.item_id
WHERE c.callee_name LIKE '%printf%'
  AND a.arg_idx = 0  -- First arg is format string
  AND a.arg_op = 'cot_var';  -- Variable, not constant string

Memory Allocation Patterns

sql
-- Find functions that allocate but may not free
WITH allocators AS (
    SELECT func_addr, COUNT(*) as alloc_count
    FROM disasm_calls
    WHERE callee_name LIKE '%alloc%' OR callee_name LIKE '%malloc%'
    GROUP BY func_addr
),
freers AS (
    SELECT func_addr, COUNT(*) as free_count
    FROM disasm_calls
    WHERE callee_name LIKE '%free%'
    GROUP BY func_addr
)
SELECT f.name,
       COALESCE(a.alloc_count, 0) as allocations,
       COALESCE(r.free_count, 0) as frees
FROM funcs f
LEFT JOIN allocators a ON f.address = a.func_addr
LEFT JOIN freers r ON f.address = r.func_addr
WHERE a.alloc_count > 0 AND COALESCE(r.free_count, 0) = 0
ORDER BY allocations DESC;

Control Flow Anomalies

sql
-- Functions with many basic blocks but few instructions (possibly obfuscated)
SELECT
    f.name,
    f.size,
    COUNT(DISTINCT b.start_ea) as blocks,
    f.size / COUNT(DISTINCT b.start_ea) as avg_block_size
FROM funcs f
JOIN blocks b ON b.func_ea = f.address
WHERE f.size > 100
GROUP BY f.address
HAVING COUNT(DISTINCT b.start_ea) > 10
   AND f.size / COUNT(DISTINCT b.start_ea) < 10  -- Very small blocks
ORDER BY blocks DESC;

Return Value Analysis

sql
-- Functions with multiple return statements (complex control flow)
SELECT f.name, COUNT(*) as return_count
FROM funcs f
JOIN ctree ct ON ct.func_addr = f.address
WHERE ct.op_name = 'cit_return'
GROUP BY f.address
HAVING COUNT(*) > 3
ORDER BY return_count DESC;

-- Functions that return 0 (common success pattern)
SELECT DISTINCT func_at(func_addr) as name FROM ctree_v_returns
WHERE return_op = 'cot_num' AND return_num = 0;

-- Functions that return -1 (error sentinel)
SELECT DISTINCT func_at(func_addr) as name FROM ctree_v_returns
WHERE return_op = 'cot_num' AND return_num = -1;

-- Functions that return a specific constant
SELECT DISTINCT func_at(func_addr) as name FROM ctree_v_returns
WHERE return_op = 'cot_num' AND return_num = 1;

Function Signature Queries

sql
-- Functions returning integers (includes BOOL, DWORD via resolved)
SELECT type_name FROM types_func_args
WHERE arg_index = -1 AND is_integral_resolved = 1;

-- Functions taking exactly 4 pointer arguments
SELECT type_name, COUNT(*) as ptr_args FROM types_func_args
WHERE arg_index >= 0 AND is_ptr = 1
GROUP BY type_ordinal HAVING ptr_args = 4;

-- Functions with string parameters (char*/wchar_t*)
SELECT DISTINCT type_name FROM types_func_args
WHERE arg_index >= 0 AND is_ptr = 1
  AND base_type_resolved IN ('char', 'wchar_t', 'CHAR', 'WCHAR');

-- Typedefs hiding pointers (HANDLE, HMODULE, etc.)
SELECT DISTINCT type_name, arg_type FROM types_func_args
WHERE is_ptr = 0 AND is_ptr_resolved = 1;

-- Functions returning void pointers
SELECT type_name FROM types_func_args
WHERE arg_index = -1 AND is_ptr_resolved = 1 AND is_void_resolved = 1;

Loops with System Calls (Performance/Security Hotspots)

sql
-- System API calls inside loops
SELECT
    f.name as function,
    l.callee_name as api_called,
    l.loop_op as loop_type
FROM ctree_v_calls_in_loops l
JOIN funcs f ON f.address = l.func_addr
JOIN imports i ON l.callee_name = i.name
ORDER BY f.name;

Type Usage Statistics

sql
-- Most referenced types (by struct member usage in decompiled code)
SELECT tm.type_name, COUNT(DISTINCT ct.func_addr) as func_count
FROM types_members tm
JOIN ctree ct ON ct.var_name = tm.member_name
GROUP BY tm.type_name
ORDER BY func_count DESC
LIMIT 20;

Data Section Analysis

sql
-- Find functions referencing data sections
SELECT
    f.name,
    s.name as segment,
    COUNT(*) as data_refs
FROM funcs f
JOIN xrefs x ON x.from_ea BETWEEN f.address AND f.end_ea
JOIN segments s ON x.to_ea BETWEEN s.start_ea AND s.end_ea
WHERE s.class = 'DATA' AND x.is_code = 0
GROUP BY f.address, s.name
ORDER BY data_refs DESC
LIMIT 20;

Exception Handler Detection

sql
-- Functions with multiple chunks (often due to exception handlers)
SELECT
    f.name,
    COUNT(*) as chunk_count,
    SUM(fc.size) as total_size
FROM funcs f
JOIN fchunks fc ON fc.func_addr = f.address
GROUP BY f.address
HAVING COUNT(*) > 1
ORDER BY chunk_count DESC;

Advanced SQL Patterns

Common Table Expressions (CTEs)

CTEs make complex queries readable and allow recursive traversal.

Basic CTE for Filtering

sql
-- Find functions that both call malloc AND check return value
WITH malloc_callers AS (
    SELECT DISTINCT func_addr
    FROM disasm_calls
    WHERE callee_name LIKE '%malloc%'
),
null_checkers AS (
    SELECT DISTINCT func_addr
    FROM ctree_v_comparisons
    WHERE rhs_num = 0 AND op_name = 'cot_eq'
)
SELECT f.name
FROM funcs f
JOIN malloc_callers m ON f.address = m.func_addr
JOIN null_checkers n ON f.address = n.func_addr;

CTE with Aggregation

sql
-- Functions ranked by complexity (calls * blocks)
WITH call_counts AS (
    SELECT func_addr, COUNT(*) as call_cnt
    FROM disasm_calls
    GROUP BY func_addr
),
block_counts AS (
    SELECT func_ea as func_addr, COUNT(*) as block_cnt
    FROM blocks
    GROUP BY func_ea
)
SELECT f.name,
       COALESCE(c.call_cnt, 0) as calls,
       COALESCE(b.block_cnt, 0) as blocks,
       COALESCE(c.call_cnt, 0) * COALESCE(b.block_cnt, 0) as complexity
FROM funcs f
LEFT JOIN call_counts c ON f.address = c.func_addr
LEFT JOIN block_counts b ON f.address = b.func_addr
ORDER BY complexity DESC
LIMIT 10;

Recursive CTEs (Call Graph Traversal)

sql
-- Find all functions reachable from main (up to depth 5)
WITH RECURSIVE call_graph AS (
    -- Base case: start from main
    SELECT address as func_addr, name, 0 as depth
    FROM funcs WHERE name = 'main'

    UNION ALL

    -- Recursive case: follow calls
    SELECT f.address, f.name, cg.depth + 1
    FROM call_graph cg
    JOIN disasm_calls dc ON dc.func_addr = cg.func_addr
    JOIN funcs f ON f.address = dc.callee_addr
    WHERE cg.depth < 5
      AND dc.callee_addr != 0  -- Skip indirect calls
)
SELECT DISTINCT func_addr, name, MIN(depth) as min_depth
FROM call_graph
GROUP BY func_addr
ORDER BY min_depth, name;
sql
-- Reverse call graph: who calls this function (transitive)
WITH RECURSIVE callers AS (
    -- Base: direct callers of target
    SELECT DISTINCT dc.func_addr, 1 as depth
    FROM disasm_calls dc
    WHERE dc.callee_addr = 0x401000

    UNION ALL

    -- Recursive: who calls the callers
    SELECT DISTINCT dc.func_addr, c.depth + 1
    FROM callers c
    JOIN disasm_calls dc ON dc.callee_addr = c.func_addr
    WHERE c.depth < 5
)
SELECT func_at(func_addr) as caller, MIN(depth) as distance
FROM callers
GROUP BY func_addr
ORDER BY distance, caller;

Window Functions

sql
-- Rank functions by size within each segment
SELECT
    segment_at(f.address) as seg,
    f.name,
    f.size,
    ROW_NUMBER() OVER (PARTITION BY segment_at(f.address) ORDER BY f.size DESC) as rank
FROM funcs f
WHERE f.size > 0;
sql
-- Running total of function sizes
SELECT
    name,
    size,
    SUM(size) OVER (ORDER BY address) as cumulative_size
FROM funcs
ORDER BY address;
sql
-- Find consecutive functions with similar sizes (possible duplicates)
SELECT
    name,
    size,
    LAG(name) OVER (ORDER BY size) as prev_name,
    LAG(size) OVER (ORDER BY size) as prev_size
FROM funcs
WHERE size > 100;

Complex JOINs

Multi-Table Join (Functions with Context)

sql
-- Function overview with all relationships
SELECT
    f.name,
    f.size,
    segment_at(f.address) as segment,
    (SELECT COUNT(*) FROM blocks WHERE func_ea = f.address) as block_count,
    (SELECT COUNT(*) FROM disasm_calls WHERE func_addr = f.address) as outgoing_calls,
    (SELECT COUNT(*) FROM xrefs WHERE to_ea = f.address AND is_code = 1) as incoming_calls,
    (SELECT COUNT(*) FROM ctree_lvars WHERE func_addr = f.address) as local_vars
FROM funcs f
ORDER BY f.size DESC
LIMIT 20;

Self-Join (Compare Functions)

sql
-- Find functions with identical sizes (potential clones)
SELECT
    f1.name as func1,
    f2.name as func2,
    f1.size
FROM funcs f1
JOIN funcs f2 ON f1.size = f2.size AND f1.address < f2.address
WHERE f1.size > 50  -- Ignore tiny functions
ORDER BY f1.size DESC;

Subqueries

sql
-- Functions that call more APIs than average
SELECT f.name, call_count
FROM (
    SELECT func_addr, COUNT(*) as call_count
    FROM disasm_calls dc
    JOIN imports i ON dc.callee_addr = i.address
    GROUP BY func_addr
) sub
JOIN funcs f ON f.address = sub.func_addr
WHERE call_count > (
    SELECT AVG(cnt) FROM (
        SELECT COUNT(*) as cnt
        FROM disasm_calls dc
        JOIN imports i ON dc.callee_addr = i.address
        GROUP BY func_addr
    )
)
ORDER BY call_count DESC;

CASE Expressions

sql
-- Categorize functions by complexity
SELECT
    name,
    size,
    CASE
        WHEN size < 50 THEN 'tiny'
        WHEN size < 200 THEN 'small'
        WHEN size < 1000 THEN 'medium'
        WHEN size < 5000 THEN 'large'
        ELSE 'huge'
    END as category
FROM funcs
ORDER BY size DESC;
sql
-- Classify strings by content
SELECT
    content,
    CASE
        WHEN content LIKE '%error%' OR content LIKE '%fail%' THEN 'error'
        WHEN content LIKE '%password%' OR content LIKE '%key%' THEN 'sensitive'
        WHEN content LIKE '%http%' OR content LIKE '%://% ' THEN 'url'
        WHEN content LIKE '%.dll%' OR content LIKE '%.exe%' THEN 'file'
        ELSE 'other'
    END as category
FROM strings
WHERE length > 5;

Batch Analysis with UNION ALL

sql
-- Comprehensive security audit in one query
SELECT 'dangerous_func' as check_type, func_at(func_addr) as location, callee_name as detail
FROM disasm_calls
WHERE callee_name IN ('strcpy', 'strcat', 'sprintf', 'gets', 'scanf')

UNION ALL

SELECT 'crypto_usage', func_at(func_addr), callee_name
FROM disasm_calls
WHERE callee_name LIKE '%Crypt%' OR callee_name LIKE '%AES%' OR callee_name LIKE '%RSA%'

UNION ALL

SELECT 'network_call', func_at(func_addr), callee_name
FROM disasm_calls
WHERE callee_name IN ('socket', 'connect', 'send', 'recv', 'WSAStartup')

UNION ALL

SELECT 'registry_access', func_at(func_addr), callee_name
FROM disasm_calls
WHERE callee_name LIKE 'Reg%'

ORDER BY check_type, location;

Efficient Pagination

sql
-- Page through large result sets efficiently
SELECT * FROM (
    SELECT
        f.name,
        f.size,
        ROW_NUMBER() OVER (ORDER BY f.size DESC) as row_num
    FROM funcs f
)
WHERE row_num BETWEEN 101 AND 200;  -- Page 2 (100 per page)

EXISTS for Efficient Filtering

sql
-- Functions that have at least one string reference (more efficient than JOIN + DISTINCT)
SELECT f.name
FROM funcs f
WHERE EXISTS (
    SELECT 1 FROM xrefs x
    JOIN strings s ON x.to_ea = s.address
    WHERE x.from_ea BETWEEN f.address AND f.end_ea
);
sql
-- Functions without any calls (leaf functions, EXISTS version)
SELECT f.name, f.size
FROM funcs f
WHERE NOT EXISTS (
    SELECT 1 FROM disasm_calls dc
    WHERE dc.func_addr = f.address
)
ORDER BY f.size DESC;

Hex Address Formatting

IDA uses integer addresses. For display, use printf():

sql
-- 32-bit format
SELECT printf('0x%08X', address) as addr FROM funcs;

-- 64-bit format
SELECT printf('0x%016llX', address) as addr FROM funcs;

-- Auto-width
SELECT printf('0x%X', address) as addr FROM funcs;

Common x86 Instruction Types

When filtering by itype (faster than string comparison):

itypeMnemonicDescription
16call (near)Direct call
18call (indirect)Indirect call
122movMove data
143pushPush to stack
134popPop from stack
159retnReturn
85jzJump if zero
79jnzJump if not zero
27cmpCompare
103nopNo operation

ctree Operation Names

Common Hex-Rays AST node types:

Expressions (cot_*):

  • cot_call - Function call
  • cot_var - Local variable
  • cot_obj - Global object/function
  • cot_num - Numeric constant
  • cot_str - String literal
  • cot_ptr - Pointer dereference
  • cot_ref - Address-of
  • cot_asg - Assignment
  • cot_add, cot_sub, cot_mul, cot_sdiv, cot_udiv - Arithmetic
  • cot_eq, cot_ne, cot_lt, cot_gt - Comparisons
  • cot_land, cot_lor, cot_lnot - Logical
  • cot_band, cot_bor, cot_xor - Bitwise

Statements (cit_*):

  • cit_if - If statement
  • cit_for - For loop
  • cit_while - While loop
  • cit_do - Do-while loop
  • cit_return - Return statement
  • cit_block - Code block

Error Handling

  • No Hex-Rays license: Decompiler tables (pseudocode, ctree*, ctree_lvars) will be empty or unavailable
  • No constraint on decompiler tables: Query will be extremely slow (decompiles all functions)
  • Invalid address: Functions like func_at(addr) return NULL
  • Missing function: JOINs may return fewer rows than expected

Quick Start Examples

"What does this binary do?"

sql
-- Entry points
SELECT * FROM entries;

-- Imported APIs (hints at functionality)
SELECT module, name FROM imports ORDER BY module, name;

-- Interesting strings
SELECT content FROM strings WHERE length > 10 ORDER BY length DESC LIMIT 20;

"Find security-relevant code"

sql
-- Dangerous string functions
SELECT DISTINCT func_at(func_addr) FROM disasm_calls
WHERE callee_name IN ('strcpy', 'strcat', 'sprintf', 'gets');

-- Crypto-related
SELECT * FROM imports WHERE name LIKE '%Crypt%' OR name LIKE '%Hash%';

-- Network-related
SELECT * FROM imports WHERE name LIKE '%socket%' OR name LIKE '%connect%' OR name LIKE '%send%';

"Understand a specific function"

sql
-- Basic info
SELECT * FROM funcs WHERE address = 0x401000;

-- Pseudocode (if Hex-Rays available)
SELECT line FROM pseudocode WHERE func_addr = 0x401000 ORDER BY line_num;

-- Local variables
SELECT name, type, size FROM ctree_lvars WHERE func_addr = 0x401000;

-- What it calls
SELECT callee_name FROM disasm_calls WHERE func_addr = 0x401000;

-- What calls it
SELECT func_at(from_ea) FROM xrefs WHERE to_ea = 0x401000 AND is_code = 1;

"Find all uses of a string"

sql
SELECT s.content, func_at(x.from_ea) as function, printf('0x%X', x.from_ea) as location
FROM strings s
JOIN xrefs x ON s.address = x.to_ea
WHERE s.content LIKE '%config%';

Natural Language Query Examples

These examples show how to translate common user questions into SQL.

Function Signature Queries

"Show me functions that return integers"

sql
-- Using funcs table (recommended - direct and fast)
SELECT name, return_type, arg_count FROM funcs
WHERE return_is_integral = 1
LIMIT 20;

-- Or via types_func_args (for typedef-aware queries)
SELECT DISTINCT type_name FROM types_func_args
WHERE arg_index = -1 AND is_integral_resolved = 1;

"Show me functions that take 4 string arguments"

sql
-- String = char* or wchar_t*
SELECT type_name, COUNT(*) as string_args
FROM types_func_args
WHERE arg_index >= 0
  AND is_ptr_resolved = 1
  AND base_type_resolved IN ('char', 'wchar_t', 'CHAR', 'WCHAR')
GROUP BY type_ordinal
HAVING string_args = 4;

"Which functions return pointers?"

sql
SELECT name, return_type FROM funcs
WHERE return_is_ptr = 1
ORDER BY name LIMIT 20;

"Find void functions with many arguments"

sql
SELECT name, arg_count FROM funcs
WHERE return_is_void = 1 AND arg_count >= 4
ORDER BY arg_count DESC;

"What calling conventions are used?"

sql
SELECT calling_conv, COUNT(*) as count
FROM funcs
WHERE calling_conv IS NOT NULL AND calling_conv != ''
GROUP BY calling_conv ORDER BY count DESC;

Return Value Analysis

"Which functions return 0?"

sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.return_num = 0;

"Find functions that return -1 (error pattern)"

sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.return_num = -1;

"Functions that return their input argument"

sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.returns_arg = 1;

"Functions that return the result of another call (wrappers)"

sql
SELECT DISTINCT f.name FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE r.returns_call_result = 1;

"Functions with multiple return statements"

sql
SELECT f.name, COUNT(*) as return_count
FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
GROUP BY f.address
HAVING return_count > 1
ORDER BY return_count DESC LIMIT 20;

Type Analysis

"Find typedefs that hide pointers (like HANDLE)"

sql
SELECT DISTINCT type_name, arg_type, base_type_resolved
FROM types_func_args
WHERE is_ptr = 0 AND is_ptr_resolved = 1;

"Functions with struct parameters"

sql
SELECT type_name, arg_name, arg_type FROM types_func_args
WHERE arg_index >= 0 AND is_struct = 1;

Combined Queries

"Integer-returning functions with 3+ args that return specific values"

sql
SELECT f.name, f.return_type, f.arg_count, r.return_num
FROM funcs f
JOIN ctree_v_returns r ON r.func_addr = f.address
WHERE f.return_is_integral = 1
  AND f.arg_count >= 3
  AND r.return_num IS NOT NULL
ORDER BY r.return_num;

"Fastcall functions that return pointers"

sql
SELECT name, return_type, arg_count FROM funcs
WHERE calling_conv = 'fastcall' AND return_is_ptr = 1;

Summary: When to Use What

GoalTable/Function
List all functionsfuncs
Functions by return typefuncs WHERE return_is_integral = 1
Functions by arg countfuncs WHERE arg_count >= N
Void functionsfuncs WHERE return_is_void = 1
Pointer-returning functionsfuncs WHERE return_is_ptr = 1
Functions by calling conventionfuncs WHERE calling_conv = 'fastcall'
Find who calls whatxrefs with is_code = 1
Find data referencesxrefs with is_code = 0
Analyze importsimports
Find stringsstrings
Configure string typesrebuild_strings(types, minlen)
Instruction analysisinstructions WHERE func_addr = X
Decompiled codepseudocode WHERE func_addr = X
AST pattern matchingctree WHERE func_addr = X
Call patternsctree_v_calls, disasm_calls
Control flowctree_v_loops, ctree_v_ifs
Return value analysisctree_v_returns
Functions returning specific valuesctree_v_returns WHERE return_num = 0
Pass-through functionsctree_v_returns WHERE returns_arg = 1
Wrapper functionsctree_v_returns WHERE returns_call_result = 1
Variable analysisctree_lvars WHERE func_addr = X
Type informationtypes, types_members
Function signaturestypes_func_args (with type classification)
Functions by return typetypes_func_args WHERE arg_index = -1
Typedef-aware type queriestypes_func_args (surface vs resolved)
Hidden pointer typestypes_func_args WHERE is_ptr = 0 AND is_ptr_resolved = 1
Modify database*_live tables
Jump to Anythingjump_entities('pattern', 'mode')
Entity search (JSON)jump_search('pattern', 'mode', limit, offset)

Remember: Always use func_addr = X constraints on instruction and decompiler tables for acceptable performance.