Create CodeQL Query Development Workshop
This skill guides you through creating custom CodeQL query development workshops from existing, production-grade CodeQL queries. The workshop format uses a test-driven, incremental learning approach where developers progress through stages from simple to complex.
When to Use This Skill
- •Creating training materials for CodeQL query development
- •Teaching developers to build custom security or code quality queries
- •Generating guided learning paths from existing query implementations
- •Building workshops customized to specific business needs or code patterns
Workshop Value Proposition
Custom workshops are more effective than generic tutorials because:
- •Developers learn by building queries that actually matter to their work
- •Real-world query patterns are more motivating than toy examples
- •Teams can train developers on their specific security or quality concerns
- •Workshops scale knowledge transfer from CodeQL experts to their teams
Prerequisites
Before creating a workshop, ensure you have:
- •An existing CodeQL query (.ql file) that is production-ready
- •Passing unit tests for that query (.expected results that match actual results)
- •Understanding of the query's purpose and complexity
- •Access to CodeQL Development MCP Server tools
CodeQL Pack Naming Convention
This repository uses codeql-pack.yml for new CodeQL pack configuration files. CodeQL supports codeql-pack.yml and qlpack.yml equally, but codeql-pack.yml is preferred because it matches the codeql-pack.lock.yml naming convention used by codeql pack install. Treat any reference to qlpack.yml in this workshop or related materials as equivalent to codeql-pack.yml.
Required Inputs
When invoking this skill, you must provide:
- •Source Query Path: Full path to the production query .ql file
- •Source Query Tests Path: Full path to the directory containing unit tests for the query
- •Base Directory: Path where the workshop directory will be created (e.g., /tmp/workshops or <your-repo>/workshops)
- •Workshop Name: Name for the workshop directory (e.g., dataflow-analysis-cpp)
Workshop Output Structure
The skill creates a complete workshop under <base_dir>/<workshop_name>/:
<base_dir>/<workshop_name>/
├── README.md # Workshop overview and setup instructions
├── codeql-workspace.yml # CodeQL workspace configuration
├── build-databases.sh # Script to create test databases
├── exercises/ # Student exercise queries (incomplete)
│ ├── codeql-pack.yml # Query pack config
│ ├── Exercise1.ql
│ ├── Exercise2.ql
│ └── ...
├── exercises-tests/ # Unit tests for exercises
│ ├── codeql-pack.yml # Test pack config (with extractor + dependency on exercises)
│ ├── Exercise1/
│ │ ├── Exercise1.qlref
│ │ ├── Exercise1.expected
│ │ └── test.{ext}
│ └── ...
├── solutions/ # Complete solution queries
│ ├── codeql-pack.yml # Query pack config
│ ├── Exercise1.ql
│ ├── Exercise2.ql
│ └── ...
├── solutions-tests/ # Unit tests for solutions
│ ├── codeql-pack.yml # Test pack config (with extractor + dependency on solutions)
│ ├── Exercise1/
│ │ ├── Exercise1.qlref
│ │ ├── Exercise1.expected
│ │ └── test.{ext}
│ └── ...
├── graphs/ # AST/CFG visualizations
│ ├── Exercise1-ast.txt
│ ├── Exercise1-cfg.txt
│ └── ...
└── tests-common/ # Shared test code and databases
    ├── test.{ext}
    └── codeql-pack.yml
See workshop-structure-reference.md for detailed structure documentation.
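The layout above can be scaffolded with a few lines of shell. This is a minimal sketch rather than part of the skill itself: the function name scaffold_workshop and its argument order are assumptions, the generated files are empty placeholders, and each .qlref is assumed to hold the query path relative to the root of the pack it tests. The top-level README.md, codeql-workspace.yml, build-databases.sh, and the pack files are left for later phases.

```shell
#!/bin/bash
# Hypothetical helper: create the empty directory skeleton for a workshop.
# Usage: scaffold_workshop <base_dir> <workshop_name> <num_exercises> <ext>
scaffold_workshop() {
  local root="$1/$2" count="$3" ext="$4" i=1 kind
  mkdir -p "$root/exercises" "$root/solutions" "$root/graphs" "$root/tests-common"
  while [ "$i" -le "$count" ]; do
    for kind in exercises solutions; do
      # Placeholder query, plus a test directory holding the .qlref,
      # the .expected file, and the test source for this stage.
      touch "$root/$kind/Exercise$i.ql"
      mkdir -p "$root/$kind-tests/Exercise$i"
      printf 'Exercise%s.ql\n' "$i" > "$root/$kind-tests/Exercise$i/Exercise$i.qlref"
      touch "$root/$kind-tests/Exercise$i/Exercise$i.expected"
      touch "$root/$kind-tests/Exercise$i/test.$ext"
    done
    i=$((i + 1))
  done
  # Print the workshop root so callers can capture it.
  echo "$root"
}
```

For example, scaffold_workshop /tmp/workshops dataflow-analysis-cpp 4 cpp would create four exercise and solution stages under /tmp/workshops/dataflow-analysis-cpp.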
Workflow Overview
The workshop creation process follows these phases:
Phase 1: Analysis
- •Analyze Source Query using find_codeql_query_files and explain_codeql_query
- •Identify Complexity to determine number of stages
- •Extract Test Cases from existing unit tests
- •Plan Stages, breaking the query down from simple to complex
Phase 2: Decomposition
Working backwards from the complete query:
- •Identify Decomposition Points (predicates, logic blocks, complexity layers)
- •Define Stage Goals (what each exercise teaches)
- •Create Stage Order (simple to complex progression)
Phase 3: Generation
For each stage (starting with final/complete stage):
- •Generate Solution Query for this stage
- •Create Solution Tests that validate the solution
- •Run Tests using codeql_test_run to ensure they pass
- •Generate Exercise Query by removing implementation details
- •Create Exercise Tests (may match solution tests or be a subset)
Phase 4: Enrichment
- •Generate Graph Outputs (AST/CFG) for each stage using codeql_bqrs_interpret
- •Create build-databases.sh script for test database creation
- •Write README.md with workshop overview, setup, and instructions
- •Create codeql-workspace.yml to configure CodeQL workspace
Phase 5: Validation
- •Test All Solutions: run codeql_test_run on solutions-tests/
- •Verify Test Pass Rate: ensure 100% pass rate for solutions
- •Check File Structure: validate all required files exist
- •Review Exercise Gaps: ensure exercises have appropriate scaffolding
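The file structure check in this phase is easy to automate. The following is a sketch under assumptions: check_workshop_structure is a hypothetical name, and the list of required files is taken from the Workshop Output Structure section, so extend it if your workshop needs more.

```shell
#!/bin/bash
# Hypothetical helper: report required workshop files that are missing.
# Prints one "MISSING: <path>" line per absent file and returns 1 if any are absent.
check_workshop_structure() {
  local root="$1" missing=0 rel
  for rel in \
    README.md \
    codeql-workspace.yml \
    build-databases.sh \
    exercises/codeql-pack.yml \
    exercises-tests/codeql-pack.yml \
    solutions/codeql-pack.yml \
    solutions-tests/codeql-pack.yml \
    tests-common/codeql-pack.yml
  do
    if [ ! -f "$root/$rel" ]; then
      echo "MISSING: $rel"
      missing=1
    fi
  done
  return "$missing"
}
```

Because the function returns a non-zero status when anything is missing, it can gate a later step, for example check_workshop_structure "$WORKSHOP_ROOT" && echo "structure ok".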
Key MCP Server Tools
Query Analysis
- •find_codeql_query_files - Locate query files and dependencies
- •explain_codeql_query - Understand query purpose and logic
- •codeql_resolve_metadata - Extract query metadata
Test Management
- •codeql_test_extract - Create test databases from test code
- •codeql_test_run - Execute tests and validate results
- •codeql_test_accept - Update expected results when needed
Query Execution
- •codeql_query_run - Run queries (including PrintAST, PrintCFG)
- •codeql_query_compile - Validate query syntax
- •codeql_bqrs_interpret - Generate graph outputs from results
Database Operations
- •codeql_database_create - Create CodeQL databases from source
- •codeql_resolve_database - Validate database structure
See mcp-tools-reference.md for detailed tool usage patterns.
Stage Decomposition Strategy
When decomposing a complex query into stages, consider these patterns:
Pattern 1: Syntactic to Semantic
- •Stage 1: Find syntactic elements (e.g., ArrayExpr)
- •Stage 2: Add type constraints (e.g., specific array types)
- •Stage 3: Add semantic analysis (e.g., control flow)
- •Stage 4: Add data flow analysis (e.g., track values)
Pattern 2: Local to Global
- •Stage 1: Local pattern matching
- •Stage 2: Add local control flow
- •Stage 3: Add local data flow
- •Stage 4: Add global data flow
Pattern 3: Simple to Filtered
- •Stage 1: Find all candidates (high recall, low precision)
- •Stage 2: Add basic filtering
- •Stage 3: Add context-aware filtering
- •Stage 4: Eliminate false positives
Pattern 4: Building Blocks
- •Stage 1: Define helper predicates
- •Stage 2: Combine helpers into sources
- •Stage 3: Define sinks
- •Stage 4: Connect sources to sinks with data flow
Exercise Creation Guidelines
What to Remove from Solutions
When creating exercises from solutions:
- •Implementation bodies: Leave predicate signatures with a none() body
- •Complex logic: Replace with // TODO: Implement comments
- •Data flow configs: Provide signature, remove implementation
- •Filter predicates: Keep structure, remove conditions
What to Keep in Exercises
- •Import statements: All imports should be present
- •Type signatures: Full type information for predicates
- •Comments: Helpful hints about what to implement
- •Test scaffolding: Basic structure to guide implementation
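The remove and keep rules above can be spot-checked mechanically. The sketch below uses a hypothetical check_exercise function, not an MCP Server tool. It assumes imports sit on their own lines beginning with the import keyword, and it only checks three things: the exercise is not empty, it is not a byte-for-byte copy of the solution, and it keeps every import line from the solution.

```shell
#!/bin/bash
# Hypothetical helper: sanity-check one exercise against its solution.
# Fails if the exercise is empty, identical to the solution, or drops an import.
check_exercise() {
  local exercise="$1" solution="$2" line imports
  [ -s "$exercise" ] || { echo "empty exercise: $exercise"; return 1; }
  if cmp -s "$exercise" "$solution"; then
    echo "exercise is identical to solution: $exercise"
    return 1
  fi
  # Collect the solution's import lines (an empty result is fine).
  imports="$(grep -E '^[[:space:]]*import[[:space:]]' "$solution" || true)"
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    # Every import line of the solution must appear verbatim in the exercise.
    grep -qxF -- "$line" "$exercise" || { echo "missing import: $line"; return 1; }
  done <<EOF
$imports
EOF
  return 0
}
```

A call such as check_exercise exercises/Exercise1.ql solutions/Exercise1.ql complements, but does not replace, a manual review of how much scaffolding each exercise leaves in place.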
Hints and Documentation
Add inline comments to guide students:
/**
 * Find all array expressions that access a specific type.
 *
 * Hint: Use `.getArrayBase().getType()` to get the base type.
 */
predicate isTargetArrayAccess(ArrayExpr array) {
  // TODO: Implement type checking
  none()
}
Test Creation Guidelines
Test Code Patterns
Create test code (test.{ext}) that includes:
- •Positive cases: Code patterns the query should detect
- •Negative cases: Similar code that should NOT be detected
- •Edge cases: Boundary conditions
- •Comments: Explain what each test case validates
Example for C++:
// POSITIVE CASE: Null pointer dereference
void unsafeFunction() {
    int* ptr = nullptr;
    *ptr = 42; // Should be detected
}

// NEGATIVE CASE: Checked before use
void safeFunction() {
    int* ptr = nullptr;
    if (ptr != nullptr) {
        *ptr = 42; // Should NOT be detected
    }
}

// EDGE CASE: Pointer in complex expression
void edgeCase() {
    int* ptr = nullptr;
    int result = ptr ? *ptr : 0; // Should be detected
}
Expected Results Format
The .expected file uses CodeQL test format:
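| file | line | col | endLine | endCol | message |
| test.cpp | 3 | 5 | 3 | 8 | Null pointer dereference |
| test.cpp | 18 | 17 | 18 | 20 | Null pointer dereference |
The header row above is explanatory only. In .expected files written by codeql test run and codeql test accept, the location fields are merged into a single file:line:col:endLine:endCol cell, followed by the text of the selected element and then the message, for example | test.cpp:3:5:3:8 | <element> | Null pointer dereference |.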
Test Progression
- •Early stages: Fewer expected results (simpler queries)
- •Later stages: More expected results (more comprehensive)
- •Final stage: Should match production query expected results
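The progression above can be checked by counting result rows per stage. This is a rough sketch with assumptions stated up front: check_progression is a hypothetical helper, tests are laid out as <tests_dir>/ExerciseN/ExerciseN.expected, and every result row starts with a pipe character (true for simple problem queries; path queries add section headers that are not counted).

```shell
#!/bin/bash
# Hypothetical helper: print the number of result rows in each stage's
# .expected file and fail if a later stage has fewer rows than an earlier one.
check_progression() {
  local tests_dir="$1" i=1 prev=0 rows file
  while file="$tests_dir/Exercise$i/Exercise$i.expected"; [ -f "$file" ]; do
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    rows="$(grep -c '^|' "$file" || true)"
    echo "Exercise$i: $rows"
    if [ "$rows" -lt "$prev" ]; then
      echo "Exercise$i has fewer results than Exercise$((i - 1))"
      return 1
    fi
    prev="$rows"
    i=$((i + 1))
  done
  return 0
}
```

Workshops that follow the Simple to Filtered pattern may legitimately return fewer rows in later stages, so treat a failure here as a prompt to review the stage order rather than as an error.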
Graph Generation
Generate visual aids for understanding code structure:
PrintAST Graphs
Show Abstract Syntax Tree structure:
{
  "queryName": "PrintAST",
  "queryLanguage": "cpp",
  "database": "tests-common/test.testproj",
  "outputFormat": "graphtext"
}
Use codeql_bqrs_interpret to create graphs/Exercise1-ast.txt.
PrintCFG Graphs
Show Control Flow Graph:
{
  "queryName": "PrintCFG",
  "queryLanguage": "cpp",
  "database": "tests-common/test.testproj",
  "outputFormat": "graphtext"
}
Use codeql_bqrs_interpret to create graphs/Exercise1-cfg.txt.
Build Scripts
build-databases.sh Template
#!/bin/bash
set -e

# Resolve the workshop root from this script's location.
WORKSHOP_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
TEST_SOURCE="${WORKSHOP_ROOT}/tests-common"

echo "Building test databases..."

# For each test database needed
for db_name in test1 test2; do
  DB_PATH="${WORKSHOP_ROOT}/tests-common/${db_name}.testproj"
  echo "Creating database: ${db_name}"
  # Remove any stale database before recreating it.
  rm -rf "${DB_PATH}"
  codeql database create \
    --language={language} \
    --source-root="${TEST_SOURCE}" \
    "${DB_PATH}" \
    --command="clang -fsyntax-only ${TEST_SOURCE}/${db_name}.c"
done

echo "Database creation complete!"
codeql-workspace.yml Template
provide:
  - '*/codeql-pack.yml'
This makes all codeql-pack.yml files available in the workspace.
Workshop README Template
The generated README.md should include:
- •Title and Overview: What the workshop teaches
- •Prerequisites: Required knowledge and tools
- •Setup Instructions: How to clone, install dependencies, build databases
- •Workshop Structure: Overview of exercise progression
- •How to Use: Instructions for working through exercises
- •Validation: How to test exercise solutions
- •Solutions: Where to find reference solutions
- •Additional Resources: Links to CodeQL documentation
See example workshop READMEs for templates.
Language-Specific Considerations
File Extensions by Language
- •C/C++: .c, .cpp, .h, .hpp
- •C#: .cs
- •Go: .go
- •Java: .java
- •JavaScript/TypeScript: .js, .ts
- •Python: .py
- •Ruby: .rb
Test Database Creation
Language-specific database creation varies:
- •C/C++: Requires build command (e.g., clang -fsyntax-only)
- •Java: Requires build tool (e.g., mvn clean install)
- •JavaScript: Usually no build command needed
- •Python: Usually no build command needed
Adjust build-databases.sh accordingly.
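One way to make that adjustment is to select the build command per language. The sketch below is illustrative: database_create_command is a hypothetical function, the build commands are the examples from the list above rather than requirements, and the test.c file name is an assumption. It prints the command line instead of running it, so it can be reviewed before use; paths containing spaces would need extra quoting.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) a `codeql database create` command line,
# adding --command only for languages that need a build step.
# Usage: database_create_command <language> <source_root> <db_path>
database_create_command() {
  local language="$1" source_root="$2" db_path="$3" build=""
  case "$language" in
    cpp)  build="clang -fsyntax-only $source_root/test.c" ;;
    java) build="mvn clean install" ;;
    javascript|python) build="" ;;   # usually no build command needed
    *) echo "unsupported language: $language" >&2; return 1 ;;
  esac
  printf 'codeql database create --language=%s --source-root=%s' "$language" "$source_root"
  if [ -n "$build" ]; then
    printf ' --command="%s"' "$build"
  fi
  printf ' %s\n' "$db_path"
}
```

For instance, database_create_command python tests-common tests-common/test.testproj prints a command with no --command flag, while the cpp case includes the clang invocation.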
Library Dependencies
Include appropriate CodeQL libraries in codeql-pack.yml:
- •C/C++: codeql/cpp-all
- •C#: codeql/csharp-all
- •Go: codeql/go-all
- •Java: codeql/java-all
- •JavaScript/TypeScript: codeql/javascript-all
- •Python: codeql/python-all
- •Ruby: codeql/ruby-all
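As a sketch of how these dependencies fit into the pack files, the hypothetical write_packs function below generates a query pack and its test pack. The pack names under the workshop/ scope, the 0.0.1 version, and the "*" version ranges are placeholders chosen for illustration. The test pack declares the extractor and depends on the query pack it tests, as described in the Workshop Output Structure section.

```shell
#!/bin/bash
# Hypothetical helper: write a query pack and its matching test pack.
# Usage: write_packs <workshop_root> <exercises|solutions> <language>
write_packs() {
  local root="$1" kind="$2" language="$3"
  mkdir -p "$root/$kind" "$root/$kind-tests"
  # Query pack: depends on the standard library pack for the language.
  cat > "$root/$kind/codeql-pack.yml" <<EOF
name: workshop/$kind
version: 0.0.1
dependencies:
  codeql/$language-all: "*"
EOF
  # Test pack: names the extractor and depends on the query pack above.
  cat > "$root/$kind-tests/codeql-pack.yml" <<EOF
name: workshop/$kind-tests
version: 0.0.1
extractor: $language
dependencies:
  workshop/$kind: "*"
EOF
}
```

Calling write_packs "$WORKSHOP_ROOT" solutions cpp and then write_packs "$WORKSHOP_ROOT" exercises cpp would cover all four pack files for a C/C++ workshop.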
Java-Specific API Notes
When writing Java queries, note these API patterns:
- •Primitive Types: No ByteType class exists. Use PrimitiveType with .getName() = "byte", e.g.: ace.getType().(Array).getElementType().(PrimitiveType).getName() = "byte"
- •Array Initializers: No hasInit() method. Use exists(ace.getInit()) to check for initializers
- •Method Calls: No MethodAccess class. Use MethodCall for method invocations
- •Deduplication: When matching both ArrayCreationExpr and ArrayInit, exclude ArrayInit nodes that are part of an ArrayCreationExpr to avoid duplicate results: not exists(ArrayCreationExpr ace | ace.getInit() = ai)
Validation Checklist
Before considering the workshop complete:
- • All solution queries compile without errors
- • All solution tests pass at 100%
- • Exercise queries have appropriate scaffolding (not empty, not complete)
- • Expected results progress logically from stage to stage
- • Test code covers positive, negative, and edge cases
- • Graph outputs exist for stages where helpful
- • build-databases.sh successfully creates all needed databases
- • README.md provides clear setup and usage instructions
- • codeql-workspace.yml correctly references all codeql-pack.yml files
Common Pitfalls
Avoid These Mistakes
- •Too many stages: Keep to 4-8 stages; more than that fragments the learning
- •Too few stages: 1-2 stages don't provide enough incremental learning
- •Uneven difficulty: Each stage should add similar complexity increments
- •Missing test cases: Every query behavior should have test coverage
- •Incomplete exercises: Exercises should have enough scaffolding to guide students
- •Overly complete exercises: Don't give away the solution in exercise code
- •Inconsistent test results: Solution tests must pass reliably
Example Workshops
This skill can be used with CodeQL queries from any repository. To see example workshops created with this skill, refer to workshop repositories that demonstrate the standard format and structure.
Reference Materials
For detailed guidance:
- •Workshop Structure Reference - Complete structure specification
- •MCP Tools Reference - Tool usage patterns for workshop creation
- •Stage Decomposition Examples - Patterns for breaking down queries
Working Workshop Examples
- •Example C++ Simple - Basic C++ null pointer dereference workshop structure
Some workshops may have optional advanced branches:
├── exercises/
│ ├── Exercise1.ql
│ ├── Exercise2.ql
│ ├── Exercise3.ql
│ ├── Exercise4-basic.ql
│ └── Exercise4-advanced.ql
Multiple Learning Paths
Consider creating workshops with different focuses from the same source query:
- •Path A: Focus on syntactic analysis
- •Path B: Focus on data flow
- •Path C: Focus on false positive elimination
Difficulty Levels
Add difficulty metadata to exercises:
/**
 * @name Find Array Access
 * @description Identify array expressions
 * @kind problem
 * @difficulty beginner
 * @exercise 1
 */
Troubleshooting
Solution Tests Fail
If solution tests don't pass:
- •Run codeql_test_run with verbose output
- •Compare actual vs expected results
- •Verify test database was created correctly
- •Check query logic matches intended behavior
- •Use codeql_test_accept to update .expected if needed
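Comparing actual and expected results can be scripted. The sketch below assumes that a failing test leaves a .actual file next to its .expected file, which is how the CodeQL CLI behaves to the best of our knowledge; show_test_diffs is a hypothetical helper name, not an MCP Server tool.

```shell
#!/bin/bash
# Hypothetical helper: print a unified diff for every test directory that
# contains a .actual file next to its .expected file.
# Returns 1 if any difference (or a missing .expected file) was found.
show_test_diffs() {
  local tests_dir="$1" actual expected status=0
  for actual in "$tests_dir"/*/*.actual; do
    [ -f "$actual" ] || continue   # the glob matched nothing
    expected="${actual%.actual}.expected"
    if ! diff -u "$expected" "$actual"; then
      status=1
    fi
  done
  return "$status"
}
```

Running show_test_diffs solutions-tests after a failed test run shows which rows were added or dropped, which helps decide whether to fix the query or accept the new results.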
Exercise Too Difficult
If students can't complete an exercise:
- •Add more scaffolding in the exercise query
- •Add more detailed hints in comments
- •Consider splitting into two stages
- •Provide more example patterns in test code
Query Doesn't Compile
If generated queries have compilation errors:
- •Run codeql_query_compile to see specific errors
- •Check import statements are correct
- •Verify qlpack dependencies are installed
- •Ensure predicate signatures are valid
Tips for Success
- •Start simple: First workshop should be straightforward
- •Test frequently: Run tests after creating each stage
- •Iterate on stages: Refine stage boundaries based on testing
- •Get feedback: Have someone unfamiliar try the workshop
- •Document well: Clear instructions reduce support burden
- •Version control: Track workshop iterations in git
- •Reuse test code: Same test code across all stages when possible
Related Skills
- •create-codeql-query-tdd-generic - TDD approach to query development
- •create-codeql-query-unit-test-cpp - Creating C++ query tests
- •create-codeql-query-unit-test-java - Creating Java query tests
- •create-codeql-query-unit-test-javascript - Creating JavaScript query tests
- •create-codeql-query-unit-test-python - Creating Python query tests
Success Metrics
A successful workshop:
- •Completable: Students can finish with provided guidance
- •Educational: Each stage teaches a new concept
- •Validated: All tests pass reliably
- •Practical: Query addresses real-world concerns
- •Scalable: Can be delivered to multiple teams