Capability-Oriented Security Reasoning Framework
Non-goal: This framework does not attempt to classify code as malicious or benign. It enumerates potential capability changes and contextual signals that may support or refute security hypotheses.
Goal: Provide a constrained vocabulary and reasoning structure for describing what becomes possible when code changes, enabling systematic capability expansion analysis.
Atomic unit: Version transition (diff), not standalone code. Capabilities are attributed to added/modified hunks.
Core Principle: Capability-First Reasoning
Traditional approach:
"Does this match a known attack pattern?" → Binary classification
This framework:
"What new affordances does this create?" → Capability description → Contextual reasoning
Capability Taxonomy
Use this vocabulary to describe what code can do, not what it "is."
Capabilities should be attributed to added/modified hunks where possible. Existing capabilities present in both versions are background context, not delta.
Network Capabilities
- •network.http_client - Can initiate HTTP/HTTPS requests
- •network.socket - Can create raw network sockets
- •network.dns - Can perform DNS queries
- •network.alternate_protocol - Can use FTP, SMTP, etc.
Environment Capabilities
- •environment.read_single - Can read a specific environment variable
- •environment.read_wholesale - Can enumerate all environment variables
- •environment.write - Can modify the environment
Filesystem Capabilities
- •filesystem.read_generic - Can read files
- •filesystem.read_sensitive - Can access .ssh, .aws, .env, etc.
- •filesystem.write - Can create/modify files
- •filesystem.permission_change - Can chmod/chown files
Process Capabilities
- •process.spawn - Can create child processes
- •process.exec - Can execute system commands
- •process.eval - Can dynamically execute code
Data Transformation Capabilities
- •encoding.base64 - Can encode/decode base64
- •encoding.hex - Can encode/decode hexadecimal
- •encoding.compress - Can compress/decompress (gzip, zlib)
- •crypto.encrypt - Can encrypt data
- •crypto.decrypt - Can decrypt data
Conditional Execution Capabilities
- •conditional.environment_gated - Execution depends on environment variables
- •conditional.time_gated - Execution depends on date/time
- •conditional.platform_gated - Execution depends on OS/platform
- •conditional.input_gated - Execution depends on function arguments
Execution Phase Capabilities (CRITICAL for supply-chain)
- •phase.install_time - Runs during package installation (npm lifecycle hooks, setup.py)
- •phase.import_time - Runs when module is imported (module-level side effects)
- •phase.build_time - Runs during build/compilation (build scripts, webpack)
- •phase.runtime - Runs when explicitly invoked via API
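Why phase matters: Install-time code runs as soon as the package is installed, before most consumers have reviewed it. Build-time code can make the shipped artifact diverge from the reviewed source, which is what enables XZ-style attacks.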
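The taxonomy above works best as a closed vocabulary, so that annotations cannot drift into free-form labels. The following is a minimal Python sketch of that idea; the names `TAXONOMY` and `is_known_capability` are illustrative assumptions and are not part of any tool described in this framework.

```python
# Hypothetical helper: the closed capability vocabulary listed in the taxonomy.
TAXONOMY = {
    "network": {"http_client", "socket", "dns", "alternate_protocol"},
    "environment": {"read_single", "read_wholesale", "write"},
    "filesystem": {"read_generic", "read_sensitive", "write", "permission_change"},
    "process": {"spawn", "exec", "eval"},
    "encoding": {"base64", "hex", "compress"},
    "crypto": {"encrypt", "decrypt"},
    "conditional": {"environment_gated", "time_gated", "platform_gated", "input_gated"},
    "phase": {"install_time", "import_time", "build_time", "runtime"},
}


def is_known_capability(identifier: str) -> bool:
    """Return True if `identifier` is a `family.name` pair listed in the taxonomy."""
    family, separator, name = identifier.partition(".")
    return bool(separator) and name in TAXONOMY.get(family, set())


print(is_known_capability("network.http_client"))  # True
print(is_known_capability("network.exfiltrate"))   # False: not in the vocabulary
```

Rejecting unknown identifiers early keeps every annotation expressible in the shared vocabulary.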
Counterfactual Reasoning Framework
For each code change, systematically enumerate:
1. Capability Delta
Before: List capabilities present in previous version
After: List capabilities present in new version
Added: Capabilities in After but not in Before (focus here)
Removed: Capabilities in Before but not in After
Attribution: Link capabilities to specific hunks/lines where possible.
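The delta is plain set arithmetic over the Before and After inventories. A minimal sketch, assuming each inventory is a set of taxonomy identifiers (the function name `capability_delta` is hypothetical):

```python
def capability_delta(before: set[str], after: set[str]) -> dict[str, list[str]]:
    """Split two capability inventories into added, removed, and preexisting."""
    return {
        "added": sorted(after - before),        # the focus of the analysis
        "removed": sorted(before - after),
        "preexisting": sorted(before & after),  # background context, not delta
    }


before = {"filesystem.read_generic", "encoding.base64"}
after = {"filesystem.read_generic", "environment.read_wholesale", "network.http_client"}
print(capability_delta(before, after)["added"])
# ['environment.read_wholesale', 'network.http_client']
```

This sketch works on identifiers only; attribution to hunks and lines still has to be carried alongside each capability.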
2. Affordance Questions
For each added capability, ask:
- •Reach: What data can this capability access?
- •Transform: How can that data be modified?
- •Transmit: Where can that data be sent?
- •Persist: Can effects outlive the process?
- •Trigger: Under what conditions does this activate?
- •Phase: When does this execute (install/import/build/runtime)?
3. Composition Analysis
For capability combinations, describe:
- •Data flow: A → B → C (e.g., env_read → encode → network)
- •Control flow: IF condition THEN capability (e.g., if env.CI then network.http)
- •Timing: Sequential, parallel, or conditional chains
- •Phase interaction: Does install-time code enable runtime behavior?
4. Intent Alignment Assessment
Compare observed capabilities with stated package purpose:
- •Stated purpose: From package description, README, documentation
- •Implied capabilities: What capabilities does purpose require?
- •Observed capabilities: What capabilities exist in code?
- •Alignment gap: Capabilities present but not implied by purpose
5. Uncertainty Qualification
Observation Confidence:
- •HIGH: Capability is explicit (imports + callsite visible in code)
- •MEDIUM: Capability inferred (wrapper function, indirect call, dynamic import)
- •LOW: Capability speculative (requires runtime resolution, obfuscated)
Dynamic Resolution Flag:
- •requires_dynamic_resolution: true - Cannot determine statically (eval, computed imports)
- •requires_dynamic_resolution: false - Statically observable
Context Budget Policy
To prevent hidden overfitting and ensure reproducible evaluation:
Default context (always provide):
- •Changed files only (diffs)
- •Minimal package metadata (name, version, 1-sentence description)
Escalation context (optional, must log):
- •Full file context (not just diffs)
- •Complete README
- •Dependency tree
- •Maintainer history
Logging requirement: If escalating beyond default context, document what additional context was used and why.
This ensures methods sections can accurately describe information available to the model.
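One way to make the logging requirement hard to skip is to route every escalation through a small record that refuses entries without a reason. The sketch below is an assumption about how that could look; `ContextLog` and the context labels (`diff`, `readme`, and so on) are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical labels for the default and escalation context listed above.
DEFAULT_CONTEXT = {"diff", "package_metadata"}
ESCALATION_CONTEXT = {"full_file", "readme", "dependency_tree", "maintainer_history"}


@dataclass
class ContextLog:
    """Tracks which context was used and why any escalation happened."""

    used: set[str] = field(default_factory=lambda: set(DEFAULT_CONTEXT))
    escalations: list[dict[str, str]] = field(default_factory=list)

    def escalate(self, kind: str, reason: str) -> None:
        if kind not in ESCALATION_CONTEXT:
            raise ValueError(f"unknown context kind: {kind}")
        if not reason.strip():
            raise ValueError("escalation requires a documented reason")
        self.used.add(kind)
        self.escalations.append({"context": kind, "reason": reason})


log = ContextLog()
log.escalate("readme", "one-sentence description did not state the package purpose")
print(log.escalations)
```

The `escalations` list is what a methods section would later draw on to describe the information available during analysis.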
Available Tools
Note: Tools are executable scripts in the tools/ directory. Call them via bash when needed.
1. extract_capabilities (REQUIRED)
Extracts security-relevant capabilities from code with diff-aware attribution.
Purpose: Build factual inventory of what code can do
When to use: Always, as first step in analysis
Returns: List of capabilities with:
- •capability - Taxonomy identifier
- •phase - Execution phase (if detectable)
- •evidence_span - {file, hunk_id, start_line, end_line}
- •origin - "added" | "removed" | "preexisting"
- •confidence_obs - "HIGH" | "MEDIUM" | "LOW"
- •requires_dynamic_resolution - true | false
- •context - Code snippet showing capability
Example:
extract_capabilities(
    old_code="...",
    new_code="import requests\nif os.environ.get('CI'): requests.get(...)",
    language="python"
)
# Returns: [
#   {
#     capability: "network.http_client",
#     phase: "import_time",
#     evidence_span: {file: "main.py", hunk_id: 1, start_line: 1, end_line: 1},
#     origin: "added",
#     confidence_obs: "HIGH",
#     requires_dynamic_resolution: false,
#     context: "import requests"
#   },
#   {
#     capability: "conditional.environment_gated",
#     phase: "import_time",  # module-level statement, so it runs on import
#     evidence_span: {file: "main.py", hunk_id: 2, start_line: 2, end_line: 2},
#     origin: "added",
#     confidence_obs: "HIGH",
#     requires_dynamic_resolution: false,
#     context: "if os.environ.get('CI')"
#   }
# ]
2. analyze_execution_paths (OPTIONAL - Confirmatory Only)
Surfaces potential execution paths through code.
Purpose: Understand how capabilities might compose
When to use: When you need to trace data/control flow
NOT for: Determining reachability or confirmed behavior
Returns:
- •possible_paths - Sequences of capability nodes
- •conditions - Normalized triggers
- •note - Always includes "possible, not confirmed"
- •Never returns "reachable: true" or definitive flow
Example:
analyze_execution_paths(
code="...",
language="javascript"
)
# Returns: {
# possible_paths: ["env_read → encode → network", "env_read → filesystem"],
# conditions: ["process.env.CI", "process.platform === 'linux'"],
# note: "These are possible paths based on static analysis, not confirmed execution"
# }
3. search_capability_examples (OPTIONAL - Explanatory Only)
Finds historical examples where capability overlap exists.
Purpose: Provide context, not classification
When to use: To explain or provide evidence for hypothesis
NOT for: Pattern matching, similarity scoring, or labeling
Returns (sanitized schema):
- •example_name - Identifier only
- •capabilities_overlap - List of overlapping capabilities
- •why_relevant - One sentence explanation
- •caution - Always included disclaimer
NO similarity scores. NO "this matches X" language.
Example:
search_capability_examples(
capabilities=["environment.read_wholesale", "network.http_client", "phase.install_time"]
)
# Returns: [
# {
# example_name: "ctx-2021",
# capabilities_overlap: ["environment.read_wholesale", "network.http_client"],
# why_relevant: "Historical example of wholesale env access + network transmission",
# caution: "Overlap exists for context. Does not indicate malicious intent."
# }
# ]
Capability Risk Composition Matrix
This describes potential security implications of capability combinations, not verdicts.
| Capabilities | Potential Implication | Why Notable |
|---|---|---|
| environment.read_wholesale + network.http_client | Data exfiltration channel | All env vars accessible + transmission capability |
| process.exec + network.http_client | Remote command execution channel | External input could control commands |
| filesystem.read_sensitive + encoding.base64 + network.http_client | Credential theft channel | Sensitive data + obfuscation + transmission |
| conditional.environment_gated + network.http_client | Selective activation | Behavior varies by environment (CI vs local) |
| phase.install_time + network.http_client | Pre-review execution | Runs before code review, in high-privilege context |
| phase.build_time + filesystem.write | Build-time injection | Can modify artifacts not in source control |
| encoding.base64 + process.eval | Obfuscated code execution | Hidden logic execution |
Note: These describe possibilities, not probabilities or intentions.
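The matrix can be read as a lookup from capability sets to descriptive implications. The sketch below encodes the rows above and reports every row whose capabilities are all present; `COMPOSITIONS` and `notable_compositions` are hypothetical names, and the output is a description to reason about, not a score or a verdict.

```python
# Each entry: (required capabilities, potential implication), taken from the matrix.
COMPOSITIONS = [
    ({"environment.read_wholesale", "network.http_client"}, "Data exfiltration channel"),
    ({"process.exec", "network.http_client"}, "Remote command execution channel"),
    ({"filesystem.read_sensitive", "encoding.base64", "network.http_client"},
     "Credential theft channel"),
    ({"conditional.environment_gated", "network.http_client"}, "Selective activation"),
    ({"phase.install_time", "network.http_client"}, "Pre-review execution"),
    ({"phase.build_time", "filesystem.write"}, "Build-time injection"),
    ({"encoding.base64", "process.eval"}, "Obfuscated code execution"),
]


def notable_compositions(capabilities: set[str]) -> list[str]:
    """Return the potential implications whose required capabilities are all present."""
    return [
        implication
        for required, implication in COMPOSITIONS
        if required <= capabilities  # subset test: every required capability observed
    ]


observed = {"environment.read_wholesale", "network.http_client", "phase.install_time"}
print(notable_compositions(observed))
# ['Data exfiltration channel', 'Pre-review execution']
```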
Historical Capability Pattern Examples
These are post-hoc explanations, not detection rules.
Example: event-stream (2018)
Capabilities observed:
- •environment.read_single (npm_package_description)
- •conditional.environment_gated
- •crypto.decrypt
- •phase.runtime
Use of this example: Illustrates that environment-gated execution can enable targeted attacks. Does NOT mean all env-gated code is malicious.
Example: ua-parser-js (2021)
Capabilities observed:
- •conditional.platform_gated (process.platform)
- •process.spawn
- •phase.install_time
Use of this example: Shows install-time + platform-gating pattern. Does NOT mean install hooks indicate compromise.
Example: ctx/phpass (2022)
Capabilities observed:
- •environment.read_wholesale (os.environ)
- •encoding.base64
- •network.http_client
- •phase.install_time (setup.py)
Use of this example: Demonstrates wholesale env + encoding + network pattern. Does NOT make this combination automatically suspicious.
Example: XZ Utils (CVE-2024-3094, 2024)
Capabilities observed:
- •phase.build_time (injection in release tarball, not git)
- •conditional.environment_gated (SSH + systemd context)
- •filesystem.write (binary blobs)
- •Long-term social engineering
Use of this example: Illustrates build-time vs source-time capability divergence. Does NOT mean all build scripts are suspect.
False Positive Awareness
Benign code often has security-relevant capabilities:
Telemetry/Analytics
Capabilities: network.http_client + conditional.environment_gated
Benign when: Documented, opt-out available, analytics domain matches package
Check: Is DISABLE_ANALYTICS respected? Is domain in README?
Update Checks
Capabilities: network.http_client
Benign when: Checking version only, not sending user data
Check: Is request to package registry? Is response only version info?
License Validation
Capabilities: network.http_client + environment.read_single
Benign when: Commercial package, license endpoint documented
Check: Is package commercial? Is validation endpoint disclosed?
Handling Obfuscated Code
Malicious code is often heavily obfuscated to evade analysis. This framework includes strategies for analyzing obfuscated code.
Obfuscation Indicators
- •Hex-encoded function names (_0x4e9bf4, _0x112fa8)
- •Large arrays of encoded strings
- •Self-modifying code patterns
- •Computed property access (window[_0x4e9bf4(0x174)])
- •Nested function calls with numeric offsets
- •Unusual arithmetic expressions as array indices
De-Obfuscation Strategy
When encountering obfuscated code:
- •Identify String Arrays: Look for large arrays containing encoded strings
  - •Often named _0xNNNN or similar patterns
  - •Usually defined at module/function scope
- •Find Decoder Functions: Locate functions that map indices to strings
  - •Pattern: function _0xNNNN(index) { return array[index - offset]; }
  - •May include string transformations (base64, rot13, etc.)
- •Trace High-Value API Calls: Focus on capability-relevant APIs even if obfuscated
  - •Look for patterns like window[...] (DOM access)
  - •Network APIs: fetch, XMLHttpRequest, .get, .post, .send
  - •Crypto APIs: wallet-related strings in arrays
  - •Environment: process, env, global object access
- •Extract String Literals: Analyze string array contents
  - •Cryptocurrency addresses (bc1, 0x, etc.)
  - •Domain names and URLs
  - •API endpoint patterns
  - •Wallet-related terms (ethereum, solana, bitcoin)
- •Infer Capabilities from Context: Even without full de-obfuscation
  - •window[encoded](encoded_method) → likely DOM/browser API
  - •Conditional checks + network → environment-gated behavior
  - •Large encoded arrays + network → possible data exfiltration channel
Obfuscated Code Analysis Workflow
1. Identify obfuscation pattern (array + decoder function)
   ↓
2. Extract string array contents (literal strings)
   ↓
3. Search for security-relevant keywords:
   - wallet, ethereum, solana, bitcoin, crypto
   - fetch, XMLHttpRequest, request, http
   - window, document, navigator
   - process.env, os.environ
   ↓
4. Map API patterns to capabilities:
   - window.ethereum → credential_access (wallet interaction)
   - fetch/XHR → network.http_client
   - Conditionals → conditional.environment_gated
   ↓
5. Describe capabilities with:
   - confidence: LOW/MEDIUM (due to obfuscation)
   - requires_dynamic_resolution: true
   - evidence: String literals found in array
Example: Obfuscated Wallet Stealer
const _0x112fa8=_0x180f;
function _0x180f(_0x240418,_0xdfe6b8){
const _0x3b4f1d=_0x550a();
return _0x3b4f1d[_0x240418-0x100];
}
function _0x550a(){
return ['ethereum','solana','bitcoin','fetch','send'];
}
typeof window[_0x112fa8(0x100)]!='undefined'?checkWallet():skip();
Capabilities identified (even without full de-obfuscation):
- •network.http_client (confidence: MEDIUM) - 'fetch', 'send' in string array
- •credential_access (confidence: MEDIUM) - 'ethereum', 'solana', 'bitcoin' + window access
- •conditional.environment_gated (confidence: HIGH) - typeof check on window[...], where index 0x100 decodes to 'ethereum' (i.e. window.ethereum)
- •requires_dynamic_resolution: true - Obfuscated control flow
Evidence: Lines where string array contains wallet-related terms, lines where window[encoded] pattern appears
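Steps 2 and 3 of the obfuscated-code workflow (extract string literals, then search them for security-relevant keywords) can be approximated with a regular expression. The sketch below is deliberately naive: it assumes plain single- or double-quoted JavaScript literals and ignores template literals, comments, and regex literals. The names `extract_string_literals`, `keyword_hits`, and `KEYWORDS` are hypothetical, and the keyword groups are the ones listed in the workflow.

```python
import re

# Keyword groups from step 3 of the workflow (group names are illustrative).
KEYWORDS = {
    "wallet": ("wallet", "ethereum", "solana", "bitcoin", "crypto"),
    "network": ("fetch", "xmlhttprequest", "request", "http"),
    "browser": ("window", "document", "navigator"),
    "environment": ("process.env", "os.environ"),
}

# Matches '...' or "..." and tolerates backslash escapes inside the literal.
STRING_LITERAL = re.compile(r"""(['"])((?:\\.|(?!\1).)*)\1""")


def extract_string_literals(js_source: str) -> list[str]:
    """Return the contents of quoted string literals, in source order."""
    return [match.group(2) for match in STRING_LITERAL.finditer(js_source)]


def keyword_hits(literals: list[str]) -> dict[str, list[str]]:
    """Group literals by the keyword family they contain (case-insensitive)."""
    hits: dict[str, list[str]] = {}
    for literal in literals:
        lowered = literal.lower()
        for group, words in KEYWORDS.items():
            if any(word in lowered for word in words):
                hits.setdefault(group, []).append(literal)
    return hits


SAMPLE = """
function _0x550a(){
  return ['ethereum','solana','bitcoin','fetch','send'];
}
typeof window[_0x112fa8(0x100)]!='undefined'?checkWallet():skip();
"""

literals = extract_string_literals(SAMPLE)
print(keyword_hits(literals))
# {'wallet': ['ethereum', 'solana', 'bitcoin'], 'network': ['fetch']}
```

Hits found this way are evidence for a LOW or MEDIUM confidence observation with requires_dynamic_resolution: true; a keyword in a string array does not show that the corresponding API is ever called.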
Confidence Levels for Obfuscated Code
- •HIGH confidence: When string literals directly indicate capabilities (e.g., "https://evil.com" in array)
- •MEDIUM confidence: When API patterns are recognizable despite obfuscation
- •LOW confidence: When only structural patterns suggest capabilities
Always mark: requires_dynamic_resolution: true for heavily obfuscated code
Analysis Workflow
- •Extract capabilities (use extract_capabilities tool)
  - •Get diff-attributed inventory
  - •Note phase, origin, confidence for each
- •Compute capability delta
  - •Focus on origin: "added"
  - •Background context: origin: "preexisting"
- •Describe affordances (use counterfactual framework)
  - •What becomes possible that wasn't before?
  - •How do capabilities compose?
  - •What phase do they execute in?
- •Assess intent alignment (compare to package purpose)
  - •Do capabilities match stated purpose?
  - •Is there an alignment gap?
- •(Optional) Check execution paths (use analyze_execution_paths)
  - •How might capabilities connect?
  - •What data flows are possible?
- •(Optional) Find examples (use search_capability_examples)
  - •Has overlap occurred before?
  - •What context do historical cases provide?
- •Render analysis (describe, don't classify)
  - •Enumerate capabilities with evidence
  - •Describe potential implications
  - •State confidence and uncertainty
  - •Provide context
Output Format
Your analysis should describe what is, not what it means:
✅ Good Output Format
Capability Delta:
- Added: network.http_client (line 15, hunk 2, confidence: HIGH, phase: import_time)
- Added: environment.read_wholesale (line 12, hunk 2, confidence: HIGH, phase: runtime)
- Added: conditional.environment_gated (line 11, hunk 2, confidence: HIGH, condition: os.environ.get('CI'))

Evidence Spans:
- File: main.py, hunk 2, lines 11-15 (new code added in this version)

Composition:
- Observed path: env_read → network (lines 12-15)
- Conditional: Only when os.environ.get('CI') is truthy
- Phase: import_time network setup, runtime execution

Affordance Description:
- This combination creates a channel for environment variable transmission
- Activation is selective (CI environments only)
- All environment variables are accessible (wholesale access)
- Executes when package is imported (phase.import_time for network import)

Intent Alignment:
- Package purpose: "Simple date formatting utility"
- Implied capabilities: String manipulation, date parsing
- Observed capabilities: Network transmission, environment access
- Gap: Network and environment capabilities not implied by "date formatting"

Uncertainty:
- Confidence (observation): HIGH - Direct evidence in added hunks
- Requires dynamic resolution: false - All capabilities statically observable
- Alternative interpretation: None identified

Historical Context (from search_capability_examples):
- Overlap exists with ctx-2021: env.read_wholesale + network.http_client
- Caution: Overlap provides context only, not classification
❌ Bad Output Format
VERDICT: CRITICAL RISK - Malicious credential theft detected
CONFIDENCE: 95%
Pattern match: ctx attack (similarity: 0.95)
BLOCK DEPLOYMENT
Key Constraints
- •No autonomous conclusions: Tools surface data, YOU reason
- •No risk scoring: Describe implications, don't score them
- •No classification: Enumerate capabilities, don't label "malicious/benign"
- •Pattern examples are explanatory: Historical overlap provides context, not verdicts
- •Confidence is about observation: How certain are you about what code does, not what it "is"
- •Diff-scoped attribution: Link capabilities to specific hunks where possible
- •Phase-aware analysis: Always note when code executes (install/import/build/runtime)
- •Context budget: Log any escalation beyond default context
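Some of these constraints can be spot-checked mechanically before an analysis is submitted. The sketch below flags a few surface patterns that a descriptive report should not contain (verdict language, numeric confidence, similarity scores, deployment directives). The pattern list and the name `constraint_violations` are assumptions made for illustration; passing this check does not show that a report satisfies the constraints, it only catches obvious slips.

```python
import re

# Hypothetical surface checks derived from the constraints above.
FORBIDDEN_PATTERNS = {
    "verdict language": re.compile(r"\bverdict\b", re.IGNORECASE),
    "numeric confidence": re.compile(r"\bconfidence\b[^\n]*?\d+\s*%", re.IGNORECASE),
    "similarity score": re.compile(r"\bsimilarity\b\s*:?\s*\d", re.IGNORECASE),
    "deployment directive": re.compile(r"\bblock deployment\b", re.IGNORECASE),
}


def constraint_violations(report: str) -> list[str]:
    """Return the names of forbidden patterns that appear in the report text."""
    return [name for name, pattern in FORBIDDEN_PATTERNS.items() if pattern.search(report)]


bad_report = (
    "VERDICT: CRITICAL RISK - Malicious credential theft detected\n"
    "CONFIDENCE: 95%\n"
    "Pattern match: ctx attack (similarity: 0.95)\n"
    "BLOCK DEPLOYMENT"
)
good_line = "Added: network.http_client (line 15, hunk 2, confidence: HIGH, phase: import_time)"

print(constraint_violations(bad_report))
print(constraint_violations(good_line))  # []
```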
This Framework Defines Your Dataset Labels
Direct mapping to annotation schema:
- •capability_delta[] - List of added/removed capabilities
- •trigger_surface[] - Conditional execution patterns
- •phase_delta[] - Changes in execution phase
- •alignment_gap - Qualitative intent mismatch description
- •confidence_obs - HIGH/MEDIUM/LOW per capability
- •evidence_span - Localization for each capability
- •requires_dynamic_resolution - Static/dynamic analysis boundary
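As a rough illustration of how these fields could fit together in one record, the sketch below models an annotation with Python dataclasses and serializes it to JSON. The class names, the nesting of confidence_obs, evidence_span, and requires_dynamic_resolution under each capability, and the `to_json` helper are assumptions; the framework itself only names the fields.

```python
import json
from dataclasses import asdict, dataclass

CONFIDENCE_LEVELS = {"HIGH", "MEDIUM", "LOW"}
ORIGINS = {"added", "removed", "preexisting"}


@dataclass
class EvidenceSpan:
    file: str
    hunk_id: int
    start_line: int
    end_line: int


@dataclass
class CapabilityObservation:
    capability: str
    origin: str
    confidence_obs: str
    requires_dynamic_resolution: bool
    evidence_span: EvidenceSpan

    def __post_init__(self) -> None:
        # Reject values outside the framework's fixed vocabularies.
        if self.origin not in ORIGINS:
            raise ValueError(f"unknown origin: {self.origin}")
        if self.confidence_obs not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence: {self.confidence_obs}")


@dataclass
class Annotation:
    capability_delta: list[CapabilityObservation]
    trigger_surface: list[str]
    phase_delta: list[str]
    alignment_gap: str


def to_json(annotation: Annotation) -> str:
    """Serialize an annotation, including nested observations, to JSON."""
    return json.dumps(asdict(annotation), indent=2)


example = Annotation(
    capability_delta=[
        CapabilityObservation(
            capability="network.http_client",
            origin="added",
            confidence_obs="HIGH",
            requires_dynamic_resolution=False,
            evidence_span=EvidenceSpan(file="main.py", hunk_id=2, start_line=15, end_line=15),
        )
    ],
    trigger_surface=["conditional.environment_gated"],
    phase_delta=["phase.import_time"],
    alignment_gap='Network capability not implied by "date formatting"',
)
print(to_json(example))
```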
This Framework Is
✅ A capability vocabulary
✅ A reasoning scaffold
✅ An annotation ontology
✅ A dataset labeling schema
✅ A reviewer-legible explanation layer
This Framework Is NOT
❌ A malware detector
❌ A rules engine
❌ A source of truth
❌ A substitute for reasoning
❌ A pattern matching system