AgentSkillsCN

binary-parsing-mixed-endian

在Python中解析具有混合字节序与可变长度记录的二进制文件。

SKILL.md
--- frontmatter
name: binary-parsing-mixed-endian
description: Parsing binary files with mixed endianness and variable-length records in Python.

Binary Parsing (Mixed Endian)

This skill summarizes the pattern for parsing binary files that utilize mixed byte orders (Big Endian and Little Endian) and contain variable-length records. This is common in legacy game saves or custom network protocols.

Trigger Criteria

Use this pattern when:

  • You need to read a binary file format.
  • The specification explicitly mixes Big Endian (Network Byte Order) and Little Endian.
  • Records have variable structures based on a type identifier.

Workflow

mermaid
graph TD
    A[Start] --> B[Open File 'rb']
    B --> C[Read Magic & Version]
    C --> D{EOF?}
    D -->|No| E[Read Record Type (1B)]
    E --> F[Read Common ID (4B, >I)]
    F --> G{Type == 0x01?}
    G -->|Yes| H[Read Terrain Data (10B)]
    H --> H1[Unpack >HHB (X, Y, Type)]
    H1 --> D
    G -->|No| I{Type == 0x02?}
    I -->|Yes| J[Read Length (2B, <H)]
    J --> K[Read String (Length bytes)]
    K --> D
    D -->|Yes| L[End]

Implementation Guide

1. struct Module format strings

CodeTypeSizeEndianness
>IUnsigned Int4 bytesBig Endian
<HUnsigned Short2 bytesLittle Endian
>HUnsigned Short2 bytesBig Endian
BUnsigned Char1 byteN/A

2. Reading Loop Pattern

python
with open(filepath, 'rb') as f:
    # Header
    ...
    while True:
        # Check EOF by trying to read 1 byte
        byte = f.read(1)
        if not byte: break
        
        # Dispatch based on type
        ...

Common Pitfalls

  1. Endianness Mismatch: Using > when < is required (or default @). This leads to massive integer values.
  2. Off-by-one Reading: Forgetting that read() advances the pointer.
  3. Padding: If the spec says "5 bytes padding", ensure you read(5) to skip them, or read a chunk and slice.
  4. String Decoding: Binary strings need .decode('utf-8') (or 'ascii', 'latin-1') to be usable.

Examples

See assets/examples/parse_save.py.example for a complete working script.