Name: token-alignment
Rating: 92
Author: brightlikethelight

When to Use This Skill

• Debugging why certain banned words aren't penalized
• Investigating subword tokenization issues
• Verifying token offset mapping

Debugging Steps

1. Print token-level breakdown

python

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")

text = "The coin shows Heads"
enc = tokenizer(text, return_offsets_mapping=True, add_special_tokens=False)
for i, (token_id, offset) in enumerate(zip(enc["input_ids"], enc["offset_mapping"])):
    print(f"{i}: {tokenizer.decode([token_id])!r} -> chars {offset}")

2. Check all token variants

Always check these variants for any banned word:

• "Heads" (no space)
• " Heads" (leading space - Qwen's byte-level BPE folds the space into the token)
• "HEADS", "heads", "Head", "head"

python

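banned_word = "Heads"
# Case variants, each with and without a leading space. dict.fromkeys drops
# duplicates (e.g. "Heads".capitalize() == "Heads") while keeping order.
cases = [banned_word, banned_word.upper(), banned_word.lower(), banned_word.capitalize()]
variants = list(dict.fromkeys(v for c in cases for v in (c, f" {c}")))
for v in variants:
    ids = tokenizer.encode(v, add_special_tokens=False)
    print(f"{v!r} -> {ids} -> {[tokenizer.decode([tid]) for tid in ids]}")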

3. Verify overlap detection

python

def check_overlap(char_start, char_end, token_start, token_end):
    """Returns True if character span overlaps with token span."""
    return max(char_start, token_start) < min(char_end, token_end)

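# Test: "Heads" occupies the half-open character span [15, 20) in `text` from step 1;
# token spans come from offset_mapping.
word_start = text.index("Heads")
word_end = word_start + len("Heads")
for i, (tok_start, tok_end) in enumerate(enc["offset_mapping"]):
    if check_overlap(word_start, word_end, tok_start, tok_end):
        print(f"Token {i} overlaps banned region: {tokenizer.decode([enc['input_ids'][i]])!r}")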
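The overlap test treats both spans as half-open intervals `[start, end)`, which is how `offset_mapping` reports them. The sketch below exercises the edge cases without loading a model. The offsets are hand-written for illustration (a word split into two subword tokens), not real tokenizer output.

```python
def check_overlap(char_start, char_end, token_start, token_end):
    """True if the half-open spans [char_start, char_end) and [token_start, token_end) intersect."""
    return max(char_start, token_start) < min(char_end, token_end)

# Illustrative offsets for "It is Tails" split as ["It", " is", " T", "ails"].
# Hand-written for this example; a real tokenizer may split differently.
text = "It is Tails"
offsets = [(0, 2), (2, 5), (5, 7), (7, 11)]

word_start = text.index("Tails")
word_end = word_start + len("Tails")

# Every subword token that touches the banned word should be flagged.
hits = [i for i, (s, e) in enumerate(offsets)
        if check_overlap(word_start, word_end, s, e)]
print(hits)
```

Spans that only touch at a boundary, such as `(0, 2)` and `(2, 5)`, do not count as overlapping, and a zero-length span such as `(0, 0)` never overlaps anything.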

4. Debug boundary detection

python

import re

def find_cot_boundaries(text):
    """Find character boundaries of <think>...</think> regions."""
    pattern = r"<think>(.*?)</think>"
    boundaries = []
    for match in re.finditer(pattern, text, re.DOTALL):
        boundaries.append((match.start(1), match.end(1)))
    return boundaries

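# Example inputs, for illustration only; in training these come from the rollout.
# Reuses `tokenizer` (step 1) and `check_overlap` (step 3).
completion = "<think>It landed on Heads.</think> Final answer: Heads"
enc = tokenizer(completion, return_offsets_mapping=True, add_special_tokens=False)
offset_mapping = enc["offset_mapping"]
banned_spans = [m.span() for m in re.finditer("heads", completion, re.IGNORECASE)]
penalty_weight = 1.0  # placeholder value
penalties = [0.0] * len(offset_mapping)

def is_banned_token(tok_idx):
    """True if the token overlaps any banned-word character span."""
    tok_start, tok_end = offset_mapping[tok_idx]
    return any(check_overlap(s, e, tok_start, tok_end) for s, e in banned_spans)

# Only penalize tokens WITHIN these boundaries
cot_bounds = find_cot_boundaries(completion)
for tok_idx, (tok_start, tok_end) in enumerate(offset_mapping):
    in_cot = any(check_overlap(cot_s, cot_e, tok_start, tok_end)
                 for cot_s, cot_e in cot_bounds)
    if in_cot and is_banned_token(tok_idx):
        penalties[tok_idx] = -penalty_weight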
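One boundary case worth checking is a completion that opens `<think>` but is cut off before `</think>`. A pattern that requires the closing tag finds no region there, so nothing inside the reasoning would be penalized. The variant below is a sketch under the assumption that an unclosed block should run to the end of the text. `find_cot_spans` is a hypothetical name, not part of the original script.

```python
import re

def find_cot_spans(text):
    """Return (start, end) character spans of reasoning text.

    A <think> block that is never closed (e.g. a truncated completion)
    is treated as running to the end of the text.
    """
    pattern = re.compile(r"<think>(.*?)(?:</think>|\Z)", re.DOTALL)
    return [(m.start(1), m.end(1)) for m in pattern.finditer(text)]

closed = "<think>Heads</think> done"
truncated = "<think>It was Heads and then"

print(find_cot_spans(closed))
print(find_cot_spans(truncated))
```

Whether a truncated block should be penalized at all is a policy decision. If it should not, keep the stricter pattern and log such completions instead.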

Common Gotchas

Issue	Solution
Leading spaces	Qwen uses byte-level BPE (not SentencePiece), and a leading space becomes part of the token - always check `" Heads"`
Unicode lookalikes	`U+2217` (∗) vs `*` can bypass bans - normalize first
Case sensitivity	Handle explicitly with `.lower()` or check all variants
Multi-token words	`"Tails"` might be `["T", "ails"]` - check all subword tokens
Offset mapping gaps	Offsets can be imprecise, and slow (pure-Python) tokenizers don't support `return_offsets_mapping` at all - validate before trusting them
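The "normalize first" advice for Unicode lookalikes can be sketched as follows. NFKC folds compatibility forms such as fullwidth letters, but it leaves `U+2217` and cross-script lookalikes unchanged, so those need an explicit mapping. The `LOOKALIKES` table and the function name are illustrative assumptions and far from an exhaustive confusables list.

```python
import unicodedata

# Hand-picked lookalikes that NFKC leaves unchanged; extend for your own data.
# Illustrative only, not an exhaustive confusables list.
LOOKALIKES = str.maketrans({
    "\u2217": "*",   # ASTERISK OPERATOR
    "\u0397": "H",   # GREEK CAPITAL LETTER ETA
    "\u041d": "H",   # CYRILLIC CAPITAL LETTER EN
})

def normalize_for_ban_check(text):
    """Fold compatibility forms (NFKC), map known lookalikes, then casefold."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.translate(LOOKALIKES).casefold()

fullwidth = "\uff28\uff45\uff41\uff44\uff53"   # fullwidth "Heads"
mixed = "\u041deads \u2217"                    # Cyrillic En + "eads", asterisk operator

print(normalize_for_ban_check(fullwidth))
print(normalize_for_ban_check(mixed))
```

Normalization can change string length (NFKC turns the single-character ligature `ﬁ` into `fi`), so compute character offsets on the normalized text rather than mapping them back to the original.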

Quick Validation Script

python

# scripts/debug_tokenization.py
import re

# Requires check_overlap() from step 3 to be defined or imported here.
def validate_ban_coverage(text, banned_words, tokenizer):
    """Warn about banned-word occurrences that no token offset span overlaps."""
    enc = tokenizer(text, return_offsets_mapping=True, add_special_tokens=False)

    for word in banned_words:
        for match in re.finditer(re.escape(word), text, re.IGNORECASE):
            word_start, word_end = match.span()
            covered = any(
                check_overlap(word_start, word_end, tok_start, tok_end)
                for tok_start, tok_end in enc["offset_mapping"]
            )
            if not covered:
                print(f"WARNING: {word!r} at [{word_start}:{word_end}] not covered!")
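The coverage check above trusts the offsets it is given. As a complement, the sketch below validates the offsets themselves: it flags spans that fall outside the text, spans that go backwards, and non-whitespace characters that no token covers. It works on plain lists so it can be tested without a model. `find_offset_problems` is a hypothetical helper name.

```python
def find_offset_problems(text, offsets):
    """Return human-readable problems found in a list of (start, end) offsets."""
    problems = []
    covered = [False] * len(text)
    prev_start = 0
    for i, (start, end) in enumerate(offsets):
        # Spans must be ordered pairs that stay inside the text.
        if not (0 <= start <= end <= len(text)):
            problems.append(f"token {i}: span ({start}, {end}) is invalid")
            continue
        # Token spans are expected to advance left to right.
        if start < prev_start:
            problems.append(f"token {i}: span ({start}, {end}) goes backwards")
        prev_start = start
        for pos in range(start, end):
            covered[pos] = True
    # Whitespace may legitimately be uncovered; anything else is a gap.
    for pos, ch in enumerate(text):
        if not covered[pos] and not ch.isspace():
            problems.append(f"char {pos} ({ch!r}) is not covered by any token")
    return problems

text = "The coin shows Heads"
good = [(0, 3), (4, 8), (9, 14), (15, 20)]
missing_last = [(0, 3), (4, 8), (9, 14)]

print(find_offset_problems(text, good))
print(find_offset_problems(text, missing_last))
```

With a real tokenizer, pass `enc["offset_mapping"]` as `offsets`. Equal consecutive start positions are allowed, since byte-level tokenizers can split one character across several tokens.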