Repository Tools Overview
Use this skill when working with repository-wide tools, understanding tool architecture, or coordinating workflows across multiple tool categories.
Tool categories
The tools/ directory contains all helper scripts and utilities:
Core scripts (tools/)
- init.py: Async wrapper for pytextgen with mtime/inode caching
  - Discovers changed .md files (excludes .git, .obsidian, tools)
  - Caches (mtime, inode, text) to skip unchanged files
  - Normalizes newlines to \n before pytextgen processing
  - Passes through pytextgen flags (-C, --no-code-cache, --init-flashcards)
  - Commands: python -m init generate, python -m init clear
- convert wiki.py: Wikipedia HTML → Markdown converter
  - Reads HTML from clipboard
  - Normalizes links (relative paths with %20 encoding)
  - Downloads media to archives/Wikimedia Commons/
  - Uses convert wiki.py.names map.json for filename renames
  - Preserves Wikipedia attribution
  - Command: python -m "convert wiki"
- pack.py: PageRank-ordered zip bundling
  - Walks Markdown links to build dependency graph
  - Computes PageRank to prioritize important files
  - Creates zip bundle with metadata (ranks, omissions, link closure)
  - Configurable: damping factor, iterations, max files
  - Command: python -m pack -o pack.zip -n 25 --damping-factor 0.5 --page-rank-iterations 100 <paths>
- publish.py: Private → public history mirroring via git filter-repo
  - Clones private/.git temporarily
  - Runs git filter-repo with property Private-commit filtering
  - Rewrites commit history to remove sensitive paths
  - Rebases with signing, adds remote to public .git
  - Command: python -m publish --paths-file <file> (with literal:<path> lines)
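The ranking step that pack.py performs can be sketched as a plain power-iteration PageRank over a link graph. This is a minimal illustration, not pack.py's actual code: the function names page_rank and top_files, the graph representation, and the handling of files with no outgoing links are all assumptions. The default values simply mirror the example command above and are not necessarily pack.py's own defaults.

```python
from typing import Dict, List, Set


def page_rank(
    links: Dict[str, Set[str]],
    damping: float = 0.5,
    iterations: int = 100,
) -> Dict[str, float]:
    """Power-iteration PageRank over a {file: {files it links to}} graph."""
    nodes = set(links)
    for targets in links.values():
        nodes |= targets
    ordered = sorted(nodes)
    n = len(ordered)
    if n == 0:
        return {}
    rank = {node: 1.0 / n for node in ordered}
    for _ in range(iterations):
        # Every file starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in ordered}
        for node in ordered:
            targets = links.get(node) or set()
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a file without outgoing links spreads its rank evenly.
                share = damping * rank[node] / n
                for target in ordered:
                    new_rank[target] += share
        rank = new_rank
    return rank


def top_files(links: Dict[str, Set[str]], max_files: int, **kwargs: float) -> List[str]:
    """Return up to max_files names, highest rank first (ties broken by name)."""
    ranks = page_rank(links, **kwargs)
    return sorted(ranks, key=lambda name: (-ranks[name], name))[:max_files]
```

A file that many notes link to ends up at the front of the list, which is the property a size-limited bundle would rely on when deciding what to keep.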
Subfolder tools
- tools/special/: Academic LMS converters and course management (see tools-special skill)
  - Canvas/HKUST Zinc converters
  - Course catalog fetchers
- tools/templates/: Note scaffolding and pytextgen templates (see tools-templates skill)
  - new wiki page.py
  - pytextgen generate *.md templates
Submodule tools
- tools/pytextgen/: Git submodule for content generation library (see pytextgen skill)
- tools/pyarchivist/: Git submodule for archiving tool (see pyarchivist skill)
When to use this skill
- Understanding tool relationships and dependencies
- Coordinating multi-tool workflows (e.g., wiki ingestion → generation → packaging)
- Troubleshooting tool interactions
- Planning new tool development or refactoring
Common workflows
End-to-end wiki ingestion
- Scaffold note: python -m "templates.new wiki page" (tools-templates)
- Ingest HTML: python -m "convert wiki" (convert wiki.py)
- Generate flashcards: python -m init generate <file> (init.py + pytextgen)
Academic course organization
- Convert LMS export: python -m tools.special."convert Canvas submission" (tools-special)
- Update index: Edit special/academia/<Institution>/index.md
- Generate course tables: Add pytextgen fences, run python -m init generate (pytextgen)
Packaging and publishing
- Regenerate all: python -m init generate -C (init.py)
- Package bundle: python -m pack -o bundle.zip -n 50 <paths> (pack.py)
- Publish filtered history: python -m publish --paths-file paths.txt (publish.py)
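The three packaging and publishing steps above could be chained from a small driver that stops at the first failing command. This is only a sketch: run_steps and packaging_steps are hypothetical helpers, the command lines are copied from the list above, and it assumes the driver is started from the directory where python -m init, python -m pack, and python -m publish resolve.

```python
import subprocess
import sys
from typing import List, Sequence


def run_steps(steps: Sequence[Sequence[str]]) -> int:
    """Run commands in order; return the first non-zero exit code, or 0."""
    for argv in steps:
        print("running:", " ".join(argv))
        result = subprocess.run(list(argv))
        if result.returncode != 0:
            # Stop here so later steps never see half-finished output.
            return result.returncode
    return 0


def packaging_steps(
    paths: Sequence[str],
    bundle: str = "bundle.zip",
    max_files: int = 50,
    paths_file: str = "paths.txt",
) -> List[List[str]]:
    """Build the regenerate, package, publish command lines listed above."""
    python = sys.executable
    return [
        [python, "-m", "init", "generate", "-C"],
        [python, "-m", "pack", "-o", bundle, "-n", str(max_files), *paths],
        [python, "-m", "publish", "--paths-file", paths_file],
    ]
```

Calling run_steps(packaging_steps(["<paths>"])) with real paths substituted would run the whole sequence, and the returned exit code can be passed to sys.exit.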
Archive management
- Archive content: Use pyarchivist tool (pyarchivist skill)
- Verify index: Check archives/*/index.md updated
- Reference in notes: Add relative links with %20 encoding
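As an illustration of the relative, %20-encoded links mentioned above, a link target can be derived from two repository-relative paths with the standard library. The helper name relative_link and the example paths are hypothetical; the repository tools may build links differently.

```python
import posixpath
from urllib.parse import quote


def relative_link(note_path: str, target_path: str) -> str:
    """Return a percent-encoded link from note_path to target_path.

    Both arguments are POSIX-style paths relative to the repository root.
    """
    start = posixpath.dirname(note_path) or "."
    relative = posixpath.relpath(target_path, start=start)
    # quote() keeps "/" by default and turns each space into %20.
    return quote(relative)
```

For a note at notes/topic/page.md, the archived file archives/Wikimedia Commons/Some File.png would be referenced as ../../archives/Wikimedia%20Commons/Some%20File.png.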
Tool dependencies
Python requirements
See requirements.txt for package dependencies:
- anyio: Async I/O for init.py
- aiohttp: HTTP client for downloads
- bs4 (BeautifulSoup): HTML parsing for convert wiki.py
- PyYAML: YAML frontmatter parsing
- pytextgen: Content generation library
- Others as needed
Install: pip install -r requirements.txt
External tools
- git: Required for all workflows
- git-filter-repo: Required for publish.py
- Python 3.8+: Minimum version for async features
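A preflight check for the external tools listed above might look like the following sketch. The function name missing_prerequisites is hypothetical, and it assumes that git-filter-repo is installed as an executable of that name on PATH, which is common but depends on how it was installed.

```python
import shutil
import sys
from typing import Callable, List, Optional, Tuple

REQUIRED_EXECUTABLES = ("git", "git-filter-repo")  # assumed executable names
MIN_PYTHON = (3, 8)


def missing_prerequisites(
    which: Callable[[str], Optional[str]] = shutil.which,
    version: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    current = tuple(version) if version is not None else tuple(sys.version_info[:2])
    if current < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {current[0]}.{current[1]}"
        )
    for executable in REQUIRED_EXECUTABLES:
        if which(executable) is None:
            problems.append(f"{executable} not found on PATH")
    return problems
```

The which and version parameters exist only so the check can be exercised without touching the real environment; calling missing_prerequisites() with no arguments inspects the current interpreter and PATH.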
Git submodules
- tools/pytextgen/: Content generation engine
- tools/pyarchivist/: Archiving tool
- self/**: Personal metadata submodules
- private/**: Private content submodule
Tool architecture
init.py caching strategy
- On first run: Walk workspace, compute (mtime, inode) for all .md files
- Cache text and metadata in memory
- On subsequent runs: Skip files with unchanged (mtime, inode)
- Normalize \n before passing to pytextgen
- Use -C/--no-cached to rebuild cache
Cache location: In-memory (not persisted to disk)
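The strategy above can be sketched synchronously with the standard library. This is an illustration under assumptions, not init.py's implementation (which is async and anyio-based): the names iter_markdown, read_if_changed, and _cache are hypothetical, and UTF-8 is assumed as the file encoding.

```python
import os
from pathlib import Path
from typing import Dict, Iterator, Tuple

EXCLUDED_DIRS = {".git", ".obsidian", "tools"}

# path -> (mtime in ns, inode, normalized text); lives only as long as the process.
_cache: Dict[Path, Tuple[int, int, str]] = {}


def iter_markdown(root: Path) -> Iterator[Path]:
    """Yield .md files under root, pruning the excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [name for name in dirnames if name not in EXCLUDED_DIRS]
        for name in filenames:
            if name.endswith(".md"):
                yield Path(dirpath) / name


def read_if_changed(path: Path) -> Tuple[str, bool]:
    """Return (text, changed); text comes from the cache when (mtime, inode) match."""
    stat = path.stat()
    key = (stat.st_mtime_ns, stat.st_ino)
    cached = _cache.get(path)
    if cached is not None and cached[:2] == key:
        return cached[2], False
    # Read bytes so the newline normalization is explicit rather than implicit.
    text = path.read_bytes().decode("utf-8")
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    _cache[path] = (key[0], key[1], text)
    return text, True
```

Only files for which read_if_changed reports changed=True would need to be handed on for regeneration; clearing _cache forces every file to be re-read.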
pytextgen compile cache
- Compiled Python modules cached in __pycache__/
- Use --no-code-cache to bypass
- Clear with rm -rf __pycache__/ if stale
Git filter-repo workflow
- Clone private/.git to temporary directory
- Run git filter-repo --path-rename based on --paths-file
- Filter commits by Private-commit property
- Rebase and sign commits
- Add temporary remote to public .git
- User must manually push
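The file passed via --paths-file is described above as containing literal:<path> lines. A sketch of producing such a file follows; write_paths_file and the example paths are hypothetical, and whether publish.py accepts any other line prefixes is not stated here.

```python
from pathlib import Path
from typing import Iterable


def write_paths_file(paths: Iterable[str], destination: Path) -> str:
    """Write one 'literal:<path>' line per entry and return the file content."""
    content = "".join(f"literal:{path}\n" for path in paths)
    # Write bytes so line endings stay "\n" on every platform.
    destination.write_bytes(content.encode("utf-8"))
    return content
```

For example, write_paths_file(["general/example note.md"], Path("paths.txt")) would produce a one-line file that could then be passed as python -m publish --paths-file paths.txt.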
CLI stability
Critical: Core tools have established interfaces; preserve:
- Argument names and order
- Expected input formats (clipboard, files, stdin)
- Output formats (Markdown, YAML, CSV, ZIP)
- Error codes and messages
If changes are needed, ask user for permission first.
Best practices
Tool coordination
- Regenerate before packaging: Always run python -m init generate -C before pack.py
- Clean before publishing: Verify private/ content is properly filtered before publish.py
- Archive before ingestion: Use pyarchivist for media before manual note creation
- Template before conversion: Scaffold frontmatter before ingesting content
Error handling
- Check tool exit codes before proceeding
- Verify file existence before processing
- Validate YAML/HTML/CSV formats before parsing
- Use --dry-run or preview modes when available
Performance
- Use init.py caching to skip unchanged files
- Parallelize independent operations (e.g., multiple convert wiki runs)
- Limit PageRank iterations in pack.py for large graphs
- Use --exclude-extension in pack.py to skip large assets
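Running independent operations in parallel while capping how many run at once could be sketched with asyncio's BoundedSemaphore. This uses only the standard library for illustration; run_bounded is a hypothetical helper, and the repository tools themselves may use anyio equivalents instead.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")


async def run_bounded(
    jobs: Sequence[Callable[[], Awaitable[T]]],
    limit: int,
) -> List[T]:
    """Run zero-argument async jobs with at most `limit` active at once.

    Results are returned in the same order as `jobs`.
    """
    semaphore = asyncio.BoundedSemaphore(limit)

    async def guarded(job: Callable[[], Awaitable[T]]) -> T:
        async with semaphore:
            return await job()

    return list(await asyncio.gather(*(guarded(job) for job in jobs)))
```

Each job is passed as a callable rather than an already-created coroutine, so no work starts until a semaphore slot is free.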
When to ask for help
- If tool behavior is unexpected, consult user or check tool documentation
- If editing submodules (pytextgen, pyarchivist), ask user for permission
- If new tool is needed, discuss requirements and architecture with user
- If tool fails mysteriously, check Python version and dependencies
Common issues
- Cache staleness: Use -C/--no-cached if init.py skips changed files
- Module import errors: Ensure requirements.txt packages installed
- Git submodule out of date: Run git submodule update --remote
- Path encoding issues: Ensure %20 encoding for spaces in links
- Clipboard access: Some tools require clipboard support (may fail in headless environments)
Editing guidelines for tools/**/*.py
When editing Python helper scripts in tools/:
- Preserve CLI surfaces: Keep argument names, defaults, and help text stable; avoid breaking python -m init generate/clear, python -m pack, python -m publish, and other tool entrypoints
- Maintain async/anyio patterns: Preserve async patterns like anyio.Path, asyncio.create_task, TaskGroup, BoundedSemaphore and caching behaviors (init cache, pytextgen compile cache in __pycache__/)
- Keep submodules read-only unless requested: tools/pytextgen, tools/pyarchivist, self/**, private/** are git submodules; ask user before editing
- Normalize newlines: When touching the init wrapper, normalize to \n; do not bypass its exclusion list (.git, .obsidian, tools)
- Favor relative imports: Use relative imports within the tools package; do not hardcode absolute host paths
- Update requirements.txt: When adding dependencies, update requirements.txt and note any non-pip prerequisites; prefer lightweight stdlib/typed solutions
- Test thoroughly: Python tools are critical infrastructure; test changes carefully before committing