ALMA Intelligent Scraper
When to Use
- Finding new youth justice information
- Updating ALMA intelligence
- Discovering new sources
- Analyzing coverage gaps
- Checking what's new in youth justice
Commands
| Command | Purpose | Duration |
|---|---|---|
| quick | Top 10 high-value sources | 5 min |
| deep | All 50+ sources with discovery | 30-60 min |
| discover | Follow discovered links | Variable |
| source "QLD" | Deep dive into a specific jurisdiction | 15 min |
| gaps | Show coverage gaps | 2 min |
| status | Current knowledge state | Instant |
Learning Cycle
SCRAPE → EXTRACT (Claude) → EVALUATE (Quality) → LEARN (Patterns) → STORE
Quality Signals
| Signal | Weight |
|---|---|
| Relevance (AU youth justice?) | 30% |
| Novelty (new info?) | 25% |
| Specificity (concrete details?) | 20% |
| Evidence (research backed?) | 15% |
| Actionability (useful?) | 10% |
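The weights above sum to 100%, so they can be combined into a single quality score as a weighted sum. The sketch below is a minimal illustration, assuming each signal has already been scored on a 0 to 1 scale; the names `QUALITY_WEIGHTS` and `quality_score` are hypothetical and are not taken from the ALMA implementation.

```python
# Weights copied from the Quality Signals table.
# The 0-1 scoring scale and all names here are assumptions for illustration.
QUALITY_WEIGHTS = {
    "relevance": 0.30,      # AU youth justice?
    "novelty": 0.25,        # new info?
    "specificity": 0.20,    # concrete details?
    "evidence": 0.15,       # research backed?
    "actionability": 0.10,  # useful?
}


def quality_score(signals: dict) -> float:
    """Return the weighted sum of per-signal scores, each assumed to be in [0, 1]."""
    missing = QUALITY_WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    for name in QUALITY_WEIGHTS:
        if not 0.0 <= signals[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return sum(QUALITY_WEIGHTS[name] * signals[name] for name in QUALITY_WEIGHTS)


# A relevant, fairly novel item with little supporting evidence.
example = {
    "relevance": 1.0,
    "novelty": 0.8,
    "specificity": 0.5,
    "evidence": 0.0,
    "actionability": 0.5,
}
score = quality_score(example)
```

Under these assumptions a document that scores 1.0 on every signal gets a quality score of 1.0, which keeps the result on the same scale as the inputs.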
Priority Formula
priority = (quality × 0.4) + (freshness_need × 0.3) + (coverage_gap × 0.3)
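As a rough sketch, the formula can be written as a small function and used to rank candidate sources. This assumes all three inputs are normalised to the 0 to 1 range, which the formula itself does not state; the function name and the sample sources are hypothetical.

```python
def priority(quality: float, freshness_need: float, coverage_gap: float) -> float:
    """Apply the priority formula; inputs are assumed to be normalised to [0, 1]."""
    inputs = {
        "quality": quality,
        "freshness_need": freshness_need,
        "coverage_gap": coverage_gap,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return (quality * 0.4) + (freshness_need * 0.3) + (coverage_gap * 0.3)


# Hypothetical candidates: (quality, freshness_need, coverage_gap)
candidates = {
    "source_a": (0.9, 0.2, 0.1),  # high quality, but fresh and well covered
    "source_b": (0.6, 0.8, 0.9),  # moderate quality, stale, large coverage gap
}

# Highest priority first.
ranked = sorted(candidates, key=lambda name: priority(*candidates[name]), reverse=True)
```

With these sample values `source_b` ranks ahead of `source_a`: freshness need and coverage gap together carry 60% of the weight, so a stale, poorly covered source can outrank a higher-quality one.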
Sacred Boundaries
- Never scrape: private info, court records, social media, paywalled content
- Always mark: Community Controlled, Indigenous orgs, cultural knowledge
- Always check: consent level, cultural authority, data sovereignty
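One way to enforce these boundaries is a gate that runs before any source is scraped. The sketch below is illustrative only: the tag names, the `Source` fields, and the reading of "always check" as three boolean sign-offs are all assumptions, not the schema in references/database-schema.md.

```python
from dataclasses import dataclass, field

# Category names are hypothetical labels for the boundaries listed above.
NEVER_SCRAPE = {"private_info", "court_records", "social_media", "paywalled"}
ALWAYS_MARK = {"community_controlled", "indigenous_org", "cultural_knowledge"}
REQUIRED_CHECKS = ("consent_level", "cultural_authority", "data_sovereignty")


@dataclass
class Source:
    """Hypothetical source record; real field names may differ."""
    url: str
    tags: set = field(default_factory=set)
    checks: dict = field(default_factory=dict)  # check name -> completed?


def review_source(source: Source) -> dict:
    """Decide whether a source may be scraped and which marks it must carry."""
    blocked_by = sorted(source.tags & NEVER_SCRAPE)
    marks = sorted(source.tags & ALWAYS_MARK)
    unchecked = [c for c in REQUIRED_CHECKS if not source.checks.get(c, False)]
    return {
        "allowed": not blocked_by and not unchecked,
        "blocked_by": blocked_by,
        "marks": marks,
        "unchecked": unchecked,
    }


result = review_source(
    Source(
        url="https://example.org/report",
        tags={"indigenous_org"},
        checks={"consent_level": True, "cultural_authority": True},
    )
)
```

In this example the source is not yet allowed because the `data_sovereignty` check is missing, and it carries the `indigenous_org` mark either way. Failing closed on a missing check is a deliberate choice in this sketch.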
File References
| Need | Reference |
|---|---|
| Database schema | references/database-schema.md |
| Extraction patterns | references/extraction-patterns.md |
| Coverage tracking | references/coverage-tracking.md |
| Implementation code | references/implementation.md |