Anti-Detection Skill

Description

Implements Cloudflare Turnstile avoidance strategies using human-like browser automation patterns.

Triggers

  • "turnstile triggered"
  • "bot detected"
  • "add delays"
  • "avoid detection"
  • "human-like behavior"

Cloudflare Turnstile Context

⚠️ CRITICAL: OpenGov presents NO login CAPTCHA during normal human browsing. Turnstile is a bot-detection system that triggers when automation patterns look non-human.

High Risk Areas

  1. Manual followers extraction flow (most sensitive)
  2. Rapid page navigation
  3. Direct URL access without referrer
  4. Lack of mouse movement/jitter

Detection Response Protocol

If Turnstile triggers:

  1. IMMEDIATELY CANCEL THE RUN. Do not attempt to solve the challenge.
  2. 📝 Log anti_bot_turnstile_triggered event with full context
  3. 🔍 Analyze logs to identify the triggering command
  4. ✅ Fix and retry with enhanced delays
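
The steps above could be wired together roughly as follows. This is a minimal sketch, not the project's actual handler: `RunCancelled`, `handle_turnstile`, the logger name, and the shape of the `context` dict are hypothetical names introduced for illustration. Only the event name `anti_bot_turnstile_triggered` comes from this document.

```python
import logging
from datetime import datetime, timezone

logger = logging.getLogger("anti_detection")  # hypothetical logger name


class RunCancelled(Exception):
    """Hypothetical exception used to abort the current run."""


def handle_turnstile(context: dict) -> None:
    """Log the detection event with full context, then cancel the run.

    No attempt is made to solve the challenge (step 1 of the protocol).
    """
    payload = dict(context)
    payload.setdefault("timestamp", datetime.now(timezone.utc).isoformat())
    # Step 2: log the event; the last recorded command supports step 3.
    logger.error("anti_bot_turnstile_triggered", extra={"event": payload})
    # Step 1: abort immediately so the caller stops issuing commands.
    raise RunCancelled(payload.get("last_command", "unknown command"))
```

A caller would typically catch `RunCancelled` at the top of the run loop, stop all browser activity, and schedule the retry (step 4) only after the delays have been adjusted.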

Why Human Intervention Doesn't Work

  • ✗ Human solving the challenge does NOT authenticate the automation script
  • ✗ The challenge is tied to browser fingerprint, not user session
  • ✓ Only prevention through behavioral patterns works

Avoidance Strategies

1. Paste + Enter Navigation Pattern

python
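import asyncio
import random

# GOOD: Paste the URL into the address bar, pause briefly, then press Enter.
# Note: Playwright locators query the page DOM, not the browser chrome, so this
# locator resolves only if the environment exposes an element with this label.
await page.get_by_label("Address and search bar").fill(url)
await asyncio.sleep(random.uniform(0.5, 1.5))
await page.keyboard.press("Enter")

# BAD: Direct navigation
await page.goto(url)  # ❌ Triggers detection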

2. Randomized Timing

python
import asyncio
import random

# Base delay plus jitter, in seconds
base_delay = 0.5
jitter = random.uniform(0.2, 0.8)
total_delay = base_delay + jitter

await asyncio.sleep(total_delay)
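
The same calculation can be wrapped in a reusable helper so every step draws its own delay. The sketch below assumes the values shown above (0.5 s base, 0.2 to 0.8 s jitter); `jittered_delay` and `pause` are hypothetical names, not part of the original skill.

```python
import asyncio
import random


def jittered_delay(base: float = 0.5, jitter_min: float = 0.2, jitter_max: float = 0.8) -> float:
    """Return base plus a uniform random jitter, in seconds."""
    return base + random.uniform(jitter_min, jitter_max)


async def pause(base: float = 0.5) -> float:
    """Sleep for a jittered delay and return the value used, for audit logging."""
    delay = jittered_delay(base)
    await asyncio.sleep(delay)
    return delay
```

Returning the delay from `pause` makes it easy to record each value in the audit trail and to analyze the timing distribution later.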

3. Natural Mouse Movements

python
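# Move the mouse toward the target in two steps before clicking.
# x and y are the target's viewport coordinates, defined by the caller.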
await page.mouse.move(x - 10, y - 10)
await asyncio.sleep(0.1)
await page.mouse.move(x, y)
await asyncio.sleep(0.05)
await page.mouse.click(x, y)

4. User-Agent Rotation

python
user_agents = [
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36...",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)...",
    # ... more diverse user agents
]

context = await browser.new_context(
    user_agent=random.choice(user_agents)
)

5. Viewport Randomization

python
viewports = [
    {"width": 1920, "height": 1080},
    {"width": 1366, "height": 768},
    {"width": 1440, "height": 900},
]

context = await browser.new_context(
    viewport=random.choice(viewports)
)

Configuration

Playwright Settings

json
{
  "playwright": {
    "headless": false,
    "slowMo": 100,
    "timeout": 30000,
    "antiDetection": {
      "randomizeTimings": true,
      "minDelay": 200,
      "maxDelay": 800,
      "jitterRange": 100,
      "useNaturalMouseMovements": true,
      "rotateUserAgent": true
    }
  }
}
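
The `antiDetection` keys are not Playwright launch options, so the automation code has to interpret them itself. The sketch below shows one possible reading. It assumes `minDelay`, `maxDelay`, and `jitterRange` are in milliseconds (matching `slowMo` and `timeout`) and that the jitter is added symmetrically to a delay drawn between the minimum and maximum. The function `delay_seconds` is hypothetical, and the original does not state either assumption.

```python
import json
import random

# Subset of the configuration above, inlined so the example is self-contained.
CONFIG_TEXT = """
{
  "playwright": {
    "antiDetection": {
      "randomizeTimings": true,
      "minDelay": 200,
      "maxDelay": 800,
      "jitterRange": 100
    }
  }
}
"""


def delay_seconds(anti: dict) -> float:
    """Pick one delay in seconds from the antiDetection settings (assumed ms)."""
    low, high = anti["minDelay"], anti["maxDelay"]
    if not anti.get("randomizeTimings", False):
        return low / 1000.0
    jitter = anti.get("jitterRange", 0)
    # Assumption: jitter is applied symmetrically around the drawn delay.
    millis = random.uniform(low, high) + random.uniform(-jitter, jitter)
    return max(millis, 0.0) / 1000.0


anti_detection = json.loads(CONFIG_TEXT)["playwright"]["antiDetection"]
```

Under these assumptions, each call to `delay_seconds(anti_detection)` yields a value between 0.1 and 0.9 seconds.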

Related Rules

  • playwright-navigation.md: Safe navigation patterns
  • playwright-core-automation.md: Core automation do's and don'ts
  • playwright-coordinate-control.md: Natural mouse movements
  • airflow-anti-bot-contract.md: Airflow-level anti-bot enforcement
  • airflow-turnstile-response.md: Response to detection events

Monitoring & Logging

Log Turnstile Events

python
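import logging
from datetime import datetime

logger = logging.getLogger(__name__)

# Record the event with enough context to trace the triggering command
logger.error(
    "anti_bot_turnstile_triggered",
    extra={
        "url": current_url,
        "action": "navigation_attempt",
        "timestamp": datetime.now().isoformat(),
        "user_agent": user_agent,
        "session_id": session_id
    }
)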

Check for Turnstile Elements

python
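class TurnstileTriggeredException(Exception):
    """Raised when a Turnstile challenge iframe is found on the page."""

# The challenge renders in an iframe served from challenges.cloudflare.com
turnstile_iframe = await page.query_selector('iframe[src*="challenges.cloudflare.com"]')
if turnstile_iframe:
    logger.error("Turnstile challenge detected")
    raise TurnstileTriggeredException()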
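
The `src*=` selector matches the host name anywhere in the iframe URL, including in a query string. A stricter variant compares the parsed host name instead. The sketch below is a hypothetical helper (`has_turnstile_frame` is not part of the original skill) that works on plain URL strings, which could be collected with `[frame.url for frame in page.frames]` in Playwright.

```python
from urllib.parse import urlparse

TURNSTILE_HOST = "challenges.cloudflare.com"


def has_turnstile_frame(frame_urls: list[str]) -> bool:
    """Return True if any frame URL is served from the Turnstile host."""
    return any(urlparse(url).hostname == TURNSTILE_HOST for url in frame_urls)
```

Keeping the check free of browser objects also lets it run in unit tests without launching a browser.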

Testing Anti-Detection

Manual Test

  1. Run extraction with normal delays
  2. Verify no Turnstile triggers
  3. Check logs for timing patterns
  4. Adjust delays if needed

Stress Test

  1. Run extraction on 10 projects
  2. Monitor for any Turnstile triggers
  3. Analyze timing distribution
  4. Ensure adequate randomization

Success Criteria

  • ✅ No Turnstile triggers in 100+ page loads
  • ✅ Timing variance of 50%+ between actions
  • ✅ Natural mouse movement paths
  • ✅ Diverse user-agent distribution
  • ✅ Complete audit trail of all actions
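
The timing criterion can be checked against delays recorded in the audit trail. The document does not define "timing variance", so the sketch below assumes it means the relative spread `(max - min) / mean` of the recorded delays. The function names are hypothetical, and other readings, such as the coefficient of variation, are equally plausible.

```python
from statistics import mean


def relative_spread(delays: list[float]) -> float:
    """Return (max - min) / mean for a list of recorded delays in seconds."""
    if len(delays) < 2:
        raise ValueError("need at least two recorded delays")
    return (max(delays) - min(delays)) / mean(delays)


def meets_timing_criterion(delays: list[float], threshold: float = 0.5) -> bool:
    """Check the assumed reading of 'timing variance of 50%+'."""
    return relative_spread(delays) >= threshold
```

Under this reading, delays drawn from a 0.5 s base plus 0.2 to 0.8 s of jitter span roughly 0.7 to 1.3 s, a spread of about 60% around a 1.0 s mean.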

Troubleshooting

If Turnstile Triggers

  1. Identify trigger point: Check logs for last action before trigger
  2. Increase delays: Add 50-100% more delay at trigger point
  3. Add jitter: Ensure random variance in timing
  4. Verify paste+enter: Ensure not using direct navigation
  5. Test in isolation: Run just the problematic section
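
Step 2 is simple arithmetic, shown here as a worked example. The step names and base delays are hypothetical, and `increase_delay` is an illustrative helper rather than part of the original skill.

```python
def increase_delay(current: float, extra_fraction: float = 0.5) -> float:
    """Return the delay raised by 50-100%, per troubleshooting step 2."""
    if not 0.5 <= extra_fraction <= 1.0:
        raise ValueError("extra_fraction must be between 0.5 and 1.0")
    return current * (1.0 + extra_fraction)


# Hypothetical per-step base delays in seconds, keyed by step name.
step_delays = {"open_project": 1.0, "open_followers": 1.5}

# Suppose the logs point at "open_followers": double its delay (+100%).
step_delays["open_followers"] = increase_delay(step_delays["open_followers"], 1.0)
```

Only the step identified in the logs is changed, so the remaining steps keep their existing timing while the problematic section is tested in isolation.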

Common Mistakes

  • ❌ Using page.goto() instead of paste+enter
  • ❌ Fixed delays without randomization
  • ❌ No mouse movement before clicks
  • ❌ Same user-agent across all runs
  • ❌ Headless mode (more detectable)