AgentSkillsCN

batch-jd-processor

采用并行子代理进行人工审核的批量职位描述处理工具。严禁使用自动化脚本。该工具可读取包含职位数据的 Excel/CSV 文件,将其分组处理,同时启动最多数量的子代理,依据用户设定的筛选条件开展类人化分析。最终输出彩色标注的 Excel 文件,其中清晰标注“通过”“拒绝”“错误”等分类,并附上详尽的判断依据与理由。触发方式包括:“批量处理这些职位描述”、“帮我筛选这些岗位”、“处理 Excel 中的职位”、“批量筛选职位”。

SKILL.md
--- frontmatter
name: batch-jd-processor
description: Batch job description processor using MANUAL REVIEW by parallel subagents. NO automated scripts allowed. Reads Excel/CSV with job data, splits into groups, launches max subagents for human-like analysis against user filter criteria. Outputs color-coded Excel with PASS/REJECT/ERROR classifications and detailed reasoning. Triggers include "batch process these JDs", "help me filter these positions", "process jobs in Excel", "batch filter jobs".

Batch JD Processor

Process job batches through parallel manual review by subagents. Each subagent reads jobs like a human and applies user's filter criteria.

Core Rules

⛔ Prohibitions

  • NO automated scripts (no filter_all_jobs.py, no regex scripts)
  • NO keyword extraction scripts

✅ Required Approach

  • Split jobs into groups (5 per group)
  • Launch ALL subagents in SINGLE message
  • Each subagent manually reads every job (5 jobs per agent)
  • Provide detailed reasoning for every decision

Three-Tier Classification

PASS - Meets all criteria, recommend applying REJECT - Fails filter criteria (location/experience/cert/position type) ERROR - Cannot evaluate (missing JD, corrupted data, missing fields)

Critical: Use ERROR for data problems, REJECT for criteria failures.

Examples:

json
{"filter_result": "ERROR", "reason": "Missing JD content, cannot evaluate position requirements"}
{"filter_result": "REJECT", "reason": "Requires 3 years experience, user needs 0-year entry-level"}
{"filter_result": "PASS", "reason": "Entry-level Data Analyst, matches ideal position, located in Toronto"}

Workflow

Step 0: Display Filter Criteria

Read and display user_filter.json showing location, experience, certifications, ideal positions, avoid categories.

Step 1: Load Data

Parse Excel/CSV into JSON for subagents:

python
import pandas as pd, json

df = pd.read_csv('input.csv') if file.endswith('.csv') else pd.read_excel('input.xlsx')
jobs = [{
    'job_id': str(row.get('Job ID', f'job_{idx}')),
    'title': row.get('Title', ''),
    'location': row.get('Location', ''),
    'url': row.get('URL', ''),
    'full_jd': row.get('Job Description', ''),
    'posted_date': row.get('Posted Date', '')
} for idx, row in df.iterrows()]

with open('all_jobs.json', 'w', encoding='utf-8') as f:
    json.dump(jobs, f, indent=2, ensure_ascii=False)

Step 2: Split into Groups

Calculate subagents: num_agents = Total JDs / 5 (each agent processes 5 JDs)

python
import math
num_agents = max(1, math.ceil(len(jobs) / 5))  # Allocate 1 agent per 5 JDs
jobs_per_agent = math.ceil(len(jobs) / num_agents)

for i in range(num_agents):
    group = jobs[i*jobs_per_agent : (i+1)*jobs_per_agent]
    with open(f'jobs_group_{i+1}.json', 'w', encoding='utf-8') as f:
        json.dump(group, f, indent=2, ensure_ascii=False)

Step 3: Launch Subagents (In SINGLE Message)

Prompt template for each subagent:

code
Review Group {N}: /path/to/jobs_group_{N}.json
Filter: /path/to/user_filter.json

For EACH job:
1. Check data quality first - Missing JD/fields? → ERROR
2. Read title and full JD carefully
3. Check ALL criteria: location, experience, certs, position type, avoid categories
4. Classify as PASS/REJECT/ERROR with specific reason

Output JSON: [{job_id, title, url, location, filter_result, reason}]
Save to: /path/to/group_{N}_results.json

Rules:
- NO scripts, manual reading only
- ERROR for data issues, REJECT for criteria failures
- Specific reasons (e.g. "Requires 2 years experience" not "Does not meet requirements")

Step 4: Merge & Generate Excel

python
import json, pandas as pd
from openpyxl import load_workbook
from openpyxl.styles import PatternFill, Font, Alignment

# Merge results
all_results = []
for i in range(1, num_agents + 1):
    with open(f'group_{i}_results.json', 'r', encoding='utf-8') as f:
        all_results.extend(json.load(f))

# Statistics
pass_count = sum(1 for r in all_results if r['filter_result'] == 'PASS')
reject_count = sum(1 for r in all_results if r['filter_result'] == 'REJECT')
error_count = sum(1 for r in all_results if r['filter_result'] == 'ERROR')

# Create Excel
df = pd.DataFrame(all_results)
df['sort_key'] = df['filter_result'].map({'PASS': 0, 'REJECT': 1, 'ERROR': 2})
df = df.sort_values('sort_key').drop('sort_key', axis=1)
df.to_excel('output.xlsx', index=False)

# Color code
wb = load_workbook('output.xlsx')
ws = wb.active
green = PatternFill(start_color='C6EFCE', end_color='C6EFCE', fill_type='solid')
red = PatternFill(start_color='FFC7CE', end_color='FFC7CE', fill_type='solid')
yellow = PatternFill(start_color='FFEB9C', end_color='FFEB9C', fill_type='solid')

for row in ws.iter_rows(min_row=2):
    result = row[4].value
    row[4].fill = green if result == 'PASS' else red if result == 'REJECT' else yellow

wb.save('output.xlsx')

Step 5: Present Results

code
📊 Batch Filtering Complete!
Total: {total} | ✅ PASS: {pass} | ❌ REJECT: {reject} | ⚠️ ERROR: {error}

🎯 Recommended Positions (Top 5):
[List PASS jobs with title, location, reason, URL]

📋 Rejection Reason Statistics:
[Group REJECT by reason with counts]

⚠️ Error Statistics:
[Group ERROR by reason with counts]

[Excel download link]

Subagent Quality Checklist

✅ Read data quality first (missing JD → ERROR, not REJECT) ✅ Read full job description, don't skim ✅ Check ALL criteria (location, experience, certs, position type, avoid) ✅ Specific reasoning ("Requires 3 years experience" not "Does not meet requirements") ✅ Handle edge cases (bilingual, "Associate" titles, "preferred" vs "required")

Common Mistakes

❌ Using scripts instead of manual review ❌ Launching subagents sequentially (launch ALL in one message) ❌ Vague reasoning ❌ ERROR for criteria failures (use ERROR only for data issues)

Performance

Batch SizeAgentsJobs/AgentEstimated Time
25552-3min
501053-5min
1002055-8min
20040510-15min

Manual review is slower than scripts but catches nuances and context-dependent requirements. Each agent processes exactly 5 JDs for consistent workload distribution.