hiring-and-interviews

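Use when designing technical interview processes, creating system design questions, building evaluation rubrics, structuring take-home assessments, or reducing bias in hiring.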

SKILL.md
--- frontmatter
name: hiring-and-interviews
description: "Use when designing technical interview processes, creating system design questions, building evaluation rubrics, structuring take-home assessments, or reducing bias in hiring."

Hiring and Interviews

Pipeline Design

Standard Technical Hiring Pipeline

| Stage | Duration | Owner | Purpose | Pass Rate |
| --- | --- | --- | --- | --- |
| 1. Sourcing | Ongoing | Recruiter | Build candidate pool | N/A |
| 2. Recruiter Screen | 30 min | Recruiter | Role fit, logistics, comp range | ~50% |
| 3. Technical Screen | 45-60 min | Engineer | Core competency check | ~30-40% |
| 4. Technical Deep Dive | 2-4 hrs | Panel (2-3) | Coding + system design | ~25-35% |
| 5. Team/Culture Fit | 45 min | Hiring manager | Values alignment, collaboration | ~80% |
| 6. Offer & Close | 1-2 weeks | Recruiter + HM | Comp, start date, negotiation | ~70-80% |
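
The pass rates in the pipeline table compound, so it can help to estimate how many recruiter screens one hire costs. The following is a minimal sketch, assuming the midpoint of each range in the table; the names `STAGE_PASS_RATES`, `cumulative_pass_rate`, and `screens_needed` are hypothetical and introduced only for illustration.

```python
from math import ceil, prod

# Assumption: midpoints of the pass-rate ranges in the pipeline table.
# Sourcing has no pass rate, so it is left out.
STAGE_PASS_RATES = {
    "Recruiter Screen": 0.50,
    "Technical Screen": 0.35,     # midpoint of ~30-40%
    "Technical Deep Dive": 0.30,  # midpoint of ~25-35%
    "Team/Culture Fit": 0.80,
    "Offer & Close": 0.75,        # midpoint of ~70-80%
}


def cumulative_pass_rate(rates):
    """Fraction of candidates entering the first stage who accept an offer."""
    return prod(rates.values())


def screens_needed(hires, rates):
    """Recruiter screens needed to expect `hires` accepted offers."""
    return ceil(hires / cumulative_pass_rate(rates))


if __name__ == "__main__":
    print(f"End-to-end pass rate: {cumulative_pass_rate(STAGE_PASS_RATES):.2%}")
    print(f"Recruiter screens per hire: {screens_needed(1, STAGE_PASS_RATES)}")
```

Under these assumed midpoints, roughly 3% of screened candidates end up accepting an offer, which is why sourcing volume matters as much as per-stage quality.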

Pipeline Customization

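| Role Level | Skip Screen? | System Design? | Take-Home? | Panel Size |
| --- | --- | --- | --- | --- |
| Junior (0-2 yr) | No | No | Optional | 2 |
| Mid (3-5 yr) | No | Simplified | Optional | 3 |
| Senior (5-8 yr) | Sometimes | Yes (45 min) | Optional | 3-4 |
| Staff+ (8+ yr) | Yes | Yes (60 min) | No | 4-5 |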

Question Type Selection

When to Use Each Format

| Format | Best For | Avoid When | Duration |
| --- | --- | --- | --- |
| Live coding | Algorithm thinking, code fluency | Senior+ roles (too artificial) | 45 min |
| System design | Architecture skills, tradeoff reasoning | Junior roles (<2 yr experience) | 45-60 min |
| Behavioral | Leadership, conflict resolution, values | Sole evaluation method | 30-45 min |
| Take-home | Real-world code quality, completeness | Candidate has <3 days, senior roles | 2-4 hrs |
| Pair programming | Collaboration, communication, debugging | Remote async interviews | 60 min |
| Code review | Reading comprehension, feedback quality | Candidates unfamiliar with language | 30-45 min |
| Portfolio review | Candidates with public work (OSS, etc.) | Roles requiring specific tech assessment | 30 min |

Live Coding Best Practices

  • Provide the problem statement in writing (don't make them memorize it)
  • Allow language choice when possible
  • Focus on problem-solving process, not syntax recall
  • Give a hint once the candidate has been stuck for more than 5 minutes, and evaluate how they use it
  • Reserve 10 min at end for questions

System Design Interview Structure

45-Minute Format

| Phase | Time | Interviewer Role | Candidate Should |
| --- | --- | --- | --- |
| Problem statement | 2 min | Present the scenario | Listen, take notes |
| Requirements | 8 min | Answer questions, guide scope | Ask clarifying questions, define scope |
| Estimation | 5 min | Validate assumptions | Back-of-envelope math, state assumptions |
| High-level design | 12 min | Probe decisions | Draw components, explain data flow |
| Deep dive | 13 min | Push on weak spots | Go deep on 1-2 components |
| Tradeoffs & wrap-up | 5 min | Ask "what would you change?" | Discuss alternatives, limitations |
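
Interviewers often lose track of time in the deep dive, so a printed minute-by-minute schedule can be useful. The sketch below turns the phase durations from the 45-minute format into start and end offsets and checks that they fill the slot exactly. `PHASES` and `build_schedule` are hypothetical names used only for this illustration.

```python
# Phase durations in minutes, copied from the 45-minute format table.
PHASES = [
    ("Problem statement", 2),
    ("Requirements", 8),
    ("Estimation", 5),
    ("High-level design", 12),
    ("Deep dive", 13),
    ("Tradeoffs & wrap-up", 5),
]


def build_schedule(phases, total_minutes=45):
    """Return (phase, start_minute, end_minute) tuples.

    Raises ValueError if the phase durations do not add up to the slot length.
    """
    schedule = []
    clock = 0
    for name, minutes in phases:
        schedule.append((name, clock, clock + minutes))
        clock += minutes
    if clock != total_minutes:
        raise ValueError(f"phases total {clock} min, expected {total_minutes}")
    return schedule


if __name__ == "__main__":
    for name, start, end in build_schedule(PHASES):
        print(f"{start:02d}-{end:02d} min  {name}")
```

The same function can validate a 60-minute variant by passing adjusted durations and `total_minutes=60`.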

Requirements Phase — Guide the Candidate

code
Functional requirements (what it does):
- "Users can upload photos up to 10MB"
- "Feed shows posts from followed users, reverse chronological"
- "Users can search posts by keyword"

Non-functional requirements (how it performs):
- Scale: 10M DAU, 500 req/s average, 5000 req/s peak
- Availability: 99.9% uptime
- Latency: p50 < 50ms, p99 < 200ms
- Consistency: eventual is acceptable for feed, strong for payments
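
A quick sanity check on the example numbers above is the kind of back-of-envelope math the estimation phase looks for. The following sketch derives requests per user per day, the peak-to-average ratio, and the downtime budget implied by 99.9% availability. The function names are hypothetical, and the 30-day month is an assumption.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_user_per_day(avg_rps, daily_active_users):
    """Average requests each active user generates per day."""
    return avg_rps * SECONDS_PER_DAY / daily_active_users


def peak_to_average(peak_rps, avg_rps):
    """How much headroom above average load the system must absorb."""
    return peak_rps / avg_rps


def allowed_downtime_minutes(availability, days=30):
    """Downtime budget in minutes over `days` for a given availability target."""
    return (1 - availability) * days * 24 * 60


if __name__ == "__main__":
    print(f"{requests_per_user_per_day(500, 10_000_000):.2f} requests/user/day")
    print(f"{peak_to_average(5000, 500):.0f}x peak-to-average")
    print(f"{allowed_downtime_minutes(0.999):.1f} min downtime per 30 days")
```

With the example figures, 500 req/s across 10M DAU works out to only about four requests per user per day, a number a strong candidate might question during requirements gathering.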

Evaluation Dimensions

| Dimension | Junior | Mid | Senior | Staff |
| --- | --- | --- | --- | --- |
| Requirements gathering | Needs prompting | Asks good questions | Drives the scoping | Identifies hidden requirements |
| Estimation | Struggles | Reasonable with help | Quick, accurate | Challenges given constraints |
| Component design | Basic boxes | Appropriate services | Justified choices | Novel approaches |
| Data modeling | Simple schemas | Normalized + indexes | Partitioning, replication | Cross-system data strategy |
| Tradeoff discussion | Binary thinking | Names tradeoffs | Quantifies tradeoffs | Connects to business impact |
| Communication | Needs structure | Clear with prompting | Proactive, structured | Adapts to audience |

Example System Design Questions by Level

| Level | Question | Key Evaluation Focus |
| --- | --- | --- |
| Mid | Design a URL shortener | Data modeling, basic scale |
| Senior | Design a notification system | Pub/sub, delivery guarantees, priority |
| Senior | Design a rate limiter | Distributed systems, consistency |
| Staff | Design a multi-region deployment strategy | CAP tradeoffs, failover, data sovereignty |
| Staff | Design the technical interview platform itself | Meta-thinking, full-stack architecture |

Evaluation Rubrics

Structured Scoring Template

Score each dimension 1-4. Avoid 5-point scales (the middle becomes a dumping ground).

| Score | Label | Meaning |
| --- | --- | --- |
| 1 | Does not meet | Significant gaps, could not perform at this level |
| 2 | Partially meets | Some capability shown, needs development |
| 3 | Meets | Competent at expected level for the role |
| 4 | Exceeds | Notably strong, above typical for this level |

Coding Interview Rubric

| Dimension | 1 | 2 | 3 | 4 | Weight |
| --- | --- | --- | --- | --- | --- |
| Problem solving | Can't break down problem | Needs significant hints | Solves with minor hints | Elegant solution, considers edge cases | 30% |
| Code quality | Unreadable, no structure | Works but messy | Clean, well-named, modular | Production-quality, defensive | 25% |
| Communication | Silent or confused | Explains when asked | Thinks aloud naturally | Drives conversation, checks assumptions | 20% |
| Testing mindset | No mention of tests | Tests when prompted | Identifies key test cases | Tests first, edge cases, error paths | 15% |
| Technical depth | Surface-level answers | Knows basics | Explains tradeoffs | Deep knowledge, references real experience | 10% |
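
The rubric weights sum to 100%, so one interviewer's scores can be folded into a single weighted number on the same 1-4 scale. This is a minimal sketch of that arithmetic; the dictionary keys and the `weighted_score` function are hypothetical names, and the weighted average is one reasonable way to combine the dimensions rather than a prescribed formula.

```python
# Weights copied from the coding interview rubric (they sum to 1.0).
WEIGHTS = {
    "problem_solving": 0.30,
    "code_quality": 0.25,
    "communication": 0.20,
    "testing_mindset": 0.15,
    "technical_depth": 0.10,
}


def weighted_score(scores, weights=WEIGHTS):
    """Combine per-dimension scores (integers 1-4) into a weighted average."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the rubric dimensions")
    for dimension, score in scores.items():
        if score not in (1, 2, 3, 4):
            raise ValueError(f"{dimension}: score must be 1, 2, 3, or 4")
    return sum(weights[d] * scores[d] for d in weights)


if __name__ == "__main__":
    example = {
        "problem_solving": 3,
        "code_quality": 3,
        "communication": 4,
        "testing_mindset": 2,
        "technical_depth": 3,
    }
    print(f"Weighted score: {weighted_score(example):.2f}")
```

A single number is convenient for sorting, but the per-dimension scores should still be the basis for the debrief, since a weighted average can hide a 1 in a critical dimension.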

Calibration Sessions

Run calibration before each hiring cycle:

  1. Interviewers independently score the same recorded interview (or written scenario)
  2. Compare scores; discuss any dimension with >1 point spread
  3. Align on what "3 - Meets" looks like for this specific role
  4. Document calibrated examples in the rubric

Frequency: at least once per quarter, and again whenever new interviewers join the panel.
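
Step 2 of the calibration session, finding dimensions where scores differ by more than one point, is easy to automate once scores are collected. The sketch below assumes scores arrive as a nested mapping of interviewer to dimension to score; `dimensions_to_discuss` is a hypothetical helper name.

```python
def dimensions_to_discuss(scores_by_interviewer, max_spread=1):
    """Return dimensions whose highest and lowest scores differ by more than max_spread.

    scores_by_interviewer maps interviewer name -> {dimension: score}.
    """
    by_dimension = {}
    for scores in scores_by_interviewer.values():
        for dimension, score in scores.items():
            by_dimension.setdefault(dimension, []).append(score)
    return sorted(
        dimension
        for dimension, values in by_dimension.items()
        if max(values) - min(values) > max_spread
    )


if __name__ == "__main__":
    session = {
        "interviewer_a": {"problem_solving": 2, "communication": 3},
        "interviewer_b": {"problem_solving": 4, "communication": 3},
        "interviewer_c": {"problem_solving": 3, "communication": 4},
    }
    print(dimensions_to_discuss(session))
```

In this made-up session only problem solving exceeds the one-point spread, so that is where the discussion time should go.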

Take-Home Assessment Design

Design Principles

| Principle | Implementation |
| --- | --- |
| Time-boxed | "Spend no more than 3 hours" (state explicitly) |
| Realistic scope | Resembles actual work, not algorithmic puzzles |
| Clear evaluation criteria | Share the rubric with the candidate |
| Starter code provided | Reduce boilerplate time, focus on the interesting part |
| Multiple valid approaches | No single "right" answer |

Take-Home Structure

markdown
# Assessment: [Title]

## Context
You're building [realistic scenario]. The existing codebase has [starter code description].

## Requirements
1. [Core requirement — must complete]
2. [Core requirement — must complete]
3. [Stretch goal — bonus if time allows]

## Provided
- Starter repo: [link]
- API documentation: [link]
- Sample data: included in repo

## Time Expectation
Please spend no more than 3 hours. We value a well-structured partial solution over a rushed complete one.

## Submission
- Push to a private fork or email a zip
- Include a README explaining your approach, tradeoffs, and what you'd do with more time

## Evaluation Criteria
We'll assess:
- Code organization and readability
- Error handling and edge cases
- Testing approach
- Technical decision rationale (in README)

Take-Home Anti-Patterns

  • "Build a full app from scratch" (too broad, biases toward free time)
  • No time limit (candidates spend 20+ hours, creates inequity)
  • Hidden evaluation criteria (candidates can't optimize for what matters)
  • Requiring a specific framework/language without business justification

Hiring Metrics

Core Metrics

| Metric | Target | How to Measure |
| --- | --- | --- |
| Time to hire | <30 days (eng) | Offer accepted date - req opened date |
| Pipeline velocity | <2 weeks per stage | Average days between stage transitions |
| Pass-through rate | See pipeline table above | Candidates advancing / candidates entering stage |
| Offer acceptance rate | >80% | Offers accepted / offers extended |
| Interview-to-offer ratio | 3:1 to 5:1 | Final round interviews / offers made |
| Interviewer load | <4 hrs/week | Interview hours per interviewer per week |
| Candidate NPS | >60 | Post-interview survey (even for rejects) |

Quality Metrics (Lagging)

| Metric | Timeframe | Signal |
| --- | --- | --- |
| New hire performance rating | 6 months | Were our assessments predictive? |
| 90-day retention | 3 months | Did we set correct expectations? |
| 1-year retention | 12 months | Culture and role fit assessment quality |
| Time to productivity | 3 months | Onboarding effectiveness (related to hiring) |
| Regretted attrition | Ongoing | Are we losing people we wanted to keep? |

Using Metrics

  • Track pass-through rates by interviewer to detect outliers (too harsh or too lenient)
  • If offer acceptance < 70%, investigate comp, speed, or candidate experience
  • If time-to-hire > 45 days, audit where candidates stall
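
The two threshold rules above can be checked mechanically from raw dates and counts. The following is a sketch under the formulas given in the core metrics table; the function names and the example dates are hypothetical.

```python
from datetime import date


def time_to_hire_days(req_opened, offer_accepted):
    """Offer accepted date minus req opened date, in days."""
    return (offer_accepted - req_opened).days


def offer_acceptance_rate(offers_accepted, offers_extended):
    """Offers accepted divided by offers extended."""
    if offers_extended == 0:
        raise ValueError("no offers extended")
    return offers_accepted / offers_extended


def metric_flags(days_to_hire, acceptance_rate):
    """Return follow-up actions for the two thresholds listed above."""
    flags = []
    if acceptance_rate < 0.70:
        flags.append("investigate comp, speed, or candidate experience")
    if days_to_hire > 45:
        flags.append("audit where candidates stall")
    return flags


if __name__ == "__main__":
    days = time_to_hire_days(date(2024, 1, 8), date(2024, 2, 26))
    rate = offer_acceptance_rate(6, 10)
    print(days, f"{rate:.0%}", metric_flags(days, rate))
```

Pass-through rates by interviewer can be tracked the same way by grouping stage outcomes per interviewer and comparing each rate against the panel's overall rate.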

Anti-Bias Practices

Structured Interview Protocol

| Practice | Why | How |
| --- | --- | --- |
| Same questions for all candidates | Prevents ad-hoc difficulty variance | Maintain a question bank per role |
| Rubric-first evaluation | Anchors to criteria, not gut feeling | Write rubric before seeing candidates |
| Independent scoring | Prevents groupthink | Interviewers submit scores before debrief |
| Diverse interview panels | Reduces affinity bias | Min 1 interviewer from underrepresented group |
| Blind resume review | Reduces name/school/company bias | Strip identifying info in initial screen |
| Standardized debrief format | Prevents loudest-voice-wins | Each interviewer presents scores, then discussion |

Debrief Structure

code
1. Each interviewer shares:
   - Score per dimension (already submitted)
   - One strongest signal (positive)
   - One concern (if any)

2. Hiring manager synthesizes:
   - Areas of agreement
   - Areas of disagreement (discuss these)
   - Overall hire/no-hire recommendation

3. Decision:
   - Strong hire: ≥3 interviewers at 3+ average
   - Lean hire: mixed signals, discuss specific concerns
   - No hire: ≥2 interviewers below 2.5 average
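
The decision thresholds in step 3 can be expressed as a small function, which forces the ambiguous cases into the open. This sketch makes two assumptions the text does not state: each interviewer's "average" is the unweighted mean of their dimension scores, and the no-hire rule is checked first when a panel satisfies both rules. `debrief_outcome` is a hypothetical name.

```python
from statistics import mean


def debrief_outcome(scores_by_interviewer):
    """Classify a panel's scores as 'strong hire', 'lean hire', or 'no hire'.

    scores_by_interviewer maps interviewer name -> {dimension: score (1-4)}.
    Assumption: no-hire takes precedence when both thresholds are met.
    """
    averages = [mean(list(s.values())) for s in scores_by_interviewer.values()]
    below_bar = sum(avg < 2.5 for avg in averages)
    at_or_above_bar = sum(avg >= 3 for avg in averages)
    if below_bar >= 2:
        return "no hire"
    if at_or_above_bar >= 3:
        return "strong hire"
    return "lean hire"  # mixed signals: discuss the specific concerns


if __name__ == "__main__":
    panel = {
        "interviewer_a": {"problem_solving": 3, "communication": 4},
        "interviewer_b": {"problem_solving": 3, "communication": 3},
        "interviewer_c": {"problem_solving": 4, "communication": 3},
    }
    print(debrief_outcome(panel))
```

Note that a two-person panel, as suggested for junior roles, can never reach "strong hire" under this rule as written, so the threshold may need adjusting for small panels.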

Language to Avoid in Evaluations

| Biased Language | Better Alternative |
| --- | --- |
| "Not a culture fit" | "Specific concern about [collaboration/communication/X]" |
| "Not technical enough" | "Scored 2 on problem solving: needed hints on [specific topic]" |
| "Seemed nervous" | "Communication score reflects difficulty articulating approach" |
| "Overqualified" | "Concern about long-term engagement given stated career goals" |
| "Would be great in a few years" | "Scored 2 on [dimension], meets bar for [lower level] role" |

Process Audits

Run quarterly:

  • Are pass-through rates consistent across demographic groups?
  • Are certain interviewers consistently harsh/lenient?
  • Do take-home completion rates differ by candidate background?
  • Is time-to-hire equitable across candidate sources?
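
The first audit question, whether pass-through rates are consistent across groups, reduces to a per-group rate and the gap between the highest and lowest. The sketch below assumes stage outcomes are available as `(group, advanced)` pairs; the function names and group labels are hypothetical, and what size of gap warrants investigation is left to the team.

```python
def pass_through_by_group(outcomes):
    """Compute candidates advancing / candidates entering, per group.

    outcomes is an iterable of (group_label, advanced) pairs, advanced being a bool.
    """
    entered = {}
    advanced_count = {}
    for group, advanced in outcomes:
        entered[group] = entered.get(group, 0) + 1
        advanced_count[group] = advanced_count.get(group, 0) + int(advanced)
    return {group: advanced_count[group] / entered[group] for group in entered}


def largest_gap(rates):
    """Difference between the highest and lowest group pass-through rates."""
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    stage_outcomes = (
        [("group_a", True)] * 4 + [("group_a", False)] * 6
        + [("group_b", True)] * 2 + [("group_b", False)] * 8
    )
    rates = pass_through_by_group(stage_outcomes)
    print(rates, f"gap: {largest_gap(rates):.0%}")
```

Small groups produce noisy rates, so a gap on a handful of candidates is a prompt to look closer rather than a conclusion in itself.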

Gotchas

  • Over-indexing on algorithms: LeetCode-style questions test a narrow skill. Use them as one signal, not the primary filter. Pair programming or take-homes test more relevant skills.
  • Culture fit bias: "Culture fit" often means "similar to us." Replace with "values alignment" and define specific values with behavioral indicators.
  • Inconsistent evaluation: Without rubrics, interviewers anchor on different things. Two interviewers can both say "strong hire" for completely different (even contradictory) reasons.
  • Speed vs. quality tradeoff: Pressure to fill roles fast leads to lowered bars. Track quality metrics (new hire performance) alongside speed metrics.
  • Take-home inequity: Candidates with families, multiple jobs, or disabilities may have less discretionary time. Always offer an alternative format (live session with same problem).
  • Interviewer burnout: Senior engineers doing 6+ hours/week of interviews burn out and give worse evaluations. Cap interviewing at 4 hours per week.
  • Feedback black holes: Candidates who never hear back poison your employer brand. Always send rejection emails within 5 business days of the decision.
  • Panel homogeneity: All-male, all-senior, all-same-background panels introduce systematic blind spots. Diverse panels catch different signals.
  • Anchoring on pedigree: FAANG experience or top-school degrees are weak predictors of job performance. Evaluate demonstrated skills, not credentials.
  • Moving the goalposts: Don't change evaluation criteria mid-pipeline because a favored candidate scored poorly. Lock rubrics before interviewing starts.