PEIL Evaluation Skill
This skill evaluates the quality and effectiveness of prompts generated with the Prompt Engineering Instructional Language (PEIL) methodology.
When to Use This Skill
- Assessing the quality of generated system prompts
- Providing constructive feedback on prompt design
- Rating prompts against established criteria
- Iteratively improving prompts before deployment
- Comparing multiple prompt versions
Evaluation Criteria (Weighted)
| Criterion | Weight | Description |
|---|---|---|
| Clarity and Coherence | 30% | Is the language clear and unambiguous? Does the prompt have a logical flow? |
| Completeness and Comprehensiveness | 25% | Does the prompt cover all necessary aspects? Are important elements missing? |
| Relevance and Applicability | 20% | How well does the prompt align with its intended purpose? Is it practical? |
| Creativity and Originality | 15% | Does the prompt introduce novel approaches? How original is it? |
| Technical Accuracy | 10% | Are technical details and instructions accurate? |
Evaluation Process
Step 1: Initial Assessment
Read the prompt completely before scoring. Identify:
- The intended purpose/task
- Target audience (agent, human, specific domain)
- Expected output format
Step 2: Criterion-by-Criterion Analysis
For each of the five criteria:
- Identify specific strengths
- Identify areas for improvement
- Assign a score (0-100)
Step 3: Calculate Overall Score
```
Overall = (Clarity × 0.30) + (Completeness × 0.25) + (Relevance × 0.20) + (Creativity × 0.15) + (Accuracy × 0.10)
```
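As a worked example of the formula, the following Python sketch computes the weighted overall score. The function name `overall_score` and the dictionary keys are illustrative choices, not part of PEIL, and the example scores are made up. Weights are stored as integer percentages so the arithmetic stays exact.

```python
# Weights from the Evaluation Criteria table, as integer percentages.
# The key names are illustrative; PEIL does not define them.
WEIGHTS = {
    "clarity": 30,
    "completeness": 25,
    "relevance": 20,
    "creativity": 15,
    "accuracy": 10,
}


def overall_score(scores: dict[str, float]) -> float:
    """Return the weighted overall score (0-100) from per-criterion scores."""
    if set(scores) != set(WEIGHTS):
        raise ValueError(f"expected scores for exactly: {sorted(WEIGHTS)}")
    for name, value in scores.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} score {value} is outside 0-100")
    # Multiply each score by its percentage weight, then divide once by 100.
    return sum(scores[name] * weight for name, weight in WEIGHTS.items()) / 100


# Hypothetical scores for one prompt.
example = {
    "clarity": 80,
    "completeness": 70,
    "relevance": 90,
    "creativity": 60,
    "accuracy": 100,
}
print(overall_score(example))  # 78.5
```

Here the terms are 24 + 17.5 + 18 + 9 + 10 = 78.5, which shows how a weak Creativity score costs less than an equally weak Clarity score would.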
Step 4: Generate Feedback
Provide actionable recommendations for improvement.
Evaluation Output Format
```markdown
## Prompt Evaluation Report

### Overall Score: [X]/100

### Criterion Breakdown

| Criterion | Score | Strengths | Areas for Improvement |
|-----------|-------|-----------|----------------------|
| Clarity (30%) | X/100 | ... | ... |
| Completeness (25%) | X/100 | ... | ... |
| Relevance (20%) | X/100 | ... | ... |
| Creativity (15%) | X/100 | ... | ... |
| Technical Accuracy (10%) | X/100 | ... | ... |

### Key Recommendations

1. [Specific, actionable recommendation]
2. [Specific, actionable recommendation]
3. [Specific, actionable recommendation]

### Summary

[2-3 sentence summary of the evaluation]
```
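If reports are produced programmatically, the template can be filled in with a small helper. The sketch below is one possible approach, not part of PEIL: `render_report`, its parameters, and the sample strengths, recommendations, and summary are all hypothetical.

```python
# Criterion labels and percentage weights as they appear in the report template.
CRITERIA = [
    ("Clarity", 30),
    ("Completeness", 25),
    ("Relevance", 20),
    ("Creativity", 15),
    ("Technical Accuracy", 10),
]


def render_report(rows, recommendations, summary):
    """Build the Markdown report.

    rows maps each criterion label to (score, strengths, areas_for_improvement).
    """
    # Weighted overall score, using integer percentages to keep the sum exact.
    overall = sum(rows[name][0] * weight for name, weight in CRITERIA) / 100
    lines = [
        "## Prompt Evaluation Report",
        "",
        f"### Overall Score: {overall:g}/100",
        "",
        "### Criterion Breakdown",
        "",
        "| Criterion | Score | Strengths | Areas for Improvement |",
        "|-----------|-------|-----------|----------------------|",
    ]
    for name, weight in CRITERIA:
        score, strengths, areas = rows[name]
        lines.append(f"| {name} ({weight}%) | {score}/100 | {strengths} | {areas} |")
    lines += ["", "### Key Recommendations", ""]
    lines += [f"{i}. {text}" for i, text in enumerate(recommendations, start=1)]
    lines += ["", "### Summary", "", summary]
    return "\n".join(lines)


# Hypothetical evaluation data, for illustration only.
report = render_report(
    rows={
        "Clarity": (80, "Clear role", "Tighten step 2"),
        "Completeness": (70, "Covers main task", "No edge cases"),
        "Relevance": (90, "Fits the purpose", "None noted"),
        "Creativity": (60, "Standard approach", "Few examples"),
        "Technical Accuracy": (100, "Details correct", "None noted"),
    },
    recommendations=["Add an output format", "Add edge case handling"],
    summary="A solid prompt that needs a defined output format.",
)
print(report)
```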
Scoring Guidelines
Clarity and Coherence (30%)
| Score Range | Indicators |
|---|---|
| 90-100 | Crystal clear instructions, perfect logical flow, no ambiguity |
| 70-89 | Mostly clear, minor ambiguities, good structure |
| 50-69 | Some unclear sections, could be better organized |
| 30-49 | Confusing in places, poor flow, significant ambiguity |
| 0-29 | Very unclear, disorganized, highly ambiguous |
Completeness and Comprehensiveness (25%)
| Score Range | Indicators |
|---|---|
| 90-100 | All PEIL components present, thorough coverage |
| 70-89 | Most components present, good coverage with minor gaps |
| 50-69 | Some components missing, moderate coverage |
| 30-49 | Several components missing, incomplete coverage |
| 0-29 | Most components missing, very incomplete |
Relevance and Applicability (20%)
| Score Range | Indicators |
|---|---|
| 90-100 | Perfectly aligned with purpose, immediately applicable |
| 70-89 | Well-aligned, practical with minor adjustments |
| 50-69 | Somewhat aligned, needs modification for use |
| 30-49 | Poorly aligned, limited practical value |
| 0-29 | Not aligned with purpose, impractical |
Creativity and Originality (15%)
| Score Range | Indicators |
|---|---|
| 90-100 | Highly innovative approach, novel techniques |
| 70-89 | Some creative elements, good use of techniques |
| 50-69 | Standard approach, minimal creativity |
| 30-49 | Very basic, formulaic |
| 0-29 | No creativity, copied template |
Technical Accuracy (10%)
| Score Range | Indicators |
|---|---|
| 90-100 | All technical details correct, best practices followed |
| 70-89 | Mostly accurate, minor technical issues |
| 50-69 | Some inaccuracies, deviations from best practices |
| 30-49 | Several inaccuracies, poor technical implementation |
| 0-29 | Major technical errors, incorrect information |
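All five rubrics share the same score ranges, so a single lookup can map a numeric score to its band. The following Python sketch assumes integer scores; `score_band` and `CLARITY_INDICATORS` are illustrative names, and the indicator text is copied from the Clarity and Coherence table.

```python
# Lower bound and label of each band, highest first.
BANDS = [(90, "90-100"), (70, "70-89"), (50, "50-69"), (30, "30-49"), (0, "0-29")]

# Indicators for one criterion, keyed by band label (from the Clarity table).
CLARITY_INDICATORS = {
    "90-100": "Crystal clear instructions, perfect logical flow, no ambiguity",
    "70-89": "Mostly clear, minor ambiguities, good structure",
    "50-69": "Some unclear sections, could be better organized",
    "30-49": "Confusing in places, poor flow, significant ambiguity",
    "0-29": "Very unclear, disorganized, highly ambiguous",
}


def score_band(score: int) -> str:
    """Return the rubric band label that contains an integer score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score {score} is outside 0-100")
    for lower, label in BANDS:
        if score >= lower:
            return label
    raise AssertionError("unreachable: the last band starts at 0")


band = score_band(72)
print(band, "->", CLARITY_INDICATORS[band])
```

A lookup like this is mainly useful as a consistency check: if the indicators for the returned band do not describe the prompt, the assigned score probably needs revisiting.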
Quick Evaluation Checklist
Before detailed evaluation, check:
- Does the prompt have a clear Role defined?
- Is the Context specific and relevant?
- Are complex questions broken down?
- Are there specific, actionable instructions?
- Is there a length/conciseness constraint?
- Is an appropriate prompting technique applied?
- Is the desired output format specified?
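The checklist can also be recorded as data so that unmet items are listed explicitly before the detailed evaluation. This is a minimal sketch: the yes/no answers still come from a human or model reviewer, and `unmet_items` and the sample answers are hypothetical.

```python
# The quick-check questions, in the order listed above.
CHECKLIST = [
    "Does the prompt have a clear Role defined?",
    "Is the Context specific and relevant?",
    "Are complex questions broken down?",
    "Are there specific, actionable instructions?",
    "Is there a length/conciseness constraint?",
    "Is an appropriate prompting technique applied?",
    "Is the desired output format specified?",
]


def unmet_items(answers: list[bool]) -> list[str]:
    """Return the checklist questions that were answered 'no'."""
    if len(answers) != len(CHECKLIST):
        raise ValueError(f"expected {len(CHECKLIST)} answers, got {len(answers)}")
    return [question for question, ok in zip(CHECKLIST, answers) if not ok]


# Hypothetical reviewer answers for one prompt.
answers = [True, True, False, True, False, True, True]
for question in unmet_items(answers):
    print("Unmet:", question)
```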
Common Issues and Recommendations
| Issue | Recommendation |
|---|---|
| Vague role definition | Add specific expertise and domain context |
| Missing context | Explain the situation and constraints |
| Overly complex | Break into sub-prompts or stages |
| No output format | Specify Markdown, JSON, bullet points, etc. |
| Wrong technique | Match technique to task type (see PEIL techniques) |
| Too long | Focus on essential instructions, move details to examples |
| Too short | Add constraints, examples, or edge case handling |
Additional Resources
- PEIL Skill - Main prompt generation methodology
- Evaluation Criteria Details - Extended criterion definitions