Pluck
INPUT CONTRACT
- target: Collection (variable or ID)
- field: String path (supports dot notation like metadata.uri)
- out: Variable name
REQUIREMENTS:
- Collection MUST contain Notes (not Collections)
- Each Note MUST be dict/JSON object
- Field MUST exist as key in each Note
NOT SUPPORTED:
- ❌ Note containing array (use split first)
- ❌ Collection of arrays (must be dict Notes)
OUTPUT
Returns Collection of Notes, each containing extracted scalar value. Notes missing field are excluded.
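The extraction rule above can be sketched in Python. This is a minimal illustration, not the tool's implementation: it assumes a Collection can be modeled as a list and a Note as a dict, the names get_path and pluck are hypothetical, and it returns raw values rather than wrapping each one in a Note.

```python
from typing import Any

_MISSING = object()  # sentinel: distinguishes "no such key" from a stored None


def get_path(note: Any, path: str) -> Any:
    """Follow a dot-separated path through nested dicts (hypothetical helper)."""
    current = note
    for key in path.split("."):
        # Any non-dict step or absent key means the field is missing.
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def pluck(collection: list, field: str) -> list:
    """Collect the field from each Note, skipping Notes where it is absent."""
    values = []
    for note in collection:
        value = get_path(note, field)
        if value is not _MISSING:
            values.append(value)
    return values


papers = [
    {"text": "A", "metadata": {"title": "First", "uri": "https://example.com/a.pdf"}},
    {"text": "B", "metadata": {"title": "Second"}},  # no metadata.uri
]
titles = pluck(papers, "metadata.title")  # both Notes contribute
uris = pluck(papers, "metadata.uri")      # second Note is excluded
```

Under these assumptions, a Note that is not a dict (for example an array) contributes nothing, which matches the exclusion behavior described above.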
FAILURE SEMANTICS
Empty Collection = expected when:
- Field missing in all Notes
- Type contract violated
Empty ≠ error — indicates no matches, not failure.
Actual failures: Invalid target type or missing parameters.
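The distinction between an empty result and an actual failure might look like the following sketch. Everything here is an assumption made for illustration: PluckError, run_pluck, and the variables dict are hypothetical, and a Collection is modeled as a Python list of dict Notes.

```python
class PluckError(Exception):
    """Hypothetical error type for genuine failures."""


REQUIRED = ("target", "field", "out")


def run_pluck(op: dict, variables: dict) -> list:
    # Actual failure: a required parameter is missing.
    missing = [name for name in REQUIRED if name not in op]
    if missing:
        raise PluckError(f"missing parameters: {', '.join(missing)}")

    # Actual failure: the target is not a Collection (modeled as a list).
    target = variables.get(op["target"])
    if not isinstance(target, list):
        raise PluckError("target must be a Collection")

    result = []
    for note in target:
        current = note
        for key in op["field"].split("."):
            if isinstance(current, dict) and key in current:
                current = current[key]
            else:
                break  # field missing or Note is not a dict: skip quietly
        else:
            result.append(current)  # reached only if every path step resolved

    variables[op["out"]] = result  # may be empty; that is not an error
    return result


variables = {"$papers": [{"metadata": {"title": "T"}}, ["not", "a", "dict"]]}
years = run_pluck(
    {"type": "pluck", "target": "$papers", "field": "metadata.year", "out": "$years"},
    variables,
)
```

In this sketch a caller checks for an empty list to detect "no matches" and reserves exception handling for malformed operations.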
REPRESENTATION INVARIANTS
- Note containing JSON array ≠ Collection
- Use split to convert array → Collection
- flatten performs inverse (Collection → Note)
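To make the invariant concrete, here is a rough Python model of the two representations. It is only a sketch: the functions below are hypothetical stand-ins that share the names split and flatten, a Note is modeled as a JSON string, and a Collection as a Python list. The real operations may differ in detail.

```python
import json


def split(note_text: str) -> list:
    """Turn one Note holding a JSON array into a Collection (one item per element)."""
    items = json.loads(note_text)
    if not isinstance(items, list):
        raise ValueError("split expects a Note containing a JSON array")
    return items


def flatten(collection: list) -> str:
    """Inverse direction: pack a Collection back into one Note holding a JSON array."""
    return json.dumps(collection)


array_note = '[{"text": "a"}, {"text": "b"}]'  # one Note, not a Collection
collection = split(array_note)                  # two dict Notes, ready for pluck
round_trip = flatten(collection)                # back to a single Note
```

The point of the model is that array_note is a single value until split is applied; only afterwards is there something per-Note for pluck to iterate over.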
CONTENT STRUCTURE
For JSON Notes, content is a dict with fields:
- Top-level fields: text, format, char_count
- Nested fields: metadata.* (e.g., metadata.uri, metadata.title, metadata.year)
Example Note content structure (from semantic-scholar/search-web):
```json
{
  "text": "Full text content...",
  "format": "paper",
  "metadata": {
    "title": "Paper Title",
    "uri": "https://example.com/paper.pdf"
  },
  "char_count": 5000
}
```
FIELD ACCESS EXAMPLES
Extract nested field:
```json
{"type":"pluck","target":"$papers","field":"metadata.title","out":"$titles"}
```
Extract top-level field:
```json
{"type":"pluck","target":"$results","field":"text","out":"$texts"}
```
Extract URI for fetching:
```json
{"type":"pluck","target":"$search_results","field":"metadata.uri","out":"$urls"}
```
ANTI-PATTERNS
❌ pluck(target=$array_note) → Use split first
❌ pluck(target=$coll_of_arrays) → Elements must be dicts
❌ Treating empty result as error → Empty = no matches