Split Primitive
Transform a Note's internal structure into a Collection of Notes.
Overview
The split primitive transforms a single Note into a Collection by splitting its content based on structure or delimiter. This is a structure transformation, not content inspection. To view Collection contents, use display (show to user) or flatten (merge into single Note).
Input Types
The split primitive handles four input types:
- •JSON Array: Directly splits array elements (e.g., [1, 2, 3] → Collection of 3 Notes)
- •JSON Object with Array Field: Extracts array from specified field (default: results)
- •JSONL Format: Multiple JSON objects separated by newlines
- •Plain Text: Splits by delimiter (default: sentence boundaries)
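The four input types can be told apart with a simple check order. The sketch below is illustrative only: the function name classify_split_input and the order of the checks are assumptions, not the primitive's actual implementation.

```python
import json


def classify_split_input(content: str, field: str = "results") -> str:
    """Guess which of the four input types a Note's content looks like.

    Hypothetical helper: the real primitive may detect types differently.
    """
    text = content.strip()

    # Try to parse the whole content as a single JSON document.
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        parsed = None

    if isinstance(parsed, list):
        return "json_array"
    if isinstance(parsed, dict) and isinstance(parsed.get(field), list):
        return "json_object_with_array_field"

    # JSONL: more than one non-empty line, each line a JSON object.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) > 1:
        try:
            if all(isinstance(json.loads(ln), dict) for ln in lines):
                return "jsonl"
        except json.JSONDecodeError:
            pass

    # Anything else is treated as plain text.
    return "plain_text"
```

Under these assumptions, a JSON object that lacks the named array field falls through to plain text; whether the actual primitive does the same is not stated here.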
Parameters
Required
- •target: $variable - Note containing JSON array, JSON object with array field, JSONL, or plain text
- •out: $variable - Variable name for resulting Collection
Optional
- •field: string (default: 'results') - Name of array field in JSON object case
- •delimiter: string (default: 'sentence') - For plain text splitting:
  - •'sentence' - Splits on sentence boundaries (". ", ".\n", "! ", "!\n", "? ", "?\n")
  - •'paragraph' - Splits on double newlines (\n\n)
  - •'line' - Splits on single newlines (\n)
  - •Custom string - Splits on the specified delimiter string
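To make the required and optional parameters concrete, here is a minimal sketch of how a split step could be validated and given its defaults. The function resolve_split_params is hypothetical and the error message is invented for illustration; only the parameter names and default values come from the lists above.

```python
def resolve_split_params(step: dict) -> dict:
    """Check required parameters and apply the documented defaults.

    Hypothetical helper, not part of the actual primitive.
    """
    # target and out are the two required parameters.
    missing = [key for key in ("target", "out") if key not in step]
    if missing:
        raise ValueError(
            "split step is missing required parameter(s): " + ", ".join(missing)
        )

    return {
        "target": step["target"],
        "out": step["out"],
        # Documented defaults for the optional parameters.
        "field": step.get("field", "results"),
        "delimiter": step.get("delimiter", "sentence"),
    }
```

For example, resolve_split_params({"type": "split", "target": "$text_note", "out": "$sentences"}) fills in field 'results' and delimiter 'sentence'.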
Behavior
For Plain Text (default delimiter: 'sentence')
When splitting plain text with the default 'sentence' delimiter:
- •Splits on sentence boundaries: period (.), exclamation mark (!), or question mark (?) followed by space or newline
- •Normalizes whitespace: Internal newlines are replaced with spaces and runs of multiple spaces are collapsed to a single space within each segment
- •Filters empty segments: Zero-length or whitespace-only segments are removed
- •Preserves semantic units (complete sentences) even when text spans multiple lines (e.g., PDF-extracted text)
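The sentence-splitting behavior described above can be approximated in a few lines. This is a sketch under stated assumptions, not the primitive's actual code: the function name split_sentences is hypothetical, terminal punctuation is assumed to stay attached to its sentence, and newlines inside a sentence are assumed to become single spaces.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Approximate the default 'sentence' delimiter (illustrative only)."""
    # Break after '.', '!' or '?' when followed by whitespace (space or newline).
    parts = re.split(r"(?<=[.!?])\s+", text)

    # Collapse internal newlines and repeated spaces to single spaces.
    cleaned = [" ".join(part.split()) for part in parts]

    # Drop zero-length or whitespace-only segments.
    return [segment for segment in cleaned if segment]
```

With this sketch, text such as "This sentence\nspans two lines. Next one." yields two segments, with the first rejoined onto a single line, which matches the PDF-extracted text case.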
For Other Delimiters
- •'paragraph': Splits on double newlines, normalizes whitespace within paragraphs
- •'line': Splits on single newlines, normalizes whitespace within lines
- •Custom delimiter: Splits on the specified string, normalizes whitespace within segments
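The non-sentence delimiters all follow the same pattern: split on a separator, normalize whitespace, drop empty segments. A minimal sketch, assuming a hypothetical function split_by_delimiter (the 'sentence' keyword is not handled here, and an empty custom delimiter is rejected as an assumption of this sketch):

```python
def split_by_delimiter(text: str, delimiter: str) -> list[str]:
    """Approximate 'paragraph', 'line', and custom-string splitting.

    Illustrative only; the actual primitive may differ in edge cases.
    """
    if not delimiter:
        raise ValueError("delimiter must be a non-empty string")

    # Map the named delimiters to their separator; anything else is
    # used literally as a custom delimiter string.
    separators = {"paragraph": "\n\n", "line": "\n"}
    separator = separators.get(delimiter, delimiter)

    segments = text.split(separator)

    # Normalize whitespace inside each segment, then filter empties.
    cleaned = [" ".join(segment.split()) for segment in segments]
    return [segment for segment in cleaned if segment]
```

Because empty segments are filtered, three or more consecutive newlines produce the same paragraphs as exactly two.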
Examples
JSON Array
{"type":"split","target":"$json_array_note","out":"$items"}
Input Note: [1, 2, 3] → Collection of 3 Notes
JSON Object with Array Field
{"type":"split","target":"$data_note","out":"$items"}
Input Note: {"results": [{"x":1}, {"x":2}]} → Collection of 2 Notes
JSONL Format
{"type":"split","target":"$jsonl_note","out":"$items"}
Input Note: {"key":"val1"}\n{"key":"val2"} → Collection of 2 Notes
Plain Text - Sentence Splitting (Default)
{"type":"split","target":"$text_note","out":"$sentences"}
Input Note: "First sentence. Second sentence! Third sentence?" → Collection of 3 Notes
Plain Text - Paragraph Splitting
{"type":"split","target":"$text_note","delimiter":"paragraph","out":"$paragraphs"}
Input Note: "Para 1\n\nPara 2" → Collection of 2 Notes
Plain Text - Line Splitting
{"type":"split","target":"$text_note","delimiter":"line","out":"$lines"}
Input Note: "Line 1\nLine 2\nLine 3" → Collection of 3 Notes
Plain Text - Custom Delimiter
{"type":"split","target":"$text_note","delimiter":"---","out":"$sections"}
Input Note: "Section 1---Section 2---Section 3" → Collection of 3 Notes
Use Cases
- •Document Processing: Split RFP documents into sentences for compliance analysis
- •Text Analysis: Break down large text documents into semantic units
- •Data Extraction: Transform structured JSON arrays into Collections for processing
- •PDF Text Processing: Handle PDF-extracted text where sentences span multiple lines
Important Notes
- •NOT for inspecting Collections: Collections are already split. Use display to view Collection contents or flatten to merge back into a single Note.
- •search-web and semantic-scholar: These tools return Collections directly - NO split needed.
- •Whitespace Normalization: For plain text, internal newlines are replaced with spaces and multiple spaces are collapsed to ensure clean semantic units.
- •Empty Filtering: Empty or whitespace-only segments are automatically filtered out.
Common Mistakes
- •Trying to split a Collection to "see inside it" - Collections are already split, use display instead
- •Forgetting that default behavior is sentence splitting (not line splitting) for plain text
- •Expecting line breaks to be preserved - they are normalized to spaces for semantic processing