Readwise
Access Readwise highlights (via MCP) and full document content (via Reader API).
<EXTREMELY-IMPORTANT>
## IRON LAW: Main Chat NEVER Calls Readwise Tools
EVERY READWISE OPERATION MUST GO THROUGH LIBRARIAN. This is not negotiable.
Main chat MUST NOT:
- •Call mcp__readwise__search_readwise_highlights directly
- •Call any mcp__readwise__* tool
- •“Just quickly check” highlights
- •“Look up one thing” in Readwise
If you’re about to call a Readwise tool in main chat, STOP. Spawn a librarian sub-agent instead.
</EXTREMELY-IMPORTANT>
Permission Model
| Context | Readwise MCP Tools | Reader API Script |
|---|---|---|
| Main chat | FORBIDDEN | FORBIDDEN |
| Librarian sub-agent | ALLOWED | ALLOWED |
Red Flag Detection
STOP if you catch yourself thinking:
- •“Let me quickly search Readwise...”
- •“I’ll just check the highlights...”
- •“The MCP tool is right here in my tool list...”
These thoughts in MAIN CHAT = VIOLATION. Delegate instead.
Rationalization Prevention
| Thought | Reality |
|---|---|
| “Just one quick search” | Quick = context pollution. Delegate. |
| “MCP tool is available” | Available != permitted in main chat. Delegate. |
| “I’ll summarize results” | You still receive full payload. Delegate. |
| “User wants speed” | Sub-agents ARE fast. Delegate. |
| “I know the exact tag” | Use Python script via librarian. Delegate. |
| “It’s a simple query” | Simple still pollutes. Delegate. |
Correct Pattern
User: “Search my Readwise for proxy advisor articles”
MAIN CHAT RESPONSE:
Task(subagent_type="workflows:librarian", prompt="Search Readwise for proxy advisor articles and summarize findings")
NEVER IN MAIN CHAT:
mcp__readwise__search_readwise_highlights(...)
Honesty Requirement
<EXTREMELY-IMPORTANT>
**Calling Readwise tools directly in main chat is not “being helpful” - it’s violating the workflow.**
When you call Readwise directly, you are:
- •Wasting the user’s context window with verbose results
- •Using the wrong tool (MCP = semantic only, no full docs)
- •Skipping the proper workflow (librarian knows search -> format -> NotebookLM)
"I'll just check quickly" is the rationalization. The librarian exists for this purpose. Use it. </EXTREMELY-IMPORTANT>
Tag-Based Workflow (CRITICAL)
<EXTREMELY-IMPORTANT>
**When user mentions items were added by tag, NEVER use MCP semantic search.**
Trigger Phrases
- •"we added items tagged X"
- •"I thought we added X to NLM"
- •"are they not in notebooklm?"
- •"items tagged [tag]"
- •"documents with tag [tag]"
Required Workflow
User mentions tagged items or NLM content
│
▼
┌─────────────────────┐
│ 1. CHECK NLM FIRST │ ← MANDATORY
│ nlm list │
│ nlm chat <id> │
└─────────────────────┘
│
Not in NLM?
▼
┌─────────────────────┐
│ 2. USE READER API │ ← For tagged items
│ --tag "X" │
│ NOT MCP search! │
└─────────────────────┘
Red Flags for Tag-Based Queries
STOP if user mentions tagged items AND you're about to:
- •Call mcp__readwise__search_readwise_highlights
- •Do "semantic search" for tagged content
- •Skip checking NLM first
These are WORKFLOW VIOLATIONS. The content is already curated by tag.
Correct Response Pattern
User: "I thought we added the Egan-Jones letters to NLM? They were tagged proxy advisors."
CORRECT:
1. Task(librarian) → Check NLM for proxy advisors notebook
2. If not found: Use Reader API with --tag "proxy advisors"
3. NEVER: mcp__readwise__search_readwise_highlights("Egan-Jones...")
WRONG:
- Immediately calling MCP search
- Skipping NLM check
- Using semantic search for tagged content
Rationalization Prevention for Tags
| Thought | Reality |
|---|---|
| "MCP will find it faster" | Tags are exact. Reader API is correct tool. |
| "Semantic search is more flexible" | User already organized by tag. Respect that. |
| "I'll check NLM after" | NLM FIRST. This is the knowledge hierarchy. |
| "Let me verify it's there" | Check NLM, don't re-search everything. |
Decision Tree: Which Method to Use?
┌─────────────────────────────────────────────────────────────┐
│ 1. CHECK NLM FIRST (always) │
│ Is the content already in a NotebookLM notebook? │
│ → nlm list && nlm chat <id> "query" │
└─────────────────────────────────────────────────────────────┘
│
Not in NLM?
▼
┌─────────────────────────────────────────────────────────────┐
│ 2. Do you know the exact tag(s)? │
│ │
│ YES → Reader API (tag-based fetch) │
│ Fast, gets full documents, no semantic search needed │
│ → python3 skills/readwise/scripts/readwise_to_nlm.py \ │
│ --tag "tag" --notebook <id> │
│ │
│ NO → MCP (semantic search) - LAST RESORT │
│ Find highlights by meaning/keywords │
│ → mcp__readwise__search_readwise_highlights │
└─────────────────────────────────────────────────────────────┘
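The decision tree above can be read as a small routing function. The following is a minimal sketch, not part of the skill itself: the function name `choose_method` and its return labels are hypothetical, and it assumes the caller has already checked NotebookLM and knows whether an exact tag is available.

```python
def choose_method(in_notebooklm: bool, known_tags: list[str]) -> str:
    """Return the access path the decision tree selects (labels are illustrative)."""
    if in_notebooklm:
        return "nlm"         # step 1: content is already in a notebook
    if known_tags:
        return "reader_api"  # step 2, YES branch: exact tag-based fetch
    return "mcp"             # step 2, NO branch: semantic search, last resort


print(choose_method(False, ["proxy advisors"]))
```

The ordering of the checks is the point: NotebookLM is consulted before any Readwise access, and MCP semantic search is reached only when no tag is known.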
Quick Reference
| Need | Method | Command |
|---|---|---|
| Full docs by tag | Reader API | python3 skills/readwise/scripts/readwise_to_nlm.py --tag "proxy advisors" --notebook <id> |
| Semantic search | MCP | mcp__readwise__search_readwise_highlights |
| List tags/dry-run | Reader API | python3 skills/readwise/scripts/readwise_to_nlm.py --tag "proxy advisors" --notebook <id> --dry-run |
Two Data Sources
| Source | Tool | Use Case |
|---|---|---|
| Full Documents | Reader API (Python) | Known tags, need complete article text |
| Highlights | mcp__readwise__search_readwise_highlights | Semantic search for quotes, annotations |
MCP Tool
mcp__readwise__search_readwise_highlights
Parameters
| Parameter | Type | Description |
|---|---|---|
| vector_search_term | string | Semantic search query (required) |
| full_text_queries | array | Full-text filters on specific fields |
Full-Text Query Fields
- •document_author - Author name
- •document_title - Document/book title
- •highlight_note - User’s annotations
- •highlight_plaintext - The highlight text itself
- •highlight_tags - Tags applied to highlights
Example Search
{
  "vector_search_term": "proxy advisors ISS Glass Lewis shareholder voting",
  "full_text_queries": [
    {"field_name": "highlight_plaintext", "search_term": "proxy"}
  ]
}
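A librarian sub-agent that assembles these arguments programmatically could validate the field names before calling the tool. This is a rough sketch under stated assumptions: `build_search_args` and `ALLOWED_FIELDS` are hypothetical names, and the only facts taken from this document are the two parameter names and the five full-text fields.

```python
# The five full-text fields listed in this document.
ALLOWED_FIELDS = {
    "document_author",
    "document_title",
    "highlight_note",
    "highlight_plaintext",
    "highlight_tags",
}


def build_search_args(vector_search_term, filters=None):
    """Build the argument dict for the highlights search tool (hypothetical helper)."""
    if not vector_search_term:
        raise ValueError("vector_search_term is required")
    queries = []
    for field_name, search_term in (filters or {}).items():
        if field_name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown full-text field: {field_name}")
        queries.append({"field_name": field_name, "search_term": search_term})
    return {"vector_search_term": vector_search_term, "full_text_queries": queries}


args = build_search_args(
    "proxy advisors ISS Glass Lewis shareholder voting",
    {"highlight_plaintext": "proxy"},
)
print(args)
```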
Response Structure
Each result includes:
{
  "id": 123456789,
  "score": 0.025,
  "attributes": {
    "document_author": "Author Name",
    "document_category": "books|articles",
    "document_tags": ["tag1", "tag2"],
    "document_title": "Document Title",
    "highlight_note": "User’s annotation",
    "highlight_plaintext": "The highlighted text...",
    "highlight_tags": ["htag1"]
  }
}
Formatting for NotebookLM
When adding highlights to NotebookLM, format as markdown grouped by source:
# Readwise Highlights: [Topic]
Generated: [date]
Search: vector="[query]" + full_text="[filter]"

---

## From "[Document Title]" by [Author]
**Tags:** tag1, tag2

> "[Highlight text...]"

**Note:** [User’s annotation if present]

---

[Repeat for each source document]
Formatting Guidelines
- •Group by document - All highlights from same source together
- •Include metadata - Author, tags, category when available
- •Show user notes - Include highlight_note if present
- •Quote highlights - Use blockquotes for the actual text
- •Add provenance - Include search terms used at top
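The guidelines above could be implemented roughly as follows. This is a sketch, not the librarian's actual formatter: `format_highlights` is a hypothetical name, and it assumes each result has the `attributes` shape shown under Response Structure.

```python
def format_highlights(results, topic, query, generated):
    """Group search results by source document and render them as markdown."""
    groups = {}  # (title, author) -> list of attribute dicts, in first-seen order
    for result in results:
        attrs = result["attributes"]
        key = (
            attrs.get("document_title") or "Untitled",
            attrs.get("document_author") or "Unknown",
        )
        groups.setdefault(key, []).append(attrs)

    # Provenance header: topic, date, and the search term used.
    lines = [
        f"# Readwise Highlights: {topic}",
        f"Generated: {generated}",
        f'Search: vector="{query}"',
        "",
    ]
    for (title, author), items in groups.items():
        lines += ["---", "", f'## From "{title}" by {author}']
        tags = sorted({t for a in items for t in (a.get("document_tags") or [])})
        if tags:
            lines.append("**Tags:** " + ", ".join(tags))
        lines.append("")
        for a in items:
            lines.append(f'> "{a["highlight_plaintext"]}"')
            if a.get("highlight_note"):  # skip empty or missing notes
                lines.append("**Note:** " + a["highlight_note"])
            lines.append("")
    return "\n".join(lines)
```

The returned string can be piped to the nlm CLI as described in the next section. Multi-line highlights would need each line prefixed with `>`; the sketch leaves that out for brevity.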
Adding to NotebookLM
Pipe formatted markdown to nlm CLI:
cat <<'EOF' | /Users/vwh7mb/projects/nlm/nlm add <notebook-id> -
[formatted markdown]
EOF
NotebookLM will auto-generate a title from content. Optionally rename:
/Users/vwh7mb/projects/nlm/nlm rename-source <source-id> "Readwise: [Topic] Highlights"
Workflow Pattern
- •Search - Use semantic + full-text for best results
- •Filter - Consider score threshold (>0.01 is usually relevant)
- •Format - Group by document, include metadata
- •Add - Pipe to nlm via stdin
- •Verify - Check sources list in notebook
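The Filter step can be sketched as a simple threshold on the `score` field. The helper name `filter_relevant` is hypothetical, and the 0.01 default is only the rule of thumb quoted above, not a documented cutoff.

```python
def filter_relevant(results, min_score=0.01):
    """Keep results scoring above the threshold, highest score first."""
    kept = [r for r in results if r.get("score", 0) > min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


sample = [{"id": 1, "score": 0.005}, {"id": 2, "score": 0.025}, {"id": 3, "score": 0.02}]
print([r["id"] for r in filter_relevant(sample)])
```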
Tips
- •Combine vector search (semantic) with full-text (keyword) for precision
- •Results are ranked by relevance score
- •Large result sets (50+) can be split into themed sources
- •Include search terms in output for provenance tracking
- •User notes (highlight_note) often contain valuable context
Anti-Pattern: Never Fetch from Source URL
WRONG:
- •Search Readwise, find document
- •Try to fetch from original URL (fails - paywalled)
RIGHT:
- •Search Readwise, find document
- •Get full text FROM READWISE using Reader API with withHtmlContent=true
If a document is in Readwise, the full text is already there. Never go back to the source URL.
Reader API (Full Documents)
For fetching complete document content (not just highlights), use the Reader API.
Python Client
Located at: /Users/vwh7mb/projects/readwise-reader-tools/src/reader.py
from reader import ReaderClient

# Token from agenix
import os
token = open("/var/folders/01/wzs3mqmn3jx2b81f0dcq9w8h0000gq/T/agenix/readwise-token").read().strip()
client = ReaderClient(token)
Key Methods
List Documents
# All documents
docs = client.list_documents()

# Filter by location: new, later, shortlist, archive, feed
docs = client.list_documents(location="archive")

# Filter by category: article, email, rss, highlight, note, pdf, epub, tweet, video
docs = client.list_documents(category="article")
Get Full Document Content
# Get document with full HTML content
doc = client.get_document(document_id, with_html_content=True)

# Access the content
html = doc.get("html_content")  # Full HTML
title = doc.get("title")
author = doc.get("author")
source_url = doc.get("source_url")
Find Document by URL
doc = client.find_by_url("https://example.com/article")
Save New Document
result = client.add_article(
    url="https://example.com/article",
    html="<html>...</html>",  # Optional: for paywall bypass
    title="Article Title",
    author="Author Name",
    tags=["research", "topic"]
)
API Endpoints Reference
| Endpoint | Method | Purpose |
|---|---|---|
| /api/v3/list/ | GET | List/search documents |
| /api/v3/save/ | POST | Save new document |
| /api/v3/update/<id>/ | PATCH | Update document |
| /api/v3/delete/<id>/ | DELETE | Remove document |
List Parameters
| Parameter | Description |
|---|---|
| id | Filter by document ID |
| location | new, later, shortlist, archive, feed |
| category | article, email, rss, highlight, note, pdf, epub, tweet, video |
| updatedAfter | ISO datetime filter |
| withHtmlContent | Include full HTML (slower) |
| pageCursor | Pagination cursor |
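The list parameters combine with cursor-based pagination. Below is a minimal sketch of that loop, written so it can run without network access: `iter_documents` and `fetch_page` are hypothetical names, `fetch_page` stands in for a GET to /api/v3/list/ with the given query parameters, and the `results` key in the response is an assumption (this document only names `nextPageCursor`).

```python
from typing import Callable, Iterator


def iter_documents(fetch_page: Callable[[dict], dict], **filters) -> Iterator[dict]:
    """Yield every document across pages, following nextPageCursor until exhausted."""
    params = {k: v for k, v in filters.items() if v is not None}
    while True:
        page = fetch_page(params)  # one request per page
        yield from page.get("results", [])  # "results" key is assumed
        cursor = page.get("nextPageCursor")
        if not cursor:
            break
        params = {**params, "pageCursor": cursor}


# Usage with a fake two-page response in place of the real HTTP call.
pages = {
    None: {"results": [{"id": "a"}], "nextPageCursor": "c1"},
    "c1": {"results": [{"id": "b"}], "nextPageCursor": None},
}
docs = list(iter_documents(lambda p: pages[p.get("pageCursor")], location="archive"))
print([d["id"] for d in docs])
```

Injecting `fetch_page` keeps the pagination logic separate from authentication and rate limiting, which the real client would handle inside that callable.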
Workflow: Add Full Article to NotebookLM
- •Find the document
doc = client.find_by_url("https://example.com/article")
# or
doc = client.get_document(doc_id, with_html_content=True)
- •Extract and clean content
from html2text import html2text
markdown = html2text(doc["html_content"])
- •Add to NotebookLM
echo "$markdown" | /Users/vwh7mb/projects/nlm/nlm add <notebook-id> -
Rate Limits
- •Standard endpoints: 20 requests/minute
- •Save/Update: 50 requests/minute
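One way to stay under these limits is to space requests evenly. The following is a sketch only: the `Throttle` class is hypothetical and not part of the Reader client described here, and the clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between calls, e.g. 3 s for 20 requests/minute."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self):
        """Block until the next request is allowed, then reserve the following slot."""
        now = self._clock()
        if now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


list_throttle = Throttle(per_minute=20)  # standard endpoints
save_throttle = Throttle(per_minute=50)  # save/update endpoints
```

Calling `list_throttle.wait()` before each list request keeps a long paginated fetch within the standard limit.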
Authentication
Token stored via agenix at:
/var/folders/01/wzs3mqmn3jx2b81f0dcq9w8h0000gq/T/agenix/readwise-token
Same token works for both Reader API and Highlights MCP.
API Reference
See skills/readwise/references/reader-api.md for full Reader API documentation.
Batch Add by Tag (Recommended)
For adding multiple documents by tag to NotebookLM, use the reusable script.
Script Location
skills/readwise/scripts/readwise_to_nlm.py
Usage
# Dry run - see what would be added
python3 skills/readwise/scripts/readwise_to_nlm.py \
  --tag "proxy advisors" \
  --notebook 1457a61e-02ff-4ef0-a0de-deeb0e931972 \
  --dry-run

# Add all documents with tag to notebook
python3 skills/readwise/scripts/readwise_to_nlm.py \
  --tag "proxy advisors" \
  --notebook 1457a61e-02ff-4ef0-a0de-deeb0e931972

# Verbose output
python3 skills/readwise/scripts/readwise_to_nlm.py \
  --tag "Corps" \
  --notebook abc123 \
  --verbose
Requirements
pip install requests html2text
How It Works
- •Fetches all documents from Readwise with withHtmlContent=true
- •Filters by tag (case-insensitive match on tags field)
- •Converts HTML to Markdown using html2text
- •Pipes each document to nlm add <notebook-id> -
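The last two steps above might look roughly like this. It is a sketch and not the contents of readwise_to_nlm.py: both function names are hypothetical, the `nlm` executable is assumed to be on PATH (the full path appears elsewhere in this document), and the converter and runner are injectable so the logic can be exercised without html2text or nlm installed.

```python
import subprocess


def document_to_markdown(doc, convert=None):
    """Convert a document's HTML to markdown; return None if it has no HTML."""
    html = doc.get("html_content")
    if not html:
        return None  # some documents (PDFs, certain imports) lack HTML
    if convert is None:
        # Third-party dependency: pip install html2text
        from html2text import html2text as convert
    title = doc.get("title") or "Untitled"
    return f"# {title}\n\n{convert(html)}"


def add_markdown_to_notebook(markdown, notebook_id, nlm="nlm", run=subprocess.run):
    """Pipe markdown to `nlm add <notebook-id> -` via stdin."""
    return run([nlm, "add", notebook_id, "-"], input=markdown, text=True, check=True)
```

Passing `check=True` makes a failed `nlm` call raise instead of silently skipping a document, which matters in a batch run.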
Via Opencode (Librarian Pattern)
For ad-hoc requests, delegate to opencode:
# Simple task (list, single add)
opencode run -m github-copilot/gpt-5-mini \
  "Add documents tagged 'Corps' to notebook abc123"

# Long context (many docs, research)
opencode run -m google/antigravity-gemini-3-flash \
  "Search readwise for highlights on activism and add to notebook xyz"
Note: Opencode tasks need ~15 min timeout for multi-document workflows.
Learnings & Tips
Tag Filtering
- •Tags are stored in tags field as list of strings
- •Filter with: tag.lower() in [t.lower() for t in doc.get("tags", [])]
- •Common tags: document categories, topics, project names
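The filter expression above can be wrapped in a small helper that also tolerates a missing or null tags field. This is a sketch with hypothetical function names. It iterates over the field rather than indexing it, so it should also work if the API returns tags as a mapping keyed by tag name instead of a list, which is an assumption worth checking against real responses.

```python
def has_tag(doc, tag):
    """Case-insensitive exact match of `tag` against the document's tags."""
    wanted = tag.strip().lower()
    tags = doc.get("tags") or []  # covers a missing key and an explicit None
    return any(str(t).lower() == wanted for t in tags)


def filter_by_tag(docs, tag):
    """Return only the documents carrying the given tag."""
    return [d for d in docs if has_tag(d, tag)]


docs = [
    {"id": 1, "tags": ["Proxy Advisors", "research"]},
    {"id": 2, "tags": None},
    {"id": 3, "tags": {"proxy advisors": {}}},
]
print([d["id"] for d in filter_by_tag(docs, "proxy advisors")])
```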
HTML Content
- •Use withHtmlContent=true parameter to get full text
- •Fetching HTML is slower - only request when needed
- •Some documents may lack HTML (PDFs, certain imports)
Dependencies
- •html2text - converts HTML to clean Markdown
- •requests - HTTP client for API calls
- •Both available via pip
Practical Notes
- •Rate limit: 20 req/min for list, 50 req/min for save/update
- •Pagination: use nextPageCursor from response
- •Large batches: script handles pagination automatically
- •Timeouts: allow 60s per request, 15min for full workflows