llms.txt Generator
Generate, validate, and optimize llms.txt files -- the Markdown-formatted standard that gives LLMs a curated, context-window-friendly overview of a website's most important content.
Before starting, consult references/llmstxt-guide.xml -- a comprehensive playbook covering the official specification, five implementation patterns, companion file conventions, best practices, a validation checklist, and the full tooling ecosystem. Reference it throughout the workflow.
Workflow Overview
Discovery --> Strategy --> Generate llms.txt --> Companion Files --> Validate --> Deliver
^ | |
+--------- Refine ---------+ |
+-------------- Fix issues ------------+
Follow steps 1-6 in order. Steps may loop if the user wants revisions or validation surfaces issues.
Step 1: Discovery
Gather context about the user's site or project. Ask about:
- •Site type -- Documentation site, SaaS product, e-commerce, open-source library, personal site, or multi-product organization?
- •Content structure -- What are the key pages and content areas? (docs, API references, tutorials, guides, blog, pricing)
- •Audience -- Developers, end-users, or mixed? Who will be feeding this to an LLM?
- •Technical stack -- What framework or CMS? Do pages have
.mdversions available? - •Goal -- AI coding assistant context (Cursor, Windsurf), generative engine presence, future-proofing, or all of the above?
- •Existing file -- Do they already have an llms.txt they want audited or improved?
Keep discovery concise: 3-5 questions max in one message. If the user provides a URL, explore the site structure to identify key content areas.
If the user already has an llms.txt file and wants an audit, skip to Step 5 (Validate).
Step 2: Strategy
Based on discovery, recommend an implementation pattern. See <implementation_patterns> in llmstxt-guide.xml for full details and a decision tree:
| Pattern | Best for | Example |
|---|---|---|
| Minimal index | Small sites, modular docs | Supabase (~100 tokens, links to per-topic files) |
| Comprehensive directory | Large multi-product sites | Vercel, Cloudflare (deep nesting, extensive links) |
| Dual-file | Medium-to-large docs sites | Anthropic, Stripe (llms.txt index + llms-full.txt) |
| Product-segmented | Multi-product orgs | Cloudflare (per-product llms-full.txt files) |
| Multi-site | Orgs with distinct web properties | Stripe (separate files for marketing vs docs) |
Recommend companion files based on context (see <companion_files> in llmstxt-guide.xml):
- •llms-full.txt -- Recommended for most sites. Contains all documentation content in a single Markdown file. AI agents visit llms-full.txt over 2x more frequently than llms.txt.
- •llms-small.txt -- Useful when targeting constrained context windows. Contains only page structure and titles.
- •Per-topic files -- Useful for modular documentation (e.g.,
/llms/guides.txt,/llms/api.txt).
Present the recommendation with rationale. Confirm with the user before proceeding.
Step 3: Generate llms.txt
Build the llms.txt file following the specification strictly (see <specification> in llmstxt-guide.xml for the authoritative rules and canonical example). The file has four sections in this exact order:
Section 1: H1 Header (required)
# Project Name
The only required element. Use the project or site name.
Section 2: Blockquote Summary
> A concise summary that gives an LLM immediate context about what this > project or site is, what it does, and why someone would use it.
Include key information needed to understand the rest of the file. This is the single most important piece of context for an LLM.
Section 3: Freeform Context (optional)
Plain Markdown paragraphs or lists (no headings) providing additional project context -- important caveats, compatibility notes, or constraints.
Section 4: H2 Sections with Link Lists
Organize curated links under H2 headings. Each list item must have a hyperlink, optionally followed by a colon and descriptive notes:
## Docs - [Quickstart Guide](https://example.com/docs/quickstart.html.md): A 5-minute tutorial for getting your first project running with authentication and basic CRUD - [API Reference](https://example.com/docs/api.html.md): Complete REST API documentation covering all endpoints, request formats, and error codes
Apply these best practices when generating (see <best_practices> in llmstxt-guide.xml for the full rules):
- •Be a curator, not a collector -- List the "greatest hits," not every page. This is not a sitemap.
- •Write descriptive link text --
[Quickstart Guide](url): A 5-minute tutorial...not[Docs](url). - •Keep descriptions to 10-20 words -- Factual and value-oriented.
- •Link to .md versions where possible -- Markdown is up to 10x more token-efficient than HTML.
- •Most important sections first -- Core docs, API reference, getting started before blog posts.
- •Use the Optional section strategically -- Place secondary content that can be dropped when context is tight.
## Optional - [Changelog](https://example.com/changelog.html.md): Release history with breaking changes and migration notes - [Advanced Configuration](https://example.com/docs/advanced.html.md): Deep-dive into environment variables and custom plugins
Content to include: Core documentation, API references, getting-started guides, FAQs, key tutorials, pricing pages, key blog posts.
Content to exclude: Login pages, admin panels, user-generated content, gated/premium content, duplicate pages.
Step 4: Companion Files
Based on the strategy from Step 2, generate companion files as needed.
llms-full.txt
Compile referenced documentation content into a single Markdown file. Structure it with clear H1/H2 headers mirroring the llms.txt sections. Each section contains the full rendered content of the linked pages.
If the user cannot provide full page content, generate a structured skeleton with clear placeholders:
# Project Name -- Full Documentation ## Quickstart Guide [Paste or generate full quickstart content here] ## API Reference [Paste or generate full API reference content here]
Advise the user on token budget -- note that Anthropic's llms-full.txt is ~481K tokens and Cloudflare's is ~3.7M tokens. Most sites should aim for the range their target LLM context window can handle.
llms-small.txt
Generate an ultra-compact version with only the structural outline and page titles:
# Project Name > One-sentence summary. ## Docs - Quickstart Guide - API Reference - Authentication ## Examples - Todo App Tutorial - E-commerce Integration
Per-topic files
If the segmented pattern was chosen, generate separate files for each product or topic area (e.g., /llms/guides.txt, /llms/js-sdk.txt, /llms/python-sdk.txt).
Step 5: Validate
Audit the generated file (or an existing user-provided file) against the specification and best practices. See <validation_checklist> in llmstxt-guide.xml for the complete checklist. Check each item and report findings.
Specification compliance
- • H1 header with project/site name is present (only required element)
- • Blockquote summary follows the H1
- • Freeform context (if any) uses paragraphs/lists only, no headings
- • H2 sections contain properly formatted link lists
- • Links use the format
[name](url)with optional: description - • Section order follows the specification (H1, blockquote, freeform, H2 sections)
- •
## Optionalsection used correctly (only for skippable content) - • Valid Markdown syntax throughout
Content quality
- • Summary blockquote provides enough context for an LLM to understand the project
- • Curated selection, not an exhaustive sitemap dump
- • Link descriptions are 10-20 words, factual, and value-oriented
- • Links point to .md versions where available
- • Most important content appears in the first sections
- • No login pages, admin panels, or gated content
- • Reasonable total size (estimate token count; flag if unusually large or small)
The paste test
Recommend the user validate by pasting the file contents into ChatGPT or Claude (with web search disabled) and asking questions about their product. If the model gives poor answers, the file needs revision.
Present validation results as a checklist with pass/fail status and specific recommendations for any issues found.
Step 6: Deliver
Present the final file(s) and provide implementation guidance.
Implementation instructions
- •Host at domain root:
example.com/llms.txt - •Serve with MIME type
text/plain - •Place companion files alongside:
example.com/llms-full.txt,example.com/llms-small.txt - •For subpath documentation:
example.com/docs/llms.txtis also valid
Framework-specific guidance
Recommend relevant tooling based on the user's stack (see <tooling_ecosystem> in llmstxt-guide.xml for the full catalog):
| Stack | Tooling |
|---|---|
| Mintlify | Auto-generated, zero config needed |
| GitBook | Auto-generated for all published docs |
| VitePress | vitepress-plugin-llms |
| Docusaurus | docusaurus-plugin-llms |
| Astro | @4hse/astro-llms-txt |
| Next.js | next-llms-txt or manual placement in public/ |
| Hugo | Place in /static/llms.txt |
| Jekyll | Place in root directory |
| WordPress | Yoast SEO (built-in), AIOSEO, or "Website LLMs.txt" plugin |
| Manual | Place file at web server root or configure routing |
Maintenance checklist
Advise the user to update their llms.txt when they:
- •Add, remove, or significantly restructure documentation pages
- •Launch new product areas or features with documentation
- •Change URLs or site architecture
- •Add new SDK/language support
Next steps summary
Present a numbered list of concrete actions:
- •Deploy the generated file(s) to the site root
- •Verify the file is accessible at
example.com/llms.txt - •Run the paste test (Step 5) to validate quality
- •Set up
.mdpage versions if not already available - •Add to the llms-txt-hub directory on GitHub for discoverability
- •Schedule periodic reviews (quarterly or on major doc changes)
Examples
Example 1: Open-source library
User says: "Create an llms.txt for my Python image processing library called PixelForge"
Actions:
- •Ask about documentation structure, key pages, and whether docs have .md versions
- •Recommend the dual-file pattern (llms.txt index + llms-full.txt)
- •Generate llms.txt with sections for Docs, API Reference, Examples, and Optional
- •Generate llms-full.txt skeleton with content placeholders
- •Validate both files against the specification
- •Deliver with instructions for placing in the docs site root
Result:
# PixelForge > PixelForge is a Python library for high-performance image processing, offering > GPU-accelerated filters, batch transformations, and a composable pipeline API. - PixelForge requires Python 3.10+ and supports CUDA 12.x for GPU acceleration - The pipeline API is inspired by scikit-image but is not API-compatible ## Docs - [Quickstart](https://pixelforge.dev/docs/quickstart.html.md): Install PixelForge and run your first image pipeline in under 5 minutes - [Pipeline API Guide](https://pixelforge.dev/docs/pipelines.html.md): Composable transformation chains with lazy evaluation and automatic GPU offloading - [API Reference](https://pixelforge.dev/docs/api.html.md): Complete reference for all modules, classes, and functions ## Examples - [Batch Processing](https://pixelforge.dev/examples/batch.html.md): Process 10,000 images with parallel pipelines and progress tracking - [Custom Filters](https://pixelforge.dev/examples/filters.html.md): Build reusable filter functions with the kernel API ## Optional - [Changelog](https://pixelforge.dev/changelog.html.md): Release history with migration guides for breaking changes - [Benchmark Suite](https://pixelforge.dev/docs/benchmarks.html.md): Performance comparisons against Pillow, OpenCV, and scikit-image
Example 2: Auditing an existing llms.txt
User says: "Can you audit my llms.txt file? Here it is: [pastes file]"
Actions:
- •Skip discovery and strategy (Steps 1-2)
- •Skip generation (Steps 3-4)
- •Run full validation (Step 5) against specification and best practices
- •Present checklist with pass/fail for each item
- •Provide specific rewrite suggestions for any issues
Result: A validation report with findings like:
- •PASS: H1 header present
- •PASS: Blockquote summary present
- •FAIL: Link descriptions are too generic ("API docs" -- add 10-20 word descriptions)
- •FAIL: Links point to HTML pages, not .md versions
- •WARN: No
## Optionalsection -- consider moving secondary content there - •RECOMMENDATION: Add llms-full.txt companion file for complete content delivery
Example 3: Multi-product organization
User says: "I need llms.txt for my company -- we have 4 products: an API gateway, a CDN, a database, and a serverless platform"
Actions:
- •Discovery: understand each product's documentation structure and target audience
- •Recommend the product-segmented pattern with a root llms.txt index
- •Generate root llms.txt linking to per-product documentation
- •Generate per-product llms-full.txt files (e.g.,
/api-gateway/llms-full.txt) - •Generate llms-small.txt for the constrained context window use case
- •Validate all files and deliver with deployment instructions
Result: A root llms.txt organized by product with H2 sections for each, plus four per-product llms-full.txt companion files and one llms-small.txt for the top-level overview.
Troubleshooting
Issue: File is too large for useful context
Cause: Including too many links turns llms.txt into a sitemap rather than a curated index.
Solution: Ruthlessly curate. Move secondary content to ## Optional. Consider the minimal index pattern with per-topic companion files. The llms.txt file itself should be a "greatest hits" -- aim for tens of links, not hundreds.
Issue: Links point to HTML instead of Markdown
Cause: The site does not serve .md versions of pages.
Solution: Check if the framework supports .md URL variants (many documentation platforms do). If not, advise the user to set up Markdown page versions -- the specification recommends appending .md to existing URLs (e.g., example.com/docs/guide.html.md). As a fallback, HTML links still work but are less token-efficient.
Issue: Missing or weak blockquote summary
Cause: The summary is omitted or too brief to give an LLM useful context. Solution: Write a 2-4 sentence blockquote that answers: What is this project? What does it do? Who is it for? What key constraints should an LLM know? This is the single most impactful element for LLM comprehension.
Issue: Generic link descriptions
Cause: Descriptions like "API docs" or "Guide" do not help an LLM decide which link to follow.
Solution: Rewrite each description to be 10-20 words, factual, and value-oriented. Good: Complete REST API reference covering authentication, rate limiting, and all 47 endpoints. Bad: API documentation.
Issue: Using llms.txt as a sitemap replacement
Cause: Misunderstanding the standard's purpose -- llms.txt is for curation, not exhaustive indexing. Solution: llms.txt coexists with sitemap.xml. The sitemap lists every indexable page; llms.txt lists only the pages most useful for LLM consumption. Remove pages that are not useful for AI context (login, admin, legal boilerplate, duplicate content).