llms.txt
A Markdown file at /llms.txt that gives agents a curated, human-edited map of your site's most important content — distinct from a comprehensive sitemap.xml.
On this page
- What is llms.txt and how does it work?
- Why does llms.txt matter for AI agent discoverability?
- Is llms.txt required, recommended, or optional for my site?
- What the llms.txt standard says
- What good llms.txt implementation looks like
- How do I add llms.txt to my site?
- How can I test my llms.txt file?
- Frequently asked questions
- Does llms.txt replace sitemap.xml or robots.txt?
- Do I need llms.txt if my site is already in ChatGPT's training data?
- How does llms.txt help e-commerce sites get cited by AI shopping agents?
- Should SaaS companies include pricing pages in llms.txt?
- Can I auto-generate llms.txt from my sitemap or docs framework?
- How is llms.txt different from schema.org structured data?
- Does llms.txt work with Next.js or Vercel deployments?
- What should developer documentation sites prioritize in llms.txt?
What is llms.txt and how does it work?
llms.txt is a Markdown file served at /llms.txt on your domain's root that provides AI agents with a human-curated map of your site's most important content. Unlike sitemap.xml, which exhaustively lists every page, llms.txt is editorial: you choose what matters, write a short description for each resource, and organize it under topical headings. The file follows a simple, readable structure—a top-level title, an optional summary blockquote, and sections with bulleted links and descriptions.
The format was proposed by Jeremy Howard as a lightweight alternative to forcing agents to crawl entire sitemaps or scrape navigation chrome. Agents such as ChatGPT, Claude, and Perplexity can use llms.txt to understand your site's scope quickly and retrieve the right content for citations, context injection, or agentic workflows, without wrestling with JavaScript bundles, authentication walls, or layout noise.
Why does llms.txt matter for AI agent discoverability?
When an agent needs to answer a question about your product, pull specs for code generation, or cite your documentation in a research synthesis, it has three options: scrape your homepage (bad signal-to-noise ratio), parse your sitemap (hundreds of URLs, no hierarchy), or read your llms.txt (curated, scoped, already in Markdown). The third option wins. Tools like Cursor, Windsurf, and Cline can use llms.txt to populate project context, ChatGPT's web browsing can draw on it for citation extraction, and emerging agent frameworks treat it as a first-class discovery mechanism. If your llms.txt is missing or malformed, agents fall back to heuristics: scraping nav menus, following pagination links, or simply giving up and citing a competitor.
This can translate into measurable business outcomes. Better llms.txt coverage can improve your citation rate in LLM-generated answers, especially for high-intent queries ("best CRM for small teams," "how to configure OAuth in X platform"). For developer tools, it directly affects agent installability: can an AI coding assistant find your API reference and quickstart without asking the user to paste docs manually? For e-commerce and SaaS, it smooths agentic commerce flows where a purchasing agent needs to compare feature matrices or find integration guides. Anecdotally, companies with well-maintained llms.txt files report higher referral traffic from Perplexity and ChatGPT's citations panel.
Is llms.txt required, recommended, or optional for my site?
We rate llms.txt as recommended for most sites—especially if you publish technical documentation, API references, product comparisons, or long-form educational content. It's a small investment (often under an hour to draft and deploy) with asymmetric upside. The main exceptions: single-page apps with no deep content, sites behind authentication that don't want public indexing, and early-stage projects with fewer than five substantive pages. If you're already maintaining a robots.txt and sitemap.xml, adding llms.txt is the logical next step. If you operate a SaaS product, developer platform, or content library that wants to be cited by agents, this should be non-negotiable.
What the llms.txt standard says
The llms.txt spec is informal but increasingly adopted. It defines:
- Location: Must be served at https://yourdomain.com/llms.txt (root level, not /docs/llms.txt).
- Format: Plain Markdown. Start with a single # Title, optionally followed by a > Summary blockquote.
- Structure: Organize content under ## Section Headings, each containing a bulleted list of entries in the form - [Link text](URL): Brief description.
- Optional fields: You can add subsections (###), but keep nesting shallow. Agents parse H2 sections most reliably.
A minimal valid example:
# Acme API Documentation
> Acme provides REST APIs for payment processing, webhooks, and analytics.
## Getting Started
- [Quickstart Guide](https://docs.acme.com/quickstart): 5-minute setup with code examples in Python, Node, and Ruby.
- [Authentication](https://docs.acme.com/auth): How to generate and rotate API keys.
## API Reference
- [Payments API](https://docs.acme.com/api/payments): Create charges, refunds, and subscriptions.
- [Webhooks](https://docs.acme.com/api/webhooks): Real-time event notifications for payment status changes.
## Guides
- [Testing in Sandbox](https://docs.acme.com/guides/sandbox): Use test credentials to simulate transactions.
Proposed by Jeremy Howard (co-founder of fast.ai), the format is not a W3C standard but has organic adoption across developer tools, LLM vendors, and frameworks. Cloudflare, Anthropic, and OpenAI representatives have publicly acknowledged reading it.
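To make the structure above concrete, here is a minimal parsing sketch in TypeScript. It assumes only the conventions described in this section (one # title, an optional > summary, ## sections, and - [Link text](URL): description bullets). The type names and the parseLlmsTxt function are illustrative, not part of any official tooling, and real agents may parse the file differently.

```typescript
// Hypothetical types for illustration; not taken from any published library.
interface LlmsLink {
  title: string;
  url: string;
  description: string;
}

interface LlmsSection {
  heading: string;
  links: LlmsLink[];
}

interface LlmsDoc {
  title: string;
  summary: string;
  sections: LlmsSection[];
}

// Parse the subset of Markdown described above. Lines that match none of the
// four patterns (including ### subsections) are ignored.
function parseLlmsTxt(text: string): LlmsDoc {
  const doc: LlmsDoc = { title: "", summary: "", sections: [] };
  let current: LlmsSection | null = null;

  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();

    const h1 = line.match(/^#\s+(.+)$/);
    if (h1) {
      // Keep only the first H1 as the title.
      if (doc.title === "") doc.title = h1[1].trim();
      continue;
    }

    const quote = line.match(/^>\s+(.+)$/);
    if (quote) {
      // Keep only the first blockquote line as the summary.
      if (doc.summary === "") doc.summary = quote[1].trim();
      continue;
    }

    const h2 = line.match(/^##\s+(.+)$/);
    if (h2) {
      current = { heading: h2[1].trim(), links: [] };
      doc.sections.push(current);
      continue;
    }

    // Bullets only count once a section has started; the description is optional.
    const link = line.match(/^-\s+\[([^\]]+)\]\(([^)\s]+)\)(?::\s*(.*))?$/);
    if (link && current) {
      current.links.push({
        title: link[1],
        url: link[2],
        description: (link[3] || "").trim(),
      });
    }
  }
  return doc;
}
```

Run against the Acme example, this sketch would yield one title, one summary, and three sections holding two, two, and one links. Entries that do not match the bullet pattern are silently skipped, which is one reason to keep every entry in the same shape.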
What good llms.txt implementation looks like
The llmstxt.org site itself publishes a reference example. Stripe and Linear have been cited in community discussions as early adopters, though their exact files evolve. A typical production llms.txt for a SaaS product might look like:
# Beam Analytics
> Privacy-focused web analytics with GDPR compliance and real-time dashboards.
## Product
- [Features](https://beam.app/features): Event tracking, funnels, and cohort analysis.
- [Pricing](https://beam.app/pricing): Free tier and paid plans for growing teams.
## Documentation
- [Installation](https://docs.beam.app/install): Add the script tag or NPM package.
- [API Reference](https://docs.beam.app/api): Query events, export data via REST.
## Guides
- [GDPR Compliance](https://docs.beam.app/gdpr): How we handle user consent and data deletion.
- [Custom Events](https://docs.beam.app/custom-events): Track checkout flows and feature adoption.
Companies operating open documentation sites (PostHog, Supabase) are natural candidates; check their /llms.txt endpoints for live examples.
How do I add llms.txt to my site?
- Draft the file. Open a text editor and write a Markdown outline. Start with your homepage title, add a one-sentence summary, then list 5–15 of your most important pages grouped by category (Getting Started, API, Guides, Blog). Each bullet needs a title, URL, and 10–20 word description.
- Place it at the root. Save as llms.txt and deploy it to https://yourdomain.com/llms.txt.
  - Static sites (Hugo, Jekyll, Eleventy): Drop llms.txt in your static/ or public/ folder.
  - Next.js: Place it in public/llms.txt; Next serves public/* at root automatically.
  - Cloudflare Pages / Vercel: Add to the root of your output directory.
  - Server-rendered: Add a route handler that returns text/plain; charset=utf-8 with your Markdown content.
- Set the MIME type. Serve it as Content-Type: text/plain; charset=utf-8. Most static hosts do this by default for .txt.
- Test and iterate. Check that the URL loads in a browser, then monitor analytics for /llms.txt traffic. Update quarterly or whenever you publish major new sections.
Example route handler in Next.js (App Router):
// app/llms.txt/route.ts
export async function GET() {
  const content = `# MyApp Docs
> API platform for real-time notifications.
## Getting Started
- [Quickstart](https://myapp.dev/quickstart): Send your first push in 5 minutes.
`;
  return new Response(content, {
    headers: { 'Content-Type': 'text/plain; charset=utf-8' },
  });
}
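Hand-editing a Markdown string inside a handler gets error-prone as the list grows. One possible approach, sketched here, is to keep the outline as typed data and render it to the llms.txt layout. The PageEntry, PageGroup, and SiteOutline types and the renderLlmsTxt function are hypothetical names introduced for illustration; the output format follows the structure described earlier on this page.

```typescript
// Hypothetical data model for illustration; adapt the fields to your own site.
interface PageEntry {
  title: string;
  url: string;
  description: string;
}

interface PageGroup {
  heading: string;
  pages: PageEntry[];
}

interface SiteOutline {
  title: string;
  summary?: string;
  groups: PageGroup[];
}

// Render the outline as llms.txt Markdown: H1 title, optional blockquote
// summary, then one H2 section per group with one bullet per page.
function renderLlmsTxt(site: SiteOutline): string {
  const lines: string[] = [`# ${site.title}`, ""];

  if (site.summary) {
    lines.push(`> ${site.summary}`, "");
  }

  for (const group of site.groups) {
    lines.push(`## ${group.heading}`, "");
    for (const page of group.pages) {
      lines.push(`- [${page.title}](${page.url}): ${page.description}`);
    }
    lines.push("");
  }

  // Collapse trailing blank lines into a single final newline.
  return lines.join("\n").replace(/\s+$/, "") + "\n";
}
```

The returned string could be passed to the Response in the route handler above, or written to public/llms.txt by a build script. Keeping the outline as data also makes it easier to review changes to titles and descriptions in code review.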
How can I test my llms.txt file?
curl -I https://yourdomain.com/llms.txt
Verify you get a 200 OK with Content-Type: text/plain. Then fetch the body and confirm it's valid Markdown with a # title and at least one ## section.
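The same checks can be scripted so they run in CI. The sketch below is a pure function that takes an already-fetched status, Content-Type header, and body, and returns a list of problems. The FetchedFile type and checkLlmsTxtResponse name are illustrative assumptions, and the rules encode only the checks described above (200 status, text/plain, a # title, at least one ## section).

```typescript
// Hypothetical shape for a fetched response; fill it from whatever HTTP client you use.
interface FetchedFile {
  status: number;
  contentType: string;
  body: string;
}

// Return a list of human-readable problems; an empty list means all checks passed.
function checkLlmsTxtResponse(res: FetchedFile): string[] {
  const problems: string[] = [];

  if (res.status !== 200) {
    problems.push(`expected status 200, got ${res.status}`);
  }

  // Accept "text/plain" with or without parameters such as charset.
  if (!/^text\/plain\b/i.test(res.contentType.trim())) {
    problems.push(`expected Content-Type text/plain, got "${res.contentType}"`);
  }

  const lines = res.body
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  // The first non-empty line should be the single "# Title" heading.
  if (lines.length === 0 || !/^#\s+\S/.test(lines[0])) {
    problems.push("first non-empty line should be a '# Title' heading");
  }

  // At least one "## Section" heading is expected.
  if (!lines.some((line) => /^##\s+\S/.test(line))) {
    problems.push("no '## Section' heading found");
  }

  return problems;
}
```

In a runtime with a global fetch, the input could be built from the response's status, its Content-Type header, and the result of reading the body as text. This sketch does not validate link syntax or check that the listed URLs resolve.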
Frequently asked questions
Does llms.txt replace sitemap.xml or robots.txt?
No. llms.txt is complementary, not a replacement. sitemap.xml exhaustively lists URLs for crawlers; robots.txt controls crawl permissions. llms.txt provides curated, editorial context specifically for AI agents—telling them which pages matter most and why. Maintain all three for optimal agent and search engine coverage.
Do I need llms.txt if my site is already in ChatGPT's training data?
Yes. Training data is static and outdated. llms.txt helps real-time agent retrieval—when ChatGPT, Claude, or Perplexity browse your site during a conversation to answer current questions, cite sources, or pull fresh API docs. It's about live discoverability, not pre-training.
How does llms.txt help e-commerce sites get cited by AI shopping agents?
AI shopping agents need structured product data, comparison guides, and return policies to recommend your store. A well-organized llms.txt surfaces category pages, shipping FAQs, and size guides—reducing the agent's need to scrape or guess. This increases citation in answers like 'best ergonomic chairs under $500.'
Should SaaS companies include pricing pages in llms.txt?
Absolutely. Pricing, feature comparison tables, and integration guides answer the high-intent queries that purchasing agents and users comparing tools actually ask. Include direct links to /pricing, tier breakdowns, and any ROI calculators. This improves your visibility in 'best X for Y' agent-generated recommendations.
Can I auto-generate llms.txt from my sitemap or docs framework?
Yes, but curate it. Tools exist to seed llms.txt from sitemaps or Docusaurus/Nextra navigation, but the value is editorial prioritization—choosing the 15-30 most important pages, not dumping 500 URLs. Start automated, then hand-edit to highlight quickstarts, key concepts, and integration guides.
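As a rough illustration of the "start automated, then hand-edit" workflow, the sketch below pulls URLs out of a sitemap and keeps only those under path prefixes you choose, in priority order, up to a cap. The function names, the prefix list, and the regex-based extraction are assumptions made for this example; a production script would more likely use a real XML parser and your docs framework's navigation data.

```typescript
// Extract <loc> values from sitemap XML with a simple regex.
// Assumption: URLs contain no XML entities such as &amp; (they are not decoded here).
function extractSitemapUrls(xml: string): string[] {
  const urls: string[] = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Keep only URLs whose path starts with one of the priority prefixes,
// ordered by prefix priority (earlier prefix first), then by sitemap order.
function seedCandidates(urls: string[], priorityPrefixes: string[], limit: number): string[] {
  const ranked = urls
    .map((url, index) => {
      // Strip scheme and host to get the path.
      const path = url.replace(/^https?:\/\/[^/]+/, "");
      let rank = -1;
      for (let i = 0; i < priorityPrefixes.length; i++) {
        if (path.indexOf(priorityPrefixes[i]) === 0) {
          rank = i;
          break;
        }
      }
      return { url, rank, index };
    })
    .filter((entry) => entry.rank >= 0)
    .sort((a, b) => a.rank - b.rank || a.index - b.index);

  return ranked.slice(0, limit).map((entry) => entry.url);
}
```

The output is only a candidate list of URLs. Titles, descriptions, and the final grouping still need a human pass, which is where the editorial value of the file comes from.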
How is llms.txt different from schema.org structured data?
schema.org markup (JSON-LD) is embedded in HTML for search engines and focuses on structured entities—products, reviews, events. llms.txt is a standalone Markdown file for AI agents, emphasizing human-readable navigation and descriptions. Use both: schema for Google, llms.txt for ChatGPT and coding assistants.
Does llms.txt work with Next.js or Vercel deployments?
Yes. Create a public/llms.txt file in your Next.js project; it will be served at yourdomain.com/llms.txt automatically. For dynamic generation, use a route handler at app/llms.txt/route.ts that returns Markdown with Content-Type: text/plain; charset=utf-8, matching the setup steps above. Deploy normally; no special Vercel config needed.
What should developer documentation sites prioritize in llms.txt?
Quickstart guides, authentication setup, API reference index, SDK installation pages, and common use-case tutorials. AI coding assistants like Cursor and Cline parse llms.txt to inject context into code generation. Prioritize pages that answer 'how do I get started?' and 'how do I do X?'