There's a quiet split happening in how websites get found.

On one side, you have the world of robots.txt, the file that's been telling search engines what to crawl since 1994. Every serious website has one. It's an SEO basic.

On the other side, you have llms.txt, a file most site owners have never heard of, even though the tools it serves (ChatGPT, Claude, Perplexity, Gemini, Mistral) are eating their search traffic. It was proposed in 2024, formalized in 2025, and is quietly becoming the standard way AI models understand what a website is actually about.

If you've never made one for your site, this is the post.

What is llms.txt?

llms.txt is a plain text file at the root of your domain (yoursite.com/llms.txt) that tells AI tools what your business is, what you sell, where to find your most important content, and how you'd like to be summarized.

Think of it as a structured business card for AI models. Not a robots.txt replacement (robots.txt controls crawler access; llms.txt controls AI understanding). Not a sitemap (sitemaps tell crawlers what URLs exist; llms.txt tells AI what those URLs are for). It's its own thing.

The format was proposed by Jeremy Howard in September 2024 and within months had been adopted by AI labs, documentation sites, and SaaS companies looking to be cited accurately by ChatGPT and Claude. It's not yet a W3C standard. It is rapidly becoming a de facto one.

Why does it matter?

Three reasons.

First, AI models work differently from Google. Google indexes pages and ranks them against a query. AI models synthesize an answer from training data plus real-time web search plus internal heuristics. When an AI model is composing an answer about a topic in your industry, the existence of a clear, well-structured llms.txt makes you easier to summarize and easier to cite correctly.

Second, it reduces hallucinations about your brand. When ChatGPT mentions a company, the description it gives is synthesized from whatever the model knows. If your website is hard to parse, your value proposition is buried in marketing copy, and your competitor's site has cleaner content, the AI will lean on the cleaner source. An llms.txt is your chance to say in plain language: here is what we do, here is what we sell, here is how we're different.

Third, the adoption curve is early. Stripe has one. Anthropic has one. A handful of AI-native SaaS companies have one. The vast majority of SMBs and even mid-market brands don't. Adding one in 2026 is a small move. Adding one in 2028, after every competitor has one, will be table stakes with no differentiation upside.

What goes in an llms.txt?

The spec is intentionally simple. Plain markdown text. No XML, no JSON, no schema.org. At minimum it should contain:

  • An H1 with your business name: the very first line. AI parsers expect this.
  • A blockquote or paragraph describing what you do: one or two sentences, plain English. This is the line an AI will paraphrase.
  • Optional H2 sections for: core products, key pages, pricing, docs, contact, alternative names you go by.

Here's an example skeleton:

# VisibAI

> VisibAI is an AI visibility platform that audits how brands appear in ChatGPT, Perplexity, Claude, Gemini, and other AI search tools.

## Core product
- [Free AI visibility audit](https://getvisibai.com/audit/new)
- [Agency white-label dashboard](https://getvisibai.com/for/agencies)
- [Free tools](https://getvisibai.com/tools)

## Pricing
- Free: 1 audit
- Pro: 79 EUR/month
- Agency: 149 EUR/month
- Enterprise: custom

## Documentation
- [What is AI visibility](https://getvisibai.com/docs/what-is-ai-visibility)
- [Glossary](https://getvisibai.com/glossary)

## Contact
- Email: hello@getvisibai.com

That's it. Plain text. No magic. The format optimizes for AI parsers, which means it optimizes against marketing fluff. Resist the urge to write paragraphs of brand voice. Write the version a journalist would use to introduce your company in three lines.
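
To make the minimum structure concrete, here is a small Python sketch that checks a draft for the two required pieces described above: an H1 on the first line and a description right after it. The function names `check_llms_txt` and `section_titles` are hypothetical, and the rules are our reading of the format, not an official validator.

```python
def check_llms_txt(text: str) -> list[str]:
    """Return a list of structural problems; an empty list means none were found."""
    problems = []
    non_empty = [line.strip() for line in text.splitlines() if line.strip()]

    # The H1 with the business name should be the very first line.
    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first line is not an H1")

    # The next non-empty line should be the description (blockquote or
    # paragraph), not another heading or a list item.
    body = non_empty[1:]
    if not body or body[0].startswith(("#", "-")):
        problems.append("no description after the H1")

    return problems


def section_titles(text: str) -> list[str]:
    """Return the titles of the optional H2 sections, in file order."""
    return [
        line.strip()[3:].strip()
        for line in text.splitlines()
        if line.strip().startswith("## ")
    ]
```

Running `check_llms_txt` on a draft before uploading it catches the most common structural slip, a file that opens with a tagline instead of the H1. H2 sections are optional, so they are listed rather than enforced.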

Common mistakes

We've audited a few hundred sites at this point. The ones that have llms.txt files often get the structure wrong in predictable ways.

Mistake 1: Selling instead of describing. The blockquote at the top should describe what your business is, not why it's the best. "We're the leading AI-native solution for forward-thinking marketing teams" reads as noise to an AI parser. "X is a Y that does Z" is the version that gets cited.

Mistake 2: Linking only to the homepage. AI models pulling structured data from llms.txt use the linked URLs to dive deeper. If every link goes to /, you've given the model nowhere to go. Link to your pricing page, your documentation, your most-cited blog posts. A deeper link structure gives the model more to work with.

Mistake 3: Forgetting to update it. An llms.txt is not one-and-done. When your pricing changes, when you ship a new product, when your top-of-funnel content shifts, the file should change too. Treat it like a homepage description: low maintenance, but not zero.

Mistake 4: Skipping the llms-full.txt variant. The spec has a longer counterpart: llms-full.txt. Same idea, but with the full text of your most important documentation pages inlined for AI tools to ingest in one fetch. If you have a documentation site, this is worth adding. If you're a small SMB site, the basic llms.txt is enough.
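
Mistake 2 is the easiest of these to test mechanically. The sketch below counts how many URLs in a file point somewhere deeper than the homepage. The function name `link_depth_report` and the URL regex are illustrative assumptions, and the regex is deliberately loose rather than a full URL parser.

```python
import re
from urllib.parse import urlparse

# Loose pattern: grabs http(s) URLs and stops at whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)>\]]+")


def link_depth_report(text: str) -> dict:
    """Count the URLs in an llms.txt draft and how many go beyond the homepage."""
    urls = URL_PATTERN.findall(text)
    # A URL is "deep" if its path is anything other than empty or "/".
    deep = [u for u in urls if urlparse(u).path.strip("/")]
    return {
        "total": len(urls),
        "deep": len(deep),
        "homepage_only": bool(urls) and not deep,
    }
```

If `homepage_only` comes back true, every link in the file resolves to the site root, which is the pattern described in Mistake 2.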

How long should it take to make?

A first version takes about 20 minutes if you know your business. The longer part is figuring out what to actually put in it, an exercise that often surfaces useful clarity about positioning and is worth doing on its own.

We built a free llms.txt generator that gives you a starting template based on your site's existing structure. You can use it, modify it, paste it into your hosting provider as a static file at the root of your domain, and you're live.

How do you know it's working?

You don't, immediately. AI models don't continuously crawl and index websites the way Google does. They pull from training data (updated periodically), from web search APIs (used at query time), and from manually indexed sources.

A new llms.txt will start to influence AI answers within weeks to months, as those systems re-crawl your site or as AI visibility tools (ours is one, but the principle applies to any of them) pick up the file and feed it back into their analysis.

The right measurement isn't "did my llms.txt fire today." It's "is my brand being described accurately in ChatGPT now versus six months ago." That delta is what compounds.

What about robots.txt?

robots.txt still matters. AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, OAI-SearchBot) read robots.txt to decide if they're allowed to crawl your content at all. If they're blocked there, llms.txt doesn't matter: they never get to read it.

The two work together. robots.txt says "you can come in." llms.txt says "here's what to look at when you do."

Most sites we audit have one of them wrong. Some have an overly aggressive robots.txt blocking the AI crawlers they want to be cited by. Some have a clean robots.txt but no llms.txt, leaving AI models to guess. The combination of both, done right, is the easiest under-leveraged improvement in 2026.
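
To see how a given robots.txt treats these user agents, one option is Python's standard-library `urllib.robotparser`. The sketch below is a rough check under a few assumptions: the function name `ai_crawler_access` and the example.com URL are placeholders, the standard-library parser uses simple matching that may differ from how each vendor's crawler interprets the file, and Google-Extended is a control token that Google's crawlers honor rather than a separate bot.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this post.
AI_USER_AGENTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "OAI-SearchBot"]


def ai_crawler_access(robots_txt: str, url: str = "https://example.com/llms.txt") -> dict[str, bool]:
    """Report, per user agent, whether the given robots.txt text allows fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_USER_AGENTS}
```

For example, a file that contains `User-agent: GPTBot` followed by `Disallow: /`, plus a permissive `User-agent: *` group, reports GPTBot as blocked and the other four as allowed.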


Where to go from here

If you haven't checked your site yet, that's the move. Two minutes to look:

  1. Visit yoursite.com/llms.txt. If it 404s, you don't have one.
  2. Visit yoursite.com/robots.txt. Check that GPTBot, ClaudeBot, PerplexityBot, and Google-Extended are not blocked. Crawlers are allowed by default, so what you're looking for is a Disallow rule that matches them.
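
If you'd rather script the two-minute check, here is a minimal sketch using only the Python standard library. The names `http_status` and `check_site` are hypothetical, and it only confirms that both files respond with HTTP 200; reading the robots.txt rules is a separate step. Some servers reject the default urllib user agent, so treat a failure as a prompt to look manually.

```python
from typing import Callable, Optional
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def http_status(url: str) -> Optional[int]:
    """Return the HTTP status code for `url`, or None if the request failed entirely."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions that carry the status code.
        return err.code
    except (URLError, TimeoutError):
        return None


def check_site(
    domain: str,
    fetch_status: Callable[[str], Optional[int]] = http_status,
) -> dict[str, bool]:
    """Report whether llms.txt and robots.txt each return HTTP 200 at the domain root."""
    return {
        name: fetch_status(f"https://{domain}/{name}") == 200
        for name in ("llms.txt", "robots.txt")
    }
```

Calling `check_site("yoursite.com")` performs the two requests. The `fetch_status` parameter exists so the logic can be exercised without network access.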

Both files should exist, and both should be intentional. Most sites get partial credit on one and zero on the other. Closing that gap is a one-afternoon project that compounds for years.

For the full audit version (your AI visibility score, the specific fix list for your site, the platforms where you do and don't appear), start a free audit here. The llms.txt check is one of the first things we look at, and we'll generate a tailored draft for your domain as part of the report.