Technical Standards

llms.txt

llms.txt is a proposed plain-text file placed at the root of a website that gives large language models a concise, curated map of the site's most important pages and content sections, so AI systems can find the right pages without having to crawl the entire site.

Also known as: llms.txt file, LLM site index, AI sitemap

The llms.txt proposal is a community-driven standard: a single Markdown file at /llms.txt that lists the pages on a site that are most useful for an LLM to read. It is intentionally short and human-readable, with a title, a one-paragraph site description and a curated list of links grouped by section, each with a short note explaining what the linked page covers.
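As a rough illustration of that layout, the following Python sketch renders a title, a one-paragraph description and grouped links into Markdown. The class and function names and the example.com URLs are hypothetical, and writing the description as a blockquote and each link as `- [title](url): note` is an assumption about the proposal's syntax that should be checked against its current text.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    title: str
    url: str
    note: str  # short note explaining what the linked page covers


@dataclass
class Section:
    heading: str
    links: list[Link] = field(default_factory=list)


def render_llms_txt(site_title: str, description: str, sections: list[Section]) -> str:
    """Render a title, a description and grouped links as one Markdown string."""
    lines = [f"# {site_title}", "", f"> {description}", ""]
    for section in sections:
        lines.append(f"## {section.heading}")
        lines.append("")
        for link in section.links:
            lines.append(f"- [{link.title}]({link.url}): {link.note}")
        lines.append("")
    # End the file with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site data, for illustration only.
example = render_llms_txt(
    "Example Co",
    "Example Co sells widgets and publishes guides on widget maintenance.",
    [
        Section("Docs", [Link("Getting started", "https://example.com/docs/start", "Setup and first steps")]),
        Section("Policies", [Link("Returns", "https://example.com/returns", "Canonical returns policy")]),
    ],
)
print(example)
```

The resulting string would be saved as /llms.txt at the site root. Keeping the page list in structured data like this is one way to keep the file short and reviewable, though a hand-written file works just as well.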

The format differs from a traditional XML sitemap, which lists every URL for search engines, and from robots.txt, which controls access. llms.txt is closer to a hand-picked reading list: it tells AI systems which pages are the canonical, well-maintained sources for each topic on the site. Whether and how each AI engine uses llms.txt varies and is still evolving.
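To make the contrast with robots.txt concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and example.com URLs are hypothetical. It shows that robots.txt answers an access question ("may this user agent fetch this path?"), which is something llms.txt does not express at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())

# robots.txt is an access policy: each check is a yes/no per agent and path.
blocked = parser.can_fetch("GPTBot", "https://example.com/private/report")
allowed = parser.can_fetch("GPTBot", "https://example.com/docs/start")
print(blocked, allowed)
```

A site that wants both behaviours publishes both files: robots.txt to say what may be fetched, and llms.txt to say what is worth reading.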

For brands, an llms.txt file is cheap to add and useful even before universal adoption. It documents the canonical pages for each topic, surfaces them prominently for any AI system that does fetch the file, and forces the content team to be explicit about which pages are the authoritative ones. It works best when the linked pages themselves are clean, accurate and well-structured.
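One way a content team might review which pages the file declares as authoritative is to read it back into sections and links. The sketch below is an assumption-heavy illustration, not a reference parser: the function name, the sample content and the `- [title](url): note` link pattern are all hypothetical choices for this example.

```python
import re

# Matches "- [title](url)" with an optional ": note" suffix (assumed link syntax).
LINK_PATTERN = re.compile(
    r"^-\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<note>.*))?$"
)


def parse_llms_txt(text: str) -> dict[str, list[dict[str, str]]]:
    """Group the links in an llms.txt-style file by their '## ' section heading."""
    sections: dict[str, list[dict[str, str]]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_PATTERN.match(line)
            if match:
                sections[current].append({
                    "title": match["title"],
                    "url": match["url"],
                    "note": match["note"] or "",  # empty when the note is missing
                })
    return sections


# Hypothetical file content, for illustration only.
SAMPLE = """\
# Example Co

> Example Co sells widgets and publishes guides on widget maintenance.

## Docs

- [Getting started](https://example.com/docs/start): Setup and first steps
- [API reference](https://example.com/docs/api)

## Policies

- [Returns](https://example.com/returns): Canonical returns policy
"""

parsed = parse_llms_txt(SAMPLE)
# Links without a note are candidates for a content review.
missing_notes = [link["url"] for links in parsed.values() for link in links if not link["note"]]
print(missing_notes)
```

Flagging links that lack a note is a small check, but it supports the point above: the file is only as useful as the curation behind it.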

Key points

  • A proposed Markdown file at /llms.txt that lists a site's most important pages for LLMs.
  • Different from XML sitemaps and from robots.txt: it is a curated reading list, not exhaustive coverage or an access policy.
  • Adoption by AI engines is still evolving, but the file is cheap to publish.
  • Most useful when paired with clean, accurate canonical pages it links to.

Frequently asked questions

What is llms.txt?

llms.txt is a proposed plain-text file at the root of a website that gives large language models a curated list of the site's most important pages and what they cover, so AI systems can find canonical content efficiently.

Is llms.txt the same as robots.txt?

No. robots.txt controls which crawlers may access which paths. llms.txt is curation rather than access control: a short Markdown reading list that points AI systems to the canonical pages on the site.

Do AI engines actually read llms.txt?

Adoption is uneven and evolving. Some engines and tools already read it; others do not yet. Publishing it is cheap and the upside grows as adoption increases.

Related terms

robots.txt for AI Crawlers
robots.txt for AI crawlers is the use of the standard /robots.txt file to allow or block specific AI user agents such as GPTBot, ClaudeBot, PerplexityBot and Google-Extended, controlling which AI systems can access the site's content for training or real-time grounding.
AI Crawler
An AI crawler is an automated user agent operated by an AI company that fetches public web pages to use either for training large language models or for real-time grounding inside AI answers, with named examples including GPTBot, ClaudeBot, PerplexityBot, Google-Extended and CCBot.
Structured Data (Schema.org)
Structured data is machine-readable markup, most commonly in JSON-LD format using the schema.org vocabulary, that labels the meaning of content on a page (Organization, Product, FAQPage, Article, BreadcrumbList) so search engines and AI systems can parse it without having to guess from raw HTML.
Generative Engine Optimization (GEO)
Generative Engine Optimization (GEO) is the practice of shaping web content, structure and authority signals so that generative AI engines such as ChatGPT, Perplexity and Google AI Overviews recommend or cite a brand in their synthesized answers.