23rd February 2026

Should your website have an llms.txt file? An honest guide for 2026

Google's May 2026 AI optimisation guide put llms.txt on its list of things you don't need to do for Search. Here's what the file does, who actually benefits, and where to focus your time instead if you want to show up in AI answers.

Eighteen months ago, every SEO blog and developer feed told you to add an llms.txt to your site. Then in May 2026, Google published its official guide to optimising for generative AI features and put llms.txt on the list of things you don't need to do.

So the answer to whether your site needs one isn't the same as it was a year ago.

We've had llms.txt on rubberduckers.co.uk since the standard emerged. We've added it for clients on our Growth Partner retainer. We watch the server logs to see who actually requests it. And we've now read Google's guide carefully and adjusted how we talk about the file.

This piece covers what llms.txt is, what Google has actually said, what real-world adoption looks like, and where the file does and doesn't earn its place in 2026.

How llms.txt works and what sets it apart

The llms.txt standard was proposed by Jeremy Howard at Answer.AI in late 2024. It's a single plain markdown file that sits at the root of your site (yoursite.com/llms.txt) and acts as a curated tour of your most important pages for AI models.

The format is simple. Your business name as an H1. A short summary. H2 sections (services, docs, blog, products) listing your key URLs with one-line descriptions.

What an llms.txt file contains

Most well-written llms.txt files share the same structure. They open with a site or company name as an H1, then a short summary of what you do, then sections grouped by intent. Each section lists curated links, each followed by a colon and a one-line description.

You can include an Optional section for less critical pages. Models tend to drop these first when their context is tight. That's where you'd put changelogs, advanced configuration, or anything supplementary.

There's no minimum length. Some files are six lines. Some run to several hundred. The principle is curation, not coverage.
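To make the structure concrete, here is a minimal Python sketch that renders a file in this shape. The function name, the site name and every URL are hypothetical, and the summary is written as a blockquote, which is how the proposal formats it. Treat it as an illustration of the layout rather than a reference implementation.

```python
def build_llms_txt(name, summary, sections):
    """Render an llms.txt document from a name, a summary and link sections.

    `sections` maps a section title to a list of
    (label, url, description) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}"]
    for title, links in sections.items():
        lines += ["", f"## {title}", ""]
        for label, url, description in links:
            lines.append(f"- [{label}]({url}): {description}")
    return "\n".join(lines) + "\n"


# Hypothetical site data, for illustration only.
example = build_llms_txt(
    name="Example Studio",
    summary="A small web agency building and maintaining WordPress sites.",
    sections={
        "Services": [
            ("Web design", "https://yoursite.com/services/web-design/",
             "What we build and how we price it"),
        ],
        # Supplementary pages go last, under "Optional".
        "Optional": [
            ("Changelog", "https://yoursite.com/changelog/",
             "Release notes for our plugins"),
        ],
    },
)
print(example)
```

Keeping the page list in a data structure like this makes it easier to regenerate the file when pages are renamed, though a hand-written file works just as well for a small site.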

How it differs from robots.txt and sitemaps

These three files do different jobs and they get conflated often.

robots.txt controls access. It tells crawlers what they can and can't reach. It's binary: allow or disallow.

sitemap.xml lists everything. It's exhaustive. The job is to make sure search engines can discover every URL on the site.

llms.txt is curated and persuasive. It doesn't grant or deny access. It says "if you only had time to read ten of our pages, read these". Think of it as an editorial layer sitting on top of the technical files.

You need a working robots.txt and sitemap regardless. llms.txt is additive.
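The "binary" nature of robots.txt is easy to see with Python's standard-library parser. The rules and URLs below are made up for illustration. The point is that robots.txt can only answer a yes/no access question, whereas llms.txt expresses no access rules at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Each check is a yes/no answer for one crawler and one URL.
private_ok = parser.can_fetch("GPTBot", "https://yoursite.com/private/report")
services_ok = parser.can_fetch("GPTBot", "https://yoursite.com/services/")
print(private_ok, services_ok)
```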

Why markdown is the format

Markdown suits AI consumption unusually well. It's plain text, so there's no parsing of nav bars, ad scripts, popups, or layout elements. The structure is explicit: hashes for headings, hyphens for lists, brackets for links.

Models are also trained on enormous quantities of markdown from documentation sites and GitHub repositories. The format is already familiar.
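As a rough illustration of how little parsing the format needs, the sketch below pulls the title, sections and links out of an llms.txt file with one regular expression. The sample file and the function name are hypothetical, and the pattern only handles the simple "- [label](url): description" line shape described above.

```python
import re

# Matches "- [label](url)" with an optional ": description" suffix.
LINK_RE = re.compile(
    r"^- \[(?P<label>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<desc>.*))?$"
)


def parse_llms_txt(text):
    """Return (title, sections); sections maps H2 titles to lists of links."""
    title = None
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                sections[current].append(match.groupdict())
    return title, sections


# Hypothetical file contents, for illustration only.
SAMPLE = """\
# Example Studio

> A small web agency.

## Services

- [Web design](https://yoursite.com/services/web-design/): What we build
- [Support](https://yoursite.com/support/)
"""

title, sections = parse_llms_txt(SAMPLE)
print(title)
for name, links in sections.items():
    print(name, [link["url"] for link in links])
```

Compare that with extracting the same information from a rendered HTML page, where the parser first has to work out which elements are navigation, ads or layout.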

Some sites go further and publish markdown versions of key pages at the same URL with .md appended (yoursite.com/about.md, for example). Hugging Face does this with installation commands embedded directly in the file. It works well for documentation-heavy sites. For most marketing sites it's overkill.

Adoption and what the evidence actually shows

The picture of llms.txt adoption is more cautious than the SEO discourse suggests. Most data circulating online conflates "has an llms.txt" with "AI tools are reading and using it", which are different questions.

Who's adopted it

The clearest adoption is in technical documentation and developer tooling. Cloudflare, Anthropic, Vercel, Supabase, Mintlify, Ahrefs and Hugging Face all maintain one. The pattern is consistent: companies whose product gets queried by AI assistants on behalf of developers.

Outside that group, adoption is patchier. Plenty of marketing sites have added the file, including ours. Plenty haven't. There's no reliable data showing mainstream business websites are picking it up at scale.

What the major AI providers have said

This is where most of the existing discourse falls down. There's a lot of confidence online about which AI crawlers "respect" llms.txt and at what percentage. Most of those numbers aren't sourced to anything verifiable.

OpenAI, Anthropic and Google have all published documentation on how their crawlers and crawler controls (GPTBot, ClaudeBot, and Google's Google-Extended token) interact with sites. None of those documents currently treats llms.txt as a directive in the way robots.txt is. The standard is proposed, not adopted.

Google's May 2026 guide goes further and says llms.txt is not needed for visibility in AI Overviews or AI Mode. Their position is unambiguous.

Real-world results so far

We watch our own server logs. Requests for /llms.txt are rare. We've added it for clients and seen the same pattern.

We've yet to identify a single case where llms.txt demonstrably drove an AI citation or referral. That doesn't make the file useless, but measurement is hard and the upside is currently theoretical for most sites.

If anyone tells you they've got a measurable case study tying llms.txt to traffic or citations, ask how they isolated the file from everything else they were doing. The honest answer is usually that they didn't.
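If you want to run the same log check on your own site, a short script is enough. The sketch below assumes the "combined" access log format that Apache and nginx commonly write. The sample lines, IP addresses and bot name are invented. Bear in mind that a hit only shows the file was fetched, not that anything was done with it.

```python
import re
from collections import Counter

# Assumes the common "combined" access log format (Apache, nginx).
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def count_llms_txt_requests(log_lines):
    """Count requests for /llms.txt, grouped by user-agent string."""
    counts = Counter()
    for line in log_lines:
        match = LOG_RE.search(line)
        if not match:
            continue
        # Ignore any query string when comparing the path.
        path = match.group("path").split("?", 1)[0]
        if path == "/llms.txt":
            counts[match.group("agent")] += 1
    return counts


# Invented log lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Mar/2026:09:14:02 +0000] "GET /llms.txt HTTP/1.1" 200 812 "-" "ExampleBot/1.0"',
    '203.0.113.9 - - [10/Mar/2026:09:15:40 +0000] "GET /services/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [11/Mar/2026:02:01:13 +0000] "GET /llms.txt?ref=x HTTP/1.1" 200 812 "-" "ExampleBot/1.0"',
]

counts = count_llms_txt_requests(SAMPLE_LOG)
for agent, hits in counts.most_common():
    print(f"{hits:>4}  {agent}")
```

On a real server you would pass the open log file instead of the sample list, for example `count_llms_txt_requests(open("access.log"))`, with the path adjusted to wherever your host keeps its logs.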

Impacts on AI search, SEO, and visibility

The most important shift in 2026 is that we now have an official position from Google on AI search optimisation. That changes how llms.txt fits into the picture.

Google's May 2026 guide, in plain English

In May 2026, Google published its guide to optimising for generative AI features on Search. It's the clearest statement we've had on how AI Overviews and AI Mode pick which pages to cite.

Three things stand out.

First, foundational SEO still drives everything. AI Overviews aren't a parallel system. They pull from the same core Search index.

Second, Google explicitly lists llms.txt under "what you don't need to do". Same for chunking content, rewriting pages for AI, and over-structuring with schema. The guide states plainly that you don't need to create new machine-readable files, AI text files, or markdown to appear in generative AI search.

Third, the lever that actually moves the needle is non-commodity content. Google contrasts "7 Tips for First-Time Homebuyers" (common knowledge, available everywhere) with "Why we waived the inspection and saved money: a look inside the sewer line" (first-hand, specific, only that writer could have produced it). The second gets cited; the first doesn't.

How AI Overviews and AI Mode actually work

Two techniques sit underneath the AI features on Google Search.

Retrieval-augmented generation (RAG): the model grounds its response in pages retrieved from the Search index. Those pages get cited as clickable links in the answer.

Query fan-out: a single user question quietly fires several related queries behind the scenes. "How do I fix a weedy lawn" might also trigger "best herbicides", "remove weeds without chemicals", and "prevent weeds in lawn". Pages that comprehensively cover a topic can get retrieved across multiple fan-out queries from one user question.

The implication for content strategy is significant. One strong deep page on a topic beats five thin pages targeting variations. Spinning up thin variations to chase fan-out queries risks falling foul of Google's scaled content abuse policy.

Whether llms.txt affects rankings

For Google Search, no. The guide is direct about it.

For other AI systems, the picture is less clear. ChatGPT, Claude, Perplexity and others may pick up the file when they crawl. Their makers haven't published authoritative documentation on whether, or how heavily, they weight it. Anecdotal evidence is mixed and hard to verify.

Our working assumption is that llms.txt may help non-Google AI systems understand your business and surface the right pages, though the effect is currently small and unmeasurable for most sites. If you want a deeper view of what actually gets your site cited in AI answers, our piece on how to get your site cited in ChatGPT and AI search covers it in detail.

Where llms.txt earns its place

For three groups, the file is worth setting up.

Developer tools and SaaS platforms. If your product has an API, integration docs or a library reference, AI assistants get asked daily to help developers work with it. A clean llms.txt that points to your getting-started guide, auth flow and key endpoints saves models from scraping a deep documentation tree. This is the strongest use case.

Sites with a poor content-to-clutter ratio. WordPress sites loaded with sidebars, popups and widgets can be hard for AI to parse cleanly. A markdown summary at /llms.txt is one way to point past the noise.

AI-native businesses. Companies whose primary audience is people building with AI have the closest natural fit for the standard.

For most other sites, including ours, the file is a low-cost insurance policy rather than a meaningful traffic driver.

Implementation, costs, and best practice right now

Setting up llms.txt is fast and free. The decision isn't about cost. It's about whether the time would be better spent elsewhere.

Setting up the file on a typical site

The basic steps haven't changed since the standard was proposed.

Write the file in markdown. Start with your business name as an H1. Add a one-line summary as a blockquote. Group your most important URLs under H2 sections (services, blog, docs, sectors, products). Add a colon and a one-line description for each link.

Upload to your site root. The file lives at yoursite.com/llms.txt. Yoast doesn't handle this directly, though SEOPress dynamic fields can keep content current. Our setup uses dynamic fields so the file refreshes as we publish or rename pages.

Validate with llmstxtvalidator.dev before publishing. It catches the obvious mistakes.
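For a quick local sanity check before uploading, something like the sketch below covers the structure described in the steps above. It is our own rough check, not a description of how any online validator works, and the function name and sample files are hypothetical.

```python
import re

# "- [label](http(s)://url)" with an optional ": description" suffix.
LINK_LINE = re.compile(r"^- \[[^\]]+\]\(https?://[^)\s]+\)(: .+)?$")


def check_llms_txt(text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("# "):
        problems.append("First line should be an H1 with the site name")
    if not any(line.startswith("> ") for line in lines):
        problems.append("No blockquote summary found")
    if not any(line.startswith("## ") for line in lines):
        problems.append("No H2 sections found")
    for line in lines:
        if line.startswith("- ") and not LINK_LINE.match(line):
            problems.append(f"Malformed link line: {line}")
    return problems


# Hypothetical files, for illustration only.
GOOD = """\
# Example Studio

> A small web agency.

## Services

- [Web design](https://yoursite.com/services/web-design/): What we build
"""

BAD = """\
Example Studio

- Web design: https://yoursite.com/services/web-design/
"""

print(check_llms_txt(GOOD))
for problem in check_llms_txt(BAD):
    print(problem)
```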

For a typical WordPress site, the file takes about thirty minutes to write and upload the first time. Maintenance is rare.

Where it fits with developer docs and APIs

If your site has comprehensive documentation, this is the strongest case for setting up the file properly.

Group endpoints, auth flows and getting-started guides under a Docs section. Add an Optional section for changelogs and advanced configuration that models can deprioritise. Consider publishing markdown versions of your key documentation pages at .md URLs.
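If you do publish markdown versions, the URL mapping is mechanical. The sketch below follows our reading of the llms.txt proposal, which is an assumption worth checking against the current text: append .md to a page URL, and use index.html.md for URLs that end in a slash. The function name and URLs are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url):
    """Derive the companion markdown URL for a documentation page.

    Assumption: URLs ending in "/" map to index.html.md and all other
    URLs get ".md" appended.
    """
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html.md"
    else:
        path += ".md"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )


print(markdown_url("https://yoursite.com/docs/auth"))
print(markdown_url("https://yoursite.com/docs/"))
```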

Mintlify and several other documentation platforms now generate the file automatically from your existing docs structure. Worth checking if your platform supports it.

For non-developer sites (marketing pages, service pages, e-commerce), this level of structure isn't needed. A simple curated list will do.

When it's worth the time, and when it isn't

Worth setting up if:

  • You run a SaaS, developer tool or API and AI assistants are part of how your users work with you
  • Your documentation is already structured in clean markdown
  • Your buyers regularly ask ChatGPT or Claude about your product category and you want accurate references
  • You're on an ongoing site retainer where it can be added quickly as part of routine maintenance

Skip or deprioritise if:

  • You're a local service business, agency, restaurant or property developer whose buyers aren't using AI assistants to evaluate your category
  • Your time budget for SEO work is limited and you have weak service pages or thin blog content that needs attention first
  • You're hoping it'll move Google rankings; Google has said it won't

For everyone in between, the file is a low-risk, low-reward addition. Worth doing if it's quick. Not worth losing sleep over.
