Everyone Says Add an llms.txt File. The Server Logs Say It Doesn't Matter.

Independent server-log studies show AI search crawlers touch llms.txt 0.1% of the time. Here's the data — and the one case where it's worth it.

RivalHound Team

One company watched 62,100 AI bot visits hit its site over 90 days. Exactly 84 of them touched the llms.txt file. That’s roughly 0.1%. The file the company was told would feed its content straight into AI answers was crawled about a third as often as an average page, and roughly as often as a stray PDF buried in a downloads folder.

That number, from Otterly’s controlled experiment, is the cleanest single piece of evidence in a debate that has gotten weirdly heated. So let me say the quiet part first: for AI search visibility, llms.txt is mostly theater. The data doesn’t support the hype, the AI companies haven’t endorsed it, and the case studies that claim wins fall apart when you read the footnotes.

There’s one real exception, and it’s worth knowing. But it’s not the one most marketing teams think they’re getting.

What llms.txt was supposed to do

The idea is reasonable. Jeremy Howard proposed llms.txt in September 2024 to solve a genuine problem: language models have limited context windows, and a typical web page is a mess of navigation, ads, popups, and JavaScript. Converting that into clean text the model can actually use is slow and lossy.

So the proposal was a markdown file at yoursite.com/llms.txt: a short summary of what your site is, plus a curated list of links to the pages that matter, each with a one-line description. A table of contents written for machines. The pitch wrote itself — give the AI a clean map, and it’ll understand and cite you better.
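To make the shape of the file concrete, here is a minimal Python sketch that renders one. It assumes the layout described above (a title, a short summary, then sections of links with one-line descriptions). The `Link` class, the `render_llms_txt` helper, and the Example Co site are hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    title: str
    url: str
    note: str  # one-line description of the page


def render_llms_txt(site_name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render a site summary plus curated link lists as llms.txt-style markdown."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for link in links:
            lines.append(f"- [{link.title}]({link.url}): {link.note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and page, used only to show the output shape.
example = render_llms_txt(
    "Example Co",
    "Example Co makes invoicing software for freelancers.",
    {
        "Docs": [
            Link(
                "Quickstart",
                "https://example.com/docs/quickstart.md",
                "Install and send a first invoice",
            )
        ]
    },
)
print(example)
```

The output is an ordinary markdown file you would serve at /llms.txt. Nothing about it is executable or enforceable, which matters for everything that follows.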

Nearly two years on, the map exists. The problem is that the crawlers it was built for don’t open it.

What the crawlers actually do

The Otterly study is the one to anchor on because it measured behavior, not opinion. Ninety days of server logs, filtered for AI and LLM user agents to cut out generic crawler noise. The result: 84 hits to /llms.txt against 62,100 total AI bot visits. The average content page on the same site pulled about 265 visits. The file built specifically for AI consumption performed worse than the content it was supposed to summarize.
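The arithmetic is worth checking yourself. This short Python sketch uses only the three figures quoted above; the variable names are mine.

```python
llms_txt_hits = 84          # requests to /llms.txt over the 90 days
total_ai_visits = 62_100    # all AI bot visits in the same window
avg_page_visits = 265       # approximate visits to an average content page

share = llms_txt_hits / total_ai_visits
ratio = avg_page_visits / llms_txt_hits

print(f"llms.txt share of AI bot visits: {share:.2%}")   # 0.14%
print(f"average page vs llms.txt: {ratio:.1f}x")          # 3.2x
```

So the headline 0.1% is a rounded 0.14%, and an average page drew a little over three times the visits the file did.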

Search Engine Land ran a different test and reached the same place. They tracked 10 sites across finance, SaaS, ecommerce, insurance, and pet care for 90 days before and after adding llms.txt. Eight saw no measurable change in AI traffic. One declined. Two went up, and both “wins” had obvious confounders. The neobank’s 25% lift coincided with Bloomberg coverage and a product-page rebuild. The SaaS platform’s 12.5% bump followed the launch of 27 downloadable templates. Neither gain had anything to do with a markdown file.

Then there’s the part nobody selling llms.txt services likes to quote. Google’s John Mueller, asked directly whether AI services use it, was blunt: “None of the AI services have said they’re using llms.txt, and you can tell when you look at your server logs that they don’t even check for it.”

That’s the whole game. You can argue about correlation studies and methodology, but you can’t argue with your own access logs. If GPTBot, ClaudeBot, and PerplexityBot aren’t requesting the file, the file isn’t doing anything.
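Checking your own logs takes a few lines. The Python sketch below assumes your server writes the common “combined” access log format and that the bots identify themselves with the substrings listed in `AI_BOTS`. Both are assumptions to adjust for your setup, and the sample log lines are synthetic.

```python
import re
from collections import Counter

# Assumed user-agent substrings; extend this tuple for the bots you care about.
AI_BOTS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Assumes the "combined" access log layout; adjust the pattern for your server.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def count_ai_hits(log_lines):
    """Return two Counters keyed by bot: all requests, and requests for /llms.txt."""
    total, llms = Counter(), Counter()
    for line in log_lines:
        match = LINE.search(line)
        if not match:
            continue
        bot = next((b for b in AI_BOTS if b in match["ua"]), None)
        if bot is None:
            continue  # not one of the AI bots we are tracking
        total[bot] += 1
        if match["path"].split("?")[0] == "/llms.txt":
            llms[bot] += 1
    return total, llms


# Synthetic lines for illustration only; read your real log file instead.
sample = [
    '203.0.113.5 - - [01/Jan/2026:00:00:01 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Jan/2026:00:00:02 +0000] "GET /llms.txt HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [01/Jan/2026:00:00:03 +0000] "GET /llms.txt HTTP/1.1" 200 128 "-" "Mozilla/5.0 (Windows NT 10.0) Firefox/130.0"',
]

total, llms = count_ai_hits(sample)
for bot in AI_BOTS:
    print(bot, "total:", total[bot], "llms.txt:", llms[bot])
```

Point `count_ai_hits` at an open log file instead of `sample` and compare the two columns. User-agent strings can be spoofed, so treat the counts as an estimate rather than proof.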

The claims vs. the evidence

Here’s the split between what gets promised and what holds up.

| Claim about llms.txt | What the evidence shows |
| --- | --- |
| AI search crawlers prioritize it | ~0.1% of AI bot traffic, fewer hits than an average page |
| Major AI platforms support it | No formal commitment from OpenAI, Google, Anthropic, or Meta as of mid-2026 |
| It boosts your AI citations | Controlled tests show no effect; reported wins trace to other changes |
| It’s a ranking signal like robots.txt | It’s a content map, not an access directive, and it’s unenforced |
| Everyone’s doing it | Adoption sits around one in ten sampled domains, concentrated in dev tools |

Notice who’s on each side. The studies showing no effect are independent and log-based. The case studies showing dramatic lift — 300%, 900% — tend to come from companies that sell AEO services and publish no methodology. When the people claiming a result are the people selling the fix, and the people measuring behavior find nothing, that’s not a real controversy. That’s marketing meeting data.

Stop confusing it with robots.txt

A lot of the confusion comes from the name. People assume llms.txt is the AI-era version of robots.txt — a file that controls what bots can do. It isn’t. robots.txt tells crawlers what they’re allowed to access, and the major AI crawlers at least claim to respect it. llms.txt makes no demands and grants no permissions. It just offers a suggested reading list that crawlers are free to ignore, and overwhelmingly do.

If your goal is controlling which AI bots crawl you and how often, that’s a robots.txt and server-policy question, and we’ve written about how to think about AI crawler access without torching your visibility. llms.txt has nothing to do with access control. Adding it does not block, throttle, or invite anyone. It’s a separate thing solving a separate problem — one the crawlers haven’t agreed to participate in.
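The difference is easy to see in code. Python’s standard library ships a robots.txt parser, and the sketch below uses it to evaluate a hypothetical policy (the rules and example.com URLs are invented for illustration). No equivalent check exists for llms.txt, because the file contains no rules to evaluate.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: keep GPTBot out of /private/, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(bot: str, path: str) -> bool:
    """Report whether the parsed robots.txt permits this bot to fetch the path."""
    return parser.can_fetch(bot, f"https://example.com{path}")


print(allowed("GPTBot", "/private/report"))     # False
print(allowed("GPTBot", "/pricing"))            # True
print(allowed("ClaudeBot", "/private/report"))  # True
```

This only reports what the file asks for. Whether a given crawler honors the request is up to the crawler, which is why server-side policy still matters.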

The one place it earns its keep

Now the exception, because it’s real and it changes the recommendation for a specific group.

llms.txt works beautifully for developer documentation consumed by coding agents. When you point Claude Code, Cursor, or GitHub Copilot at a docs site, those tools do fetch llms.txt and the longer llms-full.txt to load clean, structured documentation into context. Anthropic publishes one for Claude Code’s own docs. Stripe, Next.js, and a long list of API-first companies do too. This isn’t speculative — it’s a confirmed, daily-use behavior, and it’s why adoption clusters so heavily in technology and developer-tools companies.

The distinction that matters:

| Use case | Who reads the file | Worth it? |
| --- | --- | --- |
| AI search visibility (ChatGPT, Gemini, Perplexity answers) | Search crawlers, which mostly skip it | No measurable payoff |
| Developer docs in IDE coding agents | Cursor, Claude Code, Copilot, which fetch it on demand | Yes, real and confirmed |
| Access control for AI bots | Nobody (wrong tool entirely) | Use robots.txt instead |

So the honest rule is narrow. If you run a product with technical documentation that developers feed into AI coding tools, publish a clean llms.txt and a thorough llms-full.txt. Your users will use it, and it makes your docs first-class context inside the tools they already live in. If you’re a brand hoping it lifts your presence in ChatGPT’s answer about “best CRM for startups,” you’re optimizing a channel the crawlers don’t read.

What actually moves AI search visibility

The reason llms.txt feels like it should work is that it’s tidy and technical — the kind of fix engineers and SEOs like, because you ship it once and check a box. The things that genuinely drive AI visibility are messier and slower, which is exactly why people reach for the markdown file instead.

Here’s where that same effort pays off:

  1. Brand mentions across the open web. Models lean on how often and how credibly your brand shows up in places they trust. Branded mentions correlate far more strongly with AI visibility than tidy on-site files — we walked through the data in why brand mentions outperform backlinks in AI search. A file on your own server can’t manufacture third-party credibility.
  2. Structured data on the live page. Schema that describes your actual products, prices, and entities — applied to the pages crawlers do fetch — beats a separate file they don’t. Just be specific about it, because generic boilerplate schema does close to nothing.
  3. Presence in the source set models actually pull from. AI answers in any given category route through a surprisingly small set of trusted domains. Getting cited inside the handful of sources that disproportionately control AI visibility does more than any file on your own root.
  4. Freshness and clean retrieval. Crawlers fetch your real pages on their own cadence. Making those pages parse cleanly and stay current is the on-site work that compounds. The file you should care about is the one being crawled tens of thousands of times, not the one being crawled 84 times.
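For the structured-data item in the list above, here is a minimal Python sketch of what “specific” can look like: a schema.org Product with a real price, emitted as JSON-LD for the live page. The `product_jsonld` helper and the sample product are hypothetical, and which schema.org properties fit your pages is a judgment call this sketch doesn’t settle.

```python
import json


def product_jsonld(name: str, description: str, price: float, currency: str, url: str) -> str:
    """Build a JSON-LD script tag describing one product with a concrete offer."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "url": url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
        },
    }
    # Escape "</" so a stray "</script>" inside a value cannot end the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">{payload}</script>'


# Hypothetical product, for illustration only.
snippet = product_jsonld(
    "Starter Plan",
    "Invoicing for freelancers",
    12,
    "USD",
    "https://example.com/pricing",
)
print(snippet)
```

The output belongs in the HTML of the page crawlers already fetch, not in a separate file.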

None of that fits in a markdown file at /llms.txt. That’s the point.

The honest read

llms.txt isn’t a scam. It was a thoughtful answer to a genuine problem, and in the developer-docs corner it solved that problem well. What happened is that GEO marketing picked it up, stripped the context, and resold it to brands as an AI search visibility lever it was never shown to be.

If you’ve already added one, leave it — it costs nothing and won’t hurt you. If you’re a dev-tools company, invest in it properly, because your users genuinely benefit. But if it’s on your roadmap as the thing that’s going to fix your standing in AI answers, pull it off and spend that engineering time on brand presence, structured data, and being in the sources models trust. Then measure. The whole lesson of the llms.txt saga is that the access logs settle arguments the case studies can’t.

Don’t take any vendor’s word for whether a tactic works, including ours. Watch what the crawlers do, and watch whether your brand’s presence in AI answers actually moves. That’s the only scoreboard that counts.
