15 Websites Now Control 68% of Your AI Visibility
5W consolidated roughly 680M AI citations across ChatGPT, Claude, Gemini, Perplexity, and Google AI Overviews. The same 15 domains decide what gets said about your brand.
If your AI search strategy is focused on your own website, you’re working on roughly a third of the problem. The other two-thirds sits on 15 domains you don’t own.
That number comes from the AI Platform Citation Source Index 2026, released May 1 by 5W Public Relations. The report consolidates roughly 680 million citations from six prior research efforts run between August 2024 and April 2026, and lines them up against the answers produced by ChatGPT, Claude, Perplexity, Gemini, and Google AI Overviews. The headline number is brutal: the top 15 domains absorb 68% of the entire AI answer pipeline.
Most brand teams aren’t pointed at those 15 domains. They’re working through long-tail outlets, niche directories, and on-site optimization, and treating earned media as a numbers game across hundreds of sites. That’s the strategic mismatch that defines AI visibility in 2026.
What the index actually shows
5W’s index is a meta-analysis, not a new crawl. The firm pulled together six of the largest published citation studies, normalized the data, and ranked the 50 sites that account for the bulk of what answer engines pull from. The 50 sites slot into six buckets: community and conversation, encyclopedic and reference, professional and identity, video and audio, editorial and news, and commerce and review.
A few patterns show up across every engine.
| Source type | Where it dominates | Citation behavior |
|---|---|---|
| Reddit | Every engine, ~40% frequency | The single most-cited domain across LLMs |
| Wikipedia | ChatGPT, 26–48% of top-10 share | Treated as near-foundational reference material |
| Journalism (NYT, Forbes, Business Insider, Atlantic) | Claude, time-sensitive queries | 27% of all citations; 49% of citations on time-sensitive queries |
| YouTube | Google AI Overviews | Roughly 200x the citation weight of other video sources |
| NIH/PubMed and primary sources | Perplexity | Heavily weighted for medical, scientific, and B2B authority queries |
The implications are uncomfortable. If you’ve spent the last year fixing schema on your product pages and writing FAQ sections, you’ve been working on a real problem. You haven’t been working on the problem that drives most citations.
Why concentration matters more than coverage
The instinct in earned media has always been coverage. Hit the most outlets, get the most mentions, count the impressions. Concentration says the opposite. Three placements on the right 15 domains will outperform fifty placements on the wrong ones, because answer engines aren’t searching the open web at runtime. They’re reranking a small set of sources they already trust.
We covered the on-domain side of this earlier this year. Stacker’s research found that 64% of AI citations come from third-party sources, not your own website. The 5W index sharpens the picture by telling you which third parties matter. The answer isn’t “any reputable site.” It’s a small, recognizable list, and it’s fairly stable across categories.
That stability is the part most teams miss. If the top 15 domains capture 68% of citations across five different AI engines, the playbook isn’t “be everywhere.” The playbook is “be on the 15.”
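The concentration math is worth making concrete. A minimal sketch, with illustrative placeholder counts rather than figures from the 5W index, showing how a handful of top domains can swallow most of the citation pool even against a large long tail:

```python
def top_n_share(domain_counts: dict[str, int], long_tail_total: int, n: int) -> float:
    """Fraction of all citations captured by the n most-cited named domains.

    domain_counts: citation counts for individually tracked domains.
    long_tail_total: combined citations for everything outside that list.
    """
    ranked = sorted(domain_counts.values(), reverse=True)
    total = sum(ranked) + long_tail_total
    return sum(ranked[:n]) / total

# Hypothetical counts for illustration only, not data from the report.
counts = {
    "reddit.com": 400, "wikipedia.org": 310, "youtube.com": 250,
    "nytimes.com": 120, "linkedin.com": 90, "forbes.com": 80,
}

print(f"Top 3 domains: {top_n_share(counts, long_tail_total=750, n=3):.0%} of citations")
```

With these placeholder numbers, three domains out of thousands capture nearly half the pool, which is the shape of the distribution the index describes.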
The platform splits inside the consolidation
The aggregate hides some sharp differences between engines. If you’re tracking visibility on a single platform, you’re seeing a different game than the headline number suggests.
ChatGPT leans hardest on Wikipedia. Anywhere from 26% to 48% of its top-10 citations come from a single domain, depending on the study. That’s closer to a training artifact than a search result. If your brand doesn’t have a Wikipedia presence, your ceiling on ChatGPT is structurally low. We wrote about the Wikipedia–AI visibility correlation last fall, and the new index confirms it hasn’t weakened.
Claude looks different. 5W’s data shows Claude pulls only 36% of its citations from the past 12 months, against 56% for ChatGPT. Claude’s top sources skew older and more editorial: The New York Times, The Atlantic, The New Yorker, The Economist. If you’re trying to influence Claude, fresh content matters less than placement in named editorial outlets.
Perplexity is the outlier on primary sources. NIH, PubMed, government data, and named B2B authority sites carry disproportionate weight. Perplexity is the engine where a citation in a specialty trade publication moves more visibility than a placement in a general-interest magazine.
Google AI Overviews is the YouTube engine. The video advantage is real and large enough that any brand with credible video assets should have an AI Overviews story, and any brand without one is structurally behind on the platform with the most queries.
Gemini sits roughly between ChatGPT and Google AI Overviews, which is what you would expect from a Google product that draws on both general web sources and the YouTube graph.
The volatility problem
The other thing the index makes clear is that the 15 domains aren’t the same 15 domains every quarter. 5W flags one example: ChatGPT’s Reddit citation share dropped from roughly 60% to roughly 10% in six weeks in late 2025, after a single change in how Google handled a parameter that ChatGPT was using to fetch results. We saw the same dynamic in the YouTube-versus-Reddit shift that played out around Perplexity’s legal dispute with Reddit.
Two things follow from that.
First, source concentration is fragile. A platform decision, a lawsuit, an indexing change — any of these can move 20 percentage points of share between domains in weeks. If you built your AI visibility plan on a single source winning forever, you’ve built on sand.
Second, the only honest way to operate is to monitor citation share by platform and rebuild allocations on a rolling basis. Static playbooks lose. Teams that re-baseline monthly catch the shifts before competitors do.
A practical hierarchy
Given the data, here’s a hierarchy that reflects how citation weight actually distributes. It isn’t a checklist. It’s a budget allocation.
| Tier | What it includes | What it gets you |
|---|---|---|
| Tier 1: Infrastructure | Wikipedia entry, accurate Knowledge Graph entity, complete LinkedIn company profile | Foundational presence on ChatGPT, Gemini, AI Overviews |
| Tier 2: Community surface | Active, accurate presence on Reddit; brand mentions in long-form Reddit threads (300+ words) | The single largest cited surface across LLMs |
| Tier 3: Editorial and analyst | Coverage in NYT, WSJ, Forbes, Business Insider, The Atlantic, named industry analysts | Claude leverage and time-sensitive citation weight |
| Tier 4: Vertical authority | Trade publications, NIH/PubMed for medical, government for regulated industries | Perplexity weight and B2B credibility |
| Tier 5: Video | A YouTube channel with substantive content, not just product clips | The dominant video signal for Google AI Overviews |
| Tier 6: Owned content | On-site optimization, schema, freshness | The remaining 32% of citation share |
Owned content sits at the bottom because that’s where the data puts it, not because it doesn’t matter. A team that perfects tier 6 while ignoring tiers 1 through 5 is optimizing the smallest slice of the pie.
What to do this quarter
If you’ve got a quarter to move on this, three actions matter more than the rest.
- Audit your presence on the top 15. Most brands have never run this audit. You should know, by name, whether you are present on each of the dominant sources for your category. Anything missing is a gap to fill, not a tactic to argue about.
- Track citation share per platform, not just total. The aggregate number hides where you’re winning and losing. Treat each engine as its own market with its own source mix. RivalHound was built for this — it samples across ChatGPT, Claude, Gemini, Perplexity, and AI Overviews so you can see where your brand actually appears, and which of the 15 domains carried the citation. We covered the metric framework in our GEO metrics guide if you want a deeper look at what to measure.
- Plan for source rotation. Set a 60-day cadence to re-baseline your citation source mix. The Reddit-to-YouTube shift took six weeks. Whatever the next shift is, it’ll move on a similar timeline, and you’ll only catch it if you’re looking.
The instinct to treat AI visibility as a content problem is understandable. Content is what your team makes. But AI visibility is a distribution problem. The engines decide which sources to trust, and right now they’ve decided to trust about 15 of them. You either appear on those sources or you don’t appear at all.
The 15 aren’t a secret. They’ve been listed, ranked, and analyzed. The only question left is whether your team acts on the list or keeps optimizing for the long tail.
RivalHound tracks your brand’s visibility across ChatGPT, Google AI, Perplexity, and more. Start monitoring to see where you stand.