Why AI often cites your least visible pages in search

An investigation across three independent verticals (health YMYL, trades and construction, employment law): the pages that win AI citations are not the ones that dominate the SERPs. A Retrieval ≠ Citation model, Intent Match, and the Citation Efficiency metric.

It started with an anomaly. One of our sites dominated search for a query, with some of the highest impression counts in its niche, yet the generative assistants never seemed to cite it for that question. First in the SERPs, absent from ChatGPT. We first assumed an isolated case, a model quirk. Then we looked for the same signal elsewhere and found it in two other, unrelated verticals.

For years, SEOs assumed that a visible page was an influential page. Ranking first on Google meant being the reference on a topic. With ChatGPT, Perplexity or Copilot, that equation cracks.

We analyzed three sites belonging to three different verticals: health, trades and construction, employment law. The same signals appear in all three cases. The data are anonymized, but they come from real measurements across three distinct sites. They help identify the mechanisms that seem to favor citation by generative engines. Here are the data and the conclusions we drew.

The page that dominates AI citations is not necessarily the one that dominates the SERPs.

This is an investigation into a pattern observed across three independent verticals, not a recipe.

What three independent verticals show

The pages that win AI citations are not necessarily the most visible. They are often the ones that answer a conversational question most directly.

The thesis in three lines

  • SEO maximizes the probability of being retrieved.
  • AI maximizes the probability of finding the best answer.
  • These two objectives sometimes produce different winners.

Retrieval ≠ Citation: being retrieved as a candidate page by a search engine does not guarantee being cited in an AI-generated answer. These are two different selections, with different criteria.

Key Takeaways

  • Across three independent verticals, the most visible page in search is not the most cited by AI.
  • Hubs often win retrieval. Answer pages often win the citation.
  • Intent Match, the direct match to a question, probably explains more variance than the hub versus spoke distinction.
  • We propose an exploratory metric, Citation Efficiency (AI citations ÷ SERP impressions), useful mainly to compare similar pages. It is not an industry standard.
  • Citation and recommendation remain two different steps: being cited is not being recommended on the right criteria.

Methodology note

Both impressions and AI citations come from Bing Webmaster Tools, the latter via its native AI performance report, per page and over the same observation window. These are exploratory readings specific to these sites, not a representative sample of all engines.

Vertical A: Health (YMYL)

The phenomenon does not appear in a lab. It appears on a real site, in a YMYL (Your Money or Your Life) topic where answer quality matters all the more.

SERP visibility

Page type | Impressions
Main hub | 204
Secondary guide | 140
Sub-topic A | 77
Answer page A | 43
Sub-topic B | 25
Sub-topic C | 21
Answer page B | 19

AI citations

Page type | Citations
Answer page A | 62
Secondary guide | 50
Sub-topic A | 43
Answer page B | 29
Main hub | 18

Citation Efficiency (citations ÷ impressions)

Page type | Citations | Impressions | Efficiency
Answer page B | 29 | 19 | 1.53
Answer page A | 62 | 43 | 1.44
Sub-topic A | 43 | 77 | 0.56
Secondary guide | 50 | 140 | 0.36
Main hub | 18 | 204 | 0.09

When impressions are plotted on the x-axis and citations on the y-axis, the paradox jumps out: the least visible pages rise, while the most visible page stays at the bottom right.

AI citations
 70 ┤
 60 ┤    ● Answer page A
 50 ┤                              ● Secondary guide
 40 ┤         ● Sub-topic A
 30 ┤  ● Answer page B
 20 ┤                                        ● Main hub
 10 ┤
  0 ┼────┬────┬────┬────┬────┬────┬────┬────
     0   30   60   90  120  150  180  210
                                  SERP impressions

The pages that answer a precise situation or question capture a disproportionate share of citations.

Why we looked for a second, then a third site

A result on a single site proves nothing. It could be an artifact: a particular topic, a page structure specific to that site, a quirk of timing. To find out, we had to reproduce the observation elsewhere, on unrelated sites. So we repeated exactly the same reading, SERP impressions versus AI citations, on a second vertical, then on a third.

Vertical B: Trades and profitability

The same phenomenon reappears in a completely different industry.

SERP visibility

Page type | Impressions
Main hub | 402
Sub-topic A | 66
Sub-topic B | 42
Trade page A | 35
Tool A | 20
Sub-topic C | 16
Sub-topic D | 16
Trade page B | 10
Answer page A | 9
Trade page C | 6

AI citations

Page type | Citations
Main hub | 172
Trade page C | 57
Trade page A | 39
Answer page A | 31
Sub-topic B | 19
Trade page B | 14
Tool B | 12

Citation Efficiency (citations ÷ impressions)

Page type | Efficiency
Trade page C | 9.50
Tool B | 4.00
Answer page A | 3.44
Trade page B | 1.40
Trade page A | 1.11
Main hub | 0.43

The most striking number

A trade page with only 6 SERP impressions generates 57 AI citations, a Citation Efficiency of 9.5. The example is just as telling as the health one and, since the data are anonymized, gives a competitor nothing to exploit.

A ratio above 1 can surprise: how can a page be cited more often than it is seen? Because the two figures are not measured on the same surface. Impressions come from classic SERPs, while generative assistants can draw on their own index or on crawl databases, without generating an impression in Bing or Google. A page that is barely visible in search can therefore be widely reused in answers.

An honest caveat: here, the most cited page in raw volume is the hub, thanks to its massive visibility. But the answer pages dominate by far on efficiency. The pattern does not say they always win in volume. It says they capture a disproportionate share of citations relative to their visibility.

Vertical C: Employment law and psychosocial risks

A third vertical, unrelated to the first two, shows the same signal.

SERP visibility

Page type | Impressions
Statistical study | 128
Workplace situation A | 96
Workplace situation B | 75
Weak signal A | 52
Calculator page | 42

AI citations

Page type | Citations
Calculator page | 73
Statistical study | 73
Weak signal A | 34
Recourse guide | 30
Evidence page | 28

Citation Efficiency

Page type | Citations | Impressions | Efficiency
Calculator page | 73 | 42 | 1.74
Weak signal A | 34 | 52 | 0.65
Statistical study | 73 | 128 | 0.57
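
A ratio can only be computed for a page that appears in both readings. As a minimal sketch, assuming each reading is held as a plain page-to-count mapping (the variable and function names are illustrative and do not reflect any export format of Bing Webmaster Tools), the Vertical C figures from the two tables above can be joined like this:

```python
# Vertical C figures copied from the visibility and citation tables above.
impressions = {
    "Statistical study": 128,
    "Workplace situation A": 96,
    "Workplace situation B": 75,
    "Weak signal A": 52,
    "Calculator page": 42,
}
citations = {
    "Calculator page": 73,
    "Statistical study": 73,
    "Weak signal A": 34,
    "Recourse guide": 30,
    "Evidence page": 28,
}


def join_readings(impressions, citations):
    """Return (efficiency per page, pages that have only one of the two figures)."""
    both = impressions.keys() & citations.keys()
    efficiency = {
        page: citations[page] / impressions[page]
        for page in both
        if impressions[page] > 0  # no ratio without at least one impression
    }
    unmatched = sorted(impressions.keys() ^ citations.keys())
    return efficiency, unmatched


efficiency, unmatched = join_readings(impressions, citations)
for page, ratio in sorted(efficiency.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: {ratio:.2f}")
print("No ratio (listed in only one table):", unmatched)
```

Pages listed in only one table may simply fall outside the rows shown, so the sketch leaves them without a ratio rather than guessing a value.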

The signal

A calculator page gets about 1.7 times as many citations as impressions, while the statistical study, far more visible, stays below 0.6.

Why the third site changes everything

A finding on a single site is an anecdote. On two, a hypothesis. On three unrelated verticals, a plausible pattern.

Number of sites | Status of the finding
1 site | Anecdote, possible artifact
2 sites | Hypothesis to confirm
3 independent verticals | Plausible pattern, to test more broadly

It is precisely the third vertical that changed our reading. As long as the phenomenon appeared only in health and then in construction, we could suspect a common bias: a writing style, a way of structuring pages, the same tooling. Employment law, with no editorial or technical link to the other two, made that explanation much less likely. The signal seemed less tied to the sites themselves than to how the engines select certain answers.

Why Intent Match explains citations better than classic SEO

A methodological caution. What follows is a model hypothesis, not an absolute truth. The internal pipelines of ChatGPT, Google or Bing vary, change often, and are not public. We keep the model because it fits what we observe across the three verticals.

Retrieval → Intent Match → Citation

Intent Match: a page’s ability to match the exact natural phrasing of a user’s question.

SEO answers: “Can this page be found?”

Intent Match answers: “Is this the answer we were looking for?”

Classic SEO mainly optimizes retrieval: authority, indexing, coverage. It maximizes the probability of entering the candidate pool. But once that pool is built, the generative engine does not seem to reuse the most authoritative page. It seems to reuse the one that most resembles the question asked. In our model, Intent Match is what decides, and it is what best explains the gaps we observe between visibility and citation.

Retrieval pulls several candidates. Intent Match often selects the one that already resembles the question.

Question type | Selected page
Why does my symptom persist? | Answer page
How much does an independent professional earn? | Answer page
What amount can I obtain? | Answer page

In all three cases, the system does not need to extract the right portion from the middle of a guide. The page already resembles the question, so it is easier to reuse.

The real pattern: answer pages

What the highest Citation Efficiency pages have in common is not their topic, not their sector, not their traffic volume. It is their ability to answer a question immediately.

Vertical | Winning page
Health | Answer to a symptom
Trades / construction | Answer to a trade question
Employment law | Answer to a recourse question

The highest Citation Efficiency pages look more like answers than like content.

Answer page: a specialized page that immediately answers one question, situation or decision, without thematic dilution.

Hubs and spokes: a secondary mechanism

The hub versus spoke distinction keeps explanatory value, but it is secondary.

Page type | Role | Step often won
Hub | Broad page, wide coverage, high visibility | Retrieval
Spoke | Specialized page, direct answer | Citation

Hub vs Spoke explains part of the phenomenon. Intent Match probably explains more: a specialized page poorly aligned with a question stays barely cited, while a page that exactly matches a question wins the citation, even when it is attached to a hub. The hub is not the spoke’s enemy: it captures visibility and feeds the answer pages.

The Citation Efficiency concept

Citation Efficiency: the ratio of a page’s AI citations to its SERP impressions. It measures neither SEO quality nor traffic, but a page’s ability to convert visibility into citations.

Citation Efficiency = AI citations ÷ SERP impressions

Vertical | Best Citation Efficiency
Health | 1.53
Trades / construction | 9.50
Employment law | 1.74

Even when the numbers vary widely, from 1.53 to 9.50, the pages with high Citation Efficiency are systematically pages with high answer value.
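
To make the ratio concrete, here is a minimal sketch that recomputes Citation Efficiency for the health vertical from the citation and impression counts given earlier in the article. The `PageStats` class and `rank_by_efficiency` function are illustrative names, not part of any tool mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageStats:
    page: str
    citations: int
    impressions: int

    @property
    def efficiency(self) -> float:
        # Citation Efficiency = AI citations / SERP impressions.
        # Assumes impressions > 0, which holds for every row below.
        return self.citations / self.impressions


# Health vertical: counts taken from the article's tables.
HEALTH = [
    PageStats("Main hub", citations=18, impressions=204),
    PageStats("Secondary guide", citations=50, impressions=140),
    PageStats("Sub-topic A", citations=43, impressions=77),
    PageStats("Answer page A", citations=62, impressions=43),
    PageStats("Answer page B", citations=29, impressions=19),
]


def rank_by_efficiency(pages):
    """Sort pages from highest to lowest Citation Efficiency."""
    return sorted(pages, key=lambda stats: stats.efficiency, reverse=True)


for stats in rank_by_efficiency(HEALTH):
    print(f"{stats.page:<16}{stats.efficiency:>6.2f}")
```

Sorting by the ratio rather than by raw citations is what moves the two answer pages to the top and the main hub to the bottom.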

Limits of Citation Efficiency

It is a comparison indicator, not an absolute truth. The ratio:

  • depends on the sample: on small numbers it becomes unstable and misleading;
  • depends on the observation window;
  • depends on the engine measured;
  • is mainly useful to compare similar pages.

Use it on sufficient volumes, over a stable window, and by segmenting page types.
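
One way to apply this advice is to flag ratios computed on too few impressions. The sketch below uses the trades figures from earlier in the article. The cutoff of 30 impressions (`MIN_IMPRESSIONS`) is an arbitrary assumption chosen for illustration, not a value from our measurements.

```python
MIN_IMPRESSIONS = 30  # hypothetical cutoff: tune it to your own volumes

# Trades vertical: (citations, impressions) for pages present in both tables.
TRADES = {
    "Main hub": (172, 402),
    "Trade page A": (39, 35),
    "Trade page B": (14, 10),
    "Answer page A": (31, 9),
    "Trade page C": (57, 6),
}


def efficiency_report(pages, min_impressions=MIN_IMPRESSIONS):
    """Return rows of (page, ratio, is_low_volume), highest ratio first."""
    rows = []
    for page, (cites, imps) in pages.items():
        if imps <= 0:
            continue  # no ratio without impressions
        rows.append((page, cites / imps, imps < min_impressions))
    return sorted(rows, key=lambda row: row[1], reverse=True)


for page, ratio, low_volume in efficiency_report(TRADES):
    note = "  (low volume, read with caution)" if low_volume else ""
    print(f"{page:<14} {ratio:5.2f}{note}")
```

With this cutoff, the 9.50 ratio is flagged because it rests on 6 impressions: a single extra impression would bring it down to 57 ÷ 7, about 8.14.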

How to spot your future Citation Efficiency winners

A page has strong citation potential if it:

  • targets a profession, a symptom or a situation;
  • targets a precise question or problem;
  • contains figures and thresholds;
  • has a title close to natural language.

Conversely, a very visible but rarely cited page is often a disguised hub, to break down into answer pages.
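
As a rough sketch of that last check, the snippet below lists pages whose impressions sit above the median while their ratio sits below it. The median rule is an assumption made for illustration: we do not define a threshold, and the output is only a list of candidates to review by hand.

```python
from statistics import median

# Health vertical: (citations, impressions), from the tables earlier in the article.
HEALTH = {
    "Main hub": (18, 204),
    "Secondary guide": (50, 140),
    "Sub-topic A": (43, 77),
    "Answer page A": (62, 43),
    "Answer page B": (29, 19),
}


def disguised_hub_candidates(pages):
    """Pages with above-median impressions and below-median Citation Efficiency."""
    ratios = {page: cites / imps for page, (cites, imps) in pages.items()}
    median_impressions = median(imps for _, imps in pages.values())
    median_ratio = median(ratios.values())
    return [
        page
        for page, (_, imps) in pages.items()
        if imps > median_impressions and ratios[page] < median_ratio
    ]


print(disguised_hub_candidates(HEALTH))
```

On these figures the rule surfaces the main hub and the secondary guide, the two most visible pages of the health site.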

How to measure your AI visibility

There is no Search Console equivalent for ChatGPT yet. Measurement is built by cross-referencing several sources, each covering only one angle.

Source | What it reveals | Limit
Bing Webmaster Tools | Impressions, queries and AI citations via the AI performance report | Coverage centered on the Bing and Copilot ecosystem
Google Search Console | Impressions, queries, appearance in AI Overviews | AI visibility still partial
GA4 | Referral traffic from AI assistants | Captures clicks only, not citations without a click
Server logs | AI crawler hits: GPTBot, OAI-SearchBot, PerplexityBot, ClaudeBot (Google-Extended is a robots.txt control token with no user agent of its own, so it does not show up in logs) | Indicates collection, not citation
Dedicated tools | Citation tracking: Profound, Peec AI, Scrunch AI | Coverage and engines vary
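
For the server-log row, here is a minimal sketch that counts hits per AI crawler. It assumes an access log in the common "combined" format, where the user agent is the last quoted field on each line. The sample lines are synthetic and their user-agent strings are simplified, so check the real strings each vendor publishes before relying on the match.

```python
import re
from collections import Counter

# Crawler tokens named in the table above. Google-Extended is left out because
# it is a robots.txt control token without its own user-agent string.
AI_CRAWLER_TOKENS = ("GPTBot", "OAI-SearchBot", "PerplexityBot", "ClaudeBot")

# Assumption: the user agent is the last double-quoted field of the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_crawler_hits(lines):
    """Count log lines per AI crawler token (case-insensitive substring match)."""
    hits = Counter()
    for line in lines:
        match = LAST_QUOTED_FIELD.search(line)
        if not match:
            continue  # line does not follow the assumed format
        user_agent = match.group(1).lower()
        for token in AI_CRAWLER_TOKENS:
            if token.lower() in user_agent:
                hits[token] += 1
                break
    return hits


# Synthetic sample lines, for illustration only.
SAMPLE_LINES = [
    '203.0.113.5 - - [01/Jan/2025:00:00:00 +0000] "GET /answer-page HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.6 - - [01/Jan/2025:00:00:05 +0000] "GET /main-hub HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; ClaudeBot/1.0)"',
    '203.0.113.7 - - [01/Jan/2025:00:00:09 +0000] "GET /answer-page HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.8 - - [01/Jan/2025:00:00:12 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]

print(count_ai_crawler_hits(SAMPLE_LINES))
```

As the table notes, these counts show that a crawler collected a page, not that an assistant cited it.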

Because the ratio depends on the window and the engine, an isolated measurement is worth little. AI visibility moves: a model or index update can shift your citations from one month to the next. The right reflex is not a one-off audit, but monitoring Citation Efficiency over time.

What the three verticals have in common

Vertical | SERP winner | Citation Efficiency winner | Signal
Health | Main hub | Answer page | Answer to a symptom
Trades / construction | Main hub | Trade page | Answer to a trade question
Employment law | Statistical study | Calculator page | Decisional answer

In this table, the Citation Efficiency winner means the page that captures the most citations relative to its visibility. Despite radically different topics, these pages all belong to the same family: answer pages.

What we observe, what we do not claim

What we observe. The most cited pages answer a question, a decision or a situation.

What we do not claim. We do not prove the internal workings of ChatGPT, Bing or Google. We simply show that the same signal appears across three different verticals.

Going further: citation is not recommendation

Citation Efficiency measures one step, not the whole race. Being cited does not mean being recommended. It helps to read these signals as a ladder of three thresholds.

Threshold | Question | Metric
Retrieval | Are we pulled into the candidate pool? | Impressions, indexing, authority
Citation | Are we reused in the answer? | Citation Efficiency
Recommendation | Are we recommended on the right criteria? | Decision Share of Voice

For a publisher site, the stakes often stop at the citation and the traffic it brings. For a brand that sells an offer, winning the citation without winning the recommendation can mean appearing without converting. That is the subject of "Why am I cited by ChatGPT but not converting?".

Conclusion

What looked like an anomaly on a single site turned out to be a stable signal. The three verticals analyzed show the same phenomenon: generative engines do not seem to favor only the most visible pages. They seem to favor the pages that best match a question phrased by a user.

SEO remains essential to enter the candidate pool. But once retrieved, a page still has to win the Intent Match, then the citation.

SEO makes you findable. Intent Match makes you citable.

In an environment where generative engines become a layer of access to information, the question is no longer only “can I be found?”, but “am I the easiest answer to reuse?”.

FAQ

Is this limited to one sector?

No. We observe it across three independent verticals: health YMYL, trades and construction, employment law. In all three, pages that answer a precise question capture a disproportionate share of citations relative to their SERP visibility.

Does GEO replace SEO?

No. SEO makes you findable, which remains the entry condition into the candidate pool. GEO adds a question: once retrieved, does your page win the Intent Match and then the citation?

Do backlinks still matter?

Yes, for authority and retrieval. But a strong link profile does not guarantee a citation if the page lacks a self-contained block of text aligned with a question.

How do I know if ChatGPT cites me?

By cross-referencing Bing Webmaster Tools, Google Search Console, GA4, AI crawler server logs and citation-tracking tools, then comparing citations and impressions to compute a Citation Efficiency per page.

What best explains the citation?

In our observations, Intent Match, the direct match between a page and the natural phrasing of a question, explains more variance than the simple hub versus specialized-page distinction.
