Research Report

The Half-Life Report

A Research Synthesis on AI Citation Persistence

April 2026 · Published by Quoted
AI citations have a half-life of roughly 4.5 weeks. Between 40% and 60% of cited sources rotate every month — yet the answers AI engines give stay 95% semantically consistent. The sources change; the conclusions don't. Most research in this field focuses on how to get cited. Almost none focuses on how to stay cited. This report synthesizes every major published study on AI citation persistence and offers a confidence-ranked framework for what to do about it.

The Coming Wave of AEO

Something fundamental has changed in how buyers find information, and most marketing teams haven't caught up yet.

AI search engines — ChatGPT, Perplexity, Google AI Overviews, Claude, Gemini — are increasingly where B2B buyers start their research. Instead of scanning ten blue links and choosing which to click, they ask a question and get a synthesized answer with sources inline. The buyer reads the answer, maybe clicks one or two citations, and moves on. The entire discovery funnel compressed into a single interaction.

The numbers are still early. Conductor's analysis of 3.3 billion sessions across 13,000 domains found that AI referrals account for roughly 1% of total traffic — small in absolute terms, but growing rapidly and disproportionately influential. These aren't casual browsers. AI search users tend to be further along in their buying journey, asking specific questions that signal intent: "best SIEM platform for mid-market companies," "how to measure content ROI across channels," "SOAR vs. SIEM for incident response."

For a decade, traditional SEO optimized for rankings that persisted for months or years. You published a strong piece, earned backlinks, and watched it climb to page one. Once there, it stayed — barring a major algorithm update or a competitor with a bigger content budget. The economics were predictable: invest upfront, harvest traffic over time.

AI citation works nothing like this.

When an AI engine answers a query, it doesn't consult a fixed index the way Google's traditional search does. It retrieves content dynamically, selects sources based on the specific query and context, and assembles an answer in real time. The sources it cites for the same question today may be entirely different from the ones it cited last week. There is no "page one" to hold. There is no ranking to defend. There is only the question of whether your content gets selected this time, for this query, on this engine.

The business implication is straightforward: your content investment has a shorter shelf life than you think. The playbook that worked for traditional search — publish, optimize, maintain — is necessary but no longer sufficient. In traditional SEO, intelligence was static: you ranked or you didn't. In AI citation, intelligence is compound — every week of tracking reveals patterns that the previous week couldn't, and the companies collecting that data now will have months of proprietary insight by the time their competitors realize they need it. Understanding what makes content get cited is only half the challenge. The other half, and the one almost nobody is studying rigorously, is understanding what makes content stay cited.

That's what this report is about.


The Core Finding: Citations Decay — Fast

The most important number in AI citation research is 4.5 weeks. That's the approximate half-life of an AI citation — the time it takes for a piece of content to lose half of its citation appearances across AI engines.

This finding comes from the most rigorous persistence-specific study publicly available: a longitudinal analysis by Stacker and Scrunch that tracked eight articles across 944 prompt-platform combinations on five AI engines using survival analysis, a statistical method borrowed from medical research that measures how long something "survives" before an event occurs. In this case, the event is losing a citation slot.
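The survival-analysis mechanic can be illustrated with a minimal Kaplan-Meier estimator; the data below is a toy stand-in (the actual Stacker/Scrunch dataset is not public):

```python
def kaplan_meier(durations, lost):
    """Kaplan-Meier estimate of citation-slot survival.

    durations: weeks each tracked (article, prompt, platform) combination
               was observed
    lost:      True if the citation slot was lost in that week,
               False if tracking ended while still cited (censored)
    Returns [(week, probability the slot survives past that week)].
    """
    curve, s = [], 1.0
    for t in sorted({d for d, event in zip(durations, lost) if event}):
        n_at_risk = sum(1 for d in durations if d >= t)
        n_lost = sum(1 for d, event in zip(durations, lost) if d == t and event)
        s *= 1 - n_lost / n_at_risk
        curve.append((t, s))
    return curve

# Toy data: six tracked combinations; two were still cited when tracking ended.
weeks = [2, 3, 4, 5, 6, 6]
lost = [True, True, True, False, True, False]
for week, p in kaplan_meier(weeks, lost):
    print(week, round(p, 3))
```

The half-life is the point where this curve crosses 0.5 (week 4 in the toy data); the Stacker/Scrunch estimate is the same read-off against their real tracking data.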

4.5 weeks. Not months. Not quarters. Weeks.

To put this in perspective: if your content gets cited in 100 AI responses today, you can expect it to appear in roughly 50 of those same responses four and a half weeks from now — even if nothing about your content changed.
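Under the standard exponential-decay model implied by a constant half-life, that arithmetic is easy to check. A minimal sketch (the 4.5-week figure is the reported average; the constant-decay assumption is a simplification):

```python
def surviving_fraction(weeks: float, half_life: float = 4.5) -> float:
    """Expected fraction of citation appearances still held after `weeks`,
    assuming exponential decay with the given half-life."""
    return 0.5 ** (weeks / half_life)

# At one half-life, half the appearances remain.
print(round(surviving_fraction(4.5), 2))
# After roughly one month (~4.33 weeks), about half remain, which is
# consistent with the reported 40-60% monthly source rotation.
print(round(surviving_fraction(4.33), 2))
# After a quarter (13 weeks), only ~13-14% remain.
print(round(surviving_fraction(13), 2))
```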

Key figures at a glance:
4.5 weeks: citation half-life (average across engines)
40-60%: of cited sources rotate every month
95%: semantic consistency (the answers themselves don't change)
76%: of cited pages updated within 30 days
[Chart: Citation Decay Curve (the 4.5-week half-life), modeled from Stacker/Scrunch survival analysis.]

The decay isn't uniform across platforms. Synthesis of multiple studies suggests platform-specific half-lives that vary significantly.

Platform-Specific Citation Half-Lives
The same content can be stable on one engine while disappearing from another.
ChatGPT: ~3.4 weeks
Google AI Overviews: ~4–5 weeks
Perplexity: ~5.8 weeks

The monthly turnover rate

Zooming out from half-life to monthly snapshots, the picture is equally striking. Industry estimates from EMARKETER suggest that 40-60% of cited sources change month-to-month across Google AI Mode and ChatGPT. Ahrefs confirmed the scale of this churn independently in a verified study of 43,000 AI Overview keywords, finding that AI Overview content changes 70% of the time for the same query, with 45.5% of citations replaced by entirely new sources — not updated versions of the same pages, but completely different URLs. Separate industry tracking from Superlines reported a 35% decline in AI visibility scores over just five weeks, consistent with the half-life data.

The semantic consistency paradox

Here's the counterintuitive finding that makes the turnover data particularly interesting. Despite the constant rotation of sources, AI engines almost never change their conclusions. The Ahrefs study measured semantic similarity across AI Overview responses for the same keywords over time and found 0.95 cosine similarity — meaning the substance of the answers stayed 95% consistent even as the citations underneath them churned.

AI engines don't change their minds. They change their sources.

Your content can be factually correct, well-structured, and authoritative — and still lose its citation slot because something newer, better-formatted, or more recently updated appeared. The problem isn't accuracy. It's structural competitiveness in a system that rotates its bibliography constantly.
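The 0.95 figure above is cosine similarity: a measure of how closely two texts point in the same direction when represented as vectors. Ahrefs presumably computed it over embeddings; a bag-of-words sketch with hypothetical answer texts shows the same mechanic:

```python
from collections import Counter
from math import sqrt

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity over word-count vectors.

    Production systems would use embedding vectors; word counts are a
    crude but illustrative proxy."""
    va, vb = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Hypothetical AI answers to the same query, five weeks apart: the wording
# (and the citations behind it) shifted, but the substance did not.
week_1 = "SIEM platforms aggregate and analyze security logs for threat detection"
week_6 = "SIEM platforms aggregate and analyze security logs to detect threats"
print(round(cosine_similarity(week_1, week_6), 2))
```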

The freshness dominance

Multiple independent studies converge on a clear finding: content freshness is one of the strongest predictors of citation selection. Analysis shows that 76.4% of ChatGPT-cited pages were updated within 30 days. AI-cited content is 25.7% fresher on average than organically ranked content (368 days average age versus 432 days). And research from Amsive indicates that 50% of top-cited content is less than 13 weeks old.

The practical implication is what researchers call the "3-month cliff" — citations drop sharply when content exceeds 90 days without a substantive update. Content older than a quarter is statistically disadvantaged in AI citation selection, particularly in fast-moving verticals like cybersecurity and marketing technology.
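The 3-month cliff translates directly into a maintenance queue. A minimal sketch (the 90-day threshold is the figure cited above; the page paths and dates are hypothetical):

```python
from datetime import date

def refresh_queue(pages: dict[str, date], today: date,
                  cliff_days: int = 90) -> list[str]:
    """Return URLs whose last substantive update is past the cliff,
    oldest (most urgent) first."""
    age = {url: (today - updated).days for url, updated in pages.items()}
    return sorted((u for u, days in age.items() if days > cliff_days),
                  key=lambda u: -age[u])

# Hypothetical content inventory with last substantive-update dates.
pages = {
    "/siem-comparison": date(2026, 1, 10),
    "/soar-vs-siem": date(2026, 3, 20),
    "/content-roi": date(2025, 11, 2),
}
print(refresh_queue(pages, today=date(2026, 4, 15)))
```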

Platform divergence: the Reddit case study

Perhaps the most dramatic illustration of citation instability comes from Semrush's 13-week study, which tracked weekly citations for over 230,000 prompts across ChatGPT, Google AI Mode, and Perplexity, analyzing more than 100 million total citations.

Between early August and mid-September 2025, Reddit citations on ChatGPT collapsed from approximately 60% to 10%. Wikipedia saw a similar drop: 55% to below 20%. But during the exact same period, Reddit citation rates on Google AI Mode and Perplexity remained stable.

This wasn't a content quality issue — Reddit's content didn't suddenly get worse. It was a platform-level shift, likely tied to a model update or retrieval system change at OpenAI. A domain that appeared rock-solid on one engine was simultaneously collapsing on another.

The lesson is clear: single-engine citation tracking is a blind spot. Multi-engine monitoring isn't a nice-to-have — it's a requirement for understanding what's actually happening to your content's AI visibility.


Why Citations Decay: Three Mechanisms

A 4.5-week half-life is a useful headline, but it doesn't tell you why your content lost its citation slot — and without understanding the mechanism, you can't fix the problem. Research points to three distinct decay types, each requiring a different response.


Statistical Decay

Your data gets stale. Content citing "2024 statistics" loses citation priority the moment a competitor publishes 2025 data. The 3-month cliff is largely driven by this mechanism.

Fix: Disciplined quarterly refresh cadence. Update the substance, not just the timestamp.

Structural Decay

AI engines evolve what formats they prefer for extraction. Tables today, lists tomorrow. Your information is accurate, but the engine's extraction preferences shifted.

Fix: Monitor extraction patterns. Restructure content when formats change.

Competitive Decay

Someone publishes deeper or better-structured content on the same topic. 45.5% of citation replacements are entirely new sources displacing old ones.

Fix: Continuous competitive intelligence on the prompts that matter to your business.

Statistical decay responds to disciplined refreshing. If your cybersecurity market overview references last year's breach statistics and a competitor publishes this year's data, the AI engine will prefer the fresher source regardless of how well your content is structured. Quarterly at minimum, monthly for data-heavy content in rapidly evolving fields — and the refresh needs to be substantive.

Structural decay is harder to detect because it's invisible at the content level. Your information is still accurate. Your page still ranks well in traditional search. But the AI engine's extraction preferences have shifted, and your content's format no longer matches what the engine is looking for. This is one area where longitudinal tracking is particularly valuable, because structural preferences shift gradually and are only visible over time.

Competitive decay operates on a much faster timeline in AI citation than in traditional search. In traditional search, a new competitor might take months to outrank you. In AI citation, a well-structured article published yesterday can displace content you've maintained for a year — because the AI engine evaluates each query fresh, without the historical inertia that traditional search algorithms build in.


What Gets Cited (and What Doesn't)

Beyond understanding decay, a growing body of research identifies the content features that correlate with getting cited in the first place. These findings come from multiple independent studies with varying methodologies and sample sizes.

Citation Factors, Ranked by Evidence Strength
1. Content Position (evidence: strong). 44.2% of citations come from the first 30% of content. Front-load claims, data, and definitions.
2. Extractable Structure (evidence: strong). Tables boost citation probability 2.5x; listicles account for 50% of top citations; FAQ sections gain +30% with schema.
3. Distribution Breadth (evidence: moderate). Earned editorial distribution raised citation from 8% to 34%; 12.3-week citation life on Perplexity (3x the average).
4. Page-Level Authority (evidence: moderate). Citation happens at the page level, not the domain level; author entity authority matters more than domain rating.
5. Detailed Schema Markup (evidence: mixed). Attribute-rich schema outperforms by 20 percentage points; generic CMS defaults have no measurable effect.

Position matters — front-load everything

Kevin Indig's analysis of 3 million ChatGPT responses containing 30 million citations revealed a consistent "ski ramp" pattern: citation probability is highest at the top of the page and declines steadily. 44.2% of all citations originate from the first 30% of a page's text.

Traditional content marketing often uses a narrative build-up structure — context first, then analysis, then the key insight near the end. AI engines don't read that way. They extract. And they extract disproportionately from the beginning. Front-load your strongest claims, data points, and definitions. Put your primary thesis in the first 100 words.

Structure matters — extractable formats win

FarAndWide's analysis of 170+ AI-cited sources found that tables boost citation probability by 2.5x and listicles account for 50% of top AI citations. FAQ sections show a 30% citation rate improvement when paired with FAQPage schema. The common thread is extractability — AI engines need to pull a discrete, self-contained piece of information from your content. Content that's easy to extract gets cited more often than content buried in flowing prose.

Authority matters — but at page level, not domain level

Research from Mike King and iPullRank directly contradicts a common assumption inherited from traditional SEO: that domain authority drives AI citation. It doesn't — at least not directly. Citation happens at the page level. What does correlate is entity authority: the degree to which the author is a recognized entity connected to organizations and topics. Named, credentialed authors with established expertise show stronger citation rates than anonymous or generic bylines.

Schema matters — but only when it's detailed

BrightEdge and AccuraCast found that 81% of cited pages include schema markup. But a December 2024 study from Search Atlas found no correlation between schema coverage and sustained citation durability. EverTune's research resolved the contradiction: generic CMS-default schema has no measurable effect, while attribute-rich schema with specific information outperforms by 20 percentage points.

For your content team, this means the "add schema" checklist item isn't sufficient. Someone needs to hand-craft schema for your most important pages with specific, detailed attributes — pricing, ratings, author credentials, product specifications. That's a content investment, not a plugin toggle.
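The distinction looks like this in practice. A sketch of generic versus attribute-rich Article markup, built as JSON-LD from Python (every specific value here is a hypothetical placeholder, not an example from the cited studies):

```python
import json

# What a CMS plugin typically emits: present, but carrying no real information.
generic = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "SIEM Buyer's Guide",
}

# Hand-crafted, attribute-rich markup: publication dates, a named and
# credentialed author, and an organizational affiliation that engines can
# resolve to known entities.
detailed = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "SIEM Buyer's Guide for Mid-Market Companies",
    "datePublished": "2026-01-15",
    "dateModified": "2026-04-02",
    "author": {
        "@type": "Person",
        "name": "Jane Analyst",  # hypothetical byline
        "jobTitle": "Principal Security Researcher",
        "affiliation": {"@type": "Organization", "name": "Example Corp"},
    },
}
print(json.dumps(detailed, indent=2))
```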

Distribution matters — maybe the most

Perhaps the most counterintuitive finding: distribution breadth may matter more for citation persistence than any single on-page optimization. Stacker's research found that earned editorial distribution raised citation frequency from 8% to 34%. On Perplexity, distributed content showed a citation life of 12.3 weeks — roughly three times the overall average. Press release syndication, by contrast, was cited only 0.04% of the time — effectively worthless.

This challenges the common "fix your blog" framing. On-page optimization is necessary but not sufficient. A distribution strategy that places your core claims across multiple authoritative publications may do more for persistence than any amount of on-page restructuring.

A word of caution: the cannibalization risk

Before rushing to optimize everything, it's worth noting a counterintuitive finding from the Princeton GEO study. While GEO optimization methods boosted visibility by up to 40% for most content, top-ranked content actually saw a 30.3% visibility decrease when the same techniques were applied. The researchers attributed this to a cannibalization effect — aggressively optimizing content that's already performing well can disrupt the signals that made it perform in the first place.

The implication: optimization is not uniformly positive. Content that's already getting cited may need a lighter touch than content that isn't. And as Lily Ray of Amsive has warned, GEO-specific tactics pursued at the expense of traditional SEO fundamentals can destroy the base that makes AI visibility possible. The strongest approach is to build GEO on top of SEO, not instead of it.


What Stays Cited: The Persistence Gap

The research synthesized in this report reveals a significant gap in the field's knowledge. Most existing studies focus on what gets cited. Almost none focus on what stays cited. These are different questions with potentially different answers.

The Princeton GEO study, published in November 2023 and presented at ACM KDD 2024, is the academic foundation for the entire field of generative engine optimization. Its findings are rigorous: GEO optimization methods can boost visibility by up to 40%, with statistics addition showing +37% improvement. The most dramatic result — citation of external sources showing +115.1% improvement — came with a critical caveat that's often omitted: that boost only applied to content ranked 5th or lower. For already top-performing content, the same optimizations actually decreased visibility by 30.3%. The study's results are more nuanced than the headlines suggest. And more importantly for our purposes, the study measured visibility at a single point in time. It answered "how do you get cited?" — it did not address whether those visibility gains persisted over weeks or months.

The Stacker/Scrunch half-life study is the closest thing to persistence-specific research, but it tracked only eight articles, and Stacker has a commercial interest in demonstrating that their distribution network works. The Semrush 13-week study has impressive scale — 230,000+ prompts, 100 million+ citations — but 13 weeks is barely long enough to establish persistence patterns. It's one cycle. You need multiple cycles to distinguish signal from noise.

The open questions

This gap in the research isn't just an academic concern. It has direct implications for how companies invest in content. Here are the questions that existing research cannot yet answer:

Does on-page optimization affect persistence, or only initial citation? The Princeton study showed that adding statistics and external citations boosts visibility. Nobody has tested whether those same optimizations help content stay cited longer.

What's the causal relationship between content features and citation longevity? All existing persistence data is correlational. Does FAQ schema cause longer citation life, or is it merely correlated with other factors that do?

How do engine model updates disrupt citation graphs? The Reddit/ChatGPT collapse shows that model updates can cause abrupt shifts. Nobody is systematically tracking the pattern of disruption and recovery.

Is the freshness signal about actual content updates or just schema timestamps? If engines use the dateModified field as a proxy, updating the date without changing content might temporarily preserve citations. A testable hypothesis with significant tactical implications.

Does distribution breadth cause persistence, or is it confounded by brand authority? Content distributed through Stacker's network tends to come from established brands. Is distribution the active ingredient, or is brand authority doing the work?

How does persistence vary across verticals? Does cybersecurity content behave like martech content? Vertical-specific persistence patterns could fundamentally change how companies prioritize content investment.

These aren't abstract research questions. They're the questions that determine whether a company's content strategy is built on evidence or assumption. And answering them requires something no snapshot study can provide: longitudinal data collected consistently over months, across multiple engines, with enough granularity to isolate individual variables.


A Framework for Citation Persistence

Despite the gaps in current research, the existing evidence is sufficient to build a practical framework. The recommendations below are graded by confidence level based on the strength and independence of the supporting evidence.

High Confidence — Multiple independent sources

1. Front-load claims and data in the first 30% of content. Move your primary thesis, key statistics, and authoritative claims to the opening paragraphs. Write for extraction, not suspense. (Indig, 2025: 3M responses, 30M citations)
2. Refresh content substantively every quarter at minimum. Update the data, the examples, and the analysis — not just the publication date. 50% of cited content is less than 13 weeks old. (Amsive, Jasper, multiple independent studies)
3. Use structured, extractable formats. Tables, numbered lists, FAQ sections, and clear heading hierarchies outperform narrative formats for AI citation. (FarAndWide, Princeton GEO study)
4. Track multiple engines, not just one. Citation persistence varies dramatically by platform. Single-engine monitoring creates dangerous blind spots. (Semrush, Stacker)
5. Maintain strong traditional SEO as the foundation. GEO optimization builds on top of SEO fundamentals, not instead of them. Fix foundations first. (Lily Ray / Amsive)

Medium Confidence — Directional but limited sources

6. Invest in distribution breadth across authoritative domains. Earned editorial coverage may matter more for persistence than on-page optimization. Treat as a strong signal, not proven fact. (Stacker; caveat: commercially motivated)
7. Build entity authority for content authors. Named authors with recognized expertise and organizational affiliations correlate with higher citation rates. (Mike King / iPullRank; Google Search Central)
8. Use detailed, attribute-rich schema — not generic CMS defaults. Hand-craft schema for important pages with specific attributes. Auto-generated boilerplate has no measurable effect. (BrightEdge, EverTune; mixed evidence)

Emerging — Plausible but untested

9. Engine-specific refresh cadences may optimize persistence. Early data suggests biweekly for ChatGPT, every six weeks for Perplexity, and monthly for Google AI Overviews. Derived estimates, not primary research. (AuthorityTech synthesis)
10. Co-citation with authoritative sources may extend persistence. A logical inference from the distribution-breadth findings, but no direct evidence yet.
11. Updating dateModified schema may temporarily reset freshness signals. Plausible mechanism, but untested. If confirmed, a significant tactical finding.

The Case for Longitudinal Tracking

The recommendations in this report are the best available given current research. But they share a fundamental limitation: they're derived from snapshots and short-duration studies applied to a system that changes 40-60% every month.

A one-time audit captures a single frame of a movie. It tells you where you stand today, but it can't tell you whether your position is improving or deteriorating, which competitors are gaining citation share, or how the last model update affected your visibility. By the time you act on the findings, the underlying data may have already shifted.

Generic best practices — add schema, use FAQ format, front-load your claims — apply equally to every company in every vertical. They don't tell you which specific prompts matter for your buyer, which competitors are gaining citation share in your vertical, or how your content responds to engine updates. They're the starting line, not the strategy.

Consider a concrete example. A cybersecurity company publishes a quarterly SIEM comparison guide. A one-time audit tells them they're cited on Perplexity but not ChatGPT. Useful, but static. Longitudinal tracking over three months reveals something richer: their citations on Perplexity spike after each update but decay within four weeks, a competitor's guide published across three editorial sites maintains citations for 10+ weeks, and ChatGPT stopped citing comparison-format content entirely after a September model update. That's not a snapshot — it's a playbook. It tells them exactly where to invest: refresh cadence for Perplexity, distribution strategy to match the competitor, and a format shift for ChatGPT. None of that is visible from a single audit.
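The tracking itself needs no heavy infrastructure to start. A minimal sketch of week-over-week citation retention computed from stored weekly snapshots (the data shape and URLs are hypothetical):

```python
def weekly_retention(snapshots: dict[int, set[str]]) -> dict[int, float]:
    """For each week, the fraction of the prior week's cited URLs
    that are still cited.

    snapshots: week number -> set of our URLs found cited that week
    """
    weeks = sorted(snapshots)
    out = {}
    for prev, cur in zip(weeks, weeks[1:]):
        if snapshots[prev]:
            out[cur] = len(snapshots[prev] & snapshots[cur]) / len(snapshots[prev])
    return out

# Toy snapshots for one engine: which of our URLs each weekly check found.
perplexity = {
    1: {"/siem-comparison", "/soar-vs-siem", "/content-roi"},
    2: {"/siem-comparison", "/soar-vs-siem"},
    3: {"/siem-comparison", "/mdr-guide"},
}
print(weekly_retention(perplexity))
```

Run per engine, per prompt set, this kind of log is exactly what turns a one-time audit into the patterns described above.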

Compound Intelligence: What Longitudinal Tracking Produces Over Time
Week 1 (Baseline): which engines cite you, for which queries, and who shares the slot.
Month 3 (Patterns): what persists, what decays, how fast, and how model updates disrupt it.
Month 6 (Playbook): a data-driven strategy specific to your content, vertical, and engines.

Longitudinal tracking produces compound intelligence. Week one gives you a baseline. Month three gives you patterns. Month six gives you a playbook — a data-driven understanding of what works for your specific content, in your specific vertical, on each specific engine.

This isn't theoretical. The research in this report demonstrates that citation behavior varies dramatically by platform, by content type, by vertical, and by time period. A one-size-fits-all approach based on industry averages will, by definition, be wrong for most specific cases. The companies that start building proprietary citation intelligence now will have months of accumulated insight by the time their competitors realize they need it.

The gap between what the industry knows and what individual companies need to know is exactly where longitudinal research lives. Closing that gap isn't a one-time project. It's an ongoing discipline — and the returns compound over time.

Start building your citation intelligence

Quoted builds longitudinal research programs that produce weekly citation tracking across all five engines (ChatGPT, Perplexity, Google AI Overviews, Claude, and Gemini), competitive citation share analysis, engine update impact assessments, and quarterly persistence playbooks tailored to your vertical and buyer prompts.


Sources

Academic Research
Aggarwal, P., Murahari, V., et al. "GEO: Generative Engine Optimization." Princeton, Georgia Tech, Allen Institute for AI, IIT Delhi. November 2023; presented at ACM KDD 2024. [Peer-reviewed]
No commercial bias. Foundational GEO study — measures initial visibility, not persistence.
Large-Scale Industry Research
Semrush. "The Most-Cited Domains in AI: A 3-Month Study." October 2025. 230,000+ prompts, 100M+ citations, 13-week duration. [Verified]
Moderate commercial bias (Semrush sells SEO tools). Partial persistence data (13-week window).
Indig, Kevin / Growth Memo. "44% of ChatGPT Citations Come from the First Third of Content." 2025. 3M responses, 30M citations. [Verified]
Low commercial bias. Independent practitioner research.
Conductor. AEO Benchmark Study. 3.3 billion sessions, 13,000 domains.
Moderate commercial bias (sells SEO platform). Not persistence-specific.
Persistence-Specific Research
Stacker/Scrunch. "Most AI Citations Fade in Weeks. Distributed Content Lasts Twice as Long." 2025. 944 prompt-platform combinations, survival analysis. [Verified]
High commercial bias (Stacker sells content distribution). Small sample (8 articles). Most rigorous persistence study publicly available.
Volatility & Change Rate Studies
Ahrefs. "AI Overviews Change Every 2 Days (But Never Change Their Mind)." 2025. 43,000 keywords. [Verified]
Moderate commercial bias. Key finding: 70% content change rate with 0.95 semantic consistency.
EMARKETER. AI citation turnover reporting. [Unverified primary]
Referenced in secondary sources. Findings directionally consistent with verified Ahrefs data.
Superlines. AI visibility decline measurement. [Unverified primary]
Findings directionally consistent with verified half-life data.
Content Structure & Optimization
FarAndWide. "We Analyzed 170+ AI-Cited Sources Across 5 Content Types." 2025-2026.
Low commercial bias. Medium methodology rigor.
King, Mike / iPullRank. E-E-A-T and entity authority research.
Low commercial bias. Key finding: citation at page level, not domain level.
BrightEdge/AccuraCast. Schema markup and AI citation correlation studies.
Moderate commercial bias.
EverTune. Schema quality vs. presence study.
Key finding: generic CMS schema has no effect; attribute-rich schema outperforms by 20pp.
Search/Atlas. Schema and citation durability study. December 2024.
Key finding: no correlation between schema coverage and sustained citation durability.
Canel, Fabrice (Microsoft Bing). SMX Munich 2025 presentation.
Official confirmation that schema helps LLM content understanding.
Practitioner Analysis
Ray, Lily / Amsive. "Your GEO Strategy Might Be Destroying Your SEO." 2025-2026.
Moderate commercial bias. Key warning: GEO without SEO foundations is counterproductive.
AuthorityTech. Platform-specific half-life synthesis.
Derived estimates, not primary research.