The Half-Life Report
A Research Synthesis on AI Citation Persistence
The Coming Wave of AEO
Something fundamental has changed in how buyers find information, and most marketing teams haven't caught up yet.
AI search engines — ChatGPT, Perplexity, Google AI Overviews, Claude, Gemini — are increasingly where B2B buyers start their research. Instead of scanning ten blue links and choosing which to click, they ask a question and get a synthesized answer with sources inline. The buyer reads the answer, maybe clicks one or two citations, and moves on. The entire discovery funnel compressed into a single interaction.
The numbers are still early. Conductor's analysis of 3.3 billion sessions across 13,000 domains found that AI referrals account for roughly 1% of total traffic — small in absolute terms, but growing rapidly and disproportionately influential. These aren't casual browsers. AI search users tend to be further along in their buying journey, asking specific questions that signal intent: "best SIEM platform for mid-market companies," "how to measure content ROI across channels," "SOAR vs. SIEM for incident response."
For a decade, traditional SEO optimized for rankings that persisted for months or years. You published a strong piece, earned backlinks, and watched it climb to page one. Once there, it stayed — barring a major algorithm update or a competitor with a bigger content budget. The economics were predictable: invest upfront, harvest traffic over time.
AI citation works nothing like this.
When an AI engine answers a query, it doesn't consult a fixed index the way Google's traditional search does. It retrieves content dynamically, selects sources based on the specific query and context, and assembles an answer in real time. The sources it cites for the same question today may be entirely different from the ones it cited last week. There is no "page one" to hold. There is no ranking to defend. There is only the question of whether your content gets selected this time, for this query, on this engine.
The business implication is straightforward: your content investment has a shorter shelf life than you think. The playbook that worked for traditional search — publish, optimize, maintain — is necessary but no longer sufficient. In traditional SEO, intelligence was static: you ranked or you didn't. In AI citation, intelligence is compound — every week of tracking reveals patterns that the previous week couldn't, and the companies collecting that data now will have months of proprietary insight by the time their competitors realize they need it. Understanding what makes content get cited is only half the challenge. The other half, and the one almost nobody is studying rigorously, is understanding what makes content stay cited.
That's what this report is about.
The Core Finding: Citations Decay — Fast
The most important number in AI citation research is 4.5 weeks. That's the approximate half-life of an AI citation — the time it takes for a piece of content to lose half of its citation appearances across AI engines.
This finding comes from the most rigorous persistence-specific study publicly available: a longitudinal analysis by Stacker and Scrunch that tracked eight articles across 944 prompt-platform combinations on five AI engines using survival analysis, a statistical method borrowed from medical research that measures how long something "survives" before an event occurs. In this case, the event is losing a citation slot.
4.5 weeks. Not months. Not quarters. Weeks.
To put this in perspective: if your content gets cited in 100 AI responses today, you can expect it to appear in roughly 50 of those same responses five weeks from now — even if nothing about your content changed.
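Both the estimation method and the arithmetic are easy to make concrete. The sketch below fits a Kaplan-Meier curve to synthetic citation-lifetime data (the lifelines library stands in for whatever Stacker and Scrunch actually used, and the week counts are invented), then projects forward assuming clean exponential decay at the 4.5-week average:

```python
# Sketch 1: estimate a citation half-life with survival analysis.
# The durations are synthetic stand-ins for tracked prompt-platform combinations:
# weeks until the citation slot was lost, and whether the loss was observed (1)
# or the observation was censored when the study window ended (0).
from lifelines import KaplanMeierFitter

weeks_tracked = [2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 9, 12, 13, 13]
citation_lost = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,  0,  0]

kmf = KaplanMeierFitter()
kmf.fit(weeks_tracked, event_observed=citation_lost)
print(f"Estimated half-life: {kmf.median_survival_time_:.1f} weeks")

# Sketch 2: project expected citations forward, assuming constant exponential
# decay at the reported 4.5-week average. Real decay is lumpier and varies by engine.
def expected_citations(initial: int, weeks: float, half_life: float = 4.5) -> float:
    return initial * 0.5 ** (weeks / half_life)

print(round(expected_citations(100, 5)))   # ~46 of 100 responses still cite you at week 5
print(round(expected_citations(100, 13)))  # ~13 remain after a quarter with no refresh
```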
The decay isn't uniform across platforms. Synthesis of multiple studies suggests platform-specific half-lives that vary significantly.
The monthly turnover rate
Zooming out from half-life to monthly snapshots, the picture is equally striking. Industry estimates from EMARKETER suggest that 40-60% of cited sources change month-to-month across Google AI Mode and ChatGPT. Ahrefs confirmed the scale of this churn independently in a verified study of 43,000 AI Overview keywords, finding that AI Overview content changes 70% of the time for the same query, with 45.5% of citations replaced by entirely new sources — not updated versions of the same pages, but completely different URLs. Separate industry tracking from Superlines reported a 35% decline in AI visibility scores over just five weeks, consistent with the half-life data.
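None of these studies publish their pipelines, but the underlying metrics are simple to compute. A hedged sketch of how month-over-month churn and the share of entirely new URLs could be measured from two snapshots of cited sources for one prompt (the URLs are hypothetical):

```python
# Compare two monthly snapshots of cited URLs for the same prompt.
# Hypothetical data; real tracking would aggregate across many prompts and engines.
last_month = {"example.com/siem-guide", "vendor-a.com/comparison", "wiki.org/siem"}
this_month = {"example.com/siem-guide", "vendor-b.com/2025-benchmark", "analyst.io/siem-review"}

dropped = last_month - this_month
new_sources = this_month - last_month

turnover = len(dropped) / len(last_month)          # share of last month's citations lost
new_share = len(new_sources) / len(this_month)     # share of this month's citations that are brand-new URLs

print(f"Turnover: {turnover:.0%}, new-source share: {new_share:.0%}")
```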
The semantic consistency paradox
Here's the counterintuitive finding that makes the turnover data particularly interesting. Despite the constant rotation of sources, AI engines almost never change their conclusions. The Ahrefs study measured semantic similarity across AI Overview responses for the same keywords over time and found 0.95 cosine similarity — meaning the substance of the answers stayed 95% consistent even as the citations underneath them churned.
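Ahrefs has not published its pipeline, but the measurement itself is reproducible in a few lines. A sketch, assuming an off-the-shelf embedding model (the sentence-transformers model named here is an illustrative choice, not necessarily what Ahrefs used):

```python
# Measure semantic consistency between two AI answers to the same query.
# Model choice and answer text are illustrative only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

answer_week_1 = "The leading mid-market SIEM platforms are X, Y, and Z, chosen for pricing and integrations."
answer_week_4 = "For mid-market teams, X, Y, and Z remain the strongest SIEM options on cost and integrations."

embeddings = model.encode([answer_week_1, answer_week_4])
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"Cosine similarity: {similarity:.2f}")  # high similarity = same conclusion, even if citations churned
```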
Your content can be factually correct, well-structured, and authoritative — and still lose its citation slot because something newer, better-formatted, or more recently updated appeared. The problem isn't accuracy. It's structural competitiveness in a system that rotates its bibliography constantly.
The freshness dominance
Multiple independent studies converge on a clear finding: content freshness is one of the strongest predictors of citation selection. Analysis shows that 76.4% of ChatGPT-cited pages were updated within 30 days. AI-cited content is 25.7% fresher on average than organically ranked content (368 days average age versus 432 days). And research from Amsive indicates that 50% of top-cited content is less than 13 weeks old.
The practical implication is what researchers call the "3-month cliff" — citations drop sharply when content exceeds 90 days without a substantive update. Content older than a quarter is statistically disadvantaged in AI citation selection, particularly in fast-moving verticals like cybersecurity and marketing technology.
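The 90-day threshold is easy to monitor. A minimal sketch of a content-age audit, assuming you can export each page's last substantive update date from your CMS (the URLs, dates, and field names are placeholders):

```python
# Flag pages approaching or past the 90-day "cliff".
# The inventory structure and dates are hypothetical placeholders for a CMS export.
from datetime import date

pages = [
    {"url": "/blog/siem-buyers-guide", "last_substantive_update": date(2025, 7, 1)},
    {"url": "/blog/soar-vs-siem",      "last_substantive_update": date(2025, 9, 20)},
]

today = date(2025, 10, 15)
for page in pages:
    age_days = (today - page["last_substantive_update"]).days
    if age_days > 90:
        print(f"PAST CLIFF   ({age_days}d): {page['url']}")
    elif age_days > 60:
        print(f"REFRESH SOON ({age_days}d): {page['url']}")
```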
Platform divergence: the Reddit case study
Perhaps the most dramatic illustration of citation instability comes from Semrush's 13-week study, which tracked weekly citations for over 230,000 prompts across ChatGPT, Google AI Mode, and Perplexity, analyzing more than 100 million total citations.
Between early August and mid-September 2025, Reddit citations on ChatGPT collapsed from approximately 60% to 10%. Wikipedia saw a similar drop: 55% to below 20%. But during the exact same period, Reddit citation rates on Google AI Mode and Perplexity remained stable.
This wasn't a content quality issue — Reddit's content didn't suddenly get worse. It was a platform-level shift, likely tied to a model update or retrieval system change at OpenAI. A domain that appeared rock-solid on one engine was simultaneously collapsing on another.
The lesson is clear: single-engine citation tracking is a blind spot. Multi-engine monitoring isn't a nice-to-have — it's a requirement for understanding what's actually happening to your content's AI visibility.
Why Citations Decay: Three Mechanisms
A 4.5-week half-life is a useful headline, but it doesn't tell you why your content lost its citation slot — and without understanding the mechanism, you can't fix the problem. Research points to three distinct decay types, each requiring a different response.
Statistical Decay
Your data gets stale. Content citing "2024 statistics" loses citation priority the moment a competitor publishes 2025 data. The 3-month cliff is largely driven by this mechanism.
Structural Decay
AI engines shift which formats they prefer to extract from: tables today, lists tomorrow. Your information is accurate, but the engine's extraction preferences have changed.
Competitive Decay
Someone publishes deeper or better-structured content on the same topic. 45.5% of citation replacements are entirely new sources displacing old ones.
Statistical decay responds to disciplined refreshing. If your cybersecurity market overview references last year's breach statistics and a competitor publishes this year's data, the AI engine will prefer the fresher source regardless of how well your content is structured. Refresh quarterly at minimum, monthly for data-heavy content in rapidly evolving fields — and the refresh needs to be substantive.
Structural decay is harder to detect because it's invisible at the content level. Your information is still accurate. Your page still ranks well in traditional search. But the AI engine's extraction preferences have shifted, and your content's format no longer matches what the engine is looking for. This is one area where longitudinal tracking is particularly valuable, because structural preferences shift gradually and are only visible over time.
Competitive decay operates on a much faster timeline in AI citation than in traditional search. In traditional search, a new competitor might take months to outrank you. In AI citation, a well-structured article published yesterday can displace content you've maintained for a year — because the AI engine evaluates each query fresh, without the historical inertia that traditional search algorithms build in.
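Because each mechanism calls for a different fix, it helps to triage a lost citation before reacting. The rule of thumb below is an illustrative heuristic of our own, not something any of the cited studies validated:

```python
# Rough triage of a lost citation into the three decay types.
# Illustrative heuristic only; the signals and thresholds are assumptions.
def classify_decay(days_since_update: int,
                   replaced_by_new_url: bool,
                   engine_format_shift: bool) -> str:
    if engine_format_shift:
        return "structural"   # engine's extraction preferences changed; reformat, don't rewrite
    if days_since_update > 90:
        return "statistical"  # data went stale; schedule a substantive refresh
    if replaced_by_new_url:
        return "competitive"  # a newer, better-structured source displaced you; deepen the content
    return "unclear"          # keep monitoring before acting

print(classify_decay(days_since_update=120, replaced_by_new_url=True, engine_format_shift=False))
```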
What Gets Cited (and What Doesn't)
Beyond understanding decay, a growing body of research identifies the content features that correlate with getting cited in the first place. These findings come from multiple independent studies with varying methodologies and sample sizes.
Position matters — front-load everything
Kevin Indig's analysis of 3 million ChatGPT responses containing 30 million citations revealed a consistent "ski ramp" pattern: citation probability is highest at the top of the page and declines steadily. 44.2% of all citations originate from the first 30% of a page's text.
Traditional content marketing often uses a narrative build-up structure — context first, then analysis, then the key insight near the end. AI engines don't read that way. They extract. And they extract disproportionately from the beginning. Front-load your strongest claims, data points, and definitions. Put your primary thesis in the first 100 words.
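One way to operationalize the ski-ramp finding is to check where your key claims actually sit on the page. A deliberately naive sketch, using the thresholds from the figures above (first 100 words, first 30% of text):

```python
# Check whether a page's key claims sit in the citation-rich top of the text.
# Substring matching is intentionally simplistic and for illustration only.
def frontload_report(page_text: str, key_claims: list[str]) -> None:
    first_100_words = " ".join(page_text.split()[:100]).lower()
    top_30_percent = page_text[: int(len(page_text) * 0.3)].lower()

    for claim in key_claims:
        c = claim.lower()
        if c in first_100_words:
            print(f"OK (first 100 words): {claim}")
        elif c in top_30_percent:
            print(f"OK (top 30% of page): {claim}")
        else:
            print(f"BURIED:               {claim}")
```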
Structure matters — extractable formats win
FarAndWide's analysis of 170+ AI-cited sources found that tables boost citation probability by 2.5x and listicles account for 50% of top AI citations. FAQ sections show a 30% citation rate improvement when paired with FAQPage schema. The common thread is extractability — AI engines need to pull a discrete, self-contained piece of information from your content. Content that's easy to extract gets cited more often than content buried in flowing prose.
Authority matters — but at page level, not domain level
Research from Mike King and iPullRank directly contradicts a common assumption inherited from traditional SEO: that domain authority drives AI citation. It doesn't — at least not directly. Citation happens at the page level. What does correlate is entity authority: the degree to which the author is a recognized entity connected to organizations and topics. Named, credentialed authors with established expertise show stronger citation rates than anonymous or generic bylines.
Schema matters — but only when it's detailed
BrightEdge and AccuraCast found that 81% of cited pages include schema markup. But a December 2024 study found no correlation between schema coverage and sustained citation durability. EverTune's research resolved the contradiction: generic CMS-default schema has no measurable effect, while attribute-rich schema with specific information outperforms by 20 percentage points.
For your content team, this means the "add schema" checklist item isn't sufficient. Someone needs to hand-craft schema for your most important pages with specific, detailed attributes — pricing, ratings, author credentials, product specifications. That's a content investment, not a plugin toggle.
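For concreteness, here is the shape of that difference, expressed as JSON-LD built in Python. The property names are standard schema.org vocabulary; every value is invented for illustration:

```python
# Attribute-rich Article schema versus a bare CMS default, expressed as JSON-LD.
# schema.org property names are real; all values are illustrative.
import json

generic_default = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "SIEM Buyer's Guide",
}

attribute_rich = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "SIEM Buyer's Guide for Mid-Market Security Teams",
    "datePublished": "2025-01-14",
    "dateModified": "2025-10-02",
    "author": {
        "@type": "Person",
        "name": "Jane Doe",
        "jobTitle": "Principal Security Analyst",
        "sameAs": "https://www.linkedin.com/in/example",
    },
    "about": ["SIEM", "security operations", "log management"],
}

print(json.dumps(attribute_rich, indent=2))
```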
Distribution matters — maybe the most
Perhaps the most counterintuitive finding: distribution breadth may matter more for citation persistence than any single on-page optimization. Stacker's research found that earned editorial distribution raised citation frequency from 8% to 34%. On Perplexity, distributed content showed a citation life of 12.3 weeks — roughly three times the overall average. Press release syndication, by contrast, was cited only 0.04% of the time — effectively worthless.
This challenges the common "fix your blog" framing. On-page optimization is necessary but not sufficient. A distribution strategy that places your core claims across multiple authoritative publications may do more for persistence than any amount of on-page restructuring.
A word of caution: the cannibalization risk
Before rushing to optimize everything, it's worth noting a counterintuitive finding from the Princeton GEO study. While GEO optimization methods boosted visibility by up to 40% for most content, top-ranked content actually saw a 30.3% visibility decrease when the same techniques were applied. The researchers attributed this to a cannibalization effect — aggressively optimizing content that's already performing well can disrupt the signals that made it perform in the first place.
The implication: optimization is not uniformly positive. Content that's already getting cited may need a lighter touch than content that isn't. And as Lily Ray of Amsive has warned, GEO-specific tactics pursued at the expense of traditional SEO fundamentals can destroy the base that makes AI visibility possible. The strongest approach is to build GEO on top of SEO, not instead of it.
What Stays Cited: The Persistence Gap
The research synthesized in this report reveals a significant gap in the field's knowledge. Most existing studies focus on what gets cited. Almost none focus on what stays cited. These are different questions with potentially different answers.
The Princeton GEO study, published in November 2023 and presented at ACM KDD 2024, is the academic foundation for the entire field of generative engine optimization. Its findings are rigorous: GEO optimization methods can boost visibility by up to 40%, with statistics addition showing +37% improvement. The most dramatic result — citation of external sources showing +115.1% improvement — came with a critical caveat that's often omitted: that boost only applied to content ranked 5th or lower. For already top-performing content, the same optimizations actually decreased visibility by 30.3%. The study's results are more nuanced than the headlines suggest. And more importantly for our purposes, the study measured visibility at a single point in time. It answered "how do you get cited?" — it did not address whether those visibility gains persisted over weeks or months.
The Stacker/Scrunch half-life study is the closest thing to persistence-specific research, but it tracked only eight articles, and Stacker has a commercial interest in demonstrating that their distribution network works. The Semrush 13-week study has impressive scale — 230,000+ prompts, 100 million+ citations — but 13 weeks is barely long enough to establish persistence patterns. It's one cycle. You need multiple cycles to distinguish signal from noise.
The open questions
This gap in the research isn't just an academic concern. It has direct implications for how companies invest in content. Here are the questions that existing research cannot yet answer:
Does on-page optimization affect persistence, or only initial citation? The Princeton study showed that adding statistics and external citations boosts visibility. Nobody has tested whether those same optimizations help content stay cited longer.
What's the causal relationship between content features and citation longevity? All existing persistence data is correlational. Does FAQ schema cause longer citation life, or is it merely correlated with other factors that do?
How do engine model updates disrupt citation graphs? The Reddit/ChatGPT collapse shows that model updates can cause abrupt shifts. Nobody is systematically tracking the pattern of disruption and recovery.
Is the freshness signal about actual content updates or just schema timestamps? If engines use the dateModified field as a proxy, updating the date without changing content might temporarily preserve citations. A testable hypothesis with significant tactical implications.
Does distribution breadth cause persistence, or is it confounded by brand authority? Content distributed through Stacker's network tends to come from established brands. Is distribution the active ingredient, or is brand authority doing the work?
How does persistence vary across verticals? Does cybersecurity content behave like martech content? Vertical-specific persistence patterns could fundamentally change how companies prioritize content investment.
These aren't abstract research questions. They're the questions that determine whether a company's content strategy is built on evidence or assumption. And answering them requires something no snapshot study can provide: longitudinal data collected consistently over months, across multiple engines, with enough granularity to isolate individual variables.
A Framework for Citation Persistence
Despite the gaps in current research, the existing evidence is sufficient to build a practical framework. The recommendations below are graded by confidence level based on the strength and independence of the supporting evidence.
The Case for Longitudinal Tracking
The recommendations in this report are the best available given current research. But they share a fundamental limitation: they're derived from snapshots and short-duration studies applied to a system that changes 40-60% every month.
A one-time audit captures a single frame of a movie. It tells you where you stand today, but it can't tell you whether your position is improving or deteriorating, which competitors are gaining citation share, or how the last model update affected your visibility. By the time you act on the findings, the underlying data may have already shifted.
Generic best practices — add schema, use FAQ format, front-load your claims — apply equally to every company in every vertical. They don't tell you which specific prompts matter for your buyer, which competitors are gaining citation share in your vertical, or how your content responds to engine updates. They're the starting line, not the strategy.
Consider a concrete example. A cybersecurity company publishes a quarterly SIEM comparison guide. A one-time audit tells them they're cited on Perplexity but not ChatGPT. Useful, but static. Longitudinal tracking over three months reveals something richer: their citations on Perplexity spike after each update but decay within four weeks, a competitor's guide published across three editorial sites maintains citations for 10+ weeks, and ChatGPT stopped citing comparison-format content entirely after a September model update. That's not a snapshot — it's a playbook. It tells them exactly where to invest: refresh cadence for Perplexity, distribution strategy to match the competitor, and a format shift for ChatGPT. None of that is visible from a single audit.
Longitudinal tracking produces compound intelligence. Week one gives you a baseline. Month three gives you patterns. Month six gives you a playbook — a data-driven understanding of what works for your specific content, in your specific vertical, on each specific engine.
This isn't theoretical. The research in this report demonstrates that citation behavior varies dramatically by platform, by content type, by vertical, and by time period. A one-size-fits-all approach based on industry averages will, by definition, be wrong for most specific cases. The companies that start building proprietary citation intelligence now will have months of accumulated insight by the time their competitors realize they need it.
The gap between what the industry knows and what individual companies need to know is exactly where longitudinal research lives. Closing that gap isn't a one-time project. It's an ongoing discipline — and the returns compound over time.
Start building your citation intelligence
Quoted builds longitudinal research programs that produce weekly citation tracking across all five engines (ChatGPT, Perplexity, Google AI Overviews, Claude, and Gemini), competitive citation share analysis, engine update impact assessments, and quarterly persistence playbooks tailored to your vertical and buyer prompts.
Sources
Princeton GEO study (ACM KDD 2024). No commercial bias. Foundational GEO study — measures initial visibility, not persistence.
Semrush 13-week citation study. Moderate commercial bias (Semrush sells SEO tools). Partial persistence data (13-week window).
Kevin Indig ChatGPT citation analysis. Low commercial bias. Independent practitioner research.
Moderate commercial bias (sells SEO platform). Not persistence-specific.
Stacker/Scrunch half-life study. High commercial bias (Stacker sells content distribution). Small sample (8 articles). Most rigorous persistence study publicly available.
Ahrefs AI Overviews study. Moderate commercial bias. Key finding: 70% content change rate with 0.95 semantic consistency.
EMARKETER turnover estimates. Referenced in secondary sources. Findings directionally consistent with verified Ahrefs data.
Superlines visibility tracking. Findings directionally consistent with verified half-life data.
Low commercial bias. Medium methodology rigor.
Mike King / iPullRank entity authority research. Low commercial bias. Key finding: citation at page level, not domain level.
Moderate commercial bias.
EverTune schema research. Key finding: generic CMS schema has no effect; attribute-rich schema outperforms by 20pp.
December 2024 schema durability study. Key finding: no correlation between schema coverage and sustained citation durability.
Official confirmation that schema helps LLM content understanding.
Lily Ray / Amsive. Moderate commercial bias. Key warning: GEO without SEO foundations is counterproductive.
Derived estimates, not primary research.