How to track AI citations across five engines
A repeatable measurement system for AI visibility: which queries to test, how often to check, what to log, and what a good citation rate looks like.
9 min read · by the Crescendo team
You can’t see AI citations in Search Console, GA4, or any rank tracker, because a citation isn’t a click. The only way to know whether ChatGPT recommends you is to ask ChatGPT, systematically, repeatedly, and write down what happens. This is the measurement system we run for every client. It’s not complicated. It just has to actually run.
Build the query portfolio
Collect thirty to sixty questions your buyers genuinely ask, phrased the way people talk to assistants rather than in keyword-ese. Sources, in order of value: sales call recordings (the questions prospects asked before they found you), support tickets, your best closers’ memory, and only then keyword tools. Tag each query with intent (buy / sell / comparison / education / data) and priority (P1–P3). P1s are the queries where a citation plausibly changes revenue; they get checked most and fixed first.
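A portfolio like this is easier to keep consistent when it lives as structured data rather than a loose list. The sketch below is illustrative only: the `Query` class, its field names, and the sample questions are hypothetical and not taken from any tool mentioned here.

```python
from dataclasses import dataclass

# Tag vocabularies taken from the text above.
INTENTS = {"buy", "sell", "comparison", "education", "data"}
PRIORITIES = ("P1", "P2", "P3")


@dataclass(frozen=True)
class Query:
    """One portfolio entry (hypothetical structure)."""
    text: str      # phrased the way a buyer would ask an assistant
    intent: str    # one of INTENTS
    priority: str  # one of PRIORITIES
    source: str    # where the question came from, e.g. "sales call"

    def __post_init__(self):
        # Reject mistyped tags early so later grouping stays clean.
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent!r}")
        if self.priority not in PRIORITIES:
            raise ValueError(f"unknown priority: {self.priority!r}")


def check_order(portfolio):
    """Return queries with P1s first; ties keep their original order."""
    return sorted(portfolio, key=lambda q: PRIORITIES.index(q.priority))


# Made-up sample entries for illustration.
portfolio = [
    Query("what is a good citation rate for AI search", "education", "P3", "keyword tool"),
    Query("which agency should I hire for AI search visibility", "buy", "P1", "sales call"),
    Query("agency A vs agency B for AI search", "comparison", "P2", "support ticket"),
]
```

Validating the tags at creation time means a typo such as "purchase" instead of "buy" fails immediately instead of silently creating a sixth intent bucket in later reports.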
Run the matrix
Queries down the side, engines across the top: Google AI Overviews, ChatGPT, Perplexity, Gemini, Claude. Each cell gets one of three states per check: cited (your domain in the sources or your brand named in the answer), missed (answer exists, you’re absent), or no answer (the engine didn’t produce an AI response, which for AI Overviews is common and worth tracking in itself).
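The three cell states can be written down as a small classification rule. This is a minimal sketch, assuming you already have the answer text and the list of source URLs for a check; the function name, its parameters, and the plain substring match on the brand name are all assumptions made for illustration.

```python
from urllib.parse import urlparse

ENGINES = ("Google AI Overviews", "ChatGPT", "Perplexity", "Gemini", "Claude")


def _host(url):
    """Lowercased hostname of a URL, without a leading 'www.'."""
    # Prefix bare domains with '//' so urlparse treats them as a host.
    parsed = urlparse(url if "://" in url else "//" + url)
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify(answer_text, source_urls, own_domain, brand):
    """Return 'cited', 'missed', or 'no_answer' for one matrix cell."""
    if not answer_text or not answer_text.strip():
        return "no_answer"  # the engine produced no AI response
    own = own_domain.lower()
    for url in source_urls:
        host = _host(url)
        # Match the domain itself or any subdomain of it.
        if host == own or host.endswith("." + own):
            return "cited"
    # Naive assumption: a case-insensitive substring counts as a brand mention.
    if brand.lower() in answer_text.lower():
        return "cited"
    return "missed"
```

A substring match will misfire for brand names that are also common words, so a real implementation would likely need a stricter rule for the brand check.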
For each check, log more than the binary. Future-you will want:
- which URL of yours was cited (pages win citations, not domains),
- every other domain cited (this feeds competitor analysis),
- a sample of the answer text (so you can see how you’re characterized; being cited as a cautionary tale is a different problem),
- date and engine, obviously.
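The fields listed above map naturally onto one row per check. The following sketch shows one possible layout written to CSV; the `CheckRecord` name, the column names, and the 300-character cap on the answer sample are assumptions, not a prescribed schema.

```python
import csv
import io
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class CheckRecord:
    """One logged check (hypothetical schema)."""
    checked_on: date
    engine: str
    query: str
    state: str                  # 'cited', 'missed', or 'no_answer'
    own_url: str = ""           # which page of yours was cited, if any
    other_domains: tuple = ()   # every other domain cited
    answer_sample: str = ""     # excerpt of the answer text


def to_csv(records, sample_chars=300):
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    names = [f.name for f in fields(CheckRecord)]
    writer = csv.DictWriter(buf, fieldnames=names)
    writer.writeheader()
    for r in records:
        writer.writerow({
            "checked_on": r.checked_on.isoformat(),
            "engine": r.engine,
            "query": r.query,
            "state": r.state,
            "own_url": r.own_url,
            # Flatten the domain list into one cell.
            "other_domains": ";".join(r.other_domains),
            # Keep only a sample, not the full answer.
            "answer_sample": r.answer_sample[:sample_chars],
        })
    return buf.getvalue()
```

Storing the cited URL and the competing domains in the same row as the state is what lets later analysis answer "which page won" and "who beat us" without re-running the check.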
Cadence: weekly, non-negotiable
Engines are non-deterministic; the same query can cite you Monday and skip you Thursday. One check tells you nothing. Weekly checks for a month give you a citation rate (for example, cited in 3 of 4 weeks, or 75%), and a rate is stable enough to act on and report. Monthly checking stretches that same feedback loop to a quarter or longer, which is how programs die. Daily is overkill outside a launch window; the noise floor swallows the signal.
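The citation rate for a single cell is simple arithmetic over its weekly states. In this sketch the function name is hypothetical, and counting "no answer" weeks in the denominator is an assumption; you may prefer to exclude them.

```python
def citation_rate(weekly_states):
    """Share of weekly checks in which the cell was cited.

    Returns None when there are no checks yet. Every check counts in the
    denominator, including 'no_answer' weeks (an assumption of this sketch).
    """
    if not weekly_states:
        return None
    cited = sum(1 for state in weekly_states if state == "cited")
    return cited / len(weekly_states)


# Four weekly checks of one query on one engine.
month = ["cited", "missed", "cited", "cited"]
print(citation_rate(month))  # 0.75
```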
Read the data like an operator
- Rate by engine: strong on Perplexity but absent on AI Overviews usually means content is quotable but rankings lag, a classic SEO problem wearing GEO clothes.
- Rate by intent: cited on education queries but not buy-intent ones means engines see you as a reference, not a vendor. Fix the commercial pages.
- Flapping cells: a query that alternates cited/missed is winnable: you’re on the shortlist. A consistent miss with the same winner every week needs the gap treatment.
- Post-change windows: after a refresh or new page, watch the fast engines for two to three weeks. Movement there predicts the slower engines.
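The by-engine and by-intent cuts described in the list above are the same grouping applied to different columns. A minimal sketch, assuming each check is a dict with `engine`, `intent`, and `state` keys (hypothetical field names) and using made-up sample data:

```python
from collections import defaultdict


def rate_by(checks, key):
    """Citation rate per group, where `key` names the field to group on."""
    totals = defaultdict(int)
    cited = defaultdict(int)
    for check in checks:
        group = check[key]
        totals[group] += 1
        if check["state"] == "cited":
            cited[group] += 1
    return {group: cited[group] / totals[group] for group in totals}


# Made-up sample checks for illustration.
checks = [
    {"engine": "Perplexity", "intent": "education", "state": "cited"},
    {"engine": "Perplexity", "intent": "buy", "state": "cited"},
    {"engine": "Google AI Overviews", "intent": "education", "state": "cited"},
    {"engine": "Google AI Overviews", "intent": "buy", "state": "missed"},
]

print(rate_by(checks, "engine"))  # {'Perplexity': 1.0, 'Google AI Overviews': 0.5}
print(rate_by(checks, "intent"))  # {'education': 1.0, 'buy': 0.5}
```

With real data you would run this over at least a month of checks; a handful of rows like the sample here is far too few to read anything into.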
Then close the loop
Measurement without consequence is a hobby. Every monthly review ends with the same three lists: queries won (hold them, keep pages fresh), queries flapping (small push: refresh plus corroboration), and queries lost (build or rewrite, per the playbook). The whole thing then flows into the client narrative, which we cover in our piece on reporting for the AI search era.
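Sorting queries into the three lists can be automated once a month of states exists. The thresholds in this sketch are assumptions, since the exact cutoffs are a judgment call: "won" means cited in every answered check, "lost" means never cited, and anything mixed is "flapping". Weeks with no AI answer are ignored, and the `triage` name is hypothetical.

```python
def triage(history):
    """Split queries into won / flapping / lost lists.

    `history` maps each query to its list of weekly states for one engine.
    """
    lists = {"won": [], "flapping": [], "lost": []}
    for query, states in history.items():
        # Ignore weeks where the engine gave no AI answer at all.
        answered = [s for s in states if s != "no_answer"]
        if not answered:
            continue  # nothing to judge yet
        hits = answered.count("cited")
        if hits == len(answered):
            lists["won"].append(query)
        elif hits == 0:
            lists["lost"].append(query)
        else:
            lists["flapping"].append(query)
    return lists


# Made-up month of checks for three queries.
history = {
    "query a": ["cited", "cited", "cited", "cited"],
    "query b": ["cited", "missed", "no_answer", "cited"],
    "query c": ["missed", "missed", "missed", "missed"],
}
print(triage(history))
# {'won': ['query a'], 'flapping': ['query b'], 'lost': ['query c']}
```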
Common questions
- Can't I just ask ChatGPT manually once a month?
- You can, and it's how everyone starts. The problem is variance: engines answer the same question differently across runs, so a single manual check is a coin flip wearing a lab coat. By the time you're checking 30 queries across 5 engines weekly, that's 150 checks a week, and nobody sustains that by hand past week three.
- What does it cost to automate?
- Through SERP and LLM-response APIs like DataForSEO, roughly one to three cents per query per engine. A 40-query portfolio checked weekly across five engines runs a few dollars a week. The expensive part was never the checks; it's knowing what to do with the misses.
- What's a good citation rate?
- Depends on intent mix, but from our client data: under 20% on priority queries means nobody's been minding the channel; 40-50% is a strong program; above 60% across all five engines is category-leader territory. The trend matters more than the level: up and to the right over quarters is the win condition.
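The volume and cost figures in the answers above follow from straightforward multiplication. A small sketch of that arithmetic, using the one-to-three-cent range quoted above; the function names are hypothetical, and amounts are kept in whole cents to avoid floating-point rounding.

```python
def weekly_checks(queries, engines):
    """Number of individual checks in one weekly run of the matrix."""
    return queries * engines


def weekly_cost_cents(queries, engines, cents_per_check):
    """Cost of one weekly run, in cents."""
    return weekly_checks(queries, engines) * cents_per_check


print(weekly_checks(30, 5))  # 150

low = weekly_cost_cents(40, 5, 1)   # 200 cents
high = weekly_cost_cents(40, 5, 3)  # 600 cents
print(f"${low / 100:.2f} to ${high / 100:.2f} per week")  # $2.00 to $6.00 per week
```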