April 29, 2026

Your AI visibility tracking is probably wrong — here's how to audit it

Most AI-visibility tools sample a fraction of what they claim to measure. Here's a 30-minute audit before you trust the numbers in your next report.


Brainlabs flagged it last week: AI-visibility data is wrong. Their piece stopped at the diagnosis. We've spent the past few months running these tools across client accounts, and the diagnosis is correct — but it's also fixable. You can audit your tracking in about thirty minutes and know which numbers to trust.


If you've been asked by a founder or CMO "how visible are we in ChatGPT?" and you've quoted a number from a SaaS dashboard, this one's for you.

Why the numbers are unreliable

There are three structural problems with most AI-visibility tools, and they compound.


Sampling. A tool that claims to track "your visibility in ChatGPT" doesn't query ChatGPT for every user. It queries a sample of prompts, on a schedule, from a fixed set of locations. If your audience asks questions in ways the tool's prompt library doesn't capture, you're invisible to the tool but not to actual users.


Prompt design bias. Every tool runs its own prompt set. Some run "best [category] for [use case]" templates. Others run "what's a good [product]" formats. Brand-led prompts ("is [your brand] any good?") return very different visibility than category prompts. Most dashboards average across prompt types and present the result as a single score.


Model versioning. GPT-4.5, GPT-5, ChatGPT with web search, ChatGPT without web search, the same models accessed via API vs the consumer app — they all behave differently. Tools typically run against one configuration. Real users span all of them.


The result: a number on a dashboard that's directionally interesting but not commercially actionable. Reporting it upward as "we're at 23% AI visibility" gives leadership a false sense of precision.

The 30-minute audit

You don't need to throw out the tool. You need to know what it's measuring, and verify a slice of it manually.

Step 1 — Read the tool's methodology page (5 mins)

Find the documentation that describes the prompt set, the model versions queried, and the query frequency. If the tool doesn't publish this clearly, that's already informative. Note the answers to: which models, how often, how many prompts per category, US-based queries or geo-distributed, web-search-enabled or not.

Step 2 — Check whether AI agents can actually reach your site (5 mins)

This is the foundation. If your robots.txt blocks ChatGPT-User or ClaudeBot, or your CDN is silently blocking AI crawlers, the visibility number will reflect that — but the tool won't tell you. Run your domain through Cloudflare's free isitagentready.com. It checks five layers — discoverability, content accessibility, bot access control, protocol discovery, and commerce — and flags the technical gaps that make AI agents bounce off your site before any tool's prompt has a chance to surface you. It's the cheapest first move, and we've seen sites with near-zero AI visibility scores fix it overnight by removing a stray bot block in their CDN.
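
You can also verify the declared rules yourself before trusting any third-party check. Below is a minimal sketch using Python's standard library; the user-agent list is our illustrative pick, not exhaustive, and it reads declared robots.txt policy only, so a silent CDN-level block won't show up here.

    # Check whether robots.txt blocks common AI crawler user agents.
    # Illustrative sketch: the agent list is ours, not exhaustive, and this
    # reads declared policy only -- silent CDN-level blocks won't appear.
    import urllib.robotparser

    AI_AGENTS = [
        "GPTBot",          # OpenAI's training crawler
        "ChatGPT-User",    # ChatGPT browsing on behalf of a user
        "OAI-SearchBot",   # OpenAI's search crawler
        "ClaudeBot",       # Anthropic's crawler
        "PerplexityBot",   # Perplexity's crawler
    ]

    def check_ai_access(domain: str) -> None:
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url(f"https://{domain}/robots.txt")
        rp.read()
        for agent in AI_AGENTS:
            verdict = "allowed" if rp.can_fetch(agent, f"https://{domain}/") else "BLOCKED"
            print(f"{agent:<15} {verdict}")

    check_ai_access("example.com")  # swap in your domain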

Step 3 — Run ten of your highest-intent queries manually (10 mins)

Pick the ten prompts your customers would actually type. Not "best [category]" — the specific phrases real buyers use. Run each one in:

  • ChatGPT with web search on
  • ChatGPT with web search off
  • Perplexity
  • Google AI Overviews


Note where you're cited, where competitors are cited, and where neither is. This is the ground truth your dashboard is approximating.
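
To keep this ground truth comparable month over month, record the run in a fixed format. A minimal sketch; the field names, file name, and the example row are our suggestion rather than any standard:

    # One row per prompt per surface keeps this month's manual run
    # directly comparable to next month's. Field names are our suggestion.
    import csv

    FIELDS = ["prompt", "surface", "we_cited", "competitors_cited", "notes"]
    # The four surfaces from the list above.
    SURFACES = ["chatgpt_web_on", "chatgpt_web_off", "perplexity", "ai_overviews"]

    with open("manual_run_2026-04.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        # One hypothetical row; add one per prompt per surface as you go.
        writer.writerow({
            "prompt": "crm that syncs with xero for a five-person sales team",
            "surface": SURFACES[0],
            "we_cited": "no",
            "competitors_cited": "CompetitorA; CompetitorB",
            "notes": "both cited via a third-party listicle",
        })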


Step 4 — Compare the manual run to the dashboard (5 mins)

For the same ten prompts, what does the SaaS tool say? Where it agrees with your manual check, you can trust the broader dataset. Where it disagrees, you've found the bias. We've seen tools report 40% visibility for clients who weren't cited at all in our manual check, because the tool's prompts were category-led and the brand was a niche player that only surfaced on long-tail buyer queries.
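
One quick way to structure the comparison; the per-prompt values below are hypothetical placeholders for your own ten queries:

    # Compare the tool's per-prompt claims against the manual check.
    # Both dicts map prompt -> cited (True/False); values are hypothetical.
    dashboard = {"prompt_01": True, "prompt_02": True,  "prompt_03": False}
    manual    = {"prompt_01": True, "prompt_02": False, "prompt_03": False}

    agree = sum(dashboard[p] == manual[p] for p in manual)
    print(f"agreement: {agree}/{len(manual)} prompts")

    for p in sorted(manual):
        if dashboard[p] != manual[p]:
            print(f"  divergence on {p}: tool={dashboard[p]}, manual={manual[p]}")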


Step 5 — Decide what to report and what to ignore (5 mins)

Write down two numbers you'll report monthly:

  • The tool's score, with the caveat that it reflects a specific prompt set
  • Your manual citation count on the ten high-intent queries


Report both. The tool's number is good for trend tracking. The manual count is what matters commercially.

What this changes for your reporting

Three things.

You stop quoting AI visibility as a single percentage. It's a metric with too many dependencies for that. Report it as "X% on the tool's prompt set" and "Y of 10 high-intent queries cited."


You add the technical-readiness layer to your monthly check. Run isitagentready.com once a month. AI bot blocking gets introduced accidentally — a CDN rule update, a new security plugin, a developer tightening robots.txt. Catching it in the same week it happens is much cheaper than discovering it three months later.
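
One cheap way to catch that drift between monthly checks is to diff robots.txt against a saved baseline. A sketch, assuming a plain file for the baseline and print-based alerting:

    # Drift check: compare live robots.txt to a saved baseline so an
    # accidental bot block surfaces the week it's introduced.
    # The baseline file name and print-based alerting are our assumptions.
    import urllib.request
    from pathlib import Path

    def check_robots_drift(domain: str, baseline: Path = Path("robots_baseline.txt")) -> None:
        with urllib.request.urlopen(f"https://{domain}/robots.txt") as resp:
            current = resp.read().decode("utf-8", errors="replace")
        if not baseline.exists():
            baseline.write_text(current)
            print(f"{domain}: baseline saved, nothing to compare yet")
        elif current != baseline.read_text():
            print(f"{domain}: robots.txt CHANGED since baseline -- review before reporting")
        else:
            print(f"{domain}: no change")

    check_robots_drift("example.com")  # run monthly, e.g. from cron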


You re-run the manual check whenever the tool's number moves more than 5 percentage points. That's usually a methodology change on the tool's side, not a real shift in visibility. Verify before you report.

What to do this week

  • Run isitagentready.com on your top three domains. Fix anything in the bot access control or content accessibility categories.
  • Pick ten buyer queries and run the manual check across ChatGPT, Perplexity and AI Overviews.
  • Compare to your dashboard. Note the gaps.
  • Replace any "AI visibility = X%" line in your next report with the dual metric described above.

You'll spend thirty minutes and gain a defensible read on AI visibility that won't fall apart the first time a founder asks a follow-up question.

FAQ

Is the SaaS tool worth the subscription if the numbers are unreliable?

Yes, for trend tracking. The absolute number is noisy, but the direction is usually informative. Just don't report the absolute number to leadership without context.

How often should I re-run the manual check?

Monthly is enough for most categories. If you operate in a category where AI search behaviour is shifting fast (most of B2B SaaS, financial services, ecommerce), every two weeks is more honest.
