One framework. Two scores. The Agent Readiness Score™ grades whether AI agents can find, understand, and complete real tasks on your site. The AEO Readiness Score™ grades whether answer engines (ChatGPT, Perplexity, Google AI Overviews) can cite and recommend you. Both backed by deterministic checks and verifiable evidence on every finding.
The tools you know (SEO scanners, Lighthouse, PageSpeed) were built for humans using search engines. Agents are different. They parse raw HTML, follow structured endpoints, stop at CAPTCHAs, and quit when a button has no accessible name. A single 90/100 score says nothing about whether an agent can actually use your site.
The Agent Readiness Framework™ is our answer. Instead of a blanket grade, we run a set of deterministic, re-runnable checks. Instead of averaging them, we weight them against your site's actual archetype. And instead of asking you to take the grade on trust, we surface evidence with every single finding: the exact URL, HTTP status, and content an agent saw.
You should be able to send a report to your engineering team and have them ship from it on Monday. That is the bar.
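To make "evidence on every finding" concrete, here is a minimal Python sketch of what one finding could look like when it lands in an engineering ticket. The class names, field names, and the `llms_txt_present` check ID are illustrative assumptions, not the scanner's actual schema; only the three evidence fields (URL, HTTP status, content snippet) come from the description above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Evidence:
    """What was observed for one finding (hypothetical shape)."""
    url: str          # exact URL fetched
    http_status: int  # HTTP status code returned
    snippet: str      # excerpt of the content the agent saw


@dataclass(frozen=True)
class Finding:
    check_id: str     # hypothetical identifier for the check
    status: str       # "pass", "fail", or "n_a"
    evidence: Evidence


def to_report_json(finding: Finding) -> str:
    """Serialize a finding with stable key order so reports diff cleanly."""
    return json.dumps(asdict(finding), sort_keys=True)


# Illustrative data only: example.com is a placeholder, not a real scan result.
finding = Finding(
    check_id="llms_txt_present",
    status="fail",
    evidence=Evidence(
        url="https://example.com/llms.txt",
        http_status=404,
        snippet="<h1>Page not found</h1>",
    ),
)
print(to_report_json(finding))
```

Keeping the evidence immutable and serializing with sorted keys is one way to keep two reports of the same scan byte-for-byte comparable.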
Find. Understand. Engage. Recommend. Agents rarely fail on just one of these — and passing them all is the real test of agent-readiness. We grade every dimension independently, then roll the result into a single, weighted number.
Can an autonomous agent discover, fetch, parse, and understand your site without human guesswork?
Can an agent map your site to the canonical jobs your users would delegate to it?
We launch a real headless browser and try the job. This is the live proof — either the agent succeeds, or we show you exactly where it broke.
When agents summarize the web, will they recommend you — and will they get the facts right?
Agent-native primitives are shipping faster than SEO tools can keep up. We added six new deterministic checks covering the capability, identity, policy, commerce, and representation standards that actually matter in 2026, each with evidence and a one-click LLM fix prompt.
Anthropic's Model Context Protocol manifest. Makes your product natively installable in Cursor, Claude Desktop, and ChatGPT tool registries.
Enumerable list of discrete skills your site exposes — lighter than MCP, heavier than a raw OpenAPI blob.
IETF draft for cryptographically verified bot traffic. Lets your WAF safely relax rules for signed GPTBot / ClaudeBot identities.
Declares search / search-ai / ai-train permissions in a single robots.txt line. The cheapest defensible answer to the AI-training question.
ACP, UCP, MPP, and x402 — four overlapping standards for letting agents pay. Early adopters get disproportionate agent commerce flow.
Serve a clean Markdown body when an agent asks for it. Strips layout noise and preserves structural cues into the LLM's context window.
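As a rough illustration of the last item, the Python sketch below picks a Markdown or HTML body from the request's `Accept` header. It assumes the agent signals its preference by sending `Accept: text/markdown`; that mechanism, and the function names `prefers_markdown` and `negotiate`, are assumptions made for this example rather than a description of what the check actually probes.

```python
def prefers_markdown(accept_header: str) -> bool:
    """Return True when text/markdown is explicitly preferred over text/html."""

    def quality(media_type: str) -> float:
        best = 0.0
        for part in accept_header.split(","):
            fields = [field.strip() for field in part.split(";")]
            if fields[0].lower() != media_type:
                continue
            q = 1.0  # default quality when no q parameter is given
            for param in fields[1:]:
                if param.lower().startswith("q="):
                    try:
                        q = float(param[2:])
                    except ValueError:
                        q = 0.0  # treat a malformed q-value as "not wanted"
            best = max(best, q)
        return best

    return quality("text/markdown") > quality("text/html")


def negotiate(accept_header: str, html_body: str, markdown_body: str):
    """Return a (content_type, body) pair for the response."""
    if prefers_markdown(accept_header):
        return "text/markdown; charset=utf-8", markdown_body
    return "text/html; charset=utf-8", html_body


print(negotiate("text/markdown", "<h1>Pricing</h1>", "# Pricing"))
print(negotiate("text/html,application/xhtml+xml", "<h1>Pricing</h1>", "# Pricing"))
```

A server that varies its response this way should also send `Vary: Accept`, so shared caches do not hand the Markdown body to a browser or the HTML body to an agent.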
Every scan now produces a second top-level score: AEO Readiness. Agents (Claude, Cursor, Operator) crawl your site and act on it. Answer engines (ChatGPT, Perplexity, Google AI Overviews, Gemini) read it once and decide whether to cite it in a response. The signals overlap, but the weights are different — answer engines reward citable, structured, entity-clear content far more than they reward MCP servers or agent skills.
Can an agent crawl, understand, act, and walk away with a meaningful representation of your business? Three pillars, weighted by archetype.
Can an answer engine extract a citable answer from your page, attribute it to a clear entity, trust its accuracy, and recommend you in a comparison? Four pillars, uniformly weighted because AEO concerns apply to every site type.
13 dedicated AEO-only checks (10 deterministic + 3 LLM-assisted) plus ~14 existing checks (FAQ schema, llms.txt, semantic structure, business clarity, trust signals, etc.) re-weighted for answer-engine behavior. Every check rolls up into both scores — improving your AEO score also lifts your Agent Recommendation pillar.
Not yet part of the base score: AI platform visibility testing (citation behavior in ChatGPT / Perplexity / Claude), AI referral traffic measurement, and open-web reputation (Reddit / community visibility, third-party corroboration). These require external data sources or expensive prompt-runs per scan, so we're adding them as their own audit modules instead of baking them into the base score.
We classify your site into one of eight archetypes before scoring. Checks irrelevant to your archetype are marked n_a and don't penalise you. Pillar weights also shift: docs sites are weighted more heavily on technical signals, lead-gen sites on task readiness.
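The arithmetic behind archetype weighting can be sketched in a few lines of Python. Everything numeric here is invented for illustration: the pillar names, the weights, and the two archetypes shown are assumptions, and the real scorer may combine results differently. The sketch shows only the two behaviours described above: n_a checks drop out of the average, and the same results score differently under different archetype weights.

```python
# Hypothetical pillar weights (in percent) per archetype; not the real values.
PILLAR_WEIGHTS = {
    "docs": {"technical": 50, "task": 20, "recommendation": 30},
    "lead_gen": {"technical": 30, "task": 50, "recommendation": 20},
}


def pillar_score(statuses):
    """Percentage of applicable checks that passed; None if all are n_a."""
    applicable = [s for s in statuses if s != "n_a"]
    if not applicable:
        return None
    passed = sum(1 for s in applicable if s == "pass")
    return 100.0 * passed / len(applicable)


def overall_score(archetype, statuses_by_pillar):
    """Weighted mean of pillar scores, renormalized over pillars that apply."""
    total = 0.0
    weight_used = 0
    for pillar, weight in PILLAR_WEIGHTS[archetype].items():
        score = pillar_score(statuses_by_pillar.get(pillar, []))
        if score is None:
            continue  # no applicable checks in this pillar: skip its weight
        total += weight * score
        weight_used += weight
    if weight_used == 0:
        return 0.0
    return round(total / weight_used, 1)


results = {
    "technical": ["pass", "pass", "fail", "pass"],  # 75.0
    "task": ["pass", "n_a", "fail"],                # 50.0, n_a is ignored
    "recommendation": ["pass", "pass"],             # 100.0
}
print(overall_score("docs", results))      # 77.5
print(overall_score("lead_gen", results))  # 67.5
```

Because the function is pure and involves no randomness or model calls, running it twice on the same results always yields the same number.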
Every finding is paired with the exact URL fetched, the HTTP status, and a content snippet the agent saw. No hand-waving. No trust-me grades.
A docs site shouldn't lose points for lacking /cart. A storefront shouldn't be penalised for missing API references. Weights shift by archetype.
Two runs on the same site produce the same score. No LLM hallucinations in the scoring layer. LLM analysis only augments — never decides.
A score is useful only if it moves. Pro tracks every scan over time so you can see which shipped changes actually lifted your readiness.
| Criterion | Typical SEO scanner | Agent Readiness Framework™ |
|---|---|---|
| Primary user | Human skimming a search result | Autonomous agent parsing HTML + endpoints |
| Score calibration | One global grade for all sites | Weighted per archetype (commerce ≠ docs ≠ lead-gen) |
| Evidence | Opaque — trust the number | Every finding links to exact URL, status, content |
| Capability signals | Ignored | llms.txt, .well-known, MCP, x402 / ACP / UCP |
| Task completion | Assumed, never tested | Playwright-driven synthetic agent runs |
| Determinism | Mostly deterministic, but LLM-augmented rubrics drift | Deterministic scoring. LLM only for summaries, never for grading. |
The framework is modular by design. New deterministic checks plug into the scanner's backend/scanner/checks package. New archetypes plug into the classifier. Pillar weights live in the scorer. Every module ships with its own test fixtures so adding a check never destabilises an existing score.
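One common way to get this kind of plug-in behaviour is a small registry that each check module adds itself to. The Python sketch below is a guess at the pattern, not the contents of backend/scanner/checks: the `register` decorator, the `CHECKS` dict, and the `button_accessible_name` check are all hypothetical. The example check is also deliberately simplified, since it only looks at `aria-label` and visible text, while a full accessible-name computation considers more sources.

```python
from html.parser import HTMLParser
from typing import Callable, Dict

# Hypothetical registry: check ID -> function taking raw HTML, returning a result.
CHECKS: Dict[str, Callable[[str], dict]] = {}


def register(check_id: str):
    """Decorator that adds a check to the registry under a unique ID."""
    def decorator(fn):
        if check_id in CHECKS:
            raise ValueError(f"duplicate check id: {check_id}")
        CHECKS[check_id] = fn
        return fn
    return decorator


class _ButtonAudit(HTMLParser):
    """Counts <button> elements with neither an aria-label nor text content."""

    def __init__(self):
        super().__init__()
        self.unnamed = 0
        self._in_button = False
        self._has_name = False

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            label = dict(attrs).get("aria-label") or ""
            self._has_name = bool(label.strip())

    def handle_data(self, data):
        if self._in_button and data.strip():
            self._has_name = True

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            if not self._has_name:
                self.unnamed += 1
            self._in_button = False


@register("button_accessible_name")
def button_accessible_name(html: str) -> dict:
    audit = _ButtonAudit()
    audit.feed(html)
    audit.close()
    status = "pass" if audit.unnamed == 0 else "fail"
    return {"status": status, "unnamed_buttons": audit.unnamed}


def run_all(html: str) -> dict:
    """Run every registered check in a fixed order for repeatable output."""
    return {check_id: CHECKS[check_id](html) for check_id in sorted(CHECKS)}


page = '<button aria-label="Search"></button><button><svg></svg></button><button>Buy</button>'
print(run_all(page))
```

With this shape, adding a check means adding one decorated function plus its fixtures; existing checks are untouched, which is what keeps existing scores stable.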
Upcoming: questionnaires, guided assessments, and a Tier-2 LLM-driven synthetic agent that adapts to UIs the scripted runner can't handle.
It takes 30 seconds. No signup, no email. You get a score, a weighted report, and evidence on every finding.