Claude Visibility — Complete Guide

Last updated: June 1, 2026

What this guide is and is not. This is the long-form reference for how content interacts with Anthropic's Claude ecosystem — ClaudeBot, the anthropic-ai user-agent, and the contexts in which Claude may fetch or reference web content. This is not a "rank in Claude" playbook, because Claude is a generative model rather than a search engine and Anthropic does not publish ranking signals for Claude's responses. If you want a per-crawler reference covering every major AI bot (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and others), see the AI Crawlers — Complete Reference. If you want robots.txt mechanics in depth, see the robots.txt — Complete Guide.

1. Claude ecosystem overview

Anthropic operates Claude as a family of large language models accessible through two main surfaces: the Claude API (for developers building applications) and Claude.ai (the consumer-facing chat product). Some Claude deployments include a web-search tool that retrieves live web content during a conversation, when enabled by the developer or user.

The important framing for this guide: Anthropic does not operate a public search engine in the same way OpenAI runs ChatGPT Search or Perplexity runs its answer engine. There is no Claude index that ranks pages and serves them in response to user queries. Claude is a generative model whose responses are produced from its training, plus — when web tools are enabled — content fetched at the moment of the conversation.

This means the surface area for "Claude visibility" is smaller than, and qualitatively different from, that of ChatGPT or Perplexity. There are essentially two contexts in which Anthropic's systems may interact with your content:

Training-data ingestion. Web content may be crawled and incorporated into the training data for future Claude models. Anthropic documents an opt-out mechanism via robots.txt.
Live retrieval via Claude's web-search tool. When developers or users enable web search, Claude may fetch pages relevant to the current conversation. The retrieval mechanism is documented in Anthropic's developer documentation.

Neither of these is "search ranking" in the classical sense. There is no documented set of signals you can optimize for that increases the likelihood Claude will reference your content. There is, however, basic technical hygiene that affects whether your content is reachable at all — and that is what this guide covers.

2. ClaudeBot and the anthropic-ai user-agent

Anthropic documents user-agents for its crawlers in its official support center. The two that are publicly named at the time of writing are:

User-agent	Documented purpose	Opt-out method
`ClaudeBot`	Content fetching used in the context of Claude products. May include fetches related to Claude's web-search tool.	Standard robots.txt `User-agent: ClaudeBot` block.
`anthropic-ai`	Historically associated with training-data crawling.	Standard robots.txt `User-agent: anthropic-ai` block.

Anthropic's canonical reference for these user-agents lives in the support article linked in the Sources section. Because Anthropic has updated this documentation over time, treat the live support article as the source of truth for current behavior and refresh against it when reviewing your robots.txt.

A minimal robots.txt block to selectively opt out of both, while allowing other crawlers, looks like this:

User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: *
Allow: /
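
One way to sanity-check a block like this before deploying it is to parse it locally. The sketch below uses Python's standard-library urllib.robotparser. The example.com URL and the Googlebot comparison agent are illustrative assumptions, and Python's parser only approximates how any particular crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The opt-out block from this section, verbatim.
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: *
Allow: /
"""


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder URL, not a real deployment.
for agent in ("ClaudeBot", "anthropic-ai", "Googlebot"):
    print(agent, allowed(ROBOTS_TXT, agent, "https://example.com/any-page"))
```

If the first two agents report False and the third reports True, the file expresses the intent described above. Whether a crawler actually behaves that way is a separate question that only server logs can answer.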

To selectively allow access to specific sections only (for example, allow only public docs and block everything else), use a positive Allow: with a corresponding Disallow: /. The interaction between Allow and Disallow in robots.txt is governed by RFC 9309 — for a complete reference, see our robots.txt — Complete Guide.
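
As a rough illustration of the selective pattern, the sketch below parses a hypothetical policy that lets ClaudeBot read /docs/ and nothing else. The /docs/ path and example.com host are assumptions made up for the example. Python's urllib.robotparser has historically evaluated rules in file order, whereas RFC 9309 says the longest matching rule wins; listing Allow before Disallow gives the same answer under both readings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: ClaudeBot may read /docs/ and nothing else.
# The /docs/ path is an assumption chosen for illustration.
SELECTIVE_RULES = """\
User-agent: ClaudeBot
Allow: /docs/
Disallow: /
"""


def claudebot_can_fetch(path: str) -> bool:
    """Check one path against SELECTIVE_RULES for the ClaudeBot agent."""
    parser = RobotFileParser()
    parser.parse(SELECTIVE_RULES.splitlines())
    # example.com is a placeholder host.
    return parser.can_fetch("ClaudeBot", "https://example.com" + path)


print(claudebot_can_fetch("/docs/getting-started"))  # path under the allowed prefix
print(claudebot_can_fetch("/pricing"))               # path outside the allowed prefix
```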

Verification note. The robots.txt file on helperg.com today allows both ClaudeBot and anthropic-ai — a deliberate choice. Inspect it at /robots.txt for the live example.

3. Content accessibility for AI models

Whether you allow Anthropic's crawlers or not, there are several content-accessibility characteristics that affect whether your pages can be parsed cleanly when fetched. These apply to any AI system that processes HTML — not just Claude — and they are the same characteristics that benefit classic search crawlers and assistive technology.

Server-rendered main content

If the primary content of a page is only present after client-side JavaScript executes, an AI fetch that does not run JavaScript will see an empty or skeletal document. This is well-documented behavior for many crawlers. The technical baseline is to server-render the main content (text, headings, lists, tables) so the HTML response carries the content directly.
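
To see what a fetcher that does not run JavaScript would receive, one option is to count the visible words in the raw HTML response. The sketch below is a minimal approximation using Python's standard-library html.parser; the two sample documents are invented for illustration, and a real check would feed in the response body from your own server.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text that sits outside <script>, <style>, and <noscript>."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_word_count(html: str) -> int:
    """Count words a non-JavaScript fetch would find in the HTML."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return sum(len(chunk.split()) for chunk in parser.chunks)


# Invented samples: one server-rendered page, one client-rendered shell.
SERVER_RENDERED = "<html><body><h1>Pricing</h1><p>Plans start at ten dollars per month.</p></body></html>"
CLIENT_ONLY = '<html><body><div id="root"></div><script>renderApp();</script></body></html>'

print(visible_word_count(SERVER_RENDERED))
print(visible_word_count(CLIENT_ONLY))
```

A count near zero for a page that looks full in a browser suggests the main content depends on client-side rendering.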

Semantic structure

Use standard HTML semantics: a single <h1>, ordered <h2>/<h3> nesting, real <p> paragraphs, lists for lists, tables for tabular data. Avoid styled <div>-only structure for content sections. This helps any parser identify the role of each block.
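
The heading rules above can be checked mechanically. The following is a small sketch, not a full accessibility audit: it reports when a document does not have exactly one <h1> and when a heading skips a level (for example an <h1> followed directly by an <h3>). The function name and sample markup are illustrative.

```python
from html.parser import HTMLParser


class HeadingLevels(HTMLParser):
    """Record the level of every <h1>..<h6> in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html: str) -> list[str]:
    """Return human-readable problems with the heading outline."""
    parser = HeadingLevels()
    parser.feed(html)
    parser.close()
    problems = []
    h1_count = parser.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one <h1>, found {h1_count}")
    # Moving deeper by more than one level at a time skips a heading level.
    for prev, cur in zip(parser.levels, parser.levels[1:]):
        if cur > prev + 1:
            problems.append(f"<h{prev}> is followed by <h{cur}>, skipping a level")
    return problems


print(heading_problems("<h1>A</h1><h2>B</h2><h3>C</h3><h2>D</h2>"))
print(heading_problems("<h1>A</h1><h3>C</h3><h1>E</h1>"))
```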

Stable URLs and clean canonicals

Provide a <link rel="canonical"> on every public page pointing to the preferred URL. Use HTTPS, drop session and tracking parameters from the canonical, and keep the canonical consistent everywhere the page is referenced, such as the HTML head, the sitemap, and internal links. Inconsistent canonicals confuse any system that follows them.
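
A basic canonical check can be sketched with the standard library. The version below makes a simplifying assumption: it flags any query string on the canonical, which is stricter than dropping only session and tracking parameters. The function name and sample markup are illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalLinks(HTMLParser):
    """Collect the href of every <link rel="canonical">."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.hrefs.append(attr.get("href") or "")


def canonical_problems(html: str) -> list[str]:
    """Return problems with the page's canonical declaration."""
    parser = CanonicalLinks()
    parser.feed(html)
    parser.close()
    if len(parser.hrefs) != 1:
        return [f"expected one canonical link, found {len(parser.hrefs)}"]
    parts = urlsplit(parser.hrefs[0])
    problems = []
    if parts.scheme != "https":
        problems.append("canonical is not HTTPS")
    if parts.query:
        # Simplification: treats every query parameter as unwanted.
        problems.append("canonical carries query parameters")
    return problems


GOOD = '<head><link rel="canonical" href="https://example.com/guide"></head>'
BAD = '<head><link rel="canonical" href="http://example.com/guide?utm_source=x"></head>'
print(canonical_problems(GOOD))
print(canonical_problems(BAD))
```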

Factual clarity

If your content includes claims that depend on time, author, or source, surface those clearly. This is good editorial practice independent of any AI system, and it also gives downstream consumers (Claude included) the context they need to use a quote responsibly.

4. Structured information

JSON-LD schema.org markup is widely used for classic search appearance (Google rich results, Bing structured snippets). Anthropic has not, at the time of writing, published a statement that structured data is a signal Claude consumes. Even so, valid JSON-LD remains useful:

It helps generic AI parsers identify the role of content blocks (an FAQ vs. a how-to vs. an article).
It is required or recommended for Google rich-result eligibility, which is independent of Claude.
It costs little to add and signals editorial discipline.

The most relevant schema types for a content site are Article (or TechArticle / NewsArticle as appropriate), FAQPage, BreadcrumbList, and Organization. For a full reference, see our Structured Data — Complete JSON-LD Guide.

A worked example: this page carries a TechArticle block with breadcrumb nested inside, plus a separate FAQPage block matching the visible FAQ. Both are valid JSON; you can verify with our JSON-LD validator.
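
For a quick local check that JSON-LD blocks at least parse, a sketch along these lines may help. It only confirms that each block is syntactically valid JSON and reports its @type; it does not validate against schema.org or any rich-result requirements. The sample page is invented, with one block deliberately broken by a trailing comma.

```python
import json
from html.parser import HTMLParser


class JsonLdBlocks(HTMLParser):
    """Collect the raw text of each <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def jsonld_types(html: str) -> list[str]:
    """Return the @type of each JSON-LD item, or a marker for bad blocks."""
    parser = JsonLdBlocks()
    parser.feed(html)
    parser.close()
    types = []
    for raw in parser.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            types.append("INVALID JSON")
            continue
        items = doc if isinstance(doc, list) else [doc]
        types.extend(
            str(item.get("@type", "MISSING @type"))
            for item in items
            if isinstance(item, dict)
        )
    return types


# Invented sample page; the third block has a trailing comma on purpose.
PAGE = """
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "TechArticle", "headline": "Example"}</script>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "FAQPage", "mainEntity": []}</script>
<script type="application/ld+json">{"@type": "Article",}</script>
"""
print(jsonld_types(PAGE))
```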

5. Technical best practices (checklist)

Items below are technical hygiene that may help when Anthropic's crawlers fetch your content. None of them is a "Claude ranking factor," and most apply equally to any AI or classic crawler. The list is deliberately short — depth and length here are not virtues.

Robots.txt is reachable at the site root and returns HTTP 200.
Robots.txt explicitly names ClaudeBot and anthropic-ai with an intentional Allow: or Disallow: rule, not relying on the default.
Sitemap is referenced in robots.txt via the Sitemap: directive.
Sitemap entries use HTTPS, are canonical, and have accurate <lastmod> values.
Every public page carries a unique <title> and <meta name="description">.
Every public page declares a single <link rel="canonical"> matching its served URL.
Main content is server-rendered, not produced only by client-side JavaScript.
HTML uses semantic elements (<h1>, ordered <h2>/<h3>, <p>, <ul>, <table>) rather than <div>-only structure.
Valid JSON-LD where applicable, with no fabricated Review or AggregateRating markup.
External links to authoritative sources include rel="noopener noreferrer" where they open in a new tab.
Pages load reasonably quickly under standard network conditions (Core Web Vitals help users; they are also a documented ranking signal for classic search).
An llms.txt file at the site root, if you publish one, lists your top-level inventory in plain text — treat as a low-cost good-citizen artifact.
Internal links across the site are reachable, descriptive, and not orphaning any meaningful page.
Pages with significant content updates carry an updated dateModified in their JSON-LD and a visible last-updated line in the body.
The site's analytics or logging captures hits from ClaudeBot and anthropic-ai separately from human traffic, so you can confirm whether crawls are actually happening.
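
Two of the robots.txt items in this checklist (explicitly naming both user-agents, and referencing a sitemap) can be checked with a few lines of text processing. The sketch below is a hypothetical helper, not a complete robots.txt parser; the sample file and its sitemap URL are invented.

```python
def robots_audit(robots_txt: str) -> dict[str, bool]:
    """Report whether robots.txt names both Anthropic agents and a sitemap."""
    agents = set()
    has_sitemap = False
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            agents.add(value.lower())
        elif field == "sitemap" and value:
            has_sitemap = True
    return {
        "names ClaudeBot": "claudebot" in agents,
        "names anthropic-ai": "anthropic-ai" in agents,
        "has Sitemap directive": has_sitemap,
    }


# Invented sample: names ClaudeBot but leaves anthropic-ai to the default group.
SAMPLE = """\
User-agent: ClaudeBot
Allow: /

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml
"""
for check, passed in robots_audit(SAMPLE).items():
    print(check, passed)
```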

6. Common mistakes

The mistakes below come up repeatedly in conversations with site operators evaluating AI crawler behavior. Most are not Claude-specific — they are general AI-visibility errors that surface particularly clearly when the topic is Claude, because Claude does not have a "ranking" remedy.

Treating robots.txt as a content-protection mechanism. It is not. Robots.txt is an advisory standard. It signals intent to well-behaved automated crawlers. It does not prevent a human from copying your content into a Claude prompt, nor does it bind any crawler that ignores the protocol.
Believing there is a "Claude SEO" toolkit. There isn't, because Claude is not a search engine. Anyone selling "rank higher in Claude" services is selling speculation, not documented practice.
Blocking ClaudeBot while expecting Claude's web-search tool to still surface your content. If Anthropic uses ClaudeBot as the fetcher for the web-search tool, a robots.txt block will reduce the chance of inclusion. Make blocking and inclusion choices deliberately, not by accident.
Adding speculative JSON-LD types because "AI might use them." If a schema type does not match your content, do not add it. Fabricated Review or AggregateRating blocks are widely treated as policy violations by classic search engines and provide no documented Claude benefit.
Assuming opt-out from training equals deletion. Disallowing anthropic-ai today does not retroactively remove content that was already used. It only signals intent for future crawls.
Skipping the verification step. If you add an opt-out rule, verify it in your server logs. Confirm that ClaudeBot requests for /robots.txt receive the updated file and that the bot is no longer fetching the disallowed paths. Without verification, you do not know whether your intent reached production.
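
The log verification step might look something like the sketch below. It assumes access logs in the common "combined" format and matches on the ClaudeBot token anywhere in the user-agent field. The sample log lines, IP addresses, and user-agent strings are invented for illustration and are not Anthropic's actual user-agent string; adapt the pattern to your own server's log format.

```python
import re

# Matches the request, status, size, referer, and user-agent fields
# of a combined-format access log line (an assumed format).
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def disallowed_hits(log_lines, agent_token, disallowed_prefixes):
    """Return paths the named agent requested under a disallowed prefix."""
    hits = []
    for line in log_lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        if agent_token.lower() not in match["agent"].lower():
            continue
        path = match["path"]
        if path == "/robots.txt":
            continue  # fetching robots.txt itself is expected
        if any(path.startswith(prefix) for prefix in disallowed_prefixes):
            hits.append(path)
    return hits


# Invented sample lines; the user-agent values are placeholders.
SAMPLE_LOG = [
    '203.0.113.7 - - [01/Jun/2026:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "ClaudeBot/1.0 (sample)"',
    '203.0.113.7 - - [01/Jun/2026:10:00:05 +0000] "GET /private/report HTTP/1.1" 200 512 "-" "ClaudeBot/1.0 (sample)"',
    '198.51.100.2 - - [01/Jun/2026:10:01:00 +0000] "GET /private/report HTTP/1.1" 200 512 "-" "Mozilla/5.0 (sample browser)"',
]
print(disallowed_hits(SAMPLE_LOG, "ClaudeBot", ["/private/"]))
```

An empty result over a reasonable time window after the rule went live is weak evidence that the opt-out is being respected. A non-empty result is worth investigating, keeping in mind that user-agent strings can be spoofed.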

7. What we don't claim

This section is deliberate. Trust pages and AI-visibility content tend to over-promise. We are explicit about the limits of what is known.

We do not claim that Anthropic publishes a ranking algorithm for Claude. The system is a generative model; there is no documented ranking layer.
We do not claim that JSON-LD, llms.txt, sitemap freshness, or any other technical signal improves the likelihood Claude will reference your content. These may help; Anthropic does not publish such a claim.
We do not claim that opt-out via robots.txt is a complete privacy mechanism. It is an advisory standard.
We do not reproduce community-circulated lists of "Claude SEO factors." Where the source is not Anthropic's own documentation, the claim is inference, not fact.
We do not guarantee future Anthropic behavior. Documentation changes; user-agents are renamed; tools are added or removed. Re-check the Sources before relying on this page.

If you find something on this page that contradicts how Anthropic actually operates, please write to info@helperg.com and we will update it.

8. FAQ

Does Claude have a search ranking algorithm I can optimize for?

No. Claude is a generative model, not a search engine. Anthropic does not publish ranking signals for Claude's responses, because the system does not rank pages the way Google or Bing does. The concrete actions on this page (clean HTML, robots.txt clarity, valid structured data) are general technical hygiene that may help when Claude's web-search tool fetches your page — they are not "Claude SEO" factors.

What is the difference between ClaudeBot and anthropic-ai?

Anthropic documents both user-agents. ClaudeBot is used for fetching content that may be referenced by Claude (including the web-search tool), while anthropic-ai has historically been used for training-data crawling. Both can be controlled separately in robots.txt. Refer to Anthropic's official support article for the current canonical user-agent names and behavior.

If I block ClaudeBot, will Claude users still see my content?

Probably yes, in some cases. If a Claude user pastes your URL into a prompt or uses Claude's web-search tool, the fetch may use a different user-agent or behave differently than a normal crawl. A robots.txt block prevents automated discovery by the named user-agent but does not prevent a human from copying your content into a prompt. Treat robots.txt as a crawler-access control, not a content-protection mechanism.

Should I add llms.txt to my site for Claude?

llms.txt is a community-proposed convention, not a documented signal that Anthropic claims to consume. Adding it costs little and may help LLM-based tools build an inventory of your content. Treat it as a low-cost good-citizen artifact, not as something Anthropic recommends or processes. See our llms.txt — Complete Implementation Guide for more.

Does structured data (JSON-LD) help Claude find or use my content?

Anthropic has not published a statement that structured data is a signal for Claude. JSON-LD remains useful for general search (Google, Bing) and for any AI system that parses page semantics. We recommend valid JSON-LD as general hygiene without claiming Claude-specific benefit.

How often does ClaudeBot crawl my site?

Anthropic does not publish a fixed crawl schedule. Crawl frequency can vary with the bot's purpose, how the crawler prioritizes your site, and whether it honors crawl-delay directives. If you observe excessive load, raise it with Anthropic via the contact listed in their crawler documentation, or apply a Crawl-delay directive. Note that Crawl-delay is not part of RFC 9309 and not all crawlers honor it.

Where is the canonical Anthropic documentation for ClaudeBot?

Anthropic publishes the canonical reference in their support documentation at support.anthropic.com. See the Sources section below for the current URL. If the URL changes, search the Anthropic support center for "crawl" or "robots.txt" to locate the updated page.

9. Sources

Anthropic — Does Anthropic crawl data from the web, and how can site owners block the crawler? — captured 2026-06
Anthropic developer documentation — welcome — captured 2026-06
Anthropic developer documentation — Claude's web-search tool — captured 2026-06
Anthropic developer documentation — Messages API reference — captured 2026-06
RFC 9309 — Robots Exclusion Protocol — captured 2026-06
schema.org — FAQPage — captured 2026-06
sitemaps.org — Sitemap protocol — captured 2026-06

Documentation drift note: Anthropic's support center and developer docs change as products evolve. If you rely on a specific URL from this page in your own work, capture the URL with the date you reviewed it, and re-check periodically.