ChatGPT Visibility — Complete Guide
Last updated: June 1, 2026
What this guide is and is not. This is the long-form reference for the OpenAI crawler family (GPTBot, OAI-SearchBot, ChatGPT-User) and the documented patterns for content discovery. For a 60-second concept primer, see ChatGPT Indexing. For a 5-minute task-focused workflow, see How to Improve ChatGPT Visibility. Both of those pages link here as the deep reference. The page is organized around the three user-agents, which is what distinguishes it from the general AI search visibility synthesis.
1. How ChatGPT discovers content
The OpenAI product surface that interacts with web content is split across three documented user-agents. OpenAI publishes them in its bots overview. The three serve distinct purposes, and they can be controlled independently in robots.txt. Conflating them is the source of most ChatGPT-visibility confusion.
| User-agent | Documented purpose | Triggered by |
|---|---|---|
| GPTBot | Crawl public web for OpenAI model training | Automated discovery (cycles through known web URLs) |
| OAI-SearchBot | Surface content in ChatGPT Search results | Search-index maintenance for the ChatGPT Search feature |
| ChatGPT-User | User-initiated retrieval within ChatGPT | A ChatGPT user asks ChatGPT to fetch a specific URL |
Sections 2–4 describe each in detail, with the same DOCUMENTED / RECOMMENDED / NOTE pattern used in the AI Crawlers — Complete Reference.
2. GPTBot — training-data crawler
- DOCUMENTED
- OpenAI publishes GPTBot as the user-agent used to crawl public web content for OpenAI model training. OpenAI documents the canonical user-agent string and the robots.txt opt-out mechanism in its GPTBot reference. OpenAI also publishes a JSON file of IP ranges for IP-level verification.
- RECOMMENDED
- If your content is meant for wide use (reference documentation, public knowledge), allow GPTBot. If your content is your competitive product (paywalled work, proprietary research), consider blocking it. Whichever you choose, make the decision deliberate: name GPTBot explicitly in your robots.txt rather than relying on the default.
- NOTE
- Blocking GPTBot only opts out of training. It does not affect OAI-SearchBot or ChatGPT-User. See §3 and §4.
User-agent: GPTBot
Disallow: /
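The IP-level verification mentioned in this section can be sketched roughly as follows. This is a minimal illustration, not OpenAI's method: the CIDR blocks are reserved documentation ranges standing in for the published list, and `is_from_published_range` is a hypothetical helper name. The schema of OpenAI's JSON file is not reproduced here, so the sketch takes a plain list of CIDR strings.

```python
import ipaddress

# Placeholder CIDR blocks (reserved documentation ranges, not real crawler IPs).
# Assumption: replace these with the prefixes from OpenAI's published file.
PUBLISHED_RANGES = ["192.0.2.0/24", "2001:db8::/32"]


def is_from_published_range(remote_ip: str, cidrs: list[str]) -> bool:
    """Return True if remote_ip falls inside any of the given CIDR blocks."""
    address = ipaddress.ip_address(remote_ip)
    networks = (ipaddress.ip_network(cidr) for cidr in cidrs)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so a mixed list of prefixes is safe to scan.
    return any(address in network for network in networks)


print(is_from_published_range("192.0.2.17", PUBLISHED_RANGES))
print(is_from_published_range("198.51.100.5", PUBLISHED_RANGES))
```

A request whose User-Agent claims to be GPTBot but whose source address fails this check is worth treating as unverified.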
3. OAI-SearchBot — ChatGPT Search crawler
- DOCUMENTED
- OpenAI documents OAI-SearchBot as the user-agent used to surface content in ChatGPT Search. It is distinct from GPTBot, and opt-out decisions for the two are independent.
- RECOMMENDED
- If you want your content to appear in ChatGPT Search results, allow OAI-SearchBot explicitly. A coherent and increasingly common stance: block GPTBot (opt out of training) but allow OAI-SearchBot (remain visible in ChatGPT Search). This separates the two concerns cleanly.
- NOTE
- OpenAI does not publish a ChatGPT Search ranking algorithm. Allowing OAI-SearchBot makes inclusion possible; the search system itself decides relevance per query.
User-agent: GPTBot
Disallow: /
User-agent: OAI-SearchBot
Allow: /
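One way to sanity-check the split stance above before deploying it is to run the rules through a robots.txt parser. The sketch below uses Python's standard-library `urllib.robotparser` as a stand-in; it is not the parser OpenAI uses, and it applies first-match rather than the longest-match precedence of RFC 9309, which makes no difference for rules this simple. The `allowed_agents` helper and the example URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# The same two groups as the snippet above.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Map each user-agent token to whether robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "OAI-SearchBot", "ChatGPT-User"],
    "https://example.com/docs/page",
)
print(result)
# {'GPTBot': False, 'OAI-SearchBot': True, 'ChatGPT-User': True}
```

ChatGPT-User comes back as allowed only because no group names it and there is no wildcard group, which is the implicit default the guide recommends replacing with an explicit rule.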
4. ChatGPT-User — user-initiated retrieval
- DOCUMENTED
- OpenAI documents ChatGPT-User as the user-agent that fires when a ChatGPT user takes an action that fetches a URL, for example when the user pastes a link into a prompt or asks ChatGPT to read a specific page. It is user-driven retrieval, not background crawling.
- RECOMMENDED
- Most sites allow ChatGPT-User. Blocking it primarily prevents users from successfully requesting your URL through ChatGPT. Decide based on what your content is for.
- NOTE
- ChatGPT-User fires only when the user acts. Blocking it does not prevent a user from copy-pasting your content into a prompt directly.
5. Robots.txt decisions
The three OpenAI user-agents map to three independent decisions. Most sites benefit from making each decision deliberately rather than relying on the wildcard fallback.
Pattern A — allow all OpenAI activity (visibility-favoring)
User-agent: GPTBot
Allow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: *
Allow: /
Pattern B — opt out of training only (training-restrictive)
User-agent: GPTBot
Disallow: /
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: *
Allow: /
Pattern C — opt out of all OpenAI activity
User-agent: GPTBot
Disallow: /
User-agent: OAI-SearchBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: *
Allow: /
For the full robots.txt protocol reference (precedence rules, wildcards, edge cases), see the robots.txt — Complete Guide. helperg.com's current /robots.txt implements Pattern A — every OpenAI user-agent is named with an explicit Allow.
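Because the three decisions are independent, the patterns above can be generated from a small policy table instead of being hand-edited. The following is a minimal sketch under that assumption; `build_robots_txt` and the policy dictionary shape are hypothetical names chosen for illustration, and the wildcard group is hard-coded to Allow as in Patterns A to C.

```python
OPENAI_AGENTS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")


def build_robots_txt(policy: dict[str, bool]) -> str:
    """Render one explicit group per OpenAI user-agent plus a wildcard group.

    policy maps a user-agent token to True (Allow) or False (Disallow).
    A missing token raises KeyError, so no decision is left implicit.
    """
    groups = []
    for agent in OPENAI_AGENTS:
        rule = "Allow" if policy[agent] else "Disallow"
        groups.append(f"User-agent: {agent}\n{rule}: /")
    groups.append("User-agent: *\nAllow: /")
    # Blank lines between groups are conventional and keep the file readable.
    return "\n\n".join(groups) + "\n"


# Pattern B: opt out of training only.
PATTERN_B = {"GPTBot": False, "OAI-SearchBot": True, "ChatGPT-User": True}
print(build_robots_txt(PATTERN_B))
```

Raising on a missing key is a deliberate choice here: it forces each of the three user-agents to be decided rather than inherited from the wildcard.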
6. Sitemap guidance
OpenAI's documented patterns for sitemap discovery follow the same conventions as classic search:
- Reference your sitemap in robots.txt via the Sitemap: directive.
- Use absolute URLs in the sitemap.
- Use accurate <lastmod> values. Falsifying lastmod is detectable.
- Ensure sitemap URLs match the canonical URLs declared on each page.
For the full sitemap reference, see the sitemap.xml — Complete Guide.
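The mechanical parts of the sitemap guidance in this section (absolute URLs, HTTPS, parseable lastmod values) can be checked with a short script. This is a sketch under stated assumptions: `sitemap_problems` is a hypothetical helper, the sample sitemap is invented, and the lastmod check is syntactic only. It accepts full dates and full date-times, not the year-only or year-month forms, and it cannot tell whether a date is truthful or whether a URL matches the page's declared canonical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from urllib.parse import urlparse

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_problems(sitemap_xml: str) -> list[str]:
    """Return human-readable problems found in a sitemap document."""
    problems = []
    root = ET.fromstring(sitemap_xml)
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        parts = urlparse(loc)
        if not (parts.scheme and parts.netloc):
            problems.append(f"not an absolute URL: {loc!r}")
        elif parts.scheme != "https":
            problems.append(f"not HTTPS: {loc}")
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        if lastmod is not None:
            text = lastmod.strip()
            # Older Python versions reject a trailing "Z", so normalize it.
            if text.endswith("Z"):
                text = text[:-1] + "+00:00"
            try:
                datetime.fromisoformat(text)
            except ValueError:
                problems.append(f"unparseable lastmod for {loc}: {lastmod!r}")
    return problems


SAMPLE = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/guide</loc><lastmod>2026-06-01</lastmod></url>
  <url><loc>/relative/path</loc></url>
  <url><loc>http://example.com/old</loc><lastmod>yesterday</lastmod></url>
</urlset>
"""
for problem in sitemap_problems(SAMPLE):
    print(problem)
```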
7. Metadata that matters
OpenAI has not published a ranked list of metadata signals that ChatGPT consumes. The pragmatic guidance is that the metadata that helps classic search and AI parsers generally — title, description, OG, canonical, valid structured data — is the metadata that gives OpenAI's user-agents the best chance of correctly understanding your page.
- Unique titles and descriptions. A good title summarizes the page; a good description gives the parser a one-sentence summary.
- Canonical link. Self-referential canonicals avoid accidental consolidation. See the Canonical URLs — Complete Guide.
- Valid JSON-LD. Particularly TechArticle, FAQPage, and BreadcrumbList on reference content. See the Structured Data — Complete JSON-LD Guide.
- Open Graph. Useful for link-preview generation when ChatGPT-User retrieves your URL.
None of the above is a documented OpenAI ranking signal. It is technical hygiene that may help any parser including OpenAI's.
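The hygiene items in this section can be audited per page with the standard library alone. The sketch below is an illustration, not a validator: `HeadAudit` is a hypothetical class, the sample page is invented, and "valid JSON-LD" here only means the script body parses as JSON. It does not check the markup against the schema.org vocabulary.

```python
import json
from html.parser import HTMLParser


class HeadAudit(HTMLParser):
    """Collect title, description, canonical, Open Graph, and JSON-LD status."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self.open_graph = {}
        self.json_ld_valid = []  # one bool per JSON-LD script block
        self._in_title = False
        self._in_json_ld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and a.get("name") == "description":
            self.description = a.get("content")
        elif tag == "meta" and (a.get("property") or "").startswith("og:"):
            self.open_graph[a["property"]] = a.get("content")
        elif tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")
        elif tag == "script" and a.get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                json.loads("".join(self._buffer))
                self.json_ld_valid.append(True)
            except json.JSONDecodeError:
                self.json_ld_valid.append(False)


PAGE = """<html><head>
<title>Example Guide</title>
<meta name="description" content="One-sentence summary.">
<link rel="canonical" href="https://example.com/guide">
<meta property="og:title" content="Example Guide">
<script type="application/ld+json">{"@type": "TechArticle"}</script>
<script type="application/ld+json">{not json}</script>
</head><body></body></html>"""

audit = HeadAudit()
audit.feed(PAGE)
print(audit.title, audit.canonical, audit.open_graph, audit.json_ld_valid)
```

Comparing `audit.canonical` with the URL the page was fetched from is a quick way to confirm the canonical is self-referential.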
8. Common misconceptions
- "Blocking GPTBot keeps my content out of ChatGPT." No. It blocks training-data crawling only. OAI-SearchBot and ChatGPT-User still operate. See §3 and §4.
- "GPTBot and OAI-SearchBot are the same crawler." They are not. They have distinct documented purposes and independent opt-outs.
- "There is a ChatGPT SEO playbook." OpenAI does not publish ranking signals for ChatGPT Search beyond what is in the bots reference. Tactical "rank in ChatGPT" content is speculation.
- "Blocking ChatGPT-User prevents my content from being used in ChatGPT prompts." It only prevents the automatic fetch. A user can still copy-paste your content directly into a prompt.
- "llms.txt is required for ChatGPT visibility." OpenAI has not documented llms.txt as a signal that its user-agents consume. See the llms.txt — Complete Implementation Guide.
- "OpenAI's documentation does not change." It does. Re-verify the canonical URLs at the time you act on them.
9. Checklist
- Robots.txt names GPTBot with a deliberate Allow or Disallow rule.
- Robots.txt names OAI-SearchBot separately from GPTBot.
- Robots.txt names ChatGPT-User separately, with awareness that this is user-initiated.
- The decision for each is documented in your internal style guide.
- Robots.txt is at the site root and returns HTTP 200.
- Sitemap is referenced in robots.txt via the Sitemap: directive.
- Sitemap entries use HTTPS, canonical URLs, and accurate lastmod values.
- Every public page has a unique title and meta description.
- Every public page declares a self-referential canonical.
- Open Graph metadata is present for link-preview correctness.
- Valid JSON-LD is present where applicable (TechArticle/Article + BreadcrumbList + optional FAQPage).
- Server logs capture User-Agent strings and are reviewable.
- Log monitoring confirms GPTBot, OAI-SearchBot, and ChatGPT-User fetch behavior matches your robots.txt intent.
- OpenAI's GPTBot IP-range JSON is captured locally and refreshed quarterly.
- Any opt-out rule is verified in production via log review.
- Documentation review for OpenAI's bots reference happens on a quarterly cadence.
- If using llms.txt, it is maintained but not relied on as a documented signal.
10. FAQ
Does ChatGPT have a search algorithm I can optimize for?
ChatGPT's product surface includes a search feature (ChatGPT Search) that uses OAI-SearchBot to gather content. OpenAI documents the bot but does not publish a ranking algorithm. Concrete technical practices on this page may help content be discoverable; they are not "ChatGPT SEO factors" in the way Google publishes ranking signals.
What is the difference between GPTBot and OAI-SearchBot?
OpenAI documents GPTBot as the crawler used to gather public web data for OpenAI model training. OAI-SearchBot is documented as the crawler used to surface content in ChatGPT Search results. They can be allowed or blocked independently in robots.txt.
What does ChatGPT-User do?
OpenAI documents ChatGPT-User as the user-agent that fires when a ChatGPT user takes an action that fetches a URL — for example, pasting a link into a prompt or asking ChatGPT to read a specific page. It is user-driven retrieval, distinct from automated crawling.
If I block GPTBot, will my content stay out of ChatGPT?
Not entirely. Blocking GPTBot signals an opt-out from OpenAI's training-data crawling. It does not affect OAI-SearchBot (which retrieves content for ChatGPT Search) or ChatGPT-User (which fires when a user pastes your URL). To opt out of search-result visibility, block OAI-SearchBot separately.
Does OpenAI publish IP ranges?
Yes. OpenAI publishes a JSON file of GPTBot IP ranges that you can use to verify that a request claiming to be GPTBot is actually from OpenAI's infrastructure. The IP ranges are documented alongside the GPTBot reference.
Should I add llms.txt for ChatGPT?
OpenAI has not published a statement that ChatGPT consumes llms.txt as a documented signal. The convention is community-proposed and adopted by some LLM tools. Add it for inventory clarity if you want, but do not rely on it as a ChatGPT-visibility lever. See our llms.txt — Complete Implementation Guide.
Does structured data (JSON-LD) help in ChatGPT?
OpenAI has not published a commitment that structured data is consumed as a signal. Valid JSON-LD remains useful for general semantic clarity and classic-search rich results. Treat it as classic-SEO hygiene that may help any parser, not as a documented ChatGPT-visibility lever.
How can I monitor ChatGPT-User and OAI-SearchBot in my logs?
Server logs capture the User-Agent string of incoming requests. Filter by the "ChatGPT-User" and "OAI-SearchBot" substrings. The first appears when ChatGPT users fetch your URLs directly; the second appears when ChatGPT Search retrieves them for its search index. Cross-check User-Agent matches against OpenAI's published IP ranges, because the User-Agent header alone can be spoofed.
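The substring filter described in this answer might look like the sketch below. It assumes the common "combined" access-log format, where the User-Agent is the last double-quoted field on each line. The sample lines and their User-Agent strings are illustrative placeholders, not OpenAI's canonical strings, and `count_openai_fetches` is a hypothetical helper name.

```python
import re
from collections import Counter

OPENAI_TOKENS = ("GPTBot", "OAI-SearchBot", "ChatGPT-User")

# Assumption: the User-Agent is the last double-quoted field on the line.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_openai_fetches(log_lines) -> Counter:
    """Count requests per OpenAI user-agent token found in the User-Agent."""
    counts = Counter()
    for line in log_lines:
        match = LAST_QUOTED_FIELD.search(line)
        if not match:
            continue  # line does not look like a combined-format entry
        user_agent = match.group(1).lower()
        for token in OPENAI_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break
    return counts


# Illustrative log lines with placeholder User-Agent strings.
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jun/2026:10:00:00 +0000] "GET /guide HTTP/1.1" 200 512 "-" "Example (compatible; GPTBot/0.0)"',
    '192.0.2.11 - - [01/Jun/2026:10:01:00 +0000] "GET /guide HTTP/1.1" 200 512 "-" "Example (compatible; OAI-SearchBot/0.0)"',
    '192.0.2.12 - - [01/Jun/2026:10:02:00 +0000] "GET /faq HTTP/1.1" 200 256 "-" "Example (compatible; ChatGPT-User/0.0)"',
    '198.51.100.7 - - [01/Jun/2026:10:03:00 +0000] "GET / HTTP/1.1" 200 128 "-" "Mozilla/5.0 (X11; Linux x86_64)"',
]
counts = count_openai_fetches(SAMPLE_LOG)
print(dict(counts))
```

The counts only reflect what the header claims, so pair them with an IP-range check before drawing conclusions about opt-out compliance.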
11. Sources
- OpenAI — Bots overview (GPTBot, OAI-SearchBot, ChatGPT-User) — captured 2026-06
- OpenAI — GPTBot reference (IP ranges, opt-out) — captured 2026-06
- RFC 9309 — Robots Exclusion Protocol — captured 2026-06
- Google Search Central — Sitemap overview (for sitemap conventions) — captured 2026-06
- sitemaps.org — Sitemap protocol — captured 2026-06
- schema.org — for the JSON-LD vocabulary — captured 2026-06