Technical SEO Checklist — Complete Implementation
Last updated: June 1, 2026
What this guide is and is not. An implementation-ready checklist of 60+ items organized by category, with what to check, where to fix, and severity if missing. The checklist is the page's spine: per-topic guides are linked for depth, but the operational reference is here. For a shorter task-focused page on improving technical SEO, see How to Improve Technical SEO. For per-topic depth, follow the links in each section.
1. How to use this checklist
The checklist below can be used in three modes.
New site / pre-launch
Walk every section before launch. Items missing at launch compound — a missing canonical on the article template applies to every article you publish.
Existing site / audit
Sample 5–10 representative URLs across templates. Template bugs are the most common source of widespread issues; sampling across templates catches them.
Ongoing / quarterly
Quarterly review with focus on what has changed: new templates, new content sections, new third-party integrations. Search Console errors are the trigger for unscheduled reviews.
2. Crawlability and indexation
The first layer. If crawlers cannot reach or correctly classify your URLs, nothing else matters.
- Robots.txt at host root, returns HTTP 200, Content-Type text/plain.
- Sitemap is referenced in robots.txt via a Sitemap: directive (absolute URL).
- Sitemap.xml exists, is valid XML, lists only canonical URLs that return 200.
- Sitemap.xml's lastmod values reflect actual content updates (not falsified).
- Every public page returns HTTP 200 directly (no unnecessary redirects).
- 404 pages return 404 (not 200 with a "page not found" body).
- noindex meta tag is present on pages you do not want indexed, and those pages are not blocked in robots.txt (the crawler must be able to fetch the page to read the tag).
- robots.txt does not block CSS/JS that the page needs to render.
- No security-sensitive paths advertised in robots.txt.
For depth: robots.txt — Complete Guide, sitemap.xml — Complete Guide.
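Several of the robots.txt items above can be checked offline against the file's text. The following is a minimal sketch using Python's standard urllib.robotparser; the function name audit_robots, the example.com URLs, and the asset paths are illustrative assumptions, not part of any tool named in this guide. Python's parser applies rules in file order, which can differ from the longest-match behavior described in RFC 9309, so treat the result as a first pass rather than a statement of how a given crawler will behave.

```python
from urllib.robotparser import RobotFileParser


def audit_robots(robots_txt: str, base_url: str, asset_paths: list[str],
                 agent: str = "Googlebot") -> dict:
    """Parse robots.txt text and report sitemap and asset-blocking problems."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # site_maps() returns None when the file has no Sitemap: line.
    sitemaps = parser.site_maps() or []
    return {
        "sitemaps": sitemaps,
        # The Sitemap: directive should carry an absolute URL.
        "relative_sitemaps": [s for s in sitemaps
                              if not s.startswith(("http://", "https://"))],
        # Render-critical CSS/JS paths (hypothetical) the agent may not fetch.
        "blocked_assets": [p for p in asset_paths
                           if not parser.can_fetch(agent, base_url + p)],
    }


sample = """User-agent: *
Disallow: /assets/js/
Sitemap: https://example.com/sitemap.xml
"""
report = audit_robots(sample, "https://example.com",
                      ["/assets/js/app.js", "/assets/css/site.css"])
print(report)
```

The sketch does not fetch anything, so the HTTP 200 and Content-Type checks still need a separate request against the live file.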
3. Canonical URLs
The signal that consolidates ranking across URL variants.
- Every public page declares exactly one <link rel="canonical"> in head.
- The canonical is an absolute URL using HTTPS.
- The canonical resolves to HTTP 200 (no redirects in the canonical itself).
- Self-referential canonicals on standalone pages.
- Parameter variants point to the clean canonical URL when content is the same.
- Paginated URLs each use self-referential canonicals (not all pointing to page 1).
- One canonical host (www vs apex) applied consistently across the site.
- Cross-domain canonicals only where syndication intent is documented.
For depth: Canonical URLs — Complete Guide.
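The first two items in the list above (exactly one canonical, absolute HTTPS URL) can be checked from a page's HTML alone. This is a rough sketch with Python's standard html.parser; the names CanonicalCollector and canonical_problems are hypothetical. It does not confirm that the tag sits inside head or that the target returns HTTP 200, which needs a live request.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel is a space-separated token list, compared case-insensitively.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.hrefs.append(attributes.get("href") or "")


def canonical_problems(html: str) -> list[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    collector = CanonicalCollector()
    collector.feed(html)
    problems = []
    if len(collector.hrefs) != 1:
        problems.append(f"expected 1 canonical, found {len(collector.hrefs)}")
    for href in collector.hrefs:
        parts = urlparse(href)
        if parts.scheme != "https" or not parts.netloc:
            problems.append(f"not an absolute HTTPS URL: {href!r}")
    return problems
```

Running canonical_problems over the 5–10 sampled URLs per template is one way to surface template-level canonical bugs early.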
4. Metadata
The basics every public page needs.
- Unique <title> per page, descriptive and concise (around 50–60 characters where natural).
- Unique <meta name="description"> per page (around 150–160 characters where natural).
- HTML lang attribute matches the page's language.
- Viewport meta tag for responsive rendering.
- Robots meta tag explicit where applicable (index, follow default; noindex on intentionally non-indexed pages).
- Open Graph metadata (og:title, og:description, og:url, og:image) for link-preview correctness.
- Twitter card metadata where applicable.
- Favicon and apple-touch-icon present and resolvable.
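The uniqueness and length items above lend themselves to a bulk check once titles have been collected per URL. A small sketch follows, assuming a {url: title} mapping produced by your own crawl; the function name title_issues and the 60-character default are illustrative, and the length is a guideline rather than a hard limit.

```python
from collections import defaultdict


def title_issues(titles: dict[str, str], max_len: int = 60) -> dict:
    """Flag duplicate and over-long titles in a {url: title} mapping."""
    by_title = defaultdict(list)
    for url, title in titles.items():
        # Compare case-insensitively and ignore surrounding whitespace.
        by_title[title.strip().lower()].append(url)
    return {
        "duplicates": [sorted(urls) for urls in by_title.values()
                       if len(urls) > 1],
        "too_long": sorted(url for url, title in titles.items()
                           if len(title.strip()) > max_len),
    }
```

The same function can be pointed at meta descriptions by passing max_len=160.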
5. Structured data
JSON-LD that helps any parser understand the page semantically.
- Every <script type="application/ld+json"> block parses as valid JSON.
- Primary @type matches the actual page content (Article, TechArticle, Product, WebPage, etc.).
- BreadcrumbList declared on pages with breadcrumb navigation.
- FAQPage declared only on pages with an actual visible FAQ.
- Organization declared on the homepage with consistent name, url, logo.
- No fabricated Review or AggregateRating markup.
- JSON-LD content matches visible page content (no schema-only claims).
- dateModified is current when content updates.
- The Rich Results Test passes for any rich-result type you intend to be eligible for.
For depth: Structured Data — Complete JSON-LD Guide.
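The "parses as valid JSON" and "primary @type" items above can be sketched with the standard library. The class and function names here (JsonLdCollector, jsonld_report) are hypothetical, and the sketch only reads top-level @type values; it does not unpack @graph containers or validate against schema.org, so it complements the Rich Results Test rather than replacing it.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the raw text of each application/ld+json script block."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[str] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks; accumulate them.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def jsonld_report(html: str) -> list[dict]:
    """Report, per block, whether it parses and which @type values it declares."""
    collector = JsonLdCollector()
    collector.feed(html)
    collector.close()
    report = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            report.append({"valid": False, "error": exc.msg})
            continue
        items = data if isinstance(data, list) else [data]
        types = [item.get("@type") for item in items if isinstance(item, dict)]
        report.append({"valid": True, "types": types})
    return report
```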
6. Internal links
The crawl graph that determines discoverability.
- Every page worth indexing is reachable from the homepage within three clicks.
- Every hub has a card grid linking to all its spokes.
- Every spoke links back to its hub.
- Cross-spoke links where content naturally references other content.
- Anchor text is descriptive — no "click here," "read more," or bare URLs.
- Anchor text varies naturally across multiple links to the same page.
- No orphan pages (URLs in sitemap not reachable through internal links).
- Internal links target canonical URLs (not parameter variants).
- HTTPS used consistently on internal links.
For depth: Internal Linking — Complete Guide.
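The three-click and orphan-page items above both reduce to a breadth-first search over the internal link graph. A minimal sketch, assuming you already have a {page: [linked pages]} mapping from a crawl and a list of sitemap URLs; the function names click_depths and link_issues are illustrative.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Breadth-first search: minimum number of clicks from home to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def link_issues(links: dict[str, list[str]], home: str,
                sitemap_urls: list[str], max_depth: int = 3) -> dict:
    """Pages deeper than max_depth, and sitemap URLs no internal link reaches."""
    depths = click_depths(links, home)
    return {
        "too_deep": sorted(url for url, d in depths.items() if d > max_depth),
        "orphans": sorted(url for url in sitemap_urls if url not in depths),
    }
```

Normalizing URLs to their canonical form before building the graph keeps parameter variants from appearing as separate nodes.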
7. Performance — Core Web Vitals
Google has publicly documented Core Web Vitals (Largest Contentful Paint, Interaction to Next Paint, Cumulative Layout Shift) as a ranking signal. The size of the effect varies by query and competitive field.
LCP — Largest Contentful Paint
Measures perceived load speed: how long until the largest visible content element renders. Target: 2.5 seconds or less at the 75th percentile of users. See web.dev / LCP.
INP — Interaction to Next Paint
Measures responsiveness: how quickly the page responds to user interactions across the page lifecycle. Target: 200 ms or less at the 75th percentile. See web.dev / INP.
CLS — Cumulative Layout Shift
Measures visual stability: unexpected layout shifts during the page lifecycle. Target: 0.1 or less at the 75th percentile. See web.dev / CLS.
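To make "at the 75th percentile" concrete, here is a small sketch that evaluates a list of field samples against the three targets. The nearest-rank percentile method and the names THRESHOLDS, p75, and passes are assumptions for illustration; field-data tools may compute the percentile differently.

```python
import math

# Targets from the three metric definitions above (milliseconds for LCP/INP).
THRESHOLDS = {"lcp_ms": 2500, "inp_ms": 200, "cls": 0.1}


def p75(samples: list[float]) -> float:
    """75th percentile by the nearest-rank method (an assumed method)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(0.75 * len(ordered))
    return ordered[rank - 1]


def passes(metric: str, samples: list[float]) -> bool:
    """True when the 75th-percentile value is at or under the target."""
    return p75(samples) <= THRESHOLDS[metric]
```

The point of the percentile is that a fast median is not enough: a page passes only when three out of four sampled experiences meet the target.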
Performance items
- LCP element is identified and optimized (preload critical resources, server-rendered, no late client work).
- Images use modern formats (WebP, AVIF) where supported.
- Images have width and height attributes to prevent CLS.
- Fonts use font-display: swap or are preloaded to reduce CLS.
- Render-blocking CSS and JS are minimized.
- JavaScript bundles use code-splitting where applicable.
- Search Console Core Web Vitals report is reviewed at field-data scale.
8. AI visibility considerations
Most technical SEO work overlaps with AI visibility. The AI-specific layer adds per-crawler robots.txt rules and a few content-clarity considerations. See the AI Search Visibility — Complete Guide for the full discussion.
- Robots.txt names each AI crawler with a deliberate Allow or Disallow (GPTBot, OAI-SearchBot, ChatGPT-User, ClaudeBot, anthropic-ai, PerplexityBot, Google-Extended, Applebot-Extended, meta-externalagent, Bytespider, CCBot).
- Main content is server-rendered, not client-only JavaScript.
- Semantic HTML (h1-h3, p, ul, table) is preferred over <div>-only structure.
- Content has clear factual sections and source attribution where claims need them.
- An llms.txt file is published if you want LLM-tool inventory clarity (not a documented ranking signal).
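For the per-crawler item in the list above, one way to see which AI crawlers still lack a deliberate rule is to compare the User-agent lines in robots.txt against the crawler names listed there. A minimal sketch; the function names are hypothetical, and it deliberately treats a "User-agent: *" group as not counting as a per-crawler decision.

```python
# Crawler tokens taken from the checklist item above.
AI_CRAWLERS = [
    "GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "anthropic-ai",
    "PerplexityBot", "Google-Extended", "Applebot-Extended",
    "meta-externalagent", "Bytespider", "CCBot",
]


def named_agents(robots_txt: str) -> set[str]:
    """Lower-cased values of every User-agent line in the file."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            agents.add(value.strip().lower())
    return agents


def undecided_crawlers(robots_txt: str,
                       crawlers: list[str] = AI_CRAWLERS) -> list[str]:
    """Crawlers from the list that have no group of their own."""
    named = named_agents(robots_txt)
    return [name for name in crawlers if name.lower() not in named]
```

An empty result means every listed crawler has its own group; whether each group allows or disallows is still a policy decision to review by hand.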
9. Common mistakes
- Inconsistent canonicals across templates. Template-level bugs scale fast.
- Robots.txt blocking CSS or JS that the page needs to render. Defeats Google's ability to evaluate the page.
- Sitemap entries with non-canonical URLs. Scatters signals.
- Falsified sitemap lastmod values. Google detects and may discount the field site-wide.
- Self-conflicting canonicals. Page A → Page B → Page A. Neither indexes cleanly.
- FAQPage schema without a visible FAQ section. Schema and visible content must match.
- Fabricated Review or AggregateRating schema. Documented Google policy violation.
- Orphan pages in sitemap. URLs nobody links to from elsewhere.
- Blocking AI crawlers without a documented reason. Decide deliberately, not by accident.
- noindex meta tag combined with robots.txt Disallow. Crawler cannot read the noindex if it cannot crawl the page.
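The self-conflicting canonical case in the list above (Page A → Page B → Page A) can be detected by following canonical declarations until a URL repeats. A sketch, assuming a {url: declared canonical} mapping gathered from a crawl; the function name canonical_cycles is illustrative.

```python
def canonical_cycles(canonical_of: dict[str, str]) -> list[list[str]]:
    """Return each loop found when following canonical declarations."""
    cycles = []
    already_reported = set()
    for start in canonical_of:
        path = []       # URLs visited from this start, in order
        position = {}   # url -> index in path
        url = start
        while url in canonical_of and url not in position:
            if canonical_of[url] == url:
                break  # self-referential canonical: a valid end point
            position[url] = len(path)
            path.append(url)
            url = canonical_of[url]
        if url in position:
            # We returned to a URL already on this path: that tail is a loop.
            cycle = path[position[url]:]
            if not already_reported.intersection(cycle):
                cycles.append(cycle)
                already_reported.update(cycle)
    return cycles
```

Chains that end on a self-referential canonical are not reported as loops, though long chains are still worth flattening so each page points directly at the final URL.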
10. Master tabular checklist
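The consolidated checklist below combines the items from sections 2–8 and adds operational items (redirects, HTTPS, Search Console setup, ownership) that have no section of their own. Use this as the operational reference; the per-section content above gives context.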
- robots.txt exists at host root, returns HTTP 200, Content-Type text/plain.
- robots.txt uses UTF-8 encoding.
- robots.txt names each AI crawler with a deliberate rule.
- robots.txt does not block CSS/JS needed for rendering.
- robots.txt declares Sitemap directive with absolute URL.
- sitemap.xml is valid XML.
- sitemap.xml lists only canonical URLs returning 200.
- sitemap.xml lastmod values reflect actual content updates.
- sitemap.xml does not exceed 50,000 URLs or 50 MB (uncompressed) per file.
- If sitemap exceeds limits, sitemap index file is used.
- Every public page returns HTTP 200 directly.
- 404 pages return 404 status.
- 301 redirects used for permanent moves; 302 for temporary.
- noindex meta tag on intentionally non-indexed pages.
- Every public page has exactly one rel=canonical link.
- Canonical is absolute URL using HTTPS.
- Canonical URL returns HTTP 200 (no redirect chain).
- Self-referential canonicals on standalone pages.
- Parameter variants point to clean canonical URL.
- Paginated URLs use self-referential canonicals.
- One canonical host applied consistently (www vs apex).
- Cross-domain canonicals only for documented syndication.
- Unique title per page (~50-60 chars where natural).
- Unique meta description per page (~150-160 chars).
- HTML lang attribute matches page language.
- Viewport meta tag present.
- Open Graph metadata (og:title, og:description, og:url, og:image) present.
- Twitter card metadata where applicable.
- Favicon and apple-touch-icon resolve.
- JSON-LD parses on every page that declares it.
- Primary @type matches actual page content.
- BreadcrumbList declared on pages with breadcrumb nav.
- FAQPage declared only when visible FAQ exists.
- Organization declared on homepage with consistent name/url/logo.
- No fabricated Review or AggregateRating markup.
- JSON-LD content matches visible page content.
- dateModified updated when content updates.
- Rich Results Test passes for intended rich-result types.
- Every indexable page reachable from homepage within 3 clicks.
- Hub pages have card grid linking to spokes.
- Spokes link back to hub.
- Cross-spoke links where content cross-references.
- Anchor text descriptive (no "click here" / "read more").
- Anchor text varies naturally across multiple links.
- No orphan pages (sitemap matches reachable-from-homepage set).
- Internal links target canonical URLs.
- HTTPS consistent on internal links.
- LCP element identified and optimized.
- Images use modern formats where supported.
- Images have width and height attributes.
- Fonts use font-display:swap or are preloaded.
- Render-blocking CSS/JS minimized.
- JS bundles use code-splitting where applicable.
- Search Console Core Web Vitals report reviewed.
- Main content server-rendered.
- Semantic HTML preferred over div-only structure.
- llms.txt published if desired (inventory clarity, not ranking).
- HTTPS enforced site-wide (HSTS where appropriate).
- SSL certificate valid and chained correctly.
- Site loads on mobile (responsive design or mobile-specific surface).
- Search Console set up and verified.
- Bing Webmaster Tools set up (optional but worthwhile).
- Sitemap submitted in Search Console.
- Coverage report reviewed monthly or quarterly.
- Structured data report in Search Console reviewed.
- An owner is identified for ongoing technical SEO health.
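For the sitemap size items in the list above, splitting a large URL set into files of at most 50,000 URLs and listing them in a sitemap index can be sketched as follows. The file naming (sitemap-1.xml, sitemap-2.xml, ...) and the function names are assumptions for illustration, and the sketch does not check the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls: list[str], size: int = MAX_URLS_PER_FILE) -> list[list[str]]:
    """Split the URL list into consecutive chunks of at most `size` entries."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def sitemap_index(base_url: str, file_count: int) -> str:
    """Build a sitemap index pointing at sitemap-1.xml .. sitemap-N.xml."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, file_count + 1):
        entry = ET.SubElement(root, "sitemap")
        # Hypothetical naming scheme; use whatever paths your site serves.
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{n}.xml"
    return ET.tostring(root, encoding="unicode")
```

The index file, rather than the individual sitemaps, is then what the Sitemap: directive in robots.txt and the Search Console submission point to.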
11. FAQ
Where should I start with technical SEO?
Crawlability and indexation first — confirm robots.txt and sitemap are correct, every public page returns 200, canonicals are self-referential. Once the basics are right, work through metadata, structured data, internal linking, and Core Web Vitals in that rough order.
Are Core Web Vitals a ranking signal?
Google has publicly documented Core Web Vitals (LCP, INP, CLS) as a ranking signal. The size of the effect varies by query and competitor field; treat performance as one signal among several and as a real user-experience win regardless of ranking impact.
What is the difference between crawl and indexation?
Crawl is the process of fetching a URL. Indexation is the decision to include that URL in the search index. A page can be crawled but not indexed (low quality, duplicate, noindex). Robots.txt controls crawl; noindex meta and canonical signals control indexation.
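Because the two controls act at different stages, a noindex tag on a URL that robots.txt disallows is never read. A minimal sketch of that check with Python's standard urllib.robotparser, assuming you have a list of URLs known to carry noindex; the function name unreadable_noindex is hypothetical, and Python's parser applies rules in file order, which may differ from a given crawler's matching.

```python
from urllib.robotparser import RobotFileParser


def unreadable_noindex(robots_txt: str, noindex_urls: list[str],
                       agent: str = "Googlebot") -> list[str]:
    """URLs whose noindex tag cannot be read because crawling is disallowed."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in noindex_urls if not parser.can_fetch(agent, url)]
```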
Do I need separate AI visibility configuration?
Most technical SEO work overlaps with AI visibility — robots.txt cleanliness, canonical correctness, structured data, semantic HTML. AI-specific decisions add per-crawler robots.txt rules for GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and similar. See the AI Crawlers — Complete Reference.
How often should I run a technical SEO audit?
Quarterly for most sites; monthly for high-traffic sites or sites under active redesign. After any major template or platform change, run the checklist immediately.
What is the most common technical SEO bug?
Inconsistent canonicals across templates — one template declares self-referential canonicals correctly while another declares a wrong or stale canonical. Template-level bugs scale fast. Sample-audit across templates to catch them.
Does technical SEO matter if my content is good?
Good content that crawlers cannot reach, index, or understand cannot rank. Technical SEO is the layer between content quality and discoverability. Both matter.
What does Search Console tell me that the site itself does not?
Google's view of the site: indexed pages, coverage errors, structured data validation, Core Web Vitals at field-data scale, search impressions and click-through rate per query. The site shows you what is there; Search Console shows you what Google sees.
12. Sources
- Google Search Central — Documentation home — captured 2026-06
- Google Search Central — SEO starter guide — captured 2026-06
- Google Search Central — Sitemap overview — captured 2026-06
- Google Search Central — Robots intro — captured 2026-06
- Google Search Central — Consolidate duplicate URLs — captured 2026-06
- Google Search Central — Structured data introduction — captured 2026-06
- web.dev — Core Web Vitals — captured 2026-06
- web.dev — LCP — captured 2026-06
- web.dev — INP — captured 2026-06
- web.dev — CLS — captured 2026-06
- schema.org — home — captured 2026-06
- sitemaps.org — Sitemap protocol — captured 2026-06
- RFC 9309 — Robots Exclusion Protocol — captured 2026-06