Canonical URLs — Complete Guide
Last updated: June 1, 2026
What this guide is and is not. This is the conceptual reference for the rel=canonical signal: how it is defined (RFC 6596), how Google documents consuming it, common mistakes, and worked examples. If you want a task-focused workflow for fixing specific canonical bugs, see How to Fix Canonical URL Issues.
1. What canonicals are
The rel=canonical signal is defined in RFC 6596. It is a link relation that names the preferred URL when multiple URLs serve substantially similar content. The signal is consumed by search engines (and AI systems that follow link semantics) to consolidate ranking and indexing toward the chosen URL.
Google publishes the consolidate-duplicate-urls documentation describing how it processes canonicals. The signal is documented as a strong hint, not a binding directive — Google considers it alongside other consolidation signals (sitemap entries, internal linking, HTTP redirects). If signals conflict, Google may choose a different canonical than the one you declared.
The HTML syntax is a link element in the document head:
<link rel="canonical" href="https://example.com/preferred-url">
The URL should be absolute, use HTTPS where the site uses HTTPS, and resolve directly with a 200 response rather than a redirect or an error.
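To make the syntax concrete, here is a minimal Python sketch that pulls every rel=canonical href out of an HTML string and checks that a URL is absolute and HTTPS. The names CanonicalCollector, find_canonicals, and is_absolute_https are illustrative assumptions, not part of any standard tool, and the sketch does not verify that the tag sits inside the head or that the URL resolves.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel is a space-separated token list; compare case-insensitively.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_canonicals(html):
    """Return all canonical hrefs found anywhere in the HTML (head not enforced)."""
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs


def is_absolute_https(url):
    """True when the URL has an https scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme == "https" and bool(parts.netloc)


page = '<html><head><link rel="canonical" href="https://example.com/preferred-url"></head></html>'
print(find_canonicals(page))
print(is_absolute_https(find_canonicals(page)[0]))
```

A list is returned rather than a single value so that a page with zero or several canonical tags is visible to the caller.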
2. HTML link vs. HTTP Link header
RFC 6596 documents two equivalent ways to declare a canonical:
HTML link tag
<link rel="canonical" href="https://example.com/preferred-url">
Used for HTML pages. Sits in the document head.
HTTP Link response header
Link: <https://example.com/preferred-url>; rel="canonical"
Used for non-HTML resources (PDFs, images, JSON endpoints) where you cannot place a head tag. Also valid on HTML responses if you prefer header-based metadata.
Both are documented as equivalent. Google consumes both. Use whichever fits the response type.
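As an illustration of the header form, the following Python sketch extracts the canonical target from a Link header value. The function name canonical_from_link_header is a hypothetical helper, and the parsing is deliberately simplified: it assumes parameter values contain no commas or semicolons, which a full Link header parser would have to handle.

```python
import re


def canonical_from_link_header(value):
    """Return the target of the first link-value with rel=canonical, or None.

    Simplified sketch: assumes no commas or semicolons inside quoted
    parameter values.
    """
    # Each link-value looks like: <target>; param=value; param="value"
    for match in re.finditer(r'<([^>]*)>((?:\s*;\s*[^;,]*)*)', value):
        target, params = match.group(1), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";,]*)"?', params, re.IGNORECASE)
        # rel may carry several space-separated relation types.
        if rel and "canonical" in rel.group(1).lower().split():
            return target
    return None


header = '<https://example.com/preferred-url>; rel="canonical"'
print(canonical_from_link_header(header))
```

The same function can be applied to the header value returned by any HTTP client, for example for a PDF response that has no head element.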
3. Self-referencing canonicals
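Every public HTML page that is the preferred version of its content should declare a self-referential canonical pointing to the URL it is served at. This is the cleanest signal and avoids accidental cross-page collapse. Duplicate variants, such as the parameter URLs covered in section 4, point to the preferred URL instead.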
For example, the canonical on this page is:
<link rel="canonical" href="https://helperg.com/technical-seo/canonical-urls-guide.html">
The page is served at the same URL the canonical names. The signal confirms that the page's preferred URL is itself, not some variant or aggregator.
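A self-referential check can be sketched in a few lines of Python. The helper name is_self_referential is an assumption for illustration. The comparison treats scheme and host as case-insensitive, treats an empty path as "/", ignores fragments, and compares path and query exactly; a real site may need stricter or looser rules.

```python
from urllib.parse import urlsplit


def is_self_referential(served_url, canonical_url):
    """True when the canonical names the same URL the page is served at."""
    served, canonical = urlsplit(served_url), urlsplit(canonical_url)
    return (
        served.scheme.lower() == canonical.scheme.lower()
        and served.netloc.lower() == canonical.netloc.lower()
        and (served.path or "/") == (canonical.path or "/")
        and served.query == canonical.query
    )


url = "https://helperg.com/technical-seo/canonical-urls-guide.html"
print(is_self_referential(url, url))
print(is_self_referential(url + "?ref=email", url))
```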
4. Parameter URLs
Parameter variants are URLs that include query-string parameters serving the same content as a clean canonical URL. Common parameters include tracking (utm_source, ref), session identifiers, and filter or sort UI state.
Without a canonical, search engines may index the parameter variant instead of the clean URL, scatter ranking signals across variants, or treat the variants as duplicate content.
Pattern: parameter variant → clean canonical
Every variant served by your site should declare a canonical pointing to the clean URL:
URL: https://example.com/articles/widget?utm_source=newsletter&ref=email
Canonical: <link rel="canonical" href="https://example.com/articles/widget">
This consolidates the variants and ensures the clean URL is the indexed one.
When parameters serve different content
If a parameter changes the content meaningfully (a search query, a category filter that creates a distinct landing experience), declare a self-referential canonical on the parameterized URL rather than pointing to the unparameterized variant.
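The two cases above can be sketched as one function that drops parameters known not to affect content and keeps everything else. The TRACKING_PARAMS set and the clean_canonical name are assumptions for illustration; each site has to decide which of its own parameters are content-neutral.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters this hypothetical site treats as content-neutral.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "ref", "sessionid"}


def clean_canonical(url, ignored=TRACKING_PARAMS):
    """Drop content-neutral parameters; keep parameters that change content."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    # urlencode re-encodes the surviving values; the fragment is dropped.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(clean_canonical("https://example.com/articles/widget?utm_source=newsletter&ref=email"))
print(clean_canonical("https://example.com/products?category=widgets&utm_source=twitter"))
```

When every parameter is in the ignored set, the result is the clean URL; when a content-changing parameter such as category remains, the result is the canonical for that parameterized page.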
5. Cross-domain canonicals (syndication)
RFC 6596 and Google both document cross-domain canonicals. They are useful for syndication patterns where the same content is published on multiple sites and you want indexing credit to go to the original.
<!-- Syndicated copy on partner.example points to original on author.example -->
<link rel="canonical" href="https://author.example/articles/widget">
Legitimate uses
- Your content syndicated to a partner site that you control or have an arrangement with.
- A mirror of your content on a CDN or regional domain that serves the same content in the same language.
- Republication with consent where attribution is documented.
Misuse
Pointing competitor content's canonical to your own URL is detectable and treated as a policy issue. Cross-domain canonical is a syndication tool, not an SEO lever.
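One way to keep cross-domain canonicals deliberate is to check them against a list of agreed syndication origins during an audit. The sketch below is a hypothetical example: SYNDICATION_ORIGINS and classify_canonical are illustrative names, and the host comparison is exact, so www and apex forms of the same site count as different hosts.

```python
from urllib.parse import urlsplit

# Assumption: hosts with a documented syndication arrangement.
SYNDICATION_ORIGINS = {"author.example"}


def classify_canonical(page_url, canonical_url, allowed=SYNDICATION_ORIGINS):
    """Label a canonical as same-host, documented syndication, or unexpected."""
    page_host = urlsplit(page_url).hostname or ""
    canonical_host = urlsplit(canonical_url).hostname or ""
    if page_host == canonical_host:
        return "same-host"
    if canonical_host in allowed:
        return "documented-syndication"
    return "unexpected-cross-domain"


print(classify_canonical(
    "https://partner.example/articles/widget",
    "https://author.example/articles/widget",
))
```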
6. Pagination and canonical
For paginated content (page 2, page 3, etc.), each page should declare a self-referential canonical, not a canonical pointing to page 1. This is Google's documented recommendation as of 2026.
Historically, rel="prev" and rel="next" link annotations supplemented pagination signals; Google has documented that it no longer uses those as indexing signals. Self-referential canonicals on each paginated URL plus internal linking from the parent are the current pattern.
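As a sketch of the self-referential pattern for paginated URLs, the function below builds the canonical for a given page number. It assumes a query parameter named page and assumes page 1 lives at the unparameterized URL; both are illustrative choices, and sites that paginate with path segments would need a different builder.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def paginated_canonical(base_url, page, param="page"):
    """Return the self-referential canonical for one page of a paginated list."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    parts = urlsplit(base_url)
    # Remove any existing page parameter before adding the requested one.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    if page > 1:
        query.append((param, str(page)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


for number in (1, 2, 3):
    print(paginated_canonical("https://example.com/blog", number))
```

Each page gets its own canonical; none of them collapse to page 1.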
7. Worked examples
Example A — single article with a UTM parameter
Visited URL: https://example.com/article?utm_source=twitter
Canonical: <link rel="canonical" href="https://example.com/article">
Example B — product page with a category filter
Visited URL: https://example.com/products?category=widgets&sort=price
Canonical: <link rel="canonical" href="https://example.com/products?category=widgets&sort=price">
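This example assumes the category filter and the sort order each change what the page shows, so the canonical is self-referential. If a parameter does not change the content meaningfully (sort order often only reorders the same items), leave it out of the canonical; if none of the parameters do, point to the clean URL.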
Example C — protocol mismatch
The canonical's protocol should match the protocol of the served URL. Hardcoding http:// in the canonical while the page is served over HTTPS sends a conflicting signal. The correct, matching form:
Served via: https://example.com/article
Canonical: <link rel="canonical" href="https://example.com/article">
Example D — www vs. non-www
Pick one host form (www or apex) as canonical and use it consistently across all canonicals on the site. Mixing is a frequent source of split signals.
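Examples C and D both come down to emitting one scheme and one host everywhere. A minimal sketch, assuming the site has chosen HTTPS and the apex host: PREFERRED_HOST, HOST_ALIASES, and to_preferred_origin are hypothetical names, and the function drops any port, credentials, or fragment from the URL.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: this hypothetical site prefers https and the apex host.
PREFERRED_HOST = "example.com"
HOST_ALIASES = {"example.com", "www.example.com"}


def to_preferred_origin(url):
    """Rewrite the site's own URLs to the chosen scheme and host form."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in HOST_ALIASES:
        return url  # not one of this site's hosts; leave untouched
    return urlunsplit(("https", PREFERRED_HOST, parts.path or "/", parts.query, ""))


print(to_preferred_origin("http://www.example.com/article"))
print(to_preferred_origin("https://example.com/article"))
```

Running canonical generation through a single function like this in the template layer is one way to avoid mixed forms across pages.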
8. Validation workflow
Step 1 — Local syntax
Confirm every public page declares exactly one rel=canonical in the head (or as an HTTP Link header). Use view source or curl.
curl -s https://example.com/article | grep -oE '<link[^>]*rel="canonical"[^>]*>'
Step 2 — URL Inspection
In Google Search Console's URL Inspection tool, check the canonical Google selected for the URL. If it differs from your declared canonical, Google has chosen a different consolidation winner based on other signals.
Step 3 — Sample audit
For larger sites, sample 20+ representative pages from each template (article, product, hub, etc.) and confirm each has a valid self-referential canonical. Templates are the most common source of canonical bugs — one broken template breaks many pages.
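The sample audit can be sketched as a function over already-collected data: for each sampled URL, the list of canonical hrefs found on that page. The names audit_page and audit_sample and the issue labels are assumptions for illustration. "not self-referential" is a flag for review, not always an error, since parameter variants legitimately point elsewhere.

```python
from urllib.parse import urlsplit


def audit_page(served_url, canonical_hrefs):
    """Return a list of issue labels for one page (empty list means clean)."""
    if not canonical_hrefs:
        return ["missing canonical"]
    issues = []
    if len(canonical_hrefs) > 1:
        issues.append("multiple canonicals")
    href = canonical_hrefs[0]
    parts = urlsplit(href)
    if not (parts.scheme and parts.netloc):
        issues.append("canonical is not absolute")
    elif parts.scheme != urlsplit(served_url).scheme:
        issues.append("protocol mismatch")
    if href != served_url:
        issues.append("not self-referential")
    return issues


def audit_sample(pages):
    """Map each failing URL to its issues; pages is {served_url: [hrefs]}."""
    report = {url: audit_page(url, hrefs) for url, hrefs in pages.items()}
    return {url: issues for url, issues in report.items() if issues}


sample = {
    "https://example.com/article": ["https://example.com/article"],
    "https://example.com/product": [],
    "https://example.com/hub": ["http://example.com/hub"],
}
for url, issues in audit_sample(sample).items():
    print(url, issues)
```

Grouping the sampled URLs by template before running the audit makes it easier to see when a single template is responsible for a whole class of failures.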
9. Common mistakes
- Self-conflicting canonical. Page A declares canonical pointing to Page B; Page B declares canonical pointing to Page A. Neither gets cleanly indexed.
- Canonical pointing to a redirect. The canonical URL itself returns a 301. Google generally follows the redirect, but the signal is muddled. Point canonicals at the final 200-response URL.
- Multiple canonical tags on the same page. Google documents that this causes the signal to be ignored.
- Mixed protocol (http vs https). A canonical that uses http while the site serves https consolidates signals toward a deprecated URL.
- Mixed host (www vs non-www). Pick one and apply consistently.
- Parameter URLs without canonical alignment. Parameter variants without a canonical pointing to the clean URL scatter signals.
- Canonical to a 404. The named URL does not resolve. The signal is invalid.
- Cross-domain canonical without syndication intent. Pointing one site's canonical to another site without a syndication arrangement is a misuse.
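The self-conflicting case in the list above, along with longer chains, can be detected from a crawl export that maps each page to its declared canonical. This is a sketch under assumptions: resolve_canonical is a hypothetical helper, the mapping is supplied by the caller, and any URL missing from the mapping is treated as self-referential.

```python
def resolve_canonical(url, declared):
    """Follow declared canonicals from url.

    Returns (final_url, status) where status is "ok" (self-referential or one
    hop to a self-referential target), "chain" (more than one hop), or "loop".
    """
    seen = [url]
    current = url
    while True:
        # URLs absent from the mapping are assumed to be self-referential.
        target = declared.get(current, current)
        if target == current:
            return current, ("ok" if len(seen) <= 2 else "chain")
        if target in seen:
            return target, "loop"
        seen.append(target)
        current = target


declared = {
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/a",
    "https://example.com/c": "https://example.com/c",
}
print(resolve_canonical("https://example.com/a", declared))
print(resolve_canonical("https://example.com/c", declared))
```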
10. Checklist
- Every public HTML page declares exactly one rel=canonical link in the head.
- The canonical URL is absolute (starts with https://).
- The protocol (https) matches the served URL's protocol.
- The host (www vs. apex) matches the site's chosen canonical host.
- The canonical URL returns HTTP 200 directly (no redirects in the canonical itself).
- Self-referential canonicals are used on standalone pages.
- Parameter variants point to the clean canonical URL (when content is the same).
- Parameter variants that serve distinct content use self-referential canonicals.
- Paginated URLs each use self-referential canonicals (not all pointing to page 1).
- Cross-domain canonicals are used only for documented syndication patterns.
- The canonical URL pattern is consistent across the site (one pattern, not several).
- Sitemap entries match the canonical URLs (no parameter variants in sitemap).
- Internal links target the canonical URLs (not parameter variants).
- HTTP Link headers are used for non-HTML resources where applicable.
- The canonical pattern is documented in your internal style guide.
- A sample audit confirms canonical correctness across templates.
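The sitemap item in the checklist can be checked mechanically. The sketch below parses a sitemap in the sitemaps.org format and lists entries that are not in a caller-supplied set of canonical URLs. The function names and the example data are assumptions; collecting the set of canonical URLs is left to the audit step.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def non_canonical_entries(xml_text, canonical_urls):
    """Return sitemap entries that are not among the known canonical URLs."""
    return [url for url in sitemap_urls(xml_text) if url not in canonical_urls]


sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/article</loc></url>
  <url><loc>https://example.com/article?utm_source=twitter</loc></url>
</urlset>"""
print(non_canonical_entries(sitemap, {"https://example.com/article"}))
```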
11. FAQ
What does rel=canonical do?
It signals to search engines that, when multiple URLs serve substantially similar content, the URL named in the canonical link is the preferred one to index. RFC 6596 defines the syntax; Google publishes how the signal is consumed.
Is rel=canonical a directive or a hint?
Google documents the rel=canonical link as a strong hint, not a binding directive. Google considers it alongside other consolidation signals (sitemap entries, redirects, internal linking). If signals conflict, Google may select a different URL as canonical than the one you declared.
Should the canonical point to itself?
Yes. Every public page should declare a self-referential canonical pointing to the URL the page is served at. This is the cleanest signal and avoids accidental cross-page collapse.
How do parameter URLs interact with canonicals?
Parameter variants (?ref=, ?utm_source=) typically serve the same content as the clean canonical URL. The canonical link on every variant should point to the clean URL. This consolidates ranking signals and prevents the parameter variant from becoming the indexed URL.
Can canonicals point across domains?
Yes, for syndication patterns. RFC 6596 and Google both document cross-domain canonicals. Use them when content is intentionally syndicated and you want the original site to receive indexing credit. Misuse (pointing competitor content to your own canonical) is detectable and treated as a policy issue.
What happens if I have multiple canonical tags on one page?
Google documents that multiple canonical tags on the same page cause the signal to be ignored. The page falls back to other consolidation signals. The fix is to declare exactly one canonical per page.
Can I use HTTP Link header instead of HTML rel=canonical?
Yes. Both are documented as valid. The HTTP Link header is useful for non-HTML resources (PDFs, images) where you cannot place a head tag. For HTML pages, the rel=canonical link in head is the more common pattern.
Does the canonical affect non-Google search engines?
Most major search engines consume rel=canonical. Bing's webmaster documentation describes similar behavior. RFC 6596 defines the link relation independently of any specific provider.
12. Sources
- RFC 6596 — The Canonical Link Relation — captured 2026-06
- Google Search Central — Consolidate duplicate URLs with rel=canonical — captured 2026-06
- Google Search Central — Sitemap overview (canonical alignment) — captured 2026-06
- Google Search Central — SEO starter guide — captured 2026-06
- schema.org — home — captured 2026-06
- sitemaps.org — Sitemap protocol — captured 2026-06