Canonical URLs — Complete Guide

Last updated: June 1, 2026

What this guide is and is not. This is the conceptual reference for the rel=canonical signal: how it is defined (RFC 6596), how Google documents consuming it, common mistakes, and worked examples. If you want a task-focused workflow for fixing specific canonical bugs, see How to Fix Canonical URL Issues.

1. What canonicals are

The rel=canonical signal is defined in RFC 6596. It is a link relation that names the preferred URL when multiple URLs serve substantially similar content. The signal is consumed by search engines (and AI systems that follow link semantics) to consolidate ranking and indexing toward the chosen URL.

Google publishes the consolidate-duplicate-urls documentation describing how it processes canonicals. The signal is documented as a strong hint, not a binding directive — Google considers it alongside other consolidation signals (sitemap entries, internal linking, HTTP redirects). If signals conflict, Google may choose a different canonical than the one you declared.

The HTML syntax is a link element in the document head:

<link rel="canonical" href="https://example.com/preferred-url">

The URL should be absolute, use HTTPS where the site uses HTTPS, and resolve to a live page (a 200 response rather than a redirect or an error).

2. HTML link vs. HTTP Link header

RFC 6596 documents two equivalent ways to declare a canonical:

HTML link tag

<link rel="canonical" href="https://example.com/preferred-url">

Used for HTML pages. Sits in the document head.

HTTP Link response header

Link: <https://example.com/preferred-url>; rel="canonical"

Used for non-HTML resources (PDFs, images, JSON endpoints) where you cannot place a head tag. Also valid on HTML responses if you prefer header-based metadata.

Both are documented as equivalent. Google consumes both. Use whichever fits the response type.
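To check which canonical a non-HTML response declares, the Link header value can be parsed. The following is a minimal sketch, not a full header parser: the function name canonical_from_link_header is hypothetical, and it assumes parameter values contain no commas or semicolons inside quotes.

```python
import re
from typing import Optional


def canonical_from_link_header(header_value: str) -> Optional[str]:
    """Return the first link target whose rel includes 'canonical', else None.

    Simplification (assumption): quoted parameter values are expected not to
    contain ',' or ';', which is true for typical canonical headers.
    """
    # Each link-value looks like: <URI>; param=value; param="value"
    link_value = re.compile(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)')
    for match in link_value.finditer(header_value):
        target, params = match.group(1), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, flags=re.IGNORECASE)
        # rel may hold several space-separated relation types
        if rel and "canonical" in rel.group(1).lower().split():
            return target
    return None


header = '<https://example.com/preferred-url>; rel="canonical"'
print(canonical_from_link_header(header))
```

The same function handles a header that carries several links, returning only the one marked canonical.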

3. Self-referencing canonicals

Every public HTML page should declare a canonical, and a page served at its preferred URL should declare a self-referential one: a canonical pointing to the URL it is served at. This is the cleanest signal and avoids accidental cross-page collapse.

For example, the canonical on this page is:

<link rel="canonical" href="https://helperg.com/technical-seo/canonical-urls-guide.html">

The page is served at the same URL the canonical names. The signal confirms that the page's preferred URL is itself, not some variant or aggregator.

4. Parameter URLs

Parameter variants are URLs that include query-string parameters serving the same content as a clean canonical URL. Common parameters include tracking (utm_source, ref), session identifiers, and filter or sort UI state.

Without a canonical, search engines may index the parameter variant instead of the clean URL, scatter ranking signals across variants, or treat the variants as duplicate content.

Pattern: parameter variant → clean canonical

Every variant served by your site should declare a canonical pointing to the clean URL:

URL:        https://example.com/articles/widget?utm_source=newsletter&ref=email
Canonical:  <link rel="canonical" href="https://example.com/articles/widget">

This consolidates the variants and ensures the clean URL is the indexed one.

When parameters serve different content

If a parameter changes the content meaningfully (a search query, a category filter that creates a distinct landing experience), declare a self-referential canonical on the parameterized URL rather than pointing to the unparameterized variant.
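Both patterns above can be expressed as one rule: drop parameters that do not change content and keep those that do. The sketch below assumes a hypothetical list of tracking parameters (TRACKING_PARAMS and TRACKING_PREFIXES); which parameters actually change content is site-specific and has to be decided per site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these names are illustrative, not an authoritative list.
TRACKING_PARAMS = {"ref", "sessionid"}
TRACKING_PREFIXES = ("utm_",)


def clean_canonical(url: str) -> str:
    """Return the URL with tracking parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS and not key.startswith(TRACKING_PREFIXES)
    ]
    # Content-changing parameters survive, so such URLs stay self-referential.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(clean_canonical("https://example.com/articles/widget?utm_source=newsletter&ref=email"))
print(clean_canonical("https://example.com/products?category=widgets&utm_source=newsletter"))
```

A tracking-only URL collapses to the clean URL, while a URL with a content-changing parameter keeps that parameter in its canonical.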

5. Cross-domain canonicals (syndication)

RFC 6596 and Google both document cross-domain canonicals. They are useful for syndication patterns where the same content is published on multiple sites and you want indexing credit to go to the original.

<!-- Syndicated copy on partner.example points to original on author.example -->
<link rel="canonical" href="https://author.example/articles/widget">

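Legitimate uses

Content that is intentionally syndicated to partner sites, where each syndicated copy points to the original so that the original receives indexing credit.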

Misuse

Pointing competitor content's canonical to your own URL is detectable and treated as a policy issue. Cross-domain canonical is a syndication tool, not an SEO lever.

6. Pagination and canonical

For paginated content (page 2, page 3, etc.), each page should declare a self-referential canonical, not a canonical pointing to page 1. This is Google's documented recommendation as of 2026.

Historically, rel="prev" and rel="next" link annotations supplemented pagination signals; Google has documented that it no longer uses those as indexing signals. Self-referential canonicals on each paginated URL plus internal linking from the parent are the current pattern.
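As an illustration of per-page self-referential canonicals, the sketch below builds the tag for each page of a listing. The ?page=N URL scheme, the choice to serve page 1 at the bare listing URL, and the helper names page_url and canonical_tag are all assumptions made for the example.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(listing_url: str, page: int, param: str = "page") -> str:
    """Build the URL for one page of a listing (hypothetical ?page=N scheme)."""
    parts = urlsplit(listing_url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != param]
    if page > 1:  # assumption: page 1 lives at the bare listing URL
        query.append((param, str(page)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def canonical_tag(url: str) -> str:
    """Render a canonical link element, escaping the URL for an HTML attribute."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


for n in (1, 2, 3):
    # Each page names itself, never page 1.
    print(canonical_tag(page_url("https://example.com/blog", n)))
```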

7. Worked examples

Example A — single article with a UTM parameter

Visited URL:    https://example.com/article?utm_source=twitter
Canonical:      <link rel="canonical" href="https://example.com/article">

Example B — product page with a category filter

Visited URL:    https://example.com/products?category=widgets&sort=price
Canonical:      <link rel="canonical" href="https://example.com/products?category=widgets&sort=price">

If the filter changes content meaningfully, the canonical is self-referential. If it does not, point to the clean URL.

Example C — protocol mismatch

The canonical's protocol should match the served URL's protocol when both are valid. Hardcoding http:// in the canonical while serving from HTTPS sends a conflicting signal. The consistent form is:

Served via:     https://example.com/article
Canonical:      <link rel="canonical" href="https://example.com/article">

Example D — www vs. non-www

Pick one host form (www or apex) as canonical and use it consistently across all canonicals on the site. Mixing is a frequent source of split signals.
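Examples C and D can be checked mechanically by comparing each declared canonical against the site's chosen scheme and host. This is a sketch under stated assumptions: PREFERRED_SCHEME and PREFERRED_HOST are placeholder values and origin_problems is a hypothetical helper name.

```python
from typing import List
from urllib.parse import urlsplit

# Assumptions: substitute the scheme and host form your site has chosen.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "example.com"


def origin_problems(canonical_url: str) -> List[str]:
    """List scheme and host mismatches between a canonical and the site's choice."""
    parts = urlsplit(canonical_url)
    problems = []
    if parts.scheme.lower() != PREFERRED_SCHEME:
        problems.append(f"scheme is {parts.scheme!r}, expected {PREFERRED_SCHEME!r}")
    # urlsplit().hostname is already lowercased; it is None for relative URLs.
    if (parts.hostname or "") != PREFERRED_HOST:
        problems.append(f"host is {parts.hostname!r}, expected {PREFERRED_HOST!r}")
    return problems


print(origin_problems("https://example.com/article"))
print(origin_problems("http://www.example.com/article"))
```

A relative canonical is reported on both counts, which matches the guidance that canonical URLs should be absolute.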

8. Validation workflow

Step 1 — Local syntax

Confirm every public page declares exactly one rel=canonical in the head (or as an HTTP Link header). Use view source or curl.

curl -s https://example.com/article | grep -oE '<link[^>]*rel="canonical"[^>]*>'
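The grep pattern above only matches double-quoted rel="canonical" and does not count duplicates reliably. As a complementary sketch, the standard-library HTML parser can collect every canonical declared inside the head, regardless of attribute order or quoting. The names CanonicalCollector and find_canonicals are hypothetical, and the sketch assumes the page has an explicit head element.

```python
from html.parser import HTMLParser
from typing import List


class CanonicalCollector(HTMLParser):
    """Collect href values of <link rel="canonical"> elements found in <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.canonicals: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attr = dict(attrs)
            # rel may hold several space-separated tokens, in any letter case
            rel_tokens = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_tokens and attr.get("href"):
                self.canonicals.append(attr["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html: str) -> List[str]:
    """Return all canonical hrefs declared in the document head."""
    parser = CanonicalCollector()
    parser.feed(html)
    return parser.canonicals


page = "<html><head><link href='https://example.com/article' rel='canonical'></head><body></body></html>"
found = find_canonicals(page)
print(found, "OK" if len(found) == 1 else "expected exactly one canonical")
```

A result with zero or more than one entry flags the page for review.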

Step 2 — URL Inspection

In Google Search Console's URL Inspection tool, check the canonical Google selected for the URL. If it differs from your declared canonical, Google has chosen a different consolidation winner based on other signals.

Step 3 — Sample audit

For larger sites, sample 20+ representative pages from each template (article, product, hub, etc.) and confirm each has a valid self-referential canonical. Templates are the most common source of canonical bugs — one broken template breaks many pages.
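A sample audit can be reduced to comparing each served URL with the canonical it declares. The sketch below assumes the sample has already been gathered into a dictionary of served URL to declared canonical (None when the page declares none); is_self_referential and audit are hypothetical helper names, and the comparison is deliberately strict about scheme, host, path, and query.

```python
from typing import Dict, List, Optional
from urllib.parse import urlsplit


def is_self_referential(served_url: str, canonical_url: str) -> bool:
    """True when the canonical names the same URL the page is served at."""
    served, declared = urlsplit(served_url), urlsplit(canonical_url)
    return (
        served.scheme.lower(), served.netloc.lower(), served.path or "/", served.query
    ) == (
        declared.scheme.lower(), declared.netloc.lower(), declared.path or "/", declared.query
    )


def audit(sample: Dict[str, Optional[str]]) -> List[str]:
    """Return served URLs whose canonical is missing or not self-referential."""
    return [
        served
        for served, canonical in sample.items()
        if canonical is None or not is_self_referential(served, canonical)
    ]


sample = {
    "https://example.com/article": "https://example.com/article",
    "https://example.com/product": "http://example.com/product",  # protocol mismatch
    "https://example.com/hub": None,  # no canonical declared
}
print(audit(sample))
```

Pages that intentionally point elsewhere, such as parameter variants, will also appear in the output and should be reviewed rather than treated as errors.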

9. Common mistakes

  1. Self-conflicting canonical. Page A declares canonical pointing to Page B; Page B declares canonical pointing to Page A. Neither gets cleanly indexed.
  2. Canonical pointing to a redirect. The canonical URL itself returns a 301. Google generally follows the redirect, but the signal is muddled. Point canonicals at the final 200-response URL.
  3. Multiple canonical tags on the same page. Google documents that this causes the signal to be ignored.
  4. Mixed protocol (http vs https). Canonicals that use http while the site serves https consolidate signals toward a deprecated URL.
  5. Mixed host (www vs non-www). Pick one and apply consistently.
  6. Parameter URLs without canonical alignment. Parameter variants without a canonical pointing to the clean URL scatter signals.
  7. Canonical to a 404. The named URL does not resolve. The signal is invalid.
  8. Cross-domain canonical without syndication intent. Pointing one site's canonical to another site without a syndication arrangement is a misuse.
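Mistake 1 in the list above is easy to detect once declared canonicals have been collected. The following sketch assumes a dictionary mapping each page URL to the canonical it declares; find_mutual_canonicals is a hypothetical name, and only two-page loops are detected.

```python
from typing import Dict, List, Tuple


def find_mutual_canonicals(declared: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return sorted pairs of pages whose canonicals point at each other."""
    pairs = set()
    for page, target in declared.items():
        # A self-referential canonical is fine; only flag A -> B with B -> A.
        if target != page and declared.get(target) == page:
            pairs.add(tuple(sorted((page, target))))
    return sorted(pairs)


declared = {
    "https://example.com/a": "https://example.com/b",
    "https://example.com/b": "https://example.com/a",
    "https://example.com/c": "https://example.com/c",
}
print(find_mutual_canonicals(declared))
```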

10. Checklist

11. FAQ

What does rel=canonical do?

It signals to search engines that, when multiple URLs serve substantially similar content, the URL named in the canonical link is the preferred one to index. RFC 6596 defines the syntax; Google publishes how the signal is consumed.

Is rel=canonical a directive or a hint?

Google documents the rel=canonical link as a strong hint, not a binding directive. Google considers it alongside other consolidation signals (sitemap entries, redirects, internal linking). If signals conflict, Google may select a URL other than the one you declared.

Should the canonical point to itself?

Yes. Every public page should declare a self-referential canonical pointing to the URL the page is served at. This is the cleanest signal and avoids accidental cross-page collapse.

How do parameter URLs interact with canonicals?

Parameter variants (?ref=, ?utm_source=) typically serve the same content as the clean canonical URL. The canonical link on every variant should point to the clean URL. This consolidates ranking signals and prevents the parameter variant from becoming the indexed URL.

Can canonicals point across domains?

Yes, for syndication patterns. RFC 6596 and Google both document cross-domain canonicals. Use them when content is intentionally syndicated and you want the original site to receive indexing credit. Misuse (pointing competitor content to your own canonical) is detectable and treated as a policy issue.

What happens if I have multiple canonical tags on one page?

Google documents that multiple canonical tags on the same page cause the signal to be ignored. The page falls back to other consolidation signals. The fix is to declare exactly one canonical per page.

Can I use the HTTP Link header instead of the HTML rel=canonical link?

Yes. Both are documented as valid. The HTTP Link header is useful for non-HTML resources (PDFs, images) where you cannot place a head tag. For HTML pages, the rel=canonical link in head is the more common pattern.

Does the canonical affect non-Google search engines?

Most major search engines consume rel=canonical. Bing's webmaster documentation describes similar behavior. RFC 6596 defines the protocol independent of any specific provider.

12. Sources