Structured Data — Complete JSON-LD Guide
Last updated: June 1, 2026
What this guide is and is not. This is the long-form reference for JSON-LD structured data — schema.org types, primary-type decision-making, validation workflow, and anti-patterns. The guide insists on a three-layer distinction throughout: syntax (is the JSON-LD valid schema.org markup?), rich-result eligibility (does the page meet Google's documented criteria for a specific rich result?), and actual search display (does Google actually show the rich result in production?). Conflating these is the root cause of most structured-data disappointment. If you want a task-focused page on shaping JSON-LD blocks, see How to Structure JSON-LD. For the validator tool, see our JSON-LD validator.
1. What structured data is
Structured data is markup that adds semantic meaning to a page's content for machine consumers: search engines, AI parsers, browser features, accessibility tools. The most common vocabulary on the web is schema.org, a collaborative project launched by Google, Microsoft, and Yahoo in 2011, joined by Yandex later that year, and now community-maintained.
schema.org defines hundreds of types (Article, Person, Organization, Product, Recipe, ...) and properties (headline, datePublished, sameAs, ...). A page can declare one or more entities of these types using one of three syntaxes:
- JSON-LD: JSON for Linking Data, a W3C standard. Sits in a <script type="application/ld+json"> tag in the head or body. Decoupled from visible HTML; easy to author, validate, and version. This is the syntax Google recommends.
- Microdata: inline HTML attributes (itemtype, itemprop) on visible elements. Still supported; less common in new work because it couples markup to visible HTML.
- RDFa: another inline attribute-based syntax. Rarely seen in new work.
The rest of this guide is JSON-LD-only. If you maintain Microdata or RDFa from older work, the type and property vocabulary is identical; only the syntax differs.
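A minimal sketch of how a JSON-LD object ends up in a page, assuming a Node build step. toJsonLdScript is a hypothetical helper name, not part of any library, and the WebPage node is illustrative. The one non-obvious step is escaping "<" so that text containing a literal closing script tag cannot end the block early.

```javascript
// Hypothetical helper: serialize a schema.org node into a JSON-LD script tag.
function toJsonLdScript(node) {
  // JSON.stringify handles quote escaping. Replacing "<" with its \u003c
  // escape keeps a literal "</script>" inside the data from closing the tag.
  const json = JSON.stringify(node, null, 2).replace(/</g, '\\u003c');
  return '<script type="application/ld+json">\n' + json + '\n</script>';
}

// Illustrative node; the name value is an assumption.
const tag = toJsonLdScript({
  '@context': 'https://schema.org',
  '@type': 'WebPage',
  name: 'About us',
});
console.log(tag);
```

The escaped form is still valid JSON, so any parser reads the original text back unchanged.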
2. The three layers — syntax, eligibility, display
The single most useful framework for thinking about structured data is to keep these three layers separate. They are causally related but they are not the same thing.
Layer 1 — Syntax
Is the JSON-LD parseable, and does it use valid schema.org types and properties? Validated by any JSON parser plus a schema.org reference check.
Layer 2 — Eligibility
Does the page meet Google's documented criteria for a specific rich result (Article, Product, Recipe, etc.)? Validated by Google's Rich Results Test.
Layer 3 — Display
Does Google actually show the rich result on a real SERP for a real query? This is the only layer you cannot directly test — Google decides per query, per device, per cohort.
Most structured-data complaints come from conflating the three. "My JSON-LD is valid but I don't see the rich result" is not a contradiction: it means Layer 1 passes while Layer 2 or Layer 3 does not. Google publishes a structured data introduction that explicitly notes valid markup is necessary but not sufficient for rich results.
The guide is structured so that for each schema type, we keep the three layers visible: what the schema requires (syntax), what Google documents as eligibility criteria, and what we know — and do not know — about actual display.
3. Choosing a primary @type
The first decision per page is the primary @type. schema.org types form an "is-a" hierarchy: an Article is a CreativeWork, which is a Thing. More specific is generally better, but specificity must match the actual content.
A practical decision tree for content pages:
- Is the page a news article published on a date? Use NewsArticle.
- Is the page a technical guide, reference, or how-to document? Use TechArticle.
- Is the page a generic blog or magazine-style article? Use Article or its BlogPosting subtype.
- Is the page a product? Use Product (with optional Offer and Review only if real).
- Is the page a software application or app? Use SoftwareApplication.
- Is the page a generic informational page (about, contact, policy)? Use WebPage.
Only declare a primary type that the page genuinely fits. A homepage that is not a single article should not declare @type: "Article". Mismatch between declared type and actual page content is a frequent Rich Results Test failure.
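The decision tree above can be sketched as a lookup. This is only an illustration: choosePrimaryType and the kind labels are hypothetical names, and classifying a real page still takes human judgment about what the page actually contains.

```javascript
// Hypothetical page descriptor -> primary @type, following the decision tree above.
// The `kind` labels are illustrative, not schema.org vocabulary.
function choosePrimaryType(page) {
  switch (page.kind) {
    case 'news': return 'NewsArticle';
    case 'technical-guide': return 'TechArticle';
    case 'blog': return 'BlogPosting';
    case 'article': return 'Article';
    case 'product': return 'Product';
    case 'software': return 'SoftwareApplication';
    // About, contact, policy, and anything unclassified fall back to WebPage.
    default: return 'WebPage';
  }
}

console.log(choosePrimaryType({ kind: 'technical-guide' }));
```

Falling back to WebPage rather than Article keeps an unclassified page from claiming a type it does not fit.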
Multi-type pages. A page can declare more than one entity (for example, an Article and a BreadcrumbList), either as separate script blocks or as a single @graph. See §9 for the trade-off.
4. Organization
The Organization type represents your company or operating entity. Google uses it as a knowledge-panel signal and as an input to the logo rich result.
Layer 1 — Syntax
Common fields (schema.org reference: schema.org/Organization):
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "HELPERG LLC",
  "url": "https://helperg.com",
  "logo": "https://helperg.com/petrohrys-logo-outlined.svg",
  "sameAs": [
    "https://twitter.com/petrohrys",
    "https://github.com/hrhelperg"
  ],
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "30 N Gould St Ste N",
    "addressLocality": "Sheridan",
    "addressRegion": "WY",
    "postalCode": "82801",
    "addressCountry": "US"
  }
}
Layer 2 — Eligibility
For the logo rich result specifically, Google publishes documented requirements: a logo URL, a matching url, and dimensions that meet Google's stated minimums. The Rich Results Test surfaces failures.
Layer 3 — Display
Even with valid markup and met eligibility, Google decides whether to actually surface a knowledge panel for a brand query. Knowledge-panel display correlates with brand authority signals that are mostly outside structured-data control. Treat the logo rich result as the predictable payoff; the knowledge panel as a possible outcome, not a guaranteed one.
Recommended placement
Declare the full Organization node on one stable URL (typically the homepage), and keep name, url, and logo identical on every page that re-declares it. Inconsistent Organization declarations across a site dilute the signal.
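One way to catch inconsistent declarations is to collect the Organization nodes from each page and compare them. The sketch below assumes you have already extracted those nodes. findOrganizationConflicts, the /about page, and the "Helperg" spelling are hypothetical, chosen only to show a mismatch.

```javascript
// Hypothetical consistency check for Organization nodes gathered from several pages.
// Each entry is { page, node }; the first entry is treated as the reference.
function comparable(value) {
  // logo may be a plain URL string or an ImageObject carrying a url property.
  return value && typeof value === 'object' ? value.url : value;
}

function findOrganizationConflicts(declarations) {
  const conflicts = [];
  const [reference, ...rest] = declarations;
  for (const { page, node } of rest) {
    for (const field of ['name', 'url', 'logo']) {
      const expected = comparable(reference.node[field]);
      const found = comparable(node[field]);
      if (expected !== found) conflicts.push({ page, field, expected, found });
    }
  }
  return conflicts;
}

const conflicts = findOrganizationConflicts([
  { page: '/', node: { name: 'HELPERG LLC', url: 'https://helperg.com', logo: 'https://helperg.com/petrohrys-logo-outlined.svg' } },
  { page: '/about', node: { name: 'Helperg', url: 'https://helperg.com', logo: { '@type': 'ImageObject', url: 'https://helperg.com/petrohrys-logo-outlined.svg' } } },
]);
console.log(conflicts); // one conflict: the name differs on /about
```

The logo comparison treats a URL string and an ImageObject with the same url as equal, since both forms appear in this guide's own examples.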
5. FAQPage
Layer 1 — Syntax
The FAQPage type contains a mainEntity array of Question nodes, each with a name and an acceptedAnswer of type Answer with a text field. Example (from this page's own FAQ):
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is JSON-LD?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "JSON-LD (JSON for Linking Data) is a W3C-standardized way to encode semantic data in JSON..."
      }
    }
  ]
}
Layer 2 — Eligibility
Google publishes the FAQPage rich-result documentation. The visible page must actually display the FAQ; the questions and answers in the schema must match the visible page; and the FAQ must be intended to help users, not for self-promotion.
Layer 3 — Display
In August 2023 Google restricted FAQ rich results to well-known, authoritative government and health websites. For most other sites, the visible rich-result payoff is reduced or absent. The schema remains useful as semantic markup that other parsers (AI systems, browser features, semantic search) may consume. We continue to use FAQPage schema on this guide because the markup itself is valid and useful, while being explicit that the visible Google rich result is no longer a likely outcome for most sites.
Important policy point. Never fabricate FAQs. A page should only declare FAQPage if it actually contains a real FAQ section that addresses real user questions. Google has historically issued manual actions against sites that game FAQPage with promotional or fabricated Q&A.
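One way to keep the schema and the visible FAQ from drifting apart is to generate both from the same data. The sketch below assumes the page's questions and answers live in an array of { question, answer } objects. That shape and the buildFaqPage name are hypothetical.

```javascript
// Hypothetical builder: derive FAQPage markup from the same Q&A data that
// renders the visible FAQ section.
function buildFaqPage(faqs) {
  // No visible FAQ means no FAQPage block at all.
  if (faqs.length === 0) return null;
  return {
    '@context': 'https://schema.org',
    '@type': 'FAQPage',
    mainEntity: faqs.map(({ question, answer }) => ({
      '@type': 'Question',
      name: question,
      acceptedAnswer: { '@type': 'Answer', text: answer },
    })),
  };
}

const faqBlock = buildFaqPage([
  { question: 'What is JSON-LD?', answer: 'JSON-LD (JSON for Linking Data) is a W3C-standardized way to encode semantic data in JSON.' },
]);
console.log(JSON.stringify(faqBlock, null, 2));
```

Returning null for an empty list makes it hard to ship FAQPage on a page that has no FAQ.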
6. BreadcrumbList
Layer 1 — Syntax
The BreadcrumbList type contains an itemListElement array of ListItem nodes, each with a position, name, and item URL. The structure mirrors the visible breadcrumb navigation.
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {"@type": "ListItem", "position": 1, "name": "Home", "item": "https://helperg.com/"},
    {"@type": "ListItem", "position": 2, "name": "Technical SEO", "item": "https://helperg.com/technical-seo/"},
    {"@type": "ListItem", "position": 3, "name": "Structured Data", "item": "https://helperg.com/technical-seo/structured-data-guide.html"}
  ]
}
Layer 2 — Eligibility
Google publishes the breadcrumb rich-result documentation. Positions must be sequential integers starting at 1; URLs should be absolute; the final item is the current page.
Layer 3 — Display
Breadcrumb display in Google's SERPs is fairly reliable when the page is selected for it. This is one of the most predictable structured-data payoffs and is recommended on any site with hierarchical content. We include BreadcrumbList in the TechArticle block on every Phase 1A guide.
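The position and URL rules from the eligibility paragraph are mechanical enough to check in code. The sketch below is a hypothetical validator (checkBreadcrumbList is not a library function) and assumes each item value is a plain URL string, as in the example above.

```javascript
// Hypothetical validator for the breadcrumb rules described above.
// Assumes each ListItem's "item" is a URL string, as in this guide's example.
function checkBreadcrumbList(node) {
  const problems = [];
  const items = node.itemListElement || [];
  items.forEach((entry, index) => {
    // Positions must be integers counting up from 1.
    if (entry.position !== index + 1) {
      problems.push('entry ' + index + ': position should be ' + (index + 1));
    }
    // Item URLs must be absolute.
    if (!/^https?:\/\//.test(String(entry.item))) {
      problems.push('entry ' + index + ': item URL is not absolute');
    }
  });
  return problems;
}

const problems = checkBreadcrumbList({
  '@type': 'BreadcrumbList',
  itemListElement: [
    { '@type': 'ListItem', position: 0, name: 'Home', item: '/' },
  ],
});
console.log(problems);
```

The sample input makes two of the common mistakes at once: a zero-based position and a relative URL.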
7. Article / TechArticle
Layer 1 — Syntax
The schema.org reference for Article and TechArticle lists many fields. The pragmatic minimum:
{
  "@context": "https://schema.org",
  "@type": "TechArticle",
  "headline": "Structured Data — Complete JSON-LD Guide",
  "description": "...",
  "url": "https://helperg.com/technical-seo/structured-data-guide.html",
  "datePublished": "2026-06-01",
  "dateModified": "2026-06-01",
  "inLanguage": "en",
  "author": {"@type": "Organization", "name": "HELPERG LLC", "url": "https://helperg.com"},
  "publisher": {
    "@type": "Organization",
    "name": "HELPERG LLC",
    "url": "https://helperg.com",
    "logo": {"@type": "ImageObject", "url": "https://helperg.com/petrohrys-logo-outlined.svg"}
  },
  "mainEntityOfPage": "https://helperg.com/technical-seo/structured-data-guide.html"
}
Layer 2 — Eligibility
Google publishes the Article rich-result documentation. Required fields include headline and image for some article rich results; recommended fields include datePublished, dateModified, and author. Google's documentation distinguishes "required" from "recommended" explicitly.
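A small audit can flag which of the fields named above are missing from an Article node. The field lists in this sketch are copied from the paragraph above, not from Google's documentation directly, so treat them as a snapshot to re-check. auditArticle is a hypothetical name.

```javascript
// Field lists taken from the eligibility paragraph above; re-check them against
// Google's Article documentation before relying on this.
const REQUIRED_FOR_SOME = ['headline', 'image'];
const RECOMMENDED = ['datePublished', 'dateModified', 'author'];

// Hypothetical audit: report which listed fields are absent or empty.
function auditArticle(node) {
  const missing = (fields) =>
    fields.filter((field) => node[field] === undefined || node[field] === '');
  return { required: missing(REQUIRED_FOR_SOME), recommended: missing(RECOMMENDED) };
}

const report = auditArticle({
  '@type': 'TechArticle',
  headline: 'Structured Data: Complete JSON-LD Guide',
  datePublished: '2026-06-01',
  dateModified: '2026-06-01',
  author: { '@type': 'Organization', name: 'HELPERG LLC' },
});
console.log(report); // image is reported under "required"
```

Run against a node shaped like the pragmatic minimum above, the audit reports image as missing, which matters only for the rich results that require it.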
Layer 3 — Display
Article rich results (Top Stories carousels, AMP-style headlines) have historically required classic news-site signals beyond markup. For most technical-reference sites the practical payoff of the Article block is helping Google understand the page semantically; visible carousel display is unlikely without independent news-site authority.
Recommended placement
Declare Article or one of its subtypes per article-style page. TechArticle is the appropriate subtype for documentation, references, and how-to guides. NewsArticle for news content. BlogPosting for blog-style content.
8. SoftwareApplication
Layer 1 — Syntax
The SoftwareApplication type represents a software product. Key fields include name, applicationCategory, operatingSystem, and optionally offers for pricing.
{
  "@context": "https://schema.org",
  "@type": "SoftwareApplication",
  "name": "HELPERG PDF Editor",
  "operatingSystem": "iOS, Android",
  "applicationCategory": "BusinessApplication",
  "offers": {
    "@type": "Offer",
    "price": "0",
    "priceCurrency": "USD"
  }
}
Layer 2 — Eligibility
Google publishes the software app rich-result documentation. Pay attention to the aggregateRating and review requirements: if you do not have authentic on-site reviews, do not include rating fields. Fabricated ratings are a documented policy violation.
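The rule about rating fields can be enforced at build time by deriving aggregateRating only from stored reviews. This is a sketch under assumptions: buildSoftwareApplication is a hypothetical helper, and reviews is assumed to be an array of authentic on-site reviews with a numeric rating each.

```javascript
// Hypothetical builder: aggregateRating is emitted only when real reviews exist,
// and its numbers are computed from those reviews, never typed in by hand.
function buildSoftwareApplication(app, reviews) {
  const node = {
    '@context': 'https://schema.org',
    '@type': 'SoftwareApplication',
    name: app.name,
    operatingSystem: app.operatingSystem,
    applicationCategory: app.applicationCategory,
    offers: { '@type': 'Offer', price: app.price, priceCurrency: app.priceCurrency },
  };
  if (reviews.length > 0) {
    const total = reviews.reduce((sum, review) => sum + review.rating, 0);
    node.aggregateRating = {
      '@type': 'AggregateRating',
      ratingValue: (total / reviews.length).toFixed(1),
      ratingCount: reviews.length,
    };
  }
  return node;
}

const app = {
  name: 'HELPERG PDF Editor',
  operatingSystem: 'iOS, Android',
  applicationCategory: 'BusinessApplication',
  price: '0',
  priceCurrency: 'USD',
};
console.log(buildSoftwareApplication(app, []));
```

With an empty review list the output matches the example above and carries no rating fields at all.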
Layer 3 — Display
Software app rich results are relatively narrow — typically install-prompt cards on mobile SERPs for branded queries. For most software products the practical payoff is semantic clarity rather than visible rich-result display.
9. @graph vs. separate script blocks
You can express multiple entities on one page either as a JSON-LD @graph array or as multiple <script type="application/ld+json"> blocks. Both are syntactically valid.
@graph
{
  "@context": "https://schema.org",
  "@graph": [
    {"@type": "TechArticle", "headline": "...", "..." : "..."},
    {"@type": "BreadcrumbList", "itemListElement": [...]},
    {"@type": "FAQPage", "mainEntity": [...]}
  ]
}
One block and one shared @context, with clean cross-references via @id. Useful when entities reference each other (an Article whose author is an Organization defined elsewhere on the page).
Separate script blocks
<script type="application/ld+json">{"@type":"TechArticle", ...}</script>
<script type="application/ld+json">{"@type":"FAQPage", ...}</script>
Each block is independent. Easier to audit and validate one entity at a time. We use this pattern across the helperg.com trust and authority pages — each script block is auditable on its own.
Choosing
Either is valid. Default to separate blocks unless you have specific cross-reference needs. Google's Rich Results Test accepts both. References via @id can also link nodes across separate blocks when both blocks use the same absolute @id value, although whether nodes are merged across blocks depends on the consumer.
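If you start with separate blocks and later need cross-references, folding them into one @graph is mechanical. The sketch below assumes every block uses the plain https://schema.org context. toGraph and the https://helperg.com/#org identifier are hypothetical.

```javascript
// Hypothetical helper: fold standalone JSON-LD blocks into one @graph document.
// Assumes every block uses the plain "https://schema.org" context.
function toGraph(blocks) {
  return {
    '@context': 'https://schema.org',
    // Drop each block's own @context; the shared one above replaces it.
    '@graph': blocks.map(({ '@context': _ignored, ...node }) => node),
  };
}

// The @id value is an illustrative identifier, not one the site is known to use.
const org = {
  '@context': 'https://schema.org',
  '@type': 'Organization',
  '@id': 'https://helperg.com/#org',
  name: 'HELPERG LLC',
};
const article = {
  '@context': 'https://schema.org',
  '@type': 'TechArticle',
  headline: 'Example headline',
  author: { '@id': 'https://helperg.com/#org' },
};
const graph = toGraph([org, article]);
console.log(JSON.stringify(graph, null, 2));
```

The article's author points at the Organization by @id instead of repeating its fields, which is the cross-reference case where @graph earns its keep.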
10. Validation workflow
Validation is a three-step process matching the three layers from §2.
Step 1 — JSON parse + schema.org type check
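The cheapest validation: confirm every <script type="application/ld+json"> block parses as JSON and uses recognized schema.org types and properties. A small Node script handles the parsing half and prints each block's top-level @type so you can check it against the schema.org reference (a block that wraps its entities in @graph prints undefined for the type):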
node -e "
  const fs = require('fs');
  const s = fs.readFileSync('your-page.html', 'utf8');
  // Match each JSON-LD script block and capture its body.
  const re = /<script type=\"application\\/ld\\+json\">([\\s\\S]*?)<\\/script>/g;
  let m, i = 0;
  while ((m = re.exec(s))) {
    i++;
    try {
      const o = JSON.parse(m[1]);
      console.log('block', i, o['@type'], 'OK');
    } catch (e) {
      // A parse failure means the block is malformed JSON.
      console.log('block', i, 'FAIL', e.message);
    }
  }
"
This catches malformed JSON, mismatched braces, unescaped quotes inside answer text, and similar mechanical errors. helperg.com's JSON-LD validator covers this layer through a UI.
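The script above covers parsing only. For the type-check half of this step, a rough sketch is to compare each declared @type against an allow-list. The list below is an assumption: it holds only the types this guide discusses, and a fuller check would compare against the complete schema.org vocabulary. unknownTypes and topLevelNodes are hypothetical names.

```javascript
// Hand-maintained allow-list (an assumption): the types this guide covers.
const KNOWN_TYPES = new Set([
  'Article', 'TechArticle', 'NewsArticle', 'BlogPosting', 'WebPage', 'Product',
  'Organization', 'FAQPage', 'BreadcrumbList', 'SoftwareApplication',
]);

// Collect the top-level nodes of a parsed block, unwrapping @graph if present.
function topLevelNodes(block) {
  if (Array.isArray(block)) return block;
  return Array.isArray(block['@graph']) ? block['@graph'] : [block];
}

// Return every declared top-level @type that is not on the allow-list.
function unknownTypes(block) {
  return topLevelNodes(block)
    .flatMap((node) => [].concat(node['@type'] || []))
    .filter((type) => !KNOWN_TYPES.has(type));
}

// "Recipie" is a deliberate misspelling to show a failure.
console.log(unknownTypes({ '@context': 'https://schema.org', '@type': 'Recipie' }));
```

A misspelled type parses as perfectly good JSON, which is why this check is separate from the parse step.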
Step 2 — Google Rich Results Test
Run Google's Rich Results Test against the live URL. The tool reports which rich-result types Google detected as eligible for the page and which fields are missing or incorrect. Eligibility is necessary but not sufficient for display — see Layer 3.
Step 3 — Live SERP observation
The only way to know whether a rich result is actually displayed is to observe real Google SERPs for the relevant query. Use a logged-out browser (or an incognito window with location simulation) so personalized results do not pollute the test. If the Rich Results Test reports the page as eligible and Search Console shows ordinary impressions for the query, but you never see the rich result on live SERPs, the page is eligible but Google is not displaying the rich result for that query.
11. Common mistakes
- Mismatched on-page and JSON-LD content. The schema's questions/answers/dates/headline must match what is actually visible on the page. Mismatches are a frequent Rich Results Test failure.
- Missing required fields for the chosen rich result. Google's documentation distinguishes "required" from "recommended" per type. Required fields are not optional.
- Multiple incompatible primary types. Declaring @type: "Article" on a page that is a product, or vice versa, confuses parsers and fails eligibility for both.
- Self-conflicting Organization declarations. The same site declaring different Organization name, URL, or logo across pages dilutes the entity signal.
- JSON escaping errors in long FAQ answers. Unescaped quotes inside answer text are among the most common JSON-LD parse failures. Either escape the inner double quotes as \" or use typographic (curly) quotes, which need no escaping.
- Stale dateModified. If your published JSON-LD says dateModified: 2024 but the visible page is current, Google may treat the page as stale. Update the field when the content updates.
- Breadcrumb position 0. Schema.org breadcrumb position values start at 1, not 0. Off-by-one is a common copy-paste error.
- FAQPage on pages with no visible FAQ. The schema must reflect actual page content; FAQPage without a visible FAQ violates the documented usage and risks manual action.
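The escaping mistake in particular disappears if the block is serialized by code instead of typed by hand. A minimal illustration, using a made-up question and answer:

```javascript
// Illustrative answer text containing double quotes, as it would read on the page.
const answerText = 'Use the "Rich Results Test" before shipping.';

const block = {
  '@context': 'https://schema.org',
  '@type': 'FAQPage',
  mainEntity: [{
    '@type': 'Question',
    name: 'How do I test eligibility?',
    acceptedAnswer: { '@type': 'Answer', text: answerText },
  }],
};

// JSON.stringify escapes the inner quotes as \" so the output always parses.
const serialized = JSON.stringify(block);
const roundTripped = JSON.parse(serialized);
console.log(roundTripped.mainEntity[0].acceptedAnswer.text === answerText); // true
```

The same approach removes mismatched braces and trailing commas, since the serializer never produces them.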
12. What not to do
The following are explicit anti-patterns. Each has a documented policy or technical-correctness consequence.
- Never fabricate Review or AggregateRating schema. Inventing reviews, ratings, or rating counts that do not correspond to authentic user submissions is widely documented as a Google policy violation. It can result in manual action against the entire site, not just the offending page.
- Never declare a schema type that does not match the page's actual content. @type: "Recipe" on a non-recipe page is a policy issue.
- Never add schema fields just because "AI might consume them." If a field is not true for the page, omit it.
- Never claim certifications, awards, or memberships your organization has not actually obtained. This rule applies to structured data the same way it applies to copy on the page.
- Never use FAQPage as a promotional surface. If the answers are not real answers to real user questions, they do not belong in FAQPage schema.
- Never mark up content that is hidden from users (collapsed sections that are not accessible, content behind login walls for non-logged-in viewers). Mark up what users actually see.
- Never inflate counts, prices, or numerical fields in Offer, AggregateRating, or InteractionCounter. Inflated numerics are detectable and treated as policy violations.
13. Checklist
Pre-deployment checklist for any page carrying JSON-LD structured data.
- Every <script type="application/ld+json"> block parses as valid JSON.
- Every block uses a @context of https://schema.org.
- Every block declares a primary @type from schema.org.
- The primary @type matches the actual content of the page.
- If multiple entities are declared, they are either in one @graph or in separate script blocks, never partially declared.
- All url fields are absolute URLs using HTTPS.
- All url fields point to URLs that resolve.
- The headline field (if present) matches the visible <h1>, or is close to it.
- The description field (if present) matches the page's meta description.
- The datePublished field is the original publish date.
- The dateModified field is the most recent meaningful content update.
- The author and publisher fields are present where applicable.
- The publisher.logo URL resolves and meets Google's documented minimum dimensions if you want logo rich-result eligibility.
- Breadcrumb position starts at 1 and is sequential.
- Breadcrumb item values are absolute URLs.
- FAQPage name and acceptedAnswer.text match visible page content.
- FAQPage is only declared on pages that actually contain an FAQ.
- No Review or AggregateRating fields are declared unless the page has real, authentic reviews.
- Inner double-quotes inside answer text are escaped with \" or restructured.
- The Rich Results Test passes for any rich-result type you intend to be eligible for.
- The Rich Results Test results are saved or screenshotted at the time of testing for future regression detection.
- JSON-LD blocks are versioned with the rest of the page in source control.
- When the page content updates, the corresponding JSON-LD fields update too.
- The page's HTML lang attribute matches the JSON-LD inLanguage field.
- Search Console "Structured data" report is monitored for the page or the site.
- A documented decision exists for which schema types this site uses by default.
- The schema vocabulary used is documented in your own internal style guide so future authors are consistent.
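A few of the checklist items are mechanical and can run in a build or CI step. The sketch below covers only three of them (the @context value, the presence of a type, and absolute HTTPS url fields). checkBlock is a hypothetical name. The remaining items still need the Rich Results Test or a human reviewer.

```javascript
// Hypothetical pre-deployment check covering three mechanical checklist items.
function checkBlock(node) {
  const failures = [];
  if (node['@context'] !== 'https://schema.org') {
    failures.push('@context is not https://schema.org');
  }
  if (!node['@type'] && !node['@graph']) failures.push('no primary @type');

  // Walk the whole node and require every "url" string to be absolute HTTPS.
  const visit = (value) => {
    if (Array.isArray(value)) {
      value.forEach((child) => visit(child));
    } else if (value && typeof value === 'object') {
      for (const [key, child] of Object.entries(value)) {
        if (key === 'url' && typeof child === 'string' && !child.startsWith('https://')) {
          failures.push('url is not absolute HTTPS: ' + child);
        }
        visit(child);
      }
    }
  };
  visit(node);
  return failures;
}

console.log(checkBlock({
  '@context': 'https://schema.org',
  '@type': 'Organization',
  url: 'https://helperg.com',
}));
```

An empty array means the block passed these three checks, not that it is eligible for any rich result.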
14. FAQ
What is JSON-LD?
JSON-LD (JSON for Linking Data) is a W3C-standardized way to encode semantic data in JSON. It is the format Google explicitly recommends for structured data on the web because it sits cleanly in a script tag in the document head and does not require markup in the visible HTML.
Is valid JSON-LD enough to get rich results in Google?
No. Valid JSON-LD is syntactically correct schema.org markup. Rich-result eligibility is a separate, narrower set of requirements Google publishes for specific types (Article, Recipe, Product, etc.). Eligibility is necessary but still not sufficient — Google decides whether to actually display the rich result for any given query. Treat the three as a chain: syntax, then eligibility, then display.
Should I use FAQPage schema?
Use it only when the visible page actually contains a clear FAQ section whose questions and answers match the schema. Google changed FAQPage rich-result eligibility in 2023 to apply only to a narrower set of authoritative sites, so the visible-search-result payoff is reduced. The schema remains valid markup that semantic parsers (including AI systems) can use.
Can I add Review or AggregateRating schema if I do not have real reviews?
No. Fabricating reviews is widely documented as a Google policy violation and can lead to manual action against the entire site. If you do not have authentic, on-site reviews from real users, do not add the schema.
Should I use @graph or separate script blocks?
Either is valid. @graph lets you express multiple related entities in one block with cross-references. Separate script blocks are easier to audit and validate independently. We use separate script blocks across the helperg.com trust and authority pages for clarity, but @graph is a defensible choice when entity cross-references matter.
How do I validate my structured data?
Two testable steps, plus observation. First, confirm the JSON-LD parses (any JSON parser will do; we run a small Node script across the trust and authority pages). Second, run Google's Rich Results Test against the live URL to confirm rich-result eligibility specifically. Actual display can only be observed on live SERPs. helperg.com has a JSON-LD validator at /tools/json-ld-validator.html.
Does structured data improve Google ranking directly?
Google documents structured data as a way to help it understand content and enable rich-result features. Google does not document structured data as a direct ranking factor. The practical benefit is appearance-level (rich results, knowledge-panel inclusion) rather than ranking-position-level.
How does AI search use structured data?
AI providers (OpenAI, Anthropic, Perplexity, Google's generative products) have not published statements committing to structured data as a documented signal. Valid JSON-LD remains useful as general semantic markup that any parser may use. Treat it as classic-search hygiene that may carry over, not as a documented AI ranking signal.
15. Sources
- Google Search Central — Introduction to structured data markup in Google Search — captured 2026-06
- Google Search Central — Article structured data — captured 2026-06
- Google Search Central — Breadcrumb structured data — captured 2026-06
- Google Search Central — Logo structured data — captured 2026-06
- Google Search Central — Software app structured data — captured 2026-06
- Google Search Central — FAQ structured data — captured 2026-06
- Google Rich Results Test — captured 2026-06
- schema.org — home — captured 2026-06
- schema.org — full type hierarchy — captured 2026-06
- schema.org — Organization — captured 2026-06
- schema.org — Article — captured 2026-06
- schema.org — TechArticle — captured 2026-06
- schema.org — BreadcrumbList — captured 2026-06
- schema.org — FAQPage — captured 2026-06
- schema.org — SoftwareApplication — captured 2026-06
- W3C — JSON-LD 1.1 specification — captured 2026-06
- JSON-LD.org — community site — captured 2026-06
Documentation drift note: Google's structured-data documentation changes; schema types are added and refined. Capture the URL with the date you reviewed it and re-check on a documented cadence.