2026 Technical Guide

Structured Data for LLMs: The Schema Markup Guide That Gets You Cited by AI

71% of pages cited by ChatGPT include schema markup. Pages with proper structured data are cited 3.2x more often in AI-generated responses. This guide gives you the 5 essential JSON-LD schema types — with complete, production-ready code examples — that get your brand cited across ChatGPT, Perplexity, and Google AI Overviews.

  • 71% of pages cited by ChatGPT include schema markup
  • 3.2x more AI citations for pages with proper structured data
  • 3–5x citation boost from nested @id schema linking

What is schema markup — and why does it drive AI citations?

Structured data markup — specifically JSON-LD schema — is the machine-readable layer that tells AI platforms exactly what your content means, who created it, and why it is trustworthy. Pages with proper schema markup are cited 3.2x more often in AI-generated responses than pages without it. The five schema types that drive the most AI citations are: Organization, Article, FAQPage, HowTo, and Person.

Most SEO guides treat schema as a way to earn rich results in Google — star ratings, FAQ dropdowns, price displays. That is still true. But in 2026, schema has become something more fundamental: it is the primary language that AI systems use to read, trust, and cite your content.

The research confirms the stakes are real
Sites implementing structured data and FAQ schema saw a 44% increase in AI search citations (BrightEdge, 2026). Pages with FAQPage schema are 60% more likely to be featured in Google AI Overviews. A study of 73 websites found those with properly implemented structured data were cited 3.2x more often in AI-generated responses. This is not a marginal improvement — it is a categorical difference in visibility.

Why schema markup specifically helps LLMs — the technical reason

When an AI crawler retrieves your page, it reads raw HTML. This means it sees your content as unstructured text — words and sentences without inherent meaning. The AI must then infer what everything means: is "Kongzilla" a brand or a product? Is "Somesh Tripathi" an author, a client, or a subject? Is "April 2026" a publication date or a reference to an event?

Schema markup removes that guesswork. By adding JSON-LD code to your pages, you state explicitly what each entity is, in a format machines are designed to parse.

❌ Without schema: AI must guess
Raw HTML text: "Kongzilla... Somesh Tripathi... April 2026..."
AI must infer: Is this a brand? An author? A date? A product name? It may get this wrong, or skip the page entirely in favour of a clearer source.

✓ With schema: AI knows exactly
"@type": "Organization", "name": "Kongzilla"
"@type": "Person", "name": "Somesh Tripathi"
"dateModified": "2026-04-10"
No guessing. No ambiguity. Direct citation signal.
💡
The recipe card analogy

Think of schema markup as the difference between handing an AI a pile of raw ingredients versus handing it a labelled recipe card. Both contain the same information. But only one tells the AI exactly what each ingredient is, how much of it there is, and what to do with it. Schema markup is the recipe card. Without it, AI systems either guess incorrectly or skip your content entirely in favour of a source that communicates more clearly.
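
To make the "recipe card" concrete, here is a minimal Python sketch of how a generic parser can pull typed entities out of a page without any guessing. It illustrates the principle only and is not a description of how any particular AI crawler works; the class name, function name, and sample HTML are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def typed_entities(html):
    """Return (type, name) pairs for each top-level JSON-LD object in the page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return [(d.get("@type"), d.get("name")) for d in map(json.loads, parser.blocks)]


# Hypothetical sample page: the same names appear as plain text and as schema.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Kongzilla"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person", "name": "Somesh Tripathi"}
</script>
</head><body><p>Kongzilla... Somesh Tripathi... April 2026...</p></body></html>"""

entities = typed_entities(SAMPLE_HTML)
```

The body text alone says nothing about what "Kongzilla" is; the two script blocks state it outright, which is the whole point of the markup.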

  • 71% of pages cited by ChatGPT include schema markup (SERPs.io analysis)
  • 3.2x more AI citations for pages with properly implemented structured data (73-site study)
  • 44% increase in AI search citations after implementing structured data and FAQ schema (BrightEdge, 2026)

The three tiers of schema types — implement in this order

Not all schema types carry equal weight for AI citations. Implement in tier order for the fastest return on effort.

Tier 1 🔴 Critical: implement first
Organization, Article (with dateModified), FAQPage, HowTo. These four types directly drive AI citation rates. Pages with any of these consistently outperform pages without them. Start here before anything else.

Tier 2 🟡 Important: add after Tier 1
Person, BreadcrumbList, Service, Product/Pricing. These reinforce entity authority and E-E-A-T signals. Person schema is particularly high-value: pages with author schema are 3x more likely to appear in AI answers (BrightEdge, 2026).

Tier 3 🔵 Supporting: future-proof
Speakable, WebSite (with SearchAction), VideoObject, LocalBusiness. These reinforce your entity graph and improve cross-platform consistency. Speakable is growing in importance as AI voice interfaces use ChatGPT and Gemini as their backend.

The 5 essential schema types — with complete JSON-LD code examples

Production-ready code for each schema type. Copy, update with your details, and paste into your page's <head> in a <script type="application/ld+json"> block.

01. Organization schema: your brand's identity card for AI systems
Place in the <head> of the homepage only. Add it once: duplicate blocks create ambiguity, and AI systems may discard both.

Organization schema tells every AI system who you are. Without it, LLMs piece together your brand identity from scattered web mentions, and frequently get it wrong. The sameAs property is the most important field: it links your website to verified profiles on platforms such as LinkedIn, Crunchbase, G2, or Wikipedia, creating an entity graph AI systems can cross-reference.

Pages whose schema blocks are connected through shared @id references receive 3–5x higher citation rates than pages with isolated schema blocks (ADV Strategy Pro, 2026).

Organization schema — homepage <head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "@id": "https://kongzilla.co/#organization",
  "name": "Kongzilla",
  "url": "https://kongzilla.co",
  "logo": "https://kongzilla.co/logo.png",
  "description": "AI SEO agency based in India specialising in GEO, AEO, and AI search optimisation for businesses across the UK, Australia, and India.",
  "foundingDate": "2018",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Kharar",
    "addressRegion": "Punjab",
    "addressCountry": "IN"
  },
  "sameAs": [
    "https://www.linkedin.com/company/kongzilla",
    "https://www.crunchbase.com/organization/kongzilla",
    "https://g2.com/sellers/kongzilla"
  ],
  "knowsAbout": [
    "Generative Engine Optimisation",
    "Answer Engine Optimisation",
    "AI SEO",
    "White-label SEO",
    "GEO optimization"
  ]
}
</script>
The knowsAbout property — most websites skip this
knowsAbout explicitly lists the topics your organisation is authoritative on, helping AI systems associate your brand with these subject areas when generating answers. Most websites skip this property entirely — adding it gives you an immediate edge in topical authority signals across every AI platform.
02. Article schema: timestamps and authorship for every post
Add to every blog post, guide, and service page you want AI to cite.

The two most critical properties are dateModified (freshness signal) and author (E-E-A-T signal). AI systems have a strong recency bias — 95% of ChatGPT citations come from content updated within 10 months — and dateModified is how you communicate that signal explicitly.

Notice the "@id": "https://kongzilla.co/#organization" in the publisher field. This connects this Article to your Organization schema — creating a linked entity graph that AI systems recognise as a coherent, trustworthy source.

Article schema — every blog post <head>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What is Generative Engine Optimisation (GEO)? The Complete 2026 Guide",
  "description": "GEO is the practice of optimising content so AI platforms cite your brand in generated answers.",
  "datePublished": "2026-04-01",
  "dateModified": "2026-04-10",
  "author": {
    "@type": "Person",
    "@id": "https://kongzilla.co/#somesh-tripathi",
    "name": "Somesh Tripathi",
    "url": "https://www.linkedin.com/in/someshtripathi",
    "jobTitle": "Founder & CEO, Kongzilla"
  },
  "publisher": {
    "@type": "Organization",
    "@id": "https://kongzilla.co/#organization",
    "name": "Kongzilla"
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://kongzilla.co/blog/what-is-geo/"
  }
}
</script>
Critical maintenance rule
Update dateModified every time you refresh the content, and leave datePublished as the original publication date. Keep dateModified within 60 days of the current date to maintain citation eligibility. This is the single most important ongoing schema maintenance task.
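
The 60-day rule is easy to monitor automatically. The sketch below assumes you already have the Article schema loaded as a Python dict; the function names and the MAX_AGE_DAYS constant are hypothetical, and the threshold is simply the figure from the rule above.

```python
from datetime import date

MAX_AGE_DAYS = 60  # threshold taken from the maintenance rule above


def days_since_modified(article, today):
    """Days between the schema's dateModified and the supplied date."""
    # Slice to the first 10 characters so full timestamps such as
    # "2026-04-10T09:00:00+05:30" are handled as well as plain dates.
    modified = date.fromisoformat(article["dateModified"][:10])
    return (today - modified).days


def is_fresh(article, today, max_age_days=MAX_AGE_DAYS):
    """True when dateModified is not in the future and not older than the limit."""
    return 0 <= days_since_modified(article, today) <= max_age_days


article = {"@type": "Article", "dateModified": "2026-04-10"}
fresh_in_may = is_fresh(article, date(2026, 5, 1))
fresh_in_july = is_fresh(article, date(2026, 7, 1))
```

Passing the current date in as an argument, rather than calling date.today() inside the function, keeps the check repeatable when you run it across a whole sitemap.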
03. FAQPage schema: your highest-leverage AI citation type
Pages with FAQPage schema are 60% more likely to be featured in Google AI Overviews.

FAQPage schema is the single most impactful schema type for AI citation rates. LLMs are fundamentally question-answering machines, and FAQPage schema maps directly to how they retrieve and present information. Each FAQ entry is an independent citation candidate — a self-contained question and answer that AI can extract for a different query.

Three rules for maximum citation probability:

  • Keep each answer to 40–80 words and fully self-contained — an AI must be able to use the answer without surrounding context
  • Make the question text exactly match what you want AI to answer
  • Never mark up content with FAQPage schema unless those questions visibly appear on the page — schema that does not match visible content is flagged as deceptive and penalised
FAQPage schema — add to any page with a visible FAQ section
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is generative engine optimisation?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Generative engine optimisation (GEO) is the practice of structuring your content so AI platforms like ChatGPT, Perplexity, and Google AI Overviews cite your brand when answering user questions. It differs from traditional SEO by targeting AI-generated answers rather than ranked links."
      }
    },
    {
      "@type": "Question",
      "name": "How long does GEO take to show results?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "GEO optimisation typically produces first citation results within 2–6 weeks for niche queries. More competitive topics require 3–6 months. Perplexity shows results fastest; ChatGPT's base model takes 6–18 months to fully reflect changes."
      }
    }
  ]
}
</script>
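
Two of the three rules listed before the example (answer length and visible-content matching) can be checked by a script before you publish. This is a rough sketch under stated assumptions: the function name is hypothetical, word counts use simple whitespace splitting, and "visible" is approximated by a case-insensitive substring match against page text you have already extracted.

```python
def audit_faq(faq_schema, visible_text, min_words=40, max_words=80):
    """Return (question, problem) pairs for FAQ entries that break the rules."""
    problems = []
    # Normalise case and whitespace so line breaks in the page do not cause misses.
    page = " ".join(visible_text.lower().split())
    for item in faq_schema.get("mainEntity", []):
        question = item["name"]
        answer = item["acceptedAnswer"]["text"]
        words = len(answer.split())
        if not min_words <= words <= max_words:
            problems.append(
                (question, f"answer is {words} words, expected {min_words}-{max_words}")
            )
        if " ".join(question.lower().split()) not in page:
            problems.append((question, "question not visible on page"))
    return problems


# Hypothetical data: one compliant entry and one that breaks both rules.
good_answer = " ".join(["word"] * 45)
faq = {
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is generative engine optimisation?",
            "acceptedAnswer": {"@type": "Answer", "text": good_answer},
        },
        {
            "@type": "Question",
            "name": "How long does GEO take to show results?",
            "acceptedAnswer": {"@type": "Answer", "text": "It depends."},
        },
    ],
}
page_text = "FAQ\nWhat is  generative engine optimisation?\n" + good_answer
issues = audit_faq(faq, page_text)
```

An empty list means every entry passed both checks. A substring match is a deliberately crude stand-in for "visibly appears on the page", so treat a pass as a first filter rather than proof.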
04. HowTo schema: for step-by-step guides and process content
AI Overviews for "how to" queries pull from HowTo schema-marked content at a significantly higher rate.

HowTo schema maps step-by-step processes into a machine-readable format that AI systems extract directly for procedural queries. Any content structured as numbered steps — "how to allow ChatGPT to crawl your website", "how to implement FAQ schema", "how to set up llms.txt" — should have HowTo schema applied.

HowTo schema — for step-by-step guide content
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to Allow ChatGPT to Crawl Your Website",
  "description": "A step-by-step guide to configuring your robots.txt to allow GPTBot and OAI-SearchBot access to your content for ChatGPT citations.",
  "totalTime": "PT15M",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Open your robots.txt file",
      "text": "Navigate to yourdomain.com/robots.txt in your browser to see its current contents. Look for any lines that Disallow GPTBot or OAI-SearchBot."
    },
    {
      "@type": "HowToStep",
      "name": "Add Allow directives for OpenAI crawlers",
      "text": "Add a 'User-agent: GPTBot' line followed by 'Allow: /' on the next line, then a 'User-agent: OAI-SearchBot' line followed by 'Allow: /'. Save and re-upload."
    },
    {
      "@type": "HowToStep",
      "name": "Check Cloudflare Bot Fight Mode",
      "text": "If using Cloudflare, navigate to Security › Bots and verify Bot Fight Mode is not blocking AI crawlers at the network level before your robots.txt is even read."
    }
  ]
}
</script>
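
The steps described in this HowTo example can themselves be verified with Python's standard-library robots.txt parser. The sketch below is illustrative: the robots.txt contents, the example.com URL, and the function name are assumptions, and the check covers robots.txt rules only, not network-level blocking such as Cloudflare's.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. Each directive sits on its own line:
# a User-agent line, then the Allow or Disallow rules for that agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /private/
"""


def allowed_agents(robots_txt, url, agents=("GPTBot", "OAI-SearchBot")):
    """Report, per crawler, whether the given robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(ROBOTS_TXT, "https://example.com/blog/post")
```

To test a live site instead of a string, RobotFileParser also offers set_url() and read(), which fetch the file over the network.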
05. Person schema: author identity for E-E-A-T
Pages with author schema are 3x more likely to appear in AI answers (BrightEdge, 2026).

Person schema establishes author identity — one of the most significant E-E-A-T signals AI systems evaluate. AI systems treat clearly attributed, credentialed content as more trustworthy than anonymous content, exactly mirroring Google's own guidance on Experience, Expertise, Authoritativeness, and Trustworthiness.

Notice the "@id" on the Person and the "worksFor" linking back to the Organization @id. This is the nesting technique that multiplies citation rates — covered in detail in the next section.

Person schema — add to author bio section on blog posts
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Person",
  "@id": "https://kongzilla.co/#somesh-tripathi",
  "name": "Somesh Tripathi",
  "jobTitle": "Founder & CEO",
  "worksFor": {
    "@type": "Organization",
    "@id": "https://kongzilla.co/#organization"
  },
  "url": "https://www.linkedin.com/in/someshtripathi",
  "knowsAbout": [
    "AI SEO",
    "Generative Engine Optimisation",
    "Digital Marketing",
    "Content Strategy"
  ]
}
</script>

Schema nesting — the @id technique that multiplies citation rates by 3–5x

Individual schema blocks are useful. Nested schema blocks — where related schemas are connected through shared @id references — are dramatically more powerful for AI citations.

When you link your Article schema to your Organization schema through a shared @id, and link your Person schema to your Organization schema through worksFor, you create what researchers call a content knowledge graph. AI systems can follow these relationships to understand that this Article was written by this Person who works for this Organisation that knowsAbout these topics.

🏢 Organization
  • @id: kongzilla.co/#organization
  • knowsAbout: GEO, AEO, AI SEO
👤 Person
  • @id: kongzilla.co/#somesh-tripathi
  • worksFor → Organization @id
📄 Article
  • author → Person @id
  • publisher → Organization @id
FAQPage
  • Linked within the same page, so every FAQ item is citable independently
Research: 3–5x higher citation rates from connected schema
Research from ADV Strategy Pro found that pages with connected schema using shared @id references receive 3–5x higher citation rates than pages with isolated, unconnected schema blocks. The @id is your brand's permanent identifier — the thread that connects every piece of content back to a single, trusted entity.
The @id nesting rule — follow this on every page
  • Every Article schema → references the same Organisation @id in its publisher field
  • Every Person schema → references the same Organisation @id in its worksFor field
  • Every Service page → references the Organisation @id in its provider field
  • The @id value is always the same: https://kongzilla.co/#organization
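
The nesting rule above can be enforced with a small check over your parsed schema objects. This is a sketch, assuming each schema block has already been loaded into a Python dict; the function name and the ORG_REFERENCE_FIELD mapping are hypothetical, built directly from the three bullet points in the rule.

```python
ORG_ID = "https://kongzilla.co/#organization"

# Field that should point at the Organization, per schema type (from the rule above).
ORG_REFERENCE_FIELD = {
    "Article": "publisher",
    "Person": "worksFor",
    "Service": "provider",
}


def broken_org_links(nodes, org_id=ORG_ID):
    """Return (type, field, found_id) for nodes that do not reference org_id."""
    broken = []
    for node in nodes:
        field = ORG_REFERENCE_FIELD.get(node.get("@type"))
        if field is None:
            continue  # this type is not covered by the rule
        ref = node.get(field)
        ref_id = ref.get("@id") if isinstance(ref, dict) else None
        if ref_id != org_id:
            broken.append((node.get("@type"), field, ref_id))
    return broken


nodes = [
    {"@type": "Article", "publisher": {"@type": "Organization", "@id": ORG_ID}},
    # Spelling drift in the fragment: "#organisation" instead of "#organization".
    {"@type": "Person", "worksFor": {"@id": "https://kongzilla.co/#organisation"}},
    {"@type": "Service", "provider": "Kongzilla"},  # plain text, no @id link
]
problems = broken_org_links(nodes)
```

The comparison is an exact string match on purpose: a trailing slash or a British-versus-American spelling in the fragment produces a different identifier, and the link is lost.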

The 6 schema mistakes that silently block AI citations

Each of these errors can invalidate your entire schema implementation — with no error visible to you, but zero benefit in AI citations.

| Mistake | What goes wrong | The fix |
| --- | --- | --- |
| Schema not matching visible content | If your FAQPage schema lists questions not visibly on the page, Google and AI systems flag it as deceptive. The schema is ignored entirely, or penalised. | Only mark up content genuinely visible on the page. Every FAQ in your schema must appear as readable HTML text. |
| Duplicate schema blocks | Two Organization schema blocks, or two Article blocks on the same page, create conflicting signals. AI systems cannot resolve which is authoritative and may discard both. | One Organization schema per site (homepage only). One Article schema per page. Audit for duplicates from SEO plugins before adding custom schema. |
| Missing dateModified | Without dateModified, AI systems cannot evaluate content freshness. Stale content loses citations to fresher sources on the same topic, even if your content is better written. | Always include dateModified and update it every time you refresh the content. Keep it within 60 days of the current date. |
| Inconsistent entity names | If your Organization schema says "Kongzilla" but LinkedIn shows "KongZilla" and Crunchbase shows "Kong Zilla", Perplexity's entity matching fails and the schema is disregarded. | Standardise your brand name exactly everywhere: schema, LinkedIn, Crunchbase, Google Business Profile, all directories. One exact spelling, always. |
| Using Microdata or RDFa | Microdata and RDFa embed schema inside HTML tags, creating parsing conflicts when AI crawlers process rich text. Both formats are harder to maintain and generate more errors. | Use JSON-LD exclusively. Place it in a dedicated <script type="application/ld+json"> block in the page <head>. |
| Never validating after implementation | A single misplaced comma or missing bracket in JSON-LD invalidates the entire block. Search engines silently ignore invalid schema: you see no error, but get zero benefit. | Validate every schema block in Google's Rich Results Test before and after deploying. Re-validate any time you edit the schema. |
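
Two of these mistakes, invalid JSON and duplicate blocks, can be caught locally before you reach the Rich Results Test. The sketch below assumes you have already extracted the raw text of each JSON-LD block from the page; the function name is hypothetical, and it does not handle blocks that wrap several entities in an @graph array.

```python
import json
from collections import Counter


def audit_blocks(raw_blocks, single_instance_types=("Organization", "Article")):
    """Return a list of human-readable problems found in the JSON-LD blocks."""
    errors = []
    type_counts = Counter()
    for index, raw in enumerate(raw_blocks):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(
                f"block {index}: invalid JSON at line {exc.lineno}, column {exc.colno}"
            )
            continue
        if isinstance(data, dict):
            types = data.get("@type") or []
            # "@type" may be a single string or a list of strings.
            for schema_type in [types] if isinstance(types, str) else types:
                type_counts[schema_type] += 1
    for schema_type in single_instance_types:
        if type_counts[schema_type] > 1:
            errors.append(
                f"{type_counts[schema_type]} {schema_type} blocks found, expected at most 1"
            )
    return errors


# Hypothetical page: a trailing comma in block 1, and two valid Organization blocks.
blocks = [
    '{"@type": "Organization", "name": "Kongzilla"}',
    '{"@type": "Article", "headline": "What is GEO?",}',
    '{"@type": "Organization", "name": "KongZilla"}',
]
report = audit_blocks(blocks)
```

Python's json module is strict in the same ways that matter here: a trailing comma, a missing bracket, or a raw line break inside a string value all raise an error. Passing this check means the JSON is well formed, not that the schema is eligible for rich results, so it complements the Rich Results Test rather than replacing it.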

Complete schema implementation checklist — minimum viable stack for AI citations

This checklist covers the minimum viable schema stack for consistent AI citations across ChatGPT, Perplexity, and Google AI Overviews.

🏠
Homepage — one-time setup
  • Organization schema in JSON-LD with: @id, name, url, logo, description, foundingDate, address, sameAs (LinkedIn, Crunchbase, G2), knowsAbout
  • WebSite schema with SearchAction (describes your on-site search; Google retired the sitelinks search box result in November 2024, so do not expect that feature from it)
  • Validated in Google Rich Results Test with zero errors
  • Page source confirms only ONE Organization block exists — no plugin duplicates
📝
Every blog post and guide
  • Article schema with: headline, description, datePublished, dateModified, author (Person @id), publisher (Organization @id), mainEntityOfPage
  • FAQPage schema if the post contains a visible FAQ section (minimum 3 questions)
  • HowTo schema on any numbered step-by-step process section
  • Person schema on author bio section, linked to Organization via worksFor @id
  • dateModified updated every time the content is refreshed — not just datePublished
🎯
Service and landing pages
  • Service schema with: name, description, provider (Organization @id), areaServed, url
  • FAQPage schema if the service page includes visible questions and answers
  • BreadcrumbList schema on all pages below homepage (helps AI understand site structure)
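
Service schema is the one type in this checklist without a code example earlier in the guide, so here is a minimal sketch that generates it. It assumes the same Organization @id used throughout; the function name, the service name, the service URL, and the use of country codes for areaServed are hypothetical placeholders to replace with your own details.

```python
import json

ORG_ID = "https://kongzilla.co/#organization"
OPEN_TAG = '<script type="application/ld+json">\n'
CLOSE_TAG = "\n</script>"


def service_jsonld(name, description, url, area_served, org_id=ORG_ID):
    """Build a Service JSON-LD script block linked to the Organization @id."""
    data = {
        "@context": "https://schema.org",
        "@type": "Service",
        "name": name,
        "description": description,
        "url": url,
        "areaServed": list(area_served),
        "provider": {"@type": "Organization", "@id": org_id},
    }
    # json.dumps guarantees valid JSON: no stray commas, no raw line breaks in strings.
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so text containing "</script>" cannot close the block early.
    # "\/" is a valid JSON escape for "/", so parsers read the original value back.
    body = body.replace("</", "<\\/")
    return OPEN_TAG + body + CLOSE_TAG


tag = service_jsonld(
    name="AI visibility audit",  # hypothetical service
    description="Audit of schema gaps that affect AI citations.",
    url="https://kongzilla.co/services/ai-visibility-audit/",  # hypothetical URL
    area_served=["GB", "AU", "IN"],
)
```

Generating the block from a dict rather than editing JSON by hand removes the misplaced-comma class of error entirely, and the provider reference keeps the page tied into the same entity graph as your Article and Person schema.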
🔄
Ongoing maintenance
  • Google Search Console Enhancements report checked monthly for schema errors
  • High-priority pages refreshed, with dateModified updated to match, every 6–8 weeks
  • New pages validated in Rich Results Test before publishing
  • sameAs links in Organization schema checked quarterly — remove any that are outdated or broken

Tools to implement and validate schema markup

You do not need to hand-code every JSON-LD block from scratch. These tools make implementation faster and validation reliable.

| Tool | What it does | Cost |
| --- | --- | --- |
| Google Rich Results Test | Validates your JSON-LD and shows exactly which rich results your schema is eligible for. The most reliable validation tool available. | Free |
| Schema.org Validator | Checks your JSON-LD against the full Schema.org specification. Catches errors that Rich Results Test may not flag. | Free |
| Google Search Console Enhancements | Shows schema errors and warnings across your entire site with page-level detail. Essential for ongoing monitoring at scale. | Free |
| Yoast SEO / RankMath | WordPress plugins that auto-generate baseline schema (Organization, Article, BreadcrumbList) from post metadata. Reduces manual coding for standard types. | Freemium |
| Merkle Schema Markup Generator | Free web-based tool that generates clean JSON-LD for all major schema types. Good starting point for manual implementation before customising. | Free |
| Schema App | Enterprise schema management platform with visual editor and bulk implementation. Best for large sites with hundreds of pages requiring ongoing schema management. | Paid |
Recommended workflow for most businesses
Let your SEO plugin (Yoast or RankMath) generate the baseline Organization, Article, and BreadcrumbList schema automatically. Then manually add FAQPage schema on any page with a visible FAQ section, HowTo schema on step-by-step guides, and Person schema on author bio sections. Validate every addition in Google Rich Results Test before publishing. This hybrid approach covers the full Tier 1 and Tier 2 schema stack without excessive manual effort.

Frequently Asked Questions

Does schema markup guarantee AI citations?
No — schema increases the probability of AI citations by reducing ambiguity and building trust signals, but does not guarantee selection. Content quality, topical authority, freshness, and relevance still determine whether AI platforms choose your content. Schema is one important layer of a complete AI visibility strategy, not a standalone solution. Think of it as a multiplier: it amplifies the citation potential of good content, but cannot compensate for thin or poorly structured content on its own.
What is the difference between JSON-LD, Microdata, and RDFa?
All three are ways to add schema markup to a page, but JSON-LD is the only format worth using in 2026. JSON-LD places your structured data in a separate script block, keeping it cleanly separated from your HTML. Microdata and RDFa embed schema attributes directly inside HTML tags, which creates parsing conflicts when AI crawlers process rich text and is significantly harder to maintain. Google explicitly recommends JSON-LD for structured data, and all major AI platforms prefer it for extraction reliability.
My WordPress SEO plugin already generates schema. Do I need to add more?
Probably yes. Plugins like Yoast and RankMath generate solid baseline schema — Organization, WebSite, Article, BreadcrumbList — from your post metadata. However, they do not typically generate FAQPage schema for specific pages, HowTo schema for guide sections, Person schema for individual author profiles, or Service schema for service pages. For comprehensive AI citation coverage, you will need to add these types beyond what your plugin provides. Audit your pages in Google Rich Results Test first to see exactly what is and is not currently marked up.
How often should I update my schema markup?
Your Organization schema should be reviewed quarterly to ensure sameAs links are current and knowsAbout properties reflect your actual service focus. Your Article schema's dateModified property should be updated every time you refresh the content — this is the most critical ongoing maintenance task. FAQPage and HowTo schema should be updated whenever you add or change the underlying FAQ or step content. After any major schema update, always re-validate in Google Rich Results Test before publishing.
Does schema markup help with Perplexity and ChatGPT — or only Google?
Research confirms schema markup helps across all major AI platforms, not just Google. Microsoft confirmed that its LLMs use structured data to interpret web content. Analysis by SERPs.io found 71% of pages cited by ChatGPT include schema markup. Perplexity's entity matching is specifically sensitive to consistency between schema entity names and visible on-page content. The JSON-LD format is universally supported across Google, Bing (which powers ChatGPT Search), and Perplexity's own crawler. Implementing schema once benefits your visibility across all three major AI platforms simultaneously.
Free · No Commitment · Results in 48 Hours

Which schema gaps are costing you AI citations today?

Kongzilla is an AI SEO agency based in India specialising in GEO, AEO, and complete AI search optimisation for businesses across the UK, Australia, and India. We implement the full technical schema stack — Organization, Article, FAQPage, HowTo, Person, and Service schema — as part of every client engagement, alongside content strategy and entity authority building. Book a free AI visibility audit and see exactly which schema gaps are costing you citations today.
