Test Date: December 2, 2025
LLM Tested: Google Gemini 3 Pro Preview
Test Objective: Determine whether structured data (JSON-LD schema) helps or hinders LLM information extraction from web pages.
Do Large Language Models benefit from JSON-LD schema markup when extracting product information from web pages?
Many developers believe that schema markup (JSON-LD) provides no value for LLMs since these models can already extract information from visible text. This experiment directly tests that assumption by measuring extraction accuracy across four different data presentation scenarios.
We created controlled pages for the same product, presenting its data in four different ways:
Variant A: All product data visible in HTML tables. No JSON-LD schema markup included.
Variant B: Visible text and JSON-LD contain identical information. Tests dual-source access.
Variant C: Visible text is vague or generic, while JSON-LD schema contains specific details. Tests whether LLMs retrieve precise information from structured markup.
Variant D: Visible text conflicts with JSON-LD schema values. Tests source prioritization.
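To make the four presentations concrete, here is a minimal sketch of how such pages could be generated. It is not the code used in the study: the helper names, the HTML layout, and the vague heading text are assumptions, and the write-up does not say which source carried which value in the conflicting variant, so the assignment below is arbitrary.

```python
import json

# Product values taken from the results table; the dict layout is an assumption.
PRODUCT = {
    "name": "FluxClean 2000 Tire Degreaser",
    "sku": "FC-2000-RED",
    "brand": "FluxClean",
    "price": "47.99",
    "currency": "USD",
    "colors": ["Red", "Black"],
    "size": "5L",
}


def json_ld_script(p):
    """Serialize a product as a schema.org Product JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": p["name"],
        "sku": p["sku"],
        "brand": {"@type": "Brand", "name": p["brand"]},
        "color": ", ".join(p["colors"]),
        "size": p["size"],
        "offers": {
            "@type": "Offer",
            "price": p["price"],
            "priceCurrency": p["currency"],
        },
    }
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


def detailed_table(p):
    """Render every attribute as visible HTML table rows."""
    rows = [
        ("SKU", p["sku"]),
        ("Price", f"${p['price']}"),
        ("Colors", ", ".join(p["colors"])),
        ("Size", p["size"]),
        ("Brand", p["brand"]),
    ]
    cells = "".join(f"<tr><th>{k}</th><td>{v}</td></tr>" for k, v in rows)
    return f"<h1>{p['name']}</h1><table>{cells}</table>"


# Hypothetical vague copy; only "Multiple colors available" is quoted in the write-up.
VAGUE_HTML = "<h1>Tire Degreaser</h1><p>Multiple colors available.</p>"


def page(visible_html, schema_product=None):
    """Assemble a page; omit the JSON-LD block when no schema product is given."""
    head = json_ld_script(schema_product) if schema_product else ""
    return f"<html><head>{head}</head><body>{visible_html}</body></html>"


# Assumption: in the conflicting variant the schema carries a different price.
conflicting = dict(PRODUCT, price="49.99")

pages = {
    "A": page(detailed_table(PRODUCT)),               # visible text only
    "B": page(detailed_table(PRODUCT), PRODUCT),      # identical in both sources
    "C": page(VAGUE_HTML, PRODUCT),                   # vague text, detailed schema
    "D": page(detailed_table(PRODUCT), conflicting),  # sources disagree
}
```

Keeping the product record in one place and varying only how it is rendered is what makes the variants comparable: any difference in extraction accuracy can then be attributed to presentation rather than content.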
Each variant was asked five questions about the product, covering SKU, price, colors, size, and brand.
Test product: FluxClean 2000 Tire Degreaser, a fictional product with clearly defined attributes, well suited to controlled testing.
Overall Accuracy: 100% (18/18 questions correct)
By Variant:
| Variant | Question | Expected | Actual | Result |
|---|---|---|---|---|
| A | SKU | FC-2000-RED | FC-2000-RED | ✓ Correct |
| A | Price | $47.99 | $47.99 | ✓ Correct |
| A | Colors | Red, Black | Red, Black | ✓ Correct |
| A | Size | 5L | 5L | ✓ Correct |
| A | Brand | FluxClean | FluxClean | ✓ Correct |
| B | SKU | FC-2000-RED | FC-2000-RED | ✓ Correct |
| B | Size | 5L | 5L | ✓ Correct |
| B | Brand | FluxClean | FluxClean | ✓ Correct |
| C | SKU | FC-2000-RED | FC-2000-RED | ✓ Correct |
| C | Price | $47.99 | 47.99 USD | ✓ Correct |
| C | Colors | Red, Black | Red, Black | ✓ Correct |
| C | Size | 5L | 5L | ✓ Correct |
| C | Brand | FluxClean | FluxClean | ✓ Correct |
| D | SKU | FC-2000-RED | FC-2000-RED | ✓ Correct |
| D | Price | $49.99 | $49.99 | ✓ Correct |
| D | Colors | Red, Black | Red, Black | ✓ Correct |
| D | Size | 5L | 5L | ✓ Correct |
| D | Brand | FluxClean | FluxClean | ✓ Correct |
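The table counts "47.99 USD" as a correct answer for an expected "$47.99", which implies some normalization before comparison. The following is one possible sketch of such scoring, not the study's actual grader; the `normalize` rules and function names are assumptions.

```python
import re


def normalize(value):
    """Lowercase and strip currency markers and whitespace before comparing."""
    v = value.strip().lower()
    v = re.sub(r"\busd\b|\$", "", v)  # drop "$" and "USD"
    return re.sub(r"\s+", "", v)      # drop all remaining whitespace


def is_correct(expected, actual):
    """Treat answers as equal when their normalized forms match."""
    return normalize(expected) == normalize(actual)


def accuracy(rows):
    """rows: iterable of (variant, question, expected, actual) tuples."""
    rows = list(rows)
    correct = sum(is_correct(exp, act) for _, _, exp, act in rows)
    return correct / len(rows)


# A few rows copied from the results table.
sample = [
    ("A", "Price", "$47.99", "$47.99"),
    ("C", "Price", "$47.99", "47.99 USD"),
    ("C", "Colors", "Red, Black", "Red, Black"),
    ("D", "Price", "$49.99", "$49.99"),
]
score = accuracy(sample)
```

Exact string matching would have marked the Variant C price as wrong, so the choice of normalization directly affects the reported accuracy and is worth stating alongside the results.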
Gemini 3 Pro Preview provided confidence explanations for each answer. Example from Variant C (Vague text, detailed schema):
"While the visible text only states 'Multiple colors available', the JSON-LD schema explicitly lists the colors as 'Red' and 'Black'."
This indicates that the model reads both sources and distinguishes between schema markup and visible text.
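One way to check independently where a given answer could have come from is to split a page into its JSON-LD blocks and its visible text, then look for the value in each. The sketch below uses only the Python standard library; the class name, the `source_of` helper, and the sample page are hypothetical and are not part of the study.

```python
import json
from html.parser import HTMLParser


class PageSplitter(HTMLParser):
    """Collect JSON-LD payloads and visible text separately."""

    def __init__(self):
        super().__init__()
        self._in_script = False
        self._is_json_ld = False
        self._buffer = []
        self.schema_blocks = []
        self.visible_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self._is_json_ld = dict(attrs).get("type") == "application/ld+json"
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "script":
            if self._is_json_ld:
                # Parse the buffered script body once the tag closes.
                self.schema_blocks.append(json.loads("".join(self._buffer)))
            self._in_script = False
            self._is_json_ld = False

    def handle_data(self, data):
        if self._in_script:
            self._buffer.append(data)
        elif data.strip():
            self.visible_text.append(data.strip())


def source_of(value, html):
    """Report whether a value appears in the schema, the visible text, or both."""
    splitter = PageSplitter()
    splitter.feed(html)
    splitter.close()
    in_schema = any(value in json.dumps(block) for block in splitter.schema_blocks)
    in_visible = any(value in text for text in splitter.visible_text)
    if in_schema and in_visible:
        return "both"
    if in_schema:
        return "schema_only"
    if in_visible:
        return "visible_only"
    return "neither"


# Hypothetical page in the style of Variant C.
sample_page = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Product", "color": "Red, Black"}'
    "</script></head><body><p>Multiple colors available.</p></body></html>"
)
origin = source_of("Red", sample_page)
```

If a value is classified as `schema_only` and the model still answers correctly, the answer cannot have come from the visible text of that page.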
View the actual product pages used in this study:
Experiment Design & Testing: December 2025
Test code available on request. Full dataset includes 5 products × 4 variants × 5 questions = 100 total test cases.
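The 100-case figure is a simple Cartesian product, which can be enumerated as in the sketch below. Only the first product name comes from this write-up; the other four names are placeholders, and the question labels are taken from the results table.

```python
from itertools import product as cartesian

# Only the first entry is named in the write-up; the rest are placeholders.
PRODUCTS = [
    "FluxClean 2000 Tire Degreaser",
    "product_2",
    "product_3",
    "product_4",
    "product_5",
]
VARIANTS = ["A", "B", "C", "D"]
QUESTIONS = ["SKU", "Price", "Colors", "Size", "Brand"]

# One test case per (product, variant, question) combination.
test_cases = list(cartesian(PRODUCTS, VARIANTS, QUESTIONS))
total = len(test_cases)
```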