LLM Research Studies

Overview: This site contains controlled experiments testing how Large Language Models (LLMs) process and extract information from web pages. Each study examines a different aspect of LLM behavior when working with structured data, formatting variations, and content presentation methods.

Research Studies

Study 1: Schema Markup vs. Plain Text

Research Question: Do LLMs benefit from JSON-LD schema markup when extracting product information, or can they extract equivalent data from plain text alone?

Methodology: Five products were tested across four page variants (text only, text with aligned schema, schema with extra facts, and conflicting data) to measure extraction accuracy and which source the LLM prioritizes.
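As a rough illustration of the four variants, the sketch below builds them for a single made-up product. The product, the field names, the choice of `color` as the extra fact, and the price used as the conflicting value are all assumptions for illustration; the study's actual pages and data may differ.

```python
import json


def product_text(product: dict) -> str:
    """Render the visible plain-text description of the product."""
    return (
        f"<p>The {product['name']} costs "
        f"{product['price']} {product['currency']}.</p>"
    )


def product_schema(product: dict) -> str:
    """Serialize product facts as a schema.org Product JSON-LD script tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "offers": {
            "@type": "Offer",
            "price": product["price"],
            "priceCurrency": product["currency"],
        },
    }
    # Optional fact that appears only in the schema for one variant.
    if "color" in product:
        data["color"] = product["color"]
    return (
        '<script type="application/ld+json">'
        + json.dumps(data)
        + "</script>"
    )


def build_variants(product: dict, extra: dict, conflict: dict) -> dict:
    """Return the four page variants; the visible text is identical in all."""
    text = product_text(product)
    return {
        "text_only": text,
        "aligned": text + product_schema(product),
        "schema_extra": text + product_schema({**product, **extra}),
        "conflicting": text + product_schema({**product, **conflict}),
    }


# Hypothetical product, not taken from the study.
product = {"name": "Example Kettle", "price": "39.99", "currency": "USD"}
variants = build_variants(
    product,
    extra={"color": "red"},       # fact present only in the schema
    conflict={"price": "49.99"},  # schema price disagrees with the text
)
```

Keeping the visible text identical across variants means any difference in what the model extracts can be attributed to the schema block alone.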

View Study Details →

Study 2: Text Formatting & Visual Presentation

Research Question: How does text formatting (tables, lists, plain paragraphs) affect LLM information extraction accuracy?

Methodology: Tests whether structured HTML formatting (tables, lists) helps LLMs parse information more accurately than unformatted text blocks.
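One way to picture this comparison is to render the same facts in each format and score what the model returns. The sketch below is a minimal version under stated assumptions: the facts, the helper names, and the exact-match scoring rule are hypothetical and are not the study's actual materials or metric.

```python
from html import escape


def as_paragraph(facts: dict) -> str:
    """Render the facts as one unformatted paragraph."""
    sentences = " ".join(f"The {k} is {v}." for k, v in facts.items())
    return f"<p>{escape(sentences)}</p>"


def as_list(facts: dict) -> str:
    """Render the facts as an unordered HTML list."""
    items = "".join(
        f"<li>{escape(k)}: {escape(v)}</li>" for k, v in facts.items()
    )
    return f"<ul>{items}</ul>"


def as_table(facts: dict) -> str:
    """Render the facts as a two-column HTML table."""
    rows = "".join(
        f"<tr><th>{escape(k)}</th><td>{escape(v)}</td></tr>"
        for k, v in facts.items()
    )
    return f"<table>{rows}</table>"


def score(extracted: dict, expected: dict) -> float:
    """Fraction of expected fields whose extracted value matches exactly."""
    correct = sum(1 for k, v in expected.items() if extracted.get(k) == v)
    return correct / len(expected)


# Hypothetical facts, not taken from the study.
facts = {"weight": "1.2 kg", "color": "red"}
pages = {
    "paragraph": as_paragraph(facts),
    "list": as_list(facts),
    "table": as_table(facts),
}
```

Each page would be given to the model with the same extraction prompt, and `score` applied to its parsed answer, so that only the formatting varies between conditions.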

View Study Details →