Definition
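Training-corpus presence is how widely a brand appears in the data that LLMs are trained on. It is the single most important signal for non-grounded LLM surfaces (default ChatGPT, Claude, ungrounded Gemini), which answer from what the model learned in training rather than from live retrieval. You earn presence by appearing wherever LLM training pipelines scrape: Wikipedia, Reddit, GitHub, established publishers, G2/Capterra, and broad press coverage.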
Why it matters
Training corpus sits in the "Signals & ranking" layer of the AI search stack. Teams that handle it well get cited more, recommended more, and earn more of the AI-mediated revenue in their category. Teams that ignore it spend a year wondering why their content investment never moves the needle inside ChatGPT or Perplexity.
Related terms
- LLM SEO - Optimizing for large language models: making sure ChatGPT, Claude, Gemini, Perplexity, and Grok know about, cite, and recommend your brand.
- Cross-source consensus - How consistently many independent sources describe a brand the same way. The single biggest factor in whether an LLM names a brand by default.
- Mention rate - The percentage of prompts in a defined panel where an LLM names your brand. The headline metric of LLM SEO programs.
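The mention-rate definition above reduces to simple arithmetic: responses that name the brand, divided by the total responses in the panel, times 100. Here is a minimal sketch in Python. The brand name "Acme", the sample responses, and the `mention_rate` helper are all hypothetical, and whole-word, case-insensitive string matching is an assumed way of detecting a mention, not a description of how any particular tool does it.

```python
import re


def mention_rate(responses, brand, aliases=()):
    """Return the percentage of responses that name the brand at least once.

    `responses` holds one LLM answer per prompt in the panel.
    Detection here is an assumption: whole-word, case-insensitive matching
    on the brand name and any aliases.
    """
    if not responses:
        raise ValueError("panel is empty")
    names = [brand, *aliases]
    alternatives = "|".join(re.escape(name) for name in names)
    # Lookarounds stop "Acme" from matching inside a longer word like "Acmeify".
    pattern = re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)
    hits = sum(1 for text in responses if pattern.search(text))
    return 100.0 * hits / len(responses)


# Hypothetical panel: one response per prompt.
panel_responses = [
    "For project tracking, many teams use Acme or Trello.",
    "Popular options include Trello and Asana.",
    "ACME is a solid choice for small teams.",
    "Consider Asana; Acmeify is a newer entrant.",
]

rate = mention_rate(panel_responses, "Acme")
print(f"Mention rate: {rate:.1f}%")
```

Two of the four sample responses name the brand, so the rate is 50%. In practice the panel would be run against each LLM separately, since the same prompts can produce different rates on different models.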
Apply it
The LLM SEO playbook ties every concept in this glossary into a single operating model. If you want to see how your brand performs across all the LLMs at once - mention rate, citation share, sentiment, rank - start with the free GEO audit or skip straight to a free Livesov account.