GEO Stack 2026: What Works for AI Visibility

Anyone trying to optimise a website for AI search engines today faces a paradox: there are more tools, standards, and recommendations than ever before, and at the same time less hard evidence about what actually works.

Generative Engine Optimization, AI Search Optimization, Agentic SEO. The labels change every quarter. What stays is the question of where to place your bets. Some of the tools currently discussed under these labels are old acquaintances from classic SEO that have unexpectedly become important again in this new world. Others are fresh standards that Cloudflare, Anthropic, OpenAI, or Google are only just establishing. And some are sold as “revolutionary” and “essential” even though hard evidence of their effectiveness is missing.

This article attempts a stocktaking as of late May 2026. Which measures already pay off measurably today? Which can be implemented quickly for a highly likely future? Which may be displaced by a new standard in two years? And where is it worthwhile to do nothing and keep watching how the development unfolds?

What Works for AI Visibility? Hard to Say

Classic SEO follows a familiar logic. Google ranks pages, you can measure those rankings, you can optimise content, and you see the effects in Search Console. Over two decades, this feedback loop allowed practitioners to test hypotheses and consolidate best practices.

Generative search lacks the body of experience that eventually produces a valid canon. When ChatGPT mentions an organisation in an answer, usually nobody knows why. Was it the schema markup, a Reddit thread from 2023, a link in an industry publication, or simply the training data? AI visibility today is not a question of ranking but a question of reputation. That reputation emerges from a wide range of signals, most of which, unfortunately, cannot be measured reproducibly.

There is a second weak spot. Many of the standards we are implementing today are being pushed in part by individual vendors with their own interests in mind, without a neutral consensus. Cloudflare promotes a vision of “Agent Readiness”, Anthropic drives MCP forward, Google experiments with AI Mode and its own WebMCP variant, and smaller initiatives like llms.txt establish bottom-up conventions, some of which find broad adoption without the large LLM providers demonstrably acting on them. Anyone investing today is investing in a lot of bets at once.

The consequence: much of what looks like time well spent in 2026 may turn out to be unnecessary by 2027. And some of what looks like hype today may become mandatory in eighteen months. You could read this as a sign that the optimisation industry tells a lot of tall tales because nobody can prove what works anyway, but it also reflects a genuinely open technological transition.

The Foundation: Classic SEO Still Matters

There is good news, too: a lot stays as it was. The most important levers for AI visibility in 2026 are still classic SEO measures. Google’s John Mueller has repeated this several times over the past twelve months, most recently on Bluesky and in several interviews: visibility in Google’s AI Overviews and AI Mode depends on the same signals that mattered before, only with a different weighting.

Crawlability, clean HTML, fast load times, well-maintained information architecture, hreflang attributes on multilingual sites, an XML sitemap. All of that is still the most important foundation in 2026. If an AI crawler cannot reliably fetch and understand a page, the best schema markup will not help. If the content has nothing relevant to say, an llms.txt file will not change that either.

One thing has shifted, though. Several studies from the first quarter of 2026 show a clear decline in the correlation between traditional metrics like Domain Authority and actual AI visibility. Instead, content with high semantic completeness, a clear answer structure, and demonstrable originality is preferentially cited. The rule of thumb: direct answers in the first 60 to 170 words of a section, clear question-and-answer structures, statements that can be cleanly attributed.

What Google established in its March 2026 core update, with “Information Gain” as the dominant signal, captures this shift well. Content is now evaluated on whether it actually contributes new insights to the already indexed knowledge base. Anyone repeating the same points others have already articulated better falls behind. Anyone contributing their own data, their own framework, or their own observation gains ground. This logic applies to AI Overviews just as it does to traditional search results.

The takeaway for the foundation stack: in 2026, a website without sound technical SEO basics is not merely “not optimised for AI”. It is not functional in modern search at all.

Structured Data: Long Overlooked, Now Essential

Schema.org markup has quietly gained importance in 2026. For a long time, structured markup was seen as a means to the end of rich snippets, those prettier search results with stars, prices, or FAQ accordions.

With the Google core update of March 2026, that role has changed. Google tightened eligibility for classic rich results on several schema types while at the same time increasing the weight of schema data as a trust and entity signal in its AI Mode. In plain terms: FAQ markup on a page that is not really an FAQ no longer helps produce a pretty search result. But it can definitely contribute to Google’s Gemini model classifying the page as a source worth citing when it generates an answer.

Several analyses from spring 2026 report significant gains in AI citation rates for sites that have set up their Organization, Article, Person, and Service schemas cleanly and connected them via sameAs links to Wikipedia, LinkedIn, and other authoritative sources. The schema types with the greatest practical leverage right now are:

Organization with sameAs, contactPoint, and knowsAbout as anchors for entity recognition
Person for authors, with verification through external profiles
Article with author, datePublished, dateModified
Product and Service for e-commerce and service providers
FAQPage and HowTo, still useful, but only where the content structure and the markup genuinely match
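
As an illustration of the first three types, the following sketch builds a JSON-LD graph that links Organization, Person, and Article via @id references and sameAs links. It is only a minimal example: every name, URL, and date is a placeholder, and the helper names organization_jsonld and as_script_tag are made up for this sketch.

```python
import json


def organization_jsonld():
    """Build a small schema.org graph. All values are placeholders."""
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "Organization",
                "@id": "https://example.com/#org",
                "name": "Example GmbH",
                "url": "https://example.com/",
                # sameAs connects the entity to external profiles.
                "sameAs": [
                    "https://www.linkedin.com/company/example",
                    "https://en.wikipedia.org/wiki/Example",
                ],
                "contactPoint": {
                    "@type": "ContactPoint",
                    "contactType": "customer support",
                    "email": "hello@example.com",
                },
                "knowsAbout": ["Technical SEO", "Structured data"],
            },
            {
                "@type": "Person",
                "@id": "https://example.com/#author",
                "name": "Jane Doe",
                "sameAs": ["https://www.linkedin.com/in/janedoe-example"],
                "worksFor": {"@id": "https://example.com/#org"},
            },
            {
                "@type": "Article",
                "headline": "Example article",
                "author": {"@id": "https://example.com/#author"},
                "publisher": {"@id": "https://example.com/#org"},
                "datePublished": "2026-05-01",
                "dateModified": "2026-05-20",
            },
        ],
    }


def as_script_tag(data):
    """Wrap the graph in the script element that goes into the page head."""
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(data, indent=2)
        + "\n</script>"
    )


print(as_script_tag(organization_jsonld()))
```

Referencing the author and publisher by @id keeps each entity defined once, which makes it easier to keep sameAs links consistent across pages.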

Adoption of Speakable and similar specialty schemas is low, which can also be an opportunity. Anyone using them where the content is actually suited to them has a rare competitive advantage in 2026.

Structured data is therefore among the measures with demonstrable effect and a solid future outlook. The investment already pays off today, and is likely to remain relevant in two years’ time, because the logic of entity recognition and trust anchors works independently of the specific crawler standard.

robots.txt: Instructions for AI Crawlers

Few files have gained as much importance over the past eighteen months as robots.txt. For a long time it was a barely noticed technical formality. Today it is the most important lever for deciding which AI providers even get the chance to perceive a website.

Three developments brought this about.

First, the large AI providers have differentiated their crawlers. OpenAI now distinguishes between GPTBot (training), OAI-SearchBot (real-time search in ChatGPT), and ChatGPT-User (user-triggered fetches). Anthropic separates ClaudeBot, Claude-SearchBot, and Claude-User. Google differentiates between Googlebot (classic search), Google-Extended (AI training), and more recently the AI Mode components. Anyone blocking everything across the board locks themselves out of modern AI search. Anyone allowing everything gives away potentially valuable content for AI training without being compensated for it.
Second, on what Cloudflare called Content Independence Day on 1 July 2025, the company flipped the default setting for all newly registered domains on Cloudflare. AI crawlers have been blocked by default ever since, unless the operator explicitly opts in. Cloudflare serves around one fifth of the web through its services, which gives this step structural weight. Alongside it, Cloudflare launched a marketplace called Pay Per Crawl, which lets publishers grant AI crawlers access only against payment. In August 2025, AI Crawl Control followed, reviving the HTTP 402 “Payment Required” status code and making it machine-readably negotiable.
Third, Content Signals have established a new convention, placed directly in robots.txt. Instead of just “allow or disallow”, site operators can declare three separate permission levels: may the crawler use the content for AI training (ai-train)? For AI inference and real-time grounding (ai-input)? For regular search indexing (search)?

A modern robots.txt for a corporate website can therefore signal that content is available for classic search and for use in AI answers, but not for training new models. This kind of differentiation was not technically possible twelve months ago. Today it is available and increasingly read. Compliance by crawler providers remains voluntary, but that was already the case for search crawler directives.
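
A minimal sketch of such a file, generated and checked with Python's standard library, could look like this. The crawler names come from the list above; the allow/block policy is an assumed example, and the Content-Signal line follows the syntax of the Content Signals proposal as far as it is described here, so it should be checked against the current specification before use.

```python
from urllib.robotparser import RobotFileParser

# Assumed example policy: search and user-triggered fetches allowed,
# training crawlers blocked. Adjust per site.
POLICY = {
    "GPTBot": False,
    "OAI-SearchBot": True,
    "ChatGPT-User": True,
    "ClaudeBot": False,
    "Claude-SearchBot": True,
    "Claude-User": True,
}


def build_robots_txt(policy):
    """Render a robots.txt with a default group and one group per AI crawler."""
    lines = [
        "User-agent: *",
        # Declares the three permission levels for all crawlers.
        "Content-Signal: search=yes, ai-input=yes, ai-train=no",
        "Allow: /",
        "",
    ]
    for agent, allowed in policy.items():
        lines += [f"User-agent: {agent}", "Allow: /" if allowed else "Disallow: /", ""]
    return "\n".join(lines)


ROBOTS_TXT = build_robots_txt(POLICY)


def can_fetch(agent, url="https://example.com/page"):
    """Check the generated rules with the standard-library parser.

    The parser ignores directives it does not know, such as Content-Signal.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


print(ROBOTS_TXT)
for name in POLICY:
    print(name, "allowed" if can_fetch(name) else "blocked")
```

Generating the file from a small policy table keeps the per-crawler decisions in one reviewable place, and the parser check catches typos before deployment.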

The takeaway on robots.txt: it is among the most effective and at the same time cheapest measures. A cleanly configured file with explicit rules for the important AI crawlers is simply part of the mandatory baseline in 2026. Anyone who does nothing here either risks visibility or hands their content over to LLM training without any control.

llms.txt: Lots of Adoption, But What’s It Actually Good For?

Few initiatives have attracted as much attention in the SEO community as llms.txt, the standard proposed by Jeremy Howard in September 2024. The idea is elegant: a curated Markdown file in the website root that tells a large language model, in a few kilobytes, what the site does, which pages are central, and where the content can be found.
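
To make the format concrete, here is a small sketch that renders such a file in the layout the proposal describes: an H1 title, a blockquote summary, and H2 sections with annotated link lists. The site name, URLs, and notes are placeholders, and build_llms_txt is a hypothetical helper.

```python
def build_llms_txt(title, summary, sections):
    """Render an llms.txt body.

    sections maps a heading to a list of (link_text, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for text, url, note in links:
            lines.append(f"- [{text}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
EXAMPLE = build_llms_txt(
    "Example GmbH",
    "Consultancy for technical SEO and structured data.",
    {
        "Services": [
            ("SEO audit", "https://example.com/services/audit.md", "Scope and process"),
        ],
        "Articles": [
            ("GEO stack overview", "https://example.com/blog/geo-stack.md", "Current recommendations"),
        ],
    },
)

print(EXAMPLE)
```

The file would then be served at /llms.txt in the website root.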

Adoption is surprisingly high. According to the SE Ranking study from November 2025, around ten percent of all domains now have an llms.txt. Anthropic, Stripe, Cursor, Cloudflare itself, and many documentation sites publish one. There are WordPress plugins, an official specification, and a lively community discussion about best practices.

Effectiveness, however, is sobering. Three independent studies from the Q4 2025 to Q1 2026 period arrive at the same finding:

SE Ranking analysed 300,000 domains and writes verbatim: “Both statistical analysis and machine learning showed no effect of LLMs.txt on how often a domain is cited by LLMs. Removing this variable from our XGBoost model actually improved its accuracy.”
Limy.ai evaluated 515 million LLM crawler events. Only 408 requests went directly to the /llms.txt file. The share is statistically negligible.
OtterlyAI registered 84 AI crawler visits to llms.txt in a 90-day study, while average content pages on the same domain received about 265 visits. The specialised file thus drew roughly a third of the visits of a regular page.
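
To put the Limy.ai and OtterlyAI figures into proportion, the arithmetic can be spelled out directly. The numbers are the ones quoted above; only the variable names are invented.

```python
# Limy.ai: requests to /llms.txt among all observed LLM crawler events.
limy_events = 515_000_000
limy_llms_txt_requests = 408
share = limy_llms_txt_requests / limy_events

# OtterlyAI: visits to llms.txt versus an average content page (90 days).
otterly_llms_txt_visits = 84
otterly_avg_page_visits = 265
ratio = otterly_llms_txt_visits / otterly_avg_page_visits

print(f"llms.txt share of crawler events: {share:.7%}")        # 0.0000792%
print(f"llms.txt visits relative to an average page: {ratio:.0%}")  # 32%
```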

Add to that clear statements from Google. John Mueller explicitly compared llms.txt to the old keywords meta tag in a Reddit discussion, arguing that self-descriptions add little value to a search system that can already analyse the site directly. Gary Illyes confirmed at Google Search Central Live in Bangkok in summer 2025 that Google does not support llms.txt and has no plans to.

So where does the value lie? It lies in a different layer than classic AI search. Coding agents like Cursor, Claude Code, Continue, or Cline actively read llms.txt. When a user pastes a URL into ChatGPT, Claude, or Perplexity, the file is highly likely to be processed. Limy.ai calls this layer “business-to-agent” and cleanly separates it from the citation layer: llms.txt has no effect on visibility in AI search, but it does affect the quality with which agents and IDE tools can use the website.

The takeaway: llms.txt remains a sensible investment, but with honest communication. The effort for a cleanly curated file is minimal and requires occasional maintenance, say once a quarter. That effort is justified today by the agent layer and by the optionality if a major LLM provider were to support the standard more prominently in the future. It is not justified as a lever for more visibility in AI search.

Markdown for Agents: A Quiet Measure with Measurable Effect

Markdown has established itself as the de facto format for AI engineering. Instructions for agents are formatted as .md files. And when ChatGPT or Claude produce text, it usually comes out as Markdown. Serving content as Markdown when an AI crawler or coding agent asks for it is therefore gaining importance. Technically this works through content negotiation: the client sends an Accept: text/markdown header, the server responds with the Markdown version instead of HTML.
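
The server-side decision can be sketched in a few lines. This is a simplified reading of the Accept header, assuming Markdown is served only when the client lists text/markdown explicitly and does not prefer text/html more strongly; the function names are hypothetical, and a production setup would usually rely on the web framework or CDN for negotiation.

```python
def parse_accept(header):
    """Return {media_type: q} for an Accept header value."""
    prefs = {}
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs[media_type] = q
    return prefs


def choose_representation(accept_header):
    """Pick text/markdown only on explicit request, otherwise text/html."""
    prefs = parse_accept(accept_header or "")
    markdown_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if markdown_q > 0 and markdown_q >= html_q:
        return "text/markdown"
    return "text/html"


# Responses that vary by Accept should also send "Vary: Accept"
# so that caches keep the two representations apart.
print(choose_representation("text/markdown"))
print(choose_representation("text/html,application/xhtml+xml,*/*;q=0.8"))
```

Defaulting to HTML for wildcard requests keeps regular browsers and classic crawlers on the page they expect.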

The big advantage of Markdown: it needs fewer resources. Cloudflare published figures on this as part of Agents Week 2026. For its own developer documentation, serving content as Markdown reduces agent token consumption by up to 80 percent. In practical terms, that means faster answers, cheaper inference, and a higher chance that the content fits completely into the context window. In tests against other large documentation sites, an agent on Cloudflare Docs needed 31 percent fewer tokens and reached the correct answer 66 percent faster.

The catch is in practice. According to the Checkly study from February 2026, only three of seven tested agents currently send the Accept: text/markdown header by default: Claude Code, OpenCode, and Cursor. For the rest, you need a URL-based fallback, typically by appending /index.md or .md to the URL.
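
For agents that do not send the header, the URL-based fallback can be mapped back to the canonical page. The sketch below assumes the two suffix conventions mentioned above; markdown_source_path is a hypothetical helper, and how the Markdown itself is produced is left open.

```python
def markdown_source_path(request_path):
    """Return the canonical page path for a Markdown fallback URL.

    /docs/setup.md        -> /docs/setup
    /docs/setup/index.md  -> /docs/setup/
    Any other path returns None, meaning "serve the normal response".
    """
    if request_path.endswith("/index.md"):
        return request_path[: -len("index.md")]
    if request_path.endswith(".md"):
        return request_path[: -len(".md")]
    return None


for path in ("/docs/setup.md", "/docs/setup/index.md", "/docs/setup"):
    print(path, "->", markdown_source_path(path))
```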

Overall adoption is still modest. Cloudflare’s Radar analysis of the 200,000 most-visited domains shows that only 3.9 percent correctly support Markdown content negotiation. That means anyone implementing the feature today belongs to a small group and can secure measurable efficiency advantages in how agents perceive the site.

For WordPress sites, clean implementation is not trivial. You need either a plugin that generates Markdown endpoints, or a server-side solution via header detection in Nginx, Apache, or at the CDN level. For sites with complex custom post types and many interactive elements, the Markdown representation is often not straightforward to generate.

The takeaway: for sites with a lot of content, documentation sites, help centres, tech blogs, and SaaS providers with their own developer area, Markdown for agents is a very worthwhile investment with demonstrable effect. For lean corporate websites, one-pagers, or marketing landing pages, the effort is likely out of proportion to the benefit.

Cloudflare’s “Agent Readiness”: Is an Industry Standard Emerging Here?

In April 2026, as part of “Agents Week”, Cloudflare released isitagentready.com, a tool that checks a website’s agent suitability along four dimensions. The tool delivers a score similar to Google Lighthouse.

The four dimensions are:

Discoverability: robots.txt, sitemap.xml, Link headers per RFC 8288 for resource-related hints directly in the HTTP header
Content Accessibility: correct Markdown content negotiation, optionally an llms.txt
Bot Access Control: Content Signals in robots.txt, Web Bot Auth for cryptographic bot authentication
Capabilities: Agent Skills Index, API Catalog per RFC 9727, OAuth discovery, MCP Server Card, WebMCP
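
As an example for the Discoverability and Capabilities dimensions, the following sketch assembles a Link header value in the RFC 8288 syntax. The api-catalog relation and its well-known path are defined in RFC 9727; the Markdown alternate link and the paths are assumptions for illustration, and which relations the Cloudflare tool actually scores is not covered here.

```python
def format_link_header(links):
    """Serialise (target, params) pairs into a single Link header value."""
    values = []
    for target, params in links:
        rendered = "".join(f'; {name}="{value}"' for name, value in params.items())
        values.append(f"<{target}>{rendered}")
    return ", ".join(values)


# Placeholder targets for illustration.
LINKS = [
    ("/.well-known/api-catalog", {"rel": "api-catalog"}),
    ("/docs/setup.md", {"rel": "alternate", "type": "text/markdown"}),
]

header_value = format_link_header(LINKS)
print("Link:", header_value)
```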

The adoption figures from Cloudflare’s own analysis of the 200,000 most-visited domains are revealing. 78 percent have a robots.txt, but only 4 percent declare AI preferences via Content Signals. 3.9 percent support Markdown content negotiation. MCP Server Cards and API Catalogs combined are found on fewer than 15 sites worldwide.

Cloudflare positions itself cleverly with this initiative. The company prescribes no standards you must implement; instead it offers a scoring system that makes any gaps visible. The strategy is presumably this: if enough large sites adopt the checklist’s recommendations, it becomes a de facto norm.

Google has so far made no statement on Agent Readiness. OpenAI and Anthropic remain neutral. The checklist is a Cloudflare vision for the web standard, not an IETF consensus.

Two interpretations follow for practitioners:

The offensive reading: early adopters have a competitive advantage if the scoring goes mainstream.
The defensive reading: much of the Cloudflare stack makes sense independently of the specific initiative anyway — a correctly configured robots.txt with Content Signals, or a thoughtful use of Link headers, for instance. Anyone setting up and maintaining those elements sensibly is well positioned regardless of the Cloudflare scoring.

The takeaway: in 2026, Agent Readiness is not yet a standard but a sensible proposal. The individual building blocks deserve different verdicts at this stage. Discoverability and Bot Access Control are mandatory anyway, so you get those points more or less for free. Markdown negotiation pays off depending on the type of site. Capabilities with MCP Server Card and WebMCP are pure bets on a standardisation race that has not been decided, and they are irrelevant for many projects in the first place because the site offers no services that an agent could call.

MCP, WebMCP, and x402: The Next Layer Is Emerging

The Model Context Protocol (MCP) was introduced by Anthropic in November 2024 and is now supported by OpenAI, Google, and Microsoft. It is managed by the Linux Foundation through the “Agentic AI Foundation”, and the PulseMCP directory lists more than 10,000 MCP servers. DigitalApplied reports 97 million MCP downloads by March 2026.

MCP is primarily an interface between agents and tools, not between agents and websites. An MCP implementation pays off for a company with an internal system that AI tools should access in a structured way: a product catalogue, a CRM, a knowledge base, a booking system. For the average marketing website, running your own MCP server is overkill.

WebMCP, a browser API proposed by Google and Microsoft, is currently under development inside a W3C community group. It aims to let websites expose tools directly in the browser for AI agents. A first preview for Chrome was released in February 2026; broader support across other browsers is expected for mid to late 2026. WebMCP’s future adoption remains to be seen.

x402 revives the HTTP 402 “Payment Required” status code for machine-readable payment flows between agents and servers. Cloudflare and Coinbase founded the x402 Foundation around it in September 2025. Competing proposals also exist, such as the Universal Commerce Protocol (UCP) from Google and Shopify and the Agentic Commerce Protocol (ACP) from OpenAI and Stripe. Which of these models will prevail is not decided as of May 2026.
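
The basic idea behind a 402-based flow can be shown without committing to any of these protocols. The sketch below is deliberately generic: the header names, the price format, and the path prefix are hypothetical, it does not validate the token, and it is not the wire format of x402, UCP, or ACP.

```python
from http import HTTPStatus

PAID_PREFIX = "/premium/"                   # hypothetical paid area
TOKEN_HEADER = "X-Example-Payment-Token"    # hypothetical header name
PRICE_HEADER = "X-Example-Price"            # hypothetical header name


def respond(path, headers):
    """Return (status, response_headers) for a request.

    Free paths are served directly. Paid paths are served only when the
    request carries a payment token; otherwise the server answers with
    402 and a machine-readable price the agent can act on.
    """
    if not path.startswith(PAID_PREFIX):
        return HTTPStatus.OK, {}
    if headers.get(TOKEN_HEADER):
        # A real implementation would verify the token here.
        return HTTPStatus.OK, {}
    return HTTPStatus.PAYMENT_REQUIRED, {PRICE_HEADER: "USD 0.01"}


print(respond("/blog/post", {}))
print(respond("/premium/report", {}))
print(respond("/premium/report", {TOKEN_HEADER: "demo-token"}))
```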

The takeaway: for standard websites of SMEs, service providers, and mid-market companies, implementing MCP, WebMCP, or x402 does not yet make sense. These standards compete with each other, adoption is low, and the benefit for classic lead generation or content websites is limited. The picture is different for high-volume e-commerce companies, booking and reservation systems, and SaaS providers with API platforms. For those companies it is advisable to actively follow the development of these standards and to implement early once one of them clearly consolidates.

Anyone investing today should do so aware that possibly three of the protocols listed will be obsolete in eighteen months because a different one has won out. That is not an argument against early adoption, but it is worth calibrating your expectations accordingly.

Content Quality, Brand Presence, and Mentions: Unsexy, But a Central Success Factor

Amid all the technical discussion about standards, protocols, and headers, the most important finding sometimes recedes into the background. In 2026, AI visibility follows above all the logic of mention. Whoever is linked to and contextualised in authoritative sources gets cited. Whoever is not stays invisible.

The data here is consistent. Several studies from Q1 2026 show that LinkedIn posts are now the most-cited source for professional questions across all major AI platforms, with citation frequency doubling between November 2025 and February 2026. 59 percent of LinkedIn citations come from individuals, not from company pages. Personal content on LinkedIn has thus become a serious AI SEO factor.

Whether Reddit, Quora, industry media, trade publications, or conference talks, the logic is similar everywhere. AI models prefer content whose existence and relevance is confirmed by independent third-party sources. Your own content on the company website is necessary but rarely sufficient. Authority emerges in the network.

From a website operator’s perspective, that means half the effort for AI visibility belongs in content, the other half in distributing it. A well-written pillar piece that nobody notices has little effect in 2026. The same content, referenced in two relevant trade podcasts, shared by three industry peers on LinkedIn, and discussed in a Reddit thread, has a very good chance of being cited in ChatGPT and Perplexity.

This insight is neither new nor revolutionary. But it is empirically robust and independent of any coming standard. Anyone investing in quality and mentions is building a value that no protocol update can destroy.

Monitoring Tools: What They Deliver, and Where Their Limits Lie

A logical follow-up question to investing in AI visibility is: how do I measure whether it works? Over the past eighteen months, a dedicated tool category has emerged for this, listed in relevant market overviews as “AI Visibility Monitoring”, “LLM Tracking”, or “GEO Analytics”. The leading providers right now are Profound with an enterprise focus and a starting price of 499 US dollars per month, Peec AI from Berlin in the mid-market segment from around 89 euros, OtterlyAI with an entry tier from 29 US dollars, alongside KIME, ZipTie, Nightwatch, and the AI Visibility Toolkit integrated into Semrush.

What these tools deliver at their core is fairly uniform. They send defined prompts to the major LLM platforms (typically ChatGPT, Perplexity, Claude, Gemini, Google AI Overviews, and Microsoft Copilot), record the generated answers, extract brand mentions and citations, compare your own visibility with defined competitors, and chart trends over time. Additional features include sentiment analysis, source tracking (which sources were used for the answer), prompt discovery (what questions are users actually asking in this category), and in some cases concrete optimisation recommendations.

So much for the vendors’ promises. This measurement methodology, however, has structural limits that are central to interpreting the results.

The first and most serious limit concerns the stability of measurement. AI answers are generative, not stored. The same prompt run a second time often returns a different answer, with a different set of brands, in a different order, with different sentiment. A SparkToro experiment found that for two consecutive identical queries, the probability is under one percent that ChatGPT or Google’s AI returns the same brand list. An AirOps analysis points in the same direction: only around 30 percent of brands remain visible from one answer to the next, only 20 percent across five consecutive runs.

From that follows a second limit. What the tools display as “Share of Voice” or “Visibility Score” is always a sample mean across multiple runs and for a predefined prompt set. Aleyda Solis put this shift precisely in an AirOps webinar: “SEOs must rethink how they measure success — AI overviews change what visibility looks like.” In short: established expectations from classic SEO about stable rankings do not carry over to a generative answer machine.
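
What a sample mean over repeated runs means in practice can be sketched as follows. The brand names and run results are invented, and the helpers are hypothetical; the point is only that a visibility rate from a handful of runs carries a large standard error.

```python
import math


def visibility_rate(runs, brand):
    """Share of runs in which the brand was mentioned."""
    return sum(brand in run for run in runs) / len(runs)


def standard_error(rate, n_runs):
    """Standard error of a proportion estimated from n_runs samples."""
    return math.sqrt(rate * (1 - rate) / n_runs)


def persistence(runs):
    """Share of brands from the first run that appear in every later run."""
    stable = set(runs[0]).intersection(*runs[1:])
    return len(stable) / len(runs[0])


# Invented results of three runs of the same prompt.
RUNS = [
    {"Acme", "Bolt", "Cedar", "Dune", "Elm"},
    {"Acme", "Bolt", "Fjord"},
    {"Acme", "Cedar", "Gale"},
]

rate = visibility_rate(RUNS, "Bolt")
print(f"Bolt visibility: {rate:.0%} +/- {standard_error(rate, len(RUNS)):.0%}")
print(f"Brands persisting across all runs: {persistence(RUNS):.0%}")
```

With only three runs, the uncertainty is almost half as large as the estimate itself, which is why small week-to-week movements in such scores say little.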

A third limit is methodological and is glossed over in most tool comparisons. Most platforms query the LLMs via API rather than via the web interface that end users actually use. Web interfaces often use their own system prompts, different model versions, or additional retrieval layers. API-based measurement can structurally return different answers than what the user sees in the ChatGPT browser window. Tools like Omnia that explicitly rely on browser-session-based tracking try to reduce this bias, but they are exceptions.

A fourth limit concerns the prompt set. Volume figures on how often particular questions are actually asked are in most tools estimates from panel data, not first-party data from the LLM providers. The tools measure what is in their own prompt set, and nothing beyond. Anyone configuring too narrow or too generic a set ends up with a metric that has little to do with the real questions their target audience is asking.

So how should you handle monitoring tools? They are useful as diagnostic and trend instruments, not as ranking metrics with daily precision. They are suitable for establishing a fundamental problem (“we hardly appear in ChatGPT answers in our topic area”), for observing shifts over weeks and months, for tracking competitive benchmarks, and for deriving hypotheses for targeted measures. They are less suited to interpreting small movements between individual measurement points or to deriving operational decisions from them.

Anyone starting with monitoring should begin with a narrow, carefully curated prompt set that reflects real questions from their target audience, rather than trusting a broad auto-generation. Weekly runs are sufficient in most cases; daily measurements produce noise without added value. For SMEs with a modest budget, OtterlyAI is a realistic entry point. For agencies with multiple clients, Peec AI offers the cheaper multi-account model. Profound makes sense for enterprise setups with compliance requirements and deeper competitive analysis, but is rarely worth it for smaller sites.

More important than the tool itself is a clean setup: a prompt set that reflects the audience’s real questions, a defined competitive set, a clear sense of which trends deserve attention and which are sample noise. Anyone keeping that in mind can derive good decisions even from a cheaper tool. Anyone who does not will pay top dollar for the most expensive enterprise subscription and still get fuzzy numbers.

Outlook: What Works Today? What’s a Good Bet? What Will Become Obsolete?

What demonstrably works today:

Classic technical SEO work: crawlability, Core Web Vitals, hreflang, clean HTML, fast delivery
Structured data per Schema.org, especially Organization, Person, Article, Product, with sameAs links
robots.txt with differentiated AI crawler control and Content Signals declarations
Original content with your own data, clear structure, and direct answers
Mentions in authoritative third-party sources, including LinkedIn, industry media, Reddit, Quora, and trade blogs
AI visibility monitoring as a diagnostic and trend instrument, with restrained expectations about the explanatory power of individual measurements

A good bet on a likely future:

llms.txt as a curated file for the agent layer, not for immediate visibility
Markdown content negotiation, especially for sites with a lot of content
Implementing the simpler Cloudflare Agent Readiness building blocks, because they make sense independently anyway
Maintaining brand presence on the platforms from which AI citations actually come today

Premature or not yet decided:

WebMCP for classic marketing sites, as long as browser adoption and standard consolidation remain open
x402, UCP, ACP for e-commerce, as long as the competing models have not been settled
MCP servers for your own website, unless there is a clear use case in the B2B or API domain
Auto-generated llms-full.txt for small to mid-sized sites that do not primarily have a lot of content

Potential candidates for obsolescence:

llms.txt could either become the established standard by 2027 or be replaced by an official Anthropic or OpenAI equivalent.
The Cloudflare Agent Readiness checklist may remain relevant or be displaced by a W3C-backed consensus.
Today’s commerce protocols are explicitly competing proposals. At least one, possibly two of them will no longer be actively developed in 2027.

This uncertainty is no reason for inaction. It is a good occasion for well-calibrated investments. A well-configured robots.txt with Content Signals takes a few hours and is hedged against all plausible future paths. An extensive MCP server implementation for a standard marketing site takes days or weeks and may turn out to be a dumb idea in eighteen months.

What I Recommend for 2026

Taken together, the analyses point above all to a pragmatic course of action:

The first step is always a clean audit of the classic SEO foundation and of robots.txt. This is where the most effective levers lie with the best effort-to-benefit ratio. Anyone with gaps here should close them before thinking about new standards.

The second step is the structured data strategy. Organization, Person, and Article schema belong on every site that wants to build authority in a topic area. sameAs links to Wikipedia, LinkedIn, GitHub, Crunchbase, and other authoritative sources are among the most underestimated tools here.

The third step is a curated llms.txt, with honest communication to the client that it is a bet on the future and on the agent layer, not something with immediate effect on AI visibility. The effort stays modest, so you can do it anyway to feel covered.

The fourth step is keeping an eye on the Cloudflare Agent Readiness score for the site in question. The easy points (Discoverability, Bot Access Control) are quick to implement and worthwhile anyway. The demanding points (MCP, WebMCP) can be deliberately deferred until the standards are more stable.

The fifth step, often the most underestimated, is the strategic investment in mentions and brand presence beyond your own site: a LinkedIn strategy for the leadership team, relationships with industry media, contributions to trade podcasts, visibility at conferences. This work demonstrably pays off in AI citations and survives every change in standards or protocols.

The sixth step is deliberately sized monitoring. A narrow, carefully curated prompt set, a defined competitive set, weekly runs, and sober trend interpretation over weeks instead of days make it possible to assess the impact of the first five steps at all, without the metrics simply unsettling you.

What Can We Derive for the Next 18 Months?

The next twelve to eighteen months are likely to be a phase of consolidation. Several of today’s competing standards will gain or lose market share. The answer to which AI crawler heeds which signal will become clearer. Studies on effectiveness will improve because the measurement methods will improve too.

What probably will not change: the central role of content quality, clarity in positioning, and the number of mentions. Anyone investing substantially in these three dimensions is building a foundation that holds.

What probably will change: the specific configuration of the technical building blocks. Some of the headers, files, and endpoints we maintain today will be named differently or structured differently in 2027. That is normal for a field at this stage of development.

If you want to know more about how these findings translate to a concrete project, the easiest way to reach me is via the contact form below. I’m happy to walk through the topic with you individually, with an eye on your industry context, the resources available, and your existing technical base.

Netzkundig
