Agents & The New Internet (3/5)

Part 3 of a series on the future of the web

The previous post focused on AI as an intermediary. Something that sits between users and content, synthesizing answers from multiple sources. That’s the current state. Perplexity, Google AI Mode, ChatGPT with browsing. The user asks a question, the system reads content, the user receives a packaged answer.

But there’s a next step. AI systems are starting to move from answering questions to pursuing goals. From synthesis to action. That shift changes what it means to have a presence on the web.

From Questions to Goals

The intermediary model of AI on the open web is essentially sophisticated search. The user has a question. The assistant traverses the web, gathers relevant content, synthesizes an answer, and presents it. The user still decides what to do with that information. They still take the action themselves.

The agentic model is different. The user has a goal, not just a question. “Book me a flight to Tokyo next month, cheapest option, aisle seat.” Or they ask for something simpler, because their AI already knows their preferences.

“Find a plumber who can come tomorrow and has good reviews. Book them for 10am.” “Research these three companies and tell me which one to invest in.” The user delegates the goal to the agent. The agent does the traversal, the synthesis, the comparison, and potentially the action itself. The user receives confirmation that the task is complete.

This is a meaningful distinction. In the intermediary model, the user does the final synthesis and takes the action. In the agentic model, the AI does both. The user is removed from the loop for everything except the initial delegation and the final approval.

How Agents Will Traverse the Web

Agents will use the same infrastructure the web already has. Search to discover relevant entities. “Domain authority” and trust signals to evaluate sources. Links to traverse between entities. Content to understand what each entity offers. I find it interesting how much money is flowing into AIO (AI optimization) and GEO (generative engine optimization) startups when agents retrieve information through existing search indexes. ChatGPT uses Bing. Anthropic uses Brave. Google uses Google. The mechanics of the web don’t change. What changes is who’s doing the traversing.

There are two (well, maybe three) modes emerging for how agents interact with websites.

The first is browser operation. The agent literally operates a browser like a human would. It clicks buttons, fills forms, navigates pages. This works with any website, even ones not designed for agents. It’s slower and messier, but it’s universally applicable. Anthropic’s computer use (which looks like it’s going to be replaced by Claude Cowork), OpenAI’s Atlas Browser, and Google’s browser agents all work this way.

The second is programmatic access. The website exposes capabilities in a structured way through protocols like the Model Context Protocol (MCP). The agent discovers what actions are possible and invokes them directly. This is cleaner and more efficient, but it requires the site to have built that interface.
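To make the discover-then-invoke pattern concrete, here is a minimal sketch in Python. It is not the MCP SDK or its wire format. The `Capability` and `SiteInterface` classes, the `book_appointment` action, and the example date are all hypothetical names made up for illustration. The point is only the shape: the agent first asks what the site can do, then calls an action by name with the parameters it requires.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Capability:
    """One action a site exposes to agents (hypothetical shape)."""
    name: str
    description: str
    required_params: List[str]
    handler: Callable[..., Any]


class SiteInterface:
    """Hypothetical stand-in for a site's programmatic surface."""

    def __init__(self) -> None:
        self._capabilities: Dict[str, Capability] = {}

    def register(self, capability: Capability) -> None:
        self._capabilities[capability.name] = capability

    def list_capabilities(self) -> List[Dict[str, Any]]:
        # Discovery step: the agent learns which actions exist and what they need.
        return [
            {
                "name": c.name,
                "description": c.description,
                "required_params": c.required_params,
            }
            for c in self._capabilities.values()
        ]

    def invoke(self, name: str, **params: Any) -> Any:
        # Invocation step: validate the request, then run the action directly.
        if name not in self._capabilities:
            raise KeyError(f"unknown capability: {name}")
        capability = self._capabilities[name]
        missing = [p for p in capability.required_params if p not in params]
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        return capability.handler(**params)


def book_appointment(date: str, time: str) -> Dict[str, str]:
    # Made-up handler; a real site would call its booking system here.
    return {"status": "confirmed", "date": date, "time": time}


site = SiteInterface()
site.register(Capability(
    name="book_appointment",
    description="Book a service appointment.",
    required_params=["date", "time"],
    handler=book_appointment,
))

available = site.list_capabilities()
result = site.invoke("book_appointment", date="2025-01-15", time="10:00")
print(available[0]["name"], result["status"])
```

Compare this with browser operation, where the agent would have to find the booking form, guess which fields matter, and click through it. Here the required parameters are declared up front, so a bad request fails fast instead of halfway through a form.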

Right now, both modes are developing in parallel. Browser operation provides the bridge for existing websites. Programmatic access is where things are heading for sites that want to participate effectively in an agentic web.

There’s a third mode emerging that goes beyond both. The browser itself becomes generative, though this mode is a bit further in the future.

Google’s Disco demonstrates this shift. The flagship feature, GenTabs, takes your open browser tabs as input, consumes the content across those sites, and generates a new interactive application from that data. You have five tabs open about trip planning. Disco creates a custom trip planner that synthesizes all of it into one interface, complete with calendars, maps, and comparison tools. You describe the tool you need in natural language, and it gets built from the underlying web content.

This isn’t an agent operating a browser. It’s the browser regenerating the web itself. It is perhaps the clearest example of how AI can change both synthesis and the consumption model.

Perplexity’s Comet and OpenAI’s Atlas are moving in similar directions. The browser stops being a window into websites and becomes a workshop that transforms web content into purpose-built interfaces. The websites provide the data. The “browser” generates the experience: at the extreme, a fully on-demand app, but more often simply synthesized information presented in a deeply personalized format.

The implications for publishers are significant. Your content becomes raw material for interfaces you didn’t design and don’t control. A user researching your product alongside competitors might never see your carefully designed product page. Instead, they see a generated comparison tool that consumed your content and rendered it according to its own logic.

This is the discovery layer being rebuilt on the fly. And it makes the question of how you present your content to these systems even more important.

The Evolution of Agency

Think of this as a progression. What exists now is essentially Perplexity-style web search with more steps: gather content, generate synthesis, present to user. The user still makes decisions and takes actions. Near-term, users delegate specific tasks with explicit specifications, and agents can take actions like purchases or bookings within bounded authority. Further out, agents operate more autonomously based on standing guidelines, becoming something closer to economic actors in their own right.

The progression is toward more autonomy, but that doesn’t mean humans disappear from the loop. It means the loop gets wider. Instead of approving every action, users set guidelines and review outcomes.
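One way to picture “bounded authority” and a wider loop is as a small policy check that runs before the agent acts. The sketch below is an assumption about how such a check could look, not a description of any shipping agent. The `Guidelines` fields, the three decision labels, and the dollar amounts are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Guidelines:
    """Standing rules a user sets once (hypothetical fields)."""
    allowed_actions: Set[str]
    max_spend_per_action: float
    approval_threshold: float  # at or above this amount, ask the user first


@dataclass
class ProposedAction:
    kind: str
    amount: float
    description: str


@dataclass
class Agent:
    guidelines: Guidelines
    audit_log: List[str] = field(default_factory=list)

    def decide(self, action: ProposedAction) -> str:
        g = self.guidelines
        if action.kind not in g.allowed_actions or action.amount > g.max_spend_per_action:
            decision = "reject"      # outside the agent's authority entirely
        elif action.amount >= g.approval_threshold:
            decision = "ask_user"    # allowed, but large enough to need approval
        else:
            decision = "execute"     # inside the bounds, so act without asking
        # Every decision is logged so the user can review outcomes later.
        self.audit_log.append(f"{decision}: {action.description} ({action.amount:.2f})")
        return decision


agent = Agent(Guidelines(
    allowed_actions={"book", "purchase"},
    max_spend_per_action=500.0,
    approval_threshold=200.0,
))

small = agent.decide(ProposedAction("book", 80.0, "plumber visit"))
large = agent.decide(ProposedAction("purchase", 350.0, "flight to Tokyo"))
too_big = agent.decide(ProposedAction("purchase", 900.0, "business class upgrade"))
print(small, large, too_big)
```

The user approves nothing in the first case, approves one thing in the second, and never sees the third except in the log. That is the loop getting wider rather than disappearing.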

What This Means for the Web

If agents are doing the discovery, comparison, and action, the website needs to serve multiple audiences.

Humans who visit directly still want visual presentation. In fact, they’ll likely expect something more than just content now. AI actually unlocks this. Sites can create more immersive and personalized experiences without needing a developer for every variation. Interactive data visualizations, product configurators, personalized content flows. The bar for what a “visit” should feel like is rising.

When AI handles the informational layer, the experiential layer becomes a differentiator.

AI intermediaries doing synthesis need structured, accessible content. Clear schemas, semantic density, good interlinking. This is the challenge most publishers are grappling with now. In fact, there’s a bit of FUD in this industry. Billions of dollars are flowing into AIO and GEO when much of what gets called AI optimization is simply long-tail keyword search optimization.
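As a rough illustration of what “clear schemas” can mean in practice, here is a short Python sketch that emits schema.org `Product` markup as JSON-LD. The vocabulary (`Product`, `Offer`, `price`, `priceCurrency`, `availability`) comes from schema.org. The product itself, the helper names `product_jsonld` and `as_script_tag`, and the choice to format the price as a two-decimal string are assumptions for the example.

```python
import json


def product_jsonld(name: str, description: str, price: float,
                   currency: str, in_stock: bool) -> str:
    """Build schema.org Product markup as a JSON-LD string."""
    availability = "InStock" if in_stock else "OutOfStock"
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    # JSON-LD is usually embedded in the page inside a script tag of this type.
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'


# Made-up product, purely for illustration.
markup = product_jsonld(
    name="Example Trail Backpack",
    description="A 30-liter backpack for day hikes.",
    price=49.0,
    currency="USD",
    in_stock=True,
)
print(as_script_tag(markup))
```

Nothing here is new or AI-specific, which is rather the point. The same markup that has helped search engines for years is what a synthesizing system can read without guessing.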

Agents taking actions need to understand what actions are possible. They need programmatic access to those actions. They need trust signals to evaluate whether to engage with a particular entity.

The question becomes: is the presentation of content as it exists today what the presentation of content will be in the future? Probably not. The website was optimized for humans reading pages. If the readers are increasingly machines with goals, the presentation needs to evolve.

The Representation Problem

Here’s where things get interesting for publishers and businesses.

When a human visits your website, you control the experience. The design, the layout, the flow, the messaging. You’ve thought carefully about how to present yourself and guide the visitor toward a conversion.

When an agent visits your website, that control diminishes. The agent extracts the information it needs and moves on. It synthesizes your content according to its own logic. It represents you to its user based on what it found, not necessarily how you’d want to be represented.

This is a real shift. The entity that creates the content loses some control over how that content is presented and interpreted. The agent becomes the interface between you and the user. Your website becomes a data source rather than an experience.

For commerce, this might be fine. If the agent accurately represents your product and facilitates the transaction, the lack of a “visit” is irrelevant. The economic value is captured in the sale.

For media and services, it’s more complex. Your brand, your voice, your perspective: the things that differentiate you from competitors get flattened when an agent summarizes your content alongside everyone else’s.

The Next Year

Before full site delegates exist, meaning sites with agents of their own that speak for them, there’s a middle ground that matters right now.

The content an agent has access to can be presented in a way that makes sense for how agents work today. Currently, that means structured markdown, clean semantic markup, content that’s easy to parse and understand. But even within static content, there’s room to be intentional about how information is organized for agent consumption.

Tomorrow, this evolves further. Presentations of content that prioritize what matters most. Rankings that signal which information is authoritative versus supplementary. Representations that progressively disclose detail, giving agents the summary first with clear paths to depth. All of this still static, not conversational, not dynamic, but shaped with agent traversal in mind.

Think of it as the difference between a pile of documents and a well-organized briefing. Both contain the same information. One is far more useful to someone trying to quickly understand what you offer.

This is the practical step publishers can take now. Structure your content so that when an agent arrives, it encounters information that’s organized for its purposes. Lead with what matters. Make relationships between content explicit. Create clear hierarchies. Expose metadata that helps agents understand not just what the content says, but what it’s for and how authoritative it is.
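Here is one way the “briefing, not a pile of documents” idea could look as static output. This is a sketch under my own assumptions, not an established format. The `Section` fields, the priority ordering, the “authoritative” and “supplementary” labels, and the example site are all hypothetical. It renders a short markdown page that leads with what matters, labels how much weight each part deserves, and points to where the depth lives.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    title: str
    summary: str           # the short version an agent reads first
    detail_path: str       # where the full content lives
    priority: int          # lower number = more important
    authoritative: bool    # primary source vs. supplementary material


def render_briefing(site_name: str, sections: List[Section]) -> str:
    """Render a static markdown briefing, most important sections first."""
    lines = [f"# {site_name}", ""]
    for s in sorted(sections, key=lambda s: s.priority):
        label = "authoritative" if s.authoritative else "supplementary"
        lines.append(f"## {s.title} ({label})")
        lines.append(s.summary)
        lines.append(f"Details: {s.detail_path}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Made-up site content, listed deliberately out of order.
sections = [
    Section("Blog", "Occasional posts about home maintenance.", "/blog", 3, False),
    Section("Services", "Emergency and scheduled plumbing repairs.", "/services", 1, True),
    Section("Pricing", "Flat call-out fee plus hourly rate.", "/pricing", 2, True),
]

briefing = render_briefing("Example Plumbing Co.", sections)
print(briefing)
```

An agent that only needs the headline reads the first few lines and leaves. One that needs pricing detail follows a single path instead of crawling the whole site. Same information, organized for the reader that is actually arriving.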

The full delegate vision, where your site has an agent of its own that can converse and negotiate, is coming. But the groundwork is simpler: present your content in a way that serves agent consumption, even before you have a delegate to do the presenting.

The Need for Representation

This points toward something that will be explored more in the next post: if agents are going to represent you to users, you might need your own agent to represent you to them.

Instead of just exposing static content and hoping the visiting agent interprets it well, the site could present a delegate of its own. Something that understands your content, your capabilities, your constraints, and your preferences. Something that can interact with the visiting agent, answer its questions, present information in the most effective way, and even negotiate.

The web evolves from a collection of static documents to a network of interacting agents, each representing the interests of their principal. The visiting agent represents the user. The site agent represents the entity. They communicate, they exchange information, they reach outcomes.

This isn’t science fiction. The protocols are being built. MCP is now under the Linux Foundation with support from Anthropic, OpenAI, Google, Microsoft, and others. Agent2Agent is being developed for agent-to-agent communication. The infrastructure for this kind of web is emerging.

The entities that prepare for this, that make their content accessible to agents, that expose their capabilities programmatically, that think about how they want to be represented in agent-to-agent interactions, those entities will have advantages as this transition unfolds.

The ones that don’t will still exist on the web. But they’ll be data to be scraped rather than participants in the conversation.

Next: The New Interface