Cloudflare Markdown for Agents: Making Your Website Agent-First (80% Token Savings)
How Cloudflare's feature, launched in February 2026, is reshaping the web for AI agents, and how to implement agent-friendly responses on any platform
On February 11, 2026, Cloudflare launched a feature that could fundamentally change how AI agents consume web content. "Markdown for Agents" automatically converts HTML pages to markdown when AI agents request them, cutting token usage by up to 80%. Just nine days later, they released "Code Mode", giving agents access to entire APIs in roughly 1,000 tokens.
This isn't just a technical optimization. It's the first major infrastructure play in the emerging "agentic web": a version of the internet built for both humans and AI agents as first-class citizens.
But here's what makes this particularly interesting for us: while Cloudflare's approach only works on sites that opt in (currently less than 5% of the web), you can implement the same agent-friendly patterns on any platform, including our own thinksmart.life on AWS.
This guide covers everything: how the technology works, the SEO controversy it's sparked, who's already using it, and, most importantly, practical code examples for building agent-first responses on your own site, regardless of whether you use Cloudflare.
What is Markdown for Agents?
The problem is straightforward: feeding raw HTML to an AI is wasteful. As Cloudflare puts it, "it's like paying by the word to read packaging instead of the letter inside."
Consider this simple example:
| Format | Content | Tokens |
|---|---|---|
| Markdown | `## About Us` | 3 |
| HTML | `<h2 class="section-title" id="about">About Us</h2>` | 12-15 |
That's before you account for the `<div>` wrappers, navigation bars, and script tags that pad every real webpage. Cloudflare's own blog post demonstrates the dramatic difference:
Real-World Token Savings
HTML Version: 16,180 tokens
Markdown Version: 3,150 tokens
Reduction: 80% fewer tokens
Markdown has become the lingua franca for AI systems because its explicit structure makes it ideal for processing. Every AI pipeline already converts HTML to markdown anyway; Cloudflare just moved this conversion to the edge, making it faster and more efficient.
The Infrastructure Problem
The web was built for humans, not agents. Page weight has been steadily increasing over the years, making parsing increasingly expensive for AI systems. Every team building RAG systems was writing the same boilerplate: Puppeteer for rendering, BeautifulSoup for stripping, custom regex for cleanup.
Cloudflare's solution eliminates this redundant work by providing clean markdown directly from the source, using standard HTTP content negotiation.
How It Works
The implementation uses standard HTTP content negotiation, a web standard that's been around for decades. When an AI agent sends a request with the Accept: text/markdown header, Cloudflare's edge network:
- Detects the markdown preference
- Fetches the original HTML from the origin server
- Converts it to markdown at the edge
- Returns the markdown version with metadata headers
Example Request
curl https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/ \
-H "Accept: text/markdown"
Example Response
HTTP/2 200
date: Wed, 11 Feb 2026 11:44:48 GMT
content-type: text/markdown; charset=utf-8
content-length: 2899
vary: accept
x-markdown-tokens: 725
content-signal: ai-train=yes, search=yes, ai-input=yes
---
title: Markdown for Agents · Cloudflare Agents docs
---
## What is Markdown for Agents
The ability to parse and convert HTML to Markdown has become
foundational for AI. ...
Key Headers
- `x-markdown-tokens`: Estimated token count for context window planning
- `content-signal`: AI usage permissions (training, search, input)
- `vary: accept`: Ensures caches store separate variants
The beauty is in the simplicity: this is standard HTTP. No custom protocols, no new endpoints, no client-side modifications needed. AI agents that already send Accept: text/markdown headers (like Claude Code and OpenCode) get the optimized version automatically.
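From the client side, the handshake above amounts to one extra request header plus a look at the response metadata. Here's a minimal sketch; the helper name and return shape are illustrative, not part of Cloudflare's API:

```python
def parse_markdown_metadata(headers: dict) -> dict:
    """Interpret the metadata headers on a Markdown for Agents response."""
    h = {k.lower(): v for k, v in headers.items()}
    content_type = h.get("content-type", "")
    return {
        # Did the server actually honor Accept: text/markdown?
        "is_markdown": content_type.split(";")[0].strip() == "text/markdown",
        # Token estimate for context window planning, if provided
        "tokens": int(h["x-markdown-tokens"]) if "x-markdown-tokens" in h else None,
    }
```

An agent can check `tokens` before committing the body to its context window, and fall back to its own HTML pipeline when `is_markdown` is false.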
Already Working
Popular coding agents like Claude Code and OpenCode already send these Accept headers. When they hit sites with Markdown for Agents enabled, they automatically receive the optimized versions.
Code Mode: Entire APIs in 1,000 Tokens
Just nine days after launching Markdown for Agents, Cloudflare announced "Code Mode", arguably an even more consequential development. Released February 20, 2026, Code Mode addresses a fundamental problem with the Model Context Protocol (MCP): the more tools you add, the less room remains for actual work.
The MCP Token Problem
Traditional MCP servers expose each API operation as a separate tool. For large APIs like Cloudflare's (over 2,500 endpoints), this approach is impractical: loading that many tool definitions would consume the context window before any real work begins.
Code Mode Solution
Instead of thousands of tools, Code Mode exposes just two:
- `search()`: Write JavaScript to search the OpenAPI spec
- `execute()`: Write JavaScript to make API calls
Both run in secure Dynamic Worker isolates with no file system access, no environment variables, and external fetches disabled by default.
Example: DDoS Protection Setup
Here's how an agent might configure DDoS protection using Code Mode:
// Step 1: Search for relevant endpoints
search(`async () => {
const results = [];
for (const [path, methods] of Object.entries(spec.paths)) {
if (path.includes('/zones/') &&
(path.includes('firewall/waf') || path.includes('rulesets'))) {
for (const [method, op] of Object.entries(methods)) {
results.push({ method: method.toUpperCase(), path, summary: op.summary });
}
}
}
return results;
}`);
// Step 2: Execute API calls
execute(`async () => {
// Get current DDoS L7 entrypoint ruleset
const ddos = await cloudflare.request({
method: "GET",
path: \`/zones/\${zoneId}/rulesets/phases/ddos_l7/entrypoint\`
});
// Get WAF managed ruleset
const waf = await cloudflare.request({
method: "GET",
path: \`/zones/\${zoneId}/rulesets/phases/http_request_firewall_managed/entrypoint\`
});
return { ddos: ddos.result, waf: waf.result };
}`);
The agent writes code as a compact plan, exploring operations and composing multiple calls in a single execution. This approach combines progressive discovery with safe execution: the best of both worlds.
Token Economics: Why This Matters
Token efficiency isn't just about cost; it's about capability. Modern foundation models have large context windows, but every token used for boilerplate is a token not available for reasoning.
Real-World Impact
| Content Type | HTML Tokens | Markdown Tokens | Reduction |
|---|---|---|---|
| Blog Post | 16,180 | 3,150 | 80% |
| E-commerce Product | 47,000 | 3,200 | 93% |
| Landing Page | 110,000 | 6,400 | 94% |
| Documentation Page | 25,000 | 2,900 | 88% |
Context Window Implications
Consider an AI agent researching a topic across multiple sources:
Scenario: Research Agent
Task: Analyze 10 web pages for a research report
HTML approach: 400,000+ tokens (exceeds most context windows)
Markdown approach: 80,000 tokens (fits comfortably in Claude 3.5 Sonnet)
Result: Agent can analyze 5x more sources in the same context window
Cost Implications
For high-volume applications, the cost savings are substantial:
- Claude 3.5 Sonnet: $3 per million input tokens
- GPT-4: $2.50 per million input tokens
- 80% reduction: Direct 5x cost savings on input tokens
But the real value isn't cost; it's capability. Agents can now process more sources, maintain longer conversations, and work with larger datasets within the same context constraints.
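The cost figures above are easy to sanity-check. The token counts (16,180 HTML vs. 3,150 markdown) and the $3-per-million price come from earlier in this article; the 10,000-page volume is an assumed workload for illustration:

```python
def input_cost(tokens_per_page: int, pages: int, usd_per_million: float) -> float:
    """Input-token cost of fetching `pages` pages of `tokens_per_page` tokens each."""
    return tokens_per_page * pages / 1_000_000 * usd_per_million

# Assumed workload: 10,000 page fetches at $3 per million input tokens
html_cost = input_cost(16_180, 10_000, 3.00)
md_cost = input_cost(3_150, 10_000, 3.00)
print(f"HTML: ${html_cost:.2f}, markdown: ${md_cost:.2f}, "
      f"saved: {1 - md_cost / html_cost:.0%}")
```

At this workload the markdown path comes in around one-fifth of the HTML cost, matching the "direct 5x savings" claim.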
Content Signals: The AI Consent Framework
Cloudflare's Markdown for Agents integrates with their Content Signals framework, a machine-readable way for publishers to express preferences about how their content can be used by AI systems.
The Three Dimensions
Content Signals define three key usage types:
- `ai-train`: Can this content be used for AI training?
- `search`: Can this content appear in AI search results?
- `ai-input`: Can this content be used for RAG, grounding, or agentic use?
Implementation in robots.txt
User-Agent: *
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /
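On the consuming side, an agent that wants to honor these preferences needs only a few lines. A sketch that pulls the Content-Signal line out of a robots.txt body (field names follow the example above; the parsing is illustrative, not a formal grammar):

```python
def parse_content_signal(robots_txt: str) -> dict:
    """Return {signal_name: allowed} parsed from a robots.txt Content-Signal line."""
    for line in robots_txt.splitlines():
        if line.strip().lower().startswith("content-signal:"):
            value = line.split(":", 1)[1]
            return {
                k.strip(): v.strip().lower() == "yes"
                for k, v in (p.split("=", 1) for p in value.split(",") if "=" in p)
            }
    return {}  # No signal line: no stated preference
```

Remember that this is a voluntary framework: the dict tells a well-behaved agent what the publisher prefers, nothing more.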
Markdown for Agents Default
By default, Markdown for Agents responses include permissive headers:
content-signal: ai-train=yes, search=yes, ai-input=yes
This signals that the content is intended for AI consumption. Publishers can customize these policies, though most current implementations use the defaults.
Voluntary Framework
Content Signals are voluntary; they don't represent technical protection measures. AI systems can choose to honor these preferences, but there's no enforcement mechanism.
Industry Adoption
The Content Signals framework is gaining traction beyond Cloudflare:
- Anthropic: Claude respects ai-input=no signals
- OpenAI: Considering Content Signals integration
- Perplexity: Honors search=no preferences
- WordPress: Plugin available for easy implementation
SEO Implications: The Cloaking Controversy
Cloudflare's Markdown for Agents has sparked significant debate in the SEO community. The core concern: does serving different content to AI agents constitute "cloaking", a practice that violates search engine guidelines?
The Criticism
Google's John Mueller has been particularly vocal in his criticism:
"LLMs have trained on β read & parsed β normal web pages since the beginning, it seems a given that they have no problems dealing with HTML. Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?"
Microsoft's Fabrice Canel echoed similar concerns:
"Really want to double crawl load? We'll crawl anyway to check similarity. Non-user versions are often neglected, broken. Humans eyes help fixing people and bot-viewed content."
The Technical Concern
SEO consultant David McSweeney identified a potential vulnerability: the Accept: text/markdown header is forwarded to origin servers, effectively signaling that the request is from an AI agent.
This creates an opportunity for "AI cloaking": serving different content to agents than to humans:
// Potential for abuse
if (request.headers['accept'].includes('text/markdown')) {
  // Serve optimized/different content to AI agents
  return generateAIContent();
} else {
  // Serve normal content to humans
  return generateHumanContent();
}
SEO Risk
Sites that serve fundamentally different content to AI agents vs. humans risk being penalized for cloaking. The safeguard is that search engines can easily detect this by comparing the HTML and markdown versions.
Cloudflare's Position
Cloudflare argues their approach is different:
- Same source content: Markdown is generated from the same HTML that humans see
- Standard HTTP: Content negotiation is a long-established web standard
- Transparency: The conversion happens at the edge, not at origin
- Verifiable: Anyone can compare the HTML and markdown versions
The Broader Debate
This controversy reflects a deeper tension about the evolution of the web:
- Traditional view: One web, same content for all users (human and AI)
- Agentic view: Multi-modal web, optimized experiences for different consumers
The outcome will likely shape how the "agentic web" develops. Will we see:
- Search engines adapting to accept agent-optimized content?
- Stricter enforcement against any form of differential content serving?
- New standards for acceptable AI optimization?
Current Status
As of February 2026, Google hasn't taken official action against sites using Markdown for Agents. However, the criticism from key Google personnel suggests publishers should proceed cautiously and ensure their markdown versions accurately represent their HTML content.
Who's Already Using It
Despite being launched just over a week ago, Markdown for Agents is already seeing significant adoption, both from publishers and AI agents.
Publishers
- Cloudflare itself: Blog and developer documentation
- Early adopters: Tech companies on Cloudflare Pro/Business plans
- WordPress sites: Via plugins and custom implementations
- Developer tools: API documentation sites
AI Agents Already Sending Accept Headers
- Claude Code: Anthropic's coding assistant
- OpenCode: Open-source coding agent
- Goose: Block's agent framework
- OpenClaw: General-purpose AI agent platform
- Custom agents: Built with frameworks supporting content negotiation
Community Response
The developer community has responded enthusiastically, with several competing and complementary solutions emerging:
markdown.new
A service that predates Cloudflare's feature but now integrates it as the primary conversion tier. Simply prepend markdown.new/ to any URL to get clean markdown back.
Klovr
Built to address the opt-in limitation β converts any webpage to markdown on-demand, with 100% compatibility with Cloudflare's Accept headers but works on all sites.
MAKO Protocol
An open protocol that goes beyond simple conversion, providing structured content with YAML frontmatter, semantic metadata, and embeddings for relevance filtering.
accept.md
A Next.js tool that makes sites LLM-scraping friendly with one command, generating agent-optimized versions locally.
Adoption Reality Check
While the technology is impressive, real-world adoption is still limited. One developer tested 100 popular websites with the Accept: text/markdown header; only 3 actually served markdown. The rest still returned HTML, highlighting the opt-in limitation.
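That experiment is easy to reproduce. A stdlib-only sketch of the per-site check (the original tester's site list wasn't published, so bring your own URLs):

```python
from urllib.request import Request, urlopen

def is_markdown_content_type(content_type: str) -> bool:
    """True when the media type (ignoring parameters) is text/markdown."""
    return content_type.split(";")[0].strip().lower() == "text/markdown"

def serves_markdown(url: str, timeout: float = 10.0) -> bool:
    """Ask a site for markdown and report whether it actually complied."""
    req = Request(url, headers={"Accept": "text/markdown"})
    with urlopen(req, timeout=timeout) as resp:
        return is_markdown_content_type(resp.headers.get("Content-Type", ""))
```

Run `serves_markdown` over your own list of 100 domains and you can compute an up-to-date adoption figure in a couple of minutes.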
Building an Agent-First Website
You don't need Cloudflare to build agent-friendly responses. The concept is simple: detect when an AI agent is requesting your content and serve an optimized version. Here's how to implement this on any platform.
Core Principles
- Content negotiation: Use standard HTTP Accept headers
- Same source content: Generate markdown from your existing HTML/content
- Metadata headers: Include token counts and usage signals
- Caching: Cache converted versions for performance
- Fallback: Always serve HTML if markdown conversion fails
Detection Strategy
Detect AI agents through multiple signals:
def is_ai_agent_request(request):
    accept_header = request.headers.get('accept', '').lower()
    user_agent = request.headers.get('user-agent', '').lower()

    # Primary: Accept header includes markdown
    if 'text/markdown' in accept_header:
        return True

    # Secondary: known AI agent user agents
    ai_agents = [
        'claude', 'openai', 'anthropic', 'gptbot',
        'openclaw', 'goose', 'langchain', 'llamaindex'
    ]
    return any(agent in user_agent for agent in ai_agents)
Conversion Pipeline
Build a robust HTML-to-markdown conversion pipeline:
import html2text
import re
from bs4 import BeautifulSoup

def html_to_markdown(html_content, base_url=None):
    """Convert HTML to clean, AI-friendly markdown"""
    # Parse HTML
    soup = BeautifulSoup(html_content, 'html.parser')

    # Remove non-content elements
    for element in soup(['script', 'style', 'nav', 'footer', 'aside']):
        element.decompose()

    # Remove empty elements (skip any already destroyed with a parent)
    for element in soup.find_all():
        if element.decomposed:
            continue
        if not element.get_text(strip=True) and not element.find(['img', 'video', 'audio']):
            element.decompose()

    # Configure html2text
    h = html2text.HTML2Text()
    h.ignore_links = False
    h.ignore_images = False
    h.body_width = 0  # No line wrapping
    h.single_line_break = True
    if base_url:
        h.baseurl = base_url

    # Convert to markdown
    markdown = h.handle(str(soup))

    # Clean up excessive whitespace
    markdown = re.sub(r'\n\s*\n\s*\n', '\n\n', markdown)
    return markdown.strip()
Token Counting
Implement token counting for the x-markdown-tokens header:
import tiktoken

def count_tokens(text, model='gpt-4'):
    """Estimate token count for text"""
    try:
        encoding = tiktoken.encoding_for_model(model)
        return len(encoding.encode(text))
    except KeyError:
        # Fallback: rough estimation (1 token is roughly 4 characters)
        return len(text) // 4
FastAPI Implementation for thinksmart.life
Here's a complete FastAPI middleware implementation that adds Cloudflare-style Markdown for Agents support to any FastAPI application, a good fit for our thinksmart.life setup:
Complete Middleware
from fastapi import FastAPI, Request
import html2text
import tiktoken
import re
from bs4 import BeautifulSoup
from datetime import datetime, timedelta
import hashlib

class MarkdownForAgentsMiddleware:
    def __init__(self, app: FastAPI, cache_ttl: int = 3600):
        self.app = app
        self.cache = {}  # In production, use Redis
        self.cache_ttl = cache_ttl

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        request = Request(scope, receive)

        # Check if this is an AI agent requesting markdown
        if not self.should_serve_markdown(request):
            await self.app(scope, receive, send)
            return

        # Buffer the original response
        response_body = b""
        response_status = 200
        response_headers = []

        async def capture_response(message):
            nonlocal response_body, response_status, response_headers
            if message["type"] == "http.response.start":
                response_status = message["status"]
                response_headers = message["headers"]
            elif message["type"] == "http.response.body":
                response_body += message.get("body", b"")

        await self.app(scope, receive, capture_response)

        # Convert HTML responses to markdown
        if response_status == 200 and self.is_html_content(response_headers):
            try:
                markdown_response = await self.convert_to_markdown(
                    response_body.decode('utf-8'),
                    str(request.url)
                )
                await self.send_markdown_response(send, markdown_response)
                return
            except Exception as e:
                # Fall back to the original response on error
                print(f"Markdown conversion failed: {e}")

        # Send original response
        await send({
            "type": "http.response.start",
            "status": response_status,
            "headers": response_headers,
        })
        await send({
            "type": "http.response.body",
            "body": response_body,
        })

    def should_serve_markdown(self, request: Request) -> bool:
        """Detect if request is from an AI agent wanting markdown"""
        accept = request.headers.get('accept', '').lower()
        user_agent = request.headers.get('user-agent', '').lower()

        # Primary detection: Accept header
        if 'text/markdown' in accept:
            return True

        # Secondary: known AI agents
        ai_indicators = [
            'claude', 'anthropic', 'openai', 'gptbot', 'chatgpt',
            'openclaw', 'langchain', 'llamaindex', 'agent',
            'crawler', 'scraper'
        ]
        return any(indicator in user_agent for indicator in ai_indicators)

    def is_html_content(self, headers) -> bool:
        """Check if the buffered response is HTML"""
        for name, value in headers:
            if name.lower() == b'content-type':
                return b'text/html' in value.lower()
        return False

    async def convert_to_markdown(self, html: str, url: str) -> dict:
        """Convert HTML to markdown with metadata"""
        # Check cache first
        cache_key = hashlib.md5(html.encode()).hexdigest()
        cached = self.cache.get(cache_key)
        if cached and datetime.now() < cached['expires']:
            return cached['data']

        soup = BeautifulSoup(html, 'html.parser')

        # Extract title and meta description
        title_elem = soup.find('title')
        title = title_elem.get_text().strip() if title_elem else ''
        desc_elem = soup.find('meta', attrs={'name': 'description'})
        description = desc_elem.get('content', '').strip() if desc_elem else ''

        # Remove non-content elements
        for selector in ['script', 'style', 'nav', 'footer', 'header',
                         'aside', '.ad', '#comments', '.sidebar']:
            for elem in soup.select(selector):
                elem.decompose()

        # Configure the markdown converter
        h = html2text.HTML2Text()
        h.ignore_links = False
        h.ignore_images = False
        h.body_width = 0
        h.single_line_break = True
        h.baseurl = url

        # Convert the main content region
        main_content = soup.find('main') or soup.find('article') or soup.body or soup
        markdown_body = h.handle(str(main_content))

        # Clean up markdown
        markdown_body = re.sub(r'\n\s*\n\s*\n+', '\n\n', markdown_body).strip()

        # Create frontmatter
        frontmatter = []
        if title:
            frontmatter.append(f'title: {title}')
        if description:
            frontmatter.append(f'description: {description}')
        frontmatter.append(f'url: {url}')
        frontmatter.append(f'converted_at: {datetime.now().isoformat()}')

        full_markdown = "---\n" + "\n".join(frontmatter) + "\n---\n\n" + markdown_body

        result = {
            'content': full_markdown,
            'token_count': self.count_tokens(full_markdown),
            'title': title,
            'description': description
        }

        # Cache the result
        self.cache[cache_key] = {
            'data': result,
            'expires': datetime.now() + timedelta(seconds=self.cache_ttl)
        }
        return result

    def count_tokens(self, text: str) -> int:
        """Estimate token count"""
        try:
            # Use tiktoken for accurate counting
            encoding = tiktoken.encoding_for_model('gpt-4')
            return len(encoding.encode(text))
        except KeyError:
            # Fallback: rough word-based approximation
            return int(len(text.split()) * 1.3)

    async def send_markdown_response(self, send, markdown_data: dict):
        """Send the markdown response with proper headers"""
        content = markdown_data['content'].encode('utf-8')
        headers = [
            (b'content-type', b'text/markdown; charset=utf-8'),
            (b'content-length', str(len(content)).encode()),
            (b'vary', b'accept'),
            (b'x-markdown-tokens', str(markdown_data['token_count']).encode()),
            (b'content-signal', b'ai-train=yes, search=yes, ai-input=yes'),
            (b'cache-control', b'public, max-age=3600'),
        ]
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": headers,
        })
        await send({
            "type": "http.response.body",
            "body": content,
        })
# Usage in your FastAPI app
from fastapi.responses import HTMLResponse

app = FastAPI()

# Your existing routes work unchanged
@app.get("/")
async def home():
    return {"message": "Hello World"}

# response_class=HTMLResponse ensures the middleware sees text/html
@app.get("/about", response_class=HTMLResponse)
async def about():
    return """
    <!DOCTYPE html>
    <html>
    <head>
        <title>About Us</title>
        <meta name="description" content="Learn about our company">
    </head>
    <body>
        <main>
            <h1>About Us</h1>
            <p>We build amazing things for AI agents.</p>
        </main>
    </body>
    </html>
    """

# Wrap the app with the ASGI middleware; serve `app` with uvicorn as usual
app = MarkdownForAgentsMiddleware(app)
Testing the Implementation
Test your implementation with curl:
# Request HTML (default)
curl http://localhost:8000/about
# Request Markdown
curl -H "Accept: text/markdown" http://localhost:8000/about
# Check headers
curl -I -H "Accept: text/markdown" http://localhost:8000/about
Production Considerations
Production Checklist
- Redis cache: Replace in-memory cache with Redis for horizontal scaling
- Rate limiting: Prevent abuse of conversion endpoint
- Monitoring: Track conversion success rates and performance
- Error handling: Graceful fallback to HTML on conversion failures
- Content filtering: Ensure consistent content between HTML and markdown
Advanced Features
Extend the basic implementation with:
- Content-aware conversion: Different strategies for articles vs. product pages
- Selective conversion: Only convert specific routes
- A/B testing: Compare agent behavior with HTML vs. markdown
- Analytics: Track which agents request markdown content
- Custom signals: Per-page Content-Signal headers
The Agentic Web: Vision of a Dual-Purpose Internet
Cloudflare's Markdown for Agents represents more than a technical optimization; it's the first major step toward an "agentic web" where content is natively designed for both human and AI consumption.
Current Web vs. Agentic Web
| Aspect | Current Web | Agentic Web |
|---|---|---|
| Primary audience | Humans only | Humans + AI agents |
| Content format | HTML with visual styling | Structured, semantic content |
| Discovery | Search engines | Search engines + agent crawling |
| Interaction | Click, type, browse | API calls, structured queries |
| Optimization | SEO for visibility | AIO (Agent Intelligence Optimization) |
Emerging Patterns
We're already seeing early patterns of agentic web design:
1. Progressive Enhancement for Agents
- Base HTML for humans
- Markdown versions for content consumption
- API endpoints for structured data access
- MCP servers for interactive capabilities
2. Semantic Markup
- Rich Schema.org annotations
- JSON-LD for structured data
- Microformats for semantic content
- Custom metadata for AI instructions
3. Agent-First Information Architecture
- Clear content hierarchy
- Explicit relationships between entities
- Action-oriented interfaces
- Context-aware responses
The Standards War
Multiple standards and protocols are emerging:
- Cloudflare Markdown for Agents: Content negotiation approach
- MAKO Protocol: Structured semantic markdown
- llms.txt: Site-level AI instructions
- Content Signals: Usage permission framework
- WebMCP: Model Context Protocol for web
- AI-First HTML: Semantic HTML optimized for agents
The winner likely won't be a single standard, but rather an ecosystem of complementary approaches that serve different use cases.
Implications for Publishers
Publishers need to start thinking about dual-purpose content:
Strategic Questions
- How do you want AI agents to represent your content?
- What actions should agents be able to perform on your site?
- How do you balance discoverability with control?
- What's your strategy for AI attribution and traffic?
The Next Phase
We're still in the early days. The next phase will likely bring:
- Agent analytics: Understanding how AI agents use your content
- Monetization models: How to generate value from agent traffic
- Content personalization: Serving different content to different types of agents
- Agent authentication: Verified agent identities and capabilities
- Bidirectional communication: Agents that can create and modify content
The companies and publishers who start optimizing for this agentic web now will have a significant advantage as AI agents become more prevalent in daily workflows.
Competitors & Standards
While Cloudflare was first to market with infrastructure-level markdown conversion, the space is rapidly evolving with multiple approaches and competitors.
Direct Competitors
Vercel and Netlify
Both are reportedly working on similar features:
- Vercel: Edge-side content transformation for AI agents
- Netlify: Build-time markdown generation for known routes
- AWS CloudFront: Lambda@Edge functions for content negotiation
markdown.new and Klovr
Third-party services that provide universal markdown conversion:
- markdown.new: Prepend to any URL, works everywhere
- Klovr: API-based conversion with Redis caching
- Firecrawl: Y Combinator-backed, comprehensive web scraping for AI
Emerging Standards
The llms.txt Convention
A simple text file that provides site-level instructions to AI systems:
# /llms.txt
# Instructions for AI systems
## About this site
This is a technology blog focused on AI and web development.
## Preferred citation
When referencing this content, please cite as: "ThinkSmart.Life Research (2026)"
## Usage permissions
- Training: Allowed for open-source models only
- Search: Allowed
- Summarization: Allowed with attribution
MAKO Protocol
A more sophisticated approach that combines markdown with semantic metadata:
---
type: article
entity: product_review
actions:
- compare_products
- get_pricing
semantic_links:
- rel: alternative, href: /microphone-alternatives/
- rel: pricing, href: /pricing-api/microphones
embeddings:
model: text-embedding-3-small
vector: [0.123, 0.456, ...]
---
# Best Microphones for 2026
Content optimized specifically for AI consumption...
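Any consumer of MAKO-style (or plain frontmatter-carrying) documents first has to split the metadata block from the body. A minimal stdlib sketch, leaving the YAML itself to a real parser:

```python
def split_frontmatter(doc: str) -> tuple:
    """Split a leading ----delimited frontmatter block from the markdown body."""
    if doc.startswith("---\n"):
        end = doc.find("\n---", 4)
        if end != -1:
            return doc[4:end], doc[end + 4:].lstrip("\n")
    return "", doc  # No frontmatter present
```

Feed the first element to a YAML library and the second to your markdown pipeline; documents without frontmatter pass through untouched.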
Browser and Client Integration
Client-side approaches are also emerging:
- Browser extensions: Automatic markdown conversion for any site
- Agent frameworks: Built-in content negotiation in LangChain, LlamaIndex
- Mobile apps: AI-first browsers that prefer structured content
The HTTP Standard Evolution
There's discussion about formalizing AI content negotiation in HTTP standards:
- Accept-AI header: Explicit AI agent identification
- Content-AI-Optimized: Server capability advertisement
- AI-Context headers: Task-specific content optimization
Enterprise Solutions
Enterprise-focused platforms are building comprehensive solutions:
- DataDog: Agent traffic analytics and optimization
- Pinecone: Vector database integration for semantic content
- MongoDB: Native document transformation for AI workloads
- Supabase: Real-time API generation from content
Getting Started
Ready to make your website agent-first? Here are the practical steps to get started, whether you use Cloudflare or not.
Option 1: Enable Cloudflare Markdown for Agents
If you're already on Cloudflare (Pro, Business, or Enterprise):
- Log into the Cloudflare dashboard
- Select your zone
- Go to AI Crawl Control section
- Toggle Markdown for Agents to enable
Or via API:
curl -X PATCH 'https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/content_converter' \
--header 'Authorization: Bearer {api_token}' \
--header 'Content-Type: application/json' \
--data '{"value": "on"}'
Option 2: Implement Your Own (Recommended)
For maximum control and compatibility with any hosting platform:
Quick Start with Express.js
const express = require('express');
const html2md = require('html-to-md');
const app = express();

// Middleware to detect and serve markdown
app.use((req, res, next) => {
  const wantsMarkdown = req.headers.accept?.includes('text/markdown');

  if (wantsMarkdown) {
    // Intercept response
    const originalSend = res.send;
    res.send = function(body) {
      // Match '<html' so documents with attributes (e.g. <html lang="en">) convert too
      if (typeof body === 'string' && body.includes('<html')) {
        const markdown = html2md(body);
        const tokenCount = Math.ceil(markdown.length / 4);  // rough estimate

        res.set({
          'Content-Type': 'text/markdown; charset=utf-8',
          'X-Markdown-Tokens': tokenCount,
          'Content-Signal': 'ai-train=yes, search=yes, ai-input=yes',
          'Vary': 'Accept'
        });
        return originalSend.call(this, markdown);
      }
      return originalSend.call(this, body);
    };
  }
  next();
});

app.listen(3000);
WordPress Plugin Approach
<?php
// Add to your theme's functions.php.
// Requires league/html-to-markdown installed via Composer.

function convert_to_markdown_on_output($content) {
    if (strpos($_SERVER['HTTP_ACCEPT'] ?? '', 'text/markdown') !== false) {
        // Convert HTML content to markdown
        require_once __DIR__ . '/vendor/autoload.php';
        $converter = new League\HTMLToMarkdown\HtmlConverter();
        $markdown = $converter->convert($content);

        // Set appropriate headers (word count is a rough token estimate)
        header('Content-Type: text/markdown; charset=utf-8');
        header('X-Markdown-Tokens: ' . str_word_count($markdown));
        header('Content-Signal: ai-train=yes, search=yes, ai-input=yes');
        header('Vary: Accept');

        return $markdown;
    }
    return $content;
}
add_filter('the_content', 'convert_to_markdown_on_output', 999);
?>
Option 3: Third-Party Services
Use existing services for quick implementation:
- markdown.new: Proxy approach, prepend to URLs
- Klovr: API-based conversion service
- Firecrawl: Comprehensive web scraping API
Testing Your Implementation
Verify your setup works correctly:
# Test with curl
curl -H "Accept: text/markdown" https://yoursite.com/
# Check for proper headers
curl -I -H "Accept: text/markdown" https://yoursite.com/
# Test with different agents
curl -H "User-Agent: Claude-Agent" https://yoursite.com/
Monitoring and Analytics
Track AI agent usage:
- Log analysis: Count requests with markdown Accept headers
- Conversion metrics: Track HTML-to-markdown success rates
- Performance monitoring: Measure conversion latency
- Agent identification: Catalog which agents visit your site
Success Metrics
- Agent adoption: Increasing requests with markdown Accept headers
- Token efficiency: 60-80% reduction in token usage
- Conversion quality: Faithful representation of HTML content
- Performance: Sub-100ms conversion latency
References
- Cloudflare - Introducing Markdown for Agents (February 12, 2026)
- Cloudflare - Code Mode: give agents an entire API in 1,000 tokens (February 20, 2026)
- Cloudflare Developer Documentation - Markdown for Agents
- Search Engine Land - Cloudflare's Markdown for Agents AI feature has SEOs on alert
- The Register - Cloudflare turns websites into faster food for AI agents
- Content Signals Framework
- Hacker News - Community Discussions
- markdown.new - Universal Markdown Conversion
- Klovr - Universal Webpage to Markdown Conversion
- MAKO Protocol - LLM-Optimized Web Content
- accept.md - Next.js LLM-Scraping Tool
- Cloudflare MCP Server Repository
- Cloudflare Agents SDK
- Anthropic - Code Execution with MCP
- Cloudflare Learning - What is Model Context Protocol