Cloudflare Markdown for Agents: Making Your Website Agent-First (80% Token Savings)
How Cloudflare's feature, launched in February 2026, is reshaping the web for AI agents, and how to implement agent-friendly responses on any platform
On February 11, 2026, Cloudflare launched a feature that could fundamentally change how AI agents consume web content. "Markdown for Agents" automatically converts HTML pages to markdown when AI agents request them, cutting token usage by up to 80%. Just nine days later, they released "Code Mode", giving agents access to entire APIs in roughly 1,000 tokens.
This isn't just a technical optimization. It's the first major infrastructure play in the emerging "agentic web": a version of the internet built for both humans and AI agents as first-class citizens.
But here's what makes this particularly interesting for us: while Cloudflare's approach only works on sites that opt in (currently less than 5% of the web), you can implement the same agent-friendly patterns on any platform, including our own thinksmart.life on AWS.
This guide covers everything: how the technology works, the SEO controversy it's sparked, who's already using it, and, most importantly, practical code examples for building agent-first responses on your own site, regardless of whether you use Cloudflare.
What is Markdown for Agents?
The problem is straightforward: feeding raw HTML to an AI is wasteful. As Cloudflare puts it, "it's like paying by the word to read packaging instead of the letter inside."
Consider this simple example:
| Format | Content | Tokens |
|---|---|---|
| Markdown | `## About Us` | 3 |
| HTML | `<h2 class="section-title" id="about">About Us</h2>` | 12-15 |
That's before you account for the `<div>` wrappers, navigation bars, and script tags that pad every real webpage. Cloudflare's own blog post demonstrates the dramatic difference:
Real-World Token Savings
HTML Version: 16,180 tokens
Markdown Version: 3,150 tokens
Reduction: 80% fewer tokens
Markdown has become the lingua franca for AI systems because its explicit structure makes it ideal for processing. Every AI pipeline already converts HTML to markdown anyway; Cloudflare just moved this conversion to the edge, making it faster and more efficient.
The Infrastructure Problem
The web was built for humans, not agents. Page weight has been steadily increasing over the years, making parsing increasingly expensive for AI systems. Every team building RAG systems was writing the same boilerplate: Puppeteer for rendering, BeautifulSoup for stripping, custom regex for cleanup.
Cloudflare's solution eliminates this redundant work by providing clean markdown directly from the source, using standard HTTP content negotiation.
How It Works
The implementation uses standard HTTP content negotiation, a web standard that's been around for decades. When an AI agent sends a request with the Accept: text/markdown header, Cloudflare's edge network:
- Detects the markdown preference
- Fetches the original HTML from the origin server
- Converts it to markdown at the edge
- Returns the markdown version with metadata headers
Example Request
curl https://developers.cloudflare.com/fundamentals/reference/markdown-for-agents/ \
-H "Accept: text/markdown"
Example Response
HTTP/2 200
date: Wed, 11 Feb 2026 11:44:48 GMT
content-type: text/markdown; charset=utf-8
content-length: 2899
vary: accept
x-markdown-tokens: 725
content-signal: ai-train=yes, search=yes, ai-input=yes
---
title: Markdown for Agents · Cloudflare Agents docs
---
## What is Markdown for Agents
The ability to parse and convert HTML to Markdown has become
foundational for AI. ...
Key Headers
- `x-markdown-tokens`: Estimated token count for context window planning
- `content-signal`: AI usage permissions (training, search, input)
- `vary: accept`: Ensures caches store separate variants
The beauty is in the simplicity: this is standard HTTP. No custom protocols, no new endpoints, no client-side modifications needed. AI agents that already send Accept: text/markdown headers (like Claude Code and OpenCode) get the optimized version automatically.
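From the client side, the handshake above amounts to one extra request header plus a look at the response metadata. Here's a minimal sketch; the helper name and return shape are illustrative, not part of Cloudflare's API:

```python
def parse_markdown_metadata(headers: dict) -> dict:
    """Interpret the metadata headers on a Markdown for Agents response."""
    h = {k.lower(): v for k, v in headers.items()}
    content_type = h.get("content-type", "")
    return {
        # Did the server actually honor Accept: text/markdown?
        "is_markdown": content_type.split(";")[0].strip() == "text/markdown",
        # Token estimate for context window planning, if provided
        "tokens": int(h["x-markdown-tokens"]) if "x-markdown-tokens" in h else None,
    }
```

An agent can check `tokens` before committing the body to its context window, and fall back to its own HTML pipeline when `is_markdown` is false.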
Already Working
Popular coding agents like Claude Code and OpenCode already send these Accept headers. When they hit sites with Markdown for Agents enabled, they automatically receive the optimized versions.
Code Mode: Entire APIs in 1,000 Tokens
Just nine days after launching Markdown for Agents, Cloudflare announced "Code Mode", arguably an even more consequential development. Released February 20, 2026, Code Mode addresses a fundamental problem with the Model Context Protocol (MCP): the more tools you add, the less room remains for actual work.
The MCP Token Problem
Traditional MCP servers expose each API operation as a separate tool. For large APIs like Cloudflare's (over 2,500 endpoints), this approach is impractical: loading that many tool definitions would consume the context window before any real work begins.
Code Mode Solution
Instead of thousands of tools, Code Mode exposes just two:
- `search()`: Write JavaScript to search the OpenAPI spec
- `execute()`: Write JavaScript to make API calls
Both run in secure Dynamic Worker isolates with no file system access, no environment variables, and external fetches disabled by default.
Example: DDoS Protection Setup
Here's how an agent might configure DDoS protection using Code Mode:
// Step 1: Search for relevant endpoints
search(`async () => {
const results = [];
for (const [path, methods] of Object.entries(spec.paths)) {
if (path.includes('/zones/') &&
(path.includes('firewall/waf') || path.includes('rulesets'))) {
for (const [method, op] of Object.entries(methods)) {
results.push({ method: method.toUpperCase(), path, summary: op.summary });
}
}
}
return results;
}`);
// Step 2: Execute API calls
execute(`async () => {
// Get current DDoS L7 entrypoint ruleset
const ddos = await cloudflare.request({
method: "GET",
path: \`/zones/\${zoneId}/rulesets/phases/ddos_l7/entrypoint\`
});
// Get WAF managed ruleset
const waf = await cloudflare.request({
method: "GET",
path: \`/zones/\${zoneId}/rulesets/phases/http_request_firewall_managed/entrypoint\`
});
return { ddos: ddos.result, waf: waf.result };
}`);
The agent writes code as a compact plan, exploring operations and composing multiple calls in a single execution. This approach combines progressive discovery with safe execution: the best of both worlds.
Token Economics: Why This Matters
Token efficiency isn't just about cost; it's about capability. Modern foundation models have large context windows, but every token used for boilerplate is a token not available for reasoning.
Real-World Impact
| Content Type | HTML Tokens | Markdown Tokens | Reduction |
|---|---|---|---|
| Blog Post | 16,180 | 3,150 | 80% |
| E-commerce Product | 47,000 | 3,200 | 93% |
| Landing Page | 110,000 | 6,400 | 94% |
| Documentation Page | 25,000 | 2,900 | 88% |
Context Window Implications
Consider an AI agent researching a topic across multiple sources:
Scenario: Research Agent
Task: Analyze 10 web pages for a research report
HTML approach: 400,000+ tokens (exceeds most context windows)
Markdown approach: 80,000 tokens (fits comfortably in Claude 3.5 Sonnet)
Result: Agent can analyze 5x more sources in the same context window
Cost Implications
For high-volume applications, the cost savings are substantial:
- Claude 3.5 Sonnet: $3 per million input tokens
- GPT-4: $2.50 per million input tokens
- 80% reduction: Direct 5x cost savings on input tokens
But the real value isn't cost; it's capability. Agents can now process more sources, maintain longer conversations, and work with larger datasets within the same context constraints.
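The cost figures above are easy to sanity-check. The token counts (16,180 HTML vs. 3,150 markdown) and the $3-per-million price come from earlier in this article; the 10,000-page volume is an assumed workload for illustration:

```python
def input_cost(tokens_per_page: int, pages: int, usd_per_million: float) -> float:
    """Input-token cost of fetching `pages` pages of `tokens_per_page` tokens each."""
    return tokens_per_page * pages / 1_000_000 * usd_per_million

# Assumed workload: 10,000 page fetches at $3 per million input tokens
html_cost = input_cost(16_180, 10_000, 3.00)
md_cost = input_cost(3_150, 10_000, 3.00)
print(f"HTML: ${html_cost:.2f}, markdown: ${md_cost:.2f}, "
      f"saved: {1 - md_cost / html_cost:.0%}")
```

At this workload the markdown path comes in around one-fifth of the HTML cost, matching the "direct 5x savings" claim.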
Content Signals: The AI Consent Framework
Cloudflare's Markdown for Agents integrates with their Content Signals framework, a machine-readable way for publishers to express preferences about how their content can be used by AI systems.
The Three Dimensions
Content Signals define three key usage types:
- `ai-train`: Can this content be used for AI training?
- `search`: Can this content appear in AI search results?
- `ai-input`: Can this content be used for RAG, grounding, or agentic use?
Implementation in robots.txt
User-Agent: *
Content-Signal: ai-train=no, search=yes, ai-input=yes
Allow: /
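On the consuming side, an agent that wants to honor these preferences needs only a few lines. A sketch that pulls the Content-Signal line out of a robots.txt body (field names follow the example above; the parsing is illustrative, not a formal grammar):

```python
def parse_content_signal(robots_txt: str) -> dict:
    """Return {signal_name: allowed} parsed from a robots.txt Content-Signal line."""
    for line in robots_txt.splitlines():
        if line.strip().lower().startswith("content-signal:"):
            value = line.split(":", 1)[1]
            return {
                k.strip(): v.strip().lower() == "yes"
                for k, v in (p.split("=", 1) for p in value.split(",") if "=" in p)
            }
    return {}  # No signal line: no stated preference
```

Remember that this is a voluntary framework: the dict tells a well-behaved agent what the publisher prefers, nothing more.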
Markdown for Agents Default
By default, Markdown for Agents responses include permissive headers:
content-signal: ai-train=yes, search=yes, ai-input=yes
This signals that the content is intended for AI consumption. Publishers can customize these policies, though most current implementations use the defaults.
Voluntary Framework
Content Signals are voluntary; they don't represent technical protection measures. AI systems can choose to honor these preferences, but there's no enforcement mechanism.
Industry Adoption
The Content Signals framework is gaining traction beyond Cloudflare:
- Anthropic: Claude respects ai-input=no signals
- OpenAI: Considering Content Signals integration
- Perplexity: Honors search=no preferences
- WordPress: Plugin available for easy implementation
SEO Implications: The Cloaking Controversy
Cloudflare's Markdown for Agents has sparked significant debate in the SEO community. The core concern: does serving different content to AI agents constitute "cloaking", a practice that violates search engine guidelines?
The Criticism
Google's John Mueller has been particularly vocal in his criticism:
"LLMs have trained on β read & parsed β normal web pages since the beginning, it seems a given that they have no problems dealing with HTML. Why would they want to see a page that no user sees? And, if they check for equivalence, why not use HTML?"
Microsoft's Fabrice Canel echoed similar concerns:
"Really want to double crawl load? We'll crawl anyway to check similarity. Non-user versions are often neglected, broken. Humans eyes help fixing people and bot-viewed content."
The Technical Concern
SEO consultant David McSweeney identified a potential vulnerability: the Accept: text/markdown header is forwarded to origin servers, effectively signaling that the request is from an AI agent.
This creates an opportunity for "AI cloaking": serving different content to agents than to humans:
// Potential for abuse
if (request.headers['accept'].includes('text/markdown')) {
  // Serve optimized/different content to AI agents
  return generateAIContent();
} else {
  // Serve normal content to humans
  return generateHumanContent();
}
SEO Risk
Sites that serve fundamentally different content to AI agents vs. humans risk being penalized for cloaking. The safeguard is that search engines can easily detect this by comparing the HTML and markdown versions.
Cloudflare's Position
Cloudflare argues their approach is different:
- Same source content: Markdown is generated from the same HTML that humans see
- Standard HTTP: Content negotiation is a long-established web standard
- Transparency: The conversion happens at the edge, not at origin
- Verifiable: Anyone can compare the HTML and markdown versions
The Broader Debate
This controversy reflects a deeper tension about the evolution of the web:
- Traditional view: One web, same content for all users (human and AI)
- Agentic view: Multi-modal web, optimized experiences for different consumers
The outcome will likely shape how the "agentic web" develops. Will we see:
- Search engines adapting to accept agent-optimized content?
- Stricter enforcement against any form of differential content serving?
- New standards for acceptable AI optimization?
Current Status
As of February 2026, Google hasn't taken official action against sites using Markdown for Agents. However, the criticism from key Google personnel suggests publishers should proceed cautiously and ensure their markdown versions accurately represent their HTML content.
Who's Already Using It
Despite being launched just over a week ago, Markdown for Agents is already seeing significant adoption, both from publishers and AI agents.
Publishers
- Cloudflare itself: Blog and developer documentation
- Early adopters: Tech companies on Cloudflare Pro/Business plans
- WordPress sites: Via plugins and custom implementations
- Developer tools: API documentation sites
AI Agents Already Sending Accept Headers
- Claude Code: Anthropic's coding assistant
- OpenCode: Open-source coding agent
- Goose: Block's agent framework
- OpenClaw: General-purpose AI agent platform
- Custom agents: Built with frameworks supporting content negotiation
Community Response
The developer community has responded enthusiastically, with several competing and complementary solutions emerging:
markdown.new
A service that predates Cloudflare's feature but now integrates it as the primary conversion tier. Simply prepend markdown.new/ to any URL to get clean markdown back.
Klovr
Built to address the opt-in limitation β converts any webpage to markdown on-demand, with 100% compatibility with Cloudflare's Accept headers but works on all sites.
MAKO Protocol
An open protocol that goes beyond simple conversion, providing structured content with YAML frontmatter, semantic metadata, and embeddings for relevance filtering.
accept.md
A Next.js tool that makes sites LLM-scraping friendly with one command, generating agent-optimized versions locally.
Adoption Reality Check
While the technology is impressive, real-world adoption is still limited. One developer tested 100 popular websites with the Accept: text/markdown header; only 3 actually served markdown. The rest still returned HTML, highlighting the opt-in limitation.
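That experiment is easy to reproduce. A stdlib-only sketch of the per-site check (the original tester's site list wasn't published, so bring your own URLs):

```python
from urllib.request import Request, urlopen

def is_markdown_content_type(content_type: str) -> bool:
    """True when the media type (ignoring parameters) is text/markdown."""
    return content_type.split(";")[0].strip().lower() == "text/markdown"

def serves_markdown(url: str, timeout: float = 10.0) -> bool:
    """Ask a site for markdown and report whether it actually complied."""
    req = Request(url, headers={"Accept": "text/markdown"})
    with urlopen(req, timeout=timeout) as resp:
        return is_markdown_content_type(resp.headers.get("Content-Type", ""))
```

Run `serves_markdown` over your own list of 100 domains and you can compute an up-to-date adoption figure in a couple of minutes.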
Building an Agent-First Website
You don't need Cloudflare to build agent-friendly responses. The concept is simple: detect when an AI agent is requesting your content and serve an optimized version. Here's how to implement this on any platform.
Core Principles
- Content negotiation: Use standard HTTP Accept headers
- Same source content: Generate markdown from your existing HTML/content
- Metadata headers: Include token counts and usage signals
- Caching: Cache converted versions for performance
- Fallback: Always serve HTML if markdown conversion fails
Detection Strategy
Detect AI agents through multiple signals:
def is_ai_agent_request(request):
    accept_header = request.headers.get('accept', '').lower()
    user_agent = request.headers.get('user-agent', '').lower()

    # Primary: Accept header includes markdown
    if 'text/markdown' in accept_header:
        return True

    # Secondary: known AI agent user agents
    ai_agents = [
        'claude', 'openai', 'anthropic', 'gptbot',
        'openclaw', 'goose', 'langchain', 'llamaindex'
    ]
    return any(agent in user_agent for agent in ai_agents)
Conversion Pipeline
Build a robust HTML-to-markdown conversion pipeline:
import html2text
import re
from bs4 import BeautifulSoup

def html_to_markdown(html_content, base_url=None):
    """Convert HTML to clean, AI-friendly markdown"""
    # Parse HTML
    soup = BeautifulSoup(html_content, 'html.parser')

    # Remove non-content elements
    for element in soup(['script', 'style', 'nav', 'footer', 'aside']):
        element.decompose()

    # Remove empty elements (skip any already destroyed with a parent)
    for element in soup.find_all():
        if element.decomposed:
            continue
        if not element.get_text(strip=True) and not element.find(['img', 'video', 'audio']):
            element.decompose()

    # Configure html2text
    h = html2text.HTML2Text()
    h.ignore_links = False
    h.ignore_images = False
    h.body_width = 0  # No line wrapping
    h.single_line_break = True
    if base_url:
        h.baseurl = base_url

    # Convert to markdown
    markdown = h.handle(str(soup))

    # Clean up excessive whitespace
    markdown = re.sub(r'\n\s*\n\s*\n', '\n\n', markdown)
    return markdown.strip()
Token Counting
Implement token counting for the x-markdown-tokens header:
import tiktoken

def count_tokens(text, model='gpt-4'):
    """Estimate token count for text"""
    try:
        encoding = tiktoken.encoding_for_model(model)
        return len(encoding.encode(text))
    except KeyError:
        # Fallback: rough estimation (1 token is roughly 4 characters)
        return len(text) // 4
FastAPI Implementation for thinksmart.life
Here's a complete FastAPI middleware implementation that adds Cloudflare-style Markdown for Agents support to any FastAPI application, a good fit for our thinksmart.life setup:
Complete Middleware
from fastapi import FastAPI, Request
import html2text
import tiktoken
import re
from bs4 import BeautifulSoup
from datetime import datetime, timedelta
import hashlib

class MarkdownForAgentsMiddleware:
    def __init__(self, app: FastAPI, cache_ttl: int = 3600):
        self.app = app
        self.cache = {}  # In production, use Redis
        self.cache_ttl = cache_ttl

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        request = Request(scope, receive)

        # Check if this is an AI agent requesting markdown
        if not self.should_serve_markdown(request):
            await self.app(scope, receive, send)
            return

        # Buffer the original response
        response_body = b""
        response_status = 200
        response_headers = []

        async def capture_response(message):
            nonlocal response_body, response_status, response_headers
            if message["type"] == "http.response.start":
                response_status = message["status"]
                response_headers = message["headers"]
            elif message["type"] == "http.response.body":
                response_body += message.get("body", b"")

        await self.app(scope, receive, capture_response)

        # Convert HTML responses to markdown
        if response_status == 200 and self.is_html_content(response_headers):
            try:
                markdown_response = await self.convert_to_markdown(
                    response_body.decode('utf-8'),
                    str(request.url)
                )
                await self.send_markdown_response(send, markdown_response)
                return
            except Exception as e:
                # Fall back to the original response on error
                print(f"Markdown conversion failed: {e}")

        # Send original response
        await send({
            "type": "http.response.start",
            "status": response_status,
            "headers": response_headers,
        })
        await send({
            "type": "http.response.body",
            "body": response_body,
        })

    def should_serve_markdown(self, request: Request) -> bool:
        """Detect if request is from an AI agent wanting markdown"""
        accept = request.headers.get('accept', '').lower()
        user_agent = request.headers.get('user-agent', '').lower()

        # Primary detection: Accept header
        if 'text/markdown' in accept:
            return True

        # Secondary: known AI agents
        ai_indicators = [
            'claude', 'anthropic', 'openai', 'gptbot', 'chatgpt',
            'openclaw', 'langchain', 'llamaindex', 'agent',
            'crawler', 'scraper'
        ]
        return any(indicator in user_agent for indicator in ai_indicators)

    def is_html_content(self, headers) -> bool:
        """Check if the buffered response is HTML"""
        for name, value in headers:
            if name.lower() == b'content-type':
                return b'text/html' in value.lower()
        return False

    async def convert_to_markdown(self, html: str, url: str) -> dict:
        """Convert HTML to markdown with metadata"""
        # Check cache first
        cache_key = hashlib.md5(html.encode()).hexdigest()
        cached = self.cache.get(cache_key)
        if cached and datetime.now() < cached['expires']:
            return cached['data']

        soup = BeautifulSoup(html, 'html.parser')

        # Extract title and meta description
        title_elem = soup.find('title')
        title = title_elem.get_text().strip() if title_elem else ''
        desc_elem = soup.find('meta', attrs={'name': 'description'})
        description = desc_elem.get('content', '').strip() if desc_elem else ''

        # Remove non-content elements
        for selector in ['script', 'style', 'nav', 'footer', 'header',
                         'aside', '.ad', '#comments', '.sidebar']:
            for elem in soup.select(selector):
                elem.decompose()

        # Configure the markdown converter
        h = html2text.HTML2Text()
        h.ignore_links = False
        h.ignore_images = False
        h.body_width = 0
        h.single_line_break = True
        h.baseurl = url

        # Convert the main content region
        main_content = soup.find('main') or soup.find('article') or soup.body or soup
        markdown_body = h.handle(str(main_content))

        # Clean up markdown
        markdown_body = re.sub(r'\n\s*\n\s*\n+', '\n\n', markdown_body).strip()

        # Create frontmatter
        frontmatter = []
        if title:
            frontmatter.append(f'title: {title}')
        if description:
            frontmatter.append(f'description: {description}')
        frontmatter.append(f'url: {url}')
        frontmatter.append(f'converted_at: {datetime.now().isoformat()}')

        full_markdown = "---\n" + "\n".join(frontmatter) + "\n---\n\n" + markdown_body

        result = {
            'content': full_markdown,
            'token_count': self.count_tokens(full_markdown),
            'title': title,
            'description': description
        }

        # Cache the result
        self.cache[cache_key] = {
            'data': result,
            'expires': datetime.now() + timedelta(seconds=self.cache_ttl)
        }
        return result

    def count_tokens(self, text: str) -> int:
        """Estimate token count"""
        try:
            # Use tiktoken for accurate counting
            encoding = tiktoken.encoding_for_model('gpt-4')
            return len(encoding.encode(text))
        except KeyError:
            # Fallback: rough word-based approximation
            return int(len(text.split()) * 1.3)

    async def send_markdown_response(self, send, markdown_data: dict):
        """Send the markdown response with proper headers"""
        content = markdown_data['content'].encode('utf-8')
        headers = [
            (b'content-type', b'text/markdown; charset=utf-8'),
            (b'content-length', str(len(content)).encode()),
            (b'vary', b'accept'),
            (b'x-markdown-tokens', str(markdown_data['token_count']).encode()),
            (b'content-signal', b'ai-train=yes, search=yes, ai-input=yes'),
            (b'cache-control', b'public, max-age=3600'),
        ]
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": headers,
        })
        await send({
            "type": "http.response.body",
            "body": content,
        })
# Usage in your FastAPI app
from fastapi.responses import HTMLResponse

app = FastAPI()

# Your existing routes work unchanged
@app.get("/")
async def home():
    return {"message": "Hello World"}

# response_class=HTMLResponse ensures the middleware sees text/html
@app.get("/about", response_class=HTMLResponse)
async def about():
    return """
    <!DOCTYPE html>
    <html>
    <head>
        <title>About Us</title>
        <meta name="description" content="Learn about our company">
    </head>
    <body>
        <main>
            <h1>About Us</h1>
            <p>We build amazing things for AI agents.</p>
        </main>
    </body>
    </html>
    """

# Wrap the app with the ASGI middleware; serve `app` with uvicorn as usual
app = MarkdownForAgentsMiddleware(app)
Testing the Implementation
Test your implementation with curl:
# Request HTML (default)
curl http://localhost:8000/about
# Request Markdown
curl -H "Accept: text/markdown" http://localhost:8000/about
# Check headers
curl -I -H "Accept: text/markdown" http://localhost:8000/about
Production Considerations
Production Checklist
- Redis cache: Replace in-memory cache with Redis for horizontal scaling
- Rate limiting: Prevent abuse of conversion endpoint
- Monitoring: Track conversion success rates and performance
- Error handling: Graceful fallback to HTML on conversion failures
- Content filtering: Ensure consistent content between HTML and markdown
Advanced Features
Extend the basic implementation with:
- Content-aware conversion: Different strategies for articles vs. product pages
- Selective conversion: Only convert specific routes
- A/B testing: Compare agent behavior with HTML vs. markdown
- Analytics: Track which agents request markdown content
- Custom signals: Per-page Content-Signal headers
The Agentic Web: Vision of a Dual-Purpose Internet
Cloudflare's Markdown for Agents represents more than a technical optimization; it's the first major step toward an "agentic web" where content is natively designed for both human and AI consumption.
Current Web vs. Agentic Web
| Aspect | Current Web | Agentic Web |
|---|---|---|
| Primary audience | Humans only | Humans + AI agents |
| Content format | HTML with visual styling | Structured, semantic content |
| Discovery | Search engines | Search engines + agent crawling |
| Interaction | Click, type, browse | API calls, structured queries |
| Optimization | SEO for visibility | AIO (Agent Intelligence Optimization) |
Emerging Patterns
We're already seeing early patterns of agentic web design:
1. Progressive Enhancement for Agents
- Base HTML for humans
- Markdown versions for content consumption
- API endpoints for structured data access
- MCP servers for interactive capabilities
2. Semantic Markup
- Rich Schema.org annotations
- JSON-LD for structured data
- Microformats for semantic content
- Custom metadata for AI instructions
3. Agent-First Information Architecture
- Clear content hierarchy
- Explicit relationships between entities
- Action-oriented interfaces
- Context-aware responses
The Standards War
Multiple standards and protocols are emerging:
- Cloudflare Markdown for Agents: Content negotiation approach
- MAKO Protocol: Structured semantic markdown
- llms.txt: Site-level AI instructions
- Content Signals: Usage permission framework
- WebMCP: Model Context Protocol for web
- AI-First HTML: Semantic HTML optimized for agents
The winner likely won't be a single standard, but rather an ecosystem of complementary approaches that serve different use cases.
Implications for Publishers
Publishers need to start thinking about dual-purpose content:
Strategic Questions
- How do you want AI agents to represent your content?
- What actions should agents be able to perform on your site?
- How do you balance discoverability with control?
- What's your strategy for AI attribution and traffic?
The Next Phase
We're still in the early days. The next phase will likely bring:
- Agent analytics: Understanding how AI agents use your content
- Monetization models: How to generate value from agent traffic
- Content personalization: Serving different content to different types of agents
- Agent authentication: Verified agent identities and capabilities
- Bidirectional communication: Agents that can create and modify content
The companies and publishers who start optimizing for this agentic web now will have a significant advantage as AI agents become more prevalent in daily workflows.
Competitors & Standards
While Cloudflare was first to market with infrastructure-level markdown conversion, the space is rapidly evolving with multiple approaches and competitors.
Direct Competitors
Vercel and Netlify
Both are reportedly working on similar features:
- Vercel: Edge-side content transformation for AI agents
- Netlify: Build-time markdown generation for known routes
- AWS CloudFront: Lambda@Edge functions for content negotiation
markdown.new and Klovr
Third-party services that provide universal markdown conversion:
- markdown.new: Prepend to any URL, works everywhere
- Klovr: API-based conversion with Redis caching
- Firecrawl: Y Combinator-backed, comprehensive web scraping for AI
Emerging Standards
The llms.txt Convention
A simple text file that provides site-level instructions to AI systems:
# /llms.txt
# Instructions for AI systems
## About this site
This is a technology blog focused on AI and web development.
## Preferred citation
When referencing this content, please cite as: "ThinkSmart.Life Research (2026)"
## Usage permissions
- Training: Allowed for open-source models only
- Search: Allowed
- Summarization: Allowed with attribution
MAKO Protocol
A more sophisticated approach that combines markdown with semantic metadata:
---
type: article
entity: product_review
actions:
- compare_products
- get_pricing
semantic_links:
- rel: alternative, href: /microphone-alternatives/
- rel: pricing, href: /pricing-api/microphones
embeddings:
model: text-embedding-3-small
vector: [0.123, 0.456, ...]
---
# Best Microphones for 2026
Content optimized specifically for AI consumption...
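Any consumer of MAKO-style (or plain frontmatter-carrying) documents first has to split the metadata block from the body. A minimal stdlib sketch, leaving the YAML itself to a real parser:

```python
def split_frontmatter(doc: str) -> tuple:
    """Split a leading ----delimited frontmatter block from the markdown body."""
    if doc.startswith("---\n"):
        end = doc.find("\n---", 4)
        if end != -1:
            return doc[4:end], doc[end + 4:].lstrip("\n")
    return "", doc  # No frontmatter present
```

Feed the first element to a YAML library and the second to your markdown pipeline; documents without frontmatter pass through untouched.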
Browser and Client Integration
Client-side approaches are also emerging:
- Browser extensions: Automatic markdown conversion for any site
- Agent frameworks: Built-in content negotiation in LangChain, LlamaIndex
- Mobile apps: AI-first browsers that prefer structured content
The HTTP Standard Evolution
There's discussion about formalizing AI content negotiation in HTTP standards:
- Accept-AI header: Explicit AI agent identification
- Content-AI-Optimized: Server capability advertisement
- AI-Context headers: Task-specific content optimization
Enterprise Solutions
Enterprise-focused platforms are building comprehensive solutions:
- DataDog: Agent traffic analytics and optimization
- Pinecone: Vector database integration for semantic content
- MongoDB: Native document transformation for AI workloads
- Supabase: Real-time API generation from content
Getting Started
Ready to make your website agent-first? Here are the practical steps to get started, whether you use Cloudflare or not.
Option 1: Enable Cloudflare Markdown for Agents
If you're already on Cloudflare (Pro, Business, or Enterprise):
- Log into the Cloudflare dashboard
- Select your zone
- Go to AI Crawl Control section
- Toggle Markdown for Agents to enable
Or via API:
curl -X PATCH 'https://api.cloudflare.com/client/v4/zones/{zone_id}/settings/content_converter' \
--header 'Authorization: Bearer {api_token}' \
--header 'Content-Type: application/json' \
--data '{"value": "on"}'
Option 2: Implement Your Own (Recommended)
For maximum control and compatibility with any hosting platform:
Quick Start with Express.js
const express = require('express');
const html2md = require('html-to-md');
const app = express();

// Middleware to detect and serve markdown
app.use((req, res, next) => {
  const wantsMarkdown = req.headers.accept?.includes('text/markdown');

  if (wantsMarkdown) {
    // Intercept response
    const originalSend = res.send;
    res.send = function(body) {
      // Match '<html' so documents with attributes (e.g. <html lang="en">) convert too
      if (typeof body === 'string' && body.includes('<html')) {
        const markdown = html2md(body);
        const tokenCount = Math.ceil(markdown.length / 4);  // rough estimate

        res.set({
          'Content-Type': 'text/markdown; charset=utf-8',
          'X-Markdown-Tokens': tokenCount,
          'Content-Signal': 'ai-train=yes, search=yes, ai-input=yes',
          'Vary': 'Accept'
        });
        return originalSend.call(this, markdown);
      }
      return originalSend.call(this, body);
    };
  }
  next();
});

app.listen(3000);
WordPress Plugin Approach
<?php
// Add to your theme's functions.php.
// Requires league/html-to-markdown installed via Composer.

function convert_to_markdown_on_output($content) {
    if (strpos($_SERVER['HTTP_ACCEPT'] ?? '', 'text/markdown') !== false) {
        // Convert HTML content to markdown
        require_once __DIR__ . '/vendor/autoload.php';
        $converter = new League\HTMLToMarkdown\HtmlConverter();
        $markdown = $converter->convert($content);

        // Set appropriate headers (word count is a rough token estimate)
        header('Content-Type: text/markdown; charset=utf-8');
        header('X-Markdown-Tokens: ' . str_word_count($markdown));
        header('Content-Signal: ai-train=yes, search=yes, ai-input=yes');
        header('Vary: Accept');

        return $markdown;
    }
    return $content;
}
add_filter('the_content', 'convert_to_markdown_on_output', 999);
?>
Option 3: Third-Party Services
Use existing services for quick implementation:
- markdown.new: Proxy approach, prepend to URLs
- Klovr: API-based conversion service
- Firecrawl: Comprehensive web scraping API
Testing Your Implementation
Verify your setup works correctly:
# Test with curl
curl -H "Accept: text/markdown" https://yoursite.com/
# Check for proper headers
curl -I -H "Accept: text/markdown" https://yoursite.com/
# Test with different agents
curl -H "User-Agent: Claude-Agent" https://yoursite.com/
Monitoring and Analytics
Track AI agent usage:
- Log analysis: Count requests with markdown Accept headers
- Conversion metrics: Track HTML-to-markdown success rates
- Performance monitoring: Measure conversion latency
- Agent identification: Catalog which agents visit your site
Success Metrics
- Agent adoption: Increasing requests with markdown Accept headers
- Token efficiency: 60-80% reduction in token usage
- Conversion quality: Faithful representation of HTML content
- Performance: Sub-100ms conversion latency
References
- Cloudflare - Introducing Markdown for Agents (February 12, 2026)
- Cloudflare - Code Mode: give agents an entire API in 1,000 tokens (February 20, 2026)
- Cloudflare Developer Documentation - Markdown for Agents
- Search Engine Land - Cloudflare's Markdown for Agents AI feature has SEOs on alert
- The Register - Cloudflare turns websites into faster food for AI agents
- Content Signals Framework
- Hacker News - Community Discussions
- markdown.new - Universal Markdown Conversion
- Klovr - Universal Webpage to Markdown Conversion
- MAKO Protocol - LLM-Optimized Web Content
- accept.md - Next.js LLM-Scraping Tool
- Cloudflare MCP Server Repository
- Cloudflare Agents SDK
- Anthropic - Code Execution with MCP
- Cloudflare Learning - What is Model Context Protocol