Writing Markdown Documentation for AI Efficiency: A Practical Guide to Reducing Context Size

Posted Nov 11, 2025

38 min read

AI-Generated Content

This article was generated by AI. The accuracy of the content is not guaranteed, and we accept no responsibility for any damages resulting from use of this article. By continuing to read, you agree to the Terms of Use.

Target Audience: Python/JavaScript developers, engineers using AI tools (Claude, ChatGPT, Cursor, etc.) in their work
Prerequisites: Markdown basics, basic experience with AI development tools
Reading Time: 15 minutes

Overview

When using LLM-based tools such as Claude, ChatGPT, and Cursor in development, how you write project documentation (CLAUDE.md, README.md, prompt templates, etc.) directly affects AI response quality and cost efficiency. This article explains Markdown writing techniques that minimize context size while maximizing information density, based on 2024-2025 research and best practices[1][2][3].

Note: The techniques recommended in this article are optimized for readability by AI (LLMs).

Why You Should Be Mindful of Context Size

Current State of Context Windows (As of November 2025)

Major LLM context windows are as follows[4][5]:

Claude:

Paid plan: 200K tokens (~500,000 characters, equivalent to 500 pages)
Enterprise: 500K tokens (Claude Sonnet 4.5)
API Beta: 1M tokens (Claude Sonnet 4, Tier 4 and above)

ChatGPT/GPT-4o:

Free: 8K tokens
Plus: 32K tokens
Pro/Enterprise: 128K tokens
API: 128K tokens

Cost Impact

Claude applies premium pricing for requests exceeding 200K tokens[4]:

Input tokens: 2x
Output tokens: 1.5x
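
As a rough illustration of these multipliers, the sketch below estimates the cost of a single request. The base prices per million tokens passed in are placeholder values, not published rates, and the code assumes the multipliers apply to the whole request once the input exceeds 200K tokens; check both against the current pricing page before relying on the numbers.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # input tokens; threshold from the pricing rule above
INPUT_MULTIPLIER = 2.0
OUTPUT_MULTIPLIER = 1.5


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_mtok: float,
                  output_price_per_mtok: float) -> float:
    """Estimate one request's cost. Prices are per million tokens (hypothetical values)."""
    long_context = input_tokens > LONG_CONTEXT_THRESHOLD
    in_mult = INPUT_MULTIPLIER if long_context else 1.0
    out_mult = OUTPUT_MULTIPLIER if long_context else 1.0
    total = (input_tokens * input_price_per_mtok * in_mult
             + output_tokens * output_price_per_mtok * out_mult)
    return total / 1_000_000
```

Under these assumptions, tripling the input from 100K to 300K tokens raises the input portion of the bill six-fold rather than three-fold, which is why staying under the threshold matters.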

Performance Impact

Inputting large amounts of context causes the following problems:

Slower response times
Increased noise from less relevant information
Reduced accuracy (especially in RAG systems)[1]

Why Markdown Is Chosen

1. Token Efficiency

Markdown enables 20-30% token reduction compared to HTML, XML, and JSON[2].

  
# ❌ HTML (verbose)
<h1>Title</h1>
<ul>
  <li>Item 1</li>
  <li>Item 2</li>
</ul>

# ✅ Markdown (concise)
# Title
- Item 1
- Item 2

Reasons:

Markdown symbols (#, *, -, |) are often converted to single tokens[7]
No closing tags needed
No attribute description overhead
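
To get a feel for the difference, the sketch below compares the two snippets with a deliberately crude proxy that counts word runs and punctuation marks. This is not a real tokenizer (`rough_token_count` is a hypothetical helper), so the absolute numbers will differ from what tiktoken or Claude's tokenizer report; only the direction of the gap is meaningful.

```python
import re


def rough_token_count(text: str) -> int:
    """Crude proxy: each word run and each punctuation mark counts as one unit.

    Real BPE tokenizers produce different counts.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


html = "<h1>Title</h1>\n<ul>\n  <li>Item 1</li>\n  <li>Item 2</li>\n</ul>"
markdown = "# Title\n- Item 1\n- Item 2"

html_count = rough_token_count(html)
md_count = rough_token_count(markdown)
print(f"HTML: {html_count} units, Markdown: {md_count} units")
```

For real measurements, run the same comparison with the tokenizer of the model you actually use.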

2. LLM Tokenization Process

OpenAI’s tiktoken and Claude’s tokenizer use Byte Pair Encoding (BPE) to split text into tokens[7][8]:

Convert to byte sequence using UTF-8 encoding
Pre-tokenize with predefined regex patterns (split at word boundaries)
Merge frequent byte sequences using BPE algorithm

Markdown structural symbols are learned as frequent patterns, enabling efficient tokenization.
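
The merge step can be sketched with a toy implementation. It starts from UTF-8 bytes and repeatedly merges the most frequent adjacent pair; the regex pre-tokenization step is skipped for brevity, and production tokenizers apply a fixed, pre-trained merge table rather than learning merges from the input, so treat this purely as an illustration of why repeated patterns such as `- ` end up as single tokens.

```python
from collections import Counter


def bpe_merges(text: str, num_merges: int) -> list[tuple[int, ...]]:
    """Toy BPE: merge the most frequent adjacent symbol pair, num_merges times."""
    # Step 1: UTF-8 bytes; each symbol is a tuple of byte values
    symbols = [(b,) for b in text.encode("utf-8")]
    for _ in range(num_merges):
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats, nothing worth merging
        merged = left + right
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                out.append(merged)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        symbols = out
    return symbols


bullets = "- a\n- b\n- c\n"
print(len(bpe_merges(bullets, 0)), len(bpe_merges(bullets, 1)))
```

In this input the pair `-` plus space occurs three times, so it is the first pair merged.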

3. RAG Search Accuracy Improvement

Clean Markdown has been reported to improve RAG search accuracy by up to 35% and reduce token usage by 20-30%[2].

Context-Efficient Markdown Writing

1. Heading Hierarchy Optimization

Principle: A clear heading hierarchy helps LLMs recognize document structure[2].

  
# ❌ Bad example: Flat structure
## Overview
This is the project overview.
## Installation
Installation instructions.
## Usage
How to use.
## API Reference
API description.

# ✅ Good example: Hierarchical structure
## Overview
Concise project description

## Quick Start
### Installation
npm install project-name

### Basic Usage
const app = new App()

## API Reference
### Class: App
#### Constructor
#### Methods

Effects:

LLMs more easily understand relationships between sections
RAG system chunk splitting is optimized
Faster navigation to needed information
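
The link between heading hierarchy and chunk splitting can be sketched as follows. This minimal, hypothetical chunker splits a document at headings and labels each chunk with its full heading path, which is only informative when the hierarchy is meaningful. It assumes ATX-style `#` headings and does not handle `#` lines inside fenced code blocks.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_headings(markdown: str) -> list[dict]:
    """Split Markdown at headings; each chunk records its heading path."""
    chunks: list[dict] = []
    path: list[tuple[int, str]] = []  # (level, title) of the enclosing headings
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append({"path": " > ".join(t for _, t in path), "text": text})
        body.clear()

    for line in markdown.splitlines():
        m = HEADING.match(line)
        if m:
            flush()
            level = len(m.group(1))
            # Leave any section at the same or a deeper level
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, m.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks
```

With the hierarchical example above, the `npm install` line is stored under the path `Quick Start > Installation`, giving a retriever more context than a bare `Installation` heading would.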

2. Eliminating Redundancy

Principle: Concise prompts can achieve 30-50% token reduction[11][12].

# ❌ Verbose expression (estimated 200 tokens)
This project is a modern web application framework designed to
be very convenient and easy to use for users, while having
extremely powerful features. By using this framework, developers
can rapidly build applications.

# ✅ Concise expression (estimated 80 tokens)
Modern web application framework.
Simple API enables rapid development.

Reduction techniques:

Reduce modifiers (“very”, “extremely”, etc.)
Eliminate duplicate expressions
Passive voice → Active voice
Reduce verbose conjunctions

3. Using Lists and Bullet Points

Principle: Structured lists are more token-efficient than prose[11].

  
# ❌ Prose format (estimated 150 tokens)
The main features of this library include, first of all,
high-speed processing. Next, type safety is guaranteed, which
is also important. Furthermore, it has extensibility through
plugins.

# ✅ List format (estimated 60 tokens)
Key features:
- High-speed processing
- Type safety
- Plugin extensibility

Details: [See documentation](./docs/features.md)

4. Code Block Optimization

Principle: Include only minimum necessary code examples, reference external files for details[13].

# ❌ Verbose code example (estimated 300 tokens)
import { App } from 'framework'
import { Logger } from 'logger'
import { Config } from 'config'

const logger = new Logger()
const config = new Config({
  port: 3000,
  host: 'localhost',
  debug: true
})

const app = new App(config, logger)
app.use(middleware1)
app.use(middleware2)
app.use(middleware3)
app.listen()

# ✅ Concise example (estimated 100 tokens)
import { App } from 'framework'

const app = new App({ port: 3000 })
app.listen()

// Full example: examples/basic-setup.ts

5. Table Token Optimization

Markdown tables consume significant tokens, so optimization is important[14].

  
# ❌ Verbose table (estimated 200 tokens)
| Parameter Name | Data Type | Default Value | Description |
|----------------|-----------|---------------|-------------|
| port           | number    | 3000          | Port number for server to listen on |
| host           | string    | localhost     | Server hostname |
| debug          | boolean   | false         | Whether to enable debug mode |

# ✅ Concise table (estimated 120 tokens)
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| port      | number | 3000  | Listen port |
| host      | string | localhost | Hostname |
| debug     | boolean | false | Debug mode |

# Or reference external file for complex tables
Complete list of config parameters: [config.md](./docs/config.md)

6. Directory Structure Representation

Principle: Tree line characters are for humans; for AI, use indentation or list format.

Tree line characters (├──, │, └──) are multibyte in UTF-8 (3 bytes each) and are typically split into multiple tokens by BPE tokenizers, which makes them inefficient.

  
# ❌ Tree lines (estimated 120-150 tokens, visual for humans)
project/
├── src/
│   ├── components/
│   ├── api/
│   └── utils/
├── tests/
└── docs/

# ✅ Indentation (estimated 70-90 tokens, 40% reduction)
project/
  src/
    components/
    api/
    utils/
  tests/
  docs/

# ✅ List format (estimated 60-80 tokens, 50% reduction)
- project/
  - src/
    - components/
    - api/
    - utils/
  - tests/
  - docs/

# ✅ Path notation (estimated 40-50 tokens, 67% reduction, compact)
project/{src/{components,api,utils},tests,docs}

With descriptive comments:

In practice, directory listings often carry descriptions, so token efficiency matters in those cases as well.

  
# Tree lines + descriptions (estimated 180-200 tokens)
project/
├── src/
│   ├── components/   # React components
│   ├── api/          # API endpoints
│   └── utils/        # Utility functions
├── tests/            # Test files
└── docs/             # Documentation

# Indentation + descriptions (estimated 110-130 tokens, 40% reduction)
project/
  src/
    components/   # React components
    api/          # API endpoints
    utils/        # Utility functions
  tests/          # Test files
  docs/           # Documentation

# List + descriptions (estimated 100-120 tokens, 45% reduction)
- project/
  - src/
    - components/ - React components
    - api/ - API endpoints
    - utils/ - Utility functions
  - tests/ - Test files
  - docs/ - Documentation

# Table format (estimated 90-110 tokens, 50% reduction, highest information density)
| Path | Description |
|------|-------------|
| src/components/ | React components |
| src/api/ | API endpoints |
| src/utils/ | Utility functions |
| tests/ | Test files |
| docs/ | Documentation |

Usage Guidelines:

CLAUDE.md and AI-focused files: Indentation or list format
README.md (humans + AI): Indentation (balances readability and token efficiency)
Technical specifications: Path notation (most compact)

About Token Reduction Rates:

The token reduction rates in this section (“estimated 120-150 tokens”, “40% reduction”, etc.) are theoretical estimates based on typical directory structures. Actual effects vary by structure complexity, description length, and tokenizer used. Measuring effects in your own project before adoption is recommended.
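
One way to take such a measurement is sketched below: a small, hypothetical converter from tree-line format to plain indentation, followed by a UTF-8 byte-size comparison as a stand-in for a token count. It assumes the common `tree` layout in which every nesting level occupies exactly four characters and uses regular spaces.

```python
import re

# One nesting level in `tree` output: "│   ", "    ", "├── " or "└── "
_PREFIX = re.compile(r"^((?:│   |    |├── |└── )*)(.*)$")


def tree_to_indent(tree_text: str, indent: str = "  ") -> str:
    """Rewrite tree-line output as an indentation-only listing."""
    lines = []
    for line in tree_text.splitlines():
        if not line.strip():
            continue
        match = _PREFIX.match(line)
        prefix, name = match.group(1), match.group(2)
        depth = len(prefix) // 4  # each level is four characters wide
        lines.append(indent * depth + name)
    return "\n".join(lines)


tree = (
    "project/\n"
    "├── src/\n"
    "│   ├── components/\n"
    "│   ├── api/\n"
    "│   └── utils/\n"
    "├── tests/\n"
    "└── docs/"
)
converted = tree_to_indent(tree)
print(len(tree.encode("utf-8")), "bytes ->", len(converted.encode("utf-8")), "bytes")
```

Byte size is only a proxy; swap in your model's tokenizer for the final comparison.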

Note: This article uses tree line format for human reader readability, but indentation format is recommended for AI-focused documentation.

7. Appropriate Use of Text Formatting

Principle: Use only formatting with semantic meaning, avoid excessive decoration[22][23].

Text formatting (**bold**, *italic*, `code`) helps LLM understanding, but formatting symbols themselves consume tokens. While Markdown symbols are often converted to single tokens[7][8], excessive use should be avoided[24].

  
# ❌ Excessive formatting (estimated 150 tokens)
**Important:** This **very important** feature must be used
***extremely*** carefully. You **absolutely must not** forget.

# ✅ Appropriate formatting (estimated 80 tokens)
**Important:** Use this feature carefully.

# ❌ Unnecessary formatting
This project is a *modern* *web* application *framework*.

# ✅ No formatting (meaning is clear)
This project is a modern web application framework.

When to use formatting:

Important warnings or notes (**Warning:**)
Commands, filenames, variable names and other technical elements
Emphasized technical terms (first occurrence only)

When to avoid formatting:

Purely for visual improvement
Formatting every sentence
Multiple emphases in the same paragraph

Comparing Markdown and JSON: In one reported comparison, Markdown used about 16% fewer tokens than JSON for the same content[25]:

JSON: 13,869 tokens
Markdown: 11,612 tokens

However, excessive formatting within Markdown reduces this advantage.

8. Using Section References

Principle: Don’t repeat explanations; define once and reference.

  
# ❌ Information duplication (estimated 400 tokens)
## Installation
npm install framework

Dependencies:
- Node.js 18+
- npm 9+
- TypeScript 5+

## Development Environment Setup
To set up development environment, you need:
- Node.js 18+
- npm 9+
- TypeScript 5+

# ✅ Deduplication through references (estimated 200 tokens)
## Requirements
- Node.js 18+
- npm 9+
- TypeScript 5+

## Installation
npm install framework

## Development Environment
See Requirements section above

9. Key-Value Pair and Labeled List Optimization

Principle: Maintain semantics while reducing unnecessary formatting[26].

The **Title**: pattern is frequently seen in technical documentation, but formatting is often unnecessary.

  
# ❌ Unnecessary formatting (estimated 60 tokens)
**Prerequisites**:
- Node.js 18+
- Docker environment

**Installation Steps**:
1. Clone repository
2. Install dependencies

**Notes**:
- Do not use in production

# ✅ Improvement 1: Use headings (estimated 40 tokens, 33% reduction)
### Prerequisites
- Node.js 18+
- Docker environment

### Installation Steps
1. Clone repository
2. Install dependencies

### Notes
- Do not use in production

# ✅ Improvement 2: No formatting (estimated 35 tokens, 42% reduction)
Prerequisites:
- Node.js 18+
- Docker environment

Installation Steps:
1. Clone repository
2. Install dependencies

Notes:
- Do not use in production

Usage Guidelines:

| Pattern | Use Case | Token Efficiency |
|---------|----------|------------------|
| `### Title` | Main document sections | High (high semantics) |
| `Title:` | Inline labels, short lists | Highest |
| `**Title**:` | Truly emphasized warnings/notes | Low (minimize use) |

CLAUDE.md Practical Example:

  
# ❌ Excessive formatting
**Project Name**: MyApp
**Language**: TypeScript
**Framework**: React
**Database**: PostgreSQL

# ✅ Headings and lists
## Tech Stack
- Language: TypeScript
- Framework: React
- DB: PostgreSQL

# ✅ Compact inline (for short items)
Project: MyApp | Language: TypeScript | DB: PostgreSQL

Using Definition Lists (Extended Syntax):

Some Markdown parsers (Pandoc, Jekyll, etc.) support Definition Lists[26]:

  
# Definition list syntax (if supported)
Title
: Description

API Key
: Secret key used for application authentication
: Set in environment variable `API_KEY`

# Rendered result (HTML)
<dl>
  <dt>API Key</dt>
  <dd>Secret key used for application authentication</dd>
  <dd>Set in environment variable <code>API_KEY</code></dd>
</dl>

However, this is not supported in standard Markdown, so use headings or lists when prioritizing compatibility.

10. Minimizing Cross-References

Principle: Maintain Single Source of Truth (SSOT) and include only minimum necessary cross-references[27][28].

References to related files help navigation but consume tokens and increase cognitive load.

SSOT (Single Source of Truth) Principle

Definition: Define and manage each information element in one place only, using only references elsewhere[27][28].

  
# ❌ SSOT violation: Information duplication (estimated 500 tokens)
<!-- README.md -->
## Prerequisites
- Node.js 18+
- Docker 20+
- PostgreSQL 14+

<!-- CONTRIBUTING.md -->
## Development Environment
Development requires:
- Node.js 18+
- Docker 20+
- PostgreSQL 14+

<!-- docs/setup.md -->
## Setup
Please install the following:
- Node.js 18+
- Docker 20+
- PostgreSQL 14+

# ✅ SSOT compliant (estimated 200 tokens, 60% reduction)
<!-- README.md -->
## Prerequisites
- Node.js 18+
- Docker 20+
- PostgreSQL 14+

Details: [docs/setup.md](docs/setup.md)

<!-- CONTRIBUTING.md -->
## Development Environment
Prerequisites: See [README.md](README.md#prerequisites)

<!-- docs/setup.md -->
## Setup
Ensure prerequisites are met: [README.md](../README.md#prerequisites)

Installation steps...
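
SSOT violations like the one above can be caught mechanically. The sketch below is a minimal, hypothetical checker that reports content lines appearing in more than one file. Headings are ignored because identical headings are often legitimate, and the `min_length` cutoff is an arbitrary assumption to skip trivial lines.

```python
from collections import defaultdict


def find_duplicated_lines(docs: dict[str, str], min_length: int = 8) -> dict[str, list[str]]:
    """Map each repeated content line to the sorted list of files containing it."""
    seen: dict[str, set[str]] = defaultdict(set)
    for name, text in docs.items():
        for line in text.splitlines():
            normalized = line.strip()
            # Skip short lines and headings
            if len(normalized) >= min_length and not normalized.startswith("#"):
                seen[normalized].add(name)
    return {line: sorted(files) for line, files in seen.items() if len(files) > 1}


docs = {
    "README.md": "## Prerequisites\n- Node.js 18+\n- Docker 20+",
    "CONTRIBUTING.md": "## Development Environment\n- Node.js 18+\n- Docker 20+",
    "docs/setup.md": "## Setup\nSee README.md for prerequisites",
}
for line, files in find_duplicated_lines(docs).items():
    print(f"{line!r} appears in {files}")
```

Each reported line is a candidate for replacement by a single reference to the file that owns the information.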

1. Selective References

Don’t list all related files; include only truly necessary references[29]:

  
# ❌ Excessive cross-references (estimated 300 tokens)
## Related Documentation
- [Project Overview](./docs/overview.md)
- [Architecture](./docs/architecture.md)
- [API Specification](./docs/api.md)
- [Database Schema](./docs/schema.md)
- [Deployment Guide](./docs/deploy.md)
- [Troubleshooting](./docs/troubleshooting.md)
- [FAQ](./docs/faq.md)
- [Changelog](./CHANGELOG.md)
- [License](./LICENSE)
- [Code of Conduct](./CODE_OF_CONDUCT.md)

## See Also
- [Getting Started](./docs/getting-started.md)
- [Advanced Usage](./docs/advanced.md)
- [Examples](./examples/README.md)

# ✅ Minimal references (estimated 100 tokens, 67% reduction)
## Next Steps
- Quick start: [docs/getting-started.md](docs/getting-started.md)
- API spec: [docs/api.md](docs/api.md)

Other: See [docs/](docs/)

2. Concise Link Text

Keep link text short with important words first[29]:

  
# ❌ Verbose link text
For details, please refer to [comprehensive documentation about project architecture](docs/architecture.md).

# ✅ Concise link text
Details: [Architecture documentation](docs/architecture.md)

3. Avoid Duplicate References on Same Page

When the same destination is referenced several times on a page, link only the first occurrence[29]:

  
# ❌ Duplicate references
The [API spec](docs/api.md) describes all endpoints.
For authentication, see the auth section of the [API spec](docs/api.md).
Error handling is detailed in the [API spec](docs/api.md).

# ✅ First occurrence only
The [API spec](docs/api.md) describes all endpoints.
For authentication, see the auth section of the API spec.
Error handling is detailed in the API spec.
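
This rule is simple enough to automate. The sketch below, using a hypothetical `unlink_repeats` helper, keeps the first inline link to each destination and turns later ones into plain text. It assumes simple `[text](target)` links without titles or nested brackets and leaves images untouched.

```python
import re

# Inline links only; the lookbehind skips images such as ![alt](src)
LINK = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def unlink_repeats(markdown: str) -> str:
    """Keep the first link to each destination; later ones become plain text."""
    seen: set[str] = set()

    def replace(match: re.Match) -> str:
        text, target = match.group(1), match.group(2)
        if target in seen:
            return text
        seen.add(target)
        return match.group(0)

    return LINK.sub(replace, markdown)


source = (
    "The [API spec](docs/api.md) describes all endpoints.\n"
    "Error handling is detailed in the [API spec](docs/api.md)."
)
print(unlink_repeats(source))
```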

4. Implicit Relationships Through Directory Structure

Show relationships through directory structure rather than explicit links:

  
# ❌ List all files
## Documentation
- [Overview](docs/overview.md)
- [Installation](docs/installation.md)
- [Configuration](docs/configuration.md)
- [Usage](docs/usage.md)
- [API](docs/api.md)
- [CLI](docs/cli.md)
- [Troubleshooting](docs/troubleshooting.md)

# ✅ Directory reference
## Documentation
Quick start: [docs/getting-started.md](docs/getting-started.md)

Other documentation: See [docs/](docs/)

docs/
  getting-started.md   # Quick start
  installation.md      # Installation
  configuration.md     # Configuration
  usage.md             # Usage
  api.md               # API specification
  troubleshooting.md   # Troubleshooting

CLAUDE.md Practical Example

  
# ❌ Excessive file listing (estimated 400 tokens)
## Related Documentation

### Architecture
- [Overall Architecture](docs/architecture/overview.md)
- [Frontend Design](docs/architecture/frontend.md)
- [Backend Design](docs/architecture/backend.md)
- [Database Design](docs/architecture/database.md)

### Development Guide
- [Environment Setup](docs/dev/setup.md)
- [Coding Standards](docs/dev/coding-style.md)
- [Testing Strategy](docs/dev/testing.md)
- [CI/CD](docs/dev/cicd.md)

### API
- [REST API](docs/api/rest.md)
- [GraphQL API](docs/api/graphql.md)
- [WebSocket API](docs/api/websocket.md)

# ✅ Minimal references (estimated 120 tokens, 70% reduction)
## Guidance for Claude

**When generating code:**
- Prioritize type safety (coding standards: [CONTRIBUTING.md](CONTRIBUTING.md))
- API design patterns: [docs/api/](docs/api/)
- Testing: [docs/dev/testing.md](docs/dev/testing.md)

**Detailed documentation:**
See [docs/](docs/)

Optimizing References in AI Instructions

When having AI read documentation, specify only necessary files:

  
# ❌ List all documentation
Please reference the following documentation:
- README.md
- CONTRIBUTING.md
- docs/architecture.md
- docs/api.md
- docs/setup.md
- docs/coding-style.md
- docs/testing.md
(etc.)

# ✅ Specify only necessary files
Please reference the following:
- Coding standards: CONTRIBUTING.md
- Architecture: docs/architecture.md

Other: Reference files in docs/ as needed

11. File Integration vs Separation Trade-offs

Principle: Choose between integration and separation based on AI system type[30][31][32].

“Consolidating multiple small files into one” can significantly improve or worsen token efficiency depending on the situation.

Recommended Approaches by Use Case

| Use Case | Recommendation | Reason | Token Efficiency |
|----------|----------------|--------|------------------|
| Claude Code (integrated AI) | Integrate | Load full context at once[32] | High (reference reduction) |
| RAG Systems | Separate | Retrieve only needed parts via semantic search[31] | High (exclude unnecessary info) |
| ChatGPT Custom Instructions | Integrate | Full content loaded each time | High (reference reduction) |
| AI Agents (selective loading) | Separate | Dynamically read only needed files | High (minimum necessary) |
| Documentation Sites | Separate | Users browse by topic | Medium (human-focused) |

Pros and Cons of Integration

Pros:

  
# Before: 3 separate files (total 600 tokens, example calculation)

<!-- overview.md -->
# Project Overview
...
Details: [setup.md](setup.md), [usage.md](usage.md)

<!-- setup.md -->
# Setup
Overview: See [overview.md](overview.md)
Usage: See [usage.md](usage.md)

<!-- usage.md -->
# Usage
Overview: See [overview.md](overview.md)
Setup: See [setup.md](setup.md)

# After: 1 integrated file (400 tokens, 33% reduction, example calculation)

# Project Overview
...

## Setup
...

## Usage
...

Elements That Can Be Reduced:

Cross-reference links between files
Duplicate headers/footers
Repeated prerequisite explanations
Navigation sections

Claude Code Recommendation: Keep it concise[32]

  
# CLAUDE.md (recommended: concise)

## Project Overview
[Concise]

## Tech Stack
[List format]

## Coding Standards
[Key points only]

## Guidance for Claude
[Specific instructions]

# Add with @import as needed
@.claude/advanced-config.md

Cons:

File gets larger (avoid 1,000+ lines)
Separation of concerns becomes difficult
Increased conflicts during team editing

Pros and Cons of Separation

Pros:

  
# Effect in RAG Systems

<!-- Integrated file: 5,000 tokens -->
README.md (5,000 tokens) vectorized
→ User question: "How does authentication work?"
→ Search entire 5,000 tokens
→ May include irrelevant information

<!-- Separated files: 500 tokens each -->
- overview.md (500 tokens)
- authentication.md (500 tokens) ← Match!
- deployment.md (500 tokens)
...

→ User question: "How does authentication work?"
→ Retrieve only authentication.md (500 tokens)
→ 90% token reduction

RAG Chunking Best Practices[31]:

Semantic chunking: Split at logical boundaries (paragraphs, sections)
Fixed-size chunking: 100-300 tokens/chunk (smaller = faster but less accurate)
Document-based chunking: Split based on Markdown structure
Overlap: 20-50 token overlap with adjacent chunks
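
Fixed-size chunking with overlap can be sketched in a few lines. The version below counts whitespace-separated words as a stand-in for tokens, which is an assumption made to keep the example dependency-free; a real pipeline would measure chunk size with the embedding model's tokenizer.

```python
def chunk_with_overlap(text: str, chunk_size: int = 200, overlap: int = 30) -> list[str]:
    """Fixed-size chunks over words, each sharing `overlap` words with the previous chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary is still retrievable in full from at least one chunk, at the price of storing some words twice.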

Cons:

Need reference descriptions between files
Inefficient when reading everything
Management becomes complex

Practical Decision Criteria

When to Integrate:

✅ Consider integration when ALL of the following apply

AI reads everything each time (Claude Code, Custom Instructions, etc.)
Total files under 1,000 tokens
Topics are closely related
Many cross-references (3+ locations)

Examples: CLAUDE.md, prompt templates, small project READMEs

When to Keep Separated:

✅ Keep separated when ANY of the following apply

Used in RAG systems
Total files over 2,000 tokens
Topics are independent
Selective loading is possible (AI agents, etc.)

Examples: Large documentation sites, API specifications, technical manuals
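
The two checklists can be written down as a small decision helper. This is a sketch with hypothetical names (`DocSet`, `recommend_layout`); it assumes that the separation criteria take precedence when both lists partly apply and that "independent topics" is the opposite of "closely related topics", neither of which is spelled out above.

```python
from dataclasses import dataclass


@dataclass
class DocSet:
    ai_reads_everything: bool      # e.g. Claude Code, Custom Instructions
    total_tokens: int
    topics_closely_related: bool
    cross_references: int
    used_in_rag: bool = False
    selective_loading: bool = False


def recommend_layout(d: DocSet) -> str:
    # Keep separated when ANY criterion applies
    if (d.used_in_rag or d.total_tokens > 2000
            or not d.topics_closely_related or d.selective_loading):
        return "separate"
    # Integrate only when ALL criteria apply
    if (d.ai_reads_everything and d.total_tokens < 1000
            and d.cross_references >= 3):
        return "integrate"
    return "judgment call"
```

Document sets between 1,000 and 2,000 tokens fall through both lists, which the helper reports as a judgment call rather than forcing an answer.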

Hybrid Approach: Hierarchical CLAUDE.md

Claude Code supports hierarchical CLAUDE.md[32]:

project/
├── CLAUDE.md        # Top level
├── frontend/
│   ├── CLAUDE.md    # Frontend specific
│   └── src/
└── backend/
    ├── CLAUDE.md    # Backend specific
    └── src/

# Top level CLAUDE.md
## Project Overview
Monorepo structure. Frontend (React), Backend (Node.js)

## Global Rules
- TypeScript required
- ESLint compliant

See each directory's CLAUDE.md for details

# frontend/CLAUDE.md
## Frontend Specific Rules
- React 18
- Styled Components used

# backend/CLAUDE.md
## Backend Specific Rules
- Express 4
- Prisma used

Benefits:

Separation of concerns
Keep each file small (under 100 lines)
Claude Code auto-loads as needed
Token efficient (loads only working directory)

Token Reduction Examples

Case Study: Project Documentation Integration[30][32]

  
# Before: 5 separate files (total 2,500 tokens)

README.md: 800 tokens
  - Overview
  - Link list: INSTALL.md, USAGE.md, API.md, CONTRIBUTING.md

INSTALL.md: 400 tokens
  - Installation instructions
  - Links: README.md, USAGE.md

USAGE.md: 500 tokens
  - How to use
  - Links: README.md, API.md

API.md: 600 tokens
  - API specification
  - Links: README.md, USAGE.md

CONTRIBUTING.md: 200 tokens
  - Contribution guide
  - Links: README.md

# After: Integrated for Claude Code (1,800 tokens, 28% reduction)

CLAUDE.md: 1,800 tokens
  - Overview (simplified)
  - Installation instructions
  - Basic usage
  - Main APIs
  - Contribution guide (key points only)

Detailed API spec: docs/api.md (reference only when needed)

# Reduction breakdown
- Cross-reference links: -300 tokens
- Duplicate overview explanations: -200 tokens
- Navigation sections: -150 tokens
- Verbose preambles: -50 tokens
Total reduction: -700 tokens (28%)

Important Notes:

Keep human-facing documentation sites separated (prioritize usability)
Consider integration for AI context files (CLAUDE.md, etc.)
Choose based on purpose

12. Dynamic Optimization and Automatic Integration: Next-Generation Approach

Principle: Manage source files granularly, dynamically optimize and integrate when providing to AI[33][34][35].

“Granular management normally → auto-integrate as needed” is a new approach that goes beyond the binary choice of static integration vs separation.

Approach Overview

Source Management (Git, etc.)          AI Delivery
─────────────────────────────────    ─────────────
Granularly separated files    →      Dynamic optimization/integration
├── overview.md                       ↓
├── installation.md              Auto-processing pipeline
├── api-auth.md                       ↓
├── api-users.md                 Integrated, optimized
└── deployment.md                context

Key Technologies and Tools

1. GraphRAG (Knowledge Graph-based RAG)[33]

Traditional RAG retrieves similar text via vector search, but GraphRAG leverages knowledge graph relationship structures.

  
# Traditional RAG

User question: "What's the error handling for user authentication?"
↓
Vector search: Retrieve only "authentication.md" (500 tokens)
↓
Problem: Error handling details are in separate file (error-handling.md)

# GraphRAG

User question: "What's the error handling for user authentication?"
↓
Knowledge graph search:
- authentication.md (500 tokens)
- Related: error-handling.md (300 tokens) ← Auto-detected by graph
- Related: api-response-codes.md (200 tokens) ← Auto-detected by graph
↓
Integrated context: 1,000 tokens (includes all related info)

GraphRAG Benefits[33]:

Strong for multi-hop questions (“What’s the relationship between A and B?”, etc.)
Automatically aggregates related entities
Improved context relevance
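
The relationship-expansion idea can be illustrated with a toy graph. The sketch below is not the API of any GraphRAG library: the graph is a hand-written dictionary, `expand_related` is a hypothetical helper, and `logging.md` is an invented file used only to show a second hop.

```python
from collections import deque


def expand_related(seeds: list[str], graph: dict[str, list[str]],
                   max_hops: int = 1) -> list[str]:
    """Breadth-first expansion from the retrieved documents along 'related' edges."""
    order = list(dict.fromkeys(seeds))  # de-duplicate while keeping order
    visited = set(order)
    queue = deque((doc, 0) for doc in order)
    while queue:
        doc, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not follow edges beyond the hop limit
        for neighbor in graph.get(doc, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, hops + 1))
    return order


graph = {
    "authentication.md": ["error-handling.md", "api-response-codes.md"],
    "error-handling.md": ["logging.md"],  # hypothetical second-hop document
}
print(expand_related(["authentication.md"], graph))
```

The hop limit keeps the expanded context bounded; raising it trades token budget for coverage of more distant relationships.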

2. Dynamic Context Switching[34]

Build context dynamically per request for LLM delivery.

  
# Pseudocode: Dynamic context building

def build_context(user_query, document_pool):
    # Step 1: Identify relevant documents
    relevant_docs = semantic_search(user_query, document_pool)

    # Step 2: Expand with knowledge graph
    expanded_docs = knowledge_graph.expand(relevant_docs)

    # Step 3: Optimize within token limit
    optimized_context = pack_documents(
        expanded_docs,
        max_tokens=4000,
        strategy="best_fit_packing"  # Eliminate wasteful truncation
    )

    return optimized_context

# Example:
# User query A: Calendar related → calendar.md + user-prefs.md
# User query B: Auth related → auth.md + error-codes.md + api-docs.md

Benefits:

Optimal context per query
Eliminate unnecessary information (token reduction)
High flexibility
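
A runnable version of the packing step might look like the sketch below. It uses a simple greedy strategy (highest relevance first, skipping anything that would exceed the budget) as a stand-in for the `best_fit_packing` strategy named in the pseudocode, and it assumes relevance scores and token counts have already been computed elsewhere.

```python
def pack_by_relevance(docs: list[tuple[str, float, int]], max_tokens: int) -> list[str]:
    """Greedy packing under a token budget.

    Each entry is (name, relevance_score, token_count).
    """
    selected: list[str] = []
    used = 0
    for name, _score, tokens in sorted(docs, key=lambda d: d[1], reverse=True):
        if used + tokens <= max_tokens:
            selected.append(name)
            used += tokens
    return selected


candidates = [
    ("auth.md", 0.9, 500),
    ("error-codes.md", 0.7, 300),
    ("api-docs.md", 0.6, 3500),  # too large for the remaining budget
    ("faq.md", 0.2, 100),
]
print(pack_by_relevance(candidates, max_tokens=1000))
```

Greedy selection is not optimal in general, but it is predictable and easy to debug, which is often what matters for a first pipeline.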

3. llm-docs-builder (Auto-optimization Tool)[35]

A tool that automatically optimizes Markdown documentation for AI.

Features:

HTML noise removal (navigation bars, footers, JavaScript, etc.)
Token reduction: 85-95% reduction
Automatic llms.txt generation
Provides Markdown optimized for AI crawlers

Usage Example:

  
# llm-docs-builder configuration

input_dir: ./docs
output_dir: ./docs-optimized

transformations:
  - remove_frontmatter: true
  - remove_html_comments: true
  - remove_badges: true
  - normalize_links: true
  - optimize_headings: true
  - add_hierarchical_context: true  # Add heading context

# Result:
# Original HTML document: 5,000 tokens (including navigation, CSS, etc.)
# After optimization: 1,500 tokens (70% reduction)

Workflow Integration:

  
# Auto-optimize at build time
npm run build
  → llm-docs-builder transform
  → Generate AI-optimized docs (docs-ai/)
  → Maintain human-facing docs (docs/)

# Web server configuration
if user_agent == "AI-Crawler":
    serve docs-ai/  # Optimized version
else:
    serve docs/     # Normal version

Practical Strategies

Strategy 1: Hybrid Management

Source Management (detailed separation)
├── authentication/
│   ├── overview.md
│   ├── oauth.md
│   ├── jwt.md
│   └── sessions.md
├── api/
│   ├── users.md
│   ├── posts.md
│   └── comments.md
└── errors/
    ├── codes.md
    └── handling.md

↓ Dynamic integration at build time/request time

AI-focused integrated documents
├── authentication-full.md  # authentication/* integrated
├── api-full.md             # api/* integrated
└── errors-full.md          # errors/* integrated

OR

RAG vector DB
└── Each file + knowledge graph maintains relationships

Strategy 2: On-demand Integration

  
# Dynamic processing at AI request time

1. Receive user query: "What's the error handling for OAuth authentication?"

2. Detect related files (GraphRAG):
   - authentication/oauth.md
   - errors/handling.md
   - errors/codes.md (401, 403 related)

3. Dynamic integration:

   # OAuth Authentication and Error Handling (Integrated)

   ## OAuth Authentication (from authentication/oauth.md)
   [content]

   ## Error Handling (from errors/handling.md)
   [OAuth-related parts only extracted]

   ## Related Error Codes (from errors/codes.md)
   - 401 Unauthorized
   - 403 Forbidden

4. Provide to LLM (optimized)

Pros and Cons

✅ Pros:

Source maintainability: Granular separation for easy management
AI optimization: Automatic integration/optimization
Flexibility: Dynamic optimization per query
Token efficiency: Integrate only needed parts (up to 95% reduction possible)
Human/AI compatibility: Optimize separately for humans and AI

❌ Cons:

Complexity: Pipeline construction required
Cost: Automation tool introduction/operation
Initial investment: Setup takes time
Overhead: Real-time integration increases processing time

Recommended Implementation Levels

| Project Scale | Recommended Approach | Reason |
|---------------|----------------------|--------|
| Small (<10 files) | Manual integration | Automation cost not justified |
| Medium (10-50 files) | Tools like llm-docs-builder | Significant automation benefits |
| Large (50+ files) | GraphRAG + dynamic integration | Complex relationship management required |
| Enterprise | Full pipeline | High ROI |

Summary: Choosing the Optimal Approach

# Decision Flowchart

Project scale?
├── Small (<1,000 tokens total)
│   → Manually integrate into single file
│
├── Medium (1,000-10,000 tokens)
│   → Auto-optimize with llm-docs-builder etc.
│   → If RAG system, keep separated
│
└── Large (10,000+ tokens)
    ├── Using RAG
    │   → GraphRAG + dynamic integration
    │
    └── Integrated AI like Claude Code
        → Hierarchical CLAUDE.md + @import

Key Points:

Source management: Always separate granularly (prioritize maintainability)
AI delivery: Dynamically optimize based on use case
Automation: Consider tool introduction based on scale
Measurement: Quantitatively evaluate token reduction effects
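
For the measurement step, the reduction rate is simple arithmetic over before and after token counts. The helper below is a minimal sketch; the token counts themselves should come from the tokenizer of the model you target.

```python
def reduction_percent(before_tokens: int, after_tokens: int) -> float:
    """Percentage of tokens saved relative to the original count."""
    if before_tokens <= 0:
        raise ValueError("before_tokens must be positive")
    return (before_tokens - after_tokens) / before_tokens * 100


# Figures from the integration case study in section 11
print(round(reduction_percent(2500, 1800)))
```

Tracking this number per file over time makes it easy to see whether an optimization pass actually paid off.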

Practical Example: CLAUDE.md Before/After

Before: Verbose CLAUDE.md (estimated 1,500 tokens)

  
# Project Overview

This project is a web application with very convenient and powerful features.
This application is designed to be easy for users to use, and adopts a
modern technology stack.

## Tech Stack

In this project, we use the following technologies:
- Frontend: We use React 18
- Backend: We use Node.js 20 and Express 4
- Database: We use PostgreSQL 15
- Authentication: We use JWT (JSON Web Tokens)

## Directory Structure

The project directory structure is as follows:
- src/ directory: Contains source code
  - components/ directory: Contains React components
  - api/ directory: Contains API endpoints
  - utils/ directory: Contains utility functions
- tests/ directory: Contains test files
- docs/ directory: Contains documentation

## Coding Standards

Please follow these coding standards for this project:
- Write all code in TypeScript
- Format code according to ESLint settings
- Add type definitions to all functions
- Add Props type definitions to all components

Estimated token count: ~1,500 tokens

After: Optimized CLAUDE.md (estimated 400 tokens)

  
# Project Overview

Modern web application (React + Node.js)

## Tech Stack

- Frontend: React 18, TypeScript 5
- Backend: Node.js 20, Express 4
- DB: PostgreSQL 15
- Auth: JWT

## Directory Structure

src/
  components/  # React components
  api/         # API endpoints
  utils/       # Utility functions
tests/         # Test files
docs/          # Documentation

## Coding Standards

- TypeScript required (type definitions required)
- ESLint compliant
- Details: [CONTRIBUTING.md](./CONTRIBUTING.md)

## Guidance for Claude

### When generating code
- Prioritize type safety (`any` prohibited)
- Consider security best practices
- Generate test code as needed

### API implementation
- RESTful design
- Error handling required
- OpenAPI specification compliant

Estimated token count: ~400 tokens

Reduction rate: 73% reduction (1,500 → 400 tokens)

Optimization Points

Reduce verbose explanations: “This project is…” → “Modern web application”
Use list format: Prose → bullet points
Use an indented tree: Show the directory structure visually instead of describing each directory in a sentence
External references: Details in separate files
Structured sections: Clear instructions with “Guidance for Claude”
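
For the directory-structure point, an existing `tree`-style listing can be converted to a plain indented one mechanically. The sketch below assumes the usual `tree` layout, where each nesting level is four characters wide and entries start with `├── ` or `└── `; `tree_to_indented` is a hypothetical helper name.

```python
import re

# Matches the connector that precedes each entry in `tree`-style output.
CONNECTOR = re.compile(r"[├└]── ")


def tree_to_indented(tree_text: str, indent: str = "  ") -> str:
    """Replace box-drawing prefixes with plain indentation."""
    out = []
    for line in tree_text.splitlines():
        match = CONNECTOR.search(line)
        if match is None:
            out.append(line.strip())  # e.g. a root line such as "."
            continue
        depth = match.start() // 4  # assumed: four characters per level
        out.append(indent * depth + line[match.end():])
    return "\n".join(out)


if __name__ == "__main__":
    sample = "├── src/\n│   ├── components/\n│   └── api/\n└── docs/"
    print(tree_to_indented(sample))
```

The indented form carries the same hierarchy without the line-drawing characters.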

Practical Example: Prompt Template Optimization

Before: Verbose Prompt (estimated 500 tokens)

  
You are a professional software engineer.
Please review the code I provide and point out any issues.
Please pay particular attention to the following when reviewing:

1. Please check the code quality
2. Please check if there are any security issues
3. Please check if there are any performance issues
4. Please check if it follows best practices
5. Please check if tests are sufficiently written

Please output the review results in the following format:
- First, state the overall evaluation
- Next, list the issues
- Finally, state improvement suggestions

After: Optimized Prompt (estimated 150 tokens)

  
## Role
Senior software engineer

## Task
Code review

## Focus Areas
- Code quality
- Security vulnerabilities
- Performance bottlenecks
- Best practices
- Test coverage

## Output Format
1. Overall assessment
2. Issues (prioritized)
3. Recommendations

Reduction rate: 70% (500 → 150 tokens)

Optimization Techniques

Markdown structuring: ## headings for clear sections
Bullet points: “Please check…” → list items
Remove verbose sequencing words: “First, … Next, … Finally, …” → numbered list (“1.”, “2.”, “3.”)
English keywords: Technical terms may be more token-efficient in English
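
If prompts are assembled in code, the heading-plus-bullets structure can be produced from plain data. This is a minimal sketch under that assumption; `build_prompt` is a hypothetical helper, and it renders every list as bullet points (the numbered output format above would need a small extension).

```python
from __future__ import annotations


def build_prompt(sections: dict[str, str | list[str]]) -> str:
    """Render sections as '## Heading' blocks; lists become bullet points."""
    blocks = []
    for heading, body in sections.items():
        if isinstance(body, list):
            body = "\n".join(f"- {item}" for item in body)
        blocks.append(f"## {heading}\n{body}")
    return "\n\n".join(blocks)


prompt = build_prompt({
    "Role": "Senior software engineer",
    "Task": "Code review",
    "Focus Areas": ["Code quality", "Security vulnerabilities"],
})
```

Keeping the sections as data makes it easy to add or drop a focus area without reintroducing filler sentences.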

How to Have AI Write Context-Efficient Documentation

1. Give Clear Instructions

Include instructions like the following when having AI generate documentation:

  
## Instructions Example

Please create README.md with the following requirements:

### Requirements
- Prioritize token efficiency
- Avoid verbose expressions
- Use bullet points and code blocks
- Move detailed explanations to external files (docs/)
- Target token count: Under 500 tokens

### Sections to Include
- Project overview (2-3 sentences)
- Quick start (installation and basic usage only)
- Directory structure (tree format)
- Development guide (external reference)

### Do NOT Include
- Verbose preambles ("This project is...", etc.)
- Detailed API descriptions (move to docs/api.md)
- Full license text (LICENSE reference is sufficient)

2. Provide Templates

Present the desired structure as a template for AI:

  
## Template Example

Please generate documentation following this template:

```markdown
# [Project Name]

[Single sentence description]

## Quick Start

### Installation
[Single command]

### Basic Usage
[Minimal code example]

## Directory Structure
[Tree format, with comments]

## Development
Details: [CONTRIBUTING.md](./CONTRIBUTING.md)

## License
[LICENSE](./LICENSE)
```

3. Explicitly Constrain Token Count

  
## Instructions Example

Please generate CLAUDE.md.

### Constraints
- Maximum token count: 300 tokens
- If exceeded, remove lower priority information
- Replace removed information with references to separate files (docs/project-details.md)
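
The constraint "if exceeded, remove lower priority information" can also be enforced outside the model. The sketch below is one possible approach, not a prescribed method: `Section`, `fit_to_budget`, the priority convention, and the default character-based token counter are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Section:
    heading: str
    body: str
    priority: int  # assumed convention: lower number = more important


def render(sections: list[Section]) -> str:
    return "\n\n".join(f"## {s.heading}\n{s.body}" for s in sections)


def fit_to_budget(
    sections: list[Section],
    max_tokens: int,
    count_tokens: Callable[[str], int] = lambda text: len(text) // 4,  # rough heuristic
    details_path: str = "docs/project-details.md",
) -> str:
    """Drop the least important sections until the document fits the budget."""
    kept = list(sections)
    footer = ""
    while len(kept) > 1 and count_tokens(render(kept) + footer) > max_tokens:
        kept.remove(max(kept, key=lambda s: s.priority))
        # Point to the external file once something has been removed.
        footer = f"\n\nDetails: [{details_path}](./{details_path})"
    return render(kept) + footer
```

Pass a real tokenizer as `count_tokens` when the budget has to be exact.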

4. Build Feedback Loops

  
## Prompt Example

Please reduce the token count of the following documentation:

[Document content]

### Goal
- Reduce to 50% of current token count
- Preserve the most important information
- Explain reductions made and reasons

### Optimization Methods
1. Reduce verbose expressions
2. Convert to bullet points
3. Move details to external references
4. Simplify code examples
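
A feedback loop of this kind can be driven by a small script that measures after every round. In the sketch below, `shrink` stands in for whatever sends the prompt above to a model and returns the shortened document; it is a hypothetical callable, as are `optimize_until` and its parameters.

```python
from __future__ import annotations

from typing import Callable


def optimize_until(
    text: str,
    shrink: Callable[[str], str],
    count_tokens: Callable[[str], int],
    target_ratio: float = 0.5,
    max_rounds: int = 5,
) -> tuple[str, list[int]]:
    """Call shrink repeatedly until the count reaches target_ratio of the original."""
    history = [count_tokens(text)]
    target = history[0] * target_ratio
    for _ in range(max_rounds):
        if history[-1] <= target:
            break
        candidate = shrink(text)
        tokens = count_tokens(candidate)
        if tokens >= history[-1]:
            break  # no progress; stop instead of looping on the same result
        text = candidate
        history.append(tokens)
    return text, history
```

The returned history records the token count per round, which makes it easy to see when further rounds stop paying off.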

5. Practical Prompt Examples

Save prompt templates for use in Claude Code in .claude/commands/:

  
<!-- .claude/commands/optimize-docs.md -->
Please optimize the following documentation for token efficiency:

### Optimization Criteria
- Minimize token count while maintaining information density
- Markdown structuring (headings, lists, code blocks)
- Remove verbose expressions
- Move details to external files and link to them

### Output
1. Optimized documentation
2. Token reduction rate
3. Explanation of main changes

---
[Paste document here]

Usage:

# In Claude Code
/optimize-docs

2024-2025 Trends: AI-Native Documentation

llms.txt Standard

The llms.txt standard proposed by Jeremy Howard, co-founder of Answer.AI, in September 2024 is rapidly gaining adoption[15][16][17].

Overview:

Placed in Markdown format in website root directory
Provides structured information to LLM crawlers, similar to robots.txt
Adopted by thousands of documentation sites hosted on Mintlify, including those of Anthropic and Cursor

File Structure:

  
<!-- https://example.com/llms.txt -->
# Project Name

> Brief project summary (1-2 sentences)

## Documentation

- [Getting Started](https://example.com/docs/getting-started)
- [API Reference](https://example.com/docs/api)
- [Examples](https://example.com/docs/examples)

## Optional: Full Documentation

See [llms-full.txt](https://example.com/llms-full.txt) for complete documentation.

Anthropic Implementation Examples:

https://docs.anthropic.com/llms.txt
https://docs.anthropic.com/llms-full.txt
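
A file with the structure shown above can be generated from a list of links. The following is a minimal sketch: `render_llms_txt` is a hypothetical helper, and it covers only the title, summary, and link sections from the example, not every detail of the proposal.

```python
from __future__ import annotations


def render_llms_txt(
    title: str,
    summary: str,
    sections: dict[str, list[tuple[str, str]]],
) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {title}", "", f"> {summary}"]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        lines += [f"- [{name}]({url})" for name, url in links]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Project Name",
        "Brief project summary",
        {"Documentation": [
            ("Getting Started", "https://example.com/docs/getting-started"),
            ("API Reference", "https://example.com/docs/api"),
        ]},
    ))
```

Generating the file from the same source as the site navigation keeps the index from drifting out of date.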

llm-docs-builder Tool

Tools for automatically optimizing documentation have emerged[18]:

Features:

85-95% noise reduction from HTML documents
Convert to Markdown
Automatically generate llms.txt index
Provide documentation optimized for AI crawlers
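
To give a sense of what "noise reduction from HTML" means, here is a standard-library sketch that keeps only visible text and skips page chrome. It is not how llm-docs-builder works internally, and it does not produce Markdown; the class name, function name, and the set of skipped tags are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping tags that usually hold page chrome."""

    SKIP = {"script", "style", "nav", "header", "footer"}  # assumed noise tags

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside every skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def strip_html_noise(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Comparing `len(html)` with `len(strip_html_noise(html))` on your own pages gives a first estimate of how much of a page is markup rather than content.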

CLAUDE.md Best Practices (2025)

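Best practices from Anthropic's official guidance and community write-ups[19][20]:

Conciseness: Keep CLAUDE.md short, since it is loaded into context automatically
Hierarchical structure: Place nested CLAUDE.md files in subdirectories to scope context
Prompt templates: Save reusable prompts in .claude/commands/
Git management: Commit CLAUDE.md to the repository so the team shares it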

Recommended Sections:

  
# CLAUDE.md

## Project Overview
[2-3 sentences]

## Tech Stack
[List format]

## Directory Structure
[Tree format, important directories only]

## Coding Standards
[Key points only, details in external reference]

## Guidance for Claude
[Specific instructions for code generation]

## External Resources
- [Detailed Architecture](./docs/architecture.md)
- [API Specification](./docs/api.md)
- [Contribution Guide](./CONTRIBUTING.md)

Token Reduction Examples

Token reductions measured in real projects have been reported[21]:

Use Case	Before	After	Reduction Rate
Vercel deploy monitoring	10,100 tokens	300-500 tokens	95-97%
Render log analysis	Data dependent	Concise summary	98%
Supabase project filtering	Large JSON	Target projects only	97%

Reduction Methods:

JSON filtering with jq
Removal of unnecessary metadata
Structured summary generation
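
The filtering step can be sketched in Python as well, roughly equivalent to a jq filter such as `[.[] | {name, status}]`. The response shape and field names below are hypothetical and only illustrate the idea of dropping metadata before the data reaches the model.

```python
import json


def keep_fields(records: list, fields: tuple) -> list:
    """Keep only the named top-level fields of each record."""
    return [{k: r[k] for k in fields if k in r} for r in records]


# Hypothetical API response; the field names are illustrative only.
raw = [
    {"id": "p1", "name": "web", "status": "active",
     "metadata": {"created": "2025-01-01", "region": "us-east-1", "tags": ["a", "b"]}},
    {"id": "p2", "name": "api", "status": "paused",
     "metadata": {"created": "2025-02-01", "region": "eu-west-1", "tags": []}},
]

slim = keep_fields(raw, ("name", "status"))
before = len(json.dumps(raw))
after = len(json.dumps(slim))
print(f"characters: {before} -> {after}")
```

Character counts are only a proxy here; the reduction in tokens depends on the tokenizer and on how much metadata the real payload carries.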

Summary

Key points for writing Markdown documentation that AI can read efficiently:

Writing Principles

Clarify hierarchical structure: Use appropriate heading levels
Eliminate redundancy: Reduce modifiers, duplicate expressions (30-50% reduction possible)
Use bullet points: List format over prose
Use external references: Separate details to other files
Minimize code examples: Only minimum necessary examples
Optimize tables: Concise column names, consider external references
Directory structures: Avoid tree line characters, use indentation or list format (40-67% reduction possible)
Moderate text formatting: Use only formatting with semantic meaning, avoid excessive emphasis
Simplify key-value pairs: **Title**: → ### Title or Title: (33-42% reduction possible)
Apply SSOT principle: Define information in one place only, minimize related file references (60-70% reduction possible)
File integration vs separation: Choose based on AI system
- Claude Code/Custom Instructions: Integration recommended (28-33% reduction possible)
- RAG systems: Separation recommended (up to 90% reduction possible)
Dynamic optimization and automatic integration (next-generation approach):
- Manage source files granularly
- Dynamically integrate/optimize for AI delivery (GraphRAG, llm-docs-builder, etc.)
- Choose implementation level based on project scale

Instructions When Having AI Write Documentation

Constrain token count: State the limit explicitly, e.g. “maximum 300 tokens”
Provide templates: Show desired structure as example
Present optimization criteria: “Avoid verbose expressions”, “Prioritize bullet points”, etc.
Feedback loops: Measure token count, iteratively optimize

2024-2025 Trends

llms.txt standard: New standard for AI-native documentation
Auto-optimization tools: 85-95% noise reduction possible
Hierarchical CLAUDE.md: Context management according to project structure

Practical Effects

Clean Markdown: 35% RAG search accuracy improvement, 20-30% token reduction[2]
CLAUDE.md optimization: 70-95% token reduction cases[21]
Cost reduction: Stay under 200K tokens to avoid premium pricing (2x input, 1.5x output)[4]

By utilizing these techniques, you can significantly reduce context size and costs while maintaining AI response quality.

Caveats:

The token reduction rates shown in this article are based on calculation examples or reported cases under specific conditions. Actual reduction effects may vary depending on project structure, documentation content, and AI system used. Measuring effects in your own project before adoption is recommended.

References

References corresponding to citation numbers [1]-[35] in the main text are listed in numerical order.

1. Boosting AI Performance: The Power of LLM-Friendly Content in Markdown - Webex Developers Blog https://developer.webex.com/blog/boosting-ai-performance-the-power-of-llm-friendly-content-in-markdown [Reliability: High] Explains general benefits of LLM-friendly Markdown
2. Why Your LLM Needs Clean Markdown: A Deep Dive - AnythingMD https://anythingmd.com/blog/why-llms-need-clean-markdown [Reliability: Medium-High] Data on 35% RAG accuracy improvement, 20-30% token reduction
3. Why Markdown is the best format for LLMs - Wetrocloud, Medium (2024) https://medium.com/@wetrocloud/why-markdown-is-the-best-format-for-llms-aa0514a409a7 [Reliability: Medium]
4. Context windows - Claude Docs - Anthropic Official Documentation https://docs.claude.com/en/docs/build-with-claude/context-windows [Reliability: High] Context windows and premium pricing
5. ChatGPT Context Window and Token Limit - 16x Prompt (2024) https://prompt.16x.engineer/blog/chatgpt-context-window-token-limit [Reliability: Medium-High]
6. Markdown Prompting In AI Prompt Engineering Explained - Applied AI Tools https://appliedai.tools/prompt-engineering/markdown-prompting-in-ai-prompt-engineering-explained-examples-tips/ [Reliability: Medium-High]
7. Let’s Build the GPT Tokenizer: A Complete Guide to Tokenization in LLMs - fast.ai (2024) https://www.fast.ai/posts/2025-10-16-karpathy-tokenizers.html [Reliability: High] Explanation by Andrej Karpathy
8. Complete Guide to LLM Tokenization - LLM Calculator (2024) https://llm-calculator.com/blog/complete-guide-to-tokenization/ [Reliability: Medium-High]
9. Cutting Cost and Enhancing Performance: Minifying Markdown Tables - Budi Syahiddin, Government Digital Products Singapore (2024) https://medium.com/singapore-gds/cutting-cost-and-enhancing-performance-minifying-markdown-tables-to-improve-token-efficiency-in-af488a784fd5 [Reliability: High]
10. How to Optimize Token Efficiency When Prompting - Portkey.ai https://portkey.ai/blog/optimize-token-efficiency-in-prompts/ [Reliability: Medium-High]
11. LLM prompt optimization: Reducing tokens usage - Saulius Šaulys, Medium (2024) https://medium.com/@sauliusaulys/llm-prompt-optimization-reducing-tokens-usage-343f5de178a5 [Reliability: Medium] 30-50% reduction data
12. Token optimization: The backbone of effective prompt engineering - IBM Developer https://developer.ibm.com/articles/awb-token-optimization-backbone-of-effective-prompt-engineering/ [Reliability: High]
13. Claude Code Best Practices - Anthropic Official https://www.anthropic.com/engineering/claude-code-best-practices [Reliability: High]
14. Cutting Cost and Enhancing Performance: Minifying Markdown Tables to Improve Token Efficiency in RAG - Government Digital Products Singapore (2024) https://medium.com/singapore-gds/cutting-cost-and-enhancing-performance-minifying-markdown-tables-to-improve-token-efficiency-in-af488a784fd5 [Reliability: High]
15. What is llms.txt? Breaking down the skepticism - Mintlify Blog (2024) https://www.mintlify.com/blog/what-is-llms-txt [Reliability: High]
16. LLMs.txt Explained - TDS Archive, Medium (2024) https://medium.com/data-science/llms-txt-explained-414d5121bcb3 [Reliability: Medium-High]
17. Simplifying docs for AI with /llms.txt - Mintlify Blog (2024) https://www.mintlify.com/blog/simplifying-docs-with-llms-txt [Reliability: High]
18. Announcing llm-docs-builder: An Open Source Tool for Making Documentation AI-Friendly - Maciej Mensfeld (2025) https://mensfeld.pl/2025/10/llm-docs-builder/ [Reliability: Medium-High] 85-95% noise reduction data
19. Claude Code Best Practices - Anthropic Official (2025) https://www.anthropic.com/engineering/claude-code-best-practices [Reliability: High]
20. My 7 essential Claude Code best practices for production-ready AI in 2025 - eesel AI (2025) https://www.eesel.ai/blog/claude-code-best-practices [Reliability: Medium-High]
21. Optimizing Token Efficiency in Claude Code Workflows - Pierre-Emmanuel Féga, Medium (2025) https://medium.com/@pierreyohann16/optimizing-token-efficiency-in-claude-code-workflows-managing-large-model-context-protocol-f41eafdab423 [Reliability: Medium] 95-98% reduction examples
22. Marking Up the Prompt: How Markdown Formatting Influences LLM Responses - Neural Buddies (2024) https://www.neuralbuddies.com/p/marking-up-the-prompt-how-markdown-formatting-influences-llm-responses [Reliability: Medium-High] Analysis of Markdown formatting’s influence on LLM responses
23. Markdown Best Practices for Technical Writers - Markdown Toolbox https://www.markdowntoolbox.com/blog/markdown-best-practices-for-technical-writers/ [Reliability: Medium-High] Best practices for avoiding excessive formatting
24. A Guide to Markdown Styles in LLM Responses - DreamDrafts, Medium (2024) https://medium.com/@sketch.paintings/a-guide-to-markdown-styles-in-llm-responses-ed9a6e869cf4 [Reliability: Medium] Effective use of Markdown styles
25. Markdown is 15% more token efficient than JSON - OpenAI Developer Community (2024) https://community.openai.com/t/markdown-is-15-more-token-efficient-than-json/841742 [Reliability: High] Token efficiency comparison with measured data
26. How to list key/value pairs in a markdown - Stack Overflow https://stackoverflow.com/questions/28429750/how-to-list-key-value-pairs-in-a-markdown [Reliability: Medium-High] Practical discussion of key-value pair expression in Markdown
27. Single source of truth - Wikipedia https://en.wikipedia.org/wiki/Single_source_of_truth [Reliability: High] Definition and background of SSOT principle
28. About the Single Source of Truth (SSOT) and Don’t Repeat Yourself (DRY) principles - Webel IT Australia https://www.webel.com.au/node/889 [Reliability: Medium-High] Explanation of relationship between SSOT and DRY principles
29. Cross-references and linking - Google developer documentation style guide - Google for Developers https://developers.google.com/style/cross-references [Reliability: High] Best practices for cross-references (link text simplification, cognitive load reduction, etc.)
30. Breaking the LLM’s Token Limit: Introducing the Modular AI Systems Architecture - Amir Ghasemi, Medium (2024) https://medium.com/@amir.ghm/breaking-the-llms-16k-token-limit-introducing-the-modular-ai-systems-architecture-5a23b37139ac [Reliability: Medium] Overcoming token limits with modular AI systems architecture
31. Chunking for RAG: best practices - Unstructured (2024) https://unstructured.io/blog/chunking-for-rag-best-practices [Reliability: High] Chunking strategies for RAG systems (semantic, fixed-size, document-based, etc.)
32. Claude Code Best Practices - Anthropic Official (2025) https://www.anthropic.com/engineering/claude-code-best-practices [Reliability: High] CLAUDE.md hierarchical structure, @import syntax, etc.
33. Graph Retrieval-Augmented Generation: A Survey - arXiv (2024) https://arxiv.org/abs/2408.08921 [Reliability: Medium-High] Comprehensive survey paper on GraphRAG. Methods using knowledge graphs for RAG, effectiveness for multi-hop questions, etc. Note: arXiv paper (pre-print before peer review). Cited for technical overview of GraphRAG
34. Level Up Your LLMs: Dynamic Context Switching for Smarter, Faster Inference - Yair Stern, Medium (2024) https://medium.com/@yairms.il/level-up-your-llms-dynamic-context-switching-for-smarter-faster-inference-4986a49269d1 [Reliability: Medium] Optimizing LLM inference through dynamic context switching
35. Announcing llm-docs-builder: An Open Source Tool for Making Documentation AI-Friendly - Maciej Mensfeld (2025) https://mensfeld.pl/2025/10/llm-docs-builder/ [Reliability: Medium-High] 85-95% token reduction, automatic llms.txt generation, HTML noise removal, etc.

About Medium Articles:

Medium articles ([11][21][24][30][34]) are cited as author case studies, but rated [Reliability: Medium] as they haven’t gone through peer review. For token reduction rates and optimization methods shown in this article, measuring effects in your own environment before adoption is recommended.

Other References (Not Numbered in Text)

Resources consulted during article creation but not directly cited in the text.

Cognitive Load Theory: Methods to Manage Working Memory Load - Fred Paas, Jeroen J. G. van Merriënboer (2020) https://journals.sagepub.com/doi/10.1177/0963721420922183 [Reliability: High] Academic paper on cognitive load theory
Creating the information architecture for your documentation - KnowledgeOwl Blog https://blog.knowledgeowl.com/blog/posts/information-architecture/ [Reliability: Medium-High]
awesome-claude-code: A curated list - GitHub https://github.com/hesreallyhim/awesome-claude-code [Reliability: Medium] Community resource

Notes:

On Citation Accuracy: The information cited in this article has been verified through the following methods:

Confirmation of official documentation (Anthropic, OpenAI, etc.)
Cross-verification through multiple independent sources (technical blogs, specialist media)
Priority given to 2024-2025 latest information

Some technical blogs and Medium articles have been cited after confirming author expertise and data backing, but are rated “Medium” or “Medium-High” reliability compared to official documentation and academic papers.


This post is licensed under CC BY 4.0 by the author.
