How to Use ADRs in DDD: Trade-offs and a Situation-by-Situation Playbook

  • Audience: Engineers and tech leads running a project with DDD (Domain-Driven Design) who are unsure whether to use ADRs to record design decisions, or who want to nail down how to use them. If you work with AI-assisted development, this article also points out where the leverage is.
  • Prerequisites: The basic concepts of DDD (Bounded Context, Ubiquitous Language). It helps to know that an ADR (Architecture Decision Record) is “a technique for capturing design decisions, one decision per file.”
  • Reading time: about 22 min

Overview

How should you combine DDD and ADRs? Approaching this as “they pair well, so let’s record everything” is a recipe for failure. DDD and ADRs are not an unconditionally good match. An ADR is a snapshot format: you make a point decision and lock it in. DDD’s strategic design (boundaries, the Ubiquitous Language) is a model you grow as you learn. The granularity clashes, and so does the underlying philosophy.

But there is a layer where the two genuinely click. Narrow ADRs down to decisions that are “strategic, expensive, and likely to be questioned later”—how a boundary is drawn, how contexts relate to each other, what alternatives were rejected—and they fill exactly the gap DDD leaves open. DDD-native recording tools (Context Map, Bounded Context Canvas, and so on) are good at expressing “what things look like right now (the What),” but they are thin on capturing “why we did it this way and what we gave up (the Why).”

This article lays out the upsides and downsides of using ADRs in DDD honestly, and then shows concrete usage: what to make an ADR and what not to, how to divide labor with constraint files like AGENTS.md, and how to vary your approach by situation (new large-scale projects, small ones, bug fixes, and enhancements). The one-line takeaway: confine ADRs to strategic, heavy decisions, and leave the current state to code and DDD tools. Hold that line, and ADRs will neither proliferate into hollow boilerplate nor rot unread after you write them. One more thing: the management work itself (raising ADRs, formatting, handling supersedes, consistency checks) can be handed to AI. What humans must keep their hands on is the strategic decisions and verifying that the Why holds up.

The upsides of DDD + ADR

ADRs offer DDD four main kinds of value.

1. The “why” behind boundaries and relationships survives. The first thing lost in DDD is the reasoning: “why did we draw the boundary here?” and “why did we choose this inter-context relationship (Conformist, ACL, and so on)?” The boundary itself survives in the directory structure of the code, but the rejected alternatives and the constraints of the day do not. Six months later, when someone asks “wait, why did we split cleaning and front-desk work into separate contexts again?”, an ADR lets you recall the original judgment instantly.

2. Redrawing gets faster. DDD boundaries are hypotheses; revisiting them as you go is the intended way to work. ADRs follow an append-only principle: “don’t delete; supersede with a new ADR (superseded by ADR-NNN).” [1][2] When you redraw a boundary, having “why we set the original boundary” on record means you can decide by looking only at the delta between the assumptions back then and the situation now. Without it, you start with the archaeology of “why on earth did we…,” and redrawing becomes heavy.

3. It becomes shared ground for handoffs and reviews. New team members, code reviewers, and people taking over a production rollout no longer have to ask “why is it like this?” out loud every time. Being able to look up the reason for a decision in a document lowers review friction. [2]

4. (For AI development) AI stops silently overturning past decisions. When you have an AI assistant write code, it will fluently propose “a better design” and quietly rebuild a boundary you fought hard to draw. Add one line to CLAUDE.md or AGENTS.md (“check the ADRs before changing the design”) and make the decisions referenceable as ADRs, and the AI can notice “this boundary was deliberately set this way in ADR-005” and think twice before overwriting it. [2] The decision history works as a soft guardrail against the AI.

The downsides and the friction

On the other side, let me be honest about the friction that ADRs introduce into DDD. Not knowing this and going “let’s ADR everything” is how you get hurt.

1. The granularity is off. An ADR “finalizes and records a point decision, one decision per file.” But DDD’s strategic design is “a model that grows continuously.” The finality of an accepted status fundamentally doesn’t sit well with DDD’s stance of “this is still a hypothesis.” Turn individual Ubiquitous Language terms or the fine-grained fields of an aggregate into ADRs, and every change triggers a supersede, until the ledger is buried in noise.

2. Double bookkeeping drifts. Write the current state (“here’s how it is now”) into an ADR, and the ADR goes stale every time the code changes. Conversely, write the rationale into AGENTS.md—which is supposed to hold current constraints—and the file bloats. Neglect the sync between ADRs and the code (and constraint files), and stale descriptions diverge from reality, creating more confusion than clarity.

3. Writing them takes effort. Writing an ADR tends to become “extra work nobody has time for.” [3] In moments where you’re prioritizing project velocity, raising an ADR for every decision stops the main flow.

4. Overdo it and it becomes ceremony. Strain toward “every decision goes in an ADR,” and the count balloons until nobody reads them. The moment you make the number of ADRs a KPI, hollow records get mass-produced (Goodhart’s law [4]).

Every one of these downsides can be avoided by narrowing the scope where you use ADRs. The next section draws that line.

Dividing labor with DDD-native tools: What vs. Why

A natural objection: “Can’t we just record decisions with DDD’s own tools, then?” In fact, DDD comes with plenty of recording instruments. But almost all of them express ‘what things look like right now (the What)’ and don’t fill the Why gap that ADRs cover.

| DDD-native tool | What it captures | The Why gap |
| --- | --- | --- |
| Context Map [5] | A current diagram of inter-boundary relationships (Conformist / ACL / Partnership, etc.) | “Why we chose that relationship, and what we rejected” is not captured |
| Bounded Context Canvas [6] | Each boundary’s responsibilities, UL, business rules, assumptions, open questions | Captures what the rules are, but holds no rationale, no rejected options, no change history |
| Core Domain Charts [7] | The strategic classification into core / supporting / generic | A visualization of the current state, not the reasoning or history behind a decision |
| Ubiquitous Language glossary | The current agreement on terms | “Why we settled on that word” is not captured |

The one tool that squarely confronts the why is Cyrille Martraire’s Living Documentation [8], but that is an “embed the decision rationale into the code via annotations” approach, a different shape from an ADR’s “independent, chronological decision ledger.” (Martraire himself takes the position of subsuming ADRs rather than replacing them.)

So the division of labor looks like this:

Context Map / Canvas / Core Domain Charts / glossary  →  the current model (What)
code + tests                                          →  the current behavior (What)
ADR                                                   →  the why, rejected options, history (Why)

A formulation discussed on Zenn [2] puts it crisply: “ADRs preserve the why, but they don’t answer what’s happening now. The current spec should live in code and tests.” DDD tools own the current structure, code owns the current behavior, and ADRs own the reasoning behind decisions. Not mixing the roles is the foundation for avoiding the downsides.

Concrete usage

What to make an ADR, and what not to

This is the most important line to draw. Keep the criteria simple. [2]

Make it an ADR:

  • Decisions that are expensive to reverse (how a boundary is drawn, choice of data store, etc.)
  • Decisions where you seriously compared multiple alternatives (the rejected options are where the value is)
  • Decisions whose reasoning will likely stop being self-evident in three months
  • Decisions that affect a public spec or a context boundary

Don’t make it an ADR:

  • Bug fixes, simple refactors
  • Obvious technical choices (where there’s nothing to debate)
  • Reversible implementation details (things you can change freely later)
  • Local decisions that stay within a single module

When you’re unsure, cut the decision with this flow.

flowchart TB
    A["A design decision arises"]
    B{"Reversible and<br>local?"}
    C["No ADR needed<br>(code and tests suffice)"]
    D{"Compared multiple alternatives<br>or will the reason stop being<br>self-evident in 3 months?"}
    E["Write an ADR<br>(record rejected options and why)"]
    A --> B
    B -->|"Yes"| C
    B -->|"No"| D
    D -->|"No"| C
    D -->|"Yes"| E
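
The flow above can also be written as a small predicate. This is only a sketch for illustration: the `Decision` class and its field names are assumptions made up here, not part of any ADR tooling, and the three-month horizon is taken directly from the flowchart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Hypothetical description of a design decision (field names are assumptions)."""
    reversible: bool                # cheap to undo later
    local: bool                     # stays inside a single module
    compared_alternatives: bool     # several options were seriously weighed
    reason_fades_in_3_months: bool  # the rationale will stop being self-evident


def needs_adr(d: Decision) -> bool:
    """Mirror the flowchart: reversible, local decisions exit at the first gate."""
    if d.reversible and d.local:
        return False  # code and tests suffice
    # Second gate: rejected options or a fading rationale justify an ADR.
    return d.compared_alternatives or d.reason_fades_in_3_months


if __name__ == "__main__":
    rename_helper = Decision(True, True, False, False)
    split_contexts = Decision(False, False, True, True)
    print(needs_adr(rename_helper))   # False
    print(needs_adr(split_contexts))  # True
```

Note that a reversible, local decision returns False even when alternatives were compared, because the first gate of the flowchart takes precedence.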

Format

Michael Nygard’s classic format [1], or its lightweight variant (MADR-lite) [2], is plenty.

# ADR-NNN: <decision title>
## Status: accepted / superseded by ADR-MMM
## Context: in what situation, and what had to be decided
## Decision: what was decided
## Alternatives considered: the options examined and rejected, and why
## Consequences: upsides / downsides / trade-offs / follow-ups

In a DDD context, the section that pays off most is Alternatives considered. “Why Conformist instead of Partnership?” “Why split into two aggregates instead of merging into one?”—it’s precisely the reasons for rejection that speed up later redraw decisions.
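
To make the point about Alternatives considered concrete, here is a minimal sketch that renders the template above and refuses to produce an ADR with no rejected options. The `Adr` class, the three-digit zero-padded numbering, and the sample hotel content are all assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Adr:
    """Hypothetical in-memory form of the template (names are assumptions)."""
    number: int
    title: str
    context: str
    decision: str
    alternatives: list[str] = field(default_factory=list)
    consequences: str = ""
    status: str = "accepted"

    def render(self) -> str:
        # The rejected options are where the value is, so require at least one.
        if not self.alternatives:
            raise ValueError("record at least one rejected option and why")
        rejected = "\n".join(f"- {a}" for a in self.alternatives)
        return (
            f"# ADR-{self.number:03d}: {self.title}\n"
            f"## Status: {self.status}\n"
            f"## Context: {self.context}\n"
            f"## Decision: {self.decision}\n"
            f"## Alternatives considered:\n{rejected}\n"
            f"## Consequences: {self.consequences}\n"
        )


if __name__ == "__main__":
    adr = Adr(
        number=5,
        title="Split front desk and cleaning into separate contexts",
        context="Both teams use the word 'room status' with different meanings.",
        decision="Model front desk and cleaning as two Bounded Contexts.",
        alternatives=["Single shared context: rejected, the term collision kept leaking into code."],
        consequences="Needs a translation step between the two contexts.",
    )
    print(adr.render())
```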

Dividing labor with AGENTS.md / CLAUDE.md

If you do AI-assisted development, be deliberate about how you split work between ADRs and constraint files. AGENTS.md hands the AI the current constraints (the What) every turn (push model), while ADRs preserve the reasoning (the Why) to be pulled when needed (pull model). Ubiquitous Language prohibitions and invariants go into AGENTS.md short and in the present tense; the history behind each decision and its rejected options go into an ADR. Mix the two, and either AGENTS.md bloats with history or the ADR drifts from the present.

Why this division works in the AI era, and what is actually happening when “an AI reads an ADR” in the first place, is covered with supporting evidence in a companion essay, Why ADRs Are Being Re-evaluated in the AI Era. This article focuses on usage.

How to use supersede

When you redraw a boundary or a relationship, don’t edit the body of the original ADR; supersede it with a new one. Change only the original ADR’s Status to “superseded by ADR-NNN,” and write into the new ADR “why we’re redrawing (the delta between the assumptions of the time and now).” This keeps the decision history and prevents “relitigating the same debate.”
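
As a rough sketch of the mechanical half of a supersede, the function below rewrites only the Status line of an ADR written in the template shown earlier and leaves everything else untouched. The function name and the zero-padded number format are assumptions; writing the new ADR's reasoning remains a separate, human step.

```python
import re

# Matches the single "## Status: ..." line of the template shown earlier.
STATUS_LINE = re.compile(r"^## Status:.*$", flags=re.MULTILINE)


def mark_superseded(old_adr_text: str, new_number: int) -> str:
    """Return the old ADR with only its Status line changed (hypothetical helper)."""
    new_status = f"## Status: superseded by ADR-{new_number:03d}"
    updated, count = STATUS_LINE.subn(new_status, old_adr_text, count=1)
    if count == 0:
        raise ValueError("no '## Status:' line found")
    return updated


if __name__ == "__main__":
    old = (
        "# ADR-005: Split front desk and cleaning\n"
        "## Status: accepted\n"
        "## Context: original assumptions stay on record\n"
    )
    print(mark_superseded(old, 12))
```

Keeping the Context, Decision, and Alternatives sections byte-for-byte identical is what preserves the original reasoning for the later delta comparison.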

Hand ADR management to AI—but keep verifying the Why yourself

Where decisions are produced at high speed, the effort of writing ADRs becomes a bottleneck. Here, the realistic move is to be bold and hand the “management” of ADRs to AI. The tasks you can delegate are broad:

  • Raising the record: detecting from a commit or PR that “this is a decision worth recording” and proposing it be turned into an ADR
  • Generating the first draft: filling in Context / Decision / Consequences from the diff (the change and the commit message) [3]
  • Formatting and numbering: unifying the format, assigning sequential numbers, attaching metadata such as creation timestamp, model, and session ID (there are even frameworks like Agent Decision Records [9] that standardize this)
  • Handling supersedes: when a boundary is redrawn, updating the old ADR’s Status and wiring up the cross-links
  • Consistency checks: detecting divergence between ADRs and the current code or AGENTS.md
  • Answering references: when asked “why did we go with X?”, pulling up the relevant ADR and answering

Once AI runs this far, an ADR changes from “a document left to rot because writing it is a chore” into a ledger that maintains itself automatically. From detecting a decision through numbering, supersede, and consistency checks, AI is faster and more thorough at the routine work.
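
Of the tasks listed, the consistency check is the easiest to sketch without any AI at all. The example below assumes ADR statuses have already been collected into a mapping from ADR number to status text; the function name, the message wording, and the rule that a superseding ADR must carry a higher number are illustrative assumptions that follow from sequential, append-only numbering.

```python
import re

SUPERSEDED_BY = re.compile(r"superseded by ADR-(\d+)")


def find_broken_supersedes(statuses: dict[int, str]) -> list[str]:
    """Report supersede links that point nowhere or backwards (hypothetical check)."""
    problems = []
    for number, status in sorted(statuses.items()):
        match = SUPERSEDED_BY.search(status)
        if match is None:
            continue  # accepted ADRs have no link to verify
        target = int(match.group(1))
        if target not in statuses:
            problems.append(f"ADR-{number:03d} points to missing ADR-{target:03d}")
        elif target <= number:
            problems.append(f"ADR-{number:03d} is superseded by non-newer ADR-{target:03d}")
    return problems


if __name__ == "__main__":
    ledger = {
        1: "accepted",
        2: "superseded by ADR-005",
        3: "superseded by ADR-009",
        5: "accepted",
    }
    for problem in find_broken_supersedes(ledger):
        print(problem)
```

A check like this covers only the ledger's internal links; detecting divergence between an ADR and the current code or AGENTS.md needs more context than a regular expression can provide.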

But there’s one line you must not let AI cross. What AI can write from a diff goes only as far as “what changed.” “Why that option was chosen and the others rejected” is a human judgment that doesn’t appear in the diff, and if you make AI write it, it fills the gap with guesswork. This is the most dangerous form of the downside described earlier: a plausible-but-false Why sails through review, gets fixed in the ledger, and later handoff decisions are made on the premise of that false Why. So draw the line this way—management (raising, formatting, numbering, supersede, consistency checks, references) goes to AI; the strategic decision itself and verifying that the Why holds up stay with humans. AI runs the ledger; humans supervise only the crucial points. The technical background and limits of this approach are covered in detail in the companion essay.

Writing it so review doesn’t become the bottleneck

Once you set things up as “AI manages, humans verify the Why,” human review can become the new bottleneck—because the more decisions get mass-produced, the more ADRs there are to verify. The key is to make the AI write in a form that’s easy to review. Review speed depends far more on “how it’s written” than on the content.

There’s empirical support (for PR descriptions rather than ADRs as such). A study analyzing 18,256 AI-generated PR descriptions found that descriptions written by AI and completed by a human had shorter review times and were more likely to be merged. [10] On the flip side, LLMs left to their own devices tend to drift verbose (verbosity bias), and merely asking them to “be concise” doesn’t help much. [11] So the realistic approach is to constrain them with structure. Concretely, make the AI follow this form:

  • Fix the structure. Don’t let it free-write; make it follow a template. The Y-statement [12], which structures a decision into a single sentence, is a great example: “In the context of ⟨context⟩, facing ⟨concern⟩, we decided for ⟨option⟩ and against ⟨rejected option⟩, to achieve ⟨goal⟩, accepting that ⟨trade-off⟩.” Context, chosen option, rejected option, and trade-off all fit in one sentence.
  • Put the conclusion and reasoning up front. Make it state “what and why” in the first one or two sentences, so reviewers don’t have to infer the reasoning.
  • List rejected options one line each. What a human verifies for the Why is exactly this single point: whether “the rejected options and their reasons” hold up. Bullet them out, and review need only look there.
  • Push the details out to links. Keep the body short; relegate long discussions and the relevant code to link references.
  • Don’t adopt the AI’s first draft at 100%. Treat the why part as something a human fills in. In the study above, what shortened review time was precisely the cases where a human edited the AI’s draft. [10]

In short, have the AI write in a form that focuses human eyes on the single point of Why verification. If one line—the Y-statement’s “against (rejected option)”—is enough to judge validity, then even with management delegated to AI, human review is unlikely to become the bottleneck. The quantitative evidence on the code-review side—that humans are more likely to miss things when the review target is large or verbose—is collected in the companion essay.
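
One way to enforce the structure is to generate the Y-statement from named slots instead of asking for free text. The sketch below is an assumption-laden illustration: the function name, the keyword parameters, and the sample values are invented here, and the sentence pattern is the one quoted in the list above.

```python
def y_statement(*, context: str, concern: str, option: str,
                rejected: str, goal: str, trade_off: str) -> str:
    """Fill the Y-statement pattern; every slot is mandatory (hypothetical helper)."""
    slots = {
        "context": context, "concern": concern, "option": option,
        "rejected": rejected, "goal": goal, "trade_off": trade_off,
    }
    empty = [name for name, value in slots.items() if not value.strip()]
    if empty:
        # An empty "rejected" slot is exactly what a reviewer needs to catch.
        raise ValueError(f"missing slots: {', '.join(empty)}")
    return (
        f"In the context of {context}, facing {concern}, "
        f"we decided for {option} and against {rejected}, "
        f"to achieve {goal}, accepting that {trade_off}."
    )


if __name__ == "__main__":
    print(y_statement(
        context="the hotel-reservation system",
        concern="conflicting meanings of 'room status'",
        option="separate front-desk and cleaning contexts",
        rejected="one shared context",
        goal="an unambiguous Ubiquitous Language per team",
        trade_off="a translation step is needed between contexts",
    ))
```

Because the rejected option is a required argument, a draft that skips it fails before it ever reaches a reviewer.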

Tailoring by situation

Even with the same DDD + ADR setup, you should vary the thickness of your ADRs by the nature of the project. Here it is across four situations.

New, large-scale projects

This is where you use ADRs most heavily. Once you’ve nailed down the strategic design with something like Event Storming, raise strategic ADRs from the results—the split of boundaries, the important Ubiquitous Language decisions, inter-context relationships, the technology-stack selection. That’s a few to a dozen-plus. A companion article that demonstrates ten ADRs running alongside a seven-phase build on a fictional hotel-reservation system is a concrete example of this situation.

Small projects

If you only have one or two boundaries, a handful of ADRs is enough. Use MADR-lite to lightly record things like “why we split into these two” and “why we chose this library.” Try to write ADRs as thick as a large project’s here, and only the downsides (effort, over-production) kick in. The notion that “full spec-driven development is overkill; just lightly record the reasons for decisions” [2] is exactly the right fit at this scale.

Bug fixes

As a rule, don’t write an ADR. A bug fix usually hits one of “reversible, obvious, local,” and falls into “not needed” on the decision flow above. The exception is when the root cause of the bug lies in a past design decision, and you’re changing that decision itself. Then you record “why the original design was a problem” in an ADR—and in most cases, that’s a supersede of an existing ADR. What’s worth recording is not that you “fixed it” but that you “changed the design direction.”

Enhancements and feature additions

This branches on whether a boundary moves.

  • The boundary doesn’t move (you’re just adding a feature within an existing context) → No ADR needed. Code and tests suffice.
  • The boundary moves (you add a new context, or redraw an existing boundary) → Write an ADR. A new context means a new ADR; redrawing a boundary means a supersede. In the demonstration article above, the decision to stand up a new “group booking” context during the beta run (a new ADR) and the decision to redraw the front-desk / cleaning boundary (a supersede) are exactly this situation.

The deciding axis is consistent: “Is this a decision that touches the domain’s structure, or one that stays closed within the implementation?” Only the former is worth an ADR.
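
The bug-fix and enhancement rules above reduce to a small lookup from the kind of change to the ADR action. The enum members and the three action labels below are hypothetical names chosen for this sketch; the mapping itself restates the rules in this section, including the "in most cases a supersede" simplification for design-level bug fixes.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical change categories drawn from the situations above."""
    BUG_FIX = "ordinary bug fix"
    DESIGN_ROOT_CAUSE_FIX = "bug fix that changes a past design decision"
    FEATURE_IN_CONTEXT = "feature inside an existing context"
    NEW_CONTEXT = "new bounded context"
    BOUNDARY_REDRAW = "existing boundary redrawn"


# "none" = code and tests suffice, "new" = write a new ADR,
# "supersede" = new ADR plus a Status update on the old one.
ADR_ACTION = {
    Change.BUG_FIX: "none",
    Change.DESIGN_ROOT_CAUSE_FIX: "supersede",
    Change.FEATURE_IN_CONTEXT: "none",
    Change.NEW_CONTEXT: "new",
    Change.BOUNDARY_REDRAW: "supersede",
}


def adr_action(change: Change) -> str:
    """Look up what the ADR ledger needs for a given kind of change."""
    return ADR_ACTION[change]


if __name__ == "__main__":
    for change in Change:
        print(f"{change.value}: {adr_action(change)}")
```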

Conclusion

DDD and ADRs are not an unconditionally good match. The granularity clashes, and so does the philosophy. But confine them to strategic, heavy decisions, and ADRs fill exactly DDD’s weak spot—that the “why” doesn’t survive.

  • Upsides: the reasoning behind boundaries and relationships survives / redrawing is faster / shared ground for handoffs and reviews / (for AI development) a guardrail against the AI.
  • Downsides: granularity mismatch / drift from double bookkeeping / the effort of writing / ceremony from over-production. All avoidable by “narrowing the scope.”
  • Division of labor: the current state (What) goes to code and DDD tools (Context Map, Canvas, etc.); the reasoning behind decisions (Why) goes to ADRs. Don’t mix them.
  • What to make an ADR: only decisions that are expensive to reverse / compared multiple alternatives / whose reason won’t be self-evident later / that affect a boundary. Don’t write them for bug fixes, obvious choices, or reversible implementation details.
  • By situation: new large-scale, heavy; small, a few; bug fixes, as a rule none; enhancements, only when a boundary moves.

Writing an ADR is not an end in itself. Preserve the reasoning behind strategic decisions in a form that humans and AI can pull up later—narrow to that single point, and ADRs become a strong ally in DDD development.

References

References corresponding to the citation numbers in the text, in numerical order.

  1. Documenting Architecture Decisions - Michael Nygard, Relevance / Cognitect (2011-11-15). The origin of ADRs. The Status / Context / Decision / Consequences format. Centralized as an official hub at adr.github.io. [Reliability: High]

  2. A Prescription for SDD Fatigue: Recording Architectural Decisions with ADRs in the AI Era - Kosk, Zenn (2026). “ADRs preserve the why but don’t answer now,” criteria for when to write and when not to, MADR-lite, supersede, and the CLAUDE.md rule “check the ADRs before changing the design.” [Reliability: Medium] (practitioner blog)

  3. From Stale Docs to Living Architecture: Automating ADRs with GitHub + LLM - Iraj Hedayati, Medium (2025-09-14). A workflow where an LLM generates an ADR first draft from a PR diff, and a human reviews and refines it. [Reliability: Medium] (practitioner blog)

  4. An Implementation Guide for ADRs / Pitches / Kickoff Memos: From “Why Write It” to Operational Design - This blog. General operational guidance for ADRs (one decision per file, immutable numbering, recording “what we won’t do,” the over-production problem, the Goodhart runaway of making the count a KPI).

  5. Context Mapping in Domain-Driven Design - Avanscoperta (the DDD community). Visualizing and documenting the relationships between Bounded Contexts (Conformist / ACL / Partnership, etc.). Captures the relationship patterns but not the reasoning behind the choice. [Reliability: Medium-High]

  6. Bounded Context Canvas - ddd-crew (Nick Tune and others). A template for designing and documenting each boundary. It has Business Decisions / Assumptions / Open Questions fields, but does not explicitly retain the reasoning, rejected options, or change history of decisions. CC BY 4.0. [Reliability: Medium-High]

  7. Core Domain Charts (Nick Tune) - Nick Tune (commentary: esilva.net, 2023). A strategic tool that visualizes capabilities as core / supporting / generic against complexity and differentiation. A visualization of the current state, not a decision log. [Reliability: Medium]

  8. Living Documentation: Continuous Knowledge Sharing by Design - Cyrille Martraire, Pearson / Addison-Wesley (2019). ISBN 9780134689418. Covers a technique for retaining decision rationale via annotations, developing the idea that “documentation should live on the artifact itself, the code.” Takes the position of subsuming ADRs rather than replacing them. [Reliability: High] (single-author book by an expert)

  9. Agent Decision Records (AgDR) - me2resh, GitHub (2025). An extended standard for recording an AI agent’s autonomous decisions as ADRs. Author = AI, required metadata (model identifier, session ID, timestamp), status = proposed / executed / superseded, automated via hook/skill/prompt. An advanced form of the practice of handing ADR management to AI. [Reliability: Medium]

  10. Generative AI for Pull Request Descriptions: Adoption, Impact, and Developer Interventions - Tao Xiao, Hideaki Hata, Christoph Treude, Kenichi Matsumoto, PACMSE (2024, DOI: 10.1145/3643773). Analyzed 18,256 AI-generated PR descriptions. Descriptions written by AI and completed by a human had shorter review times and were more likely to be merged. Note: these are PR descriptions, not ADRs (the inference to ADRs is by analogy). [Reliability: High]

  11. Concise Thoughts: Impact of Output Length on LLM Reasoning (and other research on LLM output-length control) - (2024-). LLMs have a verbosity bias (a systematic bias toward being verbose) and don’t follow length instructions well. For concision, constraining with structure (a template) is more effective than the instruction “be concise.” [Reliability: Medium-High]

  12. Architecture Decision Record Template: Y-Statements - Olaf Zimmermann (SATURN 2012). An ADR format that structures a decision’s context, chosen option, rejected option, and trade-off into a single sentence (one paragraph). It originated from the requirement “can you fit each decision on one slide?” A major template alongside MADR and Nygard. [Reliability: Medium-High]

This post is licensed under CC BY 4.0 by the author.