The Tight Coupling Trap of AI Pair Programming: Understanding Technical Debt Through Coupling Balance

  • Target Audience: Software engineers, architects, developers using AI tools (GitHub Copilot, Cursor, Claude, etc.) in their work
  • Prerequisites: Object-oriented programming, basic software design concepts
  • Reading Time: 20 minutes

Overview

AI pair programming tools like GitHub Copilot, Cursor, and ChatGPT dramatically improve development speed, but they can also let technical debt accumulate unnoticed. A 2025 GitClear study analyzing 211 million lines of code changes found that, after AI adoption, duplicated code blocks increased 4-fold and refactoring decreased by 60% [1].

This article uses the theoretical framework from Vlad Khononov’s “Balancing Coupling in Software Design” [2] to analyze the “tight coupling” problem in AI-generated code. Furthermore, we use Cynefin theory to clarify areas that should be delegated to AI versus areas requiring human judgment, and propose practical prompt design techniques.

By reading this article, you can systematically understand quality issues in AI-generated code and learn practical methods for sustainable AI-collaborative development.

Note:

The code examples and prompt templates in this article are illustrative examples for explanation purposes and have not been executed to verify their operation. When using them in actual projects, please conduct appropriate testing and verification, and customize according to project characteristics.

The Reality of AI-Generated Code in Data

GitClear 2025 Report: Shocking Numbers

GitClear published a large-scale study in January 2025 investigating the impact of AI assistants on code quality [1]. This study analyzed 211 million lines of code changes over 5 years from 2020 to 2024 from repositories owned by Google, Microsoft, Meta, and multiple large enterprises.

Key Findings:

  1. Surge in Duplicated Code
    • The percentage of copy/pasted code lines increased from 8.3% to 12.3% (approximately 50% increase)
    • Code blocks with 5+ duplicate lines increased 4-fold
  2. Significant Decrease in Refactoring
    • Code changes classified as “refactoring” dropped from 25% in 2021 to less than 10% in 2024 (60% decrease)
    • Notable increase in DRY (Don’t Repeat Yourself) principle violations
  3. Trade-off Between Development Speed and Quality
    • 63% of developers already use AI assistants in their work [1]
    • However, long-term maintainability and reusability are declining
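
The figures above mix absolute shares with relative changes, so it can help to recompute them. The following is a minimal sketch that derives the relative changes from the shares quoted above; the helper name `relative_change` is illustrative, and the refactoring figure uses 10% as an upper bound because the report only states "less than 10%".

```python
def relative_change(before: float, after: float) -> float:
    """Return the relative change from `before` to `after` as a fraction."""
    return (after - before) / before


# Share of copy/pasted lines: 8.3% -> 12.3% (a 4.0 percentage point rise)
copy_paste = relative_change(8.3, 12.3)

# Share of changes classified as refactoring: 25% -> 10% (upper bound)
refactoring = relative_change(25.0, 10.0)

print(f"copy/paste:  {copy_paste:+.1%}")   # +48.2%
print(f"refactoring: {refactoring:+.1%}")  # -60.0%
```

In other words, the "approximately 50% increase" is a relative change of about 48%, while the absolute share rose by 4 percentage points.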

Google DORA 2024 Report: Impact on Stability

Google’s 2024 DevOps Research and Assessment (DORA) report examined the correlation between AI adoption and delivery stability [3].

Key Findings:

  • Every 25% increase in AI adoption is associated with an estimated 7.2% decrease in delivery stability
  • Code review speed improves, but defect rate increases

This result suggests that while AI excels at generating “code that works quickly,” there are challenges in long-term quality metrics.

Academic Research: Collapse of Partition Quality in Large-Scale Generation

A research paper submitted to arXiv [4] analyzed coupling and cohesion in AI-generated code.

Key Findings:

  • Small code snippets (function level): Maintainability equivalent to human-written code (high cohesion, low coupling)
  • Large-scale code generation (entire applications, large modules): Partition quality significantly degrades, maintainability severely worsens
  • AI tends to propose inappropriate solutions when facing large, complex problems

This research result shows that AI excels at “local optimization” but has limitations in “global architecture design.”

TiMi Studio Case Study

A study published in the ACM Digital Library [5] reported a case study of AI pair programming adoption in a game development team at TiMi Studio.

Positive Aspects:

  • Reduction in cyclomatic complexity
  • Improved code coverage
  • Reduction in code smells

Negative Aspects:

  • Doubts about reliability
  • Doubts about explainability
  • Lack of trust
  • Loss of autonomy

This case study shows that while AI pair programming may improve quality metrics, it introduces new challenges to developers’ cognitive processes and decision-making.

Why AI Generates Tightly Coupled Code

1. “Working Code” First Design Philosophy

AI language models learn from large amounts of code in their training data, but their learning goal is generating “syntactically correct, executable code” [6]. Abstract design principles like long-term maintainability, extensibility, and modularity are not direct learning goals.

An article published in LeadDev [7] lists the following as typical problems with AI-generated code:

  • Highly coupled code
  • God Objects (objects with excessively concentrated responsibilities)
  • Overly structured solutions

These patterns occur because AI prioritizes “code that works now” and doesn’t consider design intent or long-term impact.

2. Limitations in Context Understanding

AI optimizes within the presented context (prompts, surrounding code) but cannot understand “tacit knowledge” like overall project architecture, existing module structure, and team coding conventions [8].

This limitation causes the following problems:

  1. Ignoring Existing Modules
    • Generates new code even when similar functionality already exists
    • Results in duplicated code and inconsistent implementations
  2. Direct Dependency References
    • Directly depends on specific implementations without going through abstraction layers or Dependency Injection
    • Testability and module independence decrease
  3. Boundary Blurring
    • Generates coupling across domain boundaries and layer boundaries
    • Violates design principles like Clean Architecture and Hexagonal Architecture

3. Difficulty Understanding Abstract Concepts

A 27-day AI experiment article published on Medium [8] reports that AI agent tools “struggle with abstract concepts like design principles, user experience, and code maintainability.”

Concrete Example:

When asking AI to “implement a RESTful API,” it tends to generate tightly coupled code like this:

# Typical AI-generated pattern (tight coupling)
class UserAPI:
    def __init__(self):
        self.db = MySQLDatabase("localhost", "user", "pass", "db")  # Direct dependency
        self.logger = FileLogger("/var/log/app.log")  # Direct dependency
        self.cache = RedisCache("localhost:6379")  # Direct dependency

    def get_user(self, user_id):
        # Business logic, data access, logging, caching mixed together
        self.logger.log(f"Fetching user {user_id}")
        cached = self.cache.get(f"user:{user_id}")
        if cached:
            return cached
        user = self.db.query(f"SELECT * FROM users WHERE id = {user_id}")
        self.cache.set(f"user:{user_id}", user)
        return user

Problems with this code:

  • Direct dependency on database implementation (MySQL)
  • Direct dependency on logging implementation (FileLogger)
  • Direct dependency on cache implementation (Redis)
  • Requires actual database, file system, and Redis for testing
  • Difficult to switch databases or caches
  • The SQL query is built by string interpolation, which also exposes it to SQL injection

Meanwhile, human designers introduce abstractions:

# Loosely coupled version designed by humans
class UserAPI:
    def __init__(
        self,
        repository: UserRepository,  # Abstraction
        logger: Logger,  # Abstraction
        cache: Cache  # Abstraction
    ):
        self.repository = repository
        self.logger = logger
        self.cache = cache

    def get_user(self, user_id):
        self.logger.log(f"Fetching user {user_id}")
        cached = self.cache.get(f"user:{user_id}")
        if cached:
            return cached
        user = self.repository.find_by_id(user_id)
        self.cache.set(f"user:{user_id}", user)
        return user
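
The practical payoff of the injected version is that the three collaborators can be replaced in tests. The sketch below is one possible way to do this, assuming the abstractions are expressed as `typing.Protocol` classes and that users are plain dictionaries; `InMemoryUserRepository`, `ListLogger`, and `DictCache` are hypothetical stand-ins introduced only for illustration.

```python
from typing import Optional, Protocol


class UserRepository(Protocol):
    def find_by_id(self, user_id: int) -> Optional[dict]: ...


class Logger(Protocol):
    def log(self, message: str) -> None: ...


class Cache(Protocol):
    def get(self, key: str) -> Optional[dict]: ...
    def set(self, key: str, value: dict) -> None: ...


class UserAPI:
    """Same logic as the loosely coupled class above, with type hints added."""

    def __init__(self, repository: UserRepository, logger: Logger, cache: Cache):
        self.repository = repository
        self.logger = logger
        self.cache = cache

    def get_user(self, user_id: int) -> Optional[dict]:
        self.logger.log(f"Fetching user {user_id}")
        cached = self.cache.get(f"user:{user_id}")
        if cached:
            return cached
        user = self.repository.find_by_id(user_id)
        self.cache.set(f"user:{user_id}", user)
        return user


# Hypothetical in-memory stand-ins: no MySQL, Redis, or log file is needed.
class InMemoryUserRepository:
    def __init__(self, users: dict):
        self.users = users
        self.calls = 0

    def find_by_id(self, user_id: int) -> Optional[dict]:
        self.calls += 1
        return self.users.get(user_id)


class ListLogger:
    def __init__(self) -> None:
        self.messages: list = []

    def log(self, message: str) -> None:
        self.messages.append(message)


class DictCache:
    def __init__(self) -> None:
        self.data: dict = {}

    def get(self, key: str) -> Optional[dict]:
        return self.data.get(key)

    def set(self, key: str, value: dict) -> None:
        self.data[key] = value


repository = InMemoryUserRepository({1: {"id": 1, "name": "Alice"}})
logger = ListLogger()
api = UserAPI(repository, logger, DictCache())

first = api.get_user(1)   # loaded through the repository
second = api.get_user(1)  # served from the cache
```

Because `UserAPI` only relies on the method names declared in the protocols, the second call can be checked to hit the cache (`repository.calls` stays at 1) without any external service running.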

4. Local Optimization Bias

As academic research [4] shows, AI achieves good quality in small code snippets (function level) but quality degrades in large-scale code generation.

This is because AI performs the following “local optimizations”:

  1. Prioritizes Completion Within Current Scope
    • Implements all necessary processing within a function
    • Doesn’t consider delegation to other modules or functions
  2. Selects Immediately Available Dependencies
    • Selects directly accessible concrete implementations over more appropriate but slightly distant abstractions
    • Example: Directly depends on existing classes rather than defining interfaces
  3. Prioritizes Short-Term Implementation Ease
    • Generates code that works now rather than considering long-term maintenance costs

Diagnosing with the Three Dimensions of Coupling Balance

Vlad Khononov, in his book “Balancing Coupling in Software Design” [2], proposes a new approach beyond traditional “loose coupling supremacy”: Balancing Coupling.

The Three Dimensions of Coupling

Khononov presents a framework for evaluating coupling in three dimensions [2][9]:

1. Strength (Integration Strength)

Represents the density of coupling and degree of dependency.

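Strength Levels (weak → strong), shown here using the classic module coupling levels from structured design; Khononov’s own integration strength model instead distinguishes contract, model, functional, and intrusive coupling: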

  1. Data Coupling
    • Passing simple data types (int, string, etc.) as arguments
    • Example: calculate_total(price: float, quantity: int)
  2. Stamp Coupling
    • Passing structures or objects but using only some fields
    • Example: process_order(order: Order) (only using Order object’s id field)
  3. Control Coupling
    • Passing flags or control information to control callee behavior
    • Example: send_notification(user: User, notification_type: str)
  4. Common Coupling
    • Dependency on global variables or shared data structures
    • Example: Multiple modules referencing the same global config object
  5. Content Coupling
    • Direct dependency on internal implementation of other modules
    • Example: user._internal_cache.clear() (direct access to private fields)

Typical Problems with AI-Generated Code:

  • Generates common or content coupling where data or stamp coupling should be used
  • Many dependencies on global variables
  • Direct access to private fields
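
To make the weaker and stronger levels concrete, here is a small sketch that contrasts stamp coupling with data coupling, and content coupling with a public operation. The `Order` and `UserSession` classes and the example URL are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    id: int
    customer_email: str
    items: list = field(default_factory=list)


# Stamp coupling: the function receives the whole Order but reads only `id`.
def tracking_url_from_order(order: Order) -> str:
    return f"https://example.com/track/{order.id}"


# Data coupling: the function declares exactly the value it needs.
def tracking_url(order_id: int) -> str:
    return f"https://example.com/track/{order_id}"


class UserSession:
    def __init__(self) -> None:
        self._internal_cache: dict = {}

    def remember(self, key: str, value: object) -> None:
        self._internal_cache[key] = value

    def cached_keys(self) -> list:
        return sorted(self._internal_cache)

    # Public operation that callers can rely on instead of reaching into
    # `_internal_cache` directly, which would be content coupling.
    def reset(self) -> None:
        self._internal_cache.clear()


session = UserSession()
session.remember("theme", "dark")
keys_before = session.cached_keys()
session.reset()
keys_after = session.cached_keys()
```

With `tracking_url`, a change to unrelated `Order` fields cannot affect the caller; with `reset`, the cache can later be replaced by another structure without breaking code outside the class.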

2. Distance (Locality)

Represents the physical and logical separation between modules.

Distance Levels (close → far):

  1. Within a class
  2. Within a package/module
  3. Within an application
  4. Across services

Khononov’s Principle [9]:

“High strength coupling should have shortened distance”

Typical Problems with AI-Generated Code:

  • Creates strong coupling between modules at far distances
  • Example: Frontend depending on backend’s specific database schema
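
One common way to weaken coupling across a large distance is to publish a small contract instead of the storage schema. The sketch below assumes a hypothetical row layout (`usr_id`, `usr_nm`, and so on) and a hypothetical `UserDTO`; only the DTO fields would be visible to a remote consumer such as a frontend.

```python
from dataclasses import asdict, dataclass

# Hypothetical storage row: column names are an internal detail of the backend.
row = {"usr_id": 7, "usr_nm": "Alice", "pwd_hash": "not-shown", "crt_ts": "2024-01-01"}


@dataclass(frozen=True)
class UserDTO:
    """Public contract exposed to distant consumers."""
    id: int
    name: str


def to_dto(db_row: dict) -> UserDTO:
    # The mapping is the only place that knows the column names.
    return UserDTO(id=db_row["usr_id"], name=db_row["usr_nm"])


payload = asdict(to_dto(row))
print(payload)  # {'id': 7, 'name': 'Alice'}
```

If a column is renamed, only `to_dto` changes; the strong knowledge of the schema stays at a short distance, inside the backend.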

3. Volatility

Represents changeability and scope of impact.

Volatility Levels (stable → unstable):

  1. Standard library: Extremely low change frequency
  2. Third-party library: Stable between major versions
  3. Internal shared library: Shared across projects, periodically updated
  4. Application-specific code: Frequently changed

Khononov’s Principle [9]:

“Low volatility can tolerate high strength”

Typical Problems with AI-Generated Code:

  • Many modules strongly depend on highly volatile business logic
  • Wide-ranging modifications needed when business rules change
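
A typical countermeasure is to let other modules depend on a small, stable interface while the volatile rule lives behind it. The following sketch assumes a hypothetical discount rule; `DiscountPolicy`, `CampaignDiscount`, and `checkout_total` are illustrative names, not part of any cited work.

```python
from typing import Protocol


class DiscountPolicy(Protocol):
    """Stable abstraction that the rest of the code depends on."""

    def discount_for(self, subtotal: float) -> float: ...


class CampaignDiscount:
    """Volatile business rule: expected to change with each campaign."""

    def __init__(self, threshold: float, rate: float) -> None:
        self.threshold = threshold
        self.rate = rate

    def discount_for(self, subtotal: float) -> float:
        return subtotal * self.rate if subtotal >= self.threshold else 0.0


def checkout_total(subtotal: float, policy: DiscountPolicy) -> float:
    # Depends only on the policy interface, not on how the rule is computed.
    return subtotal - policy.discount_for(subtotal)


policy = CampaignDiscount(threshold=100.0, rate=0.1)
discounted = checkout_total(200.0, policy)
undiscounted = checkout_total(50.0, policy)
```

When the campaign rule changes, only `CampaignDiscount` (or a new policy class) is touched, so the change does not ripple into every module that computes a total.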

Connascence: A More Refined Measure of Coupling

Connascence, proposed by Meilir Page-Jones in 1992 [10], is a more refined metric for measuring coupling. It is also explained in detail in Chapter 6 of Khononov’s “Balancing Coupling” [11].

Definition of Connascence:

Connascence between two software elements A and B exists when A requires a change (or careful checking) due to a change in B, or when both A and B need to be changed simultaneously.

Three Dimensions of Connascence [11]:

  1. Strength: Difficulty and cost of change
  2. Degree: Number of couplings
  3. Locality: Proximity between related elements

Types of Connascence (weak → strong):

  1. Connascence of Name (CoN)
    • Needs to reference the same name
    • Example: Function names, variable names
  2. Connascence of Type (CoT)
    • Needs to use the same type
    • Example: Function argument and return types
  3. Connascence of Meaning (CoM)
    • Specific values have specific meanings
    • Example: Magic numbers (if status == 1 where 1 means “active”)
  4. Connascence of Position (CoP)
    • Order of elements matters
    • Example: Positional arguments create_user("John", "Doe", 30)
  5. Connascence of Algorithm (CoA)
    • Needs to use the same algorithm
    • Example: Encryption and decryption

Connascence Diagnosis of AI-Generated Code:

AI-generated code often contains strong connascence:

  • Connascence of Meaning: Many magic numbers, magic strings
  • Connascence of Position: Heavy use of positional arguments, not using named arguments
  • Connascence of Algorithm: Same logic duplicated in multiple places
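
The last two items can be weakened with ordinary language features. The sketch below shows one possible approach, assuming Python: keyword-only parameters turn Connascence of Position into Connascence of Name, and a single shared function removes a duplicated algorithm. All function names are illustrative.

```python
import hashlib


# Connascence of Position: callers must remember the argument order.
def create_user_positional(first_name, last_name, age):
    return {"first_name": first_name, "last_name": last_name, "age": age}


# Weaker Connascence of Name: the bare `*` makes every parameter keyword-only.
def create_user(*, first_name: str, last_name: str, age: int) -> dict:
    return {"first_name": first_name, "last_name": last_name, "age": age}


# Connascence of Algorithm: producer and verifier must hash the same way.
# Keeping the algorithm in one function makes both sides depend on a name.
def checksum(payload: bytes) -> str:
    return hashlib.sha256(payload).hexdigest()


def sign(payload: bytes) -> dict:
    return {"payload": payload, "checksum": checksum(payload)}


def verify(message: dict) -> bool:
    return checksum(message["payload"]) == message["checksum"]


user = create_user(age=30, last_name="Doe", first_name="John")
message = sign(b"hello")
```

Swapping the hash function now requires a change in `checksum` only, and a caller that passes `create_user` arguments positionally fails immediately with a `TypeError` instead of silently mixing up fields.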

Practicing the Coupling Balance Model

Khononov presents the following practical judgment criteria based on three-dimensional coupling evaluation [9]:

Acceptable Coupling:

  • Low strength and low volatility (e.g., standard library dependency)
  • High strength but close distance (e.g., inter-method dependency within same class)

Problematic Coupling:

  • High strength, far distance, and high volatility
  • Example: Dependency on specific database schema between microservices
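
The criteria above can be condensed into a boolean rule. The sketch below uses the simplified formula usually quoted from the book, balance = (strength XOR distance) OR NOT volatility; reducing each dimension to a high/low flag and the function name `is_balanced` are assumptions made for illustration.

```python
def is_balanced(high_strength: bool, far_distance: bool, high_volatility: bool) -> bool:
    """Simplified balance rule: (strength XOR distance) OR NOT volatility."""
    # For booleans, `!=` behaves as XOR.
    return (high_strength != far_distance) or not high_volatility


# High strength, close distance (methods within one class): balanced.
same_class = is_balanced(high_strength=True, far_distance=False, high_volatility=True)

# Strong dependency on a stable component such as the standard library: balanced.
stdlib = is_balanced(high_strength=True, far_distance=True, high_volatility=False)

# Shared database schema between microservices that changes often: unbalanced.
shared_schema = is_balanced(high_strength=True, far_distance=True, high_volatility=True)
```

A binary flag per dimension is a coarse approximation; in practice each dimension is a spectrum, so the function is better read as a review heuristic than as a metric.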

Refactoring Guidelines for AI-Generated Code:

  1. Connascence of Meaning → Connascence of Name
    
    # Before (Connascence of Meaning)
    if user.status == 1:
        send_email(user)
    
    # After (Connascence of Name)
    if user.status == UserStatus.ACTIVE:
        send_email(user)
    
  2. Common Coupling → Data Coupling
    
    # Before (Common Coupling)
    global_config = {...}
    
    def process_data():
        timeout = global_config['timeout']  # Dependency on global variable
    
    # After (Data Coupling)
    def process_data(timeout: int):
        # Explicitly passed as argument
        ...
    
  3. Strong Coupling × Far Distance → Introduce Abstraction
    
    # Before (Service A depends on Service B's concrete implementation)
    from service_b.mysql_repository import MySQLUserRepository
    
    class ServiceA:
        def __init__(self):
            self.user_repo = MySQLUserRepository()  # Dependency on concrete implementation
    
    # After (Dependency on interface)
    from service_b.interfaces import UserRepository
    
    class ServiceA:
        def __init__(self, user_repo: UserRepository):
            self.user_repo = user_repo  # Dependency on abstraction
    

Practice: Prompt Design to Prevent Tight Coupling

To mitigate tight coupling problems in AI-generated code, prompt engineering is important. Recent research shows the effectiveness of modular prompting techniques.

MoT (Modularization of Thought): Modular Prompting

Research submitted to arXiv [12] proposes a new prompting technique called MoT (Modularization of Thought).

MoT Principles:

  • Decompose complex programming problems into small, independent reasoning steps
  • More structured and interpretable problem-solving process
  • Achieved Pass@1 scores of 58.1%-95.1% in experiments with GPT-4o-mini and DeepSeek-R1 on 6 datasets [12]

MoT Benefits:

  • Improved flexibility and generalizability
  • Error isolation
  • Easier integration of information retrieval, arithmetic operations, and external APIs

Prompts That Clarify Boundaries and Modules

Research [13] points out the importance of separating “boundary (role/tone) prompts” from “adaptive control schemas.”

Recommended Prompt Structure:

## Context
[Project overview, tech stack, existing architecture]

## Constraints
- Use dependency injection
- Depend on interfaces, not concrete implementations
- Each class follows Single Responsibility Principle (SRP)
- Make design testable

## Module Boundaries
- Data access layer: repositories package
- Business logic layer: services package
- API layer: controllers package
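- Dependencies point from the API layer toward the business logic layer; the data access layer implements interfaces that the business logic layer defines (Dependency Inversion Principle)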

## Task
[Specific implementation task]
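
If the same structure is reused across tasks, it can be assembled programmatically so that the constraints are never forgotten. The following is a minimal sketch under that assumption; `PromptSpec` and its fields are hypothetical names that simply mirror the four sections of the template above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    context: str
    task: str
    constraints: List[str] = field(default_factory=list)
    module_boundaries: List[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the four sections in the order used by the template."""

        def bullets(items: List[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        sections = [
            ("Context", self.context),
            ("Constraints", bullets(self.constraints)),
            ("Module Boundaries", bullets(self.module_boundaries)),
            ("Task", self.task),
        ]
        return "\n\n".join(f"## {title}\n{body}" for title, body in sections)


spec = PromptSpec(
    context="Python REST API project using FastAPI.",
    task="Implement GET /users/{user_id}.",
    constraints=[
        "Use dependency injection",
        "Depend on interfaces, not concrete implementations",
    ],
    module_boundaries=["Data access layer: repositories package"],
)
prompt = spec.render()
```

Keeping the constraints and boundaries as data also makes it easy to share one set per project and vary only the task.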

Practical Example: RESTful API Implementation

❌ Bad Prompt (likely to generate tight coupling):

Implement a REST API in Python that retrieves user information.

With this prompt, AI is likely to generate tightly coupled code like:

  • Database connection written directly in API class
  • Business logic and data access mixed
  • Difficult to test

✅ Good Prompt (promotes loose coupling):

## Context
This is a Python REST API project using FastAPI.
Following Clean Architecture, we emphasize layer separation.

## Architecture Structure
- domain/: Business logic and entities (framework-independent)
- application/: Use cases (interface definitions)
- infrastructure/: External dependencies (DB, APIs, etc.)
- presentation/: API layer (FastAPI)

## Constraints
- Use dependency injection (leverage FastAPI's Depends)
- Use repository pattern to abstract data access
- Each layer depends only on abstractions (interfaces) of lower layers
- Business logic concentrated in domain layer
- All classes follow Single Responsibility Principle
- Testable design (easy mocks and stubs)

## Task
Implement a REST API endpoint (GET /users/{user_id}) to retrieve user information.
Implement with the following file structure:

1. domain/entities/user.py: User entity
2. application/interfaces/user_repository.py: UserRepository abstract class
3. infrastructure/repositories/user_repository_impl.py: UserRepository implementation
4. application/services/user_service.py: User retrieval logic
5. presentation/api/user_controller.py: FastAPI endpoint

Dependency diagram:
Controller → Service → Repository(interface) ← RepositoryImpl

Clarify each file's role and thoroughly separate responsibilities between layers.

Using this prompt increases the likelihood that AI will generate loosely coupled code like:

# domain/entities/user.py
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    email: str

# application/interfaces/user_repository.py
from abc import ABC, abstractmethod
from typing import Optional
from domain.entities.user import User

class UserRepository(ABC):
    @abstractmethod
    async def find_by_id(self, user_id: int) -> Optional[User]:
        pass

# application/services/user_service.py
from typing import Optional
from application.interfaces.user_repository import UserRepository
from domain.entities.user import User

class UserService:
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository

    async def get_user(self, user_id: int) -> Optional[User]:
        return await self.user_repository.find_by_id(user_id)

# infrastructure/repositories/user_repository_impl.py
from typing import Optional
from application.interfaces.user_repository import UserRepository
from domain.entities.user import User
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from infrastructure.models.user_model import UserModel

class UserRepositoryImpl(UserRepository):
    def __init__(self, db_session: AsyncSession):
        self.db_session = db_session

    async def find_by_id(self, user_id: int) -> Optional[User]:
        result = await self.db_session.execute(
            select(UserModel).filter(UserModel.id == user_id)
        )
        user_model = result.scalar_one_or_none()
        if user_model is None:
            return None
        return User(
            id=user_model.id,
            name=user_model.name,
            email=user_model.email
        )

# presentation/api/user_controller.py
from fastapi import APIRouter, Depends, HTTPException
from application.services.user_service import UserService
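# get_user_service is assumed to be a provider defined in presentation/dependencies.py (not shown here)
from presentation.dependencies import get_user_service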

router = APIRouter()

@router.get("/users/{user_id}")
async def get_user(
    user_id: int,
    user_service: UserService = Depends(get_user_service)
):
    user = await user_service.get_user(user_id)
    if user is None:
        raise HTTPException(status_code=404, detail="User not found")
    return user
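
Because the service depends only on the `UserRepository` abstraction, it can be exercised without FastAPI or a database. The sketch below restates the entity, interface, and service in a single file so that it runs on its own; `InMemoryUserRepository` is a hypothetical test double, not part of the generated code above.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    id: int
    name: str
    email: str


class UserRepository(ABC):
    @abstractmethod
    async def find_by_id(self, user_id: int) -> Optional[User]:
        ...


class UserService:
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository

    async def get_user(self, user_id: int) -> Optional[User]:
        return await self.user_repository.find_by_id(user_id)


class InMemoryUserRepository(UserRepository):
    """Hypothetical test double that keeps users in a dictionary."""

    def __init__(self, users: List[User]):
        self._users = {user.id: user for user in users}

    async def find_by_id(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


service = UserService(InMemoryUserRepository([User(1, "Alice", "alice@example.com")]))
found = asyncio.run(service.get_user(1))
missing = asyncio.run(service.get_user(2))
```

The same service instance would receive `UserRepositoryImpl` in production; only the object passed to the constructor differs.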

Organizational Management of Prompt Templates

An article from Thoughtworks [14] points out the importance of Test-Driven Development (TDD) and pair programming in AI pair programming. Similarly, organizational management of prompt templates is recommended.

Recommended Practices:

  1. Utilize CLAUDE.md or Cursor Rules
    • Document design principles at project root
    • Configure AI to automatically reference them
  2. Configure Custom Instructions
    • ChatGPT Custom Instructions
    • GitHub Copilot workspace settings
  3. Build Prompt Library
    • Document frequently used prompts
    • Share within team

CLAUDE.md Example:

# Project Design Principles

## Architecture
This project follows Clean Architecture.

## Coupling Principles
1. Depend on abstractions (interfaces, abstract classes), not concrete implementations
2. Use dependency injection
3. Each class follows Single Responsibility Principle (SRP)
4. Minimize dependencies on highly volatile business logic

## Code Generation Notes
- Separate data access and business logic
- No global variables
- No magic numbers/magic strings (use constants or enums)
- Always consider testability

Review Perspectives: Post-Generation Refactoring Points

Use the following checklists when reviewing AI-generated code:

Coupling Checklist:

  1. Strength
    • No dependencies on global variables
    • No direct access to private fields
    • Data coupling appropriately used
  2. Distance
    • No strong coupling between modules at far distances
    • Abstraction layers appropriately placed
  3. Volatility
    • Dependencies on highly volatile business logic minimized
    • Depends on stable abstractions (interfaces)

Connascence Checklist:

  1. No magic numbers/magic strings → Replace with constants/enums
  2. Not overusing positional arguments → Replace with named arguments or data classes
  3. No duplicated algorithms → Consider consolidation

Architecture Checklist:

  1. No Single Responsibility Principle (SRP) violations
  2. No Dependency Inversion Principle (DIP) violations
  3. Layer boundaries appropriately maintained
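
Some checklist items can be partly automated. As one example, the sketch below uses Python's standard `ast` module to flag comparisons against numeric or string literals, a rough proxy for Connascence of Meaning. The function name and the sample source are illustrative, and the heuristic will also flag harmless comparisons such as `count == 0`, so its output is a list of review candidates rather than a list of defects.

```python
import ast


def find_magic_comparisons(source: str) -> list:
    """Return (line, literal) pairs for comparisons against literal values."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        for operand in [node.left, *node.comparators]:
            if (
                isinstance(operand, ast.Constant)
                and isinstance(operand.value, (int, float, str))
                and not isinstance(operand.value, bool)  # ignore True/False
            ):
                findings.append((operand.lineno, operand.value))
    return sorted(findings, key=lambda finding: finding[0])


sample = '''
if user.status == 1:
    send_email(user)
if user.role == "admin":
    grant_access(user)
if user.status == UserStatus.ACTIVE:
    send_email(user)
'''

findings = find_magic_comparisons(sample)
print(findings)  # [(2, 1), (4, 'admin')]
```

The third comparison is not reported because it refers to a named constant, which is the form the checklist asks for.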

Using Cynefin Theory to Decide: Areas for AI vs. Areas for Human Judgment

Chapter 2 of Vlad Khononov’s “Balancing Coupling” [15] explains how to use the Cynefin framework to understand complexity and make appropriate design decisions.

What Is the Cynefin Framework?

The Cynefin framework [16] is a decision-making framework developed by management consultant and complexity science researcher David J. Snowden in the late 1990s. It classifies problems into four main domains (a fifth, disorder, covers situations that have not yet been classified):

  1. Simple (called “Clear” in recent versions of the framework): Clear cause-and-effect, best practices exist
  2. Complicated: Cause-and-effect determined through analysis, expertise required
  3. Complex: Cause-and-effect apparent only in retrospect, exploration and experimentation required
  4. Chaotic: Unclear cause-and-effect, immediate action required

Cynefin theory is often used to explain why software development should adopt agile development and Scrum [16]. In highly complex problem domains, approaches like agile development are more suitable than waterfall.

AI’s Strengths and Weaknesses

Using the Cynefin framework, we can classify areas for applying AI pair programming:

1. Simple Domain: Should Delegate to AI

Characteristics:

  • Clear input/output specifications
  • Established patterns
  • Low volatility

Examples:

  • Standard CRUD operation implementation
  • Data validation
  • Implementation of well-known algorithms (sorting, searching, etc.)
  • Boilerplate test code generation

Recommended Approach:

  • Fully delegate to AI
  • Mechanical checks (linters, type checkers) sufficient for review

2. Complicated Domain: Human-AI Collaboration

Characteristics:

  • Cause-and-effect determined through analysis
  • Expertise required
  • Multiple correct answers possible

Examples:

  • Performance optimization
  • Security implementation (authentication, authorization)
  • Database schema design
  • Adding features to existing systems

Recommended Approach:

  • Have AI generate initial implementation
  • Human reviews and improves based on expertise
  • Scrutinize from coupling, security, performance perspectives

3. Complex Domain: Human-Led, AI Assists

Characteristics:

  • Cause-and-effect apparent only in retrospect
  • Exploration and experimentation required
  • Emergent solutions required

Examples:

  • System architecture design
  • Domain modeling (Domain-Driven Design)
  • Determining microservice boundaries
  • Coupling balance judgments

Recommended Approach:

  • Humans lead design decisions
  • AI assists with partial implementation or prototype generation
  • Humans evaluate the three dimensions of coupling (strength, distance, volatility)

Why AI Has Limitations in the Complex Domain:

As research [4] shows, AI proposes inappropriate solutions when facing large, complex problems. This is due to:

  • Lack of Big Picture: AI judges only within presented context
  • Difficulty Evaluating Trade-offs: Cannot appropriately evaluate trade-offs between short-term implementation ease and long-term maintainability
  • Lack of Business Context Understanding: Cannot consider factors like business requirement changes, organizational constraints, team capabilities

4. Chaotic Domain: Humans Only

Characteristics:

  • Unclear cause-and-effect
  • Immediate action and judgment required
  • Rapidly changing situation

Examples:

  • Production incident response
  • Security incident response
  • Emergency hotfixes

Recommended Approach:

  • Humans handle directly
  • AI limited to reference information searching

Practical Decision Criteria

Use the following criteria to determine the scope to delegate to AI:

| Factor | Delegate to AI (Simple) | Collaborate (Complicated) | Human-Led (Complex) |
|---|---|---|---|
| Scope | Single function | Single module | Entire system |
| Volatility | Low (standard library) | Medium (shared library) | High (business logic) |
| Coupling Impact Range | Local | Moderate | Global |
| Requirement Clarity | Clear | Can be clarified through analysis | Ambiguous, exploration needed |
| Testability | Easy | Possible | Design-dependent |

Example: Decision Flowchart

graph TD
    A[Task Start] --> B{Are requirements clear?}
    B -->|Yes| C{Is scope single function?}
    B -->|No| D[Human performs domain analysis]
    C -->|Yes| E{Is volatility low?}
    C -->|No| F{Impact on existing architecture?}
    E -->|Yes| G[Fully delegate to AI]
    E -->|No| F
    F -->|Local| H[Human-AI collaboration]
    F -->|Wide-ranging| I[Human-led, AI assists]
    D --> I
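
The same flow can be written as a small function, for instance to embed it in a team checklist tool. This is a sketch that mirrors the branches of the flowchart above; the `Mode` enum and the parameter names are illustrative, and each question is reduced to a yes/no answer.

```python
from enum import Enum


class Mode(Enum):
    DELEGATE = "Fully delegate to AI"
    COLLABORATE = "Human-AI collaboration"
    HUMAN_LED = "Human-led, AI assists"


def choose_mode(
    requirements_clear: bool,
    single_function: bool,
    low_volatility: bool,
    impact_is_local: bool,
) -> Mode:
    if not requirements_clear:
        # A human performs domain analysis first and then leads the work.
        return Mode.HUMAN_LED
    if single_function and low_volatility:
        return Mode.DELEGATE
    # Larger scope or high volatility: decide by impact on the architecture.
    return Mode.COLLABORATE if impact_is_local else Mode.HUMAN_LED


crud_helper = choose_mode(True, True, True, True)
new_feature = choose_mode(True, False, True, True)
service_split = choose_mode(False, False, False, False)
```

`impact_is_local` is only consulted when the task is not a low-volatility, single-function change, which matches the two paths that lead to the "Impact on existing architecture?" node.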

Case Study: Determining Microservice Boundaries

Scenario: When converting an e-commerce site’s monolithic application to microservices, how should service boundaries be determined?

Cynefin Classification:

  • Complex domain: Cause-and-effect apparent only in retrospect, exploration and experimentation required

Approach:

  1. Human’s Role:
    • Identify Bounded Contexts based on Domain-Driven Design (DDD)
    • Analyze business capabilities
    • Evaluate the three dimensions of coupling (strength, distance, volatility)
    • Make trade-off judgments (e.g., network latency vs. independence)
  2. AI’s Role:
    • Dependency analysis of existing codebase (static analysis)
    • Visualization of potential service boundaries
    • Organize pros/cons of each boundary candidate
    • Draft migration plans

Recommended Prompt:

## Context
We're converting an e-commerce site's monolithic application (Python/Django) to microservices.
The following main domain areas exist:
- Product Management
- Inventory Management
- Order Management
- Customer Management
- Payment Processing

## Task
Analyze the existing codebase dependencies and propose 3 potential service boundary candidates.
For each candidate, evaluate the following:

1. Coupling strength (how tightly coupled to other domains)
2. Data consistency requirements (transaction boundaries)
3. Change frequency (volatility)
4. Fit with team structure

Organize candidates in table format and clearly state pros/cons.
Since humans will make the final decision, present the results as options rather than as a recommendation.

While referencing AI’s output, humans make final decisions considering additional factors:

  • Business strategy (which areas to invest in)
  • Team expertise and size
  • Infrastructure costs
  • Feasibility of phased migration
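
The dependency analysis listed under the AI's role does not have to be a black box. The sketch below shows one way to count imports between top-level packages with Python's standard `ast` module; the module names and source snippets are hypothetical, and a real codebase would be read from disk instead of from an inline dictionary.

```python
import ast
from collections import Counter


def cross_domain_imports(sources: dict) -> Counter:
    """Count imports between top-level packages.

    `sources` maps a dotted module name (e.g. "orders.service") to its source.
    """
    domains = {name.split(".")[0] for name in sources}
    edges = Counter()
    for module_name, code in sources.items():
        origin = module_name.split(".")[0]
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                targets = [node.module]
            else:
                continue
            for target in targets:
                destination = target.split(".")[0]
                # Ignore third-party/stdlib imports and imports within a domain.
                if destination in domains and destination != origin:
                    edges[(origin, destination)] += 1
    return edges


sources = {
    "orders.service": (
        "from inventory.stock import reserve\n"
        "from payments import gateway\n"
        "import orders.models\n"
    ),
    "inventory.stock": "import json\n",
    "payments.gateway": "from orders.models import Order\n",
}

edges = cross_domain_imports(sources)
```

In this sample, `orders` and `payments` import each other, which is the kind of cycle worth discussing before drawing a service boundary between them.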

Summary

AI pair programming tools significantly improve development speed, but they readily generate tightly coupled code and risk accumulating technical debt. As the GitClear 2025 report [1] shows, the 4-fold increase in duplicated code and the 60% decrease in refactoring point to higher long-term maintenance costs.

This article discussed the following points:

1. Tight Coupling Problems in AI-Generated Code

Empirical research [1][3][4] revealed that AI has the following tendencies:

  • Good quality in small code snippets
  • Significantly degraded partition quality in large-scale code generation
  • Prioritizes “working code,” postpones long-term maintainability
  • Ignores existing architecture due to context understanding limitations

2. Three-Dimensional Coupling Evaluation

Using the framework from Vlad Khononov’s “Balancing Coupling” [2], we showed methods for systematically evaluating coupling:

  • Strength: Degree of dependency
  • Distance: Separation between modules
  • Volatility: Changeability

Furthermore, we introduced the concept of Connascence [10][11] to enable more refined coupling evaluation.

3. Practical Prompt Design

We proposed specific prompt design techniques to prevent tight coupling:

  • MoT (Modularization of Thought) [12]: Modular prompting
  • Prompts that clarify boundaries and modules
  • Organizational management using CLAUDE.md and Cursor Rules
  • Review perspectives and checklists for post-generation code

4. Application Domain Judgment Based on Cynefin Theory

Using the Cynefin framework [15][16], we clarified areas to delegate to AI versus areas requiring human judgment:

  • Simple: Fully delegate to AI
  • Complicated: Human-AI collaboration
  • Complex: Human-led, AI assists
  • Chaotic: Humans only

Toward Sustainable AI-Collaborative Development

To address quality issues in AI-generated code, the following approaches are effective:

  1. Clarify Design Principles
    • Place CLAUDE.md or Cursor Rules at project root
    • Document coupling, cohesion, and layer separation principles
  2. Organizational Prompt Engineering Efforts
    • Build prompt template library
    • Share and improve within team
  3. Strengthen Review Process
    • Utilize coupling checklists
    • Introduce Connascence evaluation
    • Use static analysis tools (e.g., dependency analysis, circular dependency detection)
  4. Continuous Refactoring
    • Treat AI-generated code as “initial draft”
    • Regular architecture reviews
    • Visualize and manage technical debt
  5. Appropriate Responsibility Allocation
    • Establish decision criteria based on Cynefin theory
    • Humans lead design decisions, use AI as assistant

AI pair programming is a powerful tool that, when properly used, can significantly improve development productivity. Sustainable software development, however, depends on understanding its risks and managing them systematically from a coupling balance perspective.

As Vlad Khononov states [2], what’s important is not “loose coupling supremacy” but appropriate balance according to the situation. This principle remains unchanged even in the AI era.

References

Other References (Not Numbered in Text)

Resources consulted during article creation but not directly cited in the text.

On Citation Accuracy:

The research cited in this article has been verified through the following methods:

  • Confirmation in academic databases (Google Scholar, arXiv, ACM Digital Library, IEEE Xplore, etc.)
  • Verification of paper information on official journal websites
  • Cross-verification through multiple independent sources (academic media, official research institution announcements, etc.)

Full PDF access may be restricted for some papers, but abstracts, DOIs, author information, and key findings have been confirmed through official academic databases and reliable secondary sources.

  1. Coding on Copilot: AI Code Quality Research 2025 - GitClear (2025). [Reliability: High] Large-scale study analyzing 211 million lines of code changes. Reports 4-fold increase in duplicated code, 60% decrease in refactoring.

  2. Balancing Coupling in Software Design: Universal Design Principles for Architecting Modular Software Systems - Vlad Khononov (2024). Addison-Wesley. [Reliability: High] Presents the three dimensions of coupling (strength, distance, volatility) and proposes the coupling balancing approach.

  3. DORA Report 2024 - Google DevOps Research and Assessment (2024). [Reliability: High] Reports 7.2% decrease in delivery stability for every 25% increase in AI adoption.

  4. The Impact of AI-Generated Solutions on Software Architecture and Productivity: Results from a Survey Study - arXiv (2024). [Reliability: Medium-High] Analyzes coupling and cohesion in AI-generated code. Reports significant quality degradation in partition quality during large-scale generation.

  5. The Impact of AI-Pair Programmers on Code Quality and Developer Satisfaction: Evidence from TiMi studio - ACM Digital Library (2024). [Reliability: High] TiMi Studio case study. Reports both positive and negative aspects of AI pair programming.

  6. Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT - arXiv (2023). [Reliability: Medium-High] Comparative analysis of code quality from GitHub Copilot, Amazon CodeWhisperer, and ChatGPT.

  7. How AI generated code compounds technical debt - LeadDev (2025). [Reliability: Medium-High] Points out problems with highly coupled code, God Objects, and overly structured solutions generated by AI.

  8. Zero Human Code - What I learned from forcing AI to build (and fix) its own code for 27 straight days - Daniel Bentes, Medium (2024). [Reliability: Medium] 27-day AI experiment report. Reports AI struggling with abstract concepts (design principles, code maintainability, etc.).

  9. Balancing Coupling in Software Design: Core Concepts - Vlad Khononov (2024). [Reliability: High] Detailed explanation of the three dimensions of coupling (strength, distance, volatility). Official site.

  10. Connascence - Wikipedia. [Reliability: Medium-High] Explains the concept of connascence proposed by Meilir Page-Jones in 1992.

  11. Book review reference (details of Chapter 6 on Connascence)

  12. Modularization is Better: Effective Code Generation with Modular Prompting - arXiv (2025). [Reliability: Medium-High] Proposes MoT (Modularization of Thought) prompting technique. Achieved Pass@1 scores of 58.1%-95.1%.

  13. Prompting Robotic Modalities (PRM): A structured architecture for centralizing language models in complex systems - ScienceDirect (2025). [Reliability: High] Points out the importance of separating boundary prompts from adaptive control schemas.

  14. Why test-driven development and pair programming are perfect companions for GitHub Copilot - Thoughtworks (2024). [Reliability: Medium-High] Explains the importance of TDD and pair programming in AI pair programming.

  15. Balancing Coupling in Software Design - Chapter 2: Coupling and Complexity: Cynefin - Vlad Khononov (2024). [Reliability: High] Book Chapter 2. Explains Cynefin theory and complexity.

  16. Cynefin Framework - Wikipedia. [Reliability: Medium] Explains application of Cynefin theory to software development.

This post is licensed under CC BY 4.0 by the author.