The Tight Coupling Trap of AI Pair Programming: Understanding Technical Debt Through Coupling Balance
This article was generated by AI. The accuracy of the content is not guaranteed, and we accept no responsibility for any damages resulting from use of this article. By continuing to read, you agree to the Terms of Use.
- Target Audience: Software engineers, architects, developers using AI tools (GitHub Copilot, Cursor, Claude, etc.) in their work
- Prerequisites: Object-oriented programming, basic software design concepts
- Reading Time: 20 minutes
Overview
AI pair programming tools like GitHub Copilot, Cursor, and ChatGPT dramatically improve development speed, but they can also let technical debt accumulate unnoticed. A 2025 GitClear study analyzing 211 million lines of code changes reported that after AI adoption, duplicated code blocks increased 4-fold and refactoring decreased by 60% [1].
This article uses the theoretical framework from Vlad Khononov’s “Balancing Coupling in Software Design” [2] to analyze the “tight coupling” problem in AI-generated code. Furthermore, we use Cynefin theory to clarify areas that should be delegated to AI versus areas requiring human judgment, and propose practical prompt design techniques.
By reading this article, you can systematically understand quality issues in AI-generated code and learn practical methods for sustainable AI-collaborative development.
Note:
The code examples and prompt templates in this article are illustrative examples for explanation purposes and have not been executed to verify their operation. When using them in actual projects, please conduct appropriate testing and verification, and customize according to project characteristics.
The Reality of AI-Generated Code in Data
GitClear 2025 Report: Shocking Numbers
GitClear published a large-scale study in January 2025 investigating the impact of AI assistants on code quality [1]. This study analyzed 211 million lines of code changes over 5 years from 2020 to 2024 from repositories owned by Google, Microsoft, Meta, and multiple large enterprises.
Key Findings:
- Surge in Duplicated Code
- The percentage of copy/pasted code lines increased from 8.3% to 12.3% (approximately 50% increase)
- Code blocks with 5+ duplicate lines increased 4-fold
- Significant Decrease in Refactoring
- Code changes classified as “refactoring” dropped from 25% in 2021 to less than 10% in 2024 (60% decrease)
- Notable increase in DRY (Don’t Repeat Yourself) principle violations
- Trade-off Between Development Speed and Quality
- 63% of developers report using AI assistants in their work [1]
- However, long-term maintainability and reusability are declining
Google DORA 2024 Report: Impact on Stability
Google’s 2024 DevOps Research and Assessment (DORA) report describes a correlation between AI adoption and delivery stability [3].
Key Findings:
- For every 25% increase in AI adoption rate, delivery stability decreases by 7.2%
- Code review speed improves, but defect rate increases
This result suggests that while AI excels at generating “code that works quickly,” there are challenges in long-term quality metrics.
Academic Research: Collapse of Partition Quality in Large-Scale Generation
A research paper submitted to arXiv [4] analyzed coupling and cohesion in AI-generated code.
Key Findings:
- Small code snippets (function level): Maintainability equivalent to human-written code (high cohesion, low coupling)
- Large-scale code generation (entire applications, large modules): Partition quality significantly degrades, maintainability severely worsens
- AI tends to propose inappropriate solutions when facing large, complex problems
This research result shows that AI excels at “local optimization” but has limitations in “global architecture design.”
TiMi Studio Case Study
A study published in ACM Digital Library [5] reported a case study of AI pair programming adoption in a game development team at TiMi Studio.
Positive Aspects:
- Reduction in cyclomatic complexity
- Improved code coverage
- Reduction in code smells
Negative Aspects:
- Developers questioned the reliability of AI output
- Developers questioned the explainability of AI output
- Lack of trust in AI suggestions
- Loss of developer autonomy
This case study shows that while AI pair programming may improve quality metrics, it introduces new challenges to developers’ cognitive processes and decision-making.
Why AI Generates Tightly Coupled Code
1. “Working Code” First Design Philosophy
AI language models learn from large amounts of code in their training data, but their learning goal is generating “syntactically correct, executable code” [6]. Abstract design principles like long-term maintainability, extensibility, and modularity are not direct learning goals.
An article published in LeadDev [7] lists the following as typical problems with AI-generated code:
- Highly coupled code
- God Objects (objects with excessively concentrated responsibilities)
- Overly structured solutions
These patterns occur because AI prioritizes “code that works now” and doesn’t consider design intent or long-term impact.
2. Limitations in Context Understanding
AI optimizes within the presented context (prompts, surrounding code) but cannot understand “tacit knowledge” like overall project architecture, existing module structure, and team coding conventions [8].
This limitation causes the following problems:
- Ignoring Existing Modules
- Generates new code even when similar functionality already exists
- Results in duplicated code and inconsistent implementations
- Direct Dependency References
- Directly depends on specific implementations without going through abstraction layers or Dependency Injection
- Testability and module independence decrease
- Boundary Blurring
- Generates coupling across domain boundaries and layer boundaries
- Violates design principles like Clean Architecture and Hexagonal Architecture
3. Difficulty Understanding Abstract Concepts
A 27-day AI experiment article published on Medium [8] reports that AI agent tools “struggle with abstract concepts like design principles, user experience, and code maintainability.”
Concrete Example:
When asking AI to “implement a RESTful API,” it tends to generate tightly coupled code like this:
# Typical AI-generated pattern (tight coupling)
class UserAPI:
    def __init__(self):
        self.db = MySQLDatabase("localhost", "user", "pass", "db")  # Direct dependency
        self.logger = FileLogger("/var/log/app.log")  # Direct dependency
        self.cache = RedisCache("localhost:6379")  # Direct dependency

    def get_user(self, user_id):
        # Business logic, data access, logging, caching mixed together
        self.logger.log(f"Fetching user {user_id}")
        cached = self.cache.get(f"user:{user_id}")
        if cached:
            return cached
        user = self.db.query(f"SELECT * FROM users WHERE id = {user_id}")
        self.cache.set(f"user:{user_id}", user)
        return user
Problems with this code:
- Direct dependency on database implementation (MySQL)
- Direct dependency on logging implementation (FileLogger)
- Direct dependency on cache implementation (Redis)
- Requires actual database, file system, and Redis for testing
- Difficult to switch databases or caches
Meanwhile, human designers introduce abstractions:
# Loosely coupled version designed by humans
class UserAPI:
    def __init__(
        self,
        repository: UserRepository,  # Abstraction
        logger: Logger,  # Abstraction
        cache: Cache  # Abstraction
    ):
        self.repository = repository
        self.logger = logger
        self.cache = cache

    def get_user(self, user_id):
        self.logger.log(f"Fetching user {user_id}")
        cached = self.cache.get(f"user:{user_id}")
        if cached:
            return cached
        user = self.repository.find_by_id(user_id)
        self.cache.set(f"user:{user_id}", user)
        return user
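The testability benefit can be made concrete. The following is a minimal sketch, assuming the three abstractions are expressed as `typing.Protocol` classes and that users are plain dictionaries; `InMemoryUserRepository`, `ListLogger`, and `DictCache` are hypothetical test doubles introduced here for illustration only.

```python
from typing import Optional, Protocol


# Assumed shapes for the three abstractions (not defined in the listing above).
class UserRepository(Protocol):
    def find_by_id(self, user_id: int) -> Optional[dict]: ...


class Logger(Protocol):
    def log(self, message: str) -> None: ...


class Cache(Protocol):
    def get(self, key: str) -> Optional[dict]: ...
    def set(self, key: str, value: dict) -> None: ...


class UserAPI:
    """Same structure as the loosely coupled version above."""

    def __init__(self, repository: UserRepository, logger: Logger, cache: Cache):
        self.repository = repository
        self.logger = logger
        self.cache = cache

    def get_user(self, user_id: int) -> Optional[dict]:
        self.logger.log(f"Fetching user {user_id}")
        cached = self.cache.get(f"user:{user_id}")
        if cached:
            return cached
        user = self.repository.find_by_id(user_id)
        self.cache.set(f"user:{user_id}", user)
        return user


# Hypothetical in-memory fakes: no MySQL, file system, or Redis required.
class InMemoryUserRepository:
    def __init__(self, rows):
        self.rows = rows
        self.calls = 0  # counts how often the "database" is hit

    def find_by_id(self, user_id):
        self.calls += 1
        return self.rows.get(user_id)


class ListLogger:
    def __init__(self):
        self.messages = []

    def log(self, message):
        self.messages.append(message)


class DictCache:
    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def set(self, key, value):
        self.data[key] = value


repo = InMemoryUserRepository({1: {"id": 1, "name": "Alice"}})
api = UserAPI(repo, ListLogger(), DictCache())
first = api.get_user(1)   # served by the repository
second = api.get_user(1)  # served by the cache
```

Swapping the fakes for real MySQL, file, or Redis implementations would require no change to `UserAPI` itself, which is the point of depending on abstractions.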
4. Local Optimization Bias
As academic research [4] shows, AI achieves good quality in small code snippets (function level) but quality degrades in large-scale code generation.
This is because AI performs the following “local optimizations”:
- Prioritizes Completion Within Current Scope
- Implements all necessary processing within a function
- Doesn’t consider delegation to other modules or functions
- Selects Immediately Available Dependencies
- Selects directly accessible concrete implementations over more appropriate but slightly distant abstractions
- Example: Directly depends on existing classes rather than defining interfaces
- Prioritizes Short-Term Implementation Ease
- Generates code that works now rather than considering long-term maintenance costs
Diagnosing with the Three Dimensions of Coupling Balance
Vlad Khononov, in his book “Balancing Coupling in Software Design” [2], proposes a new approach beyond traditional “loose coupling supremacy”: Balancing Coupling.
The Three Dimensions of Coupling
Khononov presents a framework for evaluating coupling in three dimensions [2][9]:
1. Strength (Integration Strength)
Represents the density of coupling and degree of dependency.
Strength Levels (weak → strong), expressed here with the classic structured-design coupling categories:
- Data Coupling
- Passing simple data types (int, string, etc.) as arguments
- Example: calculate_total(price: float, quantity: int)
- Stamp Coupling
- Passing structures or objects but using only some fields
- Example: process_order(order: Order) (only using the Order object’s id field)
- Control Coupling
- Passing flags or control information to control callee behavior
- Example: send_notification(user: User, notification_type: str)
- Common Coupling
- Dependency on global variables or shared data structures
- Example: Multiple modules referencing the same global config object
- Content Coupling
- Direct dependency on internal implementation of other modules
- Example: user._internal_cache.clear() (direct access to private fields)
Typical Problems with AI-Generated Code:
- Generates common or content coupling where data or stamp coupling should be used
- Many dependencies on global variables
- Direct access to private fields
2. Distance (Locality)
Represents the physical and logical separation between modules.
Distance Levels (close → far):
- Within a class
- Within a package/module
- Within an application
- Across services
Khononov’s Principle [9]:
“High strength coupling should have shortened distance”
Typical Problems with AI-Generated Code:
- Creates strong coupling between modules at far distances
- Example: Frontend depending on backend’s specific database schema
3. Volatility
Represents changeability and scope of impact.
Volatility Levels (stable → unstable):
- Standard library: Extremely low change frequency
- Third-party library: Stable between major versions
- Internal shared library: Shared across projects, periodically updated
- Application-specific code: Frequently changed
Khononov’s Principle [9]:
“Low volatility can tolerate high strength”
Typical Problems with AI-Generated Code:
- Many modules strongly depend on highly volatile business logic
- Wide-ranging modifications needed when business rules change
Connascence: A More Refined Measure of Coupling
Connascence, proposed by Meilir Page-Jones in 1992 [10], is a more refined metric for measuring coupling. It’s also explained in detail in Chapter 6 of Khononov’s “Balancing Coupling” [11].
Definition of Connascence:
Connascence between two software elements A and B exists when A requires a change (or careful checking) due to a change in B, or when both A and B need to be changed simultaneously.
Three Dimensions of Connascence [11]:
- Strength: Difficulty and cost of change
- Degree: Number of couplings
- Locality: Proximity between related elements
Types of Connascence (weak → strong):
- Connascence of Name (CoN)
- Needs to reference the same name
- Example: Function names, variable names
- Connascence of Type (CoT)
- Needs to use the same type
- Example: Function argument and return types
- Connascence of Meaning (CoM)
- Specific values have specific meanings
- Example: Magic numbers (if status == 1, where 1 means “active”)
- Connascence of Position (CoP)
- Order of elements matters
- Example: Positional arguments, as in create_user("John", "Doe", 30)
- Connascence of Algorithm (CoA)
- Needs to use the same algorithm
- Example: Encryption and decryption
Connascence Diagnosis of AI-Generated Code:
AI-generated code often contains strong connascence:
- Connascence of Meaning: Many magic numbers, magic strings
- Connascence of Position: Heavy use of positional arguments, not using named arguments
- Connascence of Algorithm: Same logic duplicated in multiple places
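Connascence of Position is often the cheapest of these to weaken. The sketch below assumes a hypothetical `create_user` helper (the name is borrowed from the example above, and the body is invented for illustration) and shows how keyword-only parameters turn positional coupling into the weaker Connascence of Name.

```python
# Before: callers must remember the argument order (Connascence of Position).
def create_user_positional(first_name, last_name, age):
    return {"first_name": first_name, "last_name": last_name, "age": age}


# After: the bare `*` makes every parameter keyword-only, so callers are
# coupled to parameter names rather than positions (Connascence of Name).
def create_user(*, first_name: str, last_name: str, age: int) -> dict:
    return {"first_name": first_name, "last_name": last_name, "age": age}


# Argument order no longer matters at the call site.
user = create_user(last_name="Doe", first_name="John", age=30)

# A positional call is rejected instead of silently swapping values.
try:
    create_user("John", "Doe", 30)
    positional_call_rejected = False
except TypeError:
    positional_call_rejected = True
```

The trade-off is slightly more verbose call sites in exchange for parameter reordering becoming a safe change.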
Practicing the Coupling Balance Model
Khononov presents the following practical judgment criteria based on 3D coupling evaluation [9]:
Acceptable Coupling:
- Low strength and low volatility (e.g., standard library dependency)
- High strength but close distance (e.g., inter-method dependency within same class)
Problematic Coupling:
- High strength, far distance, and high volatility
- Example: Dependency on specific database schema between microservices
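One way to make these criteria checkable is to reduce each dimension to a boolean. The sketch below is a deliberate simplification under that assumption: the `Coupling` class, its field names, and the `is_balanced` rule are illustrative choices for this article, not an API from the book, and real evaluations involve degrees rather than yes/no values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coupling:
    high_strength: bool    # e.g. common or content coupling
    far_distance: bool     # e.g. across services rather than within a class
    high_volatility: bool  # the depended-on code changes often


def is_balanced(c: Coupling) -> bool:
    """Boolean simplification of the criteria above (an assumption)."""
    # A stable target is tolerable regardless of strength and distance.
    # Otherwise strength and distance must offset each other:
    # strong-but-close or weak-but-far.
    return (c.high_strength != c.far_distance) or not c.high_volatility


# The three cases named in the text:
stdlib_dependency = Coupling(high_strength=False, far_distance=True, high_volatility=False)
same_class_methods = Coupling(high_strength=True, far_distance=False, high_volatility=True)
shared_db_schema = Coupling(high_strength=True, far_distance=True, high_volatility=True)

results = [is_balanced(c) for c in (stdlib_dependency, same_class_methods, shared_db_schema)]
```

Under this simplification only the shared database schema between microservices is flagged as problematic, matching the classification in the text.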
Refactoring Guidelines for AI-Generated Code:
- Connascence of Meaning → Connascence of Name

    # Before (Connascence of Meaning)
    if user.status == 1:
        send_email(user)

    # After (Connascence of Name)
    if user.status == UserStatus.ACTIVE:
        send_email(user)

- Common Coupling → Data Coupling

    # Before (Common Coupling)
    global_config = {...}

    def process_data():
        timeout = global_config['timeout']  # Dependency on global variable

    # After (Data Coupling)
    def process_data(timeout: int):
        # Explicitly passed as argument
        ...

- Strong Coupling × Far Distance → Introduce Abstraction

    # Before (Service A depends on Service B's concrete implementation)
    from service_b.mysql_repository import MySQLUserRepository

    class ServiceA:
        def __init__(self):
            self.user_repo = MySQLUserRepository()  # Dependency on concrete implementation

    # After (Dependency on interface)
    from service_b.interfaces import UserRepository

    class ServiceA:
        def __init__(self, user_repo: UserRepository):
            self.user_repo = user_repo  # Dependency on abstraction
Practice: Prompt Design to Prevent Tight Coupling
To mitigate tight coupling problems in AI-generated code, prompt engineering is important. Recent research shows the effectiveness of modular prompting techniques.
MoT (Modularization of Thought): Modular Prompting
Research submitted to arXiv [12] proposes a new prompting technique called MoT (Modularization of Thought).
MoT Principles:
- Decompose complex programming problems into small, independent reasoning steps
- More structured and interpretable problem-solving process
- Achieved Pass@1 scores of 58.1%-95.1% in experiments with GPT-4o-mini and DeepSeek-R1 on 6 datasets [12]
MoT Benefits:
- Improved flexibility and generalizability
- Error isolation
- Easier integration of information retrieval, arithmetic operations, and external APIs
Prompts That Clarify Boundaries and Modules
Research [13] points out the importance of separating “boundary (role/tone) prompts” from “adaptive control schemas.”
Recommended Prompt Structure:
## Context
[Project overview, tech stack, existing architecture]
## Constraints
- Use dependency injection
- Depend on interfaces, not concrete implementations
- Each class follows Single Responsibility Principle (SRP)
- Make design testable
## Module Boundaries
- Data access layer: repositories package
- Business logic layer: services package
- API layer: controllers package
- Each layer depends only on abstractions (interfaces), never on another layer's concrete implementation (Dependency Inversion Principle)
## Task
[Specific implementation task]
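To keep this structure consistent across a team, the template can be assembled programmatically. The following is a minimal sketch; `build_prompt` and its parameter names are hypothetical helpers invented for this illustration, not part of any AI tool's API.

```python
def build_prompt(context: str, constraints: list[str],
                 boundaries: dict[str, str], task: str) -> str:
    """Assemble the four-section prompt structure shown above."""
    lines = ["## Context", context, "", "## Constraints"]
    lines += [f"- {c}" for c in constraints]
    lines += ["", "## Module Boundaries"]
    lines += [f"- {layer}: {package}" for layer, package in boundaries.items()]
    lines += ["", "## Task", task]
    return "\n".join(lines)


prompt = build_prompt(
    context="Python REST API project using FastAPI",
    constraints=[
        "Use dependency injection",
        "Depend on interfaces, not concrete implementations",
    ],
    boundaries={
        "Data access layer": "repositories package",
        "Business logic layer": "services package",
        "API layer": "controllers package",
    },
    task="Implement GET /users/{user_id}",
)
```

Keeping the constraints and boundaries in one shared place means every task prompt carries the same design rules, and only the task text changes.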
Practical Example: RESTful API Implementation
❌ Bad Prompt (likely to generate tight coupling):
Implement a REST API in Python that retrieves user information.
With this prompt, AI is likely to generate tightly coupled code like:
- Database connection written directly in API class
- Business logic and data access mixed
- Difficult to test
✅ Good Prompt (promotes loose coupling):
## Context
This is a Python REST API project using FastAPI.
Following Clean Architecture, we emphasize layer separation.
## Architecture Structure
- domain/: Business logic and entities (framework-independent)
- application/: Use cases (interface definitions)
- infrastructure/: External dependencies (DB, APIs, etc.)
- presentation/: API layer (FastAPI)
## Constraints
- Use dependency injection (leverage FastAPI's Depends)
- Use repository pattern to abstract data access
- Each layer depends only on abstractions (interfaces) of lower layers
- Business logic concentrated in domain layer
- All classes follow Single Responsibility Principle
- Testable design (easy mocks and stubs)
## Task
Implement a REST API endpoint (GET /users/{user_id}) to retrieve user information.
Implement with the following file structure:
1. domain/entities/user.py: User entity
2. application/interfaces/user_repository.py: UserRepository abstract class
3. infrastructure/repositories/user_repository_impl.py: UserRepository implementation
4. application/services/user_service.py: User retrieval logic
5. presentation/api/user_controller.py: FastAPI endpoint
Dependency diagram:
Controller → Service → Repository(interface) ← RepositoryImpl
Clarify each file's role and thoroughly separate responsibilities between layers.
Using this prompt increases the likelihood that AI will generate loosely coupled code like:
# domain/entities/user.py
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str
    email: str

# application/interfaces/user_repository.py
from abc import ABC, abstractmethod
from typing import Optional
from domain.entities.user import User

class UserRepository(ABC):
    @abstractmethod
    async def find_by_id(self, user_id: int) -> Optional[User]:
        pass

# application/services/user_service.py
from typing import Optional
from application.interfaces.user_repository import UserRepository
from domain.entities.user import User

class UserService:
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository

    async def get_user(self, user_id: int) -> Optional[User]:
        return await self.user_repository.find_by_id(user_id)

# infrastructure/repositories/user_repository_impl.py
from typing import Optional
from application.interfaces.user_repository import UserRepository
from domain.entities.user import User
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select
from infrastructure.models.user_model import UserModel

class UserRepositoryImpl(UserRepository):
    def __init__(self, db_session: AsyncSession):
        self.db_session = db_session

    async def find_by_id(self, user_id: int) -> Optional[User]:
        result = await self.db_session.execute(
            select(UserModel).filter(UserModel.id == user_id)
        )
        user_model = result.scalar_one_or_none()
        if user_model is None:
            return None
        return User(
            id=user_model.id,
            name=user_model.name,
            email=user_model.email
        )

# presentation/api/user_controller.py
from fastapi import APIRouter, Depends, HTTPException
from application.services.user_service import UserService
from presentation.dependencies import get_user_service

router = APIRouter()

@router.get("/users/{user_id}")
async def get_user(
    user_id: int,
    user_service: UserService = Depends(get_user_service)
):
    user = await user_service.get_user(user_id)
    if user is None:
        raise HTTPException(status_code=404, detail="User not found")
    return user
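Because UserService depends only on the UserRepository abstraction, it can be exercised without a database. The sketch below restates the entity, interface, and service from the listing above in a single file so that it runs on its own, and adds a hypothetical `InMemoryUserRepository` test double that is not part of the listing.

```python
import asyncio
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str
    email: str


class UserRepository(ABC):
    @abstractmethod
    async def find_by_id(self, user_id: int) -> Optional[User]: ...


class UserService:
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository

    async def get_user(self, user_id: int) -> Optional[User]:
        return await self.user_repository.find_by_id(user_id)


class InMemoryUserRepository(UserRepository):
    """Hypothetical test double standing in for UserRepositoryImpl."""

    def __init__(self, users: list[User]):
        self._users = {u.id: u for u in users}

    async def find_by_id(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


service = UserService(
    InMemoryUserRepository([User(1, "Alice", "alice@example.com")])
)
found = asyncio.run(service.get_user(1))    # existing user
missing = asyncio.run(service.get_user(2))  # unknown id yields None
```

The same substitution is what makes the controller testable: the service it receives through `Depends` can be built on any repository that satisfies the interface.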
Organizational Management of Prompt Templates
An article from Thoughtworks [14] points out the importance of Test-Driven Development (TDD) and pair programming in AI pair programming. Similarly, organizational management of prompt templates is recommended.
Recommended Practices:
- Utilize CLAUDE.md or Cursor Rules
- Document design principles at project root
- Configure AI to automatically reference them
- Configure Custom Instructions
- ChatGPT Custom Instructions
- GitHub Copilot workspace settings
- Build Prompt Library
- Document frequently used prompts
- Share within team
CLAUDE.md Example:
# Project Design Principles
## Architecture
This project follows Clean Architecture.
## Coupling Principles
1. Depend on abstractions (interfaces, abstract classes), not concrete implementations
2. Use dependency injection
3. Each class follows Single Responsibility Principle (SRP)
4. Minimize dependencies on highly volatile business logic
## Code Generation Notes
- Separate data access and business logic
- No global variables
- No magic numbers/magic strings (use constants or enums)
- Always consider testability
Review Perspectives: Post-Generation Refactoring Points
Use the following checklists when reviewing AI-generated code:
Coupling Checklist:
- Strength
- No dependencies on global variables
- No direct access to private fields
- Data coupling appropriately used
- Distance
- No strong coupling between modules at far distances
- Abstraction layers appropriately placed
- Volatility
- Dependencies on highly volatile business logic minimized
- Depends on stable abstractions (interfaces)
Connascence Checklist:
- No magic numbers/magic strings → Replace with constants/enums
- Not overusing positional arguments → Replace with named arguments or data classes
- No duplicated algorithms → Consider consolidation
Architecture Checklist:
- No Single Responsibility Principle (SRP) violations
- No Dependency Inversion Principle (DIP) violations
- Layer boundaries appropriately maintained
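Parts of these checklists can be automated. The following is a rough sketch, using only Python's standard `ast` module, of a check for one Connascence item: comparisons against bare numeric or string literals. The function name `find_magic_comparisons` is invented for this example, and the check is intentionally crude (it would also flag harmless comparisons such as `x == 0`), so its output is a list of review candidates rather than confirmed defects.

```python
import ast


def find_magic_comparisons(source: str) -> list[tuple[int, object]]:
    """Return (line, literal) pairs where code compares against a bare literal."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Compare):
            continue
        for operand in [node.left, *node.comparators]:
            if (isinstance(operand, ast.Constant)
                    and isinstance(operand.value, (int, float, str))
                    and not isinstance(operand.value, bool)):
                findings.append((operand.lineno, operand.value))
    # Sort by line number only, since literal values may have mixed types.
    return sorted(findings, key=lambda finding: finding[0])


sample = '''
if user.status == 1:
    send_email(user)
if order.state == "shipped":
    notify(order)
if user.status == UserStatus.ACTIVE:
    pass
'''
findings = find_magic_comparisons(sample)
```

The sample source is only parsed, never executed, so the undefined names inside it are harmless.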
Using Cynefin Theory to Decide: Areas for AI vs. Areas for Human Judgment
Chapter 2 of Vlad Khononov’s “Balancing Coupling” [15] explains how to use the Cynefin framework to understand complexity and make appropriate design decisions.
What Is the Cynefin Framework?
The Cynefin framework [16] is a decision-making framework developed by management consultant and complexity science researcher David J. Snowden in the late 1990s. This article uses its four main domains (later versions of the framework rename Simple to Clear and add a fifth state for situations whose domain is not yet known):
- Simple: Clear cause-and-effect, best practices exist
- Complicated: Cause-and-effect determined through analysis, expertise required
- Complex: Cause-and-effect visible only in retrospect, exploration and experimentation required
- Chaotic: No discernible cause-and-effect, immediate action required
Cynefin theory is often used to explain why software development should adopt agile development and Scrum [16]. In highly complex problem domains, approaches like agile development are more suitable than waterfall.
AI’s Strengths and Weaknesses
Using the Cynefin framework, we can classify areas for applying AI pair programming:
1. Simple Domain: Should Delegate to AI
Characteristics:
- Clear input/output specifications
- Established patterns
- Low volatility
Examples:
- Standard CRUD operation implementation
- Data validation
- Implementation of well-known algorithms (sorting, searching, etc.)
- Boilerplate test code generation
Recommended Approach:
- Fully delegate to AI
- Mechanical checks (linters, type checkers) sufficient for review
2. Complicated Domain: Human-AI Collaboration
Characteristics:
- Cause-and-effect determined through analysis
- Expertise required
- Multiple correct answers possible
Examples:
- Performance optimization
- Security implementation (authentication, authorization)
- Database schema design
- Adding features to existing systems
Recommended Approach:
- Have AI generate initial implementation
- Human reviews and improves based on expertise
- Scrutinize from coupling, security, performance perspectives
3. Complex Domain: Human-Led, AI Assists
Characteristics:
- Cause-and-effect visible only in retrospect
- Exploration and experimentation required
- Emergent solutions required
Examples:
- System architecture design
- Domain modeling (Domain-Driven Design)
- Determining microservice boundaries
- Coupling balance judgments
Recommended Approach:
- Humans lead design decisions
- AI assists with partial implementation or prototype generation
- Humans evaluate the three dimensions of coupling (strength, distance, volatility)
Why AI Has Limitations in the Complex Domain:
As research [4] shows, AI proposes inappropriate solutions when facing large, complex problems. This is due to:
- Lack of Big Picture: AI judges only within presented context
- Difficulty Evaluating Trade-offs: Cannot appropriately evaluate trade-offs between short-term implementation ease and long-term maintainability
- Lack of Business Context Understanding: Cannot consider factors like business requirement changes, organizational constraints, team capabilities
4. Chaotic Domain: Humans Only
Characteristics:
- Unclear cause-and-effect
- Immediate action and judgment required
- Rapidly changing situation
Examples:
- Production incident response
- Security incident response
- Emergency hotfixes
Recommended Approach:
- Humans handle directly
- AI limited to reference information searching
Practical Decision Criteria
Use the following criteria to determine the scope to delegate to AI:
| Factor | Delegate to AI (Simple) | Collaborate (Complicated) | Human-Led (Complex) |
|---|---|---|---|
| Scope | Single function | Single module | Entire system |
| Volatility | Low (standard library) | Medium (shared library) | High (business logic) |
| Coupling Impact Range | Local | Moderate | Global |
| Requirement Clarity | Clear | Can be clarified through analysis | Ambiguous, exploration needed |
| Testability | Easy | Possible | Design-dependent |
Example: Decision Flowchart
graph TD
A[Task Start] --> B{Are requirements clear?}
B -->|Yes| C{Is scope single function?}
B -->|No| D[Human performs domain analysis]
C -->|Yes| E{Is volatility low?}
C -->|No| F{Impact on existing architecture?}
E -->|Yes| G[Fully delegate to AI]
E -->|No| F
F -->|Local| H[Human-AI collaboration]
F -->|Wide-ranging| I[Human-led, AI assists]
D --> I
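The same flow can be written as a small function, which is convenient for embedding the rule in tooling or a review checklist. This is a sketch of the flowchart above and nothing more; the `Mode` enum, `choose_mode`, and the boolean parameter names are hypothetical and simply mirror the diagram's decision nodes.

```python
from enum import Enum


class Mode(Enum):
    DELEGATE = "Fully delegate to AI"
    COLLABORATE = "Human-AI collaboration"
    HUMAN_LED = "Human-led, AI assists"


def choose_mode(*, requirements_clear: bool, single_function: bool,
                low_volatility: bool, local_impact: bool) -> Mode:
    if not requirements_clear:
        # Unclear requirements: a human performs domain analysis first.
        return Mode.HUMAN_LED
    if single_function and low_volatility:
        return Mode.DELEGATE
    # Larger scope or higher volatility: decide by architectural impact.
    return Mode.COLLABORATE if local_impact else Mode.HUMAN_LED


crud_helper = choose_mode(requirements_clear=True, single_function=True,
                          low_volatility=True, local_impact=True)
new_feature = choose_mode(requirements_clear=True, single_function=False,
                          low_volatility=False, local_impact=True)
service_split = choose_mode(requirements_clear=False, single_function=False,
                            low_volatility=False, local_impact=False)
```

The three sample calls correspond to a boilerplate helper, a feature added to one module, and a service boundary decision.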
Case Study: Determining Microservice Boundaries
Scenario: When converting an e-commerce site’s monolithic application to microservices, how should service boundaries be determined?
Cynefin Classification:
- Complex domain: Cause-and-effect visible only in retrospect, exploration and experimentation required
Approach:
- Human’s Role:
- Identify Bounded Contexts based on Domain-Driven Design (DDD)
- Analyze business capabilities
- Evaluate the three dimensions of coupling (strength, distance, volatility)
- Make trade-off judgments (e.g., network latency vs. independence)
- AI’s Role:
- Dependency analysis of existing codebase (static analysis)
- Visualization of potential service boundaries
- Organize pros/cons of each boundary candidate
- Draft migration plans
Recommended Prompt:
## Context
We're converting an e-commerce site's monolithic application (Python/Django) to microservices.
The following main domain areas exist:
- Product Management
- Inventory Management
- Order Management
- Customer Management
- Payment Processing
## Task
Analyze the existing codebase dependencies and propose 3 potential service boundary candidates.
For each candidate, evaluate the following:
1. Coupling strength (how tightly coupled to other domains)
2. Data consistency requirements (transaction boundaries)
3. Change frequency (volatility)
4. Fit with team structure
Organize candidates in table format and clearly state pros/cons.
Since humans will make the final decision, output as "option presentation" not "recommendation."
While referencing AI’s output, humans make final decisions considering additional factors:
- Business strategy (which areas to invest in)
- Team expertise and size
- Infrastructure costs
- Feasibility of phased migration
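The dependency analysis delegated to AI in this case study can also be cross-checked with a few lines of static analysis. The sketch below is one possible starting point, assuming each top-level package corresponds to a domain area; `cross_domain_imports` and the sample module names are hypothetical, and the sources are passed in as strings so that the example stays self-contained.

```python
import ast
from collections import Counter


def cross_domain_imports(modules: dict[str, str]) -> Counter:
    """Count imports between top-level packages.

    `modules` maps a dotted module name (e.g. "orders.services") to its source.
    """
    known = {name.split(".")[0] for name in modules}
    edges = Counter()
    for name, source in modules.items():
        origin = name.split(".")[0]
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                targets = [node.module]
            else:
                continue
            for target in targets:
                domain = target.split(".")[0]
                # Ignore imports within a domain and third-party/stdlib imports.
                if domain != origin and domain in known:
                    edges[(origin, domain)] += 1
    return edges


modules = {
    "orders.services": (
        "from inventory.models import Stock\n"
        "from payments.gateway import charge\n"
        "import orders.models\n"
    ),
    "inventory.models": "import decimal\n",
    "payments.gateway": "from orders.models import Order\n",
}
edges = cross_domain_imports(modules)
```

In this toy input, orders and payments import each other, which is the kind of two-way dependency worth examining before drawing a service boundary between them.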
Summary
AI pair programming tools significantly improve development speed while easily generating tightly coupled code and risking technical debt accumulation. As the GitClear 2025 report [1] shows, the 4-fold increase in duplicated code and 60% decrease in refactoring suggest increased long-term maintenance costs.
This article discussed the following points:
1. Tight Coupling Problems in AI-Generated Code
Empirical research [1][3][4] revealed that AI has the following tendencies:
- Good quality in small code snippets
- Significantly degraded partition quality in large-scale code generation
- Prioritizes “working code,” postpones long-term maintainability
- Ignores existing architecture due to context understanding limitations
2. Three-Dimensional Coupling Evaluation
Using the framework from Vlad Khononov’s “Balancing Coupling” [2], we showed methods for systematically evaluating coupling:
- Strength: Degree of dependency
- Distance: Separation between modules
- Volatility: Changeability
Furthermore, we introduced the concept of Connascence [10][11] to enable more refined coupling evaluation.
3. Practical Prompt Design
We proposed specific prompt design techniques to prevent tight coupling:
- MoT (Modularization of Thought) [12]: Modular prompting
- Prompts that clarify boundaries and modules
- Organizational management using CLAUDE.md and Cursor Rules
- Review perspectives and checklists for post-generation code
4. Application Domain Judgment Based on Cynefin Theory
Using the Cynefin framework [15][16], we clarified areas to delegate to AI versus areas requiring human judgment:
- Simple: Fully delegate to AI
- Complicated: Human-AI collaboration
- Complex: Human-led, AI assists
- Chaotic: Humans only
Toward Sustainable AI-Collaborative Development
To address quality issues in AI-generated code, the following approaches are effective:
- Clarify Design Principles
- Place CLAUDE.md or Cursor Rules at project root
- Document coupling, cohesion, and layer separation principles
- Organizational Prompt Engineering Efforts
- Build prompt template library
- Share and improve within team
- Strengthen Review Process
- Utilize coupling checklists
- Introduce Connascence evaluation
- Use static analysis tools (e.g., dependency analysis, circular dependency detection)
- Continuous Refactoring
- Treat AI-generated code as “initial draft”
- Regular architecture reviews
- Visualize and manage technical debt
- Appropriate Responsibility Allocation
- Establish decision criteria based on Cynefin theory
- Humans lead design decisions, use AI as assistant
AI pair programming is a powerful tool that, when properly used, can significantly improve development productivity. Sustainable software development, however, depends on understanding its potential risks and managing them systematically from a coupling balance perspective.
As Vlad Khononov states [2], what’s important is not “loose coupling supremacy” but appropriate balance according to the situation. This principle remains unchanged even in the AI era.
Related Articles
- Balancing Coupling in Software Design: Understanding Vlad Khononov’s Coupling Strategy - Detailed explanation of the coupling balance concepts that form the theoretical foundation of this article
References
Other References (Not Numbered in Text)
Resources consulted during article creation but not directly cited in the text.
Structured Design - Stevens, W. P., Myers, G. J., Constantine, L. L., IBM Systems Journal (1974). [Reliability: High] Classic paper that first proposed the concepts of coupling and cohesion.
Connascence: Coupling, Cohesion & Connascence - Khalil Stemmler. [Reliability: Medium] Practical explanation of connascence.
A Pair Programming Framework for Code Generation via Multi-Plan Exploration and Feedback-Driven Refinement - arXiv (2024). [Reliability: Medium-High] PairCoder framework. Navigator agent and Driver agent collaboration.
Practices and Challenges of Using GitHub Copilot: An Empirical Study - arXiv (2023). [Reliability: Medium-High] GitHub Copilot usage survey analyzing 169 posts and 655 discussions from Stack Overflow and GitHub Discussions.
Security Weaknesses of Copilot-Generated Code in GitHub Projects: An Empirical Study - ACM TOSEM (2024). [Reliability: High] Analyzes security vulnerabilities in AI-generated code. Python 29.5%, JavaScript 24.2% of snippets have vulnerabilities.
On Citation Accuracy:
The research cited in this article has been verified through the following methods:
- Confirmation in academic databases (Google Scholar, arXiv, ACM Digital Library, IEEE Xplore, etc.)
- Verification of paper information on official journal websites
- Cross-verification through multiple independent sources (academic media, official research institution announcements, etc.)
Full PDF access may be restricted for some papers, but abstracts, DOIs, author information, and key findings have been confirmed through official academic databases and reliable secondary sources.
1. Coding on Copilot: AI Code Quality Research 2025 - GitClear (2025). [Reliability: High] Large-scale study analyzing 211 million lines of code changes. Reports 4-fold increase in duplicated code, 60% decrease in refactoring.
2. Balancing Coupling in Software Design: Universal Design Principles for Architecting Modular Software Systems - Vlad Khononov (2024). Addison-Wesley. [Reliability: High] Presents the three dimensions of coupling (strength, distance, volatility) and proposes the coupling balancing approach.
3. DORA Report 2024 - Google DevOps Research and Assessment (2024). [Reliability: High] Reports 7.2% decrease in delivery stability for every 25% increase in AI adoption.
4. The Impact of AI-Generated Solutions on Software Architecture and Productivity: Results from a Survey Study - arXiv (2024). [Reliability: Medium-High] Analyzes coupling and cohesion in AI-generated code. Reports significant quality degradation in partition quality during large-scale generation.
5. The Impact of AI-Pair Programmers on Code Quality and Developer Satisfaction: Evidence from TiMi studio - ACM Digital Library (2024). [Reliability: High] TiMi Studio case study. Reports both positive and negative aspects of AI pair programming.
6. Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT - arXiv (2023). [Reliability: Medium-High] Comparative analysis of code quality from GitHub Copilot, Amazon CodeWhisperer, and ChatGPT.
7. How AI generated code compounds technical debt - LeadDev (2025). [Reliability: Medium-High] Points out problems with highly coupled code, God Objects, and overly structured solutions generated by AI.
8. Zero Human Code - What I learned from forcing AI to build (and fix) its own code for 27 straight days - Daniel Bentes, Medium (2024). [Reliability: Medium] 27-day AI experiment report. Reports AI struggling with abstract concepts (design principles, code maintainability, etc.).
9. Balancing Coupling in Software Design: Core Concepts - Vlad Khononov (2024). [Reliability: High] Detailed explanation of the three dimensions of coupling (strength, distance, volatility). Official site.
10. Connascence - Wikipedia. [Reliability: Medium-High] Explains the concept of connascence proposed by Meilir Page-Jones in 1992.
11. Book review reference (details of Chapter 6 on Connascence)
12. Modularization is Better: Effective Code Generation with Modular Prompting - arXiv (2025). [Reliability: Medium-High] Proposes MoT (Modularization of Thought) prompting technique. Achieved Pass@1 scores of 58.1%-95.1%.
13. Prompting Robotic Modalities (PRM): A structured architecture for centralizing language models in complex systems - ScienceDirect (2025). [Reliability: High] Points out the importance of separating boundary prompts from adaptive control schemas.
14. Why test-driven development and pair programming are perfect companions for GitHub Copilot - Thoughtworks (2024). [Reliability: Medium-High] Explains the importance of TDD and pair programming in AI pair programming.
15. Balancing Coupling in Software Design - Chapter 2: Coupling and Complexity: Cynefin - Vlad Khononov (2024). [Reliability: High] Book Chapter 2. Explains Cynefin theory and complexity.
16. Cynefin Framework - Wikipedia. [Reliability: Medium] Explains application of Cynefin theory to software development.