Automation Bias — Why We Fail to Catch AI's Mistakes
This article was generated by AI. The accuracy of the content is not guaranteed, and we accept no responsibility for any damages resulting from use of this article.
- Target audience: IT engineers who use AI tools in their daily work
- Prerequisites: Basic experience with AI coding tools such as GitHub Copilot and ChatGPT
- Reading time: 15 minutes
Overview
Hitting tab to accept GitHub Copilot’s suggestion. Copy-pasting ChatGPT’s response verbatim. Approving AI-generated code in a code review with a casual “looks about right.” Behind all of these behaviors lies a cognitive bias known as automation bias. Drawing on studies published between 2022 and 2025, most of them peer-reviewed, this article examines the mechanisms behind this bias, the risks unique to the LLM era, and the countermeasures engineers should adopt.
“If the AI Says So, It Must Be Right” — What Is Automation Bias?
Automation bias is the cognitive tendency to place excessive trust in the output of automated systems, prioritizing the system’s judgment over one’s own 1.
The concept has been studied since the 1990s in the context of aircraft autopilot systems and medical devices. The advent of LLMs (large language models), however, has extended this bias to far more everyday tasks. Traditional automation communicates through instrument readings and alerts; LLMs respond in fluent natural language. Because they generate plausible-sounding text in a confident tone, the cues that normally prompt humans to question a result are far less likely to fire 2.
Translated to a software engineer’s daily experience, this is a very tangible problem:
- Accepting GitHub Copilot’s code suggestions by hitting tab without scrutinizing the content
- Confirming only that AI-generated test code “passes” without verifying whether the tests themselves are sound
- Waving AI-authored code through code review with the assumption that “the AI wrote it, so it’s probably fine”
According to GitHub’s own reporting, roughly 30% of Copilot’s code suggestions are accepted as-is 3. That figure is not necessarily high on its own, but it says nothing about the quality of the accepted code. An independent report by GitClear found a 41% higher churn rate (the rate at which code is rewritten shortly after being introduced) for AI-generated code than for human-written code, which suggests that problems often go unnoticed at the time of acceptance 4.
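To make the churn metric concrete, here is a minimal Python sketch of one way such a rate could be computed. It is not GitClear’s methodology: the `AddedLine` record, the `churn_rate` function, the 14-day window, and the sample history are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class AddedLine:
    added_on: date
    rewritten_on: Optional[date]  # None if the line was never changed


def churn_rate(lines: list[AddedLine], window_days: int = 14) -> float:
    """Fraction of added lines rewritten within `window_days` of being added.

    The 14-day default is an assumed window, not a value from the cited report.
    """
    if not lines:
        return 0.0
    window = timedelta(days=window_days)
    churned = sum(
        1
        for line in lines
        if line.rewritten_on is not None
        and line.rewritten_on - line.added_on <= window
    )
    return churned / len(lines)


if __name__ == "__main__":
    # Made-up history: 4 lines added, 1 rewritten after 3 days, 1 after 40 days.
    history = [
        AddedLine(date(2025, 1, 1), date(2025, 1, 4)),
        AddedLine(date(2025, 1, 1), date(2025, 2, 10)),
        AddedLine(date(2025, 1, 1), None),
        AddedLine(date(2025, 1, 1), None),
    ]
    print(f"churn rate: {churn_rate(history):.2f}")  # 1 of 4 lines -> 0.25
```

Note that “41% higher” is a relative increase: against a hypothetical baseline churn rate of 0.10, it would correspond to roughly 0.141, not 0.51.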
CRT Experiment: Incorrect AI Assistance Cuts Accuracy by More Than Half
Study Design
Wingerter et al. (2025) conducted an online experiment using the Cognitive Reflection Test (CRT-2) to investigate automation bias toward generative AI 5. The CRT is designed so that the intuitive (but incorrect) answer diverges from the correct answer reached through deliberation, measuring a participant’s ability to “stop and think.”
The experiment used a between-subjects design with three groups:
flowchart TB
A["Participants"] --> B["Control group<br>No AI assistance"]
A --> C["Incorrect AI group<br>Incorrect AI answers presented"]
A --> D["Incorrect AI + Nudge group<br>Incorrect AI answers +<br>warning message"]
B --> E["Solve CRT-2 independently"]
C --> F["Incorrect answers presented<br>via AI chatbot-style UI"]
D --> G["Incorrect answers +<br>warning: 'AI answers<br>may be inaccurate'"]
In the two AI-assisted conditions, each CRT-2 question page displayed a screenshot mimicking an AI chatbot UI, presenting an intentionally incorrect answer.
Results
The group that received incorrect AI assistance answered fewer than half as many questions correctly as the control group (p < 0.01) 5. Rather than exercising the “thinking against intuition” capacity that the CRT measures, participants followed the AI’s answers.
Notably, even participants who were already familiar with the CRT were not fully immune to the influence of incorrect AI assistance 5. This suggests that prior knowledge alone, whether it comes from “AI literacy education” or from “prior warnings,” may not be enough to eliminate the bias.
Limitations of This Study
This was an online experiment and did not fully replicate real-world work environments. Because it used a specific test (CRT-2), the extent to which the results generalize to other tasks (e.g., code review) requires further investigation. This article also does not report exact sample sizes or group allocations; those should be confirmed from the published paper.
The Cognitive Mechanisms Behind Automation Bias
Why do humans trust AI output to such a degree? The review by Romeo & Conti (2025) organizes the multiple cognitive mechanisms underlying automation bias 1.
Related Cognitive Biases
flowchart TB
AB["Automation<br>Bias"]
AB --> NB["Normalcy bias<br>'The AI couldn't possibly<br>be wrong'"]
AB --> CB["Complacency bias<br>'It's been right so far,<br>so it must be right now'"]
AB --> RB["Rationalization bias<br>'If my intuition differs<br>from the AI, I must<br>be the one who's wrong'"]
AB --> AuthB["Authority bias<br>'It's a product of<br>advanced technology, so it<br>must be more accurate<br>than humans'"]
- Normalcy bias: Past experience of “this AI has been accurate” leads to underestimating the possibility of errors
- Complacency bias: Familiarity with the system erodes attentiveness, leading to uncritical acceptance of output
- Rationalization bias: When the AI’s judgment conflicts with one’s own, the individual revises their own judgment instead
- Authority bias: Information such as “cutting-edge technology” and “trained on vast datasets” causes the AI to be perceived as an “authority”
LLM-Specific Amplification Factors
With traditional automation (instrument readouts, alerts), the mechanical format of the output made it easier for humans to consciously register that “this is a machine’s judgment.” LLMs, however, are qualitatively different in several respects 2:
- The fluency trap: LLMs generate grammatically correct, well-structured natural language. Fluency is easily misinterpreted as a proxy for accuracy
- Confidence mimicry: LLMs tend to respond in an assertive tone even when uncertain. In human conversation, a confident tone often signals accuracy, but this correlation does not reliably hold for LLMs
- Conversational interface: Chat-based UIs evoke associations with human-to-human conversation, creating the impression that “the other party understands and is responding thoughtfully”
Ibrahim et al. (2025) identify LLMs’ function as “collaborative thinking partners” as a distinguishing characteristic from earlier technologies, noting that this elevates the risk of overreliance beyond previous levels 6.
Evidence from Medicine: Even Experts Cannot Escape the Bias
An important study demonstrates the extent of automation bias’s impact in high-stakes domains. Dratsch et al. (2023) examined how AI BI-RADS category predictions affected mammography reading among 27 radiologists with varying levels of experience 7.
Experimental Design
- Participants: 27 radiologists (11 inexperienced, 11 moderately experienced, 5 veterans)
- Task: Assign BI-RADS categories to 50 mammography images
- Conditions: AI shows correct category vs. intentionally incorrect category
- Journal: Radiology (peer-reviewed, a top journal in radiology)
Results
| Experience Level | Accuracy with Correct AI | Accuracy with Incorrect AI | Accuracy Decline (percentage points) |
|---|---|---|---|
| Inexperienced (< 2 months) | 79.7% ± 11.7 | 19.8% ± 14.0 | −59.9 |
| Moderate (avg. 13 months) | 81.3% ± 10.1 | 24.8% ± 11.6 | −56.5 |
| Veteran (avg. 10.8 years) | 82.3% ± 4.2 | 45.5% ± 9.1 | −36.8 |
When the AI was wrong, inexperienced radiologists’ accuracy plummeted from 79.7% to 19.8%. Notably, even veterans with over 10 years of experience saw a significant drop from 82.3% to 45.5% 7.
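The declines in the table are differences in percentage points, which is not the same as a relative drop. The following Python sketch recomputes both from the mean accuracies in the table; the `decline` function and the `accuracy` dictionary are names introduced here for illustration only.

```python
# Mean accuracies (%) from the table above: (with correct AI, with incorrect AI)
accuracy = {
    "Inexperienced": (79.7, 19.8),
    "Moderate": (81.3, 24.8),
    "Veteran": (82.3, 45.5),
}


def decline(correct_ai: float, incorrect_ai: float) -> tuple[float, float]:
    """Return (percentage-point drop, relative drop in percent)."""
    points = correct_ai - incorrect_ai
    relative = points / correct_ai * 100
    return round(points, 1), round(relative, 1)


if __name__ == "__main__":
    for group, (ok, bad) in accuracy.items():
        points, relative = decline(ok, bad)
        print(f"{group}: -{points} points ({relative}% relative drop)")
```

In relative terms the drops are 75.2%, 69.5%, and 44.7%: inexperienced readers lost about three quarters of their correct answers, and veterans still lost close to half.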
Implications for Engineering
These results carry important implications for software development:
- In scenarios where AI-generated code is “reviewed,” the reviewer’s judgment may be anchored by the AI’s output
- Even experienced senior engineers are not fully immune to this bias
- The assumption that “it’s safe because a human reviewed it” can become overconfidence when automation bias is factored in
Limitations of This Study
The sample of 27 readers (only 5 in the veteran group) is small, so statistical power is limited and the group-level estimates should be interpreted with caution. Furthermore, mammography reading and code review are different tasks, and direct comparison requires care. However, the qualitative finding that “even experts are susceptible to automation bias” is consistent with the conclusions of Romeo & Conti (2025), who reviewed 35 prior studies 1.
AI Favors AI — The Danger of Feedback Loops
The problem of automation bias extends beyond humans overtrusting AI. A study published in PNAS demonstrated that LLMs themselves preferentially favor output generated by other LLMs 8.
Study Design
Laurito, Davis, Grietzer, Gavenčiak, Böhm, & Kulveit (2025) adapted experimental designs from employment discrimination research to test whether five LLMs — GPT-3.5, GPT-4, Llama-3.1-70B, Mixtral-8x22B, and Qwen2.5-72B — prefer “human-written descriptions” or “LLM-generated descriptions” across three domains 8.
Results
| Domain | GPT-4 Preference for LLM-Generated | Human Preference for LLM-Generated |
|---|---|---|
| Consumer products | 89% | 36% |
| Academic papers | 78% | 61% |
| Movie descriptions | 70% | 58% |
GPT-4 preferred LLM-generated descriptions for consumer products 89% of the time. The gap relative to human preference (36%) is striking (p < 10⁻¹⁶) 8.
Implications for Engineering: The Pitfall of AI Review Chains
These findings raise a red flag for software development workflows where “AI-written code is reviewed by AI.”
flowchart TB
A["AI generates code"] --> B["AI reviews code"]
B --> C["AI generates tests"]
C --> D["AI evaluates quality"]
D -.->|"Feedback loop<br>AI output is favored<br>at each stage"| A
E["No point where<br>human judgment<br>intervenes"] -.-> F["Bias<br>accumulates"]
When AI-based quality checks are integrated into CI/CD pipelines and consecutive stages operate without human judgment, a structure can emerge in which AIs mutually affirm each other’s outputs.
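One way to guard against such a chain is to check the pipeline definition itself. The Python sketch below flags pipelines in which too many AI-driven stages run back to back. The `Stage` model, the function names, and the threshold of two consecutive AI stages are assumptions for illustration, not values taken from the cited study or from any CI/CD product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    actor: str  # "ai" or "human"


def longest_ai_run(stages: list[Stage]) -> int:
    """Length of the longest run of consecutive AI-only stages."""
    longest = current = 0
    for stage in stages:
        current = current + 1 if stage.actor == "ai" else 0
        longest = max(longest, current)
    return longest


def needs_human_checkpoint(stages: list[Stage], max_ai_run: int = 2) -> bool:
    """Flag pipelines where more than `max_ai_run` AI stages run back to back.

    The default of 2 is an arbitrary illustrative threshold.
    """
    return longest_ai_run(stages) > max_ai_run


if __name__ == "__main__":
    ai_only = [
        Stage("generate code", "ai"),
        Stage("review code", "ai"),
        Stage("generate tests", "ai"),
        Stage("evaluate quality", "ai"),
    ]
    with_review = ai_only[:2] + [Stage("independent review", "human")] + ai_only[2:]
    print(needs_human_checkpoint(ai_only))      # 4 AI stages in a row
    print(needs_human_checkpoint(with_review))  # longest AI run is 2
```

A check like this cannot judge the quality of the human review, but it makes the absence of any human checkpoint visible before the pipeline runs.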
Limitations of This Study
This study measures preference between text descriptions; whether the findings carry over to code quality evaluation requires further investigation. Additionally, the LLMs tested were models from 2024 or earlier, and replication is needed to confirm whether the same tendencies persist in newer models.
The Two Faces of Bias by Skill Level: AI Aversion vs. AI Blind Trust
What makes automation bias particularly complex is that not everyone exhibits the bias in the same direction. Research shows that opposite biases emerge depending on skill level.
The Dual Structure
flowchart TB
S["Skill Level"]
S --> Low["Low skill"]
S --> High["High skill"]
Low --> OB["Automation bias<br>(AI blind trust)"]
High --> AA["Algorithm aversion<br>(AI undervaluation)"]
OB --> LR["Overlooks AI errors<br>Cannot independently verify"]
AA --> HR["Unnecessarily rejects<br>even correct AI output"]
LR --> Sub["Both deviate from<br>optimal decision-making"]
HR --> Sub
Low-skill pattern (AI blind trust):
In the Dratsch et al. (2023) study, inexperienced radiologists saw their accuracy plummet from 79.7% to 19.8% when the AI provided incorrect categories 7. Their answers tracked the AI’s suggestion far more closely than those of veterans, whose accuracy fell to 45.5%. This is consistent with a structure where lack of confidence in one’s own judgment leads to the inference that “the AI must be right.”
High-skill pattern (algorithm aversion):
Conversely, experienced experts can exhibit bias in the opposite direction. Romeo & Conti’s (2025) review reports a tendency among those with high domain expertise to reject AI recommendations based on their own established heuristics 1. Prior research by Fügener et al. (2022) demonstrated that humans struggle to properly distinguish which tasks should be delegated to AI and which should not — in other words, insufficient “meta-knowledge” about the limits of one’s own knowledge leads to inefficient delegation 9.
How This Manifests in Engineering Teams
This dual nature of bias can appear in development teams as follows:
| Situation | Junior (AI blind trust tendency) | Senior (AI aversion tendency) |
|---|---|---|
| Copilot suggestion | Accepts with tab without scrutiny | Rejects even useful suggestions |
| AI review results | Applies all flagged items unconditionally | Ignores valid flags because “it’s just the AI” |
| AI-generated tests | Done once “tests pass” | Rewrites AI tests entirely |
| Optimal behavior | Verifies AI output against own knowledge | Evaluates AI output as a “hypothesis” |
Both biases undermine complementarity, the idea that appropriately combining human and AI judgment yields better outcomes than either could achieve alone, which is what human-AI collaboration is supposed to deliver 6.
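The “AI-generated tests” row of the table above can be made concrete with a small sanity check, sketched here in Python. The idea is borrowed from mutation testing: a sound test should fail when it is handed a deliberately broken implementation, so a test that passes for both the correct and the broken version tells you nothing. All names (`apply_discount`, `broken_discount`, `weak_test`, `strong_test`, `detects_mutant`) are hypothetical.

```python
from typing import Callable

Impl = Callable[[float, float], float]


def apply_discount(price: float, percent: float) -> float:
    """Reference implementation: reduce price by the given percentage."""
    return price * (1 - percent / 100)


def broken_discount(price: float, percent: float) -> float:
    """Deliberately wrong variant (a 'mutant'): ignores the discount."""
    return price


def weak_test(impl: Impl) -> bool:
    """Only checks the return type, so it cannot detect wrong values."""
    return isinstance(impl(100.0, 20.0), float)


def strong_test(impl: Impl) -> bool:
    """Checks a concrete expected value worked out by hand."""
    return abs(impl(100.0, 20.0) - 80.0) < 1e-9


def detects_mutant(test: Callable[[Impl], bool]) -> bool:
    """A sound test passes on the reference and fails on the mutant."""
    return test(apply_discount) and not test(broken_discount)


if __name__ == "__main__":
    print("weak_test detects mutant:", detects_mutant(weak_test))
    print("strong_test detects mutant:", detects_mutant(strong_test))
```

Here `weak_test` passes on both implementations and therefore fails the check, while `strong_test` catches the mutant. “Tests pass” is only meaningful for tests of the second kind.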
Parallels to the Dunning-Kruger Effect
This structure loosely resembles the Dunning-Kruger effect (the tendency for low-ability individuals to overestimate their competence while high-ability individuals underestimate theirs). The picture is not a simple two-way split, however. Along a different axis, experience with AI tools rather than domain skill, a nonlinear pattern has been reported: algorithm aversion is somewhat stronger among those with the least AI experience, automation bias peaks at moderate experience levels, and the bias attenuates again among the most experienced 1.
Countermeasures: Nudges, Explainability, and Process Design
Automation bias is not a matter of individual “carelessness” — it is a structural problem rooted in human cognitive architecture. Consequently, exhortations to “just pay more attention” will not solve it. The countermeasures suggested by research can be organized into three broad approaches.
1. Nudges (Behavioral Economics Interventions)
In Wingerter et al.’s (2025) CRT experiment, a concise warning message (nudge) stating “AI answers may be inaccurate” showed a measurable effect in mitigating automation bias 5.
Engineering applications:
- Code review tools: Automatically label AI-generated code (e.g., “AI-generated”) to alert reviewers
- CI/CD pipelines: Display confidence scores alongside AI-based quality check results
- IDE settings: Introduce a confirmation step — a deliberate pause — before accepting Copilot suggestions
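The first application, labeling AI-generated code for reviewers, might look like the following Python sketch. It assumes a team convention in which commits carry an `AI-Assisted: true` trailer in the last paragraph of the commit message. That trailer name is hypothetical (it is not a Git standard), as are the function names and the banner wording.

```python
AI_TRAILER = "ai-assisted"  # hypothetical trailer key, not a Git standard


def is_ai_assisted(commit_message: str) -> bool:
    """True if the last paragraph of the message contains 'AI-Assisted: true'."""
    paragraphs = commit_message.strip().split("\n\n")
    trailer_block = paragraphs[-1]
    for line in trailer_block.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == AI_TRAILER:
            return value.strip().lower() == "true"
    return False


def review_banner(commit_message: str) -> str:
    """Return a nudge for reviewers, or an empty string for unlabeled commits."""
    if is_ai_assisted(commit_message):
        return "AI-generated: may be inaccurate. Verify against the requirements."
    return ""


if __name__ == "__main__":
    message = (
        "Add retry logic\n\n"
        "Wraps fetch() in exponential backoff.\n\n"
        "AI-Assisted: true"
    )
    print(review_banner(message))
```

The banner text deliberately mirrors the short warning used in the CRT experiment. Whether such a label has the same effect in a code review tool has not been tested in the studies cited here.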
However, nudges alone have their limits. The CRT experiment showed that even participants with prior knowledge of the test could not fully avoid the bias, which suggests that nudges are a “mitigation strategy,” not a “solution” 5.
2. Explainable AI (XAI)
Romeo & Conti’s (2025) review finds that “Explainable AI” (XAI), which makes an AI’s reasoning process transparent, is promising for mitigating automation bias but effective only under certain conditions 1.
Conditions under which explanations are effective:
- Users possess sufficient knowledge to understand the explanation
- Explanations are designed to encourage critical evaluation (simply displaying the AI’s confidence level is insufficient)
- Users have the motivation to take verification actions themselves
3. Process Design (An Engineering Approach)
Synthesizing the research findings, the more robust strategy appears to be designing processes in which the bias is less likely to occur, rather than relying on changes to individual mindsets alone.
flowchart TB
subgraph WRONG["Flow Where Bias Accumulates"]
direction TB
W1["AI generates code"] --> W2["Developer gives it<br>a quick glance"]
W2 --> W3["Merge"]
end
subgraph RIGHT["Flow That Mitigates Bias"]
direction TB
R1["AI generates code"] --> R2["Developer organizes<br>requirements without AI"]
R2 --> R3["Verifies AI output<br>against requirements"]
R3 --> R4["Another developer<br>reviews independently"]
R4 --> R5["Merge"]
end
Practical recommendations:
- Form your own hypothesis before seeing the AI’s output: Before having the AI generate code, think through “what should this function return?” and “what are the edge cases?” yourself
- Apply the Generator-Verifier pattern: Treat AI-generated code as a “hypothesis” (Generator) and clearly separate the “verification” (Verifier) process — testing, static analysis, and manual review
- Ensure independence at the team level: Consider mechanisms where code reviewers conduct reviews without knowing that “the AI wrote it” (blind review)
- Test the independence of your judgment: For important code reviews and architecture decisions, write down your own conclusions before consulting the AI. Then compare with the AI’s output, and when they differ, investigate “why.” Build a regular habit of checking whether your judgment is being anchored by AI
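The first two recommendations can be combined in a small workflow, sketched below in Python. The developer writes the expected behavior down before looking at any AI output, then treats the AI-suggested function as a hypothesis and runs it against those expectations. The `slugify_candidate` function stands in for an AI suggestion and is entirely hypothetical; its bug with repeated hyphens is deliberate.

```python
from typing import Callable

# Step 1 (before looking at any AI output): the developer writes down
# expected behavior, including edge cases, as (input, expected output) pairs.
expected_cases: list[tuple[str, str]] = [
    ("Hello World", "hello-world"),
    ("  padded  ", "padded"),
    ("", ""),
    ("a--b", "a-b"),
]


# Step 2: an AI-suggested implementation, treated as a hypothesis.
# (Hypothetical example; it does not collapse repeated hyphens.)
def slugify_candidate(text: str) -> str:
    return text.strip().lower().replace(" ", "-")


def verify(
    candidate: Callable[[str], str],
    cases: list[tuple[str, str]],
) -> list[tuple[str, str, str]]:
    """Step 3: return (input, expected, actual) for every mismatch."""
    failures = []
    for given, expected in cases:
        actual = candidate(given)
        if actual != expected:
            failures.append((given, expected, actual))
    return failures


if __name__ == "__main__":
    for given, expected, actual in verify(slugify_candidate, expected_cases):
        print(f"input={given!r}: expected {expected!r}, got {actual!r}")
```

The candidate passes the three obvious cases and fails only on `"a--b"`, an edge case that is easy to miss when reading plausible-looking code. Because the expectations were written first, they cannot have been anchored by the AI’s answer.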
Discussion: The Core of AI Literacy Is the Ability to Question
Research on automation bias offers insights that get to the heart of what it truly means to “use AI effectively.”
“AI literacy” is commonly discussed as “the ability to use AI effectively.” However, what the research reviewed here demonstrates is that “the ability to effectively question AI” is the true core of AI literacy.
Metacognition — the ability to objectively monitor one’s own thought processes — is critically important in collaborating with AI for precisely this reason. Evaluating AI output requires continually asking oneself: “Am I being influenced by the AI’s output?” and “Is my judgment truly independent?”
Moreover, the behavior where experienced practitioners appear “excessively cautious” toward AI output becomes rational when reinterpreted through the lens of automation bias. With experience comes the ability to independently judge correct answers, enabling one to treat AI output as “something to be verified.” On the other hand, when this attitude goes too far, there is a risk of sliding into “algorithm aversion.”
The danger of “delegating everything to AI” is directly tied to automation bias: delegation without active cognition removes the very step at which an error could be caught.
Understanding that AI’s reasoning capabilities have limits is the most basic line of defense against automation bias. Merely knowing that “AI makes mistakes” is not sufficient on its own (as the CRT experiment suggests), but it at least establishes the precondition for questioning.
Summary
Here is a synthesis of the research findings on automation bias covered in this article:
- Automation bias is a cognitive characteristic, not individual “negligence”: Everyone is susceptible to this bias. Even veteran radiologists with over 10 years of experience saw their accuracy drop from 82.3% to 45.5% when given incorrect AI predictions 7
- LLMs amplify this bias: Fluent natural language, a confident tone, and conversational interfaces lower human vigilance more than traditional automation ever did 2
- Opposite biases emerge depending on skill level: Low-skill individuals tend toward AI blind trust (automation bias), while high-skill individuals tend toward AI undervaluation (algorithm aversion). Both deviate from optimal collaboration 1, 7
- Education and warnings alone are insufficient: In the CRT experiment, even participants with prior knowledge were not immune to the bias. Process design and systemic countermeasures are necessary 5
- AI evaluating AI also warrants caution: LLMs exhibit a bias toward favoring other LLMs’ output, and AI-to-AI feedback loops can become blind spots in quality assurance 8
The potential for AI tools to enhance engineer productivity is not in question. However, to maximize that potential, it is essential to recognize automation bias as an “invisible pitfall” and to address it through both individual mindset and process design.
Related Articles
- Only Those with High Metacognitive Skills See Creativity Gains from AI - A deep dive into the relationship between metacognition and AI use
- The Truth Behind Experts Who Seem to “Blindly Delegate” to AI - Analysis of why experienced practitioners are cautious with AI output
- The AI Delegation Paradox - The risks of AI delegation without active cognition
- Cognitive Offloading to AI — Does Outsourcing Thought Erode Critical Thinking? - Analysis of the risks of outsourcing the act of thinking itself
- AI Reasoning Limitations: A Practical Guide - An explanation of AI’s fundamental limitations
References
1. Romeo, G., & Conti, D. (2025). Exploring automation bias in human–AI collaboration: a review and implications for explainable AI. AI & Society, Springer. https://link.springer.com/article/10.1007/s00146-025-02422-7 Peer-reviewed. A systematic review covering 35 prior studies.
2. Ibrahim, L., Collins, K. M., et al. (2025). Measuring and mitigating overreliance is necessary for building human-compatible AI. arXiv preprint, 2509.08010. https://arxiv.org/abs/2509.08010 Preprint (not yet peer-reviewed). A comprehensive analysis of LLM overreliance risks.
3. GitHub (2022). Research: quantifying GitHub Copilot’s impact on developer productivity and happiness. GitHub Blog. GitHub’s official report; source for the approximately 30% code suggestion acceptance rate.
4. GitClear (2025). Measuring AI Coding Assistant Quality: 2025 Research Report. An independent study on AI-generated code churn rates (41% increase).
5. Wingerter, T. L., Straub, T., & Schweitzer, S. (2025). Mitigating Automation Bias in Generative AI Through Nudges: A Cognitive Reflection Test Study. Procedia Computer Science. https://www.sciencedirect.com/science/article/pii/S1877050925030042 Peer-reviewed conference paper.
6. Ibrahim, L., Collins, K. M., et al. (2025). Same as 2. Discussion on LLMs functioning as “collaborative thinking partners.”
7. Dratsch, T., Chen, X., Rezazade Mehrizi, M., et al. (2023). Automation Bias in Mammography: The Impact of Artificial Intelligence BI-RADS Suggestions on Reader Performance. Radiology, 307(4), e222176. https://pubs.rsna.org/doi/10.1148/radiol.222176 Peer-reviewed. A prospective study of 27 radiologists. Note the small veteran group (n=5).
8. Laurito, W., Davis, B., Grietzer, P., Gavenčiak, T., Böhm, A., & Kulveit, J. (2025). AI–AI bias: Large language models favor communications generated by large language models. Proceedings of the National Academy of Sciences (PNAS), 122(31). https://www.pnas.org/doi/10.1073/pnas.2415697122 Peer-reviewed. Five LLMs tested across three domains.
9. Fügener, A., Grahl, J., Gupta, A., & Ketter, W. (2022). Cognitive Challenges in Human–Artificial Intelligence Collaboration: Investigating the Path Toward Productive Delegation. Information Systems Research, 33(2), 678–696. Peer-reviewed. Prior research demonstrating that insufficient meta-knowledge (awareness of the limits of one’s own knowledge) leads to inefficient AI delegation.