Automation Bias — Why We Fail to Catch AI's Mistakes

  • Target audience: IT engineers who use AI tools in their daily work
  • Prerequisites: Basic experience with AI coding tools such as GitHub Copilot and ChatGPT
  • Reading time: 15 minutes

Overview

Hitting tab to accept GitHub Copilot’s suggestion. Copy-pasting ChatGPT’s response verbatim. Approving AI-generated code in a code review with a casual “looks about right.” Behind all of these behaviors lies a cognitive bias known as automation bias. Drawing on multiple peer-reviewed studies published between 2023 and 2025, this article examines the mechanisms behind this bias, the risks unique to the LLM era, and the countermeasures engineers should adopt.

“If the AI Says So, It Must Be Right” — What Is Automation Bias?

Automation bias is the cognitive tendency to place excessive trust in the output of automated systems, prioritizing the system’s judgment over one’s own 1.

The concept itself has been studied since the 1990s in the context of aircraft autopilot systems and medical devices. However, the advent of LLMs (large language models) has expanded this bias to an unprecedented scale. Unlike traditional automation — instrument readings and alerts — LLMs respond in fluent natural language. Because they generate plausible-sounding text delivered with a confident tone, the triggers that normally prompt humans to “question” are far less likely to fire 2.

Translated to a software engineer’s daily experience, this is a very tangible problem:

  • Accepting GitHub Copilot’s code suggestions by hitting tab without scrutinizing the content
  • Confirming only that AI-generated test code “passes” without verifying whether the tests themselves are sound
  • Waving AI-authored code through code review with the assumption that “the AI wrote it, so it’s probably fine”
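
The second item can be made concrete. In the minimal sketch below, a test is judged by whether it fails against a deliberately broken implementation; a test that passes against both versions verifies nothing. All function names are hypothetical and chosen only for illustration.

```python
from typing import Callable

def apply_discount(price: float, percent: float) -> float:
    """Reference implementation: reduce price by the given percentage."""
    return price * (1 - percent / 100)

def apply_discount_buggy(price: float, percent: float) -> float:
    """Deliberately broken variant: ignores the discount entirely."""
    return price

def weak_test(fn: Callable[[float, float], float]) -> bool:
    """Only checks the return type, so it passes for almost any implementation."""
    return isinstance(fn(100.0, 20.0), float)

def sound_test(fn: Callable[[float, float], float]) -> bool:
    """Checks a concrete value worked out by hand (20% off 100 is 80)."""
    return abs(fn(100.0, 20.0) - 80.0) < 1e-9

def detects_bug(test: Callable[..., bool]) -> bool:
    """A test is useful only if it passes on correct code and fails on broken code."""
    return test(apply_discount) and not test(apply_discount_buggy)

print(detects_bug(weak_test))   # the weak test cannot tell the two versions apart
print(detects_bug(sound_test))
```

Both tests "pass" on the correct implementation, which is exactly why a green result alone says little about whether a test is sound.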

According to GitHub’s own reporting, roughly 30% of Copilot’s code suggestions are accepted as-is 3. While this figure is not necessarily high on its own, questions remain about the quality of accepted code. An independent study found that AI-generated code exhibits a 41% higher churn rate (the rate at which code is rewritten shortly after being introduced) compared to human-written code, suggesting that problems often go unnoticed at the time of acceptance 4.

CRT Experiment: Incorrect AI Assistance Cuts Accuracy by More Than Half

Study Design

Wingerter et al. (2025) conducted an online experiment using the Cognitive Reflection Test (CRT-2) to investigate automation bias toward generative AI 5. The CRT is designed so that the intuitive (but incorrect) answer diverges from the correct answer reached through deliberation, measuring a participant’s ability to “stop and think.”

The experiment used a between-subjects design with three groups:

flowchart TB
    A["Participants"] --> B["Control group<br>No AI assistance"]
    A --> C["Incorrect AI group<br>Incorrect AI answers presented"]
    A --> D["Incorrect AI + Nudge group<br>Incorrect AI answers +<br>warning message"]

    B --> E["Solve CRT-2 independently"]
    C --> F["Incorrect answers presented<br>via AI chatbot-style UI"]
    D --> G["Incorrect answers +<br>warning: 'AI answers<br>may be inaccurate'"]

In the two AI-assisted conditions, each CRT-2 question page displayed a screenshot mimicking an AI chatbot UI, presenting an intentionally incorrect answer.

Results

The group that received incorrect AI assistance answered fewer than half as many questions correctly as the control group (p < 0.01) 5. Participants failed to exercise the “thinking against intuition” capacity that the CRT measures and instead followed the AI’s answers.

Notably, even participants who were already familiar with the CRT were not fully immune to the influence of incorrect AI assistance 5. This suggests that “AI literacy education” and “prior warnings” alone cannot fundamentally eliminate this bias.

Limitations of This Study

This was an online experiment and did not fully replicate real-world work environments. Because it used a specific test (CRT-2), the extent to which results generalize to other tasks (e.g., code review) requires further investigation. Additionally, exact sample sizes and group allocations should be confirmed from the preprint version.

The Cognitive Mechanisms Behind Automation Bias

Why do humans trust AI output to such a degree? The review by Romeo & Conti (2025) organizes the multiple cognitive mechanisms underlying automation bias 1.

flowchart TB
    AB["Automation<br>Bias"]
    AB --> NB["Normalcy bias<br>'The AI couldn't possibly<br>be wrong'"]
    AB --> CB["Complacency bias<br>'It's been right so far,<br>so it must be right now'"]
    AB --> RB["Rationalization bias<br>'If my intuition differs<br>from the AI, I must<br>be the one who's wrong'"]
    AB --> AuthB["Authority bias<br>'It's a product of<br>advanced technology, so it<br>must be more accurate<br>than humans'"]

  • Normalcy bias: Past experience of “this AI has been accurate” leads to underestimating the possibility of errors
  • Complacency bias: Familiarity with the system erodes attentiveness, leading to uncritical acceptance of output
  • Rationalization bias: When the AI’s judgment conflicts with one’s own, the individual revises their own judgment instead
  • Authority bias: Information such as “cutting-edge technology” and “trained on vast datasets” causes the AI to be perceived as an “authority”

LLM-Specific Amplification Factors

With traditional automation (instrument readouts, alerts), the mechanical format of the output made it easier for humans to consciously register that “this is a machine’s judgment.” LLMs, however, are qualitatively different in several respects 2:

  1. The fluency trap: LLMs generate grammatically correct, well-structured natural language. Fluency is easily misinterpreted as a proxy for accuracy
  2. Confidence mimicry: LLMs tend to respond in an assertive tone even when uncertain. In human conversation, a confident tone signals accuracy — but this correlation does not hold for LLMs
  3. Conversational interface: Chat-based UIs evoke associations with human-to-human conversation, creating the impression that “the other party understands and is responding thoughtfully”

Ibrahim et al. (2025) identify LLMs’ function as “collaborative thinking partners” as a distinguishing characteristic from earlier technologies, noting that this elevates the risk of overreliance beyond previous levels 6.

Evidence from Medicine: Even Experts Cannot Escape the Bias

An important study demonstrates the extent of automation bias’s impact in high-stakes domains. Dratsch et al. (2023) examined how AI BI-RADS category predictions affected mammography reading among 27 radiologists with varying levels of experience 7.

Experimental Design

  • Participants: 27 radiologists (11 inexperienced, 11 moderately experienced, 5 veterans)
  • Task: Assign BI-RADS categories to 50 mammography images
  • Conditions: AI shows correct category vs. intentionally incorrect category
  • Journal: Radiology (peer-reviewed, a top journal in radiology)

Results

| Experience Level | Accuracy with Correct AI | Accuracy with Incorrect AI | Accuracy Decline |
| --- | --- | --- | --- |
| Inexperienced (< 2 months) | 79.7% ± 11.7 | 19.8% ± 14.0 | −59.9 points |
| Moderate (avg. 13 months) | 81.3% ± 10.1 | 24.8% ± 11.6 | −56.5 points |
| Veteran (avg. 10.8 years) | 82.3% ± 4.2 | 45.5% ± 9.1 | −36.8 points |

When the AI was wrong, inexperienced radiologists’ accuracy plummeted from 79.7% to 19.8%. Notably, even veterans with over 10 years of experience saw a significant drop from 82.3% to 45.5% 7.
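
The decline column is the difference between the two accuracy figures, expressed in percentage points. The short sketch below recomputes it from the mean values reported above; the dictionary layout and function name are illustrative choices, not part of the study.

```python
# Mean accuracy (%) per experience level, as reported above:
# (accuracy with correct AI, accuracy with incorrect AI)
accuracy = {
    "Inexperienced": (79.7, 19.8),
    "Moderate": (81.3, 24.8),
    "Veteran": (82.3, 45.5),
}

def decline_points(correct_ai: float, incorrect_ai: float) -> float:
    """Absolute drop in percentage points, rounded to one decimal place."""
    return round(correct_ai - incorrect_ai, 1)

declines = {level: decline_points(*pair) for level, pair in accuracy.items()}
for level, drop in declines.items():
    print(f"{level}: -{drop} points")
```

Note that these are percentage-point differences, not relative changes: a drop from 79.7% to 19.8% is 59.9 points, which is a far larger relative loss than the number alone suggests.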

Implications for Engineering

These results carry important implications for software development:

  • In scenarios where AI-generated code is “reviewed,” the reviewer’s judgment may be anchored by the AI’s output
  • Even experienced senior engineers are not fully immune to this bias
  • The assumption that “it’s safe because a human reviewed it” can become overconfidence when automation bias is factored in

Limitations of This Study

The sample size of 27 (with only 5 in the veteran group) is small, so the subgroup comparisons have limited statistical power and should be interpreted with caution. Furthermore, mammography reading and code review are different tasks, and direct comparison requires care. However, the qualitative finding that “even experts are susceptible to automation bias” is consistent with the conclusions of Romeo & Conti (2025), who reviewed 35 prior studies 1.

AI Favors AI — The Danger of Feedback Loops

The problem of automation bias extends beyond humans overtrusting AI. A study published in PNAS demonstrated that LLMs themselves preferentially favor output generated by other LLMs 8.

Study Design

Laurito, Davis, Grietzer, Gavenčiak, Böhm, & Kulveit (2025) adapted experimental designs from employment discrimination research to test whether five LLMs — GPT-3.5, GPT-4, Llama-3.1-70B, Mixtral-8x22B, and Qwen2.5-72B — prefer “human-written descriptions” or “LLM-generated descriptions” across three domains 8.

Results

| Domain | GPT-4 Preference for LLM-Generated | Human Preference for LLM-Generated |
| --- | --- | --- |
| Consumer products | 89% | 36% |
| Academic papers | 78% | 61% |
| Movie descriptions | 70% | 58% |

GPT-4 preferred LLM-generated descriptions for consumer products 89% of the time. The gap relative to human preference (36%) is striking (p < 10⁻¹⁶) 8.

Implications for Engineering: The Pitfall of AI Review Chains

These findings raise a red flag for software development workflows where “AI-written code is reviewed by AI.”

flowchart TB
    A["AI generates code"] --> B["AI reviews code"]
    B --> C["AI generates tests"]
    C --> D["AI evaluates quality"]

    D -.->|"Feedback loop<br>AI output is favored<br>at each stage"| A

    E["No point where<br>human judgment<br>intervenes"] -.-> F["Bias<br>accumulates"]

When AI-based quality checks are integrated into CI/CD pipelines and consecutive stages operate without human judgment, a structure can emerge in which AIs mutually affirm each other’s outputs.
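
One way to make this risk visible is to check a pipeline definition for runs of consecutive AI-only stages. The sketch below is illustrative only: the `Stage` type, the `actor` labels, and the `max_ai_run` threshold are hypothetical and do not correspond to any particular CI/CD product.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Stage:
    name: str
    actor: str  # "ai" or "human" (hypothetical labels)

def longest_ai_run(stages: list[Stage]) -> int:
    """Length of the longest run of consecutive AI-only stages."""
    longest = current = 0
    for stage in stages:
        current = current + 1 if stage.actor == "ai" else 0
        longest = max(longest, current)
    return longest

def needs_human_checkpoint(stages: list[Stage], max_ai_run: int = 2) -> bool:
    """Flag pipelines where more than max_ai_run AI stages run back to back."""
    return longest_ai_run(stages) > max_ai_run

ai_only = [Stage("generate code", "ai"), Stage("review code", "ai"),
           Stage("generate tests", "ai"), Stage("evaluate quality", "ai")]
with_human = [Stage("generate code", "ai"), Stage("review code", "human"),
              Stage("generate tests", "ai"), Stage("evaluate quality", "ai")]

print(needs_human_checkpoint(ai_only))
print(needs_human_checkpoint(with_human))
```

The threshold of two is an arbitrary assumption; the point is only that the absence of a human stage can be detected mechanically rather than left to chance.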

Limitations of This Study

This study focuses on text preference, and whether the findings apply directly to code quality evaluation requires further investigation. Additionally, the LLMs tested were 2024-era models, and replication is needed to confirm whether the same tendencies persist in newer models.

The Two Faces of Bias by Skill Level: AI Aversion vs. AI Blind Trust

What makes automation bias particularly complex is that not everyone exhibits the bias in the same direction. Research shows that opposite biases emerge depending on skill level.

The Dual Structure

flowchart TB
    S["Skill Level"]
    S --> Low["Low skill"]
    S --> High["High skill"]

    Low --> OB["Automation bias<br>(AI blind trust)"]
    High --> AA["Algorithm aversion<br>(AI undervaluation)"]

    OB --> LR["Overlooks AI errors<br>Cannot independently verify"]
    AA --> HR["Unnecessarily rejects<br>even correct AI output"]

    LR --> Sub["Both deviate from<br>optimal decision-making"]
    HR --> Sub

Low-skill pattern (AI blind trust):

In the Dratsch et al. (2023) study, inexperienced radiologists saw their accuracy plummet from 79.7% to 19.8% when the AI provided incorrect categories 7. Despite being capable of answering correctly on their own, they deferred to the AI’s judgment. This reflects a structure where lack of confidence in one’s own judgment leads to the inference that “the AI must be right.”

High-skill pattern (algorithm aversion):

Conversely, experienced experts can exhibit bias in the opposite direction. Romeo & Conti’s (2025) review reports a tendency among those with high domain expertise to reject AI recommendations based on their own established heuristics 1. Prior research by Fügener et al. (2022) demonstrated that humans struggle to properly distinguish which tasks should be delegated to AI and which should not — in other words, insufficient “meta-knowledge” about the limits of one’s own knowledge leads to inefficient delegation 9.

How This Manifests in Engineering Teams

This dual nature of bias can appear in development teams as follows:

| Situation | Junior (AI blind trust tendency) | Senior (AI aversion tendency) |
| --- | --- | --- |
| Copilot suggestion | Accepts with tab without scrutiny | Rejects even useful suggestions |
| AI review results | Applies all flagged items unconditionally | Ignores valid flags because “it’s just the AI” |
| AI-generated tests | Done once “tests pass” | Rewrites AI tests entirely |
| Optimal behavior | Verifies AI output against own knowledge | Evaluates AI output as a “hypothesis” |

Both biases undermine the complementarity — achieving better outcomes by appropriately combining human and AI judgment than either could achieve alone — that human-AI collaboration is supposed to deliver 6.

Parallels to the Dunning-Kruger Effect

Interestingly, this structure bears a resemblance to the Dunning-Kruger effect (the tendency for low-ability individuals to overestimate their competence while high-ability individuals underestimate theirs). A nonlinear pattern has been reported where algorithm aversion is somewhat stronger among those with the least AI experience, automation bias peaks at moderate experience levels, and the bias attenuates again among the most experienced 1.

Countermeasures: Nudges, Explainability, and Process Design

Automation bias is not a matter of individual “carelessness” — it is a structural problem rooted in human cognitive architecture. Consequently, exhortations to “just pay more attention” will not solve it. The countermeasures suggested by research can be organized into three broad approaches.

1. Nudges (Behavioral Economics Interventions)

In Wingerter et al.’s (2025) CRT experiment, a concise warning message (nudge) stating “AI answers may be inaccurate” showed a measurable effect in mitigating automation bias 5.

Engineering applications:

  • Code review tools: Automatically label AI-generated code (e.g., “AI-generated”) to alert reviewers
  • CI/CD pipelines: Display confidence scores alongside AI-based quality check results
  • IDE settings: Introduce a confirmation step — a deliberate pause — before accepting Copilot suggestions
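
As a minimal sketch of the first idea, the function below prepends a warning to a review description when a change is marked as AI-generated. The `Change` type, the `ai_generated` flag, and the label wording are assumptions made for illustration; real review tools expose this information differently, if at all.

```python
from dataclasses import dataclass

NUDGE = "[AI-generated] AI output may be inaccurate. Verify against the requirements."

@dataclass
class Change:
    title: str
    description: str
    ai_generated: bool  # hypothetical flag; how it is set depends on the team's tooling

def with_nudge(change: Change) -> str:
    """Return the review description, prefixed with a warning for AI-generated changes."""
    if change.ai_generated:
        return f"{NUDGE}\n\n{change.description}"
    return change.description

print(with_nudge(Change("Add retry logic", "Retries failed requests up to 3 times.", True)))
```

The label text mirrors the kind of short warning used in the CRT experiment; whether the same wording is effective in code review is untested here.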

However, nudges alone have their limits. The CRT experiment showed that even participants with prior knowledge could not fully avoid the bias, which indicates that nudges are a “mitigation strategy,” not a “solution” 5.

2. Explainable AI (XAI)

Romeo & Conti’s (2025) review finds that “Explainable AI” (XAI) — making AI’s reasoning process transparent — is promising for mitigating automation bias, while noting that its effectiveness is limited 1.

Conditions under which explanations are effective:

  • Users possess sufficient knowledge to understand the explanation
  • Explanations are designed to encourage critical evaluation (simply displaying the AI’s confidence level is insufficient)
  • Users have the motivation to take verification actions themselves

3. Process Design (An Engineering Approach)

Synthesizing the research findings, the most effective strategy is not changing individual mindsets but designing processes where bias is less likely to occur.

flowchart TB
    subgraph WRONG["Flow Where Bias Accumulates"]
        direction TB
        W1["AI generates code"] --> W2["Developer gives it<br>a quick glance"]
        W2 --> W3["Merge"]
    end

    subgraph RIGHT["Flow That Mitigates Bias"]
        direction TB
        R1["AI generates code"] --> R2["Developer organizes<br>requirements without AI"]
        R2 --> R3["Verifies AI output<br>against requirements"]
        R3 --> R4["Another developer<br>reviews independently"]
        R4 --> R5["Merge"]
    end

Practical recommendations:

  • Form your own hypothesis before seeing the AI’s output: Before having the AI generate code, think through “what should this function return?” and “what are the edge cases?” yourself
  • Apply the Generator-Verifier pattern: Treat AI-generated code as a “hypothesis” (Generator) and clearly separate the “verification” (Verifier) process — testing, static analysis, and manual review
  • Ensure independence at the team level: Consider mechanisms where code reviewers conduct reviews without knowing that “the AI wrote it” (blind review)
  • Test the independence of your judgment: For important code reviews and architecture decisions, write down your own conclusions before consulting the AI. Then compare with the AI’s output, and when they differ, investigate “why.” Build a regular habit of checking whether your judgment is being anchored by AI
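
The first two recommendations can be combined in a small sketch: expectations are written down before looking at the AI's output, and the AI-generated function is then treated as a hypothesis to be checked against them. The function `ai_generated_median` is a hypothetical stand-in for AI output, and the helper names are likewise illustrative.

```python
from typing import Callable, Sequence

# Step 1 (before consulting the AI): record expected behavior, including edge cases.
EXPECTATIONS: list[tuple[Sequence[float], float]] = [
    ([1, 3, 2], 2),        # odd length
    ([4, 1, 3, 2], 2.5),   # even length: mean of the two middle values
    ([5], 5),              # single element
]

# Step 2 (Generator): a stand-in for AI-generated code; it contains a plausible bug.
def ai_generated_median(values: Sequence[float]) -> float:
    ordered = sorted(values)
    return ordered[len(ordered) // 2]  # wrong for even-length input

# Step 3 (Verifier): compare the hypothesis against the pre-written expectations.
def verify(fn: Callable[[Sequence[float]], float],
           expectations: list[tuple[Sequence[float], float]]) -> list[list[float]]:
    """Return the inputs for which fn disagrees with the recorded expectation."""
    return [list(inp) for inp, expected in expectations if fn(inp) != expected]

failures = verify(ai_generated_median, EXPECTATIONS)
print(failures)
```

Because the expectations were fixed before the output was seen, a disagreement prompts the question "why do these differ?" rather than a quiet revision of one's own judgment.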

Discussion: The Core of AI Literacy Is the Ability to Question

Research on automation bias offers insights that get to the heart of what it truly means to “use AI effectively.”

“AI literacy” is commonly discussed as “the ability to use AI effectively.” However, what the research reviewed here demonstrates is that “the ability to effectively question AI” is the true core of AI literacy.

Metacognition — the ability to objectively monitor one’s own thought processes — is critically important in collaborating with AI for precisely this reason. Evaluating AI output requires continually asking oneself: “Am I being influenced by the AI’s output?” and “Is my judgment truly independent?”

Moreover, the behavior where experienced practitioners appear “excessively cautious” toward AI output becomes rational when reinterpreted through the lens of automation bias. With experience comes the ability to independently judge correct answers, enabling one to treat AI output as “something to be verified.” On the other hand, when this attitude goes too far, there is a risk of sliding into “algorithm aversion.”

The danger of “delegating everything to AI” is directly intertwined with the problem of automation bias. AI delegation without active cognition maximizes the influence of the bias.

And understanding that AI’s reasoning capabilities have fundamental limitations constitutes the most basic line of defense against automation bias. Merely knowing that “AI makes mistakes” is not sufficient on its own (as the CRT experiment demonstrated), but it at least establishes the precondition for “questioning.”

Summary

Here is a synthesis of the research findings on automation bias covered in this article:

  1. Automation bias is a cognitive characteristic, not individual “negligence”: Everyone is susceptible to this bias. Even veteran radiologists with over 10 years of experience saw their accuracy drop from 82.3% to 45.5% when given incorrect AI predictions 7

  2. LLMs amplify this bias: Fluent natural language, a confident tone, and conversational interfaces lower human vigilance more than traditional automation ever did 2

  3. Opposite biases emerge depending on skill level: Low-skill individuals tend toward AI blind trust (automation bias), while high-skill individuals tend toward AI undervaluation (algorithm aversion). Both deviate from optimal collaboration 1, 7

  4. Education and warnings alone are insufficient: In the CRT experiment, even participants with prior knowledge were not immune to the bias. Process design and systemic countermeasures are necessary 5

  5. AI evaluating AI also warrants caution: LLMs exhibit a bias toward favoring other LLMs’ output, and AI-to-AI feedback loops can become blind spots in quality assurance 8

The potential for AI tools to enhance engineer productivity is not in question. However, to maximize that potential, it is essential to recognize automation bias as an “invisible pitfall” and to address it through both individual mindset and process design.



References

  1. Romeo, G., & Conti, D. (2025). Exploring automation bias in human–AI collaboration: a review and implications for explainable AI. AI & Society, Springer. https://link.springer.com/article/10.1007/s00146-025-02422-7. A systematic review covering 35 prior studies. Peer-reviewed.

  2. Ibrahim, L., Collins, K. M., et al. (2025). Measuring and mitigating overreliance is necessary for building human-compatible AI. arXiv preprint, 2509.08010. https://arxiv.org/abs/2509.08010. Preprint (not yet peer-reviewed). A comprehensive analysis of LLM overreliance risks.

  3. GitHub (2022). Research: quantifying GitHub Copilot’s impact on developer productivity and happiness. GitHub Blog. GitHub’s official report; source of the approximately 30% code suggestion acceptance rate.

  4. GitClear (2025). Measuring AI Coding Assistant Quality: 2025 Research Report. An independent study on AI-generated code churn rates (41% increase).

  5. Wingerter, T. L., Straub, T., & Schweitzer, S. (2025). Mitigating Automation Bias in Generative AI Through Nudges: A Cognitive Reflection Test Study. Procedia Computer Science. https://www.sciencedirect.com/science/article/pii/S1877050925030042. Peer-reviewed conference paper.

  6. Ibrahim, L., Collins, K. M., et al. (2025). Same as 2. Discussion on LLMs functioning as “collaborative thinking partners.”

  7. Dratsch, T., Chen, X., Rezazade Mehrizi, M., et al. (2023). Automation Bias in Mammography: The Impact of Artificial Intelligence BI-RADS Suggestions on Reader Performance. Radiology, 307(4), e222176. https://pubs.rsna.org/doi/10.1148/radiol.222176. Peer-reviewed. A prospective study of 27 radiologists. Note the small veteran group (n=5).

  8. Laurito, W., Davis, B., Grietzer, P., Gavenčiak, T., Böhm, A., & Kulveit, J. (2025). AI–AI bias: Large language models favor communications generated by large language models. Proceedings of the National Academy of Sciences (PNAS), 122(31). https://www.pnas.org/doi/10.1073/pnas.2415697122. Peer-reviewed. Five LLMs tested across three domains.

  9. Fügener, A., Grahl, J., Gupta, A., & Ketter, W. (2022). Cognitive Challenges in Human–Artificial Intelligence Collaboration: Investigating the Path Toward Productive Delegation. Information Systems Research, 33(2), 678–696. Peer-reviewed. Prior research demonstrating that insufficient meta-knowledge (awareness of the limits of one’s own knowledge) leads to inefficient AI delegation.

This post is licensed under CC BY 4.0 by the author.