Problems

AI-assisted coding is incredibly useful, but it comes with a predictable set of failure modes. The biggest risk is not that the model is useless — it’s that it is often plausible when it is wrong.

This page collects the most common mistakes teams make when using AI for software development and pairs them with practical ways to reduce the damage.

1. The trust problem

Problem: AI sounds confident even when it is wrong. It usually does not show uncertainty, cite sources, or tell you when it is guessing.

That makes it very easy to accept a bad answer that looks polished. In practice, this shows up as hallucinated APIs, outdated library suggestions, incorrect assumptions about your framework, or code that almost works.

Example:

# AI generated with 100% confidence:
def connect_database():
    # Looks authoritative, but the library choice may be outdated
    import mysql.connector  # ⚠️ verify this is current for your stack!
    conn = mysql.connector.connect(...)  # connection details elided
    return conn

Warning

Never trust the AI blindly. Always verify the code.

Always check:

  • New libraries and frameworks, especially if your model may be out of date
  • Security-critical implementations such as encryption, authentication, and authorization
  • Performance-critical code paths such as loops, caching, memory usage, and database access
  • Edge cases, error handling, and return values

Best Practices:

  • Treat AI output as a draft, not as truth
  • Verify every significant line before merging
  • Write or update tests for the behaviour you expect
  • Review the change with a human who understands the system
  • Prefer small, reversible commits
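The "draft, not truth" rule can be made concrete with a quick characterization test. A sketch: `normalize_username` is a hypothetical AI-generated helper, and the asserts pin down the behaviour we actually expect before merging.

```python
# Hypothetical AI-generated helper: fine on the happy path.
def normalize_username(name):
    return name.strip().lower()

# Pin the behaviour we expect BEFORE merging, including the
# edge cases models tend to forget.
assert normalize_username("  Alice ") == "alice"
assert normalize_username("BOB") == "bob"
assert normalize_username("") == ""      # empty input
assert normalize_username("É") == "é"    # non-ASCII survives
```

If one of these asserts fails, you learned something before production did.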

What this usually looks like in real life:

  • A code sample compiles, but calls a deprecated function or omits a required import
  • A solution works on the happy path, but fails on null, empty, or unexpected input
  • The model confidently invents a feature that exists in another library, but not yours
  • The answer references a version or API that does not match your project

2. The context problem

Problem: AI often generates code that is not context-aware and does not match your project requirements, architecture, or conventions.

GitHub’s guidance on Copilot usage emphasizes context management: keep the conversation focused, provide only what matters, and give explicit instructions when the project has conventions the model cannot infer.

Example:

# Your project uses Repository Pattern
class UserRepository:
    def __init__(self, db_session):
        self.db = db_session

# AI suddenly suggests Active Record (inconsistent!)
class Product:
    def save(self):  # Wrong Pattern!
        db.session.add(self)
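A consistent version keeps persistence in a repository, matching the project's own UserRepository. This is a sketch under the same assumption as the example above (a `db_session`-style object with an `add` method), not a full implementation:

```python
# Consistent with the project's Repository Pattern:
# persistence lives in a repository, not on the model.
class Product:
    def __init__(self, name):
        self.name = name

class ProductRepository:
    def __init__(self, db_session):
        self.db = db_session

    def add(self, product):
        # Same style as UserRepository: the session is injected,
        # the model stays persistence-free.
        self.db.add(product)
```

A fake session is enough to test it, which is exactly what the Active Record version makes hard.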

Best Practices:

  • Use .cursorrules, AGENTS.md, and copilot-instructions.md to define project requirements and local conventions
  • Keep prompts small and specific; give the model only the files or details it needs
  • Start a fresh chat when the task changes and old context is no longer useful
  • Make the architecture obvious: patterns, boundaries, naming, and testing expectations
  • If AI keeps making the same mistake, fix the instruction layer instead of repeatedly correcting the output

Tip

Context management is a skill, not a convenience.

Modern tools work better when you give them a clear spec, a small slice of relevant code, and explicit acceptance criteria. In other words: less “read my mind,” more “read this file.”

Common context mistakes:

  • Mixing multiple tasks into one prompt
  • Sending unrelated files “just in case”
  • Forgetting to mention architectural constraints
  • Assuming the model knows your internal conventions
  • Reusing stale conversation context after requirements changed

3. The hallucination problem

Problem: The model may invent functions, options, error codes, commands, or library capabilities that sound real but are not.

This is especially dangerous when the answer includes code that looks idiomatic. A hallucinated function name can be harder to spot than a syntax error because it looks professional.

Typical symptoms:

  • The API call does not exist
  • The package name is valid, but the feature belongs to a different package
  • The docs example uses flags or options that were removed in a newer version
  • The generated code depends on behaviour that was never guaranteed

What to do:

  • Verify library names, method signatures, and package versions against official docs
  • Ask the model to explain each step in plain language
  • Prefer code that can be checked with tests, type-checkers, or linters
  • Be extra cautious with answers that mention obscure APIs or brand-new releases
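Some of these checks can be automated with the standard library alone. A sketch: confirm an attribute really exists before trusting generated code, and read the real signature instead of the one the model remembers.

```python
import importlib
import inspect

def api_exists(module_name, attr_name):
    """Return True only if module_name really exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)

# A real API passes; a hallucinated one fails:
assert api_exists("json", "dumps") is True
assert api_exists("json", "dump_fastest") is False

# Check the actual signature rather than trusting the model's memory:
print(inspect.signature(importlib.import_module("json").dumps))
```

This does not replace reading the docs, but it catches invented names in seconds.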

4. The “Fundamentals Trap”

Problem: Developers use AI as a hiding place instead of a tool. They generate code they do not understand.

This is one of the fastest ways to become dependent on a tool while losing the ability to debug, refactor, or judge the quality of the output.

Warning Signs:

  • ❌ “The AI did this, I don’t know why it works”
  • ❌ Copy-paste without reading
  • ❌ No idea what a decorator does, but AI uses them everywhere
  • ❌ Debugging by “asking AI what’s wrong” instead of trying to debug yourself
  • ❌ Accepting code you could not explain in a code review
  • ❌ Not being able to tell whether a bug is in the prompt, the model output, or your app

Caution

Danger for Junior Developers:

AI can accelerate your learning curve OR destroy it.

“If you feel like a fraud because you genuinely don’t understand the code you’re submitting, that’s not imposter syndrome - that’s a sign you need to slow down and learn the fundamentals.”

Source: Mimo Blog

Best Practice for Juniors:

  • Fundamentals first: Learn Python/JavaScript basics without AI
  • Then AI as Tutor: Let AI explain concepts to you
  • Then AI as Co-Pilot: Use AI for familiar patterns
  • Never as Autopilot: Never blindly accept code

A healthy workflow:

  • Ask AI to explain the code it generated
  • Re-implement small parts yourself
  • Step through bugs with a debugger
  • Read the docs before asking for the next layer of help
  • Practice writing tests for code you do not fully trust yet
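As a concrete example of the "explain it before you accept it" habit: the decorators from the warning-signs list above are just functions that wrap other functions. A minimal sketch:

```python
import functools

def log_calls(func):
    """A decorator: takes a function, returns a wrapped version."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1              # count invocations
        return func(*args, **kwargs)    # delegate to the original
    wrapper.calls = 0
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(1, 2)
add(3, 4)
assert add.calls == 2     # the wrapper ran twice
assert add(1, 1) == 2     # the behaviour is unchanged
```

If you can rebuild something like this yourself, you can also judge whether the AI is using decorators sensibly.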

5. Security & Compliance Risks

Problem: AI often generates code with known security issues or compliance problems.

The OWASP GenAI Security Project highlights risks such as prompt injection, sensitive information disclosure, supply-chain issues, improper output handling, and unbounded consumption. In plain English: AI can produce unsafe code, and it can also help unsafe input slip into your application.

Common Issues:

# ❌ SQL Injection
def get_user(username):
    query = f"SELECT * FROM users WHERE name = '{username}'"

# ❌ Hardcoded Secrets
API_KEY = "sk-proj-abc123..."

# ❌ GDPR problematic
def log_user_action(user_id, email, action):
    logger.info(f"User {email} performed {action}")  # PII in logs!
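Safer versions of the three snippets above, as a sketch: sqlite3 stands in for the database to show a parameterized query, the secret comes from the environment, and the logging call mirrors the example but without PII.

```python
import os
import sqlite3
import logging

logger = logging.getLogger(__name__)

# ✅ Parameterized query instead of string formatting
def get_user(conn, username):
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchone()

# ✅ Secret from the environment, never hardcoded
API_KEY = os.environ.get("API_KEY", "")

# ✅ Log IDs, not personal data
def log_user_action(user_id, action):
    logger.info("User %s performed %s", user_id, action)
```

The `?` placeholder means a payload like `' OR '1'='1` is treated as data, not as SQL.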

Warning

Security is NOT delegable!

AI may know the right words, but it still makes mistakes. Be especially careful with:

  • Authentication/Authorization
  • Input Validation
  • Data Encryption
  • Secrets handling and token storage
  • GDPR/DSGVO compliance and logging
  • Output encoding and sanitization

Common security mistakes people make with AI code:

  • Copying a sample that uses string concatenation in SQL, shell commands, or HTML
  • Pasting API keys or private data into a prompt
  • Trusting generated code to validate untrusted user input
  • Logging emails, tokens, session IDs, or other personal data
  • Adding dependencies without checking maintenance, licensing, or provenance
  • Assuming the generated code is safe just because it “looks secure”

Mandatory Checklist:

  • SAST (Static Application Security Testing) tools in CI/CD where appropriate
  • Manual security review for critical flows
  • Penetration testing for exposed systems
  • GDPR impact assessment when personal data is involved
  • Input validation and output encoding everywhere data crosses trust boundaries
  • Secret scanning and dependency review before merging

Tip

If the code touches identity, payments, storage, or personal data, treat AI output as untrusted until proven otherwise.

6. Missing tests and false confidence

Problem: AI can make code look finished before it has actually been exercised.

Code without tests is easy to admire and hard to trust. The model may confidently generate the happy path and forget the cases that break production: time zones, empty arrays, retries, race conditions, and malformed inputs.

Common mistakes:

  • Accepting generated code without running the test suite
  • Writing tests that mirror the implementation instead of the requirement
  • Forgetting boundary cases such as zero, one, empty, null, and large values
  • Skipping regression tests after an AI-assisted bug fix

Better workflow:

  • Write the test first when the behaviour matters
  • Let the model implement against the test
  • Run the test suite, linters, and type checks after each meaningful change
  • Add at least one test for every bug the AI introduced
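A minimal sketch of that loop, using plain asserts and a hypothetical `chunk` helper: write the boundary cases first, then let the model implement against them.

```python
def chunk(items, size):
    """Split items into lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

# Boundary cases first: empty, single element, exact and uneven splits.
assert chunk([], 3) == []
assert chunk([1], 3) == [[1]]
assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]
assert chunk([1, 2, 3], 2) == [[1, 2], [3]]
```

The asserts describe the requirement; the implementation is disposable.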

Tip

GitHub’s Copilot guidance on context windows and TDD is a useful reminder: keep the task narrow, ask clarifying questions when requirements are vague, and use tests to catch edge cases before users do.

7. Cost explosions

Problem: API-based AI tools can get expensive, especially with heavy usage.

Cost Examples:

  • V0: $20 per simple app with frequent regeneration
  • Claude API: $15 per million tokens (depending on model)
  • GitHub Copilot: $10/month (Business $19/user)
  • Cursor: $20/month

Tip

Cost Management:

  • Use free tiers for experiments
  • Enable caching (Cursor, Claude)
  • Minimize context (only relevant files)
  • Batch operations instead of single requests
  • Set up monitoring

Common cost mistakes:

  • Regenerating the same answer repeatedly instead of improving the prompt
  • Sending entire repositories when a few files would do
  • Forgetting that long chats are expensive because they carry old context
  • Using the most expensive model for every task, even simple edits
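The arithmetic behind context bloat is worth internalizing. A sketch using the illustrative $15-per-million-token rate above (real prices differ per model and between input and output tokens):

```python
def estimate_cost(tokens, usd_per_million=15.0):
    """Rough API cost: token count times price per token."""
    return tokens * usd_per_million / 1_000_000

# A long chat that resends 50k tokens of stale context on every turn:
per_turn = estimate_cost(50_000)
assert round(per_turn, 2) == 0.75
assert round(100 * per_turn, 2) == 75.00   # 100 turns of carried context
```

Starting a fresh chat with a tight prompt is often the cheapest optimization available.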

8. Vendor lock-in

Problem: Some tools tie you to specific backends or cloud providers.

This can happen at the model layer, the hosting layer, the agent layer, or even in the workflow layer. Once your process depends on a proprietary IDE, runtime, or workflow engine, switching becomes much harder.

Examples:

  • Cursor → proprietary IDE; rules and workflows tied to the editor
  • V0 → Vercel + Supabase
  • Bolt → browser-based, difficult to transfer to production
  • Amazon Q → AWS-optimized, suboptimal for other clouds

Solution:

  • Keep code portable
  • Use abstraction layers
  • Architect for multi-cloud readiness
  • Always consider an exit strategy
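The "abstraction layer" point can be as small as one interface between your app and the vendor. A sketch with a hypothetical provider class standing in for a real SDK:

```python
from abc import ABC, abstractmethod

class CompletionProvider(ABC):
    """The only surface the rest of the app may touch."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class FakeProvider(CompletionProvider):
    """Stand-in for a vendor SDK; swap it without touching callers."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def summarize(provider: CompletionProvider, text: str) -> str:
    # Application code depends on the interface, never on a vendor.
    return provider.complete(f"Summarize: {text}")

assert summarize(FakeProvider(), "hello") == "echo: Summarize: hello"
```

Replacing the model then means writing one new provider class, not redesigning the app.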

Questions to ask before adopting a tool:

  • Can I export the code and run it elsewhere?
  • Does this workflow depend on a vendor-specific API?
  • Will my team still understand the code if we switch tools later?
  • Can I replace the model without redesigning the whole app?

9. Privacy, prompts, and prompt injection

Problem: AI tools can leak sensitive data if you paste too much context, and AI-powered apps can be tricked by malicious instructions hidden in user input.

This is one of the big OWASP GenAI concerns: untrusted input can change behaviour, reveal private information, or make the system do things it should not do.

Common mistakes:

  • Pasting secrets, customer data, or internal source code into prompts
  • Treating user-provided text as safe just because it is “only a prompt”
  • Using model output directly without validation or filtering
  • Allowing the assistant to call tools with overly broad permissions

Mitigations:

  • Minimize what you share with the model
  • Redact secrets and personal data before prompting
  • Validate and sanitize every model output that reaches users or downstream systems
  • Restrict tool permissions and dangerous actions
  • Assume prompt injection is possible whenever user-controlled content is involved
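The "redact before prompting" step can be partially automated. A sketch with regexes for emails and one common key prefix; the `sk-` pattern is an assumption for illustration, and no regex list is a complete secret scanner.

```python
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"sk-[A-Za-z0-9_-]{8,}"), "<API_KEY>"),
]

def redact(text):
    """Best-effort scrub before text leaves your machine."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

assert redact("mail alice@example.com") == "mail <EMAIL>"
assert redact("key sk-proj-abc12345") == "key <API_KEY>"
```

Treat this as a safety net behind the real mitigation: not pasting sensitive data in the first place.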

10. The maintenance problem

Problem: AI can generate code that works today but becomes expensive to maintain tomorrow.

This usually happens when the code is too clever, too broad, or too tightly coupled to a specific prompt. The result is a system that is hard to refactor because nobody understands why it was written that way.

Warning signs:

  • Huge generated files with no clear boundaries
  • Duplicated logic because the model was asked to “just add one more thing”
  • Code that ignores the project’s naming or layering conventions
  • A feature that works, but only if you never need to change it

Better habits:

  • Ask for small, modular changes
  • Refactor while the context is still fresh
  • Keep functions short and responsibilities clear
  • Prefer boring code over magical code

A practical AI-coding checklist

Before you accept AI-generated code, ask:

  • Does this match the project’s architecture and conventions?
  • Can I explain what the code does and why it is correct?
  • Did I verify the API, library, or version against the docs?
  • Are the security, privacy, and compliance implications acceptable?
  • Do I have tests for the important behaviour and edge cases?
  • Did I keep the prompt and context small enough to stay focused?
  • Would I still merge this if the AI suggestion had no author name on it?
