Problems

AI-assisted coding is incredibly useful, but it comes with a predictable set of failure modes. The biggest risk is not that the model is useless — it’s that it is often plausible when it is wrong.

This page collects the most common mistakes teams make when using AI for software development and pairs them with practical ways to reduce the damage.

1. The trust problem

Problem: AI sounds confident even when it is wrong. It usually does not show uncertainty, cite sources, or tell you when it is guessing.

That makes it very easy to accept a bad answer that looks polished. In practice, this shows up as hallucinated APIs, outdated library suggestions, incorrect assumptions about your framework, or code that almost works.

Example:

# AI generated with 100% confidence:
def connect_database():
    # Looks authoritative, but the library choice may be outdated
    import mysql.connector  # ⚠️ verify this is current for your stack!
    conn = mysql.connector.connect(...)  # connection details elided
    return conn

Warning

Never trust the AI blindly. Always verify the code.

Always check:

  • New libraries and frameworks, especially if your model may be out of date
  • Security-critical implementations such as encryption, authentication, and authorization
  • Performance-critical code paths such as loops, caching, memory usage, and database access
  • Edge cases, error handling, and return values

Best Practices:

  • Treat AI output as a draft, not as truth
  • Verify every significant line before merging
  • Write or update tests for the behaviour you expect
  • Review the change with a human who understands the system
  • Prefer small, reversible commits
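The "draft, not truth" rule can be made concrete with a quick characterization test. A sketch: `normalize_username` is a hypothetical AI-generated helper, and the asserts pin down the behaviour we actually expect before merging.

```python
# Hypothetical AI-generated helper: fine on the happy path.
def normalize_username(name):
    return name.strip().lower()

# Pin the behaviour we expect BEFORE merging, including the
# edge cases models tend to forget.
assert normalize_username("  Alice ") == "alice"
assert normalize_username("BOB") == "bob"
assert normalize_username("") == ""      # empty input
assert normalize_username("É") == "é"    # non-ASCII survives
```

If one of these asserts fails, you learned something before production did.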

What this usually looks like in real life:

  • A code sample compiles, but calls a deprecated function or omits a required import
  • A solution works on the happy path, but fails on null, empty, or unexpected input
  • The model confidently invents a feature that exists in another library, but not yours
  • The answer references a version or API that does not match your project

2. The context problem

Problem: AI often generates code that is not context-aware and does not match your project requirements, architecture, or conventions.

GitHub’s guidance on Copilot usage emphasizes context management: keep the conversation focused, provide only what matters, and give explicit instructions when the project has conventions the model cannot infer.

Example:

# Your project uses Repository Pattern
class UserRepository:
    def __init__(self, db_session):
        self.db = db_session

# AI suddenly suggests Active Record (inconsistent!)
class Product:
    def save(self):  # Wrong Pattern!
        db.session.add(self)
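A consistent version keeps persistence in a repository, matching the project's own UserRepository. This is a sketch under the same assumption as the example above (a `db_session`-style object with an `add` method), not a full implementation:

```python
# Consistent with the project's Repository Pattern:
# persistence lives in a repository, not on the model.
class Product:
    def __init__(self, name):
        self.name = name

class ProductRepository:
    def __init__(self, db_session):
        self.db = db_session

    def add(self, product):
        # Same style as UserRepository: the session is injected,
        # the model stays persistence-free.
        self.db.add(product)
```

A fake session is enough to test it, which is exactly what the Active Record version makes hard.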

Best Practices:

  • Use .cursorrules, AGENTS.md, and copilot-instructions.md to define project requirements and local conventions
  • Keep prompts small and specific; give the model only the files or details it needs
  • Start a fresh chat when the task changes and old context is no longer useful
  • Make the architecture obvious: patterns, boundaries, naming, and testing expectations
  • If AI keeps making the same mistake, fix the instruction layer instead of repeatedly correcting the output

Tip

Context management is a skill, not a convenience.

Modern tools work better when you give them a clear spec, a small slice of relevant code, and explicit acceptance criteria. In other words: less “read my mind,” more “read this file.”

Common context mistakes:

  • Mixing multiple tasks into one prompt
  • Sending unrelated files “just in case”
  • Forgetting to mention architectural constraints
  • Assuming the model knows your internal conventions
  • Reusing stale conversation context after requirements changed

3. The hallucination problem

Problem: The model may invent functions, options, error codes, commands, or library capabilities that sound real but are not.

This is especially dangerous when the answer includes code that looks idiomatic. A hallucinated function name can be harder to spot than a syntax error because it looks professional.

Typical symptoms:

  • The API call does not exist
  • The package name is valid, but the feature belongs to a different package
  • The docs example uses flags or options that were removed in a newer version
  • The generated code depends on behaviour that was never guaranteed

What to do:

  • Verify library names, method signatures, and package versions against official docs
  • Ask the model to explain each step in plain language
  • Prefer code that can be checked with tests, type-checkers, or linters
  • Be extra cautious with answers that mention obscure APIs or brand-new releases
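Some of these checks can be automated with the standard library alone. A sketch: confirm an attribute really exists before trusting generated code, and read the real signature instead of the one the model remembers.

```python
import importlib
import inspect

def api_exists(module_name, attr_name):
    """Return True only if module_name really exposes attr_name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)

# A real API passes; a hallucinated one fails:
assert api_exists("json", "dumps") is True
assert api_exists("json", "dump_fastest") is False

# Check the actual signature rather than trusting the model's memory:
print(inspect.signature(importlib.import_module("json").dumps))
```

This does not replace reading the docs, but it catches invented names in seconds.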

4. The “Fundamentals Trap”

Problem: Developers use AI as a hiding place instead of a tool. They generate code they do not understand.

This is one of the fastest ways to become dependent on a tool while losing the ability to debug, refactor, or judge the quality of the output.

Warning Signs:

  • ❌ “The AI did this, I don’t know why it works”
  • ❌ Copy-paste without reading
  • ❌ No idea what a decorator does, but AI uses them everywhere
  • ❌ Debugging by “asking AI what’s wrong” instead of trying to debug yourself
  • ❌ Accepting code you could not explain in a code review
  • ❌ Not being able to tell whether a bug is in the prompt, the model output, or your app

Caution

Danger for Junior Developers:

AI can accelerate your learning curve OR destroy it.

“If you feel like a fraud because you genuinely don’t understand the code you’re submitting, that’s not imposter syndrome - that’s a sign you need to slow down and learn the fundamentals.”

Source: Mimo Blog

Best Practice for Juniors:

  • Fundamentals first: Learn Python/JavaScript basics without AI
  • Then AI as Tutor: Let AI explain concepts to you
  • Then AI as Co-Pilot: Use AI for familiar patterns
  • Never as Autopilot: Never blindly accept code

A healthy workflow:

  • Ask AI to explain the code it generated
  • Re-implement small parts yourself
  • Step through bugs with a debugger
  • Read the docs before asking for the next layer of help
  • Practice writing tests for code you do not fully trust yet
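As a concrete example of the "explain it before you accept it" habit: the decorators from the warning-signs list above are just functions that wrap other functions. A minimal sketch:

```python
import functools

def log_calls(func):
    """A decorator: takes a function, returns a wrapped version."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1              # count invocations
        return func(*args, **kwargs)    # delegate to the original
    wrapper.calls = 0
    return wrapper

@log_calls
def add(a, b):
    return a + b

add(1, 2)
add(3, 4)
assert add.calls == 2     # the wrapper ran twice
assert add(1, 1) == 2     # the behaviour is unchanged
```

If you can rebuild something like this yourself, you can also judge whether the AI is using decorators sensibly.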

5. Security & Compliance Risks

Problem: AI often generates code with known security issues or compliance problems.

The OWASP GenAI Security Project highlights risks such as prompt injection, sensitive information disclosure, supply-chain issues, improper output handling, and unbounded consumption. In plain English: AI can produce unsafe code, and it can also help unsafe input slip into your application.

Common Issues:

# ❌ SQL Injection
def get_user(username):
    query = f"SELECT * FROM users WHERE name = '{username}'"

# ❌ Hardcoded Secrets
API_KEY = "sk-proj-abc123..."

# ❌ GDPR problematic
def log_user_action(user_id, email, action):
    logger.info(f"User {email} performed {action}")  # PII in logs!
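Safer versions of the three snippets above, as a sketch: sqlite3 stands in for the database to show a parameterized query, the secret comes from the environment, and the logging call mirrors the example but without PII.

```python
import os
import sqlite3
import logging

logger = logging.getLogger(__name__)

# ✅ Parameterized query instead of string formatting
def get_user(conn, username):
    return conn.execute(
        "SELECT * FROM users WHERE name = ?", (username,)
    ).fetchone()

# ✅ Secret from the environment, never hardcoded
API_KEY = os.environ.get("API_KEY", "")

# ✅ Log IDs, not personal data
def log_user_action(user_id, action):
    logger.info("User %s performed %s", user_id, action)
```

The `?` placeholder means a payload like `' OR '1'='1` is treated as data, not as SQL.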

Warning

Security is NOT delegable!

AI may know the right words, but it still makes mistakes. Be especially careful with:

  • Authentication/Authorization
  • Input Validation
  • Data Encryption
  • Secrets handling and token storage
  • GDPR/DSGVO compliance and logging
  • Output encoding and sanitization

Common security mistakes people make with AI code:

  • Copying a sample that uses string concatenation in SQL, shell commands, or HTML
  • Pasting API keys or private data into a prompt
  • Trusting generated code to validate untrusted user input
  • Logging emails, tokens, session IDs, or other personal data
  • Adding dependencies without checking maintenance, licensing, or provenance
  • Assuming the generated code is safe just because it “looks secure”

Mandatory Checklist:

  • SAST (Static Application Security Testing) tools in CI/CD where appropriate
  • Manual security review for critical flows
  • Penetration testing for exposed systems
  • GDPR impact assessment when personal data is involved
  • Input validation and output encoding everywhere data crosses trust boundaries
  • Secret scanning and dependency review before merging

Tip

If the code touches identity, payments, storage, or personal data, treat AI output as untrusted until proven otherwise.

6. Missing tests and false confidence

Problem: AI can make code look finished before it has actually been exercised.

Code without tests is easy to admire and hard to trust. The model may confidently generate the happy path and forget the cases that break production: time zones, empty arrays, retries, race conditions, and malformed inputs.

Common mistakes:

  • Accepting generated code without running the test suite
  • Writing tests that mirror the implementation instead of the requirement
  • Forgetting boundary cases such as zero, one, empty, null, and large values
  • Skipping regression tests after an AI-assisted bug fix

Better workflow:

  • Write the test first when the behaviour matters
  • Let the model implement against the test
  • Run the test suite, linters, and type checks after each meaningful change
  • Add at least one test for every bug the AI introduced
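A minimal sketch of that loop, using plain asserts and a hypothetical `chunk` helper: write the boundary cases first, then let the model implement against them.

```python
def chunk(items, size):
    """Split items into lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

# Boundary cases first: empty, single element, exact and uneven splits.
assert chunk([], 3) == []
assert chunk([1], 3) == [[1]]
assert chunk([1, 2, 3, 4], 2) == [[1, 2], [3, 4]]
assert chunk([1, 2, 3], 2) == [[1, 2], [3]]
```

The asserts describe the requirement; the implementation is disposable.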

Tip

GitHub’s Copilot guidance on context windows and TDD is a useful reminder: keep the task narrow, ask clarifying questions when requirements are vague, and use tests to catch edge cases before users do.

7. Cost explosions

Problem: API-based AI tools can get expensive, especially with heavy usage.

Cost Examples:

  • V0: $20 per simple app with frequent regeneration
  • Claude API: $15 per million tokens (depending on model)
  • GitHub Copilot: $10/month (Business $19/user)
  • Cursor: $20/month

Tip

Cost Management:

  • Use free tiers for experiments
  • Enable caching (Cursor, Claude)
  • Minimize context (only relevant files)
  • Batch operations instead of single requests
  • Set up monitoring

Common cost mistakes:

  • Regenerating the same answer repeatedly instead of improving the prompt
  • Sending entire repositories when a few files would do
  • Forgetting that long chats are expensive because they carry old context
  • Using the most expensive model for every task, even simple edits
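The arithmetic behind context bloat is worth internalizing. A sketch using the illustrative $15-per-million-token rate above (real prices differ per model and between input and output tokens):

```python
def estimate_cost(tokens, usd_per_million=15.0):
    """Rough API cost: token count times price per token."""
    return tokens * usd_per_million / 1_000_000

# A long chat that resends 50k tokens of stale context on every turn:
per_turn = estimate_cost(50_000)
assert round(per_turn, 2) == 0.75
assert round(100 * per_turn, 2) == 75.00   # 100 turns of carried context
```

Starting a fresh chat with a tight prompt is often the cheapest optimization available.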

8. Vendor lock-in

Problem: Some tools tie you to specific backends or cloud providers.

This can happen at the model layer, the hosting layer, the agent layer, or even in the workflow layer. Once your process depends on a proprietary IDE, runtime, or workflow engine, switching becomes much harder.

Examples:

  • Cursor → proprietary IDE; rules and workflows tied to the editor
  • V0 → Vercel + Supabase
  • Bolt → browser-based, difficult to transfer to production
  • Amazon Q → AWS-optimized, suboptimal for other clouds

Solution:

  • Keep code portable
  • Use abstraction layers
  • Architect for multi-cloud readiness
  • Always consider an exit strategy
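The "abstraction layer" point can be as small as one interface between your app and the vendor. A sketch with a hypothetical provider class standing in for a real SDK:

```python
from abc import ABC, abstractmethod

class CompletionProvider(ABC):
    """The only surface the rest of the app may touch."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class FakeProvider(CompletionProvider):
    """Stand-in for a vendor SDK; swap it without touching callers."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"

def summarize(provider: CompletionProvider, text: str) -> str:
    # Application code depends on the interface, never on a vendor.
    return provider.complete(f"Summarize: {text}")

assert summarize(FakeProvider(), "hello") == "echo: Summarize: hello"
```

Replacing the model then means writing one new provider class, not redesigning the app.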

Questions to ask before adopting a tool:

  • Can I export the code and run it elsewhere?
  • Does this workflow depend on a vendor-specific API?
  • Will my team still understand the code if we switch tools later?
  • Can I replace the model without redesigning the whole app?

9. Privacy, prompts, and prompt injection

Problem: AI tools can leak sensitive data if you paste too much context, and AI-powered apps can be tricked by malicious instructions hidden in user input.

This is one of the big OWASP GenAI concerns: untrusted input can change behaviour, reveal private information, or make the system do things it should not do.

Common mistakes:

  • Pasting secrets, customer data, or internal source code into prompts
  • Treating user-provided text as safe just because it is “only a prompt”
  • Using model output directly without validation or filtering
  • Allowing the assistant to call tools with overly broad permissions

Mitigations:

  • Minimize what you share with the model
  • Redact secrets and personal data before prompting
  • Validate and sanitize every model output that reaches users or downstream systems
  • Restrict tool permissions and dangerous actions
  • Assume prompt injection is possible whenever user-controlled content is involved
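The "redact before prompting" step can be partially automated. A sketch with regexes for emails and one common key prefix; the `sk-` pattern is an assumption for illustration, and no regex list is a complete secret scanner.

```python
import re

REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"sk-[A-Za-z0-9_-]{8,}"), "<API_KEY>"),
]

def redact(text):
    """Best-effort scrub before text leaves your machine."""
    for pattern, placeholder in REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text

assert redact("mail alice@example.com") == "mail <EMAIL>"
assert redact("key sk-proj-abc12345") == "key <API_KEY>"
```

Treat this as a safety net behind the real mitigation: not pasting sensitive data in the first place.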

10. The maintenance problem

Problem: AI can generate code that works today but becomes expensive to maintain tomorrow.

This usually happens when the code is too clever, too broad, or too tightly coupled to a specific prompt. The result is a system that is hard to refactor because nobody understands why it was written that way.

Warning signs:

  • Huge generated files with no clear boundaries
  • Duplicated logic because the model was asked to “just add one more thing”
  • Code that ignores the project’s naming or layering conventions
  • A feature that works, but only if you never need to change it

Better habits:

  • Ask for small, modular changes
  • Refactor while the context is still fresh
  • Keep functions short and responsibilities clear
  • Prefer boring code over magical code

A practical AI-coding checklist

Before you accept AI-generated code, ask:

  • Does this match the project’s architecture and conventions?
  • Can I explain what the code does and why it is correct?
  • Did I verify the API, library, or version against the docs?
  • Are the security, privacy, and compliance implications acceptable?
  • Do I have tests for the important behaviour and edge cases?
  • Did I keep the prompt and context small enough to stay focused?
  • Would I still merge this if the AI suggestion had no author name on it?
