When you let an AI write your code, you get speed - but you also get hidden risks. Vibe coding, where developers give high-level instructions to AI agents instead of writing every line themselves, is speeding up development. But it’s also flooding apps with security flaws that traditional tools miss. You might think, "If the app works, it’s fine." But in vibe-coded projects, functionality and security are rarely the same thing.
Why Vibe Coding Breaks Traditional Security Rules
Large language models (LLMs) were trained on millions of lines of public code - including buggy, outdated, and insecure examples. When you ask an AI to "build a login system," it doesn’t understand risk. It understands patterns. And the most common pattern for authentication? Hardcoded tokens, weak password checks, or no validation at all. That’s why 45% of AI-generated code fails basic security tests, according to Databricks’ 2024 analysis.

Unlike human developers who learn from past mistakes, LLMs don’t have context. They don’t know that exposing an API key in a config file is dangerous. They just copy what they’ve seen before. That’s why a single insecure pattern can spread across dozens of generated files - and why 62% of vulnerabilities found in vibe-coded apps involve exposed secrets or personally identifiable information (PII), as Escape.tech reported in mid-2024.
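To make the pattern concrete, here is a minimal sketch of the hardcoded-secret shape that generated code often takes, next to the environment-variable version a reviewer should expect instead. The key, endpoint, and variable names are hypothetical, shown only to illustrate the anti-pattern:

```python
import os

# The pattern an AI assistant often reproduces: a credential baked into source,
# committed to version control, and copied into every generated file that needs it.
API_KEY = "sk-live-1234567890abcdef"  # hypothetical key, for illustration only

def auth_header_unsafe() -> dict:
    return {"Authorization": f"Bearer {API_KEY}"}

# The safer shape: read the secret from the environment (or a secrets manager)
# and fail loudly if it is missing, so the key never lives in the repository.
def auth_header() -> dict:
    api_key = os.environ.get("PAYMENT_API_KEY")
    if not api_key:
        raise RuntimeError("PAYMENT_API_KEY is not set; refusing to run without a credential")
    return {"Authorization": f"Bearer {api_key}"}
```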
Three Dimensions of Triaging: Severity, Exploitability, Impact
Triaging isn’t just about listing bugs. It’s about deciding what to fix first. In vibe coding, you need to evaluate every vulnerability across three lenses:
- Severity: How bad is the flaw?
- Exploitability: How easy is it for an attacker to use it?
- Impact: What happens if it’s exploited?
Traditional CVSS scores don’t cut it here. A SQL injection in a human-written app might take weeks to find. In a vibe-coded app? It’s often right there in the first draft. That’s why Vidoc Security’s 2024 taxonomy shows that hardcoded secrets are exploitable in 100% of cases - with zero effort. Broken authorization? 87% exploitable. Insecure deserialization? Still dangerous, but requires more skill.
And the impact? CVSS scores tell you: hardcoded secrets = 9.8 (critical), broken auth = 8.2 (high), insecure deserialization = 6.5 (medium). But those numbers don’t capture the real problem: in vibe coding, one flaw can cascade. One leaked key can give attackers access to databases, cloud buckets, and third-party APIs. That’s why severity isn’t just about the flaw - it’s about how far it can reach.
The Exploitability Gap: Why AI Code Passes Tests But Fails Security
Here’s the trap: AI-generated code often passes functional tests. It logs users in. It returns the right data. It even handles edge cases. But it ignores security boundaries.

The SusVibes benchmark, released in December 2025, tested 200 security tasks across 108 open-source projects. The results were brutal: frontier LLMs failed over 80% of security tests - even though they passed more than half of functional ones. Why? Because LLMs optimize for correctness, not safety. They don’t ask, "Is this endpoint protected?" They ask, "Does this code match what I’ve seen before?"
Researchers once asked ChatGPT to generate a parser for the GGUF binary format. The output worked perfectly - and contained a buffer overflow that could crash the system. No warnings. No flags. Just clean, functional, deadly code.
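The vulnerability class is easy to spot once you know to look for it. Here is a hedged Python sketch - the format and field names are hypothetical, not the actual parser the researchers tested - of what "trusting a length field" looks like, and the bounds check that was missing:

```python
import struct

# Not the generated parser itself, just the flaw class it contained: trusting a
# length field read from the file. The layout here is hypothetical (a 4-byte magic,
# a 4-byte unsigned length, then a UTF-8 name of that length).
def read_name_unsafe(data: bytes) -> str:
    magic, name_len = struct.unpack_from("<4sI", data, 0)
    # No check that name_len fits inside the buffer. In the C-style parser this
    # drives reads past the end of the allocation; here it silently accepts
    # truncated or malformed input instead of rejecting it.
    return data[8:8 + name_len].decode("utf-8")

def read_name(data: bytes) -> str:
    magic, name_len = struct.unpack_from("<4sI", data, 0)
    if name_len > len(data) - 8:
        raise ValueError(f"declared length {name_len} exceeds remaining {len(data) - 8} bytes")
    return data[8:8 + name_len].decode("utf-8")
```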
This isn’t a glitch. It’s a design flaw in how LLMs learn. They don’t understand intent. They don’t know the difference between "this works" and "this is safe." That’s why manual review isn’t optional - it’s the last line of defense.
Real-World Impact: When Vibe Coding Exposes Client Data
Marketing agencies using vibe coding to build client websites are seeing this firsthand. Duda.co found that 45% of AI-generated sites failed OWASP Top 10 tests. Why? Because the AI didn’t know to validate user input, sanitize outputs, or enforce HTTPS. One agency lost access to 12 client databases after an AI-generated form allowed SQL injection. The code looked fine. The form worked. The server logged everything correctly. But it didn’t check what was being sent.

Another case involved a chatbot built with AI that stored user messages in plain text. The developer assumed the AI would handle privacy. It didn’t. The data was exposed through a misconfigured API endpoint - a flaw that would’ve been caught by a human in five minutes, but slipped past the AI because it had never been trained on privacy regulations.
These aren’t edge cases. They’re symptoms of a system that treats security as an afterthought.
Three-Level Triaging Framework for Vibe-Coded Projects
Aikido.dev’s 2024 security checklist breaks triaging into three clear levels. This isn’t theory - it’s what teams are using today to stop breaches before they happen.

Level 1: Automate the Basics
Set up CI/CD pipelines with automated scanning tools. Use SAST (Static Application Security Testing) like SonarQube - it catches 85% of code quality and security issues, according to Susan Taylor’s 2024 benchmark. Pair it with DAST tools like OWASP ZAP to test running apps. These tools don’t replace humans, but they catch the low-hanging fruit: missing headers, unencrypted connections, outdated libraries.

Don’t skip dependency scanning. Tools that monitor a Software Bill of Materials (SBOM) can detect drift with 99.2% accuracy. A single vulnerable library can turn your whole app into a target.
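None of these tools help if their findings never block a merge. Below is a minimal CI-gate sketch, assuming a hypothetical JSON findings report (adjust the field names to whatever your scanner actually emits), that fails the build when high-severity issues appear:

```python
#!/usr/bin/env python3
"""CI gate sketch: fail the pipeline when a scanner report contains serious findings.

The report layout is hypothetical (a JSON list of {"rule", "severity", "file"} objects);
adapt the field names to the real output of your SAST/DAST tool.
"""
import json
import sys

SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}
FAIL_AT = "high"  # block the merge on high or critical findings

def main(report_path: str) -> int:
    with open(report_path) as f:
        findings = json.load(f)
    threshold = SEVERITY_ORDER[FAIL_AT]
    blocking = [item for item in findings
                if SEVERITY_ORDER.get(item.get("severity", "low"), 0) >= threshold]
    for item in blocking:
        print(f"BLOCKING: {item['rule']} in {item['file']} ({item['severity']})")
    return 1 if blocking else 0  # non-zero exit fails the pipeline stage

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "scan-report.json"))
```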
Level 2: Make the AI Review Itself
This is the game-changer. After the AI generates code, feed it back into the model with a security prompt: "Review this code for hardcoded secrets, missing authentication, and exposed endpoints. List all risks."

Databricks tested this in 2024. Teams using this "self-reflective review" step reduced vulnerabilities by 57% in PurpleLlama’s cybersecurity benchmark. Why? Because the AI, when asked directly, can spot patterns it missed before - like a secret in a comment or an unvalidated input field.
It’s not perfect. The same study found that when asked to fix a vulnerability, the AI introduced a new one in 68% of cases. But when used as a second pair of eyes - not the only pair - it’s powerful.
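In practice this is a short wrapper around whatever model client you already have. A minimal sketch follows - the `ask_model` callable is a placeholder, not a real SDK function - and it deliberately asks for findings only, leaving the fix to a human for exactly the reason above:

```python
# Sketch of the "second pair of eyes" step: send generated code back with a
# security-focused prompt and surface the findings for a human reviewer.
# `ask_model` is a placeholder for whatever LLM client you already use.
SECURITY_REVIEW_PROMPT = (
    "Review this code for hardcoded secrets, missing authentication, exposed "
    "endpoints, and unvalidated input. List every risk and the line it appears on. "
    "Do not rewrite the code."
)

def self_review(generated_code: str, ask_model) -> str:
    # Findings only: as noted above, letting the model *fix* issues often
    # introduces new ones, so remediation stays with a human.
    return ask_model(SECURITY_REVIEW_PROMPT + "\n\n" + generated_code)
```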
Level 3: Enforce Organizational Guardrails
No tool will save you if your team doesn’t have rules. Mandatory secret scanning across repositories, Slack, and Notion is non-negotiable. GitGuardian’s 2024 report showed that 78% of enterprises now auto-revoke or rotate secrets within 24 hours of detection.

Also, require code reviews for any AI-generated code. Not just "looks good." But: "Did you check authentication? Is data encrypted at rest? Are logs sanitized?" Make security part of the workflow, not a checklist item.
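Secret scanning works best when it runs before a commit ever lands. Here is a bare-bones pre-commit hook sketch; the patterns are illustrative only and are no substitute for a dedicated scanner, but they show the shape of the guardrail:

```python
#!/usr/bin/env python3
"""Pre-commit hook sketch: scan staged files for obvious credential patterns.

The regexes are illustrative, not exhaustive - a dedicated scanner (GitGuardian,
gitleaks, etc.) catches far more. Treat this as the floor, not the ceiling.
"""
import re
import subprocess
import sys

PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"),
    re.compile(r"(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{12,}", re.IGNORECASE),
]

def staged_files() -> list[str]:
    out = subprocess.run(["git", "diff", "--cached", "--name-only"],
                         capture_output=True, text=True, check=True)
    return [line for line in out.stdout.splitlines() if line]

def main() -> int:
    hits = []
    for path in staged_files():
        try:
            with open(path, encoding="utf-8", errors="ignore") as f:
                text = f.read()
        except OSError:
            continue
        for pattern in PATTERNS:
            if pattern.search(text):
                hits.append((path, pattern.pattern))
    for path, pat in hits:
        print(f"Possible secret in {path} (matched {pat})")
    return 1 if hits else 0  # non-zero exit blocks the commit

if __name__ == "__main__":
    sys.exit(main())
```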
Modified DREAD: A New Way to Prioritize in Vibe Coding
Traditional DREAD (Damage, Reproducibility, Exploitability, Affected Users, Discoverability) doesn’t fit vibe coding. Why? Because exposure happens instantly. A flaw can be live in minutes - not weeks.

ReversingLabs’ 2024 update to DREAD shifts the weights:
- Exposure (40%): How quickly and widely can this flaw be reached?
- Damage (30%): What’s the cost if exploited?
- Reproducibility (15%): Can it be triggered reliably?
- Exploitability (10%): How hard is it to exploit?
- Affected Users (5%): How many users are at risk?
This model prioritizes flaws that are easy to find and exploit - exactly the kind AI leaves behind. Tools like Vidoc Security Lab now integrate this scoring into dev pipelines, flagging issues that match the 77 CWE patterns from the SusVibes benchmark.
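Turned into a scoring function, the modified weights look like this. The 0-10 factor scores in the example are illustrative judgments, not published figures:

```python
# Sketch of the modified DREAD weighting described above. Each factor is scored
# 0-10 by the reviewer; the weights follow the ReversingLabs 2024 split cited in the text.
WEIGHTS = {
    "exposure": 0.40,
    "damage": 0.30,
    "reproducibility": 0.15,
    "exploitability": 0.10,
    "affected_users": 0.05,
}

def dread_score(scores: dict[str, float]) -> float:
    """Weighted 0-10 score; higher means fix sooner."""
    return sum(WEIGHTS[factor] * scores[factor] for factor in WEIGHTS)

# Example: a hardcoded secret in a public repo - reachable instantly, trivially
# reproducible and exploitable, with a broad blast radius.
hardcoded_secret = {
    "exposure": 10, "damage": 9, "reproducibility": 10,
    "exploitability": 10, "affected_users": 8,
}
print(round(dread_score(hardcoded_secret), 1))  # 9.6 - goes to the top of the queue
```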
The Human Edge: Why AI Can’t Replace Security Experts
AI is a tool. Not a replacement. Google Cloud’s 2025 Security Command Center reduced false positives by 42% by learning how AI-generated code behaves. IBM’s research showed that combining automated scans with LLM-based reflection caught 91% of vulnerabilities that either method missed alone.

But none of this works without a human who understands context. Who knows that a "temporary" API key shouldn’t be in production. Who recognizes that a "simple" form field can be weaponized. Who asks, "What happens if this gets leaked?" before the code is deployed.
That’s the real triage: not sorting tickets, but asking the right questions.
What Comes Next?
Vibe coding isn’t going away. It’s getting faster, smarter, and more widespread. But security can’t be an afterthought. The tools are here. The frameworks exist. The data is clear.

The question isn’t whether you can use AI to write code. It’s whether you’re ready to protect what it builds.
What is vibe coding, and why is it risky for security?
Vibe coding is when developers use AI agents to generate code based on high-level instructions, rather than writing it manually. It’s fast, but risky because AI models are trained on public code that often includes insecure patterns. They don’t understand security context - only functionality. As a result, AI-generated code frequently contains vulnerabilities like hardcoded secrets, broken authentication, and SQL injection - even when it works perfectly.
How do you triage vulnerabilities in AI-generated code differently than in human-written code?
Traditional triaging focuses on CVSS scores and exploit complexity. In vibe coding, you must prioritize exposure and propagation. A single flaw can appear in dozens of files at once. Hardcoded secrets are exploitable in 100% of cases with zero effort. Use a modified DREAD model that weights exposure higher (40%) and combine automated tools with manual review. Never trust AI-generated code without validation.
What are the most common vulnerabilities in vibe-coded projects?
The top three are hardcoded secrets (9.8 CVSS), broken authorization (8.2 CVSS), and insecure deserialization (6.5 CVSS). Other frequent issues include missing HTTPS enforcement, unvalidated input fields, and vulnerable third-party dependencies. The SusVibes benchmark identified over 77 CWE types commonly found in AI-generated code - far more than older benchmarks.
Can automated tools catch all AI-generated security flaws?
No. Tools like Snyk and SonarQube catch 85-92% of known vulnerabilities, but AI often creates novel or context-specific flaws that don’t match existing patterns. For example, an AI might generate a secure-looking API that leaks data through an undocumented endpoint. Human review is essential to catch these edge cases. The most effective approach combines automated scanning with AI self-review and manual validation.
Should I stop using AI to write code because of security risks?
No. AI is a powerful tool that can boost productivity. But you must treat AI-generated code like a junior developer’s draft - review everything. Implement a three-level triaging framework: automate scanning, require AI self-review, and enforce organizational policies like secret rotation and mandatory code reviews. With the right process, you can use AI safely and securely.
How can I test if my vibe-coded app is secure?
Run SAST and DAST scans in your CI/CD pipeline. Use tools like SonarQube, OWASP ZAP, and GitGuardian for secret scanning. Then, manually test for broken access controls by trying to access endpoints without authentication. Use the SusVibes benchmark as a reference for common failure points. Finally, feed your code back into the AI with prompts like, "Find all security flaws in this code," and compare its findings with your own.
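For the manual access-control check, even a tiny script beats clicking around. A minimal sketch (the paths and port are placeholders for your own app) that can run locally or in CI:

```python
# Smoke test sketch for broken access control: hit endpoints that should require a
# session or token, without one, and assert the server refuses. The endpoint paths
# are placeholders - substitute your app's real protected routes.
import requests

BASE_URL = "http://localhost:8000"  # app under test
PROTECTED = ["/api/admin/users", "/api/orders", "/api/export"]

def test_unauthenticated_access_is_rejected():
    for path in PROTECTED:
        resp = requests.get(f"{BASE_URL}{path}", timeout=5)  # deliberately no auth header
        assert resp.status_code in (401, 403), (
            f"{path} returned {resp.status_code} without credentials - possible broken access control"
        )
```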