When you let an AI write your code, you get speed - but you also get hidden risks. Vibe coding, where developers give high-level instructions to AI agents instead of writing every line themselves, is speeding up development. But it’s also flooding apps with security flaws that traditional tools miss. You might think, "If the app works, it’s fine." But in vibe-coded projects, functionality and security are rarely the same thing.
Why Vibe Coding Breaks Traditional Security Rules
Large language models (LLMs) were trained on millions of lines of public code - including buggy, outdated, and insecure examples. When you ask an AI to "build a login system," it doesn’t understand risk. It understands patterns. And the most common pattern for authentication? Hardcoded tokens, weak password checks, or no validation at all. That’s why 45% of AI-generated code fails basic security tests, according to Databricks’ 2024 analysis.

Unlike human developers who learn from past mistakes, LLMs don’t have context. They don’t know that exposing an API key in a config file is dangerous. They just copy what they’ve seen before. That’s why a single insecure pattern can spread across dozens of generated files - and why 62% of vulnerabilities found in vibe-coded apps involve exposed secrets or personally identifiable information (PII), as Escape.tech reported in mid-2024.
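To make the pattern concrete, here is a minimal sketch of the hardcoded-secret shape that generated code often takes, next to the environment-variable version a reviewer should expect instead. The key, endpoint, and variable names are hypothetical, shown only to illustrate the anti-pattern:

```python
import os

# The pattern an AI assistant often reproduces: a credential baked into source,
# committed to version control, and copied into every generated file that needs it.
API_KEY = "sk-live-1234567890abcdef"  # hypothetical key, for illustration only

def auth_header_unsafe() -> dict:
    return {"Authorization": f"Bearer {API_KEY}"}

# The safer shape: read the secret from the environment (or a secrets manager)
# and fail loudly if it is missing, so the key never lives in the repository.
def auth_header() -> dict:
    api_key = os.environ.get("PAYMENT_API_KEY")
    if not api_key:
        raise RuntimeError("PAYMENT_API_KEY is not set; refusing to run without a credential")
    return {"Authorization": f"Bearer {api_key}"}
```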
Three Dimensions of Triaging: Severity, Exploitability, Impact
Triaging isn’t just about listing bugs. It’s about deciding what to fix first. In vibe coding, you need to evaluate every vulnerability across three lenses:
- Severity: How bad is the flaw?
- Exploitability: How easy is it for an attacker to use it?
- Impact: What happens if it’s exploited?
Traditional CVSS scores don’t cut it here. A SQL injection in a human-written app might take weeks to find. In a vibe-coded app? It’s often right there in the first draft. That’s why Vidoc Security’s 2024 taxonomy shows that hardcoded secrets are exploitable in 100% of cases - with zero effort. Broken authorization? 87% exploitable. Insecure deserialization? Still dangerous, but requires more skill.
And the impact? CVSS scores tell you: hardcoded secrets = 9.8 (critical), broken auth = 8.2 (high), insecure deserialization = 6.5 (medium). But those numbers don’t capture the real problem: in vibe coding, one flaw can cascade. One leaked key can give attackers access to databases, cloud buckets, and third-party APIs. That’s why severity isn’t just about the flaw - it’s about how far it can reach.
The Exploitability Gap: Why AI Code Passes Tests But Fails Security
Here’s the trap: AI-generated code often passes functional tests. It logs users in. It returns the right data. It even handles edge cases. But it ignores security boundaries.

The SusVibes benchmark, released in December 2025, tested 200 security tasks across 108 open-source projects. The results were brutal: frontier LLMs failed over 80% of security tests - even though they passed more than half of functional ones. Why? Because LLMs optimize for correctness, not safety. They don’t ask, "Is this endpoint protected?" They ask, "Does this code match what I’ve seen before?"
Researchers once asked ChatGPT to generate a parser for the GGUF binary format. The output worked perfectly - and contained a buffer overflow that could crash the system. No warnings. No flags. Just clean, functional, deadly code.
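The vulnerability class is easy to spot once you know to look for it. Here is a hedged Python sketch - the format and field names are hypothetical, not the actual parser the researchers tested - of what "trusting a length field" looks like, and the bounds check that was missing:

```python
import struct

# Not the generated parser itself, just the flaw class it contained: trusting a
# length field read from the file. The layout here is hypothetical (a 4-byte magic,
# a 4-byte unsigned length, then a UTF-8 name of that length).
def read_name_unsafe(data: bytes) -> str:
    magic, name_len = struct.unpack_from("<4sI", data, 0)
    # No check that name_len fits inside the buffer. In the C-style parser this
    # drives reads past the end of the allocation; here it silently accepts
    # truncated or malformed input instead of rejecting it.
    return data[8:8 + name_len].decode("utf-8")

def read_name(data: bytes) -> str:
    magic, name_len = struct.unpack_from("<4sI", data, 0)
    if name_len > len(data) - 8:
        raise ValueError(f"declared length {name_len} exceeds remaining {len(data) - 8} bytes")
    return data[8:8 + name_len].decode("utf-8")
```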
This isn’t a glitch. It’s a design flaw in how LLMs learn. They don’t understand intent. They don’t know the difference between "this works" and "this is safe." That’s why manual review isn’t optional - it’s the last line of defense.
Real-World Impact: When Vibe Coding Exposes Client Data
Marketing agencies using vibe coding to build client websites are seeing this firsthand. Duda.co found that 45% of AI-generated sites failed OWASP Top 10 tests. Why? Because the AI didn’t know to validate user input, sanitize outputs, or enforce HTTPS. One agency lost access to 12 client databases after an AI-generated form allowed SQL injection. The code looked fine. The form worked. The server logged everything correctly. But it didn’t check what was being sent.

Another case involved a chatbot built with AI that stored user messages in plain text. The developer assumed the AI would handle privacy. It didn’t. The data was exposed through a misconfigured API endpoint - a flaw that would’ve been caught by a human in five minutes, but slipped past the AI because it had never been trained on privacy regulations.
These aren’t edge cases. They’re symptoms of a system that treats security as an afterthought.
Three-Level Triaging Framework for Vibe-Coded Projects
Aikido.dev’s 2024 security checklist breaks triaging into three clear levels. This isn’t theory - it’s what teams are using today to stop breaches before they happen.

Level 1: Automate the Basics
Set up CI/CD pipelines with automated scanning tools. Use SAST (Static Application Security Testing) like SonarQube - it catches 85% of code quality and security issues, according to Susan Taylor’s 2024 benchmark. Pair it with DAST tools like OWASP ZAP to test running apps. These tools don’t replace humans, but they catch the low-hanging fruit: missing headers, unencrypted connections, outdated libraries.

Don’t skip dependency scanning. Tools that monitor a Software Bill of Materials (SBOM) can detect drift with 99.2% accuracy. A single vulnerable library can turn your whole app into a target.
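None of these tools help if their findings never block a merge. Below is a minimal CI-gate sketch, assuming a hypothetical JSON findings report (adjust the field names to whatever your scanner actually emits), that fails the build when high-severity issues appear:

```python
#!/usr/bin/env python3
"""CI gate sketch: fail the pipeline when a scanner report contains serious findings.

The report layout is hypothetical (a JSON list of {"rule", "severity", "file"} objects);
adapt the field names to the real output of your SAST/DAST tool.
"""
import json
import sys

SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}
FAIL_AT = "high"  # block the merge on high or critical findings

def main(report_path: str) -> int:
    with open(report_path) as f:
        findings = json.load(f)
    threshold = SEVERITY_ORDER[FAIL_AT]
    blocking = [item for item in findings
                if SEVERITY_ORDER.get(item.get("severity", "low"), 0) >= threshold]
    for item in blocking:
        print(f"BLOCKING: {item['rule']} in {item['file']} ({item['severity']})")
    return 1 if blocking else 0  # non-zero exit fails the pipeline stage

if __name__ == "__main__":
    sys.exit(main(sys.argv[1] if len(sys.argv) > 1 else "scan-report.json"))
```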
Level 2: Make the AI Review Itself
This is the game-changer. After the AI generates code, feed it back into the model with a security prompt: "Review this code for hardcoded secrets, missing authentication, and exposed endpoints. List all risks."

Databricks tested this in 2024. Teams using this "self-reflective review" step reduced vulnerabilities by 57% in PurpleLlama’s cybersecurity benchmark. Why? Because the AI, when asked directly, can spot patterns it missed before - like a secret in a comment or an unvalidated input field.
It’s not perfect. The same study found that when asked to fix a vulnerability, the AI introduced a new one in 68% of cases. But when used as a second pair of eyes - not the only pair - it’s powerful.
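In practice this is a short wrapper around whatever model client you already have. A minimal sketch follows - the `ask_model` callable is a placeholder, not a real SDK function - and it deliberately asks for findings only, leaving the fix to a human for exactly the reason above:

```python
# Sketch of the "second pair of eyes" step: send generated code back with a
# security-focused prompt and surface the findings for a human reviewer.
# `ask_model` is a placeholder for whatever LLM client you already use.
SECURITY_REVIEW_PROMPT = (
    "Review this code for hardcoded secrets, missing authentication, exposed "
    "endpoints, and unvalidated input. List every risk and the line it appears on. "
    "Do not rewrite the code."
)

def self_review(generated_code: str, ask_model) -> str:
    # Findings only: as noted above, letting the model *fix* issues often
    # introduces new ones, so remediation stays with a human.
    return ask_model(SECURITY_REVIEW_PROMPT + "\n\n" + generated_code)
```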
Level 3: Enforce Organizational Guardrails
No tool will save you if your team doesn’t have rules. Mandatory secret scanning across repositories, Slack, and Notion is non-negotiable. GitGuardian’s 2024 report showed that 78% of enterprises now auto-revoke or rotate secrets within 24 hours of detection.

Also, require code reviews for any AI-generated code. Not just "looks good." But: "Did you check authentication? Is data encrypted at rest? Are logs sanitized?" Make security part of the workflow, not a checklist item.
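Secret scanning works best when it runs before a commit ever lands. Here is a bare-bones pre-commit hook sketch; the patterns are illustrative only and are no substitute for a dedicated scanner, but they show the shape of the guardrail:

```python
#!/usr/bin/env python3
"""Pre-commit hook sketch: scan staged files for obvious credential patterns.

The regexes are illustrative, not exhaustive - a dedicated scanner (GitGuardian,
gitleaks, etc.) catches far more. Treat this as the floor, not the ceiling.
"""
import re
import subprocess
import sys

PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # AWS access key ID shape
    re.compile(r"-----BEGIN (RSA|EC|OPENSSH) PRIVATE KEY-----"),
    re.compile(r"(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]{12,}", re.IGNORECASE),
]

def staged_files() -> list[str]:
    out = subprocess.run(["git", "diff", "--cached", "--name-only"],
                         capture_output=True, text=True, check=True)
    return [line for line in out.stdout.splitlines() if line]

def main() -> int:
    hits = []
    for path in staged_files():
        try:
            with open(path, encoding="utf-8", errors="ignore") as f:
                text = f.read()
        except OSError:
            continue
        for pattern in PATTERNS:
            if pattern.search(text):
                hits.append((path, pattern.pattern))
    for path, pat in hits:
        print(f"Possible secret in {path} (matched {pat})")
    return 1 if hits else 0  # non-zero exit blocks the commit

if __name__ == "__main__":
    sys.exit(main())
```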
Modified DREAD: A New Way to Prioritize in Vibe Coding
Traditional DREAD (Damage, Reproducibility, Exploitability, Affected Users, Discoverability) doesn’t fit vibe coding. Why? Because exposure happens instantly. A flaw can be live in minutes - not weeks.

ReversingLabs’ 2024 update to DREAD shifts the weights:
- Exposure (40%): How quickly and widely can this flaw be reached?
- Damage (30%): What’s the cost if exploited?
- Reproducibility (15%): Can it be triggered reliably?
- Exploitability (10%): How hard is it to exploit?
- Affected Users (5%): How many users are at risk?
This model prioritizes flaws that are easy to find and exploit - exactly the kind AI leaves behind. Tools like Vidoc Security Lab now integrate this scoring into dev pipelines, flagging issues that match the 77 CWE patterns from the SusVibes benchmark.
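Turned into a scoring function, the modified weights look like this. The 0-10 factor scores in the example are illustrative judgments, not published figures:

```python
# Sketch of the modified DREAD weighting described above. Each factor is scored
# 0-10 by the reviewer; the weights follow the ReversingLabs 2024 split cited in the text.
WEIGHTS = {
    "exposure": 0.40,
    "damage": 0.30,
    "reproducibility": 0.15,
    "exploitability": 0.10,
    "affected_users": 0.05,
}

def dread_score(scores: dict[str, float]) -> float:
    """Weighted 0-10 score; higher means fix sooner."""
    return sum(WEIGHTS[factor] * scores[factor] for factor in WEIGHTS)

# Example: a hardcoded secret in a public repo - reachable instantly, trivially
# reproducible and exploitable, with a broad blast radius.
hardcoded_secret = {
    "exposure": 10, "damage": 9, "reproducibility": 10,
    "exploitability": 10, "affected_users": 8,
}
print(round(dread_score(hardcoded_secret), 1))  # 9.6 - goes to the top of the queue
```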
The Human Edge: Why AI Can’t Replace Security Experts
AI is a tool. Not a replacement. Google Cloud’s 2025 Security Command Center reduced false positives by 42% by learning how AI-generated code behaves. IBM’s research showed that combining automated scans with LLM-based reflection caught 91% of vulnerabilities that either method missed alone.

But none of this works without a human who understands context. Who knows that a "temporary" API key shouldn’t be in production. Who recognizes that a "simple" form field can be weaponized. Who asks, "What happens if this gets leaked?" before the code is deployed.
That’s the real triage: not sorting tickets, but asking the right questions.
What Comes Next?
Vibe coding isn’t going away. It’s getting faster, smarter, and more widespread. But security can’t be an afterthought. The tools are here. The frameworks exist. The data is clear.

The question isn’t whether you can use AI to write code. It’s whether you’re ready to protect what it builds.
What is vibe coding, and why is it risky for security?
Vibe coding is when developers use AI agents to generate code based on high-level instructions, rather than writing it manually. It’s fast, but risky because AI models are trained on public code that often includes insecure patterns. They don’t understand security context - only functionality. As a result, AI-generated code frequently contains vulnerabilities like hardcoded secrets, broken authentication, and SQL injection - even when it works perfectly.
How do you triage vulnerabilities in AI-generated code differently than in human-written code?
Traditional triaging focuses on CVSS scores and exploit complexity. In vibe coding, you must prioritize exposure and propagation. A single flaw can appear in dozens of files at once. Hardcoded secrets are exploitable in 100% of cases with zero effort. Use a modified DREAD model that weights exposure higher (40%) and combine automated tools with manual review. Never trust AI-generated code without validation.
What are the most common vulnerabilities in vibe-coded projects?
The top three are hardcoded secrets (9.8 CVSS), broken authorization (8.2 CVSS), and insecure deserialization (6.5 CVSS). Other frequent issues include missing HTTPS enforcement, unvalidated input fields, and vulnerable third-party dependencies. The SusVibes benchmark identified over 77 CWE types commonly found in AI-generated code - far more than older benchmarks.
Can automated tools catch all AI-generated security flaws?
No. Tools like Snyk and SonarQube catch 85-92% of known vulnerabilities, but AI often creates novel or context-specific flaws that don’t match existing patterns. For example, an AI might generate a secure-looking API that leaks data through an undocumented endpoint. Human review is essential to catch these edge cases. The most effective approach combines automated scanning with AI self-review and manual validation.
Should I stop using AI to write code because of security risks?
No. AI is a powerful tool that can boost productivity. But you must treat AI-generated code like a junior developer’s draft - review everything. Implement a three-level triaging framework: automate scanning, require AI self-review, and enforce organizational policies like secret rotation and mandatory code reviews. With the right process, you can use AI safely and securely.
How can I test if my vibe-coded app is secure?
Run SAST and DAST scans in your CI/CD pipeline. Use tools like SonarQube, OWASP ZAP, and GitGuardian for secret scanning. Then, manually test for broken access controls by trying to access endpoints without authentication. Use the SusVibes benchmark as a reference for common failure points. Finally, feed your code back into the AI with prompts like, "Find all security flaws in this code," and compare its findings with your own.
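For the manual access-control check, even a tiny script beats clicking around. A minimal sketch (the paths and port are placeholders for your own app) that can run locally or in CI:

```python
# Smoke test sketch for broken access control: hit endpoints that should require a
# session or token, without one, and assert the server refuses. The endpoint paths
# are placeholders - substitute your app's real protected routes.
import requests

BASE_URL = "http://localhost:8000"  # app under test
PROTECTED = ["/api/admin/users", "/api/orders", "/api/export"]

def test_unauthenticated_access_is_rejected():
    for path in PROTECTED:
        resp = requests.get(f"{BASE_URL}{path}", timeout=5)  # deliberately no auth header
        assert resp.status_code in (401, 403), (
            f"{path} returned {resp.status_code} without credentials - possible broken access control"
        )
```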