AI Bug Reports Are Drowning Open Source — And the Fix Isn't 'Stop Using AI'

Wed, May 20, 2026 · 9 min read

On May 18, 2026, Linus Torvalds said the Linux kernel security mailing list had become “almost entirely unmanageable” because of duplicate AI-generated bug reports. Two months earlier, longtime stable maintainer Willy Tarreau had already shared the numbers: a list that received two to three reports per week in 2024 was getting five to ten reports per day by March 2026. In January, Daniel Stenberg shut down the curl bug bounty after the valid-report rate on HackerOne dropped from above 15% to below 5%, with twenty submissions in 21 days — seven of them in one 16-hour window — and zero real vulnerabilities among them.

The reaction online has been about whose side to take. Is Linus right? Is he too negative? Is AI ruining open source, or is it the future?

I think that whole framing misses the point. The interesting question isn’t whether the flood is real; everyone agrees it is. The interesting question is what we actually do about it, given that AI bug-hunting is here, is going to keep growing, and has at the same time produced real, mergeable patches in the same kernel.

This post is about that second question.

The two things that are simultaneously true

The first is that the flood is real. The numbers are not in dispute. curl stopped accepting bug-bounty reports through HackerOne on January 31. Matplotlib and Ghostty have started rejecting AI reports outright. Smaller maintainers — the ones who run a single library that the rest of the world depends on — are openly afraid they’re the next target.

The second is that AI bug hunting genuinely works. Greg Kroah-Hartman, who is roughly as senior in the kernel hierarchy as anyone alive, wrote in late March that “the world switched” — that AI reports have, in a matter of weeks, gone from junk to genuine. He’s running an AI-assisted fuzzer he calls Clanker T1000 on his own Framework Desktop, and it has surfaced two dozen real kernel bugs across SMB, USB, HID, F2FS, LoongArch, WiFi drivers, and the LED subsystem. His own AI experiments produced 60 patches with about two-thirds correct after cleanup. The networking and BPF subsystems have been quietly using LLM-assisted code review for a while. Google’s Sashiko tool, donated to the Linux Foundation, runs AI pre-review on patches before they reach human maintainers.

So this is not “AI bad.” It’s not even “AI mixed.” It’s two separate problems wearing the same costume:

The signal-to-noise problem. AI generates a lot of reports. Some are real, most are duplicates or garbage. Volunteer maintainers can’t triage at that volume.
The duplication problem. Multiple researchers running similar AI tools find the same bug on the same day and file separately, and because the security mailing list is closed, none of them knows the others already reported it. In Linus’s words, the flood has made “the security list almost entirely unmanageable, with enormous duplication due to different people finding the same things with the same tools.”

These are different problems and they want different solutions.

What the kernel team actually did

While the headlines were “Torvalds slams AI,” the kernel tree quietly merged updated security-bugs documentation that does something much more interesting than slam anything: it changes the workflow.

The new guidelines treat AI-detected bugs as public by default. The reasoning, in Linus’s own framing, is that “AI detected bugs are pretty much by definition not secret” — if multiple researchers using similar tools find the same flaw on the same day, that flaw is not a zero-day. Treating it like one wastes everybody’s time. So:

AI-discovered vulnerabilities go directly to the relevant maintainers in public, not into the closed security list
Reports must be concise and in plain text
They must include a verified reproducer — not just a finding, not just a model’s hand-wave, but something that actually triggers the bug

That last requirement is the lever. A reproducer is the proof of work that filters out the slop. An AI that can run its own finding to verify the bug is real is dramatically more useful than one that can’t.
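To make the gate concrete, here is a minimal Python sketch of an intake check built on the three requirements above. It is my own illustration, not the kernel's tooling: the `Report` fields, the `gate` function, the HTML heuristic, and the 200-line cutoff for "concise" are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

MAX_BODY_LINES = 200  # assumed cutoff for "concise"; not taken from the kernel docs

# Crude heuristic for "not plain text": a handful of common HTML tags.
HTML_TAG = re.compile(r"<\s*(html|body|div|span|p|br|table)\b", re.IGNORECASE)


@dataclass
class Report:
    body: str                   # the report text as submitted
    reproducer: Optional[str]   # code or script that triggers the bug, if any
    reproducer_verified: bool   # True only if someone ran it and saw the bug


def gate(report: Report) -> List[str]:
    """Return the reasons a report fails the gate; an empty list means it passes."""
    problems = []
    if HTML_TAG.search(report.body):
        problems.append("body is not plain text")
    if len(report.body.splitlines()) > MAX_BODY_LINES:
        problems.append("body is not concise")
    if not report.reproducer:
        problems.append("no reproducer attached")
    elif not report.reproducer_verified:
        problems.append("reproducer was not verified")
    return problems
```

An empty list means the report goes on to a human. Anything else can be bounced automatically with the reasons attached, which is the whole point of the gate: the cost of a bad report lands on the sender, not the maintainer.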

This is a good fix and it scales. The duplication problem gets solved by making everything visible: if Researcher A files in public and Researcher B is about to file the same thing, B can see A’s report and not file. The signal-to-noise problem gets partially solved by the reproducer requirement: garbage reports don’t make it past the gate.
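The "B can see A's report" step can also be automated once reports are public. The sketch below is one hypothetical way to do it in Python: it reduces each report to a fingerprint of subsystem, function, and bug class, then checks that fingerprint against an index of what has already been filed. The `fingerprint` scheme, the `PublicIndex` class, and the example names are all assumptions for illustration; a real list would need something more forgiving than an exact match.

```python
import hashlib
import re
from typing import Optional


def fingerprint(subsystem: str, function: str, bug_class: str) -> str:
    """Hash a normalized (subsystem, function, bug class) triple."""
    parts = [
        re.sub(r"\s+", " ", part.strip().lower())
        for part in (subsystem, function, bug_class)
    ]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:16]


class PublicIndex:
    """In-memory stand-in for a publicly searchable list of filed reports."""

    def __init__(self) -> None:
        self._first_report = {}  # fingerprint -> ID of the first report filed

    def file(self, report_id: str, subsystem: str, function: str,
             bug_class: str) -> Optional[str]:
        """Record a report.

        Returns the ID of an earlier report with the same fingerprint,
        or None if this is the first report of that bug.
        """
        key = fingerprint(subsystem, function, bug_class)
        if key in self._first_report:
            return self._first_report[key]
        self._first_report[key] = report_id
        return None


index = PublicIndex()
first = index.file("A-1", "USB", "example_probe", "use-after-free")
second = index.file("B-7", "usb", " example_probe ", "Use-After-Free")
```

Here `first` is `None` because nothing matching was on file, and `second` comes back as `"A-1"`, which tells Researcher B to add to the existing thread instead of opening a new one. None of this works on a closed list, because B's tooling has nothing to check against.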

What individual contributors can actually do

Linus said it directly: “If you actually want to add value, read the documentation, create a patch too, and add some real value on top of what the AI did.”

That’s the right ask. Translated:

A report is worth less than a report plus a reproducer. A reproducer is worth less than a tested patch. A tested patch is worth less than a patch plus a credible changelog and a clear explanation of root cause.

This is how human contributors have always added value, and AI doesn’t change the gradient — it just makes the bottom of the gradient infinitely cheaper to produce. The market for just-the-finding reports has collapsed because anyone with $20/month and a prompt can generate them. The market for patch + reproducer + changelog has not collapsed, because that work still requires actually understanding the code.

If you are using AI to find bugs in open source, don’t stop at reporting. Use the AI to find the bug, then use the AI to write the reproducer, then use the AI to draft a patch, then read the patch yourself, test it, write a changelog that explains why the bug exists and why your fix is correct, and submit the whole package. That is the workflow that turns AI from a maintainer-burden into a maintainer-multiplier. Greg KH is already doing this. He’s not spamming the list; he’s submitting fixes.

What projects can do

There’s no single answer here, but the projects that are surviving the flood are doing some combination of these:

Make the channel public. What the kernel did. If you can read what’s already there, you don’t file a duplicate. Closed channels work when you have humans filing one bug a week. They break when you have machines filing ten a day.
Demand a reproducer. A report without a reproducer is almost free to generate and almost worthless to act on. Require one and the share of slop should drop sharply.
Drop monetary bounties for low-tier reports. This is what curl did: HackerOne’s payouts on minor findings incentivized exactly the AI-flood behavior they were trying to discourage. Pay for patches instead, or pay for high-severity confirmed findings only. Money is a signal; aim it carefully.
Triage with AI. HackerOne is shipping AI-powered triage tools precisely because the only realistic way to filter a flood of AI-generated reports at scale is with more AI. This is the closest thing the field has to a sustainable answer. The same technology that creates the problem can do the first pass on solving it, and it can do that in seconds per report instead of hours.
Block bad-faith reporters. That means both individual reporters and the AI services behind them. Reputation matters: a contributor who has filed three slop reports gets moved to a slower queue or stops getting responses.
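The reputation rule in the last item is simple enough to sketch. This is a hypothetical Python version, assuming a project tracks slop per reporter and routes on the three-report threshold mentioned above; the `ReporterLedger` class and the queue names are made up for illustration.

```python
from collections import Counter

SLOP_LIMIT = 3  # the "three slop reports" threshold from the text


class ReporterLedger:
    """Counts reports a maintainer has marked as slop, per reporter."""

    def __init__(self) -> None:
        self._slop = Counter()

    def record_slop(self, reporter: str) -> None:
        """Call when a maintainer closes a report as slop."""
        self._slop[reporter] += 1

    def queue_for(self, reporter: str) -> str:
        """Pick the triage queue for this reporter's next submission."""
        # Counter returns 0 for reporters it has never seen.
        return "slow" if self._slop[reporter] >= SLOP_LIMIT else "normal"
```

The design choice worth noting is that the penalty is a slower queue, not a ban. A reporter who starts attaching reproducers can still get through; they just stop getting first claim on a volunteer's evening.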

None of these are silver bullets. All of them are things projects can do today, and several are things the kernel and curl have already done.

What the rest of us — the users — can do

The unfair thing in this whole story is that the people pointing AI tools at open source projects often aren’t the ones who depend on those projects. They’re security researchers building portfolios, companies running automated red-team tooling, or hobbyists chasing bounties. The volunteer maintainers absorbing the cost are the people keeping the entire dependency chain alive — for free.

If your company ships products that depend on the Linux kernel, on curl, on Matplotlib, on any of the libraries getting flooded — and mine does — you can do exactly two things:

Fund the maintainers. Directly. Sponsor them on GitHub, fund their foundation, write a check to their employer if they have one. The AI-flood problem is a maintainer-capacity problem, and maintainer capacity is currently being subsidized by a small number of volunteers running on hobby time.
Don’t dump internal AI scanner output on upstream projects. If your security team runs a vulnerability scanner and it finds something in an upstream open-source dependency, don’t forward the raw output upstream. Verify it internally first. Reproduce it internally first. Then submit, with a reproducer and ideally a patch. Treat the upstream maintainer’s time like you’d treat your most senior engineer’s time, because functionally, that is who they are.

The thing I keep coming back to

I run Linux because of the people who write Linux for free. The kernel security list exists because Willy Tarreau and a small number of people like him spend their weeknights triaging reports for the infrastructure the entire industry runs on. The curl project exists because Daniel Stenberg has, for twenty-eight years, refused to stop maintaining a library that ships in roughly every internet-connected device on Earth.

These people are the actual fragility in the stack. Not the C code, not the dependencies, not the supply chain — the humans. If we as an industry use AI to multiply the rate at which we ask them to do unpaid work, the stack breaks not at a CVE, but at the point where the human who used to triage CVEs decides they have better things to do with their weekends.

The good news is that the same AI that’s generating the flood can also generate patches. Greg KH has proved it. The kernel’s new public workflow is a real, structural fix to the duplication problem. The reproducer-required policy is a real, structural fix to the signal-to-noise problem. curl’s switch from HackerOne to GitHub private vulnerability reporting is a real, structural fix to the incentive problem.

Linus’s frustration is legitimate. So is Greg’s optimism. They’re not contradicting each other — they’re describing the same situation from two different vantage points. The interesting work is in the middle: building the workflows, the gates, the incentives, and the funding that let the maintainers say yes to the patches and no to the noise.

The flood isn’t going away. Our defenses against it can get better.