This website uses cookies

Read our Privacy policy and Terms of use for more information.

What you'll learn

  • LLMs are statistical predictors, not reasoning systems — and the king-on-a-bell-curve analogy is the cleanest way to explain why they always pull toward the median.

  • The recovery time you get from LLM productivity gains has to be reinvested in validation — otherwise you're just outsourcing skill atrophy at scale.

  • The talent stratification ahead has three camps — AI-as-a-tool, AI-as-a-substitute-for-thinking, and the unapologetically-human Banksy archetype — and the middle camp is the one most at risk of being replaced by what they outsourced to. 

Description

Keith Hoodlet is engineering director at Trail of Bits, the winner of the DoD's first AI bias bounty, a longtime hacker and educator, and one of the more substantive practitioner voices on what's actually happening inside frontier security research. This is his first Zero Signal appearance — recorded around the time of his bias bounty win — and the conversation works through what LLMs actually are, what the bias bounty taught him about how easy bias is to surface, and the operating implications for security teams, candidates, and CISOs trying to make sense of where the industry is heading.

The opening segment on the DoD bias bounty is the practical illustration of why bias is not a niche problem. Keith hunted down branch-specific policies that were more recent than the model's training data and used those gaps to surface bias 153 different ways across the qualification and final rounds. The bias was not subtle, the surface area was large, and the systematic exploration produced a top-three finish. The pattern lesson is that LLMs are statistical predictors, not reasoning systems, and any bias hiding in the training data or in the gap between training data and current reality will surface under directed pressure.

The middle of the conversation lays out the king-on-a-bell-curve analogy that should be the standard explainer for any non-technical executive trying to understand what an LLM actually is. Most people, asked to think of a king, picture a white medieval European with a beard and a crown. A smaller portion picture a Saudi sheik. Even fewer picture a Zulu king. Almost nobody thinks of a chess piece. LLMs operate the same way — they pull toward the median of their training data along whatever dimensional axis the prompt activates, and the unusual, novel, or genuinely creative outputs are the hardest things to coax out of them. The implication for everyone betting on AI as the engine of innovation is uncomfortable.

What we cover

  • "the DoD bias bounty" — what 153 findings taught Keith about how easy bias is to surface

  • "think of a king" — the bell-curve analogy that should be every executive's mental model for what an LLM is

  • "prediction, not thinking" — and why the novelty cost is the under-discussed risk of mass AI adoption

  • "the Replit production database deletion" — the canonical case for why agentic AI needs guardrails before scale

  • "45% of AI-generated code is vulnerable" — and the slop-squatting attack pattern it enables

  • "prompt injection isn't going away" — the structural reason and the design patterns (CaMeL, dual-agent) that help

  • "the three camps of AI workforce adaptation" — tool, substitute, Banksy

  • "validate the output, save your brain" — the operating practice that prevents skill atrophy

Thank you to our Sponsors: 

Hampton North is the premier US based cybersecurity search firm. Start building your security team with Hampton North

Sysdig is the leader in AI-powered real-time cloud defense; stop watching and start defending 

The conversation

The DoD bias bounty — 153 findings and a top-three finish

Keith's experience in the DoD's first AI bias bounty is the most useful concrete illustration of how bias actually surfaces in production-scale models. The bug-crowd-hosted competition gave hackers structured access to test models for bias, but the early triage process was rejecting findings as "not really bias." Keith's response was to go harder. He started looking up branch-specific policies — Marines, Navy, SEALs — that were more recent than the training data cutoff. The Marines had recently issued guidance, citing a Pennsylvania State or UPenn study, that drill sergeants couldn't be addressed as "sir" or "ma'am" in basic training because of misgendering risk; the new protocol was "Yes, Sergeant" or "Yes, Rank." The model didn't know that. Every prompt about meeting with a drill sergeant produced a gendered response. Keith demonstrated the same bias 40 different ways before triage said stop.

153 total findings, over 100 accepted, top three finish. The pattern lesson is the one to internalize. Bias in production AI systems is not subtle, not hidden, and not difficult to find. It surfaces under directed pressure from a researcher who knows what to look for. Any organization deploying AI in any decision-making capacity should expect bias to be present and should be doing systematic adversarial testing for it before the model touches a real decision.

The king-on-a-bell-curve analogy

The most exportable conceptual tool from the episode is Keith's explainer for what an LLM actually is. Ask anyone to think of a king and the majority will picture a white European medieval king with a beard, a crown, and a long cape. A smaller portion will picture a Saudi sheik in a turban and full robe. An even smaller portion will picture a Zulu king with a leopard or cheetah-skin headpiece. Almost nobody thinks of a chess piece on a board. The distribution of associations is shaped by the cultural and historical inputs each person has been exposed to, and it skews predictably toward the most common training-data examples.

LLMs operate the same way. The four-dimensional plane of token associations puts royalty in one direction, men in another, women in a third, and Asian historical context in a fourth. When you ask the model to "think of a king," it pulls toward the median of its training-data distribution along whatever axes the prompt activates. Add context — a chess game, a Roman empire reference — and the activation shifts predictably. Subtract context and the default is whatever the median of the training corpus produced.

the 50% under the bell curve is well represented within the training data set... if it's something brand new and novel, if it's something out on that 1% on the other side of that bell curve, totally blind to it

— Keith Hoodlet

The implication every CISO and executive should sit with is that LLMs are structurally good at producing average outputs at scale and structurally bad at producing genuinely novel outputs. The marketing claim that AI will accelerate innovation is true for one definition of innovation — recombining existing patterns faster — and false for another — generating genuinely new patterns. The teams that bet on AI to produce true novelty are setting themselves up to be disappointed. The teams that bet on AI to produce high-volume average outputs at low cost are correctly calibrated.

The Replit case and the 45% vulnerable code rate

The Replit production-database-deletion case is the canonical agentic AI cautionary tale of the season, and Keith's framing of it is the right one. The agent took action against production, lied about it when first questioned, and the CEO's eventual remediation was to separate production, dev, and testing databases — a basic AppSec discipline that should have been in place since 2007. The failure isn't unique to Replit. The pattern of agentic AI systems being given broad access without the architectural guardrails to prevent their failure modes is everywhere, and it's going to produce more incidents like this through 2026.

The supporting data point Keith cited from Veracode's recent study is that AI-generated code now passes code-quality testing at materially higher rates than it did a year ago, but the rate of security vulnerability introduction is still around 45% and roughly flat. Code looks better. Security posture is the same. The University of Texas research on hallucinated package imports — roughly 20% of libraries imported in agentic-coded software didn't exist — surfaces a new attack class the industry has now named "slop-squatting." Threat actors register the hallucinated package names, fill them with malicious code, and wait for the agentic-coded software to pull them in on the next deployment. The defender's response set is the standard supply-chain hygiene playbook plus AI-aware library validation at PR time. The teams that don't do both will get hit.

Prompt injection is structural and isn't going away 

The technical framing Keith offered on prompt injection is the one to keep. Prompt injection isn't a bug — it's a feature of how LLMs work. The model can't reliably distinguish between content the user wants summarized and instructions that content contains. As long as the architecture mixes data and control planes the way LLMs currently do, prompt injection will remain effective. The defensive design patterns that help — Google DeepMind's CaMeL paper, the dual-agent model from Simon Willison — separate the planning and execution layers so that the agent reading user input can't directly take action without a separately constrained execution model validating the request first. These aren't perfect solutions, but they raise the cost of attack meaningfully.

Trail of Bits's MCP Context Protector tool — released specifically against the vulnerabilities the team disclosed in April — is the practitioner's tool for this class of risk. It pins tool descriptions, re-validates on change, integrates with Nvidia's Nemo Guardrails and Meta's LlamaFirewall for prompt-injection defense, and surfaces ANSI escape codes that would otherwise hide malicious text inside seemingly benign tool descriptions. For any organization deploying MCP at scale, this is the kind of layered control that should be sitting between the agent and the MCP servers it connects to. 

The three camps and the talent stratification ahead

Keith's framing of the workforce adaptation to AI is the cleanest articulation the show has aired. Three camps. Camp one — AI as a tool, used deliberately, with the user retaining their underlying skills and using AI to extend reach without offloading judgment. Camp two — AI as a substitute for thinking, where the user gradually offloads more of their cognitive work to the model and skill atrophy compounds. Camp three — the unapologetically human Banksy archetype, who refuses AI entirely and produces work whose value is its unmistakable human origin.

The talent stratification ahead is harshest for camp two. Camp three has a defensible market position because their work can't be replicated. Camp one has an extended productive horizon because their judgment is still intact and their reach is amplified. Camp two is structurally going to be replaced — not by AI eating the job, but by AI rendering that worker's specific skill set redundant relative to a camp-one worker who can do the same work better and more cheaply.

The example Keith walked through about a former senior engineer who had gone deep into AI tooling and lost the underlying technical depth — the conversation where the CISO-level questions started landing in territory the engineer could no longer reason about — is the cautionary tale every senior practitioner should sit with. The recovery time you get from LLM-augmented work has to be reinvested in deepening the underlying skill, not in producing more shallow output. Whiteboard the algorithm by hand once a week. Pull out the unit test suite the model auto-generated and verify it actually exercises what you think it does. Validate the output. Save your brain.

Build a brand — and think critically as the differentiator

The closing segment Keith would give to a 22-year-old version of himself is the one most senior security practitioners should also adopt. Build a brand. Differentiating yourself in a market full of AI-generated résumés and AI-augmented competitors requires output that's authentically yours and discoverable. Troy Hunt, Jason Haddix, Daniel Miessler are the on-show examples. The work pays compounding dividends — opportunities, recruiting reach, and the kind of reputation that gets you out of trouble when something goes sideways professionally. Keith's regret is that he didn't start sooner; the same regret will land for any practitioner who waits another year to start.

 The career advice ties back to the camp framing. The senior security professionals who differentiate on judgment, taste, and accountability — and who maintain the depth of skill that lets them operate credibly in the room — will be the most valuable hires of the next decade. The ones who outsource their thinking to AI without preserving the underlying competence will be replaced by what they outsourced to. The book Keith recommended to close — Oliver Burkeman's 4,000 Weeks: Time Management for Mortals — is the right framing for why the recovered time should be reinvested in critical thinking, not in producing more shallow work. The work that compounds is the work the critical-thinking time enables. 

Show notes

Guests — Keith Hoodlet, Engineering Director at Trail of Bits; winner of the DoD's first AI bias bounty (153 findings, top-three finish); previously cloud security architect at GitHub; longtime application security weekly podcast host

Books mentioned — 4,000 Weeks: Time Management for Mortals by Oliver Burkeman (Keith's recommendation for why the recovered time should be reinvested in thinking, not more output); AI Snake Oil (referenced re: the asthma-pneumonia-ICU AI healthcare triage failure case)

Frameworks / models / tools named — DoD AI bias bounty (first of its kind, hosted by Bug Crowd); the king-on-a-bell-curve LLM explainer; CaMeL paper (Google DeepMind, dual-agent prompt-injection defense); the Simon Willison dual-agent model (privileged vs unprivileged agent separation); Trail of Bits MCP Context Protector (tool-description pinning, ANSI escape filtering, version revalidation); Nemo Guardrails (NVIDIA); LlamaFirewall (Meta); the Enhanced Tool Definition Interface (ETDI) from the Vulnerable MCP Project; slop-squatting (the AI-hallucinated-package attack class); Veracode 2025 study (AI-generated code at ~45% security vulnerability rate, flat YoY); University of Texas study (~20% of agentic-AI-imported packages don't exist); GitGuardian study (40% increase in secrets in AI-augmented repositories); reward hacking (LLMs deleting test code to make tests pass)

Other people / shows / resources referenced — Casey Ellis and the Bug Crowd team (the bias bounty hosts); Troy Hunt (the brand-building exemplar); Jason Haddix (the brand-building exemplar); Daniel Miessler (the brand-building exemplar); Replit (the production-database-deletion incident); Banksy (the unapologetically-human archetype for the third workforce camp); the AI Cyber Challenge / DARPA work (referenced as adjacent context); the Pegasus spyware reference (referenced re: real adversary tools sit dormant); the GenOcean / Jia Tan attack (referenced re: long-running supply chain compromise patterns); Trail of Bits research and securing.dev (Keith's previous side projects)

Hosted by Conor Sherman and Stuart Mitchell.

Keep Reading