AI Made Everyone Faster. But Faster Is Not Necessarily Better.
You just gave everyone in the building a chainsaw. Including the people who were already a danger with scissors. AI amplifies impact in whichever direction it already points, negative included, and nobody is deploying it with that in mind.
The conversation nobody has at work
A trusted former colleague of mine — someone who started life as a developer and now directs an organization of over ten thousand people — shared a workforce model with me over virtual coffee. He has seen things at every level. His model was simple, cynical, and immediately recognizable to anyone who has spent serious time in a large organization:
- 10% of the people do 80% of the work
- 20% do 30%
- 30% do 0%
- 40% do -10%
Add those up. The top thirty percent produce 110 percent of the net output; the bottom forty percent claw ten points back, leaving the total at exactly one hundred. The bottom forty percent are not merely unproductive; they are actively subtracting value. Whether the organization succeeds depends entirely on the top ten percent being far more impactful than the bottom forty percent are destructive.
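The arithmetic deserves a second look, because the headline number hides the real story. A minimal sketch, using his invented percentages:

```python
# The model's tiers: each group's share of the organization's net output.
# These are illustrative numbers from a coffee conversation, not data.
tiers = {"top 10%": 0.80, "next 20%": 0.30, "middle 30%": 0.00, "bottom 40%": -0.10}

produced = sum(v for v in tiers.values() if v > 0)
print(f"Top 30% produce {produced:.0%} of net output")    # 110%
print(f"Bottom 40% subtract {-tiers['bottom 40%']:.0%}")  # 10%
print(f"Net total: {sum(tiers.values()):.0%}")            # 100%
```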
No C-suite executive would say this in public. HR would spontaneously combust. Every corporate communication insists that “our people are our greatest asset” — all of them, presumably, including the forty percent who are setting things on fire.
And yet. If you have spent twenty or thirty or forty years inside large organizations, you just read those numbers and nodded. Not because they are precisely calibrated — they are not — but because the shape is instantly familiar. You have seen it. You know exactly who falls where. You could probably sketch the distribution for your current organization on a napkin in under sixty seconds.
The uncomfortable question is what happens when you hand every one of those people a cognitive amplifier.
The research says he’s roughly right
My friend’s specific numbers are anecdotal. The underlying pattern is one of the most replicated findings in organizational research.
O’Boyle and Aguinis published the landmark study in 2012 — five studies, 198 samples, 633,263 individuals across researchers, entertainers, politicians, and athletes [1]. Their finding won the Personnel Psychology Best Article award: individual performance is not normally distributed. It follows a power-law distribution. The top decile produces roughly 30% of total output. The top quartile produces over 50%. This held consistently across industries, job types, performance measures, and time frames. Their 2014 follow-up found that 82.5% of 229 samples had significantly heavy right tails [2].
The phenomenon scales. In software engineering, every major study from Sackman in 1968 through Oliveira in 2023 confirms large individual variation — though the precise ratio is debated [3][4]. The most careful recent work suggests log-normal distributions with roughly a 2.4x ratio between top and bottom halves, not the dramatic 10:1 of older studies [4]. The skew is real. The specific multiplier depends on what you measure and how.
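To make the shape argument concrete, here is a toy simulation, with parameters I invented for illustration rather than fitted to any of these studies. It compares the top decile's share of total output under a bell curve and under a heavy-tailed log-normal:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two hypothetical output distributions with comparable means.
normal = np.clip(rng.normal(100, 15, n), 0, None)   # bell curve
heavy = rng.lognormal(mean=4.2, sigma=0.9, size=n)  # heavy right tail

def top_decile_share(output):
    """Fraction of total output produced by the top 10% of individuals."""
    cutoff = np.quantile(output, 0.9)
    return output[output >= cutoff].sum() / output.sum()

print(f"bell curve: top 10% produce {top_decile_share(normal):.0%}")  # ~13%
print(f"heavy tail: top 10% produce {top_decile_share(heavy):.0%}")   # ~35%
```

Under normality, the top decile barely outproduces its headcount. Under a heavy tail, it approaches the share O'Boyle and Aguinis actually measured. Which distribution you assume determines whether "treat everyone the same" is a reasonable policy or a category error.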
But my friend’s model has a feature that the standard productivity research avoids: the negative tier. And that, too, has empirical backing.
Schulmeyer formalized the “Net Negative Producing Programmer” concept in 1992 — programmers whose defect rates are high enough that the cost of their errors exceeds the value of their output [5]. In a typical team of ten, he estimated up to three may qualify. On high-defect projects, up to half. Felps, Mitchell, and Byington demonstrated experimentally that a single negative team member reduces team performance by 30–40% [6]. And in what may be the most delightfully counterintuitive finding in the management literature, Housman and Minor studied 50,000 workers and found that avoiding one toxic hire saves $12,489 — while hiring a top-one-percent superstar adds only $5,303 [7]. Avoiding a bad hire is worth more than double finding a great one. Oh, and the toxic workers? They often have above-average raw output. They look productive. They are net destructive.
Every senior leader who has been in the trenches knows this instinctively. The research just makes it awkward to deny.
Now give them a cognitive amplifier
Here is where the conversation over coffee turned into something I could not stop thinking about.
AI is commonly framed as a productivity tool — a universal capability multiplier that lifts all boats. The enterprise surveys confirm that this is how organizations deploy it: universally, by function and hierarchy, without differentiating by performer capability [8][9][10]. No major enterprise survey has identified a single organization that stratifies AI deployment based on individual performer quality [11]. The assumption is uniform benefit.
The research says the assumption is wrong. And the way it is wrong depends on what you are asking people to do.
For structured, routine tasks — customer service, professional writing, well-defined consulting problems — AI does act as a leveler. It narrows the gap. Brynjolfsson, Li, and Raymond studied 5,172 customer service agents and found that low-skilled workers improved by 34%, while experienced workers saw minimal gains [12]. Noy and Zhang found the same pattern in professional writing [13]. Dell’Acqua and Mollick’s landmark study of 758 BCG consultants found that bottom-half performers improved by 43% versus 17% for the top half — on tasks inside AI’s capability frontier [14].
Inside the frontier. That qualifier is doing a lot of work.
On tasks outside the frontier — problems requiring judgment, contextual reasoning, novel problem-solving — the pattern reverses. In the same BCG study, consultants using AI on tasks beyond its capability were 19 percentage points less likely to get correct answers than those working without AI [14]. The AI produced plausible, confident, wrong output. The consultants could not tell.
The Otis study of Kenyan entrepreneurs is the one that should keep deployment strategists up at night [15]. Researchers gave GPT-4 business advice to entrepreneurs via WhatsApp. High performers gained roughly 15%. Low performers declined by roughly 8%. Both groups sought advice at similar rates. Both received similar quality advice. The difference was entirely in the human’s ability to evaluate, select, and implement the right pieces of that advice. The AI did not discriminate. The humans’ judgment did.
The reconciling framework is clean: when AI substitutes for human effort, it levels [16]. When AI complements human judgment, it amplifies. The task determines the outcome, not the tool.
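A deliberately crude toy model makes the two regimes concrete. The functional forms below are mine and purely illustrative; none of [12][14][15] specify them. What matters is the direction of the effect, not the numbers:

```python
import numpy as np

rng = np.random.default_rng(1)
skill = rng.uniform(0, 1, 10_000)  # hypothetical performer quality, 0 = worst

# Substitution (routine tasks): AI does the work, so the weakest gain most.
routine = skill + 0.4 * (1 - skill)

# Complementarity (judgment tasks): the gain scales with the performer's
# ability to evaluate AI output; below a threshold, plausible-but-wrong
# suggestions subtract value.
judgment = skill + 0.4 * (skill - 0.4)

for name, after in [("routine ", routine), ("judgment", judgment)]:
    print(f"{name}: spread {skill.std():.2f} -> {after.std():.2f}, "
          f"bottom quartile {skill[skill < 0.25].mean():.2f} -> "
          f"{after[skill < 0.25].mean():.2f}")
```

The routine regime compresses the spread and lifts the bottom. The judgment regime widens the spread and pushes the bottom below where it started, which is the Otis pattern in miniature.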
Now map that back to my friend’s workforce model. The bottom 40% — the ones already subtracting value — are not primarily doing structured, routine work poorly. They are making bad decisions, creating coordination overhead, generating errors that others must fix, and injecting dysfunction into teams. That is judgment work. And on judgment work, AI does not level. It amplifies.
You just gave the bottom 40% a tool that makes them faster at everything they were already doing wrong.
The compression problem
There is a subtler issue hiding inside the leveling data, and it is arguably more dangerous than the amplification problem.
When AI raises the floor more than the ceiling on routine tasks, it compresses the performance distribution. Everyone converges toward the middle. The gap between the best and the rest shrinks. On a spreadsheet, this looks like progress. Average performance improves. Variance decreases. The metrics go up.
But organizations do not succeed because of average performance. They succeed because the top of the distribution — the ten percent doing eighty percent of the work — solves the problems that nobody else can solve. The value of that top tier is not in their routine output. It is in their judgment, their pattern recognition, their ability to see around corners. AI does not compress that. But it does make it harder to identify, because now everyone’s routine output looks similar.
The DORA 2025 report captured this paradox at the organizational level: individual developers using AI completed 21% more tasks and merged 98% more pull requests, but organizational delivery metrics stayed flat [17]. Individual output went up. Organizational outcomes did not. The throughput increase exposed downstream bottlenecks — testing, review, deployment — in organizations without mature practices to absorb the acceleration. AI made the individuals faster. It made the organization’s dysfunction more visible.
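The mechanism is ordinary pipeline math. A minimal sketch, with capacities I made up:

```python
# Toy delivery pipeline: work flows dev -> review -> deploy.
# Weekly delivery is capped by the slowest stage, not the fastest.
def shipped_per_week(dev, review, deploy):
    return min(dev, review, deploy)

before = shipped_per_week(dev=50, review=40, deploy=45)
after = shipped_per_week(dev=50 * 1.98, review=40, deploy=45)  # PRs nearly double

print(before, after)  # 40 40: dev output nearly doubled, delivery unchanged
```

Until review and deployment capacity grows, every extra pull request just deepens the queue.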
More output is not the same as more value.
The conversation nobody is having
Every AI deployment strategy I have seen treats the tool as a uniform multiplier. Give everyone access. Measure time saved. Report productivity gains. Declare victory.
Nobody is asking: what happens when the forty percent get faster?
Nobody is measuring: did the quality distribution change, or just the quantity? Are the people who were making bad decisions now making bad decisions at machine speed? Are the net-negative producers generating net-negative output more efficiently?
The Housman and Minor finding should be tattooed on every AI deployment plan: avoiding a destructive employee is worth more than double finding a great one [7]. If that was true before AI, it is dramatically more true after. A toxic worker with above-average raw output was already hard to identify and expensive to tolerate. A toxic worker with AI-augmented above-average output is the same problem at higher velocity.
My friend saw this from the top of a ten-thousand-person organization. I saw it from the trenches, building infrastructure that had to survive after I walked away. The research sees it in the data — 633,263 individuals’ worth of power-law distributions, 50,000 workers’ worth of toxic-hire economics, 758 consultants falling for plausible nonsense outside the frontier.
The only people who do not see it are the ones writing the AI deployment strategy.
This article is a companion to “The Trust Chasm,” which examines the structural root causes of the widening gap between AI capability and AI reliability. Where that article traces the compounding feedback loop from sycophancy to deskilling, this one asks what happens when the tool reaches the people who were already the problem.
References
[1] E. O’Boyle and H. Aguinis, “The Best and the Rest: Revisiting the Norm of Normality of Individual Performance,” Personnel Psychology, 65(1), 79–119, 2012. https://doi.org/10.1111/j.1744-6570.2011.01239.x
[2] H. Aguinis and E. O’Boyle, “Star Performers in Twenty-First Century Organizations,” Personnel Psychology, 67, 313–350, 2014. https://doi.org/10.1111/peps.12054
[3] H. Sackman, W. J. Erikson, and E. E. Grant, “Exploratory Experimental Studies Comparing Online and Offline Programming Performance,” Communications of the ACM, 11(1), 3–11, 1968. https://doi.org/10.1145/362851.362858. Replicated by Curtis (1981), DeMarco and Lister (1985), Boehm et al. (2000), and others with ratios ranging from 10:1 to 28:1.
[4] E. Oliveira et al., “Characteristics and Generative Mechanisms of Software Development Productivity Distributions,” Information and Software Technology, 159, 107200, 2023. https://doi.org/10.1016/j.infsof.2023.107200
[5] G. G. Schulmeyer, “The Net Negative Producing Programmer,” American Programmer, 1992. Collected in Total Quality Management for Software, Van Nostrand Reinhold.
[6] W. Felps, T. R. Mitchell, and E. Byington, “How, When, and Why Bad Apples Spoil the Barrel: Negative Group Members and Dysfunctional Groups,” Research in Organizational Behavior, 27, 175–222, 2006. https://doi.org/10.1016/S0191-3085(06)27005-9. Supported by R. F. Baumeister et al., “Bad Is Stronger Than Good,” Review of General Psychology, 5(4), 323–370, 2001.
[7] M. Housman and D. Minor, “Toxic Workers,” Harvard Business School Working Paper 16-057, November 2015. https://www.hbs.edu/ris/Publication%20Files/16-057_d45c0b4f-fa19-49de-8f1b-4b12fe054fea.pdf
[8] McKinsey, “The State of AI: Global Survey 2025.” https://www.mckinsey.com/capabilities/quantumblack/our-insights/the-state-of-ai
[9] BCG, “AI at Work 2025: Momentum Builds, but Gaps Remain.” https://www.bcg.com/publications/2025/ai-at-work-momentum-builds-but-gaps-remain
[10] Deloitte, “State of AI in the Enterprise 2026.” https://www.deloitte.com/global/en/issues/generative-ai/state-of-ai-in-enterprise.html
[11] No major enterprise survey (McKinsey n=1,933; BCG n=10,600; Deloitte n=3,235) identified capability-based stratification in AI deployment. Deployment is phased by function, hierarchy, and geography — never by performer quality.
[12] E. Brynjolfsson, D. Li, and L. Raymond, “Generative AI at Work,” The Quarterly Journal of Economics, 140(2), 889–938, 2025. https://doi.org/10.1093/qje/qjae044
[13] S. Noy and W. Zhang, “Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence,” Science, 381(6654), 187–192, 2023. https://doi.org/10.1126/science.adh2586
[14] F. Dell’Acqua, E. McFowland III, E. R. Mollick et al., “Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality,” Harvard Business School Working Paper 24-013, 2023. https://doi.org/10.2139/ssrn.4573321
[15] N. G. Otis, R. Clarke, S. Delecourt, D. Holtz, and R. Koning, “The Uneven Impact of Generative AI on Entrepreneurial Performance,” Harvard Business School Working Paper 24-042, 2023. https://doi.org/10.2139/ssrn.4671369
[16] “Equalizer or Amplifier? How AI May Reshape Human Cognitive Differences,” 2025. https://arxiv.org/abs/2512.03902
[17] DORA, “State of AI-Assisted Software Development 2025.” https://dora.dev/research/2025/dora-report/