HACKER Q&A
📣 make_it_sure

Why are there no actual studies that show AI is more productive?


I know there are companies that are highly productive with AI, including ours. However, AI skeptics ask for real studies, and all of the ones available now show no real gains.

Many won't care unless you show them an actual study.


  👤 flawn Accepted Answer ✓
AI can build systems based on static assumptions that the orchestrator (you) gives it. But proper engineering (which is what matters economically much more) is the process of the system's assumptions & requirements changing over time to ensure you have a reliable and consistent service - and that's not something that AI excels at (yet).

👤 anovikov
If AI makes people so much more productive, why aren't there many more apps on the Apple App Store? Mobile apps involve a lot of dirty, boring scaffolding work, which AI easily automated two years ago as one of its first applications. It should have been the very first place where a productivity boost was evident, at least a year ago. But it's just not there. Why not?

👤 Nevermark
I think most major efficiency improvements involve more adaptation costs than expected.

Those that can “see” the potential clearly push through the adaptation period over time, but it can be much longer than anyone expects.

Even as new practices reveal the benefits are real, the cost including iterative adaptation may not actually be net positive on a running, as opposed to forward looking, basis for some time.

Depending on how forward looking a group is, that is a problem, or not a problem at all.

But external measurements won’t be able to distinguish between what may be very fast forward looking returns and no benefit, for some time.


👤 AugustoCAS
DORA released a report last year: https://dora.dev/research/2025/dora-report/

The gains are a ~17% increase in individual effectiveness, but with ~9% extra instability.

In my experience using AI-assisted coding for a bit longer than two years, the benefit is close to what DORA reported (maybe a bit higher, around 25%). Nothing close to an average of 2x, 5x, 10x. There's a 10x on some very specific tasks, but also a negative factor on others, as seemingly trivial but high-impact bugs get to production that would normally have been caught very early in development or in code review.

Obviously it depends what one does. Using AI to build a UI to share cat pictures has a different risk appetite than building a payments backend.


👤 thinkingemote
[delayed]

👤 chrisjj
[delayed]

👤 IshKebab
These sorts of things are really hard to study. Combine that with the fact that the AI landscape is so varied and fast-moving, and it's easy to see why there aren't many studies on it.

There is a mountain of things that we reasonably know to be true but haven't done studies on. Is it beneficial for programming languages to support comments? Are regexes error-prone? Does static typing improve productivity on large projects? Is distributed version control better than centralised (lock-based)? Etc.

Also you can't just say "AI improves productivity". What kind of AI? What are you using it for? If you're making static landing pages... yeah obviously it's going to help. Writing device drivers in Ada? Not so much.


👤 austin-cheney
Some people prefer evidence before investing large amounts of money and labor. That is not an indication of irrational behavior, even if it challenges your emotionally invested opinion or result.

👤 vjk800
We've had the AI tools for maybe two years, and they have only gotten really good in the past half a year or so. For fuck's sake, adopting electricity took something like 50 years; why would you expect to see any kind of effect from AI so quickly? The tools are still developing - rapidly - and people are still figuring out the best usage patterns for them.

👤 Lionga
There are a few studies that show perceived increases in productivity (all of them show a negative or almost no real increase, but I don't think that is relevant to snake oil salesmen).

👤 felipeerias
Most people seem to be expecting some kind of quantitative analysis: N developers undertook M tasks with and without access to a given AI tool; here is the statistical evidence that shows (or fails to show) the effect, and this result is valid across other projects and tools.

In practice, arriving at this ideal scenario can be very challenging, so our experiments will typically be narrow by necessity, with the expectation that their results can be extrapolated beyond their specific experimental setup.

Another valid approach would be to carry out qualitative research, for example a case study. This would typically require the study of one (or a few) developers and their specific contexts in great detail. The idea is that by understanding how one person navigates their work and their tools we would gain insights that might be adapted to our specific situation.

Personally, I tend to prefer detailed qualitative accounts of how other developers are working on similar projects and with similar tools as me. But in any case, both approaches are valid and complementary.


👤 ChicagoDave
I can report all kinds of productivity using Claude AI and Claude Code.

- built AWS dashboard to identify and manage internal resources in a few hours

- solved several production problems connecting Claude to devops APIs in near real-time

- identified solutions for feature requests or bugs for existing internal applications including detailed source changes

- built Ledga.us

- built sharpee.net and its associated GitHub repo

- building mach9 poker ios and android apps

- working on undisclosed app that might disrupt a huge Internet sector

We’re still in the early stages of LLM-influenced development, and reporting productivity will take time.


👤 chrysoprace
Self-reported productivity does not equate to actual productivity. People have all sorts of biases that make such assessments fairly pointless. They only gauge how you feel about your productivity, which is not necessarily a bad thing, but it doesn't mean you're actually more productive.

👤 ghostlyInc
I think the productivity gain from AI is mostly micro-friction reduction.

Things like generating boilerplate, quick test scaffolding or documentation lookups. Each one is small, but they compound during the day.

That’s probably why it’s hard to capture in traditional studies.

Curious: has anyone seen studies measuring task-level productivity instead of overall output?


👤 lysecret
Because we are incapable of measuring developer productivity.

👤 aragilar
How do you know you're more productive? Humans are excellent at fooling themselves, and absent a metric (or multiple metrics) by which you can measure your productivity, you can't be sure you're actually being more productive.

👤 shawntwin
Surely, the current openclaw has shown AI is productive. More and more ordinary people are using it to change their lives. Amazing.

👤 Stronz
It might also depend on how the tools are used. In practice a lot of value seems to come from reducing small bits of friction rather than dramatically increasing output.

👤 ltning
Why are we even discussing this before the theft problem has been solved? Or the energy consumption?

If anything, there need to be studies done on

- the drop in creative, novel output from actual people (due to theft and loss of jobs)

- the energy cost per person in relevant industries, pre/post LLM adoption


👤 smackeyacky
The code was never the bottleneck. It’s always the org around it.

👤 rienbdj
GitHub has their own study using Copilot but given the obvious conflict of interest I would discount it.

👤 metalman
I believe that individual productivity in most areas peaked long ago. Industrial production is still scaling up, and this is the model that applies to AI - or, as it really is, the automation of "management" - but as this is NOT a linear mechanical process (almost, oh! so almost mechanical), it is not quite working.

For exactly the same reason that industry cannot make you one, let's say, car that is green on one side but orange on the other, with six headlights but only one seat, industry can't scale down: minimum order is 250,000 units, it will take 3 years, pay us now!

I deal with this every week. Something small (smol) breaks in a large corporate environment. They work in millions, they have teams and departments, but the little handle thing on a set of automated front doors facing a main street in a significant asset has failed. Then you watch the whole corporate apparatus convulse as they try to figure out how to pay something smaller than a rounding error to a company that barely exists - a payment that has to be passed higher and higher to be approved, because there is no button for it, just like a major corporate deal. People can't figure this out; AI never will. And I am exploring just how to exploit this scaling problem to my advantage.

👤 bawolff
> Many won't care unless you show them an actual study

Why are the pro-AI people so obsessed with proving the AI skeptics wrong?

Is AI working for you? Great. Go make great things. Isn't that the point, after all? Who cares who believes you if the results speak for themselves?


👤 heraldgeezer
So... you want a study to prove your ready-made hypothesis?

👤 charcircuit
Because the data is private, and such studies often are not measuring solely the part that AI makes more productive. And measuring productivity in general is a very hard problem, so the results of any such study are often meaningless in practice. Pair this with studies today still being based on ancient models like GPT-4o, and it's even more meaningless.

If you are familiar with AI, it's obvious how it increases productivity. When bugs get fixed with zero human time, it's plain as day that this was more productive than a human making the fix.


👤 otabdeveloper4
Just trust the vibe, bro. One trillion market cap cannot be wrong.

👤 danr4
because you can just look at the commit log

👤 blitzar
Ask HN: Why are there no actual studies that show the sky is green and the earth is at the centre of the universe?

I would have included the flatness of earth, but the flat earthers have some excellent studies (reviewed by their flat earth peers) on the subject.


👤 hennell
What's the best car? If you're trying to go fast, it's one answer; if you're trying to carry as much load as possible, it's another; if you're buying for your just-qualified teen, it's another. But "best" is obviously subjective, so what about safest? I don't know the specifics there, but if you're in the EU the "safest" car would be very different from the "safest" in the US, because their safety studies measure very different things.

Which is the issue with almost all studies and statistics, what it means depends entirely on what you're measuring.

I can program very, very fast if I only consider the happy path, hard-code everything, and don't bother with things like writing tests, defining types, or worrying about performance under expected scale. It's all much faster right up until the point it isn't - and then it's much slower. AI isn't quite so obviously bad, but it can still turn short-term gains into long-term problems, which is what studies tend to focus on, as the short term doesn't usually require a study to observe.

I think AI is similar to outsourcing staff to cheaper countries, replacing ingredients with cheaper alternatives, and other MBA-style ideas. It's almost always instantly beneficial, but the long-term issues are harder to predict and can have far more varied outcomes depending on weird specifics of the business.


👤 lnsru
It’s not the company. It’s always the 10x developer who uses the tools to increase his output. My buddies report at least once a month on the new AI policy in the corporate world. All of them are bollocks, written by someone who never wrote any code.

👤 jokoon
Funny how much money is invested, yet there is no proof those investments will yield profits.

It's all make believe


👤 andrewstuart
Because software development - anything to do with software development is incredibly hard to quantify.

And no, no-one is waiting for a “study” to believe in AI, they’re out doing it.


👤 hypeatei
Productivity was never about the lines of code written. I thought the industry as a whole had collectively decided that metric was a joke before the age of LLMs. The bottlenecks are the same: office politics, coordinating teams, consulting subject matter experts and coherent system design. AI is not a swiss army knife that results in devs becoming their own island; LLMs cannot tell me if something would jive well with our customer base -- I need people in the company who actually interact with them, for example.

👤 devilkin
I have a coworker who is obsessed by LLMs and keeps reiterating that he is super productive with them.

Yet I have yet to see the first delivery or codebase from that same person. (I am not his manager.)

I lean toward the LLM-skeptic camp. I know they're great for some things (never for outsourcing your thinking, which unfortunately a lot of people do), but I'd like to see some studies, because the business press reports a lot of net negatives, or at most around a 10% improvement.


👤 PunchyHamster
I do remember some of them showed some productivity improvement, but it pretty much dove off a cliff with the complexity of the tasks involved, or the small improvement on medium-difficulty tasks was eaten up by the time spent waiting for responses.

Note that most of them were focused on programming tasks aimed at shipping a product, not other use cases like "prototype a dozen ideas quickly before we pick a direction" or "write/update documentation about this feature", where AI might be significantly more productive than at plain programming.


👤 eudamoniac
I can tell you that at Cisco they just released an internal AI study that measured just about everything related to AI at Cisco except tangible gain. No mention of productivity, but tons of other data about who uses it, for how long, why or why not, what correlates with usage or non-usage, etc. I can only assume what that means.