AI coders think they’re 20% faster — but they’re actually 19% slower

David Gerard@awful.systems · 1 year ago

AI coders think they’re 20% faster — but they’re actually 19% slower

RememberTheApollo_@lemmy.world · 1 year ago

Anyone who has had to unfuck someone else’s work knows it would have been faster to do the work correctly from scratch the first time.

NigelFrobisher@aussie.zone · 1 year ago

I have an LLM usage mandate in my performance review now. I can’t trust it to do anything important, so I’ll get it to do incredibly noddy things like deleting a clause (that I literally always have highlighted) or generating documentation that’s more long-winded than just reading the code, and then go to the bathroom while it happens.

Threeme2189@sh.itjust.works · 1 year ago

Are you fucking serious?

David Gerard@awful.systems · 1 year ago

this sort of bloody stupid metric is widespread, i’ve heard about it widely

froztbyte@awful.systems · 1 year ago

goodhart’s law’s zombie era

purplemonkeymad@programming.dev · 1 year ago

Gotta justify all that money that they have just spent without any trials, testing or end user input.

hedgehog@ttrpg.network · 1 year ago

From the blog post referenced:

“We do not provide evidence that: AI systems do not currently speed up many or most software developers”

Seems the article should be titled “16 AI coders think they’re 20% faster — but they’re actually 19% slower” - though I guess making us think it was intended to be a statistically relevant finding was the point.

That all said, this was genuinely interesting and is in line with my understanding of the human psychology that’s at play. It would be nice to see this at a wider scale, broken down across different methodologies / toolsets and models.

Tar_Alcaran@sh.itjust.works · 1 year ago

I just want to point out that every single heavily downvoted, idiotic pro-AI reply on this post is from a .ml user (with one programming.dev thrown in).

I wonder which way the causation flows.

Omega@discuss.online · 1 year ago

For every bit of time saved, you hit that one kink that slows you down by a fuck ton: something AI just can’t get right, something that takes AI 5 hours to fix but would’ve taken you 10-20 to write from scratch.

TommySoda@lemmy.world · 1 year ago

As someone that has had to double check people’s code before, especially those that don’t comment appropriately, I’d rather just write it all again myself than try and decipher what the fuck they were even doing.

David Gerard@awful.systems · 1 year ago

ahahaha holy shit. I knew METR smelled a bit like AI doomsday cultists and took money from OpenPhil, but those “open source” projects and engineers? One of them was LessWrong.

Here’s a LW site dev whining about the study, he was in it and i think he thinks it was unfair to AI

“I think if people are citing in another 3 months time, they’ll be making a mistake”

dude $NEXT_VERSION will be so cool

so anyway, this study has gone mainstream! It was on CNBC! I urge you not to watch that unless you have a yearning need to know what the normies are hearing about this shit. In summary, they are hearing that AI coding isn’t all that actually and may not do what the captains of industry want.

around 2:30 the two talking heads ran out of information and just started incorrecting each other on the fabulous AI future, like the worst work lunchroom debate ever but it’s about AI becoming superhuman

the key takeaway for the non techie businessmen and investors who take CNBC seriously ever: the bubble starts not going so great

diz@awful.systems · 1 year ago

“I think if people are citing in another 3 months time, they’ll be making a mistake”

In 3 months they’ll think they’re 40% faster while being 38% slower. And sometime in 2026 they will be exactly 100% slower - the moment referred to as “technological singularity”.

BigMuffN69@awful.systems · edit-2 1 year ago

Yeah, METR was the group that made the infamous AI IS DOUBLING EVERY 4-7 MONTHS GRAPH where the measurement was 50% success at SWE tasks based on the time it took a human to complete it. Extremely arbitrary success rate, very suspicious imo. They are fanatics trying to pinpoint when the robo god recursive self improvement loop starts.

shnizmuffin@lemmy.inbutts.lol · 1 year ago

@dgerard@awful.systems who is your illustrator? These are consistently great.

David Gerard@awful.systems · 1 year ago

these are stock images! Which are surprisingly cheap. By Valeriy Kachaev, who puts stuff up as Studiostoks on a pile of stock image sites. His pics are bizarre and keep being the perfect thing.

HedyL@awful.systems · 1 year ago

I’m not sure how much this observation can be generalized, but I’ve also wondered how much the people who overestimate the usefulness of AI image generators underestimate the chances of licensing decent artwork from real creatives with just a few clicks and at low cost. For example, if I’m looking for an illustration for a PowerPoint presentation, I’ll usually find something suitable fairly quickly in Canva’s library. That’s why I don’t understand why so many people believe they absolutely need AI-generated slop for this. Of course, Canva is participating in the AI hype now as well. I guess they have to keep their investors happy.

David Gerard@awful.systems · 1 year ago

all the stock sites are. use case: an image that’s almost perfect but you wanna tweak it

LEARN PAINT YOU GHOULS

Charlie Stross@wandering.shop · 1 year ago

@dgerard What fascinates me is *why* coders who use LLMs think they’re more productive. Is the complexity of their prompt interaction misleading them as to how effective the outputs it results in are? Or something else?

HedyL@awful.systems · edit-2 1 year ago

“What fascinates me is why coders who use LLMs think they’re more productive.”

As @dgerard@awful.systems wrote, LLM usage has been compared to gambling addiction: https://pivot-to-ai.com/2025/06/05/generative-ai-runs-on-gambling-addiction-just-one-more-prompt-bro/

I wonder to what extent this might explain this phenomenon. Many gambling addicts aren’t fully aware of their losses, either, I guess.