Does anyone have numbers for churn vs. cumulative code?
Most of my commits (hand written and AI) have delete counts that are 75-110% of the added line count.
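For anyone who wants their own numbers: the delete/add ratio per commit can be computed from `git log --numstat` output. Below is a minimal sketch; the `COMMIT` separator string and the `churn_ratios` function name are my own choices, not anything standard.

```python
import re

def churn_ratios(numstat_log: str) -> list[float]:
    """Parse `git log --format=COMMIT --numstat` output and return, per commit,
    deleted lines as a fraction of added lines (commits with no adds are skipped)."""
    ratios: list[float] = []
    added = deleted = 0

    def flush() -> None:
        nonlocal added, deleted
        if added:
            ratios.append(deleted / added)
        added = deleted = 0

    for line in numstat_log.splitlines():
        if line == "COMMIT":
            flush()  # commit boundary: close out the previous commit's totals
        else:
            # numstat lines look like "<added>\t<deleted>\t<path>";
            # binary files show "-" and are ignored by this regex
            m = re.match(r"(\d+)\t(\d+)\t", line)
            if m:
                added += int(m.group(1))
                deleted += int(m.group(2))
    flush()
    return ratios
```

Feed it `git log --format=COMMIT --numstat` from any repo; ratios in the 0.75-1.10 range would match the pattern described above.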
The point that many developers will forget to tell the LLM to run cleanup/refactoring passes is probably true, though. (I’ve definitely found ghost-chasing bugfixes in all sorts of corners of LLM-generated code.)
So, I’ve explored AI coding, but my conclusion up to this point has been that it’s interesting, but the code is sometimes a mess, and sometimes it will completely crater the project to the point where you just have to throw it all away and start over. After reading this article, I keep wondering if we’re really being productive or just creating lots of crappy code at machine speeds now. It’s one thing to say that we are using a “security agent,” for example, to ensure the security of the code, but quite another to actually know (or at least strongly believe) that our code is really secure. With all the froth of generating thousands of lines of code, how are we sure? In some sense, my question is whether we’re building a Winchester Mystery House or a house of cards.
Software developers working on their own have built monstrosities before (not as quickly) but it seems likely that this is a skill issue and we will learn how to use these tools better. You can tell coding agents to work on cleaning up code, improving the architecture, and so on.
Maybe adopting some hard constraints on code complexity that agents have to work within would help?
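One way such a hard constraint could work is a CI gate that rejects functions above a complexity ceiling. The sketch below uses Python's `ast` module to approximate cyclomatic complexity; the threshold of 10 and the `check_source` name are illustrative assumptions, not anything proposed in the thread.

```python
import ast

# Hypothetical hard limit the agent's output must stay under;
# 10 is a common rule-of-thumb ceiling, chosen here as an assumption.
MAX_COMPLEXITY = 10

def cyclomatic_complexity(func: ast.AST) -> int:
    """Rough cyclomatic complexity: 1 plus one per branch point."""
    branch_nodes = (ast.If, ast.For, ast.While, ast.IfExp,
                    ast.ExceptHandler, ast.And, ast.Or)
    return 1 + sum(isinstance(node, branch_nodes) for node in ast.walk(func))

def check_source(source: str) -> list[str]:
    """Return names of functions exceeding MAX_COMPLEXITY; empty list means pass."""
    tree = ast.parse(source)
    return [node.name for node in ast.walk(tree)
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
            and cyclomatic_complexity(node) > MAX_COMPLEXITY]
```

Wired into CI, a failing check would force the agent (or its operator) to refactor before merging, rather than relying on someone remembering to ask for cleanup.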
Yep, surely humans write bad code, too. But not nearly as fast. This feels a lot like hiring oodles of hyper-productive junior developers. Are we going to get true productivity out of that or a scrambled mess? I don’t know the answer to that. Or maybe the models get so much better that it’s like hiring oodles of senior developers and architects and the payoff is real.
I’d just like to thank the author for giving the correct reason for the Winchester Mystery House instead of just blindly repeating the “she went crazy” story as truth.
Not really. From the essay: “I had been preaching the Unix gospel of small tools, rapid prototyping and evolutionary programming for years. But I also believed there was a certain critical complexity above which a more centralized, a priori approach was required. I believed that the most important software (operating systems and really large tools like the Emacs programming editor) needed to be built like cathedrals, carefully crafted by individual wizards or small bands of mages working in splendid isolation, with no beta to be released before its time.”
So the Unix-philosophy small tools that constitute an important part of the GNU project are excluded. Rather, it’s about any programs of significant complexity, like Emacs (and likely GCC) and many commercial products. While the cathedral model doesn’t imply closed source, it implies building “in […] isolation”, rather than in the open. It may or may not remain proprietary and/or closed source.
Linux demonstrated to ESR that complex projects can also be built in the open with many collaborators, and don’t necessarily require the cathedral, which is what inspired the essay.
It wasn't one thing; GNU is a case of cathedrals. Corporations are usually more cathedral than bazaar because of their hierarchical, top-down structure, but YMMV: an Elon Musk or Steve Jobs company will be more cathedral than a conglomerate like Unilever, or a Google or Microsoft.
Julia Morgan, Winchester's contemporary, became the first woman to obtain an architecture license in California in 1904 and had a very prolific career throughout the state, including her most famous work, Hearst Castle, commissioned in 1919.
> Which is why maintainers feel like they’re drowning.
How about actually funding open-source project maintainers? We have non-profit orgs that eat billions of public funds. We spend billions on influencing hardly measurable metrics, with very nebulous benefits in the far-distant future.

Directly sponsoring critical projects would have far better and more concrete benefits.
The problem is the cost is so wildly asymmetric. When everyone with a computer and a subscription can vibe code low quality features, when everyone can submit dubious security bug reports, no amount of funding will even that out. Producing submissions is essentially free while triaging and reviewing remains very expensive.
3 years ago the cost was asymmetric in the other direction. The cost of writing code was high. The cost of finding security bugs was extremely high. The cost of triaging and reviewing was basically the same as it is today.
Large corporations that are well funded are facing the exact same issues internally right now. With agent output so cheap, how do you deal with the deluge? It’s not practical or desirable to have your best engineers doing nothing but reviewing generated code, some of which is likely very low value.
This plus accountability is the way; and what I think I mean here is "accountability for those who choose to USE (maybe not create) the software in a way that may be harmful."
If you'd like to push that accountability to the developers, that can work, but they should be paid or otherwise compensated accordingly for the risk they take on.
>"Sarah didn’t build her mansion to house ghosts, she built her mansion because she liked architecture."
That quote from the article directly contradicts what multiple tour guides at the Winchester Mystery House in California have told me over many decades. Specifically: Sarah Winchester built the house because she was told that the ghosts of all those killed by Winchester guns would haunt her unless her house was sufficiently labyrinthine and endlessly expanding, to confuse them.
Visit the house (the tour is rad) and see the architecture for yourself. There is no reasonable explanation for the internal doors leading to sheer drops throughout the house, and other bizarre 'traps', apart from Sarah legitimately believing she had to confuse the ghosts.

This is more akin to a programmer consciously obfuscating and expanding a codebase to make it impossible for their angry users to ever finish auditing it, or to determine its author.
> That quote from the article directly contradicts what multiple tour guides at the Winchester Mystery House in California have told me over many decades.
The house is run by an organization that has a very vested interest in playing up the supernatural element of the house. Some tour guides have gone on record discussing their frustrations with having to repeat known falsehoods to guests.
> Visit the house (the tour is rad) and see the architecture for yourself. There is no reasonable explanation for the internal doors leading to sheer drops throughout the house, and other bizarre 'traps', apart from Sarah legitimately believing she had to confuse the ghosts.
Parts of the house were damaged by the 1906 earthquake and were not rebuilt. A lot of the weird path-to-nowhere stuff is "the destination collapsed during the earthquake", nothing particularly mysterious there.
Does anyone know what “agent tea” is in the second graph? There is a paper about a protocol but it seems a bit obscure to be featured in this context and the other two points on the graph are models.
I'm pretty sure that I could consistently spew 1000 lines a day/per commit if it was mostly cut-n-pasting of existing code, that I had complete access to, with some minor variations.
Yea, I was curious about that, too. It’s one thing to vibe code a one-off personal project. It’s another to create something that can run the distance.
The cathedral and bazaar simply isn't the magic this article treats it as. And ESR, a human molerat who publicly premeditates murder on his blog, certainly isn't either.
https://skepticalinquirer.org/2024/08/the-truth-about-sallie...
Winchester Mystery Potemkin Village.