The Zig project's rationale for their firm anti-AI contribution policy

(simonwillison.net)

116 points | by lumpa 4 hours ago

11 comments

hitekker 1 hour ago
Apparently, the noise around the AI policy came from Bun's developers saying that policy blocks upstreaming their performance PR. But the real reason seems to be that PR's code itself isn't in great shape, and introduces unhealthy complexity https://ziggit.dev/t/bun-s-zig-fork-got-4x-faster-compilatio...
> Parallel semantic analysis has been an explicitly planned feature of the Zig compiler for a long time, and it has heavily influenced the design of the self-hosted Zig compiler. However, implementing this feature correctly has implications not only for the compiler implementation, but for the Zig language itself! Therefore, to implement this feature without an avalanche of bugs and inconsistencies, we need to make language changes.
- bonzini 1 hour ago
  A single PR for a 3000-line addition would, in all likelihood, be rejected anyway.
  - jeffmess 31 minutes ago
    Doubt it: https://github.com/ziglang/zig/pull/24536
- daishi55 21 minutes ago
  What’s the point in debating the PR quality? The policy explicitly forbids all LLM code, so that policy is of course the “real reason”.
  - lelanthran 7 minutes ago
    > What’s the point in debating the PR quality?
    Because the pro-group are whining that the policy is preventing the merge, when in actual fact even if the policy did not exist, the PR is crap anyway.
jillesvangurp 19 minutes ago
It's a good rationale. But it points the finger at a real bottleneck in open source development: the burden of manually reviewing contributions. And the need to automate that with AI as well. Reviews were already becoming a problem before AI. Lots of projects have been dealing with a large influx of contributions from inexperienced developers from all over the world looking to boost their CVs by increasing their GitHub statistics. It's the same dynamic that destroyed Stack Overflow. Which, thanks to AI, has been largely sidelined now. And now that AI is there, those same inexperienced developers are using that at scale to generate even more garbage contributions.
Doing manual reviews of everything is very labor intensive and not scalable. However, AIs are pretty good at doing code reviews and verifying adherence to guard rails, contributor guidelines, and other rules. It's not perfect, but it's an underused tool. Both by reviewers and contributors. If your contribution obviously doesn't comply with the guidelines, it should be rejected automatically. The word "obviously" here translates into "easy to detect with some AI system".
Projects should be using a lot of scrutiny for contributions by new contributors. And most of that scrutiny should be automated. They should reserve their attention for things that make it past automated checks for contribution quality, contributor reputability, adherence to whatever rules are in place, etc. Reputability is a good way to ensure that contributions from reputable sources get priority. If your reputation is not great, you should expect more scrutiny and a lower priority.
- emj 3 minutes ago
  > [you can] stop accepting imperfect PRs in order to maximize ROI from your work, but that’s not what we do in the Zig project
  The real bottleneck when you want to grow is connecting with the right people. An LLM is not helping with that if you want to build a community. When you use an LLM to skip the need to understand a problem, how are you ever going to get a reputation that I can trust?
  The post is not about reputation; it's about seeing how people respond and work with you in a community.
  EDIT: I see that you frame it as a help and a tool and sure it might work, but I feel like it is just another obstacle.
jart 2 hours ago
> This makes a lot of sense to me. It relates to an idea I've seen circulating elsewhere: if a PR was mostly written by an LLM, why should a project maintainer spend time reviewing and discussing that PR as opposed to firing up their own LLM to solve the same problem?
The same argument applies to open source itself. Why use someone's project when you can just have the robot write your own? It's especially true if the open source project was vibe coded. AI and technology in general makes personalization cheap and affordable. Whereas earlier you had to use something that was mass produced to be satisfactory for everyone, now you have the hope of getting something that's outstanding for just you. It also stimulates the labor economy, because you have lots of people everywhere reinventing open source projects with their LLMs.
- simonw 1 hour ago
  > Why use someone's project when you can just have the robot write your own?
  I've been thinking about this a bunch recently, and I've realized that the thing I value most in software now isn't robust tests or thorough documentation - an LLM can spit those out in a few minutes. It's usage. I want to use software which other people have used before me. I want them to have encountered the bugs and sharp edges and sanded them down.
  - earleybird 3 minutes ago
    Depth of use over the lifetime of an app is a quality all its own that is often not appreciated. A recurring pattern at $dayjob is that a new manager or director will join a business unit and declare an existing app as the worst terrible, no good, horrible app they've seen and they're going to fix that. A year and a half later the new app is finally delivered with 80% of the original functionality and a fresh set of bugs. The new dev team sees the surface functionality but misses a lot of the hard-earned nuance the old system accrued over time. This is a pattern that existed long before LLMs.
  - anp 25 minutes ago
    I feel similarly but IIUC I think that doesn’t strictly require an open source development model. I’ve benefited a huge amount from consuming and contributing to open source projects and I’m a bit worried that the “unit economics” changing might break some of the social dynamics upon which the ecosystem is built.
  - tovej 5 minutes ago
    An LLM most definitely cannot spit out robust tests or thorough documentation. It can spit out some tests or some documentation, but they will not cover the user perspective or edge cases unless those are already documented somewhere. That's verified by both experience and just thinking about it for two seconds.
    The sanding down you refer to is what generates those tests and documentation.
  - porridgeraisin 56 minutes ago
    Yep. I realised the same. No one reads docs, or goes through tests. Either way, it's easy to write useless tests. And easy to write useless docs. I don't think most even read the code. Now the difference is that it has become possible to write useless code.
    So it's just the fact that others have already gone through the motions before I did. That's it really. I suppose in commercial settings, this is even more true and perhaps extends to compliance.
  - jart 59 minutes ago
    I value software that reveals knowledge. The frontier LLMs were trained on all the code that institutions had been keeping to themselves. So they're revealing programming know-how on a scale that just wasn't possible with open source. LLMs are the ultimate Prometheus. Information is more accessible and useful now than it's ever been.
    - Antibabelic 46 minutes ago
      I promise you, "the code that institutions had been keeping to themselves" is not nearly as special or good as you are implying here.
- jillesvangurp 10 minutes ago
  I've been seeing a drop in PRs against my repositories. I have a couple of repositories with around a hundred stars. Nothing spectacular but they were getting occasional PRs until last year. This year I've had almost none so far. My theory is that LLMs prefer sticking to mainstream projects. And since lots of developers are now leaning heavily on LLMs, they are biased to ignoring most of what I provide.
  And you indeed get a lot of wheel reinvention by LLMs because that is now cheap to do. So rather than using some obscure thing on GitHub (like my stuff), it's easier to just generate what you need. I've noticed this with my own choices in dependencies as well. I tend to just go with what the LLM suggests unless I have a very good reason not to.
- gausswho 1 hour ago
  That only holds true for the smallest tier of open source projects. Past a certain point of complexity, it's unlikely you can expect the robot to read your mind well enough to provide something of high quality and 'outstanding for just you'.
  The Zig project is certainly far beyond such capability.
  - 8n4vidtmkvmk 1 hour ago
    I'm finding this out the hard way. I set out to build a 1 page app. I thought it would take a day. It's 98% vibe coded at this point. Even with AI implementing everything, it's taken several weekends and many evenings. And not because AI is doing a bad job; it's just that as I see it come together, I have more and more feature requests. I've got a couple dozen left but I can't just let the AI chew through them all at once. I'm effectively QA now. Have to make sure everything is just right.
- skeledrew 1 hour ago
  LLM access is not yet universally available. There are those who can't exactly afford it. And there are also those with access but there are occasional or perennial issues, like Claude outages and general degraded performance over time. For example, a couple of months ago when I just started using Claude, I was easily making good progress on multiple projects within a week. Nowadays I'm hardly getting through much of anything as most of the time Claude is just showing spinners, and it also feels like the code quality has taken a nosedive.
- bee_rider 1 hour ago
  Most people don’t have the ability to read code well enough to determine if an LLM output is good or not. And most people don’t have subscriptions to models that can develop non-trivial programs…
  Maybe this will be a real problem in a couple years though.
  - dawnerd 39 minutes ago
    Code aside, most people don't even know how to describe what they actually want it to do, and LLMs are still a loooong way away from mind reading. I've seen developers struggle to even write down what they want. Simple demos like they love to show off with snake-like games are fun and all, but they're nothing like the complex open-source apps everyone seems to think we'll just generate with a simple prompt.
- LeCompteSftware 39 minutes ago
  >> Whereas earlier you had to use something that was mass produced to be satisfactory for everyone
  As someone who recently started using OpenSCAD for a project I find this attitude quite irritating. You certainly did not "have to" use popular tools.
  The OpenSCAD example is particularly illuminating because it's fussy and frustrating and clearly tuned towards a few specific maintainers; there's a ton of things I'd like changed. But I would never trust an LLM to do it! "Oh the output looks fine, cool" is not enough for a CAD program. "Oh, there are a lot of tests, cool" great, I have no idea what a thorough CAD test suite looks like. I would be a reckless idiot if I asked Claude to make me a custom SCAD program... unless I put in a counterproductive amount of work. So I'm fine with OpenSCAD.
  I am also sincerely baffled as to how this stimulates the "labor economy." The most obvious objection is that Anthropic seems to be the only party here getting any form of economic benefit: the open-source maintainers are just plain screwed unless they compromise quality for productivity, and the LLM users are trading high-quality tooling built by people who understand the problem for shitty tooling built by a robot, in exchange for uncompensated labor. It only stimulates the "labor economy" in a Bizarro Keynesian sense, digging up glass bottles that someone forgot to put the money in.
  I have seen at least 4 completely busted vibe-coded Rust SQLite clones in the last three months, happily used by people who think they don't need to worry their pretty little heads with routine matters like database design. It's a solved problem and Claude is on the case! In fact unlike those stooopid human SQLite developers, Claude made it multithreaded! So fucking depressing.
felipeerias 1 hour ago
The other side of this is that open source projects that allow AI tools will be more restrictive towards new contributors.
This already happens to some degree on large software projects with corporate backing (Web engines, compilers, etc.), where it is often not trivial to start contributing as an independent individual.
Reasonable people can disagree on whether one approach is inherently better than the other, as ultimately they seem to be optimising for different goals.
- nicman23 1 hour ago
  yeah, giving an LLM git blame and git grep has saved me a lot of time doing boring basically re.
buggymcbugfix 1 hour ago
One reason I love writing production code in Ur/Web is that LLMs are incapable of synthesising something even remotely resembling it. Keeps me on my toes.
I think this is a great policy by the Zig team.
- wk_end 31 minutes ago
  Ur/Web! That's something I haven't heard about in ages. Is it still in active development? In what circumstances are you using it? For fun, at your own startup, or is there some secret big commercial user of it...?
jwzxgo 3 hours ago
I talked to developers of https://deerflow.tech/ and they pretty much had the same plan, unless it's coming from a known and trusted developer.
- mapontosevenths 1 hour ago
  > unless it's coming from a known and trusted developer.
  That's exactly the sketchy part here. They turned down known, working and tested, code that came from a partner (bun) due to this policy. Code that 4x'd compile speed.
  A general ban makes sense based on their rationalization ("contributor poker"[0]). A total and inflexible ban can lead to a worse outcome for everyone though.
  If a senior, experienced, contributor vouches for the code it shouldn't matter if they hand crafted it on stone tablets, generated it with yarrow sticks, or used gpt-3.
  [0] https://kristoff.it/blog/contributor-poker-and-ai/
  - lelanthran 11 minutes ago
    > That's exactly the sketchy part here. They turned down known, working and tested, code that came from a partner (bun) due to this policy. Code that 4x'd compile speed.
    No; they turned it down because the vibe-coded PR was crap.
    > The rewritten type resolution semantics were designed to avoid these issues, but Bun’s Zig fork does not incorporate the changes (and has not otherwise solved the design problems), which means their parallelized semantic analysis implementation will exhibit non-deterministic behavior. That’s pretty much a non-starter for most serious developers: you don’t want your compilation to randomly fail with a nonsense error 30% of the time.
  - lmm 1 hour ago
    > If a senior, experienced, contributor vouches for the code it shouldn't matter if they hand crafted it on stone tablets, generated it with yarrow sticks, or used gpt-3.
    The flip side of that is that if such a contributor vouches for code that turns out to be poor-quality, this should severely damage their reputation. I've found far too many "senior" developers will give AI a pass on poor coding practices.
  - JoshTriplett 1 hour ago
    https://news.ycombinator.com/item?id=47958209
    - superb_dev 1 hour ago
      A standout paragraph from that thread:
      > Put more simply, we are going to make these enhancements, but hacking them in for a flashy headline isn’t a good outcome for our users. Instead we’re approaching the problem with the care it deserves, so that when we ultimately ship it, we don’t cause regressions.
      These exact changes are already on the roadmap and Bun’s PR is rushing ahead.
    - mapontosevenths 1 hour ago
      Thanks. That explains away most of my concern.
  - feverzsj 1 hour ago
    Quite the contrary, Bun's developers don't even understand the language spec. Their slop didn't use the same type resolution semantics as Zig, which makes their implementation exhibit non-deterministic behavior.
slopinthebag 9 minutes ago
Go zig! I don't use the language but I totally respect where they're coming from and their mission and ethics.
For those who are pissed because a large OSS project isn't accepting LLM generated slop: Fuck off!
slopinthebag 10 minutes ago
Very convenient of Mr. Willison to omit the fact that Bun's upstream changes are total garbage and would not be upstreamed regardless of any policies, omitting LLM generated code or not, since they are, as a zig core team member articulated in a classier way, shite.
feverzsj 2 hours ago
No human should trust any bullshit made by a bullshit machine.