22 comments

  • gu5 1 day ago
    Lix is also a soft fork of the official Nix package manager implementation: https://lix.systems/
    • yjftsjthsd-h 1 day ago
      I really assumed that this was that; even calling it a universal version control system for binary files would be kind of a weird way of describing it but is plausibly a valid description for the package manager.
    • Rexxar 19 hours ago
      Lix is also the name of a computer science and mathematics laboratory: https://www.lix.polytechnique.fr/
  • uasi 1 day ago
    Git can display diffs between binary files using custom diff drivers:

    > Put the following line in your .gitattributes file: *.docx diff=word

    > This tells Git that any file that matches this pattern (.docx) should use the “word” filter when you try to view a diff that contains changes. What is the “word” filter? You have to set it up [in .gitconfig].

    https://git-scm.com/book/en/v2/Customizing-Git-Git-Attribute...
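
    A minimal sketch of what such a "word" converter could look like in Python, assuming the .docx keeps its body in word/document.xml (the standard location) and that only paragraph text matters. The function names are illustrative, not part of Git or any library.

```python
import io
import sys
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data):
    """Return the paragraphs of a .docx file as plain text, one per line."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    lines = []
    for paragraph in root.iter(W_NS + "p"):
        # A paragraph's visible text is spread across several <w:t> runs.
        lines.append("".join(t.text or "" for t in paragraph.iter(W_NS + "t")))
    return "\n".join(lines)


def main(argv):
    """Entry point for use as a textconv program: argv[1] is the file path."""
    with open(argv[1], "rb") as handle:
        sys.stdout.write(docx_to_text(handle.read()) + "\n")
    return 0
```

    Git's textconv mechanism (the diff.word.textconv setting) runs the configured program with the file path as its only argument and diffs whatever it prints, so a wrapper that calls main(sys.argv) would be enough to wire this up.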

    • danudey 19 hours ago
      In their 'Git is unsuited for applications' blog post[0] they also say the following:

      > We currently have to clone the whole repository just to edit translation files. That is problematic for big repositories. The repository for posthog.com for example is ~680MB in size. Even though we only need translation files which would be at max 1MB in size, we have to clone the whole repository. That is also one of the reasons why git is not used at Facebook, Google & Co which have repository sizes in the gigabytes.

      I get that it can be a bit complex, but Git can handle this circumstance pretty easily if you know how (or write a script for it).

      For example, cloning the GIMP repo from GitLab takes me about 56 seconds and uses up 632 MB on disk, using just `git clone <repo>`.

      In comparison, running these commands:

          git clone --quiet --filter=blob:none --sparse https://gitlab.gnome.org/GNOME/gimp.git gimp-sparse-clone
          git -C gimp-sparse-clone sparse-checkout add po po-libgimp po-plug-ins po-python po-script-fu po-tags po-tips po-windows-installer
      
      (You can also run `git sparse-checkout init --no-cone` and then just `git sparse-checkout add *.po` to grab every .po file in the repo and nothing else)

      Takes 14 seconds on my laptop and uses 59 MB of disk space, and checks out only the specified directories and their contents.

      So yeah, it's not as automatic as one might like, but ship a shell script to your translators and you're good to go. The 'Git can't do X' arguments are mostly untrue; it should really be 'Getting git to do X is more complicated than I would prefer' or 'Explaining how to do X in git is a pain', both of which are legitimate complaints.

      [0] https://samuelstroschein.com/blog/git-limitations/
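
      The script for translators could be as small as the following Python sketch. It only wraps the two commands shown above; the function names and the idea of passing the directory list as a parameter are assumptions for illustration.

```python
import subprocess


def sparse_clone_commands(repo_url, dest, directories):
    """Build the two git invocations from the comment above as argv lists."""
    clone = ["git", "clone", "--quiet", "--filter=blob:none", "--sparse", repo_url, dest]
    add = ["git", "-C", dest, "sparse-checkout", "add"] + list(directories)
    return [clone, add]


def sparse_clone(repo_url, dest, directories):
    """Run the commands in order; check=True stops at the first failure."""
    for argv in sparse_clone_commands(repo_url, dest, directories):
        subprocess.run(argv, check=True)
```

      Keeping the command construction separate from the execution makes it easy to print the commands for a dry run before anything touches the network.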

    • theknarf 1 day ago
      Would be interesting to see some tooling built around being a custom diff driver for a bunch of different standard formats!
      • WorldMaker 19 hours ago
        I had some interesting luck with the generic approach of unzipping the DOCX/XLSX/ODT/etc. and then recursively applying other filters, such as XML and JSON formatters/prettifiers, to the contents.

        (My work [1] in this space predated git so it wasn't written as a git diff filter, instead it automated source control. But the same principles could be used in the other direction.)

        Not the highest level diffs you could possibly get, but at least for a programmer even ugly XML and JSON diffs were still nice to have over binary diffs.

        [1] https://github.com/WorldMaker/musdex
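
        A rough Python sketch of that idea, using only the standard library. It handles a single level of unzipping rather than full recursion, and the function name and the choice of file extensions are assumptions made for the example, not taken from musdex.

```python
import io
import json
import zipfile
import xml.dom.minidom


def explode_archive(data):
    """Map each member of a zip-based document to diff-friendly text."""
    result = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in sorted(archive.namelist()):
            raw = archive.read(name)
            if name.endswith((".xml", ".rels")):
                # Pretty-print so a one-element change becomes a one-line diff
                result[name] = xml.dom.minidom.parseString(raw).toprettyxml(indent="  ")
            elif name.endswith(".json"):
                result[name] = json.dumps(json.loads(raw), indent=2, sort_keys=True)
            else:
                # Unknown member: record only its size to keep the diff readable
                result[name] = "<binary, %d bytes>" % len(raw)
    return result
```

        Writing each value to its own text file gives a directory that any line-based diff tool can compare.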

      • theknarf 1 day ago
        I found this in my GitHub stars: https://github.com/xltrail/git-xl?tab=readme-ov-file

        And then there is also Pandoc that I guess could be helpful in this regard.

    • nine_k 20 hours ago
      This is great for showing diffs. To actually make git store only deltas, not entire binaries, you would need to configure "clean" and "smudge" filters for the format. Given that docx (and xlsx) are a bunch of XML files compressed by zip, you can actually have clean diffs, and small commits.
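
      A minimal sketch of the "clean" side in Python, under the assumption that the goal is to hand Git an uncompressed zip so that its own compression and delta encoding see the XML instead of deflated bytes. The function name is hypothetical, and whether every Office version accepts a stored (uncompressed) archive is not verified here.

```python
import io
import zipfile


def store_uncompressed(data):
    """Re-pack a zip-based document with every member stored uncompressed."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", compression=zipfile.ZIP_STORED) as dst:
        for info in src.infolist():
            # A fixed timestamp keeps the output stable across re-saves
            fresh = zipfile.ZipInfo(info.filename, date_time=(1980, 1, 1, 0, 0, 0))
            fresh.compress_type = zipfile.ZIP_STORED
            dst.writestr(fresh, src.read(info.filename))
    return out.getvalue()
```

      A clean filter reads the working-tree file on stdin and writes the version to store on stdout, so this function would be wrapped in a small stdin-to-stdout script and registered under filter.<name>.clean. Member order is preserved because some zip-based formats care about it.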
    • packetlost 22 hours ago
      Yeah, this is how I would prefer to solve this problem personally, but it would be really nice to have some collection of tools that cover common binary file formats automatically instead of having to configure this manually every time.
    • cat5e 21 hours ago
      This is really great. I read the Git config article, but I thought the image diff example was kinda lackluster. I'm sure some better metrics could be extracted for a more descriptive diff.

      Thanks for sharing!

  • samuelstros 1 day ago
    Holy moly. I just went to bed. Checking my phone for the last time. Opening hackernews for "one last scroll" and I see lix, my project, popping up here.

    Going through the questions now. So much for going to bed.

    • samuelstros 1 day ago
      Learnings from the comments so far: I need to refine the positioning of lix.

      Lix is not a replacement for git. Nor does it target version controlling code as the primary use case.

      A better positioning might be "version control system as a library". The primary use case is embedding lix into applications, AI agents, etc. that need version control.

      I need to go to bed now. I have a flight to catch in 6 hours.

      PS I am open to suggestions regarding the positioning!

  • micw 1 day ago
    I wonder how much room this leaves for unintended changes that are not shown. E.g. Excel is a complex format that allows all sorts of metadata and embeddings that would not always show up as cell changes ...
    • samuelstros 1 day ago
      Depends on the diff you render and what the plugin tracks.

      In general, lix gives an API to track changes in any file format (via plugins). The "diff noise" thus depends on a) the plugin, i.e. does it track the metadata? and b) what is rendered as the diff.

      If the user doesn't care about seeing a diff of metadata in Excel, don't render the metadata in the diff. The latter is trivial because diffing in lix is just a SQL query.

  • mrgoldenbrown 21 hours ago
    Home page states Lix can diff "any file format like .xlsx, .pdf, .docx"

    Wow, sounds useful. Git doesn't do that out of the box.

    BUT... the list of available "plugins" only has .csv, .md and .json, which are things that git already handles just fine?

    Can it actually diff excel and word and PDF or not?

    • samuelstros 21 hours ago
      It can, but the plugins are not production-ready yet. I should clarify that.

      The way to write a plugin:

      Take an off the shelf parser for pdf, docx, etc. and write a lix plugin. The moment a plugin parses a binary file into structured data, lix can handle the version control stuff.

  • thephotonsphere 1 day ago
    name confusing it be

    https://lix.systems/

    • rlonstein 23 hours ago
      Name collision. I thought it might be the "Lix" fork of "Nix".
  • ezoe 1 day ago
    It seems to me that this is just an issue of diff features. Git can be extended to show semantic diffs of binary files, and it doesn't technically need a completely new VCS.

    As git is the most popular VCS right now and will remain so for the foreseeable future, I don't think incompatibility with git is a good design choice.

    • samuelstros 22 hours ago
      Indeed, if lix were to target code version controlling, incompatibility with git is a “dead on arrival” situation.

      But Lix's use case is not version controlling code.

      It’s embedding version control in applications. Hence the reason why lix runs within SQL databases. Apps have databases. Lix runs on top of them.

      The benefit for the developer is a version control system within their database, and exposing version control to users.

  • notachatbot123 1 day ago
    I look at the page and leave without any clue as to what it actually does. Agents and AI are mentioned so I assume it might just be incoherent slop?

    The person behind this boasts on Twitter that they fired all their remote developers and used AI instead.

    Judging by tweets, this project is 2-3 years in the making.

    > Lix is a universal version control system that can diff any file format (.xlsx, .pdf, .docx, etc).

    > Unlike Git's line-based diffs, Lix understands file structure. Lix sees price: 10 → 12 or cell B4: pending → shipped, not "line 4 changed" or "binary files differ".

    How? I have a custom binary file format, how would Lix be able to interpret this?

    > Lix adds a version control system on top of SQL databases that let's you query virtual tables like file, file_history, etc. via plain SQL. These table's are version controlled.

    What does SQL have to do with everything?

    • samuelstros 22 hours ago
      Thanks for the feedback.

      AI agents are what is currently driving the need for version control outside of software engineering.

      The mistake in the blog post is triggering comparisons to git, which leads to “why is this better/different than git?”.

      If you have a custom binary file, you can write a plugin for it! :)

      Lix runs on top of a SQL database because we initially built lix on top of git but needed:

      - database semantics (transactions, ACID, etc.)

      - SQL to express history queries (diffing arbitrary file formats can't be solved with a simple diff() API)

  • mackross 19 hours ago
    Same name as my Phoenix inspired framework for go: https://codeberg.org/lixgo/lix
  • danmeier 1 day ago
    Great semantic diffs, but does Lix actually define a merge algebra for concurrent structured edits, or are conflicts just punted back to humans? How does its SQL engine guarantee deterministic merges vs last-write-wins?
    • samuelstros 22 hours ago
      The merge algebra is similar to git's three-way merge. Given that lix tracks individual changes, the three-way merge is more fine-grained.

      In case of a conflict, you can either decide to do last write wins or surface the conflict to the user e.g. "Do you want to keep version A or version B?"

      The SQL engine is unrelated to merging. Lix uses SQL as storage and query engine, but not for merges.
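
      To make the fine-grained three-way merge concrete, here is a Python sketch over per-entity values (for example spreadsheet cells). It illustrates the general technique only; the function name, the `prefer` parameter, and the dict-based data model are assumptions and not lix's actual API.

```python
def three_way_merge(base, ours, theirs, prefer=None):
    """Merge two entity maps against their common ancestor.

    Returns (merged, conflicts). `prefer` may be "ours" or "theirs" to settle
    conflicts automatically; with None they are reported to the caller.
    """
    merged, conflicts = {}, {}
    missing = object()  # sentinel for "entity absent in this version"
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = (side.get(key, missing) for side in (base, ours, theirs))
        if o == t:
            value = o  # both sides agree, or both deleted the entity
        elif o == b:
            value = t  # only theirs changed
        elif t == b:
            value = o  # only ours changed
        elif prefer in ("ours", "theirs"):
            value = o if prefer == "ours" else t
        else:
            conflicts[key] = (None if o is missing else o, None if t is missing else t)
            value = b  # keep the ancestor value until the user decides
        if value is not missing:
            merged[key] = value
    return merged, conflicts
```

      Passing prefer="theirs" approximates last write wins if "theirs" is the later write; leaving it as None returns the conflicting pairs so the application can ask "version A or version B?".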

  • KingMob 1 day ago
    Hi, before you get too wedded to the name, you should be aware that there's already a major nix project called lix: https://lix.systems/.

    Before clicking, I assumed this was actually a new feature of theirs that would apply nix build principles of some sort to version control of binaries.

  • yoyohello13 1 day ago
    Looks cool, but seems kind of weird that it only works through an sdk. Should there be a cli or something?

    Edit: Oh I see. Seems like their use case is embedding version control into another application.

    • samuelstros 1 day ago
      Correct. Lix has been developed with the embedded use-case in mind.

      Someone can write a CLI for it. Though, the primary use case is not code version control but embedding into applications.

  • internet_points 1 day ago
    They should change the name while they still can https://lix.systems/
  • orthoxerox 1 day ago
    It's nice, but it needs to support the most common file formats used in gamedev to gain enough traction.
  • AmbroseBierce 1 day ago
    Git is a command line program so it feels strange that this doesn't seem to support that use case.
    • samuelstros 1 day ago
      Hi,

      I'm the creator of lix.

      Lix doesn't target code version control. It can be used for it. But the primary use case is embedding version control in applications. Such an application can be an AI agent that modifies files, which entails the need to show what the agent did in that file, i.e. tracking the changes.

      Git is good enough for code. I don't think there is space to gain much market share.

      • Terretta 1 day ago
        Some feedback about the primary use case.

        Your Lix doc (LLM written but with typos?) is sort of weird, handwaving how Lix does version control over, say, Excel, to say it's about working with SQL databases:

        How does Lix work?

        Lix adds a version control system on top of SQL databases that let's you query virtual tables like file, file_history, etc. via plain SQL. These table's are version controlled.

        Then it gets weirder:

        Why this matters:

        Lix doesn't reinvent databases — durability, ACID, and corruption recovery are handled by battle-tested SQL databases.

        This seems like a left turn from the value prop and why the value prop matters?

        A firm-wide audit trail of changes to typically opaque file types (M365 files in particular) could be tremendously valuable -- and additive -- compared to the versioning that's baked into the file bundles. The version control is already embedded by the app, what adds value is reporting on or managing that from outside the app.

        As for how it works, both in the docs and in the comment I'm replying to, it's unclear how any of this interacts with the native version control embedded in M365 apps or why this tool can be trusted as effective at tracking M365 content changes.

        • samuelstros 21 hours ago
          Does the following make more sense to you in respect to SQL?

          Lix uses SQL databases as storage and query engine. In other words, you get a version-controlled filesystem on top of your SQL database.

          Or, the analogy to git: Git uses the computer's filesystem as its storage layer. Lix uses a SQL database (with the nice benefit of making everything queryable via SQL).

          > Lix doesn't reinvent databases — durability, ACID, and corruption recovery are handled by battle-tested SQL databases.

          >> This seems like a left turn from the value prop and why the value prop matters?

          Better wording might be "Lix uses a SQL database as storage layer"?

          The SQL part is crucial for two reasons. First, the hard parts like data loss guarantees, transactions, etc. are taken care of by the database, so we don't have to build custom stuff. Second, that reduces the risk for adopters that data loss can occur with lix.

          > As for how it works, both in the docs and in the comment I'm replying to, it's unclear how any of this interacts with the native version control embedded in M365 apps or why this tool can be trusted as effective at tracking M365 content changes.

          It doesn't interact with version control in M365.

          I'll update the positioning. Lix is a library to embed version control in whatever developers are building. Right now, lix is mostly interesting for startups that build AI-first solutions. They run into the problem "how do customers verify the changes AI agents make?".

          The angle of universal version control, and using docx or Excel as an example, triggers the wrong comparisons. By no means is Lix competing with Sharepoint or existing version control solutions for MS Office.

    • hekkle 1 day ago
      Based on the product description, it seems that they don't like text, and want to deal in objects. It would feel strange if they did support a terminal, rather than a GUI.
    • lombasihir 1 day ago
      because it's a stupid content tracker. see man git.
  • solidsnack9000 1 day ago
    It was initially hard for me to understand how this could work but it looks like there is a plugin system?
    • samuelstros 1 day ago
      Yes. The tracking works via plugins to keep it generic. Here is a rough illustration:

      File change -> Plugin (detects changes) -> Lix

      It works surprisingly well because most standard file formats have off-the-shelf parsers. Parse a file format, and voilà, it is trivial to diff. Then pass on a standard schema for changes to lix and you end up with a generic API to query changes.
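
      As an illustration of the "parse, then diff" step, here is a Python sketch that uses the standard library's CSV parser as the off-the-shelf parser. The function names, the `key_column` parameter, and the shape of the change records are hypothetical and do not reflect lix's real plugin interface.

```python
import csv
import io


def parse_rows(text, key_column):
    """Parse CSV text into {row key: row dict} with the stdlib parser."""
    return {row[key_column]: row for row in csv.DictReader(io.StringIO(text))}


def detect_changes(before, after, key_column="id"):
    """Emit one change record per (row, column) cell that differs."""
    old, new = parse_rows(before, key_column), parse_rows(after, key_column)
    changes = []
    for key in sorted(set(old) | set(new)):
        old_row, new_row = old.get(key, {}), new.get(key, {})
        for column in sorted(set(old_row) | set(new_row)):
            prev, curr = old_row.get(column), new_row.get(column)
            if prev != curr:
                # None on either side means the cell was added or removed
                changes.append({"entity_id": "%s/%s" % (key, column),
                                "before": prev, "after": curr})
    return changes
```

      Once a format is reduced to keyed entities like this, the comparison itself is a few lines; the format-specific work lives entirely in the parser.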

  • mog_dev 1 day ago
    I wonder if this could be used in conjunction with git for UT5 projects
  • anttiharju 1 day ago
    for office files one can also unzip and zip to store them in git as plaintext
    • brnt 1 day ago
      It's a pity Word doesn't open its own OOXML export. At least LibreOffice has .fodt.
      • volemo 1 day ago
        > It's a pity Word doesn't open its own OOXML export

        They can’t. It’s the only thing keeping them relevant.

  • forrestthewoods 1 day ago
    Weird sales pitch. I think Git is super mediocre and a VCS that supports binary files would be awesome.

    But then the first thing it talks about is diffing files. Which honestly shouldn’t even be a feature of VCS. That’s just a separate layer.

    • samuelstros 1 day ago
      > But then the first thing it talks about is diffing files. Which honestly shouldn’t even be a feature of VCS. That’s just a separate layer.

      There is nuance between git line by line diffing and what lix does.

      For text diffing it holds true that diffing is a separate layer. Text files are small in size which allows on the fly diffing (that's what git does) by comparing two docs.

      On the fly diffing doesn't work for structured file formats like xlsx, fig, dwg etc. It's too expensive. Both in terms of materializing two files at specific commits, and then diffing these two files.

      What lix does under the hood is tracking individual changes, _which allows rendering a diff without on the fly diffing_. So lix is kind of responsible for the diffs but only in the sense that it provides a SQL API to query changes between two states. How the diff is rendered is up to the application.

      • forrestthewoods 21 hours ago
        > On the fly diffing doesn't work for structured file formats like xlsx, fig, dwg etc. It's too expensive. Both in terms of materializing two files at specific commits, and then diffing these two files.

        I don’t think that’s actually true?

        How often are binary files being diffed? How long does it take to materialize? How long to run a diff algorithm?

        I’ve worked with some tools that can diff images. Works great. Not a problem in need of solving.

        In any case I’ll give benefit of the doubt that this project solves some real problem in a useful way. I’m not sure what it is.

        My goals in a VCS for binary files seem to be very very very different than yours.

        • samuelstros 14 hours ago
          I think our goals indeed differ.

          > How often are binary files being diffed? How long does it take to materialize? How long to run a diff algorithm?

          If version control is embedded in an app, constantly.

          Imagine a cell in a spreadsheet. An application wants to display a "blame" for a cell C43 i.e. how did the cell change over time?

          The lix way is this SQL query

          SELECT * FROM state_history WHERE file_id = '<the_spreadsheet>' AND schema_key = 'excel_cell' AND entity_id = 'C43';

          Diffing on the fly is not possible. The information on what changed needs to be available without diffing. Otherwise, finding out how cell C43 changed means diffing the entire spreadsheet file at every commit, which takes ages.
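
          The effect of storing changes per entity can be tried out with SQLite from Python. The table below is a hypothetical stand-in: only file_id, schema_key and entity_id come from the query above, while commit_seq, value and the sample rows are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per recorded change to one entity.
conn.execute(
    "CREATE TABLE state_history ("
    "commit_seq INTEGER, file_id TEXT, schema_key TEXT, entity_id TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO state_history VALUES (?, ?, ?, ?, ?)",
    [
        (1, "orders.xlsx", "excel_cell", "C43", "pending"),
        (1, "orders.xlsx", "excel_cell", "B4", "10"),
        (2, "orders.xlsx", "excel_cell", "B4", "12"),
        (3, "orders.xlsx", "excel_cell", "C43", "shipped"),
    ],
)


def cell_history(conn, file_id, entity_id):
    """Return (commit_seq, value) rows for one cell, oldest first."""
    return conn.execute(
        "SELECT commit_seq, value FROM state_history "
        "WHERE file_id = ? AND schema_key = 'excel_cell' AND entity_id = ? "
        "ORDER BY commit_seq",
        (file_id, entity_id),
    ).fetchall()
```

          Calling cell_history(conn, "orders.xlsx", "C43") reads only the rows for that cell, so the cost of the "blame" depends on how often the cell changed rather than on the size of the spreadsheet.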

    • Izkata 23 hours ago
      Through the gitattributes and gitconfig files, git can be extended to work with any external tool for specific file types. For example: https://github.com/ewanmellor/git-diff-image
    • Antibabelic 1 day ago
      Most version control systems that are not Git support binary files. In the industry you most often see Perforce P4 and Subversion being used for that purpose.
      • forrestthewoods 1 day ago
        Correct. Perforce is expensive AF and is also kinda meh. They got bought by private equity and haven’t meaningfully improved it for like 15 years. But they’ve got gamedevs by the balls who don’t have an alternative. It’s unfortunate.
  • dev_l1x_be 23 hours ago
    Great name! :)
  • bibimsz 1 day ago
    compelling problem statement. md and csv have their limits.