What ORMs have taught me: just learn SQL (2014)

(wozniak.ca)

67 points | by ciconia 3 days ago

22 comments

  • stephen 50 minutes ago
    I'm admittedly an ORM apologist [1], but a few of his points articulated as "deal breakers" aren't that bad imo:

    - "the pernicious use of foreign keys [...] links between classes are [...] foreign keys" ==> that just sounds like schema normalization, which is usually a good thing?

    - "bending over backwards [...] to generate SQL that runs efficiently" ==> the huge majority of ORM-driven queries are "select * from table where id in ..."; for the queries that are more complicated than that, then yes use SQL! That's allowed!

    Folks who dislike ORMs seem to have this false dichotomy that "the ORM _must_ be used for all queries", which is a self-imposed/unpractical restriction.

    - "dual schema dangers" ==> he's exactly right that database should own the schema definition, but then just codegen the entities from the db schema? That's your singular source of truth, no drift. You can do this with Hibernate, ActiveRecord, Joist, many ORMs.

    - "Identities" ==> ironically I think ORMs (that use the unit of work pattern) actually have net-better DX here b/c you can hook up a graph of entities with just references.

    I.e. hook up a book to its author w/o knowing their ids yet, which explicitly avoids the annoyance he mentions of doing a partial commit/going to the db to figure out "what value should I INSERT into in the book.author_id column?" (but my author is new) in the middle of your business logic that just wants to "create books".

    - transactions ==> agreed that "transactions via annotations" ala JPA/Hibernate are terrible, but afaiu all "internet scale" apps these days do reads outside of transactions, and just use op-locking during the singular flush/commit step to the db.

    Disclaimer I am sure I won't change anyone's minds

    [1] https://joist-orm.io/
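
The "hook up a book to its author without knowing their ids yet" point can be sketched roughly as follows. This is a minimal illustration using Python's standard `sqlite3` module, not Joist's or any other ORM's actual API; the `Author`, `Book`, and `flush` names and the two-table schema are hypothetical, invented for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Author:
    name: str
    id: Optional[int] = None  # unknown until the row is inserted


@dataclass
class Book:
    title: str
    author: Author  # an object reference, not an author_id
    id: Optional[int] = None


def flush(conn, books):
    """Hypothetical unit-of-work flush: insert parents first, then children."""
    with conn:  # one transaction for the whole object graph
        for book in books:
            if book.author.id is None:
                cur = conn.execute(
                    "INSERT INTO author (name) VALUES (?)", (book.author.name,)
                )
                book.author.id = cur.lastrowid
            cur = conn.execute(
                "INSERT INTO book (title, author_id) VALUES (?, ?)",
                (book.title, book.author.id),
            )
            book.id = cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
""")

# Business logic only wires up references; no ids are needed yet.
author = Author("Ann")
books = [Book("First", author), Book("Second", author)]
flush(conn, books)
```

The foreign key values are only resolved inside `flush`, which is the step a unit-of-work ORM would perform at commit time.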

    • hn_throwaway_99 19 minutes ago
      > "bending over backwards [...] to generate SQL that runs efficiently" ==> the huge majority of ORM-driven queries are "select * from table where id in ..."; for the queries that are more complicated than that, then yes use SQL! That's allowed!

      This is exactly why I hate ORMs. As I always put it "ORMs make the easy stuff slightly easier, and they make the harder stuff way harder".

      If you're just using an ORM for the "select * from table where ID in ...", then you're saving practically nothing by using an ORM - just learn to write SQL, because as you put it, you're going to have to use it anyway for places where it falls over. There are lighter weight options that do basic stuff like transaction management and binding result sets to object properties that are much less of a PITA than ORMs.

      In practice I've seen people try to use the ORM features first for places that need complicated SQL (which is a reasonable assumption), only to waste a boatload of time before concluding the ORM makes stuff harder.

    • hatefulheart 38 minutes ago
      I have seen many ORM enjoyers argue the point about “you can just use SQL!” but I have never once seen an ORM enjoyer allow it, much less do it themselves in an actual codebase. They will time and time again prefer you write 100 lines of Typescript/Python for what could be achieved with 15 lines of SQL.
      • jghn 25 minutes ago
        To make matters worse, most of the time I've successfully argued a project to just use SQL instead of an ORM, what has happened is that people over time built a home rolled ORM in the development language.

        It's like people can't just let go.

        • hparadiz 22 minutes ago
          This is inevitably what happens every single time so just use an ORM and stop being stubborn.
      • zadikian 6 minutes ago
        And then the 100 lines of JS/Py ends up being way slower than the manual SQL, plus the autogen'd SQL part of it is slow, plus you can't even get the SQL query to profile without running the actual thing with prints.
      • pjmlp 22 minutes ago
        Worse, that code will be executed on the receiving end, and waste a bunch of network traffic.
      • nfw2 26 minutes ago
        The reason given to use raw SQL is for the performance not the perceived code clarity.
        • baq 16 minutes ago
          If you never used a CTE, maybe… The reason to use SQL is to get what you need out of a database. Performance is orthogonal to that.
        • hatefulheart 23 minutes ago
          I’m not sure why you thought I meant code clarity and not performance? It’s clear in all cases the correct SQL query will be more performant.

          Confused at what you’re even trying to say here. Are you suggesting that 100 lines of application layer code is easier to understand than 15 lines of SQL?

    • bearjaws 3 minutes ago
      > the huge majority of ORM-driven queries are "select * from table where id in ..."; for the queries that are more complicated than that, then yes use SQL! That's allowed!

      The issue is, your lowest value queries are always this type, then you get the 10-20 in any code base that are 100x more complex, and they are the ones your end users care about the most.

      You end up with an 80/20 principle in the wrong way: it's great at producing queries that represent 20% of the value of your app, and awful for the 80% that define the core value of it.

    • khurs 1 minute ago
      > the huge majority of ORM-driven queries are "select * from table where id in ..."

      Well they shouldn't be, they should be limiting the columns they select ;-)

    • swasheck 46 minutes ago
      > Folks who dislike ORMs seem to have this false dichotomy that "the ORM _must_ be used for all queries", which is a self-imposed/unpractical restriction

      my experience is the exact opposite. People who love and advocate the merits of ORM insist that everything be executed through ORM because it introduces too much complexity for them to blend handwritten SQL with the ORM generated queries

      • hparadiz 38 minutes ago
        I've written/worked on several ORMs from scratch. ORMs are the industry standard. When I see posts like this I simply can't take them seriously. All they are saying is "I won't be a team player" and "I don't actually understand the subject matter". The reality is at a certain scale there's an entire orm team that optimizes everything. But even when there's no team involved there's no way you can write anything more optimized because I'm already at the computational limit of how far something can be optimized.

        There's no (good) ORM that doesn't let you simply put your own query in.

        • hatefulheart 26 minutes ago
          What optimizations are you making here when at the end of the day performance is dictated by the schema, the query planner and the network?
          • bot403 6 minutes ago
            I read it as "I've optimized the orm to be minimal overhead over raw sql a lot of the time".
          • hparadiz 3 minutes ago
            How can I possibly condense 24 years of deep knowledge in one comment for you?

            The tldr is if you're ever concatenating strings in order to build a query you're just doing what the entire job of orm is but rolling your own and chances are you'll end up with a bunch of bugs in how you handle well.... Everything.
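
One way to picture the string-concatenation concern without adopting a full ORM is plain placeholder binding. This is a minimal sketch using Python's standard `sqlite3` module; the `users` table and the `find_users_by_name` helper are hypothetical, made up for the illustration.

```python
import sqlite3


def find_users_by_name(conn, name):
    # Placeholder binding: the driver passes `name` as data,
    # so quotes inside it are never interpreted as SQL text.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (name) VALUES (?)", [("alice",), ("o'brien",)]
)

# A name containing a quote is handled without manual escaping.
rows = find_users_by_name(conn, "o'brien")

# An injection-style input is treated as an ordinary string and matches nothing.
injected = find_users_by_name(conn, "x' OR '1'='1")
```

Binding covers values only; building dynamic column lists or WHERE clauses still needs either careful hand-written code or a query builder.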

            • hatefulheart 0 minutes ago
              I think your tone is a bit combative. You can certainly provide the cliff notes but if you want me to believe you’re working at computational limits whilst talking to me about string concatenation in web dev backend languages I think the burden of proof is on you.
      • HelloNurse 38 minutes ago
        They don't consider the ORM the second class citizen it actually is: an optional simplified alternative to normal queries, that can be used for the easy cases.
      • stephen 41 minutes ago
        Fair point, both "pro ORM" and "anti ORM" camps are prone to extreme stances.

        I definitely don't agree with the "all queries must be executed through the ORM", and think that dogmatic stance has done a lot of damage to the ORM brand. :-/

    • marcosdumay 19 minutes ago
      > the huge majority of ORM-driven queries are "select * from table where id in ..."

      From my experience, you are mistaken on that. Those queries mostly come with some joins, either necessary or not to represent the object, and that often could be avoided if the data wasn't mapped into some standard object.

    • bluefirebrand 42 minutes ago
      > Folks who dislike ORMs seem to have this false dichotomy that "the ORM _must_ be used for all queries", which is a self-imposed/unpractical restriction.

      I've always heard a major selling point of ORMs is "You don't have to write the actual SQL anymore"

      Because of that, I tend to not trust people who use ORMs to even know how to write queries by hand in the first place

      • stephen 34 minutes ago
        You're right, that has been another "pro ORM" pitch that has gone awry and, taken to the extreme, is wrong imo.

        My nuanced articulation is "you don't have to write the _boilerplate_ SQL for the 90% of just-do-some-CRUD endpoints in your enterprise SaaS application, but you 100% need to 'know SQL' for the last 5-10% of ~reporting/analytics queries that the ORM is going to mess up".

        • marcosdumay 13 minutes ago
          AKA making the easy parts easier while making the difficult parts harder.
        • bluefirebrand 29 minutes ago
          Personally I find the 90% boilerplate SQL is easy enough to write that injecting an ORM into the process doesn't make much sense

          But that's just me

  • valzam 1 hour ago
    The big problem is that raw SQL has pretty bad type inference and linting support in most editors. A query builder can still give you a lot of type safety benefits.
    • rmunn 53 minutes ago
      Autocomplete is making me lazy. If I don't see what I'm about to type within two or three characters, I feel like the IDE isn't doing its job of helping me. So being able to type `db.Cust` and autocomplete Customers is really nice. I do know SQL, but yes, the language servers usually have a harder time connecting the SQL to my backend code, whatever language it's in, without quite a lot of config fiddling that pretty much obviates any time savings I would have gained from autocomplete.
      • NetOpWibby 43 minutes ago
        In my database[0] you get an SDK generated from your schema. Typescript is the default and man, the autocomplete works so well.

        I recently added support for SDK generation in Rust and Go, just do `disc codegen —rust` (double dash, my iPad is autocompleting the wrong dash) and you’re good to go.

        [0]: https://disc.sh

    • pjmlp 21 minutes ago
      Which is why one is better off using IDEs, especially those from DB vendors.
    • dadie 45 minutes ago
      I think the bigger problem is that SQL is in almost every language a second-class citizen. And even calling it second-class can be seen as a stretch.
    • ai_slop_hater 1 hour ago
      The problem is that there is no "SQL" — it's different for every database.
      • photios 25 minutes ago
        It's not that different. I'd rather have a different way to do UPSERTs or a different window function here and there [1] than figure out every ORM's join syntax or its sneaky ways to SELECT N+1 me into oblivion.

        [1] LLMs make these very easy to handle.

      • allthetime 47 minutes ago
        For the vast majority of simple use cases the common subset of all popular SQLs is exactly the same. Otherwise… just use Postgres
    • threethirtytwo 59 minutes ago
      [dead]
  • noisy_boy 1 hour ago
    As someone who started their programming journey with SQL, it just feels so odd hearing about learning SQL being presented as a useful option. I get it, it just feels odd. SQL was considered table stakes in the financial IT world - if you said you didn't know SQL, people would look at you funny.
    • le-mark 22 minutes ago
      My first job was at a financial services software company. They put everyone through multiple weeks of training on sql. That experience has been paying dividends for 25 years.
    • pjmlp 19 minutes ago
      Still applies today in data science, one is expected to master SQL alongside Python and Excel.
    • bluefirebrand 32 minutes ago
      It's very strange too. You can learn something like ~90% of useful SQL in an afternoon. The remainder is stuff that you only really need for extremely performance sensitive operations
  • revetkn 7 minutes ago
    If you use Java and like to write SQL, check out https://pyranid.com

    I stopped using ORMs around 2008 because they made the easy problems easier and the hard problems harder. I wanted to just write SQL and exploit all the power the DBMS has to offer instead of fighting with an abstraction layer, so I created Pyranid in 2015 and keep it actively updated.

  • pull_my_finger 12 minutes ago
    I wonder if the real problem isn't being able to write efficient queries, but that developers struggle to add (yet another) programming language. Just use AWK, just use SQL, just use jq, just use xyz. It's a lot of overhead. I would be OK to lose whatever fractional speed difference to be able to write my queries in a different scripting language. If I ever scaled so much that I needed to shave microseconds off my queries, there are already tons of DBs available, maybe just using a different tool or, even better, compile the DB with(out) different scripting support.
    • bot403 9 minutes ago
      I can't tell if you're arguing against SQL or ORMs. But I take your argument in favor of SQL because that's the native language of all the DBs and the dozens of frameworks and systems on top of them are "just use x...."
  • clutter55561 26 minutes ago
    ORMs have their place but they are leaky as hell. RDBMSs are very diverse, have different languages, and require different optimisation techniques.

    ORMs that try to paper over all the differences fail miserably. They become super complicated and generally produce crap SQL.

    ORMs also tend to oversimplify database design. They are just tables with primary keys, right? Who needs indices? Who needs to think about collation? God forbid anyone mentions physical organisation of the data!

    Having said this, I do use a very small subset of SQLAlchemy (the bits I understand) in data pipelines.

  • zadikian 7 minutes ago
    I never use ORMs. But slightly before 2014, there was still kind of a reason to use them, getting/setting a whole nested bag of fields at once that you don't care about individually. Json/jsonb now handles that better.
  • capitainenemo 12 minutes ago
    I thought this was well put. https://web.archive.org/web/20160301022121/http://www.revisi...

    A now defunct site discussing why ORM is a poor map.

  • exabrial 12 minutes ago
    The purpose of an orm is not to "stop writing SQL". In order to effectively use a layer abstraction, you must be able to use the layer below the abstraction.
  • scritty-dev 54 minutes ago
    the N+1 trap and having to incorporate eager loading dictates you need to pretty much understand SQL regardless. applying the object oriented paradigm to relational data created Frankenstein's monster which we unaffectionately refer to as ORMs
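
For readers who have not run into it, the N+1 pattern can be sketched as below. This is a minimal illustration with Python's standard `sqlite3` module; the `author`/`book` schema, the sample rows, and both function names are assumptions made for the example, and real ORMs emit this pattern through lazy loading rather than an explicit loop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        title TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author (id, name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book (title, author_id) VALUES ('A1', 1), ('A2', 1), ('B1', 2);
""")


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    queries = 1
    authors = conn.execute("SELECT id, name FROM author ORDER BY id").fetchall()
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM book WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """The same data in a single round trip (the 'eager loading' shape)."""
    rows = conn.execute(
        "SELECT a.name, b.title FROM author a "
        "LEFT JOIN book b ON b.author_id = a.id "
        "ORDER BY a.id, b.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        titles = result.setdefault(name, [])
        if title is not None:  # authors without books yield a NULL title
            titles.append(title)
    return result, 1
```

Both functions return the same mapping; the first issues one query per author on top of the initial one, which is what grows painfully with table size.
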
  • Kaliboy 29 minutes ago
    I feel like ActiveRecord has none of these problems, but I also feel some strong confirmation bias.

    Can anyone that has used ActiveRecord share their opinion?

  • teliskr 53 minutes ago
    I use both SQL and ORMs every day. I've used hibernate since 2004. I've certainly had some difficult times with it; but overall it is a net positive. I find that it generally works well and saves a ton of time as long as I stick to my known patterns.
  • sbuttgereit 1 hour ago
    Just one quick note...

    > ...(although things like Postgres’ hstore can help)...

    Back when this blog post was written, this advice would have been reasonable. Today, I don't know anyone reaching for hstore since the more featureful json support was added.

  • Waterluvian 38 minutes ago
    What Python taught me: just use C.

    These are simply tools. The only wrong opinion is to believe that there’s a strict superiority of one over another. However, the content of this and other blogs can help people make informed decisions on when to reach for each tool.

  • add-sub-mul-div 1 hour ago
    2014: people respond with indignance that they should have to learn SQL now that there's a shortcut

    2026: people respond with indignance that they should have to learn anything now that there's a shortcut

    • flir 1 hour ago
      I like SQL. I enjoy writing SQL. I find ORMs produce crap SQL.

      But the current shortcut du jour is pretty damn good at writing SQL.

      • mrweasel 29 minutes ago
        While I do enjoy the Django ORM, for many queries SQL is just better. It's almost as if it was designed for querying databases.

        Once you hit a certain level of complexity in your queries, you're better off with SQL. It's not that you can't do the query in the ORMs, but you're then looking at learning their special query language and those are never better nor easier to understand than just SQL. Those ORM query languages certainly aren't transferable across ORMs, but SQL frequently is. If you can query MariaDB with SQL, you can query SQLServer and PostgreSQL. The same can't be said for e.g. Django vs. Hibernate.

        For the "give me all the entries, with this one property" ORMs are much quicker and easier to work with. Once you start needing to use subselects, multiple joins, weird ranges or constructing objects with data from across tables, I'd rather just write the SQL myself.

    • el_io 1 hour ago
      And yet people (mostly) skip SQL and learn some ORM.
  • yieldcrv 13 minutes ago
    LLMs are better at writing raw queries now and knowing the consequences of how it fits in your architecture (if you ask)

    So I think the ORM debate could be over

    postgresql is a beast

  • jdw64 15 minutes ago
    Use it where it fits, and don't use it where it doesn't.

    If you don't use an ORM, you'll end up with more boilerplate from mapping code with DTOs. The reason to use an ORM is dirty checking. It's hard to impose this kind of "state" with a relational database. But fundamentally, relational data doesn't fit well with OOP. In the end, you inevitably have to create a layer that absorbs this mismatch. Both approaches have their pros and cons anyway.

    Isn't it just a matter of using it where it fits and not using it where it doesn't? I wonder if we really have to frame it as "never use this" or "always use that."

    Actually, on second thought, I take it back. "Right tool for the right place" is harder. If you're on a team, it's probably better to just pick one: either don't use it at all, or use it everywhere. Because either way, friction is going to happen. My earlier thinking was too shallow.

  • bob1029 44 minutes ago
    ORMs are a horrible fit for OLAP scenarios. I've got a situation where I need to load ~40 tables with a total of 100k+ rows and I need it to happen at user-interactive speeds (less than 10 seconds).

    There is nothing that an ORM can do to help with this sort of problem without reaching for the obvious escape hatch of arbitrary command text execution. The ability to map the tables to objects in my programming environment is a distracting clown show for this specific problem. What really matters is understanding the provider and its techniques for bulk loading records. No ORM will ever be able to touch these provider capabilities on their "happy" paths. At best you'll wind up using the ORM and a bunch of provider-specific SQL anyways.

    ORMs for schema management is a stronger argument, but only in cases where the codebase/service has complete ownership over each respective database. Any kind of heterogenous workload says that ORM for schema management is a potential nightmare unless you do something like create a project that is only for migrating the schema, at which point I'd argue you could just maintain a source controlled folder of sql/shell scripts.

  • danlugo92 1 hour ago
    Also, NoSQL taught me to love SQL.
    • pjmlp 18 minutes ago
      Especially Dynamo DB.
  • gedy 42 minutes ago
    One nice thing about the rise of ORMs back in the day was it broke the stranglehold our traditional DBAs had on the data tier. I respected them and their skills, but in a product org it was really difficult to have a separate group that refused to participate in planning and wanted to design everything up front, optimize based on their performance assumptions, and then who would argue with devs when we'd need to do pretty normal things like, say, list users in a webapp.

    I'm talking about my experience, not generalizing to all DBAs of course. And of course ORMs introduced performance issues, etc.

  • ai_slop_hater 1 hour ago
    Next step is go down one more level to ditch SQL and learn LMDB and/or RocksDB.
  • ChicagoDave 23 minutes ago
    ORMs taught me that relational databases are an operational anti-pattern.

    NoSQL for operational data storage is more efficient and cost effective.

    ORMs were a regression test that exposed unnecessary complexity.

    • zsoltkacsandi 5 minutes ago
      I’ve never seen any reliable service built on a NoSQL store as a primary data store. If data consistency and not losing customer data are important to you, RDBMSs are just fine.