Git, JSON and Markdown walk into bar

(grumpygamer.com)

69 points | by speckx 3 days ago

17 comments

raincole 5 hours ago
JSON is one of the most iconic tragedies that happened in software development.
It's not that JSON itself is bad, but it's obviously for machines to author, not for humans.
- No comment
- No trailing comma
- No multi-line string
It's a terrible format to type manually. However, we just shrugged and said "at least it's not XML" and started writing it manually anyway.
And later we finally realized comments are not optional, so we got JSON5, JSONC, etc...
[-]
- Cthulhu_ 4 hours ago
  I agree that it's a tragedy, XML had so much standardization, tooling, validation, etc available, it was great. But people used it for configuration and manually writing it, which sucked.
  15 years later and JSON is still very far from the standardization and tooling that XML had at the time.
  It reminds me of the NoSQL thing from back then too, oversimplified it was "what if we chuck JSON blobs in a key/value store?". It took years to realize that relational databases and SQL weren't actually that bad, and / or that NoSQL had a long term cost.
  [-]
  - whizzter 2 hours ago
    And that was the downfall..
    1: Standardization was made by committee lovers and/or architecture astronauts leaving us with overly convoluted (sometimes to fit lacklustre object models in early OO languages) and complex ways of working.
    2: Complexity that introduced security vulnerabilities
    Sure, there was some great tooling available, but how much of it was needed because all complexity made it hopeless to work with without the tooling?
    I used it for configuration and serialization in some projects, and it was actually great but I almost always diverged from the bloated norms and defaults for readability/writeability (that made it a bit annoying to specify serialization rules).
    I mean, why did people prefer?
```
  <object><property name="somename"><int32>123</int32></property></object>
```
    Over just?
```
  <object somename="123"/>
```
    Yeah there is so much "flexibility" in those above designs but it wasn't needed 99% of the time.
    JSON was and is so far the best popular compromise between "just plain data" and "some" structure to make automated processing non-painful.
    Also as an improvement over XML collections (do you create a container element to specify the container target, leading to bloat, or just map some of the sub-elements to specific collections and hope you don't run into ambiguities?) is that collections are just specific lists assigned to a property.
    The biggest drawback of JSON is that we never had a way to handle type specializations/subtyping but had we done that we might have not gotten the universal acceptance across languages.
    Yes, comments, but whenever you need them you can use a single-line regexp to strip a useful subset of them, without affecting anything covered by the standard, by removing matches of
```
  /^\s*\/\/[^\r\n]*/mg
```
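    A minimal sketch of that regexp in Python, assuming comments only ever appear as whole lines starting with `//`. The helper name `loads_with_line_comments` and the sample config are made up for illustration; a `//` inside a string value (such as a URL) is left alone because it does not start the line.

```python
import json
import re

# Python spelling of the regexp above: whole lines that start with `//`
# after optional whitespace. re.MULTILINE makes `^` match at every line start.
LINE_COMMENT = re.compile(r"^\s*//[^\r\n]*", re.MULTILINE)


def loads_with_line_comments(text: str):
    """Strip full-line `//` comments, then parse the rest as ordinary JSON."""
    return json.loads(LINE_COMMENT.sub("", text))


config_text = """
{
  // The Slack channel that receives notifications.
  "channel": "#abc",
  "url": "https://example.com/hook"
}
"""
config = loads_with_line_comments(config_text)
```

    Trailing comments after a value on the same line are deliberately not handled, which is what keeps the pattern safe for strings.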
- theshrike79 4 hours ago
  And we have XML.
  JSON is rediscovering XML Schema, XML DTDs etc, when we had those a quarter century ago already.
  It was so good when you could define the structure easily and validate it with standard tooling.
  [-]
  - ninkendo 4 hours ago
    The problem I always had with XML is that it's not clear how to properly encode typical data.
    If I have a structure like this JSON:
```
  {
    "foo": "bar",
    "nested": {
      "nested_foo": "nested_bar"
    }
  }
```
    Should I do this in XML?
```
  <foo>
    bar
  </foo>
  <nested>
    <nested_foo>
      nested_bar
    </nested_foo>
  </nested>
```
    Or should I do this?
```
  <foo>
    bar
  </foo>
  <nested nested_foo="nested_bar"/>
```
    If your goal is to simply encode a data structure that looks more or less like a C-style struct (key/values, arrays, primitives), the concept of attributes on tags is superfluous, and introduces ambiguity in how to serialize something.
    To me, JSON is nice because it's essentially the minimum viable way of encoding a C-style struct (floating point behavior notwithstanding.) XML has extras like attributes, schemas, and DTDs, which may be useful, but they come at the cost of having additional syntax (<!DOCTYPE ...>, etc), which is auxiliary to the goal of encoding data structures, and thus makes it no longer as minimal.
    To me, JSON's approach of having a separate out-of-band definitions of schema (e.g. JSON schema) is the better approach because it's less opinionated, you only pay for it when you need it, and it doesn't require separate syntax. Leave validation to a validation step, my data is my data.
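    To make the ambiguity concrete, here is a small sketch using Python's standard `xml.etree.ElementTree`. The `<root>` wrapper and the helper `nested_foo` are assumptions added for illustration (XML needs a single root element); the point is only that a reader has to look in two places unless the encoding was agreed on in advance.

```python
import xml.etree.ElementTree as ET

# Hypothetical <root> wrapper around the two encodings shown above.
as_elements = (
    "<root><foo>bar</foo>"
    "<nested><nested_foo>nested_bar</nested_foo></nested></root>"
)
as_attribute = '<root><foo>bar</foo><nested nested_foo="nested_bar"/></root>'


def nested_foo(xml_text: str):
    """Read nested_foo whether it was encoded as a child element or an attribute."""
    nested = ET.fromstring(xml_text).find("nested")
    child = nested.find("nested_foo")
    if child is not None:
        return child.text
    return nested.get("nested_foo")
```

    Both documents yield the same value, but only because the reader handles both shapes explicitly.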
  - p2detar 3 hours ago
    You forgot to mention XSLT, which is just such a ridiculously powerful tool when you got XML data. It allows one to transform XML into XHTML or even other formats. Edit: Browsers also support this automatically, so you open an XML document and you get to see a rendered HTML from it.
    I had been tasked at least several times in my past jobs to develop XSLT scripts to transform data into user-readable content. I don't know of anyone that uses XSLT today and I have no idea if there is a JSON equivalent.
    [-]
    - rimunroe 2 hours ago
      > Browsers also support this automatically, so you open an XML document and you get to see a rendered HTML from it.
      Though as I understand it it's possible that this might not be the case for much longer: https://github.com/whatwg/html/issues/11523
    - oneeyedpigeon 2 hours ago
      I wrote a simple CMS a few years ago that used XSLT to assemble static pages from templates and data. It worked like a dream, although writing XSLT is not the easiest task!
- eviks 5 hours ago
  > And later we finally realized comments are not optional, so we got JSON5, JSONC, etc...
  But we haven't, otherwise we'd use all those better formats instead
- pavel_lishin 4 hours ago
  Still better than YAML.
- s1mplicissimus 4 hours ago
  > And later we finally realized comments are not optional
  except they very much are. the place to explain your payload is in the API documentation, not alongside the payload. It's not code.
  [-]
  - illuminator83 1 hour ago
    When you are commenting your schema, that's true. Anything which is generated by machines doesn't need comments either. But when it's written by people? And the values? That belongs with the 'payload'.
  - rkomorn 4 hours ago
    That kind of stops working when (for example) your "payload" is configuration in a file where comments may make sense, though.
    I'm more partial to YAML for readability, but I don't think JSON configs are an awful anti pattern.
  - raincole 4 hours ago
    The exact issue is that JSON's usage spread far beyond "payload."
    > It's not code.
    package.json has a field literally called 'scripts' where the values are shell one-liners.
  - tedggh 4 hours ago
    This is true, but it is also true that if someone who didn’t read the documentation or the latest version of it, changes a config parameter that brings down a critical system you are responsible for, your bosses or customers won’t care if it was in the documentation or not.
  - nwellinghoff 4 hours ago
    What about json schema files? Hmmm
  - slowmovintarget 2 hours ago
    Confused puppy look
    https://orgmode.org/worg/org-contrib/babel/
- TZubiri 2 hours ago
  "I use JSON for just about every data file format in my games."
  Not to be mean, but this has the trifecta of amateur programming:
  - JSON - Games - One solution for everything.
  Pro tip, you can store variables as they are in memory to disk. Got 1 million 2D points representing units? Each point is a couple of floats? You can store each float as-is, write the number of floats as an int (4 bytes), the first float is the X coord, the second the Y coord, (4 bytes each), then repeat 1M times, boom you just solved that in 8MB, and in a couple of milliseconds of compute. Bonus point, no escaping, no import json, just a programmer programming.
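  A rough sketch of that layout in Python, assuming 32-bit floats in the machine's native byte order and a file that is read back on the same kind of machine. The function names `save_points` and `load_points` are invented for the example.

```python
import struct
from array import array


def save_points(path, points):
    """Write a 4-byte count of floats, then raw float32 values x0, y0, x1, y1, ..."""
    flat = array("f", (coord for point in points for coord in point))
    with open(path, "wb") as f:
        f.write(struct.pack("=I", len(flat)))  # native byte order, 4 bytes
        flat.tofile(f)                         # 4 bytes per float, as in memory


def load_points(path):
    """Read the count, then the floats, and pair them back up as (x, y)."""
    with open(path, "rb") as f:
        (count,) = struct.unpack("=I", f.read(4))
        flat = array("f")
        flat.fromfile(f, count)
    return list(zip(flat[0::2], flat[1::2]))
```

  With one million points the payload is 2 floats × 4 bytes × 1,000,000 = 8 MB plus the 4-byte header. The tradeoff is that the file is not portable across byte orders and cannot be inspected in a text editor.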
  [-]
  - dxdm 2 hours ago
    > Not to be mean, but this has the trifecta of amateur programming
    But you are being mean. Also, the guy who wrote the article is no amateur, but a seasoned veteran who's likely been storing floats in files for decades. Check him out, you might be surprised.
    https://www.grumpygamer.com/about/
p2detar 6 hours ago
To those who may not know, Grumpy Gamer is a blog by Ron Gilbert, the creator of iconic game titles such as Maniac Mansion and Monkey Island. A true legend.
[-]
- theshrike79 4 hours ago
  Also his blog RSS feed is unabridged (full post in the feed with no "click here to read") and contains every single post ever.
  Now I have 308 posts to read :)
- itomato 5 hours ago
  Who works in a GUI editing unflattened JSON like so much Markdown.
dspillett 6 hours ago
> Why is italics *italics* and bold is **bold**? Why not *bold* and _italics_. That would make a lot more sense to me.
I think that comes from separating content and style, indicating meaning rather than explicit style: it isn't really one asterisk for italic and two for bold, it is one for emphasis and two for strong emphasis and the renderer chooses how to display those levels. Like using HTML's “em” and “strong” tags instead of explicit “i” and “b” tags.
[-]
- oniony 6 hours ago
  Back on Fidonet/Aminet/Spot on the Amiga in the 1990s, pre-internet, the convention was /italic/, *bold* and _underline_. This was much more intuitive for me than what we have today.
  [-]
  - zygentoma 4 hours ago
    That's also how I did it in IRC times, when nothing would actually be formatted by this
    It still looks good even without any formatting! (And btw. I thought that was the intention of markdown …)
    [-]
    - dspillett 3 hours ago
      The problem is that markdown has no real standard. Well, it does, but not everything follows it because many things existed before the formal standard and many created since are made to be compatible with something that pre-dates the standard. Some optimise for matching stylistic intent (bold, italic, underscore) and others prefer to be more abstract (emphasis, strong emphasis), and yet more try to support both but that requires compromise and they don't all compromise the same way.
      Many interpreters will accept underscores for italic, though they still generally (but not always) require two asterisks for bold.
      I just accept it as a general idea, not a standard, and lookup the local conventions for whatever tool I'm using at the time. Or if I'm writing a translator, I prioritise converting things written how I personally prefer to write plain text documentation.
yanis_t 5 hours ago
Since I started taking notes basically for everything[0], markdown has been a savior. Just text is fine, but when you're able to sprinkle a little bit of formatting here and there (and provide links) without sacrificing the readability, this is just great.
[0] https://www.mindthis.io/
[-]
- theshrike79 4 hours ago
  This is why I have standardised on Obsidian as my data storage along with Datasette[0] (by simonw) for larger data amounts.
  Watch a movie? Add a page to Obsidian with the movie title as the note title, run a python script and boom it has all of the metadata filled up along with everything relevant from TMDB and it's a pretty card with a cover image on my Movies Base.
  If Obsidian turns Evil Corporate, my workflow will still be the same, the editor just changes. I'll miss Bases, but all of my own automation is a bunch of external scripts that modify markdown.
  [0] https://datasette.io
pavel_lishin 3 hours ago
The lack of comments is a bummer; allowing /* */ style comments seems like it wouldn't complicate parsing significantly, and instead I've sometimes ended up having to do things like this:
```
    {
      "channel-comment": "This is the Slack channel that will receive notifications.",
      "channel": "#abc"
    }
```
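One way to live with that workaround is to drop the pseudo-comment keys right after parsing. This is only a sketch: the `-comment` suffix is taken from the example above, and the helper name `drop_comment_keys` is made up.

```python
import json


def drop_comment_keys(value, suffix="-comment"):
    """Recursively remove object keys ending in `suffix` (a hypothetical convention)."""
    if isinstance(value, dict):
        return {
            key: drop_comment_keys(item, suffix)
            for key, item in value.items()
            if not key.endswith(suffix)
        }
    if isinstance(value, list):
        return [drop_comment_keys(item, suffix) for item in value]
    return value


raw = """
{
  "channel-comment": "This is the Slack channel that will receive notifications.",
  "channel": "#abc"
}
"""
settings = drop_comment_keys(json.loads(raw))
```

The file stays valid JSON for every other tool, at the cost of a naming convention that nothing enforces.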
dabeeeenster 6 hours ago
The drive by shooting of Gruber in this article is sort of weird?!
[-]
- tobr 5 hours ago
  The ad hominem is unnecessary when there are more substantial criticisms of his writing and interviews.
  [-]
  - fluoridation 2 hours ago
    It's not an ad hominem, it's just an insult.
wingmanjd 3 days ago
Coming from PHP, the lack of trailing commas in JSON always bites me. It's annoying when I rearrange the items and now the missing comma line was buried somewhere upwards.
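For anyone who wants to see the failure mode, here is a tiny check with Python's standard `json` module, which follows the strict grammar. The helper `is_valid_json` and the sample strings are invented for the example.

```python
import json

with_trailing_comma = '{"items": ["a", "b",]}'
without_trailing_comma = '{"items": ["a", "b"]}'


def is_valid_json(text: str) -> bool:
    """Return True if the strict parser accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Only the second string parses; the first is rejected because of the comma before the closing bracket.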
[-]
- petepete 7 hours ago
  Trailing commas is the source of 95% of my SQL syntax errors. I know I _could_ put them at the start of the line, but it's unintuitive.
  [-]
  - hk1337 4 hours ago
    I like the comma in front for SQL better than at the end but you still have the problem of one of the fields missing a comma, the first field instead of the last but I guess maybe you're more likely to move fields and forget to add/remove a comma at the end.
  - neocron 4 hours ago
    Who writes trailing commas in SQL? In every company I worked with the standard was prefix commas:
    `var , var2 , var3 , var4`
  - zarzavat 7 hours ago
    I recommend writing a function to remove trailing commas from SQL. Big time save.
brap 5 hours ago
As for JSON, I am also constantly annoyed by lack of trailing commas and mandatory quotations for keys. However I think these were the right design decisions and the slight annoyance is a small price to pay (especially when automation exists).
No trailing commas is great for enforcing consistency. I’m a huge fan of consistency in code. Same with required quotation marks, which also simplify writing (imagine having to wonder if something needs it, or be surprised when it does and things break).
[-]
- Orygin 4 hours ago
  I don't understand you. Forcing trailing commas is one of the best features of Go, it enforces consistency where you must have a comma at the end of the line. Re-ordering lines? No worries, all of them have commas. Removing a line? No comma to change anywhere
- eviks 4 hours ago
  No trailing commas is actually INconsistent, consistency is when every element ends with a comma
  You've also got it backwards on quotes, it complicates writing by forcing you to write more. And with "Especially when automation exists" wondering is a non-issue, you'd get the syntax hint/error right there while typing and see if you need quotes before anything breaks
- RedNifre 5 hours ago
  If you already have quotation marks, what's the point of the commas?
  [-]
  - bayindirh 5 hours ago
    Try parsing this:
```
    { foo :"bar" baz :"bak" quux :[ a,b,c,d ] lol :9.7E+42 }
```
    Ref: https://www.json.org/json-en.html, but without commas. It's line noise. Commas allow a nice visual anchor.
    [-]
    - practal 4 hours ago
      Oh, actually that is the syntax I will use for writing abstractions:
      my-abstr x y z foo: "bar" baz: "bak" "quak" quux: [a, b, c, d] lol: 9.7E+42
      I don't think
      my-abstr x y z, foo: "bar", baz: "bak" "quak", quux: [a, b, c, d], lol: 9.7E+42
      would be better. Indentation and/or coloring my-abstr and the labels (foo:, baz:, quux:, lol:) are the right measures here.
      [-]
      - bayindirh 3 hours ago
        While I have no problems with indentation based syntax, it's not very conducive to minimization, so it's a no go for JSON's case.
        Coloring things is a luxury, and from my understanding not many people understand that fact. When you work in the trenches you see a lot of files on a small viewport and without coloring most of the time. Being able to parse a squiggly file on a monochrome screen just by reading is a big plus for the file format in question.
        As technology progresses we tend to forget where we came from and what the fallbacks are, but we shouldn't. Not everyone is seeing the same file on a 30" 4K/8K screen with wide color gamut and HDR. Sometimes an 80x24 over some management port is all we have.
  - merelysounds 5 hours ago
    This is a popular question, the most common answer I’ve seen is:
    > Commas exist mostly to help JSON be human-readable. They have no real syntactic purpose as one could make their own notation without the commas and it'd work just fine.
    https://stackoverflow.com/a/36104693
    Elsewhere such commas can be optional, e.g. in clojure: https://guide.clojure.style/#opt-commas-in-map-literals
DarkNova6 7 hours ago
Neat little blog post. But it kept me waiting for a punchline which never appeared.
b2ccb2 6 hours ago
For the mentioned problem of removing sensitive data, BFG Repo-Cleaner does the trick: https://rtyley.github.io/bfg-repo-cleaner/
[-]
- kingkero 6 hours ago
  [dead]
reddalo 6 hours ago
I was surprised by the comment about John Gruber; I love reading his Daring Fireball. I'll try to look at the articles from Gilbert's perspective now.
[-]
- stevoski 6 hours ago
  I agree with you.
  An article like this is weakened by including an unnecessary personal attack.
  [-]
  - timmg 6 hours ago
    Interesting. I read it as a bit tongue-in-cheek. But who knows...
stevage 6 hours ago
>Why not *bold* and _italics_. That would make a lot more sense to me.
Surely _underline_ would make more sense than _italics_. Somewhere I have seen /italics/ in use, but that does look kind of regexpy.
>I dislike that trailing commas are not allowed...There is no need for this and it makes writing out valid JSON more complex.
Trailing commas as a trend emerged after JSON was standardised. And thank god JSON is as well and truly standard as it is.
[-]
- mananaysiempre 6 hours ago
  It’s an old convention that underline in a manuscript (handwritten or typewritten) directs the typesetter to use italics (as underlines are basically nonexistent in professional typesetting before the WWW). I expect that this is where the _italics_ thing (which predates Markdown) comes from. (There is precedent for /italics/ and I don’t think it’s unreasonable, but it is much rarer.)
  [-]
  - chrisldgk 5 hours ago
    To add to this, when I went to school for design a long time ago, our typography teacher basically told us to never use underlines if we can use italics instead. It tends to mess with the readability of a paragraph and shifts the visual center of gravity downward, making text more difficult to parse. I assume that’s also why italics and underline seem to be used interchangeably from time to time, since they generally achieve the same goal of emphasizing text in the same semantic manner.
- philipwhiuk 6 hours ago
  The trailing comma trend is almost entirely down to line-based diff tools as far as I can tell.
  [-]
  - wahern 5 hours ago
    Trailing commas were common in language design long before JSON or even JavaScript existed as it simplifies machine generation of code while being comparatively trivial to handle in a parser, so a net win.
    The convenience it offers for diffing is just a manifestation of the positive interaction with grammars and language tools. The convention of humans using trailing commas in lists, along with one item per line, is relatively new, though. Stylistically, this used to be frowned upon as long definition lists made source files longer, slower to scroll through, and worsened code locality from the perspective of someone using, e.g., a 25 row terminal.
  - stevage 6 hours ago
    Huh, I don't know about that. I find it much more convenient for editing because it means the last item in a list doesn't behave differently when it comes to cut/copy/paste.
  - dspillett 6 hours ago
    It can also help eliminate some editing errors, when copying entries to extend a list or reordering entries.
    I prefer leading commas to having a final comma with an empty clause, though some people hate that and they don't really solve all the final-entry issues (they address some of them, but others are just moved to being first-entry issues).
  - ithkuil 6 hours ago
    I wonder how much json aware diff representation and merge conflict resolution would remove the need of having the trailing comma
    [-]
    - usrusr 5 hours ago
      I'd rather have a format supporting a friendly truth on the ground (on the filesystem) than adding yet another "almost like standard behavior" quirk to tooling.
  - perching_aix 6 hours ago
    Git is generally insufferable when it comes to these. Diffing YAMLs is even worse, and it gets downright hideous when the specific document you're working with betrays YAML's rules about orderedness (the document is order invariant, while YAML is ordered). In that case, even most semantic diffing tools become unusable. This is a thing with JSON too, arrays are ordered.
    I've been recently using dyff [0] to diff YAMLs in an order invariant way, and it's been absolutely liberating. Couldn't help with version control, but it's still night and day.
    [0] https://github.com/homeport/dyff
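    To illustrate the ordering point (this is not how dyff works internally, just a sketch of the idea): parsed JSON objects already compare equal regardless of key order, but arrays do not, so an order-invariant comparison has to normalise them first. The helper `canonical` and the sample documents are hypothetical.

```python
import json


def canonical(value):
    """Return a form that ignores array order as well as object key order."""
    if isinstance(value, dict):
        return {key: canonical(item) for key, item in value.items()}
    if isinstance(value, list):
        # Sort items by their canonical JSON text so position no longer matters.
        return sorted(
            (canonical(item) for item in value),
            key=lambda item: json.dumps(item, sort_keys=True),
        )
    return value


a = json.loads('{"hosts": ["web1", "web2"], "port": 80}')
b = json.loads('{"port": 80, "hosts": ["web2", "web1"]}')
```

    Here `a == b` is false because the `hosts` arrays differ in order, while `canonical(a) == canonical(b)` is true. That is only appropriate when the document really treats its lists as sets.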
  - arccy 6 hours ago
    it's also annoying to edit, copy a line, add it to the end of a block. now you need to add a comma to the previous line, and strip off the comma on the new one you added (if you copied it from not the previous line).
  - kragen 4 hours ago
    Trailing commas avoid having the last item behave differently, but they're easy to accidentally omit when they're optional.
```
  [ "or"
  , "alternatively"
  , [ "the"
    , "first"
    , "item"
    ]
  , "could"
  , "behave"
  , "differently"
  ]
```
    I feel that, despite its repugnant appearance, this "comma-first" approach is the best tradeoff in languages like JSON where trailing commas are forbidden; the leading `[` is much harder to accidentally omit or insert than the subtle trailing `,`. In Emacs I use js3-mode, a hack of js2-mode to support comma-first syntax.
    Comma-first syntax is especially convenient in SQL, which has the forbidden-trailing-comma problem and several analogous problems. In C if I have a long Boolean conjunction
```
  if (unpleasantly long boolean expression &&
      another unpleasantly long boolean expression &&
      yet another unpleasantly long boolean expression) {
```
    there are several ways to fix it, such as nesting ifs or factoring the expressions into variables or functions. The "comma-first" approach is also visually unappealing for spacing reasons, requiring two extra spaces after the parenthesis:
```
  if (  unpleasantly long boolean expression
     && another unpleasantly long boolean expression
     && yet another unpleasantly long boolean expression
     ) {
```
    In SQL, C's alternative approaches are not available, and the "comma-first" style is much more natural:
```
  where unpleasantly long boolean expression
    and another unpleasantly long boolean expression
    and yet another unpleasantly long boolean expression
```
    I do agree, though, that it's better to design languages to avoid this problem, and I think the way to do that is by using item terminators or item initiators in a list rather than by using item separators. That's what C did for statements with `;`, which was a difference from the ALGOL tradition including Pascal, where `;` was a statement separator, with the unpleasant consequences described in https://www.cs.virginia.edu/~evans/cs655/readings/bwk-on-pas....
    In Meta5ix http://www.canonical.org/~kragen/sw/dev3/meta5ixrun.py I experimented with using item initiators for rules in a grammar, like Markdown uses for bulleted lists. I'm not pleased with the rest of the syntactic decisions I tried in Meta5ix, but I do think that one was a good tradeoff; here's about a quarter of the Meta5ix compiler:
```
    - terms: term ["," {continue $choice} term] @choice
    - term: (factor {else $seq}, output) [factor {assert}, output] @seq
    - factor: string {literal $it}
           , "(" terms ")"
           , "[" @many terms {continue $many} "]"
```
    Note that, while comma-first layout feels like a gross abuse of a punctuation mark with `,`, it's quite common and natural with `|` in grammars and pattern-matches in languages like ML, where an initial `|` is also permitted; here's an excerpt from my port of μKanren to OCaml (http://canonical.org/~kragen/sw/dev3/mukanren.ml):
```
    let rec walk (s : env) = function
      | Vart (Var x) when Env.mem x s -> walk s (Env.find x s)
      | u -> u
```
    I think that's what I should have used in Meta5ix, and I will if I get around to revising it.
    [-]
    - stevage 4 hours ago
      This still seems to have the problem that the first term behaves differently from the others, and is hence inconvenient for editing operations.
      FWIW with SQL multi-line booleans, I tend to do:
      WHERE TRUE AND something AND something_else
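      The same trick helps when the clause is generated, since every condition can be emitted the same way. A minimal sketch, with the helper name `where_clause` made up for the example and the conditions assumed to be trusted SQL fragments rather than user input:

```python
def where_clause(conditions):
    """Prefix every condition with AND; the leading TRUE keeps the result valid."""
    return "WHERE TRUE" + "".join(f"\n  AND {cond}" for cond in conditions)
```

      An empty list still produces a valid `WHERE TRUE`, so no first-item or last-item special case is needed.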
      [-]
      - kragen 3 hours ago
        Yes, I agree. That's a nice trick! It's more verbose, which is a tradeoff, maybe a bad one. Too bad `,` doesn't have an identity element the way `and` does, so this doesn't solve the problem in JSON.
- arnsholt 6 hours ago
  Underscore for italics probably has its origins in the use of a solid underline as markup in a manuscript/typescript instructing the typesetter to set that fragment in italics (underscore generally being frowned upon in professionally set material otherwise, I think).
- andrewingram 5 hours ago
  I always preferred Textile over Markdown for reasons like this, so was a little sad when Markdown won the popularity contest.
- otikik 5 hours ago
  Clearly it should be *bold*, _underline_ and /italics/
  [-]
  - stevage 5 hours ago
    and -strikethrough-
    [-]
    - oneeyedpigeon 5 hours ago
      ~strikethrough~ is pretty well established and far less risky than overloading hyphens.
eviks 5 hours ago
> Why not *bold* and _italics_.
Because that's _underline_, and /italics/ are slanted
> I also have issue with it’s creator
Just pick a different specification of markdown, it's not like there is only one :)
TYMorningCoffee 5 hours ago
Did the grumpy gamer mean perforce instead of perform and it got auto 'corrected'?
phoronixrly 6 hours ago
> I also have issue with it’s creator, John Gruber. He is a highly annoying smug Apple Fanboy. His writing was fine in the early days when Apple was #3, but got intolerable as Apple became the 800lb gorilla. It’s changed recently as Apple has snubbed him but I still can’t read anything he writes.
Thank you! I thought it was just me...
[-]
- jackhalford 5 hours ago
  Hasn’t it always been an open secret that Daring Fireball is a shadow marketing website for Apple?
- darkwater 5 hours ago
  Came here to paste the exact same quote with a very similar comment. I expand it by adding that Gruber is probably the archetypal "Apple fanboy" made flesh. They don't look like fanatics - as in sports fans or politics discussions in a bar over a beer - they just make their point with a subtle superiority and an "obviously Apple made the best choice here, as usual" spin on all of their posts. I should also add that I stopped reading him some time ago though, things might have changed (also it feels like his posts are not so present on the HN frontpage like they used to be)
unit149 5 hours ago
[dead]