This is a really solid deep-dive. I like how you move from this seems obviouscases (ints, strings) into the subtle edge cases where ordering quietly breaks and then show practical encodings that actually work in byte-lex order. The examples make the pitfalls very concrete, especially the varint and tuple sections. Nice balance between theory and systems-level pragmatism
In contrast, I found it rather lacking. No discussion of the most common way to sort floats as bytes (shift the sign bit down and XOR the other 31 bits with the resulting masks), nor NaNs and +/-0 for that matter. Varint sorting introduces its own homegrown serialization but doesn't discuss the issue of overlong encodings. Nothing about string collation or Unicode issues in general. Composite data suggests adding NULs, but what if there are NULs in the actual data? (It is briefly mentioned, but only as in “you can't”.)
1 comments