4 comments

  • 867-5309 401 days ago
    (for chess)
    • karmakaze 401 days ago
      Also (2003) from response "Date: Tue, 10 Jun 2003 14:16:50 GMT" for "pgn-complete.htm".
  • fasterik 401 days ago
    PGN and especially the use of standard algebraic notation (SAN) is one of my pet peeves in chess programming. SAN is great for humans to read but is annoyingly tricky and less efficient for a computer to parse. PGN files are also huge relative to the amount of information they need to encode. I'm not sure why a human would ever want to read a PGN file.

    I'll mention a tool called pgn-extract. It can do lots of things with PGN files. I am primarily interested in using large PGNs of master games to extract opening lines and individual positions. Some features of pgn-extract I have found useful are stripping tags and converting moves to UCI notation (e.g. "g1f3" instead of "Nf3"), and dumping the FEN string of every position in the PGN.

    https://www.cs.kent.ac.uk/people/staff/djb/pgn-extract/

    • register 400 days ago
      As a chess player (around 1700 FIDE) and developer of a retired (and moderately successful) Android chess application, I could not disagree more.

      Back in 1995 there was no free ChessBase reader and no Scid, but there was TWIC and the Exeter Chess Club website. I studied most of the games with Notepad and a physical chessboard. PGN has contributed significantly to the growth of digital chess.

      At the same time, PGN is extremely easy to parse, and I can say so having implemented such a parser myself. The main trick is to segment each game's start and finish on newlines and tags. Once the start and finish of a game are identified, tokenizing the moves and other symbols is trivial. To increase robustness, most PGN readers also check that moves are legal with a move generator.
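
      A minimal sketch of that segment-then-tokenize approach, in Python. The names `split_games` and `san_moves` are made up for illustration, the sample PGN is invented, and the sketch assumes well-formed input: it ignores `;` line comments and escaped quotes, and it does not check move legality.

```python
import re

# A tag pair line such as: [White "Kasparov, Garry"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

# Movetext tokens, tried in this order.
TOKEN_RE = re.compile(r"""
    \{[^}]*\}             # brace comment
  | \(|\)                 # variation start / end
  | \$\d+                 # numeric annotation glyph
  | \d+\.(?:\.\.)?        # move number: "12." or "12..."
  | 1-0|0-1|1/2-1/2|\*    # game result
  | [^\s(){}]+            # anything else is treated as a SAN move
""", re.VERBOSE)

MOVE_NUMBER_RE = re.compile(r"\d+\.(?:\.\.)?")
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def split_games(text):
    """Split PGN text into (tags, movetext) pairs.

    A tag line that appears after movetext marks the start of the next game.
    """
    games, tags, movetext = [], {}, []
    for raw in text.splitlines():
        line = raw.strip()
        match = TAG_RE.match(line)
        if match:
            if movetext:
                games.append((tags, " ".join(movetext)))
                tags, movetext = {}, []
            tags[match.group(1)] = match.group(2)
        elif line:
            movetext.append(line)
    if tags or movetext:
        games.append((tags, " ".join(movetext)))
    return games


def san_moves(movetext):
    """Return main-line SAN moves, skipping comments, variations, NAGs,
    move numbers and the result token."""
    moves, depth = [], 0
    for tok in TOKEN_RE.findall(movetext):
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        elif tok.startswith(("{", "$")) or tok in RESULTS:
            continue
        elif MOVE_NUMBER_RE.fullmatch(tok):
            continue
        elif depth == 0:
            moves.append(tok)
    return moves


SAMPLE = """[Event "Example"]
[White "A"]
[Black "B"]
[Result "1-0"]

1. e4 e5 2. Nf3 {developing} Nc6 (2... d6 3. d4) 3. Bb5 1-0

[Event "Second"]
[Result "*"]

1. d4 d5 *
"""

for tags, movetext in split_games(SAMPLE):
    print(tags.get("Event"), san_moves(movetext))
```

      The legality check mentioned above would sit on top of this: each SAN token is resolved against a move generator for the current position, which is also where malformed moves get caught.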

      • fasterik 400 days ago
        I wasn't making a claim about the historical importance of PGN. My issue is that it's become the de facto standard for distributing large collections of games, which is inefficient and no longer necessary given the ubiquity of free chess programs.

        I agree that tokenizing isn't difficult. My pet peeve is about using SAN as a standardized format for feeding games to the computer. A naive encoding would use 6 bits for source square + 6 bits for destination square + 4 bits for promotion and other move flags. This fits a move in 2 bytes and is unambiguous. Compare that to SAN, which uses up to 7 bytes per move; requires knowledge of the board state and iteration over every legal source square; is potentially ambiguous if source rank/file isn't disambiguated properly; and adds extraneous information about captures and checks.
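
        To make the 2-byte figure concrete, here is a rough sketch in Python. The bit layout (source in the low 6 bits, destination in the next 6, flags in the top 4) and the promotion flag values are my own assumptions for illustration, not any standard. Input is a UCI-style move string such as "g1f3".

```python
FILES = "abcdefgh"

# Hypothetical flag values: 0 means no promotion. Only 5 of the 16 possible
# flag values are used here; the rest are free for other move flags.
PROMO = {"": 0, "n": 1, "b": 2, "r": 3, "q": 4}
PROMO_INV = {value: key for key, value in PROMO.items()}


def square_index(name):
    """Map 'a1'..'h8' to 0..63 (a1 = 0, h1 = 7, a2 = 8, ...)."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])


def square_name(index):
    """Inverse of square_index."""
    return FILES[index % 8] + str(index // 8 + 1)


def encode(uci):
    """Pack a UCI move into 16 bits: 6 source + 6 destination + 4 flags."""
    src = square_index(uci[0:2])
    dst = square_index(uci[2:4])
    flags = PROMO[uci[4:]]
    return (flags << 12) | (dst << 6) | src


def decode(word):
    """Unpack a 16-bit word back into a UCI move string."""
    src = word & 0x3F
    dst = (word >> 6) & 0x3F
    flags = (word >> 12) & 0xF
    return square_name(src) + square_name(dst) + PROMO_INV[flags]


for move in ["g1f3", "e2e4", "e7e8q"]:
    word = encode(move)
    packed = word.to_bytes(2, "little")  # always exactly 2 bytes
    print(move, word, packed.hex(), decode(word))
```

        Decoding needs no board state at all, which is the contrast with SAN: "Nf3" only becomes "g1f3" after you work out which knight can legally reach f3.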

    • apetresc 401 days ago
      The human-readability of PGN is very handy. The kinds of chess players who curate PGN collections can definitely visualize/identify games based on algebraic notation.

      A well-formed PGN file is basically just an ASCII scoresheet that every tournament player is required to write out by hand over the course of a game anyway.

      • arnsholt 401 days ago
        Being human-readable is nice, but compared to something like lichess’s Huffman-based encoding, PGN is something like an order of magnitude larger than needed. If your reference base is several million games, that really adds up.
      • fasterik 401 days ago
        I'm not convinced. Why would you read the file in a text editor instead of loading it into ChessBase or a similar program?
        • apetresc 395 days ago
          Oh, you're not wrong. I didn't mean to imply that chess players read PGN files cold. They definitely load them into ChessBase for actual use.

          But I'm just saying it's a nice QoL feature to be able to paste a PGN into an e-mail or open it in your text editor and, at a glance, get info like who the players were, how the game ended, how long it was, etc. Often that's what you need to decide whether this is a file you want to load into ChessBase in the first place.

          It's not a killer feature or a critical use case, but it is good ergonomics, and you're not going to supplant PGN with some binary encoding just because the binary file is a few kilobytes smaller.

        • Bjartr 401 days ago
          Maybe they can read the game faster than it would play on screen? Like the difference between a video and its transcript?
          • fasterik 401 days ago
            You can still read the notation on screen in ChessBase. It also makes navigating through the games and searching for specific players and opening lines much easier.
  • seabass-labrax 401 days ago
    PGN has been a phenomenal success; almost all chess software supports the format, allowing games to be, for instance, recorded in one program and analysed in another. I also wonder if PGN was the 'nail in the coffin' for the (somewhat bizarre) 'descriptive' chess notation, which was popular in the USA until very recently.