Great project, write-up, and sense of humor in the videos!
> Using that part number I wasn’t able to find any information about the chip online apart from an article that claimed it was based around a “SuperH” CPU core – an ISA that I’ve encountered for the first time ever in that article.
Also found in Sega 32x, Sega Saturn, and Sega Dreamcast! And some early Pocket PCs (turn-of-the-century handhelds running Windows CE) like the HP Jornada series, although most Pocket PCs were ARM-based.
This is such a ludicrous premise, I'm amazed you pulled it off.
Another option might be to modify the baud rate of the MIDI interface. MIDI is terribly slow at 1M/32 bps, and most UARTs can go at least 115200. That would also mean changing the baud rate on your PC software at that point in transmission, and would not allow a standard MIDI file to be used.
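To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes plain DIN MIDI framing (1 start bit, 8 data bits, 1 stop bit per byte) and uses the 89-byte packet size mentioned elsewhere in the thread; the helper name is made up for illustration.

```python
MIDI_BAUD = 1_000_000 // 32      # 31250 bps, the "1M/32" figure
BITS_PER_BYTE_ON_WIRE = 10       # assumption: 1 start + 8 data + 1 stop bit

def packet_time_ms(n_bytes: int, baud: int) -> float:
    """Time to push n_bytes through a UART at the given baud rate, in ms."""
    return n_bytes * BITS_PER_BYTE_ON_WIRE / baud * 1000

# Compare standard MIDI speed with a common faster UART rate.
for baud in (MIDI_BAUD, 115_200):
    print(baud, round(packet_time_ms(89, baud), 2))
```

Under those assumptions one frame takes roughly 28.5 ms at MIDI speed versus about 7.7 ms at 115200, so the ceiling is somewhere around 35 frames per second on DIN MIDI.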
This was being done over USB midi, which is already faster than standard DIN midi AFAIK. Trying to change the baud rate of DIN midi on both ends of the communication seems like a lot of work.
Knowing a lot but being aware you know nothing is like level 4 of experience, where level 1 is being new and eager but knowing you know nothing yet, level 2 is the "I am a god" stage, and level 3 is "I'm an idiot".
That is amazing research! Reminds me a bit of the 2017 research of RCE on a DNA sequencing machine by synthesizing shellcode in actual DNA/RNA molecules [0]. I was gonna say "next up: OSC" but I guess MIDI is still dominant.
Of course it is SysEx. SysEx is to standard MIDI what inline assembler is to Python. A world of undocumented proprietary stuff lurks within just about every MIDI device!
I wish it was somehow possible to perform a piece of music that would cause remote code execution. It’d be so cool to plug in a MIDI keyboard, play an Am6,9/G# and have it open a terminal window with root access.
That doesn't make much sense as note on and note off messages are very simple and you can't insert arbitrary bytes with them, unless maybe you use some very particular run mode.
Note on/off are just messages. By convention we map them to notes, and note 69 is A-440 on most keyboards, but you could receive note 69 and play a C instead. (This is somewhat common: have the computer transpose so you can play with others who play the music in a different key. Better players can do this in their head, but it is not a universal skill even among great players.)
There is no reason you can't take a sequence of notes and do something else - pop up a root window for example. It isn't normally done because it would confuse everyone for no reason. (IIRC the original MIDI spec from the 1980s didn't have conventions for nearly as much, and some MIDI devices did really weird things with note commands.)
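To make the "it's just a number" point concrete, here is a small sketch of the usual convention (equal temperament, note 69 = A at 440 Hz) plus a transposing step. The function names are hypothetical, and nothing forces a receiver to interpret note numbers this way.

```python
def note_to_hz(note: int, a4: float = 440.0) -> float:
    """Conventional equal-tempered frequency for a MIDI note number (69 = A4)."""
    return a4 * 2 ** ((note - 69) / 12)

def transpose(note: int, semitones: int) -> int:
    """Shift a note number, clamped to the valid MIDI range 0..127."""
    return max(0, min(127, note + semitones))

# Receive note 69 (A) and sound a C instead by shifting up three semitones.
received = 69
played = transpose(received, 3)
print(played, round(note_to_hz(played), 2))
```

A receiver could just as well feed `received` into a lookup table of actions instead of a frequency formula, which is the whole premise of the joke.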
Oh there definitely is some MIDI device out there that will get a buffer overrun from a particular set of just regular note inputs. Maybe 11 notes at once due to the programmer thinking "humans have only 10 fingers, a static array of 10 elements is enough to hold all notes currently playing".
More notes or voices playing than the player has fingers is quite common, a note doesn't stop just because you let go of the key. Sometimes you want it to ring out so most synthesizers handle that case. Some even let you configure the behaviour, for example you could reallocate the longest playing note or the closest note.
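A rough sketch of the "reallocate the longest playing note" policy, as opposed to overflowing a fixed array. The class and its 10-voice default are illustrative assumptions, not taken from any particular synth.

```python
from collections import OrderedDict

class VoicePool:
    """Fixed number of voices; when full, the oldest sounding note is stolen."""

    def __init__(self, max_voices: int = 10):
        self.max_voices = max_voices
        self.active = OrderedDict()   # note -> velocity, oldest first

    def note_on(self, note: int, velocity: int):
        """Start a note; returns the note that was stolen, or None."""
        stolen = None
        if note in self.active:
            del self.active[note]     # retrigger: move the note to the newest slot
        elif len(self.active) >= self.max_voices:
            stolen, _ = self.active.popitem(last=False)   # steal the oldest voice
        self.active[note] = velocity
        return stolen

    def note_off(self, note: int) -> None:
        self.active.pop(note, None)   # ignore note-offs for notes already stolen

pool = VoicePool()
for n in range(60, 70):
    pool.note_on(n, 100)              # ten notes fill the pool
print(pool.note_on(70, 100))          # the eleventh note steals the oldest one
```

The bounds check in `note_on` is the part the hypothetical "humans have 10 fingers" programmer would have left out.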
Many synthesizers have limits on how many sounds they can support. MIDI was originally started because 1970s (analog) synthesizers could only produce one sound, and so they wanted a way to have several synthesizers connected together. Before MIDI was finished, synthesizers (now digital) could play more than one note. Hardware limitations (not just software) still ruled out unlimited notes, though, and it wasn't until around 2000 that synthesizers could generally play enough notes that players wouldn't run out in the real world.
The companies that came together to make MIDI all had analog polysynths capable of true polyphony before the MIDI standard was even finished. (distinct osc/amp/filter outputs per note and not just paraphonic synths that shared AMP/Filter circuits between OSCs)
MIDI was more about unifying the entire studio of synths, samplers, drum machines, and recording equipment. And creating interoperability between various manufacturers of music equipment. It was a solution to the multiple control voltage standards that predated it and made it troublesome to tie equipment together.
Yep, and not forgetting that serial ports on a computer were (at the time) expensive, and the sounds most synths were capable of were.. kind of simple. So there was the motivation for stacking multiple synths up to produce bigger/richer sound, doing keyboard splits (possible on some hardware of the time but not most), as well as driving many devices from a single port.
Multi timbral synths (different sounds addressable per MIDI channel) were a later thing too, analog polysynths could play more than one voice, but very few could play more than maybe two different _sounds_ at once.
Polyphonic analog synths existed before MIDI. Notably, the Novachord from the late 30s. For the modern era, analog 2-8 note polyphony was available by the late 70s.
It was well before 2000. Most of my gear is 1990s vintage and while some has limited polyphony, most has unlimited polyphony and doesn't do note stealing.
Sustain pedal. Not sure how it’s implemented in midi, but that’s one way to have more than ten notes playing at once. (There’s also four-hand duets and the rare but not non-existent play two adjacent white keys with one finger technique.)
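On the "how it's implemented" part: the sustain pedal is normally sent as Control Change number 64, with values of 64 and above meaning "down". Below is a minimal sketch of a receiver that keeps notes ringing while the pedal is held; the class and attribute names are made up for illustration.

```python
SUSTAIN_CC = 64   # Control Change number conventionally used for the damper pedal

class SustainTracker:
    def __init__(self):
        self.pedal_down = False
        self.held = set()       # keys physically held right now
        self.sounding = set()   # notes still ringing (held or sustained)

    def handle(self, status: int, data1: int, data2: int) -> None:
        kind = status & 0xF0
        if kind == 0x90 and data2 > 0:                        # note on
            self.held.add(data1)
            self.sounding.add(data1)
        elif kind == 0x80 or (kind == 0x90 and data2 == 0):   # note off
            self.held.discard(data1)
            if not self.pedal_down:
                self.sounding.discard(data1)
        elif kind == 0xB0 and data1 == SUSTAIN_CC:            # pedal change
            self.pedal_down = data2 >= 64
            if not self.pedal_down:
                self.sounding &= self.held   # release everything no longer held

t = SustainTracker()
t.handle(0xB0, SUSTAIN_CC, 127)      # pedal down
for note in range(48, 60):           # tap twelve keys one after another
    t.handle(0x90, note, 100)
    t.handle(0x80, note, 0)
print(len(t.sounding))               # twelve notes ringing, zero keys held
```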
Wow, this is a whole thing I had no idea about. Was recently looking into what it would take to fuzz MIDI and while I found some resources to generate a .mid file to this end it wasn't exactly what I was hunting for. This is maybe something I should consider exploring instead, thanks!
of course webmidi must support sysEx, it's essential to work with midi2 at all. but mostly because you must parse the weird sized packet to properly ignore it.
what might show up are chrome specific sysEx messages, which could then lead to exploits like the one in the article.
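For what "parse it to properly ignore it" looks like in practice, here is a simplified sketch that drops SysEx from a raw byte stream. It assumes the usual framing (0xF0 starts, 0xF7 ends, real-time bytes 0xF8 and above may be interleaved, any other status byte aborts the message) and does not handle running status or anything MIDI 2.0 specific.

```python
def strip_sysex(stream: bytes) -> bytes:
    """Return the stream with all System Exclusive messages removed."""
    out = bytearray()
    in_sysex = False
    for b in stream:
        if b >= 0xF8:
            out.append(b)          # real-time: single byte, allowed mid-SysEx
            continue
        if in_sysex:
            if b < 0x80:
                continue           # SysEx payload byte: skip it
            in_sysex = False       # any status byte ends the message
            if b == 0xF7:
                continue           # normal terminator
        if b == 0xF0:
            in_sysex = True        # start of a new exclusive message
        elif b != 0xF7:            # drop a stray terminator
            out.append(b)
    return bytes(out)

raw = bytes([0x90, 0x3C, 0x64, 0xF0, 0x7D, 0x01, 0xF8, 0x02, 0xF7, 0x80, 0x3C, 0x00])
print(strip_sysex(raw).hex(" "))
```

The message has no length field, so the only way to skip it is to scan for the terminator, which is why even a receiver that ignores SysEx still has to parse it.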
I tried to make a bidirectional channel from a webpage to a python script over MIDI. I'd just found that with sysex you can pack any arbitrary data that you want, that python can create virtual MIDI devices, and that Chrome can then connect to such devices.
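A minimal sketch of the packing idea, since SysEx data bytes must keep their top bit clear. The layout here (each group of up to seven bytes preceded by one byte holding their top bits) is one common convention and is an assumption, as are the function names and the use of 0x7D as a placeholder manufacturer ID. Actually opening a virtual port would need a MIDI library and is left out.

```python
def encode_7bit(data: bytes) -> bytes:
    """Pack 8-bit data into 7-bit-safe bytes: one MSB byte per 7 data bytes."""
    out = bytearray()
    for i in range(0, len(data), 7):
        chunk = data[i:i + 7]
        msbs = 0
        for j, b in enumerate(chunk):
            msbs |= (b >> 7) << j          # collect the top bit of each byte
        out.append(msbs)
        out.extend(b & 0x7F for b in chunk)
    return bytes(out)

def decode_7bit(packed: bytes) -> bytes:
    """Inverse of encode_7bit."""
    out = bytearray()
    for i in range(0, len(packed), 8):
        msbs, chunk = packed[i], packed[i + 1:i + 8]
        out.extend(b | (((msbs >> j) & 1) << 7) for j, b in enumerate(chunk))
    return bytes(out)

def wrap_sysex(data: bytes, manufacturer_id: int = 0x7D) -> bytes:
    """Frame arbitrary data as a single SysEx message."""
    return bytes([0xF0, manufacturer_id]) + encode_7bit(data) + bytes([0xF7])

message = wrap_sysex("héllo".encode("utf-8"))
print(message.hex(" "))
```

The overhead is one extra byte per seven, so this is considerably denser than sending one nibble per byte.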
While I suggest reading the whole thing, the money-quotes:
> So yeah, these [keyboard manufacturer] madlads made a shell that runs on top of MIDI SysEx messages on top of USB.
> [T]he most interesting commands that we have are arbitrary memory read/write commands. So, if we really wanted to, we could just peek and poke the memory of the synth via MIDI.
> If we wanted to, we could write these messages to a MIDI file and play it on the synth like any other MIDI file. Hey, that gives me an idea.....
> From the countless sleepless nights of digging around in the firmware I’ve discovered a function that sends arbitrary data to the LCD controller.
P.S.: Now the real question is whether you can change the running code on the keyboard so that it tries to infect other keyboards (of the same model) that might receive MIDI data originating from it.
In a way, this is a peek at the nightmare of Internet of Things (IoT, where the S stands for Security.) Almost any device might have a backdoor in it, and it might even be a stupid backdoor, like #0000.
> > [T]he most interesting commands that we have are arbitrary memory read/write commands. So, if we really wanted to, we could just peek and poke the memory of the synth via MIDI.
This sounds easy but with SysEx having no delivery guarantees, and no sense of connection/session it can be frustrating. Totally normal to get "packet loss".
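One way to cope is to add your own sequence number and checksum inside each message, so the receiver can at least detect drops and corruption. The sketch below is an illustrative framing, not any device's real protocol: the field layout, the 0x7D placeholder ID, and the function names are all assumptions, and the checksum is the simple "everything sums to zero mod 128" style. Payload bytes are assumed to already be 7-bit safe.

```python
def checksum7(body: bytes) -> int:
    """7-bit checksum chosen so that (sum(body) + checksum) % 128 == 0."""
    return (128 - sum(body) % 128) % 128

def frame(seq: int, payload: bytes) -> bytes:
    """Wrap a 7-bit-safe payload with a sequence number and checksum."""
    body = bytes([seq & 0x7F]) + payload
    return bytes([0xF0, 0x7D]) + body + bytes([checksum7(body), 0xF7])

def parse(msg: bytes):
    """Return (seq, payload), or None if the frame is malformed or corrupted."""
    if len(msg) < 5 or msg[0] != 0xF0 or msg[1] != 0x7D or msg[-1] != 0xF7:
        return None
    body, check = msg[2:-2], msg[-2]
    if (sum(body) + check) % 128 != 0:
        return None
    return body[0], bytes(body[1:])

def missing(seqs):
    """Sequence numbers skipped between consecutive received frames (mod 128)."""
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        gaps.extend((prev + k) % 128 for k in range(1, (cur - prev) % 128))
    return gaps

print(parse(frame(5, bytes([1, 2, 3]))))
print(missing([0, 1, 3, 4]))
```

Detecting a gap is only half the job; the sender still needs some way to be asked for a resend, which means a reply channel and timeouts on top of this.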
On the off chance any of the HN crowd is in Armenia: Porta will be giving a talk about this on January 10th at the Hacker Embassy. You should definitely come: https://t.me/hackerembassy/17
My bad! I added *.bin to .gitignore last minute to exclude assembled code snippets, but looks like the dumps were excluded as well. I'm going to upload them in the next few hours
As someone who's reversed some basic MIDI stuff in an old video game and has always wanted to get into hardware hacking, I really enjoyed reading this article. Great work!
Do you not have DNS-over-HTTPS/TLS configured, or are you on some weird browser that doesn't support ECH [0]? I wonder how they blocked it if those are in place (unless you use the ISP's DNS), since the site itself is served by Cloudflare from wherever close to you, not from Russia (and so the actual SNI isn't leaked via the ClientHelloOuter field either).
For an ISP? Absolutely not. I get it for corporate firewalls, but I absolutely don't want my ISP blocking a country. It does absolutely nothing for security (domains aren't hard to buy) and I'm very happy with my ISP being a "dumb pipe". If I wanted more than that I'd use my firewall or something
> Now, we have to get a little philosophical here. In my eyes, RE is like a game of minesweeper. You start with an empty field not knowing the state of any of the cells, i.e. not knowing whether each individual cell contains a landmine or not. When you discover the state of a cell, you have the context to deduce the state of its neighbor cells. In minesweeper, you don’t have a particular direction in which you progress. You never say “In this game of minesweeper, I want to go up no matter what”, you just let the numbers nudge you in the direction that is the easiest to go in at the moment. I assert that this is also true for RE. Once you find out what a function or a variable does, you suddenly understand a little more about functions and variables that depend on the ones whose meaning you’ve just inferred. It may be beneficial not to set any particular goal with an RE project, and instead letting the complex network of intertwined functions and variables guide you towards understanding the system as a whole.
That's such a nice way to think about it. Maybe I should try giving RE a go again.
I don’t know RE but I love this sentiment. I think it’s quite generalizable too. So many things are like that. Just start somewhere. It really doesn’t matter where. And what you find will guide your next steps. Eventually you’ll have enough context to see a much bigger picture well before it’s fully revealed.
The one difference is that in RE you sometimes go up no matter what, because there isn't enough information. (In minesweeper I typically pick a few squares early, because without them the game probably won't be solvable anyway. Typically, once you have enough of the game solved you have it all; once in a while a couple of squares are unknowable.) In RE you more often hit points where you have learned everything the clues can tell you, and so have to try something at random. OTOH it is much rarer in RE for a wrong try to be catastrophic (not unheard of, but rare).
You mention "another packing optimization". I'm wondering, how are you transferring frames? The dot matrix is eight 7x5 characters, i.e. 280 bits in total, which amounts to 40 7-bit groups per frame. You seem to be using twice that space in transmission, is it wasted on some control data or is the transmission just slightly suboptimal?
The dot matrix is actually eight 5x8 characters, or 320 bits in total. I'm packing those 320 bits into the 4 bits per byte that are available to us in this shell protocol. Plus, another 9 bytes for the packet header and footer. Looks like I wrote 92 in the article, I must have miscalculated that.
I'm not using the full 7 bits because figuring out a way to do so turned out to be way too hard for me, so I opted for a solution that is negligibly worse than the optimal one, in comparison to the original one.
If you're wondering about the exact algorithm, consider checking these files out, but please keep in mind that I haven't cleaned the code up yet: https://github.com/portasynthinca3/swl01u/blob/master/fun/bi..., https://github.com/portasynthinca3/swl01u/blob/master/fun/ba...
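For anyone who just wants the gist without reading the repo, here is a rough sketch of the nibble packing and the byte count. It is not the code from the linked files: the header and footer contents are placeholders (only their combined length of 9 bytes is stated above), and the high-nibble-first order and function names are assumptions.

```python
FRAME_BITS = 8 * 5 * 8      # eight 5x8 characters = 320 bits
HEADER = bytes(5)           # placeholder: the real header bytes are not given here
FOOTER = bytes(4)           # placeholder: header + footer total 9 bytes

def pack_nibbles(frame: bytes) -> bytes:
    """Split each byte into two 4-bit values (high nibble first, by assumption)."""
    out = bytearray()
    for b in frame:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return bytes(out)

def unpack_nibbles(payload: bytes) -> bytes:
    """Rejoin pairs of 4-bit values into bytes."""
    return bytes((hi << 4) | lo for hi, lo in zip(payload[0::2], payload[1::2]))

frame = bytes(range(FRAME_BITS // 8))   # 40 bytes of dummy pixel data
payload = pack_nibbles(frame)
packet = HEADER + payload + FOOTER
print(len(payload), len(packet))        # 80 89
```

That gives 80 payload bytes plus 9 of framing, so 89 per frame. Using all 7 bits per byte would need ceil(320 / 7) = 46 payload bytes instead of 80.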
Lol… where does that leave the rest of us in comparison
[0] https://www.usenix.org/conference/usenixsecurity17/technical...
There's been MIDI shell code for well over 20 years on pretty much all major platforms: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=midi
So to test this, surely they tried it once, and presumably Metasploit users generated tons of them.
And from having worked in this space a long time ago, such CVEs would trigger a host of malicious midi files looking for holes, especially since people embedded them in webpages around early 2000s.
Around that time I'd routinely take a MS update, diff the DLLs, reverse interesting location changes, and craft shellcode attacks during training to show people how it's not very hard. And there were tons of people across the spectrum able to do similarly. CVE disclosures made them much easier to develop.
Cue Frankie Goes to Hollywood's "Relax" triggering Derek Zoolander to kill the Malaysian leader during fashion week
Though Behringer is still quite good with it. E.g. their Deepmind can be pretty much 100% programmed with it, on top of already good MIDI CC scope.
I really like this comparison!
I'm sure I had more code than what's in my 8yo repo, but the premise is simply https://github.com/prashnts/midipacks/blob/master/midipacks/...
I'm imagining dubstep would be the result
There's an embedded YouTube video in the article as well, that appears twice. First at the top and then again further down.
https://www.youtube.com/watch?v=u6sukVMijBg
There are also several videos in the article that are hosted on the same site as the original article (so on the .ru site). Those are not included in the snapshot, unfortunately. You'll see placeholders and the associated text that describes them, but you can't view those via the snapshot.
It also contains a link to a GitHub repo at the end.
https://github.com/portasynthinca3/swl01u
[0] you can test these by using https://www.cloudflare.com/ssl/encrypted-sni/
I will tell you how: derangement, a failure to engage with society and the degeneration/depravity of being one step away from making MIDI component improvised explosives and emailing your remaining friends a manifesto.
Ah just kidding, this is pretty sweet.