learned elixir in a week for an interview. didn’t clear it, but that week changed how i write code. understood state isolation for the first time. no shared data. fail and restart clean. pattern matching everywhere. structs over classes. pipes for everything. after that, i started writing code top-down. move side effects out. keep logic close to the data. elixir kinda rewired that for me.
after seeing this i saw that same mindset. not flashing any big genservers. simplified with fast procs, raw ETS tables. simple flow, but still fault aware. still clean.
I really wished to see an OTP-first design. Unfortunately for me, the code is almost procedural as it's touching ETS or Application, which is built on ETS, in nearly every operation.
If the author wishes to learn how to design services in Elixir, or any BEAM language, with OTP, they can take a look at "Designing Elixir Systems with OTP" by James Edward Gray and Bruce Tate, and "Functional Web Development with Elixir, OTP, and Phoenix" by Lance Halvorsen.
On my first try I did write it in a more OTP-y style but the scaling potential for this very specific flow is just not the same. In the end a torrent tracker is just a specialized database and handling the data as fast as possible is the top objective.
That said I'll give the books a go.
If, like me, you don't know what OTP means in this context, here it is:
OTP stands for Open Telecom Platform, although it's not that much about telecom anymore (it's more about software that has the property of telecom applications, but yeah.) If half of Erlang's greatness comes from its concurrency and distribution and the other half comes from its error handling capabilities, then the OTP framework is the third half of it.
I'll second this, just using the built in Logger [0] and Telemetry [1] applications would be fine, opentelemetry or anything else can be added to the telemetry hooks easily to export the metrics later.
DHT (Distributed Hash Table) and PEX (Peer Exchange) let torrent clients find peers without centralised trackers. Hence, you don't need a central place / public tracker anymore
Yes, they just don't track the individual torrents anymore. They only play a role during the initial peer discovery stage (bootstrapping). Peers find torrent swarms on their own, the bootstrap servers are excluded from all that.
If they listen on a well known port, and there are millions, send out a few thousand probes to 'random' IPv4 addresses and you'll most likely find one.
If you get and keep a list of bootstrap nodes when you find one, then you can randomly select from the bootstrap addresses rather than from all routable IPv4 addresses.
What do you mean exactly? If you need a notification engine, reaching for a pubsub implementation is very easy with phoenix’s popularity and quite battle tested. I’ve implemented notifications at scale a few times in the ecosystem. What problems are you encountering that you don’t feel you have a tool in the shed to work with in this case?
Interesting! I'd done something similar in Typescript to learn more about BT, and then redid it in rust to learn rust (https://github.com/ckcr4lyf/kiryuu).
However I decided to just use redis as the DB. It sounds like your entire DB is in memory? Any interesting design decisions you made and/or problems faced in doing so?
(My redis solution isn't great since it does not randomize peers in subsequent announces afaik)
in my case using the in-memory ETS has been the best decision. it lets me read and write the peers' data concurrently, each on its own process, so contention and latency are minimal. the only sequential part is when a new swarm is initially created, but that doesn't happen a lot so it's fine.
there's sadly no native support for taking random rows directly from the tables, so for now i grab the whole swarm and then take a random subset (https://github.com/Dahrkael/ExTracker/blob/master/lib/ex_tra...)
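To make the "grab the whole swarm, then take a random subset" approach concrete, here is a minimal sketch. It is not the ExTracker code: the `SwarmSketch` module, the one-table-per-swarm layout, the tuple shape, and the table options are all assumptions made for illustration.

```elixir
defmodule SwarmSketch do
  # Hypothetical layout: one public ETS table per swarm, keyed by peer id.
  def new_swarm do
    :ets.new(:swarm, [:set, :public, read_concurrency: true, write_concurrency: true])
  end

  # Any process can insert because the table is :public.
  def announce(table, peer_id, ip, port) do
    :ets.insert(table, {peer_id, ip, port})
  end

  # Copies the whole swarm out of ETS, then samples it.
  # Simple, but the cost grows with the size of the swarm.
  def random_peers(table, count) do
    table
    |> :ets.tab2list()
    |> Enum.take_random(count)
  end
end

swarm = SwarmSketch.new_swarm()

# Simulate 50 peers announcing from separate processes.
1..50
|> Task.async_stream(fn n ->
  SwarmSketch.announce(swarm, "peer-#{n}", {10, 0, 0, n}, 6881)
end)
|> Stream.run()

peers = SwarmSketch.random_peers(swarm, 5)
IO.inspect(peers, label: "random subset")
```

The trade-off is visible in `random_peers/2`: every announce response pays for copying the full swarm, which is fine for small swarms and is the part a slot-based trick would try to avoid.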
I don't remember if there's a way to see how many slots an ets table has, but if you're ok with imperfect distribution, you could maybe pick a slot at random and use ets:slot/2 to get all the items in that slot, then select from those.
You might be able to get the slot count from ets:info(Table, stats), although that's not intended for production use, so the format may change without notice.
For small trackers opentracker is probably faster and uses a bit less memory.
Where extracker is gonna shine compared to it is when core count starts having 2 digits.
I still have to do a proper benchmark though.
I started because I needed a tracker for another project but the tracker turned out to be more fun to make.
I did glance over other trackers' code but their code tends to be either overly complex or too simple so not very useful.
So far it's been 3 months of revenge bedtime procrastination.
While this is not a client like qbittorrent I have ideas for a seedbox-oriented client project in the future.
A torrent tracker is basically the world’s most antisocial matchmaking service that knows who has what files but refuses to actually store anything itself, like that friend who always knows where the party is but never hosts one. When your BitTorrent client asks “hey who’s got that Linux ISO,” the tracker dumps a list of IP addresses faster than a startup pivoting after their Series A falls through. Your client then connects to these strangers (seeders with complete files and leechers still downloading) and starts exchanging data while the tracker pretends nothing happened. It’s like Tinder but for file sharing, except everyone’s anonymous and probably downloading something weird at 3am.
The "paused" event is part of BEP 21. Clients send it to the tracker to let it know that the client is still incomplete, but won't download anymore. For example, because a user only wants some files from the torrent.
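For illustration, accepting the extra event could be as small as one more function head. This is a sketch only: `EventSketch` and its return shapes are hypothetical, not how the project actually parses announces.

```elixir
defmodule EventSketch do
  # Hypothetical parser for the announce "event" parameter.
  # A missing or empty event means a regular periodic announce.
  def parse_event(nil), do: {:ok, :none}
  def parse_event(""), do: {:ok, :none}

  # Events from the base BitTorrent protocol.
  def parse_event("started"), do: {:ok, :started}
  def parse_event("stopped"), do: {:ok, :stopped}
  def parse_event("completed"), do: {:ok, :completed}

  # BEP 21 addition: the peer is incomplete but no longer downloading.
  def parse_event("paused"), do: {:ok, :paused}

  # Anything else is rejected instead of crashing the handler.
  def parse_event(other), do: {:error, {:invalid_event, other}}
end

IO.inspect(EventSketch.parse_event("paused"))
IO.inspect(EventSketch.parse_event("bogus"))
```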
Readme of the project shows that support for BEP 21 is not implemented.
Telemetry for the HTTP side is in my ToDo list yes, since I'm using a 3rd party library for the webserver I still need to figure out how to do it right.
For HTTPS to work you need to provide a valid certificate path in :https_keyfile but right now I would recommend sticking Caddy or Nginx in front of the tracker if you want HTTPS.
I have certbot integration planned but it is not a priority since most of the torrent peers use UDP.
I feel like ETS has been the real killer feature to pull this off, being able to concurrently read and write from protected tables makes the whole thing incredibly parallel.
There's something about C++ developers that makes them love Go and Elixir (and I include myself in this demographic). I think it's something about the people who are attracted to C++ for performance are attracted to Go/Elixir for its multithreaded performance. Really cool project
Not sure about C++ devs, but Erlang/Elixir are great for parsing protocols, thanks to their pattern matching. It also makes the code much cleaner, because pattern matching eliminates most branching and thus the depth of the code base.
The let it crash philosophy allows you to ignore most corner cases with the knowledge that, if they are encountered or a cosmic ray flips a bit, the crash is localised to a single client. I have worked with Elixir for almost a decade at this point, and I have never seen an unexpected downtime of the apps I deployed. Aside from maintenance and updates, they all have 100% uptime. How cool is that?
This is how I sell it to clients. “Will you be using Python, Go?” Me: “What about Elixir and the promise that your service won’t ever crash? And you get cool dashboards with it.” Them: “Sold.”
I wish there was a systems language that allows you to pattern match on structs and enums, and in function signatures like Elixir
Indeed. When your daily job is tracking down memory stomps, deadlocks, invalid pointers and unexpected state in very big codebases, then using Elixir feels like "why is this so easy? it just works?". Also I'm a network programmer, so the binary pattern matching is very much appreciated.
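As a small taste of binary pattern matching on a network packet, here is a sketch that recognises the UDP tracker connect request (a 64-bit protocol id of 0x41727101980, a 32-bit action of 0, and a 32-bit transaction id, all big-endian). The `UdpSketch` module and its return values are made up for the example.

```elixir
defmodule UdpSketch do
  # Connect request: 64-bit magic protocol id, 32-bit action (0 = connect),
  # 32-bit transaction id chosen by the client. Integers in a binary
  # pattern are unsigned big-endian by default, which matches the wire format.
  def parse(<<0x41727101980::64, 0::32, transaction_id::32>>) do
    {:connect, transaction_id}
  end

  # Wrong size, wrong magic or unknown action: no manual length checks needed.
  def parse(_other), do: :error
end

packet = <<0x41727101980::64, 0::32, 12345::32>>

IO.inspect(UdpSketch.parse(packet))
IO.inspect(UdpSketch.parse(<<1, 2, 3>>))
```

The size check, the magic number check and the field extraction all happen in the function head, which is the "less branching" effect mentioned above.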
It's not though. Processes can be supervised and crashes can just lead to "restart with good state" behavior. It's not that you don't try handling any errors at all, you just can be confident that anything you missed won't bring the system down.
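A minimal sketch of "restart with good state", assuming a hypothetical `CounterSketch` agent under a one_for_one supervisor. The polling loop exists only so the script can observe the restart; real code would not normally need it.

```elixir
defmodule CounterSketch do
  use Agent

  # Every (re)start begins from a known good state: 0.
  def start_link(_opts) do
    Agent.start_link(fn -> 0 end, name: __MODULE__)
  end

  def increment, do: Agent.update(__MODULE__, &(&1 + 1))
  def value, do: Agent.get(__MODULE__, & &1)
end

{:ok, _sup} = Supervisor.start_link([CounterSketch], strategy: :one_for_one)

CounterSketch.increment()
old_pid = Process.whereis(CounterSketch)

# Simulate an unexpected crash of the worker.
Process.exit(old_pid, :kill)

# Poll until the supervisor has registered a fresh process under the same name.
wait_for_restart = fn wait_for_restart ->
  case Process.whereis(CounterSketch) do
    pid when is_pid(pid) and pid != old_pid ->
      pid

    _ ->
      Process.sleep(10)
      wait_for_restart.(wait_for_restart)
  end
end

new_pid = wait_for_restart.(wait_for_restart)

# The restarted process is back at its initial state.
IO.inspect(CounterSketch.value(), label: "value after restart")
```

Note that the increment made before the crash is gone: the supervisor brings the process back, it does not bring back state you never persisted.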
And Elixir is strongly typed by most definitions. Perhaps you mean static?
You can be more confident. But remember that time an Ericsson switch crashed upon handling a message that it sends to adjacent switches every time it restarts? That crashed the whole network, and you could still do that in Erlang.
Trackers are not relics - they're used exclusively in private tracker websites. Public-access torrents would more commonly use DHT and PEX for discovery.
https://learnyousomeerlang.com/what-is-otp
ETS is built into OTP, so how is using ETS not "OTP-first"? What's wrong with using ETS? It's just an in-memory store.
I looked through the code and didn't find it to be anywhere close to procedural in style.
[0] https://hexdocs.pm/logger/1.18.4/Logger.html [1] https://hexdocs.pm/telemetry/readme.html
- https://en.m.wikipedia.org/wiki/Peer_exchange
I wrote a basic tracker in Elixir a few years ago, here's the code: https://github.com/aalin/mr_torrent
Also my console gets spammed with:
04:43:20.160 [warning] invalid 'event' parameter: size: 6 value: "paused"
but it seems to work. I would've liked to see HTTP stats too but I guess UDP is fine (though I have it disabled)
Ah, missed that.
This is such a dangerous take. Also Elixir is not strongly typed, so...