10 comments

  • jamespwilliams 11 days ago
    Fun weekend project but definitely not production-ready (no tests, no error handling, concurrent requests will cause a race condition, etc.). If readers are looking for something production-ready to use, consider https://github.com/go-redis/redis_rate (which implements GCRA/leaky bucket), or https://github.com/ulule/limiter (which uses a much simpler algorithm, but has good middleware).
  • sethammons 11 days ago
    > It is capabale to handle distributed workload with its redis database support

    sounds like this is limited by redis. For many organizations, this is fine. At my last gig, we used redis for deduplication and it required over 150 redis nodes with deterministic routing and consensus. Redis reportedly could support 200k rps per node, but in our case, we wouldn't see it get past around 50k rps no matter how we tuned it.

    An interesting addition to this library would be to use an interface for the backing datastore, letting teams choose redis, zookeeper, an in-mem Go instance of the same library, sql, etc.

    A fun exercise would be to figure out how to make the rate limiting itself distributed so you don't need a single store keeping everything in sync. Maybe a combo of deterministic request routing and a ring topology.

    • dasubhajit 11 days ago
      > An interesting addition to this library would be to use an interface and allow your backing datastore of choice allowing teams to use redis, zookeeper, an in-mem Go instance of the same library, sql, etc.

      Thanks for the feedback. I'm gonna implement an in-mem Go instance for local dev, but not sure if that will be enough to use in prod. Also, in the next release, I will make redis optional.

    • dist1ll 11 days ago
      RPS meaning reads or writes? What's the distribution of message sizes, and how large is your total dataset? What specs (core count, NIC) did each node have?

      I'm asking because without this info, RPS is not a particularly useful metric. As an extreme example, if your dataset is <1MB, you could likely serve read-heavy requests from your SmartNIC's dcache at close to line rate.

      • sethammons 11 days ago
        It's been a number of years since I worked on it. I can try to answer your questions. The calls were nearly entirely INCR calls against keys that were typically around 150 bytes long and app logic was based on the return values. I believe each node took up around 20GB of memory when saturated. Redis uses a single CPU, so core count shouldn't matter, no? I'm not entirely sure what our NIC situation was, but it was tuned to handle gigabit traffic.
        • brazzledazzle 11 days ago
          Did you tune the file descriptor limits?
          • sethammons 11 days ago
            yeah, good call out. We did. One of the things a lot of folks stub their toe on when starting to scale up
  • b1-88er 11 days ago
    Given most backends use round robin for load balancing, having an in-memory counter should be enough. Removing redis as a downstream dependency is a big win.

    For the redis implementation, there should be a fallback to in-memory counting instead of blocking altogether. Currently redis is a SPOF for the entire service.

    • maerF0x0 11 days ago
      if you're round robining clients w/o sticky assignment then you're going to get nodes*limit consumption. Not correct.

      Also if you give limit/nodes per node and randomly assign connections, you get correct answers on average, but a really janky pattern at the edges (a user gets a 429, retries and succeeds, then gets a 429 again as they consume those last few requests).

      • b1-88er 11 days ago
        > if you're round robining clients w/o sticky assignment then you're going to get nodes*limit consumption. Not correct.

        Fair point, using in-mem storage changes the meaning of the limit, since accounting becomes local. Something to consider in the library API.

    • dasubhajit 11 days ago
      > Given most of the backends use round robin for loadbalancing, having in-memory counter should be enough. Removing redis as downstream dependency is a big win.

      Thanks for the feedback. Planning to make redis optional in the next release.

  • nikolayasdf123 11 days ago
    1. some tests, over the wire preferably, would be nice

    2. redis.go does not seem to be necessary; it just changes the signature of the redis client constructor without much difference. Might as well inline its contents.

    3. using fmt too much. If you don't need runtime variable formatting, you can do something simpler, like writing with w.Write([]byte) directly. fmt uses reflection and runtime type detection; better to avoid it if not needed.

    4. code comments do not follow godoc conventions. They should start with the symbol name. Did you run go fmt and some basic linters?

    5. the mutex is not used. And it should be a pointer.

    • blowski 11 days ago
      I really like the idea of getting code reviews from Hacker News for personal projects. There's so much knowledge and passion on here, it could be really useful. It would also help for me, as a bystander, to understand the context of these recommendations. I've done a bit of Go, and some of these sound useful to know.
      • peter_l_downs 11 days ago
        Selfishly, I'd love a documentation review if you (or anyone else) have the time to try out some golang projects I've been working on.

        The code is frankly not very polished and not worth reviewing, but I'm curious to hear feedback on the documentation / README and whether or not you clearly understand what these libraries are for and how to use them.

        * https://github.com/peterldowns/pgtestdb

        * https://github.com/peterldowns/testy

        * https://github.com/peterldowns/pgmigrate

        • blowski 10 days ago
          If that's not polished code, then I'm really curious what you consider to be polished!

          * It follows Go conventions and has helpful comments and a readme file.

          * Given it's not user facing prod code, I'm assuming high-performance isn't critical, and the nature of what it's doing means using `once.*` is worth the trade-off with concurrency.

          * It also makes sense not to actually test against a database, given that's the whole point of this package.

          * The code is well-structured, with "internal" code separated.

          I don't think I'm good enough at Go to be able to provide any more useful feedback than that.

          • peter_l_downs 6 days ago
            Thank you! Just saw this — kind of you to take the time, and I appreciate the compliments.

            > Given it's not user facing prod code, I'm assuming high-performance isn't critical, and the nature of what it's doing means using `once.*` is worth the trade-off with concurrency.

            I'm impressed you noticed this. I don't do a good job of explaining this in code comments or docstrings, but the linearization / contention is necessary for correctness. Basically, each time someone asks for a new testdb, the code needs to make sure the relevant user exists, and the relevant template exists, and has the migrations run on it. If these things don't exist, the test will need to create them. With many tests running in parallel, they need some way to contend and make sure that the user/template is only created once.

            Because golang runs the tests of separate packages as totally separate processes, the code does the contention with "advisory locks" inside the Postgres server. When two tests are contending on these advisory locks, they have to hold a connection open to the server. As a result, server speed and max simultaneous connections are the primary limiters of how many tests can be operating in parallel and how fast your test suite runs.

            I added the `once.*` helpers to move the contention "left" where possible, to the memory of the packages that are being tested. Within a package, tests can (and should) also run in parallel. The `once.*` helpers force the different tests to contend in the shared process memory, the theory being that it's faster and that it reduces the number of server queries/connections held open just waiting around on a server-side lock.

            I haven't actually tested the code with and without this, and thanks to your comment I will try to do this at some point!

  • dasubhajit 11 days ago
    a minimal rate limiting package for Golang microservices. Small weekend project; feedback and stars are appreciated.
    • citrin_ru 11 days ago
      To avoid race conditions with concurrent requests better to use Redis Lua. See this example: https://gist.github.com/ptarjan/e38f45f2dfe601419ca3af937fff...
      • pquerna 11 days ago
        For this general pattern implemented in Golang, check out redis_rate: https://github.com/ductone/redis_rate

        This fork also implements Redis Client Pipelining to check multiple limits at the same time, and Concurrency Limits.

    • maerF0x0 11 days ago
      Good for you doing a thing! Please understand the community is likely very wary of single maintainer, weekend project, high risk dependencies right now.
    • candiddevmike 11 days ago
      There aren't any tests...?

      If folks are looking for a serious rate limiting package for Go, limiter is a good start: https://github.com/ulule/limiter

      • yau8edq12i 11 days ago
        I'm sure that the people needing a "serious" solution would realize that a weekend project is probably not that.
        • rob74 11 days ago
          Still, the README says:

          > For production use, fork it and made changes based on your need.

          ...instead of the standard "this is work in progress/just a prototype/weekend project, don't use it in production!" disclaimer (which of course won't reliably stop people from using it in production either, but at least they can't say they haven't been adequately warned).

  • abrahms 11 days ago
    https://stanza.systems is a managed thing that offers this functionality (and a bit more) if y'all are looking for an escape valve away from redis as the coordination mechanism.
  • dasubhajit 10 days ago
    Thank you to everyone who provided me with suggestions to make it better. I got a lot of awesome feedback, and I will build/fix them. You guys are awesome.
  • kseistrup 9 days ago
    Fun fact: If my Hebrew is not too rusty, “goralim” means “destinies” in Hebrew.
  • maerF0x0 11 days ago