Ask HN: Hardware for 1k RPS?

I ran an uncensored model on a CPU server. As expected, it's dead slow (a minute or two per query).

What kind of hardware (GPU) do I need to serve 1k RPS?

I couldn't find any APIs for uncensored models, which kind of forced me to run one locally.

5 points | by gsky 1 day ago

2 comments

  • eddythompson80 1 day ago
    Depends on your model size and how many copies of it can fit in memory. Multiply the model size by 1k and divide by the memory capacity of the hardware for a rough ballpark of how many devices you'd need.
  • barnabee 1 day ago
    https://venice.ai claim to offer uncensored models (I’ve not tested that claim)
    • gsky 1 day ago
      Thanks, I'll give it a try.
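
For anyone who wants to plug in their own numbers, here is a minimal Python sketch of eddythompson80's rule of thumb. It takes the heuristic exactly as stated (one full copy of the model per in-flight request), so it is only a rough ballpark. The function name and the example figures (a 14 GB model, an 80 GB GPU) are hypothetical and not from the thread.

```python
import math


def devices_needed(model_size_gb: float, concurrent_requests: int, device_memory_gb: float) -> int:
    """Rough ballpark: (model size x concurrent requests) / memory per device.

    Assumes one full model copy per in-flight request, as in the rule of
    thumb above. Hypothetical helper, not a real sizing tool.
    """
    total_memory_gb = model_size_gb * concurrent_requests
    # Round up: a fraction of a device still means one more device.
    return math.ceil(total_memory_gb / device_memory_gb)


# Hypothetical example: 14 GB model, 1k concurrent requests, 80 GB per GPU.
print(devices_needed(14, 1000, 80))
```

Swap in your actual model size and per-GPU memory to get your own estimate.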