10 comments

  • furyofantares 15 days ago
    I don't really get who these are for - do people use them in their projects?

    I don't find success just using a prompt against some other model without having some way to evaluate it and usually updating it for that model.

    • vatican_banker 15 days ago
      > Trained routers are provided out of the box, which we have shown to reduce costs by up to 85%

      The answer is here. This is a cost-saving tool.

      All companies and their moms want to be in the GenAI game but have strict budgets. Tools like this help to keep GenAI projects within budget.

    • rodrigobahiense 14 days ago
      For the company I work for, one of the most important aspects is ensuring we can fall back to different models in case of content filtering, since they are not equally sensitive/restrictive.
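A minimal sketch of the fallback pattern described above, assuming each provider is wrapped in a plain function. `ContentFiltered`, `strict_model`, and `lenient_model` are hypothetical stand-ins, not part of any real SDK; a real wrapper would translate each provider's own refusal signal into the exception.

```python
from typing import Callable, Optional, Sequence


class ContentFiltered(Exception):
    """Hypothetical error raised by a client wrapper when a provider refuses a prompt."""


def complete_with_fallback(prompt: str, models: Sequence[Callable[[str], str]]) -> str:
    """Try each model in order and return the first answer that is not filtered."""
    last_error: Optional[ContentFiltered] = None
    for call_model in models:
        try:
            return call_model(prompt)
        except ContentFiltered as err:
            last_error = err  # this provider refused; move on to the next one
    raise RuntimeError("every model filtered the prompt") from last_error


# Stub providers for illustration only.
def strict_model(prompt: str) -> str:
    raise ContentFiltered("blocked by provider policy")


def lenient_model(prompt: str) -> str:
    return f"lenient: {prompt}"
```

Calling `complete_with_fallback("hi", [strict_model, lenient_model])` skips the refusing stub and returns the second stub's answer.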
    • monarchwadia 14 days ago
      I think a lot of people are just interested in hitting the LLM without any bells or whistles, from TypeScript. A low-level connector lib would come in handy, yeah? https://github.com/monarchwadia/ragged
    • brandall10 15 days ago
      You may have a variety of model types/sizes, fine tunes, etc, that serve different purposes - optimizing for cost/speed/specificity of task. At least that's the general theory with routing. This one only seems to optimize for cost/quality.
    • veb 15 days ago
      From what I understand, it's for people who use, say, Claude in their workflows but keep hitting the rate limits. They have to wait once Claude says "you have 10 messages left until 9pm", so when they hit the limit (or just before), they switch manually to (maybe) ChatGPT.

      With the router thingy, it keeps a record, so with every query you know where you stand, and it can switch to another model automatically instead of interrupting your workflow?

      I may be explaining this very badly, but I think that's one use-case for how these LLM Routers help.
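One way to picture the quota-tracking use case above is the sketch below. It is an illustration only: `Quota` and `QuotaRouter` are hypothetical names, the message counts are made up, and a real router would also reset quotas when the rate-limit window rolls over.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Quota:
    name: str       # model or provider label
    remaining: int  # messages left in the current rate-limit window


class QuotaRouter:
    """Send each query to the first preferred model that still has quota."""

    def __init__(self, quotas: Iterable[Quota]) -> None:
        self.quotas = list(quotas)  # kept in order of preference

    def pick(self) -> str:
        for quota in self.quotas:
            if quota.remaining > 0:
                quota.remaining -= 1  # record the message we are about to send
                return quota.name
        raise RuntimeError("all models are rate limited")
```

With `QuotaRouter([Quota("primary", 2), Quota("backup", 1)])`, the first two calls to `pick()` go to `primary` and the third switches to `backup` without any manual step.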

      • PiRho3141 15 days ago
        This is for applications that use LLMs or ChatGPT via API.
  • worstspotgain 15 days ago
    I like their "LLM isovalue" graph, and the idea that different vendors can be forced to partake in the same synergy/distillation scheme. Vendors dislike these schemes, but they're probably OK with them as long as they're niche.
  • tananaev 15 days ago
    The problem is that to understand how complex the request is, you have to use a smart enough model.
    • Grimblewald 14 days ago
      Not true at all. You could have an efficient, cheap model which is generally terrible at most things but has a savant-like capacity for categorizing tasks by requirement and difficulty. It's even easier when you don't need to support multiple languages and a truly staggering breadth of domains, like a conventional LLM does. You could train a really small model to reject out-of-domain requests and partition the rest, running at a fraction of the cost of a more capable model.
    • ethegwo 15 days ago
      The weak-to-strong assumption is that it is easier to evaluate the result of a task than to generate it. If that is wrong, humans cannot create an intelligence stronger than ourselves.
    • PiRho3141 15 days ago
      Not true. You can easily train a BERT single class classification model without having to train an LLM.
    • CuriouslyC 15 days ago
      You can distill evaluation ability
  • vatican_banker 15 days ago
    The tool currently allows only one set of strong and weak models.

    It’d be really good to allow more than two models and to switch between them dynamically based on multiple constraints like latency, reasoning complexity, costs, etc.
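A rough sketch of what multi-constraint selection could look like, not a feature of the tool under discussion. `ModelProfile`, `choose_model`, and all fields are hypothetical, and the quality scale, prices, and latencies would have to come from your own measurements.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelProfile:
    name: str
    quality: float             # 0..1, higher is better (assumed scale)
    cost_per_1k_tokens: float  # USD (assumed unit)
    p50_latency_ms: int        # median latency (assumed metric)


def choose_model(
    models: Sequence[ModelProfile], min_quality: float, max_latency_ms: int
) -> ModelProfile:
    """Return the cheapest model that meets the quality and latency constraints."""
    eligible = [
        m for m in models
        if m.quality >= min_quality and m.p50_latency_ms <= max_latency_ms
    ]
    if not eligible:
        raise ValueError("no model satisfies the constraints")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)
```

The design choice here is to treat quality and latency as hard constraints and cost as the objective; a weighted score over all three would be an equally reasonable alternative.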

    • voiper1 14 days ago
      I think unify.ai (like openrouter) does that - it has several parameters you can choose from.

      But the underlying "how to choose a model that's smart enough but not too smart" seems difficult to understand.

    • KTibow 15 days ago
      Some of that is already possible, since it can generate a difficulty score for a prompt that could be manually mapped between models based on ranges.
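The manual range mapping mentioned above could be sketched as follows, assuming the difficulty score is a float in [0, 1]. The thresholds and the model labels are made up for illustration and would need tuning against real traffic.

```python
from bisect import bisect_right

# Hypothetical cut points: scores below 0.3 go to "small",
# scores from 0.3 up to (but not including) 0.7 go to "medium", the rest to "large".
THRESHOLDS = [0.3, 0.7]
MODELS = ["small", "medium", "large"]


def model_for_score(score: float) -> str:
    """Map a difficulty score in [0, 1] to a model tier by range lookup."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # bisect_right counts how many thresholds are <= score, which is the tier index.
    return MODELS[bisect_right(THRESHOLDS, score)]
```

Adding a fourth tier only requires one more threshold and one more label, as long as `MODELS` stays exactly one entry longer than `THRESHOLDS`.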
  • fazmi 14 days ago
    Poppt