Show HN: ArchGW – An intelligent edge and service proxy for agents

(github.com)

86 points | by honorable_coder 1 day ago

4 comments

mutant 1 day ago
Huh, this is pretty dope. I tried this example https://github.com/katanemo/archgw/blob/main/demos/samples_p...
And was pleased with what I was able to do. Thanks
[-]
- sparacha 1 day ago
  That’s an example of what the edge component could do. Did you give the preference-based automatic routing a try?
  [-]
  - mutant 1 day ago
    No, but I've already put this at the top of my tinker pile. I'm sure I will soon
jufter 16 hours ago
Was going to ask how this integrates into Envoy but dug into the code it looks like proxywasm which must mean `envoy.bootstrap.wasm` ?
[-]
- honorable_coder 11 hours ago
  We’re using proxy-wasm and compiling to wasm32-wasip1, then mounting the .wasm binaries into Envoy as HTTP filters via envoy.filters.http.wasm. The line you're referring to:
  vm_config: runtime: "envoy.wasm.runtime.v8" code: local: filename: "/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm"
  …is where the integration happens. There's no need to modify envoy.bootstrap.wasm; instead, Arch loads the WASM modules at runtime using standard Envoy config templating. The filters (prompt_gateway for ingress, and llm_gateway for egress sit in the request path and do things like prompt inspection, model routing, header rewrites, and telemetry collection.
isuckatcoding 17 hours ago
I’m still new to this ecosystem but is this something you’d use together with langchain or does it replace some use cases there?
[-]
- honorable_coder 16 hours ago
  What’s missing right now are our guides showing how well ArchGW integrates with existing frameworks and tools. But the core idea is simple: it offloads low-level responsibilities—like routing, safety, and observability—that frameworks like LangChain currently try to handle inside the app. That means less bloat and more clarity in your agent logic.
  And importantly, some things just can’t be done well in a framework. For example, enforcing global rate limits across LLMs isn’t realistic when each agent instance holds its own local state. That kind of cross-cutting concern needs to live in infrastructure—not in application code.
unit149 18 hours ago
[dead]