We’re using proxy-wasm and compiling to wasm32-wasip1, then mounting the .wasm binaries into Envoy as HTTP filters via envoy.filters.http.wasm. The line you're referring to:
…is where the integration happens. There's no need to modify envoy.bootstrap.wasm; instead, Arch loads the WASM modules at runtime using standard Envoy config templating. The filters (prompt_gateway for ingress, and llm_gateway for egress sit in the request path and do things like prompt inspection, model routing, header rewrites, and telemetry collection.
What’s missing right now are our guides showing how well ArchGW integrates with existing frameworks and tools. But the core idea is simple: it offloads low-level responsibilities—like routing, safety, and observability—that frameworks like LangChain currently try to handle inside the app. That means less bloat and more clarity in your agent logic.
And importantly, some things just can’t be done well in a framework. For example, enforcing global rate limits across LLMs isn’t realistic when each agent instance holds its own local state. That kind of cross-cutting concern needs to live in infrastructure—not in application code.
And was pleased with what I was able to do. Thanks
vm_config: runtime: "envoy.wasm.runtime.v8" code: local: filename: "/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm"
…is where the integration happens. There's no need to modify envoy.bootstrap.wasm; instead, Arch loads the WASM modules at runtime using standard Envoy config templating. The filters (prompt_gateway for ingress, and llm_gateway for egress sit in the request path and do things like prompt inspection, model routing, header rewrites, and telemetry collection.
And importantly, some things just can’t be done well in a framework. For example, enforcing global rate limits across LLMs isn’t realistic when each agent instance holds its own local state. That kind of cross-cutting concern needs to live in infrastructure—not in application code.