Why? Companies are increasingly adopting usage-based pricing models, requiring accurate metering. In addition, many SaaS products are expected to offer AI capabilities. To effectively cover costs and stay profitable, these companies must meter AI usage and attribute it to their customers.
When I worked at Stripe, my job was to price and attribute database usage to product teams. You can think of it as internal usage-based pricing to keep teams accountable and the business within its margins. This was when I realized how challenging it is to extract usage data from various cloud infrastructure components (execution time, bytes stored, query complexity, backup size, etc.), meter it accurately, and handle failure scenarios like backfills and meter resets. I was frustrated that no standard existed for metering cloud infrastructure, and we had to build this on our own.
Usage metering requires accurately processing large volumes of events in real time to power billing use cases and modern data-intensive applications. Imagine you want to meter and bill workload execution at per-second granularity, or meter the number of API calls you make to a third party and act instantly on events like a user hitting a billing threshold. The real-time aspect requires instant aggregations and queries; scalability means being able to ingest and process millions of usage events per second; it must be accurate, because billing requires precise metering; and it must be fault tolerant, with built-in idempotency, event backfills, and meter resets.
This is challenging to build out, and the obvious approaches don’t work well: writing to a database for each usage event is expensive; monitoring systems are cheaper but inaccurate and lack idempotency (distributed systems use at-least-once delivery); batch processing in the data warehouse has unacceptable latency.
Companies also need to extract usage data from cloud infrastructure (Kubernetes, AWS, etc.), vendors (OpenAI, Twilio, etc.), and hardware components to attribute metered usage to their customers. Collecting usage often requires writing custom code: measuring execution duration, listening to lifecycle events, scraping APIs periodically, parsing log streams, and attributing usage of shared and multi-tenant resources.
OpenMeter leverages stream processing to update meters in real time while processing large volumes of events. The core is written in Go and uses the CloudEvents format to describe usage, Kafka to ingest events, and ksqlDB to dedupe events and aggregate meters. We are also working on a Postgres sink for long-term storage. Check out our GitHub to learn more: https://github.com/openmeterio/openmeter
Other companies in the usage-based billing space are focused on payments and basically want to be Stripe replacements. With OpenMeter, we’re focusing instead on the engineering challenge of collecting usage data from cloud infrastructure and balancing tradeoffs between cost, scale, accuracy, and staleness. We’re not trying to be a payment platform—rather, we want to empower engineers to provide fresh and accurate usage data to Product, Sales, and Finance, helping them with billing, analytics, and revenue use cases.
We’re building OpenMeter as an open-source project (Apache 2.0), with the goal of making it the standard to collect and share usage across many solutions and providers. In the future, we’ll offer a hosted / cloud version of OpenMeter with high availability guarantees and easy integrations to payment, CRM, and analytics solutions.
What usage metering issues or experiences do you have? We would love to hear your feedback on OpenMeter and to learn from which sources you need to extract usage and how the metered data is leveraged. Looking forward to your comments!
We run millions of tiny VMs. Each gets billed on a number of dimensions: egress, runtime (per cpu / memory combo), storage / io. We also have other metered services: ssl certificates, IP addresses, etc.
The thing is, we _already_ have metrics for everything we want to bill. They're in a time series DB (VictoriaMetrics, in this case). Sending a shit ton of events to yet-another-system is complicated and brittle.
Your k8s pods example is a good source of my hives. Anything that runs on a timer to fire off events is going to get bogged down and lose data at our scale. And we're not very big!
This is a somewhat solved problem for metrics stacks. It's relatively straightforward to scale metrics scrapes. And once we have that infra, it's pretty easy to just start pushing "events" to it when events become useful. We don't end up needing Kafka and its complexity.
My dream for billing is: a query layer on top of a time series database that takes data I already have and turns it into invoices.
One thing about your post that struck me – the last mile to billing and reporting is the thing we're most interested in buying. It's less specialized. There also aren't any products out there that have really figured this out, I don't think (we've evaluated pretty much all of them).
Usage tracking and reporting is a thing we're ok building, because it's core to our product.
* Lago - https://www.getlago.com
* Octane - https://www.getoctane.io
* Metronome - https://metronome.com
Btw, how do you handle idempotency with time series? I think replays and at-least-once delivery in distributed systems can be challenging for usage collection if you don't have a key to deduplicate by.
I'm fond of SQL, but I'd be ok with some other way to express "aggregate my usage into an invoice".
Continuously aggregating smaller windows and sending them off is roughly how we send usage to Stripe right now (which, let's be clear, is not working for us AT ALL). We've resigned ourselves to having to do the same thing if we want to use a billing SaaS, but I don't love it.
A pure SaaS model for billing doesn’t do much more for you at the end of the day than apply rating and generate an invoice (which Stripe can do, and is already doing, for you).
I’d love to learn more about what you’d like to do with Stripe and how you are handling entitlements, plan versioning, etc
I'd have said that if you want to deal gracefully with a system that loses events, a timer-based system can ensure the errors from lost events are always in your customer's favour.
The last thing you'd want is to start the taximeter with a start-pod event, then lose the stop-pod event and end up billing the customer for nothing, forever.
We track everything as metrics, then hook up functions to read from the TSDB each billing period. How it works:
- One Inngest function runs each billing period (scheduled), and "fans out" by creating an event for each user we want to charge.
- A billing function reacts to these individual user events, queries metrics, queries our plans, and then generates a stripe invoice
- This is then charged by stripe.
- We have other functions that respond to other billing events and do dunning like flows.
We only have to focus on the small amount of billing code required. We don't have to handle scheduling, state, retries, or the event flows — that's built into Inngest.
I'd really like to make this more robust. We had to put in the work on emails, billing, and dunning, even though it's simple function code. It would be nice to have someone "productize" this. I feel your pain.
In general, it definitely makes sense to read from our own metrics stores to charge for our products. Very much agree with everything you've said. There's not the most joy in billing.
We’re working hard to make it easy to send data to us (e.g. just drop it in S3). Granted, this isn’t ideal per your note of already throwing it into VictoriaMetrics, but we’re finding that companies often don’t have the optimized in-house infra to run the sort of usage-metric queries that go from raw events -> invoice, especially at >100k events/second scale. It often makes sense for us to control the storage (and sharding) format — and we can bring those costs down over time.
Given the usage events and the SQL queries over them, we then focus a bunch on making the core billing interactions work well, do invoice delivery for you (example: https://invoices.withorb.com/view?token=IlJwOVRSdWhEQ3NiRUVH...), and make it easy for you to do financial reporting & rev-rec (honestly, this is its own rabbit hole, but we find it's well outside the core competency of eng/product teams).
Also, any invoice system that’s based on query-time aggregation is probably too expensive and definitely a risk. Fees accumulate in real time, incrementally, not when I generate an invoice.
But probably the biggest miss I encounter is around auditing. Customers dispute bills, and always will to some degree. The process of resolving those disputes needs to be automatic, clear and understandable, and based on something that the customer can cross check (ideally).
We're a hosting platform for AI Agents and in practice that means keeping track of a stream of usage events across agents, microservices (equivalent of Fly's VMs), and third-party services (OpenAI, Eleven Labs, D-ID, Pinecone, Replicated, etc) that roll up into an itemized consumption bill.
It might be this is just a super-niche billing situation, but we also haven't yet found an open-source solution that fits. Fully agree that a query layer atop a time-series database is the style of product that we're craving.
Metronome comes very close, but since it's not open source we're hesitant to make the jump before it's a perfect fit.
Congrats on the launch! It looks very cool!
I don't necessarily want to send data to another SaaS vendor's data lake/warehouse/tsdb/etc. Egress costs, privacy, and storage costs. Also if I already have a data platform, I don't want 2.
But yes, I agree: it's more challenging to build in the open, but I think it benefits the customer; the quality has to be there.
That way, you'll be able to make full use of the minor and patch numbers when you need them.
Congrats on launching this! There are lots of ways to deal with this, but I'd love a metering standard, much like we have for tracing.
Would you track API calls to a usage-based third party (OpenAI API, Twilio, etc.) by default? Or calls to your own web server (Lambda, Next.js, etc.)? Or both?