I don't know much about NiFi, but from reviewing the documentation, it looks like NiFi is a stateful application with orchestration, multi-user access, and so on.
Benthos requires something else to handle orchestration, source control, multi-user access, and the conditions that trigger an ETL run.
On the plus side, the stateless nature of Benthos means there isn't any big setup process or cluster of containers to log in to. You just call benthos from the CLI and pass in a config file.
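As a minimal sketch of what that looks like (the `stdin`, `mapping`, and `stdout` components are real Benthos components, but this particular pipeline is just an illustration):

```yaml
# config.yaml — a minimal Benthos pipeline: read stdin, uppercase, write stdout
input:
  stdin: {}

pipeline:
  processors:
    - mapping: |
        root = content().string().uppercase()

output:
  stdout: {}

# Run it with:
#   benthos -c config.yaml
```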
Last week I needed to get 120k rows of data from an Oracle view that would crash on a few bad rows, and load the rows that worked into a SQL Server table.
Writing a Benthos config to tick through every ID, fetch the relevant rows one at a time, annotate them with batch information, drop error rows and error messages into a log, and send healthy rows into SQL Server came to about 120 lines of YAML. It took roughly 3 hours to write while consulting the extensive documentation, and then another 30 minutes of performance tweaks (increasing threads to 256) to saturate my network connection.
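A rough sketch of that shape of config, heavily simplified (the `sql_select` input, `switch` output, and `errored()` check are real Benthos features, but every DSN, table, and column name here is a placeholder, and driver support varies by Benthos version):

```yaml
input:
  sql_select:
    driver: oracle
    dsn: oracle://user:pass@host:1521/service    # placeholder
    table: my_view
    columns: [ "*" ]

pipeline:
  threads: 256    # the tweak that saturated the network connection
  processors:
    - mapping: |
        root = this
        root.batch_id = env("BATCH_ID")    # annotate with batch info

output:
  switch:
    cases:
      # Rows that errored during processing go to a log file...
      - check: errored()
        output:
          file:
            path: ./errors.log
            codec: lines
      # ...healthy rows go to SQL Server.
      - output:
          sql_insert:
            driver: mssql
            dsn: sqlserver://user:pass@host:1433?database=db    # placeholder
            table: target_table
            columns: [ id, payload ]
            args_mapping: root = [ this.id, this.payload ]
```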
Tweaking that YAML to cover a similar second scenario as a new config then was only another 15 minutes.
The week before that, I was trying Benthos at home to stress test MQTT on a Raspberry Pi with synthetic data to see which messages got dropped or mis-ordered.
(Depends on your QoS and in-flight settings, obviously)
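For reference, a synthetic-load config along those lines might look like this (the `generate` input and `mqtt` output are real components, though `counter()` requires a reasonably recent Benthos; the broker URL, topic, and rate are placeholders):

```yaml
input:
  generate:
    interval: 1ms                  # message rate — tune to stress the broker
    mapping: |
      root.seq = counter()         # sequence number to detect drops/reordering
      root.ts = now()

output:
  mqtt:
    urls: [ tcp://raspberrypi.local:1883 ]    # placeholder broker
    topic: stress/test
    qos: 1
```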
I've never had "basic" performance testing be so simple.
If you are stuck on a Windows platform, it even works there, which is nice for one-off runs for dev work, troubleshooting, or break-fix work.
Benthos is much simpler, since it's stateless and it's a single static binary written in Go. I don't really know much about NiFi, but if you need to use a message bus with Benthos, such as Kafka, you can. However, you don't have to.
I liked Yahoo! pipes and the idea of being able to glue things together on the internet, but unfortunately many web sites simply don't have good APIs to enable this. In my experience pipes workflows were also brittle since any site could easily break them.
The lack of stable API support on (many/most) web sites (and no incentive to provide it), the likely low user/developer base, the apparent lack of killer apps, and Yahoo! itself probably all combined to prevent pipes from becoming a big thing.
That was my takeaway after building API Blocks. The goal was to display arbitrary data from APIs on a dashboard.
APIs just aren't standardized or stable enough, and it took too much effort to maintain the API library. The UX just wasn't up to scratch because almost every week a block on your dashboard would break because an API changed, or your auth expired or broke. The cost to maintain was not worth it.
An interconnected public web as data would be incredible but we just haven't built that.
Not to detract from the concept here with Benthos Studio; the use case is a bit different. But it is something to keep in mind: the end user might not be technical, and APIs constantly changing might not be something they expected.
Smart contracts do provide a stable (immutable!) API designed for composable functionality, although it's not used across different people's contracts that much. A Pipes-like UI that binds things together could be interesting. If the connective contracts are also on-chain, you could in theory develop flash bots that execute everything in the same block.
At least some sites provide somewhat-stable and documented APIs nowadays... Definitely a far cry from what was being promised 15 years ago, but at least most of them moved away from SOAP. I guess it's still used in some places, unfortunately.
Benthos Studio is an application that provides visual editing capabilities for the Benthos (https://www.benthos.dev) stream processor. It lets you craft and test YAML-based configurations that you can then run using Benthos.
Benthos itself is a stateless command line (CLI) app written in Go. It supports quite a few types of "macro" building blocks (aka components), which are various flavours of inputs, outputs, processors, caches, rate limits, buffers, metrics, tracers and loggers. The most important processor is the `mapping` one, which lets you execute Bloblang code against each message that passes through it. Bloblang is a functional programming language embedded in Benthos as a DSL for manipulating structured data. You can read more about it over here: https://www.benthos.dev/docs/guides/bloblang/about. Also, if you'd like to use it outside of Benthos, you can import it as a library: https://firstname.lastname@example.org/...
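As a tiny illustration of what a `mapping` processor looks like (the field names here are made up; the Bloblang syntax itself is covered in the docs linked above):

```yaml
pipeline:
  processors:
    - mapping: |
        # Build a new document (root) from the incoming one (this)
        root.full_name = this.first_name + " " + this.last_name
        root.received_at = now()
        root.tags = this.tags.or([])    # fall back when the field is missing
```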
Since I mentioned importing Bloblang as a library, you can import the entire Benthos framework as a library and inject your own custom plugins to create a custom Benthos build with whatever components and extra functionality you need. It's also a great way to slim down the existing distribution and only import the components that you require. See some examples here: https://github.com/benthosdev/benthos-plugin-example
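A minimal sketch of what such a custom build can look like, loosely based on the plugin example repo linked above (the `reverse` processor name and its logic are made up for illustration; the `service` package calls are from the v4 public plugin API, so check the repo for the exact shape on your version):

```go
// main.go — a custom Benthos build with one hypothetical "reverse" processor.
package main

import (
	"context"

	"github.com/benthosdev/benthos/v4/public/service"

	// Import only the built-in components you need, e.g.:
	// _ "github.com/benthosdev/benthos/v4/public/components/io"
)

type reverseProcessor struct{}

// Process reverses the raw bytes of each message passing through.
func (r reverseProcessor) Process(ctx context.Context, m *service.Message) (service.MessageBatch, error) {
	b, err := m.AsBytes()
	if err != nil {
		return nil, err
	}
	rev := make([]byte, len(b))
	for i, c := range b {
		rev[len(b)-1-i] = c
	}
	m.SetBytes(rev)
	return service.MessageBatch{m}, nil
}

func (r reverseProcessor) Close(ctx context.Context) error { return nil }

func main() {
	// Register the plugin, then hand control to the standard Benthos CLI,
	// so the resulting binary behaves like benthos plus your components.
	if err := service.RegisterProcessor("reverse", service.NewConfigSpec(),
		func(conf *service.ParsedConfig, mgr *service.Resources) (service.Processor, error) {
			return reverseProcessor{}, nil
		}); err != nil {
		panic(err)
	}
	service.RunCLI(context.Background())
}
```

The resulting binary can then run any config that uses `reverse` as a processor, alongside whichever built-in components you chose to import.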
Some history here https://github.com/benthosdev/benthos/issues/1184#issuecomme... but the TL;DR version is that the client library has caused a bit of friction in the past. There were several releases where one of its dependencies produced various compiler warnings, and there was a dbus zombie-process issue which caused some concern. However, if you wish to run this at scale in production, you might want to look at https://github.com/benthosdev/benthos-plugin-example and craft your own custom Benthos build, importing just the components you need, such as the Pulsar ones.
> I should've mentioned that the official mascot is called The Benthos Blobfish.
The mascot really really creeps me out.
This is entirely subjective, of course, and I really don't want to be mean about it, but I thought it might be a helpful data point for you to collate in case others feel the same way: if I were forced to use Benthos and had that ugly thing on my screen all day long while working with the documentation, I really don't know whether I could handle it.
It's really hard to tell without doing a proper deep dive into Dagster, but even if there is a lot of overlap, there's a lot of reading that one must do before even starting with basic workflows. Is there a one-click demo UI that I can run which produces a valid config that I can just copy/paste and then run with something like `dagster -c config.yaml`?