A fine example of what caffeine has done to our great nation's youth. What could have been an obscure command-line interface enjoying a modest but dignified place in the field of traditional operating systems is instead condemned to the gridded prison of automated balance sheets under the beady eyes of our financial overlords.
Jokes aside, this is an amazing achievement, the more so for being realized by a single person working away patiently for years - truly inspiring. And it brought out a wave of nostalgia for a former life where I made a living building bespoke tools in the (pre-VBA) Excel macro language. All that and solid business sense too! I hope companies shower Antonio with money, and that Microsoft either leaves him alone or rewards him with gold and the autonomy to lead technical product development for the long term. It's rare to see such a singular combination of vision and execution, doubly so in a very mature product space belonging to a global megacorporation.
This is excellent. I can see buying it for my business. We already use Excel extensively for the open file format; that it's local first; that it's just a file; the low ceremony of getting started; the multipurpose grid for code, data, and presentation; the ubiquity.
The more low-code/no-code SaaS products I've tried, the more I value it. LC/NC SaaS generally is expensive and inflexible. Everybody knows Excel, it's dirt cheap, and it just works. It's part of a certain IT subculture to bash it, but I stopped laughing along.
Excel is great at what it does. Love seeing clean solutions like this that give it even more muscle.
I love how approachable it is. Domain experts in other fields are able to program in excel and codify their knowledge. While, not always great in terms of code quality, often it is bad, it still provides a great starting point in moving to something more resilient and safer in terms of data validation.
There is a reason it is the worlds most poplar programming language
The thing is: Excel is one part data store and one part programming interface. The programming interface just happens to be a part of the data store. This leads people who have been told they're not allowed to program to solve the problem with the best tool they can hold: Excel.
I once got an email in my inbox that had the subject "Help: recommendations for machine to run Excel". The body described a person within a business unit who had a laptop, ca. 2018, with a then current I7, 32G of RAM, dual raid10 SSDs, and it had one job: Open, edit, and evaluate one 5+GB Excel workbook.
What happens is this pattern where more applicable programming models are taken from business folk, who need a way to do some level of broad, tabular accounting usually. These individuals don't have Python, don't know Python, or web applications, etc. What they have is Excel.
Excel lets you do something very easily that no other thing in existence lets you do: Real time (ish) calculations against very large sets of data. I can write one cell, `=AVERAGE($A9:$A99)` and get the running average for 90 rows in a table. Imagine the work that would go into doing that in a language like Python or Go, plus the scaffolding to make it work in a webapp.
And so when a business unit says "we need a way to do X", inevitably someone says "there's no budget for that" and someone eventually finds their way to Excel. Then they build the thing in Excel and when someone realizes they have a monster on their hands someone inevitably says "this should be a webapp" at which point nobody has the funding to maintain that webapp plus its infrastructure, plus handling the implicit and explicit logic that comes along with the mixture of data, code, and everything else in an Excel spreadsheet.
I think a lot of the issue with Excel is the developers, often, look down upon it. This is too bad as it doesn't help those using it to solve real world problems and it often leads to them not seeking help. Then we end up with the billion dollar London Whale's.
I use it for quick table to code codegen often. But it wasnt about you personally just the comment’s people make about it. With onedrive people do get a sort of versioning, albeit very crude, too. But your right, the practices are bad and part of it is people are not having conversations about best practices. Also, the cost of development being what it is and people’s incentives, Excel fits into that just get it done and deal with it later spot. I know in my org it was a minimum of 50k to get a dev shop, internal, to do a project. This was years ago too. Too many big fish for them to work on. So the small groups went with what they could use. Also, it is on all machines.
I think with Numpy and Pandas getting the average of 90 rows wouldn't be too hard. With R it's insanely easy: mean(data[9:99,]). I'll admit Excel is a easier for the average joe to get started with but there's a lot of downsides like it being relatively difficult to figure out WTF a sheet is doing at a glance.
The combination of a spreadsheet, database and data scripting runtime would be extremely powerful.
This is where Airtable, nocodb and baserow are candidates.
I worked at a company that used gridgain with Excel as a cluster to process models on lots of servers.
I suspect there's a lot of low level Windows integration details about how to integrate with Excel. COM objects and DLLs.
Why is integrating with software so difficult? Very few people know how to do it.
How you design software that is extensible?
Atom was extensible and VS Code is less so.
> Why is integrating with software so difficult? Very few people know how to do it.
I feel it's because of inherent conflict of interests and priorities. A software product can provide services to the user, or provide services to other programs. It's rare to see one that does both, because the two goals seem often at odds.
Say you're writing a software component for other programs to use. You probably have some specific programs in mind, some of which are end-user-facing. In this case, you may not want to spend resources on developing user-facing UI and features that make your product usable stand-alone - you'd likely end up competing with your own customers.
Say you're writing user-facing software instead. You want to own the entire user experience. Mixing UI, user interaction and the "business problem"-solving code together lets you work faster. Also, you don't want to make it easy for anyone to integrate with the core of your software, because that would be inviting competitors to own the user experience while using your own backend, ruining all kinds of marketing shenanigans you'd rather play on your users.
I don't like this, but I see there's a mix of both reasonable and greedy reasons why software isn't developed with integration in mind.
The major exception here is when the platform on which the software runs encourages, insists on, or demands the software to allow for integration. Windows does that to an extent with COM / DCOM. Emacs does that by its nature. But it's only moving the problem one level up - e.g. Microsoft environment can't integrate well with Apple environment.
OT: You can do that with Qt. We played with implementing it ourselves for a project about 10 years ago. Since then I think there is an official QPA that will render Qt apps within the browser, and there is the Wt project. Might be worth looking at for some needs but I don't see it becoming popular.
No idea, haven't seen this until now. Hopefully they provide a mechanism for allowing legitimate addins to get around it. Perhaps using a code signing cert (which I do) or an EV certificate would be appropriate. Security has been a bit of nightmare with Excel addins. I've had to deal with false positives on and off again a bunch of times already. First it was windows smart screen, then antivirus vendors every so often. It's a slog...
May I suggest a new feature? I imagine there's no real hard reason why Data Validation won't take dynamic arrays. It would be amazing to use UNIQUE or even FILTER in the Custom format. I've tried all sorts of workarounds including defining named ranges, using INDIRECT, etc. but it seems like I inevitably need to create a range for Data Validation to work
Which in turn leads me to think there should be a way to create a virtual Range object in VBA that isn't really in a sheet (or a VirtualRange object that is similar enough to Range to make it work for most purposes). If one were to refactor Range entirely (not suggesting that), it seems like the issue is that the Range object actually encapsulates two different data representations into one jumbled up object: (1) cells in a worksheet with properties such as style, fill, value, etc. and (2) a traversable quasi-array data type with useful methods such as offset, rows, columns, etc. Those two things often line up, but not always.
Anyway, I probably spend too much time thinking about Excel :-) </rant>
Oh Microsoft the company is not to be trusted and is not your friend. It’s a huge company with many employees and products. Moreover I can’t attest to any other group’s culture - I can only say what we do in Excel.
In Excel working hard to make addons possible and trying to foster an ecosystem of users sharing expertise is a focus as well as community building and smbs.
Taking a cool community project and doing it ourselves would shooting our selves in the foot.
More than once I didn’t do things or work in features for that reason.
Again not because Microsoft is your friend but because it’s in our business interest.
I don’t speak for Microsoft or Excel and am just a developer/hacker ;)
I always got the feeling that because it's such a key product the Excel team was basically its own organisation within the wider corporation and as a result had a lot more freedom to make its own choices.
(not that I'm arguing for trust that would, as you say, be misplaced, but if my feeling is correct then the Excel org's englightened self interest can diverge from the mothership's self interest more than other divisions can, which makes the calculations as to what your management is likely to decide notably different)
I got this impression as well. I worked on a team two years back building out a very complex Excel plugin using Excel.js. The SDK has quite a few quirks and we ran into some problems along the way, but the team at Microsoft responsible for it was happy to help and interested in hearing feedback and features we wanted. A positive experience :)
I hope it does work out for the original author and he deserves it but there have been cases before where concepts have been taken from their originator and implemented by the large corporate with no cash or even credit given.
I always felt that there was more to that story than what was reported. My understanding is they reached out to him to interview for a PM role to develop an official "AppGet" at Microsoft but, for whatever reason, he didn't get the job. Maybe the interviews didn't go well? Maybe there were some red flags somewhere? Who knows. But it didn't really sound to me like they acted in bad faith.
Really cool. I could see MS jumping on this because of their strong association with C#.
That said, I think the ideal replacement for VB & VBA in most contexts is python. The reason is they're both high level languages with ease of learning as a selling point. Python has a massive userbase, and there is significant overlap of python and MS Office tools. Data analysts who have to deal with Excel & SQL Server are much more likely to know python than any other language outside of SQL.
It seems like python has pretty much locked up the spaces that VB was designed for.
I thought I remember hearing like 4 or 5 years ago that MS was planning on replacing VBA with Python Support in Office apps. But I can't find anything to back that up, so maybe I just had a really wishful dream.
What I think they should do is not just make a python API, but replace VBA with it, and expose it to end users where VBA is currently exposed. Python should be the language of macros and user defined functions.
I tried to find info on how to set that up a while ago, but found nothing. The Microsoft pages gave me the impression it only works for the web version. Akso I'm on MacOS - couldn't find any mention of that either.
Looks awesome. One fundamental issue I always faced with Excel calculations of any kind (formulas or vba) is the lack of ability to source control cleanly (binary format plus data and code are intermingled). This weakness is the other side of the spreadsheet’s greatest strength which is low friction experimentation with data. Does this product attempt to do anything here?
You can version control the code you write with QueryStorm, to an extent. There are basically two kinds of apps you can build with this: extension apps (where you build general excel functionality) and workbook apps (where you automate a particular workbook).
The code for extension apps is stored in a folder on your machine, and you can version control that easily.
The code for workbook apps is stored inside workbooks. While there's no version control functionality in QueryStorm, it does let you export code from a workbook into a folder, as well as import code from a folder into a workbook project. There are context menu commands for this in the code explorer pane. This lets you decouple the code from the workbook.
Basically, for version controlling code in a workbook, you'd have to export the code into a folder and version control it there. Exporting into a folder will clear everything from the folder except hidden stuff like the ".git" folder so your git repository will be safe. Not a perfect solution, but it can be done.
Have you considered implementing a project folder structure, so that the workbook and the source code could be version controlled together in git, but the IDE would import and export the source automatically from/to the workbook?
Since XLSX etc. use Microsoft's Open Packaging Convention (a ZIP container consisting of mostly plain text), consider adding native support for unpacking and normalizing the payloads so they can be trivially diffed by existing source control systems.
Thanks! Wouldn't mind that outcome one bit. It's hard to get exposure and adoption for a platform like this without some serious muscle backing it. I suspect Microsoft has its sights set on the cloud, though, but still, it would be amazing if they adopted QueryStorm.
At first I thought you were referring to VSTO, then I saw the "not to be confused with VSTO" part. I had no idea this ever existed. Thanks for mentioning it. I wonder why they decided to pull the plug back then.
I suspect it must have been part of "let's try to secure / lockdown office", "users writing codes is bad, they should only push buttons", and at the same time they tried to make it as hard as possible to use VBA (all sorts of warnings and things disabled by default, etc)
So can people upload their code snips to the appstore such that, say, a marketing data analytics person with no coding experience can pay to get 'buttons' to run queries that are relevant to their needs?
like the salaries example, for example, I used to spend a shit-ton of time educating a CFO on AWS spends in excel.
If I could have a code snip 'button' to use some AWS-CLI creds to pull results directly from AWS asa "CFO BUTTON" and set them up with an easy export to an XLS - and then let them build their own dash to their choice that would be cool.
I havent looked at Cloudability in quite a while, (know the founder) -- but giving a CFO-type an excel button... Every CFO can navigate excel. so they get an invisible tool that empowers them, and relieves stress on the Ops team...
You create a project that defines some custom functions, ribbon commands, context menus, shortcuts etc... and then publish it to a store. Then anyone who has the QueryStorm Runtime (and the url of the store you published to) can install it through the "Extensions" dialog.
I haven't yet set up the ability to charge for packages but I plan on offering this. I first need a community of users, so didn't focus on charging yet.
If you'd like some help setting up something like that AWS scenario, reach me via email: antonio [at] querystorm.com, I've done several such integrations already (Google analytics, Monday.com, TSheets, Google drive).
This is incredible. I work in industrial automation and I often have to manage / make edits to "point assignment charts" which are essentially Excel Workbooks with 100+ Worksheets, each containing ~10-100 rows of formatted data which define various parameters each real world "point" should have in order to communicate with RTUs / various SCADA nodes.
Sometimes the feedback involves making a minor tweak on every Worksheet. Sure you can muck around with VBA (which is a horrible language and experience), but I often just do it manually to get it over with instead (click click click... very tedious).
C# is my go-to "business logic" language as a developer, and this looks so well integrated, really excited to use it.
> "Sometimes the feedback involves making a minor tweak on every Worksheet."
Could be a good task for PowerShell and the ImportExcel module.
(PowerShell also being a .NET language and written in C#; it has convenient access to COM control of Excel if you can't import the spreadsheet and export an updated version; through $xl = New-Object -ComObject Excel.Application )
No not really, it's many projects with many documents, shared among many contractors. You use Excel Workbooks when you work in fields like this. We have it formatted to be pretty too, so we can just publish to PDF via Adobe Acrobat when it's time to send in submittals.
Therein lies the problem. The language doesn't have much marketshare, and few champions.
I just finished reading Scott Wlaschin's Domain Driven Design for Functional Programmers and it's an incredible, batteries-included guide that goes from "hello world" to "and here's how you use it in an organization" in a few hundred pages.
I wish I had been exposed to it about a decade ago. Incredibly easy to follow and well-explained.
This! Other good books are ”Expert F#” by Granicz, Syme & Cisternino, ”Real world functional programming” by Petricek and ”F# for scientists” by Harrop. The last is old but still really great in succintly explaining what ’new&unfamiliar’ structures in F# are great for pragmatic software engineering.
Why is there so little noise in F# land? I have no data, but I can guess that the language is so unproblematic that everybody is just happily coding instead of explaining how to fix something. It’s also not very good at guis so there is usually nothing visual to demonstrate.
- fsharp's big selling point is that it runs on dotnet
- most FP ideologues run screaming from M$, and there are other, better (subjectively), more-active FP langs that do many of the same things, such as OCaml and Haskell
- most dotnet shops are "csharp or die" and getting fsharp adoption is more or less an impossible task for cultural reasons
I support fsharp and generally like it, but I'm not reaching for it for net-new stuff that doesn't already have a dependency on dotnet. $0.02.
Tooling: Not only CLI and classlibs. The debugger in Visual Studio follows calls to c/c++ dll:s over pinvoke withhout a cinch. (Same as C#). If you wrap complex entities in e.g. Protobuf then writing hybrid applications is pretty straighforward and on solid ground.
All of the GUI stuff is missing of course. I would not advice to implement a GUI on top of F#.
The main feature of F# imo is it's succint type system where one can wrap everything in a type without much bureaucratic cost, but in a way that makes program structure so much more obvious. "If it compiles it's correct" is prety much how it goes.
I would claim the business benefit of F# would come from a more succint, legible and correct codebase that saves developer time, but I have no basis for that argument except intuition I get having developed both in C# and F#. Also I know much better engineers than me to whom this would be a nill-argument ("You can write everything really fast, just don't make mistakes").
"It isn't as if a mixed C#/F# codebase brings the same value as doing Python/C++."
I think it depends on industry and domain what the value add specifically is? To me it would totally make sense to have small, succinct business logic module in F# that talks natively to a verbose chunk of C# wpf for instance.
"Also to note ... CPython is now taking performance more seriously."
Personally I would not hold Python isomorphic to F# on any architectural level, the former not having the typesystem of F#. IMO the whole point of F# is the wonderful typesystem, immutability first, combined with full .NET ecosystem.
On the other hand I would not try to use F# for the stuff I use Python for - so trying to take Python head on does not sound like a super good strategy.
I guess F# will languish in the "few percenter limbo" in the best case. But it IS a really nice language :)
This is really cool. For context, We have a tool that compiles complex Excel files into highly optimized WebAssembly.. which is great for dealing with those 2M formula files that take 10 minutes to open. We can replace all of that automatically with an API that can be called from any programming language or even a "hollowed out" version of the Excel itself.
However, replacing the functions on the VBA front is trickier. This is no longer pure functional programming, but essentially an application / data integration pipeline that wraps the formula logic. For MSFT, VBA is a dead-end for a variety of technical and business reasons ... unfortunately the current alternatives in Excel (Typescpript Add-ins, ScriptLab, OfficeScripts and lambda) all have some shortcomings. MSFT still hasn't quite gotten it right. As a result some financials have tried moving out of Excel entirely and into things like Python... but those projects either fail outright or reach a steady state of about 10%-30% complete as it becomes clear that you can't replace SMEs with a IT development team and keep up with maintenance and business changes in thousands of files.
Anyway - we have a few clients standing in-front of a few hundred/thousand Excel "Applications" some of which could use a modern, Excel centric IDE and runtime. Would love to chat with you and see if we can help them out together. Look me up on LinkedIn.
I wish that someone would make a Wine-style compatibility library that would run a local websocket server that communicates with a lightweight Add-In using the new cross-platform JS APIs for Excel, and translates VSTO/COM calls into IPC messages that are processed and responded to. Then you could bring all the old VSTO tools into the modern world, running in a separate process from Excel. This is... rather daunting to even think about. Arguably Apple should sponsor the project as a way to bring the world of finance towards using Macs!
What did you use to write the installer, and how do you determine if the targeted version of Excel is 32 or 64 bit?
I've built an Excel-DNA based Add-in (https://www.excelpricefeed.com/) and distribute it via an installer built using Advanced Installer. It works well, the only real pain point is the the user has to find out which version of Excel they have installed in order to choose the correct download (32 or 64 bit).
I use Wix for the installer. Can't remember how I dealt with that particular issue, but ping me at antonio [at] querystorm.com in a day or two, if you're interested, and I'll have a look at how exactly I dealt with it.
I would love to but I'm a bit scared to fiddle with the post now while it's trending. I think it's mentioned in a few places on the website, but not in this blog post. I do also use a bunch of other libraries, but yeah, ExcelDNA is one of the critical ones so mentioning it would be in order.
This looks great!
A solid C# foundation for building on top of excel could probably be very useful for some companies.
I do wonder though how you can nail down a target audience for this kind of tool, seems like you'd need a special kind of tinkerer and I'm not sure how many like that are out there. In my other company, which was a small company, the ones who headed it were essentially engineers that transitioned to executive positions. I doubt there are many like that out there.
I have used QueryStorm since May 2016 back then ThingieQuery.
My head exploded from the first moment I saw it considering everything that it integrates and embeds within Excel since for a long time I have resorted to automating routines with ado and vba.
Querystorm is more than just a SQL IDE built right into the book and honestly this is the best solution in this environment that every data analyst, self-service developer or business user should give it a try.
Looks great, but I ran into an issue I'm hoping you can help with. I copy/pasted the very first example from the blog and it doesn't compile. I also have been unable to find the documentation necessary to get past the problem.
It's the example that reads the files in a dir and makes an Excel table, and it fails on line 16, which reads `(excel.Selection as Range).WriteTable(files, "myNewTable");`.
The error is `Argument 2: cannot convert from 'System.Collections.Generic.IEnumerable<<anonymous type: string File, long Size>>' to 'QueryStorm.Data.ITabular' | `
I can't find docs for WriteTable() [even after finding those gitlab repos that contain QueryStorm docs].
I think anyone trying to run with this, will need access to docs on the API.
It looks like I was using a pre-release version of QueryStorm while I was writing the blog post. I just released that version so if you restart Excel you should get an "Update available" button in the QueryStorm ribbon in Excel.
Could you update and post back if that fixed the problem?
Another way of "fixing" it would be to replace the following line:
(excel.Selection as Range).WriteTable(files, "myNewTable");
with this line:
The Write method is available globally (it's not a method that belongs to a class). This is a trick specific to C# scripts (would be illegal in regular C#).
The documentation does lack detail in some areas, which I basically cover by answering questions via email. I do need to invest time every so often to update and extend it.
There's a lot of people who only know VBA (in terms of programming languages), and there's a huge amount of VBA code in production out there. Microsoft won't make itself popular by replacing VBA for some silly technical reason, especially not because the syntax is perceived to look better. It's a bit like renaming the SUM function to ADD and expecting everyone to rewrite their formulas.
devs "hate" excel because you have to reference by col/row (C10 = tax rate, etc). But few devs know that you can name cells (and ranges and tables). Just to the left of the formula bar you'll see a textbox with the default name of the cell (C10). Just delete C10 and call it "tax_rate" and now you can reference "tax_rate" in all your formulas.
Spectacular. Stats PhD here, I've needed this for years for practical projects in different data science roles. I'd love to talk to you about adding statistics/forecasting capabilities, maybe by exposing some existing libraries with syntactic sugar.
Would love to! There's a library called Math.NET that I wanted to look into, but never got around to it (plus, I forgot 99% of the math I learned in college so wouldn't know how to use it). Can you get in touch via email: antonio [at] querystorm.com?
A bit of both. On the Windows side I can use VSTO or ExcelDNA to integrate with Excel, but neither is available on the Mac. One alternative I'd like to try is a web-based version which would use Blazor and the Monaco editor but I'm really not sure if that would be at all doable. I don't have the time to go there at the moment, but I'd love to try at some point.
You must not work with any kind of real world data if you think Excel is a horrible tool.
Pulling in 1000000000 rows of data you know absolutely nothing about to throw into your ML model and have it spit out some questionable result doesn't mean you actually work with real world data.
>a few hundred rows data science.
People that work with a few hundred rows are called "almost every office employee ever" and not data scientists. And you are naïve as hell if don't realize how powerful excel is for every random ass thing that happens in the real world.