Here’s a quick product demo: https://www.loom.com/share/c0ce8ab860c044c586c13a24b6c9b391?...
Marketers always say that half their spend will be wasted - they just don’t know which half. Real-world experiments help, but they’re too slow and expensive to run at scale. So, we’re building simulations that let you test rapidly and cheaply to find the best version of your message.
How it works:
- We create AI personas based on real-world data from actual individuals, collected from publicly available social media profiles and web sources.
- For each audience, we retrieve relevant personas from our database and map them out on an interactive social network graph, which is designed to replicate patterns of social influence.
- Once you’ve drafted your message, each experiment runs a multi-agent simulation where the personas react to your content and interact with each other - these take 30 seconds to 2 minutes to run. Then we surface results and insights to help you improve your messaging (rough sketch below).
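To make the mechanics concrete, here's a minimal sketch of what one simulation run could look like. Everything in it is illustrative - the `Persona` fields, the `react` heuristic standing in for an LLM call, and the spread logic are simplifying assumptions, not our actual implementation:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Persona:
    name: str
    interests: set
    followers: list = field(default_factory=list)  # edges in the social graph

def react(persona: Persona, message_topics: set) -> bool:
    # Stand-in for an LLM call that decides whether this persona engages;
    # here we approximate it with topical overlap plus noise.
    overlap = len(persona.interests & message_topics) / max(len(message_topics), 1)
    return random.random() < 0.1 + 0.6 * overlap

def simulate(seed_audience, message_topics, max_rounds=5):
    exposed = {p.name for p in seed_audience}
    engaged = 0
    frontier = list(seed_audience)
    for _ in range(max_rounds):
        next_frontier = []
        for persona in frontier:
            if react(persona, message_topics):
                engaged += 1
                # An engaged persona re-exposes the message to its followers.
                for follower in persona.followers:
                    if follower.name not in exposed:
                        exposed.add(follower.name)
                        next_frontier.append(follower)
        if not next_frontier:
            break
        frontier = next_frontier
    return len(exposed), engaged  # "message spread" and engagement count

alice = Persona("alice", {"ai", "marketing"})
bob = Persona("bob", {"ai"}, followers=[alice])
print(simulate([bob], message_topics={"ai", "startups"}))
```

In the real system the reaction step is LLM-driven and the graph comes from real social data, but the loop structure - react, engage, re-expose along edges - is the same basic idea.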
Our two biggest challenges are accuracy and UI. We’ve tested how well we predict LinkedIn post performance, and the initial results have been promising: our model has an R² of 0.78, and we’ve found that “message spread” in our simulations is the single most important predictor of actual engagement when comparing posts by the same author. But there’s a long way to go in generalising these simulations to other contexts and finding ground-truth data for evals. We have some more info on accuracy here: https://societies.io/#accuracy
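For the stats-minded: conceptually the eval is just a regression of real outcomes on simulated ones. Here's a hedged sketch - the numbers below are made up purely for illustration, and in practice you'd score on held-out posts rather than the fitted data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Placeholder data: each row pairs one post's simulated "message spread"
# with that post's real engagement count. These values are invented.
simulated_spread = np.array([[12], [45], [88], [130], [210], [300]])
actual_engagements = np.array([5, 20, 41, 66, 98, 150])

model = LinearRegression().fit(simulated_spread, actual_engagements)
print(f"R^2 = {r2_score(actual_engagements, model.predict(simulated_spread)):.2f}")
```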
In terms of UI, our biggest challenge is figuring out whether the ‘experiment’ form factor is attractive to users. We’ve deliberately focused on this (over AI surveys) as experiments leverage our expertise in social influence and how ideas spread between personas.
James and I are both behavioral scientists by training but took different paths to get here. I helped businesses run A/B tests to boost sales and retention. Meanwhile, James became a data scientist and, in his spare time, hooked together 33,000 LLM chatbots and wrote a paper about it (https://bpspsychub.onlinelibrary.wiley.com/doi/pdfdirect/10....). He showed me the simulations and we decided to make a startup out of it.
Pricing: Artificial Societies is free to try. New users get 3 free credits and then a two-week free trial. Pro accounts get unlimited simulations for $40/month. We’re planning to introduce team plans later, plus enterprise pricing for custom-built audiences.
We’d love you to give the tool a try and share your thoughts!
As a serial wantrepreneur, I know what the letters SEO stand for and what A/B testing is, but little else about this side of business. As a total novice, the ability to run a bunch of experiments quickly and cheaply, even if there isn't a 1:1 mapping to the real world, seems like a huge win.
To be honest I think I feel this way about most things ai/llm right now. I am ok taking the train to the stadium and walking half a mile over taking an uber to the front door and waiting in traffic.
I’ve seen a couple of startups pitching similar ideas lately - platforms that use AI personas or agents to simulate focus groups, either for testing products or collecting user insights. I can see the appeal: scaling audience feedback, reducing costs, and reaching demographics that are traditionally hard to access.
That said, this is one of the areas of AI that gives me the most concern. I work at a company building AI tools for creative professionals, so I'm regularly exposed to the ethical and sustainability concerns in this space. But with AI personas specifically, there is something a little more troubling to me.
One recent pitch really stuck with me: the startup was proposing to use AI personas for focus groups on products, and casually mentioned local government consultation. That's where I think this starts to veer into troubling territory. The idea of a local council using synthetic personas instead of talking directly to residents about policy decisions is alarming. It may be faster, cheaper, and easier to implement, but it fundamentally misunderstands what real feedback looks like.
LLMs don't live in communities. They don't vote, experience public services, or have lived context. No matter how well calibrated or "representative" the personas are claimed to be, they are ultimately a reflection of training data and assumptions - not the messy, multimodal, contradictory, emotional reality of human beings. And yet, decisions based on these synthetic signals could end up shaping products, experiences, or even policies that affect real people.
We're entering an era where human behaviour is being abstracted and compressed into models, and then treated as if it's a reliable proxy for actual human insight. That's a level of abstraction I'm deeply uncomfortable with and it's not a signal I think I would ever trust, regardless of how well it's marketed.
I'd be curious to know how you approach convincing others who are similarly skeptical, or who don't want to see this kind of tech abused for the reasons listed above.
That's what motivated me to start researching "Artificial Societies" - first as an academic project, now as a product everyone can use - because I believe the best way to build a new technology is to make it useful for as many people as possible, rather than reserving it for governments and enterprises. That's why, unlike other builders in this space, we've made it a rule to never touch defence use cases, and why we've gone against much business wisdom to build a consumer product anyone can use, rather than chasing bigger budgets.
We totally agree that synthetic audiences should never replace listening to real people - we insist on doing manual user interviews ourselves so that we can feel our users' pain firsthand. We hope what we build doesn't replace traditional methods but expands what market research can do. That's why we try to simulate how people behave in communities and influence one another: to capture the ripple effects that a traditional survey ignores because it treats humans as isolated line items, rather than the communities we really are.
Hopefully, one day, just like a new plane is first tested in a wind tunnel before risking the life of a test pilot, a new policy will also first be tested in an artificial society, before risking unintended consequences in real participants. We are still in the early days though, so for now, we are just working hard to make a product people would love to use :)
Is there any way to know if you ran a simulation of your message on Hacker News, and whether the actual outcome matched what your simulation predicted?
I'm a stats nerd, so I'm just here for the data.
I'm certain Big [insert industry] will gobble this kind of thing up.
I think you guys might be onto something, but I'm still skeptical as to whether you are the most accurate (on whatever metric). It's not surprising that you beat a survey of experts, or out-of-the-box commercial LLMs.
I'm more interested in seeing how your model performs against purpose specific models that are currently industry standard. Unless you're making the claim that you're the first service to predict content engagement?
In this case, it's a technology that is hoping to significantly enhance the ability of people with the money to sell whatever they like to targets they have modelled.
Leaving aside the stalky model bit, the thing that should give pause is this: where is the bulk of that money going to come from?
It's probably (perhaps not with this product but say with the Grok version) inevitable, but is giving more help to oil, junk food, tobacco, dodgy politicians, and the odd millionaire who fancies themselves the next Il Duce what we want?
This seems like it could be useful for product discovery (“what are these people complaining about?”), content marketing (“how will my twitter followers react to this blog post?”), and other… reactionary… activities. But what about GTM and lead-gen? Can you ask it “who has job title of CISO, within two degrees of connection to me, working at a company with at least 500 employees that is SOC2 certified?”
I think you need to focus on a target buyer and make sure you nail their use case, or you risk wasting time on a really cool product that kinda/sorta does everything.
What’s your differentiator? Are you in the business of data gathering and curation? Or do you enhance some existing targeting data with “talk to my audience?” These are two distinct product development paths… either you invest in sophisticated data scraping, or you focus on “bring your own audience.” Most companies already have this sort of data on their customers and prospects – how can you meet them where they are?
The other problem is garbage in, garbage out. This product is only as useful as the data you can gather (or that your customer brings to you). A list of emails and names isn’t much on its own. You need the data generated by those people. Maybe you need to partner with data brokers to enrich audience data with social media profiles. Or maybe you leave it up to your customer – let them upload all their support tickets (ZenDesk) and sales calls (Gong) to your software so that they can “talk to their customers.” (Hint: maybe you should partner with Gong, and similar companies who already have this data, to provide this feature to their customers. White-labeling this product might be your fastest path to market.)
But more existentially… is “opinions” the most important aspect of your customer that you want to simulate? And if so… why do you need a _network_ of customers for this? It seems like two disconnected ideas. An “audience” might be a group of people that you only know to be associated because of their shared subscription to your product. Or they all follow someone on Twitter. Or they’ve all written an HN comment with “trivial” in its text. Does the _network_ aspect actually matter, at all?
And if the network aspect is important… why? Is it because you want to discover a new audience? In that case, are you focusing on the right value by simulating the current audience? Or should you be focused more on features for expanding/enriching/discovering “similar” profiles?
I think you’ve got the basis of something really cool here, but you need to figure out your identity and core competency, or you risk doing a bunch of marginally useful stuff kinda well.
Did you see the recent HN launch of Sumble? I see some overlap and similarities between your products, and I’d suggest reaching out to them in case you can work together…
edit: Just saw you went to Cambridge… I live here, so ping me if you’re ever around and want to grab coffee. I’m “Miles Richardson” on LinkedIn (and uh… in life). Feel free to message me… I spent five years on a startup that never hit product/market fit, so I’m always happy to point out the hazards…
I'm chronically online but I have very few public profiles anyone could glean anything from. (And even the one I'm posting this with here is in the queue for deletion... supposedly... I should probably check on my request.)
I love the idea of going from "AI generated customer avatar" to "simulated real people". It would help add depth to the customer avatars, and lead to better product design.
I tried creating a society around products that I sell, but it looks like the "real-world data" is pulled from LinkedIn? I'm not necessarily targeting business people.
Hadn't seen that paper, thanks for sharing it. This is the one I see cited most often that's got some similar vibes: https://arxiv.org/abs/2411.10109
If it isn't based on ACTUAL buyers who have ACTUAL input, what is it really doing? Great job at creating something, but at the same time, it feels kind of unnecessary.
You're essentially telling people what you THINK they'll want.
You could have it collect personas from actual users who interact with you on social media - for example, Facebook page fans or x.com followers.
If the models are built from those actual users and their public social graphs, that gives a lot of data points to the inference engine about their demographics as well as their interests.
No idea if the results would be statistically meaningful, but it might bring good-enough results at a fraction of the time and cost of an actual study (which probably wouldn't be statistically rigorous either).
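Something like this sketch, say - where `fetch_public_profile` is a hypothetical stand-in for whatever platform API or scraper you'd actually use (each of which comes with its own terms-of-service and rate-limit constraints):

```python
from dataclasses import dataclass

@dataclass
class Persona:
    handle: str
    bio_keywords: list
    follows: list

def fetch_public_profile(handle: str) -> dict:
    # Hypothetical stand-in for a real platform API or scrape; actual
    # access is gated by each platform's API terms and rate limits.
    return {"bio": "indie maker, coffee, open source", "follows": ["@a", "@b"]}

def persona_from_follower(handle: str) -> Persona:
    # Derive a simple persona from the public bio and follow graph.
    profile = fetch_public_profile(handle)
    keywords = [w.strip() for w in profile["bio"].split(",")]
    return Persona(handle=handle, bio_keywords=keywords, follows=profile["follows"])

audience = [persona_from_follower(h) for h in ["@fan1", "@fan2"]]
print(audience[0].bio_keywords)  # ['indie maker', 'coffee', 'open source']
```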