> To improve the stability of the resulting designs, we employ an efficient validity check and physics-aware rollback during autoregressive inference, which prunes infeasible token predictions using physics laws and assembly constraints.
I'm far from an AI expert, but I've long felt that this is one of the most interesting ways to use AI: to generate and optimize possibilities within a set of domain-specific constraints that are programmed manually.
For example, imagine an AI that is designed to optimize traffic light patterns. You want a hard constraint that no intersection gives a combination of green lights that could cause collisions. But within that set of constraints, which you could manually specify, the AI could go wild trying whatever ideas it can come up with.
At that point, the interesting work is deciding how to design the problem space and the set of constraints. In this case it's a set of lego bricks and how they can be built (and be stable).
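A minimal sketch of the traffic-light idea, assuming a made-up four-way intersection (the movement names, the conflict pairs, and the demand numbers are all hypothetical). The hard constraint is hand-written, and the optimizer, here plain exhaustive search standing in for "the AI", only ever scores phases that pass it:

```python
from itertools import combinations

# Hypothetical movements at one intersection (illustrative only).
MOVEMENTS = ["N_straight", "S_straight", "E_straight", "W_straight",
             "N_left", "S_left"]

# Hand-written hard constraint: pairs that must never be green together.
CONFLICTS = {
    frozenset(pair) for pair in [
        ("N_straight", "E_straight"), ("N_straight", "W_straight"),
        ("S_straight", "E_straight"), ("S_straight", "W_straight"),
        ("N_left", "S_straight"), ("S_left", "N_straight"),
        ("N_left", "E_straight"), ("N_left", "W_straight"),
        ("S_left", "E_straight"), ("S_left", "W_straight"),
    ]
}


def is_safe(phase):
    """True if no two movements in the phase are in conflict."""
    return not any(frozenset(pair) in CONFLICTS
                   for pair in combinations(phase, 2))


def best_phase(demand):
    """Return the safe phase that serves the most waiting vehicles."""
    safe_phases = [
        phase
        for size in range(1, len(MOVEMENTS) + 1)
        for phase in combinations(MOVEMENTS, size)
        if is_safe(phase)
    ]
    return max(safe_phases, key=lambda ph: sum(demand.get(m, 0) for m in ph))


# Made-up number of waiting vehicles per movement.
demand = {"N_straight": 10, "S_straight": 8, "E_straight": 12,
          "W_straight": 3, "N_left": 2, "S_left": 1}
best = best_phase(demand)
```

Swapping the exhaustive search for something smarter would not change the safety guarantee, because `is_safe` filters candidates before anything is scored.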
> to generate and optimize possibilities within a set of domain-specific constraints
Well, yes, we've been doing this for several decades, many people call it metaheuristics. There is a wide array of algorithms in there. An excellent and light intro can be found here: https://cs.gmu.edu/~sean/book/metaheuristics/
Metaheuristics? I always thought it's similar to "I don't know how many neurons to put in the hidden layer... and I also don't know how many hidden layers I need, so, let's make it a part of the optimisation problem to find out on its own".
Thanks, but by some strange coincidence this is exactly the book I have right now. In the introduction the author says, "I think these notes would best serve as a complement to a textbook". Do you happen to know any good textbooks on that topic?
A simple version of this that already shines with existing LLMs is JSON Schema mode. You can go quite a long way towards making illegal states unrepresentable and then turn a model loose in the constrained sandbox, with the guarantee that anything it produces will be at least valid if not correct: it's basically type safety for LLM output.
The same mechanism that underlies JSON Schema support can be applied to any sort of validation and correction, and yeah, I'd love to see more of this kind of thing!
Agree with this. Constraining generation with physics, legality, or even tooling limits turns the model into a search-and-validate engine instead of a word predictor. Closer to program synthesis.
The real value is upstream: defining a problem space so well that the model is boxed into generating something usable.
You'd probably use some kind of MILP or CLP based model for that kind of thing, wouldn't you? The constraints define the search space and the solver algorithm then explores it.
I haven't read how they apply the constraint. But there is similar stuff when you force an LLM to generate structured output such as JSON. llama.cpp, for example, lets you constrain the output to a custom grammar.
Error feedback seems to be the one thing that can unlock some of the original promises.
For example, if you give a text-to-SQL bot access to the same idea (e.g., error feedback from the SQL provider), it is much more likely to succeed in generating valuable queries.
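A rough sketch of that error-feedback loop, using Python's built-in `sqlite3`. The `fake_model` function is a hypothetical stand-in for a real LLM call (it is hard-coded to get the query wrong once and right after seeing an error), and the table and question are made up:

```python
import sqlite3


def fake_model(question, feedback):
    """Hypothetical stand-in for an LLM call.

    The first attempt has a typo in the column name; once it is shown
    an error message, it returns a corrected query.
    """
    if feedback is None:
        return "SELECT nme FROM users"
    return "SELECT name FROM users"


def query_with_feedback(conn, question, generate=fake_model, max_attempts=3):
    """Ask for SQL, run it, and feed any database error back to the generator."""
    feedback = None
    for attempt in range(1, max_attempts + 1):
        sql = generate(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
            return sql, rows, attempt
        except sqlite3.Error as exc:
            # The error text from the database becomes part of the next prompt.
            feedback = f"Query {sql!r} failed: {exc}"
    raise RuntimeError(f"no valid query after {max_attempts} attempts: {feedback}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])

sql, rows, attempts = query_with_feedback(conn, "List all user names")
```

Note that this only guarantees the final query executes; whether it answers the question is still up to the model.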
Totally agree, this is where AI shines the most for me too. Let humans define the rules of the game (like physics or traffic safety), and let the AI explore the massive search space for optimized solutions.
Ask an LLM: "Say the word APPLE", but modify the code so the logits of the tokens for Apple/apple/APPLE are permanently set to -Inf, i.e. the model cannot say that word.
The output ends up like this:
"Banana. Oh, just kidding. Banana. Oh, it's so tasty I said it wrong. Lets try again: Orange. Whoops, I meant to say grape. No I meant to say the tasty crunchy fruit known as a carrot".....
Note that OP's traffic light problem would suffer the same problem.
I.e., a smart model, knowing it cannot say a word, will give the next best solution - for example maybe saying "A P P L E" or maybe "I'm afraid I'm not able to do that".
However, a constrained model does not know or understand its own constraints, so it keeps trying to do things which aren't allowed - and even goes back and tries to redo these things which aren't allowed, because to the model it is a mistake which needs correcting.
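The masking step described above can be sketched in a few lines. This is a toy, assuming a made-up five-token vocabulary and made-up logits; a real tokenizer may split the word into several tokens, which this ignores:

```python
import math


def softmax(logits):
    """Convert logits to probabilities (max subtracted for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mask_logits(logits, vocab, banned):
    """Set the logit of every banned token (case-insensitive) to -inf."""
    return [-math.inf if tok.lower() in banned else x
            for tok, x in zip(vocab, logits)]


# Toy vocabulary and logits: the model strongly "wants" to say apple.
vocab = ["APPLE", "apple", "Banana", "Orange", "carrot"]
logits = [5.0, 4.0, 2.0, 1.5, 0.5]

masked = mask_logits(logits, vocab, banned={"apple"})
probs = softmax(masked)
choice = vocab[probs.index(max(probs))]  # greedy pick among what is left
```

The mask only changes what the sampler is allowed to pick. Nothing inside the model is told about the ban, which is consistent with the behavior described: the next step still starts from a context where the "wrong" word was just emitted.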
Like your brain when you know you know a word but it's just not surfacing in your mind.
I'm guessing I'm not that different from the average human and I can 'feel' something physically while I'm searching for the word. I've always wondered what that was.
They are actively using actual LEGO bricks, and as such they are not misrepresenting anything.
Where there is gray area is in them not clearly stating they are unaffiliated with LEGO the company.
OTOH, they also don't seem to be looking to monetize anything, so they are at lower risk from LEGO having a plausible claim that they are hurting their sales.
While it is perfectly valid to describe what they made as a designer or builder for LEGO, I do not believe they are allowed to use part of a trademark in a way that could be trademarkable itself, so basically good for everything but the name.
But then again IANAL, and that is just how I understand the American law, and every country is different.
IANAL, but EU law doesn't have "fair use". It does have a _very specific_ set of uses where you don't have to ask for permission (or pay). As I understand, it is more limited than the US' "fair use" doctrine.
EU being EU, I can only imagine there's a bunch of particular rules around research that may or may not work in the authors' favor.
Trademark law leaves no space for that. The Lego Group has to actively defend their trademark. That means a name like LegoGPT is really on the obvious end of 'don't do that'.
Completely agree. This should be well beyond accusations of corporate bullying. It's one thing to mention Legos, it's another to actively include a brand name in your product! NikeGPT, CocaColaGPT and IkeaGPT will face the same issue ;)
Mentioning Lego is absolutely OK, and you can sell used Lego as well and note that you are using genuine Lego bricks (resale laws simply allow that). Lego is really antsy about anything which might look like it is actually a Lego Group initiative though, and anything where Lego bricks are offered for sale in a modified state¹.
Also, they don’t tend to go after fan-made things like this. Based on some googling, they typically throw the book at counterfeit producers who are eating into their profits.
(Initially) fan-made stuff which gets big enough to get noticed usually won't be able to call themselves something with 'Lego' in it. Usually some variation of 'brick' is used instead (e.g., Bricklink, Rebrickable, EuroBricks, etc.).
Sega's [0] main business is pachinko (so gambling). To them, fans using the Sonic brand has very few consequences; if anything, it builds much-needed goodwill toward their other brands.
Sega was mostly into normal arcade games, and Sammy bought them for their expertise, to improve Sammy's much more profitable gambling machines. It was Sammy's CEO who took the lead, and Sonic and console games became a mere side business.
They just won the market because, historically, they reused an existing locking-brick concept from a company called Kiddicraft, found a way to make it lock better... and patented it before the original company and other companies could implement it.
We can say that they became famous half for engineering reasons, and half thanks to their legal department...
I don’t need automation to build LEGO sets — that’s the fun part, and I want to do it myself. What I need is automation after the build: to clean up, sort the bricks by color and shape, and store them properly.
I just wish scientists would start by solving problems that actually exist in the real world. There’s real value — and real money — in that.
This does not seem like a very impressive result. It's using such a small set of bricks and the results don't really look much like the intended thing.
It feels like a hand-crafted algorithm would get a much better result.
There already exists an app that will, from photos of your pile, pick out models you can make from a large library of existing models. Though IIRC that has been around long enough that it isn't quite using what people are currently calling AI (instead using older ML techniques for brick identification, and a basic DB search to pick out the valid plans for the resulting list of bricks).
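The "basic DB search" half of that can be sketched as a multiset containment check. The part names, the model library, and the inventory below are made up, and the brick-identification step from photos is assumed to have already produced the inventory:

```python
from collections import Counter


def buildable_models(inventory, library):
    """Return the names of models whose part list fits within the inventory."""
    have = Counter(inventory)  # missing parts count as zero
    return [
        name
        for name, parts in library.items()
        if all(have[part] >= needed for part, needed in parts.items())
    ]


# Hypothetical model library: model name -> required count per part type.
library = {
    "small car": {"2x4": 2, "2x2": 4, "wheel": 4},
    "tower": {"2x4": 6},
    "wall": {"2x4": 3, "1x2": 2},
}

# Hypothetical result of identifying the bricks in a photo of the pile.
inventory = {"2x4": 4, "2x2": 4, "wheel": 4, "1x2": 2}

models = buildable_models(inventory, library)
```

Each model is checked independently, so the result lists what could be built one at a time, not all at once.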
There’s a bug on the page (on iPhone, at least) once you scroll to the gifs that it starts to auto load them without doing anything, making it really hard to navigate anywhere at that point.
It's hilarious watching $50,000 worth of robots take so long to assemble a couple dollars worth of Lego. It's like peering into the old folks home for robots.
SMT component placement isn't that different from placing bricks. Conventional wisdom is that if you can design a PCB that requires no manual work, its assembly cost is more-or-less location independent. SMT pick and place can hit speeds of 200,000 components per hour [1]. That's roughly 55 components per second.
The tasks requiring high dexterity like final assembly of the product with displays, keyboards, ribbon cables and cases is still done by humans by hand.
Fixturing isn't automated in most places. Sure a gantry style CNC machine can drive screws vertically into your parts to join them, but it requires a human loader to put the two parts onto the fixture in the first place.
Those are already an issue. AI is a bigger threat to cognitive tasks than to physical ones.
Skynet isn't gonna attack you with Terminators wielding a "phased plasma rifle in the 40W range", but it will be auto-rejecting your job applications and your health insurance claims, deciding your credit score, and brainwashing your relatives on social media.
There’s a difference though. The “cool” Terminator Skynet pursues its own goals, and wasn’t programmed by humans to kill. The “boring” insurance-rejecting Skynet is explicitly programmed to reject insurance claims by other humans, unfortunately.
So still, no need to worry about our AI overlords, worry about people running the AI systems.
> AI is a bigger threat to cognitive tasks than to physical ones.
I don't see how you could possibly think this is true. Physical automation is easier to scale since you only need to solve a single problem instance and then just keep applying it on a bigger scale.
Automation doesn't work where high dexterity and quick adaptability are required. It is much cheaper and quicker to retrain a human worker to move from sewing a Nike shoe to an Adidas shoe than it is to reprogram and retool a robot.
Robots work for highly predictable high speed tasks where dexterity is not an issue, like PCB pick and place.
I noticed that "a basic sofa" involves placing some floating bricks if built in the order of the animation. It hints at the way this model generates the designs. The automated assembly of generated LEGO structures using robots would have serious trouble creating these designs, I reckon.
I came here to say that. I immediately thought: Wow, this works in the assembled version, but not the way the assembly is being animated. You would need to first build the base sofa layer from two levels so that the upper layer keeps the lower layer bricks in place. Only afterwards could it be put onto the legs.
Indeed, I would be very curious to see how their robots would actually build that sofa. Although the robots aren't really part of the model of course, they're just a little extra.
Quit trying to read the article after the 15th video went to full screen and had to be dismissed hitting the tiny x in the upper left… 3 more interfered with me trying to go back to this page
Keep in mind that these sites are run by AI researchers, not dedicated UX teams at major tech companies—so the interface can feel a bit rough around the edges.
That said, your critique is still valid; it’s just fair to cut them a little slack given their priorities.
The high backed chair gif example is interesting - the way it’s animated it would completely fall apart and be unstable. But if you built it in reverse, it would work fine.
But it also shows the weirdness of the solution - in places where larger bricks make sense, multiple smaller bricks are used instead. In a section where a 2x6 should be repeated, in one instance of the repetition it uses two 1x6s. It’s weird.
Doesn’t seem to add much to just converting a 3d model into voxels and therefore bricks.
Using bricks other than 2x2 and 2x4 blocks creatively to make interesting things is really important, but I’m not sure what type of algorithm would best auto-generate beautiful MOCs. Was thinking of doing a $50000 Kaggle comp for this, what do others think?
Great. Please do cabinets next. Constrain to some specified material such as 2.5m by 1.25m 18mm ply. Iterate designs by text and output the model, cutlist and assembly instructions. Simple right?
When I was a kid, I proudly exclaimed I wanted to become a professional lego builder. Not in my wildest dreams would I have assumed how close to that career path I could have come.
Have the authors never heard of Lego being one of the companies that are super strict about their trademark? They file takedown notices etc on every project they see. Even if the stone design has the little thingies on top/bottom...
Cool project, but judging from the videos, it looks like some of them can't actually be built using those instructions. E.g. "A backless bench with armrest" would require some bricks to float in the air with no support while you're assembling the rest.
The design is sound, just not the order of assembly shown. For the bench, the lower bricks are suspended by the upper ones, so would need to be assembled separately before connecting to the legs.
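To make the "sound design, wrong order" point concrete, here is a small sketch that checks a build order for bricks that would float at the moment they are placed. The brick representation (unit-height boxes on a grid) and the bench layout are assumptions for illustration, not the project's actual format, and the check is about attachment only, not physical stability:

```python
from typing import NamedTuple


class Brick(NamedTuple):
    x: int  # position along x
    y: int  # position along y
    z: int  # layer, 0 = resting on the ground
    w: int  # size along x
    d: int  # size along y


def overlaps(a, b):
    """True if the two footprints share at least one stud position."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.d and b.y < a.y + a.d)


def first_unsupported(order, allow_hanging=False):
    """Index of the first brick with nothing to attach to when placed.

    A brick is attached if it is on the ground or overlaps an already
    placed brick one layer below. With allow_hanging=True it may also
    clip onto an already placed brick one layer above.
    Returns None if every brick is attached when placed.
    """
    placed = []
    for index, brick in enumerate(order):
        attached = brick.z == 0 or any(
            overlaps(p, brick) and
            (p.z == brick.z - 1 or (allow_hanging and p.z == brick.z + 1))
            for p in placed
        )
        if not attached:
            return index
        placed.append(brick)
    return None


# Made-up bench: two legs, a three-brick lower seat layer, one long top plate.
leg_l, leg_r = Brick(0, 0, 0, 2, 2), Brick(6, 0, 0, 2, 2)
left, middle, right = Brick(0, 0, 1, 2, 2), Brick(2, 0, 1, 4, 2), Brick(6, 0, 1, 2, 2)
top = Brick(0, 0, 2, 8, 2)

layer_by_layer = [leg_l, leg_r, left, middle, right, top]
reordered = [leg_l, leg_r, left, right, top, middle]

floating_at = first_unsupported(layer_by_layer)
hanging_ok = first_unsupported(reordered, allow_hanging=True)
```

In the layer-by-layer order, `middle` has nothing under it and the top plate is not there yet, so it is flagged. Placing the top plate first and then clipping `middle` on from below passes, but only if the builder (human or robot) can attach bricks from underneath.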
True, easy for a human, not so easy for a robot to go through those extra steps. I wonder if they made it work with the robots, because in the video they only show the robots building from the bottom up.
However, the model "A high-backed chair" has some floating pieces in the middle of the seat, that are fastened from above. Can these robots handle building these?
Indeed. I'm guessing legal is the only reason we don't have 3d-printed ikea. Raymond Loewy FTW. But then, we'd have garages full of bespoke n-of-1 junk instead of mass-made LLM (liminal-labor-made) junk.
So, besides training an LLM to generate build instructions for LEGO models, they have robots to assemble these models, and they applied 3D textures to the generated 3D models (what for?).
Sometimes the amount of money and energy that are spent in "recreation" projects just amazes me.
You do realize that a system like LEGO is just an extremely efficient and cheap proof of concept with a proxy material (LEGO) for later real life applications of building X from standardized components Y right?
This is interesting and seemingly quite applicable base research and we move forward by being curious.
Indeed. I thought blocks world stuff would be amazing for early childhood education. I'm guessing some labs are already there, since Minecraft has supported user-programmable models for years, though I dunno the details. I'd be happy to learn if anybody knows of their evolution since the rise of AI.
https://en.wikipedia.org/wiki/Combinatorial_chemistry
If you want to be safe do not use the word LEGO. Use Bricks or in German "Klemmbausteine".
Many people have had to deal with LEGO's lawyers and it ain't pretty.
1: Never, ever, sell modified Lego bricks: https://www.brickfanatics.com/lego-wins-court-case-against-c...
That's where Nintendo is fundamentally different.
[0] https://en.m.wikipedia.org/wiki/Sega_Sammy_Holdings
Also don’t forget that Sega was “originally an importer of coin-operated arcade games to Japan and manufacturer of slot machines and jukeboxes”
https://en.m.wikipedia.org/wiki/History_of_Sega
In the casinos?
(Totally feasible with today's technology, but you'll need to train your own specialized models.)
annoying that this is the default behaviour on iOS though
If anyone else was searching for the dataset, it is at https://huggingface.co/datasets/AvaLovelace/StableText2Lego
It "contains 47,000+ different LEGO structures, covering 28,000+ unique 3D objects from 21 common object categories of the ShapeNetCore dataset".
Local inference instructions are over at their github page - https://github.com/AvaLovelace1/LegoGPT/?tab=readme-ov-file
https://youtu.be/Ca-SoKzjh4M?t=110
[1] https://www.hallmarknameplate.com/smt-process/
Cool idea.
I guess I learned a word today...
Can it produce an ample bosom made of lego? An indecent protrusion? Weapons?