That's a nice piece of motor engineering. It's well known that high ratio gearboxes for robots are a headache. Back driveability doesn't work, and tiny teeth are fragile. Comments on this go all the way back to Feynman writing about his time spent engineering automatic gunnery aiming systems in WWII.
The new discovery here is that gearbox problems degrade a machine learning system: it spends much of its learning capacity tracking gearbox noise.
This discovery means that robotics people can tap machine learning funding for motor and gearbox development. Robotics labs used to be really low-budget operations. No longer.
What you really want is a direct drive motor, but those have to be large-diameter. They can be flat; that's a pancake motor. That's too large for fingers. So their compromise moves partly in that direction; the rotor is flatter, torques are higher, speeds are slower, and gearbox ratios are lower. As they point out, reflected inertia is the square of the gear ratio, because the gear ratio gets you both going out and coming back. So this is a bigger than linear win.
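That quadratic relationship is easy to check numerically. A minimal sketch (the rotor inertia and gear ratios below are made-up illustrative values, not from the paper):

```python
def reflected_inertia(j_rotor_kg_m2: float, gear_ratio: float) -> float:
    """Rotor inertia as seen from the output side of an N:1 gearbox:
    J_reflected = J_rotor * N**2."""
    return j_rotor_kg_m2 * gear_ratio ** 2

j_rotor = 1e-5  # kg*m^2, hypothetical small BLDC rotor

# Halving the gear ratio cuts reflected inertia by 4x, not 2x:
high = reflected_inertia(j_rotor, 100)  # 100:1 gearbox
low = reflected_inertia(j_rotor, 50)    # 50:1 gearbox
print(high / low)  # 4.0
```

So a modest reduction in gear ratio buys a disproportionate reduction in the inertia the controller (or the learned policy) has to fight.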
Good back-drivability means much less risk of gear breakage on overload. Some of the academic designs, such as harmonic drives and series elastic actuators, have huge gear ratios in a small space. That's OK for prototypes but not production. As I've mentioned before, "you cannot strip the teeth of a magnetic field", a line from a GE electric locomotive salesman around 1900. If an overload forces a motor backwards, nothing breaks.
Would have been nice to hear more about the motor design. That's the real achievement here. There are CAD tools which understand electromagnetic fields now, so designing unusual motor geometries is no longer the trial-and-error, experience-driven process it once was. It's also respectable for an EE to work on rotating machinery again. That field matured around the 1960s, and until computers took over motor control, didn't change much.
While it’s got some clear LLM patterns, the content seems novel enough to be worth the squeeze. That, or I’m far enough outside of my Gell-Mann amnesia bubble that I can’t see the slop.
Surgical robots and robot pianos both exist. Neither employs humanoid hands. This all just illustrates how humanoid robots are, in multiple dimensions, going down technology rat holes. In some cases better solutions already exist without looking humanoid. In other cases, the humanoid form factor fails to address problems like a high center of gravity in a device that needs to not fall on grandma while helping her around the house.
I continue to be amazed that the wrong form factor keeps being pursued. Though I suppose I shouldn't be too surprised given the parade of failed "AI devices."
I think one major draw to human-like form factors is the reuse of existing ecosystems and tools. If you have human-like grasping, you can reuse tools and utensils made for human hands; otherwise, you need custom attachments. If you have human-like legs you can navigate stairs, wear pants for customization, and possibly operate a car or bike.
It's a bit like choosing JS / Python -- of course performance is inferior to a compiled language with highly tailored code, but they are flexible and have an ecosystem that might do 99% of the lifting for you.
But in isolation, I agree with your idea: specialized robots with forms fitted to a specific task will likely outperform a more generalized solution within that domain of behavior, while the more generalized one will likely outperform in flexibility and reusability (e.g. being capable of reusing the human ecosystem).
I think it’s less about tools and more about the spaces that humans operate in.
You don’t need a human-like hand to hold a tool made for humans. As an extreme example, you can make a robot operate a power drill with a strap to hold it and a servo with a small bit of wood to operate the trigger mechanism.
But for a robot operating in a space made for humans there certainly are some physical requirements which are based on the human form: maximum volume and clearances, stairs, fragile fixtures that can’t be operated with too much force, etc.
Ever walk through some over-crowded antique shop where you need to twist and lean your body to avoid knocking into things?
There are a whole lot of tools intended for human use that I would use much more effectively if I could rotate my wrist repeatedly in the same direction.
Many overactuated, purpose built robots (like surgical robots and pianos) exist, and have existed since the Unimate, and work great in certain situations. The problem with all of them is they are very expensive, often extremely large, and single purpose or very narrow purpose (and even if they are narrowly multipurpose, require tons of setup to get to work for each job they are intended to do).
I personally am not bullish on 1:1 human hands either, but IMO the question shouldn't be $100k 2-ton Kuka arm vs. biped with hands; it's overactuated robotics (build it from the floor up with hard-coded operations) vs. underactuated (build it from the contact point of the work backwards with ML and sensors). We shall see which form factors prevail, but the type of robotics development posted here seems like the way forward regardless: an ecosystem of small, power-dense, reliable, accurate QDD actuators will lead to many general-purpose robot applications. I recognize I am not using underactuated vs. overactuated in their strict definitions here, but if you are familiar with robots I think you'll understand where I am coming from as far as a robot design ethos.
I will say, though, that in designing robots of this type without necessarily being bound by trying to make a robot look like a human, I have often found myself accidentally recreating human arm DOF in a roundabout way; it really does end up being a well-packaged design, beyond the "world designed for humans" talking point. Maybe hands will end up being a similar situation.
I see it as trying to apply the bitter lesson to robotics. Specialized robots will always have their place, but humanoid ones can take advantage of all the design interfaces that already exist in the world for humans.
Similar to how claude code gained so much traction in terminal by just leveraging the command line interface that already exists for humans, no need to invent a domain specific MCP to just run shell commands.
I agree with you that it's far from the most efficient approach for specific tasks. But the analogy would be that you also generally don't want to use LLMs to do something you can "just" write a script for... that doesn't make LLMs useless though.
Similar to how we are seeing LLMs shoved into spaces where existing ML was already doing well and better suited.
Not to dismiss the value of LLMs in those cases as an interface/interpretation layer.
If grandma goes into the windowless surgery factory, I just want the best bots working on her. There is value in having Dr. Bot the replicant give me the face-to-face status updates. We are not breaking out those layers as much, anymore, as the focus becomes minimizing FOMO.
A human will fall over too if pushed into a sufficiently awkward corner. It’s a fundamental problem with things that aren’t statically stable and need active stabilization.
You are right. If the hand is doing a specific task, better morphologies are likely. But that's not always desirable. The canonical example is of course the household. I don't want X robots, I want 1. And I don't want to change anything. Robot hand!
Not to mention that the world is very widely designed to be manipulated by hands: doorknobs, handles, container sizes. A unique door opening appendage isn't going to do much good around your house.
Multiple times, over and over.
We need to stop with the AI stuff.