That's a nice piece of motor engineering. It's well known that high ratio gearboxes for robots are a headache. Back-drivability doesn't work, and tiny teeth are fragile. Comments on this go all the way back to Feynman writing about his time spent engineering automatic gunnery aiming systems in WWII.
This new discovery is that gearbox problems mess up a machine learning system. It's trying to track gearbox noise and is using up all its learning capacity on that.
This discovery means that robotics people can tap machine learning funding for motor and gearbox development. Robotics labs used to be really low-budget operations. No longer.
What you really want is a direct drive motor, but those have to be large-diameter. They can be flat; that's a pancake motor. That's too large for fingers. So their compromise moves partly in that direction; the rotor is flatter, torques are higher, speeds are slower, and gearbox ratios are lower. As they point out, reflected inertia scales with the square of the gear ratio, because the gear ratio gets you both going out and coming back. So this is a bigger than linear win.
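To put rough numbers on the square law, here is a minimal sketch assuming an ideal, lossless gearbox. The rotor inertia and both gear ratios are made-up illustration values, not figures from the article, and `reflected_inertia` is a hypothetical helper name.

```python
import math


def reflected_inertia(rotor_inertia: float, gear_ratio: float) -> float:
    """Rotor inertia as felt at the output shaft of an ideal gearbox.

    The ratio appears twice: output motion is multiplied by N to get rotor
    motion, and rotor torque is multiplied by N again on the way back out.
    """
    return rotor_inertia * gear_ratio ** 2


# Hypothetical values, for illustration only.
ROTOR_INERTIA = 1e-6  # kg*m^2
HIGH_RATIO = 200      # a conventional high-reduction gearbox
LOW_RATIO = 20        # a lower-ratio, quasi-direct-drive style gearbox

high = reflected_inertia(ROTOR_INERTIA, HIGH_RATIO)
low = reflected_inertia(ROTOR_INERTIA, LOW_RATIO)

print(f"{HIGH_RATIO}:1 -> {high:.4g} kg*m^2 at the output")
print(f"{LOW_RATIO}:1 -> {low:.4g} kg*m^2 at the output")
print(f"reduction factor: {high / low:.0f}x")
```

Cutting the ratio by 10x cuts the reflected inertia by 100x for the same rotor. A flatter, higher-torque rotor will usually have more inertia of its own, so the real gain is smaller than this idealized figure.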
Good back-drivability means much less risk of gear breakage on overload. Some of the academic designs, such as harmonic drives and series elastic actuators, have huge gear ratios in a small space. That's OK for prototypes but not production. As I've mentioned before, "you cannot strip the teeth of a magnetic field", a line from a GE electric locomotive salesman around 1900. If an overload forces a motor backwards, nothing breaks.
Would have been nice to hear more about the motor design. That's the real achievement here. There are CAD tools which understand electromagnetic fields now, so designing strange motor geometries is not as much of a trial-and-error, experience-driven process as it once was. It's also respectable for an EE to work on rotating machinery again. That field matured around the 1960s, and until computers took over motor control, didn't change much.
They don't back-drive well. The whole point of this hand design is to back-drive the contact forces into the motor, where there's force control. They're somewhat bulky, too.
Key concept: force-based motor control works quite well. Preserve that property through the gear train and force-based hand control works.
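As a rough illustration of that idea, here is a sketch that estimates fingertip force from motor current, assuming a gear train transparent enough that torque passes through with a known efficiency. The function name, the torque constant, and all numbers are hypothetical, not taken from the article.

```python
import math


def fingertip_force(current_a: float,
                    torque_constant: float,
                    gear_ratio: float,
                    lever_arm_m: float,
                    efficiency: float = 1.0) -> float:
    """Estimate contact force (N) at the fingertip from motor current.

    Assumes motor torque is proportional to current (tau = Kt * I) and that
    the gear train scales it by the ratio times a fixed efficiency.
    """
    motor_torque = torque_constant * current_a            # N*m at the rotor
    joint_torque = motor_torque * gear_ratio * efficiency  # N*m at the joint
    return joint_torque / lever_arm_m                      # N at the contact


# Hypothetical example: 0.5 A, Kt = 0.01 N*m/A, 15:1 ratio, 30 mm finger link.
force = fingertip_force(current_a=0.5,
                        torque_constant=0.01,
                        gear_ratio=15,
                        lever_arm_m=0.03)
print(f"estimated fingertip force: {force:.2f} N")
```

The estimate is only as good as the assumption that friction and backlash in the gear train are small. With a high-ratio gearbox that assumption fails, and motor current stops being a useful proxy for contact force.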
What? An ideal capstan drive can be backdriven perfectly fine. You only run into problems once it stops being ideal (e.g. built out of heavy parts, high gear ratio, etc.)
It's the high reduction ratio that's the problem. If you build a 200:1 capstan, it's not going to back drive well. And it won't be anywhere near ideal.
"This deployment is temporarily paused" is the crappiest "this account has gone over its budget" error page ever. Does that mean the site's down, or is that some meta-joke like "you've reached the end of the Internet"? Quick, explain to my non-technical friends what a "deployment" is. They're not trying to go on a deployment; they're trying to look at a website.
In this case, I'm not sure it matters what it says or how your non-technical friends interpret it. The site is down. Why it is down doesn't change the next thing casual viewers will do (close the tab).
But it does matter to their opinion of the site: is it down because the author took it down, is it down due to a technical problem, or is it down because the hosting provider took it down?
"This deployment is temporarily paused", if anything, sounds like the people who put the site up took it down again. That sends the wrong message.
Personally, if my hosting provider took my post down, I'd want them to make that obvious to my visitors. Or at the very least make it look like a technical issue. Not make it look like I took it down.
Is it? The title is "The Robotic Dexterity Deadlock". For all I know, it's a joke about what deadlock looks like for robots, showing what could be interpreted as a deadlock in a webserver. At a glance, I can't tell if the site is down, or if it's up and correctly showing its very short message.
So, yeah, in reality, I'm 99% sure it really is an error message. That's only because I've seen similar error messages in the past and can infer how to interpret it.
You can redesign every single actuator, sure. Or you can design a wrist camera and use it to close the loop. Or just run enough imitation learning + "RL IRL" and fold the imperfect real world actuators into the policy implicitly.
Robotics doesn't have a single silver bullet - the design space is vast and underexplored.
Tip for those who want to skip shit like this: excessive headings glued together by bullet points are quick to spot, especially since the headings almost always start with "The".
Deciding whether A is an X or a Y is a really basic part of why we're all communicating. Suspicion of em dashes is one thing, but once you start getting nervous on seeing "It’s not X. It’s Y." then you're just going to get paranoid.
The fundamental job of an LLM is to statistically match its output with the corpus. The tics it has are really common in natural human usage too.
While it’s got some clear LLM patterns, the content seems novel enough to be worth the squeeze. That or I’m far enough outside of my Gell-Mann amnesia bubble that I can’t see the slop.
Surgical robots and robot pianos both exist. Neither employs humanoid hands. This all just illustrates how humanoid robots are, in multiple dimensions, going down technology rat holes. In some cases better solutions already exist without looking humanoid. In other cases, the humanoid form factor fails to address problems like a high center of gravity in a device that needs to not fall on grandma while helping her around the house.
I continue to be amazed that the wrong form factor keeps being pursued. Though I suppose I shouldn't be too surprised given the parade of failed "AI devices."
I think one major draw of human-like form factors is the reuse of existing ecosystems and tools. If you have human-like grasping, you can reuse tools and utensils made for human hands; otherwise, you need custom attachments. If you have human-like legs you can navigate stairs, wear pants for customization, and possibly operate a car or bike.
It's a bit like choosing JS / Python -- of course performance is inferior to a compiled language with highly tailored code, but they are flexible and have an ecosystem that might do 99% of the lifting for you.
But in isolation, I agree with your idea that specialized robots with a form fitted specifically to the task will likely outperform a more generalized solution in a specific domain of behavior; the more generalized will likely outperform in flexibility and reusability (e.g. capable of reusing the human ecosystem).
I think it’s less about tools and more about the spaces that humans operate in.
You don’t need a human-like hand to hold a tool made for humans. As an extreme example, you can make a robot operate a power drill with a strap to hold it and a servo with a small bit of wood to operate the trigger mechanism.
But for a robot operating in a space made for humans there certainly are some physical requirements which are based on the human form: maximum volume and clearances, stairs, fragile fixtures that can’t be operated with too much force, etc.
Ever walk through some over-crowded antique shop where you need to twist and lean your body to avoid knocking into things?
There are a whole lot of tools intended for human use that I would use much more effectively if I could rotate my wrist repeatedly in the same direction.
Many overactuated, purpose built robots (like surgical robots and pianos) exist, and have existed since the Unimate, and work great in certain situations. The problem with all of them is they are very expensive, often extremely large, and single purpose or very narrow purpose (and even if they are narrowly multipurpose, require tons of setup to get to work for each job they are intended to do).
I personally am not bullish on 1:1 human hands either, but IMO the question shouldn't be $100k 2 ton Kuka arm vs biped with hands; it's overactuated robotics (build it from the floor with hard coded operations) vs underactuated (build it from the contact point of the work backwards with ML and sensors). We shall see which form factors prevail, but the type of robotics development posted here seems like the way forward regardless: an ecosystem of small, power-dense, reliable, accurate QDD actuators will lead to many general-purpose robot applications. I recognize I am not using underactuated vs overactuated in their strict definitions here, but if you are familiar with robots I think you'll understand where I am coming from as far as a robot design ethos.
I will say though in designing robots of this type without necessarily being bound by trying to make a robot look like a human, I have often found myself accidentally recreating human arm DOF in a round trip way, it does just end up being well packaged beyond the "world designed for humans" talking point. Maybe hands will end up being a similar situation.
Similar to how we are seeing LLMs shoved into spaces where existing ML was already doing well and better suited.
Not to dismiss the value of LLMs in those cases as an interface/interpretation layer.
If grandma goes into the windowless surgery factory, I just want the best bots working on her. There is value in having Dr. Bot the replicant give me the face-to-face status updates. We are not breaking out those layers as much, anymore, as the focus becomes minimizing FOMO.
You are right. If the hand is doing a specific task, better morphologies are likely. But that's not always desirable. The canonical example is of course the household. I don't want X robots, I want 1. And I don't want to change anything. Robot hand!
Not to mention that the world is very widely designed to be manipulated by hands: doorknobs, handles, container sizes. A unique door opening appendage isn't going to do much good around your house.
A humanoid human will fall over too if pushed into a sufficiently awkward corner. It’s a fundamental problem with things that aren’t statically stable and need active stabilization.
I see it as trying to apply the bitter lesson to robotics. Specialized robots will always have their place, but humanoid ones can take advantage of all the design interfaces that already exist in the world for humans.
Similar to how Claude Code gained so much traction in the terminal by just leveraging the command line interface that already exists for humans; no need to invent a domain specific MCP to just run shell commands.
I agree with you that it's far from the most efficient approach for specific tasks. But the analogy would be that you also generally don't want to use LLMs to do something you can "just" write a script for... that doesn't make LLMs useless though.
Multiple times, over and over.
We need to stop with the AI stuff.
I now scroll any AI-adjacent article I see and just read headings and if I see this I know what I'm getting into:
The Dexterity Deadlock
The Problem
The Geometric Curse
The Sim-to-Real Gap
The Structural Gap f(⋅)
Seeing It in Motion
The N^2 Impedance Mismatch
The Chaos Term ϵchaos
The Information Wall
The Weakest Link
Why Manipulation Needs Better
What We Built
From 288 to 15
Does It Work?
Hardware Validation
Robot Hand Landscape
The Take-Home
In this day and age, I wish people would ask any model OTHER than ChatGPT to rewrite their shit. At least we'd get a different flavor of slop.
What makes human hands especially suitable for e.g. assembling a phone or installing a door handle onto a car?
yes. do you think it's safe to just plug usb into some hole and type? the safest option for a robot is typing with fingers