Watch enough humanoid robot demo videos and you start to notice a pattern. The walking is getting very good. The balance and locomotion — the ability to stride across a warehouse floor, navigate obstacles, recover from a shove — have improved dramatically over the past few years. But the hands are still a problem. The grasp is still uncertain. The fingers still fumble. A robot that can walk confidently across a stage will still pause, recalibrate, and occasionally drop the object it is trying to hand you.
This is not a coincidence, nor a temporary gap that will close easily. Manipulation, meaning the ability to grasp, hold, and reposition objects reliably across a wide range of shapes, sizes, weights, and surface properties, is one of the genuinely hard unsolved problems in robotics. Understanding why it is hard, and where the field currently stands, tells you a lot about which humanoid deployments are realistic in the near term and which are not.
What Your Hands Actually Do
The human hand contains 27 bones, 29 joints, more than 30 muscles (many of them in the forearm, controlling the fingers via tendons), and roughly 17,000 touch receptors. The same hand can cradle a wine glass and swing a sledgehammer, adjust in real time for a slipping object, work in the dark, and thread a needle. It does most of this automatically, below conscious awareness, via a dense feedback loop between the fingertips and the nervous system that is faster than deliberate thought.
Replicating this in hardware and software is extremely difficult. The difficulty is not any single technical problem but a stack of interlocking ones: mechanical complexity, sensor density, the physics of contact, and the near-infinite variety of objects a useful hand needs to handle.
Mechanical complexity first. A fully articulated robotic hand that approximates the range of motion of the human hand requires dozens of independently controlled joints. Each joint needs an actuator, and at hand scale, those actuators need to be small, light, powerful, and precise simultaneously. That combination remains expensive and technically challenging to engineer. Most commercial humanoid hands today make trade-offs: fewer fingers, fewer joints per finger, simplified actuation — which limits what they can reliably grasp.
The Sensor Gap
The second major problem is sensing. Human fingertips provide continuous, high-resolution tactile feedback: how hard you are pressing, which part of the finger is in contact, and whether the object is on the verge of slipping before it actually moves. This feedback is what allows you to hold an egg firmly enough that it does not fall while gently enough that it does not break. You are not consciously calculating the grip force; your nervous system is managing it automatically through constant sensory input.
Robotic hands have sensors, typically force-torque sensors in the wrist or palm and sometimes pressure sensors on the fingertips, but their resolution and coverage are nowhere near what human skin provides. A robot picking up a drinking glass can measure how much total force it is applying. Getting fine-grained, spatially distributed feedback across all fingertip surfaces simultaneously, in real time, with enough resolution to detect early slip, is a sensor problem the industry has not yet fully solved at a practical price point.
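To see why early slip detection needs more than a total force reading, here is a minimal sketch. It assumes a simple Coulomb friction model and a hypothetical fingertip sensor that reports both normal and shear force at a single contact patch. The function names, thresholds, and force values are illustrative assumptions, not taken from any real hand or sensor product.

```python
def slip_margin(normal_n: float, shear_n: float, mu: float) -> float:
    """Fraction of the friction budget still unused at one contact.

    1.0 means no shear load at all; 0.0 means the contact is at (or past)
    the Coulomb slip limit, where shear = mu * normal.
    """
    if normal_n <= 0.0 or mu <= 0.0:
        return 0.0  # no contact pressure, so nothing is holding the object
    return max(0.0, 1.0 - shear_n / (mu * normal_n))


def adjust_grip(normal_n: float, shear_n: float, mu: float,
                min_margin: float = 0.2, step_n: float = 0.5,
                max_force_n: float = 15.0) -> float:
    """Return the next commanded normal force for this contact.

    Squeezes slightly harder when the margin falls below min_margin,
    capped at max_force_n so a fragile object is not crushed.
    All defaults are assumed values chosen for illustration.
    """
    if slip_margin(normal_n, shear_n, mu) < min_margin:
        return min(normal_n + step_n, max_force_n)
    return normal_n


if __name__ == "__main__":
    # Assumed readings: 10 N of squeeze, friction coefficient 0.5.
    print(adjust_grip(10.0, 2.0, mu=0.5))   # comfortable margin, force unchanged
    print(adjust_grip(10.0, 4.5, mu=0.5))   # close to slipping, force increased
```

A wrist-mounted force-torque sensor cannot feed a loop like this, because it sees only the summed load and not the shear-to-normal ratio at each fingertip. In practice the friction coefficient is also unknown and has to be estimated, which this sketch does not attempt.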
Several companies are working on this specifically. Tactile sensing startups — including Touchlab and Xela Robotics — have developed sensor technologies that bring robotic fingertip sensing closer to human resolution. The challenge is integrating those sensors into a hand design that is also mechanically functional, durable enough for industrial use, and manufacturable at reasonable cost. Progress is real; the gap to human-level tactile capability is still significant.
The Physics of Contact
Even with better sensors and more articulated fingers, there is a third problem: the physics of contact itself is genuinely complex. When you pick up an object, the interaction between your fingertips and the object’s surface involves friction, deformation, and dynamics that are difficult to model precisely in software. A slightly different surface texture, a slightly different object weight, or a slightly different approach angle changes what grip force is needed and how the object will respond.
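The sensitivity to surface properties shows up even in the simplest textbook model. The sketch below assumes a two-finger pinch on a rigid object held against gravity, with Coulomb friction at each contact. It ignores deformation, torque, and acceleration, which is exactly the detail that makes real contact hard to model. The function name and the example numbers are assumptions for illustration.

```python
G = 9.81  # gravitational acceleration, m/s^2


def min_pinch_force(mass_kg: float, mu: float,
                    contacts: int = 2, safety: float = 1.0) -> float:
    """Minimum normal force per contact (in newtons) to hold an object still.

    Each contact can resist at most mu * N of tangential load, so holding
    the weight m * g requires contacts * mu * N >= m * g.
    """
    if mu <= 0.0 or contacts <= 0:
        raise ValueError("mu and contacts must be positive")
    return safety * mass_kg * G / (contacts * mu)


if __name__ == "__main__":
    glass_kg = 0.3  # assumed mass of a drinking glass
    for label, mu in [("dry", 0.8), ("wet", 0.2)]:  # assumed friction values
        print(f"{label}: {min_pinch_force(glass_kg, mu):.2f} N per finger")
```

With these assumed numbers, the same glass needs about 1.8 N per finger when dry and about 7.4 N when wet. That is a fourfold change in required grip force from a surface property the robot cannot see directly.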
For traditional industrial robots working with known objects in controlled environments — the same box, the same weight, the same position, every time — this is manageable. You program the grip for the specific object and the robot executes it reliably. For humanoid robots, which are supposed to operate in unstructured environments handling a wide variety of unknown objects, the problem is much harder. The robot needs to look at an object it has never seen before, estimate its weight, predict its surface properties, plan a grip, execute that grip, and adapt in real time if anything goes wrong.
This is where machine learning has made meaningful progress. Training manipulation systems on large datasets of object interactions — either real robot experience or physics simulation — has produced significant improvements in the ability to generalise across novel objects. Google DeepMind’s robotics work, manipulation research out of Carnegie Mellon and MIT, and work at companies like Figure AI and Physical Intelligence have all demonstrated increasingly capable general manipulation. The systems are getting better. They are not yet reliably good enough for the full range of tasks an actually useful household or industrial robot would need to handle.
Where the Industry Currently Stands
The most honest summary of where humanoid manipulation stands today is: reliable in structured settings, inconsistent in unstructured ones.
Robots operating in logistics and warehouse environments — like Agility Robotics’ Digit at Amazon — are handling a fairly constrained set of tasks: moving totes, carrying bins, interacting with known objects in known positions. This is real, useful work, and robots are doing it. But it is closer to the controlled industrial case than the general manipulation case. The objects are predictable; the environments are engineered to reduce uncertainty.
The harder version of the problem — a robot that can reliably pick up any random object from a kitchen counter, put it in a dishwasher, handle a piece of fruit without bruising it, and thread a cable through a clip — is still a research frontier. Several labs and companies are making real progress. None are yet at a point where those capabilities are reliable enough for unsupervised deployment in genuinely unstructured settings.
Tesla’s Optimus demos have shown meaningful manipulation: folding laundry, sorting objects, performing basic assembly tasks in a factory context. The demos are real. What the demos do not show is failure rate: how often the robot drops something, chooses a poor grip, or gets stuck on an object that does not cooperate. Reliable capability means performing well under conditions you did not hand-pick for a highlight reel. Across the companies working on humanoid manipulation, that data is generally not publicly available.
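A rough way to see how little a handful of clean attempts proves: the sketch below computes the highest per-attempt failure rate that is still statistically consistent with a run of failure-free trials. It assumes the attempts are independent and drawn from the same conditions, which a curated demo usually is not, so treat the result as an optimistic bound. The function name and trial counts are illustrative.

```python
def max_failure_rate(clean_trials: int, confidence: float = 0.95) -> float:
    """Upper bound on per-attempt failure rate after zero observed failures.

    If the true failure rate is p, the chance of clean_trials successes in
    a row is (1 - p) ** clean_trials. Solving for the p at which that chance
    drops to (1 - confidence) gives the bound returned here.
    """
    if clean_trials <= 0:
        raise ValueError("clean_trials must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / clean_trials)


if __name__ == "__main__":
    for n in (20, 100, 1000):
        print(f"{n} clean attempts: failure rate could still be "
              f"up to {max_failure_rate(n):.1%}")
```

Under these assumptions, twenty flawless grasps on camera are still compatible with a robot that fails roughly one attempt in seven. Pushing the bound well below one percent takes on the order of a thousand unselected trials.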
Why the Hand Shape Matters
One design question the industry has not converged on is whether humanoid robots actually need human-shaped hands. The human hand evolved for a specific set of constraints — tool use, social communication, fine motor tasks across a huge range — but a robot designed for a narrower set of tasks might do better with a different end-effector design entirely.
Two-fingered grippers, three-fingered designs, and purpose-built tools have all been shown to outperform human-like hands on specific tasks. The argument for human-like hands is that human environments are built around human hands: door handles, keyboards, screwdrivers, bottles, and cutlery are all designed to be used by five-fingered hands of roughly human proportions. A robot with a radically different end effector will struggle with the hardware of the existing world.
This tension — between the flexibility of a human-like hand and the potential performance advantages of a task-optimised design — is real, and different companies are making different bets. Apptronik and Figure AI have moved toward more human-like multi-fingered hands. Some logistics robots use simplified two-fingered grippers for their specific tasks. Neither approach is obviously wrong; the right answer probably depends on the deployment context.
What Comes Next
The most promising current direction in robotic manipulation is the intersection of better tactile sensors, improved simulation for training, and large-scale learning from human demonstration data. The logic: if you can train a manipulation system on millions of examples of humans performing tasks — collected via data gloves, motion capture, or video — the system can learn the subtle dynamics of object interaction from human expertise rather than needing to discover everything from scratch through trial and error.
This is an area where significant research investment is currently concentrated. Physical Intelligence, which has raised substantial funding specifically to work on general robotic manipulation, is pursuing this direction. Several university labs are publishing meaningful work. The trajectory is positive.
The honest answer about timelines is: meaningful improvement is happening, but the gap between an impressive demo and reliable capability across the full range of manipulation tasks a general-purpose robot would need is probably measured in years, not months. Deployment of humanoids in structured industrial settings — where the manipulation problem is partially constrained — will scale faster than deployment in genuinely unstructured home or service environments.
The hand problem is not unsolvable. But it is the main reason careful assessments of humanoid timelines tend to conclude that the path from impressive prototype to general-purpose deployed robot is longer and harder than the demo videos suggest. The walk is nearly ready. The hands are still catching up.