A new AI technique enables a robot to devise complex plans for manipulating an object with its entire hand, not just its fingertips. Using a standard laptop, the model can generate effective plans in about a minute.
Imagine you need to carry a large, heavy box up a flight of stairs. You might spread your fingers, lift the box with both hands, rest it on top of your forearms, and balance it against your chest, manipulating the box with your entire body.
Humans are generally adept at whole-body manipulation, but robots struggle with it. To a robot, each spot where the box could touch any point on its fingers, arms, or torso is a contact event it must reason about. With billions of potential contact events, planning quickly becomes intractable.
MIT researchers have found a way to simplify this process, known as contact-rich manipulation planning. They employ smoothing, an AI technique that summarizes many contact events into a smaller number of decisions, allowing even a simple laptop to quickly produce an effective manipulation plan for the robot.
While this method is still in its early stages, it could eventually allow industries to use smaller, mobile robots that manipulate objects with their entire arms or bodies, rather than giant robotic arms that can only grasp with their fingertips. This may help reduce energy consumption and costs. The technique could also be useful for robots sent to Mars or other solar system bodies, which would need to adapt quickly to their surroundings using only an onboard computer.
“Rather than thinking about this as a black-box system, if we can leverage the structure of these kinds of robotic systems using models, there is an opportunity to accelerate the whole procedure of trying to make these decisions and come up with contact-rich plans,” says H.J. Terry Suh, an electrical engineering and computer science (EECS) graduate student and co-lead author of a paper on this technique.
Joining Suh on the paper are co-lead author Tao Pang PhD ’23, a roboticist at Boston Dynamics AI Institute; Lujie Yang, an EECS graduate student; and senior author Russ Tedrake, the Toyota Professor of EECS, Aeronautics and Astronautics, and Mechanical Engineering, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research appears this week in IEEE Transactions on Robotics.
Learning about learning
Reinforcement learning is a machine-learning technique in which an agent, such as a robot, learns to accomplish a task through trial and error and receives a reward for coming closer to a goal. This form of learning, according to researchers, takes a black-box approach because the system must learn everything about the world through trial and error.
Reinforcement learning has been used successfully for contact-rich manipulation planning, in which the robot seeks the best way to move an object in a specified manner. But because a robot may have billions of potential contact points to consider when deciding how to use its fingers, hands, arms, and body to interact with an object, this trial-and-error approach requires a great deal of computation.
“Reinforcement learning may need to go through millions of years in simulation time to actually be able to learn a policy,” Suh adds.
However, if researchers build a physics-based model using their knowledge of the system and the task they want the robot to complete, that model gives the robot structure about its environment that makes planning more efficient. Suh and Pang wondered why physics-based techniques aren’t as effective as reinforcement learning for contact-rich manipulation planning. They conducted a thorough analysis and found that a technique known as smoothing is what enables reinforcement learning to perform so well.
Many of the decisions a robot could make when determining how to manipulate an object aren’t important in the grand scheme of things. For instance, each infinitesimal adjustment of one finger, whether or not it results in contact with the object, doesn’t matter very much. Smoothing averages away many of those unimportant, intermediate decisions, leaving a few important ones.
Reinforcement learning performs smoothing implicitly by trying many contact points and then computing a weighted average of the results. Drawing on this insight, the MIT researchers designed a simple model that performs a similar type of smoothing, enabling it to focus on core robot-object interactions and predict long-term behavior. They showed that this approach could be just as effective as reinforcement learning at generating complex plans.
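To make the idea concrete, here is a minimal sketch of randomized smoothing on a one-dimensional toy pushing task. The reward function, noise scale, and finite-difference gradients are illustrative assumptions for this article, not the researchers' actual model.

```python
import numpy as np

def contact_reward(finger_pos, goal=2.0, object_start=1.0):
    # Toy nonsmooth reward: the object stays put until the finger reaches it;
    # after contact, a rigid push carries the object along with the finger.
    object_pos = finger_pos if finger_pos >= object_start else object_start
    return -abs(goal - object_pos)

def smoothed_reward(finger_pos, noise):
    # Randomized smoothing: average the reward over random perturbations of the
    # decision, which "averages away" the flat no-contact region.
    return np.mean([contact_reward(finger_pos + n) for n in noise])

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.3, size=2000)  # shared samples keep the estimate consistent

x, eps = 0.8, 1e-2
raw_grad = (contact_reward(x + eps) - contact_reward(x - eps)) / (2 * eps)
smooth_grad = (smoothed_reward(x + eps, noise) - smoothed_reward(x - eps, noise)) / (2 * eps)

print(f"raw gradient before contact:      {raw_grad:.3f}")     # exactly zero: no signal
print(f"smoothed gradient before contact: {smooth_grad:.3f}")  # nonzero: points toward contact
```

Before contact, the raw reward is flat and offers no guidance; the smoothed surrogate slopes toward making contact, which is the kind of signal a planner can follow.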
“If you know a bit more about your problem, you can design more efficient algorithms,” Pang says.
A winning combination
Even though smoothing greatly simplifies the decisions, searching through the remaining decisions can still be a difficult problem. So, the researchers combined their model with an algorithm that can rapidly and efficiently search through all possible decisions the robot could make. With this combination, the computation time was cut down to about a minute on a standard laptop.
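To give a rough sense of how a smoothed model and a fast search fit together, the sketch below pairs the toy pushing task with a simple sampling-based optimizer. Both the task and the cross-entropy-style search are illustrative assumptions, not the researchers' actual planner or search algorithm.

```python
import numpy as np

def push_outcome(finger_target, object_start=1.0):
    # Quasi-static toy model: a rigid push carries the object to the finger's
    # final position once contact is made; otherwise the object stays put.
    return finger_target if finger_target >= object_start else object_start

def smoothed_cost(finger_target, noise, goal=2.0):
    # Smoothed objective: average distance-to-goal over perturbed decisions,
    # so the flat "no contact" region still slopes toward making contact.
    return np.mean([abs(goal - push_outcome(finger_target + n)) for n in noise])

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 0.3, size=500)  # shared smoothing samples

# Simple sampling-based search: propose candidate finger targets, keep the
# best few under the smoothed cost, and tighten the proposal distribution.
mean, std = 0.0, 1.5
for _ in range(20):
    candidates = rng.normal(mean, std, size=64)
    costs = np.array([smoothed_cost(c, noise) for c in candidates])
    elites = candidates[np.argsort(costs)[:8]]  # 8 best candidates
    mean, std = elites.mean(), elites.std() + 1e-3

print(f"planned finger target: {mean:.2f}  (the pushed object's goal is 2.0)")
```

In the researchers' system the search ranges over full robot and object configurations rather than a single scalar decision, but the division of labor is similar: smoothing makes the decision landscape tractable, and an efficient search exploits it.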
They began by simulating robotic hands performing tasks such as moving a pen to a desired configuration, opening a door, or picking up a plate. In each case, their model-based approach achieved the same performance as reinforcement learning, but in a fraction of the time. When they tested their model in hardware on real robotic arms, they got comparable results.
“The same ideas that allow for whole-body manipulation also apply to planning with dexterous, human-like hands. Previously, most researchers believed that reinforcement learning was the only approach that scaled to dexterous hands, but Terry and Tao demonstrated that by incorporating the key idea of (randomized) smoothing from reinforcement learning, they can make more traditional planning methods work extremely well, too,” Tedrake says.
However, because their model relies on a simplified approximation of the real world, it cannot handle highly dynamic motions, such as objects falling. While the approach is useful for slower manipulation tasks, it cannot construct a plan that would allow a robot to toss a can into a garbage bin, for example. The researchers intend to improve their technique so it can handle these highly dynamic motions in the future.
“There are definitely some gains you can achieve if you carefully study your models and truly understand the problem you are attempting to solve. There are advantages to doing things outside of the black box,” Suh explains.