Machine learning: Aided by AI language models, Google’s robots get smart


Update: 2023-08-01 13:30 GMT


NEW YORK: A one-armed robot stood in front of a table. On the table sat three plastic figurines: a lion, a whale and a dinosaur. An engineer gave the robot an instruction: “Pick up the extinct animal.” The robot whirred for a moment, then its arm extended and its claw opened and descended. It grabbed the dinosaur.

Until very recently, this demonstration, which I witnessed during a podcast interview at Google’s robotics division in Mountain View, California, last week, would have been impossible. Robots weren’t able to reliably manipulate objects they had never seen before, and they certainly weren’t capable of making the logical leap from “extinct animal” to “plastic dinosaur.”

But a quiet revolution is underway in robotics, one that piggybacks on recent advances in so-called large language models — the same type of artificial intelligence system that powers ChatGPT, Bard and other chatbots.

Google has recently begun plugging state-of-the-art language models into its robots, giving them the equivalent of artificial brains. The secretive project has made the robots far smarter and given them new powers of understanding and problem-solving.

I got a glimpse of that progress during a private demonstration of Google’s latest robotics model, called RT-2. The model, which was being unveiled Friday, amounts to a first step toward what Google executives described as a major leap in the way robots are built and programmed.

“We’ve had to reconsider our entire research program as a result of this change,” said Vincent Vanhoucke, Google DeepMind’s head of robotics. “A lot of the things that we were working on before have been entirely invalidated.”

Robots still fall short of human-level dexterity and fail at some basic tasks, but Google’s use of AI language models to give robots new skills of reasoning and improvisation represents a promising breakthrough, said Ken Goldberg, a robotics professor at the University of California, Berkeley.

“What’s very impressive is how it links semantics with robots,” he said. “That’s very exciting for robotics.”

To understand the magnitude of this shift, it helps to know a little about how robots have conventionally been built. For years, engineers at Google and other companies trained robots to do a mechanical task, such as flipping a burger, by programming them with a specific list of instructions. (Lower the spatula 6.5 inches, slide it forward until it encounters resistance, raise it 4.2 inches, rotate it 180 degrees, and so on.) Robots would then practice the task again and again, with engineers tweaking the instructions each time until they got it right.

This approach worked for certain, limited uses. But training robots this way is slow and labour-intensive. It requires collecting lots of data from real-world tests. And if you wanted to teach a robot to do something new — to flip a pancake instead of a burger, say — you usually had to reprogram it from scratch. Partly because of these limitations, hardware robots have improved less quickly than their software-based siblings. OpenAI, the maker of ChatGPT, disbanded its robotics team in 2021, citing slow progress and a lack of high-quality training data.
