Many observers were disappointed with the recent demo of the AI-enabled “Optimus” robot at Tesla’s AI Day. One reviewer cleverly titled his article “Sub-Optimus.” These views, however, miss the point. Whatever else may be said of Elon Musk, he has a genius for sensing timing and opportunity, applying technology, and marshaling the necessary resources.
The quality and enthusiasm of the engineering team suggest Optimus could succeed, even if it takes longer than the estimate of 3 to 5 years for full production. If successful, Optimus could bring personal robots into the mainstream within a decade.
Although initially expensive at an estimated $20,000, an Optimus sibling in 2032 could be as commonplace in shops and factories as a Tesla is on the road today. Fast-forward another 10 years, and humanoid robots could be a routine part of daily life, whether at home, in stores and restaurants, in factories and warehouses, or in health and home care settings.
AI hype: Interacting with robots
In this vision, the idea of an “artificial friend,” an emotionally intelligent android as portrayed by Kazuo Ishiguro in Klara and the Sun, does not seem so farfetched. Neither do “digients” (short for “digital entities”), as described by Ted Chiang in The Lifecycle of Software Objects. Digients are artificial intelligences created within a purely digital world that inhabit a digital shared space (much like the emerging metaverse) but also can be downloaded into physical robots such that they can interact with people in the real world.
This ability of people to interact with a robot appears to be the key to successful robot adoption. At least that is the view of Will Jackson, the founder and CEO of Engineered Arts, who recently said: “The ‘true killer app’ for a humanoid robot is people’s desire to interact with it.”
Is it possible that this robot vision is entirely unrealistic and little more than science fiction or entrepreneurial hype? That is the view of some, says Michael Hiltzik of the Los Angeles Times. He stated: “AI hype is not only a hazard to laypersons’ understanding of the [robotics] field, but poses the danger of undermining the field itself.” In this he is correct and, certainly, it is important to separate the hype from reality.
What Hiltzik is perhaps missing is the arc of history. Robotics today, much like the broadening field of artificial intelligence (AI), is still in its early days. The rate of progress, however, is phenomenal. While Optimus is years from a finished product and numerous technical and cultural hurdles remain, it is impossible to ignore the extraordinary pace of advancement. In just one year, Optimus went from concept to a mobile, bipedal robot. The field is growing, and Tesla is hardly alone in building a humanoid robot. For example, a team of engineers from the Rochester Institute of Technology (RIT) has announced a humanoid robot that can teach Tai Chi.
A long way to go to achieve AI-powered robots
Building robots that emulate human actions is extremely difficult. An EE Times article describes these challenges. “From a mechanics perspective, for example, bipedal locomotion (walking on two legs) is an extremely physically demanding task. In response, the human body has evolved and adapted such that the power density of human joints in areas like the knees is very high.” In other words, simply staying upright is very difficult for robots.
Despite such challenges, real progress is being made. Oregon State University researchers recently established a Guinness World Record for a robot running a 100-meter dash, completing the course in under 25 seconds. The team has been training “Cassie” since 2017, using reinforcement-learning AI algorithms to reward the robot when it moves correctly. The significance of the record was noted by the lead researcher who said: “[Now we] can make robots move robustly around the world on two legs.” While impressive, the human body not only stays upright but navigates the world through an extremely intricate sensory system.
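The reward-driven training described above can be illustrated with a toy sketch. This is not Cassie’s actual controller (the OSU team uses deep reinforcement learning in physics simulators); it is a minimal Q-learning example on a one-dimensional “track” that shows the core mechanism: the agent is rewarded when it moves correctly (forward) and penalized when it does not, and over many episodes it learns a policy of always stepping forward. All names and parameters here are illustrative assumptions.

```python
# Toy reinforcement-learning sketch (NOT Cassie's real training setup):
# an agent on a 1D track learns to always step forward because forward
# movement is rewarded (+1) and anything else is penalized (-1).
import random

N_STATES = 10          # positions 0..9 on the track; 9 is the goal
ACTIONS = [-1, +1]     # step backward or forward
EPISODES = 500
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1   # learning rate, discount, exploration

# Q-table: estimated return for each (state, action) pair
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def train():
    random.seed(0)
    for _ in range(EPISODES):
        state = 0
        while state < N_STATES - 1:
            # epsilon-greedy: mostly exploit the best-known action, sometimes explore
            if random.random() < EPSILON:
                action = random.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: Q[(state, a)])
            next_state = min(max(state + action, 0), N_STATES - 1)
            # reward "correct" movement: progress toward the goal earns +1
            reward = 1.0 if next_state > state else -1.0
            best_next = max(Q[(next_state, a)] for a in ACTIONS)
            # standard Q-learning update
            Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
            state = next_state

train()
# The learned greedy policy: the preferred action in each non-goal state
policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # after training, every entry should be +1 (step forward)
```

Scaling this idea from a lookup table to a neural network acting in a physics simulation is, in essence, how legged robots like Cassie learn robust gaits.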
“The hardest part is creating a machine that interacts with humans in a natural way,” according to Nancy J. Cooke, a professor at Arizona State University. Re-creating that in a robot is still in its infancy. That is now among the most daunting challenges for Optimus and other humanoid robotic efforts.
AI automation takes center stage
Humanoid robots are becoming possible because of AI, and AI is racing forward aided by the triple exponential growth of computer power, software development and data.
Perhaps nowhere is this rapid AI progress better exemplified than in natural language processing (NLP), especially in relation to text and text-to-image generation. OpenAI released its first text-generation tools in February 2019 with GPT-2, followed by GPT-3 in June 2020, the text-to-image DALL-E in January 2021, and DALL-E 2 in April 2022. Each iteration was vastly more capable than the last.
Additional companies are pushing these technologies forward, such as Midjourney and Stability AI, maker of Stable Diffusion. Now the same phenomenon is occurring with text-to-video, with several new apps appearing recently from Meta, Google, Synthesia, GliaCloud and others.
NLP technologies are quickly finding real-world applications, from code development to advertising (both copywriting and image creation) and even filmmaking. In my last article, I described how creative artist Karen X. Cheng was tasked with creating an AI-generated cover image for Cosmopolitan. To help create ideas and the final image, she used DALL-E 2.
The Crow, an AI-generated video, recently won the Jury Award at the Cannes Short Film Festival. To create the video, computer artist Glenn Marshall fed the video frames of an existing video as an image reference to CLIP (Contrastive Language–Image Pre-training), another text-to-image neural network also created by OpenAI. Marshall then prompted CLIP to generate a video of “a painting of a crow in a desolate landscape.”
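The scoring step at the heart of CLIP can be sketched in a few lines. CLIP maps images and text into a shared embedding space and ranks image–text matches by cosine similarity; that score is what guides generation toward a prompt like “a painting of a crow in a desolate landscape.” The real encoders are large neural networks, so in this illustrative sketch small hand-made vectors stand in for their outputs; the vectors and the three “images” are assumptions for demonstration only.

```python
# Illustrative sketch of CLIP-style image-text matching (toy embeddings,
# not real CLIP encoder outputs): normalize both embeddings to unit length,
# then rank candidate images by cosine similarity to the text prompt.
import numpy as np

def normalize(v):
    """L2-normalize each row so cosine similarity reduces to a dot product."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Pretend image embeddings: rows are candidate images. In real CLIP these
# would come from the image encoder.
image_embeds = normalize(np.array([
    [0.9, 0.1, 0.0],   # image A: close to the prompt
    [0.0, 1.0, 0.0],   # image B: unrelated
    [0.1, 0.2, 0.9],   # image C: unrelated
]))
# Pretend text embedding for the prompt (toy stand-in for the text encoder)
text_embed = normalize(np.array([[1.0, 0.0, 0.1]]))

# Cosine similarity of unit vectors is just the dot product; higher = better match
scores = image_embeds @ text_embed.T
best = int(np.argmax(scores))
print(f"best-matching image index: {best}")  # prints 0 (image A)
```

Marshall’s workflow essentially ran this matching in reverse: frames were iteratively adjusted so that their CLIP score against the text prompt kept rising.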
If it only had a brain
Of course, building an NLP application is not the same as robotics. While computing power, software and data are commonalities, the physical aspect of building robots that need to interact with the real world adds challenges beyond developing software automation. What robots need is a brain. AI researcher Filip Piekniewski told Business Insider that “robots don’t have anything even remotely close to a brain.” That is largely true today, though what NLP provides is the beginning of the brain robots need to interact with humans. After all, a major humanoid brain function is the ability to perceive and interpret language and turn that into contextually appropriate responses and actions.
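The “perceive language, then act” loop described above can be reduced to a minimal dispatch sketch. A real humanoid would use a learned language model to interpret speech; here a toy keyword matcher stands in for that model, and the action stubs, keyword lists, and function names are all hypothetical, chosen only to show the structure of mapping an utterance to a contextually appropriate action.

```python
# Toy language-to-action dispatcher (illustrative only): keyword-based
# intent detection stands in for a real NLP model, and actions are stubs
# that report what a robot subsystem would do.
from typing import Callable, Dict, List

# Map each intent to a robot action stub
ROBOT_ACTIONS: Dict[str, Callable[[], str]] = {
    "fetch": lambda: "arm: grasping object",
    "move":  lambda: "legs: walking forward",
    "stop":  lambda: "all motors: halted",
}

# Trigger words per intent; a real system would handle paraphrase and context
KEYWORDS: Dict[str, List[str]] = {
    "fetch": ["fetch", "bring", "grab"],
    "move":  ["move", "walk", "go"],
    "stop":  ["stop", "halt", "freeze"],
}

def respond(utterance: str) -> str:
    """Interpret an utterance and return the triggered action's status."""
    words = utterance.lower().split()
    for intent, triggers in KEYWORDS.items():
        if any(t in words for t in triggers):
            return ROBOT_ACTIONS[intent]()
    return "sorry, I did not understand"

print(respond("Please fetch my keys"))   # arm: grasping object
print(respond("Walk to the kitchen"))    # legs: walking forward
```

Swapping the keyword matcher for an NLP model is precisely where language systems begin to function as the “brain” that turns words into action.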
NLP is already used in chatbots, software robots that facilitate communications with people. Project December, a text-based chatbot developed using GPT-3, has helped people to obtain closure by “talking” with a deceased loved one. Bot developer Jason Rohrer said of Project December: “It may not be the first intelligent machine. But it kind of feels like it’s the first machine with a soul.” Intelligent robots with a soul that can walk and manipulate objects would be a major advance.
That advance is near, though it could still be a decade or more before humanoid robots roam the world. Optimus and other robots today are mostly simple machines that will grow in capability over the next couple of decades to become fully evolved artificial humanoids. We have now truly begun the modern robotic era.
Gary Grossman is the senior VP of technology practice at Edelman and global lead of the Edelman AI Center of Excellence.