Cosmos Marks Another Masterful Stroke for Nvidia in AI Robotics
- Automate Asia Magazine

At what was likely the most widely attended keynote in CES history, Nvidia CEO Jensen Huang took to the stage at the jam-packed Michelob Ultra Arena with a dizzying array of announcements of new technologies, from consumer devices like the new GeForce RTX 50 Series of gaming graphics cards, to a new secure autonomous vehicle platform called Thor that's based on the company's latest Blackwell GPU technology, and more – a lot more. However, a new Nvidia generative AI technology dubbed Cosmos, which some folks might have glossed over due to its complexity, was, in my opinion, another star of the show. I'd even dare say that if Cosmos plays out as the company intends, it could be a launchpad for rocketing Nvidia's robotics and autonomous vehicle businesses.
Understanding Nvidia Cosmos for Physical AI

Nvidia calls Cosmos a "platform for accelerating physical AI development." Simply put, you can think of physical AI as the brains behind anything robotic, whether it's humanoid robots that are designed to optimally navigate the world we live in, factory automation robots, or autonomous vehicles, which are essentially robots optimized for navigating our roads while carrying humans or various payloads. However, training robotic AI is hugely labor- and resource-intensive, often requiring the capture, labeling and categorization of millions of hours of human interaction in real-world environments, or millions of miles driven on real roadways around the world.
Nvidia Cosmos aims to partially solve this resource problem with a family of what the company is calling "World Foundation Models" (WFMs), or AI neural networks that can generate accurate, physics-aware videos of the future state of a virtual environment – or a multiverse, if you will. You can go ahead and cue Dr. Strange now, and Jensen even referred to the Marvel character in his keynote presentation. It all sounds mind-bendingly deep, but it's actually fairly straightforward. WFMs are similar to Large Language Models, but where LLMs are AI models trained for natural language recognition, generation, translation, etc., WFMs utilize text, images, video content and movement data to generate simulated virtual worlds and virtual world interactions that have accurate spatial awareness, physics and physical interaction, and even object permanence. For example, if a bolt rolls off a table in a factory and can't be seen in the current camera view, the AI model knows it's still there, perhaps just on the floor.
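The object-permanence idea is easier to grasp with a toy sketch. The snippet below is purely illustrative (it is not Nvidia's implementation, and `ToyWorldModel` is a hypothetical class invented for this example): a minimal world-state tracker that retains an object's last known position even after it leaves the camera's view, which is the behavior a physics-aware world model needs.

```python
from dataclasses import dataclass

@dataclass
class TrackedObject:
    name: str
    position: tuple        # (x, y, z) in world coordinates
    visible: bool = True   # whether the current camera view contains it

class ToyWorldModel:
    """Toy illustration of object permanence: objects that leave the
    camera frame are marked hidden, but their last known state persists."""
    def __init__(self):
        self.objects = {}

    def observe(self, name, position):
        # A fresh observation updates (or creates) the object's state.
        self.objects[name] = TrackedObject(name, position, visible=True)

    def leaves_view(self, name):
        # The camera no longer sees the object, but the model still
        # knows it exists at its last observed position.
        self.objects[name].visible = False

    def where_is(self, name):
        obj = self.objects[name]
        status = "visible" if obj.visible else "last seen (occluded)"
        return f"{name}: {status} at {obj.position}"

# A bolt rolls off the table and out of the camera view:
world = ToyWorldModel()
world.observe("bolt", (1.0, 0.5, 0.9))  # on the table
world.observe("bolt", (1.2, 0.5, 0.0))  # now on the floor
world.leaves_view("bolt")               # camera no longer sees it
print(world.where_is("bolt"))           # bolt: last seen (occluded) at (1.2, 0.5, 0.0)
```

A real world foundation model learns this behavior implicitly from video and motion data rather than tracking explicit state like this, but the end result is the same: the model's predictions stay consistent with objects it can no longer see.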
Still with me? Good, because this is where it gets even more interesting. This new form of synthetic data generation to train physical AI, or robots, needs to be based on ground truth to be accurate. In other words, bad data in means a corrupt model that hallucinates or is otherwise unreliable for generating training data for robotic AI. That’s where Nvidia Omniverse, which the company announced a couple of years ago, comes into play.
Cosmos Is Built to Interface with Nvidia Omniverse Digital Twins

Nvidia’s Omniverse digital twin operating system allows companies and developers from virtually any industry to simulate products, factories, robots, vehicles, etc. in an environment that’s designed to connect with industry-standard tools, from computer-aided design to animation and more. In fact, Nvidia unveiled new Omniverse “Blueprints” at CES 2025 as well: one to help developers simulate robot fleets for factories and warehouses (called Mega), along with blueprints for AV simulation, spatial streaming to the Apple Vision Pro headset for large-scale industrial digital twins, and real-time computer-aided engineering and physics visualization. The company bundles these with free instructional courses for OpenUSD, or Universal Scene Description, which is the language that underpins Omniverse and allows the integration of industry-standard tools and content. Nvidia announced that several major players are adopting its Omniverse platform, from Cadence for EDA design tools for semiconductors, to Altair and Ansys for computational fluid dynamics, among many others.
Circling back to Cosmos, now we can see Nvidia’s full-stack solution for physical AI in robotics coming together. Cosmos models take input from a digitized version of the real world and then generate AI training content from it. Though Cosmos models were developed from training on 20 million hours of video data, according to Huang in his keynote address, developers who want to train physical or robotic AI on their own digital twins and their own data can simulate in Omniverse, and then let Cosmos play out a myriad of synthetic realities that these robot AIs can then train on.
Is Cosmos Another CUDA Moment For Nvidia?
At this point, I know what you’re thinking. Training robots on simulated data and in simulated worlds? What could go wrong? There’s no question this technology is still in its infancy, but as the old saying goes, you have to start somewhere. The beauty of machine learning, though it’s prone to hallucinations and needs guardrails (for which Nvidia has well-documented tools and policies), is that you can train and keep training until you’re confident you’ve got it right. And the machine doesn’t sleep or take coffee breaks, not to mention it’s a whole lot more efficient than manually training an AI on human-generated and categorized content.

That said, years ago, when Nvidia first announced its CUDA programming platform that sparked the age of machine learning on GPU accelerators, the company went Johnny Appleseed, so to speak, making its tools available to developers from all walks of life, eventually allowing CUDA to become the de facto standard for accelerating AI workloads in the data center. With Cosmos, Nvidia is once again making these generative AI World Foundation Models available to developers for free, under its open model license, and they’re accessible on Hugging Face or the company’s own NGC catalog repositories. The models will also soon be available as optimized Nvidia Inference Microservices (or NIMs), all of which will be accelerated on its DGX data center AI platforms and on AI edge devices, in robots and autonomous vehicles, with its Drive AGX Orin and Thor car computer platforms. Or, as Huang and the company call it, Nvidia’s “three-computer solution” for robotics.
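For developers curious about what "accessible on Hugging Face" looks like in practice, the sketch below shows the general pattern for pulling an openly licensed model checkpoint with the `huggingface_hub` library. The repo id used as a default is an assumption for illustration; check Nvidia's Hugging Face organization and the model license terms for the actual repository names.

```python
def fetch_cosmos_model(repo_id: str = "nvidia/Cosmos-1.0-Diffusion-7B-Text2World",
                       local_dir: str = "./cosmos-model") -> str:
    """Download every file in a Hugging Face model repo and return the
    local path. NOTE: the default repo_id above is an assumption used
    for illustration; verify the real name on Hugging Face first."""
    # Imported inside the function so this sketch can be read (and the
    # helper defined) without huggingface_hub installed.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)

# Usage (downloads many gigabytes, so it is not run here):
#   path = fetch_cosmos_model()
#   print(f"Model files downloaded to: {path}")
```

The same helper works for any open model repo on the Hub, which is exactly the distribution pattern Nvidia is leaning on to get these models into developers' hands quickly.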

Nvidia notes that several big-name players in physical AI have already adopted Cosmos, from humanoid robot companies like 1X and XPENG, to Hillbot and Skild AI for general-purpose bots, to rideshare giant Uber, which is using Cosmos in combination with its massive driving datasets to help build AI models for the AV industry.
It might be a stretch to call this another “CUDA moment” for Nvidia, but the world’s leader in AI just dropped some seriously powerful new tools for physical AI developers, and for free. I personally think it’s another master stroke for Jensen Huang and his band of AI wizards. We’ll have to see just how far this robotic AI, multiverse rabbit hole goes with Cosmos, and it should be fascinating to watch.
About the writer: Dave Altavilla is Principal Analyst and Co-Founder of HotTech Vision And Analysis, as well as Editor In Chief of HotHardware.com. He has been a Forbes Senior Contributor for over a decade. He covers semiconductors and adjacent technologies, including AI, client computing, cloud data centers, mobile compute, automotive tech, and chip design and fabrication. He has lived and breathed chips and computing for over 30 years, previously as a semiconductor sales engineer and global account manager. In tandem, he has also served as a journalist and technology analyst for multiple publications including Forbes, Computer World, Schwab Network, Fox Business, and a tech- and sciences-focused web magazine he founded decades ago, HotHardware.com. Some of the companies Dave tracks may be clients of his analyst firm HotTech. Follow Altavilla for detailed coverage of all things computing, semiconductors, AI, and related technologies from the cloud to the intelligent edge.
The above comments and opinions in the article are the author’s own and do not necessarily represent Automate Asia Magazine’s views.
Source: www.forbes.com