
Nvidia Expands AI Microservices for Physical Worlds


Nvidia boosts its AI Inference Microservices with new tools for physical environments, 3D modeling, and generative AI, expanding capabilities for developers and enterprises.

At the Siggraph conference in Denver, Nvidia Corp. announced that it’s significantly expanding its library of Nvidia Inference Microservices (NIMs) to encompass physical environments, advanced visual modeling, and a wide variety of vertical applications.

Among the highlights are the availability of Hugging Face Inc.’s inference-as-a-service on the Nvidia cloud and expanded support for three-dimensional training and inferencing.

NIM is a containerized microservice delivered in the Nvidia AI Enterprise suite that simplifies and speeds artificial intelligence model deployment. Each is an optimized inference engine tailored for various hardware setups and accessible via application programming interfaces to reduce latency and operational costs and improve performance and scalability. Developers can use NIMs to deploy AI applications quickly without extensive customization and fine-tune models with proprietary data.
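
Language-model NIMs expose an OpenAI-compatible API, so a deployed container can be queried with standard client libraries. A minimal sketch, assuming a NIM container already running locally on port 8000 and serving the meta/llama3-8b-instruct model (both are illustrative choices, not details from the announcement):

```python
from openai import OpenAI

# Point the standard OpenAI client at a locally deployed NIM container.
# The base URL and model name are illustrative; substitute whatever your
# deployment actually serves.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used")

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",
    messages=[{"role": "user", "content": "Summarize what a NIM microservice does."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```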

Nvidia said Hugging Face will offer inferencing-as-a-service on top of Nvidia’s DGX cloud, giving Hugging Face’s 4 million developers faster performance and easier access to serverless inferencing. Hugging Face provides a platform specialized for natural language processing, machine learning development, and staging, as well as a library of pre-trained models for NLP tasks such as text classification, translation, and question answering. It also offers a large repository of datasets optimized with Transformers, an open-source Python library that provides resources for working with NLP models.
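
For context, the Transformers library wraps those pre-trained models behind a one-line pipeline interface. A minimal example using a publicly available sentiment model from the Hugging Face Hub (the model choice is ours, purely for illustration):

```python
from transformers import pipeline

# Downloads a pre-trained model from the Hugging Face Hub on first use
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

print(classifier("Serverless inferencing makes deployment much simpler."))
# -> [{'label': 'POSITIVE', 'score': 0.99...}]
```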

Nvidia announced generative physical AI advancements, including its Metropolis reference workflow for building interactive visual AI agents. Metropolis is a collection of developer workflows and tools for building, deploying, and scaling generative AI applications across all types of hardware. It also announced new NIM microservices that help developers train physical machines to handle complex tasks.


3D worlds

Today’s announcements include three new Fast Voxel Database NIM microservices that support new deep learning frameworks for three-dimensional worlds. FVDB is a new deep-learning framework for generating AI-ready virtual representations of the real world. It’s built on OpenVDB, an industry-standard library of structures and programs for simulating and rendering sparse volumetric data such as water, fire, smoke, and clouds.

FVDB provides four times the spatial scale of prior frameworks, 3.5 times the performance, and access to a large library of real-world datasets. It simplifies processes by combining functions that previously required multiple deep-learning libraries.
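
FVDB’s own Python API isn’t detailed in the announcement, but the sparse volumetric representation it builds on can be seen in OpenVDB’s existing pyopenvdb bindings, where empty space costs nothing to store. A rough sketch:

```python
import numpy as np
import pyopenvdb as vdb

# A dense 128^3 volume in which only a small block is non-empty
density = np.zeros((128, 128, 128), dtype=np.float32)
density[48:80, 48:80, 48:80] = 1.0

# Copy it into a sparse VDB grid; background (zero) voxels stay inactive
grid = vdb.FloatGrid()
grid.copyFromArray(density)
grid.name = "density"

# Only the 32^3 occupied block is actually stored
print(grid.activeVoxelCount())  # 32768, not 128^3 = 2097152

vdb.write("smoke.vdb", grids=[grid])
```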

Also being announced are three microservices—USD Code, USD Search, and USD Validate—that use the Universal Scene Description open-source interchange format to create arbitrary 3D scenes.

USD Code can answer OpenUSD knowledge questions and generate Python code. USD Search enables natural language access to massive libraries of OpenUSD 3D and image data. USD Validate checks the compatibility of uploaded files against OpenUSD release versions and generates a fully rendered path-traced image using Omniverse cloud APIs.
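
Nvidia hasn’t published USD Code’s generated output verbatim, but the Python it targets is the standard pxr API for assembling OpenUSD scenes, along these lines:

```python
from pxr import Usd, UsdGeom

# Create a new stage, the root container of an OpenUSD scene
stage = Usd.Stage.CreateNew("scene.usda")

# Define a transform prim and a sphere beneath it
world = UsdGeom.Xform.Define(stage, "/World")
sphere = UsdGeom.Sphere.Define(stage, "/World/Sphere")
sphere.GetRadiusAttr().Set(2.0)

stage.GetRootLayer().Save()
```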

“We built the world’s first generative AI models that can understand OpenUSD-based language, geometry, materials, physics, and spaces,” said Rev Lebaredian, Nvidia’s vice president of Omniverse and simulation technology.


Physical AI support

Nvidia said its NIMs tailored for physical AI support speech and translation, vision, and realistic animation and behavior. Visual AI agents use computer vision capabilities to perceive and interact with the physical world and perform reasoning tasks.

They’re powered by a new class of generative AI models called vision language models, which enable enhanced decision-making, accuracy, interactivity, and performance. Nvidia’s AI and DGX supercomputers can be used to train physical AI models, and its Omniverse and OVX supercomputers can be applied to refine skills in a digital twin.

Applications include robotics, and in line with that, Nvidia said it will provide the world’s leading robot manufacturers, AI model developers, and software makers with a suite of services, models, and computing platforms to develop, train, and build the next generation of humanoid robots.

Offerings include NIM microservices and frameworks for robot simulation and learning, the OSMO orchestration service for running multistage robotics workloads, and an AI- and simulation-enabled teleoperation workflow that significantly reduces the amount of human demonstration data required to train robots.

Generative AI’s visual output is typically “random and inaccurate, and the artist can’t edit finite details exactly how they want,” Lebaredian said. With Omniverse and NIM microservices, the designer or artist builds a ground-truth 3D scene that conditions the generative AI. They assemble their scene in Omniverse, which lets them aggregate brand-approved assets like a Coke bottle and various models for props and the environment into one scene.

Getty Images Holdings Inc.’s 4K image generation API and Shutterstock Inc.’s 3D asset generation will be available as Nvidia NIMs for image generation using text or image prompts. Both use Nvidia Edify, a multimodal architecture for visual generative AI.
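
The exact request schema for these services lives in Nvidia’s API catalog; the route and payload below are assumptions for illustration only, showing the general shape of a text-to-image call:

```python
import os
import requests

# Hypothetical endpoint and payload: the real route and field names are
# defined by Nvidia's API catalog and may differ from this sketch.
url = "https://ai.api.nvidia.com/v1/genai/edify-image"  # assumed route
headers = {"Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}"}
payload = {"prompt": "a glass bottle on a marble table, studio lighting"}

response = requests.post(url, headers=headers, json=payload, timeout=120)
response.raise_for_status()
print(response.json())  # response schema depends on the service
```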

“We’ve been investing in OpenUSD since 2016, making it, and therefore Omniverse, easier and faster for industrial enterprises and physical AI developers to develop performant models,” Lebaredian said. Nvidia has also worked with Apple Inc., which co-founded the Alliance for OpenUSD, to build a hybrid rendering pipeline that streams from its Graphics Delivery Network to the Apple Vision Pro. Software development kits and APIs that enable this on Omniverse are now available through an early access program.


Developers can use NIM microservices and Omniverse Replicator to build generative AI-enabled synthetic data pipelines, addressing a shortage of real-world data that often limits model training.
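
A Replicator pipeline is written in Python and runs inside Omniverse rather than a standalone interpreter. A condensed sketch of the pattern from Nvidia’s Replicator tutorials, which randomizes a scene each frame and writes out annotated renders:

```python
import omni.replicator.core as rep

with rep.new_layer():
    # Scene: a semantically labeled sphere and a camera rendering at 1024x1024
    sphere = rep.create.sphere(semantics=[("class", "sphere")], position=(0, 100, 100))
    camera = rep.create.camera(position=(0, 0, 1000))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # Randomize the sphere's position on each of 10 frames
    with rep.trigger.on_frame(num_frames=10):
        with sphere:
            rep.modify.pose(
                position=rep.distribution.uniform((-500, 50, -500), (500, 50, 500))
            )

    # Write RGB images plus 2D bounding-box annotations to disk
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_output", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])
```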

Coming soon, NIM microservices for USD Layout, USD Smart Material, and FVDB Mesh Generation will generate OpenUSD-based meshes rendered by Omniverse APIs.
