NVIDIA Picasso
Last reviewed
May 11, 2026
Sources
17 citations
Review status
Source-backed
Revision
v3 · 2,484 words
See also: Model Deployment and artificial intelligence applications
See also: Image generation, Video generation, and 3D generation
NVIDIA Picasso is a cloud-based generative AI foundry from NVIDIA for building, training, and deploying visual generative models that produce images, video, and 3D content from text prompts. The service was announced on March 21, 2023 at NVIDIA's GPU Technology Conference (GTC) as part of the broader NVIDIA AI Foundations family, which paired Picasso for visual media with NeMo for large language models and BioNeMo for biology and drug discovery. Picasso runs on NVIDIA DGX Cloud and is targeted at enterprises, independent software vendors, and creative service providers that want to host customized foundation models trained on their own licensed data rather than rely on public consumer image generators. The platform hosts the Edify family of visual foundation models, supports fine-tuning on proprietary datasets, and exposes inference through APIs so partners can embed the models in their own products.
At launch Picasso was offered in private preview and developers signed up for access through the NVIDIA website. The first wave of named partners included Adobe, Getty Images, and Shutterstock, each working with NVIDIA on different parts of the visual stack and on a shared commitment to use only fully licensed training data with artist compensation. Through 2024 the underlying Edify models were exposed as NVIDIA NIM microservices on the NVIDIA AI Foundry, and the public Picasso branding was effectively folded into the Edify and AI Foundry product names. The NIM preview for Edify was later retired in 2025, but the underlying Edify research and partner deployments at Getty Images, iStock, and Shutterstock remain in service.
| Attribute | Detail |
|---|---|
| Vendor | NVIDIA |
| Product type | Cloud foundry for visual generative AI |
| Announced | March 21, 2023, GTC keynote |
| Initial status | Private preview |
| Foundation models | Edify Image, Edify 3D, Edify 360 HDRi, Edify Video |
| Modalities | Text-to-image, text-to-video, text-to-3D, 360 HDRi, PBR materials |
| Infrastructure | NVIDIA DGX Cloud |
| Key partners | Adobe, Getty Images, Shutterstock, WPP |
| Successor branding | NVIDIA Edify on NVIDIA AI Foundry, delivered as NIM microservices |
| Related products | NVIDIA Omniverse, NeMo, BioNeMo |
NVIDIA used the spring 2023 GTC keynote to position itself as more than a chip supplier for the generative AI wave. CEO Jensen Huang introduced AI Foundations as a managed cloud offering that would let enterprises build proprietary models without standing up their own GPU clusters. Three services made up the family at launch: NeMo for text, Picasso for visual content (images, video, and 3D), and BioNeMo for protein structures and small-molecule drug discovery.
Picasso fit a specific gap in the market. Public text-to-image systems like Stable Diffusion, Midjourney, and DALL-E had become household names, but the training data behind those systems was contested, the commercial licensing was unclear, and enterprises could not easily customize them on their own brand assets. NVIDIA's pitch was that Picasso would solve all three problems at once: cleanly licensed training data through partners, custom fine-tuning on enterprise data, and inference optimization on DGX Cloud hardware. Reporting at the time placed Picasso alongside DGX Cloud itself as the centerpiece visual announcement of the keynote.
NVIDIA refers to the visual foundation models that power Picasso collectively as Edify. Edify is a multimodal architecture designed for image, video, 3D, 360 HDRi, and physically based rendering (PBR) material generation. Customers can either start from Edify checkpoints pretrained by NVIDIA or its partners, or train new Edify models from scratch on their own data.
| Model | Output | Notable capability |
|---|---|---|
| Edify Image | 2D images up to 4K resolution | Pixel-space Laplacian cascaded diffusion, photorealistic synthesis, control via depth and segmentation, sketch-to-image guidance |
| Edify Video | Short videos from text prompts | Temporal denoising layers added on top of the image backbone to preserve frame-to-frame consistency |
| Edify 3D | 3D meshes with clean quad topology and PBR materials | Text-to-3D and image-to-3D generation suitable for set dressing and game asset pipelines |
| Edify 360 HDRi | Up to 16K HDRi panoramas | Environment maps usable for image-based lighting in 3D scenes |
The Edify Image research paper, posted by NVIDIA's Deep Imagination Research group in November 2024 (arXiv 2411.07126), describes a U-Net based architecture with around 2.7 billion parameters operating directly in pixel space rather than in a learned latent space. The novelty in the paper is the Laplacian diffusion formulation that attenuates image signals at different frequency bands at different rates, which the authors argue produces sharper textures than standard latent diffusion at the same training cost. The model uses invertible Haar wavelet transforms at the network boundaries to reduce the number of spatial tokens in the attention layers by a factor of 16, which keeps high-resolution training tractable.
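The factor-of-16 token reduction can be illustrated with a minimal NumPy sketch. It assumes two levels of an orthonormal 2x2 Haar transform, each of which halves height and width and moves the detail into extra channels, so two levels shrink the spatial grid by 4 x 4 = 16 without discarding information. The number of levels, the channel layout, and the function names `haar_down` and `haar_up` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def haar_down(x: np.ndarray) -> np.ndarray:
    """One level of an orthonormal 2D Haar transform.

    x has shape (C, H, W) with even H and W; the result has shape
    (4C, H/2, W/2): spatial positions are traded for channels.
    """
    a = x[:, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, 0::2, 1::2]  # top-right
    c = x[:, 1::2, 0::2]  # bottom-left
    d = x[:, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (low frequency)
    lh = (a - b + c - d) / 2.0  # horizontal detail
    hl = (a + b - c - d) / 2.0  # vertical detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return np.concatenate([ll, lh, hl, hh], axis=0)


def haar_up(y: np.ndarray) -> np.ndarray:
    """Exact inverse of haar_down: (4C, H, W) -> (C, 2H, 2W)."""
    ll, lh, hl, hh = np.split(y, 4, axis=0)
    channels, h, w = ll.shape
    x = np.empty((channels, 2 * h, 2 * w), dtype=y.dtype)
    x[:, 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[:, 0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[:, 1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[:, 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((3, 64, 64))  # stand-in for an RGB image

    coarse = haar_down(haar_down(image))      # two levels (assumed)
    positions_before = image.shape[1] * image.shape[2]
    positions_after = coarse.shape[1] * coarse.shape[2]

    print("shape after two levels:", coarse.shape)
    print("spatial reduction factor:", positions_before // positions_after)
    print("lossless:", np.allclose(haar_up(haar_up(coarse)), image))
```

Because the transform is invertible, the network can attend over the smaller grid and still map its output back to full-resolution pixels at the boundary, which is the property the paragraph above relies on.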
Edify 3D, described in a companion paper (arXiv 2411.07135), targets game and virtual production pipelines. NVIDIA highlighted that Shutterstock customers using Edify 3D could preview an asset within about 10 seconds and receive a finished mesh with PBR-ready quad topology suitable for editing inside standard 3D tools. Edify 360 HDRi extends the same architecture to spherical environment maps for image-based lighting; Shutterstock advertised 16K resolution panoramas generated from a single text or image prompt.
The service exposes three main modes of use.
The most lightweight path is API-based inference against an Edify model that has already been trained, either by NVIDIA or by a partner such as Getty Images. A developer calls an endpoint with a text prompt and optional control inputs (depth map, sketch, segmentation mask, camera parameters), and the service returns a generated asset. NVIDIA emphasized in launch materials that DGX Cloud inference is optimized for low-latency interactive use, which matters for creative tools where artists want to iterate on dozens of variations.
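The shape of such a call can be sketched as a request payload. Everything specific in this sketch is hypothetical: the endpoint URL, the field names, and the `build_generation_request` helper are invented for illustration and do not reflect NVIDIA's actual API, which this article does not document. The sketch only shows the pattern of one required text prompt plus optional control inputs, and it builds the payload without sending anything over the network.

```python
import json
from typing import Any, Dict, Optional

# HYPOTHETICAL endpoint, used only as a placeholder; not a real NVIDIA URL.
HYPOTHETICAL_ENDPOINT = "https://example.invalid/v1/visual/generate"

# Control inputs named in the text above; the key spellings are assumptions.
SUPPORTED_CONTROLS = ("depth_map", "sketch", "segmentation_mask", "camera")


def build_generation_request(
    prompt: str,
    controls: Optional[Dict[str, Any]] = None,
    num_outputs: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-serializable payload for a text-to-asset request."""
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if num_outputs < 1:
        raise ValueError("num_outputs must be at least 1")

    controls = dict(controls or {})
    unknown = sorted(set(controls) - set(SUPPORTED_CONTROLS))
    if unknown:
        raise ValueError(f"unsupported control inputs: {unknown}")

    return {"prompt": prompt, "num_outputs": num_outputs, "controls": controls}


if __name__ == "__main__":
    request = build_generation_request(
        "a glass bottle on a beach at sunset",
        controls={"camera": {"focal_length_mm": 35}},  # hypothetical field
        num_outputs=4,
    )
    print("POST", HYPOTHETICAL_ENDPOINT)
    print(json.dumps(request, indent=2))
```

Validating control names before sending keeps interactive tools responsive, since a malformed request fails locally instead of after a round trip to the service.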
Customers can upload their own image, video, or 3D asset libraries and fine-tune an Edify checkpoint to match a brand style or a particular visual language. This is the path WPP and Coca-Cola took (covered below) and the path Getty Images offers to its enterprise customers as a custom model service. NVIDIA handles the orchestration of the fine-tuning job across DGX Cloud GPUs and returns a private model checkpoint that only the customer can call.
Larger customers can train an Edify-architecture model from the ground up on their own corpus. This option is the most expensive and the most flexible. It is the path NVIDIA took with Shutterstock to build the original Edify 3D model and with Getty Images to build the original commercial text-to-image and text-to-video models, in both cases on fully licensed source content.
The partnership story is the single most important thing to understand about Picasso. NVIDIA built the service around the idea that visual generative AI for enterprises has to be defensible on copyright grounds, and the named partners are the proof points for that thesis.
At the GTC 2023 announcement NVIDIA said it was co-developing a next generation of commercially viable generative AI image models with Adobe. The plan was to ship the resulting models inside Adobe Creative Cloud applications including Photoshop, Premiere Pro, and After Effects, and also to offer them as Picasso APIs. Adobe folded the work into its Firefly product family and emphasized the Content Authenticity Initiative for tracking provenance of generated assets.
Getty Images partnered with NVIDIA to train responsible text-to-image and text-to-video foundation models on its fully licensed library, with revenue sharing back to contributing photographers and videographers. The first Getty service built on Picasso was Generative AI by Getty Images, launched in late 2023. In 2024 Getty extended the program to iStock under the name Generative AI by iStock. Both services run on Edify models hosted on Picasso and include commercial indemnification for customers, a major selling point compared to consumer image tools whose output rights remain legally uncertain. The Edify Image upgrade in 2024 doubled the speed of generation for Getty's service, producing four images in about six seconds, and added camera controls (depth of field, focal length) plus depth-map composition copying.
Shutterstock worked with NVIDIA on text-to-3D and 360 HDRi generation, again trained on fully licensed Shutterstock content and with contributor compensation through Shutterstock's Contributor Fund. The service let creators describe a 3D asset or an environment in plain language and receive a usable mesh or 16K HDRi panorama, collapsing what would have been days of modeling and lighting work into minutes. Shutterstock made the API available in early access through its Creative AI platform.
Marketing holding company WPP used NVIDIA Picasso to fine-tune the Getty Images Edify model on Coca-Cola's brand guidelines, then used the resulting custom model to generate brand-consistent campaign visuals at scale. NVIDIA highlighted the case as an example of how a creative agency could build a private model that respects a client's brand book while still letting individual art directors iterate quickly.
Launch materials also featured demos from creative software vendors and studios. Runway showcased its video generation work alongside Picasso. Cuebric, developed by the AI production studio Seyhan Lee, used the service for virtual production environment generation aimed at film studios. WOMBO, the consumer app, was named as a developer using the service for stylized text-to-image generation.
The Picasso brand has changed shape since 2023. At GTC in March 2024, NVIDIA announced NIM microservices, a way to package optimized inference for popular models as containerized services that customers can run on DGX Cloud, in their own data centers, or in third-party clouds. The Edify visual models were among the first NIMs offered. NVIDIA simultaneously consolidated its enterprise generative AI offerings under the NVIDIA AI Foundry banner, and the public marketing page that previously promoted Picasso started redirecting to a page titled NVIDIA Edify, AI Foundry for Generative AI Models. By autumn 2024 NVIDIA was describing Edify on AI Foundry as the consolidated product, with Picasso largely retired as an outward-facing name even though the underlying DGX Cloud foundry continued to run.
The NIM preview path for Edify models had a shorter shelf life than the partner deployments. NVIDIA announced in 2025 that Edify NIM microservices were no longer available as previews on build.nvidia.com, and pointed developers toward the partner-hosted services (Getty Images, iStock, Shutterstock) for production use. The Edify research papers and the Picasso-trained models embedded in those partner products remain the durable artifacts of the program.
Looking back, Picasso was a hedge as much as a product. NVIDIA had spent most of 2022 watching its GPU customers monetize generative AI at the application layer (OpenAI, Stability AI, Midjourney) while NVIDIA's own revenue was limited to hardware sales. AI Foundations was an attempt to capture some of that application-layer value directly. The decision to anchor Picasso on cleanly licensed datasets through Getty Images and Shutterstock looks better in hindsight as copyright disputes piled up around models trained on uncredited web scrapes.
The service never matched the consumer mindshare of Stable Diffusion or Midjourney, but it did seed a class of enterprise visual generative products that emphasize provenance and indemnification: Getty Images' generative service, iStock's generative service, Shutterstock's Creative AI, and various WPP brand pipelines. These services still rely on Edify models trained originally inside Picasso.
NVIDIA never published a public price list for Picasso. The standard model was an enterprise sales motion: customers contacted NVIDIA, signed an agreement, and were quoted based on reserved DGX Cloud capacity plus partner licensing pass-through (for example, royalties to Getty Images contributors). DGX Cloud itself is priced on a reserved-capacity basis with on-demand options through cloud partners. Picasso started in private preview and required application; later, access shifted toward the AI Foundry and NIM offerings.
The Picasso program had a quieter public footprint than NVIDIA's GPU launches or NeMo updates, and several observers noted that the consumer quality of Picasso outputs lagged behind the leading public image generators in 2023 and 2024. The product was also complex, sitting at the intersection of an unfamiliar billing model (DGX Cloud reservations), a new architecture (Edify), and a partner ecosystem that did not always tell a clear story to end customers. The shift to the Edify and AI Foundry names in 2024 helped, but it also made it harder to track what "Picasso" referred to at any given point in time. The retirement of Edify NIM previews in 2025 underlined that NVIDIA prefers customers consume these visual models through partner products rather than through a self-serve developer microservice.