# D-ID

> Source: https://aiwiki.ai/wiki/d_id
> Updated: 2026-06-23
> Categories: AI Companies, Generative AI, Video Generation
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**D-ID** is an Israeli [artificial intelligence](/wiki/artificial_intelligence) company that turns a single still photograph into a photorealistic, talking [digital human](/wiki/digital_human) video using [generative AI](/wiki/generative_ai). Founded in 2017 in Tel Aviv by Gil Perry, Sella Blondheim, and Eliran Kuta, three veterans of the Israel Defense Forces' Unit 8200, D-ID began as a facial de-identification tool that protected images from [facial recognition](/wiki/facial_recognition) before pivoting to synthetic media [1][3][4]. Its flagship product, the Creative Reality Studio, generates AI videos of lifelike talking avatars from one image and a text script, and the company has raised a total of $48 million in funding, including a $25 million Series B led by Macquarie Capital in March 2022 [2]. D-ID's technology powered MyHeritage's viral Deep Nostalgia feature, which animated nearly 100 million faces in old family photographs within roughly a year of its 2021 launch [2][5][8].

The pivot made D-ID one of the leading platforms in the AI avatar and synthetic media space. Its customers include Warner Bros. Studios, Mondelez, Publicis, and MyHeritage, and it now markets real-time conversational agents that connect its face animation to [large language models](/wiki/large_language_model) [1][12].

## What does D-ID do?

D-ID builds software that synthesizes video of a human face speaking. Given a photograph (or a stock presenter) plus either a text script or an audio file, the system produces a video in which the photographed person appears to talk, with synchronized lip movements, natural head motion, blinking, and facial expressions. The company brands this capability "Creative Reality" and applies it across three product layers: self-service video creation (Creative Reality Studio), developer APIs (Video and Chat/Agents APIs), and real-time interactive [digital human](/wiki/digital_human) agents. The same underlying face-reenactment models that power consumer animations also drive enterprise avatars used for marketing, training, and customer service [1][7][12].

## History

### Founding and early focus

D-ID was founded in 2017 by Gil Perry (CEO), Sella Blondheim (COO), and Eliran Kuta (CTO). The three founders are veterans of the Israel Defense Forces' elite Unit 8200 intelligence corps and met during their military service [3]. Perry holds a B.Sc. in Computer Science from Tel Aviv University with a focus on [computer vision](/wiki/computer_vision) and image processing. Kuta is a computer vision and image processing expert with extensive management and development experience. Blondheim brings expertise in project management, advertising, and marketing, and holds a B.Sc. in Industrial Engineering from Shenkar College of Engineering and Design [3].

The company's original mission centered on de-identification technology, a system designed to protect photographs from facial recognition software. The name "D-ID" itself is shorthand for "de-identification." The founders were motivated by growing concerns about mass surveillance and the erosion of privacy through facial recognition systems deployed without consent. In the years around the company's founding, governments and corporations were deploying facial recognition at an unprecedented scale, from public spaces to social media platforms, and the founders saw an urgent need for counter-technology that could protect individuals' visual identities [4].

D-ID's initial product could alter images in ways imperceptible to the human eye but sufficient to prevent facial recognition algorithms from matching those images to identity databases. The technology applied subtle pixel-level perturbations to facial images that disrupted the mathematical representations (known as face embeddings) that recognition algorithms use to identify individuals. From a human perspective, the altered photo looked identical to the original, but to a facial recognition system, the two images appeared to belong to different people [4].
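The idea of a small pixel change producing a large embedding change can be illustrated with a toy example. The sketch below is not D-ID's method: it assumes a made-up linear "embedding" matrix `W`, a hypothetical match threshold, and a generic sign-of-gradient perturbation step (in the style of well-known adversarial-example techniques). Real recognizers use deep networks with far more dimensions.

```python
import math

# Toy "face embedding": a fixed linear projection of 4 pixel values to 2 numbers.
# This matrix is invented for illustration; real systems use deep networks.
W = [
    [1.0, 1.0, 1.0, 1.0],
    [10.0, -10.0, 10.0, -10.0],
]

def embed(pixels):
    """Project a pixel vector into the toy embedding space."""
    return [sum(w * p for w, p in zip(row, pixels)) for row in W]

def cosine(a, b):
    """Cosine similarity between two 2D embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def sign(value):
    return (value > 0) - (value < 0)

def perturb(pixels, direction, epsilon):
    """Shift each pixel by +/- epsilon so the embedding moves along `direction`."""
    # The gradient of dot(embed(pixels), direction) w.r.t. the pixels is W^T @ direction.
    grad = [
        sum(W[k][i] * direction[k] for k in range(len(W)))
        for i in range(len(pixels))
    ]
    # Clamp so pixel values stay in the valid [0, 1] range.
    return [min(1.0, max(0.0, p + epsilon * sign(g))) for p, g in zip(pixels, grad)]

original = [0.5, 0.5, 0.5, 0.5]
protected = perturb(original, direction=[0.0, 1.0], epsilon=0.02)

MATCH_THRESHOLD = 0.95  # hypothetical recognizer threshold
similarity = cosine(embed(original), embed(protected))
max_pixel_change = max(abs(a - b) for a, b in zip(original, protected))
is_match = similarity >= MATCH_THRESHOLD
```

In this toy setup no pixel moves by more than 0.02 on a 0 to 1 scale, yet the similarity between the two embeddings falls below the assumed threshold, so the recognizer would no longer treat the two images as the same person.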

D-ID participated in Y Combinator and gradually expanded its technology from privacy protection into creative applications of face animation and synthetic media [3].

### Why did D-ID pivot from privacy to generative AI?

While the privacy-focused de-identification technology attracted early interest from security-conscious organizations and privacy advocates, D-ID's leadership recognized a broader commercial opportunity in using their [deep learning](/wiki/deep_learning) face animation capabilities for content creation. The company began developing tools that could take a single still photograph of a face and animate it to speak, blink, and express emotions with convincing realism.

This pivot positioned D-ID within the rapidly growing synthetic media industry, where businesses and individuals sought efficient ways to produce video content without traditional filming equipment, actors, or studios. The shift also aligned with the explosive growth of [AI-generated content](/wiki/ai_generated_content) tools in the early 2020s, as advances in [generative adversarial networks](/wiki/generative_adversarial_network), [diffusion models](/wiki/diffusion_model), and transformer architectures made increasingly realistic synthetic media possible.

The pivot did not mean D-ID abandoned its privacy roots entirely. The company continued to offer de-identification services alongside its creative tools, and the technical expertise in understanding and manipulating facial features at the pixel level directly informed the quality of its generative products.

### Growth and viral success (2021-2023)

D-ID's breakout moment came in February 2021 when MyHeritage launched the Deep Nostalgia feature powered by D-ID's Live Portrait technology. The feature, which animated faces in old family photographs, went viral on social media, particularly on TikTok, where millions of users shared videos of their deceased relatives appearing to move and look around. The emotional resonance of seeing long-dead family members "come to life" drove extraordinary engagement: more than 1 million photos were animated in the first 48 hours, 26 million animations were created in the first 11 days, and roughly 72 million animations had been generated within five weeks of launch [5][13]. The MyHeritage mobile app became the number one free iPhone app in the United States for seven days [13].

The viral success of Deep Nostalgia significantly raised D-ID's profile and demonstrated the consumer appeal of face animation technology. It also showed that the company's technology could handle a vast range of input image qualities, from crisp modern photographs to faded, damaged images from the 19th century.

Building on this momentum, D-ID launched the Creative Reality Studio in 2022, expanding from a technology provider (powering other companies' products) to a direct-to-consumer and direct-to-business platform. The studio made D-ID's face animation technology available to anyone through a web browser, dramatically lowering the barrier to entry for AI video creation [7].

### Real-time agents (2024-2026)

In 2024, D-ID shifted its strategic focus from generating pre-rendered videos to real-time conversational [digital human](/wiki/digital_human) agents. On February 26, 2024, the company announced the general availability of D-ID Agents, autonomous AI avatars that take verbal commands and respond in multiple languages using facial expressions and hand gestures. The agents use retrieval-augmented generation (RAG) to ground their answers in supplied documents, and the company said they deliver responses with over 90% accuracy in under two seconds [12]. Gil Perry, CEO and co-founder, described the launch as "a leap forward in our mission to bridge the communications gap between humans and rapidly advancing tech," adding that "the natural progression was to move from text-based interactions to audio and video" [12].

On March 16, 2026, D-ID launched V4 Expressive Visual Agents, a diffusion-based avatar system trained on real actor performances. V4 added dynamic sentiment alignment (facial expressions that adapt to the tone and intent of a conversation), an optional real-time camera layer for sentiment awareness, and inline interactive UI elements such as forms, charts, and quizzes. The company reported sub-0.5-second latency for conversational turns and output of up to 4K resolution, and stated the system was roughly 70 times cheaper to run than Google's Veo 3 Fast for comparable real-time interaction [14]. "With V4, we're setting a new benchmark for avatar fidelity and performance while keeping it fast enough for real-time conversations," Perry said [14].

## Technology

### How does D-ID animate a photo?

D-ID's core technology uses [deep learning](/wiki/deep_learning) models to analyze the structure of a human face in a photograph, then generate frame-by-frame animations that produce natural-looking head movements, lip synchronization, and facial expressions. The system processes a static input image along with either a text script or audio file and outputs a video in which the photographed individual appears to speak the provided content.

The animation pipeline handles several technical challenges simultaneously:

| Challenge | D-ID's Approach |
|-----------|----------------|
| Lip synchronization | Deep learning model maps phonemes to mouth shapes in real time |
| Head movement | Generates natural micro-movements to avoid an uncanny stillness |
| Eye blinking | Adds realistic blink patterns and gaze behavior |
| Facial expressions | Supports emotional range through configurable expression parameters |
| Multi-language support | Handles 119+ languages with appropriate mouth shapes for each phonetic system |
| Resolution preservation | Maintains the quality of the original input photograph in the output video |
| Occlusion handling | Manages partial face visibility (hair, glasses, headwear) |
| Aging and damage tolerance | Processes degraded historical photographs alongside modern high-resolution images |

The system's approach to lip synchronization is particularly notable. Rather than using a simple lookup table that maps speech sounds to mouth shapes, D-ID's model generates continuous, fluid mouth movements that account for coarticulation (the phenomenon where the pronunciation of one sound is influenced by surrounding sounds). This produces more natural-looking speech than simpler approaches, which can result in robotic or disjointed mouth movements.
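The difference between a lookup table and context-aware mouth movement can be shown in a few lines. This is a simplified sketch, not D-ID's model: the mouth-openness values and the `neighbor_weight` blending factor are assumptions chosen for illustration, and a production system would learn these relationships from data rather than apply a fixed weighted average.

```python
# Hypothetical mouth-openness targets per phoneme (0 = closed, 1 = wide open).
MOUTH_OPENNESS = {"m": 0.0, "a": 1.0, "p": 0.0, "o": 0.7, "s": 0.2}

def lookup_track(phonemes):
    """Naive approach: each phoneme gets its table value, ignoring context."""
    return [MOUTH_OPENNESS[p] for p in phonemes]

def coarticulated_track(phonemes, neighbor_weight=0.25):
    """Blend each target with its neighbours so adjacent sounds influence it."""
    targets = lookup_track(phonemes)
    center_weight = 1.0 - 2 * neighbor_weight
    blended = []
    for i, value in enumerate(targets):
        # At the edges, reuse the current value in place of the missing neighbour.
        prev_value = targets[i - 1] if i > 0 else value
        next_value = targets[i + 1] if i < len(targets) - 1 else value
        blended.append(
            neighbor_weight * prev_value
            + center_weight * value
            + neighbor_weight * next_value
        )
    return blended

phonemes = ["m", "a", "p"]
naive = lookup_track(phonemes)          # abrupt: closed, open, closed
smooth = coarticulated_track(phonemes)  # neighbours pull each value toward them
```

For the sequence "m", "a", "p", the lookup track jumps straight from closed to fully open and back, while the blended track opens the mouth less on the vowel and keeps it slightly open on the consonants, which is the kind of neighbour influence coarticulation describes.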

### Live Portrait technology

D-ID's proprietary Live Portrait technology forms the foundation of several of its products and partnerships. This system was the technology behind the viral Deep Nostalgia feature on MyHeritage, which animated the faces in historical family photographs. Live Portrait works by mapping a pre-recorded "driver" video of facial movements onto a target still image, transferring motion from one face to another while preserving the identity of the person in the photograph [5]. Within a few days of the Deep Nostalgia launch, the technology was animating faces "at a rate of thousands per hour" [5].
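The driver-to-target mapping can be sketched at the level of facial keypoints. The example below is an illustrative simplification, not the Live Portrait implementation: the landmark coordinates are invented, and the function only moves keypoints, whereas a real system would also have to synthesize the pixels of each output frame.

```python
def transfer_motion(target_landmarks, driver_neutral, driver_frame):
    """Apply the driver's per-landmark displacement to the target face.

    Using the displacement (driver_frame - driver_neutral) rather than the
    driver's absolute positions keeps the target's own face shape intact.
    """
    moved = []
    for (tx, ty), (nx, ny), (fx, fy) in zip(
        target_landmarks, driver_neutral, driver_frame
    ):
        moved.append((tx + (fx - nx), ty + (fy - ny)))
    return moved

# Hypothetical 2D landmarks: left eye, right eye, mouth centre.
target = [(30, 40), (70, 40), (50, 80)]
driver_neutral = [(32, 42), (68, 42), (50, 78)]
driver_smile = [(32, 42), (68, 42), (50, 75)]  # mouth moved up by 3 px

animated = transfer_motion(target, driver_neutral, driver_smile)
```

Here only the mouth landmark changes, and it moves by the same 3 pixels as in the driver, while the target's eye positions (which differ from the driver's) are left where they were.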

The technology handles a range of challenging input conditions:

- **Low-resolution images**: Historical photographs that may be only a few hundred pixels across
- **Damaged or faded photographs**: Images with scratches, discoloration, or missing portions
- **Non-frontal poses**: Faces captured at angles rather than directly facing the camera
- **Group photographs**: The ability to animate individual faces within larger group images
- **Painted or illustrated portraits**: The system can animate artistic renderings, not just photographs

### Natural User Interface (NUI) agents

Beyond pre-recorded video generation, D-ID has developed technology for real-time interactive digital humans. These Natural User Interface agents combine D-ID's face animation with [large language model](/wiki/large_language_model) text generation, enabling face-to-face conversations between users and AI-driven avatars. The agents can be integrated into websites, applications, and kiosks to serve as virtual customer service representatives, tutors, or brand ambassadors [6][12].

NUI agents represent an evolution from passive content creation (generating a video) to active interaction (holding a conversation). The technical requirements are significantly more demanding, as the system must generate facial animations in real time with low enough latency to feel conversational. D-ID achieves this through a streaming architecture where the avatar begins responding visually before the full response has been generated, similar to how [ChatGPT](/wiki/chatgpt) streams text responses token by token.
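The streaming idea can be sketched with Python generators. This is a minimal illustration under stated assumptions: `stream_tokens` stands in for a language model that emits tokens one at a time, and `sentences_as_ready` is a hypothetical helper that hands each sentence to the animation stage as soon as it is complete. It does not reflect D-ID's actual pipeline.

```python
def stream_tokens(text):
    """Stand-in for an LLM that yields its reply one token at a time."""
    for token in text.split():
        yield token

def sentences_as_ready(tokens):
    """Yield each sentence as soon as its closing punctuation arrives."""
    buffer = []
    for token in tokens:
        buffer.append(token)
        if token.endswith((".", "?", "!")):
            yield " ".join(buffer)
            buffer = []
    if buffer:  # flush trailing text that lacks end punctuation
        yield " ".join(buffer)

reply = "Hello there. Your order shipped today. Anything else"
chunks = list(sentences_as_ready(stream_tokens(reply)))
```

Because both functions are generators, the first sentence becomes available after only two tokens have been produced, so an avatar could start speaking it while the rest of the reply is still being generated.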

Typical NUI agent deployments include:

| Use Case | Description |
|----------|-------------|
| Customer support kiosks | AI avatars that greet customers and answer questions in retail locations, airports, or hotels |
| Virtual tutors | Interactive educational characters that teach language, science, or other subjects |
| Corporate training | Onboarding presenters that guide new employees through company policies and procedures |
| Healthcare information | Patient-facing avatars that explain medical procedures or medication instructions |
| Museum and exhibit guides | Interactive characters that provide information about exhibits in cultural institutions |

## Products

### Creative Reality Studio

The Creative Reality Studio is D-ID's flagship self-service platform, launched in 2022 to allow businesses and individual creators to produce AI-generated videos without technical expertise. The studio combines D-ID's face animation technology with text-to-speech capabilities and LLM integration into a single browser-based interface [7]. D-ID markets the platform as a way to "transform text, audio, or still images into high-quality videos with lifelike digital avatars, no cameras or crews needed" [15].

Key features of the Creative Reality Studio include:

- **Photo-to-video conversion**: Users upload a photograph (or select from a library of stock presenters) and provide a text script. The system generates a video of the photographed person delivering the script with natural lip sync and head movements.
- **Presenter library**: A collection of diverse AI presenters spanning different ages, ethnicities, and professional styles that users can select for their videos, eliminating the need to use personal photographs.
- **Multi-language support**: Text-to-speech and lip synchronization in over 119 languages and dialects, making the platform accessible to global audiences. Video translation with voice cloning is offered in 40+ languages [15].
- **Emotional expression controls**: Configurable parameters that allow users to set the emotional tone of the presenter's delivery, including happiness, seriousness, concern, and enthusiasm.
- **Batch video creation**: Tools for generating multiple personalized videos at scale, useful for marketing campaigns and customer communications where each recipient receives a video addressed specifically to them.
- **Script from URL**: The ability to generate a video script automatically from a URL, allowing users to turn blog posts or web pages into video presentations.
- **Output resolution**: Standard AI presenters render at up to 1280x1280 pixels, while premium presenters render at 1080p on paid plans; individual clips are capped at 5 minutes and uploaded images at 10 MB [15].

### Chat API

D-ID offers a Chat API that enables developers to build face-to-face conversational experiences powered by AI. This API connects D-ID's real-time face animation with LLMs such as [GPT-4](/wiki/gpt-4) and [Claude](/wiki/claude_ai), allowing developers to create interactive digital humans that can hold natural conversations. The Chat API supports streaming, meaning the avatar responds in real time rather than requiring pre-rendered video [6].

The API provides several configuration options:

- **Avatar customization**: Developers can specify which presenter or uploaded image to use as the conversational partner
- **Voice selection**: Integration with multiple text-to-speech providers, including ElevenLabs and Azure Speech
- **LLM selection**: Developers can connect the avatar to their preferred language model
- **Knowledge base attachment**: Custom documents and data can be attached to inform the avatar's responses
- **Conversation memory**: The system maintains context across multiple exchanges within a session
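The options above can be pictured as a single session object. The sketch below is purely illustrative: the class name, its fields, and the fixed-size turn history are assumptions made for this example and are not D-ID's API schema.

```python
from collections import deque

class AgentSession:
    """Hypothetical container for the configuration options listed above."""

    def __init__(self, presenter_image, voice_provider, llm_name,
                 knowledge_docs=(), max_turns=3):
        self.presenter_image = presenter_image   # avatar customization
        self.voice_provider = voice_provider     # voice selection
        self.llm_name = llm_name                 # LLM selection
        self.knowledge_docs = list(knowledge_docs)  # knowledge base attachment
        # Conversation memory: a bounded deque drops the oldest turn when full.
        self.history = deque(maxlen=max_turns)

    def record_turn(self, user_text, agent_text):
        self.history.append((user_text, agent_text))

    def context(self):
        """Flatten remembered turns into text to send with the next request."""
        return "\n".join(f"User: {u}\nAgent: {a}" for u, a in self.history)

session = AgentSession("presenter.jpg", "example-tts", "example-llm", max_turns=2)
session.record_turn("Hi", "Hello!")
session.record_turn("Where is my order?", "It shipped today.")
session.record_turn("Thanks", "You're welcome.")
```

With `max_turns=2`, the first greeting is dropped once the third exchange is recorded, which shows one simple way a session can keep recent context without growing without bound.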

### Agents platform

The D-ID Agents platform allows businesses to deploy persistent AI characters that maintain context across conversations, remember previous interactions, and can be trained on company-specific knowledge bases. The platform became generally available on February 26, 2024, and uses RAG to ground its answers; D-ID reported over 90% accuracy with responses in under two seconds [12]. In March 2026 it was upgraded with the diffusion-based V4 Expressive Visual Agents, for which the company reported sub-0.5-second response latency and up to 4K output [14]. These agents are designed for use cases such as customer support, employee training, and interactive educational content. Unlike the Chat API, which is designed for developers, the Agents platform provides a no-code interface for business users to create and manage their digital human deployments.
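The general shape of retrieval-augmented generation can be sketched as "find the most relevant document, then put it in the prompt." The example below is a minimal stand-in, not D-ID's implementation: it ranks documents by simple word overlap, whereas production systems typically use vector embeddings, and the function names and prompt wording are assumptions.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Place the retrieved text ahead of the question so the LLM can cite it."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Returns are accepted within 30 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
]
prompt = build_prompt("How long does shipping take?", docs)
```

The question shares the word "shipping" with the second document only, so that document is the one placed in the prompt, and the language model is asked to answer from it rather than from its general training data.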

### Video API

For developers who need to generate talking-head videos programmatically (rather than interactively), D-ID provides a Video API. This API accepts an image, a script or audio file, and configuration parameters, and returns a rendered video. Common use cases include automated video generation pipelines for e-commerce product descriptions, news summaries, and personalized marketing campaigns.
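A request to such an API might be assembled as follows. This sketch only builds a JSON body and makes no network call; every field name (`image_url`, `speech`, `config`) is a hypothetical placeholder rather than D-ID's documented schema, so the official API reference should be consulted for the real parameter names.

```python
import json

def build_video_request(image_url, script_text=None, audio_url=None, config=None):
    """Assemble a JSON-ready request body; all field names are hypothetical."""
    # The API description calls for a script OR an audio file, so require exactly one.
    if (script_text is None) == (audio_url is None):
        raise ValueError("provide exactly one of script_text or audio_url")
    if script_text is not None:
        speech = {"type": "text", "input": script_text}
    else:
        speech = {"type": "audio", "audio_url": audio_url}
    return {"image_url": image_url, "speech": speech, "config": dict(config or {})}

body = build_video_request(
    "https://example.com/presenter.jpg",
    script_text="Welcome to our spring catalogue.",
    config={"expression": "happy"},
)
payload = json.dumps(body)  # string that a client would send as the request body
```

Validating the inputs before sending keeps an automated pipeline from submitting jobs that have no speech source or two conflicting ones.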

## Partnerships and notable projects

### MyHeritage Deep Nostalgia

The most publicly visible application of D-ID's technology is the Deep Nostalgia feature, developed in partnership with genealogy platform MyHeritage. Launched in February 2021, Deep Nostalgia allows users to animate the faces in old family photographs, creating short videos that show how deceased relatives might have moved and looked in life. The feature became a viral sensation, generating hundreds of millions of views on social media platforms including TikTok and Instagram. By the time D-ID closed its Series B in March 2022, nearly 100 million faces had been animated through the feature [2][5][8].

Following the success of Deep Nostalgia, D-ID and MyHeritage expanded their partnership with the LiveStory feature, which adds narrated voice to the animated photographs, allowing users to hear what appears to be their ancestors speaking. This feature combines D-ID's face animation with text-to-speech technology to produce a fully narrated video from a single photograph and a written script [8].

### ElevenLabs partnership

D-ID partnered with [ElevenLabs](/wiki/elevenlabs) to integrate premium AI-generated voices into the Creative Reality Studio. This partnership gives D-ID users access to ElevenLabs' highly realistic [text-to-speech](/wiki/text_to_speech_ai) voices, which are known for their natural intonation, emotional expressiveness, and support for voice cloning. The integration means users can pair D-ID's visual realism with ElevenLabs' audio realism, producing videos where both the face and voice are AI-generated but appear convincingly human [9].

### Enterprise clients

D-ID serves a range of enterprise customers across industries:

| Client | Industry | Use Case |
|--------|----------|----------|
| Warner Bros. Studios | Entertainment | Film and entertainment production applications |
| Mondelez | Consumer goods | Marketing and brand communication videos |
| Publicis | Advertising | Creative campaigns at scale |
| MyHeritage | Genealogy | Deep Nostalgia and LiveStory features |
| Various educational institutions | Education | Virtual tutors and training content |
| Healthcare providers | Healthcare | Patient education and information delivery |

## Pricing

D-ID operates on a credit-based pricing model with several tiers designed for different user segments. Credits are consumed based on the duration of video generated, with one credit typically corresponding to a set amount of video time.

| Plan | Price | Key Features |
|------|-------|--------------|
| Free Trial | $0 (14 days) | Limited credits, watermarked output, access to core features |
| Lite | Starting at $6/month | Basic video generation, limited credits per month |
| Pro | Mid-tier pricing | Increased credits, no watermark, API access, premium presenters |
| Advanced | Higher tier | More credits, priority rendering, advanced avatar features |
| Enterprise | Custom pricing | Unlimited or custom usage, dedicated support, custom integrations, SLA guarantees |

API pricing operates on a separate structure with per-minute or per-credit billing. Both monthly and annual billing options are available, with annual plans offering discounted rates. Credits are issued monthly on the billing date and do not roll over to subsequent months [10].
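Credit consumption under a duration-based model can be estimated with simple arithmetic. The sketch below rests on two assumptions that the pricing page would need to confirm: that one credit covers 15 seconds of video (the `seconds_per_credit` default is a placeholder) and that partial blocks are rounded up to a whole credit.

```python
import math

def credits_needed(video_seconds, seconds_per_credit=15):
    """Round up: a partial block of video time is assumed to cost a whole credit."""
    return math.ceil(video_seconds / seconds_per_credit)

def remaining_after_month(monthly_credits, video_lengths_seconds, seconds_per_credit=15):
    """Credits left at the end of a billing month for a list of video lengths."""
    used = sum(credits_needed(s, seconds_per_credit) for s in video_lengths_seconds)
    if used > monthly_credits:
        raise ValueError("not enough credits for these videos")
    return monthly_credits - used

# Three videos of 40, 15, and 16 seconds against a 10-credit monthly allowance.
left_over = remaining_after_month(10, [40, 15, 16])
```

Because credits do not roll over, any balance left at the end of the month is lost and the next month starts again from the plan's allowance, so it can be worth scheduling video generation to use the allowance before the billing date.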

## Funding

### How much money has D-ID raised?

D-ID has raised a total of approximately $48 million across multiple funding rounds [2].

| Round | Amount | Lead Investor(s) | Notable Participants | Date |
|-------|--------|-------------------|---------------------|------|
| Seed | Undisclosed | Various | Early-stage investors | 2018 |
| Series A | $13.5 million | AXA Venture Partners, Pitango | OurCrowd | 2021 |
| Series B | $25 million | Macquarie Capital | Pitango, AXA Venture Partners, OurCrowd, OIF, Maverick, Marubeni | March 2022 |
| **Total** | **~$48 million** | | | |

The Series B round, announced on March 22, 2022 and led by Macquarie Capital, brought D-ID's total funding to $48 million [2]. "We are incredibly grateful for this new round of funding and strong partnership with Macquarie, which will enable us to scale up our business and our technology to the next level," said Gil Perry, CEO and co-founder [2]. The company stated that the funds would be used to expand its Creative Reality platform, invest in research and development, and grow its team. The funding reflected investor confidence in the growing market for AI-generated video content, which was seeing rapid adoption across marketing, education, and entertainment sectors [2].

## Competition

### How does D-ID compare to Synthesia and HeyGen?

D-ID operates in the competitive AI video generation and synthetic media market alongside several other platforms.

| Competitor | Headquarters | Focus | Key Differentiator |
|------------|--------------|-------|--------------------|
| [Synthesia](/wiki/synthesia) | London, UK | Enterprise video | Highly realistic studio-quality avatars, strong corporate adoption |
| [HeyGen](/wiki/heygen) | Los Angeles, USA | Script-to-video | 4K export, superior lip-sync, comprehensive template library |
| Colossyan | Budapest, Hungary | Training video | Emphasis on educational and corporate training content |
| Elai.io | Delaware, USA | Automated video | Video generation from text, URLs, and slides |
| DeepBrain AI | Seoul, South Korea | Broadcast video | Broadcast-quality avatars for news and media |
| Hour One | Tel Aviv, Israel | Enterprise | Presentations, training, and compliance content |
| Pictory | Dublin, Ireland | Content repurposing | Converting long-form content into short videos |

D-ID differentiates itself primarily in four ways: it can animate any photograph into a talking avatar (rather than requiring users to choose from a fixed library of pre-recorded presenters), it supports 119+ languages, it offers a strong API for developers building custom applications, and it has a proven track record with viral consumer products like Deep Nostalgia. However, competitors like Synthesia and HeyGen are often noted for producing higher visual fidelity in their avatar outputs, particularly for corporate and enterprise use cases. HeyGen, in particular, has gained ground with its 4K output quality and more expressive avatar animations [11]. With the diffusion-based V4 system and its claimed up-to-4K output, D-ID has moved to close that fidelity gap while emphasizing real-time conversational latency as its key advantage [14].

## Privacy and ethical considerations

### Is D-ID a deepfake company?

D-ID's technology raises important questions about synthetic media ethics. The ability to make any photographed person appear to speak arbitrary content has obvious potential for misuse, including the creation of [deepfakes](/wiki/deepfake) and misinformation. D-ID positions itself as a responsible, consent-based generative video company rather than a deepfake tool, citing its origins in privacy-protection technology.

D-ID addresses these concerns through several measures:

- **Consent requirements**: The platform's terms of service require users to have the right to use any photographs they upload, and the platform prohibits generating content that misrepresents real individuals.
- **Content moderation**: Automated systems and human review processes detect and prevent abuse, including attempts to generate political misinformation or non-consensual intimate content.
- **Watermarking**: Free-tier videos include visible watermarks indicating AI generation, providing a basic layer of provenance tracking.
- **De-identification roots**: The company's original privacy-protection technology informs its approach to responsible AI development, and D-ID has contributed to industry discussions about synthetic media governance.
- **Regulatory engagement**: D-ID has engaged with policymakers in the EU, US, and Israel on frameworks for responsible synthetic media, including disclosure requirements and consent standards.

The broader industry of AI-generated synthetic media continues to face regulatory scrutiny, with proposed legislation in multiple jurisdictions requiring disclosure when AI-generated content depicts real people. The [EU AI Act](/wiki/eu_ai_act), for instance, includes transparency requirements for AI-generated content, and similar proposals are advancing in the United States and other countries.

## See also

- [Digital human](/wiki/digital_human)
- [Generative AI](/wiki/generative_ai)
- [Deepfake](/wiki/deepfake)
- [Synthesia](/wiki/synthesia)
- [HeyGen](/wiki/heygen)
- [ElevenLabs](/wiki/elevenlabs)
- [Computer vision](/wiki/computer_vision)
- [Text-to-speech](/wiki/text_to_speech_ai)
- [Facial recognition](/wiki/facial_recognition)

## References

1. [D-ID Official Website](https://www.d-id.com/)
2. [D-ID Closes $25 Million Funding Round, Bringing Total Funding to $48 Million - PR Newswire](https://www.prnewswire.com/news-releases/d-id-closes-25-million-funding-round-bringing-total-funding-for-the-creative-reality-company-to-48-million-301507986.html)
3. [D-ID Company Information - Y Combinator](https://www.ycombinator.com/companies/d-id)
4. [D-ID: About Us](https://www.d-id.com/about-us/)
5. [Deep Nostalgia Technology Powered by D-ID's Live Portrait - D-ID Blog](https://www.d-id.com/blog/d-ids-live-portrait-technology-the-tech-behind-deep-nostalgia/)
6. [D-ID Chat API - Gil Perry LinkedIn](https://ee.linkedin.com/posts/gil-perry-d-id_d-id-unveils-new-chat-api-to-enable-face-to-face-activity-7036680752180117505-rdS2)
7. [D-ID Launches Creative Reality Studio - D-ID News](https://www.d-id.com/news/d-id-launches-creative-reality-studio-self-service-video-creation-platform-with-hyper-real-ai-presenters/)
8. [MyHeritage and D-ID Partner to Bring Photos to Life - TechCrunch](https://techcrunch.com/2022/03/03/myheritage-and-d-id-partner-to-bring-photos-to-life-with-both-animations-and-voice/)
9. [D-ID and ElevenLabs Announce Partnership - D-ID News](https://www.d-id.com/news/d-id-partners-with-elevenlabs/)
10. [D-ID Pricing - D-ID](https://www.d-id.com/pricing/studio/)
11. [Best D-ID Alternatives - Synthesia](https://www.synthesia.io/post/best-d-id-alternatives)
12. [D-ID Announces General Availability of Agents, Real-Time Conversational AI Avatars with RAG Technology - PR Newswire](https://www.prnewswire.com/news-releases/d-id-announces-general-availability-of-agents---real-time-conversational-ai-avatars-with-rag-technology-302070899.html)
13. [Deep Nostalgia Goes Viral and Reaches 26 Million Animations - MyHeritage Blog](https://blog.myheritage.com/2021/03/26-million-animations-created-with-deep-nostalgia/)
14. [D-ID Launches V4 Expressive Visual Agents for Real-Time, LLM-Connected Interaction at Enterprise Scale - D-ID News](https://www.d-id.com/news/v4-expressive-visual-agents-real-time-llm-connected-interaction/)
15. [D-ID's Creative Reality Studio - Generative AI Video Creator](https://www.d-id.com/creative-reality-studio/)