D-ID is an Israeli artificial intelligence company specializing in generative AI technology for creating photorealistic digital humans and talking avatars from still images. Founded in 2017 by Gil Perry, Sella Blondheim, and Eliran Kuta, the company is headquartered in Tel Aviv, Israel, with additional offices in Wilmington, Delaware. D-ID's flagship product, the Creative Reality Studio, allows users to produce AI-generated videos featuring lifelike talking avatars from a single photograph, supporting over 119 languages. The company has raised approximately $48 million in total funding and serves customers including Warner Bros. Studios, Mondelez, Publicis, and MyHeritage [1][2].
D-ID originally focused on protecting images from unauthorized facial recognition and mass surveillance. Over time, the company pivoted toward generative AI video production, becoming one of the leading platforms in the AI avatar and synthetic media space. The company's technology gained global recognition through its partnership with MyHeritage on the viral Deep Nostalgia feature, which has been used to animate over 118 million faces in historical photographs.
D-ID was founded in 2017 by Gil Perry (CEO), Sella Blondheim (COO), and Eliran Kuta (CTO). The three founders are veterans of the Israel Defense Forces' elite intelligence unit, Unit 8200, and met during their military service. Perry holds a B.Sc. in Computer Science from Tel Aviv University with a focus on computer vision and image processing. Kuta is a computer vision and image processing expert with extensive management and development experience, while Blondheim brings expertise in project management, advertising, and marketing, holding a B.Sc. in Industrial Engineering from Shenkar College of Engineering and Design [3].
The company's original mission centered on de-identification technology, a system designed to protect photographs from facial recognition software. The name "D-ID" itself is shorthand for "de-identification." The founders were motivated by growing concerns about mass surveillance and the erosion of privacy through facial recognition systems deployed without consent. In the years following the founding, governments and corporations had deployed facial recognition at an unprecedented scale, from public spaces to social media platforms, and the founders saw an urgent need for counter-technology that could protect individuals' visual identities [4].
D-ID's initial product could alter images in ways imperceptible to the human eye but sufficient to prevent facial recognition algorithms from matching those images to identity databases. The technology applied subtle pixel-level perturbations to facial images that disrupted the mathematical representations (known as face embeddings) that recognition algorithms use to identify individuals. From a human perspective, the altered photo looked identical to the original, but to a facial recognition system, the two images appeared to belong to different people.
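The principle behind this kind of perturbation can be illustrated with a toy model. The sketch below uses a random linear projection as a stand-in "face embedding" (real recognition networks are deep and nonlinear, and D-ID's actual method is proprietary); it shows how a tiny, bounded pixel change chosen against the embedding's gradient measurably shifts the embedding while leaving the image visually unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "face embedding": a fixed random linear projection.
# This is purely illustrative; it is NOT D-ID's model.
W = rng.normal(size=(128, 64 * 64))

def embed(img):
    v = W @ img.ravel()
    return v / np.linalg.norm(v)

photo = rng.uniform(0, 1, size=(64, 64))
original = embed(photo)

# Adversarial direction: for this linear model, the gradient of the
# match score <original, W x> with respect to the pixels x is W^T original,
# so stepping against its sign pushes the embedding away from the original.
grad = W.T @ original
perturbed = np.clip(photo - 0.03 * np.sign(grad).reshape(64, 64), 0, 1)

print(np.abs(perturbed - photo).max())     # bounded by 0.03: visually imperceptible
print(float(original @ embed(perturbed)))  # cosine similarity drops below 1.0
```

A recognition system that matches faces by thresholding this similarity score can be pushed below its match threshold by such perturbations, even though a human sees no difference between the two images.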
D-ID participated in Y Combinator and gradually expanded its technology from privacy protection into creative applications of face animation and synthetic media.
While the privacy-focused de-identification technology attracted early interest from security-conscious organizations and privacy advocates, D-ID's leadership recognized a broader commercial opportunity in using their deep learning face animation capabilities for content creation. The company began developing tools that could take a single still photograph of a face and animate it to speak, blink, and express emotions with convincing realism.
This pivot positioned D-ID within the rapidly growing synthetic media industry, where businesses and individuals sought efficient ways to produce video content without traditional filming equipment, actors, or studios. The shift also aligned with the explosive growth of AI-generated content tools in the early 2020s, as advances in generative adversarial networks, diffusion models, and transformer architectures made increasingly realistic synthetic media possible.
The pivot did not mean D-ID abandoned its privacy roots entirely. The company continued to offer de-identification services alongside its creative tools, and the technical expertise in understanding and manipulating facial features at the pixel level directly informed the quality of its generative products.
D-ID's breakout moment came in February 2021 when MyHeritage launched the Deep Nostalgia feature powered by D-ID's Live Portrait technology. The feature, which animated faces in old family photographs, went viral on social media, particularly on TikTok, where millions of users shared videos of their deceased relatives appearing to move and look around. The emotional resonance of seeing long-dead family members "come to life" drove extraordinary engagement, and within months, nearly 100 million faces had been animated through the feature [5][8].
The viral success of Deep Nostalgia significantly raised D-ID's profile and demonstrated the consumer appeal of face animation technology. It also showed that the company's technology could handle a vast range of input image qualities, from crisp modern photographs to faded, damaged images from the 19th century.
Building on this momentum, D-ID launched the Creative Reality Studio in 2022, expanding from a technology provider (powering other companies' products) to a direct-to-consumer and direct-to-business platform. The studio made D-ID's face animation technology available to anyone through a web browser, dramatically lowering the barrier to entry for AI video creation.
D-ID's core technology uses deep learning models to analyze the structure of a human face in a photograph, then generate frame-by-frame animations that produce natural-looking head movements, lip synchronization, and facial expressions. The system processes a static input image along with either a text script or audio file and outputs a video in which the photographed individual appears to speak the provided content.
The animation pipeline handles several technical challenges simultaneously:
| Challenge | D-ID's Approach |
|---|---|
| Lip synchronization | Deep learning model maps phonemes to mouth shapes in real time |
| Head movement | Generates natural micro-movements to avoid an uncanny stillness |
| Eye blinking | Adds realistic blink patterns and gaze behavior |
| Facial expressions | Supports emotional range through configurable expression parameters |
| Multi-language support | Handles 119+ languages with appropriate mouth shapes for each phonetic system |
| Resolution preservation | Maintains the quality of the original input photograph in the output video |
| Occlusion handling | Manages partial face visibility (hair, glasses, headwear) |
| Aging and damage tolerance | Processes degraded historical photographs alongside modern high-resolution images |
The system's approach to lip synchronization is particularly notable. Rather than using a simple lookup table that maps speech sounds to mouth shapes, D-ID's model generates continuous, fluid mouth movements that account for coarticulation (the phenomenon where the pronunciation of one sound is influenced by surrounding sounds). This produces more natural-looking speech than simpler approaches, which can result in robotic or disjointed mouth movements.
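The difference between a lookup-table approach and a coarticulation-aware one can be sketched with a toy viseme model. The phoneme set and mouth-openness weights below are invented for illustration (D-ID's actual model is a learned deep network, not a smoothing filter); the point is that blending each mouth shape with its neighbors removes the abrupt frame-to-frame jumps that make lookup-based lip sync look robotic.

```python
import numpy as np

# Invented mouth-openness targets per phoneme (0 = closed, 1 = wide open).
VISEME = {"p": 0.0, "a": 1.0, "o": 0.7, "s": 0.2, "m": 0.0, "e": 0.8}

def naive_track(phonemes, frames_per_phoneme=4):
    # Hard lookup: the mouth jumps instantly between target shapes.
    return np.repeat([VISEME[p] for p in phonemes], frames_per_phoneme)

def coarticulated_track(phonemes, frames_per_phoneme=4, kernel=5):
    # Smooth the target curve so each mouth shape is influenced by its
    # neighbours -- a crude stand-in for learned coarticulation.
    hard = naive_track(phonemes, frames_per_phoneme)
    k = np.ones(kernel) / kernel
    return np.convolve(hard, k, mode="same")

phonemes = ["m", "a", "p", "a", "s"]
hard = naive_track(phonemes)
smooth = coarticulated_track(phonemes)
print(np.abs(np.diff(hard)).max())    # large jumps between frames
print(np.abs(np.diff(smooth)).max())  # much smaller per-frame change
```

In the naive track the mouth snaps from fully closed to fully open in a single frame; the smoothed track spreads that transition across several frames, which is closer to how real articulators move.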
D-ID's proprietary Live Portrait technology forms the foundation of several of its products and partnerships. This system was the technology behind the viral Deep Nostalgia feature on MyHeritage, which animated the faces in historical family photographs. Live Portrait works by mapping a pre-recorded "driver" video of facial movements onto a target still image, transferring motion from one face to another while preserving the identity of the person in the photograph [5].
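The driver-video idea can be sketched in terms of facial keypoints. In this toy example (the keypoint layout and numbers are invented, and Live Portrait's actual pipeline is a learned image-generation model, not simple offset arithmetic), per-frame motion is extracted from the driver as offsets relative to its first frame and applied to the target photo's keypoints, so only motion transfers while the target's identity geometry is preserved.

```python
import numpy as np

# Three illustrative keypoints on the target photo: left eye, right eye, mouth.
target_kp = np.array([[0.30, 0.40], [0.70, 0.40], [0.50, 0.75]])

# Driver "video": the same three keypoints tracked over two frames.
driver = np.array([
    [[0.32, 0.41], [0.72, 0.41], [0.52, 0.74]],   # frame 0 (reference)
    [[0.33, 0.43], [0.73, 0.43], [0.52, 0.70]],   # frame 1 (head tilts, mouth opens)
])

offsets = driver - driver[0]   # motion only; the driver's identity is removed
animated = target_kp + offsets # per-frame keypoints for the still photo
print(animated.shape)          # (frames, points, coords)
```

Frame 0 of the animated track equals the original target keypoints exactly, because the reference frame carries zero offset; later frames inherit the driver's motion on top of the target's own facial geometry.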
The technology handles a range of challenging input conditions, including faces partially occluded by hair, glasses, or headwear, as well as faded or damaged historical photographs alongside modern high-resolution images.
Beyond pre-recorded video generation, D-ID has developed technology for real-time interactive digital humans. These Natural User Interface agents combine D-ID's face animation with large language model text generation, enabling face-to-face conversations between users and AI-driven avatars. The agents can be integrated into websites, applications, and kiosks to serve as virtual customer service representatives, tutors, or brand ambassadors [6].
NUI agents represent an evolution from passive content creation (generating a video) to active interaction (holding a conversation). The technical requirements are significantly more demanding, as the system must generate facial animations in real time with low enough latency to feel conversational. D-ID achieves this through a streaming architecture where the avatar begins responding visually before the full response has been generated, similar to how ChatGPT streams text responses token by token.
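The streaming idea can be sketched as follows. Function names here are illustrative, not D-ID's actual API: a token-streaming LLM call is buffered and flushed to the animation engine at phrase boundaries, so the avatar starts speaking the first phrase while later tokens are still being generated.

```python
from typing import Iterator

def llm_stream(prompt: str) -> Iterator[str]:
    # Stand-in for a token-streaming LLM call (e.g. an SSE response).
    for token in ["Hello", " there", ",", " how", " can", " I", " help?"]:
        yield token

def speak_streaming(prompt: str) -> list[str]:
    """Flush buffered text to the renderer at phrase boundaries."""
    spoken, buffer = [], ""
    for token in llm_stream(prompt):
        buffer += token
        # Phrase boundary: hand this chunk to the animation engine now,
        # rather than waiting for the complete reply.
        if buffer.endswith((",", ".", "!", "?")):
            spoken.append(buffer)   # stand-in for "animate this chunk"
            buffer = ""
    if buffer:
        spoken.append(buffer)
    return spoken

print(speak_streaming("hi"))  # → ['Hello there,', ' how can I help?']
```

The first chunk reaches the renderer after only three tokens, which is what keeps perceived latency conversational even when the full reply takes seconds to generate.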
Typical NUI agent deployments include:
| Use Case | Description |
|---|---|
| Customer support kiosks | AI avatars that greet customers and answer questions in retail locations, airports, or hotels |
| Virtual tutors | Interactive educational characters that teach language, science, or other subjects |
| Corporate training | Onboarding presenters that guide new employees through company policies and procedures |
| Healthcare information | Patient-facing avatars that explain medical procedures or medication instructions |
| Museum and exhibit guides | Interactive characters that provide information about exhibits in cultural institutions |
The Creative Reality Studio is D-ID's flagship self-service platform, launched in 2022 to allow businesses and individual creators to produce AI-generated videos without technical expertise. The studio combines D-ID's face animation technology with text-to-speech capabilities and LLM integration into a single browser-based interface [7].
Key features of the Creative Reality Studio include animating a single uploaded photograph into a talking presenter, generating narration from a typed script via text-to-speech in 119+ languages, and LLM integration for script assistance, all within a browser-based interface.
D-ID offers a Chat API that enables developers to build face-to-face conversational experiences powered by AI. This API connects D-ID's real-time face animation with LLMs such as GPT-4 and Claude, allowing developers to create interactive digital humans that can hold natural conversations. The Chat API supports streaming, meaning the avatar responds in real time rather than requiring pre-rendered video [6].
The API provides configuration options covering the source image for the avatar, the voice used for speech, and the large language model that drives the conversation.
The D-ID Agents platform allows businesses to deploy persistent AI characters that maintain context across conversations, remember previous interactions, and can be trained on company-specific knowledge bases. These agents are designed for use cases such as customer support, employee training, and interactive educational content. Unlike the Chat API, which is designed for developers, the Agents platform provides a no-code interface for business users to create and manage their digital human deployments.
For developers who need to generate talking-head videos programmatically (rather than interactively), D-ID provides a Video API. This API accepts an image, a script or audio file, and configuration parameters, and returns a rendered video. Common use cases include automated video generation pipelines for e-commerce product descriptions, news summaries, and personalized marketing campaigns.
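As a sketch of what such a request looks like, the snippet below builds a payload following the publicly documented shape of D-ID's `/talks` endpoint (a source image URL plus a script object). Treat the endpoint, field names, and voice identifier as assumptions to verify against the current API reference; the code only constructs the request body and does not send it.

```python
import json

API_URL = "https://api.d-id.com/talks"  # assumed endpoint; check the API docs

def build_talk_request(image_url: str, script_text: str, voice_id: str) -> str:
    """Build a JSON body for a talking-head video generation request."""
    payload = {
        "source_url": image_url,      # the still photograph to animate
        "script": {
            "type": "text",           # alternatively "audio" with an audio file
            "input": script_text,
            "provider": {"type": "microsoft", "voice_id": voice_id},
        },
    }
    return json.dumps(payload)

body = build_talk_request(
    "https://example.com/face.jpg",
    "Welcome to our product tour.",
    "en-US-JennyNeural",            # assumed voice identifier
)
print(body)
```

In an automated pipeline, this body would be POSTed with the account's API key, and the response would contain an identifier to poll for the rendered video.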
The most publicly visible application of D-ID's technology is the Deep Nostalgia feature, developed in partnership with genealogy platform MyHeritage. Launched in February 2021, Deep Nostalgia allows users to animate the faces in old family photographs, creating short videos that show how deceased relatives might have moved and looked in life. The feature became a viral sensation, generating hundreds of millions of views on social media platforms including TikTok and Instagram. As of 2025, users have animated over 118 million faces through Deep Nostalgia [5][8].
Following the success of Deep Nostalgia, D-ID and MyHeritage expanded their partnership with the LiveStory feature, which adds narrated voice to the animated photographs, allowing users to hear what appears to be their ancestors speaking. This feature combines D-ID's face animation with text-to-speech technology to produce a fully narrated video from a single photograph and a written script.
D-ID partnered with ElevenLabs to integrate premium AI-generated voices into the Creative Reality Studio. This partnership gives D-ID users access to ElevenLabs' highly realistic text-to-speech voices, which are known for their natural intonation, emotional expressiveness, and support for voice cloning. The integration means users can pair D-ID's visual realism with ElevenLabs' audio realism, producing videos where both the face and voice are AI-generated but appear convincingly human [9].
D-ID serves a range of enterprise customers across industries:
| Client | Industry | Use Case |
|---|---|---|
| Warner Bros. Studios | Entertainment | Film and entertainment production applications |
| Mondelez | Consumer goods | Marketing and brand communication videos |
| Publicis | Advertising | Creative campaigns at scale |
| MyHeritage | Genealogy | Deep Nostalgia and LiveStory features |
| Various educational institutions | Education | Virtual tutors and training content |
| Healthcare providers | Healthcare | Patient education and information delivery |
D-ID operates on a credit-based pricing model with several tiers designed for different user segments. Credits are consumed based on the duration of video generated, with one credit typically corresponding to a set amount of video time.
| Plan | Price | Key Features |
|---|---|---|
| Free Trial | $0 (14 days) | Limited credits, watermarked output, access to core features |
| Lite | Starting at $6/month | Basic video generation, limited credits per month |
| Pro | Mid-tier pricing | Increased credits, no watermark, API access, premium presenters |
| Advanced | Higher tier | More credits, priority rendering, advanced avatar features |
| Enterprise | Custom pricing | Unlimited or custom usage, dedicated support, custom integrations, SLA guarantees |
API pricing operates on a separate structure with per-minute or per-credit billing. Both monthly and annual billing options are available, with annual plans offering discounted rates. Credits are issued monthly on the billing date and do not roll over to subsequent months [10].
D-ID has raised a total of approximately $48 million across multiple funding rounds.
| Round | Amount | Lead Investor(s) | Notable Participants | Date |
|---|---|---|---|---|
| Seed | Undisclosed | Various | Early-stage investors | 2018 |
| Series A | $13.5 million | AXA Venture Partners, Pitango | OurCrowd | 2021 |
| Series B | $25 million | Macquarie Capital | Pitango, AXA Venture Partners, OurCrowd, OIF, Maverick, Marubeni | 2022 |
| Total | ~$48 million | | | |
The Series B round in 2022, led by Macquarie Capital, brought D-ID's total funding to $48 million. The company stated that the funds would be used to expand its Creative Reality platform, invest in research and development, and grow its team. The funding reflected investor confidence in the growing market for AI-generated video content, which was seeing rapid adoption across marketing, education, and entertainment sectors [2].
D-ID operates in the competitive AI video generation and synthetic media market alongside several other platforms.
| Competitor | Headquarters | Focus | Key Differentiator |
|---|---|---|---|
| Synthesia | London, UK | Enterprise video | Highly realistic studio-quality avatars, strong corporate adoption |
| HeyGen | Los Angeles, USA | Script-to-video | 4K export, superior lip-sync, comprehensive template library |
| Colossyan | Budapest, Hungary | Training video | Emphasis on educational and corporate training content |
| Elai.io | Delaware, USA | Automated video | Video generation from text, URLs, and slides |
| DeepBrain AI | Seoul, South Korea | Broadcast video | Broadcast-quality avatars for news and media |
| Hour One | Tel Aviv, Israel | Enterprise | Presentations, training, and compliance content |
| Pictory | Dublin, Ireland | Content repurposing | Converting long-form content into short videos |
D-ID differentiates itself primarily through its ability to animate any photograph into a talking avatar (rather than requiring users to choose from a fixed library of pre-recorded presenters), its extensive 119+ language support, the strength of its API for developers building custom applications, and its proven track record with viral consumer products like Deep Nostalgia. However, competitors like Synthesia and HeyGen are often noted for producing higher visual fidelity in their avatar outputs, particularly for corporate and enterprise use cases. HeyGen, in particular, has gained ground with its 4K output quality and more expressive avatar animations [11].
D-ID's technology raises important questions about synthetic media ethics. The ability to make any photographed person appear to speak arbitrary content has obvious potential for misuse, including the creation of deepfakes and misinformation.
D-ID has adopted measures intended to mitigate these risks.
The broader industry of AI-generated synthetic media continues to face regulatory scrutiny, with proposed legislation in multiple jurisdictions requiring disclosure when AI-generated content depicts real people. The EU AI Act, for instance, includes transparency requirements for AI-generated content, and similar proposals are advancing in the United States and other countries.