Content provenance

Updated Jul 23, 2026

Content provenance is the set of techniques, standards, and policies for recording and disclosing the origin, authorship, and edit history of digital media. The term gained prominence after 2022 as image, audio, video, and text generators based on diffusion and large language models lowered the cost of producing realistic synthetic content, and policymakers, news organizations, and platform operators sought reliable ways to distinguish AI-generated material from camera-captured or human-authored material. Provenance approaches fall into two main families: cryptographically signed metadata bound to a file (as in the Coalition for Content Provenance and Authenticity standard, known as C2PA), and signal-level marks embedded inside the pixels, audio samples, or token distributions of the content itself (often called invisible watermarking, with SynthID from Google DeepMind and Meta's Stable Signature as prominent examples).^[1]^[2] Beyond technical mechanisms, content provenance has become the subject of binding regulation, including the EU AI Act's Article 50 transparency duties, China's 2023 deep synthesis rules, and the now-revoked United States Executive Order 14110.^[3]^[4]^[5] By 2026 both families had reached large scale: DeepMind reported that SynthID had watermarked more than ten billion images and video frames across Google's services, and C2PA had advanced its Content Credentials standard to version 2.4, released in April 2026, with signing built into smartphones, professional cameras, and the largest generative AI platforms.^[40]^[1]

How does provenance differ from authenticity and detection?

Provenance, authenticity, and originality refer to overlapping but distinct properties. Provenance describes the recorded chain of who produced, edited, and distributed a piece of content. Authenticity asks whether the content corresponds to the events it purports to depict. Originality asks whether the work is novel or derivative. A photograph can have strong provenance (a cryptographically signed capture device, a documented edit history) yet be inauthentic (it depicts a staged scene) or unoriginal (it copies an earlier composition). Provenance metadata does not, by itself, certify truth; it only certifies that a specific tool or actor produced a given asset, which an investigator or downstream consumer can then evaluate against context.^[6]

The Coalition for Content Provenance and Authenticity defines provenance as "the basic, trustworthy facts about the origins of a piece of digital content," including who created or edited it, with what tool, and when. Its specification distinguishes provenance assertions, which describe the asset's history, from claims, which bind those assertions to a cryptographic signature.^[1] The United States National Institute of Standards and Technology, in its November 2024 report NIST AI 100-4 Reducing Risks Posed by Synthetic Content, groups techniques into two families: "provenance data tracking" (metadata, signatures, watermarks) and "synthetic content detection" (classifiers that infer AI origin without prior marking).^[4] The report treats detection as a complement to, not a substitute for, marked provenance because detection accuracy degrades against capable adversaries and produces both false positives and false negatives.^[4]

Watermarking, deepfake detection, and content provenance are related but separable. Watermarking is one mechanism for carrying provenance signals inside an asset's bits; it is covered in depth in the AI watermarking article. Deepfake detection refers to forensic methods that infer manipulation without any cooperating signal from the generator, often using neural classifiers; it is covered in the deepfake article. Provenance is the broader social and technical category that includes both signed metadata and embedded marks, plus the standards bodies, vendor commitments, and laws that bind them together.

When did content provenance emerge?

Tracking the origin of an image is older than digital media. Camera makers embedded Exchangeable Image File Format (Exif) tags in JPEG files in the late 1990s to record exposure settings, device model, and (optionally) GPS coordinates. Exif metadata was never cryptographically signed, was easily stripped by social platforms, and could be forged by editing the file header. The widespread practice on platforms such as Twitter, Facebook, and Instagram of removing Exif on upload meant that by the mid-2010s most images circulating on the web carried little or no embedded origin data.^[6]

The contemporary content provenance movement began in 2019. In November of that year, Adobe announced the Content Authenticity Initiative (CAI) with founding partners The New York Times Company and Twitter at the Adobe MAX conference, framing the effort as a response to manipulated political imagery and the impending availability of stronger generative tools.^[7] In parallel, Microsoft Research and the BBC launched Project Origin in 2020 to develop similar provenance mechanisms aimed specifically at news publishers, partnering with CBC/Radio-Canada and The New York Times.^[8]

The two efforts merged at the standards layer on 22 February 2021 with the founding of the Coalition for Content Provenance and Authenticity, a Joint Development Foundation project under the Linux Foundation. Founding members were Adobe, Arm, BBC, Intel, Microsoft, and Truepic.^[9]^[10] C2PA released its first specification, version 1.0, in January 2022, and the steering committee expanded over the next three years to include Amazon, Google, Meta, OpenAI, Sony, and Publicis Groupe.^[1]^[11]

The pace of activity accelerated after the public release of generative image models in 2022 (Stable Diffusion, Midjourney, DALL-E 2) and the subsequent wave of multimodal models including DALL-E 3, Imagen, Veo, and Sora. By 2023, deepfake political imagery had already affected elections in Argentina and Slovakia, and a synthetic image of an explosion near the Pentagon briefly moved United States stock markets in May 2023, sharpening regulatory interest.^[12]

How does cryptographic provenance work?

Cryptographic provenance attaches a structured, signed record to a media file. The signed record typically lists the producing device or application, a timestamp, the actions performed (capture, crop, color correction, AI generation), the identity of the signer through an X.509 certificate, and a hash that binds the record to the binary content. A verifier can then check the signature against a trusted certificate authority and detect tampering with either the asset or the assertions.
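The binding described above can be illustrated with a minimal sketch. It is not the C2PA format: the field names, the `DEMO_KEY` constant, and the use of an HMAC in place of an X.509 certificate signature are all assumptions made so the example runs on the Python standard library alone. What it shows is the two checks a verifier performs: that the record was not edited after signing, and that the asset still matches the hash stored in the record.

```python
import hashlib
import hmac
import json

# Hypothetical demo key. Real systems sign with a private key whose
# X.509 certificate chains to a trusted authority, not a shared secret.
DEMO_KEY = b"demo-signing-key"


def make_record(asset: bytes, tool: str, actions: list[str], timestamp: str) -> dict:
    """Build a record bound to the asset by hash, then sign it."""
    body = {
        "tool": tool,
        "timestamp": timestamp,
        "actions": actions,
        "asset_sha256": hashlib.sha256(asset).hexdigest(),
    }
    payload = json.dumps(body, sort_keys=True).encode()
    signature = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    return {"body": body, "signature": signature}


def verify_record(asset: bytes, record: dict) -> bool:
    """Return True only if neither the record nor the asset was altered."""
    payload = json.dumps(record["body"], sort_keys=True).encode()
    expected = hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, record["signature"]):
        return False  # the assertions were edited after signing
    return record["body"]["asset_sha256"] == hashlib.sha256(asset).hexdigest()
```

Changing a single byte of the asset, or appending an action to the signed body, makes `verify_record` return `False`. A valid result still says nothing about whether the depicted scene is real.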

C2PA Content Credentials

Content Credentials is the consumer-facing brand for the C2PA specification. A Content Credentials manifest is a CBOR-encoded structure stored inside the file (in JUMBF boxes carried in JPEG APP11 segments, in MP4 boxes, or in analogous containers) or referenced externally. Each manifest contains one or more assertions about the content's origin and history, a signed claim that binds those assertions together, and a certificate chain leading back to a hardware root of trust or to a software signing identity whose issuer appears on the C2PA Trust List.^[1] When the asset is re-edited, a new manifest is appended that references the prior manifest by hash, producing a verifiable lineage similar to a blockchain but stored alongside the content itself rather than on a distributed ledger.^[1] The specification has evolved through frequent releases: C2PA published version 1.0 in January 2022, reached version 2.3 on 5 January 2026, and shipped version 2.4 in April 2026; the last of these added a JSON credential format and new assertion types such as repository receipts and an environmental-sustainability assertion.^[1]
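The hash-linked lineage can be sketched in a few lines. The example below is a simplified illustration, not the C2PA data model: the dictionary fields and the function names are hypothetical, JSON stands in for CBOR, and signatures are omitted so that only the chaining logic is visible.

```python
import hashlib
import json


def manifest_hash(manifest: dict) -> str:
    """Hash a manifest in a canonical (sorted-key) serialization."""
    return hashlib.sha256(json.dumps(manifest, sort_keys=True).encode()).hexdigest()


def append_manifest(chain: list[dict], action: str, tool: str) -> list[dict]:
    """Return a new chain whose last manifest references the previous one by hash."""
    parent = manifest_hash(chain[-1]) if chain else None
    return chain + [{"action": action, "tool": tool, "parent": parent}]


def lineage_intact(chain: list[dict]) -> bool:
    """Check that the first manifest has no parent and every later one points at its predecessor."""
    if chain and chain[0]["parent"] is not None:
        return False
    return all(
        later["parent"] == manifest_hash(earlier)
        for earlier, later in zip(chain, chain[1:])
    )
```

Because each manifest stores the hash of the one before it, rewriting an earlier entry (for example, changing the capture tool) breaks the link from every later entry, which is the property the blockchain comparison refers to.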

Adoption grew rapidly between 2023 and 2026. Adobe shipped Content Credentials in Photoshop in 2023 and made the feature default-on in 2024. Adobe Firefly images carry Content Credentials at generation time. Microsoft's Bing Image Creator added them in early 2024. OpenAI began attaching C2PA metadata to images from DALL-E 3 and its successor in February 2024 and extended the practice to video output from Sora.^[13] Google announced C2PA support across Gemini, Google Search, and YouTube during 2024 and 2025.^[14] Camera makers Leica (M11-P, 2023) and Nikon (Z9 firmware, 2024) shipped hardware that signs photographs in-camera using C2PA-compliant manifests, enabling end-to-end provenance from sensor to publication.^[11]^[15]

Hardware adoption widened in 2025. The Google Pixel 10, announced on 10 September 2025, became the first mainstream smartphone to attach C2PA Content Credentials to every photo from its native camera by default, storing signing keys in the phone's Titan M2 security chip and reaching Assurance Level 2 under the C2PA Conformance Program; Samsung's Galaxy S25, released earlier in 2025, had signed only AI-edited images.^[41] Sony released its PXW-Z300 camcorder in 2025 as one of the first professional video cameras to record C2PA Content Credentials at capture.^[42] DPReview described the Pixel 10 as the first time Content Credentials had come to phones.^[41]

The Content Authenticity Initiative, which had a separate origin as Adobe's 2019 effort, continues to operate as an advocacy and developer-tools organization layered above C2PA's open standard. As of 2026, CAI reports more than 6,000 member organizations spanning newsrooms, technology firms, hardware vendors, and civil-society groups.^[42] Andy Parsons, CAI's senior director, wrote in early 2026 that "interoperability is emerging as a practical reality," pointing to consumer devices such as the Pixel 10 and professional cameras that now sign content out of the box.^[42]

Truepic

Truepic, founded in 2015 and one of the six C2PA founding members, focuses on capture-side provenance. Its Truepic Lens software development kit lets mobile applications open a hardened camera session in which captured images are signed at the moment of exposure using a key tied to a device's secure enclave, with C2PA assertions written into the resulting file.^[16] Truepic's authenticating camera SDK was recognized in TIME's 2022 Best Inventions list.^[17] The company has partnered with Microsoft on "Project Providence," a pilot delivering end-to-end provenance from capture to display, and supplies authenticated-capture tools to insurance, supply-chain, and humanitarian-documentation customers.^[18]

JPEG Trust

The Joint Photographic Experts Group, a joint ISO/IEC/ITU committee, published the first part of its JPEG Trust standard (ISO/IEC 21617-1:2025 Core foundation) in January 2025. JPEG Trust defines a framework for trust profiles, trust indicators, and signed provenance annotations that can wrap C2PA manifests or operate independently, and it incorporates earlier JPEG Privacy and Security work (ISO/IEC 19566-4).^[19]^[20] By aligning C2PA-style assertions with a formal ISO standard, JPEG Trust offers a route into procurement frameworks and national regulations that require ISO-recognized standards.

Limits of cryptographic provenance

Cryptographic provenance has well-understood limits. A signature only proves that a specified signing key produced the manifest; it does not prove that the underlying scene was real, that the signer is honest, or that the image has not been re-photographed off a screen (an "analog hole" attack). Provenance metadata is also easily stripped: screenshotting a Content Credentials image, re-encoding the file, or stripping the file's embedded metadata each removes the manifest while preserving the visible pixels. In response, C2PA defines "soft binding" mechanisms (perceptual fingerprints registered with a lookup service) that can re-attach provenance to a stripped asset if a copy is found in the registry, but this requires participating registries and is probabilistic.^[1] Researchers and policy analysts at Brookings have argued that cryptographic provenance is most effective when paired with watermarking and with platform-side enforcement that demands signed credentials for high-trust contexts.^[21]

Watermarking

Invisible watermarking embeds a robust, machine-readable signal directly in the content's perceptual representation: in pixel intensities for images, in spectral coefficients for audio, in token-distribution biases for text. Unlike metadata-based approaches, an invisible watermark survives screenshotting, re-encoding, and metadata stripping, though it can still be removed or weakened by sufficiently aggressive transformations. Watermarking is covered in depth in the AI watermarking article; the summary here focuses on its role within the broader content provenance landscape.
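To make "token-distribution biases" concrete, the following toy sketch uses a generic keyed-bias scheme. It is not the algorithm of any product named in this article. The vocabulary, the `KEY` value, and the uniform sampler standing in for a language model are all assumptions. A secret key splits the vocabulary into "green" and "red" tokens depending on the previous token; the generator prefers green tokens, and a detector holding the same key counts how many green tokens appear.

```python
import hashlib
import math
import random

VOCAB = [f"tok{i}" for i in range(1000)]  # toy vocabulary (assumption)
KEY = "demo-key"  # hypothetical secret shared by generator and detector


def is_green(prev: str, token: str) -> bool:
    """Keyed pseudo-random split: roughly half of all tokens are green after any given prev."""
    digest = hashlib.sha256(f"{KEY}|{prev}|{token}".encode()).digest()
    return digest[0] % 2 == 0


def generate(length: int, bias: float, rng: random.Random) -> list[str]:
    """Stand-in for a language model: uniform sampling, forced onto green tokens with probability bias."""
    tokens = ["<s>"]
    for _ in range(length):
        candidate = rng.choice(VOCAB)
        if rng.random() < bias:
            while not is_green(tokens[-1], candidate):
                candidate = rng.choice(VOCAB)
        tokens.append(candidate)
    return tokens[1:]


def z_score(tokens: list[str]) -> float:
    """Standard deviations by which the green count exceeds the 50 percent expected by chance."""
    if not tokens:
        return 0.0
    prev, green = "<s>", 0
    for token in tokens:
        green += is_green(prev, token)
        prev = token
    n = len(tokens)
    return (green - 0.5 * n) / math.sqrt(0.25 * n)
```

Text produced with a high `bias` yields a large positive z-score, while unmarked text stays near zero. Replacing tokens through paraphrase or translation pushes the green fraction back toward one half, which is why such marks do not survive regeneration.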

SynthID

SynthID is Google DeepMind's family of watermarking tools. It launched first for images on 29 August 2023, applying an imperceptible watermark to outputs of Imagen on Google Cloud's Vertex AI.^[22]^[23] On 14 May 2024, DeepMind extended SynthID to text generated by Gemini (using a probability-modulating sampler) and to video frames generated by Veo.^[24] The text variant was open-sourced through Google's Responsible Generative AI Toolkit on Hugging Face in October 2024.^[24] On 20 May 2025, Google launched the SynthID Detector, a public portal that scans an uploaded image, audio clip, video, or text passage for a SynthID watermark and highlights the segments where it is found, with access opening first to journalists and researchers through a waitlist.^[39] In an October 2025 paper titled "SynthID-Image: Image watermarking at internet scale," DeepMind researchers reported that "SynthID-Image has been used to watermark over ten billion images and video frames across Google's services," and benchmarked its external variant as state of the art in visual quality and in robustness to common image perturbations.^[40] Counting all modalities, Google said SynthID had marked more than ten billion pieces of generated content across Gemini, Imagen, Lyria (audio), and Veo by 2026.^[2]

Meta Stable Signature

In March 2023, researchers at Meta AI (Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, and Teddy Furon) posted "The Stable Signature: Rooting Watermarks in Latent Diffusion Models" on arXiv (paper 2303.15435), later published at ICCV 2023.^[25] The method fine-tunes the latent decoder of a latent diffusion model so that every generated image silently encodes a per-user binary signature. A pre-trained extractor recovers the signature from any subsequent image; the authors report 90 percent or better detection accuracy after cropping to 10 percent of the original area, at a false-positive rate below 10^-6.^[25] Code was released publicly on GitHub. Stable Signature gave model owners a tractable way to attribute generated images back to a specific deployment, and influenced subsequent commercial deployments at Meta, including watermarking commitments announced for Meta AI's image generators in 2024.
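A false-positive rate of this kind follows from a simple counting argument, sketched below under stated assumptions: the bits extracted from an unmarked image are modeled as independent fair coin flips, so the number of bits that match a given signature by chance is binomially distributed. The function names and the 48-bit length used in the usage note are illustrative choices, not details taken from this article.

```python
from math import comb


def false_positive_rate(k: int, tau: int) -> float:
    """P(at least tau of k bits match by chance), with bits modeled as fair coins."""
    return sum(comb(k, m) for m in range(tau, k + 1)) / 2 ** k


def min_threshold(k: int, target_fpr: float) -> int:
    """Smallest tau whose chance-match probability is at most target_fpr."""
    for tau in range(k + 1):
        if false_positive_rate(k, tau) <= target_fpr:
            return tau
    raise ValueError("target not reachable with k bits")
```

Calling `min_threshold(48, 1e-6)` returns the number of matching bits a detector would need to demand from a hypothetical 48-bit signature before declaring a match. Longer signatures allow the same false-positive rate while tolerating more bit errors from cropping or compression.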

OpenAI text watermarking

OpenAI has developed but not deployed a text watermarking system for ChatGPT output. A Wall Street Journal report on 4 August 2024 disclosed that the company had a watermarking tool ready for roughly a year but had delayed release.^[26] OpenAI confirmed in an updated blog post the same day that text watermarking, classifiers, and metadata were under active research, and pointed to two concerns blocking deployment: a user survey indicating nearly 30 percent of ChatGPT users would reduce usage if their text were watermarked, and the risk of disproportionate harm to non-native English writers if classifiers misfired on their work.^[27] A spokesperson also noted that the watermark could be defeated by translation, paraphrase through another model, or other "global tampering" attacks.^[26] As of 2026, OpenAI continues to ship C2PA metadata on DALL-E and Sora outputs but has not enabled text watermarks in ChatGPT.

Other commercial deployments

Meta's image generators on Instagram, Facebook, and WhatsApp apply invisible marks and Content Credentials. Microsoft's Bing Image Creator and Designer attach C2PA manifests. Stability AI shipped watermarking with Stable Diffusion 3 in 2024. Several open-source diffusion model distributions let users disable the watermark through a configuration change, or ship with it disabled by default, illustrating the difficulty of enforcing provenance signals across an open ecosystem.^[4]

Adjacent technologies

Perceptual hashing

Perceptual hashing produces a compact fingerprint that remains stable under small visual transformations, allowing a database to recognize a known image after resizing, cropping, or recompression. Microsoft's PhotoDNA, introduced in 2009 to combat known child sexual abuse material, is the best-known deployment; it converts an image to grayscale, partitions it into a grid, computes localized descriptors, and outputs a hash that can be matched against a reference set.^[28] PhotoDNA is not provenance in the strict sense (it does not record who created an image), but its hash-and-match architecture has been adapted by C2PA's soft binding registry concept and by AI-content detectors that maintain growing databases of known generated images.
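The hash-and-match idea can be shown with an average hash, a much simpler public technique than PhotoDNA, whose exact algorithm is not reproduced here. The sketch assumes the image has already been converted to grayscale and is supplied as rows of 0-255 intensities at least `grid` pixels on each side. The registry contents and the `max_distance` threshold are hypothetical.

```python
def average_hash(gray: list[list[int]], grid: int = 8) -> int:
    """Hash a grayscale image: one bit per grid cell, set when the cell is brighter than average."""
    height, width = len(gray), len(gray[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (value > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_match(query: int, registry: dict[str, int], max_distance: int = 5):
    """Return the closest registered id within max_distance bits, else None."""
    best = min(registry, key=lambda k: hamming(query, registry[k]), default=None)
    if best is not None and hamming(query, registry[best]) <= max_distance:
        return best
    return None
```

A mild brightness change leaves most cells on the same side of the mean, so the hash barely moves and the lookup still succeeds; an unrelated image differs in many bits and returns `None`. Matching is therefore probabilistic, and the distance threshold trades missed matches against false ones.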

AI detection classifiers

Detection-only tools attempt to classify whether a given asset was AI-generated without relying on any cooperating signal from the generator. Commercial vendors include Hive (which serves multiple platforms with image, video, and text detection APIs), Reality Defender (focused on real-time deepfake detection for video conferencing and broadcast), Optic, and GPTZero for text. Detection accuracy varies widely with subject matter and adversarial pressure: classifiers trained on outputs of one model family often perform poorly on newer architectures, and the NIST AI 100-4 report cautions that detection should be treated as a low-confidence input to risk decisions rather than as ground truth.^[4]

How is content provenance regulated?

United States: Executive Order 14110

President Joe Biden signed Executive Order 14110, "Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence," on 30 October 2023.^[5] Among its provisions was a directive to the Department of Commerce, through NIST, to develop standards and guidelines for "authenticating content and tracking its provenance" and for "labeling synthetic content," including watermarking. NIST issued a Request for Information in December 2023, published a draft for public comment in April 2024, and released the final NIST AI 100-4 report Reducing Risks Posed by Synthetic Content on 20 November 2024.^[4]

President Donald Trump revoked Executive Order 14110 on 20 January 2025, the first day of his second term.^[5] The NIST AI 100-4 report itself remained published and continues to be cited as a non-binding technical reference. The revocation did not directly affect state-level rules, federal procurement language already finalized, or industry adoption of C2PA, but it removed the primary federal mandate for watermarking standards work.

European Union: AI Act Article 50

The EU AI Act, adopted in 2024, devotes Article 50 to transparency obligations for AI systems. Two clauses are central to content provenance. First, providers of generative AI systems must ensure that synthetic audio, image, video, or text outputs are "marked in a machine-readable format and detectable as artificially generated or manipulated," and the technical solutions used must be "effective, interoperable, robust and reliable as far as this is technically feasible," taking into account specifics of the content type, cost, and the state of the art.^[3] Second, deployers of systems that generate or manipulate image, audio, or video deepfakes must disclose that the content has been artificially generated or manipulated, with narrow carve-outs for artistic, satirical, or fictional works.^[3] Article 50's disclosure obligations apply from 2 August 2026.^[3] The European Commission published the first draft of a Code of Practice on Transparency of AI-Generated Content on 17 December 2025 and released the final code on 10 June 2026 to guide implementation.^[29] Under the Digital Omnibus package agreed in 2026, the machine-readable marking duty in Article 50(2) is deferred to 2 December 2026 for generative AI systems already placed on the market before 2 August 2026; systems launched on or after 2 August 2026 must mark their output from the start.^[44]

China: deep synthesis provisions

China's Cyberspace Administration, jointly with the Ministry of Industry and Information Technology and the Ministry of Public Security, issued the Provisions on the Administration of Deep Synthesis Internet Information Services, which took effect on 10 January 2023.^[30] The provisions require providers of services that synthesize text, images, audio, video, or virtual scenes to attach prominent labels indicating that content is generated, to obtain biometric consent before editing a person's face or voice, to verify user identities, and to maintain records sufficient to trace generated content back to the user who produced it. China followed up with the Measures for Labeling of AI-Generated Synthetic Content, jointly issued on 14 March 2025 by the Cyberspace Administration, the Ministry of Industry and Information Technology, the Ministry of Public Security, and the National Radio and Television Administration, and effective 1 September 2025.^[43] The measures require both explicit labels (a visible mark or interface notice, such as an "AI" tag) and implicit labels (machine-readable metadata embedded in the file so the mark travels with downloaded content), backed by a mandatory national standard, GB 45438-2025.^[43] The pairing of visible and metadata labels made China one of the first jurisdictions to require both layers in production deployments.^[31]

California state laws

California enacted a cluster of AI content laws in September 2024. On 17 September 2024, Governor Gavin Newsom signed three election-related bills: AB 2655 (the Defending Democracy from Deepfake Deception Act of 2024, requiring large platforms with over one million California users to detect, label, or remove materially deceptive election content), AB 2839 (an urgency measure prohibiting distribution of deceptive AI-generated election advertisements within a window around an election), and AB 2355 (requiring disclosure on electoral advertisements that use AI-generated content).^[32] Newsom signed AB 1836 the same day, expanding California's post-mortem right of publicity to cover unauthorized digital replicas of deceased performers, with damages of at least $10,000 per violation; the act took effect 1 January 2025.^[33]

A more sweeping bill, AB 3211, would have required general-purpose watermarking on AI-generated content sold or distributed in California. OpenAI publicly endorsed the bill in August 2024 through a letter from chief strategy officer Jason Kwon to Assemblymember Buffy Wicks, framing provenance signals as helpful for distinguishing AI from human content; the bill was ultimately not enacted in 2024.^[34]

Other jurisdictions

The United Kingdom has pursued a sectoral approach via Ofcom's online-safety duties rather than dedicated provenance legislation. South Korea's Personal Information Protection Commission issued guidance on deepfake disclosures in 2024. India's Ministry of Electronics and Information Technology issued advisories in 2024 directing platforms to label synthetic content, though without binding force. Many of these national efforts cite or reference the C2PA specification or the EU AI Act as templates.^[29]

Which organizations coordinate content provenance?

Several organizations sit alongside C2PA in the provenance ecosystem. The Content Authenticity Initiative, run from Adobe, focuses on developer tools and advocacy and provides an open-source SDK for creating and verifying Content Credentials. Project Origin, run from Microsoft Research with BBC, CBC/Radio-Canada, and The New York Times as founding news partners, was folded into C2PA at the standards layer in 2021 but continues to coordinate newsroom adoption.^[8]^[9] The Partnership on AI, founded in 2016, hosts working groups on synthetic media disclosure and has published guidance on responsible practices for generative content.^[35] The MPA Trust framework from the Motion Picture Association and DPP Origin from the Digital Production Partnership target film and broadcast production pipelines.

How do cryptographic provenance and watermarking compare?

Property	Cryptographic provenance (C2PA)	Invisible watermarking (SynthID, Stable Signature)
Carrier	Signed metadata in file container or sidecar	Modifications to pixels, audio samples, or token logits
Survives screenshotting	No (unless re-attached via soft binding)	Often yes
Survives metadata stripping	No	Yes
Survives heavy compression	No (re-encoding breaks the hash binding or drops the manifest)	Degrades, varies by scheme
Survives paraphrase or regeneration	Not applicable (asset becomes a new asset)	No
Reveals who edited the asset	Yes, if signers are identified	No (only that the asset originated from a marked generator)
Requires generator cooperation	Yes	Yes
Standardized	Yes (C2PA, JPEG Trust)	Partial (no universal standard for marks themselves)

The two families are complementary. Cryptographic provenance gives strong identity and edit-history guarantees when the manifest is intact; watermarking gives weaker but more robust signals that survive casual stripping. Most active deployments combine both: a generator emits a watermarked file with a C2PA manifest attached, expecting that at least one of the two layers will survive any given downstream handling.^[4]
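A verifier that consults both layers might combine them as in the sketch below. The `Finding` categories and the `assess` function are hypothetical and stand in for whatever manifest validator and watermark detector a deployment actually uses; the point is only the order of the checks.

```python
from enum import Enum
from typing import Optional


class Finding(Enum):
    SIGNED_HISTORY = "manifest verified: signer and edit history available"
    TAMPERED = "manifest present but failed verification"
    MARKED_ONLY = "no manifest, watermark found: marked generator, no edit history"
    UNKNOWN = "no provenance signal: origin undetermined"


def assess(manifest_valid: Optional[bool], watermark_found: bool) -> Finding:
    """Combine both layers; manifest_valid is None when the file carries no manifest."""
    if manifest_valid is True:
        return Finding.SIGNED_HISTORY
    if manifest_valid is False:
        return Finding.TAMPERED
    return Finding.MARKED_ONLY if watermark_found else Finding.UNKNOWN
```

The `UNKNOWN` outcome is deliberately not labeled "human-made": absence of both signals means only that no cooperating tool left a surviving mark.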

What are the limitations of content provenance?

Provenance systems face several recurring criticisms. First, they are opt-in for honest actors and offer no protection against adversaries who use open-weight generators that omit marks, edit the file to remove manifests, or deploy laundering pipelines (re-photograph the screen, run a Vary or img2img pass, paraphrase through a translation model). The Brookings Institution and academic researchers have documented the brittleness of leading watermark schemes against diffusion-purification and paraphrase attacks.^[21]^[36]

Second, detection-only tools carry significant false-positive and false-negative rates. Operators in education and publishing have reported accusations against students and contributors that proved unfounded because the tools had flagged human writing as AI-generated. NIST AI 100-4 warns operators against treating detector scores as ground truth.^[4]
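The reason a detector score makes weak evidence can be worked through with Bayes' rule. The function below is a generic calculation, and every number in the usage note is a hypothetical input, not a measured property of any tool.

```python
def flagged_precision(prevalence: float, tpr: float, fpr: float) -> float:
    """Share of flagged items that really are AI-generated (Bayes' rule).

    prevalence: fraction of all submissions that are AI-generated
    tpr: probability the detector flags an AI-generated item
    fpr: probability the detector flags a human-written item
    """
    flagged_ai = prevalence * tpr
    flagged_human = (1 - prevalence) * fpr
    return flagged_ai / (flagged_ai + flagged_human)
```

With hypothetical inputs of 5 percent prevalence, a 90 percent true-positive rate, and a 5 percent false-positive rate, `flagged_precision(0.05, 0.9, 0.05)` comes out just under one half: more than half of the flagged items would be human-written, even though the detector looks accurate on paper.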

Third, mandatory disclosure may not change the behavior of the audiences most susceptible to deceptive content. A 2025 experimental study found that AI labels lowered readers' ratings of the accuracy of labeled content, but that these effects were "limited in scope" and did not significantly shift policy support or broader concern about misinformation.^[45] The reach of disclosure is also uneven: the Reuters Institute's Generative AI and News Report 2025 found that only 19 percent of respondents saw AI labels on a daily basis (28 percent weekly), and just 12 percent said they were comfortable with news made entirely by AI, versus 62 percent for content made entirely by humans; on balance, audiences expected AI to make the news less trustworthy.^[37]

Fourth, provenance schemes create privacy and surveillance concerns. A capture-side signed photograph leaks device identity and timestamps that could deanonymize sources, and watermarking schemes that encode per-user signatures (such as Stable Signature) create a forensic trail that could be subpoenaed or sold. Civil-society groups including the Electronic Frontier Foundation have urged that provenance standards be paired with privacy-preserving designs and limits on retention.^[38]

Finally, the international fragmentation of rules creates compliance complexity. A model provider distributing in the European Union must comply with Article 50, in China with the deep synthesis provisions and the 2025 labeling measures, and in California with election-content rules, each with different definitions, timing, and label-format requirements. Industry submissions to the European Commission's December 2025 Code of Practice consultation argued for harmonization around C2PA as a baseline.^[29]

References

^Coalition for Content Provenance and Authenticity, "C2PA Technical Specification 2.4", C2PA, 2026-04. spec.c2pa.org/...C2PA_Specification.html. Accessed 2026-07-12.
^Google DeepMind, "SynthID", DeepMind, 2026. deepmind.google/...synthid Accessed 2026-05-25.
^European Union, "Article 50: Transparency Obligations for Providers and Deployers of Certain AI Systems", EU Artificial Intelligence Act, 2024-07-12. artificialintelligenceact.eu/...50 Accessed 2026-05-25.
^National Institute of Standards and Technology, "NIST AI 100-4: Reducing Risks Posed by Synthetic Content", NIST, 2024-11-20. nvlpubs.nist.gov/...NIST.AI.100-4.pdf. Accessed 2026-05-25.
^The White House, "Executive Order 14110 on Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence", Federal Register, 2023-10-30. presidency.ucsb.edu/...lopment-and-use-artificial. Accessed 2026-05-25.
^Content Authenticity Initiative, "How it works", CAI, 2024. contentauthenticity.org/how-it-works. Accessed 2026-05-25.
^Adobe, "Introducing the Content Authenticity Initiative", Adobe Blog, 2019-11-04. blog.adobe.com/...content-authenticity-initiative. Accessed 2026-05-25.
^Microsoft Research, "Project Origin", Microsoft, 2021. microsoft.com/...project-origin Accessed 2026-05-25.
^Coalition for Content Provenance and Authenticity, "C2PA Founding Press Release", C2PA, 2021-02-22. c2pa.org/c2pa-founding-press-release Accessed 2026-05-25.
^Microsoft, "Technology and media entities join forces to create standards group aimed at building trust in online content", Microsoft Source, 2021-02-22. news.microsoft.com/...ding-trust-in-online-content Accessed 2026-05-25.
^Coalition for Content Provenance and Authenticity, "C2PA Releases Specification of World's First Industry Standard for Content Provenance", C2PA, 2022-01-27. c2pa.org/...dustry-standard-for-content-provenance Accessed 2026-05-25.
^Sam Schechner, "Fake AI Image of Pentagon Explosion Briefly Shakes Stocks", Reuters, 2023-05-23. reuters.com/...iral-trips-up-ai-chatbot-2023-05-23 Accessed 2026-05-25.
^OpenAI, "Understanding the source of what we see and hear online", OpenAI, 2024-05-07. openai.com/...ource-of-what-we-see-and-hear-online Accessed 2026-05-25.
^Google, "Building trust in AI-generated content with SynthID and C2PA", Google Blog, 2024-09-17. blog.google/...t-ai-generated-content-c2pa-synthid Accessed 2026-05-25.
^Leica Camera AG, "Leica M11-P with Content Credentials", Leica, 2023-10-26. leica-camera.com/...m11-p. Accessed 2026-05-25.
^Truepic, "Truepic's Technology Provides Authenticity and Content Verification via Tamper-Evident Imagery", Truepic Blog, 2023. truepic.com/...ication-via-tamper-evident-imagery. Accessed 2026-05-25.
^Truepic, "Truepic's Authenticating Camera SDK Recognized by TIME's Best Inventions 2022", Truepic Blog, 2022-11-10. truepic.com/...ized-by-times-best-inventions-2022. Accessed 2026-05-25.
^Truepic, "Truepic and Microsoft Pilot New Provenance Platform to Authenticate Images", Truepic Blog, 2023-03-27. truepic.com/...ate-images-from-capture-to-display. Accessed 2026-05-25.
^Joint Photographic Experts Group, "JPEG Trust becomes an International Standard", JPEG, 2024-12-02. jpeg.org/...20241202_press.html. Accessed 2026-05-25.
^International Organization for Standardization, "ISO/IEC 21617-1:2025 Information technology, JPEG Trust, Part 1: Core foundation", ISO, 2025-01. iso.org/...86831.html. Accessed 2026-05-25.
^Daniel Kim and Niam Yaraghi, "Provenance and watermarking for generative AI", Brookings Institution, 2024-04. brookings.edu/...-limits-of-provenance-information. Accessed 2026-05-25.
^Google DeepMind, "Identifying AI-generated images with SynthID", DeepMind Blog, 2023-08-29. deepmind.google/...i-generated-images-with-synthid. Accessed 2026-05-25.
^Will Douglas Heaven, "Google DeepMind has launched a watermarking tool for AI-generated images", MIT Technology Review, 2023-08-29. technologyreview.com/...ol-for-ai-generated-images. Accessed 2026-05-25.
^Google DeepMind, "Watermarking AI-generated text and video with SynthID", DeepMind Blog, 2024-05-14. deepmind.google/...ted-text-and-video-with-synthid. Accessed 2026-05-25.
^Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, and Teddy Furon, "The Stable Signature: Rooting Watermarks in Latent Diffusion Models", arXiv 2303.15435 (ICCV 2023), 2023-03-27. arxiv.org/...2303.15435. Accessed 2026-05-25.
^Deepa Seetharaman, "There's a Tool to Catch Students Cheating With ChatGPT. OpenAI Hasn't Released It", The Wall Street Journal, 2024-08-04. wsj.com/...tool-chatgpt-cheating-writing-135b755a. Accessed 2026-05-25.
^Kyle Wiggers, "OpenAI says it's taking a 'deliberate approach' to releasing tools that can detect writing from ChatGPT", TechCrunch, 2024-08-04. techcrunch.com/...-can-detect-writing-from-chatgpt. Accessed 2026-05-25.
^Microsoft, "PhotoDNA Cloud Service", Microsoft Digital Safety, 2024. microsoft.com/...photodna. Accessed 2026-05-25.
^European Commission, "Code of Practice on Transparency of AI-Generated Content (first draft 2025-12-17, final 2026-06-10)", Shaping Europe's digital future, 2026-06-10. digital-strategy.ec.europa.eu/...enerated-content. Accessed 2026-07-12.
^Cyberspace Administration of China, "Provisions on the Administration of Deep Synthesis of Internet Information Services", CAC (translated by Library of Congress), 2022-11-25, effective 2023-01-10. loc.gov/...-synthesis-technology-enter-into-effect. Accessed 2026-05-25.
^Bird and Bird, "New AI Content Labelling Rules in China: What are they and how do they compare to the EU AI Act", Bird and Bird, 2025. twobirds.com/...-do-they-compare-to-the-eu-ai-act. Accessed 2026-05-25.
^Office of Governor Gavin Newsom, "Governor Newsom signs bills to combat deepfake election content", California Governor, 2024-09-17. gov.ca.gov/...-to-combat-deepfake-election-content. Accessed 2026-05-25.
^California State Legislature, "AB-1836 Use of likeness: digital replica", California Legislative Information, 2024-09-17. leginfo.legislature.ca.gov/...billNavClient.xhtml. Accessed 2026-05-25.
^Pure AI, "OpenAI Backs California AI Watermarking Bill", Pure AI, 2024-08-26. pureai.com/...alifornia-ai-watermarking-bill.aspx. Accessed 2026-05-25.
^Partnership on AI, "Responsible Practices for Synthetic Media: A Framework for Collective Action", Partnership on AI, 2023-02. syntheticmedia.partnershiponai.org. Accessed 2026-05-25.
^Hanlin Zhang and Niloofar Mireshghallah, "The Brittleness of AI-Generated Image Watermarking Techniques", arXiv 2408.10446, 2024-08-19. arxiv.org/...2408.10446. Accessed 2026-05-25.
^Reuters Institute for the Study of Journalism, "Generative AI and News Report 2025: How people think about AI's role in journalism and society", University of Oxford, 2025-10. reutersinstitute.politics.ox.ac.uk/...and-society. Accessed 2026-07-12.
^Electronic Frontier Foundation, "AI Watermarking Won't Save Us From Disinformation", EFF, 2023-12-13. eff.org/...termarking-wont-save-us-disinformation. Accessed 2026-05-25.
^Google DeepMind, "SynthID Detector, a new portal to help identify AI-generated content", Google Blog, 2025-05-20. blog.google/...google-synthid-ai-content-detector. Accessed 2026-07-12.
^Sven Gowal et al., "SynthID-Image: Image watermarking at internet scale", arXiv 2510.09263, 2025-10-10. arxiv.org/...2510.09263. Accessed 2026-07-12.
^DPReview, "Google brings Content Credentials to phones for the first time", DPReview, 2025. dpreview.com/...ials-to-phones-for-the-first-time. Accessed 2026-07-12.
^Content Authenticity Initiative, "The State of Content Authenticity in 2026", CAI, 2026. contentauthenticity.org/...t-authenticity-in-2026. Accessed 2026-07-12.
^Loeb and Loeb LLP, "China's AI-Labeling Measures and Mandatory National Standards Take Effect September 1", Loeb and Loeb, 2025-03. loeb.com/...nal-standards-take-effect-september-1. Accessed 2026-07-12.
^Gibson Dunn, "EU AI Act Omnibus Agreement: Postponed High-Risk Deadlines and Other Key Changes", Gibson Dunn, 2026. gibsondunn.com/...-deadlines-and-other-key-changes. Accessed 2026-07-12.
^Chuyao Wang, Patrick Sturgis, and Daniel de Kadt, "AI labeling reduces the perceived accuracy of online content but has limited broader effects", arXiv 2506.16202, 2025-06-19. arxiv.org/...2506.16202. Accessed 2026-07-12.
