4.4

Reading image metadata

EXIF, XMP, IPTC, ICC profiles, embedded thumbnails, and maker notes carry more about an image's history than the photographer usually intends to share. Reading them is a verification step that takes minutes and routinely changes the case.

Digital images are not just pixels. They carry, by convention and standard, a substantial body of structured metadata describing capture parameters, editing history, ownership claims, color characteristics, and sometimes much more. This metadata is largely invisible to ordinary viewers but is trivially readable with the right tools, and it routinely contains information that contradicts or contextualizes the image's apparent claims.

This page is a guide to what metadata exists, where it lives, what it tends to reveal, and where it can be trusted versus where it is too easily forged to be evidentiary. The intended user is anyone with reason to examine an image's metadata: a verifier, a forensic examiner, a privacy-conscious user wondering what their images leak. The metadata layer is older than C2PA and complementary to it: C2PA signs the metadata cryptographically, but the underlying structures are the same.

The metadata families

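Four major metadata standards dominate digital imaging: EXIF, XMP, IPTC, and ICC profiles. Maker notes, a manufacturer-specific extension carried inside EXIF, are treated separately below because they behave differently in forensic use.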

EXIF

Exchangeable Image File Format, originally specified by the Japan Electronic Industries Development Association (JEIDA) in 1995. EXIF is the metadata most associated with cameras: shutter speed, aperture, ISO, focal length, exposure compensation, white balance, GPS coordinates, capture timestamp, and device make / model. EXIF lives inside the image container (in JPEG as an APP1 segment; in HEIF and other formats as a structured box) and is preserved by most camera workflows but is often stripped or partially stripped by social platforms.
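
To make the "APP1" placement concrete, the following is a minimal sketch of walking a JPEG's marker segments until the Exif payload is found. It assumes a well-formed baseline file with no fill bytes between segments; the function name `find_exif_segment` is made up for this illustration, and production work should rely on exiftool rather than hand-rolled parsing.

```python
import struct

def find_exif_segment(data: bytes):
    """Return (offset, length) of the Exif APP1 payload in a JPEG, or None.

    Sketch only: assumes every segment before the image data carries a
    two-byte big-endian length, which holds for typical camera files.
    """
    if data[:2] != b"\xff\xd8":                 # SOI marker must open the file
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("lost marker sync")
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):              # SOS or EOI: metadata region is over
            return None
        # The length field counts itself (2 bytes) plus the payload.
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload_start = pos + 4
        if marker == 0xE1 and data[payload_start:payload_start + 6] == b"Exif\x00\x00":
            return payload_start, seg_len - 2
        pos += 2 + seg_len                      # skip marker + whole segment
    return None
```

The returned payload begins with the `Exif\0\0` identifier, followed by a TIFF header and the tag directories. An APP1 segment that does not start with that identifier is usually XMP, which is why the sketch checks the identifier rather than the marker alone.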

XMP

Extensible Metadata Platform, developed by Adobe and standardized as ISO 16684-1 in 2012. XMP is RDF/XML-based and used heavily by editorial workflows. It can carry IPTC fields, Dublin Core elements, custom application data (Photoshop edit history, Lightroom develop settings), rights metadata, and arbitrary extensions. XMP is the most expressive of the metadata standards and the one most often used by editing applications to record their own state.
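
Because XMP is RDF/XML, a standard XML parser is enough to pull fields out of a packet once it has been extracted from the file. The sketch below reads the Dublin Core creator list and the `xmp:CreatorTool` property. The sample packet and the "Example Editor" value are invented for illustration, and real packets are normally wrapped in `<?xpacket ...?>` processing instructions that are omitted here.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "xmp": "http://ns.adobe.com/xap/1.0/",
}

# Invented sample packet; the field values are placeholders.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:xmp="http://ns.adobe.com/xap/1.0/"
      xmp:CreatorTool="Example Editor 1.0">
   <dc:creator><rdf:Seq><rdf:li>J. Photographer</rdf:li></rdf:Seq></dc:creator>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

def read_xmp(packet: str) -> dict:
    """Extract the creator tool and creator names from an XMP packet."""
    root = ET.fromstring(packet)
    out = {"creator_tool": None, "creators": []}
    for desc in root.iter(f"{{{NS['rdf']}}}Description"):
        # Simple properties may be serialized as attributes or as child elements.
        tool = desc.get(f"{{{NS['xmp']}}}CreatorTool")
        if tool is None:
            el = desc.find("xmp:CreatorTool", NS)
            tool = el.text if el is not None else None
        if tool:
            out["creator_tool"] = tool
        # dc:creator is an ordered array (rdf:Seq) of names.
        out["creators"] += [li.text for li in desc.findall("dc:creator/rdf:Seq/rdf:li", NS)]
    return out

print(read_xmp(SAMPLE_XMP))
```

The attribute-or-element check matters in practice: RDF allows both serializations, and different writers choose differently.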

IPTC

The metadata standards published by the International Press Telecommunications Council. The IPTC Photo Metadata Standard defines fields for editorial use: caption, photographer name, copyright, location, keywords, and various editorial-workflow fields. IPTC metadata is the lingua franca of newsroom image handling and is preserved through wire-service distribution networks. The current version is the Photo Metadata Standard 2024.1.

ICC profiles

International Color Consortium profiles describe an image's color space. ICC profiles are not provenance metadata in the editorial sense — they describe color rather than history — but they sometimes carry information about the device or software that produced the image, and they can be a useful triangulation when other metadata is missing or suspect.

Maker notes

Within EXIF, the MakerNote field is a manufacturer-specific extension. Each camera maker stores its own proprietary data there: shutter actuations, lens serial numbers, focus-point information, in-camera processing settings, and various proprietary fields documented to different degrees. Maker notes have been used in forensic identification of specific cameras and in detecting re-saved images, because tools that re-save an image often do not know how to preserve maker notes properly.

A typical EXIF dump

The exiftool command-line utility is the standard tool for reading and writing metadata. Representative output for a JPEG from a recent Sony camera looks like this:

$ exiftool DSC_0421.jpg
File Name                       : DSC_0421.jpg
File Size                       : 8.3 MB
File Modification Date/Time     : 2026:04:12 16:21:09-04:00
File Type                       : JPEG
Image Width                     : 6000
Image Height                    : 4000
Make                            : SONY
Camera Model Name               : ILCE-7RM5
Software                        : ILCE-7RM5 v2.10
Date/Time Original              : 2026:04:12 16:21:08
Create Date                     : 2026:04:12 16:21:08
Modify Date                     : 2026:04:12 16:21:09
ISO                             : 400
Exposure Time                   : 1/250
F Number                        : 5.6
Focal Length                    : 35.0 mm
Lens Model                      : FE 35mm F1.4 GM
GPS Latitude                    : 40 deg 42' 51.36" N
GPS Longitude                   : 74 deg  0' 21.50" W
GPS Date/Time                   : 2026:04:12 20:21:08Z
Color Space                     : sRGB
ICC Profile Name                : sRGB IEC61966-2.1
Artist                          : J. Photographer
Copyright Notice                : © 2026 J. Photographer
IPTC City                       : New York
IPTC Country/Primary Location   : United States

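The output gives a reasonably complete picture: the camera, the lens, the exposure parameters, the GPS coordinates, the photographer's name, and the editorial location. Each of these can be checked: the camera model and lens combination should be consistent with the photographer's claimed equipment, the GPS coordinates should match the claimed location, the timestamps should be plausible and agree with one another, and the IPTC city and country should match the GPS coordinates. Here, for example, the GPS timestamp (20:21:08Z) is exactly four hours ahead of the capture time (16:21:08), which is consistent with the -04:00 offset on the file modification time and with a New York location.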
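
Two of these checks are simple arithmetic and can be sketched directly from the values in the dump. The code below converts exiftool's degree/minute/second strings to decimal degrees and compares the local capture time against the GPS (UTC) timestamp. The UTC offset is taken from the file modification time, which is an assumption, since the EXIF capture time in the dump carries no zone. The bounding box for New York is a rough illustrative figure, not an authoritative boundary.

```python
import re
from datetime import datetime, timedelta, timezone

def dms_to_decimal(text: str) -> float:
    """Convert a string like 40 deg 42' 51.36" N to signed decimal degrees."""
    m = re.fullmatch(r"\s*(\d+) deg\s+(\d+)'\s+([\d.]+)\"\s+([NSEW])\s*", text)
    if m is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    deg, minutes, seconds, ref = m.groups()
    value = int(deg) + int(minutes) / 60 + float(seconds) / 3600
    return -value if ref in "SW" else value      # south and west are negative

def gps_time_matches(local: str, utc_offset_hours: float, gps_utc: str,
                     tolerance_s: int = 5) -> bool:
    """True if the local capture time, shifted to UTC, agrees with the GPS time."""
    fmt = "%Y:%m:%d %H:%M:%S"
    zone = timezone(timedelta(hours=utc_offset_hours))
    local_dt = datetime.strptime(local, fmt).replace(tzinfo=zone)
    gps_dt = datetime.strptime(gps_utc.rstrip("Z"), fmt).replace(tzinfo=timezone.utc)
    return abs((local_dt - gps_dt).total_seconds()) <= tolerance_s

def in_box(lat: float, lon: float, box: tuple) -> bool:
    """box is (south, west, north, east) in decimal degrees."""
    south, west, north, east = box
    return south <= lat <= north and west <= lon <= east

# Rough, illustrative bounding box for New York City (an assumption).
NYC_BOX = (40.4, -74.3, 41.0, -73.6)

lat = dms_to_decimal("40 deg 42' 51.36\" N")
lon = dms_to_decimal("74 deg  0' 21.50\" W")
print(round(lat, 6), round(lon, 6), in_box(lat, lon, NYC_BOX))
print(gps_time_matches("2026:04:12 16:21:08", -4, "2026:04:12 20:21:08Z"))
```

Agreement between these fields shows only that the metadata is internally consistent. It does not show that the metadata is authentic, since all of these values can be edited together.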

What metadata is diagnostic of

Several patterns in metadata are diagnostic of common deceptions:

| Signal | Implies | Reliability |
| --- | --- | --- |
| Software field naming an editor | Image was edited | High when present; absence is not informative |
| Empty EXIF | Image passed through a stripping pipeline | High; commonly social platforms |
| Camera + GPS + time consistent | Image is plausibly from claimed source | Moderate; all can be forged |
| Thumbnail differs from main image | Post-capture editing | Very high when present |
| Maker notes present and consistent | Image likely unmodified by re-encoder | High; tools rarely preserve maker notes correctly |

Caveat: EXIF is trivially editable. Any forensic argument built on EXIF alone is contestable. EXIF claims should be treated as one input (useful when consistent with other signals, suspicious when not) but never decisive on their own. C2PA-signed metadata is the version that resists casual editing; unsigned EXIF cannot bear evidentiary weight against a knowledgeable adversary.
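
Several of these signals can be screened mechanically before a human looks at the file. The following is a rough triage sketch over a tag dictionary. It assumes "Group:Tag" keys of the kind produced by `exiftool -json -G`, and the list of editor names is illustrative rather than exhaustive. The thumbnail comparison is left out because it requires decoding image data.

```python
# Illustrative substrings only; a real list would be longer and maintained.
EDITOR_HINTS = ("photoshop", "lightroom", "gimp", "affinity")

def triage(tags: dict) -> list:
    """Return coarse findings for one file's metadata (assumed 'Group:Tag' keys)."""
    groups = {key.split(":", 1)[0] for key in tags if ":" in key}
    names = {key.split(":", 1)[-1]: value for key, value in tags.items()}
    findings = []

    software = str(names.get("Software", "")).lower()
    if any(hint in software for hint in EDITOR_HINTS):
        findings.append("editor-in-software-field")

    if "EXIF" not in groups:
        findings.append("exif-missing")          # consistent with a stripping pipeline

    if all(name in names for name in ("Model", "GPSLatitude", "DateTimeOriginal")):
        findings.append("camera-gps-time-present")   # still needs a consistency check

    if "MakerNotes" in groups:
        findings.append("maker-notes-present")

    return findings

print(triage({"EXIF:Software": "Adobe Photoshop 25.0 (Windows)", "EXIF:Model": "ILCE-7RM5"}))
```

The findings are prompts for further checking, in keeping with the caveat above: none of them is decisive on its own, and the absence of a finding says little.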

What metadata leaks

The verification use of metadata has a privacy counterpart. Photographs uploaded with intact EXIF reveal location (GPS), time (timestamps), and often photographer identity (Artist, Author). Several historical incidents have demonstrated the consequences, most famously in 2012, when Vice published an iPhone photo of John McAfee with its EXIF GPS coordinates intact and thereby revealed his location. Modern phones default to stripping GPS on upload to some apps and preserving it for others; the inconsistency means users routinely leak more than they intend.

For verification practice, the privacy implications work in both directions. A subject's metadata leakage may aid the verifier. A source's metadata leakage may compromise the source. Newsroom workflows that handle sensitive imagery typically strip metadata before re-publication, to protect sources by removing information that could identify or locate them. This is a deliberate trade-off against the verifiability that the metadata would otherwise provide.
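
At the container level, stripping amounts to dropping the segments that hold the metadata while copying everything else through. The sketch below does this for a JPEG by removing APP1 (Exif and XMP) and APP13 (Photoshop resources, where IPTC usually lives) and keeping APP2 so that an embedded ICC profile survives. It assumes a well-formed file and is an illustration of the idea, not a vetted sanitizer; in practice `exiftool -all=` is the usual route.

```python
import struct

STRIP_MARKERS = {0xE1, 0xED}   # APP1 (Exif, XMP) and APP13 (Photoshop IRB / IPTC)

def strip_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG with APP1 and APP13 segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    out = bytearray(data[:2])                    # keep the SOI marker
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:                       # SOS: compressed image data follows
            break
        # The length field counts itself (2 bytes) plus the payload.
        (seg_len,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + seg_len
        if marker not in STRIP_MARKERS:
            out += data[pos:end]                 # copy non-metadata segments unchanged
        pos = end
    out += data[pos:]                            # copy the scan data and EOI verbatim
    return bytes(out)
```

The pixel data is untouched, so the output decodes to the same image. Note that this sketch leaves JPEG comment segments and any metadata stored in other APPn segments in place, which a real sanitizing workflow would also need to consider.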

How AI generators handle metadata

The major commercial AI image generators emit their own metadata along with C2PA manifests. DALL·E 3 outputs include a Software field identifying OpenAI; Adobe Firefly outputs include Adobe identification and edit-history XMP fields. The metadata gives an honest signal of AI origin when the producer cooperates. Open-weights generators typically produce minimal metadata, often only the generation tool's identification if anything.

An image generated by Stable Diffusion through a typical web UI and saved as PNG usually carries the generation settings in PNG text chunks (AUTOMATIC1111 writes a "parameters" entry; ComfyUI embeds its prompt and workflow graph), which identifies the tool but is not a standardized AI-generation flag. A re-save through a generic image editor may strip even this. The result is that metadata is a strong positive signal for AI generation when present but unreliable as a negative: an image without AI-generation metadata may be AI-generated or may be a photograph.
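
One way to look for generator traces in a PNG is to list its text chunks. The sketch below reads uncompressed `tEXt` chunks only. It assumes the generating UI wrote its data there; keys such as `parameters`, `prompt`, or `workflow` are commonly reported but are treated as assumptions here. Compressed (`zTXt`) and international (`iTXt`) chunks are ignored, and CRCs are not verified.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def read_png_text(data: bytes) -> dict:
    """Return {keyword: text} for every tEXt chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    texts = {}
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            # Keyword and text are Latin-1, separated by a single NUL byte.
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length                       # skip length, type, body, and CRC
    return texts
```

An empty result is not evidence of a photograph. It means only that no `tEXt` chunk survived, which is also what a re-saved or screenshotted generated image looks like.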

Reading metadata in practice

The standard tools are exiftool (Phil Harvey's command-line utility: comprehensive and open-source), web-based viewers such as exif.tools, and the various platform-specific viewers (macOS Preview's Inspector, Windows Properties dialog). For batch work, exiftool scripted against a directory is the production answer. For one-off inspection in a browser, FotoForensics displays EXIF alongside its forensic visualizations. The InVID/WeVerify plug-in includes EXIF inspection for journalist users who do not want to install command-line tools.
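
For the batch case, a thin wrapper around exiftool's JSON output is usually enough. The sketch below assumes exiftool is installed and on the PATH; the helper names are made up for this example. It uses the `-json`, `-G`, and `-r` options to get machine-readable, group-prefixed tags for every file under a directory.

```python
import json
import subprocess

def exiftool_command(directory: str) -> list:
    """Build the argument list: JSON output, group-prefixed tags, recursive scan."""
    return ["exiftool", "-json", "-G", "-r", directory]

def parse_exiftool_json(text: str) -> dict:
    """Index exiftool's JSON array of per-file records by source file path."""
    return {record["SourceFile"]: record for record in json.loads(text)}

def scan(directory: str) -> dict:
    """Run exiftool over a directory and return {path: tags}."""
    # check=False: exiftool can exit non-zero when only some files fail to parse.
    result = subprocess.run(exiftool_command(directory),
                            capture_output=True, text=True, check=False)
    return parse_exiftool_json(result.stdout) if result.stdout.strip() else {}

# Parsing can be exercised without exiftool installed:
sample = '[{"SourceFile": "a.jpg", "EXIF:Make": "SONY"}]'
print(parse_exiftool_json(sample))
```

Keeping the command construction and the parsing separate from the subprocess call makes the wrapper easy to test on machines where exiftool is not available.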

c2patool, from the Content Authenticity Initiative (CAI), reads C2PA manifests, which include metadata as cryptographically signed assertions. For C2PA-credentialed images, c2patool's output is more authoritative than raw EXIF because it can verify what the signer attested to. For uncredentialed images, exiftool remains the right tool.

Where the field is moving

The metadata layer is being absorbed into the C2PA framework rather than replaced by it. C2PA manifests reference EXIF, XMP, and IPTC as standards-passthrough assertions; the underlying field definitions and tooling continue to be useful. What changes is that signed metadata becomes available alongside unsigned metadata, and verifiers can prefer the signed version when it exists. The metadata-reading skill, developed over the past two decades, transfers directly.

The other shift is privacy-driven. Several platforms have moved toward stripping metadata by default and offering signed C2PA-compatible re-attachment as an opt-in feature. This separates the privacy and provenance questions cleanly: users who want anonymity get it through stripping; users who want verifiability get it through signed re-attachment. Whether this pattern becomes universal depends on platform incentives and on regulatory developments around the EU AI Act's marking obligations, which create asymmetric pressure for synthetic-content metadata to be preserved.