Metadata Q&A
Why store metadata in image files?
Information stored in an image file is always with the image, no matter where it travels. In this sense, the information is the image. Think of today’s digital image files as packaged bundles of information, written (for the most part) in standard formats.
What types of metadata can we include in image files?
Digital image files can include descriptive, technical and administrative information about the image.
What metadata standards can we include in image files?
JPEG, TIFF, PSD, Raw and several other file formats can contain IPTC-IIM, IPTC Core, IPTC Extension, PLUS, Exif and Dublin Core metadata.
What are the standards?
The formats and fields for storing metadata have evolved over the past couple of decades, beginning with a standard - or “schema” - based on the multimedia Information Interchange Model (IIM) created by the International Press Telecommunications Council (IPTC) and adopted by Adobe in 1995 for its Photoshop products.
The original ("legacy") IPTC-IIM schema includes widely compatible fields identifying an image’s creator or rights holder, capture time, capture location, caption, headline, title, copyright notices and other basic information. IPTC Core and IPTC Extension build on the legacy of IPTC-IIM by adding more types of descriptive and administrative information, along with a more robust data format, XMP, and fields to accommodate the needs of the stock photography and cultural heritage communities.
Dublin Core is a schema for libraries in a wide variety of industries. It includes 15 basic components, five of which map to IPTC fields.
The PLUS system is a metadata standard that identifies and defines image-use licenses, along with a format and tools for generating a string of characters that can identify a copyright holder, user, scope and terms of a licensed image use.
Exif metadata include technical information about an image and its capture method, such as exposure settings, capture time, GPS location information and camera models.
How do we store metadata?
Image files include metadata, packaged separately from the pixel data that make up the visual image. Our bento box illustration might help you visualize this.
The initial method for storing metadata in image files originated with Adobe's TIFF format and was adopted by others. Since it stores the metadata – IPTC-IIM and/or Exif – as "blocks" of data, it's referred to as Image Resource Block (IRB) format data. Sets of IRBs can be "nested," allowing multiple schemas in the same file. But this method of storing numbered "tags" faces tight size limits within the file header.
XMP is a newer, more flexible storage method – introduced by Adobe in 2001 and partly based on the XML language – for storing and accessing image metadata. It can store metadata within an image file or in an accompanying sidecar file, and it permits creation of custom metadata fields. In addition, XMP supports Unicode, allowing metadata to include language-specific characters (such as umlauts and accent marks) and even non-Latin scripts such as Japanese, Chinese and Cyrillic. Unlike the IPTC-IIM fields stored in IRB format, XMP fields have no fixed character limits.
XMP can store IPTC Core, along with IPTC Extension, Dublin Core and PLUS metadata.
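To make the XMP description above more concrete, here is a minimal Python sketch that builds a small XMP packet holding two Dublin Core properties with Unicode values. It uses only the standard library. The function name build_xmp and the sample values are illustrative assumptions, not part of any standard or tool, and a real workflow would normally rely on dedicated metadata software rather than hand-built XML.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by XMP, RDF and Dublin Core.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, name):
    """Return a namespace-qualified tag name in ElementTree notation."""
    return f"{{{NS[prefix]}}}{name}"


def build_xmp(title, creator):
    """Build a small XMP packet (hypothetical helper, for illustration only)."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    desc = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})

    # dc:title is a language alternative: rdf:Alt holding rdf:li with xml:lang.
    alt = ET.SubElement(ET.SubElement(desc, q("dc", "title")), q("rdf", "Alt"))
    ET.SubElement(alt, q("rdf", "li"), {XML_LANG: "x-default"}).text = title

    # dc:creator is an ordered list of names: rdf:Seq holding rdf:li.
    seq = ET.SubElement(ET.SubElement(desc, q("dc", "creator")), q("rdf", "Seq"))
    ET.SubElement(seq, q("rdf", "li")).text = creator

    body = ET.tostring(xmpmeta, encoding="unicode")
    # The xpacket wrapper marks where the packet begins and ends.
    header = '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>'
    return header + "\n" + body + "\n" + '<?xpacket end="w"?>'


packet = build_xmp("Café am Rhein 東京", "Jörg Müller")
```

Saved as UTF-8 text with an .xmp extension, a packet like this could serve as a sidecar file. The same structure is what tools embed inside JPEG, TIFF and PSD files.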
Exif, generated by capture devices, is both a storage format and a schema.
Do we need to worry about older storage methods?
Although the newer XMP format is replacing IRB for metadata storage, your metadata tools should support both, because:
• Older tools that don't support XMP (few did before 2005) will likely only read and write IRB data. Files created with older tools may only contain IRB-format data. However, many newer tools will read that information and translate it into XMP format.
• Some newer tools only store XMP-format metadata.
• A file edited by several different tools may have data in both formats, possibly with slightly different versions of the same data in each.
This can happen in several ways, but one cause is that the legacy IPTC-IIM schema limits the number of characters per field. An IPTC Core field might be truncated when saved in a corresponding IPTC-IIM field. When moving back and forth between tools that only understand the legacy format and those that recognize both the newer and older formats, synchronizing the information becomes extremely important.
Some software automatically recognizes both formats (the IRB format used to store IPTC-IIM and the XMP format that stores IPTC Core and other schemas) and synchronizes the information. But your workflow - the order in which you use different software - can make a big difference. In general, once you have used a newer tool that writes in both XMP and IRB, avoid using an older tool that only writes IRB format.
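The truncation problem described above can be sketched in a few lines of Python. The size limits in the table are the maximum lengths, in bytes, that the IPTC-IIM specification is understood to give for a handful of common fields; treat them as assumptions and check them against the specification before relying on them. The function name fit_to_iim is hypothetical.

```python
# Assumed maximum sizes, in bytes, for a few legacy IPTC-IIM fields.
IIM_MAX_BYTES = {
    "Object Name": 64,
    "By-line": 32,
    "Headline": 256,
    "Copyright Notice": 128,
    "Caption/Abstract": 2000,
}


def fit_to_iim(field, text, encoding="utf-8"):
    """Return (value, was_truncated) for text written to a legacy field."""
    limit = IIM_MAX_BYTES[field]
    raw = text.encode(encoding)
    if len(raw) <= limit:
        return text, False
    # Cut at the byte limit, then drop any half-finished multibyte character.
    clipped = raw[:limit].decode(encoding, errors="ignore")
    return clipped, True


xmp_creator = "Photographer with a Very Long Agency Credit Line"
iim_creator, truncated = fit_to_iim("By-line", xmp_creator)
if truncated:
    print("XMP and IIM copies now differ:", repr(iim_creator))
```

Because the limit is counted in bytes, text with accented or non-Latin characters reaches it sooner than plain ASCII text of the same length. That is one way the two copies of a field can drift apart after a round trip through an older tool.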
Why does the information I entered in "Author" show up as "Creator" in another program?
Several fields are shared between schemas but appear under different labels. What one software program calls "Object Name," another may call "Document Title" or "Title." Part of this problem stems from changes in field names as schemas have evolved. In some cases, software programs are responding to users' requests to use legacy field names. In others, software developers have chosen to use a different name. Some software even gives users a choice of which field names to use.
The bottom line is the metadata can be "mapped" to corresponding fields regardless of what they're called. See the IPTC Core Mapped Fields PDF on the linked page for more information on how fields are mapped between various imaging software.
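As a rough illustration of how such mapping works, the Python sketch below groups the different labels an application might display under the single XMP property that stores the value. The label lists are a small, assumed sample based on the names mentioned in this Q&A, not a complete or official mapping, and canonical_property is a hypothetical helper.

```python
# XMP property -> labels that different programs might show for it (sample only).
LABELS_BY_PROPERTY = {
    "dc:title": ["Object Name", "Document Title", "Title"],
    "dc:creator": ["By-line", "Author", "Creator"],
    "dc:description": ["Caption/Abstract", "Caption", "Description"],
    "dc:subject": ["Keywords"],
    "dc:rights": ["Copyright Notice"],
    "photoshop:Headline": ["Headline"],
}

# Invert the table so that any label can be looked up, ignoring case.
PROPERTY_BY_LABEL = {
    label.lower(): prop
    for prop, labels in LABELS_BY_PROPERTY.items()
    for label in labels
}


def canonical_property(label):
    """Return the XMP property behind a display label, or None if unknown."""
    return PROPERTY_BY_LABEL.get(label.strip().lower())


# "Author" in one program and "Creator" in another resolve to the same field.
same_field = canonical_property("Author") == canonical_property("Creator")
```

A lookup table like this is why a value typed into "Author" in one program can reappear as "Creator" in another: both labels point at the same stored property.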
How can I include metadata in image files?
Working with a wide variety of software, you can embed descriptive and identifying metadata in standard file formats, such as TIFF, JPEG and PSD. You can also embed such data in Raw image files, but there can be pitfalls. Proprietary Raw formats are neither standardized nor publicly documented. For now, it’s best to attach metadata in a sidecar file, such as an Adobe .xmp file, unless you convert your images to DNG format. See our Tutorials for more information on working with specific software programs.
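The embed-or-sidecar decision described above might be sketched like this in Python. The extension list is an assumption for illustration, and the sketch treats every other extension as a proprietary Raw format. It follows the common convention of naming the sidecar after the image with an .xmp extension, although some programs append .xmp to the full file name instead. The function name metadata_target is hypothetical.

```python
from pathlib import Path

# Formats assumed safe for embedded metadata (illustrative, not exhaustive).
EMBED_OK = {".jpg", ".jpeg", ".tif", ".tiff", ".psd", ".dng"}


def metadata_target(image_path):
    """Return the file that should receive the metadata for an image."""
    path = Path(image_path)
    if path.suffix.lower() in EMBED_OK:
        return path  # embed directly in the image file
    # Anything else is treated as proprietary Raw: use a sidecar next to it.
    return path.with_suffix(".xmp")


print(metadata_target("shoot/IMG_0001.CR2"))
print(metadata_target("shoot/IMG_0001.jpg"))
```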
Which metadata fields are most important?
While it's important to fill in as many photo metadata fields as possible with accurate and complete information, a few basic fields are considered critical. They include the copyright notice and contact information for the creator and/or rights holder. Creators should enter this information as early as possible in their workflow, in-camera if possible. Users who receive images without these critical fields should add them - if they're aware of the correct information - to any images they intend to retain for even a few days. And they should ensure such information is never stripped from image files.
Additionally, as soon as possible in their workflows, creators and users should ensure rich metadata are present in all image files, including such fields as:
- Caption/Description
- Keywords
- Unique identifiers (such as working file numbers)
To learn more about the importance of metadata to workflows and commerce, please see The Metadata Manifesto.
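A simple check for the fields discussed above could look like the following Python sketch. It assumes the metadata have already been read into a plain dictionary by some other tool. The key names (creator, copyright_notice and so on) and the function name audit are hypothetical, chosen only for this example.

```python
# Hypothetical key names for the fields this Q&A calls critical or recommended.
CRITICAL = ("creator", "copyright_notice", "contact")
RECOMMENDED = ("description", "keywords", "identifier")


def audit(metadata):
    """Report which critical and recommended fields are missing or empty."""

    def missing(names):
        # A field counts as missing if it is absent, None, "" or an empty list.
        return [name for name in names if not metadata.get(name)]

    return {"critical": missing(CRITICAL), "recommended": missing(RECOMMENDED)}


report = audit({
    "creator": "Jane Doe",
    "copyright_notice": "© Jane Doe",
    "keywords": [],
    "description": "Harbor at dawn",
})
print(report)
```

Running a check like this when images arrive, and again before they are shared, is one way to catch files whose copyright or contact details were never entered or have been stripped along the way.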
Do pictures from my smartphone pose a privacy risk?
Your smartphone may embed information about your location (via GPS tags in the Exif metadata), but this is something you can control. Some phones ask whether you want to share location data, or allow you to turn location services off. In addition, many of the services used to share images tend to strip (remove) all of the Exif data, including the GPS data. The Snopes page about this issue gives some additional details that are worth a read if you are concerned.