
Glossary of Terms

This meta-dictionary will help you understand the many metadata terms and acronyms you may encounter – from ANSI to XMP.

If you have other metadata-related terms to add, contact us.




ANSI - Abbreviation for the American National Standards Institute (ANSI), a private non-profit organization founded in 1918 that oversees the development and accreditation of voluntary consensus standards for products, services, processes, systems, and personnel in the United States. These standards ensure that the characteristics and performance of products are consistent, that people use the same definitions and terms, and that products are tested the same way.

Administrative Metadata -  Information such as licensing, usage rights or restrictions, model releases, provenance, and contact information for the rights holder or licensor, rather than descriptive information about the photo.

Archive Master - See Master.

ASCII - Abbreviation for the American Standard Code for Information Interchange. ASCII was one of the early character encoding methods (first published in 1963), based on the English alphabet. It was used in representing text in computers, communications equipment and other devices that work with text. Modern character encodings, such as UTF-8, support many more characters, but have a historical basis in ASCII.

Bit - A contraction of binary digit, the smallest unit of information storage or digital information that can take on one of two values, such as false and true or 0 and 1.

Byte - A component in the machine data hierarchy usually larger than a bit and smaller than a word; now most often eight bits and the smallest addressable unit of storage. A Byte typically holds one character.

Capture - The process by which a digital image is acquired as a digital file, either by a digital camera or by digitally scanning analog material (film / prints).

Cataloging - The process of adding images and both administrative and descriptive information about them, either automatically or manually, to an image database, or digital asset management system.

Checksum - A checksum is a fixed-size datum computed from an arbitrary block of digital data for the purpose of detecting accidental errors that may have been introduced during its transmission or storage. The checksum is transmitted or stored along with the data; the receiving system recomputes the checksum based upon the received data and compares this value with the one sent with the data. If the two values are the same, the receiver has some confidence that the data was received correctly.
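The send, recompute and compare cycle described above can be sketched in a few lines of Python. This is only an illustration: CRC-32 from the standard zlib module is one common checksum among many, and the function names and sample bytes are made up for the example.

```python
import zlib


def crc32_checksum(data: bytes) -> int:
    """Return the CRC-32 checksum of a block of bytes as an unsigned integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def verify(received: bytes, expected_checksum: int) -> bool:
    """Recompute the checksum on the received data and compare it with the one sent."""
    return crc32_checksum(received) == expected_checksum


# Hypothetical payload: the sender stores or transmits the checksum with the data.
payload = b"pixel data and metadata"
sent_checksum = crc32_checksum(payload)

# The receiver recomputes the checksum. A single altered byte changes the result.
damaged = b"pixel data and metadatA"
print(verify(payload, sent_checksum))
print(verify(damaged, sent_checksum))
```

A matching checksum gives some confidence, not proof, that the data is intact. It is meant to catch accidental errors rather than deliberate tampering.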

Comp - A shorthand reference to a "comprehensive" or visual rendering of a proposed advertisement or other printed piece. These would usually indicate the intended placement of each photograph, illustration and/or text. Images made available for this purpose are often referred to by this term.

Compression - Process of coding digital data using fewer bits, in order to save storage space or transmission time. There are many compression algorithms and utilities. The most common file format for photo transmission is JPEG (q.v.), which provides several degrees of compression, each of which discards some of the image data.

Container - In the digital imaging world, these are semantic objects used to group together or store related information to be easily referenced. For example, all of the Exif metadata is stored in one container within the header of an image file.

Container format - Container or wrapper formats, such as DNG, may group RAW image data and metadata together into a single object, making it a meta-format, because both the real data as well as the information about the data format are stored within the file itself.

Controlled Vocabulary - Controlled vocabularies ensure that the same terms are used for the same concepts and objects in a database, with similar and related terms clearly defined.  Used in the construction of thesauri and taxonomies, they effectively limit choice by offering pre-selected terms from which users must choose.

Corruption - See Data Corruption.

CSV - An abbreviation for "Comma Separated Value", a type of delimited text file format (q.v.) in which a comma separates the columns in which tabular data is stored. It dates back to early business computing methods, and is common to all computer platforms.

Crosswalk - See "metadata crosswalk."

Cutline - An older term used in the newspaper industry to indicate the caption to be used with a published photograph or illustration.

Data Corruption - The errors that occur to a data file or digital image as it is transferred or retrieved, which introduce unintended changes to the original data. With some errors it is possible to recover or partially recover the file. In some cases the pixel data may be preserved, but the information in the file header such as Exif or IPTC metadata may be lost.

Data File - See File.

Data Integrity - The assurance that data is accurate, correct and valid. With computer systems it is possible to verify data integrity by checking hash values, checksums, or other means.
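As a rough sketch of checking a hash value, the Python below records a SHA-256 digest for a file and later compares it against a fresh one. SHA-256 and the function names are choices made for this example, not something the definition prescribes.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hash of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: str, recorded_hash: str) -> bool:
    """Compare the file's current hash with a hash recorded earlier."""
    return file_sha256(path) == recorded_hash
```

In practice the recorded hash would be stored somewhere separate from the file, for example in a catalog record, and checked again after each copy or migration.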

Decompression - To reverse the effects of data compression.

Decryption - Any procedure used in cryptography to convert ciphertext (encrypted data) into plaintext. (From FOLDOC.)

Delimited file format - A plain text file format in which the various elements of a sequential file have their columns separated from one another using a specific recurring character, such as the comma or tab. Each row represents one record, and the delimited value keeps the data in ordered columns. Delimited files are useful in getting data from one program into another program. Tab delimited text files, and Comma Separated Value (CSV) files are both examples of delimited file formats.
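To illustrate, the same two-column table can be read from a comma-delimited or a tab-delimited file with Python's standard csv module. The column names and values here are invented for the example.

```python
import csv
import io


def parse_delimited(text: str, delimiter: str) -> list[dict[str, str]]:
    """Read delimited text whose first row holds the column names; return one dict per record."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# Hypothetical sample data: identical records, different delimiters.
csv_text = "filename,headline\nimg001.jpg,Harbor at dawn\n"
tab_text = "filename\theadline\nimg001.jpg\tHarbor at dawn\n"

for record in parse_delimited(csv_text, ","):
    print(record["filename"], "->", record["headline"])
```

Passing "\t" as the delimiter reads the tab-delimited version into the same records, which is what makes these formats convenient for moving data between programs.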

Derivative File - Used for image files that are created from an original or archive master file by subsampling or oversampling. Thumbnail or preview images that allow users to see what an asset looks like before they open the larger file are all examples of derivative files. They may also refer to images that will be used for production purposes but where some aspect has been altered such as the resolution, format type, or color space. The term derivative files can almost be used interchangeably with surrogate files, though derivatives imply a wider range of uses.

Descriptive Metadata - Information which tells the viewer what or who is in the image, and where and when the image was taken. These include captions, headlines, titles, keywords, location, date created and more.

Diacritic - Precomposed letters containing special marks used in digital typography to indicate a special pronunciation, such as the accent, cedilla, grave, tilde, and umlaut (áçèîñõü). These letters may not be represented in the most basic character sets and thus may not be transcribed correctly when exchanging photo metadata between operating systems.
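One reason accented letters can go astray is that Unicode allows the same letter to be stored either as a single precomposed character or as a base letter followed by a combining mark. The Python sketch below uses the standard unicodedata module to convert between the two forms. The helper names are made up for the example.

```python
import unicodedata


def to_precomposed(text: str) -> str:
    """Normalize to NFC: a base letter plus combining mark becomes one precomposed character."""
    return unicodedata.normalize("NFC", text)


def to_decomposed(text: str) -> str:
    """Normalize to NFD: a precomposed character becomes a base letter plus combining mark."""
    return unicodedata.normalize("NFD", text)


precomposed = "\u00e9"    # e with acute accent as a single character
decomposed = "e\u0301"    # plain e followed by a combining acute accent

# The two strings look identical on screen but compare as different
# until they are normalized to the same form.
print(precomposed == decomposed)
print(to_precomposed(decomposed) == precomposed)
```

Normalizing keywords and captions to one form before comparing or searching them is a common way to avoid mismatches.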

Diacritical mark - See diacritic.

Digital Asset Management - Digital Asset Management (DAM) refers to the methods of employing some form of database management of both images and their corresponding metadata to support accurate storage and retrieval of digital graphics and image files.

Digital Migration - See Migration.

Digital Object - A discrete unit of information in digital form such as a digital image file, or other document format such as PDF, Powerpoint, etc.

Digital Rights Management - The use of encryption or other technological means to regulate access to a licensable digital work, such as images, songs, movies, other software or sensitive documents.

Digital Negative (DNG) - Abbreviation for the Adobe Digital Negative Format. A wrapper technology or Container Format for holding RAW files along with other associated information such as metadata in XMP format. They can also be used to hold previews of images and a wide range of other data which can be stored with an image rather than in Sidecar Files (q.v.).

Digitization - Digitization is the process of converting analog film or physical prints (or other items) into digital equivalents. There are many methods that can be used. Film can be scanned with a dedicated film scanner or flatbed, prints or book pages can be scanned on a flatbed or photographed with a digital camera. The need to digitize will likely diminish with the move to direct digital capture.

Disc - Typically round optical storage device such as a CD or DVD.

Disk - A physical hard drive on which data is stored. Also referred to as a drive.

Downsampling - Resampling a digital image downwards by discarding pixel information, thus reducing the pixel density (resolution) and/or image dimensions.

Dublin Core Metadata Initiative (DCMI) - The Dublin Core Metadata Initiative (DCMI) is the organization that first established the Dublin Core metadata standard. Because Dublin Core fields can, in theory, be applied to almost any type of asset, not simply photos, the standard has become popular within the public sector as well as with other archives and digital repositories. The original version was a standardised core set of 15 fields or criteria for broadly describing content. DCMI data can be placed in-line in the meta tags of web pages (or as a reference to an associated XML file) and can also be used for other content such as photos, documents, videos etc.

Encryption - Any procedure used in cryptography to convert plaintext into ciphertext (encrypted message) in order to prevent any but the intended recipient from reading that data. Schematically, there are two classes of encryption primitives: public-key cryptography and private-key (also called secret-key or symmetric) cryptography; they are generally used complementarily. Public-key encryption algorithms include RSA; private-key algorithms include the obsolescent Data Encryption Standard, the Advanced Encryption Standard, as well as RC4.

Excel - A Microsoft spreadsheet format used to hold a variety of data in tables which can be sorted, calculated, or graphically displayed.

Exif - A metadata schema used to store technical metadata typically coming from a digital camera. This provides a host of information, such as the camera make and model, its serial number, the date and time of image capture, the shutter speed, lens used, the ISO speed setting, and often other technical details, such as white balance and distance to the subject. RAW file processing software can use Exif information to more accurately render images.

FPO - Abbreviation used to indicate low resolution images that are to be used "For Position Only" in comps.

File - A named and ordered sequence of Bytes that is known and understood by an operating system. A File can be zero or more Bytes, has permissions assigned (read/write/remove), and has file system statistics such as size and last modification date. A File also has a Format.

File Header - The non-image portion of a digital image file, preceding or following the actual pixel data, which contains information about the file such as those contained in various types of technical, descriptive and administrative metadata typically written using the EXIF, IPTC or XMP standards.

Format - A preexisting structure specifying the organization of a File, such as TIFF, JPEG, etc.

Header - See File Header.

Homonym - Words that are pronounced or spelled the same way but have different meanings.

ICC Color Profile - These are embedded metadata labels, developed by the International Color Consortium to indicate the color space used to create and edit the file. It is best to always embed an ICC profile in a digital image so that the colors as intended by the file creator are correctly transmitted, received, and viewed by the file recipient.

IIM - The abbreviation for the Information Interchange Module, the schema outlining the first IPTC metadata standard that was used in formulating the original File Info for Photoshop.

Ingest - The process by which one or many captured digital image files are taken into a computer system for some form of digital asset management.

Interoperability - The ability to exchange and use information between various systems or schemas.

IPTC Core - A metadata schema developed by the International Press Telecommunications Council that updated the previous IPTC schema to work with the newer Adobe XMP metadata standard. This IPTC4XMP format stores information separate from the IIM form of IPTC metadata but shares many fields that are backwards compatible to a degree. Also referred to as the IPTC4XMP, or IPTC Core Schema for XMP, it comprised the fields included in the IPTC Contact, Image, Content and Status panels that appear under the File> File Info menu in Photoshop.

IPTC - A metadata schema based on the Information Interchange Module (IIM) and named for the group that developed it in 1991, the International Press Telecommunications Council. A portion of the IIM was incorporated into Photoshop in 1995 and is stored in an Image Resource Block (IRB). While considered a legacy format, it remains widely used and readable by most software that accesses metadata.

IRB - Abbreviation for the Image Resource Block, a method of encoding non-pixel text-based information into the header of a digital image file.

JFIF - The technical name for the file format better known as JPEG. Typically only used when it is crucial to communicate the difference between the JPEG file format and the JPEG image compression algorithm.

JPEG - Is an acronym for the original name of the committee, (Joint Photographic Experts Group), that designed the standard image compression algorithm. As an 8-bit per channel format used for compressing either full-colour or grey-scale digital images of "natural", real-world scenes, JPEG does not work as well on non-realistic images (cartoons, line drawings, maps).

JPEG 2000 - JPEG 2000 is a wavelet-based image compression standard developed by the Joint Photographic Experts Group with the intention of superseding their original discrete cosine transform-based JPEG standard (created in 1992). The standardized filename extension is .jp2 or .jpx. As of 2008, there is little support for JPEG 2000 in web browsers, and hence it is not used much for image display on the Internet, though it has been adopted for use by a number of cultural heritage institutions.

KML - KML is an abbreviation for Keyhole Markup Language (KML) an XML-based language used to describe three-dimensional geospatial data for display in application programs. KML was originally developed by Keyhole, Inc, (acquired by Google in 2004) for use with what became Google Earth. The term "Keyhole" is a reference to the KH reconnaissance satellites, the eye-in-the-sky military reconnaissance system launched in 1976.

KMZ - These are simply zipped KML files which use a .kmz file extension. When a KMZ file is unzipped, a single "doc.kml" is found along with any overlay and icon images referenced in the KML.

kb - See Kilobyte.

Kilobyte - A measure of file size and storage capacity which refers to 1,000 or 1,024, 8-bit data units or characters, depending on context.

LAMP - Acronym for applications, such as a number of open source image databases, that use Linux, Apache, MySQL, and PHP (LAMP) to operate on the Internet.

LOC - An abbreviation for the Library of Congress.

LZW - Abbreviation for Lempel-Ziv-Welch compression, the algorithm designed by Terry Welch in 1984 for use in high-performance disk controller hardware and used by the Unix compress command to reduce the size of files for archiving or transmission. The LZW algorithm relies on the recurrence of byte sequences (strings) in its input, and is a popular compression type for use with TIFF files.
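The way LZW exploits recurring byte sequences can be shown with a minimal textbook version in Python. This sketch is an assumption-laden teaching aid: it emits a plain list of integer codes and omits the variable-width code packing, table size limits and clear codes that TIFF and the Unix compress command use, so it is not compatible with either.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Encode bytes as a list of integer codes, adding each new sequence to a table."""
    table = {bytes([i]): i for i in range(256)}  # start with all single bytes
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate            # keep extending a known sequence
        else:
            codes.append(table[current])   # emit the code for the longest known sequence
            table[candidate] = next_code   # remember the new, longer sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes by reconstructing the same table while decoding."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the sequence that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(output)


sample = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(sample)
print(len(sample), "bytes became", len(codes), "codes")
```

The more often sequences repeat in the input, the fewer codes are needed, which is why LZW tends to do well on images with large areas of repeated values.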

MB - See Megabyte.

Master - The finished, fully-developed version of a digital or analog image, used as the source for making various derivative files.

Megabyte - A measure of file size and storage capacity which refers to 1,048,576, 8-bit data units or characters; or 1024 kilobytes. 1024 megabytes equals one gigabyte.
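The arithmetic above can be written out as a short Python sketch. It assumes the binary (1,024-based) convention used in this entry, and the constant and function names are made up for the example.

```python
# Binary convention, as in the definition above.
BYTES_PER_KILOBYTE = 1024
BYTES_PER_MEGABYTE = 1024 * BYTES_PER_KILOBYTE   # 1,048,576
BYTES_PER_GIGABYTE = 1024 * BYTES_PER_MEGABYTE


def bytes_to_megabytes(size_in_bytes: int) -> float:
    """Convert a file size in bytes to megabytes (1 MB = 1,048,576 bytes)."""
    return size_in_bytes / BYTES_PER_MEGABYTE


print(bytes_to_megabytes(5_242_880))
```

Some storage vendors and operating systems use the decimal convention instead (1 megabyte = 1,000,000 bytes), so the same file can be reported with slightly different sizes.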

Metadata - Data about data. In terms of digital photo management, it is text data that informs you about the subject matter of an image in a way that is more useful than noting that it is a collection of colored pixels.

Metadata Crosswalk - Metadata crosswalks show people how to match up the data from one scheme into a different scheme. They are often used by libraries, archives, museums, and other cultural institutions to translate data to or from specific metadata schemes.   This type of "translating" from one format to another is often called "metadata mapping" or "field mapping," and is related to "data mapping," and "semantic mapping."  Metadata crosswalks also help databases using different metadata schemes to share information. They help metadata harvesters create union catalogs. They enable search engines to search multiple databases simultaneously with a single query.

Metadata mapping - See "Metadata Crosswalk."

Metalogging - The process of adding descriptive information (metadata) about an image and storing it in such a way that it can be used as an aid in retrieving that image from a database or collection after it is “cataloged.”

Migration - The process of moving digital information from one form of storage to another, which may or may not involve transforming the file format as well.

NAA - Abbreviation for the Newspaper Association of America, one of the groups responsible, along with the IPTC, for establishing early photo metadata standards.

Namespace - A namespace is an abstract container, also called a context, created to hold a logical grouping of unique identifiers or symbols (i.e., names), in order to differentiate them from items in different namespaces that have the same name, and prevent any ambiguity between them. Storage devices use directories (or folders) as namespaces, for example. This allows two files with the same name to be stored on the device so long as they are stored in different directories. Computer languages that support namespaces specify the rules that determine to which namespace an identifier (i.e., not its definition) belongs. The Adobe XMP labeling technology, which allows you to embed metadata into an image file, uses namespaces associated with a Uniform Resource Identifier (URI) (q.v.) that identifies the namespace, so that each of the field names within that namespace is unique. Namespaces such as those used by Adobe or PLUS can be found at, or
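A small XML example may make the idea concrete. In the Python sketch below, two fields share the name "title" but belong to different namespaces, so they stay distinct. The Dublin Core namespace URI is the published one; the second URI, the record layout and the field values are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"       # Dublin Core elements namespace
CUSTOM_NS = "http://example.com/ns/hypothetical/"  # made-up namespace for the example

document = f"""<record xmlns:dc="{DC_NS}" xmlns:my="{CUSTOM_NS}">
  <dc:title>Harbor at dawn</dc:title>
  <my:title>Job 42</my:title>
</record>"""


def get_title(root: ET.Element, namespace_uri: str) -> str:
    """Return the text of the 'title' field that belongs to the given namespace."""
    element = root.find(f"{{{namespace_uri}}}title")
    return element.text


root = ET.fromstring(document)
print(get_title(root, DC_NS))
print(get_title(root, CUSTOM_NS))
```

The prefixes (dc, my) are only shorthand inside the file. It is the URI behind each prefix that actually identifies the namespace.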

Ontology - In the context of Digital Asset Management, an ontology shows the relationships, properties and functions between terms or concepts which can express a wider range of relationships between attributes or terms than can a simple hierarchy. This can be very useful when attempting to represent complex or multi-faceted relationships.

Original - First or master version of a digital or analog image. See also, Master.

PDF - Abbreviation for the Adobe Portable Document Format, the native file format for Adobe Systems' Acrobat. This file format represents documents in a way where they are independent of the original application software, hardware, and operating system used to create them.

Photo-CD - A format popularized by Kodak, for scanning analog film and storing it in an "image pac" format.

Plaintext - The normal representation of textual data before any action has been taken to conceal or format it. Within image metadata circles, this usually refers to text file formats in which there are no formatting codes such as bold, italic, underline, point size, or font designations.

PLUS - The Picture Licensing Universal System is an integrated set of standards for communicating rights metadata associated with commissioned and stock images. The PLUS standards are developed, approved and maintained by the PLUS Coalition, an international, non-profit umbrella association representing publishers, designers, advertising agencies, photographers, illustrators, stock image distributors, artist representatives, museums, libraries, and standards bodies, such as UPDIG, IPTC, IDEAlliance and others. More information at

PNG - Abbreviation for Portable Network Graphics, a format for storing bitmapped images, employing lossless data compression and supporting transparency. PNG was created to replace the GIF format as its use does not require a patent license.

Preview image - Refers to the larger on-screen version of an image, the next size up from a thumbnail, and may be smaller than a comp or FPO. A preview image is typically accessed by clicking on the thumbnail in an image database.

Proxy Files - Proxy files are those derived from an original digital master and are typically used (in combination with metadata) to assist users in locating images in a database. Proxy files include previews and thumbnails files and may also be referred to as Derivative Files.

PSB - The .PSB (Photoshop Big) format is an updated version of .PSD specifically designed for dealing with files over 2 gigabytes in size.

PSD - The .psd (Photoshop Document) format is a popular proprietary file format from Adobe Systems, Inc. It has support for most all of the imaging options available in Photoshop, such as layer masks, transparency, text, and alpha channels. In addition, spot colors, clipping paths and even duotone settings can be saved if you are preparing images for printing.

RAW - A RAW image file is a variety of image file that contains unprocessed (or minimally processed) data from the image sensor of a digital camera or image scanner, without processing it into a more common image format such as JPEG or TIFF. Raw files require additional processing by a raw converter in a wide-gamut colorspace before conversion to a format in which they are ready to be used with a bitmap image editor or printed. Because the characteristics of each RAW format vary by vendor or manufacturer, dealing with them using Digital Image Management tools is quite complex. Because original RAW files are quite fragile and may be corrupted with the use of third party software, many applications, such as Adobe Photoshop, Lightroom and Bridge, use sidecar files to store the changes made to them. As a result, Adobe developed the DNG (Digital Negative) format to hold the RAW file and attendant metadata in a single Container Format.

Record - A set of data (typically fields of information) relating to an individual item in an image database.

Render - To present a Digital Object to a user by converting the high-level object-based description into a graphical representation in order to display the image.

Resample - Resampling an image changes its resolution through interpolation, making calculations using already known values.

RGB - An abbreviation for the colors Red, Green, and Blue used to display color on many projected light devices such as computer monitors, televisions and video projectors.  ICC color standards are used to define the display accurately.

Rights - Assertions of one or more rights or permissions pertaining to a Digital Image and/or what a specific Agent or Distributor can do with an Image.

Rights Management - The processes associated with active control and management of the licensing history of a work.

Rights Metadata - A subset of Administrative Metadata which identifies the creator, copyright holder, or licensor, and defines which rights are being granted or reserved.

Schema - A formal structural description or model of the various fields that are contained in a database or database-like structures, such as XML files. Both the Exif and IPTC Core are examples of schemas.

Sidecar files - Sidecar files are a method of storing data (often metadata) related to a file in an external file, rather than embedding it into the source file. Each source file can have one or more sidecar files, whereas in a "metadata database" a single database contains metadata for several source files. In most cases sidecar files have the same base name as the source file, but with a different extension. The problem with this system is that most operating systems and file managers have no knowledge of these relationships, and might allow the user to rename or move one of the files, thereby breaking the relationship. For file formats that have no internal support for XMP data, the data is stored in separate .xmp files with the same base file name. Many photo cataloging applications have support for this file format.
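The "same base name, different extension" convention, and the risk of breaking it when renaming, can be sketched in Python. The file names and helper functions are hypothetical, and the rename helper only builds a plan of old and new names without touching any files.

```python
from pathlib import Path


def sidecar_path(source: str, extension: str = ".xmp") -> Path:
    """Return the sidecar path: same folder and base name, different extension."""
    return Path(source).with_suffix(extension)


def rename_plan(source: str, new_stem: str) -> list[tuple[Path, Path]]:
    """List (old, new) name pairs so the source file and its sidecar are renamed together."""
    pairs = []
    for old in (Path(source), sidecar_path(source)):
        pairs.append((old, old.with_name(new_stem + old.suffix)))
    return pairs


# Hypothetical RAW file and its XMP sidecar.
print(sidecar_path("shoot/IMG_0001.CR2"))
for old, new in rename_plan("shoot/IMG_0001.CR2", "harbor_001"):
    print(old, "->", new)
```

Renaming both files in one step keeps the pair linked, which is what a cataloging application does on the user's behalf and what a plain file manager will not.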

slug; slug-line - Newspaper editing lingo for a short name given to an article that is in production. When metalogging you can put the name of the event into the Headline field within the IPTC metadata of an image.

SQL - An abbreviation for Structured Query Language, a standard interactive and programming  language used for defining and manipulating tables of data in a relational database management system.

Steganography - A term of Greek origin, meaning "covered writing" or "hidden writing," which refers to methods of concealing a message or word within a digital image so that a viewer does not realize it is there. Signum Systems and Digimarc have created techniques for embedding a hidden watermark or copyright data in digital images as a method of protecting creators' or owners' rights to their intellectual property.

Storage - The act or process of storing information in some form of non-volatile computer memory such as magnetic tape or disk, or optical disk (CD-R, DVD-R).

Store - The act of writing a data or image file to some non-volatile storage device such as a hard drive, tape, CD-R or DVD-R.

Suffix - One or more letters added at the end of a filename prior to the extension, which gives clues as to the specific type of image file. For example, a file with the suffix 'r' indicates that it is the RGB version of the image.

Surrogate File - see Proxy files.

Synonym - A word, phrase, or term that has a meaning the same as, or very near to, that of another word, phrase or term.

Tab Delimited - A type of delimited text file which uses the tab character to separate each of the columns used for storing tabular data. This form of storing data is popular with many databases making it a popular way to exchange data between programs.

Tag - The act of attaching a label, such as a keyword, to a digital photo or other image resource.

Taxonomy(ies) - A type of classification which implies a hierarchical system (i.e. it has parent/child relationships between terms).

Technical Metadata - For most modern image-capture devices this is information which describes an image’s characteristics, such as its size, color profile, ISO speed and other camera settings.

TGM-I - Abbreviation for the Thesaurus of Graphic Materials- I (cross reference for indexing visual materials).

TGM-II - Abbreviation for the Thesaurus of Graphic Materials- II (Genre and Physical Characteristic Terms).

Thesaurus - A classified list of terms, such as keywords, including synonyms (and sometimes antonyms) for the words of a given language or used in a particular field, typically used for indexing and information retrieval.

Thumbnail image - A miniature version of an image that is smaller than a preview, and typically used in an image database, or on a web page to represent or provide a link to other content, such as a larger version of the image.

TIFF (or tif) - Tagged Image File Format images are stored using a proprietary method currently owned by Adobe.

UCS - An abbreviation for Universal Character Set, also a component of the abbreviation UTF, which stands for UCS Transformation Format.

Unicode - A character encoding standard intended to support the characters used by a large number of the world’s languages, designed for use internationally in computers. Unlike the 7-bit ASCII encoding scheme, which can represent only 128 characters, Unicode defines a code space of more than one million code points (1,114,112). Early versions of Unicode used 16-bit characters, which allowed for 65,536 combinations; the current standard goes beyond that limit, enabling it to encode the letters of most written languages as well as the tens of thousands of characters used in languages such as Japanese and Chinese.

Upsampling - Another way to refer to resampling a digital image upwards which requires creating pixel information based on the adjacent values.

URI, URL, URN - A Uniform Resource Identifier (URI) consists of a string of characters used to identify or name a resource on the Internet, which enables interaction with representations of the resource over a network, typically the World Wide Web. Computer scientists may classify a URI as a locator (URL), or a name (URN), or both. A Uniform Resource Name (URN) functions like a person's name, while a Uniform Resource Locator (URL) resembles that person's street address. The ISBN system for uniquely identifying books provides a typical example of the use of URNs.
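As a rough illustration of the name versus locator distinction, the Python sketch below inspects a URI with the standard urllib.parse module. The classification rule is a simplification made up for this example (a "urn" scheme means a name, a network location means a locator), and both sample URIs are illustrative values only.

```python
from urllib.parse import urlparse


def classify_uri(uri: str) -> str:
    """Roughly label a URI as a name (URN), a locator (URL), or neither."""
    parts = urlparse(uri)
    if parts.scheme.lower() == "urn":
        return "URN"   # names a resource without saying where it lives
    if parts.scheme and parts.netloc:
        return "URL"   # gives a network location for the resource
    return "URI"


# Illustrative values: an ISBN expressed as a URN, and a web address.
print(classify_uri("urn:isbn:0451450523"))
print(classify_uri("https://example.org/photos/img001.jpg"))
```

An XMP namespace identifier is typically written as a URL-style URI even though software treats it purely as a name and never needs to fetch it.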

USM - Abbreviation for UnSharp Masking, a means of creating additional definition in the edge transition areas of an image in order to retain detail.

UTF - An abbreviation for UCS Transformation Format, which describes one of a set of standard character encodings, such as UTF-8 and UTF-16, that are defined in accordance with ISO 10646, although the Unicode standard includes additional material.

UTF-16 - An abbreviation for the UCS transformation format 16 text encoding. This is a Unicode character set, encoded with a 16-bit transformation method as defined in RFC 2781.

UTF-8 - An abbreviation for UCS transformation format 8 text encoding; a Unicode character set, encoded with an 8-bit transformation method. This format avoids the problems of fixed-length Unicode encodings because an ASCII file encoded in UTF-8 is exactly the same as the original ASCII file, and any non-ASCII characters have the most significant bit set, so that normal tools for text searching etc. work as expected.
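Both properties mentioned above can be checked with a few lines of Python. The helper names and sample strings are made up for the example.

```python
def utf8_bytes(text: str) -> bytes:
    """Encode text as UTF-8."""
    return text.encode("utf-8")


def all_high_bits_set(data: bytes) -> bool:
    """Return True if every byte has its most significant bit (0x80) set."""
    return all(byte & 0x80 for byte in data)


ascii_caption = "Harbor at dawn"
n_with_tilde = "\u00f1"  # a single non-ASCII letter

# Plain ASCII text produces the same bytes in ASCII and in UTF-8.
print(utf8_bytes(ascii_caption) == ascii_caption.encode("ascii"))

# A non-ASCII character becomes several bytes, each with the high bit set,
# so none of them can be mistaken for an ASCII character.
encoded = utf8_bytes(n_with_tilde)
print(len(encoded), all_high_bits_set(encoded))
```

This is why a tool that searches for an ASCII word in a UTF-8 file will not find false matches inside the bytes of an accented letter.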

Validation - The process of checking or evaluating a Digital Image to ensure that it complies with the requirements of a standard or benchmark. For example, the structure of a Digital File can be validated against a file format specification to test for data corruption.

Verification - See Validation.

Volume - A logical or virtual entity that consists of portions of one or more hard drive disks. A volume may be formatted and may have a file system, a drive letter, or both. A volume is expressed by a type and a layout (simple, spanned, striped, RAID 1, etc.).

Watermark - A term adapted from the printing industry, which involves superimposing a recognizable image, pattern, or words so that the parts of the image it covers appear lighter or darker than the rest of the image. Watermarks are often used to assert ownership or copyright management information.

Workflow - A  logical sequence of steps taken or tasks performed that define the paths taken to complete a task with a specified outcome, subject to certain approvals or tests. It may be illustrated with a flowchart to define specific actions, results, decisions, or desired outcomes.  In photography, refers to the sequence of actions from capture to output that produce a final image.

XMP - An abbreviation for Extensible Metadata Platform, a specific type of extensible markup language used to store metadata in digital photos. XMP was introduced by Adobe in 2001. In 2004, Adobe, IPTC and IDEAlliance collaborated to introduce the IPTC Core Schema for XMP, which transfers metadata values from IPTC headers to the more modern and flexible XMP.