FileStructure
A MIE file consists of a series of MIE elements. A MIE element may contain either data or a group of MIE
elements, providing a hierarchical format for storing data. Each MIE element is identified by a human-
readable tag name, and may store data from zero to 2^64-1 bytes in length.
FileSignature
The first element in the MIE file must be an uncompressed MIE group element with a tag name of "0MIE".
This restriction allows the first 8 bytes of a MIE file to be used to identify a MIE format file. The
following table lists the two possible initial byte sequences for a MIE-format file (the first for big-
endian, and the second for little-endian byte ordering):
Byte Number: 0 1 2 3 4 5 6 7
C Characters: ~ \x10 \x04 ? 0 M I E
or ~ \x18 \x04 ? 0 M I E
Hexadecimal: 7e 10 04 ? 30 4d 49 45
or 7e 18 04 ? 30 4d 49 45
Decimal: 126 16 4 ? 48 77 73 69
or 126 24 4 ? 48 77 73 69
Note that byte 1 may have one of the two possible values (0x10 or 0x18), and byte 3 may have any value
(0x00 to 0xff).
ElementStructure
1 byte SyncByte = 0x7e (decimal 126, character '~')
1 byte FormatCode (see below)
1 byte TagLength (T)
1 byte DataLength (gives D if DataLength < 253)
T bytes TagName (T given by TagLength)
2 bytes DataLength2 [exists only if DataLength == 255 (0xff)]
4 bytes DataLength4 [exists only if DataLength == 254 (0xfe)]
8 bytes DataLength8 [exists only if DataLength == 253 (0xfd)]
D bytes DataBlock (D given by DataLength)
The minimum element length is 4 bytes (for a group terminator). The maximum DataBlock size is 2^64-1
bytes. TagLength and DataLength are unsigned integers, and the byte ordering for multi-byte DataLength
fields is specified by the containing MIE group element. The SyncByte is byte aligned, so no padding is
added to align on an N-byte boundary.
FormatCode
The format code is a bitmask that defines the format of the data:
7654 3210
++++ ---- FormatType
---- +--- TypeModifier
---- -+-- Compressed
---- --++ FormatSize
FormatType (bitmask 0xf0):
0x00 - other (or unknown) data
0x10 - MIE group
0x20 - text string
0x30 - list of null-separated text strings
0x40 - integer
0x50 - rational
0x60 - fixed point
0x70 - floating point
0x80 - free space
TypeModifier (bitmask 0x08):
Modifies the meaning of certain FormatTypes (0x00-0x60):
0x08 - other data sensitive to MIE group byte order
0x18 - MIE group with little-endian byte ordering
0x28 - UTF encoded text string
0x38 - UTF encoded text string list
0x48 - signed integer
0x58 - signed rational (denominator is always unsigned)
0x68 - signed fixed-point
Compressed (bitmask 0x04):
If this bit is set, the data block is compressed using Zlib deflate. An entire MIE group may be
compressed, with the exception of file-level groups.
FormatSize (bitmask 0x03):
Gives the byte size of each data element:
0x00 - 8 bits (1 byte)
0x01 - 16 bits (2 bytes)
0x02 - 32 bits (4 bytes)
0x03 - 64 bits (8 bytes)
The number of bytes in a single value for this format is given by 2**FormatSize (or 1 << FormatSize).
The number of values is the data length divided by this number of bytes. It is an error if the data
length is not an even multiple of the format size in bytes.
The following is a list of all currently defined MIE FormatCode values for uncompressed data (add 0x04 to
each value for compressed data):
0x00 - other data (insensitive to MIE group byte order) (1)
0x01 - other 16-bit data (may be byte swapped)
0x02 - other 32-bit data (may be byte swapped)
0x03 - other 64-bit data (may be byte swapped)
0x08 - other data (sensitive to MIE group byte order) (1)
0x10 - MIE group with big-endian values (1)
0x18 - MIE group with little-endian values (1)
0x20 - ASCII (ISO 8859-1) string (2,3)
0x28 - UTF-8 string (2,3,4)
0x29 - UTF-16 string (2,3,4)
0x2a - UTF-32 string (2,3,4)
0x30 - ASCII (ISO 8859-1) string list (3,5)
0x38 - UTF-8 string list (3,4,5)
0x39 - UTF-16 string list (3,4,5)
0x3a - UTF-32 string list (3,4,5)
0x40 - unsigned 8-bit integer
0x41 - unsigned 16-bit integer
0x42 - unsigned 32-bit integer
0x43 - unsigned 64-bit integer (6)
0x48 - signed 8-bit integer
0x49 - signed 16-bit integer
0x4a - signed 32-bit integer
0x4b - signed 64-bit integer (6)
0x52 - unsigned 32-bit rational (16-bit numerator then denominator) (7)
0x53 - unsigned 64-bit rational (32-bit numerator then denominator) (7)
0x5a - signed 32-bit rational (denominator is unsigned) (7)
0x5b - signed 64-bit rational (denominator is unsigned) (7)
0x61 - unsigned 16-bit fixed-point (high 8 bits is integer part) (8)
0x62 - unsigned 32-bit fixed-point (high 16 bits is integer part) (8)
0x69 - signed 16-bit fixed-point (high 8 bits is signed integer) (8)
0x6a - signed 32-bit fixed-point (high 16 bits is signed integer) (8)
0x72 - 32-bit IEEE float (not recommended for portability reasons)
0x73 - 64-bit IEEE double (not recommended for portability reasons) (6)
0x80 - free space (value data does not contain useful information)
Notes:
1. The byte ordering specified by the MIE group TypeModifier applies to the MIE group element as well as
all elements within the group. Data for all FormatCodes except 0x08 (other data, sensitive to byte
order) may be transferred between MIE groups with different byte order by byte swapping the
uncompressed data according to the specified data format. The following list illustrates the byte-
swapping pattern, based on FormatSize, for all format types except rational (FormatType 0x50).
FormatSize Change in Byte Sequence
-------------- -----------------------------------
0x00 (8 bits) 0 1 2 3 4 5 6 7 --> 0 1 2 3 4 5 6 7 (no change)
0x01 (16 bits) 0 1 2 3 4 5 6 7 --> 1 0 3 2 5 4 7 6
0x02 (32 bits) 0 1 2 3 4 5 6 7 --> 3 2 1 0 7 6 5 4
0x03 (64 bits) 0 1 2 3 4 5 6 7 --> 7 6 5 4 3 2 1 0
Rational values consist of two integers, so they are swapped as the next lower FormatSize. For
example, a 32-bit rational (FormatSize 0x02, and FormatCode 0x52 or 0x5a) is swapped as two 16-bit
values (ie. as if it had FormatSize 0x01).
2. The TagName of a string element may have an 6-character suffix to indicate a specific locale. (eg.
"Title-en_US", or "Keywords-de_DE").
3. Text strings are not normally null terminated, however they may be padded with one or more null
characters to the end of the data block to allow strings to be edited within fixed-length data
blocks. Newlines in the text are indicated by a single LF (0x0a) character.
4. UTF strings must not begin with a byte order mark (BOM) since the byte order and byte size are
specified by the MIE format. If a BOM is found, it should be treated as a zero-width non-breaking
space.
5. A list of text strings separated by null characters. These lists must not be null padded or null
terminated, since this would be interpreted as additional zero-length strings. For ASCII and UTF-8
strings, the null character is a single zero (0x00) byte. For UTF-16 or UTF-32 strings, the null
character is 2 or 4 zero bytes respectively.
6. 64-bit integers and doubles are subject to the specified byte ordering for both 32-bit words and
bytes within these words. For instance, the high order byte is always the first byte if big-endian,
and the eighth byte if little-endian. This means that some swapping is always necessary for these
values on systems where the byte order differs from the word order (eg. some ARM systems), regardless
of the endian-ness of the stored values.
7. Rational values are treated as two separate integers. The numerator always comes first regardless of
the byte ordering. In a signed rational value, only the numerator is signed. The denominator of all
rational values is unsigned (eg. a signed 64-bit rational of 0x80000000/0x80000000 evaluates to -1,
not +1).
8. 32-bit fixed point values are converted to floating point by treating them as an integer and dividing
by an appropriate value. eg)
16-bit fixed value = 16-bit integer value / 256.0
32-bit fixed value = 32-bit integer value / 65536.0
TagLength
Gives the length of the TagName string. Any value between 0 and 255 is valid, but the TagLength of 0 is
valid only for the MIE group terminator.
DataLength
DataLength is an unsigned byte that gives the number of bytes in the data block. A value between 0 and
252 gives the data length directly, and numbers from 253 to 255 are reserved for extended DataLength
codes. Codes of 255, 254 and 253 indicate that the element contains an additional 2, 4 or 8 byte
unsigned integer representing the data length.
0-252 - length of data block
255 (0xff) - use DataLength2
254 (0xfe) - use DataLength4
253 (0xfd) - use DataLength8
A DataLength of zero is valid for any element except a compressed MIE group. A zero DataLength for an
uncompressed MIE group indicates that the group length is unknown. For other elements, a zero length
indicates there is no associated data. A terminator element must have a DataLength of 0, 6 or 10, and
may not use an extended DataLength.
TagName
The TagName string is 0 to 255 bytes long, and is composed of the ASCII characters A-Z, a-z, 0-9 and
underline ('_'). Also, a dash ('-') is used to separate the language/country code in the TagName of a
localized text string, and a units string (possibly containing other ASCII characters) may be appear in
brackets at the end of the TagName. The TagName string is NOT null terminated. A MIE element with a tag
string of zero length is reserved for the group terminator.
MIE elements are sorted alphabetically by TagName within each group. Multiple elements with the same
TagName are allowed, even within the same group.
TagNames should be meaningful. Case is significant. Words should be lowercase with an uppercase first
character, and acronyms should be all upper case. The underline ("_") is provided to allow separation of
two acronyms or two numbers, but it shouldn't be used unless necessary. No separation is necessary
between an acronym and a word (eg. "ISOSetting").
All TagNames should start with an uppercase letter. An exception to this rule allows tags to begin with
a digit (0-9) if they must come before other tags in the sort order, or a lowercase letter (a-z) if they
must come after. For instance, the '0Type' element begins with a digit so it comes before, and the
'data' element begins with a lowercase letter so that it comes after meta information tags in the main
"0MIE" group.
Tag names for localized text strings have an 6-character suffix with the following format: The first
character is a dash ('-'), followed by a 2-character lower case ISO 639-1 language code, then an
underline ('_'), and ending with a 2-character upper case ISO 3166-1 alpha 2 country code. (eg.
"-en_US", "-en_GB", "-de_DE" or "-fr_FR". Note that "GB", and not "UK" is the code for Great Britain,
although "UK" should be recognized for compatibility reasons.) The suffix is included when sorting the
tags alphabetically, so the default locale (with no tag-name suffix) always comes first. If the country
is unknown or not applicable, a country code of "XX" should be used.
Tags with numerical values may allow units of measurement to be specified. The units string is stored in
brackets at the end of the tag name, and is composed of zero or more ASCII characters in the range 0x21
to 0x7d, excluding the bracket characters 0x28 and 0x29. (eg. "Resolution(/cm)" or
"SpecificHeat(J/kg.K)".) See Image::ExifTool::MIEUnits for details. Unit strings are not localized, and
may not be used in combination with localized text strings.
Sets of tags which would require a common prefix should be added in a separate MIE group instead of
adding the prefix to all tag names. For example, instead of these TagName's:
ExternalFlashType
ExternalFlashSerialNumber
ExternalFlashFired
one would instead designate a separate "ExternalFlash" MIE group to contain the following elements:
Type
SerialNumber
Fired
DataLength2/4/8
These extended DataLength fields exist only if DataLength is 255, 254 or 253, and are respectively 2, 4
or 8 byte unsigned integers giving the data block length. One of these values must be used if the data
block is larger than 252 bytes, but they may be used if desired for smaller blocks too (although this may
add a few unnecessary bytes to the MIE element).
DataBlock
The data value for the MIE element. The format of the data is given by the FormatCode. For MIE group
elements, the data includes all contained elements and the group terminator.
MIEgroups
All MIE data elements must be contained within a group. A group begins with a MIE group element, and
ends with a group terminator. Groups may be nested in a hierarchy to arbitrary depth.
A MIE group element is identified by a format code of 0x10 (big endian byte ordering) or 0x18 (little
endian). The group terminator is distinguished by a zero TagLength (it is the only element allowed to
have a zero TagLength), and has a FormatCode of 0x00.
The MIE group element is permitted to have a zero DataLength only if the data is uncompressed. This
special value indicates that the group length is unknown (otherwise the minimum value for DataLength is
4, corresponding the the minimum group size which includes a terminator of at least 4 bytes). If
DataLength is zero, all elements in the group must be parsed until the group terminator is found. If
non-zero, DataLength includes the length of all elements contained within the group, including the group
terminator. Use of a non-zero DataLength is encouraged because it allows readers quickly skip over
entire MIE groups. For compressed groups DataLength must be non-zero, and is the length of the
compressed group data (which includes the compressed group terminator).
GroupTerminator
The group terminator has a FormatCode and TagLength of zero. The terminator DataLength must be 0, 6 or
10 bytes, and extended DataLength codes may not be used. With a zero DataLength, the byte sequence for a
terminator is "7e 00 00 00" (hex). With a DataLength of 6 or 10 bytes, the terminator data block
contains information about the length and byte ordering of the preceding group. This additional
information is recommended for file-level groups, and is used in multi-document MIE files and MIE
trailers to allow the file to be scanned backwards from the end. (This may also allow some documents to
be recovered if part of the file is corrupted.) The structure of this optional terminator data block is
as follows:
4 or 8 bytes GroupLength (unsigned integer)
1 byte ByteOrder (0x10 or 0x18, same as MIE group)
1 byte GroupLengthSize (0x04 or 0x08)
The ByteOrder and GroupLengthSize values give the byte ordering and size of the GroupLength integer. The
GroupLength value is the total length of the entire MIE group ending with this terminator, including the
opening MIE group element and the terminator itself.
File-levelMIEgroups
File-level MIE groups may NOT be compressed.
All elements in a MIE file are contained within a special group with a TagName of "0MIE". The purpose of
the "OMIE" group is to provide a unique signature at the start of the file, and to encapsulate
information allowing files to be easily combined. The "0MIE" group must be terminated like any other
group, but it is recommended that the terminator of a file-level group include the optional data block
(defined above) to provide information about the group length and byte order.
It is valid to have more than one "0MIE" group at the file level, allowing multiple documents in a single
MIE file. Furthermore, the MIE structure enables multi-document files to be generated by simply
concatenating two or more MIE files.
ScanningBackwardsthroughaMIEFile
The steps below give an algorithm to quickly locate the last document in a MIE file:
1. Read the last 10 bytes of the file. (Note that a valid MIE file may be as short as 12 bytes long,
but a file this length contains only an an empty MIE group.)
2. If the last byte of the file is zero, then it is not possible to scan backward through the file, so
the file must be scanned from the beginning. Otherwise, proceed to the next step.
3. If the last byte is 4 or 8, the terminator contains information about the byte ordering and length of
the group. Otherwise, stop here because this isn't a valid MIE file.
4. The next-to-last byte must be either 0x10 indicating big-endian byte ordering or 0x18 for little-
endian ordering, otherwise this isn't a valid MIE file.
5. The value of the preceding 4 or 8 bytes gives the length of the complete file-level MIE group
(GroupLength). This length includes both the leading MIE group element and the terminator element
itself. The value is an unsigned integer with a byte length given in step 3), and a byte order from
step 4). From the current file position (at the end of the data read in step 1), seek backward by
this number of bytes to find the start of the MIE group element for this document.
This algorithm may be repeated again beginning at this point in the file to locate the next-to-last
document, etc.
The table below lists all 5 valid patterns for the last 14 bytes of a file-level MIE group, with all
numbers in hex. The comments indicate the length and byte ordering of GroupLength (xx) if available:
?? ?? ?? ?? ?? ?? ?? ?? ?? ?? 7e 00 00 00 - (no GroupLength)
?? ?? ?? ?? 7e 00 00 06 xx xx xx xx 10 04 - 4 bytes, big endian
?? ?? ?? ?? 7e 00 00 06 xx xx xx xx 18 04 - 4 bytes, little endian
7e 00 00 0a xx xx xx xx xx xx xx xx 10 08 - 8 bytes, big endian
7e 00 00 0a xx xx xx xx xx xx xx xx 18 08 - 8 bytes, little endian
TrailerSignature
The MIE format may be used for trailer information appended to other types of files. When this is done,
a signature must appear at the end of the main MIE group to uniquely identify the MIE format trailer. To
achieve this, a "zmie" trailer signature is written as the last element in the main "0MIE" group. This
element has a FormatCode of 0, a TagLength of 4, a DataLength of 0, and a TagName of "zmie". With this
signature, the hex byte sequence "7e 00 04 00 7a 6d 69 65" appears immediately before the final group
terminator, and the last 22 bytes of the trailer correspond to one of the following 4 patterns (where the
trailer length is given by "xx", as above):
?? ?? ?? ?? 7e 00 04 00 7a 6d 69 65 7e 00 00 06 xx xx xx xx 10 04
?? ?? ?? ?? 7e 00 04 00 7a 6d 69 65 7e 00 00 06 xx xx xx xx 18 04
7e 00 04 00 7a 6d 69 65 7e 00 00 0a xx xx xx xx xx xx xx xx 10 08
7e 00 04 00 7a 6d 69 65 7e 00 00 0a xx xx xx xx xx xx xx xx 18 08
Note that the zero-DataLength terminator may not be used here because the trailer length must be known
for seeking backwards from the end of the file.
Multiple trailers may be appended to the same file using this technique.
MIEDataValues
MIE data values for a given tag are usually not restricted to a specific FormatCode. Any value may be
represented in any appropriate format, including numbers represented in string (ASCII or UTF) form.
It is preferred that closely related values with the same format are written to a single tag instead of
using multiple tags. This improves localization of like values and decreases MIE element overhead. For
instance, instead of separate ImageWidth and ImageHeight tags, a single ImageSize tag is defined.
Tags which may take on a discrete set of values should have meaningful values if possible. This improves
the extensibility of the format and allows a more reasonable interpretation of unrecognized values.
NumericalRepresentation
Integer and floating point numbers may be represented in binary or string form. In string form, integers
are a series of digits with an optional leading sign (eg. "[+|-]DDDDDD"), and multiple values are
separated by a single space character (eg. "23 128 -32"). Floating point numbers are similar but may
also contain a decimal point and/or a signed exponent with a leading 'e' character (eg.
"[+|-]DD[.DDDDDD][e(+|-)EEE]"). The string "inf" is used to represent infinity. One advantage of
numerical strings is that they can have an arbitrarily high precision because the possible number of
significant digits is virtually unlimited.
Note that numerical values may have associated units of measurement which are specified in the "TagName"
string.
Date/TimeFormat
All MIE dates are strings in the form "YYYY:mm:dd HH:MM:SS.ss+HH:MM". The fractional seconds (".ss") are
optional, and if included may contain any number of significant digits (unlike all other fields which are
a fixed number of digits and must be padded with leading zeros if necessary). The timezone ("+HH:MM" or
"-HH:MM") is recommended but not required. If not given, the local system timezone is assumed.
MIMEType
The basic MIME type for a MIE file is "application/x-mie", however the specific MIME type depends on the
type of subfile, and is obtained by adding "x-mie-" to the MIME type of the subfile. For example, with a
subfile of type "image/jpeg", the MIE file MIME type is "image/x-mie-jpeg". But note that the "x-" is
not duplicated if the subfile MIME type already starts with "x-". So a subfile with MIME type
"image/x-raw" is contained within a MIE file of type "image/x-mie-raw", not "image/x-mie-x-raw". In the
case of multiple documents in a MIE file, the MIME type is taken from the first document. Regardless of
the subfile type, all MIE-format files should have a filename extension of ".MIE".
LevelsofSupport
Basic MIE reader/writer applications may choose not to provide support for some advanced features of the
MIE format. Features which may not be supported by all software are:
Compression
Software not supporting compression must ignore compressed elements and groups, but should be able to
process the remaining information.
Large data lengths
Some software may limit the maximum size of a MIE group or element. Historically, a limit of 2GB may
be imposed by some systems. However, 8-byte data lengths should be supported by all applications
provided the value doesn't exceed the system limit. (eg. For systems with a 2GB limit, 8-byte data
lengths should be supported if the upper 17 bits are all zero.) If a data length above the system
limit is encountered, it may be necessary for the application to stop processing if it can not seek
to the next element in the file.