Embedded OpenType (EOT) File Format

1 Introduction

Most commercial fonts have an end user license agreement (EULA) that specifies the rights granted to a person who has purchased a license to use the font. The right to freely distribute the font to others is not normally granted; use is typically restricted to installation within a limited set of parameters. There is therefore a need for a mechanism that lets people who have licensed a font for use with their documents also use it for their web content without violating the EULA they have accepted. The EOT file format was developed by Microsoft in cooperation with font creators for just that reason. The technology has been accepted for use by font makers for over 10 years.

This submission sets forth the responsibilities and expectations for the following classes of implementers/users of the EOT file format:

  1. Authoring tools
  2. Servers of web content
  3. User agents

This submission also provides the EOT file format.

1.1 Notational Conventions

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [IETF RFC 2119].

2 Responsibilities and Expectations

2.1 Authoring Tools

Authoring tools are defined as software programs that automate the process of creating web content. If the tool is capable of generating EOT files, the tool must get input from the user to verify that the font EULA allows this type of embedding or associating with the web content before generating an EOT font. The user should also verify that the embedding bits in the font file indicate the same level of embedding rights as the EULA. If there is a discrepancy, the font licensee should contact the font vendor and ask to have the font embedding bits changed to match the EULA.

The RootString in the EOT file must be set by the authoring tool in a manner consistent with the EULA permissions. A null RootString, indicating the font may be used for any site, must only be used if the font's owner gives permission to all people everywhere to use the font for embedding.

Authoring tools should compress TrueType fonts and OpenType fonts with TrueType outlines to reduce the size of the font that needs to be transmitted over the wire to rendering user agents. Authoring tools should not compress OpenType fonts with CFF (Adobe Compact Font Format) outlines using MicroType® Express.

2.2 Servers

Servers are not used to process Embedded OpenType fonts directly. However, server owners should not assume that it is permissible for font files to be stored on their server to share out to whichever person wants to use them. It is recommended that server owners understand the scope of the privileges of use for all font files stored on their server, as defined by the font EULAs.

2.3 User Agents

Whether the font associated with web content is or is not of an EOT file format, user agents must verify and honor the embedding bits of font files. The font designer has set the embedding privileges as part of the font manufacturing process to state the permissions they would like attributed to the font for embedding/associating with documents.

If the font embedding bits do not provide permissions for embedding, the user agent shall not use the font.

If the font embedding bits do not allow for the font to be used for editing purposes, the user agent shall ensure the document is in read only mode and shall use a font other than the non-editable font in form elements where text may be input.

If the font does not have an 'OS/2' table in it, as is the case with a few older fonts, the least restrictive privileges should be applied. Thus, the font should be embeddable.

User Agents must validate that the page using the embedded font is within the list of URLs in the RootString from which the embedded font object may be legitimately referenced.

MicroType® Express font compression has been the default when creating EOT fonts for many years. Therefore, user agents supporting EOT must implement support for decompressing EOT files in order to render web content commonly used on the World Wide Web today.

3 EOT File Format

The EOT file format consists of a single EMBEDDEDFONT structure. The EMBEDDEDFONT structure provides sufficient basic information about the font name and characters supported in the font so that the User Agent does not need to unpack, decompress or install the font if the font is already installed on the machine/device, or does not have sufficient licensing rights.

Three versions of the EMBEDDEDFONT have been defined to this point. The second and third add additional data at the end of the structure, just before the "FontData" entry.

OpenType/TrueType font data should be compressed to reduce the size of the file download using the MicroType® Express compression technology. The reference to MicroType® Express is normative.

FontData points to a TrueType or OpenType font whose format is specified in The Open Font Format (ISO/IEC 14496-22).

IMPORTANT NOTE: All values in the EMBEDDEDFONT structure, with the exception of FontData and EUDCFontData, are in Intel (little-endian) format. In contrast, the embedded OpenType font data contained in FontData and EUDCFontData uses Motorola (big-endian) format.
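
The byte-order rule above can be illustrated with a minimal sketch. The helper names (read_u16_le, read_u32_le, read_u32_be) are hypothetical and are not part of this submission; they only show how the same bytes are interpreted differently depending on whether they belong to the EMBEDDEDFONT header or to the font data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit little-endian read, for EMBEDDEDFONT header fields. */
static uint16_t read_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Hypothetical helper: 32-bit little-endian read, for EMBEDDEDFONT header fields. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: 32-bit big-endian read, for values found inside
   FontData or EUDCFontData (the embedded OpenType font itself). */
static uint32_t read_u32_be(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Reading byte by byte, rather than casting the buffer to an integer pointer, keeps the result independent of the host's own byte order and alignment rules.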

3.3 Version 0x00020002

| Data Type | Entry Name | Description |
|---|---|---|
| unsigned long | EOTSize | Total structure length in bytes (including string and font data) |
| unsigned long | FontDataSize | Length of the OpenType font (FontData) in bytes |
| unsigned long | Version | Version number of this format - 0x00020002 |
| unsigned long | Flags | Processing Flags |
| byte[10] | FontPANOSE | The PANOSE value for this font - See http://www.microsoft.com/typography/otspec/os2.htm#pan |
| byte | Charset | In Windows this is derived from TEXTMETRIC.tmCharSet. This value specifies the character set of the font. DEFAULT_CHARSET (0x01) indicates no preference. - See http://msdn2.microsoft.com/en-us/library/ms534202.aspx |
| byte | Italic | If the bit for ITALIC is set in OS/2.fsSelection, the value will be 0x01 - See http://www.microsoft.com/typography/otspec/os2.htm#fss |
| unsigned long | Weight | The weight value for this font - See http://www.microsoft.com/typography/otspec/os2.htm#wtc |
| unsigned short | fsType | Type flags that provide information about embedding permissions - See http://www.microsoft.com/typography/otspec/os2.htm#fst |
| unsigned short | MagicNumber | Magic number for EOT file - 0x504C. Used to check for data corruption. |
| unsigned long | UnicodeRange1 | OS/2.UnicodeRange1 (bits 0-31) - See http://www.microsoft.com/typography/otspec/os2.htm#ur |
| unsigned long | UnicodeRange2 | OS/2.UnicodeRange2 (bits 32-63) - See http://www.microsoft.com/typography/otspec/os2.htm#ur |
| unsigned long | UnicodeRange3 | OS/2.UnicodeRange3 (bits 64-95) - See http://www.microsoft.com/typography/otspec/os2.htm#ur |
| unsigned long | UnicodeRange4 | OS/2.UnicodeRange4 (bits 96-127) - See http://www.microsoft.com/typography/otspec/os2.htm#ur |
| unsigned long | CodePageRange1 | CodePageRange1 (bits 0-31) - See http://www.microsoft.com/typography/otspec/os2.htm#cpr |
| unsigned long | CodePageRange2 | CodePageRange2 (bits 32-63) - See http://www.microsoft.com/typography/otspec/os2.htm#cpr |
| unsigned long | CheckSumAdjustment | head.CheckSumAdjustment - See http://www.microsoft.com/typography/otspec/head.htm |
| unsigned long | Reserved1 | Reserved - must be 0 |
| unsigned long | Reserved2 | Reserved - must be 0 |
| unsigned long | Reserved3 | Reserved - must be 0 |
| unsigned long | Reserved4 | Reserved - must be 0 |
| unsigned short | Padding1 | Padding to maintain long alignment. Padding value must always be set to 0x0000. |
| unsigned short | FamilyNameSize | Number of bytes used by the FamilyName array |
| byte | FamilyName[FamilyNameSize] | Array of UTF-16 characters the length of FamilyNameSize bytes. This is the English language Font Family string found in the name table of the font (name ID = 1) - See http://www.microsoft.com/typography/otspec/name.htm |
| unsigned short | Padding2 | Padding value must always be set to 0x0000. |
| unsigned short | StyleNameSize | Number of bytes used by the StyleName array |
| byte | StyleName[StyleNameSize] | Array of UTF-16 characters the length of StyleNameSize bytes. This is the English language Font Subfamily string found in the name table of the font (name ID = 2) - See http://www.microsoft.com/typography/otspec/name.htm |
| unsigned short | Padding3 | Padding value must always be set to 0x0000. |
| unsigned short | VersionNameSize | Number of bytes used by the VersionName array |
| byte | VersionName[VersionNameSize] | Array of UTF-16 characters the length of VersionNameSize bytes. This is the English language version string found in the name table of the font (name ID = 5) - See http://www.microsoft.com/typography/otspec/name.htm |
| unsigned short | Padding4 | Padding value must always be set to 0x0000. |
| unsigned short | FullNameSize | Number of bytes used by the FullName array |
| byte | FullName[FullNameSize] | Array of UTF-16 characters the length of FullNameSize bytes. This is the English language full name string found in the name table of the font (name ID = 4) - See http://www.microsoft.com/typography/otspec/name.htm |
| unsigned short | Padding5 | Padding value must always be set to 0x0000. |
| unsigned short | RootStringSize | Number of bytes used by the RootString array |
| byte | RootString[RootStringSize] | Array of UTF-16 characters the length of RootStringSize bytes. |
| unsigned long | RootStringCheckSum | RootString CheckSum value. See algorithm to process RootStringChecksum below. |
| unsigned long | EUDCCodePage | Codepage value needed for EUDC font support. |
| unsigned short | Padding6 | Padding value must always be set to 0x0000. |
| unsigned short | SignatureSize | Number of bytes used by the Signature array. Currently reserved and should be set to 0x0000. |
| byte | Signature[SignatureSize] | Currently reserved. If the SignatureSize is 0x0000 there is no length to this array. |
| unsigned long | EUDCFlags | Processing flags for the EUDC font. Typical values might be TTEMBED_XORENCRYPTDATA and TTEMBED_TTCOMPRESSED. |
| unsigned long | EUDCFontSize | Number of bytes used by the EUDCFontData array. |
| byte | EUDCFontData[EUDCFontSize] | The EUDC font data. If the EUDCFontSize is 0x00000000 there is no length to this array. |
| byte | FontData[FontDataSize] | The font data for this EOT file. The data may be compressed or XOR encrypted as indicated by the processing flags. |
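
As an illustration of the layout above, the following sketch performs a few sanity checks on the fixed-size portion of a version 0x00020002 structure. The offsets are derived by adding up the field sizes in the table (for example, MagicNumber falls at byte 34 and FamilyNameSize at byte 82). The function name, the constant names and the choice of checks are assumptions made for this example; the sketch also assumes that the caller's buffer holds exactly one EOT file.

```c
#include <stddef.h>
#include <stdint.h>

#define EOT_VERSION_2_2      0x00020002u
#define EOT_MAGIC_NUMBER     0x504Cu
#define EOT_FIXED_HEADER_LEN 82u   /* bytes before FamilyNameSize; hypothetical name */

static uint16_t hdr_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t hdr_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical check: returns 1 when the fixed-size fields look consistent,
   0 otherwise. It does not walk the variable-length name strings. */
static int eot_header_looks_valid(const unsigned char *buf, size_t len)
{
    uint32_t eot_size, font_data_size, version;
    uint16_t magic;

    if (buf == NULL || len < EOT_FIXED_HEADER_LEN)
        return 0;

    eot_size       = hdr_u32_le(buf + 0);   /* EOTSize */
    font_data_size = hdr_u32_le(buf + 4);   /* FontDataSize */
    version        = hdr_u32_le(buf + 8);   /* Version */
    magic          = hdr_u16_le(buf + 34);  /* MagicNumber */

    if (magic != EOT_MAGIC_NUMBER || version != EOT_VERSION_2_2)
        return 0;
    if (eot_size != len)                    /* assumption: buffer is the whole file */
        return 0;
    if (font_data_size > eot_size - EOT_FIXED_HEADER_LEN)
        return 0;                           /* FontData cannot exceed what remains */
    return 1;
}
```

A full parser would continue from byte 82, reading each size field and skipping the corresponding array, before trusting FontDataSize as the length of the trailing FontData.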

4 Implementation Information

4.1 Font Embedding Levels

The "fsType" entry of the EMBEDDEDFONT structure provides an easily accessible copy of the font embedding licensing rights for the font in the .EOT file. This lets User Agents verify licensing privileges for using the embedded font before unpacking and installing the font.

Applications that implement support for font embedding must not embed fonts which are not licensed to permit embedding. Further, applications loading embedded fonts for temporary use (see Preview & Print and Editable embedding below) must delete the fonts from the device when the document containing the embedded font is closed.

NOTE: While the current version of the OpenType font specification's OS/2 table makes bits 0 - 3 a set of mutually exclusive bits, there may be older fonts that have multiple bits set. In that case the least restrictive bit set is used to determine the licensing level for font embedding.

0x0000 - Installable Embedding: No fsType bit is set. Thus fsType is zero. Fonts with this setting indicate that they may be embedded and permanently installed on the remote system by an application. The user of the remote system acquires the identical rights, obligations and licenses for that font as the original purchaser of the font, and is subject to the same end-user license agreement, copyright, design patent, and/or trademark as was the original purchaser.

NOTE: It is recommended that User Agents never permanently install embedded fonts on systems, even when Installable Embedding is set, to mitigate attacks where a malicious attempt could be made to fill up a machine's memory space with unwanted fonts without the user being aware.

0x0002 - Restricted License embedding: Fonts that have only this bit set must not be modified, embedded or exchanged in any manner without first obtaining permission of the legal owner. Caution: For Restricted License embedding to take effect, it must be the only level of embedding selected.

0x0004 - Preview & Print embedding: When this bit is set, the font may be embedded, and temporarily loaded on the remote system. Documents containing Preview & Print fonts must be opened "read-only;" no edits can be applied to the document.

0x0008 - Editable embedding: When this bit is set, the font may be embedded but must only be installed temporarily on other systems. In contrast to Preview & Print fonts, documents containing Editable fonts may be opened for reading, editing is permitted, and changes may be saved.

0x0100 - No subsetting: When this bit is set, the font may not be subsetted prior to embedding. Other embedding restrictions specified in bits 0-3 and 9 also apply.

0x0200 - Bitmap embedding only: When this bit is set, only bitmaps contained in the font may be embedded. No outline data may be embedded. If there are no bitmaps available in the font, then the font is considered unembeddable and the embedding services will fail. Other embedding restrictions specified in bits 0-3 and 8 also apply.
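
The embedding levels above could be evaluated along the following lines. This is a sketch only: the enum, macro and function names are hypothetical, and the ordering used when several of bits 1-3 are set (Editable before Preview & Print before Restricted License) is one reading of the "least restrictive bit" note in this section.

```c
#include <stdint.h>

#define FSTYPE_RESTRICTED    0x0002u
#define FSTYPE_PREVIEW_PRINT 0x0004u
#define FSTYPE_EDITABLE      0x0008u
#define FSTYPE_NO_SUBSETTING 0x0100u
#define FSTYPE_BITMAP_ONLY   0x0200u

/* Hypothetical type: ordered from most to least restrictive. */
typedef enum {
    EMBED_RESTRICTED,
    EMBED_PREVIEW_PRINT,
    EMBED_EDITABLE,
    EMBED_INSTALLABLE
} embed_level;

/* Pick the least restrictive level indicated by the level bits of fsType. */
static embed_level embedding_level(uint16_t fs_type)
{
    uint16_t level_bits = (uint16_t)(fs_type & 0x000Eu);

    if (level_bits == 0)
        return EMBED_INSTALLABLE;       /* no level bit set */
    if (level_bits & FSTYPE_EDITABLE)
        return EMBED_EDITABLE;
    if (level_bits & FSTYPE_PREVIEW_PRINT)
        return EMBED_PREVIEW_PRINT;
    return EMBED_RESTRICTED;            /* only takes effect when set alone */
}

/* Restricted License fonts must not be embedded at all. */
static int may_embed(uint16_t fs_type)
{
    return embedding_level(fs_type) != EMBED_RESTRICTED;
}

/* Preview & Print fonts require the document to be opened read-only. */
static int document_must_be_read_only(uint16_t fs_type)
{
    return embedding_level(fs_type) == EMBED_PREVIEW_PRINT;
}
```

The No subsetting (0x0100) and Bitmap embedding only (0x0200) bits are independent of the level and would be tested separately with the corresponding masks.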

4.2 Processing Flags

The following flags are defined for the Flags entry of the EMBEDDEDFONT structure. Some of the flags make sense during the process of creating the embedded font, but have no real value when the font is loaded.

#define TTEMBED_SUBSET                   0x00000001
#define TTEMBED_TTCOMPRESSED             0x00000004
#define TTEMBED_FAILIFVARIATIONSIMULATED 0x00000010
#define TTEMBED_EMBEDEUDC                0x00000020
#define TTEMBED_VALIDATIONTESTS          0x00000040 /* Deprecated */
#define TTEMBED_WEBOBJECT                0x00000080
#define TTEMBED_XORENCRYPTDATA           0x10000000

4.2.1 Flags used at loading time

The following flags define processing required when loading an embedded font.

TTEMBED_SUBSET

The font has been subsetted to contain support for only the characters needed to render the text in the document identified by the RootString entry of the EMBEDDEDFONT structure.

NOTE: If the font has been subsetted, it is possible that the UnicodeRange and CodePageRange data has not been updated to correctly reflect the subsetted font values. Therefore, the values should not be interpreted literally if the font has been subsetted.

TTEMBED_TTCOMPRESSED

The font data has been compressed by the MicroType® Express algorithm. The MicroType® Express algorithm is a lossless compression algorithm for OpenType/TrueType fonts, which was developed by Monotype Imaging. Monotype Imaging has indicated that they own one or more patents that relate to the MicroType® Express algorithm.

TTEMBED_EMBEDEUDC

If this flag is set, the font should contain EUDC (End User Defined Character) characters. Requires version 0x00020001 or higher of the EOT file format.

TTEMBED_XORENCRYPTDATA

This flag indicates that the FontData array and the EUDCFontData array (if present) have been encrypted by XORing each byte of the font data with the key 0x50. This happens on the final data, after compression and subsetting. The font must be decrypted using the XOR key to access the font data.

4.2.2 Flags used when creating embedded font

The following flags are used in some libraries for creating EOT files and are co-incidentally recorded in the EOT file. No processing is required by a user agent if these flags are set.

TTEMBED_FAILIFVARIATIONSIMULATED

If this flag is set, the User Agent must check whether the font is simulated by comparing the weight and italic settings obtained from the system with the actual font data. If the values do not match what the client requested in the document, the font should not be used.

TTEMBED_VALIDATIONTESTS

This flag is deprecated. There is no processing to do if this flag is set. The flag means that some tests may have been run when creating the embedded font. User agents should not make any assumptions on the basis of this flag.

TTEMBED_WEBOBJECT

A font with this flag set indicates that it is intended for use on the internet. However, it does not really mean anything for processing, except if a tool creating embedded fonts wants to enforce certain settings.

4.3 RootString

The RootString contains one or more full URLs from which the embedded font object may be referenced. Multiple URLs, separated by NULL terminators, can be specified. This allows licensed owners of fonts to honor their license agreements for using the embedded fonts on their pages, while preventing others from using the font for uses for which the font is not licensed.

4.3.1 RootString Usage

User Agents must validate that the page using the embedded font is within the list of URLs from which the embedded font object may be legitimately referenced.
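
One way to walk the NULL-separated URL list is sketched below. Several points are assumptions rather than requirements of this submission: the page URL is taken to be plain ASCII, an entry is treated as matching when it is a leading prefix of the page URL, the comparison is case-sensitive, and the function names are hypothetical. The RootString bytes are read as little-endian UTF-16, consistent with the rest of the EMBEDDEDFONT structure, and an empty RootString is treated as "any site".

```c
#include <stddef.h>

/* Does one UTF-16LE entry of n_units code units match the start of url? */
static int entry_is_prefix(const unsigned char *entry, size_t n_units,
                           const char *url)
{
    size_t i;

    for (i = 0; i < n_units; i++) {
        unsigned unit = (unsigned)entry[2 * i] | ((unsigned)entry[2 * i + 1] << 8);
        if (url[i] == '\0' || unit != (unsigned char)url[i])
            return 0;
    }
    return 1;
}

/* Hypothetical check: returns 1 if page_url falls under any RootString entry. */
static int url_allowed_by_rootstring(const unsigned char *root, size_t root_size,
                                     const char *page_url)
{
    size_t n_units = root_size / 2;  /* UTF-16 code units in the array */
    size_t start = 0;
    size_t i;

    if (root_size == 0)
        return 1;                    /* null RootString: usable on any site */

    for (i = 0; i <= n_units; i++) {
        int at_end = (i == n_units);
        int is_nul = !at_end && root[2 * i] == 0 && root[2 * i + 1] == 0;

        if (at_end || is_nul) {
            /* Entry spans code units [start, i); skip empty entries. */
            if (i > start &&
                entry_is_prefix(root + 2 * start, i - start, page_url))
                return 1;
            start = i + 1;
        }
    }
    return 0;
}
```

A plain prefix test is deliberately simple; a production user agent would more likely parse both URLs and compare scheme, host and path components, so that an entry for one host cannot match a longer host name that merely begins with the same characters.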

4.3.2 RootString Checksum Calculation

The RootStringCheckSum entry is calculated as the sum of all bytes in the RootString array, then XORed with a value of 0x50475342. The following algorithm is used to calculate the checksum value.

#define CS_XORKEY 0x50475342

/* Sum every byte of the buffer, then XOR the total with CS_XORKEY. */
unsigned long GetByteCheckSum( unsigned char* pucBuffer, unsigned short cbLength )
{
    int i;
    unsigned long ulCS = 0;

    for(i = 0; i < cbLength; i++)
    {
        ulCS += pucBuffer[i];
    }

    return ulCS ^ CS_XORKEY;
}

When unpacking the .EOT font file, the process should calculate the checksum of RootString data and then compare to the stored checksum value. If the values do not match, the font should be viewed as having been tampered with and should not be used.
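
The tamper check described above might look like the following sketch. It restates the byte-sum calculation with fixed-width types so that the example is self-contained; the function names root_string_checksum and root_string_is_intact are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CS_XORKEY 0x50475342u

/* Sum of all RootString bytes, XORed with the key. */
static uint32_t root_string_checksum(const unsigned char *root, size_t size)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < size; i++)
        sum += root[i];

    return sum ^ CS_XORKEY;
}

/* Returns 1 if the stored RootStringCheckSum matches the RootString bytes;
   a result of 0 means the font should be treated as tampered with. */
static int root_string_is_intact(const unsigned char *root, size_t size,
                                 uint32_t stored_checksum)
{
    return root_string_checksum(root, size) == stored_checksum;
}
```

For example, a RootString consisting of the three bytes 0x01, 0x02, 0x03 sums to 6, and 6 XOR 0x50475342 gives 0x50475344; an empty RootString yields the key itself, 0x50475342.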

4.4 Processing Font Data

When a font is packed into an .EOT file, with the font being both compressed (TTEMBED_TTCOMPRESSED) and XORed (TTEMBED_XORENCRYPTDATA), the processing is done in that order (compression followed by XOR). Thus, when the font has processing flags TTEMBED_TTCOMPRESSED and TTEMBED_XORENCRYPTDATA set, the font data must be first XORed and then passed through the decompression algorithm.

If the processing flag TTEMBED_XORENCRYPTDATA is set in the Flags entry, the following algorithm is used to XOR the font data for either the embedded font or the EUDC font (if present).

#define XORKEY 0x50

/* XOR every byte of the buffer with XORKEY. Applying it twice restores the original data. */
void XORBufferData( unsigned char* pucBuffer, unsigned long ulSize )
{
    unsigned long ulIndex;

    for(ulIndex = 0; ulIndex < ulSize; ulIndex++)
    {
        pucBuffer[ulIndex] = (unsigned char)(pucBuffer[ulIndex] ^ XORKEY);
    }
}
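
The unpacking order described in this section (undo the XOR first, then decompress) can be sketched as follows. The decompressor is represented by a caller-supplied callback, because the MicroType® Express algorithm itself is outside the scope of this example; the callback type, its in-place signature and the function name unpack_font_data are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define TTEMBED_TTCOMPRESSED   0x00000004u
#define TTEMBED_XORENCRYPTDATA 0x10000000u
#define XORKEY 0x50

/* Hypothetical callback standing in for a MicroType Express decompressor.
   It receives the de-XORed, still compressed bytes and returns 0 on success. */
typedef int (*decompress_fn)(unsigned char *data, size_t size);

/* Reverse the packing steps in the opposite order to which they were applied:
   packing is compress then XOR, so unpacking is XOR then decompress. */
static int unpack_font_data(unsigned char *data, size_t size, uint32_t flags,
                            decompress_fn decompress)
{
    size_t i;

    if (flags & TTEMBED_XORENCRYPTDATA) {
        for (i = 0; i < size; i++)
            data[i] = (unsigned char)(data[i] ^ XORKEY);
    }

    if (flags & TTEMBED_TTCOMPRESSED) {
        if (decompress == NULL)
            return -1;               /* compressed data but no decompressor */
        return decompress(data, size);
    }

    return 0;
}
```

The same routine would apply to EUDCFontData, using EUDCFlags in place of Flags.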