Unicode 12.1.0

2019 May 7 (Announcement, in Japanese)

Version 12.1.0 has been superseded by the latest version of the Unicode Standard.

This page summarizes the important changes for the Unicode Standard, Version 12.1.0. This version supersedes all previous versions of the Unicode Standard.

A. Summary
B. Technical Overview
C. Stability Policy Update
D. Textual Changes and Character Additions
E. Conformance Changes
F. Changes in the Unicode Character Database
G. Changes in the Unicode Standard Annexes
H. Changes in Synchronized Unicode Technical Standards
M. Implications for Migration

A. Summary

Unicode 12.1 adds exactly one character, for a total of 137,929 characters.

The new character added to Version 12.1 is:

U+32FF SQUARE ERA NAME REIWA

Version 12.1 adds that single character to enable software to be rapidly updated to support the new Japanese era name in calendrical systems and date formatting. The new Japanese era name was officially announced on April 1, 2019, and is effective as of May 1, 2019.
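One quick way to see whether a given runtime already knows the new character is to ask its character database for the name of U+32FF. The following is a minimal sketch, assuming a Python interpreter whose bundled `unicodedata` module is built against Unicode 12.1 or later; the helper name `knows_reiwa` is hypothetical and used only for illustration.

```python
import unicodedata

REIWA = "\u32ff"  # U+32FF SQUARE ERA NAME REIWA


def knows_reiwa() -> bool:
    """Return True if this runtime's character data assigns a name to U+32FF."""
    # unicodedata.name() returns the default ("" here) for unassigned code points.
    return unicodedata.name(REIWA, "") == "SQUARE ERA NAME REIWA"


def unicode_data_version() -> tuple:
    """Return the runtime's Unicode data version as a tuple of integers."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    print("U+32FF is named:", knows_reiwa())
```

On a runtime with older character data, `knows_reiwa()` returns `False`, which signals that the underlying tables need updating before the era name character can be handled by name.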

Synchronization

Most of the Unicode Standard Annexes remain published as they were for Version 12.0. They are incorporated by reference into Version 12.1, with no changes to their text or dates of publication. The single exception is UAX #42, which has been updated to reflect the changes in the UCD in XML for Version 12.1.

This version of the Unicode Standard is identical to Unicode 12.0 except for the changes necessary to accommodate the addition of the single new era name character.

See Sections D through H below for additional details regarding the changes in this version of the Unicode Standard, its associated annexes, and the other synchronized Unicode specifications.

B. Technical Overview

Version 12.1 of the Unicode Standard consists of:

Version 12.0 of the core specification
Version 12.1 of the code charts (delta and archival)
Version 12.0 of the Unicode Standard Annexes (except UAX #42)
Version 12.1 of UAX #42, “Unicode Character Database in XML”
Version 12.1 of the Unicode Character Database (UCD)

The core specification gives the general principles, requirements for conformance, and guidelines for implementers. The code charts show representative glyphs for all the Unicode characters. The Unicode Standard Annexes supply detailed normative information about particular aspects of the standard. The Unicode Character Database supplies normative and informative data for implementers to allow them to implement the Unicode Standard.

Core Specification

The core specification is available as a single PDF for viewing (14 MB). Links are also available in the navigation bar on the left of this page to access individual chapters and appendices of the core specification.

Code Charts

Several sets of code charts are available. They serve different purposes:

The latest set of code charts for the Unicode Standard is available online. Those charts are always the most current code charts available, and may be updated at any time. The charts are organized by scripts and blocks for easy reference. An online index by character name is also provided.

For Unicode 12.1.0 in particular, two additional sets of code chart pages are provided:

A set of delta code charts showing the block in which a character was added for Unicode 12.1.0. The new character is visually highlighted in the charts.
A set of archival code charts that represents the entire set of characters, names and representative glyphs at the time of publication of Unicode 12.1.0.

The delta and archival code charts are a stable part of this release of the Unicode Standard. They will never be updated.

Unicode Standard Annexes

Links to the individual Unicode Standard Annexes are available in the navigation bar on the left of this page. No changes in the content of the Unicode Standard Annexes were made for Version 12.1, except for UAX #42.

Unicode Character Database

Data files for Version 12.1 of the Unicode Character Database are available. The ReadMe.txt in that directory provides a roadmap to the functions of the various subdirectories. Zipped versions of the UCD for bulk download are available as well.

Version References

Version 12.1.0 of the Unicode Standard should be referenced as:

The Unicode Consortium. The Unicode Standard, Version 12.1.0, (Mountain View, CA: The Unicode Consortium, 2019. ISBN 978-1-936213-25-2)
http://www.unicode.org/versions/Unicode12.1.0/

The terms “Version 12.1” or “Unicode 12.1” are abbreviations for the full version reference, Version 12.1.0.

The citation and permalink for the latest published version of the Unicode Standard are:

The Unicode Consortium. The Unicode Standard.
http://www.unicode.org/versions/latest/

A complete specification of the contributory files for Unicode 12.1 is found on the page Components for 12.1.0. That page also provides the recommended reference format for Unicode Standard Annexes. For examples of how to cite particular portions of the Unicode Standard, see also the Reference Examples.

Errata

No errata were incorporated into Unicode 12.1. For corrigenda and errata after the release of Unicode 12.1, see the list of current Updates and Errata.

C. Stability Policy Update

There were no significant changes to the Stability Policy of the core specification between Unicode 12.0 and Unicode 12.1.

D. Textual Changes and Character Additions

No new scripts were added in Version 12.1.

No changes were made in the Unicode Standard Annexes, except for UAX #42, “Unicode Character Database in XML”.

Character Assignment Overview

One character has been added. For details, see the delta code charts.

E. Conformance Changes

There are no significant new conformance requirements in Unicode 12.1.

F. Changes in the Unicode Character Database

The changes to the UCD for Version 12.1 are constrained to the minimum necessary to account for the addition of the single new character, U+32FF. A new Age value was added for Version 12.1.
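To illustrate how an implementation might pick up the new Age value, the sketch below reads property data in the general UCD text layout (semicolon-separated fields, `#` comments, `..` for code point ranges) and looks up the value for U+32FF. The function names and the `SAMPLE` lines are hypothetical and written for this illustration only; they are not copied from the published data files, which should be consulted for the actual content.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Record = Tuple[int, int, str]  # (first code point, last code point, value)


def parse_ucd_lines(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (start, end, value) records from UCD-style property lines."""
    for raw in lines:
        data = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not data:
            continue  # blank or comment-only line
        fields = [field.strip() for field in data.split(";")]
        code_range, value = fields[0], fields[1]
        if ".." in code_range:
            start, end = code_range.split("..")
        else:
            start = end = code_range
        yield int(start, 16), int(end, 16), value


def age_of(code_point: int, records: List[Record]) -> Optional[str]:
    """Return the Age value covering code_point, or None if not listed."""
    for start, end, value in records:
        if start <= code_point <= end:
            return value
    return None


# Illustrative input only (an assumption, not an excerpt of the real file).
SAMPLE = [
    "# Age=V12_1 (illustrative sample)",
    "32FF ; 12.1 # SQUARE ERA NAME REIWA",
]

if __name__ == "__main__":
    records = list(parse_ucd_lines(SAMPLE))
    print(age_of(0x32FF, records))
```

A linear scan is enough for a sketch; a production table would more likely use a sorted range list with binary search.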

G. Changes in the Unicode Standard Annexes

There were no changes made to the Unicode Standard Annexes for Version 12.1, except for UAX #42.

Unicode Standard Annex	Changes
UAX #9 Unicode Bidirectional Algorithm	No changes in this version.
UAX #11 East Asian Width	No changes in this version.
UAX #14 Unicode Line Breaking Algorithm	No changes in this version.
UAX #15 Unicode Normalization Forms	No changes in this version.
UAX #24 Unicode Script Property	No changes in this version.
UAX #29 Unicode Text Segmentation	No changes in this version.
UAX #31 Unicode Identifier and Pattern Syntax	No changes in this version.
UAX #34 Unicode Named Character Sequences	No changes in this version.
UAX #38 Unicode Han Database (Unihan)	No changes in this version.
UAX #41 Common References for Unicode Standard Annexes	No changes in this version.
UAX #42 Unicode Character Database in XML	Updated to account for the addition of Age=12.1.
UAX #44 Unicode Character Database	No changes in this version.
UAX #45 U-Source Ideographs	No changes in this version.
UAX #50 Unicode Vertical Text Layout	No changes in this version.

H. Changes in Synchronized Unicode Technical Standards

Minimal changes were made for the Unicode Technical Standards whose versions are synchronized with the Unicode Standard. UTS #10, UTS #39, and UTS #46 were updated to 12.1, so that their data references correctly point to 12.1.0 data containing the newly added U+32FF. The emoji data files make no reference to Japanese era name characters, so no update was required for UTS #51 for 12.1.

Unicode Technical Standard	Changes
UTS #10 Unicode Collation Algorithm	Data files were updated to include U+32FF. There are no significant textual changes in the specification.
UTS #39 Unicode Security Mechanisms	Data files were updated to include U+32FF. There are no significant textual changes in the specification.
UTS #46 Unicode IDNA Compatibility Processing	Data files were updated to include U+32FF. There are no significant textual changes in the specification.
UTS #51 Unicode Emoji	No changes in this version.

M. Implications for Migration

The only change in Unicode 12.1 is the addition of the new Japanese era name character, U+32FF. This character will affect calendrical systems and date formatting that use the ligated form U+32FF specifically (instead of a sequence of two unified CJK ideographs) to represent the new era name.
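Software that parses or matches dates may therefore encounter the era name in either form. The sketch below shows one way to accept both, assuming the two-ideograph spelling is U+4EE4 U+548C; the helper name `contains_reiwa` and the sample strings are hypothetical, for illustration only.

```python
REIWA_SQUARE = "\u32ff"            # U+32FF SQUARE ERA NAME REIWA (ligated form)
REIWA_IDEOGRAPHS = "\u4ee4\u548c"  # U+4EE4 U+548C (two unified CJK ideographs)


def contains_reiwa(text: str) -> bool:
    """Return True if text spells the era name in either representation."""
    return REIWA_SQUARE in text or REIWA_IDEOGRAPHS in text


if __name__ == "__main__":
    # Hypothetical date strings using each representation of the era name.
    for sample in ("\u32ff\u5143\u5e74", "\u4ee4\u548c\u5143\u5e74", "2019"):
        print(repr(sample), contains_reiwa(sample))
```

Checking both spellings explicitly avoids a dependency on normalization tables that may not yet include the new character.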

Normalization Changes

Note that U+32FF has a compatibility decomposition to a sequence of two unified CJK ideographs. This addition will impact tables supporting Unicode normalization.
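The effect on normalization can be observed directly. The following is a minimal sketch, assuming a Python runtime whose `unicodedata` tables are at Unicode 12.1 or later; the helper `code_points` is hypothetical. Because the decomposition is a compatibility decomposition, only the compatibility forms (NFKC and NFKD) replace U+32FF, while the canonical forms (NFC and NFD) leave it unchanged.

```python
import unicodedata

REIWA = "\u32ff"  # U+32FF SQUARE ERA NAME REIWA


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # The raw decomposition mapping recorded in the character database.
    print("decomposition:", unicodedata.decomposition(REIWA))
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, "->", code_points(unicodedata.normalize(form, REIWA)))
    # With Unicode 12.1+ data this prints:
    # decomposition: <square> 4EE4 548C
    # NFC -> U+32FF
    # NFD -> U+32FF
    # NFKC -> U+4EE4 U+548C
    # NFKD -> U+4EE4 U+548C
```

A runtime with pre-12.1 tables would leave U+32FF unchanged under all four forms, so compatibility-normalized comparisons can give different results across systems until the tables are updated.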