12.3.5 Column Character Set and Collation (original) (raw)

12.3.5 Column Character Set and Collation

Every “character” column (that is, a column of typeCHAR,VARCHAR, aTEXT type, or any synonym) has a column character set and a column collation. Column definition syntax for CREATE TABLE andALTER TABLE has optional clauses for specifying the column character set and collation:

col_name {CHAR | VARCHAR | TEXT} (col_length)
    [CHARACTER SET charset_name]
    [COLLATE collation_name]

These clauses can also be used forENUM andSET columns:

col_name {ENUM | SET} (val_list)
    [CHARACTER SET charset_name]
    [COLLATE collation_name]

Examples:

CREATE TABLE t1
(
    col1 VARCHAR(5)
      CHARACTER SET latin1
      COLLATE latin1_german1_ci
);

ALTER TABLE t1 MODIFY
    col1 VARCHAR(5)
      CHARACTER SET latin1
      COLLATE latin1_swedish_ci;

MySQL chooses the column character set and collation in the following manner:

CREATE TABLE t1  
(  
    col1 CHAR(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci  
) CHARACTER SET latin1 COLLATE latin1_bin;  

The character set and collation are specified for the column, so they are used. The column has character setutf8mb4 and collationutf8mb4_unicode_ci.

CREATE TABLE t1  
(  
    col1 CHAR(10) CHARACTER SET utf8mb4  
) CHARACTER SET latin1 COLLATE latin1_bin;  

The character set is specified for the column, but the collation is not. The column has character setutf8mb4 and the default collation forutf8mb4, which isutf8mb4_0900_ai_ci. To see the default collation for each character set, use theSHOW CHARACTER SET statement or query the INFORMATION_SCHEMA CHARACTER_SETS table.

CREATE TABLE t1  
(  
    col1 CHAR(10) COLLATE utf8mb4_polish_ci  
) CHARACTER SET latin1 COLLATE latin1_bin;  

The collation is specified for the column, but the character set is not. The column has collationutf8mb4_polish_ci and the character set is the one associated with the collation, which isutf8mb4.

CREATE TABLE t1  
(  
    col1 CHAR(10)  
) CHARACTER SET latin1 COLLATE latin1_bin;  

Neither the character set nor collation is specified for the column, so the table defaults are used. The column has character set latin1 and collationlatin1_bin.

The CHARACTER SET andCOLLATE clauses are standard SQL.

If you use ALTER TABLE to convert a column from one character set to another, MySQL attempts to map the data values, but if the character sets are incompatible, there may be data loss.