char - Rust

A character type.

The char type represents a single character. More specifically, since ‘character’ isn’t a well-defined concept in Unicode, char is a ‘Unicode scalar value’.

This documentation describes a number of methods and trait implementations on the char type. For technical reasons, there is additional, separate documentation in the std::char module as well.

§Validity and Layout

A char is a ‘Unicode scalar value’, which is any ‘Unicode code point’ other than a surrogate code point. This has a fixed numerical definition: code points are in the range 0 to 0x10FFFF, inclusive. Surrogate code points, used by UTF-16, are in the range 0xD800 to 0xDFFF.

No char may be constructed, whether as a literal or at runtime, that is not a Unicode scalar value. Violating this rule causes undefined behavior.

// Each of these is a compiler error
['\u{D800}', '\u{DFFF}', '\u{110000}'];

// Panics; from_u32 returns None.
char::from_u32(0xDE01).unwrap();
// Undefined behavior
let _ = unsafe { char::from_u32_unchecked(0x110000) };

Unicode scalar values are also the exact set of values that may be encoded in UTF-8. Because char values are Unicode scalar values and functions may assume incoming str values are valid UTF-8, it is safe to store any char in a str or read any character from a str as a char.

The gap in valid char values is understood by the compiler, so in the below example the two ranges are understood to cover the whole range of possible char values and there is no error for a non-exhaustive match.

let c: char = 'a';
match c {
    '\0' ..= '\u{D7FF}' => false,
    '\u{E000}' ..= '\u{10FFFF}' => true,
};

All Unicode scalar values are valid char values, but not all of them represent a real character. Many Unicode scalar values are not currently assigned to a character, but may be in the future (“reserved”); some will never be a character (“noncharacters”); and some may be given different meanings by different users (“private use”).
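The categories above can be told apart by code point value alone. The following is a minimal sketch, not part of the standard library: `is_noncharacter` and `is_private_use` are hypothetical helper names, and the ranges are the ones fixed by the Unicode Standard (66 noncharacters and three private-use areas).

```rust
/// Hypothetical helper: true for the 66 code points Unicode designates as noncharacters.
fn is_noncharacter(c: char) -> bool {
    let cp = u32::from(c);
    // U+FDD0..=U+FDEF, plus the last two code points (xxFFFE, xxFFFF) of each of the 17 planes.
    (0xFDD0..=0xFDEF).contains(&cp) || (cp & 0xFFFE) == 0xFFFE
}

/// Hypothetical helper: true for the three private-use ranges.
fn is_private_use(c: char) -> bool {
    matches!(
        c,
        '\u{E000}'..='\u{F8FF}' | '\u{F0000}'..='\u{FFFFD}' | '\u{100000}'..='\u{10FFFD}'
    )
}
```

Both kinds of value are still valid char values, so literals such as '\u{FFFF}' compile without error.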

char is guaranteed to have the same size, alignment, and function call ABI as u32 on all platforms.

use std::alloc::Layout;
assert_eq!(Layout::new::<char>(), Layout::new::<u32>());

§Representation

char is always four bytes in size. This is a different representation than a given character would have as part of a String. For example:

let v = vec!['h', 'e', 'l', 'l', 'o'];

// five elements times four bytes for each element
assert_eq!(20, v.len() * std::mem::size_of::<char>());

let s = String::from("hello");

// five elements times one byte per element
assert_eq!(5, s.len() * std::mem::size_of::<u8>());

As always, remember that a human intuition for ‘character’ might not map to Unicode’s definitions. For example, despite looking similar, the ‘é’ character is one Unicode code point while ‘é’ is two Unicode code points:

let mut chars = "é".chars();
// U+00e9: 'latin small letter e with acute'
assert_eq!(Some('\u{00e9}'), chars.next());
assert_eq!(None, chars.next());

let mut chars = "e\u{0301}".chars();
// U+0065: 'latin small letter e'
assert_eq!(Some('\u{0065}'), chars.next());
// U+0301: 'combining acute accent'
assert_eq!(Some('\u{0301}'), chars.next());
assert_eq!(None, chars.next());

This means that the contents of the first string above will fit into a char while the contents of the second string will not. Trying to create a char literal with the contents of the second string gives an error:

error: character literal may only contain one codepoint: 'é'
let c = 'é';
        ^^^

Another implication of the 4-byte fixed size of a char is that per-char processing can end up using a lot more memory:

let s = String::from("love: ❤️");
let v: Vec<char> = s.chars().collect();

assert_eq!(12, std::mem::size_of_val(&s[..]));
assert_eq!(32, std::mem::size_of_val(&v[..]));

char::MIN · 1.83.0

The lowest valid code point a char can have, '\0'.

Unlike integer types, char actually has a gap in the middle, meaning that the range of possible chars is smaller than you might expect. Ranges of char will automatically hop this gap for you:

let dist = u32::from(char::MAX) - u32::from(char::MIN);
let size = (char::MIN..=char::MAX).count() as u32;
assert!(size < dist);

Despite this gap, the MIN and MAX values can be used as bounds for all char values.

§Examples
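// Placeholder for any function that returns a char (hypothetical).
fn something_which_returns_char() -> char { 'a' }
let c: char = something_which_returns_char();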
assert!(char::MIN <= c);

let value_at_min = u32::from(char::MIN);
assert_eq!(char::from_u32(value_at_min), Some('\0'));
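The size of the gap can be checked with a little arithmetic. This sketch assumes nothing beyond the constants and ranges described above; the function names are illustrative only.

```rust
/// Number of distinct char values, found by iterating the full range.
fn char_count() -> u32 {
    (char::MIN..=char::MAX).count() as u32
}

/// The same number computed arithmetically: all code points minus the surrogates.
fn char_count_by_arithmetic() -> u32 {
    let code_points = u32::from(char::MAX) - u32::from(char::MIN) + 1;
    let surrogates = 0xDFFF - 0xD800 + 1;
    code_points - surrogates
}
```

The two results agree because the range iterator skips exactly the 2048 surrogate code points.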

char::MAX · 1.52.0

The highest valid code point a char can have, '\u{10FFFF}'.

Unlike integer types, char actually has a gap in the middle, meaning that the range of possible chars is smaller than you might expect. Ranges of char will automatically hop this gap for you:

let dist = u32::from(char::MAX) - u32::from(char::MIN);
let size = (char::MIN..=char::MAX).count() as u32;
assert!(size < dist);

Despite this gap, the MIN and MAX values can be used as bounds for all char values.

§Examples
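// Placeholder for any function that returns a char (hypothetical).
fn something_which_returns_char() -> char { 'a' }
let c: char = something_which_returns_char();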
assert!(c <= char::MAX);

let value_at_max = u32::from(char::MAX);
assert_eq!(char::from_u32(value_at_max), Some('\u{10FFFF}'));
assert_eq!(char::from_u32(value_at_max + 1), None);

char::REPLACEMENT_CHARACTER · 1.52.0

U+FFFD REPLACEMENT CHARACTER (�) is used in Unicode to represent a decoding error.

It can occur, for example, when giving ill-formed UTF-8 bytes to String::from_utf8_lossy.
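As a small illustration of that case, the sketch below counts how many replacement characters a lossy decode produced. `count_replacements` is a hypothetical helper; note that it also counts any U+FFFD that was genuinely present in the input.

```rust
/// Hypothetical helper: counts U+FFFD characters after lossy UTF-8 decoding.
fn count_replacements(bytes: &[u8]) -> usize {
    String::from_utf8_lossy(bytes)
        .chars()
        .filter(|&c| c == char::REPLACEMENT_CHARACTER)
        .count()
}
```

For instance, the lone byte 0xFF can never appear in well-formed UTF-8, so the input b"a\xFFb" yields one replacement character.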

char::UNICODE_VERSION · 1.52.0

The version of Unicode that the Unicode parts of char and str methods are based on.

New versions of Unicode are released regularly and subsequently all methods in the standard library depending on Unicode are updated. Therefore the behavior of some char and str methods and the value of this constant change over time. This is not considered to be a breaking change.

The version numbering scheme is explained in Unicode 11.0 or later, Section 3.1 Versions of the Unicode Standard.

char::decode_utf16 · 1.52.0

Creates an iterator over the native endian UTF-16 encoded code points in iter, returning unpaired surrogates as Errs.

§Examples

Basic usage:

// 𝄞mus<invalid>ic<invalid>
let v = [
    0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
];

assert_eq!(
    char::decode_utf16(v)
        .map(|r| r.map_err(|e| e.unpaired_surrogate()))
        .collect::<Vec<_>>(),
    vec![
        Ok('𝄞'),
        Ok('m'), Ok('u'), Ok('s'),
        Err(0xDD1E),
        Ok('i'), Ok('c'),
        Err(0xD834)
    ]
);

A lossy decoder can be obtained by replacing Err results with the replacement character:

// 𝄞mus<invalid>ic<invalid>
let v = [
    0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
];

assert_eq!(
    char::decode_utf16(v)
       .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
       .collect::<String>(),
    "𝄞mus�ic�"
);
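Besides the per-item and lossy styles shown above, the results can also be collected into a single Result to get a strict decoder. This is a sketch; `decode_strict` is a hypothetical name.

```rust
use std::char::DecodeUtf16Error;

/// Hypothetical helper: decodes UTF-16 code units, failing on the first unpaired surrogate.
fn decode_strict(units: &[u16]) -> Result<String, DecodeUtf16Error> {
    // Collecting an iterator of Result<char, E> into Result<String, E> stops at the first Err.
    char::decode_utf16(units.iter().copied()).collect()
}
```

Well-formed input, such as the output of str::encode_utf16, round-trips unchanged through this function.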

char::from_u32 · 1.52.0 (const: 1.67.0)

Converts a u32 to a char.

Note that all chars are valid u32s, and can be cast to one with as:

let c = '💯';
let i = c as u32;

assert_eq!(128175, i);

However, the reverse is not true: not all valid u32s are valid chars. from_u32() will return None if the input is not a valid value for a char.

For an unsafe version of this function which ignores these checks, see from_u32_unchecked.

§Examples

Basic usage:

let c = char::from_u32(0x2764);

assert_eq!(Some('❤'), c);

Returning None when the input is not a valid char:

let c = char::from_u32(0x110000);

assert_eq!(None, c);
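Because from_u32 returns an Option, it chains naturally with other fallible steps. The sketch below parses the conventional "U+XXXX" notation; `parse_code_point` is a hypothetical helper, not a standard library function.

```rust
/// Hypothetical helper: parses text such as "U+2764" into a char.
fn parse_code_point(text: &str) -> Option<char> {
    let hex = text.strip_prefix("U+")?;
    let value = u32::from_str_radix(hex, 16).ok()?;
    // Rejects surrogates and values above 0x10FFFF.
    char::from_u32(value)
}
```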

char::from_u32_unchecked · 1.52.0 (const: 1.81.0)

Converts a u32 to a char, ignoring validity.

Note that all chars are valid u32s, and can be cast to one with as:

let c = '💯';
let i = c as u32;

assert_eq!(128175, i);

However, the reverse is not true: not all valid u32s are valid chars. from_u32_unchecked() will ignore this, and blindly cast to char, possibly creating an invalid one.

§Safety

This function is unsafe, as it may construct invalid char values.

For a safe version of this function, see the from_u32 function.

§Examples

Basic usage:

let c = unsafe { char::from_u32_unchecked(0x2764) };

assert_eq!('❤', c);

char::from_digit · 1.52.0 (const: 1.67.0)

Converts a digit in the given radix to a char.

A ‘radix’ here is sometimes also called a ‘base’. A radix of two indicates a binary number, a radix of ten, decimal, and a radix of sixteen, hexadecimal, to give some common values. Arbitrary radices are supported.

from_digit() will return None if the input is not a digit in the given radix.

§Panics

Panics if given a radix larger than 36.

§Examples

Basic usage:

let c = char::from_digit(4, 10);

assert_eq!(Some('4'), c);

// Decimal 11 is a single digit in base 16
let c = char::from_digit(11, 16);

assert_eq!(Some('b'), c);

Returning None when the input is not a digit:

let c = char::from_digit(20, 10);

assert_eq!(None, c);

Passing a large radix, causing a panic:

// this panics
let _c = char::from_digit(1, 37);
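One common use of from_digit is rendering a number in an arbitrary radix. The following is a minimal sketch under that assumption; `to_radix_string` is a hypothetical helper.

```rust
/// Hypothetical helper: formats `n` in the given radix (2..=36) using lowercase digits.
fn to_radix_string(mut n: u32, radix: u32) -> String {
    assert!((2..=36).contains(&radix), "radix must be in 2..=36");
    let mut digits = Vec::new();
    loop {
        // n % radix is always below radix, so from_digit cannot return None here.
        digits.push(char::from_digit(n % radix, radix).expect("remainder is a valid digit"));
        n /= radix;
        if n == 0 {
            break;
        }
    }
    // Digits were produced least-significant first.
    digits.iter().rev().collect()
}
```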

char::is_digit · 1.0.0 (const: unstable)

Checks if a char is a digit in the given radix.

A ‘radix’ here is sometimes also called a ‘base’. A radix of two indicates a binary number, a radix of ten, decimal, and a radix of sixteen, hexadecimal, to give some common values. Arbitrary radices are supported.

Compared to is_numeric(), this function only recognizes the characters 0-9, a-z and A-Z.

‘Digit’ is defined to be only the following characters: 0-9, a-z, and A-Z.

For a more comprehensive understanding of ‘digit’, see is_numeric().

§Panics

Panics if given a radix smaller than 2 or larger than 36.

§Examples

Basic usage:

assert!('1'.is_digit(10));
assert!('f'.is_digit(16));
assert!(!'f'.is_digit(10));

Passing a large radix, causing a panic:

// this panics
'1'.is_digit(37);

Passing a small radix, causing a panic:

// this panics
'1'.is_digit(1);

char::to_digit · 1.0.0 (const: 1.67.0)

Converts a char to a digit in the given radix.

A ‘radix’ here is sometimes also called a ‘base’. A radix of two indicates a binary number, a radix of ten, decimal, and a radix of sixteen, hexadecimal, to give some common values. Arbitrary radices are supported.

‘Digit’ is defined to be only the following characters: 0-9, a-z, and A-Z.

§Errors

Returns None if the char does not refer to a digit in the given radix.

§Panics

Panics if given a radix smaller than 2 or larger than 36.

§Examples

Basic usage:

assert_eq!('1'.to_digit(10), Some(1));
assert_eq!('f'.to_digit(16), Some(15));

Passing a non-digit results in failure:

assert_eq!('f'.to_digit(10), None);
assert_eq!('z'.to_digit(16), None);

Passing a large radix, causing a panic:

// this panics
let _ = '1'.to_digit(37);

Passing a small radix, causing a panic:

// this panics
let _ = '1'.to_digit(1);
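In the other direction, to_digit can drive a small number parser. This is only a sketch: `parse_radix` is a hypothetical helper, and returning None for empty input and on overflow is a design choice made here.

```rust
/// Hypothetical helper: parses an unsigned number in the given radix (2..=36).
fn parse_radix(text: &str, radix: u32) -> Option<u32> {
    if text.is_empty() {
        return None;
    }
    text.chars().try_fold(0u32, |acc, c| {
        // A non-digit character or an overflow aborts the fold with None.
        let digit = c.to_digit(radix)?;
        acc.checked_mul(radix)?.checked_add(digit)
    })
}
```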

char::escape_unicode · 1.0.0

Returns an iterator that yields the hexadecimal Unicode escape of a character as chars.

This will escape characters with the Rust syntax of the form \u{NNNNNN} where NNNNNN is a hexadecimal representation.

§Examples

As an iterator:

for c in '❤'.escape_unicode() {
    print!("{c}");
}
println!();

Using println! directly:

println!("{}", '❤'.escape_unicode());

Both are equivalent to:

println!("\\u{{2764}}");

Using to_string:

assert_eq!('❤'.escape_unicode().to_string(), "\\u{2764}");
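Since the escape is itself an iterator of chars, it can be appended straight onto a String. The sketch below escapes only the non-ASCII characters of a string; `escape_non_ascii` is a hypothetical helper.

```rust
/// Hypothetical helper: keeps ASCII as-is and replaces everything else with \u{...} escapes.
fn escape_non_ascii(text: &str) -> String {
    let mut out = String::new();
    for c in text.chars() {
        if c.is_ascii() {
            out.push(c);
        } else {
            // String implements Extend<char>, so the iterator can be appended directly.
            out.extend(c.escape_unicode());
        }
    }
    out
}
```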

char::escape_debug · 1.20.0

Returns an iterator that yields the literal escape code of a character as chars.

This will escape the characters similar to the Debug implementations of str or char.

§Examples

As an iterator:

for c in '\n'.escape_debug() {
    print!("{c}");
}
println!();

Using println! directly:

println!("{}", '\n'.escape_debug());

Both are equivalent to:

println!("\\n");

Using to_string:

assert_eq!('\n'.escape_debug().to_string(), "\\n");

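char::escape_default · 1.0.0

Returns an iterator that yields the literal escape code of a character as chars.

The default is chosen with a bias toward producing literals that are legal in a variety of languages, including C++11 and similar C-family languages. The exact rules are: tab is escaped as \t; carriage return is escaped as \r; line feed is escaped as \n; single quote is escaped as \'; double quote is escaped as \"; backslash is escaped as \\; any character in the ‘printable ASCII’ range 0x20 to 0x7e inclusive is not escaped; all other characters are given hexadecimal Unicode escapes (see escape_unicode).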

§Examples

As an iterator:

for c in '"'.escape_default() {
    print!("{c}");
}
println!();

Using println! directly:

println!("{}", '"'.escape_default());

Both are equivalent to:

println!("\\\"");

Using to_string:

assert_eq!('"'.escape_default().to_string(), "\\\"");

char::len_utf8 · 1.0.0 (const: 1.52.0)

Returns the number of bytes this char would need if encoded in UTF-8.

That number of bytes is always between 1 and 4, inclusive.

§Examples

Basic usage:

let len = 'A'.len_utf8();
assert_eq!(len, 1);

let len = 'ß'.len_utf8();
assert_eq!(len, 2);

let len = 'ℝ'.len_utf8();
assert_eq!(len, 3);

let len = '💣'.len_utf8();
assert_eq!(len, 4);

The &str type guarantees that its contents are UTF-8, and so we can compare the length it would take if each code point was represented as a char vs in the &str itself:

// as chars
let eastern = '東';
let capital = '京';

// both can be represented as three bytes
assert_eq!(3, eastern.len_utf8());
assert_eq!(3, capital.len_utf8());

// as a &str, these two are encoded in UTF-8
let tokyo = "東京";

let len = eastern.len_utf8() + capital.len_utf8();

// we can see that they take six bytes total...
assert_eq!(6, tokyo.len());

// ... just like the &str
assert_eq!(len, tokyo.len());
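The same relationship gives a way to turn a character index into a byte index. This is a sketch; `byte_offset_of_char` is a hypothetical helper, and for real code str::char_indices may be the more direct tool.

```rust
/// Hypothetical helper: byte offset at which the n-th char of `text` starts.
/// If `text` has fewer than `n` chars, the result is `text.len()`.
fn byte_offset_of_char(text: &str, n: usize) -> usize {
    text.chars().take(n).map(char::len_utf8).sum()
}
```

The returned offset always falls on a char boundary, so it is safe to use for slicing.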

char::len_utf16 · 1.0.0 (const: 1.52.0)

Returns the number of 16-bit code units this char would need if encoded in UTF-16.

That number of code units is always either 1 or 2, for Unicode scalar values in the Basic Multilingual Plane or supplementary planes, respectively.

See the documentation for len_utf8() for more explanation of this concept. This function is a mirror, but for UTF-16 instead of UTF-8.

§Examples

Basic usage:

let n = 'ß'.len_utf16();
assert_eq!(n, 1);

let len = '💣'.len_utf16();
assert_eq!(len, 2);
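Summing len_utf16 over a string gives the size of its UTF-16 encoding without building it. A minimal sketch, with `utf16_len` as a hypothetical helper name:

```rust
/// Hypothetical helper: number of u16 code units `text` would occupy in UTF-16.
fn utf16_len(text: &str) -> usize {
    text.chars().map(char::len_utf16).sum()
}
```

This should match the number of items yielded by str::encode_utf16 for the same text.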

char::encode_utf8 · 1.15.0 (const: 1.83.0)

Encodes this character as UTF-8 into the provided byte buffer, and then returns the subslice of the buffer that contains the encoded character.

§Panics

Panics if the buffer is not large enough. A buffer of length four is large enough to encode any char.

§Examples

In both of these examples, ‘ß’ takes two bytes to encode.

let mut b = [0; 2];

let result = 'ß'.encode_utf8(&mut b);

assert_eq!(result, "ß");

assert_eq!(result.len(), 2);

A buffer that’s too small:

let mut b = [0; 1];

// this panics
'ß'.encode_utf8(&mut b);
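A four-byte scratch buffer can be reused to avoid the panic altogether. The sketch below appends the UTF-8 encoding of several chars to a byte vector; `encode_to_bytes` is a hypothetical helper.

```rust
/// Hypothetical helper: UTF-8 encodes each char into one growing byte vector.
fn encode_to_bytes(chars: &[char]) -> Vec<u8> {
    // Four bytes is enough for any char, so encode_utf8 cannot panic here.
    let mut buf = [0u8; 4];
    let mut out = Vec::new();
    for &c in chars {
        out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
    }
    out
}
```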

char::encode_utf16 · 1.15.0 (const: 1.84.0)

Encodes this character as native endian UTF-16 into the provided u16 buffer, and then returns the subslice of the buffer that contains the encoded character.

§Panics

Panics if the buffer is not large enough. A buffer of length 2 is large enough to encode any char.

§Examples

In both of these examples, ‘𝕊’ takes two u16s to encode.

let mut b = [0; 2];

let result = '𝕊'.encode_utf16(&mut b);

assert_eq!(result.len(), 2);

A buffer that’s too small:

let mut b = [0; 1];

// this panics
'𝕊'.encode_utf16(&mut b);

char::is_alphabetic · 1.0.0

Returns true if this char has the Alphabetic property.

Alphabetic is described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database DerivedCoreProperties.txt.

§Examples

Basic usage:

assert!('a'.is_alphabetic());
assert!('京'.is_alphabetic());

let c = '💝';
// love is many things, but it is not alphabetic
assert!(!c.is_alphabetic());

char::is_lowercase · 1.0.0 (const: 1.84.0)

Returns true if this char has the Lowercase property.

Lowercase is described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database DerivedCoreProperties.txt.

§Examples

Basic usage:

assert!('a'.is_lowercase());
assert!('δ'.is_lowercase());
assert!(!'A'.is_lowercase());
assert!(!'Δ'.is_lowercase());

// The various Chinese scripts and punctuation do not have case, and so:
assert!(!'中'.is_lowercase());
assert!(!' '.is_lowercase());

In a const context:

const CAPITAL_DELTA_IS_LOWERCASE: bool = 'Δ'.is_lowercase();
assert!(!CAPITAL_DELTA_IS_LOWERCASE);

char::is_uppercase · 1.0.0 (const: 1.84.0)

Returns true if this char has the Uppercase property.

Uppercase is described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database DerivedCoreProperties.txt.

§Examples

Basic usage:

assert!(!'a'.is_uppercase());
assert!(!'δ'.is_uppercase());
assert!('A'.is_uppercase());
assert!('Δ'.is_uppercase());

// The various Chinese scripts and punctuation do not have case, and so:
assert!(!'中'.is_uppercase());
assert!(!' '.is_uppercase());

In a const context:

const CAPITAL_DELTA_IS_UPPERCASE: bool = 'Δ'.is_uppercase();
assert!(CAPITAL_DELTA_IS_UPPERCASE);

char::is_whitespace · 1.0.0 (const: unstable)

Returns true if this char has the White_Space property.

White_Space is specified in the Unicode Character Database PropList.txt.

§Examples

Basic usage:

assert!(' '.is_whitespace());

// line break
assert!('\n'.is_whitespace());

// a non-breaking space
assert!('\u{A0}'.is_whitespace());

assert!(!'越'.is_whitespace());

char::is_alphanumeric · 1.0.0

Returns true if this char satisfies either is_alphabetic() or is_numeric().

§Examples

Basic usage:

assert!('٣'.is_alphanumeric());
assert!('7'.is_alphanumeric());
assert!('৬'.is_alphanumeric());
assert!('¾'.is_alphanumeric());
assert!('①'.is_alphanumeric());
assert!('K'.is_alphanumeric());
assert!('و'.is_alphanumeric());
assert!('藏'.is_alphanumeric());

char::is_control · 1.0.0

Returns true if this char has the general category for control codes.

Control codes (code points with the general category of Cc) are described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database UnicodeData.txt.

§Examples

Basic usage:

// U+009C, STRING TERMINATOR
assert!('\u{009c}'.is_control());
assert!(!'q'.is_control());

char::is_numeric · 1.0.0

Returns true if this char has one of the general categories for numbers.

The general categories for numbers (Nd for decimal digits, Nl for letter-like numeric characters, and No for other numeric characters) are specified in the Unicode Character Database UnicodeData.txt.

This method doesn’t cover everything that could be considered a number, e.g. ideographic numbers like ‘三’. If you want everything including characters with overlapping purposes then you might want to use a Unicode or language-processing library that exposes the appropriate character properties instead of looking at the Unicode categories.

If you want to parse ASCII decimal digits (0-9) or ASCII base-N, use is_ascii_digit or is_digit instead.

§Examples

Basic usage:

assert!('٣'.is_numeric());
assert!('7'.is_numeric());
assert!('৬'.is_numeric());
assert!('¾'.is_numeric());
assert!('①'.is_numeric());
assert!(!'K'.is_numeric());
assert!(!'و'.is_numeric());
assert!(!'藏'.is_numeric());
assert!(!'三'.is_numeric());

char::to_lowercase · 1.0.0

Returns an iterator that yields the lowercase mapping of this char as one or more chars.

If this char does not have a lowercase mapping, the iterator yields the same char.

If this char has a one-to-one lowercase mapping given by the Unicode Character Database UnicodeData.txt, the iterator yields that char.

If this char requires special considerations (e.g. multiple chars) the iterator yields the char(s) given by SpecialCasing.txt.

This operation performs an unconditional mapping without tailoring. That is, the conversion is independent of context and language.

In the Unicode Standard, Chapter 4 (Character Properties) discusses case mapping in general and Chapter 3 (Conformance) discusses the default algorithm for case conversion.

§Examples

As an iterator:

for c in 'İ'.to_lowercase() {
    print!("{c}");
}
println!();

Using println! directly:

println!("{}", 'İ'.to_lowercase());

Both are equivalent to:

println!("i\u{307}");

Using to_string:

assert_eq!('C'.to_lowercase().to_string(), "c");

// Sometimes the result is more than one character:
assert_eq!('İ'.to_lowercase().to_string(), "i\u{307}");

// Characters that do not have both uppercase and lowercase
// convert into themselves.
assert_eq!('山'.to_lowercase().to_string(), "山");
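Because each char may map to several chars, lowercasing a whole string char by char needs flat_map rather than map. The sketch below shows this; `lowercase_per_char` is a hypothetical helper. Being context-independent, it can differ from str::to_lowercase, which treats a word-final Greek capital sigma specially.

```rust
/// Hypothetical helper: applies char::to_lowercase to every char independently.
fn lowercase_per_char(text: &str) -> String {
    // Each ToLowercase iterator yields one or more chars; flat_map concatenates them.
    text.chars().flat_map(char::to_lowercase).collect()
}
```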

char::to_uppercase · 1.0.0

Returns an iterator that yields the uppercase mapping of this char as one or more chars.

If this char does not have an uppercase mapping, the iterator yields the same char.

If this char has a one-to-one uppercase mapping given by the Unicode Character Database UnicodeData.txt, the iterator yields that char.

If this char requires special considerations (e.g. multiple chars) the iterator yields the char(s) given by SpecialCasing.txt.

This operation performs an unconditional mapping without tailoring. That is, the conversion is independent of context and language.

In the Unicode Standard, Chapter 4 (Character Properties) discusses case mapping in general and Chapter 3 (Conformance) discusses the default algorithm for case conversion.

§Examples

As an iterator:

for c in 'ß'.to_uppercase() {
    print!("{c}");
}
println!();

Using println! directly:

println!("{}", 'ß'.to_uppercase());

Both are equivalent to:

println!("SS");

Using to_string:

assert_eq!('c'.to_uppercase().to_string(), "C");

// Sometimes the result is more than one character:
assert_eq!('ß'.to_uppercase().to_string(), "SS");

// Characters that do not have both uppercase and lowercase
// convert into themselves.
assert_eq!('山'.to_uppercase().to_string(), "山");
§Note on locale

In Turkish, the equivalent of ‘i’ in Latin has five forms instead of two: ‘Dotless’: I / ı, sometimes written ï; ‘Dotted’: İ / i.

Note that the lowercase dotted ‘i’ is the same as the Latin. Therefore:

let upper_i = 'i'.to_uppercase().to_string();

The value of upper_i here relies on the language of the text: if we’re in en-US, it should be "I", but if we’re in tr_TR, it should be "İ". to_uppercase() does not take this into account, and so:

let upper_i = 'i'.to_uppercase().to_string();

assert_eq!(upper_i, "I");

holds across languages.
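Code that needs the Turkish behavior therefore has to special-case it. The following is only a sketch of such tailoring: `to_uppercase_turkish` is a hypothetical helper, and a real application would more likely rely on a dedicated internationalization library.

```rust
/// Hypothetical helper: uppercases one char with the Turkish rule for dotted 'i'.
fn to_uppercase_turkish(c: char) -> String {
    match c {
        // Assumption: in Turkish text, dotted 'i' uppercases to dotted 'İ' (U+0130).
        'i' => "\u{130}".to_string(),
        // Everything else, including dotless 'ı' (which maps to 'I'), uses the default mapping.
        _ => c.to_uppercase().to_string(),
    }
}
```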

char::is_ascii · 1.23.0 (const: 1.32.0)

Checks if the value is within the ASCII range.

§Examples
let ascii = 'a';
let non_ascii = '❤';

assert!(ascii.is_ascii());
assert!(!non_ascii.is_ascii());

char::as_ascii

🔬This is a nightly-only experimental API. (ascii_char #110998)

Returns Some if the value is within the ASCII range, or None if it’s not.

This is preferred to Self::is_ascii when you’re passing the value along to something else that can take ascii::Char rather than needing to check again for itself whether the value is in ASCII.

char::to_ascii_uppercase · 1.23.0 (const: 1.52.0)

Makes a copy of the value in its ASCII upper case equivalent.

ASCII letters ‘a’ to ‘z’ are mapped to ‘A’ to ‘Z’, but non-ASCII letters are unchanged.

To uppercase the value in-place, use make_ascii_uppercase().

To uppercase ASCII characters in addition to non-ASCII characters, use to_uppercase().

§Examples
let ascii = 'a';
let non_ascii = '❤';

assert_eq!('A', ascii.to_ascii_uppercase());
assert_eq!('❤', non_ascii.to_ascii_uppercase());

char::to_ascii_lowercase · 1.23.0 (const: 1.52.0)

Makes a copy of the value in its ASCII lower case equivalent.

ASCII letters ‘A’ to ‘Z’ are mapped to ‘a’ to ‘z’, but non-ASCII letters are unchanged.

To lowercase the value in-place, use make_ascii_lowercase().

To lowercase ASCII characters in addition to non-ASCII characters, use to_lowercase().

§Examples
let ascii = 'A';
let non_ascii = '❤';

assert_eq!('a', ascii.to_ascii_lowercase());
assert_eq!('❤', non_ascii.to_ascii_lowercase());

char::eq_ignore_ascii_case · 1.23.0 (const: 1.52.0)

Checks that two values are an ASCII case-insensitive match.

Equivalent to to_ascii_lowercase(a) == to_ascii_lowercase(b).

§Examples
let upper_a = 'A';
let lower_a = 'a';
let lower_z = 'z';

assert!(upper_a.eq_ignore_ascii_case(&lower_a));
assert!(upper_a.eq_ignore_ascii_case(&upper_a));
assert!(!upper_a.eq_ignore_ascii_case(&lower_z));

char::make_ascii_uppercase · 1.23.0 (const: 1.84.0)

Converts this type to its ASCII upper case equivalent in-place.

ASCII letters ‘a’ to ‘z’ are mapped to ‘A’ to ‘Z’, but non-ASCII letters are unchanged.

To return a new uppercased value without modifying the existing one, use to_ascii_uppercase().

§Examples
let mut ascii = 'a';

ascii.make_ascii_uppercase();

assert_eq!('A', ascii);

char::make_ascii_lowercase · 1.23.0 (const: 1.84.0)

Converts this type to its ASCII lower case equivalent in-place.

ASCII letters ‘A’ to ‘Z’ are mapped to ‘a’ to ‘z’, but non-ASCII letters are unchanged.

To return a new lowercased value without modifying the existing one, use to_ascii_lowercase().

§Examples
let mut ascii = 'A';

ascii.make_ascii_lowercase();

assert_eq!('a', ascii);

char::is_ascii_alphabetic · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII alphabetic character: U+0041 ‘A’ ..= U+005A ‘Z’, or U+0061 ‘a’ ..= U+007A ‘z’.

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(uppercase_a.is_ascii_alphabetic());
assert!(uppercase_g.is_ascii_alphabetic());
assert!(a.is_ascii_alphabetic());
assert!(g.is_ascii_alphabetic());
assert!(!zero.is_ascii_alphabetic());
assert!(!percent.is_ascii_alphabetic());
assert!(!space.is_ascii_alphabetic());
assert!(!lf.is_ascii_alphabetic());
assert!(!esc.is_ascii_alphabetic());

char::is_ascii_uppercase · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII uppercase character: U+0041 ‘A’ ..= U+005A ‘Z’.

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(uppercase_a.is_ascii_uppercase());
assert!(uppercase_g.is_ascii_uppercase());
assert!(!a.is_ascii_uppercase());
assert!(!g.is_ascii_uppercase());
assert!(!zero.is_ascii_uppercase());
assert!(!percent.is_ascii_uppercase());
assert!(!space.is_ascii_uppercase());
assert!(!lf.is_ascii_uppercase());
assert!(!esc.is_ascii_uppercase());

char::is_ascii_lowercase · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII lowercase character: U+0061 ‘a’ ..= U+007A ‘z’.

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(!uppercase_a.is_ascii_lowercase());
assert!(!uppercase_g.is_ascii_lowercase());
assert!(a.is_ascii_lowercase());
assert!(g.is_ascii_lowercase());
assert!(!zero.is_ascii_lowercase());
assert!(!percent.is_ascii_lowercase());
assert!(!space.is_ascii_lowercase());
assert!(!lf.is_ascii_lowercase());
assert!(!esc.is_ascii_lowercase());

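char::is_ascii_alphanumeric · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII alphanumeric character: U+0041 ‘A’ ..= U+005A ‘Z’, or U+0061 ‘a’ ..= U+007A ‘z’, or U+0030 ‘0’ ..= U+0039 ‘9’.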

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(uppercase_a.is_ascii_alphanumeric());
assert!(uppercase_g.is_ascii_alphanumeric());
assert!(a.is_ascii_alphanumeric());
assert!(g.is_ascii_alphanumeric());
assert!(zero.is_ascii_alphanumeric());
assert!(!percent.is_ascii_alphanumeric());
assert!(!space.is_ascii_alphanumeric());
assert!(!lf.is_ascii_alphanumeric());
assert!(!esc.is_ascii_alphanumeric());

char::is_ascii_digit · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII decimal digit: U+0030 ‘0’ ..= U+0039 ‘9’.

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(!uppercase_a.is_ascii_digit());
assert!(!uppercase_g.is_ascii_digit());
assert!(!a.is_ascii_digit());
assert!(!g.is_ascii_digit());
assert!(zero.is_ascii_digit());
assert!(!percent.is_ascii_digit());
assert!(!space.is_ascii_digit());
assert!(!lf.is_ascii_digit());
assert!(!esc.is_ascii_digit());

char::is_ascii_octdigit

🔬This is a nightly-only experimental API. (is_ascii_octdigit #101288)

Checks if the value is an ASCII octal digit: U+0030 ‘0’ ..= U+0037 ‘7’.

§Examples
#![feature(is_ascii_octdigit)]

let uppercase_a = 'A';
let a = 'a';
let zero = '0';
let seven = '7';
let nine = '9';
let percent = '%';
let lf = '\n';

assert!(!uppercase_a.is_ascii_octdigit());
assert!(!a.is_ascii_octdigit());
assert!(zero.is_ascii_octdigit());
assert!(seven.is_ascii_octdigit());
assert!(!nine.is_ascii_octdigit());
assert!(!percent.is_ascii_octdigit());
assert!(!lf.is_ascii_octdigit());

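char::is_ascii_hexdigit · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII hexadecimal digit: U+0030 ‘0’ ..= U+0039 ‘9’, or U+0041 ‘A’ ..= U+0046 ‘F’, or U+0061 ‘a’ ..= U+0066 ‘f’.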

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(uppercase_a.is_ascii_hexdigit());
assert!(!uppercase_g.is_ascii_hexdigit());
assert!(a.is_ascii_hexdigit());
assert!(!g.is_ascii_hexdigit());
assert!(zero.is_ascii_hexdigit());
assert!(!percent.is_ascii_hexdigit());
assert!(!space.is_ascii_hexdigit());
assert!(!lf.is_ascii_hexdigit());
assert!(!esc.is_ascii_hexdigit());

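char::is_ascii_punctuation · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII punctuation character: U+0021 ..= U+002F ! " # $ % & ' ( ) * + , - . /, or U+003A ..= U+0040 : ; < = > ? @, or U+005B ..= U+0060 [ \ ] ^ _ `, or U+007B ..= U+007E { | } ~.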

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(!uppercase_a.is_ascii_punctuation());
assert!(!uppercase_g.is_ascii_punctuation());
assert!(!a.is_ascii_punctuation());
assert!(!g.is_ascii_punctuation());
assert!(!zero.is_ascii_punctuation());
assert!(percent.is_ascii_punctuation());
assert!(!space.is_ascii_punctuation());
assert!(!lf.is_ascii_punctuation());
assert!(!esc.is_ascii_punctuation());

char::is_ascii_graphic · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII graphic character: U+0021 ‘!’ ..= U+007E ‘~’.

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(uppercase_a.is_ascii_graphic());
assert!(uppercase_g.is_ascii_graphic());
assert!(a.is_ascii_graphic());
assert!(g.is_ascii_graphic());
assert!(zero.is_ascii_graphic());
assert!(percent.is_ascii_graphic());
assert!(!space.is_ascii_graphic());
assert!(!lf.is_ascii_graphic());
assert!(!esc.is_ascii_graphic());

char::is_ascii_whitespace · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII whitespace character: U+0020 SPACE, U+0009 HORIZONTAL TAB, U+000A LINE FEED, U+000C FORM FEED, or U+000D CARRIAGE RETURN.

Rust uses the WhatWG Infra Standard’s definition of ASCII whitespace. There are several other definitions in wide use. For instance, the POSIX locale includes U+000B VERTICAL TAB as well as all the above characters, but (from the very same specification) the default rule for “field splitting” in the Bourne shell considers only SPACE, HORIZONTAL TAB, and LINE FEED as whitespace.

If you are writing a program that will process an existing file format, check what that format’s definition of whitespace is before using this function.
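When a format follows the POSIX-locale definition instead, the difference is a single character. A minimal sketch, with `is_posix_space` as a hypothetical helper name:

```rust
/// Hypothetical helper: ASCII whitespace as in the POSIX locale,
/// that is, the WhatWG set plus U+000B VERTICAL TAB.
fn is_posix_space(c: char) -> bool {
    c.is_ascii_whitespace() || c == '\x0B'
}
```

Note that the Unicode-aware is_whitespace() does accept VERTICAL TAB, so the three predicates can disagree on the same input.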

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(!uppercase_a.is_ascii_whitespace());
assert!(!uppercase_g.is_ascii_whitespace());
assert!(!a.is_ascii_whitespace());
assert!(!g.is_ascii_whitespace());
assert!(!zero.is_ascii_whitespace());
assert!(!percent.is_ascii_whitespace());
assert!(space.is_ascii_whitespace());
assert!(lf.is_ascii_whitespace());
assert!(!esc.is_ascii_whitespace());

char::is_ascii_control · 1.24.0 (const: 1.47.0)

Checks if the value is an ASCII control character: U+0000 NUL ..= U+001F UNIT SEPARATOR, or U+007F DELETE. Note that most ASCII whitespace characters are control characters, but SPACE is not.

§Examples
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';

assert!(!uppercase_a.is_ascii_control());
assert!(!uppercase_g.is_ascii_control());
assert!(!a.is_ascii_control());
assert!(!g.is_ascii_control());
assert!(!zero.is_ascii_control());
assert!(!percent.is_ascii_control());
assert!(!space.is_ascii_control());
assert!(lf.is_ascii_control());
assert!(esc.is_ascii_control());

impl AsciiExt for char · 1.0.0

👎Deprecated since 1.26.0: use inherent methods instead

type Owned: Container type for copied ASCII characters.

is_ascii: Checks if the value is within the ASCII range.

to_ascii_uppercase: Makes a copy of the value in its ASCII upper case equivalent.

to_ascii_lowercase: Makes a copy of the value in its ASCII lower case equivalent.

eq_ignore_ascii_case: Checks that two values are an ASCII case-insensitive match.

make_ascii_uppercase: Converts this type to its ASCII upper case equivalent in-place.

make_ascii_lowercase: Converts this type to its ASCII lower case equivalent in-place.

1.0.0 · Source§

1.0.0 · Source§

1.0.0 · Source§

Source§

Returns the default value of \x00

1.0.0 · Source§

1.2.0 · Source§

Source§

Extends a collection with the contents of an iterator. Read more

Source§

🔬This is a nightly-only experimental API. (extend_one #72631)

Extends a collection with exactly one element.

Source§

🔬This is a nightly-only experimental API. (extend_one #72631)

Reserves capacity in a collection for the given number of additional elements. Read more

1.0.0 · Source§

Source§

Extends a collection with the contents of an iterator. Read more

Source§

🔬This is a nightly-only experimental API. (extend_one #72631)

Extends a collection with exactly one element.

Source§

🔬This is a nightly-only experimental API. (extend_one #72631)

Reserves capacity in a collection for the given number of additional elements. Read more
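
The item headers for the two Extend entries above are not shown here. Assuming they are the standard library's Extend<char> and Extend<&char> implementations for String, a minimal sketch of both looks like this (build is a hypothetical function used only for the demonstration):

```rust
/// Hypothetical demo function: grows a String from iterators of
/// owned `char` values and of `&char` references.
fn build() -> String {
    let mut s = String::from("ab");

    // Extending from an iterator of `char` values.
    s.extend(['c', 'd']);

    // Extending from an iterator of `&char` references.
    let more = ['e', 'f'];
    s.extend(more.iter());

    s
}

fn main() {
    assert_eq!(build(), "abcdef");
}
```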

Source§

Source§

Converts to this type from the input type.

1.46.0 · Source§

Source§

Allocates an owned String from a single character.

§Example
let c: char = 'a';
let s: String = String::from(c);
assert_eq!("a", &s[..]);

1.51.0 · Source§

Source§

Converts a char into a u128.

§Examples
use std::mem;

let c = '⚙';
let u = u128::from(c);
assert!(16 == mem::size_of_val(&u))

1.13.0 · Source§

Source§

Converts a char into a u32.

§Examples
use std::mem;

let c = 'c';
let u = u32::from(c);
assert!(4 == mem::size_of_val(&u))

1.51.0 · Source§

Source§

Converts a char into a u64.

§Examples
use std::mem;

let c = '👤';
let u = u64::from(c);
assert!(8 == mem::size_of_val(&u))

1.13.0 · Source§

Maps a byte in 0x00..=0xFF to a char whose code point has the same value, in U+0000..=U+00FF.

Unicode is designed such that this effectively decodes bytes with the character encoding that IANA calls ISO-8859-1. This encoding is compatible with ASCII.

Note that this is different from ISO/IEC 8859-1 a.k.a. ISO 8859-1 (with one fewer hyphen), which leaves some “blanks”, byte values that are not assigned to any character. ISO-8859-1 (the IANA one) assigns them to the C0 and C1 control codes.

Note that this is also different from Windows-1252 a.k.a. code page 1252, which is a superset of ISO/IEC 8859-1 that assigns some (not all!) blanks to punctuation and various Latin characters.

To confuse things further, on the Web ascii, iso-8859-1, and windows-1252 are all aliases for a superset of Windows-1252 that fills the remaining blanks with corresponding C0 and C1 control codes.

Source§

Converts a u8 into a char.

§Examples
use std::mem;

let u = 32_u8;
let c = char::from(u);
assert!(4 == mem::size_of_val(&c));
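
Because each byte maps to the code point with the same value, this conversion can serve as a simple ISO-8859-1 (IANA definition) decoder. The sketch below assumes that is the encoding of the input; decode_latin1 is a hypothetical helper name, not a standard library function.

```rust
/// Hypothetical helper: decodes bytes as ISO-8859-1 (the IANA definition)
/// by mapping every byte to the `char` with the same numeric value.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

fn main() {
    // 0xE9 is U+00E9 LATIN SMALL LETTER E WITH ACUTE.
    assert_eq!(decode_latin1(&[0x63, 0x61, 0x66, 0xE9]), "caf\u{E9}");
    // 0x80 becomes the C1 control code U+0080, not the euro sign
    // that Windows-1252 assigns to that byte.
    assert_eq!(decode_latin1(&[0x80]), "\u{80}");
}
```

This conversion never fails, so the result is only meaningful if the input really is ISO-8859-1 rather than, say, Windows-1252 or UTF-8.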

1.80.0 · Source§

1.17.0 · Source§

1.80.0 · Source§

Source§

1.12.0 · Source§

1.0.0 · Source§

1.20.0 · Source§

Source§

The associated error which can be returned from parsing.

Source§

Parses a string s to return a value of this type. Read more
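
Assuming the entry above is the FromStr implementation for char, parsing succeeds only when the string holds exactly one char. A minimal sketch, with parse_single as a hypothetical wrapper that discards the error value:

```rust
/// Hypothetical helper: parses a string that must contain exactly one
/// `char`, returning `None` for empty or longer input.
fn parse_single(s: &str) -> Option<char> {
    s.parse::<char>().ok()
}

fn main() {
    assert_eq!(parse_single("a"), Some('a'));
    // A single non-ASCII scalar value is still one `char`.
    assert_eq!(parse_single("\u{E9}"), Some('\u{E9}'));
    // Empty strings and strings with more than one `char` are rejected.
    assert_eq!(parse_single(""), None);
    assert_eq!(parse_single("ab"), None);
}
```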

1.0.0 · Source§

1.0.0 · Source§

1.0.0 · Source§

Source§

Tests for self and other values to be equal, and is used by ==.

Source§

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

1.0.0 · Source§

Source§

This method returns an ordering between self and other values if one exists. Read more

Source§

Tests less than (for self and other) and is used by the < operator. Read more

Source§

Tests less than or equal to (for self and other) and is used by the <= operator. Read more

Source§

Tests greater than or equal to (for self and other) and is used by the >= operator. Read more

Source§

Tests greater than (for self and other) and is used by the > operator. Read more

Source§

Searches for chars that are equal to a given char.

§Examples

assert_eq!("Hello world".find('o'), Some(4));
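
A char can be passed to other stable str methods that accept a pattern, not only find. The sketch below shows a few of them; count_char is a hypothetical helper name. Note that find reports a byte offset, not a character index.

```rust
/// Hypothetical helper: counts occurrences of `needle` in `haystack`
/// by using the `char` as a pattern for `str::matches`.
fn count_char(haystack: &str, needle: char) -> usize {
    haystack.matches(needle).count()
}

fn main() {
    // U+00E9 occupies two bytes in UTF-8, so the first 'l' is at byte 3.
    assert_eq!("h\u{E9}llo".find('l'), Some(3));

    // Splitting on a `char` keeps empty fields.
    let fields: Vec<&str> = "a,b,,c".split(',').collect();
    assert_eq!(fields, ["a", "b", "", "c"]);

    assert_eq!(count_char("Hello world", 'o'), 2);
}
```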

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Associated searcher for this pattern

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Constructs the associated searcher from self and the haystack to search in.

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Checks whether the pattern matches anywhere in the haystack

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Checks whether the pattern matches at the front of the haystack

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Removes the pattern from the front of haystack, if it matches.

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Checks whether the pattern matches at the back of the haystack

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Removes the pattern from the back of haystack, if it matches.

Source§

🔬This is a nightly-only experimental API. (pattern #27721)

Returns the pattern as utf-8 bytes if possible.

Source§

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the bounds on the number of successor steps required to get from start to end like Iterator::size_hint(). Read more

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the value that would be obtained by taking the successor of self count times. Read more

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the value that would be obtained by taking the predecessor of self count times. Read more

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the value that would be obtained by taking the successor of self count times. Read more

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the value that would be obtained by taking the predecessor of self count times. Read more

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the value that would be obtained by taking the successor of self count times. Read more

Source§

🔬This is a nightly-only experimental API. (step_trait #42168)

Returns the value that would be obtained by taking the predecessor of self count times. Read more

1.74.0 · Source§

Maps a char with code point in U+0000..=U+FFFF to a u16 in 0x0000..=0xFFFF with same value, failing if the code point is greater than U+FFFF.

This corresponds to the UCS-2 encoding, as specified in ISO/IEC 10646:2003.

Source§

Tries to convert a char into a u16.

§Examples
let trans_rights = '⚧'; // U+26A7
let ninjas = '🥷'; // U+1F977
assert_eq!(u16::try_from(trans_rights), Ok(0x26A7_u16));
assert!(u16::try_from(ninjas).is_err());
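
Since the conversion corresponds to UCS-2, it can be applied character by character to encode a whole string, failing on the first code point above U+FFFF. This is a minimal sketch; encode_ucs2 is a hypothetical helper name, and returning None instead of the error value is a simplification made here.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: encodes a string as UCS-2 code units, returning
/// `None` if any character lies above U+FFFF.
fn encode_ucs2(s: &str) -> Option<Vec<u16>> {
    s.chars().map(|c| u16::try_from(c).ok()).collect()
}

fn main() {
    // Both characters are in U+0000..=U+FFFF.
    assert_eq!(encode_ucs2("A\u{26A7}"), Some(vec![0x0041, 0x26A7]));
    // U+1F977 cannot be represented in a single 16-bit unit.
    assert_eq!(encode_ucs2("A\u{1F977}"), None);
}
```

Unlike str::encode_utf16, this approach never produces surrogate pairs; characters outside the 16-bit range are rejected instead.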

Source§

The type returned in the event of a conversion error.

1.59.0 · Source§

Maps a char with code point in U+0000..=U+00FF to a byte in 0x00..=0xFF with same value, failing if the code point is greater than U+00FF.

Source§

Tries to convert a char into a u8.

§Examples
let a = 'ÿ'; // U+00FF
let b = 'Ā'; // U+0100
assert_eq!(u8::try_from(a), Ok(0xFF_u8));
assert!(u8::try_from(b).is_err());
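
Applied to every character of a string, this conversion gives a simple encoder for the byte range U+0000..=U+00FF. The sketch below uses a hypothetical helper name, encode_latin1, and returns None rather than the error value for brevity.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: encodes a string as one byte per character,
/// returning `None` if any character lies above U+00FF.
fn encode_latin1(s: &str) -> Option<Vec<u8>> {
    s.chars().map(|c| u8::try_from(c).ok()).collect()
}

fn main() {
    // U+00E9 fits in one byte.
    assert_eq!(encode_latin1("caf\u{E9}"), Some(vec![0x63, 0x61, 0x66, 0xE9]));
    // U+0100 does not.
    assert_eq!(encode_latin1("\u{100}"), None);
}
```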

Source§

The type returned in the event of a conversion error.

1.34.0 · Source§

Source§

The type returned in the event of a conversion error.

Source§

Performs the conversion.
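
The item header for this conversion is not shown here. Assuming it is the fallible u32 to char conversion, which rejects surrogates and values above 0x10FFFF, a minimal sketch of safe use follows. The helper name scalar_or_replacement is hypothetical, and substituting U+FFFD is just one possible policy for invalid input.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts a `u32` to a `char`, substituting
/// U+FFFD REPLACEMENT CHARACTER when the value is not a Unicode scalar value.
fn scalar_or_replacement(code: u32) -> char {
    char::try_from(code).unwrap_or(char::REPLACEMENT_CHARACTER)
}

fn main() {
    assert_eq!(scalar_or_replacement(0x41), 'A');
    // Surrogate code points are not valid `char` values.
    assert_eq!(scalar_or_replacement(0xD800), '\u{FFFD}');
    // Neither is anything above 0x10FFFF.
    assert_eq!(scalar_or_replacement(0x11_0000), '\u{FFFD}');
}
```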

Source§

1.0.0 · Source§

1.0.0 · Source§
