Primitive Type char
1.0.0
A character type.
The char type represents a single character. More specifically, since ‘character’ isn’t a well-defined concept in Unicode, char is a ‘Unicode scalar value’, which is similar to, but not the same as, a ‘Unicode code point’.
This documentation describes a number of methods and trait implementations on the char type. For technical reasons, there is additional, separate documentation in the std::char module as well.
char is always four bytes in size. This is a different representation than a given character would have as part of a String. For example:
let v = vec!['h', 'e', 'l', 'l', 'o'];
// five elements times four bytes for each element
assert_eq!(20, v.len() * std::mem::size_of::<char>());
let s = String::from("hello");
// five elements times one byte per element
assert_eq!(5, s.len() * std::mem::size_of::<u8>());
As always, remember that a human intuition for ‘character’ might not map to Unicode’s definitions. For example, despite looking similar, the ‘é’ character is one Unicode code point while ‘é’ is two Unicode code points:
let mut chars = "é".chars();
// U+00e9: 'latin small letter e with acute'
assert_eq!(Some('\u{00e9}'), chars.next());
assert_eq!(None, chars.next());
let mut chars = "é".chars();
// U+0065: 'latin small letter e'
assert_eq!(Some('\u{0065}'), chars.next());
// U+0301: 'combining acute accent'
assert_eq!(Some('\u{0301}'), chars.next());
assert_eq!(None, chars.next());
This means that the contents of the first string above will fit into a char while the contents of the second string will not. Trying to create a char literal with the contents of the second string gives an error:
error: character literal may only contain one codepoint: 'é'
let c = 'é';
^^^
Another implication of the 4-byte fixed size of a char is that per-char processing can end up using a lot more memory:
let s = String::from("love: ❤️");
let v: Vec<char> = s.chars().collect();
assert_eq!(12, std::mem::size_of_val(&s[..]));
assert_eq!(32, std::mem::size_of_val(&v[..]));
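The same comparison can be made for any string. The following is a minimal sketch; `storage_cost` is a hypothetical helper introduced here for illustration, not a standard library function. It returns the UTF-8 byte length of a string alongside the size the same text would occupy as a slice of char:

```rust
use std::mem::size_of;

/// Hypothetical helper: returns (bytes as UTF-8, bytes as a slice of `char`).
fn storage_cost(s: &str) -> (usize, usize) {
    let utf8_bytes = s.len();
    // Every `char` occupies four bytes, regardless of its UTF-8 length.
    let char_bytes = s.chars().count() * size_of::<char>();
    (utf8_bytes, char_bytes)
}

fn main() {
    let (utf8, chars) = storage_cost("hello");
    println!("hello: {} bytes as UTF-8, {} bytes as chars", utf8, chars);

    let (utf8, chars) = storage_cost("東京");
    println!("東京: {} bytes as UTF-8, {} bytes as chars", utf8, chars);
}
```

ASCII-heavy text pays the largest penalty, since each one-byte character grows to four bytes.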
U+FFFD REPLACEMENT CHARACTER (�) is used in Unicode to represent a decoding error.
It can occur, for example, when giving ill-formed UTF-8 bytes to String::from_utf8_lossy.
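As a small illustration of the decoding-error case, the sketch below passes a byte sequence containing the invalid byte 0xFF to String::from_utf8_lossy. The wrapper name `decode_lossy` is hypothetical and exists only to make the example testable:

```rust
use std::char::REPLACEMENT_CHARACTER;

/// Hypothetical wrapper around `String::from_utf8_lossy` that returns an owned String.
fn decode_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

fn main() {
    // 0xFF can never appear in well-formed UTF-8.
    let decoded = decode_lossy(b"fo\xFFo");
    assert!(decoded.contains(REPLACEMENT_CHARACTER));
    println!("{}", decoded);
}
```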
The version of Unicode that the Unicode parts of char and str methods are based on.
New versions of Unicode are released regularly and subsequently all methods in the standard library depending on Unicode are updated. Therefore the behavior of some char and str methods and the value of this constant change over time. This is not considered to be a breaking change.
The version numbering scheme is explained in Unicode 11.0 or later, Section 3.1 Versions of the Unicode Standard.
Creates an iterator over the UTF-16 encoded code points in iter, returning unpaired surrogates as Errs.
Basic usage:
use std::char::decode_utf16;
// 𝄞mus<invalid>ic<invalid>
let v = [
0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
];
assert_eq!(
decode_utf16(v)
.map(|r| r.map_err(|e| e.unpaired_surrogate()))
.collect::<Vec<_>>(),
vec![
Ok('𝄞'),
Ok('m'), Ok('u'), Ok('s'),
Err(0xDD1E),
Ok('i'), Ok('c'),
Err(0xD834)
]
);
A lossy decoder can be obtained by replacing Err results with the replacement character:
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};
// 𝄞mus<invalid>ic<invalid>
let v = [
0xD834, 0xDD1E, 0x006d, 0x0075, 0x0073, 0xDD1E, 0x0069, 0x0063, 0xD834,
];
assert_eq!(
decode_utf16(v)
.map(|r| r.unwrap_or(REPLACEMENT_CHARACTER))
.collect::<String>(),
"𝄞mus�ic�"
);
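The lossy pattern above can be wrapped in a reusable function. This is a sketch only; `utf16_to_string_lossy` is a hypothetical name chosen for this example. It accepts a slice of u16 code units and substitutes the replacement character for each unpaired surrogate:

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Hypothetical helper: decodes UTF-16 code units, replacing unpaired
/// surrogates with U+FFFD.
fn utf16_to_string_lossy(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        .map(|r| r.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}

fn main() {
    // Well-formed input round-trips unchanged.
    let units: Vec<u16> = "𝄞music".encode_utf16().collect();
    println!("{}", utf16_to_string_lossy(&units));

    // A lone high surrogate (0xD834) becomes the replacement character.
    println!("{}", utf16_to_string_lossy(&[0x0068, 0xD834, 0x0069]));
}
```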
Converts a u32 to a char.
Note that all chars are valid u32s, and can be cast to one with as:
let c = '💯';
let i = c as u32;
assert_eq!(128175, i);
However, the reverse is not true: not all valid u32s are valid chars. from_u32() will return None if the input is not a valid value for a char.
For an unsafe version of this function which ignores these checks, see from_u32_unchecked.
Basic usage:
use std::char;
let c = char::from_u32(0x2764);
assert_eq!(Some('❤'), c);
Returning None when the input is not a valid char:
use std::char;
let c = char::from_u32(0x110000);
assert_eq!(None, c);
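Values above U+10FFFF are not the only invalid inputs: the surrogate range U+D800..=U+DFFF is also rejected, because surrogates are code points but not scalar values. The sketch below probes the boundaries; `is_scalar_value` is a hypothetical helper written for this example:

```rust
/// Hypothetical helper: true if `n` is a valid Unicode scalar value,
/// i.e. if `char::from_u32` accepts it.
fn is_scalar_value(n: u32) -> bool {
    std::char::from_u32(n).is_some()
}

fn main() {
    // Boundaries around the surrogate gap and the upper limit.
    for &n in &[0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000] {
        println!("{:#X}: {}", n, is_scalar_value(n));
    }
}
```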
Converts a u32 to a char, ignoring validity.
Note that all chars are valid u32s, and can be cast to one with as:
let c = '💯';
let i = c as u32;
assert_eq!(128175, i);
However, the reverse is not true: not all valid u32s are valid chars. from_u32_unchecked() will ignore this, and blindly cast to char, possibly creating an invalid one.
This function is unsafe, as it may construct invalid char values.
For a safe version of this function, see the from_u32 function.
Basic usage:
use std::char;
let c = unsafe { char::from_u32_unchecked(0x2764) };
assert_eq!('❤', c);
Converts a digit in the given radix to a char.
A ‘radix’ here is sometimes also called a ‘base’. A radix of two indicates a binary number, a radix of ten, decimal, and a radix of sixteen, hexadecimal, to give some common values. Any radix up to 36 is supported.
from_digit() will return None if the input is not a digit in the given radix.
Panics if given a radix larger than 36.
Basic usage:
use std::char;
let c = char::from_digit(4, 10);
assert_eq!(Some('4'), c);
// Decimal 11 is a single digit in base 16
let c = char::from_digit(11, 16);
assert_eq!(Some('b'), c);
Returning None when the input is not a digit:
use std::char;
let c = char::from_digit(20, 10);
assert_eq!(None, c);
Passing a large radix, causing a panic:
use std::char;
// this panics
let _c = char::from_digit(1, 37);
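One common use of from_digit() is rendering an integer in a chosen base. The following is a minimal sketch under the assumption that the radix is between 2 and 36; `to_radix_string` is a hypothetical helper, not part of the standard library:

```rust
use std::char;

/// Hypothetical helper: formats `n` in the given radix (2..=36) using
/// lowercase digits.
fn to_radix_string(mut n: u32, radix: u32) -> String {
    assert!((2..=36).contains(&radix), "radix must be in 2..=36");
    if n == 0 {
        return "0".to_string();
    }
    let mut digits = Vec::new();
    while n > 0 {
        // `n % radix` is always below `radix`, so this never returns None.
        let d = char::from_digit(n % radix, radix).expect("digit is below radix");
        digits.push(d);
        n /= radix;
    }
    // Digits were produced least-significant first.
    digits.iter().rev().collect()
}

fn main() {
    println!("{}", to_radix_string(255, 16));
    println!("{}", to_radix_string(10, 2));
}
```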
Checks if a char is a digit in the given radix.
A ‘radix’ here is sometimes also called a ‘base’. A radix of two indicates a binary number, a radix of ten, decimal, and a radix of sixteen, hexadecimal, to give some common values. Any radix up to 36 is supported.
Compared to is_numeric(), this function only recognizes the characters 0-9, a-z and A-Z.
‘Digit’ is defined to be only the following characters:
0-9
a-z
A-Z
For a more comprehensive understanding of ‘digit’, see is_numeric().
Panics if given a radix larger than 36.
Basic usage:
assert!('1'.is_digit(10));
assert!('f'.is_digit(16));
assert!(!'f'.is_digit(10));
Passing a large radix, causing a panic:
// this panics
'1'.is_digit(37);
Converts a char to a digit in the given radix.
A ‘radix’ here is sometimes also called a ‘base’. A radix of two indicates a binary number, a radix of ten, decimal, and a radix of sixteen, hexadecimal, to give some common values. Any radix up to 36 is supported.
‘Digit’ is defined to be only the following characters:
0-9
a-z
A-Z
Returns None if the char does not refer to a digit in the given radix.
Panics if given a radix larger than 36.
Basic usage:
assert_eq!('1'.to_digit(10), Some(1));
assert_eq!('f'.to_digit(16), Some(15));
Passing a non-digit results in failure:
assert_eq!('f'.to_digit(10), None);
assert_eq!('z'.to_digit(16), None);
Passing a large radix, causing a panic:
// this panics
let _ = '1'.to_digit(37);
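to_digit() is the building block for hand-written number parsers. The sketch below accumulates digits with overflow checks; `parse_radix` is a hypothetical helper, and in practice u32::from_str_radix covers the same ground:

```rust
/// Hypothetical helper: parses `s` as an unsigned number in `radix`.
/// Returns None on an empty string, a non-digit, or overflow.
fn parse_radix(s: &str, radix: u32) -> Option<u32> {
    if s.is_empty() {
        return None;
    }
    let mut value: u32 = 0;
    for c in s.chars() {
        // `?` bails out as soon as a character is not a digit in this radix.
        let digit = c.to_digit(radix)?;
        value = value.checked_mul(radix)?.checked_add(digit)?;
    }
    Some(value)
}

fn main() {
    println!("{:?}", parse_radix("ff", 16));
    println!("{:?}", parse_radix("12z", 10));
}
```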
Returns an iterator that yields the hexadecimal Unicode escape of a character as chars.
This will escape characters with the Rust syntax of the form \u{NNNNNN} where NNNNNN is a hexadecimal representation.
As an iterator:
for c in '❤'.escape_unicode() {
print!("{}", c);
}
println!();
Using println! directly:
println!("{}", '❤'.escape_unicode());
Both are equivalent to:
println!("\\u{{2764}}");
Using to_string:
assert_eq!('❤'.escape_unicode().to_string(), "\\u{2764}");
Returns an iterator that yields the literal escape code of a character as chars.
This will escape the characters similar to the Debug implementations of str or char.
As an iterator:
for c in '\n'.escape_debug() {
print!("{}", c);
}
println!();
Using println! directly:
println!("{}", '\n'.escape_debug());
Both are equivalent to:
println!("\\n");
Using to_string:
assert_eq!('\n'.escape_debug().to_string(), "\\n");
Returns an iterator that yields the literal escape code of a character as chars.
The default is chosen with a bias toward producing literals that are legal in a variety of languages, including C++11 and similar C-family languages. The exact rules are:
- Tab is escaped as \t.
- Carriage return is escaped as \r.
- Line feed is escaped as \n.
- Single quote is escaped as \'.
- Double quote is escaped as \".
- Backslash is escaped as \\.
- Any character in the ‘printable ASCII’ range 0x20..0x7e inclusive is not escaped.
- All other characters are given hexadecimal Unicode escapes; see escape_unicode.
As an iterator:
for c in '"'.escape_default() {
print!("{}", c);
}
println!();
Using println! directly:
println!("{}", '"'.escape_default());
Both are equivalent to:
println!("\\\"");
Using to_string:
assert_eq!('"'.escape_default().to_string(), "\\\"");
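The per-character rules extend naturally to whole strings. As a sketch, the hypothetical helper `escape_str` below applies escape_default() to every char and concatenates the results; the standard library also provides str::escape_default for the same purpose:

```rust
/// Hypothetical helper: escapes every character of `s` with
/// `char::escape_default` and joins the results.
fn escape_str(s: &str) -> String {
    s.chars().flat_map(char::escape_default).collect()
}

fn main() {
    // Tab and double quote get short escapes; the heart gets a \u{...} escape.
    println!("{}", escape_str("a\tb"));
    println!("{}", escape_str("\"hi\""));
    println!("{}", escape_str("\u{2764}"));
}
```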
1.0.0 (const: 1.52.0) · source
Returns the number of bytes this char would need if encoded in UTF-8.
That number of bytes is always between 1 and 4, inclusive.
Basic usage:
let len = 'A'.len_utf8();
assert_eq!(len, 1);
let len = 'ß'.len_utf8();
assert_eq!(len, 2);
let len = 'ℝ'.len_utf8();
assert_eq!(len, 3);
let len = '💣'.len_utf8();
assert_eq!(len, 4);
The &str type guarantees that its contents are UTF-8, and so we can compare the UTF-8 length of each code point taken as a char with the length of the &str itself:
// as chars
let eastern = '東';
let capital = '京';
// both can be represented as three bytes
assert_eq!(3, eastern.len_utf8());
assert_eq!(3, capital.len_utf8());
// as a &str, these two are encoded in UTF-8
let tokyo = "東京";
let len = eastern.len_utf8() + capital.len_utf8();
// we can see that they take six bytes total...
assert_eq!(6, tokyo.len());
// ... just like the &str
assert_eq!(len, tokyo.len());
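Because the UTF-8 lengths of consecutive chars add up to byte positions in a &str, len_utf8() can be used to locate a character by index. The sketch below assumes a hypothetical helper named `byte_offset`; str::char_indices yields the same offsets directly:

```rust
/// Hypothetical helper: byte offset at which the `n`-th char of `s` starts.
/// If `s` has fewer than `n` chars, this returns `s.len()`.
fn byte_offset(s: &str, n: usize) -> usize {
    s.chars().take(n).map(char::len_utf8).sum()
}

fn main() {
    let s = "aß💣";
    // 'a' is 1 byte, 'ß' is 2 bytes, so the bomb starts at byte 3.
    let start = byte_offset(s, 2);
    println!("{}", &s[start..]);
}
```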
1.0.0 (const: 1.52.0) · source
Returns the number of 16-bit code units this char would need if encoded in UTF-16.
See the documentation for len_utf8() for more explanation of this concept. This function is a mirror, but for UTF-16 instead of UTF-8.
Basic usage:
let n = 'ß'.len_utf16();
assert_eq!(n, 1);
let len = '💣'.len_utf16();
assert_eq!(len, 2);
Encodes this character as UTF-8 into the provided byte buffer, and then returns the subslice of the buffer that contains the encoded character.
Panics if the buffer is not large enough. A buffer of length four is large enough to encode any char.
In both of these examples, ‘ß’ takes two bytes to encode.
let mut b = [0; 2];
let result = 'ß'.encode_utf8(&mut b);
assert_eq!(result, "ß");
assert_eq!(result.len(), 2);
A buffer that’s too small:
let mut b = [0; 1];
// this panics
'ß'.encode_utf8(&mut b);
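Since four bytes always suffice, a fixed [u8; 4] scratch buffer avoids the panic for any input. This is a minimal sketch; `utf8_bytes` is a hypothetical helper that copies out the encoded bytes:

```rust
/// Hypothetical helper: returns the UTF-8 encoding of `c` as a Vec<u8>.
fn utf8_bytes(c: char) -> Vec<u8> {
    // Four bytes is the maximum UTF-8 length of any char.
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    println!("{:X?}", utf8_bytes('A'));
    println!("{:X?}", utf8_bytes('ß'));
    println!("{:X?}", utf8_bytes('\u{1F4A3}'));
}
```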
Encodes this character as UTF-16 into the provided u16 buffer, and then returns the subslice of the buffer that contains the encoded character.
Panics if the buffer is not large enough. A buffer of length 2 is large enough to encode any char.
In both of these examples, ‘𝕊’ takes two u16s to encode.
let mut b = [0; 2];
let result = '𝕊'.encode_utf16(&mut b);
assert_eq!(result.len(), 2);
A buffer that’s too small:
let mut b = [0; 1];
// this panics
'𝕊'.encode_utf16(&mut b);
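The same fixed-buffer approach works for UTF-16, where two code units are always enough. In this sketch, `utf16_units` is a hypothetical helper; the second call shows the surrogate pair produced for a character outside the Basic Multilingual Plane:

```rust
/// Hypothetical helper: returns the UTF-16 encoding of `c` as a Vec<u16>.
fn utf16_units(c: char) -> Vec<u16> {
    // Two code units is the maximum UTF-16 length of any char.
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf).to_vec()
}

fn main() {
    // U+00DF fits in a single code unit.
    println!("{:X?}", utf16_units('ß'));
    // U+1D54A needs a high and a low surrogate.
    println!("{:X?}", utf16_units('\u{1D54A}'));
}
```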
Returns true if this char has the Alphabetic property.
Alphabetic is described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database DerivedCoreProperties.txt.
Basic usage:
assert!('a'.is_alphabetic());
assert!('京'.is_alphabetic());
let c = '💝';
// love is many things, but it is not alphabetic
assert!(!c.is_alphabetic());
Returns true if this char has the Lowercase property.
Lowercase is described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database DerivedCoreProperties.txt.
Basic usage:
assert!('a'.is_lowercase());
assert!('δ'.is_lowercase());
assert!(!'A'.is_lowercase());
assert!(!'Δ'.is_lowercase());
// The various Chinese scripts and punctuation do not have case, and so:
assert!(!'中'.is_lowercase());
assert!(!' '.is_lowercase());
Returns true if this char has the Uppercase property.
Uppercase is described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database DerivedCoreProperties.txt.
Basic usage:
assert!(!'a'.is_uppercase());
assert!(!'δ'.is_uppercase());
assert!('A'.is_uppercase());
assert!('Δ'.is_uppercase());
// The various Chinese scripts and punctuation do not have case, and so:
assert!(!'中'.is_uppercase());
assert!(!' '.is_uppercase());
Returns true if this char has the White_Space property.
White_Space is specified in the Unicode Character Database PropList.txt.
Basic usage:
assert!(' '.is_whitespace());
// a non-breaking space
assert!('\u{A0}'.is_whitespace());
assert!(!'越'.is_whitespace());
Returns true if this char satisfies either is_alphabetic() or is_numeric().
Basic usage:
assert!('٣'.is_alphanumeric());
assert!('7'.is_alphanumeric());
assert!('৬'.is_alphanumeric());
assert!('¾'.is_alphanumeric());
assert!('①'.is_alphanumeric());
assert!('K'.is_alphanumeric());
assert!('و'.is_alphanumeric());
assert!('藏'.is_alphanumeric());
Returns true if this char has the general category for control codes.
Control codes (code points with the general category of Cc) are described in Chapter 4 (Character Properties) of the Unicode Standard and specified in the Unicode Character Database UnicodeData.txt.
Basic usage:
// U+009C, STRING TERMINATOR
assert!('\u{9c}'.is_control());
assert!(!'q'.is_control());
Returns true if this char has one of the general categories for numbers.
The general categories for numbers (Nd for decimal digits, Nl for letter-like numeric characters, and No for other numeric characters) are specified in the Unicode Character Database UnicodeData.txt.
Basic usage:
assert!('٣'.is_numeric());
assert!('7'.is_numeric());
assert!('৬'.is_numeric());
assert!('¾'.is_numeric());
assert!('①'.is_numeric());
assert!(!'K'.is_numeric());
assert!(!'و'.is_numeric());
assert!(!'藏'.is_numeric());
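is_numeric() is broader than to_digit(): a character can be numeric in the Unicode sense without being convertible to a digit value. The sketch below contrasts the two; `classify` is a hypothetical helper written for this comparison:

```rust
/// Hypothetical helper: pairs the result of `is_numeric` with the
/// result of `to_digit(10)` for the same character.
fn classify(c: char) -> (bool, Option<u32>) {
    (c.is_numeric(), c.to_digit(10))
}

fn main() {
    // ASCII digit: numeric, and has a digit value.
    println!("{:?}", classify('7'));
    // U+0663 ARABIC-INDIC DIGIT THREE: numeric, but to_digit only knows 0-9, a-z, A-Z.
    println!("{:?}", classify('\u{0663}'));
    // U+00BE VULGAR FRACTION THREE QUARTERS: numeric (category No), no digit value.
    println!("{:?}", classify('\u{00BE}'));
}
```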
Returns an iterator that yields the lowercase mapping of this char as one or more chars.
If this char does not have a lowercase mapping, the iterator yields the same char.
If this char has a one-to-one lowercase mapping given by the Unicode Character Database UnicodeData.txt, the iterator yields that char.
If this char requires special considerations (e.g. multiple chars) the iterator yields the char(s) given by SpecialCasing.txt.
This operation performs an unconditional mapping without tailoring. That is, the conversion is independent of context and language.
In the Unicode Standard, Chapter 4 (Character Properties) discusses case mapping in general and Chapter 3 (Conformance) discusses the default algorithm for case conversion.
As an iterator:
for c in 'İ'.to_lowercase() {
print!("{}", c);
}
println!();
Using println! directly:
println!("{}", 'İ'.to_lowercase());
Both are equivalent to:
println!("i\u{307}");
Using to_string:
assert_eq!('C'.to_lowercase().to_string(), "c");
// Sometimes the result is more than one character:
assert_eq!('İ'.to_lowercase().to_string(), "i\u{307}");
// Characters that do not have both uppercase and lowercase
// convert into themselves.
assert_eq!('山'.to_lowercase().to_string(), "山");
Returns an iterator that yields the uppercase mapping of this char as one or more chars.
If this char does not have an uppercase mapping, the iterator yields the same char.
If this char has a one-to-one uppercase mapping given by the Unicode Character Database UnicodeData.txt, the iterator yields that char.
If this char requires special considerations (e.g. multiple chars) the iterator yields the char(s) given by SpecialCasing.txt.
This operation performs an unconditional mapping without tailoring. That is, the conversion is independent of context and language.
In the Unicode Standard, Chapter 4 (Character Properties) discusses case mapping in general and Chapter 3 (Conformance) discusses the default algorithm for case conversion.
As an iterator:
for c in 'ß'.to_uppercase() {
print!("{}", c);
}
println!();
Using println! directly:
println!("{}", 'ß'.to_uppercase());
Both are equivalent to:
println!("SS");
Using to_string:
assert_eq!('c'.to_uppercase().to_string(), "C");
// Sometimes the result is more than one character:
assert_eq!('ß'.to_uppercase().to_string(), "SS");
// Characters that do not have both uppercase and lowercase
// convert into themselves.
assert_eq!('山'.to_uppercase().to_string(), "山");
In Turkish, the equivalent of ‘i’ in Latin has five forms instead of two:
- ‘Dotless’: I / ı, sometimes written ï
- ‘Dotted’: İ / i
Note that the lowercase dotted ‘i’ is the same as the Latin. Therefore:
let upper_i = 'i'.to_uppercase().to_string();
The value of upper_i here relies on the language of the text: if we’re in en-US, it should be "I", but if we’re in tr_TR, it should be "İ". to_uppercase() does not take this into account, and so:
let upper_i = 'i'.to_uppercase().to_string();
assert_eq!(upper_i, "I");
holds across languages.
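Because one char may map to several, uppercasing a whole string character by character has to flatten the per-char iterators. The following is a sketch of that pattern; `to_upper` is a hypothetical helper, and str::to_uppercase is the standard way to do this:

```rust
/// Hypothetical helper: uppercases `s` one char at a time, using the
/// language-independent mapping of `char::to_uppercase`.
fn to_upper(s: &str) -> String {
    s.chars().flat_map(char::to_uppercase).collect()
}

fn main() {
    // 'ß' expands to two chars, so the result is longer than the input.
    println!("{}", to_upper("straße"));
    // No Turkish tailoring: 'i' always maps to 'I'.
    println!("{}", to_upper("i"));
}
```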
1.23.0 (const: 1.32.0) · source
Checks if the value is within the ASCII range.
let ascii = 'a';
let non_ascii = '❤';
assert!(ascii.is_ascii());
assert!(!non_ascii.is_ascii());
1.23.0 (const: 1.52.0) · source
Makes a copy of the value in its ASCII upper case equivalent.
ASCII letters ‘a’ to ‘z’ are mapped to ‘A’ to ‘Z’, but non-ASCII letters are unchanged.
To uppercase the value in-place, use make_ascii_uppercase().
To uppercase ASCII characters in addition to non-ASCII characters, use to_uppercase().
let ascii = 'a';
let non_ascii = '❤';
assert_eq!('A', ascii.to_ascii_uppercase());
assert_eq!('❤', non_ascii.to_ascii_uppercase());
1.23.0 (const: 1.52.0) · source
Makes a copy of the value in its ASCII lower case equivalent.
ASCII letters ‘A’ to ‘Z’ are mapped to ‘a’ to ‘z’, but non-ASCII letters are unchanged.
To lowercase the value in-place, use make_ascii_lowercase().
To lowercase ASCII characters in addition to non-ASCII characters, use to_lowercase().
let ascii = 'A';
let non_ascii = '❤';
assert_eq!('a', ascii.to_ascii_lowercase());
assert_eq!('❤', non_ascii.to_ascii_lowercase());
1.23.0 (const: 1.52.0) · source
Checks that two values are an ASCII case-insensitive match.
Equivalent to to_ascii_lowercase(a) == to_ascii_lowercase(b).
let upper_a = 'A';
let lower_a = 'a';
let lower_z = 'z';
assert!(upper_a.eq_ignore_ascii_case(&lower_a));
assert!(upper_a.eq_ignore_ascii_case(&upper_a));
assert!(!upper_a.eq_ignore_ascii_case(&lower_z));
Converts this type to its ASCII upper case equivalent in-place.
ASCII letters ‘a’ to ‘z’ are mapped to ‘A’ to ‘Z’, but non-ASCII letters are unchanged.
To return a new uppercased value without modifying the existing one, use to_ascii_uppercase().
let mut ascii = 'a';
ascii.make_ascii_uppercase();
assert_eq!('A', ascii);
Converts this type to its ASCII lower case equivalent in-place.
ASCII letters ‘A’ to ‘Z’ are mapped to ‘a’ to ‘z’, but non-ASCII letters are unchanged.
To return a new lowercased value without modifying the existing one, use to_ascii_lowercase().
let mut ascii = 'A';
ascii.make_ascii_lowercase();
assert_eq!('a', ascii);
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII alphabetic character:
- U+0041 ‘A’ ..= U+005A ‘Z’, or
- U+0061 ‘a’ ..= U+007A ‘z’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(uppercase_a.is_ascii_alphabetic());
assert!(uppercase_g.is_ascii_alphabetic());
assert!(a.is_ascii_alphabetic());
assert!(g.is_ascii_alphabetic());
assert!(!zero.is_ascii_alphabetic());
assert!(!percent.is_ascii_alphabetic());
assert!(!space.is_ascii_alphabetic());
assert!(!lf.is_ascii_alphabetic());
assert!(!esc.is_ascii_alphabetic());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII uppercase character: U+0041 ‘A’ ..= U+005A ‘Z’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(uppercase_a.is_ascii_uppercase());
assert!(uppercase_g.is_ascii_uppercase());
assert!(!a.is_ascii_uppercase());
assert!(!g.is_ascii_uppercase());
assert!(!zero.is_ascii_uppercase());
assert!(!percent.is_ascii_uppercase());
assert!(!space.is_ascii_uppercase());
assert!(!lf.is_ascii_uppercase());
assert!(!esc.is_ascii_uppercase());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII lowercase character: U+0061 ‘a’ ..= U+007A ‘z’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(!uppercase_a.is_ascii_lowercase());
assert!(!uppercase_g.is_ascii_lowercase());
assert!(a.is_ascii_lowercase());
assert!(g.is_ascii_lowercase());
assert!(!zero.is_ascii_lowercase());
assert!(!percent.is_ascii_lowercase());
assert!(!space.is_ascii_lowercase());
assert!(!lf.is_ascii_lowercase());
assert!(!esc.is_ascii_lowercase());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII alphanumeric character:
- U+0041 ‘A’ ..= U+005A ‘Z’, or
- U+0061 ‘a’ ..= U+007A ‘z’, or
- U+0030 ‘0’ ..= U+0039 ‘9’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(uppercase_a.is_ascii_alphanumeric());
assert!(uppercase_g.is_ascii_alphanumeric());
assert!(a.is_ascii_alphanumeric());
assert!(g.is_ascii_alphanumeric());
assert!(zero.is_ascii_alphanumeric());
assert!(!percent.is_ascii_alphanumeric());
assert!(!space.is_ascii_alphanumeric());
assert!(!lf.is_ascii_alphanumeric());
assert!(!esc.is_ascii_alphanumeric());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII decimal digit: U+0030 ‘0’ ..= U+0039 ‘9’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(!uppercase_a.is_ascii_digit());
assert!(!uppercase_g.is_ascii_digit());
assert!(!a.is_ascii_digit());
assert!(!g.is_ascii_digit());
assert!(zero.is_ascii_digit());
assert!(!percent.is_ascii_digit());
assert!(!space.is_ascii_digit());
assert!(!lf.is_ascii_digit());
assert!(!esc.is_ascii_digit());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII hexadecimal digit:
- U+0030 ‘0’ ..= U+0039 ‘9’, or
- U+0041 ‘A’ ..= U+0046 ‘F’, or
- U+0061 ‘a’ ..= U+0066 ‘f’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(uppercase_a.is_ascii_hexdigit());
assert!(!uppercase_g.is_ascii_hexdigit());
assert!(a.is_ascii_hexdigit());
assert!(!g.is_ascii_hexdigit());
assert!(zero.is_ascii_hexdigit());
assert!(!percent.is_ascii_hexdigit());
assert!(!space.is_ascii_hexdigit());
assert!(!lf.is_ascii_hexdigit());
assert!(!esc.is_ascii_hexdigit());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII punctuation character:
- U+0021 ..= U+002F ! " # $ % & ' ( ) * + , - . /, or
- U+003A ..= U+0040 : ; < = > ? @, or
- U+005B ..= U+0060 [ \ ] ^ _ `, or
- U+007B ..= U+007E { | } ~
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(!uppercase_a.is_ascii_punctuation());
assert!(!uppercase_g.is_ascii_punctuation());
assert!(!a.is_ascii_punctuation());
assert!(!g.is_ascii_punctuation());
assert!(!zero.is_ascii_punctuation());
assert!(percent.is_ascii_punctuation());
assert!(!space.is_ascii_punctuation());
assert!(!lf.is_ascii_punctuation());
assert!(!esc.is_ascii_punctuation());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII graphic character: U+0021 ‘!’ ..= U+007E ‘~’.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(uppercase_a.is_ascii_graphic());
assert!(uppercase_g.is_ascii_graphic());
assert!(a.is_ascii_graphic());
assert!(g.is_ascii_graphic());
assert!(zero.is_ascii_graphic());
assert!(percent.is_ascii_graphic());
assert!(!space.is_ascii_graphic());
assert!(!lf.is_ascii_graphic());
assert!(!esc.is_ascii_graphic());
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII whitespace character: U+0020 SPACE, U+0009 HORIZONTAL TAB, U+000A LINE FEED, U+000C FORM FEED, or U+000D CARRIAGE RETURN.
Rust uses the WhatWG Infra Standard’s definition of ASCII whitespace. There are several other definitions in wide use. For instance, the POSIX locale includes U+000B VERTICAL TAB as well as all the above characters, but, from the very same specification, the default rule for “field splitting” in the Bourne shell considers only SPACE, HORIZONTAL TAB, and LINE FEED as whitespace.
If you are writing a program that will process an existing file format, check what that format’s definition of whitespace is before using this function.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(!uppercase_a.is_ascii_whitespace());
assert!(!uppercase_g.is_ascii_whitespace());
assert!(!a.is_ascii_whitespace());
assert!(!g.is_ascii_whitespace());
assert!(!zero.is_ascii_whitespace());
assert!(!percent.is_ascii_whitespace());
assert!(space.is_ascii_whitespace());
assert!(lf.is_ascii_whitespace());
assert!(!esc.is_ascii_whitespace());
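The difference between definitions is easy to observe with U+000B VERTICAL TAB, which is_ascii_whitespace() rejects but the Unicode-based is_whitespace() accepts. The sketch below uses a hypothetical helper, `whitespace_flags`, to show both answers side by side:

```rust
/// Hypothetical helper: returns (is_ascii_whitespace, is_whitespace) for `c`.
fn whitespace_flags(c: char) -> (bool, bool) {
    (c.is_ascii_whitespace(), c.is_whitespace())
}

fn main() {
    // VERTICAL TAB: not ASCII whitespace per WhatWG, but has White_Space.
    println!("{:?}", whitespace_flags('\u{0B}'));
    // NO-BREAK SPACE: outside ASCII entirely, but has White_Space.
    println!("{:?}", whitespace_flags('\u{A0}'));
    // SPACE satisfies both definitions.
    println!("{:?}", whitespace_flags(' '));
}
```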
1.24.0 (const: 1.47.0) · source
Checks if the value is an ASCII control character: U+0000 NUL ..= U+001F UNIT SEPARATOR, or U+007F DELETE. Note that most ASCII whitespace characters are control characters, but SPACE is not.
let uppercase_a = 'A';
let uppercase_g = 'G';
let a = 'a';
let g = 'g';
let zero = '0';
let percent = '%';
let space = ' ';
let lf = '\n';
let esc = '\x1b';
assert!(!uppercase_a.is_ascii_control());
assert!(!uppercase_g.is_ascii_control());
assert!(!a.is_ascii_control());
assert!(!g.is_ascii_control());
assert!(!zero.is_ascii_control());
assert!(!percent.is_ascii_control());
assert!(!space.is_ascii_control());
assert!(lf.is_ascii_control());
assert!(esc.is_ascii_control());
👎 Deprecated since 1.26.0: use inherent methods instead
Container type for copied ASCII characters.
👎 Deprecated since 1.26.0: use inherent methods instead
Checks if the value is within the ASCII range.
👎 Deprecated since 1.26.0: use inherent methods instead
Makes a copy of the value in its ASCII upper case equivalent.
👎 Deprecated since 1.26.0: use inherent methods instead
Makes a copy of the value in its ASCII lower case equivalent.
👎 Deprecated since 1.26.0: use inherent methods instead
Checks that two values are an ASCII case-insensitive match.
👎 Deprecated since 1.26.0: use inherent methods instead
Converts this type to its ASCII upper case equivalent in-place.
👎 Deprecated since 1.26.0: use inherent methods instead
Converts this type to its ASCII lower case equivalent in-place.
Formats the value using the given formatter.
Returns the default value of \x00
Formats the value using the given formatter.
Extends a collection with the contents of an iterator.
🔬 This is a nightly-only experimental API. (extend_one #72631)
Extends a collection with exactly one element.
🔬 This is a nightly-only experimental API. (extend_one #72631)
Reserves capacity in a collection for the given number of additional elements.
Extends a collection with the contents of an iterator.
🔬 This is a nightly-only experimental API. (extend_one #72631)
Extends a collection with exactly one element.
🔬 This is a nightly-only experimental API. (extend_one #72631)
Reserves capacity in a collection for the given number of additional elements.
use std::mem;
let c = 'c';
let u = u32::from(c);
assert!(4 == mem::size_of_val(&u))
use std::mem;
let c = '⚙';
let u = u128::from(c);
assert!(16 == mem::size_of_val(&u))
use std::mem;
let c = '👤';
let u = u64::from(c);
assert!(8 == mem::size_of_val(&u))
Allocates an owned String from a single character.
let c: char = 'a';
let s: String = String::from(c);
assert_eq!("a", &s[..]);
Maps a byte in 0x00..=0xFF to a char whose code point has the same value, in U+0000..=U+00FF.
Unicode is designed such that this effectively decodes bytes with the character encoding that IANA calls ISO-8859-1. This encoding is compatible with ASCII.
Note that this is different from ISO/IEC 8859-1 a.k.a. ISO 8859-1 (with one less hyphen), which leaves some “blanks”, byte values that are not assigned to any character. ISO-8859-1 (the IANA one) assigns them to the C0 and C1 control codes.
Note that this is also different from Windows-1252 a.k.a. code page 1252, which is a superset of ISO/IEC 8859-1 that assigns some (not all!) blanks to punctuation and various Latin characters.
To confuse things further, on the Web ascii, iso-8859-1, and windows-1252 are all aliases for a superset of Windows-1252 that fills the remaining blanks with corresponding C0 and C1 control codes.
use std::mem;
let u = 32 as u8;
let c = char::from(u);
assert!(4 == mem::size_of_val(&c))
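Since each byte maps to the code point of the same value, this conversion can decode a whole ISO-8859-1 (IANA) byte string. The following is a minimal sketch; `latin1_to_string` is a hypothetical helper name:

```rust
/// Hypothetical helper: decodes bytes as ISO-8859-1 by mapping each
/// byte to the char with the same code point value.
fn latin1_to_string(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}

fn main() {
    // 0xE9 is U+00E9, 'latin small letter e with acute'.
    let decoded = latin1_to_string(&[0x63, 0x61, 0x66, 0xE9]);
    println!("{}", decoded);
}
```

The resulting String is stored as UTF-8, so bytes at or above 0x80 occupy two bytes each after decoding.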
The associated error which can be returned from parsing.
Parses a string s to return a value of this type.
Compares and returns the maximum of two values.
Compares and returns the minimum of two values.
Restricts a value to a certain interval.
This method tests for self and other values to be equal, and is used by ==.
This method tests for !=.
This method returns an ordering between self and other values if one exists.
This method tests less than (for self and other) and is used by the < operator.
This method tests less than or equal to (for self and other) and is used by the <= operator.
This method tests greater than or equal to (for self and other) and is used by the >= operator.
This method tests greater than (for self and other) and is used by the > operator.
Searches for chars that are equal to a given char.
assert_eq!("Hello world".find('o'), Some(4));
🔬 This is a nightly-only experimental API. (pattern #27721)
Associated searcher for this pattern.
🔬 This is a nightly-only experimental API. (pattern #27721)
Constructs the associated searcher from self and the haystack to search in.
🔬 This is a nightly-only experimental API. (pattern #27721)
Checks whether the pattern matches anywhere in the haystack.
🔬 This is a nightly-only experimental API. (pattern #27721)
Checks whether the pattern matches at the front of the haystack.
🔬 This is a nightly-only experimental API. (pattern #27721)
Removes the pattern from the front of haystack, if it matches.
🔬 This is a nightly-only experimental API. (pattern #27721)
Checks whether the pattern matches at the back of the haystack.
🔬 This is a nightly-only experimental API. (pattern #27721)
Removes the pattern from the back of haystack, if it matches.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the number of successor steps required to get from start to end.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the value that would be obtained by taking the successor of self count times.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the value that would be obtained by taking the predecessor of self count times.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the value that would be obtained by taking the successor of self count times.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the value that would be obtained by taking the predecessor of self count times.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the value that would be obtained by taking the successor of self count times.
🔬 This is a nightly-only experimental API. (step_trait #42168)
Returns the value that would be obtained by taking the predecessor of self count times.
Converts the given value to a String.
Maps a char with code point in U+0000..=U+00FF to a byte in 0x00..=0xFF with the same value, failing if the code point is greater than U+00FF.
See impl From<u8> for char for details on the encoding.
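The fallible direction can be used to encode a string as ISO-8859-1 (IANA), rejecting anything above U+00FF. This sketch assumes a toolchain where the TryFrom<char> for u8 implementation described above is available; `to_latin1` is a hypothetical helper:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: encodes `s` as ISO-8859-1 bytes, or returns None
/// if any char has a code point above U+00FF.
fn to_latin1(s: &str) -> Option<Vec<u8>> {
    // Collecting into Option<Vec<_>> stops at the first failed conversion.
    s.chars().map(|c| u8::try_from(c).ok()).collect()
}

fn main() {
    println!("{:X?}", to_latin1("caf\u{E9}"));
    // U+2764 HEAVY BLACK HEART does not fit in one byte.
    println!("{:?}", to_latin1("\u{2764}"));
}
```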
The type returned in the event of a conversion error.
Performs the conversion.
The type returned in the event of a conversion error.
Performs the conversion.
impl<T> Any for T where T: 'static + ?Sized
Immutably borrows from an owned value. Read more
Mutably borrows from an owned value. Read more
impl<T> From<T> for T
impl<T, U> Into<U> for T where U: From<T>
The resulting type after obtaining ownership.
Creates owned data from borrowed data, usually by cloning.
🔬 This is a nightly-only experimental API. (toowned_clone_into #41263)
Uses borrowed data to replace owned data, usually by cloning.
Converts the given value to a String.
The type returned in the event of a conversion error.
Performs the conversion.
The type returned in the event of a conversion error.
Performs the conversion.