Unicodedata – Unicode Database in Python

Last Updated : 19 Nov, 2020

The Unicode Character Database (UCD) is defined by Unicode Standard Annex #44, which specifies the character properties of all Unicode characters. The unicodedata module provides access to the UCD and uses the same symbols and names as the database.

Functions defined by the module :

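unicodedata.lookup(name) : returns the character with the given name and raises KeyError if no character has that name.

import unicodedata

print(unicodedata.lookup('LEFT CURLY BRACKET'))
print(unicodedata.lookup('RIGHT CURLY BRACKET'))
print(unicodedata.lookup('ASTERISK'))

# Raises KeyError because there is
# no character named ASTER
print(unicodedata.lookup('ASTER'))

Output :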
{
}
*
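
Because lookup() raises KeyError for an unknown name, callers often want a fallback instead of an exception. A minimal sketch, assuming a hypothetical helper called safe_lookup (it is not part of unicodedata):

```python
import unicodedata

def safe_lookup(name, fallback=None):
    """Return the character called `name`, or `fallback` if the name is unknown.

    safe_lookup is a hypothetical helper for illustration, not a unicodedata API.
    """
    try:
        return unicodedata.lookup(name)
    except KeyError:
        # lookup() signals an unknown character name with KeyError
        return fallback

print(safe_lookup('ASTERISK'))      # *
print(safe_lookup('ASTER', '?'))    # ?
```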

unicodedata.name(chr) : returns the name assigned to the character chr.

import unicodedata

print(unicodedata.name('/'))
print(unicodedata.name('|'))
print(unicodedata.name(':'))

Output :
SOLIDUS
VERTICAL LINE
COLON
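
name() also accepts an optional second argument that is returned when a character has no name; without it, ValueError is raised. The sketch below uses a newline as the unnamed character and also checks that name() and lookup() undo each other for the three characters used above. The string 'UNNAMED' is an arbitrary placeholder chosen for this example.

```python
import unicodedata

# Control characters such as '\n' have no name in the UCD, so name()
# would raise ValueError here if no default were supplied.
print(unicodedata.name('\n', 'UNNAMED'))    # UNNAMED

# For named characters, lookup(name(ch)) gives back the same character.
for ch in '/|:':
    char_name = unicodedata.name(ch)
    assert unicodedata.lookup(char_name) == ch
    print(ch, '->', char_name)
```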

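unicodedata.decimal(chr) : returns the decimal value assigned to the character chr as an integer and raises ValueError if the character has none.

import unicodedata

print(unicodedata.decimal('9'))
print(unicodedata.decimal('a'))

Output :
9
Traceback (most recent call last):
  File "7e736755dd176cd0169eeea6f5d32057.py", line 4, in <module>
    print(unicodedata.decimal('a'))
ValueError: not a decimal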
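
As with name(), a second argument to decimal() is returned instead of raising ValueError. The short sketch below also shows that decimal values are not limited to ASCII digits; the default of -1 is an arbitrary choice for this example.

```python
import unicodedata

# 'a' has no decimal value, so the supplied default is returned.
print(unicodedata.decimal('a', -1))     # -1

# U+0669 is ARABIC-INDIC DIGIT NINE, which carries the decimal value 9.
print(unicodedata.decimal('\u0669'))    # 9
```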

decimal() expects exactly one character, so passing a longer string raises TypeError.

import unicodedata

print(unicodedata.decimal('9'))
print(unicodedata.decimal('143'))

Output :
9
Traceback (most recent call last):
  File "ad47ae996380a777426cc1431ec4a8cd.py", line 4, in <module>
    print(unicodedata.decimal('143'))
TypeError: need a single Unicode character as parameter
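
Since these functions take one character at a time, a longer string has to be processed character by character. The sketch below does that, and also compares decimal() with two sibling functions from the same module that are not shown above, digit() and numeric(). Passing None as the default is a choice made for this example so that missing values print as None instead of raising.

```python
import unicodedata

# Handle a multi-character string one character at a time.
print([unicodedata.decimal(ch) for ch in '143'])    # [1, 4, 3]

# '9', SUPERSCRIPT TWO and VULGAR FRACTION ONE HALF have different
# combinations of decimal, digit and numeric values.
for ch in ('9', '\u00b2', '\u00bd'):
    print(
        unicodedata.name(ch),
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )
```

Here '9' has all three values, the superscript two has a digit and numeric value but no decimal value, and the fraction has only a numeric value.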

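unicodedata.category(chr) : returns the general category assigned to the character chr as a string, for example Lu (letter, uppercase) or Ll (letter, lowercase).

import unicodedata

print(unicodedata.category('A'))
print(unicodedata.category('b'))

Output :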
Lu
Ll
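
General categories are handy for classifying or filtering text. A minimal sketch, using a sample string made up for this example: it counts the categories that occur and then drops every character whose category starts with P (the punctuation categories).

```python
import unicodedata
from collections import Counter

text = 'Hello, World! 123'

# Count how often each general category occurs in the text.
counts = Counter(unicodedata.category(ch) for ch in text)
print(counts)

# Keep only the characters that are not punctuation (categories P*).
no_punct = ''.join(
    ch for ch in text
    if not unicodedata.category(ch).startswith('P')
)
print(no_punct)    # Hello World 123
```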

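unicodedata.bidirectional(chr) : returns the bidirectional class assigned to the character chr as a string. AN stands for Arabic number.

import unicodedata

print(unicodedata.bidirectional('\u0660'))

Output :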
AN
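
To put the AN result in context, the sketch below prints the bidirectional class of a few characters from different scripts. The selection of characters is an assumption made for illustration.

```python
import unicodedata

# A Latin letter, a European digit, a Hebrew letter and an Arabic-Indic digit.
for ch in ('A', '1', '\u05d0', '\u0660'):
    print(unicodedata.name(ch), '->', unicodedata.bidirectional(ch))
```

The classes printed are L (left-to-right), EN (European number), R (right-to-left) and AN (Arabic number), in that order.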

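unicodedata.normalize(form, unistr) : returns the normal form form of the Unicode string unistr. Valid values for form are NFC, NFKC, NFD and NFKD.

from unicodedata import normalize

# ascii() escapes non-ASCII characters so the code points stay visible
print(ascii(normalize('NFD', '\u00C7')))
print(ascii(normalize('NFC', 'C\u0327')))
print(ascii(normalize('NFKD', '\u2460')))

Output :
'C\u0327'
'\xc7'
'1'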
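
Two common uses of normalization are comparing strings that look the same but use different code points, and removing accents. The sketch below shows both. strip_accents is a hypothetical helper written for this example, not a function of unicodedata, and it assumes that dropping all characters of category Mn (nonspacing marks) after NFD decomposition is acceptable for the text at hand.

```python
import unicodedata

composed = '\u00c7'       # LATIN CAPITAL LETTER C WITH CEDILLA
combining = 'C\u0327'     # 'C' followed by COMBINING CEDILLA

# The raw strings differ, but their NFC forms are equal.
print(composed == combining)    # False
print(unicodedata.normalize('NFC', composed)
      == unicodedata.normalize('NFC', combining))    # True

def strip_accents(text):
    """Hypothetical helper: decompose, then drop nonspacing marks."""
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(ch for ch in decomposed
                   if unicodedata.category(ch) != 'Mn')

print(strip_accents('Fran\u00e7ais'))    # Francais
```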