Message 228191 - Python tracker

I'm new here, but I think this is the right issue to ask about this Unicode problem. On the Windows console:

chcp
Active code page: 437

type utf.txt
Привет

chcp 65001
Active code page: 65001

type utf.txt
Привет

python --version
Python 3.5.0a0

cat utf.py
f = open('utf.txt')
l = f.readline()
print(l)
print(len(l))

python utf.py
Привет
�²ÐµÑ‚
�‚

13

cat utf_explicit.py
import codecs
f = codecs.open('utf.txt', encoding='utf-8', mode='r')
l = f.readline()
print(l)
print(len(l))

python utf_explicit.py
Привет
ет

7
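
The two lengths (13 and 7) can be reproduced without touching the console, which suggests they come from how the file is decoded rather than from how it is printed. The following is only a minimal sketch under two assumptions: that utf.txt contains "Привет" plus a single newline, encoded as UTF-8 without a BOM, and that the implicit default encoding used by open() was a single-byte code page (cp1252 is a guess here, not something confirmed above).

```python
# Assumption: utf.txt holds "Привет" followed by "\n", UTF-8 encoded, no BOM.
data = "Привет\n".encode("utf-8")

# Assumption: open() without an encoding fell back to a single-byte
# code page; cp1252 is a guess used only for illustration.
as_cp1252 = data.decode("cp1252")

# Explicit decoding, as in utf_explicit.py.
as_utf8 = data.decode("utf-8")

print(len(data))       # number of bytes in the file
print(len(as_cp1252))  # one character per byte
print(len(as_utf8))    # six letters plus the newline
```

Under these assumptions the script prints 13, 13 and 7, matching the lengths reported by utf.py and utf_explicit.py. It does not explain the repeated fragments in the printed output.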

I read through part of this page, but these things are a bit above my head. Could anyone explain what is going on here?

type utf2.txt
aαbβcγdδ

cat utf2.py
import streams
import codecs
streams.enable()
f = codecs.open('utf2.txt', encoding='utf-8', mode='r')
print(f.read(1))
print(f.read(1))
print(f.read(2))
print(f.read(4))

python utf2.py
a
α
bβc
γdδ