Issue 947906: calendar.weekheader(n): n should mean chars not bytes

Created on 2004-05-04 18:38 by leorochael, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
diff.txt doerwalter,2004-07-21 19:17
calendar.diff doerwalter,2006-03-31 15:14
calendar2.diff doerwalter,2006-03-31 17:36
calendar3.diff doerwalter,2006-03-31 17:45
Messages (10)
msg20692 - (view) Author: Leonardo Rochael Almeida (leorochael) Date: 2004-05-04 18:38
calendar.weekheader(n) is locale aware, which is good in principle. The parameter n, however, is interpreted as meaning bytes, not chars, which can generate broken strings for, e.g., localized weekday names:

>>> calendar.weekheader(2)
'Mo Tu We Th Fr Sa Su'
>>> locale.setlocale(locale.LC_ALL, "pt_BR.UTF-8")
'pt_BR.UTF-8'
>>> calendar.weekheader(2)
'Se Te Qu Qu Se S\xc3 Do'

Notice how "Sábado" (Saturday) above is missing the second utf-8 byte for the encoding of "á":

>>> u"Sá".encode("utf-8")
'S\xc3\xa1'

The implementation of weekheader (and of all of calendar.py, it seems) is based on localized 8 bit strings. I suppose the correct fix for this bug will involve a roundtrip thru unicode.
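The truncation reported above can be reproduced in current Python without touching the locale. The following is only a minimal sketch: it compares slicing the UTF-8 bytes of "Sábado" with slicing after a round trip through text. The helper names truncate_bytes and truncate_chars are hypothetical and are not part of the calendar module.

```python
# Illustration only: byte slicing vs. character slicing of a UTF-8 weekday name.
name = "Sábado"
encoded = name.encode("utf-8")


def truncate_bytes(data: bytes, n: int) -> bytes:
    """Cut after n bytes, which can split a multi-byte character."""
    return data[:n]


def truncate_chars(data: bytes, n: int, encoding: str = "utf-8") -> bytes:
    """Decode, cut after n characters, then encode again."""
    return data.decode(encoding)[:n].encode(encoding)


broken = truncate_bytes(encoded, 2)  # ends in the middle of the encoding of "á"
intact = truncate_chars(encoded, 2)  # keeps both bytes of "á"
```

The second helper is the "roundtrip thru unicode" idea in its simplest form; it assumes the encoding of the byte string is known.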
msg20693 - (view) Author: Hyeshik Chang (hyeshik.chang) * (Python committer) Date: 2004-05-07 23:57
I think the argument to calendar.weekheader should mean neither chars nor bytes but width, because the function is currently used for fixed-width representations of calendars. Yes, they are the same for western alphabets. But many CJK characters are full width, so they need only 1 character for calendar.weekheader(2); that is conventional in real life, too. But we don't have unicode.width() support to implement the feature yet.
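The width-based reading suggested here can be sketched with unicodedata.east_asian_width, which is available in current Python. This is an assumption-laden illustration, not what the calendar module does: it counts Wide and Fullwidth characters as two columns and everything else as one, and it ignores combining characters. The function names display_width and truncate_to_width are hypothetical.

```python
import unicodedata


def display_width(ch: str) -> int:
    """Assumed rule: Wide ('W') and Fullwidth ('F') characters take two columns."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def truncate_to_width(text: str, width: int) -> str:
    """Keep the leading characters that fit in `width` columns, then pad with spaces."""
    kept = []
    used = 0
    for ch in text:
        w = display_width(ch)
        if used + w > width:
            break
        kept.append(ch)
        used += w
    return "".join(kept) + " " * (width - used)
```

With this rule a two-column header cell holds two Latin letters or a single full-width CJK character, which matches the convention described in the message.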
msg20694 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2004-06-02 19:08
Maybe we should have a second version of calendar (named ucalendar?) that works with unicode strings? Could those two modules be rewritten to use as much common functionality as possible? Or we could use a module global to configure whether str or unicode should be returned? Most of the localization functionality in calendar seems to come from datetime.datetime.strftime(), so it probably would help to have a method datetime.datetime.ustrftime() that returns the formatted string as unicode (using the locale encoding). Assigning to MvL as the locale/unicode expert.
msg20695 - (view) Author: Martin v. Löwis (loewis) * (Python committer) Date: 2004-06-03 04:43
Adding an ucalendar module would be reasonable, IMO. Introducing ustrftime is not necessary - we could just apply the "unicode in/unicode out" procedure (i.e. if the format is a Unicode string, return a Unicode result). The tricky part of that is to convert the strftime result to Unicode. We could try mbstowcs, but that would fail if the locale doesn't use Unicode for wchar_t. Once ucalendar is written, we could document that the calendar module has known problems if the locale's encoding is not Latin-1. However, I'm not going to implement that any time soon, so unassigning.
msg20696 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2004-07-21 19:17
The following patch doesn't fix the unicode problem, but it should enable us to have both 8bit and unicode calendars. It reimplements the calendar functionality as classes. This makes it possible to reuse the date calculation logic and extend or replace the string formatting logic. Implementing a unicode version would be done by subclassing TextCalendar and overriding formatweekday() and formatmonthname().

The patch adds several other features:
- An HTML version of a calendar can be output. (An example output can be found at http://styx.livinglogic.de/~walter/calendar/calendar.html).
- The calendar module can be used as a script from the command line. Various options are available.
- It's possible to specify the number of months per row (they were fixed at 3 in the old version).

If this patch is accepted I can provide documentation and tests.
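The subclassing approach described in this message can be sketched against the calendar module as it exists in current Python, where TextCalendar provides formatweekday(day, width) and formatweekheader(width). The class name ShortNameCalendar and its hard-coded Portuguese names are assumptions made for illustration; the patch itself is not reproduced here.

```python
import calendar


class ShortNameCalendar(calendar.TextCalendar):
    """Hypothetical subclass that replaces only the weekday formatting."""

    # Illustrative names, Monday first; a real version would take them from the locale.
    NAMES = ["Seg", "Ter", "Qua", "Qui", "Sex", "Sáb", "Dom"]

    def formatweekday(self, day, width):
        # Slice by characters, not bytes, so "Sáb" is never cut mid-character.
        return self.NAMES[day][:width].center(width)


# formatweekheader() calls formatweekday() once per weekday, so the date logic is reused.
header = ShortNameCalendar().formatweekheader(2)
```

Only the string formatting is overridden; the week iteration comes unchanged from the base class, which is the reuse the message argues for.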
msg20697 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 15:14
Here's a new version of the patch with documentation for the Calendar classes and a new test. The script interface isn't documented in the TeX file (python -mcalendar --help should be enough).
msg20698 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 17:11
Checked in calendar.diff as r43483.
msg20699 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 17:36
This second patch (calendar2.diff) adds new subclasses LocaleTextCalendar and LocaleHTMLCalendar that output localized month and weekday names and can cope with encodings.
msg20700 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-03-31 17:45
This third patch (calendar3.diff) is a variant of the second patch that uses xmlcharrefreplace error handling in the HTML calendar.
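For readers unfamiliar with the xmlcharrefreplace error handler mentioned here, a minimal sketch follows. It only shows what the handler does to a non-ASCII weekday name when encoding to a narrower charset; how calendar3.diff wires it into the HTML calendar is not shown in this thread, and the variable names are illustrative.

```python
# Encode a localized name to ASCII; characters that do not fit become
# numeric XML character references instead of raising UnicodeEncodeError.
name = "Sábado"
html_safe = name.encode("ascii", "xmlcharrefreplace")

# With the default "strict" handler the same call would fail.
try:
    name.encode("ascii")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

Because browsers decode the character reference back to "á", the HTML output stays readable whatever the target encoding is.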
msg20701 - (view) Author: Walter Dörwald (doerwalter) * (Python committer) Date: 2006-04-01 07:57
Checked in calendar3.diff (minus the test) as r43531.
History
Date User Action Args
2022-04-11 14:56:04 admin set github: 40218
2004-05-04 18:38:31 leorochael create