Issue 1159501: Improve %s support for unicode

Created on 2005-03-09 01:43 by nascheme, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
unicode_format.txt nascheme,2005-03-09 01:43
pyobject_text.txt nascheme,2005-03-10 21:13 version 2 of patch
Messages (8)
msg47901 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-03-09 01:43
"'%s' % unicode_string" produces a unicode result. I think the following code should also return a unicode string:

class Wrapper:
    def __str__(self):
        return unicode_string

'%s' % Wrapper()

That behavior would make it easier to write library code that can work with either str objects or unicode objects. The fix is pretty simple (see the attached patch). Perhaps the PyObject_Text function should be called _PyObject_Text instead. Alternatively, if the function is made public then we should document it and perhaps also provide a builtin function called 'text' that uses it.
msg47902 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-03-09 10:10
Nice patch. Only nit: PyObject_Text() should check that the result of tp_str() is indeed either a string or unicode instance (possibly from a subclass). Otherwise, the function wouldn't be able to guarantee this feature - which is what it's all about.
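Editorial illustration (not part of the original thread or the attached patch): the return-type check suggested above can be sketched in pure Python. The function name text() follows the builtin proposed in this issue, but the body below is only an assumed analogue of what PyObject_Text() might do; the TEXT_TYPES tuple and the error message are hypothetical. On Python 2 the accepted types would be str and unicode; on Python 3 only str exists.

```python
import sys

# Assumption: the set of "text" types depends on the interpreter version.
if sys.version_info[0] == 2:
    TEXT_TYPES = (str, unicode)  # noqa: F821 (unicode exists only on Python 2)
else:
    TEXT_TYPES = (str,)


def text(obj):
    """Hypothetical pure-Python analogue of the proposed PyObject_Text()."""
    # Strings (including subclass instances) are returned unchanged.
    if isinstance(obj, TEXT_TYPES):
        return obj
    # Call __str__ directly so the result is not coerced to a byte string.
    result = obj.__str__()
    # The check requested in review: guarantee a string-like result.
    if not isinstance(result, TEXT_TYPES):
        raise TypeError(
            "__str__ returned non-string (type %s)" % type(result).__name__
        )
    return result
```

With this sketch, an object whose __str__ returns a unicode string keeps that string intact, while an object whose __str__ returns, say, an integer is rejected with TypeError instead of leaking a non-string to the caller.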
msg47903 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-03-10 21:12
Attaching a better patch. Add a builtin function called "text". Change PyObject_Text to check the return types as suggested by Marc-Andre. Update the documentation and the tests.
msg47904 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-03-10 21:13
Attempt to attach the patch again.
msg47905 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-04-20 21:00
Assigning to effbot for review. He had mentioned something about __text__ at one point.
msg47906 - (view) Author: Marc-Andre Lemburg (lemburg) * (Python committer) Date: 2005-04-20 21:27
Looks OK to me; not sure what you mean by __text__ - __str__ already took on that role long ago.
msg47907 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-04-20 21:46
Here's a quote from him:

> I'm beginning to think that we need an extra method (__text__), that
> can return any kind of string that's compatible with Python's text model.
>
> (in today's CPython, that's an 8-bit string with ASCII only, or a Unicode
> string. future Python's may support more string types, at least at
> the C implementation level).
>
> I'm not sure we can change __str__ or __unicode__ without breaking
> code in really obscure ways (but I'd be happy to be proven wrong).

My idea is that we can change __str__ without breaking code. The reason is that no one should be calling tp_str directly. Instead they use PyObject_Str.

I don't know what he meant by "string that's compatible with Python's text model". With my change, Python can only deal with str or unicode instances. I have no idea how we could support other string implementations.

I don't want to introduce a text() builtin that calls __str__ and then later realize that __text__ would be useful. Perhaps this change is big enough to require a PEP.
msg47908 - (view) Author: Neil Schemenauer (nascheme) * (Python committer) Date: 2005-08-22 20:57
Closing in favor of patch 1266570.
History
Date User Action Args
2022-04-11 14:56:10 admin set github: 41671
2005-03-09 01:43:16 nascheme create