[Python-Dev] PEP: Adding data-type objects to Python
M.-A. Lemburg mal at egenix.com
Sat Oct 28 21:31:34 CEST 2006
Travis E. Oliphant wrote:
> M.-A. Lemburg wrote:
>> Travis E. Oliphant wrote:
>>> M.-A. Lemburg wrote:
>>>> Travis E. Oliphant wrote:
>>>>> ------------------------------------------------------------------------
>>>>> PEP:
>>>>> Title: Adding data-type objects to the standard library
>>>>>
>>>>> Attributes
>>>>>
>>>>>    kind -- returns the basic "kind" of the data-type. The basic kinds
>>>>>            are:
>>>>>        't' - bit,
>>>>>        'b' - bool,
>>>>>        'i' - signed integer,
>>>>>        'u' - unsigned integer,
>>>>>        'f' - floating point,
>>>>>        'c' - complex floating point,
>>>>>        'S' - string (fixed-length sequence of char),
>>>>>        'U' - fixed length sequence of UCS4,
>>>>
>>>> Shouldn't this read "fixed length sequence of Unicode" ?!
>>>> The underlying code unit format (UCS2 and UCS4) depends on the
>>>> Python version.
>>>
>>> Well, in NumPy 'U' always means UCS4. So, I just copied that over. See
>>> my questions at the bottom which talk about how to handle this. A
>>> data-format does not necessarily have to correspond to something Python
>>> represents with an Object.
>>
>> Ok, but why are you being specific about UCS4 (which is an internal
>> storage format), while you are not specific about e.g. the
>> internal bit size of the integers (which could be 32 or 64 bit) ?
>
> The 'kind' does not specify how "big" the data-type (data-format) is. A
> number is needed to represent the number of bytes. In this case, the
> 'kind' does not specify how large the data-type is. You can have 'u1',
> 'u2', 'u4', etc.
>
> The same is true with Unicode. You can have 10-character unicode
> elements, 20-character, etc. But, we have to be clear about what a
> "character" is in the data-format.

I understand and that's why I'm asking why you made the range explicit in the definition.
The definition should talk about Unicode code points. The number of bytes then determines whether you can only represent the ASCII subset (1 byte), UCS2 (2 bytes, BMP only) or UCS4 (4 bytes, all currently assigned code points).
This is similar to the range for integers (i.e. ZZ_0), where the number of bytes determines the range of numbers that can be represented.
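
To make the analogy concrete, here is a minimal Python sketch. The function names min_code_unit_bytes and unsigned_range are hypothetical and appear in neither the PEP nor NumPy. The 1/2/4-byte thresholds follow the mapping above (ASCII subset, UCS2/BMP, UCS4), and the integer helper assumes plain unsigned integers like the 'u1', 'u2', 'u4' examples.

```python
def min_code_unit_bytes(text):
    """Return the smallest fixed width (1, 2 or 4 bytes per code point)
    that can store every code point in `text`, using the mapping from the
    discussion: 1 byte = ASCII subset, 2 bytes = UCS2 (BMP only),
    4 bytes = UCS4."""
    highest = max(map(ord, text), default=0)  # empty string fits in 1 byte
    if highest <= 0x7F:
        return 1
    if highest <= 0xFFFF:
        return 2
    return 4


def unsigned_range(nbytes):
    """Return the inclusive (min, max) range of an unsigned integer stored
    in `nbytes` bytes, e.g. 'u1', 'u2', 'u4'."""
    return 0, 2 ** (8 * nbytes) - 1


if __name__ == "__main__":
    for sample in ("abc", "\u20ac", "\U0001d11e"):
        print(repr(sample), min_code_unit_bytes(sample))
    for size in (1, 2, 4):
        print("u%d" % size, unsigned_range(size))
```

In both helpers the kind ("Unicode code points", "unsigned integer") stays the same, and only the byte count changes which values are representable.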
-- Marc-Andre Lemburg eGenix.com