[Python-Dev] Array Enhancements (original) (raw)
Scott Gilbert xscottg@yahoo.com
Mon, 8 Apr 2002 00:52:43 -0700 (PDT)
- Previous message: [Python-Dev] Array Enhancements
- Next message: [Python-Dev] Array Enhancements
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Thanks for the various replies. As suggested by a few, I'll take this to the Numarray folks and see where it goes from there.
Just to respond to a few of the points though... I've put all my responses in one message to wrap things up.
Tim Peters wrote:
Sounds like a PEP to me.
My initial response to reading this was a loud "ugh" as I envisioned red tape swarming around for what I would consider to be a pretty simple patch. I mean, I just wanted to hack in some new typecodes...
After thinking about things for a while though, I've come to the conclusion that the builtin Python array module does need a real reworking. Even though one ships with the standard baseline, it's getting reinvented again and again.
I hope the Numarray guys are doing a bang-up job with their NDArray type (I've looked at it briefly, but I don't really understand it yet...). I suspect that most of the ufuncs and other stuff those guys are doing are too special purpose to be part of the standard Python baseline, but I would very much like to see a single usable array type become the standard. I'd be willing to do PEP grunt work for that.
Tim Peters also wrote:
> ...
> *** I really need complex types. And more than the functionality
> provided by Numeric/Numarray, I need complex integer types.
This will meet resistance, as it's a pile of code of no conceivable use to the vast majority of Python users. That is, "code bloat". Instead the array type should be subclassable, and extreme special-purpose hair like "complex integers" should be supplied by extension modules.
It's not that much bloat. It would be a setitem and getitem pair for each new type.
I'll give you that most people don't need "fixed point complex arrays".
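To make the "setitem and getitem pair for each new type" point concrete, here is a rough Python sketch of the idea. It is not how arraymodule.c is written; the `TYPECODES` table, the `TypedBuffer` class, and the `'Z'` typecode for a complex int16 are all names invented for illustration. Each entry is just an item size plus one getter and one setter.

```python
import struct

# Hypothetical descriptor table: typecode -> (itemsize, getitem, setitem).
# The 'Z' code (complex made of two int16 values) is invented for this sketch.

def _get_int16(buf, index):
    return struct.unpack_from("<h", buf, index * 2)[0]

def _set_int16(buf, index, value):
    struct.pack_into("<h", buf, index * 2, value)

def _get_cint16(buf, index):
    re, im = struct.unpack_from("<hh", buf, index * 4)
    return complex(re, im)

def _set_cint16(buf, index, value):
    value = complex(value)
    struct.pack_into("<hh", buf, index * 4, int(value.real), int(value.imag))

TYPECODES = {
    "h": (2, _get_int16, _set_int16),
    "Z": (4, _get_cint16, _set_cint16),
}

class TypedBuffer:
    """Fixed-length array whose element access is driven by the table above."""

    def __init__(self, typecode, length):
        self.itemsize, self._get, self._set = TYPECODES[typecode]
        self._length = length
        self._buf = bytearray(self.itemsize * length)

    def __len__(self):
        return self._length

    def _check(self, index):
        if not 0 <= index < self._length:
            raise IndexError("index out of range")

    def __getitem__(self, index):
        self._check(index)
        return self._get(self._buf, index)

    def __setitem__(self, index, value):
        self._check(index)
        self._set(self._buf, index, value)
```

Under these assumptions, adding another element type means adding one more pair of functions and one more table entry; nothing else in the container changes.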
Guido van Rossum wrote:
You'll have to consider: is it important to be able to read pickled arrays on previous Python releases, or is that not a requirement? If it's not, you should probably add a new pickle code for pickled arrays, and do an implementation that writes;
Nope, we ship the version of Python we want them to use with our applications.
Did you guys really make it possible to unpickle a Unicode string in versions of Python that were pre Unicode?
I would think new features should only work in new versions...
Guido also wrote:
Ehm, 'u' is already taken (Unicode).
That must have snuck in there sometime after 2.2 I guess.
Guido also wrote:
> *** The ability to construct an array object from an existing C
> pointer. We get our memory in all kinds of ways (valloc for page
> aligned DMA transfers, shmem etc...), and it would be nice not to
> copy in and copy out in some cases.
But then you get into ownership issues. Who owns that memory? Who can free it? What if someone calls a method on the array that requires the memory to be resized? But it's a useful thing to be able to do, I agree, and it shouldn't be too hard to add a flag that says "I don't own this memory" -- which would mean that the buffer can't be resized at all.
I pictured this working like CObjects do where you pass in a destructor for when the reference count goes to zero. Possibly also passing in a realloc function. If the realloc function is null, then an exception is raised when someone tries to resize the array.
This means there would need to be a C visible API for building array objects around special types of memory though.
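The real thing would have to be a C API, but the ownership rules described above can be modelled in Python as a sketch. Everything here is hypothetical: `BorrowedArray`, its `destructor` and `realloc` arguments, and the choice of `BufferError` for the failed resize are assumptions, and an explicit `close()` stands in for the reference count reaching zero.

```python
class BorrowedArray:
    """Array view over memory that somebody else allocated."""

    def __init__(self, memory, destructor=None, realloc=None):
        self._memory = memory            # any writable buffer object
        self._view = memoryview(memory)  # no copy is made
        self._destructor = destructor    # called once when we let go
        self._realloc = realloc          # None means "cannot be resized"
        self._released = False

    def __len__(self):
        return len(self._view)

    def __getitem__(self, index):
        return self._view[index]

    def __setitem__(self, index, value):
        self._view[index] = value

    def resize(self, nbytes):
        if self._realloc is None:
            raise BufferError("array does not own its memory; cannot resize")
        self._view.release()  # drop our export before the memory moves
        self._memory = self._realloc(self._memory, nbytes)
        self._view = memoryview(self._memory)

    def close(self):
        # Stand-in for the refcount dropping to zero in a C implementation.
        if not self._released:
            self._released = True
            self._view.release()
            if self._destructor is not None:
                self._destructor(self._memory)


def grow_bytearray(memory, nbytes):
    """Example realloc callback: copy into a new bytearray of nbytes."""
    resized = bytearray(nbytes)
    keep = min(nbytes, len(memory))
    resized[:keep] = memory[:keep]
    return resized
```

Passing no `realloc` gives the "I don't own this memory" behaviour from the quoted reply: reads and writes go straight through to the caller's buffer, and any resize attempt fails loudly instead of silently copying.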
Guido also wrote:
Since arrays are all about compromises that trade flexibility for speed and memory footprint, you can't have a one size fits all. :-)
Bahh. I don't think getting a good general purpose Python object that represents arbitrary C arrays is all that impossible. C arrays just don't do that much.
Besides, I didn't say "one size fits all", I said "one size fits all my needs". That "my" is important (at least to me :-)
Guido also wrote:
> Well if someone authoritative tells me that all of the above is a
> great idea, I'll start working on a patch and scratch my plans to
> create a "not in house" xarray module.
It all depends on the quality of the patch. By the time you're done you may have completely rewritten the array module, and then the question is, wouldn't your own xarray module have been quicker to implement, because it doesn't need to preserve backwards compatibility?
Yup, I think I would be done with my xarray module by now if I had written it instead of taking this route. It would also have the disadvantage that it doesn't play nice with anyone else.
I now think the best bet is to replace the array module with something flexible enough to:
- do what it currently does
- do what the Numarray guys need
- do what I need
Guido also wrote:
An alternative might be a separate bit-array implementation: it seems that the bit-array won't share much code with the regular array (of any flavor), so why not make it a separate type?
Yup. It would be nice if a bitarray was actually the same type, but having code like:
    if (o->is_bitarray) {
        /* do something */
    } else {
        /* do every other byte addressable type */
    }
is a little ugly.
David Ascher wrote:
> I just realized that multi-dimensional getitem shouldn't be a
> big deal. The question is, given the above declaration, what a[0]
> should return: the same as a[0, 0] or a copy of a[0, 0:20000] or
> a reference to a[0, 0:20000].
Or a ValueError? In the face of ambiguity, refuse the temptation to guess.
Yup. I think there should be a base array type that raises a ValueError or similar, and derived array types can implement slice references or slice copies as need be.
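As a rough illustration of that split, here is a small Python sketch. The class names `BaseArray2D` and `RowCopyArray2D` and the `_partial_getitem` hook are made up for this example; it only handles plain integer indexes on a two-dimensional array.

```python
class BaseArray2D:
    """Base type: a[row, col] works, anything less specific is refused."""

    def __init__(self, rows, cols):
        self.shape = (rows, cols)
        self._data = [0] * (rows * cols)

    @staticmethod
    def _is_full_index(key):
        return (isinstance(key, tuple) and len(key) == 2
                and all(isinstance(k, int) for k in key))

    def _offset(self, row, col):
        rows, cols = self.shape
        if not (0 <= row < rows and 0 <= col < cols):
            raise IndexError("index out of range")
        return row * cols + col

    def __getitem__(self, key):
        if self._is_full_index(key):
            return self._data[self._offset(*key)]
        return self._partial_getitem(key)

    def __setitem__(self, key, value):
        if not self._is_full_index(key):
            raise ValueError("expected a[row, col]")
        self._data[self._offset(*key)] = value

    def _partial_getitem(self, key):
        # Refuse to guess between a[0, 0], a copy, or a reference.
        raise ValueError("ambiguous index %r: expected a[row, col]" % (key,))


class RowCopyArray2D(BaseArray2D):
    """Derived type that chooses 'a[row] returns a copy of that row'."""

    def _partial_getitem(self, key):
        if not isinstance(key, int):
            return super()._partial_getitem(key)
        start = self._offset(key, 0)
        return self._data[start:start + self.shape[1]]
```

A derived type that wanted references instead of copies would override the same hook and return some view object; the base type never has to pick a side.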
David also wrote:
Why does submitting a patch to arraymodule seem an easier path than modifying numarray or numpy to support what's needed? I believe that the goals of numarray aren't that different from what Scott is trying to do (memory management APIs, etc.).
Well, part of my preference for modifying arraymodule.c instead of Numarray is that I very quickly understood what's going on in arraymodule.c, and a patch is pretty obvious. Looking at Numarray, I just don't get it yet. Please take this as a shortcoming in my abilities. Numarray does appear to be the heir-apparent though, so I'll give it a better look.
I also assumed that the Numarray folks would play nice with the standard array module. So if I could get what I wanted out of array, then I could leverage Numarray when the opportunity arose.
David also wrote:
I'd like to see fewer multi-dimensional array objects, not more...
I agree completely. In fact, I'd like to see one official one distributed with the baseline.
Perry Greenfield wrote:
[ a whole bunch of interesting things ]
I think I'll try to bring those up on the Numarray list.
Cheers, -Scott Gilbert