[Numpy-discussion] fromiter

Tim Hochberg tim.hochberg at cox.net
Fri Jun 2 23:15:33 EDT 2006


Some time ago some people, myself included, were making some noise about having 'array' iterate over iterable objects, producing ndarrays in a manner analogous to the way sequences are treated. I finally got around to looking at it seriously and came to the following three conclusions:

  1. All I really care about is the 1D case where dtype is specified. This case should be relatively easy to implement so that it's fast. Most other cases are not likely to be much faster than converting the iterators to lists at the Python level and then passing those lists to array.
  2. 'array' already has plenty of special cases. I'm reluctant to add more.
  3. Adding this to 'array' would be non-trivial. The more cases we tried to make fast, the more likely that some of the paths would be buggy. Regardless of how we did it though, some cases would be much slower than others, which would probably be surprising.

So, with that in mind, I retreated a little and implemented the simplest thing that did the stuff that I cared about:

fromiter(iterable, dtype, count) => ndarray of type dtype and length count
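
To make the interface concrete, here is a minimal sketch that uses numpy.fromiter as it exists in NumPy today. Note one assumption: in current NumPy, count is optional (it defaults to -1, meaning "read everything"), whereas in the proposal above it is always required. The generator 'squares' is a made-up example, not part of the proposal.

```python
import numpy as np

def squares(n):
    """Hypothetical generator, used only for illustration."""
    for i in range(n):
        yield i * i

n = 10

# The proposed path: consume the iterator directly into a 1D array.
# dtype and count are both passed explicitly, as the proposal requires.
a = np.fromiter(squares(n), dtype=np.int64, count=n)

# The existing workaround: build a Python list first, then convert it.
b = np.array(list(squares(n)), dtype=np.int64)
```

Both paths should give the same 1D array; the difference is only that the second one builds an intermediate list of Python objects.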

This is essentially the same interface as fromstring except that the values of dtype and count are always required. Some primitive benchmarking indicates that 'fromiter(generator, dtype, count)' is about twice as fast as 'array(list(generator))' for medium to large arrays. When producing very large arrays, the advantage of fromiter is larger, presumably because 'list(generator)' causes things to start swapping.
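
For anyone who wants to repeat the comparison, a rough timing harness might look like the sketch below. It is not the benchmark referred to above: the generator, the array size and the repeat counts are all arbitrary assumptions, and the ratio will depend on the NumPy version, the dtype and the machine.

```python
import timeit
import numpy as np

def gen(n):
    # Hypothetical workload: a generator of n Python floats.
    return (float(i) for i in range(n))

def via_fromiter(n):
    # Fill the array straight from the iterator.
    return np.fromiter(gen(n), dtype=float, count=n)

def via_list(n):
    # Materialize a list first, then hand it to array.
    return np.array(list(gen(n)))

n = 100_000  # arbitrary "medium" size; change to taste

# Take the best of a few repeats to reduce timing noise.
t_fromiter = min(timeit.repeat(lambda: via_fromiter(n), number=5, repeat=3))
t_list = min(timeit.repeat(lambda: via_list(n), number=5, repeat=3))

print("fromiter:    %.4f s" % t_fromiter)
print("array(list): %.4f s" % t_list)
```

Taking the minimum over several repeats is just one common way to damp noise; it says nothing about memory pressure, which is where the very-large-array case is presumed to differ.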

Anyway I'm about to bail out of town till the middle of next week, so it'll be a while till I can get it clean enough to submit in some form or another. Plenty of time for people to think of why it's a terrible idea ;-)

-tim


