[Numpy-discussion] Bug in digitize function
David Huard david.huard at gmail.com
Thu Jun 29 14:42:51 EDT 2006
Hi,
Here is something I noticed with digitize() that I guess would qualify as a small but annoying bug.
In [165]: x = rand(10); bin = linspace(x.min(), x.max(), 10); print x.min(); print bin[0]; digitize(x,bin)
0.0925030184144
0.0925030184144
Out[165]: array([2, 9, 5, 9, 6, 1, 1, 1, 4, 5])

In [166]: x = rand(10); bin = linspace(x.min(), x.max(), 10); print x.min(); print bin[0]; digitize(x,bin)
0.0209738428066
0.0209738428066
Out[166]: array([ 5, 2, 8, 3, 0, 8, 9, 6, 10, 9])
Sometimes the smallest number in x is counted in the first bin, and sometimes it is counted as an outlier (bin number = 0). Moreover, creating the bins with bin = linspace(x.min()-eps, x.max(), 10) doesn't seem to solve the problem if eps is too small (e.g. 1./2**32). So basically, you can have

In [186]: x.min()>bin[0]
Out[186]: True

and yet digitize() considers x.min() as an outlier.
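For anyone wanting to re-run this check, here is a minimal sketch against a current NumPy release. It is not the 2006 session above: the seeded generator `rng` and the names `bins`, `idx`, `min_idx` and `max_idx` are illustrative choices, and the results reflect how present-day `np.linspace` and `np.digitize` behave, which may differ from the version that produced the output in this message.

```python
import numpy as np

# Seeded generator so the run is repeatable (an illustrative choice).
rng = np.random.default_rng(0)
x = rng.random(10)

# Ten edges spanning exactly [x.min(), x.max()], i.e. nine bins.
bins = np.linspace(x.min(), x.max(), 10)

# With the default right=False, digitize returns i such that
# bins[i-1] <= value < bins[i].
idx = np.digitize(x, bins)

min_idx = idx[np.argmin(x)]  # index assigned to the smallest value
max_idx = idx[np.argmax(x)]  # index assigned to the largest value

print(bins[0] == x.min())    # is the first edge exactly the minimum?
print(min_idx, max_idx)
```

In current NumPy the first edge returned by `linspace` equals its start argument exactly, so the minimum lands in bin 1 rather than being reported as an outlier. The maximum, which sits exactly on the last edge, gets index `len(bins)`, matching the 10 seen in Out[166].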
And to actually do something constructive, here is a docstring for digitize:

"""Given an array of values and bin edges, digitize(values, bin_edges)
returns the index of the bin each value falls into.

The first bin has index 1, and the last bin has index n, where n is the
number of bins. Values smaller than the lowest edge are assigned index 0,
while values larger than the highest edge are assigned index n+1.
"""
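The indexing convention in that docstring can be illustrated with a small hand-checked case. This is only a sketch against a current NumPy release with the default `right=False`; the arrays `edges` and `values` are made up for the illustration.

```python
import numpy as np

# Four edges define n = 3 bins: [0, 1), [1, 2), [2, 3).
edges = np.array([0.0, 1.0, 2.0, 3.0])
values = np.array([-0.5, 0.0, 0.5, 1.0, 2.9, 3.0, 3.5])

# Default right=False: each bin includes its left edge and
# excludes its right edge.
indices = np.digitize(values, edges)
print(indices)  # → [0 1 1 2 3 4 4]
```

Note one detail the docstring does not spell out: under this default, a value exactly equal to the highest edge (3.0 here) is also assigned n+1, not n, because the last bin excludes its right edge.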
Cheers,
David

P.S. Many of the mails I send don't make it to the list. Is it Gmail-related?