[Python-Dev] PEP 450 adding statistics module

Oscar Benjamin oscar.j.benjamin at gmail.com
Sun Sep 8 22:48:26 CEST 2013


On 8 September 2013 18:32, Guido van Rossum <guido at python.org> wrote:

> Going over the open issues:
>
> - Parallel arrays or arrays of tuples? I think the API should require an array of tuples. It is trivial to zip up parallel arrays to the required format, while if you have an array of tuples, extracting the parallel arrays is slightly more cumbersome. Also, for manipulating the raw data, an array of tuples makes it easier to do insertions or removals without worrying about losing the correspondence between the arrays.

For something like this, where there are multiple obvious formats for the input data, I think it's reasonable to just request whatever is convenient for the implementation. Otherwise you're asking at least some of your users to convert data from one format to another just so that you can convert it back again. In any real problem you'll likely have more than two variables, so you'll be writing some code to prepare the data for the function anyway.
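To make the conversion cost concrete, here is a minimal sketch of going between the two formats using only the built-in zip. The sample values and variable names are made up for illustration.

```python
# Illustrative data only; the names xs, ys and pairs are assumptions.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.1, 5.9]

# Parallel sequences -> list of (x, y) tuples.
pairs = list(zip(xs, ys))

# List of tuples -> parallel sequences. zip(*...) yields tuples, so
# convert to lists if mutability matters. Note that this unpacking
# raises ValueError when pairs is empty.
xs_again, ys_again = (list(column) for column in zip(*pairs))
```

Either direction is a one-liner, but each builds a full copy of the data, which is the round trip a function accepting both formats would avoid.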

The most obvious alternative that isn't explicitly mentioned in the PEP is to accept either format:

    def correlation(x, y=None):
        if y is None:
            # One argument: an iterable of (x, y) pairs.
            xs = []
            ys = []
            for xi, yi in x:
                xs.append(xi)
                ys.append(yi)
        else:
            # Two arguments: parallel sequences.
            xs = list(x)
            ys = list(y)
            assert len(xs) == len(ys)
        # In reality a helper function does the above.
        # Now compute stuff

This avoids any unnecessary conversions and is as convenient as possible for all users at the expense of having a slightly more complicated API.

Oscar
