Issue 18749: [issue 18606] Re: Add statistics module to standard library
I hope I'm doing the right thing by replying in-line. This is my first code review, please let me know if I'm doing something wrong.
By the way, the email hasn't gone to the tracker again. Is that a bug in the tracker? I've taken the liberty of changing the address to report@bugs.python.org.
On 15/08/13 22:58, ezio.melotti@gmail.com wrote:
http://bugs.python.org/review/18606/diff/8927/Lib/statistics.py File Lib/statistics.py (right):
http://bugs.python.org/review/18606/diff/8927/Lib/statistics.py#newcode113 Lib/statistics.py:113: date = "2013-08-13" Are these still needed after inclusion?
Probably not. Also the licence will need to be changed.
http://bugs.python.org/review/18606/diff/8927/Lib/statistics.py#newcode194 Lib/statistics.py:194: """ This would be good in the rst docs, but IMHO docstrings should be less verbose. If you end up copy/pasting all these into the rst file, you will duplicate all the docs and they will risk getting out of sync (in addition to having to update both every time).
Personally, I like having detailed docs in the docstring, at my fingers in the interactive interpreter. But I'll follow the consensus here.
http://bugs.python.org/review/18606/diff/8927/Lib/statistics.py#newcode277 Lib/statistics.py:277: assert isinstance(x, float) and isinstance(partials, list) Is this a good idea?
I think so :-) add_partials is internal/private, and so I don't have to worry about the caller providing wrong arguments, say a non-float. But I want some testing to detect coding errors. Using assert for this sort of internal pre-condition is exactly what assert is designed for.
http://bugs.python.org/review/18606/diff/8927/Lib/statistics.py#newcode524 Lib/statistics.py:524: """mode(data [, max_modes]) -> mode(s) The form "mode(data, max_modes=1) -> mode(s)" is preferred.
Is it? I see plenty of examples in the standard library of that form, e.g. str.find:
find(...) S.find(sub [,start [,end]]) -> int
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics.py File Lib/test/test_statistics.py (right):
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics.py#newcode46 Lib/test/test_statistics.py:46: 'missing name "%s" in all' % name) FWIW this should already be covered by test___all__.
Sorry, I don't understand this. test__all__?
[...]
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics.py#newcode144 Lib/test/test_statistics.py:144: assert data != sorted(data) Why not assertNotEqual?
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics.py#newcode385 Lib/test/test_statistics.py:385: assert x == inf Why not assertEqual?
In general, I use bare asserts for testing code logic, even if the code is test code. So if I use self.assertSpam(...) then I'm performing a unit test of the module being tested. If I use a bare assert, I'm asserting something about the test logic itself.
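The distinction can be shown with a small hypothetical test (the test name and data are illustrative; it assumes statistics.median, which the module under review provides):

```python
import unittest
from statistics import median

class TestMedian(unittest.TestCase):
    def test_unsorted_input(self):
        data = [3, 1, 4, 1, 5, 9, 2]
        # Bare assert: sanity-checks the test's own data, not the module.
        assert data != sorted(data)
        # unittest assertion: the actual test of the code under test.
        self.assertEqual(median(data), 3)
```

If the bare assert fires, the test data is broken; if the assertEqual fires, the statistics module is broken. Keeping the two visually distinct makes failures easier to diagnose.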
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics.py#newcode417 Lib/test/test_statistics.py:417: self.assertTrue(math.isnan(sum(data))) Since you seem to use this quite often, it might be better to define an assertIsNaN (and possibly assertIsNotNaN) in NumericTestCase and provide a better error message in case of failure. The same could apply for assertIsInf.
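Such helpers could look something like this (a hypothetical sketch of the suggestion, not the reviewed code):

```python
import math
import unittest

class NumericTestCase(unittest.TestCase):
    # Wrap the frequently-repeated math.isnan/math.isinf checks so a
    # failure reports the offending value instead of just "False is not true".
    def assertIsNaN(self, x, msg=None):
        if not math.isnan(x):
            self.fail(msg or "%r is not a NaN" % x)

    def assertIsNotNaN(self, x, msg=None):
        if math.isnan(x):
            self.fail(msg or "%r is a NaN" % x)

    def assertIsInf(self, x, msg=None):
        if not math.isinf(x):
            self.fail(msg or "%r is not infinite" % x)
```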
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics.py#newcode913 Lib/test/test_statistics.py:913: self.assertTrue(isinstance(result, Decimal)) assertIsInstance
I used to be able to remember all the unittest assert* methods... there are so many now, 31 not including deprecated aliases.
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics_approx.py File Lib/test/test_statistics_approx.py (right):
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics_approx.py#newcode1 Lib/test/test_statistics_approx.py:1: """Numeric approximated equal comparisons and unit testing. Do I understand correctly that this is just a helper module used in test_statistics and that it doesn't actually test anything from the statistics module?
Correct. It does, however, test itself.
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics_approx.py#newcode137 Lib/test/test_statistics_approx.py:137: # and avoid using TestCase.almost_equal, because it sucks :-) Could you elaborate on this?
Ah, I misspelled it; I meant "TestCase.assertAlmostEqual".
Using round() to test for equal-to-some-tolerance is quite an idiosyncratic way of doing approx-equality tests. I've never seen anyone do it that way before. It surprises me.
- It's easy to think that "places" means significant figures, not decimal places.
- There's now a delta argument (when was that added?) that is the same as my absolute error tolerance "tol", but no relative error argument.
- You can't set a per-instance error tolerance.
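For contrast, an approx-equality check supporting both absolute and relative tolerances can be sketched as follows (a simplified illustration of the idea, not the actual test_statistics_approx code; the default tolerances are arbitrary):

```python
def approx_equal(x, y, tol=1e-12, rel=1e-7):
    # True if x and y are equal to within an absolute tolerance `tol`
    # OR a relative tolerance `rel` -- unlike assertAlmostEqual, which
    # rounds to decimal places (or, with `delta`, applies only an
    # absolute tolerance).
    if x == y:
        return True
    diff = abs(x - y)
    if tol and diff <= tol:
        return True
    if rel:
        return diff <= rel * max(abs(x), abs(y))
    return False
```

Note that the check is symmetric in x and y, which is why the test at line 255 verifies approx_equal(b, a) as well as approx_equal(a, b).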
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics_approx.py#newcode241 Lib/test/test_statistics_approx.py:241: assert len(args1) == len(args2) Why not assertEqual?
As above, I use bare asserts to test the test logic, and assertSpam methods to perform the test. In this case, I'm confirming that I haven't created dodgy test data.
http://bugs.python.org/review/18606/diff/8927/Lib/test/test_statistics_approx.py#newcode255 Lib/test/test_statistics_approx.py:255: self.assertTrue(approx_equal(b, a, tol=0, rel=rel)) Why not assertApproxEqual?
Because I'm testing the approx_equal function. I can't use assertApproxEqual to test its own internals :-)
> I hope I'm doing the right thing by replying in-line. This is my first code review, please let me know if I'm doing something wrong.
> By the way, the email hasn't gone to the tracker again. Is that a bug in the tracker? I've taken the liberty of changing the address to report@bugs.python.org.
Apparently that doesn't work, since it created a new issue :)
The best way is to reply directly on http://bugs.python.org/review/18606/. You can either reply to the individual comments (better), or to the whole message like you did here. I don't know if there's a way to reply via email.