Issue 35980: Py3 BIF random.choices() is O(N**2) but I've written O(N) code for the same task (original) (raw)

Created on 2019-02-13 01:08 by shawn_berry, last changed 2022-04-11 14:59 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
Improved_Py3_BIF_random_dot_choices.py shawn_berry,2019-02-13 01:08 Improves Py3 BIF random.choices() from O(N**2) to O(N)

| Repositories containing patches | | | | | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | | | | https://github.com/shawnberry/Improved_random.choices/blob/master/Improved_Py3_BIF_random_dot_choices.py | | | |

Messages (5)
msg335379 - (view) Author: shawnberry (shawn_berry) * Date: 2019-02-13 01:08
Please see my GitHub page https://github.com/shawnberry/Improved_random.choices/blob/master/Improved_Py3_BIF_random_dot_choices.py for code that reduces Py3 BIF random.choices() from O(N**2) to O(N). This is my first suggestion to improve Python code. Thanks, shawnberry121@gmail.com
msg335381 - (view) Author: Ammar Askar (ammar2) * (Python committer) Date: 2019-02-13 01:27
I can't speak to the complexity of the choices function but if you're proposing an alternative implementation for choices, it would be wise to show the benefits empirically. That is, benchmarks and an explanation of why your implementation would be better than just an example of your code among an example. Additionally, you need to expand your implementation to match all the functionality of the current function and make a pull request to the main Python repository. Adding Raymond to nosy, the original author of random.choices
msg335382 - (view) Author: Steven D'Aprano (steven.daprano) * (Python committer) Date: 2019-02-13 01:45
What's "BIF" mean? You use that term multiple times but I have never heard it before. I'm sorry, I don't understand your code (and don't have time to study it in detail to decipher it). It would help if you factored out your new implementation of choices() into a separate function. I see that your timing code it not reliable for timing individual function calls. You include your setup code that builds the input to choices() as well as the time it takes to print messages, so we can't really draw any reliable conclusions about the speed of choices() from your timings. To give an analogy: you start the stopwatch at home. Climb into the shower and wash, get dressed, have breakfast, drive to work, park the car, buy a coffee at the shop next door, go to your office, greet your fellow co-workers, and finally stop the stopwatch. And then you say that the total time measured was the time it took you to drive to work alone. Finally, I can't actually work out what part of your code is intended as a replacement for the choices() function. You should factor it out into a separate function, then time the two function calls: choices(dataset, weights) shawn_choices(dataset, weights) alone (without timing the setup of the datasets, or the I/O, or any other irrelevant costs). You probably should use the timeit module for the actual timing. As for the versions of Python, 3.6 and older are in "bug fix only" mode, so this will not apply to anything older than 3.7.
msg335389 - (view) Author: shawnberry (shawn_berry) * Date: 2019-02-13 07:41
Hi PSF, Upon further review, random.choices() is O(N). I was looping through the values in random.choices() to make it O(N**2) in my code. Next time I will thoroughly check a function on its own before raising any issue. Fyi, BIF is built-in function. Please close this issue. Best, Shawn On Tue, Feb 12, 2019 at 5:45 PM Steven D'Aprano <report@bugs.python.org> wrote: > > Steven D'Aprano <steve+python@pearwood.info> added the comment: > > What's "BIF" mean? You use that term multiple times but I have never heard > it before. > > I'm sorry, I don't understand your code (and don't have time to study it > in detail to decipher it). It would help if you factored out your new > implementation of choices() into a separate function. > > I see that your timing code it not reliable for timing individual function > calls. You include your setup code that builds the input to choices() as > well as the time it takes to print messages, so we can't really draw any > reliable conclusions about the speed of choices() from your timings. > > To give an analogy: you start the stopwatch at home. Climb into the shower > and wash, get dressed, have breakfast, drive to work, park the car, buy a > coffee at the shop next door, go to your office, greet your fellow > co-workers, and finally stop the stopwatch. And then you say that the total > time measured was the time it took you to drive to work alone. > > Finally, I can't actually work out what part of your code is intended as a > replacement for the choices() function. You should factor it out into a > separate function, then time the two function calls: > > choices(dataset, weights) > shawn_choices(dataset, weights) > > alone (without timing the setup of the datasets, or the I/O, or any other > irrelevant costs). You probably should use the timeit module for the actual > timing. > > As for the versions of Python, 3.6 and older are in "bug fix only" mode, > so this will not apply to anything older than 3.7. > > ---------- > nosy: +steven.daprano > versions: -Python 3.4, Python 3.5, Python 3.6 > > _______________________________________ > Python tracker <report@bugs.python.org> > <https://bugs.python.org/issue35980> > _______________________________________ >
msg335392 - (view) Author: Mark Dickinson (mark.dickinson) * (Python committer) Date: 2019-02-13 07:45
Closing as requested by OP.
History
Date User Action Args
2022-04-11 14:59:11 admin set github: 80161
2019-02-13 07:49:16 SilentGhost set resolution: not a bug
2019-02-13 07:45:19 mark.dickinson set status: open -> closednosy: + mark.dickinsonmessages: + stage: resolved
2019-02-13 07:43:32 rhettinger set assignee: rhettinger
2019-02-13 07:41:12 shawn_berry set messages: +
2019-02-13 01:45:06 steven.daprano set nosy: + steven.dapranomessages: + versions: - Python 3.4, Python 3.5, Python 3.6
2019-02-13 01:43:55 eamanu set nosy: + eamanu
2019-02-13 01:27:22 ammar2 set nosy: + rhettinger, ammar2messages: +
2019-02-13 01:08:43 shawn_berry create