msg335379 - (view) |
Author: shawnberry (shawn_berry) * |
Date: 2019-02-13 01:08 |
Please see my GitHub page https://github.com/shawnberry/Improved_random.choices/blob/master/Improved_Py3_BIF_random_dot_choices.py for code that reduces Py3 BIF random.choices() from O(N**2) to O(N). This is my first suggestion to improve Python code. Thanks, shawnberry121@gmail.com |
|
|
msg335381 - (view) |
Author: Ammar Askar (ammar2) *  |
Date: 2019-02-13 01:27 |
I can't speak to the complexity of the choices function but if you're proposing an alternative implementation for choices, it would be wise to show the benefits empirically. That is, benchmarks and an explanation of why your implementation would be better than just an example of your code among an example. Additionally, you need to expand your implementation to match all the functionality of the current function and make a pull request to the main Python repository. Adding Raymond to nosy, the original author of random.choices |
|
|
msg335382 - (view) |
Author: Steven D'Aprano (steven.daprano) *  |
Date: 2019-02-13 01:45 |
What's "BIF" mean? You use that term multiple times but I have never heard it before. I'm sorry, I don't understand your code (and don't have time to study it in detail to decipher it). It would help if you factored out your new implementation of choices() into a separate function. I see that your timing code it not reliable for timing individual function calls. You include your setup code that builds the input to choices() as well as the time it takes to print messages, so we can't really draw any reliable conclusions about the speed of choices() from your timings. To give an analogy: you start the stopwatch at home. Climb into the shower and wash, get dressed, have breakfast, drive to work, park the car, buy a coffee at the shop next door, go to your office, greet your fellow co-workers, and finally stop the stopwatch. And then you say that the total time measured was the time it took you to drive to work alone. Finally, I can't actually work out what part of your code is intended as a replacement for the choices() function. You should factor it out into a separate function, then time the two function calls: choices(dataset, weights) shawn_choices(dataset, weights) alone (without timing the setup of the datasets, or the I/O, or any other irrelevant costs). You probably should use the timeit module for the actual timing. As for the versions of Python, 3.6 and older are in "bug fix only" mode, so this will not apply to anything older than 3.7. |
|
|
msg335389 - (view) |
Author: shawnberry (shawn_berry) * |
Date: 2019-02-13 07:41 |
Hi PSF, Upon further review, random.choices() is O(N). I was looping through the values in random.choices() to make it O(N**2) in my code. Next time I will thoroughly check a function on its own before raising any issue. Fyi, BIF is built-in function. Please close this issue. Best, Shawn On Tue, Feb 12, 2019 at 5:45 PM Steven D'Aprano <report@bugs.python.org> wrote: > > Steven D'Aprano <steve+python@pearwood.info> added the comment: > > What's "BIF" mean? You use that term multiple times but I have never heard > it before. > > I'm sorry, I don't understand your code (and don't have time to study it > in detail to decipher it). It would help if you factored out your new > implementation of choices() into a separate function. > > I see that your timing code it not reliable for timing individual function > calls. You include your setup code that builds the input to choices() as > well as the time it takes to print messages, so we can't really draw any > reliable conclusions about the speed of choices() from your timings. > > To give an analogy: you start the stopwatch at home. Climb into the shower > and wash, get dressed, have breakfast, drive to work, park the car, buy a > coffee at the shop next door, go to your office, greet your fellow > co-workers, and finally stop the stopwatch. And then you say that the total > time measured was the time it took you to drive to work alone. > > Finally, I can't actually work out what part of your code is intended as a > replacement for the choices() function. You should factor it out into a > separate function, then time the two function calls: > > choices(dataset, weights) > shawn_choices(dataset, weights) > > alone (without timing the setup of the datasets, or the I/O, or any other > irrelevant costs). You probably should use the timeit module for the actual > timing. > > As for the versions of Python, 3.6 and older are in "bug fix only" mode, > so this will not apply to anything older than 3.7. > > ---------- > nosy: +steven.daprano > versions: -Python 3.4, Python 3.5, Python 3.6 > > _______________________________________ > Python tracker <report@bugs.python.org> > <https://bugs.python.org/issue35980> > _______________________________________ > |
|
|
msg335392 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2019-02-13 07:45 |
Closing as requested by OP. |
|
|