RFR 8003981: Support Parallel Array Sorting (original) (raw)
Chris Hegarty chris.hegarty at oracle.com
Fri Dec 21 20:01:42 UTC 2012
- Previous message: RFR 8003981: Support Parallel Array Sorting - JEP 103
- Next message: RFR 8003981: Support Parallel Array Sorting - JEP 103
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
The granularity is the threshold beyond which we no longer split the array, for its parts to be sorted as separate tasks, and then merged together. At some point it becomes more costly to do this split and merge than simply sorting the array. I think Paul put it best by calling finding the optimal threshold a black art.
This API is targeted at the average developer that simply wants the sorting to be as past a reasonably possible, on their multi-core machine, without having to run benchmarks, profiling, etc. I am concerned that exposing the threshold will simply confuse most developers.
-Chris
On 21 Dec 2012, at 18:21, Kurchi Hazra <kurchi.subhra.hazra at oracle.com> wrote:
Isn't the optimal granularity value different for different number of cores, differing amounts of memory available in a machine etc? If someone is trying to use a parallelSort to take adavntage of all the cores in a machine, he is anyway a little bit of an advanced user I guess.
- Kurchi On 21.12.2012 09:43, Remi Forax wrote: On 12/21/2012 05:49 PM, Chris Hegarty wrote:
Updated webrev/specdiff: http://cr.openjdk.java.net/~chegar/8003981/ver.01/
I also included 'webrevdiffAgainstVer00', so you can easily see the differences against the last webrev. Note: I remove any reference to the threshold,MINARRAYSORTGRAN, in the API, given the other comments on this change.
Remi, Thank you for the detailed review. Chris, all is green for me, thumb up. [...] > In all Sorter, instead of calling Arrays.sort(), you should call > DualPivotQuicksort.sort to avoid unnecessary range checks. Yes, I see your point. The implementation note in the spec says Arrays.sort, but I think this suggestion is good. I made the change, and since DualPivotQuicksort.sort is inclusive of toIndex/right -1 from toIndex/right. good catch ! [...]
-Chris. Rémi
On 12/21/2012 11:19 AM, Remi Forax wrote: On 12/20/2012 06:31 PM, Chris Hegarty wrote: This is a review request for the addition of utility methods to java.util.Arrays that provide sorting of arrays in parallel, JEP 103 [1]. Webrev: http://cr.openjdk.java.net/~chegar/8003981/ver.00/ Current sorting implementations provided by the Java Collections Framework (Collections.sort and Arrays.sort) all perform the sorting operation sequentially in the calling thread. This enhancement will offer the same set of sorting operations currently provided by the Arrays class, but with a parallel implementation that utilizes the Fork/Join framework. These new API’s are still synchronous with regard to the calling thread as it will not proceed past the sorting operation until the parallel sort is complete. The actual sorting API this proposal adds will leverage the ForkJoinPool.commonPool (default Fork/Join pool defined as part of JEP 155). The sorting algorithm is that used in Doug Lea’s ParallelArray implementation. This work was originally done over in the lambda/lambda repo by David Holmes, and is now being cleaned up and brought into jdk8. Open issues/differences from version in lambda: 1) Is is necessary for MINARRAYSORTGRAN to be in the public API? It is an implementation detail (easy to remove). 2) The use of FJP.commonPool is an implementation detail, not part of the spec. Should not be a problem, just worth pointing out, as it differs from what is in lambda/lambda. -Chris. [1] http://openjdk.java.net/jeps/103 Hi Chris, in Arrays.parallelSort(TYPE[] a, int fromIndex, int toIndex), there is no need to have the two supplementary local variable origin and fence, ws is not a great better name (workspaceArray?), I had to read the code of ArraySortHelpers to understand. ArraySortHelpers should be renamed to ArrayParallelSortHelpers. In ArraySortHelpers, all Sorter, Merger, etc. should declare only one field by line, currently, the formatting of the fields is weird. In all Sorter, instead of calling Arrays.sort(), you should call DualPivotQuicksort.sort to avoid unnecessary range checks. In compute() of Sorter and Merger, you should copy a and w in local variables (a = this.a) to help the VM to figure out that they never changed, it will also reduce the bytecode size. In Sorter.compute, n should be loaded in a local field, int l = origin; int g = gran; int n = this.n; In Merger.compute, the sequence of inits before the second while should be: int l = lo; int lFence = l + nleft; // instead of lo + nleft int r = ro; int rFence = r + nright; // instead of ro + nright int k = wo; it should not changed the performance but it's more inline with the coding style of the rest of the code IMO. cheers, Rémi -- -Kurchi
- Previous message: RFR 8003981: Support Parallel Array Sorting - JEP 103
- Next message: RFR 8003981: Support Parallel Array Sorting - JEP 103
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]