Alternative Spliterator for SpinedBuffer (original) (raw)

Peter Levart peter.levart at gmail.com
Fri Apr 12 08:10:53 PDT 2013


On 04/11/2013 09:53 PM, brian.goetz at oracle.com wrote:

Changeset: 5e7f8797b9c1 Author: briangoetz Date: 2013-04-11 15:53 -0400 URL: http://hg.openjdk.java.net/lambda/lambda/jdk/rev/5e7f8797b9c1

Spec pass on StreamBuilder ! src/share/classes/java/util/stream/StreamBuilder.java ! src/share/classes/java/util/stream/Streams.java

Hi,

I played a little with StreamBuilder and it works quite optimally using private SpinedBuffer under the hood. The only sub-optimality is when it starts small (min. 16 elements) and then grows big and after that it is used in a parallel computation where leaf task maximum size is more than 16 elements. For example, this code:

public class SpinedBufferTest {

 static int dump(Spliterator<Integer> spliterator, int level, int 

iterateSize) { System.out.print(" ".substring(0, level)); System.out.print(spliterator.estimateSize()); Spliterator spl2; if (spliterator.estimateSize() > iterateSize && (spl2 = spliterator.trySplit()) != null) { System.out.println(": split " + spl2.estimateSize() + " + "

")"); return 1; } }

 public static void main(String[] args) {
     SpinedBuffer<Integer> buf = new SpinedBuffer<>();
     for (int i = 0; i < 16384; i++) {
         buf.accept(i);
     }
     int leafs = dump(buf.spliterator(), 0, 1024);
     System.out.println("\nTotal leafs: " + leafs);
 }

}

Prints the splits that would occur for 16k elements when max. leaf task size is 1k:

16384: split 16 + 16368 16: (0...15) 16368: split 16 + 16352 16: (16...31) 16352: split 32 + 16320 32: (32...63) 16320: split 64 + 16256 64: (64...127) 16256: split 128 + 16128 128: (128...255) 16128: split 256 + 15872 256: (256...511) 15872: split 512 + 15360 512: (512...1023) 15360: split 1024 + 14336 1024: (1024...2047) 14336: split 2048 + 12288 2048: split 1024 + 1024 1024: (2048...3071) 1024: (3072...4095) 12288: split 4096 + 8192 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (4096...5119) 1024: (5120...6143) 2048: split 1024 + 1024 1024: (6144...7167) 1024: (7168...8191) 8192: split 4096 + 4096 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (8192...9215) 1024: (9216...10239) 2048: split 1024 + 1024 1024: (10240...11263) 1024: (11264...12287) 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (12288...13311) 1024: (13312...14335) 2048: split 1024 + 1024 1024: (14336...15359) 1024: (15360...16383)

Total leafs: 22

7 leaf tasks are half the max. size or less.

Now SpinedBuffer has chunks with sizes increasing exponentially using power of 2: 16, 16, 32, 64, 128, 256, 512, ... Each chunk has capacity equal to the sum of capacities of all previous chunks. So nearly optimal split (modulo last non-full chunk) would be to always split so that the last chunk is cut-off and the returned Spliterator contains all the chunks up to the last and this Spliterator only contains the last chunk.

Here's my attempt at modifying the spliterator to act like that (I only did it for the generic Object variant in 1st try, but I can prepare the primitive variants too, if desired):

http://dl.dropboxusercontent.com/u/101777488/jdk8-lambda/SpinedBufferAltSpliterator/webrev.01/index.html

With this modification, the above code prints:

16384: split 8192 + 8192 8192: split 4096 + 4096 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (0...1023) 1024: (1024...2047) 2048: split 1024 + 1024 1024: (2048...3071) 1024: (3072...4095) 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (4096...5119) 1024: (5120...6143) 2048: split 1024 + 1024 1024: (6144...7167) 1024: (7168...8191) 8192: split 4096 + 4096 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (8192...9215) 1024: (9216...10239) 2048: split 1024 + 1024 1024: (10240...11263) 1024: (11264...12287) 4096: split 2048 + 2048 2048: split 1024 + 1024 1024: (12288...13311) 1024: (13312...14335) 2048: split 1024 + 1024 1024: (14336...15359) 1024: (15360...16383)

Total leafs: 16

Regards, Peter



More information about the lambda-dev mailing list