Spliterator (original) (raw)

Brian Goetz brian.goetz at oracle.com
Mon Nov 19 14:17:54 PST 2012

Previous message: Spliterator
Next message: Spliterator
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

The purpose of getSizeIfKnown is to eliminate copies. If the leaf size is known, we can allocate the right size at each leaf rather than guessing; if all the sizes through the tree are known, then we can optimize things like:

collection.stream().map(...).toArray();

so there is only a source array and a target array and the mapping happens in parallel into the right offset within the final target array, rather than allocating chunks for each leaf and merging them. For the cases where this is possible, this could be a 2x performance difference. But, this is tied to the non-interference requirement, that the source collection not change during the computation.

The missing bit of the spec, as Doug alludes to, is in the Stream-construction methods, where the non-interference constraint lives. We should probably point from the Spliterator spec to those, but I didn't want to constrain Spliterator in this way because (for example) Doug might support concurrent modification in bulk ops on CHM.

On 11/19/2012 3:46 PM, Joe Bowbeer wrote:

The spec for getSizeIfKnown() seems similar to that of InputStream.available(), which is spec'd so loosely that I don't think it should ever be used. Similar to Thread.yield() in that respect.

On Mon, Nov 19, 2012 at 12:37 PM, Doug Lea <dl at cs.oswego.edu_ _<mailto:dl at cs.oswego.edu>> wrote: On 11/16/12 11:40, Brian Goetz wrote: Following up...we've eliminated StreamAccessor from the design, so you only need a Spliterator (and maybe stream flags) now to create a stream. Here's the latest API and spec for Spliterator. Some explanations of the choices are in the original mail to which this is a reply. My main concern is spec'ing/using the heuristic methods that could do anything and return any answer vs provide required functionality. For example, getSizeIfKnown(). Is it required to promise that the size will not change after return? Is it required to provide a result if one is computationally possible rather than just handy? Is it required to lie and return Integer.MAXVALUE if it overflows int? (Reminder: HashMaps and CHMs with > 1 << 31 elements are known to exist out there.) These will be hard to spec exactly right. And there are 4 more methods like this. And I'm left wondering (as always) just how much good these methods will do compared to an ultra-small/simple API, considering that most collection developers would rather put more time into writing custom versions of forEach, reduce, etc rather than tuning their Spliterator hinting methods so that the lambda-libs default versions work well? Or, considering that each of these added methods seems targeted at a particular data structure/shape (array, tree, bounded list/seq), why not make subinterfaces for them? interface Spliterator interface SizedSpliterator extends Spliterator... interface RandomAccessSpliterator extends SizedSpliterator ... etc On a little thought, I can't think of a reason not to do this? Can you? (Where to get this started, the minimal base version of Spliterator has a split() method returning one with fewer elements or null if it can't, and either is a or returns an Iterator.) -Doug

Previous message: Spliterator
Next message: Spliterator
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the lambda-libs-spec-observers mailing list