Streams design strawman

Brian Goetz brian.goetz at oracle.com
Sun Apr 22 10:04:15 PDT 2012


The more general design principle that we were appealing to is: collections are all about storing values, and the set of operations you have to support on a collection is large. But it is silly to use a collection as the intermediate value between every operation -- that is wasteful. For example, we could have had filter and map return new collections, and written things like this:

    Collection filtered = names.filter(...);
    Collection mapped = filtered.map(n -> n.getLastName());
    mapped.sort(...);

But creating the intermediate collections is usually wasteful. So instead, filter/map return streams:

    SortedSet result = names.filter(...)
                            .map(Name::getLastName)
                            .into(new TreeSet<>());

Which gives the same final result, but more efficiently and (IMO) more cleanly.
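To make the comparison concrete, here is a minimal sketch of both styles. It does not use the strawman API above (`names.filter(...)`, `.into(...)`); it assumes the `java.util.stream` API that later shipped in Java 8, where the pipeline starts with `stream()` and ends with `collect(...)`. The `Name` class, the non-empty-last-name predicate, and the `sample()` helper are hypothetical stand-ins, since the original elides them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class LastNames {
    // Hypothetical value type standing in for the "Name" class in the text.
    static final class Name {
        private final String first;
        private final String last;
        Name(String first, String last) { this.first = first; this.last = last; }
        String getFirstName() { return first; }
        String getLastName() { return last; }
    }

    // Hypothetical sample data for illustration only.
    static List<Name> sample() {
        return Arrays.asList(
                new Name("Ada", "Lovelace"),
                new Name("Alan", "Turing"),
                new Name("Anonymous", ""),
                new Name("Grace", "Hopper"));
    }

    // Collection style: each step materializes a full intermediate list.
    static SortedSet<String> viaCollections(List<Name> names) {
        List<Name> filtered = new ArrayList<>();
        for (Name n : names) {
            if (!n.getLastName().isEmpty()) filtered.add(n);
        }
        List<String> mapped = new ArrayList<>();
        for (Name n : filtered) mapped.add(n.getLastName());
        return new TreeSet<>(mapped);
    }

    // Stream style: elements pass through filter and map one at a time,
    // and only the final sorted set is materialized.
    static SortedSet<String> viaStream(List<Name> names) {
        return names.stream()
                .filter(n -> !n.getLastName().isEmpty())
                .map(Name::getLastName)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println(viaCollections(sample())); // [Hopper, Lovelace, Turing]
        System.out.println(viaStream(sample()));      // [Hopper, Lovelace, Turing]
    }
}
```

Both methods return the same sorted set; the difference is that the second never builds the two throwaway lists.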

The key observation is: most bulk operations on collections can be expressed in the form

source - lazy - lazy - lazy - eager

where the "eager" operations are things like forEach, dump the results into a collection, or some form of reduce.
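The source / lazy / eager shape can be observed directly. The sketch below again assumes the `java.util.stream` API as it eventually shipped, not the strawman method names; the `filterCalls` counter and the word list are hypothetical, added only to show when work happens.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyPipeline {
    // Counts how many elements the filter stage has actually examined.
    static final AtomicInteger filterCalls = new AtomicInteger();

    // source -> lazy -> lazy: building the pipeline does no work yet.
    static Stream<String> build(List<String> words) {
        return words.stream()
                .filter(w -> { filterCalls.incrementAndGet(); return w.length() > 3; })
                .map(String::toUpperCase);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("lazy", "map", "filter", "of");

        Stream<String> pipeline = build(words);
        System.out.println("after build:   " + filterCalls.get()); // 0

        // The eager (terminal) operation pulls every element through the stages.
        List<String> result = pipeline.collect(Collectors.toList());
        System.out.println("after collect: " + filterCalls.get()); // 4
        System.out.println(result);                                // [LAZY, FILTER]
    }
}
```

The filter predicate runs zero times until `collect` is called, which is the point of keeping the intermediate stages lazy.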

Grouping might sometimes be the last element in the processing, but very often we want to keep going. Expressing it as something that produces a stream makes it easier to keep going. Grouping may benefit less from laziness than filtering, but treating it as a lazy (stream-producing) operation also has benefits.

Our model is that the methods that produce new streams can be lazy, and those that produce concrete results (scalars, collections, etc) are eager.

On 4/22/2012 12:55 PM, Brian Goetz wrote:

So basically it's not a stream but something like this:

    interface Histogram<K,V> {
        Iterable<K> keys();
        Iterable<V> values();
        Iterable<Entry<K,V>> entries();
    }

a kind of super type of a Map.

It certainly could be, if we wanted to make it an eager (end-of-stream-pipeline) operation. But it seems more flexible to make it a BiStream-creating operation (even though the values need to be internally buffered, which I think is your underlying point), because then you can keep going with more transformations / reductions on the resulting BiStream. For example, the following produces a Map<Integer, String>, where the keys are word lengths and the values are strings of "word,word,word".

    words.groupBy(w -> w.length())
         .mapValues((length, words) -> String.join(",", words))
         .into(new HashMap<Integer, String>());

The group-by operation is rarely the end of what you want to do; usually you want to count, post-process, etc.
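For comparison, here is a hedged sketch of the same "group, then keep going" idea written against the `java.util.stream` API that later shipped in Java 8. That API has no `BiStream` or `mapValues`; instead, the post-processing step is passed to `Collectors.groupingBy` as a downstream collector. The class name, method names, and sample words are hypothetical, and a `TreeMap` is used (rather than the `HashMap` in the text) only so the printed key order is predictable.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupThenJoin {
    // Group words by length, then join each group into "word,word,word".
    static Map<Integer, String> wordsByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        String::length,            // key: word length
                        TreeMap::new,              // sorted keys for stable output
                        Collectors.joining(","))); // post-process each group
    }

    // Same grouping, different post-processing: count the words per length.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream()
                .collect(Collectors.groupingBy(
                        String::length,
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc", "d", "eee");
        System.out.println(wordsByLength(words)); // {1=a,d, 2=bb,cc, 3=eee}
        System.out.println(countByLength(words)); // {1=2, 2=2, 3=1}
    }
}
```

Swapping the downstream collector changes what happens after grouping without touching the grouping itself, which mirrors the argument that group-by is rarely the last step.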


