Collectors update redux

Brian Goetz brian.goetz at oracle.com
Thu Feb 7 11:12:51 PST 2013


Is three-arg collect really the target "on ramp"?

Sorry, I was probably not clear. It is the onramp to the mutable part of the reduce functionality, but it builds on the more functional flavors, as outlined in the "digression" section.
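For readers following along, here is a minimal sketch of the three-arg collect applied to the BitSet case. It assumes the java.util.stream API as it later shipped in Java 8, which may differ in detail from the draft under discussion; the class and method names (ThreeArgCollectDemo, toBitSet) are made up for illustration.

```java
import java.util.BitSet;
import java.util.stream.IntStream;

public class ThreeArgCollectDemo {
    // Mutable reduction into a BitSet (names here are illustrative only):
    //   supplier    -> BitSet::new  creates an empty container
    //   accumulator -> BitSet::set  folds one int into a container
    //   combiner    -> BitSet::or   merges two partial containers (used for parallel runs)
    static BitSet toBitSet(IntStream values) {
        return values.collect(BitSet::new, BitSet::set, BitSet::or);
    }

    public static void main(String[] args) {
        System.out.println(toBitSet(IntStream.of(1, 3, 5)));
    }
}
```

The three functions correspond to the identity, the fold step, and the merge step of a functional reduce, except that the accumulator and combiner mutate the container in place instead of returning a new value.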

IF you've been successfully spoon-fed the excellent examples (bitset etc.) then you can see it as reasonably simple. Otherwise you're pretty lost in the woods.

I think that's fair. Which points, as we've already agreed, to the fact that this is mostly a pedagogical problem.

I would have thought the first stop would be the combinators. OTOH ... there's a lot of stuff in there.

I think there is way too much stuff in there, and I don't have enough time to even review it all before it gets set in stone. I strongly believe we would be smarter to keep the set of prepackaged Collectors much smaller and let third-party libraries experiment with which Collectors to provide.

Conceptually, the set is pretty simple:

base collectors == toCollection, toStatistics, toStringBuilder, joinedWith (takes Stream plus T->U, produces Map<T,U>)

combinator for map+collector
combinator for groupBy+collector
combinator for groupBy+reduce
combinator for partition+collector
combinator for partition+reduce

plus defaults for above where if you don't have a downstream collector, it assumes "toCollection" (e.g., the no-arg groupBy).
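As a rough illustration of the combinator families listed above, here is a sketch written against the Collectors class as it later shipped in Java 8 (groupingBy, partitioningBy, mapping, reducing); those names and signatures may not match the draft being discussed. The Txn class and its sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class CombinatorDemo {
    // Hypothetical element type, used only for this sketch.
    static final class Txn {
        private final String buyer;
        private final int amount;
        Txn(String buyer, int amount) { this.buyer = buyer; this.amount = amount; }
        String buyer() { return buyer; }
        int amount() { return amount; }
    }

    static final List<Txn> TXNS = Arrays.asList(
            new Txn("alice", 10), new Txn("bob", 5), new Txn("alice", 7));

    // Default downstream: the one-arg groupingBy collects each group into a List.
    static Map<String, List<Txn>> byBuyer() {
        return TXNS.stream().collect(Collectors.groupingBy(Txn::buyer));
    }

    // groupBy + collector, with map + collector nested as the downstream.
    static Map<String, List<Integer>> amountsByBuyer() {
        return TXNS.stream().collect(Collectors.groupingBy(
                Txn::buyer, Collectors.mapping(Txn::amount, Collectors.toList())));
    }

    // groupBy + reduce: sum the amounts within each group.
    static Map<String, Integer> totalByBuyer() {
        return TXNS.stream().collect(Collectors.groupingBy(
                Txn::buyer, Collectors.reducing(0, Txn::amount, Integer::sum)));
    }

    // partition + collector: count transactions on each side of a predicate.
    static Map<Boolean, Long> countByLarge() {
        return TXNS.stream().collect(Collectors.partitioningBy(
                t -> t.amount() >= 7, Collectors.counting()));
    }
}
```

Each method composes one classifier or predicate with one downstream collector, which is why the individual forms stay small even though the full set of overloads looks large.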

Individually, each of these is dead-simple both in concept and implementation (once you understand Collector) -- even the most complex are only 20 LoC, and many are 1-2 LoC. I think what creates the perception of complexity is the number of forms that jumps out at you on the Javadoc page?

The one place where we might consider reducing scope is by eliminating the forms that take an explicit Supplier. In other words, you always get a HashMap / ConcurrentHashMap. This cuts the number of groupBy/join forms in half. But it leaves those who want, say, to group to a TreeMap out in the cold.

Do we feel that would be an improvement?

Alternately, we can refactor the Map-driven collectors so that instead of the Supplier being an argument, it can be a method on the Collector:

collect(groupingBy(Txn::buyer).usingMap(TreeMap::new))

by having a ToMapCollector (extends Collector) with a usingMap() method. This again gets us a nearly 2x reduction in number of methods in Collectors, at the cost of moving the "pick your own map" functionality to somewhere else.
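For comparison, here is a sketch of the explicit-Supplier form, written against the three-argument Collectors.groupingBy(classifier, mapFactory, downstream) overload found in the released Java 8 API. It does not implement the usingMap() proposal, and nothing here should be read as the draft API; the class and method names (MapFactoryDemo, byLength) are made up for illustration.

```java
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MapFactoryDemo {
    // Group words by length into a caller-chosen map type.
    // The Supplier (TreeMap::new) is passed as an argument to the collector
    // factory, which is the style the usingMap() idea would move elsewhere.
    static TreeMap<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(
                Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
    }
}
```

Because the result is a TreeMap, the keys come back sorted, which is the kind of use case that would be lost if the Supplier forms were dropped without a replacement.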



More information about the lambda-libs-spec-observers mailing list