Reducing reduce (original) (raw)

Joe Bowbeer joe.bowbeer at gmail.com
Mon Feb 11 08:57:41 PST 2013


My parallel string-compare sample provides two implementations (below).

Will both of these survive your changes?

bitbucket.org/joebowbeer/stringcompare

int compareMapReduce(String s1, String s2) { assert s1.length() == s2.length(); return intRange(0, s1.length()).parallel() .map(i -> compare(s1.charAt(i), s2.charAt(i))) .reduce(0, (l, r) -> (l != 0) ? l : r); } int compareBoxedReduce(String s1, String s2) { assert s1.length() == s2.length(); return intRange(0, s1.length()).parallel().boxed() .reduce(0, (l, i) -> (l != 0) ? l : compare(s1.charAt(i), s2.charAt(i)), (l, r) -> (l != 0) ? l : r); }

The person who sold me the second form told me it would "burn less heat". He said that I could optimize my map/reduce by having it "not even calculate f if the left operand is nonzero, by combining the map and reduce steps into a fold."

What is that person going to sell me now?

Joe

On Mon, Feb 11, 2013 at 8:41 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

Now that we've added all the shapes of map() to Stream (map to ref/int/long/double), and we've separated functional reduce (currently called reduce) from mutable reduce (currently called collect), I think that leaves room for taking out one of the reduce methods from Stream:

U reduce(U identity, BiFunction<U, ? super T, U> accumulator, BinaryOperator reducer); This is the one that confuses everyone anyway, and I don't think we need it any more. The argument for having this form instead of discrete map+reduce are: - fused map+reduce reduces boxing - this three-arg form can also fold filtering into the accumulation However, since we now have primitive-bearing map methods, and we can do filtering before and after the map, is this form really carrying its weight? Specifically because people find it counterintuitive, we should consider dropping it and guiding people towards map+reduce. For example, "sum of pages" over a stream of Documents is better written as: docs.map(Document::**getPageCount).sum() rather than docs.reduce(0, (count, d) -> count + d.getPageCount(), Integer::sum) The big place where we need three-arg reduce is when we're folding into a mutable store. But that's now handled by collect(). Have I missed any use cases that would justify keeping this form?



More information about the lambda-libs-spec-observers mailing list