Tabulators -- a catalog

Brian Goetz brian.goetz at oracle.com
Thu Dec 27 18:23:42 PST 2012


Here's a catalog of the currently implemented Tabulators.

  1. The groupBy family. Currently, there are 16 of these:

{ map vs mapMulti } x { explicit factories or not } x { reduce forms }

where the reduce forms are:

The MutableReducer form is what we currently call Accumulator; since every tabulator is itself a MutableReducer, the second form is what allows multi-level tabulations.

Q: Does the mapMulti variant carry its weight? (It's not a lot of extra code; these extra 8 methods are in total less than 50 lines of code.)

Q: Should the mapMulti variant be called something else, like groupByMulti?

Q: The first reduce form is classic groupBy; the others are group+reduce. Should they be called groupedReduce / groupedAccumulate for clarity?

Examples:

 // map + no explicit factories + mutable reduce form
 Map<K, D>
 groupBy(Function<T,K> classifier,
         Accumulator<T,D> downstream)

 // map + explicit factories + classic reduce
 <T, K, C extends Collection, M extends Map<K, C>> Tabulator<T, M>
 groupBy(Function<? super T, ? extends K> classifier,
         Supplier mapFactory,
         Supplier rowFactory)
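To make the two example shapes concrete, here is a rough sketch using the java.util.stream.Collectors API from JDK 8 as a stand-in. That is an assumption made for illustration: the Tabulator signatures in this catalog are not what is called below, and groupingBy, toCollection and counting only approximate the forms described. The class and method names (GroupBySketch, byLength, countByLength) are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class GroupBySketch {
    // Classic form with explicit factories: the caller supplies both the
    // map (TreeMap) and the per-key "row" collection (TreeSet).
    static TreeMap<Integer, TreeSet<String>> byLength(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(String::length,
                                  TreeMap::new,
                                  Collectors.toCollection(TreeSet::new)));
    }

    // Mutable-reduce form without explicit factories: each group is fed to
    // a downstream reducer (here a counter) instead of being kept as a row.
    static Map<Integer, Long> countByLength(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(String::length, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "bb", "cc", "ddd", "a");
        System.out.println(byLength(words));
        System.out.println(countByLength(words));
    }
}
```

The factory form matters when the caller cares about ordering or de-duplication of rows; the downstream form is the one that composes.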

  2. The mappedTo family. These take a Stream and a function T->U and produce a MapLikeThingy<T,U>.

Four forms:

 // basic
 <T, U> Tabulator<T, Map<T,U>>
 mappedTo(Function<? super T, ? extends U> mapper)

 // with merge function to handle duplicates
 <T, U> Tabulator<T, Map<T,U>>
 mappedTo(Function<? super T, ? extends U> mapper,
          BinaryOperator<U> mergeFunction)

 // with map factory
 <T, U, M extends Map<T, U>> Tabulator<T, M>
 mappedTo(Function<? super T, ? extends U> mapper,
          Supplier<M> mapSupplier)

 // with both factory and merge function
 <T, U, M extends Map<T, U>> Tabulator<T, M>
 mappedTo(Function<? super T, ? extends U> mapper,
          BinaryOperator<U> mergeFunction,
          Supplier<M> mapSupplier)

Q: Is the name good enough?

Q: What should the default merging behavior be for the forms without an explicit merger? Throw?
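For comparison on the merge question, here is a sketch built on Collectors.toMap from JDK 8, used purely as an analogue (an assumption; it is not the mappedTo API above, and it takes an explicit key function where mappedTo uses the element itself). In that API the form without a merge function throws IllegalStateException on a duplicate key. The names MappedToSketch, lengths and occurrences are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

final class MappedToSketch {
    // Basic form: each element is its own key, mapped to mapper(element).
    // With Collectors.toMap, a repeated element makes this throw.
    static Map<String, Integer> lengths(List<String> words) {
        return words.stream().collect(
            Collectors.toMap(Function.identity(), String::length));
    }

    // Merge function plus map factory: duplicates are combined with
    // Integer::sum and the result is collected into a LinkedHashMap.
    static LinkedHashMap<String, Integer> occurrences(List<String> words) {
        return words.stream().collect(
            Collectors.toMap(Function.identity(),
                             w -> 1,
                             Integer::sum,
                             LinkedHashMap::new));
    }

    public static void main(String[] args) {
        System.out.println(lengths(Arrays.asList("a", "bb")));
        System.out.println(occurrences(Arrays.asList("a", "b", "a")));
    }
}
```

Throwing by default makes accidental duplicates visible; callers who expect duplicates opt in by passing a merger.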

  3. Partition. Partitions a stream according to a predicate. The result is always a two-element array, one slot for each outcome of the predicate. Five forms:

    // Basic
    Tabulator<T, Collection[]>
    partition(Predicate predicate)

    // Explicit factory
    <T, C extends Collection> Tabulator<T, C[]>
    partition(Predicate predicate,
              Supplier rowFactory)

    // Partitioned mutable reduce
    <T, D> Tabulator<T, D[]>
    partition(Predicate predicate,
              MutableReducer<T,D> downstream)

    // Partitioned functional reduce
    Tabulator<T, T[]>
    partition(Predicate predicate,
              T zero,
              BinaryOperator reducer)

    // Partitioned functional map-reduce
    Tabulator<T, T[]>
    partition(Predicate predicate,
              T zero,
              Function<T, U> mapper,
              BinaryOperator reducer)
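A sketch of three of these shapes (basic, functional reduce, functional map-reduce), again leaning on JDK 8's Collectors as a stand-in. Two assumptions to flag: Collectors.partitioningBy returns a Map<Boolean, D> rather than the two-element array described here, and Collectors.reducing plays the role of the functional reduce forms. PartitionSketch and its method names are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class PartitionSketch {
    // Basic form: elements land in one of two lists.
    static Map<Boolean, List<Integer>> evens(List<Integer> xs) {
        return xs.stream().collect(
            Collectors.partitioningBy(x -> x % 2 == 0));
    }

    // Partitioned functional reduce: zero plus a binary operator per side.
    static Map<Boolean, Integer> sums(List<Integer> xs) {
        return xs.stream().collect(
            Collectors.partitioningBy(x -> x % 2 == 0,
                Collectors.reducing(0, Integer::sum)));
    }

    // Partitioned functional map-reduce: map each element, then reduce.
    static Map<Boolean, Integer> totalLengths(List<String> words) {
        return words.stream().collect(
            Collectors.partitioningBy(w -> w.startsWith("a"),
                Collectors.reducing(0, String::length, Integer::sum)));
    }

    public static void main(String[] args) {
        System.out.println(evens(Arrays.asList(1, 2, 3, 4)));
        System.out.println(sums(Arrays.asList(1, 2, 3, 4)));
        System.out.println(totalLengths(Arrays.asList("ab", "cde", "axy")));
    }
}
```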

All of these implement MutableReducer/Accumulator/Tabulator, which means any of them is suitable for use as the downstream reducer, so they can all be composed with each other. (Together all of these are about 300 lines of relatively straightforward code.)
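The composition claim can be illustrated with a two-level tabulation. This is a sketch under the same assumption as before (JDK 8 Collectors standing in for the Tabulator forms): a grouping whose downstream is a partition, whose downstream is in turn a counting reducer. ComposeSketch and tabulate are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ComposeSketch {
    // Group words by first letter, then split each group into
    // "longer than 3 characters" or not, then count each side.
    // Assumes every word is non-empty, since charAt(0) is called.
    static Map<Character, Map<Boolean, Long>> tabulate(List<String> words) {
        return words.stream().collect(
            Collectors.groupingBy(w -> w.charAt(0),
                Collectors.partitioningBy(w -> w.length() > 3,
                    Collectors.counting())));
    }

    public static void main(String[] args) {
        System.out.println(
            tabulate(Arrays.asList("apple", "ant", "bear", "bee")));
    }
}
```

Each level only needs to know that its downstream is a reducer, which is what lets the nesting go as deep as needed.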

More? Fewer? Different?



More information about the lambda-libs-spec-experts mailing list