RFC: draft API for JEP 269 Convenience Collection Factories

Timo Kinnunen timo.kinnunen at gmail.com
Wed Oct 14 21:48:54 UTC 2015


Hi,

That’s intriguing since I wrote a collections library too, covering just Map/Set/List/Stream, with immutable/mutable versions and lots of convenience methods included, but I haven’t noticed such issues. My scope is a lot smaller, of course. It’s also not beholden to the way the Collections Framework does things so I can decide List is immutable, ArrayList is mutable, both will use the same API and there won’t be any subtyping relation between them. Maybe it’s this freedom that makes the difference.

I actually have 4 uses of ArrayList.of in a smallish project where they are used for reading and generating configuration files. I get to do things like this in a few places:

ArrayList lines = ArrayList.of( "", "", "", " ", // more lines of prefix . . .

And also this sort of thing:

ArrayList list = ArrayList.of(FILES.replaceAll("^\"|\"$", "").split("\" \"")).removeIf(String::isEmpty);
ArrayList list2 = list.map(Paths::get);

If that looks somewhat like a Stream, then just imagine what the user experience is like when stepping into it in a debugger.

More anecdotes: Having methods taking a Collection overloaded with versions taking varargs makes the API a lot more flexible.

I chose ArrayList.of(), ArrayList.of(T t) and ArrayList.of(T...ts), and List.of(T...ts) and the [0. . .10] argument variants for List.of(T t0, ... , T tn) completely unscientifically.


From: Kevin Bourrillion
Sent: Wednesday, October 14, 2015 19:56
To: Stuart Marks
Cc: core-libs-dev
Subject: Re: RFC: draft API for JEP 269 Convenience Collection Factories

(Sorry that Guava questions were asked and I didn't notice this thread sooner.)

Note that we have empirically learned through our Lists/Sets/Maps factory classes that varargs factory methods for mutable collections are almost entirely useless. For one thing, it's simply not common to have a hardcoded set of initial values yet still actually need to modify the contents later. When that does come up, the existing workarounds just aren't bad at all:

(a)

Set<Foo> foos = new HashSet<>(asList(foo1, foo2, foo3)); // static import, of course

(b)

static final Set<Foo> INITIAL_VALUES = Set.of(foo1, foo2, foo3);
 . . .

Set<Foo> foos = new HashSet<>(INITIAL_VALUES);

(c)

Set<Foo> foos = new HashSet<>();
Collections.addAll(foos, foo1, foo2, foo3);

Note that (c) is a two-liner. But a two-liner is really only bad in the immutable case (because you might be initializing a static final). It's of little harm in the mutable case.
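For reference, the three workarounds can be put side by side in a compilable form. This is only a sketch: the class name MutableSetWorkarounds, the method names, and the String values are made up for illustration, and Set.of assumes the JEP 269 factories are available (Java 9 or later).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; names and values are illustrative only.
public class MutableSetWorkarounds {
    // (b) An immutable constant holding the initial values.
    // Assumes the Set.of factory from JEP 269 (Java 9+).
    static final Set<String> INITIAL_VALUES = Set.of("a", "b", "c");

    // (a) Copy a fixed-size list of the values into a mutable set.
    static Set<String> viaAsList() {
        return new HashSet<>(Arrays.asList("a", "b", "c"));
    }

    // (b) Copy the immutable constant into a fresh mutable set.
    static Set<String> viaConstant() {
        return new HashSet<>(INITIAL_VALUES);
    }

    // (c) Create an empty set, then bulk-add the values.
    static Set<String> viaAddAll() {
        Set<String> set = new HashSet<>();
        Collections.addAll(set, "a", "b", "c");
        return set;
    }

    public static void main(String[] args) {
        Set<String> set = viaConstant();
        set.add("d"); // the copy is mutable; INITIAL_VALUES is untouched
        System.out.println(set.size() + " vs " + INITIAL_VALUES.size());
    }
}
```

All three produce an equal, independently mutable HashSet; only (b) also leaves a reusable immutable constant behind.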

Anyway, since we created these methods, they became an attractive nuisance, and thousands of users reach for them who would have been better off in every way using an immutable collection. Our fondest desire is to one day be able to delete them. So, obviously, my strong recommendation is not to add these to ArrayList, etc.

On Fri, Oct 9, 2015 at 4:11 PM, Stuart Marks <stuart.marks at oracle.com> wrote:

Now, Guava handles this use case by providing a family of copying factories that can accept an array, a Collection, an Iterator, or an Iterable. These are all useful, but for JEP 269, we wanted to focus on the "collection literal like" APIs and not expand the proposal to include a bunch of additional factory methods. Since we need to have a varargs method anyway, it seemed reasonable to arrange it so that it could easily accept an array as well.

A decision to support only varargs and arrays is reasonable. However, I don't see the advantage in using the same method name for both. In Guava, it's clear what the difference between ImmutableList.of(aStringArray) and ImmutableList.copyOf(aStringArray) is.
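To make the naming concern concrete, here is a small sketch of how one varargs method can mean two different things when handed an array. The class OfVersusCopyOf and both factory methods are hypothetical stand-ins, not the draft API and not Guava's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; these factories only imitate the naming split.
public class OfVersusCopyOf {
    // Varargs factory: each argument is meant to become one element.
    @SafeVarargs
    static <T> List<T> of(T... elements) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(elements)));
    }

    // Copying factory: the contents of the array become the elements.
    static <T> List<T> copyOf(T[] array) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(array)));
    }

    public static void main(String[] args) {
        String[] names = {"x", "y", "z"};

        // T is inferred as String, so the array is passed as the varargs array.
        List<String> spread = of(names);

        // With T forced to String[], the same call yields a one-element list
        // whose single element is the array itself.
        List<String[]> nested = OfVersusCopyOf.<String[]>of(names);

        // The copying factory has only one possible reading.
        List<String> copied = copyOf(names);

        System.out.println(spread.size() + " " + nested.size() + " " + copied.size());
    }
}
```

With a separate copyOf, a reader does not have to work out which type the compiler inferred to know whether the array was spread or wrapped.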

Does anybody care about LinkedHashSet?

Assuming you go ahead with this for mutable collection types despite the above, then YES, absolutely. Accidental dependence on hash order has always been a runaway problem in our codebase that has made every single major JDK upgrade difficult. And the memory cost of LHS over HS isn't nearly as great as HS is already paying over a lean immutable set. The use of HashMap and HashSet themselves should be discouraged.
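A minimal illustration of the ordering point, using only java.util classes. The class IterationOrderDemo and its helper methods are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; names are illustrative only.
public class IterationOrderDemo {
    // Iteration order of a LinkedHashSet: always the insertion order.
    static List<String> linkedOrder(String... values) {
        Set<String> set = new LinkedHashSet<>(Arrays.asList(values));
        return new ArrayList<>(set);
    }

    // Iteration order of a HashSet: unspecified, and it may change
    // between JDK releases, so code must not rely on it.
    static List<String> hashOrder(String... values) {
        Set<String> set = new HashSet<>(Arrays.asList(values));
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        System.out.println(linkedOrder("zebra", "apple", "mango"));
        System.out.println(hashOrder("zebra", "apple", "mango"));
    }
}
```

Code that happens to pass with whatever order hashOrder returns today is exactly the accidental dependence described above; linkedOrder removes the guesswork.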

(Even I had to fight the temptation to add "except when memory is at a premium" to that! But it makes no sense. That's like "if you want to lose weight, then accompany your giant pasta dinner and chocolate cake with a Diet Coke.")

  1. Duplicate handling.

    My current thinking is for the Set and Map factories to throw IllegalArgumentException if a duplicate element or key is detected.

+1
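The duplicate-rejecting behaviour under discussion could look roughly like the following. This is a sketch with a hypothetical helper, strictSetOf, written against plain java.util; it is not the draft API itself.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical demo class; strictSetOf only imitates the proposed behaviour.
public class StrictSetFactory {
    // Builds an unmodifiable set and fails fast on the first duplicate,
    // instead of silently dropping it.
    @SafeVarargs
    static <T> Set<T> strictSetOf(T... elements) {
        Set<T> set = new LinkedHashSet<>();
        for (T element : elements) {
            if (!set.add(element)) {
                throw new IllegalArgumentException("duplicate element: " + element);
            }
        }
        return Collections.unmodifiableSet(set);
    }

    // Returns true if a call with a repeated element is rejected.
    static boolean rejectsDuplicates() {
        try {
            strictSetOf("a", "b", "a");
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(strictSetOf("a", "b").size());
        System.out.println(rejectsDuplicates());
    }
}
```

A repeated element in a hardcoded list is almost always a typo, so failing at construction surfaces the mistake where it was made.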

To the other question: the reason we chose 11 as the cutoff is that we determined that there would be no logical basis for exactly where to do it, so we looked for an illogical basis. Sometimes you'll be at 10, all the way up, you're at 10 and where can you go from there? Where? Nowhere. So this way, if we need that extra push over the cliff, we can go up to 11.

-- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com


