RFR(s): 8060192: Add default method Collection.toArray(generator) (original) (raw)
Stuart Marks stuart.marks at oracle.com
Tue Dec 12 22:56:51 UTC 2017
- Previous message: RFR(s): 8060192: Add default method Collection.toArray(generator)
- Next message: RFR(s): 8060192: Add default method Collection.toArray(generator)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 12/7/17 5:03 PM, Martin Buchholz wrote:
(I'm still trying to love this new API)
The changes to the jsr166 tck are fine. I'm not convinced that the new implementation for ArrayList is progress. The current implementation of toArray(T[]) does // Make a new array of a's runtime type, but my contents: return (T[]) Arrays.copyOf(elementData, size, a.getClass()); which does not have to pay the cost of zeroing the array, so is potentially faster. (depends on whether the VM provides cheap pre-zeroed memory?! what does jmh say?) If we're not getting type safety and not getting significantly better performance, all we have is a more natural API. But toArray(IntFunction) also seems not very natural to me, and we'll have to live with the toArray(new String[0]) wart forever anyways. (But is it really so bad?) (Maybe toArray(Class) is actually more natural?)
A lot of ideas flying around here. (Thanks John! :-)) But for this message let me focus on performance.
I think implicitly there's some concern about the new API and whether it might suffer inherent performance disadvantages compared to the existing APIs. It certainly doesn't seem to provide any inherent advantages. I spent some time doing benchmarking more rigorously, and I was able to get similar results to those from Alexey Shipilëv's article:
https://shipilev.net/blog/2016/arrays-wisdom-ancients/
That is, I was able to discern the difference between the copy-to-Object[] case, the copy to presized array case, and copy to zero-sized array case. Also added a potential new API toArray(Class) for comparison.
Here are the results. For brevity, I ran only the tests with an ArrayList of size 1000:
Benchmark (size) (type) Mode Cnt Score Error Units ToArrayBench.clazz 1000 arraylist avgt 30 1423.924 ± 11.437 ns/op ToArrayBench.clazz0 1000 arraylist avgt 30 1173.442 ± 11.614 ns/op ToArrayBench.clazzDf 1000 arraylist avgt 30 1179.648 ± 17.662 ns/op ToArrayBench.lambda 1000 arraylist avgt 30 1421.910 ± 21.320 ns/op ToArrayBench.lambda0 1000 arraylist avgt 30 1168.863 ± 11.109 ns/op ToArrayBench.lambdaDf 1000 arraylist avgt 30 1168.372 ± 9.512 ns/op ToArrayBench.simple 1000 arraylist avgt 30 462.371 ± 8.693 ns/op ToArrayBench.sized 1000 arraylist avgt 30 1417.483 ± 11.451 ns/op ToArrayBench.zero 1000 arraylist avgt 30 1182.932 ± 27.773 ns/op
Legend:
clazz - ArrayList override of toArray(Class)
T[] a = (T[])java.lang.reflect.Array.newInstance(Foo.class, size);
System.arraycopy(elementData, 0, a, 0, size);
clazz0 - ArrayList override of toArray(Class)
T[] a = (T[])java.lang.reflect.Array.newInstance(clazz, 0);
return (T[]) Arrays.copyOf(elementData, size, a.getClass());
classDf - Collection default method toArray(Class)
toArray((T[])java.lang.reflect.Array.newInstance(clazz, 0))
lambda - ArrayList override of toArray(IntFunc)
T[] a = generator.apply(size);
System.arraycopy(elementData, 0, a, 0, size);
lambda0 - ArrayList override of toArray(IntFunc)
(T[]) Arrays.copyOf(elementData, size, generator.apply(0).getClass())
lambdaDf - Collection default method toArray(IntFunc)
toArray(generator.apply(0))
simple - existing no-arg toArray() method
sized - existing toArray(T[]) called with presized array
coll.toArray(new Foo[coll.size()])
zero - existing toArray(T[]) called with zero-sized array
coll.toArray(new Foo[0])
A couple observations from this. First, "lambda", the ArrayList override of toArray(IntFunc), is slower than "lambda0" or "lambdaDf", both of which create zero-length arrays and pass them elsewhere. So Martin, you're right, the override I put into ArrayList isn't buying anything. I'll take it out. Oddly enough, just inheriting the default might be sufficient.
(And the same probably applies to Arrays$ArrayList as well.)
Second, setting aside "simple" case (which is fast because doesn't have to do any array store checking) we see performance falling into two buckets: one that takes around 1400ns, and the other that takes around 1170ns. It turns out that the faster cases all end up calling the Arrays.copyOf() method. This is an intrinsic that can avoid the work of zeroing the array, because it allocates the array immediately before filling it with data from the source.
I would have thought that allocating the array immediately prior to filling it with System.arraycopy would have done the same, but apparently not.
We can see the effect of intrinsics by disabling the Arrays.copyOf intrinsic by applying the -XX:+UnlockDiagnosticVMOptions -XX:DisableIntrinsic=_copyOf options:
Benchmark (size) (type) Mode Cnt Score Error Units ToArrayBench.clazz 1000 arraylist avgt 30 1413.922 ± 9.570 ns/op ToArrayBench.clazz0 1000 arraylist avgt 30 1427.857 ± 13.669 ns/op ToArrayBench.clazzDf 1000 arraylist avgt 30 1448.274 ± 15.734 ns/op ToArrayBench.lambda 1000 arraylist avgt 30 1423.556 ± 17.312 ns/op ToArrayBench.lambda0 1000 arraylist avgt 30 1422.177 ± 20.650 ns/op ToArrayBench.lambdaDf 1000 arraylist avgt 30 1464.320 ± 26.199 ns/op ToArrayBench.simple 1000 arraylist avgt 30 399.856 ± 4.893 ns/op ToArrayBench.sized 1000 arraylist avgt 30 1417.668 ± 8.979 ns/op ToArrayBench.zero 1000 arraylist avgt 30 1438.527 ± 11.521 ns/op
Here, we can see that everything (except "simple") has moved into the 1400ns range. (Not sure why "simple" appears to have gotten faster.)
My conclusion from this is that the choice of one API over another doesn't offer any inherent performance advantage. You can make any of the APIs go fast if you can arrange for it to call an intrinsic, even if this ends up allocating an "extra" zero-length array.
So the issue is really about which API we want to expose. I'll reserve that discussion for another message.
s'marks
- Previous message: RFR(s): 8060192: Add default method Collection.toArray(generator)
- Next message: RFR(s): 8060192: Add default method Collection.toArray(generator)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]