RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size (original) (raw)
Ivan Gerasimov ivan.gerasimov at oracle.com
Sun Jul 6 03:50:26 UTC 2014
- Previous message: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
- Next message: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Thank you Doug for your comment!
On 04.07.2014 23:38, Doug Lea wrote:
On 07/04/2014 01:33 PM, Ivan Gerasimov wrote:
On 04.07.2014 8:14, David Holmes wrote:
Hi Ivan,
I find the change to capacity somewhat obfuscating and I can't see what the actual bug was. The bug was in rounding in the expression minCapacity = (3 * expectedMaxSize)/2. Suppose, we want the expected size to be 11, then minCapacity becomes (int)(11 * 3 / 2) == 16. The threshold is calculated later to be (int)(16 * 2 / 3) == 10, which is less than expected size 11. So the bug report was based on someone noticing that with odd-valued sizes, the undocumented 2/3 target load was rounded down, not up, which they didn't like? Not quite.
There are two small issues that I'm trying to address here:
- The wrong capacity is calculated based on the expected size due to rounding error. This is only noticeable for the expected sizes 11, 43, 171, 683 and a few others. This leads to a redundant table resize when inserting the expected number of elements.
- The wrong capacity due to overflow error. For numbers N larger or equal to 1431655765, the expression (int)(N * 3) / 2 produces non-negative small numbers. Frankly speaking, this is also true for my last proposed fix, so the updated webrev will follow.
- In put() the table gets resized as soon as the threshold is reached. This is wrong for expected sizes 2, 3, 5, 10, 21, 42, and some others.
- If put() was unsuccessful, the map gets corrupt.
I don't object to rounding it up and modernizing the size calculations to use highestOneBit, but it's a weak pretense :-)
Simple rounding the minimum capacity up as minCapacity = (3 * expectedMaxSize + 1)/2 would produce the same result of course. I just took a chance to replace the while loop with a single function call.
In the last webrev I also modified the check to address the issue #2 from the list above.
To address the issue I combined the division by 2 with the rounding up to the nearest power of two. I also took a chance to replace a while-loop with a single call to the highestOneBit method, which calculates exactly what we need here.
The recursive call to put after a resize seems very sub-optimal as you will re-search the map for the non-existent key. Can you not just recompute the correct indices and do the store? The initial rationale for post-insert-resize was to avoid these issues given the need to preserve at least one empty slot. Which I expect to remain faster than pre-resize even with your modified patch. Please provide systematic performance comparisons before committing! -Doug
Yes, you are right. Microbenchmark showed more than 20% slowdown of a single put operation. So, I reverted this change to the original code, but added a single check of the current table size before doing any modifications to the map. This is to address the issue #4, and this doesn't seem to introduce any significant penalty.
Would you please help review the updated webrev:
http://cr.openjdk.java.net/~igerasim/6904367/3/webrev/
Sincerely yours, Ivan
- Previous message: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
- Next message: RFR: [6904367]: (coll) IdentityHashMap is resized before exceeding the expected maximum size
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]