Proxy.isProxyClass scalability

Peter Levart peter.levart at gmail.com
Wed Apr 17 14:18:50 UTC 2013

Previous message: Proxy.isProxyClass scalability
Next message: Proxy.isProxyClass scalability
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hi Mandy,

Here's the updated webrev:

https://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc/webrev.02/index.html

This adds TwoLevelWeakCache to the scene, with the following performance compared to the other alternatives:

Summary (4 Cores x 2 Threads i7 CPU):

Test                    Threads ns/op Original Patch(CL field) FlattenedWeakCache TwoLevelWeakCache
======================= ======= ============== =============== ================== =================
Proxy_getProxyClass           1       2,403.27          163.70             206.88            252.89
                              4       3,039.01          202.77             303.38            327.62
                              8       5,193.58          314.47             442.58            510.63

Proxy_isProxyClassTrue        1          95.02           10.78              41.85             42.03
                              4       2,266.29           10.80              42.32             42.07
                              8       4,782.29           20.53              72.29             69.25

Proxy_isProxyClassFalse       1          95.02            1.36               1.36              1.36
                              4       2,186.59            1.36               1.37              1.40
                              8       4,891.15            2.72               2.94              2.72

Annotation_equals             1         240.10          152.29             193.27            200.45
                              4       1,864.06          153.81             195.60            202.45
                              8       8,639.20          262.09             384.72            338.70

As expected, Proxy.getProxyClass() is a little slower with TwoLevelWeakCache than with FlattenedWeakCache, but still much faster than the original Proxy implementation. The extra lookup in a ConcurrentHashMap and the extra indirection have a price, but we also get something in return: space.
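To make the extra lookup and indirection concrete, here is a minimal sketch of a two-level lookup. It is not the code from the webrev: the class name TwoLevelCacheSketch and its methods are hypothetical, and the weak referencing of keys and values is left out to keep it short.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical illustration only; weak references are deliberately omitted.
final class TwoLevelCacheSketch<K, S, V> {
    // 1st level: key (think ClassLoader) -> 2nd-level map
    private final ConcurrentMap<K, ConcurrentMap<S, V>> map =
        new ConcurrentHashMap<>();

    V get(K key, S subKey, Function<? super S, ? extends V> factory) {
        // first lookup: find or create the per-key 2nd-level map
        ConcurrentMap<S, V> inner =
            map.computeIfAbsent(key, k -> new ConcurrentHashMap<>());
        // second lookup: find or create the value for the sub-key
        return inner.computeIfAbsent(subKey, factory);
    }

    int firstLevelSize() {
        return map.size();
    }

    public static void main(String[] args) {
        TwoLevelCacheSketch<String, String, Object> cache =
            new TwoLevelCacheSketch<>();
        Object a = cache.get("loaderA", "Runnable", s -> new Object());
        Object b = cache.get("loaderA", "Runnable", s -> new Object());
        System.out.println(a == b);
    }
}
```

A flattened variant would instead use a single map keyed by the (key, sub-key) pair, which saves one lookup but needs a composite key object per entry.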

All of this was measured on the latest lambda build (with the new segment-less ConcurrentHashMap). I also added a second ClassLoader to see what happens when it is added to the cache (sizes in bytes):

Original Proxy, 32 bit addressing

  class   proxy   size of  delta to
loaders classes    caches  prev.ln.

    0         0       400       400
    1         1       768       368
    1         2       920       152
    1         3      1072       152
    1         4      1224       152
    1         5      1376       152
    1         6      1528       152
    1         7      1680       152
    1         8      1832       152
    1         9      1984       152
    1        10      2136       152
    2        11      2456       320
    2        12      2672       216
    2        13      2824       152
    2        14      2976       152
    2        15      3128       152
    2        16      3280       152
    2        17      3432       152
    2        18      3584       152
    2        19      3736       152
    2        20      3888       152

Original Proxy, 64 bit addressing

  class   proxy   size of  delta to
loaders classes    caches  prev.ln.

    0         0       632       632
    1         1      1216       584
    1         2      1448       232
    1         3      1680       232
    1         4      1912       232
    1         5      2144       232
    1         6      2376       232
    1         7      2608       232
    1         8      2840       232
    1         9      3072       232
    1        10      3304       232
    2        11      3832       528
    2        12      4192       360
    2        13      4424       232
    2        14      4656       232
    2        15      4888       232
    2        16      5120       232
    2        17      5352       232
    2        18      5584       232
    2        19      5816       232
    2        20      6048       232

Patched Proxy (FlattenedWeakCache), 32 bit addressing

  class   proxy   size of  delta to
loaders classes    caches  prev.ln.

    0         0       240       240
    1         1       584       344
    1         2       768       184
    1         3       952       184
    1         4      1136       184
    1         5      1320       184
    1         6      1504       184
    1         7      1688       184
    1         8      1872       184
    1         9      2056       184
    1        10      2240       184
    2        11      2424       184
    2        12      2736       312
    2        13      2920       184
    2        14      3104       184
    2        15      3288       184
    2        16      3472       184
    2        17      3656       184
    2        18      3840       184
    2        19      4024       184
    2        20      4208       184

Patched Proxy (FlattenedWeakCache), 64 bit addressing

  class   proxy   size of  delta to
loaders classes    caches  prev.ln.

    0         0       336       336
    1         1       920       584
    1         2      1200       280
    1         3      1480       280
    1         4      1760       280
    1         5      2040       280
    1         6      2320       280
    1         7      2600       280
    1         8      2880       280
    1         9      3160       280
    1        10      3440       280
    2        11      3720       280
    2        12      4256       536
    2        13      4536       280
    2        14      4816       280
    2        15      5096       280
    2        16      5376       280
    2        17      5656       280
    2        18      5936       280
    2        19      6216       280
    2        20      6496       280

Patched Proxy (TwoLevelWeakCache), 32 bit addressing

  class   proxy   size of  delta to
loaders classes    caches  prev.ln.

    0         0       240       240
    1         1       752       512
    1         2       896       144
    1         3      1040       144
    1         4      1184       144
    1         5      1328       144
    1         6      1472       144
    1         7      1616       144
    1         8      1760       144
    1         9      1904       144
    1        10      2048       144
    2        11      2400       352
    2        12      2608       208
    2        13      2752       144
    2        14      2896       144
    2        15      3040       144
    2        16      3184       144
    2        17      3328       144
    2        18      3472       144
    2        19      3616       144
    2        20      3760       144

Patched Proxy (TwoLevelWeakCache), 64 bit addressing

  class   proxy   size of  delta to
loaders classes    caches  prev.ln.

    0         0       336       336
    1         1      1216       880
    1         2      1440       224
    1         3      1664       224
    1         4      1888       224
    1         5      2112       224
    1         6      2336       224
    1         7      2560       224
    1         8      2784       224
    1         9      3008       224
    1        10      3232       224
    2        11      3808       576
    2        12      4160       352
    2        13      4384       224
    2        14      4608       224
    2        15      4832       224
    2        16      5056       224
    2        17      5280       224
    2        18      5504       224
    2        19      5728       224
    2        20      5952       224

So we lose approx. 32 bytes (32 bit addresses) or 48 bytes (64 bit addresses) for each proxy class cached compared to the original code when using FlattenedWeakCache, but we gain 8 bytes (with either 32 bit or 64 bit addresses) for each proxy class cached compared to the original code when using TwoLevelWeakCache. So which to favour, space or time?

Other comments in-line...

On 04/17/2013 07:31 AM, Mandy Chung wrote:

On 4/16/2013 7:18 AM, Peter Levart wrote:

Hi Mandy,

I prepared a preview variant of j.l.r.Proxy using WeakCache (turned into an interface plus a special FlattenedWeakCache implementation, in anticipation of creating another variant that uses two levels of ConcurrentHashMaps for backing storage but keeps the same API) just to compare performance: https://dl.dropboxusercontent.com/u/101777488/jdk8-tl/proxy-wc/webrev.01/index.html

thanks for getting this prototype done quickly.

As the values (Class objects of proxy classes) must be wrapped in a WeakReference, the same instance of WeakReference can be re-used as a key in another ConcurrentHashMap to implement quick look-up for Proxy.isProxyClass() method eliminating the need to use ClassValue, which is quite space-hungry.

I also think maintaining another ConcurrentHashMap is a good replacement for the use of ClassValue to avoid its memory overhead.

Comparing the performance, here's a summary of all 3 variants (original, patched using a field in ClassLoader and this variant): [...] The improvement is still quite satisfactory, although a little slower than the direct-field variant. The scalability is the same as with direct-field variant.

Agree - the improvement is quite good.

Space consumption of cache structure, calculated as deep-size of the structure, ignoring interned Strings, Class and ClassLoader objects using single non-bootstrap ClassLoader for defining the proxy classes and using 32 bit addressing is the following: [...] So with new ConcurrentHashMap the patched Proxy uses about 32 bytes more per proxy class. Is this satisfactory or should we also try a variant with two-levels of ConcurrentHashMaps?

The overhead seems okay to trade off the scalability. Since you have prepared for doing another variant, it'd be good to compare two prototypes if this doesn't bring too much work :) I would imagine that there might be slight difference in your measurement when comparing with proxies defined by a single class loader but the code might be simpler (might not be if you keep the same API but different implementation).
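The idea quoted above, re-using the WeakReference that wraps a cached value as the key of a second ConcurrentHashMap, might look roughly like the sketch below. All names (ReverseLookupSketch, ValueRef, containsValue) are hypothetical and not taken from the webrev; it assumes the reference compares by identity of its referent, and it omits ReferenceQueue-based expunging of cleared entries.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical illustration only; stale-entry cleanup is omitted.
final class ReverseLookupSketch {
    // Weak wrapper whose equals/hashCode follow the referent's identity.
    static final class ValueRef<V> extends WeakReference<V> {
        private final int hash;

        ValueRef(V value) {
            super(value);
            this.hash = System.identityHashCode(value);
        }

        @Override public int hashCode() {
            return hash;
        }

        @Override public boolean equals(Object o) {
            if (o == this) return true;
            if (!(o instanceof ValueRef)) return false;
            Object mine = get();
            return mine != null && mine == ((ValueRef<?>) o).get();
        }
    }

    // main cache: key -> weakly referenced value
    private final ConcurrentMap<Object, ValueRef<Class<?>>> cache =
        new ConcurrentHashMap<>();
    // reverse map: the very same ValueRef instance is re-used as key
    private final ConcurrentMap<ValueRef<Class<?>>, Boolean> reverse =
        new ConcurrentHashMap<>();

    void put(Object key, Class<?> value) {
        ValueRef<Class<?>> ref = new ValueRef<>(value);
        ValueRef<Class<?>> old = cache.put(key, ref);
        if (old != null) {
            reverse.remove(old);
        }
        reverse.put(ref, Boolean.TRUE);
    }

    // quick "is this a cached value?" check, in the spirit of isProxyClass()
    boolean containsValue(Class<?> value) {
        return reverse.containsKey(new ValueRef<Class<?>>(value));
    }
}
```

In this sketch each containsValue() call allocates a short-lived lookup reference; the reverse map itself adds only one map entry per cached value because the key object already exists.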

With TwoLevelWeakCache, there is a "step" of 108 bytes (32 bit addresses) when a new ClassLoader is encountered (a new 2nd-level ConcurrentHashMap is allocated and a new entry is added to the 1st-level CHM). There's no such "step" in FlattenedWeakCache (modulo the steps when the CHMs themselves are resized). So we roughly have 108 bytes wasted for each new ClassLoader encountered with TwoLevelWeakCache vs. FlattenedWeakCache, but we also have 40 bytes spared for each proxy class cached. TwoLevelWeakCache starts to pay off if there are at least 3 proxy classes defined per ClassLoader on average.
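The break-even figure follows from simple arithmetic, sketched here with a hypothetical helper. The 108 byte per-loader step is the figure stated above; the 40 byte per-class saving is the difference between the steady-state per-class deltas in the 32 bit tables (184 for FlattenedWeakCache, 144 for TwoLevelWeakCache).

```java
// Hypothetical helper; only the input figures come from the message.
final class BreakEvenSketch {
    // smallest n such that n * savedPerClass >= costPerLoader
    static int breakEvenClassesPerLoader(int costPerLoader, int savedPerClass) {
        return (costPerLoader + savedPerClass - 1) / savedPerClass;
    }

    public static void main(String[] args) {
        int perLoaderStep = 108;        // bytes per new ClassLoader, as stated
        int perClassSaving = 184 - 144; // bytes saved per proxy class (32 bit)
        // 108 / 40 = 2.7, so the third class per loader is the break-even point
        System.out.println(
            breakEvenClassesPerLoader(perLoaderStep, perClassSaving)); // prints 3
    }
}
```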

Regardless of which approach to use - you have added a general purpose WeakCache and the implementation class in the sun.misc package. While it's good to have such class for other jdk class to use, I am more comfortable in keeping it as a private class for proxy implementation to use. We need existing applications to migrate away from sun.misc and other private APIs to prepare for modularization.

What about package-private in java.lang.reflect? It makes Proxy itself much easier to read. When we decide which way to go, I can remove the interface and only leave a single package-private class...

Nits: can you wrap the lines around 80 columns including comments? try-catch-finally statements need some formatting fixes. Our convention is to have 'catch', or 'finally' following the closing bracket '}' in the same line. Your editor breaks 'catch' or 'finally' into the next line.

Fixed.
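For reference, the formatting convention from the nit above, with 'catch' and 'finally' on the same line as the preceding closing '}', looks like this in a small made-up example (the class and method names are hypothetical):

```java
// Hypothetical example that only illustrates brace placement.
final class TryFormatSketch {
    static String describe(String s) {
        StringBuilder log = new StringBuilder();
        try {
            Integer.parseInt(s);
            log.append("ok");
        } catch (NumberFormatException e) {
            log.append("bad");
        } finally {
            log.append(";done");
        }
        return log.toString();
    }
}
```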

Regards, Peter

Even without SecurityManager installed the performance of native getClassLoader0 was a hog. I don't know why? Isn't there an implicit reference to defining ClassLoader from every Class object?

That's right - it looks for the caller class only if the security manager is installed. The defining class loader is kept in the VM's Klass object (language-level Class instance representation in the VM) and there is no computation needed to obtain a defining class loader of a given Class object. I can only think of the Java <-> native transition overhead that could be one factor. Class.getClassLoader0 is not intrinsified. I'll find out (others on this mailing list may probably know).

Mandy


More information about the core-libs-dev mailing list