GetPrimitiveArrayCritical vs GetByteArrayRegion: 140x slow-down using -Xcheck:jni and java.util.zip.DeflaterOutputStream

Ian Rogers irogers at google.com
Mon Mar 5 19:15:09 UTC 2018


Thanks! Changing the DeflaterOutputStream buffer size to something larger than the default reduces the number of JNI native calls and is a possible workaround here. As the buffer size is an implementation detail, could the change be made in the JDK? Unfortunately, larger input sizes will bring the problem back, as the number of calls is "input size / buffer size". The JNI critical may give direct access to the array but, depending on the GC, may require a lock, so lock contention may be a significant issue with the code and contribute to tail latencies. In my original post I mentioned that this is difficult to measure, and I think good practice is to avoid JNI critical regions.
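
For illustration, here is a minimal sketch of the buffer-size workaround, using the 8192 * 64 value from the quoted message below. The class name DeflateBufferSketch and the helper estimatedCalls are hypothetical names introduced here; the helper only restates the "input size / buffer size" relation as a rough upper estimate and does not measure actual JNI calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

public class DeflateBufferSketch {
    // Buffer size taken from the quoted message; the two-argument
    // DeflaterOutputStream constructor uses 512 bytes instead.
    static final int LARGE_BUFFER = 8192 * 64;

    // Compresses input through a DeflaterOutputStream with an explicit buffer size.
    static byte[] compress(byte[] input, int bufferSize) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        Deflater deflater = new Deflater();
        try (DeflaterOutputStream out =
                 new DeflaterOutputStream(sink, deflater, bufferSize)) {
            out.write(input);
        } finally {
            // A caller-supplied Deflater is not released by close(), so end it here.
            deflater.end();
        }
        return sink.toByteArray();
    }

    // Hypothetical helper: ceiling of inputSize / bufferSize, i.e. the rough
    // call count implied by the "input size / buffer size" relation.
    static long estimatedCalls(long inputSize, int bufferSize) {
        return (inputSize + bufferSize - 1) / bufferSize;
    }
}
```

Under this estimate, a 1 MiB input goes from about 2048 calls with a 512-byte buffer to about 2 with the larger one, but the count still grows linearly with input size, which is the concern raised above.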

Thanks, Ian

On Mon, Mar 5, 2018 at 10:41 AM Xueming Shen <xueming.shen at oracle.com> wrote:

On 03/05/2018 10:28 AM, Xueming Shen wrote:
> On 03/05/2018 08:34 AM, Ian Rogers wrote:
>> Firstly, we're not running -Xcheck:jni in production code :-) During
>> development and testing it doesn't seem an unreasonable flag to enable, but
>> a 140x regression is too much to get developers to swallow.
>>
>> There are 2 performance considerations:
>> 1) the performance of -Xcheck:jni, which probably shouldn't be orders of
>> magnitude worse than without the flag.
>> 2) the problems associated with JNI criticals, for which GetByteArrayRegion
>> is a panacea but by introducing a copying overhead.
>>
>>
>
> The reason the GetByteArrayCritical was/is being used here is exactly to avoid the copy
> overhead, which was an issue escalated in the past. Though the "copy overhead" appears
> to be much bigger for the GBAC when -Xcheck:jni is used here.
>
> Another issue with the DeflaterOutputStream is the default buf size is relative too small,
> for historical reason. So with a DeflaterOutStream(deflated, new Deflater(), 8192 *64),
> is which a bigger buf/8192*64, the performance is close to the run with the -Xcheck:jni
>

typo: in which a bigger buf/8192*64 is used, .... run without the -Xcheck:jni is specified.

-Sherman



More information about the hotspot-dev mailing list