Array accesses using sun.misc.Unsafe cause data corruption or SIGSEGV

John Rose john.r.rose at oracle.com
Fri Jul 17 19:31:04 UTC 2015


Thanks Serkan and Martijn for reporting and analyzing this.

We had a very similar bug reported internally, and we just integrated a fix: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/3816de51b5e7

Would you mind checking if it fixes your problem also?

Best wishes, — John

On Jul 12, 2015, at 5:07 AM, Serkan Özal <serkan at hazelcast.com> wrote:

Hi Martijn,

Thanks for your interest and for your comment, which helps bring more attention to this thread.

From my previous message (http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-June/018221.html):

I added some additional logs to "vm/c1/c1_Canonicalizer.cpp":

    void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
      if (OptimizeUnsafes) do_UnsafeRawOp(x);
      tty->print_cr("Canonicalizer: doUnsafeGetRaw id %d: base = id %d, index = id %d, log2scale = %d",
                    x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
    }

    void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
      if (OptimizeUnsafes) do_UnsafeRawOp(x);
      tty->print_cr("Canonicalizer: doUnsafePutRaw id %d: base = id %d, index = id %d, log2scale = %d",
                    x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
    }

So I ran the test, calculating the address as:
- "int * long" (the int is the index and the long is 8L)
- "long * long" (the first long is the index and the second long is 8L)
- "int * int" (the first int is the index and the second int is 8)

Here are the logs:

int * long:

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 33: base = id 13, index = id 27, log2scale = 3
    Canonicalizer: doUnsafeGetRaw id 36: base = id 13, index = id 27, log2scale = 3

long * long:

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 35: base = id 13, index = id 14, log2scale = 3
    Canonicalizer: doUnsafeGetRaw id 37: base = id 13, index = id 14, log2scale = 3

int * int:

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 33: base = id 13, index = id 29, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 36: base = id 13, index = id 29, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 19: base = id 8, index = id 15, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 8, index = id 15, log2scale = 0

As you can see, in the problematic runs ("int * long" and "long * long") scaling is applied twice: once for "Unsafe.put" and once for "Unsafe.get", and both instructions point to the same "base" and "index" instructions. This means that the address is scaled one time too many, because there should be only one scaling.

With this fix (or attempt, since I am not 100% sure whether it is the perfect/optimal way), I prevent multiple scaling on the same index instruction. Also, one of my previous messages (http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-July/018383.html) shows that the index is scaled multiple times, so once it has been scaled more than once, the address can point anywhere in memory.

On Sun, Jul 12, 2015 at 2:54 PM, Martijn Verburg <martijnverburg at gmail.com> wrote:

Non reviewer here, but I'd add to the comment why you don't want to scale again.

Cheers,
Martijn

On 12 July 2015 at 11:29, Serkan Özal <serkan at hazelcast.com> wrote:

Hi all,

I have created a webrev for review, including the patch, and shared it for public access here: https://s3.amazonaws.com/jdk-8087134/webrev.00/index.html

Regards.
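The double-scaling hypothesis can be checked with plain arithmetic, without touching sun.misc.Unsafe. The following is a minimal sketch, not HotSpot code: the class and method names are made up for illustration, and the start address and last "Normal" address are taken from the program output in the 4 July message quoted further down in this thread. It models an access as base + (index << log2scale) and shows which address results if the shift by 3 is applied a second time.

```java
// Illustrative model only; DoubleScalingCheck and its methods are hypothetical names.
class DoubleScalingCheck {
    // Intended address: base + (index << log2scale).
    static long scaledOnce(long base, long index, int log2scale) {
        return base + (index << log2scale);
    }

    // Address that results if the already-scaled index is shifted a second time.
    static long scaledTwice(long base, long index, int log2scale) {
        return base + ((index << log2scale) << log2scale);
    }

    public static void main(String[] args) {
        long base = 0x58bbcfa0L;        // "Start address" from the quoted output
        long lastNormal = 0x58c517b0L;  // last "Normal" address printed before the crash
        long index = (lastNormal - base) >> 3; // loop counter i at that point

        System.out.println("i = " + index);
        System.out.println("scaled once:  " + Long.toHexString(scaledOnce(base, index, 3)));
        System.out.println("scaled twice: " + Long.toHexString(scaledTwice(base, index, 3)));
    }
}
```

With these inputs the twice-scaled address comes out as 59061020, which is the address reported in the siginfo line of the crash log and lies beyond the end address 58c804a0 of the allocated block.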
On Sat, Jul 4, 2015 at 9:06 PM, Serkan Özal <serkan at hazelcast.com> wrote:

Hi,

I have added some logs to show that the problem is caused by double scaling of the offset (index).

Here is my updated reproducer code (log messages added):

    int count = 100000;
    long size = count * 8L;
    long baseAddress = unsafe.allocateMemory(size);
    System.out.println("Start address: " + Long.toHexString(baseAddress) +
                       ", End address: " + Long.toHexString(baseAddress + size));
    for (int i = 0; i < count; i++) {
        long address = baseAddress + (i * 8L);
        System.out.println(
            "Normal: " + Long.toHexString(address) + ", " +
            "If double scaled: " + Long.toHexString(baseAddress + (i * 8L * 8L)));
        long expected = i;
        unsafe.putLong(address, expected);
        unsafe.getLong(address);
    }

After some time it crashes with:

    ...
    Current thread (0x0000000002068800): JavaThread "main" [_thread_in_Java, id=10412, stack(0x00000000023f0000,0x00000000024f0000)]
    siginfo: ExceptionCode=0xc0000005, reading address 0x0000000059061020
    ...
    ...

And here is the output of the execution up to the crash:

    Start address: 58bbcfa0, End address: 58c804a0
    Normal: 58bbcfa0, If double scaled: 58bbcfa0
    Normal: 58bbcfa8, If double scaled: 58bbcfe0
    Normal: 58bbcfb0, If double scaled: 58bbd020
    ...
    ...
    Normal: 58c517b0, If double scaled: 59061020

As seen from the logs and the crash dump, the double-scaled version of the target address (If double scaled: 59061020) is the same as the problematic address (siginfo: ExceptionCode=0xc0000005, reading address 0x0000000059061020) whose access causes the crash.

So I think it is obvious that the crash is caused by a wrong optimization of the index value, since the index is scaled two times (for Unsafe::put and Unsafe::get) instead of only one time.
Then the double-scaled index points to an invalid memory address.

Regards.

On Sun, Jun 14, 2015 at 2:39 PM, Serkan Özal <serkan at hazelcast.com> wrote:

Hi all,

I dug into the issue through the JDK HotSpot commits, and the issue arose after this commit: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/rev/a60a1309a03a

Then I added some additional logs to "vm/c1/c1_Canonicalizer.cpp":

    void Canonicalizer::do_UnsafeGetRaw(UnsafeGetRaw* x) {
      if (OptimizeUnsafes) do_UnsafeRawOp(x);
      tty->print_cr("Canonicalizer: doUnsafeGetRaw id %d: base = id %d, index = id %d, log2scale = %d",
                    x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
    }

    void Canonicalizer::do_UnsafePutRaw(UnsafePutRaw* x) {
      if (OptimizeUnsafes) do_UnsafeRawOp(x);
      tty->print_cr("Canonicalizer: doUnsafePutRaw id %d: base = id %d, index = id %d, log2scale = %d",
                    x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
    }

So I ran the test, calculating the address as:
- "int * long" (the int is the index and the long is 8L)
- "long * long" (the first long is the index and the second long is 8L)
- "int * int" (the first int is the index and the second int is 8)

Here are the logs:

int * long:

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 33: base = id 13, index = id 27, log2scale = 3
    Canonicalizer: doUnsafeGetRaw id 36: base = id 13, index = id 27, log2scale = 3

long * long:

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 35: base = id 13, index = id 14, log2scale = 3
    Canonicalizer: doUnsafeGetRaw id 37: base = id 13, index = id 14, log2scale = 3

int * int:

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 33: base = id 13, index = id 29, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 36: base = id 13, index = id 29, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 19: base = id 8, index = id 15, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 8, index = id 15, log2scale = 0

As you can see, in the problematic runs ("int * long" and "long * long") scaling is applied twice: once for "Unsafe.put" and once for "Unsafe.get", and both instructions point to the same "base" and "index" instructions. This means that the address is scaled one time too many, because there should be only one scaling.

When I debugged the non-problematic run ("int * int"), I saw that "instr->as_ArithmeticOp();" always returns "null", so the "match_index_and_scale" method always returns "false". So there is no scaling.

    static bool match_index_and_scale(Instruction* instr, Instruction** index, int* log2_scale) {
      ...
      ArithmeticOp* arith = instr->as_ArithmeticOp();
      if (arith != NULL) {
        ...
      }
      return false;
    }

Then I added my fix attempt, which prevents multiple scaling for Unsafe instructions that point to the same index instruction, like this:

    void Canonicalizer::do_UnsafeRawOp(UnsafeRawOp* x) {
      Instruction* base = NULL;
      Instruction* index = NULL;
      int          log2_scale;

      if (match(x, &base, &index, &log2_scale)) {
        x->set_base(base);
        x->set_index(index);
        // The fix attempt here
        // /////////////////////////////
        if (index != NULL) {
          if (index->is_pinned()) {
            log2_scale = 0;
          } else {
            if (log2_scale != 0) {
              index->pin();
            }
          }
        }
        // /////////////////////////////
        x->set_log2_scale(log2_scale);
        if (PrintUnsafeOptimization) {
          tty->print_cr("Canonicalizer: UnsafeRawOp id %d: base = id %d, index = id %d, log2scale = %d",
                        x->id(), x->base()->id(), x->index()->id(), x->log2_scale());
        }
      }
    }

In this fix attempt, if there is scaling for the Unsafe instruction, I pin the index instruction of that instruction. On subsequent calls, if the index instruction is already pinned, I assume that scaling has already been applied, so there is no need for another scaling.

After this fix, I reran the problematic test ("int * long") and it works, with these logs:

int * long (after fix):

    Canonicalizer: doUnsafeGetRaw id 18: base = id 16, index = id 17, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 20: base = id 16, index = id 19, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 22: base = id 16, index = id 21, log2scale = 0
    Canonicalizer: doUnsafeGetRaw id 24: base = id 16, index = id 23, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 35: base = id 13, index = id 14, log2scale = 3
    Canonicalizer: doUnsafeGetRaw id 37: base = id 13, index = id 14, log2scale = 0
    Canonicalizer: doUnsafePutRaw id 21: base = id 8, index = id 11, log2scale = 3
    Canonicalizer: doUnsafeGetRaw id 23: base = id 8, index = id 11, log2scale = 0

I am not sure whether my fix attempt is a real fix, or whether there are better fixes.

Regards.
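The branch logic of the fix attempt can be followed in isolation with a toy model. This is only a sketch in Java, not HotSpot code: the class PinFixModel, the Instr type, and the method effectiveLog2Scale are hypothetical stand-ins, and the model assumes (as the message does) that a pinned index means scaling was already applied by an earlier access.

```java
// Toy model of the fix attempt; not HotSpot code. All names here are hypothetical.
class PinFixModel {
    // Stand-in for a C1 IR instruction that can be pinned.
    static final class Instr {
        final int id;
        boolean pinned;
        Instr(int id) { this.id = id; }
    }

    // Mirrors the branch structure of the fix attempt: returns the log2scale
    // that would be stored on the Unsafe access that uses this index.
    static int effectiveLog2Scale(Instr index, int matchedLog2Scale) {
        if (index != null) {
            if (index.pinned) {
                return 0;             // assume an earlier access already scaled this index
            } else if (matchedLog2Scale != 0) {
                index.pinned = true;  // mark the index as scaled for later accesses
            }
        }
        return matchedLog2Scale;
    }

    public static void main(String[] args) {
        Instr index = new Instr(14);  // put and get share one index instruction
        System.out.println("put log2scale = " + effectiveLog2Scale(index, 3));
        System.out.println("get log2scale = " + effectiveLog2Scale(index, 3));
    }
}
```

Run as written, the model gives log2scale 3 for the put and 0 for the get, the same pattern as the "after fix" log lines. It also makes one property of the approach visible: the pinned flag doubles as a "scaled" marker, so in this model an index that was pinned for any other reason would get log2scale 0 as well.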
--
Serkan ÖZAL

Btw (thanks to one of my colleagues), when the address calculation in the loop is changed to

    long address = baseAddress + (i * 8)

the test passes. The only difference is that the next long pointer is calculated using integer 8 instead of long 8.

    for (int i = 0; i < count; i++) {
        long address = baseAddress + (i * 8); // <--- here, integer 8 instead of long 8
        long expected = i;
        unsafe.putLong(address, expected);
        long actual = unsafe.getLong(address);
        if (expected != actual) {
            throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
        }
    }

On Tue, Jun 9, 2015 at 1:07 PM Mehmet Dogan <mehmet at hazelcast.com> wrote:

> Hi all,
>
> While I was testing my app using java 8, I encountered the previously
> reported sun.misc.Unsafe issue.
>
> https://bugs.openjdk.java.net/browse/JDK-8076445
>
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2015-April/017685.html
>
> Issue status says it's resolved with resolution "Cannot Reproduce". But
> unfortunately it's still reproducible using "1.8.0_60-ea-b18" and
> "1.9.0-ea-b67".
>
> Test is very simple:
>
> public static void main(String[] args) throws Exception {
>     Unsafe unsafe = findUnsafe();
>     // 10000 pass
>     // 100000 jvm crash
>     // 1000000 fail
>     int count = 100000;
>     long size = count * 8L;
>     long baseAddress = unsafe.allocateMemory(size);
>
>     try {
>         for (int i = 0; i < count; i++) {
>             long address = baseAddress + (i * 8L);
>
>             long expected = i;
>             unsafe.putLong(address, expected);
>
>             long actual = unsafe.getLong(address);
>
>             if (expected != actual) {
>                 throw new AssertionError("Expected: " + expected + ", Actual: " + actual);
>             }
>         }
>     } finally {
>         unsafe.freeMemory(baseAddress);
>     }
> }
>
> The test does not fail up to version 1.8.0.31; starting with 1.8.0.40 it
> fails consistently.
>
> - With iteration count 10000, test is passing.
> - With iteration count 100000, jvm is crashing with SIGSEGV.
> - With iteration count 1000000, test is failing with AssertionError.
>
> When any one of compilation (-Xint), inlining (-XX:-Inline), or
> on-stack-replacement (-XX:-UseOnStackReplacement) is disabled, the test
> does not fail at all.
>
> I tested on platforms:
> - Centos-7/openjdk-1.8.0.45
> - OSX/oraclejdk-1.8.0.40
> - OSX/oraclejdk-1.8.0.45
> - OSX/oraclejdk-1.8.0_60-ea-b18
> - OSX/oraclejdk-1.9.0-ea-b67
>
> Previous issue comment (
> https://bugs.openjdk.java.net/browse/JDK-8076445?focusedCommentId=13633043#comment-13633043)
> says "Cannot reproduce based on the latest version". I hope that "latest
> version" does not refer to '1.8.0_60-ea-b18' or '1.9.0-ea-b67', because
> both are failing.
>
> I'm looking forward to hearing from you.
>
> Thanks,
> -Mehmet Dogan-
> --
>
> @mmdogan
>

--
Serkan ÖZAL
Remotest Software Engineer
GSM: +90 542 680 39 18
Twitter: @serkanozal

-- Serkan ÖZAL Remotest Software Engineer GSM: +90 542 680 39 18 Twitter: @serkanozal
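The reproducer in Mehmet Dogan's original report calls findUnsafe() without showing it. A common way to obtain the instance is reflection on the private static field theUnsafe; the sketch below assumes that approach, since the thread does not say how the helper was actually written, and the class name UnsafeAccess is made up. Note that the memory-access methods of sun.misc.Unsafe are deprecated for removal in recent JDK releases, so the demonstration in main may produce warnings or stop working there.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical helper; the thread does not show the original findUnsafe().
class UnsafeAccess {
    static Unsafe findUnsafe() {
        try {
            // sun.misc.Unsafe keeps its singleton in a private static field.
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = findUnsafe();
        long address = unsafe.allocateMemory(8); // room for one long
        try {
            unsafe.putLong(address, 42L);
            System.out.println(unsafe.getLong(address));
        } finally {
            unsafe.freeMemory(address);
        }
    }
}
```

With a helper like this in the same class as the reproducer, the test from the report can be run as posted; whether it fails then depends on the JVM build, as described in the thread.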



