[OpenJDK 2D-Dev] [8] Review request for 8005607: Recursion in J2DXErrHandler() Causes a Stack Overflow on Linux (original) (raw)

Phil Race philip.race at oracle.com
Mon May 13 21:23:59 UTC 2013


Looks OK.

-phil.

On 4/24/2013 9:53 AM, Anton Litvinov wrote:

Hello Phil,

I provide answers to the questions from your last e-mail. Yes, the fix for the bug 6678385 did not take into account error handlers which are set in 2D native code area. Thank you for accepting the idea of the fix, despite theoretically possible problems, which you defined for the cases, when Java is embedded into another applications. No, after discussion with AWT team member Anthony Petrov, I would not like to remove "savederrorhandler" variable from the class "XErrorHandlerUtil", because, as Anthony stated in his last e-mail in the mail thread dedicated to this fix, this variable may be used for debugging purpose at any time and its removal is not directly related to the main purpose of this fix. Could you please review a new version of the fix containing corrections which should satisfy your remarks. Webrev: http://cr.openjdk.java.net/~alitvinov/8005607/webrev.05 The following changes were added to the fix: 1. Declarations of the local variables introduced by the fix in all C source files were moved to the beginning of the functions in order not to lead to warnings or errors of some C source code compilers. The fix in these files was modified: awtGraphicsEnv.c, awtutil.h, GLXSurfaceData.c, X11SurfaceData.c. 2. The code AWTLOCK/AWTUNLOCK was eliminated from both files "GLXSurfaceData.c" and "X11SurfaceData.c", where it was added by the earlier version of the fix. Because: - "sun.java2d.opengl.GLXSurfaceData.initPbuffer" method, whose native implementation contains the fix, is always called under AWT lock, acquired in the method "sun.java2d.opengl.OGLSurfaceData.initSurface". - "X11SurfaceData.c::X11SDCreateSharedPixmap" calling "X11SurfaceData.c::X11SDCreateSharedImage" function, containing the fix, is always guarded by AWT lock, acquired in "X11SurfaceData.c::XSharedinitSurface" function. - Despite of the fact that I was not able to find evidences of acquirement of AWT lock in all other algorithm's paths, which lead to a call of the function "X11SurfaceData.c::X11SDCreateSharedImage", I tested the fix with manual test and did not observe any case, when "sun.awt.X11.XShmAttachHandler" was exploited without AWT lock by means of the method "SunToolkit.isAWTLockHeldByCurrentThread". I also suppose that small possibility of race condition is less important than a deadlock caused by AWTLOCK/AWTUNLOCK code. Thank you, Anton On 4/22/2013 5:02 PM, Anthony Petrov wrote:

I also do not know the point of savederrorhandler variable, it became unusable with the fix for the bug "6678385", but this seems to be a stable code which I just had to move from XToolkit class to XErrorHandlerUtil without any modification.

So maybe remove it ? Ask the AWT folks what they think. The variable occupies 8 bytes only. It also allows one to uncomment the line: 120 // return XlibWrapper.CallErrorHandler(savederrorhandler, display, error.pData); for debugging purposes w/o having to wire up all the savederrorhandler machinery anew. This is needed rarely but may be useful in some cases. I'd prefer to leave it as is. -- best regards, Anthony On 03/28/13 02:34, Phil Race wrote: Hello, On 3/27/2013 3:08 PM, Anton Litvinov wrote: Hello Phil,

Thank you very much for the review. No, the original problem consists in the fact that Xlib function "XSetErrorHandler" is not thread-safe, so calling it from different threads by setting one error handler and restoring the previous error handler leads to I get that. The MT case is just the mechanism for what I described, because the install/restore was wrapping the code that needed the special handler.

such not easily reproducible bugs like this and the already fixed 6678385, for example when the second thread restores error handler set by the first thread after the first thread left the code block potentially generating XError and already unset its error handler. The only solution to this problem is introduction of one critical section for all pairs of calls to "XSetErrorHandler" function through WITHXERRORHANDLER, RESTOREXERRORHANDLER macros in the code of JDK. Even this solution is not complete, because JDK's code cannot influence invoked by it code from the third-party libraries, which also sets or potentially sets own error handlers. The purpose of the fix for "6678385" bug was to guarantee that AWT code sets its global error handler once and for the whole life time of Java application and allows Java code to set "synthetic" or not real error handlers without invocation of "XSetErrorHandler" function. While the idea of this fix is to guarantee that there is no place in JDK other than "src/solaris/native/sun/xawt/XlibWrapper.c", where "XSetErrorHandler" function is called. So this fix substitutes real calls to that native function for setting of "synthetic" error handlers through "sun.awt.X11.XErrorHandlerUtil" class. Except that 6678385 apparently didn't include the two 2D handlers ? Just the XAWT ones.

Yes, this pattern follows a policy relying on the assumption that no other code has a "more important" need to have its X error handler installed, but with one correction which is "no other code in JDK". So this constraint is not applicable to a code of any third-party library, since it is impossible without collaboration between JDK and third parties on definition of common critical section. Unfortunately, I did not know about the opportunity of embedding Java application into another application. Isn't that exactly what the SWT /AWT bridge is, which is what started off 6678385 ? Fortunately that doesn't seem to rely on this behaviour and in fact needed 667838. But I also wonder about embedding AWT into FX too .. So I don't know of actual problems for specific apps, but it seems theoretically possible. If this was already policy for XAWT we likely have this issue anyway so I suppose we just go with this approach until its proven to be a problem. I also do not know the point of savederrorhandler variable, it became unusable with the fix for the bug "6678385", but this seems to be a stable code which I just had to move from XToolkit class to XErrorHandlerUtil without any modification. So maybe remove it ? Ask the AWT folks what they think. AWTLOCK/AWTUNLOCK code was added to guarantee that no other thread from JDK code both Java and native can set and unset synthetic error handlers simultaneously. This is the critical section, which I described in my first passage of this e-mail. That didn't completely answer the point. It wasn't needed before, so did you see a real problem ? It looked to me like we only get here if we have the AWTLOCK anyway, but I didn't exhaustively trace. -phil. Thank you, Anton On 3/27/2013 12:12 AM, Phil Race wrote: If I correctly understand the original problem it was that the restoration of an X Error Handler was expected to be to the original one installed by the XToolkit but there is no guarantee of that, so the essence of this fix is that we install our error handlers as we need them but then RESTOREXERRORHANDLER() is a bit of a misnomer since it really leaves the handler installed (as far as X11 is concerned) and in the Java code simply discardscurrenterrorhandler. Then if an error occurs the Java code will fall through to SAVEDXERRORHANDLER() which just ignores it. I suppose this policy relies on the assumption that no other code has a "more important" need to have its X error handler installed, so we have no obligation to restore it after we are done. I wonder if this is the right thing to do if we are embedded in another application. And I am not sure what the point is of savederrorhandler in XErrorHandlerUtil.java since you never use it.

561 JNIEnv* env = (JNIEnv*)JNUGetEnv(jvm, JNIVERSION12); 562 AWTLOCK(); 563 jboolean xShmAttachResult = TryXShmAttach(env, awtdisplay, shminfo); 564 AWTUNLOCK(); embedding these declarations in the middle of the function may trigger a C compiler warning or error depending on the compiler. Best to move the declarations. Same in the other file. I'm curious, why did you add the AWTLOCK/AWTUNLOCK which was not there before? It may have been considered 'harmless' even if we already have the lock on this thread and its a Reentrant lock but it does increase the risk of deadlock, plus its got JNI up-call overhead .. but we seem to have a ton of that anyway. -phil. On 3/26/2013 5:40 AM, Anton Litvinov wrote: Hello, Please review the following fix for a bug. The fix passed 3 cycles of review by AWT development team. Artem Ananiev and Anthony Petrov approved it. But because the fix modifies also Java 2D Graphics code, review by 2D Graphics development team is necessary. New "webrev.04" was generated to resolve problem in not smooth patching of the latest version of the file "src/solaris/native/sun/java2d/opengl/GLXSurfaceData.c". Bug: http://bugs.sun.com/bugdatabase/viewbug.do?bugid=8005607 Webrev: http://cr.openjdk.java.net/~alitvinov/8005607/webrev.04 Thank you, Anton On 2/20/2013 5:43 PM, Artem Ananiev wrote: Looks fine. Thanks, Artem On 2/18/2013 8:08 PM, Anton Litvinov wrote: Hello Artem,

Could you please review a new version of the fix. The method "XErrorHandlerUtil.getDisplay()" was removed and "XErrorHandlerUtil.XSync()" method refers to the field "display" directly now. Webrev: http://cr.openjdk.java.net/~alitvinov/8005607/webrev.03 Thank you, Anton On 2/18/2013 4:23 PM, Artem Ananiev wrote:

On 2/18/2013 4:04 PM, Anton Litvinov wrote: Hello Artem,

Thank you very much for the review of this fix. My responses to your questions are provided below in the same order, which you defined. 1. I think that "XErrorHandlerUtil.savederror" field can surely be marked as private, but in this case the corresponding "XErrorHandlerUtil.getSavedError" method will be necessary, because this field is actively accessed from other classes which set a certain instance of XErrorHandler. For example "MotifDnDDropTargetProtocol.java", "XDragSourceProtocol.java" and a few other classes edited in this fix. OK, I missed that usages when looking at the webrev. Let it stay unchanged now. 2. Yes, I completely agree that "XErrorHandlerUtil.getDisplay()" is reduntant. This method will be eliminated. Thanks, Artem Thank you, Anton On 2/18/2013 3:16 PM, Artem Ananiev wrote: Hi, Anton,

a few minor comments: 1. XErrorHandlerUtil: can savederror be private instead of package protected? 2. XErrorHandlerUtil.getDisplay() seems to be redundant. In general, the fix looks perfectly fine to me. Please, wait for comments from Java2D team, though. Thanks, Artem On 2/13/2013 8:57 PM, Anton Litvinov wrote: Hello Anthony,

Could you please review the third version of the fix containing modifications discussed with you in the previous letter. Webrev: http://cr.openjdk.java.net/~alitvinov/8005607/webrev.02 This version of the fix differs from the previous in the following places: 1. A comment about the place of invocation of the method "XErrorHandlerUtil.init" was added to a documentation block of the method. 2. A code related to XShmAttach function common to the files "src/solaris/native/sun/awt/awtGraphicsEnv.c" and "src/solaris/native/sun/java2d/x11/X11SurfaceData.c" was extracted into a separate function "TryXShmAttach" declared in "src/solaris/native/sun/awt/awtGraphicsEnv.h" file. 3. All JNI code related to X error handling was implemented as corresponding macros defined in "src/solaris/native/sun/awt/awtutil.h" file. Thank you, Anton On 1/31/2013 7:42 PM, Anton Litvinov wrote: Hello Anthony,

Thank you for the review and these remarks. Surely, the comment will be added. I think that all JNI code related to XShmAttach can be definitely transferred into a separate dedicated function, which will be declared in "src/solaris/native/sun/awt/awtGraphicsEnv.h" file. I will try to wrap all JNU calls connected with XErrorHandler into the particular "WITHXERRORHANDLER", "RESTOREXERRORHANDLER" functions or macros. Thank you, Anton On 1/31/2013 4:57 PM, Anthony Petrov wrote: Hi Anton,

A couple comments: 1. src/solaris/classes/sun/awt/X11/XErrorHandlerUtil.java 80 private static void init(long display) { This method is private and isn't called from anywhere in this class itself. This looks confusing. Please add a comment stating that this method is invoked from native code, and from where exactly.

2. Interesting that we use this machinery to call the XShmAttach() from native code twice, and the code looks quite similar in each case. Would it be possible to extract the common code in a separate function (a-la BOOL TryXShmAttach(...)) to avoid code replication? There are other usages as well, so we could also introduce a macro (such as the old EXECWITHXERRORHANDLER but now with other arguments) that would minimize all the JNU calls required to use this machinery. Otherwise the fix looks great. -- best regards, Anthony On 1/30/2013 20:14, Anton Litvinov wrote: Hello Anthony, Could you, please, review a second version of the fix, which is based on an idea of reusing the existing AWT native global error handler from Java 2D native code. Webrev: http://cr.openjdk.java.net/~alitvinov/8005607/webrev.01 The fix consists of the following parts: 1. Migration of all X error handling code from XToolkit to a new XErrorHandlerUtil class for resolution of interdependency between a static initialization block of XToolkit and a block initializing java.awt.GraphicsEnvironment singleton. Such dependency is created by new calls to XToolkit static methods from "src/solaris/native/sun/awt/awtGraphicsEnv.c", "src/solaris/native/sun/java2d/x11/X11SurfaceData.c" files. 2. Substitution of XToolkit.WITHXERRORHANDLER, XToolkit.RESTOREXERRORHANDLER ... for corresponding methods, fields of XErrorHandlerUtil class in all places of JDK source code, where they were used. 3. Substitution of all found native X error handlers which are set in native code (awtGraphicsEnv.c, X11SurfaceData.c, GLXSurfaceData.c) for new synthetic Java error handlers. 4. Removal of X error handling code used by the native error handlers from "solaris/native/sun/awt/awtutil.c" "solaris/native/sun/awt/awtutil.h" files. Thank you, Anton On 1/11/2013 3:45 PM, Anthony Petrov wrote: I'm not Jim, but as I indicated earlier my opinion is that the easiest way to fix this is to install the existing J2DXErrHandler() only once. That is, it is the second option listed by you. Of course, the J2DXErrHandler needs to be updated as well to detect whether 2D code wants to use it at the moment or it must simply delegate to the previous handler (i.e. where the code currently installs/uninstalls the handler, it must instead set a global boolean flag or something.) While the first option (reusing the existing AWT machinery) is an interesting idea in general, I think it is complex and would require too much additional testing and bring an unjustified risk to the solution for such a basic problem. -- best regards, Anthony On 1/11/2013 14:44, Anton Litvinov wrote: Hello Jim, Thank you very much for the review and provision of a new idea of a solution. Elimination of the logic, which sets/unsets J2DXErrHandler() for each call "XShmAttach(awtdisplay, &shminfo))" should effectively resolve the issue, but only in case if all other native error handlers, which were set by the system function "XSetErrorHandler()" in JDK or in any external library, observe the rule of relaying of all events, which are not relative to them, to the previously saved error handlers. Otherwise an error generated during "XShmAttach" function call will not be handled by J2DXErrHandler(). Could you answer the following question. By setting J2DXErrHandler() only once and forever do you mean usage of AWT global event handler "static int ToolkitErrorHandler(Display * dpy, XErrorEvent * event)" from "src/solaris/native/sun/xawt/XlibWrapper.c" with Java synthetic handlers or creation of another global native error handler with J2DXErrHandler as native synthetic handler? Thank you, Anton On 1/10/2013 5:44 AM, Jim Graham wrote: I think I'd rather see some way to prevent double-adding the handler in the first place as well. Since it is only ever used on errors I also think it is OK to set it once and leave it there forever... ...jim On 1/9/13 8:08 AM, Anthony Petrov wrote: Hi Anton et al., If I read the description of the bug correctly, specifically this part: The problem occurs, if another thread (for example, GTK thread) is doing the same sort of thing concurrently. This can lead to the following situation. JVM thread: Sets J2DXErrHandler(), saves ANYPREVIOUSHANDLER as previous GTK thread: Sets some GTKHANDLER, saves J2DXErrHandler() as previous JVM thread: Restores ANYPREVIOUSHANDLER GTK thread: Restores J2DXErrHandler() JVM thread: Sets J2DXErrHandler(), saves J2DXErrHandler() as previous It is obvious that at this final step 2D is in an inconsistent state. We don't expect to replace our own error handler (and it shouldn't have been there in the first place). I realize that the fix you propose works around this problem. But this doesn't look like an ideal solution to me. BTW, IIRC, in JDK7 (and 6?) we decided to set the actual X11 error handler only once and never replace it. All the rest of the pushhandler/pophandler logic is now located in Java code (see XToolkit.SAVEDERRORHANDLER() and the surrounding logic). I believe that we should somehow share this machinery with the 2D code to avoid this sort of problems. Though I'm not sure if this will eliminate this exact issue.

2D/AWT folks: any other thoughts? -- best regards, Anthony On 12/29/2012 17:44, Anton Litvinov wrote: Hello, Please review the following fix for a bug. Bug: http://bugs.sun.com/bugdatabase/viewbug.do?bugid=8005607

https://jbs.oracle.com/bugs/browse/JDK-8005607 Webrev: http://cr.openjdk.java.net/~alitvinov/8005607/webrev.00 The bug consists in a crash which is caused by a stack overflow for the reason of an infinite recursion in AWT native function J2DXErrHandler() under certain circumstances on 32-bit Linux OS. The fix is based on introduction of the logic, which detects indirect recursive calls to J2DXErrHandler() by means of a simple counter, to J2DXErrHandler() native function. Such a solution requires minimum code changes, does not alter the handler's code significantly and eliminates this bug. Adding 2d-dev at openjdk.java.net e-mail alias to the list of recipients of this letter, because the edited function's name is related to Java 2D area of JDK, despite of the fact that the edited file is located in AWT directory. Thank you, Anton



More information about the 2d-dev mailing list