Question about thread initialization (original) (raw)

Thomas Stüfe thomas.stuefe at gmail.com
Thu Oct 18 08:23:43 UTC 2018

Previous message: Question about thread initialization
Next message: Question about thread initialization
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Oct 18, 2018 at 10:08 AM David Holmes <david.holmes at oracle.com> wrote:

On 18/10/2018 5:52 PM, Thomas Stüfe wrote: > Hi David, > On Thu, Oct 18, 2018 at 3:29 AM David Holmes <david.holmes at oracle.com> wrote: >> >> Hi Thomas, >> >> On 17/10/2018 10:33 PM, Thomas Stüfe wrote: >>> This proves to be more difficult than I initially thought :/ >>> >>> The problem is that we need to initialize stack base, size from within >>> the newly started thread itself: for stackbase, this is obvious. And >>> even stacksize - which should usually be equal to the requested stack >>> size known to the parent thread - might be larger, actually. >>> >>> But actually, on some platforms the thread-start-handshake is >>> differently implemented than on Linux. On Linux, nativethreadentry() >>> starts runnning right away after pthreadcreate(), and then waits for >>> the handshake. On Solaris/AIX/Windows we simply create the new thread >>> in suspended state, and in os::startthread(), we resume it. So >>> nativethreadentry() never ran. >> >> Hmmm my (incorrect) recollection was that the "start suspended" ability >> only handled the first part of the handshake, I didn't realize it >> removed it all. I also didn't recall it being used other than on Solaris >> - but I've never looked at the AIX code (I assumed it would be a copy of >> Linux). >> > > I think I wrote that part for AIX but cannot recollect anymore why we > do it the solaris/windows way instead of the linux way. Maybe it just > seemed more natural. Wild guess, the Linux handshake code feels like a > workaround from pre-nptl times, maybe because threads could not be > started in a suspended state? POSIX pthreads doesn't support starting suspended - and AFAIK neither does NPTL!

Ah, I see, you are right. I was confused since we do it on AIX, but there we use a non-posix addition (pthread_attr_setsuspendstate_np).

>>> Now we have a hen-end-egg problem: we want to call Threads::add() >>> before os::startthread() (I suppose, since the Thread may be short >>> lived and we do not want to add a possibly dead thread to the Threads >>> list). But we do not have the stack dimensions, since the thread on >>> these platforms never ran and hence had no opportunity to initialize >>> Thread::stackbase/size. >> >> Yes there is an unavoidable window where the thread is visible to >> "iterators" but has not yet initialized itself to the point where it is >> necessarily safe to interact with. Safepoint based iteration is not an >> issue, but we also seem to have a number of non-safepoint-based >> iterations occurring. >> >> But if we can initialise the stack base and size as the first thing in >> threadnativeentry, that should at least make the window as small as >> possible - no? And also handle the crash reporting case. > > I think this is what I will do. That will mitigate the problem > somewhat for solaris, windows and solve it completely for Linux and > Mac.

Ok. >>> Not really sure what to do here: I guess one could replace the resume >>> mechanism on Solaris/Windows/AIx with the type of monitor-based >>> handshake we do on Linux. But that seems heavy handed and may also >>> make thread creation slower. >> >> Other than Windows that would be the result if we standardized on >> POSIX-based threading. But that's not an insignificant effort in itself, >> and the performance implications would still need examining. >> > > Yes, this looks like time consuming work, and I do not have much time > left on this issue. In general I wonder whether there would be really > any performance implications with the handshake technique - and if > there are, why we use it then on Linux instead of cresting the thread > in suspended state (I know, it is probably all historical). As I said no start-suspended with POSIX pthreads or NPTL. >> BTW by chance this bug seems to show code that is both susceptible to >> finding new threads too soon, and also not preventing threads from >> terminating whilst in use: >> >> https://bugs.openjdk.java.net/browse/JDK-8212207 >> > > Seems similar, yes. Wonder why we crash though. Should we not simply > return EINVAL if the pthread id is invalid? It depends how long the id has been invalid. It seems that the threads library has short-term memory when it comes to using the pthreadt of a thread that recently terminated - the pthreadt->tid field is set to a negative value and it will cause ESRCH in that case. But if the pthreadt memory has been reclaimed and reused ... potential boom!

I see, That is rude :) But I guess that is alright since after pthread_join the pthread_t is invalid and use of it is UB.

Cheers, Thomas

Cheers, David

> ..Thomas > >> Cheers, >> David >> >>> Cheers, Thomas >>> >>> On Mon, Oct 15, 2018 at 9:58 PM David Holmes <david.holmes at oracle.com> wrote: >>>> >>>> Hi Thomas, >>>> >>>> I'm not sure I see a reason for splitting. Just move >>>> recordstackbaseandsize after the essential initialization in >>>> threadnativeentry - probably just before the: >>>> >>>> loginfo(os, thread)("Thread is alive (tid: ... >>>> >>>> as long as it happens before the handshake to the creating thread it >>>> should fix the problem of being able to see the new thread before its >>>> stack is set. The early crash problem is improved even if not perfect >>>> and better handled in the error reporter IMHO - as it needs to be >>>> resilient anyway. >>>> >>>> Cheers, >>>> David >>>> >>>> On 16/10/2018 1:41 AM, Thomas Stüfe wrote: >>>>> Hi David, >>>>> >>>>> On Mon, Oct 15, 2018 at 7:32 AM David Holmes <david.holmes at oracle.com> wrote: >>>>>> >>>>>> On 13/10/2018 5:49 PM, Thomas Stüfe wrote: >>>>>>> On Sat, Oct 13, 2018 at 8:36 AM Thomas Stüfe <thomas.stuefe at gmail.com> wrote: >>>>>>>> >>>>>>>> Hi David, >>>>>>>> >>>>>>>> On Sat, Oct 13, 2018 at 1:42 AM David Holmes <david.holmes at oracle.com> wrote: >>>>>>>>> >>>>>>>>> Hi Thomas, >>>>>>>>> >>>>>>>>> On 13/10/2018 3:56 AM, Thomas Stüfe wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> thank you for your brain cycles. >>>>>>>>>> >>>>>>>>>> On Fri, Oct 12, 2018 at 12:39 PM David Holmes <david.holmes at oracle.com> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Thomas, >>>>>>>>>>> >>>>>>>>>>> On 12/10/2018 6:18 PM, Thomas Stüfe wrote: >>>>>>>>>>>> Hi all, >>>>>>>>>>>> >>>>>>>>>>>> a small question. >>>>>>>>>>>> >>>>>>>>>>>> JVMStartThread calls new JavaThread() >>>>>>>>>>>> JavaThread::JavaThread() calls os::createthread() >>>>>>>>>>>> os::createthread() starts the new thread and waits for the >>>>>>>>>>>> handshake, then returns >>>>>>>>>>>> >>>>>>>>>>>> Back in JVMStartThread, we call JavaThread::prepare(), which adds the >>>>>>>>>>>> new thread to the Threads list. By that time, the new thread is >>>>>>>>>>>> already running, but how far it has gotten is unknown. >>>>>>>>>>> >>>>>>>>>>> Right - though the new thread can't enter run() until after it has been >>>>>>>>>>> prepared and "started". There are two parts to the handshake: >>>>>>>>>>> >>>>>>>>>>> Parent thread New Thread >>>>>>>>>>> start new Thread >>>>>>>>>>> wait for new thread >>>>>>>>>>> signal parent >>>>>>>>>>> wait for parent >>>>>>>>>>> prepare new thread >>>>>>>>>>> "start" new thread >>>>>>>>>>> signal new thread >>>>>>>>>>> run() >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Ah, I see. The new thread is taken off the leash only at the end of >>>>>>>>>> JVMStartThread(), when Thread::start() is called and the final part >>>>>>>>>> of the handshake is completed. >>>>>>>>>> >>>>>>>>>>>> The new thread's stack dimensions are set from within Thread::run() >>>>>>>>>>>> (for some reason, every child class does this on its own?) by calling >>>>>>>>>>>> Thread::recordstackbaseandsize(). So, after the handshake with its >>>>>>>>>>>> parent thread. Why? >>>>>>>>>>> >>>>>>>>>>> Good question. Undoubtedly historical. :) >>>>>>>>>>> >>>>>>>>>>>> This means we have a little race: in the Threads list there may be >>>>>>>>>>>> threads which have been just created and Thread::run() did not yet get >>>>>>>>>>>> around to set the stack size. In tests I stumbled over this, very >>>>>>>>>>>> rarely, when iterating Threads to check the stack sizes. >>>>>>>>>>> >>>>>>>>>>> Hmmm. Threads that are still threadnew should really be invisible >>>>>>>>>>> until more fully initialized. How exactly were you iterating the >>>>>>>>>>> threads? This might be an oversight in the related APIs. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This was because of a little unrelated test I wrote to accumulate the >>>>>>>>>> combined thread stack sizes. They did not add up and I was confused. >>>>>>>>>> >>>>>>>>>> I am now less confused: >>>>>>>>>> >>>>>>>>>> In JVMStartThread(): >>>>>>>>>> >>>>>>>>>> { >>>>>>>>>> grab thread list lock >>>>>>>>>> new JavaThread() >>>>>>>>>> - calls pthreadcreate, new thread starts and waits for >>>>>>>>>> handshake. stack base, size still NULL,0. >>>>>>>>>> JavaThread::prepare() >>>>>>>>>> - thread gets added to Threads list >>>>>>>>>> a } // relinquish threads lock >>>>>>>>>> ... >>>>>>>>>> ... >>>>>>>>>> Threads::start() >>>>>>>>>> - signal new thread to run >>>>>>>>>> b - new thread completes handshake, calls Thread::run(), and all >>>>>>>>>> implementations pretty much immediately set the stackbase/size. >>>>>>>>>> >>>>>>>>>> Between (a) and (b) another thread could grab the Threads lock, >>>>>>>>>> iterate the threads and would see the new thread with uninitialized >>>>>>>>>> base/size. >>>>>>>>>> >>>>>>>>>> To prove this I did a little test right there, in JVMStartThread, after (a): >>>>>>>>>> >>>>>>>>>> + { >>>>>>>>>> + MutexLocker mu(Threadslock); >>>>>>>>>> + MyThreadClosure tc; >>>>>>>>>> + Threads::threadsdo(&tc); >>>>>>>>>> + } >>>>>>>>>> >>>>>>>>>> the thread closure just prints the stack dimensions of the thrad. Sure >>>>>>>>>> enough, the new thread still has NULL/0 for stack base/size. In fact, >>>>>>>>>> Thread::stackbase() asserts because of this. >>>>>>>>>> >>>>>>>>>> So, could not another thread happen to do the same in this little time interval? >>>>>>>>> >>>>>>>>> There should be some rules about how thread closures gather their >>>>>>>>> target threads based on the thread state. Part of that should (now that >>>>>>>>> I think about it) filter out threads that are still "new" - but whether >>>>>>>>> they do or not is a different matter. This may well just be an oversight. >>>>>>>>> >>>>>>>> >>>>>>>> I think so too. >>>>>>>> >>>>>>>>>>>> Is there any reason why we could not just call >>>>>>>>>>>> recordstackbaseandsize() before calling Thread::run(), right at >>>>>>>>>>>> the start of the native entry function (threadnativeentry in the >>>>>>>>>>>> case of Linux)? >>>>>>>>>>> >>>>>>>>>>> Have you tried it? :) >>>>>>>>>>> >>>>>>>>>>> I can't immediately see why this can't happen in the >>>>>>>>>>> threadnativeentry. It's possible there was once a dependency with >>>>>>>>>>> something in thread preparation etc, but that's just speculation on my part. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Funny, I wanted to test it and then it occurred to me that we do this >>>>>>>>>> all along already on AIX: At the entrace of threadnativeentry I set >>>>>>>>>> stack base and size (that was even my own code from the initial AIX >>>>>>>>>> port). Unfortunately I have lost the history for that change and do >>>>>>>>>> not know anymore why I did this. >>>>>>>>>> >>>>>>>>>> Since we never had problems on AIX I guess this is okay for other >>>>>>>>>> platforms too - as long os::currentstackbase() / >>>>>>>>>> currentstacksize() have no side effects and do not rely on anything >>>>>>>>>> but OS functionality. >>>>>>>>> >>>>>>>>> Okay so are you going to propose making such a change? >>>>>>>>> >>>>>>>> >>>>>>>> Yes. I'll prepare a webrev. >>>>>>>> >>>>>>> >>>>>>> Yikes Thread::recordstackbaseandsize() is complex. That thing's >>>>>>> name is a blatant lie, it does way more than that, especially on >>>>>>> Solaris. >>>>>> >>>>>> To be fair the problem is in the naming of os::initializethread() and >>>>>> the fact that on Solaris it has also been used for additional thread >>>>>> initialization when only intended for fixing up issues with the >>>>>> primordial thread's stack. And it seems only an issue with Solaris as >>>>>> everywhere else it is a empty method (and I'm unclear how it ended up as >>>>>> an .cpp function!) >>>>>> >>>>> >>>>> Okay maybe I overreacted. The function name was probably well choosen >>>>> some time in the past and then the function grew in scope. >>>>> >>>>>> I would think you can move recordstackbaseandsize() in >>>>>> threadnativeentry; delete initializethread(Thread t) completely. Put >>>>>> the code that was in the Solaris initializethread into the Solaris >>>>>> naitvethreadentry after recordstackbaseandsize(). >>>>>> >>>>> >>>>> Replacing os::initializethread() with a Solaris-only local solution >>>>> seems clean and very reasonable. >>>>> >>>>> I am not so sure of moving the call to >>>>> Thread::recordstackbaseandsize() up to threadnativeentry, not in >>>>> its current form. >>>>> >>>>> I worry about this part: >>>>> >>>>> #if INCLUDENMT >>>>> // record thread's native stack, stack grows downward >>>>> MemTracker::recordthreadstack(stackend(), stacksize()); >>>>> #endif // INCLUDENMT >>>>> logdebug(os, thread)("Thread " UINTXFORMAT " stack dimensions: " >>>>> PTRFORMAT "-" PTRFORMAT " (" SIZEFORMAT "k).", >>>>> os::currentthreadid(), p2i(stackbase() - stacksize()), >>>>> p2i(stackbase()), stacksize()/1024); >>>>> >>>>> Both NMT recording and logging are probably sensitive to the current >>>>> Thread not fully initialized or Thread::current() not set at all >>>>> (depending on how early we want to set stackbase and size). So I >>>>> still prefer to chop this function in two, move the stackbase and >>>>> size part up to the start of nativethreadentry, and leaving the rest >>>>> where it is, under a new name. >>>>> >>>>>>> I feel unsure about moving this up to the start of the native entry >>>>>>> function, since it is not aware of its surrounding Thread class being >>>>>>> only half initialized. >>>>>>> >>>>>>> On AIX, we just call "setstackbase" and "setstacksize" directly, >>>>>>> and later, in shared code, run thru >>>>>>> Thread::recordstackbaseandsize() like everyone else. So, we just >>>>>>> pre-initialized base and size. I do not want to do this for all >>>>>>> platforms, since this is ugly. >>>>>>> >>>>>>> I wonder whether a better solution would be to change >>>>>>> Thread::recordstackbaseandsize(): >>>>>>> >>>>>>> 1 move the calls to setstackbase() and setstacksize() to the OS >>>>>>> specific threadnativeentry(). We can call this right away in the >>>>>>> newly born thread, since it relies on nothing else. >>>>>>> >>>>>>> 2 leave the rest in place and rename the function to something like >>>>>>> "Thread::finishinitialization()" >>>>>>> >>>>>>> 3 move the call to Thread::finishinitialization() up out of the >>>>>>> ::run() functions to threadnativeentry() - just before it >>>>>>> calls Thread::run(). >>>>>>> >>>>>>> 4 Alternative: to (3), make Thread::run() a non-virtual public >>>>>>> function - a place for os-generic and thread-class-generic common >>>>>>> stuff to-be-run-before-run(). Add a second, protected method, virtual >>>>>>> Thread::dorun() = 0, which is the real function child classes should >>>>>>> override. >>>>>>> >>>>>>> BTW I remember now why I did move the stack base/size initialization >>>>>>> on AIX up to the start of threadnativeentry. We had crashes in newly >>>>>>> spawned child threads and no usable hs-err files since to print a >>>>>>> stack trace you need to know the stack dimensions. Another reason to >>>>>>> change this. >>>>>> >>>>>> Well ... a thread should be doing very little interesting in terms of >>>>>> crashes prior to calling run() and setting up the stack information, and >>>>>> there will always be somewhere it can crash before this is set up, so >>>>>> perhaps the error reporter also needs to be bit more cautious about >>>>>> assuming things about the thread initialization state and the ability to >>>>>> use the Thread API. >>>>>> >>>>> >>>>> I agree. I think on AIX we are now able to print a stack trace without >>>>> knowing the stack dimensions from Thread. >>>>> >>>>> Best Regards, Thomas >>>>> >>>>>> Cheers, >>>>>> David >>>>>> >>>>>>> ..Thomas >>>>>>> >>>>>>> >>>>>>>> Cheers, Thomas >>>>>>>> >>>>>>>>> Cheers, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Cheers, Thomas >>>>>>>>>> >>>>>>>>>>>> Am I missing something here? >>>>>>>>>>> >>>>>>>>>>> I guess we will find out :) >>>>>>>>>>> >>>>>>>>>>> Cheers, >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Thomas >>>>>>>>>>>>

Previous message: Question about thread initialization
Next message: Question about thread initialization
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the hotspot-runtime-dev mailing list