TOPIC: Random Thread getting stuck
Random Thread getting stuck 28 Dec 2011 - 09:59
I've spent the last couple of weeks investigating a very difficult bug.
First, some background: our application processes 600-1000 pairs of audio streams from the network in real time and uses one thread per pair of audio streams (two streams because it's stereo). This means our app can use up to about 1006 threads (there are a few threads in addition to the audio-receiving threads).
Each audio thread is temporary and may only last a couple of minutes, so threads are being created and exited frequently.
The problem is that after a couple of hours, one of the main threads (not one of the 1000 or so audio-receiving threads) just stops. It doesn't exit, but it's not running either!
Using WinDbg I can see the following call stack for the stuck thread:
ntdll!NtWaitForMultipleObjects+0x15 (FPO: [5,0,0])
KERNELBASE!WaitForMultipleObjectsEx+0x100 (FPO: [SEH])
kernel32!WaitForMultipleObjectsExImplementation+0xe0 (FPO: [5,8,4])
kernel32!WaitForMultipleObjects+0x18 (FPO: [4,0,0])
WARNING: Stack unwind information not available. Following frames may be wrong.
boost_thread_vc90_mt_1_47!boost::this_thread::interruptible_wait+0x199
boost_thread_vc90_mt_1_47!boost::thread::get_thread_info+0x144
boost_thread_vc90_mt_1_47!boost::thread::join+0x6c
First, why is some of the stack unwind info unavailable? If I break into the process before the problem occurs, the thread's call stack looks perfectly normal and shows all my app's methods, as you would expect. Could the stack have become corrupted?
Second, why doesn't WaitForMultipleObjects ever return? One of the objects I'm waiting on is a waitable timer that fires every 10 seconds, so at the very least I should see the wait return for that (but I don't).
I believe that if one of my wait objects were invalid, WaitForMultipleObjects would return WAIT_FAILED rather than just hang. Plus, I'm confident that none of the objects are invalid, as I have been successfully calling WaitForMultipleObjects with these objects for well over an hour before the problem occurs, and no other threads close the handles that I'm waiting on.
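For reference, the return values under discussion can be sketched as a small logging helper. This is only an illustration, not code from the application: describe_wait_result is a hypothetical name, it assumes the wait is made with bWaitAll set to FALSE, and the constants in the non-Windows branch are stand-ins carrying the documented Win32 values so the sketch compiles anywhere.

```cpp
#include <sstream>
#include <string>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins with the documented Win32 values, so the sketch builds off Windows.
#include <cstdint>
typedef std::uint32_t DWORD;
const DWORD WAIT_OBJECT_0    = 0x00000000;
const DWORD WAIT_ABANDONED_0 = 0x00000080;
const DWORD WAIT_TIMEOUT     = 0x00000102;
const DWORD WAIT_FAILED      = 0xFFFFFFFF;
#endif

// Hypothetical helper: describes a WaitForMultipleObjects return value
// (bWaitAll == FALSE) for a wait on handle_count handles.
std::string describe_wait_result(DWORD result, DWORD handle_count) {
    std::ostringstream out;
    if (result == WAIT_FAILED) {
        // An invalid handle is reported here; it does not cause a hang.
        out << "WAIT_FAILED (check GetLastError)";
    } else if (result == WAIT_TIMEOUT) {
        // Only possible when a finite timeout was passed instead of INFINITE.
        out << "WAIT_TIMEOUT";
    } else if (result - WAIT_OBJECT_0 < handle_count) {
        out << "signaled: handle index " << (result - WAIT_OBJECT_0);
    } else if (result >= WAIT_ABANDONED_0 &&
               result - WAIT_ABANDONED_0 < handle_count) {
        out << "abandoned mutex: handle index " << (result - WAIT_ABANDONED_0);
    } else {
        out << "unexpected value " << result;
    }
    return out.str();
}
```

Logging the result of every wait this way, and passing a finite timeout instead of INFINITE, would at least show whether the call ever returns and what it reports on its last successful return.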
Anyway, here's the weirdest bit...
If I add any diagnostics to this thread to try and see what's going on then the next time I run the code it may be another thread that freezes, not this one! Basically changing the code changes which thread freezes!
Does this just look like a stack corruption problem? If so, is there any way to get WinDbg to detect what is corrupting the stack?
Thanks
Ben
Re: Random Thread getting stuck 17 Mar 2012 - 13:34
Hi Ben,
Have you solved this problem?
According to the call stack, the thread is waiting for something.
If you can attach a debugger, try kb to see the arguments passed to the wait call and !handle to inspect the objects that the thread is blocked on.
Cs.