Thursday, February 21, 2013

How to Debug a Deadlock in KD & WinDBG

In my last post, I was looking at some threads in KD and noticed there was a deadlock.  I will quickly go through the analysis of that particular deadlock.  You can review that post here.

As a quick review, I dumped the stacks of the running threads in the process.  The one that we knew was hung, was waiting to be notified.  Then the problem thread that was supposed to notify was hung waiting on an object.  Inspecting the stack led us to frame 9.  Looking at the source:


DWORD WINAPI CNotificationQueue::ThreadProc( void * pv )
{
...
        while ( 0 == s_lThreadShuttingDown )
        {
            EnterCriticalSection( &s_cs );
            poNotification = RemoveHead();
            if ( poNotification )

We can see this code is taking a lock on s_cs, a global.  

0: kd> ?? CNotificationQueue::s_cs

struct _RTL_CRITICAL_SECTION
   +0x000 DebugInfo        : 0x026acca0 _RTL_CRITICAL_SECTION_DEBUG
   +0x004 LockCount        : 0n-1
   +0x008 RecursionCount   : 0n0
   +0x00c OwningThread     : (null)
   +0x010 LockSemaphore    : 0x00000558 Void
   +0x014 SpinCount        : 0x20007d0
0: kd> !cs 0x026acca0
-----------------------------------------
Critical section   = 0x026acca0 (+0x26ACCA0)
Uninitialized or deleted.
LOCKED
LockCount          = 0xE4C5EB20
WaiterWoken        = Yes
OwningThread       = 0x00857380
RecursionCount     = 0x2658C70
LockSemaphore      = 0x0
SpinCount          = 0x00000031
0: kd>

So, this code is taking a lock on an invalid critical section, and hangs.  At this point if this bug reproduces, it would be good to run this again with Application Verifier enabled on the faulty code.

How to Set the Context & View a Thread in KD



In KD you have already:
  1. Found the process you were looking for
  2. Set the context to that process
  3. Looked at the process and its threads

Often the next thing you want to do while debugging your process in user mode (UM) is look at other threads running in your process.  That is kind of what post #3 in this series was about.

In UM you might do something like:
> ~* kn 
To see what the other the other threads are doing.

In KM the equivalent it is more like:
0: kd> !process -1 7

Normally there are a lot of system threads that you don't care about, threadpool threds, SCM threads, threads injected by the kernel, etc.  Generally you can ignore 75% of the thread stacks.

In this case, I am debugging a hang in a program.  There are two threads that are actually interesting:

        THREAD 86e35400  Cid 02b8.04d4  Teb: 7f8fc000 Win32Thread: 927e99d0 WAIT: (UserRequest) UserMode Non-Alertable
            86cfd7c8  SynchronizationEvent
            86d46838  SynchronizationEvent
        Not impersonating
        DeviceMap                 a9dee8e8
        Owning Process            88d07a80       Image:         program.exe
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      122295         Ticks: 1859 (0:00:00:29.046)
        Context Switch Count      575            IdealProcessor: 0
        UserTime                  00:00:00.031
        KernelTime                00:00:00.015
Loading symbols for 75f20000       msvcrt.dll ->   msvcrt.dll
        Win32 Start Address msvcrt!_threadstartex (0x75f554bd)
        Stack Init a8d77ed8 Current a8d777f0 Base a8d78000 Limit a8d75000 Call 0
        Priority 10 BasePriority 8 UnusualBoost 1 ForegroundBoost 0 IoPriority 2 PagePriority 5
Loading symbols for 76590000       USER32.dll ->   USER32.dll
        Child-SP RetAddr  : Args to Child                       : Call Site
        a8d77800 82442d6c : 00000000 00000002 86d26580 825d7580 : nt!KiSwapContext+0x1a
        a8d77838 8244288c : 00000000 86e35400 825d8580 00000000 : nt!KiSwapThread+0x374
        a8d77870 8252e48c : 86e354d8 00000002 86e354d8 86e35400 : nt!KiCommitThreadWait+0x208
        a8d778b0 82699ab8 : ba7d97c8 00000004 00000000 00000000 : nt!KeWaitForMultipleObjects+0x3d0
        a8d77920 826997ae : 00000001 00000000 a8d77bc0 00000000 : nt!ObWaitForMultipleObjects+0x2f0
        a8d77bb0 824eae80 : 00000000 a8d77bc0 ffffff01 ea453738 : nt!NtWaitForMultipleObjects+0xba
        a8d77d00 824eac06 : a8d77bc0 00000e01 00000000 0000011a : nt!KiSystemService+0x100
        a8d77d40 77d8d29a : 825d84f0 00000000 761d6398 77ce5339 : nt!KiSWIException+0xe6 (TrapFrame @ a8d77d40)
        01dafa10 758ca340 : 01dafa28 00863888 01dafbb0 00000000 : ntdll!ZwWaitForMultipleObjects+0x6
        01dafa10 765ae4a2 : 01dafa28 00863888 01dafbb0 00000000 : KERNELBASE!WaitForMultipleObjectsEx+0x108
        01dafb98 765ae3a6 : 00000000 00000080 0001d4c0 00000000 : USER32!RealMsgWaitForMultipleObjectsEx+0xc2
        01dafbf8 00dad464 : 00000000 00000000 0001d4c0 765ae369 : USER32!MsgWaitForMultipleObjects+0x3e
        01dafc18 00dafbf4 : 00001cbf 00000000 00020158 00000400 : my_dll!obj4::WaitForNotification+0x74
        01dafc60 00dc5a68 : 00695f38 00000000 01dafc80 00000000 : my_dll!obj3::Listen+0x37c
        01dafdb8 00dbfbf4 : 0086f700 77e069d0 826803f9 01dafdd8 : my_dll!obj2::Execute+0x184
        01dafdd0 75f554a2 : 01dafde0 75f554a3 77d59b01 01dafe00 : my_dll!obj1::method1+0x1c
        01dafdd8 75f5550a : 77d59b01 01dafe00 01dafdf0 75f5550b : msvcrt!_callthreadstartex+0x16
        01dafde8 765116c0 : 765116b5 75f554bd 01dafdf8 765116c1 : msvcrt!_threadstartex+0x4e
        01dafdf8 77d842f0 : 01dafe10 77d842f1 00000000 77d842cd : KERNEL32!BaseThreadInitThunk+0xc
        01dafe00 00000000 : 00000000 77d842cd 00000000 00000000 : ntdll!RtlUserThreadStart+0x24

Then this the other interesting thread:

        THREAD 86ddd300  Cid 02b8.0cf8  Teb: 7f8f4000 Win32Thread: 925bbcf0 WAIT: (UserRequest) UserMode Non-Alertable
            86c45678  SynchronizationEvent
        Not impersonating
        DeviceMap                 a9dee8e8
        Owning Process            88d07a80       Image:         program.exe
        Attached Process          N/A            Image:         N/A
        Wait Start TickCount      122295         Ticks: 1859 (0:00:00:29.046)
        Context Switch Count      164            IdealProcessor: 1
        UserTime                  00:00:00.078
        KernelTime                00:00:00.000
Loading symbols for 6ce70000      FunDisc.dll ->   FunDisc.dll
        Win32 Start Address FunDisc!CNotificationQueue::ThreadProc (0x6ce73725)
        Stack Init 933d7ed8 Current 933d7b70 Base 933d8000 Limit 933d5000 Call 0
        Priority 11 BasePriority 8 UnusualBoost 0 ForegroundBoost 2 IoPriority 2 PagePriority 5
        Child-SP RetAddr  : Args to Child                       : Call Site
        933d7b80 82442d6c : 00000000 00000002 86e35400 825d7580 : nt!KiSwapContext+0x1a
        933d7bb8 824428c4 : 00000000 00000001 00000000 00000002 : nt!KiSwapThread+0x374
        933d7bf0 8247aa1c : 86ddd3d8 00000000 86ddd3d8 00000000 : nt!KiCommitThreadWait+0x240
        933d7c30 826f79a8 : 00100000 86c45660 00000001 86c45660 : nt!KeWaitForSingleObject+0x3b0
        933d7c98 824eae80 : 00000000 933d7cac 00000000 82937f69 : nt!NtWaitForSingleObject+0xec
        933d7d00 824eac06 : 00000001 00000d02 00000000 000000e7 : nt!KiSystemService+0x100
        933d7d40 77d8cfe2 : 8c2514f0 00000000 0068a5e0 75f6dd09 : nt!KiSWIException+0xe6 (TrapFrame @ 933d7d40)
        02bbfa08 758bc8fa : 00000000 006925c0 6ce873c4 6ce85048 : ntdll!ZwWaitForSingleObject+0x6
        02bbfa08 6ce737a6 : 00000000 006925c0 6ce873c4 6ce85048 : KERNELBASE!WaitForSingleObjectEx+0xbe
        02bbfa68 765116c0 : 0071f180 00000000 00000000 00000000 : system_dll!CNotificationQueue::ThreadProc+0x82
        02bbfab0 77d842f0 : 02bbfac8 77d842f1 00000000 77d842cd : KERNEL32!BaseThreadInitThunk+0xc
        02bbfab8 00000000 : 00000000 77d842cd 00000000 00000000 : ntdll!RtlUserThreadStart+0x24

Clearly thread 86e35400 is deadlocked waiting on thread 86ddd300 which also appears to be deadlocked waiting for an event (probably).  Since 86ddd300 deadlock is probably the issue here because is 86e35400 waiting for it, that is a good place to start.  In this case it is a system DLL.  You may source debug it directly if you have source code access to Windows system components as I have, or at least you can get symbols if you don't.

This post isn't about debugging deadlocks, so I am not going to show you how to solve this issue in this post, but as I have said in the past, Application Verifier is great at root causing even tricky deadlocks. 

This post is how to change your thread context in KD.  With WinDBG in UM, you do this like:
> ~3 s 
Where 3 is the thread number, and s is set.

In KD however, the kernel is omnipresent and all threads are accessible.  This how you change your thread context in KD:
0: kd> .thread /p /r 86ddd300
Implicit thread is now 86ddd300
Implicit process is now 88d07a80
.cache forcedecodeuser done
Loading User Symbols
.ModLoad: 00d60000 00df0000   D:\program.exe
.ModLoad: 77d30000 77e76000   C:\Windows\SYSTEM32\ntdll.dll
.ModLoad: 764a0000 76589000   C:\Windows\system32\KERNEL32.DLL
.ModLoad: 758a0000 75941000   C:\Windows\system32\KERNELBASE.dll
.ModLoad: 75f20000 75f99000   C:\Windows\system32\msvcrt.dll
...



/p <= will page in the threads pages if needed
/r <= is will do .relaod for you

Now you can use your UM favorite commands like you did in WinDBG like: kn, .frame, dv, r, etc.

Monday, February 11, 2013

How to View a Process' Threads in KD

In KD you have already:
  1. Found the process you were looking for
  2. And, set the context to that process
How do you look at the threads of that given process?  In windbg, you can type something like ~* kn to show the stacks of the process you are debugging.  That doesn't work in a KD.  How can you do something similar in KD?  You can use .process to show you detailed information about the threads in a process

!process ffffe00002d0a940 7 <= will show the stacks and detailed information about the process ffffe00002d0a940's threads

If you are only interested in the current process, then -1 should be specified.

!process -1 7 <= will show the threads from the current process' context

Setting the Process Context in KD

In my last post I showed you how to find your UM process in KD.  It produced output something like this:

kd> !process /m my_module.dll 0 0
Searching processes with loaded module 'my_module.dll'
**** NT ACTIVE PROCESS DUMP ****
Unable to get LeftChild of nt!_MMVAD_SHORT at ffffe00002d075c0
failed to count VADs
PROCESS ffffe00002d0a940
    SessionId: 0  Cid: 0394    Peb: 7ff72d81d000  ParentCid: 025c
    DirBase: 0ca19000  ObjectTable: ffffc00001423e40  HandleCount: 706.
    Image: svchost.exe

To change your context to this process, you use the .process command.

.process /r /p ffffe00002d0a940

/r <= is for reloading the symbols for the new process context
/p <= is for translating the page table entries so you can inspect the process' memory translated to KM

How To Find a User Mode Process With KD

Windbg, NTSD, and KD are based off the same debug engine so most things are the same; however, when debugging in kernel mode, some things are a little different.  Specifically you are debugging the entire computer and not an instance of the program.

How do you find your process within a kernel debugger (KD)?

Normally when you have a UM break in your KD, the debugger is put into the break's process and thread context.  What if you want to debug another process, say the RPC server?

!process is your command.  It lets you walk the kernel's list of process objects.

!process 0 0 <= will list the basic information about all processes
!process 0 7 <= will display detailed information about all process, this can be a lot
!process fb667780 7 <= will display detailed information about process fb667780, but you need to know the process ID or process address

If you don't know the process ID or address, but do know the name of the module that is in the process you are looking for, you can try this:

!process /m module.dll 0 0 <= this will give you the needed information like process ID to set the context to debug that process.

It will look something like this:

kd> !process /m my_module.dll 0 0
Searching processes with loaded module 'my_module.dll'
**** NT ACTIVE PROCESS DUMP ****
Unable to get LeftChild of nt!_MMVAD_SHORT at ffffe00002d075c0
failed to count VADs
PROCESS ffffe00002d0a940
    SessionId: 0  Cid: 0394    Peb: 7ff72d81d000  ParentCid: 025c
    DirBase: 0ca19000  ObjectTable: ffffc00001423e40  HandleCount: 706.
    Image: svchost.exe
...