Friday, April 29, 2022

How to Enable Application Verifier in Windbg


This is how you enable Application Verifier (appverif) if you are already in the debugger:

0:000> !gflag +vrf
Current NtGlobalFlag contents: 0x00000100
    vrf - Enable application verifier
You can then control verifier with the !avrf command.

This will display the current settings: 

0:000> !avrf
Verifier package version >= 3.00
Application verifier settings (81000000):
   - no heap checking enabled!
   - SRWLock

Other notes:

If you see something like this, then verifier is not set up for the exe.

No type information found for `verifier!_DPH_HEAP_ROOT'.
Please fix the symbols for `verifier.dll'.

Monday, April 11, 2022

Spin Until the Debugger Is Attached

Here is a handy trick to make some code wait for a debugger to attach.  

Let's say you have some user mode code you need to debug and it has some kind of tricky activation flow that makes it hard to get a debugger on the process before the code executes.  

In my case, I was helping someone on my team debug some test failures.  Some weirdness was happening when the test activated an out-of-proc COM object.  The COM server was getting loaded into a COM surrogate dllhost.exe.  We added some more WPP tracing to see if we could narrow down the issue, and it was happening when the COM object was getting activated.

We tried to use named pipe debugging like I outlined before but it wasn't really applicable because it was using the surrogate instead of the svchost. 

You can also set up windbg to automatically attach to the children of the process that you are currently debugging, but that didn't help either.  The COM runtime was the one creating the process.  I can't remember exactly, but I think it was the RPC endpoint mapper doing it.

My next thought was to make the COM server sleep long enough, say 30 seconds, for me to attach a user-mode debugger.  Our traces were showing the server's PID, or you can also use tlist.exe to find the PID.  Instead of waiting out a long fixed timeout, I added some code that spins until a debugger is attached.

Here is the code that spins until a debugger attaches.

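// Poll once a second until a user-mode debugger attaches to this process.
while (!IsDebuggerPresent()) {
   Sleep(1000);
}

// Break into the debugger that just attached.
__debugbreak();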
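If an unbounded spin is too risky to leave in a test build, the same idea can be given a deadline. The following is only a sketch under assumptions: WaitForDebugger and its parameters are hypothetical names, and the debugger check is passed in as a predicate so the helper itself stays portable.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls `attached` until it returns true or `timeout` elapses.
// Returns true if the predicate became true before the deadline.
// WaitForDebugger and its parameters are hypothetical names for this sketch.
inline bool WaitForDebugger(const std::function<bool()>& attached,
                            std::chrono::milliseconds timeout,
                            std::chrono::milliseconds poll = std::chrono::milliseconds(100)) {
    assert(poll.count() > 0);  // a zero poll interval would busy-spin
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!attached()) {
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;  // gave up; let the process carry on without a debugger
        }
        std::this_thread::sleep_for(poll);
    }
    return true;
}
```

On Windows, the call site might look like `if (WaitForDebugger([] { return IsDebuggerPresent() != FALSE; }, std::chrono::seconds(30))) { __debugbreak(); }`, so the break only happens when a debugger really did attach.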

 

Wednesday, May 19, 2021

windbg: changing stepping by line instead of instruction

In recent nightly builds of windbgx, I've noticed it has changed the initial default step behavior, or source mode.  Typically while source code debugging, when you press F10, or 'p', it steps by one line of code, which is called source mode.  Windbgx has recently been defaulting to stepping by instruction, which is called assembly mode.  A line of code is often several instructions, so you end up pressing F10 a lot to step through a number of lines of code, which is generally not what you want if you have source while debugging.

The source stepping options are controlled by the 'l' command.  Here is the documentation, but basically you want the 't' mode to step by lines of code instead of instructions.  Without the 't', the debugger is in assembly mode instead of source mode.

The command to go to source mode is the following:

> l+t


Monday, November 16, 2020

COM Access Violation (AV) Crashes Relating to Unloaded DLLs

My team uses a broadly used WinRT API.  Basically, all apps that have a device-related scenario use it.  This means buggy apps that crash will sometimes have our component featured in their Watson dumps.  Based on certain metrics, Watson will turn buckets into code bugs, so we have to triage them.  Aside: in case you were wondering, crashes reported through the WER (Windows Error Reporting) service and its dumps do generate bugs that get fixed, so WER is useful.

A number of AV crash bugs we see are related to our device watcher calling back into the app's handlers for the device events, which almost always means a lifetime management bug in the app.  
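One common shape of this kind of lifetime bug is a handler that captures a raw pointer to an object that has since been destroyed. The sketch below is plain C++ rather than WinRT, and EventSource and DeviceList are hypothetical stand-ins for a device watcher and an app object. It only illustrates the general idea of holding a weak reference in the handler and checking it before use.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event source such as a device watcher.
class EventSource {
public:
    void Add(std::function<void(int)> handler) { handlers_.push_back(std::move(handler)); }
    void Raise(int deviceId) const {
        for (const auto& handler : handlers_) handler(deviceId);
    }
private:
    std::vector<std::function<void(int)>> handlers_;
};

// Hypothetical app object that wants device notifications.
// It must be owned by a std::shared_ptr for shared_from_this() to work.
class DeviceList : public std::enable_shared_from_this<DeviceList> {
public:
    void Subscribe(EventSource& source) {
        // Capture a weak reference, not `this`, so a late callback cannot
        // touch a destroyed object.
        std::weak_ptr<DeviceList> weak = shared_from_this();
        source.Add([weak](int deviceId) {
            if (auto self = weak.lock()) {
                self->seen_.push_back(deviceId);
            }
            // Otherwise the object is gone and the event is dropped.
        });
    }
    std::size_t Count() const { return seen_.size(); }
private:
    std::vector<int> seen_;
};
```

With this shape, raising an event after the last shared_ptr to the DeviceList has been released becomes a no-op instead of an access violation. Unregistering the handler before the object goes away is the other half of the fix, which this sketch leaves out.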

You can check the memory allocations to see if they are busy or free.  This applies more to the component that owns the object's memory.  I think I covered that before.  That will let you know if the object has already been released.  A lot of times you can still see the object's state after it has been freed, but the important part is that it has been released/freed already, so the object is not being ref counted correctly.  AppVerif has a handy COM set of checks which might help if you think you may have this problem in your code.  With smart pointers, it can be a little harder to see what is happening with the object's references.  If the object model is not too complicated, you can just find the bug with code inspection.

A super common reason for AVs like this is that the DLL has been unloaded.  COM aggressively unloads DLLs that are not in use, and it calls the DLL's DllCanUnloadNow to ask whether that is safe.  If the implementation has a bug, the DLL will get unloaded while objects and other COM operations are in flight.  Eventually, when COM touches an object that belongs to that DLL, for example during CoUninitialize, it will AV.  This also applies to other COM things like when COM tries to marshal between apartments, etc.
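To make the DllCanUnloadNow contract concrete, here is a minimal portable sketch of the usual module reference count. Everything in it is an assumption for illustration: ModuleLock, DeviceHandler, and CanUnloadNow are hypothetical names, and kOk and kFalse stand in for the S_OK and S_FALSE values that a real DllCanUnloadNow returns.

```cpp
#include <atomic>
#include <cassert>

// Stand-ins for the Windows HRESULT values so the sketch compiles anywhere.
using HResult = long;
constexpr HResult kOk = 0;     // S_OK: nothing is alive, safe to unload
constexpr HResult kFalse = 1;  // S_FALSE: the module is still in use

// Every object the module hands out holds one ModuleLock (hypothetical name).
class ModuleLock {
public:
    ModuleLock() { Count().fetch_add(1); }
    ~ModuleLock() {
        const long previous = Count().fetch_sub(1);
        assert(previous > 0);  // an unbalanced release is the classic bug
        (void)previous;
    }
    ModuleLock(const ModuleLock&) = delete;
    ModuleLock& operator=(const ModuleLock&) = delete;

    // Number of live objects owned by this module.
    static std::atomic<long>& Count() {
        static std::atomic<long> count{0};
        return count;
    }
};

// Hypothetical COM-style object implemented by the module.
struct DeviceHandler {
    ModuleLock lock;  // keeps the module "busy" for the object's lifetime
};

// What a DllCanUnloadNow implementation would report for this module.
inline HResult CanUnloadNow() {
    return ModuleLock::Count().load() == 0 ? kOk : kFalse;
}
```

If some object type forgets to hold a lock, or releases it twice, CanUnloadNow reports that unloading is safe while a vtable inside the DLL is still reachable, which is the situation the crashes in this post show.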

 We see crashes like this fairly often:

0:049> .excr
rax=00007ff8f305cf70 rbx=0000026dfc5cc900 rcx=0000026dfc5cc900
rdx=0000000000012f06 rsi=0000026dfa091b00 rdi=0000026dfa091ae0
rip=00007ff990c9094d rsp=000000b92deff490 rbp=00000000800401fd
 r8=00007ff970a0b370  r9=000000b92deff5d0 r10=dee01e7974d59970
r11=000000b92deff5e0 r12=0000000000012f06 r13=00007ff970a0b370
r14=000000b92deff5d0 r15=00007ff8f305cf70
iopl=0         nv up ei pl nz na pe nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
combase!CGIPTable::GetRequestedInterface+0x22 [inlined in combase!CGIPTable::GetInterfaceFromGlobal+0x1ed]:
00007ff9`90c9094d 488b00          mov     rax,qword ptr [rax] ds:00007ff8`f305cf70=????????????????
0:049> k
  *** Stack trace for last set context - .thread/.cxr resets it
 # Child-SP          RetAddr               Call Site
00 (Inline Function) --------`--------     combase!CGIPTable::GetRequestedInterface+0x22 [onecore\com\combase\dcomrem\giptbl.cxx @ 1615] 
01 000000b9`2deff490 00007ff9`7097515b     combase!CGIPTable::GetInterfaceFromGlobal+0x1ed [onecore\com\combase\dcomrem\giptbl.cxx @ 1660] 
02 000000b9`2deff560 00007ff9`70977823     Windows_Devices_Enumeration!Gip<Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *> >::Localize+0x6b [onecore\private\base\inc\devices\Git.h @ 167] 
03 (Inline Function) --------`--------     Windows_Devices_Enumeration!GitDelegate2<Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *>,Windows::Devices::Enumeration::IDeviceWatcher *,Windows::Devices::Enumeration::IDeviceInformation *>::GetHandler+0x11 [onecoreuap\base\devices\rtenum\dllsrv\GitDelegate.h @ 254] 
04 000000b9`2deff5a0 00007ff9`70976d95     Windows_Devices_Enumeration!GitDelegate2<Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *>,Windows::Devices::Enumeration::IDeviceWatcher *,Windows::Devices::Enumeration::IDeviceInformation *>::Invoke+0x23 [onecoreuap\base\devices\rtenum\dllsrv\GitDelegate.h @ 303] 
05 (Inline Function) --------`--------     Windows_Devices_Enumeration!Microsoft::WRL::EventSource<Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *>,Microsoft::WRL::InvokeModeOptions<-2> >::InvokeAll::__l2::<lambda_00d1300cb6cb75d6b2b5344e37267964>::operator()+0x36 [onecore\external\sdk\inc\wrl\event.h @ 964] 
06 000000b9`2deff5d0 00007ff9`70976cf4     Windows_Devices_Enumeration!Microsoft::WRL::InvokeTraits<-2>::InvokeDelegates<<lambda_00d1300cb6cb75d6b2b5344e37267964>,Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *> >+0x79 [onecore\internal\sdk\inc\wrl\internalevent.h @ 121] 
07 000000b9`2deff630 00007ff9`70981d3d     Windows_Devices_Enumeration!Microsoft::WRL::EventSource<Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *>,Microsoft::WRL::InvokeModeOptions<-2> >::DoInvoke<<lambda_00d1300cb6cb75d6b2b5344e37267964> >+0x78 [onecore\external\sdk\inc\wrl\event.h @ 954] 
08 (Inline Function) --------`--------     Windows_Devices_Enumeration!Microsoft::WRL::EventSource<Windows::Foundation::ITypedEventHandler<Windows::Devices::Enumeration::DeviceWatcher *,Windows::Devices::Enumeration::DeviceInformation *>,Microsoft::WRL::InvokeModeOptions<-2> >::InvokeAll+0x19 [onecore\external\sdk\inc\wrl\event.h @ 964] 
09 000000b9`2deff670 00007ff9`8ee2d946     Windows_Devices_Enumeration!Watcher<Windows::Devices::Enumeration::DeviceWatcher,Windows::Devices::Enumeration::IDeviceWatcher,Windows::Devices::Enumeration::IDeviceWatcher2,Windows::Devices::Enumeration::DeviceInformation,Windows::Devices::Enumeration::IDeviceInformation,Windows::Devices::Enumeration::IDeviceInformation2,DeviceInformationServer,Windows::Devices::Enumeration::DeviceInformationUpdate,Windows::Devices::Enumeration::IDeviceInformationUpdate,DeviceInformationUpdateServer,&RuntimeClass_Windows_Devices_Enumeration_DeviceWatcher>::Impl::DevQueryCallback+0x3ad [onecoreuap\base\devices\rtenum\dllsrv\Watcher.h @ 889] 
0a 000000b9`2deff710 00007ff9`91b9b0ea     cfgmgr32!TQuery::ServiceActionQueue+0xe2 [onecore\base\pnp\devquery\lib\query.cpp @ 245] 
0b 000000b9`2deff7a0 00007ff9`91b3ec06     ntdll!TppWorkpExecuteCallback+0x13a [minkernel\threadpool\ntdll\work.c @ 671] 
0c 000000b9`2deff7f0 00007ff9`8ff94ede     ntdll!TppWorkerThread+0x686 [minkernel\threadpool\ntdll\worker.c @ 1109] 
0d 000000b9`2deffae0 00007ff9`91b87c6b     kernel32!BaseThreadInitThunk+0x1e [clientcore\base\win32\client\thread.c @ 70] 
0e 000000b9`2deffb10 00000000`00000000     ntdll!RtlUserThreadStart+0x2b [minkernel\ntdll\rtlstrt.c @ 1152] 
0:049> .frame 0n0;dv /t /v
00 (Inline Function) --------`--------     combase!CGIPTable::GetRequestedInterface+0x22 [onecore\com\combase\dcomrem\giptbl.cxx @ 1615] 
@rbx              struct IUnknown * pUnk = 0x0000026d`fc5cc900
@r15              void * pVtableAddress = 0x00007ff8`f305cf70
<unavailable>     HRESULT hr = <value unavailable>
0:049> dps 0x0000026d`fc5cc900
0000026d`fc5cc900  00007ff8`f305cf70 <Unloaded_xxxxxxxxx.dll>+0xc2cf70
0000026d`fc5cc908  00000001`00000000
0000026d`fc5cc910  0000026d`fbdc9af0
0000026d`fc5cc918  00080000`00000000
0000026d`fc5cc920  00000000`00000008
0000026d`fc5cc928  00000008`4d454d4c
0000026d`fc5cc930  0000026d`fc4d3bf8
You can see that the handler's DLL is already unloaded.

Here is another example:

0:000> k
combase!CStdMarshal::DisconnectSrvIPIDs::__l29::<lambda_2a3a7b5175b0a5e47c77e1d8eff078e5>::operator()+0x7
combase!ObjectMethodExceptionHandlingAction<<lambda_2a3a7b5175b0a5e47c77e1d8eff078e5> >+0x24
combase!CStdMarshal::DisconnectSrvIPIDs+0x30d
combase!CStdMarshal::DisconnectWorker_ReleasesLock+0x2d7
combase!CStdMarshal::DisconnectSwitch_ReleasesLock+0x1c
combase!CStdMarshal::DisconnectAndReleaseWorker_ReleasesLock+0x32
combase!COIDTable::ThreadCleanup+0x117
combase!FinishShutdown::__l2::<lambda_3d4acc620ec77839d81caec938b15158>::operator()+0x5
combase!ObjectMethodExceptionHandlingAction<<lambda_3d4acc620ec77839d81caec938b15158> >+0x9
combase!FinishShutdown+0x78
combase!ApartmentUninitialize+0xc9
combase!wCoUninitialize+0x17d
combase!CoUninitialize+0xea
wuaueng!UHRunRemoteHandlerServer+0x25e
...
0:000> .exr -1
ExceptionAddress: 00007ffe8571a510 (combase!CStdMarshal::DisconnectSrvIPIDs::__l29::<lambda_2a3a7b5175b0a5e47c77e1d8eff078e5>::operator()+0x0000000000000007)
   ExceptionCode: c0000005 (Access violation)
  ExceptionFlags: 00000000
NumberParameters: 2
Attempt to read from address 00007ffe59d74ee8
0:000> ln 00007ffe59d74ee8
(00007ffe`59d74ee8)   <Unloaded_xxxxxxxxxxx.dll>+0x1b4ee8

Sunday, November 3, 2019

Howto: Enable Application Verifier Within WinDbg

!gflag debugger extension

A quick way to enable AppVerifier settings from the kernel debugger is to use the !gflag debugger extension. This extension enables only the Heaps, Handles, and Locks checks. Any process that is launched after the settings are enabled will run with these AppVerifier settings.
To enable lite pageheap, Handles and Locks checks on all apps that start from here on:
   kd>!gflag +vrf
To enable full pageheap:
   kd>!gflag +hpa
To disable settings:
   kd>!gflag -vrf
   kd>!gflag -hpa

!avrf debugger extension

The !avrf extension controls the settings of Application Verifier and displays a variety of output produced by Application Verifier.
    !avrf
    !avrf -vs { Length | -a Address }
    !avrf -hp { Length | -a Address }
    !avrf -cs { Length | -a Address }
    !avrf -dlls [ Length ]
    !avrf -trm
    !avrf -ex [ Length ] 
    !avrf -threads [ ThreadID ]
    !avrf -tp [ ThreadID ]
    !avrf -srw  [ Address | Address Length ] [ -stats ]
    !avrf -leak  [ -m ModuleName] [ -r ResourceType] [ -a Address ] [ -t ]
    !avrf -trace TraceIndex 
    !avrf -cnt
    !avrf -brk [BreakEventType]  
    !avrf -flt [EventType Probability] 
    !avrf -flt break EventType 
    !avrf -flt stacks Length 
    !avrf -trg [ Start End | dll Module | all ] 
    !avrf -settings 
    !avrf -skp [ Start End | dll Module | all | Time ] 

Parameters

-vs { Length | -a Address }
Displays the virtual space operation log. Length specifies the number of records to display, starting with the most recent. Address specifies the virtual address. Records of the virtual operations that contain this virtual address are displayed.
-hp { Length | -a Address }
Displays the heap operation log. Address specifies the heap address. Records of the heap operations that contain this heap address are displayed.
-cs { Length | -a Address }
Displays the critical section delete log. Length specifies the number of records to display, starting with the most recent. Address specifies the critical section address. Records for the particular critical section are displayed when Address is specified.
-dlls [ Length ]
Displays the DLL load/unload log. Length specifies the number of records to display, starting with the most recent.
-trm
Displays a log of all terminated and suspended threads.
-ex [ Length ]
Displays the exception log. Application Verifier tracks all the exceptions in the application.
-threads [ ThreadID ]
Displays information about threads in the target process. For child threads, the stack size and the CreateThread flags specified by the parent are also displayed. If you provide a thread ID, information for only that thread is displayed.
-tp [ ThreadID ]
Displays the threadpool log. This log contains stack traces for various operations such as changing the thread affinity mask, changing thread priority, posting thread messages, and initializing or uninitializing COM from within the threadpool callback. If you provide a thread ID, information for that thread only is displayed.
-srw [ Address | Address Length ] [ -stats ]
Displays the Slim Reader/Writer (SRW) log. If you specify Address, records for the SRW lock at that address are displayed. If you specify Address and Length, records for SRW locks in that address range are displayed. If you include the -stats option, the SRW lock statistics are displayed.
-leak [ -m ModuleName] [ -r ResourceType] [ -a Address ] [ -t ]
Displays the outstanding resources log. These resources may or may not be leaks at any given point. If you specify Modulename (including the extension), all outstanding resources in the specified module are displayed. If you specify ResourceType, all outstanding resources of that resource type are displayed. If you specify Address, records of outstanding resources with that address are displayed. ResourceType can be one of the following:
Heap: Displays heap allocations using Win32 Heap APIs
Local: Displays Local/Global allocations
CRT: Displays allocations using CRT APIs
Virtual: Displays Virtual reservations
BSTR: Displays BSTR allocations
Registry: Displays Registry key opens
Power: Displays power notification objects
Handle: Displays thread, file, and event handle allocations
-trace TraceIndex
Displays a stack trace for the specified trace index. Some structures use this 16-bit index number to identify a stack trace. This index points to a location within the stack trace database.
-cnt
Displays a list of global counters.
-brk [ BreakEventType ]
Specifies a break event. BreakEventType is the type number of the break event. For a list of possible types, and a list of the current break event settings, enter !avrf -brk.
-flt [ EventType Probability ]
Specifies a fault injection. EventType is the type number of the event. Probability is the frequency with which the event will fail. This can be any integer between 0 and 1,000,000 (0xF4240). If you enter !avrf -flt with no additional parameters, the current fault injection settings are displayed.
-flt break EventType
Causes Application Verifier to break into the debugger each time this fault, specified by EventType, is injected.
-flt stacks Length
Displays Length number of stack traces for the most recent fault-injected operations.
-trg [ Start End | dll Module | all ]
Specifies a target range. Start is the beginning address of the target range. End is the ending address of the target range. Module specifies the name (including the .exe or .dll extension, but not including the path) of a module to be targeted. If you enter -trg all, all target ranges are reset. If you enter -trg with no additional parameters, the current target ranges are displayed.
-skp [ Start End | dll Module | all | Time ]
Specifies an exclusion range. Start is the beginning address of the exclusion range. End is the ending address of the exclusion range. Module specifies the name (including the .exe or .dll extension, but not including the path) of a module to be excluded. If you enter -skp all, all target ranges or exclusion ranges are reset. If you enter a Time value, all faults are suppressed for Time milliseconds after execution resumes.


Wednesday, September 25, 2019

Setting windbg to break in for C++ exceptions

Sometimes unhandled C++ exceptions will crash your program.

It might look something like this:

(fc0.5204): C++ EH exception - code e06d7363 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
KERNELBASE!RaiseException+0x69:

windbg has gotten a lot better at handling exceptions.  A lot of times, you can just type:

> .excr

And it will reconstitute the exception and set the context to the stack that threw.  Still, there are other factors that may prevent this from working like you'd want.

If that doesn't work, you can just use the trusty sx command to make the debugger break when a C++ exception is thrown:

> sxe eh

Or if you have a specific structured exception number

> sxe number
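For reference, here is a small sketch of the kind of code that raises these exceptions. ParsePort and TryParsePort are hypothetical names made up for the example. When built with MSVC and run under the debugger with sxe eh set, the throw would be expected to stop the debugger as a first-chance C++ EH exception (code e06d7363), even though the caller goes on to catch it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: parses a TCP port number and throws on bad input.
int ParsePort(const std::string& text) {
    // std::stoi throws std::invalid_argument when no digits can be parsed.
    const int port = std::stoi(text);
    if (port < 1 || port > 65535) {
        throw std::out_of_range("port out of range");
    }
    return port;
}

// Hypothetical wrapper: the exception is handled here, so the program keeps
// running, but a debugger set to break on C++ exceptions stops at the throw.
bool TryParsePort(const std::string& text, int& port) {
    try {
        port = ParsePort(text);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

If the try/catch in TryParsePort were missing, the same throw would go unhandled and take the process down, which is the crash case this post starts from.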






Wednesday, November 22, 2017

How to Find a loaded Module in Windows

Let's get right into it.  Most of the time it is very easy to find where a DLL or EXE is loaded using tlist.exe (aka task list).  In an elevated prompt, type:

> tlist /m module.dll|exe

eg.
C:\Debuggers> tlist /m cfgmgr32.dll
C:\WINDOWS\System32\cfgmgr32.dll -  828 lsass.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1108 svchost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1132 WUDFHost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1240 svchost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1304 svchost.exe
C:\WINDOWS\System32\CFGMGR32.dll - 1556 svchost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1652 svchost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1676 dwm.exe           DWM Notification Window
C:\WINDOWS\System32\cfgmgr32.dll - 1940 svchost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 2016 svchost.exe
C:\WINDOWS\System32\cfgmgr32.dll - 1348 svchost.exe
...

Generally, though, the idea is that you can use this to find the PID for debugging the component in question.

eg.
C:\Debuggers> tlist /m notepad.exe
C:\WINDOWS\system32\NOTEPAD.EXE - 12900 notepad.exe       remote.txt - Notepad
C:\Debuggers>windbg -p  12900

That is assuming that it is already loaded and running.  What if it isn't loaded?

eg.
C:\Debuggers> tlist /m hotplug.dll
No tasks found using HOTPLUG.DLL

The easiest case is when you know where it will be loaded.  For example, you may know that explorer.exe will load it eventually.  Then, you simply attach a debugger to explorer.exe.

eg.
C:\Debuggers> tlist explorer.exe
9020 explorer.exe      Program Manager
   CWD:     C:\WINDOWS\system32\
   CmdLine: "C:\WINDOWS\explorer.exe"
   VirtualSize:   2148226416 KB   PeakVirtualSize:   2149576868 KB
...
C:\Debuggers> windbg -p 9020

After that, you can do standard WinDbg stuff like setting a breakpoint:

eg.
> bp hotplug!SomeFunction

Or, if you need to break it before the DLL gets fully loaded, that is a little more work:
> sxe ld hotplug.dll
> bp hotplug!DllMain
> g

What if you have no idea where hotplug.dll gets loaded?  That is where using a KD comes in handy.