Monday, December 14, 2009

Windows Win32 Timer or Time Counting

If you want to make a timer that only depends on the kernel, GetTickCount64 is an easy way to do it.


http://msdn.microsoft.com/en-us/library/ms724411%28VS.85%29.aspx


The code is simple and would look like this:


ULONGLONG Ticks = 0;

Ticks = GetTickCount64();

//
// some code to time
//

Ticks = GetTickCount64() - Ticks;


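Note: GetTickCount64 returns the number of milliseconds since the system was last booted, and it requires Windows Vista or later. Theoretically you might have to check for wrapping of the clock; however, a 64-bit millisecond counter takes 2^64 ms, or roughly 584 million years, to wrap. It is the older 32-bit GetTickCount that wraps after 2^32 ms, or ~49.7 days. Also keep in mind that the value only advances at the system timer tick, typically every 10 to 16 ms, so if you need true 1 ms resolution you might consider other APIs.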
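To put the wraparound numbers in perspective, here is a minimal portable sketch of the arithmetic. The helper names DaysUntilWrap and ElapsedTicks32 are made up for illustration and are not part of any Windows API; the second helper shows why subtracting two unsigned 32-bit tick values still gives the right elapsed time across a single wrap.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: days until a millisecond counter that is
// counterBits wide wraps back to zero.
double DaysUntilWrap(int counterBits)
{
    assert(counterBits > 0 && counterBits <= 64);
    const double msPerDay = 1000.0 * 60.0 * 60.0 * 24.0; // 86,400,000 ms
    return std::ldexp(1.0, counterBits) / msPerDay;      // 2^bits ms, in days
}

// Hypothetical helper: elapsed ticks for a 32-bit counter. Unsigned
// subtraction is modulo 2^32, so the result is correct even if the
// counter wrapped once between the two readings.
std::uint32_t ElapsedTicks32(std::uint32_t start, std::uint32_t now)
{
    return now - start;
}
```

DaysUntilWrap(32) comes out just under 50 days, while DaysUntilWrap(64) is on the order of hundreds of billions of days, which is why wrapping is not a practical concern with the 64-bit call.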

Friday, December 11, 2009

Running cmd.exe as SYSTEM on Win 7 and Vista

In Windows, being Administrator has its limitations. Running as SYSTEM is all-powerful. It is kind of like being root in the UNIX world, except you can't just su or sudo to SYSTEM.

I have this lib that can run in kernel mode and in user mode. I would like to simulate a UM process using the lib doing the same things I would have it do in KM to measure the performance difference. To do that, I need to run the process as SYSTEM.

A while back, I remember doing exactly that in a Sysinternals class. I ran across this blog post that reminded me how to do it.

1. Download the Sysinternals PsTools
2. Copy the files to somewhere in your path
3. In an admin cmd.exe run > psexec -i -s cmd.exe

That is it. Anything you run in the shell will be run as LOCAL SYSTEM.

Tuesday, October 27, 2009

ATL Removal: Part 6 – Replacing CComQIPtr with _com_ptr_t

In part 4 of ATL removal, I talked about removing CComPtr and replacing it with _com_ptr_t. You may also run into smart pointers in the form of CComQIPtr. CComPtr and CComQIPtr have the same base class, and CComQIPtr is a superset of CComPtr. I tried to read up on CComQIPtr on MSDN, but the documentation wasn’t very helpful. In the code I am working on, CComQIPtr is used to create a smart pointer and then do a QueryInterface, hence the QI. I verified this is what goes on by looking through the atlbase.h header file from the SDK.

If you don’t want to depend on ATL only in the sense that your binary doesn't load the ATL DLLs, then it is fine to still include the .h file, but if you want to purge ATL altogether, read on. Keep in mind _com_ptr_t doesn't do everything that ATL smart pointers do, so some extra manual work may still be required. This is an example of a case where the extra legwork is required.

CComQIPtr(IUnknown* lp)
{
p=NULL;
if (lp != NULL)
lp->QueryInterface(*piid, (void **)&p);
}

Before I had some code that looked like this:
CComQIPtr<IYourInterface> spLocalVariable(m_spMemberProperty);

To fix it, I changed it to this:
_com_ptr_t<_com_IIID<IYourInterface, &__uuidof(IYourInterface)> > spLocalVariable;
if (m_spMemberProperty)
{
m_spMemberProperty->QueryInterface(__uuidof(IYourInterface), (void**)&spLocalVariable);
}

I think this should do the trick for your CComQIPtr smart pointers.

ATL Removal: Part 5 – Changing ATL’s Lock and Unlock to Slim Reader/Writer (SRW) Locks

You may encounter Lock() and Unlock() calls in your ATL based code to synchronize the usage of shared memory amongst your threads. You get these bonus methods thanks to inheriting from ATL base classes (CComObjectRoot). Since we are removing ATL, we will have to find an alternative. Enter Slim Reader/Writer (SRW) locks.

SRW locks are not loaded with tons of features like some of the other locks that Microsoft has to offer; however, they are fast and easy on the memory footprint. In fact, you can consider using SRW locks instead of ATL locking as a performance enhancement. I won’t get into SRW locks here, but you can read more about them on MSDN here:

http://msdn.microsoft.com/en-us/library/aa904937%28VS.85%29.aspx

The first thing to do is add a SRWLOCK member like so:

SRWLOCK m_Lock;

Then in your class’ constructor add the initialization:

InitializeSRWLock(&m_Lock);

Replace Lock() with this:

AcquireSRWLockExclusive(&m_Lock);

And Unlock() with this:

ReleaseSRWLockExclusive(&m_Lock);

As you can see from the MSDN documentation SRW locks can do more than exclusive locks. You might want to look at these other SRW functions to give your critical sections just the right amount of protection.

The following are the SRW lock functions (from MSDN).

SRW lock function: Description
AcquireSRWLockExclusive: Acquires an SRW lock in exclusive mode.
AcquireSRWLockShared: Acquires an SRW lock in shared mode.
InitializeSRWLock: Initialize an SRW lock.
ReleaseSRWLockExclusive: Releases an SRW lock that was opened in exclusive mode.
ReleaseSRWLockShared: Releases an SRW lock that was opened in shared mode.
SleepConditionVariableSRW: Sleeps on the specified condition variable and releases the specified lock as an atomic operation.
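One way to keep every acquire paired with its release is a small RAII guard around the exclusive lock. The sketch below is only an illustration: CSrwExclusiveGuard and CCounter are hypothetical names, not ATL or SDK classes. On Windows it calls the real SRW functions from windows.h; on other platforms a std::mutex stand-in with the same function names is defined purely so the sketch compiles, and that stand-in is not the real API.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-in so the sketch also compiles off Windows: exclusive mode only,
// backed by std::mutex. These are NOT the real SRW lock functions.
#include <mutex>
struct SRWLOCK { std::mutex m; };
inline void InitializeSRWLock(SRWLOCK *) {}
inline void AcquireSRWLockExclusive(SRWLOCK *pLock) { pLock->m.lock(); }
inline void ReleaseSRWLockExclusive(SRWLOCK *pLock) { pLock->m.unlock(); }
#endif

// Hypothetical RAII guard: acquires in the constructor and releases in the
// destructor, so an early return cannot skip the release.
class CSrwExclusiveGuard
{
public:
    explicit CSrwExclusiveGuard(SRWLOCK *pLock) : m_pLock(pLock)
    {
        assert(pLock);
        AcquireSRWLockExclusive(m_pLock);
    }

    ~CSrwExclusiveGuard()
    {
        ReleaseSRWLockExclusive(m_pLock);
    }

private:
    CSrwExclusiveGuard(const CSrwExclusiveGuard &);            // not copyable
    CSrwExclusiveGuard &operator=(const CSrwExclusiveGuard &); // not assignable
    SRWLOCK *m_pLock;
};

// Hypothetical class showing where the former Lock()/Unlock() pair would go.
class CCounter
{
public:
    CCounter() : m_value(0) { InitializeSRWLock(&m_Lock); }

    long Increment()
    {
        CSrwExclusiveGuard guard(&m_Lock); // was: Lock();
        return ++m_value;                  // guard's destructor replaces Unlock();
    }

private:
    SRWLOCK m_Lock;
    long m_value;
};
```

Calling Increment() twice in a row works because the guard has already released the lock by the time the first call returns. A shared-mode guard would look the same with the AcquireSRWLockShared and ReleaseSRWLockShared pair.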

Monday, October 26, 2009

ATL Removal: Part 4 – Replacing CComPtr with _com_ptr_t

CComPtr (a.k.a. smart pointer) is another helpful template class provided in ATL that is intended to make COM programming easier. CComPtr wraps the COM object and does all of the lifetime management for you automatically, like calling AddRef and Release. Lifetime management is not that tricky to do on your own. A pattern I like to use is to call AddRef when I get passed a COM object that I need to use, and then call Release when I am done using it. It’s not very complicated to get it right. Also, even when you are using a CComPtr, it is still possible to screw up the ref count and cause leaks or double frees, so you still need to know what you are doing with them to avoid these issues. Let’s move on to the topic at hand, replacing CComPtr.

Trying to just get rid of CComPtrs and code straight COM objects could be a pain, since it is more about redoing the entire coding pattern of the project than about modifying declarations. If you don’t want to depend on ATL only in the sense that your binary doesn't load the ATL DLLs, then it is fine to still include the .h file, but if you want to purge ATL altogether, read on. Keep in mind _com_ptr_t doesn't do everything that ATL smart pointers do, so some extra manual work may still be required.

Luckily there is compiler support for non-ATL smart pointers in the form of _com_ptr_t that will allow us to change the declarations instead of the coding pattern.

http://msdn.microsoft.com/en-us/library/417w8b3b(VS.80).aspx

So this is what a declaration might look like before:

CComPtr<IYourInterface> m_spYourInterface;

And this is what it would look like afterwards:

_com_ptr_t<_com_IIID<IYourInterface, &__uuidof(IYourInterface)> > m_spYourInterface;

Additionally, you will need to include comip.h in your stdafx.h file and, in the sources file, set USE_MSVCRT=1 and add comsuppw.lib to your TARGETLIBS to get your project to compile and link.

Here are some errors you might hit:

file.cpp(776) : error C2039: 'CopyTo' : is not a member of '_com_ptr_t<_iiid>'

The code looks something like this:

hr = m_spObject.CopyTo( ppObject );

Change it to this:

if (m_spObject)
{
m_spObject.AddRef();
}
*ppObject = m_spObject;

I guess _com_ptr_t<_iiid> does not provide that method, but the little AddRef and assignment does the same thing.


You might see this linking error as well:
error LNK2001: unresolved external symbol "void __stdcall _com_issue_error(long)" (?_com_issue_error@@YGXJ@Z)

You need to make sure you enable compiler support for _com_ptr_t smart pointers. Add the following to your sources file:

USE_MSVCRT=1

Friday, October 23, 2009

ATL Removal: Part 3 – Tackling _Module and OBJECT_MAP

In Part 1, we saw how ATL uses COM maps to automatically generate the IUnknown implementation and to manage the object's lifetime. In this installment we will tackle the object and lifetime management of the DLL.

Let’s dive into the guts of the DLL management. This is another place that ATL was intended to save time for the developer. If you have an ATL based DLL, you might notice something like this in your main DLL source file.
CComModule _Module;
BEGIN_OBJECT_MAP(ObjectMap)
OBJECT_ENTRY(CLSID_YourClass, CYourClass)
END_OBJECT_MAP()
The _Module object is designed to manage the lifetime of your DLL. Your ATL CComObject objects will automatically call _Module.Lock() and _Module.Unlock(). Awesome! If you use _Module in your ATL based DLL together with some non-ATL COM objects, make sure those objects lock and unlock _Module in their constructors and destructors respectively; I fixed a bug last year where my DLL was prematurely getting unloaded because of this. The first thing to do to replace _Module is to add:
LONG g_cLockCount = 0;

inline VOID IncModuleCount()
{
InterlockedIncrement(&g_cLockCount);
} // IncModuleCount

inline VOID DecModuleCount()
{
InterlockedDecrement(&g_cLockCount);
} // DecModuleCount
Your DLL can use that to keep track of the active objects so it knows when it is safe to unload.

Next let’s look at the object map. What does that buy for you? It basically provides you a free implementation of IClassFactory. Is it hard to write your own? No, I will show you how right now.
You can create a file called ClassFactory.h that looks like this:
class CClassFactory:
public IClassFactory
{
public:
// IUnknown
STDMETHODIMP_(ULONG) AddRef();
STDMETHODIMP_(ULONG) Release();
STDMETHODIMP QueryInterface(
REFIID riid,
__deref_out_opt void **ppv);

// IClassFactory
STDMETHODIMP CreateInstance(
__in_opt IUnknown *punkOuter,
REFIID iid,
__deref_out_opt void **ppv);

STDMETHODIMP LockServer(
BOOL fLock);

// Constructor / Destructor
CClassFactory();
~CClassFactory();

protected:
LONG m_cRef;
}; // CClassFactory
Now, let’s look at the implementation. Here is the content of ClassFactory.cpp:
#include "stdafx.h"
#include <new> // std::nothrow

//---------------------------------------------------------------------------
// Begin CClassFactory implementation
//---------------------------------------------------------------------------
extern LONG g_cLockCount;

CClassFactory::CClassFactory():
m_cRef(1)
{
InterlockedIncrement(&g_cLockCount);
} // CClassFactory::CClassFactory

CClassFactory::~CClassFactory()
{
InterlockedDecrement(&g_cLockCount);
} // CClassFactory::~CClassFactory

STDMETHODIMP_(ULONG) CClassFactory::AddRef()
{
return InterlockedIncrement(&m_cRef);
} // CClassFactory::AddRef

STDMETHODIMP_(ULONG) CClassFactory::Release()
{
LONG cRef = InterlockedDecrement(&m_cRef);

if (!cRef)
{
delete this;
}

return cRef;
} // CClassFactory::Release

STDMETHODIMP CClassFactory::QueryInterface(
REFIID riid,
__deref_out_opt void **ppv)
{
HRESULT hr = S_OK;

if (ppv)
{
*ppv = NULL;
}
else
{
hr = E_INVALIDARG;
}

if (S_OK == hr)
{
if (IID_IUnknown == riid)
{
AddRef();
*ppv = (IUnknown*)(IClassFactory*)this;
}
else if (IID_IClassFactory == riid)
{
AddRef();
*ppv = (IClassFactory*)this;
}
else
{
hr = E_NOINTERFACE;
}
}

return hr;
} // CClassFactory::QueryInterface

STDMETHODIMP CClassFactory::CreateInstance(
__in_opt IUnknown *pUnkownOuter,
REFIID riid,
__deref_out_opt void **ppv)
{
HRESULT hr = S_OK;
IUnknown *pUnknown = NULL;

if (ppv)
{
*ppv = NULL;
}
else
{
hr = E_INVALIDARG;
}

if (S_OK == hr)
{
if (pUnkownOuter)
{
hr = CLASS_E_NOAGGREGATION;
}
}

if (S_OK == hr)
{
pUnknown = new(std::nothrow) CYourClass(); // the class from the old OBJECT_ENTRY

if (!pUnknown)
{
hr = E_OUTOFMEMORY;
}
}

if (S_OK == hr)
{
hr = pUnknown->QueryInterface(riid, ppv);
}

if (pUnknown)
{
pUnknown->Release();
}

return hr;
} // CClassFactory::CreateInstance

STDMETHODIMP CClassFactory::LockServer(
BOOL fLock)
{
if (fLock)
{
InterlockedIncrement(&g_cLockCount);
}
else
{
InterlockedDecrement(&g_cLockCount);
}

return S_OK;
} // CClassFactory::LockServer
//---------------------------------------------------------------------------
// End CClassFactory implementation
//---------------------------------------------------------------------------
Next you will have to fix the rest of the DllMain cpp file. This is what a simple version would look like, clean of ATL:
extern "C"
{

BOOL APIENTRY DllMain(
HMODULE hModule,
ULONG ulReason,
__in_opt PVOID pReserved)
{
BOOL fRetVal = TRUE;

if (DLL_PROCESS_ATTACH == ulReason)
{
// Disable thread attach notifications
fRetVal = DisableThreadLibraryCalls(hModule);
}

return fRetVal;
} // DllMain

STDAPI DllGetClassObject(
__in REFCLSID rclsid,
__in REFIID riid,
__deref_out LPVOID FAR *ppv)
{
HRESULT hr = S_OK;
CClassFactory* pClassFactory = NULL;

if (ppv)
{
*ppv = NULL;
}
else
{
hr = E_INVALIDARG;
}

if (S_OK == hr)
{
if (CLSID_YourClass != rclsid) // the CLSID from the old OBJECT_ENTRY
{
hr = CLASS_E_CLASSNOTAVAILABLE;
}
}

if (S_OK == hr)
{
pClassFactory = new(std::nothrow) CClassFactory;

if (!pClassFactory)
{
hr = E_OUTOFMEMORY;
}
}

if (S_OK == hr)
{
hr = pClassFactory->QueryInterface(riid, ppv);
}

if (pClassFactory)
{
pClassFactory->Release();
}

return hr;
} // DllGetClassObject

HRESULT APIENTRY DllCanUnloadNow()
{
return (g_cLockCount == 0) ? S_OK : S_FALSE;
} // DllCanUnloadNow

} // extern "C"
Finally, you will have to fix your ATL-free COM objects so that they increment the DLL lock count in their constructors and decrement it in their destructors, by calling these:
inline VOID IncModuleCount()
{
InterlockedIncrement(&g_cLockCount);
} // IncModuleCount

inline VOID DecModuleCount()
{
InterlockedDecrement(&g_cLockCount);
} // DecModuleCount
Hopefully this will get you one step closer to being ATL free.
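As a rough illustration of that last step, here is a minimal portable sketch of an object that holds the module alive for as long as it exists. It uses std::atomic as a stand-in for the LONG counter and the Interlocked calls so that it compiles anywhere; CanUnloadNow is a hypothetical helper that mirrors the check in DllCanUnloadNow, and CYourClass here is an empty placeholder rather than a real COM object.

```cpp
#include <atomic>
#include <cassert>

// Portable stand-in for "LONG g_cLockCount" + InterlockedIncrement/Decrement.
static std::atomic<long> g_cLockCount(0);

inline void IncModuleCount()
{
    ++g_cLockCount;
}

inline void DecModuleCount()
{
    long cLocks = --g_cLockCount;
    assert(cLocks >= 0); // more decrements than increments is a bug
    (void)cLocks;
}

// Hypothetical helper mirroring DllCanUnloadNow: the module may unload
// only when no object and no LockServer(TRUE) call is holding it.
inline bool CanUnloadNow()
{
    return g_cLockCount.load() == 0;
}

// Placeholder for an ATL-free COM object: the constructor takes a module
// lock and the destructor gives it back.
class CYourClass
{
public:
    CYourClass()  { IncModuleCount(); }
    ~CYourClass() { DecModuleCount(); }
};
```

While any CYourClass instance is alive, CanUnloadNow() returns false; once the last one is destroyed it returns true again, which is the behavior the old _Module.Lock() and _Module.Unlock() calls provided.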

Thursday, October 22, 2009

ATL Removal: Part 2 – Safely Deleting ATL_NO_VTABLE

As I was going through and purging ATL from my project, I ran into the macro ATL_NO_VTABLE. It appears that some ATL project generators slap that baby into your class definition.

class ATL_NO_VTABLE CYourClass :…

It turns out that it is an optimization “that prevents the vtable pointer from being initialized in the class's constructor and destructor. If the vtable pointer is prevented from being initialized in the class's constructor and destructor, the linker can eliminate the vtable and all of the functions to which it points. Expands to __declspec(novtable).”

http://msdn.microsoft.com/en-us/library/6h06t6s8%28VS.80%29.aspx

It sounds like a good optimization for ATL code, since your object will be wrapped by CComObject anyway and is never instantiated on its own; however, with normal COM code your class is instantiated directly, so its vtable pointer has to be initialized for the IUnknown methods to work. In turning your ATL COM object into a regular COM object, it is safe, and in fact required, to just remove ATL_NO_VTABLE.

Wednesday, October 21, 2009

ATL Removal: Part 1 – Removing BEGIN_COM_MAP() and END_COM_MAP()

There are some who feel that ATL is the best thing since sliced bread for COM programming. ATL can abstract a lot of the nitty-gritty COM programming away from the COM programmer. My goal today is not to bash ATL. I think it can be a great time saver if used correctly for the right level of code. If you are programming UI, or an IE plugin, go right ahead and use ATL if you like. If you are writing lower level system APIs, please don’t. Also, ATL cannot be used as an excuse for not understanding COM programming or memory management. I have spent the last year fixing lots of bugs because someone didn’t understand what ATL was doing: double frees, double releases, memory leaks, leaked references, and so on. In other words, ATL is not a silver bullet for API usage ignorance. I don’t like using ATL because writing straight COM code is not much harder, and ATL makes debugging more annoying. So, in the end, ATL does not buy me much and is somewhat annoying. If you like ATL and know what it is really doing so you use it correctly, go ahead and use it. Independent of my no-ATL preference, one API I am maintaining needs to be “low level” and therefore should not depend on ATL, so I am removing ATL from that API.

ATL wants to internally manage the ref counts to your ATL based com object. To do this, it wraps your com object using a template class called CComObject. CComObject basically adds two extra layers to your com object. For debugging, it does make things a little bit more annoying, but it is supposed to manage the lifetime of your object for you.

The first thing to do is to remove the COM_MAP macros needed to set up all of the CComObject hooks. When you have an ATL generated COM object, it creates for you something like this in your header:

BEGIN_COM_MAP(CYourInterface)
COM_INTERFACE_ENTRY(IYourInterface)
END_COM_MAP()

Basically these macros help write your IUnknown implementation. Not really that big of a time saver as it is not that hard to implement IUnknown. The first step is to remove COM_MAP macro stuff from your header and swap it for the IUnknown definition. Here is one example that you see often:

STDMETHOD(QueryInterface)(IN REFIID riid, OUT void ** ppv);
STDMETHOD_(ULONG, AddRef)(void);
STDMETHOD_(ULONG, Release)(void);

Next, in your cpp file where you implement your com object, you will need to add the implementation of IUnknown. This too is almost boiler plate.

//////////////////////////////////////////////////////////////////////
// Implementation : IUnknown
//////////////////////////////////////////////////////////////////////

STDMETHODIMP_(ULONG) CYourInterface::AddRef()
{
return InterlockedIncrement(&m_cRef);
}

STDMETHODIMP_(ULONG) CYourInterface::Release()
{
LONG cRef = InterlockedDecrement(&m_cRef);

if (0 == cRef)
{
delete this;
}

return cRef;
}

STDMETHODIMP CYourInterface::QueryInterface(REFIID riid, __deref_out_opt void **ppv)
{
HRESULT hr = S_OK;

if (ppv)
{
*ppv = NULL;
}
else
{
hr = E_INVALIDARG;
}

if (S_OK == hr)
{
if ((__uuidof(IUnknown) == riid) || (riid == __uuidof(IYourInterface)))
{
AddRef();
*ppv = (IYourInterface *)this;
}
else
{
hr = E_NOINTERFACE;
}
}

return hr;
}

Finally, make sure that you add a:

LONG m_cRef;

member to your class, and make sure you initialize it to 1 in your constructor's initializer list:

m_cRef(1)
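To make the lifetime rules concrete, here is a minimal portable model of the same reference-counting pattern. It is not a real COM object: CRefCounted is a hypothetical class, std::atomic stands in for InterlockedIncrement and InterlockedDecrement, and the pDestroyed flag exists only so the moment of destruction can be observed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a COM object's AddRef/Release logic.
class CRefCounted
{
public:
    // The creator owns the first reference, so the count starts at 1.
    explicit CRefCounted(bool *pDestroyed)
        : m_cRef(1), m_pDestroyed(pDestroyed)
    {
    }

    unsigned long AddRef()
    {
        return static_cast<unsigned long>(++m_cRef);
    }

    unsigned long Release()
    {
        long cRef = --m_cRef;
        assert(cRef >= 0); // a negative count means a double release

        if (0 == cRef)
        {
            delete this; // the last reference is gone
        }

        // Return the local copy: members must not be touched after delete.
        return static_cast<unsigned long>(cRef);
    }

private:
    // Private destructor: the object can only die through Release().
    ~CRefCounted()
    {
        if (m_pDestroyed)
        {
            *m_pDestroyed = true;
        }
    }

    std::atomic<long> m_cRef;
    bool *m_pDestroyed;
};
```

Starting the count at 1 means the code that calls new already holds a reference and must call Release() exactly once to give it up; every additional AddRef() needs its own matching Release().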

Tuesday, October 13, 2009

GPU Accelerated FDTD Using Cg

Apparently there has been some interest in my old GPU accelerated FDTD code implemented in C and Cg. I also had a CUDA version that had a hard time matching the performance of the Cg version. I will need to check when I get home to see if I can find the other versions. Here is one version that I found on one of my old sites. It looks like it is version 0.1 and about three years old. I know I have better versions somewhere out there, but this will give you a "taste" until I can find them. Still, there are some interesting aspects of this code. I used an interesting 3D volume packing scheme for 2D textures that you can read about in GPU Gems 2. Also, a lot of the guts of this code initially came from Dom's basic math tutorial for Cg.


cg_fdtd: clean
	gcc test.c -o cg_fdtd -lglut -lGLEW -lCgGL -lpthread

clean:
	rm -f cg_fdtd

test: cg_fdtd
	./cg_fdtd 256 10


/*
* test.c
*
* author : Sam Adams
* date : 20061027
* description : This is my implementation of a basic FDTD using GPU with Cg
*/

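/* The header names were stripped from this listing; the ones below are
 * inferred from the functions the code calls. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <GL/glew.h>
#include <GL/glut.h>
#include <Cg/cg.h>
#include <Cg/cgGL.h>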

#define GLUT_WINDOW_NAME "fdtd window"

typedef enum bool{false, true} bool;
//clock_gettime
// struct for variable parts of GL calls (texture format, float format etc)
struct struct_textureParameters {
char* name;
GLenum texTarget;
GLenum texInternalFormat;
GLenum texFormat;
char* shader_source;
};

// struct actually being used (set from command line)
struct struct_textureParameters textureParameters;

int verbose = 1;
int numIterations = 100; //timesteps for fdtd
int texSize = 256;

GLuint glutWindowHandle;
GLuint fb;

enum components{X, Y, Z};

GLuint *exTexID; // these are z slices for fdtd
GLuint *eyTexID; // the +1 is for the output of the calculations
GLuint *ezTexID;
GLuint *hxTexID;
GLuint *hyTexID;
GLuint *hzTexID;
GLuint psTexID, psEmptyTexID;

// Cg vars
CGcontext cgContext;
CGprofile fragmentProfile;
CGprogram fragmentProgram;
//fdtd
////for e fields
CGprogram ex_fp, ey_fp, ez_fp;
CGparameter ex_ps, ex_e, ex_h1, ex_h2, ex_h3, ex_h4, ex_esctc, ex_eincc, ex_ei, ex_edevcn, ex_dei, ex_ecrl1, ex_ecrl2;
CGparameter ey_e, ey_h1, ey_h2, ey_h3, ey_h4, ey_esctc, ey_eincc, ey_ei, ey_edevcn, ey_dei, ey_ecrl1, ey_ecrl2;
CGparameter ez_e, ez_h1, ez_h2, ez_h3, ez_h4, ez_esctc, ez_eincc, ez_ei, ez_edevcn, ez_dei, ez_ecrl1, ez_ecrl2;
////for h fields
CGprogram hx_fp, hy_fp, hz_fp;
CGparameter hx_h, hx_e1, hx_e2, hx_e3, hx_e4, hx_dt, hx_delta1, hx_delta2;
CGparameter hy_h, hy_e1, hy_e2, hy_e3, hy_e4, hy_dt, hy_delta1, hy_delta2;
CGparameter hz_h, hz_e1, hz_e2, hz_e3, hz_e4, hz_dt, hz_delta1, hz_delta2;

float* tmpData;
//fdtd Cg update sources
// offsets use float2(...): a bare (0.0,1.0) is a comma expression that evaluates to 1.0
char *hx_hUpdate_source = \
"float hUpdate ("\
"in float2 coords : TEXCOORD0, "\
"uniform samplerRECT textureH, "\
"uniform samplerRECT textureE1, "\
"uniform samplerRECT textureE2, "\
"uniform samplerRECT textureE3, "\
"uniform samplerRECT textureE4, "\
"uniform float dt, "\
"uniform float delta1, "\
"uniform float delta2) : COLOR{ "\
"float h = texRECT(textureH, coords); "\
"float e1 = texRECT(textureE1, coords); "\
"float e2 = texRECT(textureE2, coords); "\
"float e3 = texRECT(textureE3, coords+float2(0.0,1.0)); "\
"float e4 = texRECT(textureE4, coords); "\
"return h + ((dt / 0.0000012566306) *((e1 - e2) / delta1 - (e3 - e4) / delta2)); }";

// offsets use float2(...): a bare (1.0,0.0) is a comma expression that evaluates to 0.0
char *hy_hUpdate_source = \
"float hUpdate ("\
"in float2 coords : TEXCOORD0, "\
"uniform samplerRECT textureH, "\
"uniform samplerRECT textureE1, "\
"uniform samplerRECT textureE2, "\
"uniform samplerRECT textureE3, "\
"uniform samplerRECT textureE4, "\
"uniform float dt, "\
"uniform float delta1, "\
"uniform float delta2) : COLOR{ "\
"float h = texRECT(textureH, coords); "\
"float e1 = texRECT(textureE1, coords+float2(1.0,0.0)); "\
"float e2 = texRECT(textureE2, coords); "\
"float e3 = texRECT(textureE3, coords); "\
"float e4 = texRECT(textureE4, coords); "\
"return h + ((dt / 0.0000012566306) *((e1 - e2) / delta1 - (e3 - e4) / delta2)); }";

// offsets use float2(...): a bare (0.0,1.0) is a comma expression, not a vector
char *hz_hUpdate_source = \
"float hUpdate ("\
"in float2 coords : TEXCOORD0, "\
"uniform samplerRECT textureH, "\
"uniform samplerRECT textureE1, "\
"uniform samplerRECT textureE2, "\
"uniform samplerRECT textureE3, "\
"uniform samplerRECT textureE4, "\
"uniform float dt, "\
"uniform float delta1, "\
"uniform float delta2) : COLOR{ "\
"float h = texRECT(textureH, coords); "\
"float e1 = texRECT(textureE1, coords+float2(0.0,1.0)); "\
"float e2 = texRECT(textureE2, coords); "\
"float e3 = texRECT(textureE3, coords+float2(1.0,0.0)); "\
"float e4 = texRECT(textureE4, coords); "\
"return h + ((dt / 0.0000012566306) *((e1 - e2) / delta1 - (e3 - e4) / delta2)); }";

// offsets use float2(...): a bare (0.0,1.0) is a comma expression that evaluates to 1.0
char *ex_eUpdate_source = \
"float eUpdate ("\
"in float2 coords : TEXCOORD0, "\
"uniform samplerRECT texturePS, "\
"uniform samplerRECT textureE, "\
"uniform samplerRECT textureH1, "\
"uniform samplerRECT textureH2, "\
"uniform samplerRECT textureH3, "\
"uniform samplerRECT textureH4, "\
"uniform float esctc, "\
"uniform float eincc, "\
"uniform float ei, "\
"uniform float edevcn, "\
"uniform float dei, "\
"uniform float ecrl1, "\
"uniform float ecrl2 ) : COLOR{ "\
"float ps = texRECT(texturePS, coords); "\
"float e = texRECT(textureE, coords); "\
"float h1 = texRECT(textureH1, coords); "\
"float h2 = texRECT(textureH2, coords+float2(0.0,1.0)); "\
"float h3 = texRECT(textureH3, coords); "\
"float h4 = texRECT(textureH4, coords); "\
"float tmp = ps;"\
"if(ps != 0.0) tmp = ps;"\
/*"else tmp = e * esctc - eincc * ei - edevcn * dei + (h1 - h2) * ecrl1 - (h3 - h4) * ecrl2;"\*/
"else tmp = e * esctc - eincc * ei - edevcn * dei + ((h1 - h2) * ecrl1) - ((h3 - h4) * ecrl2);"\
"return tmp;}";

//"return e * esctc - eincc * ei - edevcn * dei + (h1 - h2) * ecrl1 - (h3 - h4) * ecrl2;}";
//"return e * esctc - eincc * ei - edevcn * dei + (h1 -h2) * ecrl1;}";

// offsets use float2(...): a bare (1.0,0.0) is a comma expression that evaluates to 0.0
char *ey_eUpdate_source = \
"float eUpdate ("\
"in float2 coords : TEXCOORD0, "\
"uniform samplerRECT textureE, "\
"uniform samplerRECT textureH1, "\
"uniform samplerRECT textureH2, "\
"uniform samplerRECT textureH3, "\
"uniform samplerRECT textureH4, "\
"uniform float esctc, "\
"uniform float eincc, "\
"uniform float ei, "\
"uniform float edevcn, "\
"uniform float dei, "\
"uniform float ecrl1, "\
"uniform float ecrl2 ) : COLOR{ "\
"float e = texRECT(textureE, coords); "\
"float h1 = texRECT(textureH1, coords); "\
"float h2 = texRECT(textureH2, coords); "\
"float h3 = texRECT(textureH3, coords); "\
"float h4 = texRECT(textureH4, coords+float2(1.0,0.0)); "\
"return e * esctc - eincc * ei - edevcn * dei + (h1 - h2) * ecrl1 - (h3 - h4) * ecrl2; }";

// offsets use float2(...): a bare (1.0,0.0) is a comma expression, not a vector
char *ez_eUpdate_source = \
"float eUpdate ("\
"in float2 coords : TEXCOORD0, "\
"uniform samplerRECT textureE, "\
"uniform samplerRECT textureH1, "\
"uniform samplerRECT textureH2, "\
"uniform samplerRECT textureH3, "\
"uniform samplerRECT textureH4, "\
"uniform float esctc, "\
"uniform float eincc, "\
"uniform float ei, "\
"uniform float edevcn, "\
"uniform float dei, "\
"uniform float ecrl1, "\
"uniform float ecrl2 ) : COLOR{ "\
"float e = texRECT(textureE, coords); "\
"float h1 = texRECT(textureH1, coords); "\
"float h2 = texRECT(textureH2, coords+float2(1.0,0.0)); "\
"float h3 = texRECT(textureH3, coords); "\
"float h4 = texRECT(textureH4, coords+float2(0.0,1.0)); "\
"return e * esctc - eincc * ei - edevcn * dei + (h1 - h2) * ecrl1 - (h3 - h4) * ecrl2; }";

clock_t start, end; // clock() returns clock_t, not time_t

/* prototypes */
void initGLUT(int argc, char **argv);
void initGLEW();
void initFBO();
void createTextures();
void initCG();
void performComputation();
void checkGLErrors (const char *label);
double cpuBench();
void printVector(float *v, int len);
void transferFromTexture(float* data, GLenum fb);
void initMemory();
//void nextSlice(int i);
void transferToTexture(float* data, GLuint texID);

int main(int argc, char **argv){
int i;
clock_t start, stop; // clock() returns clock_t, not time_t
float simTime;
double mflops;
FILE *f;

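// both arguments are required: texture size and number of timesteps
if(argc < 3){
fprintf(stderr, "usage: %s texSize numIterations\n", argv[0]);
return 1;
}

f = fopen("results", "a");
if(f == NULL){
perror("results");
return 1;
}

texSize = atoi(argv[1]);
numIterations = atoi(argv[2]);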

// int start, stop;
// struct timespec res;
// clock_getres(CLOCK_REALTIME, &res);
textureParameters.name = "TEXRECT - float_NV - R - 32";
textureParameters.texTarget = GL_TEXTURE_RECTANGLE_ARB;
textureParameters.texInternalFormat = GL_FLOAT_R32_NV;
textureParameters.texFormat = GL_LUMINANCE;
// start = clock_gettime(CLOCK_REALTIME, &res);
start = clock();
initMemory();
if(verbose) fprintf(stderr,"initializing GLUT\n");
initGLUT(argc, argv);
if(verbose) fprintf(stderr,"initializing GLEW\n");
initGLEW();
if(verbose) fprintf(stderr,"initializing FBOs\n");
initFBO();
if(verbose) fprintf(stderr,"creating textures\n");
createTextures();
if(verbose) fprintf(stderr,"initializing Cg\n");
initCG();
if(verbose) fprintf(stderr,"performing calculations\n");
performComputation();
stop = clock();
simTime = (float)(stop-start)/(float)CLOCKS_PER_SEC;
// compute in double: 63*texSize^3*numIterations overflows a 32-bit int
mflops = 63.0*texSize*texSize*texSize*numIterations/1000000.0;
printf("time was %f\n", simTime);
fprintf(f,"%i\t%i\t%e\t%e\n",texSize,numIterations,simTime,mflops);
fclose(f);
return 0;
}

void initMemory(){
tmpData = (float*)calloc(texSize*texSize, sizeof(float)); // to initialize textures to 0.0
// texSize + 1 entries: index texSize is the output texture attached to the FBO
exTexID = (GLuint*)malloc(sizeof(GLuint) * (texSize + 1));
eyTexID = (GLuint*)malloc(sizeof(GLuint) * (texSize + 1));
ezTexID = (GLuint*)malloc(sizeof(GLuint) * (texSize + 1));
hxTexID = (GLuint*)malloc(sizeof(GLuint) * (texSize + 1));
hyTexID = (GLuint*)malloc(sizeof(GLuint) * (texSize + 1));
hzTexID = (GLuint*)malloc(sizeof(GLuint) * (texSize + 1));
/*
exLocation = (int*)malloc(sizeof(int)*texSize);
eyLocation = (int*)malloc(sizeof(int)*texSize);
ezLocation = (int*)malloc(sizeof(int)*texSize);
hxLocation = (int*)malloc(sizeof(int)*texSize);
hyLocation = (int*)malloc(sizeof(int)*texSize);
hzLocation = (int*)malloc(sizeof(int)*texSize);

exLocation_out = eyLocation_out = ezLocation_out = hxLocation_out = hyLocation_out = hzLocation_out = texSize;*/
}

void printVector(float *v, int len){
int i;

for(i = 0; i < len; i++) printf("%i)\t%f\n",i,v[i]);
}

/*
* Checks framebuffer status.
* Copied directly out of the spec, modified to deliver a return value.
*/
int checkFramebufferStatus() {
GLenum status;
status = (GLenum) glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
switch(status) {
case GL_FRAMEBUFFER_COMPLETE_EXT:
printf("Framebuffer complete\n");
return 1;
case GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT_EXT:
printf("Framebuffer incomplete, incomplete attachment\n");
return 0;
case GL_FRAMEBUFFER_UNSUPPORTED_EXT:
printf("Unsupported framebuffer format\n");
return 0;
case GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT_EXT:
printf("Framebuffer incomplete, missing attachment\n");
return 0;
case GL_FRAMEBUFFER_INCOMPLETE_DIMENSIONS_EXT:
printf("Framebuffer incomplete, attached images must have same dimensions\n");
return 0;
case GL_FRAMEBUFFER_INCOMPLETE_FORMATS_EXT:
printf("Framebuffer incomplete, attached images must have same format\n");
return 0;
case GL_FRAMEBUFFER_INCOMPLETE_DRAW_BUFFER_EXT:
printf("Framebuffer incomplete, missing draw buffer\n");
return 0;
case GL_FRAMEBUFFER_INCOMPLETE_READ_BUFFER_EXT:
printf("Framebuffer incomplete, missing read buffer\n");
return 0;
default:
printf("Unknown framebuffer status %i\n", (int)status);
}
return 0;
}

void checkGLErrors (const char *label) {
GLenum errCode;
const GLubyte *errStr;

if ((errCode = glGetError()) != GL_NO_ERROR) {
errStr = gluErrorString(errCode);
// never pass external strings as the format argument
printf("OpenGL ERROR: %s (Label: %s)\n", (const char*)errStr, label);
}
}

/*
* Performs the actual calculation.
*/
void performComputation(){
int i, j;
int tmp;
double total;
double mflops;

start = clock();
glEnable(textureParameters.texTarget);
// attach textures to FBO
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT, textureParameters.texTarget, exTexID[texSize], 0);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT1_EXT, textureParameters.texTarget, eyTexID[texSize], 0);
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT2_EXT, textureParameters.texTarget, ezTexID[texSize], 0);
// check if that worked
if(!checkFramebufferStatus()) {
printf("glFramebufferTexture2DEXT():\t [FAIL]\n");
exit (-1);
}// else if (mode == 0) {
// printf("glFramebufferTexture2DEXT():\t [PASS]\n");
// }
// enable fragment profile
cgGLEnableProfile(fragmentProfile);
// bind fdtd program
cgSetParameter1f(ex_esctc, 1.0);
cgSetParameter1f(ex_eincc, 1.0);
cgSetParameter1f(ex_ei, 1.0);
cgSetParameter1f(ex_edevcn, 0.0);
cgSetParameter1f(ex_dei, 0.0);
cgSetParameter1f(ex_ecrl1, 217.51);
cgSetParameter1f(ex_ecrl2, 217.51);

cgGLBindProgram(ey_fp);
cgSetParameter1f(ey_esctc, 1.0);
cgSetParameter1f(ey_eincc, 0.0);
cgSetParameter1f(ey_ei, 0.0);
cgSetParameter1f(ey_edevcn, 0.0);
cgSetParameter1f(ey_dei, 0.0);
cgSetParameter1f(ey_ecrl1, 217.51);
cgSetParameter1f(ey_ecrl2, 217.51);

cgGLBindProgram(ez_fp);
cgSetParameter1f(ez_esctc, 1.0);
cgSetParameter1f(ez_eincc, 0.0);
cgSetParameter1f(ez_ei, 0.0);
cgSetParameter1f(ez_edevcn, 0.0);
cgSetParameter1f(ez_dei, 0.0);
cgSetParameter1f(ez_ecrl1, 217.51);
cgSetParameter1f(ez_ecrl2, 217.51);

cgGLBindProgram(hx_fp);
cgSetParameter1f(hx_dt, 1.0);
cgSetParameter1f(hx_delta1, 1.0);
cgSetParameter1f(hx_delta2, 1.0);

cgGLBindProgram(hy_fp);
cgSetParameter1f(hy_dt, 1.0);
cgSetParameter1f(hy_delta1, 1.0);
cgSetParameter1f(hy_delta2, 1.0);

cgGLBindProgram(hz_fp);
cgSetParameter1f(hz_dt, 1.0);
cgSetParameter1f(hz_delta1, 1.0);
cgSetParameter1f(hz_delta2, 1.0);

// stuff that changes should be in the loop...

for(j = 0; j < numIterations; j++){
//printf("iteration %i\n",j);
for(i = 0; i < texSize; i++){
//update x e field
cgGLBindProgram(ex_fp);
// if(i == texSize/2){
// cgGLSetTextureParameter(ex_ps, psTexID);
// cgGLEnableTextureParameter(ex_ps);
// }
// else{
// cgGLSetTextureParameter(ex_ps, psEmptyTexID);
// cgGLEnableTextureParameter(ex_ps);
// }
cgGLSetTextureParameter(ex_e, exTexID[i]);
cgGLEnableTextureParameter(ex_e);
cgGLSetTextureParameter(ex_h1, hzTexID[i]);
cgGLEnableTextureParameter(ex_h1);
cgGLSetTextureParameter(ex_h2, hzTexID[i]);
cgGLEnableTextureParameter(ex_h2);
cgGLSetTextureParameter(ex_h3, hyTexID[i]);
cgGLEnableTextureParameter(ex_h3);
// fprintf(stdout,"we are doing --i- %i --i-1- %i %i\n",i, i-1,((i-1 <= 0) ? 0 : i-1));
//if(i) cgGLSetTextureParameter(ex_h4, hyTexID[i-1]);
if(i > 0) cgGLSetTextureParameter(ex_h4, hyTexID[i]);
else cgGLSetTextureParameter(ex_h4, hyTexID[0]);
cgGLEnableTextureParameter(ex_h4);
glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT);
glPolygonMode(GL_FRONT,GL_FILL);
//render/compute
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0);
glVertex2f(0.0, 0.0);
glTexCoord2f(texSize, 0.0);
glVertex2f(texSize, 0.0);
glTexCoord2f(texSize, texSize);
glVertex2f(texSize, texSize);
glTexCoord2f(0.0, texSize);
glVertex2f(0.0, texSize);
glEnd();
//get result
// transferFromTexture(tmpData, hyTexID[i]);
/*if(i == texSize/2){*/// fprintf(stdout,"h%i) hy ps %e\n",i,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
//}
transferFromTexture(tmpData, GL_COLOR_ATTACHMENT0_EXT);
transferToTexture(tmpData, exTexID[i]);
// if(i == texSize/2){ fprintf(stdout,"e%i) ex ps %e\n",j,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
// }
//update y e field
cgGLBindProgram(ey_fp);
cgGLSetTextureParameter(ey_e, eyTexID[i]);
cgGLEnableTextureParameter(ey_e);
cgGLSetTextureParameter(ey_h1, hxTexID[i]);
cgGLEnableTextureParameter(ey_h1);
cgGLSetTextureParameter(ey_h2, hxTexID[((i-1 < 0) ? 0 : i-1)]);
cgGLEnableTextureParameter(ey_h2);
cgGLSetTextureParameter(ey_h3, hzTexID[i]);
cgGLEnableTextureParameter(ey_h3);
cgGLSetTextureParameter(ey_h4, hyTexID[i]);
cgGLEnableTextureParameter(ey_h4);
glDrawBuffer(GL_COLOR_ATTACHMENT1_EXT);
glPolygonMode(GL_FRONT,GL_FILL);
//render/compute
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0);
glVertex2f(0.0, 0.0);
glTexCoord2f(texSize, 0.0);
glVertex2f(texSize, 0.0);
glTexCoord2f(texSize, texSize);
glVertex2f(texSize, texSize);
glTexCoord2f(0.0, texSize);
glVertex2f(0.0, texSize);
glEnd();
//get result
transferFromTexture(tmpData, GL_COLOR_ATTACHMENT1_EXT);
transferToTexture(tmpData, eyTexID[i]);

// if(i == texSize/2){ fprintf(stdout,"e%i) ey ps %e\n",j,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
// }
//update z e field
cgGLBindProgram(ez_fp);
cgGLSetTextureParameter(ez_e, ezTexID[i]);
cgGLEnableTextureParameter(ez_e);
cgGLSetTextureParameter(ez_h1, hyTexID[i]);
cgGLEnableTextureParameter(ez_h1);
cgGLSetTextureParameter(ez_h2, hyTexID[i]);
cgGLEnableTextureParameter(ez_h2);
cgGLSetTextureParameter(ez_h3, hxTexID[i]);
cgGLEnableTextureParameter(ez_h3);
cgGLSetTextureParameter(ez_h4, hxTexID[i]);
cgGLEnableTextureParameter(ez_h4);
glDrawBuffer(GL_COLOR_ATTACHMENT2_EXT);
glPolygonMode(GL_FRONT,GL_FILL);
//render/compute
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0);
glVertex2f(0.0, 0.0);
glTexCoord2f(texSize, 0.0);
glVertex2f(texSize, 0.0);
glTexCoord2f(texSize, texSize);
glVertex2f(texSize, texSize);
glTexCoord2f(0.0, texSize);
glVertex2f(0.0, texSize);
glEnd();
//get result
transferFromTexture(tmpData, GL_COLOR_ATTACHMENT2_EXT);
transferToTexture(tmpData, ezTexID[i]);

// if(i == texSize/2){ fprintf(stdout,"e%i) ez ps %e\n",j,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
// }
//update x h field
cgGLBindProgram(hx_fp);
cgGLSetTextureParameter(hx_h, hxTexID[i]);
cgGLEnableTextureParameter(hx_h);
cgGLSetTextureParameter(hx_e1, eyTexID[i+1]);
cgGLEnableTextureParameter(hx_e1);
cgGLSetTextureParameter(hx_e2, eyTexID[i]);
cgGLEnableTextureParameter(hx_e2);
cgGLSetTextureParameter(hx_e3, ezTexID[i]);
cgGLEnableTextureParameter(hx_e3);
cgGLSetTextureParameter(hx_e4, ezTexID[i]);
cgGLEnableTextureParameter(hx_e4);
glDrawBuffer(GL_COLOR_ATTACHMENT0_EXT);
glPolygonMode(GL_FRONT,GL_FILL);
//render/compute
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0);
glVertex2f(0.0, 0.0);
glTexCoord2f(texSize, 0.0);
glVertex2f(texSize, 0.0);
glTexCoord2f(texSize, texSize);
glVertex2f(texSize, texSize);
glTexCoord2f(0.0, texSize);
glVertex2f(0.0, texSize);
glEnd();
//get result
transferFromTexture(tmpData, GL_COLOR_ATTACHMENT0_EXT);
transferToTexture(tmpData, hxTexID[i]);
// if(i == texSize/2){ fprintf(stdout,"h%i) hx ps %e\n",j,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
// }

//update y h field
cgGLBindProgram(hy_fp);
cgGLSetTextureParameter(hy_h, hyTexID[i]);
cgGLEnableTextureParameter(hy_h);
cgGLSetTextureParameter(hy_e1, ezTexID[i]);
cgGLEnableTextureParameter(hy_e1);
cgGLSetTextureParameter(hy_e2, ezTexID[i]);
cgGLEnableTextureParameter(hy_e2);
cgGLSetTextureParameter(hy_e3, exTexID[i+1]);
cgGLEnableTextureParameter(hy_e3);
cgGLSetTextureParameter(hy_e4, exTexID[i]);
cgGLEnableTextureParameter(hy_e4);
glDrawBuffer(GL_COLOR_ATTACHMENT1_EXT);
glPolygonMode(GL_FRONT,GL_FILL);
//render/compute
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0);
glVertex2f(0.0, 0.0);
glTexCoord2f(texSize, 0.0);
glVertex2f(texSize, 0.0);
glTexCoord2f(texSize, texSize);
glVertex2f(texSize, texSize);
glTexCoord2f(0.0, texSize);
glVertex2f(0.0, texSize);
glEnd();
//get result
transferFromTexture(tmpData, GL_COLOR_ATTACHMENT1_EXT);
transferToTexture(tmpData, hyTexID[i]);

// if(i == texSize/2){ fprintf(stdout,"h%i) hy ps %e\n",j,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
// }
//update z h field
cgGLBindProgram(hz_fp);
cgGLSetTextureParameter(hz_h, hzTexID[i]);
cgGLEnableTextureParameter(hz_h);
cgGLSetTextureParameter(hz_e1, exTexID[i]);
cgGLEnableTextureParameter(hz_e1);
cgGLSetTextureParameter(hz_e2, exTexID[i]);
cgGLEnableTextureParameter(hz_e2);
cgGLSetTextureParameter(hz_e3, eyTexID[i]);
cgGLEnableTextureParameter(hz_e3);
cgGLSetTextureParameter(hz_e4, eyTexID[i]);
cgGLEnableTextureParameter(hz_e4);
glDrawBuffer(GL_COLOR_ATTACHMENT2_EXT);
glPolygonMode(GL_FRONT,GL_FILL);
//render/compute
glBegin(GL_QUADS);
glTexCoord2f(0.0, 0.0);
glVertex2f(0.0, 0.0);
glTexCoord2f(texSize, 0.0);
glVertex2f(texSize, 0.0);
glTexCoord2f(texSize, texSize);
glVertex2f(texSize, texSize);
glTexCoord2f(0.0, texSize);
glVertex2f(0.0, texSize);
glEnd();
//get result
transferFromTexture(tmpData, GL_COLOR_ATTACHMENT2_EXT);
transferToTexture(tmpData, hzTexID[i]);
// if(i == texSize/2){ fprintf(stdout,"h%i) hz ps %e\n",j,tmpData[(texSize*texSize)/2 + 1]);
// printVector(tmpData, texSize*texSize);
// }
}
}
// done, stop timer, calc MFLOP/s if necessary
// glFinish();
// end = clock();
// total = (double)(end-start)/(double)CLOCKS_PER_SEC;
// mflops = (double)((3*12 + 3*9)*texSize*texSize*texSize*numIterations) / (total * 1000000.0);
// printf("GPU MFLOP/s for N=%d:\t\t%f\n",texSize, mflops);
// done, just do some checks if everything went smoothly.
checkFramebufferStatus();
checkGLErrors("render()");
glDisable(textureParameters.texTarget);
// transferFromTexture(tmpData);
// printVector(tmpData, texSize*texSize);
//printVector(dataX, texSize*texSize);

// do cpu comparison
// printf("GPU speedup %f\n \n", cpuBench()/total);
}

void transferFromTexture(float* data, GLenum fb){
// version (a): texture is attached
// recommended on both NVIDIA and ATI
glReadBuffer(fb);
glReadPixels(0, 0, texSize, texSize,textureParameters.texFormat,GL_FLOAT,data);
}

void cgErrorCallback(){
CGerror lastError = cgGetError();
if(lastError) {
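// never pass library text as the format string itself
printf("%s\n", cgGetErrorString(lastError));
// a listing only exists for compile errors, so it may be NULL
if(cgGetLastListing(cgContext)) printf("%s\n", cgGetLastListing(cgContext));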
exit(lastError);
}
}

/*
* Sets up the Cg runtime and creates shader.
*/
void initCG(void) {
// set up Cg
cgSetErrorCallback(cgErrorCallback);
cgContext = cgCreateContext();
fragmentProfile = cgGLGetLatestProfile(CG_GL_FRAGMENT);
cgGLSetOptimalOptions(fragmentProfile);
// create fragment program
ex_fp = cgCreateProgram(cgContext, CG_SOURCE, ex_eUpdate_source, fragmentProfile, "eUpdate", NULL);
ey_fp = cgCreateProgram(cgContext, CG_SOURCE, ey_eUpdate_source, fragmentProfile, "eUpdate", NULL);
ez_fp = cgCreateProgram(cgContext, CG_SOURCE, ez_eUpdate_source, fragmentProfile, "eUpdate", NULL);

hx_fp = cgCreateProgram(cgContext, CG_SOURCE, hx_hUpdate_source, fragmentProfile, "hUpdate", NULL);
hy_fp = cgCreateProgram(cgContext, CG_SOURCE, hy_hUpdate_source, fragmentProfile, "hUpdate", NULL);
hz_fp = cgCreateProgram(cgContext, CG_SOURCE, hz_hUpdate_source, fragmentProfile, "hUpdate", NULL);
// // load programs
cgGLLoadProgram(ex_fp);
cgGLLoadProgram(ey_fp);
cgGLLoadProgram(ez_fp);
cgGLLoadProgram(hx_fp);
cgGLLoadProgram(hy_fp);
cgGLLoadProgram(hz_fp);
//and get parameter handles by name
//ex
ex_ps = cgGetNamedParameter(ex_fp, "texturePS");
ex_e = cgGetNamedParameter(ex_fp, "textureE");
ex_h1 = cgGetNamedParameter(ex_fp, "textureH1");
ex_h2 = cgGetNamedParameter(ex_fp, "textureH2");
ex_h3 = cgGetNamedParameter(ex_fp, "textureH3");
ex_h4 = cgGetNamedParameter(ex_fp, "textureH4");
ex_esctc = cgGetNamedParameter(ex_fp, "esctc");
ex_eincc = cgGetNamedParameter(ex_fp, "eincc");
ex_ei = cgGetNamedParameter(ex_fp, "ei");
ex_edevcn = cgGetNamedParameter(ex_fp, "edevcn");
ex_dei = cgGetNamedParameter(ex_fp, "dei");
ex_ecrl1 = cgGetNamedParameter(ex_fp, "ecrl1");
ex_ecrl2 = cgGetNamedParameter(ex_fp, "ecrl2");
//ey
ey_e = cgGetNamedParameter(ey_fp, "textureE");
ey_h1 = cgGetNamedParameter(ey_fp, "textureH1");
ey_h2 = cgGetNamedParameter(ey_fp, "textureH2");
ey_h3 = cgGetNamedParameter(ey_fp, "textureH3");
ey_h4 = cgGetNamedParameter(ey_fp, "textureH4");
ey_esctc = cgGetNamedParameter(ey_fp, "esctc");
ey_eincc = cgGetNamedParameter(ey_fp, "eincc");
ey_ei = cgGetNamedParameter(ey_fp, "ei");
ey_edevcn = cgGetNamedParameter(ey_fp, "edevcn");
ey_dei = cgGetNamedParameter(ey_fp, "dei");
ey_ecrl1 = cgGetNamedParameter(ey_fp, "ecrl1");
ey_ecrl2 = cgGetNamedParameter(ey_fp, "ecrl2");
//ez
ez_e = cgGetNamedParameter(ez_fp, "textureE");
ez_h1 = cgGetNamedParameter(ez_fp, "textureH1");
ez_h2 = cgGetNamedParameter(ez_fp, "textureH2");
ez_h3 = cgGetNamedParameter(ez_fp, "textureH3");
ez_h4 = cgGetNamedParameter(ez_fp, "textureH4");
ez_esctc = cgGetNamedParameter(ez_fp, "esctc");
ez_eincc = cgGetNamedParameter(ez_fp, "eincc");
ez_ei = cgGetNamedParameter(ez_fp, "ei");
ez_edevcn = cgGetNamedParameter(ez_fp, "edevcn");
ez_dei = cgGetNamedParameter(ez_fp, "dei");
ez_ecrl1 = cgGetNamedParameter(ez_fp, "ecrl1");
ez_ecrl2 = cgGetNamedParameter(ez_fp, "ecrl2");
//hx
hx_h = cgGetNamedParameter(hx_fp, "textureH");
hx_e1 = cgGetNamedParameter(hx_fp, "textureE1");
hx_e2 = cgGetNamedParameter(hx_fp, "textureE2");
hx_e3 = cgGetNamedParameter(hx_fp, "textureE3");
hx_e4 = cgGetNamedParameter(hx_fp, "textureE4");
hx_dt = cgGetNamedParameter(hx_fp, "dt");
hx_delta1 = cgGetNamedParameter(hx_fp, "delta1");
hx_delta2 = cgGetNamedParameter(hx_fp, "delta2");
//hy
hy_h = cgGetNamedParameter(hy_fp, "textureH");
hy_e1 = cgGetNamedParameter(hy_fp, "textureE1");
hy_e2 = cgGetNamedParameter(hy_fp, "textureE2");
hy_e3 = cgGetNamedParameter(hy_fp, "textureE3");
hy_e4 = cgGetNamedParameter(hy_fp, "textureE4");
hy_dt = cgGetNamedParameter(hy_fp, "dt");
hy_delta1 = cgGetNamedParameter(hy_fp, "delta1");
hy_delta2 = cgGetNamedParameter(hy_fp, "delta2");
//hz
hz_h = cgGetNamedParameter(hz_fp, "textureH");
hz_e1 = cgGetNamedParameter(hz_fp, "textureE1");
hz_e2 = cgGetNamedParameter(hz_fp, "textureE2");
hz_e3 = cgGetNamedParameter(hz_fp, "textureE3");
hz_e4 = cgGetNamedParameter(hz_fp, "textureE4");
hz_dt = cgGetNamedParameter(hz_fp, "dt");
hz_delta1 = cgGetNamedParameter(hz_fp, "delta1");
hz_delta2 = cgGetNamedParameter(hz_fp, "delta2");
}

/*
* Transfers data to texture.
* Check web page for detailed explanation on the difference between ATI and NVIDIA.
*/
void transferToTexture (float* data, GLuint texID) {
// version (a): HW-accelerated on NVIDIA
glBindTexture(textureParameters.texTarget, texID);
glTexSubImage2D(textureParameters.texTarget,0,0,0,texSize,texSize,textureParameters.texFormat,GL_FLOAT,data);
}

/*
* Sets up a floating point texture with NEAREST filtering.
* (mipmaps etc. are unsupported for floating point textures)
*/
void setupTexture (const GLuint texID) {
// make active and bind
int err;
glBindTexture(textureParameters.texTarget,texID);
// turn off filtering and wrap modes
glTexParameteri(textureParameters.texTarget, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
glTexParameteri(textureParameters.texTarget, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
glTexParameteri(textureParameters.texTarget, GL_TEXTURE_WRAP_S, GL_CLAMP);
glTexParameteri(textureParameters.texTarget, GL_TEXTURE_WRAP_T, GL_CLAMP);
// define texture with floating point format
glTexImage2D(textureParameters.texTarget,0,textureParameters.texInternalFormat,texSize,texSize,0,textureParameters.texFormat,GL_FLOAT,0);
// check if that worked
// (parenthesize the assignment so err holds the GL error code, not the result of the comparison)
if((err = glGetError()) != GL_NO_ERROR){
printf("glTexImage2D():\t\t\t [FAIL]\n");
exit(err);
}
// else if(mode == 0){
printf("glTexImage2D():\t\t\t [PASS]\n");
// }
printf("Created a %i by %i floating point texture.\n",texSize,texSize);
}

void createTextures(){
int i;
// create textures
glGenTextures(texSize+1, exTexID);
glGenTextures(texSize+1, eyTexID);
glGenTextures(texSize+1, ezTexID);
glGenTextures(texSize+1, hxTexID);
glGenTextures(texSize+1, hyTexID);
glGenTextures(texSize+1, hzTexID);
glGenTextures(1, &psTexID);
glGenTextures(1, &psEmptyTexID);

tmpData[(texSize*texSize)/2] = 5.0;
setupTexture(psTexID);
transferToTexture(tmpData, psTexID);
tmpData[(texSize*texSize)/2] = 0.0;
setupTexture(psEmptyTexID);
transferToTexture(tmpData, psEmptyTexID);


// set up textures
for(i = 0; i <= texSize; i++){
setupTexture(exTexID[i]);
transferToTexture(tmpData, exTexID[i]);
setupTexture(eyTexID[i]);
transferToTexture(tmpData, eyTexID[i]);
setupTexture(ezTexID[i]);
transferToTexture(tmpData, ezTexID[i]);
setupTexture(hxTexID[i]);
transferToTexture(tmpData, hxTexID[i]);
setupTexture(hyTexID[i]);
transferToTexture(tmpData, hyTexID[i]);
setupTexture(hzTexID[i]);
transferToTexture(tmpData, hzTexID[i]);
}
// set texenv mode from modulate (the default) to replace
glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_REPLACE);
// check if something went completely wrong
checkGLErrors ("createFBOandTextures()");
}

/*
* Creates framebuffer object, binds it to reroute rendering operations
* from the traditional framebuffer to the offscreen buffer
*/
void initFBO(){
// create FBO (off-screen framebuffer)
glGenFramebuffersEXT(1, &fb);
// bind offscreen framebuffer (that is, skip the window-specific render target)
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fb);
// viewport for 1:1 pixel=texture mapping
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
gluOrtho2D(0.0, texSize, 0.0, texSize);
glMatrixMode(GL_MODELVIEW);
glLoadIdentity();
glViewport(0, 0, texSize, texSize);
}

void initGLEW (void) {
int err = glewInit();
if (GLEW_OK != err) {
fprintf(stderr,"error: %s\n",(char*)glewGetErrorString(err));
exit(-1);
}
}

void initGLUT(int argc, char **argv){
glutInit(&argc, argv);
glutWindowHandle = glutCreateWindow(GLUT_WINDOW_NAME);
}

How To Use the String Stream implementation of ISequentialStream

The other day someone asked how to use the ISequentialStream string class I implemented.

Here is the original post on the string stream implementation.

This is how you would use it.

HRESULT hr = S_OK;
LPWSTR pszXml = NULL;
ISequentialStream *pStream = NULL;
IXmlReader *pReader = NULL;

hr = GetXmlString(&pszXml);

if (S_OK == hr)
{
hr = CStringStream::Create(pszXml, &pStream);
}

if (S_OK == hr)
{
hr = CreateXmlReader(__uuidof(IXmlReader), (void**)&pReader, NULL);
}

if (S_OK == hr)
{
hr = pReader->SetInput(pStream);
}

//
// rest of your xml parsing here.
//

// cleanup
if (pReader)
{
pReader->Release();
}

if (pStream)
{
pStream->Release();
}

if (pszXml)
{
free(pszXml);
}

Making Your Function Discovery (FD) Provider Run In Proc

I was looking through the basic FD provider sample that I wrote a few months ago and realized that I can't directly publish it on the Internet. I don't want to provide a sample that doesn't do anything real, so I will need to write a new provider sample that is both publishable and realistic. In the meantime, if you want to push ahead and try writing your own provider, it is actually not that hard. First, you need to do the standard things for making a new DLL: DllMain, DllGetClassObject, an IClassFactory implementation, et cetera. Then, you need to implement IFunctionDiscoveryProvider. That is it. Well, to get it to work, you will also have to make the proper registrations (hint: use the in-box providers' registry keys as an example). The second problem is that FD only loads known providers in proc by default, so your provider would have to implement all of the PnP-X stuff in order to be loaded out of proc. If you want to ship a provider, PnP-X should be your end game, especially if you are doing network-type providers. While you are learning the basics, though, there is a trick to get your FD client to load your non-PnP-X provider in proc. Note that this is done on the client (the provider consumer) side. This is how you would do that. I will try to provide a complete sample later if time permits.

hr = pPnpQuery->AddQueryConstraint(
FD_QUERYCONSTRAINT_COMCLSCONTEXT,
FD_CONSTRAINTVALUE_COMCLSCONTEXT_INPROC_SERVER
);

Unit Testing

Since we are between product cycles, they want us to be writing tests and not making changes to the product code until initial planning is done. Testing is like hygiene: not especially fun, but necessary. Writing unit tests is not my favorite thing to do, but it is needed if you want to have quality in your code. So, I went about figuring out how to set up good unit tests in our build environment and how to automate them. I won't go into those details because they are unique to our environment, but I will provide some slides from a presentation that I gave to my team.

Unit Testing
What Is a Unit Test?
• Unit: smallest testable part of a program: functions, classes, methods
• Validates correct behavior of the unit
• Ideally independent of the other unit tests
• They should cover most code paths
• Generally a white box testing method that is close to the code implementation
• First line of testing
• Should be written in conjunction with the unit of code

What Isn’t Unit Testing?
• A catchall for every bug
• Replacement for other testing
• Functional testing: validates code functions to spec (higher level and more black box than unit tests)
• Integration testing: tests how the units are put together

Why Write Unit Tests?
• Finds bugs early in the development cycle
• Gives confidence that the units you are writing are behaving correctly
• Provides quick feedback if functionality has inadvertently been regressed
• Simplifies refactoring
• Gives confidence when you make late milestone changes
• Documents and defines correct unit behavior

Dev IC Workflow
• Spec & design
• Product code & unit tests
• Check in
• Automation

Unit Test Writing Work Flow
Test Driven Development
TDD Cycle
• Write a test
• Run all unit tests – the new test should fail
• Write some code – write enough code to pass the test
• Run all unit tests – the new test should now pass
• Refactor – clean up the code and tests as needed
• Repeat steps 1-5 until all code units are complete
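
As a rough illustration of one pass through the cycle, here is a minimal sketch in plain C++ that uses assert instead of any particular test framework. ClampIndex and TestClampIndex are hypothetical names invented for this example; they are not from our product code or from the slides.

```cpp
#include <cassert>

// Unit under test (hypothetical): clamp an index into the range [0, count - 1].
// In TDD this body is written only after TestClampIndex below has been seen to fail.
int ClampIndex(int index, int count)
{
    if (index < 0)
    {
        return 0;
    }

    if (index >= count)
    {
        return count - 1;
    }

    return index;
}

// The unit test: written first, independent of other tests, and it
// exercises each code path of the unit once.
void TestClampIndex()
{
    assert(ClampIndex(-1, 4) == 0); // below the range
    assert(ClampIndex(2, 4) == 2);  // inside the range
    assert(ClampIndex(9, 4) == 3);  // above the range
}
```

Calling TestClampIndex() from the test runner's entry point is enough to run it; a failed assert stops the run at the offending line.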

TDD Benefits
• Promotes better design since you must think about using the APIs before writing them
• Heavy debugging is rarely needed
• Many bugs are found before the code is even checked in
• Promotes good unit test coverage
• Discourages code creep
• Unit tests are easier to write before than after

Tuesday, September 15, 2009

Function Discovery: Callback Objects, Implementing IFunctionDiscoveryNotification

If you missed my introductory post on Function Discovery (FD), you might want to go back there and give it a once over. It will give you a quick primer on what FD is about.

Function Discovery Intro

In my first FD post I provided a sample using the PnP FD provider to enumerate present devices. The FD PnP provider is probably the most used provider and is easier to use than SetupDiGetClassDevs, especially if your program is already using COM. Unfortunately, my first sample didn’t include a callback object, which is required to get notifications from the provider. It gets worse than that: the PnP provider is actually the only provider (except the registry provider) that will provide synchronous results (IFunctionInstanceCollection) when you execute your query. In other words, every other inbox FD provider is asynchronous, and you won’t get any function instance (FI) results unless you provide a callback object and get them asynchronously.

Don’t worry; writing callback objects is easy, and I will show you how with an example. You start off creating a query just like we did in the first example, except that two parts change. First, you need to create your callback object and pass it as a parameter to the CreateInstanceCollectionQuery method call. Second, when you execute the query, you will not get a function instance collection back unless it is the PnP or registry provider; the call will return E_PENDING instead. E_PENDING is not an error if you are using an asynchronous provider; it just means that the provider will deliver function instances asynchronously to your callback object. If the provider is async and returns E_PENDING, it should also send FD_EVENTID_SEARCHCOMPLETE to the callback’s OnEvent method.

Here is some simple sample code for a callback object.


class CNotificationCallback : public IFunctionDiscoveryNotification
{
public:

STDMETHODIMP_(ULONG) AddRef();

STDMETHODIMP_(ULONG) Release();

STDMETHODIMP QueryInterface(
REFIID riid,
__deref_out_opt void **ppv);

STDMETHODIMP OnUpdate(
QueryUpdateAction enumQueryUpdateAction,
FDQUERYCONTEXT fdqcQueryContext,
__in IFunctionInstance *pIFunctionInstance);

STDMETHODIMP OnError(
HRESULT hr,
FDQUERYCONTEXT fdqcQueryContext,
PCWSTR pszProvider);

STDMETHODIMP OnEvent(
DWORD dwEventID,
FDQUERYCONTEXT fdqcQueryContext,
PCWSTR pszProvider);

CNotificationCallback();

protected:
LONG m_cRef;
};

CNotificationCallback::CNotificationCallback():
m_cRef(1)
{
}

STDMETHODIMP_(ULONG) CNotificationCallback::AddRef()
{
return InterlockedIncrement(&m_cRef);
}

STDMETHODIMP_(ULONG) CNotificationCallback::Release()
{
LONG cRef = InterlockedDecrement(&m_cRef);
if (0 == cRef)
{
delete this;
}

return cRef;
}

STDMETHODIMP CNotificationCallback::QueryInterface(
REFIID riid,
__deref_out_opt void **ppv)
{
HRESULT hr = S_OK;

if (ppv)
{
*ppv = NULL;
}
else
{
hr = E_INVALIDARG;
}

if (S_OK == hr)
{
if (__uuidof(IUnknown) == riid )
{
AddRef();
*ppv = (IUnknown*) this;
}
else if (__uuidof(IFunctionDiscoveryNotification) == riid)
{
AddRef();
*ppv = (IFunctionDiscoveryNotification*) this;
}
else
{
hr = E_NOINTERFACE;
}
}

return hr;
}

STDMETHODIMP CNotificationCallback::OnUpdate(
QueryUpdateAction enumQueryUpdateAction,
FDQUERYCONTEXT fdqcQueryContext,
__in IFunctionInstance* pIFunctionInstance)
{
HRESULT hr = S_OK;

switch (enumQueryUpdateAction)
{
case QUA_ADD:
wprintf(L"QUA_ADD\n");
break;
case QUA_REMOVE:
wprintf(L"QUA_REMOVE\n");
break;
case QUA_CHANGE:
wprintf(L"QUA_CHANGE\n");
break;
}

return S_OK;
}

STDMETHODIMP CNotificationCallback::OnError(
HRESULT hr,
FDQUERYCONTEXT fdqcQueryContext,
PCWSTR pszProvider)
{
wprintf(L"****** ERROR: 0x%08x\n", hr);

return S_OK;
}

STDMETHODIMP CNotificationCallback::OnEvent(
DWORD dwEventID,
FDQUERYCONTEXT fdqcQueryContext,
PCWSTR pszProvider)
{
// DWORD is an unsigned long, so print it with %lu
wprintf(L"Event: %lu\n", dwEventID);

return S_OK;
}

This is the most basic example of how you might write a callback. Let’s pretend that we wanted to take an async provider like WSD and make it synchronous. One way to do this is to give the callback object an empty function instance collection to fill in, plus a handle that it can signal once it receives FD_EVENTID_SEARCHCOMPLETE from the provider. The main thread then just waits on the handle. Often the callback interface is inherited by a bigger, fancier class that does a lot more than implement IFunctionDiscoveryNotification; the sky is the limit on how you want to structure your code here. Just make sure you exercise good thread safety. If you are sharing memory between the main program thread and your callback’s thread, be sure to protect it with a lock such as an SRW lock.
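
To make the wait-until-search-complete idea a bit more concrete, here is a portable sketch of it. It is not FD code: CSyncCollector, OnSearchComplete, and WaitForResults are hypothetical names, and std::mutex plus std::condition_variable stand in for the SRW lock and the Win32 event handle so that it builds without the Windows SDK. In a real callback, the body of OnSearchComplete would live in the FD_EVENTID_SEARCHCOMPLETE case of OnEvent.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a callback object that collects results and lets
// the main thread block until the provider reports that the search is done.
class CSyncCollector
{
public:
    // Called on the provider's thread for each discovered instance.
    void OnUpdate(const std::wstring &instanceId)
    {
        std::lock_guard<std::mutex> lock(m_lock);
        m_results.push_back(instanceId);
    }

    // Called on the provider's thread when the search is complete.
    void OnSearchComplete()
    {
        {
            std::lock_guard<std::mutex> lock(m_lock);
            m_done = true;
        }

        m_signal.notify_all();
    }

    // Called on the main thread: blocks until the search is complete, then
    // returns a copy of everything collected.
    std::vector<std::wstring> WaitForResults()
    {
        std::unique_lock<std::mutex> lock(m_lock);
        m_signal.wait(lock, [this] { return m_done; });
        return m_results;
    }

private:
    std::mutex m_lock;                   // guards m_results and m_done
    std::condition_variable m_signal;    // plays the role of the event handle
    std::vector<std::wstring> m_results;
    bool m_done = false;
};
```

The shared state is only ever touched while the lock is held, which is the same discipline you would want with an SRW lock around a shared function instance collection.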

Now armed with callbacks, you can use two of FD’s main features: enumerating, and receiving notifications. With callbacks you will be able to take advantage of all of the providers on your computer.

Hopefully next time we can see how simple it is to write an FD provider and register it on your computer. Once we can write a simple provider, we can move on to writing full-blown PnP-X providers. If you want to skip straight to PnP-X, there is a sample of one in the Windows SDK already, but hopefully I will be able to break it down into more digestible chunks. :)

If you are interested in using FD and want some extra help, email me and I can get you going on writing your provider or whatever you want to get accomplished.

Tuesday, September 8, 2009

Function Discovery Intro

The other day I wrote a post about using SetupDi to enumerate PnP devices.

SetupDi Post

But SetupDi is not the only API to enumerate devices; Function Discovery can also enumerate PnP devices along with a host of other capabilities.

Function Discovery (FD) came to life as part of Windows Vista. FD’s main goal was to provide a unified API and set of interfaces for gathering functionality, properties, and notifications from various providers. PnP just happens to be one of the providers. Before FD, different API sets were required for discovering the functionality of devices; for example, you could use SetupDiGetClassDevs to find physically connected devices, but you had to use other APIs for network devices or printers. Using FD, you can use the same set of interfaces and methods for PnP and any number of devices exposed through a provider. Vista shipped with in-box providers for PnP, PnP-X (WSD & SSDP), Registry, and NetBIOS, plus the capability for third parties to create their own providers, and in Windows 7 there are even more providers.

If you have the Windows SDK installed (I assume that you would if you are interested in writing this kind of code), you can do some header spelunking. Check out FunctionDiscoveryCategories.h to get an idea of what providers you can try to use. Also you can dig into the registry to see what other providers are registered on the system at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Function Discovery\Categories.

So if you need to enumerate devices and/or get notifications, FD can help assuming there is a provider for it.

To start out with, I wrote some sample FD code using the PnP provider to enumerate PnP devices just like I did in the SetupDiGetClassDevs example from before. I wrote this a while back now, so I hope there are no bugs in this code.

I am having it print out a few properties from each function instance. A good place to go to find what kinds of properties are discoverable through FD is the header files included in the SDK: functiondiscoverykeys.h, and functiondiscoverykeys_devpkey.h. Not all PKEYs are populated by all providers; for example, PKEY_WSD_MetadataVersion will not be populated by the PnP provider, but will be populated by the WSD provider. Generally PKEYs populated by the PnP are prefixed with PKEY_Device_, and the properties you can get with SetupDi are available with the PnP FD provider.

For a more in depth reference to Function Discovery, refer to the official MSDN FD documentation.


/* displays pnp function instances */

#include <stdio.h>
#include <FunctionDiscovery.h>

HRESULT PrintFIs(IFunctionInstanceCollection* FIs);

int __cdecl wmain(
__in int argc,
__in_ecount(argc) PWSTR* /* argv (unused) */
)
{
HRESULT hr = S_OK;
IFunctionDiscovery *pFD = NULL;
IFunctionInstanceCollectionQuery *pPnpQuery = NULL;
IFunctionInstanceCollection *pFICollection = NULL;

hr = CoInitializeEx(NULL, COINIT_MULTITHREADED);

// CoCreate FunctionDiscovery
if (S_OK == hr)
{
hr = CoCreateInstance(
__uuidof(FunctionDiscovery),
NULL,
CLSCTX_ALL,
__uuidof(IFunctionDiscovery),
(PVOID*)&pFD);
}

// Query the pnp provider
if (S_OK == hr)
{
hr = pFD->CreateInstanceCollectionQuery(
FCTN_CATEGORY_PNP, // pnp category (defined in functiondiscoverycategories.h)
NULL, // subcategory
FALSE, // include subcategories
NULL, // notification callback
NULL, // context
&pPnpQuery); // FI collection query
}

/*
// * optional *
// add property constraints
// generally only works with core FD properties, and not provider specific properties
if (S_OK == hr)
{
PROPVARIANT pv;

PropVariantInit(&pv); // initialize before first use; Clear is only for initialized values

pv.vt = VT_UINT;
pv.uintVal = 0;

hr = pPnpQuery->AddPropertyConstraint(PKEY_FD_Visibility, &pv, QC_EQUALS);

PropVariantClear(&pv);
}
*/

/*
// * optional *
// add query constraints
// refer to functiondiscoveryconstraints.h
if (S_OK == hr)
{
hr = pPnpQuery->AddQueryConstraint(PNP_CONSTRAINTVALUE_NOTPRESENT, FD_CONSTRAINTVALUE_TRUE);
}
*/

if (S_OK == hr)
{
hr = pPnpQuery->Execute(&pFICollection);
}

if (S_OK == hr)
{
hr = PrintFIs(pFICollection);
}

// clean up
if (pFD)
{
pFD->Release();
}

if (pPnpQuery)
{
pPnpQuery->Release();
}

if (pFICollection)
{
pFICollection->Release();
}

CoUninitialize();

if (S_OK != hr)
{
wprintf(L"an error occurred (hr == 0x%x)\n", hr);
return 1;
}

return 0;
}

HRESULT PrintFIs(
IFunctionInstanceCollection* FIs
)
{
HRESULT hr = S_OK;
DWORD cFIs = 0;
IFunctionInstance *pFI = NULL;
IFunctionInstanceCollection *pDeviceFunctionCollection = NULL;

if (FIs)
{
hr = FIs->GetCount(&cFIs);

wprintf(L"*******************************************************\n");
wprintf(L"* %i Function Instances\n", cFIs);
wprintf(L"*******************************************************\n\n");

// go through each function instance
for (DWORD i = 0; S_OK == hr && i < cFIs; i++)
{
hr = FIs->Item(i, &pFI);

if (S_OK == hr)
{
IPropertyStore *pPropertyStore = NULL;

pFI->OpenPropertyStore(STGM_READ, &pPropertyStore);

if (pPropertyStore)
{
PROPVARIANT pv;

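// initialize before first use; PropVariantClear on an uninitialized
// PROPVARIANT would act on garbage
PropVariantInit(&pv);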

// PKEYs can be found in these headers in the SDK:
// functiondiscoverykeys.h functiondiscoverykeys_devpkey.h
// Providers do not populate all PKEYs.
hr = pPropertyStore->GetValue(PKEY_Device_FriendlyName, &pv);

if (S_OK == hr)
{
wprintf(L"Device Friendly Name : \"%s\"\n", (pv.vt == VT_LPWSTR) ? pv.pwszVal : L"");
}

PropVariantClear(&pv);


hr = pPropertyStore->GetValue(PKEY_Device_InstanceId, &pv);

if (S_OK == hr && VT_LPWSTR == pv.vt)
{
wprintf(L"\tDevice Instance ID : \"%s\"\n", pv.pwszVal);
}

PropVariantClear(&pv);

hr = pPropertyStore->GetValue(PKEY_Device_Class, &pv);

if (S_OK == hr && VT_LPWSTR == pv.vt)
{
wprintf(L"\tClass : %s",pv.pwszVal);
}

PropVariantClear(&pv);

hr = pPropertyStore->GetValue(PKEY_Device_ClassGuid, &pv);

if (S_OK == hr && VT_CLSID == pv.vt)
{
// zero-pad every field so the GUID prints in its usual 8-4-4-4-12 form
wprintf(L"\t(GUID : %08lx-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x)\n",
pv.puuid->Data1,
pv.puuid->Data2,
pv.puuid->Data3,
pv.puuid->Data4[0],
pv.puuid->Data4[1],
pv.puuid->Data4[2],
pv.puuid->Data4[3],
pv.puuid->Data4[4],
pv.puuid->Data4[5],
pv.puuid->Data4[6],
pv.puuid->Data4[7]
);
}

PropVariantClear(&pv);

pPropertyStore->Release();
}
}

if (pFI)
{
pFI->Release();
pFI = NULL;
}

if (pDeviceFunctionCollection)
{
pDeviceFunctionCollection->Release();
pDeviceFunctionCollection = NULL;
}
}
}

return hr;
}
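
A side note on printing the class GUID: every field needs a fixed width with zero padding, otherwise a byte like 0x08 prints as a single digit and the text no longer matches the usual 8-4-4-4-12 form. Here is a small portable sketch of that formatting. GuidFields is a hypothetical mirror of the Win32 GUID layout and FormatGuid is a made-up helper, so that the example builds without the Windows headers.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical mirror of the Win32 GUID layout so the sketch builds anywhere.
struct GuidFields
{
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Formats a GUID as 8-4-4-4-12 hex digits, zero-padding every field.
std::string FormatGuid(const GuidFields &g)
{
    char text[37] = {}; // 36 characters plus the terminating null

    std::snprintf(text, sizeof(text),
        "%08x-%04x-%04x-%02x%02x-%02x%02x%02x%02x%02x%02x",
        static_cast<unsigned>(g.Data1),
        static_cast<unsigned>(g.Data2),
        static_cast<unsigned>(g.Data3),
        static_cast<unsigned>(g.Data4[0]),
        static_cast<unsigned>(g.Data4[1]),
        static_cast<unsigned>(g.Data4[2]),
        static_cast<unsigned>(g.Data4[3]),
        static_cast<unsigned>(g.Data4[4]),
        static_cast<unsigned>(g.Data4[5]),
        static_cast<unsigned>(g.Data4[6]),
        static_cast<unsigned>(g.Data4[7]));

    return text;
}
```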

Wednesday, August 12, 2009

Creating an IStream or ISequentialStream From a String for XmlLite

XmlLite needs an IStream or an ISequentialStream to parse from. You can get one by opening a file like I showed in the previous post, but in my real code I didn’t have a file, I had a string. No biggie, you can always implement your own if there isn’t one already. This CStringStream class implements ISequentialStream using a string as an input. The class factory method takes in a string, creates a buffer, and gives back an ISequentialStream. Awesome, just what you need if you want to use XmlLite on XML in a string. Here is the class implementation:

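#pragma once

#include <windows.h>
#include <objidl.h> // ISequentialStream
#include <strsafe.h> // StringCbCopy
#include <stdlib.h> // malloc, free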

//
// this class creates an ISequentialStream from a string
//
class CStringStream : public ISequentialStream
{
public:
// factory method
__checkReturn static HRESULT Create(
__in LPWSTR psBuffer,
__deref_out ISequentialStream **ppStream)
{
HRESULT hr = S_OK;
void *pNewBuff = NULL;
size_t buffSize = 0;

if (!psBuffer || !ppStream)
{
return E_INVALIDARG;
}

*ppStream = NULL;

buffSize = (wcslen(psBuffer)+1) * sizeof(wchar_t);
pNewBuff = malloc(buffSize);

if (!pNewBuff)
{
return E_OUTOFMEMORY;
}

hr = StringCbCopy((LPWSTR)pNewBuff, buffSize, psBuffer);

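if (S_OK == hr)
{
*ppStream = new CStringStream(
buffSize,
pNewBuff);

if (!*ppStream)
{
hr = E_OUTOFMEMORY;
}
}

// the stream takes ownership of the buffer only if it was created;
// otherwise free the copy here so it does not leak
if (S_OK != hr)
{
free(pNewBuff);
}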

return hr;
};

// ISequentialStream
__checkReturn HRESULT STDMETHODCALLTYPE Read(
__out_bcount_part(cb, *pcbRead) void *pv,
/* [in] */ ULONG cb,
__out_opt ULONG *pcbRead)
{
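HRESULT hr = S_OK;
ULONG cbRead = 0;

if (!pv)
{
return STG_E_INVALIDPOINTER;
}

// count in a local because pcbRead is optional and may be NULL
for (; cbRead < cb; ++cbRead, ++m_buffSeekIndex)
{
// we are reading past the end of the buffer
if (m_buffSeekIndex == m_buffSize)
{
hr = S_FALSE;
break;
}

((BYTE*)pv)[cbRead] = ((BYTE*)m_pBuffer)[m_buffSeekIndex];
}

if (pcbRead)
{
*pcbRead = cbRead;
}

return hr;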
};

HRESULT STDMETHODCALLTYPE Write(
__in_bcount(cb) const void *pv,
/* [in] */ ULONG cb,
__out_opt ULONG *pcbWritten)
{
return E_NOTIMPL;
};

// IUnknown
STDMETHODIMP_(ULONG) AddRef()
{
return InterlockedIncrement(&m_cRef);
};

STDMETHODIMP_(ULONG) Release()
{
LONG cRef = InterlockedDecrement(&m_cRef);

if (0 == cRef)
{
delete this;
}

return cRef;
};

STDMETHODIMP QueryInterface(REFIID riid, __deref_out_opt void **ppv)
{
HRESULT hr = S_OK;

if (ppv)
{
*ppv = NULL;
}
else
{
hr = E_INVALIDARG;
}

if (S_OK == hr)
{
if ((__uuidof(IUnknown) == riid) || (riid == __uuidof(ISequentialStream)))
{
AddRef();
*ppv = (ISequentialStream*)this;
}
else
{
hr = E_NOINTERFACE;
}
}

return hr;
};

protected:
LONG m_cRef;
void *m_pBuffer;
size_t m_buffSize;
size_t m_buffSeekIndex;

// constructor/destructor
CStringStream(
__in size_t buffSize,
__in void *pBuff)
:
m_cRef(1),
m_pBuffer(pBuff),
m_buffSize(buffSize),
m_buffSeekIndex(0)
{
};

~CStringStream()
{
free(m_pBuffer);
};
};
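
The part of this class that a parser really cares about is the Read contract: hand back up to cb bytes, report how many were copied, and signal when the data ran out before the request was filled. Below is a minimal, COM-free sketch of that contract so it can be tried anywhere. ByteSource and ReadAll are hypothetical names for illustration only; true stands in for S_OK and false for S_FALSE.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical, COM-free stand-in for the Read() contract of a sequential stream.
class ByteSource
{
public:
    explicit ByteSource(std::string bytes) : m_bytes(std::move(bytes)), m_pos(0) {}

    // Copies up to cb bytes into pv. Returns true when the whole request was
    // satisfied (the S_OK case) and false when the data ran out first (the
    // S_FALSE case). pcbRead is optional and may be null.
    bool Read(void* pv, std::size_t cb, std::size_t* pcbRead)
    {
        const std::size_t remaining = m_bytes.size() - m_pos;
        const std::size_t n = std::min(cb, remaining);
        std::memcpy(pv, m_bytes.data() + m_pos, n);
        m_pos += n;
        if (pcbRead)
        {
            *pcbRead = n;
        }
        return n == cb;
    }

private:
    std::string m_bytes;
    std::size_t m_pos;
};

// Drains the source in fixed-size chunks, roughly the way a parser would.
std::string ReadAll(ByteSource& src, std::size_t chunk)
{
    std::string out;
    if (chunk == 0)
    {
        return out; // a zero-byte request could never make progress
    }
    std::string buf(chunk, '\0');
    std::size_t got = 0;
    bool more = true;
    while (more)
    {
        more = src.Read(&buf[0], chunk, &got);
        out.append(buf, 0, got);
    }
    return out;
}
```

Note that a short read still delivers its bytes; the caller has to consume them before treating the stream as finished.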

Howto Use XmlLite

I was recently breaking off high-level, heavyweight dependencies in some code I was cleaning up, and I ran into the 500 lb. gorilla that is MSXML6. I found some code that was using it to parse some basic XML strings. MSXML is a full-featured XML parser that can do fancy things like schema validation, but it is kind of heavyweight and has high-level dependencies. The downsides of MSXML might be a necessary evil if you need its fancy features, but in many cases we don’t. In my code, I definitely did not. I wanted to gut XML out altogether, but was vetoed. My thoughts turned to MSXML’s handsome and more athletic cousin, XmlLite. XmlLite has very few dependencies and is self-contained in its own library files. Although XmlLite is COM-like, it doesn’t actually have a dependency on COM, so I am liking this guy already. It does need an IStream or an ISequentialStream, so you will have to create one from some file or implement the interface yourself. I can provide a sample implementation of that later.


To the code…


Here is some simple quick-and-dirty code I wrote, mainly following the code samples on MSDN. This program takes a filename as a parameter, opens the file, and parses the XML, printing out the elements. The code I actually wrote looks cleaner, but this will get you going.


MSDN References


MSXML6


XmlLite



/*
* xml_lite.cpp
*
* Description : simple code to show using XML Lite
*/

#include <objbase.h>
#include <XmlLite.h>
#include <shlwapi.h>
#include <stdio.h>

HRESULT WriteAttributes(IXmlReader* pReader)
{
const WCHAR* pwszPrefix;
const WCHAR* pwszLocalName;
const WCHAR* pwszValue;
HRESULT hr = pReader->MoveToFirstAttribute();

if (S_FALSE == hr)
return hr;
if (S_OK != hr)
{
wprintf(L"Error moving to first attribute, error is %08.8lx", hr);
return -1;
}
else
{
while (TRUE)
{
if (!pReader->IsDefault())
{
UINT cwchPrefix;
if (FAILED(hr = pReader->GetPrefix(&pwszPrefix, &cwchPrefix)))
{
wprintf(L"Error getting prefix, error is %08.8lx", hr);
return -1;
}
if (FAILED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
{
wprintf(L"Error getting local name, error is %08.8lx", hr);
return -1;
}
if (FAILED(hr = pReader->GetValue(&pwszValue, NULL)))
{
wprintf(L"Error getting value, error is %08.8lx", hr);
return -1;
}
if (cwchPrefix > 0)
wprintf(L"Attr: %s:%s=\"%s\" \n", pwszPrefix, pwszLocalName, pwszValue);
else
wprintf(L"Attr: %s=\"%s\" \n", pwszLocalName, pwszValue);
}

if (S_OK != pReader->MoveToNextAttribute())
break;
}
}
return hr;
}

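int __cdecl wmain(
__in int argc,
__in_ecount(argc) LPCWSTR argv[])
{
HRESULT hr = S_OK;
IStream *pStream = NULL;
IXmlReader *pReader = NULL;
UINT cAttribute = 0;

// argv[1] is only valid when a filename was actually passed in
if (argc < 2)
{
wprintf(L"Usage: xml_lite <file.xml>\n");
return -1;
}
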
if (FAILED(hr = SHCreateStreamOnFile(argv[1], STGM_READ, &pStream)))
{
wprintf(L"Error creating file reader, error is %08.8lx", hr);
return hr;
}

if (FAILED(hr = CreateXmlReader(__uuidof(IXmlReader), (void**) &pReader, NULL)))
{
wprintf(L"error creating xml reader, error is %08.8lx", hr);
return hr;
}

if (FAILED(hr = pReader->SetProperty(XmlReaderProperty_DtdProcessing, DtdProcessing_Prohibit)))
{
wprintf(L"Error setting XmlReaderProperty_DtdProcessing, error is %08.8lx", hr);
return -1;
}

if (FAILED(hr = pReader->SetInput(pStream)))
{
wprintf(L"Error setting input for reader, error is %08.8lx", hr);
return -1;
}

XmlNodeType nodeType;

while (S_OK == (hr = pReader->Read(&nodeType)))
{
LPCWSTR pwszPrefix = NULL;
UINT cwchPrefix = 0;
LPCWSTR pwszLocalName = NULL;
LPCWSTR pwszValue = NULL;

switch (nodeType)
{
case XmlNodeType_XmlDeclaration:
wprintf(L"XmlDeclaration\n");
if (FAILED(hr = WriteAttributes(pReader)))
{
wprintf(L"Error writing attributes, error is %08.8lx", hr);
return -1;
}
break;
case XmlNodeType_Element:
if (FAILED(hr = pReader->GetPrefix(&pwszPrefix, &cwchPrefix)))
{
wprintf(L"Error getting prefix, error is %08.8lx", hr);
return -1;
}
if (FAILED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
{
wprintf(L"Error getting local name, error is %08.8lx", hr);
return -1;
}
if (cwchPrefix > 0)
wprintf(L"Element: %s:%s\n", pwszPrefix, pwszLocalName);
else
wprintf(L"Element: %s\n", pwszLocalName);

if (FAILED(hr = WriteAttributes(pReader)))
{
wprintf(L"Error writing attributes, error is %08.8lx", hr);
return -1;
}

if (pReader->IsEmptyElement() )
wprintf(L" (empty)");
break;
case XmlNodeType_EndElement:
if (FAILED(hr = pReader->GetPrefix(&pwszPrefix, &cwchPrefix)))
{
wprintf(L"Error getting prefix, error is %08.8lx", hr);
return -1;
}
if (FAILED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
{
wprintf(L"Error getting local name, error is %08.8lx", hr);
return -1;
}
if (cwchPrefix > 0)
wprintf(L"End Element: %s:%s\n", pwszPrefix, pwszLocalName);
else
wprintf(L"End Element: %s\n", pwszLocalName);
break;
/*
case XmlNodeType_Text:
case XmlNodeType_Whitespace:
if (FAILED(hr = pReader->GetValue(&pwszValue, NULL)))
{
wprintf(L"Error getting value, error is %08.8lx", hr);
return -1;
}
wprintf(L"Text: >%s<\n", pwszValue);
break;
*/
case XmlNodeType_CDATA:
if (FAILED(hr = pReader->GetValue(&pwszValue, NULL)))
{
wprintf(L"Error getting value, error is %08.8lx", hr);
return -1;
}
wprintf(L"CDATA: %s\n", pwszValue);
break;
case XmlNodeType_ProcessingInstruction:
if (FAILED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
{
wprintf(L"Error getting name, error is %08.8lx", hr);
return -1;
}
if (FAILED(hr = pReader->GetValue(&pwszValue, NULL)))
{
wprintf(L"Error getting value, error is %08.8lx", hr);
return -1;
}
wprintf(L"Processing Instruction name:%s value:%s\n", pwszLocalName, pwszValue);
break;
case XmlNodeType_Comment:
if (FAILED(hr = pReader->GetValue(&pwszValue, NULL)))
{
wprintf(L"Error getting value, error is %08.8lx", hr);
return -1;
}
wprintf(L"Comment: %s\n", pwszValue);
break;
case XmlNodeType_DocumentType:
wprintf(L"DOCTYPE is not printed\n");
break;
}

/*
hr = pReader->GetAttributeCount(&cAttribute);

if (S_OK == hr)
{
wprintf(L"num attrubutes %i\n", cAttribute);
}
*/
}

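// release the reader and the stream before exiting
// (the early error returns above still skip this cleanup)
if (pReader)
{
pReader->Release();
}

if (pStream)
{
pStream->Release();
}

return hr;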
}

Monday, August 10, 2009

SetupDi: How To Enumerate Devices Using SetupDiGetClassDevs

I am back to work from my month off for paternity leave with a fresh new post. This time I am going to write about a topic that is directly related to my job: devices. In particular, we will look at how to use SetupDi to enumerate present devices and print out a few properties. There are a lot of other APIs available in Windows to do device enumeration. Perhaps I will cover them in later posts. As you will see in this post, SetupDi’s interfaces aren’t the most conducive to sexy code, but for better or worse, SetupAPI is the main way to work with devices in Windows. If you have any opinion on what a good device API should look like in Windows, please leave a comment and let me know.


From Windows Vista on, there is another API that may be easier to use for this kind of task: Function Discovery. If you are interested, check it out at:

Function Discovery PnP Enumeration Example

To the code…


This code is pretty basic. We create an HDEVINFO set of all present dev nodes, and step through each dev node printing out a few properties. I haven’t really looked at this sample code recently, so let me know if you see any problems.


#include <windows.h>
#include <setupapi.h>
#include <stdio.h>

void print_property
(
__in HDEVINFO hDevInfo,
__in SP_DEVINFO_DATA DeviceInfoData,
__in PCWSTR Label,
__in DWORD Property
)
{
DWORD DataT;
LPTSTR buffer = NULL;
DWORD buffersize = 0;

//
// Call function with null to begin with,
// then use the returned buffer size (doubled)
// to Alloc the buffer. Keep calling until
// success or an unknown failure.
//
// Double the returned buffersize to correct
// for underlying legacy CM functions that
// return an incorrect buffersize value on
// DBCS/MBCS systems.
//
while (!SetupDiGetDeviceRegistryProperty(
hDevInfo,
&DeviceInfoData,
Property,
&DataT,
(PBYTE)buffer,
buffersize,
&buffersize))
{
if (ERROR_INSUFFICIENT_BUFFER == GetLastError())
{
// Change the buffer size.
if (buffer)
{
LocalFree(buffer);
}
// Double the size to avoid problems on
// W2k MBCS systems per KB 888609.
buffer = (LPTSTR)LocalAlloc(LPTR, buffersize * 2);
}
else
{
break;
}
}

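// buffer stays NULL when the property is missing or the query failed
wprintf(L"%s %s\n", Label, buffer ? buffer : TEXT("(not available)"));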

if (buffer)
{
LocalFree(buffer);
}
}

//int main(int argc, char *argv[], char *envp[])
int setupdi_version()
{
HDEVINFO hDevInfo;
SP_DEVINFO_DATA DeviceInfoData;
DWORD i;

// Create a HDEVINFO with all present devices.
hDevInfo = SetupDiGetClassDevs(
NULL,
0, // Enumerator
0,
DIGCF_PRESENT | DIGCF_ALLCLASSES);

if (INVALID_HANDLE_VALUE == hDevInfo)
{
// Insert error handling here.
return 1;
}

// Enumerate through all devices in Set.

DeviceInfoData.cbSize = sizeof(SP_DEVINFO_DATA);

for (i = 0; SetupDiEnumDeviceInfo(hDevInfo, i, &DeviceInfoData); i++)
{
LPTSTR buffer = NULL;
DWORD buffersize = 0;

print_property(hDevInfo, DeviceInfoData, L"Friendly name :", SPDRP_FRIENDLYNAME);

while (!SetupDiGetDeviceInstanceId(
hDevInfo,
&DeviceInfoData,
buffer,
buffersize,
&buffersize))
{
// capture the error before any other call can overwrite it
DWORD error = GetLastError();

if (buffer)
{
LocalFree(buffer);
buffer = NULL;
}

if (ERROR_INSUFFICIENT_BUFFER == error)
{
// buffersize is a character count here, so convert it to bytes
buffer = (LPTSTR)LocalAlloc(LPTR, buffersize * sizeof(TCHAR));
}

if (!buffer)
{
wprintf(L"error: could not get device instance id (0x%x)\n", error);
break;
}
}

if (buffer)
{
wprintf(L"\tDeviceInstanceId : %s\n", buffer);
LocalFree(buffer);
}

print_property(hDevInfo, DeviceInfoData, L"\tClass :", SPDRP_CLASS);
print_property(hDevInfo, DeviceInfoData, L"\tClass GUID :", SPDRP_CLASSGUID);
}


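// the loop ends when SetupDiEnumDeviceInfo fails;
// ERROR_NO_MORE_ITEMS is the normal way for that to happen
DWORD enumError = GetLastError();

// Cleanup
SetupDiDestroyDeviceInfoList(hDevInfo);

if (NO_ERROR != enumError && ERROR_NO_MORE_ITEMS != enumError)
{
// Insert error handling here.
return 1;
}

return 0;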
}
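
The buffer handling in print_property is the classic "ask for the size, allocate, ask again" pattern that a lot of Win32 APIs use. Here is a minimal portable sketch of just that pattern, with the SetupDi call swapped out for a fake. QueryValue and GetValue are hypothetical names invented for this illustration; QueryValue only imitates the shape of an API that fails and reports the required size when the buffer is too small.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a size-reporting API: copies the value when the
// buffer is big enough, otherwise fails. Either way it reports the size needed.
bool QueryValue(const std::string& value,
                char* buffer,
                std::size_t bufferSize,
                std::size_t* requiredSize)
{
    *requiredSize = value.size() + 1; // include the terminator
    if (buffer == nullptr || bufferSize < *requiredSize)
    {
        return false;
    }
    std::memcpy(buffer, value.c_str(), *requiredSize);
    return true;
}

// Calls once with no buffer to learn the size, then retries with a buffer
// of the reported size. Gives up if a failure was not about the size.
std::string GetValue(const std::string& value)
{
    std::vector<char> buffer;
    std::size_t required = 0;

    while (!QueryValue(value,
                       buffer.empty() ? nullptr : buffer.data(),
                       buffer.size(),
                       &required))
    {
        if (required <= buffer.size())
        {
            return std::string(); // failed for some reason other than size
        }
        buffer.resize(required);
    }

    return std::string(buffer.data());
}
```

Using a std::vector for the buffer means there is nothing to free by hand on any exit path, which is where the hand-rolled LocalAlloc loops tend to go wrong.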

Friday, July 10, 2009

Removing ATL from Your Code

COM is all over the place if you are doing Win32 programming, especially in higher level components like the shell. Whether we should be using COM as a Windows extension model in general is debatable, but we can leave that discussion for a later time. We want to talk about ATL.

The point of ATL is that it is supposed to help you write COM code more quickly by taking care of the tedious parts of COM programming like implementing IUnknown, managing ref counts, and freeing up memory; however, without going into too much gory detail, using ATL adds many high level dependencies not introduced by simple COM programming. If you are writing a higher level COM object that already depends on things that use ATL or other high level dependencies, then maybe it doesn’t really matter if you are using ATL. Then it is really just a matter of preference. If you want to limit your high level dependencies and lower the level of your COM object, then you should avoid taking a dependency on ATL. Generally this is what you want to do if you want to write a low level system component that others will depend on and write extensions for. I fall into this latter category for a component I want to clean up.

Unfortunately you may have some legacy COM code that took a dependency on ATL. If someone wanted to depend on your component for something low level, then ATL could be a deal breaker. That is, if COM isn’t one already, but there are ways to make COM lean and mean without even using OLE. Take UMDF for example. This scenario is precisely the situation I am in. After coming back from vacation, I will be spending the next while removing ATL from some COM components. I will write about the things you need to do as I go through and figure out the process.

Thursday, July 9, 2009

Suppressing PreFAST Warnings

PreFAST is a great static analysis tool that can find lots of bugs for you; however, sometimes it can act like an overprotective mother. In my last posts I NULL terminated the strings just to make PreFAST happy, but it felt more like a hack. I don't like leaving hacks in my real code. If there is a PreFAST warning that you feel is unjustified and you would like your code to be PreFAST warning free, you can use a handy #pragma trick to tell PreFAST that you know what you are doing and it's okay. Keep in mind you don't want to do this very often, because generally PreFAST warnings should be fixed.

In my case, I was mallocing a buffer for a string that was to be read from the registry. PreFAST was warning me that I should NULL terminate the string. In this case RegEnumValue should do that correctly or give me an error. Since this warning is safe to ignore and I don't want to put in a hack just to get rid of it, I added this #pragma at the line where the warning was:

#pragma prefast(suppress: 26036, "We expect that RegEnumValue will properly NULL terminate ppszKeyValue.")

Wednesday, July 8, 2009

Reading Registry Values Using RegEnumValue

This time I am pretending that I don’t know or care what the name of the value is. You can do that by using RegEnumValue. I wrote a function with this prototype:


__checkReturn HRESULT ReadKeyValue(
__in LPCWSTR pszSubKey,
__in DWORD dwIndex,
__deref_out LPWSTR *ppszKeyValue);

Basically you can read a regkey's values by index. If you wanted to print out all of the values you could do it like this:

DWORD keyIndex = 0;

do
{
// psSubKey is the caller's registry path under HKEY_LOCAL_MACHINE
hr = ReadKeyValue(psSubKey, keyIndex++, &psKeyValue);
if (S_OK == hr)
{
wprintf(L"%s 0x%x\n", psKeyValue, hr);
free(psKeyValue); // the caller owns the returned buffer
}
if (hr == HRESULT_FROM_WIN32(ERROR_NO_MORE_ITEMS)) wprintf(L"this means there are no more key values\n");
} while (S_OK == hr);
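
The shape of that loop is worth spelling out: keep bumping the index until the read reports that there are no more items, and treat that one result as the normal way out rather than as an error. Here is a minimal portable sketch of the idea with the registry swapped for a plain vector. ReadStatus, ReadValueAt, and CollectAll are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class ReadStatus { Ok, NoMoreItems, Error };

// Hypothetical stand-in for an index-based read such as ReadKeyValue.
ReadStatus ReadValueAt(const std::vector<std::string>& values,
                       std::size_t index,
                       std::string* out)
{
    if (!out)
    {
        return ReadStatus::Error;
    }
    if (index >= values.size())
    {
        return ReadStatus::NoMoreItems;
    }
    *out = values[index];
    return ReadStatus::Ok;
}

// Walks the indexes from zero until the read stops succeeding.
// Returns true only when the walk ended because the items ran out.
bool CollectAll(const std::vector<std::string>& values,
                std::vector<std::string>* collected)
{
    ReadStatus status = ReadStatus::Ok;

    for (std::size_t index = 0; ; ++index)
    {
        std::string value;
        status = ReadValueAt(values, index, &value);
        if (status != ReadStatus::Ok)
        {
            break;
        }
        collected->push_back(value);
    }

    return status == ReadStatus::NoMoreItems;
}
```

Returning whether the loop ended on "no more items" lets the caller tell a clean finish apart from a real failure partway through.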

So here is the complete function. If you knew that you were going to read a lot of values, you might want to open the regkey handle beforehand, do all of your reads, and close it when you are done. Other than that this code should look similar to the previous posting.

__checkReturn HRESULT ReadKeyValue(
__in LPCWSTR pszSubKey,
__in DWORD dwIndex,
__deref_out LPWSTR *ppszKeyValue)
{
HRESULT hr = S_OK;
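HKEY hKey = NULL;
LPWSTR pszValueName = NULL;
DWORD lpcchValueName = 16384; // in characters; registry value names are limited to 16,383 characters
DWORD cbDataSize = 4096; // a reasonable initial data buffer size, in bytes
DWORD type = 0;

// make sure the out parameter is well defined on every failure path
*ppszKeyValue = NULL;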

hr = HRESULT_FROM_WIN32(
RegOpenKeyEx(
HKEY_LOCAL_MACHINE,
pszSubKey,
0,
KEY_QUERY_VALUE,
&hKey));

if (S_OK == hr)
{
*ppszKeyValue = (LPWSTR)malloc(cbDataSize);
pszValueName = (LPWSTR)malloc(lpcchValueName * sizeof(wchar_t));

if (!*ppszKeyValue || !pszValueName)
{
hr = E_OUTOFMEMORY;
}
else
{
**ppszKeyValue = L'\0';
*pszValueName = L'\0';
}
}

while (S_OK == hr && S_OK != (hr = HRESULT_FROM_WIN32(
RegEnumValue(
hKey,
dwIndex,
pszValueName,
&lpcchValueName ,
NULL,
&type,
(LPBYTE)*ppszKeyValue,
&cbDataSize))))
{
if (*ppszKeyValue)
{
free(*ppszKeyValue);
*ppszKeyValue = NULL;
}

if (pszValueName)
{
free(pszValueName);
pszValueName = NULL;
}

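if (HRESULT_FROM_WIN32(ERROR_MORE_DATA) == hr)
{
hr = S_OK;

// cbDataSize now holds the required data size; reset the name
// capacity as well, since the failed call may have changed it
// (registry value names are limited to 16,383 characters)
lpcchValueName = 16384;

*ppszKeyValue = (LPWSTR)malloc(cbDataSize);
pszValueName = (LPWSTR)malloc(lpcchValueName * sizeof(wchar_t));
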
if (!*ppszKeyValue || !pszValueName)
{
hr = E_OUTOFMEMORY;
}
else
{
**ppszKeyValue = L'\0'; // PreFAST is still whining about this
*pszValueName = L'\0';
}
}
else
{
break;
}
}

// cleanup
(void)RegCloseKey(hKey);

if (pszValueName)
{
free(pszValueName);
}

if (S_OK != hr && *ppszKeyValue)
{
free(*ppszKeyValue);
*ppszKeyValue = NULL;
}

return hr;
}