I wanted to write about a really weird problem I recently hit while debugging in C++ (technically, it’s all C). Unfortunately, I was doing this in kernel debugging mode, which made life a bit harder, but the same thing would have happened in userland.
I had an .hpp file (we’ll call it process_internal.hpp) that was originally an internal file meant to be included from a single .cpp file (we’ll call it process.cpp), so it contained definitions of global variables. I ended up needing to include process_internal.hpp elsewhere (for testing; we’ll call that file test.cpp). Because of this, the same symbols were defined in more than one translation unit, and the separately built .o files did not interact properly. I ended up using “#ifdef”s to include only the parts I needed in test.cpp, with “extern” declarations of the global variables for it. It looked something like the following:
enum { FT_Inbound, FT_Outbound };
typedef struct FilteringLayer {
int FilterTypeNum, OriginalID;
const char *Name;
} FilteringLayer;
const int FT_NumTypes=2;
#ifdef _PROCESS_INTERNAL
FilteringLayer FilterTypes[FT_NumTypes]={
{FT_Inbound, 5, "Inbound"},
{FT_Outbound, 8, "Outbound"},
};
#else
extern "C" FilteringLayer *FilterTypes;
#endif
So I was accessing this variable in test.cpp and getting a really weird problem. The code looked something like this:
struct foo { int a, b; };
foo Stuff[]={...};
void FunctionBar()
{
for(int i=0;i<FT_NumTypes;i++)
Stuff[FilterTypes[i].OriginalID].b=1;
}
This was causing an access violation, which blue-screened my debug VM. I tried running the exact same statements in the Visual Studio debugger, and things were working just as they were supposed to! So I decided to go to the assembly level. It looked something like this: (I included descriptions)
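L# | Code | Description
|
1 | mov qword ptr [rsp+58h],0 | i = 0
2 | jmp MODULENAME!FunctionBar+0xef | jump to the loop condition (line 6)
3 | mov rax,qword ptr [rsp+58h] | rax = i
4 | inc rax | rax = rax + 1
5 | mov qword ptr [rsp+58h],rax | i = rax
6 | cmp qword ptr [rsp+58h],02h | compare i with FT_NumTypes (2)
7 | jae MODULENAME!FunctionBar+0x11e | leave the loop if i >= 2
|
8 | imul rax,qword ptr [rsp+58h],10h | rax = i * 10h (byte offset of FilterTypes[i])
9 | mov rcx,[MODULENAME!FilterTypes] | rcx = the 8-byte value stored at FilterTypes
10 | movzx eax,word ptr [rcx+rax+4] | eax = value at rcx + rax + 4 (OriginalID)
11 | imul rax,rax,30h | rax = OriginalID * 30h (byte offset into Stuff)
12 | lea rcx,[MODULENAME!Stuff] | rcx = address of Stuff
13 | mov dword ptr [rcx+rax+04h],1 | Stuff[OriginalID].b = 1
14 | jmp MODULENAME!FunctionBar+0xe2 | jump back to the increment (line 3)
15 | | (first instruction after the loop)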
I noticed that line #9 was putting 0x0000000C`00000000 into RCX instead of &FilterTypes. I knew that instruction needed to be an “lea” rather than a “mov”. My first thought was a compiler bug, but as many programming mantras say, that is very rarely the case. If you want to guess what the problem is, now is the time. I’ve given you all the information (and more) you need.
The answer: extern "C" FilteringLayer *FilterTypes; should have been extern "C" FilteringLayer FilterTypes[];. Oops! With the pointer declaration, the compiler treated the symbol as an 8-byte variable holding an address, so it loaded the first 8 bytes of the array's data and used them as a pointer. With the array declaration, the address of the symbol itself is the start of the data, which is what produces the “lea”. The debugger was getting it right because it had the extra information of the real definition of the FilterTypes variable.
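To make the mismatch concrete, here is a minimal single-file sketch; it is not the original kernel code. It uses the corrected array declaration, then imitates what the pointer declaration made the compiler do by copying the first pointer-sized bytes of the array into an integer. The two helper functions are hypothetical names introduced only for this illustration.

```cpp
#include <cstdint>
#include <cstring>

enum { FT_Inbound, FT_Outbound };

struct FilteringLayer {
    int FilterTypeNum, OriginalID;
    const char *Name;
};

const int FT_NumTypes = 2;

// Corrected declaration: an array of unknown bound, matching the definition.
extern "C" FilteringLayer FilterTypes[];

// The real definition (what process.cpp would see).
FilteringLayer FilterTypes[FT_NumTypes] = {
    {FT_Inbound, 5, "Inbound"},
    {FT_Outbound, 8, "Outbound"},
};

// Hypothetical helper: imitates the "mov" generated for the wrong
// declaration (FilteringLayer *FilterTypes). It reads the first
// pointer-sized bytes stored at the symbol and treats them as an address.
std::uintptr_t ValueLoadedByPointerDeclaration() {
    std::uintptr_t value = 0;
    std::memcpy(&value, FilterTypes, sizeof value);
    return value;
}

// Hypothetical helper: imitates the "lea" generated for the array
// declaration. The address of the symbol itself is the start of the data.
std::uintptr_t AddressUsedByArrayDeclaration() {
    return reinterpret_cast<std::uintptr_t>(&FilterTypes[0]);
}
```

Calling both helpers from a test program and printing the results should show two unrelated values: the first is just the leading struct fields reinterpreted as an address, which is why dereferencing it faulted.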