Sunday, August 30, 2009

Switch and goto

I recently came across code with a function which was basically just a switch-case. The function had to handle some generic object and switched on a type-id to perform some type specific stuff. There were quite a few of these kind of dispatch functions lying around and they ended up causing a significant performance hit (they were called very often).

Why were the functions slow to start with? Well the case label values were neither contiguous nor did form a pattern. They were values scattered all over the integer range, which forced the compiler to generate code that basically a series of if-else statements. If the values were dense, the compiler would have generated a jump table and performed an indirect jump. No such hope here.

What could be done?

Getting performance out of your code is a lot like curing cancer. Do it early. Hoping to stick it in after the code's in is more like waiting till metastasis to start treatment on your tumour. At least in this case, we had the excuse of having to deal with a 15-year old legacy code.

Option 1: Choose something like the Command-pattern.

Option 2: Make your own jump-table (i.e., if that's possible)

How do make a jump table? Well, first you need to be able to map your case values to table indices. If you can't make it contiguous, it should atleast fit into a reasonable sized table with not too many unused slots (which can all map to default jump target). For eg, in the simple case that your case values form an arithmetic progression like V = A*i + K, (V - K)/A gives you the table index (in this case a decent compiler would have already done that with the switch-case).

Now comes the sticky part. If you're using a compiler like gcc which has an extension to allow computed gotos, it's simple. This is how you do it in gcc.

void dispatcher(int n)
{
 void *tbl[] = {&&L1, &&L2, &&LDefault};
 int i = n>1?2:n; // 0,1 are legal values. 2 the default handler

 goto *tbl[i];
L1:
 handler_L1();
 return;
L2:
 handler_L2();
 return;
LDefault:
}

But if you're not lucky enough to have such a compiler, then here's how you do it in VC (VC8).

#define ADDR_LABEL(var,label) do{void *p; __asm lea eax,label __asm mov p,eax __asm{} var = p; }while(0)
#define GOTO_LABEL(var)  do{void *p = var; __asm mov eax,p __asm jmp eax }while(0)

void dispatcher(int n){
 void *tbl[3];
 int i = n>1?2:n; // 0,1 are legal values. 2 the default handler

 ADDR_LABEL(tbl[0],L1);
 ADDR_LABEL(tbl[1],L2);
 ADDR_LABEL(tbl[1],LDefault);

 GOTO_LABEL(tbl[i]);

L1:
 handler_L1();
 return;
L2:
 handler_L2();
 return;
LDefault:
}

The ADDR_LABEL macro puts the address of your label into a given variable (of type void pointer). The empty __asm {} block is just a marker for the VC compiler so that it doesn't give parse errors on the subsequent C code. Of course, all this depends on your being able to map values to indices.

So, that's it. I really hope it doesn't come to your making jump tables.

No comments:

Post a Comment

About Me

My photo
3 things to learn before I die - Evolution, Relativity, and Quantum mechanics. I put the theory of evolution before the other two, and I'm not a fan of the three lettered G-word most people kill for. That's me.

CS with maybe a dash of math and sprinkling of insanity!