Hard to C

Look and you will C -- Learn and you will C++

Sunday, February 27, 2005

What Else Could Go Wrong?

JrDebugLogger is a very nice debug logging library. Much of its functionality is implemented through macros so that it can be selectively compiled out. Along the way the author has had some interesting problems to solve, and this post is about one of them.

Assume we use the following macro:

   #define DEBUGOUT if (debug_on) debug_stream
to allow us to perform debug logging with a stream-like interface. We can then do:
   DEBUGOUT << "hello";
which expands to:
   if (debug_on) debug_stream << "hello";
Now if the compiler knows that debug_on is false, it can leave out all code related to the debug logging, since it knows it will never be called. If it does not know the value at compile time, the resulting code will contain a very fast check around the call, allowing debug logging to be turned on and off dynamically with little performance overhead.

There is, however, an insidious bug lurking in the corner, waiting to jump at the user. Can you spot the problem? Think about it for a minute or two before reading on.

Consider this use:
   if (i > limit)
      DEBUGOUT << "i too big";
   else
      do_computation(i);
it expands to:
   if (i > limit)
      if (debug_on) debug_stream << "i too big";
   else
      do_computation(i);
This is valid C++, and compiled without warnings on the three compilers I tried. But who does that else belong to?

Let's see what the standard says:
"An else is associated with the lexically nearest preceding if that is allowed by the syntax."
This is from the C99 standard (6.8.4.1p3), which states it most clearly; statements to the same effect are present in the C++ standards.

So the above is equivalent to:
   if (i > limit)
   {
      if (debug_on)
         debug_stream << "i too big";
      else
         do_computation(i);
   }
which was of course not the intention.

So how can we solve this without giving up the nice properties of the if? The simple solution is to give the if in the macro its own else:
   #define DEBUGOUT if (!debug_on) ; else debug_stream
We now get the expansion:
   if (i > limit)
      if (!debug_on) ; else debug_stream << "i too big";
   else
      do_computation(i);
and the compiler will correctly associate the user's else with the user's if. So it is equivalent to:
   if (i > limit)
   {
      if (!debug_on)
         ;
      else
         debug_stream << "i too big";
   } else {
      do_computation(i);
   }
Thanks to Jesse for the nice topic.

Friday, February 18, 2005

Ternary Trickery

There is a good article about BOOST_FOREACH by Eric Niebler on The C++ Source.

It contains some interesting trickery involving the ternary conditional operator (?:).

Monday, February 14, 2005

Loophole in Visual C++, Part 2

Here is a slightly more elaborate example:

    #include <stdio.h>
    #include <limits.h>

    unsigned int ratio(unsigned int x, unsigned int y)
    {
       if (x <= UINT_MAX / 100) x *= 100; else y /= 100;

       if (y == 0) y = 1;

       return x / y;
    }

    int main(void)
    {
       unsigned int count;

       for (count = 0x3fffffff; count != 0; ++count)
       {
          /* do something */

          /* show progress */
          printf("\r%u%% done", ratio(count, UINT_MAX));
       }

       return 0;
    }
This program goes through the entire range of the unsigned int type, performing some action for each. It shows the progress by calling a function to compute the ratio of count to the maximum possible value. Again, count is incremented in each step, and hence will reach the value zero at some point.

The program works as expected on the compilers I tried, except for cl.exe from VC7 and VC71 with the /O2 switch, which stops at 25%. In case you wondered about the starting point of 0x3fffffff, that's the reason -- no need to watch your machine chew its way through all the integers up to 25%.

Looking at the code generated for the loop:
    $L873:
       ...
       inc  esi
       add  edi, 100 ; 00000064H
       jne  short $L873
We see that it fails because the two instructions before the conditional jump have been reversed. Again it looks like the optimizer fails to recognize the importance of the increment to the loop.

Wednesday, February 09, 2005

Additional Trouble

2 plus 2 is 4, but does that generalize?

What is your immediate reaction to this little program?

   #include <stdio.h>

   int main(void)
   {
      if (20000 + 20000 == 40000) printf("HardToC");

      return 0;
   }
If it was something along the lines of 'depends', then you're either a raider of the standard, or you've just been around C/C++ for too long, like me.

The type of an unsuffixed decimal integer constant is the first type from a list in which its value can be represented:


C89 - int, long int, unsigned long int
C99 - int, long int, long long int
C++ - int, long int


Now, the problem with the little program above is that if the int type is 16-bit, then 20000 + 20000 results in an overflow because the maximum value of a 16-bit int is 32767. We are guaranteed that computations involving unsigned operands cannot overflow, but there is no such guarantee for signed operands. So the addition may leave us in the land of undefined behaviour.

I compiled the above example with three DOS 16-bit compilers: Borland, Open Watcom and Digital Mars. None of the programs gave any output when run. Borland warned about the overflow, Open Watcom warned at -w2, and Digital Mars did not warn.

What happens is that in the x86 two's complement representation, 20000 + 20000 overflows and becomes -25536, which is not equal to 40000.

Writing portable, standard compliant C/C++ is not always easy .. and it can be Hard to C the problems.

Sunday, February 06, 2005

Loophole in Visual C++, Part 1

Let's start this post by recalling what the gosp^H^H^H^Hstandard has to say about unsigned arithmetic:

"A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type."
This is from the C89 draft (3.1.2.5p5), statements to the same effect are present in the C99 standard (6.2.5p9) and the C++ standards.

Now consider the following program:
   #include <stdio.h>

   int main(void)
   {
      unsigned int count = 0;

      do {
         printf("%u\n", count);
         count += 1;
      } while (count != 0);

      return 0;
   }
Since count starts at zero and is incremented each time through the loop, the standard tells us it will wrap to zero when it reaches a result that cannot be represented by an unsigned int, making the program terminate. Compiling the program with various compilers gives the expected stream of increasing numbers.

However, if you compile it with cl.exe from Visual C++ using the /O2 switch (maximize speed), you get a somewhat surprising result: a single zero, and the program exits. This goes for VC6, VC7 and VC71.

If you initialize count to one instead, the program works fine. So it looks like the optimizer fails to recognize the addition as changing the value of count, and thus optimizes away the loop.

I have not tested the various VC8 betas, so if you have any of them installed, feel free to try it out and post your results (just remember to compile from the command-line using cl.exe and /O2).

Saturday, February 05, 2005

Herb Sutter on Visual C++

Channel 9 has a great interview with Herb Sutter (part 1, part 2).

I think he has some very sound arguments about programming and programmers, which are as interesting as the information about the future of Visual C++.

Sometimes it can be Hard to C

.. and even harder to C++

This is intended to be a place for random postings about the trials and tribulations of C and C++ coding.

Welcome!