Thursday 25 August 2016

c - gcc loop construct changes to assembly code




Why does the gcc compiler translate while loops into do-while constructs when creating assembly code? I know any while loop can be rewritten as a do-while for instance in c



while (test) {
...
}



can be rewritten as




if ( !test ) goto skip;
do {
. . .
} while ( test );
skip:

Answer



Building on anatolyg's answer, you may wonder why the do-while construct is more efficient, especially considering that as shown, the test has been duplicated (so the generated code is bigger). The answer is that very often the test expression can be proven always to be true on loop entry, so




    if ( !test ) goto skip;
loop:
. . . // loop body
if ( test ) goto loop;
skip:
. . . // continue the program


can be simplified to




loop:
. . . // loop body
if ( test ) goto loop;
. . . // continue program


Now, why not do that at the same time as the original transformation, and avoid the original transformation if the compiler can't prove the loop will cycle at least once? Because the algorithm that proves the loop will cycle at least once is actually a generic if-condition optimizer. The usual sequence of optimizations is something like this:




  1. Transform all loops into a "canonical" form (more or less as shown in the first code block above)


  2. Do lots and lots of optimizations that expect loops in that canonical form, such as the if-statement elimination shown.

  3. After you're done with everything that cares about loops as such, attempt to de-canonicalize where that would eliminate redundancy.



The other thing to know is that sometimes a duplicate test is deliberately left in because the compiler expects that to produce better run-time behavior. For instance, if the compiler has reason to believe that the loop usually cycles many times, but can't prove that it always cycles at least once, then the conditional branch instruction above the loop will almost always fall through, and the conditional branch below the loop will almost always jump. When that's the case, leaving them separate is likely to make the CPU's branch predictor more accurate. Branch prediction accuracy is second only to cache-friendliness as the limiting factor for speed on modern out-of-order CPUs.


No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...