C's Poor Loop Constructs


At some naive level, C's

for(i=1;i<=n;i++){foo;}
is the same as FORTRAN's
do i=1,n
  foo
end do
However, when one considers FORTRAN's rules in detail, they are very different from C's.

Firstly, in FORTRAN it is not permitted to alter the loop control variable within the body of the loop. There is no direct equivalent of

n=10;
for (i=1;i<=n;i++){
  if(i==4) i=n-1;
  printf("%d\n",i);
}
which in C will print 1,2,3,9,10.

Secondly, the loop count in FORTRAN is calculated once, on entry to the loop, whereas C re-evaluates the exit condition before every iteration. Hence whereas

n=10;
for (i=1;i<=n;i++){
  n=n-1;
  printf("%d\n",i);
}
is legal C and prints 1,2,3,4,5, the 'equivalent' FORTRAN
n=10
do i=1,n
  n=n-1
  write(*,*)i
end do
prints all integers from 1 to 10.
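
To get FORTRAN's behaviour in C, the trip count has to be computed by hand before the loop starts. The following is a minimal sketch of that idea, not anything a compiler is claimed to generate; the function names count_c_style and count_fortran_style, and the variable trips, are invented for this illustration.

```c
#include <stdio.h>

/* C semantics: the test i<=n is re-evaluated before every iteration,
   so shrinking n inside the body shortens the loop.
   Returns the number of iterations performed. */
int count_c_style(int n)
{
  int i, count = 0;
  for (i = 1; i <= n; i++) {
    n = n - 1;
    printf("%d\n", i);
    count++;
  }
  return count;
}

/* FORTRAN-like semantics: the trip count is fixed on entry to the loop,
   so later changes to n do not alter the number of iterations. */
int count_fortran_style(int n)
{
  int i, count = 0;
  int trips = (n >= 1) ? n : 0;   /* computed once, on entry */
  for (i = 1; trips > 0; trips--, i++) {
    n = n - 1;
    printf("%d\n", i);
    count++;
  }
  return count;
}
```

Called with n=10, the first function performs five iterations and the second performs ten, matching the C and FORTRAN loops above.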

Why does this matter? Well, if a loop contains no exit statement and no goto (yuk), then in FORTRAN the number of iterations calculated at run time on first entering the loop will be the number performed. In C there are additional checks to make before this is the case: neither the loop counter nor the variables involved in the exit condition may be altered, either directly, or via pointers, or by passing them by reference to other functions. The latter two cases can be impossible to check in all but the most trivial pieces of code. The above examples are the sort of code no sane person would ever write, but perfectly ordinary code causes the same problem: whenever the loop body passes an integer pointer to a function, or modifies the target of an integer pointer, the compiler must prove that the pointer can never point to one of the loop control variables, and that proof can be quite hard.
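
The pointer case can be sketched as follows. This is an illustrative example only: run_loop, call_distinct and call_aliased are hypothetical names, and the point is simply that nothing inside run_loop reveals whether p and limit refer to the same integer.

```c
/* Loops from 1 to *limit, decrementing *p in every iteration.
   Returns the number of iterations actually performed.
   Looking at this function alone, the compiler cannot tell
   whether the write through p changes the loop limit. */
int run_loop(int *limit, int *p)
{
  int i, count = 0;
  for (i = 1; i <= *limit; i++) {
    *p = *p - 1;   /* alters *limit only if p == limit */
    count++;
  }
  return count;
}

/* Hypothetical caller: the pointers refer to different integers,
   so the limit is never touched. */
int call_distinct(void)
{
  int n = 10, other = 10;
  return run_loop(&n, &other);
}

/* Hypothetical caller: both pointers refer to n,
   so the limit shrinks as the loop runs. */
int call_aliased(void)
{
  int n = 10;
  return run_loop(&n, &n);
}
```

Here call_distinct performs ten iterations and call_aliased only five, although both execute exactly the same loop.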

Because the number of iterations of a C loop is not immediately apparent even at run time, splitting the loop over multiple processors can be rather hard. A loop which appears to do 100,000 cycles might, in fact, do just five in some circumstances because the loop counter or the variables in the exit condition are intentionally modified within the loop, or it might skip a block of iterations in the middle of the range because the loop counter is modified.
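
To see why a known trip count matters, consider how the work might be divided. The sketch below assumes a simple static split of the range 1..n into contiguous blocks, one per worker; this scheme and the names chunk_bounds and total_covered are assumptions made for illustration, not a description of what any particular compiler does. The division is valid only if n is the true number of iterations, which is exactly what FORTRAN guarantees on loop entry and C does not.

```c
/* Computes the block of iterations first..last (inclusive) assigned to
   worker w, where 0 <= w < nworkers, when iterations 1..n are split as
   evenly as possible. A worker given no work gets last == first - 1. */
void chunk_bounds(int n, int nworkers, int w, int *first, int *last)
{
  int base = n / nworkers;    /* iterations every worker receives */
  int extra = n % nworkers;   /* the first 'extra' workers get one more */
  *first = 1 + w * base + (w < extra ? w : extra);
  *last = *first + base + (w < extra ? 1 : 0) - 1;
}

/* Checks that the blocks are contiguous and returns the total number
   of iterations covered, or -1 if there is a gap or an overlap. */
int total_covered(int n, int nworkers)
{
  int w, first, last, next = 1, total = 0;
  for (w = 0; w < nworkers; w++) {
    chunk_bounds(n, nworkers, w, &first, &last);
    if (first != next) return -1;
    total += last - first + 1;
    next = last + 1;
  }
  return total;
}
```

With n=100,000 and four workers, each worker receives 25,000 iterations. If the loop body could change n or i, as in the C examples above, these precomputed blocks would no longer correspond to the iterations the serial loop performs.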

Because FORTRAN's loop construct is easier to optimise, and can be optimised in more cases, compilers are more likely to try to optimise it. Again C is at a disadvantage.


MJ Rutter, August 2000.