On further thought, I am inclined to think that the main reason programmers
fail to consider the potential problems with statements such as
      IF (K .LE. N .AND. X(K) .GT. 0.) THEN
is that the potential effects of evaluating X(K) when K is greater than N
lie outside many Fortran programmers' mental model of the machine. When I
try to recall my thought processes when I used to write this sort of code,
I find that a belief in short circuiting played little role in that
decision. Instead I did not believe that attempting to access X(K) in that
situation would have any significant side effect.
My mental model at that time was that a logical expression could have three
possible results, .TRUE., .FALSE., or invalid (or undefined). The result of
X(K) .GT. 0. when K was greater than N was invalid, but under the normal
rules of logic the expression K .LE. N .AND. X(K) .GT. 0. still had a valid
result, .FALSE., because K .LE. N was itself false. Problems arose only
when some systems raised an array out of bounds error and aborted all
processing under
such conditions. I now view this response of the systems as contrary to one
aspect of the spirit of Fortran, that the programmer should be allowed to
avoid thinking about unimportant side effects.
To me, accessing an array out of bounds should generate only unimportant
side effects, unless it leads to an invalid statement. Unfortunately, it is
easier for a processor to detect undefined expressions in general than the
special case of an undefined expression that leads to an undefined
statement. As a result, requiring that the system generate significant side
effects only when an undefined expression leads to an undefined statement
is contrary to another aspect of the spirit of Fortran, that the processor
should be allowed to execute as efficiently as possible.
The above conflict can never be completely resolved. In practice, control
flow analysis can resolve that conflict for the vast majority of code. In
such code, any potentially out of bounds index will also be tested in the
same expression. As a result, the processor can recognize that there are
implicit constraints on the index, and accessing the element before verifying
the validity of the constraints has the potential of causing the system to
generate an error that will result in an improper side effect. In other
words, I believe that on a system that can generate an array out of bounds
warning, a high quality implementation should test K .LE. N before
accessing X(K), whether the expression is written
K .LE. N .AND. X(K) .GT. 0. or X(K) .GT. 0. .AND. K .LE. N. (It should also
have a
compile time flag that warns of the potential problems such code can cause on
other systems.) I believe that DEC in particular says that it does such
analyses.
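As a rough illustration of what such an analysis amounts to, here is the
hand-written equivalent: the index test is pulled out in front of the array
reference, so X(K) is never touched when K is out of range. This is only a
sketch under my own assumptions. N is taken to be the upper bound of X, so
the in-bounds test is K .LE. N. The program name GUARD, the counter NPOS,
and the data values are made up for the example.

```fortran
      PROGRAM GUARD
      INTEGER N
      PARAMETER (N = 3)
      REAL X(N)
      INTEGER K, NPOS
      DATA X /1.0, -2.0, 3.0/
!     Count the positive elements, deliberately letting K run past N.
      NPOS = 0
      DO 10 K = 1, N + 2
!        Test the index first. X(K) is referenced only when K is in
!        bounds, whatever order the processor evaluates operands in.
         IF (K .LE. N) THEN
            IF (X(K) .GT. 0.) NPOS = NPOS + 1
         END IF
   10 CONTINUE
!     Two of the three elements are positive, so NPOS is 2.
      PRINT *, NPOS
      END
```

The nested form gives the same answer as the combined expression on a
system that happens to evaluate the index test first, but it does not
depend on that behavior, so it should also run cleanly with array bounds
checking turned on.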
I hate it when issues become quality of implementation concerns, but, while
in practice I suspect the vast majority of such coding practices are
detectable by the compiler, there are some code usages that can only be
detected by generating unnecessarily inefficient code in many situations.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%