I understand that Fortran compiler vendors are doing a good job now with
the MATMUL intrinsic. If that's untrue, tell me so, and don't bother
with the rest of this message.
I need to compute A^T * B. This is easy to say, and efficient to do,
with the BLAS xGEMM routines. It's easy to say in Fortran, too:
MATMUL(TRANSPOSE(A), B).
My questions:
Do compilers usually form the transpose of A explicitly and store it
before MATMUL begins execution, or do optimizers usually detect this
particular construction and either use a different MATMUL or make a
different call to an out-of-line MATMUL (as can be done with xGEMM's
"this argument is transposed" flag)?
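To make the comparison concrete, here is a minimal sketch of the two forms. It assumes a BLAS library is available at link time (for example via -lblas); the program name, array sizes, and tolerance are made up for illustration.

```fortran
program atb_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Illustrative sizes: A is k x m, B is k x n, so A^T * B is m x n.
  integer, parameter :: k = 3, m = 2, n = 4
  real(dp) :: a(k,m), b(k,n), c1(m,n), c2(m,n)
  external :: dgemm

  call random_number(a)
  call random_number(b)

  ! Intrinsic form: whether TRANSPOSE(A) is materialized is up to the compiler.
  c1 = matmul(transpose(a), b)

  ! BLAS form: C := alpha*op(A)*op(B) + beta*C with op(A) = A**T.
  ! The 'T' flag asks DGEMM to read A as transposed without forming A**T.
  call dgemm('T', 'N', m, n, k, 1.0_dp, a, k, b, k, 0.0_dp, c2, m)

  ! The two results should agree to rounding error.
  if (maxval(abs(c1 - c2)) > 1.0e-12_dp) error stop 'results differ'
  print *, 'max difference:', maxval(abs(c1 - c2))
end program atb_demo
```

Note that the leading dimension passed for A is k, the row count of A as stored, not of A^T.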
If I write X = MATMUL(A,B), do compilers typically form the result
directly in X, or in a temporary area that is later copied to X? The
former can be done if (1) the optimizer is clever enough, and (2) when
MATMUL is not an inline function, the place to put the result (X) is
passed to MATMUL as an extra hidden intent(out) argument.
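As a sketch of what I mean by the hidden-argument approach, the subroutine below writes the product straight into a destination supplied by the caller. The names matmul_sketch and matmul_into are hypothetical, and this is not a claim about how any particular compiler implements MATMUL.

```fortran
module matmul_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Hypothetical out-of-line routine: the caller passes the destination X,
  ! much as a compiler could do with a hidden result argument.
  subroutine matmul_into(a, b, x)
    real(dp), intent(in)  :: a(:,:), b(:,:)
    real(dp), intent(out) :: x(:,:)
    integer :: i, j, l
    x = 0.0_dp
    do j = 1, size(b, 2)
      do l = 1, size(a, 2)
        do i = 1, size(a, 1)
          ! Accumulate directly in X; no temporary result array is used.
          x(i,j) = x(i,j) + a(i,l) * b(l,j)
        end do
      end do
    end do
  end subroutine matmul_into
end module matmul_sketch

program demo
  use matmul_sketch
  implicit none
  real(dp) :: a(2,3), b(3,2), x(2,2)
  call random_number(a)
  call random_number(b)
  call matmul_into(a, b, x)
  ! Check against the intrinsic.
  if (maxval(abs(x - matmul(a, b))) > 1.0e-12_dp) error stop 'mismatch'
end program demo
```

Writing directly into X is only safe when X does not overlap A or B; for something like X = MATMUL(X,B) a temporary is still needed.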
Best regards,
Van Snyder