We may hope that compiler vendors will adopt more intelligent technology,
but for the time being I think most compilers score poorly on this test.
At least one vendor continues to emphasize BLAS support, and I've just
landed in the middle of helping my team figure out how to use it,
including writing equivalent f90 code and running comparison tests.
Because the BLAS calls are not optimized in-line, they give away more
relative performance on small to medium-sized matrices than on large
ones. On the test I ran today, which did not require explicit transposes,
the BLAS performance fell between that of MATMUL and that of the f77
expansion. I suppose the belief is that BLAS offers more payoff, since it
can be hooked up (even more painfully) with C++ as well as Fortran.
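
For concreteness, here is a minimal sketch of the three variants being compared, applied to the A^T * B case from the original question. The program name, array names, and sizes are made up for illustration, and I am reading "f77 expansion" as plain explicit DO loops. The DGEMM call assumes the program is linked against some BLAS library (for example with -lblas); whether a given compiler builds a temporary for TRANSPOSE(a) is exactly the open question, so treat this as a test harness rather than a statement about what any compiler does.

```fortran
program atb_variants
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Hypothetical sizes: a is k x m, b is k x n, so A^T * B is m x n.
  integer, parameter :: k = 4, m = 3, n = 2
  real(dp) :: a(k,m), b(k,n)
  real(dp) :: c_matmul(m,n), c_blas(m,n), c_loop(m,n)
  integer :: i, j, l
  external dgemm   ! provided by the linked BLAS library

  call random_number(a)
  call random_number(b)

  ! 1. Intrinsic form. The compiler may or may not materialize
  !    TRANSPOSE(a) as a temporary before multiplying.
  c_matmul = matmul(transpose(a), b)

  ! 2. BLAS form. TRANSA = 'T' tells DGEMM to use a as its transpose,
  !    so no explicit transpose is ever formed. The leading dimensions
  !    are those of the arrays as declared: k for a and b, m for c_blas.
  call dgemm('T', 'N', m, n, k, 1.0_dp, a, k, b, k, 0.0_dp, c_blas, m)

  ! 3. Explicit loops. The inner loop runs down columns of both a and
  !    b, so every access is stride 1 and no transpose is needed.
  do j = 1, n
    do i = 1, m
      c_loop(i,j) = 0.0_dp
      do l = 1, k
        c_loop(i,j) = c_loop(i,j) + a(l,i) * b(l,j)
      end do
    end do
  end do

  ! The three results should agree to rounding error.
  print *, 'max |matmul - loop| =', maxval(abs(c_matmul - c_loop))
  print *, 'max |blas   - loop| =', maxval(abs(c_blas - c_loop))
end program atb_variants
```

Wrapping each variant in a timing loop over a range of sizes is one way to see where the call overhead stops mattering on a given compiler.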
----- Original Message -----
From: "Van Snyder"
Sent: Friday, September 08, 2000 7:02 PM
Subject: Question about matmul and transpose
>
> I understand that Fortran compiler vendors are doing a good job now with
> the MATMUL intrinsic. If that's untrue, tell me so, and don't bother
> with the rest of this message.
>
> I need to compute A^T * B. This is an easy thing to say, and efficient
> to do, with the BLAS xGEMM routines. It's easy to say in Fortran, too:
> MATMUL(TRANSPOSE(A), B).
>
> My questions:
>
> Do compilers usually form the transpose of A explicitly and write it
> down before MATMUL begins execution, or do optimizers usually detect this
> particular construction, and use a different MATMUL, or a different call
> to a not-inline MATMUL (as can be done with xGEMM's "this argument is
> transposed" signal)?
>
> If I write X = MATMUL(A,B), do compilers typically form the result in X,
> or in a temporary area that is later copied to X? The former can be done
> if (1) the optimizer is clever enough, and (2) if MATMUL is not an inline
> function, then the place to put the result (X) is passed in to MATMUL as
> an extra hidden intent(out) argument.
>
> Best regards,
> Van Snyder
>
>