From [log in to unmask] Sat Nov 1 21:15:21 1997
From: [log in to unmask]
Date: Sat, 1 Nov 97 11:08:19 +0100
Just a test on a Sun system:
      PARAMETER (NL=1000, NC=1001)
      DOUBLE PRECISION A (NC, NL), B (NL, NC), C (NC, NC)
      C = MATMUL (A, B)
716 sec.
What optimization level was used here?
      CALL DGEMM ( 'N','N', NC, NC, NL, 1.0d0, A, NC, B, NL, 0.0d0, C, NC )
255 sec. vanilla BLAS
 53 sec. libsunperf (Sun optimized BLAS)
So, apart from Sun's lack of performance-consciousness as a
compiler vendor, what is the reason why the F90 compiler does not
replace MATMUL with a call to their optimized DGEMM?
And if the compiler won't do it, then my advice is that the programmer
should!
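
A minimal sketch of that substitution, for anyone who wants to repeat the comparison. It assumes a BLAS library providing DGEMM is linked in; the program name MMTEST, the reduced sizes, and the SYSTEM_CLOCK timing are illustrative choices, not the setup behind the figures above.

```fortran
! Sketch: compute the same product with MATMUL and with DGEMM,
! time both, and check that the results agree.
! Assumption: a BLAS library providing DGEMM is linked in.
PROGRAM MMTEST
  IMPLICIT NONE
  ! Sizes reduced from the post (NL=1000, NC=1001) so the test runs quickly.
  INTEGER, PARAMETER :: NL = 200, NC = 201
  DOUBLE PRECISION, ALLOCATABLE :: A(:,:), B(:,:), C1(:,:), C2(:,:)
  INTEGER :: T0, T1, T2, RATE
  EXTERNAL DGEMM

  ALLOCATE (A(NC,NL), B(NL,NC), C1(NC,NC), C2(NC,NC))
  CALL RANDOM_NUMBER (A)
  CALL RANDOM_NUMBER (B)
  C2 = 0.0D0

  CALL SYSTEM_CLOCK (T0, RATE)
  ! Intrinsic version: C1 = A * B
  C1 = MATMUL (A, B)
  CALL SYSTEM_CLOCK (T1)
  ! BLAS version: C2 = 1.0 * A * B + 0.0 * C2
  ! M = NC rows of A and C2, N = NC columns of B and C2, K = NL;
  ! leading dimensions are NC for A, NL for B, NC for C2.
  CALL DGEMM ('N', 'N', NC, NC, NL, 1.0D0, A, NC, B, NL, 0.0D0, C2, NC)
  CALL SYSTEM_CLOCK (T2)

  PRINT *, 'MATMUL seconds: ', DBLE (T1 - T0) / DBLE (RATE)
  PRINT *, 'DGEMM  seconds: ', DBLE (T2 - T1) / DBLE (RATE)
  PRINT *, 'max abs difference: ', MAXVAL (ABS (C1 - C2))
END PROGRAM MMTEST
```

The DGEMM arguments are the same as in the call above, with only the output array renamed. Link against whichever BLAS is being measured; the printed difference should be at rounding level if the two routines agree.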
Michel OLAGNON email: [log in to unmask]