Being the old fogey that I am, I've been running the Livermore Fortran
Kernels (LFK), in both an f77 and an f90 version. I've even uncovered
some apparent bugs in the usual sources for the BLAS routines you
mention, which someone might wish to check:
xDOT does not correctly handle the case where the two strides have
opposite signs (LFK 6). The calling overhead nearly always prevents a
separately
compiled library version from paying off, although LFK 3 goes up to
vector length 1000. Speed depends strongly on splitting/interleaving
and pre-conditioning strategy.
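
For anyone who wants to check the opposite-sign case, here is a minimal
sketch of the increment convention as the reference BLAS documents it:
with a negative increment, the walk starts at the far end of the array
and steps backward. It is written in Python rather than Fortran for
brevity, uses 0-based indices, and the name strided_dot is hypothetical.
It is only an illustration of the convention, not a copy of any
particular xDOT source.

```python
def strided_dot(n, x, incx, y, incy):
    """Dot product of n logical elements of x and y with given increments."""
    if n <= 0:
        return 0.0
    # With a negative increment, start at the far end so that stepping
    # by the increment walks backward over the same n elements.
    ix = 0 if incx >= 0 else (n - 1) * (-incx)
    iy = 0 if incy >= 0 else (n - 1) * (-incy)
    total = 0.0
    for _ in range(n):
        total += x[ix] * y[iy]
        ix += incx
        iy += incy
    return total


x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
same_sign = strided_dot(3, x, 1, y, 1)       # x and y both walked forward
opposite_sign = strided_dot(3, x, 1, y, -1)  # y walked backward
```

Comparing a library xDOT against something like this, with one
increment positive and the other negative, should show quickly whether
the two strides are being treated independently.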
Not necessarily a bug, but xGEMM gives up accuracy in favor of unit
stride, even where that costs speed on many machines. Fancy loop
re-nesting compilers like MIPSpro f90 can overcome this, if you remove
the redundant IF's which are there only in a misguided attempt to
improve speed. Cray compilers are supposed to substitute their own
version of xGEMM automatically, deciding for themselves whether to
in-line it or call an external library version. In the latter case,
compiling it
yourself with outer unrolling parameters set to favor your data set may
beat the built-in version. The LFK 21 original is written to favor the
Cray vector architecture, in the absence of an intelligent compiler or
substitution of xGEMV.
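
The two loop orderings at issue can be sketched as follows. This is
Python for brevity, with hypothetical function names, and it assumes the
trade-off meant above is between an inner-product form (k loop
innermost, one scalar accumulator per result element) and a column form
(i loop innermost, which in Fortran's column-major storage runs down a
column of A and C with unit stride). Neither is taken from an actual
xGEMM source.

```python
def matmul_dot_form(a, b):
    """C = A*B with the k loop innermost: each C[i][j] is one inner
    product, summed in a scalar before being stored."""
    m, p, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            s = 0.0
            for k in range(p):
                s += a[i][k] * b[k][j]
            c[i][j] = s
    return c


def matmul_column_form(a, b):
    """Same product with the i loop innermost: each pass adds a multiple
    of column k of A into column j of C."""
    m, p, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for j in range(n):
        for k in range(p):
            t = b[k][j]
            for i in range(m):
                c[i][j] += t * a[i][k]
    return c


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c_dot = matmul_dot_form(a, b)
c_col = matmul_column_form(a, b)
```

In the inner-product form the scalar accumulator could be carried in a
wider precision, which is one plausible reading of the accuracy remark.
The column form stores every partial sum back to C at working precision.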
LFK 4 is a disguised xGEMV case. It doesn't exhibit the bug in the
standard source, which fails to handle the case where the inner loop
has length zero but the outer loop has non-zero length.
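
A minimal sketch of the zero-length case, in Python for brevity and with
a hypothetical function name. It assumes the expected behavior for
y := alpha*A*x + beta*y is that y is still scaled by beta when the inner
(column) length is zero; the exact failure in the standard source is not
spelled out above, so treat that as an assumption.

```python
def gemv(m, n, alpha, a, x, beta, y):
    """y := alpha*A*x + beta*y for an m-by-n matrix A given as m rows."""
    for i in range(m):
        s = 0.0
        for j in range(n):  # runs zero times when n == 0
            s += a[i][j] * x[j]
        # The update is outside the inner loop, so it still happens
        # for every row even when there are no columns.
        y[i] = alpha * s + beta * y[i]
    return y


# Outer length 2, inner length 0: the result should be beta*y.
empty_case = gemv(2, 0, 1.0, [[], []], [], 0.5, [2.0, 4.0])
normal_case = gemv(2, 2, 1.0, [[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0],
                   1.0, [10.0, 20.0])
```

A version that returns early whenever either dimension is zero would
leave y unscaled in the first call, which is the sort of difference
worth testing for.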
Tim Prince
----- Original Message -----
From: "Van Snyder" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Friday, August 04, 2000 1:34 PM
Subject: What are your favorite linear algebra benchmark programs?
>
> What are your favorite programs to measure the performance of linear
> algebra procedures, especially vector and matrix inner product
> procedures? I'm interested to compare Fortran intrinsic DOT_PRODUCT
> and MATMUL to xDOT and xGEMM, respectively.
>
> Best regards,
> Van Snyder
>
>