Hi all,
I am having trouble with some very simple level 1 BLAS routines
optimized for the Pentium II:
http://www.cs.utk.edu/~ghenry/distrib/archive.htm#blas
See, for example, the attached code, which crashes under Red Hat Linux 6.2
when compiled with lf95 on a single-CPU Pentium II.
I tried other routines, like SCOPY and SASUM, and they seemed to work.
Others, like ISAMAX, did not.
I have used the level 2 and 3 routines before with no problems, so I am
assuming the level 1 routines should work too. Am I doing something wrong?
Could somebody please try the code with their own BLAS libraries?
I need S(D)AXPY and S(D)DOT for my conjugate-gradient routines, and they
are significantly faster when optimized in assembler.
The only other Pentium-optimized level 1 BLAS I am aware of is
http://cip.physik.uni-wuerzburg.de/~mlkessle/blas1.html, which is an old
page and says that AXPY is not optimized yet.
Thanks a lot,
Aleksandar
--
_____________________________________________
Aleksandar Donev
Physics Department
Michigan State University
East Lansing, MI 48824-1116
Work phone: (517) 432-6770
_____________________________________________
program test
  implicit none
  ! Level 1 BLAS routines taken from the external optimized library
  external :: SDOT, SCOPY, SNRM2, SASUM, ISAMAX, SAXPY
  real :: SASUM, SDOT, SNRM2, dots
  real, allocatable, dimension(:) :: vec1, vec2
  integer :: N, indx, ISAMAX

  write(*,*) "N?="
  read(*,*) N
  allocate(vec1(N), vec2(N))
  call random_number(vec1)
  call random_number(vec2)
  write(*,*) vec1, vec2
  ! dots = dot product of vec1 and vec2 (unit stride)
  dots = SDOT(N, vec1, 1, vec2, 1)
  ! vec2 = 1.0*vec1 + vec2 (unit stride)
  call SAXPY(N, 1.0, vec1, 1, vec2, 1)
  write(*,*) vec1, vec2, dots
end program test
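
For cross-checking the library results, a plain Fortran reference for the two operations might look like the sketch below. The names ref_blas1, ref_sdot, ref_saxpy and check_ref are hypothetical and belong to no BLAS distribution, and the sketch assumes unit stride only (incx = incy = 1), so it is not a drop-in replacement for SDOT or SAXPY.

```fortran
module ref_blas1
  implicit none
contains
  ! Reference dot product (hypothetical helper): sum of x(i)*y(i), i = 1..n.
  function ref_sdot(n, x, y) result(d)
    integer, intent(in) :: n
    real, intent(in) :: x(n), y(n)
    real :: d
    d = dot_product(x, y)
  end function ref_sdot

  ! Reference AXPY (hypothetical helper): y <- a*x + y, unit stride only.
  subroutine ref_saxpy(n, a, x, y)
    integer, intent(in) :: n
    real, intent(in) :: a, x(n)
    real, intent(inout) :: y(n)
    y = a*x + y
  end subroutine ref_saxpy
end module ref_blas1

program check_ref
  use ref_blas1
  implicit none
  real :: x(3), y(3)
  x = (/ 1.0, 2.0, 3.0 /)
  y = (/ 4.0, 5.0, 6.0 /)
  ! 1*4 + 2*5 + 3*6 = 32
  write(*,*) ref_sdot(3, x, y)
  ! y becomes 5, 7, 9
  call ref_saxpy(3, 1.0, x, y)
  write(*,*) y
end program check_ref
```

Calling the library SDOT and SAXPY on the same vectors and comparing against these values (within single-precision rounding) would help show whether a failure comes from the optimized routine or from the calling code.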