Hi list!
About random_number: if the seed is not set with random_seed, it is
automatically set to a processor-dependent value ("processor" in the
Fortran standard's sense, i.e. compiler plus machine). So it will
typically give the same results on the same processor and most likely
different results on different processors.
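To make the point about seeding concrete, here is a minimal sketch of setting the seed explicitly so that the sequence does not depend on the processor's default. The program name and the seed value 12345 are arbitrary choices for illustration. The actual numbers printed still depend on the compiler's generator.

```fortran
program seed_demo
  implicit none
  integer :: n
  integer, allocatable :: seed(:)
  real :: x(3)

  ! Ask the processor how many integers make up its seed.
  call random_seed(size=n)
  allocate(seed(n))

  ! Put a fixed seed (arbitrary value, chosen only for illustration).
  seed = 12345
  call random_seed(put=seed)
  call random_number(x)
  print *, 'first draw :', x

  ! Putting the same seed again restarts the same sequence,
  ! so this draw should repeat the first one.
  call random_seed(put=seed)
  call random_number(x)
  print *, 'second draw:', x

  deallocate(seed)
end program seed_demo
```

With the same compiler and the same seed, both draws should print identical values. A different compiler may produce a different sequence from the same seed.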
I got these results using CVF on an Intel machine (2xPIII@504MHz, 128MB
RAM):
Single precision, no optimisations.
f77 loop, contiguous array, sum =3.2850294E+00 time = 2.2340 sec.
f77 loop, stride 2x2 array, sum =3.2850294E+00 time = 4.3440 sec.
f90 loop, contiguous array, sum =3.2850294E+00 time = 2.2970 sec.
f90 loop, stride 2x2 array, sum =3.2850294E+00 time = 4.4210 sec.
f90 SUM(contiguous array), sum =3.2850294E+00 time = 1.5320 sec.
f90 SUM(stride 2x2 array), sum =3.2850294E+00 time = 3.7340 sec.
Single precision, optimisations, no loop transformation.
f77 loop, contiguous array, sum =3.2850308E+00 time = 0.3910 sec.
f77 loop, stride 2x2 array, sum =3.2850308E+00 time = 1.1090 sec.
f90 loop, contiguous array, sum =3.2850339E+00 time = 0.3750 sec.
f90 loop, stride 2x2 array, sum =3.2850339E+00 time = 1.1090 sec.
f90 SUM(contiguous array), sum =3.2850342E+00 time = 0.3750 sec.
f90 SUM(stride 2x2 array), sum =3.2850342E+00 time = 1.0940 sec.
Single precision, optimisations, loop transformation.
f77 loop, contiguous array, sum =3.2850308E+00 time = 0.3910 sec.
f77 loop, stride 2x2 array, sum =3.2850308E+00 time = 1.0940 sec.
f90 loop, contiguous array, sum =3.2850339E+00 time = 0.4210 sec.
f90 loop, stride 2x2 array, sum =3.2850339E+00 time = 1.1250 sec.
f90 SUM(contiguous array), sum =3.2850342E+00 time = 0.3600 sec.
f90 SUM(stride 2x2 array), sum =3.2850342E+00 time = 1.0940 sec.
Single precision, optimisations, loop transformation, unroll = 64.
f77 loop, contiguous array, sum =3.2850337E+00 time = 0.4060 sec.
f77 loop, stride 2x2 array, sum =3.2850337E+00 time = 0.9530 sec.
f90 loop, contiguous array, sum =3.2850339E+00 time = 0.4690 sec.
f90 loop, stride 2x2 array, sum =3.2850339E+00 time = 1.1560 sec.
f90 SUM(contiguous array), sum =3.2850344E+00 time = 0.4070 sec.
f90 SUM(stride 2x2 array), sum =3.2850344E+00 time = 0.9840 sec.
Double precision, no optimisations.
f77 loop, contiguous array, sum =3.285034552180996E+00 time = 2.3130 sec.
f77 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 7.4060 sec.
f90 loop, contiguous array, sum =3.285034552180996E+00 time = 2.6090 sec.
f90 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 7.4380 sec.
f90 SUM(contiguous array), sum =3.285034552180996E+00 time = 1.9370 sec.
f90 SUM(stride 2x2 array), sum =3.285034552180996E+00 time = 7.4070 sec.
Double precision, optimisations, no loop transformation.
f77 loop, contiguous array, sum =3.285034552180996E+00 time = 0.7500 sec.
f77 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 1.9220 sec.
f90 loop, contiguous array, sum =3.285034552180996E+00 time = 0.8280 sec.
f90 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 1.9370 sec.
f90 SUM(contiguous array), sum =3.285034552180996E+00 time = 0.5160 sec.
f90 SUM(stride 2x2 array), sum =3.285034552180996E+00 time = 1.9690 sec.
Double precision, optimisations, loop transformation.
f77 loop, contiguous array, sum =3.285034552180996E+00 time = 0.7500 sec.
f77 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 1.9220 sec.
f90 loop, contiguous array, sum =3.285034552180996E+00 time = 0.5620 sec.
f90 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 1.9530 sec.
f90 SUM(contiguous array), sum =3.285034552180996E+00 time = 0.5160 sec.
f90 SUM(stride 2x2 array), sum =3.285034552180996E+00 time = 1.9370 sec.
Double precision, optimisations, loop transformation, unroll = 64.
f77 loop, contiguous array, sum =3.285034552180996E+00 time = 0.4370 sec.
f77 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 1.8750 sec.
f90 loop, contiguous array, sum =3.285034552180996E+00 time = 0.5160 sec.
f90 loop, stride 2x2 array, sum =3.285034552180996E+00 time = 1.9690 sec.
f90 SUM(contiguous array), sum =3.285034552180996E+00 time = 0.4210 sec.
f90 SUM(stride 2x2 array), sum =3.285034552180996E+00 time = 1.9220 sec.
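For reference, here is a minimal sketch of the kind of comparison timed above. It is not the actual test program. The array extent, repeat count, and all names are hypothetical, and it assumes "stride 2x2" means a section taken with stride 2 in both dimensions. It times an explicit loop and the SUM intrinsic over a contiguous array and over a strided section holding the same values.

```fortran
program stride_sum_sketch
  implicit none
  integer, parameter :: n = 500     ! hypothetical extent of the summed array
  integer, parameter :: nrep = 200  ! hypothetical number of repetitions
  real, allocatable :: a(:,:), b(:,:)
  real :: s, t0, t1
  integer :: i, j, k

  allocate(a(n,n), b(2*n,2*n))
  call random_number(b)
  b = b / real(n*n)                 ! scale so the total stays of order one
  a = b(1:2*n:2, 1:2*n:2)           ! contiguous copy of the stride-2 section

  ! Explicit loop over the contiguous copy.
  call cpu_time(t0)
  do k = 1, nrep
     s = 0.0
     do j = 1, n
        do i = 1, n
           s = s + a(i,j)
        end do
     end do
  end do
  call cpu_time(t1)
  call report('loop, contiguous array', s, t1 - t0)

  ! Explicit loop over every second element in both dimensions.
  call cpu_time(t0)
  do k = 1, nrep
     s = 0.0
     do j = 1, 2*n, 2
        do i = 1, 2*n, 2
           s = s + b(i,j)
        end do
     end do
  end do
  call cpu_time(t1)
  call report('loop, stride 2x2 array', s, t1 - t0)

  ! SUM intrinsic on the contiguous copy.
  call cpu_time(t0)
  do k = 1, nrep
     s = sum(a)
  end do
  call cpu_time(t1)
  call report('SUM(contiguous array) ', s, t1 - t0)

  ! SUM intrinsic on the strided section.
  call cpu_time(t0)
  do k = 1, nrep
     s = sum(b(1:2*n:2, 1:2*n:2))
  end do
  call cpu_time(t1)
  call report('SUM(stride 2x2 array) ', s, t1 - t0)

  deallocate(a, b)

contains

  ! Print one result line in roughly the layout used above.
  subroutine report(label, total, seconds)
    character(len=*), intent(in) :: label
    real, intent(in) :: total, seconds
    print '(a,a,es14.7,a,f8.4,a)', label, ', sum =', total, &
         ' time =', seconds, ' sec.'
  end subroutine report

end program stride_sum_sketch
```

An optimising compiler may change the order of the additions, which is one plausible reason the single-precision sums above differ in the last digits between optimisation settings.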
Best Regards
José Rui
P.S. I made some alterations to the program, so I am including it as an
attachment.
========================================================================
Obviousness is always the enemy of correctness. - Bertrand Russell.
========================================================================
Iam://José Rui Faustino de Sousa http://homepage.esoterica.pt/~jrfsousa/
mailto:[log in to unmask] phone://+351-239444940
address://rua Carlos A. Pinto de Abreu nº 30C, 1º 3040 Coimbra Portugal
========================================================================
Real Programmers do sig blocks in Fortran 95.
========================================================================