The optimal vector size for each machine should be known internally by
the compiler. Write the loop as the algorithm dictates. The compiler
will divide it up into vector "chunks" automatically if the body of the
loop can be executed by vector hardware. Attempts to manually
restructure loops for presumed vector lengths, pipelining, or cache
blocking are generally counterproductive. The result is code that is
hard to read, difficult to maintain, and confusing to the compiler.
Compiler optimizers work best on simple, clean loops.
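
To make the contrast concrete, here is a minimal sketch, not taken from
any real code. The program name, the array names, the problem size n,
and the chunk length nvec = 64 are all assumptions chosen for
illustration. The first loop is written as the algorithm dictates; the
second is the same update strip-mined by hand for a presumed vector
length.

```fortran
program simple_loop
  implicit none
  integer, parameter :: n = 100000   ! problem size (illustrative assumption)
  integer, parameter :: nvec = 64    ! hand-picked chunk length (a guess, not a measured value)
  real :: a, x(n), y_simple(n), y_chunked(n)
  integer :: i, j

  ! Set up identical inputs for both versions.
  a = 2.0
  do i = 1, n
     x(i) = real(i)
  end do
  y_simple  = 1.0
  y_chunked = 1.0

  ! Loop written as the algorithm dictates. The compiler is free to
  ! choose the vector length that suits the target hardware.
  do i = 1, n
     y_simple(i) = y_simple(i) + a * x(i)
  end do

  ! Hand strip-mined version for a presumed vector length. The arithmetic
  ! is the same, but the nvec chunking is baked into the source.
  do j = 1, n, nvec
     do i = j, min(j + nvec - 1, n)
        y_chunked(i) = y_chunked(i) + a * x(i)
     end do
  end do

  ! Both versions perform the same update on every element.
  if (all(y_simple == y_chunked)) then
     print *, 'results match'
  else
     print *, 'results differ'
  end if
end program simple_loop
```

Both loops compute the same values. The second only adds an outer loop,
a min() bound, and a machine-specific constant that has to be maintained
for every target.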
Cheers,
Bill
On 1/4/11 9:37 AM, Greenberg, Naomi wrote:
> I am trying to find a way to configure code before compile time to set
> the optimal loop vectorization size for the user’s machine and then
> (using the Fortran preprocessor) get that value and set the loop size to
> this value. For example, on Machine1, nvec might be 64, on machine2, it
> might be 1024, and the code would “do i=1,nvec” (obviously not quite
> that way). The question is whether there’s a way to automatically get
> the optimal vector size from each machine (using Linux) or whether
> there’s a better way to get the same result? Any suggestions are welcome!
>
> Naomi Greenberg
>
> /Member of the Research Staff/
>
> Riverside Research Institute
>
> (212) 502-1718 (ph)
>
> (212) 502-1729 (fax)
>
--
Bill Long
Fortran Technical Support & Bioinformatics Software Development
voice: 651-605-9024
fax: 651-605-9142
Cray Inc./Cray Plaza, Suite 210/380 Jackson St./St. Paul, MN 55101