On 2/3/2011 3:42 PM, Greenberg, Naomi wrote:
> Tim,
> I also have access to a system with Itanium chips (my real target machine) and an 11.1 compiler, but I can't get any vectorization diagnostics on it (-vec-report[n] isn't supported). How do I know if I am getting this routine to vectorize? Does the Intel compiler vectorize double-precision complex on these chips?
>
> Thanks,
> Naomi
>
> -----Original Message-----
> From: Fortran 90 List [mailto:[log in to unmask]] On Behalf Of Tim Prince
> Sent: Thursday, February 03, 2011 2:55 PM
> To: [log in to unmask]
> Subject: Re: Vectorization and double-precision complex
>
> On 2/3/2011 2:33 PM, Greenberg, Naomi wrote:
>> I am trying to add 2 double-precision complex matrices together and my
>> (older) Intel compiler will not vectorize the code (bad datatype). Is
>> there any way around this? I’ve tried to add the real and imaginary
>> parts separately and then combine them, but that doesn’t work well
>> either. Any ideas?
>>
>> This is what I’d like to do, either as a subroutine, or in the calling
>> routine:
>>
>> My compiler flags are set to treat real and complex as double precision.
>>
>> subroutine addA2B (num, A, B)
>>   integer, intent(IN) :: num
>>   complex, intent(INOUT) :: A(num)
>>   complex, intent(IN) :: B(num)
>> !DIR$IVDEP
>>   A = A + B
>> end subroutine addA2B
>>
>> Naomi Greenberg
>> /Member of the Research Staff/
>> Riverside Research
>> (212) 502-1718 (ph)
>> (212) 502-1729 (fax)
>> [log in to unmask]
>>
> Nearly 6 years ago, the "prescott" SSE3 instructions were introduced to
> support this (original compiler option -xP). There's no reasonable way
> to vectorize with an earlier compiler.
>
The Intel Itanium compilers use different terminology for loop
optimization (software pipelining) and have somewhat different options
for reporting optimizations. I don't know whether to put this in the
present or past tense: my last Itanium box broke, and the 11.1 compilers
are the latest for Itanium, so they are still supported but no longer
under development. The 11.1 Itanium compiler should optimize the case
you show at the default or higher (-O3) optimization levels. Unless you
give it additional information, the compiler will optimize that loop
for an assumed loop count of 100.
While it's reasonable to expect that a loop which optimizes for SSE3
will also optimize for Itanium, there's certainly no 1:1 correspondence.
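
One way to supply the "additional information" about loop count is the
Intel LOOP COUNT directive placed ahead of an explicit DO loop. The
following is only a sketch, not tested on Itanium: the module name
add_mod, the kind parameter dp, the demo program and the trip count of
4096 are all assumptions made for illustration, and compilers other
than Intel's treat the !DIR$ lines as ordinary comments.

```fortran
! Sketch only: explicit-loop form of the addA2B routine from the question,
! with an Intel-specific loop-count hint added.
module add_mod
  implicit none
  ! Explicit double-precision kind, used here instead of a compiler flag
  ! that promotes default real/complex to double precision.
  integer, parameter :: dp = kind(1.0d0)
contains
  subroutine addA2B(num, A, B)
    integer, intent(in)        :: num
    complex(dp), intent(inout) :: A(num)
    complex(dp), intent(in)    :: B(num)
    integer :: i

    ! Assumed typical trip count; replace 4096 with a value that
    ! reflects how the routine is really called.
    !DIR$ LOOP COUNT (4096)
    ! Assert that there are no loop-carried dependences.
    !DIR$ IVDEP
    do i = 1, num
      A(i) = A(i) + B(i)
    end do
  end subroutine addA2B
end module add_mod

program demo
  use add_mod
  implicit none
  complex(dp) :: A(4), B(4)

  A = (1.0_dp, 2.0_dp)
  B = (0.5_dp, -1.0_dp)
  call addA2B(4, A, B)
  ! Each element of A now holds the elementwise sum of the two inputs.
  print *, A
end program demo
```

Whether the hint changes the generated code is something to confirm in
the optimization report of the compiler version actually in use.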
--
Tim Prince