I investigated the following case of the BLAS interface, which has long
confused me. I'm sure most of us would be dismayed to lose the ability to
work as efficiently as f77 did:
w(i) = .01d0 + sdot(i-1,b(i,1),1,w,-1)
never required copy-in on most compilers. I checked this with the latest
Intel compiler;
w(i) = .01d0 + sdot(i-1,b(i,1),1,w(i-1:1:-1),1)
is quite slow, as it involves a copy, but
w(i) = .01d0 + sdot(i-1,b(i,1),1,w(1:i-1),-1)
works the same as the f77 code quoted above. For some reason, I found it
difficult to grasp the f77 convention for passing an array to be traversed
with a negative stride: the array element designated in the function call
is the last one to be used, whereas with a forward stride the designated
element is the first to be used.
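The negative-stride convention described above can be checked with a small
test program. This is only a sketch: ref_dot is a hypothetical stand-in that
I wrote to follow the reference BLAS indexing as I understand it, not the
vendor sdot, and the array values are made up for illustration.

```fortran
program neg_stride_demo
  implicit none
  integer, parameter :: n = 4
  real :: x(n), w(n), d_f77, d_f90
  integer :: k

  x = (/ (real(k), k = 1, n) /)        ! 1, 2, 3, 4
  w = (/ (real(10*k), k = 1, n) /)     ! 10, 20, 30, 40

  ! f77 style: pass w itself with increment -1, so x(1) pairs with w(n)
  ! and the designated element w(1) is the last one used.
  d_f77 = ref_dot(n, x, 1, w, -1)

  ! Array-section form of the same product, for comparison.
  d_f90 = dot_product(x, w(n:1:-1))

  print *, d_f77, d_f90
  if (d_f77 /= d_f90) stop 1

contains

  ! Hypothetical stand-in for sdot (assumption: reference BLAS indexing).
  function ref_dot(n, sx, incx, sy, incy) result(s)
    integer, intent(in) :: n, incx, incy
    real, intent(in) :: sx(*), sy(*)
    real :: s
    integer :: i, ix, iy
    s = 0.0
    ix = 1
    iy = 1
    ! With a negative increment, start at the far end of the vector.
    if (incx < 0) ix = (1 - n)*incx + 1
    if (incy < 0) iy = (1 - n)*incy + 1
    do i = 1, n
      s = s + sx(ix)*sy(iy)
      ix = ix + incx
      iy = iy + incy
    end do
  end function ref_dot

end program neg_stride_demo
```

Both forms should give the same sum; the difference under discussion is
only whether the compiler copies the section before the call.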
Anyway, several compilers are able to pass a contiguous section of an array
without a copy, but there are limits, which may vary between compilers, on
how far that may be taken.
Another issue, which I can't answer for, is the variation in policy between
vendors about whether dot_product and matmul may be expected to cover most
cases efficiently, and whether the proprietary "optimized" BLAS libraries
are optimized only for arrays too large for efficient cache usage with sane
f77 source code. That threshold is quite large (minimum dimension > 200) on
the newer machines.
But it is assumed size I am talking about, not assumed shape. We had an
example just a few days ago which appeared to be one of assumed shape,
except that no one could show a way to make it work other than with
CONTAINS or an explicit interface.
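To make the distinction concrete, here is a minimal sketch of the two kinds
of dummy argument. All names (dummy_kinds, sum_shape, sum_size) are
hypothetical and chosen only for illustration. The comment about a
temporary describes what compilers typically do, not something I have
measured here.

```fortran
module dummy_kinds
  implicit none
contains
  ! Assumed-shape dummy: the caller must see an explicit interface
  ! (supplied here by the module) so that a descriptor can be passed.
  function sum_shape(a) result(s)
    real, intent(in) :: a(:)
    real :: s
    s = sum(a)
  end function sum_shape
end module dummy_kinds

! Assumed-size dummy in an external procedure: f77 style, callable
! through an implicit interface; the extent is a separate argument.
function sum_size(n, a) result(s)
  implicit none
  integer, intent(in) :: n
  real, intent(in) :: a(*)
  real :: s
  s = sum(a(1:n))
end function sum_size

program dummy_demo
  use dummy_kinds
  implicit none
  real, external :: sum_size
  real :: w(10), s1, s2
  integer :: k

  w = (/ (real(k), k = 1, 10) /)

  ! Strided section to an assumed-shape dummy: the stride can travel
  ! in the descriptor.
  s1 = sum_shape(w(2:10:2))

  ! Strided section to an assumed-size dummy: the callee expects
  ! contiguous storage, so a compiler will typically build a temporary.
  s2 = sum_size(5, w(2:10:2))

  print *, s1, s2
  if (s1 /= s2) stop 1
end program dummy_demo
```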
----- Original Message -----
From: "Van Snyder" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Thursday, September 28, 2000 12:56 PM
Subject: Questions about assumed shape
>
> I have a procedure that has an assumed-shape rank-1 dummy argument.
>
> It calls another procedure that doesn't have explicit interface, and
> passes that argument to it.
>
> I assume most compilers are clever enough to get the interface done
> right. That is, they use copy-in/copy-out on the caller's side. Is
> it common for compilers to generate code that suppresses taking copies
> if the elements of the dummy argument are contiguous?
>