This is a common piece of code that I use to copy a segment from 1
arbitrary dimensional array to another array of the same
dimensiononality (but may be different size)
function offset(pos, lb, strd)
!
! compute the offset into a flattened array
!
integer, dimension(:), intent(in) :: pos, lb, strd
integer :: offset, i
offset = 1
do i = 1, ndims
offset = offset + (pos(i)-lb(i))*strd(i)
end do
end function offset
real(8), allocatable(:) :: array1, array2
integer, allocatable(:) :: inc, lb1, ub1, lb2, ub2
integer :: i, ii, imax, jj, ndims
ndims=something
allocate(array1(some large size), array2(another large size))
allocate(inc(ndims), lb1(ndims), ub1(ndims))
allocate(lb2(ndims), ub2(ndims))
lb1=something
ub1=something
strd1(1)=1
do i = 2, ndims
strd1(i) = (ub1(i)-lb1(i)+1)*strd1(i-1)
end do
lb2=something
ub2=something
strd2(1)=1
do i = 2, ndims
strd2(i) = (ub2(i)-lb2(i)+1)*strd2(i-1)
end do
imax = offset(ub1, lb1, strd1)
ii = imax
inc = lb1
do while (ii <= imax)
i = offset(inc, lb1, strd1)
jj = offset(inc, lb2, strd2)
do ii = i, i+ub1(1)-lb1(1)
array2(jj) = array1(ii)
jj = jj + 1
end do
do k = 2, ndims
inc(k) = inc(k) + 1
if (inc(k) <= ub2(k)) exit
inc(k) = lb1(k)
end do
end do
Notionally lb1/ub1 and lb2/ub2 are the upper and lower bounds of the
two arrays. So by adjusting the lb's and ub's in
imax = offset(ub1, lb1, strd1)
ii = imax
inc = lb1
you can choose a different segment of the arrays. The example I have
provided copys ALL of array1 into array2 (though array2 may be of
different size).
I use to have the function offset defined as a cpp macro so that it
was inlined... but most compilers can inline a function just as well.
Things to note: for speed, you need to try and keep your memory-
stores contiguous - hence the array over ii/jj is contiguous. If you
stride over cache lines while doing a memory-store, you will actually
induce a cache-read, cache-modify, cache-write. Which is costly :)
Much better just to do a cache-read, lost of cache-modifies, cache-
write and fush a full/modified cache line.
I use modifications of this piece of code all through my MPI-AMR code
so I can do 2 and 3D problems without having to recompile. I use it
in my fortran scripting language so I can have interpreted arrays of
arbitrary size and dimensionality. In all the performance testing I
have done, this code snippet is as fast as hard wiring the number of
dimensions. Deep down the compiler has to be doing the same thing,
as memory is a single dimensional address range.
Of course, doing this sort of thing screws with your mind...
especially when you start getting into some hairy mathematics :) It
is much easier to let the compiler do it.
Stu.
--
Dr Stuart Midgley
Industry Uptake Program Leader
iVEC, 'The hub of advanced computing in Western Australia'
26 Dick Perry Avenue, Technology Park
Kensington WA 6151
Australia
Phone: +61 8 6436 8545
Fax: +61 8 6436 8555
Email: [log in to unmask]
WWW: http://www.ivec.org
|