Andy
The reason for the big difference in performance is that your two programs
are quite different!
In prog2, the arrays are allocated statically, with their size fixed at
compile time, so the computer does not have to check any aspect of the
allocation at run time.
In prog1, you allocate something of arbitrary size and then call a routine
1000 times. Each time, the routine must determine whether anything is
allocated and, if so, how big it is and where it is, and take appropriate
action if conditions are not good.
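To make that run-time bookkeeping concrete, here is a minimal sketch. It is not taken from either of your programs, and the names descriptor_demo and S are made up for illustration. It queries the information an allocatable array carries at run time, next to a statically sized array whose size is a compile-time constant. How much of this a given compiler actually consults on each call is an assumption that will vary with the compiler and its options.

```fortran
program descriptor_demo
  implicit none
  integer, parameter :: na = 1024*96     ! same size as in the attached programs
  real, dimension(na) :: S               ! static: size fixed at compile time
  real, allocatable, dimension(:) :: A   ! dynamic: size known only at run time

  ! For S the answer is a constant the compiler already knows.
  print *, 'static size       :', size(S)

  ! For A the status, size and bounds are stored with the array
  ! and have to be looked up while the program runs.
  print *, 'allocated before? :', allocated(A)
  allocate(A(na))
  print *, 'allocated after?  :', allocated(A)
  print *, 'size and bounds   :', size(A), lbound(A, 1), ubound(A, 1)
  deallocate(A)
end program descriptor_demo
```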
Try the following program, which I think is similar to your common block
example but uses the static idiom with a module. Its performance is the same
as that of your common block example (well, it is with my Compaq Visual
Fortran compiler on a PC).
Hope that this helps.
Alistair
PS I have made small changes to prog1.f90 and prog2.f90 to make them more to
my taste, and so I attach those also.
Prog3.f90
!==============================
module A_MOD
  implicit none
  private
  integer, parameter, public :: na = 1024*96
  real, public :: dt = 1.0/1024
  real, dimension(na), public :: A, dAdt
end module A_MOD
!==============================
program main
  use A_MOD, only: na, A, dAdt, dt
  implicit none
  integer :: i, n
  dt = 1.0/1024
  A(:) = 0
  dAdt(:) = 1
  do i = 1, 10
    do n = 1, 1024
      call SUB
    enddo
    print *, a(n)
  end do
end
!==============================
subroutine SUB
  use A_MOD, only: na, A, dAdt, dt
  implicit none
  integer :: n
  do n = 1, na
    A(n) = A(n) + dAdt(n)*dt
  enddo
end
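
If you want to check the timing on another compiler without a profiler, one option is to wrap the calls in the standard cpu_time intrinsic. The following is only a sketch under that assumption: it repeats the module and SUB from Prog3.f90 so that it compiles on its own, and the names time_sub, t0 and t1 are made up for illustration. The resolution of cpu_time depends on the compiler and system, so very short runs may report zero.

```fortran
module A_MOD
  implicit none
  private
  integer, parameter, public :: na = 1024*96
  real, public :: dt = 1.0/1024
  real, dimension(na), public :: A, dAdt
end module A_MOD

program time_sub
  use A_MOD, only: A, dAdt
  implicit none
  integer :: n
  real :: t0, t1     ! hypothetical names: CPU time before and after the calls

  A(:) = 0
  dAdt(:) = 1

  call cpu_time(t0)
  do n = 1, 1024
    call SUB
  enddo
  call cpu_time(t1)

  ! Printing a result keeps the compiler from discarding the work.
  print *, 'A(1) =', A(1)
  print *, 'CPU seconds in 1024 calls of SUB:', t1 - t0
end program time_sub

subroutine SUB
  use A_MOD, only: na, A, dAdt, dt
  implicit none
  integer :: n
  do n = 1, na
    A(n) = A(n) + dAdt(n)*dt
  enddo
end subroutine SUB
```

Swapping the module for the allocatable version from Prog1.f90 (plus the ALLOCATE statements) should give a like-for-like comparison on the same machine.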
Prog1.f90
!==============================
module A_MOD
  implicit none
  private
  integer, public :: na
  real, public :: dt = 1.0/1024
  real, allocatable, dimension(:), public :: A, dAdt
end module A_MOD
!==============================
program main
  use A_MOD, only: na, A, dAdt
  implicit none
  integer :: n, i
  na = 96*1024
  ALLOCATE(A(na))
  ALLOCATE(dAdt(na))
  A(:) = 0
  dAdt(:) = 1
  do i = 1, 10
    do n = 1, 1024
      call SUB
    enddo
    print *, A(n)
  end do
  deALLOCATE(A)
  deALLOCATE(dAdt)
end
!==============================
subroutine SUB
  use A_MOD, only: na, A, dAdt, dt
  implicit none
  integer :: n
  do n = 1, na
    A(n) = A(n) + dAdt(n)*dt
  enddo
end
!==============================
Prog2.f90
!==============================
program main
  implicit none
  integer, parameter :: na = 1024*96
  real :: dt, A(na), dAdt(na)
  common /A_MOD/ dt, A, dAdt
  integer :: i, n
  dt = 1.0/1024
  A(:) = 0
  dAdt(:) = 1
  do i = 1, 10
    do n = 1, 1024
      call SUB
    enddo
    print *, a(n)
  end do
end
!==============================
subroutine SUB
  implicit none
  integer, parameter :: na = 1024*96
  real :: dt, A(na), dAdt(na)
  common /A_MOD/ dt, A, dAdt
  integer :: n
  do n = 1, na
    A(n) = A(n) + dAdt(n)*dt
  enddo
end
-----Original Message-----
From: Fortran 90 List [mailto:[log in to unmask]] On Behalf Of
Andy Leonard
Sent: 27 January 2003 15:39
To: [log in to unmask]
Subject: Performance penalties for allocatable arrays?
Can anyone explain why the use of allocatable arrays in the first example
program below causes a 10% increase in the CPU time spent in the subroutine,
compared to the second example program where the arrays are in a common
block? (When the allocatable is changed to pointer, it adds another 10%).
Both codes were compiled with the IBM xlf90 compiler (with the same compiler
options) and the cpu time determined by profiling with gprof.
We have noticed some significant increases in CPU time in our software since
we started dynamically allocating arrays. The trivial examples below are
representative of our code, as we frequently use common blocks (and now
modules) rather than passing data through the arguments.
Thanks,
Andy
Program 1:
c==============================
      program main
      use A_MOD
      na = 100000
      ALLOCATE(A(na))
      ALLOCATE(dAdt(na))
      do n=1,1000
         call SUB
      enddo
      stop
      end
c==============================
      subroutine SUB
      use A_MOD
      do n=1,na
         A(n) = A(n) + dAdt(n)*dt
      enddo
      return
      end
c==============================
      module A_MOD
      integer :: na
      real :: dt = 0.001
      real, allocatable, dimension(:) :: A, dAdt
      end module A_MOD
c==============================
Program 2:
c==============================
      program main
      parameter (na = 100000)
      common /A_MOD/ dt,A(na),dAdt(na)
      dt = 0.001
      do n=1,1000
         call SUB
      enddo
      stop
      end
c==============================
      subroutine SUB
      parameter (na = 100000)
      common /A_MOD/ dt,A(na),dAdt(na)
      do n=1,na
         A(n) = A(n) + dAdt(n)*dt
      enddo
      return
      end
Andy Leonard
Gamma Technologies, Inc.
601 Oakmont Lane, Suite 220
Westmont, IL 60559
Tel: (630) 325-5848
Fax: (630) 325-5849
E-mail: [log in to unmask]
Web: http://www.gtisoft.com