JiscMail: Email discussion lists for the UK Education and Research communities

COMP-FORTRAN-90 Archives (COMP-FORTRAN-90@JISCMAIL.AC.UK), April 2015

Subject: Re: coarray deadlock at high core counts: Cray XC30 ARCHER
From: Bill Long <[log in to unmask]>
Reply-To: Fortran 90 List <[log in to unmask]>
Date: Fri, 10 Apr 2015 15:33:55 +0000
Content-Type: text/plain
Parts/Attachments: text/plain (121 lines)

Hi Anton,

On Apr 10, 2015, at 7:24 AM, Anton Shterenlikht <[log in to unmask]> wrote:

> This is a relatively large coarray/MPI program.
> Running on ARCHER, Cray XC30, archer.ac.uk.
> 
> The program apparently deadlocks at >1000 cores.
> 
> I understand a standard conforming coarray program
> can never deadlock.
> Is that correct?

Well, not exactly.  A properly written one should not deadlock.  But you could intentionally write a code that hangs.  For example, image 1 executes SYNC IMAGES (2) and image 2 never executes a corresponding SYNC IMAGES statement.  Image 1 will wait “forever” for image 2.  Similarly, you can hang an MPI program by having one rank call MPI_Recv with no rank calling  a corresponding MPI_Send. 
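
For illustration, a minimal sketch of such an intentional hang (a made-up example, and it needs at least two images):

  program hang_demo
    implicit none
    if ( this_image() == 1 ) then
      sync images (2)   ! image 1 waits here for a matching sync from image 2 ...
    end if
    ! ... but image 2 never executes a corresponding SYNC IMAGES (1),
    ! so image 1 waits "forever".
  end program hang_demo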

> 
> Is this statement still true in a coarray/MPI program?

The last time I looked, the MPI group had not updated their rules for interaction with the use of coarrays.  However, people have been mixing coarrays and MPI for many years.  The basic rule is to write the code in “phases” that are either all MPI or all coarrays, ending each MPI phase with MPI_Barrier and each coarray phase with SYNC ALL.  MPI_Barrier would not normally do memory syncs for coarray operations that MPI has no information about, for example. 
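
A sketch of that phase discipline (the work inside each phase is only a placeholder; it assumes, as on the Cray systems, that the MPI and coarray runtimes coexist so MPI_Init can be called from a coarray program):

  program mixed_phases
    use mpi
    implicit none
    integer :: ierr
    real :: t[*]     ! coarray touched only in the coarray phase
    real :: local

    call MPI_Init( ierr )

    ! --- MPI phase: MPI communication only ---
    ! ... MPI_Send / MPI_Recv work would go here ...
    call MPI_Barrier( MPI_COMM_WORLD, ierr )   ! close the MPI phase

    ! --- coarray phase: coarray communication only ---
    t = real( this_image() )
    sync all                  ! segments are now ordered ...
    local = t[ 1 ]            ! ... so every image can read image 1's value
    sync all                  ! close the coarray phase before any further MPI calls

    call MPI_Finalize( ierr )
  end program mixed_phases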


> 
> The complete program is too large to reproduce here,
> but the key fragment is:
> 
> sync all
>  call sub( a , b , c , d )
>  write (*,*) "image:", this_image(), "sub done"
> sync all
>  write (*,*) "image:: ", this_image(), "passed sync all 2"
> 
> All arguments to subroutine sub are INTENT(IN).
> Subroutine sub reads a coarray variable defined
> in the previous segment. Subroutine sub does not
> update this coarray variable. I believe this means
> the program is standard conforming (as far as coarray
> rules are concerned).
> Is that correct?

So far, it looks OK. 
> 
> The program works as expected at low core counts,
> and most times at 1500 cores. It never works at >10k cores.
> At runtime I get "image ... sub done" output from
> the vast majority of images, >99%, but not all of them.
> Then the program seems to stall until the queue time expires.
> No images get to the second write statement, so
> this means the program is stalling at the second SYNC ALL.


Usually, if a code is OK at 1000 images and hangs at much larger image counts, there is something other than a standard violation happening at large scale.

> 
> The full text of subroutine sub is:
> 
> subroutine sub( origin, rot, bcol, bcou )
> real( kind=rdef ), intent( in ) ::                                     &
> origin(3),        & ! origin of the "box" cs, in FE cs
> rot(3,3),         & ! rotation tensor *from* FE cs *to* CA cs
> bcol(3),          & ! lower phys. coords of the coarray on image
> bcou(3)             ! upper phys. coords of the coarray on image
> integer :: errstat, i, j, nimgs, nelements
> real( kind=cgca_pfem_iwp ) :: cen_ca(3) ! 3D case only
> nimgs = num_images()
> allocate( lcentr( 0 ), stat=errstat )
> if ( errstat .ne. 0 ) error stop                                       &
>  "ERROR: cgca_pfem_cenc: cannot allocate( lcentr )"
> images: do i=1, nimgs
>  nelements = size( cgca_pfem_centroid_tmp[i]%r, dim=2 )
>  elements: do j = 1, nelements
>    cen_ca = matmul( rot, cgca_pfem_centroid_tmp[i]%r(:,j) - origin )

This statement throws up big red flags. You have an outer loop that runs i = 1..num_images() more or less concurrently on every image, with each iteration referencing image i. Thus, for i = 1, every image in the program will be trying to access image 1 at the same time. For i = 2, all the images are suddenly pouncing on image 2. This sort of construct does not scale well. For a large number of images, it can create considerable congestion of the network at the point where the target image is attached.

Note that you could create this same sort of congestion using MPI.  This is a general parallel programming consideration.  Try to avoid creating hot spots. 

If the algorithm allows (it appears so here), it is better to spread out the accesses more uniformly. Options include executing the iterations of the images: loop in a random order that is different on each image, or offsetting the start of the loop so it runs this_image()+1..num_images() with wrap-around to 1..this_image() at the end.
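
For example, the offset variant of your images: loop could look roughly like this (a sketch only; it assumes an extra integer counter k in the declarations, everything else follows your quoted code):

  images: do k = 1, nimgs
    ! start at this_image()+1 and wrap around to 1..this_image()
    i = 1 + mod( this_image() - 1 + k, nimgs )
    nelements = size( cgca_pfem_centroid_tmp[i]%r, dim=2 )
    elements: do j = 1, nelements
      cen_ca = matmul( rot, cgca_pfem_centroid_tmp[i]%r(:,j) - origin )
      if ( all( cen_ca .ge. bcol ) .and. all( cen_ca .le. bcou ) )       &
        lcentr = (/ lcentr, mcen( i, j, cen_ca ) /)
    end do elements
  end do images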



>    if ( all( cen_ca .ge. bcol ) .and. all( cen_ca .le. bcou ) )       &
>      lcentr = (/ lcentr, mcen( i, j, cen_ca ) /)
>  end do elements
> end do images
> end subroutine sub
> 
> There are several module variables, but you can see that
> there is only a single coarray variable, which is being read,
> not written. So I think the coarray segment rules are not
> violated.

No problem on that front. 
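
Just to spell out the rule being relied on: a coarray that is defined before an image control statement (the SYNC ALL in your fragment) may be referenced by any image in a later segment. A trivial sketch:

  program segment_rule
    implicit none
    real :: a[*], b
    a = real( this_image() )   ! segment 1: each image defines its own copy
    sync all                   ! orders segment 1 before segment 2 on every image
    b = a[ 1 ]                 ! segment 2: any image may now read image 1's value
    write (*,*) this_image(), b
  end program segment_rule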

> 
> I wanted to confirm here that my understanding of the
> segment rules was correct and the fragment is indeed
> standard conforming.
> 
> I wonder if the use of MPI in other parts of the program
> can have any effect on this seeming deadlock behaviour?

Probably not, especially if there is no issue at 1500 images. 


Cheers,
Bill

> 
> Any other advice?
> 
> Does this sound like I need to submit a problem report?
> 
> Thanks
> 
> Anton

Bill Long                               [log in to unmask]
Fortran Technical Support &             voice: 651-605-9024
Bioinformatics Software Development     fax:   651-605-9142
Cray Inc./ Cray Plaza, Suite 210/ 380 Jackson St./ St. Paul, MN 55101
