Hi Sjors,

I agree I am being hampered by something. My workstation has dual
10-core Xeon processors, which with hyperthreading should give me a
total of 40 threads. Additionally, it has 128 GB of RAM and an 8-drive
RAID 5 array giving over 1 GB/s throughput, which according to 'top'
isn't being taxed at all.
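
As a rough sanity check of the core budget Sjors describes in the quoted message below (one core for the master plus 4 threads for each of the 4 slaves), here is a minimal Python sketch. The function name `cores_needed` is purely illustrative and not part of RELION, and the "available" figure simply plugs in the 40 hardware threads mentioned above.

```python
def cores_needed(mpi_processes: int, threads_per_process: int) -> int:
    """Cores needed when the master takes 1 core and each slave runs --j threads.

    Illustrative helper only; it mirrors the arithmetic in Sjors' reply.
    """
    if mpi_processes < 2:
        raise ValueError("expected one master plus at least one slave")
    slaves = mpi_processes - 1
    return slaves * threads_per_process + 1


needed = cores_needed(mpi_processes=5, threads_per_process=4)  # mpirun -n 5, --j 4
available = 2 * 10 * 2  # dual 10-core Xeons, 2 hardware threads per core
print(needed, available, needed <= available)  # prints: 17 40 True
```

By this count the job asks for 17 cores, which fits within the workstation's 40 hardware threads, so oversubscription alone would not explain the ~200% ceiling.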

An interesting observation is that when I use relion_refine, it has no
problem running more than 2 threads (4, 8, 16, etc., verified using
'top'). However, when I use relion_refine_mpi, I am stuck at 2 threads
per MPI process.
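
One possible check, offered as an assumption rather than a diagnosis (nothing in this thread confirms it): if the MPI launcher restricts each process to a small set of CPUs, the process could not exceed roughly 100% per allowed CPU in 'top' regardless of --j. The Linux-only sketch below reports the CPU set a process is allowed to run on, using Python's standard `os.sched_getaffinity`. The helper names are illustrative.

```python
import os
from typing import List


def allowed_cpus(pid: int = 0) -> List[int]:
    """Return the sorted CPU ids the given process may run on (pid 0 = this process).

    Linux only: os.sched_getaffinity is not available on every platform.
    """
    return sorted(os.sched_getaffinity(pid))


def max_top_percent(pid: int = 0) -> int:
    """Upper bound on the CPU% 'top' can show for a process with this affinity."""
    return 100 * len(allowed_cpus(pid))


if __name__ == "__main__":
    cpus = allowed_cpus()
    print("allowed CPUs:", cpus)
    print("ceiling in top: ~%d%%" % max_top_percent())
```

Passing the PID of a running relion_refine_mpi slave (as shown by 'top') instead of 0 would show whether that slave is limited to two CPUs. If the MPI in use happens to be Open MPI (again an assumption), its mpirun also has a `--report-bindings` option that prints the bindings it applied.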

Some additional info:
1) I have seen this issue on two different systems, running CentOS 7
and Fedora 23, with 20 and 4 cores respectively.
2) I have seen the issue with both my own compiled version of RELION 1.4
and SBGrid.org's compiled version.
3) I suspect this might be a reason the program is not scaling as well
as we would like on our cluster, a Cray XE6 system.  

Cheers,
Bharat Reddy
Post Doc
University of Chicago

 
On Thu, 2016-04-28 at 10:10 +0100, Sjors Scheres wrote:
> Hi again,
> The program is actually using 4 threads (as shown in the stdout). The
> fact that top shows 200% means that your threads are being hampered by
> something else. This could, for example, be the reading of particles
> from the hard disk, which can become a bottleneck. Also: how many cores
> does nsit-dhcp-148-090.bsd.uchicago.edu have? You're running 4 MPI
> slaves, each with 4 threads. The master also takes 1 core. Therefore,
> your machine should have 17 cores to do everything you ask for. If it
> has fewer cores, then they'll just be in each other's way.
> HTH,
> Sjors
> 
> On 04/27/2016 08:18 PM, Baru Reddy wrote:
> > 
> > Hi Sjors,
> > 'top' says each MPI process is running at ~200%, which is why I say
> > it is only using 2 threads. The command I use and the initial output
> > I get are shown below.
> > 
> > mpirun -n 5 ~/Downloads/relion-1.4/bin/relion_refine_mpi \
> >   --o Class3D/run1_ct5 \
> >   --continue Class3D/run1_it005_optimiser.star \
> >   --iter 25 --tau2_fudge 4 --solvent_mask proteasome_mask_150.mrc \
> >   --oversampling 1 --healpix_order 3 --offset_range 5 --offset_step 2 \
> >   --j 4 &
> > 
> > [reddybg@nsit-dhcp-148-090 gauto]$  === RELION MPI setup ===
> >   + Number of MPI processes             = 5
> >   + Number of threads per MPI process  = 4
> >   + Total number of threads therefore  = 20
> >   + Master  (0) runs on host            = nsit-dhcp-148-090.bsd.uchicago.edu
> >   + Slave     1 runs on host            = nsit-dhcp-148-090.bsd.uchicago.edu
> >   + Slave     2 runs on host            = nsit-dhcp-148-090.bsd.uchicago.edu
> >   + Slave     3 runs on host            = nsit-dhcp-148-090.bsd.uchicago.edu
> >   + Slave     4 runs on host            = nsit-dhcp-148-090.bsd.uchicago.edu
> > 
> > Cheers,
> > Bharat Reddy
> > Post Doc
> > University of Chicago
> > 
> >        From: Sjors Scheres <[log in to unmask]>
> >   To: Baru Reddy <[log in to unmask]>
> > Cc: [log in to unmask]
> >   Sent: Wednesday, April 27, 2016 2:08 PM
> >   Subject: Re: [ccpem] Stuck at 2 Threads per MPI Process
> >     
> > Hi Bharat,
> > --j N should always launch N threads. You'll only see them as 1
> > process in
> > 'top', but it may run up to ~N00%. Why do you say relion launches
> > only 2
> > threads? How do you see this? Does it say so in the stdout?
> > S
> > 
> > > 
> > > Hi Everyone,
> > > Currently we are trying to mobilize the power of threads, as our
> > > refinements have become more memory intensive and we have hit a
> > > limit on the number of MPI processes we can deploy. The problem is
> > > that however many threads I tell relion_refine_mpi to use (--j X,
> > > where X is 4, 8, 16, etc.), it only uses two threads. Is there a
> > > setting I am missing, a variable I am failing to define, or is this
> > > a limit of relion_refine_mpi?
> > > Cheers,
> > > Bharat Reddy
> > > Post Doc
> > > University of Chicago