Thanks for your feedback, Adam. Perhaps I should not hide the option in the
next release then... Anyway, it's good that it is advertised here now.
S
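
For anyone scripting their runs rather than using the GUI, here is a minimal
sketch of appending the option to an existing command line. The option name is
the one given in this thread; the helper name with_network_combine and the
example command (program name, --o argument and path) are only illustrative
assumptions, not taken from any particular setup.

```python
import shlex

# Option name as given in this thread.
FLAG = "--dont_combine_weights_via_disc"


def with_network_combine(command: str) -> str:
    """Return the command line with FLAG appended, unless it is already there."""
    args = shlex.split(command)
    if FLAG not in args:
        args.append(FLAG)
    return shlex.join(args)


# Hypothetical command line: program name and arguments are placeholders.
example = "relion_refine_mpi --o Refine/run1"
print(with_network_combine(example))
```

Whether this actually helps will depend on whether network speed or disc
access is the bottleneck on your cluster, so it is worth timing one run with
and one without the option.
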
> Hi Sjors,
>
> We have found that it is essential to use the hidden flag:
> --dont_combine_weights_via_disc
>
> With this option our performance with v1.2 is comparable to v1.1. This has
> been our experience on several different clusters with different
> system architectures and interconnect hardware, including the Gordon
> cluster administered by NSF and our local clusters of Dell Sandybridge
> nodes.
>
> ~Adam
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> http://www.biochem.utah.edu/frost/
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
> On Fri, Sep 6, 2013 at 9:51 AM, Sjors Scheres wrote:
>
>> Hi Niels,
>>
>> This probably has to do with a change in how the data are communicated
>> from one node to the other. On our cluster, we started experiencing
>> severe network limitations during the gathering of all information from
>> all nodes at the end of each step. I then decided to have each node
>> write out rather large files to disc and to have the master read them
>> all back in again. On our cluster that improved the stability of the
>> runs. However, it all depends on where the bottleneck on your system
>> lies: network speed or disc access (all slaves write their files one
>> after the other, and other files are read back in one after the other
>> again: on some clusters this takes very long).
>> To switch back to the old RELION way of sending all data over the
>> network, you can add the (hidden) option
>> --dont_combine_weights_via_disc to the extra option bar in the GUI.
>> I would appreciate it if you could share with us whether this helps on
>> your system.
>>
>> HTH, S.
>>
>>
>>
>> On 09/06/2013 03:53 PM, Fischer, Niels wrote:
>>
>>>
>>> Dear Sjors Scheres,
>>>
>>> Relion 1.2 provides features absent in Relion 1.1 that are
>>> indispensable for us. However, we at the MPI in Goettingen experienced
>>> a severe drop (up to ~4-fold) in performance with the auto-refine
>>> procedure since changing from Relion 1.1 to 1.2.
>>>
>>> For instance, a run of the same refinement (same data, same Relion
>>> settings etc.) on the same cluster takes 40 minutes with Relion 1.1,
>>> but 150 minutes with Relion 1.2. The calculation times that are
>>> indicated by Relion in the shell appear to be very similar for both
>>> versions. The main difference appears to be a very long pause between
>>> the expectation step and the maximization step in Relion 1.2. This
>>> pause is absent or at least much reduced in Relion 1.1. During this
>>> pause, Relion 1.2 writes a bunch of tmp files to the hard disc
>>> (“outputfilename”_rank000001.tmp, “outputfile”_rank000002.tmp etc.).
>>>
>>> Any ideas on how to get rid of this pause or otherwise increase the
>>> performance of Relion 1.2 would be highly appreciated!
>>>
>>> Thanks in advance and best regards,
>>>
>>> Niels
>>>
>>> ---
>>>
>>> Dr. Niels Fischer
>>> MPI f. Biophysical Chemistry
>>> 3D Electron Cryomicroscopy
>>> Am Faßberg 11
>>> 37077 Göttingen
>>> Tel. ++49 - (0)551-2011306
>>> Fax ++49 - (0)551-2011197
>>>
>>>
>> --
>> Sjors Scheres
>> MRC Laboratory of Molecular Biology
>> Francis Crick Avenue, Cambridge Biomedical Campus
>> Cambridge CB2 0QH, U.K.
>> tel: +44 (0)1223 267061
>> http://www2.mrc-lmb.cam.ac.uk/groups/scheres
>>
>
--
Sjors Scheres
MRC Laboratory of Molecular Biology
Francis Crick Avenue, Cambridge Biomedical Campus
Cambridge CB2 0QH, U.K.
tel: +44 (0)1223 267061
http://www2.mrc-lmb.cam.ac.uk/groups/scheres