Good to know. We have not accounted for SLI at all, not even passively.
We'll make a TODO of it and hopefully avoid unnecessary split
allocations in future versions.
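For reference, on Linux with the NVIDIA driver, SLI is typically enabled through an Option in the xorg.conf Device section; a sketch of such a section with SLI turned off (the Identifier and the rest of the section will differ per system):

```
Section "Device"
    Identifier "Device0"
    Driver     "nvidia"
    Option     "SLI" "Off"
EndSection
```

Alternatively, running `nvidia-xconfig --sli=off` should regenerate the configuration accordingly (restart X afterwards for it to take effect).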
Thanks!
/Björn
On 02/22/2017 04:23 PM, Dieter Blaas wrote:
> Hi,
>
> SOLVED: The SLI switch was set. Unsetting it solved the issue!
>
> Thanks for the help,
>
> Dieter
>
> ------------------------------------------------------------------------
> Dieter Blaas,
> Max F. Perutz Laboratories
> Medical University of Vienna,
> Inst. Med. Biochem., Vienna Biocenter (VBC),
> Dr. Bohr Gasse 9/3,
> A-1030 Vienna, Austria,
> Tel: 0043 1 4277 61630,
> Fax: 0043 1 4277 9616,
> e-mail: [log in to unmask]
> ------------------------------------------------------------------------
>
> On 22.02.2017 at 15:37, Bjoern Forsberg wrote:
>> Hi,
>>
>> Apologies, I notice now that you only have two PIDs, indicating
>> exactly what you initially described. Please let me know if you work
>> out why this happens; I can't reproduce it here and have in fact
>> never seen it happen before.
>>
>> /Björn
>>
>>
>> On 02/22/2017 03:20 PM, Dieter Blaas wrote:
>>> Hi Björn,
>>>
>>> thank you very much for the explanation!
>>>
>>> But why does it still happen when I explicitly enter "0" under "Which GPU to use"?
>>>
>>> ###############################################
>>>
>>> uniqueHost N616-DB-LSRV2 has 2 ranks.
>>> Using explicit indexing on slave 1 to assign devices 0
>>> Thread 0 on slave 1 mapped to device 0
>>> Using explicit indexing on slave 2 to assign devices 0
>>> Thread 0 on slave 2 mapped to device 0
>>> Device 0 on N616-DB-LSRV2 is split between 2 slaves
>>> Estimating accuracies in the orientational assignment ...
>>>
>>> ##############################################
>>>
>>> GPU 1 is used as well, and its memory is likewise split in two:
>>>
>>> ##############################################
>>>
>>> +-----------------------------------------------------------------------------+
>>> | Processes:                                                       GPU Memory |
>>> |  GPU       PID  Type  Process name                               Usage      |
>>> |=============================================================================|
>>> |    0      2344    G   /usr/lib/xorg/Xorg                              13MiB |
>>> |    0     15865    C   /usr/local/bin/relion_refine_mpi              3939MiB |
>>> |    0     15866    C   /usr/local/bin/relion_refine_mpi              3949MiB |
>>> |    1      2344    G   /usr/lib/xorg/Xorg                              13MiB |
>>> |    1     15865    C   /usr/local/bin/relion_refine_mpi              3939MiB |
>>> |    1     15866    C   /usr/local/bin/relion_refine_mpi              3949MiB |
>>> +-----------------------------------------------------------------------------+
>>>
>>>
>>> ################################################
>>>
>>> I am afraid this is a problem with the hardware setup...
>>>
>>> Dieter
>>>
>>>
>>>
>>>
>>>
>>> On 22.02.2017 at 15:09, Bjoern Forsberg wrote:
>>>> Hi Dieter,
>>>>
>>>> There will be initial output during the run which states exactly
>>>> how relion distributes MPI ranks and threads. If you are running 4
>>>> ranks, there is simply no way to avoid placing at least 2 ranks on
>>>> at least one GPU, because MPI is implemented with non-shared memory
>>>> in mind. This means that two MPI ranks simply *cannot* share the
>>>> same memory, even if their allocations reside on the same physical
>>>> piece of memory. The only way to share objects residing in memory
>>>> between ranks is by sending and receiving them, which is both
>>>> inefficient in itself and entirely infeasible for objects like
>>>> class references, which are re-used so often inside relion. If you
>>>> want to use more CPUs per GPU, using more threads helps. It IS less
>>>> efficient to compensate for fewer MPI ranks by increasing the
>>>> number of threads, but in your case it is the only alternative,
>>>> since you are limited by memory.
>>>>
>>>> Cheers,
>>>>
>>>> /Björn
>>>>
>>>>
>>>> On 02/22/2017 02:53 PM, Dieter Blaas wrote:
>>>>> Hi all,
>>>>>
>>>>> I have 2 GPUs, but whatever I enter under 'Which GPU to use'
>>>>> (nothing, '0', '0,0', etc.) and/or 'Number of MPI procs' (3 or 4,
>>>>> with 1 or 2 threads), the RAM of each GPU gets divided in two, so
>>>>> that I run out of memory. What might be the reason? This does not
>>>>> occur on a second, similarly configured PC.
>>>>>
>>>>> Thanks for hints, Dieter
>>>>>
>>>>> +-----------------------------------------------------------------------------+
>>>>> | Processes:                                                       GPU Memory |
>>>>> |  GPU       PID  Type  Process name                               Usage      |
>>>>> |=============================================================================|
>>>>> |    0      2344    G   /usr/lib/xorg/Xorg                              13MiB |
>>>>> |    0     15865    C   /usr/local/bin/relion_refine_mpi              3939MiB |
>>>>> |    0     15866    C   /usr/local/bin/relion_refine_mpi              3953MiB |
>>>>> |    1      2344    G   /usr/lib/xorg/Xorg                              13MiB |
>>>>> |    1     15865    C   /usr/local/bin/relion_refine_mpi              3939MiB |
>>>>> |    1     15866    C   /usr/local/bin/relion_refine_mpi              3953MiB |
>>>>> +-----------------------------------------------------------------------------+
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
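As a practical illustration of the rank/thread advice above, a hypothetical command line for the two-GPU case (`--j` and `--gpu` are RELION's thread-count and GPU-assignment options; the rank count and everything else here are placeholders):

```
# 3 MPI ranks = 1 master + 2 slaves, 2 threads per rank.
# "0:1" assigns slave 1 to GPU 0 and slave 2 to GPU 1,
# so neither card is shared between ranks.
mpirun -n 3 relion_refine_mpi --j 2 --gpu "0:1" [other refinement options]
```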