Kostas Georgakopoulos wrote:
> Marco Verlato wrote:
>
>> Kostas Georgakopoulos wrote:
>>
>>> And will the normal (non-MPI) jobs still work? Charles Loomis made
>>> exactly that point: if you change the configuration to pbs and you
>>> *don't* have shared home directories, then all jobs will fail.
>>>
>>
>> In both cases the lcgpbs jobmanager with non-shared home directories
>> is used.
>> The INFN solution tries to simulate shared home directories by scp-ing
>> the whole job subdirectory from the WN where the job is executed to all
>> the others in the set chosen for the job. Of course this approach has
>> some limitations because the home directories are not really shared,
>> so only a subset of MPI applications (like the one in the example
>> shown at http://grid-it.cnaf.infn.it/index.php?mpihowto&type=1) will
>> work.
>>
> Thanks again everybody, I finally got it to work!
> Two final questions:
>
> What is the subset of MPI jobs that run with the
> non-shared-home-directories scheme?
I meant all those MPI jobs that run successfully when wrapped in a
script like the one shown at
http://grid-it.cnaf.infn.it/index.php?mpihowto&type=1, i.e. jobs for
which it is enough that the home directories on the involved WNs are
made identical only once, at the start, when the script executes the
mpirun command. If instead the MPI job relies, e.g., on something that
the different processes write to the home directory after they have
started, during their execution, then shared home directories are
mandatory.
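To make the one-time copy concrete, here is a rough sketch of the staging
step. It is only an illustration under my own assumptions, not the actual
script from the howto page: the function name stage_commands and its three
arguments are invented for this example, and it only prints the scp
commands instead of running them.

```sh
#!/bin/sh
# Sketch: print the scp commands that would copy the job directory from
# the WN running the script to every other WN allocated to the job.
#   $1  node file, one host per line (under PBS, typically $PBS_NODEFILE)
#   $2  name of the local WN, exactly as it appears in the node file
#   $3  absolute path of the job directory
stage_commands() {
    nodefile=$1
    local_host=$2
    job_dir=$3
    # A node file usually repeats a host once per CPU, so deduplicate.
    sort -u "$nodefile" | while read -r host; do
        [ -z "$host" ] && continue               # skip empty lines
        [ "$host" = "$local_host" ] && continue  # no copy to ourselves
        # Recreate the job directory under the same parent on the remote WN.
        echo "scp -rq $job_dir $host:$(dirname "$job_dir")/"
    done
}
```

Inside a wrapper one could pipe the output to sh before calling mpirun,
assuming password-less scp between the WNs. Since the copy happens once,
anything a WN writes afterwards stays local to that WN, which is exactly
the limitation described above.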
> If we wanted to have full support for MPI jobs, should we consider
> switching to shared home directories?
Yes.
>
> And how can we now publish the fact that our site supports MPI, so
> others can use it?
Just publish GlueHostApplicationSoftwareRunTimeEnvironment: MPICH (by
adding the tag MPICH to the CE_RUNTIMEENV line of your site.def).
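For illustration, the edited line could look roughly like this. This is a
sketch assuming site.def uses the usual shell-style VAR="..." syntax with
whitespace-separated tags; the placeholder stands for whatever tags your
line already lists.

```sh
# site.def fragment (sketch): keep the tags that are already there and
# append MPICH. "<existing tags>" is a placeholder, not a real tag.
CE_RUNTIMEENV="<existing tags> MPICH"
```

After reconfiguring the CE, the information system should then publish
MPICH among the GlueHostApplicationSoftwareRunTimeEnvironment values.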
cheers, Marco
>
> Best regards,
> Kostas Georgakopoulos - University of Macedonia