On 30 Sep 2011, at 16:33, Stephen Jones wrote:
> Stuart wrote:
> ... I'm pretty sure that's what we do
>
> That's interesting. When I tried to install
> EMI-CREAM/EMI-TORQUE_UTILS(torque 2.5...) to talk to a cluster with
> EMI-TORQUE_SERVER (torque 2.3...), the first problem was about MUNGE,
Ah! Not quite what we have - we've got torque 2.3 on each. Looks like the torque libs haven't been updated since I installed it, and if I remember correctly I had problems getting YUM to install torque at the time (for unrelated reasons) - might be that 2.5 is relatively new in EPEL?
This does, however, suggest a solution - if all else fails, just install and pin torque 2.3, which (in my experience) works. Then, once everything can move together, update all the machines to 2.5 at once.
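For what it's worth, a minimal sketch of that pinning approach using yum's versionlock plugin (the plugin and package names here are assumptions - the exact torque version strings depend on what your repos actually ship, so check with `yum list torque` first):

```shell
# Sketch only: pin the torque 2.3 series so "yum update" won't pull in 2.5.
# Assumes the versionlock plugin is available in your repos; on some
# releases the package is named yum-versionlock instead.
yum install -y yum-plugin-versionlock

# Install the 2.3 series explicitly (version string is illustrative).
yum install -y torque-2.3.13 torque-client-2.3.13

# Lock whatever torque versions are now installed.
yum versionlock add 'torque*'

# Confirm what is locked.
yum versionlock list
```

Once the whole cluster is ready to move, `yum versionlock delete 'torque*'` (or clearing the versionlock file) lets the 2.5 update through everywhere at once.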
> i.e. qsub would grumble about a missing socket from a missing MUNGE
> server. When I installed MUNGE, things went to the next step - I just
> got "End Of File" messages from qsub. I figured that was because the pbs
> server was getting off-protocol messages, i.e. some kind of MUNGE
> message that the server doesn't want or understand.
>
> I'll try it again.
>
> Steve
>
>
>
>
> Stuart Purdie wrote:
>> On 30 Sep 2011, at 13:18, Stephen Jones wrote:
>>
>>
>>> Hi,
>>>
>>> If everyone is an early adopter, is anyone an early adopter? While we
>>> ponder that, I've got something to say/ask about emi-cream and
>>> emi-torque. Maybe an early-adopter could enlighten me?
>>>
>>> The plan was to fit a new emi-cream onto our existing (separate)
>>> glite_TORQUE_server cluster. But I discovered that emi-torque (server
>>> and utils) is built to use MUNGE to safely transmit the login details.
>>> This is a new thing.
>>>
>>> Unfortunately, due to MUNGE, the emi-cream can't qsub from a system
>>> using emi-torque-utils to a batch cluster headnode that uses
>>> glite_TORQUE_server (job array syntax is also new). This would mean
>>> that, by design, emi-cream cannot work with a standalone
>>> glite_TORQUE_server cluster -- the whole lot has to be updated at once.
>>>
>>
>>
>>
>>> This could deserve a GGUS ticket, but I'm not sure of the facts -- is it
>>> possible to run emi-cream/emi-torque-utils with an existing
>>> glite_TORQUE_server?
>>>
>>
>> ... I'm pretty sure that's what we do
>>
>> Ok, strictly, no - we're using what looks like the SL5 native Torque, probably inherited from CentOS, not the gLite one.
>>
>>
>>> Should it be possible to qsub from a
>>> system using emi-torque-utils to a batch cluster headnode that uses
>>> glite_TORQUE_server?
>>>
>>
>> I've been able to qsub, and I can assure you that we're definitely using ruserok (and not munge) on our torque head node, and on the emi-torque-utils host I submitted from.
>>
>> What error message are you actually seeing - it could be down to the different authentication keys in the different builds of torque?
>>
>
>
> --
> Steve Jones [log in to unmask]
> System Administrator office: 220
> High Energy Physics Division tel (int): 42334
> Oliver Lodge Laboratory tel (ext): +44 (0)151 794 2334
> University of Liverpool http://www.liv.ac.uk/physics/hep/