On 30 Sep 2011, at 16:33, Stephen Jones wrote:

> Stuart wrote:
> ... I'm pretty sure that's what we do
> 
> That's interesting. When I tried to install 
> EMI-CREAM/EMI-TORQUE_UTILS(torque 2.5...) to talk to a cluster with 
> EMI-TORQUE_SERVER (torque 2.3...), the first problem was about MUNGE, 

Ah! Not quite what we have - we've got torque 2.3 on each.  Looks like the torque libs haven't been updated since I installed it, and if I remember correctly I had problems getting YUM to install torque at the time (for unrelated reasons) - might be that 2.5 is relatively new in EPEL?  

This does, however, suggest a solution - if all else fails, just install and pin torque 2.3, which (in our experience) works.  Then, once everything can move together, update all the machines to 2.5 at once.
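For what it's worth, a rough sketch of the pinning approach (package names and exact versions are assumptions - check `yum list torque\*` on your nodes, and the versionlock plugin name varies between EL releases):

```shell
# Install the 2.3 builds explicitly rather than letting yum resolve to 2.5.
yum install torque-2.3\* torque-client-2.3\*

# Either exclude torque from future updates globally ...
echo "exclude=torque*" >> /etc/yum.conf

# ... or, if the versionlock plugin is available, lock just the
# currently installed torque versions:
yum install yum-plugin-versionlock
yum versionlock torque\*
```

The `exclude=` line is the blunter instrument (it also blocks deliberate updates until you remove it), so versionlock is probably the tidier option if the plugin is in your repos.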


> i.e. qsub would grumble about a missing socket from a missing MUNGE 
> server. When I installed MUNGE, things went to the next step - I just 
> got "End Of File" messages from qsub. I figured that was because the pbs 
> server was getting off-protocol messages, i.e. some kind of MUNGE 
> message that the server doesn't want or understand.
> 
> I'll try it again.
> 
> Steve
> 
> 
> 
> 
> Stuart Purdie wrote:
>> On 30 Sep 2011, at 13:18, Stephen Jones wrote:
>> 
>> 
>>> Hi,
>>> 
>>> If everyone is an early adopter, is anyone an early adopter? While we 
>>> ponder that, I've got something to say/ask about emi-cream and 
>>> emi-torque. Maybe an early adopter could enlighten me?
>>> 
>>> The plan was to fix a new emi-cream onto our existing (separate) 
>>> glite_TORQUE_server cluster. But I discovered that emi-torque (server 
>>> and utils) is built to use MUNGE to safely transmit the login details. 
>>> This is a new thing.
>>> 
>>> Unfortunately, due to MUNGE, the emi-cream can't qsub from a system 
>>> using emi-torque-utils to a batch cluster headnode that uses 
>>> glite_TORQUE_server (job array syntax is also new). This would mean 
>>> that, by design, emi-cream cannot work with a standalone 
>>> glite_TORQUE_server cluster -- the whole lot has to be updated at once.
>>> 
>> 
>> 
>> 
>>> This could deserve a GGUS ticket, but I'm not sure of the facts -- is it 
>>> possible to run emi-cream/emi-torque-utils with an existing 
>>> glite_TORQUE_server?
>>> 
>> 
>> ... I'm pretty sure that's what we do
>> 
>> Ok, strictly, no - we're using what looks like the SL5 native Torque, probably inherited from CentOS, not the gLite one.
>> 
>> 
>>> Should it be possible to qsub from a 
>>> system using emi-torque-utils to a batch cluster headnode that uses 
>>> glite_TORQUE_server?
>>> 
>> 
>> I've been able to qsub, and I can assure you that we're definitely using ruserok (and not munge) on our torque head node, and on the emi-torque-utils node I submitted from.
>> 
>> What error message are you actually seeing? It could be down to different authentication keys in the different builds of torque.
>> 
> 
> 
> -- 
> Steve Jones                             [log in to unmask]
> System Administrator                    office: 220
> High Energy Physics Division            tel (int): 42334
> Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 2334
> University of Liverpool                 http://www.liv.ac.uk/physics/hep/