Hi John,

These seem very sensible suggestions indeed ...

All the best,
david

Gordon, JC (John) wrote:
> The stageout issues take me back to the early days of LEP when CSF was
> developing (yes, OK, I am old). Nodes started another job when the
> previous one entered its output phase and was copying output across the
> network. I don't think WMS can do much to push another job. This issue
> is bigger than just staging the sandbox back to WMS, jobs often need to
> send their data elsewhere and not everyone has async solutions for this.
> The options I see are:-
>
> a) local batch system - LSF can start more jobs when CPU load drops.
> Obviously a risk if jobs stall at the start when staging in. What can
> other batch systems do?
>
> b) Pilot jobs - obviously they can know enough to start another job at
> the appropriate time but launching payloads other than serially
> introduces opportunities for interference and difficulties in cleaning
> up.
>
> John
>
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:TB-
>> [log in to unmask]] On Behalf Of Graeme Stewart
>> Sent: 19 October 2008 11:01
>> To: [log in to unmask]
>> Subject: Re: [Fwd: Jobs idling on transfers..]
>>
>> On Sun, Oct 19, 2008 at 11:04 AM, Coles, J (Jeremy)
>> <[log in to unmask]> wrote:
>>> Hi Graeme
>>>
>>>>> Which VO are the jobs running under?
>>>> Unless I'm mistaken Kostas has pulled out code from the RB/WMS job
>>>> epilogue wrapper. So the VO is not really relevant.
>>> I think it is relevant from a user education standpoint, rather than
>>> simply one of catching inefficient jobs at the batch system.
>> No it's not. If it's user education that would be teaching them "don't
>> use the WMS, it's rubbish and it can't get your job outputs back to
>> you..."
>>
>> :-)
>>
>> Graeme
>>
>> --
>> Dr Graeme Stewart http://www.physics.gla.ac.uk/~graeme/
>> Department of Physics and Astronomy, University of Glasgow, Scotland