Hi Elena,
collect all the information you can: the DN of the user running those
jobs, whether they are trying to connect to a remote machine, the wall
time used... then kill the jobs and open a ticket for biomed
([log in to unmask]) explaining what you have done and why.
cheers
alessandra
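
[The steps above can be sketched as a small shell snippet. This is a
minimal sketch, assuming a Torque/PBS batch system (qstat/qdel) as was
common on LCG sites; the job ID, gridmapdir path, and DN below are
placeholders, not site facts:]

```shell
#!/bin/sh
# Sketch: collect evidence on a suspect job, then summarise it for the
# biomed ticket.  Assumes Torque/PBS (qstat/qdel); the paths and the
# example DN are illustrative only.

# One line per job for the ticket body: DN, wall time, CPU time, host.
summarize_job() {
    # $1=DN  $2=wall time used  $3=CPU time used  $4=worker node
    printf 'DN=%s wall=%s cpu=%s host=%s\n' "$1" "$2" "$3" "$4"
}

# Typical evidence gathering, commented out because it needs a live WN:
#   qstat -f "$JOBID"                     # owner, walltime, exec host
#   ls -l /etc/grid-security/gridmapdir   # pool account -> DN mapping
#   netstat -tnp | grep "$JOBPID"         # open remote connections?
#   qdel "$JOBID"                         # kill once evidence is saved

summarize_job "/C=UK/O=eScience/CN=example user" "50:12:00" "00:00:03" "wn042"
```

[Keeping the per-job summary as one line makes it easy to paste the
whole batch into a single ticket.]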
Elena Korolkova wrote:
> Our cluster in Sheffield is also filled with biomed jobs:
> half of them are running fine, but the other half hasn't consumed any
> CPU time in more than 50 hours.
>
> Elena
>
> On Mon, 4 Feb 2008, Andrew Elwell wrote:
>
>>> Our cluster is currently almost filled with biomed jobs that don't seem
>>> to be consuming any CPU time. Looking at the WNs, there are programs
>>> running with names such as "job_pull.sh", but little real work is
>>> being done. These look very much like pilot jobs to me, and I'd prefer
>>> not to have them blocking slots that could be used by others. (At the
>>> very least I'd expect them to vacate their slots after a few minutes
>>> without work.) Has anyone else seen this?
>>>
>>
>> Not for a while - all our biomed jobs at the moment look fairly
>> efficient (I've not had a rummage on the worker nodes themselves to
>> see what's up).
>>
>> On Saturday they weren't so good, though - lower efficiencies on the plots.
>>
>> A
>
>
> ____________________________________________________________________________
>
> Dr Elena Korolkova
> Email: [log in to unmask]
> Tel.: +44 (0)114 2223553
> Fax: +44 (0)114 2223555
> Department of Physics and Astronomy
> University of Sheffield
> Sheffield, S3 7RH, United Kingdom
>
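
[The symptom reported in the thread - jobs alive for 50+ hours of wall
time with essentially no CPU time - can be spotted on a worker node by
comparing the two. A minimal sketch, assuming procps-style `ps` time
fields (`[[dd-]hh:]mm:ss`); the 1% threshold is an arbitrary assumption
to tune per site:]

```shell
#!/bin/sh
# Sketch: flag processes that have burnt almost no CPU relative to
# their age, like the idle job_pull.sh jobs described above.

# Convert a ps ELAPSED/TIME field (mm:ss, hh:mm:ss, or dd-hh:mm:ss)
# to seconds.
to_secs() {
    echo "$1" | awk -F'[-:]' '{
        s = 0; d = 0; i = 1
        if (NF == 4) { d = $1; i = 2 }   # leading "dd-" days field
        for (; i <= NF; i++) s = s * 60 + $i
        print d * 86400 + s
    }'
}

# A job older than an hour whose CPU time is under 1% of its elapsed
# time counts as stuck (threshold is an assumption, not a standard).
is_stuck() {
    wall=$(to_secs "$1"); cpu=$(to_secs "$2")
    [ "$wall" -gt 3600 ] && [ $((cpu * 100)) -lt "$wall" ]
}

# Typical use on a WN (commented out; needs live processes):
#   ps -eo pid,etime,time,comm | awk 'NR>1' | while read pid et ct cmd; do
#       is_stuck "$et" "$ct" && echo "suspect: $pid $cmd wall=$et cpu=$ct"
#   done

is_stuck "50:00:00" "00:00:30" && echo "stuck"   # prints "stuck"
```

[Listing the suspect PIDs this way gives concrete numbers to quote in
the ticket before the jobs are killed.]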