Hi
it's not new ... we see this all the time here.
JT
On Nov 8, 2012, at 17:24, Stephen Jones wrote:
> On 11/08/2012 03:23 PM, Andreas Gellrich wrote:
>> Hi *,
>> We migrated our CREAM-CEs to EMI-2/SL58, most recent version.
>>
>> We noticed that the CREAM-CEs sometimes resubmit jobs which have already finished.
>>
>> This is visible because the worker node that receives a resubmitted job cannot download (via gridftp) the input sandbox, which was deleted by the CREAM-CE after the end of the (first) job. The job ends up as W (waiting) in torque and stays there forever.
>
> Hi Andreas,
>
> This looks like a familiar issue. Maybe the job comes through some WMS and winds up on a CREAM server. The CREAM server hands it to some batch server (torque). The job gets queued. The proxy expires, and CREAM removes the files. The job gets to the front of the queue and runs. The job can't load its files (e.g. proxy). The batch system (torque) puts the job in W state. The batch system holds it for a while, then tries it again. So the job just goes round and round. Please see https://ggus.eu/tech/ticket_show.php?ticket=72506
>
> But it is odd that you have not seen it before. Maybe it's something new.
>
>
> Steve
>
>
>> There are also hints that jobs are resubmitted although the first submission
>> is still running.
>>
>> This showed up immediately after we put the EMI-2-CREAM-CE into operation. We do not see and have never seen this with the last remaining glite-CREAM-CE.
>>
>> Any ideas or similar observations?
>>
>> Thanx
>> Andreas
>>
>> # Andreas Gellrich
>> # DESY IT / Grid Computing
>> # 2b/317, Notkestr. 85, D-22607 Hamburg, +49 40 8998 2732
>
>
> --
> Steve Jones
> System Administrator office: 220
> High Energy Physics Division tel (int): 42334
> Oliver Lodge Laboratory tel (ext): +44 (0)151 794 2334
> University of Liverpool http://www.liv.ac.uk/physics/hep/
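
For anyone trying to follow the loop Steve describes (job queued, proxy expires, CREAM removes the files, job starts and lands in W), here is a minimal sketch of that sequence in Python. It is only a toy model under stated assumptions, not how CREAM or torque are actually implemented: the function name, the hour-based parameters and the retry cap are all hypothetical, and a real batch system would keep retrying rather than stop after a fixed number of attempts.

```python
def simulate_job(proxy_lifetime_h, queue_wait_h, max_retries=3):
    """Return the list of batch states a job passes through.

    Toy model (assumption): if the job waits in the queue longer than
    its proxy lives, the CE has already cleaned up the sandbox by the
    time the job reaches the front of the queue.
    """
    states = ["Q"]  # job is queued on the batch server
    sandbox_present = queue_wait_h <= proxy_lifetime_h

    for _ in range(max_retries):
        if sandbox_present:
            # Files can be staged in: the job runs and completes.
            states += ["R", "C"]
            return states
        # Stage-in fails: the batch system parks the job in W
        # and tries again later, with the same result.
        states.append("W")

    return states


if __name__ == "__main__":
    # Proxy outlives the queue wait: normal run.
    print(simulate_job(proxy_lifetime_h=48, queue_wait_h=12))
    # Proxy expires while the job is still queued: stuck in W.
    print(simulate_job(proxy_lifetime_h=12, queue_wait_h=48))
```

The second call never gets past W, which matches the "goes round and round" behaviour described above. The cap on retries is only there so the sketch terminates.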