Debugging off-list ...
On Thu, 2 Dec 2010, Nilsen Dimitri wrote:
> It is a CREAM-CE, and the strange thing is that it seems to affect only
> one of our 3 CREAMs. Also, not all jobs stay in "Running" forever; some of
> them completed fine.
> The status at CREAM is also done, and the "done" timestamps at LB and CREAM
> seem to be the same. But if we look at the date of the entry in the LB
> database, it is an hour earlier. Is it some UTC thing?
> CREAM and log output: 15:20:05
> LB database: 14:20:05
> (see the logs in my first mail)
>
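A quick way to check the UTC question is to convert both timestamps to the same zone. The sketch below is only an illustration: it assumes the DATE field of the LB record is stored in UTC and that the CREAM log is written in local time (CET, modelled here as a fixed UTC+1 offset, which holds for 1 Dec 2010). The helper names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CET as a fixed UTC+1 offset (valid in December, no DST).
CET = timezone(timedelta(hours=1), "CET")

def parse_lb_date(value):
    """Parse an LB 'DATE=' value such as 20101201142005.522378, assumed UTC."""
    return datetime.strptime(value, "%Y%m%d%H%M%S.%f").replace(tzinfo=timezone.utc)

def parse_cream_log_time(value, tz=CET):
    """Parse a glite-ce-cream.log timestamp, assumed to be local (CET) time."""
    return datetime.strptime(value, "%d %b %Y %H:%M:%S,%f").replace(tzinfo=tz)

lb_time = parse_lb_date("20101201142005.522378")
cream_time = parse_cream_log_time("01 Dec 2010 15:20:05,606")

# The LB timestamp expressed in CET
print(lb_time.astimezone(CET).strftime("%H:%M:%S"))    # 15:20:05
# Do the two events fall within one second of each other?
print(abs(cream_time - lb_time) < timedelta(seconds=1))  # True
```

If those assumptions hold, the one-hour gap is just the UTC/CET offset, and the two records describe the same moment.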
> CREAM-log:
> /opt/glite/var/log/glite-ce-cream.log.1:01 Dec 2010 15:20:05,606 INFO
> org.glite.ce.creamapi.jobmanagement.cmdexecutor.AbstractJobExecutor
> (AbstractJobExecutor.java:2094) - (Worker Thread 19) JOB CREAM215908827
> STATUS CHANGED: REALLY-RUNNING => DONE-OK [failureReason=reason=0]
> [localUser=dcms063]
> [gridJobId=https://lb-1-fzk.gridka.de:9000/HToevj_pLkQDZcEKfXQQCw]
> [lrmsJobId=982499] [workerNode=c01-016-117]
> [delegationId=12910455052E687689wms2D12Dfzk2Egridka2Ede]
>
> I read the documentation at http://goc.grid.sinica.edu.tw/gocwiki/Jobs_sent_to_some_CE_stay_in_Running_state_forever
> but I don't think there are any background processes. The job was simple, just:
> #!/bin/bash
> /bin/hostname
> /usr/bin/id
> date
>
> What I don't understand: in glite-wms-job-status the job is identified by
> a JobID that references the LB
> (https://lb-1-fzk.gridka.de:9000/D4Fhkep6xv-fLQI_aEV1-w). So
> glite-wms-job-status connects to the LB to check the status, right? At the
> LB the job is marked as "Done" in the database, so why does it show
> "Running"?
>
> Regards
> Dimitri
>
>
>
>
> On 12/01/2010 08:01 PM, Massimo Sgaravatto - INFN Padova wrote:
>> Was this submitted to a CREAM-CE or an LCG-CE?
>>
>> In the former case, what is the status of that job wrt CREAM? (You can
>> find this info in the glite-ce-cream.log.)
>>
>> Cheers, Massimo
>>
>>
>> On Wed, 1 Dec 2010, Nilsen Dimitri wrote:
>>
>>> Hi
>>>
>>> We observe that many jobs stay in the Running state forever, even
>>> though the job finished successfully and its output was copied back to
>>> the WMS. What could be the reason?
>>>
>>> example:
>>> gridka24 $ glite-wms-job-status
>>> https://lb-1-fzk.gridka.de:9000/HToevj_pLkQDZcEKfXQQCw
>>> ...Current Status: Running...
>>>
>>>
>>> but:
>>> gridka24 $ glite-wms-job-logging-info -v 2
>>> https://lb-1-fzk.gridka.de:9000/HToevj_pLkQDZcEKfXQQCw
>>> Event: Done
>>> - Arrived = Wed Dec 1 15:20:05 2010 CET
>>> - Exit code = 0
>>> - Host = c01-016-117.gridka.de
>>> - Reason = job completed
>>>
>>> @LB mysql db:
>>> | HToevj_pLkQDZcEKfXQQCw | 14 | DG.LLLID=2430000
>>> DG.USER="/O=GermanGrid/OU=Uni Karlsruhe/CN=Andreas Oehler"
>>> DATE=20101201142005.522378 HOST="c01-016-117.gridka.de" PROG=edg-wms
>>> LVL=SYSTEM DG.PRIORITY=4 DG.SOURCE="LRMS" DG.SRC_INSTANCE=""
>>> DG.EVNT="Done"
>>> DG.JOBID="https://lb-1-fzk.gridka.de:9000/HToevj_pLkQDZcEKfXQQCw"
>>> DG.SEQCODE="UI=000000:NS=0000000004:WM=000004:BH=0000000000:JSS=000002:LM=000002:LRMS=000005:APP=000000:LBS=000000"
>>>
>>> DG.DONE.STATUS_CODE="OK" DG.DONE.REASON="job completed"
>>> DG.DONE.EXIT_CODE="0"
>>>
>>> @WMS:
>>> # cat
>>> /var/glite/SandboxDir/HT/https_3a_2f_2flb-1-fzk.gridka.de_3a9000_2fHToevj_5fpLkQDZcEKfXQQCw/output/gc.stdout
>>>
>>> <some output, job completed correctly>
>>>
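The sandbox directory name above looks like the job ID with some characters written as an underscore followed by two hex digits (`_3a` for `:`, `_2f` for `/`, `_5f` for `_`). A minimal sketch for mapping such a directory name back to the job ID, assuming that rule, which is inferred only from the single path quoted in this thread; the function name is made up for this example.

```python
import re

def sandbox_name_to_jobid(name):
    """Decode '_xx' hex escapes (an assumed encoding) back into characters."""
    return re.sub(r"_([0-9a-f]{2})", lambda m: chr(int(m.group(1), 16)), name)

name = "https_3a_2f_2flb-1-fzk.gridka.de_3a9000_2fHToevj_5fpLkQDZcEKfXQQCw"
print(sandbox_name_to_jobid(name))
# https://lb-1-fzk.gridka.de:9000/HToevj_pLkQDZcEKfXQQCw
```

This makes it easier to match a directory under SandboxDir with the job ID reported by glite-wms-job-status.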
>>> LB and WMS run on different hosts, and both have the latest updates.
>>> I tried restarting the interlogd processes, with no effect.
>>>
>>> Regards
>>> Dimitri
>>>
>>> --
>>> Dimitri Nilsen, Dipl.-Ing(FH)
>>>
>>> Karlsruhe Institute of Technology (KIT)
>>> Steinbuch Centre for Computing
>>> Postfach 3640
>>> 76344 Eggenstein-Leopoldshafen, Germany
>>>
>>> Tel.: +49 7247 82-8607
>>> Fax.: +49 7247 82-4972
>>> Email: [log in to unmask]
>>>
>>
>> \|||/
>> -----------0oo----( o o )----oo0-------------------
>> (_)
>> INFN Sezione di Padova
>> Via Marzolo, 8
>> 35131 Padova - Italy E-mail: massimo.sgaravatto [at] pd.infn.it
>> Tel: ++39 0498275908 Skype: massimo.sgaravatto
>> Fax: ++39 0498275952
\|||/
-----------0oo----( o o )----oo0-------------------
(_)
INFN Sezione di Padova
Via Marzolo, 8
35131 Padova - Italy E-mail: massimo.sgaravatto [at] pd.infn.it
Tel: ++39 0498275908 Skype: massimo.sgaravatto
Fax: ++39 0498275952