Hi Elena

I don't know about the setup at your end, but some of the ops jobs are getting mapped to the dteam queue:

Destination: lcgce1.shef.ac.uk:8443/cream-pbs-dteam 

Is it intentional?
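
For anyone wanting to spot this kind of mapping automatically, here is a minimal Python sketch that pulls the queue out of the "Destination:" line of glite-wms-job-status output and compares it with the VO. It assumes the destination has the form host:port/cream-<lrms>-<queue>, as in the line above, and that a site names its queues after the VO; both are assumptions and may not hold everywhere. The function names are made up for illustration.

```python
import re

# Matches lines such as "Destination: lcgce1.shef.ac.uk:8443/cream-pbs-dteam".
# Assumed layout: host:port/cream-<lrms>-<queue>
DEST_RE = re.compile(
    r"^Destination:\s*(?P<host>[^:\s]+):(?P<port>\d+)"
    r"/cream-(?P<lrms>[^-\s]+)-(?P<queue>\S+)\s*$",
    re.MULTILINE,
)


def parse_destination(status_text):
    """Return host, port, lrms and queue from the status text, or None."""
    match = DEST_RE.search(status_text)
    if match is None:
        return None
    return match.groupdict()


def queue_matches_vo(status_text, vo):
    """True only if a Destination line is found and its queue equals the VO name."""
    dest = parse_destination(status_text)
    return dest is not None and dest["queue"] == vo


sample = "Current Status: Aborted \nDestination: lcgce1.shef.ac.uk:8443/cream-pbs-dteam \n"
print(parse_destination(sample))
print(queue_matches_vo(sample, "ops"))
```

With the destination above, the check against "ops" fails because the queue part is "dteam", which is the mismatch in question.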

Cheers
Kashif
-----Original Message-----
From: Elena Korolkova [mailto:[log in to unmask]] 
Sent: 05 October 2012 14:28
To: Kashif Mohammad
Cc: Testbed Support for GridPP member institutes
Subject: Re: problem with nagios tests

Hi Kashif 

Nagios tests are failing again. This time the job was submitted from the IC WMS:

======================= glite-wms-job-status Success =====================
BOOKKEEPING INFORMATION:

Status info for the Job : https://wmslb02.grid.hep.ph.ic.ac.uk:9000/F0zgSOstWGnHy9rS72qCNg
Current Status: Aborted 
Logged Reason(s):
- pbs_reason=1; SetLoggingJob(https://wmslb02.grid.hep.ph.ic.ac.uk:9000/F0zgSOstWGnHy9rS72qCNg,UI=000000:NS=0000000004:WM=000005:BH=0000000000:JSS=000002:LM=000002:LRMS=000000:APP=000000:LBS=000000): No such file or directory (No credentials found.) SetLoggingJob(https://wmslb02.grid.hep.ph.ic.ac.uk:9000/F0zgSOstWGnHy9rS72qCNg,UI=000000:NS=0000000004:WM=000005:BH=0000000000:JSS=000002:LM=000002:LRMS=000000:APP=000000:LBS=000000): No such file or directory (No credentials found.) Cannot move ISB (retry_copy ${globus_transfer_cmd} gsiftp://wms02.grid.hep.ph.ic.ac.uk:2811/var/SandboxDir/F0/https_3a_2f_2fwmslb02.grid.hep.ph.ic.ac.uk_3a9000_2fF0zgSOstWGnHy9rS72qCNg/input/nagrun.sh file:///grid/home/sgmops14/home_cream_563793770/CREAM563793770/nagrun.sh): Problem to detect the lifetime of the proxy
- pbs_reason=1; SetLoggingJob(https://wmslb02.grid.hep.ph.ic.ac.uk:9000/F0zgSOstWGnHy9rS72qCNg,UI=000000:NS=0000000004:WM=000011:BH=0000000000:JSS=000004:LM=000014:LRMS=000000:APP=000000:LBS=000000): No such file or directory (No credentials found.) SetLoggingJob(https://wmslb02.grid.hep.ph.ic.ac.uk:9000/F0zgSOstWGnHy9rS72qCNg,UI=000000:NS=0000000004:WM=000011:BH=0000000000:JSS=000004:LM=000014:LRMS=000000:APP=000000:LBS=000000): No such file or directory (No credentials found.) Cannot move ISB (retry_copy ${globus_transfer_cmd} gsiftp://wms02.grid.hep.ph.ic.ac.uk:2811/var/SandboxDir/F0/https_3a_2f_2fwmslb02.grid.hep.ph.ic.ac.uk_3a9000_2fF0zgSOstWGnHy9rS72qCNg/input/nagrun.sh file:///grid/home/sgmops14/home_cream_367761561/CREAM367761561/nagrun.sh): Problem to detect the lifetime of the proxy
Status Reason: hit job shallow retry count (1)
Destination: lcgce1.shef.ac.uk:8443/cream-pbs-dteam
Submitted: Fri Oct 5 13:30:18 2012 BST

Any idea what I should fix on my side?

Many thanks
Elena



On 5 Oct 2012, at 13:25, Elena Korolkova wrote:

> Thank you very much, Kashif.
> 
> Indeed, we are passing the tests now. I'll keep watching.
> 
> Elena
> 
> On 5 Oct 2012, at 13:06, Kashif Mohammad wrote:
> 
>> Hi Elena
>> 
>> The problem seems to be caused by lcgwms02 at RAL, probably because of high load, as one of their WMSes is in downtime. I have removed lcgwms02 from the Nagios list of WMSes, and lcgce1.shef.ac.uk is passing the Nagios tests now.
>> 
>> Cheers
>> Kashif
>> 
>> -----Original Message-----
>> From: Testbed Support for GridPP member institutes [mailto:[log in to unmask]] On Behalf Of Elena Korolkova
>> Sent: 05 October 2012 10:56
>> To: [log in to unmask]
>> Subject: problem with nagios tests
>> 
>> Hello
>> 
>> We had a power cut yesterday. We brought the servers back online, but we had a problem with job submission. We fixed that problem this morning and ATLAS jobs are running (HC test jobs are successful), but we still have a problem with the Nagios tests. The
>> 
>> From https://gridppnagios.lancs.ac.uk/nagios/ 
>> Status info for the Job : https://svr024.gla.scotgrid.ac.uk:9000/qvcx8RrhYs6Wx1NQvUCaQA
>> Current Status: Aborted 
>> Logged Reason(s):
>> - Cannot move ISB (retry_copy ${globus_transfer_cmd} gsiftp://svr023.gla.scotgrid.ac.uk:2811/var/SandboxDir/qv/https_3a_2f_2fsvr024.gla.scotgrid.ac.uk_3a9000_2fqvcx8RrhYs6Wx1NQvUCaQA/input/nagrun.sh file:///grid/home/sgmops14/home_cream_612631604/CREAM612631604/nagrun.sh): proxy expired; SetLoggingJob(https://svr024.gla.scotgrid.ac.uk:9000/qvcx8RrhYs6Wx1NQvUCaQA,UI=000000:NS=0000000004:WM=000005:BH=0000000000:JSS=000002:LM=000002:LRMS=000000:APP=000000:LBS=000000): No such file or directory (No credentials found.) SetLoggingJob(https://svr024.gla.scotgrid.ac.uk:9000/qvcx8RrhYs6Wx1NQvUCaQA,UI=000000:NS=0000000004:WM=000005:BH=0000000000:JSS=000002:LM=000002:LRMS=000000:APP=000000:LBS=000000): No such file or directory (No credentials found.) Cannot move ISB (retry_copy ${globus_transfer_cmd} gsiftp://svr023.gla.scotgrid.ac.uk:2811/var/SandboxDir/qv/https_3a_2f_2fsvr024.gla.scotgrid.ac.uk_3a9000_2fqvcx8RrhYs6Wx1NQvUCaQA/input/nagrun.sh file:///grid/home/sgmops14/home_cream_612631604/CREAM612631604/nagrun.sh): proxy expired
>> - Cannot move ISB (retry_copy ${globus_transfer_cmd} 
>> .......................
>> Status Reason: hit job shallow retry count (1)
>> Destination: lcgce1.shef.ac.uk:8443/cream-pbs-dteam
>> Submitted: Fri Oct 5 09:32:39 2012 BST
>> 
>> We appreciate your help very much.
>> 
>> Elena
>> __________________________________________________
>> Dr Elena Korolkova
>> Email: [log in to unmask]
>> Tel.:  +44 (0)114 2223553
>> Fax:   +44 (0)114 2223555
>> Department of Physics and Astronomy
>> University of Sheffield
>> Sheffield, S3 7RH, United Kingdom
> 

__________________________________________________
Dr Elena Korolkova
Email: [log in to unmask]
Tel.:  +44 (0)114 2223553
Fax:   +44 (0)114 2223555
Department of Physics and Astronomy
University of Sheffield
Sheffield, S3 7RH, United Kingdom