Dear all,

Our CE (pcncp04.ncp.edu.pk), running SLC 4.6, has been behaving strangely since this morning. All incoming jobs start in the "R" state, but then their status suddenly changes from "R" to "E". The SAM test complains about the JobWrapper, as shown below:

*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://wms209.cern.ch:9000/eu799lmLzB8ktB9Ykx-HjQ
Current Status: Aborted
Logged Reason(s):
- File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
Status Reason: hit job shallow retry count (1)
Destination: pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-ops
Submitted: Mon Aug 24 12:14:04 2009 CEST
***********************************************************************

The CE's logs, however, show the following for a particular job:

*********************************************************************
Aug 24 11:30:09 pcncp04 sshd[24884]: Accepted hostbased for prdcms35 from 172.16.14.54 port 59869 ssh2
Aug 24 17:30:09 pcncp04 sshd[24883]: Accepted hostbased for prdcms35 from 172.16.14.54 port 59869 ssh2
Aug 24 17:30:09 pcncp04 sshd(pam_unix)[24885]: session opened for user prdcms35 by (uid=0)
Aug 24 17:30:09 pcncp04 sshd[24885]: User prdcms35 attempting to execute command 'scp -r -p -f /home/prdcms35/.lcgjm/globus-cache-export.r24829/globus-cache-export.r24829.gpg' on command line
Aug 24 17:30:09 pcncp04 sshd(pam_unix)[24885]: session closed for user prdcms35
**********************************************************************

On the other hand, when I submitted a job from the cic-samadmin portal, SAM reported a CE-sft-lcg-rm-cr failure:

*********************************************************************
Checking lcg-cr command
Netork timeout on LFC: LFC_CONNTIMEOUT=10 LFC_CONRETRY=1 LFC_CONRETRYINT=2
Network and search timeouts on BDII set for lcg-utils: LCG_GFAL_BDII_TIMEOUT=20
SE timeouts in sec: connect 10, send/receive 120, SRM 180
Using lcg-utils version:
+ lcg-cp --version
lcg_util-1.7.4-1
GFAL-client-1.11.6-2
+ set +x
Create a local file: sft-lcg-rm-cr.txt
Move the file to the default SE (pcncp22.ncp.edu.pk) and register it with the LFN: sft-lcg-rm-cr-wn46.ncp.edu.pk.090824075522.936461
++ pwd
+ lcg-cr --connect-timeout 10 --sendreceive-timeout 120 --bdii-timeout 20 --srm-timeout 180 -v --vo ops -d pcncp22.ncp.edu.pk -l lfn:sft-lcg-rm-cr-wn46.ncp.edu.pk.090824075522.936461 file:///home/sgmops03/globus-tmp.wn46.20102.0/https_3a_2f_2fglite-rb-01.cnaf.infn.it_3a9000_2fD1EMPLZtjdd1KJMz7MrN-g/work/testjob/nodes/pcncp04.ncp.edu.pk/sft-lcg-rm-cr.txt
Using grid catalog type: lfc
Using grid catalog : prod-lfc-shared-central.cern.ch
Checksum type: None
SE type: SRMv2
Destination SURL : srm://pcncp22.ncp.edu.pk/dpm/ncp.edu.pk/home/ops/generated/2009-08-24/file81329b5d-d306-4957-b79a-9225d881d615
Source SRM Request Token: 8cf31d51-c6bc-44c3-800e-af7c983b600b
Source URL: file:/home/sgmops03/globus-tmp.wn46.20102.0/https_3a_2f_2fglite-rb-01.cnaf.infn.it_3a9000_2fD1EMPLZtjdd1KJMz7MrN-g/work/testjob/nodes/pcncp04.ncp.edu.pk/sft-lcg-rm-cr.txt
File size: 228
VO name: ops
Destination specified: pcncp22.ncp.edu.pk
Destination URL for copy: gsiftp://pcncp22.ncp.edu.pk/pcncp22.ncp.edu.pk:/storage1/ops/2009-08-24/file81329b5d-d306-4957-b79a-9225d881d615.135061.0
# streams: 1
228 bytes 1.12 KB/sec avg 1.12 KB/sec inst
Transfer took 1000 ms
send2nsd: NS002 - send error : Bad credentials
send2nsd: NS002 - send error : Bad credentials
[LFC][lfc_statg][] prod-lfc-shared-central.cern.ch: lfn:/grid/ops/SAM/sft-lcg-rm-cr-wn46.ncp.edu.pk.090824075522.936461: Bad credentials
send2nsd: NS002 - send error : Bad credentials
srm://pcncp22.ncp.edu.pk/dpm/ncp.edu.pk/home/ops/generated/2009-08-24/file81329b5d-d306-4957-b79a-9225d881d615: Registration failed, please register it by hand, when the problem will be solved
guid:43a028fd-c6a3-457d-ac5c-8093def0c6bb
lcg_cr: Communication error on send
+ result=1
+ set +x
List the replicas:
+ lcg-lr --vo ops
lfn:sft-lcg-rm-cr-wn46.ncp.edu.pk.090824075522.936461
send2nsd: NS002 - send error : Bad credentials
[LFC][lfc_getreplica][] prod-lfc-shared-central.cern.ch: /grid/ops/SAM/sft-lcg-rm-cr-wn46.ncp.edu.pk.090824075522.936461: Bad credentials
lcg_lr: Communication error on send
+ set +x
************************************************************************

Does anyone have an idea what is causing this? Thanks in advance.

Regards,
FAWAD SAEED
Scientific Officer Computing
National Centre for Physics
Islamabad
Tel: +92 - 51 260 1018
Fax: +92 - 51 920 5753
Email: [log in to unmask]