Dear Ilja,

The output of edg-job-submit I am getting is as follows:

LOGGING INFORMATION:

Printing info for the Job : https://pcncp24.ncp.edu.pk:9000/V_vK6voweHl3stwItI9gbw

--- Event: RegJob - source = UserInterface - timestamp = Mon Aug 27 10:09:48 2007
--- Event: Transfer - destination = NetworkServer - result = START - source = UserInterface - timestamp = Mon Aug 27 10:09:50 2007
--- Event: Transfer - destination = NetworkServer - result = OK - source = UserInterface - timestamp = Mon Aug 27 10:09:54 2007
--- Event: Accepted - source = NetworkServer - timestamp = Mon Aug 27 10:09:52 2007
--- Event: EnQueued - result = OK - source = NetworkServer - timestamp = Mon Aug 27 10:09:54 2007
--- Event: DeQueued - source = WorkloadManager - timestamp = Mon Aug 27 10:09:55 2007
--- Event: Match - dest_id = pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam - source = WorkloadManager - timestamp = Mon Aug 27 10:10:00 2007
--- Event: EnQueued - result = START - source = WorkloadManager - timestamp = Mon Aug 27 10:10:01 2007
--- Event: EnQueued - result = OK - source = WorkloadManager - timestamp = Mon Aug 27 10:10:02 2007
--- Event: DeQueued - source = JobController - timestamp = Mon Aug 27 10:10:04 2007
--- Event: Transfer - destination = LogMonitor - result = START - source = JobController - timestamp = Mon Aug 27 10:10:05 2007
--- Event: Transfer - destination = LogMonitor - result = OK - source = JobController - timestamp = Mon Aug 27 10:10:06 2007
--- Event: Accepted - source = LogMonitor - timestamp = Mon Aug 27 10:10:14 2007
--- Event: Transfer - destination = LRMS - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:10:26 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:13:57 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:14:09 2007
--- Event: Resubmission - result = WILLRESUB - source = LogMonitor - timestamp = Mon Aug 27 10:14:10 2007
--- Event: EnQueued - result = START - source = LogMonitor - timestamp = Mon Aug 27 10:14:12 2007
--- Event: EnQueued - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:14:13 2007
--- Event: DeQueued - source = WorkloadManager - timestamp = Mon Aug 27 10:14:14 2007
--- Event: Match - dest_id = pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam - source = WorkloadManager - timestamp = Mon Aug 27 10:14:19 2007
--- Event: EnQueued - result = START - source = WorkloadManager - timestamp = Mon Aug 27 10:14:21 2007
--- Event: EnQueued - result = OK - source = WorkloadManager - timestamp = Mon Aug 27 10:14:22 2007
--- Event: DeQueued - source = JobController - timestamp = Mon Aug 27 10:14:23 2007
--- Event: Transfer - destination = LogMonitor - result = START - source = JobController - timestamp = Mon Aug 27 10:14:25 2007
--- Event: Transfer - destination = LogMonitor - result = OK - source = JobController - timestamp = Mon Aug 27 10:14:26 2007
--- Event: Accepted - source = LogMonitor - timestamp = Mon Aug 27 10:14:36 2007
--- Event: Transfer - destination = LRMS - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:14:48 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:17:23 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:17:35 2007
--- Event: Resubmission - result = WILLRESUB - source = LogMonitor - timestamp = Mon Aug 27 10:17:36 2007
--- Event: EnQueued - result = START - source = LogMonitor - timestamp = Mon Aug 27 10:17:38 2007
--- Event: EnQueued - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:17:39 2007
--- Event: DeQueued - source = WorkloadManager - timestamp = Mon Aug 27 10:17:40 2007
--- Event: Match - dest_id = pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam - source = WorkloadManager - timestamp = Mon Aug 27 10:17:46 2007
--- Event: EnQueued - result = START - source = WorkloadManager - timestamp = Mon Aug 27 10:17:47 2007
--- Event: EnQueued - result = OK - source = WorkloadManager - timestamp = Mon Aug 27 10:17:48 2007
--- Event: DeQueued - source = JobController - timestamp = Mon Aug 27 10:17:49 2007
--- Event: Transfer - destination = LogMonitor - result = START - source = JobController - timestamp = Mon Aug 27 10:17:50 2007
--- Event: Transfer - destination = LogMonitor - result = OK - source = JobController - timestamp = Mon Aug 27 10:17:51 2007
--- Event: Accepted - source = LogMonitor - timestamp = Mon Aug 27 10:18:02 2007
--- Event: Transfer - destination = LRMS - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:18:14 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:21:44 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:21:56 2007
--- Event: Resubmission - result = WILLRESUB - source = LogMonitor - timestamp = Mon Aug 27 10:21:57 2007
--- Event: EnQueued - result = START - source = LogMonitor - timestamp = Mon Aug 27 10:21:59 2007
--- Event: EnQueued - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:22:00 2007
--- Event: DeQueued - source = WorkloadManager - timestamp = Mon Aug 27 10:22:01 2007
--- Event: Match - dest_id = pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam - source = WorkloadManager - timestamp = Mon Aug 27 10:22:07 2007
--- Event: EnQueued - result = START - source = WorkloadManager - timestamp = Mon Aug 27 10:22:08 2007
--- Event: EnQueued - result = OK - source = WorkloadManager - timestamp = Mon Aug 27 10:22:09 2007
--- Event: DeQueued - source = JobController - timestamp = Mon Aug 27 10:22:11 2007
--- Event: Transfer - destination = LogMonitor - result = START - source = JobController - timestamp = Mon Aug 27 10:22:12 2007
--- Event: Transfer - destination = LogMonitor - result = OK - source = JobController - timestamp = Mon Aug 27 10:22:13 2007
--- Event: Accepted - source = LogMonitor - timestamp = Mon Aug 27 10:22:23 2007
--- Event: Transfer - destination = LRMS - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:22:35 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:25:10 2007
--- Event: Done - source = LogMonitor - timestamp = Mon Aug 27 10:25:22 2007
--- Event: Resubmission - result = WILLRESUB - source = LogMonitor - timestamp = Mon Aug 27 10:25:23 2007
--- Event: EnQueued - result = START - source = LogMonitor - timestamp = Mon Aug 27 10:25:25 2007
--- Event: EnQueued - result = OK - source = LogMonitor - timestamp = Mon Aug 27 10:25:26 2007
--- Event: DeQueued - source = WorkloadManager - timestamp = Mon Aug 27 10:25:27 2007
--- Event: Abort - source = WorkloadManager - timestamp = Mon Aug 27 10:25:28 2007
**********************************************************************

[pcncp21] ~ > edg-job-status https://pcncp24.ncp.edu.pk:9000/V_vK6voweHl3stwItI9gbw

*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job : https://pcncp24.ncp.edu.pk:9000/V_vK6voweHl3stwItI9gbw
Current Status: Aborted
Status Reason: Job RetryCount (3) hit
Destination: pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam
reached on: Mon Aug 27 10:25:28 2007
*************************************************************

-- Best Regards --
Adeel

-----Original Message-----
From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Ilja Livenson
Sent: Monday, August 27, 2007 10:50 AM
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] Job Submission Failure

Well, in case of LCG-CE I think it's best to try running job with globus-job-run. Perhaps you could post output of running it?

atb,
Ilja

Adeel-ur-Rehman wrote:
> Dear Ilja,
>
> Sorry for the confusion. By globus-job-run, I mean to say edg-job-submit!
> Yes I'm talking about LCG-CE.
>
> -- Best Regards --
> Adeel
>
> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]]
> On Behalf Of Ilja Livenson
> Sent: Saturday, August 25, 2007 8:29 PM
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] Job Submission Failure
>
> Hi,
>
> are you sure you are talking about globus-job-run?
> It doesn't resubmit jobs, afaik, hence doesn't fail with the HitCount error.
>
> Ilja
>
> PS. You are talking about LCG CE, not gLite, right?
>
> Adeel-ur-Rehman wrote:
>
>> Dear All,
>>
>> At our site, since I upgraded it to the latest update of gLite 3.1, no
>> jobs are executing rather I am getting job submission failures. Reading the
>> details of the error, it states "Got a job held event, reason: Unspecified
>> gridmanager error". I can qsub test jobs, but globus-job-run Aborts the job
>> after Retrying HitCount 3 times.
>>
>> And there is no offending ssh key problems between our CE and WNs.
>>
>> Any ideas??
>>
>> -- Best Regards --
>> Adeel-ur-Rehman