Dear Ilja,

The output I am getting for the job submitted with edg-job-submit is as follows:

LOGGING INFORMATION:

Printing info for the Job :
https://pcncp24.ncp.edu.pk:9000/V_vK6voweHl3stwItI9gbw

        ---
 Event: RegJob
- source                  =    UserInterface
- timestamp               =    Mon Aug 27 10:09:48 2007
        ---
 Event: Transfer
- destination             =    NetworkServer
- result                  =    START
- source                  =    UserInterface
- timestamp               =    Mon Aug 27 10:09:50 2007
        ---
 Event: Transfer
- destination             =    NetworkServer
- result                  =    OK
- source                  =    UserInterface
- timestamp               =    Mon Aug 27 10:09:54 2007
        ---
 Event: Accepted
- source                  =    NetworkServer
- timestamp               =    Mon Aug 27 10:09:52 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    NetworkServer
- timestamp               =    Mon Aug 27 10:09:54 2007
        ---
 Event: DeQueued
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:09:55 2007
        ---
 Event: Match
- dest_id                 =    pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:10:00 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:10:01 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:10:02 2007
        ---
 Event: DeQueued
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:10:04 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    START
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:10:05 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    OK
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:10:06 2007
        ---
 Event: Accepted
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:10:14 2007
        ---
 Event: Transfer
- destination             =    LRMS
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:10:26 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:13:57 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:14:09 2007
        ---
 Event: Resubmission
- result                  =    WILLRESUB
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:14:10 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:14:12 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:14:13 2007
        ---
 Event: DeQueued
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:14:14 2007
        ---
 Event: Match
- dest_id                 =    pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:14:19 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:14:21 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:14:22 2007
        ---
 Event: DeQueued
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:14:23 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    START
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:14:25 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    OK
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:14:26 2007
        ---
 Event: Accepted
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:14:36 2007
        ---
 Event: Transfer
- destination             =    LRMS
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:14:48 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:17:23 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:17:35 2007
        ---
 Event: Resubmission
- result                  =    WILLRESUB
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:17:36 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:17:38 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:17:39 2007
        ---
 Event: DeQueued
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:17:40 2007
        ---
 Event: Match
- dest_id                 =    pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:17:46 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:17:47 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:17:48 2007
        ---
 Event: DeQueued
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:17:49 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    START
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:17:50 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    OK
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:17:51 2007
        ---
 Event: Accepted
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:18:02 2007
        ---
 Event: Transfer
- destination             =    LRMS
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:18:14 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:21:44 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:21:56 2007
        ---
 Event: Resubmission
- result                  =    WILLRESUB
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:21:57 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:21:59 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:22:00 2007
        ---
 Event: DeQueued
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:22:01 2007
        ---
 Event: Match
- dest_id                 =    pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:22:07 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:22:08 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:22:09 2007
        ---
 Event: DeQueued
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:22:11 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    START
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:22:12 2007
        ---
 Event: Transfer
- destination             =    LogMonitor
- result                  =    OK
- source                  =    JobController
- timestamp               =    Mon Aug 27 10:22:13 2007
        ---
 Event: Accepted
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:22:23 2007
        ---
 Event: Transfer
- destination             =    LRMS
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:22:35 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:25:10 2007
        ---
 Event: Done
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:25:22 2007
        ---
 Event: Resubmission
- result                  =    WILLRESUB
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:25:23 2007
        ---
 Event: EnQueued
- result                  =    START
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:25:25 2007
        ---
 Event: EnQueued
- result                  =    OK
- source                  =    LogMonitor
- timestamp               =    Mon Aug 27 10:25:26 2007
        ---
 Event: DeQueued
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:25:27 2007
        ---
 Event: Abort
- source                  =    WorkloadManager
- timestamp               =    Mon Aug 27 10:25:28 2007

**********************************************************************

[pcncp21] ~ > edg-job-status https://pcncp24.ncp.edu.pk:9000/V_vK6voweHl3stwItI9gbw


*************************************************************
BOOKKEEPING INFORMATION:

Status info for the Job :
https://pcncp24.ncp.edu.pk:9000/V_vK6voweHl3stwItI9gbw
Current Status:     Aborted
Status Reason:      Job RetryCount (3) hit
Destination:        pcncp04.ncp.edu.pk:2119/jobmanager-lcgpbs-dteam
reached on:         Mon Aug 27 10:25:28 2007
*************************************************************
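
In case it helps when reading a long log like the one above, here is a minimal sketch that splits the logging output into events and counts the matches and resubmissions. It is not part of the edg tools: the function names parse_events and summarize are made up for illustration, and the parsing rules are assumptions based only on the layout shown in this message ("Event:" lines, "- key = value" lines, and a value that may wrap onto the following line).

```python
from collections import Counter


def parse_events(text):
    """Split logging-info text into a list of dicts, one per 'Event:' block.

    Assumed layout (taken from the log in this message, not from any spec):
    an 'Event: <name>' line followed by '- key = value' lines.
    """
    events = []
    current = None
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Event:"):
            current = {"event": line.split(":", 1)[1].strip()}
            events.append(current)
            last_key = None
        elif current is not None and line.startswith("- ") and "=" in line:
            key, value = line[2:].split("=", 1)
            last_key = key.strip()
            current[last_key] = value.strip()
        elif (current is not None and last_key and line
              and not line.startswith("---") and not current[last_key]):
            # The previous value was empty, so treat this line as its
            # wrapped continuation (as happens with dest_id in the mail).
            current[last_key] = line
    return events


def summarize(events):
    """Count matches and resubmissions and collect the matched destinations."""
    counts = Counter(e["event"] for e in events)
    destinations = sorted({e["dest_id"] for e in events if e.get("dest_id")})
    return {
        "matches": counts["Match"],
        "resubmissions": counts["Resubmission"],
        "aborted": counts["Abort"] > 0,
        "destinations": destinations,
    }
```

To use it, save the logging output to a file, read it into a string, and call summarize(parse_events(text)). Every attempt that ends in a Resubmission event was matched again by the WorkloadManager, so the counts give a quick view of how many times the job was retried before the Abort.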


-- Best Regards --
Adeel


-----Original Message-----
From: LHC Computer Grid - Rollout [mailto:[log in to unmask]]
On Behalf Of Ilja Livenson
Sent: Monday, August 27, 2007 10:50 AM
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] Job Submission Failure

Well, in the case of an LCG-CE I think it's best to try running a job with
globus-job-run. Perhaps you could post the output of running it?

atb,
Ilja

Adeel-ur-Rehman wrote:
> Dear Ilja, 
>
> Sorry for the confusion. By globus-job-run, I meant edg-job-submit!
> Yes I'm talking about LCG-CE.
>
> -- Best Regards --
> Adeel
>
> -----Original Message-----
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]]
> On Behalf Of Ilja Livenson
> Sent: Saturday, August 25, 2007 8:29 PM
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] Job Submission Failure
>
> Hi,
>
> are you sure you are talking about globus-job-run? It doesn't resubmit 
> jobs, afaik, hence doesn't fail with the HitCount error.
>
> Ilja
>
> PS. You are talking about LCG CE, not gLite, right?
>
> Adeel-ur-Rehman wrote:
>   
>> Dear All,
>>
>> At our site, since I upgraded to the latest update of gLite 3.1, no jobs
>> are executing; instead I am getting job submission failures. The details
>> of the error state "Got a job held event, reason: Unspecified gridmanager
>> error". I can qsub test jobs, but globus-job-run aborts the job after
>> 3 retries (HitCount).
>>  
>> And there are no offending ssh key problems between our CE and WNs.
>>  
>> Any ideas??
>>  
>>  
>>  
>> -- Best Regards --
>> Adeel-ur-Rehman
>>
>>