Hi Stephen,
Yes, it's the latest version of Torque, whereas the version in the gLite
repository seems to be quite a bit older. If CREAM really is
incompatible with this version, I don't mind changing, but before I go
through that I'd like to understand whether my more serious problem with
BNotifier is also due to version incompatibility or to something in my
CREAM configuration.
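
To make the format comparison concrete, here is a minimal Python sketch
for splitting a server_logs record into its fields, so the same kind of
line from 2.3.6 and 2.5.4 can be compared side by side. The field layout
(timestamp;code;daemon;object type;object id;message) is only my
assumption, inferred from the exit line quoted below and not from the
Torque documentation, and the function and key names are made up for
illustration.

```python
def parse_server_log_record(line):
    """Split one server_logs record into its fields.

    Assumed layout (inferred from a single quoted 2.3.6 exit line):
        timestamp;code;daemon;object type;object id;message
    Returns None if the line does not have six ';'-separated fields.
    """
    # Split on the first five ';' only, so the message is kept whole.
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        return None
    timestamp, code, daemon, obj_type, obj_id, message = parts

    # Collect key=value tokens from the message (e.g. Exit_status=0).
    # Tokens without '=' are ignored.
    attrs = {}
    for token in message.split():
        key, sep, value = token.partition("=")
        if sep:
            attrs[key] = value

    return {
        "timestamp": timestamp,
        "code": code,
        "daemon": daemon,
        "object_type": obj_type,
        "object_id": obj_id,
        "message": message,
        "attrs": attrs,
    }


if __name__ == "__main__":
    # The exit record from the quoted message, joined onto one line.
    sample = (
        "02/28/2011 13:09:36;0010;PBS_Server;Job;"
        "783973.hammer.ph.liv.ac.uk;Exit_status=0 "
        "resources_used.cput=04:47:17 resources_used.mem=789428kb "
        "resources_used.vmem=2093180kb resources_used.walltime=04:49:24"
    )
    record = parse_server_log_record(sample)
    print(record["object_id"], record["attrs"].get("Exit_status"))
```

Running the same function over an exit line from my 2.5.4 logs should
show whether the fields or the key=value pairs differ between the two
versions.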
Adam
On 28/02/2011 13:13, Stephen Jones wrote:
> Adam,
>
> Is that an unusual version of torque you're using? We're on
> torque-server-2.3.6-2, while you are on 2.5.4. Somewhere in all those
> revisions, maybe the record formats got changed? We have no "Preparing
> to send ... " lines in our /var/spool/pbs/server_logs output at all.
> The exit lines look like this:
>
> 02/28/2011 13:09:36;0010;PBS_Server;Job;783973.hammer.ph.liv.ac.uk;Exit_status=0
> resources_used.cput=04:47:17 resources_used.mem=789428kb
> resources_used.vmem=2093180kb resources_used.walltime=04:49:24
>
> While tracejob output looks like this:
>
> [root@hammer server_logs]# tracejob 783973
> /var/spool/pbs/mom_logs/20110228: No such file or directory
> /var/spool/pbs/sched_logs/20110228: No such file or directory
>
> Job: 783973.hammer.ph.liv.ac.uk
>
> 02/28/2011 07:58:53 S enqueuing into long, state 1 hop 1
> 02/28/2011 07:58:53 S Job Queued at request of
> [log in to unmask], owner =
> [log in to unmask], job name = STDIN, queue = long
> 02/28/2011 07:58:53 A queue=long
> 02/28/2011 08:20:08 S Job Modified at request of
> [log in to unmask]
> 02/28/2011 08:20:08 S Job Run at request of [log in to unmask]
> 02/28/2011 08:20:08 S Job Modified at request of
> [log in to unmask]
> 02/28/2011 08:20:08 S post_modify_req: PBSE_UNKJOBID for job
> 783973.hammer.ph.liv.ac.uk in state RUNNING-STAGEGO, dest =
> r23-n20.ph.liv.ac.uk
> 02/28/2011 08:20:12 A user=prdatl40 group=atlasprd jobname=STDIN
> queue=long ctime=1298879933 qtime=1298879933 etime=1298879933
> start=1298881212
> [log in to unmask]
> exec_host=r23-n20.ph.liv.ac.uk/6 Resource_List.cput=48:00:00
> Resource_List.ncpus=1
> Resource_List.neednodes=1
> Resource_List.nodect=1 Resource_List.nodes=1
> Resource_List.walltime=48:00:00
> 02/28/2011 13:09:36 S Exit_status=0 resources_used.cput=04:47:17
> resources_used.mem=789428kb resources_used.vmem=2093180kb
> resources_used.walltime=04:49:24
> 02/28/2011 13:09:36 A user=prdatl40 group=atlasprd jobname=STDIN
> queue=long ctime=1298879933 qtime=1298879933 etime=1298879933
> start=1298881212
> [log in to unmask]
> exec_host=r23-n20.ph.liv.ac.uk/6 Resource_List.cput=48:00:00
> Resource_List.ncpus=1
> Resource_List.neednodes=1
> Resource_List.nodect=1 Resource_List.nodes=1
> Resource_List.walltime=48:00:00 session=25604
> end=1298898576 Exit_status=0
> resources_used.cput=04:47:17 resources_used.mem=789428kb
> resources_used.vmem=2093180kb
> resources_used.walltime=04:49:24
> 02/28/2011 13:09:58 S dequeuing from long, state COMPLETE
>
>
> Steve
>
>
>