On Tue, Mar 08, 2005 at 05:23:08PM +0100 or thereabouts, Maarten Litmaath wrote:
> Steve Traylen wrote:
>
> >>Jobs are waiting when the Workload Manager has not got to them yet,
> >>meaning they still sit in /var/edgwl/workload_manager/input.fl.
> >>
> >
> >>Does /var/edgwl/workload_manager/log/events.log show activity?
Not since 15:18 GMT.
>
> So, what does /var/edgwl/workload_manager/log/events.log say?
>
> I have attached a script "chk-wl.sh" that shows the state of affairs
> on a single page.
Output is attached.
Steve
--
Steve Traylen
http://www.gridpp.ac.uk/
1: POSIX ADVISORY WRITE 32199 03:02:1292115 0 EOF f1b57be0 c03a9448 c3775e24 00000000 f1b57bec
2: POSIX ADVISORY WRITE 2966 03:02:2093748 0 EOF c3775e20 f1b57be4 c3775fa4 00000000 c3775e2c
3: FLOCK ADVISORY WRITE 2768 03:02:2093745 0 EOF c3775fa0 c3775e24 c03a9448 00000000 c3775fac
PID TTY TIME CMD
32283 ? 00:00:15 condor_gridmana
32199 ? 00:00:02 condor_master
32201 ? 00:00:13 condor_schedd
32232 ? 00:00:01 edg-wl-bkserver
32329 ? 00:01:14 edg-wl-interlog
32080 ? 00:00:38 edg-wl-job_cont <
32335 ? 00:00:02 edg-wl-logd
32291 ? 00:00:41 edg-wl-log_moni <
32396 ? 00:02:22 edg-wl-ns_daemo <
32429 ? 00:00:04 edg-wl-renewd
32476 ? 00:05:11 edg-wl-workload <
32285 ? 00:06:58 gahp_server
=== WM ===
08 Mar, 15:18:10 -I- checkRank: t2ce01.physics.ox.ac.uk:2119/jobmanager-lcgpbs-infinite, -1
08 Mar, 15:18:10 -I- checkRank: wn-04-07-01-a.cr.cnaf.infn.it:2119/jobmanager-lcgpbs-lhcb, 6
08 Mar, 15:18:10 -I- Helper::resolve: Selected t2-ce-01.lnl.infn.it:2119/jobmanager-lcglsf-lhcb for job https://lcgrb01.gridpp.rl.ac.uk:9000/4JTZe7_lODB0uISbUdwDAQ
=== NS ===
08 Mar, 16:25:16 -I- "NS2WM::stSeqCode": Sequence Code: UI=000003:NS=0000000004:WM=000000:BH=0000000000:JSS=000000:LM=000000:LRMS=000000:APP=000000
08 Mar, 16:25:16 -I- "NS2WM::stSeqCode": Sequence Code file: /var/edgwl/SandboxDir/69/https_3a_2f_2flcgrb01.gridpp.rl.ac.uk_3a9000_2f69cUZzpvls4b8cG72WowBg/.edg_wll_seq
08 Mar, 16:25:16 -F- "NS2WM::submit": Submit Forwarded.
=== JC ===
08 Mar, 15:17:55 -E- JobControllerReal::submit(...): Classad file created...
08 Mar, 15:17:55 -C- JobControllerReal::submit(...): Job submitted to Condor cluster: 1148
08 Mar, 15:17:55 -I- JobControllerClientReal::get_next_request(): Waiting for requests...
=== LM ===
08 Mar, 16:30:00 -I- MonitorLoop::run(): Must wait for other 11 seconds.
08 Mar, 16:30:11 -I- MonitorLoop::run(): No new event found, going to sleep.
08 Mar, 16:30:11 -I- MonitorLoop::run(): Checking each 10 seconds for new events.
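The excerpts above show the WM log stopping at 15:18:10 while the NS log is still forwarding submissions at 16:25:16, which fits jobs piling up in input.fl. A minimal sketch of how that comparison could be automated is below. It is not part of chk-wl.sh; the function names, the SAMPLE text (a few lines copied from the output above) and the year 2005 (taken from the mail date, since the logs carry no year) are assumptions for illustration only.

```python
import re
from datetime import datetime

# Timestamp prefix as seen in the excerpts, e.g. "08 Mar, 15:18:10 -I- ..."
STAMP = re.compile(r"^(\d{2} \w{3}, \d{2}:\d{2}:\d{2}) -\w- ")
# Section banner as printed in the output above, e.g. "=== WM ==="
SECTION = re.compile(r"^=== (\w+) ===$")


def last_activity(report, year=2005):
    """Map each section name to the timestamp of its last log line.

    The year is an assumption: the log lines themselves do not carry one.
    """
    latest = {}
    section = None
    for raw in report.splitlines():
        line = raw.strip()
        banner = SECTION.match(line)
        if banner:
            section = banner.group(1)
            continue
        stamp = STAMP.match(line)
        if stamp and section:
            latest[section] = datetime.strptime(
                f"{year} {stamp.group(1)}", "%Y %d %b, %H:%M:%S"
            )
    return latest


def lag_seconds(latest, section, reference):
    """Seconds by which `section` trails `reference` (positive means behind)."""
    return (latest[reference] - latest[section]).total_seconds()


# A few lines copied from the attached output, one or two per section.
SAMPLE = """\
=== WM ===
08 Mar, 15:18:10 -I- checkRank: t2ce01.physics.ox.ac.uk:2119/jobmanager-lcgpbs-infinite, -1
=== NS ===
08 Mar, 16:25:16 -F- "NS2WM::submit": Submit Forwarded.
=== JC ===
08 Mar, 15:17:55 -I- JobControllerClientReal::get_next_request(): Waiting for requests...
=== LM ===
08 Mar, 16:30:11 -I- MonitorLoop::run(): Checking each 10 seconds for new events.
"""

if __name__ == "__main__":
    latest = last_activity(SAMPLE)
    # WM last logged 4026 seconds (about 67 minutes) before the NS did.
    print(int(lag_seconds(latest, "WM", "NS")))
```

A large positive lag for WM relative to NS, as here, only says that the Workload Manager stopped writing to its log while submissions kept arriving; it does not say why.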