Could it be that the CREAM CE is not matched because it publishes "Special"
instead of "Production" as GlueCEStateStatus?
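
One way to check what the CE publishes is to query its resource BDII and
list every CE whose status is not "Production". The helper below is only a
sketch: the function name non_production_ces is made up for illustration,
and it assumes plain (unwrapped) LDIF attribute lines on stdin.

```shell
# Hypothetical helper: read LDIF on stdin and print "<CE id> <status>"
# for every entry whose GlueCEStateStatus is not "Production".
non_production_ces() {
    awk '
        # Emit the current entry if it is complete and not in Production.
        function flush() {
            if (id != "" && status != "" && status != "Production")
                print id, status
            id = ""; status = ""
        }
        /^GlueCEUniqueID: /    { id = $2 }
        /^GlueCEStateStatus: / { status = $2 }
        /^$/                   { flush() }   # blank line ends an LDIF entry
        END                    { flush() }   # last entry has no trailing blank
    '
}
```

It could be fed from ldapsearch, assuming the resource BDII on the CE
listens on port 2170 with base "mds-vo-name=resource,o=grid" (both are
assumptions about the site setup, not taken from the message below):

    ldapsearch -x -LLL -H ldap://ce03.lip.pt:2170 \
        -b mds-vo-name=resource,o=grid '(objectClass=GlueCE)' \
        GlueCEUniqueID GlueCEStateStatus | non_production_ces
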
Gonçalo Borges wrote:
> Dear All...
>
> I was about to submit a patch for the CREAMCE / SGE integration when I
> hit the following issue...
>
> ---*---
>
> 1./ I can submit a job without problems using glite-ce-job-submit. See
> the following examples:
>
> -bash-3.00$ glite-ce-job-submit -a -r
> ce03.lip.pt:8443/cream-sge-dteamgrid sleep.jdl
> 2009-10-13 16:17:31,114 WARN - No configuration file suitable for
> loading. Using built-in configuration
> https://ce03.lip.pt:8443/CREAM469661735
>
> -bash-3.00$ glite-ce-job-status https://ce03.lip.pt:8443/CREAM469661735
> 2009-10-13 16:18:42,920 WARN - No configuration file suitable for
> loading. Using built-in configuration
> ****** JobID=[https://ce03.lip.pt:8443/CREAM469661735]
> Status = [DONE-OK]
> ExitCode = [0]
>
> ---*---
>
> 2./ However, trying to submit via WMS, using
>
> glite-wms-job-submit -a -r ce03.lip.pt:8443/cream-sge-dteamgrid
> sleep.jdl,
>
> the job never gets past the READY state.
>
> ---*---
>
> 3./ From the messages, glite-ce-cream and glite-ce-monitor logs, I
> conclude that the job never reaches my SGE CREAM CE.
>
> ---*---
>
> 4./ I checked that the WMS workload manager registered the job,
>
> 13 Oct, 16:30:17 -I: [Info] operator()(dispatcher_utils.cpp:218):
> new jobsubmit for https://wms01.lip.pt:9000/ctA8jPXMMyayYdOUMs84cA
> 13 Oct, 16:30:17 -I: [Info] operator()(submit_request.cpp:478):
> https://wms01.lip.pt:9000/ctA8jPXMMyayYdOUMs84cA delivered
>
> but I cannot obtain its condor ID.
>
> [root@wms01 CondorG.log]# grep
> https://wms01.lip.pt:9000/_yUTkiOYPrcAJbRpjOGOdw
> /var/glite/logmonitor/CondorG.log/*
> [root@wms01 CondorG.log]#
>
> ---*---
>
> 5./ Finally, when I cancel the job, I see the following entries in
> jobcontoller_events.log:
>
> 13 Oct, 16:44:09 -I- ControllerLoop::run(): Got new remove request
> (JOB ID = https://wms01.lip.pt:9000/_yUTkiOYPrcAJbRpjOGOdw)...
> 13 Oct, 16:44:09 -I- JobControllerReal::cancel(...): Asked to remove
> job: https://wms01.lip.pt:9000/_yUTkiOYPrcAJbRpjOGOdw
> 13 Oct, 16:44:09 -M- JobControllerReal::readRepository(): Reading
> repository from LogMonitor file:
> /var/glite/logmonitor/internal/irepository.dat
> 13 Oct, 16:44:10 -*- JobControllerReal::cancel(...): I'm not able to
> retrieve the condor ID.
>
> ---*---
>
> 6./ I've tried to submit to other CREAM CEs, but I've noticed that a
> list-match does not show me any CREAM CE:
>
> -bash-3.00$ glite-wms-job-list-match -a sleep.jdl | grep -i cream
> -bash-3.00$
>
> My JDL is quite simple, and I do not think that I should have problems
> with it:
>
> -bash-3.00$ cat sleep.jdl
> Executable = "sleep.sh";
> StdOutput = "sleep.out";
> StdError = "sleep.err";
> RetryCount = 0;
> InputSandbox = {"sleep.sh"};
> OutputSandbox = {"sleep.out","sleep.err"};
> #OutputSandboxBaseDestUri="gsiftp://ce02.lip.pt/tmp";
>
> I've tried to force other creamCEs, using the "-r" option, but I got
> the same results.
>
> I conclude that something must be wrong in my WMS / ICE configuration...
>
> ---*---
>
> 7./ I'm using the following software:
>
> [root@wms01 ~]# rpm -qa | grep glite-WMS
>
> [root@ce03 ~]# rpm -qa | grep CREAM
> glite-CREAM-3.1.20-0
>
>
> Any help is welcome...
> Thanks in advance
> Cheers
> Goncalo
>
>
>
>