We use ssh to connect to our machines (the telnet service is not running
on any node).
The ldapsearch query works with both mds-vo-name=local and
mds-vo-name=CGG-LCG2.
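For reference, the two queries can be sketched as below. This is a dry run that only prints the commands. The host and port are the ones from your message; that the CGG-LCG2 entry sits under the same o=grid suffix as the local one is an assumption on my side, and the function name print_mds_queries is only illustrative.

```shell
#!/bin/sh
# Print (do not run) the ldapsearch command for each MDS base.
# Assumption: both bases live under o=grid on port 2135 of the CE.
CE_HOST=ce1.egee.fr.cgg.com
MDS_PORT=2135

print_mds_queries() {
    for vo_name in local CGG-LCG2; do
        echo "ldapsearch -x -h $CE_HOST -p $MDS_PORT -b mds-vo-name=$vo_name,o=grid"
    done
}

print_mds_queries
```

Each printed line can be pasted on the UI to repeat the check.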
I may have some information that could help; it concerns
edg-job-status. When the CECycle test fails, it prints a job identifier,
and when I run edg-job-status on it, I get:
[aberiach@ui1 aberiach]$ edg-job-status https://lxn1188.cern.ch:9000/VKZ3kJXgQ9k-tTYzW2i53Q
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://lxn1188.cern.ch:9000/VKZ3kJXgQ9k-tTYzW2i53Q
Current Status: Aborted
Status Reason: Cannot plan: BrokerHelper: no compatible resources
reached on: Wed Aug 18 15:07:00 2004
*************************************************************
Is this a known error message?
Another point is that I have to set the GSI_PASSWORD environment
variable and PXhostname=lxn1179.cern.ch by hand on the UI. Without them
the test does not start at all.
Is it normal to have to do this?
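Concretely, what I set by hand looks roughly like the sketch below. The password value is a placeholder, and exporting both variables (rather than only setting them in the shell) is an assumption about what the test needs.

```shell
#!/bin/sh
# Variables set by hand on the UI before starting the test.
# "changeme" is a placeholder, not a real value.
GSI_PASSWORD="changeme"
PXhostname=lxn1179.cern.ch
export GSI_PASSWORD PXhostname

# List the names of the two variables that child processes can see.
env | grep -E '^(GSI_PASSWORD|PXhostname)=' | cut -d= -f1
```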
Regards.
Ahmed
[log in to unmask] wrote:
>On Wed, 18 Aug 2004, Ahmed Beriache wrote:
>
>
>
>>We installed LCG.2.2.0 release at CGG (France) but we have a problem when running the Mainscript test
>>
>>It fails at the last step which is
>>
>>BaseTest::EDGLifecycle::Stack::CECycle
>>
>>
>
>I do not know if this explains the result, but you have this problem:
>
>-----------------------------------------------------------
>$ telnet ce1.egee.fr.cgg.com 2135
>Trying 212.37.204.3...
>telnet: connect to address 212.37.204.3: Connection refused
>-----------------------------------------------------------
>
>With globus-job-run I saw your slapd processes running and
>the port open for "LISTEN", but I think the slapd has got
>into a bad state, no longer accepting connections, thereby
>letting the listen queue fill up, whereby further connection
>attempts are immediately refused. Try this on your CE:
>
> /etc/init.d/globus-mds stop
> ps auxww | grep slapd
> kill -9 ..... # kill any slapd processes still left
> /etc/init.d/globus-mds start
>
>Then verify it works from your UI:
>
> ldapsearch -x -h ce1.egee.fr.cgg.com -p 2135 -b mds-vo-name=local,o=grid
>
>
>
>>Here is the error output
>>
>>-----------------------------------------------------------------------------------------------------------------------------
>> Entering Test: ===== CECycle ===== #5
>>
>> Begin test: CECycle with: BaseTest::EDGLifecycle::Stack::CECycle on:
>>--xml --reqLapse=25 --maxSubs=1 --serie=919 --maxStack=25 --useCEList
>>--singleSubmit --vo=dteam ce1.egee.fr.cgg.com
>>*A set of information ...*
>>Selected Virtual Organisation name (from EDG_WL_UI_CONFIG_VO env
>>variable): dteam
>>Connecting to host lxn1188.cern.ch, port 7772
>>
>>***************************************************************************
>> COMPUTING ELEMENT IDs LIST
>> The following CE(s) matching your job requirements have been found:
>>
>> *CEId*
>> lcg2-ce.physik.rwth-aachen.de:2119/jobmanager-lcgpbs-infinite
>>*A long CEs list*
>> cclcgceli01.in2p3.fr:2119/jobmanager-bqs-A
>>***************************************************************************
>>
>> matchingCEs: lcg2-ce.physik.rwth-aachen.de atlasce01.na.infn.it
>>ce00.inta.es grid-ce2.desy.de grid003.ft.uam.es gridkap01.fzk.de
>>*a long matching CEs list*
>>lcg-ce.lps.umontreal.ca:2119/jobmanager-lcgpbs-short
>>lcg00125.grid.sinica.edu.tw:2119/jobmanager-lcgpbs-short
>> CE: lcgce01.triumf.ca
>>*a long matching CEs list*
>> CE: lcg00125.grid.sinica.edu.tw
>>
>>OPTIONS:
>>serie => 919
>>xml => 1
>>pollingPeriod => 30
>>vo => dteam
>>maxSubs => 1
>>childTimeout => 1800
>>polling => 30
>>useCEList => 1
>>maxStack => 25
>>singleSubmit => 1
>>zombSlep => 10
>>reqLapse => 25
>>
>> Requirements: Requirements = other.GlueCEUniqueID==""
>>Submit 0 21655 0
>> Starting the CHILD process: 21655 for timeout: 1800 sec.
>>PID 0 / 1 / stacksize: 0 / pid: 21655 / procs: 23 / descr: 260
>>ZOMBIES 4 0
>>ZOMBIES 1 0
>> Reaper List: 21655
>> SUM timest=>Wed_Aug_18_11:09:39_CEST_2004, flag=>Trying_submission,
>>index=>0, none=>--, jdl=>/tmp/aberiach-21625-31352/tmparg-0.jdl,
>>date=>Wed_Aug_18_11:04:39_CEST_2004, mess=>Error, stat2=>TIMEOUTSUB,
>>stat1=>[FAIL]
>>ERROR Ugly failure ! The command returned:
>>TIMEOUTSUB
>> We are done with 1 1 0
>> SUM Ending GG CECycle: serie 919: We are done with 1 1
>> SUMM 0 :
>> date => Wed_Aug_18_11:04:39_CEST_2004
>> jdl => /tmp/aberiach-21625-31352/tmparg-0.jdl
>> mace => RB: : <br>fixedCE:CE:<br>@
>> mess => Error
>> olog => /tmp/aberiach-21625-31352/CECycle-0.log
>> pid => 21655
>> stat1 => [FAIL]
>> stat2 => TIMEOUTSUB
>> timest => Wed_Aug_18_11:09:39_CEST_2004
>> SUMM 1 :
>> mace => RB: : <br>fixedCE:<br>@
>> mawn => WN:MATCHING
>> olog => /tmp/aberiach-21625-31352/matching.out
>> stat1 => [FAIL]
>> SUMM 2 :
>> mace => RB: : <br>fixedCE:<br>@
>> mawn => WN:RUNNING
>> olog => /tmp/aberiach-21625-31352/running.out
>> stat1 => [FAIL]
>> Called buildXMLFile() to write /tmp/aberiach-21625-31352/CECycle.XML
>>log2html-script.pl -j -f
>>/tmp/aberiach/040818-110056_SiteTesting/CECycle/index.html
>>/tmp/aberiach-21625-31352/CECycle.XML
>>EDGLifecycle: cleanup
>>
>> End Test #5: CECycle with: BaseTest::EDGLifecycle::Stack::CECycle on:
>>--xml --reqLapse=25 --maxSubs=1 --serie=919 --maxStack=25 --useCEList
>>--singleSubmit --vo=dteam ce1.egee.fr.cgg.com
>> Global Result: BaseTest::EDGLifecycle::Stack::CECycle [FAIL]
>> Duration: 305 sec.
>> ------------------------------------------------------------
>>
>> Exiting Test: ===== CECycle =====*
>>
>>
>>Do you have any idea about this error ?
>>
>>
>
>
>
--
-----------------------------------------------------------------------
Ahmed Beriache phone: +33 01 64 47 35 18 (direct)
Compagnie Generale de Geophysique (CGG) +33 01 64 47 30 00
1, rue Leon Migaux 91341 Massy fax: +33 01 64 47 30 98
web site: http://www.cgg.com e-mail: [log in to unmask]
-----------------------------------------------------------------------