Hi Yves,
The APEL cron seems to be running OK on the CE, and I can also run it
manually.
The GIIS value in /opt/glite/etc/glite-apel-pbs/parser-config-yaim.xml
is correct.
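For anyone who wants to script that check, here is a minimal sketch in Python. It assumes only that the file is well-formed XML with a plain (un-namespaced) <GIIS> element somewhere in it; the helper name read_giis is illustrative and not part of APEL.

```python
import xml.etree.ElementTree as ET


def read_giis(xml_text):
    """Return the text of the first <GIIS> element in the document, or None."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree (including the root) in document order.
    for elem in root.iter("GIIS"):
        return (elem.text or "").strip()
    return None


if __name__ == "__main__":
    # Path taken from this thread; adjust if your install differs.
    path = "/opt/glite/etc/glite-apel-pbs/parser-config-yaim.xml"
    with open(path) as fh:
        print(read_giis(fh.read()))
```

The printed value can then be compared by eye with the site GIIS host name.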
The APEL cron on the MON box ran OK overnight (without any intervention
on my part), but if I run it manually now, I get the error shown below.
This makes me think that the problem is with central services.
Also, rgma-gin is back to not starting, with "Could not contact Schema
trying to declare table LcgRecords" errors.
However, the gstat errors do indicate that I still have a site-info.def
config error.
Dave
Fri Nov 16 09:53:34 UTC 2007: apel-publisher - Optimising table: BlahdRecords
Fri Nov 16 09:53:34 UTC 2007: apel-publisher - **** Combining tables and republishing in LcgRecords ****
Fri Nov 16 09:53:34 UTC 2007: apel-publisher - Checking valid CPU spec data exists
Fri Nov 16 09:53:32 UTC 2007: apel-publisher - CPU spec values found
Fri Nov 16 09:53:32 UTC 2007: apel-publisher - Creating a new Primary Producer
Fri Nov 16 09:53:33 UTC 2007: apel-publisher - program aborted
org.glite.apel.core.ApelException: org.glite.apel.core.ApelException: org.glite.rgma.RGMAException: Could not contact Schema trying to declare table LcgRecords
Caused by: Failed to get Database connection: Could not retrieve connection info from pool
    at org.glite.apel.publisher.AccountPublisher.<init>(AccountPublisher.java:177)
    at org.glite.apel.publisher.AccountManager.run(AccountManager.java:130)
    at org.glite.apel.publisher.ApelPublisher.runJoinProcessor(ApelPublisher.java:121)
    at org.glite.apel.publisher.ApelPublisher.run(ApelPublisher.java:69)
    at org.glite.apel.publisher.ApelPublisher.main(ApelPublisher.java:238)
Caused by: org.glite.apel.core.ApelException: org.glite.rgma.RGMAException: Could not contact Schema trying to declare table LcgRecords
Caused by: Failed to get Database connection: Could not retrieve connection info from pool
    at org.glite.apel.publisher.AccountPublisher.createResilientPrimaryProducer(AccountPublisher.java:192)
    at org.glite.apel.publisher.AccountPublisher.<init>(AccountPublisher.java:174)
    ... 4 more
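
The innermost "Caused by:" entry in a trace like the one above is usually the most useful line to quote when reporting the problem. A minimal sketch in Python for pulling those entries out of a pasted trace follows. The function name caused_by_messages is illustrative, and the rule for joining wrapped lines (keep appending until a stack frame, a "..." line, a blank line or the next "Caused by:") is an assumption about how the mail client wrapped the text.

```python
def caused_by_messages(trace_text):
    """Return every 'Caused by:' message in a Java stack trace, in order."""
    messages = []
    current = None  # parts of the message being collected, or None
    for raw in trace_text.splitlines():
        line = raw.strip()
        if line.startswith("Caused by:"):
            if current is not None:
                messages.append(" ".join(current))
            current = [line[len("Caused by:"):].strip()]
        elif current is not None:
            # A frame ("at ..."), an elision ("... N more") or a blank line
            # ends the message; anything else is a wrapped continuation.
            if not line or line == "at" or line.startswith(("at ", "...")):
                messages.append(" ".join(current))
                current = None
            else:
                current.append(line)
    if current is not None:
        messages.append(" ".join(current))
    return messages
```

The last element of the returned list is the innermost cause, which here is the database connection pool failure rather than the Schema message.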
Yves Coppens wrote:
> Hi Dave,
>
> Have you tried running the APEL cron manually on your CE before running it
> on the MON, and does it throw any error? Alternatively, could you check the
> value of <GIIS> in /opt/glite/etc/glite-apel-pbs/parser-config-yaim.xml?
>
> It should be:
>
> <GIIS>grid002.jet.efda.org</GIIS>
>
> From the gstat page, there may well be a typo in your site-info.def, or a
> slapd process may be stuck. Could you quickly stop your site bdii and
> globus-mds, check whether any slapd processes are hanging and kill them,
> and then restart mds and bdii?
>
> Thanks,
>
> Yves
>
>
> On Thu, 15 Nov 2007, Wilson, AJ (Antony) wrote:
>
>
>> I believe that apel gets the cpu spec value by doing an ldap search of
>> the site's giis, so I would guess this may be a local problem.
>>
>>
>> Regards
>> Antony
>>
>>
>>> -----Original Message-----
>>> From: LHC Computer Grid - Rollout
>>> [mailto:[log in to unmask]] On Behalf Of David Robson
>>> Sent: 15 November 2007 16:47
>>> To: [log in to unmask]
>>> Subject: Re: [LCG-ROLLOUT] rgma-gin does not start after update 36
>>>
>>> I've tried restarting rgma-gin and it is now running. Must
>>> have been the issue with central services, and not the upgrade.
>>>
>>> However, when I try to run apel-publisher, I get ...
>>>
>>> Thu Nov 15 16:42:57 UTC 2007: apel-publisher - Cannot
>>> generate any accounting records because no cpu spec value is
>>> defined in the SpecRecords table, spec values are added when
>>> running the CPUProcessor, check user documen
>>>
>>> Any idea whether this is a central services or an upgrade issue?
>>>
>>>
>>>
>>> Wilson, AJ (Antony) wrote:
>>>
>>>> Hi David
>>>>
>>>> The schema has been restarted and we are investigating the cause of
>>>> the problems.
>>>> Please can you try again now.
>>>>
>>>> Thanks
>>>> Antony
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: LHC Computer Grid - Rollout
>>>>> [mailto:[log in to unmask]] On Behalf Of David Robson
>>>>> Sent: 15 November 2007 14:35
>>>>> To: [log in to unmask]
>>>>> Subject: [LCG-ROLLOUT] rgma-gin does not start after update 36
>>>>>
>>>>> After upgrading our MON box to glite 3.0 update 36, rgma-gin no
>>>>> longer starts
>>>>>
>>>>> Looking in /var/log/glite/rgma-gin.log, I see ...
>>>>>
>>>>> 2007-11-15 12:19:05: Restarting rgma-gin
>>>>> 2007-11-15 12:19:05: rgma-gin Stopping
>>>>> 2007-11-15 12:19:10: rgma-gin Stopped OK
>>>>> 2007-11-15 12:19:10: Starting rgma-gin
>>>>> 2007-11-15 12:19:11,407 [main] FATAL org.glite.rgma.gin.Gin - Error
>>>>> during startup: org.glite.rgma.gin.GinException: Error in
>>>>> configuration file '/opt/glite/etc/rgma-gin/gin.conf' - Unable to get
>>>>> table definition for 'GlueSE'
>>>>> 2007-11-15 12:19:13: rgma-gin Failed to Start
>>>>>
>>>>> I have attached our /opt/glite/etc/rgma-gin/gin.conf file
>>>>>
>>>>> Has anyone come across a similar problem?
>>>>>
>>>>> --
>>>>> David Robson
>>>>>
>>>>> CODAS, Machine Operations, UKAEA Culham Division Culham Science
>>>>> Centre, Abingdon, OXON, OX14 3DB, UK
>>>>> Voice: +44(0)1235-46-4569, Fax: 4404
>>>>> Work email: [log in to unmask]
>>>>> Home email: [log in to unmask]
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>> --
>>> David Robson
>>>
>>> CODAS, Machine Operations, UKAEA Culham Division Culham
>>> Science Centre, Abingdon, OXON, OX14 3DB, UK
>>> Voice: +44(0)1235-46-4569, Fax: 4404
>>> Work email: [log in to unmask]
>>> Home email: [log in to unmask]
>>>
>>>
>
>
--
David Robson
CODAS, Machine Operations, UKAEA Culham Division
Culham Science Centre, Abingdon, OXON, OX14 3DB, UK
Voice: +44(0)1235-46-4569, Fax: 4404
Work email: [log in to unmask]
Home email: [log in to unmask]