HA: [LCG-ROLLOUT] Problem with publishing of site accounting information

  Hi,
 Yes, the accounting data is now updated on the GOC portal. But the
rgma-client-check command still fails with the error message mentioned
earlier, at the Checking Java API step.
  Cheers,
  Vladimir.

-----Original Message-----
From: LHC Computer Grid - Rollout on behalf of Del Cano Novales, Cristina (STFC,RAL,ESC)
Sent: Mon, 14.09.2009 15:14
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] Problem with publishing of site accouting information

Hi,



Please wait till tomorrow to see the data in the Accounting Portal. If the data is not available then, we can check further.



Cheers,



Cristina



From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Vladimir.O Tikhomirov
Sent: 14 September 2009 12:08
To: [log in to unmask]
Subject: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] Problem with publishing of site accouting information



    Hi,

  Since 12 September the situation has become more curious:

During the period 31.08-11.09 I had (in the /var/log/apel.log file and in the
direct output of apel-publisher):

....

Wed Sep  9 03:34:02 UTC 2009: apel-publisher - Read-in configuration: [logenabled, p, inspectTables, j] [DBUsername=accounting, DBURL=jdbc:mysql://localhost:3306/accounting, DBPassword=****, site=ru-Moscow-FIAN-LCG2, republish=missing]
Wed Sep  9 03:34:02 UTC 2009: apel-publisher - program aborted
Wed Sep  9 03:34:02 UTC 2009: apel-publisher - null
Thu Sep 10 03:34:01 UTC 2009: apel-publisher - Read-in configuration: [logenabled, p, inspectTables, j] [DBUsername=accounting, DBURL=jdbc:mysql://localhost:3306/accounting, DBPassword=****, site=ru-Moscow-FIAN-LCG2, republish=missing]
Thu Sep 10 03:34:01 UTC 2009: apel-publisher - program aborted
Thu Sep 10 03:34:01 UTC 2009: apel-publisher - null
Fri Sep 11 03:34:02 UTC 2009: apel-publisher - Read-in configuration: [logenabled, p, inspectTables, j] [DBUsername=accounting, DBURL=jdbc:mysql://localhost:3306/accounting, DBPassword=****, site=ru-Moscow-FIAN-LCG2, republish=missing]
Fri Sep 11 03:34:02 UTC 2009: apel-publisher - program aborted
Fri Sep 11 03:34:02 UTC 2009: apel-publisher - null
...
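
To see at a glance which nightly runs aborted and which completed, the log can be scanned for the two marker lines that appear in this thread, "program aborted" and "Processing finished". The following is a minimal Python sketch, not part of APEL; the function name summarise_runs and the regular expression are assumptions based only on the log excerpts quoted here.

```python
import re

# Matches lines such as:
#   "Wed Sep  9 03:34:02 UTC 2009: apel-publisher - program aborted"
# The pattern is an assumption derived from the excerpts in this thread.
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3} \w{3} [ \d]\d \d{2}:\d{2}:\d{2} \w+ \d{4}): "
    r"apel-publisher - (?P<msg>.*)$"
)


def summarise_runs(lines):
    """Count aborted and finished publisher runs per calendar day."""
    summary = {}
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # wrapped or unrelated line
        parts = match.group("stamp").split()
        day = " ".join([parts[1], parts[2], parts[5]])  # e.g. "Sep 9 2009"
        counts = summary.setdefault(day, {"aborted": 0, "finished": 0})
        msg = match.group("msg").strip()
        if msg == "program aborted":
            counts["aborted"] += 1
        elif "Processing finished" in msg:
            counts["finished"] += 1
    return summary
```

It could be applied to the real file by passing an open file object, for example summarise_runs(open("/var/log/apel.log")). Days with aborted runs and no finished run are the ones worth investigating.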

But since 12.09 (for a reason unknown to me) the output seems to be
quite reasonable:

Sat Sep 12 00:57:01 UTC 2009: apel-publisher - Read-in configuration: [logenabled, p, inspectTables, j] [DBUsername=accounting, DBURL=jdbc:mysql://localhost:3306/accounting, DBPa$
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - ------ Starting the apel application ------
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - **** APEL is examining the schema ****
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Checking the LcgRecords table
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - The LcgRecords schema is up-to-date
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Checking the BlahdRecords table
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - The BlahdRecords schema is up-to-date
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Checking the LcgProcessedFiles table
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - The LcgProcessedFiles schema is up-to-date
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Checking the SpecRecords table
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - The SpecRecords schema is up-to-date
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Checking the GkRecords table
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - The GkRecords schema is up-to-date
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Checking the MessageRecords table
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - The MessageRecords schema is up-to-date
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - **** Schema checks complete ****
Sat Sep 12 00:57:02 UTC 2009: apel-publisher - Optimising table: EventRecords
Sat Sep 12 00:57:04 UTC 2009: apel-publisher - Optimising table: GkRecords
Sat Sep 12 00:57:04 UTC 2009: apel-publisher - Optimising table: MessageRecords
Sat Sep 12 00:57:04 UTC 2009: apel-publisher - Optimising table: SpecRecords
Sat Sep 12 00:57:04 UTC 2009: apel-publisher - Optimising table: LcgRecords
Sat Sep 12 00:57:04 UTC 2009: apel-publisher - Optimising table: BlahdRecords
Sat Sep 12 00:57:05 UTC 2009: apel-publisher - **** Combining tables and republishing in LcgRecords ****
Sat Sep 12 00:57:05 UTC 2009: apel-publisher - Checking valid CPU spec data exists
Sat Sep 12 00:57:05 UTC 2009: apel-publisher - CPU spec values found
Sat Sep 12 00:57:05 UTC 2009: apel-publisher - Creating a new Primary Producer
.......

Sat Sep 12 00:57:11 UTC 2009: apel-publisher -  ====================================
Sat Sep 12 00:57:11 UTC 2009: apel-publisher -     Synchronisation data check
Sat Sep 12 00:57:11 UTC 2009: apel-publisher -  ====================================
Sat Sep 12 00:57:11 UTC 2009: apel-publisher - Finding all records in local database since the last successful publish timestamp : 2009-08-31 03:34:27
Sat Sep 12 00:57:11 UTC 2009: apel-publisher - Record/s found: 109
Sat Sep 12 00:57:11 UTC 2009: apel-publisher - Checking Archiver is Online
Sat Sep 12 00:57:11 UTC 2009: apel-publisher - Creating a Resilient Consumer
Sat Sep 12 00:57:12 UTC 2009: apel-publisher - Starting Resilient Consumer
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Closing Resilient Consumer
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Archiver Alive
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Archiver Record Count: Record/s found site ru-Moscow-FIAN-LCG2 : 0
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - WARNING - Detected missing records, republishing data starting from: 2009-08-31 03:34:27
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Publishing data into rgma
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Publishing 109 records...
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Total records published : 109...
Sat Sep 12 00:57:31 UTC 2009: apel-publisher - Checking the record counts for syncronisation
Sat Sep 12 01:02:36 UTC 2009: apel-publisher - Consumer has died - will try to create a new one
Sat Sep 12 01:02:36 UTC 2009: apel-publisher - Creating a Resilient Consumer
Sat Sep 12 01:02:37 UTC 2009: apel-publisher - Starting Resilient Consumer
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Closing Resilient Consumer
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Archiver Record Count: 109
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Local database and GOC managed to syncronise, updating RepublishInfo
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Rows deleted from RepublishInfo: 1
Sat Sep 12 01:02:52 UTC 2009: apel-publisher -  ====================================
Sat Sep 12 01:02:52 UTC 2009: apel-publisher -  Completed Synchronisation data check
Sat Sep 12 01:02:52 UTC 2009: apel-publisher -  ====================================
Sat Sep 12 01:02:52 UTC 2009: apel-publisher -  Publisher Mode = Apel Publisher (Default)
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Building account records via the new Accounting Log File
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - NB: Record Counts may be zero if Patch #898 is not active on this CE
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Stitching together all accounting records
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Stitching completed
Sat Sep 12 01:02:52 UTC 2009: apel-publisher - Storing accounting data into local database
Sat Sep 12 01:02:54 UTC 2009: apel-publisher - Generating 1329 records
Sat Sep 12 01:02:54 UTC 2009: apel-publisher - Publishing data into rgma
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Publishing 1329 to GOC (via Accounting Log)
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Number of Joined accounting records: 1329
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Building account records via GK Logs
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - NB: Record Counts may be zero if Patch #898 is active on this CE
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Stitching together all accounting records
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Stitching completed
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - No accounting data to store
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Generating 0 records
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Publishing data into rgma
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Publishing 0 to GOC (via GK Log)
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Number of Joined accounting records: 0
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Build complete
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - **** Join processing complete ****
Sat Sep 12 01:02:55 UTC 2009: apel-publisher -  ====================================
Sat Sep 12 01:02:55 UTC 2009: apel-publisher -       Publishing Summary Data
Sat Sep 12 01:02:55 UTC 2009: apel-publisher -  ====================================
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Data will be written to RGMA Table : LcgRecordsSync_v2
Sat Sep 12 01:02:55 UTC 2009: apel-publisher - Creating a new Primary Producer
Sat Sep 12 01:02:58 UTC 2009: apel-publisher - Publishing summary data into rgma
Sat Sep 12 01:02:58 UTC 2009: apel-publisher - ------ Processing finished ------
......



  But the accounting data for the site (ru-Moscow-FIAN-LCG2) are still absent from the GOC, at least on the
http://www3.egee.cesga.es/gridsite/accounting/CESGA/egee_view.php page. And I still have a
warning for the SAM APEL test at
https://lcg-sam.cern.ch:8443/sam/sam.py?funct=ShowHistory&option=old&sensors=APEL&vo=ops&nodename=ce1.grid.lebedev.ru

  Is there any delay between the publication of accounting information and its appearance on the GOC
portal? If so, how long is it?

    Cheers,

    Vladimir.





________________________________

From: LHC Computer Grid - Rollout on behalf of Del Cano Novales, Cristina (STFC,RAL,ESC)
Sent: Mon, 14.09.2009 10:06
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] Problem with publishing of site accouting information

Hi Vladimir,



Can you please check how APEL publisher is run? If you get the error when running it from the cron job please attach the output of /etc/cron.d/edg-apel-publisher.



Otherwise, can you send me the command you are running?



Cheers,



Cristina



From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Vladimir.O Tikhomirov
Sent: 11 September 2009 21:36
To: [log in to unmask]
Subject: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] Problem with publishing of site accouting information





    Hi Cristina,
 Yes, this line is present.
    Cheers,
    Vladimir.

-----Original Message-----
From: LHC Computer Grid - Rollout on behalf of Del Cano Novales, Cristina (STFC,RAL,ESC)
Sent: Fri, 11.09.2009 11:59
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] Problem with publishing of site accouting information

Hi Vladimir,



Can you check your apel publisher configuration file (/opt/glite/etc/glite-apel-publisher/publisher-config-yaim.xml)?

Under the <DBPassword> line you should have the following:

<Limit>300000</Limit>



Please add the line if it's not present.
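
A quick way to confirm whether the element is present is to parse the file and look for it. The following is a minimal Python sketch under stated assumptions: the real layout of publisher-config-yaim.xml is not shown in this thread, so the sample fragment and the function name find_limit are hypothetical, and the search simply accepts a Limit element at any depth.

```python
import xml.etree.ElementTree as ET


def find_limit(xml_text):
    """Return the text of the first <Limit> element, or None if absent."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree (including the root), so the element is
    # found at any depth. Stripping a "{namespace}" prefix keeps the check
    # working if the file happens to declare an XML namespace.
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "Limit":
            return (elem.text or "").strip()
    return None


# Hypothetical fragment for illustration only; not the real file layout.
SAMPLE = """
<Config>
  <DBPassword>secret</DBPassword>
  <Limit>300000</Limit>
</Config>
"""

if __name__ == "__main__":
    print(find_limit(SAMPLE))
```

For the real file, the text could be read with open(path).read() and passed to find_limit; a return value of None would mean the line still needs to be added.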



Cheers,



Cristina



From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Vladimir.O Tikhomirov
Sent: 11 September 2009 10:51
To: [log in to unmask]
Subject: [LCG-ROLLOUT] HA: [LCG-ROLLOUT] Problem with publishing of site accouting information





   Hi,
 I have done that now, but it did not help.
   Cheers,
   Vladimir.

-----Original Message-----
From: LHC Computer Grid - Rollout on behalf of Del Cano Novales, Cristina (STFC,RAL,ESC)
Sent: Fri, 11.09.2009 9:39
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] Problem with publishing of site accouting information

Hi Vladimir,

Did you reconfigure the MON box with YAIM after the update?

Cheers,

Cristina

-----Original Message-----
From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On Behalf Of Vladimir Tikhomirov
Sent: 10 September 2009 21:22
To: [log in to unmask]
Subject: [LCG-ROLLOUT] Problem with publishing of site accouting information

Hello,
 Recently I ran into a problem with publishing of site accounting information. The
problem seems to have appeared just after gLite 3.1 update 54 (31.08.2009). Since
then the accounting information is no longer published to the GOC. The
site name is ru-Moscow-FIAN-LCG2.
 On the APEL side the contents of the apel.log file seem to be normal. But on
the MON RGMA side I can see:
--------------------------
[root@se1] ~ $ tail /var/log/apel.log
......
Thu Sep 10 03:34:01 UTC 2009: apel-publisher - Read-in configuration: [logenabled, p, inspectTables, j] [DBUsername=accounting, DBURL=jdbc:mysql://localhost:3306/accounting, DBPassword=****, site=ru-Moscow-FIAN-LCG2, republish=missing]
Thu Sep 10 03:34:01 UTC 2009: apel-publisher - program aborted
Thu Sep 10 03:34:01 UTC 2009: apel-publisher - null
--------------
 and:
---------------
[root@se1] ~ $ /opt/glite/bin/rgma-client-check

*** Running R-GMA client tests on se1.grid.lebedev.ru ***

Checking Command-line API: Success
Checking Java API: Exception in thread "main" java.lang.NoClassDefFoundError: org/bouncycastle/jce/provider/BouncyCastleProvider
        at org.glite.security.trustmanager.ContextWrapper.init(ContextWrapper.java:390)
        at org.glite.security.trustmanager.ContextWrapper.<init>(ContextWrapper.java:246)
        at org.glite.security.trustmanager.TimedOutContextWrapper.<init>(TimedOutContextWrapper.java:41)
        at org.edg.info.ServletConnection.setupHTTPS(ServletConnection.java:189)
        at org.edg.info.ServletConnection.setupSecurity(ServletConnection.java:180)
        at org.edg.info.ServletConnection.connect(ServletConnection.java:498)
        at org.edg.info.ServletConnection.connect(ServletConnection.java:401)
        at org.edg.info.ServletConnection.sendCommand(ServletConnection.java:443)
        at org.glite.rgma.stubs.ProducerFactoryStub.createInstance(ProducerFactoryStub.java:165)
        at org.glite.rgma.stubs.ProducerFactoryStub.createPrimaryProducer(ProducerFactoryStub.java:76)
        at InsertTuple.main(InsertTuple.java:24)
Failure - failed to insert test tuple
Checking Python API: Success

*** R-GMA client test failed ***
---------------
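
The NoClassDefFoundError above only says that the JVM could not load org.bouncycastle.jce.provider.BouncyCastleProvider at run time; the thread does not establish why. One generic first check is whether any jar on the machine actually contains that class. The following is a minimal Python sketch; the function name jars_containing is illustrative, and the directory to search is an assumption that has to be supplied by whoever runs it.

```python
import zipfile
from pathlib import Path

# Class named in the stack trace, written as its path inside a jar.
MISSING_CLASS = "org/bouncycastle/jce/provider/BouncyCastleProvider.class"


def jars_containing(class_entry, search_dir):
    """Return sorted paths of .jar files under search_dir that contain class_entry."""
    hits = []
    for jar in Path(search_dir).rglob("*.jar"):
        try:
            with zipfile.ZipFile(jar) as zf:
                if class_entry in zf.namelist():
                    hits.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Broken symlink, unreadable file, or not a valid archive.
            continue
    return sorted(hits)
```

An empty result for the directories the Java client loads from would suggest the jar is missing or was moved; a non-empty result would instead point at a classpath that does not include the jar that was found.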

 rgma-server-check looks OK. Restarting the tomcat5 service does not help. The OS is
SLC4.
 Can somebody help?
   
     Thank you in advance,
     Vladimir.


--
Scanned by iCritical.


