Govind,
So do you still think this is correlated with the mysql backup?
I guess you could try suspending the backups (temporarily!) to see if
it is really correlated with the dump.
Are there no other srm requests at the time? (You said it wasn't heavily
loaded, but did you look in all the srmv2 log files?)
There is also the GridppDpmMonitor, which can show srm requests and
give some clues as to traffic:
http://www.gridpp.ac.uk/wiki/DPM_Monitoring
Did this only start after you moved to 1.7.2 btw?
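
If it helps with checking the correlation, here is a rough Python sketch
that pulls out the error lines logged close to the time the dump finishes.
It assumes the line layout of the srmv1 excerpt quoted below
("MM/DD HH:MM:SS pid,tid daemon: message"); the function name, the
10 minute window, the year and the 18:10 dump-finish time are all made up
for illustration, as is the second timestamped sample line.

```python
import re
from datetime import datetime, timedelta

# Assumed layout, based on the srmv1 excerpt quoted in this thread:
#   "12/08 18:13:28 24347,18 srmv1: SRM02 - soap_serve error :"
LINE_RE = re.compile(
    r"^(\d{2})/(\d{2}) (\d{2}):(\d{2}):(\d{2}) \d+,\d+ (\w+): (.*)$"
)


def errors_near(lines, dump_end, window_minutes=10, year=2009):
    """Return (timestamp, daemon, message) for error lines near dump_end.

    The log lines carry no year, so one has to be supplied (assumption).
    """
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # wrapped continuation lines have no timestamp
        month, day, hh, mm, ss = (int(g) for g in m.groups()[:5])
        stamp = datetime(year, month, day, hh, mm, ss)
        daemon, message = m.group(6), m.group(7)
        if "error" in message and abs(stamp - dump_end) <= window:
            hits.append((stamp, daemon, message))
    return hits


sample = [
    "12/08 18:13:28 24347,18 srmv1: SRM02 - soap_serve error :",
    "[130.246.183.209] (lcgfts01.gridpp.rl.ac.uk) : CGSI-gSOAP running on",
    # Invented line, far from the dump, to show it gets filtered out.
    "12/08 03:00:00 24347,18 srmv1: SRM02 - soap_serve error :",
]

# Hypothetical dump-finish time; use the real one from the backup cron log.
hits = errors_near(sample, datetime(2009, 12, 8, 18, 10))
for stamp, daemon, message in hits:
    print(stamp, daemon, message)
```

Running the same thing over a few days of logs, with and without the
backup enabled, should show whether the errors really cluster around the
dump or turn up at other times too.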
Wahid
Govind Songara wrote:
> We are still failing the SAM test on the CE. It would be nice if someone
> could give us an idea of how to troubleshoot this problem.
>
> Thanks
> Govind
>
>
> Govind Songara wrote:
>> Thanks Sam for your advice.
>> After enabling the binary log we did not see any SAM test failures on
>> the SE, but we still have intermittent SAM test failures on the CE
>> around the same time the mysql dump finishes.
>> SE][PrepareToPut][] httpg://se2.ppgrid1.rhul.ac.uk:8446/srm/managerv2: CGSI-gSOAP running on node034 reports Error reading token data header: Connection closed
>> lcg_cr: Communication error on send
>>
>> https://lcg-sam.cern.ch:8443/sam/sam.py?funct=TestResult&nodename=ce2.ppgrid1.rhul.ac.uk&vo=ops&testname=CE-sft-lcg-rm-cr&testtimestamp=1260295735
>>
>> The system is not heavily loaded while the mysql dump is running.
>>
>>
>> There are continuous errors in the srmv1 log around the same time the
>> mysql dump finishes, all from the same server (lcgfts01.gridpp.rl.ac.uk):
>> 12/08 18:13:28 24347,18 srmv1: SRM02 - soap_serve error :
>> [130.246.183.209] (lcgfts01.gridpp.rl.ac.uk) : CGSI-gSOAP running on
>> se2.ppgrid1.rhul.ac.uk reports Error reading token data header:
>> Connection closed
>>
>>
>> The current settings in the mysqld section of /etc/my.cnf are:
>> [mysqld]
>> datadir=/local/mysql
>> socket=/local/mysql/mysql.sock
>> old_passwords=1
>> log-bin
>> expire_logs_days=7
>> set-variable=innodb_buffer_pool_size=256M
>>
>> It would be great if someone could advise on this.
>>
>> Thanks
>> Govind
>>
>>
>> Sam Skipsey wrote:
>>> 2009/11/27 Ewan MacMahon:
>>>
>>>>> -----Original Message-----
>>>>> From: GRIDPP2: Deployment and support of SRM and local storage
>>>>>
>>>>> At RHUL we're seeing sporadic temporary failure of the SE evidenced by
>>>>> SAM test failures which seem to be correlated with the backing up of
>>>>> the mysql database. Has anyone else seen this (Ewan I seem to remember
>>>>> you mentioning it) and if so any tips?
>>>>>
>>>>>
>>>> The suggestion was to use the '--single-transaction' parameter to
>>>> mysqldump, which causes it to start a transaction, dump the state of
>>>> the db from there, then close the transaction. The effect is to avoid
>>>> locking the tables at all, whereas the default behaviour locks the
>>>> tables for the duration of the dump. The transaction approach only
>>>> works properly with tables using the InnoDB backend, but it seems that
>>>> DPM headnode databases do.
>>>>
>>>>
>>>
>>> IIRC, you also need binary logs enabled for this to work precisely as described.
>>> Luckily, this is an easy thing to do.
>>>
>>> Sam
>>>
>>>
>>>> That said, our mysql backup script seems not to be using
>>>> --single-transaction due to (we think) a simple oversight while
>>>> cut-n-pasting bits of script, though I'm sure someone's actually using
>>>> it (Lancaster, possibly?).
>>>>
>>>> Ewan
>>>>
>>>>
>>>
>>>
>>
>
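
For the --single-transaction suggestion quoted above, a backup script
could build the mysqldump command roughly as follows. This is only a
sketch: the helper name is made up, and "cns_db" and "dpm_db" are assumed
database names that should be checked against the actual headnode. Nothing
is executed here; the command line is only assembled and printed.

```python
import shlex


def mysqldump_command(databases, single_transaction=True):
    """Build the argument list for a mysqldump run (does not run it)."""
    argv = ["mysqldump"]
    if single_transaction:
        # Dump from a snapshot taken inside one transaction instead of
        # locking the tables; only consistent for InnoDB tables.
        argv.append("--single-transaction")
    argv.append("--databases")
    argv.extend(databases)
    return argv


# Assumed database names; confirm with SHOW DATABASES on the headnode.
cmd = mysqldump_command(["cns_db", "dpm_db"])
print(shlex.join(cmd))
# prints: mysqldump --single-transaction --databases cns_db dpm_db
```

The list can be handed to subprocess.run() with stdout redirected to the
dump file; credentials and the socket path are deliberately left out.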
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.