Hello, thanks for the reply.
The fetch-crl cron runs at a similar interval (every 6 hours) but at 27
minutes past the hour, so after the failures. Would increasing its
frequency (say to every 4 hours) be a way to prevent stale CRLs?
Although I'd be surprised if things went bad that quickly every day.
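As a rough sanity check on the timing, here is a small sketch of how old
the last CRL fetch would be at each failure. It assumes the cron fires at
00:27, 06:27, 12:27 and 18:27; only the 6-hour interval and the :27 minute
are known for certain, so the hours (and the helper name) are guesses.

```python
# Assumed schedule: the hours are a guess; only ":27, every 6 hours" is known.
FETCH_HOURS = (0, 6, 12, 18)
FETCH_MINUTE = 27
MINUTES_PER_DAY = 24 * 60


def minutes_since_last_fetch(hour, minute):
    """Minutes between the most recent assumed fetch run and hour:minute."""
    now = hour * 60 + minute
    # Age relative to every fetch time, wrapping around midnight; the
    # smallest age belongs to the most recent run.
    return min(
        (now - (h * 60 + FETCH_MINUTE)) % MINUTES_PER_DAY for h in FETCH_HOURS
    )


# The two daily SAM failure times reported in this thread.
for hour in (6, 18):
    age = minutes_since_last_fetch(hour, 13)
    print(f"{hour:02d}:13 -> {age // 60}h{age % 60:02d}m since last fetch")
```

Under that assumption the CRLs are 5h46m old at both failures, but they
would be just as old at 00:13 and 12:13, so the fetch timing alone would
not explain why only two of the four tests fail.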
I'll shunt around the timing of the mysql backups and see if that makes
a difference. Let's see what happens over the weekend.
Have a good weekend all,
Matt
Greig A. Cowan wrote:
> Hi Matt,
>
> When does fetch-crl run? gSOAP errors like that are often caused by
> out of date CRLs.
>
> Can you change the MySQL backup to a different time to see if it
> correlates with the SAM failures?
>
> Greig
>
> Matt Doidge wrote, On 09/01/09 11:59:
>> Heya guys, and Happy 2009 to all,
>>
>> We're regularly failing srm SAM tests at ~6.13 and ~18.13 every day
>> with the error message pasted below. Such regular failing sets off
>> the obvious alarm bells, and I immediately checked the cron jobs.
>> Both the edg-mkgridmap and our mysql backup happen at the time of
>> these failures, but as these are 6 hourly cronjobs I would also
>> expect them to interfere with the midnight and midday tests. Also the
>> error message doesn't quite fit with what I'd expect (last time we
>> saw a similar error message it was caused by network problems between
>> the worker nodes/CE and the SE). I'd appreciate any wisdom on this
>> matter.
>>
>> cheers,
>> Matt
>>
>> + lcg-cr --version
>> lcg_util-1.6.15
>> GFAL-client-1.10.17
>> + set +x
>>
>> + lcg-cr -t 120 -v --vo ops file:/home/samops/.same/SE/testFile.txt
>> -l lfn:SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507 -d
>> fal-pygrid-30.lancs.ac.uk
>> Using grid catalog type: lfc
>> Using grid catalog : prod-lfc-shared-central.cern.ch
>> Using LFN : /grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
>> [BDII] sam-bdii.cern.ch:2170: Warning, no GlueVOInfo information
>> found about tag '(null)' and SE 'fal-pygrid-30.lancs.ac.uk'
>> SE type: SRMv1
>> Using SURL :
>> srm://fal-pygrid-30.lancs.ac.uk/dpm/lancs.ac.uk/home/ops/generated/2009-01-09/file33df0c61-861c-4f81-9efa-3c6999a6d6d1
>>
>> Alias registered in Catalog:
>> lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
>> Alias registered in Catalog:
>> lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
>> Alias registered in Catalog:
>> lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
>> Alias registered in Catalog:
>> lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
>> Alias registered in Catalog:
>> lfn:/grid/ops/SAM/SE-lcg-cr-fal-pygrid-30.lancs.ac.uk-1231481507
>> [SE][put] httpg://fal-pygrid-30.lancs.ac.uk:8443/srm/managerv1:
>> CGSI-gSOAP: Error reading token data header: Connection closed
>> lcg_cr: Operation now in progress
>> + result=1
>> + set +x
>>
>