We were hit by DB corruption following the apt auto-update last Friday
and an attempt to run the upgrade script on Monday (after stopping all
services, as recommended).
We have now restored functionality following the procedure recommended
on Rollout:
1) Ditch the database (this applies to the MySQL flavour)
2) Restore the most recent backup prior to the apt auto-update
3) Run the latest YAIM, 3.0.0-38. This will also run the database schema
update for 1.6.3. (YAIM 3.0.0-36 was said not to work for this, see
Savannah bug #24589, but Graeme's experience proved otherwise.)
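For step 2, the part that is easy to get wrong is picking the right dump.
A minimal Python sketch, assuming the backups are tracked as (timestamp,
file name) pairs. The helper name, file names and dates below are all
made up for illustration and are not part of any DPM or YAIM tooling:

```python
from datetime import datetime


def latest_backup_before(backups, cutoff):
    """Return the newest (timestamp, name) pair taken strictly before cutoff.

    Returns None if no backup predates the cutoff.
    """
    earlier = [b for b in backups if b[0] < cutoff]
    return max(earlier, key=lambda b: b[0]) if earlier else None


# Hypothetical nightly dumps; the third one postdates the auto-update.
backups = [
    (datetime(2007, 3, 7, 2, 0), "dump-a.sql"),
    (datetime(2007, 3, 8, 2, 0), "dump-b.sql"),
    (datetime(2007, 3, 10, 2, 0), "dump-c.sql"),
]

# Hypothetical time of the apt auto-update.
auto_update = datetime(2007, 3, 9, 4, 0)

chosen = latest_backup_before(backups, auto_update)
print(chosen[1] if chosen else "no backup predates the auto-update")
```

Anything dumped after the auto-update is skipped on purpose, since it may
already contain the corrupted state.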
A couple of notes for those unsure what to do:
- The DPM people have not yet been able to reproduce the DB corruption
under the circumstances observed at some sites. They are still investigating.
- Following the apt auto-update, the dpm service continued to run and SAM
tests were green
- After stopping the service and trying out the upgrade script manually,
the dpm service would not restart
- The upgrade script fails with the error "error on rename"
cheers,
Gianfranco
Graeme Stewart wrote:
> Hey, that makes me the UK's fool... damn!
>
> g
>
> On 13 Mar 2007, at 16:20, Alessandra Forti wrote:
>
>>> My summary of the rollout thread is that you should definitely not
>>> be auto-updating RPMs on gLite server nodes (I suspect you can get
>>> away with it most of the time on WNs, although I do remember a java
>>> update breaking various things some time ago...).
>>
>> and you should wait for some other 'fool' to go ahead and test the
>> updates for you.... then you can do it in half an hour.
>>
>> cheers
>> alessandra
>
> --
> Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
> GridPP DM Wiki - http://wiki.gridpp.ac.uk/wiki/Data_Management
> ScotGrid - http://www.scotgrid.ac.uk/ http://scotgrid.blogspot.com/
--
Dr. Gianfranco Sciacca Tel: +44 (0)20 7679 3044
Dept of Physics and Astronomy Internal: 33044
University College London D15 - Physics Building
London WC1E 6BT