Hi Gianfranco,

FYI, we are still investigating the reasons for this problem with APT auto-update.

If you are talking about the "error on rename", you should use the most recent backup you have from before the apt auto-update.
Please use glite-yaim-3.0.0-38 (to be released today), and not glite-yaim-3.0.0-36.
And re-configure the DPM node with YAIM.

Thank you, Sophie.

> Could this be clarified? I had the apt auto-update on Friday and
> dpm-qryconf went nuts as a result. However, copying files to and from
> the DPM was not affected. Now, running the update script today (after
> having stopped all the services) fails. This failure was reported at
> the beginning of the thread as a sign of DB corruption.
>
> I have regular dumps of the DB, so restoring to working order should
> not be a big deal (needless to say, however, it could have been
> avoided, as commented in this very thread). It would be desirable to
> know how far back I have to go in order to retrieve the latest
> possible functional snapshot. Is the corruption likely to occur as the
> new rpm's are installed (perhaps triggered by an attempted transfer
> after that), or is the action of running the script likely to corrupt
> the lot?
>
> cheers,
> Gianfranco
>
> Michel Jouvin wrote:
>
>> Sophie can confirm but I think there is no risk of corruption if
>> running the new server on the old db: it will just fail. The problem is
>> running the update script with the service (old or new) running.
>>
>> Michel
>>
>> --On Saturday 10 March 2007 19:03 +0100 Debreczeni Gergely
>> <[log in to unmask]> wrote:
>>
>>> Hi !
>>>
>>> Just thinking out loud:
>>>
>>> The apt-autoupdate updated the rpms, but none of the services was
>>> restarted. (I've checked the .spec files).
>>> So after the upgrade you had the new DPM libraries and files installed
>>> but the old servers running.
>>> When we tested the upgrade script we strictly followed the
>>> description: no hours passed between the rpm upgrade and the
>>> database schema upgrade, and no data transfers ran on the server
>>> in the meantime...
>>>
>>> So in your case what probably happened is that the old server tried to
>>> load one of the new shared libraries during the night (because you had
>>> an ongoing transfer), which is obviously a weird situation, and that
>>> caused the DB corruption.
>>>
>>> If, as you proposed, the rpm postinstall script had stopped the
>>> service, then you would have woken up in the morning with some crashed
>>> data transfers... (I dunno which one is better :-))
>>>
>>> So, neither solution is perfect; personally
>>> *I'm very much against apt-autoupdate*.
>>> If I ran a production site, it would be me who
>>> would want to do the upgrade, follow the output, and
>>> read the release notes carefully beforehand, not only superficially.....
>>>
>>> So, both sides need some improvement ;-)
>>>
>>> Best regards and good weekend,
>>> Gergo
>>>
>>> PS: And of course, very probably, once the database is corrupted the
>>> update script is not going to work...
>>>
>>> Adam Padee wrote:
>>>
>>>> Sophie Lemaitre wrote:
>>>>
>>>>> Wait, I agree only with the documentation change time and date.
>>>>>
>>>>> But, starting and stopping the services is done by YAIM as needed.
>>>>> This is also explained in the Wiki documentation (since the
>>>>> beginning) as well as in the release notes.
>>>>>
>>>>
>>>> Well, you're right. Probably the best way to do it was to use YAIM.
>>>> But, as I mentioned previously, my SE was upgraded by apt-autoupdate,
>>>> which unfortunately doesn't run YAIM. When I woke up in the morning, my
>>>> databases were already corrupt. So I had to deal with the problem
>>>> manually. I don't mind updating things manually.
>>>> But gLite adopted a continuous update model, which makes sense only
>>>> with automatic update tools. I agree that some things cannot be done
>>>> without manual intervention. But in such a case I would like to have it
>>>> stated explicitly in the release notes that come to my mailbox. As
>>>> updates are "continuous", I look at these notes only superficially, and
>>>> unless I find something really serious, stated in capital letters, I
>>>> let it go automatically. If I had to update all the nodes manually
>>>> after every minor update, then the "continuous" update model = much
>>>> more work than in the previous "release" model.
>>>> In the update 16 release notes I see only "pay close attention to
>>>> glite-CE and lcg-CE_torque". Nothing at all about reconfiguration of
>>>> SE_dpm_mysql.
>>>>
>>>> I really don't like to repeat the discussion that has already taken
>>>> place here in Sept'06 along with the openssh update. But I think that
>>>> putting packages into the production repository that, without special
>>>> treatment, may cause services to malfunction, when a lot of people use
>>>> apt-autoupdate, is not a very good idea. I (partially) understand the
>>>> openssh case, as it is an external package. But if the same thing
>>>> happens with EGEE packages, which are not critical security updates, I
>>>> begin to wonder what PPS is for?
>>>>
>>>>> We are always happy to answer all GGUS tickets we get, so please
>>>>> send a mail if you are "fighting", or not sure in which order to do
>>>>> what.
>>>>>
>>>>
>>>> I appreciate that, and I'm really grateful for the help I already
>>>> received from the DPM team (for example with my problem with dpm-drain
>>>> in ver 1.5.6), but GGUS tickets have to travel a very long way before
>>>> they reach your desk. Usually they are sorted by the TPM shift, sent to
>>>> the ROC, analyzed by ROC 1st line support, sent back to GGUS, and then
>>>> assigned to your group.
>>>> At least this is what happened with my previous
>>>> ticket concerning DPM. When the harm is already done, and my site does
>>>> not work, I don't think that going through GGUS is the quickest way to
>>>> solve the problem.
>>>>
>>>> Cheers,
>>>> Adam
>>>>
>>>
>>
>> *************************************************************
>> * Michel Jouvin                 Email : [log in to unmask]  *
>> * LAL / CNRS                    Tel : +33 1 64468932        *
>> * B.P. 34                       Fax : +33 1 69079404        *
>> * 91898 Orsay Cedex                                         *
>> * France                                                    *
>> *************************************************************
>>