JISCMail - GRIDPP-STORAGE Archives

Hi Wahid,

Thanks for the info - however, we've farmed off the MySQL DB to a separate machine so I guess my.cnf doesn't matter? Either way, the old SE doesn't have anything interesting in it anyway....

I suppose that brings up another question though: Do I have to do anything to the MySQL DB if I change the head node? Essentially all I do at present is start the new EMI head node, play with it, fail, and then restart the gLite head node. Should I be changing anything else as well?

I think this pthread fix will sort out the crashing at least which will be a start - I'll see if it helps with the performance as well :)

Thanks,

Mark

On 01/10/12 17:16, Wahid Bhimji wrote:

I think the absence of those fixes makes it break rather quickly rather than perform poorly. But anyway, worth having them of course.

Next thing to check might be the database, i.e. turn on logging of slow queries (log-slow-queries), make sure it is InnoDB (!) and increase the innodb_buffer_pool_size in my.cnf. Actually EMI should be better in that respect as it is supposed to have sensible defaults for my.cnf. But I replaced mine as soon as I installed, so maybe the defaults were in fact not sensible.

Wahid

On 1 Oct 2012, at 17:00, Mark Slater wrote:

Did you apply the fixes in the known issues page here: https://svnweb.cern.ch/trac/lcgdm/blog/official-release-lcgdm-183 ?

If you did, I've not seen this on the previous EMI release either.

Sam

On 1 October 2012 16:39, Mark Slater wrote:

Hi All,

I managed to get the new EMI2 head node up and running last week after fixing my minor screw up with the permissions. However, the performance is *terrible* compared to the previous gLite install. I had thought it was to do with running it on a VM, but I reinstalled on a bare-metal machine today (actually better spec than the original one) and I see the same problem.

Basically, it starts OK, then it begins to take ages to take the SRM request and offload it to the pool node. A simple lcg-cr for example would take ~5s previously but on the new one can take up to 20s! This goes on for a few hours and then services (dpm, srmv2.2, dpns) start randomly falling over and won't stay up for any great length of time.

I'm guessing there is some tuning I should have done but I haven't found anything to tell me what :( Has anyone seen this before??

Thanks!

Mark
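
For reference, the my.cnf changes Wahid suggests above might look roughly like the fragment below on the machine hosting the MySQL DB. This is only a sketch under stated assumptions: the log path, the 2 second threshold and the 1G buffer pool are placeholders, not values from this thread, and log-slow-queries is the MySQL 5.0/5.1 spelling (newer servers use slow_query_log and slow_query_log_file instead).

```
[mysqld]
# Log statements that take longer than long_query_time seconds.
# The path is a placeholder; the file must be writable by the mysql user.
log-slow-queries = /var/log/mysqld-slow.log
long_query_time = 2

# Memory InnoDB uses to cache data and indexes.
# 1G is illustrative only; size it to the RAM available on the DB host.
innodb_buffer_pool_size = 1G
```

mysqld needs a restart to pick up changes made in my.cnf, and whether the tables are actually InnoDB can be checked from the mysql client with SHOW TABLE STATUS on each database.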