Hi Wahid,

Thanks for the info - however, we've farmed off the MySQL DB to a separate machine so I guess the head node's my.cnf doesn't matter? Either way, the old SE doesn't have anything interesting in it anyway.

I suppose that brings up another question though: Do I have to do anything to the MySQL DB if I change the head node? Essentially all I do at present is start the new EMI head node, play with it, fail, and then restart the Glite head node. Should I be changing anything else as well?

I think this pthread fix will sort out the crashing at least which will be a start - I'll see if it helps with the performance as well :)

Thanks,

Mark

On 01/10/12 17:16, Wahid Bhimji wrote:
I think the absence of those fixes makes it break rather quickly rather than perform poorly. But anyway, worth having them of course.

Next thing to check might be the database - i.e. turn on logging of slow queries (log-slow-queries);
make sure it is an InnoDB (!) and increase the innodb_buffer_pool_size in my.cnf.
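
As a rough sketch only, the my.cnf changes mentioned above might look something like the fragment below. The log path, the slow-query threshold and the buffer pool size are illustrative assumptions, not recommended values; the right pool size depends on how much RAM the database machine has.

```ini
[mysqld]
# Log statements that run longer than long_query_time seconds.
# log-slow-queries is the older option name; newer MySQL releases
# use slow_query_log and slow_query_log_file instead.
# The path and the 2 second threshold are assumptions.
log-slow-queries = /var/log/mysql-slow.log
long_query_time = 2

# Memory InnoDB uses to cache data and indexes.
# 1G is an assumed placeholder value, size it to the available RAM.
innodb_buffer_pool_size = 1G
```

mysqld needs a restart to pick these up, and they only matter on the machine that actually runs the database.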

Actually EMI should be better in that respect as it is supposed to have sensible defaults for my.cnf. But I replaced mine as soon as installed so maybe the defaults were in fact not sensible.

Wahid


On 1 Oct 2012, at 17:00, Mark Slater <[log in to unmask]> wrote:

Thanks for this Sam - I'll try that tomorrow and see if it helps!

Mark



On 01/10/12 16:56, Sam Skipsey wrote:
Did you apply the fixes in the known issues page here:
https://svnweb.cern.ch/trac/lcgdm/blog/official-release-lcgdm-183 ?

If you did, I've not seen this on a previous EMI release either.

Sam

On 1 October 2012 16:39, Mark Slater <[log in to unmask]> wrote:
Hi All,

I managed to get the new EMI2 head node up and running last week after
fixing my minor screw up with the permissions. However, the performance is
*terrible* compared to the previous Glite install. I had thought it was to
do with running it on a VM, but I reinstalled on a baremetal machine today
(actually better spec than the original one) and I see the same problem.
Basically, it starts OK, then it begins to take ages to take the SRM request
and offload it to the pool node. A simple lcg-cr for example would take ~5s
previously but on the new one can take up to 20s! This goes on for a few
hours and then services (dpm, srmv2.2, dpns) start randomly falling over and
won't stay up for any great length of time.

I'm guessing there is some tuning I should have done but I haven't found
anything to tell me what :( Has anyone seen this before??

Thanks!

Mark

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.