Hi Wahid,
Thanks for the info - however, we've farmed off the mysql DB to a
separate machine so I guess my.cnf doesn't matter? Either way, the
old SE doesn't have anything interesting in it anyway....
I suppose that brings up another question though: Do I have to do
anything to the MySQL DB if I change the head node? Essentially
all I do at present is start the new EMI head node, play with it,
fail, and then restart the Glite head node. Should I be making
changes to anything else as well?
I think this pthread fix will sort out the crashing at least which
will be a start - I'll see if it helps with the performance as
well :)
Thanks,
Mark
On 01/10/12 17:16, Wahid Bhimji wrote:
I think the absence of those fixes makes it break quickly rather than perform poorly. But they are worth having anyway, of course.
The next thing to check might be the database, i.e. turn on logging of slow queries (log-slow-queries),
make sure it is using InnoDB (!) and increase the innodb_buffer_pool_size in my.cnf.
Actually EMI should be better in that respect as it is supposed to have sensible defaults for my.cnf. But I replaced mine as soon as it was installed, so maybe the defaults were in fact not sensible.
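For reference, the my.cnf settings mentioned above might look roughly like this. It is only a sketch: the log file path, the threshold and the buffer pool size are placeholder values and not taken from a working DPM install, so adjust them to the memory available on the database host. Note that log-slow-queries is the older option name; newer MySQL versions use slow_query_log and slow_query_log_file instead.

```ini
[mysqld]
# Log statements that take longer than long_query_time seconds.
# The path is a placeholder; the mysql user must be able to write to it.
log-slow-queries = /var/log/mysql-slow.log
long_query_time  = 2

# Memory InnoDB uses to cache data and indexes.
# 1G is an illustrative value only; size it to the RAM on the DB machine.
innodb_buffer_pool_size = 1G

# The storage engine of existing tables is not set here; check it
# separately, e.g. with SHOW TABLE STATUS in the mysql client.
```

mysqld needs a restart to pick up changes made in my.cnf.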
Wahid
On 1 Oct 2012, at 17:00, Mark Slater <[log in to unmask]> wrote:
Thanks for this Sam - I'll try that tomorrow and see if it helps!
Mark
On 01/10/12 16:56, Sam Skipsey wrote:
Did you apply the fixes in the known issues page here:
https://svnweb.cern.ch/trac/lcgdm/blog/official-release-lcgdm-183 ?
If you did, I've not seen this on the previous EMI release either.
Sam
On 1 October 2012 16:39, Mark Slater <[log in to unmask]> wrote:
Hi All,
I managed to get the new EMI2 head node up and running last week after
fixing my minor screw up with the permissions. However, the performance is
*terrible* compared to the previous Glite install. I had thought it was to
do with running it on a VM, but I reinstalled on a bare-metal machine today
(actually better spec than the original one) and I see the same problem.
Basically, it starts OK, then it begins to take ages to take the SRM request
and offload it to the pool node. A simple lcg-cr for example would take ~5s
previously but on the new one can take up to 20s! This goes on for a few
hours and then services (dpm, srmv2.2, dpns) start randomly falling over and
won't stay up for any great length of time.
I'm guessing there is some tuning I should have done but I haven't found
anything to tell me what :( Has anyone seen this before??
Thanks!
Mark