Print

Print


Huh, well, the last query shouldn't take that long, as there's an index on parent_fileid and name in Cns_file_metadata (and there's only one entry which has any given parent_fileid).
This sounds like there's some other issue, possibly load-related on the mysql server?
(What's your /etc/my.cnf look like?)

Sam

On Tue, Apr 26, 2016 at 10:54 AM Matt Williams <[log in to unmask]> wrote:
Hi,

I've followed Sam's advice about the tuning guide and have applied
most of the ideas there. It's helped some small amount from the DPM
head node side but the MySQL server is still under heavy load. I've
collected some slow query logs and the output is:

$ mysqldumpslow -a -s l -t 5 /var/log/mysql-slow.log  # By lock time

Reading mysql slow query log from /var/log/mysql-slow.log
Count: 2143  Time=78.32s (167836s)  Lock=0.02s (40s)  Rows=1.0 (2143),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT                 FILEID, PARENT_FILEID, GUID, NAME, FILEMODE,
NLINK, OWNER_UID,                  GID, FILESIZE, ATIME, MTIME, CTIME,
FILECLASS, STATUS, CSUMTYPE,                CSUMVALUE, ACL
    FROM Cns_file_metadata               WHERE parent_fileid = 0
          AND name = '/'

Count: 7  Time=1.70s (11s)  Lock=0.87s (6s)  Rows=16.0 (112),
root[root]@localhost
  SELECT CONCAT(User, '@',Host) AS User FROM mysql.user

Count: 92  Time=34.22s (3148s)  Lock=0.01s (1s)  Rows=1.0 (92),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT                 R_ORDINAL, R_TOKEN, R_UID,
R_GID, CLIENT_DN, CLIENTHOST,           R_TYPE, U_TOKEN,
 FLAGS, RETRYTIME, NBREQFILES,           CTIME, STIME, ETIME,
  STATUS, ERRSTRING, GROUPS  FROM dpm_pending_req             WHERE
r_token = '643c0750-8998-4d35-bc32-f09079935f60'

Count: 1  Time=7.52s (7s)  Lock=0.82s (0s)  Rows=0.0 (0),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT                 S_TOKEN, CLIENT_DN, S_UID, S_GID, RET_POLICY,
          AC_LATENCY, S_TYPE, U_TOKEN, T_SPACE, G_SPACE,
   U_SPACE, POOLNAME, ASSIGN_TIME, EXPIRE_TIME, GROUPS            FROM
dpm_space_reserv       WHERE expire_time <= 1461600329

Count: 1544  Time=55.77s (86108s)  Lock=0.00s (0s)  Rows=1.0 (1544),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT fileid, parent_fileid, guid, name, filemode, nlink,
owner_uid, gid,            filesize, atime, mtime, ctime, fileclass,
status,            csumtype, csumvalue, acl, xattr        FROM
Cns_file_metadata         WHERE parent_fileid = 3 AND name =
'ph.bham.ac.uk'


or

$mysqldumpslow -a -s r -t 5 /var/log/mysql-slow.log  # By rows returned

Reading mysql slow query log from /var/log/mysql-slow.log
Count: 441  Time=19297.85s (8510350s)  Lock=0.00s (0s)  Rows=1676767.5
(739454489), dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  select fileid,
parent_fileid,name,filesize,filemode,csumvalue,atime,mtime from
Cns_file_metadata order by parent_fileid

Count: 50  Time=965.76s (48287s)  Lock=0.02s (0s)  Rows=114.7 (5735),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT                 R_ORDINAL, R_TOKEN, R_UID,
R_GID, CLIENT_DN, CLIENTHOST,           R_TYPE, U_TOKEN,
 FLAGS, RETRYTIME, NBREQFILES,           CTIME, STIME, ETIME,
  STATUS, ERRSTRING, GROUPS  FROM dpm_pending_req

Count: 72  Time=617.20s (44438s)  Lock=0.00s (0s)  Rows=52.1 (3750),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT fileid, parent_fileid, guid, name, filemode, nlink,
owner_uid, gid,          filesize, atime, mtime, ctime, fileclass,
status,          csumtype, csumvalue, acl, xattr        FROM
Cns_file_metadata         WHERE parent_fileid = 8415288        ORDER
BY name ASC

Count: 2759  Time=221.25s (610418s)  Lock=0.00s (0s)  Rows=1.0 (2759),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT fileid, parent_fileid, guid, name, filemode, nlink,
owner_uid, gid,            filesize, atime, mtime, ctime, fileclass,
status,            csumtype, csumvalue, acl, xattr        FROM
Cns_file_metadata         WHERE parent_fileid = 0 AND name = '/'

Count: 2143  Time=78.32s (167836s)  Lock=0.02s (40s)  Rows=1.0 (2143),
dpmmgr[dpmmgr]@epgse1.ph.bham.ac.uk
  SELECT                 FILEID, PARENT_FILEID, GUID, NAME, FILEMODE,
NLINK, OWNER_UID,                  GID, FILESIZE, ATIME, MTIME, CTIME,
FILECLASS, STATUS, CSUMTYPE,                CSUMVALUE, ACL
    FROM Cns_file_metadata               WHERE parent_fileid = 0
          AND name = '/'


I don't know how unusual these queries are but maybe you have some insight?

Cheers,
Matt


On 25 April 2016 at 13:27, Terry Froy <[log in to unmask]> wrote:
> On Mon, 2016-04-25 at 12:10 +0100, Matt Williams wrote:
>
>> Hi all,
>>
>> I'm seeing very high loads (around 12 or 13) on our DPM MySQL server.
>> This seems to be causing timeouts when DPM is requesting information
>> as I am seeing lots or errors like:
>>
>> 160425 12:05:17 [ERROR] /usr/libexec/mysqld: Lock wait timeout
>> exceeded; try restarting transaction
>> 160425 12:05:17 [ERROR] /usr/libexec/mysqld: Sort aborted
>>
>> in /var/log/mysqld.log
>>
>> Does anyone have any ideas about how I can start debugging this?
>> <snip>
>
> Hi Matt,
>
> Turn on slow query logging.
>
> ... and when you have some logs collected, if you can't figure out
> who/what/why this is happening, post them here so we can have a look!
>
> Regards,
> Terry
> --
> Terry Froy
> Cluster Systems Manager, Particle Physics
> Queen Mary, University of London
> Tel: +44 (0)207 882 6560
> E-mail: [log in to unmask]