Look in /etc/sysconfig/dpm and /etc/sysconfig/dpnsdaemon:
you'll see commented-out lines for setting the number of DPM slow and
fast threads, etc.
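A minimal sketch for locating those lines, assuming a Python 3 interpreter is at hand (not the default on SL5, so the two files may need to be copied elsewhere first). It deliberately does not assume any variable names: it only lists lines that mention "thread" and says whether each one is commented out. The function names are made up for illustration.

```python
import re
from pathlib import Path

# The two files named in the message above.
CONFIG_FILES = ["/etc/sysconfig/dpm", "/etc/sysconfig/dpnsdaemon"]


def find_thread_settings(text):
    """Return (line_number, stripped_line, is_commented) for lines mentioning 'thread'."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if re.search(r"thread", line, re.IGNORECASE):
            commented = line.lstrip().startswith("#")
            hits.append((number, line.strip(), commented))
    return hits


def report(paths=CONFIG_FILES):
    """Print every thread-related line found in the given files."""
    for path in paths:
        p = Path(path)
        if not p.is_file():
            print(f"{path}: not found")
            continue
        for number, line, commented in find_thread_settings(p.read_text()):
            state = "commented out" if commented else "active"
            print(f"{path}:{number}: [{state}] {line}")


if __name__ == "__main__":
    report()
```

After uncommenting a line and changing its value, the corresponding daemon presumably has to be restarted before the new thread count takes effect.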
Sam
On 27 July 2012 09:40, SCHAER Frederic <[log in to unmask]> wrote:
> So…
>
> No one knows how to increase the number of DPM threads ?
>
>
>
> Cheers
>
>
>
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On behalf
> of SCHAER Frederic
> Sent: Friday 20 July 2012 09:50
> To: [log in to unmask]
> Subject: [PROVENANCE INTERNET] Re: [LCG-ROLLOUT] gLite 3.2 DPM server load
> issue
>
>
>
> That was one of the questions… would you by chance know how to do that?
>
> I found a gLite patch mentioning that possibility in 1.8.2 (we run 1.8.0…),
> but I’m still looking for documentation on how to achieve this.
>
>
>
> Regards
>
> Frederic
>
>
>
> From: LHC Computer Grid - Rollout [mailto:[log in to unmask]] On behalf
> of Eric Fede
> Sent: Thursday 19 July 2012 17:03
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] gLite 3.2 DPM server load issue
>
>
>
> Hi
>
>
>
> Did you try to increase the number of DPM and DPNS threads?
>
> Eric
>
> On 19/07/2012 16:28, SCHAER Frederic wrote:
>
> Hi,
>
>
>
> We are facing some DPM load issues at GRIF/IRFU: when we receive about 2000
> ATLAS analysis jobs, the dpm process starts to time out (nagios cannot
> telnet to the DPM port), transfers start to fail, and GGUS tickets start to
> flow…
>
> When looking at the mysqld-slow logs that we enabled a while ago, I see the
> following :
>
>
>
> # du -h /var/lib/mysql/mysqld-slow.log
>
> 123M /var/lib/mysql/mysqld-slow.log
>
> (ouch)
>
>
>
> And inside that file, many requests that normally shouldn’t take so much
> time given the number of examined rows :
>
>
>
> # Time: 120719 16:00:57
> # User@Host: dpmmgr[dpmmgr] @ node12.datagrid.cea.fr [192.54.206.27]
> # Query_time: 16 Lock_time: 0 Rows_sent: 1 Rows_examined: 537
> use dpm_db;
> SELECT MAX(lifetime) FROM dpm_get_filereq WHERE pfn =
> 'node94.datagrid.cea.fr:/fs4/cms/2012-06-08/C4FA6C05-51AE-E111-B9BA-001D09F295A1.root.81799623.0';
>
> # Time: 120719 16:02:06
> # User@Host: dpmmgr[dpmmgr] @ node12.datagrid.cea.fr [192.54.206.27]
> # Query_time: 11 Lock_time: 0 Rows_sent: 1 Rows_examined: 8272
> SELECT ROWID, R_ORDINAL, R_TOKEN, R_UID, R_GID,
> CLIENT_DN, CLIENTHOST, R_TYPE, U_TOKEN, FLAGS,
> RETRYTIME, NBREQFILES, CTIME, STIME, ETIME, STATUS,
> ERRSTRING, GROUPS FROM dpm_pending_req WHERE status
> = 4096 ORDER BY ctime, r_ordinal LIMIT 1
> FOR UPDATE;
>
> # Time: 120719 16:02:09
> # User@Host: dpmmgr[dpmmgr] @ node12.datagrid.cea.fr [192.54.206.27]
> # Query_time: 11 Lock_time: 0 Rows_sent: 1 Rows_examined: 8273
> SELECT ROWID, R_ORDINAL, R_TOKEN, R_UID, R_GID,
> CLIENT_DN, CLIENTHOST, R_TYPE, U_TOKEN, FLAGS,
> RETRYTIME, NBREQFILES, CTIME, STIME, ETIME, STATUS,
> ERRSTRING, GROUPS FROM dpm_pending_req WHERE status
> = 4096 ORDER BY ctime, r_ordinal LIMIT 1
> FOR UPDATE;
>
>
>
> Taking 16 seconds to examine 537 records seems surprising to me.
>
> At first, I wondered whether the DPM database had grown too large (it is
> 24 GB), but now I am wondering what could make the node so I/O hungry
> that mysql cannot reply to requests in a timely manner.
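To make the "long query time, few rows examined" pattern easier to spot in a 123M log, here is a rough sketch that parses slow-log entries in the format quoted above and flags the ones that look disproportionate. It assumes Python 3 and a copy of the log; the function names and the thresholds (10 seconds, 10000 rows) are arbitrary choices for illustration, not values taken from DPM or MySQL.

```python
import re

# Matches header lines such as:
# "# Query_time: 16 Lock_time: 0 Rows_sent: 1 Rows_examined: 537"
HEADER = re.compile(
    r"^# Query_time:\s*(\d+(?:\.\d+)?)\s+Lock_time:\s*(\d+(?:\.\d+)?)"
    r"\s+Rows_sent:\s*(\d+)\s+Rows_examined:\s*(\d+)"
)


def parse_slow_log(text):
    """Return one dict per slow-log entry with timings, row counts and SQL."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = {
                "query_time": float(match.group(1)),
                "lock_time": float(match.group(2)),
                "rows_sent": int(match.group(3)),
                "rows_examined": int(match.group(4)),
                "sql": "",
            }
            entries.append(current)
        elif not line or line.startswith("#"):
            continue  # blank lines and other comment headers (Time, User@Host)
        elif line.lower().startswith("use "):
            continue  # database selection, not part of the statement
        elif current is not None:
            # Statements may wrap over several lines; join them with spaces.
            current["sql"] = (current["sql"] + " " + line).strip()
    return entries


def flag_suspicious(entries, max_rows=10000, min_seconds=10):
    """Entries that ran long although they examined few rows."""
    return [
        e for e in entries
        if e["query_time"] >= min_seconds and e["rows_examined"] <= max_rows
    ]


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for entry in flag_suspicious(parse_slow_log(handle.read())):
                print(entry["query_time"], entry["rows_examined"], entry["sql"][:80])
```

If most flagged entries cluster in the same time windows regardless of which table they touch, that would point towards the node being starved (I/O or locks) rather than towards any single expensive query.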
>
>
>
> Of course, this is an SL5 node, so we don’t have iotop to find out which
> process could be the culprit.
>
> Would someone have any hint on where to look, and what to tweak ?
>
> Is there a limited number of replies a DPM process can handle per second ?
>
>
>
> Thanks && regards
>
>
>
>
> --
>
> ---------------------------------------------------
>
> Eric Fede
>
> L.A.P.P. - BP 110, Chemin de Bellevue
>
> 74941 Annecy le Vieux cedex
>
> Tel : (+33) (0)4.50.09.17.02
>
> Fax : (+33) (0)4.50.27.94.95
>
> email: [log in to unmask] ; [log in to unmask]
>
> ---------------------------------------------------