Dear all,
We currently have a problem with Maui on one of our batch servers:
diagnose -f does not completely report the stanzas listed below.
GROUP
(5 unix groups are missing, although defined in the pbs server and
maui.cfg, and using resources)
QOS
No entry at all (even the "QOS" header is missing)
CLASS
The same as for QOS
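For anyone who wants to reproduce the comparison, here is a minimal sketch of how the check could be scripted rather than done by eye. It assumes that groups are declared in maui.cfg as GROUPCFG[name] lines, and that in the diagnose -f output each stanza starts with a line consisting only of its keyword (USER, GROUP, ACCT, QOS, CLASS), followed by a dashed rule and one line per entry whose first token is the name, possibly with a trailing "*". The function names are hypothetical, and the output layout is an assumption that should be checked against the real output.

```python
import re

# Stanza keywords assumed to appear alone on a line in `diagnose -f` output.
SECTIONS = ("USER", "GROUP", "ACCT", "QOS", "CLASS")


def configured_groups(maui_cfg_text):
    """Return the group names declared as GROUPCFG[name] in maui.cfg text."""
    names = set()
    for line in maui_cfg_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        match = re.match(r"\s*GROUPCFG\[([^\]]+)\]", line)
        if match:
            names.add(match.group(1))
    names.discard("DEFAULT")  # the DEFAULT entry is a template, not a unix group
    return names


def reported_entries(diagnose_text):
    """Map each stanza keyword found to the set of entry names listed under it."""
    found = {}
    current = None
    for line in diagnose_text.splitlines():
        stripped = line.strip()
        if stripped in SECTIONS:
            current = stripped
            found[current] = set()
        elif current and stripped and not stripped.startswith("-"):
            # First token is taken as the entry name; drop a trailing "*".
            found[current].add(stripped.split()[0].rstrip("*"))
    return found


def missing_groups(maui_cfg_text, diagnose_text):
    """Return configured groups that do not appear in the GROUP stanza."""
    reported = reported_entries(diagnose_text).get("GROUP", set())
    return sorted(configured_groups(maui_cfg_text) - reported)
```

Fed with the contents of maui.cfg and a saved copy of the diagnose -f output, missing_groups() would list the groups that are configured but not reported, and the keys of reported_entries() would show which stanzas are present at all.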
The problem appeared roughly one week ago. Three events might be
correlated: we introduced new hardware shortly before, the system was
under very heavy load (~5000 jobs queued and running), and we
configured a second CE for some time on a testing basis (now removed,
PBS configuration reverted).
We run one CE (grid-ce3.desy.de) in front of this batch server. Some
information relevant to the batch server:
glite-apel-pbs-2.0.5-2.noarch
maui-3.2.6p20-snap.1182974819.8.slc4.i386
maui-client-3.2.6p20-snap.1182974819.8.slc4.i386
maui-server-3.2.6p20-snap.1182974819.8.slc4.i386
torque-2.3.0-snap.200801151629.2cri.slc4.i386
torque-client-2.3.0-snap.200801151629.2cri.slc4.i386
torque-mom-2.3.0-snap.200801151629.2cri.slc4.i386
torque-server-2.3.0-snap.200801151629.2cri.slc4.i386
root@grid-batch3: [~] uname -a
Linux grid-batch3.desy.de 2.6.9-78.0.1.ELsmp #1 SMP Tue Aug 5 12:59:28 CDT 2008 i686 i686 i386 GNU/Linux
root@grid-batch3: [~] cat /etc/issue
Scientific Linux SL release 4.4 (Beryllium)
Kernel \r on an \m
An example output can be found here:
http://www.desy.de/~kemp/diagnose.txt
I have put the config files here:
http://www.desy.de/~kemp/pbs_server.conf
http://www.desy.de/~kemp/maui.cfg
The fairshare information itself seems to be present, as shown e.g.
in the file /var/spool/maui/stats/FS.1228780800:
http://www.desy.de/~kemp/FS.1228780800
We therefore assume that scheduling is not affected, but we have not
been able to confirm this.
Does anyone have an idea what is going wrong?
Thanks for any hint!
Best
Yves
# Yves Kemp
# DESY IT 2b/314, Notkestr. 85, D-22607 Hamburg
# FON: +49-(0)-40-8998-2318, FAX: +49-(0)-40-8994-2318