Hi All,
Thanks to the work of Mariusz Mamonski from Poznan Supercomputing and
Networking Center, I'm able to share some improvements that were made to
the MUNGE authentication mechanism in TORQUE.
Attached you will find the patch
torque-munge-api-support-v3.patch.gz; it works with versions 2.5.11 and
2.5.12 of Torque.
The patch adds a new "--enable-munge-library" configure option which turns
on a new MUNGE authentication path based on the library API instead of
external executables.
The patch does not modify any of the old munge authentication code. We just
add alternative methods which are switched on by specifying the configure
option.
By calling the munge functions directly via the API we were able to get rid
of expensive calls like popen (exec) and fsync used in the older method
and gain significant speedups in client request processing.
As a result of these changes, pbs_server is much more responsive. The
observed performance gains vary from 2x to more than 10x, depending on
the query type.
The bigger the cluster, the bigger the performance gain you may expect.
We have successfully tested the new implementation in our test
environment. After verification on a smaller cluster, the patch has been in
production since yesterday afternoon. This cluster processes around
25k jobs per day and no issues have been observed so far.
--------------------------------------------------------------
We don't guarantee that it will work for you and take no responsibility
for any damages it may cause.
--------------------------------------------------------------
Despite the above statement ;) it is worth trying. We did our best to ensure
that it is cross-compatible with the old munge-auth and error free.
The biggest benefits, however, can be seen on bigger clusters where the
server is queried frequently (e.g. grid sites).
HOWTO install:
1. Get the Torque sources: torque-2.5.12.tar.gz
2. Untar and apply the patch (gunzip the attached .patch.gz first)
$> tar -zxvf torque-2.5.12.tar.gz
$> cd torque-2.5.12; patch -p1 < torque-munge-api-support-v3.patch
3. Regenerate configure script by invoking:
$> autoconf
NOTE: m4 version 1.4.8 or newer is required.
RHEL5 derivatives (like SL5) may require a newer package.
For Scientific Linux 5 we used:
$> wget ftp://ftp.scientificlinux.org/linux/scientific/5x/SRPMS/SL/m4-1.4.8-1.src.rpm
$> rpmbuild --rebuild m4-1.4.8-1.src.rpm
$> rpm -Uvh ../path/to/m4-1.4.8-1.x86_64.rpm
4. Read README.munge in torque directory
5. Make sure you have munge-libs and munge-devel (library and headers)
NOTE: munge-libs are LGPL since 0.5.9; earlier versions are GPL
6. Configure and build torque
$> ./configure <your_options> --enable-munge-library && make
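The prerequisites from steps 3 and 5 can be checked before running autoconf.
The following is only a rough sketch, not part of the patch: the function
names version_ge and preflight are made up for illustration, the header
location /usr/include/munge.h (overridable through the hypothetical
MUNGE_HEADER variable) is an assumption about where munge-devel puts it,
and the script assumes that the version number is the last word on the
first line of "m4 --version".

```shell
#!/bin/sh
# version_ge A B: succeed when dotted version A >= B.
# Handles purely numeric fields only (e.g. 1.4.13), compared field by field.
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; a missing field counts as 0.
        fa=${a%%.*}; fb=${b%%.*}
        [ -n "$fa" ] || fa=0
        [ -n "$fb" ] || fb=0
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
        # Drop the field that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

# preflight: report whether m4 and the munge header look usable.
preflight() {
    # Assumption: the version is the last word of the first output line.
    m4_version=$(m4 --version 2>/dev/null | awk 'NR==1 {print $NF}')
    if [ -z "$m4_version" ]; then
        echo "m4 not found"
        return 1
    fi
    if ! version_ge "$m4_version" 1.4.8; then
        echo "m4 $m4_version is older than 1.4.8"
        return 1
    fi
    # Assumption: munge-devel installs the header at this path.
    if [ ! -f "${MUNGE_HEADER:-/usr/include/munge.h}" ]; then
        echo "munge.h not found, install munge-devel"
        return 1
    fi
    echo "m4 $m4_version and munge.h found"
}
```

Source the file and run "preflight" in the shell you build from. The
comparison is done field by field on purpose, because a plain string
comparison would rank 1.4.13 below 1.4.8.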
Good luck & happy testing
Any feedback is welcome. I wish you good performance gains :)
--
Lukasz Flis
ACC Cyfronet AGH
Nawojki 11, 31-209
POLAND