It seems likely that the drop-off in efficiency with these file-staging
tests will occur at around 2-3 jobs per node. Sites with 4 slots per
node are therefore likely to be able to fill a larger proportion of
their slots than sites with 8 slots per node.
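To put rough numbers on that (the 2-3 jobs/node figure is the assumption
above, not a measurement), something like:

  efficient_jobs = 2.5   # assumed point where staging I/O saturates per node
  for slots in (4, 8):
      frac = min(1.0, efficient_jobs / slots)
      print("%d slots/node: ~%.0f%% of slots usable" % (slots, 100 * frac))

which comes out at roughly 62% of slots usable on a 4-slot node against
about 31% on an 8-slot node.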
If we are going to go down this route then we need to improve the rate
at which we can read off the local disk. It seems that ATLAS may already
be looking at a promising approach: solid state disks (SSDs). These have
high random read rates and have shown good performance in tests at BNL:
http://indico.cern.ch/contributionDisplay.py?contribId=395&sessionId=61&confId=35523
http://www.usatlas.bnl.gov/twiki/bin/Admins/rsrc/Admins/MinutesProofFeb14/Xrootd_PROOF_BNL_Feb08.ppt
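If a site wants to check its own disks before buying anything, a
quick-and-dirty random-read test along these lines would do (the file
path and block size are just placeholders, and the file should be much
larger than RAM so the page cache doesn't flatter the result):

  import os, random, time

  path = "/tmp/staged_input.root"   # placeholder: any large file on the disk under test
  block, n = 4096, 1000             # 4 kB random reads
  size = os.path.getsize(path)
  f = open(path, "rb")
  t0 = time.time()
  for _ in range(n):
      f.seek(random.randrange(0, size - block))
      f.read(block)
  f.close()
  print("%.0f random %d-byte reads/s" % (n / (time.time() - t0), block))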
So the idea would be for sites to retro-fit their worker nodes (starting
with the 8-core nodes) with a small (say 64 GB) SSD and either make it the
temporary storage area or define a new directory on it into which files
would be staged.
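On the job side this needn't be complicated; a minimal sketch of staging
into an SSD-backed scratch area (the mount point is hypothetical, and the
real stage-in would of course use the usual SE copy tools rather than a
local copy):

  import os, shutil

  SSD_SCRATCH = "/ssd/panda_stage"   # hypothetical mount point of the retro-fitted SSD

  def stage_in(src):
      if not os.path.isdir(SSD_SCRATCH):
          os.makedirs(SSD_SCRATCH)
      dst = os.path.join(SSD_SCRATCH, os.path.basename(src))
      shutil.copy(src, dst)          # stand-in for the real dccp/lcg-cp from the SE
      return dst                     # the analysis job then reads from the SSD copy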
A refinement might be to have some sort of distributed file system cache
(Lustre?) made up of these SSDs, onto which files could be pre-staged by
PanDA.
So it might be interesting for one of the UK sites to fit one of these
to an 8-core node; we could then see how many more jobs it could process
by looking at its entry in the 'finished' column of the sites page, e.g.:
http://panda.cern.ch:25980/server/pandamon/query?overview=wnlist&type=analysis&hours=4&site=ANALY_RHUL
An alternative (suggested by Chris W) might be to buy lots of RAM and
use a RAM disk as a fast cache.
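That would just be a tmpfs mount sized to leave headroom for the jobs
themselves; a sketch (the size and mount point are made-up numbers, and
it needs root):

  import os, subprocess

  mnt = "/ramdisk_stage"            # hypothetical mount point
  if not os.path.isdir(mnt):
      os.makedirs(mnt)
  # 16g is arbitrary; it must leave enough memory for the jobs themselves
  subprocess.check_call(["mount", "-t", "tmpfs", "-o", "size=16g", "tmpfs", mnt])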
Duncan