Dave et al,
ScotGRID-Glasgow found many ways to achieve "Failure while executing job
script" while putting our cluster on the grid. The error seemed
to correspond to jobs evaporating while in the care of PBS. We even found
one case caused by a bug in the bash shipped with RedHat 7.2.
A crude writeup of our experience is at
http://www.scotgrid.ac.uk/documents/experience.html
David Martin
Dept of Physics and Astronomy,
University of Glasgow,
Glasgow, G12 8QQ,
United Kingdom
tel: (0)141 330 4197 fax: (0)141 330 5881