Hi,
I had a problem last week with LHCb jobs. According to their logs, the
jobs failed with a bus error, but according to my batch system they
terminated with Exit_code=0. When we discussed this at the UK ops meeting,
someone suggested it might be a problem with the CREAM job wrapper not
passing the application error code to the batch system. Looking at the
wrapper created by pbs_submit.sh, it looks like there are two functions
that should do this. One of them is do_exit:
do_exit() {
    stat=$1
    echo "jw exit status = ${stat}"
    echo $2 1>&2
    if [ $__create_subdir -eq 1 ]; then
        cd ..
        rm -rf ${newdir}
    fi
    exit $stat
}
Can anyone confirm that this function (or its parent, log_and_exit) is
correctly called and that the application error code is passed back to
the batch system?
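
To make clear what I would expect, here is a minimal sketch of a wrapper
passing the payload status through. It is not taken from the CREAM wrapper:
run_payload is a hypothetical name, and the two "sh -c" payloads only
simulate a normal failure and a job killed by a bus error. As far as I
know, a shell reports a command killed by a signal with a status above 128
(128 plus the signal number), so a bus error should never show up as 0.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CREAM wrapper): run the payload
# and hand its exit status back to the caller unchanged.
run_payload() {
    "$@"
    status=$?
    echo "payload exit status = ${status}" 1>&2
    return $status
}

# Simulated payload that fails with an ordinary exit code.
rc_exit=0
run_payload sh -c 'exit 3' || rc_exit=$?

# Simulated payload that is killed by SIGBUS, like a bus error.
rc_signal=0
run_payload sh -c 'kill -s BUS $$' || rc_signal=$?

echo "rc_exit=${rc_exit} rc_signal=${rc_signal}"
```

If the real wrapper ends with "exit $stat" and $stat holds the payload
status, the batch system should see the same non-zero values.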
thanks
cheers
alessandra