Ahh, the infamous Maradona error. This is an annoying problem and
difficult to detect, as it is a symptom of a bad WN or batch system. You'll
need to dredge through the WN and CE batch system logs looking for the bad node.
Your SFTs probably pass because they land on good nodes. Check here for
possible causes:
http://grid-deployment.web.cern.ch/grid-deployment/eis/docs/Maradona
The first check would be that passwordless ssh between WN and CE works.
Then check cpu-used, as it makes black-hole nodes stick out like a sore
thumb. This class of problem prompted our move towards a stateful config
system (cfengine).
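
As a rough illustration of the cpu-used check: a black-hole node tends to fail jobs almost immediately, so it collects many jobs with next to no CPU time. The sketch below assumes you have already pulled (node, cpu_seconds) pairs out of your batch accounting logs; the function name, the record format and the thresholds are all hypothetical, so adjust them to whatever your batch system actually records.

```python
from collections import defaultdict


def find_black_holes(records, min_jobs=10, max_cpu_seconds=5.0, min_fraction=0.9):
    """Return the nodes where nearly every job used almost no CPU.

    records: iterable of (node, cpu_seconds) pairs, one per finished job.
    The record format and all thresholds are assumptions for illustration.
    """
    total = defaultdict(int)   # jobs seen per node
    short = defaultdict(int)   # jobs per node that used almost no CPU
    for node, cpu_seconds in records:
        total[node] += 1
        if cpu_seconds <= max_cpu_seconds:
            short[node] += 1

    suspects = []
    for node, count in total.items():
        # Skip nodes with too few jobs to say anything meaningful.
        if count >= min_jobs and short[node] / count >= min_fraction:
            suspects.append(node)
    return sorted(suspects)


# Made-up sample data: wn02 swallows many jobs with almost no CPU used.
sample = (
    [("wn01", 3600.0)] * 12
    + [("wn02", 0.5)] * 40
    + [("wn03", 1.0)] * 3
)
print(find_black_holes(sample))  # prints ['wn02']
```

A node with only a handful of short jobs (wn03 above) is deliberately not flagged, since short test jobs on a healthy node look the same in small numbers.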
Peter
Olivier van der Aa wrote:
> Dear All,
>
> We are having trouble getting the SAM tests to run cleanly on our new CE
> (ce00.hep.ph.ic.ac.uk).
>
> The SAM tests show OK for each individual test
> (http://tinyurl.com/yjhnso), but the logging and bookkeeping shows a
> Maradona error (http://tinyurl.com/ydgv8b).
>
> We have used the RB that SAM is using (gdrb02.cern.ch) and had no
> problems at all. We have also mapped ourselves as ops on the CE, and that
> works fine.
>
> We have biomed jobs running fine on the cluster...
>
> Any ideas?
>
> Cheers, Olivier.
> --
> - O. van der Aa - Imperial College London -
> - LT2 Technical Coordinator -
> - tel: +442075947810, +442071005426 -
> - SIP: [log in to unmask] -
> - fax: +442078238830 -
> - http://surl.se/agtu -