Hi,
On Thu, 27 Mar 2003, Stephen Burke wrote:
> On Thu, 27 Mar 2003, Steve Traylen wrote:
> > At some point soon checkpointing comes into WP1 job submission
> > components. I think the theory is that at sensible points in the job the
> > state is recorded so it can be resumed from that point. I expect
> > there is something on the WP1 website.
>
> That's true, but it may well be that at least in HEP people don't use it
> because it's usually fairly easy to re-run jobs. On the other hand that
> also means that having jobs cancelled is not that disastrous. Actually
> with the current efficiency you might not notice the effect of a site
> deliberately killing jobs!
>
I think that people are quite likely to use checkpointing (if we didn't then
it would not have been implemented). I thought the efficiency was pretty
high at the moment (even if it does involve re-submission); for example...
(I hope that it is OK to forward your mail Dominique... if not I
apologise)
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Date: Fri, 21 Mar 2003 04:35:24 +0100
From: Dominique Boutigny <[log in to unmask]>
To: Dr D J Colling <[log in to unmask]>, Gilbert Grosdidier
<[log in to unmask]>
Subject: Test result
Hi Dave and Gilbert,
The result of my last test with the BDII is 98% success: 1 job (out of
100) was lost, and another one crashed, but I think that it is due to a
problem at in2p3 which is unrelated to edg.
Cheers,
Dominique
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
All the best,
david