Hi David (et al),
The RB is now back and I have just used it to submit to NIKHEF using the
environment variable, so please have another go (the job is in the
scheduled state at the moment). I have also done something that I should
have done some time ago and written a script that does the database clean,
so recovery should be quicker from now on.
All the best,
david
On Thu, 5 Jun 2003, Dr D J Colling wrote:
> Hi David,
>
> The RB crashed again last night with an unterminated quoted string, which is
> why your job got no further. When I performed my test I didn't use the -r
> option but just the environment variable.
>
> For the DB to become corrupted 3 times in a day is just not giving me a
> fighting chance! Sometimes it goes for a week or more without this
> happening and then 3 times in one day! It doesn't seem to be correlated with
> much (the load is roughly constant) except the number of other things that
> I am supposed to be working on. Oh well, I shall clean it out again ...
> more jobs gone forever..
>
> I will send a mail when it is complete...
>
> Anyway, I am away next week, so any problems will be dealt with
> either by Rod Walker ([log in to unmask]) or Hugh Tallini
> ([log in to unmask]). I think that they both receive any mails to
> [log in to unmask]
>
> All the best,
> david
>
>
>
>
> On Thu, 5 Jun 2003, David Groep wrote:
>
> > Hi David,
> >
> > At 06:30 PM 6/4/2003, Dr D J Colling wrote:
> > >I have cleaned things out yet again and have just submitted a job
> > >requiring the NIKHEF environment variable, which ran quite happily on
> > >tbn09. So I think that it should be OK now... do you want to give it a
> > >try?
> >
> > My jobs seem to wait indefinitely in the "Waiting" state (job accepted, but
> > stuck in the RB).
> > Is there a problem with the Information Index not updating correctly?
> > All relevant information on the NIKHEF CE is in the top-level MDS at
> > gppinfo06.gridpp.rl.ac.uk (just checked with ldapsearch). Could you look
> > in the local BDII for the RB?
> >
> > I tested with the following piece of JDL:
> >
> > Executable = "/bin/echo";
> > Arguments = "Hello World";
> > StdOutput = "std.out";
> > StdError = "std.err";
> > OutputSandbox = {"std.out","std.err"};
> > Requirements = Member(other.RunTimeEnvironment,"NIKHEF");
> >
> > Did you specify the resource using "-r"?
> >
> > Thanks a lot for the help!
> >
> > David G.
> >
> >
> > >All the best,
> > >david
> > >
> > >
> > >On Wed, 4 Jun 2003, David Groep wrote:
> > >
> > > > Hi David,
> > > >
> > > > At 02:56 PM 6/4/2003, Dr D J Colling wrote:
> > > > >The Imperial College RB is now all cleaned out and appears to be working
> > > > >properly. I have also now included the new UK CA in the list, so if
> > > > >somebody would like to test this (i.e. somebody with such a certificate)
> > > > >then I would be grateful.
> > > >
> > > > There still seems to be a problem when submitting jobs via
> > > > the IC RB: they remain stuck in the RB or JS service in
> > > > the "Waiting" state, like these jobs:
> > > >
> > > >
> > > > https://gm03.hep.ph.ic.ac.uk:7846/192.16.186.233/152749191164227?gm03.hep.ph.ic.ac.uk:7771
> > > >
> > > > https://gm03.hep.ph.ic.ac.uk:7846/192.16.186.233/143506186855830?gm03.hep.ph.ic.ac.uk:7771
> > > >
> > > >
> > > > Seems to be related to jobs with RuntimeEnvironment requirement "NIKHEF".
> > > > But the same jobs run correctly to completion when submitted via LYON
> > > > (although Lyon is extremely slow at the moment).
> > > >
> > > > Could you have a look at it?
> > > >
> > > > Thanks,
> > > > David Groep.
> > > >
> > > >
> > > >
> >
> >
>
>