Hi Daniel,

I just had a look at Arka's jobs that were registered as failed on
the DIRAC server, and from there it looks like you have access
problems on se03.esc.qmul.ac.uk (I assume that is where your data
is?).
I know Dan has struggled to keep that SE up and running in recent
months and a number of VOs have run into similar troubles.
I don't know how big your data set is, but if it's not too big you
might be able to have it (temporarily) hosted elsewhere.
Also, I get the impression that, at least in your case, Ganga does more
harm than good. Have you looked into direct submission to DIRAC?
The DIRAC version you use looks OK to me.
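
For reference, direct submission can be as simple as writing a short JDL file and handing it to dirac-wms-job-submit. The sketch below is only an illustration under stated assumptions: the job name, the script name run_sim.sh and its argument are hypothetical placeholders, not taken from your setup, and it leaves out any output data handling on an SE.

```python
# Minimal sketch: build a DIRAC JDL description as plain text.
# All job values below (name, script, argument) are hypothetical placeholders.


def jdl_value(value):
    """Render a Python value as a JDL literal.

    Lists and tuples become {"a","b"}; everything else becomes a quoted string.
    """
    if isinstance(value, (list, tuple)):
        return "{" + ",".join('"%s"' % item for item in value) + "}"
    return '"%s"' % value


def build_jdl(attributes):
    """Turn a sequence of (name, value) pairs into JDL text, one attribute per line."""
    lines = ["%s = %s;" % (name, jdl_value(value)) for name, value in attributes]
    return "\n".join(lines) + "\n"


# Hypothetical job: run a wrapper script shipped in the input sandbox
# and bring back only stdout and stderr in the output sandbox.
job = [
    ("JobName", "moedal_sim_test"),
    ("Executable", "run_sim.sh"),
    ("Arguments", "1000"),
    ("StdOutput", "StdOut"),
    ("StdError", "StdErr"),
    ("InputSandbox", ["run_sim.sh"]),
    ("OutputSandbox", ["StdOut", "StdErr"]),
]

print(build_jdl(job))
```

Saving the printed text as job.jdl and running dirac-wms-job-submit job.jdl from an environment with a valid proxy should return a job ID; dirac-wms-job-status and dirac-wms-job-get-output then take that ID.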

Regards,
Daniela

On Wed, 19 Sep 2018 at 13:52, Daniel Felea <[log in to unmask]> wrote:
>
>
> Hello All,
>
> Can anyone please help with the following issue:
>
> Arka and I recently submitted some simulation tests for the MoEDAL collaboration; most of them ended with 'Error' status.
>
> We got many error messages, such as:
>
> WARNING  An error occured finalising job: 3514
> WARNING  Attemting again (5 of 5) after 2.5-sec delay
> ERROR    GangaDiracError: No Output sandbox registered for job 12831901
> ERROR    Unable to finalise job after 3513 retries due to error:
> GangaDiracError: No Output sandbox registered for job 12831900
> WARNING  An error occured finalising job: 3513
> WARNING  Attemting again (5 of 5) after 2.5-sec delay
> ERROR    GangaDiracError: No Output sandbox registered for job 12831900
> ERROR    Unable to finalise job after 3516 retries due to error:
> GangaDiracError: No Output sandbox registered for job 12840825
>
> Interestingly enough, many of these test simulations (9 out of 11) ran successfully a couple of days earlier. I repeated them just in case, and they all ended in error.
>
> There are two types of error message on the GridPP DIRAC page:
>
> For local storage (LocalFile):
> MinorStatus : 'Maximum of reschedulings reached'
> ApplicationStatus : 'Failed Input Sandbox Download'
>
> For remote storage (DiracFile):
> MinorStatus : 'Uploading Job Outputs'
> ApplicationStatus : 'exe-script.py successful'
>
> We currently use: ./dirac-install -r v6r20p5 -i 27 -g v14r1
>
> What should be done to obtain a success rate of at least 90% for our simulations? We foresee a few thousand jobs being submitted over the next couple of months or so.
>
> Thank you very much in advance!
>
> Cheers,
> Daniel (on behalf of Arka, too)
>
>
> ________________________________
>
> To unsubscribe from the TB-SUPPORT list, click the following link:
> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=TB-SUPPORT&A=1



-- 
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/
