Hi Mario,

> >> I've got a question about recent optimizations in lcg-ce (addition of
> >> globus-gass-cache-marshal and globus-job-manager-marshal) - it has been
> >> said that configuration files can be found in the globus/etc location,
> >> but I didn't manage to find any documentation about what that
> >> configuration actually means or does. Any hints? I suspect that our
> >> current cluster problem might be related to the configuration of this
> >> new piece of software.
> >
> > What problems?  The SAM tests appear to be working fine.
> 
> Latest example:
> 
> [link to SAM job submission error page]

The error was this:

    Globus error 94: the jobmanager does not accept any new requests
    (shutting down)

Did you check its Wiki page?

http://goc.grid.sinica.edu.tw/gocwiki/Globus_error_94%3A_the_jobmanager_does_not_accept_any_new_requests_%28shutting_down%29

In particular note that the batch system can be in bad shape for _some_
users, e.g. if the user ran out of disk quota.

> However, for 3 weeks we have been struggling with that particular
> error. It comes and goes as it pleases and sometimes takes the cluster
> offline for almost days. The CE has been reconfigured now and it seems
> to help to some extent, but not 100%. Until this evening we were also
> in the FCR because of it. I guess the late success of SAM has removed
> us from FCR for now.