[log in to unmask] wrote:
> On Mon, 20 Apr 2009, Jean-Michel Barbet wrote:
>
>>> The WMS job wrapper always tries a mkdir and then cd into the directory,
>>> but will continue when either operation fails. Does /var/log/messages
>>> show any problems for the "/dlocal" file system?
>
> No errors in /var/log/messages?
Good morning,
No errors in /var/log/messages, nor in /var/log/secure.
>> more ./13969.1239792844/stderr
>> /users/lcg/sgmali020/.globus/.gass_cache/local/md5/53/6b5744d385fc42f37ff06770a4c4d9/md5/15/326d224da16bedf7ab303c42c54ba8/data:
>> line 66: : No such file or directory
>
> Is that file still available? I would be interested.
> I think I have seen the error before and it may be harmless.
I will send the file in a separate mail.
> So, for the job in question it seems very much that files and directories
> were cleaned up while it was still running! Might some aggressive cleanup
> script be at fault? It could also be an error in the user payload...
I do not have an aggressive cleaning script but there are the cases
I mentioned where pathological jobs started cleaning from / instead of
from their top directory.
At the moment, I still have jobs that do not create their directory
based on the job id, but I have not recently found a job that
tries to remove files from /.
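
To illustrate the failure mode (a mkdir or cd fails silently and the later
cleanup then runs from the wrong place), here is a minimal sketch of a
guarded cleanup in Python. It is not taken from the WMS job wrapper: the
names safe_cleanup, job_dir and scratch_root are hypothetical, and it
assumes each job is meant to work in its own directory below a known
scratch area.

```python
import os
import shutil
import tempfile


def safe_cleanup(job_dir, scratch_root):
    """Remove job_dir only if it is a directory strictly below scratch_root.

    Hypothetical helper: returns True if the directory was removed,
    False if the request was refused.
    """
    # An empty path usually means the job directory was never created.
    if not job_dir:
        return False
    root = os.path.realpath(scratch_root)
    target = os.path.realpath(job_dir)
    # Never remove the scratch area itself.
    if target == root:
        return False
    # Refuse anything that resolves outside the scratch area (e.g. "/").
    if os.path.commonpath([root, target]) != root:
        return False
    if not os.path.isdir(target):
        return False
    shutil.rmtree(target)
    return True
```

With a check like this, a job whose directory was never created would have
its cleanup refused instead of falling back to / or the current directory.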
It is not really easy to monitor what's happening in the cluster
and we could certainly all benefit from recipes in this respect.
Thanks very much.
JM
--
------------------------------------------------------------------------
Jean-michel BARBET | Tel: +33 (0)2 51 85 84 86
Laboratoire SUBATECH Nantes France | Fax: +33 (0)2 51 85 84 79
CNRS-IN2P3/Ecole des Mines/Universite | E-Mail: [log in to unmask]
------------------------------------------------------------------------