On Tue, Jul 26, 2005 at 10:07:36AM +0200, Jeff Templon wrote:
> [root@tbn20 root]# strace -p 1493
> Process 1493 attached - interrupt to quit
...
> read(15, 0x80cd6d0, 4096) = -1 ESTALE (Stale NFS file handle)
...
> Process 1493 detached
>
> my guess is that it is supposed to read something in a file, and that
> file will tell it when the process should die, but the file is gone and
> so the process does not know that it should have terminated itself.
>
> My guess: somehow the script/process manages to wait long enough between
> reads that the job's home directory mount (autofs) 'expires' and gets
> unmounted.
Can you find out which file it is trying to read from? Try: ls -al /proc/1493/fd/15
From the strace output it seems that the file is still open, so the autofs
mount shouldn't expire. I think what happened is that a cleanup script removed
the file, and that is why you get the stale NFS handle errors.
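
If you want to script that check across several stuck processes, here is a
rough Python sketch of the same idea. It assumes a Linux /proc filesystem;
describe_fd is just a name I made up for illustration, and I have not tried it
against a stale NFS handle, so treat the "stale" branch as a guess at how the
kernel will report it.

```python
import errno
import os


def describe_fd(pid, fd):
    """Return (target, status) for an open descriptor of a Linux process.

    Hypothetical helper: status is "ok", "deleted" or "stale".
    """
    link = f"/proc/{pid}/fd/{fd}"
    # readlink shows the path the descriptor refers to; the kernel appends
    # " (deleted)" when the file was unlinked on this machine.
    target = os.readlink(link)
    try:
        # stat through the /proc link reaches the open file itself.
        os.stat(link)
    except OSError as exc:
        # Assumption: a file removed behind the NFS client's back shows up
        # here as ESTALE, as it does in the strace output.
        if exc.errno == errno.ESTALE:
            return target, "stale"
        raise
    if target.endswith(" (deleted)"):
        return target, "deleted"
    return target, "ok"


if __name__ == "__main__":
    # 1493 and 15 are the pid and descriptor from the strace above.
    try:
        print(describe_fd(1493, 15))
    except OSError as exc:
        print("could not inspect descriptor:", exc)
```

It needs to run as root or as the owner of the process, same as the ls.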
Kostas