On 5 Jan 2010, at 11:52, Daniela Bauer wrote:
> Hi All,
>
> Since before Christmas we have a problem with a biomed user requesting
> files which have been deleted some time in June, but somehow dcache
> didn't catch on.
>
> The situation is as follows:
>
> The file is gone (as confirmed by brute-force use of the 'find'
> command on all the nodes).
>
> It has a pnfsid, but nobody is home:
>
> companion=# select * from cacheinfo where pnfsid='000C000000000000001465D8';
> pnfsid | pool | ctime
> --------+------+-------
> (0 rows)
>
>
> [root@gfe02 tmp]# ls -l /pnfs/hep.ph.ic.ac.uk/data/biomed/generated/2009-01-21/fileb2c3e2f5-50b4-42ac-864d-0d6a8b8b1d69
> -rw-r--r-- 1 lt2-biomed001 lt2-biomed 1507914 Jan 21 2009 /pnfs/hep.ph.ic.ac.uk/data/biomed/generated/2009-01-21/fileb2c3e2f5-50b4-42ac-864d-0d6a8b8b1d69
>
> The billing says the file is gone:
> [root@gfe02 06]# grep 000C000000000000001465D8 *
> billing-2009.06.03:06.03 15:09:20 [pool:sedsk11other_0@sedsk11Domain:remove] [000C000000000000001465D8,0] [Unknown] <unknown> {0:""}
>
> I cannot work out from the dcache manual what the correct way of
> dealing with this is (I can come up with a hack, but I am wondering if
> I am not making it worse).
>
> Does anyone know ?
>
Running rm on the pnfs namespace entry always worked for me when we had that sort of problem...
> Does anyone have an idea what happened here ? (All these files seem
> to have been on the same pool, so maybe it had issues, but which ?)
Which issue or which pool? From the billing message the pool is sedsk11other_0 on host sedsk11.
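
For what it's worth, pulling the pool out of such a billing record can be scripted. The following is only a minimal sketch in Python: the regular expression is an assumption inferred from the single record quoted in this mail (not from the dCache billing format documentation), it assumes each record occupies a single line in the billing file, and `parse_remove` is a hypothetical helper name.

```python
import re

# Assumed shape of a pool "remove" record, inferred from one quoted example:
#   MM.dd HH:mm:ss [pool:<pool>@<domain>:remove] [<pnfsid>,<size>] ...
REMOVE_RE = re.compile(
    r"\[pool:(?P<pool>[^@\]]+)@(?P<domain>[^:\]]+):remove\]\s*"
    r"\[(?P<pnfsid>[0-9A-Fa-f]+),(?P<size>\d+)\]"
)


def parse_remove(record):
    """Return pool, domain, pnfsid and size from a remove record, or None."""
    match = REMOVE_RE.search(record)
    if match is None:
        return None
    return match.groupdict()


# The record from the billing file quoted in this thread, as one line.
record = (
    "06.03 15:09:20 [pool:sedsk11other_0@sedsk11Domain:remove] "
    '[000C000000000000001465D8,0] [Unknown] <unknown> {0:""}'
)
info = parse_remove(record)
print(info["pool"], info["domain"], info["pnfsid"])
```

Run over all the billing files, this would show whether every affected pnfsid was removed by the same pool.
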
Some potential issues: a timeout updating pnfs while trying to delete, or this pool was at some point configured as a cache pool (i.e. for use in front of a tape system) and picked that file to remove when it got full. There may well be others, but it's been too long since I dealt with dCache.
Derek