Hi Simon,
Looking at the recent SAM results, the tests always pass when the file
goes to gridraid2 or gridraid4. Scanning back further, I see that you
also have a gridraid3, which was passing tests yesterday. This suggests
that the problem is with gridraid3. Can you tell me the permissions on
the DPM filesystem on gridraid3? The filesystem is something like:
/grid1/pool/ops/
although there may be others, I don't know the exact configuration of
your machine.
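If it is easier, something like the script below would collect what I am
after. It is only a rough sketch: the default path is my guess from
above, and describe_path is a name made up for illustration, so adjust
both to match your setup. It prints roughly what `ls -ld` would show.

```python
import grp
import os
import pwd
import stat
import sys


def describe_path(path):
    """Return 'mode owner group path' for one filesystem entry."""
    info = os.stat(path)
    mode = stat.filemode(info.st_mode)  # e.g. 'drwxrwx---'
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # uid with no passwd entry
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)  # gid with no group entry
    return f"{mode} {owner} {group} {path}"


if __name__ == "__main__":
    # The default path is an assumption; pass the real pool
    # directories on the command line instead.
    for path in sys.argv[1:] or ["/grid1/pool/ops/"]:
        try:
            print(describe_path(path))
        except OSError as err:
            print(f"{path}: {err}")
```

Running it on gridraid3 and on one of the working pool nodes and
comparing the two outputs should show any difference in mode or
ownership.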
Can you also send me the output of dpm-qryconf and recent /var/log/dpns
and /var/log/dpm log entries?
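For the logs, only the entries around a failing request are needed. As
a rough sketch (entries_mentioning is a made-up name, and I am assuming
one entry per line in the "MM/DD HH:MM:SS pid,thread function: message"
layout of the srmv1 log quoted below; the dpm and dpns logs may well
differ), this picks out the entries that mention a given request number:

```python
import re
import sys

# Assumed layout: "07/27 09:33:33 4345,1 put: SRM98 - put 632459 632459"
ENTRY = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<pid>\d+),(?P<thread>\d+) "
    r"(?P<func>\w+): (?P<message>.*)$"
)


def entries_mentioning(lines, token):
    """Yield parsed entries whose message contains token as a whole word."""
    word = re.compile(r"\b" + re.escape(token) + r"\b")
    for line in lines:
        match = ENTRY.match(line.rstrip("\n"))
        if match and word.search(match.group("message")):
            yield match.groupdict()


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: script.py LOGFILE REQUEST_NUMBER
    with open(sys.argv[1]) as log:
        for entry in entries_mentioning(log, sys.argv[2]):
            print(entry["stamp"], entry["func"], entry["message"])
```

Lines that do not fit the assumed layout are skipped, so wrapped or
continuation lines would be lost; sending the raw log excerpt as well is
still useful.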
Cheers,
Greig
Simon George wrote:
> Hi,
>
> unfortunately this has come up while Duncan is away and my experience is
> not up to the task of investigating this. Could anyone suggest how to
> proceed?
>
> I checked the SAM tests for my SE, and I see that we have intermittent
> errors for ops cr/cp/del:
>
> https://lcg-sam.cern.ch:8443/sam/sam.py?funct=ShowHistory&sensors=SE&vo=ops&nodename=se1.pp.rhul.ac.uk
>
>
> In /var/log/srmv1/log on my SE, I see:
> 07/27 09:33:33 4345,1 put: request by
> /O=GermanGrid/OU=Uni-Dortmund/CN=Christoph Wissing H1sm from
> gfm01.pp.rhul.ac.uk
> 07/27 09:33:33 4345,1 put: SRM98 - put 632459 632459
> 07/27 09:33:33 4345,1 put: SRM98 - put 0
> srm://se1.pp.rhul.ac.uk/dpm/pp.rhul.ac.uk/home/hone/generated/2007-07-27/file84e72306-11f5-43e5-a07b-e1e04ec4384c
>
> 07/27 09:33:33 4345,1 put: returns 0
> 07/27 09:33:34 4345,1 getRequestStatus: request by
> /O=GermanGrid/OU=Uni-Dortmund/CN=Christoph Wissing H1sm from
> gfm01.pp.rhul.ac.uk
> 07/27 09:33:34 4345,1 getRequestStatus: SRM98 - getRequestStatus 632459
> 07/27 09:33:34 4345,1 getRequestStatus: returns 0
> 07/27 09:33:45 4345,1 getRequestStatus: request by
> /O=GermanGrid/OU=Uni-Dortmund/CN=Christoph Wissing H1sm from
> gfm01.pp.rhul.ac.uk
> 07/27 09:33:45 4345,1 getRequestStatus: SRM98 - getRequestStatus 632459
> 07/27 09:33:45 4345,1 getRequestStatus: returns 0
> 07/27 09:33:55 4345,1 getRequestStatus: request by
> /O=GermanGrid/OU=Uni-Dortmund/CN=Christoph Wissing H1sm from
> gfm01.pp.rhul.ac.uk
> 07/27 09:33:55 4345,1 getRequestStatus: SRM98 - getRequestStatus 632459
>
> How can I tell which pool node this file has gone to?
>
> Thanks,
> Simon
>
> -------- Original Message --------
> Subject: Access to se1.pp.rhul.ac.uk by HONE VO
> Date: Fri, 27 Jul 2007 10:22:09 +0200
> From: Christoph Wissing <[log in to unmask]>
> Organisation: DESY
> To: [log in to unmask]
>
>
> Dear colleagues,
>
> we have some problems with your SE se1.pp.rhul.ac.uk. It is not possible
> to access this file:
>
> lcg-cp --vo hone -v -t 600
> srm:///dpm/pp.rhul.ac.uk/home/hone/generated/2007-07-26/file367e0797-17fd-42a8-abe6-9890b125c75f
> file:///dev/null
> Using grid catalog type: lfc
> Using grid catalog : grid-lfc0.desy.de
> lcg_cp: Connection timed out
>
> It takes the full 10 minutes before the error appears.
>
> Having done some manual tests I succeed sometimes with lcg-cr, sometimes
> I fail. Might be that only a subcomponent of the SE instance is having
> trouble.
>
> Would you please have a look at it? If you can foresee that there is a
> bigger problem, please let me know, because then it is faster for us to
> recreate the file.
>
> Thanks a lot in advance,
> Christoph for the HONE production team
>