Thanks.
Graeme Stewart wrote:
> Sorry Alessandra
>
> I think I must have interpreted the multiple levels of indenting in your
> forward message wrongly - I really had thought it was Dario who'd said
> that. No offence intended!
>
> The point you made was very valuable and I get the impression this is
> being addressed through the CIC portal.
>
> I'll send a clarification around.
>
> Cheers
>
> Graeme
>
> On 16 Aug 2006, at 11:26, Alessandra Forti wrote:
>
>> Hi Graeme,
>>
>> since you are reporting an exchange I was involved in, I would like you
>> to report correctly what happened: it wasn't Dario who pointed out that
>> more roles are needed, but me. If it had been left to Dario, this would
>> have been just another tragic loss of data due to EGEE ops' bad
>> communication channels. I'm a bit tired of seeing my words put in
>> someone else's mouth or completely ignored.
>>
>> cheers
>> alessandra
>>
>>
>> On Wed, 16 Aug 2006, Graeme Stewart wrote:
>>
>>> Action from last week: here's the email from Frank Wuerthwein in OSG re.
>>> dCache housekeeping.
>>>
>>> g
>>>
>>> Begin forwarded message:
>>>
>>>> From: Frank Wuerthwein <[log in to unmask]>
>>>> Date: 4 August 2006 09:54:31 BDT
>>>> To: Artem Trunov <[log in to unmask]>
>>>> Cc: Graeme Stewart <[log in to unmask]>, storage-classes-wg
>>>> <[log in to unmask]>
>>>> Subject: Re: Minutes of Friday Jul 28 meeting
>>>>
>>>> Dear Artem,
>>>>
>>>> my apologies if this is an inappropriate forum for this email. Please
>>>> let me know if it is. I'm a bit confused about the scope of this group,
>>>> probably because I have not paid enough attention. My apologies for
>>>> that too.
>>>>
>>>> Dear Graeme,
>>>>
>>>> I took a look at the URL you gave below. Very interesting!
>>>> Looks like you are dealing with some of the same problems as the CMS
>>>> T2's in the US on OSG.
>>>>
>>>> We had an OSG-sponsored SRM/dCache workshop at FNAL recently
>>>> (http://osg.ivdgl.org/twiki/bin/view/Storage/DcacheWorkshop )
>>>> to bring storage admins at T2's, T1's, and the FNAL dCache team in
>>>> contact in order to get to know each other, and start discussing
>>>> operational issues, performance tuning, deployment choices, and
>>>> support. All but one T2 for both ATLAS & CMS attended this workshop,
>>>> as well as a couple of T3's.
>>>>
>>>> The question of losing files came up, and we learned that several of
>>>> the sites had already started implementing home-cooked solutions for:
>>>> -> searching for lost files
>>>> -> searching for corrupted files (i.e. comparing the adler32 checksum
>>>>    in pnfs metadata with the one of the actual file)
>>>> -> automatically retransferring those files via PhEDEx, the CMS
>>>>    transfer tool.
>>>>
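The corrupted-file check described in the list above could be sketched roughly as follows. This is only an illustration, not any site's actual tool: the message does not say how the checksum is read out of pnfs, so the `catalogue` dictionary (path to expected hex checksum) and both function names are assumptions made for this example.

```python
import zlib


def adler32_of_file(path, chunk_size=1 << 20):
    """Compute the Adler-32 checksum of a file, reading it in chunks."""
    value = 1  # Adler-32 starts from 1
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            value = zlib.adler32(chunk, value)
    return value & 0xFFFFFFFF


def find_corrupted(catalogue):
    """Return the paths whose on-disk checksum differs from the recorded one.

    `catalogue` is a hypothetical stand-in for the stored metadata:
    a dict mapping file path -> expected Adler-32 as a hex string.
    """
    corrupted = []
    for path, expected_hex in catalogue.items():
        if adler32_of_file(path) != int(expected_hex, 16):
            corrupted.append(path)
    return corrupted
```

The list returned by `find_corrupted` would then be the input to whatever retransfer step a site uses.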
>>>> To get a feel for what people cook at home:
>>>> http://t2.unl.edu/cms/storage/test_pfns.py/file_view
>>>> http://hepuser.ucsd.edu/twiki/bin/view/Main/PnfsChecker
>>>>
>>>> We agreed that we ought to take stock of what people are home-cooking,
>>>> and merge the home-cooked into something that's worth making available
>>>> more widely as part of SRM/dCache ops tools of some sort.
>>>>
>>>> Personally, I'm expecting that there'll be some generic SRM/dCache ops
>>>> tools, and an interface in PhEDEx that can be called to "register"
>>>> lost files for retransfer. This obviously would only work for files
>>>> lost on disk. Files lost on tape are much rarer, at least based on my
>>>> experience with CDF at FNAL, and could thus probably be dealt with by
>>>> hand.
>>>>
>>>> If something comes of this, would you be interested in it?
>>>> Should such tools be available for storage solutions other than
>>>> SRM/dCache?
>>>> Are they a site responsibility? Or should there be a way for sites to
>>>> share such tools, and adopt common procedures?
>>>>
>>>> For the OSG sites of CMS & ATLAS, we are likely to share tools and
>>>> adopt common procedures. We are likely to coordinate this within the
>>>> OSG project, and make them available via either the VDT or the dCache
>>>> rpm's, or both. We are just starting to figure out how to best do this.
>>>>
>>>> Thanks, Frank
>>>>
>>>> Frank Wuerthwein
>>>> UCSD
>>>>
>>>>>>
>>>>>> [1] An issue came up at Manchester in the UK where the CIC tool was
>>>>>> used to broadcast the closure of an SE, then when it was taken
>>>>>> offline ATLAS had managed to place 22k valuable files on it. Dario
>>>>>> pointed out that VO manager wasn't the right person to contact, so a
>>>>>> lot more roles than just VO manager were needed.
>>>>>>
>>>>>> PS. I had a first go at writing an operational procedure for site
>>>>>> file loss for our T2s, which is here:
>>>>>>
>>>>>> http://www.gridpp.ac.uk/wiki/SRM_File_Loss
>>>>>>
>>>>>> --
>>>>>> Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
>>>>>> GridPP DM Wiki - http://wiki.gridpp.ac.uk/wiki/Data_Management
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>
>>> --
>>> Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
>>> GridPP DM Wiki - http://wiki.gridpp.ac.uk/wiki/Data_Management
>>> ScotGrid - http://www.scotgrid.ac.uk/ http://scotgrid.blogspot.com/
>>>
>>
>> --************************************************************************
>>
>> * Alessandra Forti * He who laugh last probably made *
>> * e-mail: [log in to unmask] * a back-up. He who laugh *
>> * tel: +41 22 767 9594 * histerically probably DIDN'T. *
>> * fax: +41 22 767 8630 * *
>> ************************************************************************
>>
>
> --
> Dr Graeme Stewart - http://wiki.gridpp.ac.uk/wiki/User:Graeme_stewart
> GridPP DM Wiki - http://wiki.gridpp.ac.uk/wiki/Data_Management
> ScotGrid - http://www.scotgrid.ac.uk/ http://scotgrid.blogspot.com/
--
*******************************************
* Dr Alessandra Forti *
* Technical Coordinator - NorthGrid Tier2 *
* http://www.hep.man.ac.uk/u/aforti *
*******************************************