Hi Chris,
I know this is very annoying, but keep persevering, you're almost there!
Did you make the change to the srm.batch file and restart the SRM?
set context -c SpaceManagerDefaultRetentionPolicy REPLICA
set context -c SpaceManagerDefaultAccessLatency ONLINE
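In case it helps, here's roughly how I'd apply that change in one go (a sketch only: the srm.batch path assumes a default /opt/d-cache layout, and the restart command may differ on your install):

```shell
# Sketch: path and restart command are assumptions for a default
# /opt/d-cache install - adjust to your layout.
BATCH=/opt/d-cache/config/srm.batch

# Rewrite whatever defaults are currently set to REPLICA/ONLINE in place
sed -i \
  -e 's/^\(set context -c SpaceManagerDefaultRetentionPolicy\) .*/\1 REPLICA/' \
  -e 's/^\(set context -c SpaceManagerDefaultAccessLatency\) .*/\1 ONLINE/' \
  "$BATCH"

# Restart the SRM domain so the batch file is re-read (command assumed)
/opt/d-cache/bin/dcache-core restart
```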
Also, just to confirm, you have
srmSpaceManagerEnabled=yes
on the doors?
What error message are you getting when you try to copy a file into the
dCache? Is it a no-space-available or a no-write-pool-selected error?
Cheers,
Greig
On 08/02/08 18:05, Brew, CAJ (Chris) wrote:
> Hi Greig,
>
> No luck for me I'm afraid.
>
> Though I suspect I'm doing something very stupid somewhere since what
> you are saying fits with what I thought I had working for BaBar
> yesterday, i.e. that it worked if I set the directory tags explicitly.
>
> But after I ran a script* to walk the directory tree and then tried
> to replicate the BaBar config to everyone else, it stopped working. I was
> beginning to think I was mistaken and hadn't actually got the space
> manager switched on at that point.
>
> Well, here's my dCacheSetup file, PoolManager.conf and the script I used
> in case someone can spot the daft mistake I've made.
>
> Chris.
> (going home to forget about srm!)
>
> *
> for group in `ls /pnfs/pp.rl.ac.uk/data`; do echo $group; for dir in
> `find /pnfs/pp.rl.ac.uk/data/${group} -type d -print`; do cd $dir;echo
> "ONLINE" > ".(tag)(AccessLatency)"; echo "REPLICA" >
> ".(tag)(RetentionPolicy)"; done; done
>
>> -----Original Message-----
>> From: GRIDPP2: Deployment and support of SRM and local
>> storage management [mailto:[log in to unmask]] On
>> Behalf Of Greig Alan Cowan
>> Sent: 08 February 2008 17:13
>> To: [log in to unmask]
>> Subject: Re: Help
>>
>> Hi Matt,
>>
>> How's it going? After I set up the pnfs tags, I've found that I could
>> get the Space Manager working without too many problems. My
>> PoolManager.conf and dCacheSetup are attached.
>>
>> For the reservation I did something like this:
>>
>> reserve -vog=/lhcb -vor=lhcbprd -acclat=ONLINE -retpol=REPLICA
>> -desc=LHCb_DST -lg=lhcb-linkGroup 24993207653 "-1"
>>
>> Note that there is a problem with (I think) gPlazma in that it caches
>> user DNs for a short period. This means that if you try to transfer a
>> file while belonging to one VO and then switch proxies to another, you
>> are likely to get a permission denied error. Someone is working on
>> fixing this.
>>
>> Cheers,
>> Greig
>>
>> On 08/02/08 15:19, Matt Doidge wrote:
>>> Helps if I attach the bloomin' script, doesn't it!
>>>
>>> Got that Friday feeling...
>>>
>>> Matt
>>>
>>> On 08/02/2008, Matt Doidge <[log in to unmask]> wrote:
>>>> Heya guys,
>>>>
>>>> Here's the python script that I was given by Dmitri Litvinse that
>>>> recursively sets the AccessLatency and RetentionPolicy tags in pnfs
>>>> to ONLINE and REPLICA. Usage is:
>>>>
>>>> set_tag.py --dir=/pnfs/wherever/data/vo/foo
>>>>
>>>> or, to be careful about it, cd to the directory and run
>>>> /pathtoscript/set_tag.py --dir=`pwd`
>>>>
>>>> This took nearly 3 hours for my admittedly gargantuan atlas
>>>> directory, so you're best off doing it in chunks. Oh, and as a
>>>> disclaimer, this script comes with no guarantees; it was written for
>>>> us as a favour.
>>>>
>>>> However, doing this doesn't seem to have fixed our troubles: srmv2
>>>> writes still work if you specify the space token but fail if you
>>>> don't, at least for dteam. I don't know about other VOs, as none of
>>>> my colleagues seem to be able to get a proxy today. I might have to
>>>> fiddle with permissions and pretend to be in other VOs to test.
>>>>
>>>> cheers,
>>>> Matt
>>>>
>>>> On 08/02/2008, Greig Alan Cowan <[log in to unmask]> wrote:
>>>>> Hi Matt,
>>>>>
>>>>> Yep, you are bang on. I just set the PNFS tags to REPLICA-ONLINE
>>>>> and now it's all working. Seems to me that things have really been
>>>>> set up to work for dCaches with HSM backends, without much thought
>>>>> for the little guys.
>>>>> I'll report this in the deployment meeting that's starting soon.
>>>>>
>>>>> $ echo ONLINE > ".(tag)(AccessLatency)"
>>>>> $ echo REPLICA > ".(tag)(RetentionPolicy)"
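(For anyone checking their own setup: on pnfs the tags can be read back through the same magic files, so a quick sanity check from inside the tagged directory looks like this - a sketch, run in whichever directory you tagged:)

```shell
# Run inside a tagged pnfs directory: reading the magic tag files
# back should show the values that were just set.
cat ".(tag)(AccessLatency)"    # should print ONLINE
cat ".(tag)(RetentionPolicy)"  # should print REPLICA
```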
>>>>>
>>>>> Can you send round that script?
>>>>>
>>>>> Cheers,
>>>>> Greig
>>>>>
>>>>> On 08/02/08 14:46, Matt Doidge wrote:
>>>>>> Heya guys,
>>>>>>
>>>>>> I've had similar experiences playing with Lancaster's dCache - I
>>>>>> can get writes to work only if I specify the token to write into;
>>>>>> if I leave it unspecified or try to use srmv1 I get "No Space
>>>>>> Available".
>>>>>> From the logs of our experiments, Dimitri and Timur have
>>>>>> concluded that there's some confusion involving default space
>>>>>> settings. Despite having set us to "REPLICA" and "ONLINE" in the
>>>>>> dCacheSetup, writes into our dcache with no write policies set
>>>>>> (i.e. no token specified) are being made to look for a space which
>>>>>> is "NEARLINE" and "CUSTODIAL".
>>>>>> One fix suggested is to edit the srm.batch with:
>>>>>> set context -c SpaceManagerDefaultRetentionPolicy REPLICA
>>>>>> set context -c SpaceManagerDefaultAccessLatency ONLINE
>>>>>> (these were set wrong for us)
>>>>>>
>>>>>> And Dmitri also advised setting the policy tags in the pnfs
>>>>>> directories. Dmitri wrote a nice little python script to do that;
>>>>>> I can forward it if you want, but be warned it took nearly 3 hours
>>>>>> for it to get through our existing atlas directory. Luckily it
>>>>>> should only ever have to be run once.
>>>>>>
>>>>>> I've set things up and am about to have a go at switching the
>>>>>> Space Manager on without breaking our srm. Wish me luck.
>>>>>>
>>>>>> cheers,
>>>>>> Matt
>>>>>>
>>>>>> On 08/02/2008, Greig Alan Cowan <[log in to unmask]> wrote:
>>>>>>> Hi Chris, all,
>>>>>>>
>>>>>>> I've got the SRM2.2 transfers into a reserved space working for
>>>>>>> the Edinburgh dCache.
>>>>>>>
>>>>>>> All I did was add a section to my PoolManager.conf file that
>>>>>>> created a link group and added an existing dteam link to it, i.e.,
>>>>>>>
>>>>>>> psu create linkGroup dteam-linkGroup
>>>>>>> psu set linkGroup custodialAllowed dteam-linkGroup false
>>>>>>> psu set linkGroup replicaAllowed dteam-linkGroup true
>>>>>>> psu set linkGroup nearlineAllowed dteam-linkGroup false
>>>>>>> psu set linkGroup outputAllowed dteam-linkGroup false
>>>>>>> psu set linkGroup onlineAllowed dteam-linkGroup true
>>>>>>> psu addto linkGroup dteam-linkGroup dteam-link
>>>>>>>
>>>>>>> Nothing else changed in PoolManager.conf. In dCacheSetup on the
>>>>>>> SRM node, I have
>>>>>>>
>>>>>>> srmSpaceManagerEnabled=yes
>>>>>>> srmImplicitSpaceManagerEnabled=yes
>>>>>>> SpaceManagerDefaultRetentionPolicy=REPLICA
>>>>>>> SpaceManagerDefaultAccessLatency=ONLINE
>>>>>>> SpaceManagerReserveSpaceForNonSRMTransfers=true
>>>>>>>
>>>>>>> SpaceManagerLinkGroupAuthorizationFileName=/opt/d-cache/etc/LinkGroupAuthorization.conf
>>>>>>> It is also essential to have
>>>>>>>
>>>>>>> srmSpaceManagerEnabled=yes
>>>>>>>
>>>>>>> on all *door* nodes.
>>>>>>>
>>>>>>> I could then reserve a space in the newly created link group
>>>>>>> using the "reserve" command line tool in the srmSpaceManager
>>>>>>> cell. You can then test this with the latest dCache srmclient by
>>>>>>> doing something like:
>>>>>>> srmcp -2 -debug file:////etc/group \
>>>>>>>   srm://srm.epcc.ed.ac.uk:8443/pnfs/epcc.ed.ac.uk/data/dteam/greig_test_dir/`date +%s` \
>>>>>>>   -space_token=1
>>>>>>>
>>>>>>> where the "1" in -space_token=1 is the numerical value of the
>>>>>>> space token reservation that you made.
>>>>>>>
>>>>>>> Transfers using SRMv1 are still returning that there is no space
>>>>>>> available. I need to investigate further why this is. I'll be in
>>>>>>> touch.
>>>>>>> Cheers,
>>>>>>> Greig
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 07/02/08 17:41, Brew, CAJ (Chris) wrote:
>>>>>>>> (I'm guessing jiscmail should be up now)
>>>>>>>>
>>>>>>>> Are there any sites without an MSS backend that have got this
>>>>>>>> working?
>>>>>>>> Chris.
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Greig Alan Cowan [mailto:[log in to unmask]]
>>>>>>>>> Sent: 07 February 2008 17:33
>>>>>>>>> To: [log in to unmask]
>>>>>>>>> Cc: cajbrew
>>>>>>>>> Subject: Re: Help
>>>>>>>>>
>>>>>>>>> Hi Guys,
>>>>>>>>>
>>>>>>>>> Sorry for my silence this afternoon; I've been at CERN all
>>>>>>>>> week and that's me just back home now. I've got a working
>>>>>>>>> PoolManager.conf from FZK which I'm scrutinising. I'll be in
>>>>>>>>> touch later/tomorrow in order to get you both up and running in
>>>>>>>>> SRM2.2 mode.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Greig
>>>>>>>>>
>>>>>>>>> On 07/02/08 17:15, [log in to unmask] wrote:
>>>>>>>>>> It's a pain in the arse. I'm managing to get some results,
>>>>>>>>>> but writes only work when the space token is explicitly set in
>>>>>>>>>> the srmPut, and they fail in every other case. And for some
>>>>>>>>>> reason, even if I only set up a linkGroup for dteam, I still
>>>>>>>>>> seem to affect all other VOs as soon as I throw the
>>>>>>>>>> SpaceManager on, and they get the "No Space Available" error.
>>>>>>>>>>
>>>>>>>>>> At least I'm seeing some progress, I suppose - I can
>>>>>>>>>> technically get SpaceTokens to work. It just means nothing
>>>>>>>>>> else will...
>>>>>>>>>>
>>>>>>>>>> Oh, and your cut-down arcane ritual does indeed seem to work
>>>>>>>>>> wonders - but according to the dCache bods, a restart of the
>>>>>>>>>> door nodes (with the edits to dCacheSetup on board) is
>>>>>>>>>> advisable after each change, something to do with the door
>>>>>>>>>> processes retaining information about the SpaceManager stuff
>>>>>>>>>> (to use the technical terms).
>>>>>>>>>>
>>>>>>>>>> cheers,
>>>>>>>>>> Matt
>>>>>>>>>>
>>>>>>>>>> On 07/02/2008, cajbrew <[log in to unmask]> wrote:
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> Thanks I'm back up now.
>>>>>>>>>>>
>>>>>>>>>>> OK, my arcane ritual was a bit shorter than yours, so I'll
>>>>>>>>>>> share it:
>>>>>>>>>>> In dCacheSetup on the head node
>>>>>>>>>>>
>>>>>>>>>>> Reset srmSpaceManagerEnabled=no
>>>>>>>>>>>
>>>>>>>>>>> and comment out:
>>>>>>>>>>> #srmImplicitSpaceManagerEnabled=yes
>>>>>>>>>>> #SpaceManagerDefaultRetentionPolicy=REPLICA
>>>>>>>>>>> #SpaceManagerDefaultAccessLatency=ONLINE
>>>>>>>>>>> #SpaceManagerReserveSpaceForNonSRMTransfers=true
>>>>>>>>>>>
>>>>>>>>>>> #SpaceManagerLinkGroupAuthorizationFileName=/opt/d-cache/etc/LinkGroupAuthorization.conf
>>>>>>>>>>>
>>>>>>>>>>> In PoolManager.conf, comment out all the LinkGroup
>>>>>>>>>>> configuration.
>>>>>>>>>>> Restart the dcache-core service on the head node.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I thought I had SrmSpaceManager working for a while.
>>>>>>>>>>>
>>>>>>>>>>> I seemed to have a setup where it worked for BaBar, but I
>>>>>>>>>>> could only write to directories where I had explicitly set
>>>>>>>>>>> the AccessLatency and RetentionPolicy using:
>>>>>>>>>>>
>>>>>>>>>>> echo "ONLINE" > ".(tag)(AccessLatency)"
>>>>>>>>>>> echo "REPLICA" > ".(tag)(RetentionPolicy)"
>>>>>>>>>>>
>>>>>>>>>>> But when I restarted to try to replicate the config to CMS
>>>>>>>>>>> and test it from there, it stopped working even for BaBar.
>>>>>>>>>>> Now whatever I try I cannot get writes working with
>>>>>>>>>>> SrmSpaceManager enabled.
>>>>>>>>>>>
>>>>>>>>>>> The trouble is we cannot test this without taking dCache
>>>>>>>>>>> effectively offline for everyone.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Chris.
>>>>>>>>>>>
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: [log in to unmask] [mailto:[log in to unmask]]
>>>>>>>>>>>> Sent: 07 February 2008 13:39
>>>>>>>>>>>> To: cajbrew
>>>>>>>>>>>> Cc: [log in to unmask]
>>>>>>>>>>>> Subject: Re: Help
>>>>>>>>>>>>
>>>>>>>>>>>> When we broke our dcache with the SpaceManager, we found
>>>>>>>>>>>> that in order to get things working again we had to:
>>>>>>>>>>>>
>>>>>>>>>>>> Cross fingers.
>>>>>>>>>>>> Get rid of all the linkGroups in the PoolManager.conf (or at
>>>>>>>>>>>> least remove all the links from them).
>>>>>>>>>>>> Set dCacheSetup to have the SpaceManager disabled on the srm
>>>>>>>>>>>> and all the door nodes.
>>>>>>>>>>>> Rerun install.sh on the srm node (I'm not sure if this is
>>>>>>>>>>>> totally necessary, but it seems to do the trick).
>>>>>>>>>>>> Restart the srm node.
>>>>>>>>>>>> Restart the door nodes.
>>>>>>>>>>>> Throw holy water at your nodes till the SpaceManager leaves
>>>>>>>>>>>> them be.
>>>>>>>>>>>>
>>>>>>>>>>>> It's a bloody lot of hassle, I tell you. To be honest, half
>>>>>>>>>>>> those steps might be unnecessary, but I'm not sure which
>>>>>>>>>>>> half, so I'll keep this arcane ritual.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm totally stuck with the whole SpaceToken thing; after
>>>>>>>>>>>> countless emails with attached configs and logs I've had to
>>>>>>>>>>>> go and give access to our dcache to Dmitri so he can have a
>>>>>>>>>>>> good poke - which goes against some University rules, so I'm
>>>>>>>>>>>> having to be a bit hush hush about it. Hopefully he's not
>>>>>>>>>>>> filling my SRM with naughty pictures, and finds some way to
>>>>>>>>>>>> get us working that I can spread to the other UK dcaches.
>>>>>>>>>>>> Hope this gets your dcache up and running again,
>>>>>>>>>>>>
>>>>>>>>>>>> Matt
>>>>>>>>>>>>
>>>>>>>>>>>> On 07/02/2008, cajbrew <[log in to unmask]> wrote:
>>>>>>>>>>>>> Hi Greig, Matt
>>>>>>>>>>>>>
>>>>>>>>>>>>> (The Atlas center has lost power, so my work mail and the
>>>>>>>>>>>>> maillist are all down.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm trying to enable space tokens but seem to have run
>>>>>>>>>>>>> into the same problem as Matt.
>>>>>>>>>>>>>
>>>>>>>>>>>>> When I try to transfer some data in I get:
>>>>>>>>>>>>>
>>>>>>>>>>>>> heplnx101 - ~ $ lcg-cr -v --vo babar -d heplnx204.pp.rl.ac.uk \
>>>>>>>>>>>>>   -P testfile.brew file:/opt/ppd/scratch/brew/LoadTestSeed
>>>>>>>>>>>>> Using grid catalog type: lfc
>>>>>>>>>>>>> Using grid catalog : lfcserver.cnaf.infn.it
>>>>>>>>>>>>> Using LFN :
>>>>>>>>>>>>> /grid/babar/generated/2008-02-07/file-de6e10d4-db82-4658-8dd7-5b0390c4e8cc
>>>>>>>>>>>>> Using SURL :
>>>>>>>>>>>>> srm://heplnx204.pp.rl.ac.uk/pnfs/pp.rl.ac.uk/data/babar/testfile.brew
>>>>>>>>>>>>> Alias registered in Catalog:
>>>>>>>>>>>>> lfn:/grid/babar/generated/2008-02-07/file-de6e10d4-db82-4658-8dd7-5b0390c4e8cc
>>>>>>>>>>>>> Source URL: file:/opt/ppd/scratch/brew/LoadTestSeed
>>>>>>>>>>>>> File size: 2747015459
>>>>>>>>>>>>> VO name: babar
>>>>>>>>>>>>> Destination specified: heplnx204.pp.rl.ac.uk
>>>>>>>>>>>>> Destination URL for copy:
>>>>>>>>>>>>> gsiftp://heplnx172.pp.rl.ac.uk:2811//pnfs/pp.rl.ac.uk/data/babar/testfile.brew
>>>>>>>>>>>>> # streams: 1
>>>>>>>>>>>>> # set timeout to 0 seconds
>>>>>>>>>>>>> 0 bytes 0.00 KB/sec avg 0.00 KB/sec inst
>>>>>>>>>>>>> globus_ftp_client: the server responded with an error
>>>>>>>>>>>>> 451 Operation failed: Non-null return code from
>>>>>>>>>>>>> [>PoolManager@dCacheDomain:*@dCacheDomain] with error No
>>>>>>>>>>>>> write pools configured for <babar:babar@osm>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Unfortunately when I try to back out and set
>>>>>>>>>>>>>
>>>>>>>>>>>>> srmSpaceManagerEnabled=no
>>>>>>>>>>>>>
>>>>>>>>>>>>> I still get the same error.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I now seem to be stuck: I cannot go forwards or back.
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, actually, I've gone further back and commented out all
>>>>>>>>>>>>> the LinkGroup settings in PoolManager.conf, and I can at
>>>>>>>>>>>>> least transfer data in with both srmv1 and srmv2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So has Lancaster solved this or are we both in the same
>>>>>>>>>>>>> boat?
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Chris.
>>>>>>>>>>>>>
>>>>>>>>>>>>>