That may explain things. We have, annoyingly, been replicating all files
that come into our dCache; I thought these were the parameters to stop it.
Which parameters does Matt need to tweak to fix this? It is not necessarily
a bandwidth problem (though it could become one), but it is a potential
problem for simultaneous read/write on the disks, akin to the problem
Frasier saw with number of streams > 1.
brian


On 29/09/06, Greig A Cowan <[log in to unmask]> wrote:
>
>
> Brian,
>
> I think you are talking about the case where the dCache is running in
> resilient mode. GridPP dCache sites are not (in general) doing this so
> cannot set the min and max number of replicas in the system. There is the
> potential to run the replica manager on a subset of pools so in theory you
> could say that all atlas pools will be controlled such that there are 2
> copies of each file while the remainder of the pools only have 1 by
> default (the dCache may still create some replicas of some popular files
> in order to better load balance the system).
>
> Regarding lcg-infosites, all it does is report the information that each
> of the SEs is reporting about its used and available storage. It is the
> SE GIPS plugins that would have to change to take account of replicas,
> not lcg-infosites.
>
> Greig
>
>
> On Fri, 29 Sep 2006, brian davies wrote:
>
> > The number of replicas can be set in a config file with two values, min
> > and max (Matt Doidge can remind me which file, but it is also in the
> > dCache book under replica manager). This means that the dCache keeps
> > between a min and a max number of replicas of each file. Some sites may
> > want to increase the number of copies for redundancy, so as to secure
> > file transfer (RAID0 and multiple replicas rather than RAID5+ and only
> > one copy. Actually, we could get an extra 11.6TB if we RAID0 our
> > storage, but that would probably make my storage volatile?). This of
> > course would halve the capacity, since I would probably need to have a
> > replica of each file.
> > This is something to point out for those who want a resilient dCache:
> > you need to divide your disk space by two or three to get what you can
> > offer to your experiment.
> > When dCache replicates a file for bandwidth reasons, it is a luxury to
> > the experiments and so they shouldn't be charged (i.e. we don't have to
> > offer this service; in fact DPM can't do it at the moment, IIRC). It is
> > a problem, which we have noticed, that lcg-infosites does not take
> > replicas into account. This is the first I have heard of experiments
> > being charged for network bandwidth.
> > brian
> >
> >
> > On 29/09/06, Jensen, J (Jens) <[log in to unmask]> wrote:
> > >
> > > >> For user requested replication, the fact that there are replicas
> > > >> would still not appear in the PNFS namespace of dCache. You really
> > > >> need a DB query to pick out _all_ of the files.
> > > >
> > > > I must agree with Greig, as we can't just bill for space used but
> > > > SHOULD also bill for bandwidth. Replicas are a consequence of
> > > > bandwidth.
> > >
> > > Cost modeling is quite a tricky business - a whole new sub-branch of
> > > astrology!
> > >
> > > So how does the dCache thing work then; do users effectively get
> > > Disk2Tape0 (at least for certain files, ignore the spaces for now) or
> > > are the replicas temporary, for internal optimisations?
> > >
> > > Can we infer from this that cumulative data in and/or data out should
> > > be a metric?
> > >
> > > -j
> > >
> >
>
> --
> ========================================================================
> Dr Greig A Cowan                         http://www.ph.ed.ac.uk/~gcowan1
> School of Physics, University of Edinburgh, James Clerk Maxwell Building
>
> TIER-2 STORAGE SUPPORT PAGES: http://wiki.gridpp.ac.uk/wiki/Grid_Storage
> ========================================================================
>