On Thu, 6 Jul 2006 10:23:24 +0100
Greig A Cowan <[log in to unmask]> wrote:
> > What is annoying is that dcache even when it knows that all pools
> > are online and it knows very well that we don't have a tape it will
> > happily report that the file is there and wait forever for the tape
> > (what bloody tape?) to deliver the file.
>
> The problem is that dCache was designed to run at sites that have some
> form of MSS backend. Running in "disk only" mode is something that is
> relatively new.
>
> > That's the price you pay for having two independent databases
> > keeping information about the same data. Sooner or later they *will*
> > get out of sync. We *are* going to lose files; whether it's a hardware
> > or a software error isn't important. At the moment there is no way
> > to sync the databases and I don't expect it to ever happen either :(
>
> I agree that it is likely that the SRM databases and file catalogs
> will get out of sync, but this is the system that we have to work with
> at the moment. I think sites (where their resources allow) are already
> trying to minimise the effect of failures by having some form of
> resiliency built in (database backups, RAID, file replicas, redundant
> power supplies...).
>
> Greig

Sorry to correct you here, Greig, but the SRM database holds SRM requests,
not the file system name space as your choice of words may imply, which
is why Derek has dropped this database multiple times in the past for
Tier 1. As in any project of any size, developers MUST make a considerable
effort to avoid duplicate state (or code) unless it is well managed and
designed in from day one (e.g. message queues and tuple spaces).

dCache uses PNFS for file system name space management (through a
POSIX file/directory API); this is effectively a view of the PNFS
databases, in the same way as the Castor SRM uses a conventional RDBMS
(Oracle) view to see the name server within Castor. Neither the Castor
nor the dCache SRM database contains a duplicate file system catalogue.

The replica catalogue and the SRM do have duplication for grid files, but
SRMs must also support local usage. These will and do get out of sync.
It is my opinion that the replica management services should resolve
these inconsistencies, and SRMs should report on the availability of
files within them.

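A minimal sketch of the kind of consistency check described above, assuming
the replica catalogue's entries for one storage element and the SRM's list
of available files can each be obtained as plain lists of names. The
function name `reconcile` and the example file names are hypothetical; this
does not use any real replica catalogue, dCache, or SRM API.

```python
def reconcile(catalogue_entries, srm_available):
    """Compare two views of the same storage element.

    catalogue_entries: names the replica catalogue believes are stored here.
    srm_available: names the SRM reports as actually available.
    (Both inputs are assumed to be simple iterables of strings.)
    """
    catalogue = set(catalogue_entries)
    available = set(srm_available)
    return {
        # Registered in the catalogue but not available from the SRM.
        "lost": sorted(catalogue - available),
        # Available from the SRM but not in the catalogue; this may be
        # legitimate local usage rather than an error.
        "unregistered": sorted(available - catalogue),
    }


# Hypothetical example data.
report = reconcile(
    ["/grid/a.dat", "/grid/b.dat", "/grid/c.dat"],
    ["/grid/b.dat", "/grid/c.dat", "/local/d.dat"],
)
print(report)
```

Only the "lost" entries are clear inconsistencies for the replica management
service to resolve; "unregistered" files may simply be local, non-grid data.
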
Regards
Owen
>
> --
> ========================================================================
> Dr Greig A Cowan
> http://www.ph.ed.ac.uk/~gcowan1
> School of Physics, University of Edinburgh, James Clerk Maxwell Building
>
> TIER-2 STORAGE SUPPORT PAGES:
> http://wiki.gridpp.ac.uk/wiki/Grid_Storage
> ========================================================================
--
###########################################################
Please note that my email address is now [log in to unmask]
###########################################################