Hi Steve,

Burke, S (Stephen) writes:
>
> As Owen said, this is not a good solution because you won't be able to read
> the files, the normal replica management tools need to find the SE in the
> information system.

How's that? I need no infosys to query RLS; and RLS records have pretty
explicit SFNs, don't they? globus-url-copy needs no infosys either.
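
For illustration, a minimal sketch in Python (hostname and paths are
made up, and I assume the usual rewrite of an SFN into a gsiftp URL):
once RLS has given you the SFN, globus-url-copy does the rest without
ever touching the infosys:

    import subprocess

    # hypothetical SFN, as it would appear in an RLS record
    sfn = "sfn://se.example.org/data/atlas/file1"
    # the SFN already names the host and path; just rewrite the scheme
    src = sfn.replace("sfn://", "gsiftp://", 1)
    subprocess.check_call(["globus-url-copy", src, "file:///tmp/file1"])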

Anyway, I am not really insisting on removing SEs from the infosystem;
but do you know of any LCG tool or method that makes use of the
published free space? And you actually mentioned yourself that what is
published is the overall space, not a per-VO quota. I'm suggesting the
least damaging solution (in my opinion), and I'm willing to discuss
alternatives.

> Also, intrinsically a full SE is not a fatal error any
> more than a full disk on any system, it's just that users need some way of
> dealing with the condition.

A full disk on a system is not a fatal error. I have plenty of full
ones sitting around. Just checked: NorduGrid has 17 out of 43 disk SEs
completely full. You just use the system read-only, which is perfectly
fine for a Storage Element. A full *system* partition is fatal, but I am
sure nobody puts the storage area and the system area on the same
partition.

> The free space is published in the information
> system so it should be possible to recognise the situation and deal with it
> in whatever way you like - maybe atlas would actually rather leave the SEs
> full and write new files somewhere else.

This is effectively the situation. If a job fails to write to an SE -
whatever the reason - it will eventually store the file wherever it
can be stored. It doesn't use the free space reported in the infosystem,
just a "kamikaze" method :-)

BTW, the reported free space is useless for yet another reason: imagine
there's 10 GB reported free, and 10 jobs read this information
simultaneously (and they do, even more than 10), and duly start
uploading a 2 GB file each. Guess what will happen. Right, all will fail.
Meanwhile, the SE GRIS will time out because the system will get
overloaded with 10 multithreaded transfers, and 10 more jobs will still
see 10 GB free because this is what will be cached in the BDII. And
so on. Ain't that cool.
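
If you want it in numbers, a toy model (chunk size made up) of ten
interleaved transfers draining the same 10 GB pool shows why not even
five of them complete:

    free_mb = 10 * 1024
    uploads = [0] * 10            # MB written so far by each of the 10 jobs
    CHUNK, TARGET = 64, 2 * 1024  # transfer chunk and file size in MB

    while free_mb >= CHUNK and any(u < TARGET for u in uploads):
        for i in range(10):
            if uploads[i] < TARGET and free_mb >= CHUNK:
                uploads[i] += CHUNK
                free_mb -= CHUNK  # all jobs drain the same pool together

    print("complete uploads:", sum(u >= TARGET for u in uploads))  # -> 0

Every job ends up with a partial file of about 1 GB, and not a single
transfer finishes.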

> The only reason it can be a problem
> is on systems where all VOs share the space and there are no quotas, so one
> VO can block the others.

So, we can block LHCb and they can block us. We're even ;-)

>   A separate point is the question of reliability. Tier-1s will typically
> commit to a high level of reliability so you can reasonably expect that files
> there are safe. Many sites, even ones with large amounts of space, may not
> have much reliability or backup, so if disks crash data may be lost. I'm not
> sure how that can be represented, how do you quantify the likelihood of
> losing data?

Nobody's perfect. A certain person here suggested having data loss
insurance :-) The smaller the site, the less compensation is to be paid.
Profits from the insurance company should finance the purchase of more
storage hardware. How's that? ;-)

Seriously, I would suggest changing the entire LCG SE model - and the
information system schema. Only a reasonably reliable facility,
committed to long-term storage, should qualify as an SE. SEs should not
necessarily be linked to sites; they should be standalone services,
available via GridFTP, SRM or whatever, registering with GIISes
independently of the rest of the site. Thus we would be able to have a
set of sites for processing data, and a [different] set of SEs for
storing the results. The disk space local to a site and necessary for
its proper functioning should be renamed and treated as "cache", and
must not be used for long-term data storage. Of course, the "real" SE
disk space may well be cross-mounted on the WNs; that is up to the
sysadmin.
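
Schematically (the attribute names below are invented for illustration,
not a proposal for actual schema attributes), such a standalone SE would
register a record carrying a reliability commitment rather than a
free-space number:

    # hypothetical registration record for a standalone SE service
    se_record = {
        "id": "se.example.org",
        "protocols": ["gsiftp", "srm"],        # access protocols offered
        "commitment": "long-term",             # reliability class, no free space
        "registers_to": ["giis.example.org"],  # independent of any site
    }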

Oxana