Hi
my two cents
Quoting Oxana Smirnova <[log in to unmask]>:
> Hi Owen,
>
> owen maroney wrote:
>
> > "Join [log in to unmask] and post message there."
> >
Has anybody ever heard of Savannah and trouble-ticketing systems?
Tickets there are (supposedly) directed to the right people,
whoever they are.
cheers
Mario David
> > then OK, that is Atlas's choice. (Although we were actually just told
> > "mail [log in to unmask]" :-) ) However, I do not personally feel this
> > is the correct solution.
>
> No, neither do I, but there's nothing better at the moment. As the list
> manager, I can't open it to posting from non-members: my mailbox is
> already 80% spam. Besides, it's not too difficult to
> unsubscribe as soon as your problem is solved. It's not very
> "user-friendly", but nothing is.
>
> > Absolutely, on this I completely agree with you! But I do wonder is it
> > really the case that all the data is actually still needed on the SE at
> > the end of the job? There were no input files that could be deleted?
> > No files that could be staged somewhere else?
>
> Actually, ATLAS jobs do stage files elsewhere. Of course the "close SE"
> is the preferred one, but if it is not available, the data are copied
> elsewhere. Too bad LCG gatekeepers have no standard procedure for this,
> despite multiple requests from user(s). There's no built-in data
> management concept in this Grid.
>
> > OK, this is where I cannot follow your reasoning.
> >
> > OK, if I remove my SE from the Information System, then maybe you can't
> > get *any* of your data on my SE. Neither can any of the other VO's.
> > You can't even clean data off my SE that you don't need.
>
> Isn't that what you want - to prevent me from writing onto your SE?
> Also, I need no infosys to clean data off anything. You forget about RLS
> (or whatever it's going to be replaced with). RLS is the index that
> lists file locations, not infosystem.
>
> > If that's an acceptable solution to you, then you clearly don't need
> > your data and I would be justified in doing rm -rf
> > /my/data/storage/atlas/*
>
> No, you are not, no matter what happens. Because every file is
> registered in RLS, and by doing "rm -rf" you not just free the disk
> space, but corrupt the entire database. Many thanks. Who's going to
> clean up those orphaned records, eh?
>
> > The point is that taking the SE out of the information system does not
> > solve the problem.
>
> Perhaps not, it's too radical of a solution, but I don't see many
> alternatives.
>
> > Pressing the data management folks to develop a better solution is
> > clearly the long term solution. But it hardly deals with the problem now?
>
> So, what does, in your opinion? I can't help thinking that if we hadn't
> been aiming at short-term solutions 3 years ago, we would have a far
> more reasonable system now.
>
> > Yes, because Atlas have filled up a lot of SE's with your data. Clearly
> > Atlas have produced a lot of data on the grid, which is excellent, but
> > has not considered there to be any need to develop a coherent strategy
> > for managing that data, which is not excellent.
>
> Actually, we officially requested a certain amount of storage from LCG,
> and have been told that it's there and we can use it. So it's rather
> surprising that now it turns out we were supposed to arrange some other
> storage outside LCG. I would like to hear an official statement on it.
>
> > It is naive to claim "all SE's are equal" and assume that you can treat
> > a disk storage system at a minor regional centre as equally permanent
> > storage to a tape storage system at Tier 1 centres.
>
> Please let me know how I can distinguish between them by simply reading
> the infosystem. I'm not aware of any such attribute, "GlueSETierLevel" or
> the like.
>
> > It is simply
> > *false* to assume you can ignore the fact that there is no automatic
> > space management and just keep copying new data onto a finite disk
> > space. If that is Atlas' data management framework, then that framework
> > is clearly broken!
>
> What else are disks for but storing data? To heat the premises? Is the
> LCG data management framework meant to keep disks vacant? Disk space
> consumption never decreases, it never even stays constant: it can ONLY
> increase, ATLAS or no.
>
> This is how SEs are fundamentally different from CEs, as a resource: the
> job finishes, the CE is free again. SEs never become free, they
> only get filled up.
>
> > The current data management software is clearly missing a whole layer,
> > but if Atlas's response is to just close their eyes, refuse to manage
> > their use of disk resources and cause SE's to fill up and fall over,
> > that seems to me to be an abuse of resources.
>
> Why would an SE "fall over" when filled? It is meant to be filled,
> eventually. Why would anybody manufacture disks otherwise?
>
> Please distinguish between SEs and the disk space on WNs necessary for
> running jobs. It is *NOT* the same thing.
>
> > To suggest a different perspective: given disk SE's will inevitably fill
> > up with data if the experiments do not implement their own
> > data management frameworks, disk SE's should be considered *non-permanent
> > by default*. If you are unable to determine which SE's are disk or not,
> > then your data management framework will just have to assume any given
> > SE is non-permanent, and arrange for files to be transferred to known
> > SE's that you have discovered (by whatever means) to be permanent for
> > you.
>
> Firstly, there's no way to tell one LCG SE from another, as we already
> discussed. Secondly, even a disk SE can be permanent, there's nothing
> wrong with it.
>
> > Then might I suggest you start taking a careful look at what actual
> > resources are available, find out where you have the capacity to
> > permanently store it, and develop a data management framework?
>
> Good idea, but somehow we thought that's why the LHC Computing Grid
> project was created in the first place, and this is why LCG middleware is
> based on the EU *DATAGRID* one :-) It's not an ATLAS-specific problem
> for sure. I don't think ATLAS or anybody else should have to arrange for
> storage outside the Grid framework.
>
> > It does not seem to me to be beyond the capability of the current grid
> > tools to develop fairly simple processes for job submission frameworks
> > to (a) transfer output data files to known permanent storage spaces
> > (such as tape storage at the Tier 1's) and (b) clean unneeded data files
> > off local SE's. It is certainly the case that other experiments have
> > managed this!
>
> So, the time is ripe to implement it as a standard LCG service, so that
> the experiments do not reinvent the wheel over and over again. BTW,
> other Grids do not appear to run into this kind of problem, although
> they processed exactly the same volume of ATLAS data as LCG.
>
> Oxana
>