Indeed you can't. Apologies for the slander. :-)
What I want is to enable as wide a range of applications as possible
to use the grid. In order to enable 'legacy' software to use this new
infrastructure, a 'file' SE protocol should be available at least until
file access methods have stabilized and standardized. LCG has focused on
the L part for obvious reasons, but it could proliferate if the tools are
accepted.
Locality of data remains an important concept that even the fastest nets
cannot match. Look at the transports available in a cluster (FC,
InfiniBand, PCI Express) and consider the fact that these connections
are available at a flat rate.
Your dream of remote user space mounting has materialized in many
projects already. Maybe too early for the LCG, but surely not for the
spin-off I would like to see happen.
Now to answer your questions:
- yes bring it back (I was not even aware it was gone)
- yes allow a close SE to CE option. Not binary but more in terms of
distance costs. A close CE has a lower IO cost. I assume something
similar is provisioned for other SE supported protocols.
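
To make "distance costs rather than binary" concrete, here is a minimal
sketch in Python. Everything in it is an assumption for illustration:
the host names, the protocol sets, the cost numbers and the function
name rank_ses are hypothetical and do not come from any LCG tool or
from the information system schema. It only shows the idea of ranking
the SEs that speak a given protocol by a per-CE IO cost.

```python
# Hypothetical catalogue: SE name -> access protocols it publishes.
SE_PROTOCOLS = {
    "se.site-a.example": {"file", "gsiftp"},
    "se.site-b.example": {"gsidcap", "gsiftp"},
    "se.site-c.example": {"file"},
}

# Hypothetical relative IO cost from a CE to an SE; lower means closer.
IO_COST = {
    ("ce.site-a.example", "se.site-a.example"): 1,
    ("ce.site-a.example", "se.site-b.example"): 10,
    ("ce.site-a.example", "se.site-c.example"): 50,
}

# Cost assumed for any CE/SE pair that has not been published.
DEFAULT_COST = 1000


def rank_ses(ce, protocol):
    """Return the SEs supporting `protocol`, cheapest IO cost first."""
    usable = [se for se, protos in SE_PROTOCOLS.items() if protocol in protos]
    # Ties are broken by SE name so the ordering is deterministic.
    return sorted(usable, key=lambda se: (IO_COST.get((ce, se), DEFAULT_COST), se))


if __name__ == "__main__":
    print(rank_ses("ce.site-a.example", "file"))
```

With a binary 'close' flag only the first entry would qualify; with
costs the broker still gets a usable, ordered fallback list.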
J
Jeff Templon wrote:
>
> Me?? A software guy? I can't even spell Pearl correctly ;-)
>
> So it sounds like what you want is the following:
>
> - bring back the 'file' protocol since that is *real* posix.
> - allow a SE to be 'close' to a CE only if 'file' protocol is possible
> between them.
>
> Did I understand it right? Sounds reasonable to me; 'file' as a
> supported protocol would require specifying the CEs to which it applies.
>
> I am not sure whether it is better for the SE or CE to do this
> specifying. I have a light preference for having the SE publish the CEs
> for which it provides "intimate services", but I'm interested in what
> Stephen, Maarten and maybe Laurence have to say.
>
> JT
>
> Jos van Wezel wrote:
>
>> Jeff, Maarten,
>>
>> of course I'm being pedantic and it should be carried on another list
>> but could you please refrain from using posix and rfio/dcap/gfal and
>> the like in one single sentence.
>>
>> The common misnomer nowadays, at least in part of the grid storage
>> world, is: "POSIX-like". This just means you have to rewrite your
>> program. The fact that you cannot run some software package in a grid
>> without changing the IO part is a show-stopper for many applications.
>> If it is not POSIX it is not POSIX. Period.
>>
>> There are many technical and financial implications too. You software
>> guys are probably not aware that these "POSIX-like" (damn now I'm
>> using it myself) protocols require us to install twice the number of
>> IO servers because of the poor performance of these user space
>> thingies. Secondly, the fabric managers are now confronted with a wave
>> of protocols to support IO.
>>
>> Lastly, doing IO via WAN may be possible for the larger data but not in
>> the coming 10 years for all the logs, scratchpads, homes etc.
>>
>> The concept of a close SE remains. In fact one needs a POSIX compliant
>> access in every cluster, not only to speed up things but also to
>> enable more applications to step into the grid. There is life after
>> the LHC as Alan Silverman recently said. The computing grid will stay
>> if we do it right.
>>
>> Jos
>>
>>>
>>> Seems to me the concept of 'close' SE has been problematic since the
>>> BOG (Beginning Of Grid). If we ever agree on a definition of 'close'
>>> -- and I doubt it, just try discussing what a 'dataset' is with
>>> somebody -- then we could use it.
>>>
>>> Given that 'close SE' comes out of the BOG epoch and is still poorly
>>> defined, I think it's best to ask: "what problem is it that having a
>>> 'close SE' is supposed to solve?"
>>>
>>> I can think of only two:
>>>
>>> - posix file access to that SE
>>> - fast network to that SE
>>>
>>> I suspect the first point is no longer a valid one, since 'posix
>>> file' used to mean 'NFS mount' which is why the SE had to be
>>> 'close'. Now that we have e.g. gsirfio, I could have a "close SE" in
>>> Chicago.
>>>
>>> So I would be in favor of reporting the protocols (we do this
>>> already) and also specifying a site-default SE, with the *option* of
>>> specifying a different one per VO.
>>>
>>> I still hope we will have user-space remote mounting someday, so I
>>> could e.g. mount the grid file system with root
>>> /grid/atlas/rome05/bbar/ivov as /gmount/ivovdata in user space ...
>>> solves a lot of problems.
>>>
>>> J "this is only one cup of coffee" T
>>>
>>>
>>> Maarten Litmaath, CERN wrote:
>>>
>>>> On Thu, 7 Jul 2005, Kyriakos G. Ginis wrote:
>>>>
>>>>
>>>>
>>>>> On Thu, Jul 07, 2005 at 12:10:43PM +0200, Maarten Litmaath, CERN
>>>>> wrote:
>>>>>
>>>>>
>>>>>> On Mon, 4 Jul 2005, Rod Walker wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> The situation is that SFU has significant disk and tape storage,
>>>>>>> running dcache, and very good network to TRIUMF and WestGrid cpu.
>>>>>>> Previously it was published via the TRIUMF-GC-LCG2 site giis, but
>>>>>>> this was for convenience, and in order to isolate it from
>>>>>>> maintenance and problems at TRIUMF it would make sense to have a
>>>>>>> separate site (giis).
>>>>>>
>>>>>>
>>>>>>
>>>>>> OK. Any site that is "close" to SFU can publish the SE as a close
>>>>>> SE.
>>>>>
>>>>>
>>>>>
>>>>> Hello,
>>>>>
>>>>> Regarding this 'close SE' issue: If a site 'A' publishes a SE 'B' as a
>>>>> close SE, are the WNs of site 'A' expected to have rfio access to
>>>>> the SE 'B'?
>>>>
>>>>
>>>>
>>>>
>>>> You raise an interesting point. First of all, not all SEs support
>>>> RFIO: a dCache SE has "gsidcap" instead. A user application may be
>>>> able to deal with both protocols, though, e.g. by using GFAL. If the
>>>> SE advertises any such POSIX-like access protocol, one would indeed
>>>> expect to be able to use the protocol from any CE (WN) that is
>>>> "close" to the SE, but I do not know if such is required according to
>>>> some official document at this time.
>>>>
>>>> In the case of SFU there need not be a problem, as it simply could
>>>> abstain from publishing "gsidcap", but when an SE is accessible from
>>>> the local CE through a POSIX-like protocol, a remote site should no
>>>> longer declare it as a close SE, even when it is physically close and
>>>> preferred... The remote site could still declare it to be the default
>>>> SE, though.
>>>> Comments?
>>>
>>>
>>>
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 08:09:31 +0300
>>> From: Filippidis christos <[log in to unmask]>
>>> Subject: problem passing the daily tests
>>>
>>> hi,
>>> I have a problem passing the daily tests, as you can see here:
>>>
>>> http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/sitereports.cgi?site=xg009.inp.demokritos.gr
>>>
>>>
>>> my site failed at (3rd Party Rep. central SE to defaultSE) and at
>>> (lcg-rep central SE to defaultSE)
>>>
>>> I have a hardware firewall and I believe that I have the ports that
>>> an SE needs open.
>>>
>>> Do you think I need to check the firewall more carefully, or could
>>> it be something else?
>>>
>>> thanks xristos
>>>
>>> Christos Filippidis
>>> NCSR DEMOKRITOS
>>> Institute of Nuclear Physics
>>> office block 6(ktirion 6)
>>> Gr-15310 Agia Paraskevi
>>> GREECE
>>> Tel:2106503425
>>>
>>> http://consult.cern.ch/xwho/people/117002
>>> http://www.inp.demokritos.gr/~filippidisx/
>>>
>>>
>>>
>>>
>>> ----------------------------------------------
>>>
>>> "Institute of Nuclear Physics NCSR Demokritos"
>>> http://www.inp.demokritos.gr/
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 10:08:51 +0100
>>> From: Kostas Georgiou <[log in to unmask]>
>>> Subject: Re: Site with only SE
>>>
>>> On Fri, Jul 08, 2005 at 09:48:06AM +0200, Jeff Templon wrote:
>>>
>>>
>>>> I still hope we will have user-space remote mounting someday, so I
>>>> could e.g. mount the grid file system with root
>>>> /grid/atlas/rome05/bbar/ivov as /gmount/ivovdata in user space ...
>>>> solves a lot of problems.
>>>
>>>
>>>
>>>
>>> Well it seems that FUSE (http://fuse.sourceforge.net/) will be merged
>>> in the kernel at some point (http://lkml.org/lkml/2005/6/30/51).
>>> Implementing "gsiftpfs" in user space should be easy with it ;P
>>> Kostas
>>>
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 11:48:32 +0200
>>> From: "Maarten Litmaath, CERN" <[log in to unmask]>
>>> Subject: Re: problem passing the daily tests
>>>
>>> On Fri, 8 Jul 2005, Filippidis christos wrote:
>>>
>>>
>>>> hi,
>>>> i have problem passing the daily test , as you can see here:
>>>>
>>>> http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/sitereports.cgi?site=xg009.inp.demokritos.gr
>>>>
>>>>
>>>> my site failed at (3rd Party Rep. central SE to defaultSE) and at
>>>> (lcg-rep central SE to defaultSE)
>>>>
>>>> i have a hardware firewall and i believe that i have the ports that
>>>> a SE
>>>> needs open.
>>>
>>>
>>>
>>>
>>> Are your WNs on a private network?
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 10:01:22 +0300
>>> From: Filippidis christos <[log in to unmask]>
>>> Subject: Re: problem passing the daily tests
>>>
>>> yes everything is on a private network (the WNs, the SE and the CE)
>>>
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 12:40:26 +0200
>>> From: "Maarten Litmaath, CERN" <[log in to unmask]>
>>> Subject: Re: problem passing the daily tests
>>>
>>> On Fri, 8 Jul 2005, Filippidis christos wrote:
>>>
>>>
>>>> yes everything is on a private network (the WNs, the SE and the CE)
>>>
>>>
>>>
>>>
>>> I suppose your WNs access your SE using its private address?
>>> That will cause 3rd party transfers to fail, because the private
>>> address gets communicated to a remote site, where it cannot be used.
>>> In LCG-2_6_0 (due end of next week) there will be a work-around for
>>> this problem. If you need it fixed now, I can tell you which rpms
>>> to upgrade and how to get it to work. Alternatively, you can let
>>> the WNs access the SE always through its public address.
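
The failure described above (a private SE address being handed to a
remote site) can be checked mechanically. A minimal sketch, assuming
only Python's standard ipaddress module; the function names and the
example addresses are hypothetical and not part of any LCG tool:

```python
import ipaddress


def usable_from_remote_site(address):
    """Return True if `address` could be reached from outside the site.

    Private (RFC 1918 style), loopback and link-local addresses are
    only meaningful inside the local network, so a remote site cannot
    use them as the endpoint of a third-party transfer.
    """
    ip = ipaddress.ip_address(address)
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


def address_to_advertise(candidates):
    """Pick the first candidate address that a remote site could use."""
    for address in candidates:
        if usable_from_remote_site(address):
            return address
    return None  # nothing publishable: third-party transfers would fail


if __name__ == "__main__":
    # Hypothetical SE with a private and a public interface.
    print(address_to_advertise(["192.168.1.10", "8.8.8.8"]))
```

This only classifies addresses; it says nothing about whether the
firewall actually lets the transfer through.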
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 11:45:00 +0100
>>> From: "Burke, S (Stephen)" <[log in to unmask]>
>>> Subject: Re: Site with only SE
>>>
>>> LHC Computer Grid - Rollout
>>>
>>>> [mailto:[log in to unmask]] On Behalf Of Jeff Templon
>>>
>>>
>>>
>>> said:
>>>
>>>> Seems to me the concept of 'close' SE has been problematic since the
>>>> BOG (Beginning Of Grid). If we ever agree on a definition of 'close'
>>>> -- and I doubt it, just try discussing what a 'dataset' is with
>>>> somebody -- then we could use it.
>>>
>>>
>>>
>>>
>>> It has traditionally meant at least three different things: 1) a default
>>> SE to use for writing files if no destination is explicitly specified;
>>> 2) an SE you can use for reading files from a WN from which the access
>>> can be expected to be "fast" in some undefined sense; 3) an SE to which
>>> you can get "local" access from a WN for protocols like NFS and rfio
>>> which only work within a site.
>>>
>>> The first of those has been superseded by a VO-dependent environment
>>> variable in LCG for some time, and that should now be explicitly
>>> published in the new Glue schema. The third case was never very explicit
>>> and didn't work very well; NFS has been out of use for some time and
>>> rfio is not much used so it hasn't been that much of a problem. However,
>>> if we intend to keep using site-local protocols, which we probably do,
>>> we should come up with a better way to do it, and leave the SE binding
>>> to the second case. Even there the semantics aren't very well defined,
>>> e.g. if you specify multiple input files the broker only requires one of
>>> them to be on a close SE (at least that used to be the case, I haven't
>>> checked lately).
>>>
>>> There is also the technical point that for historical reasons the
>>> replica manager code used the access point in the CESEbind to construct
>>> the SE pathname for classic SEs, with the result that a classic SE had
>>> to be close to some CE. That is now fixed in the new glue schema, but I
>>> don't know if the replica management tools have been updated yet.
>>>
>>> Stephen
>>>
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 12:49:07 +0200
>>> From: EGEE BROADCAST <[log in to unmask]>
>>> Subject: R-GMA registry unavailable at 13:10 BST (GMT+1) today.
>>>
>>> ------------------------------------------------------------------------------------
>>>
>>> Publication from : steve traylen <[log in to unmask]> (RAL-LCG2)
>>> This mail has been sent using the broadcasting tool available at
>>> http://cic.in2p3.fr
>>> ------------------------------------------------------------------------------------
>>>
>>>
>>> In order to address a memory leak in the JDBC code used by the R-GMA
>>> registry there will be a short interruption to the registry service
>>> today at 13:10 BST (GMT+1).
>>>
>>> This is expected to take less than 30 minutes. The situation will be
>>> monitored closely afterwards, in particular following the SFTs, which
>>> may well be rerun.
>>>
>>> People may have noticed that the browser at
>>>
>>> http://lcgic01.gridpp.rl.ac.uk:8080/R-GMA/
>>> is no longer visible. This is by design and all R-GMA MON boxes are
>>> by default configured with a web browser interface.
>>>
>>> Steve
>>> ------------------------------
>>>
>>> Date: Fri, 8 Jul 2005 12:27:42 +0100
>>> From: Alessandra Forti // EOJ <[log in to unmask]>
>>> Subject: LCG-2_6_0 plans?
>>>
>>> Hi,
>>>
>>> can anyone from the deployment team at CERN update us on the
>>> situation of LCG-2_6_0?
>>>
>>> I need to schedule manpower and agree with some of the experiments
>>> when to do the upgrade. The release was due this week but there is no
>>> sign of it and I haven't seen any email that explains why it has been
>>> delayed and when it is foreseen for and what we should expect from
>>> it. It would be very helpful to know. An EGEE broadcast would be
>>> appreciated.
>>>
>>> thanks
>>>
>>> cheers
>>> alessandra
>>>
>