JISCMail - LCG-ROLLOUT Archives


LCG-ROLLOUT@JISCMAIL.AC.UK

Subject:
Re: Followup: Site with only SE
From:
Jeff Templon <[log in to unmask]>
Reply-To:
LHC Computer Grid - Rollout <[log in to unmask]>
Date:
Wed, 13 Jul 2005 12:34:02 +0200
Content-Type:
text/plain
Parts/Attachments:
text/plain (423 lines)
Me??  A software guy?  I can't even spell Pearl correctly ;-)

So it sounds like what you want is the following:

- bring back the 'file' protocol since that is *real* posix.
- allow a SE to be 'close' to a CE only if 'file' protocol is possible
   between them.

Did I understand it right?  Sounds reasonable to me; 'file' as a 
supported protocol would require specifying the CEs to which it applies.

I am not sure whether it is better for the SE or the CE to do this 
specifying.  I have a slight preference for having the SE publish the CEs 
for which it provides "intimate services", but I'm interested in what 
Stephen, Maarten and maybe Laurence have to say.

					JT
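
A minimal sketch of the rule proposed above, in Python. It assumes the SE is
the one that publishes, per protocol, the CEs that protocol applies to. The
names StorageElement and is_close and the host names are hypothetical; they
are not part of any LCG tool or of the Glue schema.

```python
from dataclasses import dataclass, field


@dataclass
class StorageElement:
    name: str
    # Hypothetical layout: protocol name -> set of CEs it is published for.
    # An empty set means the protocol is not tied to particular CEs.
    protocols: dict = field(default_factory=dict)


def is_close(se, ce_name):
    """An SE counts as 'close' to a CE only if it publishes 'file' for that CE."""
    return ce_name in se.protocols.get("file", set())


se = StorageElement(
    "se.example.org",
    {"gsidcap": set(), "file": {"ce01.example.org"}},
)
print(is_close(se, "ce01.example.org"))       # CE listed for 'file'
print(is_close(se, "ce.remote.example.org"))  # CE not listed for 'file'
```

With this layout an SE that offers only gsidcap or gsirfio is never 'close'
to anything; it could still be a site's default SE.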

Jos van Wezel wrote:
> Jeff, Maarten,
> 
> of course I'm being pedantic and it should be carried on another list 
> but could you please refrain from using posix and rfio/dcap/gfal and the 
> like in one single sentence.
> 
> The common misnomer nowadays, at least in part of the grid storage 
> world, is: "POSIX-like". This just means you have to rewrite your 
> program. The fact that you cannot run some software package in a grid 
> without changing the IO part is a show-stopper for many applications. If 
> it is not POSIX it is not POSIX. Period.
> 
> There are many technical and financial implications too. You software 
> guys are probably not aware that these "POSIX-like" (damn, now I'm using 
> it myself) protocols require us to install twice the number of IO
> servers because of the poor performance of these user space thingies.
> Secondly, the fabric managers are now confronted with a wave of 
> protocols to support IO.
> 
> Lastly, doing IO via WAN may be possible for the larger data but not in 
> the coming 10 years for all the logs, scratchpads, homes etc.
> 
> The concept of a close SE remains. In fact one needs a POSIX compliant 
> access in every cluster, not only to speed up things but also to enable 
> more applications to step into the grid. There is life after the LHC as 
> Alan Silverman recently said. The computing grid will stay if we do it 
> right.
> 
> Jos
> 
>>
>> Seems to me the concept of 'close' SE has been problematic since the 
>> BOG (Beginning Of Grid).  If we ever agree on a definition of 'close' 
>> -- and I doubt it, just try discussing what a 'dataset' is with 
>> somebody -- then we could use it.
>>
>> Given that 'close SE' comes out of the BOG epoch and is still poorly 
>> defined, I think it's best to ask: "what problem is it that having a 
>> 'close SE' is supposed to solve?"
>>
>> I can think of only two:
>>
>> - posix file access to that SE
>> - fast network to that SE
>>
>> I suspect the first point is no longer a valid one, since 'posix file' 
>> used to mean 'NFS mount' which is why the SE had to be 'close'.  Now 
>> that we have e.g. gsirfio, I could have a "close SE" in Chicago.
>>
>> So I would be in favor of reporting the protocols (we do this already) 
>> and also specifying a site-default SE, with the *option* of specifying 
>> a different one per VO.
>>
>> I still hope we will have user-space remote mounting someday, so I 
>> could e.g. mount the grid file system with root 
>> /grid/atlas/rome05/bbar/ivov as /gmount/ivovdata in user space ... 
>> solves a lot of problems.
>>
>>         J "this is only one cup of coffee" T
>>
>>
>> Maarten Litmaath, CERN wrote:
>>
>>> On Thu, 7 Jul 2005, Kyriakos G. Ginis wrote:
>>>
>>>
>>>
>>>> On Thu, Jul 07, 2005 at 12:10:43PM +0200, Maarten Litmaath, CERN wrote:
>>>>
>>>>
>>>>> On Mon, 4 Jul 2005, Rod Walker wrote:
>>>>>
>>>>>
>>>>>
>>>>>> The situation is that SFU has significant disk and tape storage, 
>>>>>> running dcache, and very good network to TRIUMF and WestGrid cpu. 
>>>>>> Previously it was published via the TRIUMF-GC-LCG2 site giis, but 
>>>>>> this was for convenience, and in order to isolate it from 
>>>>>> maintenance and problems at TRIUMF it would make sense to have a 
>>>>>> separate site (giis).
>>>>>
>>>>>
>>>>> OK.  Any site that is "close" to SFU can publish the SE as a close SE.
>>>>
>>>>
>>>> Hello,
>>>>
>>>> Regarding this 'close SE' issue: If a site 'A' publishes a SE 'B' as a
>>>> close SE, are the WNs of site 'A' expected to have rfio access to the
>>>> SE 'B'?
>>>
>>>
>>>
>>> You raise an interesting point.  First of all, not all SEs support RFIO:
>>> a dCache SE has "gsidcap" instead.  A user application may be able to
>>> deal with both protocols, though, e.g. by using GFAL.  If the SE
>>> advertizes any such POSIX-like access protocol, one would indeed expect
>>> to be able to use the protocol from any CE (WN) that is "close" to the
>>> SE, but I do not know if such is required according to some official
>>> document at this time.
>>>
>>> In the case of SFU there need not be a problem, as it simply could
>>> abstain from publishing "gsidcap", but when an SE is accessible from the
>>> local CE through a POSIX-like protocol, a remote site should no longer
>>> declare it as a close SE, even when it is physically close and
>>> preferred...
>>> The remote site could still declare it to be the default SE, though.
>>> Comments?
>>
>>
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 08:09:31 +0300
>> From:    Filippidis christos <[log in to unmask]>
>> Subject: problem passing the daily  tests
>>
>> hi,
>> I have a problem passing the daily tests, as you can see here:
>>
>> http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/sitereports.cgi?site=xg009.inp.demokritos.gr 
>>
>>
>> my site failed at (3rd Party Rep. central SE to defaultSE)  and at
>> (lcg-rep central SE to defaultSE)
>>
>> I have a hardware firewall and I believe that I have the ports that an SE
>> needs open.
>>
>> Do you think I need to check the firewall more carefully, or might it
>> be something else?
>>
>> thanks xristos
>>
>> Christos Filippidis
>> NCSR DEMOKRITOS
>> Institute of Nuclear Physics
>> office block 6(ktirion 6)
>> Gr-15310 Agia Paraskevi
>> GREECE
>> Tel:2106503425
>>
>> http://consult.cern.ch/xwho/people/117002
>> http://www.inp.demokritos.gr/~filippidisx/
>>
>>
>>
>>
>> ----------------------------------------------
>>
>> "Institute of Nuclear Physics NCSR Demokritos"
>>  http://www.inp.demokritos.gr/
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 10:08:51 +0100
>> From:    Kostas Georgiou <[log in to unmask]>
>> Subject: Re: Site with only SE
>>
>> On Fri, Jul 08, 2005 at 09:48:06AM +0200, Jeff Templon wrote:
>>
>>
>>> I still hope we will have user-space remote mounting someday, so I 
>>> could e.g. mount the grid file system with root 
>>> /grid/atlas/rome05/bbar/ivov as /gmount/ivovdata in user space ... 
>>> solves a lot of problems.
>>
>>
>>
>> Well it seems that FUSE (http://fuse.sourceforge.net/) will be merged 
>> in the kernel at some point (http://lkml.org/lkml/2005/6/30/51). 
>> Implementing "gsiftpfs" in user space should be easy with it ;P
>> Kostas
>>  
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 11:48:32 +0200
>> From:    "Maarten Litmaath, CERN" <[log in to unmask]>
>> Subject: Re: problem passing the daily  tests
>>
>> On Fri, 8 Jul 2005, Filippidis christos wrote:
>>
>>
>>> hi,
>>> i have problem passing the daily test , as you can see here:
>>>
>>> http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/sitereports.cgi?site=xg009.inp.demokritos.gr 
>>>
>>>
>>> my site failed at (3rd Party Rep. central SE to defaultSE)  and at
>>> (lcg-rep central SE to defaultSE)
>>>
>>> i have a hardware firewall and i  believe that i have the ports that 
>>> a SE
>>> needs open.
>>
>>
>>
>> Are your WNs on a private network?
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 10:01:22 +0300
>> From:    Filippidis christos <[log in to unmask]>
>> Subject: Re: problem passing the daily  tests
>>
>> yes, everything is on a private network (the WNs, the SE and the CE)
>>
>>
>>> On Fri, 8 Jul 2005, Filippidis christos wrote:
>>>
>>>
>>>> hi,
>>>> i have problem passing the daily test , as you can see here:
>>>>
>>>> http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/sitereports.cgi?site=xg009.inp.demokritos.gr 
>>>>
>>>>
>>>> my site failed at (3rd Party Rep. central SE to defaultSE)  and at
>>>> (lcg-rep central SE to defaultSE)
>>>>
>>>> i have a hardware firewall and i  believe that i have the ports that a
>>>> SE
>>>> needs open.
>>>
>>>
>>> Are your WNs on a private network?
>>>
>>>
>>
>>
>>
>>
>> Christos Filippidis
>> NCSR DEMOKRITOS
>> Institute of Nuclear Physics
>> office block 6(ktirion 6)
>> Gr-15310 Agia Paraskevi
>> GREECE
>> Tel:2106503425
>>
>> http://consult.cern.ch/xwho/people/117002
>> http://www.inp.demokritos.gr/~filippidisx/
>>
>>
>>
>>
>> ----------------------------------------------
>>
>> "Institute of Nuclear Physics NCSR Demokritos"
>>  http://www.inp.demokritos.gr/
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 12:40:26 +0200
>> From:    "Maarten Litmaath, CERN" <[log in to unmask]>
>> Subject: Re: problem passing the daily  tests
>>
>> On Fri, 8 Jul 2005, Filippidis christos wrote:
>>
>>
>>> yes, everything is on a private network (the WNs, the SE and the CE)
>>
>>
>>
>> I suppose your WNs access your SE using its private address?
>> That will cause 3rd party transfers to fail, because the private
>> address gets communicated to a remote site, where it cannot be used.
>> In LCG-2_6_0 (due end of next week) there will be a work-around for
>> this problem.  If you need it fixed now, I can tell you which rpms
>> to upgrade and how to get it to work.  Alternatively, you can let
>> the WNs access the SE always through its public address.
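
The failure described above comes down to handing a remote site an address
it cannot route to. A rough way to check for that, sketched in Python with
the standard ipaddress module; the function name and the sample addresses
are hypothetical and this is not how the LCG tools themselves do it.

```python
import ipaddress


def usable_by_remote_site(address):
    """Return True if the IP address is not in a private or reserved range.

    A private address (e.g. 192.168.x.x or 10.x.x.x) only has meaning
    inside the site, so a remote SE cannot connect to it for a
    third-party transfer.
    """
    return not ipaddress.ip_address(address).is_private


# Sample addresses, chosen only for illustration.
for addr in ("192.168.1.10", "10.0.0.5", "8.8.8.8"):
    print(addr, usable_by_remote_site(addr))
```

If the address under which the WNs reach the SE fails this check, it is the
kind of address that cannot be used at the remote end of a third-party copy.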
>>
>>
>>>> On Fri, 8 Jul 2005, Filippidis christos wrote:
>>>>
>>>>
>>>>> hi,
>>>>> i have problem passing the daily test , as you can see here:
>>>>>
>>>>> http://lcg-testzone-reports.web.cern.ch/lcg-testzone-reports/cgi-bin/sitereports.cgi?site=xg009.inp.demokritos.gr 
>>>>>
>>>>>
>>>>> my site failed at (3rd Party Rep. central SE to defaultSE)  and at
>>>>> (lcg-rep central SE to defaultSE)
>>>>>
>>>>> i have a hardware firewall and i  believe that i have the ports that a
>>>>> SE
>>>>> needs open.
>>>>
>>>>
>>>> Are your WNs on a private network?
>>
>>
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 11:45:00 +0100
>> From:    "Burke, S (Stephen)" <[log in to unmask]>
>> Subject: Re: Site with only SE
>>
>> LHC Computer Grid - Rollout
>> [mailto:[log in to unmask]] On Behalf Of Jeff Templon said:
>>
>>> Seems to me the concept of 'close' SE has been problematic since the
>>> BOG (Beginning Of Grid).  If we ever agree on a definition of 'close'
>>> -- and I doubt it, just try discussing what a 'dataset' is with
>>> somebody -- then we could use it.
>>
>>
>>
>> It has traditionally meant at least three different things: 1) a default
>> SE to use for writing files if no destination is explicitly specified;
>> 2) an SE you can use for reading files from a WN from which the access
>> can be expected to be "fast" in some undefined sense; 3) an SE to which
>> you can get "local" access from a WN for protocols like NFS and rfio
>> which only work within a site.
>>
>>   The first of those has been superseded by a VO-dependent environment
>> variable in LCG for some time, and that should now be explicitly
>> published in the new Glue schema. The third case was never very explicit
>> and didn't work very well; NFS has been out of use for some time and
>> rfio is not much used so it hasn't been that much of a problem. However,
>> if we intend to keep using site-local protocols, which we probably do,
>> we should come up with a better way to do it, and leave the SE binding
>> to the second case. Even there the semantics aren't very well defined,
>> e.g. if you specify multiple input files the broker only requires one of
>> them to be on a close SE (at least that used to be the case, I haven't
>> checked lately).
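
The loose matching semantics mentioned above (only one input file needs to
be on a close SE) can be sketched as follows. This is an illustration in
Python under assumed data structures, not the broker's actual code; the
function name, file names and SE names are hypothetical.

```python
def ce_matches(input_files, replicas, close_ses):
    """Return True if at least one input file has a replica on a close SE.

    input_files: iterable of logical file names
    replicas:    dict mapping file name -> set of SE names holding a replica
    close_ses:   set of SE names the CE declares as close
    """
    return any(replicas.get(name, set()) & close_ses for name in input_files)


replicas = {
    "lfn:a": {"se1.example.org"},
    "lfn:b": {"se2.example.org"},
}
# The CE matches although lfn:b is only held on an SE that is not close.
print(ce_matches(["lfn:a", "lfn:b"], replicas, {"se1.example.org"}))
```

Replacing any() with all() would give the stricter reading, where every
input file must be on a close SE.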
>>
>>   There is also the technical point that for historical reasons the
>> replica manager code used the access point in the CESEbind to construct
>> the SE pathname for classic SEs, with the result that a classic SE had
>> to be close to some CE. That is now fixed in the new glue schema, but I
>> don't know if the replica management tools have been updated yet.
>>
>> Stephen
>>
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 12:49:07 +0200
>> From:    EGEE BROADCAST <[log in to unmask]>
>> Subject: R-GMA registry unavailable at 13:10 BST (GMT+1) today.
>>
>> ------------------------------------------------------------------------------------ 
>>
>> Publication from : steve traylen <[log in to unmask]> (RAL-LCG2)
>> This mail has been sent using the broadcasting tool available at 
>> http://cic.in2p3.fr
>> ------------------------------------------------------------------------------------ 
>>
>>
>> In order to address a memory leak in the
>> JDBC code used by the R-GMA registry there
>> will be a short interruption to the registry
>> service today at 13:10 BST (GMT+1).
>>
>> This is expected to take less than 30 minutes.
>> The situation will be monitored closely afterwards, in particular
>> following the SFTs, which may well be rerun.
>>
>> People may have noticed that the browser at
>>
>> http://lcgic01.gridpp.rl.ac.uk:8080/R-GMA/
>> is no longer visible. This is by design and all R-GMA MON boxes are by
>> default configured with a web browser interface.
>>
>>     Steve
>> ------------------------------
>>
>> Date:    Fri, 8 Jul 2005 12:27:42 +0100
>> From:    Alessandra Forti // EOJ <[log in to unmask]>
>> Subject: LCG-2_6_0 plans?
>>
>> Hi,
>>
>> can anyone from the deployment team at CERN update us on the situation 
>> of LCG-2_6_0?
>>
>> I need to schedule manpower and agree with some of the experiments 
>> when to do the upgrade. The release was due this week but there is no 
>> sign of it and I haven't seen any email that explains why it has been 
>> delayed and when it is foreseen for and what we should expect from it. 
>> It would be very helpful to know. An EGEE broadcast would be appreciated.
>>
>> thanks
>>
>> cheers
>> alessandra
>>