Hi,

In addition to Gareth's journey (which I entirely agree with) there is another scaling journey.

Single Node -> Workgroup Cluster -> Large Cluster -> Small number of independent clusters -> Small Grid -> WLCG

(Clouds obviously need to sit somewhere; I'm just not sure where.)

So the community probably needs some idea, not too far along the journey, of where it is likely to end up. Some solutions built for WLCG scale are overly complex even for a small grid, while others that work perfectly well when the production manager logs into a few submit nodes and submits jobs to clusters become dead ends when they try to scale up to the next level.

But obviously, as Dave says, we cannot push communities to use the tools we think they are going to need. We can only talk about the lessons we've learned along the journey; they probably need to make their own mistakes.

AndrewS's use cases sound like a great idea. In view of what I'm talking about here, it might be worth including some notion of progression as their computing models mature (Alessandra and Gareth's points) and as they scale up.

Yours,
Chris.


On 06/12/2018, 09:51, "Testbed Support for GridPP member institutes on behalf of Gareth Roy" wrote:

Hi,

I think you may both be talking a little bit at cross purposes... I think Alessandra may be focusing on development and early pathfinding exercises while Ian is looking at mature production runs (this may be putting words in your mouth). 

Going back to the main direction Alastair was taking this, and thinking about it in terms of the email Dave sent (particularly point 5), I think new users working at any sort of large scale will have a journey that looks like:

Development -> Testing -> Staging -> Production

The technologies at each level may be different and require different access or highlight different needs, which we should be cognizant of when onboarding new experiments. Each technology has its strengths and weaknesses. Containers, for instance, work great at all levels in the above cycle but have drawbacks (Singularity is _not_ the same as Docker from a deployment, operation or security standpoint). Running arbitrary user container payloads is fraught with risk without really understanding the security model. CVMFS is excellent at efficiently deploying software to a wide (geographical) range of resources and is fairly easily understood from a user perspective (just a filesystem), but it is slow to respond to rapid changes, can have problems with caching, etc.

Now I know you both know this, but I think where we sometimes run into problems with new user communities is that they don't. We need to present the technology with both its strengths and weaknesses to allow them to make a judgement based on their (not our) needs. I think the phrase I'm looking for is informed consent.

Just my thoughts and maybe input for Friday's meeting which I can't make as I'm at a Data Centre meeting at Glasgow.

Thanks,

Gareth






On 06/12/2018, 09:15, "Testbed Support for GridPP member institutes on behalf of Alessandra Forti" wrote:

    On 06/12/2018 08:54, Ian Collier - UKRI STFC wrote:
    >
    >> On 5 Dec 2018, at 15:21, Alessandra Forti wrote:
    >>
    >> On 05/12/2018 15:11, Ian Collier - UKRI STFC wrote:
    >>>
    >>>> On 5 Dec 2018, at 14:15, Alessandra Forti wrote:
    >>>>
    >>>> PS CVMFS is not good for small scale groups or groups that don't have an established code. We should discuss containers.
    >>> I think that is rather sweeping.
    >> Experience, again with LSST and SKA. The former went for tarballs on the storage, and the latter went for containers. Other smaller groups have similar problems and prefer to upload tarballs. Even I, who am one of the early fans of CVMFS, wouldn't go for it if I was developing.
    > Again, tarballs are very easy to unpack onto cvmfs.
    Sure they are: you have to develop, connect to a machine, unpack the 
    tarball and wait for it to be propagated. It then requires a number of 
    setup scripts for the job to access it. It cannot be streamlined and it 
    requires time. Not a good solution for unstable software.
    >
    >>> I know of more than one solution for making cvmfs easy to use for small teams (here at RAL and also at Nikhef). There are surely more. And once you have your software on cvmfs it has many large advantages.
    >>>
    >>> This is not to say that containers don’t have their place. (Of course they work /really/ well unpacked and distributed via cvmfs.)
    >> I'm afraid this is debatable. Unpacked images are 3-5 times larger and you need the space for that. With CERN we are setting up a common repository for the LHC experiments, but it will need policies on image lifetimes, and the registries again offer a directness that CVMFS doesn't and which is necessary during development.
    > But most of the space is de-duplicated so shared between images, and it will be a rare job that actually accesses every file in a container.
    That's why we are setting up a common repository at CERN, with the CVMFS 
    devs specifically working on getting the images from Docker, but as I 
    said, different workflows might still need the registries.
    
    cheers
    alessandra
    
    
    
    >
    > —Ian
    >
    >
    >> cheers
    >> alessandra
    >>> —Ian
    >>>
    >>>
    >>>
    >>>> On 05/12/2018 14:11, Alessandra Forti wrote:
    >>>>> Hi Alastair,
    >>>>>
    >>>>> On 05/12/2018 13:07, Alastair Dewhurst wrote:
    >>>>>> Hi All
    >>>>>>
    >>>>>> Sorry for the late reminder but I was waiting for Dave B's email (To Grid or not to Grid?) to be sent round.
    >>>>>>
    >>>>>> We will have another technical meeting on Friday 7th December on “Onboarding new communities”.  This will be in two parts:
    >>>>>> 1) I will start with a description of the plans for the Tier-1 to provide direct batch system access for (some) new communities.  The intention is that this would introduce them to some of the aspects of the Grid (e.g. CVMFS) but still keep it easily accessible.  If the requirements of the VO grow then we would move them towards using the GridPP DIRAC instance.  The Tier-1 is obviously not the only site that can onboard new communities, and often the site where the VO contact is based is the best place to introduce people to the Grid.  We should therefore try to align the services we offer.  I would appreciate feedback on my plans for the Tier-1; I want people to feel that this will benefit all GridPP sites.
    >>>>>>
    >>>>> This is how we did it with LSST: first direct access, then DIRAC, then the file catalogue. I cannot say they appreciated it, considering they are showing around a "post-mortem", which is one of the reasons you are now sending this email.
    >>>>>
    >>>>> cheers
    >>>>> alessandra
    >>>>>
    >>>>>> 2) We should have a discussion about how best to describe the problems people will face if they need to use distributed resources.  We should also discuss how best to describe the various software and services we have used to solve them.
    >>>>>>
    >>>>>> If anyone would like to send me suggested input in advance, I will try and incorporate this into the agenda.
    >>>>>>
    >>>>>> Agenda can be found here:
    >>>>>> https://indico.cern.ch/event/779045/
    >>>>>>
    >>>>>> Alastair
    >>>>>>
    >>>>>> To unsubscribe from the TB-SUPPORT list, click the following link:
    >>>>>> https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=TB-SUPPORT&A=1
    >>>>>>
    >>>>> -- 
    >>>>> Respect is a rational process. \\//
    >>>>> For Ur-Fascism, disagreement is treason. (U. Eco)
    >>>>>