Yes, absolutely... That is what I'm talking about.

Anybody,
./MS

-----Original Message-----
From: LHC Computer Grid - Rollout on behalf of Jeff Templon
Sent: Wed 2/23/2005 9:49 AM
To: [log in to unmask]
Subject: Re: [LCG-ROLLOUT] SAN Fabric Shared Disk for Pool Accounts Scratch Space
 
Hi

Doesn't this mean that you might have a job go to a worker node that is
good, but the job will fail because the worker node carrying its pool
account happens to have gone down?

I just counted and since Jan 1st we've had 11 separate events where a
worker node was removed for some reason -- one per five days on
average.  Seems like a maintenance nightmare.  Note this probably
scales like the number of worker nodes; we have 133 at the moment.  For
smaller sites it is probably not so much of a problem.  For larger
sites it would be a killer.
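
As a rough check of the arithmetic above, here is a minimal Python sketch. It uses only the figures quoted in this message (11 removals, 133 worker nodes, Jan 1st to the Feb 23 date of the mail). The linear scaling with node count is the assumption stated above, not a measured result, and the other site sizes are hypothetical.

```python
from datetime import date

# Figures quoted in the message above.
EVENTS = 11               # worker-node removals since Jan 1st
NODES = 133               # worker nodes at the site
START = date(2005, 1, 1)
END = date(2005, 2, 23)   # date of the message


def days_between_events(start, end, events):
    """Average number of days between removal events."""
    return (end - start).days / events


def scaled_events(events, nodes, other_nodes):
    """Expected removals at another site over the same period,
    ASSUMING the count grows linearly with the number of nodes."""
    return events * other_nodes / nodes


if __name__ == "__main__":
    rate = days_between_events(START, END, EVENTS)
    print(f"one removal per {rate:.1f} days")
    for n in (20, 133, 1000):   # hypothetical site sizes
        expected = scaled_events(EVENTS, NODES, n)
        print(f"{n:5d} nodes: about {expected:.1f} removals")
```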

                                        JT

On Feb 23, 2005, at 14:41, Sotomayor, Maniel wrote:

> Hello all,
>   Does there have to be a 1-to-1 correspondence between pool account
> usernames on the CE and on all WNs? I.e., can I have, for
> example,
> on my CE: dteam001...dteam050, lhcb001...lhcb050, ...
>
> And then on the first WN: dteam051...dteam100, lhcb051...lhcb100, ...
>
> Then on the second WN: dteam101...dteam151, lhcb101...lhcb151, ...
>
> and so on, until all worker nodes have different username ranges for
> pool accounts?
>
> Is it possible to do this from Torque's point of view? Or does the
> pbs_server need to have the same usernames for the server and mom
> services (i.e. CE usernames and WN usernames)?
>
> I'm trying to set up a shared virtual disk on a SAN fabric to use as
> scratch space for the WNs. I was wondering whether I could do this and
> whether it is recommended...
>
> Planned design:
> All WNs mount each pool account's home directory over NFS from a
> mount point on the CE.
> The CE will then be connected to the SAN fabric via Fibre Channel,
> hosting the shared virtual disk on the SAN fabric.
> The shared virtual disk will have only one volume containing the home
> directories for all WN pool account usernames.
> This way, storage space won't be wasted, since directory contents
> change constantly.
> Therefore, whenever a WN executes a job it will use a home directory
> belonging to a username that is unique among all WNs.
> This means we do not have to create a volume for each WN.
>
> We thought of this because (maybe) not every job submitted to our
> site will need 10 GB of scratch space at the same moment (the worst-case
> scenario). Therefore, we could create a smaller virtual disk that
> serves all WNs at the same time.
>
> The main reason for doing this is that we have some WNs with
> laptop hard disks. These disks are not well suited to very demanding
> write loads, as they are not very reliable. With this method (maybe)
> we can extend the hard disk lifetimes, thereby increasing WN
> availability. Each WN would then use its local hard disk only
> for storing the OS configuration and handling services.
>
> Any suggestions/criticism will be much appreciated.
>
> Best Regards,
> ./MS
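
To make the username layout asked about in the quoted message concrete, here is a minimal Python sketch that generates disjoint pool account ranges per host. It assumes a uniform block of 50 accounts per VO per host (so the third host gets 101...150, not the 101...151 written in the message), and the host names `ce`, `wn1`, `wn2` are hypothetical. It only illustrates the naming scheme. It says nothing about whether Torque accepts differing usernames between pbs_server and pbs_mom.

```python
def pool_accounts(vo, host_index, block=50):
    """Return the pool account names for one VO on one host.

    host_index 0 is the CE, 1 the first WN, and so on. Each host gets
    its own block of `block` consecutive account numbers.
    """
    first = host_index * block + 1
    return [f"{vo}{n:03d}" for n in range(first, first + block)]


def accounts_by_host(vos, hosts, block=50):
    """Map each host name to the pool accounts it would carry."""
    return {
        host: [name for vo in vos for name in pool_accounts(vo, i, block)]
        for i, host in enumerate(hosts)
    }


if __name__ == "__main__":
    # Hypothetical host names; VO names are the ones from the message.
    layout = accounts_by_host(["dteam", "lhcb"], ["ce", "wn1", "wn2"])
    for host, names in layout.items():
        print(host, names[0], "...", names[49], names[50], "...", names[-1])
```

With this scheme every username, and therefore every home directory on the shared volume, belongs to exactly one host.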