If you are worried about hosing the scratch space on the 'laptop'
nodes, what about finding a striping virtual file system and hooking
many 'reliable' worker node disks into one 'virtual' scratch space
holding a per-job directory (like the TMPDIR mechanism in PBS/Torque)?
If I am not mistaken, we have all the jobmanagers here cd into TMPDIR
unless they are MPI jobs. So very little activity takes place in the
pool account directories.
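
In case a concrete picture helps, the per-job directory idea boils
down to something like the wrapper below. A minimal Python sketch,
not what our jobmanagers actually run; /scratch and the PBS_JOBID
fallback are just assumptions:

    #!/usr/bin/env python
    # Minimal sketch: give each job its own scratch directory and
    # clean it up afterwards, like TMPDIR handling in PBS/Torque.
    # Usage: wrapper.py <job command> [args...]
    import os, shutil, subprocess, sys

    SCRATCH_ROOT = "/scratch"  # assumed mount point of the striped FS
    job_id = os.environ.get("PBS_JOBID", str(os.getpid()))
    tmpdir = os.path.join(SCRATCH_ROOT, "job-%s" % job_id)

    os.makedirs(tmpdir)
    os.environ["TMPDIR"] = tmpdir
    os.chdir(tmpdir)                         # jobmanager cd's in here
    try:
        rc = subprocess.call(sys.argv[1:])   # run the job payload
    finally:
        os.chdir("/")                        # leave the dir we remove
        shutil.rmtree(tmpdir, ignore_errors=True)
    sys.exit(rc)
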
You'd need a FS with some RAID-type redundancy, otherwise the FS will
hang if a worker node goes down. We looked at PVFS a while back, but
for some reason at the time its RAID capabilities were either unusable
or not yet implemented, so we stopped looking.
HTH
J "striped disk or pinwheel, you pick" T
On Feb 23, 2005, at 16:08, Sotomayor, Maniel wrote:
> Yes, absolutely... That is what I'm talking about.
>
> Anybody,
> ./MS
>
> -----Original Message-----
> From: LHC Computer Grid - Rollout on behalf of Jeff Templon
> Sent: Wed 2/23/2005 9:49 AM
> To: [log in to unmask]
> Subject: Re: [LCG-ROLLOUT] SAN Fabric Shared Disk for Pool
> Accounts Scratch Space
>
> Hi
>
> Doesn't this mean that a job might land on a worker node that is
> fine, yet still fail because the worker node carrying its pool
> account happens to have gone down?
>
> I just counted: since Jan 1st we've had 11 separate events where a
> worker node was removed for some reason -- one per five days on
> average. Seems like a maintenance nightmare. Note this probably
> scales with the number of worker nodes; we have 133 at the moment.
> For smaller sites it is probably not so much of a problem. For
> larger sites it would be a killer.
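>
> To put numbers on the scaling (back-of-envelope, assuming removals
> are spread evenly over the nodes):
>
>     # 11 node removals between Jan 1 and Feb 23, ~53 days, 133 nodes
>     events, days, nodes = 11, 53.0, 133
>     per_node_per_day = events / days / nodes    # ~0.0016
>     # a hypothetical 500-node site at the same per-node rate:
>     print(500 * per_node_per_day * days)        # ~41 removals in 53 days
>
> and every one of those removals strands the pool accounts homed on
> that node.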
>
> JT
>
> On Feb 23, 2005, at 14:41, Sotomayor, Maniel wrote:
>
>> Hello all,
>> Does there has to be a 1-to-1 correspondence between pool account
>> usernames from the CE to all WNs ? ie. Can I have let's say for
>> example,
>> On my CE: dteam001...dteam050, lhcb001...lhcb050, ...
>>
>> And then on the first WN: dteam051...dteam100, lhcb051...lhcb100, ...
>>
>> Then on the second WN: dteam101...dteam150, lhcb101...lhcb150, ...
>>
>> and so on, until all worker nodes have different username ranges for
>> their pool accounts?
>>
>> Is it possible to do this from the Torque point of view? Or does the
>> pbs_server need to have the same usernames for the server & mom
>> services (i.e. CE usernames & WN usernames)?
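>>
>> To make it concrete, here is roughly what I mean -- a quick Python
>> sketch that just prints the account blocks per host (the hostnames
>> and the block size of 50 are made up):
>>
>>     # give each host its own disjoint block of 50 pool accounts
>>     hosts = ["ce", "wn001", "wn002"]   # hypothetical hostnames
>>     for i, host in enumerate(hosts):
>>         start = 50 * i + 1
>>         for group in ("dteam", "lhcb"):
>>             for n in range(start, start + 50):
>>                 print("%s: useradd %s%03d" % (host, group, n))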
>>
>> I'm trying to set up a shared virtual disk on a SAN fabric to use as
>> scratch space for the WNs. I was wondering whether I can do this and
>> whether it is recommended...
>>
>> Planned design:
>> All WNs mount each pool account home directory over NFS from a
>> mount point on the CE (example mount lines below).
>> The CE is then connected to the SAN fabric via Fibre Channel,
>> hosting the shared virtual disk.
>> The shared virtual disk has a single volume containing the home
>> directories for all WN pool account usernames.
>> This way, storage space won't be wasted as directory contents
>> constantly change.
>> Whenever a WN executes a job, it uses the home directory of a
>> username that is unique across all WNs.
>> This saves us from creating a volume for each WN.
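>>
>> For illustration, the mount config would look something like this
>> (hostnames and paths are just examples):
>>
>>     # on the CE, /etc/exports:
>>     /pool    wn*.example.org(rw,sync,no_root_squash)
>>
>>     # on each WN, /etc/fstab:
>>     ce.example.org:/pool   /pool   nfs   rw,hard,intr   0 0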
>>
>> The thinking behind this is that (maybe) not every job submitted to
>> our site will need its full 10 GB of scratch space at the same
>> moment (the worst-case scenario). We can therefore think of creating
>> a smaller virtual disk that serves all WNs at once.
>>
>> The main reason for doing this is that some of our WNs have laptop
>> hard disks. These disks are not very reliable and not well suited to
>> very write-intensive work. With this method we can (maybe) extend
>> the disks' lives and therefore increase WN availability. Each WN
>> would then use its local hard disk only for the OS configuration and
>> its services.
>>
>> Any suggestions/criticism will be much appreciated.
>>
>> Best Regards,
>> ./MS