Stalled, I am afraid. The developer seems to have stopped working on it and
it hasn't been integrated into the pilot. Could be done without much
difficulty, I think. Anyone got a summer student?
Graeme
On Wed, Jul 22, 2009 at 14:51, Duncan Rand wrote:
> Graeme
>
> What's the status of pCache? Presumably it would help by managing the
> worker node scratch space more efficiently.
>
> Duncan
>
> Duncan Rand wrote:
>>
>> Alessandra Forti wrote:
>>>
>>> Hi Graeme,
>>>
>>>> ANALY_MANC1: Bad. Build job problems, which seem to be on stage-out
>>>> after the code compiles, e.g.,
>>>> http://panda.cern.ch:25980/server/pandamon/query?job=1016502474.
>>>>
>>>>
>>>
>>> All the data server links are saturated.
>>>
>>>> ANALY_MANC2: Fair. Running out of local stage space? "Error details:
>>>> pilot: Too little space left on local disk to run job: 2050048 kB
>>>> (need > 2097152 kB)" - maybe need to clean up disks or run fewer jobs
>>>> on these nodes? Otherwise ok.
>>>>
>>>>
>>>
>>> Indeed, this is a recurring problem that we will solve when we upgrade to
>>> SL5. We have only two CPUs per node, but also only 20 GB of scratch. It used
>>> to be enough, but these analyses seem to copy far more than 10 GB per job.
>>
>> We now get Torque to create a per-job temporary directory, which gets
>> cleaned up at the end of the job:
>>
>> http://www.clusterresources.com/torquedocs21/users/2.2files.shtml#tmpdir
>>
>> Duncan
>>
>>> cheers
>>> alessandra
>>>
>
--
Dr Graeme Stewart http://www.physics.gla.ac.uk/~graeme/
Department of Physics and Astronomy, University of Glasgow, Scotland
DEATH TO MEETINGS!