Hi Duncan,
We have been using tempdir for well over two years now, but we still have
dCache on the WNs. When we upgrade we will repartition and RAID the disks,
and most of the space will go to scratch.
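
For reference, the pilot check quoted below (2050048 kB free against a
requirement of more than 2097152 kB) can be reproduced on a node with a
small sketch like the following. This is only an illustration: the function
names are invented here, the threshold is copied from the quoted error
message, and it assumes the job scratch area is whatever TMPDIR points to
(falling back to /tmp), which is not necessarily how the pilot does it.

```python
import os
import shutil

# Threshold copied from the quoted pilot error: "need > 2097152 kB".
REQUIRED_KB = 2097152


def enough_space(free_kb, required_kb=REQUIRED_KB):
    """Return True only if free space is strictly above the requirement."""
    return free_kb > required_kb


def scratch_free_kb(path=None):
    """Free space in kB on the filesystem holding the scratch directory.

    Assumption: the scratch directory is $TMPDIR, falling back to /tmp.
    """
    if path is None:
        path = os.environ.get("TMPDIR", "/tmp")
    return shutil.disk_usage(path).free // 1024


if __name__ == "__main__":
    free_kb = scratch_free_kb()
    status = "ok" if enough_space(free_kb) else "too little space"
    print(f"{free_kb} kB free, need > {REQUIRED_KB} kB: {status}")
```

With the figure from the error message, enough_space(2050048) is False,
since 2050048 kB is below the 2097152 kB threshold.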
thanks
cheers
alessandra
Duncan Rand wrote:
> Alessandra Forti wrote:
>> Hi Graeme,
>>
>>> ANALY_MANC1: Bad. Build job problems, which seem to be on stage-out
>>> after the code compiles, e.g.,
>>> http://panda.cern.ch:25980/server/pandamon/query?job=1016502474.
>>>
>>>
>> All the data server links are saturated.
>>
>>> ANALY_MANC2: Fair. Running out of local stage space? "Error details:
>>> pilot: Too little space left on local disk to run job: 2050048 kB
>>> (need > 2097152 kB)" - maybe need to clean up disks or run fewer jobs
>>> on these nodes? Otherwise ok.
>>>
>>>
>> Indeed, this is a recurring problem that we will solve when we upgrade
>> to SL5. We have only two CPUs per node, but also only 20GB of scratch. It
>> used to be enough, but these analyses seem to copy far more than 10GB
>> per job.
>
> We now get torque to create a per-job temporary directory which gets
> cleared up at the end of the job:
>
> http://www.clusterresources.com/torquedocs21/users/2.2files.shtml#tmpdir
>
> Duncan
>
>> cheers
>> alessandra
>>
--
No man ever steps in the same river twice, for it's not the same river and he's not the same man. (Heraclitus)
Northgrid Tier2 Technical Coordinator
http://www.hep.manchester.ac.uk/computing/tier2