Hi everyone,
We are currently taking part in the file transfers from RAL. However, we
have been having trouble with our pool node: all of its CPU (8*1.9 GHz)
and memory (32 GB physical RAM) is quickly used up, grinding the machine
to a halt. This has prevented us from accepting files.
When Steve Thorn (NeSC) analysed the machine, it appeared that dCache was
spawning a large number of java processes:
1195 ? S 0:00 /bin/sh /opt/d-cache/jobs/pool -pool=dcache -logfile=
1197 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server -Xmx256m
1200 ? S 9:55 \_ /usr/java/j2sdk1.4.2_08/bin/java -server -Xmx
1201 ? S 0:57 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
1202 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
1203 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
1204 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
...
There were ~200 of these, each using 57 MB of RAM. At one point, the total
RAM in use was 31 GB. For now, the dCache services have been stopped on
the pool node, and after a reboot the machine appears to have returned to
normal. Has anyone seen or heard of this before?
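In case it helps anyone check their own pool node, here is a rough
sketch of how the process count and memory total above could be gathered
automatically. It is not what was run on our machine. It assumes a
procps-style ps that accepts `-eo pid,rss,args` and reports RSS in
kilobytes, and the function names (read_ps, summarise_java) are made up
for illustration.

```python
import subprocess


def read_ps():
    """Return the text of `ps -eo pid,rss,args` (assumes a procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,args"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def summarise_java(ps_text):
    """Count java processes and sum their RSS (in kB) from ps output.

    Expects lines of the form "PID RSS COMMAND ...". The header line and
    anything that does not start with two integers is skipped.
    """
    count = 0
    total_rss_kb = 0
    for line in ps_text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, rss, args = fields
        if not (pid.isdigit() and rss.isdigit()):
            continue  # header or malformed line
        executable = args.split()[0]
        # Match only the java binary itself, not wrapper scripts.
        if executable == "java" or executable.endswith("/java"):
            count += 1
            total_rss_kb += int(rss)
    return count, total_rss_kb
```

Calling summarise_java(read_ps()) returns the number of java entries and
their summed RSS. Treat the sum as an upper bound: if some of the entries
share memory with each other, the same pages are counted more than once.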
Any advice would be useful.
Cheers,
Greig