Hi Andrew,
I have not had a response from gridpp-storage. I posted a similar message
to support-lcg-dcache a few hours ago, but have had no response yet.
What do you mean by // transfers?
Greig
On Wed, 20 Jul 2005, Sansum, RA (Andrew) wrote:
> Did you get a response back on this? My guess is that this is the same
> problem RAL are seeing where gridftp doors lock up. SARA also saw this
> during SC3 and have addressed it (for now) by reducing the number of //
> transfers.
>
> regards
> Andrew
>
> > -----Original Message-----
> > From: GRIDPP2: Deployment and support of SRM and local
> > storage management [mailto:[log in to unmask]] On
> > Behalf Of Kostas Georgiou
> > Sent: 19 July 2005 17:26
> > To: [log in to unmask]
> > Subject: Re: dCache spawning java processes?
> >
> >
> > On Tue, Jul 19, 2005 at 04:17:22PM +0100, Greig A Cowan wrote:
> >
> > > Hi everyone,
> > >
> > > We are currently involved in the file transfers from RAL. However, we
> > > have been having trouble with our pool node in that all the CPU (8*1.9 GHz)
> > > and memory (physical RAM is 32 GB) resources have been quickly used up,
> > > grinding the machine to a halt. This has prevented us from accepting
> > > files.
> > >
> > > When Steve Thorn (NeSC) analysed the machine, it appeared that dCache
> > > was spawning java processes:
> > >
> > > 1195 ? S 0:00 /bin/sh /opt/d-cache/jobs/pool -pool=dcache -logfile=
> > > 1197 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server -Xmx256m
> > > 1200 ? S 9:55 \_ /usr/java/j2sdk1.4.2_08/bin/java -server -Xmx
> > > 1201 ? S 0:57 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
> > > 1202 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
> > > 1203 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
> > > 1204 ? S 0:00 \_ /usr/java/j2sdk1.4.2_08/bin/java -server
> > > ...
> > >
> > > There were ~200 each using 57 MB RAM. At one point, the total RAM used
> > > was 31 GB. At the moment, dcache services have been stopped on the
> > > pool node and after a reboot the machine appears to have returned to
> > > normal. Has anyone seen/heard of this before?
> >
> > I can see around 400 threads from the two java processes in
> > one of our pool nodes. Total memory in use is ~380MB for both
> > of them (~100 for the pool, ~180 for gridftp). Are you sure
> > that the problem was caused because of low memory? Threads
> > share all the data so it's more likely to me that the process
> > was only using 57MB total ;P
> >
> > In our pool node, there are also 165 connections from
> > csfnfs*.rl.ac.uk and the disk spends most of its time seeking
> > instead of doing something useful (writing), which causes a
> > huge load.
> >
> > Cheers,
> > Kostas
> >
>
--
=======================================================================
Dr Greig A Cowan http://www.ph.ed.ac.uk/~gcowan1
School of Physics, University of Edinburgh, James Clerk Maxwell Building
DCACHE PAGES: http://www.gridpp.ac.uk/deployment/admin/dcache/index.html
=======================================================================
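
Kostas's point above, that threads share their data so ~200 java entries at 57 MB each need not mean 200 x 57 MB of RAM, can be illustrated with a small sketch. It assumes a Linux kernel whose /proc/<pid>/status file exposes the Threads and VmRSS fields. The sample text and the helper names parse_status and naive_vs_actual are made up for illustration, with numbers chosen to echo the thread rather than taken from a real capture.

```python
def parse_status(text):
    """Extract thread count and resident set size (kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    threads = int(fields["Threads"])
    rss_kb = int(fields["VmRSS"].split()[0])  # value looks like "58368 kB"
    return threads, rss_kb


def naive_vs_actual(threads, rss_kb):
    """Compare adding RSS once per thread with counting the shared pages once."""
    naive_kb = threads * rss_kb  # what you get by summing every ps row
    actual_kb = rss_kb           # all threads live in one address space
    return naive_kb, actual_kb


# Hypothetical sample, not real output: 200 threads, 57 MB resident.
SAMPLE_STATUS = """Name:\tjava
Threads:\t200
VmRSS:\t   58368 kB
"""

threads, rss_kb = parse_status(SAMPLE_STATUS)
naive_kb, actual_kb = naive_vs_actual(threads, rss_kb)
print(f"{threads} threads: naive sum {naive_kb / 1024:.0f} MB, "
      f"shared total {actual_kb / 1024:.0f} MB")
```

On a live pool node the same parser could be pointed at the contents of /proc/<pid>/status for the java process, to check whether the per-entry figure in the ps listing is one shared total or genuinely separate allocations.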