Hello,
Can you pinpoint any change to your dcache setup that triggered this
behaviour? That might give an indication of what's broken. Normally
I'd advise a mail to [log in to unmask] but considering the season
you're unlikely to get a response until the New Year. I found some
possibly useful information at:
http://trac.dcache.org/trac.cgi/wiki/manuals/lcg_utils_and_dcache
It doesn't, however, mention the security-related errors that dcache
reported in your logs.
I'm far from an expert, but my first hunch would be to check the
firewalls. Hunch number 2 is to try updating to a later Java version
(we're on 1.5). Hunch number 3 is to try updating to a later dcache
version; the latest seems to be 1.7.0-48. Once again, I'm not an
expert, and these suggestions are heavy-handed and will probably
involve a lot of work given the size of your dcache.
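
For the firewall check, something along these lines might save a bit of
time. It is only a rough sketch: it tries a plain TCP connection to each
port and reports whether it succeeded. The port numbers are what I
believe are the usual dcache defaults, not values taken from your setup,
and the hostname in the comment is a placeholder, so adjust both to
match your configuration.

```python
import socket

# Assumed default ports, not taken from the original poster's setup.
PORTS = {
    "admin door (ssh)": 22223,
    "dcap": 22125,
    "gridftp": 2811,
    "srm": 8443,
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host, ports=PORTS):
    """Map each service name to True/False depending on reachability."""
    return {name: port_open(host, port) for name, port in ports.items()}


# Example (hypothetical hostname):
#   for name, ok in check_ports("dcache-head.example.org").items():
#       print(name, "reachable" if ok else "NOT reachable")
```

Run it from the machine that fails the SAM test as well as from a pool
node. A port that is reachable from one but not the other would point
towards a firewall rule rather than dcache itself. Note that a
successful connection only shows the port is open, not that the service
behind it is healthy.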
Sorry I couldn't be more help,
Matt
On 27/12/2007, Sergey <[log in to unmask]> wrote:
> Hi dcache Experts,
>
> We have experienced SAM test failures with repeating message:
> "lcg_cr: Protocol not supported"
>
> dcache-server 1.7.0-35
> dcache-client 1.70-35
> java 1.4.2_12
>
> We tried restarting the services (core, pnfs, doors, srm), but none of
> the attempts helped.
>
> Also, in adminDoorDomain.log the typical repeating message is:
>
> 12/27 13:58:54 Cell(alm@adminDoorDomain) : Exception in secure
> protocol : dmg.protocols.ssh.SshProtocolException: IO :
> java.net.SocketException: Connection reset
> 12/27 14:08:25 Cell(alm@adminDoorDomain) : Exception in secure
> protocol : dmg.protocols.ssh.SshProtocolException: Ssh Protocol
> violation in reading Version
> 12/27 14:17:57 Cell(alm@adminDoorDomain) : Exception in secure
> protocol : dmg.protocols.ssh.SshProtocolException: IO :
> java.net.SocketException: Connection reset
> ....
>
> and in the dCacheDomain.log
>
> 12/27 18:20:43 Cell(RoutingMgr@dCacheDomain) : update can't send
> update to RoutingMgr{uoid=<1198779643886:78099>;path=[>RoutingMgr@local];msg=Missing
> routing entry for RoutingMgr@local}
> 12/27 18:20:47 Cell(RoutingMgr@dCacheDomain) : update can't send
> update to RoutingMgr{uoid=<1198779647335:78101>;path=[>RoutingMgr@local];msg=Missing
> routing entry for RoutingMgr@local}
> 12/27 18:21:05 Cell(RoutingMgr@dCacheDomain) : update can't send
> update to RoutingMgr{uoid=<1198779665994:78105>;path=[>RoutingMgr@local];msg=Missing
> routing entry for RoutingMgr@local}
> 12/27 18:21:05 Cell(l-100-Unknown-15434@dCacheDomain) : runIO :
> java.io.EOFException
> 12/27 18:21:05 Cell(l-100-Unknown-15434@dCacheDomain) : acceptThread :
> java.io.EOFException
> 12/27 18:21:05 Cell(RoutingMgr@dCacheDomain) : update can't send
> update to RoutingMgr{uoid=<1198779665995:78107>;path=[>RoutingMgr@local];msg=Missing
> routing entry for RoutingMgr@local}
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : acceptThread :
> java.net.SocketException: Connection reset
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) :
> java.net.SocketException: Connection reset
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.net.SocketInputStream.read(SocketInputStream.java:168)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.BufferedInputStream.fill(BufferedInputStream.java:183)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.BufferedInputStream.read(BufferedInputStream.java:201)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2133)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2423)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2433)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1245)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:324)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> dmg.cells.network.LocationMgrTunnel.negotiateDomains(LocationMgrTunnel.java:479)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> dmg.cells.network.LocationMgrTunnel.makeObjectStreams(LocationMgrTunnel.java:461)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> dmg.cells.network.LocationMgrTunnel.acceptThread(LocationMgrTunnel.java:245)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> dmg.cells.network.LocationMgrTunnel.run(LocationMgrTunnel.java:349)
> 12/27 18:21:13 Cell(l-100-Unknown-15435@dCacheDomain) : at
> java.lang.Thread.run(Thread.java:534)
>
> in pnfsDomain.log:
>
> 12/27 15:19:26 Cell(cleaner@pnfsDomain) : sendRemoveToPool : got
> unexpected reply class : dmg.cells.nucleus.NoRouteToCellException
>
> So to me it sounds like we have a security-related problem.
>
> Can anybody give me advice?
>
> Regards
> Sergey
>
> --
> Sergey Dolgobrodov
> Department of Physics & Astronomy
> University of Manchester
> Manchester M13 9PL
> Tel: +44 (0)161 6608472
> Mobile: +44 (0)790 4587534
> Skype: sergeygd
>