Hello Marten,
[log in to unmask] wrote on 31.01.2009 16:55:
> What does "glite-wms-job-logging-info -v 2" report?
> Maybe an error that has a Wiki entry here:
>
> http://goc.grid.sinica.edu.tw/gocwiki/SiteProblemsFollowUpFaq
No. No error message at all.
The last entry is
Event: Accepted
- arrived = Wed Feb 4 09:36:01 2009 CET
- from = JobController
- from_host = localhost
- from_instance = unavailable
- host = grid-wms1.desy.de
- local_jobid = 517905
- source = LogMonitor
- src_instance = unique
- timestamp = Wed Feb 4 09:36:01 2009 CET
- user = /O=GermanGrid/OU=TUD/CN=Ralph Mueller-Pfefferkorn/CN=proxy/CN=proxy
> The WMS (i.e. Condor-G) uses a feature called "two-phase commit" that is
> not used by globus-job-run. It is more sensitive to firewall settings.
> An example of the traffic back and forth between WMS and CE is given here:
> http://goc.grid.sinica.edu.tw/gocwiki/Dialog_between_RB_and_CE
> The WMS has the same behavior as the RB, because both use Condor-G.
We checked the firewall and we don't see any drops during the submission.
> I tried to have a look at your CE, but it seems to sit on a local network
> or no longer exists:
> $ uberftp service1.ice.zih.tu-dresden.de pwd
> globus_xio: Unable to connect to service1.ice.zih.tu-dresden.de:2811
> globus_xio: globus_libc_getaddrinfo failed.
> globus_common: Name or service not known
> Failed to connect to service1.ice.zih.tu-dresden.de port 2811.
service1.ice.zih.tu-dresden.de is the CE's name on the internal network, over
which it contacts the torque server.
To the outside world it is desdemona.zih.tu-dresden.de.
$ uberftp desdemona.zih.tu-dresden.de pwd
220 service1.ice.zih.tu-dresden.de GridFTP Server 2.3 (gcc32dbg, 1144436882-63) ready.
230 User zihp0040 logged in.
257 "/home/zihp0040" is current directory.
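For anyone who wants to repeat the name/port check without uberftp, here is a minimal sketch in Python. It only tells apart "name does not resolve" (as in your attempt with the internal name) from "port not reachable" and "port open"; it does not speak GridFTP or do any authentication. The function name check_endpoint is made up for this example.

```python
import socket


def check_endpoint(host, port, timeout=5.0):
    """Return 'unresolvable', 'unreachable', or 'open' for host:port.

    Hypothetical helper for illustration; it only tests DNS resolution
    and a plain TCP connect, nothing GridFTP-specific.
    """
    try:
        # Resolve the name first, so a DNS failure is reported separately.
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "unresolvable"

    # Try every resolved address until one accepts the connection.
    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "open"
        except OSError:
            continue
    return "unreachable"


# Example use (2811 is the GridFTP port from the uberftp output above):
#   check_endpoint("service1.ice.zih.tu-dresden.de", 2811)
#   check_endpoint("desdemona.zih.tu-dresden.de", 2811)
```

Run from outside our network, the internal name should fall into the first case and the external one into the last.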
We are still investigating the Maui issue. Do you know whether the CE
really uses Maui to get usage information?
As I said in the original mail, there are two different Maui versions:
the gLite version on the CE and the one that runs on the torque/maui
server node (which runs SLES10). As a test we copied Maui from
the SLES node to the CE, and with these binaries Maui works (e.g. a
showres). It seems that the compiled-in authorization keys of Maui
really are an issue.
What we would like to try is to recompile the gLite-maui version with
the right key. Do you know where to get the source code (source rpm) for
it (the version installed is 3.2.6p20-snap.1182974819.8)?
We don't know if this is the problem, but ... ;)
Greetings.
Ralph