On May 4, 2007, at 5:32 PM, Daniel Lorenz wrote:
> Hello,
>
> On Friday 04 May 2007 16:34, Steve Traylen wrote:
>> On May 4, 2007, at 4:24 PM, Daniel Lorenz wrote:
>>> Hello,
>>>
>>> Since 0:00 today, SFTs have been failing on at least one WN.
>>>
>>> After the jobs start running on the WN, they immediately
>>> terminate. The /pbs/undelivered/<job>..ER file says:
>>
Daniel,
Is there anything in the mom logs on the WN? In particular, there should
be a reason why the file could not be copied back.
Be sure to have
$logevent 255
in the mom configuration to increase the verbosity, but the default
should be enough to pick up an error.
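
A minimal sketch for checking whether the verbosity setting is already present on the WN. The config path below is an assumption (a common default for a PBS/Torque install) and the function name `show_logevent` is hypothetical; adjust both for your site.

```shell
# Assumed default location of the pbs_mom config; override MOM_CONFIG if yours differs.
MOM_CONFIG="${MOM_CONFIG:-/var/spool/pbs/mom_priv/config}"

# Print the $logevent line(s) from a mom config file.
# Returns 1 and prints a note if no such line is set.
show_logevent() {
    cfg="${1:-$MOM_CONFIG}"
    if grep -E '^[[:space:]]*\$logevent[[:space:]]+[0-9]+' "$cfg" 2>/dev/null; then
        return 0
    fi
    echo "no \$logevent line in $cfg (mom uses its default)"
    return 1
}
```

Run `show_logevent` as root on the WN. If the line is missing, add it and restart or reload the mom so the change is picked up.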
Also, when you say you submitted jobs successfully with qsub, did the
output actually come back?
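
One way to check this is sketched below, assuming `qsub` is on the PATH of the pool account and that the job's output should be returned to the submit directory. The function names, job name, and output filename are hypothetical.

```shell
# Wait up to $2 seconds (default 120) for file $1 to exist and be non-empty.
wait_for_file() {
    file="$1"; timeout="${2:-120}"; waited=0
    while [ "$waited" -lt "$timeout" ]; do
        [ -s "$file" ] && return 0
        sleep 1
        waited=$((waited + 1))
    done
    [ -s "$file" ]
}

# Submit a trivial job from stdin and report whether its output was copied back.
submit_test_job() {
    out="$PWD/qsub-test.out"
    rm -f "$out"
    jobid=$(echo 'hostname; date' | qsub -N qsubtest -j oe -o "$out") || return 1
    echo "submitted $jobid"
    if wait_for_file "$out" "${1:-120}"; then
        echo "output returned: $out"
    else
        echo "no output after ${1:-120}s; check the undelivered directory on the WN"
        return 1
    fi
}
```

Running `submit_test_job` after su'ing to a pool account separates "the job ran" from "the output was delivered", which is the distinction that matters here.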
Steve
>> So it sounds like you tried everything on:
>>
>> http://goc.grid.sinica.edu.tw/gocwiki/submit-helper_script_..._gave_error%3A_cache_export_dir_...
>>
>> Try su'ing to one of your pool accounts and qsub'ing a simple job.
>> Try in particular as an ops VO.
>> Steve
>
> I tried this with a number of accounts and it worked fine. But if I use
> globus-job-run gcn54/jobmanager-lcgpbs, no output (but also no error
> message) is returned, and a "submit-helper script..." entry is generated
> on the WN.
> The fork manager also works.
>
> Regards,
> Daniel
>
>>
>>> submit-helper script running on host gcn51 gave error: cache_export_dir
>>> (/home/dteam006/.lcgjm/globus-cache-export.I23975) on gatekeeper did not
>>> contain a cache_export_dir.tar archive
>>>
>>> logging info says:
>>> Event: Done
>>> - exit_code = 1
>>> - host = rb127.cern.ch
>>> - level = SYSTEM
>>> - priority = asynchronous
>>> - reason = Cannot read JobWrapper output, both from Condor and from Maradona.
>>> - seqcode = UI=000003:NS=0000000003:WM=000012:BH=0000000000:JSS=000009:LM=000019:LRMS=000000:APP=000000
>>> - source = LogMonitor
>>> - src_instance = unique
>>> - status_code = FAILED
>>>
>>> The CRL was up to date.
>>> I can copy files from the WN to the CE with globus-url-copy.
>>> The clocks are synchronized.
>>> Users are mapped to the same id.
>>>
>>> Does anybody have an idea?
>>>
>>> Thanks in advance,
>>> Daniel Lorenz
--
Steve Traylen
CERN, IT-GD-OPS.