Hi Maarten,
Sorry for the late response.
Maarten Litmaath wrote:
> Hi Luís,
>
>
>> Since I'm having some problems with my site's CE (axon-g01.ieeta.pt):
>>
>> *************************************************************
>> BOOKKEEPING INFORMATION:
>> Status info for the Job : https://wms208.cern.ch:9000/4VVMBlRExMpZUwKUf5cwdg
>> Current Status: Aborted
>> Logged Reason(s):
>> - File not available.Cannot read JobWrapper output, both from Condor and from Maradona.
>> Status Reason: hit job shallow retry count (1)
>> Destination: axon-g01.ieeta.pt:2119/jobmanager-lcgpbs-ops
>> Submitted: Sat Jul 11 22:02:16 2009 CEST
>> *************************************************************
>>
>> https://lcg-sam.cern.ch:8443/sam/sam.py?funct=ShowHistory&vo=ops&nodename=axon-g01.ieeta.pt&sensors=CE
>>
>>
>> I thought they were related to what I believed to be a gridmap inconsistency.
>>
>
> That error has a Wiki entry:
>
> http://goc.grid.sinica.edu.tw/gocwiki/Cannot_read_JobWrapper_output...
>
> An example job seems to start OK, but is found finished about 15 minutes later:
>
> -----------------------------------------------------------------------------
> Event: Running
> - Arrived = Sun Jul 12 01:19:21 2009 CEST
> - Host = wms208.cern.ch
> - Node = gt2 axon-g01.ieeta.pt:2119/jobmanager-lcgpbs
> - Source = LogMonitor
> - Src instance = unique
> - Timestamp = Sun Jul 12 01:19:21 2009 CEST
> - User = /DC=ch/DC=cern/OU=Organic Units/OU=[...]
> ---
> Event: Done
> - Arrived = Sun Jul 12 01:34:07 2009 CEST
> - Exit code = 1
> - Host = wms208.cern.ch
> - Reason = File not available.Cannot read JobWrapper output,
> both from Condor and from Maradona.
> - Source = LogMonitor
> -----------------------------------------------------------------------------
>
> Does your batch system kill SAM jobs after 15 minutes?
>
I don't see why it would. I haven't changed anything in the batch
system.
> In any case, 15 minutes is too long for a SAM job: it would be hanging on
> some operation. Does your node/department/campus firewall allow outbound
> access to port 8446?
>
The 8446 port is open.
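For reference, one way to confirm outbound access from a worker node is a plain TCP connect test. The following is only a minimal sketch: the function name can_connect is made up for illustration, and the target host is a placeholder, since the thread does not say which service listens on port 8446.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call (the host below is a placeholder, not a real endpoint):
# print(can_connect("service.example.org", 8446))
```

Running it on the worker node itself, rather than on the CE, tests the path the job actually takes. A False result only shows that the connection failed from that node; it does not tell whether a local, departmental, or campus firewall is responsible.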
Regards,
luis