Hi Kashif,

Thank you so much for confirming what I thought. This is likely to be a 
bug rather than a config error, since you set your system up entirely 
independently of ours.

I guess you'll have some stale (held) jobs in condor_q that you need to 
get rid of by hand. Please confirm if you notice this.
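
A minimal sketch for cross-checking the log against the queue, assuming each ShadowLog entry sits on one line as in the excerpt quoted below. The function name held_gahp_jobs is made up for illustration; it only reads the log files and prints the unique job IDs that went into Hold with code 13,2 over .gahp_complete.

```shell
#!/bin/sh
# Hypothetical helper: print the unique job IDs that the given ShadowLog
# files report as going into Hold (code 13,2) because .gahp_complete
# could not be read. Assumes one log entry per line.
held_gahp_jobs() {
    # -h drops the file-name prefix, so output is the same for one or many files
    grep -h 'going into Hold state (code 13,2)' "$@" \
        | grep -F '.gahp_complete' \
        | sed -n 's/.*Job \([0-9][0-9]*\.[0-9][0-9]*\) going into Hold state.*/\1/p' \
        | sort -u
}
```

Something like `held_gahp_jobs /var/log/condor/ShadowLog*` should then give a list of IDs to compare with what `condor_q -hold` shows.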

I'd guess it's a race condition, since it occurs so infrequently. It may 
take a bit of time to zero in on the source of it.

Cheers,

Ste


On 26/03/18 17:41, Kashif Mohammad wrote:
> Hi Steve
>
> I tried this on an SL6 ARC CE and got a few instances like this: around 21 in the last 30 days.
>
> adowLog.old:01/26/18 07:41:01 (10342571.0) (3558208): ReliSock::put_file_with_permissions(): Failed to stat file '/var/spool/arc/grid01/778NDmkY9yrnD0VBFmzXO77mABFKDmABFKDm9r7VDmABFKDmsnieim/.gahp_complete': No such file or directory (errno: 2, si_error: 1)
> ShadowLog.old:01/26/18 07:41:01 (10342571.0) (3558208): DoUpload: (Condor error code 13, subcode 2) SHADOW at 163.1.5.50 failed to send file(s) to <163.1.5.112:44261>: error reading from /var/spool/arc/grid01/778NDmkY9yrnD0VBFmzXO77mABFKDmABFKDm9r7VDmABFKDmsnieim/.gahp_complete: (errno 2) No such file or directory; STARTER failed to receive file(s) from <163.1.5.50:21671>
> ShadowLog.old:01/26/18 07:41:01 (10342571.0) (3558208): Job 10342571.0 going into Hold state (code 13,2): Error from [log in to unmask]: SHADOW at 163.1.5.50 failed to send file(s) to <163.1.5.112:44261>: error reading from /var/spool/arc/grid01/778NDmkY9yrnD0VBFmzXO77mABFKDmABFKDm9r7VDmABFKDmsnieim/.gahp_complete: (errno 2) No such file or directory; STARTER failed to receive file(s) from <163.1.5.50:21671>
>
>
> Cheers
>
> Kashif
>>>>> -----Original Message-----
>>>>> From: Testbed Support for GridPP member institutes [mailto:TB-
>>>>> [log in to unmask]] On Behalf Of Stephen Jones
>>>>> Sent: 26 March 2018 17:36
>>>>> To: [log in to unmask]
>>>>> Subject: .gahp
>>>>>
>>>>> Hi,
>>>>>
>>>>> Can someone who is running ARC/Condor please do this for me?
>>>>>
>>>>> # cd /var/log/condor/
>>>>> # grep .gahp ShadowLog*
>>>>>
>>>>> And let me know if anything like this pops out:
>>>>>
>>>>> ReliSock::put_file_with_permissions(): Failed to stat file
>>>>> '/var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMU
>>>>> aDm9BFKDmwLXxtm/.gahp_complete':
>>>>> No such file or directory (errno: 2, si_error: 1)
>>>>> DoUpload: (Condor error code 13, subcode 2) SHADOW at 192.168.178.105
>>>>> failed to send file(s) to <192.168.26.14:27452>: error reading from
>>>>> /var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUa
>>>>> Dm9BFKDmwLXxtm/.gahp_complete:
>>>>> (errno 2) No such file or directory; STARTER failed to receive file(s) from
>>>>> <138.253.178.105:9618> Job 208640.0 going into Hold state (code 13,2):
>>>>> Error from
>>>>> [log in to unmask]: SHADOW at 192.168.178.105 failed to send
>>>>> file(s) to <192.168.26.14:27452>: error reading from
>>>>> /var/spool/arc/grid/u2FNDmPzfKsnKbMCrqsOzK9nABFKDmABFKDmnMUa
>>>>> Dm9BFKDmwLXxtm/.gahp_complete:
>>>>> (errno 2) No such file or directory; STARTER failed to receive file(s) from
>>>>> <138.253.178.105:9618>
>>>>>
>>>>> PS: Using CentOS7, nordugrid-arc-5.4.1-1.el7.centos.x86_64 and
>>>>> condor-8.6.3-1.el7.x86_64
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Ste
>>>>>
>>>>>
>>>>> --
>>>>> Steve Jones                             [log in to unmask]
>>>>> Grid System Administrator               office: 220
>>>>> High Energy Physics Division            tel (int): 43396
>>>>> Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 3396
>>>>> University of Liverpool                 http://www.liv.ac.uk/physics/hep/


-- 
Steve Jones                             [log in to unmask]
Grid System Administrator               office: 220
High Energy Physics Division            tel (int): 43396
Oliver Lodge Laboratory                 tel (ext): +44 (0)151 794 3396
University of Liverpool                 http://www.liv.ac.uk/physics/hep/