Hi Alessandra,

Just to confirm, /cvmfs is mounted on the two test worker nodes:

[root@node205 ~]# ls /cvmfs
atlas.cern.ch  atlas-condb.cern.ch  atlas-nightlies.cern.ch  config-egi.egi.eu  grid.cern.ch  sft.cern.ch  unpacked.cern.ch

[root@node206 ~]# ls /cvmfs
atlas.cern.ch  atlas-condb.cern.ch  atlas-nightlies.cern.ch  cvmfs-config.cern.ch  grid.cern.ch  sft.cern.ch  unpacked.cern.ch
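The two listings above are not identical (one shows config-egi.egi.eu, the other cvmfs-config.cern.ch). This may simply reflect which repositories autofs has mounted so far on each node; the thread does not say. As a rough sketch, a small helper such as the following could list the repositories that appear on only one node. The name diff_repos is hypothetical, and it assumes each listing contains no duplicate entries.

```shell
# Print the repository names that appear in only one of two /cvmfs listings.
# $1 and $2 are whitespace-separated listings, e.g. the output of `ls /cvmfs`.
# Assumes no name is repeated within a single listing.
diff_repos() {
    # Word splitting of $1 and $2 is intentional: one name per output line.
    printf '%s\n' $1 $2 | sort | uniq -u
}

# Possible usage (assumes ssh access to both nodes):
#   diff_repos "$(ssh node205 ls /cvmfs)" "$(ssh node206 ls /cvmfs)"
```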

Thanks
Patrick



From: Testbed Support for GridPP member institutes [[log in to unmask]] on behalf of Patrick Smith [[log in to unmask]]
Sent: 23 January 2020 09:51
To: [log in to unmask]
Subject: Re: [External] Re: ARC CE6 / UGE not working with Panda Queue

Hi Alessandra,

Great, thank you. Yes, let's wait and see how things go.

Talk soon
Regards
Patrick

From: Testbed Support for GridPP member institutes [[log in to unmask]] on behalf of Alessandra Forti [[log in to unmask]]
Sent: 22 January 2020 16:07
To: [log in to unmask]
Subject: Re: [External] Re: ARC CE6 / UGE not working with Panda Queue

Hi,

It already looks better:

https://monit-grafana.cern.ch/d/3naRcbRZz/harvester?orgId=17&var-bin=1h&var-cloud=All&var-site=UKI-SOUTHGRID-SUSX&var-computingsite=All&var-instance=All&var-status=All&var-resourcetype=All&var-computingelement=All

let's wait.

cheers
alessandra

On 22/01/2020 15:50, Patrick Smith wrote:
Hi Dan & Alessandra,

I have enabled the RTEs as described below:

[root@grid-arc-01 ~]# arcctl rte enable APPS/HEP/ATLAS-SITE-LCG --dummy
[root@grid-arc-01 ~]# arcctl rte enable ENV/GLITE --dummy
[root@grid-arc-01 ~]# arcctl rte list
ENV/CANDYPOND                    (system, disabled)
ENV/CONDOR/DOCKER                (system, disabled)
ENV/LRMS-SCRATCH                 (system, disabled)
ENV/PROXY                        (system, enabled, default)
ENV/RTE                          (system, disabled)
ENV/SINGULARITY                  (system, disabled)
APPS/HEP/ATLAS-SITE-LCG          (dummy, enabled)
ENV/GLITE                        (dummy, enabled)


Thanks
Patrick

From: Daniel Traynor [[log in to unmask]]
Sent: 22 January 2020 13:44
To: [log in to unmask]; Patrick Smith
Subject: Re: [External] Re: ARC CE6 / UGE not working with Panda Queue

hi Patrick

That was missing from my notes!

I created a couple of dummy RTEs on my ARC CE, which seem to be needed by ATLAS.

do

arcctl rte enable APPS/HEP/ATLAS-SITE-LCG --dummy
arcctl rte enable ENV/GLITE --dummy
and you should see

$ arcctl rte list
ENV/CANDYPOND (system, disabled)
ENV/CONDOR/DOCKER (system, disabled)
ENV/LRMS-SCRATCH (system, disabled)
ENV/PROXY (system, enabled, default)
ENV/RTE (system, disabled)
ENV/SINGULARITY (system, disabled)
APPS/HEP/ATLAS-SITE-LCG (dummy, enabled)
ENV/GLITE (dummy, enabled)
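
To check this listing automatically rather than by eye, something along these lines might work. It is only a sketch: check_rtes is a hypothetical helper, and it assumes the output format of `arcctl rte list` is exactly as shown above (name in the first column, state in parentheses).

```shell
# Read `arcctl rte list` output on stdin and succeed only if every RTE
# named in the arguments is listed as enabled.
check_rtes() {
    listing=$(cat)
    status=0
    for rte in "$@"; do
        # Match the name exactly in column 1 and require "enabled" on the line.
        if ! printf '%s\n' "$listing" |
            awk -v name="$rte" '$1 == name && /enabled/ { found = 1 } END { exit !found }'
        then
            echo "missing or disabled: $rte" >&2
            status=1
        fi
    done
    return $status
}

# Possible usage on the CE:
#   arcctl rte list | check_rtes APPS/HEP/ATLAS-SITE-LCG ENV/GLITE
```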


dan


* Dr Daniel Traynor, Grid cluster system manager
* Tel +44(0)20 7882 6560, Particle Physics,QMUL


________________________________________
From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of Alessandra Forti <[log in to unmask]>
Sent: 22 January 2020 12:48
To: [log in to unmask]
Subject: Re: [External] Re: ARC CE6 / UGE not working with Panda Queue

Hi,

I don't know if this is the only problem, but if I add the RSL that ATLAS generates to my simpler file, the jobs cannot even be submitted because the runtime environment APPS/HEP/ATLAS-SITE-LCG is missing. On ARC 5 this is a script in /etc/arc/runtime:

[root@ce01 runtime]# cat /etc/arc/runtime/APPS/HEP/ATLAS-SITE-LCG
#!/bin/bash

Sites can customise it, but most of the time it's empty. Runtime environments are published in the NorduGrid CE BDII; to check:

ldapsearch -x -LLL -h grid-arc-01.hpc.susx.ac.uk:2135 -b "o=grid"| perl -p00e 's/\r?\n //g' |less
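
Rather than paging through the whole dump, the output could be filtered down to the published runtime environments. The following is a sketch only: list_published_rtes is a hypothetical helper, and the attribute name nordugrid-cluster-runtimeenvironment is an assumption based on the NorduGrid LDAP schema, so check it against the actual output.

```shell
# Read LDIF on stdin, join folded continuation lines (those starting with a
# single space), and print each published runtime environment once.
# ASSUMPTION: RTEs appear as nordugrid-cluster-runtimeenvironment attributes.
list_published_rtes() {
    awk '/^ / { line = line substr($0, 2); next }
         { if (NR > 1) print line; line = $0 }
         END { if (NR > 0) print line }' |
        sed -n 's/^nordugrid-cluster-runtimeenvironment: //p' |
        sort -u
}

# Possible usage:
#   ldapsearch -x -LLL -h grid-arc-01.hpc.susx.ac.uk:2135 -b "o=grid" | list_published_rtes
```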

Here is the manual for ARC6: http://www.nordugrid.org/arc/arc6/admins/details/rtes.html

Most sites have this set up by Puppet, I guess, because I don't think I have ever added it myself.

Even if this might not be the problem (it seems to block my jobs earlier than the ATLAS jobs), can you fix it so we can continue to debug the ATLAS RSL?
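
For reference, the submission check described above could be reproduced with a minimal job description that requests the runtime environment. This is only a sketch: write_rte_test_job, the job name and the output file names are hypothetical, and the xRSL is a simple test job, not the RSL that ATLAS generates.

```shell
# Write a minimal xRSL job description to stdout that requests the runtime
# environment given as $1. The job itself just runs /bin/hostname.
write_rte_test_job() {
    cat <<EOF
&(executable="/bin/hostname")
 (stdout="stdout.txt")
 (jobname="rte-test")
 (runtimeenvironment="$1")
EOF
}

# Possible usage (requires a valid grid proxy):
#   write_rte_test_job APPS/HEP/ATLAS-SITE-LCG > rte-test.xrsl
#   arcsub -c grid-arc-01.hpc.susx.ac.uk rte-test.xrsl
```

If the runtime environment is missing on the CE, the submission is expected to be rejected at this stage, which matches the behaviour described above.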

thanks

cheers
alessandra

On 22/01/2020 10:14, Patrick Smith wrote:
Hi Ste,

Thank you for taking the time to look. I am using NFS to share the 'session directory' between the ARC CE and the worker nodes, but using a local 'scratch directory' on the worker nodes.

I'm following logs in /var/spool/arc/ to trace where the job ends up and fails. I will try what you have suggested by submitting by hand.

Thanks again.
Regards
Patrick
________________________________
From: Testbed Support for GridPP member institutes [[log in to unmask]] on behalf of sjones [[log in to unmask]]
Sent: 22 January 2020 09:59
To: [log in to unmask]
Subject: Re: [External] Re: ARC CE6 / UGE not working with Panda Queue

On 2020-01-21 14:30, Matt Doidge wrote:
> My shot in the dark is to ask does the ARC6 CE require a prolog/epilog
> script to copy input data to/from whatever sandbox the ARC6 CE uses to
> the job working directory?

When I last set up SGE at some place (~ 2008) I used NFS to share the
home directory file system between the batch server and the nodes. Hence
no prolog/epilog arrangements were needed. If this is not the case, then
yes, files have to be somehow transferred in before and taken out after
the job runs on the node.

BTW: Some of the errors in the gridFTP log files implicated
.gahp_complete - a signal file to show when an operation is done. I
think Gahp utilises file transfer protocols such as SSH and FTP to work.
So perhaps the problem does lie in the transfer of input files. To
check, somehow snatch the failing job specification file (that ARC
produces) before it gets sent to the SGE server, and try submitting that
by hand. If it fails, that will show why the batch system is not
receiving or accepting the work.

Cheers,

Ste

########################################################################

To unsubscribe from the TB-SUPPORT list, click the following link:
https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=TB-SUPPORT&A=1


--
Inference: a conclusion reached on the basis of evidence and reasoning
Respect is a rational process. \\//
For Ur-Fascism, disagreement is treason. (U. Eco)