Thanks Alastair.

To you or any other helpful person: how do I get a batch ID from a CREAM output file path?
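
As a starting point, the sandbox path quoted further down the thread can be split into its named pieces. The sketch below is a minimal illustration, not CREAM's own tooling: the function name `split_sandbox_path` is hypothetical, and the layout it assumes (a directory named `CREAM` plus digits, followed by the output file) is inferred only from the single path in this thread. It does not resolve the batch ID itself; it only isolates values that could serve as lookup keys.

```python
from pathlib import PurePosixPath


def split_sandbox_path(path):
    """Pull the CREAM job directory and output file name out of a sandbox path.

    Assumption: the job directory is the component made of "CREAM" followed
    only by digits, as in the one example path from this thread.
    """
    parts = PurePosixPath(path).parts
    cream_dir = next(
        (p for p in parts if p.startswith("CREAM") and p[5:].isdigit()),
        None,  # no matching component found
    )
    return {"cream_job_dir": cream_dir, "output_file": parts[-1]}
```

Calling it on the path from the original report would give the directory name `CREAM791629196` and the file name `4998630.20.out`. Whether either of these maps directly onto a local batch ID is not something this thread establishes.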


________________________________
From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of Alastair Dewhurst <[log in to unmask]>
Sent: 13 August 2018 10:01
To: [log in to unmask]
Subject: Re: atlas job filling up /var/cream_sandbox/atlaspil/ with huge log file

Hi

For any ATLAS issue, always cc UK cloud support; if anyone is going to do anything about this, it will be someone on there. They are all on TB-SUPPORT anyway, but it’s always a good idea to make sure it isn’t missed.

You should also always provide the batch farm ID of the jobs. In the Panda monitor you can search for jobs by batch ID, which shows which user actually submitted them, so that ATLAS can stop that user from submitting more.

I am afraid I can’t help you with blocking them at the CREAM level.

Alastair


On 13 Aug 2018, at 09:53, George, Simon <[log in to unmask]> wrote:

Thanks very much.
Done: https://ggus.eu/index.php?mode=ticket_info&ticket_id=136675&come_from=submit
I wonder if anyone else has seen jobs like these?


________________________________
From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of Daniela Bauer <[log in to unmask]>
Sent: 13 August 2018 09:15
To: [log in to unmask]
Subject: Re: atlas job filling up /var/cream_sandbox/atlaspil/ with huge log file

ggus.org<http://ggus.org>, pick "VO specific" and choose ATLAS. And make it "urgent".
Cheers,
Daniela

On Mon, 13 Aug 2018 at 09:10, George, Simon <[log in to unmask]> wrote:
>
> Thanks Daniela.
>
> How do I submit a VO ticket? Can you point me to the relevant ticketing system/email address please?
>
>
>
> ________________________________
> From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of Daniela Bauer <[log in to unmask]>
> Sent: 13 August 2018 09:05
> To: [log in to unmask]
> Subject: Re: atlas job filling up /var/cream_sandbox/atlaspil/ with huge log file
>
> An ATLAS-specific VO ticket? At least then the other sites can see it.
>
> Cheers,
> Daniela
> On Mon, 13 Aug 2018 at 09:02, George, Simon <[log in to unmask]> wrote:
> >
> > Since last night I had two more jobs with the same problem.
> >
> >
> >
> > ________________________________
> > From: Testbed Support for GridPP member institutes <[log in to unmask]> on behalf of George, Simon <[log in to unmask]>
> > Sent: 12 August 2018 22:34
> > To: [log in to unmask]
> > Subject: atlas job filling up /var/cream_sandbox/atlaspil/ with huge log file
> >
> >
> > Hi,
> >
> > I have a rogue ATLAS job that has broken one of my CREAM CEs with a 10 GB log file in a pilot directory:
> >
> > /var/cream_sandbox/atlaspil/CN_Robot__ATLAS_Pilot2_CN_531497_CN_atlpilo2_OU_Users_OU_Organic_Units_DC_cern_DC_ch_atlas_Role_pilot_Capability_NULL_platl017/79/CREAM791629196/OSB/4998630.20.out
> >
> >
> > The log file at some point starts listing the files unpacked from a tar ball like this:
> >
> > tarball_PandaJob_4024428114_ANALY_RHUL_SL6/usr/DiLepAna/1.0.0/InstallArea/x86_64-slc6-gcc62-opt/include/AnalysisHelpers/AnalysisHelpers/AnalysisHelpers
> >
> > tarball_PandaJob_4024428114_ANALY_RHUL_SL6/usr/DiLepAna/1.0.0/InstallArea/x86_64-slc6-gcc62-opt/include/AnalysisHelpers/AnalysisHelpers/AnalysisHelpers/AnalysisHelpers
> >
> > tarball_PandaJob_4024428114_ANALY_RHUL_SL6/usr/DiLepAna/1.0.0/InstallArea/x86_64-slc6-gcc62-opt/include/AnalysisHelpers/AnalysisHelpers/AnalysisHelpers/AnalysisHelpers/AnalysisHelpers
> >
> > and carries on like that forever, adding another /AnalysisHelpers at the end each time, soon changing to an error message
> >
> > "Cannot stat: Too many levels of symbolic links" each time.
> > There is clearly some kind of symlink loop in this tar file.
> > Perhaps some of you see other jobs with the same problem at your sites?
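
One way to check a tarball for this kind of loop without unpacking it is to inspect its symlink entries. The sketch below is a minimal illustration under stated assumptions: the function name `find_self_referencing_symlinks` is hypothetical, and it only catches the simplest case, a link whose target is the link itself or one of the directories that contain it (for example a link `AnalysisHelpers` inside `AnalysisHelpers` pointing at `.`). The actual link layout in the job's tarball is not known from this thread.

```python
import posixpath
import tarfile


def find_self_referencing_symlinks(tar):
    """Return (name, linkname) for symlinks that point at themselves or an ancestor.

    `tar` is an open tarfile.TarFile. Only relative targets that stay inside
    the archive are considered; longer chains of links are not followed.
    """
    loops = []
    for member in tar.getmembers():
        if not member.issym():
            continue
        link_path = posixpath.normpath(member.name)
        link_dir = posixpath.dirname(link_path)
        # Resolve the target relative to the directory holding the link.
        target = posixpath.normpath(posixpath.join(link_dir, member.linkname))
        points_at_ancestor = (
            target == "."                            # archive root
            or link_path == target                   # link to itself
            or link_path.startswith(target + "/")    # link to a parent dir
        )
        if points_at_ancestor:
            loops.append((member.name, member.linkname))
    return loops
```

A typical use would be `with tarfile.open("job.tar.gz") as tar: print(find_self_referencing_symlinks(tar))`, where `job.tar.gz` is a placeholder file name. An empty list does not prove the archive is loop-free, since indirect loops through several links are outside what this check covers.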
> >
> > This log file is up to around 10 GB and has filled my /var up, so I will copy the first 5000 lines in case it's useful and delete it.
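
Saving the head of a very large log before deleting it can be done by streaming, so the file is never loaded into memory. The sketch below is one way to do it; the function name `save_head` and its parameters are hypothetical, and the default of 5000 lines simply mirrors the number mentioned above.

```python
from itertools import islice


def save_head(src_path, dst_path, max_lines=5000):
    """Copy at most max_lines lines from src_path to dst_path; return the count.

    Files are opened in binary mode so that odd bytes in the log cannot
    cause decoding errors, and lines are streamed one at a time.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in islice(src, max_lines):
            dst.write(line)
            copied += 1
    return copied
```

The destination should be on a different filesystem from the full one, since the copy itself needs space. Note also that if the job is still running and holds the log open, deleting the file does not release the disk space until the process closes it.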
> >
> >
> > Can anyone advise if/how to report it to ATLAS?
> >
> > And how can I identify and stop this job?
> >
> > (Apologies, I do not know my way around cream.)
> >
> >
> > Thanks,
> >
> > Simon
> >
> >
> > ________________________________
> >
> > To unsubscribe from the TB-SUPPORT list, click the following link:
> > https://www.jiscmail.ac.uk/cgi-bin/webadmin?SUBED1=TB-SUPPORT&A=1
>
>
>
> --
> Sent from the pit of despair
>
> -----------------------------------------------------------
> [log in to unmask]
> HEP Group/Physics Dep
> Imperial College
> London, SW7 2BW
> Tel: +44-(0)20-75947810
> http://www.hep.ph.ic.ac.uk/~dbauer/
>



--
Sent from the pit of despair

-----------------------------------------------------------
[log in to unmask]
HEP Group/Physics Dep
Imperial College
London, SW7 2BW
Tel: +44-(0)20-75947810
http://www.hep.ph.ic.ac.uk/~dbauer/
