Hi,

We see this very often, and it is a long-standing problem. Since it comes and
goes (it is not always there) without any changes on our side, we think that
it is related to some WMS problem rather than to gCE problems.

Somewhat related is the following ticket (although no mapping problems there):

https://gus.fzk.de/pages/ticket_details.php?ticket=20625

However, I don't know the status of the improvements mentioned there.

Regards, Antun

-----
Antun Balaz
Research Assistant
E-mail: [log in to unmask]
Web: http://scl.phy.bg.ac.yu/

Phone: +381 11 3713152
Fax: +381 11 3162190

Scientific Computing Laboratory
Institute of Physics, Belgrade, Serbia
-----

---------- Original Message -----------
From: Esteban Freire Garcia <[log in to unmask]>
To: [log in to unmask]
Sent: Mon, 14 May 2007 22:50:47 +0200
Subject: Re: [LCG-ROLLOUT] LCAS/LCMAPS strange behaviour

> Hi Alex,
> 
> Since upgrade 29 we have had a very similar incident on PPS, with
> similar logs, although I am not sure the problem started with the
> upgrade; in principle I did not observe anything strange after
> upgrading. What is curious is that, on the monitoring page, the tests
> that run automatically every hour have a status of OK on PPS, but if I
> try to send a test from the SAM Admin page, the job is aborted with the
> following error:
>
> (reason = Got a job held event, reason: "The job attribute PeriodicHold
> expression 'Matched =!= TRUE && CurrentTime > QDate + 900' evaluated to
> TRUE")
>
> After reviewing all the running services I do not see anything strange.
> I think it is an authentication problem, although I do not see anything
> unusual in that respect either. So I repeat your question from here:
> has anyone seen similar behaviour?
> 
> Thanks,
> Esteban
> 
> > Hi all,
> >
> > On the gliteCEs at both my production & pps sites I've got the
> > following logged exactly every 5 minutes and 30 seconds:
> > -----------------------------------------------------
> > Notice: 6: Got connection 131.154.100.148 at Sun May 13 07:08:59 2007
> >
> > Notice: 5: Trying to use delegated user proxy
> > Notice: 5: Authenticated globus user: /C=PL/O=GRID/O=PSNC/CN=Rafal Lichwala - OPS
> > Notice: 0: GRID_SECURITY_HTTP_BODY_FD=9
> > Notice: 0: JOB_REPOSITORY_ID 2007-05-13.07:09:00.123457.0000000507.0000004146 (unique id used for Job Repository)
> > Notice: 0: FORMAT: YYYY-MM-DD.hh:mm:ss.micros.pid.connection
> > Notice: 0: (Format: <date>.<time (with microsecs)>.<pid>.<connection counter>)
> > Notice: 0: temporarily ALLOW empty credentials
> > Notice: 0: Using dlopen version of LCAS
> > Notice: 0: lcasmod_name = /opt/glite/lib/lcas.mod
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 :
> > LCAS   7: 2007-05-13.07:09:00.123457.0000000507.0000004146 : Initialization LCAS version 1.3.1
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas.mod-lcas_init(): Reading LCAS database /opt/glite/etc/lcas/lcas.db
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 :
> > LCAS   5: 2007-05-13.07:09:00.123457.0000000507.0000004146 : LCAS authorization request
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas.mod-lcas_run_va(): user is /C=PL/O=GRID/O=PSNC/CN=Rafal Lichwala - OPS
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas_userban.mod-plugin_confirm_authorization(): checking banned users in /opt/glite/etc/lcas/ban_users.db
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas.mod-lcas_run_va(): authorization granted by plugin /opt/glite/lib/modules/lcas_userban.mod
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas_plugin_voms-plugin_confirm_authorization_from_x509(): Generic verification error for VOMS (failure)!
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas_plugin_voms-plugin_confirm_authorization_from_x509(): voms plugin failed
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas.mod-lcas_run_va(): authorization failed for plugin /opt/glite/lib/modules/lcas_voms.mod
> > LCAS   0: 2007-05-13.07:09:00.123457.0000000507.0000004146 : lcas.mod-lcas_run_va(): failed
> > Failure: LCAS failed authorization.
> > Failure: LCAS failed authorization.
> > -----------------------------------------------------
> >
> > AFAIK /C=PL/O=GRID/O=PSNC/CN=Rafal Lichwala - OPS is the DN used to
> > submit tests from the SAM Admin Portal. The connection is coming from
> > the glite-rb-01.cnaf.infn.it WMS.
> > Any ideas why it tries exactly every 5 minutes 30 seconds? Does the WMS
> > try to monitor some previously sent jobs, or what?
> >
> > What is more interesting is that when I try to submit jobs from the SAM
> > Admin Portal to the production gliteCE, the job gets Aborted due to:
> > Job got an error while in the CondorG queue.
> > hit job shallow retry count (0)
> > In the job logging info I see that the job is submitted by
> > /C=PL/O=GRID/O=PSNC/CN=Rafal Lichwala - OPS
> > But nothing is logged in /var/log/glite/gatekeeper.log or
> > /var/log/messages regarding lcas & lcmaps authentication.
> > There is also nothing in /var/log/gridftp-lcas_lcmaps.log for the user.
> > But there is a mapping under /etc/grid-security/gridmapdir from the
> > /C=PL/O=GRID/O=PSNC/CN=Rafal Lichwala - OPS DN to ops003.
> >
> > But what is even stranger is that when I submit from the SAM Admin
> > Portal to the pps gliteCE, the job is successfully submitted and
> > executed by pbs, and a blah record is inserted into
> > /var/log/glite/accounting/blahp.log-200705, but again nothing is logged
> > in either /var/log/glite/gatekeeper.log or /var/log/messages. However,
> > the authentication is logged in /var/log/gridftp-lcas_lcmaps.log
> >
> > How can this be? On both pps & production, authentication works OK for
> > all other users, with lcas & lcmaps messages logged as usual in
> > /var/log/glite/gatekeeper.log & /var/log/messages.
> > And why does the submission work for the pps site only?
> >
> > Has anyone seen similar behaviour?
> >
> > Thanks
> > Alex
------- End of Original Message -------
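
One way to check the "exactly every 5 minutes and 30 seconds" observation from the original message is to measure the gaps between successive gatekeeper connections from the WMS host. The following is only a minimal Python sketch, assuming the log lines have the "Got connection <IP> at <date>" form shown in the quoted excerpt. The function name connection_intervals is made up for illustration, and the second timestamp in the sample is invented; it is not taken from a real log.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Notice: 6: Got connection 131.154.100.148 at Sun May 13 07:08:59 2007
# (format assumed from the log excerpt quoted in the thread)
CONNECTION_RE = re.compile(
    r"Got connection (?P<ip>\d+\.\d+\.\d+\.\d+) at "
    r"(?P<when>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4})"
)


def connection_intervals(lines, ip):
    """Return the gaps, in seconds, between successive connections from `ip`."""
    times = []
    for line in lines:
        match = CONNECTION_RE.search(line)
        if match and match.group("ip") == ip:
            times.append(
                datetime.strptime(match.group("when"), "%a %b %d %H:%M:%S %Y")
            )
    # Pair each timestamp with the next one and take the difference.
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    # The first line is from the thread; the second is an invented example.
    sample = [
        "Notice: 6: Got connection 131.154.100.148 at Sun May 13 07:08:59 2007",
        "Notice: 6: Got connection 131.154.100.148 at Sun May 13 07:14:29 2007",
    ]
    print(connection_intervals(sample, "131.154.100.148"))
```

Run against the lines of /var/log/glite/gatekeeper.log, a list dominated by the value 330 would confirm a fixed 5 minute 30 second retry period from that host.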