Hi,
the problem is still there. Last night the gatekeeper service crashed
again.
As Maarten indicated, the problem does not seem to be related to the
log messages below, because the other CE is running normally
and shows the same log messages.
>> thanks for having indicated a possible reason.
>> I've found log messages like this
>>
>> LCAS 0: lcas_userban.mod-plugin_confirm_authorization():
>> checking banned users in /opt/glite/etc/lcas/ban_users.db
>> LCAS 0: lcas_plugin_voms-
>> plugin_confirm_authorization_from_x509(): Generic verification error
>> for VOMS (failure): AC not yet (or not anymore) valid.
>> LCAS 0: 2008-07-24.19:15:43 : lcas_plugin_voms-
>> plugin_confirm_authorization_from_x509(): voms plugin failed
>> LCAS 0: lcas.mod-lcas_run_va(): authorization failed for plugin /
>> opt/
>> glite/lib/modules/lcas_voms.mod
>> LCAS 0: lcas.mod-lcas_run_va(): failed
Do you have any hints?
In /var/log/messages I found these log messages:
Jul 29 01:52:24 ce05-lcg gridinfo[2717]: JMA 2008/07/29 01:52:24 GATEKEEPER_JM_ID 2008-07-29.01:52:24.0000002706.0000000000 has GRAM_SCRIPT_JOB_ID 2747 manager type fork
Jul 29 01:53:00 ce05-lcg glite-lb-interlogd[21516]: error reading server egee-rb-01.mi.infn.it reply: get_reply: error reading server reply
Jul 29 01:53:00 ce05-lcg glite-lb-interlogd[21516]: queue_thread: get_reply: error reading server reply
Jul 29 01:53:11 ce05-lcg GRAM gatekeeper[4102]: Got connection 131.154.101.14 at Tue Jul 29 01:53:11 2008
Jul 29 01:53:11 ce05-lcg GRAM gatekeeper[4102]: Authenticated globus user: /C=CN/O=HEP/OU=IHEP/CN=Tao Junquan
Jul 29 01:53:28 ce05-lcg glite-lb-interlogd[21516]: error reading server lb106.cern.ch reply: get_reply: error reading server reply
Jul 29 01:53:28 ce05-lcg glite-lb-interlogd[21516]: queue_thread: get_reply: error reading server reply
Jul 29 01:53:29 ce05-lcg GRAM gatekeeper[4719]: Got connection 131.169.223.74 at Tue Jul 29 01:53:29 2008
Jul 29 01:53:29 ce05-lcg GRAM gatekeeper[4719]: Authenticated globus user: /C=CN/O=HEP/OU=IHEP/CN=Tao Junquan
Jul 29 01:53:52 ce05-lcg nscd: nss_ldap: reconnected to LDAP server ldap://ldap-m.cr.cnaf.infn.it after 1 attempt
Jul 29 01:54:02 ce05-lcg glite-lb-interlogd[21516]: error reading server egee-rb-01.mi.infn.it reply: get_reply: error reading server reply
Jul 29 01:54:02 ce05-lcg glite-lb-interlogd[21516]: queue_thread: get_reply: error reading server reply
What do they mean?
Anyway, the same log messages are present on the other CE.
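
To make the comparison between the two CEs less of an eyeball exercise, here is a rough sketch that counts syslog entries per program and reports where the two hosts differ. It assumes the standard one-entry-per-line layout of /var/log/messages; the function names and the regular expression are my own and are not part of any gLite or Globus tool.

```python
import re
from collections import Counter

# Assumed syslog layout: "Mon DD HH:MM:SS host program[pid]: message".
# The "[pid]" part is optional (e.g. "nscd: ...") and the program name
# may contain a space (e.g. "GRAM gatekeeper").
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<prog>[^:\[]+?)(?:\[(?P<pid>\d+)\])?:\s+(?P<msg>.*)$"
)


def count_by_program(lines):
    """Count syslog entries per program name; unparsable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = SYSLOG_RE.match(line.rstrip("\n"))
        if match:
            counts[match.group("prog")] += 1
    return counts


def compare(counts_a, counts_b):
    """Return {program: (count_a, count_b)} for programs whose counts differ."""
    programs = sorted(set(counts_a) | set(counts_b))
    return {
        prog: (counts_a.get(prog, 0), counts_b.get(prog, 0))
        for prog in programs
        if counts_a.get(prog, 0) != counts_b.get(prog, 0)
    }
```

One would feed it the lines of /var/log/messages from each CE (for example via `open(path)`, with whatever local copies of the two files are at hand) and look at the programs that `compare` returns. A program that only logs on the crashing CE would be the first thing to check.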
Ale
On Jul 25, 2008, at 11:44 AM, <[log in to unmask]> wrote:
> Ciao Ale,
>
>> thanks for having indicated a possible reason.
>> I've found log messages like this
>>
>> LCAS 0: lcas_userban.mod-plugin_confirm_authorization():
>> checking banned users in /opt/glite/etc/lcas/ban_users.db
>> LCAS 0: lcas_plugin_voms-
>> plugin_confirm_authorization_from_x509(): Generic verification error
>> for VOMS (failure): AC not yet (or not anymore) valid.
>> LCAS 0: 2008-07-24.19:15:43 : lcas_plugin_voms-
>> plugin_confirm_authorization_from_x509(): voms plugin failed
>> LCAS 0: lcas.mod-lcas_run_va(): authorization failed for plugin /
>> opt/
>> glite/lib/modules/lcas_voms.mod
>> LCAS 0: lcas.mod-lcas_run_va(): failed
>>
>>
>> anyway,
>>
>> the bug fix for bug 35981 is already in production ?
>
> It is on the PPS since July 14. Look at the dependencies at the
> bottom:
>
> https://savannah.cern.ch/bugs/?35981
>
> It will probably go to production next week.
>
> BTW, these child process crashes normally are harmless. After the fix
> the bad proxies will be refused anyway.
>