Hi,
below you can find the last log messages before the crash
is there a way to increase log message verbosity ?
TIME: Tue Jul 29 02:20:55 2008
PID: 6495 -- Notice: 6: Got connection 192.135.14.39 at Tue Jul 29
02:20:55 2008
TIME: Tue Jul 29 02:20:55 2008
PID: 6496 -- Notice: 6: Got connection 192.135.14.39 at Tue Jul 29
02:20:55 2008
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 5: Authenticated globus user: /C=IT/O=INFN/
OU=Personal Certificate/L=Milano/CN=Davide Rebatto
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 5: Authenticated globus user: /C=IT/O=INFN/
OU=Personal Certificate/L=Milano/CN=Davide Rebatto
lcas client name: /C=IT/O=INFN/OU=Personal Certificate/L=Milano/
CN=Davide Rebatto
LCAS 0:
LCAS 1: Initialization LCAS version 1.3.7
allowing empty credentials
LCAS 2: LCAS authorization request
LCAS 0: lcas_userban.mod-plugin_confirm_authorization():
checking banned users in /opt/glite/etc/lcas/ban_users.db
lcas client name: /C=IT/O=INFN/OU=Personal Certificate/L=Milano/
CN=Davide Rebatto
LCAS 0:
LCAS 1: Initialization LCAS version 1.3.7
allowing empty credentials
LCAS 2: LCAS authorization request
LCAS 0: lcas_userban.mod-plugin_confirm_authorization():
checking banned users in /opt/glite/etc/lcas/ban_users.db
LCAS 0: 2008-07-29.02:20:56 : lcas_plugin_voms-
plugin_confirm_authorization_from_x509(): voms plugin succeeded
LCAS 0: lcas.mod-lcas_run_va(): succeeded
LCAS 1: Termination LCAS
lcmaps client name: /C=IT/O=INFN/OU=Personal Certificate/L=Milano/
CN=Davide Rebatto
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 :
LCMAPS 7: 2008-07-29.02:20:56.0000006496.0000000000 : Initialization
LCMAPS version 1.4.2
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
startPluginManager(): Reading LCMAPS database /opt/glite/etc/lcmaps/
lcmaps.db
LCMAPS 5: 2008-07-29.02:20:56.0000006496.0000000000 : LCMAPS
credential mapping request
LCAS 0: 2008-07-29.02:20:56 : lcas_plugin_voms-
plugin_confirm_authorization_from_x509(): voms plugin succeeded
LCAS 0: lcas.mod-lcas_run_va(): succeeded
LCAS 1: Termination LCAS
lcmaps client name: /C=IT/O=INFN/OU=Personal Certificate/L=Milano/
CN=Davide Rebatto
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 :
LCMAPS 7: 2008-07-29.02:20:56.0000006495.0000000000 : Initialization
LCMAPS version 1.4.2
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
startPluginManager(): Reading LCMAPS database /opt/glite/etc/lcmaps/
lcmaps.db
LCMAPS 5: 2008-07-29.02:20:56.0000006495.0000000000 : LCMAPS
credential mapping request
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/
lcmaps_voms_localgroup.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/
lcmaps_voms_localgroup.mod
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 :
lcmaps_plugin_voms_localgroup-plugin_run(): voms_localgroup plugin
succeeded
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/
lcmaps_voms_localaccount.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/
lcmaps_voms_localaccount.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/
lcmaps_voms_localgroup.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/
lcmaps_voms_localgroup.mod
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 :
lcmaps_plugin_voms_localgroup-plugin_run(): voms_localgroup plugin
succeeded
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/
lcmaps_voms_localaccount.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/
lcmaps_voms_localaccount.mod
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 :
lcmaps_plugin_voms_localaccount-plugin_run(): Could not find a VOMS
localaccount in /etc/grid-security/grid-mapfile (failure)
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 :
lcmaps_plugin_voms_localaccount-plugin_run(): voms_localaccount plugin
failed
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/
lcmaps_voms_poolaccount.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/
lcmaps_voms_poolaccount.mod
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 :
lcmaps_plugin_voms_localaccount-plugin_run(): Could not find a VOMS
localaccount in /etc/grid-security/grid-mapfile (failure)
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 :
lcmaps_plugin_voms_localaccount-plugin_run(): voms_localaccount plugin
failed
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/
lcmaps_voms_poolaccount.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/
lcmaps_voms_poolaccount.mod
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 :
lcmaps_plugin_voms_poolaccount-plugin_run(): voms_poolaccount plugin
succeeded
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/lcmaps_posix_enf.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/lcmaps_posix_enf.mod
LCMAPS 6: 2008-07-29.02:20:56.0000006496.0000000000 :
lcmaps_plugin_posix_enf-log_cred():
uid=5485(atlasprd029):pgid=5002(atlasprd):sgid=1307(atlas),1099360367(n
\a)
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 :
lcmaps_plugin_posix_enf-plugin_run(): posix_enf plugin succeeded
LCMAPS 0: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
lcmaps_run(): succeeded
LCMAPS 7: 2008-07-29.02:20:56.0000006496.0000000000 : Termination LCMAPS
LCMAPS 1: 2008-07-29.02:20:56.0000006496.0000000000 : lcmaps.mod-
lcmaps_term(): terminating
Successfull mapping done
Mapping service "LCMAPS" returned local user "atlasprd029"
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 :
lcmaps_plugin_voms_poolaccount-plugin_run(): voms_poolaccount plugin
succeeded
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): found plugin /opt/glite/lib/modules/lcmaps_posix_enf.mod
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
runPlugin(): running plugin /opt/glite/lib/modules/lcmaps_posix_enf.mod
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=9
LCMAPS 6: 2008-07-29.02:20:56.0000006495.0000000000 :
lcmaps_plugin_posix_enf-log_cred():
uid=5485(atlasprd029):pgid=5002(atlasprd):sgid=1307(atlas),1099360367(n
\a)
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 :
lcmaps_plugin_posix_enf-plugin_run(): posix_enf plugin succeeded
LCMAPS 0: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
lcmaps_run(): succeeded
LCMAPS 7: 2008-07-29.02:20:56.0000006495.0000000000 : Termination LCMAPS
LCMAPS 1: 2008-07-29.02:20:56.0000006495.0000000000 : lcmaps.mod-
lcmaps_term(): terminating
Successfull mapping done
Mapping service "LCMAPS" returned local user "atlasprd029"
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=9
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 5: Requested service: jobmanager-lcglsf
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 5: Authorized as local user: atlasprd029
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 5: Authorized as local uid: 5485
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 5: and local gid: 5002
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 5: "/C=IT/O=INFN/OU=Personal Certificate/
L=Milano/CN=Davide Rebatto" mapped to atlasprd029 (5485/5002)
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 5: Requested service: jobmanager-lcglsf
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 5: Authorized as local user: atlasprd029
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 5: Authorized as local uid: 5485
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 5: and local gid: 5002
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 5: "/C=IT/O=INFN/OU=Personal Certificate/
L=Milano/CN=Davide Rebatto" mapped to atlasprd029 (5485/5002)
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 0: executing /opt/globus/libexec/globus-job-
manager
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 0: GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006496.0000000000 for /C=IT/O=INFN/OU=Personal
Certificate/L=Milano/CN=Davide Rebatto on 192.135.14.39
JMA 2008/07/29 02:20:56 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006496.0000000000 has EDG_WL_JOBID 'https://egee-rb-01.mi.infn.it:9000/1JsHLJd8W0tRpdEbs-Jd9w'
GATEKEEPER_DGAS_FD=8 (/opt/edg/var/gatekeeper/jobs/
2008-07-29.02:20:56.0000014445.0000055146)
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=12
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 0: executing /opt/globus/libexec/globus-job-
manager
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 0: GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006495.0000000000 for /C=IT/O=INFN/OU=Personal
Certificate/L=Milano/CN=Davide Rebatto on 192.135.14.39
JMA 2008/07/29 02:20:56 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006495.0000000000 has EDG_WL_JOBID 'https://egee-rb-01.mi.infn.it:9000/fM2ZSdUF73qY-Ef_Txwsgg'
GATEKEEPER_DGAS_FD=8 (/opt/edg/var/gatekeeper/jobs/
2008-07-29.02:20:56.0000014445.0000055145)
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 0: GRID_SECURITY_CONTEXT_FD=12
TIME: Tue Jul 29 02:20:56 2008
PID: 6496 -- Notice: 0: Child 6507 started
TIME: Tue Jul 29 02:20:56 2008
PID: 6495 -- Notice: 0: Child 6508 started
Failure: error check net_accept select: No child processes
TIME: Tue Jul 29 02:20:56 2008
PID: 14445 -- Failure: error check net_accept select: No child
processes
JMA 2008/07/29 02:20:58 GATEKEEPER_JM_ID
2008-07-29.02:20:18.0000004537.0000000000 JM exiting
JMA 2008/07/29 02:20:58 GATEKEEPER_JM_ID
2008-07-29.02:20:18.0000004538.0000000000 JM exiting
JMA 2008/07/29 02:21:01 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006495.0000000000 for /C=IT/O=INFN/OU=Personal
Certificate/L=Milano/CN=Davide Rebatto on 192.135.14.39
JMA 2008/07/29 02:21:01 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006495.0000000000 mapped to atlasprd029 (5485,
5002)
JMA 2008/07/29 02:21:01 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006495.0000000000 has GRAM_SCRIPT_JOB_ID
1217290861:lcglsf:internal_3948000453:6508.1217290856 manager type
lcglsf
JMA 2008/07/29 02:21:01 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006496.0000000000 for /C=IT/O=INFN/OU=Personal
Certificate/L=Milano/CN=Davide Rebatto on 192.135.14.39
JMA 2008/07/29 02:21:01 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006496.0000000000 mapped to atlasprd029 (5485,
5002)
JMA 2008/07/29 02:21:01 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006496.0000000000 has GRAM_SCRIPT_JOB_ID
1217290861:lcglsf:internal_3099639915:6507.1217290856 manager type
lcglsf
JMA 2008/07/29 02:21:02 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006495.0000000000 JM exiting
JMA 2008/07/29 02:21:02 GATEKEEPER_JM_ID
2008-07-29.02:20:56.0000006496.0000000000 JM exiting
JMA 2008/07/29 02:26:06 GATEKEEPER_JM_ID
2008-07-29.01:25:59.0000026133.0000000000 JM exiting
JMA 2008/07/29 02:29:52 GATEKEEPER_JM_ID
2008-07-29.01:29:45.0000004999.0000000000 JM exiting
JMA 2008/07/29 02:35:22 GATEKEEPER_JM_ID
2008-07-29.02:06:18.0000004007.0000000000 JM exiting
JMA 2008/07/29 02:35:58 GATEKEEPER_JM_ID
2008-07-29.01:48:52.0000024483.0000000000 JM exiting
JMA 2008/07/29 02:36:51 GATEKEEPER_JM_ID
2008-07-29.01:35:40.0000020508.0000000000 JM exiting
JMA 2008/07/29 02:36:53 GATEKEEPER_JM_ID
2008-07-29.01:36:44.0000023051.0000000000 JM exiting
JMA 2008/07/29 02:38:43 GATEKEEPER_JM_ID
2008-07-29.01:38:35.0000027129.0000000000 JM exiting
JMA 2008/07/29 02:45:44 GATEKEEPER_JM_ID
2008-07-29.01:45:37.0000012353.0000000000 JM exiting
JMA 2008/07/29 02:45:44 GATEKEEPER_JM_ID
2008-07-29.01:57:38.0000013394.0000000000 JM exiting
JMA 2008/07/29 02:46:24 GATEKEEPER_JM_ID
2008-07-29.01:46:16.0000013985.0000000000 JM exiting
JMA 2008/07/29 02:49:24 GATEKEEPER_JM_ID
2008-07-29.01:49:17.0000025638.0000000000 JM exiting
JMA 2008/07/29 02:52:30 GATEKEEPER_JM_ID
2008-07-29.01:52:24.0000002706.0000000000 JM exiting
JMA 2008/07/29 02:54:18 GATEKEEPER_JM_ID
2008-07-29.01:54:10.0000006109.0000000000 JM exiting
JMA 2008/07/29 02:56:23 GATEKEEPER_JM_ID
2008-07-29.01:56:16.0000010460.0000000000 JM exiting
JMA 2008/07/29 02:57:43 GATEKEEPER_JM_ID
2008-07-29.02:14:27.0000023987.0000000000 JM exiting
JMA 2008/07/29 02:59:40 GATEKEEPER_JM_ID
2008-07-29.01:59:34.0000017843.0000000000 JM exiting
JMA 2008/07/29 03:04:29 GATEKEEPER_JM_ID
2008-07-29.02:04:21.0000031003.0000000000 JM exiting
JMA 2008/07/29 03:08:24 GATEKEEPER_JM_ID
2008-07-29.02:08:18.0000009462.0000000000 JM exiting
JMA 2008/07/29 03:10:20 GATEKEEPER_JM_ID
2008-07-29.02:10:13.0000014512.0000000000 JM exiting
TIME: Tue Jul 29 09:45:07 2008
PID: 2102 -- Notice: 6: /opt/globus/sbin/globus-gatekeeper pid=2102
starting at Tue Jul 29 09:45:07 2008
On Jul 29, 2008, at 10:28 AM, Italiano Alessandro wrote:
> Hi,
>
> the problem is still there. This night the gatekeeper service
> crashed again
> As Maarten had indicated, the problem does not seem to be related
> with the
> log messages below, because the other CE is running normally
> and there are the same log messages
>
>
>>> thanks for having indicated a possible reason.
>>> I've found log messages like this
>>>
>>> LCAS 0: lcas_userban.mod-plugin_confirm_authorization():
>>> checking banned users in /opt/glite/etc/lcas/ban_users.db
>>> LCAS 0: lcas_plugin_voms-
>>> plugin_confirm_authorization_from_x509(): Generic verification error
>>> for VOMS (failure): AC not yet (or not anymore) valid.
>>> LCAS 0: 2008-07-24.19:15:43 : lcas_plugin_voms-
>>> plugin_confirm_authorization_from_x509(): voms plugin failed
>>> LCAS 0: lcas.mod-lcas_run_va(): authorization failed for plugin /
>>> opt/
>>> glite/lib/modules/lcas_voms.mod
>>> LCAS 0: lcas.mod-lcas_run_va(): failed
>
>
> Do you have any hint ?
>
> in /var/log/messages I've found these log messages
>
> Jul 29 01:52:24 ce05-lcg gridinfo[2717]: JMA 2008/07/29 01:52:24
> GATEKEEPER_JM_ID 2008-07-29.01:52:24.0000002706.0000000000 has
> GRAM_SCRIPT_JOB_ID 2747 manager type fork
> Jul 29 01:53:00 ce05-lcg glite-lb-interlogd[21516]: error reading
> server egee-rb-01.mi.infn.it reply: get_reply: error reading
> server reply
> Jul 29 01:53:00 ce05-lcg glite-lb-interlogd[21516]: queue_thread:
> get_reply: error reading server reply
> Jul 29 01:53:11 ce05-lcg GRAM gatekeeper[4102]: Got connection
> 131.154.101.14 at Tue Jul 29 01:53:11 2008
> Jul 29 01:53:11 ce05-lcg GRAM gatekeeper[4102]: Authenticated globus
> user: /C=CN/O=HEP/OU=IHEP/CN=Tao Junquan
> Jul 29 01:53:28 ce05-lcg glite-lb-interlogd[21516]: error reading
> server lb106.cern.ch reply: get_reply: error reading server reply
> Jul 29 01:53:28 ce05-lcg glite-lb-interlogd[21516]: queue_thread:
> get_reply: error reading server reply
> Jul 29 01:53:29 ce05-lcg GRAM gatekeeper[4719]: Got connection
> 131.169.223.74 at Tue Jul 29 01:53:29 2008
> Jul 29 01:53:29 ce05-lcg GRAM gatekeeper[4719]: Authenticated globus
> user: /C=CN/O=HEP/OU=IHEP/CN=Tao Junquan
> Jul 29 01:53:52 ce05-lcg nscd: nss_ldap: reconnected to LDAP server
> ldap://ldap-m.cr.cnaf.infn.it after 1 attempt
> Jul 29 01:54:02 ce05-lcg glite-lb-interlogd[21516]: error reading
> server egee-rb-01.mi.infn.it reply: get_reply: error reading
> server reply
> Jul 29 01:54:02 ce05-lcg glite-lb-interlogd[21516]: queue_thread:
> get_reply: error reading server reply
>
> what do they mean ?
> anyway the same log messages are present on the other CE
>
> Ale
>
>
> On Jul 25, 2008, at 11:44 AM, <[log in to unmask]> <[log in to unmask]
> > wrote:
>
>> Ciao Ale,
>>
>>> thanks for having indicated a possible reason.
>>> I've found log messages like this
>>>
>>> LCAS 0: lcas_userban.mod-plugin_confirm_authorization():
>>> checking banned users in /opt/glite/etc/lcas/ban_users.db
>>> LCAS 0: lcas_plugin_voms-
>>> plugin_confirm_authorization_from_x509(): Generic verification error
>>> for VOMS (failure): AC not yet (or not anymore) valid.
>>> LCAS 0: 2008-07-24.19:15:43 : lcas_plugin_voms-
>>> plugin_confirm_authorization_from_x509(): voms plugin failed
>>> LCAS 0: lcas.mod-lcas_run_va(): authorization failed for plugin /
>>> opt/
>>> glite/lib/modules/lcas_voms.mod
>>> LCAS 0: lcas.mod-lcas_run_va(): failed
>>>
>>>
>>> anyway,
>>>
>>> the bug fix for bug 35981 is already in production ?
>>
>> It is on the PPS since July 14. Look at the dependencies at the
>> bottom:
>>
>> https://savannah.cern.ch/bugs/?35981
>>
>> It will probably go to production next week.
>>
>> BTW, these child process crashes normally are harmless. After the
>> fix
>> the bad proxies will be refused anyway.
>>
>
|