Hi Maarten,
I re-installed the CE twice manually. The sig 15 problem appeared
shortly after the first installation. The second installation was
successful and the CE has worked fine so far. However, I did catch a
segmentation fault over a 12-hour period. It is of the same type as the
one reported in:
https://gus.fzk.de/pages/ticket_details.php?ticket=35694
I'm up to date with my CAs and verified the integrity (rpm verify) of some
of them (e.g. ca_CERN-Root).
I'm keeping my fingers crossed, as the problem I originally described appeared
after weeks of stable running.
Thank you,
Yves
9057 09:32:49 munmap(0xb7f18000, 4096) = 0
9057 09:32:49 stat64("/etc/grid-security/certificates//d254cc30.1",
0xbfe7a2c4) = -1 ENOENT (No such file or directory)
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 write(2, "LCAS 0: \tlcas_plugin_voms-plug"..., 152) = 152
9057 09:32:49 time(NULL) = 1210149169
9057 09:32:49 write(2, "LCAS 0: 2008-05-07.09:32:49 : "..., 111) = 111
9057 09:32:49 write(2, "LCAS 0: lcas.mod-lcas_run_va()"..., 103) = 103
9057 09:32:49 write(2, "LCAS 0: lcas.mod-lcas_run_va()"..., 41) = 41
9057 09:32:49 --- SIGSEGV (Segmentation fault) @ 0 (0) ---
On Sat, 3 May 2008, Yves Coppens wrote:
> Hi Maarten,
>
> I'm re-installing the CE with the proper umask in cfengine.
> I'm not sure this is the cause of the problem, as I had previously installed
> the CE manually and it had run fine for a while until last Thursday, when
> this occurred and prompted my fresh install.
> But I agree the CE should be re-installed to eliminate any problems I
> introduced in the last install.
>
> Thank you,
>
> Yves
>
>
> On Sat, 3 May 2008, Maarten Litmaath wrote:
>
>> Hi Yves,
>> a few comments inline.
>>
>>> The BDII (replacement of globus-mds and not the site BDII) kept dying.
>>> It happened every hour just after the gatekeeper had received a kill
>>> signal and restarted itself.
>>
>> The CE is not configured to restart any service periodically.
>> I would not be surprised if this were a result of the problem described
>> in the other thread. Please _reinstall_ your CE with a correct umask,
>> so that we may avoid a wild goose chase...
>>
>>> [...]
>>>
>>> The status of the marshals is misleading, as they're actually running
>>> fine, as can be seen in the process list above.
>>>
>>> [root@epgce2 ~]# service globus-gass-cache-marshal status
>>> globus-gass-cache-marshal dead but pid file exists
>>> [root@epgce2 ~]# service globus-job-manager-marshal status
>>> globus-job-manager-marshal dead but pid file exists
>>
>> https://savannah.cern.ch/bugs/index.php?36224
>>
>