I raised a GGUS ticket on this issue last week:
https://gus.fzk.de/pages/ticket_details.php?ticket=35694
This is a pretty critical issue as the SL3 CE is now unsupported.
However, at the moment there is no progress on this ticket, so Jeremy
has proposed that we identify all sites affected and raise this at the
EGEE operations meeting on Monday.
So far it has been seen at Glasgow, Durham and RHUL. If you are
running an x86_64 kernel then just grep for segfault in
/var/log/messages. If you still run a 32-bit kernel then you need to
attach strace to the process to detect the problem.
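
If you want to script the check across several nodes, something along
these lines might help. It is only a rough sketch: the function names
are made up for illustration, the syslog check is a plain
case-insensitive match on "segfault", and the strace check assumes the
"--- SIGSEGV" marker format shown in Phil's trace below (with or
without the "[pid N]" prefix that strace -f adds).

```python
import collections
import re

# Assumed marker format, taken from the trace quoted in this thread:
#   [pid 9570] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
SIGSEGV_RE = re.compile(r"^(?:\[pid\s+(\d+)\]\s+)?--- SIGSEGV\b")


def kernel_segfaults(lines):
    """Return the syslog lines that mention a segfault.

    Only useful on x86_64 kernels; the i386 kernel logs nothing.
    """
    return [line.rstrip("\n") for line in lines if "segfault" in line.lower()]


def strace_segfaults(lines, context=3):
    """Return a (pid, preceding_lines) pair for each SIGSEGV in strace output.

    pid is a string, or None when the line carries no [pid N] prefix.
    preceding_lines holds up to `context` lines seen just before the signal.
    """
    hits = []
    history = collections.deque(maxlen=context)
    for line in lines:
        line = line.rstrip("\n")
        match = SIGSEGV_RE.match(line)
        if match:
            hits.append((match.group(1), list(history)))
        history.append(line)
    return hits
```

For example, open /var/log/messages and pass the file object to
kernel_segfaults(), or save the strace output to a file and pass that
to strace_segfaults() to see which calls came just before the signal.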
Cheers
Graeme
On 16 Apr 2008, at 08:50, Phil Roffe wrote:
>> Remember that only the x86_64 kernel logs segfaults so for the i386
>> side you will not see anything in /var/log/messages.
> Good point, I thought that might be the case so running strace shows
> that Durham does indeed segfault...
>
> [pid 9570] open("/etc/grid-security/grid-mapfile", O_RDONLY) = 9
> [pid 9570] fstat64(9, {st_mode=S_IFREG|0644, st_size=395036, ...})
> = 0
> [pid 9570] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|
> MAP_ANONYMOUS, -1, 0) = 0xb7f54000
> [pid 9570] read(9, "\"\t/C=UK/O=eScience/OU=Manchester/L=HEP/
> CN=colin morey\" .gridpp\n\"/C=AM/O=ArmeSFo/"..., 4096) = 4096
> [pid 9570] read(9, "Felipe Fink Grael\" .atlas\n\"/C=BR/O=ICPEDU/
> O=UFF BrGrid CA/O=UFRJ/OU=IF/CN=Carla "..., 4096) = 4096
> <snip>
> [pid 9570] read(9, "O=nikhef/CN=Jeffrey Templon\" .prdatlas\n\"/
> O=dutchgrid/O=users/O=nikhef/CN=Jeroen "..., 4096) = 4096
> [pid 9570] read(9, "\"/biomed/Role=production/Capability=NULL
> \" .prdbiomed\n\"/biomed/Role=production\" ."..., 4096) = 1820
> [pid 9570] read(9, "", 4096) = 0
> [pid 9570] close(9) = 0
> [pid 9570] munmap(0xb7f54000, 4096) = 0
> [pid 9570] write(2, "LCAS 0: \tlcas_plugin_voms-
> plugin_confirm_authorization_from_x509(): Did not fi"..., 129) = 129
> [pid 9570] time(NULL) = 1208331608
> [pid 9570] write(2, "LCAS 0: 2008-04-16.08:40:08 :
> \tlcas_plugin_voms-plugin_confirm_authorization_f"..., 111) = 111
> [pid 9570] write(2, "LCAS 0: lcas.mod-lcas_run_va():
> authorization failed for plugin /opt/glite/lib"..., 103) = 103
> [pid 9570] write(2, "LCAS 0: lcas.mod-lcas_run_va(): failed\n",
> 41) = 41
> [pid 9570] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>
> However, the way I read the above trace is that the segfault occurs
> after the LCAS authorization has failed: the write lines are
> writing the failure messages into the logfile, then it segfaults.
> I'm still investigating...
>
> Cheers,
> Phil