I raised a GGUS ticket on this issue last week:

https://gus.fzk.de/pages/ticket_details.php?ticket=35694

This is a pretty critical issue as the SL3 CE is now unsupported.  
However, at the moment there is no progress on this ticket, so Jeremy  
has proposed that we identify all sites affected and raise this at the  
EGEE operations meeting on Monday.

So far it has been seen at Glasgow, Durham and RHUL. If you are
running an x86_64 kernel then just grep for segfault in
/var/log/messages. If you still run a 32-bit kernel then the segfault
is not logged there, so you need to attach strace to the process to
detect the problem.
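
For sites that want to script the x86_64 check, here is a minimal sketch in Python. It simply does a case-insensitive search for "segfault" in a log file; the function name find_segfault_lines is made up for illustration, and the default path is the /var/log/messages mentioned above. It assumes the log is readable by the user running it.

```python
def find_segfault_lines(path="/var/log/messages"):
    """Return every line of the log at `path` that mentions a segfault.

    Hypothetical helper, equivalent in spirit to
    `grep -i segfault /var/log/messages`.
    """
    matches = []
    # errors="replace" avoids a crash on stray non-UTF-8 bytes in the log.
    with open(path, errors="replace") as log:
        for line in log:
            if "segfault" in line.lower():
                matches.append(line.rstrip("\n"))
    return matches
```

Calling find_segfault_lines() with no arguments scans /var/log/messages; an empty list on a 32-bit kernel does not mean the site is unaffected, for the reason given above.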

Cheers

Graeme

On 16 Apr 2008, at 08:50, Phil Roffe wrote:

>> Remember that only the x86_64 kernel logs segfaults so for the i386
>> side you will not see anything in /var/log/messages.
> Good point, I thought that might be the case so running strace shows  
> that Durham does indeed segfault...
>
> [pid  9570] open("/etc/grid-security/grid-mapfile", O_RDONLY) = 9
> [pid  9570] fstat64(9, {st_mode=S_IFREG|0644, st_size=395036, ...}) = 0
> [pid  9570] mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f54000
> [pid  9570] read(9, "\"\t/C=UK/O=eScience/OU=Manchester/L=HEP/CN=colin morey\" .gridpp\n\"/C=AM/O=ArmeSFo/"..., 4096) = 4096
> [pid  9570] read(9, "Felipe Fink Grael\" .atlas\n\"/C=BR/O=ICPEDU/O=UFF BrGrid CA/O=UFRJ/OU=IF/CN=Carla "..., 4096) = 4096
> <snip>
> [pid  9570] read(9, "O=nikhef/CN=Jeffrey Templon\" .prdatlas\n\"/O=dutchgrid/O=users/O=nikhef/CN=Jeroen "..., 4096) = 4096
> [pid  9570] read(9, "\"/biomed/Role=production/Capability=NULL\" .prdbiomed\n\"/biomed/Role=production\" ."..., 4096) = 1820
> [pid  9570] read(9, "", 4096)           = 0
> [pid  9570] close(9)                    = 0
> [pid  9570] munmap(0xb7f54000, 4096)    = 0
> [pid  9570] write(2, "LCAS   0: \tlcas_plugin_voms-plugin_confirm_authorization_from_x509(): Did not fi"..., 129) = 129
> [pid  9570] time(NULL)                  = 1208331608
> [pid  9570] write(2, "LCAS   0: 2008-04-16.08:40:08 : \tlcas_plugin_voms-plugin_confirm_authorization_f"..., 111) = 111
> [pid  9570] write(2, "LCAS   0: lcas.mod-lcas_run_va(): authorization failed for plugin /opt/glite/lib"..., 103) = 103
> [pid  9570] write(2, "LCAS   0: lcas.mod-lcas_run_va(): failed\n", 41) = 41
> [pid  9570] --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>
> However, the way I read the above trace, the segfault occurs
> after the LCAS authorization has failed: the write calls are
> writing the failure messages into the logfile, and then the
> process segfaults. I'm still investigating...
>
> Cheers,
> Phil
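
To compare traces from several sites, it may help to pull out just the calls that immediately precede the SIGSEGV for each process. The following is a rough sketch only: the function name calls_before_segv and the context parameter are hypothetical, and it assumes raw `strace -f` output in the "[pid N] ..." form shown above, with one call per line (not the mail-wrapped form).

```python
import re
from collections import defaultdict, deque

# Matches the "[pid  9570] ..." prefix seen in strace -f output.
PID_LINE = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")


def calls_before_segv(lines, context=5):
    """Map each pid that received SIGSEGV to its last `context` trace lines.

    Hypothetical helper; lines without a "[pid N]" prefix are ignored.
    """
    recent = defaultdict(lambda: deque(maxlen=context))
    result = {}
    for line in lines:
        match = PID_LINE.match(line.rstrip("\n"))
        if not match:
            continue
        pid, rest = int(match.group(1)), match.group(2)
        if rest.startswith("--- SIGSEGV"):
            # Snapshot what this pid did just before the signal arrived.
            result[pid] = list(recent[pid])
        else:
            recent[pid].append(rest)
    return result
```

Fed the Durham trace above, this would return the LCAS write calls for pid 9570, which matches Phil's reading that the failure messages are written out before the crash.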