If you can get a traceback for IO errors or FP exceptions, then in
theory you can get one for SEGV too. It's a matter of convincing the
runtime to catch the signal and hook it up to the traceback mechanism
it already has for the other signals, and I'm sure both the runtime
and the convincing are platform- and vendor-specific.

But there's a caveat; isn't there always... It is very VERY likely you
can avoid additional IO errors and FP exceptions while doing the
traceback, for almost all causes of the errors/exceptions. A SEGV,
however, generally indicates memory is broken. Depending on the
extent of the breakage, it could have corrupted the heap, global
variables and/or the stack, any of which can cause the traceback
itself to trip and go splat. That said, lots of times the corruption
is minor and a traceback is dandy.
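
To make the idea concrete, here is a minimal sketch of a runtime that
already hooks SEGV into its traceback mechanism. It uses Python's
standard faulthandler module purely as an illustration; it is not the
runtime discussed above, and the names CHILD, crash and
run_crashing_child are made up for this example. It assumes a POSIX
system, and the child process sends SIGSEGV to itself rather than
genuinely corrupting memory.

```python
import subprocess
import sys

# Source of a child program (hypothetical, for illustration only).
# It enables faulthandler and then delivers SIGSEGV to itself.
CHILD = """
import faulthandler, os, signal
faulthandler.enable()   # handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL
def crash():
    os.kill(os.getpid(), signal.SIGSEGV)
crash()
"""

def run_crashing_child():
    """Run the child and return (exit status, whatever it wrote to stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr

if __name__ == "__main__":
    status, report = run_crashing_child()
    print("child exit status:", status)
    print(report)
```

The handler in this sketch sidesteps part of the caveat by using only
signal-safe calls and not allocating on the heap, so a damaged heap is
less likely to take the traceback down with it. A smashed stack can
still defeat it.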

Another option is to use core files; let the system dump core (quite
an anachronism) and then examine the core file with a debugger.
Depending on your core file settings, you may have to delete core
files with some frequency, and a multi-process job on a cluster
makes core file maintenance worse. A caveat here... Some systems
don't dump multi-threaded core files (Linux 2.2 kernels didn't) and
some don't get it right (Linux 2.4 on IA64 didn't get all the
register sets correct). I'm sure these will improve over time, and
some systems probably do it right already, provided your debugger is
equally capable.
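
The "core file settings" mentioned above usually include the per-process
core size limit. As a rough sketch, assuming a POSIX system and using
Python's standard resource module only for illustration, a process can
raise or zero its own soft limit before doing any real work. The
function names are hypothetical, and where the core file ends up (or
whether it is piped to a crash collector) depends on system
configuration this sketch does not touch.

```python
import resource

def enable_core_dumps():
    """Raise the soft core-size limit to the hard limit; return the new limits."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)

def disable_core_dumps():
    """Set the soft core-size limit to zero, e.g. to avoid core file pileup."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)

if __name__ == "__main__":
    print("core limits (soft, hard):", enable_core_dumps())
```

On a cluster, one possible compromise is to call something like
enable_core_dumps() on a single rank and disable_core_dumps() on the
rest, so that a crash leaves one core file to look at rather than
hundreds to clean up.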