[Bug-openmcl] Debugging crashes of dppccl
Gary Byers
gb at clozure.com
Thu Mar 4 18:02:03 MST 2004
The lisp tries to handle some types of exceptions. On OSX, it basically
tries to handle the Mach exceptions raised by attempts to execute illegal
instructions, PPC trap instructions, bad memory accesses, and floating-point
exceptions. If any of these things are raised and the handler fails to
handle them, you -should- wind up in OpenMCL's kernel debugger (with at
least a little ability to poke around and see how you got there.) If
the --batch flag is in effect, the kernel debugger exits without entering
a break loop.
At a slightly different level, the lisp handles a couple of POSIX signals
(SIGINT and a few otherwise unused signal numbers used by thread suspend/
resume and by PROCESS-INTERRUPT.) I believe that it also ignores SIGPIPE.
An unhandled exception or fatal signal will kill the process, without
any of this handler code intervening in that. The Console application
has an option to produce crash logs in that case.
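As a rough illustration of the fatal-signal case (a sketch only, not
anything taken from OpenMCL): a parent process can tell from the exit
status which signal killed a child, which is how the shell comes to print
something like "Abort trap" for SIGABRT. The helper names below are
hypothetical, and Python is used just for brevity.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as -signum.
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number Python has a name for.
            name = "signal %d" % signum
        return "killed by " + name
    return "exited with status %d" % returncode


def run_and_report(argv):
    """Run argv to completion and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_exit(completed.returncode)
```

Something like run_and_report(["./dppccl", "--batch"]) (the path is an
assumption) would then at least record which signal ended the process,
even when no core file or crash log is left behind.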
Ordinarily, the parent process (the shell) inhibits core dumps by
setting the corresponding resource limit to 0. I believe that I've
been able to change this and get crashes to produce core files; GDB
didn't seem to have any idea of what they were.
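For reference, the resource limit in question is RLIMIT_CORE. A minimal
sketch, assuming a POSIX system, of raising the soft limit as far as the
hard limit allows before starting the lisp; the function name is
hypothetical, and Python is used just for brevity.

```python
import resource


def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Returns the (soft, hard) pair now in effect.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit only up to the
    # hard limit, so use the hard limit for both values.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

Child processes inherit the limit, so this has to run in the parent
before the lisp is launched. When the hard limit is unlimited, the effect
is the same as "ulimit -c unlimited" in the shell. Whether a usable core
file actually appears still depends on the system.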
The version of GDB that ships with XCode/Panther is supposed to have
enhanced support for "focusing" on a specific thread while debugging.
I find that whatever GDB's doing seems to interfere with normal thread
behavior; sometimes, interrupting GDB with a SIGINT from the keyboard
causes it to thaw out whatever it'd frozen.
On Thu, 4 Mar 2004, Erik Pearson wrote:
> Is there a procedure for compiling dppccl so that it will produce useful
> information upon crashing? I'm testing an app that runs continuously (i.e.
> a server), waking up a process every few minutes to run a set of registered
> functions. It is doing website monitoring, so each function goes out,
> fetches a web page, looks at it, and saves the results. The server then
> will write to a file or send an email alert.
>
> After this process repeats about 250-300 times (at an interval of 15
> seconds for testing...) dppccl will just up and die, printing no messages,
> and returning to the shell, and leaving behind no core file. Well, that is
> not quite true. The last time this happened the message "Abort trap" was
> printed.
>
> This is running under the bleeding edge, checked out a couple of days ago
> with a few manual tweaks to the threading code.
>
> This may just go away if it is just Gary fiddling with something. But in case
> not, or in any case, I'm wondering how to get diagnostic information on the
> cause of such crashes. I've looked _very_ briefly at gdb, and had it set up
> to monitor the dppccl process, but it caused dppccl to hang after a few
> minutes -- dppccl continued just fine after gdb was exited.
>
> I'm also going to be looking at different methods of exercising the code
> and isolating the problem, but this is made difficult since it happens
> after many iterations, and the code that is being run is pretty "wild" in
> that it interacts with the network layer in multiple ways involving threads
> and timeouts. (e.g. I've run the code also with > 700 iterations when
> monitoring a simple, close website which has never timed out and which is
> on the local subnet.)
>
> Thanks,
>
> Erik.
>
> --
> Erik Pearson
> Adaptations
> "Adaptation: It's not just for finches anymore."