[Openmcl-devel] ARM testing
lisp at davidb.org
Fri Jan 28 00:52:52 CST 2011
On Thu, Jan 27 2011, Gary Byers wrote:
> On Thu, 27 Jan 2011, David Brown wrote:
> In that case, we're running into the write-protected guard pages at the end
> of the listener thread's control stack; the same sequence of events happens
> for me if I do:
> ? (process-interrupt ccl::*initial-process* #'foo 0)
> If foreign code (including the GC, including rmark()) tries to write
> to those guard pages we expect to get a SIGSEGV; in general, it's harder to
> recover from an exception in foreign code, and I think that we just drop
> into the kernel debugger in that case. (Or at least try to.)
> Do you get a STACK-OVERFLOW condition signaled in lisp ? Or does this just
> die with SIGBUS ? Or does something else happen ?
It dies with a SIGBUS.
> Here's another theory that makes so much sense (at the moment) that it's probably
> completely wrong: it's possible that recent Linux kernels are refusing to map
> the last page of a stack region and signaling SIGBUS (at least on ARM) when
> attempts are made to write to that page. (That's actually reminiscent of a
> Linux kernel change made last summer, where mmap() with the MAP_GROWSDOWN option
> refused to map the lowest page in the region it returned; that redefinition of
> mmap's behavior was - according to my possibly garbled understanding - related
> to stack growth/overflow detection.)
Author: Linus Torvalds <torvalds at linux-foundation.org>
Date: Thu Aug 12 17:54:33 2010 -0700
mm: keep a guard page below a grow-down stack segment
This is a rather minimally invasive patch to solve the problem of the
user stack growing into a memory mapped area below it. Whenever we fill
the first page of the stack segment, expand the segment down by one page.
Now, admittedly some odd application might _want_ the stack to grow down
into the preceding memory mapping, and so we may at some point need to
make this a process tunable (some people might also want to have more
than a single page of guarding), but let's try the minimal approach first.
Tested with trivial application that maps a single page just below the
stack, and then starts recursing. Without this, we will get a SIGSEGV
_after_ the stack has smashed the mapping. With this patch, we'll get a
nice SIGBUS just as the stack touches the page just above the mapping.
Requested-by: Keith Packard <keithp at keithp.com>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>
It blocks the stack expansion when growing down if it would bump into an
adjacent page. The patch explicitly causes a SIGBUS on the guard page.
I guess CCL is the "odd application". I'm curious why I'm not seeing
this on x86/amd64, since I'm running a kernel with the same change there;
even the process-interrupt above nicely produces a STACK-OVERFLOW condition.
> At the moment, I like this theory (but of course I liked the one from the other
> day, too.) One way of testing it is to move GCstack_limit a page higher; it's
> set near the start of the function gc() in lisp-kernel/gc-common.c:
> /* ignore the other case of the containing 'if'. This is around
> line 1394 */
> GCstack_limit = (natural)(tcr->cs_limit)+(natural)page_size;
> If we change 'page_size' to '2*page_size' in that line and recompile
> the kernel, does the problem (loading the bootstrapping image) persist ?
This fixes the bootstrap issue. The process-interrupt still dies with
the SIGBUS, so maybe it just needs to be handled.
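The effect of that one-line change can be sketched in isolation. This is not the lisp kernel's code: the typedef for natural, the helper names, and the idea that the marker compares its stack pointer against the limit before recursing are assumptions made for illustration, and should be checked against gc-common.c.
```c
#include <stdint.h>

/* Assumed stand-in for the kernel's 'natural': an unsigned, pointer-sized integer. */
typedef uintptr_t natural;

/* Hypothetical helper: lowest address the GC may use on the control stack,
   leaving 'guard_pages' pages above cs_limit untouched. */
natural gc_stack_limit(natural cs_limit, natural page_size, natural guard_pages)
{
    return cs_limit + guard_pages * page_size;
}

/* Hypothetical helper: nonzero while the stack pointer is still above the limit. */
int gc_can_recurse(natural sp, natural limit)
{
    return sp > limit;
}
```
With a 4096-byte page and cs_limit at 0x10000, one guard page puts the limit at 0x11000 and two put it at 0x12000, so a stack pointer at 0x11800 is allowed with 'page_size' but refused with '2*page_size'. That is consistent with the GC no longer touching the page the kernel now treats as its own guard.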
I was going to try running this on an earlier kernel, but I haven't been
able to get the old kernel to be stable enough to even boot fully.
Given the commit above, I'm guessing that it would work fine on a kernel
that predates it.