[Bug-openmcl] processes not gc'd?
erik at adaptations.com
Mon Mar 1 09:16:29 MST 2004
Thanks for the detailed info, Gary -- comments below.
Gary Byers wrote:
> On Sun, 29 Feb 2004, Erik Pearson wrote:
>>Gary Byers wrote:
>>>On Sun, 29 Feb 2004, Erik Pearson wrote:
>>>>It looks like the creation of threads leads to a lot of memory
>>>>allocation that is not recovered by the gc. Here is a small test
>>>>function which creates n threads. I've tried it with n of 100, 1000,
>>>>10000, etc. and inspected memory allocation with top, and it just grows
>>>I found growth at 10000; I hadn't seen it earlier.
>>>When a thread exits, it deallocates the resources (stacks and semaphores)
>>>it had allocated on creation. This is pretty simple code and it seems
>>>to work reliably in simple cases.
>>Indeed. With DEBUG_THREAD_CLEANUP turned on (and recompiling the
>>kernel), it looks as if the thread termination code is not being run
>>correctly in the test code. If threads are created one at a time slowly
>>(by hand, once every few seconds), the termination code is executed each
>>time. However, when threads are created in rapid succession (with some
>>overlap) as in the test code, it looks like the cleanup code runs a few
>>times, and then stops, after which it never runs. Perhaps something is
>>deadlocked? I don't know enough about this stuff or the codebase to say
> There's a global doubly-linked list (called "all_areas") whose elements
> are data structures that describe memory regions that the GC is interested
> in (stacks, etc.). Threads have been splicing their stack areas into
> and out of that list (on creation and termination) without any regard
> for concurrency issues.
Is it a big task to make this stuff thread-safe? And do you think it is
a long road to get the codebase fully thread-safe?
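
For reference while discussing this, here is a minimal sketch of what
lock-protected splicing could look like, assuming a single global pthread
mutex guards the list. The names (area, add_area, remove_area, count_areas,
stress) and the struct layout are hypothetical stand-ins, not the kernel's
actual definitions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's area descriptor. */
typedef struct area {
    struct area *pred;
    struct area *succ;
} area;

/* Circular doubly-linked list with a sentinel header and one global lock. */
static area all_areas_head = { &all_areas_head, &all_areas_head };
static pthread_mutex_t all_areas_lock = PTHREAD_MUTEX_INITIALIZER;

/* Splice an area in at the front of the list (thread creation). */
void add_area(area *a)
{
    pthread_mutex_lock(&all_areas_lock);
    a->succ = all_areas_head.succ;
    a->pred = &all_areas_head;
    all_areas_head.succ->pred = a;
    all_areas_head.succ = a;
    pthread_mutex_unlock(&all_areas_lock);
}

/* Splice an area out of the list (thread termination). */
void remove_area(area *a)
{
    pthread_mutex_lock(&all_areas_lock);
    assert(a->pred != NULL && a->succ != NULL);
    a->pred->succ = a->succ;
    a->succ->pred = a->pred;
    a->pred = a->succ = NULL;
    pthread_mutex_unlock(&all_areas_lock);
}

/* Walk the list under the lock, as any other reader would have to. */
size_t count_areas(void)
{
    size_t n = 0;
    pthread_mutex_lock(&all_areas_lock);
    for (area *p = all_areas_head.succ; p != &all_areas_head; p = p->succ)
        n++;
    pthread_mutex_unlock(&all_areas_lock);
    return n;
}

/* Each thread repeatedly splices its own area in and out. */
static void *churn(void *arg)
{
    area mine;
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        add_area(&mine);
        remove_area(&mine);
    }
    return NULL;
}

/* Run up to 64 churning threads; returns the areas left on the list. */
size_t stress(int nthreads)
{
    pthread_t tids[64];
    int created = 0;
    if (nthreads > 64)
        nthreads = 64;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[created], NULL, churn, NULL) == 0)
            created++;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    return count_areas();
}
```

With the lock in place, stress(8) should leave the list empty. Anything else
that walks the list (the GC included) would need to take the same lock, or
run only while the other threads are suspended.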
> That can certainly lead to all kinds of bad behavior. It's not clear
> to me whether the fprintf's enabled by DEBUG_THREAD_CLEANUP really tell
> us what's going on, since I'm not sure how thread-safe fprintf is.
Another interesting fprintf "debug" clue is that the cleanup mischief
seems to occur after a cluster of interleaved threads. That is, the test
code will run fine while there are serialized or even a couple of
interleaved threads, but the cleanup seems to stop after an fprintf
reports that more than three threads are active and perhaps in the process of
being cleaned up. Not very scientific, though.
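
One way to make this less dependent on fprintf would be to count events with
C11 atomics and compare the totals afterwards. The following is only a
sketch: the counters and the functions note_cleanup_enter, note_cleanup_exit,
thread_body and missing_cleanups are hypothetical, and the real teardown code
is represented by a comment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical instrumentation counters (zero-initialized). */
atomic_int threads_started;
atomic_int cleanups_run;
atomic_int in_cleanup;      /* threads currently inside cleanup */
atomic_int max_in_cleanup;  /* highest overlap observed so far */

/* Record entry into cleanup and update the high-water mark. */
void note_cleanup_enter(void)
{
    int now = atomic_fetch_add(&in_cleanup, 1) + 1;
    int seen = atomic_load(&max_in_cleanup);
    while (now > seen &&
           !atomic_compare_exchange_weak(&max_in_cleanup, &seen, now))
        ;  /* a failed exchange reloads 'seen' with the current value */
}

/* Record that one cleanup ran to completion. */
void note_cleanup_exit(void)
{
    atomic_fetch_sub(&in_cleanup, 1);
    atomic_fetch_add(&cleanups_run, 1);
}

static void *thread_body(void *arg)
{
    (void)arg;
    atomic_fetch_add(&threads_started, 1);
    note_cleanup_enter();
    /* ... the real teardown (stacks, semaphores) would run here ... */
    note_cleanup_exit();
    return NULL;
}

/* Run up to 64 threads; returns how many started but never cleaned up. */
int missing_cleanups(int nthreads)
{
    pthread_t tids[64];
    int created = 0;
    if (nthreads > 64)
        nthreads = 64;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[created], NULL, thread_body, NULL) == 0)
            created++;
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    return atomic_load(&threads_started) - atomic_load(&cleanups_run);
}
```

A nonzero result from missing_cleanups would mean some threads never reached
the end of cleanup, and max_in_cleanup records how many were in cleanup at
once, without relying on interleaved output.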
> If all of this stuff works reliably, there's still a leak: for no
> good reason, the objects that sort of sit between a lisp PROCESS and
> the underlying thread are kept in a weak list that's marked as being
> finalizable. (This is leftover from the old cooperative scheduler,
> where the only way to know that a thread died - and the only way
> to free up its stacks and other resources - was to be told by the
> GC that it's about to become garbage.) The short version is that
> the GC thinks that a lisp thread needs to be held onto for finalization/
> termination, but nothing bothers to look at the list of "finalized
> threads" anymore (there's no reason to do so.)
I removed what I think is this code with no discernible effect (neither
good nor bad.) (The old MCL comments in the form of an essay are in