[Bug-openmcl] processes not gc'd?
Erik Pearson
erik at adaptations.com
Mon Mar 1 09:16:29 MST 2004
Thanks for the detailed info, Gary -- comments below.
Gary Byers wrote:
>
> On Sun, 29 Feb 2004, Erik Pearson wrote:
>
>
>>
>>Gary Byers wrote:
>>
>>
>>>On Sun, 29 Feb 2004, Erik Pearson wrote:
>>>
>>>
>>>
>>>>Hi,
>>>>
>>>>It looks like the creation of threads leads to a lot of memory
>>>>allocation that is not recovered by the gc. Here is a small test
>>>>function which creates n threads. I've tried it with n of 100, 1000,
>>>>10000, etc. and inspected memory allocation with top, and it just grows
>>>>and grows...
>>>
>>>
>>>I found growth at 10000; I hadn't seen it earlier.
>>>
>>>When a thread exits, it deallocates the resources (stacks and semaphores)
>>>it had allocated on creation. This is pretty simple code and it seems
>>>to work reliably in simple cases.
>>
>>Indeed. With DEBUG_THREAD_CLEANUP turned on (and recompiling the
>>kernel), it looks as if the thread termination code is not being run
>>correctly in the test code. If threads are created one at a time slowly
>>(by hand, once every few seconds), the termination code is executed each
>>time. However, when threads are created in rapid succession (with some
>>overlap) as in the test code, it looks like the cleanup code runs a few
>>times, and then stops, after which it never runs. Perhaps something is
>>deadlocked? I don't know enough about this stuff or the codebase to say
>>anything intelligent!
>>
>
>
> There's a global doubly-linked list (called "all_areas") whose elements
> are data structures that describe memory regions that the GC is interested
> in (stacks, etc.). Threads have been splicing their stack areas into
> and out of that list (on creation and termination) without any regard
> for concurrency issues.
Is it a big task to make this stuff thread-safe? And do you think it is
a long road to get the codebase fully thread-safe?
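
For illustration only, here is a minimal sketch of serializing the
splices with a lock. It is written in Python rather than the kernel's C,
and it assumes a single global lock around the list is acceptable. The
Area and AreaList names and the run_threads helper are hypothetical, not
taken from the OpenMCL sources.

```python
import threading


class Area:
    """Hypothetical stand-in for one entry in a list like all_areas."""

    def __init__(self, name):
        self.name = name
        self.prev = self
        self.next = self


class AreaList:
    """Circular doubly-linked list whose splices are serialized by one lock."""

    def __init__(self):
        self.head = Area("head")      # sentinel node, never removed
        self.lock = threading.Lock()  # assumption: one lock guards the whole list

    def add(self, area):
        # Splice the area in just before the sentinel (i.e. at the tail).
        with self.lock:
            last = self.head.prev
            area.prev, area.next = last, self.head
            last.next = area
            self.head.prev = area

    def remove(self, area):
        # Unlink the area and leave it pointing at itself.
        with self.lock:
            area.prev.next = area.next
            area.next.prev = area.prev
            area.prev = area.next = area

    def count(self):
        # Walk the list under the lock so no splice can happen mid-walk.
        with self.lock:
            n, node = 0, self.head.next
            while node is not self.head:
                n += 1
                node = node.next
            return n


def run_threads(areas, n):
    """Start n threads that each splice a stack area in and back out."""

    def body(i):
        stack = Area(f"stack-{i}")
        areas.add(stack)     # what happens on thread creation
        areas.remove(stack)  # what happens on thread termination

    threads = [threading.Thread(target=body, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With the lock held around every splice and every walk, the list should
be empty again once all the threads have been joined, however much they
overlap. Anything that walks the list (as the GC would) has to take the
same lock, or be sure that no thread can splice while it runs.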
>
> That can certainly lead to all kinds of bad behavior. It's not clear
> to me whether the fprintf's enabled by DEBUG_THREAD_CLEANUP really tell
> us what's going on, since I'm not sure how thread-safe fprintf is.
Another interesting fprintf "debug" clue is that the cleanup mischief
seems to occur after a cluster of interleaved threads. That is, the test
code runs fine while the threads are serialized, or even while a couple
of them are interleaved, but the cleanup seems to stop once the fprintf
output shows more than three threads active and perhaps in the process
of being cleaned up. Not very scientific, though.
>
> If all of this stuff works reliably, there's still a leak: for no
> good reason, the objects that sort of sit between a lisp PROCESS and
> the underlying thread are kept in a weak list that's marked as being
> finalizable. (This is leftover from the old cooperative scheduler,
> where the only way to know that a thread died - and the only way
> to free up its stacks and other resources - was to be told by the
> GC that it's about to become garbage.) The short version is that
> the GC thinks that a lisp thread needs to be held onto for finalization/
> termination, but nothing bothers to look at the list of "finalized
> threads" anymore (there's no reason to do so.)
I removed what I think is this code, with no discernible effect (neither
good nor bad). (The old MCL comments, in the form of an essay, are in
there too!)
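
As a rough analogy only (Python, not the OpenMCL mechanism), the sketch
below contrasts a registry that holds on to its entries with one that
holds them weakly. ThreadRecord and both registries are hypothetical
names standing in for the objects between a lisp PROCESS and its thread.

```python
import gc
import weakref


class ThreadRecord:
    """Hypothetical stand-in for the per-thread bookkeeping object."""

    def __init__(self, name):
        self.name = name


# A registry that keeps a strong reference to every record. Like a list
# of "finalized" objects that nothing ever drains, it only grows.
retained = []

# A registry that holds its entries weakly, so a record disappears from
# it as soon as nothing else refers to the record.
weakly_held = weakref.WeakSet()


def make_records(n, register):
    """Create n records, hand each to register, and drop our own reference."""
    for i in range(n):
        register(ThreadRecord(f"thread-{i}"))


make_records(100, retained.append)
make_records(100, weakly_held.add)
gc.collect()  # not strictly needed in CPython, but makes the intent explicit
```

After this runs, every record handed to the first registry is still
alive, while the second registry is empty. If the leak described above
works the way I understand it, the kernel's list behaves like the first
case.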