[Bug-openmcl] problem with timed-wait-on-semaphore ?
Gary Byers
gb at clozure.com
Thu Feb 26 10:24:21 MST 2004
On Thu, 26 Feb 2004, Erik Pearson wrote:
> Hi,
>
> Using "Welcome to OpenMCL Version (Beta: Darwin) 0.14.1!" on a 500 MHz
> iBook.
>
> I'm attempting to write a "with-timeout" and friends for running
> time-constrained code in threads. In the process (you know, the process of
> programming) I've encountered some odd behavior from
> timed-wait-on-semaphore. The test function below should create a semaphore,
> pass it to a function which is run in a new process, which will signal the
> semaphore when (if) it finishes, and back in the original thread we wait on
> the semaphore. This is run by another function 10000 times back to back.
> The timed-wait-on-semaphore should never fail because it is passed a
> timeout of 10000 and the function does very little (which should complete
> well within the time limit).
The limit's in SECONDS, so I'd hope so.
>
> The result is the output below, which prints a line for every
> timed-wait-on-semaphore failure (i.e. timeout), followed by the "bad count"
> and "total count". (Code is below.)
>
> BAD: 1 / 2034
> BAD: 2 / 2092
> BAD: 3 / 2149
> BAD: 4 / 2178
> BAD: 5 / 2647
> BAD: 6 / 2809
> BAD: 7 / 3087
> BAD: 8 / 3507
> BAD: 9 / 3551
> BAD: 10 / 3623
> BAD: 11 / 3638
> BAD: 12 / 4013
> BAD: 13 / 4027
> BAD: 14 / 4212
> BAD: 15 / 5132
> 7528^C
> > Break in process listener(1):
> > While executing: #<Anonymous Function #x513E12E>
> > Type :GO to continue, :POP to abort.
> > If continued: Return from BREAK.
> Type :? for other options.
>
> Oops, I gave up because it was taking too long, and seemed to be slowing
> down.
>
> Of course, timeouts should not really happen under these conditions. So
> what is happening?
>
> Also, something is very slow here, as it takes over 10 minutes to complete
> this test on a 500 MHz iBook, and consumes over 50% CPU. I suppose further
> tests could reveal what the performance culprit is.
>
> Code is:
>
> (defun test ()
>   (let* ((sem (ccl:make-semaphore))
>          (process (ccl:process-run-function
>                    "test"
>                    #'(lambda (s)
>                        (format nil "Do nothing productive.")
>                        (ccl:signal-semaphore s))
>                    sem)))
>     (ccl:timed-wait-on-semaphore sem 10000)))
>
> (defun test2 ()
>   (let ((bad 0))
>     (dotimes (i 10000)
>       (format t "~A~A" #\Return i)
>       (unless (test)
>         (incf bad)
>         (format t "~ABAD: ~A / ~A~&" #\Return bad i)))))
>
>
> Run as
>
> (compile-file "test-timeout.cl")
> (load "test-timeout")
> (test2)
>
>
> Any comments welcome!
>
> Thanks,
>
> Erik.
> _______________________________________________
> Bug-openmcl mailing list
> Bug-openmcl at clozure.com
> http://clozure.com/mailman/listinfo/bug-openmcl
>
>
I haven't tried this yet, but ...
If this code were reorganized so that a single semaphore was allocated
outside the loop, would it still fail?
CCL:MAKE-SEMAPHORE's work is actually done in the kernel (in the C
function new_semaphore() in "ccl:lisp-kernel;thread_manager.c"). It
uses semaphore_create() under Darwin, but nothing checks the return
value from that call.
One of the possible error returns is KERN_RESOURCE_SHORTAGE, which
is Mach's way of telling you that it's too busy to do what you want.
new_semaphore() returns the semaphore that it expects semaphore_create()
to have initialized; if semaphore_create() fails, new_semaphore()'s return
value is either (a) not a semaphore or (b) some other semaphore that
just happened to be sitting at that stack address, perhaps leftover
from the last call.
Something somewhere should clearly notice the fact that new_semaphore()
could fail and either tell you about Mach's unfortunate resource problems
or try again on your behalf. By the time they get exposed to lisp code,
semaphores are more-or-less first class objects; the GC will free them
if they become unreferenced, but there's no other advertised way to
destroy them.
If your code was reorganized to allocate a single semaphore outside of
the loop and then repeatedly signal/wait on it, these particular
scenarios wouldn't likely be involved.
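Something like the following sketch would do that. The names TEST-SHARED
and TEST2-SHARED and the optional iteration count N are made up here for
illustration; the CCL calls are the same ones the original test uses. One
caveat: if a wait ever did time out, the worker's late signal would be
consumed by a later iteration, so the counts are only meaningful while
nothing times out.

```lisp
;; Sketch only: TEST-SHARED, TEST2-SHARED and N are hypothetical names.
(defun test-shared (sem)
  ;; Start a worker that signals the shared semaphore, then wait on it.
  ;; Returns true if the wait succeeded, NIL if it timed out.
  (ccl:process-run-function "test"
                            #'(lambda (s)
                                (format nil "Do nothing productive.")
                                (ccl:signal-semaphore s))
                            sem)
  (ccl:timed-wait-on-semaphore sem 10000))

(defun test2-shared (&optional (n 10000))
  (let ((sem (ccl:make-semaphore))   ; allocated once, outside the loop
        (bad 0))
    (dotimes (i n)
      (unless (test-shared sem)
        (incf bad)
        (format t "BAD: ~A / ~A~%" bad i)))
    ;; Return the number of timed-out waits.
    bad))
```

Run it as (test2-shared). If the timeouts disappear, that points at
semaphore creation rather than at the signal/wait calls themselves.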
None of this stuff does too much on the Lisp side: it's just a pretty
thin wrapper around a few (OS-dependent) system calls.