[Openmcl-devel] Re: Hemlock performance investigation
Andrew P. Lentvorski, Jr.
bsder at mail.allcaps.org
Fri Aug 27 19:47:31 EDT 2004
On Aug 27, 2004, at 8:58 AM, Hamilton Link wrote:
> character-at-index is being called across the whole file on every
> keystroke. Opening a file, it looks like it's being called about 34
> times per character in the file (which is a little odd), but then it
> settles down to scanning across the entire file once every keystroke.
Would the buffer happen to be one monolithic Unicode NSString (or is
something expecting to be accessing an NSString)? If so, O(n) accesses
occur for every call to character-at-index since this is Unicode. It
has to walk the entire string from the beginning since the size in
bytes of a character is not fixed. If you scan every character in the
file, then that is also O(n).
You wind up with an O(n^2) algorithm. It doesn't matter *how* much you
speed up the constant factors. An O(n^2) algorithm is always going to
be slow. A quick test of this is whether an operation at the beginning
of the file is much faster than at the end. At one end, things will be
pretty close to O(n), and fast. At the other, things should be very
close to O(n^2), and slow. If you have an O(n) algorithm, then things
should range from O(n), fast, on one side to O(1), blazing, on the
Firing up the (require "COCOA") system on "cocoa-editor.lisp" shows
that insertions at the beginning of the file are pretty slow, but
insertions at the end of the file are pretty fast. This is backward
from what I would expect, but I still smell an O(n^2) problem.
There are probably two options:
1) The better option is probably to use an NSTextView. People seem to
be doing very nice editors with them. (cf. subethaedit and smultron).
2) The quicker option is probably to subclass NSString, implement a
tree structure so that index accesses aren't O(n), and
use that instead.
> At 4.5e-5 s per call, in a 5000-character file, this makes about a
> .25 second delay per keystroke and the bulk of the delay per, and adds
> up to tens of seconds in the course of typing a couple of lines, so
> it's where most of the time is going.
45e-6 s per call is not that bad. It's not great, but it's not bad.
Especially for something which is crossing interface boundaries. I
would expect something closer to 4.5e-6, but even so, that would still
be painful for a large file.
More information about the Openmcl-devel