[Openmcl-devel] Naming convention for character-encodings (UTF) or :utf32 vs. :utf-32
gb at clozure.com
Sat Sep 29 15:08:51 EDT 2007
I think that I tried to be careful with this, trying to use the
preferred names from <http://www.iana.org/assignments/character-sets>
(well, keywords based on those preferred names). See what happens
when you're careful! (I should have known better.)
You're right that "big endian utf 32" should be called "utf-32be"
and likewise for the little-endian variant. I'll change that.
Trying to strictly adhere to a standard is good, but there are
lots of standards. What ICU does when trying to match a provided
character encoding name with one of those that it knows about
is to ignore case and ignore dashes and underscores, so :utf-8
and :utf8 are treated identically. This seems like a good approach;
it's unlikely that someone who says :utf8 is referring to something
other than "8-bit Unicode Transformation Format", and it's not
all that useful to claim to have never heard of :utf8.
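
A minimal sketch of the matching rule described above (ignore case, dashes, and underscores), written in portable Common Lisp. The function names normalize-encoding-name and encoding-name-equal are hypothetical, chosen for illustration only. They are not part of OpenMCL or ICU, and ICU's real alias matching may apply further rules not covered here.

```lisp
;; Hypothetical helpers for illustration; not an OpenMCL or ICU API.
(defun normalize-encoding-name (name)
  "Return NAME (a string or symbol) upcased, with dashes and underscores removed."
  (remove-if (lambda (ch) (member ch '(#\- #\_)))
             (string-upcase (string name))))

(defun encoding-name-equal (a b)
  "True if A and B name the same encoding under the loose matching rule."
  (string= (normalize-encoding-name a)
           (normalize-encoding-name b)))
```

Under this rule, (encoding-name-equal :utf-8 :utf8) and (encoding-name-equal :utf32-be :utf-32be) are both true, so a lookup keyed on the normalized name would accept either spelling.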
On Sat, 29 Sep 2007, Ralf Stoye wrote:
> trying to use utf-8 with portable-aserve i got errors about unknown
> I found out that most encodings are named :utf-xx but the :utf-32be
> and :utf-32le versions are mixed up:
> I vote for a consistent use of the :utf-xx[xx] version.
> (defmethod stream-external-format ((s character-stream))
>   (make-external-format :character-encoding #+big-endian-target :utf32-be
>                         #+little-endian-target :utf32-le
>                         :line-termination :unix))
> (define-character-encoding #+big-endian-target :utf-32be
>                            #-big-endian-target :utf32-le .....
> Ralf Stoye