From: David Kastrup <dak@gnu.org>
To: Marko Rauhamaa <marko@pacujo.net>
Cc: guile-user@gnu.org
Subject: Re: guile can't find a chinese named file
Date: Mon, 30 Jan 2017 20:27:34 +0100
Message-ID: <87y3xspcux.fsf@fencepost.gnu.org>
In-Reply-To: <87zii8bcdw.fsf@elektro.pacujo.net> (Marko Rauhamaa's message of "Mon, 30 Jan 2017 21:01:31 +0200")
Marko Rauhamaa <marko@pacujo.net> writes:
> David Kastrup <dak@gnu.org>:
>
>> Marko Rauhamaa <marko@pacujo.net> writes:
>>> Guile's mistake was to move to Unicode strings in the operating system
>>> interface.
>>
>> Emacs uses an UTF-8 based encoding internally [...]
>
> C uses 8-bit characters. That is a model worth emulating.
That's Guile-1.8. Guile-2 uses either Latin-1 or UTF-32 in its string
internals, either Latin-1 or UTF-8 in its string API, and UTF-8 in its
string port internals.
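To make the "either Latin-1 or UTF-32" storage idea concrete, here is a minimal sketch in Python (not Guile code, and not a description of how Guile actually implements it). The names storage_encoding and stored_size are made up for this illustration; the only assumption is that a string is stored one byte per character when every code point fits in Latin-1, and four bytes per character otherwise.

```python
def storage_encoding(text: str) -> str:
    """Pick the narrowest fixed-width encoding that can hold every character."""
    # Code points up to U+00FF fit in one byte each (Latin-1).
    if all(ord(ch) <= 0xFF for ch in text):
        return "latin-1"
    # Anything else is stored at four bytes per character (UTF-32, no BOM).
    return "utf-32-le"


def stored_size(text: str) -> int:
    """Number of bytes the text occupies under the chosen fixed-width encoding."""
    return len(text.encode(storage_encoding(text)))


if __name__ == "__main__":
    for name in ("fichier", "文件.txt"):
        print(name, storage_encoding(name), stored_size(name))
```

Both representations are fixed-width, which is what keeps indexing O(1); the price is that a single non-Latin-1 character quadruples the storage of the whole string.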
> UTF-8 beautifully bridges the interpretation gap between 8-bit
> character strings and text. However, the interpretation step should be
> done in the application and not in the programming language.
Elisp is focused enough on text that I think its choice of going
UTF-8 internally with a Unicode character type is reasonably sane. Its
strings (the quirky unibyte strings excluded) are its own variant of
UTF-8 internally, and its string port equivalent (buffers) uses that same
variant of UTF-8. And its API talks UTF-8 for strings, Unicode (or
higher) for characters, and it indexes strings and buffers via Unicode
character counts. Not O(1), but with enough trickery that it works well
enough in practice. If strings are to be implemented strictly
Scheme-standard-conforming, they need to be O(1) indexable. The Scheme
standard is rather silent about Unicode, however. I am not sure that
sticking to the standard where it does not deal with reality is the best
choice.
I think the case for Guile-2 to _also_ support "unibyte strings" would
be considerably stronger than for Emacs (byte arrays and binary string ports
don't allow using Guile's string processing functions). As it stands,
the design of Guile-2 in my book currently involves too many mandatory
conversions for just passing data around with Guile itself and
Guile-based applications.
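What a mandatory conversion costs when data is merely passed through can be sketched in Python. This uses Python's codec error handlers as a stand-in and says nothing about Guile's actual API; the file name bytes and the three helper functions are invented for the illustration. The name is valid UTF-8 for U+4E2D followed by one stray 0xFF byte, which a POSIX file system will happily store.

```python
RAW_NAME = b"\xe4\xb8\xad\xff.txt"  # U+4E2D in UTF-8, then an invalid 0xFF byte


def strict_roundtrip(raw: bytes) -> bytes:
    """Decode and re-encode strictly; raises UnicodeDecodeError on invalid input."""
    return raw.decode("utf-8").encode("utf-8")


def lossy_roundtrip(raw: bytes) -> bytes:
    """Replace undecodable bytes with U+FFFD; the original bytes are lost."""
    return raw.decode("utf-8", errors="replace").encode("utf-8")


def escaping_roundtrip(raw: bytes) -> bytes:
    """Smuggle undecodable bytes through as lone surrogates and restore them."""
    text = raw.decode("utf-8", errors="surrogateescape")
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    print(lossy_roundtrip(RAW_NAME) == RAW_NAME)     # the name no longer matches
    print(escaping_roundtrip(RAW_NAME) == RAW_NAME)  # the name survives
```

Only the escaping variant hands back the bytes it was given. A string type that accepts arbitrary bytes, such as the unibyte strings mentioned above, would not need the round trip at all.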
> Support libraries for Unicode are naturally welcome.
>
> Plain Unicode text is actually quite a rare programming need. It is
> woefully inadequate for the human interface, which generally requires
> numerous other typesetting effects. But it is also causing unnecessary
> grief in the computer-computer interface, where the classic textual
> naming and textual protocols are actually cutely chosen octet-aligned
> binary formats.
Sometimes yes, sometimes not. As long as Guile wants to be a
general-purpose programming and extension language, it should deal
reliably and robustly and reproducibly with whatever is thrown at it.
Its choice of libraries does not currently make it so, but that could be
fixed by either working on the (GNU) libraries or by giving Guile its
own implementation.
But that needs to be considered a priority. Nobody will do this just
for fun and kicks.
--
David Kastrup