From: Andy Wingo <wingo@pobox.com>
To: guile-user@gnu.org
Subject: Re: guile can't find a chinese named file
Date: Sun, 26 Feb 2017 22:20:31 +0100
Message-ID: <8737f0tzs0.fsf@pobox.com>
In-Reply-To: <87inoc5npq.fsf@fencepost.gnu.org> (David Kastrup's message of "Wed, 15 Feb 2017 00:58:41 +0100")

Hello,

I feel the need to correct points in this mail for the benefit of
guile-user.  No reply is needed.

On Wed 15 Feb 2017 00:58, David Kastrup <dak@gnu.org> writes:

> Mike Gran <spk121@yahoo.com> writes:
>
>> But, for what it is worth, the Latin-1/UTF-32 design decision came
>> from a couple of conflicting requirements.  The switch happened in the
>> 1.9.x series.
>>
>> There were several examples of legacy C code using Guile as an
>> extension language that accessed the bytes of a string directly, using
>> SCM_STRING_CHARS or scm_i_string_chars.  To keep from breaking legacy
>> code, we needed to retain this (then already deprecated) capability
>> for C programs to access 8-bit-locale string internals directly.
>
> But if you don't know whether the strings are Latin-1 or UTF-32, that's
> sort of academic.

Not at all.  Legacy programs don't use codepoints >255, so their strings
stay in the Latin-1 representation.  For a string stored as UTF-32,
attempting to get the string data would throw an exception.  The
SCM_STRING_CHARS hack was a good trade-off.

> The problem is that Guile is _constantly_ required to recode strings it
> is processing.  And to add insult to injury, it cannot do this without
> data loss when its string encoding assumptions are wrong.

In Scheme, strings are sequences of characters.  Encoding and decoding
is only needed when going to and from bytes.  Guile supports a finite
number of encodings, so in general some encoding/decoding will always be
needed.  The specific encoding may change over time.
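
A minimal sketch of that model, written in Python rather than Guile
purely for illustration (Python's str/bytes split happens to follow the
same idea; the file name is a made-up example):

```python
# Illustration only: this is a Python analogue, not Guile code.
text = "中文.txt"  # a string is a sequence of characters

# Encoding is needed only when going from characters to bytes...
utf8_bytes = text.encode("utf-8")

# ...and decoding only when coming back from bytes to characters.
round_trip = utf8_bytes.decode("utf-8")

# Decoding the same bytes under a wrong assumption yields different characters.
misread = utf8_bytes.decode("latin-1")

# Character count and byte count are separate notions.
print(len(text), len(utf8_bytes))
print(round_trip == text, misread == text)
```

The string itself never carries an encoding; only the byte boundary
does.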

> PostScript files are usually encoded in Latin-1 with occasional UCS-16
> passages.  Reading and writing and copying such files byte-correctly
> while trying to actually parse their contents is not feasible with
> Guile.

Works perfectly well.  The web server for example reads the request as
Latin-1 and the body as something else.  Just re-set the port encoding
and there you go.
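
As a rough sketch of the idea (again a Python analogue, not Guile's web
server code; `split_request` and the sample request are hypothetical),
the same byte stream can be decoded with one encoding for the header
section and another for the body:

```python
# Illustration only: decode one byte stream with two different encodings.
def split_request(raw: bytes, body_encoding: str = "utf-8"):
    """Decode the header section as Latin-1 and the body as body_encoding."""
    head, _, body = raw.partition(b"\r\n\r\n")
    # Latin-1 maps every byte to a character, so this step cannot fail.
    return head.decode("latin-1"), body.decode(body_encoding)

# Hypothetical request: a Latin-1 byte (0xE9) in a header, UTF-8 in the body.
raw = b"POST /files HTTP/1.1\r\nX-Name: caf\xe9\r\n\r\n" + "中文".encode("utf-8")

headers, body = split_request(raw)
print(headers.splitlines()[1])
print(body)
```

On a Guile port the equivalent step is changing the port's encoding
between the two reads, rather than splitting a byte string up front.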

>> I still maintain that this design decision was a good one based on the
>> simplicity of implementation.
>
> As I said: the problem is not the chosen internal representation.  The
> problem is that there is no API to access it, and it does not even map
> to string ports.

String ports have nothing to do with the discussion AFAIU.  (Ports in
Guile are sequences of bytes also.  They may be accessed using textual
interfaces as well.  Therefore a string port must have an associated
encoding, to read/write the bytes.  But no error is possible for textual
I/O with the default UTF-8 encoding as all characters are representable.
Encoding to UTF-8 is fast and space-efficient.)
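
The representability point can be checked with a small sketch (Python
used for illustration only; `can_encode` and the sample strings are
hypothetical): every sample encodes to UTF-8, while Latin-1 rejects
anything above codepoint 255.

```python
# Illustration only: compare which strings each encoding can represent.
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

samples = ["plain ascii", "café", "中文", "😀"]

utf8_ok = [can_encode(s, "utf-8") for s in samples]
latin1_ok = [can_encode(s, "latin-1") for s in samples]

print(utf8_ok)
print(latin1_ok)
```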

Andy


