From: Mark H Weaver <mhw@netris.org>
To: Bruce Korb <bkorb@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: mutable interfaces - was: Guile: What's wrong with this?
Date: Sat, 07 Jan 2012 13:30:33 -0500
Message-ID: <877h138iwm.fsf@netris.org>
In-Reply-To: <4F088252.9040000@gnu.org> (Bruce Korb's message of "Sat, 07 Jan 2012 09:35:14 -0800")

Bruce Korb <bkorb@gnu.org> writes:

> On 01/07/12 08:13, Mark H Weaver wrote:
>>> Most of the strings that I wind up altering are created with a
>>> scm_from_locale_string() C function call.
>>
>> BTW, beware that scm_from_locale_string() is only appropriate for
>> strings that came from the user (e.g. command-line arguments, reading
>> from a port, etc).  When converting string literals from your own source
>> code, you should use scm_from_latin1_string() or scm_from_utf8_string().
>>
>> Similarly, to make symbols from C string literals, use
>> scm_from_latin1_symbol() or scm_from_utf8_symbol().
>>
>> Caveat: these functions did not exist in Guile 1.8.  If your C string
>> literals are ASCII-only, I guess it won't matter in practice which
>> function you use, although it would be good to spread the understanding
>> that C string literals should not be interpreted according to the user's
>> locale.
>
> I go back to my argument that a facilitation language needs to focus
> on being as helpful as possible.  That means doing what is likely
> wanted instead of throwing errors at every possibility.  It also means
> not changing interfaces.

Sorry, but there's no way to maintain backward compatibility here.  I
know it's a pain, but there's no getting around the fact that in order
to write proper internationalized code, we now need to think carefully
about what encoding a particular string is in.  There's no automatic way
to handle this, not even in principle.

Fortunately, most modern GNU/Linux systems default to a UTF-8 locale, in
which case scm_from_locale_string and scm_from_utf8_string will be the
same anyway.  However, there are still some systems that use a non-UTF-8
locale, and we must strive to support them properly.

> Anyway, this then?  (abbreviated)
>
> #if   GUILE_VERSION < 107000
> # define AG_SCM_STR02SCM(_s)          scm_makfrom0str(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_mem2string(_st,_sz)
>
> #elif   GUILE_VERSION < 200000
> # define AG_SCM_STR02SCM(_s)          scm_from_locale_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_locale_stringn(_st,_sz)
>
> #elif   GUILE_VERSION < 200004
> #error "autogen does not work with this version of guile"
>   choke me.

This last clause is wrong.  scm_from_utf8_string and
scm_from_utf8_stringn were already present in Guile 2.0.0, so there is
no need to reject versions between 2.0.0 and 2.0.4.

> #else
> # define AG_SCM_STR02SCM(_s)          scm_from_utf8_string(_s)
> # define AG_SCM_STR2SCM(_st,_sz)      scm_from_utf8_stringn(_st,_sz)
> #endif

Just remember that this change implies that these macros should only be
used for C string literals, and must _not_ be used for strings supplied
by the user (e.g. command-line arguments and I/O).

It could very well be that you're currently overloading these functions
for both purposes, in which case you should split this pair of macros
into two distinct pairs: one pair of macros for user strings (keep using
scm_from_locale_string{,n} for these), and one pair for C string
literals (use scm_from_utf8_string{,n} for Guile 2.0.0 or newer).
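
A minimal, untested sketch of what that split might look like follows.
It assumes <libguile.h> is already included and that GUILE_VERSION is
defined by your build system as in your code above.  The AG_SCM_USR*
and AG_SCM_LIT* names are hypothetical; pick whatever fits autogen.

```c
/* Sketch only.  Assumes <libguile.h> is already included and that
   GUILE_VERSION is defined by the build system, as in the quoted code.
   The AG_SCM_USR* and AG_SCM_LIT* names are hypothetical. */

/* Strings that came from the user (command-line arguments, I/O):
   interpret them according to the user's locale. */
#if GUILE_VERSION < 107000
# define AG_SCM_USR02SCM(_s)        scm_makfrom0str(_s)
# define AG_SCM_USR2SCM(_st,_sz)    scm_mem2string(_st,_sz)
#else
# define AG_SCM_USR02SCM(_s)        scm_from_locale_string(_s)
# define AG_SCM_USR2SCM(_st,_sz)    scm_from_locale_stringn(_st,_sz)
#endif

/* C string literals from our own source code: fixed, known encoding. */
#if GUILE_VERSION < 200000
/* The UTF-8 constructors are not available before 2.0.0, so fall back
   to the user-string macros.  This should not matter in practice as
   long as the literals are ASCII-only. */
# define AG_SCM_LIT02SCM(_s)        AG_SCM_USR02SCM(_s)
# define AG_SCM_LIT2SCM(_st,_sz)    AG_SCM_USR2SCM(_st,_sz)
#else
# define AG_SCM_LIT02SCM(_s)        scm_from_utf8_string(_s)
# define AG_SCM_LIT2SCM(_st,_sz)    scm_from_utf8_stringn(_st,_sz)
#endif
```

Note that there is no #error clause for versions between 2.0.0 and
2.0.4 in this sketch.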

Then look at each use of these old overloaded macros in your code, and
figure out whether it's operating on a string that came from the user or
a string that came from your own source code.

Again, I stress that this has nothing to do with Guile.  All software,
if it wishes to be properly internationalized, needs to think about
where a string came from.  In general, your program's source code (and
thus the C string literals it contains) will have a different encoding
than C strings that come from the user.  C strings of different
encodings are essentially of different types (even though C's type
system is too crude to distinguish them), and you must treat them as
such.
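
If you want the compiler to help with this, one possible approach is to
wrap the two kinds of C string in distinct struct types.  The sketch
below is only an illustration of the idea; the type and function names
are invented here and are not part of Guile or autogen.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper types.  Both hold a plain char pointer, but the
   compiler treats them as unrelated types. */
typedef struct { const char *bytes; } utf8_cstr;    /* from our own source */
typedef struct { const char *bytes; } locale_cstr;  /* from the user       */

/* Tag a C string literal from the program's source as UTF-8. */
utf8_cstr utf8_literal(const char *s)
{
    assert(s != NULL);
    utf8_cstr r = { s };
    return r;
}

/* Tag a string that arrived from the user as locale-encoded. */
locale_cstr locale_input(const char *s)
{
    assert(s != NULL);
    locale_cstr r = { s };
    return r;
}

/* Accepts only UTF-8 tagged strings.  Passing a locale_cstr here is a
   compile-time error rather than a silent misinterpretation. */
size_t utf8_byte_length(utf8_cstr s)
{
    return strlen(s.bytes);
}
```

In real code, the function taking a utf8_cstr would be the one that
calls scm_from_utf8_string, and a separate function taking a
locale_cstr would call scm_from_locale_string.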

      Mark


