From: Mike Gran <spk121@yahoo.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
Date: Tue, 08 Sep 2009 21:16:34 -0700
Message-ID: <1252469794.24639.20.camel@localhost.localdomain>
In-Reply-To: <87fxaxavog.fsf@gnu.org>
On Wed, 2009-09-09 at 01:00 +0200, Ludovic Courtès wrote:
> Hello!
>
> "Michael Gran" <spk121@yahoo.com> writes:
>
> > http://git.savannah.gnu.org/cgit/guile.git/commit/?id=0d05ae7c4b1eddf6257f99f44eaf5cb7b11191be
>
> [...]
>
> > - return scm_getc (input_port);
> > + return scm_get_byte_or_eof (input_port);
>
> This is actually an earlier change, but the prototype of scm_getc is now
> different from that in 1.8. Presumably, this means that it’s not
> source-compatible with 1.8, e.g., on platforms where
> sizeof (int) < sizeof (scm_t_wchar), right?
The readline library can't handle UCS-4 codepoints, but it can deal
with locale-encoded text. So it needs the raw bytes of the
locale-encoded characters, and scm_get_byte_or_eof returns the raw
bytes instead of doing the processing necessary to make codepoints.
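
To make the byte-versus-codepoint distinction concrete, here is a minimal
sketch in plain C. It is not Guile's implementation: struct byte_source,
next_byte_or_eof and next_codepoint_or_eof are hypothetical names made up
for illustration, and the decoder assumes the locale encoding is UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an input port: a buffer of locale-encoded bytes. */
struct byte_source {
  const unsigned char *buf;
  size_t len;
  size_t pos;
};

/* Return the next raw byte (0..255), or -1 at end of input.
   This is the kind of value a byte-oriented consumer such as readline wants. */
int next_byte_or_eof(struct byte_source *s)
{
  assert(s != NULL);
  if (s->pos >= s->len)
    return -1;
  return s->buf[s->pos++];
}

/* Decode the next codepoint, assuming UTF-8 input (an assumption of this
   sketch). Returns -1 at end of input or on a malformed sequence. */
int32_t next_codepoint_or_eof(struct byte_source *s)
{
  int lead = next_byte_or_eof(s);
  int extra;
  int32_t cp;

  if (lead < 0)
    return -1;
  if (lead < 0x80)                { extra = 0; cp = lead; }
  else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
  else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
  else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
  else
    return -1;

  /* Fold in the continuation bytes, six payload bits each. */
  while (extra-- > 0) {
    int b = next_byte_or_eof(s);
    if (b < 0 || (b & 0xC0) != 0x80)
      return -1;
    cp = (cp << 6) | (b & 0x3F);
  }
  return cp;
}
```

Given the two bytes 0xC3 0xA9 (U+00E9 in UTF-8), the byte reader hands back
two values while the codepoint reader hands back one. A consumer that
expects locale-encoded text needs the former.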
>
> > --- a/libguile/strings.h
> > +++ b/libguile/strings.h
> > @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
> > SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
> > SCM_API SCM scm_string_append (SCM args);
> >
> > -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len,
> > +SCM_API SCM scm_i_from_stringn (const char *str, size_t len,
> > const char *encoding,
> > scm_t_string_failed_conversion_handler
> > handler);
> > @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
> > SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
> > SCM_INTERNAL void scm_i_string_stop_writing (void);
> > SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
> > -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> > +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);
>
> Were these changes intended?
Well, one of the two of them was intended. :)
>
> > + (with-locale "en_US.iso88591"
> > + (pass-if-exception "no args" exception:wrong-num-args
> > + (regexp-quote))
>
> Is the locale part of the API? That is, should programs that use
> regexps explicitly ask for a locale with 8-bit encoding?
Basically yes. The libc regex is 8-bit, and Guile uses
scm_to_locale_string and scm_from_locale_string to convert the regex's
input and output. Until libunistring comes with Unicode regex, I think
this is the best we can do.
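
For reference, a minimal sketch of what the libc side sees, using only the
POSIX <regex.h> calls. The helper name locale_bytes_match is hypothetical,
and this is not the Guile code itself. The point is that regcomp and regexec
take plain char * strings, so whatever reaches them has to be locale-encoded
bytes already.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match a locale-encoded byte string against a POSIX
   extended regex. Returns 1 on match, 0 on no match, and -1 if the pattern
   does not compile. */
int locale_bytes_match(const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  assert(pattern != NULL && text != NULL);

  /* REG_NOSUB: only the match/no-match result is needed here. */
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  rc = regexec(&re, text, 0, NULL, 0);
  regfree(&re);
  return rc == 0 ? 1 : 0;
}
```

Both arguments are byte strings, so a caller holding wide characters would
have to convert them to the locale encoding before the call and convert any
results back afterwards.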
Thanks,
Mike