unofficial mirror of guile-devel@gnu.org 
From: Mike Gran <spk121@yahoo.com>
To: "Ludovic Courtès" <ludo@gnu.org>
Cc: guile-devel@gnu.org
Subject: Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-2-164-g0d05ae7
Date: Tue, 08 Sep 2009 21:16:34 -0700
Message-ID: <1252469794.24639.20.camel@localhost.localdomain>
In-Reply-To: <87fxaxavog.fsf@gnu.org>

On Wed, 2009-09-09 at 01:00 +0200, Ludovic Courtès wrote:
> Hello!
> 
> "Michael Gran" <spk121@yahoo.com> writes:
> 
> > http://git.savannah.gnu.org/cgit/guile.git/commit/?id=0d05ae7c4b1eddf6257f99f44eaf5cb7b11191be
> 
> [...]
> 
> > -  return scm_getc (input_port);
> > +  return scm_get_byte_or_eof (input_port);
> 
> This is actually an earlier change, but the prototype of scm_getc is now
> different from that in 1.8.  Presumably, this means that it’s not
> source-compatible with 1.8, e.g., on platforms where
> sizeof (int) < sizeof (scm_t_wchar), right?

The readline library can't handle UCS-4 codepoints, but it is capable
of dealing with locale-encoded text.  So it needs the raw bytes of the
locale-encoded characters, and scm_get_byte_or_eof returns the raw
bytes instead of doing the processing necessary to make codepoints.
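
To make the byte-versus-codepoint distinction concrete, here is a minimal
sketch in plain C.  It assumes the locale encoding is UTF-8, which the
message does not say, and the helpers decode_utf8 and count_codepoints are
hypothetical names for illustration, not part of Guile or readline.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Guile function): decode one UTF-8 sequence
   at s, with len bytes available, into a UCS-4 codepoint.  Returns the
   number of bytes consumed, or 0 if the input is malformed or truncated.
   This sketch does not reject overlong encodings. */
size_t decode_utf8(const unsigned char *s, size_t len, uint32_t *out)
{
    size_t need;
    uint32_t cp;

    if (len == 0)
        return 0;

    if (s[0] < 0x80)                { need = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; }
    else
        return 0;                   /* stray continuation or invalid byte */

    if (len < need)
        return 0;

    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;               /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    *out = cp;
    return need;
}

/* Count the codepoints in a buffer.  A byte-oriented consumer sees len
   separate values instead.  Returns (size_t)-1 on malformed input. */
size_t count_codepoints(const unsigned char *s, size_t len)
{
    size_t count = 0;
    size_t pos = 0;

    while (pos < len) {
        uint32_t cp;
        size_t used = decode_utf8(s + pos, len - pos, &cp);
        if (used == 0)
            return (size_t)-1;
        pos += used;
        count++;
    }
    return count;
}
```

Under that assumption, "é" arrives as the two bytes 0xC3 0xA9.  A
byte-oriented reader hands both bytes through unchanged, while a
codepoint-oriented reader combines them into the single value U+00E9.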

> 
> > --- a/libguile/strings.h
> > +++ b/libguile/strings.h
> > @@ -111,7 +111,7 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
> >  SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
> >  SCM_API SCM scm_string_append (SCM args);
> >  
> > -SCM_INTERNAL SCM scm_i_from_stringn (const char *str, size_t len, 
> > +SCM_API SCM scm_i_from_stringn (const char *str, size_t len, 
> >                                       const char *encoding,
> >                                       scm_t_string_failed_conversion_handler 
> >                                       handler);
> > @@ -157,7 +157,7 @@ SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
> >  SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
> >  SCM_INTERNAL void scm_i_string_stop_writing (void);
> >  SCM_INTERNAL int scm_i_is_narrow_string (SCM str);
> > -SCM_INTERNAL scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> > +SCM_API scm_t_wchar scm_i_string_ref (SCM str, size_t x);
> 
> Were these changes intended?

Well, one of the two of them was intended.  :)

> 
> > +  (with-locale "en_US.iso88591"
> > +    (pass-if-exception "no args" exception:wrong-num-args
> > +      (regexp-quote))
> 
> Is the locale part of the API?  That is, should programs that use
> regexps explicitly ask for a locale with 8-bit encoding?

Basically yes.  The libc regex is 8-bit, and Guile uses
scm_to_locale_string and scm_from_locale_string to convert the regex's
input and output.

Until libunistring comes with Unicode regex, I think this is the best we
can do.
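
For reference, here is a minimal sketch of the libc interface in question,
assuming a POSIX system that provides <regex.h>.  The wrapper find_match
is a hypothetical name for illustration and is not how Guile's regexp
code is written.  The point is that both the pattern and the subject are
plain char strings, and that match positions come back as byte offsets
into that char string.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not Guile's implementation): search text for the
   POSIX extended regex pattern.  Returns 1 on a match, storing the byte
   offsets of the match in *start and *end; returns 0 if there is no
   match, and -1 if the pattern fails to compile. */
int find_match(const char *pattern, const char *text,
               size_t *start, size_t *end)
{
    regex_t re;
    regmatch_t m[1];
    int found = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, text, 1, m, 0) == 0) {
        /* rm_so and rm_eo are byte offsets, not character indices. */
        *start = (size_t) m[0].rm_so;
        *end = (size_t) m[0].rm_eo;
        found = 1;
    }

    regfree(&re);
    return found;
}
```

A caller holding wide strings therefore has to encode them into the
locale's char representation before the call and map the byte offsets
back afterwards.  With an 8-bit encoding such as ISO-8859-1, one byte is
one character, so the offsets line up without further work.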

Thanks,

Mike




