need: scm_from_{utf8,latin1}_{string,symbol,keyword}

unofficial mirror of guile-devel@gnu.org 

* need: scm_from_{utf8,latin1}_{string,symbol,keyword}
@ 2010-09-06 11:23 Andy Wingo
  2010-09-06 16:28 ` Mike Gran
  2010-09-06 17:02 ` Ludovic Courtès
  0 siblings, 2 replies; 13+ messages in thread
From: Andy Wingo @ 2010-09-06 11:23 UTC (permalink / raw)
  To: guile-devel

Hi,

In our C source, we have been trained to use scm_from_locale_string et
al. This is usually the right thing to do when interacting with the
operating system.

However, when we have literals in C source code, I think this strategy
is incorrect. I write my C source code in UTF-8 or in ISO-8859-1, but if
the user is running in another locale, they will not load my
strings/symbols/keywords correctly.

The solution is to use functions that specify the locale. We don't have
those yet, but we do have the capability to write them
now. Specifically:

  scm_from_utf8_string
  scm_from_utf8_symbol
  scm_from_utf8_keyword

  scm_from_latin1_string
  scm_from_latin1_symbol
  scm_from_latin1_keyword

We probably also need the "n" variants.

It's unlikely that you have a known utf-32 string as a char*, but we
should probably also provide scm_t_uint16* and scm_t_uint32* variants
for utf16 and utf32.

                               * * *

We also have the converse problem: since the easiest (and recommended)
way to get a char* from a Scheme string has been scm_to_locale_string,
in many cases we give external libraries locale-encoded strings instead
of the encoding they expect.

For example, most GLib-based libraries expect utf-8 strings, but
Guile-GNOME ignorantly passes them the result of calling
scm_to_locale_string. Though this will work in UTF-8 locales, it's only
by accident.

So then we need, I think:

  scm_to_utf8_string
  scm_to_utf16_string
  scm_to_utf32_string

We need the "n" variants here too (perhaps more).

What do people think? Any takers on implementing this? :)

Cheers,

Andy
-- 
http://wingolog.org/
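
To make the mismatch described above concrete, here is a minimal,
self-contained sketch of what has to happen before ISO-8859-1 bytes can
be handed to a library that expects UTF-8.  It is not Guile code, and
latin1_to_utf8 is a hypothetical helper name used only for
illustration.  ASCII bytes pass through unchanged, which is why the
problem only shows up with non-ASCII text.

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only: convert LEN bytes of
   ISO-8859-1 text to a newly allocated, NUL-terminated UTF-8 string.
   Returns NULL if allocation fails.  The caller frees the result.  */
char *
latin1_to_utf8 (const char *str, size_t len)
{
  /* Each Latin-1 byte needs at most two UTF-8 bytes.  */
  unsigned char *out = malloc (2 * len + 1);
  size_t i, j = 0;

  if (out == NULL)
    return NULL;

  for (i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) str[i];

      if (c < 0x80)
        out[j++] = c;                    /* ASCII: same byte in both */
      else
        {
          out[j++] = 0xC0 | (c >> 6);    /* lead byte, 110xxxxx */
          out[j++] = 0x80 | (c & 0x3F);  /* continuation, 10xxxxxx */
        }
    }
  out[j] = '\0';
  return (char *) out;
}
```

A caller in a Latin-1 locale that skipped this step would pass the
single byte 0xE9 for "é", which is not a valid UTF-8 sequence.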


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-06 11:23 need: scm_from_{utf8,latin1}_{string,symbol,keyword} Andy Wingo
@ 2010-09-06 16:28 ` Mike Gran
  2010-09-06 16:58   ` Andy Wingo
  2010-09-06 17:02 ` Ludovic Courtès
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Gran @ 2010-09-06 16:28 UTC (permalink / raw)
  To: Andy Wingo, guile-devel

> From: Andy Wingo <wingo@pobox.com>

[...]

> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them
> now. Specifically:
>
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
>
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword
>
> We probably also need the "n" variants.
> 

[...]

> So then we need, I think:
>
>   scm_to_utf8_string
>   scm_to_utf16_string
>   scm_to_utf32_string
>
> We need the "n" variants here too (perhaps more).

Some of this is already in the bytevectors module, but
perhaps not in a form that is easy to use from C source code.

It would be easy enough to do, but there is a failure case to
consider for scm_from_utf8_string.  The C utf8 string could
contain incorrectly encoded data.

You could throw the encoding error, or you could replace the 
bad utf8 with U+FFFD or the question mark.

The bytevector's utf8->string always throws encoding-error.
Maybe that's good enough.

Otherwise, perhaps something like

scm_from_utf8_stringn (str, len, error_or_replace_strategy)

If you didn't mind the overhead of calling the somewhat 
heavyweight scm_{to,from}_stringn, these could be macros
or inline functions that wrap that.

-Mike
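
The two strategies mentioned above (signal an error, or substitute a
replacement character) can be sketched in plain C as follows.  This is
only an illustration under stated assumptions: decode_utf8 and the
invalid_utf8_strategy enum are hypothetical names, a return value of -1
stands in for throwing an encoding error, and U+FFFD is used as the
replacement.  Nothing here is taken from Guile's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strategy type, standing in for the
   error_or_replace_strategy argument suggested above.  */
enum invalid_utf8_strategy
{
  INVALID_UTF8_ERROR,    /* stop and report failure */
  INVALID_UTF8_REPLACE   /* substitute U+FFFD and continue */
};

/* Decode LEN bytes of UTF-8 from STR into OUT, which must have room
   for LEN code points.  Returns the number of code points written, or
   -1 if the input is malformed and STRATEGY is INVALID_UTF8_ERROR.  */
long
decode_utf8 (const unsigned char *str, size_t len, uint32_t *out,
             enum invalid_utf8_strategy strategy)
{
  size_t i = 0;
  long n = 0;

  while (i < len)
    {
      unsigned char b = str[i];
      uint32_t cp = 0, min = 0;
      size_t extra = 0, k;
      int ok = 1;

      /* Classify the lead byte.  */
      if (b < 0x80)
        cp = b;
      else if ((b & 0xE0) == 0xC0)
        { cp = b & 0x1F; extra = 1; min = 0x80; }
      else if ((b & 0xF0) == 0xE0)
        { cp = b & 0x0F; extra = 2; min = 0x800; }
      else if ((b & 0xF8) == 0xF0)
        { cp = b & 0x07; extra = 3; min = 0x10000; }
      else
        ok = 0;

      /* Consume the continuation bytes (10xxxxxx).  */
      for (k = 1; ok && k <= extra; k++)
        {
          if (i + k >= len || (str[i + k] & 0xC0) != 0x80)
            ok = 0;
          else
            cp = (cp << 6) | (str[i + k] & 0x3F);
        }

      /* Reject overlong forms, surrogates, and values past U+10FFFF.  */
      if (ok && (cp < min || cp > 0x10FFFF
                 || (cp >= 0xD800 && cp <= 0xDFFF)))
        ok = 0;

      if (ok)
        {
          out[n++] = cp;
          i += extra + 1;
        }
      else if (strategy == INVALID_UTF8_ERROR)
        return -1;
      else
        {
          out[n++] = 0xFFFD;
          i += 1;               /* resynchronize at the next byte */
        }
    }
  return n;
}
```

With the replace strategy the function never fails, at the cost of
silently losing the offending bytes; with the error strategy the caller
has to be prepared for failure on any input that did not originate as a
trusted literal.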


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-06 16:28 ` Mike Gran
@ 2010-09-06 16:58   ` Andy Wingo
  0 siblings, 0 replies; 13+ messages in thread
From: Andy Wingo @ 2010-09-06 16:58 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel

Greetings,

On Mon 06 Sep 2010 18:28, Mike Gran <spk121@yahoo.com> writes:

> there is a failure case to consider for scm_from_utf8_string.  The C
> utf8 string could contain incorrectly encoded data.

There is the analogous case of scm_to_locale_string, if the string is not
encodable in the current locale.

> You could throw the encoding error, or you could replace the 
> bad utf8 with U+FFFD or the question mark.
>
> The bytevector's utf8->string always throws encoding-error.
> Maybe that's good enough.

Yeah, maybe so.

> Otherwise, perhaps something like
>
> scm_from_utf8_stringn (str, len, error_or_replace_strategy)
>
> If you didn't mind the overhead of calling the somewhat 
> heavyweight scm_{to,from}_stringn, these could be macros
> or inline functions that wrap that.

Ah, I did not see scm_{to,from}_stringn. Cool! I think
scm_from_utf8_stringn et al should be proper functions, and probably
their initial implementations just call scm_{to,from}_stringn. But we
should at least do the straightforward optimization for the latin1 case.

Cheers,

Andy
-- 
http://wingolog.org/
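
One plausible reading of "the straightforward optimization for the
latin1 case" is that ISO-8859-1 byte values coincide with the first 256
Unicode code points, so decoding needs no conversion library and has no
failure case.  The sketch below illustrates just that property;
decode_latin1 is a hypothetical name and this is not how Guile stores
strings internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustration only: decode LEN bytes of ISO-8859-1 from STR into OUT,
   which must hold LEN code points.  Every byte value 0x00..0xFF is
   itself the Unicode code point, so there is nothing to validate and
   no error path.  */
void
decode_latin1 (const char *str, size_t len, uint32_t *out)
{
  size_t i;

  for (i = 0; i < len; i++)
    out[i] = (unsigned char) str[i];   /* widen without sign extension */
}
```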


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-06 11:23 need: scm_from_{utf8,latin1}_{string,symbol,keyword} Andy Wingo
  2010-09-06 16:28 ` Mike Gran
@ 2010-09-06 17:02 ` Ludovic Courtès
  2010-09-07 15:21   ` Mike Gran
  1 sibling, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2010-09-06 17:02 UTC (permalink / raw)
  To: guile-devel

Hello,

Andy Wingo <wingo@pobox.com> writes:

> However, when we have literals in C source code, I think this strategy
> is incorrect. I write my C source code in UTF-8 or in ISO-8859-1, but if
> the user is running in another locale, they will not load my
> strings/symbols/keywords correctly.

Actually locale encodings are typically ASCII-compatible (info
"(libunistring) Locale encodings"), so it’s rarely (never?) a problem in
practice.

> The solution is to use functions that specify the locale. We don't have
> those yet, but we do have the capability to write them
> now. Specifically:
>
>   scm_from_utf8_string
>   scm_from_utf8_symbol
>   scm_from_utf8_keyword
>
>   scm_from_latin1_string
>   scm_from_latin1_symbol
>   scm_from_latin1_keyword

The ‘latin1’ family should be easy to implement and that’s what we’d use
in our C code.

[...]

> For example, most GLib-based libraries expect utf-8 strings, but
> Guile-GNOME ignorantly passes them the result of calling
> scm_to_locale_string. Though this will work in UTF-8 locales, it's only
> by accident.

When using (system foreign), one can use:

  (bytevector->pointer (string->utf8 "foo"))

or similar.

Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
‘scm_to_stringn’, which can convert from/to any encoding.  I think they
were initially kept internal because we weren’t quite sure about the
API.  Mike?

Perhaps it’d be enough to make these two functions public and
documented, and add the ‘latin1’ family?

Thanks,
Ludo’.


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-06 17:02 ` Ludovic Courtès
@ 2010-09-07 15:21   ` Mike Gran
  2010-09-07 17:11     ` Ludovic Courtès
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Gran @ 2010-09-07 15:21 UTC (permalink / raw)
  To: Ludovic Courtès, guile-devel

> From: Ludovic Courtès <ludo@gnu.org>

> Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
> ‘scm_to_stringn’, which can convert from/to any encoding.  I think they
> were initially kept internal because we weren’t quite sure about the
> API.  Mike?

Also, I think we were trying to avoid compilation problems based on
having to expose libunistring's enum iconv_ilseq_handler to the world.
But later, we ended up creating the analogous
scm_t_string_failed_conversion_handler type to work around that problem.

-Mike
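
The pattern described above, a public type that mirrors a third-party
enum so that the third-party header stays out of the public API, can be
sketched as follows.  All names are stand-ins: lib_ilseq_handler plays
the role of the libunistring enum, and pub_failed_conversion_handler
plays the role of the public type.  Only the mapping function, which
would live in the implementation file, needs to see both.

```c
/* --- What a public header would expose (hypothetical names) --- */
typedef enum
{
  PUB_FAILED_CONVERSION_ERROR,
  PUB_FAILED_CONVERSION_QUESTION_MARK,
  PUB_FAILED_CONVERSION_ESCAPE_SEQUENCE
} pub_failed_conversion_handler;

/* --- Private to the implementation file --- */

/* Stand-in for the third-party enum.  The values deliberately differ
   from the public ones to show that no numeric relationship between
   the two types is assumed.  */
enum lib_ilseq_handler
{
  lib_error = 10,
  lib_question_mark = 20,
  lib_escape_sequence = 30
};

/* Translate the public strategy into the library's own type.  */
static enum lib_ilseq_handler
to_lib_handler (pub_failed_conversion_handler h)
{
  switch (h)
    {
    case PUB_FAILED_CONVERSION_QUESTION_MARK:
      return lib_question_mark;
    case PUB_FAILED_CONVERSION_ESCAPE_SEQUENCE:
      return lib_escape_sequence;
    case PUB_FAILED_CONVERSION_ERROR:
    default:
      return lib_error;   /* unknown values fall back to the strict case */
    }
}
```

The cost is one small switch per call; the benefit is that code
including the public header never needs the library's headers on its
include path.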




* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-07 15:21   ` Mike Gran
@ 2010-09-07 17:11     ` Ludovic Courtès
  2010-09-07 22:05       ` Andy Wingo
  2010-09-08  3:26       ` Mike Gran
  0 siblings, 2 replies; 13+ messages in thread
From: Ludovic Courtès @ 2010-09-07 17:11 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel

Hi Mike,

Mike Gran <spk121@yahoo.com> writes:

>> From: Ludovic Courtès <ludo@gnu.org>
>
>> Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
>> ‘scm_to_stringn’, which can convert from/to any encoding.  I think they
>> were initially kept internal because we weren’t quite sure about the
>> API.  Mike?
>
> Also, I think we were trying to avoid compilation problems based on 
> having to expose the libunistring's enum iconv_ilseq_handle to the world.
> But later, we ended up creating the analogous
> scm_t_string_failed_conversion_handler type to work around that problem.

Right.  So I guess they can now be made public & documented.  Would you
like to do it?  :-)

Thanks,
Ludo’.




* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-07 17:11     ` Ludovic Courtès
@ 2010-09-07 22:05       ` Andy Wingo
  2010-09-08 12:35         ` Ludovic Courtès
  2010-09-08  3:26       ` Mike Gran
  1 sibling, 1 reply; 13+ messages in thread
From: Andy Wingo @ 2010-09-07 22:05 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi,

On Tue 07 Sep 2010 19:11, ludo@gnu.org (Ludovic Courtès) writes:

> Mike Gran <spk121@yahoo.com> writes:
>
>> From: Ludovic Courtès <ludo@gnu.org>
>>
>>> Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
>>> ‘scm_to_stringn’, which can convert from/to any encoding.  I think they
>>> were initially kept internal because we weren’t quite sure about the
>>> API.  Mike?
>>
>> Also, I think we were trying to avoid compilation problems based on 
>> having to expose the libunistring's enum iconv_ilseq_handle to the world.
>> But later, we ended up creating the analogous
>> scm_t_string_failed_conversion_handler type to work around that problem.
>
> Right.  So I guess they can now be made public & documented.  Would you
> like to do it?  :-)

Perhaps named scm_{to,from}_encoded_stringn? Giving the more brief names
to the more general functions sits a bit strange with me. But that could
just be me :)

Andy
-- 
http://wingolog.org/




* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-07 17:11     ` Ludovic Courtès
  2010-09-07 22:05       ` Andy Wingo
@ 2010-09-08  3:26       ` Mike Gran
  2010-09-08 12:11         ` Ludovic Courtès
  2010-09-08 19:20         ` Andy Wingo
  1 sibling, 2 replies; 13+ messages in thread
From: Mike Gran @ 2010-09-08  3:26 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

[-- Attachment #1: Type: text/plain, Size: 752 bytes --]

> From: Ludovic Courtès <ludo@gnu.org>

> >> Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
> >> ‘scm_to_stringn’, which can convert from/to any encoding.  I think they
> >> were initially kept internal because we weren’t quite sure about the
> >> API.  Mike?
> >
> > Also, I think we were trying to avoid compilation problems based on
> > having to expose the libunistring's enum iconv_ilseq_handle to the world.
> > But later, we ended up creating the analogous
> > scm_t_string_failed_conversion_handler type to work around that problem.
>
> Right.  So I guess they can now be made public & documented.  Would you
> like to do it?  :-)

Perhaps something like the attached?

-Mike

[-- Attachment #2: commit-9132132 --]
[-- Type: application/octet-stream, Size: 6446 bytes --]

commit 9132132a0e19726abf739bdf207c82ae779e3060
Author: Michael Gran <spk121@yahoo.com>
Date:   Tue Sep 7 20:19:15 2010 -0700

    Provide non-locale C/Scheme string conversion functions
    
    * doc/ref/api-data.texi: document scm_to_stringn, scm_from_stringn,
      scm_to_iso88591_stringn, and scm_from_iso88591_stringn
    * libguile/strings.h (scm_to_stringn): make public
      (scm_to_iso88591_stringn): new macro
      (scm_from_iso88591_stringn): new macro

diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index 75e5e68..74db6ca 100755
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -3969,6 +3969,71 @@ is larger than @var{max_len}, only @var{max_len} bytes have been
 stored and you probably need to try again with a larger buffer.
 @end deftypefn
 
+For most situations, string conversion should occur using the current
+locale, such as with the functions above.  But there may be cases where
+one wants to convert strings from a character encoding other than the
+locale's character encoding.  For these cases, the lower-level functions
+@code{scm_to_stringn} and @code{scm_from_stringn} are provided.  These
+functions should seldom be necessary if one is properly using locales.
+
+@deftp {C type} scm_t_string_failed_conversion_handler
+This is an enumerated type that can take one of three values:
+@code{SCM_FAILED_CONVERSION_ERROR},
+@code{SCM_FAILED_CONVERSION_QUESTION_MARK}, and
+@code{SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE}.  They are used to indicate
+a strategy for handling characters that cannot be converted to or from a
+given character encoding.  @code{SCM_FAILED_CONVERSION_ERROR} indicates
+that a conversion should throw an error if some characters cannot be
+converted.  @code{SCM_FAILED_CONVERSION_QUESTION_MARK} indicates that a
+conversion should replace unconvertable characters with the question
+mark character.  And, @code{SCM_FAILED_CONVERSION_ESCAPE_SEQUENCE}
+requests that a conversion should replace an unconvertable character
+with an escape sequence.
+
+While all three strategies apply when converting Scheme strings to C,
+only @code{SCM_FAILED_CONVERSION_ERROR} and
+@code{SCM_FAILED_CONVERSION_QUESTION_MARK} can be used when converting C
+strings to Scheme.
+@end deftp
+
+@deftypefn {C function} char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding, scm_t_string_failed_conversion_handler handler)
+Returns a newly allocated C string from the Guile string @var{str}.  The
+length of the string will be returned in @var{lenp}.  The character
+encoding of the C string is passed as the ASCII, null-terminated C
+string @var{encoding}.  The @var{handler} parameter gives a strategy for
+dealing with characters that cannot be converted into @var{encoding}.
+
+If @var{lenp} is NULL, this function will return a null-terminated C
+string.  It will throw an error if the string contains a null
+character.
+@end deftypefn
+
+@deftypefn {C function} SCM scm_from_stringn (const char *str, size_t len, const char *encoding, scm_t_string_failed_conversion_handler handler)
+This function returns a scheme string from the C string @var{str}.  The
+length of the C string is input as @var{len}.  The encoding of the C
+string is passed as the ASCII, null-terminated C string @code{encoding}.
+The @var{handler} parameter suggests a strategy for dealing with
+unconvertable characters.
+@end deftypefn
+
+Since Latin-1 encodings are common, the following two functions are
+provided.
+
+@deftypefn {C function} SCM scm_from_iso88591_stringn (const char *str, size_t len)
+Returns a scheme string from an ISO-8859-1-encoded C string @var{str} of
+length @var{len}.  This function may be implemented as a macro.
+@end deftypefn
+
+@deftypefn {C function} char * scm_to_iso88591_stringn (SCM str, size_t *lenp)
+Returns a newly allocated, ISO-8859-1-encoded C string from the scheme
+string @var{str}.  The length of the string is returned in @var{lenp}.
+An error will be thrown if the scheme string cannot be converted to the
+ISO-8859-1 encoding.  If @var{lenp} is @code{NULL}, the returned C
+string will be null-terminated, and an error will be thrown if the C
+string would otherwise contain null characters.  This function may be
+implemented as a macro.
+@end deftypefn
+
 @node String Internals
 @subsubsection String Internals
 
diff --git a/libguile/strings.h b/libguile/strings.h
index 734ac62..a7e13b2 100644
--- a/libguile/strings.h
+++ b/libguile/strings.h
@@ -113,10 +113,8 @@ SCM_API SCM scm_substring_shared (SCM str, SCM start, SCM end);
 SCM_API SCM scm_substring_copy (SCM str, SCM start, SCM end);
 SCM_API SCM scm_string_append (SCM args);
 
-SCM_API SCM scm_from_stringn (const char *str, size_t len, 
-                                     const char *encoding,
-                                     scm_t_string_failed_conversion_handler 
-                                     handler);
+SCM_API SCM scm_from_stringn (const char *str, size_t len, const char *encoding,
+                              scm_t_string_failed_conversion_handler handler);
 SCM_API SCM scm_c_make_string (size_t len, SCM chr);
 SCM_API size_t scm_c_string_length (SCM str);
 SCM_API size_t scm_c_symbol_length (SCM sym);
@@ -135,10 +133,8 @@ SCM_API SCM scm_take_locale_string (char *str);
 SCM_API SCM scm_take_locale_stringn (char *str, size_t len);
 SCM_API char *scm_to_locale_string (SCM str);
 SCM_API char *scm_to_locale_stringn (SCM str, size_t *lenp);
-SCM_INTERNAL char *scm_to_stringn (SCM str, size_t *lenp, 
-                                   const char *encoding,
-                                   scm_t_string_failed_conversion_handler
-                                   handler);
+SCM_API char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding,
+                              scm_t_string_failed_conversion_handler handler);
 SCM_INTERNAL scm_t_uint8 *scm_i_to_utf8_string (SCM str);
 SCM_API size_t scm_to_locale_stringbuf (SCM str, char *buf, size_t max_len);
 
@@ -215,6 +211,14 @@ SCM_API SCM scm_sys_symbol_dump (SCM);
 SCM_API SCM scm_sys_stringbuf_hist (void);
 #endif
 
+/* Macros */
+#define scm_to_iso88591_stringn(s,lenp)                                 \
+  scm_to_stringn ((s), (lenp), NULL, SCM_FAILED_CONVERSION_ERROR)
+#define scm_from_iso88591_stringn(s,len)                                \
+  scm_from_stringn ((s), (len), NULL, SCM_FAILED_CONVERSION_ERROR)
+
+
+
 /* deprecated stuff */
 
 #if SCM_ENABLE_DEPRECATED


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-08  3:26       ` Mike Gran
@ 2010-09-08 12:11         ` Ludovic Courtès
  2010-09-08 19:33           ` Andy Wingo
  2010-09-08 19:20         ` Andy Wingo
  1 sibling, 1 reply; 13+ messages in thread
From: Ludovic Courtès @ 2010-09-08 12:11 UTC (permalink / raw)
  To: Mike Gran; +Cc: guile-devel

Hi Mike,

Mike Gran <spk121@yahoo.com> writes:

>> From: Ludovic Courtès <ludo@gnu.org>
>
>> >> Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
>> >> ‘scm_to_stringn’, which can convert from/to any encoding.  I think they
>> >> were initially kept internal because we weren’t quite sure about the
>> >> API.  Mike?
>> >
>> > Also, I think we were trying to avoid compilation problems based on
>> > having to expose the libunistring's enum iconv_ilseq_handle to the world.
>> > But later, we ended up creating the analogous
>> > scm_t_string_failed_conversion_handler type to work around that problem.
>>
>> Right.  So I guess they can now be made public & documented.  Would you
>> like to do it?  :-)
>
> Perhaps something like the attached?

Yes, excellent!

> +@deftp {C type} scm_t_string_failed_conversion_handler

Should be “C Type”...

[...]

> +@deftypefn {C function} char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding, scm_t_string_failed_conversion_handler handler)

... and “C Function”.

> +Returns a newly allocated C string from the Guile string @var{str}.  The

Should be “Return”.

[...]

> +/* Macros */
> +#define scm_to_iso88591_stringn(s,lenp)                                 \
> +  scm_to_stringn ((s), (lenp), NULL, SCM_FAILED_CONVERSION_ERROR)
> +#define scm_from_iso88591_stringn(s,len)                                \
> +  scm_from_stringn ((s), (len), NULL, SCM_FAILED_CONVERSION_ERROR)

Please make them functions so that the implementation can eventually be
changed without breaking the ABI.

Apart from that, if Andy agrees, you can go ahead and push.

Thanks!

Ludo’.




* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-07 22:05       ` Andy Wingo
@ 2010-09-08 12:35         ` Ludovic Courtès
  0 siblings, 0 replies; 13+ messages in thread
From: Ludovic Courtès @ 2010-09-08 12:35 UTC (permalink / raw)
  To: guile-devel

Hi,

Andy Wingo <wingo@pobox.com> writes:

> On Tue 07 Sep 2010 19:11, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mike Gran <spk121@yahoo.com> writes:
>>
>>> From: Ludovic Courtès <ludo@gnu.org>
>>>
>>>> Besides, there’s the undocumented ‘scm_from_stringn’ and the internal
>>>> ‘scm_to_stringn’, which can convert from/to any encoding.  I think they
>>>> were initially kept internal because we weren’t quite sure about the
>>>> API.  Mike?
>>>
>>> Also, I think we were trying to avoid compilation problems based on 
>>> having to expose the libunistring's enum iconv_ilseq_handle to the world.
>>> But later, we ended up creating the analogous
>>> scm_t_string_failed_conversion_handler type to work around that problem.
>>
>> Right.  So I guess they can now be made public & documented.  Would you
>> like to do it?  :-)
>
> Perhaps named scm_{to,from}_encoded_stringn?

FWIW I prefer ‘scm_{to,from}_string’ because (i) with these functions
the encoding is specified as a parameter instead of as part of the
function name (similar to ‘bytevector-u32-native-ref’
vs. ‘bytevector-u32-ref’), and (ii) the word ‘encoded’ doesn’t convey
any piece of information.

Thanks,
Ludo’.





* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-08  3:26       ` Mike Gran
  2010-09-08 12:11         ` Ludovic Courtès
@ 2010-09-08 19:20         ` Andy Wingo
  1 sibling, 0 replies; 13+ messages in thread
From: Andy Wingo @ 2010-09-08 19:20 UTC (permalink / raw)
  To: Mike Gran; +Cc: Ludovic Courtès, guile-devel

Hi Mike,

On Wed 08 Sep 2010 05:26, Mike Gran <spk121@yahoo.com> writes:

> Perhaps something like the attached?

Yes that's fantastic. I only have a couple of comments.

> +@deftypefn {C function} char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding, scm_t_string_failed_conversion_handler handler)

How do you feel about scm_to_encoded_stringn as a name? Also
scm_from_encoded_stringn.

> +If @var{lenp} is NULL, this function will return a null-terminated C
> +string.  It will thrown an error if the string contains a null
> +character.

Will scm_to_stringn return a null-terminated string if LENP is not null?

> +Since Latin-1 encodings are common, the following two functions are
> +provided.
> +
> +@deftypefn {C function} SCM scm_from_iso88591_stringn (const char *str, size_t len)

Ooh, my hands... can we call this one scm_from_latin1_stringn? And the
_to_ variant obviously.

> +This function may be implemented as a macro.

I think it's best to implement these as functions. The overhead is
minimal, and this gives us more deprecation flexibility in the future
should something be wrong with these functions.

What do you think?

Andy
-- 
http://wingolog.org/
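
The LENP convention under discussion can be sketched with a small
stand-alone helper.  This is not Guile's implementation:
copy_bytes_as_c_string is a hypothetical name, and returning NULL
stands in for throwing an error.  The sketch always appends a
terminator, which is only one possible answer to the question above and
says nothing about what scm_to_stringn actually does when LENP is
non-NULL.

```c
#include <stdlib.h>
#include <string.h>

/* Illustration only: copy LEN bytes from BUF into a newly allocated
   buffer.  If LENP is non-NULL, the length is stored there and
   embedded NUL bytes are allowed.  If LENP is NULL, the caller can
   only rely on the terminator, so embedded NUL bytes are rejected.
   Returns NULL on rejection or allocation failure.  */
char *
copy_bytes_as_c_string (const char *buf, size_t len, size_t *lenp)
{
  char *out;

  /* Without LENP, an embedded NUL would silently truncate the result
     as far as the caller can tell.  */
  if (lenp == NULL && memchr (buf, '\0', len) != NULL)
    return NULL;

  out = malloc (len + 1);
  if (out == NULL)
    return NULL;

  memcpy (out, buf, len);
  out[len] = '\0';   /* this sketch always terminates; see note above */
  if (lenp != NULL)
    *lenp = len;
  return out;
}
```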


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-08 12:11         ` Ludovic Courtès
@ 2010-09-08 19:33           ` Andy Wingo
  2010-09-08 21:04             ` Ludovic Courtès
  0 siblings, 1 reply; 13+ messages in thread
From: Andy Wingo @ 2010-09-08 19:33 UTC (permalink / raw)
  To: Ludovic Courtès; +Cc: guile-devel

Hi!

On Wed 08 Sep 2010 14:11, ludo@gnu.org (Ludovic Courtès) writes:

> Apart from that, if Andy agrees, you can go ahead and push.

It seems I hadn't pulled mail before replying. I'm OK with things, but
I'd still like to argue for "latin1" instead of "iso889595858951" ;-). I
know it's imprecise but it sure is easier to type.

A
-- 
http://wingolog.org/


* Re: need: scm_from_{utf8,latin1}_{string,symbol,keyword}
  2010-09-08 19:33           ` Andy Wingo
@ 2010-09-08 21:04             ` Ludovic Courtès
  0 siblings, 0 replies; 13+ messages in thread
From: Ludovic Courtès @ 2010-09-08 21:04 UTC (permalink / raw)
  To: Andy Wingo; +Cc: guile-devel

Hi!

Andy Wingo <wingo@pobox.com> writes:

> On Wed 08 Sep 2010 14:11, ludo@gnu.org (Ludovic Courtès) writes:
>
>> Apart from that, if Andy agrees, you can go ahead and push.
>
> It seems I hadn't pulled mail before replying. I'm OK with things, but
> I'd still like to argue for "latin1" instead of "iso889595858951" ;-)

Fine with me!  :-)

Ludo’.


