unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
From: "Ludovic Courtès" <ludo@gnu.org>
To: Andy Wingo <wingo@igalia.com>
Cc: guile-devel@gnu.org
Subject: Re: [PATCH] Avoid malloc/free/dynwind for POSIX subrs that take strings
Date: Sun, 17 May 2020 23:46:00 +0200	[thread overview]
Message-ID: <871rninskn.fsf@gnu.org> (raw)
In-Reply-To: <87a72m26bv.fsf@igalia.com> (Andy Wingo's message of "Tue, 05 May 2020 09:23:32 +0200")

[-- Attachment #1: Type: text/plain, Size: 1639 bytes --]

Hi,

Andy Wingo <wingo@igalia.com> skribis:

> On Wed 29 Apr 2020 22:22, Ludovic Courtès <ludo@gnu.org> writes:
>
>> As discussed on IRC, the patch below arranges so that subrs that take
>> strings and pass them to syscall wrappers can avoid the
>> malloc/free/dynwind overhead.  This gives a 10% speedup on a tight loop
>> that calls these subrs:
>
> Neat optimization.  Since it's internal, no problem from me.  My concern
> is only about calcifying aspects of our C / GC interface in API.
>
>> On IRC, Andy mentioned concerns that the SCM could disappear and thus,
>> our the internal pointer returned by ‘scm_locale_string_data’ wouldn’t
>> be enough to prevent the stringbuf from being GC’d.
>
> I would suggest doing the whole optimization and being sure to do
> scm_remember_upto_here_1 on the SCM value at use sites.

Here’s an updated patch that does that.

The fallback case in ‘scm_locale_string_data’ is unfortunately not as
nice as I wrote before, because it has to call ‘scm_to_stringn’, which
mallocs the string, which must then be copied to GC-managed storage,
argh.  (The patch I sent earlier didn’t take care of the encoding
conversion in the fallback case.)

The libunistring functions can take a pre-allocated buffer, but they
always malloc a fresh one if needed.  So the best we could do is have a
‘scm_to_stringn’ variant that takes a buffer, but it’s not guaranteed
that it’ll actually be used.  All in all, it seems the added complexity
is not warranted.  The worst case of ‘scm_locale_string_data’ is also
rare enough.

Thoughts?

Ludo’.


[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: Type: text/x-patch, Size: 25158 bytes --]

diff --git a/libguile/dynl.c b/libguile/dynl.c
index e9c03e95b..784b04b5e 100644
--- a/libguile/dynl.c
+++ b/libguile/dynl.c
@@ -1,6 +1,6 @@
 /* dynl.c - dynamic linking
 
-   Copyright 1990-2003,2008-2011,2017-2018
+   Copyright 1990-2003,2008-2011,2017-2018,2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -233,7 +233,7 @@ SCM_DEFINE (scm_dynamic_link, "dynamic-link", 0, 1, 0,
 #define FUNC_NAME s_scm_dynamic_link
 {
   void *handle;
-  char *file;
+  const char *file;
 
   scm_dynwind_begin (0);
   scm_i_dynwind_pthread_mutex_lock (&ltdl_lock);
@@ -241,10 +241,7 @@ SCM_DEFINE (scm_dynamic_link, "dynamic-link", 0, 1, 0,
   if (SCM_UNBNDP (filename))
     file = NULL;
   else
-    {
-      file = scm_to_locale_string (filename);
-      scm_dynwind_free (file);
-    }
+    file = scm_locale_string_data (filename);
 
   handle = sysdep_dynl_link (file, FUNC_NAME);
   scm_dynwind_end ();
@@ -315,13 +312,13 @@ SCM_DEFINE (scm_dynamic_pointer, "dynamic-pointer", 2, 0, 0,
     SCM_MISC_ERROR ("Already unlinked: ~S", dobj);
   else
     {
-      char *chars;
+      const char *chars;
 
       scm_dynwind_begin (0);
       scm_i_dynwind_pthread_mutex_lock (&ltdl_lock);
-      chars = scm_to_locale_string (name);
-      scm_dynwind_free (chars);
+      chars = scm_locale_string_data (name);
       val = sysdep_dynl_value (chars, DYNL_HANDLE (dobj), FUNC_NAME);
+      scm_remember_upto_here_1 (name);
       scm_dynwind_end ();
 
       return scm_from_pointer (val, NULL);
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 4f7115397..5d2965b1b 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -1,4 +1,4 @@
-/* Copyright 1996-2002,2004,2006,2009-2019
+/* Copyright 1996-2002,2004,2006,2009-2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -118,25 +118,24 @@
 
 /* Two helper macros for an often used pattern */
 
-#define STRING_SYSCALL(str,cstr,code)        \
-  do {                                       \
-    int eno;                                 \
-    char *cstr = scm_to_locale_string (str); \
-    SCM_SYSCALL (code);                      \
-    eno = errno; free (cstr); errno = eno;   \
+#define STRING_SYSCALL(str,cstr,code)                   \
+  do {                                                  \
+    int eno;                                            \
+    const char *cstr = scm_locale_string_data (str);    \
+    SCM_SYSCALL (code);                                 \
+    scm_remember_upto_here_1 (str);                     \
+    eno = errno; errno = eno;                           \
   } while (0)
 
-#define STRING2_SYSCALL(str1,cstr1,str2,cstr2,code)  \
-  do {                                               \
-    int eno;                                         \
-    char *cstr1, *cstr2;                             \
-    scm_dynwind_begin (0);                             \
-    cstr1 = scm_to_locale_string (str1);             \
-    scm_dynwind_free (cstr1);                          \
-    cstr2 = scm_to_locale_string (str2);             \
-    scm_dynwind_free (cstr2);                          \
-    SCM_SYSCALL (code);                              \
-    eno = errno; scm_dynwind_end (); errno = eno;      \
+#define STRING2_SYSCALL(str1,cstr1,str2,cstr2,code)     \
+  do {                                                  \
+    int eno;                                            \
+    const char *cstr1, *cstr2;                          \
+    cstr1 = scm_locale_string_data (str1);              \
+    cstr2 = scm_locale_string_data (str2);              \
+    SCM_SYSCALL (code);                                 \
+    scm_remember_upto_here_2 (str1, str2);              \
+    eno = errno; errno = eno;                           \
   } while (0)
 
 
@@ -536,9 +535,8 @@ SCM_DEFINE (scm_stat, "stat", 1, 1, 0,
     }
   else if (scm_is_string (object))
     {
-      char *file = scm_to_locale_string (object);
+      const char *file = scm_locale_string_data (object);
       SCM_SYSCALL (rv = stat_or_stat64 (file, &stat_temp));
-      free (file);
     }
   else
     {
@@ -1038,12 +1036,9 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0,
   int size = 100;
   char *buf;
   SCM result;
-  char *c_path;
-  
-  scm_dynwind_begin (0);
+  const char *c_path;
 
-  c_path = scm_to_locale_string (path);
-  scm_dynwind_free (c_path);
+  c_path = scm_locale_string_data (path);
 
   buf = scm_malloc (size);
 
@@ -1060,9 +1055,10 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0,
       errno = save_errno;
       SCM_SYSERROR;
     }
+
+  scm_remember_upto_here_1 (path);
   result = scm_take_locale_stringn (buf, rv);
 
-  scm_dynwind_end ();
   return result;
 }
 #undef FUNC_NAME
@@ -1073,18 +1069,14 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
 	    "The return value is unspecified.")
 #define FUNC_NAME s_scm_copy_file
 {
-  char *c_oldfile, *c_newfile;
+  const char *c_oldfile, *c_newfile;
   int oldfd, newfd;
   int n, rv;
   char buf[BUFSIZ];
   struct stat_or_stat64 oldstat;
 
-  scm_dynwind_begin (0);
-  
-  c_oldfile = scm_to_locale_string (oldfile);
-  scm_dynwind_free (c_oldfile);
-  c_newfile = scm_to_locale_string (newfile);
-  scm_dynwind_free (c_newfile);
+  c_oldfile = scm_locale_string_data (oldfile);
+  c_newfile = scm_locale_string_data (newfile);
 
   oldfd = open_or_open64 (c_oldfile, O_RDONLY | O_BINARY);
   if (oldfd == -1)
@@ -1104,6 +1096,8 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
       SCM_SYSERROR;
     }
 
+  scm_remember_upto_here_2 (oldfile, newfile);
+
   while ((n = read (oldfd, buf, sizeof buf)) > 0)
     if (write (newfd, buf, n) != n)
       {
@@ -1115,7 +1109,6 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0,
   if (close (newfd) == -1)
     SCM_SYSERROR;
 
-  scm_dynwind_end ();
   return SCM_UNSPECIFIED;
 }
 #undef FUNC_NAME
@@ -1384,11 +1377,11 @@ SCM_DEFINE (scm_access, "access?", 2, 0, 0,
 #define FUNC_NAME s_scm_access
 {
   int rv;
-  char *c_path;
+  const char *c_path;
 
-  c_path = scm_to_locale_string (path);
+  c_path = scm_locale_string_data (path);
   rv = access (c_path, scm_to_int (how));
-  free (c_path);
+  scm_remember_upto_here_1 (path);
 
   return scm_from_bool (!rv);
 }
@@ -1643,14 +1636,15 @@ SCM_DEFINE (scm_canonicalize_path, "canonicalize-path", 1, 0, 0,
             "Raises an error if any component of @var{path} does not exist.")
 #define FUNC_NAME s_scm_canonicalize_path
 {
-  char *str, *canon;
-  
+  char *canon;
+  const char *str;
+
   SCM_VALIDATE_STRING (1, path);
 
-  str = scm_to_locale_string (path);
+  str = scm_locale_string_data (path);
   canon = canonicalize_file_name (str);
-  free (str);
-  
+  scm_remember_upto_here_1 (path);
+
   if (canon)
     return scm_take_locale_string (canon);
   else
@@ -1664,11 +1658,12 @@ scm_i_relativize_path (SCM path, SCM in_path)
   SCM scanon;
   
   {
-    char *str, *canon;
+    char *canon;
+    const char *str;
 
-    str = scm_to_locale_string (path);
+    str = scm_locale_string_data (path);
     canon = canonicalize_file_name (str);
-    free (str);
+    scm_remember_upto_here_1 (path);
 
     if (!canon)
       return SCM_BOOL_F;
@@ -1683,12 +1678,13 @@ scm_i_relativize_path (SCM path, SCM in_path)
 
       /* Try to canonicalize DIR, since we have canonicalized PATH.  */
       {
-        char *str, *canon;
+        char *canon;
+        const char *str;
 
-        str = scm_to_locale_string (dir);
+        str = scm_locale_string_data (dir);
         canon = canonicalize_file_name (str);
-        free (str);
-  
+        scm_remember_upto_here_1 (dir);
+
         if (canon)
           dir = scm_from_locale_string (canon);
         free (canon);
diff --git a/libguile/foreign.c b/libguile/foreign.c
index 1368cc9da..95d5ad9dd 100644
--- a/libguile/foreign.c
+++ b/libguile/foreign.c
@@ -1,4 +1,4 @@
-/* Copyright 2010-2016,2018
+/* Copyright 2010-2016,2018,2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -386,22 +386,19 @@ SCM_DEFINE (scm_string_to_pointer, "string->pointer", 1, 1, 0,
     return scm_from_pointer (scm_to_locale_string (string), free);
   else
     {
-      char *enc;
+      const char *enc;
       SCM ret;
       
       SCM_VALIDATE_STRING (2, encoding);
 
-      enc = scm_to_locale_string (encoding);
-      scm_dynwind_begin (0);
-      scm_dynwind_free (enc);
+      enc = scm_locale_string_data (encoding);
 
       ret = scm_from_pointer
         (scm_to_stringn (string, NULL, enc,
                          scm_i_default_string_failed_conversion_handler ()),
          free);
 
-      scm_dynwind_end ();
-
+      scm_remember_upto_here_1 (encoding);
       return ret;
     }
 }
diff --git a/libguile/fports.c b/libguile/fports.c
index 4a3c30b88..5c420934e 100644
--- a/libguile/fports.c
+++ b/libguile/fports.c
@@ -151,12 +151,13 @@ fport_canonicalize_filename (SCM filename)
     }
   else if (scm_is_eq (mode, sym_absolute))
     {
-      char *str, *canon;
-  
-      str = scm_to_locale_string (filename);
+      char *canon;
+      const char *str;
+
+      str = scm_locale_string_data (filename);
       canon = canonicalize_file_name (str);
-      free (str);
-  
+      scm_remember_upto_here_1 (filename);
+
       return canon ? scm_take_locale_string (canon) : filename;
     }
   else
@@ -243,16 +244,13 @@ scm_open_file_with_encoding (SCM filename, SCM mode,
   SCM port;
   int fdes, flags, binary = 0;
   unsigned int retries;
-  char *file;
+  const char *file;
 
   if (SCM_UNLIKELY (!(scm_is_false (encoding) || scm_is_string (encoding))))
     scm_wrong_type_arg_msg (FUNC_NAME, 0, encoding,
                             "encoding to be string or false");
 
-  scm_dynwind_begin (0);
-
-  file = scm_to_locale_string (filename);
-  scm_dynwind_free (file);
+  file = scm_locale_string_data (filename);
 
   flags = scm_i_mode_to_open_flags (mode, &binary, FUNC_NAME);
 
@@ -321,8 +319,6 @@ scm_open_file_with_encoding (SCM filename, SCM mode,
         scm_i_set_port_encoding_x (port, enc);
     }
 
-  scm_dynwind_end ();
-
   return port;
 }
 #undef FUNC_NAME
diff --git a/libguile/gettext.c b/libguile/gettext.c
index b9af4d313..4f30b3a14 100644
--- a/libguile/gettext.c
+++ b/libguile/gettext.c
@@ -1,4 +1,4 @@
-/* Copyright 2004,2006,2018
+/* Copyright 2004,2006,2018,2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -94,14 +94,11 @@ SCM_DEFINE (scm_gettext, "gettext", 1, 2, 0,
 	    "and defaults to LC_MESSAGES.")
 #define FUNC_NAME s_scm_gettext
 {
-  char *c_msgid;
+  const char *c_msgid;
   char const *c_result;
   SCM result;
 
-  scm_dynwind_begin (0);
-
-  c_msgid = scm_to_locale_string (msgid);
-  scm_dynwind_free (c_msgid);
+  c_msgid = scm_locale_string_data (msgid);
 
   if (SCM_UNBNDP (domain))
     {
@@ -110,10 +107,9 @@ SCM_DEFINE (scm_gettext, "gettext", 1, 2, 0,
     }
   else
     {
-      char *c_domain;
+      const char *c_domain;
 
-      c_domain = scm_to_locale_string (domain);
-      scm_dynwind_free (c_domain);
+      c_domain = scm_locale_string_data (domain);
 
       if (SCM_UNBNDP (category))
 	{
@@ -135,7 +131,7 @@ SCM_DEFINE (scm_gettext, "gettext", 1, 2, 0,
   else
     result = scm_from_locale_string (c_result);
 
-  scm_dynwind_end ();
+  scm_remember_upto_here_2 (msgid, domain);
   return result;
 }
 #undef FUNC_NAME
@@ -150,19 +146,13 @@ SCM_DEFINE (scm_ngettext, "ngettext", 3, 2, 0,
 	    "@var{category} is optional and defaults to LC_MESSAGES.")
 #define FUNC_NAME s_scm_ngettext
 {
-  char *c_msgid;
-  char *c_msgid_plural;
+  const char *c_msgid, *c_msgid_plural;
   unsigned long c_n;
   const char *c_result;
   SCM result;
 
-  scm_dynwind_begin (0);
-
-  c_msgid = scm_to_locale_string (msgid);
-  scm_dynwind_free (c_msgid);
-
-  c_msgid_plural = scm_to_locale_string (msgid_plural);
-  scm_dynwind_free (c_msgid_plural);
+  c_msgid = scm_locale_string_data (msgid);
+  c_msgid_plural = scm_locale_string_data (msgid_plural);
 
   c_n = scm_to_ulong (n);
 
@@ -173,10 +163,9 @@ SCM_DEFINE (scm_ngettext, "ngettext", 3, 2, 0,
     }
   else
     {
-      char *c_domain;
+      const char *c_domain;
 
-      c_domain = scm_to_locale_string (domain);
-      scm_dynwind_free (c_domain);
+      c_domain = scm_locale_string_data (domain);
 
       if (SCM_UNBNDP (category))
 	{
@@ -200,8 +189,9 @@ SCM_DEFINE (scm_ngettext, "ngettext", 3, 2, 0,
     result = msgid_plural;
   else
     result = scm_from_locale_string (c_result);
-  
-  scm_dynwind_end ();
+
+  scm_remember_upto_here_2 (msgid, msgid_plural);
+  scm_remember_upto_here_1 (domain);
   return result;
 }
 #undef FUNC_NAME
@@ -214,18 +204,13 @@ SCM_DEFINE (scm_textdomain, "textdomain", 0, 1, 0,
 #define FUNC_NAME s_scm_textdomain
 {
   char const *c_result;
-  char *c_domain;
+  const char *c_domain;
   SCM result = SCM_BOOL_F;
 
-  scm_dynwind_begin (0);
-
   if (SCM_UNBNDP (domainname))
     c_domain = NULL;
   else
-    {
-      c_domain = scm_to_locale_string (domainname);
-      scm_dynwind_free (c_domain);
-    }
+    c_domain = scm_locale_string_data (domainname);
 
   c_result = textdomain (c_domain);
   if (c_result != NULL)
@@ -233,7 +218,7 @@ SCM_DEFINE (scm_textdomain, "textdomain", 0, 1, 0,
   else if (!SCM_UNBNDP (domainname))
     SCM_SYSERROR;
 
-  scm_dynwind_end ();
+  scm_remember_upto_here_1 (domainname);
   return result;
 }
 #undef FUNC_NAME
@@ -245,23 +230,16 @@ SCM_DEFINE (scm_bindtextdomain, "bindtextdomain", 1, 1, 0,
 	    "Return the directory bound to @var{domainname}.")
 #define FUNC_NAME s_scm_bindtextdomain
 {
-  char *c_domain;
-  char *c_directory;
+  const char *c_domain, *c_directory;
   char const *c_result;
   SCM result;
 
-  scm_dynwind_begin (0);
-
   if (SCM_UNBNDP (directory))
     c_directory = NULL;
   else
-    {
-      c_directory = scm_to_locale_string (directory);
-      scm_dynwind_free (c_directory);
-    }
+    c_directory = scm_locale_string_data (directory);
 
-  c_domain = scm_to_locale_string (domainname);
-  scm_dynwind_free (c_domain);
+  c_domain = scm_locale_string_data (domainname);
 
   c_result = bindtextdomain (c_domain, c_directory);
 
@@ -272,7 +250,7 @@ SCM_DEFINE (scm_bindtextdomain, "bindtextdomain", 1, 1, 0,
   else
     result = SCM_BOOL_F;
 
-  scm_dynwind_end ();
+  scm_remember_upto_here_2 (domainname, directory);
   return result;
 }
 #undef FUNC_NAME
@@ -284,23 +262,16 @@ SCM_DEFINE (scm_bind_textdomain_codeset, "bind-textdomain-codeset", 1, 1, 0,
 	    "Return the encoding of @var{domainname}.")
 #define FUNC_NAME s_scm_bind_textdomain_codeset
 {
-  char *c_domain;
-  char *c_encoding;
+  const char *c_domain, *c_encoding;
   char const *c_result;
   SCM result;
 
-  scm_dynwind_begin (0);
-
   if (SCM_UNBNDP (encoding))
     c_encoding = NULL;
   else
-    {
-      c_encoding = scm_to_locale_string (encoding);
-      scm_dynwind_free (c_encoding);
-    }
+    c_encoding = scm_locale_string_data (encoding);
 
-  c_domain = scm_to_locale_string (domainname);
-  scm_dynwind_free (c_domain);
+  c_domain = scm_locale_string_data (domainname);
 
   c_result = bind_textdomain_codeset (c_domain, c_encoding);
 
@@ -311,7 +282,7 @@ SCM_DEFINE (scm_bind_textdomain_codeset, "bind-textdomain-codeset", 1, 1, 0,
   else
     result = SCM_BOOL_F;
 
-  scm_dynwind_end ();
+  scm_remember_upto_here_2 (domainname, encoding);
   return result;
 }
 #undef FUNC_NAME
diff --git a/libguile/load.c b/libguile/load.c
index e95c36db1..71a3e8754 100644
--- a/libguile/load.c
+++ b/libguile/load.c
@@ -1,4 +1,4 @@
-/* Copyright 1995-1996,1998-2001,2004,2006,2008-2019
+/* Copyright 1995-1996,1998-2001,2004,2006,2008-2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -460,7 +460,7 @@ stringbuf_cat_locale_string (struct stringbuf *buf, SCM str)
 }
 
 static void
-stringbuf_cat (struct stringbuf *buf, char *str)
+stringbuf_cat (struct stringbuf *buf, const char *str)
 {
   size_t max_len = buf->buf_len - (buf->ptr - buf->buf) - 1;
   size_t len = strlen (str);
@@ -648,7 +648,7 @@ load_thunk_from_path (SCM filename, SCM source_file_name,
 {
   struct stringbuf buf;
   struct stat stat_buf;
-  char *filename_chars;
+  const char *filename_chars;
   size_t filename_len;
   SCM path, extensions;
   SCM result = SCM_BOOL_F;
@@ -664,11 +664,8 @@ load_thunk_from_path (SCM filename, SCM source_file_name,
     scm_misc_error ("%search-path", "bad extensions list: ~a",
                     scm_list_1 (extensions));
 
-  scm_dynwind_begin (0);
-
-  filename_chars = scm_to_locale_string (filename);
+  filename_chars = scm_locale_string_data (filename);
   filename_len = strlen (filename_chars);
-  scm_dynwind_free (filename_chars);
 
   /* If FILENAME is absolute and is still valid, return it unchanged.  */
   if (is_absolute_file_name (filename))
@@ -682,7 +679,7 @@ load_thunk_from_path (SCM filename, SCM source_file_name,
 
   /* If FILENAME has an extension, don't try to add EXTENSIONS to it.  */
   {
-    char *endp;
+    const char *endp;
 
     for (endp = filename_chars + filename_len - 1;
 	 endp >= filename_chars;
@@ -787,7 +784,7 @@ load_thunk_from_path (SCM filename, SCM source_file_name,
     scm_wrong_type_arg_msg (NULL, 0, path, "proper list");
 
  end:
-  scm_dynwind_end ();
+  scm_remember_upto_here_1 (filename);
   return result;
 }
 
@@ -804,7 +801,7 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts,
              struct stat *stat_buf)
 {
   struct stringbuf buf;
-  char *filename_chars;
+  const char *filename_chars;
   size_t filename_len;
   SCM result = SCM_BOOL_F;
   char initial_buffer[256];
@@ -816,11 +813,8 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts,
     scm_misc_error ("%search-path", "bad extensions list: ~a",
                     scm_list_1 (extensions));
 
-  scm_dynwind_begin (0);
-
-  filename_chars = scm_to_locale_string (filename);
+  filename_chars = scm_locale_string_data (filename);
   filename_len = strlen (filename_chars);
-  scm_dynwind_free (filename_chars);
 
   /* If FILENAME is absolute and is still valid, return it unchanged.  */
   if (is_absolute_file_name (filename))
@@ -835,7 +829,7 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts,
 
   /* If FILENAME has an extension, don't try to add EXTENSIONS to it.  */
   {
-    char *endp;
+    const char *endp;
 
     for (endp = filename_chars + filename_len - 1;
 	 endp >= filename_chars;
@@ -918,7 +912,7 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts,
     scm_wrong_type_arg_msg (NULL, 0, path, "proper list");
 
  end:
-  scm_dynwind_end ();
+  scm_remember_upto_here_1 (filename);
   return result;
 }
 
diff --git a/libguile/net_db.c b/libguile/net_db.c
index dfb61e8d0..711b52bd9 100644
--- a/libguile/net_db.c
+++ b/libguile/net_db.c
@@ -1,5 +1,5 @@
 /* "net_db.c" network database support
-   Copyright 1995-2001,2006,2009-2013,2018
+   Copyright 1995-2001,2006,2009-2013,2018,2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -165,9 +165,9 @@ SCM_DEFINE (scm_gethost, "gethost", 0, 1, 0,
     }
   else if (scm_is_string (host))
     {
-      char *str = scm_to_locale_string (host);
+      const char *str = scm_locale_string_data (host);
       entry = gethostbyname (str);
-      free (str);
+      scm_remember_upto_here_1 (host);
     }
   else
     {
@@ -239,10 +239,10 @@ SCM_DEFINE (scm_getnet, "getnet", 0, 1, 0,
     }
   else if (scm_is_string (net))
     {
-      char *str = scm_to_locale_string (net);
+      const char *str = scm_locale_string_data (net);
       entry = getnetbyname (str);
       eno = errno;
-      free (str);
+      scm_remember_upto_here_1 (net);
     }
   else
     {
@@ -291,10 +291,10 @@ SCM_DEFINE (scm_getproto, "getproto", 0, 1, 0,
     }
   else if (scm_is_string (protocol))
     {
-      char *str = scm_to_locale_string (protocol);
+      const char *str = scm_locale_string_data (protocol);
       entry = getprotobyname (str);
       eno = errno;
-      free (str);
+      scm_remember_upto_here_1 (protocol);
     }
   else
     {
@@ -341,7 +341,7 @@ SCM_DEFINE (scm_getserv, "getserv", 0, 2, 0,
 #define FUNC_NAME s_scm_getserv
 {
   struct servent *entry;
-  char *protoname;
+  const char *protoname;
   int eno;
 
   if (SCM_UNBNDP (name))
@@ -357,17 +357,14 @@ SCM_DEFINE (scm_getserv, "getserv", 0, 2, 0,
       return scm_return_entry (entry);
     }
 
-  scm_dynwind_begin (0);
-
-  protoname = scm_to_locale_string (protocol);
-  scm_dynwind_free (protoname);
+  protoname = scm_locale_string_data (protocol);
 
   if (scm_is_string (name))
     {
-      char *str = scm_to_locale_string (name);
+      const char *str = scm_locale_string_data (name);
       entry = getservbyname (str, protoname);
       eno = errno;
-      free (str);
+      scm_remember_upto_here_1 (name);
     }
   else
     {
@@ -378,7 +375,7 @@ SCM_DEFINE (scm_getserv, "getserv", 0, 2, 0,
   if (!entry)
     SCM_SYSERROR_MSG("no such service ~A", scm_list_1 (name), eno);
 
-  scm_dynwind_end ();
+  scm_remember_upto_here_1 (protoname);
   return scm_return_entry (entry);
 }
 #undef FUNC_NAME
@@ -607,7 +604,7 @@ SCM_DEFINE (scm_getaddrinfo, "getaddrinfo", 1, 5, 0,
 #define FUNC_NAME s_scm_getaddrinfo
 {
   int err;
-  char *c_name, *c_service;
+  const char *c_name, *c_service;
   struct addrinfo c_hints, *c_result;
   SCM result = SCM_EOL;
 
@@ -617,21 +614,13 @@ SCM_DEFINE (scm_getaddrinfo, "getaddrinfo", 1, 5, 0,
   if (!SCM_UNBNDP (service) && scm_is_true (service))
     SCM_VALIDATE_STRING (SCM_ARG2, service);
 
-  scm_dynwind_begin (0);
-
   if (scm_is_string (name))
-    {
-      c_name = scm_to_locale_string (name);
-      scm_dynwind_free (c_name);
-    }
+    c_name = scm_locale_string_data (name);
   else
     c_name = NULL;
 
   if (scm_is_string (service))
-    {
-      c_service = scm_to_locale_string (service);
-      scm_dynwind_free (c_service);
-    }
+    c_service = scm_locale_string_data (service);
   else
     c_service = NULL;
 
@@ -667,8 +656,7 @@ SCM_DEFINE (scm_getaddrinfo, "getaddrinfo", 1, 5, 0,
   else
     scm_throw (sym_getaddrinfo_error, scm_list_1 (scm_from_int (err)));
 
-  scm_dynwind_end ();
-
+  scm_remember_upto_here_2 (name, service);
   return result;
 }
 #undef FUNC_NAME
diff --git a/libguile/strings.c b/libguile/strings.c
index aab104498..175fdff6c 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -1,4 +1,4 @@
-/* Copyright 1995-1996,1998,2000-2001,2004,2006,2008-2016,2018-2019
+/* Copyright 1995-1996,1998,2000-2001,2004,2006,2008-2016,2018-2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -580,6 +580,51 @@ scm_i_string_data (SCM str)
   return data;
 }
 
+/* Return true if the LEN bytes at STR consist of pure ASCII.  */
+static int
+is_ascii (const char *str, size_t len)
+{
+  for (size_t i = 0; i < len; i++)
+    {
+      if (str[i] > 127)
+        return 0;
+    }
+
+  return 1;
+}
+
+/* Return a C null-terminated string in locale encoding corresponding to
+   STR.  When possible, avoid heap allocation; in other cases, return a
+   pointer to GC-managed memory.
+
+   Note: The returned pointer can be an interior pointer in the
+   stringbuf of STR.  Thus, the caller must ensure that STR is live
+   while it accesses the returned pointer.  */
+const char *
+scm_locale_string_data (SCM str)
+{
+  const void *data = scm_i_string_data (str);
+  size_t len = scm_i_string_length (str);
+
+  /* Per <https://pubs.opengroup.org/onlinepubs/007908799/xbd/charset.html>,
+     locale encodings must be a superset of the "portable character
+     set", essentially ASCII.  When STR is ASCII and nul-terminated,
+     return its pointer right away.  */
+  if (SCM_LIKELY (scm_i_is_narrow_string (str)
+                  && ((char *) data)[len] == '\0'
+                  && len < 1024       /* because 'is_ascii' is linear */
+                  && is_ascii (data, len)))
+    return data;
+
+  /* FIXME: The worst case is really bad since we end up allocating
+     twice.  */
+  char *buf = scm_to_locale_stringn (str, &len);
+  char *result = scm_gc_strndup (buf, len, "locale string");
+  free (buf);
+
+  return result;
+}
+
 /* Returns a pointer to the 8-bit Latin-1 encoded character array of
    STR.  */
 const char *
diff --git a/libguile/strings.h b/libguile/strings.h
index 3f92d8c89..938b13c37 100644
--- a/libguile/strings.h
+++ b/libguile/strings.h
@@ -1,7 +1,7 @@
 #ifndef SCM_STRINGS_H
 #define SCM_STRINGS_H
 
-/* Copyright 1995-1998,2000-2001,2004-2006,2008-2011,2013,2015-2019
+/* Copyright 1995-1998,2000-2001,2004-2006,2008-2011,2013,2015-2020
      Free Software Foundation, Inc.
 
    This file is part of Guile.
@@ -239,6 +239,7 @@ SCM_API /* FIXME: not internal */ const char *scm_i_string_chars (SCM str);
 SCM_API /* FIXME: not internal */ char *scm_i_string_writable_chars (SCM str);
 SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str);
 SCM_INTERNAL const void *scm_i_string_data (SCM str);
+SCM_INTERNAL const char *scm_locale_string_data (SCM str);
 
 SCM_INTERNAL SCM scm_i_string_start_writing (SCM str);
 SCM_INTERNAL void scm_i_string_stop_writing (void);

  reply	other threads:[~2020-05-17 21:46 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-29 20:22 [PATCH] Avoid malloc/free/dynwind for POSIX subrs that take strings Ludovic Courtès
2020-05-05  7:23 ` Andy Wingo
2020-05-17 21:46   ` Ludovic Courtès [this message]
2020-05-18  6:50     ` Andy Wingo
2020-05-22 20:47       ` Ludovic Courtès

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/guile/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871rninskn.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=guile-devel@gnu.org \
    --cc=wingo@igalia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).