From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.ciao.gmane.io!not-for-mail From: =?utf-8?Q?Ludovic_Court=C3=A8s?= Newsgroups: gmane.lisp.guile.devel Subject: Re: [PATCH] Avoid malloc/free/dynwind for POSIX subrs that take strings Date: Sun, 17 May 2020 23:46:00 +0200 Message-ID: <871rninskn.fsf@gnu.org> References: <87h7x2qbxa.fsf@inria.fr> <87a72m26bv.fsf@igalia.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Injection-Info: ciao.gmane.io; posting-host="ciao.gmane.io:159.69.161.202"; logging-data="95670"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) Cc: guile-devel@gnu.org To: Andy Wingo Original-X-From: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Sun May 17 23:49:59 2020 Return-path: Envelope-to: guile-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1jaRAU-000Om2-Qw for guile-devel@m.gmane-mx.org; Sun, 17 May 2020 23:49:59 +0200 Original-Received: from localhost ([::1]:36128 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jaRAT-0005js-PY for guile-devel@m.gmane-mx.org; Sun, 17 May 2020 17:49:57 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]:60620) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1jaR6n-0005J8-2D for guile-devel@gnu.org; Sun, 17 May 2020 17:46:16 -0400 Original-Received: from fencepost.gnu.org ([2001:470:142:3::e]:40546) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jaR6l-0002HY-4Q; Sun, 17 May 2020 17:46:07 -0400 Original-Received: from [2a01:e0a:1d:7270:af76:b9b:ca24:c465] (port=35910 helo=ribbon) by fencepost.gnu.org with esmtpsa (TLS1.2:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1jaR6f-0002dJ-S6; Sun, 17 May 2020 17:46:05 -0400 X-URL: http://www.fdn.fr/~lcourtes/ X-Revolutionary-Date: 29 =?utf-8?Q?Flor=C3=A9al?= an 228 de la =?utf-8?Q?R?= =?utf-8?Q?=C3=A9volution?= X-PGP-Key-ID: 0x090B11993D9AEBB5 X-PGP-Key: http://www.fdn.fr/~lcourtes/ludovic.asc X-PGP-Fingerprint: 3CE4 6455 8A84 FDC6 9DB4 0CFB 090B 1199 3D9A EBB5 X-OS: x86_64-pc-linux-gnu In-Reply-To: <87a72m26bv.fsf@igalia.com> (Andy Wingo's message of "Tue, 05 May 2020 09:23:32 +0200") X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane-mx.org@gnu.org Original-Sender: "guile-devel" Xref: news.gmane.io gmane.lisp.guile.devel:20524 Archived-At: --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Hi, Andy Wingo skribis: > On Wed 29 Apr 2020 22:22, Ludovic Court=C3=A8s writes: > >> As discussed on IRC, the patch below arranges so that subrs that take >> strings and pass them to syscall wrappers can avoid the >> malloc/free/dynwind overhead. This gives a 10% speedup on a tight loop >> that calls these subrs: > > Neat optimization. Since it's internal, no problem from me. My concern > is only about calcifying aspects of our C / GC interface in API. > >> On IRC, Andy mentioned concerns that the SCM could disappear and thus, >> our the internal pointer returned by =E2=80=98scm_locale_string_data=E2= =80=99 wouldn=E2=80=99t >> be enough to prevent the stringbuf from being GC=E2=80=99d. > > I would suggest doing the whole optimization and being sure to do > scm_remember_upto_here_1 on the SCM value at use sites. Here=E2=80=99s an updated patch that does that. The fallback case in =E2=80=98scm_locale_string_data=E2=80=99 is unfortunat= ely not as nice as I wrote before, because it has to call =E2=80=98scm_to_stringn=E2= =80=99, which mallocs the string, which must then be copied to GC-managed storage, argh. (The patch I sent earlier didn=E2=80=99t take care of the encoding conversion in the fallback case.) The libunistring functions can take a pre-allocated buffer, but they always malloc a fresh one if needed. So the best we could do is have a =E2=80=98scm_to_stringn=E2=80=99 variant that takes a buffer, but it=E2=80= =99s not guaranteed that it=E2=80=99ll actually be used. All in all, it seems the added comple= xity is not warranted. The worst case of =E2=80=98scm_locale_string_data=E2=80= =99 is also rare enough. Thoughts? Ludo=E2=80=99. --=-=-= Content-Type: text/x-patch Content-Disposition: inline diff --git a/libguile/dynl.c b/libguile/dynl.c index e9c03e95b..784b04b5e 100644 --- a/libguile/dynl.c +++ b/libguile/dynl.c @@ -1,6 +1,6 @@ /* dynl.c - dynamic linking - Copyright 1990-2003,2008-2011,2017-2018 + Copyright 1990-2003,2008-2011,2017-2018,2020 Free Software Foundation, Inc. This file is part of Guile. @@ -233,7 +233,7 @@ SCM_DEFINE (scm_dynamic_link, "dynamic-link", 0, 1, 0, #define FUNC_NAME s_scm_dynamic_link { void *handle; - char *file; + const char *file; scm_dynwind_begin (0); scm_i_dynwind_pthread_mutex_lock (<dl_lock); @@ -241,10 +241,7 @@ SCM_DEFINE (scm_dynamic_link, "dynamic-link", 0, 1, 0, if (SCM_UNBNDP (filename)) file = NULL; else - { - file = scm_to_locale_string (filename); - scm_dynwind_free (file); - } + file = scm_locale_string_data (filename); handle = sysdep_dynl_link (file, FUNC_NAME); scm_dynwind_end (); @@ -315,13 +312,13 @@ SCM_DEFINE (scm_dynamic_pointer, "dynamic-pointer", 2, 0, 0, SCM_MISC_ERROR ("Already unlinked: ~S", dobj); else { - char *chars; + const char *chars; scm_dynwind_begin (0); scm_i_dynwind_pthread_mutex_lock (<dl_lock); - chars = scm_to_locale_string (name); - scm_dynwind_free (chars); + chars = scm_locale_string_data (name); val = sysdep_dynl_value (chars, DYNL_HANDLE (dobj), FUNC_NAME); + scm_remember_upto_here_1 (name); scm_dynwind_end (); return scm_from_pointer (val, NULL); diff --git a/libguile/filesys.c b/libguile/filesys.c index 4f7115397..5d2965b1b 100644 --- a/libguile/filesys.c +++ b/libguile/filesys.c @@ -1,4 +1,4 @@ -/* Copyright 1996-2002,2004,2006,2009-2019 +/* Copyright 1996-2002,2004,2006,2009-2020 Free Software Foundation, Inc. This file is part of Guile. @@ -118,25 +118,24 @@ /* Two helper macros for an often used pattern */ -#define STRING_SYSCALL(str,cstr,code) \ - do { \ - int eno; \ - char *cstr = scm_to_locale_string (str); \ - SCM_SYSCALL (code); \ - eno = errno; free (cstr); errno = eno; \ +#define STRING_SYSCALL(str,cstr,code) \ + do { \ + int eno; \ + const char *cstr = scm_locale_string_data (str); \ + SCM_SYSCALL (code); \ + scm_remember_upto_here_1 (str); \ + eno = errno; errno = eno; \ } while (0) -#define STRING2_SYSCALL(str1,cstr1,str2,cstr2,code) \ - do { \ - int eno; \ - char *cstr1, *cstr2; \ - scm_dynwind_begin (0); \ - cstr1 = scm_to_locale_string (str1); \ - scm_dynwind_free (cstr1); \ - cstr2 = scm_to_locale_string (str2); \ - scm_dynwind_free (cstr2); \ - SCM_SYSCALL (code); \ - eno = errno; scm_dynwind_end (); errno = eno; \ +#define STRING2_SYSCALL(str1,cstr1,str2,cstr2,code) \ + do { \ + int eno; \ + const char *cstr1, *cstr2; \ + cstr1 = scm_locale_string_data (str1); \ + cstr2 = scm_locale_string_data (str2); \ + SCM_SYSCALL (code); \ + scm_remember_upto_here_2 (str1, str2); \ + eno = errno; errno = eno; \ } while (0) @@ -536,9 +535,8 @@ SCM_DEFINE (scm_stat, "stat", 1, 1, 0, } else if (scm_is_string (object)) { - char *file = scm_to_locale_string (object); + const char *file = scm_locale_string_data (object); SCM_SYSCALL (rv = stat_or_stat64 (file, &stat_temp)); - free (file); } else { @@ -1038,12 +1036,9 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0, int size = 100; char *buf; SCM result; - char *c_path; - - scm_dynwind_begin (0); + const char *c_path; - c_path = scm_to_locale_string (path); - scm_dynwind_free (c_path); + c_path = scm_locale_string_data (path); buf = scm_malloc (size); @@ -1060,9 +1055,10 @@ SCM_DEFINE (scm_readlink, "readlink", 1, 0, 0, errno = save_errno; SCM_SYSERROR; } + + scm_remember_upto_here_1 (path); result = scm_take_locale_stringn (buf, rv); - scm_dynwind_end (); return result; } #undef FUNC_NAME @@ -1073,18 +1069,14 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0, "The return value is unspecified.") #define FUNC_NAME s_scm_copy_file { - char *c_oldfile, *c_newfile; + const char *c_oldfile, *c_newfile; int oldfd, newfd; int n, rv; char buf[BUFSIZ]; struct stat_or_stat64 oldstat; - scm_dynwind_begin (0); - - c_oldfile = scm_to_locale_string (oldfile); - scm_dynwind_free (c_oldfile); - c_newfile = scm_to_locale_string (newfile); - scm_dynwind_free (c_newfile); + c_oldfile = scm_locale_string_data (oldfile); + c_newfile = scm_locale_string_data (newfile); oldfd = open_or_open64 (c_oldfile, O_RDONLY | O_BINARY); if (oldfd == -1) @@ -1104,6 +1096,8 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0, SCM_SYSERROR; } + scm_remember_upto_here_2 (oldfile, newfile); + while ((n = read (oldfd, buf, sizeof buf)) > 0) if (write (newfd, buf, n) != n) { @@ -1115,7 +1109,6 @@ SCM_DEFINE (scm_copy_file, "copy-file", 2, 0, 0, if (close (newfd) == -1) SCM_SYSERROR; - scm_dynwind_end (); return SCM_UNSPECIFIED; } #undef FUNC_NAME @@ -1384,11 +1377,11 @@ SCM_DEFINE (scm_access, "access?", 2, 0, 0, #define FUNC_NAME s_scm_access { int rv; - char *c_path; + const char *c_path; - c_path = scm_to_locale_string (path); + c_path = scm_locale_string_data (path); rv = access (c_path, scm_to_int (how)); - free (c_path); + scm_remember_upto_here_1 (path); return scm_from_bool (!rv); } @@ -1643,14 +1636,15 @@ SCM_DEFINE (scm_canonicalize_path, "canonicalize-path", 1, 0, 0, "Raises an error if any component of @var{path} does not exist.") #define FUNC_NAME s_scm_canonicalize_path { - char *str, *canon; - + char *canon; + const char *str; + SCM_VALIDATE_STRING (1, path); - str = scm_to_locale_string (path); + str = scm_locale_string_data (path); canon = canonicalize_file_name (str); - free (str); - + scm_remember_upto_here_1 (path); + if (canon) return scm_take_locale_string (canon); else @@ -1664,11 +1658,12 @@ scm_i_relativize_path (SCM path, SCM in_path) SCM scanon; { - char *str, *canon; + char *canon; + const char *str; - str = scm_to_locale_string (path); + str = scm_locale_string_data (path); canon = canonicalize_file_name (str); - free (str); + scm_remember_upto_here_1 (path); if (!canon) return SCM_BOOL_F; @@ -1683,12 +1678,13 @@ scm_i_relativize_path (SCM path, SCM in_path) /* Try to canonicalize DIR, since we have canonicalized PATH. */ { - char *str, *canon; + char *canon; + const char *str; - str = scm_to_locale_string (dir); + str = scm_locale_string_data (dir); canon = canonicalize_file_name (str); - free (str); - + scm_remember_upto_here_1 (dir); + if (canon) dir = scm_from_locale_string (canon); free (canon); diff --git a/libguile/foreign.c b/libguile/foreign.c index 1368cc9da..95d5ad9dd 100644 --- a/libguile/foreign.c +++ b/libguile/foreign.c @@ -1,4 +1,4 @@ -/* Copyright 2010-2016,2018 +/* Copyright 2010-2016,2018,2020 Free Software Foundation, Inc. This file is part of Guile. @@ -386,22 +386,19 @@ SCM_DEFINE (scm_string_to_pointer, "string->pointer", 1, 1, 0, return scm_from_pointer (scm_to_locale_string (string), free); else { - char *enc; + const char *enc; SCM ret; SCM_VALIDATE_STRING (2, encoding); - enc = scm_to_locale_string (encoding); - scm_dynwind_begin (0); - scm_dynwind_free (enc); + enc = scm_locale_string_data (encoding); ret = scm_from_pointer (scm_to_stringn (string, NULL, enc, scm_i_default_string_failed_conversion_handler ()), free); - scm_dynwind_end (); - + scm_remember_upto_here_1 (encoding); return ret; } } diff --git a/libguile/fports.c b/libguile/fports.c index 4a3c30b88..5c420934e 100644 --- a/libguile/fports.c +++ b/libguile/fports.c @@ -151,12 +151,13 @@ fport_canonicalize_filename (SCM filename) } else if (scm_is_eq (mode, sym_absolute)) { - char *str, *canon; - - str = scm_to_locale_string (filename); + char *canon; + const char *str; + + str = scm_locale_string_data (filename); canon = canonicalize_file_name (str); - free (str); - + scm_remember_upto_here_1 (filename); + return canon ? scm_take_locale_string (canon) : filename; } else @@ -243,16 +244,13 @@ scm_open_file_with_encoding (SCM filename, SCM mode, SCM port; int fdes, flags, binary = 0; unsigned int retries; - char *file; + const char *file; if (SCM_UNLIKELY (!(scm_is_false (encoding) || scm_is_string (encoding)))) scm_wrong_type_arg_msg (FUNC_NAME, 0, encoding, "encoding to be string or false"); - scm_dynwind_begin (0); - - file = scm_to_locale_string (filename); - scm_dynwind_free (file); + file = scm_locale_string_data (filename); flags = scm_i_mode_to_open_flags (mode, &binary, FUNC_NAME); @@ -321,8 +319,6 @@ scm_open_file_with_encoding (SCM filename, SCM mode, scm_i_set_port_encoding_x (port, enc); } - scm_dynwind_end (); - return port; } #undef FUNC_NAME diff --git a/libguile/gettext.c b/libguile/gettext.c index b9af4d313..4f30b3a14 100644 --- a/libguile/gettext.c +++ b/libguile/gettext.c @@ -1,4 +1,4 @@ -/* Copyright 2004,2006,2018 +/* Copyright 2004,2006,2018,2020 Free Software Foundation, Inc. This file is part of Guile. @@ -94,14 +94,11 @@ SCM_DEFINE (scm_gettext, "gettext", 1, 2, 0, "and defaults to LC_MESSAGES.") #define FUNC_NAME s_scm_gettext { - char *c_msgid; + const char *c_msgid; char const *c_result; SCM result; - scm_dynwind_begin (0); - - c_msgid = scm_to_locale_string (msgid); - scm_dynwind_free (c_msgid); + c_msgid = scm_locale_string_data (msgid); if (SCM_UNBNDP (domain)) { @@ -110,10 +107,9 @@ SCM_DEFINE (scm_gettext, "gettext", 1, 2, 0, } else { - char *c_domain; + const char *c_domain; - c_domain = scm_to_locale_string (domain); - scm_dynwind_free (c_domain); + c_domain = scm_locale_string_data (domain); if (SCM_UNBNDP (category)) { @@ -135,7 +131,7 @@ SCM_DEFINE (scm_gettext, "gettext", 1, 2, 0, else result = scm_from_locale_string (c_result); - scm_dynwind_end (); + scm_remember_upto_here_2 (msgid, domain); return result; } #undef FUNC_NAME @@ -150,19 +146,13 @@ SCM_DEFINE (scm_ngettext, "ngettext", 3, 2, 0, "@var{category} is optional and defaults to LC_MESSAGES.") #define FUNC_NAME s_scm_ngettext { - char *c_msgid; - char *c_msgid_plural; + const char *c_msgid, *c_msgid_plural; unsigned long c_n; const char *c_result; SCM result; - scm_dynwind_begin (0); - - c_msgid = scm_to_locale_string (msgid); - scm_dynwind_free (c_msgid); - - c_msgid_plural = scm_to_locale_string (msgid_plural); - scm_dynwind_free (c_msgid_plural); + c_msgid = scm_locale_string_data (msgid); + c_msgid_plural = scm_locale_string_data (msgid_plural); c_n = scm_to_ulong (n); @@ -173,10 +163,9 @@ SCM_DEFINE (scm_ngettext, "ngettext", 3, 2, 0, } else { - char *c_domain; + const char *c_domain; - c_domain = scm_to_locale_string (domain); - scm_dynwind_free (c_domain); + c_domain = scm_locale_string_data (domain); if (SCM_UNBNDP (category)) { @@ -200,8 +189,9 @@ SCM_DEFINE (scm_ngettext, "ngettext", 3, 2, 0, result = msgid_plural; else result = scm_from_locale_string (c_result); - - scm_dynwind_end (); + + scm_remember_upto_here_2 (msgid, msgid_plural); + scm_remember_upto_here_1 (domain); return result; } #undef FUNC_NAME @@ -214,18 +204,13 @@ SCM_DEFINE (scm_textdomain, "textdomain", 0, 1, 0, #define FUNC_NAME s_scm_textdomain { char const *c_result; - char *c_domain; + const char *c_domain; SCM result = SCM_BOOL_F; - scm_dynwind_begin (0); - if (SCM_UNBNDP (domainname)) c_domain = NULL; else - { - c_domain = scm_to_locale_string (domainname); - scm_dynwind_free (c_domain); - } + c_domain = scm_locale_string_data (domainname); c_result = textdomain (c_domain); if (c_result != NULL) @@ -233,7 +218,7 @@ SCM_DEFINE (scm_textdomain, "textdomain", 0, 1, 0, else if (!SCM_UNBNDP (domainname)) SCM_SYSERROR; - scm_dynwind_end (); + scm_remember_upto_here_1 (domainname); return result; } #undef FUNC_NAME @@ -245,23 +230,16 @@ SCM_DEFINE (scm_bindtextdomain, "bindtextdomain", 1, 1, 0, "Return the directory bound to @var{domainname}.") #define FUNC_NAME s_scm_bindtextdomain { - char *c_domain; - char *c_directory; + const char *c_domain, *c_directory; char const *c_result; SCM result; - scm_dynwind_begin (0); - if (SCM_UNBNDP (directory)) c_directory = NULL; else - { - c_directory = scm_to_locale_string (directory); - scm_dynwind_free (c_directory); - } + c_directory = scm_locale_string_data (directory); - c_domain = scm_to_locale_string (domainname); - scm_dynwind_free (c_domain); + c_domain = scm_locale_string_data (domainname); c_result = bindtextdomain (c_domain, c_directory); @@ -272,7 +250,7 @@ SCM_DEFINE (scm_bindtextdomain, "bindtextdomain", 1, 1, 0, else result = SCM_BOOL_F; - scm_dynwind_end (); + scm_remember_upto_here_2 (domainname, directory); return result; } #undef FUNC_NAME @@ -284,23 +262,16 @@ SCM_DEFINE (scm_bind_textdomain_codeset, "bind-textdomain-codeset", 1, 1, 0, "Return the encoding of @var{domainname}.") #define FUNC_NAME s_scm_bind_textdomain_codeset { - char *c_domain; - char *c_encoding; + const char *c_domain, *c_encoding; char const *c_result; SCM result; - scm_dynwind_begin (0); - if (SCM_UNBNDP (encoding)) c_encoding = NULL; else - { - c_encoding = scm_to_locale_string (encoding); - scm_dynwind_free (c_encoding); - } + c_encoding = scm_locale_string_data (encoding); - c_domain = scm_to_locale_string (domainname); - scm_dynwind_free (c_domain); + c_domain = scm_locale_string_data (domainname); c_result = bind_textdomain_codeset (c_domain, c_encoding); @@ -311,7 +282,7 @@ SCM_DEFINE (scm_bind_textdomain_codeset, "bind-textdomain-codeset", 1, 1, 0, else result = SCM_BOOL_F; - scm_dynwind_end (); + scm_remember_upto_here_2 (domainname, encoding); return result; } #undef FUNC_NAME diff --git a/libguile/load.c b/libguile/load.c index e95c36db1..71a3e8754 100644 --- a/libguile/load.c +++ b/libguile/load.c @@ -1,4 +1,4 @@ -/* Copyright 1995-1996,1998-2001,2004,2006,2008-2019 +/* Copyright 1995-1996,1998-2001,2004,2006,2008-2020 Free Software Foundation, Inc. This file is part of Guile. @@ -460,7 +460,7 @@ stringbuf_cat_locale_string (struct stringbuf *buf, SCM str) } static void -stringbuf_cat (struct stringbuf *buf, char *str) +stringbuf_cat (struct stringbuf *buf, const char *str) { size_t max_len = buf->buf_len - (buf->ptr - buf->buf) - 1; size_t len = strlen (str); @@ -648,7 +648,7 @@ load_thunk_from_path (SCM filename, SCM source_file_name, { struct stringbuf buf; struct stat stat_buf; - char *filename_chars; + const char *filename_chars; size_t filename_len; SCM path, extensions; SCM result = SCM_BOOL_F; @@ -664,11 +664,8 @@ load_thunk_from_path (SCM filename, SCM source_file_name, scm_misc_error ("%search-path", "bad extensions list: ~a", scm_list_1 (extensions)); - scm_dynwind_begin (0); - - filename_chars = scm_to_locale_string (filename); + filename_chars = scm_locale_string_data (filename); filename_len = strlen (filename_chars); - scm_dynwind_free (filename_chars); /* If FILENAME is absolute and is still valid, return it unchanged. */ if (is_absolute_file_name (filename)) @@ -682,7 +679,7 @@ load_thunk_from_path (SCM filename, SCM source_file_name, /* If FILENAME has an extension, don't try to add EXTENSIONS to it. */ { - char *endp; + const char *endp; for (endp = filename_chars + filename_len - 1; endp >= filename_chars; @@ -787,7 +784,7 @@ load_thunk_from_path (SCM filename, SCM source_file_name, scm_wrong_type_arg_msg (NULL, 0, path, "proper list"); end: - scm_dynwind_end (); + scm_remember_upto_here_1 (filename); return result; } @@ -804,7 +801,7 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts, struct stat *stat_buf) { struct stringbuf buf; - char *filename_chars; + const char *filename_chars; size_t filename_len; SCM result = SCM_BOOL_F; char initial_buffer[256]; @@ -816,11 +813,8 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts, scm_misc_error ("%search-path", "bad extensions list: ~a", scm_list_1 (extensions)); - scm_dynwind_begin (0); - - filename_chars = scm_to_locale_string (filename); + filename_chars = scm_locale_string_data (filename); filename_len = strlen (filename_chars); - scm_dynwind_free (filename_chars); /* If FILENAME is absolute and is still valid, return it unchanged. */ if (is_absolute_file_name (filename)) @@ -835,7 +829,7 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts, /* If FILENAME has an extension, don't try to add EXTENSIONS to it. */ { - char *endp; + const char *endp; for (endp = filename_chars + filename_len - 1; endp >= filename_chars; @@ -918,7 +912,7 @@ search_path (SCM path, SCM filename, SCM extensions, SCM require_exts, scm_wrong_type_arg_msg (NULL, 0, path, "proper list"); end: - scm_dynwind_end (); + scm_remember_upto_here_1 (filename); return result; } diff --git a/libguile/net_db.c b/libguile/net_db.c index dfb61e8d0..711b52bd9 100644 --- a/libguile/net_db.c +++ b/libguile/net_db.c @@ -1,5 +1,5 @@ /* "net_db.c" network database support - Copyright 1995-2001,2006,2009-2013,2018 + Copyright 1995-2001,2006,2009-2013,2018,2020 Free Software Foundation, Inc. This file is part of Guile. @@ -165,9 +165,9 @@ SCM_DEFINE (scm_gethost, "gethost", 0, 1, 0, } else if (scm_is_string (host)) { - char *str = scm_to_locale_string (host); + const char *str = scm_locale_string_data (host); entry = gethostbyname (str); - free (str); + scm_remember_upto_here_1 (host); } else { @@ -239,10 +239,10 @@ SCM_DEFINE (scm_getnet, "getnet", 0, 1, 0, } else if (scm_is_string (net)) { - char *str = scm_to_locale_string (net); + const char *str = scm_locale_string_data (net); entry = getnetbyname (str); eno = errno; - free (str); + scm_remember_upto_here_1 (net); } else { @@ -291,10 +291,10 @@ SCM_DEFINE (scm_getproto, "getproto", 0, 1, 0, } else if (scm_is_string (protocol)) { - char *str = scm_to_locale_string (protocol); + const char *str = scm_locale_string_data (protocol); entry = getprotobyname (str); eno = errno; - free (str); + scm_remember_upto_here_1 (protocol); } else { @@ -341,7 +341,7 @@ SCM_DEFINE (scm_getserv, "getserv", 0, 2, 0, #define FUNC_NAME s_scm_getserv { struct servent *entry; - char *protoname; + const char *protoname; int eno; if (SCM_UNBNDP (name)) @@ -357,17 +357,14 @@ SCM_DEFINE (scm_getserv, "getserv", 0, 2, 0, return scm_return_entry (entry); } - scm_dynwind_begin (0); - - protoname = scm_to_locale_string (protocol); - scm_dynwind_free (protoname); + protoname = scm_locale_string_data (protocol); if (scm_is_string (name)) { - char *str = scm_to_locale_string (name); + const char *str = scm_locale_string_data (name); entry = getservbyname (str, protoname); eno = errno; - free (str); + scm_remember_upto_here_1 (name); } else { @@ -378,7 +375,7 @@ SCM_DEFINE (scm_getserv, "getserv", 0, 2, 0, if (!entry) SCM_SYSERROR_MSG("no such service ~A", scm_list_1 (name), eno); - scm_dynwind_end (); + scm_remember_upto_here_1 (protoname); return scm_return_entry (entry); } #undef FUNC_NAME @@ -607,7 +604,7 @@ SCM_DEFINE (scm_getaddrinfo, "getaddrinfo", 1, 5, 0, #define FUNC_NAME s_scm_getaddrinfo { int err; - char *c_name, *c_service; + const char *c_name, *c_service; struct addrinfo c_hints, *c_result; SCM result = SCM_EOL; @@ -617,21 +614,13 @@ SCM_DEFINE (scm_getaddrinfo, "getaddrinfo", 1, 5, 0, if (!SCM_UNBNDP (service) && scm_is_true (service)) SCM_VALIDATE_STRING (SCM_ARG2, service); - scm_dynwind_begin (0); - if (scm_is_string (name)) - { - c_name = scm_to_locale_string (name); - scm_dynwind_free (c_name); - } + c_name = scm_locale_string_data (name); else c_name = NULL; if (scm_is_string (service)) - { - c_service = scm_to_locale_string (service); - scm_dynwind_free (c_service); - } + c_service = scm_locale_string_data (service); else c_service = NULL; @@ -667,8 +656,7 @@ SCM_DEFINE (scm_getaddrinfo, "getaddrinfo", 1, 5, 0, else scm_throw (sym_getaddrinfo_error, scm_list_1 (scm_from_int (err))); - scm_dynwind_end (); - + scm_remember_upto_here_2 (name, service); return result; } #undef FUNC_NAME diff --git a/libguile/strings.c b/libguile/strings.c index aab104498..175fdff6c 100644 --- a/libguile/strings.c +++ b/libguile/strings.c @@ -1,4 +1,4 @@ -/* Copyright 1995-1996,1998,2000-2001,2004,2006,2008-2016,2018-2019 +/* Copyright 1995-1996,1998,2000-2001,2004,2006,2008-2016,2018-2020 Free Software Foundation, Inc. This file is part of Guile. @@ -580,6 +580,51 @@ scm_i_string_data (SCM str) return data; } +/* Return true if the LEN bytes at STR consist of pure ASCII. */ +static int +is_ascii (const char *str, size_t len) +{ + for (size_t i = 0; i < len; i++) + { + if (str[i] > 127) + return 0; + } + + return 1; +} + +/* Return a C null-terminated string in locale encoding corresponding to + STR. When possible, avoid heap allocation; in other cases, return a + pointer to GC-managed memory. + + Note: The returned pointer can be an interior pointer in the + stringbuf of STR. Thus, the caller must ensure that STR is live + while it accesses the returned pointer. */ +const char * +scm_locale_string_data (SCM str) +{ + const void *data = scm_i_string_data (str); + size_t len = scm_i_string_length (str); + + /* Per , + locale encodings must be a superset of the "portable character + set", essentially ASCII. When STR is ASCII and nul-terminated, + return its pointer right away. */ + if (SCM_LIKELY (scm_i_is_narrow_string (str) + && ((char *) data)[len] == '\0' + && len < 1024 /* because 'is_ascii' is linear */ + && is_ascii (data, len))) + return data; + + /* FIXME: The worst case is really bad since we end up allocating + twice. */ + char *buf = scm_to_locale_stringn (str, &len); + char *result = scm_gc_strndup (buf, len, "locale string"); + free (buf); + + return result; +} + /* Returns a pointer to the 8-bit Latin-1 encoded character array of STR. */ const char * diff --git a/libguile/strings.h b/libguile/strings.h index 3f92d8c89..938b13c37 100644 --- a/libguile/strings.h +++ b/libguile/strings.h @@ -1,7 +1,7 @@ #ifndef SCM_STRINGS_H #define SCM_STRINGS_H -/* Copyright 1995-1998,2000-2001,2004-2006,2008-2011,2013,2015-2019 +/* Copyright 1995-1998,2000-2001,2004-2006,2008-2011,2013,2015-2020 Free Software Foundation, Inc. This file is part of Guile. @@ -239,6 +239,7 @@ SCM_API /* FIXME: not internal */ const char *scm_i_string_chars (SCM str); SCM_API /* FIXME: not internal */ char *scm_i_string_writable_chars (SCM str); SCM_INTERNAL const scm_t_wchar *scm_i_string_wide_chars (SCM str); SCM_INTERNAL const void *scm_i_string_data (SCM str); +SCM_INTERNAL const char *scm_locale_string_data (SCM str); SCM_INTERNAL SCM scm_i_string_start_writing (SCM str); SCM_INTERNAL void scm_i_string_stop_writing (void); --=-=-=--