From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: Mark H Weaver Newsgroups: gmane.lisp.guile.devel Subject: [PATCH] Add internal-only port structure; move iconv descriptors there Date: Sun, 31 Mar 2013 03:52:53 -0400 Message-ID: <87zjxk57l6.fsf_-_@tines.lan> References: <87vc9ij5z0.fsf@pobox.com> <87fw0l2yyk.fsf@gnu.org> <877gltxgrg.fsf_-_@tines.lan> <87y5e8stst.fsf@pobox.com> <87d2vdekws.fsf@tines.lan> <87d2uk62a0.fsf_-_@tines.lan> <87a9poin2d.fsf@pobox.com> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" X-Trace: ger.gmane.org 1364716404 28908 80.91.229.3 (31 Mar 2013 07:53:24 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sun, 31 Mar 2013 07:53:24 +0000 (UTC) Cc: Ludovic =?utf-8?Q?Court=C3=A8s?= , guile-devel@gnu.org To: Andy Wingo Original-X-From: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Sun Mar 31 09:53:51 2013 Return-path: Envelope-to: guile-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1UMD5L-0002od-45 for guile-devel@m.gmane.org; Sun, 31 Mar 2013 09:53:51 +0200 Original-Received: from localhost ([::1]:49093 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UMD4w-0006su-Qs for guile-devel@m.gmane.org; Sun, 31 Mar 2013 03:53:26 -0400 Original-Received: from eggs.gnu.org ([208.118.235.92]:45051) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UMD4p-0006sm-IP for guile-devel@gnu.org; Sun, 31 Mar 2013 03:53:22 -0400 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1UMD4m-0007dX-R0 for guile-devel@gnu.org; Sun, 31 Mar 2013 03:53:19 -0400 Original-Received: from world.peace.net ([96.39.62.75]:55712) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1UMD4m-0007dO-Kz; Sun, 31 Mar 2013 03:53:16 -0400 Original-Received: from 209-6-91-212.c3-0.smr-ubr1.sbo-smr.ma.cable.rcn.com ([209.6.91.212] helo=tines.lan) by world.peace.net with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.72) (envelope-from ) id 1UMD4X-0007KH-NK; Sun, 31 Mar 2013 03:53:02 -0400 In-Reply-To: <87a9poin2d.fsf@pobox.com> (Andy Wingo's message of "Wed, 27 Mar 2013 21:51:06 +0100") User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x X-Received-From: 96.39.62.75 X-BeenThere: guile-devel@gnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Developers list for Guile, the GNU extensibility library" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Original-Sender: guile-devel-bounces+guile-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.lisp.guile.devel:16065 Archived-At: --=-=-= Content-Type: text/plain Here's a new start at adding an internal-only port structure to 2.0. I've come to the conclusion that it is not safe to modify 'scm_t_port' in 2.0 at all; not even to change the member names. In brief, the reason has to do with the C11 standard definition of "compatible types", which ties into the strict aliasing rules. Section 6.2.7 of C11 spells out what it means for two structures declared in separate translation units to be compatible, and among other things their member names must be the same. So this patch leaves 'scm_t_port' completely unchanged, but 'input_cd' is repurposed to be a pointer to the internal-only structure, and 'output_cd' becomes unused. However, this ugly detail is evident in only two places: in 'scm_new_port_table_entry' where the 'scm_t_port' is allocated, and in the new 'SCM_INTERNAL_PTAB_ENTRY' macro that converts a port (SCM value) to a (scm_t_port_internal *). The idea is that by accessing the public and private port structures via the two macros 'SCM_PTAB_ENTRY' and 'SCM_INTERNAL_PTAB_ENTRY', we give ourselves plenty of flexibility to change the way these structures are accessed in 2.2, while keeping most of the code identical. This patch is more limited in scope than the last one. It does not attempt to deal with the SCM_PORT_ENCODING_MODE. That will be in a subsequent patch. This patch only adds the internal structure and moves the iconv descriptors there. Is this a reasonable start? Any suggestions before I proceed? Thanks, Mark --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0001-Add-scm_gc_typed_calloc-macro.patch Content-Description: [PATCH 1/2] Add 'scm_gc_typed_calloc' macro >From aeef3dc501dd598e5b3dea3fc5d1552a032d3917 Mon Sep 17 00:00:00 2001 From: Mark H Weaver Date: Sat, 30 Mar 2013 23:09:01 -0400 Subject: [PATCH 1/2] Add 'scm_gc_typed_calloc' macro. * libguile/gc.h (scm_gc_typed_calloc): New macro. --- libguile/gc.h | 1 + 1 file changed, 1 insertion(+) diff --git a/libguile/gc.h b/libguile/gc.h index 9f00e01..c6f7a4b 100644 --- a/libguile/gc.h +++ b/libguile/gc.h @@ -207,6 +207,7 @@ SCM_API char *scm_gc_strdup (const char *str, const char *what) SCM_API char *scm_gc_strndup (const char *str, size_t n, const char *what) SCM_MALLOC; +#define scm_gc_typed_calloc(t) ((t *) scm_gc_calloc (sizeof (t), #t)) #ifdef BUILDING_LIBGUILE #include "libguile/bdw-gc.h" -- 1.7.10.4 --=-=-= Content-Type: text/x-diff Content-Disposition: inline; filename=0002-Add-internal-only-port-structure-and-move-iconv-desc.patch Content-Description: [PATCH 2/2] Add internal-only port structure and move iconv descriptors there >From e11e069a6ae7da1f7a280e8c1936a61bc9119625 Mon Sep 17 00:00:00 2001 From: Mark H Weaver Date: Sat, 30 Mar 2013 22:34:56 -0400 Subject: [PATCH 2/2] Add internal-only port structure and move iconv descriptors there. * libguile/ports-internal.h: New file. * libguile/Makefile.am (noinst_HEADERS): Add ports-internal.h. * libguile/ports.h (scm_t_port): Add a comment mentioning that the 'input_cd' and 'output_cd' fields of the public structure are no longer what they seem to be. * libguile/ports.c: Include ports-internal.h. (finalize_port, scm_i_remove_port, get_iconv_codepoint, get_codepoint, scm_i_set_port_encoding_x): Access 'input_cd' and 'output_cd' via the new internal port structure. (scm_new_port_table_entry): Allocate and initialize the internal port structure. * libguile/print.c: Include ports-internal.h. (display_string_using_iconv, display_string): Access 'input_cd' and 'output_cd' via 'internal' pointer. --- libguile/Makefile.am | 2 +- libguile/ports-internal.h | 38 ++++++++++++++++++++++++++ libguile/ports.c | 65 +++++++++++++++++++++++++++------------------ libguile/ports.h | 19 ++++++++++--- libguile/print.c | 15 ++++++----- 5 files changed, 103 insertions(+), 36 deletions(-) create mode 100644 libguile/ports-internal.h diff --git a/libguile/Makefile.am b/libguile/Makefile.am index 4b1f96b..0c4b6ce 100644 --- a/libguile/Makefile.am +++ b/libguile/Makefile.am @@ -455,7 +455,7 @@ noinst_HEADERS = conv-integer.i.c conv-uinteger.i.c \ srfi-14.i.c \ quicksort.i.c \ win32-uname.h \ - private-gc.h private-options.h + private-gc.h private-options.h ports-internal.h # vm instructions noinst_HEADERS += vm-engine.c vm-i-system.c vm-i-scheme.c vm-i-loader.c diff --git a/libguile/ports-internal.h b/libguile/ports-internal.h new file mode 100644 index 0000000..b4daff3 --- /dev/null +++ b/libguile/ports-internal.h @@ -0,0 +1,38 @@ +/* + * ports-internal.h - internal-only declarations for ports. + * + * Copyright (C) 2013 Free Software Foundation, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public License + * as published by the Free Software Foundation; either version 3 of + * the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA + * 02110-1301 USA + */ + +#ifndef SCM_PORTS_INTERNAL +#define SCM_PORTS_INTERNAL + +#include "libguile/_scm.h" +#include "libguile/ports.h" + +typedef struct scm_t_port_internal +{ + /* input/output iconv conversion descriptors */ + void *input_cd; + void *output_cd; +} scm_t_port_internal; + +#define SCM_INTERNAL_PTAB_ENTRY(x) \ + ((scm_t_port_internal *) (SCM_PTAB_ENTRY(x)->input_cd)) + +#endif diff --git a/libguile/ports.c b/libguile/ports.c index 8737a76..86db4a0 100644 --- a/libguile/ports.c +++ b/libguile/ports.c @@ -55,6 +55,7 @@ #include "libguile/mallocs.h" #include "libguile/validate.h" #include "libguile/ports.h" +#include "libguile/ports-internal.h" #include "libguile/vectors.h" #include "libguile/weaks.h" #include "libguile/fluids.h" @@ -576,7 +577,7 @@ finalize_port (void *ptr, void *data) register_finalizer_for_port (port); else { - scm_t_port *entry; + scm_t_port_internal *pti; port_type = SCM_TC2PTOBNUM (SCM_CELL_TYPE (port)); if (port_type >= scm_numptob) @@ -587,12 +588,12 @@ finalize_port (void *ptr, void *data) is for explicit `close-port' by user. */ scm_ptobs[port_type].free (port); - entry = SCM_PTAB_ENTRY (port); + pti = SCM_INTERNAL_PTAB_ENTRY (port); - if (entry->input_cd != (iconv_t) -1) - iconv_close (entry->input_cd); - if (entry->output_cd != (iconv_t) -1) - iconv_close (entry->output_cd); + if (pti->input_cd != (iconv_t) -1) + iconv_close (pti->input_cd); + if (pti->output_cd != (iconv_t) -1) + iconv_close (pti->output_cd); SCM_SETSTREAM (port, 0); SCM_CLR_PORT_OPEN_FLAG (port); @@ -617,7 +618,8 @@ scm_new_port_table_entry (scm_t_bits tag) */ SCM z = scm_cons (SCM_EOL, SCM_EOL); - scm_t_port *entry = (scm_t_port *) scm_gc_calloc (sizeof (scm_t_port), "port"); + scm_t_port *entry = scm_gc_typed_calloc (scm_t_port); + scm_t_port_internal *pti = scm_gc_typed_calloc (scm_t_port_internal); const char *enc; entry->file_name = SCM_BOOL_F; @@ -630,11 +632,17 @@ scm_new_port_table_entry (scm_t_bits tag) entry->encoding = enc ? scm_gc_strdup (enc, "port") : NULL; /* The conversion descriptors will be opened lazily. */ - entry->input_cd = (iconv_t) -1; - entry->output_cd = (iconv_t) -1; + pti->input_cd = (iconv_t) -1; + pti->output_cd = (iconv_t) -1; entry->ilseq_handler = scm_i_default_port_conversion_handler (); + /* XXX These fields are not what they seem. They have been + repurposed, but cannot safely be renamed in 2.0 without breaking + ABI compatibility. This will be cleaned up in 2.2. */ + entry->input_cd = pti; /* XXX pointer to the internal port structure */ + entry->output_cd = NULL; /* XXX unused */ + SCM_SET_CELL_TYPE (z, tag); SCM_SETPTAB_ENTRY (z, entry); @@ -678,24 +686,26 @@ scm_i_remove_port (SCM port) #define FUNC_NAME "scm_remove_port" { scm_t_port *p; + scm_t_port_internal *pti; scm_i_scm_pthread_mutex_lock (&scm_i_port_table_mutex); p = SCM_PTAB_ENTRY (port); + pti = SCM_INTERNAL_PTAB_ENTRY (port); scm_port_non_buffer (p); p->putback_buf = NULL; p->putback_buf_size = 0; - if (p->input_cd != (iconv_t) -1) + if (pti->input_cd != (iconv_t) -1) { - iconv_close (p->input_cd); - p->input_cd = (iconv_t) -1; + iconv_close (pti->input_cd); + pti->input_cd = (iconv_t) -1; } - if (p->output_cd != (iconv_t) -1) + if (pti->output_cd != (iconv_t) -1) { - iconv_close (p->output_cd); - p->output_cd = (iconv_t) -1; + iconv_close (pti->output_cd); + pti->output_cd = (iconv_t) -1; } SCM_SETPTAB_ENTRY (port, 0); @@ -1296,13 +1306,13 @@ static int get_iconv_codepoint (SCM port, scm_t_wchar *codepoint, char buf[SCM_MBCHAR_BUF_SIZE], size_t *len) { - scm_t_port *pt; + scm_t_port_internal *pti; int err, byte_read; size_t bytes_consumed, output_size; char *output; scm_t_uint8 utf8_buf[SCM_MBCHAR_BUF_SIZE]; - pt = SCM_PTAB_ENTRY (port); + pti = SCM_INTERNAL_PTAB_ENTRY (port); for (output_size = 0, output = (char *) utf8_buf, bytes_consumed = 0, err = 0; @@ -1332,7 +1342,7 @@ get_iconv_codepoint (SCM port, scm_t_wchar *codepoint, input_left = bytes_consumed + 1; output_left = sizeof (utf8_buf); - done = iconv (pt->input_cd, &input, &input_left, + done = iconv (pti->input_cd, &input, &input_left, &output, &output_left); if (done == (size_t) -1) { @@ -1368,13 +1378,14 @@ get_codepoint (SCM port, scm_t_wchar *codepoint, { int err; scm_t_port *pt = SCM_PTAB_ENTRY (port); + scm_t_port_internal *pti = SCM_INTERNAL_PTAB_ENTRY (port); - if (pt->input_cd == (iconv_t) -1) + if (pti->input_cd == (iconv_t) -1) /* Initialize the conversion descriptors, if needed. */ scm_i_set_port_encoding_x (port, pt->encoding); /* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */ - if (pt->input_cd == (iconv_t) -1) + if (pti->input_cd == (iconv_t) -1) err = get_utf8_codepoint (port, codepoint, (scm_t_uint8 *) buf, len); else err = get_iconv_codepoint (port, codepoint, buf, len); @@ -2208,6 +2219,7 @@ void scm_i_set_port_encoding_x (SCM port, const char *encoding) { scm_t_port *pt; + scm_t_port_internal *pti; iconv_t new_input_cd, new_output_cd; new_input_cd = (iconv_t) -1; @@ -2215,6 +2227,7 @@ scm_i_set_port_encoding_x (SCM port, const char *encoding) /* Set the character encoding for this port. */ pt = SCM_PTAB_ENTRY (port); + pti = SCM_INTERNAL_PTAB_ENTRY (port); if (encoding == NULL) encoding = "ISO-8859-1"; @@ -2251,13 +2264,13 @@ scm_i_set_port_encoding_x (SCM port, const char *encoding) } } - if (pt->input_cd != (iconv_t) -1) - iconv_close (pt->input_cd); - if (pt->output_cd != (iconv_t) -1) - iconv_close (pt->output_cd); + if (pti->input_cd != (iconv_t) -1) + iconv_close (pti->input_cd); + if (pti->output_cd != (iconv_t) -1) + iconv_close (pti->output_cd); - pt->input_cd = new_input_cd; - pt->output_cd = new_output_cd; + pti->input_cd = new_input_cd; + pti->output_cd = new_output_cd; return; diff --git a/libguile/ports.h b/libguile/ports.h index d4d59b7..27d59a2 100644 --- a/libguile/ports.h +++ b/libguile/ports.h @@ -112,9 +112,22 @@ typedef struct unsigned char *putback_buf; size_t putback_buf_size; /* allocated size of putback_buf. */ - /* input/output iconv conversion descriptors */ - void *input_cd; - void *output_cd; + /* IMPORTANT: 'input_cd' and 'output_cd' used to be pointers to the + input and output iconv descriptors, but those have been moved to + the internal-only port structure defined in ports-internal.h. + + Given that we must preserve ABI compatibility in 2.0, we cannot + safely change this public structure without running afoul of C + strict aliasing rules. We cannot even change the field names. + + To work around this, in this public structure, 'input_cd' has been + repurposed to be a pointer to the internal port structure (see + ports-internal.h), and 'output_cd' is now unused. + + This will be cleaned up in 2.2. */ + + void *input_cd; /* XXX actually a pointer to scm_t_port_internal */ + void *output_cd; /* XXX actually unused */ } scm_t_port; diff --git a/libguile/print.c b/libguile/print.c index 647eed8..62d7267 100644 --- a/libguile/print.c +++ b/libguile/print.c @@ -45,6 +45,7 @@ #include "libguile/alist.h" #include "libguile/struct.h" #include "libguile/ports.h" +#include "libguile/ports-internal.h" #include "libguile/root.h" #include "libguile/strings.h" #include "libguile/strports.h" @@ -879,9 +880,9 @@ display_string_using_iconv (const void *str, int narrow_p, size_t len, scm_t_string_failed_conversion_handler strategy) { size_t printed; - scm_t_port *pt; + scm_t_port_internal *pti; - pt = SCM_PTAB_ENTRY (port); + pti = SCM_INTERNAL_PTAB_ENTRY (port); printed = 0; @@ -910,7 +911,7 @@ display_string_using_iconv (const void *str, int narrow_p, size_t len, output = encoded_output; output_left = sizeof (encoded_output); - done = iconv (pt->output_cd, &input, &input_left, + done = iconv (pti->output_cd, &input, &input_left, &output, &output_left); output_len = sizeof (encoded_output) - output_left; @@ -920,7 +921,7 @@ display_string_using_iconv (const void *str, int narrow_p, size_t len, int errno_save = errno; /* Reset the `iconv' state. */ - iconv (pt->output_cd, NULL, NULL, NULL, NULL); + iconv (pti->output_cd, NULL, NULL, NULL, NULL); /* Print the OUTPUT_LEN bytes successfully converted. */ scm_lfwrite (encoded_output, output_len, port); @@ -981,15 +982,17 @@ display_string (const void *str, int narrow_p, { scm_t_port *pt; + scm_t_port_internal *pti; pt = SCM_PTAB_ENTRY (port); + pti = SCM_INTERNAL_PTAB_ENTRY (port); - if (pt->output_cd == (iconv_t) -1) + if (pti->output_cd == (iconv_t) -1) /* Initialize the conversion descriptors, if needed. */ scm_i_set_port_encoding_x (port, pt->encoding); /* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */ - if (pt->output_cd == (iconv_t) -1) + if (pti->output_cd == (iconv_t) -1) return display_string_as_utf8 (str, narrow_p, len, port); else return display_string_using_iconv (str, narrow_p, len, -- 1.7.10.4 --=-=-=--