unofficial mirror of guile-devel@gnu.org 
 help / color / mirror / Atom feed
* patch for mmap and friends
@ 2023-01-14  0:49 Matt Wette
  2023-01-14  1:00 ` Matt Wette
  2023-01-14 15:18 ` Maxime Devos
  0 siblings, 2 replies; 8+ messages in thread
From: Matt Wette @ 2023-01-14  0:49 UTC (permalink / raw)
  To: 27782, guile-devel

[-- Attachment #1: Type: text/plain, Size: 204 bytes --]

Please consider this patch for adding mmap(), munmap() and msync()
  to libguile/filesys.c.  Included is update for posix.texi and test 
file mman.test.
Once included, feature 'mman should be #t.

Matt

[-- Attachment #2: 0001-Add-mmap-and-friends-munmap-msync.patch --]
[-- Type: text/x-patch, Size: 13066 bytes --]

From 6c944174d35d43f87340c8199d47f3f088fa6ca7 Mon Sep 17 00:00:00 2001
From: Matt Wette <mwette@alumni.caltech.edu>
Date: Fri, 13 Jan 2023 16:42:06 -0800
Subject: [PATCH] Add mmap and friends (munmap, msync).

* libguile/filesys.[ch]: added scm_mmap_search, scm_mmap, scm_msync, and
  init_mman, built on availability of HAVE_MMAN_H; also provides feature
  'mman
* doc/ref/posix.texi: added documentation for mmap and friends
* test-suite/Makefile.am: updated for mman.test
* test-suite/tests/mman.test: mmap tests
---
 configure.ac           |   2 +
 doc/ref/posix.texi     |  45 +++++++
 libguile/filesys.c     | 264 +++++++++++++++++++++++++++++++++++++++++
 libguile/filesys.h     |   4 +
 test-suite/Makefile.am |   1 +
 5 files changed, 316 insertions(+)

diff --git a/configure.ac b/configure.ac
index f8c12f0d7..c348d14a2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1018,6 +1018,8 @@ AC_CHECK_MEMBERS([struct tm.tm_gmtoff],,,
 ])
 GUILE_STRUCT_UTIMBUF
 
+AC_CHECK_FUNCS([msync])
+
 
 #--------------------------------------------------------------------
 #
diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index 5653d3758..16f3bbc49 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -1216,6 +1216,51 @@ valid separators.  Thus, programs should not assume that
 separator---e.g., when extracting the components of a file name.
 @end defvr
 
+@deffn {Scheme Procedure} mmap addr len [prot [flags [fd [offset]]]]
+@deffnx {Scheme Procedure} mmap/search addr len [prot [flags [fd [offset]]]]
+Create a memory mapping, returning a bytevector.  @var{addr}, if
+non-zero, is the staring address; or, if zero, is assigned by the
+system.  @var{prot}, if provided, assigns protection.  @var{fd},
+if provided associates the memory region with a file, starting
+at @var{offset}, if provided.
+The region returned by mmap will NOT be searched by the garbage
+ collector for pointers, while that returned by mmap/search will.
+Note that the finalizer for the returned bytevector will call munmap.
+Defaults for optional arguments are
+@table @asis
+@item prot
+(logior PROT_READ PROT_WRITE)
+@item flags
+(logior MAP_ANONYMOUS MAP_PRIVATE)
+@item fd
+-1
+@item offset
+0
+@end table
+@end deffn
+
+@deffn {Scheme Procedure} munmap bvec
+Given bytevector generated by mmap or mmap/search, unmap the
+the associated memory.  The argument will be modified to
+reflect a zero length bv.  The return value is unspecified.
+Note that munmap is called by finalizer associated with
+bytevectors returned from mmap and mmap/search.
+@end deffn
+
+@deffn {Scheme Procedure} msync addr length flag
+Flush changes made to the in-core copy of a file mapped using
+mmap or mmap/search.  This should be executed on modified memory
+before calling munmap.  The @var{flags} argument must be exactly one
+of the following:
+@table @code
+@item MS_ASYNC
+Initiate update, return immediately.
+@item MS_SYNC
+Initiate update, block until complete.
+@item MS_INVALIDATE
+Invalidate other mappings of the same file.
+@end table
+@end deffn
 
 @node User Information
 @subsection User Information
diff --git a/libguile/filesys.c b/libguile/filesys.c
index 1f0bba556..0ddb4cfee 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -67,11 +67,17 @@
 # include <sys/sendfile.h>
 #endif
 
+#ifdef HAVE_SYS_MMAN_H
+# include <sys/mman.h>
+#endif
+
 #include "async.h"
 #include "boolean.h"
 #include "dynwind.h"
 #include "fdes-finalizers.h"
 #include "feature.h"
+#include "finalizers.h"
+#include "foreign.h"
 #include "fports.h"
 #include "gsubr.h"
 #include "iselect.h"
@@ -2263,6 +2269,261 @@ scm_dir_free (SCM p)
 
 \f
 
+#ifdef HAVE_SYS_MMAN_H
+/* see https://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html */
+
+static void
+mmap_finalizer (void *ptr, void *data)
+{
+  SCM bvec;
+  void *c_addr;
+  size_t c_len;
+  int rv;
+
+  bvec = SCM_PACK_POINTER (ptr);
+  if (!SCM_BYTEVECTOR_P (bvec))
+    scm_misc_error ("mmap", "expecting bytevector", SCM_EOL);
+
+  c_addr = SCM_BYTEVECTOR_CONTENTS (bvec);
+  c_len = SCM_BYTEVECTOR_LENGTH (bvec);
+  SCM_SYSCALL (rv = munmap(c_addr, c_len));
+  if (rv != 0)
+    scm_misc_error ("mmap", "failed to munmap memory", SCM_EOL);
+}
+
+SCM_DEFINE (scm_mmap_search, "mmap/search", 2, 4, 0,
+            (SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset),
+	    "Create a memory mapping, returning a bytevector..  @var{addr},\n"
+	    "if non-zero, is the staring address; or, if zero, is assigned by\n"
+	    "the system.  @var{prot}, if provided, assigns protection.\n"
+	    "@var{fd}, if provided associates the memory region with a file\n"
+	    "starting at @var{offset}, if provided.\n"
+	    "The region returned by mmap WILL be searched by the garbage\n"
+	    "collector for pointers.  See also mmap.  Note that the\n"
+            "finalizer for the returned bytevector will call munmap.\n"
+	    "Defaults for optional arguments are\n"
+	    "@table @asis\n"
+	    "@item prot\n(logior PROT_READ PROT_WRITE)\n"
+	    "@item flags\n(logior MAP_ANONYMOUS MAP_PRIVATE)\n"
+	    "@item fd\n-1\n"
+	    "@item offset\n0\n"
+	    "@end table")
+#define FUNC_NAME s_scm_mmap_search
+{
+  void *c_mem, *c_addr;
+  size_t c_len;
+  int c_prot, c_flags, c_fd;
+  scm_t_off c_offset;
+  SCM pointer, bvec;
+
+  if (SCM_POINTER_P (addr))
+    c_addr = SCM_POINTER_VALUE (addr);
+  else if (scm_is_integer (addr))
+    c_addr = (void*) scm_to_uintptr_t (addr);
+  else
+    scm_misc_error ("mmap", "bad addr", addr);
+
+  c_len = scm_to_size_t (len);
+
+  if (SCM_UNBNDP (prot))
+    c_prot = PROT_READ | PROT_WRITE;
+  else
+    c_prot = scm_to_int (prot);
+
+  if (SCM_UNBNDP (flags))
+    c_flags = MAP_ANONYMOUS | MAP_PRIVATE;
+  else
+    c_flags = scm_to_int (flags);
+
+  if (SCM_UNBNDP (fd))
+    c_fd = -1;
+  else
+    c_fd = scm_to_int (fd);
+
+  if (SCM_UNBNDP (offset))
+    c_offset = 0;
+  else
+    c_offset = scm_to_off_t (offset);
+
+  if ((c_addr == NULL) && (c_flags & MAP_FIXED))
+    scm_misc_error ("mmap", "cannot have NULL addr w/ MAP_FIXED", SCM_EOL);
+
+  SCM_SYSCALL (c_mem = mmap(c_addr, c_len, c_prot, c_flags, c_fd, c_offset));
+  if (c_mem == MAP_FAILED)
+    scm_syserror ("mmap");              /* errno set */
+
+  pointer = scm_cell (scm_tc7_pointer, (scm_t_bits) c_mem);
+  bvec = scm_c_take_typed_bytevector((signed char *) c_mem + c_offset, c_len,
+				     SCM_ARRAY_ELEMENT_TYPE_VU8, pointer);
+  assert(sizeof(void*) <= sizeof(size_t));
+  scm_i_set_finalizer (SCM2PTR (bvec), mmap_finalizer, (void*) c_len);
+  return bvec;
+}
+#undef FUNC_NAME
+
+SCM_DEFINE (scm_mmap, "mmap", 2, 4, 0,
+            (SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset),
+	    "Create a memory mapping, returning a bytevector.  @var{addr}, if\n"
+	    "non-zero, is the staring address; or, if zero, is assigned by the\n"
+	    "system.  @var{prot}, if provided, assigns protection.  @var{fd},\n"
+	    "if provided associates the memory region with a file, starting \n"
+	    "at @var{offset}, if provided.\n"
+	    "The region returned by mmap will NOT be searched by the garbage\n"
+	    "collector for pointers. See also mmap/search.  Note that the\n"
+            "finalizer for the returned bytevector will call munmap.\n"
+	    "Defaults for arguments are:\n"
+	    "@table @asis\n"
+	    "@item prot\n(logior PROT_READ PROT_WRITE)\n"
+	    "@item flags\n(logior MAP_ANONYMOUS MAP_PRIVATE)\n"
+	    "@item fd\n-1\n"
+	    "@item offset\n0\n"
+	    "@end table")
+#define FUNC_NAME s_scm_mmap
+{
+  void *c_mem;
+  size_t c_len;
+  SCM bvec;
+
+  bvec = scm_mmap_search(addr, len, prot, flags, fd, offset);
+  c_mem = SCM_BYTEVECTOR_CONTENTS(bvec);
+  c_len = SCM_BYTEVECTOR_LENGTH(bvec);
+
+  /* Tell GC not to scan for pointers. */
+  GC_exclude_static_roots(c_mem, (char*) c_mem + c_len);
+
+  return bvec;
+}
+#undef FUNC_NAME
+
+/* The following copied from bytevectors.c. Kludge? */
+#define SCM_BYTEVECTOR_SET_LENGTH(_bv, _len)            \
+  SCM_SET_CELL_WORD_1 ((_bv), (scm_t_bits) (_len))
+#define SCM_BYTEVECTOR_SET_CONTENTS(_bv, _contents)	\
+  SCM_SET_CELL_WORD_2 ((_bv), (scm_t_bits) (_contents))
+
+SCM_DEFINE (scm_munmap, "munmap", 1, 0, 0,
+            (SCM bvec),
+	    "Given bytevector generated by mmap or mmap/search, unmap the\n"
+            "the associated memory.  The argument will be modified to \n"
+            "reflect a zero length bv. The return value is unspecified.\n"
+            "Note that munmap is called by finalizer associated with\n"
+            "bytevectors returned from mmap and mmap/search.\n")
+#define FUNC_NAME s_scm_munmap
+{
+  void *addr;
+  size_t len;
+  int rv;
+
+  SCM_VALIDATE_BYTEVECTOR (1, bvec);
+
+  addr = (void *) SCM_BYTEVECTOR_CONTENTS (bvec);
+  len = SCM_BYTEVECTOR_LENGTH (bvec);
+
+  /* Invalidate further work on this bytevector. */
+  SCM_BYTEVECTOR_SET_LENGTH (bvec, 0);
+  SCM_BYTEVECTOR_SET_CONTENTS (bvec, NULL);
+
+  SCM_SYSCALL (rv = munmap(addr, len));
+  if (rv == -1)
+    SCM_SYSERROR;			/* errno set */
+
+  return SCM_UNSPECIFIED;
+}
+#undef FUNC_NAME
+
+#ifdef HAVE_MSYNC
+SCM_DEFINE (scm_msync, "msync", 2, 0, 0,
+            (SCM bvec, SCM flags),
+	    "Flush changes made to the in-core copy of a file mapped using\n"
+            "mmap or mmap/search.  This should be executed on modified memory\n"
+            "before calling munmap.  The @var{flags} argument must be exactly\n"
+            "one of the following:\n"
+            "@table @code\n"
+            "@item MS_ASYNC\n"
+            "Initiate update, return immediately.\n"
+            "@item MS_SYNC\n"
+            "Initiate update, block until complete.\n"
+            "@item MS_INVALIDATE\n"
+            "Invalidate other mappings of the same file.\n"
+            "@end table\n"
+            "The return value is unspecified.")
+#define FUNC_NAME s_scm_msync
+{
+  void *addr;
+  size_t len;
+  int c_flags, rv;
+
+  SCM_VALIDATE_BYTEVECTOR (1, bvec);
+  addr = (void *) SCM_BYTEVECTOR_CONTENTS (bvec);
+  len = SCM_BYTEVECTOR_LENGTH (bvec);
+
+  c_flags = scm_to_int (flags);
+
+  SCM_SYSCALL (rv = msync(addr, len, c_flags));
+  if (rv == -1)
+    SCM_SYSERROR;			/* errno set */
+
+  return SCM_UNSPECIFIED;
+}
+#undef FUNC_NAME
+#endif /* HAVE_MSYNC */
+
+static void init_mman(void) {
+  scm_add_feature("mman");
+
+#ifdef PROT_NONE
+  scm_c_define ("PROT_NONE", scm_from_int (PROT_NONE));
+#endif
+#ifdef PROT_READ
+  scm_c_define ("PROT_READ", scm_from_int (PROT_READ));
+#endif
+#ifdef PROT_WRITE
+  scm_c_define ("PROT_WRITE", scm_from_int (PROT_WRITE));
+#endif
+#ifdef PROT_EXEC
+  scm_c_define ("PROT_EXEC", scm_from_int (PROT_EXEC));
+#endif
+
+#ifdef MAP_ANONYMOUS
+  scm_c_define ("MAP_ANONYMOUS", scm_from_int (MAP_ANONYMOUS));
+#endif
+#ifdef MAP_ANON
+  scm_c_define ("MAP_ANON", scm_from_int (MAP_ANON));
+#endif
+#ifdef MAP_FILE
+  scm_c_define ("MAP_FILE", scm_from_int (MAP_FILE));
+#endif
+#ifdef MAP_FIXED
+  scm_c_define ("MAP_FIXED", scm_from_int (MAP_FIXED));
+#endif
+#ifdef MAP_HASSEMAPHORE
+  scm_c_define ("MAP_HASSEMAPHORE", scm_from_int (MAP_HASSEMAPHORE));
+#endif
+#ifdef MAP_PRIVATE
+  scm_c_define ("MAP_PRIVATE", scm_from_int (MAP_PRIVATE));
+#endif
+#ifdef MAP_SHARED
+  scm_c_define ("MAP_SHARED", scm_from_int (MAP_SHARED));
+#endif
+#ifdef MAP_NOCACHE
+  scm_c_define ("MAP_NOCACHE", scm_from_int (MAP_NOCACHE));
+#endif
+  scm_c_define ("PAGE_SIZE", scm_from_int (getpagesize()));
+#ifdef MS_ASYNC
+  scm_c_define ("MS_ASYNC", scm_from_int (MS_ASYNC));
+#endif
+#ifdef MS_SYNC
+  scm_c_define ("MS_SYNC", scm_from_int (MS_SYNC));
+#endif
+#ifdef MS_INVALIDATE
+  scm_c_define ("MS_INVALIDATE", scm_from_int (MS_INVALIDATE));
+#endif
+}
+
+#endif /* HAVE_SYS_MMAN_H */
+
+\f
+
 void
 scm_init_filesys ()
 {
@@ -2387,6 +2648,9 @@ scm_init_filesys ()
 #ifdef HAVE_READLINKAT
   scm_add_feature("readlink-port");
 #endif
+#if defined(HAVE_SYS_MMAN_H)
+  init_mman();
+#endif
 
 #include "filesys.x"
 }
diff --git a/libguile/filesys.h b/libguile/filesys.h
index 1ce50d30e..fa40b484f 100644
--- a/libguile/filesys.h
+++ b/libguile/filesys.h
@@ -80,6 +80,10 @@ SCM_API SCM scm_dirname (SCM filename);
 SCM_API SCM scm_basename (SCM filename, SCM suffix);
 SCM_API SCM scm_canonicalize_path (SCM path);
 SCM_API SCM scm_sendfile (SCM out, SCM in, SCM count, SCM offset);
+SCM_API SCM scm_mmap_search(SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset);
+SCM_API SCM scm_mmap(SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset);
+SCM_API SCM scm_msync(SCM bvec, SCM flags);
+SCM_API SCM scm_munmap(SCM bvec);
 SCM_INTERNAL SCM scm_i_relativize_path (SCM path, SCM in_path);
 
 SCM_INTERNAL void scm_init_filesys (void);
diff --git a/test-suite/Makefile.am b/test-suite/Makefile.am
index 16fa2e952..3785e2f85 100644
--- a/test-suite/Makefile.am
+++ b/test-suite/Makefile.am
@@ -77,6 +77,7 @@ SCM_TESTS = tests/00-initial-env.test		\
 	    tests/load.test			\
 	    tests/match.test			\
 	    tests/match.test.upstream		\
+	    tests/mman.test			\
 	    tests/modules.test			\
 	    tests/multilingual.nottest		\
 	    tests/net-db.test			\
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14  0:49 patch for mmap and friends Matt Wette
@ 2023-01-14  1:00 ` Matt Wette
  2023-01-14 15:18 ` Maxime Devos
  1 sibling, 0 replies; 8+ messages in thread
From: Matt Wette @ 2023-01-14  1:00 UTC (permalink / raw)
  To: 27782, guile-devel

[-- Attachment #1: Type: text/plain, Size: 399 bytes --]

On 1/13/23 4:49 PM, Matt Wette wrote:
> Please consider this patch for adding mmap(), munmap() and msync()
>  to libguile/filesys.c.  Included is update for posix.texi and test 
> file mman.test.
> Once included, feature 'mman should be #t.
>
> Matt
Please add the attached file: test-suite/tests/mman.test.

I thought it was included in the patch.  It's the thought that counts, 
right?

Matt



[-- Attachment #2: mman.test --]
[-- Type: text/plain, Size: 1493 bytes --]

;;;; mman.test --- Tests for mmap API.    -*- scheme -*-
;;;;
;;;; Copyright 2022 Free Software Foundation, Inc.
;;;;
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
;;;; License as published by the Free Software Foundation; either
;;;; version 3 of the License, or (at your option) any later version.
;;;;
;;;; This library is distributed in the hope that it will be useful,
;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
;;;; Lesser General Public License for more details.
;;;;
;;;; You should have received a copy of the GNU Lesser General Public
;;;; License along with this library; if not, write to the Free Software
;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA

(define-module (test-mman)
  #:use-module (test-suite lib)
  #:use-module (test-suite guile-test)
  #:use-module (rnrs bytevectors)
  #:declarative? #f
  )

(define (mmap-test-file)
  (data-file-name "foo.txt"))

(define mmap-test-string "hello, world")

(define (gen-mmap-test-file)
  (with-output-to-file (mmap-test-file)
    (lambda () (display mmap-test-string))))

(when (provided? 'mman)

  (gen-mmap-test-file)

  (with-test-prefix "mman"

    (pass-if "mman 1"
      (let ((bv (mmap 0 #x100)))
        (bytevector-u8-set! bv 0 34)
        (= (bytevector-u8-ref bv 0) 34)))

    ))

;; --- last line ---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14  0:49 patch for mmap and friends Matt Wette
  2023-01-14  1:00 ` Matt Wette
@ 2023-01-14 15:18 ` Maxime Devos
  2023-01-14 16:31   ` Matt Wette
  2023-01-14 16:41   ` tomas
  1 sibling, 2 replies; 8+ messages in thread
From: Maxime Devos @ 2023-01-14 15:18 UTC (permalink / raw)
  To: Matt Wette, 27782, guile-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 3395 bytes --]



On 14-01-2023 01:49, Matt Wette wrote:
> Please consider this patch for adding mmap(), munmap() and msync()
>   to libguile/filesys.c.  Included is update for posix.texi and test 
> file mman.test.
> Once included, feature 'mman should be #t.
> 
> Matt




> +  if (SCM_UNBNDP (fd))
> +    c_fd = -1;
> +  else
> +    c_fd = scm_to_int (fd);

Port objects should be accepted too, as previously asked on 
<https://lists.gnu.org/archive/html/guile-user/2022-06/msg00060.html>.
As implied by later comments, using a raw fd causes problems with 
'move->fdes'.  For the remaining response, I'll assume that the function 
accepts ports as well.

  (---)

After this code, the port 'fd' becomes unreferenced by this function.

> +  if (SCM_UNBNDP (offset))
> +    c_offset = 0;
> +  else
> +    c_offset = scm_to_off_t (offset);
> +
> +  if ((c_addr == NULL) && (c_flags & MAP_FIXED))
> +    scm_misc_error ("mmap", "cannot have NULL addr w/ MAP_FIXED", SCM_EOL);

Hence, if the GC is run here, its possible for fd to be automatically 
closed here.

> +  SCM_SYSCALL (c_mem = mmap(c_addr, c_len, c_prot, c_flags, c_fd, c_offset));

As such, C 'mmap' could be called on a closed file descriptor even even 
if the argument to Scheme 'mmap' was an open port, which isn't going to 
work.

(While it is recommended for Scheme code to keep a reference to the port 
to manually close afterwards, to free resources faster than waiting for 
GC, it is not actually required.)

Worse, if the port is closed (by GC), in the mean time another thread 
may have opened a new port, whose file descriptor coincides with c_fd.

To avoid this problem, you can add

   scm_remember_upto_here_1 (fd);

after the SCM_SYSCALL.

Even then, a problem remains -- a concurrent thread might be using 
'move->fdes' to 'move' the file descriptor, hence invalidating c_fd.
Functions like 'scm_seek' handle this with 'scm_dynwind_acquire_port' 
(IIUC, it takes a mutex to delay concurrent 'move->fdes' until finished).

IIUC, the solution then looks like (ignoring wrapping) (the lack of 
scm_remember_upto_here_1 is intentional):

scm_dynwind_begin (0);
scm_dynwind_acquire_port (fd); // (accepts both ports and numbers, IIUC)
// needs to be after scm_dynwind_acquire_port
if (SCM_UNBNDP (fd))
   c_fd = -1;
else
   c_fd = scm_to_int (fd);

SCM_SYSCALL (c_mem = mmap(c_addr, c_len, c_prot, c_flags, c_fd, c_offset));
if (c_mem == MAP_FAILED)
     scm_syserror ("mmap");
scm_dynwind_end ();

(I'm not really familiar with scm_dynwind_begin & friends, I'm mostly 
copy-pasting from libguile/ports.c here.)

> +  if (c_mem == MAP_FAILED)
> +    scm_syserror ("mmap");              /* errno set */
> +
> +  pointer = scm_cell (scm_tc7_pointer, (scm_t_bits) c_mem);
> +  bvec = scm_c_take_typed_bytevector((signed char *) c_mem + c_offset, c_len,
> +                                    SCM_ARRAY_ELEMENT_TYPE_VU8, pointer);

> +  assert(sizeof(void*) <= sizeof(size_t));

IIRC there is a C trick involving fields, arrays and types to check this 
at compile-time instead.  Maybe:

struct whatever {
    /* if availability of zero-length arrays can be assumed */
    int foo[sizeof(size_t) - sizeof(void*)];
    /* alternatively, a weaker but portable check */
    int foo[sizeof(size_t) - sizeof(void*) + 1];
};

Greetings,
Maxime.

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14 15:18 ` Maxime Devos
@ 2023-01-14 16:31   ` Matt Wette
  2023-01-14 22:08     ` Matt Wette
  2023-01-14 16:41   ` tomas
  1 sibling, 1 reply; 8+ messages in thread
From: Matt Wette @ 2023-01-14 16:31 UTC (permalink / raw)
  To: guile-devel

On 1/14/23 7:18 AM, Maxime Devos wrote:
> \
> Port objects should be accepted too, as previously asked on 
> <https://lists.gnu.org/archive/html/guile-user/2022-06/msg00060.html>.
> As implied by later comments, using a raw fd causes problems with 
> 'move->fdes'.  For the remaining response, I'll assume that the 
> function accepts ports as well.
>

> To avoid this problem, you can add
>
>   scm_remember_upto_here_1 (fd);
>
> after the SCM_SYSCALL.
> \


> IIRC there is a C trick involving fields, arrays and types to check 
> this at compile-time instead.  Maybe:
>
> struct whatever {
>    /* if availability of zero-length arrays can be assumed */
>    int foo[sizeof(size_t) - sizeof(void*)];
>    /* alternatively, a weaker but portable check */
>    int foo[sizeof(size_t) - sizeof(void*) + 1];
> };
>
> Greetings,
> Maxime.

Thanks for the feedback.   I'm sorry I missed you comments on the 
previous round.
I did respond to the ones I did catch.    I'll work this and resubmit.

Matt




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14 15:18 ` Maxime Devos
  2023-01-14 16:31   ` Matt Wette
@ 2023-01-14 16:41   ` tomas
  1 sibling, 0 replies; 8+ messages in thread
From: tomas @ 2023-01-14 16:41 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 1113 bytes --]

On Sat, Jan 14, 2023 at 04:18:58PM +0100, Maxime Devos wrote:

[...]

> (While it is recommended for Scheme code to keep a reference to the port to
> manually close afterwards, to free resources faster than waiting for GC, it
> is not actually required.)

Oh, oh. I've got a little anecdote to share here. The context was a Java
application, running on Sun medium-sized iron. It was slow & clumsy and
the customer decided to double the machine's RAM (these were times where
200 MB were quite a thing).

The application crashed a couple of times a day. Some log file poking
later it became clear: at the other side there was an Oracle database
and the app was exhausting the (limited) max number of connections
allowed for the license they had. More RAM -> less GC and the app was
relying on the object destructors to dispose of the unneeded database
connections.

There I learnt one shouldn't ever use memory as proxy for other
resources :-)

(The end of the story was that we could convince the user to have an
application under a free license and a free database :)

Cheers
-- 
t


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14 16:31   ` Matt Wette
@ 2023-01-14 22:08     ` Matt Wette
  2023-01-14 22:42       ` Maxime Devos
  0 siblings, 1 reply; 8+ messages in thread
From: Matt Wette @ 2023-01-14 22:08 UTC (permalink / raw)
  To: guile-devel

[-- Attachment #1: Type: text/plain, Size: 5968 bytes --]

On 1/14/23 8:31 AM, Matt Wette wrote:
> On 1/14/23 7:18 AM, Maxime Devos wrote:
>> \
>> Port objects should be accepted too, as previously asked on 
>> <https://lists.gnu.org/archive/html/guile-user/2022-06/msg00060.html>.
>> As implied by later comments, using a raw fd causes problems with 
>> 'move->fdes'.  For the remaining response, I'll assume that the 
>> function accepts ports as well.
>>
>
>> To avoid this problem, you can add
>>
>>   scm_remember_upto_here_1 (fd);
>>
>> after the SCM_SYSCALL.
>> \
>
>
>> IIRC there is a C trick involving fields, arrays and types to check 
>> this at compile-time instead.  Maybe:
>>
>> struct whatever {
>>    /* if availability of zero-length arrays can be assumed */
>>    int foo[sizeof(size_t) - sizeof(void*)];
>>    /* alternatively, a weaker but portable check */
>>    int foo[sizeof(size_t) - sizeof(void*) + 1];
>> };
>>
>> Greetings,
>> Maxime.
>
> Thanks for the feedback.   I'm sorry I missed you comments on the 
> previous round.
> I did respond to the ones I did catch.    I'll work this and resubmit.
>
> Matt
>
>

Here is another shot.
1) added port support
2) used dynwind to protect port/fd.
3) removed the assert: I don't know why I put it in there.

Notes:
1) four other guile headers needed: atomic-internal, foreign, finalizers 
and ioext
2) had to copy/modify dynwind_acquire_port and release_port from ports.c
3) one maybe-kludge is if an fd is passed I use (car (fd->ports fd)) for 
acquire_port

Matt

static void
mmap_finalizer (void *ptr, void *data)
{
   SCM bvec;
   void *c_addr;
   size_t c_len;
   int rv;

   bvec = SCM_PACK_POINTER (ptr);
   if (!SCM_BYTEVECTOR_P (bvec))
     scm_misc_error ("mmap", "expecting bytevector", SCM_EOL);

   c_addr = SCM_BYTEVECTOR_CONTENTS (bvec);
   c_len = SCM_BYTEVECTOR_LENGTH (bvec);
   SCM_SYSCALL (rv = munmap(c_addr, c_len));
   if (rv != 0)
     scm_misc_error ("mmap", "failed to munmap memory", SCM_EOL);
}

/* Code for scm_dynwind_acquire_port and release_port sourced from 
ports.c. */

static void
release_port (SCM port)
{
   scm_t_port *pt = SCM_PORT (port);
   uint32_t cur = 1, next = 0;
   while (!scm_atomic_compare_and_swap_uint32 (&pt->refcount, &cur, next))
     {
       if (cur == 0)
         return;
       next = cur - 1;
     }
  if (cur > 1)
     return;

   if (SCM_PORT_TYPE (port)->close)
     SCM_PORT_TYPE (port)->close (port);

   /* Skip encoding code from ports.c! */
}

static void
scm_dynwind_acquire_port (SCM port)
{
   scm_t_port *pt = SCM_PORT (port);
   uint32_t cur = 1, next = 2;
   while (!scm_atomic_compare_and_swap_uint32 (&pt->refcount, &cur, next))
     {
       if (cur == 0)
         scm_wrong_type_arg_msg (NULL, 0, port, "open port");
       next = cur + 1;
     }
   scm_dynwind_unwind_handler_with_scm (release_port, port,
                                        SCM_F_WIND_EXPLICITLY);
}

SCM_DEFINE (scm_mmap_search, "mmap/search", 2, 4, 0,
             (SCM addr, SCM len, SCM prot, SCM flags, SCM file, SCM offset),
         "Create a memory mapping, returning a bytevector.. @var{addr},\n"
         "if non-zero, is the staring address; or, if zero, is assigned 
by\n"
         "the system.  @var{prot}, if provided, assigns protection.\n"
         "@var{file}, a port or fd, if provided associates the memory\n"
             "region with a file starting at @var{offset}, if provided.\n"
         "The region returned by mmap WILL be searched by the garbage\n"
         "collector for pointers.  See also mmap.  Note that the\n"
             "finalizer for the returned bytevector will call munmap.\n"
         "Defaults for optional arguments are\n"
         "@table @asis\n"
         "@item prot\n(logior PROT_READ PROT_WRITE)\n"
         "@item flags\n(logior MAP_ANONYMOUS MAP_PRIVATE)\n"
         "@item fd\n-1\n"
         "@item offset\n0\n"
         "@end table")
#define FUNC_NAME s_scm_mmap_search
{
   void *c_mem, *c_addr;
   size_t c_len;
   int c_prot, c_flags, c_fd;
   scm_t_off c_offset;
   SCM pointer, bvec;

   if (SCM_POINTER_P (addr))
     c_addr = SCM_POINTER_VALUE (addr);
   else if (scm_is_integer (addr))
     c_addr = (void*) scm_to_uintptr_t (addr);
   else
     scm_misc_error ("mmap", "bad addr", addr);

   c_len = scm_to_size_t (len);

   if (SCM_UNBNDP (prot))
     c_prot = PROT_READ | PROT_WRITE;
   else
     c_prot = scm_to_int (prot);

   if (SCM_UNBNDP (flags))
     c_flags = MAP_ANONYMOUS | MAP_PRIVATE;
   else
     c_flags = scm_to_int (flags);

   scm_dynwind_begin (0);

   if (SCM_UNBNDP (file))
     c_fd = -1;
   else {
     /* Use the fd under clobber protection from GC or another thread. */
     if (SCM_PORTP (file))
       c_fd = scm_to_int (scm_fileno (file));
     else {
       c_fd = scm_to_int (file);
       file = SCM_CAR (scm_fdes_to_ports (file));
     }
     scm_dynwind_acquire_port (file);
   }

   if (SCM_UNBNDP (offset))
     c_offset = 0;
   else
     c_offset = scm_to_off_t (offset);

   if ((c_addr == NULL) && (c_flags & MAP_FIXED))
     scm_misc_error ("mmap", "cannot have NULL addr w/ MAP_FIXED", SCM_EOL);

   SCM_SYSCALL (c_mem = mmap(c_addr, c_len, c_prot, c_flags, c_fd, 
c_offset));
   if (c_mem == MAP_FAILED)
     scm_syserror ("mmap");              /* errno set */

   /* The fd is free to go now. */
   scm_dynwind_end ();

   pointer = scm_cell (scm_tc7_pointer, (scm_t_bits) c_mem);
   bvec = scm_c_take_typed_bytevector((signed char *) c_mem + c_offset, 
c_len,
                      SCM_ARRAY_ELEMENT_TYPE_VU8, pointer);
   scm_i_set_finalizer (SCM2PTR (bvec), mmap_finalizer, (void*) c_len);
   return bvec;
}
#undef FUNC_NAME


[-- Attachment #2: Type: text/html, Size: 8186 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14 22:08     ` Matt Wette
@ 2023-01-14 22:42       ` Maxime Devos
  2023-01-14 23:46         ` Matt Wette
  0 siblings, 1 reply; 8+ messages in thread
From: Maxime Devos @ 2023-01-14 22:42 UTC (permalink / raw)
  To: Matt Wette, guile-devel


[-- Attachment #1.1.1: Type: text/plain, Size: 3540 bytes --]



On 14-01-2023 23:08, Matt Wette wrote:
> 2) had to copy/modify dynwind_acquire_port and release_port from ports.c

Is it because of the 'static' qualifier?  If so, you could use the 
'SCM_INTERNAL [...] scm_i_[...]' pattern, e.g. see 
'scm_i_is_mutable_bitvector' in libguile/bitvectors.h and 
libguile/bitvectors.c.


>    if (SCM_UNBNDP (file))
>     c_fd = -1;
>   else {
>     /* Use the fd under clobber protection from GC or another thread. */
>     if (SCM_PORTP (file))
>       c_fd = scm_to_int (scm_fileno (file));
>     else {
>       c_fd = scm_to_int (file);
>       file = SCM_CAR (scm_fdes_to_ports (file));
>     }
>     scm_dynwind_acquire_port (file);
>   }

You need to acquire the port _before_ taking its file descriptor! 
Otherwise, it is protected too late.  I.e., scm_dynwind_acquire_port 
needs to be moved before the 'scm_fileno'.

On the second branch, in particular 'c_fd = scm_to_int (file);':
IIUC, the idea is to, when a raw fd is passed, look up the corresponding 
port to lock it, right?

If so, I think it's too late for that -- another thread might change 
things between 'c_fd = ...' and 'file = SCM_CAR (scm_fdes_to_ports (file))'.

More generally, when a raw fd is passed, I think it's impossible to 
solve the 'other thread/GC interfering' problem.

As such, to simplify things, I propose to only do the 
'scm_dynwind_acquire_port' when a port is passed, instead of failing to 
solve the interference problems (if the user passed a raw fd, then only 
they can make sure there are no problems, e.g. by changing their code to 
use ports or by not using move->fdes stuff).:

Proposed code (untested):

{
   void *c_mem, *c_addr;
   size_t c_len;
   int c_prot, c_flags, c_fd;
   scm_t_off c_offset;
   SCM pointer, bvec;

   if (SCM_POINTER_P (addr))
     c_addr = SCM_POINTER_VALUE (addr);
   else if (scm_is_integer (addr))
     c_addr = (void*) scm_to_uintptr_t (addr);
   else
     scm_misc_error ("mmap", "bad addr", addr);

   c_len = scm_to_size_t (len);

   if (SCM_UNBNDP (prot))
     c_prot = PROT_READ | PROT_WRITE;
   else
     c_prot = scm_to_int (prot);

   if (SCM_UNBNDP (flags))
     c_flags = MAP_ANONYMOUS | MAP_PRIVATE;
   else
     c_flags = scm_to_int (flags);

   scm_dynwind_begin (0);
   if (SCM_UNBNDP (file))
     c_fd = -1;
   else if (scm_is_integer (file))
     c_fd = scm_to_int (file);
   else
     {
       /* Use the fd of the port under clobber protection from
          concurrency. As scm_dynwind_acquire_port assumes that
          FILE is a port, check that first. */
       SCM_VALIDATE_PORT (SCM_ARG5, file);
       scm_dynwind_acquire_port (file);
       c_fd = scm_fileno (file);
     }

   if (SCM_UNBNDP (offset))
     c_offset = 0;
   else
     c_offset = scm_to_off_t (offset);

   if ((c_addr == NULL) && (c_flags & MAP_FIXED))
     scm_misc_error ("mmap", "cannot have NULL addr w/ MAP_FIXED", SCM_EOL);

   SCM_SYSCALL (c_mem = mmap(c_addr, c_len, c_prot, c_flags, c_fd, 
c_offset));
   if (c_mem == MAP_FAILED)
     scm_syserror ("mmap");              /* errno set */

   /* The fd is free to go now. */
   scm_dynwind_end ();

   pointer = scm_cell (scm_tc7_pointer, (scm_t_bits) c_mem);
   bvec = scm_c_take_typed_bytevector((signed char *) c_mem + c_offset, 
c_len,
                      SCM_ARRAY_ELEMENT_TYPE_VU8, pointer);
   scm_i_set_finalizer (SCM2PTR (bvec), mmap_finalizer, (void*) c_len);
   return bvec;
}
#undef FUNC_NAME



[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 929 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: patch for mmap and friends
  2023-01-14 22:42       ` Maxime Devos
@ 2023-01-14 23:46         ` Matt Wette
  0 siblings, 0 replies; 8+ messages in thread
From: Matt Wette @ 2023-01-14 23:46 UTC (permalink / raw)
  To: Maxime Devos, guile-devel; +Cc: 27782



On 1/14/23 2:42 PM, Maxime Devos wrote:
>     {
>       /* Use the fd of the port under clobber protection from
>          concurrency. As scm_dynwind_acquire_port assumes that
>          FILE is a port, check that first. */
>       SCM_VALIDATE_PORT (SCM_ARG5, file);
>       scm_dynwind_acquire_port (file);
>       c_fd = scm_fileno (file);
>     }

Thanks.  I'll try this, modulo update to  scm_to_int (scm_fileno (file)).

Matt




^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-01-14 23:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-14  0:49 patch for mmap and friends Matt Wette
2023-01-14  1:00 ` Matt Wette
2023-01-14 15:18 ` Maxime Devos
2023-01-14 16:31   ` Matt Wette
2023-01-14 22:08     ` Matt Wette
2023-01-14 22:42       ` Maxime Devos
2023-01-14 23:46         ` Matt Wette
2023-01-14 16:41   ` tomas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).