unofficial mirror of guile-devel@gnu.org 
From: Maxime Devos <maximedevos@telenet.be>
To: Matt Wette <matt.wette@gmail.com>,
	guile-devel@gnu.org, Guile User <guile-user@gnu.org>
Subject: Re: mmap for guile
Date: Sun, 26 Jun 2022 20:11:07 +0200
Message-ID: <0cf4e4ee80169487694b844996e04f3293eab92f.camel@telenet.be>
In-Reply-To: <56ee7537-1666-3d04-7093-732a75624e9b@gmail.com>


Some old mmap things that might be useful:

* https://lists.nongnu.org/archive/html/guile-devel/2013-04/msg00235.html
* https://lists.gnu.org/archive/html/bug-guile/2017-11/msg00033.html
* https://lists.gnu.org/archive/html/guile-user/2006-11/msg00013.html

+SCM_DEFINE (scm_mmap_search, "mmap/search", 2, 4, 0, 
+            (SCM addr, SCM len, SCM prot, SCM flags, SCM fd, SCM offset),
+	    "See the unix man page for mmap.  Returns a bytevector.\n"
+	    "Note that the region allocated will be searched by the garbage\n"
+	    "collector for pointers.  Defaults:\n"

I think it would be a good idea to document that it will be automatically
unmapped during GC, as this is a rather low-level interface.

Also, what if you mmap a region, use bytevector->pointer and pass the
pointer to some C code, which saves it somewhere where boehm-gc can find
it and boehm-gc considers it to be live?  Is there something that
prevents boehm-gc from improperly calling the finalizer and unmapping the
region, causing a dangling pointer?

Also, WDYT of using ports instead of raw fds in the API?  That would
play nicer with move->fdes etc.

> +  GC_exclude_static_roots(ptr, (char*)ptr + len);

After an unmap, will the GC properly forget the roots information?

>+  /* Invalidate further work on this bytevector. */
>+  SCM_BYTEVECTOR_SET_LENGTH (bvec, 0);
>+  SCM_BYTEVECTOR_SET_CONTENTS (bvec, NULL);

Possibly Guile's optimiser assumes that bytevectors never change in
length (needs to be checked).  So unless the relevant optimiser code is
changed, and it is documented that bytevectors can change in length, I
think it would be safer to not have an unmapping procedure in Scheme
(though a procedure for remapping it as /dev/zero should be safe).
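
A minimal sketch of the "remap instead of unmap" idea, in plain C and
outside of Guile.  It swaps in an anonymous mapping (MAP_ANONYMOUS) for
/dev/zero, which behaves the same way for this purpose, and places it
over the old range with MAP_FIXED.  The name remap_as_zero is
hypothetical and not part of the patch; the address is assumed to be
page-aligned, as it is when it came from mmap.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not from the patch: replace whatever is mapped
   at [addr, addr + len) with fresh private zero pages.  The address
   range stays mapped, so an object that still points at it keeps a
   valid address and length and simply reads zeros afterwards.
   Returns 0 on success, -1 on failure (errno is set by mmap).  */
int
remap_as_zero (void *addr, size_t len)
{
  assert (addr != NULL && len > 0);
  void *p = mmap (addr, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
  return p == MAP_FAILED ? -1 : 0;
}
```

With this approach the bytevector's length and contents pointer would
not need to be modified at all, which sidesteps the question of what
the optimiser assumes.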

> +  bvec = scm_c_take_typed_bytevector((signed char *) c_mem + c_offset, c_len,
> +				     SCM_ARRAY_ELEMENT_TYPE_VU8, pointer);

Would scm_pointer_to_bytevector fit here?  Also, scm_c_make_typed_bytevector
looks like something that can cause an out-of-memory-exception but the
finaliser hasn't been set yet.


>+// call fstat to get file size
>+SCM_DEFINE (scm_mmap_file, "mmap-file", 1, 1, 0, 
>+            (SCM file, SCM prot),
>+	    "This procedure accepts a file in the form of filename,\n"
>+            " file-port or fd.  It returns a bytevector.  It must not\n"
>+            " contain scheme allocated objects as it will not be\n"
>+            " searched for pointers. Default @var{prot} is @code{\"r\"}.")

I would restrict the C code to only ports and file descriptors, and leave file
names to a Scheme wrapper.  That way, you automatically get appropriate E...
errors in case open-file fails (and maybe &i/o-filename etc. if the core Guile
FS functions are later changed to R6RS), less chance of accidentally forgetting
to close a fd (*) ...

(*) One possible problem: if the file is opened, and mmap fails, then you still
need to close the file port (and rethrow), so some exception handling is still
required, though no C-style exception handling ...
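
To illustrate the split of responsibilities, here is a rough sketch in
plain C (no Guile API involved).  The names map_fd_readonly and
map_path_readonly are hypothetical.  The first function plays the role
of the C primitive that only accepts a file descriptor; the second
stands in for the Scheme wrapper that owns the file name and must close
the file on every path, including when mapping fails.  In the real
thing the wrapper would be written in Scheme and would rethrow instead
of returning MAP_FAILED.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical primitive: map the whole file behind FD read-only and
   store its size in *LEN_OUT.  It never opens or closes FD.  Returns
   MAP_FAILED on error; empty files are rejected because a zero-length
   mmap is invalid.  */
void *
map_fd_readonly (int fd, size_t *len_out)
{
  struct stat st;
  assert (len_out != NULL);
  if (fstat (fd, &st) != 0 || st.st_size <= 0)
    return MAP_FAILED;
  *len_out = (size_t) st.st_size;
  return mmap (NULL, *len_out, PROT_READ, MAP_PRIVATE, fd, 0);
}

/* Hypothetical wrapper standing in for the Scheme layer: it opens the
   file by name and closes the descriptor whether or not the mapping
   succeeded.  A successful mapping stays valid after close.  */
void *
map_path_readonly (const char *path, size_t *len_out)
{
  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return MAP_FAILED;
  void *mem = map_fd_readonly (fd, len_out);
  close (fd);
  return mem;
}
```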

> +/* The following copied from bytevectors.c. Kludge? */
> +#define SCM_BYTEVECTOR_SET_LENGTH(_bv, _len)            \
> +  SCM_SET_CELL_WORD_1 ((_bv), (scm_t_bits) (_len))
> +#define SCM_BYTEVECTOR_SET_CONTENTS(_bv, _contents)	\
> +  SCM_SET_CELL_WORD_2 ((_bv), (scm_t_bits) (_contents))

To avoid accidental problems if bytevectors.c is modified later, I'd add a comment
in bytevectors.c referring to this file, as a reminder that mmunmap
makes assumptions about the layout.

> +  scm_c_define ("PAGE_SIZE", scm_from_int (getpagesize()));

From local man page:

       SVr4,  4.4BSD,  SUSv2.   In SUSv2 the getpagesize() call is labeled LEGACY, and in
       POSIX.1-2001 it has been dropped; HP-UX does not have this call.

NOTES
       Portable applications should employ sysconf(_SC_PAGESIZE) instead of getpagesize():
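
A small sketch of the portable variant in plain C.  The helper names
query_page_size and round_up_to_page are made up for illustration and
do not appear in the patch; only sysconf(_SC_PAGESIZE) comes from the
man page quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return the system page size in bytes, or 0 if
   sysconf cannot report it.  */
size_t
query_page_size (void)
{
  long sz = sysconf (_SC_PAGESIZE);
  if (sz <= 0)
    return 0;
  return (size_t) sz;
}

/* Hypothetical helper: round LEN up to a multiple of PAGE, which must
   be a nonzero power of two.  Handy for computing mmap lengths.  */
size_t
round_up_to_page (size_t len, size_t page)
{
  assert (page != 0 && (page & (page - 1)) == 0);
  return (len + page - 1) & ~(page - 1);
}
```

In the patch itself, the getpagesize() call in the scm_c_define line
could presumably be replaced by sysconf (_SC_PAGESIZE), with the
integer conversion adjusted for the long return type.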

Greetings,
Maxime.
