unofficial mirror of guile-devel@gnu.org 
From: Maxime Devos <maximedevos@telenet.be>
To: Vijay Marupudi <vijaymarupudi@gatech.edu>, guile-devel@gnu.org
Subject: Re: Request to add *-resize! functions for contiguous mutable data structures.
Date: Sat, 07 Aug 2021 13:09:42 +0200
Message-ID: <45e44ba58dbbb3b2fd3ffdeab4add4a1f7525cd4.camel@telenet.be>
In-Reply-To: <97e4262b-3ff9-1b21-35d8-45ad9d45ca99@gatech.edu>

Vijay Marupudi wrote on Fri 06-08-2021 at 09:33 [-0500]:
> Hello!
> 
> I was curious if Guile would be willing to provide a series of
> new procedures for resizing contiguous memory regions.
> 
> (bytevector-resize! <bytevector> new-size [fill])
> (vector-resize! <vector> new-size [fill])
> 
> The [fill] parameter could be used if the new-size is bigger than
> the current size.
>
> This would make writing imperative code easier and more
> performant.

A problem is that this prevents optimisations and can currently
introduce bugs in concurrent code.  Consider the following code:

b.scm:
(use-modules (rnrs bytevectors))

(define (bv-first-two bv)
  (unless (bytevector? bv)
    (error "not a bv"))
  (unless (>= (bytevector-length bv) 2) ; L6
    (error "too small"))
  (values (bytevector-u8-ref bv 0)   ; L8
          (bytevector-u8-ref bv 1))) ; L9
bv-first-two


Compile it with optimisations enabled:

  guild compile b.scm -o b.go -O3 && guild disassemble b.go

(Unfortunately, Guile cannot yet compile away the bounds checks at L8 and L9,
even though we already performed a bounds check at L6.)

I can't say I understand the disassembled code very well, but I do note
that the bounds checks (search for (jl ...), (jnl ...) and imm-u64<?,
s64-imm<? and u64<?) are separate from the read of the bytevector's
'length' (maybe 'word-ref/immediate' or 'pointer-ref/immediate') and
from the reads of the first and second byte of the bytevector (maybe
(u8-ref 5 3 0) and (u8-ref 2 3 2)).

Now suppose some concurrent thread resizes the bytevector between the bounds
check and the actual reading, then there will be an out-of-bounds access ...
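
As a rough illustration of that interleaving, here is a single-threaded
simulation.  Everything in it is made up for the sketch: the <rbuf> record,
'rbuf-resize!' and the 'interfere' thunk (which stands in for the other
thread) are hypothetical and not part of Guile.  Because the read here is
still bounds-checked, the symptom is only an exception; with a real in-place
resize and the check separated from the read, it could be an actual
out-of-bounds memory access.

```scheme
(use-modules (rnrs bytevectors)
             (srfi srfi-9))

;; Hypothetical resizable buffer: a mutable box around a bytevector.
(define-record-type <rbuf>
  (make-rbuf data)
  rbuf?
  (data rbuf-data set-rbuf-data!))

;; Hypothetical resize: swap in a bytevector of NEW-SIZE bytes,
;; copying the common prefix of the old contents.
(define (rbuf-resize! rb new-size)
  (let* ((old (rbuf-data rb))
         (new (make-bytevector new-size 0)))
    (bytevector-copy! old 0 new 0
                      (min new-size (bytevector-length old)))
    (set-rbuf-data! rb new)))

;; Check the length, let INTERFERE run (it stands in for a concurrent
;; thread scheduled between the check and the use), then read.
(define (rbuf-first-two rb interfere)
  (unless (>= (bytevector-length (rbuf-data rb)) 2)
    (error "too small"))
  (interfere)
  (let ((bv (rbuf-data rb)))
    (values (bytevector-u8-ref bv 0)
            (bytevector-u8-ref bv 1))))
```

Calling (rbuf-first-two rb (lambda () (rbuf-resize! rb 1))) on a buffer that
passed the length check still fails at the second read, because the check no
longer describes the data being read.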

> I acknowledge that it is not idiomatic Scheme to use
> mutable data structures, however this is useful to me for
> dealing with large amounts of text data, in which I need random
> access and flexible data storage. It would allow me to move off
> my custom C extension vector and allow me to use other
> vector-* functions.
> 
> Ideally, this would use libc's `realloc` to make the resize
> quick, so that it can avoid data copying whenever possible.

If you're very careful, you can use 'bytevector->pointer', 'pointer->bytevector'
and (foreign-library-function ... "malloc" ...),
(foreign-library-function ... "realloc" ...),
(foreign-library-function ... "free" ...).
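
A minimal sketch of what that could look like, assuming Guile 3.0.6 or later
(for (system foreign-library)) and that the libc symbols can be found in the
global symbol table by passing #f as the library.  'make-raw-buffer' and
'resize-raw-buffer' are hypothetical helper names, not existing Guile
procedures.

```scheme
(use-modules (system foreign)
             (system foreign-library)
             (rnrs bytevectors))

;; Bind the libc allocation functions.  Passing #f as the library is
;; assumed to look the symbols up in the running process.
(define malloc
  (foreign-library-function #f "malloc"
                            #:return-type '*
                            #:arg-types (list size_t)))
(define realloc
  (foreign-library-function #f "realloc"
                            #:return-type '*
                            #:arg-types (list '* size_t)))
(define free
  (foreign-library-function #f "free"
                            #:return-type void
                            #:arg-types (list '*)))

;; Allocate SIZE (> 0) bytes outside the GC heap.  Returns the raw
;; pointer and a bytevector view onto the same memory.
(define (make-raw-buffer size)
  (let ((ptr (malloc size)))
    (when (null-pointer? ptr)
      (error "malloc failed"))
    (values ptr (pointer->bytevector ptr size))))

;; Resize the allocation behind PTR to NEW-SIZE (> 0) bytes.  Returns
;; the new pointer and a fresh bytevector view.  Any view created
;; before this call may now point at freed memory.
(define (resize-raw-buffer ptr new-size)
  (let ((new-ptr (realloc ptr new-size)))
    (when (null-pointer? new-ptr)
      (error "realloc failed"))
    (values new-ptr (pointer->bytevector new-ptr new-size))))
```

The "very careful" part is that the garbage collector does not manage this
memory: you have to call 'free' yourself, and you must never touch an old
bytevector view after 'realloc'.  This does nothing about the concurrency
problem described above either.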

(ice-9 vlist) and <https://github.com/ijp/fectors> might be interesting as well.
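
For a quick impression of (ice-9 vlist), here is a small sketch.  Note that
vlists are functional rather than resized in place: 'vlist-cons' returns a
new vlist with the element added at the front, and 'vlist-ref' gives indexed
access.  The variable name 'vl' is only for illustration.

```scheme
(use-modules (ice-9 vlist))

;; Build a vlist by consing onto the empty vlist; the most recently
;; added element ends up at index 0.
(define vl
  (vlist-cons 3 (vlist-cons 2 (vlist-cons 1 vlist-null))))

(vlist-length vl)   ; number of elements: 3
(vlist-ref vl 0)    ; element at the front: 3
(vlist->list vl)    ; as an ordinary list: (3 2 1)
```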

Greetings,
Maxime.
