all messages for Emacs-related lists mirrored at yhetil.org
 help / color / mirror / code / Atom feed
From: Richard Hansen <rhansen@rhansen.org>
To: Eli Zaretskii <eliz@gnu.org>
Cc: 55952@debbugs.gnu.org, monnier@iro.umontreal.ca
Subject: bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
Date: Wed, 15 Jun 2022 14:49:58 -0400	[thread overview]
Message-ID: <3fe77a58-74ab-d44f-1fd8-19cbf0b2bda3@rhansen.org> (raw)
In-Reply-To: <83fsk6rujd.fsf@gnu.org>


[-- Attachment #1.1.1: Type: text/plain, Size: 2781 bytes --]

>>> Thanks, but AFAICT the documentation doesn't describe accurately 
>>> enough what the modified code does: what if the pre-allocated 
>>> destination string doesn't have enough storage for the null byte the 
>>> code adds?
>>
>> The existing code advances the index for the terminator, it just 
>> doesn't write 0 to that byte. So the existing code already signals 
>> an error in that case unless the `strz` is the final field.
> 
> I don't see how this is relevant to the concern I expressed.

The point I was trying to make is that this patch doesn't change the behavior (in any significant way) in the case of an undersized output string. Perhaps the documentation could be improved, but I'd rather do that in a follow-up patch because it is outside of the scope of this patch.

> 
> My concern is that you found it necessary to add a comment about 
> writing the terminating null byte (which is a good thing), but didn't 
> mention that aspect in the manual, not even as a hint.  I think it is 
> noteworthy enough to be in the manual.

What do you mean? The patch changes the manual to say:

     When packing, the entire input string is copied tothe packed
     output followed by a null (zero) byte.

The attached revision tweaks the wording to make it stand out more:

     When packing, the entire input string is copied tothe packed
     output, and a null (zero) byte is written after that.

> 
>> Regardless, the documentation for `bindat-pack` [1] clearly states 
>> that the pre-allocated string must have enough room:
>>
>>> When pre-allocating, you should make sure `(length raw)` meets or 
>>> exceeds the total length to avoid an out-of-range error.
>> [1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Bindat-Functions.html#index-bindat_002dpack
> 
> This comes _after_ the place where strz is described, so if someone 
> reads the manual in order, they wouldn't have read that yet.  And even 
> if they did, there's no reason to assume they remember it well enough.
> 
> Bottom line: I think this aspect of the code is important to mention 
> in the manual.  The price is small, whereas the benefit could be 
> significant.

I disagree -- I think the price is relatively high compared to the benefit. The pre-allocated length requirement is a requirement of only `bindat-pack` -- not of `bindat-type` or any of the type specifiers -- so it is best to keep that requirement with the documentation of `bindat-pack`. Indeed, users are unaware that packing to a pre-allocated string is even possible until they read the documentation for `bindat-pack` (except for references in the caveats documented for fixed-length `str` and `strz`, which I plan on addressing in a future patch).

Thanks,
Richard

[-- Attachment #1.1.2: v2-0001-bindat-strz-Write-null-terminator-after-variable-.patch --]
[-- Type: text/x-patch, Size: 4165 bytes --]

From 0d10f10a6b2b26d831dc5b250197f83967f65719 Mon Sep 17 00:00:00 2001
From: Richard Hansen <rhansen@rhansen.org>
Date: Thu, 9 Jun 2022 20:41:50 -0400
Subject: [PATCH v2] bindat (strz): Write null terminator after variable length
 string

* lisp/emacs-lisp/bindat.el (bindat--pack-strz): Explicitly write a
null byte after packing a variable-length string to ensure proper
termination when packing to a pre-allocated string (bug#55952).
* doc/lispref/processes.texi (Bindat Types): Update documentation.
* test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
Update tests.
---
 doc/lispref/processes.texi           | 25 ++++++++-----------------
 lisp/emacs-lisp/bindat.el            |  3 +++
 test/lisp/emacs-lisp/bindat-tests.el |  4 ++--
 3 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/doc/lispref/processes.texi b/doc/lispref/processes.texi
index 8c8f8fd6b2..1018be3341 100644
--- a/doc/lispref/processes.texi
+++ b/doc/lispref/processes.texi
@@ -3495,23 +3495,14 @@ Bindat Types
 @item strz &optional @var{len}
 If @var{len} is not provided: Variable-length null-terminated unibyte
 string (@pxref{Text Representations}).  When packing, the entire input
-string is copied to the packed output.  The following byte will be
-null (zero) unless a pre-allocated string was provided to
-@code{bindat-pack}, in which case that byte is left unmodified.  The
-length of the packed output is the length of the input string plus one
-(for the null terminator).  The input string must not contain any null
-bytes.  If the input string is multibyte with only ASCII and
-@code{eight-bit} characters, it is converted to unibyte before it is
-packed; other multibyte strings signal an error.  When unpacking, the
-resulting string contains all bytes up to (but excluding) the null
-byte.
-
-@quotation Caution
-If a pre-allocated string is provided to @code{bindat-pack}, the
-packed output will not be properly null-terminated unless the
-pre-allocated string already has a null byte at the appropriate
-location.
-@end quotation
+string is copied to the packed output, and a null (zero) byte is
+written after that.  The length of the packed output is the length of
+the input string plus one (for the null terminator).  The input string
+must not contain any null bytes.  If the input string is multibyte
+with only ASCII and @code{eight-bit} characters, it is converted to
+unibyte before it is packed; other multibyte strings signal an error.
+When unpacking, the resulting string contains all bytes up to (but
+excluding) the null byte.
 
 If @var{len} is provided: @code{strz} behaves the same as @code{str}
 with one difference: When unpacking, the first null byte encountered
diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
index 9ba89a5e3f..46e2a4901c 100644
--- a/lisp/emacs-lisp/bindat.el
+++ b/lisp/emacs-lisp/bindat.el
@@ -450,6 +450,9 @@ bindat--pack-strz
         ;; need to scan the input string looking for a null byte.
         (error "Null byte encountered in input strz string"))
       (aset bindat-raw (+ bindat-idx i) (aref v i)))
+    ;; Explicitly write a null terminator in case the user provided a
+    ;; pre-allocated string to bindat-pack that wasn't zeroed first.
+    (aset bindat-raw (+ bindat-idx len) 0)
     (setq bindat-idx (+ bindat-idx len 1))))
 
 (defun bindat--pack-bits (len v)
diff --git a/test/lisp/emacs-lisp/bindat-tests.el b/test/lisp/emacs-lisp/bindat-tests.el
index 7d1233ded7..cc223ad14e 100644
--- a/test/lisp/emacs-lisp/bindat-tests.el
+++ b/test/lisp/emacs-lisp/bindat-tests.el
@@ -182,8 +182,8 @@ bindat-test--str-strz-prealloc
                 ((((x strz 2)) ((x . "a"))) . "ax")
                 ((((x strz 2)) ((x . "ab"))) . "ab")
                 ((((x strz 2)) ((x . "abc"))) . "ab")
-                ((,(bindat-type strz) "") . "xx")
-                ((,(bindat-type strz) "a") . "ax")))
+                ((,(bindat-type strz) "") . "\0x")
+                ((,(bindat-type strz) "a") . "a\0")))
     (let ((prealloc (make-string 2 ?x)))
       (apply #'bindat-pack (append (car tc) (list prealloc)))
       (should (equal prealloc (cdr tc))))))
-- 
2.36.1


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2022-06-15 18:49 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-13 21:48 bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string Richard Hansen
2022-06-14 12:52 ` Eli Zaretskii
2022-06-14 20:47   ` Richard Hansen
2022-06-15 12:16     ` Eli Zaretskii
2022-06-15 18:49       ` Richard Hansen [this message]
2022-06-16  7:15         ` Eli Zaretskii

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3fe77a58-74ab-d44f-1fd8-19cbf0b2bda3@rhansen.org \
    --to=rhansen@rhansen.org \
    --cc=55952@debbugs.gnu.org \
    --cc=eliz@gnu.org \
    --cc=monnier@iro.umontreal.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/emacs.git
	https://git.savannah.gnu.org/cgit/emacs/org-mode.git

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.