unofficial mirror of bug-gnu-emacs@gnu.org 
 help / color / mirror / code / Atom feed
* bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
@ 2022-06-13 21:48 Richard Hansen
  2022-06-14 12:52 ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Hansen @ 2022-06-13 21:48 UTC (permalink / raw)
  To: 55952; +Cc: monnier

[-- Attachment #1: Type: text/plain, Size: 396 bytes --]

X-Debbugs-CC: monnier@iro.umontreal.ca

Attached patch:

* lisp/emacs-lisp/bindat.el (bindat--pack-strz): Explicitly write a
null byte after packing a variable-length string to ensure proper
termination when packing to a pre-allocated string.
* doc/lispref/processes.texi (Bindat Types): Update documentation.
* test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
Update tests.

[-- Attachment #2: 0001-bindat-strz-Write-null-terminator-after-variable-len.patch --]
[-- Type: text/x-patch, Size: 3979 bytes --]

From 29e40414f1b344774ed9085a8f125fd0801276c3 Mon Sep 17 00:00:00 2001
From: Richard Hansen <rhansen@rhansen.org>
Date: Thu, 9 Jun 2022 20:41:50 -0400
Subject: [PATCH] bindat (strz): Write null terminator after variable length
 string

* lisp/emacs-lisp/bindat.el (bindat--pack-strz): Explicitly write a
null byte after packing a variable-length string to ensure proper
termination when packing to a pre-allocated string.
* doc/lispref/processes.texi (Bindat Types): Update documentation.
* test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
Update tests.
---
 doc/lispref/processes.texi           | 17 ++++-------------
 lisp/emacs-lisp/bindat.el            |  3 +++
 test/lisp/emacs-lisp/bindat-tests.el |  4 ++--
 3 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/doc/lispref/processes.texi b/doc/lispref/processes.texi
index 8c8f8fd6b2..179980c0ed 100644
--- a/doc/lispref/processes.texi
+++ b/doc/lispref/processes.texi
@@ -3495,24 +3495,15 @@ Bindat Types
 @item strz &optional @var{len}
 If @var{len} is not provided: Variable-length null-terminated unibyte
 string (@pxref{Text Representations}).  When packing, the entire input
-string is copied to the packed output.  The following byte will be
-null (zero) unless a pre-allocated string was provided to
-@code{bindat-pack}, in which case that byte is left unmodified.  The
-length of the packed output is the length of the input string plus one
-(for the null terminator).  The input string must not contain any null
-bytes.  If the input string is multibyte with only ASCII and
+string is copied to the packed output followed by a null (zero) byte.
+The length of the packed output is the length of the input string plus
+one (for the null terminator).  The input string must not contain any
+null bytes.  If the input string is multibyte with only ASCII and
 @code{eight-bit} characters, it is converted to unibyte before it is
 packed; other multibyte strings signal an error.  When unpacking, the
 resulting string contains all bytes up to (but excluding) the null
 byte.
 
-@quotation Caution
-If a pre-allocated string is provided to @code{bindat-pack}, the
-packed output will not be properly null-terminated unless the
-pre-allocated string already has a null byte at the appropriate
-location.
-@end quotation
-
 If @var{len} is provided: @code{strz} behaves the same as @code{str}
 with one difference: When unpacking, the first null byte encountered
 in the packed string and all subsequent bytes are excluded from the
diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
index 9ba89a5e3f..46e2a4901c 100644
--- a/lisp/emacs-lisp/bindat.el
+++ b/lisp/emacs-lisp/bindat.el
@@ -450,6 +450,9 @@ bindat--pack-strz
         ;; need to scan the input string looking for a null byte.
         (error "Null byte encountered in input strz string"))
       (aset bindat-raw (+ bindat-idx i) (aref v i)))
+    ;; Explicitly write a null terminator in case the user provided a
+    ;; pre-allocated string to bindat-pack that wasn't zeroed first.
+    (aset bindat-raw (+ bindat-idx len) 0)
     (setq bindat-idx (+ bindat-idx len 1))))
 
 (defun bindat--pack-bits (len v)
diff --git a/test/lisp/emacs-lisp/bindat-tests.el b/test/lisp/emacs-lisp/bindat-tests.el
index 7d1233ded7..cc223ad14e 100644
--- a/test/lisp/emacs-lisp/bindat-tests.el
+++ b/test/lisp/emacs-lisp/bindat-tests.el
@@ -182,8 +182,8 @@ bindat-test--str-strz-prealloc
                 ((((x strz 2)) ((x . "a"))) . "ax")
                 ((((x strz 2)) ((x . "ab"))) . "ab")
                 ((((x strz 2)) ((x . "abc"))) . "ab")
-                ((,(bindat-type strz) "") . "xx")
-                ((,(bindat-type strz) "a") . "ax")))
+                ((,(bindat-type strz) "") . "\0x")
+                ((,(bindat-type strz) "a") . "a\0")))
     (let ((prealloc (make-string 2 ?x)))
       (apply #'bindat-pack (append (car tc) (list prealloc)))
       (should (equal prealloc (cdr tc))))))
-- 
2.36.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
  2022-06-13 21:48 bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string Richard Hansen
@ 2022-06-14 12:52 ` Eli Zaretskii
  2022-06-14 20:47   ` Richard Hansen
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2022-06-14 12:52 UTC (permalink / raw)
  To: Richard Hansen; +Cc: 55952, monnier

> Cc: monnier@iro.umontreal.ca
> Date: Mon, 13 Jun 2022 17:48:15 -0400
> From: Richard Hansen <rhansen@rhansen.org>
> 
> Attached patch:
> 
> * lisp/emacs-lisp/bindat.el (bindat--pack-strz): Explicitly write a
> null byte after packing a variable-length string to ensure proper
> termination when packing to a pre-allocated string.
> * doc/lispref/processes.texi (Bindat Types): Update documentation.
> * test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
> Update tests.

Thanks, but AFAICT the documentation doesn't describe accurately
enough what the modified code does: what if the pre-allocated
destination string doesn't have enough storage for the null byte the
code adds?





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
  2022-06-14 12:52 ` Eli Zaretskii
@ 2022-06-14 20:47   ` Richard Hansen
  2022-06-15 12:16     ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Hansen @ 2022-06-14 20:47 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 55952, monnier


[-- Attachment #1.1: Type: text/plain, Size: 819 bytes --]

On 6/14/22 08:52, Eli Zaretskii wrote:
> Thanks, but AFAICT the documentation doesn't describe accurately 
> enough what the modified code does: what if the pre-allocated 
> destination string doesn't have enough storage for the null byte the 
> code adds?

The existing code advances the index for the terminator, it just doesn't write 0 to that byte. So the existing code already signals an error in that case unless the `strz` is the final field.

Regardless, the documentation for `bindat-pack` [1] clearly states that the pre-allocated string must have enough room:

> When pre-allocating, you should make sure `(length raw)` meets or 
> exceeds the total length to avoid an out-of-range error.
[1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Bindat-Functions.html#index-bindat_002dpack

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
  2022-06-14 20:47   ` Richard Hansen
@ 2022-06-15 12:16     ` Eli Zaretskii
  2022-06-15 18:49       ` Richard Hansen
  0 siblings, 1 reply; 6+ messages in thread
From: Eli Zaretskii @ 2022-06-15 12:16 UTC (permalink / raw)
  To: Richard Hansen; +Cc: 55952, monnier

> Date: Tue, 14 Jun 2022 16:47:47 -0400
> Cc: 55952@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Richard Hansen <rhansen@rhansen.org>
> 
> > Thanks, but AFAICT the documentation doesn't describe accurately 
> > enough what the modified code does: what if the pre-allocated 
> > destination string doesn't have enough storage for the null byte the 
> > code adds?
> 
> The existing code advances the index for the terminator, it just doesn't write 0 to that byte. So the existing code already signals an error in that case unless the `strz` is the final field.

I don't see how this is relevant to the concern I expressed.

My concern is that you found it necessary to add a comment about
writing the terminating null byte (which is a good thing), but didn't
mention that aspect in the manual, not even as a hint.  I think it is
noteworthy enough to be in the manual.

> Regardless, the documentation for `bindat-pack` [1] clearly states that the pre-allocated string must have enough room:
> 
> > When pre-allocating, you should make sure `(length raw)` meets or 
> > exceeds the total length to avoid an out-of-range error.
> [1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Bindat-Functions.html#index-bindat_002dpack

This comes _after_ the place where strz is described, so if someone
reads the manual in order, they wouldn't have read that yet.  And even
if they did, there's no reason to assume they remember it well enough.

Bottom line: I think this aspect of the code is important to mention
in the manual.  The price is small, whereas the benefit could be
significant.

Thanks.





^ permalink raw reply	[flat|nested] 6+ messages in thread

* bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
  2022-06-15 12:16     ` Eli Zaretskii
@ 2022-06-15 18:49       ` Richard Hansen
  2022-06-16  7:15         ` Eli Zaretskii
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Hansen @ 2022-06-15 18:49 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 55952, monnier


[-- Attachment #1.1.1: Type: text/plain, Size: 2781 bytes --]

>>> Thanks, but AFAICT the documentation doesn't describe accurately 
>>> enough what the modified code does: what if the pre-allocated 
>>> destination string doesn't have enough storage for the null byte the 
>>> code adds?
>>
>> The existing code advances the index for the terminator, it just 
>> doesn't write 0 to that byte. So the existing code already signals 
>> an error in that case unless the `strz` is the final field.
> 
> I don't see how this is relevant to the concern I expressed.

The point I was trying to make is that this patch doesn't change the behavior (in any significant way) in the case of an undersized output string. Perhaps the documentation could be improved, but I'd rather do that in a follow-up patch because it is outside of the scope of this patch.

> 
> My concern is that you found it necessary to add a comment about 
> writing the terminating null byte (which is a good thing), but didn't 
> mention that aspect in the manual, not even as a hint.  I think it is 
> noteworthy enough to be in the manual.

What do you mean? The patch changes the manual to say:

     When packing, the entire input string is copied tothe packed
     output followed by a null (zero) byte.

The attached revision tweaks the wording to make it stand out more:

     When packing, the entire input string is copied tothe packed
     output, and a null (zero) byte is written after that.

> 
>> Regardless, the documentation for `bindat-pack` [1] clearly states 
>> that the pre-allocated string must have enough room:
>>
>>> When pre-allocating, you should make sure `(length raw)` meets or 
>>> exceeds the total length to avoid an out-of-range error.
>> [1] https://www.gnu.org/software/emacs/manual/html_node/elisp/Bindat-Functions.html#index-bindat_002dpack
> 
> This comes _after_ the place where strz is described, so if someone 
> reads the manual in order, they wouldn't have read that yet.  And even 
> if they did, there's no reason to assume they remember it well enough.
> 
> Bottom line: I think this aspect of the code is important to mention 
> in the manual.  The price is small, whereas the benefit could be 
> significant.

I disagree -- I think the price is relatively high compared to the benefit. The pre-allocated length requirement is a requirement of only `bindat-pack` -- not of `bindat-type` or any of the type specifiers -- so it is best to keep that requirement with the documentation of `bindat-pack`. Indeed, users are unaware that packing to a pre-allocated string is even possible until they read the documentation for `bindat-pack` (except for references in the caveats documented for fixed-length `str` and `strz`, which I plan on addressing in a future patch).

Thanks,
Richard

[-- Attachment #1.1.2: v2-0001-bindat-strz-Write-null-terminator-after-variable-.patch --]
[-- Type: text/x-patch, Size: 4165 bytes --]

From 0d10f10a6b2b26d831dc5b250197f83967f65719 Mon Sep 17 00:00:00 2001
From: Richard Hansen <rhansen@rhansen.org>
Date: Thu, 9 Jun 2022 20:41:50 -0400
Subject: [PATCH v2] bindat (strz): Write null terminator after variable length
 string

* lisp/emacs-lisp/bindat.el (bindat--pack-strz): Explicitly write a
null byte after packing a variable-length string to ensure proper
termination when packing to a pre-allocated string (bug#55952).
* doc/lispref/processes.texi (Bindat Types): Update documentation.
* test/lisp/emacs-lisp/bindat-tests.el (bindat-test--str-strz-prealloc):
Update tests.
---
 doc/lispref/processes.texi           | 25 ++++++++-----------------
 lisp/emacs-lisp/bindat.el            |  3 +++
 test/lisp/emacs-lisp/bindat-tests.el |  4 ++--
 3 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/doc/lispref/processes.texi b/doc/lispref/processes.texi
index 8c8f8fd6b2..1018be3341 100644
--- a/doc/lispref/processes.texi
+++ b/doc/lispref/processes.texi
@@ -3495,23 +3495,14 @@ Bindat Types
 @item strz &optional @var{len}
 If @var{len} is not provided: Variable-length null-terminated unibyte
 string (@pxref{Text Representations}).  When packing, the entire input
-string is copied to the packed output.  The following byte will be
-null (zero) unless a pre-allocated string was provided to
-@code{bindat-pack}, in which case that byte is left unmodified.  The
-length of the packed output is the length of the input string plus one
-(for the null terminator).  The input string must not contain any null
-bytes.  If the input string is multibyte with only ASCII and
-@code{eight-bit} characters, it is converted to unibyte before it is
-packed; other multibyte strings signal an error.  When unpacking, the
-resulting string contains all bytes up to (but excluding) the null
-byte.
-
-@quotation Caution
-If a pre-allocated string is provided to @code{bindat-pack}, the
-packed output will not be properly null-terminated unless the
-pre-allocated string already has a null byte at the appropriate
-location.
-@end quotation
+string is copied to the packed output, and a null (zero) byte is
+written after that.  The length of the packed output is the length of
+the input string plus one (for the null terminator).  The input string
+must not contain any null bytes.  If the input string is multibyte
+with only ASCII and @code{eight-bit} characters, it is converted to
+unibyte before it is packed; other multibyte strings signal an error.
+When unpacking, the resulting string contains all bytes up to (but
+excluding) the null byte.
 
 If @var{len} is provided: @code{strz} behaves the same as @code{str}
 with one difference: When unpacking, the first null byte encountered
diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
index 9ba89a5e3f..46e2a4901c 100644
--- a/lisp/emacs-lisp/bindat.el
+++ b/lisp/emacs-lisp/bindat.el
@@ -450,6 +450,9 @@ bindat--pack-strz
         ;; need to scan the input string looking for a null byte.
         (error "Null byte encountered in input strz string"))
       (aset bindat-raw (+ bindat-idx i) (aref v i)))
+    ;; Explicitly write a null terminator in case the user provided a
+    ;; pre-allocated string to bindat-pack that wasn't zeroed first.
+    (aset bindat-raw (+ bindat-idx len) 0)
     (setq bindat-idx (+ bindat-idx len 1))))
 
 (defun bindat--pack-bits (len v)
diff --git a/test/lisp/emacs-lisp/bindat-tests.el b/test/lisp/emacs-lisp/bindat-tests.el
index 7d1233ded7..cc223ad14e 100644
--- a/test/lisp/emacs-lisp/bindat-tests.el
+++ b/test/lisp/emacs-lisp/bindat-tests.el
@@ -182,8 +182,8 @@ bindat-test--str-strz-prealloc
                 ((((x strz 2)) ((x . "a"))) . "ax")
                 ((((x strz 2)) ((x . "ab"))) . "ab")
                 ((((x strz 2)) ((x . "abc"))) . "ab")
-                ((,(bindat-type strz) "") . "xx")
-                ((,(bindat-type strz) "a") . "ax")))
+                ((,(bindat-type strz) "") . "\0x")
+                ((,(bindat-type strz) "a") . "a\0")))
     (let ((prealloc (make-string 2 ?x)))
       (apply #'bindat-pack (append (car tc) (list prealloc)))
       (should (equal prealloc (cdr tc))))))
-- 
2.36.1


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string
  2022-06-15 18:49       ` Richard Hansen
@ 2022-06-16  7:15         ` Eli Zaretskii
  0 siblings, 0 replies; 6+ messages in thread
From: Eli Zaretskii @ 2022-06-16  7:15 UTC (permalink / raw)
  To: Richard Hansen; +Cc: monnier, 55952-done

> Date: Wed, 15 Jun 2022 14:49:58 -0400
> Cc: 55952@debbugs.gnu.org, monnier@iro.umontreal.ca
> From: Richard Hansen <rhansen@rhansen.org>
> 
> >> The existing code advances the index for the terminator, it just 
> >> doesn't write 0 to that byte. So the existing code already signals 
> >> an error in that case unless the `strz` is the final field.
> > 
> > I don't see how this is relevant to the concern I expressed.
> 
> The point I was trying to make is that this patch doesn't change the behavior (in any significant way) in the case of an undersized output string. Perhaps the documentation could be improved, but I'd rather do that in a follow-up patch because it is outside of the scope of this patch.

I never had any objections to changing the behavior, nor in general to
the code changes in your patch.  My comments were only about the
documentation in the manual.

> > My concern is that you found it necessary to add a comment about 
> > writing the terminating null byte (which is a good thing), but didn't 
> > mention that aspect in the manual, not even as a hint.  I think it is 
> > noteworthy enough to be in the manual.
> 
> What do you mean?

I meant that the caveat about having enough space in the output string
for the terminating null byte is not explicitly mentioned where strz
and its handling during packing are documented.

> > Bottom line: I think this aspect of the code is important to mention 
> > in the manual.  The price is small, whereas the benefit could be 
> > significant.
> 
> I disagree -- I think the price is relatively high compared to the benefit. The pre-allocated length requirement is a requirement of only `bindat-pack` -- not of `bindat-type` or any of the type specifiers -- so it is best to keep that requirement with the documentation of `bindat-pack`. Indeed, users are unaware that packing to a pre-allocated string is even possible until they read the documentation for `bindat-pack` (except for references in the caveats documented for fixed-length `str` and `strz`, which I plan on addressing in a future patch).

I don't see any point in continuing to argue about this.  I have my
opinions about how our manuals should explain these issues, and your
disagreement, which is noted, doesn't change those opinions.  So I
installed your previous version, and then made the changes in the
manual I thought were pertinent.

Thanks.





^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-06-16  7:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-06-13 21:48 bug#55952: [PATCH] bindat (strz): Write null terminator after variable length string Richard Hansen
2022-06-14 12:52 ` Eli Zaretskii
2022-06-14 20:47   ` Richard Hansen
2022-06-15 12:16     ` Eli Zaretskii
2022-06-15 18:49       ` Richard Hansen
2022-06-16  7:15         ` Eli Zaretskii

Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).