unofficial mirror of emacs-devel@gnu.org 
 help / color / mirror / code / Atom feed
From: Stefan Monnier <monnier@iro.umontreal.ca>
To: Eli Zaretskii <eliz@gnu.org>
Cc: Petteri Hintsanen <petterih@iki.fi>,  emacs-devel@gnu.org
Subject: Re: Bindat can exhaust memory when unpacking to vector
Date: Sun, 10 Mar 2024 17:58:14 -0400	[thread overview]
Message-ID: <jwvjzm9ix5f.fsf-monnier+emacs@gnu.org> (raw)
In-Reply-To: <jwvr0l67x40.fsf-monnier+emacs@gnu.org> (Stefan Monnier's message of "Sat, 04 Nov 2023 04:21:25 -0400")

[ Somehow I can't find my previous answer in the `emacs-devel`
  archive.  In any case, here's an updated (and late) answer :-)  ]

> If you then, for example, eval this
> 
>   (bindat-unpack foo-bindat-spec [255 255 255 255 0 1 2 3 4 5])
> 
> it will result in (make-vector 4294967295 0), which makes Emacs hang.

Yup.  Validation is needed.  And

> If the data being unpacked cannot be trusted, then some kind of sanity
> checking should be done to avoid things like above.  In my application
> this is the case, so I embedded validation into the spec itself:
> 
>   (defconst foo-bindat-spec
>     (bindat-type
>      (length uint 32)
>      (_ unit (when (> length 1234) (error "malicious input")))
>      (data vec length)))
> 
> Which is somewhat messy, but it works.

You should also be able to do

    (defconst foo-bindat-spec
      (bindat-type
       (length uint 32)
       (data vec (if (< length 1234) length (error "malicious input")))))

as well.  But it's still not very elegant.  We could provide helper
functions so you can write things like

    (defconst foo-bindat-spec
      (bindat-type
       (length uint 32)
       (data vec (bounded 1 length 1234))))

so we signal an error if the length is less than 1 or larger than 1234.

> I also played around with the idea of patching bindat.el itself to do
> trivial checking against the input data size, like this:
> 
> diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el
> index 6f2af7f975b..c58f6c8f5c7 100644
> --- a/lisp/emacs-lisp/bindat.el
> +++ b/lisp/emacs-lisp/bindat.el
> @@ -204,6 +204,9 @@ bindat--unpack-item
>     ('str (bindat--unpack-str len))
>     ('strz (bindat--unpack-strz len))
>     ('vec
> +    (when (> len (length bindat-raw))
> +      (error "Vector length %d is greater than raw data length %d."
> +             len (length bindat-raw)))
>      (let ((v (make-vector len 0)) (vlen 1))
>        (if (consp vectype)
>           (setq vlen (nth 1 vectype)
> 
> but this feels way too intrusive.

Actually, this is a nice solution, I think.

It could conceivably be too conservative if someone manages to use `vec`
with a type that occupies less than one byte (say the input bytes is
a bitvector, or a compressed stream, and the vector output is the result of
uncompressing), but currently I can't see how that could be implemented
with `vec`.  I think it can be implemented with a recursive type, like I do in
`bindat-test--LEB128`, but doing it with `vec` would be very cumbersome.

It seems hypothetical enough that I think we should go with your patch.

> Checking should be optional and somehow programmable, perhaps
> a separate "checked vec" type?  (I don't have any good, concrete
> ideas, sorry.)

I don't see the benefit of not-checking, to be honest.


        Stefan




  reply	other threads:[~2024-03-10 21:58 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-22 19:36 Bindat can exhaust memory when unpacking to vector Petteri Hintsanen
2023-10-23 11:25 ` Eli Zaretskii
2023-10-23 19:36   ` Petteri Hintsanen
2023-10-24  4:34     ` tomas
2023-11-12 19:11       ` Petteri Hintsanen
2023-11-04  7:48     ` Eli Zaretskii
2023-11-04  8:21       ` Stefan Monnier
2024-03-10 21:58         ` Stefan Monnier [this message]
2024-03-19 20:09           ` Petteri Hintsanen
2024-03-19 21:08             ` Stefan Monnier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://www.gnu.org/software/emacs/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=jwvjzm9ix5f.fsf-monnier+emacs@gnu.org \
    --to=monnier@iro.umontreal.ca \
    --cc=eliz@gnu.org \
    --cc=emacs-devel@gnu.org \
    --cc=petterih@iki.fi \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/emacs.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).