From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Petteri Hintsanen Newsgroups: gmane.emacs.devel Subject: Bindat can exhaust memory when unpacking to vector Date: Sun, 22 Oct 2023 22:36:36 +0300 Message-ID: <87pm16fnzv.fsf@iki.fi> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="23940"; mail-complaints-to="usenet@ciao.gmane.io" User-Agent: Gnus/5.13 (Gnus v5.13) To: emacs-devel@gnu.org Cancel-Lock: sha1:0gbTn7SgCtWjs9LIKlMzmp+DbHI= Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Mon Oct 23 04:21:57 2023 Return-path: Envelope-to: ged-emacs-devel@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1qukZp-00066a-HB for ged-emacs-devel@m.gmane-mx.org; Mon, 23 Oct 2023 04:21:57 +0200 Original-Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qukYs-0003X9-8K; Sun, 22 Oct 2023 22:20:58 -0400 Original-Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1queFm-0006Fn-8I for emacs-devel@gnu.org; Sun, 22 Oct 2023 15:36:56 -0400 Original-Received: from ciao.gmane.io ([116.202.254.214]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1queFg-0006RW-N8 for emacs-devel@gnu.org; Sun, 22 Oct 2023 15:36:49 -0400 Original-Received: from list by ciao.gmane.io with local (Exim 4.92) (envelope-from ) id 1queFd-0003Vw-Jp for emacs-devel@gnu.org; Sun, 22 Oct 2023 21:36:41 +0200 X-Injected-Via-Gmane: http://gmane.org/ Received-SPF: pass client-ip=116.202.254.214; envelope-from=ged-emacs-devel@m.gmane-mx.org; helo=ciao.gmane.io X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.25, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Sun, 22 Oct 2023 22:20:55 -0400 X-BeenThere: emacs-devel@gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane-mx.org@gnu.org Xref: news.gmane.io gmane.emacs.devel:311694 Archived-At: It appears that bindat-unpack can exhaust memory when it unpacks to a vector. Consider the following Bindat type expression: (require 'bindat) (defconst foo-bindat-spec (bindat-type (length uint 32) (data vec length))) If you then, for example, eval this (bindat-unpack foo-bindat-spec [255 255 255 255 0 1 2 3 4 5]) it will result in (make-vector 4294967295 0), which makes Emacs hang. Now it can be argued whether this is a problem or not. If the data being unpacked cannot be trusted, then some kind of sanity checking should be done to avoid things like above. In my application this is the case, so I embedded validation into the spec itself: (defconst foo-bindat-spec (bindat-type (length uint 32) (_ unit (when (> length 1234) (error "malicious input"))) (data vec length))) Which is somewhat messy, but it works. I also played around with the idea of patching bindat.el itself to do trivial checking against the input data size, like this: diff --git a/lisp/emacs-lisp/bindat.el b/lisp/emacs-lisp/bindat.el index 6f2af7f975b..c58f6c8f5c7 100644 --- a/lisp/emacs-lisp/bindat.el +++ b/lisp/emacs-lisp/bindat.el @@ -204,6 +204,9 @@ bindat--unpack-item ('str (bindat--unpack-str len)) ('strz (bindat--unpack-strz len)) ('vec + (when (> len (length bindat-raw)) + (error "Vector length %d is greater than raw data length %d." + len (length bindat-raw))) (let ((v (make-vector len 0)) (vlen 1)) (if (consp vectype) (setq vlen (nth 1 vectype) but this feels way too intrusive. Checking should be optional and somehow programmable, perhaps a separate "checked vec" type? (I don't have any good, concrete ideas, sorry.) Nonetheless, I think it would be prudent to mention this potential pitfall in the Emacs Lisp manual so that users become aware of it. I can write a texinfo patch if you agree. [Side note: vec seems to be the only bindat type that suffers from this. str (string) is another type that has a varying length, but it is unpacked by `substring' which will implicitly guard against nonsensical lengths.] Thanks, Petteri