From mboxrd@z Thu Jan 1 00:00:00 1970
From: Noam Postavsky
Newsgroups: gmane.emacs.bugs
Subject: bug#33133: 26.1.50; zlib-decompress-region too rigid
Date: Tue, 30 Oct 2018 20:25:10 -0400
Message-ID: <87pnvrhsih.fsf@gmail.com>
References: <87a7n4mbos.fsf@gmail.com> <87efcbjc2d.fsf@gmail.com> <837ei2m63m.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux)
Cc: user42_kevin@yahoo.com.au, yamaoka@jpl.org, 33133@debbugs.gnu.org
To: Eli Zaretskii
X-GNU-PR-Message: followup 33133
X-GNU-PR-Package: emacs
X-GNU-PR-Keywords: patch
In-Reply-To: <837ei2m63m.fsf@gnu.org> (Eli Zaretskii's message of "Sun, 28 Oct 2018 17:41:17 +0200")
List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors"
Xref: news.gmane.org gmane.emacs.bugs:151854

--=-=-=
Content-Type: text/plain

Eli Zaretskii writes:

>> +data. If @var{allow-partial} is @code{nil}, on failure, the function
>
> We usually say "nil or omitted" for optional arguments. Also, I'd say
> "then on failure, ...", otherwise this could be misinterpreted as if
> "on failure" qualifies the "is nil" part.
>
> Same comment regarding the doc string of the function.

Makes sense, done.

>> +leaves the region unchanged and returns @code{nil}. Otherwise, it
>> +returns the number of bytes that were not decompressed and replaces
>> +the region text by whatever data was successfully decompressed. This
>> +function can be called only in unibyte buffers.
>
> Maybe it would make sense here to say that this emulates what 'gzip'
> does?

Hmm, maybe. I've added a mention of this, not sure if it actually
helps.

>> +  Lisp_Object ret = Qt;
>>    if (inflate_status != Z_STREAM_END)
>> +    {
>> +      if (!NILP (allow_partial))
>> +        ret = make_int (iend - pos_byte);

> Hmm... should we display a warning message, like gzip does?

Not unconditionally, I'd say. In the example which prompted this bug
thread, a warning would just be a nuisance. We could just leave it up
to the caller to print a warning message if they want, e.g.:

    (unless (eq t (zlib-decompress-region START END t))
      (message "Incomplete decompression"))

Or perhaps instead of returning the byte count, return an error
indicator which the caller could use to construct a warning message
(this could allow for a slightly more specific message)?

Or maybe it's easier if the ALLOW-PARTIAL parameter could have another
possible value to control display of the warning message?

--=-=-=
Content-Type: text/plain
Content-Disposition: attachment;
 filename=v2-0001-Allow-partial-decompression-Bug-33133.patch
Content-Description: patch

>From 43a912181a4a30d826b2c016ca05b5c9d8daf3f4 Mon Sep 17 00:00:00 2001
From: Noam Postavsky
Date: Sat, 27 Oct 2018 17:45:52 -0400
Subject: [PATCH v2] Allow partial decompression (Bug#33133)

* src/decompress.c (Fzlib_decompress_region): Add optional
ALLOW-PARTIAL parameter.
* lisp/url/url-http.el (url-handle-content-transfer-encoding): Use it.
* doc/lispref/text.texi (Decompression): Document it.
* etc/NEWS: Announce it.
---
 doc/lispref/text.texi | 11 +++++++----
 etc/NEWS              |  6 ++++++
 lisp/url/url-http.el  |  5 +++--
 src/decompress.c      | 22 +++++++++++++++++-----
 4 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/doc/lispref/text.texi b/doc/lispref/text.texi
index 6c38d8eed0..9f241a6c3f 100644
--- a/doc/lispref/text.texi
+++ b/doc/lispref/text.texi
@@ -4462,14 +4462,17 @@ Decompression
 available.
 @end defun
 
-@defun zlib-decompress-region start end
+@defun zlib-decompress-region start end &optional allow-partial
 This function decompresses the region between @var{start} and
 @var{end}, using built-in zlib decompression. The region should
 contain data that were compressed with gzip or zlib. On success, the
 function replaces the contents of the region with the decompressed
-data. On failure, the function leaves the region unchanged and
-returns @code{nil}. This function can be called only in unibyte
-buffers.
+data. If @var{allow-partial} is @code{nil} or omitted, then on
+failure, the function leaves the region unchanged and returns
+@code{nil}. Otherwise, it returns the number of bytes that were not
+decompressed and replaces the region text by whatever data was
+successfully decompressed. This function can be called only in
+unibyte buffers.
 @end defun
 
 
diff --git a/etc/NEWS b/etc/NEWS
index 3f86195695..395169253d 100644
--- a/etc/NEWS
+++ b/etc/NEWS
@@ -1159,6 +1159,12 @@ to mean that it is not known whether DST is in effect.
 'json-insert', 'json-parse-string', and 'json-parse-buffer'. These
 are implemented in C using the Jansson library.
 
++++
+** 'zlib-decompress-region' can partially decompress corrupted data.
+If the new optional ALLOW-PARTIAL argument is passed, then the data
+that was decompressed successfully before failing will be inserted
+into the buffer.
+
 ** Mailcap
 
 ---
diff --git a/lisp/url/url-http.el b/lisp/url/url-http.el
index 6b5749e1bc..94ac660fcf 100644
--- a/lisp/url/url-http.el
+++ b/lisp/url/url-http.el
@@ -939,7 +939,8 @@ url-http-parse-headers
       (goto-char (point-min))
       success))
 
-(declare-function zlib-decompress-region "decompress.c" (start end))
+(declare-function zlib-decompress-region "decompress.c"
+                  (start end &optional allow-partial))
 
 (defun url-handle-content-transfer-encoding ()
   (let ((encoding (mail-fetch-field "content-encoding")))
@@ -951,7 +952,7 @@ url-handle-content-transfer-encoding
         (widen)
         (goto-char (point-min))
         (when (search-forward "\n\n")
-          (zlib-decompress-region (point) (point-max)))))))
+          (zlib-decompress-region (point) (point-max) t))))))
 
 ;; Miscellaneous
 (defun url-http-activate-callback ()
diff --git a/src/decompress.c b/src/decompress.c
index 2836338216..dd55fd68e8 100644
--- a/src/decompress.c
+++ b/src/decompress.c
@@ -120,12 +120,18 @@ DEFUN ("zlib-available-p", Fzlib_available_p, Szlib_available_p, 0, 0, 0,
 
 DEFUN ("zlib-decompress-region", Fzlib_decompress_region,
        Szlib_decompress_region,
-       2, 2, 0,
+       2, 3, 0,
        doc: /* Decompress a gzip- or zlib-compressed region.
 Replace the text in the region by the decompressed data.
-On failure, return nil and leave the data in place.
+
+If optional parameter ALLOW-PARTIAL is nil or omitted, then on
+failure, return nil and leave the data in place. Otherwise, return
+the number of bytes that were not decompressed and replace the region
+text by whatever data was successfully decompressed (similar to gzip).
+If decompression is completely successful return t.
+
 This function can be called only in unibyte buffers. */)
-  (Lisp_Object start, Lisp_Object end)
+  (Lisp_Object start, Lisp_Object end, Lisp_Object allow_partial)
 {
   ptrdiff_t istart, iend, pos_byte;
   z_stream stream;
@@ -206,8 +212,14 @@ DEFUN ("zlib-decompress-region", Fzlib_decompress_region,
     }
   while (inflate_status == Z_OK);
 
+  Lisp_Object ret = Qt;
   if (inflate_status != Z_STREAM_END)
-    return unbind_to (count, Qnil);
+    {
+      if (!NILP (allow_partial))
+        ret = make_int (iend - pos_byte);
+      else
+        return unbind_to (count, Qnil);
+    }
 
   unwind_data.start = 0;
 
@@ -218,7 +230,7 @@ DEFUN ("zlib-decompress-region", Fzlib_decompress_region,
   signal_after_change (istart, iend - istart, unwind_data.nbytes);
   update_compositions (istart, istart, CHECK_HEAD);
 
-  return unbind_to (count, Qt);
+  return unbind_to (count, ret);
 }
 
 
-- 
2.11.0


--=-=-=--