From: Iñigo Serna
Newsgroups: gmane.emacs.bugs
Subject: bug#22333: 24.5; EWW downloads invalid compressed tar-files
Date: Sat, 09 Jan 2016 12:26:35 +0100
Message-ID: <87io3356z8.fsf@inigo.katxi.org>
References: <8760z3zrxs.fsf@inigo.katxi.org> <83vb73nt4m.fsf@gnu.org>
To: Andreas Schwab
Cc: 22333@debbugs.gnu.org
User-agent: mu4e 0.9.15; emacs 24.5.1

Hi and thanks for your time,

Andreas Schwab writes:

> Eli Zaretskii writes:
>
>> Could it be that the saved tar file is already uncompressed?
>
> I'd rather guess it's compressed twice.

I think I've discovered the problem with the downloaded files:

0. The file as downloaded (remember it's saved from an Emacs buffer)
   is not a valid .tar.gz file,
1. but it is a valid gzipped file.
2. The uncompressed file contains some HTTP headers at the start of
   the file.
3. After manually removing these lines and changing the extension (to
   add the '.gz' again), the file is correctly recognized as a gzipped
   tar file.

[0]
/home/inigo/Downloads ⚡ tar xvfz filename.tar.gz
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

[1]
/home/inigo/Downloads ⚡ gzip -d filename.tar.gz
/home/inigo/Downloads ⚡ file filename.tar
filename.tar: data

[2]
/home/inigo/Downloads ⚡ head -21 filename.tar
HTTP/1.1 200 OK
x-amz-replication-status: COMPLETED
Last-Modified: Sun, 08 Nov 2015 13:11:05 GMT
ETag: "9c13c5fafcb1aecd43f51fa9b0278000"
Content-Type: application/octet-stream
Server: AmazonS3
Via: 1.1 varnish
Fastly-Debug-Digest: 37a3779f444d206796099a174bb873c853bb8f1b9a13cf06f29108f907a9d50b
Cache-Control: max-age=31557600, public
Content-Length: 87341
Accept-Ranges: bytes
Date: Sat, 09 Jan 2016 10:16:29 GMT
Via: 1.1 varnish
Age: 135058
Connection: keep-alive
X-Served-By: cache-sea1920-SEA, cache-fra1247-FRA
X-Cache: HIT, HIT
X-Cache-Hits: 2, 1
X-Timer: S1452334589.874408,VS0,VE1

[...binary gzip data, including the string "dist/lfm-3.0.tar"...]

[3]
[...remove HTTP headers from first lines of file...]
/home/inigo/Downloads ⚡ file filename.tar
filename.tar: gzip compressed data, was "dist/lfm-3.0.tar", last modified: Sun Nov 8 14:07:54 2015, max compression
/home/inigo/Downloads ⚡ mv filename.tar filename.tar.gz
/home/inigo/Downloads ⚡ file filename.tar.gz
filename.tar.gz: gzip compressed data, was "dist/lfm-3.0.tar", last modified: Sun Nov 8 14:07:54 2015, max compression
/home/inigo/Downloads ⚡ tar xvfz filename.tar.gz
lfm-3.0/
[...]

So I think the error comes from those spurious headers added to the
start of the buffer when downloading the file, which mess up the
buffer-saving operation.
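
The manual cleanup in step [3] could be scripted roughly as follows.
This is only a sketch: it assumes the header block ends at the first
empty line (with or without a trailing carriage return) and that grep
supports the -a and -m options (GNU and BSD grep do). The function
name and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: drop a leading HTTP header block from a saved response.
# Assumption: the headers end at the first empty line (LF or CRLF).

strip_http_headers() {
    src=$1
    dst=$2
    cr=$(printf '\r')
    # Line number of the first empty line, i.e. the end of the headers.
    # -a treats the binary payload as text, -m 1 stops at the first match.
    n=$(LC_ALL=C grep -a -n -m 1 -E "^${cr}?\$" "$src" | cut -d: -f1)
    if [ -z "$n" ]; then
        echo "no header terminator found in $src" >&2
        return 1
    fi
    # tail only counts newline bytes, so the payload is copied unchanged.
    tail -n +"$((n + 1))" "$src" > "$dst"
}

# Hypothetical usage, mirroring step [3]:
#   strip_http_headers filename.tar filename.tar.gz
#   file filename.tar.gz
```

Working on line counts rather than editing the file in a text editor
avoids touching the binary data that follows the headers.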

The bug can be easily reproduced with code like:

    (let ((url "https://pypi.python.org/packages/source/l/lfm/lfm-3.0.1.tar.gz"))
      (url-retrieve url 'test-cb (list url)))

    (defun test-cb (status url)
      (message "Downloaded: %s" url)
      ;; (sleep-for 60)
      (write-file "~/a.tgz")
      (message "Saved: %s" url))

`eww-download-callback` (called by `url-retrieve`) should remove those
headers after the download has finished and before the buffer is saved
to disk.

A simple fix could be to add (excuse my poor elisp skills):

    (goto-char (point-min))
    (search-forward-regexp "^$")
    (forward-line)
    (delete-region (point-min) (point))

before '(write-file file)' in `eww-download-callback`.

... and this almost works, but the downloaded file is now compressed
twice, as Andreas stated (i.e., the tar.gz is gzipped again when the
file is saved).

Thus it looks like there is an additional issue when saving gzipped
tar files from a buffer.

Any idea on how to solve this?

Thanks,
Iñigo Serna