* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
@ 2021-05-01 11:38 Ioannis Kappas
2021-05-01 11:48 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-01 11:38 UTC (permalink / raw)
To: 48137
[-- Attachment #1.1: Type: text/plain, Size: 1106 bytes --]
There is an issue with `package-install-file' that fails with the
following error when trying to install a file written with DOS line
endings:
error: Package lacks a "Version" or "Package-Version" header
(This is more apparent on MS-Windows which writes files with DOS line
endings by default)
To reproduce, create a minimal package file with Emacs on MS-Windows
and write it to disk as package-install-file-test-win.el:
;;; package-install-file-test-win.el --- test issue
;; Version: 0.1
(provide 'package-install-file-test-win)
;;; package-install-file-test-win.el ends here
Then M-x package-install-file RET package-install-file-test-win.el RET
and it should produce the error.
The above succeeds on GNU/Linux and installs the dummy package as
expected.
I have attached an ert test to do the same. The test fail on windows
and succeeds on Linux.
You can run it from command line as:
: emacs -Q --batch -l ert -l ./issue-package-install-file-test.el -f
ert-run-tests-batch
The cause appears to be that 'package can't handle package files
encoded with DOS line endings. Analysis to follow.
[-- Attachment #1.2: Type: text/html, Size: 1633 bytes --]
[-- Attachment #2: issue-package-install-file-test.el --]
[-- Type: application/octet-stream, Size: 1531 bytes --]
;;; -*- lexical-binding: t; -*-
(require 'ert)
(defconst test/package ";;; package-install-file-test-win.el --- test issue
;; Version: 0.1
(defun package-install-file-test-win/5 () 5)
(provide 'package-install-file-test-win)
;;; package-install-file-test-win.el ends here
"
"Minimal test package file")
(ert-deftest package-install-file ()
"Test that package install file works on a minimal package file
written to disk by Emacs."
(message ":system %s :version %s" system-configuration emacs-version)
(let* ((temp-dir (make-temp-file "package-install-file-test-win" t))
(package-path (expand-file-name
"package-install-file-test-win.el"
temp-dir)))
;; write file in the default encoding
(with-temp-buffer
(insert test/package)
(write-region (point-min) (point-max) package-path))
(message ":package-file-created-at %s" package-path)
;; report on file encoding
(with-temp-buffer
(insert-file-contents package-path)
(let* ((contents (buffer-string)))
(message ":package-file-coding-system %s" buffer-file-coding-system)))
;; install package
(package-install-file package-path)
(delete-directory temp-dir t)
;; call dummy function
(require 'package-install-file-test-win)
(should (equal 5 (package-install-file-test-win/5)))
;; remove package
(package-delete (cadr (assq 'package-install-file-test-win package-alist)))))
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-01 11:38 bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings Ioannis Kappas
@ 2021-05-01 11:48 ` Ioannis Kappas
2021-05-01 12:15 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-01 11:48 UTC (permalink / raw)
To: 48137
[-- Attachment #1: Type: text/plain, Size: 2922 bytes --]
The error appears to be a side effect of `package-install-file'
loading the file literally with `insert-file-contents-literally' and
thus setting the coding system to 'no-conversion (as reported by the
`buffer-file-coding-system' variable ). This means that for a DOS
encoded file the ?\r (Carriage Return) in the DOS line ending ?\r?\n
will be treated as an independent control character rather than
consumed as part of the line ending as far as Emacs is concerned.
This extra control character in our case appears in the version number
extracted with `lm-header' (.e.g. "0.1^M"), failing the
`version-to-list' conversion in
`package-buffer-info'->`package-strip-rcs-id', resulting to the error
that it lacks a version header.
It is not obvious why `package-install-file' loads package files
literally at all times, `package--with-response-buffer-1' also does
the same. If the reason turns out not to be important, then a possible
solution could be to load package files metaphorically:
diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
index ecb2573cab..98d63f1aff 100644
--- a/lisp/emacs-lisp/package.el
+++ b/lisp/emacs-lisp/package.el
@@ -2147,8 +2147,11 @@ package-install-file
(progn
(setq default-directory file)
(dired-mode))
- (insert-file-contents-literally file)
- (when (string-match "\\.tar\\'" file) (tar-mode)))
+ (if (string-match "\\.tar\\'" file)
+ (progn
+ (insert-file-contents-literally file)
+ (tar-mode))
+ (insert-file-contents file)))
(package-install-from-buffer)))
;;;###autoload
and perhaps the same can be extended to
`package--with-response-buffer-1'.
Another solution could be to upgrade the 'lisp-mnt package to ignore
?\r characters. Looking at the `lm-header' fn invoked by
`package-buffer-info', it does have a list of characters to stop at
when looking for a header, we can thus add the carriage return to the
list:
diff --git a/lisp/emacs-lisp/lisp-mnt.el b/lisp/emacs-lisp/lisp-mnt.el
index 9cba232e16..3eb493d286 100644
--- a/lisp/emacs-lisp/lisp-mnt.el
+++ b/lisp/emacs-lisp/lisp-mnt.el
@@ -267,7 +267,7 @@ lm-header
(if (save-excursion
(skip-chars-backward "^$" (match-beginning 0))
(= (point) (match-beginning 0)))
- "[^\n]+" "[^$\n]+")))
+ "[^\n\r]+" "[^$\n\r]+")))
(match-string-no-properties 0))))
(defun lm-header-multiline (header)
The attached issue-package-install-file-test-up1.el, is an updated ert
including an additional `header' test for the above.
Or is it perhaps that packages should only be written with Unix line
endings and thus the solution is to update documentation and create
more targeted error messages?
I don't have a thorough understanding of 'package, so please feel free
to scrutinize the above/suggest alternatives.
Thanks
[-- Attachment #2: issue-package-install-file-test-up1.el --]
[-- Type: application/octet-stream, Size: 2561 bytes --]
;;; -*- lexical-binding: t; -*-
(require 'ert)
(defconst test/package ";;; package-install-file-test-win.el --- test issue
;; Version: 0.1
(defun package-install-file-test-win/5 () 5)
(provide 'package-install-file-test-win)
;;; package-install-file-test-win.el ends here
"
"Minimal test package file")
(ert-deftest package-install-file ()
"Test that package install file works on a minimal package file
written to disk by Emacs."
(message ":system %s :version %s" system-configuration emacs-version)
(let* ((temp-dir (make-temp-file "package-install-file-test-win" t))
(package-path (expand-file-name
"package-install-file-test-win.el"
temp-dir)))
;; write file in the default encoding
(with-temp-buffer
(insert test/package)
(write-region (point-min) (point-max) package-path))
(message ":package-file-created-at %s" package-path)
;; report on file encoding
(with-temp-buffer
(insert-file-contents package-path)
(let* ((contents (buffer-string)))
(message ":package-file-coding-system %s" buffer-file-coding-system)))
;; install package
(package-install-file package-path)
(delete-directory temp-dir t)
;; call dummy function
(require 'package-install-file-test-win)
(should (equal 5 (package-install-file-test-win/5)))
;; remove package
(package-delete (cadr (assq 'package-install-file-test-win package-alist)))))
(require 'lisp-mnt)
(ert-deftest header ()
"Test that lm-header can return version information on a dos
coded packaged file loaded literally (mimicking 'package)"
(message ":system %s :version %s" system-configuration emacs-version)
(let* ((temp-dir (make-temp-file "package-install-file-test-win" t))
(package-path (expand-file-name
"package-install-file-test-win.el"
temp-dir)))
;; write with dos line endings
(with-temp-buffer
(insert test/package)
(let ((coding-system-for-write 'prefer-utf-8-dos))
(write-region (point-min) (point-max) package-path))
(message ":package-file-coding-system %s" buffer-file-coding-system))
(message ":package-file-created-at %s" package-path)
;; load literally
(with-temp-buffer
(insert-file-contents-literally package-path)
(goto-char (point-min))
(should (string= "0.1" (lm-header "version"))))
(delete-directory temp-dir t)))
^ permalink raw reply related [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-01 11:48 ` Ioannis Kappas
@ 2021-05-01 12:15 ` Eli Zaretskii
2021-05-01 13:51 ` Stefan Monnier
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-01 12:15 UTC (permalink / raw)
To: Ioannis Kappas, Stefan Monnier; +Cc: 48137
> From: Ioannis Kappas <ioannis.kappas@gmail.com>
> Date: Sat, 1 May 2021 12:48:06 +0100
>
> The error appears to be a side effect of `package-install-file'
> loading the file literally with `insert-file-contents-literally' and
> thus setting the coding system to 'no-conversion (as reported by the
> `buffer-file-coding-system' variable ). This means that for a DOS
> encoded file the ?\r (Carriage Return) in the DOS line ending ?\r?\n
> will be treated as an independent control character rather than
> consumed as part of the line ending as far as Emacs is concerned.
>
> This extra control character in our case appears in the version number
> extracted with `lm-header' (.e.g. "0.1^M"), failing the
> `version-to-list' conversion in
> `package-buffer-info'->`package-strip-rcs-id', resulting to the error
> that it lacks a version header.
>
> It is not obvious why `package-install-file' loads package files
> literally at all times, `package--with-response-buffer-1' also does
> the same. If the reason turns out not to be important, then a possible
> solution could be to load package files metaphorically:
>
> diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
> index ecb2573cab..98d63f1aff 100644
> --- a/lisp/emacs-lisp/package.el
> +++ b/lisp/emacs-lisp/package.el
> @@ -2147,8 +2147,11 @@ package-install-file
> (progn
> (setq default-directory file)
> (dired-mode))
> - (insert-file-contents-literally file)
> - (when (string-match "\\.tar\\'" file) (tar-mode)))
> + (if (string-match "\\.tar\\'" file)
> + (progn
> + (insert-file-contents-literally file)
> + (tar-mode))
> + (insert-file-contents file)))
I don't think this is TRT, because insert-file-contents also decodes
the character encoding, not just the EOL encoding.
> Another solution could be to upgrade the 'lisp-mnt package to ignore
> ?\r characters. Looking at the `lm-header' fn invoked by
> `package-buffer-info', it does have a list of characters to stop at
> when looking for a header, we can thus add the carriage return to the
> list:
>
> diff --git a/lisp/emacs-lisp/lisp-mnt.el b/lisp/emacs-lisp/lisp-mnt.el
> index 9cba232e16..3eb493d286 100644
> --- a/lisp/emacs-lisp/lisp-mnt.el
> +++ b/lisp/emacs-lisp/lisp-mnt.el
> @@ -267,7 +267,7 @@ lm-header
> (if (save-excursion
> (skip-chars-backward "^$" (match-beginning 0))
> (= (point) (match-beginning 0)))
> - "[^\n]+" "[^$\n]+")))
> + "[^\n\r]+" "[^$\n\r]+")))
This is better, but IMO the code should be rewritten not to allow a
lone CR character, only either a lone LF or the CRLF pair.
> Or is it perhaps that packages should only be written with Unix line
> endings and thus the solution is to update documentation and create
> more targeted error messages?
This is probably the best, IMO. Stefan?
Thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-01 12:15 ` Eli Zaretskii
@ 2021-05-01 13:51 ` Stefan Monnier
2021-05-03 17:47 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Stefan Monnier @ 2021-05-01 13:51 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Ioannis Kappas, 48137
>> diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
>> index ecb2573cab..98d63f1aff 100644
>> --- a/lisp/emacs-lisp/package.el
>> +++ b/lisp/emacs-lisp/package.el
>> @@ -2147,8 +2147,11 @@ package-install-file
>> (progn
>> (setq default-directory file)
>> (dired-mode))
>> - (insert-file-contents-literally file)
>> - (when (string-match "\\.tar\\'" file) (tar-mode)))
>> + (if (string-match "\\.tar\\'" file)
>> + (progn
>> + (insert-file-contents-literally file)
>> + (tar-mode))
>> + (insert-file-contents file)))
>
> I don't think this is TRT, because insert-file-contents also decodes
> the character encoding, not just the EOL encoding.
I think for `.el` files it's actually correct to decode the character
encoding here (it's maybe not necessary, but I think it's at least
as correct as what we do now, since `package-install-from-buffer`
expects a "normal" buffer, hence for `.el` buffers it expects one where
characters have been decoded).
BTW, this code needs some love, for example to handle `.tar.lz` files.
>> Or is it perhaps that packages should only be written with Unix line
>> endings and thus the solution is to update documentation and create
>> more targeted error messages?
> This is probably the best, IMO. Stefan?
It sounds like a good idea as well, yes.
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-01 13:51 ` Stefan Monnier
@ 2021-05-03 17:47 ` Ioannis Kappas
2021-05-03 18:23 ` Stefan Monnier
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-03 17:47 UTC (permalink / raw)
To: Stefan Monnier; +Cc: 48137
On Sat, May 1, 2021 at 2:51 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> > I don't think this is TRT, because insert-file-contents also decodes
> > the character encoding, not just the EOL encoding.
>
> I think for `.el` files it's actually correct to decode the character
> encoding here (it's maybe not necessary, but I think it's at least
> as correct as what we do now, since `package-install-from-buffer`
> expects a "normal" buffer, hence for `.el` buffers it expects one where
> characters have been decoded).
>
Thanks Stefan. Does this also apply to the only other use of
`insert-file-contents-literally' in 'package,? In particular
`package--with-response-buffer-1' is referenced by
`package-check-signature' and `package--download-one-archive'.
(I am personally more in favor of supporting DOS files, because it
makes the caller's life easier on MS-Windows and brings them on par
with Unix, though Eli's concern has to be addressed first)
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 17:47 ` Ioannis Kappas
@ 2021-05-03 18:23 ` Stefan Monnier
2021-05-03 18:33 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Stefan Monnier @ 2021-05-03 18:23 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137
Ioannis Kappas [2021-05-03 18:47:34] wrote:
> On Sat, May 1, 2021 at 2:51 PM Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> > I don't think this is TRT, because insert-file-contents also decodes
>> > the character encoding, not just the EOL encoding.
>>
>> I think for `.el` files it's actually correct to decode the character
>> encoding here (it's maybe not necessary, but I think it's at least
>> as correct as what we do now, since `package-install-from-buffer`
>> expects a "normal" buffer, hence for `.el` buffers it expects one where
>> characters have been decoded).
> Thanks Stefan. Does this also apply to the only other use of
> `insert-file-contents-literally' in 'package,?
I don't think so, no.
> In particular `package--with-response-buffer-1' is referenced by
> `package-check-signature'
This definitely needs to deal with bytes only, we don't want any
encoding/decoding to risk changing the byte contents.
> and `package--download-one-archive'.
I think the same is true here.
> (I am personally more in favor of supporting DOS files, because it
> makes the caller's life easier on MS-Windows and brings them on par
> with Unix, though Eli's concern has to be addressed first)
My own opinion is that .el files are files that belong to Emacs and they
should use Emacs's "native" file format, whatever that is. I think the
"most native" would be `utf-8-emacs-unix` so I'd be OK with deprecating
all other encodings, but I don't think that's going to happen ;-)
My comment on that subject was instead focused on files distributed via
ELPA (and hence wouldn't really apply to `package-install-file`): we
could document that those files should use utf-8 and unix EOLs.
At the same time, I haven't seen a clear technical benefit to imposing
such a constraint, so it's probably not worth the trouble.
[ There can be a real technical advantage to imposing
`utf-8-emacs-unix` for .el files since it can save us the trip
through `load-with-code-conversion` which slows down `load`ing
non-byte-compiled files, but that doesn't matter much for ELPA files
since those are usually byte-compiled anyway. ]
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 18:23 ` Stefan Monnier
@ 2021-05-03 18:33 ` Eli Zaretskii
2021-05-03 18:49 ` Ioannis Kappas
2021-05-03 19:41 ` Stefan Monnier
0 siblings, 2 replies; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-03 18:33 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>, 48137@debbugs.gnu.org
> Date: Mon, 03 May 2021 14:23:53 -0400
>
> > Thanks Stefan. Does this also apply to the only other use of
> > `insert-file-contents-literally' in 'package,?
>
> I don't think so, no.
>
> > In particular `package--with-response-buffer-1' is referenced by
> > `package-check-signature'
>
> This definitely needs to deal with bytes only, we don't want any
> encoding/decoding to risk changing the byte contents.
>
> > and `package--download-one-archive'.
>
> I think the same is true here.
>
> > (I am personally more in favor of supporting DOS files, because it
> > makes the caller's life easier on MS-Windows and brings them on par
> > with Unix, though Eli's concern has to be addressed first)
>
> My own opinion is that .el files are files that belong to Emacs and they
> should use Emacs's "native" file format, whatever that is. I think the
> "most native" would be `utf-8-emacs-unix` so I'd be OK with deprecating
> all other encodings, but I don't think that's going to happen ;-)
I think it's wrong for package.el to try to decode these files. It
should deliver the files to the disk exactly as they are received
through the wire. And the only safe way of doing that is to treat
these files as raw bytes. Where we must interpret some parts of the
files, we should take precautions to handle the complications related
to the encoding, but we should not try to decode anything.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 18:33 ` Eli Zaretskii
@ 2021-05-03 18:49 ` Ioannis Kappas
2021-05-03 18:52 ` Eli Zaretskii
2021-05-03 19:41 ` Stefan Monnier
1 sibling, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-03 18:49 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
On Mon, May 3, 2021 at 7:33 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Eli Zaretskii <eliz@gnu.org>, 48137@debbugs.gnu.org
> > Date: Mon, 03 May 2021 14:23:53 -0400
> >
> > > Thanks Stefan. Does this also apply to the only other use of
> > > `insert-file-contents-literally' in 'package,?
> >
> > I don't think so, no.
> >
> > > In particular `package--with-response-buffer-1' is referenced by
> > > `package-check-signature'
> >
> > This definitely needs to deal with bytes only, we don't want any
> > encoding/decoding to risk changing the byte contents.
> >
> > > and `package--download-one-archive'.
> >
> > I think the same is true here.
> >
> > > (I am personally more in favor of supporting DOS files, because it
> > > makes the caller's life easier on MS-Windows and brings them on par
> > > with Unix, though Eli's concern has to be addressed first)
> >
> > My own opinion is that .el files are files that belong to Emacs and they
> > should use Emacs's "native" file format, whatever that is. I think the
> > "most native" would be `utf-8-emacs-unix` so I'd be OK with deprecating
> > all other encodings, but I don't think that's going to happen ;-)
>
> I think it's wrong for package.el to try to decode these files. It
> should deliver the files to the disk exactly as they are received
> through the wire. And the only safe way of doing that is to treat
> these files as raw bytes. Where we must interpret some parts of the
> files, we should take precautions to handle the complications related
> to the encoding, but we should not try to decode anything.
Thanks, I think there is agreement we shouldn't decode files coming
from a remote archive. Are there still concerns though with decoding
local .el files passed in to `package-install-file' supporting the
windows DOS EOL case Eli? If not, I would like to suggest a slightly
updated patch that is targeting .el files directly as an exemption:
diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
index ecb2573cab..19ab0445b9 100644
--- a/lisp/emacs-lisp/package.el
+++ b/lisp/emacs-lisp/package.el
@@ -2147,7 +2147,9 @@ package-install-file
(progn
(setq default-directory file)
(dired-mode))
- (insert-file-contents-literally file)
+ (if (string-match "\\.el\\'" file)
+ (insert-file-contents file)
+ (insert-file-contents-literally file))
(when (string-match "\\.tar\\'" file) (tar-mode)))
(package-install-from-buffer)))
^ permalink raw reply related [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 18:49 ` Ioannis Kappas
@ 2021-05-03 18:52 ` Eli Zaretskii
2021-05-03 20:12 ` Stefan Monnier
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-03 18:52 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137, monnier
> From: Ioannis Kappas <ioannis.kappas@gmail.com>
> Date: Mon, 3 May 2021 19:49:27 +0100
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
>
> > I think it's wrong for package.el to try to decode these files. It
> > should deliver the files to the disk exactly as they are received
> > through the wire. And the only safe way of doing that is to treat
> > these files as raw bytes. Where we must interpret some parts of the
> > files, we should take precautions to handle the complications related
> > to the encoding, but we should not try to decode anything.
>
> Thanks, I think there is agreement we shouldn't decode files coming
> from a remote archive. Are there still concerns though with decoding
> local .el files passed in to `package-install-file' supporting the
> windows DOS EOL case Eli?
Yes, I think we shouldn't decode _any_ files, not even the *.el
files. That includes their EOL convention. Therefore, this:
> If not, I would like to suggest a slightly
> updated patch that is targeting .el files directly as an exemption:
>
> diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
> index ecb2573cab..19ab0445b9 100644
> --- a/lisp/emacs-lisp/package.el
> +++ b/lisp/emacs-lisp/package.el
> @@ -2147,7 +2147,9 @@ package-install-file
> (progn
> (setq default-directory file)
> (dired-mode))
> - (insert-file-contents-literally file)
> + (if (string-match "\\.el\\'" file)
> + (insert-file-contents file)
> + (insert-file-contents-literally file))
> (when (string-match "\\.tar\\'" file) (tar-mode)))
> (package-install-from-buffer)))
is not something I can endorse, especially as it decodes much more
than just the EOL format.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 18:33 ` Eli Zaretskii
2021-05-03 18:49 ` Ioannis Kappas
@ 2021-05-03 19:41 ` Stefan Monnier
2021-05-04 11:34 ` Eli Zaretskii
1 sibling, 1 reply; 32+ messages in thread
From: Stefan Monnier @ 2021-05-03 19:41 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: ioannis.kappas, 48137
> I think it's wrong for package.el to try to decode these files. It
> should deliver the files to the disk exactly as they are received
> through the wire. And the only safe way of doing that is to treat
> these files as raw bytes.
I'm not sure what you're referring to here.
In the OP's case, i.e. `package-install-file` (which is a seldom used
functionality, not part of the "normal" `package-install`) the file doesn't go
through the wire: it's already "local" and we currently implement this
code on top of `package-install-from-buffer`, so it has to work
correctly with an already-decoded buffer.
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 18:52 ` Eli Zaretskii
@ 2021-05-03 20:12 ` Stefan Monnier
2021-05-04 11:39 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Stefan Monnier @ 2021-05-03 20:12 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Ioannis Kappas, 48137
>> If not, I would like to suggest a slightly
>> updated patch that is targeting .el files directly as an exemption:
>>
>> diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
>> index ecb2573cab..19ab0445b9 100644
>> --- a/lisp/emacs-lisp/package.el
>> +++ b/lisp/emacs-lisp/package.el
>> @@ -2147,7 +2147,9 @@ package-install-file
>> (progn
>> (setq default-directory file)
>> (dired-mode))
>> - (insert-file-contents-literally file)
>> + (if (string-match "\\.el\\'" file)
>> + (insert-file-contents file)
>> + (insert-file-contents-literally file))
>> (when (string-match "\\.tar\\'" file) (tar-mode)))
>> (package-install-from-buffer)))
>
> is not something I can endorse, especially as it decodes much more
> than just the EOL format.
package.el should do *some* decoding because it needs to look at
the pseudo-headers in the file to generate the `<pkg>-pkg.el` which
contains among other things the package's short description.
Admittedly, these will usually work OK in undecoded buffers as well,
but not in all cases.
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 19:41 ` Stefan Monnier
@ 2021-05-04 11:34 ` Eli Zaretskii
2021-05-04 15:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-04 11:34 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: ioannis.kappas@gmail.com, 48137@debbugs.gnu.org
> Date: Mon, 03 May 2021 15:41:47 -0400
>
> > I think it's wrong for package.el to try to decode these files. It
> > should deliver the files to the disk exactly as they are received
> > through the wire. And the only safe way of doing that is to treat
> > these files as raw bytes.
>
> I'm not sure what you're referring to here.
To the suggestion to decode *.el files by package.el as part of the
installation.
> In the OP's case, i.e. `package-install-file` (which is a seldom used
> functionality, not part of the "normal" `package-install`) the file doesn't go
> through the wire: it's already "local" and we currently implement this
> code on top of `package-install-from-buffer`, so it has to work
> correctly with an already-decoded buffer.
If this works with buffer text, then how are DOS EOLs come into play
here? There are no CRLF pairs in decoded buffer text.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-03 20:12 ` Stefan Monnier
@ 2021-05-04 11:39 ` Eli Zaretskii
0 siblings, 0 replies; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-04 11:39 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ioannis Kappas <ioannis.kappas@gmail.com>, 48137@debbugs.gnu.org
> Date: Mon, 03 May 2021 16:12:17 -0400
>
> >> If not, I would like to suggest a slightly
> >> updated patch that is targeting .el files directly as an exemption:
> >>
> >> diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
> >> index ecb2573cab..19ab0445b9 100644
> >> --- a/lisp/emacs-lisp/package.el
> >> +++ b/lisp/emacs-lisp/package.el
> >> @@ -2147,7 +2147,9 @@ package-install-file
> >> (progn
> >> (setq default-directory file)
> >> (dired-mode))
> >> - (insert-file-contents-literally file)
> >> + (if (string-match "\\.el\\'" file)
> >> + (insert-file-contents file)
> >> + (insert-file-contents-literally file))
> >> (when (string-match "\\.tar\\'" file) (tar-mode)))
> >> (package-install-from-buffer)))
> >
> > is not something I can endorse, especially as it decodes much more
> > than just the EOL format.
>
> package.el should do *some* decoding because it needs to look at
> the pseudo-headers in the file to generate the `<pkg>-pkg.el` which
> contains among other things the package's short description.
That's not "decoding" in the sense I used it, which is when we
_replace_ in memory the original raw bytes with the decoded contents.
"Decoding" for the purpose of parsing some part of the file is fine,
especially since here it just needs to be prepared to having an
optional CR before each LF. (In fact, I'd be also okay with decoding
all of it into a temporary buffer, though that would probably be waste
of cycles.)
> Admittedly, these will usually work OK in undecoded buffers as well,
> but not in all cases.
In what cases this would not work? Can those cases be handled by
directing the decoded stuff to a destination that is not the original
text, thus keeping the original text unaltered?
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-04 11:34 ` Eli Zaretskii
@ 2021-05-04 15:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-04 16:14 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-05-04 15:57 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: ioannis.kappas, 48137
>> > I think it's wrong for package.el to try to decode these files. It
>> > should deliver the files to the disk exactly as they are received
>> > through the wire. And the only safe way of doing that is to treat
>> > these files as raw bytes.
>> I'm not sure what you're referring to here.
> To the suggestion to decode *.el files by package.el as part of the
> installation.
Even for `package-install-file`?
[ which is a completely different code path than `package-install`.
It's a kind of "side feature" to install packages
using `package.el` without using an ELPArchive. ]
>> In the OP's case, i.e. `package-install-file` (which is a seldom used
>> functionality, not part of the "normal" `package-install`) the file doesn't go
>> through the wire: it's already "local" and we currently implement this
>> code on top of `package-install-from-buffer`, so it has to work
>> correctly with an already-decoded buffer.
> If this works with buffer text, then how are DOS EOLs come into play
> here? There are no CRLF pairs in decoded buffer text.
`package-install-file` takes a file, not a buffer, as input.
But internally it works by calling `package-install-from-buffer`.
Taking a step back, I think I'm just failing to understand why you
insist on not decoding the `.el` file.
What `package-install-file` will do is extract `<foo>-pkg.el` from it,
save a copy of the file somehow to `~/.emacs.d/elpa/<foo>/<foo>.el` and
then compile it. After that, the file (neither the original file, nor
its copy in `~/.emacs.d/elpa/<foo>/<foo>.el`) will basically never be
used any more. So the only "significant" uses of this file are
extraction of data for `<foo>-pkg.el` and byte-compilation, both of
which work on the decoded version of the file.
Furthermore, `insert-file-contents + write-region` should(!) preserve
the bytes in all normal cases.
I agree that in principle it would be better to copy the file by copying
its bytes than by `insert-file-contents + write-region`.
[ BTW, we have a bug in `package-install-from-buffer` currently for
buffers which contain non-ASCII chars (because it saves the buffer's
content via `package-unpack` => `package--write-file-no-coding`), so
if we do go ahead with the change Ioannis proposes (which I think is
acceptable) we definitely need to fix that bug in
`package-install-from-buffer`. ]
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-04 15:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-05-04 16:14 ` Eli Zaretskii
2021-05-04 16:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-04 16:14 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: ioannis.kappas@gmail.com, 48137@debbugs.gnu.org
> Date: Tue, 04 May 2021 11:57:53 -0400
>
> >> I'm not sure what you're referring to here.
> > To the suggestion to decode *.el files by package.el as part of the
> > installation.
>
> Even for `package-install-file`?
Yes, why not? How else can we ensure 100% that the installed file
will be identical to the original one?
> > If this works with buffer text, then how are DOS EOLs come into play
> > here? There are no CRLF pairs in decoded buffer text.
>
> `package-install-file` takes a file, not a buffer, as input.
> But internally it works by calling `package-install-from-buffer`.
Then at some point we do insert the file into a buffer. I'm saying
that we should insert it literally.
> What `package-install-file` will do is extract `<foo>-pkg.el` from it,
^^^^^^^
What is "it" in "from it" there?
> save a copy of the file somehow to `~/.emacs.d/elpa/<foo>/<foo>.el` and
> then compile it. After that, the file (neither the original file, nor
> its copy in `~/.emacs.d/elpa/<foo>/<foo>.el`) will basically never be
> used any more. So the only "significant" uses of this file are
> extraction of data for `<foo>-pkg.el` and byte-compilation, both of
> which work on the decoded version of the file.
I'm saying that extraction should not decode it. Byte compilation
will do its usual thing, but that's not what I was talking about.
> Furthermore, `insert-file-contents + write-region` should(!) preserve
> the bytes in all normal cases.
Yes, but that's not 100% guaranteed. It can fail. the only way not
to fail is not to decode (and then not to encode).
> I agree that in principle it would be better to copy the file by copying
> its bytes than by `insert-file-contents + write-region`.
Why only "in principle"? what prevents us from actually doing that?
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-04 16:14 ` Eli Zaretskii
@ 2021-05-04 16:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-04 16:51 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-05-04 16:27 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: ioannis.kappas, 48137
>> I agree that in principle it would be better to copy the file by copying
>> its bytes than by `insert-file-contents + write-region`.
> Why only "in principle"? what prevents us from actually doing that?
Someone willing to take the time and rework the code accordingly?
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-04 16:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-05-04 16:51 ` Eli Zaretskii
2021-05-05 7:03 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-04 16:51 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: ioannis.kappas@gmail.com, 48137@debbugs.gnu.org
> Date: Tue, 04 May 2021 12:27:37 -0400
>
> >> I agree that in principle it would be better to copy the file by copying
> >> its bytes than by `insert-file-contents + write-region`.
> > Why only "in principle"? what prevents us from actually doing that?
>
> Someone willing to take the time and rework the code accordingly?
I thought that's Ioannis?
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-04 16:51 ` Eli Zaretskii
@ 2021-05-05 7:03 ` Ioannis Kappas
2021-05-05 12:01 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-05 7:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
On Tue, May 4, 2021 at 5:52 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: ioannis.kappas@gmail.com, 48137@debbugs.gnu.org
> > Date: Tue, 04 May 2021 12:27:37 -0400
> >
> > >> I agree that in principle it would be better to copy the file by copying
> > >> its bytes than by `insert-file-contents + write-region`.
> > > Why only "in principle"? what prevents us from actually doing that?
> >
> > Someone willing to take the time and rework the code accordingly?
>
> I thought that's Ioannis?
I can have a look.
Currently, `package-install-file' reads the package contents into a
buffer and calls `package-install-from-buffer' to parse the headers,
download & install its dependencies, and finally install the package
itself from the buffer.
Just to confirm, what we are discussing as a solution (at a high
level) is to parse the package headers from a decoded buffer, download
& install its dependencies, and copy/extract the package file to the
elpa user dir?
Thanks
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-05 7:03 ` Ioannis Kappas
@ 2021-05-05 12:01 ` Eli Zaretskii
[not found] ` <CAMRHuGAi9+q-MKRGPxLqxdP_7SSF4Nqj+JuSsZigviAQs_d7Rw@mail.gmail.com>
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-05 12:01 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137, monnier
> From: Ioannis Kappas <ioannis.kappas@gmail.com>
> Date: Wed, 5 May 2021 08:03:17 +0100
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
>
> Currently, `package-install-file' reads the package contents into a
> buffer and calls `package-install-from-buffer' to parse the headers,
> download & install its dependencies, and finally install the package
> itself from the buffer.
>
> Just to confirm, what we are discussing as a solution (at a high
> level) is to parse the package headers from a decoded buffer, download
> & install its dependencies, and copy/extract the package file to the
> elpa user dir?
My idea was to read the file literally, without decoding, then parse
the package headers from that. Stefan seems to think it's
non-trivial, which makes me suspect that I'm missing something.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
[not found] ` <CAMRHuGAi9+q-MKRGPxLqxdP_7SSF4Nqj+JuSsZigviAQs_d7Rw@mail.gmail.com>
@ 2021-05-06 6:55 ` Ioannis Kappas
2021-05-06 8:12 ` Eli Zaretskii
2021-05-06 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 2 replies; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-06 6:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
On Wed, May 5, 2021 at 1:01 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Ioannis Kappas <ioannis.kappas@gmail.com>
> > Date: Wed, 5 May 2021 08:03:17 +0100
> > Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
> >
> > Currently, `package-install-file' reads the package contents into a
> > buffer and calls `package-install-from-buffer' to parse the headers,
> > download & install its dependencies, and finally install the package
> > itself from the buffer.
> >
> > Just to confirm, what we are discussing as a solution (at a high
> > level) is to parse the package headers from a decoded buffer, download
> > & install its dependencies, and copy/extract the package file to the
> > elpa user dir?
>
> My idea was to read the file literally, without decoding, then parse
> the package headers from that.
I suppose you mean something along the other option below?
>
> > Another solution could be to upgrade the 'lisp-mnt package to ignore
> > ?\r characters. Looking at the `lm-header' fn invoked by
> > `package-buffer-info', it does have a list of characters to stop at
> > when looking for a header, we can thus add the carriage return to the
> > list:
> >
> > diff --git a/lisp/emacs-lisp/lisp-mnt.el b/lisp/emacs-lisp/lisp-mnt.el
> > index 9cba232e16..3eb493d286 100644
> > --- a/lisp/emacs-lisp/lisp-mnt.el
> > +++ b/lisp/emacs-lisp/lisp-mnt.el
> > @@ -267,7 +267,7 @@ lm-header
> > (if (save-excursion
> > (skip-chars-backward "^$" (match-beginning 0))
> > (= (point) (match-beginning 0)))
> > - "[^\n]+" "[^$\n]+")))
> > + "[^\n\r]+" "[^$\n\r]+")))
>
> This is better, but IMO the code should be rewritten not to allow a
> lone CR character, only either a lone LF or the CRLF pair.
Thanks
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-06 6:55 ` Ioannis Kappas
@ 2021-05-06 8:12 ` Eli Zaretskii
2021-05-06 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
1 sibling, 0 replies; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-06 8:12 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137, monnier
> From: Ioannis Kappas <ioannis.kappas@gmail.com>
> Date: Thu, 6 May 2021 07:55:52 +0100
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
>
> > My idea was to read the file literally, without decoding, then parse
> > the package headers from that.
>
> I suppose you mean something along the other option below?
It could be, but as I said up-thread, this specific change allows a
lone \r to be taken as an end of line, which I think is wrong. We
should only support a single \n or a \r\n pair. See this comment I
made back then:
> > > - "[^\n]+" "[^$\n]+")))
> > > + "[^\n\r]+" "[^$\n\r]+")))
> >
> > This is better, but IMO the code should be rewritten not to allow a
> > lone CR character, only either a lone LF or the CRLF pair.
Thanks.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-06 6:55 ` Ioannis Kappas
2021-05-06 8:12 ` Eli Zaretskii
@ 2021-05-06 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-06 15:26 ` Eli Zaretskii
1 sibling, 1 reply; 32+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-05-06 13:27 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: Eli Zaretskii, 48137
>> My idea was to read the file literally, without decoding, then parse
>> the package headers from that.
> I suppose you mean something along the other option below?
That's not sufficient, because if we don't decode the file before we
call `package-buffer-info` (from `package-install-from-buffer`), then
the <foo>-pkg.el file will have incorrect content (e.g. the non-ASCII
chars in the description of the package, will be later incorrectly
displayed in `list-packages`).
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-06 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-05-06 15:26 ` Eli Zaretskii
2021-05-11 6:52 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-06 15:26 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Eli Zaretskii <eliz@gnu.org>, 48137@debbugs.gnu.org
> Date: Thu, 06 May 2021 09:27:38 -0400
>
> That's not sufficient, because if we don't decode the file before we
> call `package-buffer-info` (from `package-install-from-buffer`), then
> the <foo>-pkg.el file will have incorrect content (e.g. the non-ASCII
> chars in the description of the package, will be later incorrectly
> displayed in `list-packages`).
So you are saying the description of the package needs to be decoded
before using it for list-packages? That'd be okay; all I care about
is that the decoded stuff does NOT replace the original raw bytes, but
instead is used only where decoding is needed. IOW, decoding should
either be done on substrings of the original file, and the result
stored in strings, or the decoded stuff should be placed in a separate
scratch buffer, which will be used only where decoding is really
needed.
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-06 15:26 ` Eli Zaretskii
@ 2021-05-11 6:52 ` Ioannis Kappas
2021-05-11 12:55 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-11 6:52 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
On Thu, May 6, 2021 at 4:26 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Eli Zaretskii <eliz@gnu.org>, 48137@debbugs.gnu.org
> > Date: Thu, 06 May 2021 09:27:38 -0400
> >
> > That's not sufficient, because if we don't decode the file before we
> > call `package-buffer-info` (from `package-install-from-buffer`), then
> > the <foo>-pkg.el file will have incorrect content (e.g. the non-ASCII
> > chars in the description of the package, will be later incorrectly
> > displayed in `list-packages`).
>
> So you are saying the description of the package needs to be decoded
> before using it for list-packages? That'd be okay; all I care about
> is that the decoded stuff does NOT replace the original raw bytes, but
> instead is used only where decoding is needed. IOW, decoding should
> either be done on substrings of the original file, and the result
> stored in strings, or the decoded stuff should be placed in a separate
> scratch buffer, which will be used only where decoding is really
> needed.
Is loading with `insert-file-contents' and saving as 'raw-text the
same as copying the raw bytes of the original file?
`hexlify-buffer' in 'hexl uses 'raw-text to display the raw bytes of
an encoded buffer. I always assumed hexl displayed the actual binary
representation of the underlying file.
In which case, having `package-install-file' load the .el package file
metaphorically and modifying `package-unpack' to store 'single files
with 'raw-text should satisfy the requirement? Thus header parsing is
done in the intended coding system, while the end package is a "copy"
of the original.
Example patch:
diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
index ecb2573cab..b5fa020179 100644
--- a/lisp/emacs-lisp/package.el
+++ b/lisp/emacs-lisp/package.el
@@ -932,7 +932,7 @@ package-unpack
('single
(let ((el-file (expand-file-name (format "%s.el" name) pkg-dir)))
(make-directory pkg-dir t)
- (package--write-file-no-coding el-file)))
+ (package--write-file-raw-text el-file)))
(kind (error "Unknown package kind: %S" kind)))
(package--make-autoloads-and-stuff pkg-desc pkg-dir)
;; Update package-alist.
@@ -1180,9 +1180,9 @@ package-dir-info
;; Set of low-level functions for communicating with archives and
;; signature checking.
-(defun package--write-file-no-coding (file-name)
+(defun package--write-file-raw-text (file-name)
"Write file FILE-NAME without encoding using coding system."
- (let ((buffer-file-coding-system 'no-conversion))
+ (let ((buffer-file-coding-system 'raw-text))
(write-region (point-min) (point-max) file-name nil 'silent)))
(declare-function url-http-file-exists-p "url-http" (url))
@@ -2147,7 +2147,9 @@ package-install-file
(progn
(setq default-directory file)
(dired-mode))
- (insert-file-contents-literally file)
+ (if (string-match "\\.el\\'" file)
+ (insert-file-contents file)
+ (insert-file-contents-literally file))
(when (string-match "\\.tar\\'" file) (tar-mode)))
(package-install-from-buffer)))
Btw,
https://www.gnu.org/software/emacs/manual/html_node/elisp/Coding-System-Basics.html
mentions about the 'no-conversion coding system:
no-conversion (and its alias binary) is equivalent to raw-text-unix:
it specifies no conversion of either character codes or end-of-line.
but since it is -unix, it does do EOL conversions to LF. Should the
above be corrected to something like:
no-conversion (and its alias binary) is equivalent to raw-text-unix:
it specifies no conversion of character codes but converts
end-of-lines to the unix convention.
Thanks
^ permalink raw reply related [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-11 6:52 ` Ioannis Kappas
@ 2021-05-11 12:55 ` Eli Zaretskii
2021-05-15 13:52 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-11 12:55 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137, monnier
> From: Ioannis Kappas <ioannis.kappas@gmail.com>
> Date: Tue, 11 May 2021 07:52:02 +0100
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
>
> > So you are saying the description of the package needs to be decoded
> > before using it for list-packages? That'd be okay; all I care about
> > is that the decoded stuff does NOT replace the original raw bytes, but
> > instead is used only where decoding is needed. IOW, decoding should
> > either be done on substrings of the original file, and the result
> > stored in strings, or the decoded stuff should be placed in a separate
> > scratch buffer, which will be used only where decoding is really
> > needed.
>
> Is loading with `insert-file-contents' and saving as 'raw-text the
> same as copying the raw bytes of the original file?
No. You should load with insert-file-contents-literally, and then
saving will automatically DTRT.
You could also load using raw-text, in which case Emacs will be able
to convert EOL. But I don't recommend to do any conversions,
including the EOL conversions, because they still can change the
contents on saving in some rare cases.
> `hexlify-buffer' in 'hexl uses 'raw-text to display the raw bytes of
> an encoded buffer. I always assumed hexl displayed the actual binary
> representation of the underlying file.
It does, indeed.
> In which case, having `package-install-file' load the .el package file
> metaphorically and modifying `package-unpack' to store 'single files
> with 'raw-text should satisfy the requirement? Thus header parsing is
> done in the intended coding system, while the end package is a "copy"
> of the original.
Sorry, you lost me here: I don't think I understand the details of how
you intend to do the above.
> - (package--write-file-no-coding el-file)))
> + (package--write-file-raw-text el-file)))
You don't need to force any encoding on write: Emacs by default will
use the same encoding as the one used to read the file.
> Btw,
> https://www.gnu.org/software/emacs/manual/html_node/elisp/Coding-System-Basics.html
> mentions about the 'no-conversion coding system:
>
> no-conversion (and its alias binary) is equivalent to raw-text-unix:
> it specifies no conversion of either character codes or end-of-line.
>
> but since it is -unix, it does do EOL conversions to LF.
No, -unix means it doesn't convert EOL. IOW, the -unix part means
"assume Unix-style LF-only EOLs and don't convert EOL conventions".
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-11 12:55 ` Eli Zaretskii
@ 2021-05-15 13:52 ` Ioannis Kappas
2021-05-16 9:09 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-15 13:52 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
On Tue, May 11, 2021 at 1:55 PM Eli Zaretskii <eliz@gnu.org> wrote:
> > In which case, having `package-install-file' load the .el package file
> > metaphorically and modifying `package-unpack' to store 'single files
> > with 'raw-text should satisfy the requirement? Thus header parsing is
> > done in the intended coding system, while the end package is a "copy"
> > of the original.
>
> Sorry, you lost me here: I don't think I understand the details of how
> you intend to do the above.
Yes, sorry, you are right, I misunderstood how 'raw-text is used in
`hexlify-buffer'. I thought it was used to discern the *original bytes
seq* from the encoded buffer and send them over to the hexl process
for processing. So, I was inquiring whether we could use the same in
our case, i.e. decode the .el file (with `insert-file-contents'), but
have `package-unpack' save it with 'raw-text instead of
'no-conversion. I was wrong though in my extrapolation, since
'raw-text specifies the encoding to use for reading data from the
synchronous hexl process, not allegedly writing the original file byte
sequence to the process from the buffer.
Thus I take it there is no way to reconstruct the original file
sequence from an encoded buffer under all circumstances as you said,
even in our specific case where the scope is just an .el file.
I shall have a look next whether we could always load the package with
`insert-file-contents-literally' but parse headers with the correct
encoding (`find-operation-coding-system' looks like a promising fn
to determine the correct encoding from a literal file buffer).
Thanks
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-15 13:52 ` Ioannis Kappas
@ 2021-05-16 9:09 ` Ioannis Kappas
2021-05-29 8:20 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-05-16 9:09 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
On Sat, May 15, 2021 at 2:52 PM Ioannis Kappas <ioannis.kappas@gmail.com> wrote:
> I shall have a look next whether we could always load the package with
> `insert-file-contents-literally' but parse headers with the correct
> encoding (`find-operation-coding-system' looks like a promising fn
> to determine the correct encoding from a literal file buffer).
Please find below a patch to read package headers from a temporarily
decoded buffer, while keeping the literal buffer (which will be used
to install the package) intact.
diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
index ecb2573cab..a7a11bc6cc 100644
--- a/lisp/emacs-lisp/package.el
+++ b/lisp/emacs-lisp/package.el
@@ -2122,8 +2122,22 @@ package-install-from-buffer
((derived-mode-p 'tar-mode)
(package-tar-file-info))
(t
- (save-excursion
- (package-buffer-info)))))
+ ;; Package headers should be parsed from decoded text
+ ;; (see Bug#48137) where possible.
+ (if (and (eq buffer-file-coding-system 'no-conversion)
+ buffer-file-name)
+ (let* ((package-buffer (current-buffer))
+ (decoding-system
+ (car (find-operation-coding-system
'insert-file-contents
+ (cons
buffer-file-name
+
package-buffer)))))
+ (with-temp-buffer
+ (insert-buffer-substring package-buffer)
+ (decode-coding-region (point-min) (point-max)
decoding-system)
+ (package-buffer-info)))
+
+ (save-excursion
+ (package-buffer-info))))))
(name (package-desc-name pkg-desc)))
;; Download and install the dependencies.
(let* ((requires (package-desc-reqs pkg-desc))
@@ -2148,6 +2162,7 @@ package-install-file
(setq default-directory file)
(dired-mode))
(insert-file-contents-literally file)
+ (set-visited-file-name file)
(when (string-match "\\.tar\\'" file) (tar-mode)))
(package-install-from-buffer)))
Notes:
- The logic can be moved to `package-buffer-info'.
- Ideally, I would have liked to only copy and header section to the
temporary buffer for decoding, but the `(while (comment-forward))`
trick that I've tried fails to move past the headers section in
literal buffers with CRLF pairs. Perhaps it is better after all not
to try and use any regex ops on a 'binary buffer?
thanks
^ permalink raw reply related [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-16 9:09 ` Ioannis Kappas
@ 2021-05-29 8:20 ` Eli Zaretskii
2021-05-29 13:59 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-29 8:20 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137, monnier
> From: Ioannis Kappas <ioannis.kappas@gmail.com>
> Date: Sun, 16 May 2021 10:09:07 +0100
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
>
> On Sat, May 15, 2021 at 2:52 PM Ioannis Kappas <ioannis.kappas@gmail.com> wrote:
> > I shall have a look next whether we could always load the package with
> > `insert-file-contents-literally' but parse headers with the correct
> > encoding (`find-operation-coding-system' looks like a promising fn
> > to determine the correct encoding from a literal file buffer).
>
> Please find below a patch to read package headers from a temporarily
> decoded buffer, while keeping the literal buffer (which will be used
> to install the package) intact.
Stefan, any comments?
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-29 8:20 ` Eli Zaretskii
@ 2021-05-29 13:59 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-29 14:09 ` Eli Zaretskii
0 siblings, 1 reply; 32+ messages in thread
From: Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors @ 2021-05-29 13:59 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Ioannis Kappas, 48137
>> From: Ioannis Kappas <ioannis.kappas@gmail.com>
>> Date: Sun, 16 May 2021 10:09:07 +0100
>> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 48137@debbugs.gnu.org
>>
>> On Sat, May 15, 2021 at 2:52 PM Ioannis Kappas <ioannis.kappas@gmail.com> wrote:
>> > I shall have a look next whether we could always load the package with
>> > `insert-file-contents-literally' but parse headers with the correct
>> > encoding (`find-operation-coding-system' looks like a promising fn
>> > to determine the correct encoding from a literal file buffer).
>>
>> Please find below a patch to read package headers from a temporarily
>> decoded buffer, while keeping the literal buffer (which will be used
>> to install the package) intact.
>
> Stefan, any comments?
No particular comment from me, except: this is tricky enough that
it would benefit from some regression tests.
Stefan
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-29 13:59 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
@ 2021-05-29 14:09 ` Eli Zaretskii
2021-06-06 9:11 ` Ioannis Kappas
0 siblings, 1 reply; 32+ messages in thread
From: Eli Zaretskii @ 2021-05-29 14:09 UTC (permalink / raw)
To: Stefan Monnier; +Cc: ioannis.kappas, 48137
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Ioannis Kappas <ioannis.kappas@gmail.com>, 48137@debbugs.gnu.org
> Date: Sat, 29 May 2021 09:59:01 -0400
>
> >> Please find below a patch to read package headers from a temporarily
> >> decoded buffer, while keeping the literal buffer (which will be used
> >> to install the package) intact.
> >
> > Stefan, any comments?
>
> No particular comment from me, except: this is tricky enough that
> it would benefit from some regression tests.
Thanks. Yes, that's a good idea. Ioannis, could you perhaps add some
tests, both using existing packages and perhaps also some you concoct
just for testing purposes?
^ permalink raw reply [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-05-29 14:09 ` Eli Zaretskii
@ 2021-06-06 9:11 ` Ioannis Kappas
2021-07-20 13:54 ` Lars Ingebrigtsen
0 siblings, 1 reply; 32+ messages in thread
From: Ioannis Kappas @ 2021-06-06 9:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: 48137, Stefan Monnier
[-- Attachment #1: Type: text/plain, Size: 1463 bytes --]
On Sat, May 29, 2021 at 3:09 PM Eli Zaretskii <eliz@gnu.org> wrote:
>
> > From: Stefan Monnier <monnier@iro.umontreal.ca>
> > Cc: Ioannis Kappas <ioannis.kappas@gmail.com>, 48137@debbugs.gnu.org
> > Date: Sat, 29 May 2021 09:59:01 -0400
> >
> > >> Please find below a patch to read package headers from a temporarily
> > >> decoded buffer, while keeping the literal buffer (which will be used
> > >> to install the package) intact.
> > >
> > > Stefan, any comments?
> >
> > No particular comment from me, except: this is tricky enough that
> > it would benefit from some regression tests.
>
> Thanks. Yes, that's a good idea. Ioannis, could you perhaps add some
> tests, both using existing packages and perhaps also some you concoct
> just for testing purposes?
Please find attached a patch with the earlier fix and two new ert
package tests. `package-test-install-file' tests the installation of
single .el file and a .tar archive packages as referenced from the
test harness. `package-test-install-file-EOLs' rewrites the single .el
package multiple time with different EOL conventions ('unix, 'dos and
'mac), installs them and confirms that they are written verbatim
without modifications to the package directory by comparing that the
md5 hashes of the installing package vs the installed package are the
same.
All package tests pass locally both on GNU/Linux and MS-Windows.
Please feel free to scrutinize the patch and/or make modifications.
Thanks
[-- Attachment #2: emacs-bug48137.patch --]
[-- Type: application/octet-stream, Size: 5390 bytes --]
diff --git a/lisp/emacs-lisp/package.el b/lisp/emacs-lisp/package.el
index 5df9b53657..1b8f3304ba 100644
--- a/lisp/emacs-lisp/package.el
+++ b/lisp/emacs-lisp/package.el
@@ -2195,8 +2195,22 @@ package-install-from-buffer
((derived-mode-p 'tar-mode)
(package-tar-file-info))
(t
- (save-excursion
- (package-buffer-info)))))
+ ;; Package headers should be parsed from decoded text
+ ;; (see Bug#48137) where possible.
+ (if (and (eq buffer-file-coding-system 'no-conversion)
+ buffer-file-name)
+ (let* ((package-buffer (current-buffer))
+ (decoding-system
+ (car (find-operation-coding-system 'insert-file-contents
+ (cons buffer-file-name
+ package-buffer)))))
+ (with-temp-buffer
+ (insert-buffer-substring package-buffer)
+ (decode-coding-region (point-min) (point-max) decoding-system)
+ (package-buffer-info)))
+
+ (save-excursion
+ (package-buffer-info))))))
(name (package-desc-name pkg-desc)))
;; Download and install the dependencies.
(let* ((requires (package-desc-reqs pkg-desc))
@@ -2222,6 +2236,7 @@ package-install-file
(setq default-directory file)
(dired-mode))
(insert-file-contents-literally file)
+ (set-visited-file-name file)
(when (string-match "\\.tar\\'" file) (tar-mode)))
(package-install-from-buffer)))
diff --git a/test/lisp/emacs-lisp/package-tests.el b/test/lisp/emacs-lisp/package-tests.el
index 67d647d3b9..2943579955 100644
--- a/test/lisp/emacs-lisp/package-tests.el
+++ b/test/lisp/emacs-lisp/package-tests.el
@@ -263,6 +263,74 @@ package-test-install-single
(should (file-exists-p autoloads-file))
(should-not (get-file-buffer autoloads-file)))))
+(ert-deftest package-test-install-file ()
+ "Install files with `package-install-file'."
+ (with-package-test (:basedir (ert-resource-directory))
+ (package-initialize)
+ (let* ((pkg-el "simple-single-1.3.el")
+ (source-file (expand-file-name pkg-el (ert-resource-directory))))
+ (should-not (package-installed-p 'simple-single))
+ (package-install-file source-file)
+ (should (package-installed-p 'simple-single))
+ (package-delete (cadr (assq 'simple-single package-alist)))
+ (should-not (package-installed-p 'simple-single)))
+
+ (let* ((pkg-el "multi-file-0.2.3.tar")
+ (source-file (expand-file-name pkg-el (ert-resource-directory))))
+ (package-initialize)
+ (should-not (package-installed-p 'multie-file))
+ (package-install-file source-file)
+ (should (package-installed-p 'multi-file))
+ (package-delete (cadr (assq 'multi-file package-alist))))
+ ))
+
+(ert-deftest package-test-install-file-EOLs ()
+ "Install same file multiple time with `package-install-file'
+but with a different end of line convention (bug#48137)."
+ (with-package-test (:basedir (ert-resource-directory))
+ (package-initialize)
+ (let* ((pkg-el "simple-single-1.3.el")
+ (source-file (expand-file-name pkg-el (ert-resource-directory))))
+
+ (with-temp-buffer
+ (insert-file-contents source-file)
+
+ (let (hashes)
+ (dolist (coding '(unix dos mac) hashes)
+ (let* ((eol-file (expand-file-name pkg-el package-test-user-dir)))
+ ;; save package with this EOL convention.
+ (set-buffer-file-coding-system coding)
+ (write-region (point-min) (point-max) eol-file)
+
+ (should-not (package-installed-p 'simple-single))
+ (package-install-file eol-file)
+ (should (package-installed-p 'simple-single))
+
+ ;; check the package file has been installed unmodified.
+ (let ((eol-hash (with-temp-buffer
+ (insert-file-contents-literally eol-file)
+ (buffer-hash))))
+ ;; also perform an additional check that the package
+ ;; file created with this EOL convention is different
+ ;; than all the others created so far.
+ (should-not (member eol-hash hashes))
+ (setq hashes (cons eol-hash hashes))
+
+ (let* ((descr (cadr (assq 'simple-single package-alist)))
+ (pkg-dir (package-desc-dir descr))
+ (dest-file (expand-file-name "simple-single.el" pkg-dir ))
+ (dest-hash (with-temp-buffer
+ (insert-file-contents-literally dest-file)
+ (buffer-hash))))
+
+ (should (string= dest-hash eol-hash))))
+
+ (package-delete (cadr (assq 'simple-single package-alist)))
+ (should-not (package-installed-p 'simple-single))
+ (delete-file eol-file)
+ (should-not (file-exists-p eol-file))
+ )))))))
+
(ert-deftest package-test-install-dependency ()
"Install a package which includes a dependency."
(with-package-test ()
^ permalink raw reply related [flat|nested] 32+ messages in thread
* bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings
2021-06-06 9:11 ` Ioannis Kappas
@ 2021-07-20 13:54 ` Lars Ingebrigtsen
0 siblings, 0 replies; 32+ messages in thread
From: Lars Ingebrigtsen @ 2021-07-20 13:54 UTC (permalink / raw)
To: Ioannis Kappas; +Cc: 48137, Stefan Monnier
Ioannis Kappas <ioannis.kappas@gmail.com> writes:
>> Thanks. Yes, that's a good idea. Ioannis, could you perhaps add some
>> tests, both using existing packages and perhaps also some you concoct
>> just for testing purposes?
>
> Please find attached a patch with the earlier fix and two new ert
> package tests.
Skimming this thread, it seemed people agreed that this was the right
solution, so I've now installed the patch in Emacs 28.
This change was just small enough to apply without assigning copyright
to the FSF, but for future patches you want to submit, it might make
sense to get the paperwork started now, so that subsequent patches can
be applied speedily. Would you be willing to sign such paperwork?
--
(domestic pets only, the antidote for overdose, milk.)
bloggy blog: http://lars.ingebrigtsen.no
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2021-07-20 13:54 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-05-01 11:38 bug#48137: 27.2; `package-install-file' fails when loading a package file with DOS line endings Ioannis Kappas
2021-05-01 11:48 ` Ioannis Kappas
2021-05-01 12:15 ` Eli Zaretskii
2021-05-01 13:51 ` Stefan Monnier
2021-05-03 17:47 ` Ioannis Kappas
2021-05-03 18:23 ` Stefan Monnier
2021-05-03 18:33 ` Eli Zaretskii
2021-05-03 18:49 ` Ioannis Kappas
2021-05-03 18:52 ` Eli Zaretskii
2021-05-03 20:12 ` Stefan Monnier
2021-05-04 11:39 ` Eli Zaretskii
2021-05-03 19:41 ` Stefan Monnier
2021-05-04 11:34 ` Eli Zaretskii
2021-05-04 15:57 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-04 16:14 ` Eli Zaretskii
2021-05-04 16:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-04 16:51 ` Eli Zaretskii
2021-05-05 7:03 ` Ioannis Kappas
2021-05-05 12:01 ` Eli Zaretskii
[not found] ` <CAMRHuGAi9+q-MKRGPxLqxdP_7SSF4Nqj+JuSsZigviAQs_d7Rw@mail.gmail.com>
2021-05-06 6:55 ` Ioannis Kappas
2021-05-06 8:12 ` Eli Zaretskii
2021-05-06 13:27 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-06 15:26 ` Eli Zaretskii
2021-05-11 6:52 ` Ioannis Kappas
2021-05-11 12:55 ` Eli Zaretskii
2021-05-15 13:52 ` Ioannis Kappas
2021-05-16 9:09 ` Ioannis Kappas
2021-05-29 8:20 ` Eli Zaretskii
2021-05-29 13:59 ` Stefan Monnier via Bug reports for GNU Emacs, the Swiss army knife of text editors
2021-05-29 14:09 ` Eli Zaretskii
2021-06-06 9:11 ` Ioannis Kappas
2021-07-20 13:54 ` Lars Ingebrigtsen
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).