* package.el encoding problem
@ 2019-05-23 15:18 Yuri D'Elia
2019-05-23 15:27 ` Noam Postavsky
2019-05-24 15:45 ` Stefan Monnier
0 siblings, 2 replies; 12+ messages in thread
From: Yuri D'Elia @ 2019-05-23 15:18 UTC (permalink / raw)
To: emacs-devel
package.el seems to have a fit with this melpa package:
https://melpa.org/packages/web-mode-20190522.610.el
When attempting to install, it fails to determine the encoding of the
file and prompts for one. However when downloading it by hand it does
seem to be correctly coded as utf-8.
* Re: package.el encoding problem
2019-05-23 15:18 package.el encoding problem Yuri D'Elia
@ 2019-05-23 15:27 ` Noam Postavsky
2019-05-23 15:33 ` Yuri D'Elia
2019-05-24 15:45 ` Stefan Monnier
1 sibling, 1 reply; 12+ messages in thread
From: Noam Postavsky @ 2019-05-23 15:27 UTC (permalink / raw)
To: Yuri D'Elia; +Cc: Emacs developers
On Thu, 23 May 2019 at 11:24, Yuri D'Elia <wavexx@thregr.org> wrote:
>
> package.el seems to have a fit with this melpa package:
>
> https://melpa.org/packages/web-mode-20190522.610.el
>
> When attempting to install, it fails to determine the encoding of the
> file and prompts for one. However when downloading it by hand it does
> seem to be correctly coded as utf-8.
Is this a package.el of the latest master (I think Stefan recently
pushed some changes that affect decoding) or an earlier one?
* Re: package.el encoding problem
2019-05-23 15:18 package.el encoding problem Yuri D'Elia
2019-05-23 15:27 ` Noam Postavsky
@ 2019-05-24 15:45 ` Stefan Monnier
2019-05-24 16:36 ` Stefan Monnier
1 sibling, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2019-05-24 15:45 UTC (permalink / raw)
To: emacs-devel
> package.el seems to have a fit with this melpa package:
> https://melpa.org/packages/web-mode-20190522.610.el
> When attempting to install, it fails to determine the encoding of the
> file and prompts for one.
Can you enable "Options => Enter Debugger on Quit", then reproduce the
problem then hit C-g when you get the prompt?
[ Or do `M-: (debug)` when you get the prompt. ]
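For reference, that menu entry toggles the variable `debug-on-quit`, so a
rough Lisp equivalent (a small sketch, to be evaluated with M-: or from the
*scratch* buffer) would be:

```elisp
;; Enter the debugger when C-g is pressed, e.g. at the coding-system prompt.
(setq debug-on-quit t)
```

Setting it back to nil afterwards restores the normal C-g behaviour.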
You might also do this within a formal bug-report, so we get a tracking
number for it.
Stefan
* Re: package.el encoding problem
2019-05-24 15:45 ` Stefan Monnier
@ 2019-05-24 16:36 ` Stefan Monnier
2019-05-25 6:43 ` Eli Zaretskii
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2019-05-24 16:36 UTC (permalink / raw)
To: emacs-devel
> Can you enable "Options => Enter Debugger on Quit", then reproduce the
> problem then hit C-g when you get the prompt?
> [ Or do `M-: (debug)` when you get the prompt. ]
Don't bother, I managed to reproduce it after all. It should be fixed, thanks,
Stefan
* Re: package.el encoding problem
2019-05-24 16:36 ` Stefan Monnier
@ 2019-05-25 6:43 ` Eli Zaretskii
2019-05-25 12:15 ` Stefan Monnier
0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2019-05-25 6:43 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Fri, 24 May 2019 12:36:01 -0400
>
> > Can you enable "Options => Enter Debugger on Quit", then reproduce the
> > problem then hit C-g when you get the prompt?
> > [ Or do `M-: (debug)` when you get the prompt. ]
>
> Don't bother, I managed to reproduce it after all. It should be fixed, thanks,
Thanks, but I don't think I understand the fix. Could you explain why
making the temporary buffer unibyte solves the problem? I thought the
need for unibyte buffers was removed long ago, at least in the vast
majority of situations.
* Re: package.el encoding problem
2019-05-25 6:43 ` Eli Zaretskii
@ 2019-05-25 12:15 ` Stefan Monnier
2019-05-25 13:47 ` Eli Zaretskii
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2019-05-25 12:15 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
>> > Can you enable "Options => Enter Debugger on Quit", then reproduce the
>> > problem then hit C-g when you get the prompt?
>> > [ Or do `M-: (debug)` when you get the prompt. ]
>> Don't bother, I managed to reproduce it after all. It should be fixed, thanks,
> Thanks, but I don't think I understand the fix.
As you can see in the fix's assertion, the data we receive is
a unibyte string and we need to save it into a file.
We used to put it into a multibyte buffer, which then causes the save
to be all confused because the bytes 128-255 it contains aren't part of
any coding-system. We did try to circumvent this problem by specifying
`no-conversion` coding system, but I think the way we did this wasn't
quite right.
Rather than try to fix the circumvention, I decided to "do it right" and
use a unibyte buffer so the question doesn't show up.
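As a minimal sketch of that approach (the helper name
`my/write-bytes-to-file` is made up for illustration, and this is not the
code that was committed to package.el), assuming the downloaded data
arrives as a unibyte string:

```elisp
;; Hypothetical helper, for illustration only.
(defun my/write-bytes-to-file (bytes file)
  "Write the unibyte string BYTES to FILE without any conversion."
  ;; Refuse character data: this helper is meant for raw bytes only.
  (when (multibyte-string-p bytes)
    (error "Expected a unibyte string"))
  (with-temp-buffer
    ;; Make the buffer hold bytes rather than characters.
    (set-buffer-multibyte nil)
    (insert bytes)
    ;; Extra precaution in this sketch: also state explicitly that
    ;; nothing should be encoded on the way out.
    (let ((coding-system-for-write 'no-conversion))
      (write-region (point-min) (point-max) file nil 'silent))))
```

Since the buffer never contains characters, there is nothing for Emacs to
ask an encoding question about.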
> I thought the need for unibyte buffers was removed long ago, at least
> in the vast majority of situations.
It is definitely *possible* to use multibyte buffers even in cases where
we only manipulate bytes, but it is undesirable.
Stefan
* Re: package.el encoding problem
2019-05-25 12:15 ` Stefan Monnier
@ 2019-05-25 13:47 ` Eli Zaretskii
2019-05-25 14:55 ` Stefan Monnier
0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2019-05-25 13:47 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sat, 25 May 2019 08:15:52 -0400
>
> >> > Can you enable "Options => Enter Debugger on Quit", then reproduce the
> >> > problem then hit C-g when you get the prompt?
> >> > [ Or do `M-: (debug)` when you get the prompt. ]
> >> Don't bother, I managed to reproduce it after all. It should be fixed, thanks,
> > Thanks, but I don't think I understand the fix.
>
> As you can see in the fix's assertion, the data we receive is
> a unibyte string and we need to save it into a file.
Yes, which is why I said I didn't understand the fix. Multibyte
buffers can handle raw bytes without any problem. Or at least I
thought they did.
> We used to put it into a multibyte buffer, which then causes the save
> to be all confused because the bytes 128-255 it contains aren't part of
> any coding-system.
I don't see how that matters. Raw bytes should be converted back to
their original unibyte form when saving, no matter what coding-system
is used.
Could you perhaps show a recipe for the problem? I'd like to look
into what happens there.
> It is definitely *possible* to use multibyte buffers even in cases where
> we only manipulate bytes, but it is undesirable.
I'm probably missing something, because I don't see why that would be
undesirable. Hopefully, a reproducible recipe will show me the light.
* Re: package.el encoding problem
2019-05-25 13:47 ` Eli Zaretskii
@ 2019-05-25 14:55 ` Stefan Monnier
2019-05-25 15:03 ` Eli Zaretskii
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2019-05-25 14:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel
>> We used to put it into a multibyte buffer, which then causes the save
>> to be all confused because the bytes 128-255 it contains aren't part of
>> any coding-system.
>
> I don't see how that matters. Raw bytes should be converted back to
> their original unibyte form when saving, no matter what coding-system
> is used.
The bytes were saved correctly. But before that happened, the user was
prompted to choose a coding-system.
>> It is definitely *possible* to use multibyte buffers even in cases where
>> we only manipulate bytes, but it is undesirable.
> I'm probably missing something, because I don't see why that would be
> undesirable.
It doesn't do anything else than introduce problems (e.g. having to
decide how to encode chars even though there are no chars to decode)
and inefficiencies.
I'm rather curious what you think would be the benefits from using
a multibyte buffer here.
Stefan
* Re: package.el encoding problem
2019-05-25 14:55 ` Stefan Monnier
@ 2019-05-25 15:03 ` Eli Zaretskii
2019-05-25 15:26 ` Stefan Monnier
0 siblings, 1 reply; 12+ messages in thread
From: Eli Zaretskii @ 2019-05-25 15:03 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: emacs-devel@gnu.org
> Date: Sat, 25 May 2019 10:55:36 -0400
>
> >> We used to put it into a multibyte buffer, which then causes the save
> >> to be all confused because the bytes 128-255 it contains aren't part of
> >> any coding-system.
> >
> > I don't see how that matters. Raw bytes should be converted back to
> > their original unibyte form when saving, no matter what coding-system
> > is used.
>
> The bytes were saved correctly. But before that happened, the user was
> prompted to choose a coding-system.
Is that the only problem? IOW, if we prevent the prompt, will
package.el work correctly in this scenario?
You previously said that in the past we attempted to bind
coding-system-for-write, which in general is the easiest way of
preventing the prompt. Didn't it work?
> >> It is definitely *possible* to use multibyte buffers even in cases where
> >> we only manipulate bytes, but it is undesirable.
> > I'm probably missing something, because I don't see why that would be
> > undesirable.
>
> It doesn't do anything else than introduce problems (e.g. having to
> decide how to encode chars even though there are no chars to decode)
> and inefficiencies.
>
> I'm rather curious what you think would be the benefits from using
> a multibyte buffer here.
One obvious benefit is that you won't need to set the buffer to be
unibyte. People tend to regard this as some kind of black magic,
which creates myths, like the (wrong) idea that unibyte text cannot be
processed correctly in a multibyte buffer. I'd rather we avoided
substantiating such myths. Also, we should in theory be able to
eliminate unibyte buffers, at least in Lisp application code.
* Re: package.el encoding problem
2019-05-25 15:03 ` Eli Zaretskii
@ 2019-05-25 15:26 ` Stefan Monnier
2019-05-25 16:03 ` Eli Zaretskii
0 siblings, 1 reply; 12+ messages in thread
From: Stefan Monnier @ 2019-05-25 15:26 UTC (permalink / raw)
To: emacs-devel
> You previously said that in the past we attempted to bind
> coding-system-for-write, which in general is the easiest way of
> preventing the prompt. Didn't it work?
For some reason this function does something else:
    (defun package--write-file-no-coding (file-name)
      (let ((buffer-file-coding-system 'no-conversion))
        (write-region (point-min) (point-max) file-name nil 'silent)))
AFAIK using coding-system-for-write would have solved the problem as well.
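A sketch of what that variant could look like (the function name is
hypothetical and this is not what package.el contains; it only
illustrates binding `coding-system-for-write` around `write-region`):

```elisp
;; Hypothetical variant, for illustration only.
(defun my/write-file-no-conversion (file-name)
  "Write the current buffer to FILE-NAME without encoding anything."
  ;; `coding-system-for-write' overrides the coding system chosen for
  ;; output, so Emacs has no reason to ask the user for one.
  (let ((coding-system-for-write 'no-conversion))
    (write-region (point-min) (point-max) file-name nil 'silent)))
```

Called in a multibyte buffer that holds raw bytes, this should write those
bytes back out unchanged.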
> One obvious benefit is that you won't need to set the buffer to be
> unibyte.
(set-buffer-multibyte nil) sets up the buffer to receive bytes.
Since we're putting bytes into the buffer, it's The Right Thing to do.
> People tend to regard this as some kind of black magic, which creates
> myths, like the (wrong) idea that unibyte text cannot be processed
> correctly in a multibyte buffer.
I think they're right: it's hard to get it right. Partly because it
encourages confusion between bytes and chars (and confusion between
sequences of bytes and sequences of chars).
Don't get me wrong: it's important that it be possible to do it, because
that's sometimes necessary. But when the code only manipulates bytes,
using any multibyte objects along the way is asking for trouble.
> I'd rather we avoided substantiating such myths.
And I'd rather we clarify that chars aren't bytes and vice versa.
[ I also wish we could throw away the "unibyte" and "multibyte"
vocabulary which again encourages such confusion. ]
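To make the distinction concrete, here is a small sketch (the function
name `my/chars-vs-bytes` is invented for this example) contrasting one
character with the bytes that encode it:

```elisp
;; Hypothetical demo function, for illustration only.
(defun my/chars-vs-bytes ()
  "Contrast a one-character string with its UTF-8 encoded bytes."
  (let* ((chars "\u00e9")                            ; one character: e-acute
         (bytes (encode-coding-string chars 'utf-8))) ; its UTF-8 bytes
    (list (length chars)               ; number of characters
          (multibyte-string-p chars)   ; a string of characters
          (length bytes)               ; number of bytes
          (multibyte-string-p bytes)   ; a string of bytes
          ;; Decoding the bytes gives the character back.
          (equal (decode-coding-string bytes 'utf-8) chars))))
```

Evaluating `(my/chars-vs-bytes)` should give `(1 t 2 nil t)`: one
character, two bytes, and the two strings are different kinds of object.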
> Also, we should in theory be able to eliminate unibyte buffers, at
> least in Lisp application code.
I think this will help create more confusion.
Stefan
* Re: package.el encoding problem
2019-05-25 15:26 ` Stefan Monnier
@ 2019-05-25 16:03 ` Eli Zaretskii
0 siblings, 0 replies; 12+ messages in thread
From: Eli Zaretskii @ 2019-05-25 16:03 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel
> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Sat, 25 May 2019 11:26:21 -0400
>
> > You previously said that in the past we attempted to bind
> > coding-system-for-write, which in general is the easiest way of
> > preventing the prompt. Didn't it work?
>
> For some reason this function does something else:
>
>     (defun package--write-file-no-coding (file-name)
>       (let ((buffer-file-coding-system 'no-conversion))
>         (write-region (point-min) (point-max) file-name nil 'silent)))
>
> AFAIK using coding-system-for-write would have solved the problem as well.
Yes. In fact, binding buffer-file-coding-system is not useful at all.
So why not bind coding-system-for-write here?
> > One obvious benefit is that you won't need to set the buffer to be
> > unibyte.
>
> (set-buffer-multibyte nil) sets up the buffer to receive bytes.
> Since we're putting bytes into the buffer, it's The Right Thing to do.
We disagree here. We can put bytes into a multibyte buffer; they have
a special representation there which marks them as raw bytes, so
there's nothing wrong with doing that.
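A small sketch of that representation (the function name
`my/raw-byte-demo` is invented for illustration): a byte above 127
inserted into a multibyte buffer is stored as a dedicated raw-byte
character and can be turned back into the original byte.

```elisp
;; Hypothetical demo function, for illustration only.
(defun my/raw-byte-demo ()
  "Insert byte #xC0 into a multibyte buffer and inspect how it is stored."
  (with-temp-buffer
    (set-buffer-multibyte t)           ; be explicit: this buffer holds chars
    (insert "\300abc")                 ; a unibyte string: byte #xC0, then ASCII
    (let ((ch (char-after (point-min))))
      (list ch                                     ; the raw-byte character code
            (multibyte-char-to-unibyte ch)         ; the original byte value
            (string-to-unibyte (buffer-string)))))) ; back to plain bytes
```

The first element is a character code in the raw-byte range (#x3FFF80 to
#x3FFFFF), not 192 itself, which is how the buffer tells raw bytes apart
from ordinary characters.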
> > People tend to regard this as some kind of black magic, which creates
> > myths, like the (wrong) idea that unibyte text cannot be processed
> > correctly in a multibyte buffer.
>
> I think they're right: it's hard to get it right.
Using a multibyte buffer removes one subtlety from what we should
educate people to do in this case, so it's easier to get that right.
> Don't get me wrong: it's important that it be possible to do it, because
> that's sometimes necessary. But when the code only manipulates bytes,
> using any multibyte objects along the way is asking for trouble.
Sorry, I see no trouble here. The only trouble is the binding of
buffer-file-coding-system instead of coding-system-for-write.
end of thread, other threads:[~2019-05-25 16:03 UTC | newest]
Thread overview: 12+ messages
2019-05-23 15:18 package.el encoding problem Yuri D'Elia
2019-05-23 15:27 ` Noam Postavsky
2019-05-23 15:33 ` Yuri D'Elia
2019-05-24 15:45 ` Stefan Monnier
2019-05-24 16:36 ` Stefan Monnier
2019-05-25 6:43 ` Eli Zaretskii
2019-05-25 12:15 ` Stefan Monnier
2019-05-25 13:47 ` Eli Zaretskii
2019-05-25 14:55 ` Stefan Monnier
2019-05-25 15:03 ` Eli Zaretskii
2019-05-25 15:26 ` Stefan Monnier
2019-05-25 16:03 ` Eli Zaretskii
Code repositories for project(s) associated with this public inbox
https://git.savannah.gnu.org/cgit/emacs.git