unofficial mirror of bug-gnu-emacs@gnu.org 
From: Eli Zaretskii <eliz@gnu.org>
To: Stefan Monnier <monnier@iro.umontreal.ca>
Cc: thievol@posteo.net, larsi@gnus.org, schwab@linux-m68k.org,
	44486@debbugs.gnu.org
Subject: bug#44486: 27.1; C-@ chars corrupt elisp buffer
Date: Sat, 14 Nov 2020 20:08:04 +0200
Message-ID: <83y2j3u7zv.fsf@gnu.org>
In-Reply-To: <jwvblfzrfrp.fsf-monnier+emacs@gnu.org> (message from Stefan Monnier on Sat, 14 Nov 2020 12:55:51 -0500)

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: larsi@gnus.org,  thievol@posteo.net,  handa@gnu.org,
>   schwab@linux-m68k.org,  44486@debbugs.gnu.org
> Date: Sat, 14 Nov 2020 12:55:51 -0500
> 
> >> AFAIK `prefer-utf-8` is only ever used for files which are known to
> >> contain text and should almost always contain UTF-8 text.
> > For those, we should use utf-8, not prefer-utf-8.
> 
> No, `utf-8` should be used when other coding systems should be
> considered as errors (i.e. not "almost always" but "always"),

Why?

> whereas `prefer-utf-8` is for use when utf-8 is the most likely one
> and other coding systems should be tried only when there's some
> evidence that the file actually doesn't use utf-8.
> 
> `prefer-utf-8` was introduced specifically for `.el` files (and I don't
> know of any other use of that encoding so far).

Maybe that was the history, but the reality is different.
prefer-utf-8 is the same as 'undecided', except that the priorities
of the coding-systems are adjusted to prefer UTF-8.

> If `utf-8` is preferable over `prefer-utf-8` for this usage I think
> the problem is in `prefer-utf-8` since it was introduced
> specifically for that.

The implementation doesn't support your POV.

> >> I believe if there's a NUL byte in such a file but it otherwise doesn't
> >> contain any invalid UTF-8 byte sequence, it will result in better
> >> behavior if we treat it as UTF-8 than as binary.
> > We treat null bytes as the _single_ telltale sign of a binary file.
> 
> A .el file should *never* be a binary file.

We are not talking about .el files, we are talking about _any_ file
read using prefer-utf-8.

For .el files, we can always bind inhibit-null-byte-detection to t
when we load or visit such files.
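
A rough sketch of such a binding, for illustration only: the helper
name below is made up and is not part of Emacs, while
inhibit-null-byte-detection and find-file-noselect are existing
facilities.  It assumes the binding only needs to be in effect while
the file is read and decoded.

```elisp
;; Hypothetical helper (not part of Emacs): visit an Emacs Lisp file
;; with null-byte detection disabled for the duration of the call.
(defun my-visit-el-file-ignoring-nuls (file)
  "Visit FILE and return its buffer, ignoring NUL bytes during detection.
`inhibit-null-byte-detection' is let-bound to t, so a stray NUL byte
does not by itself make Emacs treat FILE as binary."
  (let ((inhibit-null-byte-detection t))
    (find-file-noselect file)))
```

Because the variable is let-bound, the detection of binary files is
unaffected for any other file read outside this call.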

> > If we disable that in coding-systems that are supposed to _detect_
> > encoding, we will never be able to detect binary files.
> 
> In which scenario would it be beneficial to detect a `.el` file as being
> binary instead of utf-8?

I'm not talking about .el files.  The coding-system's applicability is
wider than that.
