From: Eli Zaretskii <eliz@gnu.org>
To: Yuchen Pei <id@ypei.org>
Cc: emacs-devel@gnu.org
Subject: Re: Coding warning attributes to wrong char
Date: Sat, 17 Jun 2023 09:30:51 +0300
Message-ID: <834jn6siqs.fsf@gnu.org>
In-Reply-To: <87bkhebtvp.fsf@ypei.org> (message from Yuchen Pei on Sat, 17 Jun 2023 14:22:18 +1000)

> From: Yuchen Pei <id@ypei.org>
> Date: Sat, 17 Jun 2023 14:22:18 +1000
> 
> These default coding systems were tried to encode the following
> problematic characters in the buffer ‘encoding.txt’:
>   Coding System           Pos  Codepoint  Char
>   utf-8-unix               23  #x3FFFE2   \342
>                            24  #x3FFF80   \200
>                            25  #x3FFF99   \231
> 
> However, each of them encountered characters it couldn’t encode:
>   utf-8-unix cannot encode these: \342 \200 \231
> 
> Click on a character (or switch to this window by ‘C-x o’
> and select the characters by RET) to jump to the place it appears,
> where ‘C-u C-x =’ will give information about it.
> 
> Select one of the safe coding systems listed below,
> or cancel the writing with C-g and edit the buffer
>    to remove or modify the problematic characters,
> or specify any other coding system (and risk losing
>    the problematic characters).
> 
>   raw-text no-conversion
> --8<---------------cut here---------------end--------------->8---
> 
> Despite the warning, the correct fix is to remove the nul character.
> 
> This can be quite misleading, especially when one wants to fix encoding
> issues in big text files.

What is your proposal for better dealing with this situation?

The basic problem here is that Emacs cannot know whether the null
characters are or aren't supposed to be in the file.  You as the user
do know, presumably because you know where this file came from or what
its purpose is.  But Emacs doesn't know.  It also cannot easily know
that removing the null character would solve all the other problems,
since it examines each such character individually.
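
To illustrate why a character-by-character check cannot point at the
null byte, here is a minimal sketch in Python (not Emacs code, and not
how Emacs implements the check).  It assumes that the reported
codepoints #x3FFFE2, #x3FFF80 and #x3FFF99 are Emacs's internal
representation of the raw bytes \342 \200 \231, and that the file was
visited without decoding because of the null byte.  The function names
and the sample bytes are hypothetical; the actual contents of
encoding.txt are not shown in this thread.

```python
def raw_byte_from_emacs_codepoint(cp: int) -> int:
    """Map an Emacs raw-byte codepoint (#x3FFF80..#x3FFFFF) to its byte value.

    Assumption: raw bytes 0x80..0xFF are shown at these codepoints.
    """
    if not 0x3FFF80 <= cp <= 0x3FFFFF:
        raise ValueError(f"not a raw-byte codepoint: {cp:#x}")
    return cp - 0x3FFF00


def diagnose(data: bytes) -> dict:
    """Report where NUL bytes occur and whether the data is valid UTF-8."""
    nul_positions = [i for i, b in enumerate(data) if b == 0]
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"nul_positions": nul_positions, "valid_utf8": valid_utf8}


# The three codepoints listed in the warning quoted above.
reported = [0x3FFFE2, 0x3FFF80, 0x3FFF99]
raw = bytes(raw_byte_from_emacs_codepoint(cp) for cp in reported)

# Taken together, the three bytes form one well-formed UTF-8 sequence.
decoded = raw.decode("utf-8")

# Hypothetical file contents: UTF-8 text with one stray NUL byte.
sample = b"it\x00 doesn\xe2\x80\x99t"
report = diagnose(sample)
```

Here `decoded` is U+2019 (RIGHT SINGLE QUOTATION MARK), and `report`
says the sample is valid UTF-8 with a single NUL at offset 2.  Each of
the three bytes is unencodable when examined alone, yet nothing about
them individually says that the null byte elsewhere in the file is the
cause.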


