From: Stefan Monnier
Newsgroups: gmane.emacs.help
Subject: Re: Text copied from *grep* buffer has NUL (0x00) characters
Date: Sun, 09 May 2021 15:47:18 -0400
References: <83bl9k8buk.fsf@gnu.org> <3e892a2e-1d04-7712-d129-e4f59382457b@yahoo.de>
In-Reply-To: <3e892a2e-1d04-7712-d129-e4f59382457b@yahoo.de> (R. Diez's message of "Sun, 9 May 2021 20:47:28 +0200")
To: "R. Diez"
Cc: Eli Zaretskii, help-gnu-emacs@gnu.org

>> For the detection of NULs in UTF-8 files, you could also ask for such
>> a feature via `M-x report-emacs-bug` but it should be pretty easy to get
>> something comparable with something like:
>> [...]
>
> I don't think it is desirable for users to install such Lisp hooks to deal
> with such corner cases.

There's a tension between avoiding pitfalls and making things
inconvenient in corner cases.

I do think there's a real plain bug here, tho, if you change your
"recipe" to use `utf-8` instead of `utf-8-with-signature`: take a utf-8
text file (in a UTF-8 locale), add a NUL byte to it, save, close, and
re-open: you now get a unibyte buffer showing the bytes rather than
the chars.

Emacs should generally try and warn you when saving a file with a coding
system different from the one it would guess when later re-opening
the file.

The problem doesn't show up with `utf-8-with-signature` because
apparently the BOM is given more weight than the NUL byte in determining
which coding system to use.

> My opinion is that Emacs should be more helpful here by default.

No point arguing here then: make it a bug report.
`help-gnu-emacs` is rather for the case where you're looking for
a workaround (or when you suspect what you're seeing is a "feature" you
just fail to understand) rather than fixing it "for everyone".


        Stefan
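
The test file from the recipe above can also be produced outside Emacs.
What follows is only a minimal sketch in Python, under stated
assumptions: the sample text and the helper name `write_utf8_with_nul`
are made up for illustration. It writes some non-ASCII UTF-8 text
followed by one NUL byte and checks that the result still decodes as
strict UTF-8, since NUL (U+0000) is a valid UTF-8 character. How Emacs
chooses a coding system when re-opening the file is not reproduced by
the script.

```python
import os
import tempfile

# Hypothetical sample text; any non-ASCII content encoded as UTF-8 would do.
SAMPLE_TEXT = "caf\u00e9 na\u00efve r\u00e9sum\u00e9\n"


def write_utf8_with_nul(path: str, text: str = SAMPLE_TEXT) -> bytes:
    """Write `text` as UTF-8 (no BOM) followed by one NUL byte; return the bytes."""
    data = text.encode("utf-8") + b"\x00"
    with open(path, "wb") as f:
        f.write(data)
    return data


if __name__ == "__main__":
    # Create a scratch file that can then be opened in Emacs by hand.
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    data = write_utf8_with_nul(path)
    # Strict decoding succeeds: the NUL byte does not make the file invalid UTF-8.
    decoded = data.decode("utf-8")
    print("wrote", path)
    print("bytes:", len(data), "NUL characters:", decoded.count("\x00"))
```

Opening the resulting file in Emacs in a UTF-8 locale should then let
you compare what you see with the behaviour described above.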