From: Stefan Monnier
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] master 58a3c54: Avoid using string-make-unibyte in select.el
Date: Sat, 22 Jun 2019 12:44:05 -0400
To: Eli Zaretskii
Cc: emacs-devel@gnu.org
In-Reply-To: <838sttohx8.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 22 Jun 2019 16:42:27 +0300")

>> > +              (or (null (multibyte-string-p str))
>> > +                  (setq str (encode-coding-string 'raw-text-unix str))))
>> Isn't this the same as (setq str (string-to-unibyte str))?
> No, because the former doesn't signal an error.

Oh, right, that's yet another subtle distinction between all those
alternatives.

BTW, do we actually need to convert to unibyte here?
(In most places where we expect a unibyte string, we silently convert
from multibyte when needed, in a way that's basically equivalent to the
above.)

> (And I didn't want to use any of those string-to/as-uni/multibyte
> functions anyway.)

I hated those functions and still do for the string-as and string-make
variety, but I'm beginning to like the string-to variety when we need to
convert the representation of a sequence of *bytes* without
encoding/decoding them as chars.

So maybe the present case argues for adding a `no-error` argument to
string-to-unibyte.

I say this because to me

    (encode-coding-string 'raw-text-unix str)

is an oxymoron since `raw-text-unix` is a synonym of `binary` and
`no-conversion`, which basically says "don't do any encoding/decoding,
instead preserve bytes as bytes".

IOW coding-systems like `raw-text` make sense in places like the
`coding:` tag or in buffer-file-coding-system, where we are forced to
put some kind of coding-system and where it is hence handy to be able
to use `raw-text-unix` to basically skip the en/decoding.  But I find
them confusing when passed as a constant to `en/decode-coding-string`.

> The only thing we are supposed to do in the multibyte case is to make
> sure the raw bytes are converted to their single-byte representation,
> which is exactly what raw-text-unix does.

Right (and indeed string-make-unibyte worked in practice for the same
reason: encoding with pretty much any coding-system preserves the
bytes as well, save for a few exceptions like utf-8-emacs).


        Stefan
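
To make the distinction discussed above concrete, here is a minimal
sketch in Emacs Lisp.  The helper name `my-to-unibyte` and the exact
meaning of its NO-ERROR argument are assumptions made for illustration
only: the message merely suggests that such an argument might be added
to string-to-unibyte and does not specify its behavior.  The sketch
passes the string first and the coding system second, which is the
argument order that `encode-coding-string` documents.

```elisp
;;; -*- lexical-binding: t -*-
;; Sketch only: `my-to-unibyte' is a hypothetical helper, not an Emacs API.

(defun my-to-unibyte (str &optional no-error)
  "Return STR as a unibyte string.
Raw bytes in a multibyte STR become single bytes.  If STR holds other
non-ASCII characters, signal an error unless NO-ERROR is non-nil, in
which case fall back to encoding with `raw-text-unix' (an assumed
reading of the proposed argument)."
  (cond
   ;; Already unibyte: nothing to convert.
   ((not (multibyte-string-p str)) str)
   ;; Lenient path: never signals, mirrors the form used in the commit.
   (no-error (encode-coding-string str 'raw-text-unix))
   ;; Strict path: signals an error on genuine (non-byte) characters.
   (t (string-to-unibyte str))))

;; A multibyte string whose single element is the raw byte #xFF.
(defvar my-raw (string-to-multibyte "\377"))

;; For raw bytes, both paths give back the single byte.
(my-to-unibyte my-raw)
(my-to-unibyte my-raw t)

;; For a real character the two paths differ: the lenient one returns
;; some unibyte string, the strict one signals an error.
(my-to-unibyte "\u00e9" t)
(condition-case nil
    (my-to-unibyte "\u00e9")
  (error 'signaled))
```

The two paths agree whenever the multibyte string contains only ASCII
and raw bytes, which is the case the quoted commit cares about; they
differ only in whether other characters are reported as an error.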