From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#28179: Fwd: Re: bug#28179: Fix use of string-to-multibyte in ispell.el
Date: Thu, 24 Aug 2017 21:20:46 +0300
Message-ID: <83a82o969t.fsf@gnu.org>
References: <0df1f5ab-e99b-b473-549c-5a40045ab71a@sc3d.org> <83lgm89a1v.fsf@gnu.org> <4d1a7990-f076-9e22-39df-6edeef17ef7b@sc3d.org>
In-reply-to: <4d1a7990-f076-9e22-39df-6edeef17ef7b@sc3d.org> (message from Reuben Thomas on Thu, 24 Aug 2017 18:45:33 +0100)
To: Reuben Thomas
Cc: 28179@debbugs.gnu.org

> Cc: 28179@debbugs.gnu.org
> From: Reuben Thomas
> Date: Thu, 24 Aug 2017 18:45:33 +0100
>
> The reason I am asking again is because you first said:
>
> > What if decode-coding-string returns a pure ASCII string, which is
> > therefore unibyte?
>
> and then later you said:
>
> > The way I meant it, it has to do with the internal flag marking a
> > string either unibyte or multibyte. Observe:
> >
> >   (multibyte-string-p "abcd") => nil
> >
> > but
> >
> >   (multibyte-string-p (decode-coding-string "abcd" 'utf-8)) => t

That example may be conclusive for UTF-8, but is it conclusive for
_any_ encoding? I don't know. E.g., what about the ISO-2022 based
encodings, where all the bytes are (AFAIR) pure ASCII?

> 1. As far as I can tell from the above (and my own confirmatory
> experiments and reading of the documentation), a pure ASCII string can
> be multibyte (it's a matter of the multibyte flag, not the number of
> bytes used to store each character).
>
> 2. decode-coding-string always returns a multibyte string.

Can you show me why 2 is always correct? It might be, I simply don't
know. All I know is that in general relying on plain-ASCII strings to
be always multibyte in any given situation is risky, we were bitten by
that a few times. But maybe it's not an issue in this case. Which is
why I was asking you whether you have sufficient basis to believe this
to be so in this case.

> Since these two observations seemed to mean that you contradicted
> yourself, I was checking whether in fact I had misunderstood (so that
> for example one of my two observations above is wrong), or if your
> original understanding was incomplete (so that in fact your question
> about decode-coding-string is therefore misguided, because it can return
> a pure ASCII unibyte string (in the coding sense) which is nonetheless a
> multibyte string (in the sense that multibyte-string-p on it returns t).

I only used decode-coding-string because I remembered it as an easy
way of creating a multibyte ASCII string, when the coding-system is
UTF-8, that's all. There was no contradiction in what I said, at
least not an intended one.