From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] emacs-26 0feb673: Display raw bytes as belonging to 'eight-bit' charset
Date: Sat, 28 Jul 2018 18:15:28 +0300
Message-ID: <83effntmr3.fsf@gnu.org>
References: <20180727064907.6305.13029@vcs0.savannah.gnu.org> <20180727064909.85288203C0@vcs0.savannah.gnu.org> <87bmas3juo.fsf@gmail.com> <877elg4tbx.fsf@igel.home> <2499AA9B-E194-4EE0-BF2B-97F082B999EB@gnu.org> <83zhycqtk7.fsf@gnu.org>
In-reply-to: (message from Stefan Monnier on Sat, 28 Jul 2018 10:15:39 -0400)
To: Stefan Monnier
Cc: Kenichi Handa, emacs-devel@gnu.org

> From: Stefan Monnier
> Date: Sat, 28 Jul 2018 10:15:39 -0400
>
> I know and understand what this charset is for. Yet I don't see why
> eight-bit-control chars should be reported as belonging to this charset
> (any more than they should be reported to belong to any of the other
> charsets to which they may also belong, such as all the iso8859
> charsets).

Because ISO-8859 charsets don't include bytes between 128 and 159
(inclusive), I guess.

> I believe this happens only by accident: it seems to be the only charset
> of its kind defined with `:superset (... eight-bit-control ...)` and
> without :supplementary-p.

It's true it's the only such combination, but it is much less clear to
me why you think this is an accident. It could be, but I see no reason
to assume that without some independent evidence. I asked Handa-san to
comment on that in the hope that he might be able to shed some light on
this situation (and on the use of :supplementary-p in general).

> But maybe the real source of the problem is that eight-bit-control is
> defined as :supplementary-p (hard to tell, because I only see doc of
> how/when :supplementary-p should be used, but not what it does).

Maybe. And again, I don't see why you'd assume it's a bug.

> > You are asking why we have this charset in the first place?
>
> No, I understand why we have it, what I don't understand is why it should
> be considered anything but a bug that eight-bit-control chars should be
> considered as belonging to that charset instead of to the
> eight-bit-control charset.

We actually don't want to expose eight-bit-control to users at all; we
want there to be a single charset called 'eight-bit' covering all the
raw bytes.