From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: character sets as they relate to “Raw” string literals for elisp
Date: Thu, 07 Oct 2021 16:34:49 +0300
Message-ID: <83tuhtyn46.fsf@gnu.org>
To: Stefan Kangas
Cc: rms@gnu.org, yuri.v.khan@gmail.com, juri@linkov.net, db48x@db48x.net, monnier@iro.umontreal.ca, emacs-devel@gnu.org

> From: Stefan Kangas
> Date: Thu, 7 Oct 2021 09:14:47 -0400
> Cc: db48x@db48x.net, yuri.v.khan@gmail.com, emacs-devel@gnu.org,
>  monnier@iro.umontreal.ca, juri@linkov.net
>
> Eli Zaretskii writes:
>
> >> Normally Texinfo represents an em-dash in ASCII output with two
> >> dashes, not just one.  It would be `...from a core dump--provided
> >> that a core dump...'
>
> This does not seem to happen in (info "(texinfo) Conventions"):
>
>    * Use three hyphens in a row, '---', to produce a long dash--like
>      this (called an "em dash"), used for punctuation in sentences.

You mean, you expected to see em dash there?  They deliberately used
@samp{---} to prevent that, because otherwise the text would be
confusing: it talks about typing 3 dashes in the Texinfo sources.  And
texinfo.texi doesn't have "@documentencoding UTF-8" which AFAIR is
required for the generation of non-ASCII characters from these
multiple dashes.

> They use two HYPHEN-MINUS characters to represent an em-dash.

You mean, 3, not 2, right?

> > That has changed, since we nowadays by default use UTF-8 encoding in
> > our Info manuals.
> > With that, '---' produces the Unicode em-dash
> > character, displayed as a wide dash, and '--' produces a Unicode
> > en-dash character, displayed as a somewhat narrower dash (but still
> > wider than the ASCII dash).
>
> IMHO, this is a bug that we should look into, as the correct style used
> in the texinfo manual is more readable.  As Juri points out, it is not
> well suited for a monospace font.

What is the bug that you want to fix here?  I'm not sure I understand.

> I guess texinfo would need some way to produce the previous style em
> dashes, while still using utf-8?  Or something?
>
> Or perhaps we could add some code to info.el to add a space on each side of
> an em dash, but that seems like a bit of a hack.

I don't really see what needs to be fixed here.  The original Texinfo
source doesn't have the spaces, according to the US English
conventions we use.  And the produced text also doesn't have any
spaces.  So we get back what we asked for, and Texinfo isn't the one
to blame: it just did what we told it to do.

Or what am I missing?
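
To make the two behaviors under discussion concrete, here is a minimal Python sketch.  It is not Texinfo's or info.el's actual code: the function names render_dashes and pad_em_dashes are made up for illustration, and the ASCII mapping ('---' shown as '--', '--' shown as '-') is only what the quoted Info text above suggests.

```python
import re

EM_DASH = "\u2014"  # Unicode em dash
EN_DASH = "\u2013"  # Unicode en dash


def render_dashes(source: str, utf8: bool) -> str:
    """Map Texinfo-style '---' and '--' in source text to output dashes.

    Hypothetical helper: with utf8=True the Unicode dashes are produced,
    otherwise the ASCII fallbacks discussed in the thread.
    """
    if utf8:
        table = {"---": EM_DASH, "--": EN_DASH}
    else:
        table = {"---": "--", "--": "-"}
    # A single pass with '---' tried first, so the '--' produced for an
    # em dash in ASCII mode is never rescanned and shortened again.
    return re.sub(r"---|--", lambda m: table[m.group(0)], source)


def pad_em_dashes(text: str) -> str:
    """Hypothetical display-time hack: one space on each side of an em dash."""
    return re.sub(rf"\s*{EM_DASH}\s*", f" {EM_DASH} ", text)


source = "from a core dump---provided that a core dump"
ascii_text = render_dashes(source, utf8=False)   # em dash shown as '--'
utf8_text = render_dashes(source, utf8=True)     # em dash shown as U+2014
padded_text = pad_em_dashes(utf8_text)           # spaces added around U+2014
```

Note that pad_em_dashes changes text the source never asked for, which is exactly the objection above: the unpadded output is what the Texinfo source specifies.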