From: Jens Schmidt
Newsgroups: gmane.emacs.devel
Subject: Non-ASCII characters in man pages produced by groff 1.23
Date: Sun, 29 Oct 2023 17:13:20 +0100
Message-ID: <9fb217d7-37d9-42e1-b3d2-227903d2d182@vodafonemail.de>
To: emacs-devel@gnu.org

In analogy to Eli's recent Texinfo 7.1 thread ...

Since groff 1.23 (used in Debian testing, for example) we have:

  o  The an (man) and doc (mdoc) macro packages no longer remap the -,
     ', and ` input characters to Basic Latin code points on UTF-8
     devices, but treat them as groff normally does (and AT&T troff
     before it did) for typesetting devices, where they become the
     hyphen, apostrophe or right single quotation mark, and left
     single quotation mark, respectively.  This change is expected to
     expose glyph usage errors in man pages.  See the "PROBLEMS" file
     for a recipe that will conceal these errors.  A better long-term
     approach is for man pages to adopt correct input practices; the
     man pages groff_man_style(7), groff_char(7), and man-pages(7)
     (subsection "Generating optimal glyphs"; from the Linux man-pages
     project) contain such instructions.  Doing so also improves man
     page typography when formatting for PDF.

See source

  https://git.savannah.gnu.org/cgit/groff.git/tree/NEWS?h=1.23.0#n206

and also the PROBLEMS entry

  https://git.savannah.gnu.org/cgit/groff.git/tree/PROBLEMS?h=1.23.0#n84

It seems, however, that at least Debian plans to conceal that issue
again later:

  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052675#18

To clarify what the problem is: If, for example, you search for
--compressed-ssh with

  ?\N{HYPHEN-MINUS}?\N{HYPHEN-MINUS}compressed?\N{HYPHEN-MINUS}ssh

in *Man curl* on Debian testing, you won't find that option, because
curl's author hasn't yet properly quoted all minus characters in the
generated man page source.  As a result, they are rendered as
?\N{HYPHEN} in man's output and occur as such in the Man-mode buffer.
(Well, that concrete example above got fixed already, but others are
still left.)

I have been bitten by that already, and not only me:

  https://lists.debian.org/debian-devel/2023/10/msg00083.html

So this is not an Emacs issue, it might get gradually better as man
page authors improve their text, and it probably will go away for
Debian when Debian freezes trixie.

Which means this is just an FYI, and nothing which requires any action ... or what do you think?
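
For illustration only, here is a minimal Python sketch of the search mismatch described above. It assumes the rendered man output uses U+2010 (HYPHEN), U+2019 (RIGHT SINGLE QUOTATION MARK) and U+2018 (LEFT SINGLE QUOTATION MARK) for the -, ' and ` input characters, as the NEWS entry describes for UTF-8 devices. The names GROFF_UTF8_TO_ASCII and asciify_man_text are hypothetical; they are not part of Emacs, man, or groff, and this is not the recipe from the PROBLEMS file.

```python
# Hypothetical helper: map the glyphs groff 1.23 emits on UTF-8 devices
# back to their Basic Latin counterparts so that plain-ASCII searches match.
GROFF_UTF8_TO_ASCII = str.maketrans({
    "\u2010": "-",  # HYPHEN -> HYPHEN-MINUS
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK -> APOSTROPHE
    "\u2018": "`",  # LEFT SINGLE QUOTATION MARK -> GRAVE ACCENT
})


def asciify_man_text(text: str) -> str:
    """Return TEXT with the three remapped glyphs folded back to ASCII."""
    return text.translate(GROFF_UTF8_TO_ASCII)


# An option name as it might appear in rendered output when the page
# source did not escape its minus signs (assumed sample text).
rendered = "\u2010\u2010compressed\u2010ssh"

# The ASCII search string is not found in the raw output ...
assert "--compressed-ssh" not in rendered
# ... but is found once the glyphs are folded back.
assert "--compressed-ssh" in asciify_man_text(rendered)
```

Note that such a mapping is lossy: a page that deliberately uses a typographic quotation mark or hyphen would have it folded to ASCII as well, which is one reason fixing the man page sources is the better long-term approach.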