From: Richard Hansen
Subject: bug#55777: [PATCH] Improve documentation of `string-to-multibyte', `string-to-unibyte'
Date: Sun, 5 Jun 2022 22:00:35 -0400
Message-ID: <16ed6ce6-725f-a183-8864-7e9185b14ff4@rhansen.org>
To: Eli Zaretskii
Cc: 55777@debbugs.gnu.org
In-Reply-To: <83zgiracxf.fsf@gnu.org>

On 6/5/22 01:37, Eli Zaretskii wrote:
> Could you please state what is confusing in the current wording?

* "Raw 8-bit bytes" isn't really defined. It's mentioned earlier in
  the chapter -- the term is even in a @dfn{} -- but there's no
  definition there.
* The term "raw 8-bit bytes" is misleading. It suggests binary data
  (bytes with values 0-255) but it's actually meant to cover only
  128-255.
* The term "raw 8-bit bytes" is not used consistently. Sometimes "8"
  is spelled out as "eight", sometimes "raw" comes after "8-bit", and
  sometimes it refers to all byte values 0-255 (see the first sentence
  under `@cindex unibyte text`).
* It's not clear whether "raw 8-bit bytes" is meant to refer to bytes
  with values 128-255, or to the *characters* that map to those byte
  values.
* The following phrasing is weird: "The function assumes that
  @var{string} includes ASCII characters and raw 8-bit bytes". The
  purpose of "raw 8-bit bytes" is to cover non-ASCII byte values, so by
  definition that assumption is always true. By saying "the function
  assumes", the reader is left wondering about the cases where that
  assumption is not true, which in turn causes the reader to question
  whether "raw 8-bit bytes" fully covers non-ASCII byte values, which
  in turn causes the reader to wonder how to handle those non-covered
  values (whatever they are).

Maybe something like this:

    By definition, unibyte strings contain only @acronym{ASCII}
    characters (bytes with values 0-127) and raw 8-bit bytes (bytes
    with values 128-255); the latter are converted to their
    corresponding multibyte representations in the @code{eight-bit}
    character set (@pxref{Text Representations, codepoints}).
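
To make the proposed wording concrete, here is a minimal sketch of the conversion it describes. The function name `my-raw-byte-demo' is hypothetical and exists only for illustration; the calls inside it (`unibyte-string', `string-to-multibyte', `string-to-unibyte', `multibyte-string-p') are the existing primitives, and the example assumes a reasonably recent Emacs.

```elisp
;; Illustration only: `my-raw-byte-demo' is a made-up name, not part
;; of Emacs or of the patch under discussion.
(defun my-raw-byte-demo ()
  "Convert a unibyte string holding one ASCII byte and one raw byte."
  (let* ((uni (unibyte-string ?a #xff))      ; bytes 97 and 255
         (multi (string-to-multibyte uni)))
    (list (multibyte-string-p uni)           ; nil: still unibyte
          (multibyte-string-p multi)         ; t: now multibyte
          (aref multi 0)                     ; 97: ASCII is unchanged
          (aref multi 1)                     ; #x3FFFFF: eight-bit char for byte 255
          ;; Converting back recovers the original bytes.
          (equal (string-to-unibyte multi) uni))))
```

Byte 97 stays the character `a', while byte 255 becomes the character with codepoint #x3FFFFF, which is the kind of distinction between "bytes" and "the characters that map to those bytes" that the bullets above are asking the manual to spell out.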