From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Eli Zaretskii Newsgroups: gmane.emacs.bugs Subject: bug#52459: 28.0.90; prin1-to-string does not escape bidi control characters despite print-escape-control-characters=t Date: Mon, 13 Dec 2021 17:24:41 +0200 Message-ID: <8335mwmssm.fsf@gnu.org> References: <83v8ztmu75.fsf@gnu.org> <93d63756-f75d-c53e-de02-2e8270d07311@daniel-mendler.de> <83r1agn184.fsf@gnu.org> <0eabc668-ecb2-8f77-17cf-f9cb6dcf0626@daniel-mendler.de> <0504d4a8-1a4b-a451-d7d3-fea1c116b96d@daniel-mendler.de> Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="30508"; mail-complaints-to="usenet@ciao.gmane.io" Cc: 52459@debbugs.gnu.org To: Daniel Mendler Original-X-From: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Mon Dec 13 16:25:38 2021 Return-path: Envelope-to: geb-bug-gnu-emacs@m.gmane-mx.org Original-Received: from lists.gnu.org ([209.51.188.17]) by ciao.gmane.io with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1mwnCs-0007kZ-0x for geb-bug-gnu-emacs@m.gmane-mx.org; Mon, 13 Dec 2021 16:25:38 +0100 Original-Received: from localhost ([::1]:52276 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1mwnCq-0005zt-Qj for geb-bug-gnu-emacs@m.gmane-mx.org; Mon, 13 Dec 2021 10:25:36 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:59546) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mwnCI-0005y7-RI for bug-gnu-emacs@gnu.org; Mon, 13 Dec 2021 10:25:04 -0500 Original-Received: from debbugs.gnu.org ([209.51.188.43]:45046) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1mwnCI-0007v0-JG for bug-gnu-emacs@gnu.org; Mon, 13 Dec 2021 10:25:02 -0500 Original-Received: from Debian-debbugs by debbugs.gnu.org with local (Exim 4.84_2) (envelope-from ) id 
1mwnCI-000551-De for bug-gnu-emacs@gnu.org; Mon, 13 Dec 2021 10:25:02 -0500 X-Loop: help-debbugs@gnu.org Resent-From: Eli Zaretskii Original-Sender: "Debbugs-submit" Resent-CC: bug-gnu-emacs@gnu.org Resent-Date: Mon, 13 Dec 2021 15:25:02 +0000 Resent-Message-ID: Resent-Sender: help-debbugs@gnu.org X-GNU-PR-Message: followup 52459 X-GNU-PR-Package: emacs Original-Received: via spool by 52459-submit@debbugs.gnu.org id=B52459.163940909719513 (code B ref 52459); Mon, 13 Dec 2021 15:25:02 +0000 Original-Received: (at 52459) by debbugs.gnu.org; 13 Dec 2021 15:24:57 +0000 Original-Received: from localhost ([127.0.0.1]:56592 helo=debbugs.gnu.org) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mwnCD-00054f-1s for submit@debbugs.gnu.org; Mon, 13 Dec 2021 10:24:57 -0500 Original-Received: from eggs.gnu.org ([209.51.188.92]:58772) by debbugs.gnu.org with esmtp (Exim 4.84_2) (envelope-from ) id 1mwnC8-00054Q-Rj for 52459@debbugs.gnu.org; Mon, 13 Dec 2021 10:24:55 -0500 Original-Received: from [2001:470:142:3::e] (port=34314 helo=fencepost.gnu.org) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mwnC2-0007tV-C3; Mon, 13 Dec 2021 10:24:47 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=gnu.org; s=fencepost-gnu-org; h=References:Subject:In-Reply-To:To:From:Date: mime-version; bh=kvdIIFk7sEBLjMHwtZ92Jfc/4GTpsCb3yT06eSazesc=; b=dGQanli5u9yt d50+4dVj7ToPq9UYRxrsniUNrZYbpSklm6Syh5mb0FGjHpYfcL3djPk8hHsjbqspGMoZ55Xl0A+/G ejfKZiwnfmK4H5yaz8F+qcOA5qDquYy4IQA2R5wyVPh27XeoU1BE70bkKfGrFnToI+eogg5Uy7o1K +hp9Q/GgIyVcD9j7Se35lohTY4BJz7+/kOnHHmtG8NPFYZAb74n8ix9Y4LScVbzycrIrhos5cjuAZ FAuhqsfZ9XRXwx8j2ABiQJefXZVVJWEuYbQO2TQ1U6ymPZputuhfa8tfxZdgAePa0UFJRMy+GgZRm qGznSxPr3GdOxTIujAmyxQ==; Original-Received: from [87.69.77.57] (port=3979 helo=home-c4e4a596f7) by fencepost.gnu.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1mwnC2-0006aD-6I; Mon, 13 Dec 2021 
10:24:46 -0500 In-Reply-To: <0504d4a8-1a4b-a451-d7d3-fea1c116b96d@daniel-mendler.de> (message from Daniel Mendler on Mon, 13 Dec 2021 14:30:13 +0100) X-BeenThere: debbugs-submit@debbugs.gnu.org X-Mailman-Version: 2.1.18 Precedence: list X-BeenThere: bug-gnu-emacs@gnu.org List-Id: "Bug reports for GNU Emacs, the Swiss army knife of text editors" List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: bug-gnu-emacs-bounces+geb-bug-gnu-emacs=m.gmane-mx.org@gnu.org Original-Sender: "bug-gnu-emacs" Xref: news.gmane.io gmane.emacs.bugs:222321 Archived-At:

> From: Daniel Mendler
> Cc: 52459@debbugs.gnu.org
> Date: Mon, 13 Dec 2021 14:30:13 +0100
>
> In other words, there is a need for a functionality which makes it
> possible to turn a string into a string literal in a form which could be
> used in source code for example.
>
> If you look at the definition of bidi-directional-controls-chars in
> simple.el, the bidi characters are escaped there. Why is this? Why did
> you write this definition in this form and not with unescaped bidi
> characters?
>
> (defvar bidi-directional-controls-chars "\x202a-\x202e\x2066-\x2069"
>   "Character set that matches bidirectional formatting control characters.")

So you want a feature that would produce strings suitable for use in
program source files, like we did in the above example?  Is that the
meaning of "sanitize" you have been using?

Are there other use cases for those "escaped" or "sanitized" strings?
If so, please describe them as well.  Or if that's not the correct
meaning of "sanitized", please define it more accurately.

You see, this discussion is hard because I still don't understand what
it is that you want Emacs to provide, and for what purposes.  Please
try to clarify that, to make the discussion more efficient and avoid
misunderstandings.
For now, I understand that those strings are not necessarily required
to be readable on the Emacs display, at least not in all cases,
because some of the reordering that these controls produce will be
disabled when they are represented by ASCII escapes, and the character
order on display will change as a result.  If the string includes RTL
characters, the result might not be easily readable.  But AFAIU, this
is not a problem for the use cases you have in mind?

> Therefore my proposal to add two variables
> `print-escape-ascii-control-characters` and
> `print-escape-unicode-control-characters`.

I'd prefer to wait with concrete proposals until the requirements are
clear.  It seems like a variable like those you mention, which only
affect the Emacs display, but not the string contents, is not what you
need.  You need to actually produce the ASCII characters \x202a, so
that you could put them in a string like you show above.  Or am I
misunderstanding again?
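
To make that last distinction concrete, here is a minimal sketch of what
"actually producing the ASCII characters" could look like, as opposed to
changing how the controls are displayed.  The function name
my-escape-bidi-controls is hypothetical and nothing like it is claimed to
exist in Emacs; the character ranges are the ones from the defvar quoted
above.  It is only an illustration of the idea under those assumptions,
not a proposed implementation.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs.
(defun my-escape-bidi-controls (string)
  "Return STRING with bidi formatting controls spelled as ASCII escapes.
Characters in U+202A..U+202E and U+2066..U+2069 become \\xNNNN
followed by backslash-space, which ends the hex escape when the
result is read back as part of a string literal."
  (replace-regexp-in-string
   "[\x202a-\x202e\x2066-\x2069]"
   (lambda (match)
     ;; MATCH is a one-character string; format its code point in hex.
     (format "\\x%04x\\ " (aref match 0)))
   string t t))

;; Build the input with `string' so no raw control appears in this file.
(my-escape-bidi-controls (string ?a #x202e ?b))
```

The result contains only ASCII: the letter a, then the six characters
\x202e, then a backslash and a space, then b.  The backslash-space is
there because a hex escape in a Lisp string literal would otherwise
swallow a following hex digit such as b.  The sketch does not escape
double quotes or backslashes already present in the input, so it is not
by itself a replacement for prin1-to-string.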