From: Markus Triska
Newsgroups: gmane.emacs.bugs
Subject: bug#53236: 26.1; encode-coding-string does not encode the string as expected
Date: Thu, 13 Jan 2022 20:45:57 +0100
Message-ID: <8735lra07e.fsf@metalevel.at>
To: 53236@debbugs.gnu.org

Dear all,

please consider the UTF-8 encoding of the Unicode codepoint 0x80, which
is formed by two bytes. In hexadecimal notation, they are: 0xC2 0x80.

We can use decode-coding-string to verify that this byte sequence is
decoded to 0x80 when specifying utf-8, which works exactly as expected:

    (decode-coding-string "\xC2\x80" 'utf-8)

This yields "\200", which is the same as "\x80", as verified via:

    (string= "\200" "\x80") --> t

Correspondingly, I expect (encode-coding-string "\200" 'utf-8) to yield
a string equivalent to "\xC2\x80", but that seems not to be the case. I get:

    (encode-coding-string "\200" 'utf-8) --> "\200"

And therefore, unexpectedly:

    (string= (encode-coding-string "\200" 'utf-8) "\xC2\x80") --> nil

It appears that encode-coding-string does not encode the string in UTF-8
as expected. Is there any way to obtain the desired encoding with
encode-coding-string, i.e., the UTF-8-encoded string "\xC2\x80"?

Thank you and all the best!
Markus


In GNU Emacs 26.1 (build 3, x86_64-pc-linux-gnu, X toolkit, Xaw scroll bars)
 of 2019-04-09 built on mt-laptop
Windowing system distributor 'The X.Org Foundation', version 11.0.12004000
System Description: Ubuntu 19.04

Configured features:
XPM JPEG GIF PNG SOUND GSETTINGS NOTIFY GNUTLS LIBXML2 FREETYPE XFT ZLIB
TOOLKIT_SCROLL_BARS LUCID X11 THREADS

Important settings:
  value of $LC_MONETARY: en_GB.UTF-8
  value of $LC_NUMERIC: en_GB.UTF-8
  value of $LC_TIME: en_GB.UTF-8
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: @im=ibus
  locale-coding-system: utf-8-unix
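
A possible way to narrow down the evaluations in the report is sketched here. It is not a confirmed diagnosis. It rests on the assumption that the string literal "\200" may be read as a unibyte string holding the raw byte 0x80 rather than the character U+0080, so it checks the string's representation with multibyte-string-p and builds the input from the character code instead of from an octal escape. The helper name my/utf-8-bytes is hypothetical and exists only for this illustration.

```elisp
;; Sketch only: `my/utf-8-bytes' is a hypothetical helper, not part of Emacs.
(defun my/utf-8-bytes (codepoint)
  "Return the UTF-8 encoding of the character CODEPOINT as a unibyte string."
  ;; `string' builds the input from a character code rather than from an
  ;; octal or hexadecimal escape inside a string literal.
  (encode-coding-string (string codepoint) 'utf-8))

;; Compare how the two inputs are represented internally.
(multibyte-string-p "\200")         ; the literal used in the report
(multibyte-string-p (string #x80))  ; a string built from the character U+0080

;; Compare the encoded result with the two bytes 0xC2 0x80.
(string= (my/utf-8-bytes #x80) "\xC2\x80")
```

If the two multibyte-string-p calls disagree, the difference between the literal and the character-built string would be the first thing to look at before concluding that encode-coding-string itself misbehaves.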