From: Juri Linkov
Newsgroups: gmane.emacs.bugs
Subject: bug#38587: base64-decode-region breaks encoding
Date: Mon, 16 Dec 2019 23:51:48 +0200
Organization: LINKOV.NET
Message-ID: <87r214giaz.fsf@mail.linkov.net>
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org>
In-Reply-To: <835zig5kka.fsf@gnu.org> (Eli Zaretskii's message of "Mon, 16 Dec 2019 17:58:29 +0200")
To: Eli Zaretskii
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (x86_64-pc-linux-gnu)

>> But is it still possible to tell base64-decode-region
>> about the expected output coding system?  Maybe using
>> a prefix arg: C-u M-x base64-decode-region could ask
>> for a coding, defaulting to the buffer's coding.
>
> If we want to make such a change, then "C-x RET c" is a better prefix
> command, as it is consistent with other commands that accept
> coding-system overrides.
>
>> Is there an equivalent of force_encoding('UTF-8') in Emacs?
>
> "C-x RET c utf-8 RET M-x SOME-COMMAND RET"

I see that 'C-x RET c' just sets coding-system-for-read and
coding-system-for-write for the next command, so could
base64-decode-region get the coding system from these variables?

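To make the idea concrete, here is a minimal sketch of how a command could pick up the override. The helper name `my-region-coding-system` is hypothetical and not part of Emacs, and the fallback order (override first, then the buffer's coding system, then UTF-8) is an assumption rather than anything base64-decode-region does today.

```elisp
;; Sketch only: `my-region-coding-system' is a hypothetical helper,
;; not an existing Emacs function.
(defun my-region-coding-system ()
  "Return the coding system a decoding command could honor.
Prefer an override bound by `C-x RET c', then fall back to the
buffer's own coding system, and finally to UTF-8."
  (or coding-system-for-read
      coding-system-for-write
      buffer-file-coding-system
      'utf-8))

;; Roughly what `C-x RET c latin-1 RET' arranges for the next command:
;; both variables are bound around that command only.
(let ((coding-system-for-read 'latin-1)
      (coding-system-for-write 'latin-1))
  (my-region-coding-system))
```

Outside such a binding both variables are normally nil, so the helper falls through to the buffer's coding system.
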
> It will work if you encode "ä" first:
>
> (decode-coding-string (base64-decode-string
>                        (base64-encode-string
>                         (encode-coding-string "ä" 'utf-8)))
>                       'utf-8)

Thanks, this works for strings.  My real need is to decode base64
regions whose original text was encoded as UTF-8.

First I looked for some post-processing that would recover the
"broken" characters inserted by base64-decode-region.  It seems these
characters are the individual bytes of the UTF-8 sequences,
represented in the UTF-8 buffer using the eight-bit charset.

I could not find a function that would convert the result of
base64-decode-region to UTF-8 characters in the UTF-8 buffer.
So I wrote a replacement for base64-decode-region:

  (defun base64-decode-utf8-region (beg end)
    "Base64-decode the region between BEG and END as text.
  Use `coding-system-for-write' if it is set, otherwise UTF-8."
    (interactive "r")
    (replace-region-contents
     beg end
     (lambda ()
       ;; Decode to raw bytes first, then turn the bytes into characters.
       (decode-coding-string
        (base64-decode-string (buffer-substring beg end))
        (or coding-system-for-write 'utf-8)))))

But the question remains: is it possible to do the same in a simpler
way, without having to write a new command?
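
For anyone who wants to try the command above non-interactively, the following sketch repeats its definition and runs it in a temporary buffer. The helper `my-base64-check` is hypothetical and exists only for this illustration. The `(require 'subr-x)` line assumes an Emacs where replace-region-contents is defined in subr-x, and should be harmless elsewhere.

```elisp
(require 'subr-x) ; assumed home of `replace-region-contents' in Emacs 27

(defun base64-decode-utf8-region (beg end)
  "Base64-decode the region between BEG and END as text.
Use `coding-system-for-write' if it is set, otherwise UTF-8."
  (interactive "r")
  (replace-region-contents
   beg end
   (lambda ()
     (decode-coding-string
      (base64-decode-string (buffer-substring beg end))
      (or coding-system-for-write 'utf-8)))))

;; Hypothetical helper, for illustration only.
(defun my-base64-check (base64 &optional coding)
  "Decode the string BASE64 in a temporary buffer and return the text.
CODING, if non-nil, is bound the way `C-x RET c' would bind it."
  (with-temp-buffer
    (insert base64)
    (let ((coding-system-for-write coding))
      (base64-decode-utf8-region (point-min) (point-max)))
    (buffer-string)))

;; "w6Q=" is the base64 form of the UTF-8 bytes C3 A4.
(my-base64-check "w6Q=")
;; "5A==" is the base64 form of the single Latin-1 byte E4.
(my-base64-check "5A==" 'latin-1)
```

The second call binds only coding-system-for-write, because that is the variable the command consults.
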