From: Eli Zaretskii
Newsgroups: gmane.emacs.bugs
Subject: bug#38587: base64-decode-region breaks encoding
Date: Tue, 17 Dec 2019 18:04:16 +0200
Message-ID: <83k16v3pmn.fsf@gnu.org>
References: <87blsdhzeb.fsf@mail.linkov.net> <87pngtndhd.fsf@gnus.org> <87v9qieb6t.fsf@mail.linkov.net> <87eex66k7h.fsf@hase.home> <87zhft9rl4.fsf@mail.linkov.net> <835zig5kka.fsf@gnu.org> <87r214giaz.fsf@mail.linkov.net>
In-reply-to: <87r214giaz.fsf@mail.linkov.net> (message from Juri Linkov on Mon, 16 Dec 2019 23:51:48 +0200)
To: Juri Linkov
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org

> From: Juri Linkov
> Cc: schwab@linux-m68k.org, larsi@gnus.org, 38587@debbugs.gnu.org
> Date: Mon, 16 Dec 2019 23:51:48 +0200
>
> >> Is there an equivalent of force_encoding('UTF-8') in Emacs?
> >
> > "C-x RET c utf-8 RET M-x SOME-COMMAND RET"
>
> I see that 'C-x RET c' just sets coding-system-for-read and
> coding-system-for-write for the next command, so could
> base64-decode-region get coding from these variables?

Yes, just access the variable and use the value.

> > (decode-coding-string (base64-decode-string
> >                        (base64-encode-string
> >                         (encode-coding-string "ä" 'utf-8)))
> >                       'utf-8)
>
> Thanks, this works for strings.
>
> My real need was to find a way to decode base64 regions
> that were encoded with UTF-8 coding.

Then you need just base64-decode-region followed by
decode-coding-region.  Assuming that I understand what you mean,
i.e. that the region you want to decode includes only ASCII characters
and raw bytes (otherwise it is not correct to say that it is "encoded
with UTF-8").

> First I tried to find such post-processing that would
> recover "broken" characters inserted by base64-decode-region.
> It seems these characters represent bytes that are parts
> of the UTF-8 characters encoded in the UTF-8 buffer
> using eight-bit charset.  I failed to find such functions
> that would convert the result of base64-decode-region
> to UTF-8 characters in the UTF-8 buffer.

decode-coding-region should be what you want.  It decodes raw bytes
(a.k.a. "eight-bit charset") into characters.
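
The two-step sequence described above could look roughly like the
following.  This is only a sketch: the helper name
my-base64-region-demo and the sample input are made up for
illustration and are not part of Emacs or of the bug report.

```elisp
;; Hypothetical helper, for illustration only: run the two region
;; functions on TEXT in a scratch buffer and return the result.
(defun my-base64-region-demo (text coding)
  "Insert Base64 TEXT into a temp buffer and decode it using CODING."
  (with-temp-buffer
    (insert text)
    ;; Step 1: Base64 -> raw bytes (shown as eight-bit characters in
    ;; a multibyte buffer).
    (base64-decode-region (point-min) (point-max))
    ;; Step 2: raw bytes -> characters.  The text shrank in step 1,
    ;; so the bounds are taken again here.
    (decode-coding-region (point-min) (point-max) coding)
    (buffer-string)))

;; "w6Q=" is the Base64 form of the two bytes #xC3 #xA4, which is
;; how UTF-8 encodes the character "ä".
(my-base64-region-demo "w6Q=" 'utf-8)
```

The point of the sketch is that the second call must use the bounds of
the text as it is after the first call, since Base64 decoding makes
the region shorter.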

> So I wrote a replacement of base64-decode-region:
>
> (defun base64-decode-utf8-region (beg end)
>   (interactive "r")
>   (replace-region-contents beg end
>     (lambda ()
>       (decode-coding-string
>         (base64-decode-string
>           (buffer-substring beg end))
>         (or coding-system-for-write 'utf-8)))))
>
> But the question remains: is it possible to do the same
> in a simpler way without the need to write a new command?

Yes, see above.  In particular, decode-coding-region already knows how
to replace the region with the decoded text.
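
For anyone who still wants a single command that honors 'C-x RET c',
one possible shape is sketched below.  The command name
my-base64-decode-region-as-text is hypothetical, and reading
coding-system-for-read (rather than coding-system-for-write, as in the
quoted code) is an assumption about which variable fits decoding
better; 'C-x RET c' binds both.

```elisp
;; Hypothetical command, for illustration only.
(defun my-base64-decode-region-as-text (beg end &optional coding-system)
  "Base64-decode the text between BEG and END, then decode the bytes.
CODING-SYSTEM defaults to `coding-system-for-read' when that is
non-nil, for example after \\[universal-coding-system-argument],
and to `utf-8' otherwise."
  (interactive "r")
  (let ((coding (or coding-system coding-system-for-read 'utf-8))
        ;; A marker follows the end of the region as it shrinks.
        (end-marker (copy-marker end)))
    (base64-decode-region beg end)
    ;; decode-coding-region replaces the raw bytes in place.
    (decode-coding-region beg end-marker coding)
    ;; Detach the marker so it no longer tracks this buffer.
    (set-marker end-marker nil)))
```

With this defined, "C-x RET c utf-8 RET M-x
my-base64-decode-region-as-text RET" would pass the chosen coding
system through coding-system-for-read.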