From: Eli Zaretskii <eliz@gnu.org>
To: Juri Linkov <juri@linkov.net>
Cc: larsi@gnus.org, schwab@linux-m68k.org, 38587@debbugs.gnu.org
Subject: bug#38587: base64-decode-region breaks encoding
Date: Tue, 17 Dec 2019 18:04:16 +0200
Message-ID: <83k16v3pmn.fsf@gnu.org>
In-Reply-To: <87r214giaz.fsf@mail.linkov.net> (message from Juri Linkov on Mon, 16 Dec 2019 23:51:48 +0200)

> From: Juri Linkov <juri@linkov.net>
> Cc: schwab@linux-m68k.org,  larsi@gnus.org,  38587@debbugs.gnu.org
> Date: Mon, 16 Dec 2019 23:51:48 +0200
> 
> >> Is there an equivalent of force_encoding('UTF-8') in Emacs?
> >
> > "C-x RET c utf-8 RET M-x SOME-COMMAND RET"
> 
> I see that 'C-x RET c' just sets coding-system-for-read and
> coding-system-for-write for the next command, so could
> base64-decode-region get coding from these variables?

Yes, just access the variable and use the value.

> >   (decode-coding-string (base64-decode-string
> >                          (base64-encode-string
> >                           (encode-coding-string "ä" 'utf-8)))
> >                         'utf-8)
> 
> Thanks, this works for strings.
> 
> My real need was to find a way to decode base64 regions
> that were encoded with UTF-8 coding.

Then you just need base64-decode-region followed by
decode-coding-region.  This assumes I understand you correctly,
i.e. that the region you want to decode includes only ASCII characters
and raw bytes (otherwise it is not correct to say that it is "encoded
with UTF-8").
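
A minimal sketch of that two-step sequence, run in a temporary buffer so
it is easy to try.  The helper name `my-base64-decode-utf8-demo' is
hypothetical, and the sketch assumes the base64 payload is UTF-8 text:

```elisp
;; Sketch only: `my-base64-decode-utf8-demo' is a hypothetical name,
;; not an existing Emacs function.
(defun my-base64-decode-utf8-demo (base64-text)
  "Return BASE64-TEXT decoded from base64 and then from UTF-8."
  (with-temp-buffer
    (insert base64-text)
    ;; Step 1: base64 text -> raw bytes (shown as eight-bit characters).
    (base64-decode-region (point-min) (point-max))
    ;; Step 2: raw bytes -> characters, assuming the bytes are UTF-8.
    (decode-coding-region (point-min) (point-max) 'utf-8)
    (buffer-string)))

;; "w6Q=" is the base64 form of the two UTF-8 bytes of "ä".
(my-base64-decode-utf8-demo "w6Q=")
```

Using (point-min) and (point-max) for both calls avoids having to track
how much the text shrinks after the first step.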

> First I tried to find such post-processing that would
> recover "broken" characters inserted by base64-decode-region.
> It seems these characters represent bytes that are parts
> of the UTF-8 characters encoded in the UTF-8 buffer
> using eight-bit charset.  I failed to find such functions
> that would convert the result of base64-decode-region
> to UTF-8 characters in the UTF-8 buffer.

decode-coding-region should be what you want.  It decodes raw bytes
(a.k.a. "eight-bit charset") into characters.
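
The difference between raw bytes and characters can be seen at the
string level.  A small sketch, where the function name
`my-base64-bytes-vs-text' is hypothetical and UTF-8 is assumed as the
payload encoding:

```elisp
;; Sketch only: `my-base64-bytes-vs-text' is a hypothetical name.
(defun my-base64-bytes-vs-text (base64-text)
  "Compare the raw bytes and the decoded text of BASE64-TEXT.
Return (RAW-IS-MULTIBYTE RAW-LENGTH TEXT TEXT-LENGTH)."
  (let* ((raw (base64-decode-string base64-text))   ; unibyte: raw bytes
         (text (decode-coding-string raw 'utf-8)))  ; multibyte: characters
    (list (multibyte-string-p raw) (length raw)
          text (length text))))

;; Two raw bytes become one character after decoding.
(my-base64-bytes-vs-text "w6Q=")   ; => (nil 2 "ä" 1)
```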

> So I wrote a replacement of base64-decode-region:
> 
> (defun base64-decode-utf8-region (beg end)
>   (interactive "r")
>   (replace-region-contents beg end
>    (lambda ()
>      (decode-coding-string
>       (base64-decode-string
>        (buffer-substring beg end))
>       (or coding-system-for-write 'utf-8)))))
> 
> But the question remains: is it possible to do the same
> in a simpler way without the need to write a new command?

Yes, see above.  In particular, decode-coding-region already knows how
to replace the region with the decoded text.
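
If the two calls are wanted under a single key, they could be wrapped
as below.  This is only a sketch: the command name
`my-base64-decode-region-as-text' is hypothetical, and falling back to
UTF-8 when `coding-system-for-read' is nil is an assumption, not
something the thread prescribes.

```elisp
;; Sketch only: `my-base64-decode-region-as-text' is a hypothetical name.
(defun my-base64-decode-region-as-text (beg end &optional coding)
  "Base64-decode the region from BEG to END, then decode it as CODING.
CODING defaults to `coding-system-for-read' (which \\[universal-coding-system-argument]
binds for the next command), or to UTF-8 if that is nil."
  (interactive "r")
  (let ((coding (or coding coding-system-for-read 'utf-8)))
    (save-restriction
      ;; Narrow so that (point-max) follows the end of the region
      ;; as it shrinks during base64 decoding.
      (narrow-to-region beg end)
      (base64-decode-region (point-min) (point-max))
      ;; With no destination argument, the decoded text replaces
      ;; the region in place.
      (decode-coding-region (point-min) (point-max) coding))))
```

With this, "C-x RET c latin-1 RET M-x my-base64-decode-region-as-text RET"
would decode the bytes as Latin-1 instead of UTF-8.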




