unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
@ 2017-02-09  2:35 Katsumi Yamaoka
  2017-02-10  0:56 ` Katsumi Yamaoka
  0 siblings, 1 reply; 8+ messages in thread
From: Katsumi Yamaoka @ 2017-02-09  2:35 UTC (permalink / raw)
  To: 25658

Hi,

In a message draft, an ELisp part containing non-ASCII letters,
like the following, is not encoded properly.

<#part type="application/emacs-lisp" disposition=inline>
(defun mm-shr (handle)
  ...
	 ;; Remove "soft hyphens".
	 (goto-char (point-min))
	 (while (search-forward "­" nil t)
	   (replace-match "" t t))
<#/part>

This doesn't happen with Emacs 25.1.  Specifying the charset
explicitly, as follows, doesn't help.

<#part type="application/emacs-lisp" disposition=inline charset="utf-8">

Please note that you may want to quote mml tags when replying.

Thanks.

In GNU Emacs 26.0.50.1 (i686-pc-cygwin, GTK+ Version 3.18.9)
 of 2017-02-08 built on localhost
Windowing system distributor 'The Cygwin/X Project', version 11.0.11900000





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-09  2:35 bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly Katsumi Yamaoka
@ 2017-02-10  0:56 ` Katsumi Yamaoka
  2017-02-10  7:51   ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: Katsumi Yamaoka @ 2017-02-10  0:56 UTC (permalink / raw)
  To: 25658

On Thu, 09 Feb 2017 11:35:50 +0900, Katsumi Yamaoka wrote:
> In a message draft, an ELisp part containing non-ASCII letters,
> like the following, is not encoded properly.

> <#part type="application/emacs-lisp" disposition=inline>
> (defun mm-shr (handle)
>   ...
> 	 ;; Remove "soft hyphens".
> 	 (goto-char (point-min))
> 	 (while (search-forward "­" nil t)
> 	   (replace-match "" t t))
> <#/part>

;; Note that "­" is a soft hyphen.

What Gnus wants to do is:

(quoted-printable-encode-string
 (encode-coding-string "­" 'iso-8859-1))
 => "=AD"

However what is actually done is:

(with-temp-buffer
  ;; `mml-generate-mime-1' does:
  (set-buffer-multibyte t)
  (insert "­")
  ;; `mm-encode-body' does:
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  ;; `mm-encode-buffer' does:
  (quoted-printable-encode-region (point-min) (point-max))
  (buffer-string))
 => "=3FFFAD"

Hmm.

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "­")
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  (append (buffer-string) nil))
 => (4194221)

This would probably be the multibyte version of:

(append (encode-coding-string "­" 'iso-8859-1) nil)
 => (173)
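
A minimal sketch of how the two numbers above relate, assuming that
4194221 is the character Emacs uses to represent the raw byte #xAD in
multibyte text.  `my-raw-byte-relations' is a hypothetical name used
only for this illustration; `multibyte-char-to-unibyte' and
`unibyte-char-to-multibyte' are the standard conversion functions.

```elisp
;; Sketch only, not part of Gnus.
(defun my-raw-byte-relations (raw-char)
  "Return the hex form of RAW-CHAR, its byte value, and the round trip.
RAW-CHAR is assumed to be a raw-byte character such as 4194221."
  (let ((byte (multibyte-char-to-unibyte raw-char)))
    (list (format "%X" raw-char)                ; hex digits of the character code
          byte                                  ; the byte it stands for
          (unibyte-char-to-multibyte byte))))   ; back to the raw-byte character

(my-raw-byte-relations 4194221)
;; => ("3FFFAD" 173 4194221)
```

The "3FFFAD" here has the same hex digits as the "=3FFFAD" result
earlier, which would be consistent with
`quoted-printable-encode-region' seeing the raw-byte character rather
than the byte 173 (#xAD).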

Doesn't that mean we ought not to use `encode-coding-region'?
Anyway, I think we should do one of the following two things
here:

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "­")
  (encode-coding-region (point-min) (point-max) 'iso-8859-1)
  (set-buffer-multibyte nil)
  (quoted-printable-encode-region (point-min) (point-max))
  (buffer-string))
 => "=AD"

I'm not sure whether (set-buffer-multibyte nil) above does anything
other than convert the characters to their unibyte versions one by
one.  OTOH, this is what I often do:

(with-temp-buffer
  (set-buffer-multibyte t)
  (insert "­")
  (insert (prog1
	      (encode-coding-string (buffer-string) 'iso-8859-1)
	    (erase-buffer)
	    (set-buffer-multibyte nil)))
  (quoted-printable-encode-region (point-min) (point-max))
  (buffer-string))
 => "=AD"

Regards,





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-10  0:56 ` Katsumi Yamaoka
@ 2017-02-10  7:51   ` Eli Zaretskii
  2017-02-10 17:33     ` Glenn Morris
  0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2017-02-10  7:51 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: 25658

> Date: Fri, 10 Feb 2017 09:56:47 +0900
> From: Katsumi Yamaoka <yamaoka@jpl.org>
> 
> On Thu, 09 Feb 2017 11:35:50 +0900, Katsumi Yamaoka wrote:
> > In a message draft, an ELisp part containing non-ASCII letters,
> > like the following, is not encoded properly.
> 
> > <#part type="application/emacs-lisp" disposition=inline>
> > (defun mm-shr (handle)
> >   ...
> > 	 ;; Remove "soft hyphens".
> > 	 (goto-char (point-min))
> > 	 (while (search-forward "­" nil t)
> > 	   (replace-match "" t t))
> > <#/part>
> 
> ;; Note that "­" is a soft hyphen.
> 
> What Gnus wants to do is:
> 
> (quoted-printable-encode-string
>  (encode-coding-string "­" 'iso-8859-1))
>  => "=AD"

Then why doesn't Gnus do exactly that?





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-10  7:51   ` Eli Zaretskii
@ 2017-02-10 17:33     ` Glenn Morris
  2017-02-12 23:05       ` Katsumi Yamaoka
  0 siblings, 1 reply; 8+ messages in thread
From: Glenn Morris @ 2017-02-10 17:33 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Katsumi Yamaoka, 25658

Eli Zaretskii wrote:

> Then why doesn't Gnus do exactly that?

Could it be... a bug?!   ;)





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-10 17:33     ` Glenn Morris
@ 2017-02-12 23:05       ` Katsumi Yamaoka
  2017-02-13  2:03         ` Glenn Morris
  2017-02-13  5:44         ` Eli Zaretskii
  0 siblings, 2 replies; 8+ messages in thread
From: Katsumi Yamaoka @ 2017-02-12 23:05 UTC (permalink / raw)
  To: Glenn Morris; +Cc: 25658

On Fri, 10 Feb 2017 12:33:26 -0500, Glenn Morris wrote:
> Eli Zaretskii wrote:
>> Then why doesn't Gnus do exactly that?
> Could it be... a bug?!   ;)

Ok.  So,

cd lisp/gnus
egrep '\((decode|encode)-coding-region' *.el|wc -l
 => 10

are they all potentially bugs?





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-12 23:05       ` Katsumi Yamaoka
@ 2017-02-13  2:03         ` Glenn Morris
  2017-02-13  5:44         ` Eli Zaretskii
  1 sibling, 0 replies; 8+ messages in thread
From: Glenn Morris @ 2017-02-13  2:03 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: 25658

Katsumi Yamaoka wrote:

>>> Then why doesn't Gnus do exactly that?
>> Could it be... a bug?!   ;)
>
> Ok.  So,
>
> cd lisp/gnus
> egrep '\((decode|encode)-coding-region' *.el|wc -l
>  => 10
>
> are they all potentially bugs?

Don't ask me, I was only being flippant. :)





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-12 23:05       ` Katsumi Yamaoka
  2017-02-13  2:03         ` Glenn Morris
@ 2017-02-13  5:44         ` Eli Zaretskii
  2017-02-13  8:31           ` Katsumi Yamaoka
  1 sibling, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2017-02-13  5:44 UTC (permalink / raw)
  To: Katsumi Yamaoka; +Cc: 25658

> Date: Mon, 13 Feb 2017 08:05:41 +0900
> From: Katsumi Yamaoka <yamaoka@jpl.org>
> Cc: Eli Zaretskii <eliz@gnu.org>, 25658@debbugs.gnu.org
> 
> On Fri, 10 Feb 2017 12:33:26 -0500, Glenn Morris wrote:
> > Eli Zaretskii wrote:
> >> Then why doesn't Gnus do exactly that?
> > Could it be... a bug?!   ;)
> 
> Ok.  So,
> 
> cd lisp/gnus
> egrep '\((decode|encode)-coding-region' *.el|wc -l
>  => 10
> 
> are they all potentially bugs?

Not necessarily, they need to be reviewed one by one.

My question was triggered by the fact that "what Gnus wants" was so
much simpler and obviously correct that it was a clear winner IMO.  If
the other places are all of the same variety, then yes, I'd suggest
making similar replacements there as well.





* bug#25658: 26.0.50; ELisp part in a mail isn't encoded properly
  2017-02-13  5:44         ` Eli Zaretskii
@ 2017-02-13  8:31           ` Katsumi Yamaoka
  0 siblings, 0 replies; 8+ messages in thread
From: Katsumi Yamaoka @ 2017-02-13  8:31 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 25658

On Mon, 13 Feb 2017 07:44:07 +0200, Eli Zaretskii wrote:
>> cd lisp/gnus
>> egrep '\((decode|encode)-coding-region' *.el|wc -l
>>  => 10
>> are they all potentially bugs?

> Not necessarily, they need to be reviewed one by one.

Ok.  But I have personally come to think that *-coding-region should
never be used anymore.

> My question was triggered by the fact that "what Gnus wants" was so
> much simpler and obviously correct that it was a clear winner IMO.  If
> the other places are all of the same variety, then yes, I'd suggest
> making similar replacements there as well.

I see.  However, it's not so easy to simplify the code so that it
does exactly "what Gnus wants" everywhere (I mean using
*-coding-string in all cases).

Instead, I've modified `mm-encode-body' as an emergency fix.
In Emacs master, only `mml-generate-mime-1' uses it.
(`rfc2231-encode-string' uses it as well, but we now use
 `rfc2047-encode-parameter' instead for encoding file names.)

Regards,




