unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
@ 2009-01-02 22:09 Reiner Steib
  2009-01-03  3:23 ` Stefan Monnier
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Reiner Steib @ 2009-01-02 22:09 UTC (permalink / raw)
  To: emacs-pretest-bug

Package: emacs,gnus
Version: 23.0.60

When replying with Gnus to an article with a bogus charset declaration
(e.g. charset="ISO 8859-15" produced by Knode; "ISO-8859-15" would be
correct), the buffer contains eight-bit-control characters.  An example
in gmane.test is <873ag15s04.not-fsf@marauder.physik.uni-ulm.de>.

When sending the reply, Gnus asks: "Use ASCII as charset? " (see
`mml-parse-1').

Option 1: Answer `y'.

Result: The reply is sent with charset=us-ascii, but it contains 8bit
        characters.


Option 2: Answer `n'.  Then Gnus asks a second time.  Answer `n' again.

Result: The same as above.
        See <87wsddtn9r.fsf@marauder.physik.uni-ulm.de> in gmane.test


Expected behavior:

The following code from `message-fix-before-sending' should kick in:
(This is what happens in Emacs 22 with current Gnus CVS trunk,
i.e. the same Gnus code base as Emacs 23.)

  (message-check 'illegible-text
    (let (char found choice)
      (message-goto-body)
      (while (progn
	       (skip-chars-forward mm-7bit-chars)
	       (when (get-text-property (point) 'no-illegible-text)
		 ;; There is a signed or encrypted raw message part
		 ;; that is considered to be safe.
		 (goto-char (or (next-single-property-change
				 (point) 'no-illegible-text)
				(point-max))))
	       (setq char (char-after)))
	(when (or (< (mm-char-int char) 128)
		  (and (mm-multibyte-p)
		       (memq (char-charset char)
			     '(eight-bit-control eight-bit-graphic
						 control-1))
		       (not (get-text-property
			     (point) 'untranslated-utf-8))))
	  (message-overlay-put (message-make-overlay (point) (1+ (point)))
			       'face 'highlight)
	  (setq found t))
	(forward-char))
      (when found
	(setq choice
	      (gnus-multiple-choice
	       "Non-printable characters found.  Continue sending?"
	       `((?d "Remove non-printable characters and send")
		 (?r ,(format
		       "Replace non-printable characters with \"%s\" and send"
		       message-replacement-char))
		 (?i "Ignore non-printable characters and send")
		 (?e "Continue editing"))))
	(if (eq choice ?e)
	  (error "Non-printable characters"))
	(message-goto-body)
	(skip-chars-forward mm-7bit-chars)
	(while (not (eobp))
	  (when (let ((char (char-after)))
		  (or (< (mm-char-int char) 128)
		      (and (mm-multibyte-p)
			   ;; FIXME: Wrong for Emacs 23 (unicode) and for
			   ;; things like undecable utf-8.  Should at least
			   ;; use find-coding-systems-region.
			   (memq (char-charset char)
				 '(eight-bit-control eight-bit-graphic
						     control-1))
			   (not (get-text-property
				 (point) 'untranslated-utf-8)))))
	    (if (eq choice ?i)
		(message-kill-all-overlays)
	      (delete-char 1)
	      (when (eq choice ?r)
		(insert message-replacement-char))))
	  (forward-char)
	  (skip-chars-forward mm-7bit-chars)))))

In Emacs 23, (char-charset char) returns `eight-bit'.  Is adding
eight-bit next to eight-bit-graphic sufficient?  The comment (by Dave
Love, CC-ed if I got X-Debbugs-CC right) seems to suggest that there's
more to be done.
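
A minimal sketch of the check being asked about, written as a standalone
predicate so it can be tried in isolation.  The name
`my-illegible-char-p' is hypothetical and not part of message.el; the
charset list is the one from the code above with `eight-bit' added.

```elisp
;; Hypothetical helper for illustration only; not part of message.el.
(defun my-illegible-char-p (char)
  "Return t if CHAR belongs to a charset treated as illegible.
`eight-bit' is what Emacs 23 reports for raw bytes; the other
three names are the ones already listed in the quoted code."
  (and (memq (char-charset char)
             '(eight-bit-control eight-bit-graphic eight-bit control-1))
       t))

;; A raw byte such as #xA4 is held in a multibyte buffer as an
;; eight-bit character; `unibyte-char-to-multibyte' produces one.
(my-illegible-char-p (unibyte-char-to-multibyte #xA4))

;; An ordinary ASCII character is in charset `ascii', so it is not flagged.
(my-illegible-char-p ?a)
```

Whether this covers every case the FIXME comment has in mind is exactly
the open question; the sketch only shows the effect of the extra symbol.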

Bye, Reiner.



In GNU Emacs 23.0.60.1 (i686-pc-linux-gnu, GTK+ Version 2.12.9)
 of 2009-01-01 on primula
Windowing system distributor `The X.Org Foundation', version 11.0.10400090
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: en_US.UTF-8
  value of $XMODIFIERS: nil
  locale-coding-system: utf-8-unix
  default-enable-multibyte-characters: t
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/







* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2009-01-02 22:09 bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars Reiner Steib
@ 2009-01-03  3:23 ` Stefan Monnier
  2009-01-07 21:41 ` Dave Love
       [not found] ` <gk38s6$pa7$2@quimby.gnus.org>
  2 siblings, 0 replies; 9+ messages in thread
From: Stefan Monnier @ 2009-01-03  3:23 UTC (permalink / raw)
  To: Reiner Steib; +Cc: 1770, emacs-pretest-bug

> In Emacs 23, (char-charset char) returns `eight-bit'.  Is adding
> eight-bit next to eight-bit-graphic sufficient?

Not sure if it's sufficient, but it should help, yes.


        Stefan







* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2009-01-02 22:09 bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars Reiner Steib
  2009-01-03  3:23 ` Stefan Monnier
@ 2009-01-07 21:41 ` Dave Love
       [not found] ` <gk38s6$pa7$2@quimby.gnus.org>
  2 siblings, 0 replies; 9+ messages in thread
From: Dave Love @ 2009-01-07 21:41 UTC (permalink / raw)
  To: Reiner Steib; +Cc: 1770@emacsbugs.donarmstrong.com, emacs-pretest-bug@gnu.org

Reiner Steib <reinersteib+gmane@imap.cc> writes:

> In Emacs 23, (char-charset char) returns `eight-bit'.  Is adding
> eight-bit next to eight-bit-graphic sufficient?  The comment (by Dave
> Love, CC-ed if I got X-Debbugs-CC right) seems to suggest that there's
> more to be done.

You should ask handa about that and other Mule issues.  Experience shows
it's not helpful for me to explain.

There were various things like that I left unfixed for Mule 6 (for
various reasons) five years ago, or whenever it was.

By the way, `undecable' should be `undecodable' in the comment, which
may only apply in Emacs 21 -- I don't know.

I think there are various things wrong with
`message-fix-before-sending'.  The one I remember is it objecting to
stuff in non-text inline MIME parts, e.g. if you try to use
application/octet-stream for a Lisp backtrace.








* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
       [not found] ` <gk38s6$pa7$2@quimby.gnus.org>
@ 2009-01-08 20:28   ` Reiner Steib
  2009-01-16  7:45     ` Kenichi Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Reiner Steib @ 2009-01-08 20:28 UTC (permalink / raw)
  To: Dave Love; +Cc: 1770, Kenichi Handa

On Wed, Jan 07 2009, Dave Love wrote:

> Reiner Steib <reinersteib+gmane@imap.cc> writes:
>
>> In Emacs 23, (char-charset char) returns `eight-bit'.  Is adding
>> eight-bit next to eight-bit-graphic sufficient?  The comment (by Dave
>> Love, CC-ed if I got X-Debbugs-CC right) seems to suggest that there's
>> more to be done.
>
> You should ask handa about that and other Mule issues.  Experience shows
> it's not helpful for me to explain.

Cc-ed.

> There were various things like that I left unfixed for Mule 6 (for
> various reasons) five years ago, or whenever it was.
>
> By the way, `undecable' should be `undecodable' in the comment, 

Fixed.

> which may only apply in Emacs 21 -- I don't know.

Added:

			   ;; FIXME: Wrong for Emacs 23 (unicode) and for
			   ;; things like undecodable utf-8 (in Emacs 21?).
			   ;; Should at least use find-coding-systems-region.
			   ;; -- fx

> I think there are various things wrong with
> `message-fix-before-sending'.  The one I remember is it objecting to
> stuff in non-text inline MIME parts, e.g. if you try to use
> application/octet-stream for a Lisp backtrace.

You can simply say "ignore", can't you?

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo---  |  PGP key available  |  http://rsteib.home.pages.de/







* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2009-01-08 20:28   ` Reiner Steib
@ 2009-01-16  7:45     ` Kenichi Handa
  2010-09-30 17:48       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 9+ messages in thread
From: Kenichi Handa @ 2009-01-16  7:45 UTC (permalink / raw)
  To: Reiner Steib; +Cc: 1770, d.love

In article <871vvdee4d.fsf@marauder.physik.uni-ulm.de>, Reiner Steib <Reiner.Steib@gmx.de> writes:

> On Wed, Jan 07 2009, Dave Love wrote:
> > Reiner Steib <reinersteib+gmane@imap.cc> writes:
> >
> > > In Emacs 23, (char-charset char) returns `eight-bit'.  Is adding
> > > eight-bit next to eight-bit-graphic sufficient?  The comment (by Dave
> > > Love, CC-ed if I got X-Debbugs-CC right) seems to suggest that there's
> > > more to be done.
> >
> > You should ask handa about that and other Mule issues.  Experience shows
> > it's not helpful for me to explain.

> Cc-ed.

Yes.  For Emacs 23, adding eight-bit to the list is ok.
But, I think it is better to catch non-Unicode characters
(#x110000..#x3FFF7F) here too.  For Emacs 23 only, we can
have this simple code:

	(while (not (eobp))
	  (when (not (encode-char (char-after) 'unicode))
                ;; or simply (>= (char-after) #x110000)
	    (if (eq choice ?i)
		(message-kill-all-overlays)
	      (delete-char 1)
	      (when (eq choice ?r)
		(insert message-replacement-char))))
	  (forward-char)
	  (skip-chars-forward mm-7bit-chars))
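
The test used in the loop above can be sketched as a standalone
predicate.  The name `my-non-unicode-char-p' is hypothetical and not
part of message.el; the sketch assumes Emacs 23 or later, where
`encode-char' with the `unicode' charset returns nil for characters
that have no Unicode code point.

```elisp
;; Hypothetical helper for illustration only; not part of message.el.
(defun my-non-unicode-char-p (char)
  "Return non-nil if CHAR has no code point in the `unicode' charset.
This covers both the non-Unicode range starting at #x110000 and
raw eight-bit bytes."
  (not (encode-char char 'unicode)))

;; ASCII has a Unicode code point, so it is not flagged.
(my-non-unicode-char-p ?a)

;; #x110000 is the first character beyond the Unicode range.
(my-non-unicode-char-p #x110000)

;; A raw byte, represented as an eight-bit character.
(my-non-unicode-char-p (unibyte-char-to-multibyte #x80))
```

Compared with matching charset names, this single test needs no list of
symbols, but it is only meaningful on the Unicode-based internal
representation.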

> Added:

> 			   ;; FIXME: Wrong for Emacs 23 (unicode) and for
> 			   ;; things like undecodable utf-8 (in Emacs 21?).
> 			   ;; Should at least use find-coding-systems-region.
> 			   ;; -- fx

After filtering out those strange characters, how is a
coding system decided?  Is select-message-coding-system
used?

---
Kenichi Handa
handa@m17n.org







* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2009-01-16  7:45     ` Kenichi Handa
@ 2010-09-30 17:48       ` Lars Magne Ingebrigtsen
  2010-10-14  6:37         ` Kenichi Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-09-30 17:48 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Reiner Steib, d.love, 1770

Kenichi Handa <handa@m17n.org> writes:

> Yes.  For Emacs 23, adding eight-bit to the list is ok.
> But, I think it is better to catch non-Unicode characters
> (#x110000..#x3FFF7F) here too.  For Emacs 23 only, we can
> have this simple code:
>
> 	(while (not (eobp))
> 	  (when (not (encode-char (char-after) 'unicode))
>                 ;; or simply (>= (char-after) #x110000)
> 	    (if (eq choice ?i)
> 		(message-kill-all-overlays)
> 	      (delete-char 1)
> 	      (when (eq choice ?r)
> 		(insert message-replacement-char))))
> 	  (forward-char)
> 	  (skip-chars-forward mm-7bit-chars))

Was this installed?  If not, was a different fix applied, and the bug
not closed, or is this still a problem?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen






* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2010-09-30 17:48       ` Lars Magne Ingebrigtsen
@ 2010-10-14  6:37         ` Kenichi Handa
  2010-10-14 19:19           ` Lars Magne Ingebrigtsen
  2011-01-24  2:55           ` Lars Ingebrigtsen
  0 siblings, 2 replies; 9+ messages in thread
From: Kenichi Handa @ 2010-10-14  6:37 UTC (permalink / raw)
  To: Lars Magne Ingebrigtsen; +Cc: Reiner.Steib, d.love, 1770

In article <m3tyl7gm1v.fsf@quimbies.gnus.org>, Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Kenichi Handa <handa@m17n.org> writes:
> > Yes.  For Emacs 23, adding eight-bit to the list is ok.
> > But, I think it is better to catch non-Unicode characters
> > (#x110000..#x3FFF7F) here too.  For Emacs 23 only, we can
> > have this simple code:
> >
> > 	(while (not (eobp))
> > 	  (when (not (encode-char (char-after) 'unicode))
> >                 ;; or simply (>= (char-after) #x110000)
> > 	    (if (eq choice ?i)
> > 		(message-kill-all-overlays)
> > 	      (delete-char 1)
> > 	      (when (eq choice ?r)
> > 		(insert message-replacement-char))))
> > 	  (forward-char)
> > 	  (skip-chars-forward mm-7bit-chars))

> Was this installed?

No.

> If not, was a different fix applied, 

Yes.

2009-01-03  Reiner Steib  <Reiner.Steib@gmx.de>

	* message.el (message-fix-before-sending): Add `eight-bit' to
	illegible-text check.

> and the bug not closed, or is this still a problem?

As I wrote, non-Unicode characters are still not caught
here.  But I'm not sure it's a problem to be solved by
message-fix-before-sending.  I have not yet got a reply to
this question:

> After filtering out those strange characters, how is a
> coding system decided?  Is select-message-coding-system
> used?

---
Kenichi Handa
handa@m17n.org






* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2010-10-14  6:37         ` Kenichi Handa
@ 2010-10-14 19:19           ` Lars Magne Ingebrigtsen
  2011-01-24  2:55           ` Lars Ingebrigtsen
  1 sibling, 0 replies; 9+ messages in thread
From: Lars Magne Ingebrigtsen @ 2010-10-14 19:19 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: d.love, bugs, Reiner.Steib, 1770

Kenichi Handa <handa@m17n.org> writes:

> As I wrote, non-Unicode characters are still not caught
> here.  But I'm not sure it's a problem to be solved by
> message-fix-before-sending.  I have not yet got a reply to
> this question.

Removing characters that can't be encoded seems like a good idea, if I
understand the problem correctly.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen






* bug#1770: 23.0.60; (message-check 'illegible-text ...) fails on eight-bit chars
  2010-10-14  6:37         ` Kenichi Handa
  2010-10-14 19:19           ` Lars Magne Ingebrigtsen
@ 2011-01-24  2:55           ` Lars Ingebrigtsen
  1 sibling, 0 replies; 9+ messages in thread
From: Lars Ingebrigtsen @ 2011-01-24  2:55 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 1770-close, d.love, Reiner.Steib

Kenichi Handa <handa@m17n.org> writes:

> As I wrote, non-Unicode characters are still not caught
> here.  But I'm not sure it's a problem to be solved by
> message-fix-before-sending.  I have not yet got a reply to
> this question.

Ok.  Well, I think it may (or may not) be nice to warn users about
sending un-encodable bytes.  But they will most likely get a warning of
some kind, since there's probably other eight-bit-chars there, so I
think that's probably sufficient.  So I'm closing this report now,
unless anybody objects...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




