unofficial mirror of bug-gnu-emacs@gnu.org 
* html / emacs / email / w3 / gnus: 0xa0 and 0x0a both used as newlines
@ 2004-09-24 14:03 Daniel Ortmann
  0 siblings, 0 replies; 6+ messages in thread
From: Daniel Ortmann @ 2004-09-24 14:03 UTC (permalink / raw)


In GNU Emacs 21.3.1 (i386-pc-linux-gnu, X toolkit, Xaw3d scroll bars)
 of 2004-08-03 on raven, modified by Debian
configured using `configure '--build=i386-linux' '--host=i386-linux' '--prefix=/usr' '--sharedstatedir=/var/lib' '--libexecdir=/usr/lib' '--localstatedir=/var/lib' '--infodir=/usr/share/info' '--mandir=/usr/share/man' '--with-pop=yes' '--with-x=yes' '--with-x-toolkit=athena' 'CFLAGS=-DDEBIAN -g -O2' 'build_alias=i386-linux' 'host_alias=i386-linux''
Important settings:
  value of $LC_ALL: nil
  value of $LC_COLLATE: nil
  value of $LC_CTYPE: nil
  value of $LC_MESSAGES: nil
  value of $LC_MONETARY: nil
  value of $LC_NUMERIC: nil
  value of $LC_TIME: nil
  value of $LANG: nil
  locale-coding-system: nil
  default-enable-multibyte-characters: t

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

Hello,

Many emails are now using 0xa0 as a newline character (at least as
rendered by w3 under gnus and as sent by messages using "R" for Reply).
Often the normal newline 0x0a occurs above and below the 0xa0 line.

E.g. some text.
0x0a
0xa0
0x0a
Some other text.

These strange line separators break C-x C-o (delete-blank-lines), M-q
(fill-paragraph), as well as other whitespace-related functionality.
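
To make the symptom concrete: these commands presumably look for lines made
up of spaces and tabs only, so a line holding a lone 0xa0 looks blank on
screen but is not matched. A minimal sketch of a predicate that also accepts
0xa0 follows. The name `my-visually-blank-p` is hypothetical (not part of
Emacs), and the "\u00a0" string escape assumes a reasonably recent Emacs.

```elisp
;; Hypothetical helper, not an Emacs built-in: report whether STRING
;; would look blank on screen even if it contains no-break spaces.
(defun my-visually-blank-p (string)
  "Return non-nil if STRING holds only spaces, tabs, or no-break spaces."
  (and (string-match "\\`[ \t\u00a0]*\\'" string) t))

(my-visually-blank-p "\u00a0")   ; non-nil: a lone 0xa0 looks blank
(my-visually-blank-p " \t")      ; non-nil: ordinary whitespace
(my-visually-blank-p "text")     ; nil
```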

What is the solution?
Should w3 render 0xa0 as 0x0a?
Should 0xa0 be treated as [[:space:]]???  (dangerous?)
Or is their some other solution?

It's difficult to "fix" (i.e. compensate for) countless broken mail clients.

Thanks!

Recent input:
C-n C-n C-n C-v C-v c c c c C-x k <return> <return> 
N P C-x o C-n C-SPC C-n C-n C-n C-n C-x n n M-x h e 
x l - m o <tab> <return> y C-p C-f C-f C-f C-f C-f 
C-f C-f C-f C-f C-f C-f C-f C-f C-f C-f C-p C-a C-f 
C-f C-f C-f C-f C-f C-f C-f C-f C-b C-b C-b C-b C-b 
C-f C-f C-f C-b C-b C-b C-f C-f <down-mouse-1> <mouse-1> 
<help-echo> C-c C-c C-x n w C-l C-x o M-u M-u C-p C-p 
<help-echo> <help-echo> <help-echo> <help-echo> <help-echo> 
<menu-bar> <help-menu> <report-emacs-bug>

Recent messages:
Parsed 100% of 6876...done
Unknown directive in stylesheet: @font-face
Unknown directive in stylesheet: @page
Drawing... done
Mark set
Invalid face text property value: minibuffer-prompt [27 times]
Auto-saving...done

Loading emacsbug...done
Invalid face text property value: minibuffer-prompt [314 times]

-- 
Daniel Ortmann, LSI Logic, 3425 40th Av NW, Suite 200, Rochester MN 55901
work: Daniel.Ortmann@lsil.com / 507.535.3861 / 63861 int / 8012.3861 gdds
home: dortmann@charter.net 507.288.7732, 2414 30Av NW #D, Rochester MN 55901
gpg/pgp public key: http://wwwkeys.us.pgp.net
jabber: daniel_ortmann@jabber.org / dortmann@jabber.co.lsil.com


* Re: html / emacs / email / w3 / gnus: 0xa0 and 0x0a both used as newlines
@ 2004-09-30 17:09 Lars Magne Ingebrigtsen
  2004-09-30 18:42 ` Daniel Ortmann
       [not found] ` <mailman.573.1096570193.2017.bug-gnu-emacs@gnu.org>
  0 siblings, 2 replies; 6+ messages in thread
From: Lars Magne Ingebrigtsen @ 2004-09-30 17:09 UTC (permalink / raw)
  Cc: bug-gnu-emacs

Daniel Ortmann <dortmann@lsil.com> writes:

> Many emails are now using 0xa0 as a newline character (at least as
> rendered by w3 under gnus and as sent by messages using "R" for Reply).
> Often the normal newline 0x0a occurs above and below the 0xa0 line.

That sounds odd.  0xa0 is non-breaking space, which doesn't have
anything to do with newlines.

However, it's not uncommon for messages (especially HTML, for some
reason) to have non-breaking spaces in them.  So if you respond to
such an article, they'll be quoted just like any other character.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen


* Re: html / emacs / email / w3 / gnus: 0xa0 and 0x0a both used as newlines
  2004-09-30 17:09 html / emacs / email / w3 / gnus: 0xa0 and 0x0a both used as newlines Lars Magne Ingebrigtsen
@ 2004-09-30 18:42 ` Daniel Ortmann
  2004-10-19 21:37   ` html / emacs / email / gnus: 0xa0 classified as "whitespace" but not treated as whitespace Daniel Ortmann
       [not found]   ` <mailman.4093.1098222316.2017.bug-gnu-emacs@gnu.org>
       [not found] ` <mailman.573.1096570193.2017.bug-gnu-emacs@gnu.org>
  1 sibling, 2 replies; 6+ messages in thread
From: Daniel Ortmann @ 2004-09-30 18:42 UTC (permalink / raw)


Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Daniel Ortmann <dortmann@lsil.com> writes:
> 
> > Many emails are now using 0xa0 as a newline character (at least as
> > rendered by w3 under gnus and as sent by messages using "R" for
> > Reply).  Often the normal newline 0x0a occurs above and below the
> > 0xa0 line.
> 
> That sounds odd.  0xa0 is non-breaking space, which doesn't have
> anything to do with newlines.
> 
> However, it's not uncommon for messages (especially HTML, for some
> reason) to have non-breaking spaces in them.  So if you respond to such
> an article, they'll be quoted just like any other character.

Well, I am not seeing 0xa0's "quoted", but I *am* seeing them treated as
blank lines.

Note that I have "url" and "w3" installed.  Perhaps the problem lies there?



* Re: html / emacs / email / w3 / gnus: 0xa0 and 0x0a both used as newlines
       [not found] ` <mailman.573.1096570193.2017.bug-gnu-emacs@gnu.org>
@ 2004-09-30 19:52   ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 6+ messages in thread
From: Lars Magne Ingebrigtsen @ 2004-09-30 19:52 UTC (permalink / raw)


Daniel Ortmann <dortmann@lsil.com> writes:

> Well, I am not seeing 0xa0's "quoted", but I *am* seeing them
> treated as blank lines.

In Message mode?  Then they're presumably quoted from the message
you're responding to.

> Note that I have "url" and "w3" installed.  Perhaps the problem lies
> there?

Look at the original messages.  Do the characters appear there, too? 
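
One way to check is to count the characters directly in the buffer showing
the original message. This is only a sketch: `my-count-nbsp` is a
hypothetical name, and the "\u00a0" escape assumes a reasonably recent Emacs.

```elisp
;; Hypothetical helper, not an Emacs built-in.
(defun my-count-nbsp ()
  "Return the number of no-break spaces (0xa0) in the current buffer."
  (save-excursion
    (goto-char (point-min))
    (let ((count 0))
      ;; Walk forward over every occurrence until the search fails.
      (while (search-forward "\u00a0" nil t)
        (setq count (1+ count)))
      count)))
```

Evaluating `(my-count-nbsp)` with M-: in the article buffer and again in the
reply buffer would show whether the characters were already present in the
original.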



* html / emacs / email / gnus: 0xa0 classified as "whitespace" but not treated as whitespace
  2004-09-30 18:42 ` Daniel Ortmann
@ 2004-10-19 21:37   ` Daniel Ortmann
       [not found]   ` <mailman.4093.1098222316.2017.bug-gnu-emacs@gnu.org>
  1 sibling, 0 replies; 6+ messages in thread
From: Daniel Ortmann @ 2004-10-19 21:37 UTC (permalink / raw)


Correction, 0xa0 should actually be treated as whitespace but is not.
I.e. fill-paragraph and friends don't treat it as whitespace:

Here is what describe-char-after says:

--------------------------------
  character:   (04240, 2208, 0x8a0)
    charset: latin-iso8859-1 (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
 code point: 32
     syntax: whitespace
   category:  :This character counts as a space for indentation purposes.   l:Latin  
buffer code: 0x81 0xA0
  file code: not encodable by coding system nil
       font: -Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO8859-1
--------------------------------

I use the following routine to fix them up:

(defun 0xa0-clean ()
  "Interactively replace each 0xa0 (no-break space) with a plain space."
  (interactive)
  (query-replace-regexp "\240" " "))

Any hint as to where the problem might be?  I suspect it's a simple fix.
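
A non-interactive variant of the routine above, which replaces every 0xa0 in
the region without prompting, might look like the following. It is a sketch
under stated assumptions: `my-nbsp-to-space` is a hypothetical name, and the
"\u00a0" escape assumes a reasonably recent Emacs rather than 21.3.

```elisp
;; Hypothetical helper, not an Emacs built-in.
(defun my-nbsp-to-space (beg end)
  "Replace each no-break space between BEG and END with an ordinary space."
  (interactive "r")
  (save-excursion
    (goto-char beg)
    ;; Each replacement swaps one character for one character, so END
    ;; stays a valid search bound throughout the loop.
    (while (search-forward "\u00a0" end t)
      (replace-match " " t t))))
```

After M-x my-nbsp-to-space on the affected text, M-q and C-x C-o should see
ordinary spaces. Note that this also discards the no-break property wherever
it was intended.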





* Re: html / emacs / email / gnus: 0xa0 classified as "whitespace" but not treated as whitespace
       [not found]   ` <mailman.4093.1098222316.2017.bug-gnu-emacs@gnu.org>
@ 2004-10-20 15:58     ` Kevin Rodgers
  0 siblings, 0 replies; 6+ messages in thread
From: Kevin Rodgers @ 2004-10-20 15:58 UTC (permalink / raw)


[Please don't top-post.]

Daniel Ortmann wrote:
 > Correction, 0xa0 should actually be treated as whitespace but is not.
 > I.e. fill-paragraph and friends don't treat it as whitespace:
 >
 > Here is what describe-char-after says:
 >
 > --------------------------------
 >   character:   (04240, 2208, 0x8a0)
 >     charset: latin-iso8859-1 (Right-Hand Part of Latin Alphabet 1 (ISO/IEC 8859-1): ISO-IR-100)
 >  code point: 32
 >      syntax: whitespace
 >    category:  :This character counts as a space for indentation purposes.   l:Latin
 > buffer code: 0x81 0xA0
 >   file code: not encodable by coding system nil
 >        font: -Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO8859-1
 > --------------------------------

Latin 1 0xA0 is named NO-BREAK SPACE for a reason.  It should not be
treated the same as ASCII 0x20, which is the SPACE character.

-- 
Kevin

