unofficial mirror of emacs-devel@gnu.org 
* ^M in the info files
@ 2008-06-11  0:43 Lennart Borgman (gmail)
  2008-06-11  4:42 ` dhruva
  0 siblings, 1 reply; 71+ messages in thread
From: Lennart Borgman (gmail) @ 2008-06-11  0:43 UTC (permalink / raw)
  To: Emacs Devel

I see a lot of ^M in the info files on w32, CVS from today. For example 
in these files

   Emacs FAQ
   VIPER





* Re: ^M in the info files
  2008-06-11  0:43 ^M in the info files Lennart Borgman (gmail)
@ 2008-06-11  4:42 ` dhruva
  2008-06-11 15:56   ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: dhruva @ 2008-06-11  4:42 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: Emacs Devel

Same here, I kept quiet as I am using GIT and thought I had messed
around with something!

On Wed, Jun 11, 2008 at 6:13 AM, Lennart Borgman (gmail)
<lennart.borgman@gmail.com> wrote:
> I see a lot of ^M in the info files on w32, CVS from today. For example in
> these files

-dhruva

-- 
Contents reflect my personal views only!





* Re: ^M in the info files
  2008-06-11  4:42 ` dhruva
@ 2008-06-11 15:56   ` Juanma Barranquero
  2008-07-09  1:51     ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-06-11 15:56 UTC (permalink / raw)
  To: dhruva; +Cc: Emacs Devel, Lennart Borgman (gmail), Kenichi Handa

On Wed, Jun 11, 2008 at 06:42, dhruva <dhruvakm@gmail.com> wrote:

> Same here, I kept quiet as I am using GIT and thought I had messed
> around with something!
>
> On Wed, Jun 11, 2008 at 6:13 AM, Lennart Borgman (gmail)
> <lennart.borgman@gmail.com> wrote:
>> I see a lot of ^M in the info files on w32, CVS from today. For example in
>> these files

Apparently related to these changes:

2008-06-05  Kenichi Handa  <handa@m17n.org>

	* coding.c (detect_coding): Fix previous change.
	(detect_coding_system): Likewise.

2008-06-04  Kenichi Handa  <handa@m17n.org>

	* coding.c (detect_coding): Fix handling of coding->head_ascii.
	Be sure to call setup_coding_system when we find a proper coding system.
	(detect_coding_system): Fix handling of coding->head_ascii.

Removing them fixes the problem for me.

  Juanma





* Re: ^M in the info files
  2008-06-11 15:56   ` Juanma Barranquero
@ 2008-07-09  1:51     ` Juanma Barranquero
  2008-07-09  2:44       ` Kenichi Handa
  0 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-07-09  1:51 UTC (permalink / raw)
  To: Emacs Devel

Has this one been forgotten?

On Wed, Jun 11, 2008 at 17:56, Juanma Barranquero <lekktu@gmail.com> wrote:
> On Wed, Jun 11, 2008 at 06:42, dhruva <dhruvakm@gmail.com> wrote:
>
>> Same here, I kept quiet as I am using GIT and thought I had messed
>> around with something!
>>
>> On Wed, Jun 11, 2008 at 6:13 AM, Lennart Borgman (gmail)
>> <lennart.borgman@gmail.com> wrote:
>>> I see a lot of ^M in the info files on w32, CVS from today. For example in
>>> these files
>
> Apparently related to these changes:
>
> 2008-06-05  Kenichi Handa  <handa@m17n.org>
>
>        * coding.c (detect_coding): Fix previous change.
>        (detect_coding_system): Likewise.
>
> 2008-06-04  Kenichi Handa  <handa@m17n.org>
>
>        * coding.c (detect_coding): Fix handling of coding->head_ascii.
>        Be sure to call setup_coding_system when we find a proper coding system.
>        (detect_coding_system): Fix handling of coding->head_ascii.
>
> Removing them fixes the problem for me.
>
>  Juanma





* Re: ^M in the info files
  2008-07-09  1:51     ` Juanma Barranquero
@ 2008-07-09  2:44       ` Kenichi Handa
  2008-07-09  2:56         ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-07-09  2:44 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

In article <f7ccd24b0807081851j30dd8a61p4f1926dd740f033b@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> Has this one been forgotten?

No, but I haven't had time to work on it.  As I've just
committed a change about font selection, I started to work
on it.  I've just installed the latest Emacs on Windows with
cygwin, and done this:

> The sequence is this:

>  - I have a ChangeLog, apparently correct, up-to-date with the repository.
>  - I modify it, typically by adding a new ChangeLog entry.
>  - I commit it from inside Emacs, with vc-next-action followed by log-edit-done.
>  - I see the diff in emacs-diffs and notice that an empty line has
> been deleted. This seems to happen more to empty lines that separate
> paragraphs from date/author lines (as opposed to empty space between
> paragraphs), but I have no hard data, just a feeling.
>  - I edit the ChangeLog to see what's happened. All the lines end in ^M.
>  - I remove the ^M (with replace-string <ENTER> ^M^J <ENTER> ^J) [I
> don't write the ChangeLog, it's just to make it easier to spot problems.]
>  - At that point (after removing the ^Ms) the line with the problem
> has this aspect:

But, I can't reproduce the bug.  The diff in step 4
above shows only the entry I added.  What was the coding
system of your ChangeLog file when you first visited it?

---
Kenichi Handa
handa@ni.aist.go.jp


> On Wed, Jun 11, 2008 at 17:56, Juanma Barranquero <lekktu@gmail.com> wrote:
> > On Wed, Jun 11, 2008 at 06:42, dhruva <dhruvakm@gmail.com> wrote:
> >
>>> Same here, I kept quiet as I am using GIT and thought I had messed
>>> around with something!
>>> 
>>> On Wed, Jun 11, 2008 at 6:13 AM, Lennart Borgman (gmail)
>>> <lennart.borgman@gmail.com> wrote:
>>>> I see a lot of ^M in the info files on w32, CVS from today. For example in
>>>> these files
> >
> > Apparently related to these changes:
> >
> > 2008-06-05  Kenichi Handa  <handa@m17n.org>
> >
> >        * coding.c (detect_coding): Fix previous change.
> >        (detect_coding_system): Likewise.
> >
> > 2008-06-04  Kenichi Handa  <handa@m17n.org>
> >
> >        * coding.c (detect_coding): Fix handling of coding->head_ascii.
> >        Be sure to call setup_coding_system when we find a proper coding system.
> >        (detect_coding_system): Fix handling of coding->head_ascii.
> >
> > Removing them fixes the problem for me.
> >
> >  Juanma








* Re: ^M in the info files
  2008-07-09  2:44       ` Kenichi Handa
@ 2008-07-09  2:56         ` Juanma Barranquero
  2008-07-09  4:33           ` Kenichi Handa
  0 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-07-09  2:56 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On Wed, Jul 9, 2008 at 04:44, Kenichi Handa <handa@m17n.org> wrote:

> No, but I haven't had time to work on it.

Sorry if it seemed like a request; I was just making sure it didn't get lost.

>> The sequence is this:
>
>>  - I have a ChangeLog, apparently correct, up-to-date with the repository.
>>  - I modify it, typically by adding a new ChangeLog entry.
>>  - I commit it from inside Emacs, with vc-next-action followed by log-edit-done.
>>  - I see the diff in emacs-diffs and notice that an empty line has
>> been deleted. This seems to happen more to empty lines that separate
>> paragraphs from date/author lines (as opposed to empty space between
>> paragraphs), but I have no hard data, just a feeling.
>>  - I edit the ChangeLog to see what's happened. All the lines end in ^M.
>>  - I remove the ^M (with replace-string <ENTER> ^M^J <ENTER> ^J) [I
>> don't write the ChangeLog, it's just to make it easier to spot problems.]
>>  - At that point (after removing the ^Ms) the line with the problem
>> has this aspect:

I think you're mixing two different bugs. The one I referred to in the
"has [...] been forgotten" message is about ^M in info files, which
happens right now in the Windows port (there have been at least three
reporters, including me).

> But, I can't reproduce the bug.  The diff of the step 4
> above shows only the entry I added.

What you quote above is from a problem with ChangeLogs that only I
see, apparently, so it's no wonder you cannot reproduce it. *I* cannot
reproduce it at will, alas...

> What was the coding
> system of your ChangeLog file when you first visited it?

utf-8-dos, AFAICS.

  Juanma





* Re: ^M in the info files
  2008-07-09  2:56         ` Juanma Barranquero
@ 2008-07-09  4:33           ` Kenichi Handa
  2008-07-09  9:15             ` Jason Rumney
  2008-07-09 10:02             ` Juanma Barranquero
  0 siblings, 2 replies; 71+ messages in thread
From: Kenichi Handa @ 2008-07-09  4:33 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

In article <f7ccd24b0807081956o1c443d54k64cafc104d850aba@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> I think you're mixing two different bugs. The one I referred to in the
> "has [...] been forgotten" message is about ^M in info files, which
> happens right now in the Windows port (there have been at least three
> reporters, including me).

> > But, I can't reproduce the bug.  The diff of the step 4
> > above shows only the entry I added.

> What you quote above is from a problem with ChangeLogs that only I
> see, apparently, so it's no wonder you cannot reproduce it. *I* cannot
> reproduce it at will, alas...

Oops, sorry.  But, I can't see the ^M problem in info files.
Actually, on Windows, when I type C-h i, I get this error:

  Can't find the Info directory node

But, when I type C-u C-h i ~/emacs/info/efaq RET (I built
Emacs under ~/emacs),  I see no '^M's.

What is the coding system of emacs/info/efaq when you visit
that file directly?

> > What was the coding
> > system of your ChangeLog file when you first visited it?

> utf-8-dos, AFAICS.

Hmmm, I tried with a ChangeLog of that encoding, but still
can't reproduce the bug.

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: ^M in the info files
  2008-07-09  4:33           ` Kenichi Handa
@ 2008-07-09  9:15             ` Jason Rumney
  2008-07-09 11:16               ` Kenichi Handa
  2008-07-09 10:02             ` Juanma Barranquero
  1 sibling, 1 reply; 71+ messages in thread
From: Jason Rumney @ 2008-07-09  9:15 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Juanma Barranquero, emacs-devel

Kenichi Handa wrote:
> In article <f7ccd24b0807081956o1c443d54k64cafc104d850aba@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> Oops, sorry.  But, I can't see the ^M problem in info files.
> Actually, on Windows, when I type C-h i, I get this error:
> 
>   Can't find the Info directory node
> 
> But, when I type C-u C-h i ~/emacs/info/efaq RET (I built
> Emacs under ~/emacs),  I see no '^M's.

Perhaps the version of makeinfo you have is not generating files with
DOS line ends. Mine is, and even opening info/efaq with C-x C-f shows
the ^M characters, though I don't see any inconsistencies that should
cause line-end detection to fail and Emacs 22.2 opens it correctly with
a DOS coding system.
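The line-end check described here can be illustrated with a small Python sketch. It is only an approximation for discussion, not the logic in Emacs's coding.c: the function name `detect_eol` and the labels it returns are assumptions. It reports "dos" only when every line break in the data is CR LF, which is the kind of consistency the message above refers to.

```python
def detect_eol(data: bytes) -> str:
    """Classify the line-ending convention used in data.

    Returns "dos" (all CR LF), "unix" (all LF), "mac" (all CR),
    "mixed" (more than one convention) or "none" (no line breaks).
    """
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf   # LF not preceded by CR
    lone_cr = data.count(b"\r") - crlf   # CR not followed by LF

    kinds = [name for name, n in
             (("dos", crlf), ("unix", lone_lf), ("mac", lone_cr)) if n]
    if not kinds:
        return "none"
    if len(kinds) == 1:
        return kinds[0]
    return "mixed"


dos_sample = b"File: efaq\r\nNode: Top\r\n"
mixed_sample = b"File: efaq\r\nNode: Top\n"
results = (detect_eol(dos_sample), detect_eol(mixed_sample))
```

A file that yields "dos" could safely be decoded with a DOS coding system; a "mixed" result is the case where showing ^M would be expected.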






* Re: ^M in the info files
  2008-07-09  4:33           ` Kenichi Handa
  2008-07-09  9:15             ` Jason Rumney
@ 2008-07-09 10:02             ` Juanma Barranquero
  2008-07-09 14:54               ` "no-conversion" coding system (was: ^M in the info files) Stefan Monnier
  1 sibling, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-07-09 10:02 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On Wed, Jul 9, 2008 at 06:33, Kenichi Handa <handa@m17n.org> wrote:

> What is the coding system of emacs/info/efaq when you visit
> that file directly?

Both when I visit it directly and when I access it through Info,
buffer-file-coding-system is `no-conversion'.

> Hmmm, I tried with a ChangeLog of that encoding, but still
> can't reproduce the bug.

As I said, neither can I reliably.

  Juanma





* Re: ^M in the info files
  2008-07-09  9:15             ` Jason Rumney
@ 2008-07-09 11:16               ` Kenichi Handa
  2008-07-09 16:49                 ` Stefan Monnier
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-07-09 11:16 UTC (permalink / raw)
  To: Jason Rumney; +Cc: lekktu, emacs-devel

In article <487481B2.3090302@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:

> Kenichi Handa wrote:
> > In article <f7ccd24b0807081956o1c443d54k64cafc104d850aba@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> > Oops, sorry.  But, I can't see the ^M problem in info files.
> > Actually, on Windows, when I type C-h i, I get this error:
> > 
> >   Can't find the Info directory node
> > 
> > But, when I type C-u C-h i ~/emacs/info/efaq RET (I built
> > Emacs under ~/emacs),  I see no '^M's.

> Perhaps the version of makeinfo you have is not generating files with
> DOS line ends. Mine is,

It seems so.

> and even opening info/efaq with C-x C-f shows
> the ^M characters, though I don't see any inconsistencies that should
> cause line-end detection to fail and Emacs 22.2 opens it correctly with
> a DOS coding system.

I've just found that the file "efaq" contains null-bytes
('\0') after "Concept Index\n********\n\n".  "viper" also
contains null-bytes.  So, Emacs detects that those files are
binary.  Hmmm, what should we do?  Make a new variable
inhibit-null-byte-detection (analogous to
inhibit-iso-escape-detection)?  Or, make the null-byte
detection code check the percentage of null bytes, and
conclude that the file is binary only when the percentage is
higher than some threshold?
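The second option, a percentage threshold, can be sketched in Python as follows. This is a hypothetical illustration, not the detection code in coding.c: the function name `looks_binary`, the 1% threshold, and the sample data are all assumptions chosen for the example.

```python
def looks_binary(data: bytes, threshold: float = 0.0) -> bool:
    """Return True if the fraction of NUL bytes in data exceeds threshold.

    With the default threshold of 0.0, a single NUL byte is enough,
    which mirrors the "any null byte means binary" behaviour.
    """
    if not data:
        return False
    ratio = data.count(0) / len(data)
    return ratio > threshold


# Info-like text containing one index marker (two NUL bytes in total).
sample = (b"Concept Index\n*************\n\n"
          b"\x00\x08[index\x00\x08]\n" + b"* Menu:\n" * 50)

strict = looks_binary(sample)          # any NUL byte counts as binary
lenient = looks_binary(sample, 0.01)   # binary only above 1% NUL bytes
```

With the strict rule the sample is classified as binary; with the 1% threshold it is treated as text. Where the cut-off should sit is exactly the open question raised above.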

---
Kenichi Handa
handa@ni.aist.go.jp





* "no-conversion" coding system (was: ^M in the info files)
  2008-07-09 10:02             ` Juanma Barranquero
@ 2008-07-09 14:54               ` Stefan Monnier
  2008-07-09 23:31                 ` Kenichi Handa
  2008-07-09 23:57                 ` Richard M Stallman
  0 siblings, 2 replies; 71+ messages in thread
From: Stefan Monnier @ 2008-07-09 14:54 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel, Kenichi Handa

>> What is the coding system of emacs/info/efaq when you visit
>> that file directly?

> Both when I visit it directly and when I access it through Info,
> buffer-file-coding-system is `no-conversion'.

BTW, could we get rid of `no-conversion' (or at least make it obsolete
and deprecated)?

I mean, this is a misnomer which gives the illusion that there is such
a thing as "no conversion", even though in reality there always is some
conversion going on.  Better use the name `binary', I think.


        Stefan





* Re: ^M in the info files
  2008-07-09 11:16               ` Kenichi Handa
@ 2008-07-09 16:49                 ` Stefan Monnier
  2008-07-09 17:58                   ` James Cloos
  2008-07-10 11:17                   ` Kenichi Handa
  0 siblings, 2 replies; 71+ messages in thread
From: Stefan Monnier @ 2008-07-09 16:49 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lekktu, emacs-devel, Jason Rumney

> I've just found that the file "efaq" contains null-bytes
> ('\0') after "Concept Index\n********\n\n".  "viper" also
> contains null-bytes.

Why is that?  Isn't it an error in those files?

> So, Emacs detects that those files are binary.

Does info.el need to use Emacs's auto-detection machinery?  I mean don't
Info files come with their own scheme to specify the encoding used?


        Stefan





* Re: ^M in the info files
  2008-07-09 16:49                 ` Stefan Monnier
@ 2008-07-09 17:58                   ` James Cloos
  2008-07-09 20:19                     ` Juri Linkov
  2008-07-10 11:17                   ` Kenichi Handa
  1 sibling, 1 reply; 71+ messages in thread
From: James Cloos @ 2008-07-09 17:58 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, emacs-devel, Jason Rumney, Kenichi Handa

>>>>> "Stefan" == Stefan Monnier <monnier@iro.umontreal.ca> writes:

>> I've just found that the file "efaq" contains null-bytes
>> ('\0') after "Concept Index\n********\n\n".  "viper" also
>> contains null-bytes.

Stefan> Why is that?  Isn't it an error in those files?

There are *many* such examples in my info directory.  230 out of 856
files in my /usr/share/info have them.  It looks like all of them
have one or more instances of the line (in cat --show-all syntax):

^@^H[index^@^H]$

In the case of faq.texi it is from the code:

@node Concept index,  , Mail and news, Top
@unnumbered Concept Index
@printindex cp

@contents

I have texinfo-4.12 installed, but some of the relevant files date back
as far as 2004/August, so it isn't just a recent thing.
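A survey like the one above (counting files in an Info directory that contain null bytes) could be reproduced with a short Python sketch. The helper names are assumptions, and the handling of .bz2 and .gz suffixes is a guess at how the files are stored on a typical system; only the marker bytes come from the message.

```python
import bz2
import gzip
from pathlib import Path
from typing import List

# The index marker shown above in `cat --show-all` notation: ^@^H[index^@^H]
INDEX_MARKER = b"\x00\x08[index\x00\x08]"


def read_info_bytes(path: Path) -> bytes:
    """Read an Info file, decompressing it if the suffix says so."""
    raw = path.read_bytes()
    if path.suffix == ".bz2":
        return bz2.decompress(raw)
    if path.suffix == ".gz":
        return gzip.decompress(raw)
    return raw


def files_with_nul(directory: Path) -> List[str]:
    """Return the sorted names of files in directory that contain a NUL byte."""
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and b"\x00" in read_info_bytes(p)
    )
```

Calling `files_with_nul(Path("/usr/share/info"))` would list the affected files on a system laid out like the one described; the path is taken from the message and may differ elsewhere.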

From my most recent compile of emacs, it affects:

emacs-mime.info.bz2
gnus-5.info.bz2
message.info.bz2
pgg.info.bz2
sasl.info.bz2
sieve.info.bz2

-JimC
-- 
James Cloos <cloos@jhcloos.com>         OpenPGP: 1024D/ED7DAEA6





* Re: ^M in the info files
  2008-07-09 17:58                   ` James Cloos
@ 2008-07-09 20:19                     ` Juri Linkov
  2008-07-14 11:44                       ` Kenichi Handa
  0 siblings, 1 reply; 71+ messages in thread
From: Juri Linkov @ 2008-07-09 20:19 UTC (permalink / raw)
  To: James Cloos
  Cc: lekktu, Jason Rumney, Stefan Monnier, Kenichi Handa, emacs-devel

> There are *many* such examples in my info directory.  230 out of 856
> files in my /usr/share/info have them.  It looks like all of them
> have one or more instances of the line (in cat --show-all syntax):
>
> ^@^H[index^@^H]$
>
> In the case of faq.texi it is from the code:
>
> @node Concept index,  , Mail and news, Top
> @unnumbered Concept Index
> @printindex cp
>
> @contents
>
> I have texinfo-4.12 installed, but some of the relevant files date back
> as far as 2004/August, so it isn't just a recent thing.

This is an old feature, so I wonder why it starts causing problems
just now.

It is also used for embedded images in Info files:

  ^@^H[image src="BINARYFILE" text="TXTFILE" alt="ALTTEXT ... ^@^H]

so with a lot of images the percentage of null-bytes may be too high.
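To make the tag layout concrete, here is a rough Python sketch that pulls such tags out of Info data. The regular expression is an assumption inferred from the two examples quoted in this thread (`^@^H[index^@^H]` and the image tag); it is not taken from the Texinfo or info.el sources, and the sample data is invented.

```python
import re

# ^@^H[ KIND BODY ^@^H]  where ^@ is NUL (0x00) and ^H is backspace (0x08).
TAG_RE = re.compile(rb"\x00\x08\[(\w+)(.*?)\x00\x08\]", re.DOTALL)


def find_tags(data: bytes):
    """Return a list of (kind, body) pairs for each tag found in data."""
    return [
        (m.group(1).decode("ascii"),
         m.group(2).strip().decode("ascii", "replace"))
        for m in TAG_RE.finditer(data)
    ]


sample = (b"Some text\n"
          b'\x00\x08[image src="fig.png" alt="A figure"\x00\x08]\n'
          b"\x00\x08[index\x00\x08]\n"
          b"* Menu:\n")
tags = find_tags(sample)
nul_count = sample.count(0)
```

Each tag contributes two NUL bytes, so a manual with many images accumulates them quickly, which is the concern about a percentage-based test.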

The Info reader correctly handles the coding cookie, so one solution
is to write it to all Info files, but this does not help with existing
files.

But I still don't understand the problem with ^@^H in Info files.

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: "no-conversion" coding system (was: ^M in the info files)
  2008-07-09 14:54               ` "no-conversion" coding system (was: ^M in the info files) Stefan Monnier
@ 2008-07-09 23:31                 ` Kenichi Handa
  2008-07-09 23:57                 ` Richard M Stallman
  1 sibling, 0 replies; 71+ messages in thread
From: Kenichi Handa @ 2008-07-09 23:31 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, emacs-devel

In article <jwvzlorqfgj.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

>>> What is the coding system of emacs/info/efaq when you visit
>>> that file directly?

> > Both when I visit it directly and when I access it through Info,
> > buffer-file-coding-system is `no-conversion'.

> BTW, could we get rid of `no-conversion' (or at least make it obsolete
> and deprecated)?

> I mean, this is a misnomer which gives the illusion that there is such
> a thing as "no conversion", even though in reality there always is some
> conversion going on.  Better use the name `binary', I think.

When we do C-x C-m c no-conversion RET C-x C-f FILE RET,
"no-conversion" really means "no-conversion" because
FILE is read into a unibyte buffer without any conversion.

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: "no-conversion" coding system (was: ^M in the info files)
  2008-07-09 14:54               ` "no-conversion" coding system (was: ^M in the info files) Stefan Monnier
  2008-07-09 23:31                 ` Kenichi Handa
@ 2008-07-09 23:57                 ` Richard M Stallman
  2008-07-10  0:31                   ` Kenichi Handa
  1 sibling, 1 reply; 71+ messages in thread
From: Richard M Stallman @ 2008-07-09 23:57 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, handa, emacs-devel

    I mean, this is a misnomer which gives the illusion that there is such
    a thing as "no conversion", even though in reality there always is some
    conversion going on.

`no-conversion' used to really mean no conversion.
Why has that changed?





* Re: "no-conversion" coding system (was: ^M in the info files)
  2008-07-09 23:57                 ` Richard M Stallman
@ 2008-07-10  0:31                   ` Kenichi Handa
  2008-07-10  1:12                     ` "no-conversion" coding system Stefan Monnier
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-07-10  0:31 UTC (permalink / raw)
  To: rms; +Cc: lekktu, monnier, emacs-devel

In article <E1KGjXh-0000U4-Jl@fencepost.gnu.org>, Richard M Stallman <rms@gnu.org> writes:

>     I mean, this is a misnomer which gives the illusion that there is such
>     a thing as "no conversion", even though in reality there always is some
>     conversion going on.

> `no-conversion' used to really mean no conversion.
> Why has that changed?

Nothing has changed.  When you insert a file in a multibyte
buffer with no-conversion, each eight-bit code is converted
to a special 2-byte form that represents an eight-bit character.

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: "no-conversion" coding system
  2008-07-10  0:31                   ` Kenichi Handa
@ 2008-07-10  1:12                     ` Stefan Monnier
  2008-07-14 23:19                       ` Eli Zaretskii
  0 siblings, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2008-07-10  1:12 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lekktu, rms, emacs-devel

>> I mean, this is a misnomer which gives the illusion that there is such
>> a thing as "no conversion", even though in reality there always is some
>> conversion going on.

>> `no-conversion' used to really mean no conversion.
>> Why has that changed?

Technically, that may be true.  But take a utf-8 text and open it with
"no-conversion" and it won't look like the same text, so for some
interpretation of "conversion", it has been converted.
I.e. it's a name that leads to confusion.  Its other name "binary" is
a lot more unequivocal.
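The point about UTF-8 text "not looking like the same text" can be shown with a few lines of Python. This is only an analogy: decoding as latin-1 is used here as a stand-in for viewing each byte as one character, and it is not how Emacs represents raw bytes internally.

```python
text = "café"
raw = text.encode("utf-8")           # the bytes as stored on disk

as_utf8 = raw.decode("utf-8")        # decoded: four characters
as_bytes = raw.decode("latin-1")     # byte-per-character view: five characters

# The bytes themselves are untouched in both cases.
round_trip = as_bytes.encode("latin-1")
```

Here `as_utf8` is the original word while `as_bytes` shows two characters in place of the final letter, even though `round_trip` is byte-for-byte identical to `raw`. No byte was converted, yet the reader sees different text.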


        Stefan





* Re: ^M in the info files
  2008-07-09 16:49                 ` Stefan Monnier
  2008-07-09 17:58                   ` James Cloos
@ 2008-07-10 11:17                   ` Kenichi Handa
  2008-07-10 16:02                     ` Stefan Monnier
  1 sibling, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-07-10 11:17 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, emacs-devel, jasonr

In article <jwvtzezovh5.fsf-monnier+emacs@gnu.org>, Stefan Monnier <monnier@iro.umontreal.ca> writes:

> Does info.el need to use Emacs's auto-detection machinery?  I mean don't
> Info files come with their own scheme to specify the encoding used?

FYI.  Only gnus.texi and emacs-mime.texi have
@documentencoding directive along with coding: tag.  As a
result (or not, I'm not sure), the resulting info files
"gnus" and "emacs-mime" contain coding: tag.

And, for instance, faq.texi has @today{} directive, and it
seems that makeinfo generates a date string according to the
current locale (and thus results in non-ASCII characters).

---
Kenichi Handa
handa@ni.aist.go.jp





* Re: ^M in the info files
  2008-07-10 11:17                   ` Kenichi Handa
@ 2008-07-10 16:02                     ` Stefan Monnier
  2008-07-10 18:42                       ` Juri Linkov
  0 siblings, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2008-07-10 16:02 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lekktu, emacs-devel, jasonr

>> Does info.el need to use Emacs's auto-detection machinery?  I mean don't
>> Info files come with their own scheme to specify the encoding used?

> FYI.  Only gnus.texi and emacs-mime.texi have
> @documentencoding directive along with coding: tag.  As a
> result (or not, I'm not sure), the resulting info files
> "gnus" and "emacs-mime" contain coding: tag.

But the others are pure ASCII aren't they (isn't that what it means for
an Info file not to have a "coding:" tag)?

> And, for instance, faq.texi has @today{} directive, and it
> seems that makeinfo generates a date string according to the
> current locale (and thus results in non-ASCII characters).

Isn't that a bug in makeinfo?


        Stefan





* Re: ^M in the info files
  2008-07-10 16:02                     ` Stefan Monnier
@ 2008-07-10 18:42                       ` Juri Linkov
  2008-07-10 20:27                         ` Stefan Monnier
  0 siblings, 1 reply; 71+ messages in thread
From: Juri Linkov @ 2008-07-10 18:42 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, emacs-devel, jasonr, Kenichi Handa

>>> Does info.el need to use Emacs's auto-detection machinery?  I mean don't
>>> Info files come with their own scheme to specify the encoding used?
>
>> FYI.  Only gnus.texi and emacs-mime.texi have
>> @documentencoding directive along with coding: tag.  As a
>> result (or not, I'm not sure), the resulting info files
>> "gnus" and "emacs-mime" contain coding: tag.
>
> But the others are pure ASCII aren't they (isn't that what it means for
> an Info file not to have a "coding:" tag)?

If an Info file has no "coding:" tag, then the Info reader uses
Emacs's auto-detection machinery.

But maybe in this case we should force some safe coding like `undecided'
(as it seems to use now for pure ASCII Info files without null-bytes)?

>> And, for instance, faq.texi has @today{} directive, and it
>> seems that makeinfo generates a date string according to the
>> current locale (and thus results in non-ASCII characters).
>
> Isn't that a bug in makeinfo?

IIUC, this is an intentional feature.  But we could run it with e.g.
`LANG=C makeinfo' in Makefiles.
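Outside a Makefile, the same idea can be sketched in Python as a wrapper that runs makeinfo with a C locale. The function names and file names are hypothetical. Setting LC_ALL as well as LANG is an addition beyond what is suggested above, made because LC_ALL takes precedence over LANG when it is set in the environment.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def makeinfo_invocation(texi: str, output: str) -> Tuple[List[str], Dict[str, str]]:
    """Build the command line and environment for a C-locale makeinfo run."""
    env = dict(os.environ)
    env["LANG"] = "C"
    env["LC_ALL"] = "C"   # assumption: also override LC_ALL, which beats LANG
    cmd = ["makeinfo", "-o", output, texi]
    return cmd, env


def run_makeinfo(texi: str, output: str) -> None:
    """Run makeinfo; raises CalledProcessError if it exits with an error."""
    cmd, env = makeinfo_invocation(texi, output)
    subprocess.run(cmd, env=env, check=True)


# Hypothetical file names, for illustration only.
cmd, env = makeinfo_invocation("faq.texi", "efaq")
```

With this environment, a locale-dependent date string would be generated in the C locale and therefore stay within ASCII.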

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: ^M in the info files
  2008-07-10 18:42                       ` Juri Linkov
@ 2008-07-10 20:27                         ` Stefan Monnier
  2008-07-10 20:47                           ` Juri Linkov
  2008-07-19 22:29                           ` Eli Zaretskii
  0 siblings, 2 replies; 71+ messages in thread
From: Stefan Monnier @ 2008-07-10 20:27 UTC (permalink / raw)
  To: Juri Linkov; +Cc: lekktu, emacs-devel, jasonr, Kenichi Handa

>>>> Does info.el need to use Emacs's auto-detection machinery?  I mean don't
>>>> Info files come with their own scheme to specify the encoding used?
>> 
>>> FYI.  Only gnus.texi and emacs-mime.texi have
>>> @documentencoding directive along with coding: tag.  As a
>>> result (or not, I'm not sure), the resulting info files
>>> "gnus" and "emacs-mime" contain coding: tag.
>> 
>> But the others are pure ASCII aren't they (isn't that what it means for
>> an Info file not to have a "coding:" tag)?

> If an Info file has no "coding:" tag, then the Info reader uses
> Emacs's auto-detection machinery.

I'm not talking about what Emacs does but about what the Info format
"specifies".  IIUC the Info format is always ASCII unless explicitly
specified by a coding: tag.  Of course, you can have an Info file
without a coding: tag that uses non-ASCII chars in some encoding, but
IIUC this has never been considered as valid (from TeXinfo's point
of view).

> But maybe in this case we should force some safe coding like `undecided'
> (as it seems to use now for pure ASCII Info files without null-bytes)?

No, I think it should use `us-ascii' instead.
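The rule under discussion (use the coding: tag when present, otherwise fall back to a fixed default) might look like this as a Python sketch. It is not the code in info.el: the function name, the regular expression, and the assumption that the tag sits in a trailing "Local Variables:" block are all illustrative.

```python
import re

# A "coding: NAME" line, tolerant of trailing whitespace or a trailing CR.
CODING_RE = re.compile(rb"^coding:\s*([A-Za-z0-9_.-]+)\s*$", re.MULTILINE)


def info_encoding(data: bytes, default: str = "us-ascii") -> str:
    """Return the encoding named by a coding: tag, or default if none."""
    start = data.rfind(b"Local Variables:")
    if start == -1:
        return default
    match = CODING_RE.search(data, start)
    return match.group(1).decode("ascii") if match else default


tagged = b"...text...\n\x1f\nLocal Variables:\ncoding: iso-8859-1\nEnd:\n"
untagged = b"...text...\n\x1f\nTag Table:\n\x1f\nEnd Tag Table\n"

encodings = (info_encoding(tagged), info_encoding(untagged))
```

The design choice is in the `default` argument: passing "us-ascii" matches the proposal here, while falling back to auto-detection matches the behaviour described in the quoted text.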

>>> And, for instance, faq.texi has @today{} directive, and it
>>> seems that makeinfo generates a date string according to the
>>> current locale (and thus results in non-ASCII characters).
>> 
>> Isn't that a bug in makeinfo?

> IIUC, this is an intentional feature.  But we could run it with e.g.
> `LANG=C makeinfo' in Makefiles.

What happens if your TeXinfo file specifies a latin-1 encoding and your
date is output in utf-8 because of your locale, then?


        Stefan





* Re: ^M in the info files
  2008-07-10 20:27                         ` Stefan Monnier
@ 2008-07-10 20:47                           ` Juri Linkov
  2008-07-10 22:11                             ` Stefan Monnier
  2008-07-19 22:29                           ` Eli Zaretskii
  1 sibling, 1 reply; 71+ messages in thread
From: Juri Linkov @ 2008-07-10 20:47 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, emacs-devel, jasonr, Kenichi Handa

> I'm not talking about what Emacs does but about what the Info format
> "specifies".  IIUC the Info format is always ASCII unless explicitly
> specified by a coding: tag.  Of course, you can have an Info file
> without a coding: tag that uses non-ASCII chars in some encoding, but
> IIUC this has never been considered as valid (from TeXinfo's point
> of view).

Yes, it seems the default coding for Info files is supposed to be
US-ASCII unless overridden by @documentencoding.

>>>> And, for instance, faq.texi has @today{} directive, and it
>>>> seems that makeinfo generates a date string according to the
>>>> current locale (and thus results in non-ASCII characters).
>>>
>>> Isn't that a bug in makeinfo?
>
>> IIUC, this is an intentional feature.  But we could run it with e.g.
>> `LANG=C makeinfo' in Makefiles.
>
> What happens if your TeXinfo file specifies a latin-1 encoding and your
> date is output in utf-8 because of your locale, then?

Then the result of @today is displayed as garbage.

But we can adjust Makefile to use the correct LANG for every Info manual
that specifies a non-default @documentencoding.

-- 
Juri Linkov
http://www.jurta.org/emacs/





* Re: ^M in the info files
  2008-07-10 20:47                           ` Juri Linkov
@ 2008-07-10 22:11                             ` Stefan Monnier
  0 siblings, 0 replies; 71+ messages in thread
From: Stefan Monnier @ 2008-07-10 22:11 UTC (permalink / raw)
  To: Juri Linkov; +Cc: lekktu, emacs-devel, jasonr, Kenichi Handa

>> What happens if your TeXinfo file specifies a latin-1 encoding and your
>> date is output in utf-8 because of your locale, then?

> Then the result of @today is displayed as garbage.

Again, my question is not how it's displayed but what is the content of
the resulting Info file.  If it's a mix of the two encodings, then it's
a bug in makeinfo (call it misfeature if you want) and I see no need to
try and support it in Emacs.
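What "a mix of the two encodings" means for the file contents can be seen in a small Python sketch. The strings are invented for illustration; the thread does not show actual makeinfo output.

```python
# Document body encoded as latin-1, as the Texinfo source would declare.
body = "Éditeur".encode("latin-1")

# Date string emitted in UTF-8 because of the build locale.
date = "10 août 2008".encode("utf-8")

mixed = body + b" " + date

try:
    mixed.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# latin-1 accepts any byte sequence, but the date comes out garbled.
as_latin1 = mixed.decode("latin-1")
```

The combined bytes are not valid UTF-8, and decoding them as latin-1 renders the body correctly but not the date. No single encoding reads the whole file, which is why no coding: tag could describe it accurately.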

> But we can adjust Makefile to use the correct LANG for every Info manual
> that specifies a non-default @documentencoding.

That might be a good workaround.


        Stefan





* Re: ^M in the info files
  2008-07-09 20:19                     ` Juri Linkov
@ 2008-07-14 11:44                       ` Kenichi Handa
  2008-07-21 11:18                         ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-07-14 11:44 UTC (permalink / raw)
  To: Juri Linkov; +Cc: lekktu, emacs-devel, monnier, cloos, jasonr

In article <87k5fu94vx.fsf@jurta.org>, Juri Linkov <juri@jurta.org> writes:

> > I have texinfo-4.12 installed, but some of the relevant files date back
> > as far as 2004/August, so it isn't just a recent thing.

> This is an old feature, so I wonder why it starts causing problems
> just now.

I believe that it's because of the introduction of null-byte
detection code, but it was done in this April.  What I don't
understand is that....

Juanma wrote:

>>> On Wed, Jun 11, 2008 at 6:13 AM, Lennart Borgman (gmail)
>>> <lennart.borgman@gmail.com> wrote:
>>>> I see a lot of ^M in the info files on w32, CVS from today. For example in
>>>> these files
> >
> > Apparently related to these changes:
> >
> > 2008-06-05  Kenichi Handa  <handa@m17n.org>
> >
> >        * coding.c (detect_coding): Fix previous change.
> >        (detect_coding_system): Likewise.
> >
> > 2008-06-04  Kenichi Handa  <handa@m17n.org>
> >
> >        * coding.c (detect_coding): Fix handling of coding->head_ascii.
> >        Be sure to call setup_coding_system when we find a proper coding system.
> >        (detect_coding_system): Fix handling of coding->head_ascii.
> >
> > Removing them fixes the problem for me.
> >
> >  Juanma

I've just checked out the Emacs of 2008-06-03 (it doesn't
have the above change), and still info/efaq is detected as
no-conversion.

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: "no-conversion" coding system
  2008-07-10  1:12                     ` "no-conversion" coding system Stefan Monnier
@ 2008-07-14 23:19                       ` Eli Zaretskii
  2008-07-15  1:39                         ` Stefan Monnier
  2008-07-15  6:34                         ` Stephen J. Turnbull
  0 siblings, 2 replies; 71+ messages in thread
From: Eli Zaretskii @ 2008-07-14 23:19 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, emacs-devel, rms, handa

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Wed, 09 Jul 2008 21:12:41 -0400
> Cc: lekktu@gmail.com, rms@gnu.org, emacs-devel@gnu.org
> 
> Technically, that may be true.  But take a utf-8 text and open it with
> "no-conversion" and it won't look like the same text, so for some
> interpretation of "conversion", it has been converted.

no-conversion doesn't mean that the text will _look_ the same, it
means the byte stream will be the same.

> Its other name "binary" is a lot more unequivocal.

Only if you are a programmer who knows that binary files are usually
read with no conversions.  Otherwise, the name "binary" doesn't have
any useful mnemonic meaning in the context of transforming one text
encoding into another.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: "no-conversion" coding system
  2008-07-14 23:19                       ` Eli Zaretskii
@ 2008-07-15  1:39                         ` Stefan Monnier
  2008-07-20 14:04                           ` Eli Zaretskii
  2008-07-15  6:34                         ` Stephen J. Turnbull
  1 sibling, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2008-07-15  1:39 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, emacs-devel, rms, handa

>> Its other name "binary" is a lot more unequivocal.
> Only if you are a programmer who knows that binary files are usually
> read with no conversions.  Otherwise, the name "binary" doesn't have

Since the `no-conversion' method depends on the underlying byte-encoding
used in a file, it is inherently something for programmers,
bit-twiddlers, ...


        Stefan




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: "no-conversion" coding system
  2008-07-14 23:19                       ` Eli Zaretskii
  2008-07-15  1:39                         ` Stefan Monnier
@ 2008-07-15  6:34                         ` Stephen J. Turnbull
  2008-07-20 14:11                           ` Eli Zaretskii
  1 sibling, 1 reply; 71+ messages in thread
From: Stephen J. Turnbull @ 2008-07-15  6:34 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, handa, Stefan Monnier, rms, emacs-devel

Eli Zaretskii writes:
 > > From: Stefan Monnier <monnier@iro.umontreal.ca>
 > > Date: Wed, 09 Jul 2008 21:12:41 -0400
 > > Cc: lekktu@gmail.com, rms@gnu.org, emacs-devel@gnu.org
 > > 
 > > Technically, that may be true.  But take a utf-8 text and open it with
 > > "no-conversion" and it won't look like the same text, so for some
 > > interpretation of "conversion", it has been converted.

Irrelevant.  You can achieve the same kind of confusion with `emacs
-font'. :-)

 > no-conversion doesn't mean that the text will _look_ the same, it
 > means the byte stream will be the same.

That's true as a definition, and incorrect as a matter of fact in
multibyte buffers.  Unless Emacs enforces "no-conversion is unibyte",
"no-conversion" should die, and split its estate between "raw-text"
(where newline conventions are meaningful) and "binary" (or
"raw-bytes") for non-text files.

 > > Its other name "binary" is a lot more unequivocal.
 > 
 > Only if you are a programmer who knows that binary files are usually
 > read with no conversions.  Otherwise, the name "binary" doesn't have
 > any useful mnemonic meaning in the context of transforming one text
 > encoding into another.

Binary isn't a text encoding, and therefore it's entirely appropriate
that it lack a mnemonic meaning in the context of text encoding.<wink>

In any case, only Emacs maintainers should ever need to care about
the representational transformations performed by coding systems, and
they all do know what binary means.  Everybody else just needs to know
the magic spell that turns mojibake into language; even if they did
know about representations, they can't do anything about it.  (I guess
that's not strictly so in Emacs, but it should be. ;-)




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-07-10 20:27                         ` Stefan Monnier
  2008-07-10 20:47                           ` Juri Linkov
@ 2008-07-19 22:29                           ` Eli Zaretskii
  2008-07-21  4:57                             ` Stefan Monnier
  1 sibling, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2008-07-19 22:29 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: juri, lekktu, jasonr, handa, emacs-devel

> From: Stefan Monnier <monnier@IRO.UMontreal.CA>
> Date: Thu, 10 Jul 2008 16:27:09 -0400
> Cc: lekktu@gmail.com, emacs-devel@gnu.org, jasonr@gnu.org,
> 	Kenichi Handa <handa@m17n.org>
> 
> > But maybe in this case we should force some safe coding like `undecided'
> > (as it seems to use now for pure ASCII Info files without null-bytes)?
> 
> No, I think it should use `us-ascii' instead.

That is almost certainly a bad idea: no need to punish users who have
non-ASCII Info files without a proper `coding' tag.  The
@documentencoding feature is relatively new in Texinfo, and it's
clumsy to use (you need to specify --enable-encoding on the command
line, or else @documentencoding is ignored), so I expect quite a few
such Info files to be out there.

> What happens if your TeXinfo file specifies a latin-1 encoding and your
> date is output in utf-8 because of your locale, then?

You get a garbled file.  In general, Texinfo's l10n features don't
work well if @documentencoding doesn't match the current locale's
encoding.

Btw, the official name of the project is Texinfo, not TeXinfo.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: "no-conversion" coding system
  2008-07-15  1:39                         ` Stefan Monnier
@ 2008-07-20 14:04                           ` Eli Zaretskii
  0 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2008-07-20 14:04 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: lekktu, handa, rms, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Mon, 14 Jul 2008 21:39:35 -0400
> Cc: lekktu@gmail.com, emacs-devel@gnu.org, rms@gnu.org, handa@m17n.org
> 
> >> Its other name "binary" is a lot more unequivocal.
> > Only if you are a programmer who knows that binary files are usually
> > read with no conversions.  Otherwise, the name "binary" doesn't have
> 
> Since the `no-conversion' method depends on the underlying byte-encoding
> used in a file, it is inherently something for programmers,
> bit-twiddlers, ...

I didn't mean Lisp programmers, I meant C/C++ programmers.  Many
people program in Lisp without knowing anything about low-level OS
stuff.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: "no-conversion" coding system
  2008-07-15  6:34                         ` Stephen J. Turnbull
@ 2008-07-20 14:11                           ` Eli Zaretskii
  0 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2008-07-20 14:11 UTC (permalink / raw)
  To: Stephen J. Turnbull; +Cc: lekktu, handa, monnier, rms, emacs-devel

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>,
>     lekktu@gmail.com,
>     emacs-devel@gnu.org,
>     rms@gnu.org,
>     handa@m17n.org
> Date: Tue, 15 Jul 2008 15:34:13 +0900
> 
>  > no-conversion doesn't mean that the text will _look_ the same, it
>  > means the byte stream will be the same.
> 
> That's true as a definition, and incorrect as a matter of fact in
> multibyte buffers.  Unless Emacs enforces "no-conversion is unibyte",

It does, as a matter of fact.

> In any case, only Emacs maintainers should ever need to care about
> the representational transformations performed by coding systems

I most strongly disagree, but maybe I don't understand what you mean
by ``representational transformations''.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-07-19 22:29                           ` Eli Zaretskii
@ 2008-07-21  4:57                             ` Stefan Monnier
  2008-07-21 15:08                               ` Eli Zaretskii
  0 siblings, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2008-07-21  4:57 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: juri, lekktu, jasonr, handa, emacs-devel

>> > But maybe in this case we should force some safe coding like `undecided'
>> > (as it seems to use now for pure ASCII Info files without null-bytes)?
>> No, I think it should use `us-ascii' instead.

> That is almost certainly a bad idea: no need to punish users who have
> non-ASCII Info files without a proper `coding' tag.

How common is that?

>> What happens if your TeXinfo file specifies a latin-1 encoding and your
>> date is output in utf-8 because of your locale, then?
> You get a garbled file.

Good, so we don't need to worry about this case since it's broken anyway.

> Btw, the official name of the project is Texinfo, not TeXinfo.

Yes, sorry, I actually know it, but old habits die hard,


        Stefan




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-07-14 11:44                       ` Kenichi Handa
@ 2008-07-21 11:18                         ` Juanma Barranquero
  0 siblings, 0 replies; 71+ messages in thread
From: Juanma Barranquero @ 2008-07-21 11:18 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Juri Linkov, emacs-devel, monnier, cloos, jasonr

On Mon, Jul 14, 2008 at 13:44, Kenichi Handa <handa@m17n.org> wrote:

> I've just checked out the Emacs of 2008-06-03 (it doesn't
> have the above change), and still info/efaq is detected as
> no-conversion.

With an up-to-date CVS Emacs, reverting the patches for revisions
1.388 and 1.389 of coding.c (and with no other change), the problem
does not happen on Windows. There are no spurious ^Ms, and info/efaq
is detected as undecided-dos.

  Juanma




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-07-21  4:57                             ` Stefan Monnier
@ 2008-07-21 15:08                               ` Eli Zaretskii
  2008-07-21 18:20                                 ` Stefan Monnier
  0 siblings, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2008-07-21 15:08 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: juri, lekktu, jasonr, handa, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: juri@jurta.org,  lekktu@gmail.com,  emacs-devel@gnu.org,  jasonr@gnu.org,  handa@m17n.org
> Date: Mon, 21 Jul 2008 00:57:02 -0400
> 
> >> > But maybe in this case we should force some safe coding like `undecided'
> >> > (as it seems to use now for pure ASCII Info files without null-bytes)?
> >> No, I think it should use `us-ascii' instead.
> 
> > That is almost certainly a bad idea: no need to punish users who have
> > non-ASCII Info files without a proper `coding' tag.
> 
> How common is that?

Quite common, especially in Latin-n world.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-07-21 15:08                               ` Eli Zaretskii
@ 2008-07-21 18:20                                 ` Stefan Monnier
  0 siblings, 0 replies; 71+ messages in thread
From: Stefan Monnier @ 2008-07-21 18:20 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: juri, lekktu, jasonr, handa, emacs-devel

>> >> > But maybe in this case we should force some safe coding like `undecided'
>> >> > (as it seems to use now for pure ASCII Info files without null-bytes)?
>> >> No, I think it should use `us-ascii' instead.
>> 
>> > That is almost certainly a bad idea: no need to punish users who have
>> > non-ASCII Info files without a proper `coding' tag.
>> 
>> How common is that?

> Quite common, especially in Latin-n world.

If you say so.  It's definitely not common in my part of the
latin-1 world.


        Stefan




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-06 13:14   ` Juanma Barranquero
@ 2008-08-07  0:48     ` Kenichi Handa
  2008-08-07  3:20       ` Eli Zaretskii
                         ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Kenichi Handa @ 2008-08-07  0:48 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

I changed the Subject:

In article <f7ccd24b0808060614h4917f240oc261b8fedff0e8d0@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> > The problem is the newly introduced null-byte detection
> > mechanism, but the discussion diverged to the encoding of
> > non-ascii characters, and for the original problem of
> > null-byte in info file, no one replied to the question in
> > this mail:
> >  http://lists.gnu.org/archive/html/emacs-devel/2008-07/msg00225.html

> How would inhibit-null-byte-detection work for info files?

For instance, by modifying info-insert-file-contents to bind
that variable to t while reading a file.
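
The idea can be sketched in C as follows. This is only an illustration under stated assumptions, not the code in coding.c: `looks_binary` and `looks_binary_for_info` are hypothetical names, and the save/restore pair merely imitates what a Lisp `let' binding around the file read would do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Named after the variable discussed in this thread; everything else
   in this sketch is a hypothetical stand-in.  */
static int inhibit_null_byte_detection = 0;

/* Return 1 if the buffer should be treated as binary data
   (i.e. read with no-conversion), 0 if ordinary text detection
   should proceed.  */
int
looks_binary (const unsigned char *buf, size_t len)
{
  assert (buf != NULL || len == 0);
  if (inhibit_null_byte_detection)
    return 0;               /* null bytes are ignored on request */
  return len > 0 && memchr (buf, 0, len) != NULL;
}

/* Imitate a dynamic binding: set the flag for the duration of one
   call, then restore whatever value it had before.  */
int
looks_binary_for_info (const unsigned char *buf, size_t len)
{
  int saved = inhibit_null_byte_detection;
  int result;

  inhibit_null_byte_detection = 1;
  result = looks_binary (buf, len);
  inhibit_null_byte_detection = saved;
  return result;
}
```

The point of restoring the saved value is that detection stays unchanged for every other file; only the Info reader opts out of the null-byte check.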

Eli Zaretskii <eliz@gnu.org> writes:

> I think the best solution would be that Emacs never tries to detect
> whether Info files are binary.

Do you have any idea to achieve that?

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-07  0:48     ` ^M in the info files Kenichi Handa
@ 2008-08-07  3:20       ` Eli Zaretskii
  2008-08-07  3:22       ` Juanma Barranquero
  2008-08-07 20:18       ` Stefan Monnier
  2 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2008-08-07  3:20 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lekktu, emacs-devel

> From: Kenichi Handa <handa@m17n.org>
> Date: Thu, 07 Aug 2008 09:48:37 +0900
> Cc: emacs-devel@gnu.org
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> 
> > I think the best solution would be that Emacs never tries to detect
> > whether Info files are binary.
> 
> Do you have any idea to achieve that?

Sorry, not yet.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-07  0:48     ` ^M in the info files Kenichi Handa
  2008-08-07  3:20       ` Eli Zaretskii
@ 2008-08-07  3:22       ` Juanma Barranquero
  2008-08-07  3:47         ` Kenichi Handa
  2008-08-07 20:18       ` Stefan Monnier
  2 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-08-07  3:22 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On Thu, Aug 7, 2008 at 02:48, Kenichi Handa <handa@m17n.org> wrote:

> For instance, by modifying info-insert-file-contents to bind
> that variable to t while reading a file.

Aha.

Well, that's a possible solution then. Is there any downside to it?

  Juanma




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-07  3:22       ` Juanma Barranquero
@ 2008-08-07  3:47         ` Kenichi Handa
  2008-08-07 13:01           ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-08-07  3:47 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

In article <f7ccd24b0808062022t71035360h5f120bcade3363e8@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> On Thu, Aug 7, 2008 at 02:48, Kenichi Handa <handa@m17n.org> wrote:
> > For instance, by modifying info-insert-file-contents to bind
> > that variable to t while reading a file.

> Aha.

> Well, that's a possible solution then. Is there any downside to it?

I don't see any bad effect to it.  But, as I'm not familiar
with the code of info, I don't know which place is the best
to bind that variable.

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-07  3:47         ` Kenichi Handa
@ 2008-08-07 13:01           ` Juanma Barranquero
  2008-08-08  1:28             ` Kenichi Handa
  0 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-08-07 13:01 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On Thu, Aug 7, 2008 at 05:47, Kenichi Handa <handa@m17n.org> wrote:

> I don't see any bad effect to it.  But, as I'm not familiar
> with the code of info, I don't know which place is the best
> to bind that variable.

I've tried implementing your suggestion; see the attached patch.

It works for uncompressed info files, and for compressed ones when
auto-compression-mode is off.
However, when auto-compression-mode is on, the conversion depends on
the settings of `file-coding-system-alist' for the given compression
method.

   Juanma



Index: src/coding.c
===================================================================
RCS file: /sources/emacs/emacs/src/coding.c,v
retrieving revision 1.390
diff -u -2 -r1.390 coding.c
--- src/coding.c	9 Jul 2008 13:05:56 -0000	1.390
+++ src/coding.c	7 Aug 2008 10:42:46 -0000
@@ -381,4 +381,7 @@
 int inhibit_iso_escape_detection;

+/* Flag to inhibit detection of binary files through null bytes.  */
+int inhibit_null_byte_detection;
+
 /* Flag to make buffer-file-coding-system inherit from process-coding.  */
 int inherit_process_coding_system;
@@ -5895,5 +5898,5 @@
 	  if (i < coding_category_raw_text)
 	    setup_coding_system (CODING_ID_NAME (this->id), coding);
-	  else if (null_byte_found)
+	  else if (null_byte_found && ! inhibit_null_byte_detection)
 	    setup_coding_system (Qno_conversion, coding);
 	  else if ((detect_info.rejected & CATEGORY_MASK_ANY)
@@ -10233,4 +10236,23 @@
   inhibit_iso_escape_detection = 0;

+  DEFVAR_BOOL ("inhibit-null-byte-detection",
+	       &inhibit_null_byte_detection,
+	       doc: /*
+If non-nil, Emacs ignores null bytes on code detection.
+
+By default, on reading a file, Emacs tries to detect how the text is
+encoded.  This code detection is sensitive to null bytes.  If the
+text contains null bytes, the file is determined as containing
+binary data.
+
+However, there may be a case that you want to read non-binary data
+that contains null bytes.  In such a case, you can set this variable
+to non-nil.
+
+The default value is nil, and it is strongly recommended not to change
+it.  This variable is intended to be bound to t in the few instances
+where that is useful, for example to read certain info files.  */);
+  inhibit_null_byte_detection = 0;
+
   DEFVAR_LISP ("translation-table-for-input", &Vtranslation_table_for_input,
 	       doc: /* Char table for translating self-inserting characters.
Index: lisp/info.el
===================================================================
RCS file: /sources/emacs/emacs/lisp/info.el,v
retrieving revision 1.540
diff -u -2 -r1.540 info.el
--- lisp/info.el	30 Jul 2008 17:16:46 -0000	1.540
+++ lisp/info.el	7 Aug 2008 11:20:36 -0000
@@ -456,5 +456,6 @@
 	    (apply 'call-process-region (point-min) (point-max)
 		   (car decoder) t t nil (cdr decoder))))
-      (insert-file-contents fullname visit))))
+      (let ((inhibit-null-byte-detection t))
+	(insert-file-contents fullname visit)))))
 \f
 (defun Info-default-dirs ()

^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-07  0:48     ` ^M in the info files Kenichi Handa
  2008-08-07  3:20       ` Eli Zaretskii
  2008-08-07  3:22       ` Juanma Barranquero
@ 2008-08-07 20:18       ` Stefan Monnier
  2008-11-28 21:28         ` Drew Adams
  2 siblings, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2008-08-07 20:18 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: Juanma Barranquero, emacs-devel

>> How would inhibit-null-byte-detection work for info files?
> For instance, by modifying info-insert-file-contents to bind
> that variable to t while reading a file.

An alternative to inhibit-null-byte-detection would be something like
inhibit-coding-systems which would contain a list of coding systems (or
maybe coding categories?) we know we don't want to use.


        Stefan




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-07 13:01           ` Juanma Barranquero
@ 2008-08-08  1:28             ` Kenichi Handa
  2008-08-08 11:02               ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-08-08  1:28 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

In article <f7ccd24b0808070601i2897b57ak98d6343b2442f40c@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> On Thu, Aug 7, 2008 at 05:47, Kenichi Handa <handa@m17n.org> wrote:
> > I don't see any bad effect to it.  But, as I'm not familiar
> > with the code of info, I don't know which place is the best
> > to bind that variable.

> I've tried implementing your suggestion; see the attached patch.

Thank you!

> It works for uncompressed info files, and for compressed ones when
> auto-compression-mode is off.
> However, when auto-compression-mode is on, the conversion depends on
> the settings of `file-coding-system-alist' for the given compression
> method.

You handled inhibit_null_byte_detection only in
detect_coding.  Isn't the problem fixed by modifying
detect_coding_system too?  That function does code detection
when called from detect-coding-region/string.

---
Kenichi Handa
handa@ni.aist.go.jp




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-08  1:28             ` Kenichi Handa
@ 2008-08-08 11:02               ` Juanma Barranquero
  2008-08-13 10:08                 ` Kenichi Handa
  0 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-08-08 11:02 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On Fri, Aug 8, 2008 at 03:28, Kenichi Handa <handa@m17n.org> wrote:

> You handled inhibit_null_byte_detection only in
> detect_coding.  Isn't the problem fixed by modifying
> detect_coding_system too?

Well, you give me too much credit; I just did the easiest change that
could possibly work :-) [It didn't]

But AFAICS, the trouble with compressed info files is not because of
null-byte detection. If I gzip efaq and then visit it through Info,
the buffer is detected as utf-8-unix, not no-conversion.

  Juanma




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-08 11:02               ` Juanma Barranquero
@ 2008-08-13 10:08                 ` Kenichi Handa
  2008-08-15 23:09                   ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Kenichi Handa @ 2008-08-13 10:08 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

In article <f7ccd24b0808080402m2d2b958aq9fe7f82a25d7b48d@mail.gmail.com>, "Juanma Barranquero" <lekktu@gmail.com> writes:

> On Fri, Aug 8, 2008 at 03:28, Kenichi Handa <handa@m17n.org> wrote:
> > You handled inhibit_null_byte_detection only in
> > detect_coding.  Isn't the problem fixed by modifying
> > detect_coding_system too?

> Well, you give me too much credit; I just did the easiest change that
> could possibly work :-) [It didn't]

> But AFAICS, the trouble with compressed info files is not because of
> null-byte detection. If I gzip efaq and then visit it through Info,
> the buffer is detected as utf-8-unix, not no-conversion.

Do you mean that utf-8-unix is an incorrect detection?
Or, are there any other problems?

---
Kenichi Handa
handa@ni.aist.go.jp





^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-08-13 10:08                 ` Kenichi Handa
@ 2008-08-15 23:09                   ` Juanma Barranquero
  0 siblings, 0 replies; 71+ messages in thread
From: Juanma Barranquero @ 2008-08-15 23:09 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

On Wed, Aug 13, 2008 at 12:08, Kenichi Handa <handa@m17n.org> wrote:

> Do you mean that utf-8-unix is an incorrect detection?

Well, I would have expected that, if it detected the file as utf-8-dos
the CR/LF pairs would be correctly shown... (I'm talking of the case
of a gzipped Windows-generated info file).
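
For readers unfamiliar with the -unix/-dos distinction, here is a rough C sketch of the difference. It is not how Emacs detects end-of-line conventions; `classify_eol` and `count_visible_cr` are hypothetical helpers that only count bytes. If a buffer whose lines all end in CR LF is nevertheless decoded with an LF-only convention, every CR stays in the text and is displayed as ^M.

```c
#include <assert.h>
#include <stddef.h>

enum eol_type { EOL_NONE, EOL_UNIX, EOL_DOS, EOL_MIXED };

/* Classify the line endings found in buf[0..len).  */
enum eol_type
classify_eol (const unsigned char *buf, size_t len)
{
  size_t crlf = 0, bare_lf = 0;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n')
      {
        if (i > 0 && buf[i - 1] == '\r')
          crlf++;           /* CR LF pair: DOS-style line ending */
        else
          bare_lf++;        /* LF alone: Unix-style line ending */
      }

  if (crlf == 0 && bare_lf == 0)
    return EOL_NONE;
  if (bare_lf == 0)
    return EOL_DOS;
  if (crlf == 0)
    return EOL_UNIX;
  return EOL_MIXED;
}

/* Count the CR bytes that would remain in the text (and so show up
   as ^M) if the buffer were treated as LF-only.  */
size_t
count_visible_cr (const unsigned char *buf, size_t len)
{
  size_t n = 0;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\r')
      n++;
  return n;
}
```

For the two-line buffer "a\r\nb\r\n", `classify_eol` reports EOL_DOS and `count_visible_cr` reports 2, i.e. two stray ^M characters if the file is read with a -unix coding system.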

   Juanma




^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: ^M in the info files
  2008-08-07 20:18       ` Stefan Monnier
@ 2008-11-28 21:28         ` Drew Adams
  2008-11-28 22:39           ` Eli Zaretskii
  0 siblings, 1 reply; 71+ messages in thread
From: Drew Adams @ 2008-11-28 21:28 UTC (permalink / raw)
  To: 'Stefan Monnier', 'Kenichi Handa'
  Cc: 'Juanma Barranquero', emacs-devel

> From: Stefan Monnier Sent: Thursday, August 07, 2008 1:19 PM
> >> How would inhibit-null-byte-detection work for info files?
> >
> > For instance, by modifying info-insert-file-contents to bind
> > that variable to t while reading a file.
> 
> An alternative to inhibit-null-byte-detection would be something like
> inhibit-coding-systems which would contain a list of coding 
> systems (or maybe coding categories?) we know we don't want to use.

Whatever happened to this thread and the associated bugs: #876, #1117, #1284?

It sounds like there were alternative proposals about how to fix this, but there
was no discussion to try to reach a consensus or a decision. Is that where
things were left?

Meanwhile, it's still impossible to use the index in Info manuals on Windows,
and it's impossible to use some manuals (e.g. Viper) at all.






^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-11-28 21:28         ` Drew Adams
@ 2008-11-28 22:39           ` Eli Zaretskii
  2008-11-28 22:44             ` Juanma Barranquero
                               ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Eli Zaretskii @ 2008-11-28 22:39 UTC (permalink / raw)
  To: Drew Adams; +Cc: lekktu, emacs-devel, monnier, handa

> From: "Drew Adams" <drew.adams@oracle.com>
> Date: Fri, 28 Nov 2008 13:28:34 -0800
> Cc: 'Juanma Barranquero' <lekktu@gmail.com>, emacs-devel@gnu.org
> 
> Whatever happened to this thread and the associated bugs: #876, #1117, #1284?
> 
> It sounds like there were alternative proposals about how to fix this, but there
> was no discussion to try to reach a consensus or a decision. Is that where
> things were left?
> 
> Meanwhile, it's still impossible to use the index in Info manuals on Windows,
> and it's impossible to use some manuals (e.g. Viper) at all.

Don't worry, this will get fixed before Emacs 23 is ready for release.

I will work on it soon if no one beats me to it.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-11-28 22:39           ` Eli Zaretskii
@ 2008-11-28 22:44             ` Juanma Barranquero
  2008-11-29 10:50               ` Eli Zaretskii
  2008-11-28 22:49             ` Drew Adams
  2009-01-10 11:15             ` Eli Zaretskii
  2 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2008-11-28 22:44 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Fri, Nov 28, 2008 at 23:39, Eli Zaretskii <eliz@gnu.org> wrote:

> I will work on it soon if no one beats me to it.

How are you going to fix it?

  Juanma




^ permalink raw reply	[flat|nested] 71+ messages in thread

* RE: ^M in the info files
  2008-11-28 22:39           ` Eli Zaretskii
  2008-11-28 22:44             ` Juanma Barranquero
@ 2008-11-28 22:49             ` Drew Adams
  2008-11-29 10:51               ` Eli Zaretskii
  2009-01-10 11:15             ` Eli Zaretskii
  2 siblings, 1 reply; 71+ messages in thread
From: Drew Adams @ 2008-11-28 22:49 UTC (permalink / raw)
  To: 'Eli Zaretskii'; +Cc: lekktu, emacs-devel, monnier, handa

> > Whatever happened to this thread and the associated bugs: 
> > #876, #1117, #1284?
> > 
> > It sounds like there were alternative proposals about how 
> > to fix this, but there was no discussion to try to reach a
> > consensus or a decision. Is that where things were left?
> > 
> > Meanwhile, it's still impossible to use the index in Info 
> > manuals on Windows, and it's impossible to use some manuals
> > (e.g. Viper) at all.
> 
> Don't worry, this will get fixed before Emacs 23 is ready for release.
> I will work on it soon if no one beats me to it.

OK, thanks.

But what will you do?
It sounds like nothing was decided wrt the fix that's needed.






^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-11-28 22:44             ` Juanma Barranquero
@ 2008-11-29 10:50               ` Eli Zaretskii
  2008-11-29 11:56                 ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2008-11-29 10:50 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: emacs-devel

> Date: Fri, 28 Nov 2008 23:44:49 +0100
> From: "Juanma Barranquero" <lekktu@gmail.com>
> Cc: emacs-devel@gnu.org
> 
> On Fri, Nov 28, 2008 at 23:39, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > I will work on it soon if no one beats me to it.
> 
> How are you going to fix it?

I don't know yet; figuring that out is part of the job.

Last time when I had a few hours to work on this, my Internet
connection died in the middle of reading past messages about this
problem, and came back up when the time I had was up.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-11-28 22:49             ` Drew Adams
@ 2008-11-29 10:51               ` Eli Zaretskii
  0 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2008-11-29 10:51 UTC (permalink / raw)
  To: Drew Adams; +Cc: emacs-devel

> From: "Drew Adams" <drew.adams@oracle.com>
> Cc: <monnier@iro.umontreal.ca>, <handa@m17n.org>, <lekktu@gmail.com>,
>         <emacs-devel@gnu.org>
> Date: Fri, 28 Nov 2008 14:49:32 -0800
> 
> > Don't worry, this will get fixed before Emacs 23 is ready for release.
> > I will work on it soon if no one beats me to it.
> 
> OK, thanks.
> 
> But what will you do?
> It sounds like nothing was decided wrt the fix that's needed.

Obviously, I will need first to decide which of the two suggested
approaches I like best.  Or maybe I will come up with some third
solution.  I will know when I actually start working on this.




^ permalink raw reply	[flat|nested] 71+ messages in thread

* Re: ^M in the info files
  2008-11-29 10:50               ` Eli Zaretskii
@ 2008-11-29 11:56                 ` Juanma Barranquero
  0 siblings, 0 replies; 71+ messages in thread
From: Juanma Barranquero @ 2008-11-29 11:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel

On Sat, Nov 29, 2008 at 11:50, Eli Zaretskii <eliz@gnu.org> wrote:

> Last time when I had a few hours to work on this, my Internet
> connection died in the middle of reading past messages about this
> problem, and came back up when the time I had was up.

An omen if I ever saw one...

  Juanma





* Re: ^M in the info files
  2008-11-28 22:39           ` Eli Zaretskii
  2008-11-28 22:44             ` Juanma Barranquero
  2008-11-28 22:49             ` Drew Adams
@ 2009-01-10 11:15             ` Eli Zaretskii
  2009-01-10 12:14               ` Eli Zaretskii
                                 ` (3 more replies)
  2 siblings, 4 replies; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-10 11:15 UTC (permalink / raw)
  To: 876-done; +Cc: emacs-pretest-bug, bug-gnu-emacs, emacs-devel

> Date: Sat, 29 Nov 2008 00:39:27 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: lekktu@gmail.com, emacs-devel@gnu.org, monnier@iro.umontreal.ca,
> 	handa@m17n.org
> 
> > From: "Drew Adams" <drew.adams@oracle.com>
> > Date: Fri, 28 Nov 2008 13:28:34 -0800
> > Cc: 'Juanma Barranquero' <lekktu@gmail.com>, emacs-devel@gnu.org
> > 
> > Whatever happened to this thread and the associated bugs: #876, #1117, #1284?
> > 
> > It sounds like there were alternative proposals about how to fix this, but there
> > was no discussion to try to reach a consensus or a decision. Is that where
> > things were left?
> > 
> > Meanwhile, it's still impossible to use the index in Info manuals on Windows,
> > and it's impossible to use some manuals (e.g. Viper) at all.
> 
> Don't worry, this will get fixed before Emacs 23 is ready for release.
> 
> I will work on it soon if no one beats me to it.

(For some value of "soon", sorry.)

I fixed this bug.

There were two suggestions for how to fix this: one by Handa-san in
this message:

  http://lists.gnu.org/archive/html/emacs-devel/2008-08/msg00293.html

followed by a tentative patch by Juanma here:

  http://lists.gnu.org/archive/html/emacs-devel/2008-08/msg00316.html

The other suggestion was by Stefan:

  http://lists.gnu.org/archive/html/emacs-devel/2008-08/msg00373.html

I decided I liked the first alternative better, since it has much
more local effect than the other one.  What Stefan suggested implied
messing with coding priorities, and I didn't feel that was TRT at this
late stage in Emacs 23.1 development.

The changes I installed are more thorough than what Juanma posted:
they change both detect_coding and detect_coding_system, and also bind
inhibit-null-byte-detection in a couple more places in info.el.

The result was tested on GNU/Linux, MS-Windows, and MS-DOS, both with
compressed and uncompressed Info files, with auto-compression-mode
both on and off.
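
A minimal sketch of the kind of binding described above, assuming the
variable named in this message (inhibit-null-byte-detection) is
available; the function name is hypothetical and this is not the code
installed in info.el.  As far as I know, later Emacs releases spell
the variable inhibit-nul-byte-detection and keep the old name as an
obsolete alias.

```elisp
;; Hypothetical helper for illustration only; not taken from info.el.
(defun my-info-insert-file (file)
  "Insert FILE into the current buffer with null-byte detection inhibited."
  ;; Binding the variable to t asks the coding detector not to treat
  ;; null bytes as a sign of binary data, so the usual text and
  ;; end-of-line detection still applies to the file.
  (let ((inhibit-null-byte-detection t))
    (insert-file-contents file)))
```

The binding is dynamic and limited to the one call, which matches the
"local effect" argument made for this approach.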





* Re: ^M in the info files
  2009-01-10 11:15             ` Eli Zaretskii
@ 2009-01-10 12:14               ` Eli Zaretskii
  2009-01-12 20:56                 ` Stefan Monnier
  2009-01-10 14:07               ` Juanma Barranquero
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-10 12:14 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel

> Date: Sat, 10 Jan 2009 13:15:51 +0200
> From: Eli Zaretskii <eliz@gnu.org>
> Cc: emacs-pretest-bug@gnu.org, bug-gnu-emacs@gnu.org, emacs-devel@gnu.org
> 
> The changes I installed are more thorough than what Juanma posted:
> they change both detect_coding and detect_coding_system, and also bind
> inhibit-null-byte-detection in a couple more places in info.el.

When documenting the new option, I found a strange inconsistency
between detect-coding-region/string and coding detection by
insert-file-contents.  If null bytes are detected, detect_coding sets
up to use no-conversion, but detect-coding-region/string does not call
detect_coding.  Instead, detect-coding-region/string calls
detect_coding_system, which does not return no-conversion for a region
or string that includes null bytes.  insert-file-contents does use
no-conversion for files that contain null bytes, but it does so
because decode_coding_gap does call detect_coding.

This creates an inconsistency for Lisp programs that do their own
decoding: if they call detect-coding-region/string for a region or
string with null bytes, they will not see no-conversion in the return
value, but insert-file-contents will use no-conversion nonetheless.

I think this inconsistency constitutes a bug.
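
One way to observe the difference from Lisp is sketched below.  The
function name is hypothetical, and the result depends on the Emacs
version and on the current settings, so no particular output is
claimed here.

```elisp
;; Hypothetical helper: compare what the detection primitive reports
;; with what file insertion actually uses for the same bytes.
(defun my-compare-null-byte-detection (bytes)
  "Return a list (DETECTED USED) for the string BYTES.
DETECTED is what `detect-coding-string' reports; USED is the coding
system `insert-file-contents' chose for a file holding BYTES."
  (let ((file (make-temp-file "nulltest"))
        detected used)
    (unwind-protect
        (progn
          ;; Write the bytes verbatim, with no encoding applied.
          (let ((coding-system-for-write 'no-conversion))
            (write-region bytes nil file nil 'silent))
          (setq detected (detect-coding-string bytes t))
          (with-temp-buffer
            (insert-file-contents file)
            ;; `last-coding-system-used' records the decoder just used.
            (setq used last-coding-system-used)))
      (delete-file file))
    (list detected used)))

;; Example call with a null byte and a CRLF line ending:
;; (my-compare-null-byte-detection "abc\0def\r\n")
```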





* Re: ^M in the info files
  2009-01-10 11:15             ` Eli Zaretskii
  2009-01-10 12:14               ` Eli Zaretskii
@ 2009-01-10 14:07               ` Juanma Barranquero
  2009-01-10 15:13                 ` Eli Zaretskii
  2009-01-10 16:21               ` Drew Adams
  2009-01-12 20:54               ` Stefan Monnier
  3 siblings, 1 reply; 71+ messages in thread
From: Juanma Barranquero @ 2009-01-10 14:07 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 876-done, emacs-devel

On Sat, Jan 10, 2009 at 12:15, Eli Zaretskii <eliz@gnu.org> wrote:

> I fixed this bug.

If the info files are compressed with gzip (using the Windows binary
from http://www.gzip.org), when visited from Info they still have
spurious ^M.

As an aside, when I use MSYS' gzip I get an error in decompressing:

  Error while executing "gzip -c -q -d < c:/emacs/info/efaq.gz"
  gzip: stdin: invalid compressed data--crc error

Could be related to CRLF issues in the redirection (but this is a gzip
problem, not Emacs', of course).

    Juanma





* Re: ^M in the info files
  2009-01-10 14:07               ` Juanma Barranquero
@ 2009-01-10 15:13                 ` Eli Zaretskii
  2009-01-10 18:23                   ` Juanma Barranquero
  0 siblings, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-10 15:13 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 876-done, emacs-devel

> Date: Sat, 10 Jan 2009 15:07:37 +0100
> From: "Juanma Barranquero" <lekktu@gmail.com>
> Cc: 876-done@emacsbugs.donarmstrong.com, emacs-devel@gnu.org
> 
> On Sat, Jan 10, 2009 at 12:15, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > I fixed this bug.
> 
> If the info files are compressed with gzip (using the Windows binary
> from http://www.gzip.org), when visited from Info they still have
> spurious ^M.

I cannot reproduce this (tried with compressing info/emacs*).  Please
send a complete self-contained recipe.

Does "spurious" mean that every line has a ^M, like you saw before the
fix, or just some lines?

Also, does the above reference to http://www.gzip.org mean that files
compressed by other versions of gzip do work as intended?

> As an aside, when I use MSYS' gzip I get an error in decompressing:
> 
>   Error while executing "gzip -c -q -d < c:/emacs/info/efaq.gz"
>   gzip: stdin: invalid compressed data--crc error
> 
> Could be related to CRLF issues in the redirection (but this is a gzip
> problem, not Emacs', of course).

Right, and I don't have MSYS anyway.





* RE: ^M in the info files
  2009-01-10 11:15             ` Eli Zaretskii
  2009-01-10 12:14               ` Eli Zaretskii
  2009-01-10 14:07               ` Juanma Barranquero
@ 2009-01-10 16:21               ` Drew Adams
  2009-01-10 23:16                 ` Lennart Borgman
  2009-01-12 20:54               ` Stefan Monnier
  3 siblings, 1 reply; 71+ messages in thread
From: Drew Adams @ 2009-01-10 16:21 UTC (permalink / raw)
  To: 'Eli Zaretskii', 876-done
  Cc: emacs-pretest-bug, bug-gnu-emacs, emacs-devel

Thanks for fixing this. 






* Re: ^M in the info files
  2009-01-10 15:13                 ` Eli Zaretskii
@ 2009-01-10 18:23                   ` Juanma Barranquero
  2009-01-10 19:08                     ` Eli Zaretskii
                                       ` (2 more replies)
  0 siblings, 3 replies; 71+ messages in thread
From: Juanma Barranquero @ 2009-01-10 18:23 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 876-done, emacs-devel

On Sat, Jan 10, 2009 at 16:13, Eli Zaretskii <eliz@gnu.org> wrote:

> I cannot reproduce this (tried with compressing info/emacs*).  Please
> send a complete self-contained recipe.

Sorry, apparently it depends on something in my .emacs. It's still a
bug, though.

Try with

  cd info
  gzip efaq
  emacs -Q -eval "(progn (set-language-environment \"UTF-8\") (info
\"(efaq)Top\"))"

> Does "spurious" mean that every line has a ^M, like you saw before the
> fix, or just some lines?

Every line, just like before the fix.

> Also, does the above reference to http://www.gzip.org mean that files
> compressed by other versions of gzip do work as intended?

Means that using the gzip from www.gzip.org it fails, and using the
one from MSYS I get the error I reported. I don't have other versions
of gzip.

For additional fun,

  cd info
  gzip emacs
  emacs -Q -eval "(info \"(emacs)Top\")"
   =>
  gzip: illegal option -- Q
  usage: gzip [-acdfhlLnNrtvV19] [-S suffix] [file ...

(this with the standard www.gzip.org version).

    Juanma





* Re: ^M in the info files
  2009-01-10 18:23                   ` Juanma Barranquero
@ 2009-01-10 19:08                     ` Eli Zaretskii
  2009-01-10 19:12                     ` Eli Zaretskii
  2009-01-10 19:19                     ` Eli Zaretskii
  2 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-10 19:08 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 876-done, emacs-devel






* Re: ^M in the info files
  2009-01-10 18:23                   ` Juanma Barranquero
  2009-01-10 19:08                     ` Eli Zaretskii
@ 2009-01-10 19:12                     ` Eli Zaretskii
  2009-01-10 19:19                     ` Eli Zaretskii
  2 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-10 19:12 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 876-done, emacs-devel

> Date: Sat, 10 Jan 2009 19:23:28 +0100
> From: "Juanma Barranquero" <lekktu@gmail.com>
> Cc: 876-done@emacsbugs.donarmstrong.com, emacs-devel@gnu.org
> 
> On Sat, Jan 10, 2009 at 16:13, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> > I cannot reproduce this (tried with compressing info/emacs*).  Please
> > send a complete self-contained recipe.
> 
> Sorry, apparently it depends on something in my .emacs. It's still a
> bug, though.
> 
> Try with
> 
>   cd info
>   gzip efaq
>   emacs -Q -eval "(progn (set-language-environment \"UTF-8\") (info \"(efaq)Top\"))"

Yes, I see that now, but it has something to do with UTF-8 as the
language environment (it works correctly without it).  The Info buffer
has utf-8-unix as its buffer-file-coding-system, which is another
sign.  So I think this is a separate bug.  Please file a separate bug
report for it.
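
For whoever files that report, a small sketch for collecting the
relevant facts from the Info buffer; the function name is
hypothetical.

```elisp
;; Hypothetical helper for gathering data for a bug report.
(defun my-buffer-coding-report (&optional buffer)
  "Return a list (CODING EOL-TYPE HAS-CR) for BUFFER, default current."
  (with-current-buffer (or buffer (current-buffer))
    (list buffer-file-coding-system
          ;; 0 = LF (unix), 1 = CRLF (dos), 2 = CR (mac); a vector
          ;; means the end-of-line format was left undecided.
          (and buffer-file-coding-system
               (coding-system-eol-type buffer-file-coding-system))
          ;; t if a literal carriage return is present in the text.
          ;; Info narrows to the current node, so widen first.
          (save-excursion
            (save-restriction
              (widen)
              (goto-char (point-min))
              (and (search-forward "\r" nil t) t))))))

;; Example: (my-buffer-coding-report "*info*")
```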





* Re: ^M in the info files
  2009-01-10 18:23                   ` Juanma Barranquero
  2009-01-10 19:08                     ` Eli Zaretskii
  2009-01-10 19:12                     ` Eli Zaretskii
@ 2009-01-10 19:19                     ` Eli Zaretskii
  2009-01-10 21:04                       ` Juanma Barranquero
  2 siblings, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-10 19:19 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 876-done, emacs-devel

> Date: Sat, 10 Jan 2009 19:23:28 +0100
> From: "Juanma Barranquero" <lekktu@gmail.com>
> Cc: 876-done@emacsbugs.donarmstrong.com, emacs-devel@gnu.org
> 
>   cd info
>   gzip emacs
>   emacs -Q -eval "(info \"(emacs)Top\")"
>    =>
>   gzip: illegal option -- Q
>   usage: gzip [-acdfhlLnNrtvV19] [-S suffix] [file ...
> 
> (this with the standard www.gzip.org version).

Doesn't happen to me, neither with gzip 1.2.4 from gzip.org, nor with
v1.3.5 from GnuWin32.  I did replace "emacs" with a full file name,
though, to make sure it doesn't pick up some other version somewhere
on my INFOPATH.






* Re: ^M in the info files
  2009-01-10 19:19                     ` Eli Zaretskii
@ 2009-01-10 21:04                       ` Juanma Barranquero
  0 siblings, 0 replies; 71+ messages in thread
From: Juanma Barranquero @ 2009-01-10 21:04 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 876-done, emacs-devel

On Sat, Jan 10, 2009 at 20:19, Eli Zaretskii <eliz@gnu.org> wrote:

>>   gzip: illegal option -- Q
>>   usage: gzip [-acdfhlLnNrtvV19] [-S suffix] [file ...
>>
>> (this with the standard www.gzip.org version).

My mistake. My command interpreter was trying to execute emacs.gz.
Sorry for the noise.

    Juanma





* Re: ^M in the info files
  2009-01-10 16:21               ` Drew Adams
@ 2009-01-10 23:16                 ` Lennart Borgman
  2009-01-11  5:41                   ` dhruva
  2009-01-11  5:51                   ` dhruva
  0 siblings, 2 replies; 71+ messages in thread
From: Lennart Borgman @ 2009-01-10 23:16 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs Devel

On Sat, Jan 10, 2009 at 5:21 PM, Drew Adams <drew.adams@oracle.com> wrote:
> Thanks for fixing this.

Yes, great Eli!





* Re: ^M in the info files
  2009-01-10 23:16                 ` Lennart Borgman
@ 2009-01-11  5:41                   ` dhruva
  2009-01-11  5:51                   ` dhruva
  1 sibling, 0 replies; 71+ messages in thread
From: dhruva @ 2009-01-11  5:41 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: Eli Zaretskii, Emacs Devel

Hello,

On Sun, Jan 11, 2009 at 4:46 AM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> On Sat, Jan 10, 2009 at 5:21 PM, Drew Adams <drew.adams@oracle.com> wrote:
>> Thanks for fixing this.
>
> Yes, great Eli!

I start Emacs with 'emacs -q' (built using MinGW on WXP, bzr HEAD) and
see ^M in ada-mode. Do I need to 'bootstrap', or should a normal 'make
all install' be enough?

-dhruva

-- 
Contents reflect my personal views only!





* Re: ^M in the info files
  2009-01-10 23:16                 ` Lennart Borgman
  2009-01-11  5:41                   ` dhruva
@ 2009-01-11  5:51                   ` dhruva
  1 sibling, 0 replies; 71+ messages in thread
From: dhruva @ 2009-01-11  5:51 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Emacs Devel

Hello,

On Sun, Jan 11, 2009 at 4:46 AM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> On Sat, Jan 10, 2009 at 5:21 PM, Drew Adams <drew.adams@oracle.com> wrote:
>> Thanks for fixing this.
>
> Yes, great Eli!

I had to do a 'make clean' and 'make all install' (on WXP using MinGW,
bzr HEAD) to get it working. Thanks for the fix!

-dhruva

-- 
Contents reflect my personal views only!





* Re: ^M in the info files
  2009-01-10 11:15             ` Eli Zaretskii
                                 ` (2 preceding siblings ...)
  2009-01-10 16:21               ` Drew Adams
@ 2009-01-12 20:54               ` Stefan Monnier
  2009-01-12 22:13                 ` Eli Zaretskii
  3 siblings, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2009-01-12 20:54 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-pretest-bug, bug-gnu-emacs, 876-done, emacs-devel

Thank you, Eli,

> I decided I liked the first alternative better, since it has much
> more local effect than the other one.  What Stefan suggested implied
> messing with coding priorities, and I didn't feel that was TRT at this
> late stage in Emacs 23.1 development.

[ Obviously my suggestion requires more work to implement, but.. ]
Hmm...curious, why does it involve coding priorities?


        Stefan





* Re: ^M in the info files
  2009-01-10 12:14               ` Eli Zaretskii
@ 2009-01-12 20:56                 ` Stefan Monnier
  2009-01-12 22:04                   ` Eli Zaretskii
  0 siblings, 1 reply; 71+ messages in thread
From: Stefan Monnier @ 2009-01-12 20:56 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: emacs-devel, Kenichi Handa

> When documenting the new option, I found a strange inconsistency
> between detect-coding-region/string and coding detection by
> insert-file-contents.  If null bytes are detected, detect_coding sets
> up to use no-conversion, but detect-coding-region/string does not call
> detect_coding.  Instead, detect-coding-region/string calls
> detect_coding_system, which does not return no-conversion for a region
> or string that includes null bytes.  insert-file-contents does use
> no-conversion for files that contain null bytes, but it does so
> because decode_coding_gap does call detect_coding.

> This creates an inconsistency for Lisp programs that do their own
> decoding: if they call detect-coding-region/string for a region or
> string with null bytes, they will not see no-conversion in the return
> value, but insert-file-contents will use no-conversion nonetheless.

> I think this inconsistency constitutes a bug.

Indeed, it sounds like a bug.  Not sure how/when it would manifest
itself, tho.


        Stefan





* Re: ^M in the info files
  2009-01-12 20:56                 ` Stefan Monnier
@ 2009-01-12 22:04                   ` Eli Zaretskii
  0 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-12 22:04 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-devel, handa

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: Kenichi Handa <handa@m17n.org>,  emacs-devel@gnu.org
> Date: Mon, 12 Jan 2009 15:56:16 -0500
> 
> > This creates an inconsistency for Lisp programs that do their own
> > decoding: if they call detect-coding-region/string for a region or
> > string with null bytes, they will not see no-conversion in the return
> > value, but insert-file-contents will use no-conversion nonetheless.
> 
> > I think this inconsistency constitutes a bug.
> 
> Indeed, it sounds like a bug.  Not sure how/when it would manifest
> itself, tho.

Well, for starters, it means that insert-file-contents does its own
detection that cannot be separated from it on the Lisp level.  That
is,

   (let ((coding-system-for-read 'undecided))
     (insert-file-contents FOO))

and

   (insert-file-contents-literally FOO)
   (let ((coding-system-for-read
          (detect-coding-region (point-min) (point-max) t)))
     (insert-file-contents FOO))

will surprisingly produce different results.





* Re: ^M in the info files
  2009-01-12 20:54               ` Stefan Monnier
@ 2009-01-12 22:13                 ` Eli Zaretskii
  2009-01-12 22:27                   ` Glenn Morris
  0 siblings, 1 reply; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-12 22:13 UTC (permalink / raw)
  To: Stefan Monnier; +Cc: emacs-pretest-bug, bug-gnu-emacs, 876-done, emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Cc: 876-done@emacsbugs.donarmstrong.com,  emacs-pretest-bug@gnu.org,  bug-gnu-emacs@gnu.org,  emacs-devel@gnu.org
> Date: Mon, 12 Jan 2009 15:54:38 -0500
> 
> Hmm...curious, why does it involve coding priorities?

You suggested introducing a way of controlling which possible
encodings will _not_ be acceptable to the caller of
detect-coding-region/string.  These primitives work by scanning the
coding_priorities[] array and checking them against those categories
that have been rejected based on the text in the region/string.  See
detect_coding and detect_coding_system.
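
For reference, a sketch of the Lisp-level view of those priorities.
It uses the existing with-coding-priority macro and is only an
illustration of how priorities influence detection; the function name
is hypothetical, and this is neither the proposed interface nor the C
code in detect_coding.

```elisp
;; Hypothetical helper: run detection with some coding systems
;; temporarily moved to the front of the priority list.
(defun my-detect-with-priority (string &rest coding-systems)
  "Detect the coding of STRING while CODING-SYSTEMS have top priority."
  ;; `with-coding-priority' restores the previous priority list when
  ;; its body exits, so the change does not leak to other callers.
  (with-coding-priority coding-systems
    (detect-coding-string string t)))

;; Example: (my-detect-with-priority "abc" 'utf-8)
;; The current order can be inspected with (coding-system-priority-list).
```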





* Re: ^M in the info files
  2009-01-12 22:13                 ` Eli Zaretskii
@ 2009-01-12 22:27                   ` Glenn Morris
  2009-01-13  4:01                     ` Eli Zaretskii
  0 siblings, 1 reply; 71+ messages in thread
From: Glenn Morris @ 2009-01-12 22:27 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: Stefan Monnier, 876, emacs-devel


Please stop cc'ing emacs-pretest-bug@gnu.org and bug-gnu-emacs@gnu.org
in this discussion, it messes up the bug tracker.





* Re: ^M in the info files
  2009-01-12 22:27                   ` Glenn Morris
@ 2009-01-13  4:01                     ` Eli Zaretskii
  0 siblings, 0 replies; 71+ messages in thread
From: Eli Zaretskii @ 2009-01-13  4:01 UTC (permalink / raw)
  To: Glenn Morris; +Cc: monnier, 876, emacs-devel

> From: Glenn Morris <rgm@gnu.org>
> Cc: Stefan Monnier <monnier@iro.umontreal.ca>, 876@emacsbugs.donarmstrong.com,  emacs-devel@gnu.org
> Date: Mon, 12 Jan 2009 17:27:03 -0500
> 
> 
> Please stop cc'ing emacs-pretest-bug@gnu.org and bug-gnu-emacs@gnu.org
> in this discussion, it messes up the bug tracker.

Why won't the tracker get its act together and stop forcing us into
editing mail headers?  I use the Rmail's `r' command, which replies to
all the original "To" and "CC" addressees.  With the amounts of mail
I'm responding to, I cannot afford editing headers of every message I
send.  Even if I try, I will surely sometimes forget.





end of thread, other threads:[~2009-01-13  4:01 UTC | newest]

Thread overview: 71+ messages
2008-06-11  0:43 ^M in the info files Lennart Borgman (gmail)
2008-06-11  4:42 ` dhruva
2008-06-11 15:56   ` Juanma Barranquero
2008-07-09  1:51     ` Juanma Barranquero
2008-07-09  2:44       ` Kenichi Handa
2008-07-09  2:56         ` Juanma Barranquero
2008-07-09  4:33           ` Kenichi Handa
2008-07-09  9:15             ` Jason Rumney
2008-07-09 11:16               ` Kenichi Handa
2008-07-09 16:49                 ` Stefan Monnier
2008-07-09 17:58                   ` James Cloos
2008-07-09 20:19                     ` Juri Linkov
2008-07-14 11:44                       ` Kenichi Handa
2008-07-21 11:18                         ` Juanma Barranquero
2008-07-10 11:17                   ` Kenichi Handa
2008-07-10 16:02                     ` Stefan Monnier
2008-07-10 18:42                       ` Juri Linkov
2008-07-10 20:27                         ` Stefan Monnier
2008-07-10 20:47                           ` Juri Linkov
2008-07-10 22:11                             ` Stefan Monnier
2008-07-19 22:29                           ` Eli Zaretskii
2008-07-21  4:57                             ` Stefan Monnier
2008-07-21 15:08                               ` Eli Zaretskii
2008-07-21 18:20                                 ` Stefan Monnier
2008-07-09 10:02             ` Juanma Barranquero
2008-07-09 14:54               ` "no-conversion" coding system (was: ^M in the info files) Stefan Monnier
2008-07-09 23:31                 ` Kenichi Handa
2008-07-09 23:57                 ` Richard M Stallman
2008-07-10  0:31                   ` Kenichi Handa
2008-07-10  1:12                     ` "no-conversion" coding system Stefan Monnier
2008-07-14 23:19                       ` Eli Zaretskii
2008-07-15  1:39                         ` Stefan Monnier
2008-07-20 14:04                           ` Eli Zaretskii
2008-07-15  6:34                         ` Stephen J. Turnbull
2008-07-20 14:11                           ` Eli Zaretskii
  -- strict thread matches above, loose matches on Subject: below --
2008-08-06 11:09 A few bugs not in the bug tracker (I think) Juanma Barranquero
2008-08-06 12:31 ` Kenichi Handa
2008-08-06 13:14   ` Juanma Barranquero
2008-08-07  0:48     ` ^M in the info files Kenichi Handa
2008-08-07  3:20       ` Eli Zaretskii
2008-08-07  3:22       ` Juanma Barranquero
2008-08-07  3:47         ` Kenichi Handa
2008-08-07 13:01           ` Juanma Barranquero
2008-08-08  1:28             ` Kenichi Handa
2008-08-08 11:02               ` Juanma Barranquero
2008-08-13 10:08                 ` Kenichi Handa
2008-08-15 23:09                   ` Juanma Barranquero
2008-08-07 20:18       ` Stefan Monnier
2008-11-28 21:28         ` Drew Adams
2008-11-28 22:39           ` Eli Zaretskii
2008-11-28 22:44             ` Juanma Barranquero
2008-11-29 10:50               ` Eli Zaretskii
2008-11-29 11:56                 ` Juanma Barranquero
2008-11-28 22:49             ` Drew Adams
2008-11-29 10:51               ` Eli Zaretskii
2009-01-10 11:15             ` Eli Zaretskii
2009-01-10 12:14               ` Eli Zaretskii
2009-01-12 20:56                 ` Stefan Monnier
2009-01-12 22:04                   ` Eli Zaretskii
2009-01-10 14:07               ` Juanma Barranquero
2009-01-10 15:13                 ` Eli Zaretskii
2009-01-10 18:23                   ` Juanma Barranquero
2009-01-10 19:08                     ` Eli Zaretskii
2009-01-10 19:12                     ` Eli Zaretskii
2009-01-10 19:19                     ` Eli Zaretskii
2009-01-10 21:04                       ` Juanma Barranquero
2009-01-10 16:21               ` Drew Adams
2009-01-10 23:16                 ` Lennart Borgman
2009-01-11  5:41                   ` dhruva
2009-01-11  5:51                   ` dhruva
2009-01-12 20:54               ` Stefan Monnier
2009-01-12 22:13                 ` Eli Zaretskii
2009-01-12 22:27                   ` Glenn Morris
2009-01-13  4:01                     ` Eli Zaretskii
