unofficial mirror of bug-gnu-emacs@gnu.org 
* bug#870: Missing ^J in ChangeLog
@ 2008-09-03 11:03           ` Juanma Barranquero
  2008-09-03 12:50             ` martin rudalics
  2009-01-07 11:00             ` bug#870: marked as done (Missing ^J in ChangeLog) Emacs bug Tracking System
  0 siblings, 2 replies; 33+ messages in thread
From: Juanma Barranquero @ 2008-09-03 11:03 UTC (permalink / raw)
  To: submit

Sometimes, while editing a ChangeLog file, one or several ^J disappear.

It could be related to doing other modifications to the ChangeLog
(reverting the buffer when there are outside changes while trying to
commit, for example), but so far there is no recipe to reproduce it.

The bug has been observed many times, and by at least two developers. It
seems to be confined to the Windows port.

emacs-devel discussion:

http://lists.gnu.org/archive/html/emacs-devel/2008-06/msg02050.html
http://lists.gnu.org/archive/html/emacs-devel/2008-07/msg00075.html







* bug#870: Missing ^J in ChangeLog
  2008-09-03 11:03           ` bug#870: Missing ^J in ChangeLog Juanma Barranquero
@ 2008-09-03 12:50             ` martin rudalics
  2008-09-03 15:20               ` Juanma Barranquero
  2009-01-07 11:00             ` bug#870: marked as done (Missing ^J in ChangeLog) Emacs bug Tracking System
  1 sibling, 1 reply; 33+ messages in thread
From: martin rudalics @ 2008-09-03 12:50 UTC (permalink / raw)
  To: 870; +Cc: Juanma Barranquero

 > Sometimes, while editing a ChangeLog file, one or several ^J disappear.
 >
 > It could be related to doing other modifications to the ChangeLog
 > (reverting the buffer when there are outside changes while trying to
 > commit, for example), but so far there is no recipe to reproduce it.
 >
 > The bug has been observed many times, and by at least two developers. It
 > seems to be confined to the Windows port.
 >
 > emacs-devel discussion:
 >
 > http://lists.gnu.org/archive/html/emacs-devel/2008-06/msg02050.html
 > http://lists.gnu.org/archive/html/emacs-devel/2008-07/msg00075.html

Can't you put a `modification-hook' (or even a `read-only' hook) on all
^Js in your ChangeLog buffers?

martin







* bug#870: Missing ^J in ChangeLog
  2008-09-03 12:50             ` martin rudalics
@ 2008-09-03 15:20               ` Juanma Barranquero
  2008-10-22 15:14                 ` Juanma Barranquero
  0 siblings, 1 reply; 33+ messages in thread
From: Juanma Barranquero @ 2008-09-03 15:20 UTC (permalink / raw)
  To: martin rudalics; +Cc: 870

On Wed, Sep 3, 2008 at 14:50, martin rudalics <rudalics@gmx.at> wrote:

> Can't you put a `modification-hook' (or even a `read-only' hook) on all
> ^Js in your ChangeLog buffers?

Yes, I've added a modification hook, but so far I haven't been able to
catch the bug happening.

 Juanma







* bug#870: Missing ^J in ChangeLog
  2008-09-03 15:20               ` Juanma Barranquero
@ 2008-10-22 15:14                 ` Juanma Barranquero
  2008-10-22 19:45                   ` Eli Zaretskii
  0 siblings, 1 reply; 33+ messages in thread
From: Juanma Barranquero @ 2008-10-22 15:14 UTC (permalink / raw)
  To: 870

X-Debbugs-CC: eliz@gnu.org, jasonr@gnu.org

I just had it happen to me again, after a long while. These were my steps:

 1.- Edit and save src/xdisp.c.
 2.- Edit and save src/ChangeLog to add a new entry.
 3.- Run vc-dir (through C-x v d) and select to check only on "src/";
vc-dir says there are pending changes in src/ChangeLog (among other
files).
 4.- Exit Emacs.
 5.- cvs update from the checkout root; src/ChangeLog is merged. There
are no conflicts (my change to src/ChangeLog was not at the top of the
file because, before merging, the first entry was already by me).
 6.- Enter Emacs (it visits src/ChangeLog because I'm using desktop.el).
 7.- Split the second entry (by me) in two, moving part of it as a new
entry at the top of src/ChangeLog.
 8.- Save src/ChangeLog.
 9.- Run vc-dir again; there are files ready to be committed.
10.- Mark them with M (vc-dir-mark-all-files) and commit them with v
(vc-next-action).
11.- Fill the CVS log entry and send with C-c C-c.

I'm puzzled. I'd think CVSNT was screwing with CRLF, but after 5) the
ChangeLog was fine, or I would have seen it at 6).

  Juanma







* bug#870: Missing ^J in ChangeLog
  2008-10-22 15:14                 ` Juanma Barranquero
@ 2008-10-22 19:45                   ` Eli Zaretskii
  2008-10-22 20:15                     ` Lennart Borgman (gmail)
  2008-10-22 21:58                     ` Juanma Barranquero
  0 siblings, 2 replies; 33+ messages in thread
From: Eli Zaretskii @ 2008-10-22 19:45 UTC (permalink / raw)
  To: Juanma Barranquero, 870; +Cc: bug-gnu-emacs

> Date: Wed, 22 Oct 2008 17:14:27 +0200
> From: "Juanma Barranquero" <lekktu@gmail.com>
> 
> I'm puzzled. I'd think CVSNT was screwing with CRLF, but after 5) the
> ChangeLog was fine, or I would have seen it at 6).

I suspect that we have a more fundamental problem somewhere in
insert-file-contents or its subroutines.  Did you see that in the
Index nodes of Info manuals some lines end with TWO ^M characters,
whereas the file has only one?  Maybe the same problem is at work
here, who knows?








* bug#870: Missing ^J in ChangeLog
  2008-10-22 19:45                   ` Eli Zaretskii
@ 2008-10-22 20:15                     ` Lennart Borgman (gmail)
  2008-10-22 21:08                       ` Eli Zaretskii
  2008-10-22 22:26                       ` Lennart Borgman
  2008-10-22 21:58                     ` Juanma Barranquero
  1 sibling, 2 replies; 33+ messages in thread
From: Lennart Borgman (gmail) @ 2008-10-22 20:15 UTC (permalink / raw)
  To: Eli Zaretskii, 870; +Cc: Juanma Barranquero

Eli Zaretskii wrote:
>> Date: Wed, 22 Oct 2008 17:14:27 +0200
>> From: "Juanma Barranquero" <lekktu@gmail.com>
>>
>> I'm puzzled. I'd think CVSNT was screwing with CRLF, but after 5) the
>> ChangeLog was fine, or I would have seen it at 6).
> 
> I suspect that we have a more fundamental problem somewhere in
> insert-file-contents or its subroutines.  Did you see that in the
> Index nodes of Info manuals some lines end with TWO ^M characters,
> whereas the file has only one?  Maybe the same problem is at work
> here, who knows?


Due to the problem with ^M in w32 info files this is very visible ;-)

How are those lines built (where is the code that builds them)?







* bug#870: Missing ^J in ChangeLog
  2008-10-22 20:15                     ` Lennart Borgman (gmail)
@ 2008-10-22 21:08                       ` Eli Zaretskii
  2008-10-22 21:22                         ` Lennart Borgman (gmail)
  2008-10-22 22:26                       ` Lennart Borgman
  1 sibling, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2008-10-22 21:08 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: lekktu, 870

> Date: Wed, 22 Oct 2008 22:15:24 +0200
> From: "Lennart Borgman (gmail)" <lennart.borgman@gmail.com>
> CC: Juanma Barranquero <lekktu@gmail.com>
> 
> How are those lines built (where is the code that builds them)?

Like I said: insert-file-contents and its subroutines.







* bug#870: Missing ^J in ChangeLog
  2008-10-22 21:08                       ` Eli Zaretskii
@ 2008-10-22 21:22                         ` Lennart Borgman (gmail)
  2008-10-22 22:06                           ` Eli Zaretskii
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Borgman (gmail) @ 2008-10-22 21:22 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: lekktu, 870

Eli Zaretskii wrote:
>> Date: Wed, 22 Oct 2008 22:15:24 +0200
>> From: "Lennart Borgman (gmail)" <lennart.borgman@gmail.com>
>> CC: Juanma Barranquero <lekktu@gmail.com>
>>
>> How are those lines built (where is the code that builds them)?
> 
> Like I said: insert-file-contents and its subroutines.

Do you mean that the whole index page is built by an
insert-file-contents call?








* bug#870: Missing ^J in ChangeLog
  2008-10-22 19:45                   ` Eli Zaretskii
  2008-10-22 20:15                     ` Lennart Borgman (gmail)
@ 2008-10-22 21:58                     ` Juanma Barranquero
  2008-10-22 22:17                       ` Eli Zaretskii
  1 sibling, 1 reply; 33+ messages in thread
From: Juanma Barranquero @ 2008-10-22 21:58 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 870

On Wed, Oct 22, 2008 at 21:45, Eli Zaretskii <eliz@gnu.org> wrote:

> I suspect that we have a more fundamental problem somewhere in
> insert-file-contents or its subroutines.

Yes, I suppose so, but it is strange it only manifests in ChangeLogs.

> Did you see that in the
> Index nodes of Info manuals some lines end with TWO ^M characters,
> whereas the file has only one?  Maybe the same problem is at work
> here, who knows?

Are you talking of #876? I don't think it is related to #870.

  Juanma







* bug#870: Missing ^J in ChangeLog
  2008-10-22 21:22                         ` Lennart Borgman (gmail)
@ 2008-10-22 22:06                           ` Eli Zaretskii
  0 siblings, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2008-10-22 22:06 UTC (permalink / raw)
  To: Lennart Borgman (gmail); +Cc: lekktu, 870

> Date: Wed, 22 Oct 2008 23:22:56 +0200
> From: "Lennart Borgman (gmail)" <lennart.borgman@gmail.com>
> CC: 870@emacsbugs.donarmstrong.com, lekktu@gmail.com
> 
> Eli Zaretskii wrote:
> >> Date: Wed, 22 Oct 2008 22:15:24 +0200
> >> From: "Lennart Borgman (gmail)" <lennart.borgman@gmail.com>
> >> CC: Juanma Barranquero <lekktu@gmail.com>
> >>
> >> How are those lines built (where is the code that builds them)?
> > 
> > Like I said: insert-file-contents and its subroutines.
> 
> Do you mean that the whole index page is built by an
> insert-file-contents call?

No.







* bug#870: Missing ^J in ChangeLog
  2008-10-22 21:58                     ` Juanma Barranquero
@ 2008-10-22 22:17                       ` Eli Zaretskii
  2008-10-22 23:32                         ` Juanma Barranquero
  0 siblings, 1 reply; 33+ messages in thread
From: Eli Zaretskii @ 2008-10-22 22:17 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 870

> Date: Wed, 22 Oct 2008 23:58:36 +0200
> From: "Juanma Barranquero" <lekktu@gmail.com>
> Cc: 870@emacsbugs.donarmstrong.com
> 
> > Did you see that in the
> > Index nodes of Info manuals some lines end with TWO ^M characters,
> > whereas the file has only one?  Maybe the same problem is at work
> > here, who knows?
> 
> Are you talking of #876? I don't think it is related to #870.

No, I'm talking about pairs of ^M^M characters at the end of some
lines, where in the file there's only one ^M at the end of each line.







* bug#870: Missing ^J in ChangeLog
  2008-10-22 20:15                     ` Lennart Borgman (gmail)
  2008-10-22 21:08                       ` Eli Zaretskii
@ 2008-10-22 22:26                       ` Lennart Borgman
  2008-10-22 23:10                         ` Lennart Borgman
  1 sibling, 1 reply; 33+ messages in thread
From: Lennart Borgman @ 2008-10-22 22:26 UTC (permalink / raw)
  To: Eli Zaretskii, 870; +Cc: Juanma Barranquero

On Wed, Oct 22, 2008 at 10:15 PM, Lennart Borgman (gmail)
<lennart.borgman@gmail.com> wrote:
> Eli Zaretskii wrote:
>>> Date: Wed, 22 Oct 2008 17:14:27 +0200
>>> From: "Juanma Barranquero" <lekktu@gmail.com>
>>>
>>> I'm puzzled. I'd think CVSNT was screwing with CRLF, but after 5) the
>>> ChangeLog was fine, or I would have seen it at 6).
>>
>> I suspect that we have a more fundamental problem somewhere in
>> insert-file-contents or its subroutines.  Did you see that in the
>> Index nodes of Info manuals some lines end with TWO ^M characters,
>> whereas the file has only one?  Maybe the same problem is at work
>> here, who knows?
>
>
> Due to the problem with ^M in w32 info files this is very visible ;-)
>
> How are those lines built (where is the code that builds them)?

Even though those lines in the info index show as

   ^M^M

at the end, it seems that the bytes are 13, 10, not 13, 13. Is this just
another bug?







* bug#870: Missing ^J in ChangeLog
  2008-10-22 22:26                       ` Lennart Borgman
@ 2008-10-22 23:10                         ` Lennart Borgman
  0 siblings, 0 replies; 33+ messages in thread
From: Lennart Borgman @ 2008-10-22 23:10 UTC (permalink / raw)
  To: Eli Zaretskii, 870; +Cc: Juanma Barranquero

On Thu, Oct 23, 2008 at 12:26 AM, Lennart Borgman
<lennart.borgman@gmail.com> wrote:
> On Wed, Oct 22, 2008 at 10:15 PM, Lennart Borgman (gmail)
> <lennart.borgman@gmail.com> wrote:
>> Eli Zaretskii wrote:
>>>> Date: Wed, 22 Oct 2008 17:14:27 +0200
>>>> From: "Juanma Barranquero" <lekktu@gmail.com>
>>>>
>>>> I'm puzzled. I'd think CVSNT was screwing with CRLF, but after 5) the
>>>> ChangeLog was fine, or I would have seen it at 6).
>>>
>>> I suspect that we have a more fundamental problem somewhere in
>>> insert-file-contents or its subroutines.  Did you see that in the
>>> Index nodes of Info manuals some lines end with TWO ^M characters,
>>> whereas the file has only one?  Maybe the same problem is at work
>>> here, who knows?
>>
>>
>> Due to the problem with ^M in w32 info files this is very visible ;-)
>>
>> How are those lines built (where is the code that builds them)?
>
> Even though those lines in the info index show as
>
>   ^M^M
>
> at the end, it seems that the bytes are 13, 10, not 13, 13. Is this just
> another bug?

It looks like a display bug!

I copied the two ^M^M to *scratch* and tried to investigate what was
there. I saw some strange things, but the most enlightening was when I
copied these characters as a line. With cua-mode on I selected the
line and copied it to a new location. Then suddenly it became

  ^M
                            (line 34)^M

which is actually what you can see in the file

   info/emacs-7







* bug#870: Missing ^J in ChangeLog
  2008-10-22 22:17                       ` Eli Zaretskii
@ 2008-10-22 23:32                         ` Juanma Barranquero
  2008-10-22 23:41                           ` Juanma Barranquero
  0 siblings, 1 reply; 33+ messages in thread
From: Juanma Barranquero @ 2008-10-22 23:32 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 870

On Thu, Oct 23, 2008 at 00:17, Eli Zaretskii <eliz@gnu.org> wrote:

> No, I'm talking about pairs of ^M^M characters at the end of some
> lines, where in the file there's only one ^M at the end of each line.

How can I reproduce that? I don't see the problem, or I don't know
where to look.

  Juanma







* bug#870: Missing ^J in ChangeLog
  2008-10-22 23:32                         ` Juanma Barranquero
@ 2008-10-22 23:41                           ` Juanma Barranquero
  2008-10-23  0:39                             ` Lennart Borgman
  0 siblings, 1 reply; 33+ messages in thread
From: Juanma Barranquero @ 2008-10-22 23:41 UTC (permalink / raw)
  To: Eli Zaretskii; +Cc: 870

On Thu, Oct 23, 2008 at 01:32, Juanma Barranquero <lekktu@gmail.com> wrote:

> How can I reproduce that? I don't see the problem, or I don't know
> where to look.

Forget my previous message, I see it now.

I think Lennart's right, it seems like a display bug. If you put the
cursor over the second ^M and do C-u C-x =, it says that it is a LF.

   Juanma







* bug#870: Missing ^J in ChangeLog
  2008-10-22 23:41                           ` Juanma Barranquero
@ 2008-10-23  0:39                             ` Lennart Borgman
  2008-10-23 13:34                               ` Juanma Barranquero
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Borgman @ 2008-10-23  0:39 UTC (permalink / raw)
  To: Juanma Barranquero, 870

On Thu, Oct 23, 2008 at 1:41 AM, Juanma Barranquero <lekktu@gmail.com> wrote:
> On Thu, Oct 23, 2008 at 01:32, Juanma Barranquero <lekktu@gmail.com> wrote:
>
>> How can I reproduce that? I don't see the problem, or I don't know
>> where to look.
>
> Forget my previous message, I see it now.
>
> I think Lennart's right, it seems like a display bug. If you put the
> cursor over the second ^M and do C-u C-x =, it says that it is a LF.

... but with (buffer-substring ...) you can see that there are
invisible characters ...







* bug#870: Missing ^J in ChangeLog
  2008-10-23  0:39                             ` Lennart Borgman
@ 2008-10-23 13:34                               ` Juanma Barranquero
  0 siblings, 0 replies; 33+ messages in thread
From: Juanma Barranquero @ 2008-10-23 13:34 UTC (permalink / raw)
  To: Lennart Borgman; +Cc: 870

> ... but with (buffer-substring ...) you can see that there are
> invisible characters ...

In that case, it looks like it *is* an instance of #870:

visible line1^M
invisible line2^M
visible line3^M

appearing as

visible line1^M^M
visible line3^M

  Juanma







* bug#870: Repeatable instance of bug#870
       [not found] <f7ccd24b0901042103u5b241a60u7842ed51ca9249fb@mail.gmail.com>
@ 2009-01-05 10:59 ` Jason Rumney
       [not found] ` <4961E7F7.2000509@gnu.org>
  1 sibling, 0 replies; 33+ messages in thread
From: Jason Rumney @ 2009-01-05 10:59 UTC (permalink / raw)
  To: Juanma Barranquero, 870; +Cc: Emacs Devel

Juanma Barranquero wrote:
>    emacs -Q --eval "(desktop-save-mode 1)" ChangeLog.870
>   

I can also reproduce the bug with C-x RET r utf-8-dos after visiting the 
file normally.

It appears that there is a bug in all the decode_coding_* functions when 
a CR lies on a CHARBUF_SIZE (0x4000) boundary with a matching LF on the 
other side of the boundary.

They all do something like:

      if (eol_crlf && c1 == '\r')
        ONE_MORE_BYTE (byte_after_cr);

but ONE_MORE_BYTE will abort the decode if it reaches the end of the 
buffer, leaving the CR in limbo between having been read and being added 
to the buffer. Then on decoding the subsequent block, the initial LF 
does not trip the normal CRLF decoding, so it is put into the buffer.








* bug#870: Repeatable instance of bug#870
       [not found] ` <4961E7F7.2000509@gnu.org>
@ 2009-01-05 11:12   ` Juanma Barranquero
       [not found]   ` <f7ccd24b0901050312r10286531q1c19da99d1779447@mail.gmail.com>
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Juanma Barranquero @ 2009-01-05 11:12 UTC (permalink / raw)
  To: Jason Rumney; +Cc: 870, Emacs Devel

On Mon, Jan 5, 2009 at 11:59, Jason Rumney <jasonr@gnu.org> wrote:

> It appears that there is a bug in all the decode_coding_* functions when a
> CR lies on a CHARBUF_SIZE (0x4000) boundary with a matching LF on the other
> side of the boundary.
>
> They all do something like:
>
>     if (eol_crlf && c1 == '\r')
>       ONE_MORE_BYTE (byte_after_cr);
>
> but ONE_MORE_BYTE will abort the decode if it reaches the end of the buffer,
> leaving the CR in limbo between having been read and being added to the
> buffer. Then on decoding the subsequent block, the initial LF does not trip
> the normal CRLF decoding, so it is put into the buffer.

Wouldn't that mean that, on writing the buffer, the file would end
with extra CRs, instead of missing LFs?

    Juanma







* bug#870: Repeatable instance of bug#870
       [not found]   ` <f7ccd24b0901050312r10286531q1c19da99d1779447@mail.gmail.com>
@ 2009-01-05 11:22     ` Jason Rumney
       [not found]     ` <4961ED68.1090609@gnu.org>
  1 sibling, 0 replies; 33+ messages in thread
From: Jason Rumney @ 2009-01-05 11:22 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 870, Emacs Devel

Juanma Barranquero wrote:
> On Mon, Jan 5, 2009 at 11:59, Jason Rumney <jasonr@gnu.org> wrote:
>
>   
>> It appears that there is a bug in all the decode_coding_* functions when a
>> CR lies on a CHARBUF_SIZE (0x4000) boundary with a matching LF on the other
>> side of the boundary.
>>
>> They all do something like:
>>
>>     if (eol_crlf && c1 == '\r')
>>       ONE_MORE_BYTE (byte_after_cr);
>>
>> but ONE_MORE_BYTE will abort the decode if it reaches the end of the buffer,
>> leaving the CR in limbo between having been read and being added to the
>> buffer. Then on decoding the subsequent block, the initial LF does not trip
>> the normal CRLF decoding, so it is put into the buffer.
>>     
>
> Wouldn't that mean that, on writing the buffer, the file would end
> with extra CRs, instead of missing LFs?
>   
The CRs are effectively stripped on reading, since they end up in limbo 
between being read and being added to the decoding buffer. I haven't 
tried writing the file, but I think (from memory and from the way the 
code looks to me) the problem is a missing CR, not a missing LF.








* bug#870: Repeatable instance of bug#870
       [not found]     ` <4961ED68.1090609@gnu.org>
@ 2009-01-05 11:31       ` Juanma Barranquero
       [not found]       ` <f7ccd24b0901050331w4d35bb66ue2323dde8c8ac6a2@mail.gmail.com>
  1 sibling, 0 replies; 33+ messages in thread
From: Juanma Barranquero @ 2009-01-05 11:31 UTC (permalink / raw)
  To: Jason Rumney; +Cc: 870, Emacs Devel

On Mon, Jan 5, 2009 at 12:22, Jason Rumney <jasonr@gnu.org> wrote:

> The CRs are effectively stripped on reading, since they end up in limbo
> between being read and being added to the decoding buffer. I haven't tried
> writing the file, but I think (from memory and from the way the code looks
> to me) the problem is a missing CR, not a missing LF.

That's not what I see.

ChangeLog.870 initially contains:

0000 7ff0 20 74 69 6d 65 2d 73 74  61 6d 70 2e 65 6c 3a 0d   time-stamp.el:.
0000 8000 0a 09 2a 20 74 69 6d 65  2e 65 6c 3a 0d 0a 09 2a  ..* time.el:...*

After rereading the file, in Emacs it shows as:

	* time-stamp.el:^M	* time.el:

which I interpret as follows: while reading, the ^M was read without
its ^J and so was taken literally, while the ^J went missing.

Then, if I write it back, the file on disk contains

0000 7ff0 20 74 69 6d 65 2d 73 74  61 6d 70 2e 65 6c 3a 0d   time-stamp.el:.
0000 8000 09 2a 20 74 69 6d 65 2e  65 6c 3a 0d 0a 09 2a 20  .* time.el:...*

so a LF has gone missing.
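
To check whether some other file has a CRLF pair split in the same way as
in the dump above (CR as the last byte before a boundary, LF as the first
byte after it), something like the following rough sketch could be used.
The function name straddling_crlf_offsets is made up for illustration, and
the default chunk size of 0x4000 is simply the CHARBUF_SIZE value quoted
earlier in this thread; whether that is the boundary that actually matters
inside Emacs is an assumption, not something this thread establishes.

```python
def straddling_crlf_offsets(data: bytes, chunk_size: int = 0x4000) -> list:
    """Return offsets of LF bytes that sit exactly on a chunk boundary
    and are immediately preceded by a CR (i.e. a CRLF split in two)."""
    offsets = []
    # Only multiples of chunk_size can be the first byte of a new chunk.
    for pos in range(chunk_size, len(data), chunk_size):
        if data[pos] == 0x0A and data[pos - 1] == 0x0D:
            offsets.append(pos)
    return offsets


# Hypothetical sample: the first CRLF has its CR at 0x7fff and its LF at
# 0x8000, the same layout as the ChangeLog.870 dump.
sample = b"x" * 0x7FFF + b"\r\n" + b"\t* time.el:\r\n"
print([hex(o) for o in straddling_crlf_offsets(sample)])
```

Running it over a real file would be along the lines of
straddling_crlf_offsets(open("ChangeLog.870", "rb").read()).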

    Juanma







* bug#870: Repeatable instance of bug#870
       [not found]       ` <f7ccd24b0901050331w4d35bb66ue2323dde8c8ac6a2@mail.gmail.com>
@ 2009-01-05 13:50         ` Jason Rumney
       [not found]         ` <4962100E.4060808@gnu.org>
  1 sibling, 0 replies; 33+ messages in thread
From: Jason Rumney @ 2009-01-05 13:50 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 870, Emacs Devel

Juanma Barranquero wrote:
> After rereading the file, in Emacs it shows as:
>
> 	* time-stamp.el:^M	* time.el:
>
> which I interpret as follows: while reading, the ^M was read without
> its ^J and so was taken literally, while the ^J went missing.
>
> Then, if I write it back, the file on disk contains
>
> 0000 7ff0 20 74 69 6d 65 2d 73 74  61 6d 70 2e 65 6c 3a 0d   time-stamp.el:.
> 0000 8000 09 2a 20 74 69 6d 65 2e  65 6c 3a 0d 0a 09 2a 20  .* time.el:...*
>
> so a LF has gone missing.
>   

Yes, you're right it is a LF (^J) that has gone missing - I was 
confused. So maybe I am wrong about exactly what happens in that part of 
the decode functions - maybe the CR does get written to the buffer, but 
the following LF is somehow swallowed.








* bug#870: Repeatable instance of bug#870
       [not found]         ` <4962100E.4060808@gnu.org>
@ 2009-01-05 14:28           ` Juanma Barranquero
  0 siblings, 0 replies; 33+ messages in thread
From: Juanma Barranquero @ 2009-01-05 14:28 UTC (permalink / raw)
  To: Jason Rumney; +Cc: 870, Emacs Devel

On Mon, Jan 5, 2009 at 14:50, Jason Rumney <jasonr@gnu.org> wrote:

> So
> maybe I am wrong about exactly what happens in that part of the decode
> functions - maybe the CR does get written to the buffer, but the following
> LF is somehow swallowed.

The bug does not happen on encoding (for writing), because it is
already visible after re-decoding (I mean, after desktop.el applies
buffer-file-coding-system, or after the
revert-buffer-with-coding-system call in your example). Once the
buffer has the lone ^M, it's no wonder it ends up in the file after
writing.

I think you're right that the problem is related to decoding a CRLF
when the pair crosses a buffer boundary.
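
As a generic illustration of why a CRLF pair that crosses a chunk boundary
needs special care, here is a toy model in Python. It is not Emacs code and
makes no claim about where the actual bug is; all names (decode_crlf_naive,
decode_crlf_carry, split_chunks) are invented for this sketch. The naive
version converts each chunk on its own and so never sees a split pair as a
pair; the other version holds back a trailing CR until the next chunk
arrives. Note that the naive version leaves a stray CR in front of the
newline, which is a different symptom from the vanished LF reported here.

```python
def decode_crlf_naive(chunks):
    """Convert CRLF to LF one chunk at a time, with no state carried over.
    A CRLF split across two chunks is never recognised as a pair."""
    return b"".join(chunk.replace(b"\r\n", b"\n") for chunk in chunks)


def decode_crlf_carry(chunks):
    """Convert CRLF to LF chunk by chunk, deferring a trailing CR so it can
    be matched against an LF at the start of the next chunk."""
    out = []
    pending_cr = False
    for chunk in chunks:
        if pending_cr:
            # Re-attach the CR that was held back from the previous chunk.
            chunk = b"\r" + chunk
            pending_cr = False
        if chunk.endswith(b"\r"):
            # Cannot tell yet whether this CR starts a CRLF; hold it back.
            chunk = chunk[:-1]
            pending_cr = True
        out.append(chunk.replace(b"\r\n", b"\n"))
    if pending_cr:
        out.append(b"\r")  # a lone CR at the very end is kept literally
    return b"".join(out)


def split_chunks(data, size):
    """Cut data into consecutive pieces of at most size bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


text = b"time-stamp.el:\r\n\t* time.el:\r\n"
chunks = split_chunks(text, 15)  # the first chunk ends right after the CR
print(decode_crlf_naive(chunks))
print(decode_crlf_carry(chunks))
```

With the carry version the result does not depend on where the chunk
boundaries fall, which is the property one would want from any decoder that
works through a fixed-size buffer.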

    Juanma







* bug#870: Repeatable instance of bug#870
       [not found] ` <4961E7F7.2000509@gnu.org>
  2009-01-05 11:12   ` Juanma Barranquero
       [not found]   ` <f7ccd24b0901050312r10286531q1c19da99d1779447@mail.gmail.com>
@ 2009-01-07  1:07   ` Kenichi Handa
       [not found]   ` <E1LKMsw-0005wG-G6@etlken.m17n.org>
  3 siblings, 0 replies; 33+ messages in thread
From: Kenichi Handa @ 2009-01-07  1:07 UTC (permalink / raw)
  To: Jason Rumney; +Cc: lekktu, 870, emacs-devel

In article <4961E7F7.2000509@gnu.org>, Jason Rumney <jasonr@gnu.org> writes:

> Juanma Barranquero wrote:
> >    emacs -Q --eval "(desktop-save-mode 1)" ChangeLog.870
> >   

> I can also reproduce the bug with C-x RET r utf-8-dos after visiting the 
> file normally.

I can reproduce it by that recipe.

> It appears that there is a bug in all the decode_coding_* functions when 
> a CR lies on a CHARBUF_SIZE (0x4000) boundary with a matching LF on the 
> other side of the boundary.

> They all do something like:

>       if (eol_crlf && c1 == '\r')
>         ONE_MORE_BYTE (byte_after_cr);

> but ONE_MORE_BYTE will abort the decode if it reaches the end of the 
> buffer, leaving the CR in limbo between having been read and being added 
> to the buffer. Then on decoding the subsequent block, the initial LF 
> does not trip the normal CRLF decoding, so it is put into the buffer.

??? decode_coding_* gets bytes from coding->source and
produces characters in CHARBUF.  So, I think the above
analysis is not correct.

As normal visiting of ChangeLog.870 doesn't have the problem
but revisiting it causes the problem, I think the bug is in
Finsert_file_contents; perhaps in the handling of REPLACE.
I'll have a look at it.

---
Kenichi Handa
handa@m17n.org







* bug#870: Repeatable instance of bug#870
       [not found]   ` <E1LKMsw-0005wG-G6@etlken.m17n.org>
@ 2009-01-07  6:53     ` Kenichi Handa
       [not found]     ` <E1LKSIW-00083J-BE@etlken.m17n.org>
  1 sibling, 0 replies; 33+ messages in thread
From: Kenichi Handa @ 2009-01-07  6:53 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: lekktu, emacs-devel, 870

In article <E1LKMsw-0005wG-G6@etlken.m17n.org>, Kenichi Handa <handa@m17n.org> writes:

> > It appears that there is a bug in all the decode_coding_* functions when 
> > a CR lies on a CHARBUF_SIZE (0x4000) boundary with a matching LF on the 
> > other side of the boundary.

> > They all do something like:

> >       if (eol_crlf && c1 == '\r')
> >         ONE_MORE_BYTE (byte_after_cr);

> > but ONE_MORE_BYTE will abort the decode if it reaches the end of the 
> > buffer, leaving the CR in limbo between having been read and being added 
> > to the buffer. Then on decoding the subsequent block, the initial LF 
> > does not trip the normal CRLF decoding, so it is put into the buffer.

> ??? decode_coding_* gets bytes from coding->source and
> produces characters in CHARBUF.  So, I think the above
> analysis is not correct.

> As normal visiting of ChangeLog.870 doesn't have the problem
> but revisiting it causes the problem, I think the bug is in
> Finsert_file_contents; perhaps in the handling of REPLACE.
> I'll have a look at it.

I fixed the bug.  Actually, what was wrong was decode_coding_*,
but in a different place than the one described above.

---
Kenichi Handa
handa@m17n.org







* bug#870: Repeatable instance of bug#870
       [not found]     ` <E1LKSIW-00083J-BE@etlken.m17n.org>
       [not found]       ` <f7ccd24b0901070143s394f66adq79a7a6ca2d25dea3@mail.gmail.com>
@ 2009-01-07  8:19       ` martin rudalics
  2009-01-07 12:29         ` Kenichi Handa
  2009-01-07  9:43       ` Juanma Barranquero
  2 siblings, 1 reply; 33+ messages in thread
From: martin rudalics @ 2009-01-07  8:19 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 870

 >> As normal visiting of ChangeLog.870 doesn't have the problem
 >> but revisiting it causes the problem, I think the bug is in
 >> Finsert_file_contents; perhaps in the handling of REPLACE.
 >> I'll have a look at it.
 >
 > I fixed the bug.  Actually, what was wrong was decode_coding_*,
 > but in a different place than the one described above.

Handa-san, while you're there could you please also have a look at
bug#1039?  Maybe it's related to the present issue.

Thank you, martin.







* bug#870: Repeatable instance of bug#870
       [not found]     ` <E1LKSIW-00083J-BE@etlken.m17n.org>
       [not found]       ` <f7ccd24b0901070143s394f66adq79a7a6ca2d25dea3@mail.gmail.com>
  2009-01-07  8:19       ` martin rudalics
@ 2009-01-07  9:43       ` Juanma Barranquero
  2 siblings, 0 replies; 33+ messages in thread
From: Juanma Barranquero @ 2009-01-07  9:43 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: emacs-devel, 870

On Wed, Jan 7, 2009 at 07:53, Kenichi Handa <handa@m17n.org> wrote:

> I fixed the bug.

Thanks! (I've been suffering this #$@!&* for the past eight months or so.)

I've added the "(Bug#870)" ref to your ChangeLog entry.

    Juanma







* bug#870: marked as done (Missing ^J in ChangeLog)
  2008-09-03 11:03           ` bug#870: Missing ^J in ChangeLog Juanma Barranquero
  2008-09-03 12:50             ` martin rudalics
@ 2009-01-07 11:00             ` Emacs bug Tracking System
  1 sibling, 0 replies; 33+ messages in thread
From: Emacs bug Tracking System @ 2009-01-07 11:00 UTC (permalink / raw)
  To: Jason Rumney

[-- Attachment #1: Type: text/plain, Size: 843 bytes --]


Your message dated Wed, 07 Jan 2009 18:54:10 +0800
with message-id <496489D2.8030902@gnu.org>
and subject line Re: bug#870: Repeatable instance of bug#870
has caused the Emacs bug report #870,
regarding Missing ^J in ChangeLog
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@emacsbugs.donarmstrong.com
immediately.)


-- 
870: http://emacsbugs.donarmstrong.com/cgi-bin/bugreport.cgi?bug=870
Emacs Bug Tracking System
Contact owner@emacsbugs.donarmstrong.com with problems

[-- Attachment #2: Type: message/rfc822, Size: 2696 bytes --]

From: "Juanma Barranquero" <lekktu@gmail.com>
To: submit@emacsbugs.donarmstrong.com
Subject: Missing ^J in ChangeLog
Date: Wed, 3 Sep 2008 13:03:17 +0200
Message-ID: <f7ccd24b0809030403q24d9eb9ey1c44cab6ea650dfb@mail.gmail.com>

Sometimes, while editing a ChangeLog file, one or several ^J disappear.

It could be related to doing other modifications to the ChangeLog
(reverting the buffer when there's outside changes while trying to
commit, for example), but so far there is no recipe to reproduce it.

The bug has been observed many times, and by at least two developers. It
seems to be confined to the Windows port.

emacs-devel discussion:

http://lists.gnu.org/archive/html/emacs-devel/2008-06/msg02050.html
http://lists.gnu.org/archive/html/emacs-devel/2008-07/msg00075.html



[-- Attachment #3: Type: message/rfc822, Size: 3394 bytes --]

From: Jason Rumney <jasonr@gnu.org>
To: Juanma Barranquero <lekktu@gmail.com>
Cc: Kenichi Handa <handa@m17n.org>, 870-done@emacsbugs.donarmstrong.com
Subject: Re: bug#870: Repeatable instance of bug#870
Date: Wed, 07 Jan 2009 18:54:10 +0800
Message-ID: <496489D2.8030902@gnu.org>

Juanma Barranquero wrote:
> Thanks! (I've been suffering this #$@!&* for the past eight months or so.)
>   

As usual, a repeatable test case helps a lot more than a mysterious 
occurrence that a few people have seen but no one can explain.

> I've added the "(Bug#870)" ref to your ChangeLog entry.
>   

And I've added "-done" to the bug-address Cc to close the bug report 
(and removed emacs-devel to cut down on the duplicates that end up on 
the list).



^ permalink raw reply	[flat|nested] 33+ messages in thread

* bug#870: Repeatable instance of bug#870
       [not found]           ` <f7ccd24b0901070301t221f906atf75f8632dcf1c41@mail.gmail.com>
@ 2009-01-07 11:10             ` Jason Rumney
  0 siblings, 0 replies; 33+ messages in thread
From: Jason Rumney @ 2009-01-07 11:10 UTC (permalink / raw)
  To: Juanma Barranquero; +Cc: 870, Kenichi Handa

Juanma Barranquero wrote:
> However, for the past few days (since 2008/01/04 or so)
> NNN-done@emacsbugs messages seem to be ignored, though messages to
> control@emacsbugs do work.
>   

I haven't noticed that - it seems to have worked in this case. Maybe 
something has been changed to automatically reopen reports when a 
subsequent mail is received on the original report address. If so, I 
think it is a degradation, as such messages are often background 
chit-chat about side issues. If someone really reports that the fix 
does not work, the bug can be reopened through the control address.







^ permalink raw reply	[flat|nested] 33+ messages in thread

* bug#870: Repeatable instance of bug#870
  2009-01-07  8:19       ` martin rudalics
@ 2009-01-07 12:29         ` Kenichi Handa
  2009-01-07 15:33           ` martin rudalics
  0 siblings, 1 reply; 33+ messages in thread
From: Kenichi Handa @ 2009-01-07 12:29 UTC (permalink / raw)
  To: martin rudalics; +Cc: 870

In article <4964657F.5010205@gmx.at>, martin rudalics <rudalics@gmx.at> writes:

>>> As normal visiting of ChangeLog.870 doesn't have the problem
>>> but revisiting it causes the problem, I think the bug is in
>>> Finsert_file_contents; perhaps in the handling of REPLACE.
>>> I'll have a look at it.
> 
> I fixed the bug.  Actually, what was wrong was decode_coding_*,
> but in a different place from the one suggested above.

> Handa-san, while you're there could you please also have a look at
> bug#1039?  Maybe it's related to the present issue.

I installed a fix.  It was a different issue.

2009-01-07  Kenichi Handa  <handa@m17n.org>

	* fileio.c (Finsert_file_contents): In the case of replace,
	remember the coding system used for decoding in
	coding_system (Bug#1039).

---
Kenichi Handa
handa@m17n.org








^ permalink raw reply	[flat|nested] 33+ messages in thread

* bug#870: Repeatable instance of bug#870
  2009-01-07 12:29         ` Kenichi Handa
@ 2009-01-07 15:33           ` martin rudalics
  2009-01-13  2:30             ` Kenichi Handa
  0 siblings, 1 reply; 33+ messages in thread
From: martin rudalics @ 2009-01-07 15:33 UTC (permalink / raw)
  To: Kenichi Handa; +Cc: 870

 > I installed a fix.  It was a different issue.
 >
 > 2009-01-07  Kenichi Handa  <handa@m17n.org>
 >
 > 	* fileio.c (Finsert_file_contents): In the case of replace,
 > 	remember the coding system used for decoding in
 > 	coding_system (Bug#1039).

Thanks for taking care of this.  Your fix solves the problem for me
though I'm not sure whether it fixes the issue raised by Peter:

 > That patch fixes the bug I reported, but it creates a new one: if you
 > change the EOL convention outside of emacs, revert-buffer no longer
 > detects this. To reproduce:
 > printf "hello\r\nworld\r\n" > hello
 > emacs -Q hello &
 > printf "hello\rworld\r" > hello
 > M-x revert-buffer
 > # emacs still sees DOS newlines

In particular, when I visit a file, (1) save it with a different line
ending, (2) change the line ending outside this instance of Emacs, and
(3) revert the buffer, its line ending is the one saved in (1) and not
the one from (2).  But IIUC Emacs 22 didn't handle this either.
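
For anyone repeating Peter's recipe, the line-ending convention of the
two versions of the file can be checked outside Emacs. The following is
only a rough Python sketch: the function name detect_eol and its labels
are made up for illustration, and it is not how Emacs itself detects
EOL formats.

```python
def detect_eol(data: bytes) -> str:
    """Classify line endings as 'dos', 'mac', 'unix', 'mixed' or 'none'."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # bare CR, not followed by LF
    lf = data.count(b"\n") - crlf   # bare LF, not preceded by CR
    kinds = [name for name, n in (("dos", crlf), ("mac", cr), ("unix", lf)) if n]
    if not kinds:
        return "none"
    return kinds[0] if len(kinds) == 1 else "mixed"


# The byte sequences written by the two printf commands in the recipe.
first = b"hello\r\nworld\r\n"
second = b"hello\rworld\r"
print(detect_eol(first), detect_eol(second))
```

The first file should be classified as DOS (CRLF) and the second as
old-Mac (bare CR), which is the change revert-buffer is expected to
notice.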

martin






^ permalink raw reply	[flat|nested] 33+ messages in thread

* bug#870: Repeatable instance of bug#870
  2009-01-07 15:33           ` martin rudalics
@ 2009-01-13  2:30             ` Kenichi Handa
  2009-01-13  4:06               ` Eli Zaretskii
  0 siblings, 1 reply; 33+ messages in thread
From: Kenichi Handa @ 2009-01-13  2:30 UTC (permalink / raw)
  To: martin rudalics; +Cc: 870

In article <4964CB64.2090506@gmx.at>, martin rudalics <rudalics@gmx.at> writes:

> Thanks for taking care of this.  Your fix solves the problem for me
> though I'm not sure whether it fixes the issue raised by Peter:

> That patch fixes the bug I reported, but it creates a new one: if you
> change the EOL convention outside of emacs, revert-buffer no longer
> detects this. To reproduce:
> printf "hello\r\nworld\r\n" > hello
> emacs -Q hello &
> printf "hello\rworld\r" > hello
> M-x revert-buffer
> # emacs still sees DOS newlines

As I can't reproduce the above problem, I think the bug is fixed.

> In particular, when I visit a file, (1) save it with a different line
> ending, (2) change the line ending outside this instance of Emacs, and
> (3) revert the buffer, its line ending is the one saved in (1) and not
> the one from (2).  But IIUC Emacs 22 didn't handle this either.

By (1), the variable buffer-file-coding-system-explicit is
set to XXX, and, in such a case, revert-buffer binds
coding-system-for-read to XXX to respect your decision made
by (1).

I'm not sure this behavior is a bug.
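
To make the rule concrete, here is a toy Python model of the decision
described above. It is not Emacs code: the function coding_for_revert
and its arguments are invented for illustration, and only the names
buffer-file-coding-system-explicit and coding-system-for-read come from
the description.

```python
from typing import Optional


def coding_for_revert(explicit: Optional[str], detected: str) -> str:
    """Return the coding system a revert would decode with, in this toy model.

    explicit: stands in for buffer-file-coding-system-explicit
              (None if the user never chose a coding system).
    detected: what automatic detection would pick for the file on disk.
    """
    # An explicit choice made when saving is respected, playing the role
    # of binding coding-system-for-read during the revert.
    if explicit is not None:
        return explicit
    # Otherwise fall back to whatever detection finds in the file.
    return detected


# Step (1) saved the file as DOS; step (2) rewrote it with bare CR outside.
print(coding_for_revert("utf-8-dos", "utf-8-mac"))
# No explicit choice was made, so the on-disk format wins.
print(coding_for_revert(None, "utf-8-mac"))
```

In this model the first call keeps the format chosen in (1), and the
second follows the file as changed in (2).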

---
Kenichi Handa
handa@m17n.org






^ permalink raw reply	[flat|nested] 33+ messages in thread

* bug#870: Repeatable instance of bug#870
  2009-01-13  2:30             ` Kenichi Handa
@ 2009-01-13  4:06               ` Eli Zaretskii
  0 siblings, 0 replies; 33+ messages in thread
From: Eli Zaretskii @ 2009-01-13  4:06 UTC (permalink / raw)
  To: Kenichi Handa, 870; +Cc: bug-gnu-emacs

> From: Kenichi Handa <handa@m17n.org>
> Date: Tue, 13 Jan 2009 11:30:16 +0900
> Cc: 870@emacsbugs.donarmstrong.com
> 
> > In particular, when I visit a file, (1) save it with a different line
> > ending, (2) change the line ending outside this instance of Emacs, and
> > (3) revert the buffer, its line ending is the one saved in (1) and not
> > the one from (2).  But IIUC Emacs 22 didn't handle this either.
> 
> By (1), the variable buffer-file-coding-system-explicit is
> set to XXX, and, in such a case, revert-buffer binds
> coding-system-for-read to XXX to respect your decision made
> by (1).
> 
> I'm not sure this behavior is a bug.

It isn't.






^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2009-01-13  4:06 UTC | newest]

Thread overview: 33+ messages
     [not found] <f7ccd24b0901042103u5b241a60u7842ed51ca9249fb@mail.gmail.com>
2009-01-05 10:59 ` bug#870: Repeatable instance of bug#870 Jason Rumney
     [not found] ` <4961E7F7.2000509@gnu.org>
2009-01-05 11:12   ` Juanma Barranquero
     [not found]   ` <f7ccd24b0901050312r10286531q1c19da99d1779447@mail.gmail.com>
2009-01-05 11:22     ` Jason Rumney
     [not found]     ` <4961ED68.1090609@gnu.org>
2009-01-05 11:31       ` Juanma Barranquero
     [not found]       ` <f7ccd24b0901050331w4d35bb66ue2323dde8c8ac6a2@mail.gmail.com>
2009-01-05 13:50         ` Jason Rumney
     [not found]         ` <4962100E.4060808@gnu.org>
2009-01-05 14:28           ` Juanma Barranquero
2009-01-07  1:07   ` Kenichi Handa
     [not found]   ` <E1LKMsw-0005wG-G6@etlken.m17n.org>
2009-01-07  6:53     ` Kenichi Handa
     [not found]     ` <E1LKSIW-00083J-BE@etlken.m17n.org>
     [not found]       ` <f7ccd24b0901070143s394f66adq79a7a6ca2d25dea3@mail.gmail.com>
     [not found]         ` <496489D2.8030902@gnu.org>
2008-09-03 11:03           ` bug#870: Missing ^J in ChangeLog Juanma Barranquero
2008-09-03 12:50             ` martin rudalics
2008-09-03 15:20               ` Juanma Barranquero
2008-10-22 15:14                 ` Juanma Barranquero
2008-10-22 19:45                   ` Eli Zaretskii
2008-10-22 20:15                     ` Lennart Borgman (gmail)
2008-10-22 21:08                       ` Eli Zaretskii
2008-10-22 21:22                         ` Lennart Borgman (gmail)
2008-10-22 22:06                           ` Eli Zaretskii
2008-10-22 22:26                       ` Lennart Borgman
2008-10-22 23:10                         ` Lennart Borgman
2008-10-22 21:58                     ` Juanma Barranquero
2008-10-22 22:17                       ` Eli Zaretskii
2008-10-22 23:32                         ` Juanma Barranquero
2008-10-22 23:41                           ` Juanma Barranquero
2008-10-23  0:39                             ` Lennart Borgman
2008-10-23 13:34                               ` Juanma Barranquero
2009-01-07 11:00             ` bug#870: marked as done (Missing ^J in ChangeLog) Emacs bug Tracking System
     [not found]           ` <f7ccd24b0901070301t221f906atf75f8632dcf1c41@mail.gmail.com>
2009-01-07 11:10             ` bug#870: Repeatable instance of bug#870 Jason Rumney
2009-01-07  8:19       ` martin rudalics
2009-01-07 12:29         ` Kenichi Handa
2009-01-07 15:33           ` martin rudalics
2009-01-13  2:30             ` Kenichi Handa
2009-01-13  4:06               ` Eli Zaretskii
2009-01-07  9:43       ` Juanma Barranquero
