unofficial mirror of bug-gnu-emacs@gnu.org 
* Re: decode_eol and inconsistent EOL
       [not found] <E170n3M-0000tW-00@fencepost.gnu.org>
@ 2002-04-25 19:21 ` Eli Zaretskii
  2002-04-26 13:00   ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2002-04-25 19:21 UTC (permalink / raw)
  Cc: bug-emacs

> From: Stephen Gildea <gildea@stop.mail-abuse.org>
> Date: Thu, 25 Apr 2002 13:29:05 -0400
> 
> I'd like to see a cleverer setting of the eol_type.  If almost all of
> the lines have a particular eol style, use that instead of falling back
> to CODING_EOL_LF after a single bad line.

That kind of heuristic is bound to trip some people, I think.  It's
difficult to set a threshold that will suit everyone.  It's also
difficult for users to set such a threshold, if we give them an
option.

Can't your problem be solved with "C-x RET c dos RET" before the
command that gives you the trouble?


* Re: decode_eol and inconsistent EOL
  2002-04-25 19:21 ` decode_eol and inconsistent EOL Eli Zaretskii
@ 2002-04-26 13:00   ` Stefan Monnier
  2002-04-26 14:37     ` Stefan Monnier
  2002-04-29  4:46     ` Eli Zaretskii
  0 siblings, 2 replies; 8+ messages in thread
From: Stefan Monnier @ 2002-04-26 13:00 UTC (permalink / raw)


>> I'd like to see a cleverer setting of the eol_type.  If almost all of
>> the lines have a particular eol style, use that instead of falling back
>> to CODING_EOL_LF after a single bad line.
> That kind of heuristic is bound to trip some people, I think.  It's
> difficult to set a threshold that will suit everyone.  It's also
> difficult for users to set such a threshold, if we give them an
> option.
> Can't your problem be solved with "C-x RET c dos RET" before the
> command that gives you the trouble?

I think all that we really care about is that the load+save trip is safe
and that the content of the Emacs buffer looks "as right as possible".

In Stephen's case, clearly CRCRLF should be considered as CR+eol and there
does not need to be any heuristic for that.  It's perfectly safe and can
only be the right choice for the internal representation.

I.e. if all LF are preceded by a CR we should use "dos-eol" whether or
not those CRs are sometimes preceded by other CRs.
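
To make the rule above concrete, here is a minimal sketch in C.  It is
not code from coding.c: the function name every_lf_follows_cr and its
signature are made up for illustration, and it assumes the whole file is
available in one buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of coding.c: return true when BUF
   contains at least one LF and every LF is immediately preceded by
   a CR.  Extra CRs in front of a CR LF pair (as in CR CR LF) do not
   change the answer.  */
bool
every_lf_follows_cr (const char *buf, size_t len)
{
  bool seen_lf = false;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    if (buf[i] == '\n')
      {
        if (i == 0 || buf[i - 1] != '\r')
          return false;         /* bare LF: not a dos-style file */
        seen_lf = true;
      }
  return seen_lf;
}
```

With this predicate, a buffer holding "a CR LF b CR CR LF" qualifies as
dos-style, while a single bare LF anywhere disqualifies it.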


        Stefan


* Re: decode_eol and inconsistent EOL
  2002-04-26 13:00   ` Stefan Monnier
@ 2002-04-26 14:37     ` Stefan Monnier
  2002-04-29  4:46     ` Eli Zaretskii
  1 sibling, 0 replies; 8+ messages in thread
From: Stefan Monnier @ 2002-04-26 14:37 UTC (permalink / raw)


>>>>> "Stefan" == Stefan Monnier <monnier+gnu.emacs.bug/news/@flint.cs.yale.edu> writes:
> I think all that we really care about is that the load+save trip is safe
> and that the content of the Emacs buffer looks "as right as possible".

I.e. I suggest the patch below which makes Emacs accept lone CRs inside
dos-style files.  It should fix the problem for Stephen while still
guaranteeing a safe load+save round trip.


        Stefan


--- coding.c.~1.241.~	Tue Apr 16 10:15:28 2002
+++ coding.c	Fri Apr 26 09:09:16 2002
@@ -3173,11 +3173,6 @@
 	      ONE_MORE_BYTE (c);
 	      if (c != '\n')
 		{
-		  if (coding->mode & CODING_MODE_INHIBIT_INCONSISTENT_EOL)
-		    {
-		      coding->result = CODING_FINISH_INCONSISTENT_EOL;
-		      goto label_end_of_loop;
-		    }
 		  src--;
 		  c = '\r';
 		}
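
The effect of removing that branch can be modelled outside Emacs with a
small standalone sketch.  Everything here is an assumption made for
illustration: the enum, the function decode_dos_eol and its `strict'
flag are hypothetical stand-ins for CODING_MODE_INHIBIT_INCONSISTENT_EOL
and CODING_FINISH_INCONSISTENT_EOL, and the handling of a bare LF (which
the patch does not touch) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes, loosely modelled on CODING_FINISH_*.  */
enum eol_result { EOL_FINISH_NORMAL, EOL_FINISH_INCONSISTENT };

/* Decode dos-style EOLs from SRC (LEN bytes) into DST, which must have
   room for LEN bytes.  CR LF becomes LF.  A CR not followed by LF is
   either reported as inconsistent (STRICT, like the removed branch) or
   copied through unchanged (like the code after the patch).  The number
   of bytes written is stored in *OUT_LEN.  */
enum eol_result
decode_dos_eol (const char *src, size_t len, char *dst, size_t *out_len,
                bool strict)
{
  size_t n = 0;

  assert (src != NULL && dst != NULL && out_len != NULL);
  for (size_t i = 0; i < len; i++)
    {
      char c = src[i];

      if (c == '\r')
        {
          if (i + 1 < len && src[i + 1] == '\n')
            {
              c = '\n';
              i++;              /* consume the LF of the CR LF pair */
            }
          else if (strict)
            {
              *out_len = n;
              return EOL_FINISH_INCONSISTENT;
            }
          /* Otherwise keep the lone CR as an ordinary character.  */
        }
      dst[n++] = c;
    }
  *out_len = n;
  return EOL_FINISH_NORMAL;
}
```

Called with strict set, the sketch stops at the first lone CR; with
strict clear, "a CR CR LF" decodes to "a CR LF", i.e. the stray CR stays
visible in the buffer and only the CR LF pair is treated as an eol.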


* Re: decode_eol and inconsistent EOL
  2002-04-26 13:00   ` Stefan Monnier
  2002-04-26 14:37     ` Stefan Monnier
@ 2002-04-29  4:46     ` Eli Zaretskii
  2002-04-29 12:59       ` Stefan Monnier
  1 sibling, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2002-04-29  4:46 UTC (permalink / raw)
  Cc: gnu-emacs-bug


On 26 Apr 2002, Stefan Monnier wrote:

> I think all that we really care about is that the load+save trip is safe
> and that the content of the Emacs buffer looks "as right as possible".

The second part is the issue here: it's open to interpretation.


* Re: decode_eol and inconsistent EOL
  2002-04-29  4:46     ` Eli Zaretskii
@ 2002-04-29 12:59       ` Stefan Monnier
  2002-04-29 18:00         ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2002-04-29 12:59 UTC (permalink / raw)
  Cc: Stefan Monnier, gnu-emacs-bug

> > I think all that we really care about is that the load+save trip is safe
> > and that the content of the Emacs buffer looks "as right as possible".
> 
> The second part is the issue here: it's open to interpretation.

I believe we've lost track of the problem.
Do you agree with the patch below ?

All it does is this: if the auto-detection of eol has decided to
use dos-style eol, then we use dos-style eols even if the file has some
stray CR characters.  Of course, if there is an LF in the file which
is not preceded by a CR, we revert to unix-style eol, as before.

It is safe and does not change any heuristic.  I don't think it's
"open to interpretation" because it only changes the behavior when
there are CRLFs in the file (otherwise the auto-detection would not
have chosen dos-style eol) and when every LF is preceded by a CR,
which is a pretty clear indication that we're dealing with a dos-style
file.
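
The "safe" part of this argument can be checked with a small standalone
sketch.  It is only a model, not Emacs code: dos_decode, dos_encode and
dos_round_trip_ok are hypothetical names, and the fixed-size scratch
buffers are an assumption to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Lenient dos-style decode: CR LF -> LF, any other CR is kept.
   DST needs room for LEN bytes.  Returns the decoded length.  */
size_t
dos_decode (const char *src, size_t len, char *dst)
{
  size_t n = 0;

  for (size_t i = 0; i < len; i++)
    {
      if (src[i] == '\r' && i + 1 < len && src[i + 1] == '\n')
        i++;                    /* skip the CR; the LF is copied below */
      dst[n++] = src[i];
    }
  return n;
}

/* Dos-style encode: LF -> CR LF.  DST needs room for 2 * LEN bytes.
   Returns the encoded length.  */
size_t
dos_encode (const char *src, size_t len, char *dst)
{
  size_t n = 0;

  for (size_t i = 0; i < len; i++)
    {
      if (src[i] == '\n')
        dst[n++] = '\r';
      dst[n++] = src[i];
    }
  return n;
}

/* True when decoding and re-encoding reproduces FILE byte for byte.  */
bool
dos_round_trip_ok (const char *file, size_t len)
{
  char buf[256], out[512];

  assert (len <= sizeof buf);
  size_t n = dos_decode (file, len, buf);
  size_t m = dos_encode (buf, n, out);
  return m == len && memcmp (out, file, len) == 0;
}
```

In this model a file with stray CRs but no bare LF survives the round
trip unchanged, whereas a file containing a bare LF does not, which is
why the revert to unix-style eol is kept for that case.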


	Stefan


--- coding.c.~1.241.~	Tue Apr 16 10:15:28 2002
+++ coding.c	Fri Apr 26 09:09:16 2002
@@ -3173,11 +3173,6 @@
 	      ONE_MORE_BYTE (c);
 	      if (c != '\n')
 		{
-		  if (coding->mode & CODING_MODE_INHIBIT_INCONSISTENT_EOL)
-		    {
-		      coding->result = CODING_FINISH_INCONSISTENT_EOL;
-		      goto label_end_of_loop;
-		    }
 		  src--;
 		  c = '\r';
 		}


* Re: decode_eol and inconsistent EOL
  2002-04-29 12:59       ` Stefan Monnier
@ 2002-04-29 18:00         ` Eli Zaretskii
  2002-04-29 18:11           ` Stefan Monnier
  0 siblings, 1 reply; 8+ messages in thread
From: Eli Zaretskii @ 2002-04-29 18:00 UTC (permalink / raw)
  Cc: gnu-emacs-bug

> From: "Stefan Monnier" <monnier+gnu/emacs/bug@rum.cs.yale.edu>
> Date: Mon, 29 Apr 2002 08:59:44 -0400
> 
> I believe we've lost track of the problem.
> Do you agree with the patch below ?

I'm not sure I do.

> It is safe and does not change any heuristic.  I don't think it's
> "open to interpretation" because it only changes the behavior when
> there are CRLFs in the file (otherwise the auto-detection would not
> have chosen dos-style eol)

Let me remind you that auto-detection only examines 3 lines before it
decides.


* Re: decode_eol and inconsistent EOL
  2002-04-29 18:00         ` Eli Zaretskii
@ 2002-04-29 18:11           ` Stefan Monnier
  2002-04-29 19:08             ` Eli Zaretskii
  0 siblings, 1 reply; 8+ messages in thread
From: Stefan Monnier @ 2002-04-29 18:11 UTC (permalink / raw)
  Cc: monnier+gnu/emacs/bug, gnu-emacs-bug

> > From: "Stefan Monnier" <monnier+gnu/emacs/bug@rum.cs.yale.edu>
> > Date: Mon, 29 Apr 2002 08:59:44 -0400
> > 
> > I believe we've lost track of the problem.
> > Do you agree with the patch below ?
> 
> I'm not sure I do.
> 
> > It is safe and does not change any heuristic.  I don't think it's
> > "open to interpretation" because it only changes the behavior when
> > there are CRLFs in the file (otherwise the auto-detection would not
> > have chosen dos-style eol)
> 
> Let me remind you that auto-detection only examines 3 lines before it
> decides.

So what ?
It still means that dos is only used if the first three lines are terminated
by CRLF.  Why is it better to use unix-eol rather than dos-eol when
the file has:
- at least 3 CRLF.
- no LF without a preceding CR.
- some lone CRs after the first three lines.
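
The kind of file described in this list can be fed to a simplified model
of the detection step.  This is a sketch under assumptions, not the
detection code in coding.c: the names detect_eol and EOL_CHECK_LINES are
hypothetical, the limit of 3 is taken from the "3 lines" mentioned
above, and mac-style (CR only) line ends are ignored.

```c
#include <assert.h>
#include <stddef.h>

enum eol_type { EOL_UNDECIDED, EOL_LF, EOL_CRLF, EOL_INCONSISTENT };

/* Hypothetical limit: how many line ends are examined before deciding.  */
#define EOL_CHECK_LINES 3

/* Look at the first EOL_CHECK_LINES line ends in BUF and report whether
   they are all LF, all CR LF, or mixed.  Anything after the examined
   lines is not looked at.  */
enum eol_type
detect_eol (const char *buf, size_t len)
{
  enum eol_type type = EOL_UNDECIDED;
  int lines = 0;

  assert (buf != NULL || len == 0);
  for (size_t i = 0; i < len && lines < EOL_CHECK_LINES; i++)
    {
      if (buf[i] != '\n')
        continue;

      enum eol_type this_eol
        = (i > 0 && buf[i - 1] == '\r') ? EOL_CRLF : EOL_LF;

      if (type == EOL_UNDECIDED)
        type = this_eol;
      else if (type != this_eol)
        return EOL_INCONSISTENT;
      lines++;
    }
  return type;
}
```

In this model a file whose first three lines end in CR LF is detected as
dos-style regardless of what follows, be it lone CRs or a bare LF; the
later content only matters at decoding time.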


	Stefan


* Re: decode_eol and inconsistent EOL
  2002-04-29 18:11           ` Stefan Monnier
@ 2002-04-29 19:08             ` Eli Zaretskii
  0 siblings, 0 replies; 8+ messages in thread
From: Eli Zaretskii @ 2002-04-29 19:08 UTC (permalink / raw)
  Cc: gnu-emacs-bug

> From: "Stefan Monnier" <monnier+gnu/emacs/bug@rum.cs.yale.edu>
> Date: Mon, 29 Apr 2002 14:11:59 -0400
> 
> Why is it better to use unix-eol rather than dos-eol when
> the file has:
> - at least 3 CRLF.
> - no LF without a preceding CR.
> - some lone CRs after the first three lines.

Because unix-eol (a.k.a. no EOL conversion) shows you the exact
contents of the file, without hiding parts of it.

I'm not saying that this is _always_ better, but I certainly don't see
why the change you suggest would result in better behavior.  Each one
has its merits and demerits, I think.


