Re: EOL: unix/dos/mac - Eli Zaretskii

From: Eli Zaretskii <eliz@gnu.org>
To: "Stephen J. Turnbull" <stephen@xemacs.org>
Cc: per.starback@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
Subject: Re: EOL: unix/dos/mac
Date: Tue, 26 Mar 2013 15:07:21 +0200
Message-ID: <83obe6z4vq.fsf@gnu.org>
In-Reply-To: <871ub2crhm.fsf@uwakimon.sk.tsukuba.ac.jp>

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: per.starback@gmail.com,
>     monnier@iro.umontreal.ca,
>     emacs-devel@gnu.org
> Date: Tue, 26 Mar 2013 20:47:33 +0900
> 
>  > > Trying to support multiple EOL codings in the buffer is craziness.
>  > 
>  > But it's the only way to be 100% sure you don't introduce spurious
>  > changes into files.  And since newlines, unlike characters, are not
>  > displayed, there are no issues with fonts etc. here.
> 
> Currently NLFs *are* displayed, if they don't match the default for
> the buffer.

No, they are displayed because nothing other than a single LF is
treated as an NLF by the Emacs internals.  EOL conversion is a layer
on top of that; the buffer maintenance and the display engine know
absolutely nothing about it.

Once these byte sequences are recognized as NLFs, they will not be
displayed, because that's how the Emacs display works.
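
A toy model of that layering may help. It is only a sketch under
stated assumptions, not the actual Emacs code: the names detect_eol,
decode_eol and display_lines are hypothetical, and the detection rule
(convert only when the whole file agrees on one convention) is a
simplification.  The "display" step knows about LF alone, so any CR
that survives decoding shows up as ^M.

```python
def detect_eol(raw: bytes) -> str:
    """Return 'dos' or 'mac' only if every line ending agrees, else 'unix'."""
    crlf = raw.count(b"\r\n")
    lone_cr = raw.count(b"\r") - crlf
    lone_lf = raw.count(b"\n") - crlf
    if crlf and not lone_cr and not lone_lf:
        return "dos"
    if lone_cr and not crlf and not lone_lf:
        return "mac"
    return "unix"


def decode_eol(raw: bytes):
    """Conversion layer: map the detected convention to a single LF."""
    kind = detect_eol(raw)
    if kind == "dos":
        return raw.replace(b"\r\n", b"\n"), kind
    if kind == "mac":
        return raw.replace(b"\r", b"\n"), kind
    return raw, kind  # mixed or plain LF: bytes are left untouched


def display_lines(buf: bytes):
    """'Display engine': only LF ends a line; a leftover CR is drawn as ^M."""
    return [line.replace(b"\r", b"^M").decode("ascii")
            for line in buf.split(b"\n")]


consistent, _ = decode_eol(b"a\r\nb\r\n")
mixed, _ = decode_eol(b"a\r\nb\n")
print(display_lines(consistent))
print(display_lines(mixed))
```

In the mixed case the conversion layer does nothing, so the CR reaches
the display step and is shown; teaching the display step about more
NLFs is what would make it disappear.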

>  > > Doing it only for EOLs would be much less painful, but it's not
>  > > worth it.
>  > 
>  > Please explain why you think it isn't worth it.
> 
> Because you have to fix pretty much everything

I'm probably missing something important, because things I think will
need fixing are nowhere near "pretty much everything".  How about
posting a long enough list of things to fix to convince me that
"pretty much everything" is close to the truth?

> new syntax will be required for stuff like zap-to-char

Why?

> and nearly required for regexps.

For $ we will need to get regex.c to support the additional NLFs, and
that's all.  If you mean a literal \n in regexps, then yes, something
will have to be done with that.  But it would be a good thing in its
own right, because Emacs will come closer to supporting the Unicode
Standard Annexes.
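
To illustrate what an NLF-aware "$" would mean, here is a sketch in
Python's re module rather than in regex.c, so it is an analogy and not
a proposed patch.  The NLF set below (CRLF, LF, CR, VT, FF, NEL, LS,
PS) is taken from the Unicode newline guidelines; the helper name
last_words is hypothetical.

```python
import re

# Alternation of newline functions; CRLF is listed first so it is
# preferred over a lone CR.
NLF = r"\r\n|[\n\r\x0b\x0c\x85\u2028\u2029]"

# A lookahead that plays the role of "$": succeeds before any NLF or at
# the very end of the text.
EOL = r"(?=" + NLF + r"|\Z)"


def last_words(text):
    """Return the word that ends each line, for any NLF."""
    return re.findall(r"\w+" + EOL, text)


sample = "one two\r\nthree\u2028four\x85five\nsix"
print(last_words(sample))
# For comparison, the stock "$" only recognizes LF and end of text:
print(re.findall(r"\w+$", sample, re.MULTILINE))
```

The stock "$" misses the lines ended by CRLF, LS and NEL, which is the
gap the extra NLF support would close.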

> Code will be massively uglified with tests for variable-length
> sequences instead of single characters

The code is already replete with that, ever since Emacs started using
a multi-byte representation for characters in buffers.  We have a set
of macros to fetch and examine multi-byte sequences, for that reason.
I see nothing hard or "ugly" here, sorry.
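
The kind of test being discussed can be sketched as follows.  This is
an illustration in Python over UTF-8 bytes, not the Emacs macros
themselves (the internal Emacs representation is a UTF-8 superset, but
nothing else here is taken from its source); nlf_length_at and
count_line_terminators are hypothetical names.

```python
# UTF-8 encodings of the NLFs, longest match first so that CRLF wins
# over a lone CR.
NLF_BYTES = (
    b"\r\n",          # CRLF
    b"\xe2\x80\xa8",  # U+2028 LINE SEPARATOR
    b"\xe2\x80\xa9",  # U+2029 PARAGRAPH SEPARATOR
    b"\xc2\x85",      # U+0085 NEXT LINE
    b"\n",            # LF
    b"\r",            # CR
)


def nlf_length_at(buf: bytes, pos: int) -> int:
    """Length in bytes of the NLF starting at pos, or 0 if there is none."""
    for seq in NLF_BYTES:
        if buf.startswith(seq, pos):
            return len(seq)
    return 0


def count_line_terminators(buf: bytes) -> int:
    """Walk the buffer once, skipping over each NLF as a unit."""
    pos = count = 0
    while pos < len(buf):
        n = nlf_length_at(buf, pos)
        if n:
            count += 1
            pos += n
        else:
            pos += 1
    return count


text = "a\r\nb\u2028c\x85d\ne\r".encode("utf-8")
print(count_line_terminators(text))
```

The test is the same shape as fetching a multi-byte character: look at
the lead byte, decide how many bytes belong together, advance by that
many.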

> everything from motion to insdel will have to be modified

Why?

> Any code handling old-style hidden lines (with CR marking
> "invisible" lines) will have to be changed.

First, we want to deprecate and remove this feature anyway (there's
already an implemented alternative).  And second, we already handle
this today so that we don't display ^M there; the same method can be
used for the other NLFs.

> It's not obvious to me that there are no counterintuitive
> implications.  Opposed to that, there are very few text files with
> mixed line endings, and in many cases the user would actually like to
> have them regularized (at a time of their choosing, so they can have a
> commit with only whitespace changes, for example).

We should be consistent: either there is a problem with mixed line
endings and with Unicode NLFs that aren't treated as EOL at all, or
there isn't.  If the problem is insignificant, perhaps nothing should
be changed at all.  If the problem _is_ significant, we might as well
solve it The Right Way, instead of applying more and more band-aids.
Conversion of NLFs to a single LF is a kludge, same as emptying the
kettle when you already have a procedure for preparing a kettle of
boiled water starting with an empty one.  You cannot do such
conversion efficiently if you need to discover the EOL format for
every line.  Dispensing with the conversion altogether solves both
problems in one go.  What it adds doesn't seem so frightening to me,
certainly less so than, say, adding bidi support ;-)
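
The difference between the two approaches can be shown on a file with
mixed line endings.  The sketch below is a simplified model, assuming
only CRLF, CR and LF occur; split_lines, edit_line_preserving,
edit_line_normalizing and changed_lines are hypothetical helpers, not
Emacs functions.

```python
import re

# Each match is one line together with its own terminator; a final
# unterminated line is matched by the second alternative.
LINE = re.compile(r"[^\r\n]*(?:\r\n|\r|\n)|[^\r\n]+\Z")


def split_lines(text):
    return LINE.findall(text)


def edit_line_preserving(text, index, new_body):
    """Replace one line's body and keep every terminator as found."""
    lines = split_lines(text)
    old = lines[index]
    terminator = old[len(old.rstrip("\r\n")):]
    lines[index] = new_body + terminator
    return "".join(lines)


def edit_line_normalizing(text, index, new_body, eol="\n"):
    """Convert all NLFs to LF, edit, then write one convention back."""
    bodies = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    bodies[index] = new_body
    return eol.join(bodies)


def changed_lines(old, new):
    """Number of lines (including terminators) that differ."""
    return sum(a != b for a, b in zip(split_lines(old), split_lines(new)))


original = "a\r\nb\nc\r\n"
print(changed_lines(original, edit_line_preserving(original, 1, "B")))
print(changed_lines(original, edit_line_normalizing(original, 1, "B")))
```

With the terminators kept, only the edited line differs from the
original; with conversion, every line whose ending was not the chosen
convention is rewritten as well.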

>  > Surely, going again through the pain of inadvertent changes to user
>  > files is a movie we don't want to be part of again.
> 
> What pain of inadvertent changes?  Sure, there will likely be bugs in
> the first draft of such code, what else is new?  If you're talking
> specifically about the \201 regression, that's a completely different
> issue AFAICT -- that was about buffer-as-unibyte exposing the
> *internal* representation to Lisp, which was a "Mr. Foot, may I
> introduce to you Mr. Bullet" kind of idea from Day 1.

The internal representation is still exposed, so nothing's changed in
that department.

>  > >  > Anything else _will_ introduce spurious modifications, and could
>  > >  > even corrupt some files, if the exact EOL sequence here or there
>  > >  > matters.
>  > > 
>  > > No, it need not, any more than any ambiguous encoding need do so.  Of
>  > > course it will be fragile if (for example) Emacs crashes and you have
>  > > to recover an autosave file.
>  > 
>  > It will be fragile, and subtle bugs will tend to break quite a bit.
> 
> I don't think so.

Well, then we will have to agree to disagree.

> I think you're hearing monsters in the closet.

And I think _you_ are hearing them.  Or maybe you will show me such a
large list of things that will become broken by keeping NLFs that I
will change my mind.
