From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: Eli Zaretskii <eliz@gnu.org>
Newsgroups: gmane.emacs.devel
Subject: Re: EOL: unix/dos/mac
Date: Tue, 26 Mar 2013 15:07:21 +0200
Message-ID: <83obe6z4vq.fsf@gnu.org>
References: <CADkQgvvX1hZV5QbMZ4UfzG5i9oyFJVQS6LirozHg6xayQdMc1g@mail.gmail.com>
	<jwv620fl2p8.fsf-monnier+emacs@gnu.org>
	<87ip4fc4xd.fsf@uwakimon.sk.tsukuba.ac.jp> <831ub21xpn.fsf@gnu.org>
	<87620ed2p1.fsf@uwakimon.sk.tsukuba.ac.jp> <83vc8ezh52.fsf@gnu.org>
	<871ub2crhm.fsf@uwakimon.sk.tsukuba.ac.jp>
Reply-To: Eli Zaretskii <eliz@gnu.org>
NNTP-Posting-Host: plane.gmane.org
X-Trace: ger.gmane.org 1364303247 28307 80.91.229.3 (26 Mar 2013 13:07:27 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 26 Mar 2013 13:07:27 +0000 (UTC)
Cc: per.starback@gmail.com, monnier@iro.umontreal.ca, emacs-devel@gnu.org
To: "Stephen J. Turnbull" <stephen@xemacs.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Mar 26 14:07:51 2013
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1UKTbQ-0004F2-87
	for ged-emacs-devel@m.gmane.org; Tue, 26 Mar 2013 14:07:48 +0100
Original-Received: from localhost ([::1]:47143 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1UKTb2-0006b3-9Y
	for ged-emacs-devel@m.gmane.org; Tue, 26 Mar 2013 09:07:24 -0400
Original-Received: from eggs.gnu.org ([208.118.235.92]:47143)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <eliz@gnu.org>) id 1UKTat-0006ar-Lo
	for emacs-devel@gnu.org; Tue, 26 Mar 2013 09:07:21 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <eliz@gnu.org>) id 1UKTao-0004XK-DZ
	for emacs-devel@gnu.org; Tue, 26 Mar 2013 09:07:15 -0400
Original-Received: from mtaout22.012.net.il ([80.179.55.172]:56463)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <eliz@gnu.org>)
	id 1UKTao-0004X3-0P
	for emacs-devel@gnu.org; Tue, 26 Mar 2013 09:07:10 -0400
Original-Received: from conversion-daemon.a-mtaout22.012.net.il by
	a-mtaout22.012.net.il (HyperSendmail v2007.08) id
	<0MK900B00R37NA00@a-mtaout22.012.net.il> for
	emacs-devel@gnu.org; Tue, 26 Mar 2013 15:07:08 +0200 (IST)
Original-Received: from HOME-C4E4A596F7 ([87.69.4.28]) by a-mtaout22.012.net.il
	(HyperSendmail v2007.08) with ESMTPA id
	<0MK900BWVR3W1L90@a-mtaout22.012.net.il>;
	Tue, 26 Mar 2013 15:07:08 +0200 (IST)
In-reply-to: <871ub2crhm.fsf@uwakimon.sk.tsukuba.ac.jp>
X-012-Sender: halo1@inter.net.il
X-detected-operating-system: by eggs.gnu.org: Solaris 10
X-Received-From: 80.179.55.172
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:158205
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/158205>

> From: "Stephen J. Turnbull" <stephen@xemacs.org>
> Cc: per.starback@gmail.com,
>     monnier@iro.umontreal.ca,
>     emacs-devel@gnu.org
> Date: Tue, 26 Mar 2013 20:47:33 +0900
> 
>  > > Trying to support multiple EOL codings in the buffer is craziness.
>  > 
>  > But it's the only way to be 100% sure you don't introduce spurious
>  > changes into files.  And since newlines, unlike characters, are not
>  > displayed, there's no issues with fonts etc. here.
> 
> Currently NLFs *are* displayed, if they don't match the default for
> the buffer.

No, they are displayed because nothing other than a single LF is
treated like NLF by the Emacs internals.  EOL conversion is a layer on
top of that; the buffer maintenance and the display engine know
absolutely nothing about it.

Once these byte sequences are recognized as NLFs, they will not be
displayed, because that's how the Emacs display works.

>  > > Doing it only for EOLs would be much less painful, but it's not
>  > > worth it.
>  > 
>  > Please explain why do you think it isn't worth it.
> 
> Because you have to fix pretty much everything

I'm probably missing something important, because things I think will
need fixing are nowhere near "pretty much everything".  How about
posting a long enough list of things to fix to convince me that
"pretty much everything" is close to the truth?

> new syntax will be required for stuff like zap-to-char

Why?

> and nearly required for regexps.

For $ we will need to get regex.c support the additional NLFs, and
that's all.  If you mean a literal \n in regexps, then yes, something
will have to be done with that.  But it would be a good thing on its
own right, because Emacs will come closer to supporting Unicode
standard annexes.

> Code will be massively uglified with tests for variable-length
> sequences instead of single characters

The code is already replete with that, ever since Emacs started using
a multi-byte representation for characters in buffers.  We have a set
of macros to fetch and examine multi-byte sequences, for that reason.
I see nothing hard or "ugly" here, sorry.

> everything from motion to insdel will have to be modified

Why?

> Any code handling old-style hidden lines (with CR marking
> "invisible" lines) will have to be changed.

First, we want to deprecate and remove this feature anyway (there's
already an implemented alternative).  And second, we already handle
this today so that we don't display ^M there; the same method can be
used for the other NLFs.

> It's not obvious to me that there are no counterintuitive
> implications.  Opposed to that, there are very few text files with
> mixed line endings, and in many cases the user would actually like to
> have them regularized (at a time of their choosing, so they can have a
> commit with only whitespace changes, for example).

We should be consistent: either there is a problem with mixed line
endings and with Unicode NLFs that aren't treated as EOL at all, or
there isn't.  If the problem is insignificant, perhaps nothing should
be changed at all.  If the problem _is_ significant, we might as well
solve it The Right Way, instead of applying more and more band-aid.
Conversion of NLFs to a single LF is a kludge, same as emptying the
kettle when you already have a procedure for preparing a kettle of
boiled water starting with an empty one.  You cannot do such
conversion efficiently if you need to discover the EOL format for
every line.  Dispensing with the conversion altogether solves both
problems in one go.  What it adds doesn't seem so frightening to me,
certainly less so than, say, adding bidi support ;-)

>  > Surely, going again through the pain of inadvertent changes to user
>  > files is a movie we don't want to be part of again.
> 
> What pain of inadvertant changes?  Sure, there will likely be bugs in
> the first draft of such code, what else is new?  If you're talking
> specifically about the \201 regression, that's a completely different
> issue AFAICT -- that was about buffer-as-unibyte exposing the
> *internal* representation to Lisp, which was a "Mr. Foot, may I
> introduce to you Mr. Bullet" kind of idea from Day 1.

The internal representation is still exposed, so nothing's changed in
that department.

>  > >  > Anything else _will_ introduce spurious modifications, and could
>  > >  > even corrupt some files, if the exact EOL sequence here or there
>  > >  > matters.
>  > > 
>  > > No, it need not, any more than any ambiguous encoding need do so.  Of
>  > > course it will be fragile if (for example) Emacs crashes and you have
>  > > to recover an autosave file.
>  > 
>  > It will be fragile, and subtle bugs will tend to break quite a bit.
> 
> I don't think so.

Well, then we will have agree to disagree.

> I think you're hearing monsters in the closet.

And I think _you_ are hearing them.  Or maybe you will show me such a
large list of things that will become broken by keeping NLFs that I
will change my mind.