From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!not-for-mail
From: "Stephen J. Turnbull" <stephen@xemacs.org>
Newsgroups: gmane.emacs.devel
Subject: Re: EOL: unix/dos/mac
Date: Wed, 27 Mar 2013 03:34:36 +0900
Message-ID: <87vc8eau2r.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <CADkQgvvX1hZV5QbMZ4UfzG5i9oyFJVQS6LirozHg6xayQdMc1g@mail.gmail.com>
	<jwv620fl2p8.fsf-monnier+emacs@gnu.org>
	<87ip4fc4xd.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20130326140247.GB4179@acm.acm>
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
X-Trace: ger.gmane.org 1364322945 29663 80.91.229.3 (26 Mar 2013 18:35:45 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 26 Mar 2013 18:35:45 +0000 (UTC)
Cc: Per =?utf-8?Q?Starb=C3=A4ck?= <per.starback@gmail.com>,
	Stefan Monnier <monnier@iro.umontreal.ca>, emacs-devel@gnu.org
To: Alan Mackenzie <acm@muc.de>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Tue Mar 26 19:36:07 2013
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([208.118.235.17])
	by plane.gmane.org with esmtp (Exim 4.69)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1UKYj6-0007QJ-UA
	for ged-emacs-devel@m.gmane.org; Tue, 26 Mar 2013 19:36:05 +0100
Original-Received: from localhost ([::1]:40552 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>)
	id 1UKYii-0003hN-Tn
	for ged-emacs-devel@m.gmane.org; Tue, 26 Mar 2013 14:35:40 -0400
Original-Received: from eggs.gnu.org ([208.118.235.92]:56082)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stephen@xemacs.org>) id 1UKYhl-0002QU-LR
	for emacs-devel@gnu.org; Tue, 26 Mar 2013 14:34:43 -0400
Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <stephen@xemacs.org>) id 1UKYhk-0004Na-9N
	for emacs-devel@gnu.org; Tue, 26 Mar 2013 14:34:41 -0400
Original-Received: from mgmt2.sk.tsukuba.ac.jp ([130.158.97.224]:52544)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stephen@xemacs.org>) id 1UKYhk-0004Mt-08
	for emacs-devel@gnu.org; Tue, 26 Mar 2013 14:34:40 -0400
Original-Received: from uwakimon.sk.tsukuba.ac.jp (uwakimon.sk.tsukuba.ac.jp
	[130.158.99.156])
	by mgmt2.sk.tsukuba.ac.jp (Postfix) with ESMTP id 3A48B97090B;
	Wed, 27 Mar 2013 03:34:37 +0900 (JST)
Original-Received: by uwakimon.sk.tsukuba.ac.jp (Postfix, from userid 1000)
	id F36CB1A3D97; Wed, 27 Mar 2013 03:34:36 +0900 (JST)
In-Reply-To: <20130326140247.GB4179@acm.acm>
X-Mailer: VM undefined under 21.5  (beta32) "habanero" b0d40183ac79 XEmacs
	Lucid (x86_64-unknown-linux)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6.x
X-Received-From: 130.158.97.224
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:158237
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/158237>

Alan Mackenzie writes:

 > This is a little confusing to poor old me.  ASCII doesn't care about line
 > breaks either; only particular use cases care.

True.  ASCII is a coded character set.  It does not have a way to
represent an abstract line break in a single character; whatever you
do, then, is outside of the ASCII standard.

 > If you write a script (whether bash, sed, ....) on a *nix system
 > and it has CRLF line ends, it will fail (with an obscure error
 > message) regardless of whether that script is nominally in UTF-8 or
 > ASCII or whatever.

Python, at least, is not in your ellipsis.  Not by default, and not on
any supported platform.  I wouldn't be surprised if Perl and Ruby have
adopted "universal newlines", too.
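
A minimal sketch of what that looks like in practice, assuming Python 3.2
or later.  The variable names and sample strings are made up for
illustration; only compile(), exec() and io.TextIOWrapper are real
library calls.

```python
import io

# Source text with CRLF line ends, as a Windows editor might save it.
source = "x = 1\r\ny = x + 1\r\n"

# compile() in 'exec' mode accepts Windows and old Mac newlines,
# so the CRLF line ends do not cause a syntax error.
namespace = {}
exec(compile(source, "<crlf-demo>", "exec"), namespace)

# Text-mode reading with newline=None enables universal newlines:
# CR, LF and CRLF in the input are all translated to "\n".
raw = io.BytesIO(b"one\r\ntwo\rthree\n")
lines = io.TextIOWrapper(raw, encoding="ascii", newline=None).readlines()
```

This only shows how Python itself treats the line ends; it says nothing
about how a shell or the kernel's #! handling treats a CR at the end of
the first line.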

 > In what sense does Unicode "not care"?

In the sense that Unicode is more than a character set; it prescribes
all kinds of algorithms for text processing as well.  Here, section
5.8 of the Unicode Standard v6.2 prescribes that any of LF, CR, CRLF,
and ISO 6429 NEXT LINE (U+0085) should be considered to be a single
line (or paragraph) break in legacy text.  It says nothing about how
they should be represented internally, though.  Unusually for the
Unicode Standard, it allows you to guess what the user wants, and in
some cases even alter the input stream before outputting it.
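
As a rough sketch of treating those four legacy sequences as one break
each (this is not the Unicode Standard's own algorithm, and the names
LEGACY_BREAK, split_legacy_lines and normalize_breaks are hypothetical):

```python
import re

# LF, CR, CRLF and NEL (U+0085).  CRLF is listed first so that the
# pair is matched as one break rather than as CR followed by LF.
LEGACY_BREAK = re.compile(r"\r\n|[\n\r\x85]")

def split_legacy_lines(text):
    """Split text at each legacy line break, dropping the break itself."""
    return LEGACY_BREAK.split(text)

def normalize_breaks(text, replacement="\n"):
    """Rewrite every legacy line break as a single replacement string."""
    return LEGACY_BREAK.sub(replacement, text)
```

For example, split_legacy_lines("a\r\nb\rc") gives three pieces, not
four.  Python's built-in str.splitlines() is broader than this: it also
splits on VT, FF, U+2028 and U+2029, among others.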

"Legacy" text means it uses ASCII (or C1) control characters to
represent line and/or paragraph breaks, rather than the characters
prescribed by Unicode (U+2028 LINE SEPARATOR and U+2029 PARAGRAPH
SEPARATOR).