From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Mon, 28 Sep 2015 00:04:36 +0300
Message-ID: <83twqfefq3.fsf@gnu.org>
In-reply-to: <56084FDF.704@cs.ucla.edu>
To: Paul Eggert
Cc: rustompmody@gmail.com, emacs-devel@gnu.org

> Cc: emacs-devel@gnu.org
> From: Paul Eggert
> Date: Sun, 27 Sep 2015 13:21:51 -0700
>
> Eli Zaretskii wrote:
> > This is unrelated: it specifies which character sequences should be
> > composed and displayed as a single grapheme cluster.
>
> Yes. It might be reasonable to replace some of those \u instances for
> readability, e.g.:
>
> -	 ("V" . "[\u0904-\u0914\u0960-\u0961\u0972]") ; independent vowel
> +	 ("V" . "[ऄ-औॠ-ॡॲ]") ; independent vowel

I'm not so sure this is a good idea: since most of us don't read Indic
scripts, leaving the codepoints there makes it easier to compare these
patterns with various relevant publications and standards on the
Internet.  If we make them characters instead, most of us will have to
use "C-x =" to see the codepoints anyway.

> But replacements would not be such a good idea for some of this code, e.g.:
>
> -	 ("H" . "\u094D")		; HALANT
> +	 ("H" . "्")		; HALANT
>
> as standalone combining characters are problematic on display, and here:
>
> -	 ("J" . "\u200D")		; ZWJ
> +	 ("J" . "‍")		; ZWJ
>
> where one can't easily see a zero width joiner when editing the
> source file.

Indeed.