From: Eli Zaretskii
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Fri, 12 Feb 2016 14:00:20 +0200
Message-ID: <83pow26svf.fsf@gnu.org>
References: <87mvr9wxqz.fsf@wanadoo.es> <87io1xwq1e.fsf@wanadoo.es>
 <87vb5wvzfz.fsf@mail.linkov.net> <87io1wt4cc.fsf@wanadoo.es>
 <8737syoima.fsf@mail.linkov.net> <871t8iu277.fsf@wanadoo.es>
 <83d1s28kvh.fsf@gnu.org> <87r3gis7sm.fsf@wanadoo.es>
 <83twle71xy.fsf@gnu.org> <87io1us0te.fsf@wanadoo.es>
Reply-To: Eli Zaretskii
To: Óscar Fuentes
Cc: emacs-devel@gnu.org
In-reply-to: <87io1us0te.fsf@wanadoo.es> (message from Óscar Fuentes on
 Fri, 12 Feb 2016 11:03:09 +0100)

> From: Óscar Fuentes
> Cc: emacs-devel@gnu.org
> Date: Fri, 12 Feb 2016 11:03:09 +0100
>
> Eli Zaretskii writes:
>
> >> If ñ is meant to be read as ñ
> >
> > Don't you see them displayed identically in Emacs (and in any other
> > program that correctly implements display of combining accents)?
> > Maybe I don't really understand that "if" part.
>
> They look a bit different here.

It could be an issue with your default font.  Perhaps it doesn't have
the precomposed glyph.

> ñ shall match ñ, but n shall not match either, from a Spaniard POV.

But in the case of 2 characters, a literal n is present in the buffer,
so not finding it would be a miss, don't you think?

> > Otherwise, I'm afraid I see
> > no sense in this logic: IMO identically looking text should match, or
> > else users will kill us.
>
> Agreed, although in practice your example is not a big issue since I do
> expect to rarely see ñ (the composed variant) used in Spanish text.
> And probably not easy to implement at all for the general case (all
> identical-looking combinations for all languages).

We do that by using the Unicode database, because then we are free
from the need to decide whether a given diacritic can or cannot
combine with a given base character.

> > If you agree that a match is TRT in these (and other similar) cases,
> > then you should agree that _some_ form of character folding should be
> > turned on by default.
>
> I see where you are coming from ;-) On my first message on this thread I
> said that I was ambivalent wrt the default status of this feature,
> before finding the n/ñ issue. Not so after. A Spaniard could also deem
> useful to match ú and ü while searching for u. See, the problem here is
> not character-folding itself, but how it works: a non-Spaniard could
> expect matching ñ while searching for n, because for him ñ is a `n' with
> a tilde, which is essentially the same case as the `u' example mentioned
> above but from the POV of someone who doesn't know Spanish. (*)

What about finding ⒜ when searching for a?  Don't you want to find
that?  This is not specific to any language.
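
For illustration only, here is a minimal sketch of the kind of folding
discussed above, driven by the Unicode database.  It is written in
Python with the standard unicodedata module and is not how Emacs
implements character folding; the helper names fold and folded_match
are hypothetical.  It assumes that folding means "compatibility-decompose
the text, then drop combining marks", which is one plausible reading of
the approach, not a statement of the actual algorithm.

```python
import unicodedata


def fold(text: str) -> str:
    """Hypothetical helper: reduce text to its base characters.

    NFKD decomposition comes from the Unicode database, so there is no
    hand-written table of which diacritic may combine with which base.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks (canonical combining class != 0).
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def folded_match(needle: str, haystack: str) -> bool:
    """Hypothetical helper: substring search on the folded forms."""
    return fold(needle) in fold(haystack)


precomposed = "\u00f1"   # LATIN SMALL LETTER N WITH TILDE, one code point
decomposed = "n\u0303"   # n followed by COMBINING TILDE, two code points

# The two spellings are canonically equivalent, so they fold the same way.
same_folding = fold(precomposed) == fold(decomposed)

# Language-independent folding also lets a plain n match either spelling,
# which is the behavior the Spanish-language objection is about.
n_matches_both = folded_match("n", precomposed) and folded_match("n", decomposed)

# U+249C PARENTHESIZED LATIN SMALL LETTER A decomposes to "(a)".
a_matches_parenthesized = folded_match("a", "\u249c")
```

With this sketch, searching for n finds both spellings of ñ, and
searching for a finds ⒜.  A language-dependent default would have to
carve exceptions out of the database-driven result (for example,
keeping ñ distinct from n for Spanish) rather than replace it.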