From: John Wiegley
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Sat, 27 Feb 2016 00:58:02 -0800
To: Eli Zaretskii
Cc: joostkremers@fastmail.fm, larsi@gnus.org, lokedhs@gmail.com, rms@gnu.org, emacs-devel@gnu.org
In-Reply-To: <83fuwecztu.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 27 Feb 2016 10:38:53 +0200")

>>>>> Eli Zaretskii writes:

> The simplest change would be to have character-folding disabled by default
> in some European locales whose users expressed objections to having it on by
> default, due to folding of some characters that shouldn't be folded in the
> languages of those locales.

> Another, more complex, but still simple enough, possibility would be to have
> character-folding on by default, but have the problematic foldings filtered
> out from the regexp used by it. We could either always filter out all of
> them, or filter out only some of them, as determined by the user locale. For
> example, in the Spanish locales, ñ will not be folded.

> The next alternative is to come up with a fine-grained classification of
> character-folding, and provide user options to control each one of them
> independently, with the defaults determined by the user locale. For example,
> one class of folding is the one required for matching pre-composed
> characters such as á with its decomposed variant á; another class is for
> finding "similar" characters, such as finding ⒜ when looking for a. There
> should probably be classes that are disliked by users of certain languages,
> such as ñ for Spanish. Etc. etc. (I think this alternative needs more
> research and user feedback, and so is probably not for the release branch.)

> Maybe there are more alternatives, I don't know. It's not like they were
> explicitly proposed by someone; the above is just my personal conclusions
> from reading the discussion.

Thank you for that summary. From that reading, it sounds like this will
require a fairly complex decision tree, to determine what should be folded,
and when, based on the details of each particular country/language? That is,
we can't expect to make a single decision up front, but will need feedback
from users in every country that uses Emacs, in order to determine what the
correct settings are for each language?

And what about a Swedish speaker living in America who uses en_US because
that's what 90% of his text is in, who then wants to search some Swedish
text? Is it the locale that determines it, or something specific to the
nature of the text in each buffer? And how would Emacs know?

Unless I'm not seeing the light at the end of this tunnel, this feature is
just not ready for prime time as a default. There are too many unanswered
questions, and it sounds like none of them can be answered in the abstract
for every case. I have a feeling we'd be getting bug reports constantly from
users whose language contains details we never anticipated.
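
To make the locale question concrete, here is a rough sketch in Python (not
Emacs Lisp, and not how Emacs implements character folding). It folds
accented characters by stripping combining marks after Unicode decomposition,
and skips characters listed in a per-language exception table. The names
NO_FOLD, fold and folded_match, and the contents of the table, are
assumptions made up for illustration only.

```python
import unicodedata

# Hypothetical per-language exceptions: characters that should NOT be folded.
# These entries are illustrative guesses, not data taken from Emacs.
NO_FOLD = {
    "es": {"ñ"},
    "sv": {"å", "ä", "ö"},
}


def fold(text, lang=None):
    """Strip combining marks, except on characters protected for `lang`."""
    protected = NO_FOLD.get(lang, set())
    out = []
    # Compose first so that "n" + combining tilde is seen as a single "ñ".
    for ch in unicodedata.normalize("NFC", text):
        if ch.lower() in protected:
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if not unicodedata.combining(c)))
    return "".join(out)


def folded_match(needle, haystack, lang=None):
    """Case-insensitive substring search after folding both sides."""
    return fold(needle, lang).lower() in fold(haystack, lang).lower()
```

With this sketch, folded_match("far", "Jag får", "sv") is false, but the same
search with lang="en" succeeds, which is the situation of the Swedish speaker
running under en_US: the result depends on which language the search is told
about, not on the text being searched. The "similar characters" class (⒜ for
a) is left out here; it would need a different kind of mapping.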

-- 
John Wiegley                  GPG fingerprint = 4710 CF98 AF9B 327B B80F
http://newartisans.com                          60E1 46C4 BD1A 7AC1 4BA2