From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Stallman
Newsgroups: gmane.emacs.devel
Subject: Re: On language-dependent defaults for character-folding
Date: Fri, 26 Feb 2016 15:23:45 -0500
References: <83pow26svf.fsf@gnu.org> <87a8n5srbp.fsf@wanadoo.es>
 <83d1s17npz.fsf@gnu.org> <87oablfpn3.fsf@mail.linkov.net>
 <834mdd6llx.fsf@gnu.org> <7fbb8bc7-9a97-4bad-a103-a6690a35241d@default>
 <834mdc5w6o.fsf@gnu.org> <838u2hu6aq.fsf@gnu.org> <871t899tde.fsf@gnus.org>
 <83y4ahru04.fsf@gnu.org> <83fuwproyf.fsf@gnu.org> <837fi0sz29.fsf@gnu.org>
 <83egc8qzjh.fsf@gnu.org> <87egc7evu3.fsf@gnus.org> <83io1jpt4u.fsf@gnu.org>
 <87povqhj25.fsf@gnus.org> <83povqm3dw.fsf@gnu.org> <831t84lgsa.fsf@gnu.org>
Reply-To: rms@gnu.org
To: Eli Zaretskii
Cc: larsi@gnus.org, lokedhs@gmail.com, emacs-devel@gnu.org
In-reply-to: <831t84lgsa.fsf@gnu.org> (message from Eli Zaretskii on Mon, 22 Feb 2016 20:51:49 +0200)
List-Id: "Emacs development discussions."

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > * A per-buffer language preference variable.
> > * A global value which becomes the default for new buffers.

> That's unnecessarily restrictive; we can do better with the current
> infrastructure.

This is not a restriction; it is a feature. It is meant to enable
people to do something convenient.

> Some encodings provide us with charset information,
> which can be used to deduce the language of the text. Some characters
> belong to Unicode blocks that allow identification of the language, or
> maybe a small group of languages. In some cases, the text itself
> comes with metadata which describes the language. And there might be
> other sources of information about the language.

If there are useful ways to determine the language from the text
that work well enough that users won't complain, let's do it.
That would be an add-on to the structure I proposed.

> There are other aspects of this that need to be considered, if we want
> for language-specific searching to be solid. E.g., what happens with
> text copied to another buffer which might have a different per-buffer
> language preference? does it suddenly behave differently when
> searched?

Yes. If you want the two buffers to have the same language
preference, then maybe Emacs can guess that for you; if not, you can
specify it.

> But the most basic issue is that any significant development in these
> directions require to re-implement the feature on the C level, and use
> char-tables for folding, like we do with case-mapping.

It needs to use some sort of tables. Whether they are the current
kind of char table, or some other structure, is something to be
determined.

--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.