From mboxrd@z Thu Jan 1 00:00:00 1970 Path: news.gmane.org!not-for-mail From: =?UTF-8?Q?Elias_M=C3=A5rtenson?= Newsgroups: gmane.emacs.devel Subject: Re: On language-dependent defaults for character-folding Date: Mon, 22 Feb 2016 00:58:37 +0800 Message-ID: References: <83pow26svf.fsf@gnu.org> <87a8n5srbp.fsf@wanadoo.es> <83d1s17npz.fsf@gnu.org> <87oablfpn3.fsf@mail.linkov.net> <834mdd6llx.fsf@gnu.org> <7fbb8bc7-9a97-4bad-a103-a6690a35241d@default> <834mdc5w6o.fsf@gnu.org> <838u2hu6aq.fsf@gnu.org> <871t899tde.fsf@gnus.org> <83y4ahru04.fsf@gnu.org> <83fuwproyf.fsf@gnu.org> <837fi0sz29.fsf@gnu.org> <83egc8qzjh.fsf@gnu.org> <87egc7evu3.fsf@gnus.org> <83io1jpt4u.fsf@gnu.org> <87povqhj25.fsf@gnus.org> <83povqm3dw.fsf@gnu.org> NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=001a114302429c8370052c4aa06c X-Trace: ger.gmane.org 1456073930 26931 80.91.229.3 (21 Feb 2016 16:58:50 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sun, 21 Feb 2016 16:58:50 +0000 (UTC) Cc: Lars Ingebrigtsen , emacs-devel To: Eli Zaretskii Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Sun Feb 21 17:58:49 2016 Return-path: Envelope-to: ged-emacs-devel@m.gmane.org Original-Received: from lists.gnu.org ([208.118.235.17]) by plane.gmane.org with esmtp (Exim 4.69) (envelope-from ) id 1aXXLL-0005BO-RI for ged-emacs-devel@m.gmane.org; Sun, 21 Feb 2016 17:58:48 +0100 Original-Received: from localhost ([::1]:42415 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aXXLI-0001np-31 for ged-emacs-devel@m.gmane.org; Sun, 21 Feb 2016 11:58:44 -0500 Original-Received: from eggs.gnu.org ([2001:4830:134:3::10]:48293) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aXXLD-0001nX-Nc for emacs-devel@gnu.org; Sun, 21 Feb 2016 11:58:41 -0500 Original-Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aXXLC-0006YD-Am for emacs-devel@gnu.org; Sun, 
21 Feb 2016 11:58:39 -0500 Original-Received: from mail-vk0-x22a.google.com ([2607:f8b0:400c:c05::22a]:34106) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aXXLC-0006Y9-46; Sun, 21 Feb 2016 11:58:38 -0500 Original-Received: by mail-vk0-x22a.google.com with SMTP id e185so111530433vkb.1; Sun, 21 Feb 2016 08:58:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=VGN56ReUHQoaqjNwfPYD2mfhzz5cf2j089xN9T8RS80=; b=lKDOspq2dxY18lJgk5S9RJhHPqx4GVWNxX+t8LenrSXzGYdqXzwN6hcPC4ZGE5g5mJ f/9yoITUIXupqeutBycIlDJ5hplV+5yBXywfObwPu5GVQsnytvqinWy+FKuPNkAT5bFN sPd7r4OO25sRjhvutm4ByFkgs0jHeVBaS2/PUPIEtyICfYJ/yp13qcNr2+82hJdZxOog Do6+AMFJ9KH8xPgVtsHEVfO33DZ3dhtAoyoXrtL2oZewhesS4cIOzJIdbozfK/u9whKA /8ReZz7aFuj7fYWEv4vXiybRrJ3MOiYAvY5ql5cvulhvAuc0vsDoxm2CkO5InNaD8fsH FlmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=VGN56ReUHQoaqjNwfPYD2mfhzz5cf2j089xN9T8RS80=; b=Gi1vA/r+cNTTLUxm9J6Moqi2j2sJLs5NvJdc38dZwXxG/WBJf+7pbQ7wTjTldkv77X 0yxL8qR1ZunKpAVPFcZX3te+FMrKKvtYH2LKK6YebummpZUFUdi2Lanqac3i1zzpYV0Y BFWv/Mhq2LtEw8rDi0AzokBSomZ7gqsYexbg1OfQ7C+jJ96eDxIZdWmwglVQ7J8vYS7+ pKQrt8gDSlZaltu71IFrJZMeWPL8KQXGrF4SbA+r6hmjXXw0PRZ4kfqdx3KLBrRy9QQR FpWPWukKIWp2f+M/3Sq/k/1ROSAZx56q5ZzAuySrXxUirUFxa9uENJ6evhfEJ2S5wIuG ikpw== X-Gm-Message-State: AG10YOSpsv4u9D9/iNFzplsORzWgSixjt/qVMmtZygTbPsnwOfY255UvWnXxLr9hXmICm6xFC1LPUJfP4agijQ== X-Received: by 10.31.16.24 with SMTP id g24mr19602471vki.41.1456073917364; Sun, 21 Feb 2016 08:58:37 -0800 (PST) Original-Received: by 10.176.3.146 with HTTP; Sun, 21 Feb 2016 08:58:37 -0800 (PST) In-Reply-To: <83povqm3dw.fsf@gnu.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] X-Received-From: 2607:f8b0:400c:c05::22a X-BeenThere: emacs-devel@gnu.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: "Emacs development discussions." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Xref: news.gmane.org gmane.emacs.devel:200388 Archived-At: --001a114302429c8370052c4aa06c Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

On 22 February 2016 at 00:31, Eli Zaretskii <eliz@gnu.org> wrote:

> > After all, the only languages that use ø are languages that use it as a
> > character of its own).
>
> Not sure what this means: how is the usage of ø in this regard
> different from, say, ä?
>

Well, if you are interested, here's how it works in the Scandinavian
languages:

Swedish has three extra characters: å, ä and ö. These are individual
characters, as has been discussed many times in this thread. Norwegian
and Danish have the same extra characters, except that they write them
as å, æ and ø (they also sort them in a different order, but that's
beside the point).

Now, other languages may use the character (in the Unicode sense) ö as a
variation of o. In other words, o with ¨ on top of it. For users of such
languages, ö is just a variation of o, as we have also discussed before.
On the other hand, ø is not used as a variation of o in any language
that I am aware of.

In Sweden, when discussing Norwegian or Danish words (usually names), we
tend to keep their style of characters. So, for example, I might refer
to my Swedish friend Östen and my Norwegian friend Øystein. I would not
spell his name Öystein, even though it's technically the same letter.

However, when searching for "ö" I would certainly expect to match the
first letter of Øystein.
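A minimal sketch of the matching behaviour described above, in Python rather than Emacs Lisp. It is not how Emacs implements character folding. The tables `CANONICAL` and `OWN_LETTERS` and the functions `fold_char`, `fold_text` and `matches` are hypothetical names made up for this illustration, and the tables only cover the letters mentioned in this message. The assumption is that a language is identified by a two-letter code, and that any language without an entry falls back to stripping combining marks.

```python
import unicodedata

# Hypothetical tables, for illustration only.
# Letters that a speaker of the language regards as the same letter
# written in a neighbouring language's style (ø is the Norwegian/Danish
# way of writing the Swedish ö, and so on).
CANONICAL = {
    "sv": {"ø": "ö", "æ": "ä"},
    "nb": {"ö": "ø", "ä": "æ"},
    "da": {"ö": "ø", "ä": "æ"},
}

# Letters that are characters of their own in the language, and so must
# not be folded to their base letter.
OWN_LETTERS = {
    "sv": frozenset("åäö"),
    "nb": frozenset("åæø"),
    "da": frozenset("åæø"),
}


def fold_char(ch, lang):
    """Return the search key for one character in the given language."""
    ch = ch.lower()
    # First map the neighbouring language's spelling to the local one.
    ch = CANONICAL.get(lang, {}).get(ch, ch)
    # Letters of the language's own alphabet are kept as they are.
    if ch in OWN_LETTERS.get(lang, frozenset()):
        return ch
    # Everything else is treated as "base letter plus decoration":
    # decompose and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def fold_text(text, lang):
    """Fold a whole string, character by character."""
    text = unicodedata.normalize("NFC", text)
    return "".join(fold_char(ch, lang) for ch in text)


def matches(needle, haystack, lang):
    """True if needle occurs in haystack under the language's folding."""
    return fold_text(needle, lang) in fold_text(haystack, lang)


print(matches("ö", "Øystein", "sv"))  # True: ø counts as ö for a Swedish user
print(matches("o", "Östen", "sv"))    # False: ö is its own letter in Swedish
print(matches("o", "Östen", "en"))    # True: ö is just a decorated o elsewhere
print(matches("o", "Øystein", "en"))  # False: ø has no decomposition to o
```

The last case falls out of Unicode itself: ø has no canonical decomposition, so stripping combining marks never turns it into o.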
> > In the thread on the Unicode mailing list, the recommendation seems to be
> > to use the CLDR (http://cldr.unicode.org/). Of course, this assumes there
> > is a locale, but the choice of locale can easily be customisable (with the
> > default being the user's locale).
>
> Not locale, language.
>

Right. I guess I'm getting ahead of myself. As you know, I'm advocating
choosing a default language based on the locale of the user.

Regards,
Elias

--001a114302429c8370052c4aa06c--