From: Paul Eggert
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs i18n
Date: Sun, 10 Mar 2019 20:52:47 -0700
Organization: UCLA Computer Science Department
Message-ID: <21308168-e809-13f3-2e12-b8456efab836@cs.ucla.edu>
To: rms@gnu.org
Cc: eliz@gnu.org, juri@linkov.net, lokedhs@gmail.com, emacs-devel@gnu.org

Richard Stallman wrote:

> Compare this
>
>   (numeric-case NUMBER
>     (russian-masc "%d байт скопирован, %s, %s")
>     (russian-fem "%d байта скопировано, %s, %s")
>     (russian-neut "%d байт скопировано, %s, %s"))
>
> with this:
>
>   "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4
>   && (n%100<10 || n%100>=20) ? 1 : 2);\n"
>   ...
>   #: src/dd.c:822
>   #, c-format
>   msgid "% byte copied, %s, %s"
>   msgid_plural "% bytes copied, %s, %s"
>   msgstr[0] "% байт скопирован, %s, %s"
>   msgstr[1] "% байта скопировано, %s, %s"
>   msgstr[2] "% байт скопировано, %s, %s"

I'm afraid that's not an apples-to-apples comparison. The first form
contains only the Russian translations, whereas the second form contains
much more information: the source-code location of the untranslated
strings, a copy of the untranslated English-language strings, and the
general plural rule for Russian (the last is shared among all the Russian
translations, not just the translations listed here). This extra
information is useful for translators, and the format has a reasonably
extensive software suite that already supports it, not to mention
translators who are already used to it.
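
To make the shared rule concrete, here is a minimal Python sketch of how
the Plural-Forms expression quoted above maps a count to a msgstr index.
It is only an illustration, not gettext's own implementation, and the
function name russian_plural_index is made up for this example:

```python
def russian_plural_index(n: int) -> int:
    """Hypothetical helper: a direct transcription of the quoted
    Plural-Forms expression for Russian (nplurals=3)."""
    # Form 0: counts ending in 1, except those ending in 11.
    if n % 10 == 1 and n % 100 != 11:
        return 0
    # Form 1: counts ending in 2..4, except those ending in 12..14.
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    # Form 2: everything else (0, 5..20, 25..30, 111, ...).
    return 2


if __name__ == "__main__":
    for count in (1, 2, 5, 11, 21, 22, 112):
        print(count, "-> msgstr[%d]" % russian_plural_index(count))
```

The rule is written once per language, so each translated message only
has to supply the three whole strings it selects among.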

> I can envision something like this:
>
>   "russian-nom:%d байт%| скопирован%|, %s, %s"
>
> where the 'russian-nom' operator would replace the two %| sequences
> with the appropriate declensional suffixes for the nominative case.

But Russian declension is not that simple. The Russian word for "byte" is
"байт", but its plural form depends not only on the number (as in the
above examples) but also on its case: the "байт" and "байта" in the above
examples are not exhaustive. And some words have irregular declensions:
for example, ребёнок (singular) versus де́ти (plural) for the same noun.
And it's not just nouns and pronouns that are affected: adjectives also
have singular and plural forms. And I have by no means exhausted the
issues involved here; to get a better feeling for the complexity in this
area, please see:

https://en.wikipedia.org/wiki/Russian_declension

Although it wouldn't be impossible for Emacs Lisp code to handle all the
special cases for Russian declension, it would be tricky to implement,
and tricky to document in a way that translators would easily understand.
And we'd also have to implement and document similarly tricky rules for
other languages. And we'd have to deal with the fact that not every
Russian speaker agrees on how to decline words like "байт" that are
imported from English. These sorts of issues should be delegated to
translators, not to likely-fragile code in Emacs Lisp (a technology that
translators typically do not grok).

In contrast, the gettext way is relatively simple and easily understood,
and is already common practice.
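
As a rough illustration of that division of labor, the Python sketch
below keeps every inflected word inside translator-supplied whole
messages, so the code only selects an index and never builds a suffix.
The names RU_BYTES_COPIED, plural_index and bytes_copied_message are
hypothetical, and using %d for the count is an assumption made for this
sketch rather than what the real catalog contains:

```python
# Whole-message translations, one per plural form, as a translator would
# supply them. The Russian text is taken from the messages quoted in this
# thread; %d as the count placeholder is an assumption for this sketch.
RU_BYTES_COPIED = (
    "%d байт скопирован, %s, %s",    # form 0
    "%d байта скопировано, %s, %s",  # form 1
    "%d байт скопировано, %s, %s",   # form 2
)


def plural_index(n: int) -> int:
    """Hypothetical helper: the Russian Plural-Forms rule from the thread."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


def bytes_copied_message(n: int, elapsed: str, rate: str) -> str:
    """Pick the translator's complete sentence; no morphology in code."""
    return RU_BYTES_COPIED[plural_index(n)] % (n, elapsed, rate)


if __name__ == "__main__":
    print(bytes_copied_message(3, "0.1 s", "30 B/s"))
```

Irregular nouns or disputed declensions then need no code change at
all: the translator simply writes different whole strings.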