From: Richard Stallman
Newsgroups: gmane.emacs.devel
Subject: Re: Emacs i18n
Date: Fri, 08 Mar 2019 22:11:51 -0500
Reply-To: rms@gnu.org
To: Elias Mårtenson
Cc: eliz@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org, juri@linkov.net
In-Reply-To: (message from Elias Mårtenson on Fri, 8 Mar 2019 12:33:24 +0800)

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Russian, for example, uses three different grammatical cases, which are
  > dependent on the last digit of the number, the system needs to be more
  > complicated.

Here's an idea for a scheme general enough to handle Russian as well.
I propose something like a case or select construct.

First, the elegant Lispy way to represent it:

  (numeric-case NUMBER
    (1 "Just one frob")
    (2 "Two frobs")
    (russian-masc "%d-m frobs")
    (russian-fem "%d-f frobs")
    (russian-neut "%d-n frobs")
    (t "%d frobs"))

Translation would have to replace the entire numeric-case construct
with another (translated) numeric-case construct.  Thus, the source
code would contain one suitable for English:

  (numeric-case NUMBER
    (1 "one frob")
    (t "%d frobs"))

and for Russian we would translate it into this one:

  (numeric-case NUMBER
    (russian-masc "%d-m frobs")
    (russian-fem "%d-f frobs")
    (russian-neut "%d-n frobs"))

I think this framework could be extended to handle whatever other
weird grammatical rules we might encounter in other languages in the
future.

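The Lisp form above could be prototyped roughly as follows.  This is a
minimal sketch, not an existing Emacs facility: every sketch-* name is
hypothetical, and since the message does not say what russian-masc and
friends would test, two stand-in conditions (ru-one, ru-few) based on
the last digits of a non-negative integer are assumed here purely to
show how language-specific conditions could be plugged in.

```elisp
;; Hypothetical table of named conditions.  Each entry maps a symbol to
;; a predicate on a non-negative integer.  The two entries are stand-ins
;; (assumptions), not the russian-* conditions named in the message.
(defvar sketch-numeric-case-conditions
  `((ru-one . ,(lambda (n) (and (= (% n 10) 1)
                                (/= (% n 100) 11))))
    (ru-few . ,(lambda (n) (and (memq (% n 10) '(2 3 4))
                                (not (memq (% n 100) '(12 13 14)))))))
  "Alist of (SYMBOL . PREDICATE) used by `sketch-numeric-case'.")

(defun sketch-numeric-case--match-p (condition number)
  "Return non-nil if CONDITION applies to NUMBER.
CONDITION is t (always matches), an integer (exact match), or a
symbol looked up in `sketch-numeric-case-conditions'."
  (cond ((eq condition t) t)
        ((integerp condition) (= condition number))
        ((symbolp condition)
         (let ((pred (cdr (assq condition
                                sketch-numeric-case-conditions))))
           (and pred (funcall pred number))))))

(defun sketch-numeric-case-select (number clauses)
  "Format NUMBER with the first matching clause in CLAUSES.
Each clause is (CONDITION FORMAT-STRING).  Return nil if none match."
  (catch 'found
    (dolist (clause clauses)
      (when (sketch-numeric-case--match-p (car clause) number)
        ;; `format' ignores NUMBER when the string has no %d.
        (throw 'found (format (cadr clause) number))))
    nil))

(defmacro sketch-numeric-case (number &rest clauses)
  "Evaluate NUMBER and pick the first of CLAUSES that matches it."
  (declare (indent 1))
  `(sketch-numeric-case-select ,number ',clauses))

;; Usage with the English clauses from the message:
(sketch-numeric-case 1 (1 "one frob") (t "%d frobs"))   ; "one frob"
(sketch-numeric-case 5 (1 "one frob") (t "%d frobs"))   ; "5 frobs"
```

Keeping the conditions in a table rather than hard-coding them is one
way to leave room for rules from other languages.
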
While doing it with Lisp syntax is elegant, it would require
generalization of the infrastructure for recording translations to
handle more than strings.  That would be a pain.

Here's a way to represent the conditional construct as a kind of
string.  That way, translation would only need to translate strings
into strings.

We could use | in the string to separate alternatives, and : to end a
condition.  It would look like this:

  (numeric-case NUMBER
    "1:one frob|\
     t:%d frobs")

For Russian, we would translate the source string

  1:one frob|t:%d frobs

into

  russian-masc:%d-m frobs|russian-fem:%d-f frobs|russian-neut:%d-n frobs

The subsequences : and | would be handled by the function
numeric-case.  They would not affect the meaning of the string data
type as such.  numeric-case would ignore whitespace after |.

With this string convention, we only need to translate strings.

To include a | in an alternative, you could write a double |.
We do not need a way to quote a colon.

Perhaps one could develop a smarter 'russian' alternative that knows
how to change the last letter automatically and handles all three
alternatives.

Maybe we need to define a format-spec for devouring and ignoring one
argument.

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
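
The string convention described in the message could be parsed along
these lines.  This is only a sketch under stated assumptions, not code
from the message: the sketch-* function names are hypothetical, a
condition made only of digits is read as an integer, and any other
condition is read as a symbol.  It turns the string form into the same
(CONDITION TEXT) clauses as the Lisp form.

```elisp
(defun sketch-numeric-case-split (spec)
  "Split string SPEC into alternatives at each single `|'.
A doubled `||' stands for a literal `|'.  Whitespace directly
after a separating `|' is skipped."
  (let ((i 0) (len (length spec)) (current "") (parts '()))
    (while (< i len)
      (let ((ch (aref spec i)))
        (cond
         ;; "||" is an escaped bar: keep one literal "|".
         ((and (eq ch ?\|) (< (1+ i) len) (eq (aref spec (1+ i)) ?\|))
          (setq current (concat current "|")
                i (+ i 2)))
         ;; A single "|" ends the current alternative.
         ((eq ch ?\|)
          (push current parts)
          (setq current ""
                i (1+ i))
          (while (and (< i len) (memq (aref spec i) '(?\s ?\t ?\n)))
            (setq i (1+ i))))
         (t
          (setq current (concat current (char-to-string ch))
                i (1+ i))))))
    (push current parts)
    (nreverse parts)))

(defun sketch-numeric-case-parse (spec)
  "Convert string SPEC into a list of (CONDITION TEXT) clauses.
The first `:' in each alternative ends its condition, so later
colons are ordinary text and need no quoting."
  (mapcar
   (lambda (alt)
     (let ((colon (string-match ":" alt)))
       (unless colon
         (error "No condition in alternative: %s" alt))
       (let ((name (substring alt 0 colon))
             (text (substring alt (1+ colon))))
         (list (if (string-match-p "\\`[0-9]+\\'" name)
                   (string-to-number name)
                 (intern name))
               text))))
   (sketch-numeric-case-split spec)))

;; Usage with the source string from the message:
(sketch-numeric-case-parse "1:one frob|\
     t:%d frobs")
;; evaluates to ((1 "one frob") (t "%d frobs"))
```

Because only the first colon in an alternative is significant, the
sketch agrees with the remark that a colon needs no quoting.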