From mboxrd@z Thu Jan  1 00:00:00 1970
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: Richard Stallman <rms@gnu.org>
Newsgroups: gmane.emacs.devel,gmane.comp.gnu.gettext.bugs
Subject: Re: Emacs i18n
Date: Wed, 20 Mar 2019 22:14:27 -0400
Message-ID: <E1h6nDv-0000ao-K0@fencepost.gnu.org>
References: <25076895.mA2g9mTHSI@omega>
Reply-To: rms@gnu.org
Content-Type: text/plain; charset=Utf-8
Cc: bug-gettext@gnu.org, emacs-devel@gnu.org
To: Bruno Haible <bruno@clisp.org>
In-Reply-To: <25076895.mA2g9mTHSI@omega> (message from Bruno Haible on Wed, 20
	Mar 2019 12:59:32 +0100)
Xref: news.gmane.org gmane.emacs.devel:234444 gmane.comp.gnu.gettext.bugs:1964
Archived-At: <http://permalink.gmane.org/gmane.emacs.devel/234444>

[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > When you design a translation system, you have two personas:
  >   - the programmer,
  >   - the translator.

  > The translation system defines
  >   1) which information flows from the programmer to the translator,
  >      and in which format,
  >   2) which information flows back from the translator to the programmer,
  >      and in which format.

That argument is valid for gettext, but not for Emacs.

This is the part that doesn't fit Emacs:

  >   - The programmer, you can assume, can write and understand algorithms,
  >     but does not master the grammar of more than one language (usually).

In the development of Emacs there are many programmers, even some who
speak Russian.  We will have no difficulty implementing and
maintaining russian-masc, russian-nom, and so on.

These constructs do not need to be known to gettext.
For gettext, they will simply be part of the translation string.

We can do this for those languages in which it is convenient for us --
those that someone knows and decides to handle.  For other languages,
we can stick to the low-level gettext approach, which will work
for all languages.
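
The split between the two paths can be sketched roughly as follows.
This is a minimal illustration in Python, not Emacs code: the names
`RULES`, `format_translated` and `_placeholder_rule` are hypothetical,
and the handler is only a stand-in that does no real Russian
inflection.  The point is that gettext hands back an opaque string,
and only the caller looks at the prefix.

```python
from typing import Callable, Dict, Tuple

# A handler receives the text after "rule:" plus the format arguments.
Handler = Callable[[str, Tuple], str]


def _placeholder_rule(body: str, args: Tuple) -> str:
    # Stand-in only (assumption): a real handler would inflect the words
    # at each "%|" marker.  Here the markers are simply dropped.
    return body.replace("%|", "") % args


# Hypothetical registry: only the rules that someone has implemented.
RULES: Dict[str, Handler] = {"russian-nom": _placeholder_rule}


def format_translated(translated: str, *args) -> str:
    """Format a string as returned by gettext.

    If it starts with a known "rule:" prefix, use that rule's handler;
    otherwise fall back to plain %-formatting (the low-level path).
    """
    prefix, sep, body = translated.partition(":")
    handler = RULES.get(prefix) if sep else None
    if handler is None:
        return translated % args
    return handler(body, args)
```

A translation with no recognized prefix, such as "Error: %s", goes
through the fallback unchanged, so languages without a dedicated rule
need nothing extra.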

  >   - The translator, you can assume, can translate sentences and knows
  >     about the different meanings of words in different context.

The Russian translation team for Emacs will not have difficulty using
russian-masc, russian-nom, and so on.  Being Russian speakers, they
will understand how these constructs make sense for Russian, once
they read the documentation for them.

  > In the gettext approach (where 1) are POT files and 2) are PO files) we
  > added plural form handling, which is just a small morphological variation,
  > and it required a significant amount of documentation and education for
  > translators. I would say, it is on the limit what we can make translators
  > grok.

The gettext approach requires coding the plural-selection algorithm in
the translation file.  My approach has the advantage of avoiding that.
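
For reference, the algorithm gettext keeps in the translation file is
the `Plural-Forms` header of the PO file.  The header commonly used for
Russian is

    nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);

and the value it computes selects one of msgstr[0], msgstr[1],
msgstr[2].  Below is a small Python transcription of that expression;
the function name `russian_plural_index` is made up for illustration
and is not part of gettext.

```python
def russian_plural_index(n: int) -> int:
    """Return the msgstr[] index chosen by the usual Russian Plural-Forms rule."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... but not 11
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1  # 2-4, 22-24, ... but not 12-14
    return 2      # everything else: 0, 5-20, 25-30, ...


if __name__ == "__main__":
    for n in (1, 2, 5, 11, 21, 22, 112):
        print(n, russian_plural_index(n))
```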

  > Now, when you give a translator a string

  >    "russian-nom:%d байт%| скопирован%|, %s, %s"

  > you need to think about the appropriate tooling that will make the
  > translator understand
  >   - what 'russian-nom' means,
  >   - what the '|' characters mean,
  >   - what the '%' characters mean.

I picked that syntax on the spur of the moment because I thought it
would be natural and convenient.  If that isn't natural and convenient
for the translators, we can pick a different one.

  > Either the translator tool should somehow highlight these characters
  > and present on-line help,

That would be good to do.

  > or it should present it as a sequence of
  > strings to translate:

  >   Rule: russian-nom
  >   "%d байт"
  >   " скопирован"
  >   ", %s, %s"

Is this general enough to handle all the use cases?
I don't know -- I don't speak Russian.
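
The decomposition quoted above (a rule name followed by a sequence of
strings) can be sketched mechanically.  This is only an illustration
under stated assumptions: the function name `split_rule_string` and the
`KNOWN_RULES` set are hypothetical, the rule name is taken to be
whatever precedes the first ':', and the pieces are taken to be
separated by '%|'.  It does not handle a literal "%%" followed by '|'.

```python
from typing import List, Optional, Tuple

# Hypothetical set of rule names a tool would recognize.
KNOWN_RULES = frozenset({"russian-nom", "russian-masc"})


def split_rule_string(msg: str) -> Tuple[Optional[str], List[str]]:
    """Split "rule:a%|b%|c" into ("rule", ["a", "b", "c"]).

    A string without a recognized rule prefix is returned whole,
    with None as the rule.
    """
    rule, sep, rest = msg.partition(":")
    if sep and rule in KNOWN_RULES:
        return rule, rest.split("%|")
    return None, [msg]


if __name__ == "__main__":
    rule, parts = split_rule_string("russian-nom:%d байт%| скопирован%|, %s, %s")
    print("Rule:", rule)
    for part in parts:
        print(repr(part))
```

A translator tool could show the pieces separately and reassemble them
with '%|' on output; whether that presentation covers every case is
the open question here.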

  > For the plural form
  > handling alone, it took several years until the main tools had support for
  > it in their UI.

What sort of syntax do the tools support for plurals?

-- 
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)