From: Daniel Brooks
Newsgroups: gmane.emacs.devel
Subject: Re: Internationalize Emacs's messages (swahili)
Date: Sat, 26 Dec 2020 01:07:50 -0800
Message-ID: <87tus9szo9.fsf@db48x.net>
In-Reply-To: <83sg7tm2es.fsf@gnu.org> (Eli Zaretskii's message of "Sat, 26 Dec 2020 09:50:35 +0200")
To: Eli Zaretskii
Cc: rms@gnu.org, bugs@gnu.support, dimech@gmx.com, abrochard@gmx.com, emacs-devel@gnu.org, all_but_last@163.com

Eli Zaretskii writes:

>> From: Daniel Brooks
>> Cc: Zhu Zihao, dimech@gmx.com, abrochard@gmx.com,
>>  rms@gnu.org, bugs@gnu.support, emacs-devel@gnu.org
>> Date: Fri, 25 Dec 2020 18:03:21 -0800
>>
>> My personal opinion is that gettext is too limited. It works for simple
>> things, but provides no help at all for complex things.
>
> That is in no way specific to Emacs, is it?

Absolutely.

>> I think that the most productive way to think about translation is that
>> each coherent message that we present to a user (whether it's via the
>> message function or not) should explicitly be the result of calling a
>> function written by the translator. gettext only allows the translator
>> to supply strings, so it falls down in complex situations.
>
> The advantage of the translation infrastructure based on gettext is
> that the translators don't have to be programmers, they only need to
> be experts in correct use of technical terminology in their
> languages. Even with that significant advantage, it is hard to find
> translators for many languages. Your suggestion would make that job
> much harder, with the net result that more messages for more programs
> will remain untranslated: a classic example where the best is a sworn
> enemy of the good.

Yes, the simplicity of gettext is a big point in its favor. In the
common case, Fluent is not much more complicated.
Here's a file from the en-US locale from Firefox:

https://searchfox.org/mozilla-central/source/browser/locales/en-US/browser/aboutDialog.ftl

Most of the complications here come from the HTML that is embedded
inside the localized text:

    update-failed-main = Update failed. <a data-l10n-name="failed-link-main">Download the latest version</a>

Another language might put different text before or after the link, so
the anchor tag has to be part of the localized text. However, the
JavaScript that displays this text will add the href to the anchor
first.

> (Disclosure: I'm the team leader for translators to the Hebrew
> language, as part of the GNU Translation Project. I'm talking from
> personal experience here.)
>
> This problem was solved in gettext long ago, and is being widely used
> in existing translations. See the node "Plural Forms" in the GNU
> gettext manual. Emacs has the ngettext function in preparation for
> the day when we will be able to have translatable message strings.

Yes, I am aware of ngettext, and I could have picked a different
example. Consider the example from projectfluent.org, where the output
should change based on the user's gender:

    shared-photos =
        {$userName} {$photoCount ->
            [one] added a new photo
           *[other] added {$photoCount} new photos
        } to {$userGender ->
            [male] his stream
            [female] her stream
           *[other] their stream
        }.

This one produces messages like "Anne added 3 new photos to her
stream.", which vary based on the three inputs passed in. ngettext
handles plurals, but it doesn't generalize to any other type of
variation we might want. I can't think of any reason why Emacs would
care about gender, but maybe BBDB could.

Another example from projectfluent.org illustrates that individual
translations can add variations that are purely for their own use. The
English translation has "-sync-brand-name = Firefox Account", which is
just assigning static text to a variable which will be used frequently.
The Italian translation changes it to this:

    -sync-brand-name = {$first ->
       *[uppercase] Account Firefox
        [lowercase] account Firefox
    }

which serves the same purpose but lets the translator use this text at
either the beginning or the end of a sentence. Meanwhile, the Polish
translator has changed it to this:

    -sync-brand-name = {$case ->
       *[nominative] Konto Firefox
        [genitive] Konta Firefox
        [accusative] Kontem Firefox
    }

which lets them choose the correct declension when needed:

    sync-signedout-title = Zaloguj do {-sync-brand-name(case: "genitive")}

All three translations can do their own thing, without needing to ask
the UI implementer to change anything and without coordinating with
each other first. They can also add these features gradually, as they
refine the translation. For example, they can start with a form that
partly dodges the grammar, like "New emails: 42", then later refine it
to "Found 42 new emails" once they get better coverage.

>> I recommend taking a look at Project Fluent. It's a free-software
>> implementation of exactly the system that I've described. Translators
>> write functions in a syntax that is similar in some ways to both
>> Javascript and an ini file, which could be easily compiled into Elisp.
>> (It's the successor to the l20n project, which you might also have
>> heard of.)
>
> How many translated languages for how many programs does this project
> have?

The main one that I know of is Firefox, which by my count has 96
translations.

> Anyway, the hard problems in translating some of the Emacs UI are
> elsewhere, as can be seen from the discussions to which I pointed. We
> need to solve those first, and only after that worry about the issues
> you mention (if they are real).

I think that a system like Fluent moves most of the problems into the
translations, where they are more tractable (because each translation
only has to solve its own problems).

Note that most of Firefox's translations are maintained by volunteers.
They don't even have to send patches or commit files to version
control; they use a web page to view and edit the translation, as well
as to preview the results live. The same tools could be used for Emacs.

I will continue to peruse the previous threads that you've pointed
out, but I'm not aware of anything that would be harder than just going
through the code and factoring out the text. There aren't any clever
macros that can help with that, just hard work.

db48x
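
P.S. To make the "each message is the result of calling a function
written by the translator" idea concrete, here is a rough sketch in
Python (chosen only for brevity; this is not Fluent's implementation,
and not how it would look once compiled to Elisp). The names
shared_photos_en, MESSAGES and format_message are made up for
illustration; the message itself mirrors the shared-photos example
above.

```python
# Hypothetical sketch: one "message function" per locale and message id.
# None of these names come from Fluent or Emacs; they are illustrative.

def shared_photos_en(user_name: str, photo_count: int, user_gender: str) -> str:
    """English rendering of the shared-photos message."""
    # Plural selector: English only distinguishes "one" from "other".
    if photo_count == 1:
        added = "added a new photo"
    else:
        added = f"added {photo_count} new photos"
    # Gender selector with a default branch, like *[other] in Fluent.
    stream = {"male": "his stream", "female": "her stream"}.get(
        user_gender, "their stream"
    )
    return f"{user_name} {added} to {stream}."


# Each locale supplies its own functions; the UI code only knows the ids.
MESSAGES = {"en-US": {"shared-photos": shared_photos_en}}


def format_message(locale: str, message_id: str, **args) -> str:
    """Look up the translator-supplied function and call it."""
    return MESSAGES[locale][message_id](**args)


print(format_message("en-US", "shared-photos",
                     user_name="Anne", photo_count=3, user_gender="female"))
# prints: Anne added 3 new photos to her stream.
```

The point of the design is that the caller passes the same three
arguments no matter the locale, and a translator for another language
is free to branch on them differently (or ignore some of them) without
the calling code changing.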