From: Jean-Christophe Helary
Newsgroups: gmane.emacs.devel
Subject: Re: A system for localizing documentation strings
Date: Thu, 26 Jul 2007 22:51:54 +0900
References: <795F38F4-7253-47DC-97DD-53BED4F0AB97@mx6.tiki.ne.jp>
To: emacs-devel@gnu.org

On 26 Jul 07, at 21:13, Eli Zaretskii wrote:

>> From: Jean-Christophe Helary
>> Date: Thu, 26 Jul 2007 12:29:19 +0900
>>
>> I was told by Eli Zaretskii that such discussions should better take
>> place here, hence my forward from help-gnu-emacs@gnu.org
>
> Thanks.
>
>> To offer a practical possibility for interactive localization we'd
>> need a function that dynamically generates output instead of the
>> "optional-documentation..." string. This function would take a number
>> of paired arguments:
>>
>> (docfun
>>   source-language-1 source-language-1-documentation-string
>>   source-language-2 source-language-2-documentation-string
>>   etc ...)
>>
>> for ex:
>>
>> (docfun
>>   EN "optional-documentation in EN..."
>>   FR "documentation optionnelle en FR...")
>
> I don't like this implementation idea, because it would require the
> user to byte-compile Lisp files whenever a translation to another
> language is added. This would be very inconvenient, especially for
> *.el files that are preloaded when Emacs is built, because that would
> mean one must have the sources available nearby, and must run the
> build procedure and "make install". The latter step requires sysadmin
> privileges on many platforms, another inconvenience.

I see two types of .el files:

1) those that come with the Emacs distribution;
2) those that can be installed in user space.

The distribution would ship the translated .el files, and updates to
the translations could be included in updates of the distribution, so
users would not have to rebuild anything when new translations arrive.
Since the translations live in the code, little management is
necessary (unlike systems where the localizations come in separate
files).

For .el files in user space, the author would maintain the file's
translations, and users would be able to compile the file without any
particular problem.

> I think it's much better to have a separate translation file for each
> .el file. Such a translation file would be loaded on demand when
> documentation for symbols defined on that file is requested by the
> user. The translation file needs include only the names of the
> symbols and their doc strings for supported languages. (We could also
> have separate translation files for each language, but I think it
> won't be necessary, as the number of symbols on a single .el file is
> quite small.)

I don't think this is a good idea, because it would put a bigger load
on the coder, who would then have to write keys in the code and values
in a separate file. That would also make the code harder to
understand, and it would be a major departure from the current coding
workflow.

>> (transfun function-name
>>   source-language
>>   target-language
>>   reference-function-name ; should be a list
>>   reference-file ; should be a list)
>>
>> The function-name declares which function has to be translated
>> The source-language declares from which language string the source
>> should be displayed
>
> Why do we need the source language?

Because as soon as we have a system that allows for localization, we
can expect "native" code to be written, so we will have to reference a
source language that _will_ be different from English.
We don't want people who have not mastered English to produce weird
English in their descriptions just because the system implies
source=English. The faulty English string would then have to be
rewritten before translation could proceed.

Also, we can expect to find translators who are more familiar with one
source language than with another. If there is a Japanese equivalent
to an English description, I'd rather use the Japanese as source and
the English as reference to translate to French, for example.

In the end, the translation needs to be recorded as a list of paired
strings for future reference. This is usually called a "translation
memory" and comes in different formats. One that is widely used in the
free software world is the PO compendium; another, mostly used in the
translation industry, is TMX (Translation Memory eXchange, an XML
dialect). PO more or less expects English to be the source, but my
understanding is that recent developments of gettext are making this a
thing of the past. TMX is totally "multilingual".

>> What we need is provide 1) a way for coders to identify the necessary
>> strings for the translation 2) a way for translators to add
>> translated strings "the emacs way" 3) a modification of the display
>> procedures to take the new strings into account.
>
> Your number 3) is not described correctly: it's not the display that
> needs to be modified, it's the Emacs documentation commands. The
> documentation commands don't display anything, they just insert the
> doc text into a buffer, whether *Help* or minibuffer or something
> else. The Emacs redisplay engine then displays that buffer; however,
> if the text in the buffer to be displayed is already in French (say),
> that is what you will see after it is displayed.

Ok, sorry for my misunderstanding.

> So what is needed is to modify the documentation commands so that they
> will look up the translated text and display that text instead of the
> original English doc string.

That is correct.
Except that we can't expect English to be the original anymore, hence
the need to specify what language the strings are written in.

> Also, we should keep in mind that Lisp primitives (those implemented
> in C) have their doc strings as C comments, not as C strings. The
> infrastructure developed for Emacs l10n should provide solution for
> the primitives as well, and the solution will have to be different
> both from your suggestion above and from the traditional gettext-style
> message catalog.

Could that part be designed separately? I mean, we could start by
modifying the documentation commands and then see how that works with
catalogs (either in the .el file or in separate files, because in the
end both solutions could be considered valid depending on the scope of
the .el file).

Jean-Christophe Helary
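
P.S. To make the paired-argument idea a little more concrete, here is
a rough sketch of how a docfun-style lookup could pick a string. It is
only an illustration under my own assumptions: the names
l10n-sketch-language, l10n-sketch-fallbacks and l10n-sketch-docfun are
hypothetical and do not exist in Emacs, and languages are written as
plain symbols. When none of the wanted languages is present, it falls
back to the first pair, which I take to be the language the author
wrote in, so that English is never assumed to be the source.

```elisp
;; Hypothetical sketch: none of these names exist in Emacs.

(defvar l10n-sketch-language 'fr
  "Language the reader would like documentation in (assumed variable).")

(defvar l10n-sketch-fallbacks '(en)
  "Languages to try, in order, when the preferred one is missing.")

(defun l10n-sketch-docfun (&rest pairs)
  "Pick a doc string from PAIRS and return it as (LANGUAGE . STRING).
PAIRS alternates language symbols and strings, as in the `docfun'
proposal.  The preferred language is tried first, then the fallbacks.
If none of them is present, return the first pair, which is assumed
to be the author's source language.  Return nil if PAIRS is empty."
  (let ((wanted (cons l10n-sketch-language l10n-sketch-fallbacks))
        found)
    ;; Walk the wanted languages until one has a string in PAIRS.
    (while (and wanted (not found))
      (let ((doc (plist-get pairs (car wanted))))
        (when doc
          (setq found (cons (car wanted) doc))))
      (setq wanted (cdr wanted)))
    ;; Nothing matched: fall back to the author's own language.
    (or found
        (and pairs (cons (car pairs) (cadr pairs))))))

;; With the preferred language set to `fr', the French string wins.
(l10n-sketch-docfun 'en "optional-documentation in EN..."
                    'fr "documentation optionnelle en FR...")
;; => (fr . "documentation optionnelle en FR...")

;; A string written only in Japanese is still returned, tagged `ja',
;; so a documentation command would know which language it got.
(let ((l10n-sketch-language 'de))
  (l10n-sketch-docfun 'ja "optional-documentation in JA..."))
;; => (ja . "optional-documentation in JA...")
```

Returning the language together with the string is a deliberate
choice in this sketch: it lets the caller tell the reader (or a
translator) which source language the text is actually in. It does
not address the byte-compilation concern or the C primitives.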