* Re: Emacs i18n
@ 2019-03-20 11:59 Bruno Haible
2019-03-20 16:36 ` Paul Eggert
` (2 more replies)
0 siblings, 3 replies; 151+ messages in thread
From: Bruno Haible @ 2019-03-20 11:59 UTC (permalink / raw)
To: rms-mXXj517/zsQ; +Cc: bug-gettext-mXXj517/zsQ, emacs-devel-mXXj517/zsQ
Richard Stallman wrote in
<https://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00328.html>:
> I can envision something like this:
>
> "russian-nom:%d байт%| скопирован%|, %s, %s"
>
> where the 'russian-nom' operator would replace the two %| sequences
> with the appropriate declensional suffixes for the nominative case.
It is, of course, tempting to try to do morphological analysis in an
algorithmic way, based on our background as algorithm hackers. François
Pinard and others considered this, back in 1995 when they started i18n in GNU.
The reason this approach was not chosen is still valid today:
When you design a translation system, you have two personas:
- the programmer,
- the translator.
The translation system defines
1) which information flows from the programmer to the translator,
and in which format,
2) which information flows back from the translator to the programmer,
and in which format.
And it has to cope with the assumed skills of these personas:
- The programmer, you can assume, can write and understand algorithms,
but does not master the grammar of more than one language (usually).
- The translator, you can assume, can translate sentences and knows
about the different meanings of words in different context. But they
cannot write nor understand algorithms. Many translators, in fact,
don't see the grammar as a set of rules.
You may find some people on the intersection, such as a Russian hacker,
but it is hard to find people with both skills for languages such as
Vietnamese, Slovenian, or Basque. So, you better design the system in
such a way that no person is assumed to have both skills.
The challenge is to define these formats 1) and 2) in a way that
* Programmers can do their job with their skills (i.e. don't need to
understand Russian).
* Translators can do their job with their skills (i.e. don't need to
understand algorithms).
In the gettext approach (where 1) are POT files and 2) are PO files) we
added plural form handling, which is just a small morphological variation,
and it required a significant amount of documentation and education for
translators. I would say, it is on the limit what we can make translators
grok.
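To make the plural form handling concrete: it comes down to a small formula that selects one of several translated strings by number. The following is a minimal Python sketch, assuming the three-form rule that the gettext manual documents for Russian in the `Plural-Forms` header. The function name `russian_plural_index` and the sample strings are illustrative only and are not taken from any real PO file.

```python
def russian_plural_index(n: int) -> int:
    """Python rendering of the C expression usually found in Russian PO headers:

    nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 :
                        n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);
    """
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


# Hypothetical msgstr[0], msgstr[1], msgstr[2] entries for one message.
MSGSTR = ["%d файл", "%d файла", "%d файлов"]


def translate(n: int) -> str:
    """Pick the plural form for n and substitute the number."""
    return MSGSTR[russian_plural_index(n)] % n
```

The translator only supplies the three strings; the formula is written once per language in the PO header.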
Now, when you give a translator a string
"russian-nom:%d байт%| скопирован%|, %s, %s"
you need to think about the appropriate tooling that will make the
translator understand
- what 'russian-nom' means,
- what the '|' characters mean,
- what the '%' characters mean.
Either the translator tool should somehow highlight these characters
and present on-line help, or it should present it as a sequence of
strings to translate:
Rule: russian-nom
"%d байт"
" скопирован"
", %s, %s"
It is important to realize that each such case of morphological variation
requires translator tooling support. And unfortunately different such tools
exist, and every translator has their preferred one. For the plural form
handling alone, it took several years until the main tools had support for
it in their UI.
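The "sequence of strings" presentation described above could be produced mechanically by a translation tool. A minimal Python sketch, assuming that every marked-up string begins with a `rule:` prefix and uses `%|` as the suffix marker, as in the example; the function name `split_marked_string` is hypothetical.

```python
from typing import List, Optional, Tuple


def split_marked_string(s: str, marker: str = "%|") -> Tuple[Optional[str], List[str]]:
    """Split 'rule:text%|text%|text' into the rule name and its segments.

    Assumption: the first colon ends the rule name. A plain message that
    happens to contain a colon would be mis-parsed, so a real tool would
    need an unambiguous way to flag marked-up strings.
    """
    rule, sep, body = s.partition(":")
    if not sep:  # no rule prefix at all
        rule, body = None, s
    return rule, body.split(marker)


rule, segments = split_marked_string("russian-nom:%d байт%| скопирован%|, %s, %s")
# rule is "russian-nom"; segments are the three strings shown in the message.
```

Even with such a helper, each tool would still have to decide how to display the rule name and explain it to the translator.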
Bruno
^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-20 11:59 Emacs i18n Bruno Haible
@ 2019-03-20 16:36 ` Paul Eggert
2019-03-20 21:32 ` Juri Linkov
2019-03-21 2:14 ` Richard Stallman
2 siblings, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-20 16:36 UTC (permalink / raw)
To: Bruno Haible, rms-mXXj517/zsQ
Cc: emacs-devel-mXXj517/zsQ, bug-gettext-mXXj517/zsQ

On 3/20/19 4:59 AM, Bruno Haible wrote:
> In the gettext approach (where 1) are POT files and 2) are PO files) we
> added plural form handling, which is just a small morphological variation,
> and it required a significant amount of documentation and education for
> translators. I would say, it is on the limit what we can make translators
> grok.

Thanks for making the point better than I was able to.

There's another reason pluralization is a good place to stop. GNU gettext
attacks the problem of how to translate formats containing printf conversion
specifications like %d, in phrases like "%d items". That is, gettext deals
with the grammatical problem of number, because printf formats numbers.
However, there are no printf conversion specifications for other grammatical
aspects such as case, gender, tense, voice, or mood, which means there is no
significant need for gettext to deal with these other aspects.

In hindsight it might have been better if gettext had not attacked the
problem of plurals. As you write, even plurals are nearly a bridge too far.
But it's done now, so we might as well use it.
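The link between `%d` and grammatical number in the message above can be illustrated with Python's standard `gettext` module, which exposes the same `ngettext` interface. This is only a sketch: it uses `NullTranslations`, so no catalog is loaded and the English source strings are returned, and the helper name `describe` is made up for the example.

```python
import gettext

# NullTranslations applies the default rule: singular if n == 1, else plural.
catalog = gettext.NullTranslations()


def describe(n: int) -> str:
    """Choose the format by number, then let the %d conversion print it."""
    return catalog.ngettext("%d item", "%d items", n) % n
```

The number is the only run-time value that both selects the string and is printed by it, which is why number is the one grammatical category the format needs help with.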
* Re: Emacs i18n
2019-03-20 11:59 Emacs i18n Bruno Haible
2019-03-20 16:36 ` Paul Eggert
@ 2019-03-20 21:32 ` Juri Linkov
2019-03-21 2:14 ` Richard Stallman
[not found] ` <87h8bx5ijn.fsf-i9wRM+HIrmlRTR8OWt4JRw@public.gmane.org>
2019-03-21 2:14 ` Richard Stallman
2 siblings, 2 replies; 151+ messages in thread
From: Juri Linkov @ 2019-03-20 21:32 UTC (permalink / raw)
To: Bruno Haible; +Cc: emacs-devel, rms, bug-gettext

> [...]
> It is important to realize that each such case of morphological variation
> requires translator tooling support. And unfortunately different such tools
> exist, and every translator has their preferred one. For the plural form
> handling alone, it took several years until the main tools had support for
> it in their UI.

Indeed, a complete implementation of all Russian morphological rules
takes ~1600 lines of dense Perl code:

http://www.linkov.net/files/nlp/Lingua-RU-Inflect.pm

I can't imagine how to include all these rules in gettext.

But there is no need, because gettext already strikes a decent balance
between the complexity of natural languages and the practical needs of
program internationalization, where translators themselves decide how
words in messages should be inflected for different plural forms.

Currently we have more urgent tasks: after the first step of adding
‘ngettext’ as in CLISP, development stalled on the problem of
splitting messages into domains.

But maybe CLISP already provides a good way to map packages to gettext
domains? Does it require every package to have a separate domain, or
does it collect translations from all packages into one domain?
* Re: Emacs i18n
2019-03-20 21:32 ` Juri Linkov
@ 2019-03-21 2:14 ` Richard Stallman
[not found] ` <E1h6nE3-0000bt-SW-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
[not found] ` <87h8bx5ijn.fsf-i9wRM+HIrmlRTR8OWt4JRw@public.gmane.org>
1 sibling, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-21 2:14 UTC (permalink / raw)
To: Juri Linkov; +Cc: bug-gettext, bruno, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> Indeed, a complete implementation of all Russian morphological rules
> takes ~1600 lines of dense Perl code:
> http://www.linkov.net/files/nlp/Lingua-RU-Inflect.pm
> I can't imagine how to include all these rules to gettext.

I agree with you about that. What I propose is something else.

1. I do not propose implementing them all. Only some -- whichever ones
we think are worth while.

2. I do not propose putting any of this in gettext.
What I propose would be Emacs code that operates on the strings that
come from gettext.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
[parent not found: <E1h6nE3-0000bt-SW-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>]
* Re: Emacs i18n
[not found] ` <E1h6nE3-0000bt-SW-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
@ 2019-03-21 21:45 ` Juri Linkov
2019-03-23 2:28 ` Richard Stallman
2019-03-22 20:50 ` Chusslove Illich
1 sibling, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-21 21:45 UTC (permalink / raw)
To: Richard Stallman; +Cc: bug-gettext-mXXj517/zsQ, emacs-devel-mXXj517/zsQ

> > Indeed, a complete implementation of all Russian morphological rules
> > takes ~1600 lines of dense Perl code:
> >
> > http://www.linkov.net/files/nlp/Lingua-RU-Inflect.pm
> >
> > I can't imagine how to include all these rules to gettext.
>
> I agree with you about that. What I propose is something else.
>
> 1. I do not propose implementing them all. Only some -- whichever ones
> we think are worth while.
>
> 2. I do not propose putting any of this in gettext.
> What I propose would be Emacs code that operates on the strings that
> come from gettext.

The misconception in your proposal is that it assumes purely algorithmic
inflection, whereas inflection in Russian is actually dictionary-based
(in addition to algorithms that process words from the dictionary),
i.e. to be able to inflect a word you need a large dictionary of all
words where each word in the dictionary has at least the following
lexical properties:

- part of speech
- noun grammatical gender: masculine, feminine, neuter
- noun animacy: animate, inanimate
- inflection type

And the main parameters that influence the declension are:

- grammatical case (one of 6 basic: nominative, genitive, dative,
  accusative, instrumental, prepositional, plus some additional)
- number: singular and plural. Dual is not a grammatical number, it only
  influences the choice of cases for words after numerals:
    for 1    - nominative case, singular
    for 2..4 - genitive case, singular
    for 5..  - genitive case, plural

An additional problem is that there are many exceptions: some words have
an additional form called "count form"
https://en.wikipedia.org/wiki/Russian_declension#Count_form

For instance, "5 байт" (5 byte) is used instead of the form the
grammatical rule would produce. The rule requires genitive plural for
most other words, but not for bytes, i.e. this is incorrect:
"5 байтов" (5 bytes).

Such exceptions are marked in the dictionary with a special property
that has different values:

- mandatory: only the count form is allowed for such units of measure as
  amperes, watts, volts, bits, bytes, etc.
- optional: both forms are accepted for such units as angstroms, gauss,
  (kilo)grams, decibels, carats, microns, ohms, röntgen, etc.
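The interaction between the numeral rule and the count-form exception described above can be sketched in a few lines of Python. Everything here is illustrative: the two-entry `LEXICON`, its field names, and the function `form_after_numeral` are invented for the example, and only the mandatory count form is modelled. The sketch also applies the rule to the last digits of the number (so that 11 to 14 take genitive plural and 21 behaves like 1), which is the usual refinement of the simplified 1 / 2..4 / 5.. table.

```python
# Toy dictionary: each noun carries the forms needed after a numeral,
# plus an optional "count form" that overrides the genitive plural.
LEXICON = {
    "файл": {"nom_sg": "файл", "gen_sg": "файла", "gen_pl": "файлов",
             "count_form": None},
    "байт": {"nom_sg": "байт", "gen_sg": "байта", "gen_pl": "байтов",
             "count_form": "байт"},  # mandatory count form
}


def form_after_numeral(lemma: str, n: int) -> str:
    """Return the noun form required after the numeral n."""
    entry = LEXICON[lemma]
    last, last_two = n % 10, n % 100
    if last == 1 and last_two != 11:
        return entry["nom_sg"]
    if 2 <= last <= 4 and not 12 <= last_two <= 14:
        return entry["gen_sg"]
    # Genitive plural slot: a word with a count form uses it instead.
    return entry["count_form"] or entry["gen_pl"]
```

The rule itself is short; the part that cannot be computed is the per-word data, which is the point of the message above.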
* Re: Emacs i18n
2019-03-21 21:45 ` Juri Linkov
@ 2019-03-23 2:28 ` Richard Stallman
2019-03-23 7:55 ` Yuri Khan
[not found] ` <E1h7WOF-0006T8-Be-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
0 siblings, 2 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-23 2:28 UTC (permalink / raw)
To: Juri Linkov; +Cc: bug-gettext, bruno, emacs-devel

> The misconception of your proposal is assuming a pure algorithmic
> inflection whereas actually inflection in Russian is dictionary-based
> (in addition to algorithms that process words from the dictionary),

Maybe I did. I see various meanings for "pure algorithmic", so I am
not sure whether I did, or did not, assume the point you have in mind.

> i.e. to be able to inflect a word you need a large dictionary of all
> words where each word in the dictionary has at least the following
> lexical properties:
> - part of speech
> - noun grammatical gender: masculine, feminine, neuter
> - noun animacy: animate, inanimate
> - inflection type

It sounds like Russian has various declensions, like Latin.
Do I understand right?

If so, it is true that 'russian-nom' would require specification
of declensions. Maybe `russian-masc' as well.

Are there standard names or codes for the declensions? In Latin, each
declension has a number. I think those numbers are standard in
teaching of Latin, so everyone who can read Latin knows those numbers,
and translators would find it natural to use a construct that
specifies the proper declension by number.

We would not have to implement all of them -- only those that
are useful enough to be worth implementing and documenting.
We would tell translators to handle the other declensions,
and the special exceptions, and the irregular plurals,
using lower-level constructs comparable to what gettext does now.
* Re: Emacs i18n
2019-03-23 2:28 ` Richard Stallman
@ 2019-03-23 7:55 ` Yuri Khan
[not found] ` <CAP_d_8WjQwAtcWCfkjXHtc-dqYyBfnaP0+9L8KK6eCp4r_ZsPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
[not found] ` <E1h7WOF-0006T8-Be-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
1 sibling, 1 reply; 151+ messages in thread
From: Yuri Khan @ 2019-03-23 7:55 UTC (permalink / raw)
To: rms; +Cc: Emacs developers, bug-gettext, bruno, Juri Linkov

> It sounds like Russian has various declensions, like Latin.
> Do I understand right?

Yes, Russian has several declension types for nouns, and several
conjugation types for verbs, and knowing the declension type helps one
deduce the ending for each grammatical case and number, most of the
time.

> Are there standard names or codes for the declensions?

At school, they teach us there are three declensions. First declension
for feminine nouns and a few masculine nouns ending in -а and -я;
second declension for masculine nouns ending in a consonant and neuter
nouns ending in -о or -е; and third declension for feminine nouns
ending in -ь.

This is not the complete picture. There are also non-declined nouns,
nouns that decline as if they were adjectives or pronouns, and
exceptions.

> In Latin, each
> declension has a number. I think those numbers are standard in
> teaching of Latin, so everyone who can read Latin knows those numbers,
> and translators would find it natural to use a construct that
> specifies the proper declension by number.

The difference is that any translation to Latin would be done by a
linguist, while translation to Russian would be done by ordinary
Russian-speaking people. Most of whom do not keep the scientific
details of declensions in their heads any longer than necessary for
their current study and work.

> We would not have to implement all of them -- only those that
> are useful enough to be worth implementing and documenting.

Right, and pretty much the only part of grammar that is useful for
software UI localization is the grammatical number, because that is
the only thing that changes by circumstances outside the control of
the code. It just does not happen that the same message would put a
noun in different cases, or a verb in different tenses, depending on
the value of some expression.
[parent not found: <CAP_d_8WjQwAtcWCfkjXHtc-dqYyBfnaP0+9L8KK6eCp4r_ZsPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: Emacs i18n
[not found] ` <CAP_d_8WjQwAtcWCfkjXHtc-dqYyBfnaP0+9L8KK6eCp4r_ZsPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2019-03-23 17:50 ` Ineiev
2019-03-24 1:43 ` Richard Stallman
1 sibling, 0 replies; 151+ messages in thread
From: Ineiev @ 2019-03-23 17:50 UTC (permalink / raw)
To: Yuri Khan
Cc: Juri Linkov, bug-gettext-mXXj517/zsQ, rms-mXXj517/zsQ, Emacs developers

On Sat, Mar 23, 2019 at 02:55:48PM +0700, Yuri Khan wrote:
> Right, and pretty much the only part of grammar that is useful for
> software UI localization is the grammatical number, because that is
> the only thing that changes by circumstances outside the control of
> the code. It just does not happen that the same message would put a
> noun in different cases, or a verb in different tenses, depending on
> the value of some expression.

I encountered constructs like "We use %s %s", where the first arg
is a color ("red", "green", "cool"), and the second arg is a fruit
("apple", "lemon", "horse radish") in singular, dual or plural.
* Re: Emacs i18n
[not found] ` <CAP_d_8WjQwAtcWCfkjXHtc-dqYyBfnaP0+9L8KK6eCp4r_ZsPQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-03-23 17:50 ` Ineiev
@ 2019-03-24 1:43 ` Richard Stallman
1 sibling, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-24 1:43 UTC (permalink / raw)
To: Yuri Khan
Cc: juri-GgPz7P5p7nCsTnJN9+BGXg, bug-gettext-mXXj517/zsQ, emacs-devel-mXXj517/zsQ

> Right, and pretty much the only part of grammar that is useful for
> software UI localization is the grammatical number, because that is
> the only thing that changes by circumstances outside the control of
> the code. It just does not happen that the same message would put a
> noun in different cases, or a verb in different tenses, depending on
> the value of some expression.

That's how I thought it would be. That's why I proposed a construct
russian-nom that would decline a noun that's meant to be in the
nominative case.

There could be other such constructs for other cases, those that are
used often enough in messages to be worth the trouble, and maybe
different ones for adjectives, if that is worth the trouble.

If translators don't find these more convenient, they won't have to
use these. I don't see any sense in ruling them out without giving
them a try.
[parent not found: <E1h7WOF-0006T8-Be-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>]
* Re: Emacs i18n
[not found] ` <E1h7WOF-0006T8-Be-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
@ 2019-03-23 21:48 ` Juri Linkov
2019-03-24 1:47 ` Richard Stallman
0 siblings, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-23 21:48 UTC (permalink / raw)
To: Richard Stallman; +Cc: bug-gettext-mXXj517/zsQ, emacs-devel-mXXj517/zsQ

> Maybe I did. I see various meanings for "pure algorithmic", so I am
> not sure whether I did, or did not, assume the point you have in mind.

Sorry if I incorrectly interpreted your proposal; it was unclear to me
at what stage you planned to expand the special markup such as %|

Doing this at runtime means adding large dictionaries to the distribution
(together with complex algorithms) to perform a dictionary lookup.
Or alternatively to use an external online API that inflects the words.

But what we could do better than that is to improve po-mode to help
translators to fill all plural forms in the existing format, but to do
this automatically using either a downloaded or online dictionary.
* Re: Emacs i18n
2019-03-23 21:48 ` Juri Linkov
@ 2019-03-24 1:47 ` Richard Stallman
0 siblings, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-24 1:47 UTC (permalink / raw)
To: Juri Linkov; +Cc: bug-gettext, bruno, emacs-devel

> Sorry, if I incorrectly interpreted your proposal, it was unclear to me
> at what stage you planned to expand the special markup such as %|

At the stage of translating a message -- when the value of the number
is known.

> Doing this at runtime means adding large dictionaries to the distribution
> (together with complex algorithms) to perform a dictionary lookup.

No dictionary is needed. It would handle only regular plurals
(following known rules).

> Or alternatively to use an external online API that inflects the words.

No, it will be done directly in Lisp code.
* Re: Emacs i18n
[not found] ` <E1h6nE3-0000bt-SW-iW7gFb+/I3LZHJUXO5efmti2O/JbrIOy@public.gmane.org>
2019-03-21 21:45 ` Juri Linkov
@ 2019-03-22 20:50 ` Chusslove Illich
1 sibling, 0 replies; 151+ messages in thread
From: Chusslove Illich @ 2019-03-22 20:50 UTC (permalink / raw)
To: bug-gettext-mXXj517/zsQ, rms-mXXj517/zsQ
Cc: emacs-devel-mXXj517/zsQ, Juri Linkov

>> [: Juri Linkov :]
>> Indeed, a complete implementation of all Russian morphological rules
>> takes ~1600 lines of dense Perl code:
>>
>> http://www.linkov.net/files/nlp/Lingua-RU-Inflect.pm
>>
>> I can't imagine how to include all these rules to gettext.
>
> [: Richard Stallman :]
> I agree with you about that. What I propose is something else.
>
> 1. I do not propose implementing them all. Only some -- whichever ones
> we think are worth while.
>
> 2. I do not propose putting any of this in gettext. What I propose
> would be Emacs code that operates on the strings that come from
> gettext.

I'd like to mention that a system of this kind, Ki18n, has been in
operation within the KDE ecosystem for more than a decade now. The system
is in fact invisible to programmers (for the most part), and it is also
invisible to translators, unless they know about it and want to use it. At
the last count, 10 language teams do make use of it. Translators have at
their disposal a generic scripting system, so that any kind of algorithmic
adaptation of translation is possible; and some interesting uses have come
up.

Programmer's perspective is given here:
http://api.kde.org/frameworks/ki18n/html/prg_guide.html . There is in fact
almost no mention of the system, which is as intended; only the subsections
"Dynamic Contexts" and "Placing and Installing Scripting Modules" provide a
clue that it exists.

Translator's perspective is given here:
http://techbase.kde.org/Localization/Concepts/Transcript . It includes some
real-life examples at the end.

The variety of basic functions defined by translators can be seen in the
system's source tarball
http://download.kde.org/stable/frameworks/5.56/ki18n-5.56.0.tar.xz in
po/*/scripts directories.

Regarding specifically plural handling, this is in normal use left to
Gettext standard functionality, since it was already there for a long time.
However, there are two cases where the system does get used for plurals.
One is the typical failure case where a programmer knows that the
substituted number will always be greater than one and therefore thinks an
ngettext call is not needed; when this error is seen during a pre-release
message freeze, a scripted translation can be used as a workaround until
the fix in the next release. The other case is when a language also needs
plural handling for float-type arguments (e.g. gd in the tarball above).

Each programming environment (programming language plus foundation
libraries) can implement its own version of a similar system, as proposed
here for Emacs. However, I think a unified Gettext solution would be
preferable. Based on the experience with Ki18n, some years ago I made such
a clean design for Gettext, but never got time to work on it. It is
described at http://nedohodnik.net/gettextbis/ . Section 6 describes the
scripting system itself, with sections 2 and 3 detailing the necessary
support for it.

--
Chusslove Illich (Часлав Илић)
[parent not found: <87h8bx5ijn.fsf-i9wRM+HIrmlRTR8OWt4JRw@public.gmane.org>]
* Re: Emacs i18n
[not found] ` <87h8bx5ijn.fsf-i9wRM+HIrmlRTR8OWt4JRw@public.gmane.org>
@ 2019-03-21 2:55 ` Bruno Haible
0 siblings, 0 replies; 151+ messages in thread
From: Bruno Haible @ 2019-03-21 2:55 UTC (permalink / raw)
To: Juri Linkov
Cc: emacs-devel-mXXj517/zsQ, rms-mXXj517/zsQ, bug-gettext-mXXj517/zsQ

Hi Juri,

> Currently we have more urgent tasks after the first step of adding
> ‘ngettext’ like in CLISP, the development stalled on the problem of
> splitting messages into domains.

You are very welcome to ask for advice on bug-gettext. It's there that
you can find the experts. (I don't read emacs-devel usually.)

> But maybe CLISP already provides a good way to map packages to gettext
> domains? Does it require every package to have a separate domain or
> it collects translations from all packages into one domain?

What matters for the domains is what code gets distributed together.

* When you have two Lisp packages that are released by separate groups
  of developers, of course they must use separate translation domains.
  Otherwise you would have to co-ordinate the merging of their POT files,
  which makes no sense since they make releases at different times.

* On the other hand, when you have two Lisp packages that are always
  released together, in the same tarball, it is more efficient for the
  translators if they receive one notification about a new POT file than
  two notifications about two POT files on the same day.

For Common Lisp code, the Common Lisp package name _may_ be used to
derive the domain name. But this is up to the developers.

For reference, i18n in CLISP is described here:
https://clisp.sourceforge.io/impnotes/i18n-mod.html
https://www.gnu.org/software/gettext/manual/html_node/Common-Lisp.html
and there is a sample and a test case in GNU gettext:
https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/examples/hello-clisp/hello.lisp.in
https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/tests/lang-clisp

Bruno
* Re: Emacs i18n
2019-03-20 11:59 Emacs i18n Bruno Haible
2019-03-20 16:36 ` Paul Eggert
2019-03-20 21:32 ` Juri Linkov
@ 2019-03-21 2:14 ` Richard Stallman
2019-03-22 1:26 ` Bruno Haible
2 siblings, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-21 2:14 UTC (permalink / raw)
To: Bruno Haible; +Cc: bug-gettext, emacs-devel

> When you design a translation system, you have two personas:
> - the programmer,
> - the translator.

> The translation system defines
> 1) which information flows from the programmer to the translator,
> and in which format,
> 2) which information flows back from the translator to the programmer,
> and in which format.

That argument is valid for gettext, but not for Emacs.
This is the part that doesn't fit Emacs:

> - The programmer, you can assume, can write and understand algorithms,
> but does not master the grammar of more than one language (usually).

In the development of Emacs there are many programmers, even some who
speak Russian. We will have no difficulty implementing and
maintaining russian-masc, russian-nom, and so on.

These constructs do not need to be known to gettext. For gettext,
they will simply be part of the translation string.

We can do this for those languages in which it is convenient for us --
those that someone knows and decides to handle. For other languages,
we can stick to the low-level gettext approach, which will work for
all languages.

> - The translator, you can assume, can translate sentences and knows
> about the different meanings of words in different context.

The Russian translation team for Emacs will not have difficulty using
russian-masc, russian-nom, and so on. Being Russian speakers, they
will understand how these constructs make sense for Russian, once
they read the documentation for them.

> In the gettext approach (where 1) are POT files and 2) are PO files) we
> added plural form handling, which is just a small morphological variation,
> and it required a significant amount of documentation and education for
> translators. I would say, it is on the limit what we can make translators
> grok.

The gettext approach requires coding the algorithm in the translations
file. My approach has the advantage of avoiding that.

> Now, when you give a translator a string
> "russian-nom:%d байт%| скопирован%|, %s, %s"
> you need to think about the appropriate tooling that will make the
> translator understand
> - what 'russian-nom' means,
> - what the '|' characters mean,
> - what the '%' characters mean.

I picked that syntax on the spur of the moment because I thought it
would be natural and convenient. If that isn't natural and convenient
for the translators, we can pick a different one.

> Either the translator tool should somehow highlight these characters
> and present on-line help,

That would be good to do.

> it should present it as a sequence of
> strings to translate:
> Rule: russian-nom
> "%d байт"
> " скопирован"
> ", %s, %s"

Is this general enough to handle all the use cases?
I don't know -- I don't speak Russian.

> For the plural form
> handling alone, it took several years until the main tools had support for
> it in their UI.

What sort of syntax do the tools support for plurals?
* Re: Emacs i18n
2019-03-21 2:14 ` Richard Stallman
@ 2019-03-22 1:26 ` Bruno Haible
2019-03-23 2:29 ` Richard Stallman
0 siblings, 1 reply; 151+ messages in thread
From: Bruno Haible @ 2019-03-22 1:26 UTC (permalink / raw)
To: rms; +Cc: bug-gettext, emacs-devel

Richard Stallman wrote:
> > - The translator, you can assume, can translate sentences and knows
> > about the different meanings of words in different context.
>
> The Russian translation team for Emacs will not have difficulty using
> russian-masc, russian-nom, and so on. Being Russian speakers, they
> will understand how these constructs make sense for Russian, once
> they read the documentation for them.

Still, it's essential to consider what the programmers send to the
translators, and what the translators send back in return.

> I don't speak Russian.

Since you speak French perfectly, and we have a French translator on this
list (Jean-Christophe Helary), let me give an example in French.

As far as I understand, for plural handling instead of asking the
translator to translate
  msgid "He bought one nice horse."
  msgid_plural "He bought %d nice horses."
you would send them just the string
  "He bought %d nice horses."
and expect that the translator sends back the string
  "Il acheta %d beau%| cheval%|."
and then have, in the program, code that transforms this to
  "Il acheta un beau cheval."
or
  "Il acheta %d beaux chevaux."

I claim that

1) It is not a win for the translator.
   It is just as easy for the translator to produce two strings as a
   string with several markers. Speech is natural to translators, not
   markup and grammar.
   Additionally, how will the translator know whether they have done it
   correctly or made a mistake? If you don't want translators to return
   untested translations, there will be the need to integrate the
   algorithmic code into the translation tools (KBabel, Lokalize,
   Gtranslator, Poedit, etc.). How do you want to do that?

2) It is hard to implement:
   - For the singular case, you need to know that "cheval" is masculine.
     Would you like the translator to provide this information, through
     markup such as
       "Il acheta %d<m> beau%| cheval%|."
     or would you like the code to look it up through a dictionary?
   - For the plural case, you need to know that the plural of "*eau" is
     "*eaux" in French (a rule that can be coded), but also that the
     plural of "cheval" is "chevaux" (or vice versa, that the singular
     of "chevaux" is "cheval") - which requires a dictionary lookup.

Bruno
* Re: Emacs i18n
2019-03-22 1:26 ` Bruno Haible
@ 2019-03-23 2:29 ` Richard Stallman
0 siblings, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-23 2:29 UTC (permalink / raw)
To: Bruno Haible; +Cc: bug-gettext-mXXj517/zsQ, emacs-devel-mXXj517/zsQ

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> and expect that the translator sends back the string
> "Il acheta %d beau%| cheval%|."

You are right that modification of the stem creates a complication.
However, in French it would not be difficult to handle. The
processing could recognize when the stem before %| calls for
modification and modify the plural according to the rules. I think it
would not be hard to implement operators french-masc and french-fem
that would handle all regular plurals.

> - For the plural case, you need to know that the plural of "*eau" is
> "*eaux" in French (a rule that can be coded), but also that the
> plural of "cheval" is "chevaux" (or vice versa, that the singular
> of "chevaux" is "cheval") - which requires a dictionary lookup.

Each of these words has a regular plural. french-masc would convert
al%| into aux when plural is called for.

For irregular plurals, the translator would give both forms by hand,
working at the gettext level.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
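[Editorial sketch] The `french-masc` operator described above was never specified in detail, so the following is only one possible reading of it: a function that rewrites each `%|` marker according to two regular stem rules ("al" to "aux", "eau" to "eaux") and otherwise appends "s". The name `my-french-masc-expand` and the exact rule set are assumptions made for illustration, and the plural threshold (n > 1) follows the French rule used in Juri Linkov's patch.

```elisp
;; Sketch only: `my-french-masc-expand' is a hypothetical helper, not an
;; existing Emacs function.

(defun my-french-masc-expand (template n)
  "Expand each \"%|\" marker in TEMPLATE for the count N.
For N greater than 1, apply regular French masculine plural rules;
otherwise remove the markers."
  (let ((case-fold-search nil))
    (if (> n 1)
        (let ((s template))
          ;; Stem-changing rules first, then the default -s ending.
          (setq s (replace-regexp-in-string "al%|" "aux" s t t))
          (setq s (replace-regexp-in-string "eau%|" "eaux" s t t))
          (replace-regexp-in-string "%|" "s" s t t))
      (replace-regexp-in-string "%|" "" template t t))))

;; Usage: expand first, then hand the result to `format'.
(format (my-french-masc-expand "Il acheta %d beau%| cheval%|." 3) 3)
```

The sketch also shows the limits both authors mention. It prints the numeral "1" rather than "un" in the singular, and a noun such as "festival", whose plural is "festivals", would have to be handled as an irregular case with both forms written out by hand.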
* Re: bug#34520: delete-matching-lines should report how many lines it deleted
[not found] ` <871s3p0zdz.fsf@mail.linkov.net>
@ 2019-03-03 3:04 ` Richard Stallman
2019-03-03 15:31 ` Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-03 3:04 UTC (permalink / raw)
To: Juri Linkov; +Cc: emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

You wrote:
======================================================================
Here is an experimental but extensible implementation that handles
the case of formatting the recently added message taking into account
grammatical number of its argument:

  (defvar i18n-translations-hash (make-hash-table :test 'equal))

  (defun i18n-add-translation (_language-environment from to)
    (puthash from to i18n-translations-hash))

  (i18n-add-translation
   "English"
   "Deleted %d matching lines"
   (lambda (format-string count)
     (if (= count 1)
         "Deleted %d matching line"
       "Deleted %d matching lines")))

  (defun i18n-get-translation (format-string &rest args)
    (pcase (gethash format-string i18n-translations-hash)
      ((and (pred functionp) f) (apply f format-string args))
      ((and (pred stringp) s) s)
      (_ format-string)))

  (advice-add 'message :around
              (lambda (orig-fun format-string &rest args)
                (apply orig-fun (apply 'i18n-get-translation format-string args) args))
              '((name . message-i18n)))
======================================================================

It seems pretty good. When installing it, it should not use
`advice-add'. Rather, `message' should call a list of functions.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
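[Editorial sketch] The suggestion that `message' should "call a list of functions" instead of being advised could look roughly like the following. This is a guess at the shape of such a hook, not code that was installed: the variable `my-message-translate-functions` and both helper functions are hypothetical names, and only `run-hook-with-args-until-success` and `add-hook` are existing Emacs functions.

```elisp
;; Sketch only: every `my-' name below is hypothetical.

(defvar my-message-translate-functions nil
  "Abnormal hook of translation functions.
Each function is called with FORMAT-STRING followed by the format
arguments.  It returns a replacement format string, or nil to decline.")

(defun my-translate-format (format-string &rest args)
  "Return the first non-nil translation of FORMAT-STRING, or FORMAT-STRING."
  (or (apply #'run-hook-with-args-until-success
             'my-message-translate-functions format-string args)
      format-string))

(defun my-translate-deleted-lines (format-string &rest args)
  "Pick the singular form of one known message when the count is 1."
  (when (equal format-string "Deleted %d matching lines")
    (if (= (car args) 1)
        "Deleted %d matching line"
      format-string)))

(add-hook 'my-message-translate-functions #'my-translate-deleted-lines)

;; Usage: translate the format string, then format as usual.
(format (my-translate-format "Deleted %d matching lines" 1) 1)
```

Compared with `advice-add', a hook of this kind can be extended by several packages without any of them wrapping `message' itself, and a format string that no function recognizes passes through unchanged.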
* Re: Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted)
2019-03-03 3:04 ` bug#34520: delete-matching-lines should report how many lines it deleted Richard Stallman
@ 2019-03-03 15:31 ` Eli Zaretskii
2019-03-03 20:57 ` Emacs i18n Juri Linkov
2019-03-04 3:27 ` Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) Richard Stallman
0 siblings, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-03 15:31 UTC (permalink / raw)
To: rms; +Cc: emacs-devel, juri

> From: Richard Stallman <rms@gnu.org>
> Date: Sat, 02 Mar 2019 22:04:06 -0500
> Cc: emacs-devel@gnu.org
>
> (advice-add 'message :around
> (lambda (orig-fun format-string &rest args)
> (apply orig-fun (apply 'i18n-get-translation format-string args) args))
> '((name . message-i18n)))
> ======================================================================
>
> It seems pretty good. When installing it, it should not use
> `advice-add'. Rather, `message' should call a list of functions.

This has come up several times in the past. The main problem with
i18n in Emacs is that, unlike in many text-mode programs, 'message'
covers a tiny portion of the Emacs UI. We have help commands that pop
up buffers; we have commands that prompt in the minibuffer; we have
menu items and labels on tool-bar buttons; we have help-echo on menus,
tool bar, the mode line, and mouse-sensitive text; we have tooltips;
etc. etc. What's worse, most of the text shown by these features is
computed dynamically by the commands that display the text.

Any reasonably relevant i18n infrastructure for Emacs should address
at least some of the above. For example, significant progress could
be made if we had infrastructure for translating doc strings, which
would allow translators to provide message catalogs for individual
Lisp packages. Past discussions revealed that even this limited
progress is not really trivial.

Unfortunately, past discussions didn't lead to any significant
progress wrt this. While making some progress would be welcome, I
suggest that we don't pretend the solution is as easy as advice around
'message', but instead try to attack the more significant parts of the
problem.

Volunteers are welcome.
* Re: Emacs i18n
2019-03-03 15:31 ` Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) Eli Zaretskii
@ 2019-03-03 20:57 ` Juri Linkov
2019-03-04 1:46 ` Jean-Christophe Helary
2019-03-04 3:27 ` Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) Richard Stallman
1 sibling, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-03 20:57 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: rms, emacs-devel

>> It seems pretty good. When installing it, it should not use
>> `advice-add'. Rather, `message' should call a list of functions.
>
> Unfortunately, past discussions didn't lead to any significant
> progress wrt this.

My intention was to fix the bug which manifests itself in
grammatically incorrect sentences displayed by ‘message’ like

  Deleted 1 matching lines
  1 matches found
  ...

After searching for available packages I found only this page
https://savannah.nongnu.org/projects/emacs-i18n
that shows no progress for many years.

So here is a patch that fixes the bug by translating currently
invalid messages into grammatically correct English. It also
opens the gate towards translation of messages in many languages.

Currently this feature is activated by (require 'i18n-message):

[-- Attachment #2: i18n-message.patch --]
[-- Type: text/x-diff, Size: 10283 bytes --]

diff --git a/lisp/replace.el b/lisp/replace.el
index 59ad1a375b..b05bb51353 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -986,6 +986,12 @@ flush-lines
     (when interactive (message "Deleted %d matching lines" count))
     count))

+(eval-after-load "i18n-message"
+  '(i18n-add-translation "English"
+     "Deleted %d matching lines"
+     '("Deleted %d matching line"
+       "Deleted %d matching lines")))
+
 (defun how-many (regexp &optional rstart rend interactive)
   "Print and return number of matches for REGEXP following point.
 When called from Lisp and INTERACTIVE is omitted or nil, just return
@@ -1032,11 +1038,15 @@ how-many
 	  (if (= opoint (point))
 	      (forward-char 1)
 	    (setq count (1+ count))))
-      (when interactive (message "%d occurrence%s"
-				 count
-				 (if (= count 1) "" "s")))
+      (when interactive (message "%d occurrences" count))
       count)))

+(eval-after-load "i18n-message"
+  '(i18n-add-translation "English"
+     "%d occurrences"
+     '("%d occurrence"
+       "%d occurrences")))
+
 \f
 (defvar occur-menu-map
   (let ((map (make-sparse-keymap)))
@@ -2730,10 +2740,7 @@ perform-replace
 						 (1+ num-replacements))))))
 			     (when (and (eq def 'undo-all)
 					(null (zerop num-replacements)))
-			       (message "Undid %d %s" num-replacements
-					(if (= num-replacements 1)
-					    "replacement"
-					  "replacements"))
+			       (message "Undid %d replacements" num-replacements)
 			       (ding 'no-terminate)
 			       (sit-for 1)))
 			   (setq replaced nil last-was-undo t last-was-act-and-show nil)))
@@ -2859,9 +2866,8 @@ perform-replace
 		      last-was-act-and-show nil))))))
       (replace-dehighlight))
     (or unread-command-events
-	(message "Replaced %d occurrence%s%s"
+	(message "Replaced %d occurrences%s"
 		 replace-count
-		 (if (= replace-count 1) "" "s")
 		 (if (> (+ skip-read-only-count
 			   skip-filtered-count
 			   skip-invisible-count)
@@ -2883,6 +2889,16 @@ perform-replace
 		   "")))
     (or (and keep-going stack) multi-buffer)))

+(eval-after-load "i18n-message"
+  '(i18n-add-translations
+    "English"
+    '(("Undid %d replacements"
+       ("Undid %d replacement"
+        "Undid %d replacements"))
+      ("Replaced %d occurrences%s"
+       ("Replaced %d occurrence%s"
+        "Replaced %d occurrences%s")))))
+
 (provide 'replace)

 ;;; replace.el ends here
diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index 3fd2a7e701..d2d748fca3 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -459,7 +459,7 @@ grep-mode-font-lock-keywords
      ;; remove match from grep-regexp-alist before fontifying
      ("^Grep[/a-zA-z]* started.*"
       (0 '(face nil compilation-message nil help-echo nil mouse-face nil) t))
-     ("^Grep[/a-zA-z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?matches found\\)\\|\\(no matches found\\)\\).*"
+     ("^Grep[/a-zA-z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? found\\)\\|\\(no matches found\\)\\).*"
       (0 '(face nil compilation-message nil help-echo nil mouse-face nil) t)
       (1 compilation-info-face nil t)
       (2 compilation-warning-face nil t))
@@ -561,6 +561,12 @@ grep-exit-message
 	(cons msg code)))
     (cons msg code)))

+(eval-after-load "i18n-message"
+  '(i18n-add-translation "English"
+     "finished with %d matches found\n"
+     '("finished with %d match found\n"
+       "finished with %d matches found\n")))
+
 (defun grep-filter ()
   "Handle match highlighting escape sequences inserted by the grep process.
 This function is called from `compilation-filter-hook'."
diff --git a/lisp/international/i18n-message.el b/lisp/international/i18n-message.el
new file mode 100644
index 0000000000..14755966e0
--- /dev/null
+++ b/lisp/international/i18n-message.el
@@ -0,0 +1,118 @@
+;;; i18n-message.el --- internationalization of messages -*- lexical-binding: t; -*-
+
+;; Copyright (C) 2019 Free Software Foundation, Inc.
+
+;; Author: Juri Linkov <juri@linkov.net>
+;; Maintainer: emacs-devel@gnu.org
+;; Keywords: i18n, multilingual
+
+;; This file is part of GNU Emacs.
+
+;; GNU Emacs is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GNU Emacs is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;;
+
+;;; Code:
+
+(defcustom i18n-fallbacks
+  '(("en" "English"))
+  "An alist mapping the current language to possible fallbacks.
+Each element should look like (\"LANG\" . FALLBACK-LIST), where
+FALLBACK-LIST is a list of languages to try to find a translation."
+  :type '(alist :key-type (string :tag "Current language")
+                :value-type (repeat :tag "A list of fallbacks" string))
+  :group 'i18n
+  :version "27.1")
+
+(defvar i18n-dictionaries (make-hash-table :test 'equal))
+
+(defun i18n-add-dictionary (lang)
+  (unless (gethash lang i18n-dictionaries)
+    (puthash lang (make-hash-table :test 'equal) i18n-dictionaries)))
+
+;;;###autoload
+(defun i18n-add-translation (lang from to)
+  (let ((dict (gethash lang i18n-dictionaries)))
+    (unless dict
+      (setq dict (i18n-add-dictionary lang)))
+    (puthash from to dict)))
+
+;;;###autoload
+(defun i18n-add-translations (lang translations)
+  (dolist (translation translations)
+    (i18n-add-translation lang (nth 0 translation) (nth 1 translation))))
+
+(defun i18n-get-plural (lang n)
+  ;; Source: (info "(gettext) Plural forms")
+  (pcase lang
+    ((or "Japanese" "Vietnamese" "Korean" "Thai")
+     0)
+    ((or "English" "German" "Dutch" "Swedish" "Danish" "Norwegian"
+         "Faroese" "Spanish" "Portuguese" "Italian" "Bulgarian" "Greek"
+         "Finnish" "Estonian" "Hebrew" "Bahasa Indonesian" "Esperanto"
+         "Hungarian" "Turkish")
+     (if (/= n 1) 1 0))
+    ((or "Brazilian Portuguese" "French")
+     (if (> n 1) 1 0))
+    ((or "Latvian")
+     (if (and (= (% n 10) 1) (/= (% n 100) 11)) 0 (if (/= n 0) 1 2)))
+    ((or "Gaeilge" "Irish")
+     (if (= n 1) 0 (if (= n 2) 1 2)))
+    ((or "Romanian")
+     (if (= n 1) 0 (if (or (= n 0) (and (> (% n 100) 0) (< (% n 100) 20))) 1 2)))
+    ((or "Lithuanian")
+     (if (and (= (% n 10) 1) (/= (% n 100) 11)) 0
+       (if (and (>= (% n 10) 2) (or (< (% n 100) 10) (>= (% n 100) 20))) 1 2)))
+    ((or "Russian" "Ukrainian" "Belarusian" "Serbian" "Croatian")
+     (if (and (= (% n 10) 1) (/= (% n 100) 11)) 0
+       (if (and (>= (% n 10) 2) (<= (% n 10) 4) (or (< (% n 100) 10) (>= (% n 100) 20))) 1 2)))
+    ((or "Czech" "Slovak")
+     (if (= n 1) 0 (if (and (>= n 2) (<= n 4)) 1 2)))
+    ((or "Polish")
+     (if (= n 1) 0
+       (if (and (>= (% n 10) 2) (<= (% n 10) 4) (or (< (% n 100) 10) (>= (% n 100) 20))) 1 2)))
+    ((or "Slovenian")
+     (if (= (% n 100) 1) 0 (if (= (% n 100) 2) 1 (if (or (= (% n 100) 3) (= (% n 100) 4)) 2 3))))
+    ((or "Arabic")
+     (if (= n 0) 0 (if (= n 1) 1 (if (= n 2) 2 (if (and (>= (% n 100) 3) (<= (% n 100) 10)) 3
+                                                 (if (>= (% n 100) 11) 4 5))))))))
+
+(defun i18n-get-translation (format-string &rest args)
+  (let* ((lang current-language-environment)
+         (fallbacks (cdr (assoc lang i18n-fallbacks)))
+         dict found)
+    (while (and (not found) lang)
+      (when (setq dict (gethash lang i18n-dictionaries))
+        (setq found
+              (pcase (gethash format-string dict)
+                ((and (pred functionp) f) (apply f format-string args))
+                ((and (pred stringp) s) s)
+                ((and (pred consp) l)
+                 (let ((n (i18n-get-plural lang (car args))))
+                   (when n (nth n l)))))))
+      (unless found
+        (setq lang (pop fallbacks))))
+    (or found format-string)))
+
+(defun i18n-message-translate (&rest args)
+  (apply 'i18n-get-translation args))
+
+(defvar message-translate-function)
+
+(setq message-translate-function 'i18n-message-translate)
+
+(provide 'i18n-message)
+;;; i18n-message.el ends here
diff --git a/src/editfns.c b/src/editfns.c
index bffb5db43e..f517679576 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -3050,6 +3050,14 @@ produced text.
 usage: (format STRING &rest OBJECTS) */)
   (ptrdiff_t nargs, Lisp_Object *args)
 {
+  if (!NILP (Vmessage_translate_function) && nargs > 0)
+    {
+      Lisp_Object format = apply1 (Vmessage_translate_function,
+                                   Flist (nargs, args));
+      if (STRINGP (format))
+        args[0] = format;
+    }
+
   return styled_format (nargs, args, false);
 }

@@ -3066,6 +3074,14 @@ and right quote replacement characters are specified by
 usage: (format-message STRING &rest OBJECTS) */)
   (ptrdiff_t nargs, Lisp_Object *args)
 {
+  if (!NILP (Vmessage_translate_function) && nargs > 0)
+    {
+      Lisp_Object format = apply1 (Vmessage_translate_function,
+                                   Flist (nargs, args));
+      if (STRINGP (format))
+        args[0] = format;
+    }
+
   return styled_format (nargs, args, true);
 }

@@ -4462,6 +4478,11 @@ of the buffer being accessed. */);
 functions if all the text being accessed has this property. */);
   Vbuffer_access_fontified_property = Qnil;

+  DEFVAR_LISP ("message-translate-function",
+               Vmessage_translate_function,
+               doc: /* Function that translates messages. */);
+  Vmessage_translate_function = Qnil;
+
   DEFVAR_LISP ("system-name", Vsystem_name,
 	       doc: /* The host name of the machine Emacs is running on. */);
   Vsystem_name = cached_system_name = Qnil;
* Re: Emacs i18n
2019-03-03 20:57 ` Emacs i18n Juri Linkov
@ 2019-03-04 1:46 ` Jean-Christophe Helary
2019-03-06 9:38 ` Elias Mårtenson
2019-03-21 20:33 ` Clément Pit-Claudel
0 siblings, 2 replies; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-04 1:46 UTC (permalink / raw)
To: Juri Linkov; +Cc: Eli Zaretskii, Richard Stallman, emacs-devel

> On Mar 4, 2019, at 5:57, Juri Linkov <juri@linkov.net> wrote:
>
> My intention was to fix the bug which manifests itself in
> grammatically incorrect sentences displayed by ‘message’ like
>
> Deleted 1 matching lines
> 1 matches found
> ...

The best way to do that (I fixed almost 100% of the package.el code
that way) is to not use such syntax but rather things like:

Number of matches found: %d

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune
* Re: Emacs i18n
2019-03-04 1:46 ` Jean-Christophe Helary
@ 2019-03-06 9:38 ` Elias Mårtenson
2019-03-06 11:23 ` Jean-Christophe Helary
2019-03-21 20:33 ` Clément Pit-Claudel
1 sibling, 1 reply; 151+ messages in thread
From: Elias Mårtenson @ 2019-03-06 9:38 UTC (permalink / raw)
To: Jean-Christophe Helary
Cc: Eli Zaretskii, emacs-devel, Richard Stallman, Juri Linkov

On Mon, 4 Mar 2019 at 09:48, Jean-Christophe Helary <brandelune@gmail.com> wrote:

>
> The best way to do that (I fixed almost 100% of the package.el code
> that way) is to not use such syntax but rather things like:
>
> Number of matches found: %d
>

That works for English most of the time (although I would argue that it
isn't great). But it may be harder in other languages.
* Re: Emacs i18n
2019-03-06 9:38 ` Elias Mårtenson
@ 2019-03-06 11:23 ` Jean-Christophe Helary
0 siblings, 0 replies; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-06 11:23 UTC (permalink / raw)
To: emacs-devel

> On Mar 6, 2019, at 18:38, Elias Mårtenson
>
> The best way to do that (I fixed almost 100% of the package.el code that way) is to not use such syntax but rather things like:
>
> Number of matches found: %d
>
> That works for English most of the time (although I would argue that it isn't great).

I'm not sure why that is not "great" here. But I know from what I saw in
package.el that what is even less great is to get lost in Lisp that
attempts to mimic natural language inflections.

> But it may be harder in other languages.

It is unlikely that this structure is harder in non-English languages
than the original. Removing the need to express the difference in number
actually removes an order of complexity.

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune
* Re: Emacs i18n
2019-03-04 1:46 ` Jean-Christophe Helary
2019-03-06 9:38 ` Elias Mårtenson
@ 2019-03-21 20:33 ` Clément Pit-Claudel
2019-03-21 20:50 ` Eli Zaretskii
` (2 more replies)
1 sibling, 3 replies; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-21 20:33 UTC (permalink / raw)
To: Jean-Christophe Helary, Juri Linkov
Cc: Eli Zaretskii, Richard Stallman, emacs-devel

On 2019-03-03 20:46, Jean-Christophe Helary wrote:
>> On Mar 4, 2019, at 5:57, Juri Linkov <juri@linkov.net> wrote:
>> My intention was to fix the bug which manifests itself in
>> grammatically incorrect sentences displayed by ‘message’ like
>>
>> Deleted 1 matching lines
>> 1 matches found
>> ...
>
> The best way to do that (I fixed almost 100% of the package.el code that way) is to not use such syntax but rather things like:
>
> Number of matches found: %d

I'm a bit late to the party, but I hope it's still OK to respond :)

This is a valid way to work around the issue, but I'm not sure how much
I like it (I just noticed the change after pulling the latest Emacs from
git).

The current package.el doesn't say 'Number of packages that are not
available: %d'; instead, it says 'Packages that are not available: %d'
(it used to say "%s packages are not available"). Other examples are
'Packages to hide: %d' (originally 'Hiding %s packages') and 'Packages
that can be upgraded: %d; type `%s' to mark for upgrading.' (originally
'%d package%s can be upgraded; type `%s' to mark %s for upgrading.').

I find this suboptimal for three reasons:

First, after 'packages that are not available', I expect to see a list
of packages, not a number.

Second, the new way the message is phrased puts the important bit in a
less obvious place (in the middle of the message, rather than at the
beginning: "Packages that can be upgraded: 5; type `U' to mark for
upgrading").

Third (but this is a bit more fuzzy), the way the message is now written
makes errors sound like normal events ('Packages that are not available:
3' reads like the response to the query 'how many packages are not
available?').

I understand that there's hope to support plurals and
internationalization in a more principled way soon, but is this
workaround (61f73703c74756e6963cc622f03bcc6938ab71b2) needed in the
meantime?

Clément.
* Re: Emacs i18n
2019-03-21 20:33 ` Clément Pit-Claudel
@ 2019-03-21 20:50 ` Eli Zaretskii
2019-03-21 21:03 ` Clément Pit-Claudel
2019-03-21 21:17 ` Jean-Christophe Helary
2019-03-21 21:59 ` Juri Linkov
2 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-21 20:50 UTC (permalink / raw)
To: Clément Pit-Claudel; +Cc: emacs-devel, brandelune, rms, juri

> Cc: Eli Zaretskii <eliz@gnu.org>, Richard Stallman <rms@gnu.org>,
> emacs-devel@gnu.org
> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Thu, 21 Mar 2019 16:33:21 -0400
>
> I understand that there's hope to support plurals and internationalization in a more principled way soon, but is this workaround (61f73703c74756e6963cc622f03bcc6938ab71b2) needed in the meantime?

Whether we like it or not, it's one of the standard methods of solving
these situations. It might sound somewhat more awkward in some
languages than the original wording, but it has other more important
advantages.
* Re: Emacs i18n
2019-03-21 20:50 ` Eli Zaretskii
@ 2019-03-21 21:03 ` Clément Pit-Claudel
2019-03-21 21:21 ` Jean-Christophe Helary
2019-03-22 8:22 ` Eli Zaretskii
0 siblings, 2 replies; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-21 21:03 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, brandelune, rms, juri

On 2019-03-21 16:50, Eli Zaretskii wrote:
>> Cc: Eli Zaretskii <eliz@gnu.org>, Richard Stallman <rms@gnu.org>,
>> emacs-devel@gnu.org
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Thu, 21 Mar 2019 16:33:21 -0400
>>
>> I understand that there's hope to support plurals and internationalization in a more principled way soon, but is this workaround (61f73703c74756e6963cc622f03bcc6938ab71b2) needed in the meantime?
>
> Whether we like it or not, it's one of the standard methods of solving
> these situations. It might sound somewhat more awkward in some
> languages than the original wording, but it has other more important
> advantages.

I don't understand: what does this change buy us currently, except the
awkward wording? Arguably the patch for the change was a simplification,
but the original author had actually written the code to get the English
plurals right in most cases.
* Re: Emacs i18n
2019-03-21 21:03 ` Clément Pit-Claudel
@ 2019-03-21 21:21 ` Jean-Christophe Helary
2019-03-21 21:34 ` Clément Pit-Claudel
2019-03-22 8:22 ` Eli Zaretskii
1 sibling, 1 reply; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-21 21:21 UTC (permalink / raw)
To: emacs-devel

> On Mar 22, 2019, at 6:03, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>
> but the original author had actually written the code to get the English plurals right in most cases.

Yes, but the issue is not to have "most cases right" but rather to have
readable strings in the code.

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune
* Re: Emacs i18n
2019-03-21 21:21 ` Jean-Christophe Helary
@ 2019-03-21 21:34 ` Clément Pit-Claudel
2019-03-21 21:56 ` Jean-Christophe Helary
0 siblings, 1 reply; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-21 21:34 UTC (permalink / raw)
To: emacs-devel

On 2019-03-21 17:21, Jean-Christophe Helary wrote:
>> On Mar 22, 2019, at 6:03, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>>
>> but the original author had actually written the code to get the English plurals right in most cases.
>
> Yes, but the issue is not to have "most cases right" but rather to have readable strings in the code.

But is that worth the loss in readability in the UI?
* Re: Emacs i18n
2019-03-21 21:34 ` Clément Pit-Claudel
@ 2019-03-21 21:56 ` Jean-Christophe Helary
2019-03-21 22:05 ` Clément Pit-Claudel
0 siblings, 1 reply; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-21 21:56 UTC (permalink / raw)
To: emacs-devel

> On Mar 22, 2019, at 6:34, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>
> On 2019-03-21 17:21, Jean-Christophe Helary wrote:
>>> On Mar 22, 2019, at 6:03, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>>>
>>> but the original author had actually written the code to get the English plurals right in most cases.
>>
>> Yes, but the issue is not to have "most cases right" but rather to have readable strings in the code.
>
> But is that worth the loss in readability in the UI?

That's a subjective issue. I think the strings are more readable now and
there are potentially fewer grammatical mistakes (I am not talking about
the "Number of" part you mentioned). Before, it was easy to not notice a
bug in the code. I think that is a net gain.

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune
* Re: Emacs i18n
2019-03-21 21:56 ` Jean-Christophe Helary
@ 2019-03-21 22:05 ` Clément Pit-Claudel
2019-03-21 23:46 ` Jean-Christophe Helary
0 siblings, 1 reply; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-21 22:05 UTC (permalink / raw)
To: Jean-Christophe Helary, emacs-devel

On 2019-03-21 17:56, Jean-Christophe Helary wrote:
>
>
>> On Mar 22, 2019, at 6:34, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>>
>> On 2019-03-21 17:21, Jean-Christophe Helary wrote:
>>>> On Mar 22, 2019, at 6:03, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>>>>
>>>> but the original author had actually written the code to get the English plurals right in most cases.
>>>
>>> Yes, but the issue is not to have "most cases right" but rather to have readable strings in the code.
>>
>> But is that worth the loss in readability in the UI?
>
> That's a subjective issue. I think the strings are more readable now and there are potentially fewer grammatical mistakes (I am not talking about the "Number of" part you mentioned). Before, it was easy to not notice a bug in the code. I think that is a net gain.

Understood, thanks. I would have preferred sticking with the previous
phrasing, but a single opinion is not much to act upon.
* Re: Emacs i18n
2019-03-21 22:05 ` Clément Pit-Claudel
@ 2019-03-21 23:46 ` Jean-Christophe Helary
0 siblings, 0 replies; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-21 23:46 UTC (permalink / raw)
To: emacs-devel

> On Mar 22, 2019, at 7:05, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>
>> That's a subjective issue. I think the strings are more readable now and there are potentially fewer grammatical mistakes (I am not talking about the "Number of" part you mentioned). Before, it was easy to not notice a bug in the code. I think that is a net gain.
>
> Understood, thanks. I would have preferred sticking with the previous phrasing, but a single opinion is not much to act upon.

:) No, it's quite the opposite. Everything starts from a single opinion.

I would also prefer to stick to more natural sounding phrasing, but the
tools available at the moment don't allow for that.

It is usually accepted in code internationalization that concatenation
and other similar processes should not be used to generate natural
language strings. So there are 2 ways to deal with that: either you
simplify the strings to the point where there are as few variations as
possible (my choice for package.el) or you use string redundancy to
cover all the possible variations with processes that support that
(which we don't have at the moment).

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune
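[Editorial sketch] The two options named in the message above, and the concatenation they are meant to replace, can be put side by side. The message texts are shortened from the package.el strings quoted earlier in the thread. The three function names are hypothetical and exist only for this comparison.

```elisp
;; Sketch only: hypothetical functions contrasting three ways of
;; building the same user-visible message.

(defun my-upgrade-message-concat (n)
  "Splice an English plural suffix into the sentence.
A translator only ever sees the fragment \"%d package%s can be upgraded\"."
  (format "%d package%s can be upgraded" n (if (= n 1) "" "s")))

(defun my-upgrade-message-neutral (n)
  "Use one string whose wording does not depend on the number N."
  (format "Packages that can be upgraded: %d" n))

(defun my-upgrade-message-redundant (n)
  "Keep one complete sentence per grammatical number."
  (format (if (= n 1)
              "%d package can be upgraded"
            "%d packages can be upgraded")
          n))
```

The first form hard-codes English morphology in Lisp. The second gives translators a single complete string at the cost of less natural wording. The third keeps natural wording but needs a selection rule per language, which is the kind of support gettext plural forms provide and which Emacs did not have at the time of this discussion.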
* Re: Emacs i18n
2019-03-21 21:03 ` Clément Pit-Claudel
2019-03-21 21:21 ` Jean-Christophe Helary
@ 2019-03-22 8:22 ` Eli Zaretskii
2019-03-22 16:10 ` Clément Pit-Claudel
1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-22 8:22 UTC (permalink / raw)
To: Clément Pit-Claudel; +Cc: emacs-devel, brandelune, rms, juri

> Cc: brandelune@gmail.com, juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Thu, 21 Mar 2019 17:03:31 -0400
>
> > Whether we like it or not, it's one of the standard methods of solving
> > these situations. It might sound somewhat more awkward in some
> > languages than the original wording, but it has other more important
> > advantages.
>
> I don't understand: what does this change buy us currently, except the awkward wording?

It brings us a step closer to the i18n goal. A very small step,
admittedly, but a step in the right direction nonetheless.

* Re: Emacs i18n
2019-03-22 8:22 ` Eli Zaretskii
@ 2019-03-22 16:10 ` Clément Pit-Claudel
2019-03-22 16:35 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-22 16:10 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, brandelune, rms, juri

On 2019-03-22 04:22, Eli Zaretskii wrote:
>> Cc: brandelune@gmail.com, juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Thu, 21 Mar 2019 17:03:31 -0400
>>
>>> Whether we like it or not, it's one of the standard methods of solving
>>> these situations. It might sound somewhat more awkward in some
>>> languages than the original wording, but it has other more important
>>> advantages.
>>
>> I don't understand: what does this change buy us currently, except the awkward wording?
>
> It brings us a step closer to the i18n goal. A very small step,
> admittedly, but a step in the right direction nonetheless.

Thanks, that's what puzzles me. IIUC, we will revert to the previous strings once we have proper translation support in place, right?

* Re: Emacs i18n
2019-03-22 16:10 ` Clément Pit-Claudel
@ 2019-03-22 16:35 ` Eli Zaretskii
2019-03-22 17:16 ` Clément Pit-Claudel
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-22 16:35 UTC (permalink / raw)
To: Clément Pit-Claudel; +Cc: emacs-devel, brandelune, rms, juri

> Cc: brandelune@gmail.com, juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Fri, 22 Mar 2019 12:10:42 -0400
>
> > It brings us a step closer to the i18n goal. A very small step,
> > admittedly, but a step in the right direction nonetheless.
>
> Thanks, that's what puzzles me. IIUC, we will revert to the previous strings once we have proper translation support in place, right?

It isn't clear to me yet. At least at the time this change was made,
we didn't expect to revert, we thought this form will remain when it
can be translated.

* Re: Emacs i18n
2019-03-22 16:35 ` Eli Zaretskii
@ 2019-03-22 17:16 ` Clément Pit-Claudel
2019-03-22 17:35 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-22 17:16 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, brandelune, rms, juri

On 2019-03-22 12:35, Eli Zaretskii wrote:
>> Cc: brandelune@gmail.com, juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Fri, 22 Mar 2019 12:10:42 -0400
>>
>>> It brings us a step closer to the i18n goal. A very small step,
>>> admittedly, but a step in the right direction nonetheless.
>>
>> Thanks, that's what puzzles me. IIUC, we will revert to the previous strings once we have proper translation support in place, right?
>
> It isn't clear to me yet. At least at the time this change was made,
> we didn't expect to revert, we thought this form will remain when it
> can be translated.

Oh! Then I misunderstood. I thought the idea was that once we have a library that can handle this well, we'd write something like (ngettext "One package installed" "%d packages installed" n), with ngettext picking between both in English and picking the appropriate string in other languages.

Clément.

* Re: Emacs i18n
2019-03-22 17:16 ` Clément Pit-Claudel
@ 2019-03-22 17:35 ` Eli Zaretskii
2019-03-22 23:17 ` Clément Pit-Claudel
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-22 17:35 UTC (permalink / raw)
To: Clément Pit-Claudel; +Cc: emacs-devel, brandelune, rms, juri

> Cc: brandelune@gmail.com, juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Fri, 22 Mar 2019 13:16:17 -0400
>
> > It isn't clear to me yet. At least at the time this change was made,
> > we didn't expect to revert, we thought this form will remain when it
> > can be translated.
>
> Oh! Then I misunderstood. I thought the idea was that once we have a library that can handle this well, we'd write something like (ngettext "One package installed" "%d packages installed" n), with ngettext picking between both in English and picking the appropriate string in other languages.

Maybe. ngettext wasn't on the table when this change was made, and
even now I'm not yet sure what the end result will look like.

* Re: Emacs i18n
2019-03-22 17:35 ` Eli Zaretskii
@ 2019-03-22 23:17 ` Clément Pit-Claudel
0 siblings, 0 replies; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-22 23:17 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, brandelune, rms, juri

On 2019-03-22 13:35, Eli Zaretskii wrote:
>> Cc: brandelune@gmail.com, juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
>> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
>> Date: Fri, 22 Mar 2019 13:16:17 -0400
>>
>>> It isn't clear to me yet. At least at the time this change was made,
>>> we didn't expect to revert, we thought this form will remain when it
>>> can be translated.
>>
>> Oh! Then I misunderstood. I thought the idea was that once we have a library that can handle this well, we'd write something like (ngettext "One package installed" "%d packages installed" n), with ngettext picking between both in English and picking the appropriate string in other languages.
>
> Maybe. ngettext wasn't on the table when this change was made, and
> even now I'm not yet sure what the end result will look like.

Got it. Thanks for taking the time to explain!

* Re: Emacs i18n
2019-03-21 20:33 ` Clément Pit-Claudel
2019-03-21 20:50 ` Eli Zaretskii
@ 2019-03-21 21:17 ` Jean-Christophe Helary
2019-03-21 21:59 ` Juri Linkov
2 siblings, 0 replies; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-21 21:17 UTC (permalink / raw)
To: emacs-devel

Thank you Clement for the remarks.

I wrote all at first because there were issues with the original code
that did not take into account some singular cases. When I checked the
code, I found a terrible amount of strings mixed with code and so I
went a bit further and "fixed" that too.

I don't remember forgetting about that "Number" issue you mention
though. Sorry for that.

Jean-Christophe

> On Mar 22, 2019, at 5:33, Clément Pit-Claudel <cpitclaudel@gmail.com> wrote:
>
> On 2019-03-03 20:46, Jean-Christophe Helary wrote:
>>> On Mar 4, 2019, at 5:57, Juri Linkov <juri@linkov.net> wrote:
>>> My intention was to fix the bug which manifests itself in
>>> grammatically incorrect sentences displayed by ‘message’ like
>>>
>>> Deleted 1 matching lines
>>> 1 matches found
>>> ...
>>
>> The best way to do that (I fixed almost 100% of the package.el code with that) is to not use such syntax but rather things like:
>>
>> Number of matches found: %d
>
> I'm a bit late to the party, but I hope it's still OK to respond :) This is a valid way to work around the issue, but I'm not sure how much I like it (I just noticed the change after pulling the latest Emacs from git).
>
> The current package.el doesn't say 'Number of packages that are not available: %d'; instead, it says 'Packages that are not available: %d' (it used to say "%s packages are not available"). Other examples are 'Packages to hide: %d' (originally 'Hiding %s packages') and 'Packages that can be upgraded: %d; type `%s' to mark for upgrading.' (originally '%d package%s can be upgraded; type `%s' to mark %s for upgrading.').
>
> I find this suboptimal for three reasons: First, after 'packages that are not available', I expect to see a list of packages, not a number. Second, the new way the message is phrased puts the important bit in a less obvious place (in the middle of the message, rather than at the beginning: "Packages that can be upgraded: 5; type `U' to mark for upgrading"). Third (but this is a bit more fuzzy), the way the message is now written makes errors sound like normal events ('Packages that are not available: 3' reads like the response to the query 'how many packages are not available?').
>
> I understand that there's hope to support plurals and internationalization in a more principled way soon, but is this workaround (61f73703c74756e6963cc622f03bcc6938ab71b2) needed in the meantime?
>
> Clément.
>

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

* Re: Emacs i18n
2019-03-21 20:33 ` Clément Pit-Claudel
2019-03-21 20:50 ` Eli Zaretskii
2019-03-21 21:17 ` Jean-Christophe Helary
@ 2019-03-21 21:59 ` Juri Linkov
2019-03-22 8:22 ` Eli Zaretskii
2 siblings, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-21 21:59 UTC (permalink / raw)
To: Clément Pit-Claudel
Cc: Eli Zaretskii, emacs-devel, Jean-Christophe Helary, Richard Stallman

> The current package.el doesn't say 'Number of packages that are not
> available: %d'; instead, it says 'Packages that are not available: %d'
> (it used to say "%s packages are not available").

Neither sounds natural; both are too robotic.

Let's use ngettext plurals from now on.

* Re: Emacs i18n
2019-03-21 21:59 ` Juri Linkov
@ 2019-03-22 8:22 ` Eli Zaretskii
2019-03-23 21:50 ` Juri Linkov
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-22 8:22 UTC (permalink / raw)
To: Juri Linkov; +Cc: cpitclaudel, emacs-devel, brandelune, rms

> From: Juri Linkov <juri@linkov.net>
> Cc: Jean-Christophe Helary <brandelune@gmail.com>, Eli Zaretskii <eliz@gnu.org>, Richard Stallman <rms@gnu.org>, emacs-devel@gnu.org
> Date: Thu, 21 Mar 2019 23:59:23 +0200
>
> Let's use ngettext plurals from now on.

I don't think I understand the practical implications of that. Could
you please elaborate?

* Re: Emacs i18n
2019-03-22 8:22 ` Eli Zaretskii
@ 2019-03-23 21:50 ` Juri Linkov
2019-03-24 3:36 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-23 21:50 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: cpitclaudel, emacs-devel, brandelune, rms

>> Let's use ngettext plurals from now on.
>
> I don't think I understand the practical implications of that. Could
> you please elaborate?

This means replacing

  (message "Packages to install: %d" n)

with

  (message (ngettext "One package will be installed"
                     "%d packages will be installed" n) n)

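The replacement pattern described in the message above can be tried out
with a small Emacs Lisp sketch. This is only an illustration under
stated assumptions: `sketch-ngettext' and `sketch-install-message' are
hypothetical names introduced here, and the stand-in does no catalog
lookup at all; it only mimics the untranslated fallback (the singular
string when N is 1, the plural string otherwise).

```elisp
;; Hypothetical stand-in for the proposed `ngettext'; it performs no
;; translation and exists only to make the calling pattern runnable.
(defun sketch-ngettext (msgid msgid-plural n)
  "Return MSGID if N is 1, otherwise MSGID-PLURAL."
  (if (eql n 1) msgid msgid-plural))

(defun sketch-install-message (n)
  "Return a string announcing that N packages will be installed."
  ;; N is used twice: first to choose the format string, then to fill
  ;; in %d.  `format' ignores the extra argument when the singular
  ;; string contains no %d.
  (format (sketch-ngettext "One package will be installed"
                           "%d packages will be installed"
                           n)
          n))
```

Note that the choice between the two strings is made by the lookup
function, not by the caller, which is what lets a real catalog supply
more than two forms for languages that need them.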
* Re: Emacs i18n
2019-03-23 21:50 ` Juri Linkov
@ 2019-03-24 3:36 ` Eli Zaretskii
2019-03-24 21:55 ` Juri Linkov
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-24 3:36 UTC (permalink / raw)
To: Juri Linkov; +Cc: cpitclaudel, emacs-devel, brandelune, rms

> From: Juri Linkov <juri@linkov.net>
> Cc: cpitclaudel@gmail.com, brandelune@gmail.com, rms@gnu.org, emacs-devel@gnu.org
> Date: Sat, 23 Mar 2019 23:50:53 +0200
>
> >> Let's use ngettext plurals from now on.
> >
> > I don't think I understand the practical implications of that. Could
> > you please elaborate?
>
> This means replacing
>
>   (message "Packages to install: %d" n)
>
> with
>
>   (message (ngettext "One package will be installed"
>                      "%d packages will be installed" n) n)

But since we don't yet have ngettext, we cannot yet use this
paradigm. I thought by "from now on" you literally meant from now;
did I misunderstand?

* Re: Emacs i18n
2019-03-24 3:36 ` Eli Zaretskii
@ 2019-03-24 21:55 ` Juri Linkov
2019-03-24 23:31 ` Jean-Christophe Helary
` (2 more replies)
0 siblings, 3 replies; 151+ messages in thread
From: Juri Linkov @ 2019-03-24 21:55 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: cpitclaudel, emacs-devel, brandelune, rms

>> >> Let's use ngettext plurals from now on.
>> >
>> > I don't think I understand the practical implications of that. Could
>> > you please elaborate?
>>
>> This means replacing
>>
>>   (message "Packages to install: %d" n)
>>
>> with
>>
>>   (message (ngettext "One package will be installed"
>>                      "%d packages will be installed" n) n)
>
> But since we don't yet have ngettext, we cannot yet use this
> paradigm. I thought by "from now on" you literally meant from now;
> did I misunderstand?

Yes, literally. After the patch from
http://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00586.html
is pushed to master, ngettext is available to use for pluralization.

* Re: Emacs i18n
2019-03-24 21:55 ` Juri Linkov
@ 2019-03-24 23:31 ` Jean-Christophe Helary
2019-03-25 21:32 ` Juri Linkov
2019-03-25 3:35 ` Eli Zaretskii
2019-03-25 10:52 ` Mattias Engdegård
2 siblings, 1 reply; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-24 23:31 UTC (permalink / raw)
To: Emacs developers

> On Mar 25, 2019, at 6:55, Juri Linkov <juri@linkov.net> wrote:
>
>>>>> Let's use ngettext plurals from now on.
>>>>
>>>> I don't think I understand the practical implications of that. Could
>>>> you please elaborate?
>>>
>>> This means replacing
>>>
>>>   (message "Packages to install: %d" n)
>>>
>>> with
>>>
>>>   (message (ngettext "One package will be installed"
>>>                      "%d packages will be installed" n) n)
>>
>> But since we don't yet have ngettext, we cannot yet use this
>> paradigm. I thought by "from now on" you literally meant from now;
>> did I misunderstand?
>
> Yes, literally. After the patch from
> http://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00586.html
> is pushed to master, ngettext is available to use for pluralization.

Why put the patch in subr.el and not in its own i18n related new package?

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

* Re: Emacs i18n
2019-03-24 23:31 ` Jean-Christophe Helary
@ 2019-03-25 21:32 ` Juri Linkov
2019-03-25 22:31 ` Paul Eggert
0 siblings, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-25 21:32 UTC (permalink / raw)
To: Jean-Christophe Helary; +Cc: Emacs developers

> Why put the patch in subr.el and not in its own i18n related new package?

I don't know where to put i18n related code, so since ngettext should
have C calls anyway, I moved it to editfns.c near the function ‘message’
where it still just returns the correct plurals without doing any translation.

* Re: Emacs i18n
2019-03-25 21:32 ` Juri Linkov
@ 2019-03-25 22:31 ` Paul Eggert
2019-03-26 16:11 ` Eli Zaretskii
2019-03-26 23:16 ` Juri Linkov
0 siblings, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-25 22:31 UTC (permalink / raw)
To: Juri Linkov; +Cc: Jean-Christophe Helary, Emacs developers

[-- Attachment #1: Type: text/plain, Size: 1795 bytes --]

On 3/25/19 2:32 PM, Juri Linkov wrote:
> I don't know where to put i18n related code, so since ngettext should
> have C calls anyway, I moved it to editfns.c near the function ‘message’
> where it still just returns the correct plurals without doing any translation.

That stub had some problems:

1. It lacked documentation in the Elisp manual. Important changes like
this should be documented -- to some extent the documentation is even
more important than the code. Can you write something?

2. While you're thinking about (1) here are some other questions. How
will ngettext determine the message catalog? Is the catalog visible to
users as a global variable, or as a hidden part of the global state, or
is it something explicit? How will catalogs from multiple packages be
used? How would a multi-lingual application work in Emacs if the message
catalog is part of global state? This seems to be a crucial issue, I'd
say. For example, should Emacs export dcngettext to Lisp code, instead
of just plain ngettext? (Emacs could then define ngettext in terms of
dcngettext.)

3. User C code is not supposed to inspect the _LIBC macro; that's for
glibc internal use. In Emacs _LIBC should be used only with code shared
with glibc, and we should assume _LIBC is never defined when files are
compiled for Emacs.

4. The stub doesn't work with bignums.

5. When calling the C-level ngettext, strings are not properly recoded.

I fixed (3) and (4), and temporarily worked around (5), by installing
the attached patch. To do a better job with (2) and (5) please see the
gettext manual's instructions for package maintainers, here:

https://www.gnu.org/software/gettext/manual/gettext.html#Maintainers

To my mind (1) and (2) are the most-pressing problems.

[-- Attachment #2: 0001-Port-recent-ngettext-stub-to-non-glibc.patch --]
[-- Type: text/x-patch; name="0001-Port-recent-ngettext-stub-to-non-glibc.patch", Size: 2850 bytes --]

From a361c54b8339ad79f65e924c4a1f7bbcdb1859e2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 25 Mar 2019 15:20:20 -0700
Subject: [PATCH] Port recent ngettext stub to non-glibc
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* src/editfns.c: Don’t try to call glibc ngettext;
we’re not ready for that yet.
(Fngettext): Do not restrict integer arguments to fixnums.
Improve doc string a bit.
---
 src/editfns.c | 34 +++++++++-------------------------
 1 file changed, 9 insertions(+), 25 deletions(-)

diff --git a/src/editfns.c b/src/editfns.c
index ab48cdb6fd..bfffadc733 100644
--- a/src/editfns.c
+++ b/src/editfns.c
@@ -53,12 +53,6 @@ along with GNU Emacs. If not, see <https://www.gnu.org/licenses/>. */
 #include "window.h"
 #include "blockinput.h"
 
-#ifdef _LIBC
-# include <libintl.h>
-#else
-# include "gettext.h"
-#endif
-
 static void update_buffer_properties (ptrdiff_t, ptrdiff_t);
 static Lisp_Object styled_format (ptrdiff_t, Lisp_Object *, bool);
 
@@ -2845,30 +2839,20 @@ usage: (save-restriction &rest BODY) */)
 /* i18n (internationalization). */
 
 DEFUN ("ngettext", Fngettext, Sngettext, 3, 3, 0,
-       doc: /* Return the plural form of the translation of the string.
-This function is similar to the `gettext' function as it finds the message
-catalogs in the same way. But it takes two extra arguments. The MSGID
-parameter must contain the singular form of the string to be converted.
-It is also used as the key for the search in the catalog.
-The MSGID_PLURAL parameter is the plural form. The parameter N is used
-to determine the plural form. If no message catalog is found MSGID is
-returned if N is equal to 1, otherwise MSGID_PLURAL. */)
+       doc: /* Return the translation of MSGID (plural MSGID_PLURAL) depending on N.
+MSGID is the singular form of the string to be converted;
+use it as the key for the search in the translation catalog.
+MSGID_PLURAL is the plural form. Use N to select the proper translation.
+If no message catalog is found, MSGID is returned if N is equal to 1,
+otherwise MSGID_PLURAL. */)
   (Lisp_Object msgid, Lisp_Object msgid_plural, Lisp_Object n)
 {
   CHECK_STRING (msgid);
   CHECK_STRING (msgid_plural);
-  CHECK_FIXNUM (n);
+  CHECK_INTEGER (n);
 
-#ifdef _LIBGETTEXT_H
-  return build_string (ngettext (SSDATA (msgid),
-                                 SSDATA (msgid_plural),
-                                 XFIXNUM (n)));
-#else
-  if (XFIXNUM (n) == 1)
-    return msgid;
-  else
-    return msgid_plural;
-#endif
+  /* Placeholder implementation until we get our act together. */
+  return EQ (n, make_fixnum (1)) ? msgid : msgid_plural;
 }
 \f
 DEFUN ("message", Fmessage, Smessage, 1, MANY, 0,
-- 
2.20.1

* Re: Emacs i18n
2019-03-25 22:31 ` Paul Eggert
@ 2019-03-26 16:11 ` Eli Zaretskii
2019-03-26 16:22 ` Stefan Monnier
` (2 more replies)
2019-03-26 23:16 ` Juri Linkov
1 sibling, 3 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-26 16:11 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel, brandelune, juri

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Mon, 25 Mar 2019 15:31:14 -0700
> Cc: Jean-Christophe Helary <brandelune@gmail.com>,
> Emacs developers <emacs-devel@gnu.org>
>
> 2. While you're thinking about (1) here are some other questions. How
> will ngettext determine the message catalog? Is the catalog visible to
> users as a global variable, or as a hidden part of the global state, or
> is it something explicit? How will catalogs from multiple packages be
> used? How would a multi-lingual application work in Emacs if the message
> catalog is part of global state? This seems to be a crucial issue, I'd
> say. For example, should Emacs export dcngettext to Lisp code, instead
> of just plain ngettext? (Emacs could then define ngettext in terms of
> dcngettext.)

Do we have any reasons not to follow the CLISP example of factoring
these issues?

> 5. When calling the C-level ngettext, strings are not properly recoded.

Did you mean decoding the translated string that ngettext returns? If
so, we will need some way of getting at the encoding of the strings in
the catalog, I think. Or will we mandate that Emacs catalogs need
always to be in UTF-8 encoding?

* Re: Emacs i18n
2019-03-26 16:11 ` Eli Zaretskii
@ 2019-03-26 16:22 ` Stefan Monnier
2019-03-26 16:55 ` Eli Zaretskii
2019-03-26 22:35 ` Paul Eggert
2019-03-27 2:34 ` Jean-Christophe Helary
2 siblings, 1 reply; 151+ messages in thread
From: Stefan Monnier @ 2019-03-26 16:22 UTC (permalink / raw)
To: emacs-devel

> Did you mean decoding the translated string that ngettext returns? If
> so, we will need some way of getting at the encoding of the strings in
> the catalog, I think. Or will we mandate that Emacs catalogs need
> always to be in UTF-8 encoding?

Mandating `utf-8-emacs` would make things simpler.

Stefan

* Re: Emacs i18n
2019-03-26 16:22 ` Stefan Monnier
@ 2019-03-26 16:55 ` Eli Zaretskii
0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-26 16:55 UTC (permalink / raw)
To: Stefan Monnier; +Cc: emacs-devel

> From: Stefan Monnier <monnier@iro.umontreal.ca>
> Date: Tue, 26 Mar 2019 12:22:52 -0400
>
> > Did you mean decoding the translated string that ngettext returns? If
> > so, we will need some way of getting at the encoding of the strings in
> > the catalog, I think. Or will we mandate that Emacs catalogs need
> > always to be in UTF-8 encoding?
>
> Mandating `utf-8-emacs` would make things simpler.

If the translators won't mind, sure. Or maybe we will recode into
UTF-8 before importing those catalogs that aren't.

* Re: Emacs i18n
2019-03-26 16:11 ` Eli Zaretskii
2019-03-26 16:22 ` Stefan Monnier
@ 2019-03-26 22:35 ` Paul Eggert
2019-03-27 3:43 ` Eli Zaretskii
2019-03-27 2:34 ` Jean-Christophe Helary
2 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-26 22:35 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, brandelune, juri

On 3/26/19 9:11 AM, Eli Zaretskii wrote:
>> 2. While you're thinking about (1) here are some other questions. How
>> will ngettext determine the message catalog? Is the catalog visible to
>> users as a global variable, or as a hidden part of the global state, or
>> is it something explicit? How will catalogs from multiple packages be
>> used? How would a multi-lingual application work in Emacs if the message
>> catalog is part of global state? This seems to be a crucial issue, I'd
>> say. For example, should Emacs export dcngettext to Lisp code, instead
>> of just plain ngettext? (Emacs could then define ngettext in terms of
>> dcngettext.)
> Do we have any reasons not to follow the CLISP example of factoring
> these issues?

That's the first I've heard that CLISP does gettext. I looked into it,
and it's a reasonably simple binding, which means that the language is
part of the global state (Emacs would not easily be multilingual) and
that each package can have its own catalog and can specify that catalog
as a trailing argument to gettext (presumably the default catalog would
be for Emacs core). This should be good enough, though it will be a bit
of a hassle for non-core code to keep track of the catalog.

> will we mandate that Emacs catalogs need
> always to be in UTF-8 encoding?

Yes, that makes sense.

* Re: Emacs i18n
2019-03-26 22:35 ` Paul Eggert
@ 2019-03-27 3:43 ` Eli Zaretskii
2019-03-28 14:56 ` Clément Pit-Claudel
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-27 3:43 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel, brandelune, juri

> Cc: juri@linkov.net, brandelune@gmail.com, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Tue, 26 Mar 2019 15:35:22 -0700
>
> > Do we have any reasons not to follow the CLISP example of factoring
> > these issues?
>
> That's the first I've heard that CLISP does gettext.

I learned that from a post by Bruno here up-thread.

> I looked into it, and it's a reasonably simple binding, which means
> that the language is part of the global state (Emacs would not
> easily be multilingual)

We could offer the language as another optional argument. I'm not
sure we need to allow control of the CATEGORY (for choosing the LC_*
category), so we could replace that with the language. Or we could
keep CATEGORY for compatibility and just add LANGUAGE.

> and that each package can have its own catalog and can specify that
> catalog as a trailing argument to gettext (presumably the default
> catalog would be for Emacs core). This should be good enough, though
> it will be a bit of a hassle for non-core code to keep track of the
> catalog.

If we want some automatic way of changing the domain when a function
from a package is called, we need to develop the infrastructure for
that. But that could wait for later, I think.

* Re: Emacs i18n
2019-03-27 3:43 ` Eli Zaretskii
@ 2019-03-28 14:56 ` Clément Pit-Claudel
2019-03-28 15:52 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Clément Pit-Claudel @ 2019-03-28 14:56 UTC (permalink / raw)
To: emacs-devel

On 2019-03-26 23:43, Eli Zaretskii wrote:
> If we want some automatic way of changing the domain when a function
> from a package is called, we need to develop the infrastructure for
> that. But that could wait for later, I think.

I expect I'd define a foo-ngettext macro in each `foo' package expanding to ngettext with the appropriate group argument. If there are multiple functions (gettext, ngettext, etc), maybe a single macro defining all foo-* variants at once would be nice.

Clément.

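The "single macro defining all foo-* variants" idea from the message
above might look roughly like the sketch below. Everything in it is an
assumption made for illustration: the domain-aware lookups
`sketch-dgettext' and `sketch-dngettext' are hypothetical stand-ins
that translate nothing (they only record the domain they were given),
and the macro `sketch-define-i18n-wrappers' is not an existing or
proposed Emacs API. The wrappers are generated as functions rather
than macros, which is one possible reading of the suggestion.

```elisp
;; All names below are hypothetical; none is an existing Emacs API.

(defvar sketch-last-domain nil
  "Domain passed to the most recent stand-in lookup (illustration only).")

(defun sketch-dgettext (domain msgid)
  "Stand-in for a domain-aware gettext: record DOMAIN, return MSGID."
  (setq sketch-last-domain domain)
  msgid)

(defun sketch-dngettext (domain msgid msgid-plural n)
  "Stand-in for a domain-aware ngettext.
Record DOMAIN, then return MSGID if N is 1, else MSGID-PLURAL."
  (setq sketch-last-domain domain)
  (if (eql n 1) msgid msgid-plural))

(defmacro sketch-define-i18n-wrappers (prefix domain)
  "Define PREFIX-gettext and PREFIX-ngettext with DOMAIN fixed.
PREFIX is an unquoted symbol and DOMAIN is a string."
  (let ((gettext-name (intern (format "%s-gettext" prefix)))
        (ngettext-name (intern (format "%s-ngettext" prefix))))
    `(progn
       (defun ,gettext-name (msgid)
         ,(format "Look up MSGID in the %S domain." domain)
         (sketch-dgettext ,domain msgid))
       (defun ,ngettext-name (msgid msgid-plural n)
         ,(format "Look up a plural form in the %S domain." domain)
         (sketch-dngettext ,domain msgid msgid-plural n)))))

;; One line per package then defines every wrapper for that package.
(sketch-define-i18n-wrappers foo "foo")
```

With this in place, code in the hypothetical `foo' package would call
(foo-ngettext "%d item" "%d items" n) without naming its domain at
each call site.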
* Re: Emacs i18n
2019-03-28 14:56 ` Clément Pit-Claudel
@ 2019-03-28 15:52 ` Eli Zaretskii
0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-28 15:52 UTC (permalink / raw)
To: Clément Pit-Claudel; +Cc: emacs-devel

> From: Clément Pit-Claudel <cpitclaudel@gmail.com>
> Date: Thu, 28 Mar 2019 10:56:44 -0400
>
> On 2019-03-26 23:43, Eli Zaretskii wrote:
> > If we want some automatic way of changing the domain when a function
> > from a package is called, we need to develop the infrastructure for
> > that. But that could wait for later, I think.
>
> I expect I'd define a foo-ngettext macro in each `foo' package expanding to ngettext with the appropriate group argument. If there are multiple functions (gettext, ngettext, etc), maybe a single macro defining all foo-* variants at once would be nice.

I really hope we could come up with something more elegant. And
besides, your suggestion doesn't handle calls from Lisp packages to
core APIs, including primitives and modules.

* Re: Emacs i18n
2019-03-26 16:11 ` Eli Zaretskii
2019-03-26 16:22 ` Stefan Monnier
2019-03-26 22:35 ` Paul Eggert
@ 2019-03-27 2:34 ` Jean-Christophe Helary
2 siblings, 0 replies; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-27 2:34 UTC (permalink / raw)
To: Emacs developers

> On Mar 27, 2019, at 1:11, Eli Zaretskii <eliz@gnu.org> wrote:

> Or will we mandate that Emacs catalogs need
> always to be in UTF-8 encoding?

Please.

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

* Re: Emacs i18n
2019-03-25 22:31 ` Paul Eggert
2019-03-26 16:11 ` Eli Zaretskii
@ 2019-03-26 23:16 ` Juri Linkov
2019-03-27 1:35 ` Paul Eggert
2019-04-24 6:39 ` Jean-Christophe Helary
1 sibling, 2 replies; 151+ messages in thread
From: Juri Linkov @ 2019-03-26 23:16 UTC (permalink / raw)
To: Paul Eggert; +Cc: Jean-Christophe Helary, Emacs developers

>> I don't know where to put i18n related code, so since ngettext should
>> have C calls anyway, I moved it to editfns.c near the function ‘message’
>> where it still just returns the correct plurals without doing any translation.
>
> That stub had some problems:
>
> 1. It lacked documentation in the Elisp manual. Important changes like
> this should be documented -- to some extent the documentation is even
> more important than the code. Can you write something?

I'll start writing documentation. Is it allowed to make
references from the Elisp manual to the Gettext Info manual?
I see in (info "(gettext) elisp-format") a reference back to
the Elisp manual is a web link, not an Info reference.

> 2. While you're thinking about (1) here are some other questions. How
> will ngettext determine the message catalog? Is the catalog visible to
> users as a global variable, or as a hidden part of the global state, or
> is it something explicit? How will catalogs from multiple packages be
> used? How would a multi-lingual application work in Emacs if the message
> catalog is part of global state? This seems to be a crucial issue, I'd
> say. For example, should Emacs export dcngettext to Lisp code, instead
> of just plain ngettext? (Emacs could then define ngettext in terms of
> dcngettext.)

It seems most of these needs could be covered by adding two optional
arguments DOMAIN and CATEGORY to ngettext (where the default domain
"emacs" will be hard-coded).

As a convenience not to require a package to add its domain to every
ngettext call, maybe when something like 'defdomain' is declared at the
beginning of the package, its value should affect the domain within
the package scope.

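A toy version of the two ideas in the message above (an optional DOMAIN
argument with an "emacs" default, and a package-wide default domain)
can be sketched as follows. This is a sketch under loud assumptions,
not the proposed implementation: `sketch-default-domain',
`sketch-catalogs' and `sketch-ngettext-in-domain' are hypothetical
names, the catalog is an in-memory alist rather than a real message
catalog, the German entry is made up for the example, CATEGORY is left
out, and the plural rule is hard-wired to "1 versus everything else"
whereas real gettext catalogs declare their own rule in a Plural-Forms
header.

```elisp
;; All names below are hypothetical; none is an existing Emacs API.

(defvar sketch-default-domain "emacs"
  "Domain used when a lookup does not name one.")

(defvar sketch-catalogs
  '(("mypkg" . (("One package will be installed"
                 "Ein Paket wird installiert"
                 "%d Pakete werden installiert"))))
  "Toy catalogs: alist of (DOMAIN . ENTRIES).
Each entry is a list (MSGID SINGULAR-TRANSLATION PLURAL-TRANSLATION).")

(defun sketch-ngettext-in-domain (msgid msgid-plural n &optional domain)
  "Return the form of MSGID suited to N, looked up in DOMAIN.
DOMAIN defaults to `sketch-default-domain'.  When the domain or the
entry is missing, fall back to MSGID or MSGID-PLURAL untranslated."
  (let* ((catalog (cdr (assoc (or domain sketch-default-domain)
                              sketch-catalogs)))
         (forms (cdr (assoc msgid catalog))))
    (cond ((null forms) (if (eql n 1) msgid msgid-plural))
          ((eql n 1) (nth 0 forms))
          (t (nth 1 forms)))))

(defun sketch-mypkg-report (n)
  "Format an install message for N packages in the \"mypkg\" domain."
  ;; Rebinding the special variable gives every lookup made during
  ;; this call the package's domain, without repeating it per call.
  (let ((sketch-default-domain "mypkg"))
    (format (sketch-ngettext-in-domain "One package will be installed"
                                       "%d packages will be installed"
                                       n)
            n)))
```

The dynamic rebinding in `sketch-mypkg-report' is only one way to
approximate a 'defdomain'-like declaration; it takes effect at run
time, so whether a static, per-file mechanism is feasible is a
separate question.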
* Re: Emacs i18n
2019-03-26 23:16 ` Juri Linkov
@ 2019-03-27 1:35 ` Paul Eggert
2019-04-24 6:39 ` Jean-Christophe Helary
1 sibling, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-27 1:35 UTC (permalink / raw)
To: Juri Linkov; +Cc: Jean-Christophe Helary, Emacs developers

On 3/26/19 4:16 PM, Juri Linkov wrote:
> I'll start writing documentation. Is it allowed to make references
> from the Elisp manual to the Gettext Info manual? I see in (info
> "(gettext) elisp-format") a reference back to the Elisp manual is a
> web link, not an Info reference.
>
Thanks for taking this on. Yes, you can do cross-references; e.g.,
files.texi has this:

@xref{File permissions,,, coreutils, The @sc{gnu} @code{Coreutils} Manual}

> It seems most of these needs could be covered by adding two optional
> arguments DOMAIN and CATEGORY to ngettext (where the default domain
> "emacs" will be hard-coded).
>
This appears to be what CLISP does; see:

https://sourceforge.net/p/clisp/clisp/ci/default/tree/modules/i18n/i18n.lisp

https://clisp.sourceforge.io/impnotes.html#i18n-mod

> As a convenience not to require a package to add its domain to every
> ngettext call, maybe when something like 'defdomain' is declared at
> the beginning of the package, its value should affect the domain
> within the package scope.
>
Would this be done statically or dynamically? Preferably the former but
I don't exactly see how it would work, and even dynamically the details
are not obvious to me. For example, would you have to do something like
the following?

;; The fourth argument to ngettext is the proposed optional DOMAIN.
(defun mymodule--ngettext (n sing-msgid pl-msgid)
  (ngettext sing-msgid pl-msgid n "mymodule"))

(defun report-items (n)
  (message (mymodule--ngettext n "%d item" "%d items") n))

(defun report-keystrokes (n)
  (message (mymodule--ngettext n "%d keystroke received."
                               "%d keystrokes received.")
           n))

Something like this would work, but it looks pretty annoying....

* Re: Emacs i18n 2019-03-26 23:16 ` Juri Linkov 2019-03-27 1:35 ` Paul Eggert @ 2019-04-24 6:39 ` Jean-Christophe Helary 2019-04-24 20:18 ` Juri Linkov 1 sibling, 1 reply; 151+ messages in thread From: Jean-Christophe Helary @ 2019-04-24 6:39 UTC (permalink / raw) To: Juri Linkov; +Cc: Paul Eggert, Emacs developers [-- Attachment #1: Type: text/plain, Size: 2166 bytes --] So, where do we go from here now ? Juri, have you written documentation ? Do you want help ? Jean-Christophe > On Mar 27, 2019, at 8:16, Juri Linkov <juri@linkov.net <mailto:juri@linkov.net>> wrote: > >>> I don't know where to put i18n related code, so since ngettext should >>> have C calls anyway, I moved it to editfns.c near the function ‘message’ >>> where it still just returns the correct plurals without doing any translation. >> >> That stub had some problems: >> >> 1. It lacked documentation in the Elisp manual. Important changes like >> this should be documented -- to some extent the documentation is even >> more important than the code. Can you write something? > > I'll start writing documentation. Is it allowed to make > references from the Elisp manual to the Gettext Info manual? > I see in (info "(gettext) elisp-format") a reference back to > the Elisp manual is a web link, not an Info reference. > >> 2. While you're thinking about (1) here are some other questions. How >> will ngettext determine the message catalog? Is the catalog visible to >> users as a global variable, or as a hidden part of the global state, or >> is it something explicit? How will catalogs from multiple packages be >> used? How would a multi-lingual application work in Emacs if the message >> catalog is part of global state? This seems to be a crucial issue, I'd >> say. For example, should Emacs export dcngettext to Lisp code, instead >> of just plain ngettext? (Emacs could then define ngettext in terms of >> dcngettext.) 
> > It seems most of these needs could be covered by adding two optional > arguments DOMAIN and CATEGORY to ngettext (where the default domain > "emacs" will be hard-coded). > > As a convenience not to require a package to add its domain to every > ngettext call, maybe when something like 'defdomain' is declared at the > beginning of the package, its value should affect the domain within > the package scope. Jean-Christophe Helary ----------------------------------------------- http://mac4translators.blogspot.com <http://mac4translators.blogspot.com/> @brandelune [-- Attachment #2: Type: text/html, Size: 4267 bytes --] ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-04-24 6:39 ` Jean-Christophe Helary
@ 2019-04-24 20:18 ` Juri Linkov
0 siblings, 0 replies; 151+ messages in thread
From: Juri Linkov @ 2019-04-24 20:18 UTC (permalink / raw)
To: Jean-Christophe Helary; +Cc: Paul Eggert, Emacs developers

> So, where do we go from here now ?
>
> Juri, have you written documentation ?

It's still WIP. First I looked at how i18n is implemented in XEmacs,
and discovered that whereas the interface is documented, it's not fully
functional. What is worse, it's quite ugly. So I turned to a nicer
interface in CLISP that could be used as a basis for a gettext
interface in Emacs Lisp.

> Do you want help ?

Help is needed to install the standard gettext infrastructure using
gettextize. Help is expected from someone who has more experience in
applying gettext to other projects.

Once the default gettext infrastructure is installed, I could help in
adapting gettext to Emacs.

Meanwhile, I'm replacing dired-plural-s with ngettext in bug#35287.
It's not without problems: one problematic place is in
dired-do-kill-lines:

(defun dired-do-kill-lines (&optional arg fmt)
  ...
  (let ((count 0))
    (setq count (1+ count))
    (or (equal "" fmt)
        (message (or fmt "Killed %d line%s.") count (dired-plural-s count)))
    count)

(defun dired-omit-expunge (&optional regexp)
  ...
  (setq count (dired-do-kill-lines
               nil
               (if dired-omit-verbose "Omitted %d line%s." "")))

The format string in dired-do-kill-lines can't simply be replaced with
something like

(ngettext "Killed %d line." "Killed %d lines." count)

because dired-do-kill-lines can be called with a format string from
dired-omit-expunge, and dired-omit-expunge has no access to the
variable 'count'.

There are more such marginal cases, but eventually they all have to be
resolved somehow.

^ permalink raw reply [flat|nested] 151+ messages in thread
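One conceivable way around the dired-do-kill-lines problem is for the caller to hand over both plural forms and let the callee, which knows the count, choose between them. The sketch below is an assumption for illustration only and is not how bug#35287 was resolved: all my-* names are hypothetical, the functions take a plain list instead of operating on a Dired buffer, and the English-only choice stands in for a real ngettext lookup.

```elisp
;; Hypothetical sketch; none of these functions exist in dired.el.

(defun my-plural-format (formats n)
  "Format N using FORMATS, a cons (SINGULAR . PLURAL) of format strings."
  (format (if (= n 1) (car formats) (cdr formats)) n))

(defun my-kill-lines (lines &optional formats)
  "Pretend to kill LINES and return a cons (COUNT . MESSAGE).
FORMATS is a cons (SINGULAR . PLURAL); nil selects the default pair,
and the symbol `quiet' suppresses the message."
  (let ((count (length lines)))
    (cons count
          (unless (eq formats 'quiet)
            (my-plural-format
             (or formats '("Killed %d line." . "Killed %d lines."))
             count)))))

(defun my-omit-expunge (lines verbose)
  "Omit LINES; report with the \"Omitted\" wording only when VERBOSE."
  (my-kill-lines lines
                 (if verbose
                     '("Omitted %d line." . "Omitted %d lines.")
                   'quiet)))
```

The caller never needs the count: it supplies complete sentences for both cases, which is also what a translator would need to see.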
* Re: Emacs i18n 2019-03-24 21:55 ` Juri Linkov 2019-03-24 23:31 ` Jean-Christophe Helary @ 2019-03-25 3:35 ` Eli Zaretskii 2019-03-25 9:04 ` Jean-Christophe Helary 2019-03-25 21:02 ` Juri Linkov 2019-03-25 10:52 ` Mattias Engdegård 2 siblings, 2 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-25 3:35 UTC (permalink / raw) To: Juri Linkov; +Cc: cpitclaudel, emacs-devel, brandelune, rms > From: Juri Linkov <juri@linkov.net> > Cc: cpitclaudel@gmail.com, brandelune@gmail.com, rms@gnu.org, emacs-devel@gnu.org > Date: Sun, 24 Mar 2019 23:55:57 +0200 > > > But since we don't yet have ngettext, we cannot yet use this > > paradigm. I thought by "from now on" you literally meant from now; > > did I misunderstand? > > Yes, literally. After the patch from > http://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00586.html > is pushed to master, ngettext is available to use for pluralization. That's changing history. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-25 3:35 ` Eli Zaretskii @ 2019-03-25 9:04 ` Jean-Christophe Helary 2019-03-25 21:02 ` Juri Linkov 1 sibling, 0 replies; 151+ messages in thread From: Jean-Christophe Helary @ 2019-03-25 9:04 UTC (permalink / raw) To: Emacs developers [-- Attachment #1: Type: text/plain, Size: 803 bytes --] > On Mar 25, 2019, at 12:35, Eli Zaretskii <eliz@gnu.org> wrote: > >> From: Juri Linkov <juri@linkov.net> >> Cc: cpitclaudel@gmail.com, brandelune@gmail.com, rms@gnu.org, emacs-devel@gnu.org >> Date: Sun, 24 Mar 2019 23:55:57 +0200 >> >>> But since we don't yet have ngettext, we cannot yet use this >>> paradigm. I thought by "from now on" you literally meant from now; >>> did I misunderstand? >> >> Yes, literally. After the patch from >> http://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00586.html >> is pushed to master, ngettext is available to use for pluralization. > > That's changing history. How do we practically use that ? Jean-Christophe Helary ----------------------------------------------- http://mac4translators.blogspot.com @brandelune [-- Attachment #2: Type: text/html, Size: 2835 bytes --] ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-25 3:35 ` Eli Zaretskii 2019-03-25 9:04 ` Jean-Christophe Helary @ 2019-03-25 21:02 ` Juri Linkov 2019-03-26 3:27 ` Eli Zaretskii 1 sibling, 1 reply; 151+ messages in thread From: Juri Linkov @ 2019-03-25 21:02 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, emacs-devel, brandelune, rms >> > But since we don't yet have ngettext, we cannot yet use this >> > paradigm. I thought by "from now on" you literally meant from now; >> > did I misunderstand? >> >> Yes, literally. After the patch from >> http://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00586.html >> is pushed to master, ngettext is available to use for pluralization. > > That's changing history. When you asked to read past discussions, I did it, and among all opinions the most encouraging were the wise words of François Pinard: "Yet, when it is affordable to do so, and to spare the overall effort, it is often a good thing to aim in directions which have less chance to lead into dead ends, from which we might later have to backtrack from. Yet, dead ends are not always technical, so sometimes, not always, dead ends might be more fruitful than no road at all. It is surely a fine art, being able to choose the best roads, considering all issues. My opinion is, when we are lacking of volunteer time, that the best road is the one having the least steps in it, and this often means that backtracking should be avoided. Best is trying to do things the right way, even if not everything gets done at once. So steps accumulate constructively over time." I hope that starting with a small step of adding ngettext to provide the correct plurals for English words would lead in the right direction while avoiding the danger of backtracking from dead ends. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-25 21:02 ` Juri Linkov @ 2019-03-26 3:27 ` Eli Zaretskii 2019-03-27 23:06 ` Richard Stallman 0 siblings, 1 reply; 151+ messages in thread From: Eli Zaretskii @ 2019-03-26 3:27 UTC (permalink / raw) To: Juri Linkov; +Cc: cpitclaudel, emacs-devel, brandelune, rms > From: Juri Linkov <juri@linkov.net> > Cc: cpitclaudel@gmail.com, brandelune@gmail.com, rms@gnu.org, emacs-devel@gnu.org > Date: Mon, 25 Mar 2019 23:02:59 +0200 > > I hope that starting with a small step of adding ngettext to provide > the correct plurals for English words would lead in the right direction > while avoiding the danger of backtracking from dead ends. That's not what I meant. I asked a question, and you replied as if the commit done yesterday already existed before I asked the question. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-26 3:27 ` Eli Zaretskii @ 2019-03-27 23:06 ` Richard Stallman 0 siblings, 0 replies; 151+ messages in thread From: Richard Stallman @ 2019-03-27 23:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: cpitclaudel, emacs-devel, brandelune, juri [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > > I hope that starting with a small step of adding ngettext to provide > > the correct plurals for English words would lead in the right direction > > while avoiding the danger of backtracking from dead ends. > That's not what I meant. I asked a question, and you replied as if > the commit done yesterday already existed before I asked the question. It sounds like a minor miscommunication. A short private conversation might enable you to set it straight. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-24 21:55 ` Juri Linkov
2019-03-24 23:31 ` Jean-Christophe Helary
2019-03-25 3:35 ` Eli Zaretskii
@ 2019-03-25 10:52 ` Mattias Engdegård
2019-03-25 15:37 ` Eli Zaretskii
2019-03-25 21:11 ` Juri Linkov
2 siblings, 2 replies; 151+ messages in thread
From: Mattias Engdegård @ 2019-03-25 10:52 UTC (permalink / raw)
To: Juri Linkov; +Cc: brandelune, Eli Zaretskii, cpitclaudel, rms, emacs-devel

On 24 March 2019, at 22:55, Juri Linkov <juri@linkov.net> wrote:
>
> http://lists.gnu.org/archive/html/emacs-devel/2019-03/msg00586.html
> is pushed to master, ngettext is available to use for pluralization.

That patch exposes some Emacs-specific translation problems:

- (cons (format "finished with %d matches found\n" grep-num-matches-found)
+ (cons (format (ngettext "finished with %d match found\n"
+                         "finished with %d matches found\n"
+                         grep-num-matches-found)
+               grep-num-matches-found)

This is fine -- typical i18n code (except that the subject of the
sentence is missing, which should go into a comment to translators).

 ;; remove match from grep-regexp-alist before fontifying
 ("^Grep[/a-zA-Z]* started.*"
  (0 '(face nil compilation-message nil help-echo nil mouse-face nil) t))
- ("^Grep[/a-zA-Z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?matches found\\)\\|\\(no matches found\\)\\).*"
+ ("^Grep[/a-zA-Z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? found\\)\\|\\(no matches found\\)\\).*"

Since it is not uncommon in Emacs to pattern-match on generated text,
either the translator needs to understand regexps well or the code
must be restructured to avoid that kind of matching, perhaps by using
text properties. Besides, translating regexp strings precludes the use
of modern regexp notations like rx, since gettext is string-oriented.

Of course the patch was just a proof-of-concept and not intended as
actual code. Please forgive me for using it to make a point. This is
also not an argument against using gettext. Quite the contrary; it's
the obvious way to go if i18n is to be undertaken at all.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-25 10:52 ` Mattias Engdegård @ 2019-03-25 15:37 ` Eli Zaretskii 2019-03-25 21:11 ` Juri Linkov 1 sibling, 0 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-25 15:37 UTC (permalink / raw) To: Mattias Engdegård; +Cc: cpitclaudel, emacs-devel, brandelune, rms, juri > From: Mattias Engdegård <mattiase@acm.org> > Date: Mon, 25 Mar 2019 11:52:45 +0100 > Cc: Eli Zaretskii <eliz@gnu.org>, cpitclaudel@gmail.com, emacs-devel@gnu.org, > brandelune@gmail.com, rms@gnu.org > > Of course the patch was just a proof-of-concept and not intended as actual code. It's not proof-of-concept, it's an actual patch that was committed yesterday night. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-25 10:52 ` Mattias Engdegård 2019-03-25 15:37 ` Eli Zaretskii @ 2019-03-25 21:11 ` Juri Linkov 2019-03-25 22:05 ` Mattias Engdegård 1 sibling, 1 reply; 151+ messages in thread From: Juri Linkov @ 2019-03-25 21:11 UTC (permalink / raw) To: Mattias Engdegård Cc: brandelune, Eli Zaretskii, cpitclaudel, rms, emacs-devel > + (cons (format (ngettext "finished with %d match found\n" > + "finished with %d matches found\n" > + grep-num-matches-found) > + grep-num-matches-found) > > ("^Grep[/a-zA-Z]* started.*" > (0 '(face nil compilation-message nil help-echo nil mouse-face nil) t)) > - ("^Grep[/a-zA-Z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?matches > found\\)\\|\\(no matches found\\)\\).*" > + ("^Grep[/a-zA-Z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? > found\\)\\|\\(no matches found\\)\\).*" > > Since it is not uncommon in Emacs to pattern-match on generated text, > either the translator needs to understand regexps well or the code > must be restructured to avoid that kind of matching, perhaps by using > text properties. Besides, translating regexp strings precludes the use > of modern regexp notations like rx, since gettext is string-oriented. Is it possible to generate a regexp from ngettext arguments? For example, given the same arguments and calling a hypothetical function ‘rx-ngettext’: (rx-ngettext "finished with %d match found\n" "finished with %d matches found\n") to generate a regexp like: "finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? found\\)" ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-25 21:11 ` Juri Linkov
@ 2019-03-25 22:05 ` Mattias Engdegård
2019-03-27 21:22 ` Juri Linkov
0 siblings, 1 reply; 151+ messages in thread
From: Mattias Engdegård @ 2019-03-25 22:05 UTC (permalink / raw)
To: Juri Linkov; +Cc: Eli Zaretskii, emacs-devel, cpitclaudel, brandelune, rms

On 25 March 2019, at 22:11, Juri Linkov <juri@linkov.net> wrote:
>
> Is it possible to generate a regexp from ngettext arguments?
> For example, given the same arguments and calling a hypothetical
> function ‘rx-ngettext’:
>
> (rx-ngettext "finished with %d match found\n"
>              "finished with %d matches found\n")
>
> to generate a regexp like:
>
> "finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? found\\)"

Trivially so by generating an or-pattern: "singular text\\|plural text".
Anything better is a matter of optimisation, basically a diff algorithm
(or just prefix and suffix merging).

Is it practical, though? For %s, we would need to generate a
match-anything subexpression, even though the argument is much more
constrained in practice.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-25 22:05 ` Mattias Engdegård @ 2019-03-27 21:22 ` Juri Linkov 2019-03-28 11:03 ` Mattias Engdegård 0 siblings, 1 reply; 151+ messages in thread From: Juri Linkov @ 2019-03-27 21:22 UTC (permalink / raw) To: Mattias Engdegård Cc: Eli Zaretskii, emacs-devel, cpitclaudel, brandelune, rms >> Is it possible to generate a regexp from ngettext arguments? >> For example, given the same arguments and calling a hypothetical >> function ‘rx-ngettext’: >> >> (rx-ngettext "finished with %d match found\n" >> "finished with %d matches found\n") >> >> to generate a regexp like: >> >> "finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? found\\)" > > Trivially so by generating an or-pattern: "singular text\\|plural text". > Anything better is a matter of optimisation, basically a diff algorithm > (or just prefix and suffix merging). > > Is it practical, though? For %s, we would need to generate a match-anything > subexpression, even though the argument is much more constrained in practice. I tried ‘regexp-opt’ and it generates a ready-to-use regexp: (replace-regexp-in-string "%d" "\\\\([0-9]+\\\\)" (regexp-opt '("finished with %d match found" "finished with %d matches found" "finished with no matches found"))) ⇒ "\\(?:finished with \\(?:\\(?:\\([0-9]+\\) match\\(?:es\\)?\\|no matches\\) found\\)\\)" ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-27 21:22 ` Juri Linkov
@ 2019-03-28 11:03 ` Mattias Engdegård
0 siblings, 0 replies; 151+ messages in thread
From: Mattias Engdegård @ 2019-03-28 11:03 UTC (permalink / raw)
To: Juri Linkov; +Cc: Emacs developers

On 27 March 2019, at 22:22, Juri Linkov <juri@linkov.net> wrote:
>
> I tried ‘regexp-opt’ and it generates a ready-to-use regexp:
>
> (replace-regexp-in-string
>  "%d" "\\\\([0-9]+\\\\)"
>  (regexp-opt '("finished with %d match found"
>                "finished with %d matches found"
>                "finished with no matches found")))
>
> ⇒ "\\(?:finished with \\(?:\\(?:\\([0-9]+\\) match\\(?:es\\)?\\|no matches\\) found\\)\\)"

Well now. There is no guarantee that regexp-opt won't split the %d.
Format strings must be parsed left-to-right for correctness¹.

I'm still skeptical, but if you really want to give this a try, then
first segment the format string:

"Today %d little piggies built %03o houses and said '%s'."
"Today %d little piggy built %o house and said '%s'."
=>
("Today " ?d " little piggies built " ?o " houses and said '" ?s "'.")
("Today " ?d " little piggy built " ?o " house and said '" ?s "'.")

leaving the format placeholders as atomic entities (here shown as
characters, but you may need more information there). Then run your fav
diff algo on the result. Most important to performance is prefix
merging; anything else is just to make the regexp smaller. Here, prefix
and suffix merging would leave you with (still in abstract form)

("Today " ?d " little pigg"
 (("ies built " ?o " houses")
  ("y built " ?o " house"))
 " and said '" ?s "'.")

From there you can either recursively try to find more common
subsequences, or call it a day and render it into a regexp:

"Today -?[0-9]+ little pigg\\(?:ies built -?[0-7]+ houses\\|y built -?[0-7]+ house\\) and said '\\(?:.\\|\n\\)*'."

All this will need to be done at run-time, since it is run on
translated strings.

¹ To match format parameters, try something like
(rx "%" (opt (1+ digit) "$") (0+ digit) (opt "." (0+ digit)) (any "%sdioxXefgcS"))

^ permalink raw reply [flat|nested] 151+ messages in thread
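The segmentation step and the trivial or-pattern discussed above can be sketched as follows. This is a rough illustration under stated assumptions, not an implementation anyone in the thread proposed: the my-* function names are hypothetical, only the bare specs %d, %o and %s are recognized (no flags, widths or positional arguments, so "%03o" would be treated as literal text), and no prefix or suffix merging is attempted.

```elisp
;; Hypothetical sketch; the `my-' functions are illustrative only.

(defun my-format-segments (fmt)
  "Split FMT into literal strings and conversion characters.
Only the bare specs %d, %o and %s are recognized."
  (let ((case-fold-search nil)
        (pos 0)
        (segments '()))
    (while (string-match "%[dos]" fmt pos)
      (let ((start (match-beginning 0)))
        ;; Literal text before the placeholder, if any.
        (when (> start pos)
          (push (substring fmt pos start) segments))
        ;; The placeholder itself, kept as an atomic character.
        (push (aref fmt (1+ start)) segments)
        (setq pos (match-end 0))))
    (when (< pos (length fmt))
      (push (substring fmt pos) segments))
    (nreverse segments)))

(defun my-segments-to-regexp (segments)
  "Render SEGMENTS, as returned by `my-format-segments', into a regexp."
  (mapconcat (lambda (seg)
               (cond ((stringp seg) (regexp-quote seg))
                     ((eq seg ?d) "-?[0-9]+")
                     ((eq seg ?o) "-?[0-7]+")
                     (t "\\(?:.\\|\n\\)*"))) ; %s: match anything
             segments ""))

(defun my-plural-regexp (singular plural)
  "Return an or-pattern matching output of either SINGULAR or PLURAL."
  (concat "\\(?:"
          (my-segments-to-regexp (my-format-segments singular))
          "\\|"
          (my-segments-to-regexp (my-format-segments plural))
          "\\)"))
```

Because placeholders are split out before any regexp is built, literal text goes through regexp-quote and a "%d" can never be cut in half, which is the concern raised about applying regexp-opt to raw format strings.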
* Re: Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted)
2019-03-03 15:31 ` Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) Eli Zaretskii
2019-03-03 20:57 ` Emacs i18n Juri Linkov
@ 2019-03-04 3:27 ` Richard Stallman
2019-03-04 16:36 ` Eli Zaretskii
1 sibling, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-04 3:27 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> This has come up several times in the past. The main problem with
> i18n in Emacs is that, unlike in many text-mode programs, 'message'
> covers a tiny portion of the Emacs UI. We have help commands that pop
> up buffers; we have commands that prompt in the minibuffer; we have
> menu items and labels on tool-bar buttons; we have help-echo on menus,

That is quite true. However, I recommend a different approach to
doing the job. An incremental one.

Let's install the lookup code and make `message' call it -- not using
advice. Perhaps we should rewrite it into C, since it is short
and we will want to call it from C code.

Let's develop something to load translations from po files. Let's
develop software to generate and write lists of messages that need
translating.

Then people can start developing useful sets of translations.

Meanwhile, we can also hook it into other interfaces where it is
appropriate.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) 2019-03-04 3:27 ` Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) Richard Stallman @ 2019-03-04 16:36 ` Eli Zaretskii 2019-03-04 18:37 ` Paul Eggert 0 siblings, 1 reply; 151+ messages in thread From: Eli Zaretskii @ 2019-03-04 16:36 UTC (permalink / raw) To: rms; +Cc: emacs-devel, juri > From: Richard Stallman <rms@gnu.org> > Cc: juri@linkov.net, emacs-devel@gnu.org > Date: Sun, 03 Mar 2019 22:27:36 -0500 > > That is quite true. However, I recommend a different approach to > doing the job. An incremental one. > > Let's install the lookup code and make `message' call it -- not using > advice. Perhaps we should rewrite it into C, since it is short > and we will want to call it from C code. > > Let's develop something to load translations from po files. Let's > develop software to generate and write lists of messages that need > translating. > > Then people can start developing useful sets of translations. > > Meanwhile, we can also hook it into other interfaces where it > appropriate. The incremental approach is a great approach, but it does have its limitations. Especially when several non-trivial features will eventually need to be compatible with each other to be true parts of a greater whole, which is i18n for Emacs. For example, it is IMO pointless to be able to display translated strings from 'message' without also having a convenient automated way of collecting translatable messages and creating a message catalog that such a 'message' could use, or without being able to install such message catalogs for different ELisp packages. IOW, this feature, like many other large features, cannot be implemented in increments that are too small. Each increment should be large enough to make sense. And then there's a more complex issue of how the increments will work together; some thought must be invested in that up front. 
^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) 2019-03-04 16:36 ` Eli Zaretskii @ 2019-03-04 18:37 ` Paul Eggert 2019-03-04 19:07 ` Eli Zaretskii 0 siblings, 1 reply; 151+ messages in thread From: Paul Eggert @ 2019-03-04 18:37 UTC (permalink / raw) To: Eli Zaretskii, rms; +Cc: juri, emacs-devel On 3/4/19 8:36 AM, Eli Zaretskii wrote: > For example, it is IMO pointless to be able to display translated > strings from 'message' without also having a convenient automated way > of collecting translatable messages and creating a message catalog > that such a 'message' could use There is longstanding technology to do that for C code. We could apply that to Emacs, and then at least the builtin C-level messages will be translated. Later, we could extend this to Elisp. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) 2019-03-04 18:37 ` Paul Eggert @ 2019-03-04 19:07 ` Eli Zaretskii 2019-03-05 2:09 ` Paul Eggert 0 siblings, 1 reply; 151+ messages in thread From: Eli Zaretskii @ 2019-03-04 19:07 UTC (permalink / raw) To: Paul Eggert; +Cc: juri, rms, emacs-devel > Cc: emacs-devel@gnu.org, juri@linkov.net > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Mon, 4 Mar 2019 10:37:31 -0800 > > On 3/4/19 8:36 AM, Eli Zaretskii wrote: > > For example, it is IMO pointless to be able to display translated > > strings from 'message' without also having a convenient automated way > > of collecting translatable messages and creating a message catalog > > that such a 'message' could use > > There is longstanding technology to do that for C code. We could apply > that to Emacs, and then at least the builtin C-level messages will be > translated. Later, we could extend this to Elisp. I'm saying that IMO it makes no sense at all to do this only for C. The infrastructure used for that will most probably not work for Lisp, let alone allow separate translations for separate packages to be brought together and used in the same Emacs session. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n (was: bug#34520: delete-matching-lines should report how many lines it deleted) 2019-03-04 19:07 ` Eli Zaretskii @ 2019-03-05 2:09 ` Paul Eggert 2019-03-05 21:58 ` Emacs i18n Juri Linkov 0 siblings, 1 reply; 151+ messages in thread From: Paul Eggert @ 2019-03-05 2:09 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, rms, emacs-devel On 3/4/19 11:07 AM, Eli Zaretskii wrote: > I'm saying that IMO it makes no sense at all to do this only for C. Yes, of course it should also work for Elisp. I mentioned C only as a way to get it started, since the C infrastructure already exists and we need to do it for the C messages anyway. > The infrastructure used for that will most probably not work for Lisp, > let alone allow separate translations for separate packages to be > brought together and used in the same Emacs session. I don't see why it wouldn't work for Elisp. The gettext infrastructure allows multiple message catalogs in the same session. Obviously some hacking would be involved, since Elisp currently doesn't do any of this; but it could be built atop the existing infrastructure used by other GNU applications rather than being rewritten from scratch. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-05 2:09 ` Paul Eggert @ 2019-03-05 21:58 ` Juri Linkov 2019-03-06 2:16 ` Richard Stallman ` (2 more replies) 0 siblings, 3 replies; 151+ messages in thread From: Juri Linkov @ 2019-03-05 21:58 UTC (permalink / raw) To: Paul Eggert; +Cc: Eli Zaretskii, rms, emacs-devel >> The infrastructure used for that will most probably not work for Lisp, >> let alone allow separate translations for separate packages to be >> brought together and used in the same Emacs session. > > I don't see why it wouldn't work for Elisp. The gettext infrastructure > allows multiple message catalogs in the same session. Obviously some > hacking would be involved, since Elisp currently doesn't do any of this; > but it could be built atop the existing infrastructure used by other GNU > applications rather than being rewritten from scratch. One of the main decisions that has to be made is whether to wrap all user-facing translatable strings in all Lisp files using a macro/function 'gettext' (alias '_') explicitly like is implemented in XEmacs' I18N3 that would help to extract translations from the source code, or to use a low-level implicit translation without changing the existing code like is implemented for handling text-quoting-style in format strings. The latter will even allow translation of strings that a package author forgot to mark with '_'. Depending on this decision a translation file format has to be selected, be it flat Gettext PO format files or even some YAML-like hierarchical Lisp structures with scopes. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-05 21:58 ` Emacs i18n Juri Linkov @ 2019-03-06 2:16 ` Richard Stallman 2019-03-06 18:15 ` Eli Zaretskii 2019-03-06 17:30 ` Eli Zaretskii 2019-03-06 18:09 ` Eli Zaretskii 2 siblings, 1 reply; 151+ messages in thread From: Richard Stallman @ 2019-03-06 2:16 UTC (permalink / raw) To: Juri Linkov; +Cc: eliz, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > One of the main decisions that has to be made is whether to wrap all > user-facing translatable strings in all Lisp files using a macro/function > 'gettext' (alias '_') explicitly like is implemented in XEmacs' I18N3 > that would help to extract translations from the source code, or to use > a low-level implicit translation without changing the existing code like > is implemented for handling text-quoting-style in format strings. > The latter will even allow translation of strings that a package author > forgot to mark with '_'. We could recognize all strings passed as certain arguments to certain functions as translatable automatically, and have an explicit way to mark other strings as translatable. That could reduce the amount of work for developers to mark them. The translatability could be recorded as a text property in the string. Then, if a function such as 'message' gets a format string that is not translatable, it could warn, or save up a record that developers could optionally look at later. This would help remind developers to mark the strings that need it. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
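The idea of recording translatability as a text property, and having a 'message'-like function note format strings that lack it, might be sketched like this. Everything here is an assumption for illustration: the property name my-translatable, the function names, and the choice to collect unmarked strings in a list rather than warn are all hypothetical, and no translation is performed.

```elisp
;; Hypothetical sketch; none of these names exist in Emacs.

(defun my-i18n-mark (string)
  "Return a copy of STRING marked as translatable."
  (propertize string 'my-translatable t))

(defvar my-i18n-unmarked '()
  "Format strings seen by `my-i18n-message' without the mark.")

(defun my-i18n-message (fmt &rest args)
  "Format FMT with ARGS, recording FMT if it is not marked translatable.
Returns the formatted string instead of displaying it, so the
behavior is easy to inspect."
  (unless (or (equal fmt "")
              (get-text-property 0 'my-translatable fmt))
    ;; Saved for developers to review later, rather than warning now.
    (push fmt my-i18n-unmarked))
  (apply #'format fmt args))
```

A developer could inspect my-i18n-unmarked after a session to find call sites that still need marking.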
* Re: Emacs i18n 2019-03-06 2:16 ` Richard Stallman @ 2019-03-06 18:15 ` Eli Zaretskii 2019-03-06 19:47 ` Paul Eggert 2019-03-07 3:42 ` Richard Stallman 0 siblings, 2 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-06 18:15 UTC (permalink / raw) To: rms; +Cc: eggert, emacs-devel, juri > From: Richard Stallman <rms@gnu.org> > Cc: eggert@cs.ucla.edu, eliz@gnu.org, emacs-devel@gnu.org > Date: Tue, 05 Mar 2019 21:16:07 -0500 > > We could recognize all strings passed as certain arguments to certain > functions as translatable automatically, and have an explicit > way to mark other strings as translatable. That could reduce > the amount of work for developers to mark them. You mean, the function, such as 'message', that receives the string will translate it? As opposed to the alternative of translating the string _before_ it gets passed to the function? If we do that, how do we deal with strings that are computed by concatenation or formatting? They get in one piece to functions like 'message', but the catalog will not hold that concatenated string, it will have the parts separately. How will the function be able to look up the translation in such cases? ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-06 18:15 ` Eli Zaretskii
@ 2019-03-06 19:47 ` Paul Eggert
2019-03-06 20:19 ` Eli Zaretskii
2019-03-08 4:07 ` Richard Stallman
2019-03-07 3:42 ` Richard Stallman
1 sibling, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-06 19:47 UTC (permalink / raw)
To: Eli Zaretskii, rms; +Cc: emacs-devel, juri

On 3/6/19 10:15 AM, Eli Zaretskii wrote:

> how do we deal with strings that are computed by concatenation or
> formatting?

The same way that other GNU packages deal with them: we redo calls, to
make the strings easier to translate. For example, instead of this code
(adapted from todo-mode.el):

(message (concat "The highlighted item"
                 (if (= count 1) " is " "s precedes ")
                 "the timestamp %s.")
         timestamp)

we do something like this:

(nmessage count
          "The highlighted item is not up to date."
          "The highlighted items are not up to date."
          timestamp)

where (nmessage N FMT1 FMT2 ...) is a new function that mimics GNU
ngettext by returning a translation of FMT1 (using N) if a translation
is available, and if no translation is available it falls back on using
FMT1 if N is 1 and FMT2 otherwise.

It's inevitable that we'd need to redo Lisp code this way, as
translators cannot be expected to be programming experts who understand
arbitrary Lisp code involving 'concat' and whatnot. This is what other
GNU packages have done, and Emacs can do something similar.

Of course it will be a big task to fully internationalize in this way,
and it's not something that can be done all at once. But it doesn't have
to be done all at once: we can create the machinery, do some proper i18n
of a few key Lisp modules, and let the other modules be fixed up later
when people find time.

^ permalink raw reply [flat|nested] 151+ messages in thread
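The fallback behavior described for nmessage can be sketched in a few lines. This is only an illustration under assumptions: the names my-nformat, my-nmessage and my-nmessage-translate-function are hypothetical, and the translation lookup is reduced to an optional function variable because no catalog machinery exists in this sketch.

```elisp
;; Hypothetical sketch of the proposed `nmessage'; names are illustrative.

(defvar my-nmessage-translate-function nil
  "If non-nil, a function called with (N FMT1 FMT2).
It should return a translated format string, or nil if none is available.")

(defun my-nformat (n fmt1 fmt2 &rest args)
  "Format ARGS with a translation of FMT1 chosen using N.
Without a translation, fall back on FMT1 if N is 1 and FMT2 otherwise."
  (let ((fmt (or (and my-nmessage-translate-function
                      (funcall my-nmessage-translate-function n fmt1 fmt2))
                 (if (= n 1) fmt1 fmt2))))
    (apply #'format fmt args)))

(defun my-nmessage (n fmt1 fmt2 &rest args)
  "Display the result of `my-nformat' in the echo area."
  ;; The \"%s\" keeps any % in the formatted text from being reinterpreted.
  (message "%s" (apply #'my-nformat n fmt1 fmt2 args)))
```

For example, (my-nmessage 2 "%d item before %s" "%d items before %s" 2 "noon") would use the plural format when no translation function is installed.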
* Re: Emacs i18n 2019-03-06 19:47 ` Paul Eggert @ 2019-03-06 20:19 ` Eli Zaretskii 2019-03-07 1:52 ` Paul Eggert 2019-03-08 4:07 ` Richard Stallman 1 sibling, 1 reply; 151+ messages in thread From: Eli Zaretskii @ 2019-03-06 20:19 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, rms, juri > Cc: juri@linkov.net, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Wed, 6 Mar 2019 11:47:23 -0800 > > On 3/6/19 10:15 AM, Eli Zaretskii wrote: > > > how do we deal with strings that are computed by concatenation or > > formatting? > > > The same way that other GNU packages deal with them: we redo calls, to > make the strings easier to translate. For example, instead of this code > (adapted from todo-mode.el): > > (message (concat "The highlighted item" (if (= count 1) " is " "s > precedes ") > "the timestamp %s.") > timestamp) > > we do something like this: > > (nmessage count > "The highlighted item is not up to date." > "The highlighted items are not up to date." > timestamp) That's the easy case. This one is a bit tougher: (message "The program says: " (shell-command-to-string "foo")) > It's inevitable that we'd need to redo Lisp code this way, as > translators cannot be expected to be programming experts that understand > arbitrary Lisp code involving 'concat' and whatnot. This is what other > GNU packages have done, and Emacs can do something similar. Except that Emacs is so much larger that doing this "like other packages" might make the job infeasible. Which is one reason why I think we should start from doc strings: they are both easier and much more important. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-06 20:19 ` Eli Zaretskii
@ 2019-03-07 1:52 ` Paul Eggert
2019-03-07 3:37 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-07 1:52 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, rms, juri

On 3/6/19 12:19 PM, Eli Zaretskii wrote:
> That's the easy case. This one is a bit tougher:
>
>     (message "The program says: " (shell-command-to-string "foo"))

Assuming you meant this:

    (message (concat "The program says: " (shell-command-to-string "foo")))

then it shouldn't be tough at all. The Elisp code should be rewritten
like this:

    (message "The program says: %s" (shell-command-to-string "foo"))

xgettext will automatically put "The program says: %s" into the pool of
translatable strings. The output of the "foo" command won't be
translated, nor should it be.

Anyway, the Elisp code with "concat" needs to be rewritten regardless of
whether we do i18n, as it can throw an exception if the shell command's
output contains "%".

All this is routine for program internationalization. Emacs is not
special here; we've often had to do this sort of thing for other GNU
packages.

^ permalink raw reply [flat|nested] 151+ messages in thread
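The "%" hazard mentioned above can be reproduced without running any
external command. The sketch below uses 'format', which follows the
same format-string rules as 'message', and a literal string standing in
for the command output. The function names are hypothetical.

```elisp
(defun my-report-unsafe (output)
  "Use OUTPUT as part of the format string (the problematic pattern)."
  (format (concat "The program says: " output)))

(defun my-report-safe (output)
  "Pass OUTPUT as an argument to a constant format string."
  (format "The program says: %s" output))

(defun my-signals-error-p (fn &rest args)
  "Return t if calling FN with ARGS signals an error, else nil."
  (condition-case nil
      (progn (apply fn args) nil)
    (error t)))
```

With an output of "100%", the unsafe variant ends its format string in
the middle of a format specifier and signals an error, while the safe
variant returns the text unchanged. The safe variant also leaves a
constant string for a tool such as xgettext to extract.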
* Re: Emacs i18n
2019-03-07 1:52 ` Paul Eggert
@ 2019-03-07 3:37 ` Eli Zaretskii
2019-03-08 4:07 ` Richard Stallman
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-07 3:37 UTC (permalink / raw)
To: Paul Eggert; +Cc: juri, rms, emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 6 Mar 2019 17:52:05 -0800
> Cc: emacs-devel@gnu.org, rms@gnu.org, juri@linkov.net
>
> then it shouldn't be tough at all. The Elisp code should be rewritten
> like this:
>
>     (message "The program says: %s" (shell-command-to-string "foo"))
>
> xgettext will automatically put "The program says: %s" into the pool of
> translatable strings. The output of the "foo" command won't be
> translated, nor should it be.
>
> Anyway, the Elisp code with "concat" needs to be rewritten regardless of
> whether we do i18n, as it can throw an exception if the shell command's
> output contains "%".
>
> All this is routine for program internationalization. Emacs is not
> special here; we've often had to do this sort of thing for other GNU
> packages.

Sure, except that Emacs is so much larger, and gives the programmer a
lot more freedom with treating code and data alike, than a typical C
program. I just want people to realize how this job is more complicated
in Emacs than in any other program.

E.g., IIUC what you say, we will need to rewrite also the likes of
this:

    (let* ((field (get-char-property pos 'field))
           (button (get-char-property pos 'button))
           (doc (get-char-property pos 'widget-doc))
           (text (cond (field "This is an editable text area.")
                       (button "This is an active area.")
                       (doc "This is documentation text.")
                       (t "This is unidentified text.")))
           (widget (or field button doc)))
      (when widget
        (widget-browse widget))
      (message text)))

And this is just a random, and not the most complicated, example.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 3:37 ` Eli Zaretskii @ 2019-03-08 4:07 ` Richard Stallman 2019-03-08 8:16 ` Eli Zaretskii 0 siblings, 1 reply; 151+ messages in thread From: Richard Stallman @ 2019-03-08 4:07 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > (let* ((field (get-char-property pos 'field)) > (button (get-char-property pos 'button)) > (doc (get-char-property pos 'widget-doc)) > (text (cond (field "This is an editable text area.") > (button "This is an active area.") > (doc "This is documentation text.") > (t "This is unidentified text."))) > (widget (or field button doc))) > (when widget > (widget-browse widget)) > (message text))) We would need to make SOME sort of change in it, but change could be very simple. It could consist of writing a call to 'translate' around each of those string constants. Or we might adopt a reader syntax for translatable strings. That might be convenient, since we want these to be found by tools processing the code, not solely handled by executing the code. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-08 4:07 ` Richard Stallman @ 2019-03-08 8:16 ` Eli Zaretskii 0 siblings, 0 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-08 8:16 UTC (permalink / raw) To: rms; +Cc: juri, eggert, emacs-devel > From: Richard Stallman <rms@gnu.org> > Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org, juri@linkov.net > Date: Thu, 07 Mar 2019 23:07:20 -0500 > > Or we might adopt a reader syntax for translatable strings. > That might be convenient, since we want these to be found > by tools processing the code, not solely handled by > executing the code. I think we will need to come up with such a syntax anyway, because we will want to leave the Lisp programmers the freedom of writing code that computes displayable text out of thin air. It doesn't have to be reader syntax, btw: it could be a special function. ^ permalink raw reply [flat|nested] 151+ messages in thread
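A rough sketch of the "special function" variant discussed above,
applied to the widget example. Everything here is hypothetical:
my-translate, my-catalog, and my-describe-area are illustrative names,
and the hash table stands in for whatever catalog format would actually
be chosen.

```elisp
;; Hypothetical in-memory catalog mapping source strings to translations.
(defvar my-catalog (make-hash-table :test #'equal))

(defun my-translate (string)
  "Return the translation of STRING, or STRING itself if none is known."
  (gethash string my-catalog string))

(defun my-describe-area (kind)
  "Return a translated description for KIND, a symbol or nil."
  ;; Each constant is wrapped separately, so a tool scanning the source
  ;; can collect the literals without executing the code.
  (cond ((eq kind 'field)  (my-translate "This is an editable text area."))
        ((eq kind 'button) (my-translate "This is an active area."))
        ((eq kind 'doc)    (my-translate "This is documentation text."))
        (t                 (my-translate "This is unidentified text."))))
```

Strings with no catalog entry come back unchanged, so code marked this
way keeps working before any translation exists.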
* Re: Emacs i18n
2019-03-06 19:47 ` Paul Eggert
2019-03-06 20:19 ` Eli Zaretskii
@ 2019-03-08 4:07 ` Richard Stallman
2019-03-08 4:33 ` Elias Mårtenson
1 sibling, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-08 4:07 UTC (permalink / raw)
To: Paul Eggert; +Cc: eliz, juri, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> we do something like this:
>
>     (nmessage count
>               "The highlighted item is not up to date."
>               "The highlighted items are not up to date."
>               timestamp)

It might be better to define a function like this

    (defun numeric-select (count &rest messages)
      (or (nth count messages)
          (car (last messages))))

and then write

    (message (numeric-select count
                             "The highlighted item is not up to date."
                             "The highlighted items are not up to date."))

Translation infrastructure might be able to recognize this construct
and mark the two strings as translatable if they are constants.

Even better, translation could allow replacing that list of messages
with a different list of messages, perhaps longer. That would
make possible perfect support for a language where you need a different
text for 2 and for numbers larger than 2.

We could decide that the first element is for COUNT = 0, and if that
element is a number instead of a string, it means to use the element
for that number.

    (message (numeric-select count 2
                             "The highlighted item is not up to date."
                             "The highlighted items are not up to date."))

This, together with the feature of translating the list as a different
list, could be totally general.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
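To make the indexing convention concrete, here is numeric-select as
posted above, followed by a usage note. The three sample strings in the
note are illustrative only and are not part of the proposal.

```elisp
;; `numeric-select' as posted: `nth' is zero-based, so the first
;; message corresponds to COUNT = 0, and any COUNT past the end of
;; the list falls back to the last message.
(defun numeric-select (count &rest messages)
  (or (nth count messages)
      (car (last messages))))
```

Called with "No items.", "One item." and "%d items.", a COUNT of 0, 1
and 2 selects each string in turn, and a COUNT of 7 falls back to the
last one. Read literally, a call with only two strings would select the
second string for COUNT = 1, so under this convention a list is
expected to start with the text for zero.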
* Re: Emacs i18n
2019-03-08 4:07 ` Richard Stallman
@ 2019-03-08 4:33 ` Elias Mårtenson
2019-03-08 8:22 ` Eli Zaretskii
2019-03-09 3:11 ` Richard Stallman
0 siblings, 2 replies; 151+ messages in thread
From: Elias Mårtenson @ 2019-03-08 4:33 UTC (permalink / raw)
To: Richard Stallman; +Cc: Eli Zaretskii, Paul Eggert, emacs-devel, Juri Linkov

On Fri, 8 Mar 2019 at 12:08, Richard Stallman <rms@gnu.org> wrote:
> Even better, translation could allow replacing that list of messages
> with a different list of messages, perhaps longer. That would
> make possible perfect support for a language where you need a different
> text for 2 and for numbers larger than 2.

Russian, for example, uses three different grammatical cases, which
depend on the last digit of the number, so the system needs to be more
complicated.

Regards,
Elias

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-08 4:33 ` Elias Mårtenson @ 2019-03-08 8:22 ` Eli Zaretskii 2019-03-09 3:11 ` Richard Stallman 1 sibling, 0 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-08 8:22 UTC (permalink / raw) To: Elias Mårtenson; +Cc: eggert, emacs-devel, rms, juri > From: Elias Mårtenson <lokedhs@gmail.com> > Date: Fri, 8 Mar 2019 12:33:24 +0800 > Cc: Paul Eggert <eggert@cs.ucla.edu>, Eli Zaretskii <eliz@gnu.org>, Juri Linkov <juri@linkov.net>, > emacs-devel <emacs-devel@gnu.org> > > Russian, for example, uses three different grammatical cases, which are dependent on the last digit of the > number, the system needs to be more complicated. It's more complicated than that (e.g., 21 and 11 produce different forms in Russian), but gettext already has infrastructure for all that, AFAIR. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-08 4:33 ` Elias Mårtenson
2019-03-08 8:22 ` Eli Zaretskii
@ 2019-03-09 3:11 ` Richard Stallman
2019-03-09 7:54 ` Paul Eggert
1 sibling, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-09 3:11 UTC (permalink / raw)
To: Elias MÃ¥rtenson; +Cc: eliz, eggert, emacs-devel, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> Russian, for example, uses three different grammatical cases, which are
> dependent on the last digit of the number, the system needs to be more
> complicated.

Here's an idea for a scheme general enough to handle Russian as well.
I propose something like a case or select construct.

First, the elegant Lispy way to represent it:

    (numeric-case NUMBER
      (1 "Just one frob")
      (2 "Two frobs")
      (russian-masc "%d-m frobs")
      (russian-fem "%d-f frobs")
      (russian-neut "%d-n frobs")
      (t "%d frobs"))

Translation would have to replace the entire numeric-case construct
with another (translated) numeric-case construct. Thus, the source code
would contain one suitable for English:

    (numeric-case NUMBER
      (1 "one frob")
      (t "%d frobs"))

and for Russian we would translate it into this one

    (numeric-case NUMBER
      (russian-masc "%d-m frobs")
      (russian-fem "%d-f frobs")
      (russian-neut "%d-n frobs"))

I think this framework could be extended to handle whatever other
weird grammatical rules we might encounter in other languages in the
future.

While doing it with Lisp syntax is elegant, it would require
generalization of the infrastructure for recording translations to
handle more than strings. That would be a pain.

Here's a way to represent the conditional construct as a kind of
string. That way, translation would only need to translate strings
into strings. We could use | in the string to separate alternatives,
and : to end a condition.

It would look like this:

    (numeric-case NUMBER "1:one frob|\
    t:%d frobs")

For Russian, we would translate the source string

    1:one frob|t:%d frobs

into

    russian-masc:%d-m frobs|russian-fem:%d-f frobs|russian-neut:%d-n frobs

The subsequences : and | would be handled by the function numeric-case.
They would not affect the meaning of the string data type as such.
numeric-case would ignore whitespace after |.

With this string convention, we only need to translate strings.

To include a | in an alternative, you could write a double |.
We do not need a way to quote a colon.

Perhaps one could develop a smarter 'russian' alternative that knows
how to change the last letter automatically and handles all three
alternatives.

Maybe we need to define a format-spec for devouring and ignoring one
argument.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
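The string convention described above ("|" between alternatives, ":"
after each condition, "||" for a literal bar, whitespace skipped after
a separator) could be parsed along the following lines. This is only a
sketch under assumptions: the function names are hypothetical, only
numeric conditions and "t" are understood, and named conditions such as
russian-masc are simply skipped because their definitions are not
specified in the message.

```elisp
(defun my-nc-split (spec)
  "Split SPEC at each single `|'; a doubled `||' yields a literal `|'.
Whitespace directly after a separator is skipped."
  (let ((parts '()) (current "") (i 0) (len (length spec)))
    (while (< i len)
      (let ((ch (aref spec i)))
        (cond
         ;; Doubled bar: literal `|' inside the current alternative.
         ((and (eq ch ?\|) (< (1+ i) len) (eq (aref spec (1+ i)) ?\|))
          (setq current (concat current "|")
                i (+ i 2)))
         ;; Single bar: the current alternative ends here.
         ((eq ch ?\|)
          (push current parts)
          (setq current ""
                i (1+ i))
          (while (and (< i len) (memq (aref spec i) '(?\s ?\t ?\n)))
            (setq i (1+ i))))
         (t
          (setq current (concat current (string ch))
                i (1+ i))))))
    (push current parts)
    (nreverse parts)))

(defun my-nc-parse (spec)
  "Parse SPEC into a list of (CONDITION . TEXT) pairs."
  (mapcar (lambda (part)
            (let ((pos (string-match ":" part)))
              (unless pos
                (error "Alternative without a condition: %S" part))
              ;; Only the first colon ends the condition.
              (cons (substring part 0 pos)
                    (substring part (1+ pos)))))
          (my-nc-split spec)))

(defun my-numeric-case (number spec)
  "Format the first alternative in SPEC whose condition matches NUMBER.
Return nil if no alternative matches."
  (let ((clauses (my-nc-parse spec))
        (result nil))
    (while (and clauses (not result))
      (let* ((clause (pop clauses))
             (condition (car clause)))
        (when (or (equal condition "t")
                  (and (string-match-p "\\`[0-9]+\\'" condition)
                       (= number (string-to-number condition))))
          ;; `format' ignores the unused argument when the text
          ;; contains no %d.
          (setq result (format (cdr clause) number)))))
    result))
```

With the English source string from the message, 1 selects the
"one frob" alternative and any other number falls through to the "t"
alternative. Because only the first colon in an alternative is treated
as the end of the condition, later colons in the text need no quoting.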
* Re: Emacs i18n
2019-03-09 3:11 ` Richard Stallman
@ 2019-03-09 7:54 ` Paul Eggert
2019-03-09 10:30 ` Eli Zaretskii
` (3 more replies)
0 siblings, 4 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-09 7:54 UTC (permalink / raw)
To: rms; +Cc: eliz, emacs-devel, Elias Mårtenson, juri

Richard Stallman wrote:
> Here's an idea for a scheme general enough to handle Russian as well.

That idea's use of "-masc", "-fem", and "-neut" suggests that you
misunderstood the problem with translating format strings like
"%d items" into Russian.

Russian has three plural forms useful for translating a string that
formats an integer N. One form is for when (N%10 == 1 && N%100/10 != 1),
one is for when (2 <= N%10 && N%10 <= 4 && N%100/10 != 1), and one is
for everything else. So the form depends on N, not on whether the
translation of the word "items" is masculine or feminine or whatever.

Other languages have other rules, with varying levels of complexity;
for example, Arabic has six different plural forms.

GNU gettext deals with this at the translation level, so that ordinary
programs can just use a function like ngettext to translate an
English-language format with two plural forms. Emacs Lisp should do
something similar: we shouldn't try to reinvent this wheel.

Here's an example, taken from GNU dd. The C source code contains the
two English forms and looks something like this:

    fprintf (stderr,
             ngettext ("%"PRIuMAX" byte copied, %s, %s",
                       "%"PRIuMAX" bytes copied, %s, %s",
                       w_bytes),
             w_bytes, delta_s_buf, bytes_per_second);

And the ru.po file (which Russian translators edit) looks like this:

    "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
    ...
    #: src/dd.c:822
    #, c-format
    msgid "%<PRIuMAX> byte copied, %s, %s"
    msgid_plural "%<PRIuMAX> bytes copied, %s, %s"
    msgstr[0] "%<PRIuMAX> байт скопирован, %s, %s"
    msgstr[1] "%<PRIuMAX> байта скопировано, %s, %s"
    msgstr[2] "%<PRIuMAX> байт скопировано, %s, %s"

Each of the three Russian plural forms is supported, and the right one
is chosen by the translation system without the programmer having to
know how Russian plural forms work.

For more about this, please see the GNU gettext manual, such as this
web page:

https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html

PS. Although the email from Elias said "From:
=?UTF-8?Q?Elias_M=C3=A5rtenson?=" which displays correctly as "Elias
Mårtenson", your reply said "To: Elias =?iso-8859-1?Q?M=C3=A5rtenson?="
which displays incorrectly as "Elias MÃ¥rtenson". It looks like there's
a bug in your email client, or in your configuration of it, a bug that
munges names of your email correspondents.

^ permalink raw reply [flat|nested] 151+ messages in thread
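As an illustration of how little code the selection step needs, the
Plural-Forms expression from the ru.po excerpt above can be transcribed
into Emacs Lisp. The function and variable names are hypothetical, %d
stands in for %<PRIuMAX>, and the two trailing %s fields are dropped to
keep the sketch short. This is not how gettext itself is implemented,
only a restatement of the rule.

```elisp
(defun my-russian-plural-index (n)
  "Return 0, 1 or 2 for N, following the Plural-Forms rule in ru.po."
  (cond ((and (= (% n 10) 1) (/= (% n 100) 11)) 0)
        ((and (>= (% n 10) 2) (<= (% n 10) 4)
              (or (< (% n 100) 10) (>= (% n 100) 20)))
         1)
        (t 2)))

;; The three msgstr entries from the excerpt, shortened as noted above.
(defvar my-russian-bytes-copied
  ["%d байт скопирован"
   "%d байта скопировано"
   "%d байт скопировано"])

(defun my-russian-bytes-message (n)
  "Format N with the msgstr entry that the plural rule selects."
  (format (aref my-russian-bytes-copied (my-russian-plural-index n)) n))
```

The rule sends 1, 21 and 101 to index 0; 2, 22 and 24 to index 1; and
0, 5, 11, 12 and 111 to index 2. The calling code only passes N, and
the complete wordings stay in the translator's file.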
* Re: Emacs i18n 2019-03-09 7:54 ` Paul Eggert @ 2019-03-09 10:30 ` Eli Zaretskii 2019-03-10 3:05 ` Richard Stallman ` (2 subsequent siblings) 3 siblings, 0 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-09 10:30 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, lokedhs, rms, juri > Cc: Elias Mårtenson <lokedhs@gmail.com>, eliz@gnu.org, > juri@linkov.net, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Fri, 8 Mar 2019 23:54:28 -0800 > > So the form depends on N, not on whether the translation of the word > "items" is masculine or feminine or whatever. It actually depends on both. > GNU gettext deals with this at the translation level, so that ordinary programs > can just use a function like ngettext to translate an English-language format > with two plural forms. Emacs Lisp should do something similar: we shouldn't try > to reinvent this wheel. Right. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 7:54 ` Paul Eggert
2019-03-09 10:30 ` Eli Zaretskii
@ 2019-03-10 3:05 ` Richard Stallman
2019-03-10 6:07 ` Paul Eggert
2019-03-10 8:45 ` Yuri Khan
2019-03-10 3:05 ` Richard Stallman
2019-03-10 3:05 ` Richard Stallman
3 siblings, 2 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-10 3:05 UTC (permalink / raw)
To: Paul Eggert; +Cc: eliz, juri, lokedhs, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> Russian has three plural forms useful for translating a string that formats an
> integer N. One form is for when (N%10 == 1 && N%100/10 != 1), one is for when (2
> <= N%10 && N%10 <= 4 && N%100/10 != 1), and one is for everything else. So the
> form depends on N, not on whether the translation of the word "items" is
> masculine or feminine or whatever.

That's how I understood it, and that is exactly what my proposal does.
I will try to explain it again.

Each clause inside numeric-case handles certain numbers. The car of
the clause (in Lispy structure) selects numbers to handle.

'russian-masc' selects numbers that require a masculine ending, in
Russian. You use it with a string that contains the masculine ending.

'russian-fem' selects numbers that require a feminine ending, in
Russian. You use it with a string that contains the feminine ending.

'russian-neut' selects numbers that require a neuter ending, in
Russian. You use it with a string that contains the neuter ending.

If this does not work, why not?

In the example that was sent, I see code that tests for certain kinds
of numbers. But since I don't know the language that that is written
in, the mathematical conditions are the only part I understand. I
don't see what it will _do_ in each of those conditions. I presume it
selects the appropriate suffix for the number, but I don't follow how
it does so.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-10 3:05 ` Richard Stallman
@ 2019-03-10 6:07 ` Paul Eggert
2019-03-11 1:20 ` Richard Stallman
2019-03-10 8:45 ` Yuri Khan
1 sibling, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-10 6:07 UTC (permalink / raw)
To: rms; +Cc: eliz, juri, lokedhs, emacs-devel

Richard Stallman wrote:
> If this does not work, why not?

Thanks for explaining the -masc, -fem, -neut part. I'm afraid, though,
that I still don't fully understand the proposal. It sounds like it is
a redesign of what GNU gettext does, but I don't see any advantage over
GNU gettext.

> In the example that was sent, I see code that tests for certain kinds
> of numbers. But since I don't know the language that that is written
> in, the mathematical conditions are the only part I understand. I
> don't see what it will _do_ in each of those conditions. I presume it
> selects the appropriate suffix for the number, but I don't follow how
> it does so.

The GNU gettext translation code doesn't know anything about suffixes.
All it knows is that if n%10==1 && n%100!=11 then it should use
msgstr[0], else if n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) then
it should use msgstr[1], else it should use msgstr[2]. The translations
themselves are string formats that already have the proper suffixes,
and GNU gettext simply copies those suffixes.

This is a simple scheme that does not attempt to solve the problem of
generating idiomatic phrases for numbers (e.g., "twenty-four bytes" in
English, "двадцать четыре байта" in Russian). All it solves is the
problem of generating phrases containing numerals (e.g., "24 bytes" in
English, "24 байта" in Russian), as these are the sorts of phrases that
printf formats can generate. In practice, this is good enough.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-10 6:07 ` Paul Eggert @ 2019-03-11 1:20 ` Richard Stallman 2019-03-11 3:52 ` Paul Eggert 0 siblings, 1 reply; 151+ messages in thread From: Richard Stallman @ 2019-03-11 1:20 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, juri, lokedhs, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Thanks for explaining the -masc, -fem, -neut part. I'm afraid, though, that I > still don't fully understand the proposal. It sounds like it is a redesign of > what GNU gettext does, but I don't see any advantage over GNU gettext. The advantage -- which is a big one -- is that the way the translation is represented is much cleaner. Compare this (numeric-case NUMBER (russian-masc "%d байт скопирован, %s, %s") (russian-fem "%d байта скопировано, %s, %s") (russian-neut "%d байт скопировано, %s, %s")) (I have filled in strings for the real example you sent. Since I don't speak Russian, I was unable to write one myself, and it would have taken me hours to find one.) or this: "russian-masc:%d байт скопирован, %s, %s|\ russian-fem:%d байта скопировано, %s, %s|\ russian-neut:%d байт скопировано, %s, %s" with this: "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n" ... #: src/dd.c:822 #, c-format msgid "%<PRIuMAX> byte copied, %s, %s" msgid_plural "%<PRIuMAX> bytes copied, %s, %s" msgstr[0] "%<PRIuMAX> байт скопирован, %s, %s" msgstr[1] "%<PRIuMAX> байта скопировано, %s, %s" msgstr[2] "%<PRIuMAX> байт скопировано, %s, %s" If the selector symbol can modify the string too, I can envision something like this: "russian-nom:%d байт%| скопирован%|, %s, %s" where the 'russian-nom' operator would replace the two %| sequences with the appropriate declensional suffixes for the nominative case. 
Building that sort of thing into gettext would be bad architecture. Gettext is too low level, and used in too many places. Making Emacs handle 'russian-nom' in a string it pulls out of gettext would be no problem at all. > This is a simple scheme that does not attempt to solve the problem of generating > idiomatic phrases for numbers (e.g., "twenty-four bytes" in English, I agree we don't need to do this. But, with the mechanism I've just proposed, it would be easy to do, so I suppose we would implement it. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-11 1:20 ` Richard Stallman
@ 2019-03-11 3:52 ` Paul Eggert
2019-03-12 3:31 ` Richard Stallman
2019-03-12 3:31 ` Richard Stallman
0 siblings, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-11 3:52 UTC (permalink / raw)
To: rms; +Cc: eliz, juri, lokedhs, emacs-devel

Richard Stallman wrote:
> Compare this
>
>     (numeric-case NUMBER
>       (russian-masc "%d байт скопирован, %s, %s")
>       (russian-fem "%d байта скопировано, %s, %s")
>       (russian-neut "%d байт скопировано, %s, %s"))
>
> with this:
>
>     "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
>     ...
>     #: src/dd.c:822
>     #, c-format
>     msgid "%<PRIuMAX> byte copied, %s, %s"
>     msgid_plural "%<PRIuMAX> bytes copied, %s, %s"
>     msgstr[0] "%<PRIuMAX> байт скопирован, %s, %s"
>     msgstr[1] "%<PRIuMAX> байта скопировано, %s, %s"
>     msgstr[2] "%<PRIuMAX> байт скопировано, %s, %s"

I'm afraid that's not an apples-to-apples comparison. The first form
contains only the Russian translations, whereas the second form
contains much more information: the source-code location of the
untranslated strings, a copy of the untranslated English-language
strings, and the general rules for Russian (the last is shared among
all the Russian translations, not just the translations listed here).
This extra information is useful for translators, and it has a
reasonably extensive software suite that already supports it, not to
mention translators who are already used to it.

> I can envision something like this:
>
> "russian-nom:%d байт%| скопирован%|, %s, %s"
>
> where the 'russian-nom' operator would replace the two %| sequences
> with the appropriate declensional suffixes for the nominative case.

But Russian declension is not that simple. The Russian word for "byte"
is "байт", but its plural form depends not only on the number (as in
the above examples) but also on its case: the "байт" and "байта" in the
above examples are not exhaustive. And some words have irregular
declensions: for example, ребёнок (singular) versus де́ти (plural) for
the same noun. And it's not just nouns and pronouns that are affected:
adjectives also have singular and plural forms. And I have by no means
exhausted the issues involved here; to get a better feeling for the
complexity in this area, please see:

https://en.wikipedia.org/wiki/Russian_declension

Although it wouldn't be impossible for Emacs Lisp code to handle all
the special cases for Russian declension, it would be tricky to
implement, or to document it in a way that translators would easily
understand. And we'd also have to implement and document similarly
tricky rules for other languages. And we'd have to deal with the fact
that not every Russian-speaker agrees with how to decline words like
"байт" that are imported from English. These sorts of issues should be
delegated to translators, not to likely-fragile code in Emacs Lisp (a
technology that translators typically do not grok).

In contrast, the gettext way is relatively simple and easily
understood, and is already common practice.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-11 3:52 ` Paul Eggert
@ 2019-03-12 3:31 ` Richard Stallman
2019-03-12 3:31 ` Richard Stallman
1 sibling, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-12 3:31 UTC (permalink / raw)
To: Paul Eggert; +Cc: eliz, emacs-devel, lokedhs, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > I can envision something like this:
> >
> > "russian-nom:%d байт%| скопирован%|, %s, %s"
> >
> > where the 'russian-nom' operator would replace the two %| sequences
> > with the appropriate declensional suffixes for the nominative case.

> But Russian declension is not that simple. The Russian word for "byte" is
> "байт", but its plural form depends not only on the number (as in the above
> examples) but also on its case:

Yes, of course. I anticipated that. That is why I called the construct
'russian-nom', and specified that it provides "the appropriate
declensional suffixes for the nominative case."

We could define similar constructs for some of the other cases in
Russian, whichever ones translators would want to use.

> the "байт" and "байта" in the above examples are not exhaustive.

No problem. Nobody supposed that they were.

> And some words have irregular declensions:

I anticipated that, too. The low-level forms 'russian-masc' and friends
can handle all such situations. In them you can specify the precise
conjugated forms for the irregular words in the message.

> And it's not just nouns and
> pronouns that are affected: adjectives also have singular and plural forms.

'russian-masc' and friends allow explicit conjugation of any parts of
speech.

> And we'd have
> to deal with the fact that not every Russian-speaker agrees with how to decline
> words like "байт" that are imported from English.

The translator is always welcome to use the low-level constructs
'russian-masc' and friends, to exercise explicit control over that.

> I have by no means exhausted the issues involved here; to get a better feeling
> for the complexity in this area, please see:
> https://en.wikipedia.org/wiki/Russian_declension

I don't need to understand all the details of Russian numbers. I've
designed a method so flexible that it can handle any such complexities.

> And we'd also have to
> implement and document similarly tricky rules for other languages.

No, we don't. With my approach, we don't _have to_ implement any of
these specific solutions. We can implement whichever ones we like.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-11 3:52 ` Paul Eggert
2019-03-12 3:31 ` Richard Stallman
@ 2019-03-12 3:31 ` Richard Stallman
1 sibling, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-12 3:31 UTC (permalink / raw)
To: Paul Eggert; +Cc: eliz, emacs-devel, lokedhs, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> I'm afraid that's not an apples-to-apples comparison.

It doesn't need to be.

> The first form contains
> only the Russian translations, whereas the second form contains much more
> information: the source-code location of the untranslated strings, a copy of the
> untranslated English-language strings, and the general rules for Russian (the
> last is shared among all the Russian translations, not just the translations
> listed here).

I can't draw any conclusions about the translation data you sent. It is
in a format I have never seen and you have not explained it. So I don't
try to understand it.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-10 3:05 ` Richard Stallman
2019-03-10 6:07 ` Paul Eggert
@ 2019-03-10 8:45 ` Yuri Khan
1 sibling, 0 replies; 151+ messages in thread
From: Yuri Khan @ 2019-03-10 8:45 UTC (permalink / raw)
To: rms
Cc: Eli Zaretskii, Paul Eggert, Elias Mårtenson, Emacs developers, Juri Linkov

On Sun, Mar 10, 2019 at 10:06 AM Richard Stallman <rms@gnu.org> wrote:
> 'russian-neut' selects numbers that require a neuter ending, in
> Russian. You use it with a string that contains the neuter ending.
>
> If this does not work, why not?

You are conflating three grammatical categories: the number, the
gender, and the declension type.

Gender and declension type are attributes of the noun, and are fixed
with respect to the noun. So if your message is about bytes, your
translator knows to use noun endings according to declension type 1a
and verb endings for masculine gender; there is nothing left for the
machine to guess.

(Gender of the noun also affects the form of the numeral if it is
spelled out, but for computer-generated messages we usually do not do
that and just use digits.)

Number depends on the numeral’s value and affects the forms of the
noun, and any adjectives and verbs attached to it. Singular number
applies to values that end in 1, except for values that end in 11. Dual
number applies to values that end in 2..4, again, except for values
that end in 12..14. Plural number applies to everything else. Any
grammatical number can apply to any noun, so the translator will
provide all three wordings and let the machine select one using the
above logic.

Your example would work if you changed -masc, -fem and -neut to -sing,
-dual and -pl. But that is, as Paul mentioned, reinventing ngettext(3).

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 7:54 ` Paul Eggert
2019-03-09 10:30 ` Eli Zaretskii
2019-03-10 3:05 ` Richard Stallman
@ 2019-03-10 3:05 ` Richard Stallman
2019-03-10 6:14 ` Paul Eggert
2019-03-10 3:05 ` Richard Stallman
3 siblings, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-10 3:05 UTC (permalink / raw)
To: Paul Eggert; +Cc: eliz, juri, lokedhs, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> Russian has three plural forms useful for translating a string that formats an
> integer N. One form is for when (N%10 == 1 && N%100/10 != 1), one is for when (2
> <= N%10 && N%10 <= 4 && N%100/10 != 1), and one is for everything else. So the
> form depends on N, not on whether the translation of the word "items" is
> masculine or feminine or whatever.

I know that. That is the problem I addressed.

Each clause inside numeric-select tests for and handles certain
numbers. The first thing in the clause is a condition that tests the
number. If the condition is a number, it matches only that number.

'russian-masc' tests for numbers that require a masculine noun ending;
you use it with a string that contains the masculine ending.
'russian-fem' tests for numbers that require a feminine noun ending;
you use it with a string that contains the feminine ending.
'russian-neut' tests for numbers that require a neuter noun ending;
you use it with a string that contains the neuter ending.

Since I do not speak Russian, I wrote dummies for those endings: -m,
-f and -n. Thus,

  (numeric-case NUMBER
    (russian-masc "%d-m frobs")
    (russian-fem "%d-f frobs")
    (russian-neut "%d-n frobs"))

Do you follow, now?

> "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4
> && (n%100<10 || n%100>=20) ? 1 : 2);\n"
> ...
> #: src/dd.c:822
> #, c-format
> msgid "%<PRIuMAX> byte copied, %s, %s"
> msgid_plural "%<PRIuMAX> bytes copied, %s, %s"

It would be better if we can define these criteria just once, rather
than restate them in many places. My idea is to incorporate it into
the definition of the conditionals, russian-masc etc.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-10 3:05 ` Richard Stallman @ 2019-03-10 6:14 ` Paul Eggert 0 siblings, 0 replies; 151+ messages in thread From: Paul Eggert @ 2019-03-10 6:14 UTC (permalink / raw) To: rms; +Cc: eliz, juri, lokedhs, emacs-devel Richard Stallman wrote: > > "Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 > > && (n%100<10 || n%100>=20) ? 1 : 2);\n" > > It would be better if we can define these criteria just once, rather > than restate them in many places. The criteria are stated just once per translation catalog. For example, the "Plural-forms:" line appears just once in the Russian translation catalog for coreutils. The criteria need not be repeated for each translated message. If Emacs ends up having dozens or hundreds of message catalogs, it may be worth looking into maintaining just one copy of the Russian criteria, rather than once per Russian translation catalog. I hope we don't go that route, though. ^ permalink raw reply [flat|nested] 151+ messages in thread
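The point that the criteria are stated once per catalog, not once per message, can be illustrated with a toy in-memory catalog. This is a hedged Python sketch: the `Catalog` class, its method name and the placeholder forms are hypothetical and only mimic the idea of a PO header plus entries; the rule itself is a transcription of the C expression in the quoted "Plural-Forms:" line.

```python
def russian_plural(n: int) -> int:
    """Python transcription of the quoted Plural-Forms expression (nplurals=3)."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
        return 1
    return 2


class Catalog:
    """Toy catalog: one plural rule for the whole catalog, many entries."""

    def __init__(self, plural_rule, entries):
        self.plural_rule = plural_rule  # stated once, like the PO header line
        self.entries = entries          # singular msgid -> tuple of forms

    def ngettext(self, singular, plural, n):
        forms = self.entries.get(singular)
        if forms is None:
            # Not in the catalog: fall back to the English rule.
            return singular if n == 1 else plural
        return forms[self.plural_rule(n)]


# Placeholder forms; real entries would hold the translators' strings.
ru = Catalog(russian_plural, {
    "%d byte copied": ("<copied form 0> %d", "<copied form 1> %d", "<copied form 2> %d"),
    "%d file removed": ("<removed form 0> %d", "<removed form 1> %d", "<removed form 2> %d"),
})
```

Both entries reuse the same `russian_plural` rule; adding a third message means adding three strings, not restating the arithmetic.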
* Re: Emacs i18n 2019-03-09 7:54 ` Paul Eggert ` (2 preceding siblings ...) 2019-03-10 3:05 ` Richard Stallman @ 2019-03-10 3:05 ` Richard Stallman 3 siblings, 0 replies; 151+ messages in thread From: Richard Stallman @ 2019-03-10 3:05 UTC (permalink / raw) To: Paul Eggert; +Cc: eliz, juri, lokedhs, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > which displays correctly as "Elias Mårtenson", your reply said "To: Elias > =?iso-8859-1?Q?M=C3=A5rtenson?=" which displays incorrectly as "Elias > MÃ¥rtenson". It looks like there's a bug in your email client, or in your > configuration of it, a bug that munges names of your email correspondents. Indeed, it is a bug. Maybe someday I can fix it. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
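The header mix-up quoted above (UTF-8 bytes labelled as iso-8859-1) is easy to reproduce. The following Python sketch is only a demonstration of the mechanism, not a diagnosis of any particular mail client; it uses the standard `email.header.decode_header` function to decode the encoded word exactly as labelled.

```python
from email.header import decode_header

name = "Mårtenson"

# UTF-8 encodes "å" as the two bytes C3 A5.
utf8_bytes = name.encode("utf-8")

# Reading those bytes as ISO-8859-1 turns each byte into its own character.
mislabelled = utf8_bytes.decode("iso-8859-1")

# The encoded word from the quoted header carries the UTF-8 bytes
# but claims they are ISO-8859-1.
raw, charset = decode_header("=?iso-8859-1?Q?M=C3=A5rtenson?=")[0]
as_displayed = raw.decode(charset)

# The damage is reversible as long as nothing else touched the bytes.
repaired = as_displayed.encode("iso-8859-1").decode("utf-8")
```

Here `mislabelled` and `as_displayed` are both the garbled "MÃ¥rtenson", and `repaired` is the original name again.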
* Re: Emacs i18n
2019-03-06 18:15 ` Eli Zaretskii
2019-03-06 19:47 ` Paul Eggert
@ 2019-03-07 3:42 ` Richard Stallman
2019-03-07 14:46 ` Eli Zaretskii
1 sibling, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-07 3:42 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: juri, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> You mean, the function, such as 'message', that receives the string
> will translate it? As opposed to the alternative of translating the
> string _before_ it gets passed to the function?

Yes, of course.

> If we do that, how do we deal with strings that are computed by
> concatenation or formatting?

Feed them in through %s or something like that.

I'm proposing the convention that the first argument to 'message' gets
by default translated, and other arguments don't. With this
convention, whichever result you want, it is clear how to get it.

We already do things basically this way, because if you want to
compute a string to be the message, you don't want % to be treated
specially in it. So you use "%s" as the first argument and pass that
string as the second.

So I think this will require only occasional changes and they won't be
urgent.

> They get in one piece to functions like
> 'message', but the catalog will not hold that concatenated string, it
> will have the parts separately.

That would happen if the catalog is made ONLY by scanning the source.
That's why I suggested a feature to record whatever nontrivial format
strings are passed to 'message' and are not in the catalog. Then you
can add them to the catalog, or fix things some other way.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
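The convention proposed above (translate only the first argument, pass computed text through "%s", and record format strings missing from the catalog) can be sketched as follows. This is a Python illustration under stated assumptions, not Emacs code: `CATALOG`, `unlisted_formats` and this `message` function are hypothetical stand-ins, and treating only the bare "%s" format as trivial is one possible reading of "nontrivial".

```python
# Hypothetical catalog: English format string -> translated format string.
CATALOG = {"Wrote %s": "<translated> Wrote %s"}

# Format strings seen at run time that the catalog does not contain.
unlisted_formats = []


def message(fmt, *args):
    """Translate only the format string; substitute the arguments untouched."""
    translated = CATALOG.get(fmt)
    if translated is None:
        translated = fmt  # fall back to the English text
        # Record anything but the trivial pass-through format for later review.
        if fmt != "%s" and fmt not in unlisted_formats:
            unlisted_formats.append(fmt)
    return translated % args
```

For example, `message("Wrote %s", "foo.txt")` uses the catalog entry, `message("%s", "100% done")` passes a computed string through without interpreting its "%", and `message("Deleted %s", "x")` falls back to English while leaving "Deleted %s" in `unlisted_formats` for someone to add to the catalog.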
* Re: Emacs i18n
2019-03-07 3:42 ` Richard Stallman
@ 2019-03-07 14:46 ` Eli Zaretskii
2019-03-07 17:19 ` Paul Eggert
2019-03-08 4:11 ` Richard Stallman
0 siblings, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-07 14:46 UTC (permalink / raw)
To: rms; +Cc: juri, eggert, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org, juri@linkov.net
> Date: Wed, 06 Mar 2019 22:42:06 -0500
>
> > If we do that, how do we deal with strings that are computed by
> > concatenation or formatting?
>
> Feed them in through %s or something like that.

But then the strings that are formatted via %s will not be translated,
they will remain in English.

> I'm proposing the convention that the first argument to 'message' gets
> by default translated, and other arguments don't. With this
> convention, whichever result you want, it is clear how to get it.
>
> We already do things basically this way, because if you want to
> compute a string to be the message, you don't want % to be treated
> specially in it. So you use "%s" as the first argument and pass that
> string as the second.

For the point I'm trying to make, it is immaterial whether the first
argument is "%s" and the second argument is computed from several
sources, or the first argument is that computed string. The problems
that follow are the same.

> > They get in one piece to functions like
> > 'message', but the catalog will not hold that concatenated string, it
> > will have the parts separately.
>
> That would happen if the catalog is made ONLY by scanning the source.
> That's why I suggested a feature to record whatever nontrivial format
> strings are passed to 'message' and are not in the catalog.

Such a feature will only help when a given call to 'message' produces
a small number of fixed text strings. If the text it produces
includes some non-deterministic ingredient, this method will not help.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 14:46 ` Eli Zaretskii
@ 2019-03-07 17:19 ` Paul Eggert
2019-03-07 18:24 ` martin rudalics
2019-03-07 20:22 ` Eli Zaretskii
2019-03-08 4:11 ` Richard Stallman
1 sibling, 2 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-07 17:19 UTC (permalink / raw)
To: Eli Zaretskii, rms; +Cc: juri, emacs-devel

[-- Attachment #1: Type: text/plain, Size: 3615 bytes --]

On 3/7/19 6:46 AM, Eli Zaretskii wrote:
>> From: Richard Stallman <rms@gnu.org>
>> Cc: eggert@cs.ucla.edu, emacs-devel@gnu.org, juri@linkov.net
>> Date: Wed, 06 Mar 2019 22:42:06 -0500
>>
>>> If we do that, how do we deal with strings that are computed by
>>> concatenation or formatting?
>>
>> Feed them in through %s or something like that.
>
> But then the strings that are formatted via %s will not be translated,
> they will remain in English.

Yes, but the scenario you describe should not occur in a properly
internationalized GNU application.

We obviously can't assume that Emacs's translation subroutine acts like
Google Translate and can translate any English-language string to the
user's language. All we can assume is that the translation subroutine
converts one of a fixed set of English-language strings to a string
appropriate for the user's language.

This limitation will cause problems with Elisp code that does extensive
parsing or processing of English syntax (doctor.el, say), and that sort
of Elisp code will remain English-only (unless someone takes the time to
i18nize it specially). However most Elisp code does not parse English
or generate idiomatic English on the fly: instead, it uses a fixed,
stilted style that can routinely be converted to calls like (message
FORMAT ARG1 ARG2 ...) where FORMAT is translated and the ARG values are
not.

To get a quick feel for this issue, I did a simple grep for the string
'(message (concat' in the Emacs source code. I found 41 instances of
this string. Of these, 8 were erroneous because the result of the
concatenation could contain an unwanted "%" or '`' that could cause
'message' to go awry, and I fixed them by installing the attached patch
(by the way, it's routine for i18n efforts to find trivial bugs like
this). The other 33 instances could easily be reworded to do proper
i18n when 'message' translates just its first argument and only simple,
xgettext-style static analysis is used to find the message strings. For
example, this code in calc-do-embedded:

  (message (concat "Embedded Calc mode enabled; "
                   (if calc-embedded-quiet
                       "Type `C-x * x'"
                     "Give this command again")
                   " to return to normal"))

can easily be rewritten to this:

  (message (if calc-embedded-quiet
               "Embedded Calc mode enabled; Type `C-x * x' to return to normal"
             "Embedded Calc mode enabled; Give this command again to return to normal"))

which is easier for translators to grok, and is arguably clearer even if
we don't want to translate at all.

Obviously my '(message (concat' search exercises only a small part of
the problem, but the results of this little sample are encouraging. So
this problem is solvable. Sure, it'll require substantial work, but the
work is routine and this sort of thing has been done for other packages.

The main argument against doing all this is that it's too much work
overall and nobody will have the time to do it all, so let's not even
bother. I have some sympathy for this argument, as i18n is clearly too
much work for any single contributor and the work will distract us from
other things. On the other hand, there's no pressing need to do all the
work quickly, it's a low-level task that can be farmed out to non-expert
volunteers that could conceivably grow the volunteer population, and if
we never even start the work then it will never get done and Emacs will
remain unfriendly to users who don't grok English.
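Why the reworded form translates and the concatenated form does not can be shown with a toy lookup. This Python sketch is illustrative only: the two dictionaries stand in for what a static, xgettext-style extractor would collect from each version of the calc-do-embedded code, the `lookup` helper is hypothetical, and the "<translated ...>" values are placeholders.

```python
# What a static extractor would see in the concat version: four fragments.
FRAGMENT_CATALOG = {
    "Embedded Calc mode enabled; ": "<translated prefix>",
    "Type `C-x * x'": "<translated key hint>",
    "Give this command again": "<translated repeat hint>",
    " to return to normal": "<translated suffix>",
}

# What it would see in the reworded version: two complete sentences.
WHOLE_CATALOG = {
    "Embedded Calc mode enabled; Type `C-x * x' to return to normal":
        "<translated sentence, quiet variant>",
    "Embedded Calc mode enabled; Give this command again to return to normal":
        "<translated sentence, verbose variant>",
}


def lookup(catalog, text):
    """Return the translation of text, or text itself if the catalog lacks it."""
    return catalog.get(text, text)


def concatenated(quiet):
    """Build the message at run time, as the original code does."""
    return ("Embedded Calc mode enabled; "
            + ("Type `C-x * x'" if quiet else "Give this command again")
            + " to return to normal")


def whole(quiet):
    """Select one of two complete format strings, as the reworded code does."""
    if quiet:
        return "Embedded Calc mode enabled; Type `C-x * x' to return to normal"
    return "Embedded Calc mode enabled; Give this command again to return to normal"
```

With the fragment catalog, `lookup(FRAGMENT_CATALOG, concatenated(True))` returns the English text unchanged, because the assembled sentence was never a key. With the whole-sentence catalog, `lookup(WHOLE_CATALOG, whole(True))` finds its entry, and the translator gets a full sentence to work with instead of pieces whose order is fixed by English grammar.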
[-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #2: 0001-Be-safer-about-in-message-formats.patch --] [-- Type: text/x-patch; name="0001-Be-safer-about-in-message-formats.patch", Size: 7751 bytes --] From f15d0d0247ffe7bc3bbd5fbe10271c93b2e2fb1c Mon Sep 17 00:00:00 2001 From: Paul Eggert <eggert@cs.ucla.edu> Date: Thu, 7 Mar 2019 09:02:15 -0800 Subject: [PATCH] Be safer about "%" in message formats MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * lisp/calc/calc-store.el (calc-copy-special-constant): * lisp/net/rcirc.el (rcirc-handler-PART, rcirc-handler-KICK): * lisp/org/org-agenda.el (org-agenda): * lisp/org/org-clock.el (org-clock-out, org-clock-display): * lisp/org/org.el (org-refile): * lisp/progmodes/ada-xref.el (ada-goto-declaration): * lisp/progmodes/idlwave.el (idlwave-scan-library-catalogs): Don’t trust arbitrary strings to not contain "%" or "`" in (message (concat STRING1 STRING2 ...)). --- lisp/calc/calc-store.el | 4 ++-- lisp/net/rcirc.el | 4 ++-- lisp/org/org-agenda.el | 13 ++++++------- lisp/org/org-clock.el | 22 ++++++++++++---------- lisp/org/org.el | 3 ++- lisp/progmodes/ada-xref.el | 3 +-- lisp/progmodes/idlwave.el | 7 +++---- 7 files changed, 28 insertions(+), 28 deletions(-) diff --git a/lisp/calc/calc-store.el b/lisp/calc/calc-store.el index 589a776c41..3987c129c2 100644 --- a/lisp/calc/calc-store.el +++ b/lisp/calc/calc-store.el @@ -405,8 +405,8 @@ calc-copy-special-constant sconst)))) (if var (let ((msg (calc-store-value var value ""))) - (message (concat "Special constant \"%s\" copied to \"%s\"" msg) - sconst (calc-var-name var))))))))) + (message "Special constant \"%s\" copied to \"%s\"%s" + sconst (calc-var-name var) msg)))))))) (defun calc-copy-variable (&optional var1 var2) (interactive) diff --git a/lisp/net/rcirc.el b/lisp/net/rcirc.el index b1a6c1ce8d..9d53cd4436 100644 --- a/lisp/net/rcirc.el +++ b/lisp/net/rcirc.el @@ -2685,7 +2685,7 @@ 
rcirc-handler-PART-or-KICK (defun rcirc-handler-PART (process sender args _text) (let* ((channel (car args)) (reason (cadr args)) - (message (concat channel " " reason))) + (message "%s %s" channel reason)) (rcirc-print process sender "PART" channel message) ;; print in private chat buffer if it exists (when (rcirc-get-buffer (rcirc-buffer-process) sender) @@ -2697,7 +2697,7 @@ rcirc-handler-KICK (let* ((channel (car args)) (nick (cadr args)) (reason (nth 2 args)) - (message (concat nick " " channel " " reason))) + (message "%s %s %s" nick channel reason)) (rcirc-print process sender "KICK" channel message t) ;; print in private chat buffer if it exists (when (rcirc-get-buffer (rcirc-buffer-process) nick) diff --git a/lisp/org/org-agenda.el b/lisp/org/org-agenda.el index e416f5f062..23ee8d71e6 100644 --- a/lisp/org/org-agenda.el +++ b/lisp/org/org-agenda.el @@ -2882,13 +2882,12 @@ org-agenda (let* ((m (org-agenda-get-any-marker)) (note (and m (org-entry-get m "THEFLAGGINGNOTE")))) (when note - (message (concat - "FLAGGING-NOTE ([?] for more info): " - (org-add-props - (replace-regexp-in-string - "\\\\n" "//" - (copy-sequence note)) - nil 'face 'org-warning))))))) + (message "FLAGGING-NOTE ([?] 
for more info): %s" + (org-add-props + (replace-regexp-in-string + "\\\\n" "//" + (copy-sequence note)) + nil 'face 'org-warning)))))) t t)) ((equal org-keys "#") (call-interactively 'org-agenda-list-stuck-projects)) ((equal org-keys "/") (call-interactively 'org-occur-in-agenda-files)) diff --git a/lisp/org/org-clock.el b/lisp/org/org-clock.el index 34b694d487..62c7cd92d1 100644 --- a/lisp/org/org-clock.el +++ b/lisp/org/org-clock.el @@ -1622,9 +1622,10 @@ org-clock-out "\\>")))) (org-todo org-clock-out-switch-to-state)))))) (force-mode-line-update) - (message (concat "Clock stopped at %s after " - (org-duration-from-minutes (+ (* 60 h) m)) "%s") - te (if remove " => LINE REMOVED" "")) + (message (if remove + "Clock stopped at %s after %s => LINE REMOVED" + "Clock stopped at %s after %s") + te (org-duration-from-minutes (+ (* 60 h) m))) (run-hooks 'org-clock-out-hook) (unless (org-clocking-p) (setq org-clock-current-task nil))))))) @@ -1925,13 +1926,14 @@ org-clock-display nil 'local)))) (let* ((h (/ org-clock-file-total-minutes 60)) (m (- org-clock-file-total-minutes (* 60 h)))) - (message (concat (format "Total file time%s: " - (cond (todayp " for today") - (customp " (custom)") - (t ""))) - (org-duration-from-minutes - org-clock-file-total-minutes) - " (%d hours and %d minutes)") + (message (cond + (todayp + "Total file time for today: %s (%d hours and %d minutes)") + (customp + "Total file time (custom): %s (%d hours and %d minutes)") + (t + "Total file time: %s (%d hours and %d minutes)")) + (org-duration-from-minutes org-clock-file-total-minutes) h m)))) (defvar-local org-clock-overlays nil) diff --git a/lisp/org/org.el b/lisp/org/org.el index 3a434d12df..e3c78ae90d 100644 --- a/lisp/org/org.el +++ b/lisp/org/org.el @@ -11878,7 +11878,8 @@ org-refile (when (featurep 'org-inlinetask) (org-inlinetask-remove-END-maybe)) (setq org-markers-to-move nil) - (message (concat actionmsg " to \"%s\" in file %s: done") (car it) file))))))) + (message "%s to \"%s\" in 
file %s: done" actionmsg + (car it) file))))))) (defun org-refile-goto-last-stored () "Go to the location where the last refile was stored." diff --git a/lisp/progmodes/ada-xref.el b/lisp/progmodes/ada-xref.el index 28c52b0653..c9c923e1d6 100644 --- a/lisp/progmodes/ada-xref.el +++ b/lisp/progmodes/ada-xref.el @@ -1133,8 +1133,7 @@ ada-goto-declaration (ada-find-in-ali identlist other-frame) ;; File not found: print explicit error message (ada-error-file-not-found - (message (concat (error-message-string err) - (nthcdr 1 err)))) + (message "%s%s" (error-message-string err) (nthcdr 1 err))) (error (let ((ali-file (ada-get-ali-file-name (ada-file-of identlist)))) diff --git a/lisp/progmodes/idlwave.el b/lisp/progmodes/idlwave.el index 476d935e8a..25bc788ffc 100644 --- a/lisp/progmodes/idlwave.el +++ b/lisp/progmodes/idlwave.el @@ -5588,7 +5588,7 @@ idlwave-scan-library-catalogs (mapcar 'car idlwave-path-alist))) (old-libname "") dir-entry dir catalog all-routines) - (if message-base (message message-base)) + (if message-base (message "%s" message-base)) (while (setq dir (pop dirs)) (catch 'continue (when (file-readable-p @@ -5603,8 +5603,7 @@ idlwave-scan-library-catalogs message-base (not (string= idlwave-library-catalog-libname old-libname))) - (message "%s" (concat message-base - idlwave-library-catalog-libname)) + (message "%s%s" message-base idlwave-library-catalog-libname) (setq old-libname idlwave-library-catalog-libname)) (when idlwave-library-catalog-routines (setq all-routines @@ -5618,7 +5617,7 @@ idlwave-scan-library-catalogs (setq dir-entry (assoc dir idlwave-path-alist))) (idlwave-path-alist-add-flag dir-entry 'lib))))) (unless no-load (setq idlwave-library-catalog-routines all-routines)) - (if message-base (message (concat message-base "done")))))) + (if message-base (message "%sdone" message-base))))) ;;----- Communicating with the Shell ------------------- -- 2.20.1 ^ permalink raw reply related [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 17:19 ` Paul Eggert
@ 2019-03-07 18:24 ` martin rudalics
2019-03-07 18:44 ` Paul Eggert
2019-03-07 20:22 ` Eli Zaretskii
1 sibling, 1 reply; 151+ messages in thread
From: martin rudalics @ 2019-03-07 18:24 UTC (permalink / raw)
To: Paul Eggert, Eli Zaretskii, rms; +Cc: emacs-devel, juri

These

-         (message (concat channel " " reason)))
+         (message "%s %s" channel reason))
     (rcirc-print process sender "PART" channel message)
     ;; print in private chat buffer if it exists
     (when (rcirc-get-buffer (rcirc-buffer-process) sender)
@@ -2697,7 +2697,7 @@ rcirc-handler-KICK
   (let* ((channel (car args))
          (nick (cadr args))
          (reason (nth 2 args))
-         (message (concat nick " " channel " " reason)))
+         (message "%s %s %s" nick channel reason))

get me here

In toplevel form:
../../lisp/net/rcirc.el:2685:1:Warning: Malformed `let*' binding: (message "%s %s" channel reason)
../../lisp/net/rcirc.el:2696:1:Warning: Malformed `let*' binding: (message "%s %s %s" nick channel reason)

martin

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 18:24 ` martin rudalics @ 2019-03-07 18:44 ` Paul Eggert 0 siblings, 0 replies; 151+ messages in thread From: Paul Eggert @ 2019-03-07 18:44 UTC (permalink / raw) To: martin rudalics, Eli Zaretskii, rms; +Cc: emacs-devel, juri On 3/7/19 10:24 AM, martin rudalics wrote: > ../../lisp/net/rcirc.el:2685:1:Warning: Malformed `let*' binding: > (message "%s > %s" channel reason) Oops. Thanks for reporting that. I fixed it in master. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 17:19 ` Paul Eggert 2019-03-07 18:24 ` martin rudalics @ 2019-03-07 20:22 ` Eli Zaretskii 2019-03-07 22:25 ` Paul Eggert 2019-03-08 4:18 ` Richard Stallman 1 sibling, 2 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-07 20:22 UTC (permalink / raw) To: Paul Eggert; +Cc: juri, rms, emacs-devel > Cc: emacs-devel@gnu.org, juri@linkov.net > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Thu, 7 Mar 2019 09:19:35 -0800 > > This limitation will cause problems with Elisp code that does extensive > parsing or processing of English syntax (doctor.el, say), and that sort > of Elisp code will remain English-only (unless someone takes the time to > i18nize them specially). However most Elisp code does not parse English > or generate idiomatic English on the fly: instead, it uses a fixed, > stilted style that can routinely be converted to calls like (message > FORMAT ARG1 ARG2 ...) where FORMAT is translated and the ARG values are not. > > To get a quick feel for this issue, I did a simple grep for the string > '(message (concat' in the Emacs source code. I found 41 instances of > this string. But 'message' is just a representative of a class of such functions. There are others: 'signal', 'error', 'user-error', 'princ', 'format', and probably some more I'm missing. So the actual number of occurrences is larger than the 40 you found. I guess I'm saying that we should think some more whether we indeed want to give up marking translatable strings and instead rely on some functions always translating their argument strings. Perhaps doing so will impose restrictions on what a Lisp program can do, and we don't want to live with such restrictions without some fire escape, in the form of explicitly translated strings? In general, I think we should not blindly accept any technique used for localization, because Emacs is so much different from a typical console program written in C. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 20:22 ` Eli Zaretskii @ 2019-03-07 22:25 ` Paul Eggert 2019-03-08 7:29 ` Eli Zaretskii 2019-03-08 4:18 ` Richard Stallman 1 sibling, 1 reply; 151+ messages in thread From: Paul Eggert @ 2019-03-07 22:25 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, rms, emacs-devel On 3/7/19 12:22 PM, Eli Zaretskii wrote: > > 'message' is just a representative of a class of such functions. > There are others: 'signal', 'error', 'user-error', 'princ', 'format', > and probably some more I'm missing. So the actual number of > occurrences is larger than the 40 you found. Yes, of course. And even for 'message', all I searched for was the string '(message (concat', which is just a fraction of the calls to 'message' that will need to be reworked. That search was not an attempt to count all the problems we'd run into; it merely was a sample of the problems. If the sample is representative then each individual problem should be relatively easy to solve. > whether we indeed > want to give up marking translatable strings and instead rely on some > functions always translating their argument strings. We could mark each translatable string by hand. But this would make for more churn to the source code and would be more work. It's hard to see why that would be a win, compared to the reasonably-common practice of marking some well-known functions as doing translations automatically. > Perhaps doing so > will impose restrictions on what a Lisp program can do, and we don't > want to live with such restrictions without some fire escape, in the > form of explicitly translated strings? One can easily work around any such restrictions by having a variant of 'message' that does not translate its format argument. We're already doing this for translation of '`', by having two functions 'format' and 'format-message': the former does not translate '`', the latter does. A similar approach can work for natural-language translation and 'message'. 
> we should not blindly accept any technique used > for localization, because Emacs is so much different from a typical > console program written in C. Of course we should not accept techniques blindly. We should use techniques with our eyes open, based on experience. That being said, this discussion suggests that Emacs is not really that much of a special case aside from its size. If so, there is little need to reinvent the i18n wheel just for Emacs, and there is a real advantage to reusing existing GNU technology in this area rather than trying to reinvent it. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 22:25 ` Paul Eggert @ 2019-03-08 7:29 ` Eli Zaretskii 0 siblings, 0 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-08 7:29 UTC (permalink / raw) To: Paul Eggert; +Cc: juri, rms, emacs-devel > Cc: rms@gnu.org, emacs-devel@gnu.org, juri@linkov.net > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Thu, 7 Mar 2019 14:25:30 -0800 > > > whether we indeed > > want to give up marking translatable strings and instead rely on some > > functions always translating their argument strings. > > We could mark each translatable string by hand. No, I didn't mean "each", I meant just some, hopefully a small minority. Because most of the use cases are probably easy enough to change so that strings could be collected by a tool, and 'message' and its ilk could then translate them automatically. Having an explicit translation function would then be that "fire escape" for when converting code not to compute strings would be too painful. > > Perhaps doing so > > will impose restrictions on what a Lisp program can do, and we don't > > want to live with such restrictions without some fire escape, in the > > form of explicitly translated strings? > > One can easily work around any such restrictions by having a variant of > 'message' that does not translate its format argument. We could, but I don't see how that would help. If a string is not found in the catalog(s), it will be output untranslated anyway, so why do we need a separate function? > this discussion suggests that Emacs is not really that much of a special > case aside from its size. I'm not sure I agree. I think the fact that Emacs is written mostly in Lisp and not in a procedural compiled language will make another qualitative difference. > there is a real advantage to reusing existing GNU technology in this > area rather than trying to reinvent it. Where it fits, sure. Especially we should strive hard to use the PO files for catalogs, because that affects the translation teams. 
^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 20:22 ` Eli Zaretskii 2019-03-07 22:25 ` Paul Eggert @ 2019-03-08 4:18 ` Richard Stallman 1 sibling, 0 replies; 151+ messages in thread From: Richard Stallman @ 2019-03-08 4:18 UTC (permalink / raw) To: Eli Zaretskii; +Cc: juri, eggert, emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > But 'message' is just a representative of a class of such functions. > There are others: 'signal', 'error', 'user-error', 'princ', 'format', > and probably some more I'm missing. So the actual number of > occurrences is larger than the 40 you found. Some of them should be handled in the same way as 'message'. But not 'format' -- it can be used for various things and some should not be translated. > I guess I'm saying that we should think some more whether we indeed > want to give up marking translatable strings Of course we need an explicit way to mark translatable strings -- but we should also adopt short cuts (like recognizing first arg of 'message') so that a large fraction of these strings don't need to be explicitly marked. If we are going to handle translation, this is the obvious best way, so let's not worry about the precise details. It will get done. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 14:46 ` Eli Zaretskii
2019-03-07 17:19 ` Paul Eggert
@ 2019-03-08 4:11 ` Richard Stallman
1 sibling, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-08 4:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: eggert, emacs-devel, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > That would happen if the catalog is made ONLY by scanning the source.
> > That's why I suggested a feature to record whatever nontrivial format
> > strings are passed to 'message' and are not in the catalog.

> Such a feature will only help when a given call to 'message' produces a
> small number of fixed text strings. If the text it produces includes
> some non-deterministic ingredient, this method will not help.

That is true. But it is a comparatively small problem, because those
cases are a small minority.

The approach I have in mind is to make several mechanisms, each
designed to handle a large fraction of cases easily, and leave the
exceptions to be handled less easily.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-05 21:58 ` Emacs i18n Juri Linkov
2019-03-06 2:16 ` Richard Stallman
@ 2019-03-06 17:30 ` Eli Zaretskii
2019-03-06 18:09 ` Eli Zaretskii
2 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-06 17:30 UTC (permalink / raw)
To: Juri Linkov; +Cc: eggert, rms, emacs-devel

> From: Juri Linkov <juri@linkov.net>
> Cc: Eli Zaretskii <eliz@gnu.org>, rms@gnu.org, emacs-devel@gnu.org
> Date: Tue, 05 Mar 2019 23:58:25 +0200
>
> One of the main decisions that has to be made is whether to wrap all
> user-facing translatable strings in all Lisp files using a macro/function
> 'gettext' (alias '_') explicitly like is implemented in XEmacs' I18N3
> that would help to extract translations from the source code, or to use
> a low-level implicit translation without changing the existing code like
> is implemented for handling text-quoting-style in format strings.
> The latter will even allow translation of strings that a package author
> forgot to mark with '_'.

I'd encourage people who want or consider working on this to read past
discussions about related topics. Some very important conclusions and
ideas came out of those discussions, and it would be a pity if we'd
need to reiterate all of what was already said and argued time and
again, instead of starting from where those past discussions ended.

Significant discussions of this happened in Dec 2001, in July 2007,
and most recently in Apr 2017. Some of those are quite long, but
please do read them, even if you were part of those discussions. This
current discussion will be much more fruitful if we first recollect
what we already talked over.

> Depending on this decision a translation file format has to be selected,
> be it flat Gettext PO format files or even some YAML-like hierarchical
> Lisp structures with scopes.

The first alternative we should consider is to use the PO format,
because that's what translation teams out there are used to working
with. If it turns out that we cannot use the PO format for some good
reasons (which will have to be very good), we can consider other
formats, but translation teams will be in general very unhappy about
that.

And I think these technicalities are not the first, let alone main,
decisions we must make. They are important, but there are more
important and complex problems we need to address first. I will talk
about this separately.

Thanks.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-05 21:58 ` Emacs i18n Juri Linkov 2019-03-06 2:16 ` Richard Stallman 2019-03-06 17:30 ` Eli Zaretskii @ 2019-03-06 18:09 ` Eli Zaretskii 2019-03-06 19:39 ` Paul Eggert ` (2 more replies) 2 siblings, 3 replies; 151+ messages in thread From: Eli Zaretskii @ 2019-03-06 18:09 UTC (permalink / raw) To: Juri Linkov; +Cc: eggert, rms, emacs-devel > From: Juri Linkov <juri@linkov.net> > Cc: Eli Zaretskii <eliz@gnu.org>, rms@gnu.org, emacs-devel@gnu.org > Date: Tue, 05 Mar 2019 23:58:25 +0200 > > One of the main decisions that has to be made is whether to wrap all > user-facing translatable strings in all Lisp files using a macro/function > 'gettext' First, AFAIR the conclusion back when this was discussed was that we might not need to mark the translatable strings, because almost all of them should be translatable. If anything, we might consider marking strings that do NOT need to be translated, as they are a very small minority. Just look at the strings in a typical Emacs source file and try to find strings that you wouldn't want translated. Unlike some other programs, Emacs almost never says something that is not meant to be read and understood by the user. Second, I don't understand why we are still talking about 'message'. Most of the user interaction in Emacs that will benefit the most from translation is not messages we show in the echo area: Emacs actually doesn't chatter there too much. Most of the stuff that IMO is much more important to have translated are the doc strings. It's no coincidence that Emacs has around 5000 calls to 'message', but almost 50000 doc strings, 10 times more than echo-area messages. So even if we do decide to attack the 'message' part first, we should consider the doc strings as well, so that whatever infrastructure we develop for messages will work for doc strings as well. And that adds more issues that the basic design must solve or be capable of solving. 
Then there are some seemingly minor technical issues, but I think Emacs will force us to deal with them up front, because Emacs is so much different from a typical localized text-mode program. Some of the issues that came up in the past: . Do we use a separate message catalog for each Lisp package, or a single catalog for all of Emacs? Each alternative has its merits and demerits. For example, if we go with separate catalogs, then how do we make the correct bindtextdomain call, given that packages call each other? If we go for a single catalog, how do we support installing and loading a new package without exiting Emacs? . How to specify which target language to use? The locale is not necessarily correct, e.g., when editing with Tramp. Also, since translating all of Emacs is such a humongous job, it's quite possible that some languages will have little or no translations, and the respective users might want to use translations for a "fallback" language, which they prefer to English. . Many user-facing text messages include portions that we generate directly from symbol names, which are of course in English. We should have some idea for how to deal with that. ^ permalink raw reply [flat|nested] 151+ messages in thread
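The "fallback" language idea can be sketched as a lookup that walks an
ordered list of languages and returns the first translation it finds.
This is only an illustration: the my-i18n- names, the in-memory tables,
and the sample language codes are all assumptions, not an existing
Emacs or gettext interface. (GNU gettext itself offers a comparable
priority list through the LANGUAGE environment variable, but that is
tied to the process environment, which is the limitation raised above.)

```elisp
(require 'seq)

;; Hypothetical store: language code -> hash table of msgid -> msgstr.
(defvar my-i18n-catalogs (make-hash-table :test #'equal))

;; Preferred languages, most preferred first.  English, i.e. the msgid
;; itself, is the implicit last resort.
(defvar my-i18n-language-chain '("uk" "ru"))

(defun my-i18n-add-translation (lang msgid msgstr)
  "Record MSGSTR as the translation of MSGID for language LANG."
  (let ((table (or (gethash lang my-i18n-catalogs)
                   (puthash lang (make-hash-table :test #'equal)
                            my-i18n-catalogs))))
    (puthash msgid msgstr table)))

(defun my-i18n-lookup (msgid)
  "Return the first translation of MSGID along the language chain.
Fall back to MSGID itself when no language has a translation."
  (or (seq-some (lambda (lang)
                  (let ((table (gethash lang my-i18n-catalogs)))
                    (and table (gethash msgid table))))
                my-i18n-language-chain)
      msgid))
```

Making the chain an ordinary Lisp variable means it could be set per
user or even per buffer, independently of the locale.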
* Re: Emacs i18n
2019-03-06 18:09 ` Eli Zaretskii
@ 2019-03-06 19:39 ` Paul Eggert
2019-03-06 19:49 ` Eli Zaretskii
2019-03-06 19:47 ` Paul Eggert
2019-03-07 3:44 ` Richard Stallman
2 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-06 19:39 UTC (permalink / raw)
To: Eli Zaretskii, Juri Linkov; +Cc: rms, emacs-devel

On 3/6/19 10:09 AM, Eli Zaretskii wrote:
> we might consider marking
> strings that do NOT need to be translated, as they are a very small
> minority. Just look at the strings in a typical Emacs source file and
> try to find strings that you wouldn't want translated. Unlike some
> other programs, Emacs almost never says something that is not meant to
> be read and understood by the user.

My impression is just the opposite. Of course it depends on the module,
but I just now took a census of todo-mode.el (which I happened to be
editing anyway) and looked at the first 300 lines of source code (at
which point I got tired of counting). I counted 24 strings that should
not be translated, and 5 strings that should be. (I did not count doc
strings, which obviously should all be translated and shouldn't need to
be marked.)

Here are the strings needing translation:

"==--== DONE "
"DONE "
"Invalid value: must be distinct from `todo-item-mark'"
"%s category %d: %s"
"Invalid value: must be a positive integer"

and here are the strings that don't need translation:

"todo/"
"\\.toda\\'"
"\\.todo\\'"
"--==-- "
"="
"["
"]"
"*"
"\\(?4:\\(?5:"
"\\)\\|"
"\\(?6:%s\\)"
"\\(?7:[0-9]+\\|\\*\\)"
"\\(?8:[0-9]+\\|\\*\\)"
"-?\\(?9:[0-9]+\\|\\*\\)"
""
"\\)"
"^\\("
"\\|"
"\\)?"
"^\\["
"\\("
"\\|"
"\\)"
""
* Re: Emacs i18n 2019-03-06 19:39 ` Paul Eggert @ 2019-03-06 19:49 ` Eli Zaretskii 2019-03-07 1:33 ` Paul Eggert 0 siblings, 1 reply; 151+ messages in thread From: Eli Zaretskii @ 2019-03-06 19:49 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, rms, juri > Cc: rms@gnu.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Wed, 6 Mar 2019 11:39:50 -0800 > > On 3/6/19 10:09 AM, Eli Zaretskii wrote: > > we might consider marking > > strings that do NOT need to be translated, as they are a very small > > minority. Just look at the strings in a typical Emacs source file and > > try to find strings that you wouldn't want translated. Unlike some > > other programs, Emacs almost never says something that is not meant to > > be read and understood by the user. > > My impression is just the opposite. Of course it depends on the module, > but I just now took a census of todo-mode.el (which I happened to be > editing anyway) and looked at the first 300 lines of source code (at > which point I got tired of counting). I counted 24 strings that should > not be translated, and 5 strings that should be. We are miscommunicating: I meant strings passed to 'message' and its ilk, not just any kind of strings. It goes without saying that most strings in our sources don't need to be translated, but that's not what we are discussing. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-06 19:49 ` Eli Zaretskii @ 2019-03-07 1:33 ` Paul Eggert 2019-03-07 3:30 ` Eli Zaretskii 2019-03-07 4:35 ` Jean-Christophe Helary 0 siblings, 2 replies; 151+ messages in thread From: Paul Eggert @ 2019-03-07 1:33 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, rms, juri On 3/6/19 11:49 AM, Eli Zaretskii wrote: > We are miscommunicating: I meant strings passed to 'message' and its > ilk, not just any kind of strings. In that case, the solution that Richard proposed should suffice for most cases. That is, in most cases we shouldn't need to change the Elisp source code; all we need is for xgettext (or its equivalent) to consider the first argument of 'message' to be a translatable string. This is a standard feature of xgettext (see its --keyword argument). ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 1:33 ` Paul Eggert @ 2019-03-07 3:30 ` Eli Zaretskii 2019-03-07 16:06 ` Paul Eggert 2019-03-07 4:35 ` Jean-Christophe Helary 1 sibling, 1 reply; 151+ messages in thread From: Eli Zaretskii @ 2019-03-07 3:30 UTC (permalink / raw) To: Paul Eggert; +Cc: emacs-devel, rms, juri > Cc: juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org > From: Paul Eggert <eggert@cs.ucla.edu> > Date: Wed, 6 Mar 2019 17:33:26 -0800 > > On 3/6/19 11:49 AM, Eli Zaretskii wrote: > > We are miscommunicating: I meant strings passed to 'message' and its > > ilk, not just any kind of strings. > > In that case, the solution that Richard proposed should suffice for most > cases. That is, in most cases we shouldn't need to change the Elisp > source code; all we need is for xgettext (or its equivalent) to consider > the first argument of 'message' to be a translatable string. This is a > standard feature of xgettext (see its --keyword argument). This will solve the string extraction part. But how will the actual translation happen? As I wrote elsewhere, I don't see how relying on the function to perform the extraction will work with non-fixed strings. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 3:30 ` Eli Zaretskii @ 2019-03-07 16:06 ` Paul Eggert 0 siblings, 0 replies; 151+ messages in thread From: Paul Eggert @ 2019-03-07 16:06 UTC (permalink / raw) To: Eli Zaretskii; +Cc: emacs-devel, rms, juri On 3/6/19 7:30 PM, Eli Zaretskii wrote: > I don't see how relying on > the function to perform the extraction will work with non-fixed > strings. Yes, if a caller computes a string and then passes it to 'message', xgettext's static analysis won't find the string. Although these calls are in the minority, they do happen, and they'll need to be rewritten. This is standard practice when any application is internationalized, and I've already given an example of this. Of course Emacs is a much bigger project than a small program like 'cat' or 'uniq', and so Emacs will take much more work to internationalize. But this is a problem of quantity, not of technology. That is, translators will need to do more work than usual (as there are more messages to translate) and developers will need to do some more work (as there are more "tricky" uses of 'message' in Emacs than there were "tricky" uses of fprintf in 'cat'.) However, the standard GNU internationalization technology should work just fine with Emacs. > E.g., what to do > with Org, which is in the core, but also distributed separately. A simple way to address that problem is to have Org use and ship the same message catalog that Emacs does. Alternatively, Org could ship a separate message catalog that contains only Org's messages and is therefore a subset of the Emacs catalog. However, I doubt whether the hassle of doing the latter would be worth the effort. ^ permalink raw reply [flat|nested] 151+ messages in thread
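The limit of static extraction described above can be illustrated with
a toy scanner. It walks a Lisp form and classifies the first argument
of each listed function as a string constant, which a static extractor
can pick up, or as a computed expression, which it cannot. This is a
sketch under stated assumptions, not how xgettext is implemented; the
my-i18n- names are hypothetical.

```elisp
;; Hypothetical list of functions whose first argument is a format string.
(defvar my-i18n-keywords '(message error format-message))

(defun my-i18n-scan (form)
  "Return a list of (KIND . ARG) for keyword calls found in FORM.
KIND is `literal' when the first argument is a string constant and
`computed' otherwise.  Quoted data is scanned like code, which is one
of the simplifications of this sketch."
  (when (consp form)
    (let ((here (and (memq (car form) my-i18n-keywords)
                     (consp (cdr form))
                     (list (cons (if (stringp (cadr form))
                                     'literal
                                   'computed)
                                 (cadr form)))))
          (rest nil))
      ;; Recurse into every element, including the arguments.
      (while (consp form)
        (setq rest (append rest (my-i18n-scan (car form))))
        (setq form (cdr form)))
      (append here rest))))
```

For example:

```elisp
(my-i18n-scan '(progn (message "Saved %s" file)
                      (message (concat "Wrote " name))))
;; => ((literal . "Saved %s") (computed concat "Wrote " name))
```

The entries tagged `computed' are the calls that would have to be
rewritten before their text can reach a message catalog.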
* Re: Emacs i18n 2019-03-07 1:33 ` Paul Eggert 2019-03-07 3:30 ` Eli Zaretskii @ 2019-03-07 4:35 ` Jean-Christophe Helary 2019-03-07 16:04 ` Paul Eggert ` (2 more replies) 1 sibling, 3 replies; 151+ messages in thread From: Jean-Christophe Helary @ 2019-03-07 4:35 UTC (permalink / raw) To: emacs-devel > On Mar 7, 2019, at 10:33, Paul Eggert <eggert@cs.ucla.edu> wrote: > > On 3/6/19 11:49 AM, Eli Zaretskii wrote: >> We are miscommunicating: I meant strings passed to 'message' and its ilk, not just any kind of strings. > > In that case, the solution that Richard proposed should suffice for most cases. That is, in most cases we shouldn't need to change the Elisp source code; all we need is for xgettext (or its equivalent) to consider the first argument of 'message' to be a translatable string. This is a standard feature of xgettext (see its --keyword argument). Yes but... The first argument of message is often a lisp expression that generates natural language strings programatically. That part will have to be modified (although far from perfect, please check what I did on packages.el if what I wrote above is not clear). ps: what is the proper way to reply to this list ? Keep everybody in Cc or remove the Cc and only keep the list address ? Jean-Christophe Helary ----------------------------------------------- http://mac4translators.blogspot.com @brandelune ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 4:35 ` Jean-Christophe Helary @ 2019-03-07 16:04 ` Paul Eggert 2019-03-08 4:09 ` Richard Stallman 2019-03-11 21:48 ` Juri Linkov 2 siblings, 0 replies; 151+ messages in thread From: Paul Eggert @ 2019-03-07 16:04 UTC (permalink / raw) To: Jean-Christophe Helary, emacs-devel On 3/6/19 8:35 PM, Jean-Christophe Helary wrote: > in most cases we shouldn't need to change the Elisp source code; all we need is for xgettext (or its equivalent) to consider the first argument of 'message' to be a translatable string. This is a standard feature of xgettext (see its --keyword argument). > Yes but... The first argument of message is often a lisp expression Of course; we are on the same page here. Most cases of 'message' should be fine, but there will often be exceptions that we do need to rewrite. These exceptions are not urgent: we can fix them as we find time. > ps: what is the proper way to reply to this list ? Keep everybody in Cc or remove the Cc and only keep the list address ? I typically just reply to whatever my email software defaults to. Occasionally I'll remove a Cc if I think that particular person probably won't care about my reply or I know the person is on a mailing list I'm already replying to; but this is optional and takes time and I typically don't bother. ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n 2019-03-07 4:35 ` Jean-Christophe Helary 2019-03-07 16:04 ` Paul Eggert @ 2019-03-08 4:09 ` Richard Stallman 2019-03-11 21:48 ` Juri Linkov 2 siblings, 0 replies; 151+ messages in thread From: Richard Stallman @ 2019-03-08 4:09 UTC (permalink / raw) To: Jean-Christophe Helary; +Cc: emacs-devel [[[ To any NSA and FBI agents reading my email: please consider ]]] [[[ whether defending the US Constitution against all enemies, ]]] [[[ foreign or domestic, requires you to follow Snowden's example. ]]] > Yes but... The first argument of message is often a lisp > expression that generates natural language strings > programatically. That part will have to be modified (although far > from perfect, please check what I did on packages.el if what I > wrote above is not clear). Those cases will need to be modified. Such is life. It may take time to get them all, but we will. -- Dr Richard Stallman President, Free Software Foundation (https://gnu.org, https://fsf.org) Internet Hall-of-Famer (https://internethalloffame.org) ^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 4:35 ` Jean-Christophe Helary
2019-03-07 16:04 ` Paul Eggert
2019-03-08 4:09 ` Richard Stallman
@ 2019-03-11 21:48 ` Juri Linkov
2019-03-11 22:51 ` Paul Eggert
` (2 more replies)
2 siblings, 3 replies; 151+ messages in thread
From: Juri Linkov @ 2019-03-11 21:48 UTC (permalink / raw)
To: Jean-Christophe Helary; +Cc: emacs-devel

>> In that case, the solution that Richard proposed should suffice for most
>> cases. That is, in most cases we shouldn't need to change the Elisp
>> source code; all we need is for xgettext (or its equivalent) to consider
>> the first argument of 'message' to be a translatable string. This is
>> a standard feature of xgettext (see its --keyword argument).
>
> Yes but... The first argument of message is often a lisp expression that
> generates natural language strings programmatically. That part will have to
> be modified (although far from perfect, please check what I did on
> packages.el if what I wrote above is not clear).

Please note that you have to handle not only format-strings of ‘message’,
but also ‘error’ and even more low-level ‘format’, i.e. all these

  (error STRING &rest ARGS)
  (message FORMAT-STRING &rest ARGS)
  (format-message STRING &rest OBJECTS)
  (format STRING &rest OBJECTS)

because there are many places that construct the string arguments of
‘message’ using ‘format’ like in ‘perform-replace’:

  (message "Replaced %d occurrences%s"
           replace-count
           (if (> (+ skip-read-only-count
                     skip-filtered-count
                     skip-invisible-count)
                  0)
               (format " (skipped %s)"
                       (mapconcat
                        #'identity
                        (delq nil (list
                                   (if (> skip-read-only-count 0)
                                       (format "%s read-only"
                                               skip-read-only-count))
                                   (if (> skip-invisible-count 0)
                                       (format "%s invisible"
                                               skip-invisible-count))
                                   (if (> skip-filtered-count 0)
                                       (format "%s filtered out"
                                               skip-filtered-count))))
                        ", "))
             ""))
* Re: Emacs i18n
2019-03-11 21:48 ` Juri Linkov
@ 2019-03-11 22:51 ` Paul Eggert
2019-03-12 21:45 ` Juri Linkov
2019-03-11 23:59 ` Jean-Christophe Helary
2019-03-12 9:16 ` Michael Albinus
2 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-11 22:51 UTC (permalink / raw)
To: Juri Linkov; +Cc: Jean-Christophe Helary, emacs-devel

On 3/11/19 2:48 PM, Juri Linkov wrote:
> Please note that you have to handle not only format-strings of
> ‘message’, but also ‘error’ and even more low-level ‘format’, i.e. all
> these (error STRING &rest ARGS) (message FORMAT-STRING &rest ARGS)
> (format-message STRING &rest OBJECTS) (format STRING &rest OBJECTS)
>
I expect that 'format' won't translate its first argument, whereas
'error', 'message', and 'format-message' will. This will be for the same
reason that 'format' does not translate quotes.

> there are many places that construct the string arguments of ‘message’
> using ‘format’ like in ‘perform-replace’:
>
Yes, quite right. These places will need to be redone so that the
translation will work properly. Here's a first cut at how to redo the
perform-replace code that you mentioned (this could get fancier if needed):

  (nmessage replace-count
            "Replaced %d occurrence%s"
            "Replaced %d occurrences%s"
            replace-count
            (if (> (+ skip-read-only-count
                      skip-filtered-count
                      skip-invisible-count)
                   0)
                (format-message
                 " (skipped %s)"
                 (mapconcat
                  #'identity
                  (delq nil (list
                             (if (> skip-read-only-count 0)
                                 (format-message "%s read-only"
                                                 skip-read-only-count))
                             (if (> skip-invisible-count 0)
                                 (format-message "%s invisible"
                                                 skip-invisible-count))
                             (if (> skip-filtered-count 0)
                                 (format-message "%s filtered out"
                                                 skip-filtered-count))))
                  (gettext ", ")))
              ""))
* Re: Emacs i18n
2019-03-11 22:51 ` Paul Eggert
@ 2019-03-12 21:45 ` Juri Linkov
2019-03-17 21:23 ` Juri Linkov
0 siblings, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-12 21:45 UTC (permalink / raw)
To: Paul Eggert; +Cc: Jean-Christophe Helary, emacs-devel

>> Please note that you have to handle not only format-strings of
>> ‘message’, but also ‘error’ and even more low-level ‘format’, i.e. all
>> these (error STRING &rest ARGS) (message FORMAT-STRING &rest ARGS)
>> (format-message STRING &rest OBJECTS) (format STRING &rest OBJECTS)
>>
> I expect that 'format' won't translate its first argument, whereas
> 'error', 'message', and 'format-message' will. This will be for the same
> reason that 'format' does not translate quotes.

Then it should be sufficient to add a gettext call to 'format-message'
only, because all the other related functions ('message', 'error',
'tramp-message', 'tramp-error', etc.) use 'format-message' directly or
indirectly.

If someone created a new branch with all the standard gettext
prerequisites like Makefiles, headers, textdomain bindings, locale
settings, i.e. everything that is required to translate other GNU
applications, then I could help with testing and finding more
problematic places. Only then could we see how well gettext (designed
for static translation) performs in the more dynamic Emacs environment.

>> there are many places that construct the string arguments of ‘message’
>> using ‘format’ like in ‘perform-replace’:
>>
> Yes, quite right. These places will need to be redone so that the
> translation will work properly. Here's a first cut at how to redo the
> perform-replace code that you mentioned (this could get fancier if needed):
>
> (nmessage replace-count
>           "Replaced %d occurrence%s"
>           "Replaced %d occurrences%s"
>           replace-count

IIUC, using standard gettext functions this would rather correspond to

  (message (ngettext "Replaced %1$d occurrence%s"
                     "Replaced %1$d occurrences%s"
                     replace-count)
           replace-count
           (if (> (+ skip-read-only-count
                     skip-filtered-count
                     skip-invisible-count)
                  0)
               (format-message
                " (skipped %s)"
                (mapconcat
                 #'identity
                 (delq nil (list
                            (if (> skip-read-only-count 0)
                                (format-message "%s read-only"
                                                skip-read-only-count))
                            (if (> skip-invisible-count 0)
                                (format-message "%s invisible"
                                                skip-invisible-count))
                            (if (> skip-filtered-count 0)
                                (format-message "%s filtered out"
                                                skip-filtered-count))))
                 (gettext ", ")))
             ""))
* Re: Emacs i18n 2019-03-12 21:45 ` Juri Linkov @ 2019-03-17 21:23 ` Juri Linkov 2019-03-18 21:20 ` Juri Linkov 0 siblings, 1 reply; 151+ messages in thread From: Juri Linkov @ 2019-03-17 21:23 UTC (permalink / raw) To: emacs-devel [-- Attachment #1: Type: text/plain, Size: 2627 bytes --] >>> Please note that you have to handle not only format-strings of >>> ‘message’, but also ‘error’ and even more low-level ‘format’, i.e. all >>> these (error STRING &rest ARGS) (message FORMAT-STRING &rest ARGS) >>> (format-message STRING &rest OBJECTS) (format STRING &rest OBJECTS) >>> >> I expect that 'format' won't translate its first argument, whereas >> 'error', 'message', and 'format-message' will. This will be for the same >> reason that 'format' does not translate quotes. > > Then it should be sufficient to add a gettext call to 'format-message' only, > because all other related functions 'message', 'error', 'tramp-message', > 'tramp-error', etc. all they use 'format-message' directly or indirectly. Maybe I'm too stupid to comprehend the complexity of this task in its entirety, but I tried to install gettext infrastructure in Emacs with gettextize, and then tried to run xgettext on source code, and see no technical problems. What I tried is to run this command, and it extracts all messages: xgettext --from-code=UTF-8 -kformat-message -kmessage -kerror -ktramp-message -ktramp-error *.el then this command extracts all Gnus messages into a separate file: xgettext --from-code=UTF-8 -kformat-message -kmessage -kerror gnus/*.el -o gnus_messages.po this command extracts all menu items: xgettext --from-code=UTF-8 -kmenu-item *.el **/*.el -o menus.po and this extracts all docstrings: xgettext --from-code=UTF-8 -kdefcustom:3 -kdefvar:3 -kdefun:3 *.el **/*.el -o docstrings.po The size of docstrings.po is about 9MB, so perhaps it should reside in a separate catalog defined by e.g. (defdomain emacs-docstrings with semantics similar to defgroup, but I have no opinion about this. 
I think this project urgently needs a coordinator: to negotiate with
package authors and translation teams about how to better split
translations into message catalogs. So the problems are not so much
technical as organizational.

> IIUC, using standard gettext functions this would rather correspond to
>
> (message (ngettext "Replaced %1$d occurrence%s"
>                    "Replaced %1$d occurrences%s"
>                    replace-count)

It seems better to start with this standard function and add more
optimizations like ‘nmessage’ later. Other Lisp implementations use
‘ngettext’ as well, e.g.:
https://clisp.sourceforge.io/impnotes.html#ggettext

So I'm going to start with the more obvious parts of the task by fixing
the current bugs of incorrect English syntax in a forward-compatible way:

[-- Attachment #2: i18n-ngettext.patch --]
[-- Type: text/x-diff, Size: 6767 bytes --]

diff --git a/lisp/subr.el b/lisp/subr.el
index 6c0ad00afa..1f000f77ad 100644
--- a/lisp/subr.el
+++ b/lisp/subr.el
@@ -342,6 +342,13 @@ define-error
              (delete-dups (copy-sequence (cons name conditions))))
     (when message (put name 'error-message message))))
 
+(defun ngettext (msgid msgid_plural n &optional _domain _category)
+  "Return the plural form of the translation for of MSGID and N.
+In the given DOMAIN, depending on the given CATEGORY.  MSGID and
+MSGID_PLURAL should be ASCII strings, and are normally the English singular
+and English plural variant of the message, respectively."
+  (if (/= n 1) msgid_plural msgid))
+
 ;; We put this here instead of in frame.el so that it's defined even on
 ;; systems where frame.el isn't loaded.
 (defun frame-configuration-p (object)
diff --git a/lisp/progmodes/grep.el b/lisp/progmodes/grep.el
index a5427dd8b7..c0f47159c9 100644
--- a/lisp/progmodes/grep.el
+++ b/lisp/progmodes/grep.el
@@ -459,7 +459,7 @@ grep-mode-font-lock-keywords
      ;; remove match from grep-regexp-alist before fontifying
      ("^Grep[/a-zA-Z]* started.*"
       (0 '(face nil compilation-message nil help-echo nil mouse-face nil) t))
-     ("^Grep[/a-zA-Z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?matches found\\)\\|\\(no matches found\\)\\).*"
+     ("^Grep[/a-zA-Z]* finished with \\(?:\\(\\(?:[0-9]+ \\)?match\\(?:es\\)? found\\)\\|\\(no matches found\\)\\).*"
       (0 '(face nil compilation-message nil help-echo nil mouse-face nil) t)
       (1 compilation-info-face nil t)
       (2 compilation-warning-face nil t))
@@ -552,7 +552,10 @@ grep-exit-message
      ;; so the buffer is still unmodified if there is no output.
      (cond ((and (zerop code) (buffer-modified-p))
             (if (> grep-num-matches-found 0)
-                (cons (format "finished with %d matches found\n" grep-num-matches-found)
+                (cons (format (ngettext "finished with %d match found\n"
+                                        "finished with %d matches found\n"
+                                        grep-num-matches-found)
+                              grep-num-matches-found)
                       "matched")
               '("finished with matches found\n" . "matched")))
            ((not (buffer-modified-p))
diff --git a/lisp/replace.el b/lisp/replace.el
index 59ad1a375b..318a9fb025 100644
--- a/lisp/replace.el
+++ b/lisp/replace.el
@@ -983,7 +983,10 @@ flush-lines
                                    (progn (forward-line 1) (point)))
                     (setq count (1+ count))))
     (set-marker rend nil)
-    (when interactive (message "Deleted %d matching lines" count))
+    (when interactive (message (ngettext "Deleted %d matching line"
+                                         "Deleted %d matching lines"
+                                         count)
+                               count))
     count))
 
 (defun how-many (regexp &optional rstart rend interactive)
@@ -1032,9 +1035,10 @@ how-many
           (if (= opoint (point))
               (forward-char 1)
             (setq count (1+ count))))
-      (when interactive (message "%d occurrence%s"
-                                 count
-                                 (if (= count 1) "" "s")))
+      (when interactive (message (ngettext "%d occurrence"
+                                           "%d occurrences"
+                                           count)
+                                 count))
       count)))
 
 \f
@@ -1617,11 +1621,12 @@ occur-1
                     (not (eq occur-excluded-properties t))))))
           (let* ((bufcount (length active-bufs))
                  (diff (- (length bufs) bufcount)))
-            (message "Searched %d buffer%s%s; %s match%s%s"
-                     bufcount (if (= bufcount 1) "" "s")
+            (message "Searched %d %s%s; %s %s%s"
+                     bufcount
+                     (ngettext "buffer" "buffers" bufcount)
                      (if (zerop diff) "" (format " (%d killed)" diff))
                      (if (zerop count) "no" (format "%d" count))
-                     (if (= count 1) "" "es")
+                     (ngettext "match" "matches" count)
                      ;; Don't display regexp if with remaining text
                      ;; it is longer than window-width.
                      (if (> (+ (length (or (get-text-property 0 'isearch-string regexp)
@@ -1856,14 +1861,15 @@ occur-engine
                 (let ((beg (point))
                       end)
                   (insert (propertize
-                           (format "%d match%s%s%s in buffer: %s%s\n"
-                                   matches (if (= matches 1) "" "es")
+                           (format "%d %s%s%s in buffer: %s%s\n"
+                                   matches
+                                   (ngettext "match" "matches" matches)
                                    ;; Don't display the same number of lines
                                    ;; and matches in case of 1 match per line.
                                    (if (= lines matches)
-                                       "" (format " in %d line%s"
+                                       "" (format " in %d %s"
                                                   lines
-                                                  (if (= lines 1) "" "s")))
+                                                  (ngettext "line" "lines" lines)))
                                    ;; Don't display regexp for multi-buffer.
                                    (if (> (length buffers) 1)
                                        "" (occur-regexp-descr regexp))
@@ -1889,13 +1895,15 @@ occur-engine
           (goto-char (point-min))
           (let ((beg (point))
                 end)
-            (insert (format "%d match%s%s total%s:\n"
-                            global-matches (if (= global-matches 1) "" "es")
+            (insert (format "%d %s%s total%s:\n"
+                            global-matches
+                            (ngettext "match" "matches" global-matches)
                             ;; Don't display the same number of lines
                             ;; and matches in case of 1 match per line.
                             (if (= global-lines global-matches)
-                                "" (format " in %d line%s"
-                                           global-lines (if (= global-lines 1) "" "s")))
+                                "" (format " in %d %s"
+                                           global-lines
+                                           (ngettext "line" "lines" global-lines)))
                             (occur-regexp-descr regexp)))
             (setq end (point))
             (when title-face
@@ -2730,10 +2738,10 @@ perform-replace
                                                          (1+ num-replacements))))))
                            (when (and (eq def 'undo-all)
                                       (null (zerop num-replacements)))
-                             (message "Undid %d %s" num-replacements
-                                      (if (= num-replacements 1)
-                                          "replacement"
-                                          "replacements"))
+                             (message (ngettext "Undid %d replacement"
+                                                "Undid %d replacements"
+                                                num-replacements)
+                                      num-replacements)
                              (ding 'no-terminate)
                              (sit-for 1)))
                          (setq replaced nil last-was-undo t last-was-act-and-show nil)))
@@ -2859,9 +2867,10 @@ perform-replace
                       last-was-act-and-show nil))))))
       (replace-dehighlight))
     (or unread-command-events
-        (message "Replaced %d occurrence%s%s"
+        (message (ngettext "Replaced %d occurrence%s"
+                           "Replaced %d occurrences%s"
+                           replace-count)
                  replace-count
-                 (if (= replace-count 1) "" "s")
                  (if (> (+ skip-read-only-count
                            skip-filtered-count
                            skip-invisible-count)
* Re: Emacs i18n
2019-03-17 21:23 ` Juri Linkov
@ 2019-03-18 21:20 ` Juri Linkov
2019-03-18 21:55 ` Paul Eggert
0 siblings, 1 reply; 151+ messages in thread
From: Juri Linkov @ 2019-03-18 21:20 UTC (permalink / raw)
To: emacs-devel

> Other Lisp implementations use ‘ngettext’ as well, e.g.:
> https://clisp.sourceforge.io/impnotes.html#ggettext

And this command will extract ‘ngettext’ messages:

  xgettext --from-code=UTF-8 -kngettext:1,2 *.el **/*.el

Using only ‘ngettext’ has an additional advantage:
there will be no need to add more such functions as
nmessage, nerror, nuser-error, ntramp-error, etc.

But for cases when ‘message’ will receive a string
already translated by ‘ngettext’, e.g.:

  (message (ngettext "Replaced %d occurrence%s"
                     "Replaced %d occurrences%s"
                     replace-count)
           ...)

we need to mark translated strings like Richard suggested:

  (defun ngettext (msgid msgid_plural n &optional _domain _category)
    "Return the plural form of the translation for MSGID and N."
    (propertize (if (/= n 1) msgid_plural msgid) 'translated t))

so ‘format-message’ should check whether its first argument is already
translated, and not call gettext again.
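The check described above could look roughly like the following. It is
a sketch only: my-gettext is a stand-in that performs no real catalog
lookup, my-format-message is a hypothetical wrapper rather than a
change to the real ‘format-message’, and the call counter exists purely
to make the "translate at most once" behavior observable.

```elisp
;; Counts lookups so the effect of the check can be observed.
(defvar my-gettext-calls 0)

(defun my-gettext (msgid)
  "Stand-in for a catalog lookup: return MSGID marked as translated."
  (setq my-gettext-calls (1+ my-gettext-calls))
  (propertize msgid 'translated t))

(defun my-translated-p (string)
  "Return non-nil if STRING carries the `translated' text property.
An empty string has no characters to hold a property, so it is never
considered translated."
  (and (stringp string)
       (> (length string) 0)
       (get-text-property 0 'translated string)))

(defun my-format-message (format-string &rest args)
  "Like `format-message', but translate FORMAT-STRING at most once."
  (apply #'format-message
         (if (my-translated-p format-string)
             format-string
           (my-gettext format-string))
         args))
```

With this, (my-format-message (my-gettext "%d items") 3) performs one
lookup instead of two. The scheme relies on the text property surviving
every string operation between the two calls.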
* Re: Emacs i18n
2019-03-18 21:20 ` Juri Linkov
@ 2019-03-18 21:55 ` Paul Eggert
2019-03-19 20:40 ` Juri Linkov
0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-18 21:55 UTC (permalink / raw)
To: Juri Linkov; +Cc: emacs-devel

On 3/18/19 2:20 PM, Juri Linkov wrote:
> Using only ‘ngettext’ has an additional advantage:
> there will be no need to add more such functions as
> nmessage, nerror, nuser-error, ntramp-error, etc.

That's not a real advantage, as there is no need to add those functions
anyway. They are merely conveniences that we can either add or not add,
depending on whether the convenience in use is worth the hassle of
supporting and documenting the functions.

For example, suppose 'message' always translates its format argument and
that there is no 'nmessage' function. Then you can use 'message' this
way to handle plurals:

  (message "%s" (format (ngettext "%d item" "%d items" n) n))

If we find expressions like the above to be common, we can easily write
an nmessage function in Lisp, so that the code can look like this instead:

  (nmessage n "%d item" "%d items" n)

but this is merely a convenience.

> ‘format-message’ should check if its first argument is translated,
> and not to call gettext again.
>
I'd rather not involve dynamic checking like that, as it's fragile and
more complicated to explain and a bit slower. format-message should
either always translate, or never translate. In practice, it'll be more
convenient for format-message to always translate, so I expect we should
do it that way.
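A minimal sketch of such an nmessage convenience, assuming an ngettext
with the (MSGID MSGID_PLURAL N) argument order used earlier in the
thread. Both functions carry a my- prefix to make clear that they are
illustrations; my-ngettext only implements the English rule and does no
catalog lookup.

```elisp
(defun my-ngettext (msgid msgid-plural n)
  "English-only stand-in for ngettext: return MSGID when N is 1.
A real implementation would consult the catalog's plural rules."
  (if (= n 1) msgid msgid-plural))

(defun my-nmessage (n msgid msgid-plural &rest args)
  "Display MSGID or MSGID-PLURAL, chosen by N, formatted with ARGS.
Return the displayed string, as `message' does."
  ;; The \"%s\" wrapper keeps `message' from reinterpreting any
  ;; percent signs in the already formatted text.
  (message "%s" (apply #'format (my-ngettext msgid msgid-plural n) args)))
```

Usage mirrors the call shown above:

```elisp
(my-nmessage 1 "%d item" "%d items" 1)   ; displays "1 item"
(my-nmessage 3 "%d item" "%d items" 3)   ; displays "3 items"
```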
* Re: Emacs i18n
2019-03-18 21:55 ` Paul Eggert
@ 2019-03-19 20:40 ` Juri Linkov
0 siblings, 0 replies; 151+ messages in thread
From: Juri Linkov @ 2019-03-19 20:40 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel

>> ‘format-message’ should check if its first argument is translated,
>> and not to call gettext again.
>>
> I'd rather not involve dynamic checking like that, as it's fragile and
> more complicated to explain and a bit slower. format-message should
> either always translate, or never translate. In practice, it'll be more
> convenient for format-message to always translate, so I expect we should
> do it that way.

I see this as a kind of optimization. But I don't know whether it is
necessary until we try how fast gettext is on very large translation
files (if it hashes the translation strings, it should be fast enough).
* Re: Emacs i18n
2019-03-11 21:48 ` Juri Linkov
2019-03-11 22:51 ` Paul Eggert
@ 2019-03-11 23:59 ` Jean-Christophe Helary
2019-03-12 9:16 ` Michael Albinus
2 siblings, 0 replies; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-11 23:59 UTC (permalink / raw)
To: emacs-devel

> On 2019/03/12, at 6:48, Juri Linkov <juri@linkov.net> wrote:
>
>> Yes but... The first argument of message is often a lisp expression that
>> generates natural language strings programmatically. That part will have to
>> be modified (although far from perfect, please check what I did on
>> packages.el if what I wrote above is not clear).
>
> Please note that you have to handle not only format-strings of ‘message’,
> but also ‘error’ and even more low-level ‘format’, i.e. all these

I know. That's what I tried to do with packages.el. There may be an
expression or two that were too obscure for me, but I think I managed to
straighten all the strings there. Check the current version vs what was
in the repository about 1 year ago (I don't remember when my fix was
committed).

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-11 21:48 ` Juri Linkov
2019-03-11 22:51 ` Paul Eggert
2019-03-11 23:59 ` Jean-Christophe Helary
@ 2019-03-12 9:16 ` Michael Albinus
2 siblings, 0 replies; 151+ messages in thread
From: Michael Albinus @ 2019-03-12 9:16 UTC (permalink / raw)
To: Juri Linkov; +Cc: Jean-Christophe Helary, emacs-devel

Juri Linkov <juri@linkov.net> writes:

> Please note that you have to handle not only format-strings of ‘message’,
> but also ‘error’ and even more low-level ‘format’, i.e. all these
>
>   (error STRING &rest ARGS)
>   (message FORMAT-STRING &rest ARGS)
>   (format-message STRING &rest OBJECTS)
>   (format STRING &rest OBJECTS)

There are even more functions to be considered. Tramp, for example,
consistently uses `tramp-message' instead of `message', and
`tramp-error' instead of `error'.

We should probably provide a means by which a package like Tramp can add
its own entries to such a list of functions.

Best regards, Michael.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-06 18:09 ` Eli Zaretskii
2019-03-06 19:39 ` Paul Eggert
@ 2019-03-06 19:47 ` Paul Eggert
2019-03-06 20:21 ` Eli Zaretskii
2019-03-07 3:44 ` Richard Stallman
2 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-06 19:47 UTC (permalink / raw)
To: Eli Zaretskii, Juri Linkov; +Cc: rms, emacs-devel

On 3/6/19 10:09 AM, Eli Zaretskii wrote:
> even if
> we do decide to attack the 'message' part first, we should consider
> the doc strings as well

Absolutely. In some sense doc strings should be easier, since we
shouldn't need to make changes to existing code; all we need to do is
add some infrastructure that puts doc strings into a .po file and that
translates them when people ask for documentation.

> . Do we use a separate message catalog for each Lisp package, or a
> single catalog for all of Emacs?

We can start with a single catalog that handles core Emacs; we'll need
to do that anyway. We can deal with packages later.

> . How to specify which target language to use? The locale is not
> necessarily correct, e.g., when editing with Tramp. Also, since
> translating all of Emacs is such a humongous job, it's quite
> possible that some languages will have little or no translations,
> and the respective users might want to use translations for a
> "fallback" language, which they prefer to English.

It should be easy for Emacs users to specify a preferred locale for
messages, independently of the system locale. Similarly, they can
specify a preferred fallback locale. All this is relatively easy to do
at the C level.

> . Many user-facing text messages include portions that we generate
> directly from symbol names, which are of course in English. We
> should have some idea for how to deal with that.

We start by leaving them as English, as that's easier. We can get
fancier later, if there's need.

The bottom line is that we don't need to have a complete solution in
order to start working on this.
^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-06 19:47 ` Paul Eggert
@ 2019-03-06 20:21 ` Eli Zaretskii
2019-03-07 1:43 ` Paul Eggert
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-06 20:21 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel, rms, juri

> Cc: rms@gnu.org, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 6 Mar 2019 11:47:18 -0800
>
> > . Do we use a separate message catalog for each Lisp package, or a
> > single catalog for all of Emacs?
>
> We can start with a single catalog that handles core Emacs; we'll need
> to do that anyway. We can deal with packages later.

"Core" here being the C sources? That's about 4% of the doc strings,
a drop in the sea.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-06 20:21 ` Eli Zaretskii
@ 2019-03-07 1:43 ` Paul Eggert
2019-03-07 3:31 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-07 1:43 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel, rms, juri

On 3/6/19 12:21 PM, Eli Zaretskii wrote:
>> We can start with a single catalog that handles core Emacs; we'll need
>> to do that anyway. We can deal with packages later.
> "Core" here being the C sources? That's about 4% of the doc strings,
> a drop in the sea.

Sure, but it should be relatively easy to also grab the doc strings from
the Emacs core elisp code. GNU gettext already supports getting strings
from Elisp code (this was for XEmacs) and it should be a relatively
minor change to adapt it to also get doc strings.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 1:43 ` Paul Eggert
@ 2019-03-07 3:31 ` Eli Zaretskii
0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-07 3:31 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel, rms, juri

> Cc: juri@linkov.net, rms@gnu.org, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 6 Mar 2019 17:43:43 -0800
>
> On 3/6/19 12:21 PM, Eli Zaretskii wrote:
> >> We can start with a single catalog that handles core Emacs; we'll need
> >> to do that anyway. We can deal with packages later.
> > "Core" here being the C sources? That's about 4% of the doc strings,
> > a drop in the sea.
>
> Sure, but it should be relatively easy to also grab the doc strings from
> the Emacs core elisp code. GNU gettext already supports getting strings
> from Elisp code (this was for XEmacs) and it should be a relatively
> minor change to adapt it to also get doc strings.

That gets back again to the problems I mentioned. E.g., what to do
with Org, which is in the core, but also distributed separately.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-06 18:09 ` Eli Zaretskii
2019-03-06 19:39 ` Paul Eggert
2019-03-06 19:47 ` Paul Eggert
@ 2019-03-07 3:44 ` Richard Stallman
2019-03-07 14:48 ` Eli Zaretskii
2 siblings, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-07 3:44 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: eggert, emacs-devel, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> Second, I don't understand why we are still talking about 'message'.
> Most of the user interaction in Emacs that will benefit the most from
> translation is not messages we show in the echo area: Emacs actually
> doesn't chatter there too much. Most of the stuff that IMO is much
> more important to have translated are the doc strings.

I think it would be most natural to handle doc strings through
a special mechanism. We have already had special mechanisms for them
-- I don't know whether we still do. But it is easy for the compiler
to find them all and put them in a file for translations.

> . Do we use a separate message catalog for each Lisp package, or a
> single catalog for all of Emacs? Each alternative has its merits
> and demerits. For example, if we go with separate catalogs, then
> how do we make the correct bindtextdomain call, given that packages
> call each other?

I think they have to be separate, and we can use something like
lexical binding to specify the right one for each file.

This is worth a special mechanism.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 3:44 ` Richard Stallman
@ 2019-03-07 14:48 ` Eli Zaretskii
2019-03-07 22:29 ` Juri Linkov
2019-03-08 4:11 ` Richard Stallman
0 siblings, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-07 14:48 UTC (permalink / raw)
To: rms; +Cc: eggert, emacs-devel, juri

> From: Richard Stallman <rms@gnu.org>
> Cc: juri@linkov.net, eggert@cs.ucla.edu, emacs-devel@gnu.org
> Date: Wed, 06 Mar 2019 22:44:10 -0500
>
> I think it would be most natural to handle doc strings through
> a special mechanism.

Up to a point, perhaps. We still should try to use .po files for
them, if at all possible, and perhaps also the gettext code that
supports looking up strings in .gmo catalogs generated from .po.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 14:48 ` Eli Zaretskii
@ 2019-03-07 22:29 ` Juri Linkov
2019-03-08 1:48 ` Jean-Christophe Helary
2019-03-08 7:37 ` Eli Zaretskii
2019-03-08 4:11 ` Richard Stallman
1 sibling, 2 replies; 151+ messages in thread
From: Juri Linkov @ 2019-03-07 22:29 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: eggert, rms, emacs-devel

>> I think it would be most natural to handle doc strings through
>> a special mechanism.
>
> Up to a point, perhaps. We still should try to use .po files for
> them, if at all possible, and perhaps also the gettext code that
> supports looking up strings in .gmo catalogs generated from .po.

The PO format is best suited for translation of one-liners like
messages and menu items, but I doubt that the PO format would be the
most efficient implementation for multi-line doc strings, since gettext
uses the whole text of the doc string as the translation key. It would
be more efficient to use a Lisp symbol (function or variable name) as
the translation key.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-07 22:29 ` Juri Linkov
@ 2019-03-08 1:48 ` Jean-Christophe Helary
2019-03-08 8:08 ` Eli Zaretskii
2019-03-08 7:37 ` Eli Zaretskii
1 sibling, 1 reply; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-08 1:48 UTC (permalink / raw)
To: emacs-devel

> On Mar 8, 2019, at 7:29, Juri Linkov <juri@linkov.net> wrote:
>
> The PO format is best suited for translation of one-liners like
> messages and menu items, but I doubt that the PO format would be the
> most efficient implementation for multi-line doc strings since gettext
> uses the whole text of the doc string as a key to translation.
> Whereas more efficient would be to use a Lisp symbol (function or
> variable name) as a translation key.

po4a is a commonly used Perl utility that creates po files from a number
of documentation formats including texinfo. The msgid is indeed the
paragraph itself but nobody sees any "efficiency" issue in the process.

Since the emacs code is not a documentation format, there would be a
need to find a different way to extract the doc strings, but using each
doc string paragraph as a msgid is not a problem in itself.

Let's not forget that most if not all issues regarding formats and
processes on the l10n side were solved decades ago.

I think what really needs to be discussed is:

• which strings do we extract
• how to rewrite the mix of code and strings
• how to extract the resulting strings
• how to process the translations for display in emacs

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-08 1:48 ` Jean-Christophe Helary
@ 2019-03-08 8:08 ` Eli Zaretskii
2019-03-08 15:11 ` Jean-Christophe Helary
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-08 8:08 UTC (permalink / raw)
To: Jean-Christophe Helary; +Cc: emacs-devel

> From: Jean-Christophe Helary <brandelune@gmail.com>
> Date: Fri, 8 Mar 2019 10:48:46 +0900
>
> I think what really needs to be discussed is:
>
> • which strings do we extract
> • how to rewrite the mix of code and strings
> • how to extract the resulting strings
> • how to process the translations for display in emacs

Extraction is just a technicality, it can be done in either of several
possible ways. We could use xgettext, or we could use a modification
of make-docfile (the latter is probably a must for collecting doc
strings from C sources), or we could use po4a or something similar.
As long as the catalogs are PO files, we could even use a mix of
tools, if, for example, one of the tools is more convenient for Lisp,
but not for C.

And I don't understand what problems you see in the last item: what
should be done there other than display the translated string with
'message' or insert it into the *Help* buffer?

So I think you are bothered by stuff that is largely non-issues. The
most important issues IMO are different: (a) what methodology of
extracting/marking translatable strings to choose so that this job
doesn't become infeasible; and (b) how to arrange the message catalogs
so that they will be easy to maintain and update, given the modular
nature of Emacs. I think we should also take a better look at how the
built-in help facilities generate documentation and other displayable
strings from symbol names. Macros such as define-minor-mode should
also be scrutinized to see if there are some special problems there.

Once this is done, the methodology decided, and the necessary tools
are available, the rest is just more or less mechanical work to
convert more and more parts of Emacs.
^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-08 8:08 ` Eli Zaretskii
@ 2019-03-08 15:11 ` Jean-Christophe Helary
2019-03-08 20:11 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-08 15:11 UTC (permalink / raw)
To: emacs-devel

> On Mar 8, 2019, at 17:08, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Jean-Christophe Helary <brandelune@gmail.com>
>> Date: Fri, 8 Mar 2019 10:48:46 +0900
>>
>> I think what really needs to be discussed is:
>>
>> • which strings do we extract
>> • how to rewrite the mix of code and strings
>> • how to extract the resulting strings
>> • how to process the translations for display in emacs
>
> Extraction is just a technicality, it can be done in either of several
> possible ways.

Sure. I just meant that l10n issues (is PO "efficiency", etc.) are
already solved, but i18n in general has to be implemented from scratch.

> And I don't understand what problems you see in the last item: what
> should be done there other than display the translated string with
> 'message' or insert it into the *Help* buffer?

As I wrote above: we have to implement everything from scratch.

> So I think you are bothered by stuff that is largely non-issues. The
> most important issues IMO are different: (a) what methodology of
> extracting/marking translatable strings to choose so that this job
> doesn't become infeasible;

That's my first three points elegantly combined into one :)

> and (b) how to arrange the message catalogs
> so that they will be easy to maintain and update, given the modular
> nature of Emacs.

I'm not sure what you mean in "how to arrange ..." Do you mean: how to
provide the l10n packages to translator communities ?

> I think we should also take a better look at how the
> built-in help facilities generate documentation and other displayable
> strings from symbol names. Macros such as define-minor-mode should
> also be scrutinized to see if there are some special problems there.
>
> Once this is done, the methodology decided, and the necessary tools
> are available, the rest is just more or less mechanical work to
> convert more and more parts of Emacs.

Can't we start with a survey of the strings we want extracted in a given
number of emacs core packages ?

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-08 15:11 ` Jean-Christophe Helary
@ 2019-03-08 20:11 ` Eli Zaretskii
2019-03-09 2:44 ` Jean-Christophe Helary
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-08 20:11 UTC (permalink / raw)
To: Jean-Christophe Helary; +Cc: emacs-devel

> From: Jean-Christophe Helary <brandelune@gmail.com>
> Date: Sat, 9 Mar 2019 00:11:24 +0900
>
> > Extraction is just a technicality, it can be done in either of several
> > possible ways.
>
> Sure. I just meant that l10n issues (is PO "efficiency", etc.) are
> already solved, but i18n in general has to be implemented from scratch.

Not sure I understand: what part(s) would we need to implement from
scratch? We already have the capability of inserting arbitrary
non-ASCII text into any buffer and displaying such text as echo area
messages.

> > and (b) how to arrange the message catalogs
> > so that they will be easy to maintain and update, given the modular
> > nature of Emacs.
>
> I'm not sure what you mean in "how to arrange ..." Do you mean: how to
> provide the l10n packages to translator communities ?

No, I mean how many catalogs should we have and what should be their
granularity. Also, how to merge several catalogs (the need for this
might disappear if, for example, we decide that each .el file will
have its own catalog), and how to load catalogs on demand when the
corresponding code is loaded/executed.

> Can't we start with a survey of the strings we want extracted in a
> given number of emacs core packages ?

How would such a survey help us? We generally want all of the strings
that are displayed to the user translated. We don't need any survey
for that decision.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-08 20:11 ` Eli Zaretskii
@ 2019-03-09 2:44 ` Jean-Christophe Helary
2019-03-09 6:40 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Jean-Christophe Helary @ 2019-03-09 2:44 UTC (permalink / raw)
To: emacs-devel

> On Mar 9, 2019, at 5:11, Eli Zaretskii <eliz@gnu.org> wrote:
>
>> From: Jean-Christophe Helary <brandelune@gmail.com>
>> Date: Sat, 9 Mar 2019 00:11:24 +0900
>>
>>> Extraction is just a technicality, it can be done in either of
>>> several possible ways.
>>
>> Sure. I just meant that l10n issues (is PO "efficiency", etc.) are
>> already solved, but i18n in general has to be implemented from scratch.
>
> Not sure I understand: what part(s) would we need to implement from
> scratch? We already have the capability of inserting arbitrary
> non-ASCII text into any buffer and displaying such text as echo area
> messages.

What I mean by "from scratch" is that we have the possibility to extract
text and insert text, but i18n is nonexistent in emacs. So we have to
build an i18n system that works for emacs and that does not exist yet,
at all.

Also, the "how to load catalogs on demand" point that you mention below
is part of i18n and as you seem to say has to be developed from scratch.

>>> and (b) how to arrange the message catalogs so that they will be easy
>>> to maintain and update, given the modular nature of Emacs.
>>
>> I'm not sure what you mean in "how to arrange ..." Do you mean: how to
>> provide the l10n packages to translator communities ?
>
> No, I mean how many catalogs should we have and what should be their
> granularity.

Isn't that related to the below item ?

> Also, how to merge several catalogs (the need for this might disappear
> if, for example, we decide that each .el file will have its own
> catalog),

Won't this depend on the extracting tool's options ?

And wouldn't that be more practical in the first place to not merge
anything but have one catalog per .el file ? (practical in terms of
translation/testing/management, as far as I can tell from experience,
etc.)

> and how to load catalogs on demand when the corresponding code is
> loaded/executed.

I guess you mean the technicalities involved in the obvious (?) "we
check the user preferred locale and display the catalog corresponding to
that locale" ?

>> Can't we start with a survey of the strings we want extracted in a
>> given number of emacs core packages ?
>
> How would such a survey help us? We generally want all of the strings
> that are displayed to the user translated. We don't need any survey
> for that decision.

Of course, but a survey (sorry, I don't have a better word) of a few
packages can help us see the workload, build prototypes, test them,
establish best practices for developers, etc.

Jean-Christophe Helary
-----------------------------------------------
http://mac4translators.blogspot.com @brandelune

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 2:44 ` Jean-Christophe Helary
@ 2019-03-09 6:40 ` Eli Zaretskii
2019-03-09 8:37 ` Michael Albinus
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-09 6:40 UTC (permalink / raw)
To: Jean-Christophe Helary; +Cc: emacs-devel

> From: Jean-Christophe Helary <brandelune@gmail.com>
> Date: Sat, 9 Mar 2019 11:44:09 +0900
>
> >> Sure. I just meant that l10n issues (is PO "efficiency", etc.) are
> >> already solved, but i18n in general has to be implemented from
> >> scratch.
> >
> > Not sure I understand: what part(s) would we need to implement from
> > scratch? We already have the capability of inserting arbitrary
> > non-ASCII text into any buffer and displaying such text as echo area
> > messages.
>
> What I mean by "from scratch" is that we have the possibility to
> extract text and insert text, but i18n is nonexistent in emacs. So we
> have to build an i18n system that works for emacs and that does not
> exist yet, at all.

I don't see how we can start implementing before deciding what and how
to implement. This discussion hopefully will eventually lead to such
decisions.

> Also, the "how to load catalogs on demand" point that you mention
> below is part of i18n and as you seem to say has to be developed from
> scratch.

If we decide that the gettext way is not entirely appropriate, yes.
But we didn't make that decision yet.

> > Also, how to merge several catalogs (the need for this might
> > disappear if, for example, we decide that each .el file will have
> > its own catalog),
>
> Won't this depend on the extracting tool's options ?

Not directly, no. It's actually the other way around: we should first
decide how to arrange the catalogs, and only after that see what
tools/options to use for that.

> And wouldn't that be more practical in the first place to not merge
> anything but have one catalog per .el file ? (practical in terms of
> translation/testing/management, as far as I can tell from experience,
> etc.)

If you are following the discussion, you know that not everyone agrees
with that. There are advantages in having just one catalog or a small
number of large ones.

> > and how to load catalogs on demand when the corresponding code is
> > loaded/executed.
>
> I guess you mean the technicalities involved in the obvious (?) "we
> check the user preferred locale and display the catalog corresponding
> to that locale" ?

I said "load", not "display". If you have one catalog per .el file,
when do you load it into memory and when, if ever, do you unload it?
Loading everything at the start would be uneconomical, to say the
least.

> >> Can't we start with a survey of the strings we want extracted in a
> >> given number of emacs core packages ?
> >
> > How would such a survey help us? We generally want all of the
> > strings that are displayed to the user translated. We don't need
> > any survey for that decision.
>
> Of course, but a survey (sorry, I don't have a better word) of a few
> packages can help us see the workload, build prototypes, test them,
> establish best practices for developers, etc.

I don't think we have reached the point where building prototypes is
useful, since we don't yet have the basic design decisions for
prototyping.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 6:40 ` Eli Zaretskii
@ 2019-03-09 8:37 ` Michael Albinus
2019-03-09 10:45 ` Eli Zaretskii
0 siblings, 1 reply; 151+ messages in thread
From: Michael Albinus @ 2019-03-09 8:37 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: Jean-Christophe Helary, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

>> And wouldn't that be more practical in the first place to not merge
>> anything but have one catalog per .el file ? (practical in terms of
>> translation/testing/management, as far as I can tell from
>> experience, etc.)
>
> If you are following the discussion, you know that not everyone agrees
> with that. There are advantages in having just one catalog or a small
> number of large ones.

One catalog for the whole Emacs is not appropriate for packages with a
life outside Emacs core, like org or Tramp.

Best regards, Michael.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 8:37 ` Michael Albinus
@ 2019-03-09 10:45 ` Eli Zaretskii
2019-03-09 11:27 ` Michael Albinus
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-09 10:45 UTC (permalink / raw)
To: Michael Albinus; +Cc: brandelune, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: Jean-Christophe Helary <brandelune@gmail.com>, emacs-devel@gnu.org
> Date: Sat, 09 Mar 2019 09:37:11 +0100
>
> > If you are following the discussion, you know that not everyone agrees
> > with that. There are advantages in having just one catalog or a small
> > number of large ones.
>
> One catalog for the whole Emacs is not appropriate for packages with a
> life outside Emacs core, like org or Tramp.

Yes, but the question still stands whether those packages which _are_
maintained only in the Emacs repository should have one catalog or
more than one, and if more than one, then at which granularity.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 10:45 ` Eli Zaretskii
@ 2019-03-09 11:27 ` Michael Albinus
2019-03-09 17:23 ` Eli Zaretskii
2019-03-09 19:22 ` Paul Eggert
0 siblings, 2 replies; 151+ messages in thread
From: Michael Albinus @ 2019-03-09 11:27 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: brandelune, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

> Yes, but the question still stands whether those packages which _are_
> maintained only in the Emacs repository should have one catalog or
> more than one, and if more than one, then at which granularity.

Packages with their own subdirectory (e.g., gnus, vc) should have their
own catalog. Tramp + ange-ftp.el could get their own subdirectory +
catalog as well (these are 17 *.el files).

Best regards, Michael.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 11:27 ` Michael Albinus
@ 2019-03-09 17:23 ` Eli Zaretskii
2019-03-09 19:55 ` Paul Eggert
2019-03-09 20:04 ` Michael Albinus
2019-03-09 19:22 ` Paul Eggert
1 sibling, 2 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-09 17:23 UTC (permalink / raw)
To: Michael Albinus; +Cc: brandelune, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: brandelune@gmail.com, emacs-devel@gnu.org
> Date: Sat, 09 Mar 2019 12:27:04 +0100
>
> > Yes, but the question still stands whether those packages which _are_
> > maintained only in the Emacs repository should have one catalog or
> > more than one, and if more than one, then at which granularity.
>
> Packages with their own subdirectory (e.g., gnus, vc) should have their
> own catalog. Tramp + ange-ftp.el could get their own subdirectory +
> catalog as well (these are 17 *.el files).

So you are saying that we should have a single catalog for all the
other .el files, and load it unconditionally in every Emacs session?
That'd waste memory, no? We have more than 1500 Lisp files in Emacs.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 17:23 ` Eli Zaretskii
@ 2019-03-09 19:55 ` Paul Eggert
2019-03-09 20:07 ` Eli Zaretskii
2019-03-09 20:04 ` Michael Albinus
1 sibling, 1 reply; 151+ messages in thread
From: Paul Eggert @ 2019-03-09 19:55 UTC (permalink / raw)
To: emacs-devel

Eli Zaretskii wrote:
> So you are saying that we should have a single catalog for all the
> other .el files, and load it unconditionally in every Emacs session?
> That'd waste memory, no?

Assuming we use GNU gettext, it'd consume virtual memory but not as much
physical memory, as GNU gettext mmaps the message catalog (using
PROT_READ so that it's read-only and the physical data can be shared).
Only pages containing actual translations should need to be brought into
physical memory (along with the indexes to these pages).

The total amount of virtual memory would depend on the catalog size. A
reasonable upper bound for current Emacs master would be 61 MB (the sum
of sizes of all of Emacs's .el files). Although 61 MB is nontrivial,
there should be little trouble fitting it into virtual memory even on a
32-bit platform.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 19:55 ` Paul Eggert
@ 2019-03-09 20:07 ` Eli Zaretskii
2019-03-09 20:47 ` Paul Eggert
0 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-09 20:07 UTC (permalink / raw)
To: Paul Eggert; +Cc: emacs-devel

> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 9 Mar 2019 11:55:15 -0800
>
> Eli Zaretskii wrote:
> > So you are saying that we should have a single catalog for all the
> > other .el files, and load it unconditionally in every Emacs session?
> > That'd waste memory, no?
> Assuming we use GNU gettext, it'd consume virtual memory but not as much
> physical memory, as GNU gettext mmaps the message catalog (using PROT_READ so
> that it's read-only and the physical data can be shared). Only pages containing
> actual translations should need to be brought into physical memory (along with
> the indexes to these pages).
>
> The total amount of virtual memory would depend on the catalog size. A
> reasonable upper bound for current Emacs master would be 61 MB (the sum of sizes
> of all of Emacs's .el files). Although 61 MB is nontrivial, there should be
> little trouble fitting it into virtual memory even on a 32-bit platform.

The same is true for the Lisp files themselves. Yet we don't load
them all in advance, because that's simply not economical.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 20:07 ` Eli Zaretskii
@ 2019-03-09 20:47 ` Paul Eggert
0 siblings, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-09 20:47 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: emacs-devel

Eli Zaretskii wrote:
>> The total amount of virtual memory would depend on the catalog size. A
>> reasonable upper bound for current Emacs master would be 61 MB (the sum of sizes
>> of all of Emacs's .el files). Although 61 MB is nontrivial, there should be
>> little trouble fitting it into virtual memory even on a 32-bit platform.
> The same is true for the Lisp files themselves. Yet we don't load
> them all in advance, because that's simply not economical.

No, it would be quite economical if we put all the .elc files into one
big file that was mmapped in and then used lazily (which is what GNU
gettext does for message catalogs). Emacs doesn't do that because
historically it developed another way to use .elc files, a way that is
good enough in practice even if it might not be as efficient as the mmap
approach.

The GNU gettext library was historically developed to use mmap, and is
good enough in practice for Emacs as-is. None of the issues discussed in
this thread mean that we should redesign the gettext library, or split
up message catalogs only for performance reasons. On the contrary,
splitting things up (or rewriting the gettext library in Elisp) is
likely to make things slower.

^ permalink raw reply [flat|nested] 151+ messages in thread
* Re: Emacs i18n
2019-03-09 17:23 ` Eli Zaretskii
2019-03-09 19:55 ` Paul Eggert
@ 2019-03-09 20:04 ` Michael Albinus
2019-03-09 20:14 ` Eli Zaretskii
1 sibling, 1 reply; 151+ messages in thread
From: Michael Albinus @ 2019-03-09 20:04 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: brandelune, emacs-devel

Eli Zaretskii <eliz@gnu.org> writes:

Hi Eli,

> So you are saying that we should have a single catalog for all the
> other .el files, and load it unconditionally in every Emacs session?
> That'd waste memory, no? We have more than 1500 Lisp files in Emacs.

I haven't said this. I have no strong opinion about the other lisp
files.

Best regards, Michael.
* Re: Emacs i18n
2019-03-09 20:04 ` Michael Albinus
@ 2019-03-09 20:14 ` Eli Zaretskii
0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-09 20:14 UTC (permalink / raw)
To: Michael Albinus; +Cc: brandelune, emacs-devel

> From: Michael Albinus <michael.albinus@gmx.de>
> Cc: brandelune@gmail.com, emacs-devel@gnu.org
> Date: Sat, 09 Mar 2019 21:04:43 +0100
>
> > So you are saying that we should have a single catalog for all the
> > other .el files, and load it unconditionally in every Emacs session?
> > That'd waste memory, no? We have more than 1500 Lisp files in Emacs.
>
> I haven't said this. I have no strong opinion about the other lisp
> files.

Oh, I agree that Gnus should have only one catalog, as should Tramp,
Calc, Org, ERC, NXML, Rmail, etc. IOW, if a package has several Lisp
files, it should still have no more than one catalog.
* Re: Emacs i18n
2019-03-09 11:27 ` Michael Albinus
2019-03-09 17:23 ` Eli Zaretskii
@ 2019-03-09 19:22 ` Paul Eggert
2019-03-09 19:39 ` Eli Zaretskii
` (2 more replies)
1 sibling, 3 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-09 19:22 UTC (permalink / raw)
To: Michael Albinus, Eli Zaretskii; +Cc: brandelune, emacs-devel

Michael Albinus wrote:
> Packages with an own subdirectory (f.e., gnus, vc) should have an own
> catalog.

I'm not sure I agree. Message catalogs are primarily of interest to
translators and installers, not programmers. Assuming we're using the gettext
machinery (a pretty safe assumption, as why reinvent the wheel?), the set of
messages to be translated will be maintained automatically: programmers
shouldn't care how many catalogs there are, or how they're updated.

Other GNU packages generally go with one large catalog, for several reasons.
For example, translators can batch their work; similar translations can be
shared more easily and reliably; and installation is simpler and a bit faster.

A few packages do have multiple catalogs. This is intended for convenience in
installation, not for convenience to developers. For example, GNU gettext has
two catalogs, one for the gettext runtime library (used by applications in
production) and one for gettext tools (used by developers when extracting or
doing translations). That way, operating systems packagers can install just
the first message catalog on systems where users are not developers.

In practice, though, this multiple-catalog approach hasn't proved to be all
that useful. Debian and Fedora both put the two gettext catalogs into one
package. Debian has a package language-pack-fr-base that contains French
translations for several core packages, including both gettext catalogs, and
similarly for other languages. Fedora includes all translations of both
gettext catalogs in its 'gettext' package.

So in hindsight, it seems to have been overkill for 'gettext' to have two
translation catalogs. With this in mind, I think it unlikely that OS packagers
would find it useful for Emacs to maintain multiple message catalogs for each
source subdirectory.
* Re: Emacs i18n
2019-03-09 19:22 ` Paul Eggert
@ 2019-03-09 19:39 ` Eli Zaretskii
2019-03-09 20:48 ` Paul Eggert
2019-03-09 20:08 ` Michael Albinus
2019-03-10 3:09 ` Richard Stallman
2 siblings, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-09 19:39 UTC (permalink / raw)
To: Paul Eggert; +Cc: michael.albinus, brandelune, emacs-devel

> Cc: brandelune@gmail.com, emacs-devel@gnu.org
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Sat, 9 Mar 2019 11:22:12 -0800
>
> Other GNU packages generally go with one large catalog, for several reasons. For
> example, translators can batch their work; similar translations can be shared
> more easily and reliably; and installation is simpler and a bit faster.
>
> A few packages do have multiple catalogs.

Any example of a package whose 90% gets loaded piecemeal on demand?

Out of ~500 packages that Emacs has, how many are loaded into our
"usual" session? And if we don't load all of those 500, why should we
load their message catalogs?

This is one of those aspects that make Emacs so different from other
localized programs. I think the difference really justifies
separating the catalogs by package.
* Re: Emacs i18n
2019-03-09 19:39 ` Eli Zaretskii
@ 2019-03-09 20:48 ` Paul Eggert
0 siblings, 0 replies; 151+ messages in thread
From: Paul Eggert @ 2019-03-09 20:48 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: michael.albinus, brandelune, emacs-devel

Eli Zaretskii wrote:
>> A few packages do have multiple catalogs.
> Any example of a package whose 90% gets loaded piecemeal on demand?

I'm not quite sure what you mean by "whose 90% gets loaded piecemeal on
demand". However, it's routine for a program to retrieve only a few
translations from a much larger catalog, so that most of the catalog is not
loaded into physical RAM. The GNU gettext library is tuned for this sort of
thing, and I see no reason why Emacs would pose important performance
challenges to it.
* Re: Emacs i18n
2019-03-09 19:22 ` Paul Eggert
2019-03-09 19:39 ` Eli Zaretskii
@ 2019-03-09 20:08 ` Michael Albinus
2019-03-10 3:09 ` Richard Stallman
2 siblings, 0 replies; 151+ messages in thread
From: Michael Albinus @ 2019-03-09 20:08 UTC (permalink / raw)
To: Paul Eggert; +Cc: Eli Zaretskii, brandelune, emacs-devel

Paul Eggert <eggert@cs.ucla.edu> writes:

Hi Paul,

> With this in mind, I think it unlikely that OS packagers would find it
> useful for Emacs to maintain multiple message catalogs for each source
> subdirectory.

There are packages which live outside Emacs. There are packages which
are also available as ELPA core packages. At least these packages need
their own catalog.

For all other packages I'm undecided.

Best regards, Michael.
* Re: Emacs i18n
2019-03-09 19:22 ` Paul Eggert
2019-03-09 19:39 ` Eli Zaretskii
2019-03-09 20:08 ` Michael Albinus
@ 2019-03-10 3:09 ` Richard Stallman
2019-03-10 13:38 ` Eli Zaretskii
2 siblings, 1 reply; 151+ messages in thread
From: Richard Stallman @ 2019-03-10 3:09 UTC (permalink / raw)
To: Paul Eggert; +Cc: eliz, michael.albinus, brandelune, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > Packages with an own subdirectory (f.e., gnus, vc) should have an own
> > catalog.

> I'm not sure I agree.

Let's start out without any particular rule about this, and let people
try various things. That way we will work out what is useful.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
* Re: Emacs i18n
2019-03-10 3:09 ` Richard Stallman
@ 2019-03-10 13:38 ` Eli Zaretskii
0 siblings, 0 replies; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-10 13:38 UTC (permalink / raw)
To: rms; +Cc: eggert, michael.albinus, brandelune, emacs-devel

> From: Richard Stallman <rms@gnu.org>
> Date: Sat, 09 Mar 2019 22:09:09 -0500
> Cc: eliz@gnu.org, michael.albinus@gmx.de, brandelune@gmail.com,
> emacs-devel@gnu.org
>
> Let's start out without any particular rule about this, and let people
> try various things. That way we will work out what is useful.

Btw, translating messages also means that the likes of this:

    static bool
    set_message_1 (ptrdiff_t a1, Lisp_Object string)
    {
    [...]
      if (!NILP (BVAR (current_buffer, bidi_display_reordering)))
        bset_bidi_paragraph_direction (current_buffer, Qleft_to_right);

will need to depend on the current UI language, instead of being
hard-coded, so the value should be probably recorded in some file (the
message catalog?). Likewise the direction of menu items and tool-bar
buttons, if/when we get to translating menus and the tool bar.
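[Editorial sketch] One way to picture the point above, that the echo-area
paragraph direction would follow the UI language instead of being fixed,
is a small lookup from language code to direction. Everything here is
hypothetical: the enum, the function name ui_language_direction, and the
built-in table of right-to-left languages stand in for whatever value
would actually be recorded alongside a translation; none of it is Emacs
code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Emacs's Qleft_to_right / Qright_to_left.  */
enum paragraph_direction { DIR_LEFT_TO_RIGHT, DIR_RIGHT_TO_LEFT };

/* Return the base paragraph direction for a locale-style language
   name such as "he", "he_IL" or "ar_EG.UTF-8".  Only the language
   part before '_', '-', '.' or '@' is examined.  The table is an
   illustrative assumption, not an exhaustive list.  */
static enum paragraph_direction
ui_language_direction (const char *lang)
{
  static const char *const rtl_languages[] = { "ar", "fa", "he", "ur" };
  size_t n = strcspn (lang, "_-.@");

  for (size_t i = 0; i < sizeof rtl_languages / sizeof rtl_languages[0]; i++)
    if (strlen (rtl_languages[i]) == n
        && strncmp (lang, rtl_languages[i], n) == 0)
      return DIR_RIGHT_TO_LEFT;

  return DIR_LEFT_TO_RIGHT;
}
```

A caller in the spirit of the quoted snippet would then pass the
resulting direction to the display code rather than a constant.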
* Re: Emacs i18n
2019-03-07 22:29 ` Juri Linkov
2019-03-08 1:48 ` Jean-Christophe Helary
@ 2019-03-08 7:37 ` Eli Zaretskii
2019-03-09 3:12 ` Richard Stallman
1 sibling, 1 reply; 151+ messages in thread
From: Eli Zaretskii @ 2019-03-08 7:37 UTC (permalink / raw)
To: Juri Linkov; +Cc: eggert, rms, emacs-devel

> From: Juri Linkov <juri@linkov.net>
> Cc: rms@gnu.org, eggert@cs.ucla.edu, emacs-devel@gnu.org
> Date: Fri, 08 Mar 2019 00:29:17 +0200
>
> > We still should try to use .po files for them, if at all possible,
> > and perhaps also the gettext code that supports looking up strings
> > in .gmo catalogs generated from .po.
>
> The PO format is best suited for translation of one-liners like
> messages and menu items, but I doubt that the PO format would be
> the most efficient implementation for multi-line doc strings since
> gettext uses the whole text of the doc string as a key to translation.

I'm not sure I understand why the length of the string is an important
factor here. Can you explain?

If the problem is with the efficiency of gettext implementation of
indexing, then we could have our own indexing method.

> Whereas more efficient would be to use a Lisp symbol (function or
> variable name) as a translation key.

A key other than the original string would mean abandoning the PO
format. Any deviation from PO would mean major PITA for translation
teams, so we should make sure the reason for such a deviation is a
very good reason. I'm not yet sure we have such a good reason.
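[Editorial sketch] The question above, whether the length of the key
matters when the whole original string is the lookup key, can be made
concrete with a toy catalog. This is not gettext's .mo lookup, which is
more elaborate; the names toy_catalog and toy_gettext and the sample
entries are assumptions made for illustration. The one behaviour taken
from gettext itself is that an untranslated string is returned
unchanged.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct catalog_entry
{
  const char *msgid;   /* original string, used as the key */
  const char *msgstr;  /* translation */
};

/* Toy French catalog, kept sorted by msgid in strcmp order so that
   bsearch can be used.  The last entry is a multi-line doc string.  */
static const struct catalog_entry toy_catalog[] = {
  { "File not found", "Fichier introuvable" },
  { "Quit", "Quitter" },
  { "Return the car of LIST.\nIf LIST is nil, return nil.",
    "Renvoie le car de LIST.\nSi LIST vaut nil, renvoie nil." },
};

static int
compare_entries (const void *key, const void *elem)
{
  const struct catalog_entry *k = key;
  const struct catalog_entry *e = elem;
  /* Each comparison may read up to the full length of the key, so a
     long doc string costs more per probe than a one-line message.  */
  return strcmp (k->msgid, e->msgid);
}

/* Look MSGID up in the toy catalog; fall back to MSGID itself when
   no translation exists.  */
static const char *
toy_gettext (const char *msgid)
{
  struct catalog_entry key = { msgid, NULL };
  const struct catalog_entry *hit
    = bsearch (&key, toy_catalog,
               sizeof toy_catalog / sizeof toy_catalog[0],
               sizeof toy_catalog[0], compare_entries);
  return hit ? hit->msgstr : msgid;
}
```

In this sketch a longer key only makes each string comparison longer;
the number of probes depends on the number of entries, not on their
length.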
* Re: Emacs i18n
2019-03-08 7:37 ` Eli Zaretskii
@ 2019-03-09 3:12 ` Richard Stallman
0 siblings, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-09 3:12 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: eggert, emacs-devel, juri

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

We can handle translating doc strings just like the other
translations. Or we could have a special system for doc strings -- if
that proves more convenient.

Since doc strings are special in so many ways, a special system
might prove more convenient. Or it might not.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
* Re: Emacs i18n
2019-03-07 14:48 ` Eli Zaretskii
2019-03-07 22:29 ` Juri Linkov
@ 2019-03-08 4:11 ` Richard Stallman
1 sibling, 0 replies; 151+ messages in thread
From: Richard Stallman @ 2019-03-08 4:11 UTC (permalink / raw)
To: Eli Zaretskii; +Cc: juri, eggert, emacs-devel

[[[ To any NSA and FBI agents reading my email: please consider ]]]
[[[ whether defending the US Constitution against all enemies, ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

> > I think it would be most natural to handle doc strings through
> > a special mechanism.

> Up to a point, perhaps. We still should try to use .po files for
> them, if at all possible, and perhaps also the gettext code that
> supports looking up strings in .gmo catalogs generated from .po.

I agree completely.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)