Discussion:
Supporting multiple plural forms in translations
Dimitri
2016-02-22 16:42:00 UTC
Permalink
In qooxdoo's translation mechanism, multiple plural forms are not
supported. The corresponding issue was reported almost 10 years
ago. Time to revisit it?

This is essential for many languages, e.g. those of the Baltic and Slavic
families. Unlike English, these languages may have, say, one
plural form to denote 2, 3, 4 items and another for >=5 items. For
example, let's take the word "korova" ("cow" in Russian and Ukrainian):

1 korova [singular]
2,3,4 korovy [plural 1]
5-20 korov [plural 2]
21,31,41... korova [same form as singular]
22,23,24 korovy
25-30 korov etc.

The original GNU gettext (which qooxdoo's translation facility is
modeled after) provides such a mechanism. There is a special
"Plural-Forms" PO file header that contains:
- the total number of plural forms;
- a formula that maps a number n to a plural class. The msgstr
lookup is then done based on the plural class (the result of evaluating
the formula), rather than on the number itself.

For Russian, Ukrainian, Belarusian, Serbian and Croatian, the formula
looks like this:

Plural-Forms: nplurals=3;plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 &&
n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;

With the above, the translation itself would look like this
(transliterated for readability):

msgid "cow"
msgid_plural "cows"
msgstr[0] "korova"
msgstr[1] "korovy"
msgstr[2] "korov"

Per specification, the formula should be a valid C language expression,
limited to one variable (n).
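
To make the table and the formula above concrete, here is a minimal sketch of the same rule written out by hand in JavaScript. The names `pluralIndexRu` and `forms` are made up for illustration and are not part of qooxdoo or gettext.

```javascript
// Plural class for Russian, Ukrainian etc., transcribed from the
// Plural-Forms formula above. Returns 0, 1 or 2.
function pluralIndexRu(n) {
  if (n % 10 === 1 && n % 100 !== 11) {
    return 0;
  }
  if (n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20)) {
    return 1;
  }
  return 2;
}

// msgstr[0..2] from the PO entry above
var forms = ["korova", "korovy", "korov"];

// Print the chosen form for a few sample counts.
[1, 3, 5, 11, 21, 22, 25].forEach(function (n) {
  console.log(n + " " + forms[pluralIndexRu(n)]);
});
```

Note that 11 to 14 fall into the last class even though they end in 1 to 4, which is what the n%100 checks are for.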

It would be nice to have this implemented in qooxdoo. However, that would
require changes in both the framework and the toolchain. At the moment, the
internal translations structure looks like this:

translations: {
 ...
 "ru": {
  "cow": "korova",
  "cows": "korovy"
 }
}

This could be changed to:

translations: {
 ...
 "ru": {
  "cow": "korova",
  "cows": [ "korova", "korovy", "korov" ]
 }
}

and the locales structure could contain a function that computes the plural
form from a number, created from the Plural-Forms PO header by the compiler. A
valid C expression of this kind will be a valid JavaScript expression, too, and we
can benefit from this. I've examined several gettext implementations
for JavaScript; those that do support Plural-Forms simply evaluate this
expression unchanged as JavaScript. For security purposes, we could
validate the expression first, to make sure it is restricted to
arithmetic, comparison, logical and ternary operators.
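
The validate-then-evaluate idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `SAFE_PLURAL_EXPR` and `compilePluralExpr` are hypothetical names, and the check is a simple character whitelist rather than a real expression parser. One detail worth noting is that comparisons yield booleans in JavaScript (e.g. for an English-style rule "n != 1"), so the result is converted to a number.

```javascript
// Characters allowed in a plural expression: the variable n, digits,
// whitespace, parentheses and arithmetic/comparison/logical/ternary operators.
var SAFE_PLURAL_EXPR = /^[n0-9\s()%!=<>&|?:+\-*\/]+$/;

// Build a classifier function from a plural expression string.
// Throws if the expression contains anything outside the whitelist.
function compilePluralExpr(expr) {
  if (!SAFE_PLURAL_EXPR.test(expr)) {
    throw new Error("Unsafe plural expression: " + expr);
  }
  // Strict mode makes accidental assignments to undeclared names fail.
  var fn = new Function("n", '"use strict"; return (' + expr + ");");
  return function (n) {
    // Coerce booleans (true/false) to 1/0, as C would produce.
    return Number(fn(n));
  };
}

var ru = compilePluralExpr(
  "n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2"
);
var en = compilePluralExpr("n != 1");

console.log(ru(1), ru(3), ru(5), ru(21)); // classes for 1, 3, 5 and 21 items
console.log(en(1), en(2));                // boolean results coerced to numbers
```

An expression such as "alert(1)" is rejected by the whitelist before anything is evaluated.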

I think that we could start by implementing minimal, non-breaking
changes in the framework, namely the internal structure for translations,
the plural classifier in locales, and the translation logic in the tr*() functions.
Meanwhile, we could experiment with John Spackman's QxCompiler to
introduce Plural-Forms parsing. As soon as a POC is
ready, it can be ported to generate.py or the Grunt-based toolchain,
whichever becomes mainstream at that moment.
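
As a rough sketch of what the framework-side lookup might look like, the function below accepts both the current layout (a plain plural string) and the proposed one (an array of forms). It is not qx.locale.Manager code: `translations`, `pluralClassifiers` and `translatePlural` are hypothetical names, and the legacy branch assumes that a count of 1 selects the singular.

```javascript
// Hypothetical data in the layouts discussed above (not actual qooxdoo output).
var translations = {
  ru: {
    "cow": "korova",
    "cows": ["korova", "korovy", "korov"] // proposed layout
  },
  de: {
    "cow": "Kuh",
    "cows": "Kühe" // current layout: a single plural string
  }
};

// Hypothetical per-locale classifiers, e.g. built from Plural-Forms.
var pluralClassifiers = {
  ru: function (n) {
    return n % 10 === 1 && n % 100 !== 11 ? 0
      : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2;
  }
};

function translatePlural(locale, singularId, pluralId, n) {
  var catalog = translations[locale] || {};
  var entry = catalog[pluralId];
  if (Array.isArray(entry)) {
    var classify = pluralClassifiers[locale];
    var index = classify ? classify(n) : (n === 1 ? 0 : 1);
    // Guard against an index beyond the available forms.
    return entry[Math.min(index, entry.length - 1)];
  }
  // Legacy behaviour: one singular and one plural string.
  if (n === 1) {
    return catalog[singularId] || singularId;
  }
  return entry || pluralId;
}

console.log(translatePlural("ru", "cow", "cows", 3));
console.log(translatePlural("de", "cow", "cows", 3));
```

Because string entries keep working, catalogs produced by the existing toolchain would not need to change.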

John, guys, what do you think?

Dimitri
Tobias Oetiker
2016-02-22 16:47:24 UTC
Permalink
Hi Dimitry,

I think that qooxdoo's multilingual abilities are one of its big
assets, so improving them would by all means be a great thing.

cheers
tobi
--
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch ***@oetiker.ch +41 62 775 9902
d***@cost-savers.net
2016-02-22 16:49:25 UTC
Permalink
Dimitri,

I think it is a good idea.
We have developed language support for languages with different writing directions: not only LTR but also RTL, TB, BT, MTR etc. It also includes different calendars etc.
It might be something we could include in that package and then make a pull request on both.

Stefan
Dimitri
2016-02-22 17:24:04 UTC
Permalink
Thumbs up! "Make qooxdoo great again" ;)

P.S. Could you please elaborate a bit on what projects you are using
qooxdoo in? Ours is a communication application (think web-based Pidgin
embeddable into virtually any website).
John Spackman
2016-02-22 18:25:58 UTC
Permalink
I think that the only thing missing from the toolchain to support this is being able to include the headers from the translation in the compiled application; I’ve just pushed a version of QxCompiler that now adds this.

I have not done the corresponding changes to qx.locale.Manager because it’s probably easier for you to mod that :) but at least the data is ready when you have a moment. In qx.locale.Manager, the data is available at this.__translations[locale + ":__header__"] or globally as qx.$$translations[locale + ":__header__"].
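
For illustration, once a header object is available under the ":__header__" key, the two pieces of information needed at runtime could be pulled out of it along these lines. This is a sketch, not existing qooxdoo code: `parsePluralForms` is a hypothetical helper, and the assumption that the header object uses the PO header name "Plural-Forms" as its key is not confirmed by the thread.

```javascript
// Hypothetical header object, shaped like the data described above.
// The "Plural-Forms" key name is an assumption based on the PO header name.
var header = {
  "Plural-Forms": "nplurals=3; plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;"
};

// Split a Plural-Forms value into the number of forms and the expression.
// Returns null when the value is missing or malformed.
function parsePluralForms(value) {
  var match = /^\s*nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*([^;]+);?\s*$/.exec(value || "");
  if (!match) {
    return null;
  }
  return { nplurals: parseInt(match[1], 10), expr: match[2].trim() };
}

var info = parsePluralForms(header["Plural-Forms"]);
console.log(info.nplurals);
console.log(info.expr);
```

The extracted expression would still need to be validated before it is evaluated.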

I’m not 100% sure that this is the best way to output the header information, so if you want a different layout let me know or take a look at qxcompiler.targets.Target on lines 328 to 343:
function(cb) {
  async.each(t.getLocales(),
      function(localeId, cb) {
        analyser.getTranslation(library, localeId, function(err, translation) {
          if (err)
            return cb(err);

          var dest = pkgdata.translations[localeId + ":__header__"] = {};
          var src = translation.getHeaders();
          for (var key in src)
            dest[key] = src[key];
          cb();
        });
      },
      cb);
},
Note that there is another change in this release of QxCompiler that changes the directory names inside source-output (more on that in a moment) so it’s best to delete the source-output directory before you try again.

John

Dimitri
2016-02-23 00:54:41 UTC
Permalink
John,

Translations seem to work fine now, thanks!

I'm OK with the headers approach, and I'm eager to fiddle with
qx.locale.* stuff. However, two things are yet to be done on the side
of toolchain:

1. (minor priority) Support multi-line headers. A typical PO header for
a Russian localization looks like this:

msgid ""
msgstr ""
"Project-Id-Version: hello-guile 0.19.4.73\n"
"Report-Msgid-Bugs-To: bug-gnu-***@gnu.org\n"
"PO-Revision-Date: 2015-06-26 08:55+0300\n"
"Last-Translator: Yuri Kozlov <***@komyakino.ru>\n"
"Language-Team: Russian <***@mx.ru>\n"
"Language: ru\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Lokalize 1.5\n"
"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n"
"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"

It's common practice to split complex expressions into several lines.
As you see, lines should be concatenated until "\n" is encountered.
Trailing "\n"s should be stripped off as well. In fact, I don't see why
a complete set of headers might be needed at runtime. The only two
things we need are nplurals and the formula proper. Again, ATM I'm
quite fine with application-side processing of headers, but later we
can move that logic into the compiler.
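
The concatenation rule described above could be sketched like this. It is not QxCompiler code: `parsePoHeader` and `headerLines` are hypothetical names, the input is assumed to be the quoted continuation lines of the header entry, and escapes other than "\n" are not handled.

```javascript
// Quoted lines of a PO header entry, shortened from the example above.
// In these JavaScript literals "\\n" is the two-character sequence \n
// exactly as it appears in the PO file.
var headerLines = [
  '"Language: ru\\n"',
  '"Content-Type: text/plain; charset=UTF-8\\n"',
  '"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n"',
  '"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\\n"'
];

function parsePoHeader(lines) {
  // 1. Strip the surrounding quotes and join the fragments.
  var joined = lines.map(function (line) {
    return line.trim().replace(/^"|"$/g, "");
  }).join("");

  // 2. Split on the literal \n that terminates each header,
  //    then split every item at its first colon.
  var headers = {};
  joined.split("\\n").forEach(function (item) {
    var pos = item.indexOf(":");
    if (pos > 0) {
      headers[item.slice(0, pos).trim()] = item.slice(pos + 1).trim();
    }
  });
  return headers;
}

var headers = parsePoHeader(headerLines);
console.log(headers["Plural-Forms"]);
```

Splitting at the first colon only matters here because the plural expression itself contains colons from the ternary operator.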

2. (major priority) Pass msgstr[] as array.

At the moment, only one plural form is supported. That means the following
mapping takes place:

msgid -> msgstr[0]
msgid_plural -> msgstr[1]

With multiple plural forms (there can be up to 6 in Arabic!) we'll need
to pass the whole msgstr array to the application. Ideally, I would
like to see the following in qx.$$translations:

translations: {
 ...
 "ru": {
  "cow": "korova",
  "cows": [ "korova", "korovy", "korov" ]
 }
}

This would obviously break current qooxdoo. I can implement the qooxdoo
side and you'll merge it into your branch, or we can temporarily stick
to the following non-compatibility-breaking approach:

translations: {
 ...
 "ru": {
  "cow": "korova", // msgstr[0]
  "cows": "korovy", // msgstr[1]
  "***@plural": [ "korova", "korovy", "korov" ] // whole msgstr array
 }
}

Dimitri

P.S. I've noticed that both QxCompiler and generate.py do the same thing: a msgid/msgstr will only make it to the qx.$$translations, if it appears in the code as a string constant inside a tr*() call. What's the rationale behind that? What if tr() is called on a dynamically computed expression or data received from server? Why not simply copy all the msgstrs unconditionally? This will simplify compiler code as well. I'm not insisting that should be done this way, just wondering.
John Spackman
2016-02-23 07:38:51 UTC
Permalink
Hi Dimitri

Re multi-line: OK, I see the error now: that’s a bug in my book :) I’ll get onto it

Re msgstr as array: how about if msgstr is an array only if there is more than one value to output? The advantage would be that generate.py can still output code which is compatible with the framework (even if it is not compatible with a particular application).
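
One possible reading of this suggestion, sketched on the compiler side: an array is emitted only when a PO entry has more than one plural form, and otherwise the existing string layout is kept. The `emitEntry` function and the shape of the `entry` objects are hypothetical and do not reflect QxCompiler's actual data structures.

```javascript
// Add one parsed PO entry to a locale's translation map.
// entry: { msgid, msgid_plural (optional), msgstr: [ ...forms ] }
function emitEntry(target, entry) {
  var forms = entry.msgstr;
  target[entry.msgid] = forms[0];
  if (entry.msgid_plural) {
    if (forms.length > 2) {
      // More than one plural form: pass the whole msgstr array.
      target[entry.msgid_plural] = forms.slice();
    } else {
      // Zero or one plural form: keep the current string layout.
      target[entry.msgid_plural] = forms.length === 2 ? forms[1] : forms[0];
    }
  }
  return target;
}

var ru = emitEntry({}, {
  msgid: "cow", msgid_plural: "cows", msgstr: ["korova", "korovy", "korov"]
});
var de = emitEntry({}, {
  msgid: "cow", msgid_plural: "cows", msgstr: ["Kuh", "Kühe"]
});

console.log(JSON.stringify(ru));
console.log(JSON.stringify(de));
```

With this reading, catalogs for languages with a single plural form come out unchanged.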

Re outputting all translations: the reason is just to optimise the output, and QxCompiler is trying to mimic the result of generate.py so that there is a like-for-like replacement. But it should be easy to add this as an option to the target, it’s on my TODO list

Cheers
John
Post by Dimitri
John,
Translations seem to work fine now, thanks!
I'm OK with the headers approach, and I'm eager to fiddle with
qx.locale.* stuff. However, two things are yet to be done on the side
1. (minor priority) Support multi-line headers. Typical PO header for
msgid ""
msgstr ""
"Project-Id-Version: hello-guile 0.19.4.73\n"
"PO-Revision-Date: 2015-06-26 08:55+0300\n"
"Language: ru\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Lokalize 1.5\n"
"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2
&& n"
"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
It's common practice to split complex expressions into several lines.
As you see, lines should be concatenated until "\n" is encountered.
Tailing "\n"s should be stripped off as well. In fact, I don't see why
a complete set of headers might be needed at runtime. The only two
things we need are nplurals and the formula proper. Again, ATM I'm
quite fine with application-side processing of headers, but later we
can move that logic into the compiler.
2. (major priority) Pass msgstr[] as array.
At the moment, only one plural is supported. That means, the following
msgid -> msgstr[0]
msgid_plural -> msgstr[1]
With multiple plural forms (there can be up to 6 in Arabic!) we'll need
to pass the whole msgstr array to the application. Ideally, I would
translations: {
...
"ru": {
"cow": "korova",
"cows": [ "korova", "korovy", "korov" ]
}
}
This would obviously break current qooxdoo. I can implement the qooxdoo
side and you'll merge it into your branch, or we can temporarily stick
translations: {
...
"ru": {
"cow": "korova", // msgstr[0]
"cows":"korovy", // msgstr[1]
"korov" ] // whole msgstr array
}
}
Dimitri
P.S. I've noticed that both QxCompiler and generate.py do the same thing: a msgid/msgstr will only make it to the qx.$$translations, if it appears in the code as a string constant inside a tr*() call. What's the rationale behind that? What if tr() is called on a dynamically computed expression or data received from server? Why not simply copy all the msgstrs unconditionally? This will simplify compiler code as well. I'm not insisting that should be done this way, just wondering.
Post by John Spackman
I think that the only thing missing from the toolchain to support
this is being able to include the headers from the translation in the
compiled application; I’ve just pushed a version of QxCompiler that
now adds this.
I have not done the corresponding changes to qx.locale.Manager
because it’s probably easier for you to mod that :) but at least the
data is ready when you have a moment. In qx.locale.Manager, the data
is available at this.__translations[locale + ":__header__"] or
globally as qx.$$translations[locale + ":__header__”].
I’m not 100% sure that this is the best way to output the header
information, so if you want a different layout let me know or take a
function(cb) {
async.each(t.getLocales(),
function(localeId, cb) {
analyser.getTranslation(library, localeId, function(err, translation) {
if (err)
return cb(err);
var dest = pkgdata.translations[localeId + ":__header__"] = {};
var src = translation.getHeaders();
for (var key in src)
dest[key] = src[key];
cb();
});
},
cb);
},
Note that there is another change in this release of QxCompiler that
changes the directory names inside source-output (more on that in a
moment) so it’s best to delete the source-output directory before you
try again.
John
Date: Monday, 22 February 2016 at 16:42
Subject: [qooxdoo-devel] Supporting multiple plural forms in
translations
In qooxdoo's translation mechanism, multiple plural forms are not
supported. The corresponding issue has been reported almost 10 years
ago. Time to revise it?
This is essential for many languages, eg. of Baltic and Slavic
families. Contrary to English, these languages may have, say, one
plural form to denote 2, 3,4 items and another for >=5 items. For
example, let's take the word "korova" ("cow" in Russian and
1 korova [singluar]
2,3,4 korovy [plural 1]
5-20 korov [plural 2]
21,31,41... korova [plural equals to singular]
22,23,24 korovy
25-30 korov etc.
The original GNU gettext (which qooxdoo's translation facility is
modeled after) provides such a mechanism. There is a special "Plural-
- total number of plural forms;
- a formula to factorize ordinals into classes of plurals. msgstr
lookup is then done based on the plural class (the result of
evaluation of the formula), rather than on the ordinal itself.
Plural-Forms: nplurals=3;plural=n%10==1 && n%100!=11 ? 0 : n%10>=2 &&
n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2;
With the above, the translation itself would look like this
msgid "cow"
msgid_plural "cows"
msgstr[0] "korova"
msgstr[1] "korovy"
msgstr[2] "korov"
Per specification, the formula should be a valid C language
expression, limited to one variable (n).
It would be nice to have it implemented in qooxdoo. However, that
would require changes in both framework and toolchain. At the moment,
translations: {
...
"ru": {
"cow": "korova",
"cows": "korovy"
}
}
translations: {
...
"ru": {
"cow": "korova",
"cows": [ "korova", "korovy", "korov" ]
}
}
and locales structure could contain a function to compute plural form
from ordinal, created from a Plural-Forms PO header by the compiler.
A valid C expression will be a valid JavaScript expression, too, and
we can benefit from this. I've examined several gettext
implementations for JavaScript; those that do support Plural-Forms
simply evaluate this expression unchanged as JavaScript. For security
purposes, we could validate the expression first, to make sure it is
restricted to arithmetic, logical and ternary operators.
I think that we could start with implementing minimal, non-breaking
changes in framework, namely internal structure for translations,
plural classifier in locales, translation logic in tr*() functions.
Meanwhile, we could experiment with John Spackman's QxCompiler to
introduce Plural-Form parsing. As soon as POC is ready, it can be
ported to generate.py or Grunt based toolchain, whichever becomes
mainstream at that moment.
John, guys, what do you think?
Dimitri
Dimitri
2016-02-23 08:50:04 UTC
Permalink
Hi John,

Re: msgstr array - I think that's gonna be a good start. Let's
implement it that way and see what qooxdoo guys say. A verbose example
to make sure I understood it right:

es.po:

msgid "cow"
msgid_plural "cows"
msgstr[0] "vaca"
msgstr[1] "vacas"

ru.po:

msgid "cow"
msgid_plural "cows"
msgstr[0] "korova"
msgstr[1] "korovy"
msgstr[2] "korov"

output:

translations: {
 ...
 "es": {
  "cow": "vaca",
  "cows": "vacas"
 },
 "ru": {
  "cow": "korova",
  "cows": [ "korova", "korovy", "korov" ]
 }
}
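To make the intended runtime behaviour concrete, here is a minimal sketch of a lookup that accepts both layouts from the output above: a plain string (as for "es") and a full msgstr array (as for "ru"). This is not the actual qx.locale.Manager code; translatePlural and its parameters are hypothetical, and the string branch assumes that n == 1 selects the singular entry.

```javascript
// Hypothetical lookup over one locale's entries.
// pluralClass is a function mapping n to an index into the msgstr array.
function translatePlural(entries, singularId, pluralId, n, pluralClass) {
  var forms = entries[pluralId];
  if (Array.isArray(forms)) {
    // New layout: the whole msgstr[] array, indexed by plural class.
    var form = forms[pluralClass(n)];
    return form !== undefined ? form : pluralId;
  }
  // Old layout: one singular string and one plural string
  // (assumption: n == 1 means singular).
  var id = n === 1 ? singularId : pluralId;
  return entries[id] !== undefined ? entries[id] : id;
}

var translations = {
  es: { cow: "vaca", cows: "vacas" },
  ru: { cow: "korova", cows: ["korova", "korovy", "korov"] }
};

// Russian classifier from the Plural-Forms header.
function ruClass(n) {
  return n % 10 == 1 && n % 100 != 11 ? 0
    : n % 10 >= 2 && n % 10 <= 4 && (n % 100 < 10 || n % 100 >= 20) ? 1 : 2;
}

console.log(translatePlural(translations.es, "cow", "cows", 2, null));     // prints "vacas"
console.log(translatePlural(translations.ru, "cow", "cows", 21, ruClass)); // prints "korova"
console.log(translatePlural(translations.ru, "cow", "cows", 5, ruClass));  // prints "korov"
```

With this shape the existing two-string output keeps working unchanged, and only locales with more than two forms need the array.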

I guess I now should fork
https://github.com/johnspackman/qooxdoo/tree/qxcompiler, make changes
there and file a PR, right?

Dimitri

P.S. you can find me on #qxdev too
Post by John Spackman
Hi Dimitri
Re multi-line: OK, I see the error now: that’s a bug in my book
:)  I’ll get onto it
Re msgstr as array: how about if msgstr is an array only if there is
more than one value to output?  The advantage would be that
generate.py can still output code which is compatible with the
framework (even if it is not compatible with a particular
application).
Re outputting all translations: the reason is just to optimise the
output, and QxCompiler is trying to mimic the result of generate.py
so that there is a like-for-like replacement.  But it should be easy
to add this as an option to the target, it’s on my TODO list
Cheers
John
Post by Dimitri
John,
Translations seem to work fine now, thanks!
I'm OK with the headers approach, and I'm eager to fiddle with
qx.locale.* stuff. However, two things are yet to be done on the side
1. (minor priority) Support multi-line headers. Typical PO header for
msgid ""
msgstr ""
"Project-Id-Version: hello-guile 0.19.4.73\n"
"PO-Revision-Date: 2015-06-26 08:55+0300\n"
"Language: ru\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"X-Generator: Lokalize 1.5\n"
"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n"
"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
It's common practice to split complex expressions into several lines.
As you see, lines should be concatenated until "\n" is encountered.
Trailing "\n"s should be stripped off as well. In fact, I don't see why
a complete set of headers might be needed at runtime. The only two
things we need are nplurals and the formula proper. Again, ATM I'm
quite fine with application-side processing of headers, but later we
can move that logic into the compiler.
2. (major priority) Pass msgstr[] as array.
At the moment, only one plural is supported. That means, the
following
msgid -> msgstr[0]
msgid_plural -> msgstr[1]
With multiple plural forms (there can be up to 6 in Arabic!) we'll need
to pass the whole msgstr array to the application. Ideally, I would
translations: {
...
"ru": {
 "cow": "korova",
 "cows": [ "korova", "korovy", "korov" ]
}
}
This would obviously break current qooxdoo. I can implement the qooxdoo
side and you'll merge it into your branch, or we can temporarily stick
translations: {
...
"ru": {
 "cow": "korova", // msgstr[0]
"cows":"korovy", // msgstr[1]
"korov" ] // whole msgstr array
}
}
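For item 1 above (multi-line headers), the concatenation rule can be sketched as follows: strip the quotes from each continuation line, join everything, then split on the literal "\n" escape. Both function names are hypothetical and this is not QxCompiler's actual code; the input is assumed to be the raw quoted lines that follow msgstr "" in the PO file.

```javascript
// Hypothetical helper: turn the quoted continuation lines of a PO header
// into a { name: value } map.
function parsePoHeader(lines) {
  var joined = lines
    .map(function (line) {
      // Drop the double quotes surrounding each continuation line.
      return line.trim().replace(/^"|"$/g, "");
    })
    .join("");
  var headers = {};
  // A header field ends at the two-character escape sequence backslash + n.
  joined.split("\\n").forEach(function (field) {
    var pos = field.indexOf(":");
    if (pos > 0) {
      headers[field.slice(0, pos).trim()] = field.slice(pos + 1).trim();
    }
  });
  return headers;
}

// Hypothetical helper: extract the only two values needed at runtime.
function parsePluralForms(value) {
  var m = /nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*([^;]+);?/.exec(value);
  return m ? { nplurals: parseInt(m[1], 10), plural: m[2].trim() } : null;
}

var headers = parsePoHeader([
  '"Language: ru\\n"',
  '"Content-Type: text/plain; charset=UTF-8\\n"',
  '"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n"',
  '"%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\\n"'
]);
var pf = parsePluralForms(headers["Plural-Forms"]);
console.log(pf.nplurals); // prints 3
console.log(pf.plural);
```

The split Plural-Forms line is rejoined into a single expression before parsing, so the application (or later the compiler) only has to deal with nplurals and the formula.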
Dimitri
P.S. I've noticed that both QxCompiler and generate.py do the same
thing: a msgid/msgstr will only make it to the qx.$$translations,
if it appears in the code as a string constant inside a tr*() call.
What's the rationale behind that? What if tr() is called on a
dynamically computed expression or data received from server? Why
not simply copy all the msgstrs unconditionally? This will simplify
compiler code as well. I'm not insisting that should be done this
way, just wondering.
Post by John Spackman
I think that the only thing missing from the toolchain to support
this is being able to include the headers from the translation in the
compiled application; I’ve just pushed a version of QxCompiler that
now adds this.  
I have not done the corresponding changes to qx.locale.Manager
because it’s probably easier for you to mod that :) but at least the
data is ready when you have a moment.  In qx.locale.Manager, the data
is available at this.__translations[locale + ":__header__"] or
globally as qx.$$translations[locale + ":__header__"].
I’m not 100% sure that this is the best way to output the header
information, so if you want a different layout let me know or take a
function(cb) {
  async.each(t.getLocales(),
      function(localeId, cb) {
        analyser.getTranslation(library, localeId, function(err, translation) {
          if (err)
            return cb(err);
          var dest = pkgdata.translations[localeId + ":__header__"] = {};
          var src = translation.getHeaders();
          for (var key in src)
            dest[key] = src[key];
          cb();
        });
      },
      cb);
},
Note that there is another change in this release of QxCompiler that
changes the directory names inside source-output (more on that in a
moment) so it’s best to delete the source-output directory before you
try again.
John
Fritz Zaucker
2016-02-23 09:10:47 UTC
Permalink
Hi Dimitri,
Post by Dimitri
Re: msgstr array - I think that's gonna be a good start. Let's
implement it that way and see what qooxdoo guys say.
"qooxdoo guys" is an at best poorly defined concept nowadays ... ;-)

Cheers,
Fritz
--
Oetiker+Partner AG tel: +41 62 775 9903 (direct)
Fritz Zaucker +41 62 775 9900 (switch board)
Aarweg 15 +41 79 675 0630 (mobile)
CH-4600 Olten fax: +41 62 775 9905
Schweiz web: www.oetiker.ch
John Spackman
2016-02-23 09:10:54 UTC
Permalink
Kind of - I think there’s a mistake in your JSON for ru.cows, or in ru.po. Should it be:

translations: {
"es": {
"cow": "vaca",
"cows": "vacas"
},
"ru": {
"cow": "korova",
"cows": [ "korovy", "korov" ]
}
}

Or should the ru.po be:

msgid "cow"
msgid_plural "cows"
msgstr[0] "korova"
msgstr[1] "korova"
msgstr[2] "korovy"
msgstr[3] "korov"

Just seen that you're on #qxdev; I left the client running last night and hadn't looked at it, so I'll go there now :)

Yes for the PR too :)

John


