Ticket #2354 (new defect)

Opened 3 years ago

Last modified 3 years ago

Better handle plurals in localization

Reported by: erikos Owned by: erikos
Priority: Unspecified by Maintainer Milestone: 0.92
Component: sugar-toolkit Version: 0.90.x
Severity: Unspecified Keywords:
Cc: Distribution/OS: Unspecified
Bug Status: Unconfirmed

Description

On Arabic machines (LANG=ar_SY.UTF-8) the activities list view is always empty. shell.log contains:

  File "/usr/lib/python2.6/site-packages/sugar/util.py", line 251, in timestamp_to_elapsed_string
    elapsed_units) % elapsed_units
TypeError: not all arguments converted during string formatting

From:  http://dev.laptop.org/ticket/10372

Attachments

0001-Do-not-break-if-the-string-contains-no-conversion-sp.patch Download (1.6 KB) - added by erikos 3 years ago.
Patch to make Marco and Tomeu happy

Change History

follow-up: ↓ 4   Changed 3 years ago by erikos

Python 2.6 contains a new way for formatting, described  here and  here.

However this might have issues when using with gettext as described in  this python bug.

And we would need to have the strings translated again.

  Changed 3 years ago by erikos

If  translators put zero length strings into the translations we might for a simple fix just check if there is a conversion specifier in the returned string and otherwise just display the string.

Something like:

diff --git a/src/sugar/util.py b/src/sugar/util.py
index b947c0a..d531a2a 100644
--- a/src/sugar/util.py
+++ b/src/sugar/util.py
@@ -274,7 +274,12 @@ def timestamp_to_elapsed_string(timestamp, max_levels=2):
                 translation = gettext.dngettext('sugar-toolkit',
                                                 name_singular,
                                                 name_plural,
-                                                elapsed_units) % elapsed_units
+                                                elapsed_units)
+                try:
+                    translation % elapsed_units
+                except TypeError:
+                    pass
+
                 _i18n_timestamps_cache[key] = translation
                 time_period += translation

  Changed 3 years ago by erikos

*That* one works actually :)

diff --git a/src/sugar/util.py b/src/sugar/util.py
index b947c0a..a061378 100644
--- a/src/sugar/util.py
+++ b/src/sugar/util.py
@@ -271,10 +271,15 @@ def timestamp_to_elapsed_string(timestamp, max_levels=2):
             if key in _i18n_timestamps_cache:
                 time_period += _i18n_timestamps_cache[key]
             else:
-                translation = gettext.dngettext('sugar-toolkit',
-                                                name_singular,
-                                                name_plural,
-                                                elapsed_units) % elapsed_units
+                tmp = gettext.dngettext('sugar-toolkit',
+                                        name_singular,
+                                        name_plural,
+                                        elapsed_units)
+                try:
+                    translation = tmp % elapsed_units
+                except TypeError:
+                    translation = tmp
+
                 _i18n_timestamps_cache[key] = translation
                 time_period += translation

in reply to: ↑ 1 ; follow-up: ↓ 5   Changed 3 years ago by sascha_silbe

Replying to erikos:

Python 2.6 contains a new way for formatting, described  here and  here.

That's not necessary, older versions of Python can do formatting using dictionaries quite fine:

>>> "%(seconds)d seconds elapsed" % {'seconds': 5}
'5 seconds elapsed'
>>> "seconds elapsed" % {'seconds': 5}
'seconds elapsed'
>>> "%(second)d seconds elapsed" % {'seconds': 5}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'second'
>>> 

However this might have issues when using with gettext as described in  this python bug.

As mentioned in the bug report you cited, the above method works fine with gettext.

In case the formatting fails, we should fall back to the next-most preferred language, down to C = original string if necessary.

in reply to: ↑ 4 ; follow-up: ↓ 6   Changed 3 years ago by erikos

Replying to sascha_silbe:

Replying to erikos:

Python 2.6 contains a new way for formatting, described  here and  here.


That's not necessary, older versions of Python can do formatting using dictionaries quite fine:

{{{

"%(seconds)d seconds elapsed" % {'seconds': 5}

'5 seconds elapsed'

"seconds elapsed" % {'seconds': 5}

'seconds elapsed'

"%(second)d seconds elapsed" % {'seconds': 5}

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'second'


}}}

However this might have issues when using with gettext as described in  this python bug.


As mentioned in the bug report you cited, the above method works fine with gettext.

In case the formatting fails, we should fall back to the next-most preferred language, down to C = original string if necessary.

Yes, that would be the option to do in master. For the released versions like 0.84 and 0.88 and the one that are close to (0.90) I would not want to make that change as we would loose the translation work that has been done already (if we do not port that over).

in reply to: ↑ 5   Changed 3 years ago by sascha_silbe

Replying to erikos:

In case the formatting fails, we should fall back to the next-most preferred language, down to C = original string if necessary.


Yes, that would be the option to do in master. For the released versions like 0.84 and 0.88 and the one that are close to (0.90) I would not want to make that change as we would loose the translation work that has been done already (if we do not port that over).

The fallback shouldn't affect existing translations and would fix this issue at least for the plain-broken translations case.

follow-up: ↓ 8   Changed 3 years ago by dsd

I think you should add a comment in the code explaining this oddity. (after reading this bug for a 2nd time I'm still not really understanding the issue here -- no doubt I'm just missing something that could be explained in a comment)

in reply to: ↑ 7   Changed 3 years ago by erikos

Replying to dsd:

I think you should add a comment in the code explaining this oddity. (after reading this bug for a 2nd time I'm still not really understanding the issue here -- no doubt I'm just missing something that could be explained in a comment)

Ok, I added a comment and note the ticket.

So the issue we are handling is that, the original string is something like "%d months' that gets as well into the .po files. So, some languages do have several plural forms and in some cases do not need to specify the number at all since the information is included in the word itself. The translators agreed at some point to put zero length strings in the translations in those cases. And our code does not handle those cases as formatting does fail then.

This is only a fix for 0.84-0.90. In 0.92 we will use dictionaries as  Sascha has pointed out and which was the conclusion in the  python bug as well as the format() method for strings seem not to handle this case correctly, yet.

Hopes this explains it a bit better.

  Changed 3 years ago by erikos

  • summary changed from Activities list view and Journal view crashes in Arabic to Better handle plurals in localization

  Changed 3 years ago by erikos

  • milestone changed from Unspecified by Release Team to 0.92

Changed 3 years ago by erikos

Patch to make Marco and Tomeu happy

Note: See TracTickets for help on using tickets.