Opened 10 years ago

Closed 9 years ago

Last modified 8 years ago

#2830 closed defect (fixed)

Browse unable to handle strange unicode characters

Reported by: m_anish
Owned by: lucian
Priority: Low
Milestone:
Component: Browse
Version:
Severity: Minor
Keywords: dx2, dx3, patch
Cc: bernie, erikos, manuq, humitos, dsd
Distribution/OS: Unspecified
Bug Status: Unconfirmed

Description

Browse version 120 is unable to handle strange unicode characters in its address bar. It does not crash, but it reports unhandled exceptions in its log:

You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

Testcase:

  • Open Browse
  • Type some obscure characters in the address bar (for example, ctrl+shift+1234) and press enter.
  • An exception is reported in the Browse log (attached).

Observed in both Sugar-Jhbuild and dextrose-2

Attachments (2)

org.laptop.WebActivity-3.log (15.3 KB) - added by m_anish 10 years ago.
Browse crash log
0001-Use-UNICODE-string-to-search-into-places-SL-2830.patch (1.1 KB) - added by humitos 9 years ago.

Change History (13)

Changed 10 years ago by m_anish

Browse crash log

comment:1 Changed 10 years ago by m_anish

  • Summary changed from Browse unable to handle non-8 byte unicode characters to Browse unable to handle strange unicode characters

comment:2 Changed 10 years ago by RafaelOrtiz

  • Milestone changed from Unspecified by Release Team to 0.94
  • Version Unspecified deleted

Moving to 0.94

comment:3 Changed 9 years ago by humitos

  • Milestone changed from 0.94 to 0.96

This still happens in the git version of Browse, but the exception is a bit different: it refers to SQLite.

Traceback (most recent call last):
  File "/home/humitos/sugar-jhbuild/install/share/sugar/activities/Browse.activity/webtoolbar.py", line 216, in __changed_cb
    if not self.props.text or not self._search_update():
  File "/home/humitos/sugar-jhbuild/install/share/sugar/activities/Browse.activity/webtoolbar.py", line 123, in _search_update
    for place in places.get_store().search(self.props.text):
  File "/home/humitos/sugar-jhbuild/install/share/sugar/activities/Browse.activity/places.py", line 73, in search
    (text, text, self.MAX_SEARCH_MATCHES))
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

I wrote something like this: ðßæðđ ðßđßðæđ æßð đæßłđþ@ł€¶ đðæß^ł in the address bar.

comment:4 Changed 9 years ago by manuq

  • Cc manuq added

comment:5 Changed 9 years ago by manuq

  • Keywords patch added

comment:6 Changed 9 years ago by humitos

  • Cc humitos added

comment:7 Changed 9 years ago by manuq

  • Cc dsd added

Daniel, we discussed this one in:

http://lists.sugarlabs.org/archive/sugar-devel/2012-May/037110.html

Shall we commit humitos's patch, or file a bug upstream?

comment:8 Changed 9 years ago by dsd

I think you first need to ask upstream what the intended behaviour is. The mailing list could be good for that. If a GtkEntry contains a UTF-8 string and you try to access that string from Python, what type and encoding is the string supposed to have at the Python level?

comment:9 Changed 9 years ago by manuq

Discussion at pygobject mailing list: https://mail.gnome.org/archives/python-hackers-list/2012-June/msg00012.html

In Python 3, the strings are well handled: "all string arguments accept a "str" Python data type (what used to be "unicode" in Python 2) and also return str."

"But for Python 2, pygobject always takes and returns python 2's "str" data type, which is an UTF-8 bytestring. Methods generally accept "unicode" values as well and pygobject converts them to "str" on the fly, but not the return values."

Related bug: https://bugzilla.gnome.org/show_bug.cgi?id=663610

Because of this discussion, a new wiki page was created: http://python-gtk-3-tutorial.readthedocs.org/en/latest/unicode.html

So humitos's patch seems to be the right thing to do in Python 2.
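
A minimal sketch of the approach discussed above, not the attached patch itself: decode the UTF-8 bytestring that pygobject returns under Python 2 into a unicode string before handing it to sqlite3. The to_text helper, the search function, and the places table schema are all assumptions made for illustration; the real code lives in places.py and may differ.

```python
import sqlite3


def to_text(value):
    """Return value as a text (unicode) string, decoding UTF-8 bytes if needed.

    Under Python 2, bytes is an alias for str, so this also covers the
    bytestrings that pygobject returns there.
    """
    if isinstance(value, bytes):
        return value.decode('utf-8')
    return value


def search(connection, text, limit=10):
    """Hypothetical stand-in for the places search; the schema is assumed."""
    pattern = u'%' + to_text(text) + u'%'
    cursor = connection.execute(
        'SELECT uri FROM places WHERE uri LIKE ? OR title LIKE ? LIMIT ?',
        (pattern, pattern, limit))
    return [row[0] for row in cursor]


# In-memory database with an assumed two-column layout.
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE places (uri TEXT, title TEXT)')
# u'\xf0\xdf\xe6' is the text "ðßæ", written with escapes to keep the source ASCII.
connection.execute('INSERT INTO places VALUES (?, ?)',
                   (u'http://example.org/', u'\xf0\xdf\xe6 page'))

# Simulate the UTF-8 bytestring a Python 2 GtkEntry would hand back.
raw = u'\xf0\xdf\xe6'.encode('utf-8')
results = search(connection, raw)
```

Decoding at the boundary keeps the rest of the code working with unicode strings only, which is what the sqlite3 error message itself recommends.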

comment:10 Changed 9 years ago by manuq

  • Milestone changed from 0.96 to 0.98
  • Resolution set to fixed
  • Status changed from new to closed

Pushed to the Browse master branch.

comment:11 Changed 8 years ago by dnarvaez

  • Milestone 0.98 deleted

