RedBlog

On technology, politics and life

Unicode strings aren't strings

2008-10-27

This is starting to get boring. I mean the Python unicode-bug category. There are just too many ways in which it sucks. Anyway, today's share: most people presume (and reading the official docs could sort of lull you into thinking so) that __unicode__ works just the same way __str__ does, only for unicode strings. Not so when you call them on the class itself rather than on an instance:

>>> class X(object):
...  def __init__(self, x):
...   self.x = x
...  def __str__(self):
...   return str(self.x)
...
>>> str(X)
"<class '__main__.X'>"
>>> class X(object):
...  def __init__(self, x):
...   self.x = x
...  def __unicode__(self):
...   return unicode(self.x)
...
>>> unicode(X)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unbound method __unicode__() must be called with X instance as first argument (got nothing instead)
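
My reading of that traceback, and it is only a guess from the error message, is that str() looks up __str__ on the type of its argument, while unicode() fetches __unicode__ from the argument itself by ordinary attribute lookup. The sketch below imitates the two lookup styles using __str__ only, so it does not need unicode() at all. The helpers lookup_on_type and lookup_on_object are names I made up for illustration; they are not part of Python.

```python
class X(object):
    def __init__(self, x):
        self.x = x

    def __str__(self):
        return str(self.x)


def lookup_on_type(obj):
    # Hypothetical helper: fetch __str__ from type(obj), the way str() does.
    return type(obj).__str__(obj)


def lookup_on_object(obj):
    # Hypothetical helper: fetch __str__ from obj itself by plain attribute
    # lookup. On a class this finds the instance method with no instance
    # to bind it to.
    return obj.__str__()


instance = X(5)
print(lookup_on_type(instance))    # both routes agree for an instance
print(lookup_on_object(instance))

# For the class object, the type route gives the same result as str(X).
print(lookup_on_type(X) == str(X))

# The object route fails on the class, much like unicode(X) above.
try:
    lookup_on_object(X)
    direct_error = None
except TypeError as exc:
    direct_error = exc
print(type(direct_error).__name__)
```

If that reading is right, it also explains why unicode(X(5)) on an instance is fine: there the plain attribute lookup returns a bound method, so only the class object trips.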
Tags: bugs, languages, python, string, unicode.