Thursday, November 17, 2011

str() is the Devil

Recently I discovered the folly of using str() in Python 2.7 scripts. My program would ingest arbitrary UTF-8 text from outside sources and then try to print it to a file, only to crash with a UnicodeDecodeError or UnicodeEncodeError exception. Eventually I worked out how to do it The Right Way and converted all my str() calls to unicode().

But the error kept occurring. It took me quite a while to remember that a plain string literal is a byte string too, so

foobar = ''

is functionally equivalent to

foobar = str()

This problem shouldn't happen in Python 3, since all strings are Unicode by default there and byte strings are a separate bytes type that is never mixed in implicitly.
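For what it's worth, here is a minimal sketch of the pattern that avoids the problem: decode bytes once on the way in, keep every literal Unicode with the u prefix, and encode once on the way out. The names to_text and join_lines are made up for illustration, and the sample data is invented. It is written to run unchanged on both Python 2.7 and Python 3.3+.

```python
# -*- coding: utf-8 -*-
import io


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as Unicode text."""
    # In Python 2.7, bytes is just another name for str.
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def join_lines(chunks):
    """Hypothetical helper: concatenate chunks into one Unicode string."""
    result = u""  # u"" rather than "", which is a byte string in Python 2.7
    for chunk in chunks:
        result += to_text(chunk) + u"\n"
    return result


# Invented sample input: one UTF-8 byte string and one Unicode string.
incoming = [b"na\xc3\xafve", u"caf\u00e9"]
joined = join_lines(incoming)

# Encode explicitly at the output boundary instead of relying on defaults.
buffer = io.BytesIO()
buffer.write(joined.encode("utf-8"))
```

The point of the explicit decode is that Python 2.7 otherwise falls back to the ASCII codec whenever a byte string and a Unicode string meet, and that implicit conversion is where the exceptions come from once non-ASCII bytes show up.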
