Friday, January 6, 2012

Python 3 Porting Fun Redux

My last post on Python 3 porting got some really great responses, and I've learned a lot from the feedback I've seen.  I'm here to rather briefly outline a few additional tips and tricks that folks have sent me and that I've learned by doing other ports since then.  Please keep them coming, either in the blog comments or to me via email.  Or better yet, blog about your experiences yourself and I'll link to them from here.

One of the big lessons I'm trying to adopt is to support Python 3 in pure-Python code with a single code base.  Specifically, I'm trying to avoid using 2to3 as much as possible.  While I think 2to3 is an excellent tool that can make it easier to get started supporting both Python 2 and Python 3 from a single branch of code, it does have some disadvantages.  The biggest problem with 2to3 is that it's slow; it can take a long time to slog through your Python code, which can be a significant impediment to your development velocity.  Another 2to3 problem is that it doesn't always play nicely with other development tools, such as python test and virtualenv, and you occasionally have to write additional custom fixers for conversion that 2to3 doesn't handle.

Given that almost all the code I'm writing these days targets Python 2.6 as the minimal supported Python 2 version, 2to3 may just be unnecessary.  With my dbus-python port to Python 3, and with my own flufl packages, I'm experimenting with ignoring 2to3 and trying to write one code base for all of Python 2.6, 2.7, and 3.2.  My colleague Michael Foord has been pretty successful with this approach going back all the way to Python 2.4, so 2.6 as a minimum should be no problem!  C extensions are pretty easy because you have the C preprocessor to help you.  But it turns out that it's usually not too difficult in pure-Python either.  I've done this in my latest release of the flufl.bounce package, and intend to eliminate 2to3 in my other flufl packages soon too.

The first thing I've done is add print_function to the __future__ import in all my modules.  Previously, I was only importing unicode_literals and absolute_import.  But doctests tend to use a lot of print statements, so switching to the print() function explicitly removes one big 2to3 conversion.  Aside from having to unlearn decades of print statement muscle memory, the print() function is actually rather nice.  So my module template now looks like this (with the copyright comment block omitted):

from __future__ import absolute_import, print_function, unicode_literals

__metaclass__ = type
__all__ = [

Speaking of doctests, you really want them to have the same set of future imports as all your other code.  I'll talk more about how my own packages set up doctests later, but for now, it's useful to know that I create a doctest.DocFileSuite for every doctest in my package.  These suites all have a setup() function and Python's testing framework will call these at the appropriate time, passing in a testobj parameter.  This argument has a globs attribute which serves as the module globals for the doctest.  All you need to do to enable the future imports in your doctests is to do something like this:

def setup(testobj):
        testobj.globs['absolute_import'] = absolute_import
        testobj.globs['print_function'] = print_function
        testobj.globs['unicode_literals'] = unicode_literals
    except NameError:

The try-except really is only necessary if you keep using 2to3, since that tool will remove the future imports from all the modules it processes.  The future imports still exist in Python 3 of course, since future imports are never, ever removed.  So if you ditch 2to3, you can get rid of the try-except too.

In the latest release of flufl.bounce, I changed the API so that the detected email addresses are all explicitly bytes objects in Python 3 (and 8-bit strings in Python 2).  This caused some problems with my doctests because the repr of Python 3 bytes objects is different than the repr of 8-bit strings in Python 2.  When you print the object in Python 2, you get just the contents of the string, but when you print them in Python 3, you get the b''-prefix.

% python
Python 2.7.2+ (default, Dec 18 2011, 17:30:39)
[GCC 4.6.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print b'foo'
% python3
Python 3.2.2+ (default, Dec 19 2011, 12:03:32)
[GCC 4.6.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print(b'foo')

This means your doctest cannot be written to easily support both versions of Python when bytes/8-bit strings are used.  I use the following helper to get around this:

def print_bytes(obj):
    if bytes is not str:
        obj = repr(obj)[2:-1]

Remember that in Python 2, bytes is just an alias for str so this code only gets invoked in Python 3.

Another fun bytes/8-bit-string issue is that in Python 3, bytes objects have no .format() method.  So if you're doing something like b'foo {0}'.format(obj) this will work in Python 2, but fail in Python 3. The best I've come up with for this is to use concatenation instead, or do the format using unicodes and then encode them to their bytes object (but then you have the additional fun of choosing an appropriate encoding!).

Did you know that the re module can scan either unicodes or bytes in Python 3?  The switch is made by passing in either a bytes pattern or a str pattern, and then passing in the appropriate type of object to parse.  But, if you use the r''-prefix (i.e. raw strings) for saner handling of backslashes, you've got another problem when you want to parse bytes.  Python does not support rb''-prefixes, meaning you can have either raw string literals or bytes string literals but not both.  You have to forgo one or the other, and I usually come down on the side of ditching the raw strings and suffering the pain of backslash proliferation.

Some of the code I was porting was using itertools.izip_longest(), but this doesn't exist in Python 3.  Instead you have itertools.zip_longest().  You'll have to do a conditional import (i.e. try-except) around this to get the right version.

Do you use zope.interfaces?  You'll be interested to know that the syntax we've long been accustomed to for declaring that a class implements an interface does not work in Python 3.  For example:

from zope.interface import Interface, implements
class MyInterface(Interface):
class MyClass:

This is because the stack hacking that implements() uses doesn't work in Python 3.  Fortunately, the latest version of zope.interface has a new class decorator that you can use instead.  This works in Python 2.6 and 2.7 too, so change your code to use this:

from zope.interface import Interface, implementer
class MyInterface(Interface):
class MyClass:

I kind of like the use of class decorators better anyway.

Here's a tricky one.  Did you know that Python 2 provides some codecs for doing interesting conversions such as Caeser rotation (i.e. rot13)?  Thus, you can do things like:

>>> 'foo'.encode('rot-13')

This doesn't work in Python 3 though, because even though certain str-to-str codecs like rot-13 still exist, the str.encode() interface requires that the codec return a bytes object. In order to use str-to-str codecs in both Python 2 and Python 3, you'll have to pop the hood and use a lower-level API, getting and calling the codec directly:

>>> from codecs import getencoder
>>> encoder = getencoder('rot-13')
>>> rot13string = encoder(mystring)[0]

You have to get the zeroth-element from the return value of the encoder because of the codecs API.  A bit ugly, but it works in both versions of Python.

That's all for now.  Happy porting!


  1. > Python does not support rb''-prefixes

    But it does support br''-prefixes ;-)

  2. What is the "__metaclass__ = type" line for?

  3. @Anonymous: ha! It's actually the same in Python 2.7. But it's somewhat confusing given the language here:

    I think I'll open a bug on that but leave it as a documentation bug.

  4. @Daniele, in Python 2, it makes all your classes "new-style". There are several other ways to do this. You can inherit from `object` or you can set the __metaclass__ attribute in your class definition. I like doing it this way because it's easier to remove when you've ported to Python 3 and can drop Python 2.


  6. It must be really hard writing the targets Python 2.6. I've tried doing it but I think I still need to learn more about it. essay writing service writer Maude

  7. it must be has a skill to learn about Python 2.6. but it still hard for me. thanks for the best article you have.

  8. This website is so cool and full of constructive information. Keep sharing and I will always support to such a great blog posts.
    cheap essay writing

  9. Excellent writing skills have been shown here by the writer. I am really enthused to be reading this amazing article.Apple Ibeacon

  10. This post couldn’t be written much better! He always kept talking about this ways to kill a blog. I am going to send this article to him. Pretty sure he’s going to have a great read.
    write my dissertation for me || coursework writing service

  11. Thanks for making Python 3 Porting so easy for all of us..It is like fun to watch, please do post more like this...Assignment writing services

  12. Wow!! An astonishing piece of writing...I am just shocked to read the whole post...Thank you so much for sharing this information with us... write my coursework coursework consultancy order coursework

  13. I like to say that this blog again looking too interesting I got a nice and great read on this blog. This time I want to thank along with my whole team. We also like to thank to blogger for his best thinking.

  14. I want you to thank for your time of this wonderful read!!!
    Buy High Pagerank Backlinks

  15. I was exactly searching for. Thanks for such post and please keep it up. Great work.
    startup funding

  16. I really love the quality writing as offered on this post, cheers to the writer.
    lettre en bois

  17. Shunt Regulators Electronics Assignment Help and Online Homework Help & Project Help Figure 17-36 is a functional block diagram of the shunt-type regulator.

  18. Accounting Thesis Help Services Finance Homework and Project of financial management Accounting Thesis Help Accounting is an academic discipline that has many different concepts.

  19. Chemistry Assignment Help and Chemistry Homework Help includes the study of matter as well as their changes. Matter is the mix of pure parts.

  20. Our website is number in Statistics/ Stats homework help. This is preferred destination for various students to get their statistics homework help online taken from.

  21. Really special blog post, Thanks for your time designed for writing It education. Outstandingly drafted wintel firefly box

  22. I had heard a lot about Willpower show and searched all over the internet to find an article on it. I accidentally came across this article and I found it to be helpful. Thanks a lot forrecuperar dados