Wednesday, January 18, 2012

Debian packaging for Python 2 and 3

Time for another installment of my ongoing mission to convert the world to Python 3!  This time, a little Debian packaging-fu for modifying an existing Python 2 package to include support for Python 3 from the same source package.

Today, I added a python3-feedparser package to Ubuntu Precise.  What's interesting about this is that, despite various reported problems, upstream feedparser 5.1 claims to support Python 3, via 2to3 conversion.  And indeed it does (although the test suite does not).

Before today, Ubuntu had feedparser 5.0.1 in its archive, and while some work has been done to update the Debian package to 5.1, this has not been released.  The uninteresting precursor to Python 3 packaging was to upgrade the Ubuntu version of the python-feedparser source package to 5.1.  I'll spare you the boring details about missing data files in the upstream tarball, and other problems, since they don't really relate to the Python 3 effort.

The first step was to verify that feedparser 5.1 works with Python 3.2 in a virtualenv, and indeed it does.  This is good news because it means that setup.py does the right thing, which is always the best way to start supporting Python 3.  I've found that it's much easier to build a solid Debian package if you have a solid setup.py upstream to begin with.

Now, what I'd like to do is to give you a recipe for modifying your existing debian/ directory files to add Python 3 support to a package that already exists for Python 2.  This is a little trickier for feedparser because it used an older debhelper standard, and carried some crufty old stuff in its rules file.  My first step was to update this to debhelper compatibility level 8 and greatly simplify the debian/rules file.  Here's what it might have looked like with just Python 2 support, so let's start there.

#!/usr/bin/make -f
export DH_VERBOSE=1

%:
	dh $@ --with python2

override_dh_auto_clean:
	dh_auto_clean
	rm -rf build .*egg-info

override_dh_auto_test:
ifeq (,$(filter nocheck,$(DEB_BUILD_OPTIONS)))
	cd feedparser && python ./
else
	@echo "nocheck set, not running tests"
endif

override_dh_installdocs:
	dh_installdocs -Xtests

This is all pretty standard stuff.  dh_python2 is used (the --with python2 option to dh), and we just provide a couple of overrides for idiosyncrasies in the feedparser package.  We clean a couple of extra things that aren't cleaned automatically, and we run the test suite in the slightly non-standard way that upstream requires.  Also, we override the installation of a huge amount of test files that would otherwise get installed as documentation (they aren't docs).
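If make conditionals aren't your thing, the nocheck test in that rules file can be sketched in Python.  This is only an illustration of the logic, assuming DEB_BUILD_OPTIONS is a whitespace-separated list of words; the function name tests_enabled is made up for this example and is not part of any Debian tool.

```python
import os


def tests_enabled(environ=None):
    """Return True unless 'nocheck' is one of the DEB_BUILD_OPTIONS words."""
    if environ is None:
        environ = os.environ
    # make's $(filter nocheck,...) compares whole whitespace-separated words,
    # so splitting on whitespace mirrors that behaviour.
    options = environ.get("DEB_BUILD_OPTIONS", "").split()
    return "nocheck" not in options


# Illustrative values only.
print(tests_enabled({"DEB_BUILD_OPTIONS": "parallel=4 nocheck"}))
print(tests_enabled({"DEB_BUILD_OPTIONS": "parallel=4"}))
```

The point is simply that the test suite runs by default and is skipped only when the builder explicitly asks for nocheck.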

So far so good.  What do we have to do to add support for Python 3?

First, we need to make a few modifications to the debian/control file.  The current convention with dh_python2 is to use an X-Python-Version header in the source package stanza, so we just need to add this header to the same stanza for Python 3:

X-Python3-Version: >= 3.2

This just says we support any Python 3 version from 3.2 onwards.  You also need to add a few additional packages to the Build-Depends.  In the feedparser case, I added the following build dependencies: python3, python3-chardet, python3-setuptools.  Even though for Python 2 there are a couple of other build dependencies (e.g. python-libxml2 and python-utidylib) these aren't available for Python 3, but lucky for us, they are optional anyway.

Next, you need to add a new binary package stanza.  There was already a python-feedparser binary package stanza for Python 2 support.  In Debian, Python 3 is provided as a separate stack, meaning packages for Python 3 will always start with the python3- prefix.  Thus, it is pretty easy to just copy the python-feedparser stanza and paste it to the bottom of debian/control, changing the package name to python3-feedparser.  You have to update the Depends line to use ${python3:Depends} and I updated the Recommends line to name python3-chardet, and that was about it.  Here's what the new stanza looks like:

Package: python3-feedparser
Architecture: all
Depends: ${misc:Depends}, ${python3:Depends}
Recommends: python3-chardet
Description: Universal Feed Parser for Python
 Python module for downloading and parsing syndicated feeds. It can
 handle RSS 0.90, Netscape RSS 0.91, Userland RSS 0.91, RSS 0.92, RSS
 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom, and CDF feeds.
 It provides the same API to all formats, and sanitizes URIs and HTML.
 This is the Python 3 version of the package.

Again, so far so good.  Now let's look at the debian/rules file.

The first thing to do is to add support for dh_python3, which is analogous to dh_python2, and is the only accepted helper for Python 3.  The rules line then becomes:

    dh $@ --with python2,python3

Now, one problem with debhelper is that it doesn't have any built-in support for Python 3 like it does for Python 2.  This means dh will not automatically build or install any Python 3 packages, so you have to do this manually.  Eventually, this will be fixed, and fortunately with a solid setup.py file, you don't have to do too much, but it's something to be aware of.  In the feedparser case, we need to add overrides for dh_auto_build and dh_auto_install.  Here's what these rules look like:

override_dh_auto_build:
	dh_auto_build
	set -ex; for python in $(shell py3versions -r); do \
		$$python setup.py build; \
	done

override_dh_auto_install:
	dh_auto_install
	set -ex; for python in $(shell py3versions -r); do \
		$$python setup.py install --root=$(CURDIR)/debian/tmp --install-layout=deb; \
	done
	cp feedparser/ $(CURDIR)/debian/tmp/usr/lib/python3/dist-packages/

Not too bad, eh?  You'll notice that the first thing these rules do is call the standard dh_auto_build and dh_auto_install respectively.  This preserves the Python 2 support.  Then we just loop over all the available Python 3 versions, doing a fairly normal equivalent of setup.py install (split into a build step and an install step).  The install rule looks a little odd, but should be familiar to Debian Python hackers.  It just installs the package into the proper Debian locations, and will pretty much be the same for any Python 3 package you build.
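To make the loop a bit more concrete, here is a rough Python sketch of the command lines those two overrides end up running for each interpreter.  It assumes the script being invoked is setup.py and that py3versions -r reports interpreter names such as python3.2; the helper names build_commands and install_commands are invented for this illustration.

```python
def build_commands(interpreters):
    """Command lines for the Python 3 half of override_dh_auto_build."""
    return [[python, "setup.py", "build"] for python in interpreters]


def install_commands(interpreters, root):
    """Command lines for the Python 3 half of override_dh_auto_install."""
    return [
        [python, "setup.py", "install", "--root=" + root, "--install-layout=deb"]
        for python in interpreters
    ]


# Assumed example value; on a real system the list comes from `py3versions -r`.
interpreters = ["python3.2"]
for command in build_commands(interpreters) + install_commands(interpreters, "debian/tmp"):
    print(" ".join(command))
```

Every requested Python 3 version gets its own build and install pass, all landing in the same debian/tmp root.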

The one odd bit is the last line in the override_dh_auto_install rule.  This is there just to work around a peculiarity in the feedparser 5.1 upstream package, where it depends on sgmllib, which is no longer in the Python standard library in Python 3.  Upstream provides an already 2to3 converted version of it, and recommends you install the module as sgmllib somewhere on your Python 3 sys.path.  Well, I don't like the namespace pollution that would cause, so I install the file under a different name and add a quilt patch to the package to try an import of that module if importing sgmllib fails (as it will on Python 3).
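The idea behind that patch can be sketched roughly like this.  It is not the actual quilt patch: import_first is a helper invented for this illustration, and _private_sgmllib is a placeholder standing in for whatever name the packaged file is installed under.

```python
import importlib


def import_first(*names):
    """Return the first module in names that can be imported."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of these modules could be imported: " + ", ".join(names))


# Hypothetical usage: prefer the stdlib module, fall back to the private copy.
try:
    sgmllib = import_first("sgmllib", "_private_sgmllib")
except ImportError:
    sgmllib = None  # neither module is available on this interpreter
```

In a real patch a plain try/except ImportError around the two import statements does the same job; the helper just makes the fallback order explicit.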

An aside: If you look in the debian/rules file for what I actually uploaded, you'll see some additional modifications to override_dh_auto_test.  This just works around the upstream bug where some test suite data files were accidentally omitted from the release tarball.  You can pretty much ignore those lines for the purposes of this article.

We're almost done.  The last thing we need to do is make sure that debhelper installs the right files into the right binary packages.  We want the python-feedparser binary package to include only the Python 2 files, and the python3-feedparser binary package to only include the Python 3 files.  Keep in mind that when a source package builds only a single binary package (as was the case before I added Python 3 support), debhelper will include everything under the build directory's debian/tmp subdirectory in the single binary package.  That's why you see things get installed into $(CURDIR)/debian/tmp.  But when a source package builds multiple binary packages, as is now the case here, we have to tell debhelper which files go into which binary packages.  We do this by adding two new files to the debian directory: python-feedparser.install and python3-feedparser.install

Reading the manpage for dh_install will explain the reasons for this, and describe the format of the file contents.  In our case, we're really lucky, because for Python 2, everything gets installed under usr/lib/python2.* and in Python 3, everything gets installed under usr/lib/python3 (relative to $(CURDIR)/debian/tmp).  You'll notice a few things here.  Because we could be building for multiple versions of Python 2, we have to wildcard the actual directory under usr/lib, e.g. it might be python2.6 or python2.7.  But because we have PEP 3147 and PEP 3149 in Python 3.2, there's only one directory for all supported versions of Python 3, so we don't need to wildcard the subdirectory.  Also, if you look at the actual .install files in the package, you'll see a few other trailing path components, so the actual contents of the files are:


for the python-feedparser.install and python3-feedparser.install files respectively.  The trailing bits just wildcard what on a Debian system will always be dist-packages, just for safety (cargo culting FTW!).
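If the wildcarding seems abstract, here is a small Python sketch of how the two top-level locations named above sort into binary packages.  It uses fnmatch only as an approximation of the shell-style wildcards dh_install understands, it checks just the usr/lib/python2.* and usr/lib/python3 prefixes from the text rather than the full .install file contents, and the owner function is made up for this example.

```python
from fnmatch import fnmatchcase

PY2_PATTERN = "usr/lib/python2.*"  # one directory per Python 2 version
PY3_DIR = "usr/lib/python3"        # one shared directory for Python 3


def owner(top_dir):
    """Return the binary package a top-level install directory belongs to."""
    if fnmatchcase(top_dir, PY2_PATTERN):
        return "python-feedparser"
    if top_dir == PY3_DIR:
        return "python3-feedparser"
    return None


for top_dir in ("usr/lib/python2.6", "usr/lib/python2.7", "usr/lib/python3"):
    print(top_dir, "->", owner(top_dir))
```

Both Python 2 directories fall under the one wildcard, while Python 3 needs no wildcard at all for its version directory.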

And that really is it!  Of course, things could be a little more complicated if you have extension modules, but maybe not that much more so, and if the package you're adding Python 3 support to isn't setuptools-based, you may have more work to do even still.  The feedparser package has a few other oddities that are really unrelated to adding Python 3 support, so I'm ignoring them here, but feel free to ask for additional details in the comments, in IRC, or in email.

Hopefully this gives you some insight into how to extend an existing Python 2 Debian package into including Python 3 support, given that your upstream already supports Python 3.  Now, go forth and hack!

Addendum: my colleague Colin Watson just today packaged up Benjamin Peterson's very fine Python package called six.  This is a nice package that provides some excellent Python 2 and 3 compatibility utilities.  You may find this helpful if you're trying to support both Python 2 and Python 3 in a single code base, especially if you have to support back to Python 2.4 (poor you :).  This will be available in Ubuntu Precise, although if you're submitting patches back upstream, you may have to convince the upstream author to accept the additional dependency.  It's worth it to add a little more Python 3 love to the world.


  1. Hey Barry, thanks for working to get a Python 3 version of feedparser into Ubuntu!

    First, I'm sorry you ran into such troubles with the package. You alluded to oddities with feedparser but asked that people contact you out-of-band about them. Instead, could you post them to the feedparser mailing list or submit more bug reports? I'd love to hear how the library can improve, although for the past few months I've frequently been working extended hours and often on weekends so I haven't been able to react quickly when I hear about problems. :( Nevertheless, I'm excited to hear what issues you've run into so I can attack them as I'm able!

    Second, while feedparser does depend on sgmllib even on Python 3, it never occurred to me that it might be difficult to package for a distro! I'm glad you were able to work around the issue. Perhaps it would be possible to package sgmllib3k, available on PyPI, and make that a dependency? (I really, really don't want to keep sgmllib in the repo anyway, so I'm looking forward to the day I can remove it.)

    Third, I've never tested feedparser with chardet, and the test suite disables tidy support while running. As far as I know, chardet is completely unmaintained, and the tidy libraries looked to be at least five years old when last I looked. I strongly discourage adding any of those packages as a recommended package for feedparser, as I plan to remove tidy support entirely from feedparser in the future, and I haven't yet evaluated what value chardet offers.

    Fourth, it never occurred to me that there would be a problem with not converting the unit tests using 2to3. The unit tests obviously have no business getting installed, but I'm all ears if you have some suggestions how to alleviate the situation for Python 3 users.

    Finally, please don't hesitate to bring up any issues you run into on the mailing list, or by filing bug reports as you have been! I'm grateful that you've worked to overcome the current issues to get the software packaged for Ubuntu, so I'd like to make your life easier in future iterations.

    Thanks again!

  2. Hi Kurt. First, I hope I didn't sound too harsh on feedparser; it's a wonderful package and you've done a great job with it! I think you know about most/all of the problems I ran into. Issue 313 covers the missing data files (I added these to the Debian packaging, but they are easily removed with the next feedparser release). Issue 323 covers the other test suite failures, some caused by the presence of chardet, and another because of the file system encoding in an schroot environment.

    I think it would be fine if feedparser included the renamed sgmllib module as I describe above, but I also understand you not wanting to distribute that. I did find this on the Cheeseshop, but haven't looked at it yet:

    Seems like that's intended to be the same thing you've got. If you fix feedparser to depend on this PyPI package, I'd happily package it up for Debian (I couldn't find an existing ITP).

    utidylib is completely decrepit; it hasn't been updated in years afaict. ;) I just ignore that dependency in the Python 3 package.

    One thing to think about for Python 3 support. If you can drop support for anything earlier than Python 2.6, then it might be possible to support Python 2 and 3 with the same code base. I've managed to eliminate 2to3 for all my libraries this way, and with a bit of virtualenv testing, it should be easy to do. Then feedparser could just be dually compatible w/o the need for 2to3.

    Thanks again for such a great package, and for contributing to this thread!

  5. You need --force for $$python setup.py install --root=$(CURDIR)/debian/tmp --install-layout=deb;.

    Otherwise, a #! header with the wrong version may end up in the package, as happened to me.

    Think of the case where the python3 default is 3.2 while 3.3 and 3.2 are both available, as in current Debian unstable.

  6. Thank you so much for this! I've spent so many hours searching for Python 3 packaging.

    One slight problem: my Python 3 .deb doesn't run my postinst script. I'm assuming it's something to do with the override_dh_auto_install?

  7. Hi Thomas, I don't think that would do it, but check to see if you're upcalling to the default dh_auto_install, e.g.

    override_dh_auto_install:
        dh_auto_install
        # My extra stuff

    Note too that Piotr just announced pybuild in unstable, so building distutils-based packages for Python 2 and 3 is going to get even easier.

  8. wow, thats a huge article, nice to know this, thanks
