The right way to internationalize your Python app

Written by Barry Warsaw in technology on Fri 22 June 2012. Tags: i18n, mailman, python, python3, ubuntu,

Recently, as part of our push to ship only Python 3 on the Ubuntu 12.10 desktop, I've helped several projects update their internationalization (i18n) support. I've seen lots of instances of suboptimal Python 2 i18n code, which leads to liberal sprinkling of cargo culted .decode() and .encode() calls simply to avoid the dreaded UnicodeError s. These get worse when the application or library is ported to Python 3 because then even the workarounds aren't enough to prevent nasty failures in non-ASCII environments (i.e. the non-English speaking world majority :).

Let's be honest though, the problem is not because these developers are crappy coders! In fact, far from it, the folks I've talked with are really really smart, experienced Pythonistas. The fundamental problem is Python 2's 8-bit string type which doubles as a bytes type, and the terrible API of the built-in Python 2 gettext module, which does its utmost to sabotage your Python 2 i18n programs. I take considerable blame for the latter, since I wrote the original version of that module. At the time, I really didn't understand unicodes (this is probably also evident in the mess I made of the email package). Oh, to really have access to Guido's time machine.

The good news is that we now know how to do i18n right, especially in a bilingual Python 2/3 world, and the Python 3 gettext module fixes the most egregious problems in the Python 2 version. Hopefully this article does some measure of making up for my past sins.

Stop right here and go watch Ned Batchelder's talk from PyCon 2012 entitled Pragmatic Unicode, or How Do I Stop the Pain? It's the single best description of the background and effective use of Unicode in Python you'll ever see. Ned does a brilliant job of …

Continue reading »


Python 3 on the desktop for Quantal Quetzal

Written by Barry Warsaw in technology on Tue 24 April 2012. Tags: canonical, debian, ubuntu, python, python3,

So, now all the world now knows that my suggested code name for Ubuntu 12.10, Qwazy Quahog, was not chosen by Mark. Oh well, maybe I'll have more luck with Racy Roadrunner.

In any case, Ubuntu 12.04 LTS is to be released any day now so it's time for my semi-annual report on Python plans for Ubuntu. I seem to write about this every cycle, so 12.10 is no exception. We've made some fantastic progress, but now it's time to get serious.

For Ubuntu 12.10, we've made it a release goal to have Python 3 only on the desktop CD images. The usual caveats apply: Python 2.7 isn't going away; it will still probably always be available in the main archive. This release goal also doesn't affect other installation CD images, such as server, or other Ubuntu flavors. The relatively modest goal then only affects packages for the standard desktop CD images, i.e. the alternative installation CD and the live CD.

Update 2012-04-25: To be crystal clear, if you depend on Python 2.7, the only thing that changes for you is that after a fresh install from the desktop CD on a new machine, you'll have to explicitly apt-get install *python2.7. After that, everything else will be the same.

This is ostensibly an effort to port a significant chunk of Ubuntu to Python 3, but it really is a much wider, Python-community driven effort. Ubuntu has its priorities, but I personally want to see a world where Python 3 rules the day, and we can finally start scoffing at Python 2 :).

Still, that leaves us with about 145 binary packages (and many fewer source packages) to port. There are a few categories of packages to consider:

Already ported and available.
This is …

Continue reading »


Debian packaging for Python 2 and 3

Written by Barry Warsaw in technology on Wed 18 January 2012. Tags: python, debian,

Time for another installment of my ongoing mission to convert the world to Python 3! This time, a little Debian packaging-fu for modifying an existing Python 2 package to include support for Python 3 from the same source package.

Today, I added a python3-feedparser package to Ubuntu Precise. What's interesting about this is that, despite various reported problems, upstream feedparser 5.1 claims to support Python 3, via 2to3 conversion. And indeed it does (although the test suite does not).

Before today, Ubuntu had feedparser 5.0.1 in its archive, and while some work has been done to update the Debian package to 5.1, this has not been released. The uninteresting precursor to Python 3 packaging was to upgrade the Ubuntu version of the python-feedparser source package to 5.1. I'll spare you the boring details about missing data files in the upstream tarball, and other problems, since they don't really relate to the Python 3 effort.

The first step was to verify that feedparser 5.1 works with Python 3.2 in a virtualenv, and indeed it does. This is good news because it means that the setup.py does the right thing, which is always the best way to start supporting Python 3. I've found that it's much easier to build a solid Debian package if you have a solid setup.py in upstream to begin with.

Now, what I'd like to do is to give you a recipe for modifying your existing debian/ directory files to add Python 3 support to a package that already exists for Python 2. This is a little trickier for feedparser because it used an older debhelper standard, and carried some crufty old stuff in its rules file. My first step was to update this to debhelper compatibility level 8 …

Continue reading »


Python 3 Porting Fun Redux

Written by Barry Warsaw in technology on Fri 06 January 2012. Tags: python3,

My last post on Python 3 porting got some really great responses, and I've learned a lot from the feedback I've seen. I'm here to rather briefly outline a few additional tips and tricks that folks have sent me and that I've learned by doing other ports since then. Please keep them coming, either in the blog comments or to me via email. Or better yet, blog about your experiences yourself and I'll link to them from here.

One of the big lessons I'm trying to adopt is to support Python 3 in pure-Python code with a single code base. Specifically, I'm trying to avoid using 2to3 as much as possible. While I think 2to3 is an excellent tool that can make it easier to get started supporting both Python 2 and Python 3 from a single branch of code, it does have some disadvantages. The biggest problem with 2to3 is that it's slow; it can take a long time to slog through your Python code, which can be a significant impediment to your development velocity. Another 2to3 problem is that it doesn't always play nicely with other development tools, such as python setup.py test and virtualenv, and you occasionally have to write additional custom fixers for conversion that 2to3 doesn't handle.

Given that almost all the code I'm writing these days targets Python 2.6 as the minimal supported Python 2 version, 2to3 may just be unnecessary. With my dbus-python port to Python 3, and with my own flufl packages, I'm experimenting with ignoring 2to3 and trying to write one code base for all of Python 2.6, 2.7, and 3.2. My colleague Michael Foord has been pretty successful with this approach going back all the way to Python 2.4, so 2.6 as a …

Continue reading »


Lessons in porting to Python 3

Written by Barry Warsaw in technology on Wed 07 December 2011. Tags: python, python3, ubuntu,

Yesterday, I completed my port of dbus-python to Python 3, and submitted my patch upstream. While I've yet to hear any feedback from Simon about my patch, I'm fairly confident that it's going in the right direction. This version should allow existing Python 2 applications to run largely unchanged, and minimizes the differences that clients will have to make to use the Python 3 version.

Some of the changes are specific to the dbus-python project, and I included a detailed summary of those changes and my rationale behind them. There are lots of good lessons learned during this porting exercise that I want to share with you, have a discussion about, and see if there aren't things we core Python developers can do in Python 3.3 to make it even easier to migrate to Python 3.

First, some background. D-Bus is a freedesktop.org project for same-system interprocess communication, and it's an essential component of any Linux desktop. The D-Bus system and C API are mature and well-defined, and there are bindings available for many programming language, Python included of course. The existing dbus-python package is only compatible with Python 2, and most recommendations are to use the Gnome version of Python bindings should you want to use D-Bus with Python 3. For us in Ubuntu, this isn't acceptable though because we must have a solution that supports KDE and potentially even non-UI based D-Bus Python servers. Several ports of dbus-python to Python 3 have been attempted in the past, but none have been accepted upstream, so naturally I took it as a challenge to work on a new version of the port. After some discussion with the upstream maintainer Simon McVittie, I had a few requirements in mind:

  • One code base for both Python 2 and Python 3 …

Continue reading »


An update on Ubuntu's Python plans

Written by Barry Warsaw in technology on Wed 23 November 2011. Tags: python, ubuntu,

Earlier this month, I attended UDS-P (the Ubuntu Developers Summit for 12.04 Precise Pangolin). 12.04 is an LTS release, or Long Term Support, meaning it will be officially supported on both the desktop and server for five years.

During the summit we reiterated our plans for Python on Ubuntu, both for 12.04 LTS, and our vision of Python two years from now for the next LTS, 14.04. I'm here to provide an update from my last report, which was written after UDS-O for 11.10.

As per those previous plans, we've removed Python 2.6 from Ubuntu 12.04. Now you have only Python 2.7 and 3.2. Dropping Python 2.6 may cause some inconvenience for data centers which like to upgrade only between LTS's but don't want to have to upgrade both their operation system and their Python version. Canonical services such as Launchpad and Landscape fall into this camp. The biggest problem is that Python 2.7 is not available for the last LTS, i.e. 10.04. To mitigate this, we decided to create a PPA containing Python 2.7 and a bunch of packages that services such as Launchpad will need to do their porting. This now exists, although it hasn't yet been tested with any development branches of Launchpad or Landscape as far as I'm aware. If you have your own data center porting task ahead of you, you can also use this PPA, and if there are additional packages you need for 10.04, you can create your own PPA which depends on ours, and build those extra packages there.

But that's all boring stuff. Let's have some fun!

The other decision we made concerns Python 3. I strongly feel that the Python community is very close to …

Continue reading »


sbuild with local, newer, dependencies

Written by Barry Warsaw in technology on Thu 01 September 2011. Tags: canonical, debian, packaging, ubuntu,

Update 2016-11-28: I've updated this article with new instructions!

sbuild is an excellent tool for locally building Ubuntu and Debian packages. It fits into roughly the same problem space as the more popular pbuilder, but for many reasons, I prefer sbuild. It's based on schroot to create chroot environments for any distribution and version you might want. For example, I have chroots for Ubuntu Oneiric, Natty, Maverick, and Lucid, Debian Sid, Wheezy, and Squeeze, for both i386 and amd64. It uses an overlay filesystem so you can easily set up the primary snapshot with whatever packages or prerequisites you want, and the individual builds will create a new session with an overlaid temporary filesystem on top of that, so the build results will not affect your primary snapshot. sbuild can also be configured to save the session depending on the success or failure of your build, which is fantastic for debugging build failures. I've been told that Launchpad's build farm uses a customized version of sbuild, and in my experience, if you can get a package to build locally with sbuild, it will build fine in the main archive or a PPA.

Right out of the box, sbuild will work great for individual package builds, with very little configuration or setup. The Ubuntu Security Team's wiki page has some excellent instructions for getting started (you can stop reading when you get to UMT :).

One thing that sbuild doesn't do very well though, is help you build a stack of packages. By that I mean, when you have a new package that itself has new dependencies, you need to build those dependencies first, and then build your new package based on those dependencies. Here's an example.

I'm working on bug 832864 and I wanted to see if I could build the …

Continue reading »


ONOTE and OTONE - A musical project

Written by Barry Warsaw in music on Wed 20 July 2011. Tags: music, otone, onote,

I'm starting a new musical project, which I'm calling OTONE and/or ONOTE. Actually, I've been working on this project for several years without realizing what I wanted to do with it. It coalesced in my mind when I thought of the acronyms above. Here's what they stand for:

  • The One Tune One Night Experiment (OTONE)
  • The One Night One Tune Experiment (ONOTE)

I'm not yet sure what the difference between the two are yet (though see below), but here's the idea behind the project.

If you're like me, you can easily sweat over a song and its recording forever, tweaking the mix, or hearing another melody, or (worst of all) agonizing over every word of a lyric that was like pulling teeth in the first place. Sometimes you think if you just do one more take of the guitar, you can get it perfect, or oh! it just needs a little bit of tamborine right there. Sometimes the arrangement just doesn't sit quite right, or you know in your gut that lurking out there somewhere there's a better way to get from the bridge to the last chorus.

Well, I'm kind of frustrated with that because it can lead to never actually finishing a song and getting it out there for folks to hear. At some point you reach diminishing returns, where the little tweaks don't really improve the song enough. Probably most importantly, the whole thing can put the brakes on the creative process. I liken it to software maintenance vs. creating a new project from scratch.

Software maintenance is important, useful, and can be fun, but the juices really get flowing when you're starting a new project. You get this rush of an idea and your fingers can't type fast enough to translate them into code. It's …

Continue reading »


Ignore at your own peril

Written by Barry Warsaw in technology on Wed 13 July 2011. Tags: ubuntu,

On Monday, I lost my home directory on my primary development machine. I'd had this machine for a couple of years but it was still beefy enough to be an excellent development box. I've upgraded it several times with each new Ubuntu release, and it was running Natty. I had decent sbuild and pbuilder environments, and a bunch of virtual machines for many different flavors of OS.

I'd also encrypted my home directory when I did the initial install. Under Ubuntu, this creates an ecryptfs and does some mount magic after you successfully log in. It's as close to FileVault as you can get on Ubuntu, and I think it does a pretty good job without incurring much noticeable overhead. Plus, with today's Ubuntu desktop installers, enabling an encrypted home directory is just a trivial checkbox away.

To protect your home directory, ecryptfs creates a random hex passphrase that is used to decrypt the contents of your home directory. To protect this passphrase, it encrypts it with your login password. ecryptfs stores this "wrapped" passphrase on disk in the ~/.ecryptfs/wrapped-passphrase file.

When you log in, ecryptfs uses your login password to decrypt wrapped-passphrase, and then uses the crazy long hex number inside it to decrypt your real home directory. Usually, this works seamlessly and you never really see the guts of what's going on. The problem of course is that if you ever lose your wrapped-passphrase file, you're screwed because without that long hex number, your home directory cannot be decrypted. Yay for security, boo for robustness!

When you do your initial installation and choose to encrypt your home directory, you will be prompted to write down the long hex number, i.e. your unwrapped passphrase. Here's the moral of the story. 1) You should do this; 2) You …

Continue reading »


PEP 382 sprint summary

Written by Barry Warsaw in technology on Wed 22 June 2011. Tags: debian, packaging, python, ubuntu,

So, yesterday (June 21, 2011), six talented and motivated Python hackers from the Washington DC area met at Panera Bread in downtown Silver Spring, Maryland to sprint on PEP 382. This is a Python Enhancement Proposal to introduce a better way for handling namespace packages, and our intent is to get this feature landed in Python 3.3. Here then is a summary, from my own spotty notes and memory, of how the sprint went.

First, just a brief outline of what the PEP does. For more details please read the PEP itself, or join the newly resurrected import-sig for more discussions. The PEP has two main purposes. First, it fixes the problem of which package owns a namespace's __init__.py file, e.g. zope/__init__.py for all the Zope packages. In essence, it eliminate the need for these by introducing a new variant of .pth files to define a namespace package. Thus, the zope.interfaces package would own zope/zope-interfaces.pth and the zope.components package would own zope/zope-components.pth. The presence of either .pth file is enough to define the namespace package. There's no ambiguity or collision with these files the way there is for zope/__init__.py. This aspect will be very beneficial for Debian and Ubuntu.

Second, the PEP defines the one official way of defining namespace packages, rather than the multitude of ad-hoc ways currently in use. With the pre-PEP 382 way, it was easy to get the details subtly wrong, and unless all subpackages cooperated correctly, the packages would be broken. Now, all you do is put a * in the .pth file and you're done.

Sounds easy, right? Well, Python's import machinery is pretty complex, and there are actually two parallel implementations of it in Python 3.3, so gaining traction on …

Continue reading »