We Fear Change

Creating Python snaps

Written by Barry Warsaw in technology on Thu 02 April 2015. Tags: debugging, python, python3, ubuntu

Background

Snappy Ubuntu Core is a new edition of the Ubuntu you know and love, with some interesting new features, including atomic, transactional updates, and a much more lightweight application deployment story than traditional Debian/Ubuntu packaging. Much of this work grew out of our development of a mobile/touch based version of Ubuntu for phones and tablets, but now Ubuntu Core is available for clouds and devices.

I find the transactional nature of upgrades to be very interesting. While you still get a perfectly normal Ubuntu system, your root file system is read-only, so traditional apt-get based upgrades don't work. Instead, your system version is image based; today you are running image 231 and tomorrow a new image is released to get you to 232. When you upgrade to the new image, you get all the system changes. We support both full and delta upgrades (the latter of which reduces bandwidth), and even phased updates so that we can roll out new upgrades and quickly pull them from the server side if we notice a problem. Snappy devices even support rolling back upgrades on a single device, by using a dual-partition root file system. Phones generally don't support this due to lack of available space on the device.

Of course, the other really interesting thing about Snappy is the lightweight, flexible approach to deploying applications. I still remember my early days learning how to package software for Debian and Ubuntu, and now that I'm both an Ubuntu Core Developer and Debian Developer, I understand pretty well how to properly package things. There's still plenty of black art involved, even for relatively easy upstream packages such as distutils/setuptools-based Python packages available on the Cheeseshop (er, PyPI). The Snappy approach on Ubuntu Core is much more lightweight and easy, and it doesn't require the magical approval of the archive elves, or the vagaries of PPAs, to make your applications quickly available to all your users. There's even a robust online store for publishing your apps.

There's lots more about Snappy apps and Ubuntu Core that I won't cover here, so I encourage you to follow the links for more information. You might also want to stop now and take the tour of Ubuntu Core (hey, I'm a poet and I didn't even realize it).

In this post, I want to talk about building and deploying snappy Python applications. Python itself is not an officially supported development framework, but we have a secret weapon. The system image client upgrader -- i.e. the component on the devices that checks for, verifies, downloads, and applies atomic updates -- is written in Python 3. So the core system provides us with a full-featured Python 3 environment we can utilize.

The question that came to mind is this: given a command-line application available on PyPI, how easy is it to turn into a snap and install it on an Ubuntu Core system? With some caveats I'll explore later, it's actually pretty easy!

Basic approach

The basic idea is this: let's take a package on PyPI, which may have additional dependencies also on PyPI, download them locally, and build them into a snap that we can install on an Ubuntu Core system.

The first question is, how do we build a local version of a fully-contained Python application? My initial thought was to build a virtual environment using virtualenv or pyvenv, and then somehow turn that virtual environment into a snap. This turns out to be difficult in practice because virtual environments aren't really designed for this. They have issues with being relocated, for example, and they can contain a lot of extraneous stuff that's great for development (a virtual environment's actual purpose) but unnecessary baggage for our use case.

My second thought involved turning a Python application into a single file executable, and from there it would be fairly easy to snappify. Python has a long tradition of such tools, many with varying degrees of cross platform portability and standalone-ishness. After looking again at some oldies but goodies (e.g. cx_freeze) and some new offerings, I decided to start with pex.

pex is a nice tool developed by Brian Wickman and the Twitter folks which they use to deploy Python applications to their production environment. pex takes advantage of modern Python's support for zip imports, and a clever trick of zip files.

Python supports direct imports (of pure Python modules) from zip files, and the python executable's -m option works even when the module is inside a zip file. Further, the presence of a __main__.py file within a package can be used as shorthand for executing the package, e.g. python -m myapp will run myapp/__main__.py if it exists.
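
Here's a minimal sketch of both behaviors using only the standard library. The archive name bundle.zip and the contents of the myapp package are made up for illustration; runpy.run_module() is used as the in-process equivalent of python -m myapp.

```python
import contextlib
import io
import os
import runpy
import sys
import tempfile
import zipfile

# Build a small zip holding a pure-Python package with a __main__.py.
# "bundle.zip" and the package contents are hypothetical examples.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "bundle.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("myapp/__init__.py", "GREETING = 'hello from inside a zip'\n")
    zf.writestr("myapp/__main__.py", "from myapp import GREETING\nprint(GREETING)\n")

# Putting the zip file on sys.path makes its pure-Python modules importable.
sys.path.insert(0, archive)

import myapp
greeting = myapp.GREETING

# Equivalent of "python -m myapp": runs myapp/__main__.py from the zip.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    runpy.run_module("myapp", run_name="__main__")
output = captured.getvalue()
```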

Zip files are interesting because their index is at the end of the file. This allows you to put whatever you want at the front of the file and it will still be considered a zip file. pex exploits this by putting a shebang in the first line of the file, e.g. #!/usr/bin/python3 and thus the entire zip file becomes a single file executable of Python code.
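
To see the trick in isolation, here's a small sketch (not pex's actual code) that prepends a shebang line to an in-memory zip and checks that the standard zipfile module still reads it. The build_executable_zip() helper is a name invented for this example.

```python
import io
import zipfile

SHEBANG = b"#!/usr/bin/python3\n"


def build_executable_zip(sources):
    """Return the bytes of a zip archive with a shebang line prepended.

    `sources` maps archive member names to their text contents.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, text in sources.items():
            zf.writestr(name, text)
    # The zip index lives at the end of the data, so leading bytes are fine.
    return SHEBANG + buffer.getvalue()


payload = build_executable_zip({"__main__.py": "print('hi')\n"})

# The zip reader locates the index from the end and ignores the shebang.
with zipfile.ZipFile(io.BytesIO(payload)) as zf:
    names = zf.namelist()
    main_source = zf.read("__main__.py").decode("utf-8")
```

Write payload to a file, mark it executable, and the kernel hands it to the interpreter named in the shebang, which then runs the __main__.py inside.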

There are, of course, plenty of caveats. Probably the main one is that Python cannot import extension modules directly from the zip, because the dlopen() function call only takes a file system path. pex handles this by marking the resulting file as not zip safe, so the zip is written out to a temporary directory first.

The other issue, of course, is that the zip file must contain all the dependencies not present in the base Python. pex is actually fairly smart here, in that it will chase dependencies, much like pip, and it will include those dependencies in the zip file. You can also specify any missed dependencies explicitly on the pex command line.

Once we have the pex file, we need to add the required snappy metadata and configuration files, and run the snappy command to generate the .snap file, which can then be installed into Ubuntu Core. Since we can extract almost all of the minimal required snappy metadata from the Python package metadata, we need only a little input from the user, and the rest of the work can be automated.

We're also going to avail ourselves of a convenient cheat. Because Python 3 and its standard library are already part of Ubuntu Core on a snappy device, we don't need to worry about any of those dependencies. We're only going to support Python 3, so we get its full stdlib for free. If we needed access to Python 2, or any external libraries or add-ons that can't be made part of the zip file, we would need to create a snappy framework for that, and then utilize that framework for our snappy app. That's outside the scope of this article though.

Requirements

To build Python snaps, you'll need to have a few things installed. If you're using Ubuntu 15.04, just apt-get install the appropriate packages. Otherwise, you can get any additional Python requirements by building a virtual environment and installing tools like pex and wheel into it, then invoking pex from that virtual environment. But let's assume you have the Vivid Vervet (Ubuntu 15.04); here are the packages you need:

python3
python-pex-cli
python3-wheel
snappy-tools
git

You'll also want a local git clone of https://gitlab.com/warsaw/pysnap.git which provides a convenient script called snap.py for automating the building of Python snaps. We'll refer to this script extensively in the discussion below.

For extra credit, you might want to get a copy of Python 3.5 (unreleased as of this writing). I'll show you how to do some interesting debugging with Python 3.5 later on.

From PyPI to snap in one easy step

Let's start with a simple example: world is a very simple script that can provide forward and reverse mappings of ISO 3166 two letter country codes (at least as of before ISO once again paywalled the database). So if you get an email from guido@example.py you can find out where the BDFL has his secret lair:

$ world py
py originates from PARAGUAY

world is a pure-Python package with both a library and a command line interface. To get started with the snap.py script mentioned above, you need to create a minimal .ini file, such as:

[project]
name: world

[pex]
verbose: true

Let's call this file world.ini. (In fact, you'll find this very file under the examples directory in the snap git repository.) What do the various sections and variables control?

name is the name of the project on PyPI. It's used to look up metadata about the project on PyPI via PyPI's JSON API.
verbose just defines whether to pass -v to the underlying pex command.
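
For a feel of how such a file can be consumed, here's a rough sketch using the standard library's configparser, which accepts the colon-delimited syntax shown above. This is an assumption about one reasonable way to read the file, not a copy of what snap.py does, and load_settings() is a hypothetical helper.

```python
import configparser

WORLD_INI = """\
[project]
name: world

[pex]
verbose: true
"""


def load_settings(text):
    """Parse the .ini text and return the two settings discussed above."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "name": parser.get("project", "name"),
        # getboolean() accepts true/false, yes/no, on/off and 1/0.
        "verbose": parser.getboolean("pex", "verbose", fallback=False),
    }


settings = load_settings(WORLD_INI)
# Only pass -v to pex when verbose is switched on.
pex_flags = ["-v"] if settings["verbose"] else []
```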

Now, to create the snap, just run:

$ ./snap.py examples/world.ini

You'll see a few progress messages and a warning which you can ignore. Then out spits a file called world_3.1.1_all.snap. Because this is pure Python, it's architecture independent. That's a good thing because the snap will run on any device, such as a local amd64 kvm instance, or an ARM-based Ubuntu Core-compatible Lava Lamp.
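
The version in that file name comes from the project's metadata. As a hedged sketch of how a lookup through PyPI's JSON API and the file naming might fit together: the URL below is PyPI's current JSON endpoint, while fetch_project_info() and snap_filename() are names invented here and are not taken from snap.py.

```python
import json
import urllib.request


def fetch_project_info(name):
    """Return the "info" mapping for a project on PyPI (needs network access)."""
    url = "https://pypi.org/pypi/{}/json".format(name)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))["info"]


def snap_filename(info, arch="all"):
    """Build a <name>_<version>_<arch>.snap file name from project metadata."""
    return "{}_{}_{}.snap".format(info["name"], info["version"], arch)


# A canned stand-in for what fetch_project_info("world") would return,
# so the example runs without touching the network.
sample_info = {"name": "world", "version": "3.1.1"}
filename = snap_filename(sample_info)
```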

Armed with this new snap, we can just install it on our device (in this case, a local kvm instance) and then run it:

$ snappy-remote --url=ssh://localhost:8022 install world_3.1.1_all.snap
$ ssh -p 8022 ubuntu@localhost
ubuntu@localhost:~$ world.world py
py originates from PARAGUAY

From git repository to snap in one easy step

Let's look at another example, this time using a stupid project that contains an extension module. This aptly named package just prints a "yes" for every -y argument, and "no" for every -n argument.

The difference here is that stupid isn't on PyPI; it's only available via git. The snap.py helper is smart enough to know how to build snaps from git repositories. Here's what the stupid.ini file looks like:

[project]
name: stupid
origin: git https://gitlab.com/warsaw/stupid.git

[pex]
verbose: yes

Notice that there's a [project]origin variable. This just says that the origin of the package isn't PyPI, but instead a git repository, and then the public repo url is given. The first word is just an arbitrary protocol tag; we could eventually extend this to handle other version control systems or origin types. For now, only git is supported.
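
Splitting that value into its protocol tag and location is simple enough. Here's an illustrative sketch; parse_origin() is a hypothetical helper written for this post's description, not the function snap.py uses.

```python
def parse_origin(value):
    """Split an origin value into (protocol, location).

    The first word is the protocol tag; everything after it is the location.
    Only "git" is accepted, mirroring the single supported origin type.
    """
    protocol, _, location = value.strip().partition(" ")
    location = location.strip()
    if protocol != "git":
        raise ValueError("unsupported origin type: {!r}".format(protocol))
    if not location:
        raise ValueError("origin is missing a repository URL")
    return protocol, location


origin = parse_origin("git https://gitlab.com/warsaw/stupid.git")
```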

To build this snap:

$ ./snap.py examples/stupid.ini

This clones the repository into a temporary directory, builds the Python package into a wheel, and stores that wheel in a local directory. pex has the ability to build its pex file from local wheels without hitting PyPI, which we use here. Out spits a file called stupid_1.1a1_all.snap, which we can install in the kvm instance using the snappy-remote command as above, and then run it after ssh'ing in:
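
The clone-and-build steps can be sketched roughly like this. It's an approximation under my own assumptions about directory layout, not snap.py's code: wheel_build_commands() and build_wheel() are hypothetical names, and the pex invocation is left out on purpose.

```python
import os
import subprocess
import tempfile


def wheel_build_commands(repo_url, clone_dir, wheel_dir):
    """Return (argv, cwd) pairs for cloning a repo and building its wheel."""
    return [
        # Clone the upstream repository into clone_dir.
        (["git", "clone", repo_url, clone_dir], None),
        # Build a wheel from inside the clone, writing it to wheel_dir.
        (["python3", "setup.py", "bdist_wheel", "--dist-dir", wheel_dir], clone_dir),
    ]


def build_wheel(repo_url, wheel_dir):
    """Clone into a temporary directory and leave the wheel in wheel_dir."""
    wheel_dir = os.path.abspath(wheel_dir)
    with tempfile.TemporaryDirectory() as tmp:
        clone_dir = os.path.join(tmp, "clone")
        for argv, cwd in wheel_build_commands(repo_url, clone_dir, wheel_dir):
            subprocess.check_call(argv, cwd=cwd)


commands = wheel_build_commands(
    "https://gitlab.com/warsaw/stupid.git", "/tmp/clone", "/tmp/wheels")
```

Calling build_wheel("https://gitlab.com/warsaw/stupid.git", "wheels") would run both commands for real and needs git, network access, and the wheel package.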

ubuntu@localhost:~$ stupid.stupid -ynnyn
yes
no
no
yes
no

Watch out though, because this snap is really not architecture-independent. It contains an extension module which is compiled on the host platform, so it is not portable to different architectures. It works on my local kvm instance, but sadly not on my Lava Lamp.

Entry points

pex currently requires you to explicitly name the entry point of your Python application. This is the function which serves as your main and it's what runs by default when the pex zip file is executed.

Usually, a Python package will define its entry point in its setup.py file, like so:

setup(
    ...
    entry_points={
        'console_scripts': ['stupid = stupid.__main__:main'],
        },
    ...
    )

And if you have a copy of the package, you can run a command to generate the various package metadata files:

$ python3 setup.py egg_info

If you look in the resulting stupid.egg-info/entry_points.txt file, you see the entry point clearly defined there. Ideally, either pex or snap.py would just figure this out automatically. As it turns out, there's already a feature request open on pex for this, but in the meantime, how can we auto-detect the entry point?

For the stupid example, it's pretty easy. Once we've cloned its git repository, we just run the egg_info command and read the entry_points.txt file. Later, we can build the project's binary wheel from the same git clone.
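
Since entry_points.txt is an INI-style file, reading it takes only a few lines. Here's a sketch of one way to do it with configparser; console_scripts() is a helper name I made up, and the sample text mirrors what the setup() call above would generate.

```python
import configparser


def console_scripts(entry_points_text):
    """Return a {script name: "module:function"} dict from entry_points.txt."""
    # Only "=" separates keys from values; the ":" belongs to the entry point.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(entry_points_text)
    if not parser.has_section("console_scripts"):
        return {}
    return dict(parser.items("console_scripts"))


SAMPLE = """\
[console_scripts]
stupid = stupid.__main__:main
"""

scripts = console_scripts(SAMPLE)
entry_point = scripts["stupid"]
```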

It's a bit more problematic with world though because the package isn't downloaded from PyPI until pex runs, but the pex command line requires that you specify the entry point before the download occurs.

We can handle this by supporting an entry_point variable in the snap's .ini file. For example, here's the world.ini file with an explicit entry point setting:

[project]
name: world
entry_point: worldlib.__main__:main

[pex]
verbose: true

What if we still wanted to auto-detect the entry point? We could, of course, download the world package in snap.py and run the egg_info command over that. But pex also wants to download world and we don't want to have to download it twice. Maybe we could download it in snap.py and then build a local wheel file for pex to consume?

As it turns out there's an easier way.

Unfortunately, package egg-info metadata is not available on PyPI, although arguably it should be. Fortunately, Vinay Sajip runs an external service that does make the metadata available, such as the metadata for world.

snap.py makes the entry_point variable optional, and if it's missing, it will grab the package metadata from a link like that given above. An error will be thrown if the file can't be found, in which case, for now, you'd just add the [project]entry_point variable to the .ini file.

A little more snap.py detail

The snap.py script is more or less a pure convenience wrapper around several independent tools. It uses pex, of course, for creating the single executable zip file, but also the snappy command for building the .snap file. It also utilizes python3 setup.py egg_info where possible to extract metadata and construct the snappy facade needed for the snappy build command. Less typing for you! In the case of a snap built from a git repository, it also performs the git cloning, and the python3 setup.py bdist_wheel command to create the wheel file that pex will consume.

There's one other important thing snap.py does: it fixes the resulting pex file's shebang line. Because we're running these snaps on an Ubuntu Core system, we know that Python 3 will be available in /usr/bin/python3. We want the pex file's shebang line to be exactly this. While pex supports a --python option to specify the interpreter, it doesn't take the value literally. Instead, it takes the last path component and passes it to /usr/bin/env so you end up with a shebang line like:

#!/usr/bin/env python3

That might work, but we don't want the pex file to be subject to the uncertainties of the $PATH environment variable.

One of the things that snap.py does is repack the pex file. Remember, it's just a zip file with some magic at the top (that magic is the shebang), so we just read the file that pex spits out, and rewrite it with the shebang we want. Eventually, pex itself will handle this and we won't need to do that anymore.
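
The repacking idea can be sketched like this. It's a simplified illustration rather than snap.py's implementation: replace_shebang() is a hypothetical helper that swaps the first line and leaves the zip data untouched, relying on zip readers finding the index from the end of the file.

```python
import io
import zipfile


def replace_shebang(data, interpreter="/usr/bin/python3"):
    """Return `data` with its leading shebang line replaced.

    If there is no shebang, one is simply prepended.
    """
    new_line = b"#!" + interpreter.encode("utf-8") + b"\n"
    if data.startswith(b"#!"):
        # Drop the existing first line, keeping everything after it.
        _, _, data = data.partition(b"\n")
    return new_line + data


# Build a stand-in for a pex file: a zip with an env-style shebang in front.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("__main__.py", "print('hi')\n")
original = b"#!/usr/bin/env python3\n" + buffer.getvalue()

fixed = replace_shebang(original)

# The rewritten file is still a readable zip archive.
with zipfile.ZipFile(io.BytesIO(fixed)) as zf:
    names = zf.namelist()
```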

Debugging

While I was working out the code and techniques for this blog post, I ran into an interesting problem. The world script would crash with some odd tracebacks. I don't have the details anymore and they'd be superfluous, but suffice to say that the tracebacks really didn't help in figuring out the problem. It would work in a local virtual environment build of world using either the (pip installed) PyPI package or run from the upstream git repository, but once the snap was installed in my kvm instance, it would traceback. I didn't know if this was a bug in world, in the snap I built, or in the Ubuntu Core environment. How could I figure that out?

Of course, the go-to tool for debugging any Python problem is pdb. I'll just assume you already know this. If not, stop everything and go learn how to use the debugger.

Okay, but how was I going to get a pdb breakpoint into my snap? This is where Python 3.5 comes in!

PEP 441, which has already been accepted and implemented in what will be Python 3.5, aims to improve support for zip applications. Apropos this blog post, the new zipapp module can be used to zip up a directory into a single executable file, with an argument to specify the shebang line, and a few other options. It's related to what pex does, but without all the PyPI interactions and dependency chasing. Here's how we can use it to debug a pex file.

Let's ignore snappy for the moment and just create a pex of the world application:

$ pex -r world -o world.pex -e worldlib.__main__:main

Now let's say we want to set a pdb breakpoint in the main() function so that we can debug the program, even when it's a single executable file. We start by unzipping the pex:

$ mkdir world
$ cd world
$ unzip ../world.pex

If you poke around, you'll notice a __main__.py file in the current directory. This is pex's own main entry point. There are also two hidden directories, .bootstrap and .deps. The former is more pex scaffolding, but inside the latter you'll see the unpacked wheel directories for world and its single dependency.

Drilling down a little further, you'll see that inside the world wheel is the full source code for world itself. Set a breakpoint by visiting .deps/world-3.1.1-py2.py3-none-any.whl/worldlib/__main__.py in your editor. Find the main() function and put this right after the def line:

import pdb; pdb.set_trace()

Save your changes and exit your editor.

At this point, you'll want to have Python 3.5 installed or available. Let's assume that by the time you read this, Python 3.5 has been released and is the default Python 3 on your system. If not, you can always download a pre-release of the source code, or just build Python 3.5 from its Mercurial repository. I'll wait while you do this...

...and we're back! Okay, now armed with Python 3.5, and still inside the world subdirectory you created above, just do this:

$ python3.5 -m zipapp . -p /usr/bin/python3 -o ../world.dbg
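
The zipapp module also has a Python API that does the same thing as that command line. Here's a small self-contained sketch; the directory and file names are placeholders created in a temporary directory, not the world tree from above.

```python
import os
import tempfile
import zipapp

# Create a throwaway source directory containing only a __main__.py.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "app")
os.mkdir(source)
with open(os.path.join(source, "__main__.py"), "w") as fp:
    fp.write("print('hello')\n")

# Equivalent of: python3.5 -m zipapp app -p /usr/bin/python3 -o app.dbg
target = os.path.join(workdir, "app.dbg")
zipapp.create_archive(source, target, interpreter="/usr/bin/python3")

# Read the shebang back out of the finished archive.
interpreter = zipapp.get_interpreter(target)
```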

Now, before you can run ../world.dbg and watch the breakpoint do its thing, you need to delete pex's own local cache, otherwise pex will execute the world dependency out of its cache, which won't have the breakpoint set. This is a wart that might be worth reporting and fixing in pex itself. For now:

$ rm -rf ~/.pex
$ ../world.dbg

And now you should be dropped into pdb almost immediately.

If you wanted to build this debugging pex into a snap, just use the snappy build command directly. You'll need to add the minimal metadata yourself (since currently snap.py doesn't preserve it). See the Snappy developer documentation for more details.

Summary and Caveats

There's a lot of interesting technology here: pex for building single file executables of Python applications, and Snappy Ubuntu Core for atomic, transactional system updates and lightweight application deployment to the cloud and things. These allow you to get started doing some basic deployments of Python applications. No doubt there are lots of loose ends to clean up, and caveats to be aware of. Here are some known ones:

All of the above only works with Python 3. I think that's a feature, but you might disagree. ;) This works on Ubuntu Core for free because Python 3 is an essential piece of the base image. Working out how to deploy Python 2 as a Snappy framework would be an interesting exercise.
When we build a snap from a git repository for an application that isn't on PyPI, I don't currently have a way to also grab some dependencies from PyPI. The stupid example shown here doesn't have any additional dependencies so it wasn't a problem. Fixing this should be a fairly simple matter of engineering on the snap.py wrapper (pull requests welcome!).
We don't really have a great story for cross-compilation of extension modules. Solving this is probably a fairly complex initiative involving the distros, setuptools and other packaging tools, and upstream Python. For now, your best bet might be to actually build the snap on the actual target hardware.
Importing extension modules requires a file system cache because of limitations in the dlopen() API. There have been rumors of extensions to glibc which would provide a dlopen()-from-memory type of API which could solve this, or upstream Python's zip support may want to grow native support for caching.

Even with these caveats, it's pretty easy to turn a Python application into a Snappy Ubuntu Core application, publish it to the world, and profit! So what are you waiting for? Snap to it!