For a project at work we create sparse files, which on Linux and other
POSIX systems represent empty blocks more efficiently. Let's say you have a
file that's a gibibyte in size, but which contains mostly zeros,
i.e. the NUL byte. It would be inefficient to write out all those zeros, so
file systems that support sparse files actually just write some metadata to
represent all those zeros. The real, non-zero data is then written wherever
it may occur. These sections of zero bytes are called "holes".
Sparse files are used in many situations, such as disk images, database files,
etc. so having an efficient representation is pretty important. When the file
is read, the operating system transparently turns those holes into the correct
number of zero bytes, so software reading sparse files generally doesn't have to
do anything special. It just reads data as normal, and the OS gives it
zeros for the holes.
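To see that transparency from Python, here is a minimal sketch. The helper names `write_past_hole` and `read_range` and the file name are made up for illustration. Whether the skipped region is really stored as a hole depends on the file system, but either way it reads back as zeros:

```python
import os
import tempfile


def write_past_hole(path, offset, payload):
    """Write payload at offset, leaving everything before it unwritten."""
    with open(path, "wb") as fp:
        fp.seek(offset)    # skip ahead without writing anything
        fp.write(payload)  # only these bytes are real data


def read_range(path, start, length):
    """Read length bytes starting at start, like any ordinary file."""
    with open(path, "rb") as fp:
        fp.seek(start)
        return fp.read(length)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "holey")
    write_past_hole(path, 1024 * 1024, b"real data")
    head = read_range(path, 0, 16)           # inside the unwritten region
    tail = read_range(path, 1024 * 1024, 9)  # the bytes actually written
    print(head == b"\0" * 16)
    print(tail)
```

The reader never asks whether a region is a hole; it just gets NUL bytes back.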
You can create a sparse file right from the shell:
$ truncate -s 1000000 /tmp/sparse
Now /tmp/sparse is a file containing one million zeros. It actually
consumes almost no space on disk (just some metadata), but for most intents
and purposes, the file is one million bytes in size:
$ ls -l /tmp/sparse
-rw-rw-r-- 1 barry barry 1000000 Jan 14 11:36 /tmp/sparse
$ wc -c /tmp/sparse
1000000 /tmp/sparse
The commands ls and wc don't really know or care that the file is
sparse; they just keep working as if it weren't.
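If you do want to see the difference, one way is to compare the apparent size with the space actually allocated. This is a rough sketch, assuming a Unix-like system where `os.stat()` reports `st_blocks` in 512-byte units; the `sizes` helper is hypothetical, and the allocated figure will vary by file system:

```python
import os
import tempfile


def sizes(path):
    """Return (apparent size, allocated bytes or None) for path."""
    st = os.stat(path)
    # st_blocks is not available on every platform, so fall back to None.
    blocks = getattr(st, "st_blocks", None)
    allocated = None if blocks is None else blocks * 512
    return st.st_size, allocated


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sparse")
    with open(path, "wb") as fp:
        fp.truncate(1000000)  # same idea as `truncate -s 1000000`
    apparent, allocated = sizes(path)
    print(apparent, allocated)
```

On a file system that supports sparse files, the second number should be far smaller than the first.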
But, sometimes you do need to know that a file contains holes. A common
case is if you want to copy the file to some other location, say on a
different file system. A naive use of cp will fill in those holes, so a
command like this …
Continue reading »
Way back in 2011 I wrote an article describing how you can build Debian
packages with local dependencies for testing purposes. An example would be a
new version of a package that has new dependencies. Or perhaps the new
dependency isn't available in Debian yet. You'd like to test both packages
together locally before uploading. Using sbuild and autopkgtest you can
have a high degree of confidence about the quality of your packages before you
upload.
Here I'll describe some of the improvements in those tools, and give you
simplified instructions on how to build and test packages with local
dependencies.
Several things have changed. Probably the biggest thing that simplifies the
procedure is that GPG keys for your local repository are no longer needed.
Another thing that's improved is the package testing support. It used to be
that packages could only be tested during build time, but with the addition of
the autopkgtest tool, we can also test the built packages under various
scenarios. This is important because it more closely mimics what your
package's users will see. One thing that's cool about this for Python
packages is that autopkgtest runs an import test of your package by
default, so even if you don't add any explicit tests, you still get
something. Of course, if you do want to add your own tests, you'll need to
recreate those default tests, or check out the autodep8 package for some
I've moved the repository of scripts over to git.
The way you specify the location of the extra repositories holding your local
debs has changed. Now, instead of providing a directory on the local file
system, we're going to fire up a simple Python-based HTTP server and use that
as a new repository URL. This won't be …
Continue reading »
In early September 2016 I traveled to New Zealand to give a keynote
speech for Kiwi PyCon 2016. I was very honored to be invited, and glad that
after 4 years of valiant resistance, I finally gave in to my thankfully
diligent colleague Thomi Richards. My wife Jane and I made the 25+ hour
journey, and had a wonderful little vacation after the conference.
Here in part I, I'll talk a bit about the conference and my keynote. Later in
part II, I'll talk about the vacation part of the trip. I might sprinkle
little impressions of New Zealand throughout both articles.
In some ways, it's a good thing that New Zealand is so far from the USA, with
an additional transcontinental trip away from the east coast. During our
summer, it's their winter and they are 16 hours ahead of UTC-4. Meaning that
at noon New York time, it's 4am the next day in New Zealand. Or, to put it
another way, 8am New Zealand time is 4pm New York time the previous day. I
never got tired of the joke with family back home that we were living in the
future over there.
So it's definitely a commitment to get there, which makes it all the more
breathtaking I think. In a way, New Zealand feels exotic because the first
thing that most Americans probably think of when they hear "New Zealand" is
the "Lord of the Rings" movies. But New Zealand is a high-tech country,
with a low population (under 5 million people in total) concentrated in four
major cities (Auckland, Christchurch, Dunedin, and Wellington), lots of
sheep and some of the nicest people you'll meet in the English speaking world.
They have a real sense of stewardship for their land and environment, as
Continue reading »
I'm back! This time, I'm using the very awesome Pelican publishing platform,
as Blogger just got to be too much of a pain to use. Let's hope that the
simplicity of using reStructuredText, static pages, and outsourcing
discussions will make it so easy to blog that I actually keep doing it. I've
slapped together a basic theme, but no doubt I'll be tweaking it as time goes
on. For now, the theme isn't in a public repository. A quick shout out to
Font Awesome and font-linux for their very cool font icons.
I do intend to expand the themes I blog about. I'll still focus on Python
and GNU Mailman as well as other technology, but now I'll include tai chi,
music, and anything else that I feel the overwhelming urge to share. I hope I
can continue to present my thoughts in a respectful, positive, all-inclusive
I've migrated most of the pages from the old Blogger platform, and done some
minor updating as appropriate. One of the main reasons I moved off
Blogger was the pain of moderating comments. There was just too
much spam. I've switched to the third party discussion platform Disqus and
I've tried to import all the non-spammy original comments. I'm not sure I've
done that correctly, but we'll see!
You can contact me via the Social links in the side bar, and via the various
public mailing lists and IRC channels I hang out in. I hope to hear from you!
Snappy Ubuntu Core is a new edition of the Ubuntu you know and love, with
some interesting new features, including atomic, transactional updates, and a
much more lightweight application deployment story than traditional
Debian/Ubuntu packaging. Much of this work grew out of our development of a
mobile/touch based version of Ubuntu for phones and tablets, but now Ubuntu
Core is available for clouds and devices.
I find the transactional nature of upgrades to be very interesting. While you
still get a perfectly normal Ubuntu system, your root file system is
read-only, so traditional apt-get based upgrades don't work. Instead, your
system version is image based; today you are running image 231 and tomorrow
a new image is released to get you to 232. When you upgrade to the new image,
you get all the system changes. We support both full and delta upgrades
(the latter of which reduces bandwidth), and even phased updates so that we can
roll out new upgrades and quickly pull them from the server side if we notice
a problem. Snappy devices even support rolling back upgrades on a single
device, by using a dual-partition root file system. Phones generally don't
support this due to lack of available space on the device.
Of course, the other really interesting thing about Snappy is the
lightweight, flexible approach to deploying applications. I still remember my
early days learning how to package software for Debian and Ubuntu, and now
that I'm both an Ubuntu Core Developer and Debian Developer, I understand
pretty well how to properly package things. There's still plenty of black art
involved, even for relatively easy upstream packages such as
distutils/setuptools-based Python packages available on the Cheeseshop (er,
PyPI). The Snappy approach on Ubuntu Core is much more lightweight and easy,
Continue reading »
I'm writing a bunch of new code these days for Ubuntu Touch's Image Based
Upgrade system. Think of it essentially as Ubuntu Touch's version of
upgrading the phone/tablet (affectionately called phablet) operating system
in a bulk way rather than with piecemeal apt-gets, the way you do it on a
traditional Ubuntu desktop or server. One of the key differences is that a
phone has to detour through a reboot in order to apply an upgrade since its
Ubuntu root file system is mounted read-only during the user session.
Anyway, those details aren't the focus of this article. Instead, just realize
that because it's a pile of new code, and because we want to rid ourselves of
Python 2, at least on the phablet image if not everywhere else in Ubuntu, I
am prototyping all this in Python 3, and specifically 3.3. This means that
I can use all the latest and greatest cool stuff in the most recent stable
Python release. And man, is there a lot of cool stuff!
One module in particular that I'm especially fond of is contextlib. Context
managers are objects implementing the protocol behind the with
statement, and they are typically used to guarantee that some resource is
cleaned up properly, even in the event of error conditions. When you see code
with open(somefile) as fp:
    data = fp.read()
you are invoking a context manager. Python was clever enough to make file
objects support the context manager protocol so that you never have to
explicitly close the file; that happens automatically when the with
statement completes, regardless of whether the code inside the with
statement succeeds or raises an exception.
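A quick way to convince yourself of this is to check the file object's `closed` attribute after the block exits, both normally and via an exception. This is just a sketch; the file name and the simulated `RuntimeError` are made up for illustration:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    somefile = os.path.join(tmpdir, "example.txt")
    with open(somefile, "w") as fp:
        fp.write("hello")

    # Normal completion: the file is closed on leaving the block.
    with open(somefile) as fp:
        data = fp.read()
    closed_after_success = fp.closed

    # Exception inside the block: the file is closed before the
    # exception reaches our handler.
    try:
        with open(somefile) as fp:
            raise RuntimeError("simulated failure")
    except RuntimeError:
        closed_after_error = fp.closed

print(data, closed_after_success, closed_after_error)
```

Note that `tempfile.TemporaryDirectory` is itself used as a context manager here, cleaning up the directory when the outer block ends.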
It's also very easy to define your own context managers to properly handle
other kinds of resources. I won't go …
Continue reading »
There's a lot of Python nostalgia going around today, from Brett Cannon's 10
year anniversary of becoming a core developer, to Guido reminding us that
he came to the USA 18 years ago. Despite my stolen time machine keys, I
don't want to dwell in the past, except to say that I echo much of what Brett
says. I had no idea how life changing it would be -- on both a personal and
professional level -- when Roger Masse and I met Guido at NIST at the
first Python workshop back in November 1994. The lyric goes: what a long
strange trip it's been, and that's for sure. There were about 20 people
at that first workshop, and 2500 at PyCon 2013.
And Python continues to hold little surprises. Just today, I solved a bug in
an Ubuntu package that's been perplexing us for weeks. I'd looked at the code
dozens of times and saw nothing wrong. I even knew about the underlying
corner of the language, but didn't put the two together until just now. Here's a
boiled-down example; see if you can spot the bug!
def bar(i):
    if i == 1:
        raise KeyError(i)
    if i == 2:
        raise ValueError(i)

def bad(i):
    e = None
    try:
        bar(i)
    except KeyError as e:
        pass
    except ValueError as e:
        pass
    return e
Here's a hint: this works under Python 2, but gives you an
UnboundLocalError on the e variable under Python 3.
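The hint can be checked with a small self-contained sketch. The helpers `lookup_broken` and `lookup_fixed` are invented for illustration and are not from the Ubuntu package in question; the second shows one possible workaround, which is to keep the exception under a different name:

```python
def lookup_broken(mapping, key):
    e = None
    try:
        mapping[key]
    except KeyError as e:
        pass
    # If the except clause ran, Python 3 has unbound `e` by now.
    return e


def lookup_fixed(mapping, key):
    error = None
    try:
        mapping[key]
    except KeyError as e:
        error = e  # keep a reference under a name that survives
    return error


# No exception: the except clause never runs, so `e` is still None.
no_error = lookup_broken({"present": 1}, "present")

# Exception: the except clause runs, and the later use of `e` fails.
try:
    lookup_broken({}, "missing")
except UnboundLocalError as exc:
    outcome = type(exc).__name__

kept = lookup_fixed({}, "missing")
print(no_error, outcome, type(kept).__name__)
```

The broken version only fails on the path where an exception is actually caught, which is part of what makes this kind of bug easy to miss.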
The reason is that in Python 3, the targets of except clauses are del'd from
the current namespace after the try...except clause executes. This is
to prevent circular references that occur when the exception is bound to the
target. What is surprising and non-obvious is that the name is deleted …
Continue reading »
For UDS-R for Raring (i.e. Ubuntu 13.04) in Copenhagen, I sponsored
three blueprints. These blueprints represent most of the work I will be doing
for the next 6 months, as we're well on our way to the next LTS, Ubuntu 14.04.
I'll provide some updates to the other blueprints later, but for now, I want
to talk about OAuth and Python 3. OAuth is a protocol which allows you to
programmatically interact with certain website APIs, in an authenticated
manner, without having to provide your website password. Essentially, it
allows you to generate an authorization token which you can use instead, and
it allows you to manage and share these tokens with applications, so that you
can revoke them if you want, or decide how and which applications to trust to
act on your behalf.
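To make the "token instead of password" idea concrete, here is a rough sketch of an OAuth v1 `Authorization` header using the PLAINTEXT signature method from RFC 5849. The function name and all the credential values are hypothetical, PLAINTEXT is only reasonable over TLS, and real code should use a maintained OAuth library rather than this:

```python
from urllib.parse import quote


def _enc(value):
    # Percent-encode everything except RFC 3986 unreserved characters.
    return quote(str(value), safe="~")


def plaintext_auth_header(consumer_key, consumer_secret,
                          token, token_secret, timestamp, nonce):
    """Build an OAuth 1.0 Authorization header value (PLAINTEXT method)."""
    # PLAINTEXT signature: encoded consumer secret, "&", encoded token secret.
    signature = _enc(consumer_secret) + "&" + _enc(token_secret)
    params = [
        ("oauth_consumer_key", consumer_key),
        ("oauth_token", token),
        ("oauth_signature_method", "PLAINTEXT"),
        ("oauth_signature", signature),
        ("oauth_timestamp", timestamp),
        ("oauth_nonce", nonce),
        ("oauth_version", "1.0"),
    ]
    # Header values are percent-encoded again, so "&" becomes "%26".
    return "OAuth " + ", ".join(
        '{}="{}"'.format(_enc(k), _enc(v)) for k, v in params)


header = plaintext_auth_header(
    "my-app", "consumer-secret", "access-token", "token-secret",
    1356000000, "abc123")
print(header)
```

Nothing in the header is the user's website password. The application identifies itself with its consumer key and proves access with a token the user can revoke.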
A good example of a site that uses OAuth is Launchpad, but many other sites
also support OAuth, such as Twitter and Facebook.
There are actually two versions of OAuth out there. OAuth version 1 is
definitely the more prevalent, since it has been around for years, is
relatively simple (at least on the client side), and is enshrined in RFC 5849.
There are tons of libraries available that support OAuth v1, in a multitude
of languages, with Python being no exception.
OAuth v2 is much less common, since it is currently only a draft
specification, and has had its share of design-by-committee controversy.
Still, some sites such as Facebook do require OAuth v2.
One of the very earliest Python libraries to support OAuth v1, on both the
client and server side, was python-oauth (I'll use the Debian package names
in this post), and on the Ubuntu desktop, you'll find lots of scripts and
libraries that use python-oauth. There are major problems with …
Continue reading »
Recently, as part of our push to ship only Python 3 on the Ubuntu 12.10
desktop, I've helped several projects update their internationalization
(i18n) support. I've seen lots of instances of suboptimal Python 2 i18n code,
which leads to liberal sprinkling of cargo culted .decode() and
.encode() calls simply to avoid the dreaded UnicodeErrors. These get
worse when the application or library is ported to Python 3 because then even
the workarounds aren't enough to prevent nasty failures in non-ASCII
environments (i.e. the non-English speaking world majority :).
Let's be honest though: the problem is not that these developers are crappy
coders! In fact, far from it, the folks I've talked with are really really
smart, experienced Pythonistas. The fundamental problem is Python 2's 8-bit
string type which doubles as a bytes type, and the terrible API of the
built-in Python 2 gettext module, which does its utmost to sabotage your
Python 2 i18n programs. I take considerable blame for the latter, since I
wrote the original version of that module. At the time, I really didn't
understand unicodes (this is probably also evident in the mess I made of the
email package). Oh, to really have access to Guido's time machine.
The good news is that we now know how to do i18n right, especially in a
bilingual Python 2/3 world, and the Python 3 gettext module fixes the most
egregious problems in the Python 2 version. Hopefully this article does some
measure of making up for my past sins.
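As a small taste of the Python 3 side, here is a minimal sketch. The domain `myapp` and the locale directory are hypothetical; with `fallback=True` and no catalog on disk, `gettext.translation()` hands back a `NullTranslations` object that returns messages unchanged:

```python
import gettext

# "myapp" and the locale directory are placeholders. No catalog exists
# there, so fallback=True gives us a NullTranslations instance instead
# of raising an error.
catalog = gettext.translation(
    "myapp", localedir="/nonexistent/locale",
    languages=["fr"], fallback=True)

_ = catalog.gettext

message = _("Hello, world")
print(type(message).__name__, message)
```

In Python 3 the `gettext()` method returns a `str`, so there is nothing to decode. In Python 2 the same method returned an 8-bit string, and you had to remember to call `ugettext()` to get unicode.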
Stop right here and go watch Ned Batchelder's talk from PyCon 2012 entitled
Pragmatic Unicode, or How Do I Stop the Pain? It's the single best
description of the background and effective use of Unicode in Python you'll
ever see. Ned does a brilliant job of …
Continue reading »
So, now all the world knows that my suggested code name for Ubuntu 12.10,
Qwazy Quahog, was not chosen by Mark. Oh well, maybe I'll have more luck
with Racy Roadrunner.
In any case, Ubuntu 12.04 LTS is to be released any day now so it's time
for my semi-annual report on Python plans for Ubuntu. I seem to write about
this every cycle, so 12.10 is no exception. We've made some fantastic
progress, but now it's time to get serious.
For Ubuntu 12.10, we've made it a release goal to have Python 3 only on the
desktop CD images. The usual caveats apply: Python 2.7 isn't going away; it
will still probably always be available in the main archive. This release
goal also doesn't affect other installation CD images, such as server, or
other Ubuntu flavors. The relatively modest goal then only affects
packages for the standard desktop CD images, i.e. the alternative installation
CD and the live CD.
Update 2012-04-25: To be crystal clear, if you depend on Python 2.7, the
only thing that changes for you is that after a fresh install from the
desktop CD on a new machine, you'll have to explicitly apt-get install
python2.7. After that, everything else will be the same.
This is ostensibly an effort to port a significant chunk of Ubuntu to Python
3, but it really is a much wider, Python-community driven effort. Ubuntu has
its priorities, but I personally want to see a world where Python 3 rules the
day, and we can finally start scoffing at Python 2 :).
Still, that leaves us with about 145 binary packages (and many fewer source
packages) to port. There are a few categories of packages to consider:
- Already ported and available.
- This is …
Continue reading »