How macOS Broke Python
Written by Barry Warsaw in technology on Tue 27 November 2018. Tags: python, macos
Here's a common Python idiom that is broken on macOS 10.13 and beyond:
```python
import os
import requests

pid = os.fork()
if pid == 0:
    # I am the child process.
    requests.get('https://www.python.org')
    os._exit(0)
else:
    # I am the parent process.
    do_something()
```
Now, it's important to stress that this particular code sample may execute just fine. It's an excerpt from some code at work that illustrates a deeper problem that you may or may not encounter in real-world applications. We've seen it crash reproducibly, resulting in core dumps of the Python process, with dump files in /cores that can potentially fill your disk.
In this article, I hope to explain what I know about this problem, with links to information elsewhere on the 'net. Some of those resources include workarounds, but in my experiments, those are not completely reliable in eliminating the core dumps. I'll explain why I think that is.
It's important to stress that at the time of this article's publishing, I do not have a complete solution, and am not even sure one exists. I'll note further that this is not specifically a Python problem and in fact has been described within the Ruby community. It is endemic to common idioms around the use of fork() without exec*() in scripting languages, and is caused by changes in the Objective-C runtime in macOS 10.13 High Sierra and beyond. It can also be observed in certain "prefork" servers.
What is forking?
I won't go into much detail on this, since any POSIX programmer should be well acquainted with the fork(2) system call, and besides, there are tons of other good resources on the 'net that explain fork(). For our purposes here, it's enough to know that fork() is a relatively inexpensive way to make an exact copy of the current process, creating a child process in the, um, process! The parent process continues to run unchanged, and is isolated from the child process in every important way.
fork(2) returns 0 to the child, and the process id (pid) of the child to the parent, so in the code example above, that's why we check the return code of os.fork() to know whether we're in the child or parent.
There are lots of icky semantics when fork() is used with threads, so it's generally a bad idea to use fork in multithreaded applications. However, even in single threaded applications, fork() can cause problems on macOS.
What about exec?
It is very common to call one of the exec*() family of functions right after calling fork(). The exec*() functions replace the current process with a new image, by executing the code in a file given by the first parameter to exec*(). This is one of the most common pairs of calls to run new programs on POSIX -- first you fork() to get a child process, then the child exec's the new file, and you now have two independent processes running. This common idiom is always safe since the child process after the fork() is replaced by the new program. macOS fully supports the exec-after-fork model.
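A minimal sketch of that safe fork-then-exec pattern in Python follows; the inline `python -c` stub stands in for the separate program file you would exec in real code:

```python
import os
import sys

pid = os.fork()
if pid == 0:
    # Child: immediately replace this process with a fresh interpreter.
    # Nothing else runs between fork() and exec(), so this is the idiom
    # that macOS fully supports.
    os.execv(sys.executable, [sys.executable, '-c', 'import sys; sys.exit(0)'])
else:
    # Parent: reap the child and confirm it exited cleanly.
    _, status = os.waitpid(pid, 0)
    child_ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```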
The problem is that the Python code sample above, the prefork model, and similar very common idioms all try to do additional work after the fork() but before -- or even instead of -- exec*(). It's convenient, especially in scripting languages, since you don't need a separate file to exec*(), and you don't need to pass data from the parent to the child. You just keep running code in the child stanza of your conditional statement, and it inherits a copy of all the state of the parent process. As mentioned, this can be dangerous in multithreaded applications, but is generally safe (enough) in single threaded applications. It is this idiom that is broken on macOS.
What's the problem?
The basic problem is that the Objective-C runtime can't be both thread safe and fork safe. In High Sierra, Apple clarified the rules for using the Objective-C runtime between fork() and exec*(): Objective-C +initialize methods may not be called within this interval, due to the implicit acquisition of locks. Code which was technically incorrect, but seemed to work before macOS 10.13, will now fail. With macOS 10.13 and beyond, the process simply core dumps.
The problem is that your program may implicitly call a +initialize method without you knowing it. In Python 3 for example, if you call into the popular requests library, it will end up calling into the _scproxy module to get the system proxies, and that will end up calling a +initialize method. So you had better not use requests between fork() and exec*()! Setting the environment variable no_proxy='*' will prevent this and avoid the crash, but it also bypasses the system proxies, which is probably not what you want.
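For completeness, the workaround looks like this; remember the trade-off that system-configured proxies will be ignored:

```python
import os

# Setting no_proxy to '*' short-circuits the proxy lookup, so requests
# never calls into _scproxy (and thus never touches the Objective-C
# runtime) in a forked child.  The cost: system proxies are bypassed.
os.environ['no_proxy'] = '*'
```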
What's the fix?
The real fix is simply to avoid using this idiom in your Python code. Lots of other projects are struggling with where and how to fix this problem. E.g. should it be fixed in the prefork servers? Should it be fixed in the language support (e.g. Ruby and Python)? Or should it be fixed in the applications that run code between fork() and exec*()? It's a tricky thing to answer, and some projects are adopting fixes while others are not.
Currently, Python does not implement any kind of fix, so it's up to the individual applications to ensure that they are safe.
Ignore the problem
Individual users can prevent core dumps from filling up their disk by setting the core dump size limit to zero. If you're using the bash shell (other shells have similar commands), type this to set the core dump size limit to 0, effectively disabling them:
$ ulimit -c 0
Without the 0 argument, you can print the current limit:
$ ulimit -c
unlimited
You can re-enable core dumps with this command:
$ ulimit -c unlimited
You can also disable core dumps system-wide by setting the following kernel parameter:
$ sudo sysctl kern.coredump=0
kern.coredump: 1 -> 0
Note that neither approach prevents the child processes from crashing; it just keeps your disk from filling up. So it's not really a solution, because the code in your child process will still not get executed.
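Even with core dumps disabled, the parent can at least detect that the child died on a signal rather than exiting normally, by inspecting the status from waitpid(). A small sketch, where the child aborts itself with SIGABRT to simulate the crash:

```python
import os
import signal

pid = os.fork()
if pid == 0:
    # Child: simulate the Objective-C runtime crash by aborting.
    os.kill(os.getpid(), signal.SIGABRT)
    os._exit(1)  # not reached

# Parent: check whether the child was killed by a signal.
_, status = os.waitpid(pid, 0)
if os.WIFSIGNALED(status):
    killed_by = os.WTERMSIG(status)
else:
    killed_by = None
```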
Band-aid the problem
You see lots of recommendations to set the following environment variable:
$ export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
This does prevent the child process from crashing, but it comes with an important caveat: You must set this outside of the forking process. It is not good enough to do something like this in your Python code:
```python
os.environ['OBJC_DISABLE_INITIALIZE_FORK_SAFETY'] = 'YES'

pid = os.fork()
if pid == 0:
    # I am the child process.
    requests.get('https://www.python.org')
    os._exit(0)
else:
    # I am the parent process.
    do_something()
```
I haven't positively confirmed this with outside references, but my experimentation has shown that this environment variable is only consulted when the parent process first starts, so if you set it in the parent before the fork(), it's ignored. Using os.putenv() doesn't help either.
Maybe this is good enough for you. At work, it's not, because we can't change the environment for every possible process. We use both shiv and pex as Python zip application formats (transitioning to all-shiv for Python >= 3.6) for command line tools (CLIs). For CLIs with multiple entry points, we do use a wrapper script that forks-and-execs the zip application, and we can set the environment variable there to prevent crashes. But for zip applications with a single entry point, we don't use the wrapper, so there's no universally good place to set this, and asking all the users to update their various shell initialization scripts isn't viable.
Other possible solutions
You'll see these other recommendations out on the 'net:
1. Use NSTask or posix_spawn() instead of fork() and exec().
2. Do nothing between fork() and exec().
3. Use only async-signal-safe operations between fork() and exec().
4. Use only ObjC classes with no +initialize overrides between fork() and exec().
5. Use pthread_atfork() to force your +initialize methods to run before fork().
6. Define environment variable OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES, or add a __DATA,__objc_fork_ok section, or build using an SDK older than macOS 10.13. Then cross your fingers.
Let's look at each of these in turn.
#1 and #2 are the real solutions because they use safer mechanisms to spawn a child process, but both have the disadvantage of requiring you to implement your child process in a separate file. This can be quite a serious drawback. For us at work, it means designing a protocol to pass the required information between the parent and the child, where with the old idiom, this data was just maintained in local variables which get copied to the child upon fork()-ing. Still, this is guaranteed to be safe, so it's what we'll implement.
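As a sketch of what such a protocol might look like, the parent can serialize its state as JSON and pipe it to the child over stdin. The inline child program here is a stand-in for the separate file real code would use:

```python
import json
import subprocess
import sys

# State that, under the fork() idiom, the child would simply inherit.
payload = {'url': 'https://www.python.org'}

# Stand-in child program; real code would put this in its own file.
child_source = (
    'import json, sys\n'
    'data = json.load(sys.stdin)\n'
    'print(data["url"])\n'
)

proc = subprocess.run(
    [sys.executable, '-c', child_source],
    input=json.dumps(payload),
    capture_output=True,
    text=True,
)
```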
#3 and #4 are not, IMHO, practically effective because as I've mentioned, it can be surprising what gets called in a scripting language like Python. Before I debugged the problem, I never expected the calls from requests into _scproxy into the Objective-C runtime.
#5 sounds promising, and I actually experimented with using the prepare handler of pthread_atfork(), but I could never actually get it to work. Another recommendation I found suggests that simply loading a framework library will invoke its +initialize method. I wrote some code to implement this. While it was fun to play with ctypes, this never actually worked for me:
```python
import ctypes
from ctypes.util import find_library

@ctypes.CFUNCTYPE(None)
def prefork_callback():
    ctypes.CDLL('/System/Library/Frameworks/Foundation.framework/Foundation')
    ctypes.CDLL('/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation')
    ctypes.CDLL('/System/Library/Frameworks/SystemConfiguration.framework/SystemConfiguration')

libc = ctypes.CDLL(find_library('c'))
code = libc.pthread_atfork(prefork_callback, None, None)
if code != 0:
    print('Could not install the pthread_atfork() prepare handler')
```
The list of frameworks to load was found by inspecting the core dumps with lldb, but try as I might, I could not find the right combination of loads to prevent the crashes. It seems as if loading a framework doesn't actually guarantee that its +initialize method gets called.
As I thought about it more, I'm not even sure a prepare handler and pthread_atfork() are necessary. If this technique worked, why not just do the loads inline, before the call to os.fork()? Suggestions are welcome!
I have not played with the other suggestions in #6, namely adding a __DATA,__objc_fork_ok section, but I'm skeptical that those will solve the problem. And using an older SDK is also not a viable option.
I think the only safe thing to do on macOS is to call exec*() after fork(), use the spawn family of functions, or better yet use the subprocess module. I am going to rewrite our code at work to use the latter, even though it's less convenient.
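Python 3.8 and later also expose posix_spawn() directly in the os module, which starts the child without ever copying the parent process, so the Objective-C fork-safety checks never come into play. A minimal sketch, again with an inline stub as the child program:

```python
import os
import sys

# Spawn a fresh interpreter without fork()-ing this process.
pid = os.posix_spawn(
    sys.executable,
    [sys.executable, '-c', 'import sys; sys.exit(7)'],
    dict(os.environ),
)

# Reap the child and recover its exit code.
_, status = os.waitpid(pid, 0)
exit_code = os.WEXITSTATUS(status)
```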
If you come up with any other reliably viable solutions, please do add to the comments below, or on the open Python tracker issue.
I have a real love/hate relationship with these types of problems. On the one hand, they are frustrating and perplexing. All you have as evidence are some core files, and you may not even notice those unless you have core dumps enabled and see your disk filling up with them. Then the journey begins!
On the other hand, they can be really fun to investigate. You have to be adept at debugging C code, skilled at searching the interwebs, and clever in your implementations. And even then -- as is the case here -- you may not come up with a satisfying solution. But at least you learn a lot, and begin to build a picture of what's going on that hangs together, even if there are still some head scratching details, and little confirmation from other sources.
But it can be a fun way to waste a week of work time, and it can provide useful fodder for a blog that often gets neglected for long stretches of time.
Thanks Apple! <wink>