The latest update to PyPy (5.6), the just-in-time compiling runtime for Python, supplies the usual roster of bug fixes and incremental improvements. Its biggest changes, though, involve a long-standing issue with PyPy’s C extensions and the major Python packages that use them.
Python has never been the fastest language, but rewriting performance-intensive functions in C can bring Python apps within striking distance of pure C counterparts, often without the headache of writing the app entirely in C.
PyPy speeds up Python applications by compiling them just in time, but a drawback of this approach is that C extensions, which are written against the C API of the standard Python interpreter (CPython), don’t work very well with it. What’s more, a lot of valuable Python packages use C extensions, and they aren’t going anywhere.
The workaround is CPyExt, a compatibility layer for Python’s C API, which has been updated in PyPy 5.6. CPyExt runs the widely used math-and-stats package NumPy, a good example of a major project that needs C extensions, although it still falls short of running Pandas (another math-oriented package in wide use).
Now the bad news: CPyExt is an emulation layer, so any call that passes through it carries extra overhead. PyPy offers developers two alternatives: either write everything in pure Python and let PyPy speed it up, or use CFFI, a C interface that integrates directly with PyPy.
The first option is fine for programs that aren’t heavily dependent on C for functionality or speed. Some Python apps can use C modules for acceleration (compiled through tools like Cython) but fall back to pure Python if those modules can’t be put to work. For projects like NumPy, which depend on C for their core functionality, the pure-Python route will likely be a no-go.
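The fallback pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not code from any particular project: the module name `_fastsum` is hypothetical and stands in for whatever compiled accelerator a package might ship.

```python
# Optional C accelerator with a pure-Python fallback.
# `_fastsum` is a hypothetical compiled module (for example, one built
# with Cython); it is an assumption for illustration, not a real package.
try:
    from _fastsum import dot  # compiled version, if it was built and imports cleanly
    USING_C = True
except ImportError:
    USING_C = False

    def dot(xs, ys):
        """Pure-Python dot product; a JIT such as PyPy's can optimize this loop."""
        if len(xs) != len(ys):
            raise ValueError("sequences must have the same length")
        total = 0.0
        for x, y in zip(xs, ys):
            total += x * y
        return total


if __name__ == "__main__":
    print("using C accelerator:", USING_C)
    print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

Callers import `dot` without caring which implementation they got, so the same package can run where the C module builds and where it does not.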
The latter, using CFFI, has issues too. CFFI uses a more “Pythonic” metaphor for working with C than Python’s native ctypes interface, but the two aren’t interchangeable. Existing Python packages that interface with C in the standard manner, through Python’s C API, are unlikely to be rewritten to use CFFI just for PyPy compatibility.
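To make the difference between the two interfaces concrete, here is a rough sketch that calls the C library’s `strlen` both ways. It assumes a POSIX system (Linux or macOS), where passing `None` to the loader exposes the symbols of the running process, and it treats CFFI as optional because it is a third-party package on standard Python. The function names `strlen_ctypes` and `strlen_cffi` are made up for this example.

```python
import ctypes


def strlen_ctypes(data: bytes) -> int:
    """Call C's strlen via ctypes: the signature is declared with Python objects."""
    libc = ctypes.CDLL(None)  # POSIX only: symbols already loaded into this process
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    return libc.strlen(data)


def strlen_cffi(data: bytes):
    """Call C's strlen via CFFI (ABI mode): the signature is declared as C source.

    Returns None if the cffi package is not installed.
    """
    try:
        from cffi import FFI
    except ImportError:
        return None
    ffi = FFI()
    ffi.cdef("size_t strlen(const char *s);")
    lib = ffi.dlopen(None)  # POSIX only: load the standard C library
    return lib.strlen(data)


if __name__ == "__main__":
    print(strlen_ctypes(b"hello"))
    print(strlen_cffi(b"hello"))
```

Both functions do the same job, but the declarations are written differently, which is why code built around one interface has to be rewritten by hand to use the other.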
Because NumPy is so widely adopted and presents one of the biggest stumbling blocks for PyPy users, PyPy’s developers forked NumPy and created a PyPy-compatible version called NumPyPy. Consider this the exception, not the rule: you’d need a lot of manpower to create PyPy-compatible forks of even a fraction of the packages that depend on C extensions.
Which version will win out for PyPy users: NumPy or NumPyPy? According to the developers, both have their merits: “There are places where gcc can beat the JIT, and places where the tight integration between NumPyPy and PyPy is more performant,” states developer Richard Plangger on the PyPy blog. Rather than favor one over the other, PyPy is planning to “integrate both [NumPy and NumPyPy], hijacking the C-extension method calls to use NumPyPy where we know NumPyPy can be faster.”