If you want to make Python run faster on the same hardware, you have two basic options, each with a drawback:
- You can create a replacement for the default runtime used by the language (the CPython implementation) -- a major undertaking, but the result would be a drop-in replacement for CPython.
- You can rewrite existing Python code to take advantage of certain speed optimizations, which means more work for the programmer but doesn't require changes in the runtime.
Here are five possible ways the bar could be raised -- and in some cases already is -- on Python performance.
Among the candidates for a drop-in replacement for CPython, PyPy is easily the most visible (Quora, for instance, uses it in production). It speeds up execution through just-in-time (JIT) compilation, and it stands the best chance of becoming the default, as it's highly compatible with existing Python code.
Those using Python 3.x have to work with a separate build of the project, PyPy3. Unfortunately for lovers of bleeding-edge language features, that version supports up to Python 3.2.5 only, although support for 3.3 is in the works.
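Because a drop-in runtime needs no source changes, one way to judge it is to run the same script under both interpreters and compare timings. The sketch below is only an illustration: the function name and loop size are made up for the example, and the syntax is kept conservative so that it should run on older Python 3 releases as well. Pure-Python loops like this one are the kind of code a JIT compiler tends to help most.

```python
import platform
import time


def sum_of_squares(n):
    """Pure-Python hot loop; no C extensions involved."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    start = time.time()
    result = sum_of_squares(1000000)
    elapsed = time.time() - start
    # python_implementation() reports which runtime is executing the script,
    # for example "CPython" or "PyPy".
    print("%s: result=%d in %.3f s"
          % (platform.python_implementation(), result, elapsed))
```

Running the file once with each interpreter gives a rough comparison; actual speedups vary widely with the workload.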
Pyston, sponsored by Dropbox, also speeds up Python through JIT compilation, in its case by building on the LLVM compiler infrastructure. Compared to PyPy, Pyston is in the very early stages -- it's at revision 0.2 so far and supports only a limited subset of the language's features. Much of the work has been divided between supporting core features of the language and bringing up performance of key benchmarks to an acceptable level. It'll be a while before Pyston can be considered remotely production-ready.
Rather than replace the Python runtime, some teams are trying to sidestep it by transpiling Python code to languages that run natively at high speed. Case in point: Nuitka, which converts Python to C++ code -- although it still relies on executables from the existing Python runtimes to work its magic. That limits its portability, but there's no denying the value of the velocity gained from this conversion. Long-term plans for Nuitka include letting Nuitka-compiled Python interface directly with C code, for even greater speed.
Cython (C extensions for Python) is a superset of Python, a version of the language that compiles to C and interfaces with C/C++ code. It's one way to write C extensions for Python (where code that needs to run fast can be implemented), but can also be used on its own, separate from conventional Python code. The downside is that you're not really writing Python, so porting existing code wouldn't be totally automatic.
That said, Cython provides several advantages for the sake of speed not available in vanilla Python, among them variable typing à la C itself. A number of scientific packages for Python, such as scikit-learn, draw on Cython features like this to keep operations lean and fast.
Numba combines two of the previous approaches. From Cython, it takes the concept of speeding up the parts of the language that most need it (typically CPU-bound math); like Pyston, it does so via LLVM. Functions to be compiled with Numba are marked with a decorator, and Numba works hand-in-hand with NumPy to speed up the functions it compiles. By default, Numba compiles a decorated function just in time, when it is first called, so the first call pays the compilation cost and later calls run the compiled code.
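As a minimal sketch of the decorator approach, the example below marks one numeric loop for compilation with Numba's `njit` decorator. The function name and data are invented for illustration, and the `try`/`except` fallback is an assumption added so the snippet still runs (uncompiled) where Numba isn't installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (an assumption for this sketch): without Numba, run the
    # function as ordinary Python so the example remains usable.
    def njit(func):
        return func


@njit
def dot_self(values):
    """Sum of squared elements, written as an explicit loop."""
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total


data = np.arange(5, dtype=np.float64)  # [0.0, 1.0, 2.0, 3.0, 4.0]
print(dot_self(data))
```

The explicit loop is deliberate: it is slow in plain CPython but is the sort of code Numba is designed to compile. Only the decorated function is affected; the rest of the program runs as normal Python.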
Python creator Guido van Rossum is adamant that many of Python's performance issues can be traced to improper use of the language. CPU-heavy processing, for instance, can be hastened through a few methods touched on here -- using NumPy (for math), using the multiprocessing module, or making calls to external C code and thus avoiding the Global Interpreter Lock (GIL), which keeps threads in a single process from running Python code in parallel and is often blamed for Python's slowness. But since there's no viable replacement yet for the GIL in Python, it falls to others to come up with short-term solutions -- and maybe long-term ones, too.
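To illustrate the multiprocessing route, here is a rough sketch that spreads a CPU-heavy function across worker processes, each with its own interpreter and its own GIL. The function names, the prime-counting workload, and the default of four workers are all assumptions chosen for the example.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


def count_primes_parallel(limits, workers=4):
    """Run count_primes on each limit in a separate worker process."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)
```

In a script, call `count_primes_parallel([20000, 30000, 40000])` from under an `if __name__ == "__main__":` guard, since on some platforms worker processes re-import the main module. Processes cost more to start and to exchange data with than threads do, so this approach tends to pay off only when each task does a substantial amount of work.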