Hands-on: Python 3.6’s speed boost matters

Don't expect order-of-magnitude speedups from the next version of Python, but a few key benchmarks show improvement, and every bit adds up

With the first beta release of Python 3.6 out in the wild, most of the discussion has been about additions like a new string-literal format or the "secrets" module for cryptographically secure tokens.

But version 3.6 also boasts a fair amount of under-the-hood work to quicken the default CPython interpreter. Maybe not fast like C or PyPy, but certainly faster than previous versions of CPython -- and more's always better. But how much more?

To find out, I snagged the 3.6 beta and ran a number of basic benchmarks on both it and the current version of Python, 3.5.2. All tests were run on a Windows 10 64-bit machine, with 16GB of RAM and an Intel Core i7-3770K 3.5 GHz CPU, using the 64-bit builds of Python.

First up was a quick benchmark developed by core Python dev Victor Stinner (of the speed-enhancing FAT Python project) to demonstrate his perf benchmark library. This particular benchmark manipulates dictionaries, and since Python 3.6 boasts a new, more compact dictionary implementation, I figured it was worth testing.

On this benchmark, Python 3.5.2 took 34.3 nanoseconds per iteration. Python 3.6 beta 1 brought that number down to 29.8, about 13 percent less time. It's not a dramatic, order-of-magnitude improvement, but it's noticeable.
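
For readers who want to try a comparable measurement without installing perf, the standard library's timeit module gives a rough equivalent. The sketch below is not Stinner's benchmark: the dictionary contents, the timed statement, and the helper name ns_per_iteration are all hypothetical, chosen only to exercise a dictionary lookup and report a per-iteration figure in nanoseconds.

```python
import timeit


def ns_per_iteration(stmt, setup="pass", number=1000000, repeat=5):
    """Return the best observed time per iteration, in nanoseconds."""
    # timeit.repeat returns one total (in seconds) per repeat,
    # each covering `number` executions of the statement.
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(totals) / number * 1e9


if __name__ == "__main__":
    # Hypothetical workload: look up one key in a small dictionary.
    setup = "d = {str(i): i for i in range(10)}"
    result = ns_per_iteration("d['5']", setup=setup)
    print("dict lookup: %.1f ns per iteration" % result)
```

Running the same script under each interpreter gives two figures to compare. Absolute numbers will vary with hardware and build, so only the ratio between versions on the same machine is meaningful.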

Another benchmark I put together using perf measured the speeds of these functions, which tested accessing global objects, manipulating dictionaries, and raw function call overhead:

def obj1():
    for n in range(100):
        ...  # loop body not shown

def obj2():
    for n in range(100):
        ...  # loop body not shown

def obj3():
    for n in range(100):
        ...  # loop body not shown

def obj4():
    ...  # body not shown

From Python 3.5.2:

obj1: Median +- std dev: 576 ns +- 26 ns
obj2: Median +- std dev: 1.45 us +- 0.02 us
obj3: Median +- std dev: 642 ns +- 6 ns
obj4: Median +- std dev: 9.30 ns +- 0.28 ns

From Python 3.6 beta 1:

obj1: Median +- std dev: 488 ns +- 20 ns
obj2: Median +- std dev: 833 ns +- 12 ns
obj3: Median +- std dev: 597 ns +- 18 ns
obj4: Median +- std dev: 7.66 ns +- 0.13 ns
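
Figures in the "Median +- std dev" form shown above can be approximated with the standard library alone. The following is a minimal sketch that swaps in timeit and statistics for perf, and it makes no attempt to match perf's methodology. The function obj_example is a hypothetical stand-in workload, not one of the functions measured in this article, and the helper names measure and report are likewise illustrative.

```python
import statistics
import timeit


def obj_example():
    # Hypothetical stand-in workload: fill a small dictionary.
    d = {}
    for n in range(100):
        d[n] = n


def measure(func, number=10000, repeat=20):
    """Return (median, std dev) of the per-call time in nanoseconds."""
    # Each entry in `totals` is the time in seconds for `number` calls.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    per_call = [t / number * 1e9 for t in totals]
    return statistics.median(per_call), statistics.stdev(per_call)


def report(name, func, **kwargs):
    """Format one result line in the style used in this article."""
    median, stdev = measure(func, **kwargs)
    return "%s: Median +- std dev: %.0f ns +- %.0f ns" % (name, median, stdev)


if __name__ == "__main__":
    print(report("obj_example", obj_example))
```

The median is less sensitive than the mean to an occasional slow run, and the standard deviation gives a sense of how noisy the machine was during the measurement.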

What stands out most immediately: Every operation is faster. Dictionary creation in particular requires a lot less time. Method or function calls in Python have always been expensive operations, but they're less so now -- not drastically, but there's clearly been improvement.
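
To put numbers on "faster," the medians reported above can be turned into percentage reductions. The short calculation below uses only the figures from the two result listings, with obj2's 3.5.2 median of 1.45 microseconds written as 1,450 nanoseconds. The helper name percent_less_time is illustrative.

```python
# Medians from the result listings above, in nanoseconds.
PY35_NS = {"obj1": 576.0, "obj2": 1450.0, "obj3": 642.0, "obj4": 9.30}
PY36_NS = {"obj1": 488.0, "obj2": 833.0, "obj3": 597.0, "obj4": 7.66}


def percent_less_time(old_ns, new_ns):
    """Percentage reduction in running time going from old_ns to new_ns."""
    return (old_ns - new_ns) / old_ns * 100.0


if __name__ == "__main__":
    for name in sorted(PY35_NS):
        change = percent_less_time(PY35_NS[name], PY36_NS[name])
        print("%s: %.1f%% less time" % (name, change))
```

By this arithmetic, obj2 shows the largest drop at roughly 43 percent, while obj1, obj3, and obj4 come in at about 15, 7, and 18 percent respectively.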

The bad news about microbenchmarks like these is that they rarely map to real-world use cases. Few Python applications in the wild have behavior patterns that match the workings of benchmarks. But in the aggregate, every little improvement adds up, and the differences across a cluster of machines can matter a lot more than on any one machine.

Still, there are two reasons why CPython will likely gain only incremental improvements in speed. First: CPython is intended to be the reference implementation of Python, so compatibility with the larger Python ecosystem has always been crucial. This feeds into the second reason, which is that Python's always been about speed of development and convenience for the programmer, rather than speed of execution. Within those constraints, it isn't unrealistic to expect more little victories as time goes on.

Copyright © 2016 IDG Communications, Inc.