Intro to Nuitka: A better way to compile and distribute Python

You can use Nuitka to compile Python programs to standalone executables, then redistribute them without the Python runtime.


As Python's popularity rises, its limitations are becoming clearer. For one thing, it can be very hard to write a Python application and distribute it to people who don't have Python installed.

The most common way to solve this issue is to package the program together with all its supporting libraries and files and the Python runtime. There are tools for doing this, like PyInstaller, but they require a lot of cajoling to work correctly. What's more, it's often possible to extract the source code for the Python program from the resulting package. For some scenarios, that's a deal breaker.

Nuitka, a third-party project, offers a radical solution. It compiles a Python program to a C binary—not by packaging the CPython runtime with the program bytecode, but by translating Python instructions into C. The results can be distributed in a zipped bundle or packaged into an installer with another third-party product.

Nuitka also tries to maintain maximum compatibility with the Python ecosystem, so third-party libraries like NumPy work reliably. It makes performance improvements to compiled Python programs where it can, but not at the cost of overall compatibility. Speedups aren't guaranteed: they vary tremendously between workloads, and some programs might not experience any significant performance improvement. As a general rule, it's best to treat Nuitka as a bundling solution rather than rely on it to improve performance.

Installing Nuitka

Nuitka works with Python 2.6 through 2.7 and Python 3.3 through 3.10. It can compile binaries for Microsoft Windows, macOS, Linux, and FreeBSD/NetBSD. Note that you must build the binaries on the target platform; you cannot cross-compile.

For every platform, in addition to the Python runtime, you'll need a C compiler. On Microsoft Windows, Visual Studio 2022 or higher is recommended, but it is also possible to use MinGW-w64 C11 (gcc 11.2 or higher) or clang-cl under Visual Studio. On other platforms, you can use gcc 5.1 or higher, g++ 4.4 or higher, or clang.
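Before installing Nuitka, it can help to confirm that a C compiler is reachable from your shell. The following is a minimal sketch, independent of any checks Nuitka itself performs: it only reports which compiler executables are on PATH. The list of candidate names and the find_compilers helper are assumptions made for illustration. Note that Visual Studio's cl is usually on PATH only inside a developer command prompt, so its absence here doesn't prove Visual Studio is missing.

```python
import shutil

# Candidate compiler executables to look for. This list is an assumption
# for illustration, not Nuitka's own detection logic.
CANDIDATES = ("gcc", "clang", "clang-cl", "cl")


def find_compilers(names=CANDIDATES):
    """Return a dict mapping each compiler name found on PATH to its location."""
    found = {}
    for name in names:
        path = shutil.which(name)
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    compilers = find_compilers()
    if compilers:
        for name, path in compilers.items():
            print(f"{name}: {path}")
    else:
        print("No C compiler found on PATH")
```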

Note that if you use Python 3.3 or Python 3.4 (which are long deprecated), you'll also need Python 2.7 installed because of tool dependencies. All of this is an argument for using the most recent version of Python if you can.

It's best to install Nuitka in a virtual environment along with your project as a development dependency rather than a distribution dependency. Nuitka itself isn't bundled with or used by your project; it performs the bundling.

Using Nuitka for the first time

Once you have Nuitka installed, invoke it with nuitka or python -m nuitka.

The first thing you'll want to do with Nuitka is to verify the entire toolchain works, including your C compiler. To test this, you can compile a simple "Hello world" Python program that consists of a single line:

print("Hello world")

When you compile a Python program with Nuitka, you pass the name of the entry-point module as a parameter to Nuitka. When invoked like this, Nuitka will take in that module and build a single executable from it.

Note that because we're just testing out Nuitka's functionality, it will only compile that one Python file to an executable. It will not compile anything else, nor will it bundle anything for redistribution. But compiling one file should be enough to determine if Nuitka's toolchain is set up correctly.

Once the compilation finishes, you should see a binary executable file placed in the same directory as the Python program. Run the executable to ensure it works.

You can also automatically run your Nuitka-compiled app by passing --run as a command-line flag.
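If you'd rather drive the test build from a script than type the command by hand, a minimal sketch might look like the following. It assumes Nuitka is installed in the same environment as the running interpreter. The helper names nuitka_command and compile_script and the file name hello.py are hypothetical; only the python -m nuitka invocation and the --run flag come from the text above.

```python
import subprocess
import sys


def nuitka_command(script, *flags):
    """Build the argument list for compiling `script` with `python -m nuitka`."""
    return [sys.executable, "-m", "nuitka", *flags, script]


def compile_script(script, *flags):
    """Run Nuitka and raise CalledProcessError if the build fails."""
    subprocess.run(nuitka_command(script, *flags), check=True)


if __name__ == "__main__":
    # "hello.py" is a hypothetical file name for the test program.
    # Printing the command shows what would run without starting a build.
    print(nuitka_command("hello.py", "--run"))
```

Calling compile_script("hello.py", "--run") would start the actual build and then launch the result. Using sys.executable rather than a bare nuitka command keeps the build tied to the interpreter of the active virtual environment.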

If your "Hello world" test executable works, you can try packaging it as a redistributable. I'll explain that process shortly.

Note that when you run your first test compilation with Nuitka, it will probably complete in a matter of seconds. Don't be fooled by this! Right now, you are only compiling a single module, not your entire program. Compiling a full program with Nuitka can take many minutes or longer, depending on how many modules the program uses.

Compile a Python program with Nuitka

By default, Nuitka only compiles the module you specify. If your module has imports—whether from elsewhere in your program, from the standard library, or from third-party packages—you'll need to specify that those should be compiled, too.

Consider a modified version of the "Hello world" program, where you have an adjacent module named greet:

def greet(name):
    print("Hello", name)

and a modified main program that imports it:

import greet

To have both modules compiled, you can use the --follow-imports switch:

nuitka --follow-imports

The switch ensures that all the imports required throughout the program are traced from the import statements and compiled together.

Another option, --nofollow-import-to, lets you exclude specific subdirectories from the import process. This option is useful for screening out test suites, or modules that you know are never used. It also lets you supply a wildcard as an argument.


Figure 1. Compiling a large, complex program with Nuitka. This example involves compiling the Pyglet module along with many modules in the standard library, which takes several minutes.

Include dynamic imports

Now comes one of the wrinkles Python users often confront when trying to package a Python application for distribution. The --follow-imports option only follows imports explicitly declared in code by way of an import statement. It doesn't handle dynamic imports.
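To make the distinction concrete, here is a small sketch of what a dynamic import looks like. The load_plugin helper is hypothetical, and json stands in for a plugin module only because it is always available in the standard library.

```python
import importlib


def load_plugin(module_name):
    """Import a module whose name is only known at runtime."""
    return importlib.import_module(module_name)


if __name__ == "__main__":
    # The module name is plain data here (it could come from a config file
    # or user input), so there is no import statement for a tool to trace.
    plugin = load_plugin("json")
    print(plugin.__name__)
```

Because the module name reaches importlib.import_module as an ordinary string, nothing in the source ties the program to that module until it runs.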

To get around this, you can use the --include-plugin-directory switch to provide one or more paths to modules that are dynamically imported. For instance, for a directory named mods that contains dynamically imported code, you would use:

nuitka --follow-imports --include-plugin-directory=mods

Include data files and directories

If your Python program uses data files loaded at runtime, Nuitka can't automatically detect those, either. To include individual files and directories with a Nuitka-packaged program, you'd use --include-data-files and --include-data-dir.

--include-data-files lets you specify a wildcard for the files to copy and where you want them copied to. For instance, --include-data-files=/data/*=data/ takes all the files that match the wildcard /data/* and copies them to data/ in your distribution directory.

--include-data-dir works roughly the same way, except that it uses no wildcard; it just lets you pass a path to copy and a destination in the distribution folder to copy it to. As an example, --include-data-dir=/path/to/data=data would copy everything in /path/to/data to the matching directory data in your distribution directory.
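On the program side, bundled data has to be found again at runtime. A common approach, sketched below, is to resolve paths relative to the program's own location instead of the current working directory. This assumes the data ends up beside the compiled program under the destination you passed (data in the examples above), which is worth verifying against your own build. The helper names and the file data/greeting.txt are hypothetical.

```python
from pathlib import Path


def program_dir():
    """Return the directory that contains this module."""
    return Path(__file__).resolve().parent


def read_data_file(base_dir, relative_path):
    """Read a UTF-8 text file stored at relative_path under base_dir."""
    return (Path(base_dir) / relative_path).read_text(encoding="utf-8")
```

A call such as read_data_file(program_dir(), "data/greeting.txt") then behaves the same no matter which directory the user launches the program from.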

Include Python packages and modules

Another way to specify imports is by using a Python-style package namespace rather than a file path, using the --include-package option. For instance, the following command would include mypackage, wherever it is on disk (assuming Python could locate it), and everything below it:

nuitka --include-package=mypackage

If packages require their own data files, you can include those with the --include-package-data option:

nuitka --include-package=mypackage --include-package-data=mypackage

This command tells Nuitka to pick up any files in the package directory that aren't actually code.

If you only want to include a single module, you can use --include-module:

nuitka --include-module=mypackage.mymodule

This command tells Nuitka to include only mypackage.mymodule and nothing else.

Compile a Python program for redistribution

When you want to compile a Python program with Nuitka for redistribution, you can use a command-line switch, --standalone, that handles most of the work. This switch automatically follows all imports and generates a dist folder that includes the compiled executable and any support files needed. To redistribute the program, you only need to copy this directory.

Don't expect a --standalone-compiled program to work the first time you run it. The general dynamism of Python programs all but guarantees you'll need to use some of the other above-described options to ensure the compiled program runs properly. For instance, if you have a GUI app that requires specific fonts, you may have to copy them into your distribution with --include-data-files or --include-data-dir.

Also, as noted above, the compilation time for a --standalone application may be significantly longer than for a test compilation. Once you have some idea of how long a build takes, budget that time into your testing cycle for a standalone-built application.

Finally, Nuitka offers another build option, --onefile. For those familiar with PyInstaller, --onefile works the same way as the same option in that program: it compresses your entire application, including all its dependent files, into a single executable with no other files needed for redistribution. However, it is important to know that --onefile works differently on Linux and Microsoft Windows. On Linux, it mounts a virtual filesystem with the contents of the archive. On Windows, it unpacks the files into a temporary directory and runs them from there—and it has to do this for each run of the program. Using --onefile on Windows may significantly slow down the time it takes to start the program.
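Since a redistributable build usually combines several of these switches, it can be convenient to assemble the command in a small build script. The following is a sketch under stated assumptions, not a recommended or official workflow: standalone_command, its parameters, and the example names in the comments are all hypothetical, while the flags themselves are the ones described in this article.

```python
import sys


def standalone_command(script, data_dirs=None, packages=None, onefile=False):
    """Assemble a Nuitka command line for a redistributable build.

    data_dirs maps a source directory to its destination inside the
    distribution; packages lists package names to include explicitly.
    """
    command = [sys.executable, "-m", "nuitka", "--standalone"]
    if onefile:
        command.append("--onefile")
    for source, destination in (data_dirs or {}).items():
        command.append(f"--include-data-dir={source}={destination}")
    for package in packages or []:
        command.append(f"--include-package={package}")
    # The entry-point script goes last, after all the options.
    command.append(script)
    return command


if __name__ == "__main__":
    # Hypothetical project layout: a main.py script, an assets/fonts
    # directory, and a package named mypackage.
    print(standalone_command(
        "main.py",
        data_dirs={"assets/fonts": "fonts"},
        packages=["mypackage"],
    ))
```

The returned list can be passed to subprocess.run(..., check=True) to start the build. Keeping the options in one function makes it easier to add a flag after a failed test run and rebuild with exactly the same settings.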

Copyright © 2022 IDG Communications, Inc.
