LLVM 7 improves performance analysis, linking

The compiler framework that powers Rust, Swift, and Clang offers new and revised tools for optimization, linking, and debugging

The developers behind LLVM, the open-source framework for building cross-platform compilers, have unveiled LLVM 7. The new release arrives right on schedule as part of the project’s cadence of major releases every six months.

LLVM underpins several modern language compilers including Apple’s Swift, the Rust language, and the Clang C/C++ compiler. LLVM 7 introduces revisions to both its native features and to companion tools that make it easier to build, debug, and analyze LLVM-generated software.

Among the changes to the LLVM core:

  • When installed on Microsoft Windows, LLVM no longer offers direct integration with Visual Studio. Instead, the integration is provided by way of a separately installable add-on for Visual Studio, the LLVM Compiler Toolchain Visual Studio extension.
  • The llvm-rc tool, for compiling resources for Microsoft Windows executables, has been improved to make it easier to generate Windows applications with LLVM without needing as many of Microsoft’s own tools to do so.
  • Floating-point casts (converting a floating-point number to an integer by discarding everything after the decimal point) have been optimized, but in a way that might cause problems for developers who rely on undefined behavior around this feature. Clang has a new command-line switch to detect the issue at runtime.

LLVM’s linking tool, lld, benefits from a major speed boost. LLVM’s creators claim that lld is now “significantly faster” than platform-native linkers (such as link.exe on Microsoft Windows), and that it is production-ready for generating generic Unix, Windows, and MinGW apps. The project also promotes lld as the default linker for WebAssembly applications generated by LLVM, as WebAssembly is intended to be a first-class target of the compiler toolchain.
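As a usage sketch (assuming a system with both Clang and lld installed, and a placeholder source file), switching a build over to lld is typically a single driver flag rather than a build-system overhaul:

```shell
# Ask the Clang driver to link with lld instead of the platform's
# native linker; "main.c" and "app" are placeholder names.
clang -fuse-ld=lld -O2 -o app main.c
```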

New to LLVM 7 is llvm-mca, a performance analysis tool that measures the behavior of generated machine code—not just throughput of instructions, but also processor resource usage. Using llvm-mca, LLVM-generated code can be evaluated for how many instructions per cycle it sustains.

The llvm-mca tool works by taking in assembly that has been decorated with special comments marking the instructions to be profiled; from C code, those comments can be emitted via inline assembly (__asm volatile). The generated report includes statistics about the code over a number of iterations (the default is 100) and a timeline view of each instruction’s state transitions as it works its way through the processor’s instruction pipeline. This makes it easier to figure out whether LLVM-generated code is more efficient than hand-rolled assembly for the same jobs.
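A sketch of how the markers can be injected from C, using a hypothetical function (the function and region name here are illustrative; the LLVM-MCA-BEGIN/END comment syntax is from the llvm-mca documentation). Compiling with clang -O2 -S leaves the two comments in the assembly output, and running llvm-mca on that .s file then reports on only the marked region. The markers emit no instructions, so the function behaves normally when executed:

```c
/* Hypothetical kernel marked for llvm-mca analysis. The inline-asm
   statements below contribute only assembly comments, which llvm-mca
   recognizes as region markers; they add no machine instructions. */
float dot4(const float *a, const float *b) {
    float sum = 0.0f;
    __asm volatile("# LLVM-MCA-BEGIN dot4");
    for (int i = 0; i < 4; ++i)
        sum += a[i] * b[i];
    __asm volatile("# LLVM-MCA-END dot4");
    return sum;
}
```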

Also new to LLVM’s toolset is llvm-exegesis, a benchmarking tool to determine the performance of a given architecture’s instruction set. It just-in-time compiles a fragment of code to test the instructions in question with the highest possible parallelism on the available hardware, and reports back the latency of the tested instructions. The tool is mainly intended to validate the information provided by vendors about instruction behavior on their chipsets, but someone creating an LLVM back end for a new architecture could use it to fine-tune instruction scheduling there.
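As a usage sketch (flag names per the llvm-exegesis documentation; the opcode name is an x86-specific example, so treat the exact invocation as an assumption), measuring one instruction's latency looks roughly like:

```shell
# Measure the latency of the 64-bit register-register ADD
# on the host CPU; -mode=uops instead measures resource usage.
llvm-exegesis -mode=latency -opcode-name=ADD64rr
```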