Debugging concurrent code with Coyote

Microsoft’s new .NET distributed systems testing framework helps track down hard-to-reproduce errors in cloud code.

Modern multithreaded, asynchronous code can be hard to debug. The complexity that comes with message passing and thread management produces bugs that can seem nondeterministic, with little or no way of spotting precisely which interaction caused them. Things get worse when we move away from monolithic apps to distributed microservices running across cloud compute fabrics.

Concurrent code is inherently complex. Because it runs asynchronously, its behavior depends on much more than your application’s own components: it is also affected by the underlying network and by the performance of the services that support it. Yet concurrency is now essential as we move to take advantage of cloud-native development models, both on-premises and in public hyperscale clouds.

Traditional test and debug methods fall down here, as they come from a history of working with single-threaded, monolithic applications. Although we’ve been lucky that they manage to scale to multiprocessor, multithreaded code with shared memory, that advantage is lost in the cloud, where there’s no guarantee of shared processors or memory.

One option is to use verification tools to attempt to prove correctness, but these can miss combinations of external factors that might significantly affect concurrency. What’s needed is a testing tool designed from the ground up to work with concurrent systems, one that can run its tests in development environments alongside traditional unit testing tools.

Introducing Coyote

Microsoft is experimenting with a tool that does just that, code-named Coyote. It was developed in Microsoft Research and is already in use by the Azure development team. It’s intended to be part of your unit test suite, adding non-deterministic failures and parallel testing of concurrent operations. You don’t need to change your code to use Coyote, as it works at a binary level. At the same time, it provides a development framework of its own, using asynchronous actors to deliver C# code that won’t exhibit blocking behaviors.
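As a rough illustration of the actor style mentioned above, here’s a minimal sketch built on Coyote’s Microsoft.Coyote.Actors namespace. The Greet event, the Greeter actor, and the Reply field are hypothetical names invented for this example, and namespace details (such as where Event lives) have shifted between Coyote releases, so treat it as a starting point rather than a definitive listing.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Coyote;        // Event lived here in earlier releases
using Microsoft.Coyote.Actors; // Actor, ActorId, IActorRuntime, RuntimeFactory

// Hypothetical message type: actors communicate only by sending events.
public class Greet : Event
{
    public readonly string Name;

    public Greet(string name)
    {
        this.Name = name;
    }
}

// Hypothetical actor: the attribute routes Greet events to HandleGreet.
[OnEventDoAction(typeof(Greet), nameof(HandleGreet))]
public class Greeter : Actor
{
    private void HandleGreet(Event e)
    {
        var greet = (Greet)e;
        // Hand the result back without blocking the actor.
        Program.Reply.TrySetResult($"Hello, {greet.Name}");
    }
}

public static class Program
{
    // Completed by the actor so Main knows the event has been handled.
    public static readonly TaskCompletionSource<string> Reply =
        new TaskCompletionSource<string>();

    public static async Task Main()
    {
        IActorRuntime runtime = RuntimeFactory.Create();
        ActorId greeter = runtime.CreateActor(typeof(Greeter));

        // SendEvent queues the event and returns immediately.
        runtime.SendEvent(greeter, new Greet("Coyote"));

        Console.WriteLine(await Reply.Task);
    }
}
```

Each actor handles one event at a time from its own inbox, so the handler never needs a lock; the sender simply queues a message and carries on.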

Coyote is a modern .NET framework for C# in .NET 5 and later. It installs from NuGet with two packages: one for the Coyote framework and one for its test tooling. Alternatively, you can install the Coyote test tool from the .NET command line, with no need to build it from source or add the packages to your code. The CLI tool helps with automating Coyote tests, using either Azure DevOps or GitHub Actions.

It’s designed for working with asynchronous, distributed .NET code that uses common C# constructs as well as the .NET Task Parallel Library. Error messages indicate when it encounters unsupported APIs and types, and you have the option of providing mocks for external APIs. In these cases, Coyote can still run and expose bugs, but it won’t produce traces that can help with debugging. It remains useful as a way of surfacing bugs, especially if you’re using other .NET concurrency approaches. The same “no repro” mode is needed if you’re embedding Coyote in another unit testing environment, so if you need reproducible traces, you should run Coyote tests outside other testing frameworks as part of a larger unit testing strategy.
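To make the idea of mocking an external API concrete, here’s one possible shape for such a mock, written in plain C# with no Coyote dependency. The IBlobStore interface and InMemoryBlobStore class are hypothetical, standing in for whatever storage or messaging service your code actually calls.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical abstraction over an external storage service.
public interface IBlobStore
{
    Task<bool> TryAddAsync(string key, string value);

    Task<string> GetAsync(string key);
}

// In-memory stand-in, intended for tests only.
public sealed class InMemoryBlobStore : IBlobStore
{
    private readonly ConcurrentDictionary<string, string> items =
        new ConcurrentDictionary<string, string>();

    public async Task<bool> TryAddAsync(string key, string value)
    {
        // Yield so the call completes asynchronously, as a network call would.
        await Task.Yield();
        return this.items.TryAdd(key, value);
    }

    public async Task<string> GetAsync(string key)
    {
        await Task.Yield();
        return this.items.TryGetValue(key, out string value) ? value : null;
    }
}
```

Because the mock keeps the asynchronous signature of the real service, the code under test still awaits each call, which leaves room for other tasks to interleave between operations.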

Using Coyote with your code

Getting started with Coyote is relatively simple. Install the CLI client, and then point its rewriter at a binary, either your own code or any library it uses. If you’re working with more than one assembly, you can use a simple JSON file to list all the assemblies that need to be rewritten. A useful option redirects the rewritten binaries to a new directory so you can run both standard and Coyote tests at the same time. If an assembly is outside your build path, you can use a full path to pull it in, with the rewritten binary being delivered to your output directory.

With the code rewritten, you can hand control over to the Coyote tester. Again, all you need is a single command pointing to a binary that contains a method set up as an entry point for Coyote. The tester serializes your app’s operation and takes control of its scheduler, then runs the code a set number of times, with different assumptions about how it’s scheduled on each run. Each run works through a different path, making deterministic choices about the non-deterministic effects on your code.
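A Coyote entry point is an ordinary static method marked with the [Test] attribute from Microsoft.Coyote.SystematicTesting. The sketch below is a hypothetical example (the CounterTests class and its deliberately racy counter are invented for illustration) of the kind of lost-update bug that controlled scheduling is meant to expose.

```csharp
using System.Threading.Tasks;
using Microsoft.Coyote.Specifications;
using Microsoft.Coyote.SystematicTesting;

public static class CounterTests
{
    // Hypothetical shared state with a deliberate read-modify-write race.
    private static int counter;

    private static async Task IncrementAsync()
    {
        int snapshot = counter;   // read
        await Task.Yield();       // another task may run between read and write
        counter = snapshot + 1;   // write back, possibly overwriting an update
    }

    // Entry point for the Coyote tester, invoked once per test iteration.
    [Test]
    public static async Task TwoIncrementsShouldBothApply()
    {
        counter = 0; // reset static state so iterations stay independent

        await Task.WhenAll(
            Task.Run(() => IncrementAsync()),
            Task.Run(() => IncrementAsync()));

        // Fails on any schedule where both tasks read before either writes.
        Specification.Assert(counter == 2, "Lost update: counter is {0}", counter);
    }
}
```

Run natively, both increments will usually land and the assertion passes. Under the tester, the expectation is that some explored schedule lets both tasks read the counter before either writes, at which point the assertion fails and the run is reported as a bug.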

Debugging concurrent applications

By simulating many different scheduling options, Coyote can quickly expose bugs that may only occur under very limited conditions. When a bug is triggered, the tester terminates, delivering readable traces to your development environment ready for debugging. To speed things up, parallel instances of the tester can run at the same time in separate processes, ensuring isolation between instances. A simple command line setting controls the number of iterations to run, with another capping the number of steps executed in each iteration and terminating it when that number is reached (if there’s the prospect of a non-terminating bug that oscillates between states, for example).
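The same two limits can also be set in code when the tester is driven programmatically through Coyote’s TestingEngine class. The following is a sketch under assumptions: the EngineDemo class and WriterRaceAsync test are hypothetical, the values 100 and 500 are arbitrary, and the assembly is assumed to have been rewritten first so that its tasks are under Coyote’s control.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Coyote;
using Microsoft.Coyote.Specifications;
using Microsoft.Coyote.SystematicTesting;

public static class EngineDemo
{
    // Hypothetical test: two writers race, and the assertion wrongly
    // assumes the second one always finishes last.
    private static async Task WriterRaceAsync()
    {
        string last = null;
        Task first = Task.Run(() => { last = "first"; });
        Task second = Task.Run(() => { last = "second"; });
        await Task.WhenAll(first, second);
        Specification.Assert(last == "second",
            "Writers ran out of order: last is '{0}'", last);
    }

    public static TestReport Explore()
    {
        Configuration config = Configuration.Create()
            .WithTestingIterations(100)    // number of schedules to explore
            .WithMaxSchedulingSteps(500);  // cut off runs that never terminate

        TestingEngine engine = TestingEngine.Create(config, WriterRaceAsync);
        engine.Run();

        Console.WriteLine(engine.GetReport());
        return engine.TestReport;
    }

    public static void Main()
    {
        TestReport report = Explore();
        Console.WriteLine($"Bugs found: {report.NumOfFoundBugs}");
    }
}
```

The step bound matters most for liveness problems: without it, a schedule that bounces between two states forever would stall the whole test run instead of being cut off and reported.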

One interesting option for Coyote’s parallel testing strategy is its portfolio flag. This allows each instance to use a different scheduler strategy, improving the odds of finding a bug and, at the same time, hopefully avoiding bug duplication.

Terminating as soon as a bug is found makes sense; with non-deterministic bugs, the same error may manifest differently on different runs. Fixing it quickly can prevent future occurrences, allowing new test runs to expose different bugs rather than different instances of the same problem.

Once you’ve found a bug, Coyote’s traces can replay the sequence of actions that triggered it. At the same time, you can attach Visual Studio’s debugger to an automatically created breakpoint near the bug. This simplifies debugging and allows you to step through the interactions that may have triggered the bug.

As applications get more and more complex, issues like concurrency become significant, and dedicated development and test tools become essential. With much of Azure built on a distributed systems framework, Microsoft needed to develop a set of testing tools to handle this. Like most internal tools, Coyote initially scratched a specific set of itches, adding new features as they were needed and as the package matured.

With Coyote now public and with an open development model hosted on GitHub, it’ll be interesting to watch it develop alongside the .NET platform. The increasing importance of distributed systems development makes tools like Coyote essential. It’s likely that it’ll quickly end up as a part of the .NET ecosystem, with deep integration into Visual Studio beyond the current tracing tools, much like NuGet has become the standard for .NET package distribution.

Copyright © 2021 IDG Communications, Inc.