How to stress a UTM

We challenged the Astaro, SonicWall, WatchGuard, and ZyXel appliances with a maximum dose of legitimate traffic, 200 VPNs, and hundreds of Internet attacks, all at the same time

For our scenario-based test of the Astaro, SonicWall, WatchGuard, and ZyXel UTMs, we simulated a representative corporation with 200 branch offices all connecting back to headquarters for various services. Unlike previous firewall tests, we tested all three major firewall functions -- Internet services, secure remote access, and malware blocking -- simultaneously, in order to better represent the workloads these devices face when deployed in the real world.

First, using the Ixia IxLoad system, we ran a mix of HTTP, FTP, POP, and SMTP traffic through three firewall interfaces: LAN to WAN to simulate Web browsing by employees; LAN to DMZ to simulate employees updating and querying public servers; and WAN to DMZ to simulate external users interacting with the company's public servers and employees on the road accessing e-mail. This also created a baseline against which to compare performance.

Second, again using the IxLoad, we ran a mix of HTTP, FTP, POP, and SMTP traffic through each of 200 VPNs, simulating 10 users at each branch office accessing intranet servers on the LAN. This allowed us to see how overall throughput was affected by VPN activity, and completed our legitimate traffic baseline for our fictitious company.

[ When is a UTM not a UTM? Read the overall results of the InfoWorld Test Center's great UTM challenge. Read the reviews: Astaro Security Gateway 425 | SonicWall NSA E7500 | WatchGuard Firebox Peak X5500e | ZyXel ZyWall USG1000. Compare the UTMs feature by feature. ]

Third, we added malware to the mix, using Mu Dynamics' Mu-4000 and Published Vulnerability Attacks module to test the UTMs' attack blocking capabilities. Attacks were launched against the WAN interface to simulate bot traffic and other external threats, and then from the LAN interface to simulate an outbreak from an infected laptop being plugged in behind the firewall.
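The three layers described above can be written down as a small data structure. What follows is only a sketch of the scenario as the text describes it, not the configuration we actually loaded into IxLoad or the Mu-4000; the class name Flow and the helper simulated_vpn_users are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Protocol mix named in the article for the legitimate traffic.
PROTOCOLS = ("HTTP", "FTP", "POP", "SMTP")


@dataclass(frozen=True)
class Flow:
    """One direction of legitimate traffic between two firewall zones."""
    source: str
    destination: str
    purpose: str


# Phase 1: baseline traffic across three firewall interface pairs.
BASELINE_FLOWS = [
    Flow("LAN", "WAN", "employee Web browsing"),
    Flow("LAN", "DMZ", "employees updating and querying public servers"),
    Flow("WAN", "DMZ", "external users and traveling employees"),
]

# Phase 2: branch-office VPN traffic to intranet servers on the LAN.
VPN_TUNNELS = 200
USERS_PER_BRANCH = 10

# Phase 3: attacks launched first from outside, then from inside.
ATTACK_SOURCES = ("WAN", "LAN")


def simulated_vpn_users(tunnels=VPN_TUNNELS, users_per_branch=USERS_PER_BRANCH):
    """Total simulated branch-office users across all VPN tunnels."""
    return tunnels * users_per_branch


if __name__ == "__main__":
    for flow in BASELINE_FLOWS:
        print(f"{flow.source} -> {flow.destination}: {flow.purpose}")
    print(f"VPN users simulated: {simulated_vpn_users()}")
    print(f"Attack sources: {', '.join(ATTACK_SOURCES)}")
```

Written this way, the arithmetic is easy to see: 200 tunnels with 10 users each means the headquarters firewall is serving 2,000 simulated branch-office users on top of the baseline flows.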

Feeds and misdeeds

By laying a baseline of traffic across multiple firewall interfaces, adding traffic from 200 VPNs, and then hitting the UTM with roughly 600 attacks, we were able to determine how a stream of attacks affected overall throughput. We weren't surprised that the performance hit was typically substantial. Oddly, the Astaro system suffered a mere 2% drop, albeit while also failing to block more than 400 of our roughly 600 attacks.
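The two figures quoted here, the throughput drop and the share of attacks blocked, are simple ratios. The sketch below shows the arithmetic under stated assumptions: the function names are hypothetical, the throughput values of 500 and 490 Mbps are invented purely to produce a 2% drop, and the attack counts of 600 sent and 400 missed are the rounded figures from the text rather than exact results.

```python
def throughput_drop_pct(baseline_mbps, loaded_mbps):
    """Percentage of baseline throughput lost once attacks are added."""
    if baseline_mbps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (baseline_mbps - loaded_mbps) / baseline_mbps * 100.0


def block_rate_pct(attacks_sent, attacks_missed):
    """Percentage of attacks the device stopped."""
    if attacks_sent <= 0:
        raise ValueError("at least one attack must be sent")
    if not 0 <= attacks_missed <= attacks_sent:
        raise ValueError("missed attacks must be between 0 and attacks sent")
    return (attacks_sent - attacks_missed) / attacks_sent * 100.0


if __name__ == "__main__":
    # Hypothetical throughput numbers, chosen only to illustrate a 2% drop.
    print(f"Throughput drop: {throughput_drop_pct(500.0, 490.0):.1f}%")
    # Rounded counts from the text: roughly 600 sent, more than 400 missed.
    print(f"Block rate: at most {block_rate_pct(600, 400):.1f}%")
```

Read together, the two numbers explain why a small performance hit is not automatically good news: a device that misses more than 400 of roughly 600 attacks is blocking no more than about a third of them.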

Naturally, one of our main goals was to find out just how well these UTMs would handle the nearly constant attacks typically found on public Internet connections. To this end, we enlisted the help of Mu Dynamics and its Mu-4000 Analyzer. This unique test tool can generate millions of attacks based upon published vulnerabilities, as defined by folks like US-CERT (United States Computer Emergency Readiness Team), to exercise the deep packet inspection capabilities of each UTM. (Although the Mu-4000 can also "fuzz" these attacks to assess how well the UTMs could cope with variants or "zero-day attacks," we did not expose the UTMs to these attack mutations.) Mu Dynamics is so confident that it can break through a security device that the company even provides script-controllable power outlets on the Analyzer so that it can reboot the device after it's been locked up.

For testing the throughput of the UTMs, we used Ixia Communications' IxLoad system to run synthetic Web, FTP, and e-mail traffic in patterns between the different interfaces. Ixia recently gave the IxLoad the ability to run these same simulations through the IPsec VPN tunnels, allowing us to exercise rules on the firewalls for the VPNs. One of our biggest problems over the years has been manually correlating traffic numbers between several different test tools and trying to deal with the fact that competing traffic types share the same pipe: interactive HTTP sessions will back off when bulk FTP transfers start filling it. In the case of IxLoad, we could actually see the HTTP traffic backing off as the FTP traffic ramped up.

[ Read more about the Ixia and Mu Dynamics test tools: "Ixia IxLoad's multithreaded testing" | "Mu's Internet attacks in a can." ]

Ixia loaned us the smallest chassis in the Ixia product line (Optixia XM2) with 16 ports of gigabit throughput per blade and an embedded Linux machine behind each port. The basic architecture is that tests are loaded onto each port from a console, and when the test is run, the console just collects data. This way, each dedicated port can run flat out, generating huge amounts of data and saving us the hassle of setting up banks and banks of CPUs to generate the same load. Since we had multiple ports in each firewall zone (LAN, WAN, DMZ), we aggregated the ports together on a trio of Extreme Networks gigabit switches that provided more than enough bandwidth to avoid any potential throttling of the test. We also kept the traffic rates "real world": the WAN port never exceeded the rate you might find on even the sexiest commercial cable modems. When the test was done, the Ixia reporting feature generated comprehensive reports on the various test streams and correlated them in an easy-to-read format.

To run each UTM's management console, we set up a couple of modern workstations, each connected to an Avocent IP KVM and a remote power-down device from Server Technologies. This remote management setup really paid off, as we spent many a late night working out IPsec incompatibilities with firewall vendors, who logged into the firewall console remotely while our staff ran the Ixia console.

The overall test goal was to create a reproducible set of tests so that each firewall vendor was tested against the exact same benchmarks, but with test structures still based upon published Internet standards for network equipment testing. Overall, we think we've succeeded, but we've learned a great deal about just how flexible the IPsec VPN standard can be and just how many variants there are in its implementation.

The future: Testing the rest

We've been on the scenario-based testing soapbox for more than a dozen years, and our hopes for this type of testing still haven't quite come true. The missing piece is the ability to start up, from one place, each portion of a test that might reside on test equipment from as many as a half-dozen vendors. We also have the challenge of correlating the test results from all the test tools without going blind trying to read all the reports. We're not willing to say that we can see the light at the end of the proverbial tunnel, but we can see a dim glow in the stygian darkness as we've been reading up on the TesLA alliance.

To put this alliance into perspective, one needs to realize that a quiet revolution has been afoot. The use of XML for configuration and control has slowly become a de facto standard in the industry. TesLA is just one of the better-organized cooperative efforts to take advantage of this that we've seen to date.
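To make the idea of XML-driven control concrete, here is a minimal sketch of a test plan expressed as XML and read back in start order. Everything in it is an assumption made for illustration: the element and attribute names (testplan, step, order, tool, action) are invented and do not represent TesLA's format or any vendor's actual schema, and load_steps is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical test plan; the schema is invented for this example.
PLAN_XML = """
<testplan name="utm-scenario">
  <step order="2" tool="traffic-generator" action="start-vpn-traffic"/>
  <step order="1" tool="traffic-generator" action="start-baseline"/>
  <step order="3" tool="attack-generator" action="start-attacks"/>
</testplan>
"""


def load_steps(xml_text):
    """Parse the plan and return (order, tool, action) tuples in start order."""
    root = ET.fromstring(xml_text.strip())
    steps = [
        (int(step.get("order")), step.get("tool"), step.get("action"))
        for step in root.findall("step")
    ]
    return sorted(steps)


if __name__ == "__main__":
    for order, tool, action in load_steps(PLAN_XML):
        # A real harness would call each vendor's control interface here.
        print(f"{order}: tell {tool} to {action}")
```

The appeal is that a single file like this could describe which tool starts which portion of the test and when, regardless of which vendor built the tool.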

The promise of an alliance of test equipment vendors revolves around the fact that each vendor has its strengths and weaknesses. For example, our particular test tapped the Ixia IxLoad for generating the legitimate traffic moving through the WAN-LAN interfaces and the IPsec VPN tunnels, and the Mu Dynamics Analyzer to overwhelm the firewalls with malware. The trick to a real-world test, however, was to have both tools running all three major traffic types at the same time so that the devices under test wouldn't have the advantage of dedicating 100 percent of their CPU and memory resources (buffers are the name of the game) to a single task.

In the future, we'd like to be able to control our distribution switches so that we can let our firewalls apply QoS profiles to redirect low-priority traffic onto different VLANs and perhaps even incorporate Network Access Control (NAC) functionality. Perhaps we could even control some as-yet-unannounced feature to script a full 802.1X RADIUS-authenticated user session as part of the test. Oh heck, why not simulate how a user might start a VoIP conversation on a wired deskset, then move to a wireless handset as they walk out of a building? Our great wish for the future is a flexible multivendor testing system with varying levels of scripting capability so that our tests can even more closely resemble the real world.
