Commercial products are available to help and, fortunately for buyers, this space is hotly contested. Several vendors offer product suites that claim to assist in maintaining policies across disparate network devices and performing automated configuration backup, searching, and restoration. AlterPoint’s Device Authority Suite offers a complete network development environment patterned on the Eclipse IDE that features extensive automatic scripting tools to develop configuration changes and push them to selected network devices.
Such tools form the core of any change-management initiative. Without proper methods to develop, deploy, maintain, and verify configuration policies across dozens or hundreds of devices, no amount of procedure will make a difference. Enterprising carriers and datacenter operators have even integrated help desk software with change-window tracking to speed the process of linking problems to recent changes and to assist in network troubleshooting. Once a framework for thorough change management is in place, capitalizing on integrations such as these will help admins sleep at night.
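As a rough illustration of linking a problem to recent changes, the Python sketch below filters a change log for entries applied shortly before an incident. The ChangeRecord fields, device names, dates, and the 24-hour window are all hypothetical, chosen only for the example; they do not reflect any vendor's product or data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeRecord:
    """One entry in a (hypothetical) change log."""
    device: str
    author: str
    applied_at: datetime
    summary: str


def recent_changes(changes, incident_time, window=timedelta(hours=24)):
    """Return changes applied within `window` before the incident, newest first."""
    start = incident_time - window
    hits = [c for c in changes if start <= c.applied_at <= incident_time]
    return sorted(hits, key=lambda c: c.applied_at, reverse=True)


# Illustrative data only.
changes = [
    ChangeRecord("core-rtr-1", "alice", datetime(2024, 1, 10, 2, 15), "Added ACL entry"),
    ChangeRecord("edge-rtr-7", "bob", datetime(2024, 1, 10, 9, 40), "Edited route redistribution"),
    ChangeRecord("core-rtr-2", "carol", datetime(2024, 1, 8, 23, 0), "Changed SNMP community"),
]

incident = datetime(2024, 1, 10, 10, 5)
suspects = recent_changes(changes, incident)

for c in suspects:
    print(f"{c.applied_at:%Y-%m-%d %H:%M}  {c.device}  {c.author}  {c.summary}")
```

A help desk integration along these lines would presumably attach the resulting list to the trouble ticket, so the first question on the call becomes "what changed?" rather than "did anything change?"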
Now This Won’t Hurt a Bit
For many network administrators, initiating network change management is like a trip to the dentist: necessary but dreadful. Generally, network configuration changes are slight (an addition to an access list or a change to an SNMP community, for example) and require only seconds to implement. Navigating onerous change-management guidelines can seem to complicate otherwise straightforward tasks. Abiding by the guidelines pays off, however. If you don’t summon the courage to see the dentist, problems only get worse; the same is true of network configuration.
Enterprises can learn much about managing network configurations from service providers, whose very livelihood depends on handling changes smoothly, quickly, and accurately. Nearly all large ISPs have rigorous change-management procedures in place and back those up with thorough configuration management tools. Some ISPs aren’t as dutiful, and it can show.
Recently, I assisted during an outage of a multistate MPLS (multiprotocol label switching) network. Although I was not privy to the carrier’s network, I was on the call with the NOC (network operations center) administrator looking into the problem. Because private MPLS networks operate at a higher layer than older WAN services, carriers have a much greater impact on the performance and reliability of the service. Where a traditional frame-relay network functions at layer 2, a private MPLS network functions at layer 3, and the carrier is responsible for maintaining valid routes across all POPs (points of presence). Thus, when a network failure occurs while all network links are active, the problem may lie within the carrier’s routing tables. Such was the case here. The problem was eventually traced to a change made on a router thousands of miles from the farthest point of this network, where routes were erroneously injected into the routing tables for my client’s MPLS network. The failure was triggered by a seemingly innocuous change to a routing table with no relation to the unintentionally affected network and, until someone contacted the tech who made the change, no one knew it had been made. It took three hours to identify and fix the problem. If an effective change-management policy had been in place, the problem could probably have been averted.
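One way an unrecorded change like this might be surfaced sooner is a routine comparison of each device's running configuration against its last approved backup. The sketch below uses Python's standard difflib module to do that; the configuration lines are invented, Cisco-style examples, and how the two texts are retrieved from a device or backup store is left out as an assumption.

```python
import difflib


def config_drift(approved: str, running: str) -> list[str]:
    """Return lines added to ("+ ") or removed from ("- ") the approved config."""
    diff = difflib.ndiff(approved.splitlines(), running.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); keep only real changes.
    return [line for line in diff if line.startswith(("+ ", "- "))]


# Illustrative configuration text only.
approved = """\
ip route 10.10.0.0 255.255.0.0 192.0.2.1
ip route 10.20.0.0 255.255.0.0 192.0.2.1"""

running = """\
ip route 10.10.0.0 255.255.0.0 192.0.2.1
ip route 10.20.0.0 255.255.0.0 192.0.2.1
ip route 10.30.0.0 255.255.0.0 192.0.2.9"""

drift = config_drift(approved, running)
for line in drift:
    print(line)
```

Any nonempty result that does not correspond to an approved change record is worth a phone call long before an outage forces one.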
Following the ITIL Framework
Increasingly, effective change-management policies follow the ITIL framework.