SQL Server bulks up

Microsoft's beefy remake brings hefty new capabilities and a slew of new tools to master

To call SQL Server 2000 a 90-pound weakling because it lacks certain high-end features is a bit like calling Hercules a wimp because he never ran a marathon. Not every strongman can perform every feat, and not every company needs every heavyweight feature. Many enterprises, large and small, have been running their businesses on SQL Server for years.

Still, there’s no denying that SQL Server always fell short of Oracle Database when it came to true enterprise features, especially in the areas of high-availability and disaster recovery. Whereas Oracle met the demands of mission-critical, 24/7 operation with features such as online re-indexing, snapshot-based restores, and fail-over log shipping, SQL Server couldn’t provide true zero-percent data loss without the help of third-party solutions. For customers who didn’t need these protections, SQL Server offered a solid database with lower licensing fees. But maintenance-related downtime was a fact of life.

Until now. More than five years in the re-making, SQL Server 2005 not only addresses shortcomings in high-availability and disaster recovery, but brings sweeping improvements in almost every area. Management tasks have been streamlined, and monitoring and diagnostics have been beefed up. A default locked-down configuration, data encryption, and other new features strengthen security. Analysis Services, Reporting Services, and Data Mining are now fully integrated and mature. DTS (Data Transformation Services) has been rewritten from the ground up, emerging as the more powerful and flexible SQL SSIS (Server Integration Services). Integration with Visual Studio and the .Net/CLR opens SQL Server internals to .Net developers and .Net to SQL developers, meaning the power of .Net can now be tapped for stored procedures, data flows and transformations, BI applications, and more (see “Taking the database beyond SQL”).

SQL Server 2005 also promises better performance. I didn’t run performance benchmarks for this review, but in addition to allowing DBAs to partition the database to balance load, Microsoft has done a lot of work to improve memory management, indexing strategies, and query optimization. Users will see the difference.

Many of the changes in SQL Server 2005 will take current SQL developers and DBAs some time to learn. One key high-availability feature, database mirroring, is for practical purposes still “beta.” Microsoft still has work to do, especially in integrating and refining SQL Server’s many tools. Microsoft’s extreme makeover closes the gap between SQL Server and Oracle, however, and makes SQL Server a true contender even for very large databases and the most demanding enterprise environments.

A new way to move your data

SSIS, the revamped set of tools for moving and manipulating data, not only provides greatly enhanced performance, reliability, and functionality over DTS, but also offers many new prebuilt tasks that reduce the amount of coding needed to perform the most common operations, including backing up databases, re-indexing tables, and running integrity checks. More important, SSIS completely changes the modeling paradigm that DTS uses, logically dividing its building blocks into separate control-flow and data-flow components.

Control-flow containers are graphical representations of operations that would otherwise take developers dozens of lines of code and many hours of debugging to create. Some of the operations you can perform with the new containers (namely Sequence, For Loop, and Foreach) can’t even be done in DTS or would be too difficult to manage. 

SSIS also extends the reach into different platforms, and not just different databases. For example, whereas DTS provided limited visibility into OLAP and practically none into data mining, these are natively supported in SSIS, meaning you can run data-mining queries, work with slowly changing dimensions, and send the data wherever you like, even to Reporting Services reports. In fact, SSIS serves as a bridge between SSAS and SSRS and any ODBC-compliant data source.

Package configurations are another huge enhancement. These allow you to define certain elements of the package -- file locations, database connections, special log-ins, and such -- to be read in from an external source (like an XML file) and applied at run time. This function lets you deploy packages very quickly because you don’t have to change them to move them from one system to another; you simply change the dynamic elements in the configuration and the package will run on the new server.

Last but not least, the new .Net integration releases SSIS from DTS’s dependence on the limits of VBscript and moves serious coders into the richness of VB.net. For this reason, file and string manipulations, math operations, and Win32 calls are much easier and faster.

As you would expect, the increased power and flexibility of SSIS come at the price of increased complexity. SSIS presents a learning curve that can seem daunting, but after you learn how to use it, SSIS will change the way your company thinks about its data. Systems that couldn’t communicate before are now perfectly integrated and have the full power of .Net behind them. Complex data load operations into warehouses and disparate systems will take a fraction of the time to build, execute, and support. Other complicated tasks can also be performed much faster, which means deadlines will be met more easily and projects won’t be held up because of the limited functionality of DTS and the way operations have to be taped together.

Maximum DBA

Microsoft has also made some dramatic changes to its management tools, replacing Enterprise Manager and Query Analyzer with the new SSMS (SQL Server Management Studio). SSMS serves as both the database command line, with which you perform tasks such as creating databases and managing user accounts, and the tool used to write database code. SSIS and Profiler were left out of this merger.

These changes won’t leave users completely satisfied. For starters, doing normal query tasks, such as viewing execution plans and even just running queries, is much slower. Screens can become quite unresponsive while queries run, and it takes much longer to pull up execution plans, sometimes causing the window to seize up. Microsoft also changed the layout of the execution plans, so larger ones are much harder to read. Your favorite keyboard shortcuts have been taken away, and right-clicking on a view -- which used to allow you to edit it -- now takes you directly to query builder. A lot of DBAs won’t like this. 

But these annoyances are greatly outweighed by richer management capabilities overall. One place Microsoft has succeeded beyond expectations is in the management of SQL error logs. SSMS not only displays the SQL error logs, but combines them with the Windows NT event logs in the same pane, allowing DBAs to see all the relevant information in one place. While we’re on the topic of combined logs, this is a good place to mention the enhancements to Profiler, which now combines SQL trace logs with Perfmon logs. This makes it much easier to correlate performance counters with specific queries. In fact, it’s very easy to find exactly what you’re looking for because all you have to do is click on the performance spike in the Perfmon log and it automatically takes you to the relevant place in the Profiler log.

At the heart of SQL’s new management capabilities are the DMVs (Dynamic Management Views) and functions, which give admins much deeper insight into database operations. Microsoft used to hide the deeper aspects of the database, but meeting the demands of managing large databases means giving DBAs more control. DMVs provide greater visibility into memory, index usage, table stats, server stats, security, and on and on. So many things you couldn’t see at all before are now at your fingertips.

In addition, DBAs get ready-made reports on detailed database and server stats, including schema changes, memory and index usage, resource consumption, blocking, and many others. SSMS relies on SSRS for this functionality, and I find that it sometimes bogs down when bringing up a report on a busy server, or on a large database with a lot of objects. I’m also a little disappointed that Microsoft didn’t provide the RDL (Report Definition Language) code. As a result, you can’t change the layouts of the canned reports, nor can you add your own reports or set up subscriptions, as you normally can in SSRS.  You can manually export the reports to Excel or PDF, however.

One key design goal for SQL Server 2005 was tighter security out of the box. To reduce the surface area for attacks, the most dangerous features are disabled until the DBA turns them on. This is a help, but not earth-shattering to experienced DBAs, who mostly have their lockdown scripts written and don’t find it much trouble to run them on install. The truly significant security improvements revolve around password management and rights assignment. Native SQL passwords can now be set to respect Windows password policies, which means that you can require your SQL passwords to have the same password strength, expiration, history, lockout threshold, lockout duration, and age policies as your Windows accounts. You can even require the user to change the password at the next log in. Other important security enhancements include more control over schema changes (discussed under coding enhancements below) and the capability to encrypt data at the column level.

SQL Server 2005 also brings some exciting new features for increasing availability and smoothing disaster recovery. In the area of HA (high availability), Microsoft has introduced database mirroring and made some important changes to replication. Now schema changes will be replicated automatically to the subscribers; no longer must the DBA run code by hand to do it. Snapshots no longer need to be restarted after failures, but will automatically resume after network errors or other glitches, allowing you to get your HA scenario up and running much faster. Also, logical record replication, which allows you to define parent and child data that will either be replicated together or not at all, ensures more consistent HA scenarios and eliminates logical errors on the subscriber side.

Mirroring allows you to specify a database on another server to be kept in synch with a primary database. Doing this allows you to automatically fail over to the mirror server should the primary fail, providing SQL Server’s first zero data loss HA scenario. This configuration can be expensive to applications and networks, so you’ll need to test it thoroughly, but the capability is finally here. Note, however, that mirroring doesn’t yet qualify as a production feature in this release. Microsoft is shipping SQL Server 2005 with mirroring turned off, and the feature is unsupported.

Mirroring performed solidly in my tests, handling fail-overs of a 100GB database without a hitch. The monitoring around mirroring, however, is still very primitive, consisting of only a small number of DMVs and performance counters, so it’s not especially easy to find out how much data has been replicated, how far behind the secondary system is from the primary, or the cost of the replication. The difficulty in gaining visibility into the process may fray some nerves.

As for disaster recovery, SQL Server 2005 slashes the downtime necessary when restoring a backup. The new online restore capability allows users to begin working in the database as objects become available. With SQL Server 2000, the entire database must be restored before it can be made available to users. Given the time it takes to completely restore a multi-terabyte database, online restores could mean the difference between operating productively and being crippled for hours.

Similarly, you can now fix database corruption with less impact on users. Page-level restore allows you to restore a single corrupted page without hindering the availability of the rest of the database.

Developing for a new age

Code enhancements can be divided into two basic categories: those for developers and those for DBAs. Apart from the .Net/CLR integration, CTEs (Common Table Expressions) are arguably the most significant for developers. Besides elegantly solving the problem of recursive queries, which I won’t get into, CTEs allow developers to simplify code by using inline query definitions in place of views or derived tables. CTEs not only streamline complex queries, but perform the same as derived tables, so there isn’t any extra overhead to implementing them. 

Another important enhancement: The TOP clause, which allows you to define that only a certain number of a result set is returned (such as the top 10 salespeople by revenue in the United States), can now take a parameter as an argument. I know it doesn’t sound like much, but this is one of those cases where a relatively small change can make a huge difference in development effort. In short, instead of having to include a static value (TOP 10, for example) in the code, you can now include a variable, allowing users to report the TOP 10 or TOP 100 (or any number they like) on the fly. 

Other goodies for developers include ranking, random data sampling, pivot/unpivot, synonyms, much richer error handling, and code security context (see the infographic “Cool tools for SQL coders”). Many developers will also be singing in their Cheerios over SQL Server 2005’s native XML storage capabilities. As I mentioned in my April 2004 comparison of the leading relational databases’ XML handling capabilities, SQL Server 2000 was limited to “shredding” XML into relational data, a cumbersome process that fails to preserve XML hierarchies. Now, you can import XML into a column and not only perform native XML queries, but also index it for fast retrieval.

DBAs haven’t been left out of the coding revolution. A number of coding changes give admins management capabilities from the command line that are not accessible through Management Studio. One in particular, DDL (Data Definition Language), will not only change the way DBAs do their jobs but will also help their CEOs sleep much better at night. DDL triggers fire whenever changes to the database schema occur, and they can be defined on pretty much any object type in the server or database, helping to make unwanted changes a thing of the past. 

Indexing is always a volatile subject in any busy production environment. DBAs need to maintain the database; production managers don’t want to interrupt operations. Online indexing allows DBAs to create or rebuild indexes without taking the database offline. The index is built in parallel to the table and then applied once it is created. Of course, you have to be aware of resources, but this feature will greatly increase availability for shops that have to be up 24/7. DBAs can now also control the level of locking used by indexing operations. 

Not your father’s OLAP

SQL Server 2005’s suite of BI tools includes SSAS (SQL Server Analysis Services), SSDM (SQL Server Data Mining), and SSRS (SQL Server Reporting Services), all of which are finally designed to work together to answer your intelligence needs. SSAS is looking pretty good these days, thanks to many usability enhancements. The new Business Intelligence Wizard comes loaded with pre-canned solutions to common problems. Currency conversions, semi-additive measures, and time-based calculations are just some of the situations that the wizard handles.

SSAS has had one very important structural change -- the UDM (Unified Dimensional Model). It’s hard to fully grasp the UDM until you work with it, but it basically provides a bridge between clients such as Excel to any number of heterogeneous data sources. Or put in simpler terms, it sits between you and these sources and shows you the view of the data you request. UDM brings several important changes to the way you will operate with SSAS. One is that now you can have multiple fact tables, and in fact, you have your entire relational schema available to you. UDM eliminates the need for virtual cubes and virtual dimensions, providing not only increased functionality but a much easier model with which to work.

One of my absolute favorite new features is proactive caching. This mechanism allows you to set up metrics that determine when your data gets refreshed. There are several ways to do this, but one of the better ones is to define your cube to be refreshed when the base data changes. The data is refreshed at the partition level, and there are many options for balancing performance and latency.

SQL Server Data Mining has grown from what seemed like an experiment into a full-fledged application. Whereas SQL Server 2000 had two mining algorithms -- simple clustering and simple decision trees -- SQL Server 2005 has 10. It also has 25 new visualizations and visual query editors. All of these tools make it much easier to develop and work with mining models. I wouldn’t say that Microsoft has exactly achieved its goal to bring data mining into the hands of nonprofessionals, but it is far easier than using the third-party mining tools.

Tying the BI package together is SSRS. With this new release, Microsoft has taken away much of the pain of writing and distributing reports. Perhaps the most important improvement, Report Builder, is a new Web interface that allows end-users to write and publish their own reports. Admins publish a report model, which is basically a definition of the data they want the users to be able to write reports against, and users step through the wizard interface to create whatever reports they like. In a way, these reports can be richer than traditional SSRS reports because they come with infinite drilldown already built in, which is something that designed reports simply don’t have.

Report Builder’s controls allow developers to embed reports in applications. Along with this desirable capability, Report Builder also includes support for custom controls that allow developers (most likely third-party vendors) to extend SSRS functionality.

SSRS also includes lesser enhancements that end-users will find helpful. One of these is multivalued parameters, which allow users to pass more than one value (multiple sales regions, for example) inside a single parameter. This functionality previously relied on passing in delimited elements into a single parameter, and then parsing them out into single entities in the database. Other nice

additions are interactive sorting, which allows users to change the sorting criteria of the columns in their reports as they see fit, and a calendar picker, which allows users to choose a date from a pop-up calendar instead of having to type it.

For developers, SSRS works directly with SSAS and SSIS, using them both as data sources. This is a huge step forward because current SSRS users can follow processes already in place to produce any number of reports in any number of ways. The SSIS integration alone gives you the capability to pull in data from multiple sources, including those merged from different databases, Web services, RSS feeds, SSAS, and data-mining repositories. The possibilities are limitless.

Wish list

What got left out of this release? Many developers and DBAs were looking forward to IntelliSense for T-SQL coding. Many were also hoping for some sort of centralized code repository. We’ve still got templates, but those typically reside on clients; DBAs and developers can’t easily pull them from a centralized online resource. SSIS is a huge improvement over DTS, but it still isn’t an actual ETL server, meaning it still uses the resources of the server that kicks off the package. This has been a source of grief for many DBAs, although the current architecture has advantages too, and it’s something SQL DBAs have grown to live with and work around. Finally, Microsoft still hasn’t offered an answer to Oracle’s grid solution, which allows you to cluster multiple database servers for load balancing and fail-over and manage them as a group. That’s the last major advantage Oracle can still claim over SQL Server. 

Who should upgrade and when? Shops who should look at an upgrade right away -- assuming their applications have been certified for the new version and migrating won’t void their support contracts -- are those that need true 24/7 availability, rely heavily on DTS, or need to comply with Sarbanes-Oxley audits.

If you’re at the point where you just can’t keep pulling downtime for maintenance, or for any other reason, then this upgrade shouldn’t wait. SQL Server 2005’s replication enhancements, online reindexing, partial restores, and (soon) database mirroring will offer welcome relief.

If you are currently pushing DTS to the limit, converting to SSIS will ultimately streamline operations but may not be as easy as you’d like. Proper planning will be essential. The best way to go about it is to put a dedicated SSIS server in place and upgrade your packages one at a time. This method will give you an opportunity to upgrade and test at your leisure.

If you’re concerned about compliance, the new security features and the DDL triggers will whip your database into shape very quickly. Here again, though, implementing password policies can cause some grief in your environment, so you’ll have to plan appropriately. 

For shops without these pressing problems, SQL Server 2005 offers plenty of other reasons to consider it. From management to development, just pick your woe and chances are SQL Server 2005 will help alleviate it. 

SQL Server 2005 is an excellent release that will gain Microsoft some real respect in the high-end database market. SQL Server’s tools now lead Oracle and DB2 in pretty much every area -- management, monitoring, and especially development -- and of the three vendors, Microsoft seems to be the most concerned about improving them. Maybe those other guys should start hitting the weights.

InfoWorld Scorecard
Manageability (25.0%)
Scalability (20.0%)
Value (10.0%)
Performance (25.0%)
Availability (20.0%)
Overall Score (100%)
Microsoft SQL Server 2005 9.0 9.0 10.0 9.0 9.0 9.1
From CIO: 8 Free Online Courses to Grow Your Tech Skills
Join the discussion
Be the first to comment on this article. Our Commenting Policies