The next version of Microsoft's relational database management system (RDMS) promises to bring immense performance gains to online transactional processing systems.
SQL Server 2014 will come with a new in-memory OLTP engine, one built with the company's Heckaton set of technologies, first developed by Microsoft Research.
[ Andrew C. Oliver answers the question on everyone's mind: Which freaking database should I use? | Keep up with the latest approaches to managing information overload and compliance in InfoWorld's Enterprise Data Explosion Digital Spotlight. ]
"We're building our transactional processing work in directly into the database. It is not a separate product, not a separate thing to program to," said Quentin Clark, corporate vice president of SQL Server, in a keynote presentation launching SQL Server 2014, announced this week at the Microsoft TechEd North America conference in New Orleans. "We are doing it because it is a way to achieve unprecedented low latency and high-scale throughput for transactional data."
Microsoft has billed Heckaton as an in-memory technology, though unlike other in-memory data stores, such as Oracle's TimesTen, it does not place the entire database into working memory, or RAM, to speed performance. Rather, it selectively moves only the most consulted database tables to memory.
In Microsoft labs, Hekaton has been shown to cut transaction times by 50 times, and customers who have tried early versions of the technology have found a 16-fold percent improvement in performance, according to Eron Kelly, Microsoft SQL Server general manager, in a follow-up interview with the IDG News Service.
In addition to speeding transactional system performance, SQL Server 2014 could also cut hardware costs, because the efficiencies wrought from the new technologies would reduce the computational requirements, or number of servers, to execute a given workload, Kelly said.
Traditionally, each time an OLTP system conducts a transaction, such as books an order for a travel agency, the results are written to, or read from, a row of data residing on a database table on a hard drive. OLTP systems could perform much faster if the database tables were stored in a server's working memory. With most commercial OLTP systems, however, the cost of putting all the database tables in working memory would be prohibitively expensive. Hekaton speeds transaction times through, among other techniques, a more judicious use of memory.
What Hekaton does, Kelly explained, is move the hot data, or data that gets lots of reads and writes, into a server's working memory.
To enable in-memory usage, the database administrator uses a diagnostic application that surveys data table usage and recommends which database tables should be moved into a server's main memory.
The system maintains transactional integrity -- the guarantee that each completed transaction is saved even during a system outage -- by writing transactions separately to a log file.
Hekaton also looks for stored procedures, or queries that are stored for regular use, that would be ripe for optimization as well. The software can identify heavily used stored procedures and can compile them to machine code. Machines code typically executes much more quickly -- and requires less memory -- than code written in high-level languages such as SQL itself.
A third way Hekaton improves performance is by refining the process of locking database tables to write data. Traditionally, in order to avoid overwriting of data midtransaction by another operation, a RDMS will lock a table page each time a row of data needs to be updated. Multiple concurrent reads or writes to a database table can, however, create a backlog, slowing performance.
The OLTP Engine doesn't lock data by pages. "So each process can hit any row it wants, and the only time there is a conflict is when two processes hit the same row at the exact same instant," Kelly said.
Even individual rows aren't locked when they are being written to. Rather, the RDMS writes the updated row to a new location and leaves a pointer to the new location in the old row, where it can be followed by any query that takes place at the same time as the update. Kelly called this technique "optimistic concurrency."
This is not Microsoft's first foray into in-memory technologies. The Microsoft Parallel Data Warehouse uses a similar technology for large-scale data analysis and PowerPivot, introduced in Excel 2010, uses an in-memory compressed columnar processing to parse large data sets in memory. SQL Server 2012 then borrowed this columnar processing technique for its own use.
As an example of how the new SQL Server could speed performance, Kelly explained how a retail chain could benefit from this technology.
A retail store keeps an inventory of all the products on the shelves, which a sales clerk can consult on a computer to see if an item is in stock. This sort of query can take time and sometimes is not accurate, given that the inventory database can only be updated once a day through a batch processing system.
"There's not the performance to update it in real time," Kelly said. "With our OLTP engine, you can update things in real time."
Kelly pointed to one early user of the technology, online gaming company Bwin, which by updating to SQL Server 2014 managed to increase the number of transactions on its SQL Server-based Web application, from 15,000 transactions per second to 250,000 transactions per second. "For them, each transaction is money," Kelly said.
More details will be given about the new features in SQL Server 2014 in the months ahead. A preview of SQL Server 2014 will be available by the end of the month and the full production version will be available for purchase in early 2014.
Microsoft did talk about how it has made it easy for users of this RDMS to use Microsoft's new Azure-based backup and replication services. The SQL Server Management Studio will include the option to periodically back up a database to the Azure blob storage. An administrator can also configure a secondary live copy of SQL Server on Azure.
"So every time a transaction is committed on Azure, it is committed on premise," Kelly said.
In this case, the data is copied asynchronously, so performance of the primary database will not be slowed waiting to complete the backup. A system could also load-balance traffic between the Azure database and the on-premises database, Kelly said.