Oh boy! It appears that Microsoft’s glowing track record with Windows 7 is about to come to an abrupt and unceremonious end. According to various Web sources, the RTM build 7600.16385 includes a potentially fatal bug that, once triggered, could bring down the entire OS in a matter of seconds.
The bug in question -- a massive memory leak involving the chkdsk.exe utility -- appears when you attempt to run the program against a secondary (that is, not the boot partition) hard disk using the "/r" (read and verify all file data) parameter. The problem affects both 32- and 64-bit versions of Windows 7 and is classified as a "showstopper" in that it can cause the OS to crash (Blue Screen of Death) as it runs out of physical memory.
I tested for the bug against three different Windows 7 OS configurations on two different hardware platforms: an Intel Atom-based netbook running the 32-bit version, an Intel Core 2 Duo notebook running the 64-bit version, and a VMware Workstation 6.5.2 virtual machine running the 32-bit version.
In each case, the utility executed the first three stages of the test correctly using modest amounts of memory (several hundred megabytes). Then, when it entered the fourth stage (a read test), the chkdsk.exe utility's memory consumption started to climb rapidly until several gigabytes had been allocated to its process and the test systems in question began to run out of memory.
Note: I did not succeed in causing the systems to “blue screen” as others have reported. However, I did observe chkdsk.exe consume up to 90 percent of the available physical memory on a 2GB VMware virtual machine. After that, the utility appeared to hang while all other operations in the OS slowed to a crawl for lack of RAM.
Fix it. Re-RTM it. Done.
Better now than later!
As much as I love to bash M$oft, this does strike me as the ideal time to find this kind of bug. However, I doubt it is as simple as speedman thinks. M$oft has a habit of building ugly architectures that encourage their coders to rely on external code that is often complex enough that unanticipated side effects eat their lunch. Or, as in this case, they are left at the mercy of behaviors in code of completely unknown vetting, since a non-M$ofty wrote it.
Still, it is just another RTM cycle and they have a huge code base from which to cobble something together. With a little midnight oil, I doubt it will even have much schedule impact.
What I do know is that massive resource hogging is fairly typical of Microsoft Windows 64 bit operating systems. A friend of mine has Vista-64 running on a laptop, and the amount of disk usage grew massively for a month or so, then started going down and up in what appeared to be a cycle. In a shorter-term cycle (per session), Physical Memory usage follows a similar pattern, even without the sort of caching several commenters have suggested.
Whether Intel or Microsoft has something to patch, it all can be done in plenty of time for the October release date for Windows 7. For myself, I'd wait a few months and find out who screams the loudest about which "features", aka bugs, in the new OS version. Then I'd buy in the second round of sales promotions.
Of course, that's for consumer-level buyers. A corporate IT Department will have to be more careful and wait longer, perhaps as Randall says, for Service Pack 1 or even SP2.
I saw that post as a link provided by Woody Leonhard in his Askwoody.com Windows patch watch blog. Yes, Randall has overstated what is really a highly technical aspect of chkdsk /r. Randall says that he runs this command for every drive he is about to commit to any critical task, before actually trying to use the disk. That may be Best Practices for an IT Development Shop, but most business, IT and Home users will never have any reason to use this parameter from the Command Line. The Windows 7 sky is not falling. And the memory usage is not a Memory Leak -- it is deliberate and limited, always designed to leave the System with at least 50 MB of RAM overhead, which usually prevents the BSOD crash.
If you want to experience a true Memory Leak, try running the freeware antispyware product Super Antispyware. Run its updater on a fully-patched Windows XP Pro SP3 computer with limited RAM and a single-core processor, running IE8 fully patched. What results is a kernel driver memory leak and a true BSOD. At least that's what the Microsoft crash report output page says.
IBM i (formerly OS/400) and z/OS for IBM's Power systems and Mainframe. That's two operating systems for you that aren't going to be brought down by renegade apps. They're server-class OSes only, but I point them out to note that the capability of an OS to maintain stability and availability can be achieved.
Even if you run that command, with that option, against a second physical hard drive, you (and your system) will be fine.
chkdsk.exe performs exactly like it is supposed to. While it uses a lot of RAM it does so to speed up the repair. And it has been designed to not use *all* of the RAM, only the available, physical RAM minus some 50MB. It respects already allocated RAM, and the system remains responsive during the operation.
See my comment below for more info.
Please everyone update your info on this issue before commenting.
There is no bug in chkdsk.exe. There is no memory leak. chkdsk.exe does take up a lot of memory when run with the /R option against a non-system partition/drive. This is a deliberate design decision to help speed up the repair process.
Per Microsoft, chkdsk.exe will allocate as much as possible of available physical memory, leaving at least 50MB of physical memory free during the operation with the /R option.
In other words, chkdsk will not allocate memory unbounded. Rather, it will measure how much physical memory is available at the time of launch and allocate memory based on that measure. This means that even if the command is invoked on a running server, it will not cause the server to start thrashing, as it will respect the memory allocations already made - and then some.
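To make the arithmetic concrete, here is a minimal Python sketch of the policy as described above: take whatever physical memory is available at launch and leave a 50MB reserve. It is only a model of the stated behavior, not Microsoft's code; the function name and the example figures (1800 MB and 600 MB available) are made up for illustration.

```python
RESERVE_MB = 50  # headroom figure quoted above; everything else here is hypothetical


def chkdsk_budget_mb(available_mb: int, reserve_mb: int = RESERVE_MB) -> int:
    """Upper bound on what chkdsk /r would allocate under the described policy."""
    # Never go below zero if less than the reserve is available at launch.
    return max(available_mb - reserve_mb, 0)


# Mostly idle machine: 1800 MB of physical memory available at launch.
idle_budget = chkdsk_budget_mb(1800)

# Busy machine: other applications already hold RAM, so only 600 MB is available.
busy_budget = chkdsk_budget_mb(600)

print(idle_budget, busy_budget)
```

The point of the model is that the ceiling moves with what is free at launch, which is different from a leak, where usage grows without any such bound.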
I have confirmed this on my own system. It's simple and anyone can run the experiment: First run chkdsk /r on a non-system disk with no other apps started. Use Resource Monitor to see how RAM allocations ramp up. When they level out, take note of how much RAM is used by chkdsk.exe. When the process completes, repeat the experiment, but this time start a lot of memory-hungry applications and let them allocate memory before launching chkdsk.exe /r. Note how chkdsk.exe uses less memory during this second run. The memory allocated by chkdsk.exe clearly depends on how much RAM is available at the time of launch.
Incidentally, my computer remained responsive during both tests. I was able to launch new programs, open Word, take screenshots (many), paste them into Word etc. While I could tell that the system was working, it did not feel sluggish at all.
Now, the crash (BSOD) was reported by a single user who mistakenly assumed that it was connected to the operation of chkdsk.exe (because it happened while he was running chkdsk.exe). This has now been determined to be a chipset driver issue. Said user has updated drivers from the motherboard manufacturer and has reported back that the crash issue was solved.
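For anyone who would rather log numbers than watch Resource Monitor, the experiment can be sketched in Python by polling the built-in Windows tasklist command. This is an assumption-laden sketch, not what I actually ran: the helper names are invented, the polling interval is arbitrary, and the parsing assumes tasklist's CSV output ends with a "Mem Usage" column such as "1,234,567 K".

```python
import csv
import io
import subprocess
import time
from typing import List, Optional


def parse_mem_kb(csv_line: str) -> int:
    """Return the last CSV column (e.g. '1,234,567 K') as an integer number of KB."""
    row = next(csv.reader(io.StringIO(csv_line)))
    # Keep digits only, so thousands separators and the ' K' suffix are ignored.
    return int("".join(ch for ch in row[-1] if ch.isdigit()))


def sample_chkdsk_kb() -> Optional[int]:
    """Ask tasklist for chkdsk.exe; return its memory usage in KB, or None if not running."""
    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq chkdsk.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True,
    ).stdout.strip()
    if not out.startswith('"'):  # tasklist prints an INFO message when nothing matches
        return None
    return parse_mem_kb(out.splitlines()[0])


def watch(interval_s: float = 5.0) -> List[int]:
    """Wait for chkdsk.exe to appear, then sample until it exits."""
    samples: List[int] = []
    while True:
        kb = sample_chkdsk_kb()
        if kb is None:
            if samples:        # chkdsk was seen earlier and has now finished
                return samples
        else:
            samples.append(kb)
        time.sleep(interval_s)


def peak_kb(samples: List[int]) -> int:
    """Highest memory figure seen during a run (0 if nothing was sampled)."""
    return max(samples, default=0)
```

Call watch() in one console, start chkdsk /r in another, and compare peak_kb() between the idle run and the run with other applications loaded. The second peak should come out lower if the allocation really depends on available RAM at launch.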
In conclusion: There is no bug in chkdsk.exe. There is no bug in the NTFS driver stack. chkdsk.exe has been optimized to finish ASAP by allocating as much memory as possible with as minimal impact as possible on other running processes. There may be a chipset driver issue in an earlier version of a 3rd party driver. This issue is not found in current drivers.
Now, what remains to be asked is this:
Even if there had been a "massive memory leak" as originally reported, how can anyone claim that a memory leak in a rarely used utility (chkdsk.exe), only in a more rarely used option, only in an even more rarely occurring scenario (repairing a non-system volume on a multi-volume system) amounts to a "showstopper" bug which risks derailing the Windows 7 launch? It is hilarious! Look at who made that claim and his posting history. Not to mention his totally unsubstantiated assertion about a bug in the "NTFS driver stack". WTF? Mr. Randall C. Kennedy is either grossly incompetent or an extremely cynical professional troll. Either way, FAIL!

