Dr. Jesper Johannson, in his series of "Great Debates: Passwords vs. Passphrases" papers, found that 80 percent of the passwords he surveyed shared the same 32 characters. With these sorts of rules, which have been validated by every password guesser writing on the subject in the last decade, an attacker can reduce a very large password space to a much smaller set of likely, far-from-random passwords. (My friend Mark Burnett, author of "Perfect Passwords," conducted a great study on password character distribution.) Bruce Schneier and I did a similar study on tens of thousands of passwords stolen from MySpace, which you can read about in an earlier Security Adviser post and on Bruce Schneier's site.
The measure of a password's randomness is called its entropy. Reading about password entropy in detail will almost hurt your brain. To simplify, a password such as "password" or "12345678" has almost no entropy; a password such as "vB%&7P" has higher-than-average entropy. However, it's nearly impossible to calculate a password's true entropy without examining it along with every other choice in the password space. Still, several people have made intelligent guesses, and the password entropy model you rely on will have a significant impact on the overall success of your password policy defense.
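To make the idea concrete, here is a minimal Python sketch of the idealized upper bound: if every character were chosen uniformly at random, entropy in bits would be the password length times log2 of the character set size. The function name and the figure of 94 printable keyboard characters are assumptions for illustration; the 32-character case echoes the survey figure mentioned earlier.

```python
import math

def uniform_entropy_bits(length, charset_size):
    """Upper-bound entropy, in bits, of a password whose characters are
    each drawn uniformly at random from charset_size symbols."""
    return length * math.log2(charset_size)

# Assumption: 94 printable, non-space ASCII characters on a US keyboard.
full_keyboard = uniform_entropy_bits(8, 94)

# If users effectively draw from only 32 characters, the space shrinks.
common_subset = uniform_entropy_bits(8, 32)

print(f"8 chars from 94 symbols: {full_keyboard:.1f} bits")
print(f"8 chars from 32 symbols: {common_subset:.1f} bits")
```

Note that this formula scores "password" exactly the same as any other eight lowercase letters, which is why an idealized model overstates the strength of human-chosen passwords.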
Enter the spreadsheet
In the password-guessing calculator spreadsheet, I allow the user to choose among four different password entropy models: one developed by NIST (National Institute of Standards and Technology); one developed by C.E. Shannon, the noted language entropy researcher; one developed by Johannson; and the Perfect Entropy model. I think the most accurate password entropy model comes from NIST (NIST Special Publication 800-63, Appendix A, Table A.1, p.53), but I include the other models for reference, and there is room defined for more. It's important to point out that all the included password entropy models are probably flawed, but they are the best we can do without knowing all the real-life passwords in a given password space. In my experience, even the NIST model is too conservative and overestimates the guessing effort required.
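As a rough illustration of how such a model assigns bits, the following Python sketch follows the rules of thumb in Appendix A of NIST SP 800-63 for user-chosen passwords: 4 bits for the first character, 2 bits each for characters 2 through 8, 1.5 bits each for characters 9 through 20, and 1 bit for each character after that, plus a 6-bit bonus when a composition rule is enforced. The function name is hypothetical, the dictionary-check bonus is deliberately left out, and nothing here is taken from the spreadsheet itself, so treat it as one reading of the NIST guidance rather than the spreadsheet's formula.

```python
def nist_entropy_bits(length, composition_rule=False):
    """Estimate entropy (bits) of a user-chosen password of the given
    length, per the per-character rules of thumb in NIST SP 800-63,
    Appendix A. The dictionary-check bonus is omitted in this sketch."""
    bits = 0.0
    for position in range(1, length + 1):
        if position == 1:
            bits += 4.0      # first character
        elif position <= 8:
            bits += 2.0      # characters 2 through 8
        elif position <= 20:
            bits += 1.5      # characters 9 through 20
        else:
            bits += 1.0      # characters 21 and beyond
    if composition_rule and length > 0:
        bits += 6.0          # bonus for enforced complexity rules
    return bits

for n in (6, 8, 12, 20):
    print(n, nist_entropy_bits(n), nist_entropy_bits(n, composition_rule=True))
```

Under these rules an eight-character user-chosen password is credited with only 18 bits, far below what a uniformly random eight-character password would earn.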
In the spreadsheet, the user inputs his or her password policy (length, character set, maximum age, and whether complexity is enabled), selects an entropy model, and enters the number of guesses per minute that an attacker can attempt.
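The spreadsheet's actual formulas are not reproduced here, but the basic arithmetic behind this kind of calculator can be sketched as follows. This Python example assumes that a password with H bits of entropy sits in a space of 2^H equally likely candidates, that an attacker needs to search half of that space on average, and that the guessing rate is constant. All function and parameter names are hypothetical.

```python
def minutes_to_exhaust(entropy_bits, guesses_per_minute):
    """Minutes needed to try every candidate in a 2**entropy_bits space."""
    return 2 ** entropy_bits / guesses_per_minute

def survives_max_age(entropy_bits, guesses_per_minute, max_age_days):
    """True if the expected (average-case) guessing time exceeds the
    password's maximum age, i.e. the password is likely to be changed
    before an attacker guessing at this rate finds it."""
    # Assumption: on average the attacker searches half the space.
    expected_minutes = minutes_to_exhaust(entropy_bits, guesses_per_minute) / 2
    max_age_minutes = max_age_days * 24 * 60
    return expected_minutes > max_age_minutes

# Example: an 18-bit password, 1,000 guesses per minute, 90-day maximum age.
print(minutes_to_exhaust(18, 1000))
print(survives_max_age(18, 1000, 90))
```

Varying the entropy figure while holding the guess rate and maximum age fixed shows how strongly the choice of entropy model drives the verdict on a given policy.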