I've said it before and I'll say it again: Data science is a team sport.
The gold rush has started and no one will question the wisdom of buying a random acre of land with a stream and searching for your very own gold nugget -- or in this case, a data scientist. Gosh, there are a lot of articles on what makes a good data scientist.
Enough of that: I'd rather focus on what makes a bad data scientist who has the potential to harm rather than help your organization:
1. Weak mathematical background
With very few exceptions, data scientists are, at their core, math geeks. They fall on a spectrum, from total math types who write terrible Python (and maybe R) to folks who can pop machine learning algorithms off the top of their heads. You may need both depending on what you're doing. But a data scientist with a weak mathematical background probably isn't a real data scientist. Maybe they're a data architect or data engineer, but they're more likely a consultant from a staffing firm. This person won't help you. A weak mathematical background can hurt in a lot of ways -- particularly in judging whether the results you're getting are useful.
2. Weak computing background
Data scientists who are mathematicians but don't really understand computers aren't terribly useful (in the same way an executive assistant who uses a typewriter isn't terribly useful in the modern world). In plenty of circumstances, the way you'd calculate something on paper isn't the same as how you'd calculate it using a distributed platform like Spark. Your data scientist needs to understand this.
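To make that concrete, here is a minimal sketch in plain Python of how a variance calculation changes shape once the data is split across machines. It is not Spark's API; the names `Summary`, `summarize`, `merge`, and `distributed_variance` are hypothetical, and the chunks are assumed to be non-empty. On paper you would compute the mean over everything and then sum the squared deviations. In a distributed setting each partition can only see its own rows, so it produces a small summary, and the summaries are combined pairwise.

```python
from dataclasses import dataclass
from functools import reduce
from statistics import fmean, pvariance


@dataclass(frozen=True)
class Summary:
    """Per-partition statistics (hypothetical helper type for this sketch)."""
    n: int        # number of values seen
    mean: float   # mean of those values
    m2: float     # sum of squared deviations from that mean


def summarize(chunk):
    """What one worker would compute from its own (non-empty) partition."""
    mean = fmean(chunk)
    m2 = sum((x - mean) ** 2 for x in chunk)
    return Summary(len(chunk), mean, m2)


def merge(a, b):
    """Combine two partition summaries without revisiting the raw data."""
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    # The cross term corrects for the two partitions having different means.
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Summary(n, mean, m2)


def distributed_variance(chunks):
    """Population variance from partition summaries only."""
    total = reduce(merge, (summarize(c) for c in chunks))
    return total.m2 / total.n


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
chunks = [data[:3], data[3:5], data[5:]]  # stand-in for three partitions

paper_answer = pvariance(data)                  # the single-machine calculation
partitioned_answer = distributed_variance(chunks)
print(paper_answer, partitioned_answer)
```

Both routes should agree up to floating-point rounding. What matters is that `merge` never touches the raw values, which is what lets the work be spread across machines, and knowing why that cross term is needed is the kind of thing a mathematician with no computing background tends to miss.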
3. Too good to be true
At the same time, don't expect to find a data scientist who is a mathematician, statistician, and distributed computing developer, with an MBA and actual experience as a mathematician, distributed computing developer, business person, and so on. In the words of a friend: “How old are they -- 80?” This is why you need a team. When you see a data scientist who meets the “unicorn” definition, remember this simple rule: Unicorns do not exist!
4. Effete academics
Just like there are coders who don't code and architects with no actual technical expertise, there are data scientists with limited experience with actual, you know, data. Moreover, they don't want to get their hands dirty by digging in the code. We're talking practical application, not theory. You're not running a university.
5. Poor communication skills
Fundamentally, a data scientist is there to bring clarity to data. While you as a technology pro or business expert might not understand all of the math or be able to implement it yourself, you should understand it at least conceptually, at a high level, if you're going to trust the decision-making process. Whether it's a clustering algorithm, probability calculations, or NLP, this stuff isn't hard to convey. If your data scientist isn't making that happen, your data scientist is doing a bad job. Your data scientist needs to be approachable and make the process approachable. Also, the ability to communicate clearly with multiple groups at an organization to get adjunct information, data, or access to data -- and details on how the data was developed -- makes the work go much more smoothly.
6. No understanding of business problems
You really can't hire one person to restate all of your business problems in mathematical or statistical terms for your data scientist, who then "solves" them. Why? Because if person A knows how to do all that, he or she probably knows enough to describe it as an algorithm to a computer. Why do you need person B?
7. No familiarity with the tools of the trade
There is SAS. There is R. There is Scala. There is Python. There are Matlab and a bevy of other tools. If you don't see any of those on the resume, then that person probably isn't a data scientist.
8. The SAS-only syndrome
With all due respect for my friends in the Containment Area for Relocated Yankees (Cary, N.C.), it seems like all two-bit SAS developers have rebranded themselves "data scientists." But knowing how to write SAS code doesn't mean they know anything about data science (that is, knowing how to read the data).
What sort of person do you need? You need an individual with the specific skills to address the problem and augment the existing technology team: a mathematician with programming and analytics experience, business sense, and the ability to talk to CEOs and techs alike. Now, don't go chasing unicorns -- but don't settle for chumps, either.
Good luck; you're going to need it. How competitive is it to snag someone good? Try this experiment: Add "data scientist" to your LinkedIn profile and watch a million recruiters shower you with offers of riches.