Nine questions for evaluating data-mining tools

Check it out before you buy

1. Can existing data and infrastructure support the proposed data-mining tool?

Some solutions require a separate server for analysis, which adds server hardware and software licensing costs. Other solutions run entirely on the desktop, but may not scale well to large data sets.

2. Are there adequate data preparation tools?

What tools does the data-mining solution include to help construct the data-mining database? Pre-mining data preparation entails considerable effort. Consider choosing a solution that includes cleansing tools. Does that solution also offer transformation, integration, and load capabilities? These tools simplify the creation of the data-mining database.
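
To make the cleansing, transformation, and load steps concrete, here is a minimal sketch using only Python's standard library. The sample data, the column names (customer_id, region, spend), and the helper names clean_rows and load are assumptions made for illustration; a commercial tool would wrap equivalent steps in its own interface.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract: stray spaces, mixed case, a duplicate, a missing value.
RAW = """customer_id,region,spend
1, north ,120.50
2,South,
2,South,
3,EAST,87
"""

def clean_rows(text):
    """Cleanse and transform raw CSV text into (id, region, spend) tuples."""
    seen = set()
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        cid = int(rec["customer_id"])
        if cid in seen:  # cleansing: drop repeated customer records
            continue
        seen.add(cid)
        region = rec["region"].strip().lower()  # transformation: normalize text
        raw_spend = rec["spend"].strip()
        spend = float(raw_spend) if raw_spend else None  # missing value becomes NULL
        rows.append((cid, region, spend))
    return rows

def load(rows):
    """Load the prepared rows into an in-memory mining database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers "
        "(customer_id INTEGER PRIMARY KEY, region TEXT, spend REAL)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn
```

Even this toy version has to decide how to treat duplicates and missing values, which is the kind of effort a bundled preparation tool is meant to reduce.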

3. How is data accessed?

Some tools force data to be extracted from its source and put into the solution's proprietary format. Others support direct access to the data sources.
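
The difference between the two access styles can be sketched as follows. This is an illustration only: an in-memory SQLite table stands in for the data source, CSV text stands in for a tool's proprietary format, and the table and function names are hypothetical.

```python
import csv
import io
import sqlite3

def make_source():
    """Build a small stand-in data source (hypothetical orders table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 10.0), (2, 25.5), (3, 4.5)],
    )
    return conn

def extract_then_analyze(conn):
    """Extraction style: copy every row out of the source, then analyze the copy."""
    buf = io.StringIO()
    csv.writer(buf).writerows(conn.execute("SELECT order_id, amount FROM orders"))
    buf.seek(0)
    return sum(float(amount) for _, amount in csv.reader(buf))

def analyze_in_place(conn):
    """Direct-access style: let the source compute the answer; no copy is made."""
    (total,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()
    return total
```

Both functions return the same total for this sample. The extraction path, however, creates a second copy of the data that must be stored and refreshed whenever the source changes.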

4. Which models are supported?

Some data-mining tools and solutions support only one or two modeling types. Will this be enough to support analysis of your business problems?

5. Can the solution be integrated with third-party tools?

What type of support is provided to integrate other tools, such as an OLAP solution, into the data-mining environment? Or is integration support limited to the supplier’s tools? How well will this mesh with the tools you currently use?

6. What's the mining output?

Data-mining results have to be decipherable to be worth the time and effort needed to obtain them. Consider tools that produce charts and graphs that direct you toward specific actions (e.g., new business rules). What types of reporting does the data-mining solution provide? Can results be exported to create reports using a third-party product?

7. How burdensome is model maintenance?

What APIs (application programming interfaces) does the data-mining solution support? Supported interfaces let you adjust the model as needs change, which can improve results. A tool that lacks model flexibility may require custom coding.

8. How well does the solution scale?

Is the solution designed for single-processor or multiprocessor environments? Is there a maximum number of data elements, or are data-mining activities constrained only by available memory, processor capacity, and disk space?
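
For tools limited by memory, a back-of-the-envelope estimate can show whether a data set is likely to fit. The sketch below assumes a flat table of fixed-width values (8 bytes each by default) and reserves half of available memory as working headroom. Both figures are illustrative assumptions, not vendor guidance, and the function names are hypothetical.

```python
def estimated_bytes(rows, columns, bytes_per_value=8):
    """Rough size of a flat table: rows x columns x bytes per value."""
    return rows * columns * bytes_per_value

def fits_in_memory(rows, columns, available_bytes, bytes_per_value=8, headroom=0.5):
    """True if the estimate stays within the usable share of available memory."""
    usable = available_bytes * headroom
    return estimated_bytes(rows, columns, bytes_per_value) <= usable

GIB = 1024 ** 3

# 10 million rows x 50 columns x 8 bytes = 4,000,000,000 bytes (about 3.7 GiB)
small_ok = fits_in_memory(10_000_000, 50, 16 * GIB)

# 100 million rows x 50 columns x 8 bytes = 40,000,000,000 bytes (about 37 GiB)
large_ok = fits_in_memory(100_000_000, 50, 16 * GIB)
```

Under these assumptions the first data set fits on a 16 GiB machine and the second does not. Real tools add their own overhead for indexes and intermediate results, so treat the estimate as a lower bound.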

9. Will users be satisfied?

Does the solution offer role-based access? Can an executive log in and work with the data-mining interfaces as easily as a business analyst? What will be required of the administrator? How much training will be needed for administrators, analysts, and other end-users?