User's Guide for IBM 15

Chapter 4

have been resolved adequately. Similarly, the evaluation phase can lead you to reevaluate your
original business understanding, and you may decide that you have been trying to answer the
wrong question. At this point, you can revise your business understanding and proceed through
the rest of the process again with a better target in mind.

The second key point is the iterative nature of data mining. You will rarely, if ever, simply
plan a data mining project, complete it, and then pack up your data and go home. Data mining to
address your customers’ demands is an ongoing endeavor. The knowledge gained from one cycle
of data mining will almost invariably lead to new questions, new issues, and new opportunities
to identify and meet your customers’ needs. Those new questions, issues, and opportunities can
usually be addressed by mining your data once again. This process of mining and identifying new
opportunities should become part of the way you think about your business and a cornerstone of
your overall business strategy.

This introduction provides only a brief overview of the CRISP-DM process model. For
complete details on the model, consult the following resources:

The CRISP-DM Guide, which can be accessed along with other documentation from the
\Documentation folder on the installation disk.

The CRISP-DM Help system, available from the Start menu or by clicking CRISP-DM Help on
the Help menu in IBM® SPSS® Modeler.

Types of Models

IBM® SPSS® Modeler offers a variety of modeling methods taken from machine learning,
artificial intelligence, and statistics. The methods available on the Modeling palette allow you
to derive new information from your data and to develop predictive models. Each method has
certain strengths and is best suited for particular types of problems.

The SPSS Modeler Applications Guide provides examples for many of these methods, along
with a general introduction to the modeling process. This guide is available as an online tutorial,
and also in PDF format. For more information, see the topic Application Examples in Chapter 1
on p. 5.

Modeling methods are divided into three categories:

Classification

Association

Segmentation

Classification Models

Classification models use the values of one or more input fields to predict the value of one or
more output, or target, fields. Some examples of these techniques are: decision trees (C&R Tree,
QUEST, CHAID and C5.0 algorithms), regression (linear, logistic, generalized linear, and Cox
regression algorithms), neural networks, support vector machines, and Bayesian networks.

Classification models help organizations to predict a known result, such as whether a customer
will buy or leave or whether a transaction fits a known pattern of fraud. Modeling techniques
include machine learning, rule induction, subgroup identification, statistical methods, and multiple
model generation.
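
As a rough illustration of the general idea only, the following Python sketch shows how the values of input fields can be used to predict a target field. It is not how IBM® SPSS® Modeler implements any of its algorithms. It fits a one-level decision tree (a "decision stump"), which is far simpler than C&R Tree, QUEST, CHAID, or C5.0. The records, the field names (age, visits, buys), and the helper functions are all hypothetical and were invented for this example.

```python
from collections import Counter

# Hypothetical training records: two input fields (age, visits)
# and one target field (buys). The data is invented for illustration.
records = [
    {"age": 25, "visits": 1, "buys": "no"},
    {"age": 32, "visits": 4, "buys": "yes"},
    {"age": 47, "visits": 5, "buys": "yes"},
    {"age": 51, "visits": 2, "buys": "no"},
    {"age": 38, "visits": 6, "buys": "yes"},
    {"age": 29, "visits": 2, "buys": "no"},
]


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, inputs, target):
    """Find the single (field, threshold) split with the fewest training errors."""
    best = None
    for field in inputs:
        for threshold in sorted({r[field] for r in rows}):
            left = [r[target] for r in rows if r[field] <= threshold]
            right = [r[target] for r in rows if r[field] > threshold]
            if not left or not right:
                continue  # a split must put records on both sides
            left_label, right_label = majority(left), majority(right)
            errors = (sum(label != left_label for label in left)
                      + sum(label != right_label for label in right))
            if best is None or errors < best["errors"]:
                best = {"field": field, "threshold": threshold,
                        "left": left_label, "right": right_label,
                        "errors": errors}
    return best


def predict(stump, row):
    """Predict the target value for one record from its input fields."""
    if row[stump["field"]] <= stump["threshold"]:
        return stump["left"]
    return stump["right"]


stump = fit_stump(records, ["age", "visits"], "buys")
prediction = predict(stump, {"age": 40, "visits": 5})
print(stump["field"], stump["threshold"], prediction)
```

On this small data set the sketch splits on the visits field, because that single split separates the two target values without error. Real classification methods build deeper trees or fit other model forms, and they are evaluated on data that was not used for training.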