WO2001006396A2

WO2001006396A2 - Data mining software to determine customer potential

Info

Publication number: WO2001006396A2
Application number: PCT/US2000/018843
Authority: WO
Inventors: Yuchun Lee; Robert Crites
Original assignee: Unica Technologies Inc
Current assignee: Unica Corp
Priority date: 1999-07-16
Filing date: 2000-07-11
Publication date: 2001-01-25
Anticipated expiration: 2002-01-16
Also published as: AU5927500A; WO2001006396A8

Abstract

A method of determining a prospect's potential value as a customer is described. The method scores a group of prospects on a first valuation model that models prospects that are marketed to in a first manner and a second valuation model that models prospects that are marketed to in a second, different manner. The method determines a difference in scores of the group of prospects scored on the first valuation model and the second valuation model. The method can determine a prospect's true value as a customer by determining a prospect's potential value and adding the prospect's potential value to an estimate of the prospect's current value.

Description

DATA MINING SOFTWARE TO DETERMINE CUSTOMER POTENTIAL

BACKGROUND

This invention relates to data mining software.

Data mining software extracts knowledge that may be suggested by a set of data. For example, data mining software can be used to maximize a return on investment in collecting marketing data, as well as other applications such as credit risk assessment, fraud detection, process control, medical diagnoses and so forth. Typically, data mining software uses one or a plurality of different types of modeling algorithms in combination with a set of test data to determine what types of characteristics are most useful in achieving a desired response rate, behavioral response or other output from a targeted group of individuals represented by the data. Generally, data mining software executes complex data modeling algorithms such as linear regression, logistic regression, back propagation neural network, Classification and Regression Trees (CART) and Chi squared Automatic Interaction Detection (CHAID) decision trees, as well as other types of algorithms on a set of data.

One objective in using data mining software is to identify customers or prospects for target marketing. One concept used in data mining software is determining the value of a customer. By most practices, value is essentially equal to current value. For example, a common practice is to consider anybody who purchased from a company most recently and spent the most money with the company to be a customer of high value. Generally, marketing organizations will target people that are identified as having a past history of frequent or high value purchasing. Such persons are considered the best customers of a marketing organization. These best customers are often aggressively marketed.

SUMMARY

A flaw exists with the technique of targeting customers by assuming that their value is equal to their current value. That commonly used technique can burn the customer out. If a valued customer is identified, the marketing organization will keep marketing to the valued customer, who may keep buying. Eventually the valued customer may become annoyed and/or expend all of its available resources.

According to an aspect of the present invention, a method of determining a prospect's potential value as a customer includes scoring a group of prospects on a first valuation model that models prospects that are marketed to in a first manner and a second valuation model that models prospects that are marketed to in a second, different manner and determining a difference in scores of the group of prospects scored on the first valuation model and the second valuation model.

According to an additional aspect of the present invention, a method of determining a prospect's true value as a customer includes determining a prospect's potential value and adding the prospect's potential value to an estimate of the prospect's current value to provide the prospect's true value.

According to an additional aspect of the present invention, a computer program product for determining a prospect's true value as a customer includes instructions for causing a computer to determine a prospect's potential value and add the prospect's potential value to an estimate of the prospect's current value to provide the prospect's true value.

One or more of the following advantages may be provided by one or more aspects of the invention. This invention takes the approach that value does not simply equate to current value and thus the customer with the highest current value may not actually be the best customer to aggressively pursue in a new marketing campaign. Rather, a true value is determined. The true value is related to a customer's current value and the customer's potential value. The data mining software includes a process to determine potential value. The concept of potential value adds a new dimension to how marketing is done. A fundamental approach used in the data mining software is that the software determines a customer's true value to a company based on an estimate of a current value which is observable and a determined potential value component. The data mining software uses an algorithm that scores prospects based on their calculated "true value" which is equal to current value plus potential value (e.g., untapped potential).

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a computer system executing data mining software that determines true value of a customer.

FIG. 2 is a block diagram of a data set.

FIG. 2A is a diagram of a record.

FIG. 3 is a block diagram of a training process for the data mining software used in FIG. 1.

FIG. 4 is a flow chart of the data mining software that determines customer potential.

FIG. 5 is a flow chart of an alternative technique to determine customer potential value.

FIG. 6 is a flow chart of another alternative technique to determine customer potential value.

DESCRIPTION

Referring now to FIG. 1, a computer system 10 includes a CPU 12, main memory 14 and persistent storage device 16 all coupled via a computer bus 18. The system 10 also includes output devices such as a display 20 and a printer 22, as well as user input devices such as a keyboard 24 and a mouse 26. Not shown in FIG. 1 but necessarily included in a system of FIG. 1 are software drivers and hardware interfaces to couple all the aforementioned elements to the CPU 12.

The computer system 10 also includes data mining software. The data mining software 30 may reside on the computer system 10 or may reside on a server 28, as shown, which is coupled to the computer system 10 in a conventional manner such as in a client-server arrangement. The details on how this data mining software is coupled to this computer system 10 are not important to understand the present invention. Generally, data mining software 30 executes complex data modeling algorithms such as linear regression, logistic regression, back propagation neural network, Classification and Regression Trees (CART) and Chi squared Automatic Interaction Detection (CHAID) decision trees, as well as other types of algorithms that operate on a data set. Also, the data mining software 30 can use any one of these algorithms with different modeling parameters to produce different results. The data mining software 30 can render a visual representation of the results on the display 20 or printer 22 to provide a decision maker with the results. The results that are returned can be based on different algorithm types or different sets of parameters used with the same algorithm. One type of result that the software can return is in the form of a lift chart. The results can also be returned without a visual depiction of the results such as the score itself, calculating an RMS value, and so forth. One approach is to render a graph or other visual depiction of the results. A preferred arrangement for providing such lift curves on a lift chart is described in U.S. Patent Application, serial no. 09/176,370 filed October 12, 1998, entitled "VISUAL PRESENTATION TECHNIQUE FOR DATA MINING SOFTWARE" by Yuchun Lee et al., which is incorporated herein by reference and is assigned to the assignee of the present invention.

The data mining software 30 described below determines a true value of a customer. The data mining software includes a potential value determination process 32 that allows for execution of multiple evaluation models. The models are trained by evaluating different groups of randomly selected customers sampled from a larger data set. The value determining process 32 produces a true value based on a current value that can be observed and a calculated potential value. There are several different algorithms that can be used to determine potential value.

Referring now to FIGS. 2 and 2A, a data set 50 includes a plurality of records 51. The data set 50 often includes a very large number of such records 51. The records are divided into groups 52a-52b whose members are randomly selected from the larger data set 50. The records 51 (FIG. 2A) can include an identifier field 53a, as well as one or a plurality of fields 53b corresponding to input variable values that are used in the value determining process 32. The records 51 also include one or more result fields 53c that are used by the value determining process 32 to record scores for the record that measure the current value and potential value and determine the true value of a prospect represented by the record. For example, for a record 51a, the result fields 53c include a current value field 64a, a potential value field 65a and a true value field 66a. The data mining software 30 or user randomly partitions this data set 50 into a series of data segments, i.e., at least two 52a, 52b.
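By way of illustration only, the record layout and random partitioning described above might be sketched in Python as follows. The names `Record` and `partition_records`, the even two-way split and the fixed seed are assumptions made for this sketch and are not part of the described software.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Record:
    """One record 51: identifier, input variables, and result fields."""
    identifier: str                                          # field 53a
    inputs: Dict[str, float] = field(default_factory=dict)   # fields 53b
    current_value: Optional[float] = None                    # result field 64a
    potential_value: Optional[float] = None                  # result field 65a
    true_value: Optional[float] = None                       # result field 66a


def partition_records(records: List[Record],
                      seed: int = 0) -> Tuple[List[Record], List[Record]]:
    """Randomly split the data set into two groups (52a, 52b)."""
    shuffled = list(records)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # seeded only to make the sketch repeatable
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]
```

Each record appears in exactly one of the two returned groups, so the groups can later receive different marketing treatments without overlap.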

In one embodiment, a test marketing campaign is conducted. Individuals represented by the records of one of the groups, e.g., group 52a, are treated differently from individuals represented by the records of another one of the groups, e.g., group 52b. Treated differently means that one group, e.g., the first group 52a, can be exposed to intensive, i.e., aggressive marketing 57, whereas the second group 52b could be exposed to normal or less intensive marketing 59.

Examples of aggressive marketing would include high incentives such as strong discounts or premiums, or higher cost collateral materials.

Referring now to FIG. 3, value determining process 32 includes multiple valuation models 60a-60b (at least two) trained using the results of the test marketing directed at the respective groups of customers 52a-52b (FIG. 2). The individual multiple models 60a-60b are trained with data for the respective one of the groups 52a-52b. The groups 52a-52b are used to train the models based on the marketing results. For example, a customer who spends $1,000 every year for the last 5 years, may be considered to have a current value of $1,000 per year. But, the potential value of this customer may not be $1,000.

Referring now to FIG. 4, the value determining process 32 determines the potential value V_P of a customer. The potential value is a measure of how much more this person can spend if the person is aggressively marketed. The value determining process 32 estimates the potential value by scoring each record 51 of the dataset 50 through a valuation model 60a for the special group and a valuation model 60b for the normal group. Each customer record 51 scored through models 60a and 60b yields scores S1 and S2. The potential for each customer is determined 66 by taking the difference in the scores e.g., S1-S2. In other words, the potential value i.e., untapped potential V_P is related to the difference between how well a prospect matches to the profiles of the specially treated group and the normally treated group. The determined score is a measure of the customer's potential value.

From the potential value the true value can be determined 68 as:

$$\text{true value} = \text{current value} + \text{potential value} \qquad \text{(Equation 1)}$$

Since the evaluation process 60 uses valuation models, the models return potential values as amounts of money, e.g., dollar amounts, i.e., a measure of the amount of money that a prospect potentially may spend.
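By way of illustration only, the score difference S1-S2 and the sum of current value and potential value described above might be sketched in Python as follows. The two stand-in models, their coefficients and the feature name `past_spend` are hypothetical; in practice the models would be trained on the results for groups 52a and 52b.

```python
from typing import Callable, Dict

Features = Dict[str, float]
ValuationModel = Callable[[Features], float]


def potential_value(record: Features, model_special: ValuationModel,
                    model_normal: ValuationModel) -> float:
    """Potential value as the score difference S1 - S2."""
    s1 = model_special(record)   # score under the aggressively marketed profile
    s2 = model_normal(record)    # score under the normally marketed profile
    return s1 - s2


def true_value(current_value: float, potential: float) -> float:
    """True value = current value + potential value."""
    return current_value + potential


# Hypothetical stand-in models with made-up coefficients.
def model_special(r: Features) -> float:
    return 200.0 + 1.5 * r["past_spend"]


def model_normal(r: Features) -> float:
    return 100.0 + 1.0 * r["past_spend"]


customer = {"past_spend": 1000.0}
v_p = potential_value(customer, model_special, model_normal)   # 1700.0 - 1100.0
v_t = true_value(1000.0, v_p)
```

With these made-up models, a customer whose current value is $1,000 scores $1,700 and $1,100 on the two models, giving a potential value of $600 and a true value of $1,600.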

Referring now to FIG. 5, in some circumstances a marketing organization may not want to go to the cost of a test marketing campaign. Therefore another potential determining process 70 uses the concept that customers with the highest potential are those that most resemble current best customers. In this scenario, the data mining software 30 models the probability density of each customer over the possible values that the customer might take on. The customer's potential can be approximated by integrating over a density function, e.g., multiplying the probability at each value by that value and integrating over the whole density of the customer's potential value.

An algorithm that accomplishes this is set forth in equation 2 below.

$$\text{potential} = \alpha \left( \int p(v)\, v \, dv - C \right) \qquad \text{(Equation 2)}$$

where α is a scaling factor that can be tuned empirically, p(v) is the probability density associated with value v, and C is the current value.
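By way of illustration only, Equation 2 might be approximated numerically as in the following Python sketch. The trapezoid rule, the integration bounds and the uniform example density are assumptions chosen for the sketch; the description does not prescribe a particular integration method.

```python
from typing import Callable


def potential_from_density(p: Callable[[float], float], current_value: float,
                           alpha: float, v_min: float, v_max: float,
                           steps: int = 1000) -> float:
    """Approximate alpha * (integral of p(v) * v dv - C) with the trapezoid rule."""
    h = (v_max - v_min) / steps
    total = 0.0
    for i in range(steps + 1):
        v = v_min + i * h
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * p(v) * v
    expected_value = total * h
    return alpha * (expected_value - current_value)


# Hypothetical density: uniform over [0, 2000], so the expected value is 1000.
def uniform_density(v: float) -> float:
    return 1.0 / 2000.0 if 0.0 <= v <= 2000.0 else 0.0
```

For this uniform density, a current value C of 400 and a scaling factor of 0.5, the sketch yields a potential of approximately 0.5 × (1000 − 400) = 300.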

There are many ways to model the density functions. One possible way to build the probability density model would be to divide up the range of values into segments, and build j models, each learning a probability for one segment. In this technique the value determining process 70 builds models 70a-70j for each of a plurality of segments. The evaluation process 70 therefore has a model 70a for the top 10%, a second model 70b for between 10 and 20%, a third model 70c between 20 and 30% and so forth. The models each learn a segment of the probability density of possible customer values given customer attributes, p(v). The outputs of the j models are used to build 76 a probability density. The outputs are combined and renormalized to produce a composite probability density model. Other examples of non-parametric density function estimators include kernel estimators, orthogonal series estimators, histograms, splines, and mixture models.

The training used in the evaluation process 70 examines the customer base and for each one of the customers provides a measure of how much money the customer had spent with the marketing organization in the past. That measure is used as a current value. The evaluation process 70 sorts the customer base into segments where the top segment would be customers that spend the most and the bottom segment would be customers that spend the least. The process 70 segments the customers into such groups and builds models 70a-70j of the probability of membership of a customer in each segment. Positive examples for each segment model are a sample of customers currently in that segment, and negative examples are a sample of customers not in that segment.
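By way of illustration only, sorting the customer base into spend-ranked segments and deriving positive and negative training examples for one segment model might be sketched in Python as follows. The function names are hypothetical, segment 0 is taken to be the top segment, and the sketch labels every customer rather than drawing the samples mentioned above.

```python
from typing import Dict


def assign_segments(spend: Dict[str, float], n_segments: int = 10) -> Dict[str, int]:
    """Rank customers by past spend; segment 0 holds the top spenders."""
    ranked = sorted(spend, key=lambda cid: spend[cid], reverse=True)
    n = len(ranked)
    # Integer arithmetic maps rank 0..n-1 onto segment 0..n_segments-1.
    return {cid: rank * n_segments // n for rank, cid in enumerate(ranked)}


def training_labels(segments: Dict[str, int], target: int) -> Dict[str, int]:
    """1 for customers currently in the target segment, 0 for all others."""
    return {cid: int(seg == target) for cid, seg in segments.items()}
```

A membership model for a given segment could then be trained on the customers' input variables against these labels, using any of the modeling algorithms mentioned earlier.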

The value determining process 70 uses models 70a-70j to give an estimate of the probability that a customer is in each segment. To score a new prospect, the process 70 computes the prospect's expected value by integrating over the output of the density model multiplied by the value associated with each segment (either an average or the output of a valuation model for that segment). The process 70 subtracts the current value of the prospect (obtained from either a calculation or a model) from the prospect's expected value, and scales the difference by a factor α (which can be tuned based on sample data). The result is an estimate of the potential. The potential value process 70 uses the set of models 70a-70j where each model 70a-70j provides an estimate of the probability of a prospect moving into the segment represented by the model.
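By way of illustration only, the scoring steps above (renormalize the segment model outputs, form the expected value, subtract the current value, scale) might be sketched in Python as follows. The function names and the example numbers are hypothetical.

```python
from typing import List, Sequence


def renormalize(raw_probs: Sequence[float]) -> List[float]:
    """Scale the per-segment model outputs so that they sum to 1."""
    total = sum(raw_probs)
    if total <= 0:
        raise ValueError("segment probabilities must have a positive sum")
    return [p / total for p in raw_probs]


def segment_potential(raw_probs: Sequence[float], segment_values: Sequence[float],
                      current_value: float, alpha: float = 1.0) -> float:
    """alpha * (sum over segments of P(segment) * value(segment) - current value)."""
    probs = renormalize(raw_probs)
    expected = sum(p * v for p, v in zip(probs, segment_values))
    return alpha * (expected - current_value)
```

For example, raw outputs of 0.2, 0.2 and 0.4 for three segments valued at $1,000, $500 and $100 renormalize to 0.25, 0.25 and 0.5, giving an expected value of $425. With a current value of $100 and a scaling factor of 0.5 the estimated potential is $162.50.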

Assume that a prospect is in a bottom segment. The model process 70 will take a record corresponding to the prospect and score it on the other nine (9) models 70a-70i. The average valuations for each segment would give an indication of how much money the prospect would spend if the prospect belonged to that segment. The model process 70 uses the score in order to estimate the likelihood of the prospect being a member of each of the segments. Therefore, the estimated value of the prospect is equal to the sum of the probability of the prospect becoming a member of the top segment times the value of the prospect in that segment, plus the probability of the prospect becoming a member of the second segment times the value of the prospect in the second segment, and so forth down to the segment that is the one above the prospect's current segment.

The reason for only considering segments above a prospect's current segment is that the potential is defined to be a non-negative quantity. In another embodiment, all segments (including those below the segment of the prospect) could be considered in the calculation, allowing for some prospects to have negative potential. Negative values of potential could have various interpretations, including a warning of likelihood of attrition or a signal that the prospect was spending beyond his/her means.
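By way of illustration only, the two alternatives above (summing only over segments above the prospect's current segment, or over all segments) might be sketched in Python as follows. The function name, the 0-based indexing with index 0 as the top segment, and the `include_lower` flag are assumptions of this sketch.

```python
from typing import Sequence


def estimated_value(probs: Sequence[float], segment_values: Sequence[float],
                    current_segment: int, include_lower: bool = False) -> float:
    """Sum of P(segment k) * value(segment k); index 0 is the top segment.

    By default only segments above the prospect's current segment are summed.
    With include_lower=True every segment contributes to the sum.
    """
    last = len(probs) if include_lower else current_segment
    return sum(probs[k] * segment_values[k] for k in range(last))
```

For a prospect in the third of three segments with probabilities 0.1, 0.2 and 0.7 and segment values of $1,000, $500 and $100, the default sum is about $200, while including all segments gives about $270.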

In another embodiment, rather than building a model 70a for the top 10%, a second model 70b for between 10 and 20%, a third model 70c between 20 and 30%, and so forth, model 70a would be the same, but model 70b would be for the entire top 20%, model 70c would be for the entire top 30%, and so forth. One advantage of this embodiment is the larger sample sizes available for building the models. In this case, if for example, a prospect was in the fourth segment, rather than integrating over the outputs from models 70a through 70c, only the output from model 70c would need to be used.
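By way of illustration only, the cumulative-model embodiment might be sketched in Python as follows. Here `cumulative_probs[k]` stands for the output of the model for the entire top (k + 1) segments. The function names, the 0-based indexing, and the differencing and clamping step in `per_segment_probs` are assumptions of this sketch rather than steps stated in the description.

```python
from typing import List, Sequence


def prob_above(cumulative_probs: Sequence[float], current_segment: int) -> float:
    """Probability of belonging to any segment above the current one.

    current_segment is 0-based, with 0 the top segment, so a prospect in the
    fourth segment (index 3) needs only the third cumulative model (index 2).
    """
    if current_segment == 0:
        return 0.0                      # nothing lies above the top segment
    return cumulative_probs[current_segment - 1]


def per_segment_probs(cumulative_probs: Sequence[float]) -> List[float]:
    """Recover per-segment probabilities by differencing cumulative outputs."""
    probs = []
    previous = 0.0
    for c in cumulative_probs:
        # Clamp at zero: separately trained models need not be monotone.
        probs.append(max(c - previous, 0.0))
        previous = c
    return probs
```

The single lookup in `prob_above` reflects the point made above that only one model output is needed in this embodiment.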

The potential value for each customer is determined from equation 2. In other words, the potential value i.e., untapped potential V_P is related to the difference between how well a prospect matches to the profiles of each of the segments above the segment that the prospect is classified into. In another embodiment, rather than multiplying the density by the average value for each segment, the process builds a plurality of valuation models (one for the customers from each segment), and multiplies the density by the score from the valuation model for each segment.

Referring to FIG. 6, a heuristic can be used as an alternative to producing a large number of models of probability of classification and determining the actual value for each segment. A heuristic based value determining process 80 can be further approximated by building a model of the probability of high value customers rather than of the entire probability density function. In this embodiment, the heuristic based value determining process 80 has a model of the top n % of customers, using top customers as positive examples, and a random sample of other customers as negative examples. That is, a group of customers can be segmented into a plurality of segments based on how much money they currently spend with the organization. The heuristic based value determining process 80 takes the top segment (e.g., the top 10% if 10 segments) to build a model that maps their background historical information to the actual amount of money that is spent while in that top segment. The heuristic based value determining process 80 also builds a model that gives the probability of the prospect becoming a member of the top segment.

Therefore, the heuristic based value determining process 80 is a further approximation of potential value. The prospect's potential value, therefore, is equal to the probability of that person moving to the top segment times the value of that person at the top segment.

To process new prospects, the value determining process 80 computes 82 the prospect's probability score from the top n % e.g., 10% model and multiplies 84 that by the mean value of the top n % e.g., 10% or by the output of a valuation model for that segment. The process 80 subtracts 86 the current value of the prospect (obtained from either a calculation or a model), and scales 88 the difference by a factor β (which can be tuned based on sample data). The result is an estimate of the potential. The estimate of potential can be added 90 to current value to determine the true value as in Equation 1 above.
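By way of illustration only, steps 82 through 90 of the heuristic process might be sketched in Python as follows. The function names and example numbers are hypothetical; the probability and the top-segment value would come from the top n % model and from the segment mean or a valuation model, respectively.

```python
def heuristic_potential(p_top: float, top_segment_value: float,
                        current_value: float, beta: float = 1.0) -> float:
    """beta * (P(top segment) * value in top segment - current value)."""
    expected_top = p_top * top_segment_value       # steps 82 and 84
    return beta * (expected_top - current_value)   # steps 86 and 88


def heuristic_true_value(p_top: float, top_segment_value: float,
                         current_value: float, beta: float = 1.0) -> float:
    """Current value plus the heuristic estimate of potential (step 90)."""
    return current_value + heuristic_potential(
        p_top, top_segment_value, current_value, beta)
```

For example, a prospect with a 0.25 probability of reaching a top segment worth $4,000, a current value of $400 and β of 0.5 has an estimated potential of 0.5 × (1,000 − 400) = $300 and a true value of $700.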

Once the potential value of customers has been determined, the process 32 (using any of the potential determining algorithms 32, 70, 80) can segment customers into a two-dimensional grid arranged as customers who have high current value and high potential value, high current value and low potential value, low current value and high potential value, or low current value and low potential value. The process 32 can tailor a marketing strategy to each of these groups. For example, a maintenance or retention program could be geared to customers who have high current value but low potential value, whereas prospects who have low current value but high potential value can be good candidates for aggressive marketing.
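By way of illustration only, placing a customer in the two-dimensional grid might be sketched in Python as follows. The thresholds that separate high from low are assumptions of this sketch, since the description does not specify how the boundaries are chosen, and only the two cells whose marketing programs are named above are mapped.

```python
from typing import Optional, Tuple


def grid_cell(current_value: float, potential_value: float,
              current_threshold: float,
              potential_threshold: float) -> Tuple[str, str]:
    """Return (current level, potential level), each 'high' or 'low'."""
    current_level = "high" if current_value >= current_threshold else "low"
    potential_level = "high" if potential_value >= potential_threshold else "low"
    return current_level, potential_level


def suggested_program(cell: Tuple[str, str]) -> Optional[str]:
    """Map the two cells named in the text to a program; others are left open."""
    programs = {
        ("high", "low"): "maintenance or retention",
        ("low", "high"): "aggressive marketing",
    }
    return programs.get(cell)
```

Cells for which the description names no program return `None`, leaving the strategy for those groups to the marketing organization.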

Other Embodiments

It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

What is claimed is:

Claims

1. A method of determining a prospect's potential value as a customer comprises: scoring a group of prospects on a first valuation model that models prospects that are marketed to in a first manner and a second valuation model that models prospects that are marketed to in a second, different manner; and determining a difference in scores of the group of prospects scored on the first valuation model and the second valuation model.

2. The method of claim 1 wherein the first valuation model models prospects that are marketed to in an aggressive manner and the second valuation model models prospects that are marketed to in a normal manner.

3. The method of claim 1 wherein the first valuation model and the second valuation model return scores in the form of a dollar amount.

4. The method of claim 1 wherein the potential value is provided as an amount of money that the prospect can spend.

5. A method of determining a prospect's true value as a customer comprises: determining a prospect's potential value; and adding the prospect's potential value to an estimate of the prospect's current value to provide the prospect's true value.

6. The method of claim 5 wherein determining a prospect's potential value comprises: scoring the prospect on a first valuation model that models results of marketing to prospects in an aggressive manner and a second valuation model that models results of marketing to prospects in a second, less aggressive manner; and determining a difference in scores of the group of prospects scored on the first valuation model and the second valuation model.

7. The method of claim 6 wherein the first valuation model and the second valuation model return scores in the form of a monetary amount.

8. The method of claim 5 wherein determining potential value comprises: estimating the potential value by comparing a prospect to models of current best customers.

9. The method of claim 8 wherein estimating the potential value comprises: determining a prospect's potential by integrating over a density function that models a prospect's potential.

10. The method of claim 8 wherein estimating is determined in accordance with

$$\text{potential} = \alpha \left( \int p(v)\, v \, dv - C \right) \qquad \text{(Equation 2)}$$

where α is a scaling factor that can be tuned empirically, p(v) is the probability density associated with value v, and C is the current value.

11. The method of claim 9 wherein determining further comprises: scoring the prospect through n models, each model learning a probability for one segment of a plurality of segments that correspond to the probability density of possible customer values given customer attributes; and combining results from each of the models to produce a composite probability density model.

12. The method of claim 11 wherein the group of prospects are segmented into the plurality of segments based on how much money they currently spend with the organization.

13. The method of claim 11 wherein a heuristic is used to determine the actual value for each segment.

14. The method of claim 8 wherein estimating potential value comprises: evaluating a prospect through a model of the probability of high value customers.

15. The method of claim 8 wherein estimating potential value comprises: modeling the top n% of customers, using top customers as positive examples, and a random sample of other customers as negative examples.

16. The method of claim 15 wherein the group of prospects are segmented into a plurality of segments based on how much money they currently spend with the organization.

17. The method of claim 16 wherein estimating potential value comprises: scoring a prospect using a model that maps background historical information to an actual amount of money that is spent while in a top segment.

18. The method of claim 17 wherein estimating potential value comprises: scoring the prospect using a model that gives the probability of the prospect becoming a member of the top segment.

19. The method of claim 18 wherein the prospect's potential value is related to the probability of that person becoming a member of the top segment times the value of that person at the top segment.

20. A computer program product stored on a computer readable medium to determine a prospect's potential value as a customer comprises instructions to cause a computer to: score a group of prospects on a first valuation model that models prospects that are marketed to in a first manner and a second valuation model that models prospects that are marketed to in a second, different manner; and determine a difference in scores of the group of prospects scored on the first valuation model and the second valuation model.

21. The computer program product of claim 20 wherein the first valuation model models prospects that are marketed to in an aggressive manner and the second valuation model models prospects that are marketed to in a normal manner.

22. The computer program product of claim 20 wherein the first valuation model and the second valuation model return scores in the form of a dollar amount.

23. The computer program product of claim 20 wherein the potential value is provided as an amount of money that the prospect can spend.

24. A computer program product to determine a prospect's true value as a customer comprises instructions to cause a computer to: determine a prospect's potential value; and add the prospect's potential value to an estimate of the prospect's current value to provide the prospect's true value.

25. The computer program product of claim 24 wherein instructions to determine a prospect's potential value comprises instructions to: score the prospect on a first valuation model that models results of marketing to prospects in an aggressive manner and a second valuation model that models results of marketing to prospects in a second, less aggressive manner; and determine a difference in scores of the group of prospects scored on the first valuation model and the second valuation model.

26. The computer program product of claim 24 wherein the first valuation model and the second valuation model return scores in the form of a monetary amount.

27. The computer program product of claim 24 wherein instructions to determine potential value comprises instructions to: estimate the potential value by comparing a prospect to models of current best customers.

28. The computer program product of claim 27 wherein instructions to estimate the potential value comprises instructions to: determine a prospect's potential by integrating over a density function that models a prospect's potential.

29. The computer program product of claim 27 wherein the instructions to estimate the potential value do so in accordance with

$$\text{potential} = \alpha \left( \int p(v)\, v \, dv - C \right) \qquad \text{(Equation 2)}$$

where α is a scaling factor that can be tuned empirically, p(v) is the probability density associated with value v, and C is the current value.

30. The computer program product of claim 28 wherein instructions to determine further comprise instructions to: score the prospect through n models, each model learning a probability for one segment of a plurality of segments that correspond to the probability density of possible customer values given customer attributes; and combine results from each of the models to produce a composite probability density model.

31. The computer program product of claim 30 further comprising instructions to: segment records of the group of prospects into the plurality of segments based on how much money they currently spend with the organization.

32. The computer program product of claim 30 wherein a heuristic is used to determine the actual value for each segment.

33. The computer program product of claim 27 wherein instructions to estimate potential value comprise instructions to: evaluate a prospect through a model of the probability of high value customers.

34. The computer program product of claim 27 wherein instructions to estimate potential value comprise instructions to: model the top n% of customers, using top customers as positive examples, and a random sample of other customers as negative examples.

35. The computer program product of claim 34 further comprising instructions to segment records of prospects into a plurality of segments based on how much money the prospects currently spend with the organization.

36. The computer program product of claim 35 wherein instructions to estimate potential value comprise instructions to: score a prospect using a model that maps background historical information to an actual amount of money that is spent while in a top segment.

37. The computer program product of claim 36 wherein instructions to estimate potential value comprise instructions to: score the prospect using a model that gives the probability of the prospect becoming a member of the top segment.

38. The computer program product of claim 37 wherein the prospect's potential value is related to the probability of that person becoming a member of the top segment times the value of that person at the top segment.

39. An apparatus comprising: a processor; and a computer storage medium storing a computer program product for determining a prospect's true value as a customer, said program product comprising instructions for causing the processor to: determine a prospect's potential value; and add the prospect's potential value to an estimate of the prospect's current value to provide the prospect's true value.
