ASSESSING THE PERFORMANCE OF DIRECT MARKETING SCORING MODELS

Direct marketers commonly assess their scoring models with a single-split, gains chart method: They split the available data into "training" and "test" sets, estimate their models on the training set, apply them to the test set, and generate gains charts. They use the results to compare models (which model should be used), assess overfitting, and estimate how well the mailing will do. It is well known that the results from this approach are highly dependent on the particular split of the data used, due to sampling variation across splits. This paper examines the single-split method. Does the sampling variation across splits affect one's ability to distinguish between superior and inferior models? How can one estimate the overall performance of a mailing accurately? I consider two ways of reducing the variation across splits: Winsorization and stratified sampling. The paper gives an empirical study of these questions and variance-reduction methods using the DMEF data sets.
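The sketch below illustrates the single-split, gains-chart evaluation described above, with Winsorization of the training revenue as one possible variance-reduction device. The synthetic data, the column layout, and the 99th-percentile cap are assumptions for illustration only, not the paper's DMEF setup.

```python
# A minimal sketch, assuming synthetic data in place of a DMEF file.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

def gains_chart(y_true, scores, n_bins=10):
    """Cumulative share of total revenue captured at each score decile."""
    order = np.argsort(scores)[::-1]              # best-scoring customers first
    sorted_rev = np.asarray(y_true)[order]
    cum_rev = np.cumsum(sorted_rev) / sorted_rev.sum()
    cut_points = (np.arange(1, n_bins + 1) * len(y_true)) // n_bins
    return cum_rev[cut_points - 1]                # gains at 10%, 20%, ..., 100%

def winsorize(y, upper_pct=99):
    """Cap extreme revenue values to reduce variation across splits (assumed cap)."""
    cap = np.percentile(y, upper_pct)
    return np.minimum(y, cap)

# Synthetic stand-in for customer features X and order revenue y.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 5))
y = np.maximum(0, X @ np.array([3.0, 2.0, 0.5, 0.0, 0.0]) + rng.normal(scale=5, size=5000))

# One-shot train/test split, model estimated on the training half only.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=42)
model = LinearRegression().fit(X_tr, winsorize(y_tr))
print(gains_chart(y_te, model.predict(X_te)))
```

Re-running the split with a different random seed changes the printed gains noticeably, which is the sampling variation the paper studies.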

Ridge regression and direct marketing scoring models

Journal of Interactive Marketing, 1999

The objective of a direct marketing scoring model is to pick a specified number of people to receive a particular offer so that the response to the mailing is maximized. This paper shows how ridge regression can be used to improve the performance of direct marketing scoring models. It reviews the key property of ridge regression: it can produce estimates of the slope coefficients with smaller mean squared error than ordinary least squares. Next, it shows that ridge regression can be used to reduce the effective number of parameters in a regression model. Thus, ridge regression can serve as an alternative to variable subset selection methods such as stepwise regression for controlling the bias-variance tradeoff of the estimated values. This means that direct marketers can include more variables in a scoring model without danger of overfitting the data. Ridge regression estimates are compared with stepwise regression on direct marketing data. The empirical results suggest that ridge regression provides a more stable way of moderating the model degrees of freedom than dropping variables.

The author's primary research is in the area of database marketing and computational statistics. He develops statistical models and applies them to large data sets of consumer information to help managers make marketing decisions. The primary application areas of his models include market segmentation, targeting, and direct marketing.
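As an illustration of the idea reviewed in the abstract above, the sketch below keeps every candidate predictor and shrinks the coefficients with a cross-validated ridge penalty, then compares the result with ordinary least squares on held-out data. The synthetic data, the number of predictors, and the penalty grid are assumptions, not the paper's specification.

```python
# A minimal sketch, assuming simulated data with many irrelevant predictors.
import numpy as np
from sklearn.linear_model import RidgeCV, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 40))                       # many candidate predictors
beta = np.concatenate([rng.normal(size=10), np.zeros(30)])
y = X @ beta + rng.normal(scale=3, size=2000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Ridge keeps all variables but shrinks coefficients; the penalty weight is
# chosen by cross-validation, which moderates the effective degrees of freedom.
ridge = make_pipeline(StandardScaler(), RidgeCV(alphas=np.logspace(-2, 3, 30)))
ols = make_pipeline(StandardScaler(), LinearRegression())

print("ridge R^2:", ridge.fit(X_tr, y_tr).score(X_te, y_te))
print("OLS   R^2:", ols.fit(X_tr, y_tr).score(X_te, y_te))
```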

PERFORMANCE-BASED VARIABLE SELECTION FOR SCORING MODELS

The performance of a direct marketing scoring model at a particular mailing depth, d, is usually measured by the total amount of revenue generated by sending an offer to the customers with the 100d% largest scores (predicted values). Commonly used variable selection algorithms optimize some function of model fit (the squared difference between observed and predicted values). This article discusses the issues involved in selecting a mailing depth, d, and proposes a variable selection algorithm that optimizes performance as the primary objective. The relationship between fit and performance is discussed. The performance-based algorithm is compared with fit-based algorithms using two real direct marketing data sets. The experimental results indicate that performance-based variable selection is 3-4% better than the corresponding fit-based models, on average, when the mailing depth is between 20% and 40%.
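The depth-d performance measure described above can be written down in a few lines: rank the customers by score and sum the revenue of the top 100d%. The sketch below is a minimal illustration with simulated revenue and two hypothetical competing models; none of it is the article's data or notation.

```python
# A minimal sketch of the depth-d performance measure, on simulated revenue.
import numpy as np

def depth_performance(y_revenue, scores, depth):
    """Total revenue captured by mailing the top `depth` fraction of scores."""
    n_mail = int(np.ceil(depth * len(scores)))
    top = np.argsort(scores)[::-1][:n_mail]      # indices of the highest scores
    return np.asarray(y_revenue)[top].sum()

# Compare two hypothetical scoring models at a 30% mailing depth.
rng = np.random.default_rng(2)
revenue = rng.exponential(scale=20, size=1000) * (rng.random(1000) < 0.1)
scores_a = revenue + rng.normal(scale=30, size=1000)   # noisier model
scores_b = revenue + rng.normal(scale=10, size=1000)   # sharper model
print(depth_performance(revenue, scores_a, 0.3),
      depth_performance(revenue, scores_b, 0.3))
```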

A Stacking Approach to Direct Marketing Response Modeling

Asian Journal of Research in Computer Science

In this work, we investigate the viability of the stacked generalization approach in predictive modeling of a direct marketing problem. We compare the performance of individual models created using different classification algorithms with stacked ensembles of these models. The base algorithms we investigate and use to create stacked models are Neural Networks, Logistic Regression, Support Vector Machines (SVM), Naïve Bayes, and Decision Tree (CART). These algorithms were selected for their popularity and good performance on similar tasks in previous studies. Using a benchmark experiment and statistical tests, we compared five single-algorithm classifiers and 26 stacked ensembles of combinations of these algorithms on two popular metrics: Area Under the ROC Curve (AUC) and lift. We demonstrate a significant improvement in the AUC and lift values when the stacked generalization approach is used vis-à-vis the single-algorithm approach. We conclude that despite its relative obscurity in m...
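The sketch below builds one stacked ensemble over the five base learners named in the abstract and scores everything on AUC. The synthetic class-imbalanced data, the hyperparameters, and the logistic-regression meta-learner are assumptions, not the authors' benchmark design.

```python
# A minimal stacking sketch, assuming synthetic imbalanced response data.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=3000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base = [
    ("nn", MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0)),
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC(probability=True, random_state=0)),
    ("nb", GaussianNB()),
    ("cart", DecisionTreeClassifier(max_depth=5, random_state=0)),
]
# Level-1 model learns to combine the cross-validated base predictions.
stack = StackingClassifier(estimators=base,
                           final_estimator=LogisticRegression(max_iter=1000), cv=5)

for name, clf in base + [("stack", stack)]:
    clf.fit(X_tr, y_tr)
    print(name, roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
```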

Using predicted outcome stratified sampling to reduce the variability in predictive performance of a one-shot train-and-test split for individual customer predictions

2006

Since it is generally recognised that models evaluated on the data used to construct them appear overly optimistic, in predictive modeling practice the assessment of a model's predictive performance frequently relies on a one-shot train-and-test split between observations used for estimating the model and those used for validating it. Previous research has indicated the usefulness of stratified sampling for reducing the variation in predictive performance in a linear regression application. In this paper, we validate the previous findings on six real-life European predictive modeling applications for marketing and credit scoring with a dichotomous outcome variable. We find confirmation for the reduction in variability using a procedure we describe as predicted outcome stratified sampling in a logistic regression model, and we find that the gain in variation reduction is almost always significant, even in large data sets, and in certain applications markedly high.
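One way to read the procedure described above is: fit a preliminary model on all observations, bin its predicted probabilities, and stratify the one-shot train-and-test split on those bins. The sketch below follows that reading; the synthetic data, the five bins, and the split ratio are assumptions, not the paper's exact protocol.

```python
# A minimal sketch of a predicted-outcome stratified split, under assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=5000, n_features=15,
                           weights=[0.92, 0.08], random_state=3)

# Preliminary model fitted on all observations, used only to define the strata.
prelim = LogisticRegression(max_iter=1000).fit(X, y)
p = prelim.predict_proba(X)[:, 1]
strata = np.digitize(p, np.quantile(p, [0.2, 0.4, 0.6, 0.8]))   # 5 score bins

# One-shot split stratified on the predicted-outcome bins.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=strata, random_state=7)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("test AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```

Repeating the split over many seeds, with and without the stratification, is how one would measure the reduction in variability the paper reports.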

Evaluation of prediction models for marketing campaigns

Proceedings of the …, 2001

We consider prediction-model evaluation in the context of marketing-campaign planning. In order to evaluate and compare models with specific campaign objectives in mind, we need to concentrate our attention on the appropriate evaluation criteria. These should portray the model's ability to score accurately and to identify the relevant target population. In this paper we discuss some applicable model evaluation and selection criteria, their relevance for campaign planning, their robustness under changing population distributions, and their use when constructing confidence intervals. We illustrate our results with a case study based on our experience from several projects.
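For the confidence-interval point above, a common generic device is the bootstrap over the scored test set. The sketch below computes a percentile interval for two campaign-relevant metrics, top-decile lift and AUC; the metric choices, the 500 resamples, and the simulated scores are assumptions, not the authors' case study.

```python
# A minimal bootstrap-CI sketch for campaign metrics, on simulated scores.
import numpy as np
from sklearn.metrics import roc_auc_score

def top_decile_lift(y_true, scores):
    """Response rate in the top 10% of scores, relative to the overall rate."""
    n_top = max(1, len(y_true) // 10)
    top = np.argsort(scores)[::-1][:n_top]
    return np.asarray(y_true)[top].mean() / np.asarray(y_true).mean()

def bootstrap_ci(metric, y_true, scores, B=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for any metric(y_true, scores)."""
    rng = np.random.default_rng(seed)
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    stats = []
    for _ in range(B):
        idx = rng.integers(0, len(y_true), len(y_true))   # resample with replacement
        stats.append(metric(y_true[idx], scores[idx]))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(4)
y = (rng.random(2000) < 0.1).astype(int)
scores = y * rng.random(2000) + rng.random(2000) * 0.8
print("lift CI:", bootstrap_ci(top_decile_lift, y, scores))
print("AUC  CI:", bootstrap_ci(roc_auc_score, y, scores))
```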

Model selection for direct marketing: performance criteria and validation methods

Marketing Intelligence & Planning, 2008

Purpose – The purpose of this paper is to assess the performance of competing methods and model selection, which are non-trivial issues given the financial implications. Researchers have adopted various methods, including statistical models and machine learning methods such as neural networks, to assist decision making in direct marketing. However, due to the different performance criteria and validation techniques currently in practice, comparing different methods is often not straightforward.

Design/methodology/approach – This study compares the performance of neural networks with that of classification and regression tree, latent class models and logistic regression using three criteria – simple error rate, area under the receiver operating characteristic curve (AUROC), and cumulative lift – and two validation methods, i.e. bootstrap and stratified k-fold cross-validation. Systematic experiments are conducted to compare their performance.

Findings – The results suggest that these methods vary...
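One of the two validation methods named above, stratified k-fold cross-validation, is sketched below for three of the competing methods, scored on AUROC. The synthetic data, k = 10, and the hyperparameters are assumptions, not the study's experimental design.

```python
# A minimal stratified k-fold comparison sketch, assuming synthetic data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=4000, n_features=25,
                           weights=[0.9, 0.1], random_state=5)
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=5)

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "cart": DecisionTreeClassifier(max_depth=6, random_state=5),
    "neural net": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=5),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, scoring="roc_auc", cv=cv)
    print(f"{name}: AUROC {scores.mean():.3f} +/- {scores.std():.3f}")
```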

Measures of predictive success for rating functions

2011

The aim of our paper is to develop an adequate measure of the predictive success and accuracy of rating functions. First, we show that the common measures of rating accuracy, the area under the curve and the accuracy ratio, lack informative value about single rating classes. Selten (1991) builds an axiomatic framework for measures of predictive success, and we introduce a measure for rating functions that fulfills the axioms proposed by Selten (1991). Furthermore, an empirical investigation analyzes the predictive power and accuracy of Standard & Poor's and Moody's ratings, and compares the rankings according to the area under the curve and our measure.
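For reference, the two standard measures criticized above are closely related: for a binary default indicator the accuracy ratio (Gini) equals 2·AUC − 1. The sketch below computes both on simulated rating data; the default rate and the simulated scores are assumptions, not S&P or Moody's data.

```python
# A minimal sketch of AUC and accuracy ratio on simulated rating data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(6)
defaults = (rng.random(3000) < 0.05).astype(int)           # 1 = default event
# Riskier obligors get lower (safer = higher) rating scores, on average.
rating_score = rng.normal(size=3000) - 1.2 * defaults

auc = roc_auc_score(defaults, -rating_score)               # risk ranking = -safety score
accuracy_ratio = 2 * auc - 1                               # Gini / accuracy ratio
print(f"AUC = {auc:.3f}, accuracy ratio = {accuracy_ratio:.3f}")
```

Both numbers summarize the whole ranking at once, which is exactly why they say little about any single rating class.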

Characterizing and Predicting Reviews for Effective Product Marketing and Advancement

Journal of Informatics Electrical and Electronics Engineering (JIEEE), A2Z Journals, 2021

In today's connected world, people around the globe communicate every day through different platforms on the Web. It has been reported that about 71% of online customers read online reviews before buying a product. Product reviews, particularly the early reviews (i.e., the reviews posted at the early stage of a product's life), strongly influence subsequent product sales. We call the users who post these early reviews "early reviewers". Although early reviewers contribute only a small fraction of reviews, their opinions can determine the success or failure of new products and services. It is important for organizations to identify early reviewers, since their responses can help organizations adjust marketing strategies and improve product designs, which can ultimately lead to the success of their new products. Every day, a massive amount of unstructured information is generated. This information is in the form of text collected from forums, social media sites, and reviews, and is referred to as big data. Customer opinions relate to a wide range of topics, including specific products. These reviews can be mined using various techniques and are of considerable importance for making predictions, since they clearly convey the views of the majority. Online reviews have also become an important source of information for customers before making an informed purchase decision. Early reviewers' opinions and the helpfulness scores they receive are likely to influence product popularity. The challenge is to gather all the reviews, then find and analyze the opinions, in order to extract something refined that scores highly in evaluation.
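As a purely illustrative example of mining review text for opinions, the sketch below trains a TF-IDF bag-of-words classifier to label reviews as positive or negative. The toy reviews, labels, and model choice are all hypothetical and are not taken from the paper.

```python
# A minimal, purely illustrative review-opinion sketch on made-up toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

reviews = [
    "great product, works exactly as described",
    "terrible quality, broke after two days",
    "excellent value and fast delivery",
    "very disappointing, would not recommend",
    "love it, best purchase this year",
    "waste of money, support never replied",
]
labels = [1, 0, 1, 0, 1, 0]   # 1 = positive opinion, 0 = negative

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(reviews, labels)
print(clf.predict(["fast delivery and great quality",
                   "broke immediately, total waste"]))
```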