Logistic (original)

Class for building and using a multinomial logistic regression model with a ridge estimator.

However, there are some modifications compared to the paper of le Cessie and van Houwelingen (1992):

If there are k classes for n instances with m attributes, the parameter matrix B to be calculated will be an m*(k-1) matrix.

The probability for class j with the exception of the last class is

Pj(Xi) = exp(Xi*Bj) / ((sum[l=1..(k-1)] exp(Xi*Bl)) + 1)

The last class has probability

1 - (sum[j=1..(k-1)] Pj(Xi))
  = 1 / ((sum[l=1..(k-1)] exp(Xi*Bl)) + 1)
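
As an illustration, the class probabilities above can be computed as in the following sketch. This is not Weka's internal code; the class and method names are made up for the example, and B is stored as b[p][j] for attribute p and class j.

    // Sketch of the class-probability formulas above; illustrative only,
    // not the internal code of weka.classifiers.functions.Logistic.
    public class MultinomialProbabilities {

        // x: attribute values of one instance (length m)
        // b: parameter matrix, b[p][j] for attribute p and class j (m x (k-1))
        // returns the probabilities of all k classes, last class at index k-1
        public static double[] distribution(double[] x, double[][] b) {
            int m = x.length;
            int kMinus1 = b[0].length;
            double[] probs = new double[kMinus1 + 1];
            double denominator = 1.0;              // the "+1" term in the formulas
            for (int j = 0; j < kMinus1; j++) {
                double dot = 0.0;                  // Xi*Bj
                for (int p = 0; p < m; p++) {
                    dot += x[p] * b[p][j];
                }
                probs[j] = Math.exp(dot);
                denominator += probs[j];
            }
            for (int j = 0; j < kMinus1; j++) {
                probs[j] /= denominator;           // Pj(Xi)
            }
            probs[kMinus1] = 1.0 / denominator;    // probability of the last class
            return probs;
        }
    }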

The (negative) multinomial log-likelihood is thus:

L = -sum[i=1..n]{
        sum[j=1..(k-1)] (Yij * ln(Pj(Xi)))
        + (1 - (sum[j=1..(k-1)] Yij)) * ln(1 - sum[j=1..(k-1)] Pj(Xi))
    } + ridge * (B^2)
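
Evaluated directly, the loss looks as in the sketch below. It reuses the hypothetical distribution() helper from the previous sketch and assumes the class of instance i is encoded as y[i] in {0, ..., k-1}, so that the inner sums over Yij collapse to the log-probability of the observed class; the ridge term is read as the sum of squared entries of B.

    // Sketch of the (negative) penalised log-likelihood above; illustrative
    // only, not the internal code of the Weka implementation.
    public class PenalisedLogLikelihood {

        public static double value(double[][] x, int[] y, double[][] b, double ridge) {
            double loss = 0.0;
            for (int i = 0; i < x.length; i++) {
                double[] probs = MultinomialProbabilities.distribution(x[i], b);
                // Yij is 1 only for the observed class of instance i, so the
                // bracketed term reduces to ln of that class's probability.
                loss -= Math.log(probs[y[i]]);
            }
            double penalty = 0.0;                  // ridge * (B^2)
            for (double[] row : b) {
                for (double entry : row) {
                    penalty += entry * entry;
                }
            }
            return loss + ridge * penalty;
        }
    }

Instance weights (mentioned below) would enter such a computation by weighting each instance's contribution to the first sum; the exact handling is defined by the Weka implementation.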

In order to find the matrix B for which L is minimised, a Quasi-Newton method is used to search for the optimal values of the m*(k-1) variables. Note that before the optimization procedure is applied, the matrix B is 'squeezed' into an m*(k-1) vector. For details of the optimization procedure, please check the weka.core.Optimization class.
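
The 'squeezing' step amounts to a fixed mapping between matrix and vector indices, for example as below. The layout shown (the (k-1) class parameters of each attribute stored consecutively) is only one possible choice and not necessarily the ordering used internally.

    // Sketch of flattening the m x (k-1) parameter matrix into a vector for
    // the optimizer, and the inverse mapping; the ordering is an assumption.
    public class ParameterFlattening {

        public static double[] flatten(double[][] b) {
            int m = b.length, kMinus1 = b[0].length;
            double[] v = new double[m * kMinus1];
            for (int p = 0; p < m; p++) {
                for (int j = 0; j < kMinus1; j++) {
                    v[p * kMinus1 + j] = b[p][j];
                }
            }
            return v;
        }

        public static double[][] unflatten(double[] v, int m, int kMinus1) {
            double[][] b = new double[m][kMinus1];
            for (int p = 0; p < m; p++) {
                for (int j = 0; j < kMinus1; j++) {
                    b[p][j] = v[p * kMinus1 + j];
                }
            }
            return b;
        }
    }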

Although the original logistic regression does not deal with instance weights, the algorithm is modified slightly to handle instance weights.

For more information see:

le Cessie, S., van Houwelingen, J.C. (1992). Ridge Estimators in Logistic Regression. Applied Statistics. 41(1):191-201.

Note: Missing values are replaced using a ReplaceMissingValuesFilter, and nominal attributes are transformed into numeric attributes using a NominalToBinaryFilter.

BibTeX:

@article{leCessie1992,
  author = {le Cessie, S. and van Houwelingen, J.C.},
  journal = {Applied Statistics},
  number = {1},
  pages = {191-201},
  title = {Ridge Estimators in Logistic Regression},
  volume = {41},
  year = {1992}
}

Valid options are:

-D Turn on debugging output.

-R <ridge> Set the ridge in the log-likelihood.

-M <number> Set the maximum number of iterations (default -1, until convergence).
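
The same options can also be set programmatically through the Weka API, as in the sketch below. The class name LogisticExample, the dataset path and the option values are illustrative; any ARFF file with a nominal class attribute will do.

    import weka.classifiers.functions.Logistic;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    // Sketch of building the classifier with the options listed above set via
    // the API; the dataset path and the option values are illustrative.
    public class LogisticExample {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("data/iris.arff");   // hypothetical path
            data.setClassIndex(data.numAttributes() - 1);

            Logistic logistic = new Logistic();
            // Same effect as the command-line options "-R 1.0E-8 -M -1".
            logistic.setOptions(new String[] {"-R", "1.0E-8", "-M", "-1"});
            logistic.buildClassifier(data);

            System.out.println(logistic);    // prints the fitted coefficients
        }
    }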