Cafe Cerebral - Generalized Linear Model
The Generalized Linear Model (GLZ) is a generalization of the general linear model. In its simplest form, a linear model specifies the (linear) relationship between a dependent (or response) variable Y, and a set of predictor variables, the X's, so that

Y = b0 + b1X1 + b2X2 + ... + bkXk

In this equation b0 is the intercept and the bi values are the regression coefficients computed from the data.

For example, one could estimate (i.e., predict) spending as a function of income and savings. In this situation linear regression can be used to estimate the respective regression coefficients from a sample of data. For many data analysis problems, estimates of the linear relationships between variables are adequate to describe the observed data and to make reasonable predictions for new observations.
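As a rough illustration of the spending example, the sketch below estimates b0, b1 and b2 by ordinary least squares using only the Python standard library. The data are made up for illustration (generated without noise from spend = 5 + 0.6*income + 0.1*savings), and the function names fit_linear and predict are hypothetical helpers, not part of any library.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows of predictor values (no intercept column);
    the returned list is [b0, b1, ..., bk].
    """
    rows = [[1.0] + list(r) for r in X]  # prepend 1 for the intercept b0
    p = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [v / pv for v in A[col]]
        for r in range(p):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][p] for i in range(p)]


def predict(b, x):
    """Evaluate Y = b0 + b1*X1 + ... + bk*Xk for one observation x."""
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))


# Made-up sample: (income, savings) pairs, in arbitrary currency units.
income_savings = [(30, 5), (45, 12), (60, 8), (80, 30), (100, 20), (120, 50)]
# Assumed noise-free relationship, so the fit should recover 5, 0.6 and 0.1.
spend = [5 + 0.6 * inc + 0.1 * sav for inc, sav in income_savings]

b = fit_linear(income_savings, spend)
new_spend = predict(b, (70, 15))  # prediction for a new observation
```

With real, noisy data the estimated coefficients would only approximate the underlying relationship, and a statistics package would normally be used instead of hand-written elimination.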

However, there are many relationships that cannot adequately be summarized by a simple linear equation, for two major reasons:


  1. Distribution of Dependent Variable
    First, the dependent variable of interest may have a non-continuous distribution, and thus, the predicted values should also follow the respective distribution. For example, a marketing manager may be interested in predicting one of three possible discrete outcomes (e.g., a consumer's choice of one of three alternative products). In that case, the dependent variable can only take on 3 distinct values, and the distribution of the dependent variable is said to be multinomial.

  2. Link Function
    A second reason why the linear (multiple regression) model might be inadequate to describe a particular relationship is that the effect of the predictors on the dependent variable may not be linear in nature. For example, the relationship between the satisfaction derived (say, measured in some unit) and the number of ice creams consumed is most likely not linear. The satisfaction derived from the first and the second ice cream will be very different, while it won’t be that different between the fourth and the fifth. From the sixth ice cream onwards, satisfaction may actually start going down. In other words, the link between satisfaction and the number of ice creams consumed is best described as non-linear, for instance as an approximately logarithmic relationship over the range where satisfaction is still rising.

    The generalized linear model can be used to predict responses both for dependent variables with discrete distributions and for dependent variables that are nonlinearly related to the predictors.
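To make the three-product example more concrete, the sketch below shows one common way (not named in the text above, so treat it as an assumption) to turn linear combinations of a predictor into valid choice probabilities: compute one linear score per product and pass the scores through a softmax. The coefficients, the single predictor (income) and the product names A, B and C are all hypothetical.

```python
import math


def softmax(scores):
    """Map arbitrary real-valued scores to probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical coefficients: one (intercept, slope) pair per product.
coefs = {"A": (0.0, 0.00), "B": (-1.0, 0.03), "C": (-3.0, 0.06)}


def choice_probabilities(income):
    """Probability of choosing each product for a given income."""
    names = list(coefs)
    scores = [coefs[n][0] + coefs[n][1] * income for n in names]
    return dict(zip(names, softmax(scores)))


probs = choice_probabilities(income=50)
predicted = max(probs, key=probs.get)  # most likely of the three products
```

Unlike a plain linear equation, whose output could be any real number, the predicted values here always form a proper probability distribution over the three discrete outcomes.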
 
Computational Approach
To illustrate, in the general linear model a response variable Y is linearly associated with values on the X variables by

Y = b0 + b1X1 + b2X2 + ... + bkXk


while the relationship in the generalized linear model is assumed to be

Y = g (b0 + b1X1 + b2X2 + ... + bkXk )

where g(…) is a function that may be non-linear. Formally, the inverse function of g(…), say f(…), is called the link function, so that:

f(Ey) = b0 + b1X1 + b2X2 + ... + bkXk

where Ey stands for the expected value of Y.
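The relationship between g(…) and the link function f(…) can be sketched in a few lines of Python. The three links shown (identity, log and logit) are standard examples chosen here for illustration; the coefficient values and the predictor values are hypothetical.

```python
import math


def linear_predictor(b, x):
    """Compute b0 + b1*X1 + ... + bk*Xk."""
    return b[0] + sum(bi * xi for bi, xi in zip(b[1:], x))


# Each entry is (f, g): the link function f and its inverse g.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}

b = [-1.0, 0.5, 0.25]  # hypothetical b0, b1, b2
x = [2.0, 4.0]         # hypothetical X1, X2
eta = linear_predictor(b, x)

# Expected value of Y under each choice of link: Ey = g(eta).
expected = {name: g(eta) for name, (f, g) in LINKS.items()}

# Applying the link to the expected value returns the linear predictor:
# f(Ey) = b0 + b1*X1 + b2*X2.
recovered = {name: f(expected[name]) for name, (f, g) in LINKS.items()}
```

With the identity link the model reduces to the general linear model; the log link keeps the expected value positive, and the logit link keeps it between 0 and 1.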
© 2005 - 2009 Mu Sigma. All rights reserved