CHAID, or Chi-square Automatic Interaction Detector, is an exploratory algorithm for studying the relationship between a dependent variable and a set of predictor variables. It selects the predictors, and the interactions among them, that best predict the dependent variable, and presents the result as a classification tree. The dependent variable can be either qualitative or quantitative.
A CHAID model, or CHAID diagram, can be thought of as an inverted tree whose trunk splits into branches and sub-branches. Initially the "tree trunk" is the full set of participants in the study. Each predictor variable is examined to see whether splitting the sample on it produces a statistically significant difference in the dependent variable. To do this, chi-square tests (for a qualitative dependent variable) or F tests (for a quantitative one) are carried out and their p values are calculated. If the p value for a pair of predictor categories is not statistically significant, the algorithm merges those categories. If the difference is statistically significant, a split is made, and this becomes the first branching of the tree. The same question is then asked of each resulting group: can it be split further into subgroups that differ significantly in the dependent variable?
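The split test described above can be sketched in a few lines of Python. This is a minimal illustration, not a full CHAID implementation: it assumes a qualitative dependent variable with two outcomes and predictors with two categories each (so every table is 2x2 with one degree of freedom), and it leaves out the category-merging step and the Bonferroni adjustment that CHAID applies to p values. The function names, the predictor names and the counts are all hypothetical.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value_2x2(table):
    """P value for a 2x2 table (1 degree of freedom), without continuity correction."""
    stat = chi_square_statistic(table)
    # With 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


def best_split(tables, alpha=0.05):
    """Return (predictor, p value) for the smallest p value, or None if it is not below alpha."""
    p_values = {name: p_value_2x2(table) for name, table in tables.items()}
    name = min(p_values, key=p_values.get)
    return (name, p_values[name]) if p_values[name] < alpha else None


# Hypothetical counts: rows are predictor categories, columns are outcomes (yes, no).
tables = {
    "smoker": [[30, 20], [10, 40]],
    "gender": [[21, 29], [19, 31]],
}
result = best_split(tables)
print(result)
```

With these made-up counts, the hypothetical "smoker" predictor gives a much smaller p value than "gender", so it would be chosen for the first branching. Repeating `best_split` on the participants inside each resulting group corresponds to growing the sub-branches.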