Cafe Cerebral - ANOVA / MANOVA

ANOVA or Analysis of Variance is an important tool in statistics which tests the significance of difference of means for two or more groups. A t-test is used normally when two groups of observations needs to be compared. If the number of groups increase, for e.g. five groups, then we would have to perform in all 5C2 or ten different t-tests.The drawback with using this method is that, as the number of groups increase, the number of t-tests increase geometrically and leads to increase in Type I error. In such a situation we resort to the technique of ANOVA. One would consider the name ANOVA a misnomer as because we are comparing means. But in reality, the name comes from the fact that in order to test the significance of difference of means we are actually analyzing the variances.

The kernel of ANOVA lies in partitioning its total variation into two groups, variation within groups and variation between groups. This is also known as partitioning of its total sum of squares. The variation within groups is known as sum of squares due to error or SSE, as this variation cannot be explained and that for variation between groups is known as sum of squares due to effect or SS(Effect). We consider the mean sum of squares respectively and take their ratio. If there is no significant difference in means then we can expect MS(Effect) and MSE to be of the same order or consequently their ratio, very small. But if there is actually a significant difference between the groups, then this difference would be reflected in the MS(Effect). Then our null hypothesis (that there is no significant differences between groups in the population) would be rejected. The variable that is measured is called the dependent variable and the variables that are manipulated or controlled are called factors or independent variables. In comparison to t-tests, this method is more powerful and more flexible. Moreover, ANOVA allows us to detect interaction effects between variables too.


For example, consider an agricultural scientist who is wants to find out if the average yield of four different fertilizers differ significantly (See figure below). Now what he does is, he divides his whole plot of land into four parts or blocks (Shaded in blue). In doing this, what he is actually doing is removing the heterogeneity in each block. The variation within each block is beyond our control and is assumed to be error variation. The variation between the blocks is assumed to be due the variation in the quality of the factors (Shaded in purple). The total variation is sum of these two kinds of variations. In ANOVA, we consider the ratio of these two kinds of variations after taking their averages. If there is a significant difference in the quality of the fertilizers, we can expect that variation along the horizontal plots much more than the variation along the vertical plots.

As the name suggests MANOVA, refers to Multivariate Analysis of Variance. Here we move from an univariate framework to a multivariate one where we have more than one dependent variables. Suppose a researcher is testing effect of two different kinds of broadband antibiotics. Then the various fields where these two medicines can be applied will form the dependent variables.

Even though the computations in MANOVA become increasingly complicated, the logic and the underlying structure remain the same. Interpreting the result of the multivariate test is a bit different. Suppose in the above example the test comes out to be significant, then we can conclude that the effect of the medicines is significantly different. Our next question would then be, whether one particular disease was significantly cured or all of them. Normally a MANOVA is followed by univariate tests for each variable to interpret their respective effects. ANOVA/MANOVA have a very wide range of applications in agriculture, medicine, in categorical data analysis where we try to compare groups, in trying to reduce the number of independent variables to a smaller number which can be modeled easily, to identify the independent variables which differentiate a set of dependent variables the most etc.

Contact Mu Sigma
info@mu-sigma.com
Site Map | Disclaimer | Privacy Policy
© 2005 - 2009 Mu Sigma. All rights reserved