One way ANOVA and Assumptions and F-test
Inferential Statistics from Amsterdam 2021. 6. 23. 09:54
What I want to talk about on this page is one way ANOVA, its assumptions, and the F-test for one way ANOVA.
1. One way ANOVA : Analysis of variance (known as ANOVA) allows us to compare the means of more than two groups. It's a method of analysis we use in research designs with a quantitative response variable and one or more categorical independent variables. One way ANOVA uses one response variable and one independent variable that distinguishes three or more groups.
1-1) Factor : In ANOVA, independent variables are referred to as factors. The factors are categorical variables whose categories represent the groups; these categories are also called the levels of a factor. In one way ANOVA, there's just one factor with three or more levels.
** The family wise error rate : It is the probability of falsely rejecting the null hypothesis, that is, of concluding that there's a significant effect when the null is in fact true. If we perform one test, the error rate is equal to the significance level. When we perform more than one test, the family wise error rate refers to the probability of at least one of these tests resulting in a false rejection of the null hypothesis. This probability is approximately equal to (and never larger than) the number of tests times the significance level. If we want to keep the family wise error rate at the desired significance level, we could adjust the significance level of each individual test by dividing it by the number of tests. If we apply this correction, we have less power (the probability of correctly rejecting the null hypothesis) to detect a difference between the groups.
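The arithmetic behind the family wise error rate can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the exact formula 1 - (1 - alpha)^m assumes the tests are independent, which the text above does not state.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false rejection among n_tests tests.

    Assumption for this sketch: the tests are independent.
    """
    return 1 - (1 - alpha) ** n_tests


def corrected_alpha(alpha, n_tests):
    """Per-test significance level after dividing by the number of tests."""
    return alpha / n_tests


alpha = 0.05
n_tests = 3  # e.g. three groups give three pairwise comparisons

exact = familywise_error_rate(alpha, n_tests)   # 1 - 0.95**3
approx = n_tests * alpha                        # the "number of tests times alpha" rule
after_correction = familywise_error_rate(corrected_alpha(alpha, n_tests), n_tests)

print(f"exact (independent tests): {exact:.4f}")
print(f"approximation:             {approx:.4f}")
print(f"after correction:          {after_correction:.4f}")
```

With three tests at 0.05, the approximation (0.15) sits slightly above the exact value for independent tests, and dividing the significance level by the number of tests brings the family wise rate back under 0.05.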
** ANOVA allows us to decide whether the groups are samples from the same population distribution, with one and the same mean, or whether they're from different population distributions, with different means.
1-2) How to use ANOVA? : ANOVA works based on an assumption and a trick. First, we assume the variance is the same in all the populations. If there's any difference between the populations, this should be a difference in means only. The trick is to estimate the population variance in two different ways. The first method (based on the variation within the groups) will always result in a fairly accurate and precise estimate of the population variance, whether the population means are different or the same. The second method (based on the variation between the group means) will produce a fairly accurate and precise estimate of the population variance if the means are the same, but it will overestimate the population variance if the population means differ. So we can detect a difference in means by observing a discrepancy between the two estimates of the population variance.
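The trick can be illustrated with a minimal Python sketch. The data and the function name are hypothetical, and the sketch assumes equal group sizes so that the two estimates take a simple form. The second data set is the first one with two groups shifted upward, so only the group means change.

```python
from statistics import mean, variance


def variance_estimates(groups):
    """Return (within_estimate, between_estimate) of the population variance.

    Assumption for this sketch: all groups have the same size.
    """
    n = len(groups[0])  # observations per group
    # Method 1: average of the sample variances inside each group.
    within = mean([variance(g) for g in groups])
    # Method 2: variance of the group means, scaled up by the group size.
    between = n * variance([mean(g) for g in groups])
    return within, between


# Hypothetical data: three groups that all have mean 10.
same_means = [
    [8.0, 10.0, 12.0, 10.0],
    [9.0, 11.0, 10.0, 10.0],
    [7.0, 10.0, 13.0, 10.0],
]
# Same data, but group 2 is shifted by +3 and group 3 by +6.
shifted_means = [
    same_means[0],
    [x + 3.0 for x in same_means[1]],
    [x + 6.0 for x in same_means[2]],
]

within_same, between_same = variance_estimates(same_means)
within_shifted, between_shifted = variance_estimates(shifted_means)

print(f"same means:    within={within_same:.3f}, between={between_same:.3f}")
print(f"shifted means: within={within_shifted:.3f}, between={between_shifted:.3f}")
```

Shifting the groups leaves the within-group estimate unchanged but inflates the between-group estimate. In this toy data the group means are exactly equal in the first case, so the between-group estimate is zero; in real samples the means would differ a little by chance and the two estimates would be expected to be of similar size.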
ANOVA only tells us that there is a difference somewhere; we don't know which group or groups differ and in what direction. The ratio of the between-group and within-group variances provides an overall test, equivalent to the overall test in multiple regression.
2. Assumptions and F-test
2-1) Assumptions
- Independence : The observations should be independent of each other.
- Normality : The response variable should be normally distributed in each group. Moderate violation of normality isn't problematic as long as the samples are large enough, with at least ten observations in each group.
- Homogeneity : It means the population variance is assumed to be the same for all groups. This assumption forms the basis for the trick of comparing a population variance estimate that is always accurate with an estimate that is accurate only when the means are equal, in order to detect a difference in the means. Moderate violation of homogeneity of variances is not problematic if the group sizes are equal. If the group sizes are unequal, a rule of thumb is that you can still perform analysis of variance as long as the largest standard deviation is no more than twice the size of the smallest standard deviation.
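The two rules of thumb above (at least ten observations per group, and the largest standard deviation no more than twice the smallest) can be checked with a short Python sketch. The function name, the returned keys, and the data are all made up for illustration, and the check says nothing about independence, which depends on how the data were collected.

```python
from statistics import stdev


def check_rules_of_thumb(groups, min_n=10, max_sd_ratio=2.0):
    """Check the sample-size and standard-deviation rules of thumb.

    Assumption for this sketch: every group has a nonzero standard deviation.
    """
    sizes = [len(g) for g in groups]
    sds = [stdev(g) for g in groups]
    return {
        "large_enough": min(sizes) >= min_n,            # normality rule of thumb
        "equal_sizes": len(set(sizes)) == 1,            # homogeneity is less critical if True
        "sd_ratio": max(sds) / min(sds),
        "sd_ratio_ok": max(sds) <= max_sd_ratio * min(sds),
    }


# Hypothetical data: three groups of ten observations each.
a = [float(x) for x in range(1, 11)]
b = [1.5 * x for x in a]   # standard deviation 1.5 times that of a
c = [3.0 * x for x in a]   # standard deviation 3 times that of a

ok = check_rules_of_thumb([a, b, [x + 2.0 for x in a]])
not_ok = check_rules_of_thumb([a, b, c])

print(ok)
print(not_ok)
```

In the first call the largest standard deviation is 1.5 times the smallest, so the rule of thumb is met; in the second call the ratio is 3, so it is not.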
2-2) Hypotheses : the null hypothesis states that all population means are equal. The alternative hypothesis states that at least one population mean differs from the rest. This is a nondirectional hypothesis. It specifies that there is a difference somewhere. It doesn't specify for which groups we expect the difference or in what direction.
2-3) F test
** Statistical software reports not just the F value but also the sums of squares and the mean sums of squares. A mean sum of squares is just another word for a variance estimate: a sum of squares divided by its degrees of freedom.
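** Since the F test is nondirectional, we always look in the right tail of the distribution: a difference between the means in any direction makes the between-group variance, and therefore F, larger.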
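As a rough illustration of what such software output contains, the sketch below computes the sums of squares, degrees of freedom, mean sums of squares, and F value by hand in Python. The data and the function name are hypothetical, and the sketch stops at the F value because the standard library has no F distribution for the p-value.

```python
from statistics import mean


def one_way_anova_table(groups):
    """Return the quantities of a one way ANOVA table as a dict."""
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)
    k = len(groups)          # number of groups (levels of the factor)
    n_total = len(all_obs)   # total number of observations

    # Between groups: squared distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: squared distance of each observation from its group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "df_between": df_between, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": df_within, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three groups of four observations.
scores = [
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 8.0, 9.0, 8.0],
    [10.0, 11.0, 12.0, 11.0],
]
table = one_way_anova_table(scores)
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

A large F means the between-group estimate is much larger than the within-group estimate. To turn F into a p-value you would look up the right tail of the F distribution with the two degrees of freedom; if SciPy is installed, `scipy.stats.f_oneway` should report the same F together with that p-value.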