Hypothesis Testing  Chi Squared Test
Five Steps in a Hypothesis Test
Significance Tests / Hypothesis Testing
The null and research hypotheses are written in words rather than in symbols. The research hypothesis is that the grouping variable (living arrangement) and the outcome variable (exercise) are dependent or related.
We must assess whether the sample size is adequate. Specifically, we need to check min(np_{0}, np_{1,} ..., n p_{k}) > 5. The sample size here is n=3,326 and the proportions specified in the null hypothesis are 0.02, 0.39, 0.36 and 0.23. Thus, min( 3326(0.02), 3326(0.39), 3326(0.36), 3326(0.23))=min(66.5, 1297.1, 1197.4, 765.0)=66.5. The sample size is more than adequate, so the formula can be used.
Null hypothesis: μ = 72 Alternative hypothesis: μ ≠72
We now compute the expected frequencies using the sample size and the proportions specified in the null hypothesis. We then substitute the sample data (observed frequencies) into the formula for the test statistic identified in Step 2. We organize the computations in the following table.
In the second experiment, you are going to put human volunteers with high blood pressure on a strict lowsalt diet and see how much their blood pressure goes down. Everyone will be confined to a hospital for a month and fed either a normal diet, or the same foods with half as much salt. For this experiment, you wouldn't be very interested in the P value, as based on prior research in animals and humans, you are already quite certain that reducing salt intake will lower blood pressure; you're pretty sure that the null hypothesis that "Salt intake has no effect on blood pressure" is false. Instead, you are very interested to know how much the blood pressure goes down. Reducing salt intake in half is a big deal, and if it only reduces blood pressure by 1 mm Hg, the tiny gain in life expectancy wouldn't be worth a lifetime of bland food and obsessive labelreading. If it reduces blood pressure by 20 mm with a confidence interval of ±5 mm, it might be worth it. So you should estimate the effect size (the difference in blood pressure between the diets) and the confidence interval on the difference.
Null hypothesis: μ = 72 Alternative hypothesis: μ ≠72
Here are three experiments to illustrate when the different approaches to statistics are appropriate. In the first experiment, you are testing a plant extract on rabbits to see if it will lower their blood pressure. You already know that the plant extract is a diuretic (makes the rabbits pee more) and you already know that diuretics tend to lower blood pressure, so you think there's a good chance it will work. If it does work, you'll do more lowcost animal tests on it before you do expensive, potentially risky human trials. Your prior expectation is that the null hypothesis (that the plant extract has no effect) has a good chance of being false, and the cost of a false positive is fairly low. So you should do frequentist hypothesis testing, with a significance level of 0.05.
A Bayesian would insist that you put in numbers just how likely you think the null hypothesis and various values of the alternative hypothesis are, before you do the experiment, and I'm not sure how that is supposed to work in practice for most experimental biology. But the general concept is a valuable one: as Carl Sagan summarized it, "Extraordinary claims require extraordinary evidence."
The formula for the test statistic is:

The test statistic is computed as follows:
The two competing statements about a population are called the null hypothesis and the alternative hypothesis.

We presented the following approach to the test using a Z statistic.
In general, the smaller the pvalue the stronger the evidence is in favor of the alternative hypothesis.

The formula for the test statistic is:
Pulse rates for n = 35 women are available. Here are Minitab results for our hypothesis test:
The test statistic is computed as follows:
Now instead of testing 1000 plant extracts, imagine that you are testing just one. If you are testing it to see if it kills beetle larvae, you know (based on everything you know about plant and beetle biology) there's a pretty good chance it will work, so you can be pretty sure that a P value less than 0.05 is a true positive. But if you are testing that one plant extract to see if it grows hair, which you know is very unlikely (based on everything you know about plants and hair), a P value less than 0.05 is almost certainly a false positive. In other words, if you expect that the null hypothesis is probably true, a statistically significant result is probably a false positive. This is sad; the most exciting, amazing, unexpected results in your experiments are probably just your data trying to make you jump to ridiculous conclusions. You should require a much lower P value to reject a null hypothesis that you think is probably true.
The formula for the test statistic is:
Now imagine that you are testing those extracts from 1000 different tropical plants to try to find one that will make hair grow. The reality (which you don't know) is that one of the extracts makes hair grow, and the other 999 don't. You do the 1000 experiments and do the 1000 frequentist statistical tests, and you use the traditional significance level of PPPP values less than 0.05, but almost all of them are false positives.
The test statistic is computed as follows:
Having said that, there's one key concept from Bayesian statistics that is important for all users of statistics to understand. To illustrate it, imagine that you are testing extracts from 1000 different tropical plants, trying to find something that will kill beetle larvae. The reality (which you don't know) is that 500 of the extracts kill beetle larvae, and 500 don't. You do the 1000 experiments and do the 1000 frequentist statistical tests, and you use the traditional significance level of PPPP value, after all), so you have 25 false positives. So you end up with 525 plant extracts that gave you a P value less than 0.05. You'll have to do further experiments to figure out which are the 25 false positives and which are the 500 true positives, but that's not so bad, since you know that most of them will turn out to be true positives.
We now substitute to compute the test statistic.
In the module on hypothesis testing for means and proportions, we discussed hypothesis testing applications with a dichotomous outcome variable in a single population. We presented a test using a test statistic Z to test whether an observed (sample) proportion differed significantly from a historical or external comparator. The chisquare goodnessoffit test can also be used with a dichotomous outcome and the results are mathematically equivalent.