Call us toll-free

How to Determine a p-Value When Testing a Null Hypothesis

A small -value (typically ≤ 0.05) indicates strong evidence against the null hypothesis, so you reject the null hypothesis.

Approximate price


275 Words


which is equivalent to rejecting the null hypothesis:

False. A small p-value means the value of the statisticwe observed in the sample is unlikely to have occurred when thenull hypothesis is true. Hence, a .04 p-value means it is evenmore unlikely the observed statistic would have occurred when the nullhypothesis is true than a .08 p-value. The smaller thep-value, the stronger the evidenceagainst the null hypothesis.

Let's return finally to the question of whether we reject or fail to reject the null hypothesis.

Here are three experiments to illustrate when the different approaches to statistics are appropriate. In the first experiment, you are testing a plant extract on rabbits to see if it will lower their blood pressure. You already know that the plant extract is a diuretic (makes the rabbits pee more) and you already know that diuretics tend to lower blood pressure, so you think there's a good chance it will work. If it does work, you'll do more low-cost animal tests on it before you do expensive, potentially risky human trials. Your prior expectation is that the null hypothesis (that the plant extract has no effect) has a good chance of being false, and the cost of a false positive is fairly low. So you should do frequentist hypothesis testing, with a significance level of 0.05.

which is equivalent to rejecting the null hypothesis:

A Bayesian would insist that you put in numbers just how likely you think the null hypothesis and various values of the alternative hypothesis are, before you do the experiment, and I'm not sure how that is supposed to work in practice for most experimental biology. But the general concept is a valuable one: as Carl Sagan summarized it, "Extraordinary claims require extraordinary evidence."

In the second experiment, you are going to put human volunteers with high blood pressure on a strict low-salt diet and see how much their blood pressure goes down. Everyone will be confined to a hospital for a month and fed either a normal diet, or the same foods with half as much salt. For this experiment, you wouldn't be very interested in the P value, as based on prior research in animals and humans, you are already quite certain that reducing salt intake will lower blood pressure; you're pretty sure that the null hypothesis that "Salt intake has no effect on blood pressure" is false. Instead, you are very interested to know how much the blood pressure goes down. Reducing salt intake in half is a big deal, and if it only reduces blood pressure by 1 mm Hg, the tiny gain in life expectancy wouldn't be worth a lifetime of bland food and obsessive label-reading. If it reduces blood pressure by 20 mm with a confidence interval of ±5 mm, it might be worth it. So you should estimate the effect size (the difference in blood pressure between the diets) and the confidence interval on the difference.

which is equivalent to rejecting the null hypothesis:

Now instead of testing 1000 plant extracts, imagine that you are testing just one. If you are testing it to see if it kills beetle larvae, you know (based on everything you know about plant and beetle biology) there's a pretty good chance it will work, so you can be pretty sure that a P value less than 0.05 is a true positive. But if you are testing that one plant extract to see if it grows hair, which you know is very unlikely (based on everything you know about plants and hair), a P value less than 0.05 is almost certainly a false positive. In other words, if you expect that the null hypothesis is probably true, a statistically significant result is probably a false positive. This is sad; the most exciting, amazing, unexpected results in your experiments are probably just your data trying to make you jump to ridiculous conclusions. You should require a much lower P value to reject a null hypothesis that you think is probably true.

A related criticism is that a significant rejection of a null hypothesis might not be biologically meaningful, if the difference is too small to matter. For example, in the chicken-sex experiment, having a treatment that produced 49.9% male chicks might be significantly different from 50%, but it wouldn't be enough to make farmers want to buy your treatment. These critics say you should estimate the effect size and put a on it, not estimate a P value. So the goal of your chicken-sex experiment should not be to say "Chocolate gives a proportion of males that is significantly less than 50% (P=0.015)" but to say "Chocolate produced 36.1% males with a 95% confidence interval of 25.9 to 47.4%." For the chicken-feet experiment, you would say something like "The difference between males and females in mean foot size is 2.45 mm, with a confidence interval on the difference of ±1.98 mm."

Order now
  • the null hypothesis is rejected when it is true b.

    A large -value ( 0.05) indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.

  • the null hypothesis is not rejected when it is false c.

    m) Ifyou get a p-value of 0.13, it means thatthe null hypothesis is true in 13% of all samples.

  • the research hypothesis is rejected when it is true d.

    This is the critical value that we need if we want to reject the null hypothesis.

Order now

the research hypothesis is not rejected when it is false722-1

This criticism only applies to two-tailed tests, where the null hypothesis is "Things are exactly the same" and the alternative is "Things are different." Presumably these critics think it would be okay to do a one-tailed test with a null hypothesis like "Foot length of male chickens is the same as, or less than, that of females," because the null hypothesis that male chickens have smaller feet than females could be true. So if you're worried about this issue, you could think of a two-tailed test, where the null hypothesis is that things are the same, as shorthand for doing two one-tailed tests. A significant rejection of the null hypothesis in a two-tailed test would then be the equivalent of rejecting one of the two one-tailed null hypotheses.

failing to reject the null hypothesis when it is false.

In the olden days, when people looked up P values in printed tables, they would report the results of a statistical test as "PPP>0.10", etc. Nowadays, almost all computer statistics programs give the exact P value resulting from a statistical test, such as P=0.029, and that's what you should report in your publications. You will conclude that the results are either significant or they're not significant; they either reject the null hypothesis (if P is below your pre-determined significance level) or don't reject the null hypothesis (if P is above your significance level). But other people will want to know if your results are "strongly" significant (P much less than 0.05), which will give them more confidence in your results than if they were "barely" significant (P=0.043, for example). In addition, other researchers will need the exact P value if they want to combine your results with others into a .

failing to reject the null hypothesis when it is true.

Because the test statistic Z = −1.92 > −2.33, we do not reject the null hypothesis. There is insufficient evidence at the α = 0.01 level to conclude that the rate has been reduced.

rejecting the null hypothesis when it is true.

The probability that was calculated above, 0.030, is the probability of getting 17 or fewer males out of 48. It would be significant, using the conventional PP=0.03 value found by adding the probabilities of getting 17 or fewer males. This is called a one-tailed probability, because you are adding the probabilities in only one tail of the distribution shown in the figure. However, if your null hypothesis is "The proportion of males is 0.5", then your alternative hypothesis is "The proportion of males is different from 0.5." In that case, you should add the probability of getting 17 or fewer females to the probability of getting 17 or fewer males. This is called a two-tailed probability. If you do that with the chicken result, you get P=0.06, which is not quite significant.

Order now
  • Kim

    "I have always been impressed by the quick turnaround and your thoroughness. Easily the most professional essay writing service on the web."

  • Paul

    "Your assistance and the first class service is much appreciated. My essay reads so well and without your help I'm sure I would have been marked down again on grammar and syntax."

  • Ellen

    "Thanks again for your excellent work with my assignments. No doubts you're true experts at what you do and very approachable."

  • Joyce

    "Very professional, cheap and friendly service. Thanks for writing two important essays for me, I wouldn't have written it myself because of the tight deadline."

  • Albert

    "Thanks for your cautious eye, attention to detail and overall superb service. Thanks to you, now I am confident that I can submit my term paper on time."

  • Mary

    "Thank you for the GREAT work you have done. Just wanted to tell that I'm very happy with my essay and will get back with more assignments soon."

Ready to tackle your homework?

Place an order