5.3 The p-Value in Hypothesis Testing


twpvalue ()
illustrates a $ p$-value for hypothesis testing

This quantlet illustrates the concept of the $ p$-value, which is used extensively in hypothesis testing. The main idea is that if the probability of obtaining the observed value $ x$ is very small, then the assumptions used to compute this probability are most likely wrong.

More formally, suppose we observe a value $ x$ of a random variable $ X$ that follows the binomial distribution with $ n$ trials and success probability $ p$:

$\displaystyle P(X=x) = {n\choose x} \,p^x\,(1-p)^{(n-x)},\quad P(X \ge x) =
\sum^n_{i=x} {n\choose i}\, p^i\,(1-p)^{(n-i)}\,.$

Suppose we test the following null and alternative hypotheses:

$\displaystyle H_0: p = p_0\,, \quad\quad H_1: p > p_0\,.$

This is an example of a one-tailed test. That is, we rule out (usually because of peculiarities of the data) the possibility that $ p < p_0$. To test this null hypothesis, we assume that $ H_0$ is true (i.e. $ p = p_0$) and, using the formulas above, compute a probability for the observed value. It turns out that, for testing this hypothesis, $ P(X \ge x)$ is more reliable to use than $ P(X=x)$. This quantlet illustrates why this is so.
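The two probabilities above can also be computed outside the quantlet. The following Python sketch is not part of XploRe; the function names `binom_pmf` and `binom_upper_tail` and the values of $ n$, $ p_0$ and $ x$ are hypothetical choices made for illustration (they are not the quantlet's defaults).

```python
from math import comb


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_upper_tail(x, n, p):
    """P(X >= x): sum of the point probabilities from x up to n."""
    return sum(binom_pmf(i, n, p) for i in range(x, n + 1))


# Hypothetical example values, chosen only for illustration.
n, p0, x = 10, 0.5, 8

# Both probabilities are computed under H_0, i.e. with p = p0.
point_prob = binom_pmf(x, n, p0)
p_value = binom_upper_tail(x, n, p0)

print(f"P(X = {x})  under H_0: {point_prob:.4f}")
print(f"P(X >= {x}) under H_0: {p_value:.4f}")
```

For the one-tailed alternative $ H_1: p > p_0$, the second number is the $ p$-value; it adds to $ P(X=x)$ the probabilities of all outcomes that are even more extreme in the direction of $ H_1$.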

To activate this quantlet, the following should be typed in:

  twpvalue()
After this, the user should see the following window:

[Figure: input window for $ n$, $ p_0$, and $ x$]

Here, the user is asked to input the number of Bernoulli trials ($ n$), the null hypothesis probability of success in the Bernoulli trials ($ p_0$), and the observed binomial value ($ x$). The values above are the default values. After choosing the desired values, clicking on the OK button results in the window below, asking which probability should be visually displayed.


[Figure: window for selecting which probability to display]

After making a selection, the next window is displayed (here, for the choice $ P(X=5)$):


[Figure: bar chart of the binomial distribution with the selected probability highlighted]

On the computer screen, the desired probability from the last window will be represented in black, with the rest of the distribution portrayed in red. In the above window, the second bar from the right, representing $ P(X=5)$, will be in black.

One intention of this quantlet is to demonstrate why the $ p$-value is $ P(X \ge x)$ and not $ P(X=x)$. Indeed, through experimenting, the user can observe that $ P(X=x)$ depends strongly on $ n$, and becomes small even when there is clearly no strong evidence against $ H_0$ (for example, when $ x$ is close to its expected value $ n p_0$). $ P(X \ge x)$, on the other hand, is stable for increasing $ n$ in such cases, and stays large when there is no strong evidence against $ H_0$.
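This behaviour can be checked numerically. The Python sketch below is an illustration only, not the quantlet's code. It assumes $ p_0 = 0.5$ and sets the observed value to $ x = n/2$, so that the data agree with $ H_0$ as well as possible; the helper names are hypothetical. Point probabilities are evaluated on the log scale to avoid numerical underflow for large $ n$.

```python
from math import exp, lgamma, log


def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p), 0 < p < 1, computed via logarithms."""
    log_coeff = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_coeff + x * log(p) + (n - x) * log(1 - p))


def binom_upper_tail(x, n, p):
    """P(X >= x): sum of the point probabilities from x up to n."""
    return sum(binom_pmf(i, n, p) for i in range(x, n + 1))


p0 = 0.5                  # assumed null value, for illustration
sizes = [10, 100, 1000]   # increasing numbers of Bernoulli trials

point_probs = []
tail_probs = []
for n in sizes:
    x = n // 2            # observed value equal to its expectation n * p0
    point_probs.append(binom_pmf(x, n, p0))
    tail_probs.append(binom_upper_tail(x, n, p0))

for n, pt, tail in zip(sizes, point_probs, tail_probs):
    print(f"n = {n:5d}   P(X = n/2) = {pt:.4f}   P(X >= n/2) = {tail:.4f}")
```

In this setting $ P(X=x)$ shrinks steadily as $ n$ grows, although the observation is exactly what $ H_0$ predicts, while $ P(X \ge x)$ stays above one half.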

Additionally, for testing the null hypothesis that $ p = p_0 \ll n$ (i.e. for a given $ p_0$ considerably smaller than $ n$), this quantlet demonstrates that as the observed value $ x$ becomes larger, the $ p$-value decreases (i.e. evidence against $ H_0$ becomes stronger), and that when the hypothesized value $ p_0$ becomes larger, the $ p$-value becomes larger (i.e. evidence against $ H_0$ becomes weaker).
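Both monotonicity properties can be verified with a few lines of Python. This is a sketch under assumed values ($ n = 20$, with hypothetical grids for $ x$ and $ p_0$), not a reproduction of the quantlet.

```python
from math import comb


def binom_upper_tail(x, n, p):
    """p-value P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x, n + 1))


n = 20  # assumed number of trials, for illustration

# 1) Fixed p0, increasing observed value x: the p-value decreases.
p0_fixed = 0.3
x_values = [4, 6, 8, 10, 12]
pvals_by_x = [binom_upper_tail(x, n, p0_fixed) for x in x_values]

# 2) Fixed x, increasing hypothesized p0: the p-value increases.
x_fixed = 8
p0_values = [0.1, 0.3, 0.5, 0.7]
pvals_by_p0 = [binom_upper_tail(x_fixed, n, p0) for p0 in p0_values]

for x, pv in zip(x_values, pvals_by_x):
    print(f"p0 = {p0_fixed}, x = {x:2d}: p-value = {pv:.4f}")
for p0, pv in zip(p0_values, pvals_by_p0):
    print(f"x = {x_fixed}, p0 = {p0}: p-value = {pv:.4f}")
```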