2 Example 1: Dependent children

The data frame DEPEND from the PASWR2 package shows the number of dependent children (number) for 50 families (count). Use a goodness-of-fit test to see if a Poisson distribution with \(\lambda\) = 2 can reasonably be used to model the number of dependent children. The significance level is 5%.

2.1 Step 0: construct the observed and expected values for different categories

To start with, we look at the data to determine how many categories should be created.

FT <- xtabs(~number, data = DEPEND)
FT

## number
##  0  1  2  3  4  5  6 
##  9 13 13  7  4  3  1

Based on the frequency table for the number of dependent children, the cells with four, five, and six children are combined into a single cell for four or more children. After the merge, the observed number of families in each of the five categories is:

obs <- c(9, 13, 13, 7, 8)
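
The merge can also be done in code rather than by hand. The following is a minimal sketch, assuming the frequencies printed above are typed in directly; the vector name `counts` is illustrative and does not come from the source.

```r
# Frequencies for 0, 1, ..., 6 dependent children, copied from the table above
counts <- c(9, 13, 13, 7, 4, 3, 1)

# Keep the cells for 0-3 children and collapse 4, 5, and 6 into "4 or more"
obs <- c(counts[1:4], sum(counts[5:7]))
obs

# The merged counts must still account for all 50 families
sum(obs)
```

Collapsing with `sum()` keeps the total fixed at 50, which is what the expected counts are later scaled to.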

Under the null hypothesis that \(F_0(x)\) is a Poisson distribution with \(\lambda=2\), the probabilities of 0, 1, 2, 3, and 4 or more dependent children are computed with R as follows:

null.p <- c(dpois(0, 2), dpois(1, 2), dpois(2, 2),
            dpois(3, 2), ppois(3, 2, lower.tail = FALSE)) # last cell: P(X >= 4)
null.p

## [1] 0.1353353 0.2706706 0.2706706 0.1804470 0.1428765

Since there were a total of \(n = 50\) families, the expected number of families in each of the five categories is simply \(50 \times\) null.p.

EX <- 50*null.p
EX

## [1]  6.766764 13.533528 13.533528  9.022352  7.143827
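
As a quick sanity check on these values, the sketch below confirms that the five probabilities sum to 1 and compares the expected counts with and without the merge. The threshold of 5 is a widely used rule of thumb for the chi-square approximation and is an assumption here; the source does not state which criterion it used to merge cells. The names `p.unmerged` and `EX.unmerged` are illustrative.

```r
n <- 50
lambda <- 2

# Probabilities for the merged categories 0, 1, 2, 3, and 4 or more
null.p <- c(dpois(0:3, lambda), ppois(3, lambda, lower.tail = FALSE))
EX <- n * null.p

# The categories cover every possible outcome, so the probabilities sum to 1
all.equal(sum(null.p), 1)

# Expected counts if 4, 5, and "6 or more" were kept as separate cells
p.unmerged <- c(dpois(0:5, lambda), ppois(5, lambda, lower.tail = FALSE))
EX.unmerged <- n * p.unmerged
round(EX.unmerged, 2)

# Rule-of-thumb check (assumed threshold of 5): holds only after merging
all(EX >= 5)
all(EX.unmerged >= 5)
```

Under this assumed threshold, the unmerged upper cells fall short, while every merged cell has an expected count above 5.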

We are now ready to conduct the hypothesis test.

2.2 Step 1 - Hypotheses

The null and alternative hypotheses for the chi-square goodness-of-fit test of whether the number of dependent children follows a Poisson distribution with \(\lambda = 2\) are: \[ \begin{split} &H_0: F_X(x) = F_0(x) \sim Pois(\lambda = 2) \text{ for all } x \textit{ versus}\\ &H_1: F_X(x) \neq F_0(x) \text{ for some }x. \end{split}\]

2.3 Step 2 - Choosing a Test Statistic

The test statistic chosen is \(\chi^2_\text{obs} = \sum_{k=1}^5 \frac{(O_k-E_k)^2}{E_k}\).

2.4 Step 3 - Rejection Region Calculations

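Reject \(H_0\) if \(\chi^2_\text{obs} > \chi^2_{1-\alpha;K-1}\), where \(K\) denotes the number of categories. Here \(K = 5\) and \(\alpha = 0.05\); because \(\lambda\) is specified in \(H_0\) rather than estimated from the data, the reference distribution has \(K - 1 = 4\) degrees of freedom.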

QUESTION:

Find the value of the test statistic. \(\chi^2_\text{obs}\)=

Find the critical value. \(\chi^2_{1-\alpha;K-1}\)=

Find the \(p\)-value. \(p\)-value=

Using R, we can find the value of the test statistic, the critical value, and the \(p\)-value (\(P(\chi^2_4 \geq \chi^2_\text{obs})\)) as follows.

chi.obs <- sum((obs - EX)^2/EX) #value of the test statistic
chi.obs

## [1] 1.33502
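
To see where the statistic comes from, it can help to look at the contribution of each category separately. This is a minimal sketch that rebuilds `obs` and `EX` so it runs on its own; the name `contrib` and the category labels are illustrative.

```r
obs <- c(9, 13, 13, 7, 8)
EX <- 50 * c(dpois(0:3, 2), ppois(3, 2, lower.tail = FALSE))

# Contribution of each category to the chi-square statistic
contrib <- (obs - EX)^2 / EX
names(contrib) <- c("0", "1", "2", "3", "4+")
round(contrib, 4)

# The contributions add up to the test statistic
sum(contrib)
```

The largest single contribution comes from the cell for zero children, where 9 families were observed against an expected count of about 6.77.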

qchisq(0.95, 4) #critical value

## [1] 9.487729

pchisq(chi.obs, 4, lower.tail = FALSE) #p-value

## [1] 0.8554064

We can also obtain the value of the test statistic and the \(p\)-value directly with the R function chisq.test.

chisq.test(x=obs, p=null.p)

## 
##  Chi-squared test for given probabilities
## 
## data:  obs
## X-squared = 1.335, df = 4, p-value = 0.8554

In the output, X-squared gives the value of the test statistic, df is the degrees of freedom of the reference \(\chi^2\) distribution, and p-value gives the \(p\)-value.
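
The same quantities can be pulled out of the result programmatically instead of being read off the printed output. A minimal sketch, assuming the test result is stored in a variable named `res` (an illustrative name); the components used are the standard elements of the object returned by chisq.test.

```r
obs <- c(9, 13, 13, 7, 8)
null.p <- c(dpois(0:3, 2), ppois(3, 2, lower.tail = FALSE))

res <- chisq.test(x = obs, p = null.p)

res$statistic   # value of the test statistic (X-squared)
res$parameter   # degrees of freedom
res$p.value     # p-value
res$expected    # expected counts, equal to 50 * null.p
```

Storing the result this way makes it easy to reuse the statistic or the \(p\)-value in later computations.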

2.5 Step 4 - Statistical Conclusion

Do we reject the null hypothesis?

Yes No

I. Since \(\chi^2_\text{obs}=1.335\) is not greater than \(\chi^2_{0.95;4}= 9.4877\), fail to reject \(H_0\).

II. Since \(p\)-value\(=0.8554 > 0.05\), fail to reject \(H_0\).
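
Both decision rules can be written as logical comparisons in R. This is a minimal sketch that recomputes everything from scratch; the names `alpha`, `K`, `crit`, and `p.value` are illustrative.

```r
alpha <- 0.05
obs <- c(9, 13, 13, 7, 8)
EX <- 50 * c(dpois(0:3, 2), ppois(3, 2, lower.tail = FALSE))
K <- length(obs)  # number of categories

chi.obs <- sum((obs - EX)^2 / EX)
crit <- qchisq(1 - alpha, df = K - 1)
p.value <- pchisq(chi.obs, df = K - 1, lower.tail = FALSE)

# Each comparison is TRUE when H0 should be rejected
chi.obs > crit
p.value < alpha
```

The two comparisons always agree, because the \(p\)-value falls below \(\alpha\) exactly when the statistic exceeds the critical value.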

2.6 Step 5 - English Conclusion

Which of the following is the correct conclusion of our test?

There is evidence to suggest that the true cdf equals the Poisson distribution with \(\lambda = 2\) for all \(x\).
There is evidence to suggest that the true cdf equals the Poisson distribution with \(\lambda = 2\) for at least one \(x\).
There is no evidence to suggest that the true cdf does not equal the Poisson distribution with \(\lambda = 2\) for at least one \(x\).
There is no evidence to suggest that the true cdf does not equal the Poisson distribution with \(\lambda = 2\) for all \(x\).