Basic Statistics (confidence intervals, sample size, and hypothesis testing)
CHAPTER 8 (Confidence Intervals and Sample Size Calculations) & CHAPTER 9 (Hypothesis Testing)
Chapter 8 & 9 Cheat Sheet
Case A: Observations are normally distributed or n > 30, AND σ is known
Confidence Interval:
$$\bar{x} \pm z\,\sigma_{\bar{x}} = \bar{x} \pm z\,\frac{\sigma}{\sqrt{n}}$$
Hypothesis test:
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
If you don't know if the observations are normally distributed, you can do a normal probability plot to find out.
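As a rough illustration of the normal probability plot mentioned above, the sketch below pairs each sorted observation with a theoretical normal quantile and reports how close the pairs are to a straight line. The function names and the Blom plotting positions (i - 0.375)/(n + 0.25) are assumptions made for this example (other conventions exist); the data are the supermarket spending values from Chapter 9 homework problem 6.

```python
from math import sqrt
from statistics import NormalDist, mean


def normal_plot_points(data):
    """Pair each theoretical normal quantile with the matching sorted observation."""
    xs = sorted(data)
    n = len(xs)
    # Blom plotting positions; an assumed convention, not prescribed by the handout.
    qs = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return list(zip(qs, xs))


def correlation(pairs):
    """Pearson correlation of the (quantile, observation) pairs."""
    qs = [q for q, _ in pairs]
    xs = [x for _, x in pairs]
    q_bar, x_bar = mean(qs), mean(xs)
    sxy = sum((q - q_bar) * (x - x_bar) for q, x in pairs)
    sqq = sum((q - q_bar) ** 2 for q in qs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sqrt(sqq * sxx)


spending = [88, 69, 141, 28, 106, 45, 32, 51, 78, 54, 110, 83]
points = normal_plot_points(spending)
r = correlation(points)
print(f"straightness of the plot (correlation): {r:.3f}")
```

Plotting the pairs in `points` gives the normal probability plot itself; a correlation close to 1 means the points fall near a straight line, which is consistent with normally distributed observations.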
Sample size determination:
$$n = \left(\frac{z\,\sigma}{E}\right)^2$$
Z-Test & ZInterval
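The Case A formulas can be sketched in Python using only the standard library. The function names are assumptions chosen for this example; the test numbers come from Chapter 9 homework problem 4 (σ = 312, n = 400, sample mean 2365, hypothesized mean 2320), while the 95% interval and the margin of error E = 25 are hypothetical inputs chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(conf):
    """Two-sided critical z for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def z_interval(xbar, sigma, n, conf):
    """Confidence interval: xbar +/- z * sigma / sqrt(n)."""
    margin = z_critical(conf) * sigma / sqrt(n)
    return xbar - margin, xbar + margin


def z_statistic(xbar, mu0, sigma, n):
    """Test statistic: (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))


def sample_size_mean(sigma, E, conf):
    """Unrounded sample size: (z * sigma / E) ** 2."""
    return (z_critical(conf) * sigma / E) ** 2


# Chapter 9 homework problem 4 (right-tailed test, alpha = 0.02).
z = z_statistic(xbar=2365, mu0=2320, sigma=312, n=400)
p_value = 1 - NormalDist().cdf(z)
reject = p_value < 0.02

# Hypothetical follow-ups: a 95% interval and the n needed for E = 25 sq ft.
low, high = z_interval(2365, 312, 400, 0.95)
n_needed = sample_size_mean(312, 25, 0.95)

print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
print(f"95% CI: ({low:.2f}, {high:.2f}); unrounded n for E = 25: {n_needed:.1f}")
```

The sample size is returned unrounded so that the course's own rounding convention can be applied afterwards.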
Case B: Observations are normally distributed or n > 30, AND σ is not known
Confidence Interval:
$$\bar{x} \pm t_{n-1}\,s_{\bar{x}} = \bar{x} \pm t_{n-1}\,\frac{s}{\sqrt{n}}$$
Hypothesis test:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
T-Test & TInterval
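A minimal sketch of the Case B calculations follows. Python's standard library has no t distribution, so the critical values are passed in by hand; the values 1.753 and 2.131 (15 degrees of freedom) are the ones quoted in the Chapter 9 answer key for problem 5, whose summary statistics (n = 16, sample mean 42.4, s = 8, hypothesized mean 46) are used here. The function names are assumptions made for this example.

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """Test statistic: (xbar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (xbar - mu0) / (s / sqrt(n))


def t_interval(xbar, s, n, t_crit):
    """Confidence interval: xbar +/- t_crit * s / sqrt(n); t_crit comes from a table."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


# Chapter 9 homework problem 5, alpha = 0.05.
t = t_statistic(xbar=42.4, mu0=46, s=8, n=16)
reject_left_tailed = t < -1.753      # part a: H1 is mu < 46
reject_two_tailed = abs(t) > 2.131   # part b: H1 is mu not equal to 46

# The matching 95% interval uses the same two-tailed critical value.
low, high = t_interval(42.4, 8, 16, t_crit=2.131)

print(f"t = {t:.3f}")
print(f"left-tailed: reject H0 = {reject_left_tailed}")
print(f"two-tailed:  reject H0 = {reject_two_tailed}")
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

The same statistic leads to different decisions in the two parts because the two-tailed test splits α between both tails, which pushes the critical value further out.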
Case C: Binomial observations with np > 5 and n(1-p) > 5
Confidence Interval:
$$\hat{p} \pm z\,\sigma_{\hat{p}} = \hat{p} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Hypothesis Test:
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$
Sample size calculation:
$$n = \left(\frac{z}{E}\right)^2 p_0(1-p_0)$$
or
$$n = \left(\frac{z}{E}\right)^2 \hat{p}(1-\hat{p})$$
or
$$n = \frac{1}{4}\left(\frac{z}{E}\right)^2$$
1-PropZTest & 1-PropZInt
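The Case C formulas can be sketched the same way with the standard library. The function names are assumptions made for this example; the inputs are taken from the homework (Chapter 8 problems 7 and 8, Chapter 9 problem 8). Sample sizes are left unrounded so the course's rounding convention can be applied afterwards.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(conf):
    """Two-sided critical z for a confidence level such as 0.99."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def prop_interval(successes, n, conf):
    """Confidence interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z_critical(conf) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def prop_z_statistic(p_hat, p0, n):
    """Test statistic: (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def sample_size_prop(E, conf, p=0.5):
    """Unrounded n = (z / E)**2 * p * (1 - p); p = 0.5 gives the conservative 1/4 case."""
    return (z_critical(conf) / E) ** 2 * p * (1 - p)


# Chapter 8 problem 8a: 19 of 50 homeowners, 99% interval.
low, high = prop_interval(19, 50, 0.99)

# Chapter 9 problem 8: claimed p0 = 0.80, sample proportion 0.75, n = 40.
z = prop_z_statistic(0.75, 0.80, 40)

# Chapter 8 problem 7: E = 0.045 at 98% confidence.
n_prelim = sample_size_prop(0.045, 0.98, p=0.53)
n_conservative = sample_size_prop(0.045, 0.98)

print(f"99% CI for p: ({low:.3f}, {high:.3f})")
print(f"z = {z:.3f}")
print(f"unrounded n: {n_prelim:.1f} (p = 0.53), {n_conservative:.1f} (conservative)")
```

Setting p = 0.5 maximizes p(1 - p), which is why the third sample size formula is the most conservative of the three.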
Margin of Error = MOE = E, the amount added to and subtracted from the point estimate; traditionally quoted for a 95% confidence interval
Homework Chapter 8
7. a. How large a sample should be selected so that the margin of error of estimate for a 98% confidence interval for p is 0.045 when the value of the sample proportion obtained from a preliminary sample is .53?
b. Find the most conservative sample size that will produce a margin of error of 0.045 for a 98% confidence interval for p.
8. In a random sample of 50 homeowners selected from a large suburban area, 19 said that they had serious problems with excessive noise from their neighbors.
a. Make a 99% confidence interval for the percentage of all homeowners in this suburban area who have such problems.
b. Suppose the confidence interval obtained in part (a) is too wide. How can the width of this interval be reduced? Discuss all possible alternatives. Which option is best?
9. The management of a health insurance company wants to know the percentage of its policyholders who have tried alternative treatments (such as acupuncture, herbal therapy, etc.). A random sample of 24 of the company’s policyholders were asked whether or not they have ever tried such treatments. The following are their responses:
Yes No No Yes No Yes No No
No Yes No No Yes No Yes No
No No Yes No No No Yes No
a. What is the point estimate of the corresponding population proportion?
b. Construct a 99% confidence interval for the percentage of this company’s policyholders who have tried alternative treatments.
10. A hospital administration wants to estimate the mean time spent by patients waiting for treatment at the emergency room. The waiting times (in minutes) recorded for a random sample of 32 such patients are given below:
110 42 88 19 35 76 10 151
2 44 27 77 53 102 66 39
20 108 92 55 14 52 3 62
78 15 60 121 40 35 11 72
Construct a 98% confidence interval for the corresponding population mean.
Answers (assumes you are using the calculator):
7. a. 666 b. 668
8. a. (.203, .557)
9. a. .333 b. (.085, .581)
10. (39.324, 71.863)
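For readers working without the calculator, the answer to problem 10 can be reproduced from the raw waiting times with a short Python sketch. It treats the problem as Case B (σ unknown). The critical value 2.453 is an assumption read from a standard t table (31 degrees of freedom, 98% confidence), because Python's standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Waiting times (minutes) from Chapter 8 homework problem 10.
waits = [
    110, 42, 88, 19, 35, 76, 10, 151,
    2, 44, 27, 77, 53, 102, 66, 39,
    20, 108, 92, 55, 14, 52, 3, 62,
    78, 15, 60, 121, 40, 35, 11, 72,
]

n = len(waits)
xbar = mean(waits)   # point estimate of the population mean
s = stdev(waits)     # sample standard deviation (n - 1 in the denominator)

# Assumed table value: t with 31 degrees of freedom, 0.01 in each tail.
t_crit = 2.453

margin = t_crit * s / sqrt(n)
low, high = xbar - margin, xbar + margin

print(f"n = {n}, mean = {xbar:.3f}, s = {s:.3f}")
print(f"98% CI: ({low:.3f}, {high:.3f})")
```

Because the table value is rounded to three decimals, the endpoints can differ from the calculator's TInterval output in the third decimal place.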
Homework Chapter 9
1. Write the null and alternative hypotheses for each of the following examples. Determine if each is a case of a 2-tailed, left-tailed, or right-tailed test.
a. To test if the mean number of hours spent working per week by college students who hold jobs is different from 20 hrs.
b. To test whether or not a bank’s ATM is out of service for an average of more than 10 hr per month.
c. To test if the mean length of experience of airport security guards is different from 3 years.
d. To test if the mean credit card debt of college seniors is less than $1000.
e. To test if the mean time a customer has to wait on the phone to speak to a representative of a mail-order company about unsatisfactory service is more than 12 min.
2. For each of the following examples of tests of hypothesis about μ, show the rejection and nonrejection regions on the sampling distribution of the sample mean, assuming it is normal (use t):
a. a 2-tailed test with α = .01 and n = 100
b. a left-tailed test with α = .005 and n = 27
c. a right-tailed test with α = .025 and n = 36
3. A random sample of 28 observations produced a sample mean of 15. Find the critical and observed values of z for each of the following tests of hypotheses using α = .01. It is known that the population has a normal distribution with σ = 4.
a. H0: μ = 20 vs. H1: μ < 20
b. H0: μ = 20 vs. H1: μ ≠ 20
4. According to data from the National Association of Home Builders, the average size of new homes in the United States was 2320 sq ft in 2002. Suppose a recent random sample of 400 new homes produced a mean size of 2365 sq ft. The population standard deviation of the sizes of new homes is known to be 312 sq ft.
a. Find the p-value for the hypothesis test with the alternative hypothesis that the current mean size of all new homes in the United States exceeds 2320 sq ft. Will you reject the null hypothesis at α = .02?
b. Test the hypothesis of part (a) using the critical value approach and α = .02.
5. A random sample of 16 observations taken from a population that is normally distributed produced a sample mean of 42.4 and a standard deviation of 8. Test each of the following hypotheses using α = .05.
a. Null Hypothesis: μ = 46 vs. Alternative Hypothesis: μ < 46
b. Null: μ = 46 vs. Alternative: μ ≠ 46
6. The past records of a supermarket show that its customers spend an average of $65 per visit at this store. Recently the management of the store initiated a promotional campaign according to which each customer receives points based on the total money spent at the store, and these points can be used to buy products at the store. The management expects that as a result of this campaign, the customers should be encouraged to spend more money at the store. To check whether this is true, the manager of the store took a sample of 12 customers who visited the store. The following data gives the money (in dollars) spent by these customers at this supermarket during their visits:
88 69 141 28 106 45 32 51 78 54 110 83
Assume that the money spent by all customers at this supermarket has a normal distribution. Using the 1% significance level, can you conclude that the mean amount of money spent by all customers at this supermarket after the campaign was started is more than $65?
7. For each of the following examples of tests of hypotheses about the population proportion, show the rejection and nonrejection regions on the graph of the sampling distribution of the sample proportion:
a. a 2-tailed test with α = .10
b. a left-tailed test with α = .01
c. a right-tailed test with α = .05
8. Mong Corporation makes auto batteries. The company claims that 80% of its LL70 batteries are good for 70 months or longer. A consumer agency wanted to check if this claim is true. The agency took a random sample of 40 such batteries and found that 75% of them were good for 70 months or longer.
a. Using the 1% significance level, can you conclude that the company’s claim is false?
b. What will your decision be in part (a) if the probability of making a Type I error is zero? Explain.
Answers (assumes you are using the calculator):
1. a. 2-tailed b. right-tailed c. 2-tailed d. left-tailed e. right-tailed
2. a. reject to left of -2.626 or to right of 2.626
b. reject to left of -2.779
c. reject to right of 2.030
3. a. calculated z=-6.61 critical z=-2.326
b. calculated z=-6.61 critical z= +/- 2.576
4. a. reject null b. reject null
5. a. critical t=-1.753 calculated t=-1.800
b. critical t = +/- 2.131 calculated t=-1.800
6. do not reject null
7. a. reject null if z outside of +/-1.645
b. reject null if z < -2.326
c. reject null if z > 1.645
8. do not reject null
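Two of the answers above can be checked by hand with a short Python sketch: problem 3 (Case A, σ known) and problem 6 (Case B, raw data, σ unknown). The variable names are assumptions made for this example, and the critical value 2.718 for problem 6 is an assumption read from a standard t table (11 degrees of freedom, 0.01 in the right tail).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Problem 3: n = 28, sample mean 15, sigma = 4, H0: mu = 20, alpha = 0.01.
z3 = (15 - 20) / (4 / sqrt(28))
crit_left = NormalDist().inv_cdf(0.01)           # part a, left-tailed
crit_two = NormalDist().inv_cdf(1 - 0.01 / 2)    # part b, use +/- this value
reject_3a = z3 < crit_left
reject_3b = abs(z3) > crit_two

# Problem 6: spending data, H0: mu = 65 vs. H1: mu > 65, alpha = 0.01.
spending = [88, 69, 141, 28, 106, 45, 32, 51, 78, 54, 110, 83]
n6 = len(spending)
t6 = (mean(spending) - 65) / (stdev(spending) / sqrt(n6))
t_crit_6 = 2.718   # assumed table value, 11 degrees of freedom
reject_6 = t6 > t_crit_6

print(f"3. z = {z3:.2f}, critical z = {crit_left:.3f} (a), +/-{crit_two:.3f} (b)")
print(f"   reject H0: a = {reject_3a}, b = {reject_3b}")
print(f"6. t = {t6:.3f}, critical t = {t_crit_6}, reject H0: {reject_6}")
```

In problem 6 the sample mean is above $65, but the observed t falls well short of the critical value, so the data do not support the claim at the 1% level.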
Project #3: Chapters 8-9
Collect a random sample for any quantitative variable for 30 members as you did for the first 2 projects. Choose a different project this time, however.
Once you have collected the data, answer the following questions:
a. List the data
b. What is the target population for the sample you used? Be specific.
c. Describe any problems you had in collecting the data and what impact it had on your sample.
d. Were any of the data values in the population unusable? If yes, explain why.
e. Construct a confidence interval of your choosing on this data.
f. Construct a hypothesis test for your data.
1. What is your significance level and why?
2. Why did you choose your particular alternative hypothesis?
3. What is your conclusion?
Notes:
1. This project is voluntary. If you do it, the possible points will be added to your total possible points. If you do not do it, you will have fewer possible points than those who do.
2. You may work on this project in groups with everyone in each group getting the score for the project. Groups can be of any size, from 1 to 5 people.
Chapters 8-9
1. The following data give the annual incomes (in thousands of dollars) before taxes for a sample of 36 randomly selected families in Ann Arbor:
21.6 50.1 91.2 92.8 40.6 96.3
33.0 21.5 57.0 79.4 69.0 44.5
25.6 70.0 72.2 45.3 75.5 84.0
37.9 72.8 45.0 76.0 57.5 43.0
50.0 58.2 95.0 48.6 49.7 61.7
148.1 85.4 27.8 69.3 75.1 126.0
a. What is the point estimate of μ?
b. Construct a 95% confidence interval for μ.
c. What sample size would you have to collect so that the margin of error (E) is $5000 for a 95% confidence interval? Assume that the sample standard deviation equals the population standard deviation.
d. You want to determine if the mean annual income is less than $75,000. Conduct a hypothesis test using α = 0.05.
2. A 2003 article in Money magazine, Affluent Americans and Their Money, surveyed adults of all income levels, and 85% agreed with the statement: “Money doesn’t buy happiness, but it helps.” In a recent random sample of 1200 adults, 776 agreed with that statement.
Based on this recent survey:
a. What is the point estimate of the population proportion p?
b. Construct a 99% confidence interval on p.
c. What sample size would you recommend so that the Margin of Error (E) is 0.01 for a 99% confidence interval?
d. At the 1% level of significance, can you conclude that the current percentage of all adults who agree with this statement is less than 85%?