Business Statistics

profileMike95
DSCI_3710_Excel_Quiz_4_Instructions.doc

DSCI 3710 Excel Assignment #4

Food Lion Inc. would like to investigate the feasibility and future prospects of setting up stores in Denton County. Food Lion has provided you with a sample from a database of household financial variables. They would like you to use regression techniques to predict the monthly expenditure on groceries, of families who either rent or own their homes in Denton. The sample contains 100 records. The various fields in the sample are:

Income1 ( Annual income of head of household or primary wage earner

Income2 ( Annual income of secondary wage earner

Famlsize ( Size of family (number of people permanently residing in the household)

Ownorent ( 1 if household is owned; 0 if it is rented

Autodebt ( Automobile related debt pending for wage earners in the household

Hpayrent ( Household mortgage payment or rent per month

Groc ( Monthly expenditure on groceries

Loc ( 1 = East Denton (E); 0 = West Denton(W); -1 = North Denton (N); 2 = South Denton (S)

For this assignment you will be given minimal instructions since, having completed the first three assignments, you should now be quite familiar with Excel functions and pull down menus.

1. The sample data are contained in a file named Assgt#4.xls to be found in the folder Excel Assignments for All Sections. Download the files from the folder onto a disk.

2. Import the data into the first nine columns (A-I) of your spreadsheet. The first row contains labels for the variables in the order outlined in the data description. The table below illustrates how the first 10 data records should look, after this step is completed.

Obs

Income1

Income2

Famlsize

Ownorent

Autodebt

Hpayrent

Groc

Loc

1

36557

20610

4

0

15290

1339

278

0

2

27045

25490

5

0

14676

1175

220

0

3

38878

0

8

0

10317

1108

456

0

4

41448

0

2

1

9504

729

253

0

5

33136

25300

6

1

17802

875

344

0

6

44308

24559

2

1

20537

1282

326

-1

7

31997

25419

4

1

11725

919

308

1

8

43437

0

6

1

12084

970

311

2

9

41625

22802

4

1

14863

945

305

0

10

40140

32158

5

0

15708

1470

432

2

3. Insert three columns prior to the column labeled “Groc”. In these columns, create three logical (dummy) variables to represent the four locations. You may use the IF function from the pull down menu, to do this. Please follow the illustration below, where: ED=1, if record is for East Denton; WD=1, if record is for West Denton; and SD=1, if the record is for South Denton (thus North Denton representing the base). After the creation of the dummy variables, the first 11 records should look as shown below.

Obs

Income1

Income2

Famlsize

Ownrent

Autodebt

Hpayrent

ED

WD

SD

Groc

1

36557

20610

4

0

15290

1339

0

1

0

278

2

27045

25490

5

0

14676

1175

0

1

0

220

3

38878

0

8

0

10317

1108

0

1

0

456

4

41448

0

2

1

9504

729

0

1

0

253

5

33136

25300

6

1

17802

875

0

1

0

344

6

44308

24559

2

1

20537

1282

0

0

0

326

7

31997

25419

4

1

11725

919

1

0

0

308

8

43437

0

6

1

12084

970

0

0

1

311

9

41625

22802

4

1

14863

945

0

1

0

305

10

40140

32158

5

0

15708

1470

0

0

1

432

11

31448

0

4

0

6160

985

1

0

0

259

4. Conduct the regression analysis with Groc as the dependent variable (Y) and Income1 (X1), Income2 (X2), Ownorent (X3), Autodebt (X4), Famlsize (X5), and Location variables (X6 through X8) as the independent variables at 1% level of significance. Use the Regression tool accessed from the Data Tab/ Data Analysis (in 2003 use Tools/Data Analysis) pull-down menu. Check only the labels box and specify that you want the output in cell M1. (Make sure that you enter “Confidence Level = 99%”). Also, make sure to check the Standardized residuals box to obtain outlier information. A partial output is shown below for your guidance.

SUMMARY OUTPUT

Regression Statistics

Multiple R

0.779966

R Square

xxxx

Adjusted R Square

0.5739159

Standard Error

xxxx

Observations

100

ANOVA

df

SS

MS

F

Significance F

Regression

8

760733.044

95091.6

xxxx

xxxx

Residual

91

xxxx

xxxx

Total

99

xxxx

Coefficients

Standard Error

t Stat

P-value

Lower 95%

Upper 95%

Lower 99.0%

Upper 99.0%

Intercept

59.847981

43.87639859

1.364

0.176

-27.307103

147.0031

-55.58863

175.284587

Income1

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

Income2

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

Ownrent

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

Autodebt

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

Famlsize

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

ED

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

WD

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

SD

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

xxxx

4. By substituting appropriate values directly into the sample regression equation given in your output generate a point estimate for (Alternatively, you may use the prediction template that it is given to you as excel file-check the Excel assigment4 folder):

a) the monthly Grocery Bill for a family of 4 living in a rented home in South Denton, whose primary income is $42,457 and with the secondary wage earner having an income of $10000, and Autodebt is $6,000

b) the monthly Grocery Bill for a family of size 7 living in a home in the North Denton that is owned, whose primary income is $35,000 with the secondary wage earner having an income of $25,000, and Autodebt is $10,000.

Use the output from the regression analysis and trend function to answer the following questions.

A. Write one to three sentences to interpret the meaning of your model’s R-square value.

B. Conduct an F test for the regression model containing the eight independent variables, at the 1% significance level. State the null and alternative hypotheses, the decision, reason for the decision, and a conclusion.

C. Conduct a t-test, at the 1% significance level, for the usefulness of the dummy (indicator) variable Ownorent. State the null and alternative hypotheses, the decision, reason for the decision, and a conclusion.

D. Conduct a t-test, at the 1% significance level, for the usefulness of the variable Income2. State the null and alternative hypotheses, the decision, reason for the decision, and a conclusion.

E. State which of the eight X (independent) variables are statistically significant and which ones are not, at the 1% significance level.

F. Write one or two sentences on the interpretation of the coefficient of each of the six independent variables, making specific reference to the value of each coefficient given by your model.

Experiential Exercise

As you work this assignment think about the following questions. Then form a team of 3 to 5 and discuss each of the following. You can engage in this discussion by meeting or your group can use a Wiki to engage in an online discussion. Instructions for setting up a Wiki are provided on the excel assignment page of our course web site.

1. What wording tells you the alternative and null hypothesis?

2. What wording tells you the type of statistical test to perform? For example, is a z or t statistic appropriate?

3. What wording tells you that this is a one or two tail hypothesis test?

4. What were the steps you used to obtain the calculated value of the test statistic?

5. What were the steps you used to obtain the critical value of the test statistic?

6. How do you use the calculated and critical value to make a statistical decision about this test?

7. How do you obtain the p value for the test statistic?

8. How do you use the p value and the level of significance to make a statistical decision about this test?

9. How does the result of your test relate to the statistical significance of your findings?

10. What managerial implications can you conclude from the results of your test?

To be ready for the “Excel Quiz 43” HLS Web Test you should prepare the following:

1. A printout of the regression analysis

2. The results of the prediction/estimation of Groc (computed directly or by using the trend function) in the two rows immediately below the regression output

3. Answers for questions A-F

4. Clearly labeled or highlighted parts of the output that pertain to each answer for ques. A-F.

The Excel assignments are each graded via a short Excel Quiz in HLS Web Test that is open for about 48 hours as listed in the syllabus and in your HLS progress report. You are expected to use your output and written answers to complete the quiz. You are not required to turn in the output.

The questions below are much like the quiz you will have in WEBTEST. If you can answer these, you should have no difficulty with those that will be asked. However, these questions are not the exact questions that you will have to answer.

SAMPLE WEBTEST QUIZ: The correct answers to the sample questions are highlighted.

1. What is the p-value of the test statistic for the global F test for the regression model?

A. 0.444 B. 0.0000 C. 0.9650 D. 50 E. 2.0696

2. What is the calculated value of the test statistic to conduct the test for the usefulness of the Autodebt variable?

A. 118.98 B. 0.9650 C. 0.234 D. -1.314 E. 0.0079

3. The best interpretation of R-square for this regression analysis is

A. 60.8 percent of the total variation in a family's expenditure on groceries is explained by regression on the variables X1-X8

B. 77.80 percent of the total variation in a family's expenditure on groceries is explained by regression on the variables X1-X8

C. 57.4 percent of the total variation in a family's expenditure on groceries is explained by regression on the variables X2-X8

D. 72.84 percent of the total variation in a family's expenditure on groceries is explained by regression on the variables X2-X8

E. 95 percent of the total variation in a family's expenditure on groceries is explained by regression on the variables X1-X8

4. The result of the t-test for the usefulness of the Income2 (X2) variable, at ( = .05 is to

A. F.T.R. Ho, since the p-value is greater than .05. Conclude X2 is not useful for predicting Y

B. F.T.R. Ho, since the p-value is greater than .05. Conclude X2 is useful for predicting Y

C. Reject Ho, since the p-value is less than .05. Conclude X1 is not useful for predicting Y

D. F.T.R. Ho, since the p-value is less than .05. Conclude X1 is not useful for predicting Y

E. Reject Ho, since the p-value is less than .05. Conclude X1 is useful for predicting Y

5. What is the coefficient for Income2 in the sample regression equation?

A. 1. 20.10 B. -1.2504 C. 0.00104 D. 118.97 E. 0.0115

3