Statistics Writing Assignment 2

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Probability and Statistics for Engineering and the Sciences

JAY DEVORE California Polytechnic State University, San Luis Obispo

EIGHTH EDITION

Australia • Brazil • Canada • Mexico • Singapore • Spain • United Kingdom • United States

This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. The publisher reserves the right to remove content from this title at any time if subsequent rights restrictions require it. For valuable information on pricing, previous editions, changes to current editions, and alternate formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for materials in your areas of interest.

Probability and Statistics for Engineering and the Sciences, Eighth Edition Jay L. Devore

Editor in Chief: Michelle Julet

Publisher: Richard Stratton

Senior Sponsoring Editor: Molly Taylor

Senior Development Editor: Jay Campbell

Senior Editorial Assistant: Shaylin Walsh

Media Editor: Andrew Coppola

Marketing Manager: Ashley Pickering

Marketing Communications Manager: Mary Anne Payumo

Content Project Manager: Cathy Brooks

Art Director: Linda Helcher

Print Buyer: Diane Gibbons

Rights Acquisition Specialists: Image: Mandy Groszko; Text: Katie Huha

Production Service: Elm Street Publishing Services

Text Designer: Diane Beasley

Cover Designer: Rokusek Design

© 2012, 2009 Brooks/Cole, Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher.

Printed in the United States of America 1 2 3 4 5 6 7 14 13 12 11 10

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706

For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions.

Further permissions questions can be emailed to [email protected].

Library of Congress Control Number: 2010927429

ISBN-13: 978-0-538-73352-6 ISBN-10: 0-538-73352-7

Brooks/Cole 20 Channel Center Street Boston, MA 02210 USA

Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at international.cengage.com/region

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

For your course and learning solutions, visit www.cengage.com. Purchase any of our products at your local college store or at our preferred online store www.cengagebrain.com.

To my grandson Philip, who is highly statistically significant.

Contents

1 Overview and Descriptive Statistics

Introduction 1

1.1 Populations, Samples, and Processes 2

1.2 Pictorial and Tabular Methods in Descriptive Statistics 12

1.3 Measures of Location 28

1.4 Measures of Variability 35

Supplementary Exercises 46

Bibliography 49

2 Probability

Introduction 50

2.1 Sample Spaces and Events 51

2.2 Axioms, Interpretations, and Properties of Probability 55

2.3 Counting Techniques 64

2.4 Conditional Probability 73

2.5 Independence 83

Supplementary Exercises 88

Bibliography 91

3 Discrete Random Variables and Probability Distributions

Introduction 92

3.1 Random Variables 93

3.2 Probability Distributions for Discrete Random Variables 96

3.3 Expected Values 106

3.4 The Binomial Probability Distribution 114

3.5 Hypergeometric and Negative Binomial Distributions 122

3.6 The Poisson Probability Distribution 128

Supplementary Exercises 133

Bibliography 136

4 Continuous Random Variables and Probability Distributions

Introduction 137

4.1 Probability Density Functions 138

4.2 Cumulative Distribution Functions and Expected Values 143

4.3 The Normal Distribution 152

4.4 The Exponential and Gamma Distributions 165

4.5 Other Continuous Distributions 171

4.6 Probability Plots 178

Supplementary Exercises 188

Bibliography 192

5 Joint Probability Distributions and Random Samples

Introduction 193

5.1 Jointly Distributed Random Variables 194

5.2 Expected Values, Covariance, and Correlation 206

5.3 Statistics and Their Distributions 212

5.4 The Distribution of the Sample Mean 223

5.5 The Distribution of a Linear Combination 230

Supplementary Exercises 235

Bibliography 238

6 Point Estimation

Introduction 239

6.1 Some General Concepts of Point Estimation 240

6.2 Methods of Point Estimation 255

Supplementary Exercises 265

Bibliography 266

7 Statistical Intervals Based on a Single Sample

Introduction 267

7.1 Basic Properties of Confidence Intervals 268

7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion 276

7.3 Intervals Based on a Normal Population Distribution 285

7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population 294

Supplementary Exercises 297

Bibliography 299

8 Tests of Hypotheses Based on a Single Sample

Introduction 300

8.1 Hypotheses and Test Procedures 301

8.2 Tests About a Population Mean 310

8.3 Tests Concerning a Population Proportion 323

8.4 P-Values 328

8.5 Some Comments on Selecting a Test 339

Supplementary Exercises 342

Bibliography 344

9 Inferences Based on Two Samples

Introduction 345

9.1 z Tests and Confidence Intervals for a Difference Between Two Population Means 346

9.2 The Two-Sample t Test and Confidence Interval 357

9.3 Analysis of Paired Data 365

9.4 Inferences Concerning a Difference Between Population Proportions 375

9.5 Inferences Concerning Two Population Variances 382

Supplementary Exercises 386

Bibliography 390

10 The Analysis of Variance

Introduction 391

10.1 Single-Factor ANOVA 392

10.2 Multiple Comparisons in ANOVA 402

10.3 More on Single-Factor ANOVA 408

Supplementary Exercises 417

Bibliography 418

11 Multifactor Analysis of Variance

Introduction 419

11.1 Two-Factor ANOVA with K_ij = 1 420

11.2 Two-Factor ANOVA with K_ij > 1 433

11.3 Three-Factor ANOVA 442

11.4 2^p Factorial Experiments 451

Supplementary Exercises 464

Bibliography 467

12 Simple Linear Regression and Correlation

Introduction 468

12.1 The Simple Linear Regression Model 469

12.2 Estimating Model Parameters 477

12.3 Inferences About the Slope Parameter β1 490

12.4 Inferences Concerning μ_Y·x* and the Prediction of Future Y Values 499

12.5 Correlation 508

Supplementary Exercises 518

Bibliography 522

13 Nonlinear and Multiple Regression

Introduction 523

13.1 Assessing Model Adequacy 524

13.2 Regression with Transformed Variables 531

13.3 Polynomial Regression 543

13.4 Multiple Regression Analysis 553

13.5 Other Issues in Multiple Regression 574

Supplementary Exercises 588

Bibliography 593

14 Goodness-of-Fit Tests and Categorical Data Analysis

Introduction 594

14.1 Goodness-of-Fit Tests When Category Probabilities Are Completely Specified 595

14.2 Goodness-of-Fit Tests for Composite Hypotheses 602

14.3 Two-Way Contingency Tables 613

Supplementary Exercises 621

Bibliography 624

15 Distribution-Free Procedures

Introduction 625

15.1 The Wilcoxon Signed-Rank Test 626

15.2 The Wilcoxon Rank-Sum Test 634

15.3 Distribution-Free Confidence Intervals 640

15.4 Distribution-Free ANOVA 645

Supplementary Exercises 649

Bibliography 650

16 Quality Control Methods

Introduction 651

16.1 General Comments on Control Charts 652

16.2 Control Charts for Process Location 654

16.3 Control Charts for Process Variation 663

16.4 Control Charts for Attributes 668

16.5 CUSUM Procedures 672

16.6 Acceptance Sampling 680

Supplementary Exercises 686

Bibliography 687

Appendix Tables

A.1 Cumulative Binomial Probabilities A-2

A.2 Cumulative Poisson Probabilities A-4

A.3 Standard Normal Curve Areas A-6

A.4 The Incomplete Gamma Function A-8

A.5 Critical Values for t Distributions A-9

A.6 Tolerance Critical Values for Normal Population Distributions A-10

A.7 Critical Values for Chi-Squared Distributions A-11

A.8 t Curve Tail Areas A-12

A.9 Critical Values for F Distributions A-14

A.10 Critical Values for Studentized Range Distributions A-20

A.11 Chi-Squared Curve Tail Areas A-21

A.12 Critical Values for the Ryan-Joiner Test of Normality A-23

A.13 Critical Values for the Wilcoxon Signed-Rank Test A-24

A.14 Critical Values for the Wilcoxon Rank-Sum Test A-25

A.15 Critical Values for the Wilcoxon Signed-Rank Interval A-26

A.16 Critical Values for the Wilcoxon Rank-Sum Interval A-27

A.17 β Curves for t Tests A-28

Answers to Selected Odd-Numbered Exercises A-29

Glossary of Symbols/Abbreviations G-1

Index I-1

Preface

Purpose

The use of probability models and statistical methods for analyzing data has become common practice in virtually all scientific disciplines. This book attempts to provide a comprehensive introduction to those models and methods most likely to be encountered and used by students in their careers in engineering and the natural sciences. Although the examples and exercises have been designed with scientists and engineers in mind, most of the methods covered are basic to statistical analyses in many other disciplines, so that students of business and the social sciences will also profit from reading the book.

Approach

Students in a statistics course designed to serve other majors may be initially skeptical of the value and relevance of the subject matter, but my experience is that students can be turned on to statistics by the use of good examples and exercises that blend their everyday experiences with their scientific interests. Consequently, I have worked hard to find examples of real, rather than artificial, data: data that someone thought was worth collecting and analyzing. Many of the methods presented, especially in the later chapters on statistical inference, are illustrated by analyzing data taken from published sources, and many of the exercises also involve working with such data. Sometimes the reader may be unfamiliar with the context of a particular problem (as indeed I often was), but I have found that students are more attracted by real problems with a somewhat strange context than by patently artificial problems in a familiar setting.

Mathematical Level

The exposition is relatively modest in terms of mathematical development. Substantial use of the calculus is made only in Chapter 4 and parts of Chapters 5 and 6. In particular, with the exception of an occasional remark or aside, calculus appears in the inference part of the book only in the second section of Chapter 6. Matrix algebra is not used at all. Thus almost all the exposition should be accessible to those whose mathematical background includes one semester or two quarters of differential and integral calculus.

Content

Chapter 1 begins with some basic concepts and terminology (population, sample, descriptive and inferential statistics, enumerative versus analytic studies, and so on) and continues with a survey of important graphical and numerical descriptive methods. A rather traditional development of probability is given in Chapter 2, followed by probability distributions of discrete and continuous random variables in Chapters 3 and 4, respectively. Joint distributions and their properties are discussed in the first part of Chapter 5. The latter part of this chapter introduces statistics and their sampling distributions, which form the bridge between probability and inference. The next three chapters cover point estimation, statistical intervals, and hypothesis testing based on a single sample. Methods of inference involving two independent samples and paired data are presented in Chapter 9. The analysis of variance is the subject of Chapters 10 and 11 (single-factor and multifactor, respectively). Regression makes its initial appearance in Chapter 12 (the simple linear regression model and correlation) and returns for an extensive encore in Chapter 13. The last three chapters develop chi-squared methods, distribution-free (nonparametric) procedures, and techniques from statistical quality control.

Helping Students Learn

Although the book’s mathematical level should give most science and engineering students little difficulty, working toward an understanding of the concepts and gaining an appreciation for the logical development of the methodology may sometimes require substantial effort. To help students gain such an understanding and appreciation, I have provided numerous exercises ranging in difficulty from many that involve routine application of text material to some that ask the reader to extend concepts discussed in the text to somewhat new situations. There are many more exercises than most instructors would want to assign during any particular course, but I recommend that students be required to work a substantial number of them; in a problem-solving discipline, active involvement of this sort is the surest way to identify and close the gaps in understanding that inevitably arise. Answers to most odd-numbered exercises appear in the answer section at the back of the text. In addition, a Student Solutions Manual, consisting of worked-out solutions to virtually all the odd-numbered exercises, is available.

To access additional course materials and companion resources, please visit www.cengagebrain.com. At the CengageBrain.com home page, search for the ISBN of your title (from the back cover of your book) using the search box at the top of the page. This will take you to the product page where free companion resources can be found.

New for This Edition

• A Glossary of Symbols/Abbreviations appears at the end of the book (the author apologizes for his laziness in not getting this together for earlier editions!) and a small set of sample exams appears on the companion website (available at www.cengage.com/login).

• Many new examples and exercises, almost all based on real data or actual problems. Some of these scenarios are less technical or broader in scope than what has been included in previous editions; for example, weights of football players (to illustrate multimodality), fundraising expenses for charitable organizations, and the comparison of grade point averages for classes taught by part-time faculty with those for classes taught by full-time faculty.

• The material on P-values has been substantially rewritten. The P-value is now initially defined as a probability rather than as the smallest significance level for which the null hypothesis can be rejected. A simulation experiment is presented to illustrate the behavior of P-values.

• Chapter 1 contains a new subsection on “The Scope of Modern Statistics” to indicate how statisticians continue to develop new methodology while working on problems in a wide spectrum of disciplines.

• The exposition has been polished whenever possible to help students gain an intuitive understanding of various concepts. For example, the cumulative distribution function is more deliberately introduced in Chapter 3, the first example of maximum likelihood in Section 6.2 contains a more careful discussion of likelihood, more attention is given to power and type II error probabilities in Section 8.3, and the material on residuals and sums of squares in multiple regression is laid out more explicitly in Section 13.4.

Acknowledgments

My colleagues at Cal Poly have provided me with invaluable support and feedback over the years. I am also grateful to the many users of previous editions who have made suggestions for improvement (and on occasion identified errors). A special note of thanks goes to Matt Carlton for his work on the two solutions manuals, one for instructors and the other for students.

The generous feedback provided by the following reviewers of this and previous editions has been of great benefit in improving the book: Robert L. Armacost, University of Central Florida; Bill Bade, Lincoln Land Community College; Douglas M. Bates, University of Wisconsin–Madison; Michael Berry, West Virginia Wesleyan College; Brian Bowman, Auburn University; Linda Boyle, University of Iowa; Ralph Bravaco, Stonehill College; Linfield C. Brown, Tufts University; Karen M. Bursic, University of Pittsburgh; Lynne Butler, Haverford College; Raj S. Chhikara, University of Houston–Clear Lake; Edwin Chong, Colorado State University; David Clark, California State Polytechnic University at Pomona; Ken Constantine, Taylor University; David M. Cresap, University of Portland; Savas Dayanik, Princeton University; Don E. Deal, University of Houston; Annjanette M. Dodd, Humboldt State University; Jimmy Doi, California Polytechnic State University–San Luis Obispo; Charles E. Donaghey, University of Houston; Patrick J. Driscoll, U.S. Military Academy; Mark Duva, University of Virginia; Nassir Eltinay, Lincoln Land Community College; Thomas English, College of the Mainland; Nasser S. Fard, Northeastern University; Ronald Fricker, Naval Postgraduate School; Steven T. Garren, James Madison University; Mark Gebert, University of Kentucky; Harland Glaz, University of Maryland; Ken Grace, Anoka-Ramsey Community College; Celso Grebogi, University of Maryland; Veronica Webster Griffis, Michigan Technological University; Jose Guardiola, Texas A&M University–Corpus Christi; K. L. D. Gunawardena, University of Wisconsin–Oshkosh; James J. Halavin, Rochester Institute of Technology; James Hartman, Marymount University; Tyler Haynes, Saginaw Valley State University; Jennifer Hoeting, Colorado State University; Wei-Min Huang, Lehigh University; Aridaman Jain, New Jersey Institute of Technology; Roger W. Johnson, South Dakota School of Mines & Technology; Chihwa Kao, Syracuse University; Saleem A. Kassam, University of Pennsylvania; Mohammad T. Khasawneh, State University of New York–Binghamton; Stephen Kokoska, Colgate University; Hillel J. Kumin, University of Oklahoma; Sarah Lam, Binghamton University; M. Louise Lawson, Kennesaw State University; Jialiang Li, University of Wisconsin–Madison; Wooi K. Lim, William Paterson University; Aquila Lipscomb, The Citadel; Manuel Lladser, University of Colorado at Boulder; Graham Lord, University of California–Los Angeles; Joseph L. Macaluso, DeSales University; Ranjan Maitra, Iowa State University; David Mathiason, Rochester Institute of Technology; Arnold R. Miller, University of Denver; John J. Millson, University of Maryland; Pamela Kay Miltenberger, West Virginia Wesleyan College; Monica Molsee, Portland State University; Thomas Moore, Naval Postgraduate School; Robert M. Norton, College of Charleston; Steven Pilnick, Naval Postgraduate School; Robi Polikar, Rowan University; Ernest Pyle, Houston Baptist University; Steve Rein, California Polytechnic State University–San Luis Obispo; Tony Richardson, University of Evansville; Don Ridgeway, North Carolina State University; Larry J. Ringer, Texas A&M University; Robert M. Schumacher, Cedarville University; Ron Schwartz, Florida Atlantic University; Kevan Shafizadeh, California State University–Sacramento; Mohammed Shayib, Prairie View A&M; Robert K. Smidt, California Polytechnic State University–San Luis Obispo; Alice E. Smith, Auburn University; James MacGregor Smith, University of Massachusetts; Paul J. Smith, University of Maryland; Richard M. Soland, The George Washington University; Clifford Spiegelman, Texas A&M University; Jery Stedinger, Cornell University; David Steinberg, Tel Aviv University; William Thistleton, State University of New York Institute of Technology; G. Geoffrey Vining, University of Florida; Bhutan Wadhwa, Cleveland State University; Gary Wasserman, Wayne State University; Elaine Wenderholm, State University of New York–Oswego; Samuel P. Wilcock, Messiah College; Michael G. Zabetakis, University of Pittsburgh; and Maria Zack, Point Loma Nazarene University.

Danielle Urban of Elm Street Publishing Services has done a terrific job of supervising the book's production. Once again I am compelled to express my gratitude to all those people at Cengage who have made important contributions over the course of my textbook writing career. For this most recent edition, special thanks go to Jay Campbell (for his timely and informed feedback throughout the project), Molly Taylor, Shaylin Walsh, Ashley Pickering, Cathy Brooks, and Andrew Coppola. I also greatly appreciate the stellar work of all those Cengage Learning sales representatives who have labored to make my books more visible to the statistical community. Last but by no means least, a heartfelt thanks to my wife Carol for her decades of support, and to my daughters for providing inspiration through their own achievements.

Jay Devore

1 Overview and Descriptive Statistics

“I am not much given to regret, so I puzzled over this one a while. Should have taken much more statistics in college, I think.”

—Max Levchin, Paypal Co-founder, Slide Founder

Quote of the week from the Web site of the American Statistical Association on November 23, 2010

“I keep saying that the sexy job in the next 10 years will be statisticians, and I’m not kidding.”

—Hal Varian, Chief Economist at Google

August 6, 2009, The New York Times

INTRODUCTION

Statistical concepts and methods are not only useful but indeed often indispensable in understanding the world around us. They provide ways of gaining new insights into the behavior of many phenomena that you will encounter in your chosen field of specialization in engineering or science.

The discipline of statistics teaches us how to make intelligent judgments and informed decisions in the presence of uncertainty and variation. Without uncertainty or variation, there would be little need for statistical methods or statisticians. If every component of a particular type had exactly the same lifetime, if all resistors produced by a certain manufacturer had the same resistance value, if pH determinations for soil specimens from a particular locale gave identical results, and so on, then a single observation would reveal all desired information.

An interesting manifestation of variation arises in the course of performing emissions testing on motor vehicles. The expense and time requirements of the Federal Test Procedure (FTP) preclude its widespread use in vehicle inspection programs. As a result, many agencies have developed less costly and quicker tests, which it is hoped replicate FTP results. According to the journal article “Motor Vehicle Emissions Variability” (J. of the Air and Waste Mgmt. Assoc., 1996: 667–675), the acceptance of the FTP as a gold standard has led to the widespread belief that repeated measurements on the same vehicle would yield identical (or nearly identical) results. The authors of the article applied the FTP to seven vehicles characterized as “high emitters.” Here are the results for one such vehicle:

HC (gm/mile) 13.8 18.3 32.2 32.5

CO (gm/mile) 118 149 232 236

The substantial variation in both the HC and CO measurements casts considerable doubt on conventional wisdom and makes it much more difficult to make precise assessments about emissions levels.
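One way to put numbers on the variation in these four readings is to compute a few simple summaries for each pollutant. The sketch below is illustrative only: the variable names and the choice of summaries (mean, range, sample standard deviation) are assumptions made here, not something taken from the cited article.

```python
from statistics import mean, stdev

# FTP results for one "high emitter" vehicle, as listed above (gm/mile).
hc = [13.8, 18.3, 32.2, 32.5]
co = [118, 149, 232, 236]

def summarize(values):
    """Return simple summaries of a small sample (names are illustrative)."""
    return {
        "mean": mean(values),
        "range": max(values) - min(values),
        "stdev": stdev(values),  # sample standard deviation (divisor n - 1)
    }

hc_summary = summarize(hc)
co_summary = summarize(co)
print("HC:", hc_summary)
print("CO:", co_summary)
```

For this vehicle the largest HC reading is more than twice the smallest, and the CO readings span 118 gm/mile, which is hard to reconcile with the belief that repeated tests give nearly identical results.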

How can statistical techniques be used to gather information and draw conclusions? Suppose, for example, that a materials engineer has developed a coating for retarding corrosion in metal pipe under specified circumstances. If this coating is applied to different segments of pipe, variation in environmental conditions and in the segments themselves will result in more substantial corrosion on some segments than on others. Methods of statistical analysis could be used on data from such an experiment to decide whether the average amount of corrosion exceeds an upper specification limit of some sort or to predict how much corrosion will occur on a single piece of pipe.

Alternatively, suppose the engineer has developed the coating in the belief that it will be superior to the currently used coating. A comparative experiment could be carried out to investigate this issue by applying the current coating to some segments of pipe and the new coating to other segments. This must be done with care lest the wrong conclusion emerge. For example, perhaps the average amount of corrosion is identical for the two coatings. However, the new coating may be applied to segments that have superior ability to resist corrosion and under less stressful environmental conditions compared to the segments and conditions for the current coating. The investigator would then likely observe a difference between the two coatings attributable not to the coatings themselves, but just to extraneous variation. Statistics offers not only methods for analyzing the results of experiments once they have been carried out but also suggestions for how experiments can be performed in an efficient manner to mitigate the effects of variation and have a better chance of producing correct conclusions.


1.1 Populations, Samples, and Processes

Engineers and scientists are constantly exposed to collections of facts, or data, both in their professional capacities and in everyday activities. The discipline of statistics provides methods for organizing and summarizing data and for drawing conclusions based on information contained in the data.


An investigation will typically focus on a well-defined collection of objects constituting a population of interest. In one study, the population might consist of all gelatin capsules of a particular type produced during a specified period. Another investigation might involve the population consisting of all individuals who received a B.S. in engineering during the most recent academic year. When desired information is available for all objects in the population, we have what is called a census. Constraints on time, money, and other scarce resources usually make a census impractical or infeasible. Instead, a subset of the population, called a sample, is selected in some prescribed manner. Thus we might obtain a sample of bearings from a particular production run as a basis for investigating whether bearings are conforming to manufacturing specifications, or we might select a sample of last year’s engineering graduates to obtain feedback about the quality of the engineering curricula.

We are usually interested only in certain characteristics of the objects in a population: the number of flaws on the surface of each casing, the thickness of each capsule wall, the gender of an engineering graduate, the age at which the individual graduated, and so on. A characteristic may be categorical, such as gender or type of malfunction, or it may be numerical in nature. In the former case, the value of the characteristic is a category (e.g., female or insufficient solder), whereas in the latter case, the value is a number (e.g., age = 23 years or diameter = .502 cm). A variable is any characteristic whose value may change from one object to another in the population. We shall initially denote variables by lowercase letters from the end of our alphabet. Examples include

x = brand of calculator owned by a student

y = number of visits to a particular Web site during a specified period

z = braking distance of an automobile under specified conditions

Data results from making observations either on a single variable or simultaneously on two or more variables. A univariate data set consists of observations on a single variable. For example, we might determine the type of transmission, automatic (A) or manual (M), on each of ten automobiles recently purchased at a certain dealership, resulting in the categorical data set

M A A A M A A M A A

The following sample of lifetimes (hours) of brand D batteries put to a certain use is a numerical univariate data set:

5.6 5.1 6.2 6.0 5.8 6.5 5.8 5.5

We have bivariate data when observations are made on each of two variables. Our data set might consist of a (height, weight) pair for each basketball player on a team, with the first observation as (72, 168), the second as (75, 212), and so on. If an engineer determines the value of both x = component lifetime and y = reason for component failure, the resulting data set is bivariate with one variable numerical and the other categorical. Multivariate data arises when observations are made on more than one variable (so bivariate is a special case of multivariate). For example, a research physician might determine the systolic blood pressure, diastolic blood pressure, and serum cholesterol level for each patient participating in a study. Each observation would be a triple of numbers, such as (120, 80, 146). In many multivariate data sets, some variables are numerical and others are categorical. Thus the annual automobile issue of Consumer Reports gives values of such variables as type of vehicle (small, sporty, compact, mid-size, large), city fuel efficiency (mpg), highway fuel efficiency (mpg), drivetrain type (rear wheel, front wheel, four wheel), and so on.
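As a small illustration of these data types, the univariate, bivariate, and multivariate examples from this passage can be written down as ordinary Python collections. The variable names are assumptions chosen for this sketch; only the values come from the text.

```python
from collections import Counter

# Univariate categorical data: transmission type for ten automobiles.
transmissions = "M A A A M A A M A A".split()
transmission_counts = Counter(transmissions)
print(transmission_counts)

# Univariate numerical data: lifetimes (hours) of brand D batteries.
lifetimes = [5.6, 5.1, 6.2, 6.0, 5.8, 6.5, 5.8, 5.5]
print(len(lifetimes), "lifetimes, from", min(lifetimes), "to", max(lifetimes))

# Bivariate data: one (height, weight) pair per basketball player.
players = [(72, 168), (75, 212)]

# Multivariate data: (systolic, diastolic, cholesterol) for each patient.
patients = [(120, 80, 146)]
print(len(players[0]), "variables per player;", len(patients[0]), "per patient")
```

Counting categories, as `Counter` does here, is about all that can be done numerically with a categorical variable, whereas the numerical data sets support summaries such as a minimum, a maximum, or a mean.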


Branches of Statistics

An investigator who has collected data may wish simply to summarize and describe important features of the data. This entails using methods from descriptive statistics. Some of these methods are graphical in nature; the construction of histograms, boxplots, and scatter plots are primary examples. Other descriptive methods involve calculation of numerical summary measures, such as means, standard deviations, and correlation coefficients. The wide availability of statistical computer software packages has made these tasks much easier to carry out than they used to be. Computers are much more efficient than human beings at calculation and the creation of pictures (once they have received appropriate instructions from the user!). This means that the investigator doesn’t have to expend much effort on “grunt work” and will have more time to study the data and extract important messages. Throughout this book, we will present output from various packages such as Minitab, SAS, S-Plus, and R. The R software can be downloaded without charge from the site http://www.r-project.org.

Example 1.1

Charity is a big business in the United States. The Web site charitynavigator.com gives information on roughly 5500 charitable organizations, and there are many smaller charities that fly below the navigator’s radar screen. Some charities operate very efficiently, with fundraising and administrative expenses that are only a small percentage of total expenses, whereas others spend a high percentage of what they take in on such activities. Here is data on fundraising expenses as a percentage of total expenditures for a random sample of 60 charities:

6.1 12.6 34.7 1.6 18.8 2.2 3.0 2.2 5.6 3.8
2.2 3.1 1.3 1.1 14.1 4.0 21.0 6.1 1.3 20.4
7.5 3.9 10.1 8.1 19.5 5.2 12.0 15.8 10.4 5.2
6.4 10.8 83.1 3.6 6.2 6.3 16.3 12.7 1.3 0.8
8.8 5.1 3.7 26.3 6.0 48.0 8.2 11.7 7.2 3.9
15.3 16.6 8.8 12.0 4.7 14.7 6.4 17.0 2.5 16.2
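A sample of this size can be typed directly into a program and organized in a few lines. The sketch below is an illustration only: the function and variable names are assumptions. It drops the tenths digit of each percentage, groups the values by tens digit, and counts how many of the 60 charities spend less than 20% on fundraising.

```python
from collections import defaultdict

# Fundraising expenses as a percentage of total expenditures (60 charities).
fundraising = [
    6.1, 12.6, 34.7, 1.6, 18.8, 2.2, 3.0, 2.2, 5.6, 3.8,
    2.2, 3.1, 1.3, 1.1, 14.1, 4.0, 21.0, 6.1, 1.3, 20.4,
    7.5, 3.9, 10.1, 8.1, 19.5, 5.2, 12.0, 15.8, 10.4, 5.2,
    6.4, 10.8, 83.1, 3.6, 6.2, 6.3, 16.3, 12.7, 1.3, 0.8,
    8.8, 5.1, 3.7, 26.3, 6.0, 48.0, 8.2, 11.7, 7.2, 3.9,
    15.3, 16.6, 8.8, 12.0, 4.7, 14.7, 6.4, 17.0, 2.5, 16.2,
]

def group_by_tens(values):
    """Map each tens digit to the sorted ones digits of the truncated values."""
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(int(v), 10)  # int() drops the tenths digit
        rows[stem].append(leaf)
    return {stem: sorted(rows[stem]) for stem in sorted(rows)}

rows = group_by_tens(fundraising)
for stem, leaves in rows.items():
    print(stem, "".join(str(leaf) for leaf in leaves))

below_20 = sum(1 for v in fundraising if v < 20)
print(f"{below_20} of {len(fundraising)} charities are below 20%")
```

Each printed row collects one tens digit, so this is a coarser grouping than a display that splits every tens digit into two rows; the idea of letting the leading digit organize the sample is the same.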

Without any organization, it is difficult to get a sense of the data’s most prominent features: what a typical (i.e., representative) value might be, whether values are highly concentrated about a typical value or quite dispersed, whether there are any gaps in the data, what fraction of the values are less than 20%, and so on. Figure 1.1 shows what is called a stem-and-leaf display as well as a histogram. In Section 1.2 we will discuss construction and interpretation of these data summaries. For the moment, we hope you see how they begin to describe how the percentages are distributed over the range of possible values from 0 to 100. Clearly a substantial majority of the charities in the sample spend less than 20% on fundraising, and only a few percentages might be viewed as beyond the bounds of sensible practice. ■

Stem-and-leaf of FundRsng  N = 60
Leaf Unit = 1.0

0  0111112222333333344
0  55556666666778888
1  0001222244
1  55666789
2  01
2  6
3  4
3
4
4  8
5
5
6
6
7
7
8  3

[Histogram: Frequency (0 to 40) versus FundRsng (0 to 90)]

Figure 1.1 A Minitab stem-and-leaf display (tenths digit truncated) and histogram for the charity fundraising percentage data

Having obtained a sample from a population, an investigator would frequently like to use sample information to draw some type of conclusion (make an inference of some sort) about the population. That is, the sample is a means to an end rather than an end in itself. Techniques for generalizing from a sample to a population are gathered within the branch of our discipline called inferential statistics.

Example 1.2

Material strength investigations provide a rich area of application for statistical methods. The article “Effects of Aggregates and Microfillers on the Flexural Properties of Concrete” (Magazine of Concrete Research, 1997: 81–98) reported on a study of strength properties of high-performance concrete obtained by using superplasticizers and certain binders. The compressive strength of such concrete had previously been investigated, but not much was known about flexural strength (a measure of ability to resist failure in bending). The accompanying data on flexural strength (in MegaPascal, MPa) appeared in the article cited:

5.9 7.2 7.3 6.3 8.1 6.8 7.0 7.6 6.8 6.5 7.0 6.3 7.9 9.0 8.2 8.7 7.8 9.7 7.4 7.7 9.7 7.8 7.7 11.6 11.3 11.8 10.7

Suppose we want an estimate of the average value of flexural strength for all beams that could be made in this way (if we conceptualize a population of all such beams, we are trying to estimate the population mean). It can be shown that, with a high degree of confidence, the population mean strength is between 7.48 MPa and 8.80 MPa; we call this a confidence interval or interval estimate. Alternatively, this data could be used to predict the flexural strength of a single beam of this type. With a high degree of confidence, the strength of a single such beam will exceed 7.35 MPa; the number 7.35 is called a lower prediction bound. ■
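The interval estimate quoted in this example can be reproduced with a short calculation. The sketch assumes, although the passage does not say so, that the interval is a two-sided 95% t interval for the mean (sample mean plus or minus a critical value times s/√n), with the critical value 2.056 taken from a t table for 26 degrees of freedom. Under that assumption the endpoints round to the 7.48 and 8.80 MPa given above.

```python
from math import sqrt
from statistics import mean, stdev

# Flexural strength data (MPa) from Example 1.2.
strength = [
    5.9, 7.2, 7.3, 6.3, 8.1, 6.8, 7.0, 7.6, 6.8,
    6.5, 7.0, 6.3, 7.9, 9.0, 8.2, 8.7, 7.8, 9.7,
    7.4, 7.7, 9.7, 7.8, 7.7, 11.6, 11.3, 11.8, 10.7,
]

n = len(strength)   # 27 beams
xbar = mean(strength)
s = stdev(strength)  # sample standard deviation (divisor n - 1)

# Assumption: 95% two-sided t interval; 2.056 is the tabled upper
# 2.5% point of the t distribution with n - 1 = 26 degrees of freedom.
T_CRIT = 2.056

half_width = T_CRIT * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"mean = {xbar:.2f} MPa, interval = ({lower:.2f}, {upper:.2f}) MPa")
```

The critical value is hard-coded to keep the sketch free of third-party libraries; a statistics package would normally look it up from the desired confidence level.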

The main focus of this book is on presenting and illustrating methods of inferential statistics that are useful in scientific work. The most important types of inferential procedures (point estimation, hypothesis testing, and estimation by confidence intervals) are introduced in Chapters 6–8 and then used in more complicated settings in Chapters 9–16. The remainder of this chapter presents methods from descriptive statistics that are most used in the development of inference.

Chapters 2–5 present material from the discipline of probability. This material ultimately forms a bridge between the descriptive and inferential techniques. Mastery of probability leads to a better understanding of how inferential procedures are developed and used, how statistical conclusions can be translated into everyday language and interpreted, and when and where pitfalls can occur in applying the methods. Probability and statistics both deal with questions involving populations and samples, but do so in an “inverse manner” to one another.

In a probability problem, properties of the population under study are assumed known (e.g., in a numerical population, some specified distribution of the population values may be assumed), and questions regarding a sample taken from the population are posed and answered. In a statistics problem, characteristics of a sample are available to the experimenter, and this information enables the experimenter to draw conclusions about the population. The relationship between the two disciplines can be summarized by saying that probability reasons from the population to the sample (deductive reasoning), whereas inferential statistics reasons from the sample to the population (inductive reasoning). This is illustrated in Figure 1.2.

Unit note for the MPa data of Example 1.2: 1 Pa (Pascal) = 1.45 × 10^-4 psi

Before we can understand what a particular sample can tell us about the population, we should first understand the uncertainty associated with taking a sample from a given population. This is why we study probability before statistics.

Example 1.3

As an example of the contrasting focus of probability and inferential statistics, consider drivers’ use of manual lap belts in cars equipped with automatic shoulder belt systems. (The article “Automobile Seat Belts: Usage Patterns in Automatic Belt Systems,” Human Factors, 1998: 126–135, summarizes usage data.) In probability, we might assume that 50% of all drivers of cars equipped in this way in a certain metropolitan area regularly use their lap belt (an assumption about the population), so we might ask, “How likely is it that a sample of 100 such drivers will include at least 70 who regularly use their lap belt?” or “How many of the drivers in a sample of size 100 can we expect to regularly use their lap belt?” On the other hand, in inferential statistics, we have sample information available; for example, a sample of 100 drivers of such cars revealed that 65 regularly use their lap belt. We might then ask, “Does this provide substantial evidence for concluding that more than 50% of all such drivers in this area regularly use their lap belt?” In this latter scenario, we are attempting to use sample information to answer a question about the structure of the entire population from which the sample was selected. ■
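The two probability questions in this example can be answered numerically once a model is chosen. The sketch below assumes a binomial model, in which the 100 sampled drivers behave independently and each uses the lap belt with probability 0.5. That modeling choice and the function names are assumptions of this illustration, not statements from the example.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) count."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_tail(k_min, n, p):
    """P(X >= k_min) for a binomial(n, p) count."""
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))

n, p = 100, 0.5              # assumed population proportion of 50%
expected_users = n * p       # expected number of lap-belt users in the sample
p_at_least_70 = binom_tail(70, n, p)
p_at_least_65 = binom_tail(65, n, p)

print(f"expected number of users: {expected_users:.0f}")
print(f"P(at least 70 of 100) = {p_at_least_70:.2e}")
print(f"P(at least 65 of 100) = {p_at_least_65:.2e}")
```

Under the assumed 50% rate, a sample with 70 or more users would be extremely rare, and even 65 or more has probability of roughly 0.002. A tail probability of this kind is what the inferential question weighs when it asks whether 65 out of 100 is substantial evidence that the true proportion exceeds 50%; the formal procedures come in later chapters.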

In the foregoing lap belt example, the population is well defined and concrete: all drivers of cars equipped in a certain way in a particular metropolitan area. In Example 1.2, however, the strength measurements came from a sample of prototype beams that had not been selected from an existing population. Instead, it is convenient to think of the population as consisting of all possible strength measurements that might be made under similar experimental conditions. Such a population is referred to as a conceptual or hypothetical population. There are a number of problem situations in which we fit questions into the framework of inferential statistics by conceptualizing a population.

Figure 1.2 The relationship between probability and inferential statistics (diagram: probability leads from population to sample; inferential statistics leads from sample to population)

The Scope of Modern Statistics

These days statistical methodology is employed by investigators in virtually all disciplines, including such areas as

• molecular biology (analysis of microarray data)

• ecology (describing quantitatively how individuals in various animal and plant populations are spatially distributed)

• materials engineering (studying properties of various treatments to retard corrosion)

• marketing (developing market surveys and strategies for marketing new products)

• public health (identifying sources of diseases and ways to treat them)

• civil engineering (assessing the effects of stress on structural elements and the impacts of traffic flows on communities)

As you progress through the book, you’ll encounter a wide spectrum of different scenarios in the examples and exercises that illustrate the application of techniques from probability and statistics. Many of these scenarios involve data or other material extracted from articles in engineering and science journals. The methods presented herein have become established and trusted tools in the arsenal of those who work with data. Meanwhile, statisticians continue to develop new models for describing randomness and uncertainty and new methodology for analyzing data. As evidence of the continuing creative efforts in the statistical community, here are titles and capsule descriptions of some articles that have recently appeared in statistics journals (Journal of the American Statistical Association is abbreviated JASA, and AAS is short for the Annals of Applied Statistics, two of the many prominent journals in the discipline):

• “Modeling Spatiotemporal Forest Health Monitoring Data” (JASA, 2009: 899–911): Forest health monitoring systems were set up across Europe in the 1980s in response to concerns about air-pollution-related forest dieback, and have continued operation with a more recent focus on threats from climate change and increased ozone levels. The authors develop a quantitative description of tree crown defoliation, an indicator of tree health.

• “Active Learning Through Sequential Design, with Applications to the Detection of Money Laundering” (JASA, 2009: 969–981): Money laundering involves concealing the origin of funds obtained through illegal activities. The huge number of transactions occurring daily at financial institutions makes detection of money laundering difficult. The standard approach has been to extract various summary quantities from the transaction history and conduct a time-consuming investigation of suspicious activities. The article proposes a more efficient statistical method and illustrates its use in a case study.

• “Robust Internal Benchmarking and False Discovery Rates for Detecting Racial Bias in Police Stops” (JASA, 2009: 661–668): Allegations of police actions that are attributable at least in part to racial bias have become a contentious issue in many communities. This article proposes a new method that is designed to reduce the risk of flagging a substantial number of “false positives” (individuals falsely identified as manifesting bias). The method was applied to data on 500,000 pedestrian stops in New York City in 2006; of the 3000 officers regularly involved in pedestrian stops, 15 were identified as having stopped a substantially greater fraction of Black and Hispanic people than what would be predicted were bias absent.

• “Records in Athletics Through Extreme Value Theory” (JASA, 2008: 1382–1391): The focus here is on the modeling of extremes related to world records in athletics. The authors start by posing two questions: (1) What is the ultimate world record within a specific event (e.g., the high jump for women)? and (2) How “good” is the current world record, and how does the quality of current world records compare across different events? A total of 28 events (8 running, 3 throwing, and 3 jumping for both men and women) are considered. For example, one conclusion is that only about 20 seconds can be shaved off the men’s marathon record, but that the current women’s marathon record is almost 5 minutes longer than what can ultimately be achieved. The methodology also has applications to such issues as ensuring airport runways are long enough and that dikes in Holland are high enough.

• “Analysis of Episodic Data with Application to Recurrent Pulmonary Exacerbations in Cystic Fibrosis Patients” (JASA, 2008: 498–510): The analysis of recurrent medical events such as migraine headaches should take into account not only when such events first occur but also how long they last—length of episodes may contain important information about the severity of the disease or malady, associated medical costs, and the quality of life. The article proposes a technique that summarizes both episode frequency and length of episodes, and allows effects of characteristics that cause episode occurrence to vary over time. The technique is applied to data on cystic fibrosis patients (CF is a serious genetic disorder affecting sweat and other glands).

• “Prediction of Remaining Life of Power Transformers Based on Left Truncated and Right Censored Lifetime Data” (AAS, 2009: 857–879): There are roughly 150,000 high-voltage power transmission transformers in the United States. Unexpected failures can cause substantial economic losses, so it is important to have predictions for remaining lifetimes. Relevant data can be complicated because lifetimes of some transformers extend over several decades during which records were not necessarily complete. In particular, the authors of the article use data from a certain energy company that began keeping careful records in 1980. But some transformers had been installed before January 1, 1980, and were still in service after that date (“left truncated” data), whereas other units were still in service at the time of the investigation, so their complete lifetimes are not available (“right censored” data). The article describes various procedures for obtaining an interval of plausible values (a prediction interval) for a remaining lifetime and for the cumulative number of failures over a specified time period.

• “The BARISTA: A Model for Bid Arrivals in Online Auctions” (AAS, 2007: 412–441): Online auctions such as those on eBay and uBid often have characteristics that differentiate them from traditional auctions. One particularly important difference is that the number of bidders at the outset of many traditional auctions is fixed, whereas in online auctions this number and the number of resulting bids are not predetermined. The article proposes a new BARISTA (for Bid ARrivals In STAges) model for describing the way in which bids arrive online. The model allows for higher bidding intensity at the outset of the auction and also as the auction comes to a close. Various properties of the model are investigated and then validated using data from eBay.com on auctions for Palm M515 personal assistants, Microsoft Xbox games, and Cartier watches.

• “Statistical Challenges in the Analysis of Cosmic Microwave Background Radiation” (AAS, 2009: 61–95): The cosmic microwave background (CMB) is a significant source of information about the early history of the universe. Its radiation level is uniform, so extremely delicate instruments have been developed to measure fluctuations. The authors provide a review of statistical issues with CMB data analysis; they also give many examples of the application of statistical procedures to data obtained from a recent NASA satellite mission, the Wilkinson Microwave Anisotropy Probe.

Statistical information now appears with increasing frequency in the popular media, and occasionally the spotlight is even turned on statisticians. For example, the Nov. 23, 2009, New York Times reported in an article “Behind Cancer Guidelines, Quest for Data” that the new science for cancer investigations and more sophisticated methods for data analysis spurred the U.S. Preventive Services task force to re-examine guidelines for how frequently middle-aged and older women should have mammograms. The panel commissioned six independent groups to do statistical modeling. The result was a new set of conclusions, including an assertion that mammograms every two years are nearly as beneficial to patients as annual mammograms, but confer only half the risk of harms. Donald Berry, a very prominent biostatistician, was quoted as saying he was pleasantly surprised that the task force took the new research to heart in making its recommendations. The task force’s report has generated much controversy among cancer organizations, politicians, and women themselves.

It is our hope that you will become increasingly convinced of the importance and relevance of the discipline of statistics as you dig more deeply into the book and the subject. Hopefully you’ll be turned on enough to want to continue your statistical education beyond your current course.

Enumerative Versus Analytic Studies

W. E. Deming, a very influential American statistician who was a moving force in Japan’s quality revolution during the 1950s and 1960s, introduced the distinction between enumerative studies and analytic studies. In the former, interest is focused on a finite, identifiable, unchanging collection of individuals or objects that make up a population. A sampling frame (that is, a listing of the individuals or objects to be sampled) is either available to an investigator or else can be constructed. For example, the frame might consist of all signatures on a petition to qualify a certain initiative for the ballot in an upcoming election; a sample is usually selected to ascertain whether the number of valid signatures exceeds a specified value. As another example, the frame may contain serial numbers of all furnaces manufactured by a particular company during a certain time period; a sample may be selected to infer something about the average lifetime of these units. The use of inferential methods to be developed in this book is reasonably noncontroversial in such settings (though statisticians may still argue over which particular methods should be used).

An analytic study is broadly defined as one that is not enumerative in nature. Such studies are often carried out with the objective of improving a future product by taking action on a process of some sort (e.g., recalibrating equipment or adjusting the level of some input such as the amount of a catalyst). Data can often be obtained only on an existing process, one that may differ in important respects from the future process. There is thus no sampling frame listing the individuals or objects of interest. For example, a sample of five turbines with a new design may be experimentally manufactured and tested to investigate efficiency. These five could be viewed as a sample from the conceptual population of all prototypes that could be manufactured under similar conditions, but not necessarily as representative of the population of units manufactured once regular production gets underway. Methods for using sample information to draw conclusions about future production units may be problematic. Someone with expertise in the area of turbine design and engineering (or whatever other subject area is relevant) should be called upon to judge whether such extrapolation is sensible. A good exposition of these issues is contained in the article “Assumptions for Statistical Inference” by Gerald Hahn and William Meeker (The American Statistician, 1993: 1–11).


Example 1.4


Collecting Data

Statistics deals not only with the organization and analysis of data once it has been collected but also with the development of techniques for collecting the data. If data is not properly collected, an investigator may not be able to answer the questions under consideration with a reasonable degree of confidence. One common problem is that the target population (the one about which conclusions are to be drawn) may be different from the population actually sampled. For example, advertisers would like various kinds of information about the television-viewing habits of potential customers. The most systematic information of this sort comes from placing monitoring devices in a small number of homes across the United States. It has been conjectured that placement of such devices in and of itself alters viewing behavior, so that characteristics of the sample may be different from those of the target population.

When data collection entails selecting individuals or objects from a frame, the simplest method for ensuring a representative selection is to take a simple random sample. This is one for which any particular subset of the specified size (e.g., a sample of size 100) has the same chance of being selected. For example, if the frame consists of 1,000,000 serial numbers, the numbers 1, 2, . . . , up to 1,000,000 could be placed on identical slips of paper. After placing these slips in a box and thoroughly mixing, slips could be drawn one by one until the requisite sample size has been obtained. Alternatively (and much to be preferred), a table of random numbers or a computer’s random number generator could be employed.
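The random-number-generator approach can be sketched in a few lines of Python. This is only an illustration: the function name `simple_random_sample`, the seed value, and the use of consecutive integers as the frame are assumptions made for the example, not part of the text.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct items so that every subset of size n is equally likely."""
    rng = random.Random(seed)      # a seed is used here only to make the run repeatable
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical frame: serial numbers 1, 2, ..., 1,000,000
frame = range(1, 1_000_001)
sample = simple_random_sample(frame, 100, seed=2010)
print(sorted(sample)[:5])          # first few selected serial numbers
```

Sampling without replacement plays the role of drawing slips from the box one by one: no serial number can be selected twice.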

Sometimes alternative sampling methods can be used to make the selection process easier, to obtain extra information, or to increase the degree of confidence in conclusions. One such method, stratified sampling, entails separating the population units into nonoverlapping groups and taking a sample from each one. For example, a manufacturer of DVD players might want information about customer satisfaction for units produced during the previous year. If three different models were manufactured and sold, a separate sample could be selected from each of the three corresponding strata. This would result in information on all three models and ensure that no one model was over- or underrepresented in the entire sample.
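A stratified sample of this kind might be drawn as follows. The sketch assumes that a simple random sample is taken within each stratum; the model names, unit identifiers, stratum sizes, and per-stratum sample sizes are all made up for illustration.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Take a simple random sample of the requested size from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical strata: units of three DVD player models produced last year
strata = {
    "model_A": [f"A{i:04d}" for i in range(500)],
    "model_B": [f"B{i:04d}" for i in range(300)],
    "model_C": [f"C{i:04d}" for i in range(200)],
}
sizes = {"model_A": 10, "model_B": 10, "model_C": 10}   # assumed sample sizes

chosen = stratified_sample(strata, sizes, seed=1)
for model, units in chosen.items():
    print(model, units[:3])
```

Because every stratum contributes a fixed number of units, no model can be missing from the overall sample, which a single simple random sample of 30 units would not guarantee.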

Frequently a “convenience” sample is obtained by selecting individuals or objects without systematic randomization. As an example, a collection of bricks may be stacked in such a way that it is extremely difficult for those in the center to be selected. If the bricks on the top and sides of the stack were somehow different from the others, resulting sample data would not be representative of the population. Often an investigator will assume that such a convenience sample approximates a random sample, in which case a statistician’s repertoire of inferential methods can be used; however, this is a judgment call. Most of the methods discussed herein are based on a variation of simple random sampling described in Chapter 5.

Engineers and scientists often collect data by carrying out some sort of designed experiment. This may involve deciding how to allocate several different treatments (such as fertilizers or coatings for corrosion protection) to the various experimental units (plots of land or pieces of pipe). Alternatively, an investigator may systematically vary the levels or categories of certain factors (e.g., pressure or type of insulating material) and observe the effect on some response variable (such as yield from a production process).

An article in the New York Times (Jan. 27, 1987) reported that heart attack risk could be reduced by taking aspirin. This conclusion was based on a designed experiment involving both a control group of individuals that took a placebo having the appearance of aspirin but known to be inert and a treatment group that took aspirin according to a specified regimen. Subjects were randomly assigned to the groups to protect against any biases and so that probability-based methods could be used to analyze the data. Of the 11,034 individuals in the control group, 189 subsequently experienced heart attacks, whereas only 104 of the 11,037 in the aspirin group had a heart attack. The incidence rate of heart attacks in the treatment group was only about half that in the control group. One possible explanation for this result is chance variation: that aspirin really doesn’t have the desired effect and the observed difference is just typical variation in the same way that tossing two identical coins would usually produce different numbers of heads. However, in this case, inferential methods suggest that chance variation by itself cannot adequately explain the magnitude of the observed difference. ■

Example 1.5

An engineer wishes to investigate the effects of both adhesive type and conductor material on bond strength when mounting an integrated circuit (IC) on a certain substrate. Two adhesive types and two conductor materials are under consideration. Two observations are made for each adhesive-type/conductor-material combination, resulting in the accompanying data:

Adhesive Type   Conductor Material   Observed Bond Strength   Average
1               1                    82, 77                   79.5
1               2                    75, 87                   81.0
2               1                    84, 80                   82.0
2               2                    78, 90                   84.0

Figure 1.3 Average bond strengths in Example 1.5 (average strength, on a scale from about 80 to 85, plotted against conducting material 1 and 2, with one line for each adhesive type)

The resulting average bond strengths are pictured in Figure 1.3. It appears that adhesive type 2 improves bond strength as compared with type 1 by about the same amount whichever one of the conducting materials is used, with the 2, 2 combination being best. Inferential methods can again be used to judge whether these effects are real or simply due to chance variation.

Suppose additionally that there are two cure times under consideration and also two types of IC post coating. There are then $2 \cdot 2 \cdot 2 \cdot 2 = 16$ combinations of these four factors, and our engineer may not have enough resources to make even a single observation for each of these combinations. In Chapter 11, we will see how the careful selection of a fraction of these possibilities will usually yield the desired information. ■
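The arithmetic in Example 1.5 is easy to check by machine. The sketch below recomputes the four cell averages from the tabulated bond strengths, compares the two adhesive types at each conductor material, and enumerates the 16 factor-level combinations. The dictionary layout and the factor names `cure_time` and `post_coating` are illustrative choices, not notation from the text.

```python
from itertools import product
from statistics import mean

# Observed bond strengths keyed by (adhesive type, conductor material)
bond = {(1, 1): [82, 77], (1, 2): [75, 87], (2, 1): [84, 80], (2, 2): [78, 90]}

# Average bond strength for each adhesive/conductor combination
averages = {combo: mean(obs) for combo, obs in bond.items()}

# Improvement of adhesive type 2 over type 1 at each conductor material
adhesive_effect = {c: averages[(2, c)] - averages[(1, c)] for c in (1, 2)}
print(averages)
print(adhesive_effect)

# Four factors, each at two levels: every combination of levels
factors = ["adhesive", "conductor", "cure_time", "post_coating"]
combos = list(product((1, 2), repeat=len(factors)))
print(len(combos))
```

The two adhesive effects come out as 2.5 and 3.0, which is the "about the same amount" noted in the example.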


EXERCISES Section 1.1 (1–9)

1. Give one possible sample of size 4 from each of the following populations:
   a. All daily newspapers published in the United States
   b. All companies listed on the New York Stock Exchange
   c. All students at your college or university
   d. All grade point averages of students at your college or university

2. For each of the following hypothetical populations, give a plausible sample of size 4:
   a. All distances that might result when you throw a football
   b. Page lengths of books published 5 years from now
   c. All possible earthquake-strength measurements (Richter scale) that might be recorded in California during the next year
   d. All possible yields (in grams) from a certain chemical reaction carried out in a laboratory

3. Consider the population consisting of all computers of a certain brand and model, and focus on whether a computer needs service while under warranty.
   a. Pose several probability questions based on selecting a sample of 100 such computers.
   b. What inferential statistics question might be answered by determining the number of such computers in a sample of size 100 that need warranty service?

4. a. Give three different examples of concrete populations and three different examples of hypothetical populations.
   b. For one each of your concrete and your hypothetical populations, give an example of a probability question and an example of an inferential statistics question.

5. Many universities and colleges have instituted supplemental instruction (SI) programs, in which a student facilitator meets regularly with a small group of students enrolled in the course to promote discussion of course material and enhance subject mastery. Suppose that students in a large statistics course (what else?) are randomly divided into a control group that will not participate in SI and a treatment group that will participate. At the end of the term, each student’s total score in the course is determined.
   a. Are the scores from the SI group a sample from an existing population? If so, what is it? If not, what is the relevant conceptual population?
   b. What do you think is the advantage of randomly dividing the students into the two groups rather than letting each student choose which group to join?
   c. Why didn’t the investigators put all students in the treatment group?
   Note: The article “Supplemental Instruction: An Effective Component of Student Affairs Programming” (J. of College Student Devel., 1997: 577–586) discusses the analysis of data from several SI programs.

6. The California State University (CSU) system consists of 23 campuses, from San Diego State in the south to Humboldt State near the Oregon border. A CSU administrator wishes to make an inference about the average distance between the hometowns of students and their campuses. Describe and discuss several different sampling methods that might be employed. Would this be an enumerative or an analytic study? Explain your reasoning.

7. A certain city divides naturally into ten district neighborhoods. How might a real estate appraiser select a sample of single-family homes that could be used as a basis for developing an equation to predict appraised value from characteristics such as age, size, number of bathrooms, distance to the nearest school, and so on? Is the study enumerative or analytic?

8. The amount of flow through a solenoid valve in an automobile’s pollution-control system is an important characteristic. An experiment was carried out to study how flow rate depended on three factors: armature length, spring load, and bobbin depth. Two different levels (low and high) of each factor were chosen, and a single observation on flow was made for each combination of levels.
   a. The resulting data set consisted of how many observations?
   b. Is this an enumerative or analytic study? Explain your reasoning.

9. In a famous experiment carried out in 1882, Michelson and Newcomb obtained 66 observations on the time it took for light to travel between two locations in Washington, D.C. A few of the measurements (coded in a certain manner) were 31, 23, 32, 36, 22, 26, 27, and 31.
   a. Why are these measurements not identical?
   b. Is this an enumerative study? Why or why not?

1.2 Pictorial and Tabular Methods in Descriptive Statistics

Descriptive statistics can be divided into two general subject areas. In this section, we consider representing a data set using visual techniques. In Sections 1.3 and 1.4, we will develop some numerical summary measures for data sets. Many visual techniques may already be familiar to you: frequency tables, tally sheets, histograms, pie charts, bar graphs, scatter diagrams, and the like. Here we focus on a selected few of these techniques that are most useful and relevant to probability and inferential statistics.

Notation

Some general notation will make it easier to apply our methods and formulas to a wide variety of practical problems. The number of observations in a single sample, that is, the sample size, will often be denoted by $n$, so that $n = 4$ for the sample of universities {Stanford, Iowa State, Wyoming, Rochester} and also for the sample of pH measurements {6.3, 6.2, 5.9, 6.5}. If two samples are simultaneously under consideration, either $m$ and $n$ or $n_1$ and $n_2$ can be used to denote the numbers of observations. Thus if {29.7, 31.6, 30.9} and {28.7, 29.5, 29.4, 30.3} are thermal-efficiency measurements for two different types of diesel engines, then $m = 3$ and $n = 4$.

Given a data set consisting of $n$ observations on some variable $x$, the individual observations will be denoted by $x_1, x_2, x_3, \ldots, x_n$. The subscript bears no relation to the magnitude of a particular observation. Thus $x_1$ will not in general be the smallest observation in the set, nor will $x_n$ typically be the largest. In many applications, $x_1$ will be the first observation gathered by the experimenter, $x_2$ the second, and so on. The $i$th observation in the data set will be denoted by $x_i$.

Stem-and-Leaf Displays

Consider a numerical data set $x_1, x_2, \ldots, x_n$ for which each $x_i$ consists of at least two digits. A quick way to obtain an informative visual representation of the data set is to construct a stem-and-leaf display.

Constructing a Stem-and-Leaf Display

1. Select one or more leading digits for the stem values. The trailing digits become the leaves.

2. List possible stem values in a vertical column.

3. Record the leaf for each observation beside the corresponding stem value.

4. Indicate the units for stems and leaves someplace in the display.
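The four steps above can be sketched in Python. This is one possible implementation under stated assumptions, not a standard routine: the function name `stem_and_leaf` and its `leaf_unit` parameter are hypothetical, the data are assumed to be non-negative integers, each value is truncated (not rounded) to the leaf unit, and the leaves on each row are sorted. The exam scores in the usage lines are made up.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Build a stem-and-leaf display for non-negative integer data.

    Each value is truncated to a multiple of leaf_unit; the last remaining
    digit becomes the leaf and the digits before it become the stem.
    """
    rows = defaultdict(list)
    for v in values:
        stem, leaf = divmod(v // leaf_unit, 10)   # step 1: split into stem and leaf
        rows[stem].append(leaf)                   # step 3: record leaf beside its stem
    lines = [f"Stem unit: {10 * leaf_unit}, leaf unit: {leaf_unit}"]   # step 4
    for stem in range(min(rows), max(rows) + 1):  # step 2: list every possible stem
        leaves = "".join(str(d) for d in sorted(rows[stem]))
        lines.append(f"{stem:>3} | {leaves}")
    return lines

# Hypothetical exam scores; 83 should appear with stem 8 and leaf 3
scores = [83, 85, 91, 67, 70, 74, 78]
for line in stem_and_leaf(scores):
    print(line)
```

Listing every stem between the smallest and largest, even those with no leaves, keeps any gaps in the data visible in the display.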

If the data set consists of exam scores, each between 0 and 100, the score of 83 would have a stem of 8 and a leaf of 3. For a data set of automobile fuel efficiencies (mpg), all between 8.1 and 47.8, we could use the tens digit as the stem, so 32.6 would then have a leaf of 2.6. In general, a display based on between 5 and 20 stems is recommended.

Example 1.6

The use of alcohol by college students is of great concern not only to those in the academic community but also, because of potential health and safety consequences, to society at large. The article “Health and Behavioral Consequences of Binge Drinking in College” (J. of the Amer. Med. Assoc., 1994: 1672–1677) reported on a comprehensive study of heavy drinking on campuses across the United States. A binge episode was defined as five or more drinks in a row for males and four or more for females. Figure 1.4 shows a stem-and-leaf display of 140 values of $x$ = the percentage of undergraduate students who are binge drinkers. (These values were not given in the cited article, but our display agrees with a picture of the data that did appear.)


Example 1.7


0 | 4
1 | 1345678889
2 | 1223456666777889999
3 | 0112233344555666677777888899999
4 | 111222223344445566666677788888999
5 | 00111222233455666667777888899
6 | 01111244455666778

Stem: tens digit
Leaf: ones digit

Figure 1.4 Stem-and-leaf display for the percentage of binge drinkers at each of the 140 colleges

The first leaf on the stem 2 row is 1, which tells us that 21% of the students at one of the colleges in the sample were binge drinkers. Without the identification of stem digits and leaf digits on the display, we wouldn’t know whether the stem 2, leaf 1 observation should be read as 21%, 2.1%, or .21%.

When creating a display by hand, ordering the leaves from smallest to largest on each line can be time-consuming. This ordering usually contributes little if any extra information. Suppose the observations had been listed in alphabetical order by school name, as

16% 33% 64% 37% 31% . . .

Then placing these values on the display in this order would result in the stem 1 row having 6 as its first leaf, and the beginning of the stem 3 row would be

3 | 371 . . .

The display suggests that a typical or representative value is in the stem 4 row, perhaps in the mid-40% range. The observations are not highly concentrated about this typical value, as would be the case if all values were between 20% and 49%. The display rises to a single peak as we move downward, and then declines; there are no gaps in the display. The shape of the display is not perfectly symmetric, but instead appears to stretch out a bit more in the direction of low leaves than in the direction of high leaves. Lastly, there are no observations that are unusually far from the bulk of the data (no outliers), as would be the case if one of the 26% values had instead been 86%. The most surprising feature of this data is that, at most colleges in the sample, at least one-quarter of the students are binge drinkers. The problem of heavy drinking on campuses is much more pervasive than many had suspected. ■

A stem-and-leaf display conveys information about the following aspects of the data:

• identification of a typical or representative value

• extent of spread about the typical value

• presence of any gaps in the data

• extent of symmetry in the distribution of values

• number and location of peaks

• presence of any outlying values

Figure 1.5 presents stem-and-leaf displays for a random sample of lengths of golf courses (yards) that have been designated by Golf Magazine as among the most challenging in the United States. Among the sample of 40 courses, the shortest is 6433 yards long, and the longest is 7280 yards. The lengths appear to be distributed in a roughly uniform fashion over the range of values in the sample. Notice that a stem choice here of either a single digit (6 or 7) or three digits (643, . . . , 728) would yield an uninformative display, the first because of too few stems and the latter because of too many.

Statistical software packages do not generally produce displays with multiple-digit stems. The Minitab display in Figure 1.5(b) results from truncating each observation by deleting the ones digit.

(a)

64 | 35 64 33 70
65 | 26 27 06 83
66 | 05 94 14
67 | 90 70 00 98 70 45 13
68 | 90 70 73 50
69 | 00 27 36 04
70 | 51 05 11 40 50 22
71 | 31 69 68 05 13 65
72 | 80 09

Stem: Thousands and hundreds digits
Leaf: Tens and ones digits

(b)

Stem-and-leaf of yardage  N = 40
Leaf Unit = 10

  4   64  3367
  8   65  0228
 11   66  019
 18   67  0147799
 (4)  68  5779
 18   69  0023
 14   70  012455
  8   71  013666
  2   72  08

Figure 1.5 Stem-and-leaf displays of golf course lengths: (a) two-digit leaves; (b) display from Minitab with truncated one-digit leaves ■

Dotplots

A dotplot is an attractive summary of numerical data when the data set is reasonably small or there are relatively few distinct data values. Each observation is represented by a dot above the corresponding location on a horizontal measurement scale. When a value occurs more than once, there is a dot for each occurrence, and these dots are stacked vertically. As with a stem-and-leaf display, a dotplot gives information about location, spread, extremes, and gaps.

Here is data on state-by-state appropriations for higher education as a percentage of state and local tax revenue for the fiscal year 2006–2007 (from the Statistical Abstract of the United States); values are listed in order of state abbreviations (AL first, WY last):

10.8 6.9 8.0 8.8 7.3 3.6 4.1 6.0 4.4 8.3 8.1 8.0 5.9 5.9 7.6 8.9 8.5 8.1 4.2 5.7 4.0 6.7 5.8 9.9 5.6 5.8 9.3 6.2 2.5 4.5

12.8 3.5 10.0 9.1 5.0 8.1 5.3 3.9 4.0 8.0 7.4 7.5 8.4 8.3 2.6 5.1 6.0 7.0 6.5 10.3

Figure 1.6 shows a dotplot of the data. The most striking feature is the substantial state-to-state variability. The largest value (for New Mexico) and the two smallest values (New Hampshire and Vermont) are somewhat separated from the bulk of the data, though not perhaps by enough to be considered outliers.

Figure 1.6 A dotplot of the data from Example 1.8 (horizontal scale marked at 2.8, 4.2, 5.6, 7.0, 8.4, 9.8, 11.2, and 12.6) ■
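A rough text version of such a dotplot can be produced with a short Python sketch. The function `text_dotplot` and its `step` parameter (the width of one character column on the measurement scale) are illustrative assumptions; values closer together than `step` may share a column and are then stacked, so the output is only an approximation to a properly drawn plot.

```python
from collections import Counter

def text_dotplot(values, step):
    """Return text rows: dots stacked above a horizontal scale, one column per step."""
    lo = min(values)
    # Column index for each observation, counted from the smallest value
    counts = Counter(round((v - lo) / step) for v in values)
    width = max(counts) + 1
    height = max(counts.values())
    rows = []
    for level in range(height, 0, -1):            # top row first
        row = "".join("." if counts.get(col, 0) >= level else " "
                      for col in range(width))
        rows.append(row.rstrip())
    rows.append("-" * width)                      # the measurement axis
    return rows

# Appropriations data from Example 1.8, in order of state abbreviation
appropriations = [
    10.8, 6.9, 8.0, 8.8, 7.3, 3.6, 4.1, 6.0, 4.4, 8.3,
    8.1, 8.0, 5.9, 5.9, 7.6, 8.9, 8.5, 8.1, 4.2, 5.7,
    4.0, 6.7, 5.8, 9.9, 5.6, 5.8, 9.3, 6.2, 2.5, 4.5,
    12.8, 3.5, 10.0, 9.1, 5.0, 8.1, 5.3, 3.9, 4.0, 8.0,
    7.4, 7.5, 8.4, 8.3, 2.6, 5.1, 6.0, 7.0, 6.5, 10.3,
]
for row in text_dotplot(appropriations, step=0.2):
    print(row)
```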

Example 1.8

If the number of compressive strength observations in Example 1.2 had been much larger than the $n = 27$ actually obtained, it would be quite cumbersome to construct a dotplot. Our next technique is well suited to such situations.

Histograms

Some numerical data is obtained by counting to determine the value of a variable (the number of traffic citations a person received during the last year, the number of customers arriving for service during a particular period), whereas other data is obtained by taking measurements (weight of an individual, reaction time to a particular stimulus). The prescription for drawing a histogram is generally different for these two cases.

DEFINITION

A numerical variable is discrete if its set of possible values either is finite or else can be listed in an infinite sequence (one in which there is a first number, a second number, and so on). A numerical variable is continuous if its possible values consist of an entire interval on the number line.

A discrete variable x almost always results from counting, in which case possible values are 0, 1, 2, 3, . . . or some subset of these integers. Continuous variables arise from making measurements. For example, if x is the pH of a chemical substance, then in theory x could be any number between 0 and 14: 7.0, 7.03, 7.032, and so on. Of course, in practice there are limitations on the degree of accuracy of any measuring instrument, so we may not be able to determine pH, reaction time, height, and concentration to an arbitrarily large number of decimal places. However, from the point of view of creating mathematical models for distributions of data, it is helpful to imagine an entire continuum of possible values.

Consider data consisting of observations on a discrete variable x. The frequency of any particular x value is the number of times that value occurs in the data set. The relative frequency of a value is the fraction or proportion of times the value occurs:

$$\text{relative frequency of a value} = \frac{\text{number of times the value occurs}}{\text{number of observations in the data set}}$$

Suppose, for example, that our data set consists of 200 observations on $x$ = the number of courses a college student is taking this term. If 70 of these x values are 3, then

frequency of the x value 3: 70

relative frequency of the x value 3: $\frac{70}{200} = .35$

Multiplying a relative frequency by 100 gives a percentage; in the college-course example, 35% of the students in the sample are taking three courses. The relative frequencies, or percentages, are usually of more interest than the frequencies themselves. In theory, the relative frequencies should sum to 1, but in practice the sum may differ slightly from 1 because of rounding. A frequency distribution is a tabulation of the frequencies and/or relative frequencies.
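A frequency distribution for discrete data can be tabulated with a short Python sketch. The function name is illustrative, and the data set below is hypothetical: only the total of 200 observations and the 70 students taking three courses come from the text, while the remaining counts are invented to fill out the example.

```python
from collections import Counter

def frequency_distribution(observations):
    """Map each distinct value to (frequency, relative frequency)."""
    n = len(observations)
    counts = Counter(observations)
    return {x: (counts[x], counts[x] / n) for x in sorted(counts)}

# Hypothetical course loads: 200 students, 70 of whom take 3 courses
courses = [2] * 10 + [3] * 70 + [4] * 80 + [5] * 40

dist = frequency_distribution(courses)
for x, (freq, rel) in dist.items():
    print(x, freq, rel)
```

With unrounded relative frequencies the sum is 1 up to floating-point error; the small departures from 1 mentioned in the text arise only after each relative frequency is rounded for display.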

Constructing a Histogram for Discrete Data

First, determine the frequency and relative frequency of each x value. Then mark possible x values on a horizontal scale. Above each value, draw a rectangle whose height is the relative frequency (or alternatively, the frequency) of that value.

This construction ensures that the area of each rectangle is proportional to the relative frequency of the value. Thus if the relative frequencies of $x = 1$ and $x = 5$ are .35 and .07, respectively, then the area of the rectangle above 1 is five times the area of the rectangle above 5.

Example 1.9

How unusual is a no-hitter or a one-hitter in a major league baseball game, and how frequently does a team get more than 10, 15, or even 20 hits? Table 1.1 is a frequency distribution for the number of hits per team per game for all nine-inning games that were played between 1989 and 1993.

Table 1.1 Frequency Distribution for Hits in Nine-Inning Games

Hits/Game   Number of Games   Relative Frequency   Hits/Game   Number of Games   Relative Frequency
0           20                .0010                14          569               .0294
1           72                .0037                15          393               .0203
2           209               .0108                16          253               .0131
3           527               .0272                17          171               .0088
4           1048              .0541                18          97                .0050
5           1457              .0752                19          53                .0027
6           1988              .1026                20          31                .0016
7           2256              .1164                21          19                .0010
8           2403              .1240                22          13                .0007
9           2256              .1164                23          5                 .0003
10          1967              .1015                24          1                 .0001
11          1509              .0779                25          0                 .0000
12          1230              .0635                26          1                 .0001
13          834               .0430                27          1                 .0001

Total                                                          19,383            1.0005

The corresponding histogram in Figure 1.7 rises rather smoothly to a single peak and then declines. The histogram extends a bit more on the right (toward large values) than it does on the left—a slight “positive skew.”

Figure 1.7 Histogram of number of hits per nine-inning game (horizontal axis: hits/game, marked at 0, 10, and 20; vertical axis: relative frequency, marked at 0, .05, and .10)
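The relative frequencies in Table 1.1 can be recomputed from the game counts, and sums of them answer questions such as how often a team is held to at most two hits. The Python sketch below is illustrative; the helper name `proportion` is an assumption. Because it adds unrounded relative frequencies, its results can differ in the fourth decimal place from sums of the rounded table entries.

```python
# Number of games with 0, 1, ..., 27 hits, copied from Table 1.1
games = [20, 72, 209, 527, 1048, 1457, 1988, 2256, 2403, 2256,
         1967, 1509, 1230, 834, 569, 393, 253, 171, 97, 53,
         31, 19, 13, 5, 1, 0, 1, 1]

total = sum(games)                        # total number of team-games
rel_freq = [g / total for g in games]     # unrounded relative frequencies

def proportion(lo, hi):
    """Proportion of games with between lo and hi hits, inclusive."""
    return sum(rel_freq[lo:hi + 1])

print(total)
print(round(proportion(0, 2), 4))         # at most two hits
print(round(proportion(5, 10), 4))        # between 5 and 10 hits, inclusive
print(round(proportion(11, 27), 4))       # more than 10 hits
```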


Either from the tabulated information or from the histogram itself, we can determine the following:

$$\text{proportion of games with at most two hits} = \begin{pmatrix}\text{relative frequency}\\ \text{for } x = 0\end{pmatrix} + \begin{pmatrix}\text{relative frequency}\\ \text{for } x = 1\end{pmatrix} + \begin{pmatrix}\text{relative frequency}\\ \text{for } x = 2\end{pmatrix} = .0010 + .0037 + .0108 = .0155$$

Similarly,

$$\text{proportion of games with between 5 and 10 hits (inclusive)} = .0752 + .1026 + \cdots + .1015 = .6361$$

That is, roughly 64% of all these games resulted in between 5 and 10 (inclusive) hits. ■

Constructing a histogram for continuous data (measurements) entails subdividing the measurement axis into a suitable number of class intervals or classes, such that each observation is contained in exactly one class. Suppose, for example, that we have 50 observations on $x$ = fuel efficiency of an automobile (mpg), the smallest of which is 27.8 and the largest of which is 31.4. Then we could use the class boundaries 27.5, 28.0, 28.5, . . . , and 31.5 as shown here:

27.5 28.0 28.5 29.0 29.5 30.0 30.5 31.0 31.5

One potential difficulty is that occasionally an observation, for example 29.0, lies on a class boundary and therefore does not fall in exactly one interval. One way to deal with this problem is to use boundaries like 27.55, 28.05, . . . , 31.55. Adding a hundredths digit to the class boundaries prevents observations from falling on the resulting boundaries. Another approach is to use the classes 27.5–<28.0, 28.0–<28.5, . . . , 31.0–<31.5. Then 29.0 falls in the class 29.0–<29.5 rather than in the class 28.5–<29.0. In other words, with this convention, an observation on a boundary is placed in the interval to the right of the boundary. This is how Minitab constructs a histogram.

Constructing a Histogram for Continuous Data: Equal Class Widths

Determine the frequency and relative frequency for each class. Mark the class boundaries on a horizontal measurement axis. Above each class interval, draw a rectangle whose height is the corresponding relative frequency (or frequency).

Example 1.10

Power companies need information about customer usage to obtain accurate forecasts of demands. Investigators from Wisconsin Power and Light determined energy consumption (BTUs) during a particular period for a sample of 90 gas-heated homes. An adjusted consumption value was calculated as follows:

$$\text{adjusted consumption} = \frac{\text{consumption}}{(\text{weather, in degree days})(\text{house area})}$$

This resulted in the accompanying data (part of the stored data set FURNACE.MTW available in Minitab), which we have ordered from smallest to largest.

 2.97  4.00  5.20  5.56  5.94  5.98  6.35  6.62  6.72  6.78
 6.80  6.85  6.94  7.15  7.16  7.23  7.29  7.62  7.62  7.69
 7.73  7.87  7.93  8.00  8.26  8.29  8.37  8.47  8.54  8.58
 8.61  8.67  8.69  8.81  9.07  9.27  9.37  9.43  9.52  9.58
 9.60  9.76  9.82  9.83  9.83  9.84  9.96 10.04 10.21 10.28
10.28 10.30 10.35 10.36 10.40 10.49 10.50 10.64 10.95 11.09
11.12 11.21 11.29 11.43 11.62 11.70 11.70 12.16 12.19 12.28
12.31 12.62 12.69 12.71 12.91 12.92 13.11 13.38 13.42 13.43
13.47 13.60 13.96 14.24 14.35 15.12 15.24 16.06 16.90 18.26

Class      Frequency   Relative frequency
1–<3       1           .011
3–<5       1           .011
5–<7       11          .122
7–<9       21          .233
9–<11      25          .278
11–<13     17          .189
13–<15     9           .100
15–<17     4           .044
17–<19     1           .011

Figure 1.8 Histogram of the energy consumption data from Example 1.10 (horizontal axis: BTUIN, class boundaries 1, 3, . . . , 19; vertical axis: percent, 0 to 30)
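The class frequencies reported for this example can be re-derived from the 90 ordered values. The Python sketch below assumes equal-width classes 1–<3, 3–<5, . . . , 17–<19, with an observation on a boundary assigned to the class on its right; the function name `class_frequencies` and its parameters are illustrative.

```python
def class_frequencies(values, start, width, k):
    """Count observations in k equal-width classes [start + i*width, start + (i+1)*width)."""
    counts = [0] * k
    for v in values:
        idx = int((v - start) // width)   # boundary values fall in the class to the right
        if not 0 <= idx < k:
            raise ValueError(f"{v} lies outside the classes")
        counts[idx] += 1
    return counts

# Adjusted energy consumption values from Example 1.10
data = [
    2.97, 4.00, 5.20, 5.56, 5.94, 5.98, 6.35, 6.62, 6.72, 6.78,
    6.80, 6.85, 6.94, 7.15, 7.16, 7.23, 7.29, 7.62, 7.62, 7.69,
    7.73, 7.87, 7.93, 8.00, 8.26, 8.29, 8.37, 8.47, 8.54, 8.58,
    8.61, 8.67, 8.69, 8.81, 9.07, 9.27, 9.37, 9.43, 9.52, 9.58,
    9.60, 9.76, 9.82, 9.83, 9.83, 9.84, 9.96, 10.04, 10.21, 10.28,
    10.28, 10.30, 10.35, 10.36, 10.40, 10.49, 10.50, 10.64, 10.95, 11.09,
    11.12, 11.21, 11.29, 11.43, 11.62, 11.70, 11.70, 12.16, 12.19, 12.28,
    12.31, 12.62, 12.69, 12.71, 12.91, 12.92, 13.11, 13.38, 13.42, 13.43,
    13.47, 13.60, 13.96, 14.24, 14.35, 15.12, 15.24, 16.06, 16.90, 18.26,
]

counts = class_frequencies(data, start=1, width=2, k=9)
rel = [round(c / len(data), 3) for c in counts]
print(counts)
print(rel)

# Exact proportions of observations below 9 and below 10
below_9 = sum(v < 9 for v in data) / len(data)
below_10 = sum(v < 10 for v in data) / len(data)
print(round(below_9, 3), round(below_10, 3))
```

Working from the raw data gives exact proportions, whereas reading them off a histogram requires interpolating within a class.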

We let Minitab select the class intervals. The most striking feature of the histogram in Figure 1.8 is its resemblance to a bell-shaped (and therefore symmetric) curve, with the point of symmetry roughly at 10.

From the histogram,

$$\text{proportion of observations less than 9} \approx .01 + .01 + .12 + .23 = .37 \quad (\text{exact value} = 34/90 = .378)$$

The relative frequency for the 9–<11 class is about .27, so we estimate that roughly half of this, or .135, is between 9 and 10. Thus

$$\text{proportion of observations less than 10} \approx .37 + .135 = .505 \quad (\text{slightly more than 50\%})$$

The exact value of this proportion is 47/90 = .522. ■

There are no hard-and-fast rules concerning either the number of classes or the choice of classes themselves. Between 5 and 20 classes will be satisfactory for most data sets. Generally, the larger the number of observations in a data set, the more classes should be used. A reasonable rule of thumb is

$$\text{number of classes} \approx \sqrt{\text{number of observations}}$$
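The exact proportions quoted for the energy consumption data, and the square-root rule of thumb for the number of classes, can be checked with a short script. This is only a sketch: the list is retyped from the 90 values listed for Example 1.10, and the names `btuin` and `proportion_below` are illustrative choices, not anything from Minitab.

```python
import math

# The 90 adjusted energy consumption values (BTUIN) listed for Example 1.10.
btuin = [
    2.97, 4.00, 5.20, 5.56, 5.94, 5.98, 6.35, 6.62, 6.72, 6.78,
    6.80, 6.85, 6.94, 7.15, 7.16, 7.23, 7.29, 7.62, 7.62, 7.69,
    7.73, 7.87, 7.93, 8.00, 8.26, 8.29, 8.37, 8.47, 8.54, 8.58,
    8.61, 8.67, 8.69, 8.81, 9.07, 9.27, 9.37, 9.43, 9.52, 9.58,
    9.60, 9.76, 9.82, 9.83, 9.83, 9.84, 9.96, 10.04, 10.21, 10.28,
    10.28, 10.30, 10.35, 10.36, 10.40, 10.49, 10.50, 10.64, 10.95, 11.09,
    11.12, 11.21, 11.29, 11.43, 11.62, 11.70, 11.70, 12.16, 12.19, 12.28,
    12.31, 12.62, 12.69, 12.71, 12.91, 12.92, 13.11, 13.38, 13.42, 13.43,
    13.47, 13.60, 13.96, 14.24, 14.35, 15.12, 15.24, 16.06, 16.90, 18.26,
]

n = len(btuin)


def proportion_below(data, cutoff):
    """Fraction of observations strictly less than the cutoff."""
    return sum(x < cutoff for x in data) / len(data)


exact_below_9 = proportion_below(btuin, 9)
exact_below_10 = proportion_below(btuin, 10)

# Rule of thumb: number of classes is roughly sqrt(number of observations).
suggested_classes = round(math.sqrt(n))

print(f"n = {n}")
print(f"proportion below 9:  {exact_below_9:.3f}")
print(f"proportion below 10: {exact_below_10:.3f}")
print(f"suggested number of classes: {suggested_classes}")
```

For these data the rule of thumb suggests about nine classes, which happens to match the nine classes Minitab selected.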


Figure 1.9 Selecting class intervals for “varying density” data: (a) many short equal-width intervals; (b) a few wide equal-width intervals; (c) unequal-width intervals

Equal-width classes may not be a sensible choice if there are some regions of the measurement scale that have a high concentration of data values and other parts where data is quite sparse. Figure 1.9 shows a dotplot of such a data set; there is high concentration in the middle, and relatively few observations stretched out to either side. Using a small number of equal-width classes results in almost all observations falling in just one or two of the classes. If a large number of equal-width classes are used, many classes will have zero frequency. A sound choice is to use a few wider intervals near extreme observations and narrower intervals in the region of high concentration.

Constructing a Histogram for Continuous Data: Unequal Class Widths

After determining frequencies and relative frequencies, calculate the height of each rectangle using the formula

$$\text{rectangle height} = \frac{\text{relative frequency of the class}}{\text{class width}}$$

The resulting rectangle heights are usually called densities, and the vertical scale is the density scale. This prescription will also work when class widths are equal.

Example 1.11

Corrosion of reinforcing steel is a serious problem in concrete structures located in environments affected by severe weather conditions. For this reason, researchers have been investigating the use of reinforcing bars made of composite material. One study was carried out to develop guidelines for bonding glass-fiber-reinforced plastic rebars to concrete (“Design Recommendations for Bond of GFRP Rebars to Concrete,” J. of Structural Engr., 1996: 247–254). Consider the following 48 observations on measured bond strength:

11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0 13.4 17.1 9.3 5.6 5.7 5.4 5.2 5.1 4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8 3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9 8.2 10.7 14.2 7.6 5.2 5.5 5.1 5.0 5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6

Class                2–<4    4–<6    6–<8    8–<12   12–<20   20–<30
Frequency            9       15      5       9       8        2
Relative frequency   .1875   .3125   .1042   .1875   .1667    .0417
Density              .094    .156    .052    .047    .021     .004

The resulting histogram appears in Figure 1.10. The right or upper tail stretches out much farther than does the left or lower tail—a substantial departure from symmetry.
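The density calculation for the bond strength data can be reproduced with a few lines of code. The sketch below assumes classes that include their lower boundary and exclude their upper boundary (2–<4, 4–<6, and so on); the variable names are illustrative.

```python
from bisect import bisect_right

# The 48 bond strength observations from Example 1.11.
bond_strength = [
    11.5, 12.1, 9.9, 9.3, 7.8, 6.2, 6.6, 7.0, 13.4, 17.1, 9.3, 5.6,
    5.7, 5.4, 5.2, 5.1, 4.9, 10.7, 15.2, 8.5, 4.2, 4.0, 3.9, 3.8,
    3.6, 3.4, 20.6, 25.5, 13.8, 12.6, 13.1, 8.9, 8.2, 10.7, 14.2, 7.6,
    5.2, 5.5, 5.1, 5.0, 5.2, 4.8, 4.1, 3.8, 3.7, 3.6, 3.6, 3.6,
]

# Unequal-width class boundaries: 2-<4, 4-<6, 6-<8, 8-<12, 12-<20, 20-<30.
boundaries = [2, 4, 6, 8, 12, 20, 30]

# Tally each observation into the class whose lower boundary is <= x < upper boundary.
counts = [0] * (len(boundaries) - 1)
for x in bond_strength:
    counts[bisect_right(boundaries, x) - 1] += 1

n = len(bond_strength)
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
rel_freqs = [c / n for c in counts]

# Rectangle height (density) = relative frequency / class width.
densities = [rf / w for rf, w in zip(rel_freqs, widths)]

# Each rectangle's area is its relative frequency, so the areas sum to 1.
total_area = sum(d * w for d, w in zip(densities, widths))

for lo, hi, c, rf, d in zip(boundaries, boundaries[1:], counts, rel_freqs, densities):
    print(f"{lo}-<{hi}: frequency={c}, relative frequency={rf:.4f}, density={d:.3f}")
print(f"total area = {total_area:.3f}")
```

Dividing by class width is what keeps the wide 12–<20 and 20–<30 classes from visually dominating the picture.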


Figure 1.10 A Minitab density histogram for the bond strength data of Example 1.11 (horizontal axis: bond strength; vertical axis: density)


When class widths are unequal, not using a density scale will give a picture with distorted areas. For equal class widths, the divisor is the same in each density calculation, and the extra arithmetic simply results in a rescaling of the vertical axis (i.e., the histogram using relative frequency and the one using density will have exactly the same appearance). A density histogram does have one interesting property. Multiplying both sides of the formula for density by the class width gives

$$\text{relative frequency} = (\text{class width})(\text{density}) = (\text{rectangle width})(\text{rectangle height}) = \text{rectangle area}$$

That is, the area of each rectangle is the relative frequency of the corresponding class. Furthermore, since the sum of relative frequencies should be 1, the total area of all rectangles in a density histogram is 1. It is always possible to draw a histogram so that the area equals the relative frequency (this is true also for a histogram of discrete data): just use the density scale. This property will play an important role in creating models for distributions in Chapter 4.

Histogram Shapes

Histograms come in a variety of shapes. A unimodal histogram is one that rises to a single peak and then declines. A bimodal histogram has two different peaks. Bimodality can occur when the data set consists of observations on two quite different kinds of individuals or objects. For example, consider a large data set consisting of driving times for automobiles traveling between San Luis Obispo, California, and Monterey, California (exclusive of stopping time for sightseeing, eating, etc.). This histogram would show two peaks: one for those cars that took the inland route (roughly 2.5 hours) and another for those cars traveling up the coast (3.5–4 hours). However, bimodality does not automatically follow in such situations. Only if the two separate histograms are “far apart” relative to their spreads will bimodality occur in the histogram of combined data. Thus a large data set consisting of heights of college students should not result in a bimodal histogram because the typical male height of about 69 inches is not far enough above the typical female height of about 64–65 inches. A histogram with more than two peaks is said to be multimodal. Of course, the number of peaks may well depend on the choice of class intervals, particularly with a small number of observations. The larger the number of classes, the more likely it is that bimodality or multimodality will manifest itself.

Example 1.12

Figure 1.11(a) shows a Minitab histogram of the weights (lb) of the 124 players listed on the rosters of the San Francisco 49ers and the New England Patriots (teams the author would like to see meet in the Super Bowl) as of Nov. 20, 2009.


Figure 1.11(b) is a smoothed histogram (actually what is called a density estimate) of the data from the R software package. Both the histogram and the smoothed histogram show three distinct peaks; the one on the right is for linemen, the middle peak corresponds to linebacker weights, and the peak on the left is for all other players (wide receivers, quarterbacks, etc.).
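The remark under Histogram Shapes that bimodality appears only when the two separate histograms are far apart relative to their spreads can be illustrated numerically. The sketch below rests on assumptions that are not in the text: each subgroup is modeled by a normal curve, the two subgroups are equally common, and the standard deviations (2.7 inches for heights, 0.25 hour for driving times) and the 3.75-hour coastal center are hypothetical values chosen only for illustration. The function `count_peaks` is likewise a made-up helper.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of the normal curve with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def count_peaks(mean1, mean2, sd, lo, hi, steps=4000):
    """Count local maxima of an equal-weight mixture of two normal curves on a grid."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    f = [0.5 * normal_pdf(x, mean1, sd) + 0.5 * normal_pdf(x, mean2, sd) for x in xs]
    return sum(1 for i in range(1, steps) if f[i] > f[i - 1] and f[i] >= f[i + 1])


# Heights (inches): centers 64.5 and 69, ASSUMED common sd of 2.7.
height_peaks = count_peaks(64.5, 69.0, 2.7, 50.0, 85.0)

# Driving times (hours): centers 2.5 and 3.75, ASSUMED common sd of 0.25.
drive_peaks = count_peaks(2.5, 3.75, 0.25, 1.0, 5.5)

print(f"peaks in combined height curve: {height_peaks}")
print(f"peaks in combined driving-time curve: {drive_peaks}")
```

Under these assumptions the height centers are less than two standard deviations apart and the combined curve has a single peak, whereas the driving-time centers are five standard deviations apart and two peaks survive.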

Figure 1.11 NFL player weights: (a) histogram (horizontal axis: weight; vertical axis: percent); (b) smoothed histogram (horizontal axis: player weight; vertical axis: density estimate)

Figure 1.12 Smoothed histograms: (a) symmetric unimodal; (b) bimodal; (c) positively skewed; and (d) negatively skewed

A histogram is symmetric if the left half is a mirror image of the right half. A unimodal histogram is positively skewed if the right or upper tail is stretched out compared with the left or lower tail and negatively skewed if the stretching is to the left. Figure 1.12 shows “smoothed” histograms, obtained by superimposing a smooth curve on the rectangles, that illustrate the various possibilities.


Qualitative Data

Both a frequency distribution and a histogram can be constructed when the data set is qualitative (categorical) in nature. In some cases, there will be a natural ordering of classes (for example, freshmen, sophomores, juniors, seniors, graduate students), whereas in other cases the order will be arbitrary (for example, Catholic, Jewish, Protestant, and the like). With such categorical data, the intervals above which rectangles are constructed should have equal width.

Example 1.13

The Public Policy Institute of California carried out a telephone survey of 2501 California adult residents during April 2006 to ascertain how they felt about various aspects of K-12 public education. One question asked was “Overall, how would you rate the quality of public schools in your neighborhood today?” Table 1.2 displays the frequencies and relative frequencies, and Figure 1.13 shows the corresponding histogram (bar chart).

Table 1.2 Frequency Distribution for the School Rating Data

Rating        Frequency   Relative Frequency
A             478         .191
B             893         .357
C             680         .272
D             178         .071
F             100         .040
Don’t know    172         .069
Total         2501        1.000

Figure 1.13 Histogram of the school rating data from Minitab (horizontal axis: rating, with categories A, B, C, D, F, Don’t know; vertical axis: relative frequency)

More than half the respondents gave an A or B rating, and only slightly more than 10% gave a D or F rating. The percentages for parents of public school children were somewhat more favorable to schools: 24%, 40%, 24%, 6%, 4%, and 2%. ■
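The relative frequencies in Table 1.2 and the two summary statements about the ratings can be recomputed directly from the frequencies. This is a minimal sketch; the dictionary name `ratings` is an illustrative choice.

```python
# Frequencies from Table 1.2 (school rating data).
ratings = {"A": 478, "B": 893, "C": 680, "D": 178, "F": 100, "Don't know": 172}

total = sum(ratings.values())

# Relative frequency = category frequency / total number of respondents.
rel_freq = {rating: count / total for rating, count in ratings.items()}

a_or_b = rel_freq["A"] + rel_freq["B"]
d_or_f = rel_freq["D"] + rel_freq["F"]

for rating, rf in rel_freq.items():
    print(f"{rating:<12}{ratings[rating]:>6}{rf:>8.3f}")
print(f"A or B: {a_or_b:.3f}")
print(f"D or F: {d_or_f:.3f}")
```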

Multivariate Data

Multivariate data is generally rather difficult to describe visually. Several methods for doing so appear later in the book, notably scatter plots for bivariate numerical data.


EXERCISES Section 1.2 (10–32)

10. Consider the strength data for beams given in Example 1.2.

a. Construct a stem-and-leaf display of the data. What appears to be a representative strength value? Do the observations appear to be highly concentrated about the representative value or rather spread out?

b. Does the display appear to be reasonably symmetric about a representative value, or would you describe its shape in some other way?

c. Do there appear to be any outlying strength values?

d. What proportion of strength observations in this sample exceed 10 MPa?

11. Every score in the following batch of exam scores is in the 60s, 70s, 80s, or 90s. A stem-and-leaf display with only the four stems 6, 7, 8, and 9 would not give a very detailed description of the distribution of scores. In such situations, it is desirable to use repeated stems. Here we could repeat the stem 6 twice, using 6L for scores in the low 60s (leaves 0, 1, 2, 3, and 4) and 6H for scores in the high 60s (leaves 5, 6, 7, 8, and 9). Similarly, the other stems can be repeated twice to obtain a display consisting of eight rows. Construct such a display for the given scores. What feature of the data is highlighted by this display?

74 89 80 93 64 67 72 70 66 85 89 81 81 71 74 82 85 63 72 81 81 95 84 81 80 70 69 66 60 83 85 98 84 68 90 82 69 72 87 88

12. The accompanying specific gravity values for various wood types used in construction appeared in the article “Bolted Connection Design Values Based on European Yield Model” (J. of Structural Engr., 1993: 2169–2186):

.31 .35 .36 .36 .37 .38 .40 .40 .40

.41 .41 .42 .42 .42 .42 .42 .43 .44

.45 .46 .46 .47 .48 .48 .48 .51 .54

.54 .55 .58 .62 .66 .66 .67 .68 .75

Construct a stem-and-leaf display using repeated stems (see the previous exercise), and comment on any interesting features of the display.

13. Allowable mechanical properties for structural design of metallic aerospace vehicles requires an approved method for statistically analyzing empirical test data. The article “Establishing Mechanical Property Allowables for Metals” (J. of Testing and Evaluation, 1998: 293–299) used the accompanying data on tensile ultimate strength (ksi) as a basis for addressing the difficulties in developing such a method.

122.2 124.2 124.3 125.6 126.3 126.5 126.5 127.2 127.3 127.5 127.9 128.6 128.8 129.0 129.2 129.4 129.6 130.2 130.4 130.8 131.3 131.4 131.4 131.5 131.6 131.6 131.8 131.8 132.3 132.4 132.4 132.5 132.5 132.5 132.5 132.6

132.7 132.9 133.0 133.1 133.1 133.1 133.1 133.2 133.2 133.2 133.3 133.3 133.5 133.5 133.5 133.8 133.9 134.0 134.0 134.0 134.0 134.1 134.2 134.3 134.4 134.4 134.6 134.7 134.7 134.7 134.8 134.8 134.8 134.9 134.9 135.2 135.2 135.2 135.3 135.3 135.4 135.5 135.5 135.6 135.6 135.7 135.8 135.8 135.8 135.8 135.8 135.9 135.9 135.9 135.9 136.0 136.0 136.1 136.2 136.2 136.3 136.4 136.4 136.6 136.8 136.9 136.9 137.0 137.1 137.2 137.6 137.6 137.8 137.8 137.8 137.9 137.9 138.2 138.2 138.3 138.3 138.4 138.4 138.4 138.5 138.5 138.6 138.7 138.7 139.0 139.1 139.5 139.6 139.8 139.8 140.0 140.0 140.7 140.7 140.9 140.9 141.2 141.4 141.5 141.6 142.9 143.4 143.5 143.6 143.8 143.8 143.9 144.1 144.5 144.5 147.7 147.7

a. Construct a stem-and-leaf display of the data by first deleting (truncating) the tenths digit and then repeating each stem value five times (once for leaves 1 and 2, a second time for leaves 3 and 4, etc.). Why is it relatively easy to identify a representative strength value?

b. Construct a histogram using equal-width classes with the first class having a lower limit of 122 and an upper limit of 124. Then comment on any interesting features of the histogram.

14. The accompanying data set consists of observations on shower-flow rate (L/min) for a sample of n = 129 houses in Perth, Australia (“An Application of Bayes Methodology to the Analysis of Diary Records in a Water Use Study,” J. Amer. Stat. Assoc., 1987: 705–711):

4.6 12.3 7.1 7.0 4.0 9.2 6.7 6.9 11.5 5.1 11.2 10.5 14.3 8.0 8.8 6.4 5.1 5.6 9.6 7.5 7.5 6.2 5.8 2.3 3.4 10.4 9.8 6.6 3.7 6.4 8.3 6.5 7.6 9.3 9.2 7.3 5.0 6.3 13.8 6.2 5.4 4.8 7.5 6.0 6.9 10.8 7.5 6.6 5.0 3.3 7.6 3.9 11.9 2.2 15.0 7.2 6.1 15.3 18.9 7.2 5.4 5.5 4.3 9.0 12.7 11.3 7.4 5.0 3.5 8.2 8.4 7.3 10.3 11.9 6.0 5.6 9.5 9.3 10.4 9.7 5.1 6.7 10.2 6.2 8.4 7.0 4.8 5.6 10.5 14.6

10.8 15.5 7.5 6.4 3.4 5.5 6.6 5.9 15.0 9.6 7.8 7.0 6.9 4.1 3.6 11.9 3.7 5.7 6.8 11.3 9.3 9.6 10.4 9.3 6.9 9.8 9.1 10.6 4.5 6.2 8.3 3.2 4.9 5.0 6.0 8.2 6.3 3.8 6.0

a. Construct a stem-and-leaf display of the data.

b. What is a typical, or representative, flow rate?

c. Does the display appear to be highly concentrated or spread out?

d. Does the distribution of values appear to be reasonably symmetric? If not, how would you describe the departure from symmetry?

e. Would you describe any observation as being far from the rest of the data (an outlier)?


15. Do running times of American movies differ somehow from running times of French movies? The author investigated this question by randomly selecting 25 recent movies of each type, resulting in the following running times:

Am: 94 90 95 93 128 95 125 91 104 116 162 102 90 110 92 113 116 90 97 103 95 120 109 91 138

Fr: 123 116 90 158 122 119 125 90 96 94 137 102 105 106 95 125 122 103 96 111 81 113 128 93 92

Construct a comparative stem-and-leaf display by listing stems in the middle of your paper and then placing the Am leaves out to the left and the Fr leaves out to the right. Then comment on interesting features of the display.

16. The article cited in Example 1.2 also gave the accompanying strength observations for cylinders:

6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3 7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2

a. Construct a comparative stem-and-leaf display (see the previous exercise) of the beam and cylinder data, and then answer the questions in parts (b)–(d) of Exercise 10 for the observations on cylinders.

b. In what ways are the two sides of the display similar? Are there any obvious differences between the beam observations and the cylinder observations?

c. Construct a dotplot of the cylinder data.

17. Temperature transducers of a certain type are shipped in batches of 50. A sample of 60 batches was selected, and the number of transducers in each batch not conforming to design specifications was determined, resulting in the following data:

2 1 2 4 0 1 3 2 0 5 3 3 1 3 2 4 7 0 2 3 0 4 2 1 3 1 1 3 4 1 2 3 2 2 8 4 5 1 3 1 5 0 2 3 2 1 0 6 4 2 1 6 0 3 3 3 6 1 2 3

a. Determine frequencies and relative frequencies for the observed values of x = number of nonconforming transducers in a batch.

b. What proportion of batches in the sample have at most five nonconforming transducers? What proportion have fewer than five? What proportion have at least five nonconforming units?

c. Draw a histogram of the data using relative frequency on the vertical scale, and comment on its features.

18. In a study of author productivity (“Lotka’s Test,” Collection Mgmt., 1982: 111–118), a large number of authors were classified according to the number of articles they had published during a certain period. The results were presented in the accompanying frequency distribution:

Number of papers   1     2     3     4    5    6    7    8
Frequency          784   204   127   50   33   28   19   19

Number of papers   9   10   11   12   13   14   15   16   17
Frequency          6   7    6    7    4    4    5    3    3

a. Construct a histogram corresponding to this frequency distribution. What is the most interesting feature of the shape of the distribution?

b. What proportion of these authors published at least five papers? At least ten papers? More than ten papers?

c. Suppose the five 15s, three 16s, and three 17s had been lumped into a single category displayed as “≥15.” Would you be able to draw a histogram? Explain.

d. Suppose that instead of the values 15, 16, and 17 being listed separately, they had been combined into a 15–17 category with frequency 11. Would you be able to draw a histogram? Explain.

19. The number of contaminating particles on a silicon wafer prior to a certain rinsing process was determined for each wafer in a sample of size 100, resulting in the following frequencies:

Number of particles   0   1   2   3    4    5    6    7
Frequency             1   2   3   12   11   15   18   10

Number of particles   8    9   10   11   12   13   14
Frequency             12   4   5    3    1    2    1

a. What proportion of the sampled wafers had at least one particle? At least five particles?

b. What proportion of the sampled wafers had between five and ten particles, inclusive? Strictly between five and ten particles?

c. Draw a histogram using relative frequency on the vertical axis. How would you describe the shape of the histogram?

20. The article “Determination of Most Representative Subdivision” (J. of Energy Engr., 1993: 43–55) gave data on various characteristics of subdivisions that could be used in deciding whether to provide electrical power using overhead lines or underground lines. Here are the values of the variable x = total length of streets within a subdivision:

1280 5320 4390 2100 1240 3060 4770 1050 360 3330 3380 340 1000 960 1320 530 3350 540 3870 1250 2400 960 1120 2120 450 2250 2320 2400

3150 5700 5220 500 1850 2460 5850 2700 2730 1670 100 5770 3150 1890 510 240 396 1419 2109

a. Construct a stem-and-leaf display using the thousands digit as the stem and the hundreds digit as the leaf, and comment on the various features of the display.

b. Construct a histogram using class boundaries 0, 1000, 2000, 3000, 4000, 5000, and 6000. What proportion of subdivisions have total length less than 2000? Between 2000 and 4000? How would you describe the shape of the histogram?

21. The article cited in Exercise 20 also gave the following values of the variables y = number of culs-de-sac and z = number of intersections:

y 1 0 1 0 0 2 0 1 1 1 2 1 0 0 1 1 0 1 1
z 1 8 6 1 1 5 3 0 0 4 4 0 0 1 2 1 4 0 4


y 1 1 0 0 0 1 1 2 0 1 2 2 1 1 0 2 1 1 0
z 0 3 0 1 1 0 1 3 2 4 6 6 0 1 1 8 3 3 5

y 1 5 0 3 0 1 1 0 0
z 0 5 2 3 1 0 0 0 3

a. Construct a histogram for the y data. What proportion of these subdivisions had no culs-de-sac? At least one cul-de-sac?

b. Construct a histogram for the z data. What proportion of these subdivisions had at most five intersections? Fewer than five intersections?

22. How does the speed of a runner vary over the course of a marathon (a distance of 42.195 km)? Consider determining both the time to run the first 5 km and the time to run between the 35-km and 40-km points, and then subtracting the former time from the latter time. A positive value of this difference corresponds to a runner slowing down toward the end of the race. The accompanying histogram is based on times of runners who participated in several different Japanese marathons (“Factors Affecting Runners’ Marathon Performance,” Chance, Fall, 1993: 24–30).

What are some interesting features of this histogram? What is a typical difference value? Roughly what proportion of the runners ran the late distance more quickly than the early distance?

23. The article “Statistical Modeling of the Time Course of Tantrum Anger” (Annals of Applied Stats, 2009: 1013–1034) discussed how anger intensity in children’s tantrums could be related to tantrum duration as well as behavioral indicators such as shouting, stamping, and pushing or pulling. The following frequency distribution was given (and also the corresponding histogram):

0–<2: 136   2–<4: 92   4–<11: 71   11–<20: 26   20–<30: 7   30–<40: 3

Draw the histogram and then comment on any interesting features.

24. The accompanying data set consists of observations on shear strength (lb) of ultrasonic spot welds made on a certain type of alclad sheet. Construct a relative frequency histogram based on ten equal-width classes with boundaries 4000, 4200, . . . . [The histogram will agree with the one in “Comparison of Properties of Joints Prepared by Ultrasonic Welding and Other Means” (J. of Aircraft, 1983: 552–556).] Comment on its features.

5434 4948 4521 4570 4990 5702 5241 5112 5015 4659 4806 4637 5670 4381 4820 5043 4886 4599 5288 5299 4848 5378 5260 5055 5828 5218 4859 4780 5027 5008 4609 4772 5133 5095 4618 4848 5089 5518 5333 5164 5342 5069 4755 4925 5001 4803 4951 5679 5256 5207 5621 4918 5138 4786 4500 5461 5049 4974 4592 4173 5296 4965 5170 4740 5173 4568 5653 5078 4900 4968 5248 5245 4723 5275 5419 5205 4452 5227 5555 5388 5498 4681 5076 4774 4931 4493 5309 5582 4308 4823 4417 5364 5640 5069 5188 5764 5273 5042 5189 4986

Histogram for Exercise 22 (horizontal axis: time difference; vertical axis: frequency)

25. A transformation of data values by means of some mathematical function, such as $\sqrt{x}$ or $1/x$, can often yield a set of numbers that has “nicer” statistical properties than the original data. In particular, it may be possible to find a function for which the histogram of transformed values is more symmetric (or, even better, more like a bell-shaped curve) than the original data. As an example, the article “Time Lapse Cinematographic Analysis of Beryllium–Lung Fibroblast Interactions” (Environ. Research, 1983: 34–43) reported the results of experiments designed to study the behavior of certain individual cells that had been exposed to beryllium. An important characteristic of such an individual cell is its interdivision time (IDT). IDTs were determined for a large number of cells, both in exposed


(treatment) and unexposed (control) conditions. The authors of the article used a logarithmic transformation, that is, transformed value = log(original value). Consider the following representative IDT data:

IDT    log10(IDT)    IDT    log10(IDT)    IDT    log10(IDT)
28.1   1.45          60.1   1.78          21.0   1.32
31.2   1.49          23.7   1.37          22.3   1.35
13.7   1.14          18.6   1.27          15.5   1.19
46.0   1.66          21.4   1.33          36.3   1.56
25.8   1.41          26.6   1.42          19.1   1.28
16.8   1.23          26.2   1.42          38.4   1.58
34.8   1.54          32.0   1.51          72.8   1.86
62.3   1.79          43.5   1.64          48.9   1.69
28.0   1.45          17.4   1.24          21.4   1.33
17.9   1.25          38.8   1.59          20.7   1.32
19.5   1.29          30.6   1.49          57.3   1.76
21.1   1.32          55.6   1.75          40.9   1.61
31.9   1.50          25.5   1.41          28.9   1.46
52.1   1.72

Use class intervals 10–<20, 20–<30, . . . to construct a histogram of the original data. Use intervals 1.1–<1.2, 1.2–<1.3, . . . to do the same for the transformed data. What is the effect of the transformation?

26. Automated electron backscattered diffraction is now being used in the study of fracture phenomena. The following information on misorientation angle (degrees) was extracted from the article “Observations on the Faceted Initiation Site in the Dwell-Fatigue Tested Ti-6242 Alloy: Crystallographic Orientation and Size Effects” (Metallurgical and Materials Trans., 2006: 1507–1518).

Class:      0–<5     5–<10    10–<15   15–<20
Rel freq:   .177     .166     .175     .136

Class:      20–<30   30–<40   40–<60   60–<90
Rel freq:   .194     .078     .044     .030

a. Is it true that more than 50% of the sampled angles are smaller than 15°, as asserted in the paper?

b. What proportion of the sampled angles are at least 30°?

c. Roughly what proportion of angles are between 10° and 25°?

d. Construct a histogram and comment on any interesting features.

27. The paper “Study on the Life Distribution of Microdrills” (J. of Engr. Manufacture, 2002: 301–305) reported the following observations, listed in increasing order, on drill lifetime (number of holes that a drill machines before it breaks) when holes were drilled in a certain brass alloy.

11 14 20 23 31 36 39 44 47 50 59 61 65 67 68 71 74 76 78 79 81 84 85 89 91 93 96 99 101 104

105 105 112 118 123 136 139 141 148 158 161 168 184 206 248 263 289 322 388 513

a. Why can a frequency distribution not be based on the class intervals 0–50, 50–100, 100–150, and so on?

b. Construct a frequency distribution and histogram of the data using class boundaries 0, 50, 100, . . . , and then comment on interesting characteristics.

c. Construct a frequency distribution and histogram of the natural logarithms of the lifetime observations, and comment on interesting characteristics.

d. What proportion of the lifetime observations in this sample are less than 100? What proportion of the observations are at least 200?

28. Human measurements provide a rich area of application for statistical methods. The article “A Longitudinal Study of the Development of Elementary School Children’s Private Speech” (Merrill-Palmer Q., 1990: 443–463) reported on a study of children talking to themselves (private speech). It was thought that private speech would be related to IQ, because IQ is supposed to measure mental maturity, and it was known that private speech decreases as students progress through the primary grades. The study included 33 students whose first-grade IQ scores are given here:

82 96 99 102 103 103 106 107 108 108 108 108 109 110 110 111 113 113 113 113 115 115 118 118 119 121 122 122 127 132 136 140 146

Describe the data and comment on any interesting features.

29. Consider the following data on types of health complaint (J = joint swelling, F = fatigue, B = back pain, M = muscle weakness, C = coughing, N = nose running/irritation, O = other) made by tree planters. Obtain frequencies and relative frequencies for the various categories, and draw a histogram. (The data is consistent with percentages given in the article “Physiological Effects of Work Stress and Pesticide Exposure in Tree Planting by British Columbia Silviculture Workers,” Ergonomics, 1993: 951–961.)

O O N J C F B B F O J O O M O F F O O N O N J F J B O C J O J J F N O B M O J M O B O F J O O B N C O O O M B F J O F N

30. A Pareto diagram is a variation of a histogram for categorical data resulting from a quality control study. Each category represents a different type of product nonconformity or production problem. The categories are ordered so that the one with the largest frequency appears on the far left, then the category with the second largest frequency, and so on. Suppose the following information on nonconformities in circuit packs is obtained: failed component, 126; incorrect component, 210; insufficient solder, 67; excess solder, 54; missing component, 131. Construct a Pareto diagram.

31. The cumulative frequency and cumulative relative frequency for a particular class interval are the sum of frequencies and relative frequencies, respectively, for that interval and all intervals lying below it. If, for example,


there are four intervals with frequencies 9, 16, 13, and 12, then the cumulative frequencies are 9, 25, 38, and 50, and the cumulative relative frequencies are .18, .50, .76, and 1.00. Compute the cumulative frequencies and cumulative relative frequencies for the data of Exercise 24.
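The four-interval illustration in Exercise 31 can be reproduced as a running sum. The sketch below uses only the illustrative frequencies 9, 16, 13, and 12 from the exercise statement, not the Exercise 24 data; the variable names are arbitrary.

```python
from itertools import accumulate

# Illustrative class frequencies from the statement of Exercise 31.
freqs = [9, 16, 13, 12]

# Cumulative frequency: running total of this class and all classes below it.
cum_freqs = list(accumulate(freqs))

# Cumulative relative frequency: cumulative frequency divided by the sample size.
n = cum_freqs[-1]
cum_rel_freqs = [c / n for c in cum_freqs]

print(cum_freqs)
print([round(r, 2) for r in cum_rel_freqs])
```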

32. Fire load is the heat energy that could be released per square meter of floor area by combustion of contents and the structure itself. The article “Fire Loads in Office Buildings” (J. of Structural Engr., 1997: 365–368) gave the following cumulative percentages (read from a graph) for fire loads in a sample of 388 rooms:

Value (MJ/m²)   0      150    300    450    600
Cumulative %    0      19.3   37.6   62.7   77.5

Value (MJ/m²)   750    900    1050   1200   1350
Cumulative %    87.2   93.8   95.7   98.6   99.1

Value (MJ/m²)   1500   1650   1800   1950
Cumulative %    99.5   99.6   99.8   100.0

a. Construct a relative frequency histogram and comment on interesting features.

b. What proportion of fire loads are less than 600? At least 1200?

c. What proportion of the loads are between 600 and 1200?

1.3 Measures of Location

Visual summaries of data are excellent tools for obtaining preliminary impressions and insights. More formal data analysis often requires the calculation and interpretation of numerical summary measures. That is, from the data we try to extract several summarizing numbers that might serve to characterize the data set and convey some of its salient features. Our primary concern will be with numerical data; some comments regarding categorical data appear at the end of the section.

Suppose, then, that our data set is of the form $x_1, x_2, \ldots, x_n$, where each $x_i$ is a number. What features of such a set of numbers are of most interest and deserve emphasis? One important characteristic of a set of numbers is its location, and in particular its center. This section presents methods for describing the location of a data set; in Section 1.4 we will turn to methods for measuring variability in a set of numbers.

The Mean

For a given set of numbers $x_1, x_2, \ldots, x_n$, the most familiar and useful measure of the center is the mean, or arithmetic average of the set. Because we will almost always think of the $x_i$’s as constituting a sample, we will often refer to the arithmetic average as the sample mean and denote it by $\bar{x}$.

DEFINITION The sample mean $\bar{x}$ of observations $x_1, x_2, \ldots, x_n$ is given by

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]

The numerator of $\bar{x}$ can be written more informally as $\sum x_i$, where the summation is over all sample observations.

For reporting $\bar{x}$, we recommend using decimal accuracy of one digit more than the accuracy of the $x_i$’s. Thus if observations are stopping distances with $x_1 = 125$, $x_2 = 131$, and so on, we might have $\bar{x} = 127.3$ ft.


Example 1.14 Caustic stress corrosion cracking of iron and steel has been studied because of failures around rivets in steel boilers and failures of steam rotors. Consider the accompanying observations on x = crack length (mm) as a result of constant load stress corrosion tests on smooth bar tensile specimens for a fixed length of time. (The data is consistent with a histogram and summary quantities from the article “On the Role of Phosphorus in the Caustic Stress Corrosion Cracking of Low Alloy Steels,” Corrosion Science, 1989: 53–68.)

$x_1 = 16.1$   $x_2 = 9.6$   $x_3 = 24.9$   $x_4 = 20.4$   $x_5 = 12.7$   $x_6 = 21.2$   $x_7 = 30.2$

$x_8 = 25.8$   $x_9 = 18.5$   $x_{10} = 10.3$   $x_{11} = 25.3$   $x_{12} = 14.0$   $x_{13} = 27.1$   $x_{14} = 45.0$

$x_{15} = 23.3$   $x_{16} = 24.2$   $x_{17} = 14.6$   $x_{18} = 8.9$   $x_{19} = 32.4$   $x_{20} = 11.8$   $x_{21} = 28.5$

Figure 1.14 shows a stem-and-leaf display of the data; a crack length in the low 20s appears to be “typical.”

0H | 96 89
1L | 27 03 40 46 18
1H | 61 85
2L | 49 04 12 33 42
2H | 58 53 71 85
3L | 02 24
3H |
4L |
4H | 50

Stem: tens digit
Leaf: ones and tenths digits

Figure 1.14 A stem-and-leaf display of the crack-length data

With $\sum x_i = 444.8$, the sample mean is

\[ \bar{x} = \frac{444.8}{21} = 21.18 \]

a value consistent with information conveyed by the stem-and-leaf display. ■

A physical interpretation of $\bar{x}$ demonstrates how it measures the location (center) of a sample. Think of drawing and scaling a horizontal measurement axis, and then represent each sample observation by a 1-lb weight placed at the corresponding point on the axis. The only point at which a fulcrum can be placed to balance the system of weights is the point corresponding to the value of $\bar{x}$ (see Figure 1.15).

[Measurement axis marked 10, 20, 30, 40, with the fulcrum at $\bar{x} = 21.18$]

Figure 1.15 The mean as the balance point for a system of weights
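The calculation in Example 1.14 and the balance-point interpretation can be reproduced with a brief sketch. The names `sample_mean`, `crack_lengths`, and `net_moment` are chosen here for illustration; the data are the 21 crack lengths of the example.

```python
def sample_mean(xs):
    """Arithmetic average: sum of the observations divided by their count."""
    return sum(xs) / len(xs)


# Crack lengths (mm) x1, ..., x21 from Example 1.14
crack_lengths = [16.1, 9.6, 24.9, 20.4, 12.7, 21.2, 30.2,
                 25.8, 18.5, 10.3, 25.3, 14.0, 27.1, 45.0,
                 23.3, 24.2, 14.6, 8.9, 32.4, 11.8, 28.5]

x_bar = sample_mean(crack_lengths)

# Balance-point check: with a unit weight at each observation, the signed
# distances from the fulcrum at x_bar cancel (up to floating-point error).
net_moment = sum(x - x_bar for x in crack_lengths)

print(round(x_bar, 2))
print(abs(net_moment) < 1e-9)
```

Placing the fulcrum anywhere other than `x_bar` would leave a nonzero net moment, which is the sense in which the mean is the unique balance point.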

Just as $\bar{x}$ represents the average value of the observations in a sample, the average of all values in the population can be calculated. This average is called the population mean and is denoted by the Greek letter $\mu$. When there are N values in the population (a finite population), then $\mu$ = (sum of the N population values)/N. In Chapters 3 and 4, we will give a more general definition for $\mu$ that applies to both finite and (conceptually) infinite populations. Just as $\bar{x}$ is an interesting and important measure of sample location, $\mu$ is an interesting and important (often the most important) characteristic of a population. In the chapters on statistical inference, we will present methods based on the sample mean for drawing conclusions about a population mean. For example, we might use the sample mean $\bar{x} = 21.18$ computed in Example 1.14 as a point estimate (a single number that is our “best” guess) of $\mu$ = the true average crack length for all specimens treated as described.

The mean suffers from one deficiency that makes it an inappropriate measure of center under some circumstances: Its value can be greatly affected by the presence of even a single outlier (unusually large or small observation). In Example 1.14, the value $x_{14} = 45.0$ is obviously an outlier. Without this observation, $\bar{x} = 399.8/20 = 19.99$ mm; the outlier increases the mean by more than 1 mm. If the 45.0 observation were replaced by the catastrophic value 295.0, a really extreme outlier, then $\bar{x} = 694.8/21 = 33.09$ mm, which is larger than all but one of the observations!

A sample of incomes often produces such outlying values (those lucky few who earn astronomical amounts), and the use of average income as a measure of location will often be misleading. Such examples suggest that we look for a measure that is less sensitive to outlying values than $\bar{x}$, and we will momentarily propose one. However, although $\bar{x}$ does have this potential defect, it is still the most widely used measure, largely because there are many populations for which an extreme outlier in the sample would be highly unlikely. When sampling from such a population (a normal or bell-shaped population being the most important example), the sample mean will tend to be stable and quite representative of the sample.

The Median

The word median is synonymous with “middle,” and the sample median is indeed the middle value once the observations are ordered from smallest to largest. When the observations are denoted by $x_1, \ldots, x_n$, we will use the symbol $\tilde{x}$ to represent the sample median.

DEFINITION The sample median is obtained by first ordering the n observations from smallest to largest (with any repeated values included so that every sample observation appears in the ordered list). Then,

\[
\tilde{x} =
\begin{cases}
\text{the single middle value if } n \text{ is odd} = \left(\dfrac{n+1}{2}\right)^{\text{th}} \text{ ordered value} \\[2ex]
\text{the average of the two middle values if } n \text{ is even} = \text{average of } \left(\dfrac{n}{2}\right)^{\text{th}} \text{ and } \left(\dfrac{n}{2}+1\right)^{\text{th}} \text{ ordered values}
\end{cases}
\]

Example 1.15 People not familiar with classical music might tend to believe that a composer’s instructions for playing a particular piece are so specific that the duration would not depend at all on the performer(s). However, there is typically plenty of room for interpretation, and orchestral conductors and musicians take full advantage of this. The author went to the Web site ArkivMusic.com and selected a sample of


12 recordings of Beethoven’s Symphony #9 (the “Choral,” a stunningly beautiful work), yielding the following durations (min) listed in increasing order:

62.3 62.8 63.6 65.2 65.7 66.4 67.4 68.4 68.8 70.8 75.7 79.0

Here is a dotplot of the data:

[Dotplot; horizontal axis: Duration, marked 60 to 80]

Figure 1.16 Dotplot of the data from Example 1.15

Since $n = 12$ is even, the sample median is the average of the $n/2 = 6$th and $(n/2 + 1) = 7$th values from the ordered list:

\[ \tilde{x} = \frac{66.4 + 67.4}{2} = 66.90 \]

Note that if the largest observation 79.0 had not been included in the sample, the resulting sample median for the $n = 11$ remaining observations would have been the single middle value 66.4 (the $[n + 1]/2 = 6$th ordered value, i.e., the 6th value in from either end of the ordered list). The sample mean is $\bar{x} = \sum x_i/n = 816.1/12 = 68.01$, a bit more than a full minute larger than the median. The mean is pulled out a bit relative to the median because the sample “stretches out” somewhat more on the upper end than on the lower end. ■

The data in Example 1.15 illustrates an important property of $\tilde{x}$ in contrast to $\bar{x}$: The sample median is very insensitive to outliers. If, for example, we increased the two largest $x_i$s from 75.7 and 79.0 to 85.7 and 89.0, respectively, $\tilde{x}$ would be unaffected. Thus, in the treatment of outlying data values, $\bar{x}$ and $\tilde{x}$ are at opposite ends of a spectrum. Both quantities describe where the data is centered, but they will not in general be equal because they focus on different aspects of the sample.

Analogous to $\tilde{x}$ as the middle value in the sample is a middle value in the population, the population median, denoted by $\tilde{\mu}$. As with $\bar{x}$ and $\mu$, we can think of using the sample median $\tilde{x}$ to make an inference about $\tilde{\mu}$. In Example 1.15, we might use $\tilde{x} = 66.90$ as an estimate of the median time for the population of all recordings. A median is often used to describe income or salary data (because it is not greatly influenced by a few large salaries). If the median salary for a sample of engineers were $\tilde{x} = \$66,416$, we might use this as a basis for concluding that the median salary for all engineers exceeds $60,000.

The population mean $\mu$ and median $\tilde{\mu}$ will not generally be identical. If the population distribution is positively or negatively skewed, as pictured in Figure 1.17, then $\mu \neq \tilde{\mu}$. When this is the case, in making inferences we must first decide which of the two population characteristics is of greater interest and then proceed accordingly.

[Three population distribution curves: (a) Negative skew, (b) Symmetric, (c) Positive skew]

Figure 1.17 Three different shapes for a population distribution
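The median rule and the contrast with the mean discussed for Example 1.15 can be sketched as follows. The helper `sample_median` is written for this illustration (the standard library's `statistics.median` does the same job); the data are the twelve durations from the example.

```python
def sample_median(xs):
    """Middle ordered value (n odd) or average of the two middle values (n even)."""
    ordered = sorted(xs)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the ((n + 1)/2)th ordered value
    return (ordered[mid - 1] + ordered[mid]) / 2  # (n/2)th and (n/2 + 1)th values


# Durations (min) of the 12 recordings in Example 1.15
durations = [62.3, 62.8, 63.6, 65.2, 65.7, 66.4,
             67.4, 68.4, 68.8, 70.8, 75.7, 79.0]

median_all = sample_median(durations)            # n = 12, even
median_without_max = sample_median(durations[:-1])  # n = 11, odd
mean_all = sum(durations) / len(durations)

# Increase the two largest values by 10 minutes each, as in the text
shifted = durations[:-2] + [85.7, 89.0]
median_shifted = sample_median(shifted)
mean_shifted = sum(shifted) / len(shifted)

print(round(median_all, 2), median_without_max, round(mean_all, 2))
print(round(median_shifted, 2), round(mean_shifted, 2))
```

Only the mean moves when the two largest durations are inflated; the two middle ordered values, and hence the median, are untouched.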

Other Measures of Location: Quartiles, Percentiles, and Trimmed Means

The median (population or sample) divides the data set into two parts of equal size. To obtain finer measures of location, we could divide the data into more than two such parts. Roughly speaking, quartiles divide the data set into four equal parts, with the observations above the third quartile constituting the upper quarter of the data set, the second quartile being identical to the median, and the first quartile separating the lower quarter from the upper three-quarters. Similarly, a data set (sample or population) can be even more finely divided using percentiles; the 99th percentile separates the highest 1% from the bottom 99%, and so on. Unless the number of observations is a multiple of 100, care must be exercised in obtaining percentiles. We will use percentiles in Chapter 4 in connection with certain models for infinite populations and so postpone discussion until that point.

The mean is quite sensitive to a single outlier, whereas the median is impervious to many outliers. Since extreme behavior of either type might be undesirable, we briefly consider alternative measures that are neither as sensitive as $\bar{x}$ nor as insensitive as $\tilde{x}$. To motivate these alternatives, note that $\bar{x}$ and $\tilde{x}$ are at opposite extremes of the same “family” of measures. The mean is the average of all the data, whereas the median results from eliminating all but the middle one or two values and then averaging. To paraphrase, the mean involves trimming 0% from each end of the sample, whereas for the median the maximum possible amount is trimmed from each end. A trimmed mean is a compromise between $\bar{x}$ and $\tilde{x}$. A 10% trimmed mean, for example, would be computed by eliminating the smallest 10% and the largest 10% of the sample and then averaging what remains.

Example 1.16 The production of Bidri is a traditional craft of India. Bidri wares (bowls, vessels, and so on) are cast from an alloy containing primarily zinc along with some copper. Consider the following observations on copper content (%) for a sample of Bidri artifacts in London’s Victoria and Albert Museum (“Enigmas of Bidri,” Surface Engr., 2005: 333–339), listed in increasing order:

2.0 2.4 2.5 2.6 2.6 2.7 2.7 2.8 3.0 3.1 3.2 3.3 3.3 3.4 3.4 3.6 3.6 3.6 3.6 3.7 4.4 4.6 4.7 4.8 5.3 10.1

Figure 1.18 is a dotplot of the data. A prominent feature is the single outlier at the upper end; the distribution is somewhat sparser in the region of larger values than is the case for smaller values. The sample mean and median are 3.65 and 3.35, respectively. A trimmed mean with a trimming percentage of $100(2/26) = 7.7\%$ results from eliminating the two smallest and two largest observations; this gives $\bar{x}_{tr(7.7)} = 3.42$. Trimming here eliminates the larger outlier and so pulls the trimmed mean toward the median.

[Dotplot on an axis marked 1 to 11, with the positions of $\tilde{x}$, $\bar{x}_{tr(7.7)}$, and $\bar{x}$ indicated]

Figure 1.18 Dotplot of copper contents from Example 1.16 ■
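The three summaries quoted in Example 1.16 can be recomputed with one small function. This is a sketch: `trimmed_mean` and its argument `k` (the number of observations dropped from each end) are names introduced here, not notation from the text.

```python
def trimmed_mean(xs, k):
    """Average of what remains after dropping the k smallest and k largest values."""
    ordered = sorted(xs)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


# Copper content (%) of the 26 Bidri artifacts in Example 1.16
copper = [2.0, 2.4, 2.5, 2.6, 2.6, 2.7, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.3,
          3.4, 3.4, 3.6, 3.6, 3.6, 3.6, 3.7, 4.4, 4.6, 4.7, 4.8, 5.3, 10.1]

n = len(copper)
mean = trimmed_mean(copper, 0)       # trimming 0% from each end gives the mean
trim_2 = trimmed_mean(copper, 2)     # drop two from each end
trim_pct = 100 * 2 / n               # trimming percentage for k = 2
median = trimmed_mean(copper, 12)    # maximum trimming for n = 26 leaves the two middle values

print(round(mean, 2), round(trim_2, 2), round(trim_pct, 1), round(median, 2))
```

Calling the same function with `k = 0` and `k = 12` makes concrete the remark that the mean and the median sit at opposite ends of one family of trimmed means.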


A trimmed mean with a moderate trimming percentage (someplace between 5% and 25%) will yield a measure of center that is neither as sensitive to outliers as is the mean nor as insensitive as the median. If the desired trimming percentage is $100\alpha\%$ and $n\alpha$ is not an integer, the trimmed mean must be calculated by interpolation. For example, consider $\alpha = .10$ for a 10% trimming percentage and $n = 26$ as in Example 1.16. Then $\bar{x}_{tr(10)}$ would be the appropriate weighted average of the 7.7% trimmed mean calculated there and the 11.5% trimmed mean resulting from trimming three observations from each end.
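One way to carry out the interpolation is sketched below. The text says only "appropriate weighted average"; the weights used here (linear interpolation in the number of observations trimmed, so that $n\alpha = 2.6$ puts weight 0.4 on the two-per-end trimmed mean and 0.6 on the three-per-end trimmed mean) are an assumption of this sketch, and the function names are invented for it.

```python
import math


def trimmed_mean(xs, k):
    """Average after dropping the k smallest and k largest values."""
    ordered = sorted(xs)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def interpolated_trimmed_mean(xs, alpha):
    """Trimmed mean for trimming proportion alpha (assumed well below 0.5).

    ASSUMPTION: linear interpolation between the two neighbouring
    integer trimming counts when n * alpha is not an integer.
    """
    target = len(xs) * alpha           # ideal, possibly fractional, count per end
    k_low = math.floor(target)
    frac = target - k_low              # how far target lies beyond k_low
    if frac < 1e-12:                   # n * alpha is (numerically) an integer
        return trimmed_mean(xs, k_low)
    low = trimmed_mean(xs, k_low)
    high = trimmed_mean(xs, k_low + 1)
    return (1 - frac) * low + frac * high


# Copper content (%) from Example 1.16
copper = [2.0, 2.4, 2.5, 2.6, 2.6, 2.7, 2.7, 2.8, 3.0, 3.1, 3.2, 3.3, 3.3,
          3.4, 3.4, 3.6, 3.6, 3.6, 3.6, 3.7, 4.4, 4.6, 4.7, 4.8, 5.3, 10.1]

x_tr_10 = interpolated_trimmed_mean(copper, 0.10)
print(round(x_tr_10, 2))
```

Under this weighting the 10% trimmed mean falls between the 11.5% and 7.7% trimmed means, as any weighted average of the two must.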

Categorical Data and Sample Proportions

When the data is categorical, a frequency distribution or relative frequency distribution provides an effective tabular summary of the data. The natural numerical summary quantities in this situation are the individual frequencies and the relative frequencies. For example, if a survey of individuals who own digital cameras is undertaken to study brand preference, then each individual in the sample would identify the brand of camera that he or she owned, from which we could count the number owning Canon, Sony, Kodak, and so on. Consider sampling a dichotomous population, one that consists of only two categories (such as voted or did not vote in the last election, does or does not own a digital camera, etc.). If we let x denote the number in the sample falling in category 1, then the number in category 2 is $n - x$. The relative frequency or sample proportion in category 1 is x/n and the sample proportion in category 2 is $1 - x/n$. Let’s denote a response that falls in category 1 by a 1 and a response that falls in category 2 by a 0. A sample size of $n = 10$ might then yield the responses 1, 1, 0, 1, 1, 1, 0, 0, 1, 1. The sample mean for this numerical sample is (since number of 1s $= x = 7$)

\[ \frac{x_1 + \cdots + x_n}{n} = \frac{1 + 1 + 0 + \cdots + 1 + 1}{10} = \frac{7}{10} = \frac{x}{n} = \text{sample proportion} \]

More generally, focus attention on a particular category and code the sample results so that a 1 is recorded for an observation in the category and a 0 for an observation not in the category. Then the sample proportion of observations in the category is the sample mean of the sequence of 1s and 0s. Thus a sample mean can be used to summarize the results of a categorical sample. These remarks also apply to situations in which categories are defined by grouping values in a numerical sample or population (e.g., we might be interested in knowing whether individuals have owned their present automobile for at least 5 years, rather than studying the exact length of ownership).

Analogous to the sample proportion x/n of individuals or objects falling in a particular category, let p represent the proportion of those in the entire population falling in the category. As with x/n, p is a quantity between 0 and 1, and while x/n is a sample characteristic, p is a characteristic of the population. The relationship between the two parallels the relationship between $\tilde{x}$ and $\tilde{\mu}$ and between $\bar{x}$ and $\mu$. In particular, we will subsequently use x/n to make inferences about p. If, for example, a sample of 100 car owners reveals that 22 owned their car at least 5 years, then we might use $22/100 = .22$ as a point estimate of the proportion of all owners who have owned their car at least 5 years. With k categories $(k > 2)$, we can use the k sample proportions to answer questions about the population proportions $p_1, \ldots, p_k$.
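The identity between a sample proportion and the mean of a 0/1 coded sample can be checked directly. In this sketch the ten responses are those quoted in the text; the helper `sample_proportion` and the short list of camera brands are hypothetical, included only to show the coding step.

```python
def sample_proportion(labels, category):
    """Code each observation as 1 (in category) or 0 (not), then average."""
    coded = [1 if label == category else 0 for label in labels]
    return sum(coded) / len(coded)


# The n = 10 coded responses from the text (1 = category 1, 0 = category 2)
responses = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1]
n = len(responses)
x = sum(responses)                 # number of 1s
x_bar = sum(responses) / n         # sample mean of the coded values

print(x, x / n, x_bar)             # count, sample proportion, sample mean

# Hypothetical brand data, invented for illustration only
brands = ["Canon", "Sony", "Canon", "Kodak"]
print(sample_proportion(brands, "Canon"))
```

With more than two categories, the same coding is simply repeated once per category of interest.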


EXERCISES Section 1.3 (33–43)

33. The May 1, 2009 issue of The Montclarian reported the following home sale amounts for a sample of homes in Alameda, CA that were sold the previous month (1000s of $):

590 815 575 608 350 1285 408 540 555 679

a. Calculate and interpret the sample mean and median.

b. Suppose the 6th observation had been 985 rather than 1285. How would the mean and median change?

c. Calculate a 20% trimmed mean by first trimming the two smallest and two largest observations.

d. Calculate a 15% trimmed mean.

34. Exposure to microbial products, especially endotoxin, may have an impact on vulnerability to allergic diseases. The article “Dust Sampling Methods for Endotoxin – An Essential, But Underestimated Issue” (Indoor Air, 2006: 20–27) considered various issues associated with determining endotoxin concentration. The following data on concentration (EU/mg) in settled dust for one sample of urban homes and another of farm homes was kindly supplied by the authors of the cited article.

U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0

F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0 9.2 3.0 2.0 0.3

a. Determine the sample mean for each sample. How do they compare?

b. Determine the sample median for each sample. How do they compare? Why is the median for the urban sample so different from the mean for that sample?

c. Calculate the trimmed mean for each sample by deleting the smallest and largest observation. What are the corresponding trimming percentages? How do the values of these trimmed means compare to the corresponding means and medians?

35. The minimum injection pressure (psi) for injection molding specimens of high amylose corn was determined for eight different specimens (higher pressure corresponds to greater processing difficulty), resulting in the following observations (from “Thermoplastic Starch Blends with a Polyethylene-Co-Vinyl Alcohol: Processability and Physical Properties,” Polymer Engr. and Science, 1994: 17–23):

15.0 13.0 18.0 14.5 12.0 11.0 8.9 8.0

a. Determine the values of the sample mean, sample median, and 12.5% trimmed mean, and compare these values.

b. By how much could the smallest sample observation, currently 8.0, be increased without affecting the value of the sample median?

c. Suppose we want the values of the sample mean and median when the observations are expressed in kilograms per square inch (ksi) rather than psi. Is it necessary to reexpress each observation in ksi, or can the values calculated in part (a) be used directly? [Hint: 1 kg = 2.2 lb.]

36. A sample of 26 offshore oil workers took part in a simulated escape exercise, resulting in the accompanying data on time (sec) to complete the escape (“Oxygen Consumption and Ventilation During Escape from an Offshore Platform,” Ergonomics, 1997: 281–292):

389 356 359 363 375 424 325 394 402 373 373 370 364 366 364 325 339 393 392 369 374 359 356 403 334 397

a. Construct a stem-and-leaf display of the data. How does it suggest that the sample mean and median will compare?

b. Calculate the values of the sample mean and median. [Hint: $\sum x_i = 9638$.]

c. By how much could the largest time, currently 424, be increased without affecting the value of the sample median? By how much could this value be decreased without affecting the value of the sample median?

d. What are the values of $\bar{x}$ and $\tilde{x}$ when the observations are reexpressed in minutes?

37. The article “Snow Cover and Temperature Relationships in North America and Eurasia” (J. Climate and Applied Meteorology, 1983: 460–469) used statistical techniques to relate the amount of snow cover on each continent to average continental temperature. Data presented there included the following ten observations on October snow cover for Eurasia during the years 1970–1979 (in million km²):

6.5 12.0 14.9 10.0 10.7 7.9 21.9 12.5 14.5 9.2

What would you report as a representative, or typical, value of October snow cover for this period, and what prompted your choice?

38. Blood pressure values are often reported to the nearest 5 mmHg (100, 105, 110, etc.). Suppose the actual blood pressure values for nine randomly selected individuals are

118.6 127.4 138.4 130.0 113.7 122.0 108.3 131.5 133.2

a. What is the median of the reported blood pressure values?

b. Suppose the blood pressure of the second individual is 127.6 rather than 127.4 (a small change in a single value). How does this affect the median of the reported values? What does this say about the sensitivity of the median to rounding or grouping in the data?

39. The propagation of fatigue cracks in various aircraft parts has been the subject of extensive study in recent years. The accompanying data consists of propagation lives (flight hours/$10^4$) to reach a given crack size in fastener holes intended for use in military aircraft (“Statistical Crack Propagation in Fastener Holes Under Spectrum Loading,” J. Aircraft, 1983: 1028–1032):

.736 .863 .865 .913 .915 .937 .983 1.007 1.011 1.064 1.109 1.132 1.140 1.153 1.253 1.394

a. Compute and compare the values of the sample mean and median.

b. By how much could the largest sample observation be decreased without affecting the value of the median?

40. Compute the sample median, 25% trimmed mean, 10% trimmed mean, and sample mean for the lifetime data given in Exercise 27, and compare these measures.

41. A sample of $n = 10$ automobiles was selected, and each was subjected to a 5-mph crash test. Denoting a car with no visible damage by S (for success) and a car with such damage by F, results were as follows:

S S F S S S F F S S

a. What is the value of the sample proportion of successes x/n?

b. Replace each S with a 1 and each F with a 0. Then calculate $\bar{x}$ for this numerically coded sample. How does $\bar{x}$ compare to x/n?

c. Suppose it is decided to include 15 more cars in the experiment. How many of these would have to be S’s to give $x/n = .80$ for the entire sample of 25 cars?

42. a. If a constant c is added to each $x_i$ in a sample, yielding $y_i = x_i + c$, how do the sample mean and median of the $y_i$s relate to the mean and median of the $x_i$s? Verify your conjectures.

b. If each $x_i$ is multiplied by a constant c, yielding $y_i = cx_i$, answer the question of part (a). Again, verify your conjectures.

43. An experiment to study the lifetime (in hours) for a certain type of component involved putting ten components into operation and observing them for 100 hours. Eight of the components failed during that period, and those lifetimes were recorded. Denote the lifetimes of the two components still functioning after 100 hours by 100+. The resulting sample observations were

48 79 100+ 35 92 86 57 100+ 17 29

Which of the measures of center discussed in this section can be calculated, and what are the values of those measures? [Note: The data from this experiment is said to be “censored on the right.”]

1.4 Measures of Variability

Reporting a measure of center gives only partial information about a data set or distribution. Different samples or populations may have identical measures of center yet differ from one another in other important ways. Figure 1.19 shows dotplots of three samples with the same mean and median, yet the extent of spread about the center is different for all three samples. The first sample has the largest amount of variability, the third has the smallest amount, and the second is intermediate to the other two in this respect.

[Dotplots of samples 1, 2, and 3 on a common measurement axis marked 30 to 70]

Figure 1.19 Samples with identical measures of center but different amounts of variability

Measures of Variability for Sample Data

The simplest measure of variability in a sample is the range, which is the difference between the largest and smallest sample values. The value of the range for sample 1 in Figure 1.19 is much larger than it is for sample 3, reflecting more variability in the first sample than in the third. A defect of the range, though, is that it depends on only the two most extreme observations and disregards the positions of the remaining values. Samples 1 and 2 in Figure 1.19 have identical ranges, yet when we take into account the observations between the two extremes, there is much less variability or dispersion in the second sample than in the first.

Our primary measures of variability involve the deviations from the mean, $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$. That is, the deviations from the mean are obtained by subtracting $\bar{x}$ from each of the n sample observations. A deviation will be positive if the observation is larger than the mean (to the right of the mean on the measurement axis) and negative if the observation is smaller than the mean. If all the deviations are small in magnitude, then all $x_i$s are close to the mean and there is little variability. Alternatively, if some of the deviations are large in magnitude, then some $x_i$s lie far from $\bar{x}$, suggesting a greater amount of variability. A simple way to combine the deviations into a single quantity is to average them. Unfortunately, this is a bad idea:

\[ \text{sum of deviations} = \sum_{i=1}^{n} (x_i - \bar{x}) = 0 \]

so that the average deviation is always zero. The verification uses several standard rules of summation and the fact that $\sum \bar{x} = \bar{x} + \bar{x} + \cdots + \bar{x} = n\bar{x}$:

\[ \sum (x_i - \bar{x}) = \sum x_i - \sum \bar{x} = \sum x_i - n\bar{x} = \sum x_i - n\left(\frac{1}{n}\sum x_i\right) = 0 \]

How can we prevent negative and positive deviations from counteracting one another when they are combined? One possibility is to work with the absolute values of the deviations and calculate the average absolute deviation $\sum |x_i - \bar{x}|/n$. Because the absolute value operation leads to a number of theoretical difficulties, consider instead the squared deviations $(x_1 - \bar{x})^2, (x_2 - \bar{x})^2, \ldots, (x_n - \bar{x})^2$. Rather than use the average squared deviation $\sum (x_i - \bar{x})^2/n$, for several reasons we divide the sum of squared deviations by $n - 1$ rather than n.

DEFINITION The sample variance, denoted by $s^2$, is given by

\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{S_{xx}}{n - 1} \]

The sample standard deviation, denoted by s, is the (positive) square root of the variance:

\[ s = \sqrt{s^2} \]

Note that $s^2$ and s are both nonnegative. The unit for s is the same as the unit for each of the $x_i$s. If, for example, the observations are fuel efficiencies in miles per gallon, then we might have $s = 2.0$ mpg. A rough interpretation of the sample standard deviation is that it is the size of a typical or representative deviation from the sample mean within the given sample. Thus if $s = 2.0$ mpg, then some $x_i$’s in the sample are closer than 2.0 to $\bar{x}$, whereas others are farther away; 2.0 is a representative (or “standard”) deviation from the mean fuel efficiency. If $s = 3.0$ for a second sample of cars of another type, a typical deviation in this sample is roughly 1.5 times what it is in the first sample, an indication of more variability in the second sample.

Example 1.17 The Web site www.fueleconomy.gov contains a wealth of information about fuel characteristics of various vehicles. In addition to EPA mileage ratings, there are


many vehicles for which users have reported their own values of fuel efficiency (mpg). Consider the following sample of $n = 11$ efficiencies for the 2009 Ford Focus equipped with an automatic transmission (for this model, EPA reports an overall rating of 27 mpg: 24 mpg for city driving and 33 mpg for highway driving):

Car    $x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$
1      27.3     -5.96              35.522
2      27.9     -5.36              28.730
3      32.9     -0.36              0.130
4      35.2     1.94               3.764
5      44.9     11.64              135.490
6      39.9     6.64               44.090
7      30.0     -3.26              10.628
8      29.7     -3.56              12.674
9      28.5     -4.76              22.658
10     32.0     -1.26              1.588
11     37.6     4.34               18.836

$\sum x_i = 365.9$    $\sum (x_i - \bar{x}) = .04$    $\sum (x_i - \bar{x})^2 = 314.106$    $\bar{x} = 33.26$

Effects of rounding account for the sum of deviations not being exactly zero. The numerator of $s^2$ is $S_{xx} = 314.106$, from which

\[ s^2 = \frac{S_{xx}}{n - 1} = \frac{314.106}{11 - 1} = 31.41, \qquad s = 5.60 \]

The size of a representative deviation from the sample mean 33.26 is roughly 5.6 mpg. Note: Of the nine people who also reported driving behavior, only three did more than 80% of their driving in highway mode; we bet you can guess which cars they drove. We haven’t a clue why all 11 reported values exceed the EPA figure; maybe only drivers with really good fuel efficiencies communicate their results. ■

Motivation for s2

To explain the rationale for the divisor in s2, note first that whereas s2 meas- ures sample variability, there is a measure of variability in the population called the population variance. We will use (the square of the lowercase Greek letter sigma) to denote the population variance and to denote the population standard deviation (the square root of ). When the population is finite and consists of N values,

which is the average of all squared deviations from the population mean (for the pop- ulation, the divisor is N and not ). More general definitions of appear in Chapters 3 and 4.

Just as will be used to make inferences about the population mean , we should define the sample variance so that it can be used to make inferences about . Now note that involves squared deviations about the population mean . If we actu- ally knew the value of , then we could define the sample variance as the average squared deviation of the sample xis about . However, the value of is almost never known, so the sum of squared deviations about must be used. But the xis tend to be closer to their average than to the population average , so to compensate for thismx

x mm

m

ms2 s2

mx

s2N 2 1

s2 5 g N

i51 (xi 2 m)

2/N

s2 s

s2

n 2 1

s2 5 Sxx

n 2 1 5

314.106

11 2 1 5 31.41, s 5 5.60

Sxx 5 314.106


the divisor $n - 1$ is used rather than $n$. In other words, if we used a divisor $n$ in the sample variance, then the resulting quantity would tend to underestimate $\sigma^2$ (produce estimated values that are too small on the average), whereas dividing by the slightly smaller $n - 1$ corrects this underestimating.

It is customary to refer to $s^2$ as being based on $n - 1$ degrees of freedom (df). This terminology reflects the fact that although $s^2$ is based on the $n$ quantities $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_n - \bar{x}$, these sum to 0, so specifying the values of any $n - 1$ of the quantities determines the remaining value. For example, if $n = 4$ and $x_1 - \bar{x} = 8$, $x_2 - \bar{x} = -6$, and $x_4 - \bar{x} = -4$, then automatically $x_3 - \bar{x} = 2$, so only three of the four values of $x_i - \bar{x}$ are freely determined (3 df).

A Computing Formula for $s^2$

It is best to obtain $s^2$ from statistical software or else use a calculator that allows you to enter data into memory and then view $s^2$ with a single keystroke. If your calculator does not have this capability, there is an alternative formula for $S_{xx}$ that avoids calculating the deviations. The formula involves both $\left(\sum x_i\right)^2$, summing and then squaring, and $\sum x_i^2$, squaring and then summing.

An alternative expression for the numerator of $s^2$ is

$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}$$

Proof Because $\bar{x} = \sum x_i / n$, $n\bar{x}^2 = \left(\sum x_i\right)^2 / n$. Then,

$$\sum (x_i - \bar{x})^2 = \sum (x_i^2 - 2\bar{x} \cdot x_i + \bar{x}^2) = \sum x_i^2 - 2\bar{x} \sum x_i + \sum (\bar{x})^2 = \sum x_i^2 - 2\bar{x} \cdot n\bar{x} + n(\bar{x})^2 = \sum x_i^2 - n(\bar{x})^2$$

Example 1.18

Traumatic knee dislocation often requires surgery to repair ruptured ligaments. One measure of recovery is range of motion (measured as the angle formed when, starting with the leg straight, the knee is bent as far as possible). The given data on postsurgical range of motion appeared in the article “Reconstruction of the Anterior and Posterior Cruciate Ligaments After Knee Dislocation” (Amer. J. Sports Med., 1999: 189–197):

154 142 137 133 122 126 135 135 108 120 127 134 122

The sum of these 13 sample observations is $\sum x_i = 1695$, and the sum of their squares is

$$\sum x_i^2 = (154)^2 + (142)^2 + \cdots + (122)^2 = 222{,}581$$

Thus the numerator of the sample variance is

$$S_{xx} = \sum x_i^2 - \left[\left(\sum x_i\right)^2\right]/n = 222{,}581 - (1695)^2/13 = 1579.0769$$

from which $s^2 = 1579.0769/12 = 131.59$ and $s = 11.47$. ■

Both the defining formula and the computational formula for $s^2$ can be sensitive to rounding, so as much decimal accuracy as possible should be used in intermediate calculations.

Several other properties of $s^2$ can enhance understanding and facilitate computation.
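As a numerical check on the computing formula, the short Python sketch below evaluates $S_{xx}$ both ways for the 13 range-of-motion observations of Example 1.18. The variable names are illustrative choices, not notation from the text, and the standard-library call `statistics.variance` (which uses the divisor $n - 1$) appears only as an independent cross-check.

```python
import math
import statistics

# Range-of-motion data from Example 1.18
x = [154, 142, 137, 133, 122, 126, 135, 135, 108, 120, 127, 134, 122]
n = len(x)

# Defining formula: sum of squared deviations from the sample mean
xbar = sum(x) / n
sxx_definition = sum((xi - xbar) ** 2 for xi in x)

# Computing formula: (sum of squares) minus (square of the sum) / n
sum_x = sum(x)
sum_x_sq = sum(xi ** 2 for xi in x)
sxx_shortcut = sum_x_sq - sum_x ** 2 / n

# Sample variance and standard deviation use the divisor n - 1
s2 = sxx_shortcut / (n - 1)
s = math.sqrt(s2)

print(f"sum x = {sum_x}, sum x^2 = {sum_x_sq}")
print(f"Sxx (definition) = {sxx_definition:.4f}")
print(f"Sxx (shortcut)   = {sxx_shortcut:.4f}")
print(f"s^2 = {s2:.2f}, s = {s:.2f}")
print(f"statistics.variance(x) = {statistics.variance(x):.2f}")
```

The two routes agree here to within floating-point error. When the mean is large relative to the spread, the shortcut subtracts two nearly equal large numbers, which is one reason to carry full precision in intermediate steps.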


PROPOSITION Let $x_1, x_2, \ldots, x_n$ be a sample and $c$ be any nonzero constant.

1. If $y_1 = x_1 + c, y_2 = x_2 + c, \ldots, y_n = x_n + c$, then $s_y^2 = s_x^2$, and

2. If $y_1 = cx_1, \ldots, y_n = cx_n$, then $s_y^2 = c^2 s_x^2$, $s_y = |c| s_x$

where $s_x^2$ is the sample variance of the $x$’s and $s_y^2$ is the sample variance of the $y$’s.

In words, Result 1 says that if a constant $c$ is added to (or subtracted from) each data value, the variance is unchanged. This is intuitive, since adding or subtracting $c$ shifts the location of the data set but leaves distances between data values unchanged. According to Result 2, multiplication of each $x_i$ by $c$ results in $s^2$ being multiplied by a factor of $c^2$. These properties can be proved by noting in Result 1 that $\bar{y} = \bar{x} + c$ and in Result 2 that $\bar{y} = c\bar{x}$.

Boxplots

Stem-and-leaf displays and histograms convey rather general impressions about a data set, whereas a single summary such as the mean or standard deviation focuses on just one aspect of the data. In recent years, a pictorial summary called a boxplot has been used successfully to describe several of a data set’s most prominent features. These features include (1) center, (2) spread, (3) the extent and nature of any departure from symmetry, and (4) identification of “outliers,” observations that lie unusually far from the main body of the data. Because even a single outlier can drastically affect the values of $\bar{x}$ and $s$, a boxplot is based on measures that are “resistant” to the presence of a few outliers: the median and a measure of variability called the fourth spread.

DEFINITION Order the $n$ observations from smallest to largest and separate the smallest half from the largest half; the median $\tilde{x}$ is included in both halves if $n$ is odd. Then the lower fourth is the median of the smallest half and the upper fourth is the median of the largest half. A measure of spread that is resistant to outliers is the fourth spread $f_s$, given by

$$f_s = \text{upper fourth} - \text{lower fourth}$$

Roughly speaking, the fourth spread is unaffected by the positions of those observations in the smallest 25% or the largest 25% of the data. Hence it is resistant to outliers.

The simplest boxplot is based on the following five-number summary:

smallest $x_i$    lower fourth    median    upper fourth    largest $x_i$

First, draw a horizontal measurement scale. Then place a rectangle above this axis; the left edge of the rectangle is at the lower fourth, and the right edge is at the upper fourth (so box width $= f_s$). Place a vertical line segment or some other symbol inside the rectangle at the location of the median; the position of the median symbol relative to the two edges conveys information about skewness in the middle 50% of the data. Finally, draw “whiskers” out from either end of the rectangle to the smallest and largest observations. A boxplot with a vertical orientation can also be drawn by making obvious modifications in the construction process.


Example 1.19

Ultrasound was used to gather the accompanying corrosion data on the thickness of the floor plate of an aboveground tank used to store crude oil (“Statistical Analysis of UT Corrosion Data from Floor Plates of a Crude Oil Aboveground Storage Tank,” Materials Eval., 1994: 846–849); each observation is the largest pit depth in the plate, expressed in milli-in.

40 52 55 60 70 75 85 85 90 90 92 94 94 95 98 100 115 125 125

The five-number summary is as follows:

smallest $x_i = 40$    lower fourth $= 72.5$    $\tilde{x} = 90$    upper fourth $= 96.5$    largest $x_i = 125$

Figure 1.20 shows the resulting boxplot. The right edge of the box is much closer to the median than is the left edge, indicating a very substantial skew in the middle half of the data. The box width ($f_s$) is also reasonably large relative to the range of the data (distance between the tips of the whiskers).

Figure 1.20 A boxplot of the corrosion data (horizontal axis: Depth, 40 to 130)

Figure 1.21 shows Minitab output from a request to describe the corrosion data. Q1 and Q3 are the lower and upper quartiles; these are similar to the fourths but are calculated in a slightly different manner. SE Mean is $s/\sqrt{n}$; this will be an important quantity in our subsequent work concerning inferences about $\mu$.

Variable    N     Mean     Median    TrMean    StDev    SE Mean
depth       19    86.32    90.00     86.76     23.32    5.35

Variable    Minimum    Maximum    Q1       Q3
depth       40.00      125.00     70.00    98.00

Figure 1.21 Minitab description of the pit-depth data ■
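The five-number summary and the Minitab quantities for the corrosion data can be reproduced with a few lines of Python. This is a minimal sketch, not Minitab's own procedure: the helper name `fourths` is a hypothetical choice, and using `statistics.quantiles` with its default `'exclusive'` method to stand in for Minitab's Q1 and Q3 is an assumption that happens to match the output shown for these data.

```python
import math
import statistics

# Pit-depth data (milli-in.) from Example 1.19
depth = [40, 52, 55, 60, 70, 75, 85, 85, 90, 90,
         92, 94, 94, 95, 98, 100, 115, 125, 125]

def fourths(data):
    """Return (lower fourth, upper fourth) as defined in the text:
    medians of the smallest and largest halves, with the overall
    median included in both halves when n is odd."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2  # half size; the median is shared if n is odd
    return statistics.median(xs[:half]), statistics.median(xs[-half:])

lower, upper = fourths(depth)
median = statistics.median(depth)
fs = upper - lower  # fourth spread
five_number = (min(depth), lower, median, upper, max(depth))

# Quartiles from an interpolation rule (assumed stand-in for Minitab's Q1, Q3)
q1, q2, q3 = statistics.quantiles(depth, n=4)

s = statistics.stdev(depth)            # divisor n - 1
se_mean = s / math.sqrt(len(depth))    # SE Mean = s / sqrt(n)

print("five-number summary:", five_number)
print("fourth spread:", fs)
print("Q1, Q3:", q1, q3)
print(f"mean = {statistics.mean(depth):.2f}, s = {s:.2f}, SE mean = {se_mean:.2f}")
```

For this sample the fourths (72.5 and 96.5) differ from the quartiles (70 and 98), which illustrates the remark that the two are similar but calculated in a slightly different manner.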

Boxplots That Show Outliers

A boxplot can be embellished to indicate explicitly the presence of outliers. Many inferential procedures are based on the assumption that the population distribution is normal (a certain type of bell curve). Even a single extreme outlier in the sample warns the investigator that such procedures may be unreliable, and the presence of several mild outliers conveys the same message.

DEFINITION Any observation farther than $1.5f_s$ from the closest fourth is an outlier. An outlier is extreme if it is more than $3f_s$ from the nearest fourth, and it is mild otherwise.

Let’s now modify our previous construction of a boxplot by drawing a whisker out from each end of the box to the smallest and largest observations that are not outliers. Each mild outlier is represented by a closed circle and each extreme outlier by an open circle. Some statistical computer packages do not distinguish between mild and extreme outliers.


Example 1.20

The Clean Water Act and subsequent amendments require that all waters in the United States meet specific pollution reduction goals to ensure that water is “fishable and swimmable.” The article “Spurious Correlation in the USEPA Rating Curve Method for Estimating Pollutant Loads” (J. of Environ. Engr., 2008: 610–618) investigated various techniques for estimating pollutant loads in watersheds; the authors “discuss the imperative need to use sound statistical methods” for this purpose. Among the data considered is the following sample of TN (total nitrogen) loads (kg N/day) from a particular Chesapeake Bay location, displayed here in increasing order.

9.69 13.16 17.09 18.12 23.70 24.07 24.29 26.43 30.75 31.54 35.07 36.99 40.32 42.51 45.64 48.22 49.98 50.06 55.02 57.00 58.41 61.31 64.25 65.24 66.14 67.68 81.40 90.80 92.17 92.42 100.82 101.94

103.61 106.28 106.80 108.69 114.61 120.86 124.54 143.27 143.75 149.64 167.79 182.50 192.55 193.53 271.57 292.61 312.45 352.09 371.47 444.68 460.86 563.92 690.11 826.54 1529.35

Relevant summary quantities are

$\tilde{x} = 92.17$    lower 4th $= 45.64$    upper 4th $= 167.79$
$f_s = 122.15$    $1.5f_s = 183.225$    $3f_s = 366.45$

Subtracting $1.5f_s$ from the lower 4th gives a negative number, and none of the observations are negative, so there are no outliers on the lower end of the data. However,

upper 4th $+ 1.5f_s = 351.015$    upper 4th $+ 3f_s = 534.24$

Thus the four largest observations (563.92, 690.11, 826.54, and 1529.35) are extreme outliers, and 352.09, 371.47, 444.68, and 460.86 are mild outliers.

The whiskers in the boxplot in Figure 1.22 extend out to the smallest observation, 9.69, on the low end and 312.45, the largest observation that is not an outlier, on the upper end. There is some positive skewness in the middle half of the data (the median line is somewhat closer to the left edge of the box than to the right edge) and a great deal of positive skewness overall.

Figure 1.22 A boxplot of the nitrogen load data showing mild and extreme outliers (horizontal axis: Daily nitrogen load, 0 to 1600) ■
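The outlier rule applied to the nitrogen load data can be sketched in Python as follows. The function names `fourths` and `classify_outliers` are hypothetical helpers written for this illustration; the thresholds are the $1.5f_s$ and $3f_s$ distances from the nearest fourth given in the definition.

```python
import statistics

# TN loads (kg N/day) from Example 1.20, in increasing order
load = [9.69, 13.16, 17.09, 18.12, 23.70, 24.07, 24.29, 26.43, 30.75, 31.54,
        35.07, 36.99, 40.32, 42.51, 45.64, 48.22, 49.98, 50.06, 55.02, 57.00,
        58.41, 61.31, 64.25, 65.24, 66.14, 67.68, 81.40, 90.80, 92.17, 92.42,
        100.82, 101.94, 103.61, 106.28, 106.80, 108.69, 114.61, 120.86,
        124.54, 143.27, 143.75, 149.64, 167.79, 182.50, 192.55, 193.53,
        271.57, 292.61, 312.45, 352.09, 371.47, 444.68, 460.86, 563.92,
        690.11, 826.54, 1529.35]

def fourths(data):
    """Lower and upper fourths; the median joins both halves when n is odd."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2
    return statistics.median(xs[:half]), statistics.median(xs[-half:])

def classify_outliers(data):
    """Return (mild, extreme) outliers using the 1.5*fs and 3*fs rules."""
    lower, upper = fourths(data)
    fs = upper - lower
    mild, extreme = [], []
    for x in sorted(data):
        # Distance beyond the nearest fourth; negative for points inside the box
        distance = max(lower - x, x - upper)
        if distance > 3 * fs:
            extreme.append(x)
        elif distance > 1.5 * fs:
            mild.append(x)
    return mild, extreme

lower, upper = fourths(load)
fs = upper - lower
mild, extreme = classify_outliers(load)

# Whiskers extend to the most extreme observations that are not outliers
non_outliers = [x for x in load if x not in mild and x not in extreme]
whisker_low, whisker_high = min(non_outliers), max(non_outliers)

print(f"fourths: {lower}, {upper}; fs = {fs:.2f}")
print("mild outliers:", mild)
print("extreme outliers:", extreme)
print("whiskers:", whisker_low, whisker_high)
```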

Comparative Boxplots

A comparative or side-by-side boxplot is a very effective way of revealing similarities and differences between two or more data sets consisting of observations on the same variable: fuel efficiency observations for four different types of automobiles, crop yields for three different varieties, and so on.


Example 1.21


In recent years, some evidence suggests that high indoor radon concentration may be linked to the development of childhood cancers, but many health professionals remain unconvinced. A recent article (“Indoor Radon and Childhood Cancer,” The Lancet, 1991: 1537–1538) presented the accompanying data on radon concentration (Bq/m$^3$) in two different samples of houses. The first sample consisted of houses in which a child diagnosed with cancer had been residing. Houses in the second sample had no recorded cases of childhood cancer. Figure 1.23 presents a stem-and-leaf display of the data.

           1. Cancer      |   |  2. No cancer
                  9683795 | 0 | 95768397678993
     86071815066815233150 | 1 | 12271713114
                 12302731 | 2 | 99494191
                     8349 | 3 | 839
                        5 | 4 |
                        7 | 5 | 55
                          | 6 |
                          | 7 |
                  HI: 210 | 8 | 5

Stem: Tens digit
Leaf: Ones digit

Figure 1.23 Stem-and-leaf display for Example 1.21

Numerical summary quantities are as follows:

             $\bar{x}$    $\tilde{x}$    $s$     $f_s$
Cancer       22.8         16.0           31.7    11.0
No cancer    19.2         12.0           17.0    18.0

The values of both the mean and median suggest that the cancer sample is centered somewhat to the right of the no-cancer sample on the measurement scale. The mean, however, exaggerates the magnitude of this shift, largely because of the observation 210 in the cancer sample. The values of $s$ suggest more variability in the cancer sample than in the no-cancer sample, but this impression is contradicted by the fourth spreads. Again, the observation 210, an extreme outlier, is the culprit.

Figure 1.24 A boxplot of the data in Example 1.21, from S-Plus (vertical axis: Radon concentration, 0 to 200; boxes: No cancer, Cancer)

Figure 1.24 shows a comparative boxplot from the S-Plus computer package. The no-cancer box is stretched out compared with the cancer box ($f_s = 18$ vs. $f_s = 11$), and the positions of the median lines in the two boxes show much more skewness in the middle half of the no-cancer sample than the cancer sample. Outliers are represented by horizontal line segments, and there is no distinction between mild and extreme outliers. ■
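The contrast between $s$ and $f_s$ in Example 1.21 can be checked directly. In the sketch below the observations are read off the stem-and-leaf display of Figure 1.23 (stem = tens digit, leaf = ones digit, plus the value 210 listed as HI), so the data are only as reliable as that reading; the helper names `decode`, `fourths`, and `summary` are hypothetical.

```python
import statistics

# Leaves for each stem (tens digit), as read from Figure 1.23
cancer_leaves = {0: "9683795", 1: "86071815066815233150",
                 2: "12302731", 3: "8349", 4: "5", 5: "7"}
no_cancer_leaves = {0: "95768397678993", 1: "12271713114",
                    2: "99494191", 3: "839", 5: "55", 8: "5"}

def decode(leaves_by_stem, extra=()):
    """Turn stem -> leaf-string pairs into a sorted list of observations."""
    values = [10 * stem + int(leaf)
              for stem, leaves in leaves_by_stem.items() for leaf in leaves]
    return sorted(values + list(extra))

def fourths(data):
    """Lower and upper fourths; the median joins both halves when n is odd."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2
    return statistics.median(xs[:half]), statistics.median(xs[-half:])

def summary(data):
    lower, upper = fourths(data)
    return {"mean": statistics.mean(data), "median": statistics.median(data),
            "s": statistics.stdev(data), "fs": upper - lower}

cancer = decode(cancer_leaves, extra=[210])   # 210 is the HI value
no_cancer = decode(no_cancer_leaves)

for name, sample in [("Cancer", cancer),
                     ("No cancer", no_cancer),
                     ("Cancer without 210", cancer[:-1])]:
    stats = summary(sample)
    print(name, {key: round(value, 1) for key, value in stats.items()})
```

Dropping the single observation 210 shrinks $s$ for the cancer sample considerably while leaving $f_s$ unchanged, which is the sense in which the fourth spread is resistant to outliers.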

EXERCISES Section 1.4 (44–61)

44. The article “Oxygen Consumption During Fire Suppression: Error of Heart Rate Estimation” (Ergonomics, 1991: 1469–1474) reported the following data on oxygen consumption (mL/kg/min) for a sample of ten firefighters performing a fire-suppression simulation:

29.5 49.3 30.6 28.2 28.0 26.3 33.9 29.4 23.5 31.6

Compute the following:

a. The sample range
b. The sample variance $s^2$ from the definition (i.e., by first computing deviations, then squaring them, etc.)
c. The sample standard deviation
d. $s^2$ using the shortcut method

45. The value of Young’s modulus (GPa) was determined for cast plates consisting of certain intermetallic substrates, resulting in the following sample observations (“Strength and Modulus of a Molybdenum-Coated Ti-25Al-10Nb-3U-1Mo Intermetallic,” J. of Materials Engr. and Performance, 1997: 46–50):

116.4 115.9 114.6 115.2 115.8

a. Calculate $\bar{x}$ and the deviations from the mean.
b. Use the deviations calculated in part (a) to obtain the sample variance and the sample standard deviation.
c. Calculate $s^2$ by using the computational formula for the numerator $S_{xx}$.
d. Subtract 100 from each observation to obtain a sample of transformed values. Now calculate the sample variance of these transformed values, and compare it to $s^2$ for the original data.

46. The accompanying observations on stabilized viscosity (cP) for specimens of a certain grade of asphalt with 18% rubber added are from the article “Viscosity Characteristics of Rubber-Modified Asphalts” (J. of Materials in Civil Engr., 1996: 153–156):

2781 2900 3013 2856 2888

a. What are the values of the sample mean and sample median?
b. Calculate the sample variance using the computational formula. [Hint: First subtract a convenient number from each observation.]

47. Calculate and interpret the values of the sample median, sample mean, and sample standard deviation for the following observations on fracture strength (MPa, read from a graph in “Heat-Resistant Active Brazing of Silicon Nitride: Mechanical Evaluation of Braze Joints,” Welding J., August, 1997):

87 93 96 98 105 114 128 131 142 168

48. Exercise 34 presented the following data on endotoxin concentration in settled dust both for a sample of urban homes and for a sample of farm homes:

U: 6.0 5.0 11.0 33.0 4.0 5.0 80.0 18.0 35.0 17.0 23.0
F: 4.0 14.0 11.0 9.0 9.0 8.0 4.0 20.0 5.0 8.9 21.0 9.2 3.0 2.0 0.3

a. Determine the value of the sample standard deviation for each sample, interpret these values, and then contrast variability in the two samples. [Hint: $\sum x_i = 237.0$ for the urban sample and $= 128.4$ for the farm sample, and $\sum x_i^2 = 10{,}079$ for the urban sample and 1617.94 for the farm sample.]
b. Compute the fourth spread for each sample and compare. Do the fourth spreads convey the same message about variability that the standard deviations do? Explain.
c. The authors of the cited article also provided endotoxin concentrations in dust bag dust:

U: 34.0 49.0 13.0 33.0 24.0 24.0 35.0 104.0 34.0 40.0 38.0 1.0
F: 2.0 64.0 6.0 17.0 35.0 11.0 17.0 13.0 5.0 27.0 23.0 28.0 10.0 13.0 0.2

Construct a comparative boxplot (as did the cited paper) and compare and contrast the four samples.

49. A study of the relationship between age and various visual functions (such as acuity and depth perception) reported the following observations on the area of scleral lamina (mm$^2$) from human optic nerve heads (“Morphometry of Nerve Fiber Bundle Pores in the Optic Nerve Head of the Human,” Experimental Eye Research, 1988: 559–568):

2.75 2.62 2.74 3.85 2.34 2.74 3.93 4.21 3.88 4.33 3.46 4.52 2.43 3.65 2.78 3.56 3.01

a. Calculate $\sum x_i$ and $\sum x_i^2$.
b. Use the values calculated in part (a) to compute the sample variance $s^2$ and then the sample standard deviation $s$.

50. In 1997 a woman sued a computer keyboard manufacturer, charging that her repetitive stress injuries were caused by the keyboard (Genessy v. Digital Equipment Corp.). The jury awarded about $3.5 million for pain and suffering, but the court then set aside that award as being unreasonable compensation. In making this determination, the court identified a “normative” group of 27 similar cases and specified a reasonable award as one within two standard deviations of the mean of the awards in the 27 cases. The 27 awards were (in $1000s) 37, 60, 75, 115, 135, 140, 149, 150, 238, 290, 340, 410, 600, 750, 750, 750, 1050, 1100, 1139, 1150, 1200, 1200, 1250, 1576, 1700, 1825, and 2000, from which $\sum x_i = 20{,}179$, $\sum x_i^2 = 24{,}657{,}511$. What is the maximum possible amount that could be awarded under the two-standard-deviation rule?

51. The article “A Thin-Film Oxygen Uptake Test for the Evaluation of Automotive Crankcase Lubricants” (Lubric. Engr., 1984: 75–83) reported the following data on oxidation-induction time (min) for various commercial oils:

87 103 130 160 180 195 132 145 211 105 145 153 152 138 87 99 93 119 129

a. Calculate the sample variance and standard deviation.
b. If the observations were reexpressed in hours, what would be the resulting values of the sample variance and sample standard deviation? Answer without actually performing the reexpression.

52. The first four deviations from the mean in a sample of $n = 5$ reaction times were .3, .9, 1.0, and 1.3. What is the fifth deviation from the mean? Give a sample for which these are the five deviations from the mean.

53. A mutual fund is a professionally managed investment scheme that pools money from many investors and invests in a variety of securities. Growth funds focus primarily on increasing the value of investments, whereas blended funds seek a balance between current income and growth. Here is data on the expense ratio (expenses as a % of assets, from www.morningstar.com) for samples of 20 large-cap balanced funds and 20 large-cap growth funds (“large-cap” refers to the sizes of companies in which the funds invest; the population sizes are 825 and 762, respectively):

Bl 1.03 1.23 1.10 1.64 1.30 1.27 1.25 0.78 1.05 0.64 0.94 2.86 1.05 0.75 0.09 0.79 1.61 1.26 0.93 0.84

Gr 0.52 1.06 1.26 2.17 1.55 0.99 1.10 1.07 1.81 2.05 0.91 0.79 1.39 0.62 1.52 1.02 1.10 1.78 1.01 1.15

a. Calculate and compare the values of $\bar{x}$, $\tilde{x}$, and $s$ for the two types of funds.
b. Construct a comparative boxplot for the two types of funds, and comment on interesting features.

54. Grip is applied to produce normal surface forces that compress the object being gripped. Examples include two people shaking hands, or a nurse squeezing a patient’s forearm to stop bleeding. The article “Investigation of Grip Force, Normal Force, Contact Area, Hand Size, and Handle Size for Cylindrical Handles” (Human Factors, 2008: 734–744) included the following data on grip strength (N) for a sample of 42 individuals:

16 18 18 26 33 41 54 56 66 68 87 91 95 98 106 109 111 118 127 127 135 145 147 149 151 168 172 183 189 190 200 210 220 229 230 233 238 244 259 294 329 403

a. Construct a stem-and-leaf display based on repeating each stem value twice, and comment on interesting features.
b. Determine the values of the fourths and the fourth spread.
c. Construct a boxplot based on the five-number summary, and comment on its features.
d. How large or small does an observation have to be to qualify as an outlier? An extreme outlier? Are there any outliers?
e. By how much could the observation 403, currently the largest, be decreased without affecting $f_s$?

55. Here is a stem-and-leaf display of the escape time data introduced in Exercise 36 of this chapter.

32 | 55
33 | 49
34 |
35 | 6699
36 | 34469
37 | 03345
38 | 9
39 | 2347
40 | 23
41 |
42 | 4

a. Determine the value of the fourth spread.
b. Are there any outliers in the sample? Any extreme outliers?
c. Construct a boxplot and comment on its features.
d. By how much could the largest observation, currently 424, be decreased without affecting the value of the fourth spread?

56. The following data on distilled alcohol content (%) for a sample of 35 port wines was extracted from the article “A Method for the Estimation of Alcohol in Fortified Wines Using Hydrometer Baumé and Refractometer Brix” (Amer. J. Enol. Vitic., 2006: 486–490). Each value is an average of two duplicate measurements.

16.35 18.85 16.20 17.75 19.58 17.73 22.75 23.78 23.25 19.08 19.62 19.20 20.05 17.85 19.17 19.48 20.00 19.97 17.48 17.15 19.07 19.90 18.68 18.82 19.03 19.45 19.37 19.20 18.00 19.60 19.33 21.22 19.50 15.30 22.25


Use methods from this chapter, including a boxplot that shows outliers, to describe and summarize the data.

57. A sample of 20 glass bottles of a particular type was selected, and the internal pressure strength of each bottle was determined. Consider the following partial sample information:

median $= 202.2$    lower fourth $= 196.0$    upper fourth $= 216.8$

Three smallest observations 125.8 188.1 193.7
Three largest observations 221.3 230.5 250.2

a. Are there any outliers in the sample? Any extreme outliers?
b. Construct a boxplot that shows outliers, and comment on any interesting features.

58. A company utilizes two different machines to manufacture parts of a certain type. During a single shift, a sample of $n = 20$ parts produced by each machine is obtained, and the value of a particular critical dimension for each part is determined. The comparative boxplot at the bottom of this page is constructed from the resulting data. Compare and contrast the two samples.

59. Blood cocaine concentration (mg/L) was determined both for a sample of individuals who had died from cocaine-induced excited delirium (ED) and for a sample of those who had died from a cocaine overdose without excited delirium; survival time for people in both groups was at most 6 hours. The accompanying data was read from a comparative boxplot in the article “Fatal Excited Delirium Following Cocaine Use” (J. of Forensic Sciences, 1997: 25–31).

ED 0 0 0 0 .1 .1 .1 .1 .2 .2 .3 .3 .3 .4 .5 .7 .8 1.0 1.5 2.7 2.8 3.5 4.0 8.9 9.2 11.7 21.0

Non-ED 0 0 0 0 0 .1 .1 .1 .1 .2 .2 .2 .3 .3 .3 .4 .5 .5 .6 .8 .9 1.0 1.2 1.4 1.5 1.7 2.0 3.2 3.5 4.1 4.3 4.8 5.0 5.6 5.9 6.0 6.4 7.9 8.3 8.7 9.1 9.6 9.9 11.0 11.5 12.2 12.7 14.0 16.6 17.8

a. Determine the medians, fourths, and fourth spreads for the two samples.
b. Are there any outliers in either sample? Any extreme outliers?
c. Construct a comparative boxplot, and use it as a basis for comparing and contrasting the ED and non-ED samples.

60. Observations on burst strength (lb/in2) were obtained both for test nozzle closure welds and for production cannister nozzle welds (“Proper Procedures Are the Key to Welding Radioactive Waste Cannisters,” Welding J., Aug. 1997: 61–67).

Test 7200 6100 7300 7300 8000 7400 7300 7300 8000 6700 8300

Cannister 5250 5625 5900 5900 5700 6050 5800 6000 5875 6100 5850 6600

Construct a comparative boxplot and comment on interesting features (the cited article did not include such a picture, but the authors commented that they had looked at one).

61. The accompanying comparative boxplot of gasoline vapor coefficients for vehicles in Detroit appeared in the article “Receptor Modeling Approach to VOC Emission Inventory Validation” (J. of Envir. Engr., 1995: 483–490). Discuss any interesting features.

Comparative boxplot for Exercise 58 (horizontal axis: Dimension, 85 to 115; one box for each Machine, 1 and 2)

Comparative boxplot for Exercise 61 (horizontal axis: Time, with boxes at 6 a.m., 8 a.m., 12 noon, 2 p.m., and 10 p.m.; vertical axis: Gas vapor coefficient, 0 to 70)


SUPPLEMENTARY EXERCISES (62–83)

62. Consider the following information on ultimate tensile strength (lb/in) for a sample of $n = 4$ hard zirconium copper wire specimens (from “Characterization Methods for Fine Copper Wire,” Wire J. Intl., Aug., 1997: 74–80):

$\bar{x} = 76{,}831$    $s = 180$    smallest $x_i = 76{,}683$    largest $x_i = 77{,}048$

Determine the values of the two middle sample observations (and don’t do it by successive guessing!).

63. A sample of 77 individuals working at a particular office was selected and the noise level (dBA) experienced by each individual was determined, yielding the following data (“Acceptable Noise Levels for Construction Site Offices,” Building Serv. Engr. Research and Technology, 2009: 87–94).

55.3 55.3 55.3 55.9 55.9 55.9 55.9 56.1 56.1 56.1 56.1 56.1 56.1 56.8 56.8 57.0 57.0 57.0 57.8 57.8 57.8 57.9 57.9 57.9 58.8 58.8 58.8 59.8 59.8 59.8 62.2 62.2 63.8 63.8 63.8 63.9 63.9 63.9 64.7 64.7 64.7 65.1 65.1 65.1 65.3 65.3 65.3 65.3 67.4 67.4 67.4 67.4 68.7 68.7 68.7 68.7 69.0 70.4 70.4 71.2 71.2 71.2 73.0 73.0 73.1 73.1 74.6 74.6 74.6 74.6 79.3 79.3 79.3 79.3 83.0 83.0 83.0

Use various techniques discussed in this chapter to organize, summarize, and describe the data.

64. Fretting is a wear process that results from tangential oscillatory movements of small amplitude in machine parts. The article “Grease Effect on Fretting Wear of Mild Steel” (Industrial Lubrication and Tribology, 2008: 67–78) included the following data on volume wear ($10^{-4}$ mm$^3$) for base oils having four different viscosities.

Viscosity    Wear
20.4         58.8    30.8    27.3    29.9    17.7     76.5
30.2         44.5    47.1    48.7    41.6    32.8     18.3
89.4         73.3    57.1    66.0    93.8    133.2    81.1
252.6        30.6    24.2    16.6    38.9    28.7     23.6

a. The sample coefficient of variation $100s/\bar{x}$ assesses the extent of variability relative to the mean (specifically, the standard deviation as a percentage of the mean). Calculate the coefficient of variation for the sample at each viscosity. Then compare the results and comment.
b. Construct a comparative boxplot of the data and comment on interesting features.

65. The accompanying frequency distribution of fracture strength (MPa) observations for ceramic bars fired in a particular kiln appeared in the article “Evaluating Tunnel Kiln Performance” (Amer. Ceramic Soc. Bull., Aug. 1997: 59–63).

Class
Frequency    6    7    17    30    43

Class
Frequency    28    22    13    3

a. Construct a histogram based on relative frequencies, and comment on any interesting features.
b. What proportion of the strength observations are at least 85? Less than 95?
c. Roughly what proportion of the observations are less than 90?

66. A deficiency of the trace element selenium in the diet can negatively impact growth, immunity, muscle and neuromus- cular function, and fertility. The introduction of selenium supplements to dairy cows is justified when pastures have low selenium levels. Authors of the paper “Effects of Short- Term Supplementation with Selenised Yeast on Milk Production and Composition of Lactating Cows” (Australian J. of Dairy Tech., 2004: 199–203) supplied the following data on milk selenium concentration (mg/L) for a sample of cows given a selenium supplement and a control sample given no supplement, both initially and after a 9-day period.

Obs Init Se Init Cont Final Se Final Cont 1 11.4 9.1 138.3 9.3 2 9.6 8.7 104.0 8.8 3 10.1 9.7 96.4 8.8 4 8.5 10.8 89.0 10.1 5 10.3 10.9 88.0 9.6 6 10.6 10.6 103.8 8.6 7 11.8 10.1 147.3 10.4 8 9.8 12.3 97.1 12.4 9 10.9 8.8 172.6 9.3

10 10.3 10.4 146.3 9.5 11 10.2 10.9 99.0 8.4 12 11.4 10.4 122.3 8.7 13 9.2 11.6 103.0 12.5 14 10.6 10.9 117.8 9.1 15 10.8 121.5 16 8.2 93.0

a. Do the initial Se concentrations for the supplement and control samples appear to be similar? Use various tech- niques from this chapter to summarize the data and answer the question posed.

b. Again use methods from this chapter to summarize the data and then describe how the final Se concentration values in the treatment group differ from those in the control group.

67. Aortic stenosis refers to a narrowing of the aortic valve in the heart. The paper “Correlation Analysis of Stenotic Aortic Valve Flow Patterns Using Phase Contrast MRI”

972,99952,97932,95912,93

892,91872,89852,87832,85812,83

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Supplementary Exercises 47

(Annals of Biomed. Engr., 2005: 878–887) gave the follow- ing data on aortic root diameter (cm) and gender for a sam- ple of patients having various degrees of aortic stenosis:

M: 3.7 3.4 3.7 4.0 3.9 3.8 3.4 3.6 3.1 4.0 3.4 3.8 3.5
F: 3.8 2.6 3.2 3.0 4.3 3.5 3.1 3.1 3.2 3.0

a. Compare and contrast the diameter observations for the two genders.

b. Calculate a 10% trimmed mean for each of the two samples, and compare to other measures of center (for the male sample, the interpolation method mentioned in Section 1.3 must be used).

68. a. For what value of c is the quantity $\sum (x_i - c)^2$ minimized? [Hint: Take the derivative with respect to c, set equal to 0, and solve.]

b. Using the result of part (a), which of the two quantities $\sum (x_i - \bar{x})^2$ and $\sum (x_i - \mu)^2$ will be smaller than the other (assuming that $\bar{x} \neq \mu$)?

69. a. Let a and b be constants and let $y_i = a x_i + b$ for $i = 1, 2, \ldots, n$. What are the relationships between $\bar{x}$ and $\bar{y}$ and between $s_x^2$ and $s_y^2$?

b. A sample of temperatures for initiating a certain chemical reaction yielded a sample average (°C) of 87.3 and a sample standard deviation of 1.04. What are the sample average and standard deviation measured in °F? [Hint: $F = \frac{9}{5}C + 32$.]

70. Elevated energy consumption during exercise continues after the workout ends. Because calories burned after exercise contribute to weight loss and have other consequences, it is important to understand this process. The paper “Effect of Weight Training Exercise and Treadmill Exercise on Post-Exercise Oxygen Consumption” (Medicine and Science in Sports and Exercise, 1998: 518–522) reported the accompanying data from a study in which oxygen consumption (liters) was measured continuously for 30 minutes for each of 15 subjects both after a weight training exercise and after a treadmill exercise.

Subject          1     2     3     4     5     6     7
Weight (x)     14.6  14.4  19.5  24.3  16.3  22.1  23.0
Treadmill (y)  11.3   5.3   9.1  15.2  10.1  19.6  20.8

Subject          8     9    10    11    12    13    14    15
Weight (x)     18.7  19.0  17.0  19.1  19.6  23.2  18.5  15.9
Treadmill (y)  10.3  10.3   2.6  16.6  22.4  23.6  12.6   4.4

a. Construct a comparative boxplot of the weight and treadmill observations, and comment on what you see.

b. Because the data is in the form of (x, y) pairs, with x and y measurements on the same variable under two different conditions, it is natural to focus on the differences within pairs: $d_1 = x_1 - y_1, \ldots, d_n = x_n - y_n$. Construct a boxplot of the sample differences. What does it suggest?

71. Here is a description from Minitab of the strength data given in Exercise 13.

Variable   N     Mean     Median   TrMean   StDev   SE Mean
strength   153   135.39   135.40   135.41   4.59    0.37

Variable   Minimum   Maximum   Q1       Q3
strength   122.20    147.70    132.95   138.25

a. Comment on any interesting features (the quartiles and fourths are virtually identical here).

b. Construct a boxplot of the data based on the quartiles, and comment on what you see.

72. Anxiety disorders and symptoms can often be effectively treated with benzodiazepine medications. It is known that animals exposed to stress exhibit a decrease in benzodiazepine receptor binding in the frontal cortex. The paper “Decreased Benzodiazepine Receptor Binding in Prefrontal Cortex in Combat-Related Posttraumatic Stress Disorder” (Amer. J. of Psychiatry, 2000: 1120–1126) described the first study of benzodiazepine receptor binding in individuals suffering from PTSD. The accompanying data on a receptor binding measure (adjusted distribution volume) was read from a graph in the paper.

PTSD: 10, 20, 25, 28, 31, 35, 37, 38, 38, 39, 39, 42, 46

Healthy: 23, 39, 40, 41, 43, 47, 51, 58, 63, 66, 67, 69, 72

Use various methods from this chapter to describe and summarize the data.

73. The article “Can We Really Walk Straight?” (Amer. J. of Physical Anthropology, 1992: 19–27) reported on an experiment in which each of 20 healthy men was asked to walk as straight as possible to a target 60 m away at normal speed. Consider the following observations on cadence (number of strides per second):

.95 .85 .92 .95 .93 .86 1.00 .92 .85 .81

.78 .93 .93 1.05 .93 1.06 1.06 .96 .81 .96

Use the methods developed in this chapter to summarize the data; include an interpretation or discussion wherever appropriate. [Note: The author of the article used a rather sophisticated statistical analysis to conclude that people cannot walk in a straight line and suggested several explanations for this.]

74. The mode of a numerical data set is the value that occurs most frequently in the set.

a. Determine the mode for the cadence data given in Exercise 73.

b. For a categorical sample, how would you define the modal category?

75. Specimens of three different types of rope wire were selected, and the fatigue limit (MPa) was determined for each specimen, resulting in the accompanying data.

Type 1 350 350 350 358 370 370 370 371 371 372 372 384 391 391 392

Type 2 350 354 359 363 365 368 369 371 373 374 376 380 383 388 392

Type 3 350 361 362 364 364 365 366 371 377 377 377 379 380 380 392

a. Construct a comparative boxplot, and comment on similarities and differences.

b. Construct a comparative dotplot (a dotplot for each sample with a common scale). Comment on similarities and differences.

c. Does the comparative boxplot of part (a) give an informative assessment of similarities and differences? Explain your reasoning.

76. The three measures of center introduced in this chapter are the mean, median, and trimmed mean. Two additional measures of center that are occasionally used are the midrange, which is the average of the smallest and largest observations, and the midfourth, which is the average of the two fourths. Which of these five measures of center are resistant to the effects of outliers and which are not? Explain your reasoning.
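As a rough sketch of how the five measures of center named in Exercise 76 can be computed, the Python below evaluates each one on a small made-up sample. The data values, the function names, and the 10% trimming proportion are illustrative assumptions. The fourths are taken here as the medians of the lower and upper halves of the ordered data (with the median included in both halves when n is odd), and the trimmed mean simply drops a whole number of observations from each end, so it does not implement the interpolation needed when the trimming proportion times n is not an integer.

```python
from statistics import mean, median


def trimmed_mean(data, proportion=0.10):
    """Mean after dropping int(proportion * n) observations from each end."""
    xs = sorted(data)
    k = int(proportion * len(xs))  # whole observations trimmed per end
    return mean(xs[k:len(xs) - k])


def fourths(data):
    """Lower and upper fourths: medians of the smallest and largest halves."""
    xs = sorted(data)
    half = (len(xs) + 1) // 2  # includes the median itself when n is odd
    return median(xs[:half]), median(xs[-half:])


def midrange(data):
    """Average of the smallest and largest observations."""
    return (min(data) + max(data)) / 2


def midfourth(data):
    """Average of the two fourths."""
    lower, upper = fourths(data)
    return (lower + upper) / 2


# Hypothetical sample (not from the text) used only to exercise the functions.
sample = [2, 4, 4, 5, 7, 8, 9, 10, 11, 40]

print("mean        ", mean(sample))
print("median      ", median(sample))
print("trimmed mean", trimmed_mean(sample))
print("midrange    ", midrange(sample))
print("midfourth   ", midfourth(sample))
```

Changing the largest value in `sample` and rerunning the script is one way to explore how each measure responds to an extreme observation.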

77. The authors of the article “Predictive Model for Pitting Corrosion in Buried Oil and Gas Pipelines” (Corrosion, 2009: 332–342) provided the data on which their investigation was based.

a. Consider the following sample of 61 observations on maximum pitting depth (mm) of pipeline specimens buried in clay loam soil.

0.41 0.41 0.41 0.41 0.43 0.43 0.43 0.48 0.48 0.58 0.79 0.79 0.81 0.81 0.81 0.91 0.94 0.94 1.02 1.04 1.04 1.17 1.17 1.17 1.17 1.17 1.17 1.17 1.19 1.19 1.27 1.40 1.40 1.59 1.59 1.60 1.68 1.91 1.96 1.96 1.96 2.10 2.21 2.31 2.46 2.49 2.57 2.74 3.10 3.18 3.30 3.58 3.58 4.15 4.75 5.33 7.65 7.70 8.13 10.41 13.44

Construct a stem-and-leaf display in which the two largest values are shown in a last row labeled HI.

b. Refer back to (a), and create a histogram based on eight classes with 0 as the lower limit of the first class and class widths of .5, .5, .5, .5, 1, 2, 5, and 5, respectively.

c. The accompanying comparative boxplot from Minitab shows plots of pitting depth for four different types of soils. Describe its important features.

Comparative boxplot for Exercise 77 (vertical axis: maximum pit depth, 0–14; horizontal axis: soil type, with categories C, CL, SCL, SYCL)

78. Consider a sample $x_1, x_2, \ldots, x_n$ and suppose that the values of $\bar{x}$, $s^2$, and $s$ have been calculated.

a. Let $y_i = x_i - \bar{x}$ for $i = 1, \ldots, n$. How do the values of $s^2$ and $s$ for the $y_i$’s compare to the corresponding values for the $x_i$’s? Explain.

b. Let $z_i = (x_i - \bar{x})/s$ for $i = 1, \ldots, n$. What are the values of the sample variance and sample standard deviation for the $z_i$’s?

79. Let $\bar{x}_n$ and $s_n^2$ denote the sample mean and variance for the sample $x_1, \ldots, x_n$ and let $\bar{x}_{n+1}$ and $s_{n+1}^2$ denote these quantities when an additional observation $x_{n+1}$ is added to the sample.

a. Show how $\bar{x}_{n+1}$ can be computed from $\bar{x}_n$ and $x_{n+1}$.

b. Show that

$n s_{n+1}^2 = (n - 1) s_n^2 + \frac{n}{n + 1} (x_{n+1} - \bar{x}_n)^2$

so that $s_{n+1}^2$ can be computed from $x_{n+1}$, $\bar{x}_n$, and $s_n^2$.

c. Suppose that a sample of 15 strands of drapery yarn has resulted in a sample mean thread elongation of 12.58 mm and a sample standard deviation of .512 mm. A 16th strand results in an elongation value of 11.8. What are the values of the sample mean and sample standard deviation for all 16 elongation observations?

80. Lengths of bus routes for any particular transit system will typically vary from one route to another. The article “Planning of City Bus Routes” (J. of the Institution of Engineers, 1995: 211–215) gives the following information on lengths (km) for one particular system:

Length       6–<8    8–<10   10–<12  12–<14  14–<16
Frequency      6      23       30      35      32

Length      16–<18  18–<20   20–<22  22–<24  24–<26
Frequency     48      42       40      28      27

Length      26–<28  28–<30   30–<35  35–<40  40–<45
Frequency     26      14       27      11       2

a. Draw a histogram corresponding to these frequencies.

b. What proportion of these route lengths are less than 20? What proportion of these routes have lengths of at least 30?

c. Roughly what is the value of the 90th percentile of the route length distribution?

d. Roughly what is the median route length?

81. A study carried out to investigate the distribution of total braking time (reaction time plus accelerator-to-brake movement time, in ms) during real driving conditions at 60 km/hr gave the following summary information on the distribution of times (“A Field Study on Braking Responses During Driving,” Ergonomics, 1995: 1903–1910):

mean = 535             median = 500            mode = 500
sd = 96                minimum = 220           maximum = 925
5th percentile = 400   10th percentile = 430
90th percentile = 640  95th percentile = 720

What can you conclude about the shape of a histogram of this data? Explain your reasoning.

82. The sample data $x_1, x_2, \ldots, x_n$ sometimes represents a time series, where $x_t$ = the observed value of a response variable x at time t. Often the observed series shows a great deal of random variation, which makes it difficult to study longer-term behavior. In such situations, it is desirable to produce a smoothed version of the series. One technique for doing so involves exponential smoothing. The value of a smoothing constant $\alpha$ is chosen ($0 < \alpha < 1$). Then with $\bar{x}_t$ = smoothed value at time t, we set $\bar{x}_1 = x_1$, and for $t = 2, 3, \ldots, n$, $\bar{x}_t = \alpha x_t + (1 - \alpha)\bar{x}_{t-1}$.

a. Consider the following time series in which $x_t$ = temperature (°F) of effluent at a sewage treatment plant on day t: 47, 54, 53, 50, 46, 46, 47, 50, 51, 50, 46, 52, 50, 50. Plot each $x_t$ against t on a two-dimensional coordinate system (a time-series plot). Does there appear to be any pattern?

b. Calculate the $\bar{x}_t$’s using $\alpha = .1$. Repeat using $\alpha = .5$. Which value of $\alpha$ gives a smoother $\bar{x}_t$ series?

c. Substitute $\bar{x}_{t-1} = \alpha x_{t-1} + (1 - \alpha)\bar{x}_{t-2}$ on the right-hand side of the expression for $\bar{x}_t$, then substitute $\bar{x}_{t-2}$ in terms of $x_{t-2}$ and $\bar{x}_{t-3}$, and so on. On how many of the values $x_t, x_{t-1}, \ldots, x_1$ does $\bar{x}_t$ depend? What happens to the coefficient on $x_{t-k}$ as k increases?

d. Refer to part (c). If t is large, how sensitive is $\bar{x}_t$ to the initialization $\bar{x}_1 = x_1$? Explain.

[Note: A relevant reference is the article “Simple Statistics for Interpreting Environmental Data,” Water Pollution Control Fed. J., 1981: 167–175.]

83. Consider numerical observations $x_1, \ldots, x_n$. It is frequently of interest to know whether the $x_i$’s are (at least approximately) symmetrically distributed about some value. If n is at least moderately large, the extent of symmetry can be assessed from a stem-and-leaf display or histogram. However, if n is not very large, such pictures are not particularly informative. Consider the following alternative. Let $y_1$ denote the smallest $x_i$, $y_2$ the second smallest $x_i$, and so on. Then plot the following pairs as points on a two-dimensional coordinate system:

$(y_n - \tilde{x},\ \tilde{x} - y_1),\ (y_{n-1} - \tilde{x},\ \tilde{x} - y_2),\ (y_{n-2} - \tilde{x},\ \tilde{x} - y_3),\ \ldots$

There are $n/2$ points when n is even and $(n - 1)/2$ when n is odd.

a. What does this plot look like when there is perfect symmetry in the data? What does it look like when observations stretch out more above the median than below it (a long upper tail)?

b. The accompanying data on rainfall (acre-feet) from 26 seeded clouds is taken from the article “A Bayesian Analysis of a Multiplicative Treatment Effect in Weather Modification” (Technometrics, 1975: 161–166). Construct the plot and comment on the extent of symmetry or nature of departure from symmetry.

4.1 7.7 17.5 31.4 32.7 40.6 92.4 115.3 118.3 119.0 129.6 198.6 200.7 242.5 255.0 274.7 274.7 302.8 334.1 430.0 489.1 703.4 978.0 1656.0 1697.8 2745.6
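The exponential smoothing recursion described in Exercise 82 (start with the first observation, then blend each new observation with the previous smoothed value using the constant α) can be sketched in Python as follows. This is a minimal illustration only: the function name `exponential_smoothing` and the short `demo` series are assumptions made for this sketch, and the demo values are made up rather than taken from the exercise.

```python
def exponential_smoothing(series, alpha):
    """Return the smoothed series.

    The first smoothed value equals the first observation; each later value
    is alpha * (current observation) + (1 - alpha) * (previous smoothed value).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not series:
        return []
    smoothed = [series[0]]  # initialization: smoothed value at t = 1 is x_1
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical four-point series used only to show the call.
demo = [10.0, 12.0, 8.0, 11.0]
print(exponential_smoothing(demo, 0.5))
```

Calling the function on the same series with two different values of `alpha` and plotting both results against t gives a direct visual comparison of how strongly each choice damps the fluctuations.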

2 Probability

INTRODUCTION

The term probability refers to the study of randomness and uncertainty. In any situation in which one of a number of possible outcomes may occur, the discipline of probability provides methods for quantifying the chances, or likelihoods, associated with the various outcomes. The language of probability is constantly used in an informal manner in both written and spoken contexts. Examples include such statements as “It is likely that the Dow Jones average will increase by the end of the year,” “There is a 50–50 chance that the incumbent will seek reelection,” “There will probably be at least one section of that course offered next year,” “The odds favor a quick settlement of the strike,” and “It is expected that at least 20,000 concert tickets will be sold.” In this chapter, we introduce some elementary probability concepts, indicate how probabilities can be interpreted, and show how the rules of probability can be applied to compute the probabilities of many interesting events. The methodology of probability will then permit us to express in precise language such informal statements as those given above.

The study of probability as a branch of mathematics goes back over 300 years, where it had its genesis in connection with questions involving games of chance. Many books are devoted exclusively to probability, but our objective here is to cover only that part of the subject that has the most direct bearing on problems of statistical inference.

2.1 Sample Spaces and Events

An experiment is any activity or process whose outcome is subject to uncertainty. Although the word experiment generally suggests a planned or carefully controlled laboratory testing situation, we use it here in a much wider sense. Thus experiments that may be of interest include tossing a coin once or several times, selecting a card or cards from a deck, weighing a loaf of bread, ascertaining the commuting time from home to work on a particular morning, obtaining blood types from a group of individuals, or measuring the compressive strengths of different steel beams.

The Sample Space of an Experiment

DEFINITION

The sample space of an experiment, denoted by $\mathcal{S}$, is the set of all possible outcomes of that experiment.

Example 2.1

The simplest experiment to which probability applies is one with two possible outcomes. One such experiment consists of examining a single fuse to see whether it is defective. The sample space for this experiment can be abbreviated as $\mathcal{S} = \{N, D\}$, where N represents not defective, D represents defective, and the braces are used to enclose the elements of a set. Another such experiment would involve tossing a thumbtack and noting whether it landed point up or point down, with sample space $\mathcal{S} = \{U, D\}$, and yet another would consist of observing the gender of the next child born at the local hospital, with $\mathcal{S} = \{M, F\}$. ■

Example 2.2

If we examine three fuses in sequence and note the result of each examination, then an outcome for the entire experiment is any sequence of N’s and D’s of length 3, so

$\mathcal{S} = \{NNN, NND, NDN, NDD, DNN, DND, DDN, DDD\}$

If we had tossed a thumbtack three times, the sample space would be obtained by replacing N by U in $\mathcal{S}$ above, with a similar notational change yielding the sample space for the experiment in which the genders of three newborn children are observed. ■

Example 2.3

Two gas stations are located at a certain intersection. Each one has six gas pumps. Consider the experiment in which the number of pumps in use at a particular time of day is determined for each of the stations. An experimental outcome specifies how many pumps are in use at the first station and how many are in use at the second one. One possible outcome is (2, 2), another is (4, 1), and yet another is (1, 4). The 49 outcomes in $\mathcal{S}$ are displayed in the accompanying table. The sample space for the experiment in which a six-sided die is thrown twice results from deleting the 0 row and 0 column from the table, giving 36 outcomes.

                                  Second Station
                    0       1       2       3       4       5       6
               0  (0, 0)  (0, 1)  (0, 2)  (0, 3)  (0, 4)  (0, 5)  (0, 6)
               1  (1, 0)  (1, 1)  (1, 2)  (1, 3)  (1, 4)  (1, 5)  (1, 6)
               2  (2, 0)  (2, 1)  (2, 2)  (2, 3)  (2, 4)  (2, 5)  (2, 6)
First Station  3  (3, 0)  (3, 1)  (3, 2)  (3, 3)  (3, 4)  (3, 5)  (3, 6)
               4  (4, 0)  (4, 1)  (4, 2)  (4, 3)  (4, 4)  (4, 5)  (4, 6)
               5  (5, 0)  (5, 1)  (5, 2)  (5, 3)  (5, 4)  (5, 5)  (5, 6)
               6  (6, 0)  (6, 1)  (6, 2)  (6, 3)  (6, 4)  (6, 5)  (6, 6)

■
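The finite sample spaces of Examples 2.2 and 2.3 can also be enumerated mechanically. The sketch below is one possible way to do so in Python using `itertools.product`; the variable names (`fuse_space`, `pump_space`, `dice_space`) are illustrative choices for this sketch, not notation from the text.

```python
from itertools import product

# Example 2.2: three fuses examined in sequence, each N (not defective) or D (defective).
fuse_space = {"".join(seq) for seq in product("ND", repeat=3)}

# Example 2.3: number of pumps in use (0 through 6) at each of two six-pump stations.
pump_space = set(product(range(7), repeat=2))

# Two throws of a six-sided die: delete the 0 row and the 0 column of the table.
dice_space = {(i, j) for (i, j) in pump_space if i != 0 and j != 0}

print(sorted(fuse_space))
print(len(pump_space), len(dice_space))
```

Representing each outcome as a string or tuple and the sample space as a Python `set` keeps the code close to the set notation used in the examples.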

Example 2.4

A reasonably large percentage of C++ programs written at a particular company compile on the first run, but some do not (a compiler is a program that translates source code, in this case C++ programs, into machine language so programs can be executed). Suppose an experiment consists of selecting and compiling C++ programs at this location one by one until encountering a program that compiles on the first run. Denote a program that compiles on the first run by S (for success) and one that doesn’t do so by F (for failure). Although it may not be very likely, a possible outcome of this experiment is that the first 5 (or 10 or 20 or . . .) are F’s and the next one is an S. That is, for any positive integer n, we may have to examine n programs before seeing the first S. The sample space is $\mathcal{S} = \{S, FS, FFS, FFFS, \ldots\}$, which contains an infinite number of possible outcomes. The same abbreviated form of the sample space is appropriate for an experiment in which, starting at a specified time, the gender of each newborn infant is recorded until the birth of a male is observed. ■

Events

In our study of probability, we will be interested not only in the individual outcomes of $\mathcal{S}$ but also in various collections of outcomes from $\mathcal{S}$.

DEFINITION

An event is any collection (subset) of outcomes contained in the sample space $\mathcal{S}$. An event is simple if it consists of exactly one outcome and compound if it consists of more than one outcome.

When an experiment is performed, a particular event A is said to occur if the resulting experimental outcome is contained in A. In general, exactly one simple event will occur, but many compound events will occur simultaneously.

Example 2.5

Consider an experiment in which each of three vehicles taking a particular freeway exit turns left (L) or right (R) at the end of the exit ramp. The eight possible outcomes that comprise the sample space are LLL, RLL, LRL, LLR, LRR, RLR, RRL, and RRR. Thus there are eight simple events, among which are $E_1 = \{LLL\}$ and $E_5 = \{LRR\}$. Some compound events include

$A = \{RLL, LRL, LLR\}$ = the event that exactly one of the three vehicles turns right

$B = \{LLL, RLL, LRL, LLR\}$ = the event that at most one of the vehicles turns right

$C = \{LLL, RRR\}$ = the event that all three vehicles turn in the same direction

Suppose that when the experiment is performed, the outcome is LLL. Then the simple event $E_1$ has occurred and so also have the events B and C (but not A). ■

Example 2.6 (Example 2.3 continued)

When the number of pumps in use at each of two six-pump gas stations is observed, there are 49 possible outcomes, so there are 49 simple events: $E_1 = \{(0, 0)\}, E_2 = \{(0, 1)\}, \ldots, E_{49} = \{(6, 6)\}$. Examples of compound events are

$A = \{(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)\}$ = the event that the number of pumps in use is the same for both stations

$B = \{(0, 4), (1, 3), (2, 2), (3, 1), (4, 0)\}$ = the event that the total number of pumps in use is four

$C = \{(0, 0), (0, 1), (1, 0), (1, 1)\}$ = the event that at most one pump is in use at each station ■

Example 2.7 (Example 2.4 continued)

The sample space for the program compilation experiment contains an infinite number of outcomes, so there are an infinite number of simple events. Compound events include

$A = \{S, FS, FFS\}$ = the event that at most three programs are examined

$E = \{FS, FFFS, FFFFFS, \ldots\}$ = the event that an even number of programs are examined ■

Some Relations from Set Theory

An event is just a set, so relationships and results from elementary set theory can be used to study events. The following operations will be used to create new events from given events.

DEFINITION

1. The complement of an event A, denoted by $A'$, is the set of all outcomes in $\mathcal{S}$ that are not contained in A.

2. The union of two events A and B, denoted by $A \cup B$ and read “A or B,” is the event consisting of all outcomes that are either in A or in B or in both events (so that the union includes outcomes for which both A and B occur as well as outcomes for which exactly one occurs); that is, all outcomes in at least one of the events.

3. The intersection of two events A and B, denoted by $A \cap B$ and read “A and B,” is the event consisting of all outcomes that are in both A and B.

Example 2.8 (Example 2.3 continued)

For the experiment in which the number of pumps in use at a single six-pump gas station is observed, let $A = \{0, 1, 2, 3, 4\}$, $B = \{3, 4, 5, 6\}$, and $C = \{1, 3, 5\}$. Then

$A' = \{5, 6\}$, $A \cup B = \{0, 1, 2, 3, 4, 5, 6\} = \mathcal{S}$, $A \cup C = \{0, 1, 2, 3, 4, 5\}$,

$A \cap B = \{3, 4\}$, $A \cap C = \{1, 3\}$, $(A \cap C)' = \{0, 2, 4, 5, 6\}$

Example 2.9 (Example 2.4 continued)

In the program compilation experiment, define A, B, and C by

$A = \{S, FS, FFS\}$, $B = \{S, FFS, FFFFS\}$, $C = \{FS, FFFS, FFFFFS, \ldots\}$

Then

$A' = \{FFFS, FFFFS, FFFFFS, \ldots\}$, $C' = \{S, FFS, FFFFS, \ldots\}$

$A \cup B = \{S, FS, FFS, FFFFS\}$, $A \cap B = \{S, FFS\}$ ■
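Because an event is just a set, the complement, union, and intersection operations map directly onto set operations in most programming languages. The following Python sketch reproduces the calculations of Example 2.8 under that correspondence; the helper names `complement` and `occurs` are assumptions introduced for this illustration.

```python
# Sample space and events from Example 2.8 (pumps in use at one six-pump station).
sample_space = set(range(7))
A = {0, 1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {1, 3, 5}


def complement(event, space=sample_space):
    """Outcomes of the sample space that are not contained in the event."""
    return space - event


def occurs(event, outcome):
    """An event occurs when the observed outcome is one of its elements."""
    return outcome in event


print("A'       =", complement(A))
print("A or B   =", A | B)            # union
print("A or C   =", A | C)
print("A and B  =", A & B)            # intersection
print("A and C  =", A & C)
print("(A and C)' =", complement(A & C))
print("A occurs when 3 pumps are in use:", occurs(A, 3))
```

The infinite events of Example 2.9 cannot be stored as finite sets in this way; they would need a membership rule (for example, a function testing whether a string has the form F...FS) rather than an explicit list of outcomes.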

Sometimes A and B have no outcomes in common, so that the intersection of A and B contains no outcomes.

DEFINITION

Let $\varnothing$ denote the null event (the event consisting of no outcomes whatsoever). When $A \cap B = \varnothing$, A and B are said to be mutually exclusive or disjoint events.

Example 2.10

A small city has three automobile dealerships: a GM dealer selling Chevrolets and Buicks; a Ford dealer selling Fords and Lincolns; and a Toyota dealer. If an experiment consists of observing the brand of the next car sold, then the events $A = \{\text{Chevrolet}, \text{Buick}\}$ and $B = \{\text{Ford}, \text{Lincoln}\}$ are mutually exclusive because the next car sold cannot be both a GM product and a Ford product (at least until the two companies merge!). ■

The operations of union and intersection can be extended to more than two events. For any three events A, B, and C, the event $A \cup B \cup C$ is the set of outcomes contained in at least one of the three events, whereas $A \cap B \cap C$ is the set of outcomes contained in all three events. Given events $A_1, A_2, A_3, \ldots$, these events are said to be mutually exclusive (or pairwise disjoint) if no two events have any outcomes in common.

A pictorial representation of events and manipulations with events is obtained by using Venn diagrams. To construct a Venn diagram, draw a rectangle whose interior will represent the sample space $\mathcal{S}$. Then any event A is represented as the interior of a closed curve (often a circle) contained in $\mathcal{S}$. Figure 2.1 shows examples of Venn diagrams.
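The condition that a collection of events be mutually exclusive (pairwise disjoint) can be checked by testing every pair of events for a shared outcome. The Python sketch below does this for the dealership events of Example 2.10; the function name `pairwise_disjoint` is an assumption of this sketch, and the third event `toyota = {"Toyota"}` is an assumed encoding of the Toyota dealer's sales, since the text does not list its brands explicitly.

```python
from itertools import combinations


def pairwise_disjoint(events):
    """Return True if no two events in the collection share an outcome."""
    return all(a.isdisjoint(b) for a, b in combinations(events, 2))


gm = {"Chevrolet", "Buick"}      # event A in Example 2.10
ford = {"Ford", "Lincoln"}       # event B in Example 2.10
toyota = {"Toyota"}              # assumed: the Toyota dealer sells only Toyotas

print(pairwise_disjoint([gm, ford]))
print(pairwise_disjoint([gm, ford, toyota]))

# Events sharing outcomes 3 and 4 (A and B of Example 2.8) are not disjoint.
print(pairwise_disjoint([{0, 1, 2, 3, 4}, {3, 4, 5, 6}]))
```

Checking pairs rather than the single intersection of all events matters: three events can have an empty common intersection while two of them still overlap.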

Figure 2.1 Venn diagrams: (a) Venn diagram of events A and B; (b) shaded region is $A \cap B$; (c) shaded region is $A \cup B$; (d) shaded region is $A'$; (e) mutually exclusive events

EXERCISES Section 2.1 (1–10)

1. Four universities (1, 2, 3, and 4) are participating in a holiday basketball tournament. In the first round, 1 will play 2 and 3 will play 4. Then the two winners will play for the championship, and the two losers will also play. One possible outcome can be denoted by 1324 (1 beats 2 and 3 beats 4 in first-round games, and then 1 beats 3 and 2 beats 4).
   a. List all outcomes in $\mathcal{S}$.
   b. Let A denote the event that 1 wins the tournament. List outcomes in A.
   c. Let B denote the event that 2 gets into the championship game. List outcomes in B.
   d. What are the outcomes in $A \cup B$ and in $A \cap B$? What are the outcomes in $A'$?

2. Suppose that vehicles taking a particular freeway exit can turn right (R), turn left (L), or go straight (S). Consider observing the direction for each of three successive vehicles.
   a. List all outcomes in the event A that all three vehicles go in the same direction.
   b. List all outcomes in the event B that all three vehicles take different directions.
   c. List all outcomes in the event C that exactly two of the three vehicles turn right.
   d. List all outcomes in the event D that exactly two vehicles go in the same direction.
   e. List outcomes in $D'$, $C \cup D$, and $C \cap D$.

3. Three components are connected to form a system as shown in the accompanying diagram. Because the components in the 2–3 subsystem are connected in parallel, that subsystem will function if at least one of the two individual components functions. For the entire system to function, component 1 must function and so must the 2–3 subsystem.

   [Diagram: component 1 connected in series with the parallel 2–3 subsystem]

   The experiment consists of determining the condition of each component [S (success) for a functioning component and F (failure) for a nonfunctioning component].
   a. Which outcomes are contained in the event A that exactly two out of the three components function?
   b. Which outcomes are contained in the event B that at least two of the components function?
   c. Which outcomes are contained in the event C that the system functions?
   d. List outcomes in $C'$, $A \cup C$, $A \cap C$, $B \cup C$, and $B \cap C$.

4. Each of a sample of four home mortgages is classified as fixed rate (F) or variable rate (V).
   a. What are the 16 outcomes in $\mathcal{S}$?
   b. Which outcomes are in the event that exactly three of the selected mortgages are fixed rate?
   c. Which outcomes are in the event that all four mortgages are of the same type?
   d. Which outcomes are in the event that at most one of the four is a variable-rate mortgage?
   e. What is the union of the events in parts (c) and (d), and what is the intersection of these two events?
   f. What are the union and intersection of the two events in parts (b) and (c)?

5. A family consisting of three persons (A, B, and C) goes to a medical clinic that always has a doctor at each of stations 1, 2, and 3. During a certain week, each member of the family visits the clinic once and is assigned at random to a station. The experiment consists of recording the station number for each member. One outcome is (1, 2, 1) for A to station 1, B to station 2, and C to station 1.
   a. List the 27 outcomes in the sample space.
   b. List all outcomes in the event that all three members go to the same station.
   c. List all outcomes in the event that all members go to different stations.
   d. List all outcomes in the event that no one goes to station 2.

6. A college library has five copies of a certain text on reserve. Two copies (1 and 2) are first printings, and the other three (3, 4, and 5) are second printings. A student examines these books in random order, stopping only when a second printing has been selected. One possible outcome is 5, and another is 213.
   a. List the outcomes in $\mathcal{S}$.
   b. Let A denote the event that exactly one book must be examined. What outcomes are in A?
   c. Let B be the event that book 5 is the one selected. What outcomes are in B?
   d. Let C be the event that book 1 is not examined. What outcomes are in C?

7. An academic department has just completed voting by secret ballot for a department head. The ballot box contains four slips with votes for candidate A and three slips with votes for candidate B. Suppose these slips are removed from the box one by one.
   a. List all possible outcomes.
   b. Suppose a running tally is kept as slips are removed. For what outcomes does A remain ahead of B throughout the tally?

8. An engineering construction firm is currently working on power plants at three different sites. Let $A_i$ denote the event that the plant at site i is completed by the contract date. Use the operations of union, intersection, and complementation to describe each of the following events in terms of $A_1$, $A_2$, and $A_3$, draw a Venn diagram, and shade the region corresponding to each one.
   a. At least one plant is completed by the contract date.
   b. All plants are completed by the contract date.
   c. Only the plant at site 1 is completed by the contract date.
   d. Exactly one plant is completed by the contract date.
   e. Either the plant at site 1 or both of the other two plants are completed by the contract date.

9. Use Venn diagrams to verify the following two relationships for any events A and B (these are called De Morgan’s laws):
   a. $(A \cup B)' = A' \cap B'$
   b. $(A \cap B)' = A' \cup B'$
   [Hint: In each part, draw a diagram corresponding to the left side and another corresponding to the right side.]

10. a. In Example 2.10, identify three events that are mutually exclusive.
    b. Suppose there is no outcome common to all three of the events A, B, and C. Are these three events necessarily mutually exclusive? If your answer is yes, explain why; if your answer is no, give a counterexample using the experiment of Example 2.10.

2.2 Axioms, Interpretations, and Properties of Probability

Given an experiment and a sample space $\mathcal{S}$, the objective of probability is to assign to each event A a number P(A), called the probability of the event A, which will give a precise measure of the chance that A will occur. To ensure that the probability assignments will be consistent with our intuitive notions of probability, all assignments should satisfy the following axioms (basic properties) of probability.

AXIOM 1 For any event A, $P(A) \ge 0$.

AXIOM 2 $P(\mathcal{S}) = 1$.

AXIOM 3 If $A_1, A_2, A_3, \ldots$ is an infinite collection of disjoint events, then

$$P(A_1 \cup A_2 \cup A_3 \cup \cdots) = \sum_{i=1}^{\infty} P(A_i)$$

You might wonder why the third axiom contains no reference to a finite collection of disjoint events. It is because the corresponding property for a finite collection can be derived from our three axioms. We want our axiom list to be as short as possible and not contain any property that can be derived from others on the list.

Axiom 1 reflects the intuitive notion that the chance of A occurring should be nonnegative. The sample space is by definition the event that must occur when the experiment is performed ($\mathcal{S}$ contains all possible outcomes), so Axiom 2 says that the maximum possible probability of 1 is assigned to $\mathcal{S}$. The third axiom formalizes the idea that if we wish the probability that at least one of a number of events will occur and no two of the events can occur simultaneously, then the chance of at least one occurring is the sum of the chances of the individual events.

PROPOSITION $P(\varnothing) = 0$, where $\varnothing$ is the null event (the event containing no outcomes whatsoever). This in turn implies that the property contained in Axiom 3 is valid for a finite collection of disjoint events.

Proof First consider the infinite collection $A_1 = \varnothing, A_2 = \varnothing, A_3 = \varnothing, \ldots$. Since $\varnothing \cap \varnothing = \varnothing$, the events in this collection are disjoint and $\bigcup A_i = \varnothing$. The third axiom then gives

$$P(\varnothing) = \sum P(\varnothing)$$

This can happen only if $P(\varnothing) = 0$.

Now suppose that $A_1, A_2, \ldots, A_k$ are disjoint events, and append to these the infinite collection $A_{k+1} = \varnothing, A_{k+2} = \varnothing, A_{k+3} = \varnothing, \ldots$. Again invoking the third axiom,

$$P\left(\bigcup_{i=1}^{k} A_i\right) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) = \sum_{i=1}^{k} P(A_i)$$

as desired. ■

Example 2.11 Consider tossing a thumbtack in the air. When it comes to rest on the ground, either its point will be up (the outcome U) or down (the outcome D). The sample space for this experiment is therefore $\mathcal{S} = \{U, D\}$. The axioms specify $P(\mathcal{S}) = 1$, so the probability assignment will be completed by determining P(U) and P(D). Since U and D are disjoint and their union is $\mathcal{S}$, the foregoing proposition implies that

$$1 = P(\mathcal{S}) = P(U) + P(D)$$

It follows that $P(D) = 1 - P(U)$. One possible assignment of probabilities is $P(U) = .5$, $P(D) = .5$, whereas another possible assignment is $P(U) = .75$, $P(D) = .25$. In fact, letting p represent any fixed number between 0 and 1, $P(U) = p$, $P(D) = 1 - p$ is an assignment consistent with the axioms. ■

Example 2.12 Consider testing batteries coming off an assembly line one by one until one having a voltage within prescribed limits is found. The simple events are $E_1 = \{S\}$, $E_2 = \{FS\}$, $E_3 = \{FFS\}$, $E_4 = \{FFFS\}, \ldots$. Suppose the probability of any particular battery being satisfactory is .99. Then it can be shown that

$$P(E_1) = .99, \quad P(E_2) = (.01)(.99), \quad P(E_3) = (.01)^2(.99), \ldots$$

is an assignment of probabilities to the simple events that satisfies the axioms. In particular, because the $E_i$’s are disjoint and $\mathcal{S} = E_1 \cup E_2 \cup E_3 \cup \cdots$, it must be the case that

$$1 = P(\mathcal{S}) = P(E_1) + P(E_2) + P(E_3) + \cdots = .99[1 + .01 + (.01)^2 + (.01)^3 + \cdots]$$

Here we have used the formula for the sum of a geometric series:

$$a + ar + ar^2 + ar^3 + \cdots = \frac{a}{1 - r}$$

However, another legitimate (according to the axioms) probability assignment of the same “geometric” type is obtained by replacing .99 by any other number p between 0 and 1 (and .01 by $1 - p$). ■

Interpreting Probability

Examples 2.11 and 2.12 show that the axioms do not completely determine an assignment of probabilities to events. The axioms serve only to rule out assignments inconsistent with our intuitive notions of probability. In the tack-tossing experiment of Example 2.11, two particular assignments were suggested. The appropriate or correct assignment depends on the nature of the thumbtack and also on one’s interpretation of probability. The interpretation that is most frequently used and most easily understood is based on the notion of relative frequencies.

Consider an experiment that can be repeatedly performed in an identical and independent fashion, and let A be an event consisting of a fixed set of outcomes of the experiment. Simple examples of such repeatable experiments include the tack-tossing and die-tossing experiments previously discussed. If the experiment is performed n times, on some of the replications the event A will occur (the outcome will be in the set A), and on others, A will not occur. Let n(A) denote the number of replications on which A does occur. Then the ratio $n(A)/n$ is called the relative frequency of occurrence of the event A in the sequence of n replications.

For example, let A be the event that a package sent within the state of California for 2nd day delivery actually arrives within one day. The results from sending 10 such packages (the first 10 replications) are as follows:

Package #                   1     2     3     4     5     6     7     8     9     10
Did A occur?                N     Y     Y     Y     N     N     Y     Y     N     N
Relative frequency of A     0    .5   .667   .75   .6    .5   .571  .625  .556   .5

Figure 2.2(a) shows how the relative frequency $n(A)/n$ fluctuates rather substantially over the course of the first 50 replications. But as the number of replications continues to increase, Figure 2.2(b) illustrates how the relative frequency stabilizes.
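The running relative frequencies in the table can be recomputed from the Y/N row. The sketch below does that and then, as a separate illustration, simulates a long run of packages. The simulation is an assumption rather than data from the text: it supposes each package independently arrives within one day with probability .6, and the function names are hypothetical.

```python
import random

# Y/N row of the table: did package i arrive within one day?
outcomes = "N Y Y Y N N Y Y N N".split()


def running_relative_frequency(results, marker="Y"):
    """Return n(A)/n after each replication, for n = 1, 2, ..., len(results)."""
    freqs = []
    count = 0
    for n, result in enumerate(results, start=1):
        if result == marker:
            count += 1
        freqs.append(count / n)
    return freqs


def simulate_relative_frequency(p, n, seed=0):
    """Simulate n independent replications in which A occurs with probability p
    (an assumed model) and return the relative frequency n(A)/n."""
    rng = random.Random(seed)
    hits = sum(rng.random() < p for _ in range(n))
    return hits / n


rel_freq = running_relative_frequency(outcomes)
print([round(f, 3) for f in rel_freq])
# [0.0, 0.5, 0.667, 0.75, 0.6, 0.5, 0.571, 0.625, 0.556, 0.5]

long_run = simulate_relative_frequency(0.6, 100_000)
print(round(long_run, 2))
```

With many simulated replications the relative frequency should settle close to the assumed value .6, which is the kind of stabilization the text describes.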

Figure 2.2 Behavior of relative frequency: (a) initial fluctuation; (b) long-run stabilization. [Both panels plot the relative frequency delivered in one day against the number of packages. Panel (a) marks relative frequency = 5/10 = .50 and relative frequency = 9/15 = .60; in panel (b) the relative frequency approaches .6.]

More generally, empirical evidence, based on the results of many such repeatable experiments, indicates that any relative frequency of this sort will stabilize as the number of replications n increases. That is, as n gets arbitrarily large, $n(A)/n$ approaches a limiting value referred to as the limiting (or long-run) relative frequency of the event A. The objective interpretation of probability identifies this limiting relative frequency with P(A). Suppose that probabilities are assigned to events in accordance with their limiting relative frequencies. Then a statement such as “the probability of a package being delivered within one day of mailing is .6” means that of a large number of mailed packages, roughly 60% will arrive within one day. Similarly, if B is the event that an appliance of a particular type will need service while under warranty, then $P(B) = .1$ is interpreted to mean that in the long run 10% of such appliances will need warranty service. This doesn’t mean that exactly 1 out of 10 will need service, or that exactly 10 out of 100 will need service, because 10 and 100 are not the long run.

This relative frequency interpretation of probability is said to be objective because it rests on a property of the experiment rather than on any particular individual concerned with the experiment. For example, two different observers of a sequence of coin tosses should both use the same probability assignments since the observers have nothing to do with limiting relative frequency. In practice, this interpretation is not as objective as it might seem, since the limiting relative frequency of an event will not be known. Thus we will have to assign probabilities based on our beliefs about the limiting relative frequency of events under study. Fortunately, there are many experiments for which there will be a consensus with respect to probability assignments. When we speak of a fair coin, we shall mean $P(H) = P(T) = .5$, and a fair die is one for which limiting relative frequencies of the six outcomes are all $1/6$, suggesting probability assignments $P(\{1\}) = \cdots = P(\{6\}) = 1/6$.

Because the objective interpretation of probability is based on the notion of limiting frequency, its applicability is limited to experimental situations that are repeatable. Yet the language of probability is often used in connection with situations that are inherently unrepeatable. Examples include: “The chances are good for a peace agreement”; “It is likely that our company will be awarded the contract”; and “Because their best quarterback is injured, I expect them to score no more than 10 points against us.” In such situations we would like, as before, to assign numerical probabilities to various outcomes and events (e.g., the probability is .9 that we will get the contract). We must therefore adopt an alternative interpretation of these probabilities. Because different observers may have different prior information and opinions concerning such experimental situations, probability assignments may now differ from individual to individual. Interpretations in such situations are thus referred to as subjective. The book by Robert Winkler listed in the chapter references gives a very readable survey of several subjective interpretations.

More Probability Properties

PROPOSITION For any event A, $P(A) + P(A') = 1$, from which $P(A) = 1 - P(A')$.

Proof In Axiom 3, let $k = 2$, $A_1 = A$, and $A_2 = A'$. Since by definition of $A'$, $A \cup A' = \mathcal{S}$ while A and $A'$ are disjoint, $1 = P(\mathcal{S}) = P(A \cup A') = P(A) + P(A')$. ■

This proposition is surprisingly useful because there are many situations in which $P(A')$ is more easily obtained by direct methods than is P(A).

Example 2.13 Consider a system of five identical components connected in series, as illustrated in Figure 2.3.

Figure 2.3 A system of five components connected in a series

Denote a component that fails by F and one that doesn’t fail by S (for success). Let A be the event that the system fails. For A to occur, at least one of the individual components must fail. Outcomes in A include SSFSS (1, 2, 4, and 5 all work, but 3 does not), FFSSS, and so on. There are in fact 31 different outcomes in A. However, $A'$, the event that the system works, consists of the single outcome SSSSS. We will see in Section 2.5 that if 90% of all such components do not fail and different components fail independently of one another, then $P(A') = P(SSSSS) = .9^5 = .59$. Thus $P(A) = 1 - .59 = .41$; so among a large number of such systems, roughly 41% will fail. ■

In general, the foregoing proposition is useful when the event of interest can be expressed as “at least . . . ,” since then the complement “less than . . .” may be easier to work with (in some problems, “more than . . .” is easier to deal with than “at most . . .”). When you are having difficulty calculating P(A) directly, think of determining $P(A')$.
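The complement calculation of Example 2.13 can be checked by brute force. The sketch below enumerates all 32 outcomes, sums the probabilities of the 31 outcomes in which the system fails, and compares the total with $1 - P(A')$. It relies on the independence assumption stated in the example (justified later, in Section 2.5); the helper name `outcome_probability` is hypothetical.

```python
from itertools import product
from math import isclose, prod

P_WORKS = 0.9  # from Example 2.13: 90% of such components do not fail

# All S/F patterns for five components, e.g. "SSFSS".
outcomes = ["".join(pattern) for pattern in product("SF", repeat=5)]

# A series system fails as soon as at least one component fails.
system_fails = [o for o in outcomes if "F" in o]


def outcome_probability(outcome, p_works=P_WORKS):
    """Probability of one S/F pattern, assuming components fail independently."""
    return prod(p_works if c == "S" else 1 - p_works for c in outcome)


p_direct = sum(outcome_probability(o) for o in system_fails)  # 31 terms
p_complement = 1 - outcome_probability("SSSSS")               # one term

print(len(system_fails), round(p_direct, 2), round(p_complement, 2))
# 31 0.41 0.41
```

The direct route needs 31 terms while the complement needs one, which is the practical point of the proposition.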

PROPOSITION For any event A, $P(A) \le 1$.

This is because $1 = P(A) + P(A') \ge P(A)$ since $P(A') \ge 0$. When events A and B are mutually exclusive, $P(A \cup B) = P(A) + P(B)$. For events that are not mutually exclusive, adding P(A) and P(B) results in “double-counting” outcomes in the intersection. The next result shows how to correct for this.

PROPOSITION For any two events A and B,

$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$

Proof Note first that $A \cup B$ can be decomposed into two disjoint events, A and $B \cap A'$; the latter is the part of B that lies outside A (see Figure 2.4). Furthermore, B itself is the union of the two disjoint events $A \cap B$ and $A' \cap B$, so $P(B) = P(A \cap B) + P(A' \cap B)$. Thus

$$P(A \cup B) = P(A) + P(B \cap A') = P(A) + [P(B) - P(A \cap B)] = P(A) + P(B) - P(A \cap B)$$

Figure 2.4 Representing $A \cup B$ as a union of disjoint events

Example 2.14 In a certain residential suburb, 60% of all households get Internet service from the local cable company, 80% get television service from that company, and 50% get both services from that company. If a household is randomly selected, what is the probability that it gets at least one of these two services from the company, and what is the probability that it gets exactly one of these services from the company?

With $A = \{\text{gets Internet service}\}$ and $B = \{\text{gets TV service}\}$, the given information implies that $P(A) = .6$, $P(B) = .8$, and $P(A \cap B) = .5$. The foregoing proposition now yields

$$P(\text{subscribes to at least one of the two services}) = P(A \cup B) = P(A) + P(B) - P(A \cap B) = .6 + .8 - .5 = .9$$

The event that a household subscribes only to TV service can be written as $A' \cap B$ [(not Internet) and TV]. Now Figure 2.4 implies that

$$.9 = P(A \cup B) = P(A) + P(A' \cap B) = .6 + P(A' \cap B)$$

from which $P(A' \cap B) = .3$. Similarly, $P(A \cap B') = P(A \cup B) - P(B) = .1$. This is all illustrated in Figure 2.5, from which we see that

$$P(\text{exactly one}) = P(A \cap B') + P(A' \cap B) = .1 + .3 = .4$$

Figure 2.5 Probabilities for Example 2.14: $P(A \cap B') = .1$, $P(A \cap B) = .5$, $P(A' \cap B) = .3$
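The arithmetic of Example 2.14 can be reproduced step by step. This is a minimal sketch using exact fractions so that no rounding enters; the variable and function names are illustrative and do not come from the text.

```python
from fractions import Fraction

# Given information from Example 2.14.
p_internet = Fraction(6, 10)  # P(A)
p_tv = Fraction(8, 10)        # P(B)
p_both = Fraction(5, 10)      # P(A and B)


def union_probability(p_a, p_b, p_a_and_b):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


p_at_least_one = union_probability(p_internet, p_tv, p_both)

# A or B splits into the disjoint pieces A and (not A) and B, so:
p_tv_only = p_at_least_one - p_internet   # P(A' and B)
p_internet_only = p_at_least_one - p_tv   # P(A and B')
p_exactly_one = p_tv_only + p_internet_only

print(float(p_at_least_one), float(p_tv_only),
      float(p_internet_only), float(p_exactly_one))
# 0.9 0.3 0.1 0.4
```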

The probability of a union of more than two events can be computed analogously.

For any three events A, B, and C,

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$

This can be verified by examining a Venn diagram of $A \cup B \cup C$, which is shown in Figure 2.6. When P(A), P(B), and P(C) are added, certain intersections are counted twice, so they must be subtracted out, but this results in $P(A \cap B \cap C)$ being subtracted once too often.

Figure 2.6 $A \cup B \cup C$
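As a numerical sanity check of the three-event formula (not a proof), both sides can be evaluated on a small made-up sample space. Everything in this sketch is hypothetical: the eight outcomes, their unequal weights, and the events A, B, and C were chosen only so that the events overlap.

```python
from fractions import Fraction

# Hypothetical sample space: outcomes 1..8 with unequal probabilities.
weights = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8}
total = sum(weights.values())
prob = {outcome: Fraction(w, total) for outcome, w in weights.items()}


def P(event):
    """Probability of an event: the sum of its simple-event probabilities."""
    return sum((prob[o] for o in event), Fraction(0))


# Three overlapping hypothetical events.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {1, 4, 6, 7}

lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))

print(lhs, rhs)
# 7/9 7/9
```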

Determining Probabilities Systematically

Consider a sample space that is either finite or “countably infinite” (the latter means that outcomes can be listed in an infinite sequence, so there is a first outcome, a second outcome, a third outcome, and so on; an example is the battery testing scenario of Example 2.12). Let $E_1, E_2, E_3, \ldots$ denote the corresponding simple events, each consisting of a single outcome. A sensible strategy for probability computation is to first determine each simple event probability, with the requirement that $\sum P(E_i) = 1$. Then the probability of any compound event A is computed by adding together the $P(E_i)$’s for all $E_i$’s in A:

$$P(A) = \sum_{\text{all } E_i\text{'s in } A} P(E_i)$$

Example 2.15 During off-peak hours a commuter train has five cars. Suppose a commuter is twice as likely to select the middle car (#3) as to select either adjacent car (#2 or #4), and is twice as likely to select either adjacent car as to select either end car (#1 or #5). Let $p_i = P(\text{car } i \text{ is selected}) = P(E_i)$. Then we have $p_3 = 2p_2 = 2p_4$ and $p_2 = 2p_1 = 2p_5 = p_4$. This gives

$$1 = \sum P(E_i) = p_1 + 2p_1 + 4p_1 + 2p_1 + p_1 = 10p_1$$

implying $p_1 = p_5 = .1$, $p_2 = p_4 = .2$, $p_3 = .4$. The probability that one of the three middle cars is selected (a compound event) is then $p_2 + p_3 + p_4 = .8$. ■

Equally Likely Outcomes

In many experiments consisting of N outcomes, it is reasonable to assign equal probabilities to all N simple events. These include such obvious examples as tossing a fair coin or fair die once or twice (or any fixed number of times), or selecting one or several cards from a well-shuffled deck of 52. With $p = P(E_i)$ for every i,

$$1 = \sum_{i=1}^{N} P(E_i) = \sum_{i=1}^{N} p = p \cdot N \quad \text{so} \quad p = \frac{1}{N}$$

That is, if there are N equally likely outcomes, the probability for each is 1/N.
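Both ideas above (fix the simple event probabilities so that they sum to 1, then add them over the outcomes in an event) can be sketched in a few lines. The helpers `normalize` and `event_probability` are hypothetical names. The weights 1 : 2 : 4 : 2 : 1 encode the "twice as likely" relations of Example 2.15, and equal weights give the equally likely case.

```python
from fractions import Fraction


def normalize(weights):
    """Turn relative weights into simple-event probabilities that sum to 1."""
    total = sum(weights.values())
    return {outcome: Fraction(w, total) for outcome, w in weights.items()}


def event_probability(simple_probs, event):
    """P(A) = sum of P(E_i) over the simple events E_i contained in A."""
    return sum((simple_probs[o] for o in event), Fraction(0))


# Example 2.15: end cars have weight 1, adjacent cars 2, the middle car 4.
car_probs = normalize({1: 1, 2: 2, 3: 4, 4: 2, 5: 1})
p_middle_three = event_probability(car_probs, {2, 3, 4})

# Equally likely outcomes: a fair die, where every face has the same weight.
die_probs = normalize({face: 1 for face in range(1, 7)})

print(float(car_probs[1]), float(car_probs[3]), float(p_middle_three))
# 0.1 0.4 0.8
print(die_probs[1])
# 1/6
```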

Now consider an event A, with N(A) denoting the number of outcomes contained in A. Then

$$P(A) = \sum_{E_i \text{ in } A} P(E_i) = \sum_{E_i \text{ in } A} \frac{1}{N} = \frac{N(A)}{N}$$

Thus when outcomes are equally likely, computing probabilities reduces to counting: determine both the number of outcomes N(A) in A and the number of outcomes N in $\mathcal{S}$, and form their ratio.

Example 2.16 You have six unread mysteries on your bookshelf and six unread science fiction books. The first three of each type are hardcover, and the last three are paperback. Consider randomly selecting one of the six mysteries and then randomly selecting one of the six science fiction books to take on a post-finals vacation to Acapulco (after all, you need something to read on the beach). Number the mysteries 1, 2, . . . , 6, and do the same for the science fiction books. Then each outcome is a pair of numbers such as (4, 1), and there are $N = 36$ possible outcomes (for a visual of this situation, refer to the table in Example 2.3 and delete the first row and column). With random selection as described, the 36 outcomes are equally likely. Nine of these outcomes are such that both selected books are paperbacks (those in the lower right-hand corner of the referenced table): (4,4), (4,5), . . . , (6,6). So the probability of the event A that both selected books are paperbacks is

$$P(A) = \frac{N(A)}{N} = \frac{9}{36} = .25$$ ■
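The counting in Example 2.16 is small enough to enumerate. A minimal sketch, with illustrative variable names, that lists the 36 equally likely pairs and counts those in which both books are paperbacks:

```python
from itertools import product

books = range(1, 7)      # books 1..6 of each type
PAPERBACKS = {4, 5, 6}   # the last three of each type are paperback

# Each outcome is a pair (mystery number, science fiction number).
outcomes = list(product(books, books))

# Event A: both selected books are paperbacks.
both_paperback = [(m, s) for m, s in outcomes
                  if m in PAPERBACKS and s in PAPERBACKS]

# Equally likely outcomes, so P(A) = N(A) / N.
p_both_paperback = len(both_paperback) / len(outcomes)

print(len(outcomes), len(both_paperback), p_both_paperback)
# 36 9 0.25
```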

EXERCISES Section 2.2 (11–28)

11. A mutual fund company offers its customers a variety of funds: a money-market fund, three different bond funds (short, intermediate, and long-term), two stock funds (moderate and high-risk), and a balanced fund. Among customers who own shares in just one fund, the percentages of customers in the different funds are as follows:

    Money-market         20%     High-risk stock        18%
    Short bond           15%     Moderate-risk stock    25%
    Intermediate bond    10%     Balanced                7%
    Long bond             5%

    A customer who owns shares in just one fund is randomly selected.
    a. What is the probability that the selected individual owns shares in the balanced fund?
    b. What is the probability that the individual owns shares in a bond fund?
    c. What is the probability that the selected individual does not own shares in a stock fund?

12. Consider randomly selecting a student at a certain university, and let A denote the event that the selected individual has a Visa credit card and B be the analogous event for a MasterCard. Suppose that $P(A) = .5$, $P(B) = .4$, and $P(A \cap B) = .25$.
    a. Compute the probability that the selected individual has at least one of the two types of cards (i.e., the probability of the event $A \cup B$).
    b. What is the probability that the selected individual has neither type of card?
    c. Describe, in terms of A and B, the event that the selected student has a Visa card but not a MasterCard, and then calculate the probability of this event.

13. A computer consulting firm presently has bids out on three projects. Let $A_i = \{\text{awarded project } i\}$, for $i = 1, 2, 3$, and suppose that $P(A_1) = .22$, $P(A_2) = .25$, $P(A_3) = .28$, $P(A_1 \cap A_2) = .11$, $P(A_1 \cap A_3) = .05$, $P(A_2 \cap A_3) = .07$, $P(A_1 \cap A_2 \cap A_3) = .01$. Express in words each of the following events, and compute the probability of each event:
    a. $A_1 \cup A_2$
    b. $A_1' \cap A_2'$ [Hint: $(A_1 \cup A_2)' = A_1' \cap A_2'$]
    c. $A_1 \cup A_2 \cup A_3$
    d. $A_1' \cap A_2' \cap A_3'$
    e. $A_1' \cap A_2' \cap A_3$
    f. $(A_1' \cap A_2') \cup A_3$

14. Suppose that 55% of all adults regularly consume coffee, 45% regularly consume carbonated soda, and 70% regularly consume at least one of these two products.
    a. What is the probability that a randomly selected adult regularly consumes both coffee and soda?
    b. What is the probability that a randomly selected adult doesn’t regularly consume at least one of these two products?

15. Consider the type of clothes dryer (gas or electric) purchased by each of five different customers at a certain store.
    a. If the probability that at most one of these purchases an electric dryer is .428, what is the probability that at least two purchase an electric dryer?
    b. If P(all five purchase gas) = .116 and P(all five purchase electric) = .005, what is the probability that at least one of each type is purchased?

16. An individual is presented with three different glasses of cola, labeled C, D, and P. He is asked to taste all three and then list them in order of preference. Suppose the same cola has actually been put into all three glasses.
    a. What are the simple events in this ranking experiment, and what probability would you assign to each one?
    b. What is the probability that C is ranked first?
    c. What is the probability that C is ranked first and D is ranked last?

17. Let A denote the event that the next request for assistance from a statistical software consultant relates to the SPSS package, and let B be the event that the next request is for help with SAS. Suppose that $P(A) = .30$ and $P(B) = .50$.
    a. Why is it not the case that $P(A) + P(B) = 1$?
    b. Calculate $P(A')$.
    c. Calculate $P(A \cup B)$.
    d. Calculate $P(A' \cap B')$.

18. A box contains six 40-W bulbs, five 60-W bulbs, and four 75-W bulbs. If bulbs are selected one by one in random order, what is the probability that at least two bulbs must be selected to obtain one that is rated 75 W?

19. Human visual inspection of solder joints on printed circuit boards can be very subjective. Part of the problem stems from the numerous types of solder defects (e.g., pad nonwetting, knee visibility, voids) and even the degree to which a joint possesses one or more of these defects. Consequently, even highly trained inspectors can disagree on the disposition of a particular joint. In one batch of 10,000 joints, inspector A found 724 that were judged defective, inspector B found 751 such joints, and 1159 of the joints were judged defective by at least one of the inspectors. Suppose that one of the 10,000 joints is randomly selected.
    a. What is the probability that the selected joint was judged to be defective by neither of the two inspectors?
    b. What is the probability that the selected joint was judged to be defective by inspector B but not by inspector A?

20. A certain factory operates three different shifts. Over the last year, 200 accidents have occurred at the factory. Some of these can be attributed at least in part to unsafe working conditions, whereas the others are unrelated to working conditions. The accompanying table gives the percentage of accidents falling in each type of accident–shift category.

                      Unsafe         Unrelated
                      Conditions     to Conditions
    Shift   Day       10%            35%
            Swing      8%            20%
            Night      5%            22%

    Suppose one of the 200 accident reports is randomly selected from a file of reports, and the shift and type of accident are determined.
    a. What are the simple events?
    b. What is the probability that the selected accident was attributed to unsafe conditions?
    c. What is the probability that the selected accident did not occur on the day shift?

21. An insurance company offers four different deductible levels (none, low, medium, and high) for its homeowner’s policyholders and three different levels (low, medium, and high) for its automobile policyholders. The accompanying table gives proportions for the various categories of policyholders who have both types of insurance. For example, the proportion of individuals with both low homeowner’s deductible and low auto deductible is .06 (6% of all such individuals).

                 Homeowner’s
Auto       N      L      M      H
L         .04    .06    .05    .03
M         .07    .10    .20    .10
H         .02    .03    .15    .15

Suppose an individual having both types of policies is randomly selected.
a. What is the probability that the individual has a medium auto deductible and a high homeowner’s deductible?
b. What is the probability that the individual has a low auto deductible? A low homeowner’s deductible?
c. What is the probability that the individual is in the same category for both auto and homeowner’s deductibles?
d. Based on your answer in part (c), what is the probability that the two categories are different?
e. What is the probability that the individual has at least one low deductible level?
f. Using the answer in part (e), what is the probability that neither deductible level is low?

22. The route used by a certain motorist in commuting to work contains two intersections with traffic signals. The probability that he must stop at the first signal is .4, the analogous probability for the second signal is .5, and the probability that he must stop at at least one of the two signals is .6. What is the probability that he must stop
a. At both signals?
b. At the first signal but not at the second one?
c. At exactly one signal?


23. The computers of six faculty members in a certain department are to be replaced. Two of the faculty members have selected laptop machines and the other four have chosen desktop machines. Suppose that only two of the setups can be done on a particular day, and the two computers to be set up are randomly selected from the six (implying 15 equally likely outcomes; if the computers are numbered 1, 2, . . . , 6, then one outcome consists of computers 1 and 2, another consists of computers 1 and 3, and so on).
a. What is the probability that both selected setups are for laptop computers?
b. What is the probability that both selected setups are desktop machines?
c. What is the probability that at least one selected setup is for a desktop computer?
d. What is the probability that at least one computer of each type is chosen for setup?

24. Show that if one event A is contained in another event B (i.e., A is a subset of B), then $P(A) \le P(B)$. [Hint: For such A and B, A and $B \cap A'$ are disjoint and $B = A \cup (B \cap A')$, as can be seen from a Venn diagram.] For general A and B, what does this imply about the relationship among $P(A \cap B)$, $P(A)$, and $P(A \cup B)$?

25. The three most popular options on a certain type of new car are a built-in GPS (A), a sunroof (B), and an automatic transmission (C). If 40% of all purchasers request A, 55% request B, 70% request C, 63% request A or B, 77% request A or C, 80% request B or C, and 85% request A or B or C, determine the probabilities of the following events. [Hint: “A or B” is the event that at least one of the two options is requested; try drawing a Venn diagram and labeling all regions.]
a. The next purchaser will request at least one of the three options.
b. The next purchaser will select none of the three options.
c. The next purchaser will request only an automatic transmission and not either of the other two options.
d. The next purchaser will select exactly one of these three options.

26. A certain system can experience three different types of defects. Let $A_i$ $(i = 1, 2, 3)$ denote the event that the system has a defect of type i. Suppose that

$P(A_1) = .12$    $P(A_2) = .07$    $P(A_3) = .05$
$P(A_1 \cup A_2) = .13$    $P(A_1 \cup A_3) = .14$
$P(A_2 \cup A_3) = .10$    $P(A_1 \cap A_2 \cap A_3) = .01$

a. What is the probability that the system does not have a type 1 defect?
b. What is the probability that the system has both type 1 and type 2 defects?
c. What is the probability that the system has both type 1 and type 2 defects but not a type 3 defect?
d. What is the probability that the system has at most two of these defects?

27. An academic department with five faculty members (Anderson, Box, Cox, Cramer, and Fisher) must select two of its members to serve on a personnel review committee. Because the work will be time-consuming, no one is anxious to serve, so it is decided that the representative will be selected by putting the names on identical pieces of paper and then randomly selecting two.
a. What is the probability that both Anderson and Box will be selected? [Hint: List the equally likely outcomes.]
b. What is the probability that at least one of the two members whose name begins with C is selected?
c. If the five faculty members have taught for 3, 6, 7, 10, and 14 years, respectively, at the university, what is the probability that the two chosen representatives have a total of at least 15 years’ teaching experience there?

28. In Exercise 5, suppose that any incoming individual is equally likely to be assigned to any of the three stations irrespective of where other individuals have been assigned. What is the probability that
a. All three family members are assigned to the same station?
b. At most two family members are assigned to the same station?
c. Every family member is assigned to a different station?

2.3 Counting Techniques

When the various outcomes of an experiment are equally likely (the same probability is assigned to each simple event), the task of computing probabilities reduces to counting. Letting N denote the number of outcomes in a sample space and N(A) represent the number of outcomes contained in an event A,

$$P(A) = \frac{N(A)}{N} \tag{2.1}$$

If a list of the outcomes is easily obtained and N is small, then N and N(A) can be determined without the benefit of any general counting principles.

There are, however, many experiments for which the effort involved in constructing such a list is prohibitive because N is quite large. By exploiting some general counting rules, it is possible to compute probabilities of the form (2.1) without a listing of outcomes. These rules are also useful in many problems involving outcomes that are not equally likely. Several of the rules developed here will be used in studying probability distributions in the next chapter.

The Product Rule for Ordered Pairs

Our first counting rule applies to any situation in which a set (event) consists of ordered pairs of objects and we wish to count the number of such pairs. By an ordered pair, we mean that, if $O_1$ and $O_2$ are objects, then the pair $(O_1, O_2)$ is different from the pair $(O_2, O_1)$. For example, if an individual selects one airline for a trip from Los Angeles to Chicago and (after transacting business in Chicago) a second one for continuing on to New York, one possibility is (American, United), another is (United, American), and still another is (United, United).

PROPOSITION
If the first element or object of an ordered pair can be selected in $n_1$ ways, and for each of these $n_1$ ways the second element of the pair can be selected in $n_2$ ways, then the number of pairs is $n_1 n_2$.

An alternative interpretation involves carrying out an operation that consists of two stages. If the first stage can be performed in any one of $n_1$ ways, and for each such way there are $n_2$ ways to perform the second stage, then $n_1 n_2$ is the number of ways of carrying out the two stages in sequence.
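The two-stage reading of the product rule can be checked by listing the pairs directly. The following is a minimal Python sketch of the airline illustration, assuming only the two airlines named in the text are available for each leg (a longer list would be an assumption); the variable names are illustrative.

```python
from itertools import product

# Only the two airlines named in the text; this short list is an assumption.
airlines = ["American", "United"]

# Each ordered pair is (airline for Los Angeles -> Chicago,
#                       airline for Chicago -> New York).
pairs = list(product(airlines, repeat=2))

n1 = len(airlines)  # ways to perform the first stage
n2 = len(airlines)  # ways to perform the second stage, for every first choice

print(pairs)
print(len(pairs), n1 * n2)  # the two counts agree
```

Because the pairs are ordered, (American, United) and (United, American) are listed as separate outcomes.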

Example 2.17
A homeowner doing some remodeling requires the services of both a plumbing contractor and an electrical contractor. If there are 12 plumbing contractors and 9 electrical contractors available in the area, in how many ways can the contractors be chosen? If we denote the plumbers by $P_1, \ldots, P_{12}$ and the electricians by $Q_1, \ldots, Q_9$, then we wish the number of pairs of the form $(P_i, Q_j)$. With $n_1 = 12$ and $n_2 = 9$, the product rule yields $N = (12)(9) = 108$ possible ways of choosing the two types of contractors. ■

In Example 2.17, the choice of the second element of the pair did not depend on which first element was chosen or occurred. As long as there is the same number of choices of the second element for each first element, the product rule is valid even when the set of possible second elements depends on the first element.

Example 2.18
A family has just moved to a new city and requires the services of both an obstetrician and a pediatrician. There are two easily accessible medical clinics, each having two obstetricians and three pediatricians. The family will obtain maximum health insurance benefits by joining a clinic and selecting both doctors from that clinic. In how many ways can this be done? Denote the obstetricians by $O_1$, $O_2$, $O_3$, and $O_4$ and the pediatricians by $P_1, \ldots, P_6$. Then we wish the number of pairs $(O_i, P_j)$ for which $O_i$ and $P_j$ are associated with the same clinic. Because there are four obstetricians, $n_1 = 4$, and for each there are three choices of pediatrician, so $n_2 = 3$. Applying the product rule gives $N = n_1 n_2 = 12$ possible choices. ■

In many counting and probability problems, a configuration called a tree diagram can be used to represent pictorially all the possibilities. The tree diagram associated with Example 2.18 appears in Figure 2.7. Starting from a point on the left side of the diagram, for each possible first element of a pair a straight-line segment emanates rightward. Each of these lines is referred to as a first-generation branch. Now for any given first-generation branch we construct another line segment emanating from the tip of the branch for each possible choice of a second element of the pair. Each such line segment is a second-generation branch. Because there are four obstetricians, there are four first-generation branches, and three pediatricians for each obstetrician yields three second-generation branches emanating from each first-generation branch.

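[Figure 2.7 Tree diagram for Example 2.18: first-generation branches $O_1$, $O_2$, $O_3$, $O_4$; branches $O_1$ and $O_2$ each lead to $P_1$, $P_2$, $P_3$, and branches $O_3$ and $O_4$ each lead to $P_4$, $P_5$, $P_6$.]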

Generalizing, suppose there are $n_1$ first-generation branches, and for each first-generation branch there are $n_2$ second-generation branches. The total number of second-generation branches is then $n_1 n_2$. Since the end of each second-generation branch corresponds to exactly one possible pair (choosing a first element and then a second puts us at the end of exactly one second-generation branch), there are $n_1 n_2$ pairs, verifying the product rule.

The construction of a tree diagram does not depend on having the same number of second-generation branches emanating from each first-generation branch. If the second clinic had four pediatricians, then there would be only three branches emanating from two of the first-generation branches and four emanating from each of the other two first-generation branches. A tree diagram can thus be used to represent pictorially experiments other than those to which the product rule applies.
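A tree diagram is in effect a nested loop, so the counts in Examples 2.17 and 2.18 can be reproduced by walking the branches. The sketch below is illustrative only: the assignment of doctors to clinics (the first two obstetricians and first three pediatricians at one clinic) is an assumption consistent with Figure 2.7, and the label "P7" for the extra pediatrician in the unequal case is hypothetical.

```python
# Example 2.17: any of 12 plumbers can be paired with any of 9 electricians.
plumbers = [f"P{i}" for i in range(1, 13)]
electricians = [f"Q{j}" for j in range(1, 10)]
pairs_217 = [(p, q) for p in plumbers for q in electricians]

# Example 2.18: the pediatrician must come from the obstetrician's clinic.
# The clinic membership below is an assumed labeling.
clinics = {
    "clinic 1": {"obstetricians": ["O1", "O2"], "pediatricians": ["P1", "P2", "P3"]},
    "clinic 2": {"obstetricians": ["O3", "O4"], "pediatricians": ["P4", "P5", "P6"]},
}


def same_clinic_pairs(clinics):
    """Walk the tree: one first-generation branch per obstetrician and one
    second-generation branch per pediatrician at that obstetrician's clinic."""
    return [
        (o, p)
        for staff in clinics.values()
        for o in staff["obstetricians"]
        for p in staff["pediatricians"]
    ]


pairs_218 = same_clinic_pairs(clinics)

# Unequal branching: the second clinic has four pediatricians, so a single
# product n1 * n2 no longer applies, but the tree can still be walked.
unequal = {
    "clinic 1": clinics["clinic 1"],
    "clinic 2": {"obstetricians": ["O3", "O4"],
                 "pediatricians": ["P4", "P5", "P6", "P7"]},  # "P7" is hypothetical
}
pairs_unequal = same_clinic_pairs(unequal)

print(len(pairs_217))      # 12 * 9
print(len(pairs_218))      # 4 * 3
print(len(pairs_unequal))  # 2 * 3 + 2 * 4
```

In the unequal case the count is obtained by adding the branch tips, which is what the tree diagram shows when the product rule itself does not apply.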

A More General Product Rule

If a six-sided die is tossed five times in succession rather than just twice, then each possible outcome is an ordered collection of five numbers such as (1, 3, 1, 2, 4) or (6, 5, 2, 2, 2). We will call an ordered collection of k objects a k-tuple (so a pair is a 2-tuple and a triple is a 3-tuple). Each outcome of the die-tossing experiment is then a 5-tuple.

Product Rule for k-Tuples
Suppose a set consists of ordered collections of k elements (k-tuples) and that there are $n_1$ possible choices for the first element; for each choice of the first element, there are $n_2$ possible choices of the second element; . . . ; for each possible choice of the first $k - 1$ elements, there are $n_k$ choices of the kth element. Then there are $n_1 n_2 \cdots n_k$ possible k-tuples.

An alternative interpretation involves carrying out an operation in k stages. If the first stage can be performed in any one of $n_1$ ways, and for each such way there are $n_2$ ways to perform the second stage, and for each way of performing the first two stages there are $n_3$ ways to perform the 3rd stage, and so on, then $n_1 n_2 \cdots n_k$ is the number of ways to carry out the entire k-stage operation in sequence. This more general rule can also be visualized with a tree diagram. For the case $k = 3$, simply add an appropriate number of 3rd generation branches to the tip of each 2nd generation branch. If, for example, a college town has four pizza places, a theater complex with six screens, and three places to go dancing, then there would be four 1st generation branches, six 2nd generation branches emanating from the tip of each 1st generation branch, and three 3rd generation branches leading off each 2nd generation branch. Each possible 3-tuple corresponds to the tip of a 3rd generation branch.

Example 2.19 (Example 2.17 continued)
Suppose the home remodeling job involves first purchasing several kitchen appliances. They will all be purchased from the same dealer, and there are five dealers in the area. With the dealers denoted by $D_1, \ldots, D_5$, there are $N = n_1 n_2 n_3 = (5)(12)(9) = 540$ 3-tuples of the form $(D_i, P_j, Q_k)$, so there are 540 ways to choose first an appliance dealer, then a plumbing contractor, and finally an electrical contractor. ■

Example 2.20 (Example 2.18 continued)
If each clinic has both three specialists in internal medicine and two general surgeons, there are $n_1 n_2 n_3 n_4 = (4)(3)(3)(2) = 72$ ways to select one doctor of each type such that all doctors practice at the same clinic. ■

Permutations and Combinations

Consider a group of n distinct individuals or objects (“distinct” means that there is some characteristic that differentiates any particular individual or object from any other). How many ways are there to select a subset of size k from the group? For example, if a Little League team has 15 players on its roster, how many ways are there to select 9 players to form a starting lineup? Or if a university bookstore sells ten different laptop computers but has room to display only three of them, in how many ways can the three be chosen?

An answer to the general question just posed requires that we distinguish between two cases. In some situations, such as the baseball scenario, the order of selection is important. For example, Angela being the pitcher and Ben the catcher gives a different lineup from the one in which Angela is catcher and Ben is pitcher. Often, though, order is not important and one is interested only in which individuals or objects are selected, as would be the case in the laptop display scenario.
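The difference between the two cases is easiest to see on a group small enough to list. The following Python sketch uses a hypothetical group of four laptops (the labels L1 to L4 are illustrative and not from the text) and selects two of them, once with order taken into account and once without.

```python
from itertools import combinations, permutations

laptops = ["L1", "L2", "L3", "L4"]  # hypothetical labels
k = 2

# Order matters: ("L1", "L2") and ("L2", "L1") are different selections.
ordered = list(permutations(laptops, k))

# Order ignored: only which laptops were chosen is recorded.
unordered = list(combinations(laptops, k))

print(ordered)
print(unordered)
print(len(ordered), len(unordered))  # 12 ordered selections, 6 unordered
```

Each unordered selection of two laptops shows up twice in the ordered list, once for each way of arranging its two members.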

DEFINITION
An ordered subset is called a permutation. The number of permutations of size k that can be formed from the n individuals or objects in a group will be denoted by $P_{k,n}$. An unordered subset is called a combination. One way to denote the number of combinations is $C_{k,n}$, but we shall instead use notation that is quite common in probability books: $\binom{n}{k}$, read “n choose k”.

The number of permutations can be determined by using our earlier counting rule for k-tuples. Suppose, for example, that a college of engineering has seven departments, which we denote by a, b, c, d, e, f, and g. Each department has one representative on the college’s student council. From these seven representatives, one is to be chosen chair, another is to be selected vice-chair, and a third will be secretary. How many ways are there to select the three officers? That is, how many permutations of size 3 can be formed from the 7 representatives? To answer this question, think of forming a triple (3-tuple) in which the first element is the chair, the second is the vice-chair, and the third is the secretary. One such triple is (a, g, b), another is (b, g, a), and yet another is (d, f, b). Now the chair can be selected in any of $n_1 = 7$ ways. For each way of selecting the chair, there are $n_2 = 6$ ways to select the vice-chair, and hence $7 \times 6 = 42$ (chair, vice-chair) pairs. Finally, for each way of selecting a chair and vice-chair, there are $n_3 = 5$ ways of choosing the secretary. This gives

$$P_{3,7} = (7)(6)(5) = 210$$

as the number of permutations of size 3 that can be formed from 7 distinct individuals. A tree diagram representation would show three generations of branches.

The expression for $P_{3,7}$ can be rewritten with the aid of factorial notation. Recall that 7! (read “7 factorial”) is compact notation for the descending product of integers (7)(6)(5)(4)(3)(2)(1). More generally, for any positive integer m, $m! = m(m - 1)(m - 2) \cdots (2)(1)$. This gives $1! = 1$, and we also define $0! = 1$. Then

$$P_{3,7} = (7)(6)(5) = \frac{(7)(6)(5)(4!)}{4!} = \frac{7!}{4!}$$

More generally,

$$P_{k,n} = n(n - 1)(n - 2) \cdots (n - (k - 2))(n - (k - 1))$$

Multiplying and dividing this by $(n - k)!$ gives a compact expression for the number of permutations.

PROPOSITION
$$P_{k,n} = \frac{n!}{(n - k)!}$$

Example 2.21
There are ten teaching assistants available for grading papers in a calculus course at a large university. The first exam consists of four questions, and the professor wishes to select a different assistant to grade each question (only one assistant per question). In how many ways can the assistants be chosen for grading? Here $n$ = group size = 10 and $k$ = subset size = 4. The number of permutations is

$$P_{4,10} = \frac{10!}{(10 - 4)!} = \frac{10!}{6!} = 10(9)(8)(7) = 5040$$

That is, the professor could give 5040 different four-question exams without using the same assignment of graders to questions, by which time all the teaching assistants would hopefully have finished their degree programs! ■

Now let’s move on to combinations (i.e., unordered subsets). Again refer to the student council scenario, and suppose that three of the seven representatives are to be selected to attend a statewide convention. The order of selection is not important; all that matters is which three get selected. So we are looking for $\binom{7}{3}$, the number of combinations of size 3 that can be formed from the 7 individuals. Consider for a moment the combination a,c,g. These three individuals can be ordered in $3! = 6$ ways to produce permutations:

a,c,g    a,g,c    c,a,g    c,g,a    g,a,c    g,c,a

Similarly, there are $3! = 6$ ways to order the combination b,c,e to produce permutations, and in fact 3! ways to order any particular combination of size 3 to produce permutations. This implies the following relationship between the number of combinations and the number of permutations:

$$P_{3,7} = (3!) \cdot \binom{7}{3} \;\Rightarrow\; \binom{7}{3} = \frac{P_{3,7}}{3!} = \frac{7!}{(3!)(4!)} = \frac{(7)(6)(5)}{(3)(2)(1)} = 35$$

It would not be too difficult to list the 35 combinations, but there is no need to do so if we are interested only in how many there are. Notice that the number of permutations 210 far exceeds the number of combinations; the former is larger than the latter by a factor of 3! since that is how many ways each combination can be ordered.

Generalizing the foregoing line of reasoning gives a simple relationship between the number of permutations and the number of combinations that yields a concise expression for the latter quantity.
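The student council counts are small enough to confirm by brute force. The sketch below enumerates the officer triples and the convention delegations with Python's itertools and then compares the totals with the standard-library functions `math.perm` and `math.comb` (available from Python 3.8); the variable names are illustrative.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

reps = "abcdefg"  # the seven department representatives

# Enumerate the outcomes instead of relying on a formula.
officer_triples = list(permutations(reps, 3))  # (chair, vice-chair, secretary)
delegations = list(combinations(reps, 3))      # unordered convention delegates

print(len(officer_triples))  # 210
print(len(delegations))      # 35

# Every combination of size 3 can be ordered in 3! ways.
assert len(officer_triples) == factorial(3) * len(delegations)

# Closed forms: P_{3,7} = 7!/4! and (7 choose 3) = P_{3,7}/3!.
assert perm(7, 3) == factorial(7) // factorial(4) == 210
assert comb(7, 3) == perm(7, 3) // factorial(3) == 35

print(perm(10, 4))  # Example 2.21: 5040
```

Listing becomes impractical quickly as n and k grow, which is why the factorial expressions are the practical tool.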

PROPOSITION
$$\binom{n}{k} = \frac{P_{k,n}}{k!} = \frac{n!}{k!(n - k)!}$$

Notice that $\binom{n}{n} = 1$ and $\binom{n}{0} = 1$ since there is only one way to choose a set of (all) n elements or of no elements, and $\binom{n}{1} = n$ since there are n subsets of size 1.

Example 2.22
A particular iPod playlist contains 100 songs, 10 of which are by the Beatles. Suppose the shuffle feature is used to play the songs in random order (the randomness of the shuffling process is investigated in “Does Your iPod Really Play Favorites?” (The Amer. Statistician, 2009: 263–268)). What is the probability that the first Beatles song heard is the fifth song played?

In order for this event to occur, it must be the case that the first four songs played are not Beatles’ songs (NBs) and that the fifth song is by the Beatles (B). The number of ways to select the first five songs is 100(99)(98)(97)(96). The number of ways to select these five songs so that the first four are NBs and the next is a B is 90(89)(88)(87)(10). The random shuffle assumption implies that any particular set of 5 songs from amongst the 100 has the same chance of being selected as the first five played as does any other set of five songs; each outcome is equally likely. Therefore the desired probability is the ratio of the number of outcomes for which the event of interest occurs to the number of possible outcomes:

$$P(\text{1st B is the 5th song played}) = \frac{90 \cdot 89 \cdot 88 \cdot 87 \cdot 10}{100 \cdot 99 \cdot 98 \cdot 97 \cdot 96} = \frac{P_{4,90} \cdot (10)}{P_{5,100}} = .0679$$

Here is an alternative line of reasoning involving combinations. Rather than focusing on selecting just the first five songs, think of playing all 100 songs in random order. The number of ways of choosing 10 of these songs to be the Bs (without regard to the order in which they are then played) is $\binom{100}{10}$. Now if we choose 9 of the last 95 songs to be Bs, which can be done in $\binom{95}{9}$ ways, that leaves four NBs and one B for the first five songs. There is only one further way for these five to start with four NBs and then follow with a B (remember that we are considering unordered subsets). Thus

$$P(\text{1st B is the 5th song played}) = \frac{\binom{95}{9}}{\binom{100}{10}}$$

It is easily verified that this latter expression is in fact identical to the first expression for the desired probability, so the numerical result is again .0679.

The probability that one of the first five songs played is a Beatles’ song is

$$P(\text{1st B is the 1st or 2nd or 3rd or 4th or 5th song played}) = \frac{\binom{99}{9}}{\binom{100}{10}} + \frac{\binom{98}{9}}{\binom{100}{10}} + \frac{\binom{97}{9}}{\binom{100}{10}} + \frac{\binom{96}{9}}{\binom{100}{10}} + \frac{\binom{95}{9}}{\binom{100}{10}} = .4162$$

It is thus rather likely that a Beatles’ song will be one of the first five songs played. Such a “coincidence” is not as surprising as might first appear to be the case. ■

Example 2.23
A university warehouse has received a shipment of 25 printers, of which 10 are laser printers and 15 are inkjet models. If 6 of these 25 are selected at random to be checked by a particular technician, what is the probability that exactly 3 of those selected are laser printers (so that the other 3 are inkjets)?

Let $D_3$ = {exactly 3 of the 6 selected are inkjet printers}. Assuming that any particular set of 6 printers is as likely to be chosen as is any other set of 6, we have equally likely outcomes, so $P(D_3) = N(D_3)/N$, where N is the number of ways of choosing 6 printers from the 25 and $N(D_3)$ is the number of ways of choosing 3 laser printers and 3 inkjet models. Thus $N = \binom{25}{6}$. To obtain $N(D_3)$, think of first choosing 3 of the 15 inkjet models and then 3 of the laser printers. There are $\binom{15}{3}$ ways of choosing the 3 inkjet models, and there are $\binom{10}{3}$ ways of choosing the 3 laser printers; $N(D_3)$ is now the product of these two numbers (visualize a tree diagram; we are really using a product rule argument here), so

$$P(D_3) = \frac{N(D_3)}{N} = \frac{\binom{15}{3}\binom{10}{3}}{\binom{25}{6}} = \frac{\dfrac{15!}{3!\,12!} \cdot \dfrac{10!}{3!\,7!}}{\dfrac{25!}{6!\,19!}} = .3083$$

Let $D_4$ = {exactly 4 of the 6 printers selected are inkjet models} and define $D_5$ and $D_6$ in an analogous manner. Then the probability that at least 3 inkjet printers are selected is

$$P(D_3 \cup D_4 \cup D_5 \cup D_6) = P(D_3) + P(D_4) + P(D_5) + P(D_6)$$
$$= \frac{\binom{15}{3}\binom{10}{3}}{\binom{25}{6}} + \frac{\binom{15}{4}\binom{10}{2}}{\binom{25}{6}} + \frac{\binom{15}{5}\binom{10}{1}}{\binom{25}{6}} + \frac{\binom{15}{6}\binom{10}{0}}{\binom{25}{6}} = .8530$$
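The probabilities in Examples 2.22 and 2.23 are ratios of counts, so they can be reproduced with `math.perm` and `math.comb` from the Python standard library (3.8 or later). The following is a minimal sketch; the helper name `p_exactly_inkjet` and its parameter names are illustrative and not from the text.

```python
from math import comb, perm

# Example 2.22: 100 songs, 10 by the Beatles (B) and 90 not (NB).
# Ordered count: four NBs followed by one B among the first five songs.
p_fifth_ordered = perm(90, 4) * 10 / perm(100, 5)

# Unordered count: the other 9 Bs occupy 9 of the last 95 positions.
p_fifth_unordered = comb(95, 9) / comb(100, 10)

# First B is the j-th song (j = 1, ..., 5): the other 9 Bs fall among
# the last 100 - j positions.
p_within_five = sum(comb(100 - j, 9) for j in range(1, 6)) / comb(100, 10)


# Example 2.23: 6 printers chosen from 15 inkjet models and 10 laser printers.
def p_exactly_inkjet(i, n_inkjet=15, n_laser=10, sample=6):
    """P(exactly i of the sampled printers are inkjet models)."""
    return (comb(n_inkjet, i) * comb(n_laser, sample - i)
            / comb(n_inkjet + n_laser, sample))


p_d3 = p_exactly_inkjet(3)
p_at_least_3 = sum(p_exactly_inkjet(i) for i in range(3, 7))

print(round(p_fifth_ordered, 4), round(p_fifth_unordered, 4))
print(round(p_within_five, 4))
print(round(p_d3, 4), round(p_at_least_3, 4))
```

The ordered and unordered calculations for the iPod example agree up to floating-point rounding, which is the identity the text describes as easily verified.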


EXERCISES Section 2.3 (29–44)

29. As of April 2006, roughly 50 million .com web domain names were registered (e.g., yahoo.com).
a. How many domain names consisting of just two letters in sequence can be formed? How many domain names of length two are there if digits as well as letters are permitted as characters? [Note: A character length of three or more is now mandated.]
b. How many domain names are there consisting of three letters in sequence? How many of this length are there if either letters or digits are permitted? [Note: All are currently taken.]
c. Answer the questions posed in (b) for four-character sequences.
d. As of April 2006, 97,786 of the four-character sequences using either letters or digits had not yet been claimed. If a four-character name is randomly selected, what is the probability that it is already owned?

30. A friend of mine is giving a dinner party. His current wine supply includes 8 bottles of zinfandel, 10 of merlot, and 12 of cabernet (he only drinks red wine), all from different wineries.
a. If he wants to serve 3 bottles of zinfandel and serving order is important, how many ways are there to do this?
b. If 6 bottles of wine are to be randomly selected from the 30 for serving, how many ways are there to do this?
c. If 6 bottles are randomly selected, how many ways are there to obtain two bottles of each variety?
d. If 6 bottles are randomly selected, what is the probability that this results in two bottles of each variety being chosen?
e. If 6 bottles are randomly selected, what is the probability that all of them are the same variety?

31. a. Beethoven wrote 9 symphonies, and Mozart wrote 27 piano concertos. If a university radio station announcer wishes to play first a Beethoven symphony and then a Mozart concerto, in how many ways can this be done?

b. The station manager decides that on each successive night (7 days per week), a Beethoven symphony will be played, followed by a Mozart piano concerto, followed by a Schubert string quartet (of which there are 15). For roughly how many years could this policy be continued before exactly the same program would have to be repeated?

32. A stereo store is offering a special price on a complete set of components (receiver, compact disc player, speakers, turntable). A purchaser is offered a choice of manufacturer for each component:

Receiver: Kenwood, Onkyo, Pioneer, Sony, Sherwood
Compact disc player: Onkyo, Pioneer, Sony, Technics
Speakers: Boston, Infinity, Polk
Turntable: Onkyo, Sony, Teac, Technics

A switchboard display in the store allows a customer to hook together any selection of components (consisting of one of each type). Use the product rules to answer the following questions:
a. In how many ways can one component of each type be selected?
b. In how many ways can components be selected if both the receiver and the compact disc player are to be Sony?
c. In how many ways can components be selected if none is to be Sony?
d. In how many ways can a selection be made if at least one Sony component is to be included?
e. If someone flips switches on the selection in a completely random fashion, what is the probability that the system selected contains at least one Sony component? Exactly one Sony component?

33. Again consider a Little League team that has 15 players on its roster.
a. How many ways are there to select 9 players for the starting lineup?
b. How many ways are there to select 9 players for the starting lineup and a batting order for the 9 starters?
c. Suppose 5 of the 15 players are left-handed. How many ways are there to select 3 left-handed outfielders and have all 6 other positions occupied by right-handed players?

34. Computer keyboard failures can be attributed to electrical defects or mechanical defects. A repair facility currently has 25 failed keyboards, 6 of which have electrical defects and 19 of which have mechanical defects.
a. How many ways are there to randomly select 5 of these keyboards for a thorough inspection (without regard to order)?
b. In how many ways can a sample of 5 keyboards be selected so that exactly two have an electrical defect?
c. If a sample of 5 keyboards is randomly selected, what is the probability that at least 4 of these will have a mechanical defect?

35. A production facility employs 20 workers on the day shift, 15 workers on the swing shift, and 10 workers on the graveyard shift. A quality control consultant is to select 6 of these workers for in-depth interviews. Suppose the selection is made in such a way that any particular group of 6 workers has the same chance of being selected as does any other group (drawing 6 slips without replacement from among 45).
a. How many selections result in all 6 workers coming from the day shift? What is the probability that all 6 selected workers will be from the day shift?
b. What is the probability that all 6 selected workers will be from the same shift?
c. What is the probability that at least two different shifts will be represented among the selected workers?
d. What is the probability that at least one of the shifts will be unrepresented in the sample of workers?


36. An academic department with five faculty members narrowed its choice for department head to either candidate A or candidate B. Each member then voted on a slip of paper for one of the candidates. Suppose there are actually three votes for A and two for B. If the slips are selected for tallying in random order, what is the probability that A remains ahead of B throughout the vote count (e.g., this event occurs if the selected ordering is AABAB, but not for ABBAA)?

37. An experimenter is studying the effects of temperature, pressure, and type of catalyst on yield from a certain chemical reaction. Three different temperatures, four different pressures, and five different catalysts are under consideration.

a. If any particular experimental run involves the use of a single temperature, pressure, and catalyst, how many experimental runs are possible?

b. How many experimental runs are there that involve use of the lowest temperature and two lowest pressures?

c. Suppose that five different experimental runs are to be made on the first day of experimentation. If the five are randomly selected from among all the possibilities, so that any group of five has the same probability of selection, what is the probability that a different catalyst is used on each run?

38. A box in a certain supply room contains four 40-W lightbulbs, five 60-W bulbs, and six 75-W bulbs. Suppose that three bulbs are randomly selected.

a. What is the probability that exactly two of the selected bulbs are rated 75-W?

b. What is the probability that all three of the selected bulbs have the same rating?

c. What is the probability that one bulb of each type is selected?

d. Suppose now that bulbs are to be selected one by one until a 75-W bulb is found. What is the probability that it is necessary to examine at least six bulbs?

39. Fifteen telephones have just been received at an authorized service center. Five of these telephones are cellular, five are cordless, and the other five are corded phones. Suppose that these components are randomly allocated the numbers 1, 2, . . . , 15 to establish the order in which they will be serviced.

a. What is the probability that all the cordless phones are among the first ten to be serviced?

b. What is the probability that after servicing ten of these phones, phones of only two of the three types remain to be serviced?

c. What is the probability that two phones of each type are among the first six serviced?

40. Three molecules of type A, three of type B, three of type C, and three of type D are to be linked together to form a chain molecule. One such chain molecule is ABCDABCDABCD, and another is BCDDAAABDBCC.

a. How many such chain molecules are there? [Hint: If the three A’s were distinguishable from one another ($A_1$, $A_2$, $A_3$) and the B’s, C’s, and D’s were also, how many molecules would there be? How is this number reduced when the subscripts are removed from the A’s?]

b. Suppose a chain molecule of the type described is randomly selected. What is the probability that all three molecules of each type end up next to one another (such as in BBBAAADDDCCC)?

41. An ATM personal identification number (PIN) consists of four digits, each a 0, 1, 2, . . . 8, or 9, in succession.

a. How many different possible PINs are there if there are no restrictions on the choice of digits?

b. According to a representative at the author’s local branch of Chase Bank, there are in fact restrictions on the choice of digits. The following choices are prohibited: (i) all four digits identical (ii) sequences of consecutive ascending or descending digits, such as 6543 (iii) any sequence starting with 19 (birth years are too easy to guess). So if one of the PINs in (a) is randomly selected, what is the probability that it will be a legitimate PIN (that is, not be one of the prohibited sequences)?

c. Someone has stolen an ATM card and knows that the first and last digits of the PIN are 8 and 1, respectively. He has three tries before the card is retained by the ATM (but does not realize that). So he randomly selects the 2nd and 3rd digits for the first try, then randomly selects a different pair of digits for the second try, and yet another randomly selected pair of digits for the third try (the individual knows about the restrictions described in (b) so selects only from the legitimate possibilities). What is the probability that the individual gains access to the account?

d. Recalculate the probability in (c) if the first and last digits are 1 and 1, respectively.

42. A starting lineup in basketball consists of two guards, two forwards, and a center.

a. A certain college team has on its roster three centers, four guards, four forwards, and one individual (X) who can play either guard or forward. How many different starting lineups can be created? [Hint: Consider lineups without X, then lineups with X as guard, then lineups with X as forward.]

b. Now suppose the roster has 5 guards, 5 forwards, 3 centers, and 2 “swing players” (X and Y) who can play either guard or forward. If 5 of the 15 players are randomly selected, what is the probability that they constitute a legitimate starting lineup?

43. In five-card poker, a straight consists of five cards with adjacent denominations (e.g., 9 of clubs, 10 of hearts, jack of hearts, queen of spades, and king of clubs). Assuming that aces can be high or low, if you are dealt a five-card hand, what is the probability that it will be a straight with high card 10? What is the probability that it will be a straight? What is the probability that it will be a straight flush (all cards in the same suit)?

44. Show that $\binom{n}{k} = \binom{n}{n-k}$. Give an interpretation involving subsets.


2.4 Conditional Probability

The probabilities assigned to various events depend on what is known about the experimental situation when the assignment is made. Subsequent to the initial assignment, partial information relevant to the outcome of the experiment may become available. Such information may cause us to revise some of our probability assignments. For a particular event A, we have used P(A) to represent the probability assigned to A; we now think of P(A) as the original, or unconditional, probability of the event A.

In this section, we examine how the information “an event B has occurred” affects the probability assigned to A. For example, A might refer to an individual having a particular disease in the presence of certain symptoms. If a blood test is performed on the individual and the result is negative ($B$ = negative blood test), then the probability of having the disease will change (it should decrease, but not usually to zero, since blood tests are not infallible). We will use the notation $P(A \mid B)$ to represent the conditional probability of A given that the event B has occurred. B is the “conditioning event.”

As an example, consider the event A that a randomly selected student at your university obtained all desired classes during the previous term’s registration cycle. Presumably P(A) is not very large. However, suppose the selected student is an athlete who gets special registration priority (the event B). Then $P(A \mid B)$ should be substantially larger than P(A), although perhaps still not close to 1.

Example 2.24

Complex components are assembled in a plant that uses two different assembly lines, $A$ and $A'$. Line $A$ uses older equipment than $A'$, so it is somewhat slower and less reliable. Suppose on a given day line $A$ has assembled 8 components, of which 2 have been identified as defective ($B$) and 6 as nondefective ($B'$), whereas $A'$ has produced 1 defective and 9 nondefective components. This information is summarized in the accompanying table.

            Condition
Line       B      B'
A          2      6
A'         1      9

Unaware of this information, the sales manager randomly selects 1 of these 18 components for a demonstration. Prior to the demonstration,

$$P(\text{line } A \text{ component selected}) = P(A) = \frac{N(A)}{N} = \frac{8}{18} = .44$$

However, if the chosen component turns out to be defective, then the event B has occurred, so the component must have been 1 of the 3 in the B column of the table. Since these 3 components are equally likely among themselves after B has occurred,

$$P(A \mid B) = \frac{2}{3} = \frac{2/18}{3/18} = \frac{P(A \cap B)}{P(B)} \tag{2.2}$$
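The counting argument of Example 2.24 can be checked with a short Python sketch. The dictionary layout and the helper name `prob` are illustrative choices, not part of the text; the sketch assumes, as the example does, that all 18 components are equally likely to be chosen.

```python
from fractions import Fraction

# Counts from the table in Example 2.24: (line, condition) -> number of components.
counts = {
    ("A", "defective"): 2,
    ("A", "nondefective"): 6,
    ("A'", "defective"): 1,
    ("A'", "nondefective"): 9,
}

def prob(event):
    """P(event) when each of the 18 components is equally likely to be chosen."""
    favorable = sum(n for key, n in counts.items() if event(key))
    return Fraction(favorable, sum(counts.values()))

def is_line_a(key):
    return key[0] == "A"

def is_defective(key):
    return key[1] == "defective"

p_a = prob(is_line_a)                                          # 8/18
p_b = prob(is_defective)                                       # 3/18
p_a_and_b = prob(lambda k: is_line_a(k) and is_defective(k))   # 2/18
p_a_given_b = p_a_and_b / p_b                                  # ratio in Equation (2.2)

print(p_a, round(float(p_a), 2))   # 4/9 0.44
print(p_a_given_b)                 # 2/3
```

Exact fractions are used so that the ratio (2/18)/(3/18) reduces to 2/3 without rounding.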


In Equation (2.2), the conditional probability is expressed as a ratio of unconditional probabilities: The numerator is the probability of the intersection of the two events, whereas the denominator is the probability of the conditioning event B. A Venn diagram illuminates this relationship (Figure 2.8).

Figure 2.8 Motivating the definition of conditional probability

Given that B has occurred, the relevant sample space is no longer S but consists of outcomes in B; A has occurred if and only if one of the outcomes in the intersection occurred, so the conditional probability of A given B is proportional to $P(A \cap B)$. The proportionality constant $1/P(B)$ is used to ensure that the probability $P(B \mid B)$ of the new sample space B equals 1.

The Definition of Conditional Probability

Example 2.24 demonstrates that when outcomes are equally likely, computation of conditional probabilities can be based on intuition. When experiments are more complicated, though, intuition may fail us, so a general definition of conditional probability is needed that will yield intuitive answers in simple problems. The Venn diagram and Equation (2.2) suggest how to proceed.

DEFINITION

For any two events A and B with $P(B) > 0$, the conditional probability of A given that B has occurred is defined by

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \tag{2.3}$$

Example 2.25

Suppose that of all individuals buying a certain digital camera, 60% include an optional memory card in their purchase, 40% include an extra battery, and 30% include both a card and battery. Consider randomly selecting a buyer and let $A = \{\text{memory card purchased}\}$ and $B = \{\text{battery purchased}\}$. Then $P(A) = .60$, $P(B) = .40$, and $P(\text{both purchased}) = P(A \cap B) = .30$. Given that the selected individual purchased an extra battery, the probability that an optional card was also purchased is

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{.30}{.40} = .75$$

That is, of all those purchasing an extra battery, 75% purchased an optional memory card. Similarly,

$$P(\text{battery} \mid \text{memory card}) = P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{.30}{.60} = .50$$

Notice that $P(A \mid B) \neq P(A)$ and $P(B \mid A) \neq P(B)$. ■
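As a minimal sketch, the definition in Equation (2.3) can be written as a small Python function and applied to the numbers of Example 2.25. The function name `conditional` and its argument names are hypothetical; the check that the conditioning event has positive probability mirrors the requirement $P(B) > 0$ in the definition.

```python
def conditional(p_joint, p_given):
    """Return P(A | B) = P(A and B) / P(B); undefined unless P(B) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

# Probabilities from Example 2.25.
p_card, p_battery, p_both = 0.60, 0.40, 0.30

p_card_given_battery = conditional(p_both, p_battery)   # .30 / .40
p_battery_given_card = conditional(p_both, p_card)      # .30 / .60

print(round(p_card_given_battery, 2))   # 0.75
print(round(p_battery_given_card, 2))   # 0.5
```

The two calls differ only in the denominator, which is why $P(A \mid B)$ and $P(B \mid A)$ are generally different numbers.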


The event whose probability is desired might be a union or intersection of other events, and the same could be true of the conditioning event.

Example 2.26

A news magazine publishes three columns entitled “Art” (A), “Books” (B), and “Cinema” (C). Reading habits of a randomly selected reader with respect to these columns are

Read regularly   A     B     C     A∩B   A∩C   B∩C   A∩B∩C
Probability      .14   .23   .37   .08   .09   .13   .05

Figure 2.9 illustrates relevant probabilities.

We thus have

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{.08}{.23} = .348$$

$$P(A \mid B \cup C) = \frac{P(A \cap (B \cup C))}{P(B \cup C)} = \frac{.04 + .05 + .03}{.47} = \frac{.12}{.47} = .255$$

$$P(A \mid \text{reads at least one}) = P(A \mid A \cup B \cup C) = \frac{P(A \cap (A \cup B \cup C))}{P(A \cup B \cup C)} = \frac{P(A)}{P(A \cup B \cup C)} = \frac{.14}{.49} = .286$$

and

$$P(A \cup B \mid C) = \frac{P((A \cup B) \cap C)}{P(C)} = \frac{.04 + .05 + .08}{.37} = .459$$

Figure 2.9 Venn diagram for Example 2.26

The Multiplication Rule for $P(A \cap B)$

The definition of conditional probability yields the following result, obtained by multiplying both sides of Equation (2.3) by P(B).

The Multiplication Rule

$$P(A \cap B) = P(A \mid B) \cdot P(B)$$

This rule is important because it is often the case that $P(A \cap B)$ is desired, whereas both P(B) and $P(A \mid B)$ can be specified from the problem description. Consideration of $P(B \mid A)$ gives $P(A \cap B) = P(B \mid A) \cdot P(A)$.
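The four conditional probabilities of Example 2.26 can be reproduced from the tabulated values alone. The sketch below is one way to do so in Python; it obtains the unions and the two compound intersections by inclusion-exclusion rather than by reading regions off the Venn diagram, and all variable names are illustrative.

```python
from fractions import Fraction

def pct(x):
    """Exact probability from a whole-number percentage."""
    return Fraction(x, 100)

# Probabilities given in the table of Example 2.26.
P_A, P_B, P_C = pct(14), pct(23), pct(37)
P_AB, P_AC, P_BC, P_ABC = pct(8), pct(9), pct(13), pct(5)

# Probabilities of unions by inclusion-exclusion.
P_B_or_C = P_B + P_C - P_BC                                     # .47
P_A_or_B_or_C = P_A + P_B + P_C - P_AB - P_AC - P_BC + P_ABC    # .49

# A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C), and those two events overlap in A ∩ B ∩ C.
P_A_and_BorC = P_AB + P_AC - P_ABC                              # .12
# (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C), with the same overlap.
P_AorB_and_C = P_AC + P_BC - P_ABC                              # .17

results = {
    "P(A | B)": P_AB / P_B,
    "P(A | B or C)": P_A_and_BorC / P_B_or_C,
    "P(A | at least one)": P_A / P_A_or_B_or_C,
    "P(A or B | C)": P_AorB_and_C / P_C,
}
for name, value in results.items():
    print(f"{name} = {float(value):.3f}")
```

Rounded to three decimals, the printed values agree with .348, .255, .286, and .459 above.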


Example 2.27

Four individuals have responded to a request by a blood bank for blood donations. None of them has donated before, so their blood types are unknown. Suppose only type O+ is desired and only one of the four actually has this type. If the potential donors are selected in random order for typing, what is the probability that at least three individuals must be typed to obtain the desired type?

Making the identification $B = \{\text{first type not O+}\}$ and $A = \{\text{second type not O+}\}$, $P(B) = \frac{3}{4}$. Given that the first type is not O+, two of the three individuals left are not O+, so $P(A \mid B) = \frac{2}{3}$. The multiplication rule now gives

$$P(\text{at least three individuals are typed}) = P(A \cap B) = P(A \mid B) \cdot P(B) = \frac{2}{3} \cdot \frac{3}{4} = \frac{6}{12} = .5$$

The multiplication rule is most useful when the experiment consists of several stages in succession. The conditioning event B then describes the outcome of the first stage and A the outcome of the second, so that $P(A \mid B)$ (conditioning on what occurs first) will often be known. The rule is easily extended to experiments involving more than two stages. For example,

$$P(A_1 \cap A_2 \cap A_3) = P(A_3 \mid A_1 \cap A_2) \cdot P(A_1 \cap A_2) = P(A_3 \mid A_1 \cap A_2) \cdot P(A_2 \mid A_1) \cdot P(A_1) \tag{2.4}$$

where $A_1$ occurs first, followed by $A_2$, and finally $A_3$.

Example 2.28

For the blood typing experiment of Example 2.27,

$$P(\text{third type is O+}) = P(\text{third is} \mid \text{first isn't} \cap \text{second isn't}) \cdot P(\text{second isn't} \mid \text{first isn't}) \cdot P(\text{first isn't}) = \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{4} = .25$$
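The multiplication-rule answers in Examples 2.27 and 2.28 can be cross-checked by brute force. The following Python sketch lists all 24 equally likely typing orders and counts the favorable ones; the donor labels `"x"`, `"y"`, `"z"` are placeholders for the three donors who are not O+.

```python
from fractions import Fraction
from itertools import permutations

# Four donors, exactly one of whom (labelled "O+") has the desired type.
donors = ["O+", "x", "y", "z"]
orderings = list(permutations(donors))   # 4! = 24 equally likely typing orders

def prob(event):
    """Fraction of the equally likely orderings for which event(order) is true."""
    return Fraction(sum(1 for order in orderings if event(order)), len(orderings))

# Example 2.27: at least three are typed exactly when neither of the first two is O+.
p_at_least_three = prob(lambda o: "O+" not in o[:2])

# Example 2.28: the third person typed is the O+ donor.
p_third_is_o_plus = prob(lambda o: o[2] == "O+")

# The same numbers from the multiplication rule.
rule_2_27 = Fraction(2, 3) * Fraction(3, 4)
rule_2_28 = Fraction(1, 2) * Fraction(2, 3) * Fraction(3, 4)

print(p_at_least_three, rule_2_27)     # 1/2 1/2
print(p_third_is_o_plus, rule_2_28)    # 1/4 1/4
```

Enumeration is feasible here only because the sample space is tiny; the multiplication rule gives the same result without listing outcomes.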

When the experiment of interest consists of a sequence of several stages, it is convenient to represent these with a tree diagram. Once we have an appropriate tree diagram, probabilities and conditional probabilities can be entered on the various branches; this will make repeated use of the multiplication rule quite straightforward.

Example 2.29

A chain of video stores sells three different brands of DVD players. Of its DVD player sales, 50% are brand 1 (the least expensive), 30% are brand 2, and 20% are brand 3. Each manufacturer offers a 1-year warranty on parts and labor. It is known that 25% of brand 1’s DVD players require warranty repair work, whereas the corresponding percentages for brands 2 and 3 are 20% and 10%, respectively.

1. What is the probability that a randomly selected purchaser has bought a brand 1 DVD player that will need repair while under warranty?

2. What is the probability that a randomly selected purchaser has a DVD player that will need repair while under warranty?

3. If a customer returns to the store with a DVD player that needs warranty repair work, what is the probability that it is a brand 1 DVD player? A brand 2 DVD player? A brand 3 DVD player?


The first stage of the problem involves a customer selecting one of the three brands of DVD player. Let $A_i = \{\text{brand } i \text{ is purchased}\}$, for $i = 1, 2$, and 3. Then $P(A_1) = .50$, $P(A_2) = .30$, and $P(A_3) = .20$. Once a brand of DVD player is selected, the second stage involves observing whether the selected DVD player needs warranty repair. With $B = \{\text{needs repair}\}$ and $B' = \{\text{doesn't need repair}\}$, the given information implies that $P(B \mid A_1) = .25$, $P(B \mid A_2) = .20$, and $P(B \mid A_3) = .10$.

The tree diagram representing this experimental situation is shown in Figure 2.10. The initial branches correspond to different brands of DVD players; there are two second-generation branches emanating from the tip of each initial branch, one for “needs repair” and the other for “doesn’t need repair.” The probability $P(A_i)$ appears on the ith initial branch, whereas the conditional probabilities $P(B \mid A_i)$ and $P(B' \mid A_i)$ appear on the second-generation branches. To the right of each second-generation branch corresponding to the occurrence of B, we display the product of probabilities on the branches leading out to that point. This is simply the multiplication rule in action. The answer to the question posed in 1 is thus $P(A_1 \cap B) = P(B \mid A_1) \cdot P(A_1) = .125$. The answer to question 2 is

$$P(B) = P[(\text{brand 1 and repair}) \text{ or } (\text{brand 2 and repair}) \text{ or } (\text{brand 3 and repair})] = P(A_1 \cap B) + P(A_2 \cap B) + P(A_3 \cap B) = .125 + .060 + .020 = .205$$

Finally,

$$P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{.125}{.205} = .61$$

$$P(A_2 \mid B) = \frac{P(A_2 \cap B)}{P(B)} = \frac{.060}{.205} = .29$$

and

$$P(A_3 \mid B) = 1 - P(A_1 \mid B) - P(A_2 \mid B) = .10$$

Figure 2.10 Tree diagram for Example 2.29


Figure 2.11 Partition of B by mutually exclusive and exhaustive $A_i$’s

The initial or prior probability of brand 1 is .50. Once it is known that the selected DVD player needed repair, the posterior probability of brand 1 increases to .61. This is because brand 1 DVD players are more likely to need warranty repair than are the other brands. The posterior probability of brand 3 is $P(A_3 \mid B) = .10$, which is much less than the prior probability $P(A_3) = .20$. ■
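The tree-diagram bookkeeping of Example 2.29 (multiply along each path, add the relevant leaves, then divide) can be sketched in a few lines of Python. The dictionary names are illustrative; the numbers are those stated in the example.

```python
# Priors P(A_i) and conditional repair probabilities P(B | A_i) from Example 2.29.
prior = {"brand 1": 0.50, "brand 2": 0.30, "brand 3": 0.20}
repair_given_brand = {"brand 1": 0.25, "brand 2": 0.20, "brand 3": 0.10}

# Multiplication rule along each "repair" path of the tree: P(A_i and B).
joint = {brand: repair_given_brand[brand] * prior[brand] for brand in prior}

# Summing the "repair" leaves gives P(B).
p_repair = sum(joint.values())

# Dividing each leaf by P(B) gives the posterior P(A_i | B).
posterior = {brand: joint[brand] / p_repair for brand in prior}

print(round(p_repair, 3))   # 0.205
for brand in prior:
    print(brand, round(joint[brand], 3), round(posterior[brand], 2))
```

Because the posteriors are the leaves rescaled by a common factor, they sum to 1, which is why the text can obtain the brand 3 posterior by subtraction.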

Bayes’ Theorem

The computation of a posterior probability $P(A_j \mid B)$ from given prior probabilities $P(A_i)$ and conditional probabilities $P(B \mid A_i)$ occupies a central position in elementary probability. The general rule for such computations, which is really just a simple application of the multiplication rule, goes back to Reverend Thomas Bayes, who lived in the eighteenth century. To state it we first need another result. Recall that events $A_1, \ldots, A_k$ are mutually exclusive if no two have any common outcomes. The events are exhaustive if one $A_i$ must occur, so that $A_1 \cup \cdots \cup A_k = S$.

The Law of Total Probability

Let $A_1, \ldots, A_k$ be mutually exclusive and exhaustive events. Then for any other event B,

$$P(B) = P(B \mid A_1)P(A_1) + \cdots + P(B \mid A_k)P(A_k) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i) \tag{2.5}$$

Proof Because the $A_i$’s are mutually exclusive and exhaustive, if B occurs it must be in conjunction with exactly one of the $A_i$’s. That is, $B = (A_1 \cap B) \cup \cdots \cup (A_k \cap B)$, where the events $(A_i \cap B)$ are mutually exclusive. This “partitioning of B” is illustrated in Figure 2.11. Thus

$$P(B) = \sum_{i=1}^{k} P(A_i \cap B) = \sum_{i=1}^{k} P(B \mid A_i)P(A_i)$$

as desired.

Example 2.30

An individual has 3 different email accounts. Most of her messages, in fact 70%, come into account #1, whereas 20% come into account #2 and the remaining 10% into account #3. Of the messages into account #1, only 1% are spam, whereas the corresponding percentages for accounts #2 and #3 are 2% and 5%, respectively. What is the probability that a randomly selected message is spam?

To answer this question, let’s first establish some notation:

$$A_i = \{\text{message is from account } \# i\} \text{ for } i = 1, 2, 3, \qquad B = \{\text{message is spam}\}$$


Then the given percentages imply that

$$P(A_1) = .70, \quad P(A_2) = .20, \quad P(A_3) = .10$$

$$P(B \mid A_1) = .01, \quad P(B \mid A_2) = .02, \quad P(B \mid A_3) = .05$$

Now it is simply a matter of substituting into the equation for the law of total probability:

$$P(B) = (.01)(.70) + (.02)(.20) + (.05)(.10) = .016$$

In the long run, 1.6% of this individual’s messages will be spam. ■
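The substitution into Equation (2.5) can be expressed as a general-purpose helper. The sketch below is illustrative only: the function name `total_probability` and its argument names are hypothetical, and the check that the priors sum to 1 is a simple stand-in for the requirement that the $A_i$ be mutually exclusive and exhaustive.

```python
import math

def total_probability(priors, conditionals):
    """Return P(B) = sum_i P(B | A_i) P(A_i) for a partition A_1, ..., A_k.

    priors[i] is P(A_i) and conditionals[i] is P(B | A_i).
    """
    if len(priors) != len(conditionals):
        raise ValueError("need one conditional probability per event A_i")
    # Mutually exclusive and exhaustive events have probabilities summing to 1.
    if not math.isclose(sum(priors), 1.0):
        raise ValueError("the priors of an exhaustive partition must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))

# Values from Example 2.30.
p_spam = total_probability(priors=[0.70, 0.20, 0.10],
                           conditionals=[0.01, 0.02, 0.05])
print(round(p_spam, 3))   # 0.016
```

The result is a weighted average of the three spam rates, with the account shares as weights, so it must lie between the smallest rate (.01) and the largest (.05).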

Bayes’ Theorem

Let $A_1, A_2, \ldots, A_k$ be a collection of k mutually exclusive and exhaustive events with prior probabilities $P(A_i)$ $(i = 1, \ldots, k)$. Then for any other event B for which $P(B) > 0$, the posterior probability of $A_j$ given that B has occurred is

$$P(A_j \mid B) = \frac{P(A_j \cap B)}{P(B)} = \frac{P(B \mid A_j)P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i) \cdot P(A_i)} \qquad j = 1, \ldots, k \tag{2.6}$$

The transition from the second to the third expression in (2.6) rests on using the multiplication rule in the numerator and the law of total probability in the denominator. The proliferation of events and subscripts in (2.6) can be a bit intimidating to probability newcomers. As long as there are relatively few events in the partition, a tree diagram (as in Example 2.29) can be used as a basis for calculating posterior probabilities without ever referring explicitly to Bayes’ theorem.

Example 2.31

Incidence of a rare disease. Only 1 in 1000 adults is afflicted with a rare disease for which a diagnostic test has been developed. The test is such that when an individual actually has the disease, a positive result will occur 99% of the time, whereas an individual without the disease will show a positive test result only 2% of the time. If a randomly selected individual is tested and the result is positive, what is the probability that the individual has the disease?

To use Bayes’ theorem, let $A_1$ = individual has the disease, $A_2$ = individual does not have the disease, and $B$ = positive test result. Then $P(A_1) = .001$, $P(A_2) = .999$, $P(B \mid A_1) = .99$, and $P(B \mid A_2) = .02$. The tree diagram for this problem is in Figure 2.12.

Figure 2.12 Tree diagram for the rare-disease problem


Next to each branch corresponding to a positive test result, the multiplication rule yields the recorded probabilities. Therefore, $P(B) = .00099 + .01998 = .02097$, from which we have

$$P(A_1 \mid B) = \frac{P(A_1 \cap B)}{P(B)} = \frac{.00099}{.02097} = .047$$

This result seems counterintuitive; the diagnostic test appears so accurate that we expect someone with a positive test result to be highly likely to have the disease, whereas the computed conditional probability is only .047. However, the rarity of the disease implies that most positive test results arise from errors rather than from diseased individuals. The probability of having the disease has increased by a multiplicative factor of 47 (from prior .001 to posterior .047); but to get a further increase in the posterior probability, a diagnostic test with much smaller error rates is needed. ■
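The closing remark of Example 2.31, that a much smaller error rate is needed to raise the posterior further, can be explored numerically. The Python sketch below applies Bayes’ theorem with the two-event partition of the example; the function and parameter names are hypothetical, and the alternative false-positive rates in the loop are made-up values for illustration, not figures from the text.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' theorem with a two-event partition."""
    true_positive = sensitivity * prevalence                    # P(A1 and B)
    false_positive = false_positive_rate * (1 - prevalence)     # P(A2 and B)
    return true_positive / (true_positive + false_positive)

# Values from Example 2.31.
p = posterior_given_positive(prevalence=0.001, sensitivity=0.99,
                             false_positive_rate=0.02)
print(round(p, 3))   # 0.047

# Hypothetical smaller false-positive rates (not from the text), same prevalence.
for fpr in (0.02, 0.002, 0.0002):
    print(fpr, round(posterior_given_positive(0.001, 0.99, fpr), 3))
```

With the prevalence fixed at .001, the posterior is driven mainly by how the false-positive term compares with the true-positive term .00099 in the denominator.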

EXERCISES Section 2.4 (45–69)

45. The population of a particular country consists of three ethnic groups. Each individual belongs to one of the four major blood groups. The accompanying joint probability table gives the proportions of individuals in the various ethnic group–blood group combinations.

                      Blood Group
Ethnic Group     O       A       B       AB
1                .082    .106    .008    .004
2                .135    .141    .018    .006
3                .215    .200    .065    .020

Suppose that an individual is randomly selected from the population, and define events by $A = \{\text{type A selected}\}$, $B = \{\text{type B selected}\}$, and $C = \{\text{ethnic group 3 selected}\}$.

a. Calculate P(A), P(C), and $P(A \cap C)$.

b. Calculate both $P(A \mid C)$ and $P(C \mid A)$, and explain in context what each of these probabilities represents.

c. If the selected individual does not have type B blood, what is the probability that he or she is from ethnic group 1?

46. Suppose an individual is randomly selected from the population of all adult males living in the United States. Let A be the event that the selected individual is over 6 ft in height, and let B be the event that the selected individual is a professional basketball player. Which do you think is larger, $P(A \mid B)$ or $P(B \mid A)$? Why?

47. Return to the credit card scenario of Exercise 12 (Section 2.2), where $A = \{\text{Visa}\}$, $B = \{\text{MasterCard}\}$, $P(A) = .5$, $P(B) = .4$, and $P(A \cap B) = .25$. Calculate and interpret each of the following probabilities (a Venn diagram might help).

a. $P(B \mid A)$

b. $P(B' \mid A)$

c. $P(A \mid B)$

d. $P(A' \mid B)$

e. Given that the selected individual has at least one card, what is the probability that he or she has a Visa card?

48. Reconsider the system defect situation described in Exercise 26 (Section 2.2).

a. Given that the system has a type 1 defect, what is the probability that it has a type 2 defect?

b. Given that the system has a type 1 defect, what is the probability that it has all three types of defects?

c. Given that the system has at least one type of defect, what is the probability that it has exactly one type of defect?

d. Given that the system has both of the first two types of defects, what is the probability that it does not have the third type of defect?

49. The accompanying table gives information on the type of coffee selected by someone purchasing a single cup at a particular airport kiosk.

            Small    Medium    Large
Regular     14%      20%       26%
Decaf       20%      10%       10%

Consider randomly selecting such a coffee purchaser.

a. What is the probability that the individual purchased a small cup? A cup of decaf coffee?

b. If we learn that the selected individual purchased a small cup, what now is the probability that he/she chose decaf coffee, and how would you interpret this probability?

c. If we learn that the selected individual purchased decaf, what now is the probability that a small size was selected, and how does this compare to the corresponding unconditional probability of (a)?

50. A department store sells sport shirts in three sizes (small, medium, and large), three patterns (plaid, print, and stripe), and two sleeve lengths (long and short). The accompanying tables give the proportions of shirts sold in the various category combinations.


Short-sleeved

            Pattern
Size     Pl      Pr      St
S        .04     .02     .05
M        .08     .07     .12
L        .03     .07     .08

Long-sleeved

            Pattern
Size     Pl      Pr      St
S        .03     .02     .03
M        .10     .05     .07
L        .04     .02     .08

a. What is the probability that the next shirt sold is a medium, long-sleeved, print shirt?

b. What is the probability that the next shirt sold is a medium print shirt?

c. What is the probability that the next shirt sold is a short-sleeved shirt? A long-sleeved shirt?

d. What is the probability that the size of the next shirt sold is medium? That the pattern of the next shirt sold is a print?

e. Given that the shirt just sold was a short-sleeved plaid, what is the probability that its size was medium?

f. Given that the shirt just sold was a medium plaid, what is the probability that it was short-sleeved? Long-sleeved?

51. One box contains six red balls and four green balls, and a second box contains seven red balls and three green balls. A ball is randomly chosen from the first box and placed in the second box. Then a ball is randomly selected from the second box and placed in the first box.

a. What is the probability that a red ball is selected from the first box and a red ball is selected from the second box?

b. At the conclusion of the selection process, what is the probability that the numbers of red and green balls in the first box are identical to the numbers at the beginning?

52. A system consists of two identical pumps, #1 and #2. If one pump fails, the system will still operate. However, because of the added strain, the remaining pump is now more likely to fail than was originally the case. That is, $r = P(\#2 \text{ fails} \mid \#1 \text{ fails}) > P(\#2 \text{ fails}) = q$. If at least one pump fails by the end of the pump design life in 7% of all systems and both pumps fail during that period in only 1%, what is the probability that pump #1 will fail during the pump design life?

53. A certain shop repairs both audio and video components. Let A denote the event that the next component brought in for repair is an audio component, and let B be the event that the next component is a compact disc player (so the event B is contained in A). Suppose that and . What is ?

54. In Exercise 13, , for . Use the probabilities given there to compute the following probabilities, and explain in words the meaning of each one. a. b. c. d. .

55. Deer ticks can be carriers of either Lyme disease or human granulocytic ehrlichiosis (HGE). Based on a recent study, suppose that 16% of all ticks in a certain location carry Lyme disease, 10% carry HGE, and 10% of the ticks that carry at least one of these diseases in fact carry both of them. If a randomly selected tick is found to have carried HGE, what is the probability that the selected tick is also a carrier of Lyme disease?

56. For any events A and B with , show that .

57. If , show that . [Hint: Add to both sides of the given inequality and then use

the result of Exercise 56.]

58. Show that for any three events A, B, and C with , .

59. At a certain gas station, 40% of the customers use regular gas (A1), 35% use plus gas (A2), and 25% use premium (A3). Of those customers using regular gas, only 30% fill their tanks (event B). Of those customers using plus, 60% fill their tanks, whereas of those using premium, 50% fill their tanks. a. What is the probability that the next customer will

request plus gas and fill the tank ? b. What is the probability that the next customer fills the

tank? c. If the next customer fills the tank, what is the probability

that regular gas is requested? Plus? Premium?

60. Seventy percent of the light aircraft that disappear while in flight in a certain country are subsequently discovered. Of the aircraft that are discovered, 60% have an emer- gency locator, whereas 90% of the aircraft not discovered do not have such a locator. Suppose a light aircraft has disappeared. a. If it has an emergency locator, what is the probability

that it will not be discovered? b. If it does not have an emergency locator, what is the

probability that it will be discovered?

61. Components of a certain type are shipped to a supplier in batches of ten. Suppose that 50% of all such batches contain no defective components, 30% contain one defective compo- nent, and 20% contain two defective components. Two com- ponents from a batch are randomly selected and tested. What

(A2 ¨ B)

P(A ´ B u C) 5 P(A u C) 1 P(B u C) 2 P(A ¨ B u C) P(C) . 0

P(Br u A) P(Br u A) , P(Br)P(B u A) . P(B)

P(A u B) 1 P(Ar u B) 5 1 P(B) . 0

P(A1 ¨ A2 ¨ A3 u A1 ´ A2 ´ A3)P(A2 ´ A3 u A1) P(A2 ¨ A3 u A1)P(A2 u A1)

i 5 1, 2, 3Ai 5 5awarded project i6 P(B u A)

P(B) 5 .05P(A) 5 .6

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

82 CHAPTER 2 Probability

are the probabilities associated with 0, 1, and 2 defective components being in the batch under each of the following conditions? a. Neither tested component is defective. b. One of the two tested components is defective. [Hint:

Draw a tree diagram with three first-generation branches for the three different types of batches.]

62. A company that manufactures video cameras produces a basic model and a deluxe model. Over the past year, 40% of the cameras sold have been of the basic model. Of those buying the basic model, 30% purchase an extended warranty, whereas 50% of all deluxe purchasers do so. If you learn that a randomly selected purchaser has an extended warranty, how likely is it that he or she has a basic model?

63. For customers purchasing a refrigerator at a certain appliance store, let A be the event that the refrigerator was manufactured in the U.S., B be the event that the refrigerator had an icemaker, and C be the event that the customer purchased an extended warranty. Relevant probabilities are

$P(A) = .75$, $P(B \mid A) = .9$, $P(B \mid A') = .8$

$P(C \mid A \cap B) = .8$, $P(C \mid A \cap B') = .6$

$P(C \mid A' \cap B) = .7$, $P(C \mid A' \cap B') = .3$

a. Construct a tree diagram consisting of first-, second-, and third-generation branches, and place an event label and appropriate probability next to each branch.

b. Compute $P(A \cap B \cap C)$.

c. Compute $P(B \cap C)$.

d. Compute P(C).

e. Compute $P(A \mid B \cap C)$, the probability of a U.S. purchase given that an icemaker and extended warranty are also purchased.

64. The Reviews editor for a certain scientific journal decides whether the review for any particular book should be short (1–2 pages), medium (3–4 pages), or long (5–6 pages). Data on recent reviews indicates that 60% of them are short, 30% are medium, and the other 10% are long. Reviews are submitted in either Word or LaTeX. For short reviews, 80% are in Word, whereas 50% of medium reviews are in Word and 30% of long reviews are in Word. Suppose a recent review is randomly selected.

a. What is the probability that the selected review was submitted in Word format?

b. If the selected review was submitted in Word format, what are the posterior probabilities of it being short, medium, or long?

65. A large operator of timeshare complexes requires anyone interested in making a purchase to first visit the site of interest. Historical data indicates that 20% of all potential purchasers select a day visit, 50% choose a one-night visit, and 30% opt for a two-night visit. In addition, 10% of day visitors ultimately make a purchase, 30% of one-night visitors buy a unit, and 20% of those visiting for two nights decide to buy. Suppose a visitor is randomly selected and is found to have made a purchase. How likely is it that this person made a day visit? A one-night visit? A two-night visit?

66. Consider the following information about travelers on vacation (based partly on a recent Travelocity poll): 40% check work email, 30% use a cell phone to stay connected to work, 25% bring a laptop with them, 23% both check work email and use a cell phone to stay connected, and 51% neither check work email nor use a cell phone to stay connected nor bring a laptop. In addition, 88 out of every 100 who bring a laptop also check work email, and 70 out of every 100 who use a cell phone to stay connected also bring a laptop.

a. What is the probability that a randomly selected traveler who checks work email also uses a cell phone to stay connected?

b. What is the probability that someone who brings a laptop on vacation also uses a cell phone to stay connected?

c. If the randomly selected traveler checked work email and brought a laptop, what is the probability that he/she uses a cell phone to stay connected?

67. There has been a great deal of controversy over the last several years regarding what types of surveillance are appropriate to prevent terrorism. Suppose a particular surveillance system has a 99% chance of correctly identifying a future terrorist and a 99.9% chance of correctly identifying someone who is not a future terrorist. If there are 1000 future terrorists in a population of 300 million, and one of these 300 million is randomly selected, scrutinized by the system, and identified as a future terrorist, what is the probability that he/she actually is a future terrorist? Does the value of this probability make you uneasy about using the surveillance system? Explain.

68. A friend who lives in Los Angeles makes frequent consulting trips to Washington, D.C.; 50% of the time she travels on airline #1, 30% of the time on airline #2, and the remaining 20% of the time on airline #3. For airline #1, flights are late into D.C. 30% of the time and late into L.A. 10% of the time. For airline #2, these percentages are 25% and 20%, whereas for airline #3 the percentages are 40% and 25%. If we learn that on a particular trip she arrived late at exactly one of the two destinations, what are the posterior probabilities of having flown on airlines #1, #2, and #3? Assume that the chance of a late arrival in L.A. is unaffected by what happens on the flight to D.C. [Hint: From the tip of each first-generation branch on a tree diagram, draw three second-generation branches labeled, respectively, 0 late, 1 late, and 2 late.]

69. In Exercise 59, consider the following additional information on credit card usage:

70% of all regular fill-up customers use a credit card.
50% of all regular non-fill-up customers use a credit card.
60% of all plus fill-up customers use a credit card.
50% of all plus non-fill-up customers use a credit card.
50% of all premium fill-up customers use a credit card.
40% of all premium non-fill-up customers use a credit card.

Compute the probability of each of the following events for the next customer to arrive (a tree diagram might help).

a. {plus and fill-up and credit card}

b. {premium and non-fill-up and credit card}

c. {premium and credit card}

d. {fill-up and credit card}

e. {credit card}

f. If the next customer uses a credit card, what is the probability that premium was requested?

2.5 Independence

The definition of conditional probability enables us to revise the probability P(A) originally assigned to A when we are subsequently informed that another event B has occurred; the new probability of A is $P(A \mid B)$. In our examples, it was frequently the case that $P(A \mid B)$ differed from the unconditional probability P(A), indicating that the information “B has occurred” resulted in a change in the chance of A occurring. Often the chance that A will occur or has occurred is not affected by knowledge that B has occurred, so that $P(A \mid B) = P(A)$. It is then natural to regard A and B as independent events, meaning that the occurrence or nonoccurrence of one event has no bearing on the chance that the other will occur.

DEFINITION

Two events A and B are independent if $P(A \mid B) = P(A)$ and are dependent otherwise.

The definition of independence might seem “unsymmetric” because we do not also demand that $P(B \mid A) = P(B)$. However, using the definition of conditional probability and the multiplication rule,

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)} = \frac{P(A \mid B)P(B)}{P(A)} \qquad (2.7)$$

The right-hand side of Equation (2.7) is P(B) if and only if $P(A \mid B) = P(A)$ (independence), so the equality in the definition implies the other equality (and vice versa). It is also straightforward to show that if A and B are independent, then so are the following pairs of events: (1) $A'$ and $B$, (2) $A$ and $B'$, and (3) $A'$ and $B'$.

Example 2.32

Consider a gas station with six pumps numbered 1, 2, . . . , 6, and let $E_i$ denote the simple event that a randomly selected customer uses pump $i$ $(i = 1, \ldots, 6)$. Suppose that

$P(E_1) = P(E_6) = .10$, $P(E_2) = P(E_5) = .15$, $P(E_3) = P(E_4) = .25$

Define events A, B, C by

$A = \{2, 4, 6\}$, $B = \{1, 2, 3\}$, $C = \{2, 3, 4, 5\}$.

We then have $P(A) = .50$, $P(A \mid B) = .30$, and $P(A \mid C) = .50$. That is, events A and B are dependent, whereas events A and C are independent. Intuitively, A and C are independent because the relative division of probability among even- and odd-numbered pumps is the same among pumps 2, 3, 4, 5 as it is among all six pumps. ■

Example 2.33

Let A and B be any two mutually exclusive events with $P(A) > 0$. For example, for a randomly chosen automobile, let $A = \{\text{the car has a four cylinder engine}\}$ and $B = \{\text{the car has a six cylinder engine}\}$. Since the events are mutually exclusive, if B occurs, then A cannot possibly have occurred, so $P(A \mid B) = 0 \neq P(A)$. The message here is that if two events are mutually exclusive, they cannot be independent. When A and B are mutually exclusive, the information that A occurred says something about B (it cannot have occurred), so independence is precluded. ■
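The check in Example 2.32 can also be carried out mechanically. The following is a minimal Python sketch, not part of the text: the helper names `prob`, `cond_prob`, and `independent` are illustrative assumptions, and exact fractions are used so that the equality test in the definition is not disturbed by rounding.

```python
from fractions import Fraction

# Pump probabilities P(E_i) from Example 2.32, stored as exact fractions.
pump_prob = {
    1: Fraction(10, 100), 6: Fraction(10, 100),
    2: Fraction(15, 100), 5: Fraction(15, 100),
    3: Fraction(25, 100), 4: Fraction(25, 100),
}

def prob(event):
    """P(event) for an event given as a set of pump numbers."""
    return sum((pump_prob[i] for i in event), Fraction(0))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); assumes P(b) > 0."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """Definition check: a and b are independent if P(a | b) = P(a)."""
    return cond_prob(a, b) == prob(a)

A, B, C = {2, 4, 6}, {1, 2, 3}, {2, 3, 4, 5}

# P(A), P(A | B), P(A | C) as decimals.
print(float(prob(A)), float(cond_prob(A, B)), float(cond_prob(A, C)))
# A and B are dependent; A and C are independent.
print(independent(A, B), independent(A, C))
# Symmetry implied by Equation (2.7): P(C | A) = P(C) as well.
print(independent(C, A))
```

The same helpers illustrate Example 2.33: for two mutually exclusive events with positive probability, such as the hypothetical pair {1, 3} and {2, 4}, `cond_prob` returns 0 and `independent` returns False.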

The Multiplication Rule for $P(A \cap B)$

Frequently the nature of an experiment suggests that two events A and B should be assumed independent. This is the case, for example, if a manufacturer receives a circuit board from each of two different suppliers, each board is tested on arrival, and $A = \{\text{first is defective}\}$ and $B = \{\text{second is defective}\}$. If $P(A) = .1$, it should also be the case that $P(A \mid B) = .1$; knowing the condition of the second board shouldn’t provide information about the condition of the first. The probability that both events will occur is easily calculated from the individual event probabilities when the events are independent.

PROPOSITION

A and B are independent if and only if (iff)

$$P(A \cap B) = P(A) \cdot P(B) \qquad (2.8)$$

The verification of this multiplication rule is as follows:

$$P(A \cap B) = P(A \mid B) \cdot P(B) = P(A) \cdot P(B) \qquad (2.9)$$

where the second equality in Equation (2.9) is valid iff A and B are independent. Equivalence of independence and Equation (2.8) implies that the latter can be used as a definition of independence.

Example 2.34

It is known that 30% of a certain company’s washing machines require service while under warranty, whereas only 10% of its dryers need such service. If someone purchases both a washer and a dryer made by this company, what is the probability that both machines will need warranty service?

Let A denote the event that the washer needs service while under warranty, and let B be defined analogously for the dryer. Then $P(A) = .30$ and $P(B) = .10$. Assuming that the two machines will function independently of one another, the desired probability is

$$P(A \cap B) = P(A) \cdot P(B) = (.30)(.10) = .03$$ ■

It is straightforward to show that A and B are independent iff $A'$ and $B$ are independent, $A$ and $B'$ are independent, and $A'$ and $B'$ are independent. Thus in Example 2.34, the probability that neither machine needs service is

$$P(A' \cap B') = P(A') \cdot P(B') = (.70)(.90) = .63$$

Example 2.35

Each day, Monday through Friday, a batch of components sent by a first supplier arrives at a certain inspection facility. Two days a week, a batch also arrives from a second supplier. Eighty percent of all supplier 1’s batches pass inspection, and 90% of supplier 2’s do likewise. What is the probability that, on a randomly selected day, two batches pass inspection? We will answer this assuming that on days when two batches are tested, whether the first batch passes is independent of whether the second batch does so. Figure 2.13 displays the relevant information.

[Figure 2.13 Tree diagram for Example 2.35. First-generation branches: 1 batch (.6) and 2 batches (.4). Under 1 batch: passes (.8) or fails (.2). Under 2 batches: 1st passes (.8) or 1st fails (.2), each followed by 2nd passes (.9) or 2nd fails (.1). The path 2 batches, 1st passes, 2nd passes has probability .4 × (.8 × .9).]

$$\begin{aligned} P(\text{two pass}) &= P(\text{two received} \cap \text{both pass}) \\ &= P(\text{both pass} \mid \text{two received}) \cdot P(\text{two received}) \\ &= [(.8)(.9)](.4) = .288 \end{aligned}$$ ■

Independence of More Than Two Events

The notion of independence of two events can be extended to collections of more than two events. Although it is possible to extend the definition for two independent events by working in terms of conditional and unconditional probabilities, it is more direct and less cumbersome to proceed along the lines of the last proposition.
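As a numerical recap of that proposition before it is extended, the sketch below recomputes the probabilities of Examples 2.34 and 2.35 with the multiplication rule (2.8). It is an illustration only: the function name `both_occur` and the variable names are assumptions introduced here, and the code simply assumes independence rather than verifying it.

```python
def both_occur(p_a, p_b):
    """P(A and B) for events assumed independent, via Equation (2.8)."""
    return p_a * p_b

# Example 2.34: washer (A) and dryer (B) needing warranty service.
p_washer, p_dryer = 0.30, 0.10
p_both_need_service = both_occur(p_washer, p_dryer)   # (.30)(.10)
p_neither = both_occur(1 - p_washer, 1 - p_dryer)     # (.70)(.90), using A' and B'

# Example 2.35: two batches arrive on 2 of 5 days; pass rates are .8 and .9.
p_two_received = 0.4
p_both_pass_given_two = both_occur(0.8, 0.9)
p_two_pass = p_both_pass_given_two * p_two_received   # [(.8)(.9)](.4)

print(f"{p_both_need_service:.3f} {p_neither:.3f} {p_two_pass:.3f}")
```

Note that only the step inside `both_occur` relies on independence; the final product in Example 2.35 is the general multiplication rule for a conditional probability times the probability of the conditioning event.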

DEFINITION

Events $A_1, \ldots, A_n$ are mutually independent if for every $k$ $(k = 2, 3, \ldots, n)$ and every subset of indices $i_1, i_2, \ldots, i_k$,

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdot P(A_{i_2}) \cdots P(A_{i_k})$$

To paraphrase the definition, the events are mutually independent if the probability of the intersection of any subset of the n events is equal to the product of the individual probabilities. In using the multiplication property for more than two independent events, it is legitimate to replace one or more of the $A_i$'s by their complements (e.g., if $A_1$, $A_2$, and $A_3$ are independent events, so are $A_1'$, $A_2'$, and $A_3'$). As was the case with two events, we frequently specify at the outset of a problem the independence of certain events. The probability of an intersection can then be calculated via multiplication.
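The phrase "every subset" in the definition matters: the product rule can hold for every pair of events and still fail for three of them together. The sketch below illustrates this with a hypothetical experiment that is not taken from the text (two tosses of a fair coin); the function names are likewise assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical experiment (not from the text): two tosses of a fair coin,
# so the four outcomes are equally likely.
OUTCOMES = frozenset({"HH", "HT", "TH", "TT"})

def prob(event):
    """P(event) when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def product_rule_holds(events):
    """True if P(intersection of the events) equals the product of the P(A_i)."""
    joint = set(OUTCOMES)
    product = Fraction(1)
    for event in events:
        joint &= event
        product *= prob(event)
    return prob(joint) == product

def mutually_independent(events):
    """Check the product rule for every subset of size k = 2, ..., n."""
    return all(
        product_rule_holds(subset)
        for k in range(2, len(events) + 1)
        for subset in combinations(events, k)
    )

A = {"HH", "HT"}  # first toss is heads
B = {"HH", "TH"}  # second toss is heads
C = {"HH", "TT"}  # the two tosses match

pairwise = all(product_rule_holds(pair) for pair in combinations([A, B, C], 2))
print(pairwise, mutually_independent([A, B, C]))
```

Here each pair satisfies the product rule (each pairwise intersection is the single outcome HH, with probability 1/4 = (1/2)(1/2)), but the three-way intersection also has probability 1/4 rather than 1/8, so the events are not mutually independent.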

Example 2.36

The article “Reliability Evaluation of Solar Photovoltaic Arrays” (Solar Energy, 2002: 129–141) presents various configurations of solar photovoltaic arrays consisting of crystalline silicon solar cells. Consider first the system illustrated in Figure 2.14(a).

[Figure 2.14 System configurations for Example 2.36: (a) series-parallel; (b) total-cross-tied. In each panel, cells 1, 2, 3 form the top row and cells 4, 5, 6 the bottom row.]

There are two subsystems connected in parallel, each one containing three cells. In order for the system to function, at least one of the two parallel subsystems must work. Within each subsystem, the three cells are connected in series, so a subsystem will work only if all cells in the subsystem work. Consider a particular lifetime value $t_0$, and suppose we want to determine the probability that the system lifetime exceeds $t_0$. Let $A_i$ denote the event that the lifetime of cell $i$ exceeds $t_0$ $(i = 1, 2, \ldots, 6)$. We assume that the $A_i$'s are independent events (whether any particular cell lasts more than $t_0$ hours has no bearing on whether or not any other cell does) and that $P(A_i) = .9$ for every $i$ since the cells are identical. Then

$$\begin{aligned} P(\text{system lifetime exceeds } t_0) &= P[(A_1 \cap A_2 \cap A_3) \cup (A_4 \cap A_5 \cap A_6)] \\ &= P(A_1 \cap A_2 \cap A_3) + P(A_4 \cap A_5 \cap A_6) - P[(A_1 \cap A_2 \cap A_3) \cap (A_4 \cap A_5 \cap A_6)] \\ &= (.9)(.9)(.9) + (.9)(.9)(.9) - (.9)(.9)(.9)(.9)(.9)(.9) = .927 \end{aligned}$$

Alternatively,

$$\begin{aligned} P(\text{system lifetime exceeds } t_0) &= 1 - P(\text{both subsystem lives are} \le t_0) \\ &= 1 - [P(\text{subsystem life is} \le t_0)]^2 \\ &= 1 - [1 - P(\text{subsystem life is} > t_0)]^2 \\ &= 1 - [1 - (.9)^3]^2 = .927 \end{aligned}$$

Next consider the total-cross-tied system shown in Figure 2.14(b), obtained from the series-parallel array by connecting ties across each column of junctions. Now the system fails as soon as an entire column fails, and system lifetime exceeds $t_0$ only if the life of every column does so. For this configuration,

$$\begin{aligned} P(\text{system lifetime is at least } t_0) &= [P(\text{column lifetime exceeds } t_0)]^3 \\ &= [1 - P(\text{column lifetime is} \le t_0)]^3 \\ &= [1 - P(\text{both cells in a column have lifetime} \le t_0)]^3 \\ &= [1 - (1 - .9)^2]^3 = .970 \end{aligned}$$ ■
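The two reliability calculations in Example 2.36 follow one pattern: multiply probabilities along a series connection, and multiply failure probabilities across a parallel connection. A minimal Python sketch of that pattern is given below. The helper names `series` and `parallel` are assumptions made for this illustration, and both helpers presume that the parts passed to them work independently, which holds here because the subsystems (or columns) share no cells.

```python
def series(probs):
    """P(all parts work) for independent parts connected in series."""
    result = 1.0
    for p in probs:
        result *= p
    return result

def parallel(probs):
    """P(at least one part works) for independent parts connected in parallel."""
    all_fail = 1.0
    for p in probs:
        all_fail *= (1.0 - p)
    return 1.0 - all_fail

p = 0.9  # P(A_i), the same for all six cells

# (a) Series-parallel: two parallel subsystems, each with three cells in series.
series_parallel = parallel([series([p] * 3), series([p] * 3)])

# (b) Total-cross-tied: three columns in series, each with two cells in parallel.
total_cross_tied = series([parallel([p] * 2)] * 3)

print(f"{series_parallel:.3f} {total_cross_tied:.3f}")
```

Changing `p` shows how the gap between the two configurations varies with the reliability of an individual cell.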

EXERCISES Section 2.5 (70–89)

70. Reconsider the credit card scenario of Exercise 47 (Section 2.4), and show that A and B are dependent first by using the definition of independence and then by verifying that the multiplication property does not hold.

71. An oil exploration company currently has two active projects, one in Asia and the other in Europe. Let A be the event that the Asian project is successful and B be the event that the European project is successful. Suppose that A and B are independent events with $P(A) = .4$ and $P(B) = .7$.

a. If the Asian project is not successful, what is the probability that the European project is also not successful? Explain your reasoning.

b. What is the probability that at least one of the two projects will be successful?

c. Given that at least one of the two projects is successful, what is the probability that only the Asian project is successful?

72. In Exercise 13, is any $A_i$ independent of any other $A_j$? Answer using the multiplication property for independent events.

73. If A and B are independent events, show that $A'$ and $B$ are also independent. [Hint: First establish a relationship between $P(A' \cap B)$, $P(B)$, and $P(A \cap B)$.]

74. The proportions of blood phenotypes in the U.S. population are as follows:

A B AB O
.40 .11 .04 .45

Assuming that the phenotypes of two randomly selected individuals are independent of one another, what is the probability that both phenotypes are O? What is the probability that the phenotypes of two randomly selected individuals match?

75. One of the assumptions underlying the theory of control charting (see Chapter 16) is that successive plotted points are independent of one another. Each plotted point can signal either that a manufacturing process is operating correctly or that there is some sort of malfunction. Even when a process is running correctly, there is a small probability that a particular point will signal a problem with the process. Suppose that this probability is .05. What is the probability that at least one of 10 successive points indicates a problem when in fact the process is operating correctly? Answer this question for 25 successive points.

76. In October, 1994, a flaw in a certain Pentium chip installed in computers was discovered that could result in a wrong answer when performing a division. The manufacturer initially claimed that the chance of any particular division being incorrect was only 1 in 9 billion, so that it would take thousands of years before a typical user encountered a mistake. However, statisticians are not typical users; some modern statistical techniques are so computationally intensive that a billion divisions over a short time period is not outside the realm of possibility. Assuming that the 1 in 9 billion figure is correct and that results of different divisions are independent of one another, what is the probability that at least one error occurs in one billion divisions with this chip?

77. An aircraft seam requires 25 rivets. The seam will have to be reworked if any of these rivets is defective. Suppose rivets are defective independently of one another, each with the same probability.

a. If 20% of all seams need reworking, what is the probability that a rivet is defective?

b. How small should the probability of a defective rivet be to ensure that only 10% of all seams need reworking?

78. A boiler has five identical relief valves. The probability that any particular valve will open on demand is .95. Assuming independent operation of the valves, calculate P(at least one valve opens) and P(at least one valve fails to open).

79. Two pumps connected in parallel fail independently of one another on any given day. The probability that only the older pump will fail is .10, and the probability that only the newer pump will fail is .05. What is the probability that the pumping system will fail on any given day (which happens if both pumps fail)?

80. Consider the system of components connected as in the accompanying picture. Components 1 and 2 are connected in parallel, so that subsystem works iff either 1 or 2 works; since 3 and 4 are connected in series, that subsystem works iff both 3 and 4 work. If components work independently of one another and P(component works) = .9, calculate P(system works).

[Picture for Exercise 80: diagram of the system built from components 1, 2, 3, and 4; image not reproduced.]

81. Refer back to the series-parallel system configuration introduced in Example 2.36, and suppose that there are only two cells rather than three in each parallel subsystem [in Figure 2.14(a), eliminate cells 3 and 6, and renumber cells 4 and 5 as 3 and 4]. Using $P(A_i) = .9$, the probability that system lifetime exceeds $t_0$ is easily seen to be .9639. To what value would .9 have to be changed in order to increase the system lifetime reliability from .9639 to .99? [Hint: Let $P(A_i) = p$, express system reliability in terms of $p$, and then let $x = p^2$.]

82. Consider independently rolling two fair dice, one red and the other green. Let A be the event that the red die shows 3 dots, B be the event that the green die shows 4 dots, and C be the event that the total number of dots showing on the two dice is 7. Are these events pairwise independent (i.e., are A and B independent events, are A and C independent, and are B and C independent)? Are the three events mutually independent?

83. Components arriving at a distributor are checked for defects by two different inspectors (each component is checked by both inspectors). The first inspector detects 90% of all defectives that are present, and the second inspector does likewise. At least one inspector does not detect a defect on 20% of all defective components. What is the probability that the following occur?

a. A defective component will be detected only by the first inspector? By exactly one of the two inspectors?

b. All three defective components in a batch escape detection by both inspectors (assuming inspections of different components are independent of one another)?

84. Seventy percent of all vehicles examined at a certain emissions inspection station pass the inspection. Assuming that successive vehicles pass or fail independently of one another, calculate the following probabilities:

a. P(all of the next three vehicles inspected pass)

b. P(at least one of the next three inspected fails)

c. P(exactly one of the next three inspected passes)

d. P(at most one of the next three vehicles inspected passes)

e. Given that at least one of the next three vehicles passes inspection, what is the probability that all three pass (a conditional probability)?

85. A quality control inspector is inspecting newly produced items for faults. The inspector searches an item for faults in a series of independent fixations, each of a fixed duration. Given that a flaw is actually present, let p denote the probability that the flaw is detected during any one fixation (this model is discussed in “Human Performance in Sampling Inspection,” Human Factors, 1979: 99–105).

a. Assuming that an item has a flaw, what is the probability that it is detected by the end of the second fixation (once a flaw has been detected, the sequence of fixations terminates)?

b. Give an expression for the probability that a flaw will be detected by the end of the nth fixation.

c. If when a flaw has not been detected in three fixations, the item is passed, what is the probability that a flawed item will pass inspection?

d. Suppose 10% of all items contain a flaw [P(randomly chosen item is flawed) = .1]. With the assumption of part (c), what is the probability that a randomly chosen item will pass inspection (it will automatically pass if it is not flawed, but could also pass if it is flawed)?

e. Given that an item has passed inspection (no flaws in three fixations), what is the probability that it is actually flawed? Calculate for $p = .5$.

86. a. A lumber company has just taken delivery on a lot of 10,000 $2 \times 4$ boards. Suppose that 20% of these boards (2,000) are actually too green to be used in first-quality construction. Two boards are selected at random, one after the other. Let $A = \{\text{the first board is green}\}$ and $B = \{\text{the second board is green}\}$. Compute P(A), P(B), and $P(A \cap B)$ (a tree diagram might help). Are A and B independent?

b. With A and B independent and $P(A) = P(B) = .2$, what is $P(A \cap B)$? How much difference is there between this answer and $P(A \cap B)$ in part (a)? For purposes of calculating $P(A \cap B)$, can we assume that A and B of part (a) are independent to obtain essentially the correct probability?

c. Suppose the lot consists of ten boards, of which two are green. Does the assumption of independence now yield approximately the correct answer for $P(A \cap B)$? What is the critical difference between the situation here and that of part (a)? When do you think an independence assumption would be valid in obtaining an approximately correct answer to $P(A \cap B)$?

87. Consider randomly selecting a single individual and having that person test drive 3 different vehicles. Define events $A_1$, $A_2$, and $A_3$ by

$A_1$ = likes vehicle #1, $A_2$ = likes vehicle #2, $A_3$ = likes vehicle #3

Suppose that $P(A_1) = .55$, $P(A_2) = .65$, $P(A_3) = .70$, $P(A_1 \cup A_2) = .80$, $P(A_2 \cap A_3) = .40$, and $P(A_1 \cup A_2 \cup A_3) = .88$.

a. What is the probability that the individual likes both vehicle #1 and vehicle #2?

b. Determine and interpret $P(A_2 \mid A_3)$.

c. Are $A_2$ and $A_3$ independent events? Answer in two different ways.

d. If you learn that the individual did not like vehicle #1, what now is the probability that he/she liked at least one of the other two vehicles?

88. Professor Stan der Deviation can take one of two routes on his way home from work. On the first route, there are four railroad crossings. The probability that he will be stopped by a train at any particular one of the crossings is .1, and trains operate independently at the four crossings. The other route is longer but there are only two crossings, independent of one another, with the same stoppage probability for each as on the first route. On a particular day, Professor Deviation has a meeting scheduled at home for a certain time. Whichever route he takes, he calculates that he will be late if he is stopped by trains at at least half the crossings encountered.

a. Which route should he take to minimize the probability of being late to the meeting?

b. If he tosses a fair coin to decide on a route and he is late, what is the probability that he took the four-crossing route?

89. Suppose identical tags are placed on both the left ear and the right ear of a fox. The fox is then let loose for a period of time. Consider the two events $C_1 = \{\text{left ear tag is lost}\}$ and $C_2 = \{\text{right ear tag is lost}\}$. Let $\pi = P(C_1) = P(C_2)$, and assume $C_1$ and $C_2$ are independent events. Derive an expression (involving $\pi$) for the probability that exactly one tag is lost, given that at most one is lost (“Ear Tag Loss in Red Foxes,” J. Wildlife Mgmt., 1976: 164–167). [Hint: Draw a tree diagram in which the two initial branches refer to whether the left ear tag was lost.]

SUPPLEMENTARY EXERCISES (90–114)

90. A small manufacturing company will start operating a night shift. There are 20 machinists employed by the company.
a. If a night crew consists of 3 machinists, how many different crews are possible?
b. If the machinists are ranked 1, 2, . . . , 20 in order of competence, how many of these crews would not have the best machinist?
c. How many of the crews would have at least 1 of the 10 best machinists?
d. If one of these crews is selected at random to work on a particular night, what is the probability that the best machinist will not work that night?

91. A factory uses three production lines to manufacture cans of a certain type. The accompanying table gives percentages of nonconforming cans, categorized by type of nonconformance, for each of the three lines during a particular time period.

                     Line 1   Line 2   Line 3
Blemish                15       12       20
Crack                  50       44       40
Pull-Tab Problem       21       28       24
Surface Defect         10        8       15
Other                   4        8        2

During this period, line 1 produced 500 nonconforming cans, line 2 produced 400 such cans, and line 3 was responsible for 600 nonconforming cans. Suppose that one of these 1500 cans is randomly selected.
a. What is the probability that the can was produced by line 1? That the reason for nonconformance is a crack?
b. If the selected can came from line 1, what is the probability that it had a blemish?
c. Given that the selected can had a surface defect, what is the probability that it came from line 1?

92. An employee of the records office at a certain university currently has ten forms on his desk awaiting processing. Six of these are withdrawal petitions and the other four are course substitution requests.
a. If he randomly selects six of these forms to give to a subordinate, what is the probability that only one of the two types of forms remains on his desk?
b. Suppose he has time to process only four of these forms before leaving for the day. If these four are randomly selected one by one, what is the probability that each succeeding form is of a different type from its predecessor?

93. One satellite is scheduled to be launched from Cape Canaveral in Florida, and another launching is scheduled for Vandenberg Air Force Base in California. Let A denote the event that the Vandenberg launch goes off on schedule, and let B represent the event that the Cape Canaveral launch goes off on schedule. If A and B are independent events with P(A) > P(B), P(A ∪ B) = .626, and P(A ∩ B) = .144, determine the values of P(A) and P(B).

94. A transmitter is sending a message by using a binary code, namely, a sequence of 0’s and 1’s. Each transmitted bit (0 or 1) must pass through three relays to reach the receiver. At each relay, the probability is .20 that the bit sent will be different from the bit received (a reversal). Assume that the relays operate independently of one another.

Transmitter → Relay 1 → Relay 2 → Relay 3 → Receiver

a. If a 1 is sent from the transmitter, what is the probability that a 1 is sent by all three relays?
b. If a 1 is sent from the transmitter, what is the probability that a 1 is received by the receiver? [Hint: The eight experimental outcomes can be displayed on a tree diagram with three generations of branches, one generation for each relay.]
c. Suppose 70% of all bits sent from the transmitter are 1s. If a 1 is received by the receiver, what is the probability that a 1 was sent?

95. Individual A has a circle of five close friends (B, C, D, E, and F). A has heard a certain rumor from outside the circle and has invited the five friends to a party to circulate the rumor. To begin, A selects one of the five at random and tells the rumor to the chosen individual. That individual then selects at random one of the four remaining individuals and repeats the rumor. Continuing, a new individual is selected from those not already having heard the rumor by the individual who has just heard it, until everyone has been told.
a. What is the probability that the rumor is repeated in the order B, C, D, E, and F?
b. What is the probability that F is the third person at the party to be told the rumor?
c. What is the probability that F is the last person to hear the rumor?
d. If at each stage the person who currently “has” the rumor does not know who has already heard it and selects the next recipient at random from all five possible individuals, what is the probability that F has still not heard the rumor after it has been told ten times at the party?

96. According to the article “Optimization of Distribution Parameters for Estimating Probability of Crack Detection” (J. of Aircraft, 2009: 2090–2097), the following “Palmberg” equation is commonly used to determine the probability Pd(c) of detecting a crack of size c in an aircraft structure:

Pd(c) = (c/c*)^β / [1 + (c/c*)^β]

where c* is the crack size that corresponds to a .5 detection probability (and thus is an assessment of the quality of the inspection process).
a. Verify that Pd(c*) = .5.
b. What is Pd(2c*) when β = 4?
c. Suppose an inspector inspects two different panels, one with a crack size of c* and the other with a crack size of 2c*. Again assuming β = 4 and also that the results of the two inspections are independent of one another, what is the probability that exactly one of the two cracks will be detected?
d. What happens to Pd(c) as β → ∞?

97. A chemical engineer is interested in determining whether a certain trace impurity is present in a product. An experiment has a probability of .80 of detecting the impurity if it is present. The probability of not detecting the impurity if it is absent is .90. The prior probabilities of the impurity being present and being absent are .40 and .60, respectively. Three separate experiments result in only two detections. What is the posterior probability that the impurity is present?

98. Each contestant on a quiz show is asked to specify one of six possible categories from which questions will be asked. Suppose P(contestant requests category i) = 1/6 and successive contestants choose their categories independently of one another. If there are three contestants on each show and all three contestants on a particular show select different categories, what is the probability that exactly one has selected category 1?

99. Fasteners used in aircraft manufacturing are slightly crimped so that they lock enough to avoid loosening during vibration. Suppose that 95% of all fasteners pass an initial inspection. Of the 5% that fail, 20% are so seriously defective that they must be scrapped. The remaining fasteners are sent to a recrimping operation, where 40% cannot be salvaged and are discarded. The other 60% of these fasteners are corrected by the recrimping process and subsequently pass inspection.
a. What is the probability that a randomly selected incoming fastener will pass inspection either initially or after recrimping?
b. Given that a fastener passed inspection, what is the probability that it passed the initial inspection and did not need recrimping?

100. One percent of all individuals in a certain population are carriers of a particular disease. A diagnostic test for this disease has a 90% detection rate for carriers and a 5% detection rate for noncarriers. Suppose the test is applied independently to two different blood samples from the same randomly selected individual.
a. What is the probability that both tests yield the same result?
b. If both tests are positive, what is the probability that the selected individual is a carrier?

101. A system consists of two components. The probability that the second component functions in a satisfactory manner during its design life is .9, the probability that at least one of the two components does so is .96, and the probability that both components do so is .75. Given that the first component functions in a satisfactory manner throughout its design life, what is the probability that the second one does also?

102. A certain company sends 40% of its overnight mail parcels via express mail service E1. Of these parcels, 2% arrive after the guaranteed delivery time (denote the event “late delivery” by L). If a record of an overnight mailing is randomly selected from the company’s file, what is the probability that the parcel went via E1 and was late?

103. Refer to Exercise 102. Suppose that 50% of the overnight parcels are sent via express mail service E2 and the remaining 10% are sent via E3. Of those sent via E2, only 1% arrive late, whereas 5% of the parcels handled by E3 arrive late.
a. What is the probability that a randomly selected parcel arrived late?
b. If a randomly selected parcel has arrived on time, what is the probability that it was not sent via E1?

104. A company uses three different assembly lines (A1, A2, and A3) to manufacture a particular component. Of those manufactured by line A1, 5% need rework to remedy a defect, whereas 8% of A2’s components need rework and 10% of A3’s need rework. Suppose that 50% of all components are produced by line A1, 30% are produced by line A2, and 20% come from line A3. If a randomly selected component needs rework, what is the probability that it came from line A1? From line A2? From line A3?

105. Disregarding the possibility of a February 29 birthday, suppose a randomly selected individual is equally likely to have been born on any one of the other 365 days.
a. If ten people are randomly selected, what is the probability that all have different birthdays? That at least two have the same birthday?
b. With k replacing ten in part (a), what is the smallest k for which there is at least a 50-50 chance that two or more people will have the same birthday?
c. If ten people are randomly selected, what is the probability that either at least two have the same birthday or at least two have the same last three digits of their Social Security numbers? [Note: The article “Methods for Studying Coincidences” (F. Mosteller and P. Diaconis, J. Amer. Stat. Assoc., 1989: 853–861) discusses problems of this type.]

106. One method used to distinguish between granitic (G) and basaltic (B) rocks is to examine a portion of the infrared spectrum of the sun’s energy reflected from the rock surface. Let R1, R2, and R3 denote measured spectrum intensities at three different wavelengths; typically, for granite R1 < R2 < R3, whereas for basalt R3 < R1 < R2. When measurements are made remotely (using aircraft), various orderings of the Ri's may arise whether the rock is basalt or granite. Flights over regions of known composition have yielded the following information:

                   Granite   Basalt
R1 < R2 < R3         60%       10%
R1 < R3 < R2         25%       20%
R3 < R1 < R2         15%       70%

Suppose that for a randomly selected rock in a certain region, P(granite) = .25 and P(basalt) = .75.
a. Show that P(granite | R1 < R2 < R3) > P(basalt | R1 < R2 < R3). If measurements yielded R1 < R2 < R3, would you classify the rock as granite or basalt?
b. If measurements yielded R1 < R3 < R2, how would you classify the rock? Answer the same question for R3 < R1 < R2.
c. Using the classification rules indicated in parts (a) and (b), when selecting a rock from this region, what is the probability of an erroneous classification? [Hint: Either G could be classified as B or B as G, and P(B) and P(G) are known.]
d. If P(granite) = p rather than .25, are there values of p (other than 1) for which one would always classify a rock as granite?

107. A subject is allowed a sequence of glimpses to detect a target. Let Gi = {the target is detected on the ith glimpse}, with pi = P(Gi). Suppose the Gi's are independent events, and write an expression for the probability that the target has been detected by the end of the nth glimpse. [Note: This model is discussed in “Predicting Aircraft Detectability,” Human Factors, 1979: 277–291.]

108. In a Little League baseball game, team A’s pitcher throws a strike 50% of the time and a ball 50% of the time, successive pitches are independent of one another, and the pitcher never hits a batter. Knowing this, team B’s manager has instructed the first batter not to swing at anything. Calculate the probability that
a. The batter walks on the fourth pitch
b. The batter walks on the sixth pitch (so two of the first five must be strikes), using a counting argument or constructing a tree diagram
c. The batter walks
d. The first batter up scores while no one is out (assuming that each batter pursues a no-swing strategy)

109. Four engineers, A, B, C, and D, have been scheduled for job interviews at 10 A.M. on Friday, January 13, at Random Sampling, Inc. The personnel manager has scheduled the four for interview rooms 1, 2, 3, and 4, respectively. However, the manager’s secretary does not know this, so assigns them to the four rooms in a completely random fashion (what else!). What is the probability that
a. All four end up in the correct rooms?
b. None of the four ends up in the correct room?

110. A particular airline has 10 A.M. flights from Chicago to New York, Atlanta, and Los Angeles. Let A denote the event that the New York flight is full and define events B and C analogously for the other two flights. Suppose P(A) = .6, P(B) = .5, P(C) = .4, and the three events are independent. What is the probability that
a. All three flights are full? That at least one flight is not full?
b. Only the New York flight is full? That exactly one of the three flights is full?

111. A personnel manager is to interview four candidates for a job. These are ranked 1, 2, 3, and 4 in order of preference and will be interviewed in random order. However, at the conclusion of each interview, the manager will know only how the current candidate compares to those previously interviewed. For example, the interview order 3, 4, 1, 2 generates no information after the first interview, shows that the second candidate is worse than the first, and that the third is better than the first two. However, the order 3, 4, 2, 1 would generate the same information after each of the first three interviews. The manager wants to hire the best candidate but must make an irrevocable hire/no hire decision after each interview. Consider the following strategy: Automatically reject the first s candidates and then hire the first subsequent candidate who is best among those already interviewed (if no such candidate appears, the last one interviewed is hired).

For example, with s = 2, the order 3, 4, 1, 2 would result in the best being hired, whereas the order 3, 1, 2, 4 would not. Of the four possible s values (0, 1, 2, and 3), which one maximizes P(best is hired)? [Hint: Write out the 24 equally likely interview orderings: s = 0 means that the first candidate is automatically hired.]

112. Consider four independent events A1, A2, A3, and A4, and let pi = P(Ai) for i = 1, 2, 3, 4. Express the probability that at least one of these four events occurs in terms of the pi's, and do the same for the probability that at least two of the events occur.

113. A box contains the following four slips of paper, each having exactly the same dimensions: (1) win prize 1; (2) win prize 2; (3) win prize 3; (4) win prizes 1, 2, and 3. One slip will be randomly selected. Let A1 = {win prize 1}, A2 = {win prize 2}, and A3 = {win prize 3}. Show that A1 and A2 are independent, that A1 and A3 are independent, and that A2 and A3 are also independent (this is pairwise independence). However, show that P(A1 ∩ A2 ∩ A3) ≠ P(A1) · P(A2) · P(A3), so the three events are not mutually independent.

114. Show that if A1, A2, and A3 are independent events, then P(A1 | A2 ∩ A3) = P(A1).

Bibliography

Durrett, Richard, Elementary Probability for Applications, Cambridge Univ. Press, London, England, 2009. A concise but readable presentation at a slightly higher level than this text.

Mosteller, Frederick, Robert Rourke, and George Thomas, Probability with Statistical Applications (2nd ed.), Addison-Wesley, Reading, MA, 1970. A very good precalculus introduction to probability, with many entertaining examples; especially good on counting rules and their application.

Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability Models and Application (2nd ed.), Macmillan, New York, 1994. A comprehensive introduction to probability, written at a slightly higher mathematical level than this text but containing many good examples.

Ross, Sheldon, A First Course in Probability (8th ed.), Macmillan, New York, 2009. Rather tightly written and more mathematically sophisticated than this text but contains a wealth of interesting examples and exercises.

Winkler, Robert, Introduction to Bayesian Inference and Decision, Holt, Rinehart & Winston, New York, 1972. A very good introduction to subjective probability.


3 Discrete Random Variables and Probability Distributions

INTRODUCTION

Whether an experiment yields qualitative or quantitative outcomes, methods of statistical analysis require that we focus on certain numerical aspects of the data (such as a sample proportion x/n, mean x̄, or standard deviation s). The concept of a random variable allows us to pass from the experimental outcomes themselves to a numerical function of the outcomes. There are two fundamentally different types of random variables: discrete random variables and continuous random variables. In this chapter, we examine the basic properties and discuss the most important examples of discrete variables. Chapter 4 focuses on continuous random variables.

3.1 Random Variables

In any experiment, there are numerous characteristics that can be observed or measured, but in most cases an experimenter will focus on some specific aspect or aspects of a sample. For example, in a study of commuting patterns in a metropolitan area, each individual in a sample might be asked about commuting distance and the number of people commuting in the same vehicle, but not about IQ, income, family size, and other such characteristics. Alternatively, a researcher may test a sample of components and record only the number that have failed within 1000 hours, rather than record the individual failure times.

In general, each outcome of an experiment can be associated with a number by specifying a rule of association (e.g., the number among the sample of ten components that fail to last 1000 hours or the total weight of baggage for a sample of 25 airline passengers). Such a rule of association is called a random variable: a variable because different numerical values are possible and random because the observed value depends on which of the possible experimental outcomes results (Figure 3.1).

Figure 3.1 A random variable

DEFINITION
For a given sample space 𝒮 of some experiment, a random variable (rv) is any rule that associates a number with each outcome in 𝒮. In mathematical language, a random variable is a function whose domain is the sample space and whose range is the set of real numbers.

Random variables are customarily denoted by uppercase letters, such as X and Y, near the end of our alphabet. In contrast to our previous use of a lowercase letter, such as x, to denote a variable, we will now use lowercase letters to represent some particular value of the corresponding random variable. The notation X(s) = x means that x is the value associated with the outcome s by the rv X.

Example 3.1
When a student calls a university help desk for technical support, he/she will either immediately be able to speak to someone (S, for success) or will be placed on hold (F, for failure). With 𝒮 = {S, F}, define an rv X by

X(S) = 1    X(F) = 0

The rv X indicates whether (1) or not (0) the student can immediately speak to someone. ■

The rv X in Example 3.1 was specified by explicitly listing each element of 𝒮 and the associated number. Such a listing is tedious if 𝒮 contains more than a few outcomes, but it can frequently be avoided.

Example 3.2
Consider the experiment in which a telephone number in a certain area code is dialed using a random number dialer (such devices are used extensively by polling organizations), and define an rv Y by

Y = 1 if the selected number is unlisted
Y = 0 if the selected number is listed in the directory

For example, if 5282966 appears in the telephone directory, then Y(5282966) = 0, whereas Y(7727350) = 1 tells us that the number 7727350 is unlisted. A word description of this sort is more economical than a complete listing, so we will use such a description whenever possible. ■

In Examples 3.1 and 3.2, the only possible values of the random variable were 0 and 1. Such a random variable arises frequently enough to be given a special name, after the individual who first studied it.

DEFINITION
Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
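The view of a random variable as a function on the sample space can be made concrete with a short Python sketch of Examples 3.1 and 3.2. The names `X`, `Y`, `directory`, and `is_bernoulli` are chosen only for this illustration, and the one-entry directory is assumed data rather than anything given in the text.

```python
# Example 3.1: the sample space is {S, F}; the rv X is specified by
# listing the number associated with each outcome.
X = {"S": 1, "F": 0}

# Example 3.2: listing every telephone number is impractical, so Y is
# described by a rule. `directory` is a hypothetical set of listed numbers.
directory = {"5282966"}

def Y(number: str) -> int:
    """Return 1 if the dialed number is unlisted, 0 if it is listed."""
    return 0 if number in directory else 1

def is_bernoulli(values) -> bool:
    """Check that an rv takes no values other than 0 and 1."""
    return set(values) <= {0, 1}

print(X["S"], X["F"])              # value of X for each outcome
print(Y("5282966"), Y("7727350"))  # a listed number, then an unlisted one
print(is_bernoulli(X.values()))
```

The dictionary corresponds to the explicit listing used in Example 3.1, and the function corresponds to the word description used in Example 3.2.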

We will sometimes want to consider several different random variables from the same sample space.

Example 3.3
Example 2.3 described an experiment in which the number of pumps in use at each of two six-pump gas stations was determined. Define rv’s X, Y, and U by

X = the total number of pumps in use at the two stations
Y = the difference between the number of pumps in use at station 1 and the number in use at station 2
U = the maximum of the numbers of pumps in use at the two stations

If this experiment is performed and s = (2, 3) results, then X((2, 3)) = 2 + 3 = 5, so we say that the observed value of X was x = 5. Similarly, the observed value of Y would be y = 2 − 3 = −1, and the observed value of U would be u = max(2, 3) = 3. ■

Each of the random variables of Examples 3.1–3.3 can assume only a finite number of possible values. This need not be the case.

Example 3.4
Consider an experiment in which 9-volt batteries are tested until one with an acceptable voltage (S) is obtained. The sample space is 𝒮 = {S, FS, FFS, . . .}. Define an rv X by

X = the number of batteries tested before the experiment terminates

Then X(S) = 1, X(FS) = 2, X(FFS) = 3, . . . , X(FFFFFFS) = 7, and so on. Any positive integer is a possible value of X, so the set of possible values is infinite. ■

Example 3.5
Suppose that in some random fashion, a location (latitude and longitude) in the continental United States is selected. Define an rv Y by

Y = the height above sea level at the selected location

For example, if the selected location were (39°50′N, 98°35′W), then we might have Y((39°50′N, 98°35′W)) = 1748.26 ft. The largest possible value of Y is 14,494 (Mt. Whitney), and the smallest possible value is −282 (Death Valley). The set of all possible values of Y is the set of all numbers in the interval between −282 and 14,494, that is,

{y : y is a number, −282 ≤ y ≤ 14,494}

and there are an infinite number of numbers in this interval. ■
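As a rough illustration of several random variables defined on one sample space, the following Python sketch evaluates the three rv's of Example 3.3 at the outcome (2, 3) and collects the possible values of each. It assumes that every outcome is a pair of counts between 0 and 6 (one count per six-pump station); the names `sample_space` and `possible` are chosen for this sketch.

```python
from itertools import product

# Assumed sample space: (pumps in use at station 1, pumps in use at
# station 2), each count running from 0 to 6.
sample_space = list(product(range(7), repeat=2))

def X(s):
    """Total number of pumps in use at the two stations."""
    return s[0] + s[1]

def Y(s):
    """Number in use at station 1 minus the number in use at station 2."""
    return s[0] - s[1]

def U(s):
    """Maximum of the numbers of pumps in use at the two stations."""
    return max(s)

s = (2, 3)
print(X(s), Y(s), U(s))  # observed values x, y, u for the outcome (2, 3)

# Possible values of each rv, collected over the whole sample space.
possible = {name: sorted({rv(outcome) for outcome in sample_space})
            for name, rv in (("X", X), ("Y", Y), ("U", U))}
print(possible)
```

Each rv here has only finitely many possible values because the assumed sample space itself is finite, in contrast to Examples 3.4 and 3.5.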

Two Types of Random Variables

In Section 1.2, we distinguished between data resulting from observations on a counting variable and data obtained by observing values of a measurement variable. A slightly more formal distinction characterizes two different types of random variables.

DEFINITION
A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on (“countably” infinite).
A random variable is continuous if both of the following apply:
1. Its set of possible values consists either of all numbers in a single interval on the number line (possibly infinite in extent, e.g., from −∞ to ∞) or all numbers in a disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).
2. No possible value of the variable has positive probability, that is, P(X = c) = 0 for any possible value c.

Although any interval on the number line contains an infinite number of numbers, it can be shown that there is no way to create an infinite listing of all these values; there are just too many of them. The second condition describing a continuous random variable is perhaps counterintuitive, since it would seem to imply a total probability of zero for all possible values. But we shall see in Chapter 4 that intervals of values have positive probability; the probability of an interval will decrease to zero as the width of the interval shrinks to zero.

Example 3.6
All random variables in Examples 3.1–3.4 are discrete. As another example, suppose we select married couples at random and do a blood test on each person until we find a husband and wife who both have the same Rh factor. With X = the number of blood tests to be performed, possible values of X are D = {2, 4, 6, 8, . . .}. Since the possible values have been listed in sequence, X is a discrete rv. ■

To study basic properties of discrete rv’s, only the tools of discrete mathematics (summation and differences) are required. The study of continuous variables requires the continuous mathematics of the calculus (integrals and derivatives).
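The phrase "can be listed in an infinite sequence" can be illustrated with a small Python sketch for the blood-test example. It assumes, as the example suggests, that each couple tested contributes two blood tests; the function name `blood_test_counts` is made up for this illustration.

```python
from itertools import count, islice

def blood_test_counts():
    """Yield the possible values of X in sequence: 2, 4, 6, ..."""
    # After k couples have been tested, 2k blood tests have been done,
    # for k = 1, 2, 3, ...
    return (2 * k for k in count(1))

# The listing never ends, but every possible value has a definite
# position in it (first, second, third, ...).
first_five = list(islice(blood_test_counts(), 5))
print(first_five)
```

No such generator could enumerate every number in an interval such as the one in Example 3.5, which is the sense in which that set of values is not countable.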

EXERCISES Section 3.1 (1–10)

1. A concrete beam may fail either by shear (S) or flexure (F). Suppose that three failed beams are randomly selected and the type of failure is determined for each one. Let X = the number of beams among the three selected that failed by shear. List each outcome in the sample space along with the associated value of X.

2. Give three examples of Bernoulli rv’s (other than those in the text).

3. Using the experiment in Example 3.3, define two more random variables and list the possible values of each.

4. Let X = the number of nonzero digits in a randomly selected zip code. What are the possible values of X? Give three possible outcomes and their associated X values.

5. If the sample space 𝒮 is an infinite set, does this necessarily imply that any rv X defined from 𝒮 will have an infinite set of possible values? If yes, say why. If no, give an example.

6. Starting at a fixed time, each car entering an intersection is observed to see whether it turns left (L), right (R), or goes straight ahead (A). The experiment terminates as soon as a car is observed to turn left. Let X = the number of cars observed. What are possible X values? List five outcomes and their associated X values.

7. For each random variable defined here, describe the set of possible values for the variable, and state whether the variable is discrete.
a. X = the number of unbroken eggs in a randomly chosen standard egg carton
b. Y = the number of students on a class list for a particular course who are absent on the first day of classes
c. U = the number of times a duffer has to swing at a golf ball before hitting it
d. X = the length of a randomly selected rattlesnake
e. Z = the amount of royalties earned from the sale of a first edition of 10,000 textbooks
f. Y = the pH of a randomly chosen soil sample
g. X = the tension (psi) at which a randomly selected tennis racket has been strung
h. X = the total number of coin tosses required for three individuals to obtain a match (HHH or TTT)

8. Each time a component is tested, the trial is a success (S) or failure (F). Suppose the component is tested repeatedly until a success occurs on three consecutive trials. Let Y denote the number of trials necessary to achieve this. List all outcomes corresponding to the five smallest possible values of Y, and state which Y value is associated with each one.

9. An individual named Claudius is located at the point 0 in the accompanying diagram.

[Diagram: the point 0 together with the points B1, B2, B3, B4 and A1, A2, A3, A4]

Using an appropriate randomization device (such as a tetrahedral die, one having four sides), Claudius first moves to one of the four locations B1, B2, B3, B4. Once at one of these locations, another randomization device is used to decide whether Claudius next returns to 0 or next visits one of the other two adjacent points. This process then continues; after each move, another move to one of the (new) adjacent points is determined by tossing an appropriate die or coin.
a. Let X = the number of moves that Claudius makes before first returning to 0. What are possible values of X? Is X discrete or continuous?
b. If moves are allowed also along the diagonal paths connecting 0 to A1, A2, A3, and A4, respectively, answer the questions in part (a).

10. The number of pumps in use at both a six-pump station and a four-pump station will be determined. Give the possible values for each of the following random variables:
a. T = the total number of pumps in use
b. X = the difference between the numbers in use at stations 1 and 2
c. U = the maximum number of pumps in use at either station
d. Z = the number of stations having exactly two pumps in use

3.2 Probability Distributions for Discrete Random Variables

Probabilities assigned to various outcomes in $\mathcal{S}$ in turn determine probabilities associated with the values of any particular rv X. The probability distribution of X says how the total probability of 1 is distributed among (allocated to) the various possible X values. Suppose, for example, that a business has just purchased four laser printers, and let X be the number among these that require service during the warranty period. Possible X values are then 0, 1, 2, 3, and 4. The probability distribution will tell us how the probability of 1 is subdivided among these five possible values: how much probability is associated with the X value 0, how much is apportioned to the X value 1, and so on. We will use the following notation for the probabilities in the distribution:

\[
\begin{aligned}
p(0) &= \text{the probability of the } X \text{ value } 0 = P(X = 0) \\
p(1) &= \text{the probability of the } X \text{ value } 1 = P(X = 1)
\end{aligned}
\]

and so on. In general, p(x) will denote the probability assigned to the value x.

Example 3.7

The Cal Poly Department of Statistics has a lab with six computers reserved for statistics majors. Let X denote the number of these computers that are in use at a particular time of day. Suppose that the probability distribution of X is as given in the following table; the first row of the table lists the possible X values and the second row gives the probability of each such value.

x      0    1    2    3    4    5    6
p(x)  .05  .10  .15  .25  .20  .15  .10

We can now use elementary probability properties to calculate other probabilities of interest. For example, the probability that at most 2 computers are in use is

\[ P(X \le 2) = P(X = 0 \text{ or } 1 \text{ or } 2) = p(0) + p(1) + p(2) = .05 + .10 + .15 = .30 \]

Since the event at least 3 computers are in use is complementary to at most 2 computers are in use,

\[ P(X \ge 3) = 1 - P(X \le 2) = 1 - .30 = .70 \]

which can, of course, also be obtained by adding together probabilities for the values 3, 4, 5, and 6. The probability that between 2 and 5 computers inclusive are in use is

\[ P(2 \le X \le 5) = P(X = 2, 3, 4, \text{ or } 5) = .15 + .25 + .20 + .15 = .75 \]

whereas the probability that the number of computers in use is strictly between 2 and 5 is

\[ P(2 < X < 5) = P(X = 3 \text{ or } 4) = .25 + .20 = .45 \]

■

DEFINITION The probability distribution or probability mass function (pmf) of a discrete rv is defined for every number x by

\[ p(x) = P(X = x) = P(\text{all } s \in \mathcal{S} : X(s) = x) \]

In words, for every possible value x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed. The conditions $p(x) \ge 0$ and $\sum_{\text{all possible } x} p(x) = 1$ are required of any pmf.

The pmf of X in the previous example was simply given in the problem description. We now consider several examples in which various probability properties are exploited to obtain the desired distribution.

Example 3.8

Six lots of components are ready to be shipped by a certain supplier. The number of defective components in each lot is as follows:

Lot                    1  2  3  4  5  6
Number of defectives   0  2  0  1  2  0

One of these lots is to be randomly selected for shipment to a particular customer. Let X be the number of defectives in the selected lot. The three possible X values are 0, 1, and 2. Of the six equally likely simple events, three result in $X = 0$, one in $X = 1$, and the other two in $X = 2$. Then

\[
\begin{aligned}
p(0) &= P(X = 0) = P(\text{lot 1 or 3 or 6 is sent}) = \tfrac{3}{6} = .500 \\
p(1) &= P(X = 1) = P(\text{lot 4 is sent}) = \tfrac{1}{6} = .167 \\
p(2) &= P(X = 2) = P(\text{lot 2 or 5 is sent}) = \tfrac{2}{6} = .333
\end{aligned}
\]


That is, a probability of .500 is distributed to the X value 0, a probability of .167 is placed on the X value 1, and the remaining probability, .333, is associated with the X value 2. The values of X along with their probabilities collectively specify the pmf. If this experiment were repeated over and over again, in the long run $X = 0$ would occur one-half of the time, $X = 1$ one-sixth of the time, and $X = 2$ one-third of the time. ■

Example 3.9

Consider whether the next person buying a computer at a certain electronics store buys a laptop or a desktop model. Let

\[
X = \begin{cases} 1 & \text{if the customer purchases a desktop computer} \\ 0 & \text{if the customer purchases a laptop computer} \end{cases}
\]

If 20% of all purchasers during that week select a desktop, the pmf for X is

\[
\begin{aligned}
p(0) &= P(X = 0) = P(\text{next customer purchases a laptop model}) = .8 \\
p(1) &= P(X = 1) = P(\text{next customer purchases a desktop model}) = .2 \\
p(x) &= P(X = x) = 0 \quad \text{for } x \ne 0 \text{ or } 1
\end{aligned}
\]

An equivalent description is

\[
p(x) = \begin{cases} .8 & \text{if } x = 0 \\ .2 & \text{if } x = 1 \\ 0 & \text{if } x \ne 0 \text{ or } 1 \end{cases}
\]

Figure 3.2 is a picture of this pmf, called a line graph. X is, of course, a Bernoulli rv and p(x) is a Bernoulli pmf.

Figure 3.2 The line graph for the pmf in Example 3.9 ■
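The calculations in Examples 3.7 and 3.8 can be mirrored in a few lines of Python. The following is only a minimal sketch, assuming a pmf is stored as a dictionary that maps each possible value to its probability; the helper names `is_valid_pmf` and `prob` are illustrative and do not come from the text.

```python
from fractions import Fraction


def is_valid_pmf(pmf, tol=1e-9):
    """Check the two pmf conditions: every p(x) >= 0 and the p(x) sum to 1."""
    return all(p >= 0 for p in pmf.values()) and abs(sum(pmf.values()) - 1) < tol


def prob(pmf, event):
    """P(X in event): add p(x) over the possible values x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))


# Example 3.7: X = number of lab computers in use.
computers = {0: .05, 1: .10, 2: .15, 3: .25, 4: .20, 5: .15, 6: .10}

# Example 3.8: one of six equally likely lots is shipped; X = defectives in that lot.
defectives_by_lot = [0, 2, 0, 1, 2, 0]
lots = {}
for d in defectives_by_lot:
    lots[d] = lots.get(d, 0) + Fraction(1, len(defectives_by_lot))

if __name__ == "__main__":
    print(is_valid_pmf(computers), is_valid_pmf(lots))
    print(prob(computers, lambda x: x <= 2))       # P(X <= 2)
    print(prob(computers, lambda x: x >= 3))       # P(X >= 3)
    print(prob(computers, lambda x: 2 <= x <= 5))  # P(2 <= X <= 5)
    print(prob(computers, lambda x: 2 < x < 5))    # P(2 < X < 5)
    print(lots)                                    # exact fractions for p(0), p(1), p(2)
```

Exact fractions are used for Example 3.8 so that the result can be compared with 3/6, 1/6, and 2/6 without rounding; the decimal probabilities of Example 3.7 are ordinary floats, so comparisons against them need a small tolerance.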

Example 3.10

Consider a group of five potential blood donors (a, b, c, d, and e), of whom only a and b have type O+ blood. Five blood samples, one from each individual, will be typed in random order until an O+ individual is identified. Let the rv Y = the number of typings necessary to identify an O+ individual. Then the pmf of Y is

\[
\begin{aligned}
p(1) &= P(Y = 1) = P(a \text{ or } b \text{ typed first}) = \tfrac{2}{5} = .4 \\
p(2) &= P(Y = 2) = P(c, d, \text{ or } e \text{ first, and then } a \text{ or } b) \\
     &= P(c, d, \text{ or } e \text{ first}) \cdot P(a \text{ or } b \text{ next} \mid c, d, \text{ or } e \text{ first}) = \tfrac{3}{5} \cdot \tfrac{2}{4} = .3 \\
p(3) &= P(Y = 3) = P(c, d, \text{ or } e \text{ first and second, and then } a \text{ or } b) \\
     &= \left(\tfrac{3}{5}\right)\left(\tfrac{2}{4}\right)\left(\tfrac{2}{3}\right) = .2 \\
p(4) &= P(Y = 4) = P(c, d, \text{ and } e \text{ all done first}) = \left(\tfrac{3}{5}\right)\left(\tfrac{2}{4}\right)\left(\tfrac{1}{3}\right) = .1 \\
p(y) &= 0 \quad \text{if } y \ne 1, 2, 3, 4
\end{aligned}
\]

In tabular form, the pmf is

y      1   2   3   4
p(y)  .4  .3  .2  .1

where any y value not listed receives zero probability. Figure 3.3 shows a line graph of the pmf.

Figure 3.3 The line graph for the pmf in Example 3.10

The name “probability mass function” is suggested by a model used in physics for a system of “point masses.” In this model, masses are distributed at various locations x along a one-dimensional axis. Our pmf describes how the total probability mass of 1 is distributed at various points along the axis of possible values of the random variable (where and how much mass at each x).

Another useful pictorial representation of a pmf, called a probability histogram, is similar to histograms discussed in Chapter 1. Above each y with $p(y) > 0$, construct a rectangle centered at y. The height of each rectangle is proportional to p(y), and the base is the same for all rectangles. When possible values are equally spaced, the base is frequently chosen as the distance between successive y values (though it could be smaller). Figure 3.4 shows two probability histograms.

Figure 3.4 Probability histograms: (a) Example 3.9; (b) Example 3.10

It is often helpful to think of a pmf as specifying a mathematical model for a discrete population.

Example 3.11

Consider selecting at random a student who is among the 15,000 registered for the current term at Mega University. Let X = the number of courses for which the selected student is registered, and suppose that X has the following pmf:

x      1    2    3    4    5    6    7
p(x)  .01  .03  .13  .25  .39  .17  .02

One way to view this situation is to think of the population as consisting of 15,000 individuals, each having his or her own X value; the proportion with each X value is given by p(x). An alternative viewpoint is to forget about the students and think of the population itself as consisting of the X values: There are some 1s in the population, some 2s, . . . , and finally some 7s. The population then consists of the numbers 1, 2, . . . , 7 (so is discrete), and p(x) gives a model for the distribution of population values. ■
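The pmf derived in Example 3.10 with the multiplication rule can also be checked by brute force. The sketch below assumes that "typed in random order" means all 5! = 120 orderings of the donors are equally likely; it counts, for each ordering, how many typings are needed before a or b appears. The function name `typings_needed` is illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

donors = "abcde"
o_positive = {"a", "b"}  # per Example 3.10, only a and b have type O+ blood


def typings_needed(order):
    """Number of samples typed until the first O+ donor appears in this ordering."""
    for count, donor in enumerate(order, start=1):
        if donor in o_positive:
            return count
    raise ValueError("ordering contains no O+ donor")


# Enumerate every equally likely typing order and tabulate Y.
orders = list(permutations(donors))
counts = Counter(typings_needed(order) for order in orders)
pmf_y = {y: Fraction(n, len(orders)) for y, n in sorted(counts.items())}

if __name__ == "__main__":
    for y, p in pmf_y.items():
        print(y, p, float(p))
```

Enumeration is practical here only because the sample space is tiny; the conditional-probability argument in the example is what scales to larger problems.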


Once we have such a population model, we will use it to compute values of population characteristics (e.g., the mean $\mu$) and make inferences about such characteristics.

A Parameter of a Probability Distribution

The pmf of the Bernoulli rv X in Example 3.9 was $p(0) = .8$ and $p(1) = .2$ because 20% of all purchasers selected a desktop computer. At another store, it may be the case that $p(0) = .9$ and $p(1) = .1$. More generally, the pmf of any Bernoulli rv can be expressed in the form $p(1) = \alpha$ and $p(0) = 1 - \alpha$, where $0 < \alpha < 1$. Because the pmf depends on the particular value of $\alpha$, we often write $p(x; \alpha)$ rather than just p(x):

\[
p(x; \alpha) = \begin{cases} 1 - \alpha & \text{if } x = 0 \\ \alpha & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases}
\tag{3.1}
\]

Then each choice of $\alpha$ in Expression (3.1) yields a different pmf.

DEFINITION Suppose p(x) depends on a quantity that can be assigned any one of a number of possible values, with each different value determining a different probability distribution. Such a quantity is called a parameter of the distribution. The collection of all probability distributions for different values of the parameter is called a family of probability distributions.

The quantity $\alpha$ in Expression (3.1) is a parameter. Each different number $\alpha$ between 0 and 1 determines a different member of the Bernoulli family of distributions.

Example 3.12

Starting at a fixed time, we observe the gender of each newborn child at a certain hospital until a boy (B) is born. Let $p = P(B)$, assume that successive births are independent, and define the rv X by X = number of births observed. Then

\[
\begin{aligned}
p(1) &= P(X = 1) = P(B) = p \\
p(2) &= P(X = 2) = P(GB) = P(G) \cdot P(B) = (1 - p)p
\end{aligned}
\]

and

\[ p(3) = P(X = 3) = P(GGB) = P(G) \cdot P(G) \cdot P(B) = (1 - p)^2 p \]

Continuing in this way, a general formula emerges:

\[
p(x) = \begin{cases} (1 - p)^{x-1} p & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}
\tag{3.2}
\]

The parameter p can assume any value between 0 and 1. Expression (3.2) describes the family of geometric distributions. In the gender example, $p = .51$ might be appropriate, but if we were looking for the first child with Rh-positive blood, then we might have $p = .85$. ■

The Cumulative Distribution Function

For some fixed value x, we often wish to compute the probability that the observed value of X will be at most x. For example, the pmf in Example 3.8 was


\[
p(x) = \begin{cases} .500 & x = 0 \\ .167 & x = 1 \\ .333 & x = 2 \\ 0 & \text{otherwise} \end{cases}
\]

The probability that X is at most 1 is then

\[ P(X \le 1) = p(0) + p(1) = .500 + .167 = .667 \]

In this example, $X \le 1.5$ if and only if $X \le 1$, so

\[ P(X \le 1.5) = P(X \le 1) = .667 \]

Similarly,

\[ P(X \le 0) = P(X = 0) = .5, \quad P(X \le .75) = .5 \]

And in fact for any x satisfying $0 \le x < 1$, $P(X \le x) = .5$. The largest possible X value is 2, so

\[ P(X \le 2) = 1, \quad P(X \le 3.7) = 1, \quad P(X \le 20.5) = 1 \]

and so on. Notice that $P(X < 1) < P(X \le 1)$ since the latter includes the probability of the X value 1, whereas the former does not. More generally, when X is discrete and x is a possible value of the variable, $P(X < x) < P(X \le x)$.

DEFINITION The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by

\[
F(x) = P(X \le x) = \sum_{y \,:\, y \le x} p(y)
\tag{3.3}
\]

For any number x, F(x) is the probability that the observed value of X will be at most x.

Example 3.13

A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of memory. The accompanying table gives the distribution of Y = the amount of memory in a purchased drive:

y      1    2    4    8    16
p(y)  .05  .10  .35  .40  .10

Let’s first determine F(y) for each of the five possible values of Y:

\[
\begin{aligned}
F(1) &= P(Y \le 1) = P(Y = 1) = p(1) = .05 \\
F(2) &= P(Y \le 2) = P(Y = 1 \text{ or } 2) = p(1) + p(2) = .15 \\
F(4) &= P(Y \le 4) = P(Y = 1 \text{ or } 2 \text{ or } 4) = p(1) + p(2) + p(4) = .50 \\
F(8) &= P(Y \le 8) = p(1) + p(2) + p(4) + p(8) = .90 \\
F(16) &= P(Y \le 16) = 1
\end{aligned}
\]

Now for any other number y, F(y) will equal the value of F at the closest possible value of Y to the left of y. For example,

\[
\begin{aligned}
F(2.7) &= P(Y \le 2.7) = P(Y \le 2) = F(2) = .15 \\
F(7.999) &= P(Y \le 7.999) = P(Y \le 4) = F(4) = .50
\end{aligned}
\]


If y is less than 1, $F(y) = 0$ [e.g., $F(.58) = 0$], and if y is at least 16, $F(y) = 1$ [e.g., $F(25) = 1$]. The cdf is thus

\[
F(y) = \begin{cases}
0 & y < 1 \\
.05 & 1 \le y < 2 \\
.15 & 2 \le y < 4 \\
.50 & 4 \le y < 8 \\
.90 & 8 \le y < 16 \\
1 & 16 \le y
\end{cases}
\]

A graph of this cdf is shown in Figure 3.5.

Figure 3.5 A graph of the cdf of Example 3.13 ■

For X a discrete rv, the graph of F(x) will have a jump at every possible value of X and will be flat between possible values. Such a graph is called a step function.

Example 3.14 (Example 3.12 continued)

The pmf of X = the number of births had the form

\[
p(x) = \begin{cases} (1 - p)^{x-1} p & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases}
\]

For any positive integer x,

\[
F(x) = \sum_{y \le x} p(y) = \sum_{y=1}^{x} (1 - p)^{y-1} p = p \sum_{y=0}^{x-1} (1 - p)^{y}
\tag{3.4}
\]

To evaluate this sum, recall that the partial sum of a geometric series is

\[ \sum_{y=0}^{k} a^{y} = \frac{1 - a^{k+1}}{1 - a} \]


Using this in Equation (3.4), with $a = 1 - p$ and $k = x - 1$, gives

\[
F(x) = p \cdot \frac{1 - (1 - p)^{x}}{1 - (1 - p)} = 1 - (1 - p)^{x} \qquad x \text{ a positive integer}
\]

Since F is constant in between positive integers,

\[
F(x) = \begin{cases} 0 & x < 1 \\ 1 - (1 - p)^{[x]} & x \ge 1 \end{cases}
\tag{3.5}
\]

where [x] is the largest integer $\le x$ (e.g., $[2.7] = 2$). Thus if $p = .51$ as in the birth example, then the probability of having to examine at most five births to see the first boy is $F(5) = 1 - (.49)^5 = 1 - .0282 = .9718$, whereas $F(10) = 1 - (.49)^{10} \approx .9992$. This cdf is graphed in Figure 3.6.

Figure 3.6 A graph of F(x) for Example 3.14

In examples thus far, the cdf has been derived from the pmf. This process can be reversed to obtain the pmf from the cdf whenever the latter function is available. For example, consider again the rv of Example 3.7 (the number of computers being used in a lab); possible X values are 0, 1, . . . , 6. Then

\[
\begin{aligned}
p(3) &= P(X = 3) \\
     &= [p(0) + p(1) + p(2) + p(3)] - [p(0) + p(1) + p(2)] \\
     &= P(X \le 3) - P(X \le 2) \\
     &= F(3) - F(2)
\end{aligned}
\]

More generally, the probability that X falls in a specified interval is easily obtained from the cdf. For example,

\[
\begin{aligned}
P(2 \le X \le 4) &= p(2) + p(3) + p(4) \\
                 &= [p(0) + \cdots + p(4)] - [p(0) + p(1)] \\
                 &= P(X \le 4) - P(X \le 1) \\
                 &= F(4) - F(1)
\end{aligned}
\]

Notice that $P(2 \le X \le 4) \ne F(4) - F(2)$. This is because the X value 2 is included in $2 \le X \le 4$, so we do not want to subtract out its probability. However, $P(2 < X \le 4) = F(4) - F(2)$ because $X = 2$ is not included in the interval $2 < X \le 4$.
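The closed form (3.5) for the geometric cdf can be checked numerically against direct summation of the pmf (3.2). The following Python sketch is illustrative only; the function names are hypothetical, and p = .51 is the value used in the birth example.

```python
import math


def geometric_pmf(x, p):
    """Expression (3.2): p(x) = (1 - p)^(x - 1) * p for x = 1, 2, 3, ...; 0 otherwise."""
    if x >= 1 and x == int(x):
        return (1 - p) ** (int(x) - 1) * p
    return 0.0


def geometric_cdf(x, p):
    """Expression (3.5): F(x) = 1 - (1 - p)^[x] for x >= 1, and 0 for x < 1."""
    if x < 1:
        return 0.0
    return 1 - (1 - p) ** math.floor(x)


def cdf_by_summation(x, p):
    """F(x) computed from the definition (3.3): sum p(y) over possible y <= x."""
    return sum(geometric_pmf(y, p) for y in range(1, math.floor(x) + 1))


if __name__ == "__main__":
    p = 0.51
    print(geometric_cdf(5, p), cdf_by_summation(5, p))   # both forms of F(5)
    print(geometric_cdf(2.7, p), geometric_cdf(2, p))    # F is flat between integers
    # Recover the pmf from the cdf by differencing: p(3) = F(3) - F(2).
    print(geometric_cdf(3, p) - geometric_cdf(2, p), geometric_pmf(3, p))
```

The last line applies the same differencing idea used above for the lab-computer example: for an integer-valued rv, subtracting consecutive cdf values gives back the pmf.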


PROPOSITION For any two numbers a and b with $a \le b$,

\[ P(a \le X \le b) = F(b) - F(a-) \]

where “$a-$” represents the largest possible X value that is strictly less than a. In particular, if the only possible values are integers and if a and b are integers, then

\[
\begin{aligned}
P(a \le X \le b) &= P(X = a \text{ or } a + 1 \text{ or } \ldots \text{ or } b) \\
                 &= F(b) - F(a - 1)
\end{aligned}
\]

Taking $a = b$ yields $P(X = a) = F(a) - F(a - 1)$ in this case.

The reason for subtracting $F(a-)$ rather than F(a) is that we want to include $P(X = a)$; $F(b) - F(a)$ gives $P(a < X \le b)$. This proposition will be used extensively when computing binomial and Poisson probabilities in Sections 3.4 and 3.6.

Example 3.15

Let X = the number of days of sick leave taken by a randomly selected employee of a large company during a particular year. If the maximum number of allowable sick days per year is 14, possible values of X are 0, 1, . . . , 14. With $F(0) = .58$, $F(1) = .72$, $F(2) = .76$, $F(3) = .81$, $F(4) = .88$, and $F(5) = .94$,

\[ P(2 \le X \le 5) = P(X = 2, 3, 4, \text{ or } 5) = F(5) - F(1) = .22 \]

and

\[ P(X = 3) = F(3) - F(2) = .05 \]

■
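The step-function behavior of a discrete cdf and the proposition $P(a \le X \le b) = F(b) - F(a-)$ can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the class name `DiscreteCDF` and its methods are hypothetical, and binary search over the sorted possible values is just one convenient way to locate "the closest possible value to the left."

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


class DiscreteCDF:
    """Step-function cdf built from a pmf given as {possible value: probability}."""

    def __init__(self, pmf):
        self.values = sorted(pmf)
        self.cumulative = list(accumulate(pmf[v] for v in self.values))

    def F(self, x):
        """F(x) = P(X <= x): cumulative probability at the largest possible value <= x."""
        k = bisect_right(self.values, x)
        return self.cumulative[k - 1] if k else 0.0

    def F_left(self, a):
        """F(a-): cdf at the largest possible value strictly less than a."""
        k = bisect_left(self.values, a)
        return self.cumulative[k - 1] if k else 0.0

    def between(self, a, b):
        """P(a <= X <= b) = F(b) - F(a-), as in the proposition."""
        return self.F(b) - self.F_left(a)


# Example 3.13: Y = amount of memory (GB) in a purchased flash drive.
flash = DiscreteCDF({1: .05, 2: .10, 4: .35, 8: .40, 16: .10})

# Example 3.15: only cdf values are given, so interval probabilities are differences.
sick_F = {0: .58, 1: .72, 2: .76, 3: .81, 4: .88, 5: .94}

if __name__ == "__main__":
    print(flash.F(2.7), flash.F(7.999), flash.F(0.58), flash.F(25))
    print(flash.between(2, 8))       # P(2 <= Y <= 8) = F(8) - F(1)
    print(sick_F[5] - sick_F[1])     # P(2 <= X <= 5)
    print(sick_F[3] - sick_F[2])     # P(X = 3)
```

Note that `F_left(2)` returns F(1) here, not F(2 − 1) by arithmetic: the possible values of Y are not consecutive integers, so "a−" must be found among the possible values themselves.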

EXERCISES Section 3.2 (11–28)

11. An automobile service facility specializing in engine tune-ups knows that 45% of all tune-ups are done on four-cylinder automobiles, 40% on six-cylinder automobiles, and 15% on eight-cylinder automobiles. Let X = the number of cylinders on the next car to be tuned.

a. What is the pmf of X?

b. Draw both a line graph and a probability histogram for the pmf of part (a).

c. What is the probability that the next car tuned has at least six cylinders? More than six cylinders?

12. Airlines sometimes overbook flights. Suppose that for a plane with 50 seats, 55 passengers have tickets. Define the random variable Y as the number of ticketed passengers who actually show up for the flight. The probability mass function of Y appears in the accompanying table.

y      45   46   47   48   49   50   51   52   53   54   55
p(y)  .05  .10  .12  .14  .25  .17  .06  .05  .03  .02  .01

a. What is the probability that the flight will accommodate all ticketed passengers who show up?

b. What is the probability that not all ticketed passengers who show up can be accommodated?

c. If you are the first person on the standby list (which means you will be the first one to get on the plane if there are any seats available after all ticketed passengers have been accommodated), what is the probability that you will be able to take the flight? What is this probability if you are the third person on the standby list?

13. A mail-order computer business has six telephone lines. Let X denote the number of lines in use at a specified time. Suppose the pmf of X is as given in the accompanying table.

x      0    1    2    3    4    5    6
p(x)  .10  .15  .20  .25  .20  .06  .04

Calculate the probability of each of the following events.

a. {at most three lines are in use}

b. {fewer than three lines are in use}

c. {at least three lines are in use}

d. {between two and five lines, inclusive, are in use}

e. {between two and four lines, inclusive, are not in use}

f. {at least four lines are not in use}


14. A contractor is required by a county planning department to submit one, two, three, four, or five forms (depending on the nature of the project) in applying for a building permit. Let Y = the number of forms required of the next applicant. The probability that y forms are required is known to be proportional to y; that is, $p(y) = ky$ for $y = 1, \ldots, 5$.

a. What is the value of k? [Hint: $\sum_{y=1}^{5} p(y) = 1$.]

b. What is the probability that at most three forms are required?

c. What is the probability that between two and four forms (inclusive) are required?

d. Could $p(y) = y^2/50$ for $y = 1, \ldots, 5$ be the pmf of Y?

15. Many manufacturers have quality control programs that include inspection of incoming materials for defects. Suppose a computer manufacturer receives computer boards in lots of five. Two boards are selected from each lot for inspection. We can represent possible outcomes of the selection process by pairs. For example, the pair (1, 2) represents the selection of boards 1 and 2 for inspection.

a. List the ten different possible outcomes.

b. Suppose that boards 1 and 2 are the only defective boards in a lot of five. Two boards are to be chosen at random. Define X to be the number of defective boards observed among those inspected. Find the probability distribution of X.

c. Let F(x) denote the cdf of X. First determine $F(0) = P(X \le 0)$, F(1), and F(2); then obtain F(x) for all other x.

16. Some parts of California are particularly earthquake-prone. Suppose that in one metropolitan area, 25% of all homeowners are insured against earthquake damage. Four homeowners are to be selected at random; let X denote the number among the four who have earthquake insurance.

a. Find the probability distribution of X. [Hint: Let S denote a homeowner who has insurance and F one who does not. Then one possible outcome is SFSS, with probability (.25)(.75)(.25)(.25) and associated X value 3. There are 15 other outcomes.]

b. Draw the corresponding probability histogram.

c. What is the most likely value for X?

d. What is the probability that at least two of the four selected have earthquake insurance?

17. A new battery’s voltage may be acceptable (A) or unacceptable (U). A certain flashlight requires two batteries, so batteries will be independently selected and tested until two acceptable ones have been found. Suppose that 90% of all batteries have acceptable voltages. Let Y denote the number of batteries that must be tested.

a. What is p(2), that is, $P(Y = 2)$?

b. What is p(3)? [Hint: There are two different outcomes that result in $Y = 3$.]

c. To have $Y = 5$, what must be true of the fifth battery selected? List the four outcomes for which $Y = 5$ and then determine p(5).

d. Use the pattern in your answers for parts (a)–(c) to obtain a general formula for p(y).

18. Two fair six-sided dice are tossed independently. Let M = the maximum of the two tosses (so $M(1,5) = 5$, $M(3,3) = 3$, etc.).

a. What is the pmf of M? [Hint: First determine p(1), then p(2), and so on.]

b. Determine the cdf of M and graph it.

19. A library subscribes to two different weekly news magazines, each of which is supposed to arrive in Wednesday’s mail. In actuality, each one may arrive on Wednesday, Thursday, Friday, or Saturday. Suppose the two arrive independently of one another, and for each one $P(\text{Wed.}) = .3$, $P(\text{Thurs.}) = .4$, $P(\text{Fri.}) = .2$, and $P(\text{Sat.}) = .1$. Let Y = the number of days beyond Wednesday that it takes for both magazines to arrive (so possible Y values are 0, 1, 2, or 3). Compute the pmf of Y. [Hint: There are 16 possible outcomes; $Y(W,W) = 0$, $Y(F,Th) = 2$, and so on.]

20. Three couples and two single individuals have been invited to an investment seminar and have agreed to attend. Suppose the probability that any particular couple or individual arrives late is .4 (a couple will travel together in the same vehicle, so either both people will be on time or else both will arrive late). Assume that different couples and individuals are on time or late independently of one another. Let X = the number of people who arrive late for the seminar.

a. Determine the probability mass function of X. [Hint: Label the three couples #1, #2, and #3 and the two individuals #4 and #5.]

b. Obtain the cumulative distribution function of X, and use it to calculate $P(2 \le X \le 6)$.

21. Suppose that you read through this year’s issues of the New York Times and record each number that appears in a news article: the income of a CEO, the number of cases of wine produced by a winery, the total charitable contribution of a politician during the previous tax year, the age of a celebrity, and so on. Now focus on the leading digit of each number, which could be 1, 2, . . . , 8, or 9. Your first thought might be that the leading digit X of a randomly selected number would be equally likely to be one of the nine possibilities (a discrete uniform distribution). However, much empirical evidence as well as some theoretical arguments suggest an alternative probability distribution called Benford’s law:

\[ p(x) = P(\text{1st digit is } x) = \log_{10}\!\left(\frac{x + 1}{x}\right) \qquad x = 1, 2, \ldots, 9 \]

a. Without computing individual probabilities from this formula, show that it specifies a legitimate pmf.

b. Now compute the individual probabilities and compare to the corresponding discrete uniform distribution.

c. Obtain the cdf of X.

d. Using the cdf, what is the probability that the leading digit is at most 3? At least 5?

[Note: Benford’s law is the basis for some auditing procedures used to detect fraud in financial reporting, for example, by the Internal Revenue Service.]

[Figure for Exercise 26: a diagram showing Alvie's home at 0 and the houses of his friends at A, B, C, and D]

22. Refer to Exercise 13, and calculate and graph the cdf F(x). Then use it to calculate the probabilities of the events given in parts (a)–(d) of that problem.

23. A consumer organization that evaluates new automobiles customarily reports the number of major defects in each car examined. Let X denote the number of major defects in a randomly selected car of a certain type. The cdf of X is as follows:

\[
F(x) = \begin{cases}
0 & x < 0 \\
.06 & 0 \le x < 1 \\
.19 & 1 \le x < 2 \\
.39 & 2 \le x < 3 \\
.67 & 3 \le x < 4 \\
.92 & 4 \le x < 5 \\
.97 & 5 \le x < 6 \\
1 & 6 \le x
\end{cases}
\]

Calculate the following probabilities directly from the cdf:

a. p(2), that is, $P(X = 2)$

b. $P(X > 3)$

c. $P(2 \le X \le 5)$

d. $P(2 < X < 5)$

24. An insurance company offers its policyholders a number of different premium payment options. For a randomly selected policyholder, let X = the number of months between successive payments. The cdf of X is as follows:

\[
F(x) = \begin{cases}
0 & x < 1 \\
.30 & 1 \le x < 3 \\
.40 & 3 \le x < 4 \\
.45 & 4 \le x < 6 \\
.60 & 6 \le x < 12 \\
1 & 12 \le x
\end{cases}
\]

a. What is the pmf of X?

b. Using just the cdf, compute $P(3 \le X \le 6)$ and $P(4 \le X)$.

25. In Example 3.12, let Y = the number of girls born before the experiment terminates. With $p = P(B)$ and $1 - p = P(G)$, what is the pmf of Y? [Hint: First list the possible values of Y, starting with the smallest, and proceed until you see a general formula.]

26. Alvie Singer lives at 0 in the accompanying diagram and has four friends who live at A, B, C, and D. One day Alvie decides to go visiting, so he tosses a fair coin twice to decide which of the four to visit. Once at a friend’s house, he will either return home or else proceed to one of the two adjacent houses (such as 0, A, or C when at B), with each of the three possibilities having probability $\tfrac{1}{3}$. In this way, Alvie continues to visit friends until he returns home.

a. Let X = the number of times that Alvie visits a friend. Derive the pmf of X.

b. Let Y = the number of straight-line segments that Alvie traverses (including those leading to and from 0). What is the pmf of Y?

c. Suppose that female friends live at A and C and male friends at B and D. If Z = the number of visits to female friends, what is the pmf of Z?

27. After all students have left the classroom, a statistics professor notices that four copies of the text were left under desks. At the beginning of the next lecture, the professor distributes the four books in a completely random fashion to each of the four students (1, 2, 3, and 4) who claim to have left books. One possible outcome is that 1 receives 2’s book, 2 receives 4’s book, 3 receives his or her own book, and 4 receives 1’s book. This outcome can be abbreviated as (2, 4, 3, 1).

a. List the other 23 possible outcomes.

b. Let X denote the number of students who receive their own book. Determine the pmf of X.

28. Show that the cdf F(x) is a nondecreasing function; that is, $x_1 < x_2$ implies that $F(x_1) \le F(x_2)$. Under what condition will $F(x_1) = F(x_2)$?

3.3 Expected Values

Consider a university having 15,000 students and let X = the number of courses for which a randomly selected student is registered. The pmf of X follows. Since $p(1) = .01$, we know that $(.01) \cdot (15{,}000) = 150$ of the students are registered for one course, and similarly for the other x values.

x                    1     2     3     4     5     6     7
p(x)                .01   .03   .13   .25   .39   .17   .02
Number registered   150   450  1950  3750  5850  2550   300        (3.6)


The average number of courses per student, or the average value of \(X\) in the population, results from computing the total number of courses taken by all students and dividing by the total number of students. Since each of 150 students is taking one course, these 150 contribute 150 courses to the total. Similarly, 450 students contribute 2(450) courses, and so on. The population average value of \(X\) is then

\[ \frac{1(150) + 2(450) + 3(1950) + \cdots + 7(300)}{15{,}000} = 4.57 \tag{3.7} \]

Since \(150/15{,}000 = .01 = p(1)\), \(450/15{,}000 = .03 = p(2)\), and so on, an alternative expression for (3.7) is

\[ 1 \cdot p(1) + 2 \cdot p(2) + \cdots + 7 \cdot p(7) \tag{3.8} \]

Expression (3.8) shows that to compute the population average value of \(X\), we need only the possible values of \(X\) along with their probabilities (proportions). In particular, the population size is irrelevant as long as the pmf is given by (3.6). The average or mean value of \(X\) is then a weighted average of the possible values 1, . . . , 7, where the weights are the probabilities of those values.

The Expected Value of X

DEFINITION

Let \(X\) be a discrete rv with set of possible values \(D\) and pmf \(p(x)\). The expected value or mean value of \(X\), denoted by \(E(X)\) or \(\mu_X\) or just \(\mu\), is

\[ E(X) = \mu_X = \sum_{x \in D} x \cdot p(x) \]

Example 3.16

For the pmf of \(X\) = number of courses in (3.6),

\[
\begin{aligned}
\mu &= 1 \cdot p(1) + 2 \cdot p(2) + \cdots + 7 \cdot p(7) \\
    &= (1)(.01) + 2(.03) + \cdots + (7)(.02) \\
    &= .01 + .06 + .39 + 1.00 + 1.95 + 1.02 + .14 = 4.57
\end{aligned}
\]

If we think of the population as consisting of the \(X\) values 1, 2, . . . , 7, then \(\mu = 4.57\) is the population mean. In the sequel, we will often refer to \(\mu\) as the population mean rather than the mean of \(X\) in the population. Notice that \(\mu\) here is not 4, the ordinary average of 1, . . . , 7, because the distribution puts more weight on 4, 5, and 6 than on other \(X\) values. ■

In Example 3.16, the expected value \(\mu\) was 4.57, which is not a possible value of \(X\). The word expected should be interpreted with caution because one would not expect to see an \(X\) value of 4.57 when a single student is selected.

Example 3.17

Just after birth, each newborn child is rated on a scale called the Apgar scale. The possible ratings are 0, 1, . . . , 10, with the child's rating determined by color, muscle tone, respiratory effort, heartbeat, and reflex irritability (the best possible score is 10). Let \(X\) be the Apgar score of a randomly selected child born at a certain hospital during the next year, and suppose that the pmf of \(X\) is

x       0     1     2     3     4     5     6     7     8     9    10
p(x)  .002  .001  .002  .005   .02   .04   .18   .37   .25   .12   .01


Then the mean value of \(X\) is

\[
\begin{aligned}
E(X) = \mu &= 0(.002) + 1(.001) + 2(.002) + \cdots + 8(.25) + 9(.12) + 10(.01) \\
           &= 7.15
\end{aligned}
\]

Again, \(\mu\) is not a possible value of the variable \(X\). Also, because the variable relates to a future child, there is no concrete existing population to which \(\mu\) refers. Instead, we think of the pmf as a model for a conceptual population consisting of the values 0, 1, 2, . . . , 10. The mean value of this conceptual population is then \(\mu = 7.15\). ■

Example 3.18

Let \(X = 1\) if a randomly selected vehicle passes an emissions test and \(X = 0\) otherwise. Then \(X\) is a Bernoulli rv with pmf \(p(1) = p\) and \(p(0) = 1 - p\), from which

\[ E(X) = 0 \cdot p(0) + 1 \cdot p(1) = 0(1 - p) + 1(p) = p \]

That is, the expected value of \(X\) is just the probability that \(X\) takes on the value 1. If we conceptualize a population consisting of 0s in proportion \(1 - p\) and 1s in proportion \(p\), then the population average is \(\mu = p\). ■

Example 3.19

The general form for the pmf of \(X\) = number of children born up to and including the first boy is

\[ p(x) = \begin{cases} p(1 - p)^{x-1} & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \]

From the definition,

\[ E(X) = \sum_{D} x \cdot p(x) = \sum_{x=1}^{\infty} x p (1 - p)^{x-1} = p \sum_{x=1}^{\infty} \left[ -\frac{d}{dp} (1 - p)^x \right] \tag{3.9} \]

If we interchange the order of taking the derivative and the summation, the sum is that of a geometric series. After the sum is computed, the derivative is taken, and the final result is \(E(X) = 1/p\). If \(p\) is near 1, we expect to see a boy very soon, whereas if \(p\) is near 0, we expect many births before the first boy. For \(p = .5\), \(E(X) = 2\). ■

There is another frequently used interpretation of \(\mu\). Consider observing a first value \(x_1\) of \(X\), then a second value \(x_2\), a third value \(x_3\), and so on. After doing this a large number of times, calculate the sample average of the observed \(x_i\)'s. This average will typically be quite close to \(\mu\). That is, \(\mu\) can be interpreted as the long-run average observed value of \(X\) when the experiment is performed repeatedly. In Example 3.17, the long-run average Apgar score is \(\mu = 7.15\).

Example 3.20

Let \(X\), the number of interviews a student has prior to getting a job, have pmf

\[ p(x) = \begin{cases} k/x^2 & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \]

where \(k\) is chosen so that \(\sum_{x=1}^{\infty} (k/x^2) = 1\). (In a mathematics course on infinite series, it is shown that \(\sum_{x=1}^{\infty} (1/x^2) < \infty\), which implies that such a \(k\) exists, but its exact value need not concern us.) The expected value of \(X\) is

\[ \mu = E(X) = \sum_{x=1}^{\infty} x \cdot \frac{k}{x^2} = k \sum_{x=1}^{\infty} \frac{1}{x} \tag{3.10} \]

The sum on the right of Equation (3.10) is the famous harmonic series of mathematics and can be shown to equal \(\infty\). \(E(X)\) is not finite here because \(p(x)\) does not decrease sufficiently fast as \(x\) increases; statisticians say that the probability distribution of \(X\) has "a heavy tail." If a sequence of \(X\) values is chosen using this distribution, the sample average will not settle down to some finite number but will tend to grow without bound.

Statisticians use the phrase "heavy tails" in connection with any distribution having a large amount of probability far from \(\mu\) (so heavy tails do not require \(\mu = \infty\)). Such heavy tails make it difficult to make inferences about \(\mu\). ■
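The contrast between a finite and an infinite expected value can be seen numerically by truncating the sum of x · p(x) at larger and larger upper limits. The sketch below does this for the geometric-type pmf p(1 − p)^(x−1) with p = .5 and for the pmf k/x². It assumes k = 6/π², which follows from the standard result that the sum of 1/x² is π²/6; the text itself leaves k unspecified. The function names are illustrative.

```python
import math

def partial_mean(pmf, upper):
    """Truncated expected value: sum of x * p(x) for x = 1, ..., upper."""
    return sum(x * pmf(x) for x in range(1, upper + 1))

p = 0.5  # probability that a birth is a boy

def geometric_pmf(x):
    """pmf of the number of births up to and including the first boy."""
    return p * (1 - p) ** (x - 1)

k = 6 / math.pi ** 2  # assumed normalizing constant, so that sum k/x^2 = 1

def inverse_square_pmf(x):
    """Heavy-tailed pmf k / x^2."""
    return k / x ** 2

for upper in (10, 1_000, 100_000):
    print(upper,
          partial_mean(geometric_pmf, upper),
          partial_mean(inverse_square_pmf, upper))
```

The first column of sums settles at 1/p = 2, while the second keeps increasing (roughly like k times the natural log of the upper limit), which is the numerical signature of an infinite mean.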

The Expected Value of a Function

Sometimes interest will focus on the expected value of some function \(h(X)\) rather than on just \(E(X)\).

Example 3.21

Suppose a bookstore purchases ten copies of a book at $6.00 each to sell at $12.00 with the understanding that at the end of a 3-month period any unsold copies can be redeemed for $2.00. If \(X\) = the number of copies sold, then net revenue \(= h(X) = 12X + 2(10 - X) - 60 = 10X - 40\). What then is the expected net revenue? ■

An easy way of computing the expected value of \(h(X)\) is suggested by the following example.

Example 3.22

The cost of a certain vehicle diagnostic test depends on the number of cylinders \(X\) in the vehicle's engine. Suppose the cost function is given by \(h(X) = 20 + 3X + .5X^2\). Since \(X\) is a random variable, so is \(Y = h(X)\). The pmf of \(X\) and derived pmf of \(Y\) are as follows:

x      4    6    8              y     40   56   76
p(x)  .5   .3   .2     =>      p(y)   .5   .3   .2

With \(D^*\) denoting possible values of \(Y\),

\[
\begin{aligned}
E(Y) = E[h(X)] &= \sum_{D^*} y \cdot p(y) \\
               &= (40)(.5) + (56)(.3) + (76)(.2) \\
               &= h(4) \cdot (.5) + h(6) \cdot (.3) + h(8) \cdot (.2) \\
               &= \sum_{D} h(x) \cdot p(x)
\end{aligned}
\tag{3.11}
\]

According to Equation (3.11), it was not necessary to determine the pmf of \(Y\) to obtain \(E(Y)\); instead, the desired expected value is a weighted average of the possible \(h(x)\) (rather than \(x\)) values. ■

PROPOSITION

If the rv \(X\) has a set of possible values \(D\) and pmf \(p(x)\), then the expected value of any function \(h(X)\), denoted by \(E[h(X)]\) or \(\mu_{h(X)}\), is computed by

\[ E[h(X)] = \sum_{D} h(x) \cdot p(x) \]
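The proposition can be illustrated with the numbers of Example 3.22. The sketch below computes E[h(X)] by two routes: first by deriving the pmf of Y = h(X) and averaging over y, then by averaging h(x) directly against the pmf of X. The names `pmf_x`, `pmf_y`, and `h` are illustrative choices.

```python
from collections import defaultdict

# pmf of X = number of cylinders (Example 3.22)
pmf_x = {4: 0.5, 6: 0.3, 8: 0.2}

def h(x):
    """Cost of the diagnostic test as a function of the cylinder count."""
    return 20 + 3 * x + 0.5 * x ** 2

# Route 1: derive the pmf of Y = h(X), then take the weighted average of y.
pmf_y = defaultdict(float)
for x, prob in pmf_x.items():
    pmf_y[h(x)] += prob  # accumulate in case several x map to the same y
mean_via_y = sum(y * prob for y, prob in pmf_y.items())

# Route 2: weighted average of h(x) using the pmf of X directly.
mean_via_x = sum(h(x) * prob for x, prob in pmf_x.items())

print(dict(pmf_y))
print(mean_via_y, mean_via_x)
```

Both routes give an expected cost of 52; the second skips the derived pmf entirely, which matters most when h is not one-to-one or D is large.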


That is, \(E[h(X)]\) is computed in the same way that \(E(X)\) itself is, except that \(h(x)\) is substituted in place of \(x\).

Example 3.23

A computer store has purchased three computers of a certain type at $500 apiece. It will sell them for $1000 apiece. The manufacturer has agreed to repurchase any computers still unsold after a specified period at $200 apiece. Let \(X\) denote the number of computers sold, and suppose that \(p(0) = .1\), \(p(1) = .2\), \(p(2) = .3\), and \(p(3) = .4\). With \(h(X)\) denoting the profit associated with selling \(X\) units, the given information implies that \(h(X) = \text{revenue} - \text{cost} = 1000X + 200(3 - X) - 1500 = 800X - 900\). The expected profit is then

\[
\begin{aligned}
E[h(X)] &= h(0) \cdot p(0) + h(1) \cdot p(1) + h(2) \cdot p(2) + h(3) \cdot p(3) \\
        &= (-900)(.1) + (-100)(.2) + (700)(.3) + (1500)(.4) \\
        &= \$700
\end{aligned}
\]
■

Rules of Expected Value

The \(h(X)\) function of interest is quite frequently a linear function \(aX + b\). In this case, \(E[h(X)]\) is easily computed from \(E(X)\).

PROPOSITION

\[ E(aX + b) = a \cdot E(X) + b \]

(Or, using alternative notation, \(\mu_{aX+b} = a \cdot \mu_X + b\))

To paraphrase, the expected value of a linear function equals the linear function evaluated at the expected value \(E(X)\). Since \(h(X)\) in Example 3.23 is linear and \(E(X) = 2\), \(E[h(X)] = 800(2) - 900 = \$700\), as before.

Proof

\[
\begin{aligned}
E(aX + b) &= \sum_{D} (ax + b) \cdot p(x) = a \sum_{D} x \cdot p(x) + b \sum_{D} p(x) \\
          &= aE(X) + b
\end{aligned}
\]
■

Two special cases of the proposition yield two important rules of expected value.

1. For any constant \(a\), \(E(aX) = a \cdot E(X)\) (take \(b = 0\)).   (3.12)
2. For any constant \(b\), \(E(X + b) = E(X) + b\) (take \(a = 1\)).

Multiplication of \(X\) by a constant \(a\) typically changes the unit of measurement, for example, from inches to cm, where \(a = 2.54\). Rule 1 says that the expected value in the new units equals the expected value in the old units multiplied by the conversion factor \(a\). Similarly, if a constant \(b\) is added to each possible value of \(X\), then the expected value will be shifted by that same constant amount.

The Variance of X

The expected value of \(X\) describes where the probability distribution is centered. Using the physical analogy of placing point mass \(p(x)\) at the value \(x\) on a one-dimensional axis, if the axis were then supported by a fulcrum placed at \(\mu\), there would be no tendency for the axis to tilt. This is illustrated for two different distributions in Figure 3.7.
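The fulcrum analogy can be checked numerically: the net moment of the point masses about a pivot c is the sum of (x − c) · p(x), and it vanishes exactly when c = μ. The sketch below uses the pmf from Example 3.23 as a convenient test case; the function name `net_moment` is an illustrative choice, not notation from the text.

```python
# pmf of X = number of computers sold (Example 3.23)
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

# Expected value: the candidate balance point
mu = sum(x * prob for x, prob in pmf.items())

def net_moment(pivot):
    """Sum of (x - pivot) * p(x): signed tendency of the axis to tilt."""
    return sum((x - pivot) * prob for x, prob in pmf.items())

print(mu)
print(net_moment(mu))   # essentially zero: the axis balances at mu
print(net_moment(1.5))  # positive: mass to the right of the pivot dominates
```

Algebraically the moment about c equals μ − c, so any pivot other than μ leaves the axis unbalanced.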


Figure 3.7 Two different probability distributions with \(\mu = 4\) [figure not reproduced; panels (a) and (b) each plot p(x) against x]

Although both distributions pictured in Figure 3.7 have the same center \(\mu\), the distribution of Figure 3.7(b) has greater spread or variability or dispersion than does that of Figure 3.7(a). We will use the variance of \(X\) to assess the amount of variability in (the distribution of) \(X\), just as \(s^2\) was used in Chapter 1 to measure variability in a sample.

DEFINITION

Let \(X\) have pmf \(p(x)\) and expected value \(\mu\). Then the variance of \(X\), denoted by \(V(X)\) or \(\sigma_X^2\), or just \(\sigma^2\), is

\[ V(X) = \sum_{D} (x - \mu)^2 \cdot p(x) = E[(X - \mu)^2] \]

The standard deviation (SD) of \(X\) is

\[ \sigma_X = \sqrt{\sigma_X^2} \]

The quantity \(h(X) = (X - \mu)^2\) is the squared deviation of \(X\) from its mean, and \(\sigma^2\) is the expected squared deviation, i.e., the weighted average of squared deviations, where the weights are probabilities from the distribution. If most of the probability distribution is close to \(\mu\), then \(\sigma^2\) will be relatively small. However, if there are \(x\) values far from \(\mu\) that have large \(p(x)\), then \(\sigma^2\) will be quite large. Very roughly, \(\sigma\) can be interpreted as the size of a representative deviation from the mean value \(\mu\). So if \(\sigma = 10\), then in a long sequence of observed \(X\) values, some will deviate from \(\mu\) by more than 10 while others will be closer to the mean than that; a typical deviation from the mean will be something on the order of 10.

Example 3.24

A library has an upper limit of 6 on the number of videos that can be checked out to an individual at one time. Consider only those who check out videos, and let \(X\) denote the number of videos checked out to a randomly selected individual. The pmf of \(X\) is as follows:

x       1     2     3     4     5     6
p(x)   .30   .25   .15   .05   .10   .15

The expected value of \(X\) is easily seen to be \(\mu = 2.85\). The variance of \(X\) is then

\[
\begin{aligned}
V(X) = \sigma^2 &= \sum_{x=1}^{6} (x - 2.85)^2 \cdot p(x) \\
                &= (1 - 2.85)^2(.30) + (2 - 2.85)^2(.25) + \cdots + (6 - 2.85)^2(.15) = 3.2275
\end{aligned}
\]

The standard deviation of \(X\) is \(\sigma = \sqrt{3.2275} \approx 1.797\). ■
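The definition of variance translates directly into a short computation. The sketch below, using the video-checkout pmf of Example 3.24, computes μ, then the probability-weighted average of squared deviations, then its square root. Variable names are illustrative.

```python
import math

# pmf of X = number of videos checked out (Example 3.24)
pmf = {1: 0.30, 2: 0.25, 3: 0.15, 4: 0.05, 5: 0.10, 6: 0.15}

# Expected value: weighted average of the possible values
mu = sum(x * prob for x, prob in pmf.items())

# Variance from the definition: expected squared deviation from mu
variance = sum((x - mu) ** 2 * prob for x, prob in pmf.items())

# Standard deviation: positive square root of the variance
sd = math.sqrt(variance)

print(mu, variance, sd)
```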


When the pmf \(p(x)\) specifies a mathematical model for the distribution of population values, both \(\sigma^2\) and \(\sigma\) measure the spread of values in the population; \(\sigma^2\) is the population variance, and \(\sigma\) is the population standard deviation.

A Shortcut Formula for \(\sigma^2\)

The number of arithmetic operations necessary to compute \(\sigma^2\) can be reduced by using an alternative formula.

PROPOSITION

\[ V(X) = \sigma^2 = \left[ \sum_{D} x^2 \cdot p(x) \right] - \mu^2 = E(X^2) - [E(X)]^2 \]

In using this formula, \(E(X^2)\) is computed first without any subtraction; then \(E(X)\) is computed, squared, and subtracted (once) from \(E(X^2)\).

Example 3.25 (Example 3.24 continued)

The pmf of the number \(X\) of videos checked out was given as \(p(1) = .30\), \(p(2) = .25\), \(p(3) = .15\), \(p(4) = .05\), \(p(5) = .10\), and \(p(6) = .15\), from which \(\mu = 2.85\) and

\[ E(X^2) = \sum_{x=1}^{6} x^2 \cdot p(x) = (1^2)(.30) + (2^2)(.25) + \cdots + (6^2)(.15) = 11.35 \]

Thus \(\sigma^2 = 11.35 - (2.85)^2 = 3.2275\) as obtained previously from the definition. ■

Proof of the Shortcut Formula

Expand \((x - \mu)^2\) in the definition of \(\sigma^2\) to obtain \(x^2 - 2\mu x + \mu^2\), and then carry \(\sum\) through to each of the three terms:

\[
\begin{aligned}
\sigma^2 &= \sum_{D} x^2 \cdot p(x) - 2\mu \cdot \sum_{D} x \cdot p(x) + \mu^2 \sum_{D} p(x) \\
         &= E(X^2) - 2\mu \cdot \mu + \mu^2 = E(X^2) - \mu^2
\end{aligned}
\]
■

Rules of Variance

The variance of \(h(X)\) is the expected value of the squared difference between \(h(X)\) and its expected value:

\[ V[h(X)] = \sigma_{h(X)}^2 = \sum_{D} \{ h(x) - E[h(X)] \}^2 \cdot p(x) \tag{3.13} \]

When \(h(X) = aX + b\), a linear function,

\[ h(x) - E[h(X)] = ax + b - (a\mu + b) = a(x - \mu) \]

Substituting this into (3.13) gives a simple relationship between \(V[h(X)]\) and \(V(X)\):

PROPOSITION

\[ V(aX + b) = \sigma_{aX+b}^2 = a^2 \cdot \sigma_X^2 \quad \text{and} \quad \sigma_{aX+b} = |a| \cdot \sigma_X \]

In particular,

\[ \sigma_{aX} = |a| \cdot \sigma_X, \qquad \sigma_{X+b} = \sigma_X \tag{3.14} \]


The absolute value is necessary because \(a\) might be negative, yet a standard deviation cannot be. Usually multiplication by \(a\) corresponds to a change in the unit of measurement (e.g., kg to lb or dollars to euros). According to the first relation in (3.14), the sd in the new unit is the original sd multiplied by the conversion factor. The second relation says that adding or subtracting a constant does not impact variability; it just rigidly shifts the distribution to the right or left.

Example 3.26

In the computer sales scenario of Example 3.23, \(E(X) = 2\) and

\[ E(X^2) = (0)^2(.1) + (1)^2(.2) + (2)^2(.3) + (3)^2(.4) = 5 \]

so \(V(X) = 5 - (2)^2 = 1\). The profit function \(h(X) = 800X - 900\) then has variance \((800)^2 \cdot V(X) = (640{,}000)(1) = 640{,}000\) and standard deviation 800. ■
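The shortcut formula and the rule for the variance of a linear function can both be confirmed on the computer-sales pmf. The sketch below computes V(X) by the shortcut, then computes the mean and variance of the profit 800X − 900 directly from the definition so they can be compared with a·E(X) + b and a²·V(X). The helper `expectation` is a hypothetical convenience function, not notation from the text.

```python
import math

# pmf of X = number of computers sold (Examples 3.23 and 3.26)
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
a, b = 800, -900  # profit is h(X) = aX + b

def expectation(g):
    """E[g(X)] = sum of g(x) * p(x) over the possible values of X."""
    return sum(g(x) * prob for x, prob in pmf.items())

def profit(x):
    return a * x + b

# Shortcut formula: V(X) = E(X^2) - [E(X)]^2
mean_x = expectation(lambda x: x)
var_x = expectation(lambda x: x ** 2) - mean_x ** 2

# Mean and variance of the profit, computed from the definition
mean_profit = expectation(profit)
var_profit = expectation(lambda x: (profit(x) - mean_profit) ** 2)
sd_profit = math.sqrt(var_profit)

print(mean_x, var_x)
print(mean_profit, a * mean_x + b)    # both sides of E(aX + b) = aE(X) + b
print(var_profit, a ** 2 * var_x)     # both sides of V(aX + b) = a^2 V(X)
print(sd_profit, abs(a) * math.sqrt(var_x))
```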

EXERCISES Section 3.3 (29–45)

29. The pmf of the amount of memory \(X\) (GB) in a purchased flash drive was given in Example 3.13 as

x       1     2     4     8    16
p(x)   .05   .10   .35   .40   .10

Compute the following:
a. \(E(X)\)
b. \(V(X)\) directly from the definition
c. The standard deviation of \(X\)
d. \(V(X)\) using the shortcut formula

30. An individual who has automobile insurance from a certain company is randomly selected. Let \(Y\) be the number of moving violations for which the individual was cited during the last 3 years. The pmf of \(Y\) is

y       0     1     2     3
p(y)   .60   .25   .10   .05

a. Compute \(E(Y)\).
b. Suppose an individual with \(Y\) violations incurs a surcharge of \(\$100Y^2\). Calculate the expected amount of the surcharge.

31. Refer to Exercise 12 and calculate \(V(Y)\) and \(\sigma_Y\). Then determine the probability that \(Y\) is within 1 standard deviation of its mean value.

32. An appliance dealer sells three different models of upright freezers having 13.5, 15.9, and 19.1 cubic feet of storage space, respectively. Let \(X\) = the amount of storage space purchased by the next customer to buy a freezer. Suppose that \(X\) has pmf

x      13.5   15.9   19.1
p(x)    .2     .5     .3

a. Compute \(E(X)\), \(E(X^2)\), and \(V(X)\).
b. If the price of a freezer having capacity \(X\) cubic feet is \(25X - 8.5\), what is the expected price paid by the next customer to buy a freezer?
c. What is the variance of the price \(25X - 8.5\) paid by the next customer?
d. Suppose that although the rated capacity of a freezer is \(X\), the actual capacity is \(h(X) = X - .01X^2\). What is the expected actual capacity of the freezer purchased by the next customer?

33. Let \(X\) be a Bernoulli rv with pmf as in Example 3.18.
a. Compute \(E(X^2)\).
b. Show that \(V(X) = p(1 - p)\).
c. Compute \(E(X^{79})\).

34. Suppose that the number of plants of a particular type found in a rectangular sampling region (called a quadrat by ecologists) in a certain geographic area is an rv \(X\) with pmf

\[ p(x) = \begin{cases} c/x^3 & x = 1, 2, 3, \ldots \\ 0 & \text{otherwise} \end{cases} \]

Is \(E(X)\) finite? Justify your answer (this is another distribution that statisticians would call heavy-tailed).

35. A small market orders copies of a certain magazine for its magazine rack each week. Let \(X\) = demand for the magazine, with pmf

x        1      2      3      4      5      6
p(x)   1/15   2/15   3/15   4/15   3/15   2/15

Suppose the store owner actually pays $2.00 for each copy of the magazine and the price to customers is $4.00. If magazines left at the end of the week have no salvage value, is it better to order three or four copies of the magazine? [Hint: For both three and four copies ordered, express net revenue as a function of demand \(X\), and then compute the expected revenue.]


36. Let \(X\) be the damage incurred (in $) in a certain type of accident during a given year. Possible \(X\) values are 0, 1000, 5000, and 10000, with probabilities .8, .1, .08, and .02, respectively. A particular company offers a $500 deductible policy. If the company wishes its expected profit to be $100, what premium amount should it charge?

37. The \(n\) candidates for a job have been ranked 1, 2, 3, . . . , \(n\). Let \(X\) = the rank of a randomly selected candidate, so that \(X\) has pmf

\[ p(x) = \begin{cases} 1/n & x = 1, 2, 3, \ldots, n \\ 0 & \text{otherwise} \end{cases} \]

(this is called the discrete uniform distribution). Compute \(E(X)\) and \(V(X)\) using the shortcut formula. [Hint: The sum of the first \(n\) positive integers is \(n(n + 1)/2\), whereas the sum of their squares is \(n(n + 1)(2n + 1)/6\).]

38. Let \(X\) = the outcome when a fair die is rolled once. If before the die is rolled you are offered either (1/3.5) dollars or \(h(X) = 1/X\) dollars, would you accept the guaranteed amount or would you gamble? [Note: It is not generally true that \(1/E(X) = E(1/X)\).]

39. A chemical supply company currently has in stock 100 lb of a certain chemical, which it sells to customers in 5-lb batches. Let \(X\) = the number of batches ordered by a randomly chosen customer, and suppose that \(X\) has pmf

x       1    2    3    4
p(x)   .2   .4   .3   .1

Compute \(E(X)\) and \(V(X)\). Then compute the expected number of pounds left after the next customer's order is shipped and the variance of the number of pounds left. [Hint: The number of pounds left is a linear function of \(X\).]

40. a. Draw a line graph of the pmf of \(X\) in Exercise 35. Then determine the pmf of \(-X\) and draw its line graph. From these two pictures, what can you say about \(V(X)\) and \(V(-X)\)?
b. Use the proposition involving \(V(aX + b)\) to establish a general relationship between \(V(X)\) and \(V(-X)\).

41. Use the definition in Expression (3.13) to prove that \(V(aX + b) = a^2 \cdot \sigma_X^2\). [Hint: With \(h(X) = aX + b\), \(E[h(X)] = a\mu + b\) where \(\mu = E(X)\).]

42. Suppose \(E(X) = 5\) and \(E[X(X - 1)] = 27.5\). What is
a. \(E(X^2)\)? [Hint: \(E[X(X - 1)] = E[X^2 - X] = E(X^2) - E(X)\)]
b. \(V(X)\)?
c. The general relationship among the quantities \(E(X)\), \(E[X(X - 1)]\), and \(V(X)\)?

43. Write a general rule for \(E(X - c)\) where \(c\) is a constant. What happens when you let \(c = \mu\), the expected value of \(X\)?

44. A result called Chebyshev's inequality states that for any probability distribution of an rv \(X\) and any number \(k\) that is at least 1, \(P(|X - \mu| \ge k\sigma) \le 1/k^2\). In words, the probability that the value of \(X\) lies at least \(k\) standard deviations from its mean is at most \(1/k^2\).
a. What is the value of the upper bound for \(k = 2\)? \(k = 3\)? \(k = 4\)? \(k = 5\)? \(k = 10\)?
b. Compute \(\mu\) and \(\sigma\) for the distribution of Exercise 13. Then evaluate \(P(|X - \mu| \ge k\sigma)\) for the values of \(k\) given in part (a). What does this suggest about the upper bound relative to the corresponding probability?
c. Let \(X\) have possible values \(-1\), 0, and 1, with probabilities \(\frac{1}{18}\), \(\frac{8}{9}\), and \(\frac{1}{18}\), respectively. What is \(P(|X - \mu| \ge 3\sigma)\), and how does it compare to the corresponding bound?
d. Give a distribution for which \(P(|X - \mu| \ge 5\sigma) = .04\).

45. If \(a \le X \le b\), show that \(a \le E(X) \le b\).

3.4 The Binomial Probability Distribution

There are many experiments that conform either exactly or approximately to the following list of requirements:

1. The experiment consists of a sequence of n smaller experiments called trials, where n is fixed in advance of the experiment.

2. Each trial can result in one of the same two possible outcomes (dichotomous trials), which we generically denote by success (S) and failure (F).

3. The trials are independent, so that the outcome on any particular trial does not influence the outcome on any other trial.

4. The probability of success P(S) is constant from trial to trial; we denote this probability by p.

DEFINITION An experiment for which Conditions 1–4 are satisfied is called a binomial experiment.


Example 3.27

The same coin is tossed successively and independently \(n\) times. We arbitrarily use S to denote the outcome H (heads) and F to denote the outcome T (tails). Then this experiment satisfies Conditions 1–4. Tossing a thumbtack \(n\) times, with S = point up and F = point down, also results in a binomial experiment. ■

Many experiments involve a sequence of independent trials for which there are more than two possible outcomes on any one trial. A binomial experiment can then be created by dividing the possible outcomes into two groups.

Example 3.28

The color of pea seeds is determined by a single genetic locus. If the two alleles at this locus are AA or Aa (the genotype), then the pea will be yellow (the phenotype), and if the allele is aa, the pea will be green. Suppose we pair off 20 Aa seeds and cross the two seeds in each of the ten pairs to obtain ten new genotypes. Call each new genotype a success S if it is aa and a failure otherwise. Then with this identification of S and F, the experiment is binomial with \(n = 10\) and \(p = P(aa \text{ genotype})\). If each member of the pair is equally likely to contribute a or A, then \(p = P(a) \cdot P(a) = \left(\frac{1}{2}\right)\left(\frac{1}{2}\right) = \frac{1}{4}\). ■

Example 3.29

Suppose a certain city has 50 licensed restaurants, of which 15 currently have at least one serious health code violation and the other 35 have no serious violations. There are five inspectors, each of whom will inspect one restaurant during the coming week. The name of each restaurant is written on a different slip of paper, and after the slips are thoroughly mixed, each inspector in turn draws one of the slips without replacement. Label the \(i\)th trial as a success if the \(i\)th restaurant selected \((i = 1, \ldots, 5)\) has no serious violations. Then

\[ P(S \text{ on first trial}) = \frac{35}{50} = .70 \]

and

\[
\begin{aligned}
P(S \text{ on second trial}) &= P(SS) + P(FS) \\
  &= P(\text{second } S \mid \text{first } S)P(\text{first } S) + P(\text{second } S \mid \text{first } F)P(\text{first } F) \\
  &= \frac{34}{49} \cdot \frac{35}{50} + \frac{35}{49} \cdot \frac{15}{50} = \frac{35}{50}\left(\frac{34}{49} + \frac{15}{49}\right) = \frac{35}{50} = .70
\end{aligned}
\]

Similarly, it can be shown that \(P(S \text{ on } i\text{th trial}) = .70\) for \(i = 3, 4, 5\). However,

\[ P(S \text{ on fifth trial} \mid SSSS) = \frac{31}{46} = .67 \]

whereas

\[ P(S \text{ on fifth trial} \mid FFFF) = \frac{35}{46} = .76 \]

The experiment is not binomial because the trials are not independent. In general, if sampling is without replacement, the experiment will not yield independent trials. If each slip had been replaced after being drawn, then trials would have been independent, but this might have resulted in the same restaurant being inspected by more than one inspector. ■
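The probabilities in Example 3.29 can be reproduced with exact rational arithmetic, which makes the contrast between the constant unconditional probability and the varying conditional probabilities easy to see. This is a minimal sketch; the variable names (`total`, `good`, and so on) are illustrative.

```python
from fractions import Fraction

total = 50  # licensed restaurants
good = 35   # restaurants with no serious violations (a draw of one is an S)

# Unconditional probability of S on the first draw
p_first = Fraction(good, total)

# Law of total probability for the second draw: P(SS) + P(FS)
p_second = (Fraction(good - 1, total - 1) * Fraction(good, total)
            + Fraction(good, total - 1) * Fraction(total - good, total))

# Conditional probabilities for the fifth draw after four S's or four F's
p_fifth_given_ssss = Fraction(good - 4, total - 4)
p_fifth_given_ffff = Fraction(good, total - 4)

print(p_first, p_second)
print(p_fifth_given_ssss, float(p_fifth_given_ssss))
print(p_fifth_given_ffff, float(p_fifth_given_ffff))
```

The first two values agree at 7/10, while the last two (31/46 and 35/46) differ, which is exactly the failure of independence described in the example.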


Example 3.30

A certain state has 500,000 licensed drivers, of whom 400,000 are insured. A sample of 10 drivers is chosen without replacement. The \(i\)th trial is labeled S if the \(i\)th driver chosen is insured. Although this situation would seem identical to that of Example 3.29, the important difference is that the size of the population being sampled is very large relative to the sample size. In this case

\[ P(S \text{ on } 2 \mid S \text{ on } 1) = \frac{399{,}999}{499{,}999} = .80000 \]

and

\[ P(S \text{ on } 10 \mid S \text{ on first } 9) = \frac{399{,}991}{499{,}991} = .799996 \approx .80000 \]

These calculations suggest that although the trials are not exactly independent, the conditional probabilities differ so slightly from one another that for practical purposes the trials can be regarded as independent with constant \(P(S) = .8\). Thus, to a very good approximation, the experiment is binomial with \(n = 10\) and \(p = .8\). ■

We will use the following rule of thumb in deciding whether a "without-replacement" experiment can be treated as a binomial experiment.

RULE

Consider sampling without replacement from a dichotomous population of size \(N\). If the sample size (number of trials) \(n\) is at most 5% of the population size, the experiment can be analyzed as though it were exactly a binomial experiment.

By "analyzed," we mean that probabilities based on the binomial experiment assumptions will be quite close to the actual "without-replacement" probabilities, which are typically more difficult to calculate. In Example 3.29, \(n/N = 5/50 = .1 > .05\), so the binomial experiment is not a good approximation, but in Example 3.30, \(n/N = 10/500{,}000 < .05\).

The Binomial Random Variable and Distribution

In most binomial experiments, it is the total number of S's, rather than knowledge of exactly which trials yielded S's, that is of interest.

DEFINITION

The binomial random variable \(X\) associated with a binomial experiment consisting of \(n\) trials is defined as

\(X\) = the number of S's among the \(n\) trials

Suppose, for example, that \(n = 3\). Then there are eight possible outcomes for the experiment:

SSS  SSF  SFS  SFF  FSS  FSF  FFS  FFF

From the definition of \(X\), \(X(SSF) = 2\), \(X(SFF) = 1\), and so on. Possible values for \(X\) in an \(n\)-trial experiment are \(x = 0, 1, 2, \ldots, n\). We will often write \(X \sim \text{Bin}(n, p)\) to indicate that \(X\) is a binomial rv based on \(n\) trials with success probability \(p\).
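The mapping from outcomes to values of the binomial rv can be made concrete by enumeration. The sketch below lists every S/F sequence for n = 3 and records the number of S's in each; the names `outcomes` and `x_of` are illustrative.

```python
from itertools import product

n = 3  # number of trials

# All sequences of S and F of length n, e.g. "SSF"
outcomes = ["".join(seq) for seq in product("SF", repeat=n)]

# X assigns to each outcome the number of S's it contains
x_of = {outcome: outcome.count("S") for outcome in outcomes}

for outcome, x in x_of.items():
    print(outcome, x)

# Possible values of X: 0, 1, ..., n
print(sorted(set(x_of.values())))
```

Changing `n` gives the 2^n outcomes of any small binomial experiment, though the list grows too quickly for enumeration to be practical beyond modest n.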


NOTATION

Because the pmf of a binomial rv X depends on the two parameters n and p, we denote the pmf by b(x; n, p).

Consider first the case $n = 4$, for which each outcome, its probability, and corresponding x value are listed in Table 3.1. For example,

\[
\begin{aligned}
P(SSFS) &= P(S) \cdot P(S) \cdot P(F) \cdot P(S) && \text{(independent trials)} \\
&= p \cdot p \cdot (1 - p) \cdot p && \text{(constant } P(S)\text{)} \\
&= p^3 \cdot (1 - p)
\end{aligned}
\]

Table 3.1 Outcomes and Probabilities for a Binomial Experiment with Four Trials

Outcome   x   Probability          Outcome   x   Probability
SSSS      4   $p^4$                FSSS      3   $p^3(1 - p)$
SSSF      3   $p^3(1 - p)$         FSSF      2   $p^2(1 - p)^2$
SSFS      3   $p^3(1 - p)$         FSFS      2   $p^2(1 - p)^2$
SSFF      2   $p^2(1 - p)^2$       FSFF      1   $p(1 - p)^3$
SFSS      3   $p^3(1 - p)$         FFSS      2   $p^2(1 - p)^2$
SFSF      2   $p^2(1 - p)^2$       FFSF      1   $p(1 - p)^3$
SFFS      2   $p^2(1 - p)^2$       FFFS      1   $p(1 - p)^3$
SFFF      1   $p(1 - p)^3$         FFFF      0   $(1 - p)^4$

In this special case, we wish b(x; 4, p) for $x = 0, 1, 2, 3$, and 4. For b(3; 4, p), let’s identify which of the 16 outcomes yield an x value of 3 and sum the probabilities associated with each such outcome:

\[ b(3; 4, p) = P(FSSS) + P(SFSS) + P(SSFS) + P(SSSF) = 4p^3(1 - p) \]

There are four outcomes with $X = 3$ and each has probability $p^3(1 - p)$ (the order of S’s and F’s is not important, but only the number of S’s), so

\[ b(3; 4, p) = \left\{ \text{number of outcomes with } X = 3 \right\} \cdot \left\{ \text{probability of any particular outcome with } X = 3 \right\} \]

Similarly, $b(2; 4, p) = 6p^2(1 - p)^2$, which is also the product of the number of outcomes with $X = 2$ and the probability of any such outcome.

In general,

\[ b(x; n, p) = \left\{ \text{number of sequences of length } n \text{ consisting of } x \text{ S’s} \right\} \cdot \left\{ \text{probability of any particular such sequence} \right\} \]

Since the ordering of S’s and F’s is not important, the second factor in the previous equation is $p^x(1 - p)^{n - x}$ (e.g., the first x trials resulting in S and the last $n - x$ resulting in F). The first factor is the number of ways of choosing x of the n trials to be S’s, that is, the number of combinations of size x that can be constructed from n distinct objects (trials here).

THEOREM

\[
b(x; n, p) =
\begin{cases}
\dbinom{n}{x} p^x (1 - p)^{n - x} & x = 0, 1, 2, \ldots, n \\
0 & \text{otherwise}
\end{cases}
\]
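The binomial pmf can be evaluated directly from its formula. The following is a minimal Python sketch, not part of the text: the function name `binom_pmf` and the value p = 0.3 are illustrative assumptions, and the checks simply compare the formula against the hand-derived four-trial results $4p^3(1 - p)$ and $6p^2(1 - p)^2$.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n independent trials."""
    if x < 0 or x > n:
        return 0.0  # the pmf is 0 outside x = 0, 1, ..., n
    return comb(n, x) * p**x * (1 - p) ** (n - x)


if __name__ == "__main__":
    p = 0.3  # arbitrary illustrative success probability (an assumption)
    # b(3; 4, p) should agree with the hand-derived 4 p^3 (1 - p)
    print(binom_pmf(3, 4, p), 4 * p**3 * (1 - p))
    # b(2; 4, p) should agree with 6 p^2 (1 - p)^2
    print(binom_pmf(2, 4, p), 6 * p**2 * (1 - p) ** 2)
    # the pmf sums to 1 over x = 0, ..., 4
    print(sum(binom_pmf(x, 4, p) for x in range(5)))
```

Using `math.comb` keeps the binomial coefficient exact as an integer, so rounding enters only through the powers of p and 1 - p.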


Example 3.31

Each of six randomly selected cola drinkers is given a glass containing cola S and one containing cola F. The glasses are identical in appearance except for a code on the bottom to identify the cola. Suppose there is actually no tendency among cola drinkers to prefer one cola to the other. Then $p = P(\text{a selected individual prefers } S) = .5$, so with $X$ = the number among the six who prefer S, $X \sim \text{Bin}(6, .5)$.

Thus

\[ P(X = 3) = b(3; 6, .5) = \binom{6}{3}(.5)^3(.5)^3 = 20(.5)^6 = .313 \]

The probability that at least three prefer S is

\[ P(3 \le X) = \sum_{x=3}^{6} b(x; 6, .5) = \sum_{x=3}^{6} \binom{6}{x}(.5)^x(.5)^{6-x} = .656 \]

and the probability that at most one prefers S is

\[ P(X \le 1) = \sum_{x=0}^{1} b(x; 6, .5) = .109 \]

Using Binomial Tables*

Even for a relatively small value of n, the computation of binomial probabilities can be tedious. Appendix Table A.1 tabulates the cdf $F(x) = P(X \le x)$ for $n = 5, 10, 15, 20, 25$ in combination with selected values of p. Various other probabilities can then be calculated using the proposition on cdf’s from Section 3.2. A table entry of 0 signifies only that the probability is 0 to three significant digits since all table entries are actually positive.

NOTATION

For $X \sim \text{Bin}(n, p)$, the cdf will be denoted by

\[ B(x; n, p) = P(X \le x) = \sum_{y=0}^{x} b(y; n, p) \qquad x = 0, 1, \ldots, n \]
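The cdf notation translates directly into a summation. The sketch below is illustrative rather than the text's method: the helper names `binom_pmf`, `binom_cdf`, and `prob_between` are assumptions. It recomputes the three cola-preference probabilities for X ~ Bin(6, .5) and shows how an interval probability reduces to a difference of two cdf values.

```python
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p) for integer x; 0 outside 0..n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x) if 0 <= x <= n else 0.0


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x), summing the pmf over y = 0, ..., x."""
    return sum(binom_pmf(y, n, p) for y in range(0, min(x, n) + 1))


def prob_between(a, b, n, p):
    """P(a <= X <= b) = B(b; n, p) - B(a - 1; n, p) for integers a <= b."""
    return binom_cdf(b, n, p) - binom_cdf(a - 1, n, p)


if __name__ == "__main__":
    # Cola example: X ~ Bin(6, .5)
    print(binom_pmf(3, 6, 0.5))      # P(X = 3)
    print(1 - binom_cdf(2, 6, 0.5))  # P(3 <= X) = 1 - P(X <= 2)
    print(binom_cdf(1, 6, 0.5))      # P(X <= 1)
```

Note that the lower cdf argument in `prob_between` is a - 1, not a, because the event a ≤ X includes the value a itself.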

Example 3.32

Suppose that 20% of all copies of a particular textbook fail a certain binding strength test. Let X denote the number among 15 randomly selected copies that fail the test. Then X has a binomial distribution with $n = 15$ and $p = .2$.

1. The probability that at most 8 fail the test is

\[ P(X \le 8) = \sum_{y=0}^{8} b(y; 15, .2) = B(8; 15, .2) \]

which is the entry in the $x = 8$ row and the $p = .2$ column of the $n = 15$ binomial table. From Appendix Table A.1, the probability is $B(8; 15, .2) = .999$.

2. The probability that exactly 8 fail is

\[ P(X = 8) = P(X \le 8) - P(X \le 7) = B(8; 15, .2) - B(7; 15, .2) \]

which is the difference between two consecutive entries in the $p = .2$ column. The result is $.999 - .996 = .003$.

* Statistical software packages such as Minitab and R will provide the pmf or cdf almost instantaneously upon request for any value of p and n ranging from 2 up into the millions. There is also an R command for calculating the probability that X lies in some interval.


3. The probability that at least 8 fail is

\[
\begin{aligned}
P(X \ge 8) &= 1 - P(X \le 7) = 1 - B(7; 15, .2) \\
&= 1 - (\text{entry in } x = 7 \text{ row of } p = .2 \text{ column}) \\
&= 1 - .996 = .004
\end{aligned}
\]

4. Finally, the probability that between 4 and 7, inclusive, fail is

\[
\begin{aligned}
P(4 \le X \le 7) &= P(X = 4, 5, 6, \text{ or } 7) = P(X \le 7) - P(X \le 3) \\
&= B(7; 15, .2) - B(3; 15, .2) = .996 - .648 = .348
\end{aligned}
\]

Notice that this latter probability is the difference between entries in the $x = 7$ and $x = 3$ rows, not the $x = 7$ and $x = 4$ rows. ■

Example 3.33

An electronics manufacturer claims that at most 10% of its power supply units need service during the warranty period. To investigate this claim, technicians at a testing laboratory purchase 20 units and subject each one to accelerated testing to simulate use during the warranty period. Let p denote the probability that a power supply unit needs repair during the period (the proportion of all such units that need repair). The laboratory technicians must decide whether the data resulting from the experiment supports the claim that $p \le .10$. Let X denote the number among the 20 sampled that need repair, so $X \sim \text{Bin}(20, p)$. Consider the decision rule:

Reject the claim that $p \le .10$ in favor of the conclusion that $p > .10$ if $x \ge 5$ (where x is the observed value of X), and consider the claim plausible if $x \le 4$.

The probability that the claim is rejected when $p = .10$ (an incorrect conclusion) is

\[ P(X \ge 5 \text{ when } p = .10) = 1 - B(4; 20, .1) = 1 - .957 = .043 \]

The probability that the claim is not rejected when $p = .20$ (a different type of incorrect conclusion) is

\[ P(X \le 4 \text{ when } p = .2) = B(4; 20, .2) = .630 \]

The first probability is rather small, but the second is intolerably large. When $p = .20$, so that the manufacturer has grossly understated the percentage of units that need service, and the stated decision rule is used, 63% of all samples will result in the manufacturer’s claim being judged plausible!

One might think that the probability of this second type of erroneous conclusion could be made smaller by changing the cutoff value 5 in the decision rule to something else. However, although replacing 5 by a smaller number would yield a probability smaller than .630, the other probability would then increase. The only way to make both “error probabilities” small is to base the decision rule on an experiment involving many more units. ■

The Mean and Variance of X

For $n = 1$, the binomial distribution becomes the Bernoulli distribution. From Example 3.18, the mean value of a Bernoulli variable is $\mu = p$, so the expected number of S’s on any single trial is p. Since a binomial experiment consists of n trials, intuition suggests that for $X \sim \text{Bin}(n, p)$, $E(X) = np$, the product of the number of trials and the probability of success on a single trial. The expression for V(X) is not so intuitive.
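The intuition that a binomial experiment averages np successes can be probed by simulation. The sketch below is an illustration only: the helper names, the seed, the number of repetitions, and the values n = 20 and p = .10 are all assumptions chosen for the demonstration. It repeats the experiment many times and compares the average success count with np.

```python
import random


def simulate_binomial(n, p, rng):
    """Count successes in n independent trials, each a success with probability p."""
    return sum(1 for _ in range(n) if rng.random() < p)


def average_successes(n, p, experiments, seed=0):
    """Average number of successes over many simulated binomial experiments."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    total = sum(simulate_binomial(n, p, rng) for _ in range(experiments))
    return total / experiments


if __name__ == "__main__":
    n, p = 20, 0.10  # illustrative values (assumptions)
    print(average_successes(n, p, experiments=20_000), "vs np =", n * p)
```

A simulation of this kind only suggests the result; the exact statement E(X) = np requires the summation argument.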


PROPOSITION

If $X \sim \text{Bin}(n, p)$, then $E(X) = np$, $V(X) = np(1 - p) = npq$, and $\sigma_X = \sqrt{npq}$ (where $q = 1 - p$).

Thus, calculating the mean and variance of a binomial rv does not necessitate evaluating summations. The proof of the result for E(X) is sketched in Exercise 64.
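The closed-form expressions can be compared with the summations they replace. The following sketch assumes the illustrative values n = 10 and p = .75 and hypothetical helper names. It computes the mean and variance directly from the pmf, using $V(X) = E(X^2) - [E(X)]^2$, and prints them next to np and np(1 - p).

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """b(x; n, p) for x = 0, 1, ..., n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def moments_by_summation(n, p):
    """Return (mean, variance) computed directly from the pmf."""
    mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
    second = sum(x * x * binom_pmf(x, n, p) for x in range(n + 1))
    return mean, second - mean**2  # V(X) = E(X^2) - [E(X)]^2


if __name__ == "__main__":
    n, p = 10, 0.75  # illustrative values
    mean, var = moments_by_summation(n, p)
    print(mean, "vs np =", n * p)
    print(var, "vs np(1 - p) =", n * p * (1 - p))
    print("standard deviation:", sqrt(var))
```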

Example 3.34

If 75% of all purchases at a certain store are made with a credit card and X is the number among ten randomly selected purchases made with a credit card, then $X \sim \text{Bin}(10, .75)$. Thus $E(X) = np = (10)(.75) = 7.5$, $V(X) = npq = 10(.75)(.25) = 1.875$, and $\sigma = \sqrt{1.875} = 1.37$. Again, even though X can take on only integer values, E(X) need not be an integer. If we perform a large number of independent binomial experiments, each with $n = 10$ trials and $p = .75$, then the average number of S’s per experiment will be close to 7.5.

The probability that X is within 1 standard deviation of its mean value is $P(7.5 - 1.37 \le X \le 7.5 + 1.37) = P(6.13 \le X \le 8.87) = P(X = 7 \text{ or } 8) = .532$.

EXERCISES Section 3.4 (46–67)

46. Compute the following binomial probabilities directly from the formula for b(x; n, p):
a. b(3; 8, .35)
b. b(5; 8, .6)
c. $P(3 \le X \le 5)$ when $n = 7$ and $p = .6$
d. $P(1 \le X)$ when $n = 9$ and $p = .1$

47. Use Appendix Table A.1 to obtain the following probabilities:
a. B(4; 15, .3)
b. b(4; 15, .3)
c. b(6; 15, .7)
d. $P(2 \le X \le 4)$ when $X \sim \text{Bin}(15, .3)$
e. $P(2 \le X)$ when $X \sim \text{Bin}(15, .3)$
f. $P(X \le 1)$ when $X \sim \text{Bin}(15, .7)$
g. $P(2 < X < 6)$ when $X \sim \text{Bin}(15, .3)$

48. When circuit boards used in the manufacture of compact disc players are tested, the long-run percentage of defectives is 5%. Let $X$ = the number of defective boards in a random sample of size $n = 25$, so $X \sim \text{Bin}(25, .05)$.
a. Determine $P(X \le 2)$.
b. Determine $P(X \ge 5)$.
c. Determine $P(1 \le X \le 4)$.
d. What is the probability that none of the 25 boards is defective?
e. Calculate the expected value and standard deviation of X.

49. A company that produces fine crystal knows from experience that 10% of its goblets have cosmetic flaws and must be classified as “seconds.”
a. Among six randomly selected goblets, how likely is it that only one is a second?
b. Among six randomly selected goblets, what is the probability that at least two are seconds?
c. If goblets are examined one by one, what is the probability that at most five must be selected to find four that are not seconds?

50. A particular telephone number is used to receive both voice calls and fax messages. Suppose that 25% of the incoming calls involve fax messages, and consider a sample of 25 incoming calls. What is the probability that
a. At most 6 of the calls involve a fax message?
b. Exactly 6 of the calls involve a fax message?
c. At least 6 of the calls involve a fax message?
d. More than 6 of the calls involve a fax message?

51. Refer to the previous exercise.
a. What is the expected number of calls among the 25 that involve a fax message?
b. What is the standard deviation of the number among the 25 calls that involve a fax message?
c. What is the probability that the number of calls among the 25 that involve a fax transmission exceeds the expected number by more than 2 standard deviations?

52. Suppose that 30% of all students who have to buy a text for a particular course want a new copy (the successes!), whereas the other 70% want a used copy. Consider randomly selecting 25 purchasers.
a. What are the mean value and standard deviation of the number who want a new copy of the book?
b. What is the probability that the number who want new copies is more than two standard deviations away from the mean value?


c. The bookstore has 15 new copies and 15 used copies in stock. If 25 people come in one by one to purchase this text, what is the probability that all 25 will get the type of book they want from current stock? [Hint: Let $X$ = the number who want a new copy. For what values of X will all 25 get what they want?]
d. Suppose that new copies cost $100 and used copies cost $70. Assume the bookstore currently has 50 new copies and 50 used copies. What is the expected value of total revenue from the sale of the next 25 copies purchased? Be sure to indicate what rule of expected value you are using. [Hint: Let $h(X)$ = the revenue when X of the 25 purchasers want new copies. Express this as a linear function.]

53. Exercise 30 (Section 3.3) gave the pmf of Y, the number of traffic citations for a randomly selected individual insured by a particular company. What is the probability that among 15 randomly chosen such individuals
a. At least 10 have no citations?
b. Fewer than half have at least one citation?
c. The number that have at least one citation is between 5 and 10, inclusive?*

54. A particular type of tennis racket comes in a midsize version and an oversize version. Sixty percent of all customers at a certain store want the oversize version.
a. Among ten randomly selected customers who want this type of racket, what is the probability that at least six want the oversize version?
b. Among ten randomly selected customers, what is the probability that the number who want the oversize version is within 1 standard deviation of the mean value?
c. The store currently has seven rackets of each version. What is the probability that all of the next ten customers who want this racket can get the version they want from current stock?

55. Twenty percent of all telephones of a certain type are submitted for service while under warranty. Of these, 60% can be repaired, whereas the other 40% must be replaced with new units. If a company purchases ten of these telephones, what is the probability that exactly two will end up being replaced under warranty?

56. The College Board reports that 2% of the 2 million high school students who take the SAT each year receive special accommodations because of documented disabilities (Los Angeles Times, July 16, 2002). Consider a random sample of 25 students who have recently taken the test.
a. What is the probability that exactly 1 received a special accommodation?
b. What is the probability that at least 1 received a special accommodation?
c. What is the probability that at least 2 received a special accommodation?
d. What is the probability that the number among the 25 who received a special accommodation is within 2 standard deviations of the number you would expect to be accommodated?

e. Suppose that a student who does not receive a special accommodation is allowed 3 hours for the exam, whereas an accommodated student is allowed 4.5 hours. What would you expect the average time allowed the 25 selected students to be?

57. Suppose that 90% of all batteries from a certain supplier have acceptable voltages. A certain type of flashlight requires two type-D batteries, and the flashlight will work only if both its batteries have acceptable voltages. Among ten randomly selected flashlights, what is the probability that at least nine will work? What assumptions did you make in the course of answering the question posed?

58. A very large batch of components has arrived at a distributor. The batch can be characterized as acceptable only if the proportion of defective components is at most .10. The distributor decides to randomly select 10 components and to accept the batch only if the number of defective components in the sample is at most 2.
a. What is the probability that the batch will be accepted when the actual proportion of defectives is .01? .05? .10? .20? .25?
b. Let p denote the actual proportion of defectives in the batch. A graph of P(batch is accepted) as a function of p, with p on the horizontal axis and P(batch is accepted) on the vertical axis, is called the operating characteristic curve for the acceptance sampling plan. Use the results of part (a) to sketch this curve for $0 \le p \le 1$.
c. Repeat parts (a) and (b) with “1” replacing “2” in the acceptance sampling plan.
d. Repeat parts (a) and (b) with “15” replacing “10” in the acceptance sampling plan.
e. Which of the three sampling plans, that of part (a), (c), or (d), appears most satisfactory, and why?

59. An ordinance requiring that a smoke detector be installed in all previously constructed houses has been in effect in a particular city for 1 year. The fire department is concerned that many houses remain without detectors. Let $p$ = the true proportion of such houses having detectors, and suppose that a random sample of 25 homes is inspected. If the sample strongly indicates that fewer than 80% of all houses have a detector, the fire department will campaign for a mandatory inspection program. Because of the costliness of the program, the department prefers not to call for such inspections unless sample evidence strongly argues for their necessity. Let X denote the number of homes with detectors among the 25 sampled. Consider rejecting the claim that $p \ge .8$ if $x \le 15$.
a. What is the probability that the claim is rejected when the actual value of p is .8?
b. What is the probability of not rejecting the claim when $p = .7$? When $p = .6$?
c. How do the “error probabilities” of parts (a) and (b) change if the value 15 in the decision rule is replaced by 14?

* “Between a and b, inclusive” is equivalent to $(a \le X \le b)$.


60. A toll bridge charges $1.00 for passenger cars and $2.50 for other vehicles. Suppose that during daytime hours, 60% of all vehicles are passenger cars. If 25 vehicles cross the bridge during a particular daytime period, what is the resulting expected toll revenue? [Hint: Let $X$ = the number of passenger cars; then the toll revenue h(X) is a linear function of X.]

61. A student who is trying to write a paper for a course has a choice of two topics, A and B. If topic A is chosen, the student will order two books through interlibrary loan, whereas if topic B is chosen, the student will order four books. The student believes that a good paper necessitates receiving and using at least half the books ordered for either topic chosen. If the probability that a book ordered through interlibrary loan actually arrives in time is .9 and books arrive independently of one another, which topic should the student choose to maximize the probability of writing a good paper? What if the arrival probability is only .5 instead of .9?

62. a. For fixed n, are there values of p $(0 \le p \le 1)$ for which $V(X) = 0$? Explain why this is so.
b. For what value of p is V(X) maximized? [Hint: Either graph V(X) as a function of p or else take a derivative.]

63. a. Show that $b(x; n, 1 - p) = b(n - x; n, p)$.
b. Show that $B(x; n, 1 - p) = 1 - B(n - x - 1; n, p)$. [Hint: At most x S’s is equivalent to at least $(n - x)$ F’s.]
c. What do parts (a) and (b) imply about the necessity of including values of p greater than .5 in Appendix Table A.1?

64. Show that $E(X) = np$ when X is a binomial random variable. [Hint: First express E(X) as a sum with lower limit $x = 1$. Then factor out np, let $y = x - 1$ so that the sum is from $y = 0$ to $y = n - 1$, and show that the sum equals 1.]

65. Customers at a gas station pay with a credit card (A), debit card (B), or cash (C). Assume that successive customers make independent choices, with $P(A) = .5$, $P(B) = .2$, and $P(C) = .3$.
a. Among the next 100 customers, what are the mean and variance of the number who pay with a debit card? Explain your reasoning.
b. Answer part (a) for the number among the 100 who don’t pay with cash.

66. An airport limousine can accommodate up to four passengers on any one trip. The company will accept a maximum of six reservations for a trip, and a passenger must have a reservation. From previous records, 20% of all those making reservations do not appear for the trip. Answer the following questions, assuming independence wherever appropriate.
a. If six reservations are made, what is the probability that at least one individual with a reservation cannot be accommodated on the trip?
b. If six reservations are made, what is the expected number of available places when the limousine departs?
c. Suppose the probability distribution of the number of reservations made is given in the accompanying table.

Number of reservations   3    4    5    6
Probability              .1   .2   .3   .4

Let X denote the number of passengers on a randomly selected trip. Obtain the probability mass function of X.

67. Refer to Chebyshev’s inequality given in Exercise 44. Calculate $P(|X - \mu| \ge k\sigma)$ for $k = 2$ and $k = 3$ when $X \sim \text{Bin}(20, .5)$, and compare to the corresponding upper bound. Repeat for $X \sim \text{Bin}(20, .75)$.

3.5 Hypergeometric and Negative Binomial Distributions

The hypergeometric and negative binomial distributions are both related to the binomial distribution. The binomial distribution is the approximate probability model for sampling without replacement from a finite dichotomous (S–F) population provided the sample size n is small relative to the population size N; the hypergeometric distribution is the exact probability model for the number of S’s in the sample. The binomial rv X is the number of S’s when the number n of trials is fixed, whereas the negative binomial distribution arises from fixing the number of S’s desired and letting the number of trials be random.

The Hypergeometric Distribution

The assumptions leading to the hypergeometric distribution are as follows:

1. The population or set to be sampled consists of N individuals, objects, or elements (a finite population).


2. Each individual can be characterized as a success (S) or a failure (F), and there are M successes in the population.

3. A sample of n individuals is selected without replacement in such a way that each subset of size n is equally likely to be chosen.

The random variable of interest is $X$ = the number of S’s in the sample. The probability distribution of X depends on the parameters n, M, and N, so we wish to obtain $P(X = x) = h(x; n, M, N)$.

Example 3.35

During a particular period a university’s information technology office received 20 service orders for problems with printers, of which 8 were laser printers and 12 were inkjet models. A sample of 5 of these service orders is to be selected for inclusion in a customer satisfaction survey. Suppose that the 5 are selected in a completely random fashion, so that any particular subset of size 5 has the same chance of being selected as does any other subset. What then is the probability that exactly x ($x = 0, 1, 2, 3, 4$, or 5) of the selected service orders were for inkjet printers?

Here, the population size is $N = 20$, the sample size is $n = 5$, and the number of S’s (inkjet = S) and F’s in the population are $M = 12$ and $N - M = 8$, respectively. Consider the value $x = 2$. Because all outcomes (each consisting of 5 particular orders) are equally likely,

\[ P(X = 2) = h(2; 5, 12, 20) = \frac{\text{number of outcomes having } X = 2}{\text{number of possible outcomes}} \]

The number of possible outcomes in the experiment is the number of ways of selecting 5 from the 20 objects without regard to order, that is, $\binom{20}{5}$. To count the number of outcomes having $X = 2$, note that there are $\binom{12}{2}$ ways of selecting 2 of the inkjet orders, and for each such way there are $\binom{8}{3}$ ways of selecting the 3 laser orders to fill out the sample. The product rule from Chapter 2 then gives $\binom{12}{2}\binom{8}{3}$ as the number of outcomes with $X = 2$, so

\[ h(2; 5, 12, 20) = \frac{\binom{12}{2}\binom{8}{3}}{\binom{20}{5}} = \frac{77}{323} = .238 \]

In general, if the sample size n is smaller than the number of successes in the population (M), then the largest possible X value is n. However, if $M < n$ (e.g., a sample size of 25 and only 15 successes in the population), then X can be at most M. Similarly, whenever the number of population failures $(N - M)$ exceeds the sample size, the smallest possible X value is 0 (since all sampled individuals might then be failures). However, if $N - M < n$, the smallest possible X value is $n - (N - M)$. Thus, the possible values of X satisfy the restriction $\max(0, n - (N - M)) \le x \le \min(n, M)$. An argument parallel to that of the previous example gives the pmf of X.

PROPOSITION

If X is the number of S’s in a completely random sample of size n drawn from a population consisting of M S’s and $(N - M)$ F’s, then the probability distribution of X, called the hypergeometric distribution, is given by

\[ P(X = x) = h(x; n, M, N) = \frac{\binom{M}{x}\binom{N - M}{n - x}}{\binom{N}{n}} \tag{3.15} \]

for x, an integer, satisfying $\max(0, n - N + M) \le x \le \min(n, M)$.
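The hypergeometric pmf is a ratio of three binomial coefficients, so it is easy to evaluate exactly. The sketch below is a minimal illustration with an assumed function name, `hypergeom_pmf`. It applies the support restriction on x and recomputes the printer-order probability h(2; 5, 12, 20) for comparison with 77/323.

```python
from math import comb


def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N): probability of exactly x successes in a sample of size n
    drawn without replacement from N items, M of which are successes."""
    lo, hi = max(0, n - (N - M)), min(n, M)
    if x < lo or x > hi:
        return 0.0  # outside the possible values of X
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


if __name__ == "__main__":
    # Printer service orders: n = 5 sampled from N = 20, with M = 12 inkjet orders
    print(hypergeom_pmf(2, 5, 12, 20), "vs 77/323 =", 77 / 323)
    # the pmf sums to 1 over the possible values x = 0, ..., 5
    print(sum(hypergeom_pmf(x, 5, 12, 20) for x in range(6)))
```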


In Example 3.35, $n = 5$, $M = 12$, and $N = 20$, so h(x; 5, 12, 20) for $x = 0, 1, 2, 3, 4, 5$ can be obtained by substituting these numbers into Equation (3.15).

Example 3.36

Five individuals from an animal population thought to be near extinction in a certain region have been caught, tagged, and released to mix into the population. After they have had an opportunity to mix, a random sample of 10 of these animals is selected. Let $X$ = the number of tagged animals in the second sample. If there are actually 25 animals of this type in the region, what is the probability that (a) $X = 2$? (b) $X \le 2$?

The parameter values are $n = 10$, $M = 5$ (5 tagged animals in the population), and $N = 25$, so

\[ h(x; 10, 5, 25) = \frac{\binom{5}{x}\binom{20}{10 - x}}{\binom{25}{10}} \qquad x = 0, 1, 2, 3, 4, 5 \]

For part (a),

\[ P(X = 2) = h(2; 10, 5, 25) = \frac{\binom{5}{2}\binom{20}{8}}{\binom{25}{10}} = .385 \]

For part (b),

\[ P(X \le 2) = P(X = 0, 1, \text{ or } 2) = \sum_{x=0}^{2} h(x; 10, 5, 25) = .057 + .257 + .385 = .699 \]

Various statistical software packages will easily generate hypergeometric probabilities (tabulation is cumbersome because of the three parameters).

As in the binomial case, there are simple expressions for E(X) and V(X) for hypergeometric rv’s.

PROPOSITION

The mean and variance of the hypergeometric rv X having pmf h(x; n, M, N) are

\[ E(X) = n \cdot \frac{M}{N} \qquad V(X) = \left( \frac{N - n}{N - 1} \right) \cdot n \cdot \frac{M}{N} \cdot \left( 1 - \frac{M}{N} \right) \]
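The mean and variance formulas can be checked against direct summation over the pmf. This is a hedged sketch with assumed helper names, using the animal-tagging parameters n = 10, M = 5, N = 25. It recomputes the probabilities for X = 2 and X ≤ 2 and compares the summed moments with the closed-form expressions.

```python
from math import comb


def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N); 0 outside max(0, n - N + M) <= x <= min(n, M)."""
    if x < max(0, n - (N - M)) or x > min(n, M):
        return 0.0
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def hypergeom_moments(n, M, N):
    """(mean, variance) by direct summation over the possible values of X."""
    xs = range(max(0, n - (N - M)), min(n, M) + 1)
    mean = sum(x * hypergeom_pmf(x, n, M, N) for x in xs)
    var = sum((x - mean) ** 2 * hypergeom_pmf(x, n, M, N) for x in xs)
    return mean, var


def hypergeom_moments_formula(n, M, N):
    """(mean, variance) from the closed-form expressions."""
    p = M / N
    return n * p, (N - n) / (N - 1) * n * p * (1 - p)


if __name__ == "__main__":
    n, M, N = 10, 5, 25
    print(hypergeom_pmf(2, n, M, N))                         # P(X = 2)
    print(sum(hypergeom_pmf(x, n, M, N) for x in range(3)))  # P(X <= 2)
    print(hypergeom_moments(n, M, N), hypergeom_moments_formula(n, M, N))
```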

The ratio M/N is the proportion of S’s in the population. If we replace M/N by p in E(X) and V(X), we get

\[ E(X) = np \qquad V(X) = \left( \frac{N - n}{N - 1} \right) \cdot np(1 - p) \tag{3.16} \]

Expression (3.16) shows that the means of the binomial and hypergeometric rv’s are equal, whereas the variances of the two rv’s differ by the factor $(N - n)/(N - 1)$, often called the finite population correction factor. This factor is less than 1, so the hypergeometric variable has smaller variance than does the binomial rv. The correction factor can be written $(1 - n/N)/(1 - 1/N)$, which is approximately 1 when n is small relative to N.

Example 3.37 (Example 3.36 continued)

In the animal-tagging example, $n = 10$, $M = 5$, and $N = 25$, so $p = 5/25 = .2$ and

\[ E(X) = 10(.2) = 2 \]

\[ V(X) = \frac{15}{24}(10)(.2)(.8) = (.625)(1.6) = 1 \]

If the sampling was carried out with replacement, $V(X) = 1.6$.

Suppose the population size N is not actually known, so the value x is observed and we wish to estimate N. It is reasonable to equate the observed sample proportion of S’s, x/n, with the population proportion, M/N, giving the estimate

\[ \hat{N} = \frac{M \cdot n}{x} \]

If $M = 100$, $n = 40$, and $x = 16$, then $\hat{N} = 250$. ■

Our general rule of thumb in Section 3.4 stated that if sampling was without replacement but n/N was at most .05, then the binomial distribution could be used to compute approximate probabilities involving the number of S’s in the sample. A more precise statement is as follows: Let the population size, N, and number of population S’s, M, get large with the ratio M/N approaching p. Then h(x; n, M, N) approaches b(x; n, p); so for n/N small, the two are approximately equal provided that p is not too near either 0 or 1. This is the rationale for the rule.

The Negative Binomial Distribution

The negative binomial rv and distribution are based on an experiment satisfying the following conditions:

1. The experiment consists of a sequence of independent trials.

2. Each trial can result in either a success (S) or a failure (F).

3. The probability of success is constant from trial to trial, so $P(S \text{ on trial } i) = p$ for $i = 1, 2, 3, \ldots$.

4. The experiment continues (trials are performed) until a total of r successes have been observed, where r is a specified positive integer.

The random variable of interest is $X$ = the number of failures that precede the rth success; X is called a negative binomial random variable because, in contrast to the binomial rv, the number of successes is fixed and the number of trials is random.

Possible values of X are 0, 1, 2, . . . . Let nb(x; r, p) denote the pmf of X. Consider $nb(7; 3, p) = P(X = 7)$, the probability that exactly 7 F's occur before the 3rd S. In order for this to happen, the 10th trial must be an S and there must be exactly 2 S's among the first 9 trials. Thus

\[ nb(7; 3, p) = \left\{ \binom{9}{2} \cdot p^2 (1 - p)^7 \right\} \cdot p = \binom{9}{2} \cdot p^3 (1 - p)^7 \]

Generalizing this line of reasoning gives the following formula for the negative binomial pmf.


PROPOSITION

The pmf of the negative binomial rv $X$ with parameters $r$ = number of S's and $p = P(S)$ is

$$nb(x; r, p) = \binom{x + r - 1}{r - 1} p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$

Example 3.38

A pediatrician wishes to recruit 5 couples, each of whom is expecting their first child, to participate in a new natural childbirth regimen. Let $p = P(\text{a randomly selected couple agrees to participate})$. If $p = .2$, what is the probability that 15 couples must be asked before 5 are found who agree to participate? That is, with $S = \{\text{agrees to participate}\}$, what is the probability that 10 F's occur before the fifth S? Substituting $r = 5$, $p = .2$, and $x = 10$ into $nb(x; r, p)$ gives

$$nb(10; 5, .2) = \binom{14}{4} (.2)^5 (.8)^{10} = .034$$

The probability that at most 10 F's are observed (at most 15 couples are asked) is

$$P(X \le 10) = \sum_{x=0}^{10} nb(x; 5, .2) = (.2)^5 \sum_{x=0}^{10} \binom{x + 4}{4} (.8)^x = .164$$

In some sources, the negative binomial rv is taken to be the number of trials $X + r$ rather than the number of failures.

In the special case $r = 1$, the pmf is

$$nb(x; 1, p) = (1 - p)^x p \qquad x = 0, 1, 2, \ldots \qquad (3.17)$$

In Example 3.12, we derived the pmf for the number of trials necessary to obtain the first S, and the pmf there is similar to Expression (3.17). Both $X$ = number of F's and $Y$ = number of trials ($= 1 + X$) are referred to in the literature as geometric random variables, and the pmf in Expression (3.17) is called the geometric distribution.

The expected number of trials until the first S was shown in Example 3.19 to be $1/p$, so that the expected number of F's until the first S is $(1/p) - 1 = (1 - p)/p$. Intuitively, we would expect to see $r \cdot (1 - p)/p$ F's before the rth S, and this is indeed $E(X)$. There is also a simple formula for $V(X)$.

PROPOSITION

If $X$ is a negative binomial rv with pmf $nb(x; r, p)$, then

$$E(X) = \frac{r(1 - p)}{p} \qquad V(X) = \frac{r(1 - p)}{p^2}$$

Finally, by expanding the binomial coefficient in front of $p^r (1 - p)^x$ and doing some cancellation, it can be seen that $nb(x; r, p)$ is well defined even when r is not an integer. This generalized negative binomial distribution has been found to fit observed data quite well in a wide variety of applications.
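The negative binomial calculations in Example 3.38 can be checked numerically. The following is a minimal Python sketch, not part of the text: the function name `nb_pmf` and the variable names are illustrative assumptions, and only the standard library is used. It evaluates the pmf from the proposition and the mean and variance formulas for $r = 5$, $p = .2$.

```python
from math import comb

def nb_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): probability of exactly x failures before the r-th success.

    Hypothetical helper implementing nb(x; r, p) for integer r only.
    """
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# Parameters of Example 3.38: recruit r = 5 couples, each agreeing with p = .2
r, p = 5, 0.2

# Probability that exactly 10 F's occur before the fifth S
p_exactly_10 = nb_pmf(10, r, p)

# Probability that at most 10 F's occur (at most 15 couples are asked)
p_at_most_10 = sum(nb_pmf(x, r, p) for x in range(11))

# Mean and variance from the proposition: r(1 - p)/p and r(1 - p)/p^2
mean_failures = r * (1 - p) / p
var_failures = r * (1 - p) / p**2

print(round(p_exactly_10, 3), round(p_at_most_10, 3))
print(mean_failures, var_failures)
```

Rounded to three decimals, the two probabilities should agree with the values .034 and .164 given in the example. The sketch does not cover the generalized case of non-integer r, which would need the gamma function in place of the binomial coefficient.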


EXERCISES Section 3.5 (68–78)

68. An electronics store has received a shipment of 20 table radios that have connections for an iPod or iPhone. Twelve of these have two slots (so they can accommodate both devices), and the other eight have a single slot. Suppose that six of the 20 radios are randomly selected to be stored under a shelf where the radios are displayed, and the remaining ones are placed in a storeroom. Let $X$ = the number among the radios stored under the display shelf that have two slots.
a. What kind of a distribution does X have (name and values of all parameters)?
b. Compute $P(X = 2)$, $P(X \le 2)$, and $P(X \ge 2)$.
c. Calculate the mean value and standard deviation of X.

69. Each of 12 refrigerators of a certain type has been returned to a distributor because of an audible, high-pitched, oscillating noise when the refrigerators are running. Suppose that 7 of these refrigerators have a defective compressor and the other 5 have less serious problems. If the refrigerators are examined in random order, let X be the number among the first 6 examined that have a defective compressor. Compute the following:
a. $P(X = 5)$
b. $P(X \le 4)$
c. The probability that X exceeds its mean value by more than 1 standard deviation.
d. Consider a large shipment of 400 refrigerators, of which 40 have defective compressors. If X is the number among 15 randomly selected refrigerators that have defective compressors, describe a less tedious way to calculate (at least approximately) $P(X \le 5)$ than to use the hypergeometric pmf.

70. An instructor who taught two sections of engineering statistics last term, the first with 20 students and the second with 30, decided to assign a term project. After all projects had been turned in, the instructor randomly ordered them before grading. Consider the first 15 graded projects.
a. What is the probability that exactly 10 of these are from the second section?
b. What is the probability that at least 10 of these are from the second section?
c. What is the probability that at least 10 of these are from the same section?
d. What are the mean value and standard deviation of the number among these 15 that are from the second section?
e. What are the mean value and standard deviation of the number of projects not among these first 15 that are from the second section?

71. A geologist has collected 10 specimens of basaltic rock and 10 specimens of granite. The geologist instructs a laboratory assistant to randomly select 15 of the specimens for analysis.
a. What is the pmf of the number of granite specimens selected for analysis?
b. What is the probability that all specimens of one of the two types of rock are selected for analysis?
c. What is the probability that the number of granite specimens selected for analysis is within 1 standard deviation of its mean value?

72. A personnel director interviewing 11 senior engineers for four job openings has scheduled six interviews for the first day and five for the second day of interviewing. Assume that the candidates are interviewed in random order.
a. What is the probability that x of the top four candidates are interviewed on the first day?
b. How many of the top four candidates can be expected to be interviewed on the first day?

73. Twenty pairs of individuals playing in a bridge tournament have been seeded 1, . . . , 20. In the first part of the tournament, the 20 are randomly divided into 10 east–west pairs and 10 north–south pairs.
a. What is the probability that x of the top 10 pairs end up playing east–west?
b. What is the probability that all of the top five pairs end up playing the same direction?
c. If there are 2n pairs, what is the pmf of $X$ = the number among the top n pairs who end up playing east–west? What are $E(X)$ and $V(X)$?

74. A second-stage smog alert has been called in a certain area of Los Angeles County in which there are 50 industrial firms. An inspector will visit 10 randomly selected firms to check for violations of regulations.
a. If 15 of the firms are actually violating at least one regulation, what is the pmf of the number of firms visited by the inspector that are in violation of at least one regulation?
b. If there are 500 firms in the area, of which 150 are in violation, approximate the pmf of part (a) by a simpler pmf.
c. For $X$ = the number among the 10 visited that are in violation, compute $E(X)$ and $V(X)$ both for the exact pmf and the approximating pmf in part (b).

75. Suppose that $p = P(\text{male birth}) = .5$. A couple wishes to have exactly two female children in their family. They will have children until this condition is fulfilled.
a. What is the probability that the family has x male children?
b. What is the probability that the family has four children?
c. What is the probability that the family has at most four children?
d. How many male children would you expect this family to have? How many children would you expect this family to have?


76. A family decides to have children until it has three children of the same gender. Assuming $P(B) = P(G) = .5$, what is the pmf of $X$ = the number of children in the family?

77. Three brothers and their wives decide to have children until each family has two female children. What is the pmf of $X$ = the total number of male children born to the brothers? What is $E(X)$, and how does it compare to the expected number of male children born to each brother?

78. According to the article “Characterizing the Severity and Risk of Drought in the Poudre River, Colorado” (J. of Water Res. Planning and Mgmnt., 2005: 383–393), the drought length Y is the number of consecutive time intervals in which the water supply remains below a critical value $y_0$ (a deficit), preceded by and followed by periods in which the supply exceeds this critical value (a surplus). The cited paper proposes a geometric distribution with $p = .409$ for this random variable.
a. What is the probability that a drought lasts exactly 3 intervals? At most 3 intervals?
b. What is the probability that the length of a drought exceeds its mean value by at least one standard deviation?

3.6 The Poisson Probability Distribution

The binomial, hypergeometric, and negative binomial distributions were all derived by starting with an experiment consisting of trials or draws and applying the laws of probability to various outcomes of the experiment. There is no simple experiment on which the Poisson distribution is based, though we will shortly describe how it can be obtained by certain limiting operations.

DEFINITION

A discrete random variable X is said to have a Poisson distribution with parameter $\mu$ ($\mu > 0$) if the pmf of X is

$$p(x; \mu) = \frac{e^{-\mu} \cdot \mu^x}{x!} \qquad x = 0, 1, 2, 3, \ldots$$

It is no accident that we are using the symbol $\mu$ for the Poisson parameter; we shall see shortly that $\mu$ is in fact the expected value of X. The letter e in the pmf represents the base of the natural logarithm system; its numerical value is approximately 2.71828. In contrast to the binomial and hypergeometric distributions, the Poisson distribution spreads probability over all non-negative integers, an infinite number of possibilities.

It is not obvious by inspection that $p(x; \mu)$ specifies a legitimate pmf, let alone that this distribution is useful. First of all, $p(x; \mu) > 0$ for every possible x value because of the requirement that $\mu > 0$. The fact that $\sum p(x; \mu) = 1$ is a consequence of the Maclaurin series expansion of $e^{\mu}$ (check your calculus book for this result):

$$e^{\mu} = 1 + \mu + \frac{\mu^2}{2!} + \frac{\mu^3}{3!} + \cdots = \sum_{x=0}^{\infty} \frac{\mu^x}{x!} \qquad (3.18)$$

If the two extreme terms in (3.18) are multiplied by $e^{-\mu}$ and then this quantity is moved inside the summation on the far right, the result is

$$1 = \sum_{x=0}^{\infty} \frac{e^{-\mu} \cdot \mu^x}{x!}$$

Example 3.39

Let X denote the number of creatures of a particular type captured in a trap during a given time period. Suppose that X has a Poisson distribution with $\mu = 4.5$, so on average traps will contain 4.5 creatures. [The article “Dispersal Dynamics of the Bivalve Gemma Gemma in a Patchy Environment” (Ecological Monographs, 1995: 1–20) suggests this model; the bivalve Gemma gemma is a small clam.] The probability that a trap contains exactly five creatures is

$$P(X = 5) = \frac{e^{-4.5} (4.5)^5}{5!} = .1708$$

The probability that a trap has at most five creatures is

$$P(X \le 5) = \sum_{x=0}^{5} \frac{e^{-4.5} (4.5)^x}{x!} = e^{-4.5} \left[ 1 + 4.5 + \frac{(4.5)^2}{2!} + \cdots + \frac{(4.5)^5}{5!} \right] = .7029$$

The Poisson Distribution as a Limit

The rationale for using the Poisson distribution in many situations is provided by the following proposition.

PROPOSITION

Suppose that in the binomial pmf $b(x; n, p)$, we let $n \to \infty$ and $p \to 0$ in such a way that $np$ approaches a value $\mu > 0$. Then $b(x; n, p) \to p(x; \mu)$.

According to this proposition, in any binomial experiment in which n is large and p is small, $b(x; n, p) \approx p(x; \mu)$, where $\mu = np$. As a rule of thumb, this approximation can safely be applied if $n > 50$ and $np < 5$.

Example 3.40

If a publisher of nontechnical books takes great pains to ensure that its books are free of typographical errors, so that the probability of any given page containing at least one such error is .005 and errors are independent from page to page, what is the probability that one of its 400-page novels will contain exactly one page with errors? At most three pages with errors?

With S denoting a page containing at least one error and F an error-free page, the number X of pages containing at least one error is a binomial rv with $n = 400$ and $p = .005$, so $np = 2$. We wish

$$P(X = 1) = b(1; 400, .005) \approx p(1; 2) = \frac{e^{-2} (2)^1}{1!} = .270671$$

The binomial value is $b(1; 400, .005) = .270669$, so the approximation is very good. Similarly,

$$P(X \le 3) \approx \sum_{x=0}^{3} p(x; 2) = \sum_{x=0}^{3} e^{-2} \frac{2^x}{x!} = .135335 + .270671 + .270671 + .180447 = .8571$$

and this again is quite close to the binomial value $P(X \le 3) = .8576$. ■

Table 3.2 shows the Poisson distribution for $\mu = 3$ along with three binomial distributions with $np = 3$, and Figure 3.8 (from S-Plus) plots the Poisson along with the first two binomial distributions. The approximation is of limited use for $n = 30$, but of course the accuracy is better for $n = 100$ and much better for $n = 300$.
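The quality of the Poisson approximation in Example 3.40 and Table 3.2 can be explored with a few lines of code. The following Python sketch is an illustration only: the helper names `binom_pmf` and `poisson_pmf` are assumptions introduced here, not notation from the text, and only the standard library is used.

```python
from math import comb, exp, factorial

def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial pmf b(x; n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x: int, mu: float) -> float:
    """Poisson pmf p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)

# Example 3.40: n = 400 pages, p = .005, so mu = np = 2
exact_one_page = binom_pmf(1, 400, 0.005)
approx_one_page = poisson_pmf(1, 2.0)
exact_at_most_3 = sum(binom_pmf(x, 400, 0.005) for x in range(4))
approx_at_most_3 = sum(poisson_pmf(x, 2.0) for x in range(4))
print(f"P(X = 1):  binomial {exact_one_page:.6f}, Poisson {approx_one_page:.6f}")
print(f"P(X <= 3): binomial {exact_at_most_3:.4f}, Poisson {approx_at_most_3:.4f}")

# Rows in the layout of Table 3.2: three binomials with np = 3, then Poisson with mu = 3
for x in range(11):
    row = [binom_pmf(x, n, 3 / n) for n in (30, 100, 300)] + [poisson_pmf(x, 3.0)]
    print(x, " ".join(f"{value:.6f}" for value in row))
```

As n grows with np held at 3, the binomial columns move toward the Poisson column, which is the behavior the limiting proposition describes.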


Table 3.2 Comparing the Poisson and Three Binomial Distributions

 x    n = 30, p = .1    n = 100, p = .03    n = 300, p = .01    Poisson, μ = 3
 0       0.042391           0.047553            0.049041           0.049787
 1       0.141304           0.147070            0.148609           0.149361
 2       0.227656           0.225153            0.224414           0.224042
 3       0.236088           0.227474            0.225170           0.224042
 4       0.177066           0.170606            0.168877           0.168031
 5       0.102305           0.101308            0.100985           0.100819
 6       0.047363           0.049610            0.050153           0.050409
 7       0.018043           0.020604            0.021277           0.021604
 8       0.005764           0.007408            0.007871           0.008102
 9       0.001565           0.002342            0.002580           0.002701
10       0.000365           0.000659            0.000758           0.000810

[Plot of p(x) against x for x = 0 to 10. Legend: Bin, n = 30 (o); Bin, n = 100 (x); Poisson.]

Figure 3.8 Comparing a Poisson and two binomial distributions

Appendix Table A.2 exhibits the cdf $F(x; \mu)$ for $\mu = .1, .2, \ldots, 1, 2, \ldots, 10, 15$, and 20. For example, if $\mu = 2$, then $P(X \le 3) = F(3; 2) = .857$ as in Example 3.40, whereas $P(X = 3) = F(3; 2) - F(2; 2) = .180$. Alternatively, many statistical computer packages will generate $p(x; \mu)$ and $F(x; \mu)$ upon request.

The Mean and Variance of X

Since $b(x; n, p) \to p(x; \mu)$ as $n \to \infty$, $p \to 0$, $np \to \mu$, the mean and variance of a binomial variable should approach those of a Poisson variable. These limits are $np \to \mu$ and $np(1 - p) \to \mu$.

PROPOSITION

If X has a Poisson distribution with parameter $\mu$, then $E(X) = V(X) = \mu$.

These results can also be derived directly from the definitions of mean and variance.

Example 3.41 (Example 3.39 continued)

Both the expected number of creatures trapped and the variance of the number trapped equal 4.5, and $\sigma_X = \sqrt{\mu} = \sqrt{4.5} = 2.12$. ■
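As the text notes, software can produce the Poisson pmf and cdf on request. A minimal Python sketch under stated assumptions follows: the function names `poisson_pmf` and `poisson_cdf` are hypothetical helpers written for this illustration (not a particular package's API), and the numbers are those of the trap example with parameter 4.5 and of the Table A.2 illustration with parameter 2.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x: int, mu: float) -> float:
    """Poisson pmf p(x; mu)."""
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(x: int, mu: float) -> float:
    """Poisson cdf F(x; mu) = P(X <= x), summed term by term."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Trap example: mu = 4.5 creatures per trap on average
mu_trap = 4.5
p_exactly_5 = poisson_pmf(5, mu_trap)
p_at_most_5 = poisson_cdf(5, mu_trap)
sd_trap = sqrt(mu_trap)  # standard deviation, since V(X) = mu

# Table A.2 illustration: mu = 2
cdf_3 = poisson_cdf(3, 2.0)
p_exactly_3 = poisson_cdf(3, 2.0) - poisson_cdf(2, 2.0)

print(round(p_exactly_5, 4), round(p_at_most_5, 4), round(sd_trap, 2))
print(round(cdf_3, 3), round(p_exactly_3, 3))
```

The term-by-term sum is adequate for the small x values used here; for large parameters a library routine would be a more robust choice.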


The Poisson Process

A very important application of the Poisson distribution arises in connection with the occurrence of events of some type over time. Events of interest might be visits to a particular website, pulses of some sort recorded by a counter, email messages sent to a particular address, accidents in an industrial facility, or cosmic ray showers observed by astronomers at a particular observatory. We make the following assumptions about the way in which the events of interest occur:

1. There exists a parameter $\alpha > 0$ such that for any short time interval of length $\Delta t$, the probability that exactly one event occurs is $\alpha \cdot \Delta t + o(\Delta t)$.*

2. The probability of more than one event occurring during $\Delta t$ is $o(\Delta t)$ [which, along with Assumption 1, implies that the probability of no events during $\Delta t$ is $1 - \alpha \cdot \Delta t - o(\Delta t)$].

3. The number of events occurring during the time interval $\Delta t$ is independent of the number that occur prior to this time interval.

Informally, Assumption 1 says that for a short interval of time, the probability of a single event occurring is approximately proportional to the length of the time interval, where $\alpha$ is the constant of proportionality. Now let $P_k(t)$ denote the probability that k events will be observed during any particular time interval of length t.

* A quantity is $o(\Delta t)$ (read “little o of delta t”) if, as $\Delta t$ approaches 0, so does $o(\Delta t)/\Delta t$. That is, $o(\Delta t)$ is even more negligible (approaches 0 faster) than $\Delta t$ itself. The quantity $(\Delta t)^2$ has this property, but $\sin(\Delta t)$ does not.

PROPOSITION

$P_k(t) = e^{-\alpha t} \cdot (\alpha t)^k / k!$, so that the number of events during a time interval of length t is a Poisson rv with parameter $\mu = \alpha t$. The expected number of events during any such time interval is then $\alpha t$, so the expected number during a unit interval of time is $\alpha$.

The occurrence of events over time as described is called a Poisson process; the parameter $\alpha$ specifies the rate for the process.

Example 3.42

Suppose pulses arrive at a counter at an average rate of six per minute, so that $\alpha = 6$. To find the probability that in a .5-min interval at least one pulse is received, note that the number of pulses in such an interval has a Poisson distribution with parameter $\alpha t = 6(.5) = 3$ (.5 min is used because $\alpha$ is expressed as a rate per minute). Then with $X$ = the number of pulses received in the 30-sec interval,

$$P(1 \le X) = 1 - P(X = 0) = 1 - \frac{e^{-3} (3)^0}{0!} = .950$$

Instead of observing events over time, consider observing events of some type that occur in a two- or three-dimensional region. For example, we might select on a map a certain region R of a forest, go to that region, and count the number of trees. Each tree would represent an event occurring at a particular point in space. Under assumptions similar to 1–3, it can be shown that the number of events occurring in a region R has a Poisson distribution with parameter $\alpha \cdot a(R)$, where $a(R)$ is the area of R. The quantity $\alpha$ is the expected number of events per unit area or volume.
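The pulse-counter calculation of Example 3.42 can be reproduced numerically, together with a rough check of how Assumptions 1–3 lead to the Poisson formula. The Python sketch below is an illustration under stated assumptions: the function name `prob_k_events` and the subinterval count `n_sub` are choices made here, and the discretized check simply drops the o(Δt) terms rather than modeling them.

```python
from math import exp, factorial

def prob_k_events(k: int, alpha: float, t: float) -> float:
    """P_k(t): probability of k events in an interval of length t at rate alpha."""
    mu = alpha * t
    return exp(-mu) * mu**k / factorial(k)

alpha = 6.0   # pulses per minute
t = 0.5       # interval length in minutes

# At least one pulse in the half-minute interval
p_at_least_one = 1 - prob_k_events(0, alpha, t)

# Rough check from the assumptions: split the interval into many short pieces,
# treat "no event" in each piece as having probability 1 - alpha * dt
# (o(dt) terms ignored), and multiply using independence across pieces.
n_sub = 10_000
dt = t / n_sub
p_none_discrete = (1 - alpha * dt) ** n_sub
p_at_least_one_discrete = 1 - p_none_discrete

print(round(p_at_least_one, 3))
print(abs(p_at_least_one - p_at_least_one_discrete))
```

The first value should match the .950 obtained in the example; the printed difference shrinks as `n_sub` increases, which is consistent with the proposition's limiting argument.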


EXERCISES Section 3.6 (79–93)

79. Let X, the number of flaws on the surface of a randomly selected boiler of a certain type, have a Poisson distribution with parameter $\mu = 5$. Use Appendix Table A.2 to compute the following probabilities:
a. $P(X \le 8)$
b. $P(X = 8)$
c. $P(9 \le X)$
d. $P(5 \le X \le 8)$
e. $P(5 < X < 8)$

80. Let X be the number of material anomalies occurring in a particular region of an aircraft gas-turbine disk. The article “Methodology for Probabilistic Life Prediction of Multiple-Anomaly Materials” (Amer. Inst. of Aeronautics and Astronautics J., 2006: 787–793) proposes a Poisson distribution for X. Suppose that $\mu = 4$.
a. Compute both $P(X \le 4)$ and $P(X < 4)$.
b. Compute $P(4 \le X \le 8)$.
c. Compute $P(8 \le X)$.
d. What is the probability that the number of anomalies exceeds its mean value by no more than one standard deviation?

81. Suppose that the number of drivers who travel between a particular origin and destination during a designated time period has a Poisson distribution with parameter $\mu = 20$ (suggested in the article “Dynamic Ride Sharing: Theory and Practice,” J. of Transp. Engr., 1997: 308–312). What is the probability that the number of drivers will
a. Be at most 10?
b. Exceed 20?
c. Be between 10 and 20, inclusive? Be strictly between 10 and 20?
d. Be within 2 standard deviations of the mean value?

82. Consider writing onto a computer disk and then sending it through a certifier that counts the number of missing pulses. Suppose this number X has a Poisson distribution with parameter $\mu = .2$. (Suggested in “Average Sample Number for Semi-Curtailed Sampling Using the Poisson Distribution,” J. Quality Technology, 1983: 126–129.)
a. What is the probability that a disk has exactly one missing pulse?
b. What is the probability that a disk has at least two missing pulses?
c. If two disks are independently selected, what is the probability that neither contains a missing pulse?

83. An article in the Los Angeles Times (Dec. 3, 1993) reports that 1 in 200 people carry the defective gene that causes inherited colon cancer. In a sample of 1000 individuals, what is the approximate distribution of the number who carry this gene? Use this distribution to calculate the approximate probability that
a. Between 5 and 8 (inclusive) carry the gene.
b. At least 8 carry the gene.

84. Suppose that only .10% of all computers of a certain type experience CPU failure during the warranty period. Consider a sample of 10,000 computers.
a. What are the expected value and standard deviation of the number of computers in the sample that have the defect?
b. What is the (approximate) probability that more than 10 sampled computers have the defect?
c. What is the (approximate) probability that no sampled computers have the defect?

85. Suppose small aircraft arrive at a certain airport according to a Poisson process with rate $\alpha = 8$ per hour, so that the number of arrivals during a time period of t hours is a Poisson rv with parameter $\mu = 8t$.
a. What is the probability that exactly 6 small aircraft arrive during a 1-hour period? At least 6? At least 10?
b. What are the expected value and standard deviation of the number of small aircraft that arrive during a 90-min period?
c. What is the probability that at least 20 small aircraft arrive during a 2.5-hour period? That at most 10 arrive during this period?

86. The number of people arriving for treatment at an emergency room can be modeled by a Poisson process with a rate parameter of five per hour.
a. What is the probability that exactly four arrivals occur during a particular hour?
b. What is the probability that at least four people arrive during a particular hour?
c. How many people do you expect to arrive during a 45-min period?

87. The number of requests for assistance received by a towing service is a Poisson process with rate $\alpha = 4$ per hour.
a. Compute the probability that exactly ten requests are received during a particular 2-hour period.
b. If the operators of the towing service take a 30-min break for lunch, what is the probability that they do not miss any calls for assistance?
c. How many calls would you expect during their break?

88. In proof testing of circuit boards, the probability that any particular diode will fail is .01. Suppose a circuit board contains 200 diodes.
a. How many diodes would you expect to fail, and what is the standard deviation of the number that are expected to fail?
b. What is the (approximate) probability that at least four diodes will fail on a randomly selected board?
c. If five boards are shipped to a particular customer, how likely is it that at least four of them will work properly? (A board works properly only if all its diodes work.)

89. The article “Reliability-Based Service-Life Assessment of Aging Concrete Structures” (J. Structural Engr., 1993: 1600–1621) suggests that a Poisson process can be used to represent the occurrence of structural loads over time. Suppose the mean time between occurrences of loads is .5 year.


a. How many loads can be expected to occur during a 2-year period?
b. What is the probability that more than five loads occur during a 2-year period?
c. How long must a time period be so that the probability of no loads occurring during that period is at most .1?

90. Let X have a Poisson distribution with parameter $\mu$. Show that $E(X) = \mu$ directly from the definition of expected value. [Hint: The first term in the sum equals 0, and then x can be canceled. Now factor out $\mu$ and show that what is left sums to 1.]

91. Suppose that trees are distributed in a forest according to a two-dimensional Poisson process with parameter $\alpha$, the expected number of trees per acre, equal to 80.
a. What is the probability that in a certain quarter-acre plot, there will be at most 16 trees?
b. If the forest covers 85,000 acres, what is the expected number of trees in the forest?
c. Suppose you select a point in the forest and construct a circle of radius .1 mile. Let $X$ = the number of trees within that circular region. What is the pmf of X? [Hint: 1 sq mile = 640 acres.]

92. Automobiles arrive at a vehicle equipment inspection station according to a Poisson process with rate $\alpha = 10$ per hour. Suppose that with probability .5 an arriving vehicle will have no equipment violations.
a. What is the probability that exactly ten arrive during the hour and all ten have no violations?
b. For any fixed $y \ge 10$, what is the probability that y arrive during the hour, of which ten have no violations?
c. What is the probability that ten “no-violation” cars arrive during the next hour? [Hint: Sum the probabilities in part (b) from $y = 10$ to $\infty$.]

93. a. In a Poisson process, what has to happen in both the time interval $(0, t)$ and the interval $(t, t + \Delta t)$ so that no events occur in the entire interval $(0, t + \Delta t)$? Use this and Assumptions 1–3 to write a relationship between $P_0(t + \Delta t)$ and $P_0(t)$.
b. Use the result of part (a) to write an expression for the difference $P_0(t + \Delta t) - P_0(t)$. Then divide by $\Delta t$ and let $\Delta t \to 0$ to obtain an equation involving $(d/dt)P_0(t)$, the derivative of $P_0(t)$ with respect to t.
c. Verify that $P_0(t) = e^{-\alpha t}$ satisfies the equation of part (b).
d. It can be shown in a manner similar to parts (a) and (b) that the $P_k(t)$s must satisfy the system of differential equations

$$\frac{d}{dt} P_k(t) = \alpha P_{k-1}(t) - \alpha P_k(t) \qquad k = 1, 2, 3, \ldots$$

Verify that $P_k(t) = e^{-\alpha t} (\alpha t)^k / k!$ satisfies the system. (This is actually the only solution.)

SUPPLEMENTARY EXERCISES (94–122)

94. Consider a deck consisting of seven cards, marked 1, 2, . . . , 7. Three of these cards are selected at random. Define an rv W by $W$ = the sum of the resulting numbers, and compute the pmf of W. Then compute $\mu$ and $\sigma^2$. [Hint: Consider outcomes as unordered, so that (1, 3, 7) and (3, 1, 7) are not different outcomes. Then there are 35 outcomes, and they can be listed. (This type of rv actually arises in connection with a statistical procedure called Wilcoxon’s rank-sum test, in which there is an x sample and a y sample and W is the sum of the ranks of the x’s in the combined sample; see Section 15.2.)]

95. After shuffling a deck of 52 cards, a dealer deals out 5. Let $X$ = the number of suits represented in the five-card hand.
a. Show that the pmf of X is

x       1      2      3      4
p(x)   .002   .146   .588   .264

[Hint: $p(1) = 4P(\text{all are spades})$, $p(2) = 6P(\text{only spades and hearts with at least one of each suit})$, and $p(4) = 4P(\text{2 spades} \cap \text{one of each other suit})$.]
b. Compute $\mu$, $\sigma^2$, and $\sigma$.

96. The negative binomial rv X was defined as the number of F’s preceding the rth S. Let $Y$ = the number of trials necessary to obtain the rth S. In the same manner in which the pmf of X was derived, derive the pmf of Y.

97. Of all customers purchasing automatic garage-door openers, 75% purchase a chain-driven model. Let $X$ = the number among the next 15 purchasers who select the chain-driven model.
a. What is the pmf of X?
b. Compute $P(X > 10)$.
c. Compute $P(6 \le X \le 10)$.
d. Compute $\mu$ and $\sigma^2$.
e. If the store currently has in stock 10 chain-driven models and 8 shaft-driven models, what is the probability that the requests of these 15 customers can all be met from existing stock?

98. A friend recently planned a camping trip. He had two flashlights, one that required a single 6-V battery and another that used two size-D batteries. He had previously packed two 6-V and four size-D batteries in his camper. Suppose the probability that any particular battery works is p and that batteries work or fail independently of one another. Our friend wants to take just one flashlight. For what values of p should he take the 6-V flashlight?


99. A k-out-of-n system is one that will function if and only if at least k of the n individual components in the system function. If individual components function independently of one another, each with probability .9, what is the probability that a 3-out-of-5 system functions?

100. A manufacturer of integrated circuit chips wishes to control the quality of its product by rejecting any batch in which the proportion of defective chips is too high. To this end, out of each batch (10,000 chips), 25 will be selected and tested. If at least 5 of these 25 are defective, the entire batch will be rejected.
a. What is the probability that a batch will be rejected if 5% of the chips in the batch are in fact defective?
b. Answer the question posed in (a) if the percentage of defective chips in the batch is 10%.
c. Answer the question posed in (a) if the percentage of defective chips in the batch is 20%.
d. What happens to the probabilities in (a)–(c) if the critical rejection number is increased from 5 to 6?

101. Of the people passing through an airport metal detector, .5% activate it; let $X$ = the number among a randomly selected group of 500 who activate the detector.
a. What is the (approximate) pmf of X?
b. Compute $P(X = 5)$.
c. Compute $P(5 \le X)$.

102. An educational consulting firm is trying to decide whether high school students who have never before used a hand-held calculator can solve a certain type of problem more easily with a calculator that uses reverse Polish logic or one that does not use this logic. A sample of 25 students is selected and allowed to practice on both calculators. Then each student is asked to work one problem on the reverse Polish calculator and a similar problem on the other. Let $p = P(S)$, where S indicates that a student worked the problem more quickly using reverse Polish logic than without, and let $X$ = number of S’s.
a. If $p = .5$, what is $P(7 \le X \le 18)$?
b. If $p = .8$, what is $P(7 \le X \le 18)$?
c. If the claim that $p = .5$ is to be rejected when either $x \le 7$ or $x \ge 18$, what is the probability of rejecting the claim when it is actually correct?
d. If the decision to reject the claim $p = .5$ is made as in part (c), what is the probability that the claim is not rejected when $p = .6$? When $p = .8$?
e. What decision rule would you choose for rejecting the claim $p = .5$ if you wanted the probability in part (c) to be at most .01?

103. Consider a disease whose presence can be identified by carrying out a blood test. Let p denote the probability that a randomly selected individual has the disease. Suppose n individuals are independently selected for testing. One way to proceed is to carry out a separate test on each of the n blood samples. A potentially more economical approach, group testing, was introduced during World War II to identify syphilitic men among army inductees. First, take a part

of each blood sample, combine these specimens, and carry out a single test. If no one has the disease, the result will be negative, and only the one test is required. If at least one individual is diseased, the test on the combined sample will yield a positive result, in which case the n individual tests are then carried out. If and , what is the expected number of tests using this procedure? What is the expected number when ? [The article “Random Multiple-Access Communication and Group Testing” (IEEE Trans. on Commun., 1984: 769–774) applied these ideas to a communication system in which the dichotomy was active/idle user rather than diseased/nondiseased.]

104. Let p1 denote the probability that any particular code sym- bol is erroneously transmitted through a communication system. Assume that on different symbols, errors occur independently of one another. Suppose also that with prob- ability p2 an erroneous symbol is corrected upon receipt. Let X denote the number of correct symbols in a message block consisting of n symbols (after the correction process has ended). What is the probability distribution of X?

105. The purchaser of a power-generating unit requires c con- secutive successful start-ups before the unit will be accepted. Assume that the outcomes of individual start-ups are independent of one another. Let p denote the probabil- ity that any particular start-up is successful. The random variable of interest is of start-ups that must be made prior to acceptance. Give the pmf of X for the case . If , what is ? [Hint: For

, express p(x) “recursively” in terms of the pmf eval- uated at the smaller values .] (This problem was suggested by the article “Evaluation of a Start-Up Demonstration Test,” J. Quality Technology, 1983: 103–106.)

106. A plan for an executive travelers’ club has been developed by an airline on the premise that 10% of its current cus- tomers would qualify for membership. a. Assuming the validity of this premise, among 25 ran-

domly selected current customers, what is the probabil- ity that between 2 and 6 (inclusive) qualify for membership?

b. Again assuming the validity of the premise, what are the expected number of customers who qualify and the standard deviation of the number who qualify in a ran- dom sample of 100 current customers?

c. Let X denote the number in a random sample of 25 cur- rent customers who qualify for membership. Consider rejecting the company’s premise in favor of the claim that if . What is the probability that the company’s premise is rejected when it is actually valid?

d. Refer to the decision rule introduced in part (c). What is the probability that the company’s premise is not rejected even though (i.e., 20% qualify)?

107. Forty percent of seeds from maize (modern-day corn) ears carry single spikelets, and the other 60% carry paired spikelets. A seed with single spikelets will produce an ear

p 5 .20

x $ 7p . .10

x 2 3, x 2 4, c, 2 x $ 5

P(X # 8)p 5 .9c 5 2

X 5 the number

n 5 5

n 5 3p 5 .1


with single spikelets 29% of the time, whereas a seed with paired spikelets will produce an ear with single spikelets 26% of the time. Consider randomly selecting ten seeds.
a. What is the probability that exactly five of these seeds carry a single spikelet and produce an ear with a single spikelet?
b. What is the probability that exactly five of the ears produced by these seeds have single spikelets? What is the probability that at most five ears have single spikelets?

108. A trial has just resulted in a hung jury because eight members of the jury were in favor of a guilty verdict and the other four were for acquittal. If the jurors leave the jury room in random order and each of the first four leaving the room is accosted by a reporter in quest of an interview, what is the pmf of $X =$ the number of jurors favoring acquittal among those interviewed? How many of those favoring acquittal do you expect to be interviewed?

109. A reservation service employs five information operators who receive requests for information independently of one another, each according to a Poisson process with rate $\alpha = 2$ per minute.
a. What is the probability that during a given 1-min period, the first operator receives no requests?
b. What is the probability that during a given 1-min period, exactly four of the five operators receive no requests?
c. Write an expression for the probability that during a given 1-min period, all of the operators receive exactly the same number of requests.

110. Grasshoppers are distributed at random in a large field according to a Poisson process with parameter $\alpha = 2$ per square yard. How large should the radius R of a circular sampling region be taken so that the probability of finding at least one in the region equals .99?

111. A newsstand has ordered five copies of a certain issue of a photography magazine. Let $X =$ the number of individuals who come in to purchase this magazine. If X has a Poisson distribution with parameter $\mu = 4$, what is the expected number of copies that are sold?

112. Individuals A and B begin to play a sequence of chess games. Let $S = \{\text{A wins a game}\}$, and suppose that outcomes of successive games are independent with $P(S) = p$ and $P(F) = 1 - p$ (they never draw). They will play until one of them wins ten games. Let $X =$ the number of games played (with possible values 10, 11, ..., 19).
a. For $x = 10, 11, \ldots, 19$, obtain an expression for $p(x) = P(X = x)$.
b. If a draw is possible, with $p = P(S)$, $q = P(F)$, $1 - p - q = P(\text{draw})$, what are the possible values of X? What is $P(20 \le X)$? [Hint: $P(20 \le X) = 1 - P(X < 20)$.]

113. A test for the presence of a certain disease has probability .20 of giving a false-positive reading (indicating that an individual has the disease when this is not the case) and probability .10 of giving a false-negative result. Suppose that ten individuals are tested, five of whom have the disease and five of whom do not. Let $X =$ the number of positive readings that result.
a. Does X have a binomial distribution? Explain your reasoning.
b. What is the probability that exactly three of the ten test results are positive?

114. The generalized negative binomial pmf is given by

$$nb(x; r, p) = k(r, x) \cdot p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$

Let X, the number of plants of a certain species found in a particular region, have this distribution with $p = .3$ and $r = 2.5$. What is $P(X = 4)$? What is the probability that at least one plant is found?

115. There are two Certified Public Accountants in a particular office who prepare tax returns for clients. Suppose that for a particular type of complex form, the number of errors made by the first preparer has a Poisson distribution with mean value $\mu_1$, the number of errors made by the second preparer has a Poisson distribution with mean value $\mu_2$, and that each CPA prepares the same number of forms of this type. Then if a form of this type is randomly selected, the function

$$p(x; \mu_1, \mu_2) = .5\,\frac{e^{-\mu_1}\mu_1^x}{x!} + .5\,\frac{e^{-\mu_2}\mu_2^x}{x!} \qquad x = 0, 1, 2, \ldots$$

gives the pmf of $X =$ the number of errors on the selected form.
a. Verify that $p(x; \mu_1, \mu_2)$ is in fact a legitimate pmf ($\ge 0$ and sums to 1).
b. What is the expected number of errors on the selected form?
c. What is the variance of the number of errors on the selected form?
d. How does the pmf change if the first CPA prepares 60% of all such forms and the second prepares 40%?

116. The mode of a discrete random variable X with pmf p(x) is that value x* for which p(x) is largest (the most probable x value).
a. Let $X \sim \text{Bin}(n, p)$. By considering the ratio $b(x + 1; n, p)/b(x; n, p)$, show that $b(x; n, p)$ increases with x as long as $x < np - (1 - p)$. Conclude that the mode x* is the integer satisfying $(n + 1)p - 1 \le x^* \le (n + 1)p$.
b. Show that if X has a Poisson distribution with parameter $\mu$, the mode is the largest integer less than $\mu$. If $\mu$ is an integer, show that both $\mu - 1$ and $\mu$ are modes.

117. A computer disk storage device has ten concentric tracks, numbered 1, 2, ..., 10 from outermost to innermost, and a single access arm. Let $p_i =$ the probability that any particular request for data will take the arm to track $i$ $(i = 1, \ldots, 10)$. Assume that the tracks accessed in successive seeks are independent. Let $X =$ the number of


tracks over which the access arm passes during two successive requests (excluding the track that the arm has just left, so possible X values are $x = 0, 1, \ldots, 9$). Compute the pmf of X. [Hint: $P(\text{the arm is now on track } i \text{ and } X = j) = P(X = j \mid \text{arm now on } i) \cdot p_i$. After the conditional probability is written in terms of $p_1, \ldots, p_{10}$, by the law of total probability, the desired probability is obtained by summing over i.]

118. If X is a hypergeometric rv, show directly from the definition that $E(X) = nM/N$ (consider only the case $n < M$). [Hint: Factor nM/N out of the sum for E(X), and show that the terms inside the sum are of the form $h(y; n - 1, M - 1, N - 1)$, where $y = x - 1$.]

119. Use the fact that

$$\sum_{\text{all } x} (x - \mu)^2 p(x) \ge \sum_{x:\,|x - \mu| \ge k\sigma} (x - \mu)^2 p(x)$$

to prove Chebyshev’s inequality given in Exercise 44.

120. The simple Poisson process of Section 3.6 is characterized by a constant rate $\alpha$ at which events occur per unit time. A generalization of this is to suppose that the probability of exactly one event occurring in the interval $[t, t + \Delta t]$ is $\alpha(t) \cdot \Delta t + o(\Delta t)$. It can then be shown that the number of events occurring during an interval $[t_1, t_2]$ has a Poisson distribution with parameter

$$\mu = \int_{t_1}^{t_2} \alpha(t)\,dt$$

The occurrence of events over time in this situation is called a nonhomogeneous Poisson process. The article “Inference Based on Retrospective Ascertainment,” J. Amer. Stat. Assoc., 1989: 360–372, considers the intensity function

$$\alpha(t) = e^{a + bt}$$

as appropriate for events involving transmission of HIV (the AIDS virus) via blood transfusions. Suppose that $a = 2$ and $b = .6$ (close to values suggested in the paper), with time in years.
a. What is the expected number of events in the interval [0, 4]? In [2, 6]?
b. What is the probability that at most 15 events occur in the interval [0, .9907]?

121. Consider a collection $A_1, \ldots, A_k$ of mutually exclusive and exhaustive events, and a random variable X whose distribution depends on which of the $A_i$’s occurs (e.g., a commuter might select one of three possible routes from home to work, with X representing the commute time). Let $E(X \mid A_i)$ denote the expected value of X given that the event $A_i$ occurs. Then it can be shown that $E(X) = \sum E(X \mid A_i) \cdot P(A_i)$, the weighted average of the individual “conditional expectations” where the weights are the probabilities of the partitioning events.
a. The expected duration of a voice call to a particular telephone number is 3 minutes, whereas the expected duration of a data call to that same number is 1 minute. If 75% of all calls are voice calls, what is the expected duration of the next call?
b. A deli sells three different types of chocolate chip cookies. The number of chocolate chips in a type i cookie has a Poisson distribution with parameter $\mu_i = i + 1$ $(i = 1, 2, 3)$. If 20% of all customers purchasing a chocolate chip cookie select the first type, 50% choose the second type, and the remaining 30% opt for the third type, what is the expected number of chips in a cookie purchased by the next customer?

122. Consider a communication source that transmits packets containing digitized speech. After each transmission, the receiver sends a message indicating whether the transmission was successful or unsuccessful. If a transmission is unsuccessful, the packet is re-sent. Suppose a voice packet can be transmitted a maximum of 10 times. Assuming that the results of successive transmissions are independent of one another and that the probability of any particular transmission being successful is p, determine the probability mass function of the rv $X =$ the number of times a packet is transmitted. Then obtain an expression for the expected number of times a packet is transmitted.
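The weighted-average identity stated in Exercise 121 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `total_expectation` and the commuter numbers (three routes with assumed mean times and selection probabilities) are hypothetical and do not come from the text or from any part of the exercise.

```python
def total_expectation(cond_means, probs):
    """Return sum of E(X | A_i) * P(A_i) over a partition A_1, ..., A_k."""
    if len(cond_means) != len(probs):
        raise ValueError("need one conditional mean per partitioning event")
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be nonnegative and sum to 1")
    return sum(m * p for m, p in zip(cond_means, probs))

# Hypothetical commuter: assumed mean commute time (min) on each of three
# routes, and the assumed probability that each route is chosen.
route_means = [20.0, 25.0, 30.0]
route_probs = [0.5, 0.3, 0.2]

expected_commute = total_expectation(route_means, route_probs)
print(expected_commute)
```

The check that the probabilities sum to 1 mirrors the requirement that the events be mutually exclusive and exhaustive; without it the weighted average would not be an expected value.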

Bibliography

Johnson, Norman, Samuel Kotz, and Adrienne Kemp, Discrete Univariate Distributions, Wiley, New York, 1992. An encyclopedia of information on discrete distributions.

Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability Models and Applications (2nd ed.), Macmillan, New York, 1994. Contains an in-depth discussion of both general properties of discrete and continuous distributions and results for specific distributions.

Ross, Sheldon, Introduction to Probability Models (9th ed.), Academic Press, New York, 2007. A good source of material on the Poisson process and generalizations and a nice introduction to other topics in applied probability.


4 Continuous Random Variables and Probability Distributions

INTRODUCTION

Chapter 3 concentrated on the development of probability distributions for discrete random variables. In this chapter, we consider the second general type of random variable that arises in many applied problems. Sections 4.1 and 4.2 present the basic definitions and properties of continuous random variables and their probability distributions. In Section 4.3, we study in detail the normal random variable and distribution, unquestionably the most important and useful in probability and statistics. Sections 4.4 and 4.5 discuss some other continuous distributions that are often used in applied work. In Section 4.6, we introduce a method for assessing whether given sample data is consistent with a specified distribution.

4.1 Probability Density Functions

A discrete random variable (rv) is one whose possible values either constitute a finite set or else can be listed in an infinite sequence (a list in which there is a first element, a second element, etc.). A random variable whose set of possible values is an entire interval of numbers is not discrete.

Recall from Chapter 3 that a random variable X is continuous if (1) possible values comprise either a single interval on the number line (for some $A < B$, any number x between A and B is a possible value) or a union of disjoint intervals, and (2) $P(X = c) = 0$ for any number c that is a possible value of X.

Example 4.1
If in the study of the ecology of a lake, we make depth measurements at randomly chosen locations, then $X =$ the depth at such a location is a continuous rv. Here A is the minimum depth in the region being sampled, and B is the maximum depth. ■

Example 4.2
If a chemical compound is randomly selected and its pH X is determined, then X is a continuous rv because any pH value between 0 and 14 is possible. If more is known about the compound selected for analysis, then the set of possible values might be a subinterval of [0, 14], such as $5.5 \le x \le 6.5$, but X would still be continuous. ■

Example 4.3
Let X represent the amount of time a randomly selected customer spends waiting for a haircut before his/her haircut commences. Your first thought might be that X is a continuous random variable, since a measurement is required to determine its value. However, there are customers lucky enough to have no wait whatsoever before climbing into the barber’s chair. So it must be the case that $P(X = 0) > 0$. Conditional on no chairs being empty, though, the waiting time will be continuous since X could then assume any value between some minimum possible time A and a maximum possible time B. This random variable is neither purely discrete nor purely continuous but instead is a mixture of the two types. ■

One might argue that although in principle variables such as height, weight, and temperature are continuous, in practice the limitations of our measuring instruments restrict us to a discrete (though sometimes very finely subdivided) world. However, continuous models often approximate real-world situations very well, and continuous mathematics (the calculus) is frequently easier to work with than mathematics of discrete variables and distributions.

Probability Distributions for Continuous Variables

Suppose the variable X of interest is the depth of a lake at a randomly chosen point on the surface. Let $M =$ the maximum depth (in meters), so that any number in the interval [0, M] is a possible value of X. If we “discretize” X by measuring depth to the nearest meter, then possible values are nonnegative integers less than or equal to M. The resulting discrete distribution of depth can be pictured using a probability histogram. If we draw the histogram so that the area of the rectangle above any possible integer k is the proportion of the lake whose depth is (to the nearest meter) k, then the total area of all rectangles is 1. A possible histogram appears in Figure 4.1(a).

If depth is measured much more accurately and the same measurement axis as in Figure 4.1(a) is used, each rectangle in the resulting probability histogram is much narrower, though the total area of all rectangles is still 1. A possible histogram is pictured in Figure 4.1(b); it has a much smoother appearance than the histogram in Figure 4.1(a). If we continue in this way to measure depth more and more finely, the resulting sequence of histograms approaches a smooth curve, such as is pictured in Figure 4.1(c). Because for each histogram the total area of all rectangles equals 1, the total area under the smooth curve is also 1. The probability that the depth at a randomly chosen point is between a and b is just the area under the smooth curve between a and b. It is exactly a smooth curve of the type pictured in Figure 4.1(c) that specifies a continuous probability distribution.

[Figure 4.1 (a) Probability histogram of depth measured to the nearest meter; (b) probability histogram of depth measured to the nearest centimeter; (c) a limit of a sequence of discrete histograms. Each panel has a horizontal axis running from 0 to M.]
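The limiting argument behind Figure 4.1 can be sketched numerically. The density used here, $f(x) = 6x(M - x)/M^3$ on [0, M] with $M = 10$, is purely an assumed stand-in for the unknown depth distribution, and equal-width bins starting at 0 are used as a simplification of "measuring to the nearest unit"; neither choice comes from the text. For each bin width the bar height is the bin probability divided by the width, so the bar's area is the bin probability.

```python
def pdf(x, M):
    """Assumed illustrative density on [0, M] (not from the text)."""
    return 6 * x * (M - x) / M**3

def cdf(x, M):
    """Antiderivative of the assumed density, with cdf(0) = 0 and cdf(M) = 1."""
    return 3 * x**2 / M**2 - 2 * x**3 / M**3

def histogram(M, n_bins):
    """Return bin midpoints, bar heights, and the common bin width."""
    width = M / n_bins
    mids, heights = [], []
    for i in range(n_bins):
        left, right = i * width, (i + 1) * width
        prob = cdf(right, M) - cdf(left, M)   # probability of this bin
        mids.append((left + right) / 2)
        heights.append(prob / width)          # area of bar = prob
    return mids, heights, width

M = 10.0
results = {}
for n_bins in (10, 100, 1000):
    mids, heights, width = histogram(M, n_bins)
    total_area = sum(h * width for h in heights)
    # Largest gap between a bar's height and the curve at the bar's midpoint
    max_gap = max(abs(h - pdf(m, M)) for m, h in zip(mids, heights))
    results[n_bins] = (total_area, max_gap)
    print(n_bins, round(total_area, 6), max_gap)
```

Under these assumptions the total area stays at 1 for every bin count, while the gap between the bars and the curve shrinks as the bins narrow, which is the behavior the sequence of histograms in the figure is meant to convey.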

DEFINITION
Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with $a \le b$,

$$P(a \le X \le b) = \int_a^b f(x)\,dx$$

That is, the probability that X takes on a value in the interval [a, b] is the area above this interval and under the graph of the density function, as illustrated in Figure 4.2. The graph of f(x) is often referred to as the density curve.

[Figure 4.2 $P(a \le X \le b) =$ the area under the density curve between a and b]

For f(x) to be a legitimate pdf, it must satisfy the following two conditions:

1. $f(x) \ge 0$ for all x

2. $\int_{-\infty}^{\infty} f(x)\,dx =$ area under the entire graph of f(x) $= 1$

Example 4.4
The direction of an imperfection with respect to a reference line on a circular object such as a tire, brake rotor, or flywheel is, in general, subject to uncertainty. Consider the reference line connecting the valve stem on a tire to the center point, and let X be the angle measured clockwise to the location of an imperfection. One possible pdf for X is

$$f(x) = \begin{cases} \dfrac{1}{360} & 0 \le x < 360 \\ 0 & \text{otherwise} \end{cases}$$

The pdf is graphed in Figure 4.3. Clearly $f(x) \ge 0$. The area under the density curve is just the area of a rectangle: (height)(base) $= \left(\frac{1}{360}\right)(360) = 1$. The probability that the angle is between 90° and 180° is

$$P(90 \le X \le 180) = \int_{90}^{180} \frac{1}{360}\,dx = \left.\frac{x}{360}\right|_{x=90}^{x=180} = \frac{1}{4} = .25$$

The probability that the angle of occurrence is within 90° of the reference line is

$$P(0 \le X \le 90) + P(270 \le X < 360) = .25 + .25 = .50$$

[Figure 4.3 The pdf and probability from Example 4.4. One panel shows f(x) at height 1/360 over the interval from 0 to 360; the other shades the area $P(90 \le X \le 180)$.] ■
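The two legitimacy conditions and the probabilities in Example 4.4 can be checked with a rough numerical integration. The sketch below uses a simple midpoint rule written out by hand; the helper name `integrate` and the number of subintervals are arbitrary choices for illustration, not anything prescribed by the text.

```python
def f(x):
    """Uniform pdf from Example 4.4: 1/360 on [0, 360), 0 otherwise."""
    return 1 / 360 if 0 <= x < 360 else 0.0

def integrate(g, a, b, n=36_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Condition 2: the density is zero outside [0, 360), so integrating
# over that interval covers the entire graph.
total_area = integrate(f, 0, 360)

p_90_180 = integrate(f, 90, 180)
p_within_90 = integrate(f, 0, 90) + integrate(f, 270, 360)

print(round(total_area, 6), round(p_90_180, 6), round(p_within_90, 6))
```

For a constant density the midpoint rule is exact up to floating-point rounding, so the three printed values should agree with the 1, .25, and .50 obtained in the example.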

Because whenever $0 \le a \le b \le 360$ in Example 4.4, $P(a \le X \le b)$ depends only on the width $b - a$ of the interval, X is said to have a uniform distribution.

DEFINITION
A continuous rv X is said to have a uniform distribution on the interval [A, B] if the pdf of X is

$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$

The graph of any uniform pdf looks like the graph in Figure 4.3 except that the interval of positive density is [A, B] rather than [0, 360].

In the discrete case, a probability mass function (pmf) tells us how little “blobs” of probability mass of various magnitudes are distributed along the measurement axis. In the continuous case, probability density is “smeared” in a continuous fashion along the interval of possible values. When density is smeared uniformly over the interval, a uniform pdf, as in Figure 4.3, results.

When X is a discrete random variable, each possible value is assigned positive probability. This is not true of a continuous random variable (that is, the second

condition of the definition is satisfied) because the area under a density curve that lies above any single value is zero:

$$P(X = c) = \int_c^c f(x)\,dx = \lim_{\varepsilon \to 0} \int_{c - \varepsilon}^{c + \varepsilon} f(x)\,dx = 0$$

The fact that $P(X = c) = 0$ when X is continuous has an important practical consequence: The probability that X lies in some interval between a and b does not depend on whether the lower limit a or the upper limit b is included in the probability calculation:

$$P(a \le X \le b) = P(a < X < b) = P(a < X \le b) = P(a \le X < b) \qquad (4.1)$$

If X is discrete and both a and b are possible values (e.g., X is binomial with $n = 20$ and $a = 5$, $b = 10$), then all four of the probabilities in (4.1) are different.

The zero probability condition has a physical analog. Consider a solid circular rod with cross-sectional area $= 1\ \text{in}^2$. Place the rod alongside a measurement axis and suppose that the density of the rod at any point x is given by the value f(x) of a density function. Then if the rod is sliced at points a and b and this segment is removed, the amount of mass removed is $\int_a^b f(x)\,dx$; if the rod is sliced just at the point c, no mass is removed. Mass is assigned to interval segments of the rod but not to individual points.

Example 4.5
“Time headway” in traffic flow is the elapsed time between the time that one car finishes passing a fixed point and the instant that the next car begins to pass that point. Let $X =$ the time headway for two randomly chosen consecutive cars on a freeway during a period of heavy flow. The following pdf of X is essentially the one suggested in “The Statistical Properties of Freeway Traffic” (Transp. Res., vol. 11: 221–228):

$$f(x) = \begin{cases} .15e^{-.15(x - .5)} & x \ge .5 \\ 0 & \text{otherwise} \end{cases}$$

The graph of f(x) is given in Figure 4.4; there is no density associated with headway times less than .5, and headway density decreases rapidly (exponentially fast) as x increases from .5. Clearly, $f(x) \ge 0$; to show that $\int_{-\infty}^{\infty} f(x)\,dx = 1$, we use the calculus result $\int_a^{\infty} e^{-kx}\,dx = (1/k)e^{-k \cdot a}$. Then

$$\int_{-\infty}^{\infty} f(x)\,dx = \int_{.5}^{\infty} .15e^{-.15(x - .5)}\,dx = .15e^{.075} \int_{.5}^{\infty} e^{-.15x}\,dx = .15e^{.075} \cdot \frac{1}{.15}\,e^{-(.15)(.5)} = 1$$

[Figure 4.4 The density curve for time headway in Example 4.5. The curve starts at height .15 at x = .5 and decays as x increases; the area $P(X \le 5)$ is shaded.]

The probability that headway time is at most 5 sec is

$$P(X \le 5) = \int_{-\infty}^{5} f(x)\,dx = \int_{.5}^{5} .15e^{-.15(x - .5)}\,dx = .15e^{.075} \int_{.5}^{5} e^{-.15x}\,dx = .15e^{.075} \cdot \left( \left. -\frac{1}{.15}\,e^{-.15x} \right|_{x=.5}^{x=5} \right)$$

$$= e^{.075}\left(-e^{-.75} + e^{-.075}\right) = 1.078(-.472 + .928) = .491 = P(\text{less than 5 sec}) = P(X < 5)$$ ■

Unlike discrete distributions such as the binomial, hypergeometric, and negative binomial, the distribution of any given continuous rv cannot usually be derived using simple probabilistic arguments. Instead, one must make a judicious choice of pdf based on prior knowledge and available data. Fortunately, there are some general families of pdf’s that have been found to be sensible candidates in a wide variety of experimental situations; several of these are discussed later in the chapter.

Just as in the discrete case, it is often helpful to think of the population of interest as consisting of X values rather than individuals or objects. The pdf is then a model for the distribution of values in this numerical population, and from this model various population characteristics (such as the mean) can be calculated.
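The two integrals worked out in Example 4.5 can be double-checked numerically. The sketch below is only a rough check under stated assumptions: the midpoint-rule helper `integrate`, the cutoff of 200 used in place of an infinite upper limit, and the closed-form function `cdf` (obtained here by integrating the pdf directly) are choices made for this illustration and are not part of the example.

```python
import math

RATE, SHIFT = 0.15, 0.5   # constants appearing in the pdf of Example 4.5

def f(x):
    """Time-headway pdf: .15 * exp(-.15 (x - .5)) for x >= .5, else 0."""
    return RATE * math.exp(-RATE * (x - SHIFT)) if x >= SHIFT else 0.0

def cdf(x):
    """P(X <= x) from integrating f: 1 - exp(-.15 (x - .5)) for x >= .5."""
    return 1 - math.exp(-RATE * (x - SHIFT)) if x >= SHIFT else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

p_at_most_5 = integrate(f, SHIFT, 5)
# The tail beyond 200 is on the order of exp(-30), so it is ignored here.
area_to_200 = integrate(f, SHIFT, 200)

print(round(p_at_most_5, 3), round(cdf(5), 3), round(area_to_200, 6))
```

Both the numerical integral and the closed form should round to the .491 found in the example, and the truncated total area should be indistinguishable from 1 at six decimal places.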

EXERCISES Section 4.1 (1–10)

1. The current in a certain circuit as measured by an ammeter is a continuous random variable X with the following density function:

$$f(x) = \begin{cases} .075x + .2 & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases}$$

a. Graph the pdf and verify that the total area under the density curve is indeed 1.
b. Calculate $P(X \le 4)$. How does this probability compare to $P(X < 4)$?
c. Calculate $P(3.5 \le X \le 4.5)$ and also $P(4.5 < X)$.

2. Suppose the reaction temperature X (in °C) in a certain chemical process has a uniform distribution with $A = -5$ and $B = 5$.
a. Compute $P(X < 0)$.
b. Compute $P(-2.5 < X < 2.5)$.
c. Compute $P(-2 \le X \le 3)$.
d. For k satisfying $-5 < k < k + 4 < 5$, compute $P(k < X < k + 4)$.

3. The error involved in making a certain measurement is a continuous rv X with pdf

$$f(x) = \begin{cases} .09375(4 - x^2) & -2 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

a. Sketch the graph of f(x).
b. Compute $P(X > 0)$.
c. Compute $P(-1 < X < 1)$.
d. Compute $P(X < -.5 \text{ or } X > .5)$.

4. Let X denote the vibratory stress (psi) on a wind turbine blade at a particular wind speed in a wind tunnel. The article “Blade Fatigue Life Assessment with Application to VAWTS” (J. of Solar Energy Engr., 1982: 107–111) proposes the Rayleigh distribution, with pdf

$$f(x; \theta) = \begin{cases} \dfrac{x}{\theta^2} \cdot e^{-x^2/(2\theta^2)} & x > 0 \\ 0 & \text{otherwise} \end{cases}$$

as a model for the X distribution.
a. Verify that $f(x; \theta)$ is a legitimate pdf.
b. Suppose $\theta = 100$ (a value suggested by a graph in the article). What is the probability that X is at most 200? Less than 200? At least 200?
c. What is the probability that X is between 100 and 200 (again assuming $\theta = 100$)?
d. Give an expression for $P(X \le x)$.

5. A college professor never finishes his lecture before the end of the hour and always finishes his lectures within 2 min after the hour. Let $X =$ the time that elapses between the end of the hour and the end of the lecture and suppose the pdf of X is

$$f(x) = \begin{cases} kx^2 & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

a. Find the value of k and draw the corresponding density curve. [Hint: Total area under the graph of f(x) is 1.]
b. What is the probability that the lecture ends within 1 min of the end of the hour?


c. What is the probability that the lecture continues beyond the hour for between 60 and 90 sec?

d. What is the probability that the lecture continues for at least 90 sec beyond the end of the hour?

6. The actual tracking weight of a stereo cartridge that is set to track at 3 g on a particular changer can be regarded as a continuous rv X with pdf

$$f(x) = \begin{cases} k[1 - (x - 3)^2] & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$

a. Sketch the graph of f(x).
b. Find the value of k.
c. What is the probability that the actual tracking weight is greater than the prescribed weight?
d. What is the probability that the actual weight is within .25 g of the prescribed weight?
e. What is the probability that the actual weight differs from the prescribed weight by more than .5 g?

7. The time X (min) for a lab assistant to prepare the equipment for a certain experiment is believed to have a uniform distribution with $A = 25$ and $B = 35$.
a. Determine the pdf of X and sketch the corresponding density curve.
b. What is the probability that preparation time exceeds 33 min?
c. What is the probability that preparation time is within 2 min of the mean time? [Hint: Identify $\mu$ from the graph of f(x).]
d. For any a such that $25 < a < a + 2 < 35$, what is the probability that preparation time is between a and $a + 2$ min?

8. In commuting to work, a professor must first get on a bus near her house and then transfer to a second bus. If the waiting time (in minutes) at each stop has a uniform distribution with $A = 0$ and $B = 5$, then it can be shown that the total waiting time Y has the pdf

$$f(y) = \begin{cases} \dfrac{1}{25}\,y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}\,y & 5 \le y \le 10 \\ 0 & y < 0 \text{ or } y > 10 \end{cases}$$

a. Sketch a graph of the pdf of Y.
b. Verify that $\int_{-\infty}^{\infty} f(y)\,dy = 1$.
c. What is the probability that total waiting time is at most 3 min?
d. What is the probability that total waiting time is at most 8 min?
e. What is the probability that total waiting time is between 3 and 8 min?
f. What is the probability that total waiting time is either less than 2 min or more than 6 min?

9. Consider again the pdf of X = time headway given in Example 4.5. What is the probability that time headway is
a. At most 6 sec?
b. More than 6 sec? At least 6 sec?
c. Between 5 and 6 sec?

10. A family of pdf’s that has been used to approximate the distribution of income, city population size, and size of firms is the Pareto family. The family has two parameters, k and $\theta$, both $> 0$, and the pdf is

$$f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\ 0 & x < \theta \end{cases}$$

a. Sketch the graph of $f(x; k, \theta)$.
b. Verify that the total area under the graph equals 1.
c. If the rv X has pdf $f(x; k, \theta)$, for any fixed $b > \theta$, obtain an expression for $P(X \le b)$.
d. For $\theta < a < b$, obtain an expression for the probability $P(a \le X \le b)$.

4.2 Cumulative Distribution Functions and Expected Values

Several of the most important concepts introduced in the study of discrete distributions also play an important role for continuous distributions. Definitions analogous to those in Chapter 3 involve replacing summation by integration.

The Cumulative Distribution Function

The cumulative distribution function (cdf) F(x) for a discrete rv X gives, for any specified number x, the probability $P(X \le x)$. It is obtained by summing the pmf p(y) over all possible values y satisfying $y \le x$. The cdf of a continuous rv gives the same probabilities $P(X \le x)$ and is obtained by integrating the pdf f(y) between the limits $-\infty$ and x.


DEFINITION

The cumulative distribution function F(x) for a continuous rv X is defined for every number x by

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$

For each x, F(x) is the area under the density curve to the left of x. This is illustrated in Figure 4.5, where F(x) increases smoothly as x increases.

Example 4.6

Let X, the thickness of a certain metal sheet, have a uniform distribution on [A, B]. The density function is shown in Figure 4.6. For $x < A$, $F(x) = 0$, since there is no area under the graph of the density function to the left of such an x. For $x \ge B$, $F(x) = 1$, since all the area is accumulated to the left of such an x. Finally, for $A \le x \le B$,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_{A}^{x} \frac{1}{B - A}\,dy = \frac{1}{B - A} \cdot y \,\Big|_{y=A}^{y=x} = \frac{x - A}{B - A}$$

[Figure 4.5 A pdf and associated cdf (panels f(x) and F(x), with F(8) marked)]

[Figure 4.6 The pdf for a uniform distribution (height $1/(B - A)$ on [A, B]; shaded area = F(x))]

The entire cdf is

$$F(x) = \begin{cases} 0 & x < A \\ \dfrac{x - A}{B - A} & A \le x < B \\ 1 & x \ge B \end{cases}$$

The graph of this cdf appears in Figure 4.7.
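The piecewise cdf of the uniform distribution translates directly into code. The following is a minimal sketch; the function name `uniform_cdf` and the limits A = 5, B = 10 are illustrative assumptions, not values taken from the example.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """cdf of a uniform distribution on [a, b], following the three-piece formula."""
    if a >= b:
        raise ValueError("requires a < b")
    if x < a:
        return 0.0          # no area has accumulated to the left of a
    if x >= b:
        return 1.0          # all the area lies to the left of b
    return (x - a) / (b - a)


# Hypothetical limits, chosen only for illustration.
A, B = 5.0, 10.0
for x in (4.0, 8.0, 12.0):
    print(x, uniform_cdf(x, A, B))
```

With these assumed limits the three calls exercise the three pieces of the formula in turn.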


[Figure 4.7 The cdf for a uniform distribution]

Using F(x) to Compute Probabilities

The importance of the cdf here, just as for discrete rv’s, is that probabilities of various intervals can be computed from a formula for or table of F(x).

PROPOSITION

Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any number a,

$$P(X > a) = 1 - F(a)$$

and for any two numbers a and b with $a < b$,

$$P(a \le X \le b) = F(b) - F(a)$$

Figure 4.8 illustrates the second part of this proposition; the desired probability is the shaded area under the density curve between a and b, and it equals the difference between the two shaded cumulative areas. This is different from what is appropriate for a discrete integer-valued random variable (e.g., binomial or Poisson): $P(a \le X \le b) = F(b) - F(a - 1)$ when a and b are integers.

[Figure 4.8 Computing $P(a \le X \le b)$ from cumulative probabilities]

Example 4.7

Suppose the pdf of the magnitude X of a dynamic load on a bridge (in newtons) is given by

$$f(x) = \begin{cases} \dfrac{1}{8} + \dfrac{3}{8}\,x & 0 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

For any number x between 0 and 2,

$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_{0}^{x} \left( \frac{1}{8} + \frac{3}{8}\,y \right) dy = \frac{x}{8} + \frac{3}{16}\,x^2$$

Thus

$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x}{8} + \dfrac{3}{16}\,x^2 & 0 \le x \le 2 \\ 1 & 2 < x \end{cases}$$
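As a quick check on the bridge-load cdf, the closed form can be compared with a direct numerical integration of the pdf. This is only a sketch: the function names and the midpoint-rule integrator are assumptions made for illustration, not part of the example.

```python
def load_pdf(x: float) -> float:
    """pdf of the dynamic load: 1/8 + (3/8)x on [0, 2], zero elsewhere."""
    return 1 / 8 + 3 / 8 * x if 0 <= x <= 2 else 0.0


def load_cdf(x: float) -> float:
    """Closed-form cdf: x/8 + (3/16)x^2 on [0, 2]."""
    if x < 0:
        return 0.0
    if x > 2:
        return 1.0
    return x / 8 + 3 / 16 * x**2


def load_cdf_numeric(x: float, n: int = 10_000) -> float:
    """Midpoint-rule approximation of the integral of the pdf from 0 to x."""
    if x <= 0:
        return 0.0
    upper = min(x, 2.0)
    step = upper / n
    return sum(load_pdf((i + 0.5) * step) for i in range(n)) * step


for x in (0.5, 1.0, 1.5, 2.0):
    print(x, load_cdf(x), round(load_cdf_numeric(x), 6))

# Interval probabilities are differences of cdf values.
print(load_cdf(1.5) - load_cdf(1.0))
```

The two columns should agree closely, and the cdf reaches 1 at x = 2, as a legitimate cdf must.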


The graphs of f(x) and F(x) are shown in Figure 4.9. The probability that the load is between 1 and 1.5 is

$$P(1 \le X \le 1.5) = F(1.5) - F(1) = \left[ \frac{1}{8}(1.5) + \frac{3}{16}(1.5)^2 \right] - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{19}{64} = .297$$

The probability that the load exceeds 1 is

$$P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = 1 - \left[ \frac{1}{8}(1) + \frac{3}{16}(1)^2 \right] = \frac{11}{16} = .688$$

[Figure 4.9 The pdf and cdf for Example 4.7]

Once the cdf has been obtained, any probability involving X can easily be calculated without any further integration.

Obtaining f(x) from F(x)

For X discrete, the pmf is obtained from the cdf by taking the difference between two F(x) values. The continuous analog of a difference is a derivative. The following result is a consequence of the Fundamental Theorem of Calculus.

PROPOSITION

If X is a continuous rv with pdf f(x) and cdf F(x), then at every x at which the derivative $F'(x)$ exists, $F'(x) = f(x)$.

Example 4.8 (Example 4.6 continued)

When X has a uniform distribution, F(x) is differentiable except at $x = A$ and $x = B$, where the graph of F(x) has sharp corners. Since $F(x) = 0$ for $x < A$ and $F(x) = 1$ for $x > B$, $F'(x) = 0 = f(x)$ for such x. For $A < x < B$,

$$F'(x) = \frac{d}{dx} \left( \frac{x - A}{B - A} \right) = \frac{1}{B - A} = f(x)$$

Percentiles of a Continuous Distribution

When we say that an individual’s test score was at the 85th percentile of the population, we mean that 85% of all population scores were below that score and 15% were above. Similarly, the 40th percentile is the score that exceeds 40% of all scores and is exceeded by 60% of all scores.


DEFINITION

Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted by $\eta(p)$, is defined by

$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy \qquad (4.2)$$

According to Expression (4.2), $\eta(p)$ is that value on the measurement axis such that 100p% of the area under the graph of f(x) lies to the left of $\eta(p)$ and $100(1 - p)\%$ lies to the right. Thus $\eta(.75)$, the 75th percentile, is such that the area under the graph of f(x) to the left of $\eta(.75)$ is .75. Figure 4.10 illustrates the definition.

[Figure 4.10 The (100p)th percentile of a continuous distribution (shaded area under f(x) equals p; $p = F(\eta(p))$ on the cdf)]

Example 4.9

The distribution of the amount of gravel (in tons) sold by a particular construction supply company in a given week is a continuous rv X with pdf

$$f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

The cdf of sales for any x between 0 and 1 is

$$F(x) = \int_{0}^{x} \frac{3}{2}(1 - y^2)\,dy = \frac{3}{2} \left( y - \frac{y^3}{3} \right) \Big|_{y=0}^{y=x} = \frac{3}{2} \left( x - \frac{x^3}{3} \right)$$

The graphs of both f(x) and F(x) appear in Figure 4.11. The (100p)th percentile of this distribution satisfies the equation

$$p = F(\eta(p)) = \frac{3}{2} \left[ \eta(p) - \frac{(\eta(p))^3}{3} \right]$$

that is,

$$(\eta(p))^3 - 3\eta(p) + 2p = 0$$

For the 50th percentile, $p = .5$, and the equation to be solved is $\eta^3 - 3\eta + 1 = 0$; the solution is $\eta = \eta(.5) = .347$. If the distribution remains the same from week to week, then in the long run 50% of all weeks will result in sales of less than .347 ton and 50% in more than .347 ton.
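The cubic for the percentile has no convenient closed-form solution, so in practice it is solved numerically. The sketch below uses bisection on the gravel-sales cdf; the function names and the tolerance are illustrative assumptions, and bisection is just one of several root-finding methods that would work here.

```python
def gravel_cdf(x: float) -> float:
    """cdf of weekly gravel sales on [0, 1]: (3/2)(x - x^3/3)."""
    return 1.5 * (x - x**3 / 3)


def percentile(p: float, lo: float = 0.0, hi: float = 1.0, tol: float = 1e-10) -> float:
    """Solve gravel_cdf(eta) = p by bisection; the cdf is increasing on [lo, hi]."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gravel_cdf(mid) < p:
            lo = mid        # the percentile lies to the right of mid
        else:
            hi = mid        # the percentile lies at or to the left of mid
    return (lo + hi) / 2


median = percentile(0.5)
print(round(median, 3))                  # 0.347
print(median**3 - 3 * median + 1)        # residual of the cubic, close to 0
```

Bisection is a reasonable choice here because F is continuous and strictly increasing on [0, 1], so each percentile is the unique root in that interval.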


[Figure 4.11 The pdf and cdf for Example 4.9 (median .347 marked where F(x) = .5)]

DEFINITION

The median of a continuous distribution, denoted by $\tilde{\mu}$, is the 50th percentile, so $\tilde{\mu}$ satisfies $.5 = F(\tilde{\mu})$. That is, half the area under the density curve is to the left of $\tilde{\mu}$ and half is to the right of $\tilde{\mu}$.

A continuous distribution whose pdf is symmetric (the graph of the pdf to the left of some point is a mirror image of the graph to the right of that point) has median $\tilde{\mu}$ equal to the point of symmetry, since half the area under the curve lies to either side of this point. Figure 4.12 gives several examples. The error in a measurement of a physical quantity is often assumed to have a symmetric distribution.

[Figure 4.12 Medians of symmetric distributions]

Expected Values

For a discrete random variable X, E(X) was obtained by summing $x \cdot p(x)$ over possible X values. Here we replace summation by integration and the pmf by the pdf to get a continuous weighted average.

DEFINITION

The expected or mean value of a continuous rv X with pdf f(x) is

$$\mu_X = E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx$$

Example 4.10 (Example 4.9 continued)

The pdf of weekly gravel sales X was

$$f(x) = \begin{cases} \dfrac{3}{2}(1 - x^2) & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$


so

$$E(X) = \int_{-\infty}^{\infty} x \cdot f(x)\,dx = \int_{0}^{1} x \cdot \frac{3}{2}(1 - x^2)\,dx = \frac{3}{2} \int_{0}^{1} (x - x^3)\,dx = \frac{3}{2} \left( \frac{x^2}{2} - \frac{x^4}{4} \right) \Big|_{x=0}^{x=1} = \frac{3}{8}$$

When the pdf f(x) specifies a model for the distribution of values in a numerical population, then $\mu$ is the population mean, which is the most frequently used measure of population location or center.

Often we wish to compute the expected value of some function h(X) of the rv X. If we think of h(X) as a new rv Y, techniques from mathematical statistics can be used to derive the pdf of Y, and E(Y) can then be computed from the definition. Fortunately, as in the discrete case, there is an easier way to compute E[h(X)].

PROPOSITION

If X is a continuous rv with pdf f(x) and h(X) is any function of X, then

$$E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) \cdot f(x)\,dx$$

Example 4.11

Two species are competing in a region for control of a limited amount of a certain resource. Let X = the proportion of the resource controlled by species 1 and suppose X has pdf

$$f(x) = \begin{cases} 1 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

which is a uniform distribution on [0, 1]. (In her book Ecological Diversity, E. C. Pielou calls this the “broken-stick” model for resource allocation, since it is analogous to breaking a stick at a randomly chosen point.) Then the species that controls the majority of this resource controls the amount

$$h(X) = \max(X, 1 - X) = \begin{cases} 1 - X & \text{if } 0 \le X < \frac{1}{2} \\ X & \text{if } \frac{1}{2} \le X \le 1 \end{cases}$$

The expected amount controlled by the species having majority control is then

$$E[h(X)] = \int_{-\infty}^{\infty} \max(x, 1 - x) \cdot f(x)\,dx = \int_{0}^{1} \max(x, 1 - x) \cdot 1\,dx = \int_{0}^{1/2} (1 - x) \cdot 1\,dx + \int_{1/2}^{1} x \cdot 1\,dx = \frac{3}{4}$$

For h(X), a linear function, $E[h(X)] = E(aX + b) = aE(X) + b$.

In the discrete case, the variance of X was defined as the expected squared deviation from $\mu$ and was calculated by summation. Here again integration replaces summation.
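Each of these quantities is an integral of the form $\int h(x) f(x)\,dx$, so one small numerical routine covers them all. The sketch below is an illustration under stated assumptions: the helper `expect` and its midpoint rule are not from the text, and the last line treats the variance of gravel sales as the expected squared deviation from the mean.

```python
def expect(h, f, lo: float, hi: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of h(x) * f(x) over [lo, hi]."""
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * step
        total += h(x) * f(x)
    return total * step


def gravel_pdf(x: float) -> float:
    """pdf of weekly gravel sales on [0, 1]."""
    return 1.5 * (1 - x**2)


def uniform_pdf(x: float) -> float:
    """pdf of the broken-stick model: uniform on [0, 1]."""
    return 1.0


mean_sales = expect(lambda x: x, gravel_pdf, 0.0, 1.0)                    # E(X)
majority = expect(lambda x: max(x, 1 - x), uniform_pdf, 0.0, 1.0)         # E[max(X, 1 - X)]
var_sales = expect(lambda x: (x - mean_sales) ** 2, gravel_pdf, 0.0, 1.0)  # E[(X - mu)^2]

print(round(mean_sales, 4), round(majority, 4), round(var_sales, 4))
```

The first two results should reproduce 3/8 and 3/4; the third is the gravel-sales variance computed directly as an expected squared deviation.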


DEFINITION

The variance of a continuous random variable X with pdf f(x) and mean value $\mu$ is

$$\sigma_X^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 \cdot f(x)\,dx = E[(X - \mu)^2]$$

The standard deviation (SD) of X is $\sigma_X = \sqrt{V(X)}$.

The variance and standard deviation give quantitative measures of how much spread there is in the distribution or population of x values. Again $\sigma$ is roughly the size of a typical deviation from $\mu$. Computation of $\sigma^2$ is facilitated by using the same shortcut formula employed in the discrete case.

PROPOSITION

$$V(X) = E(X^2) - [E(X)]^2$$

Example 4.12 (Example 4.10 continued)

For X = weekly gravel sales, we computed $E(X) = \frac{3}{8}$. Since

$$E(X^2) = \int_{-\infty}^{\infty} x^2 \cdot f(x)\,dx = \int_{0}^{1} x^2 \cdot \frac{3}{2}(1 - x^2)\,dx = \int_{0}^{1} \frac{3}{2}(x^2 - x^4)\,dx = \frac{1}{5}$$

$$V(X) = \frac{1}{5} - \left( \frac{3}{8} \right)^2 = \frac{19}{320} = .059 \quad \text{and} \quad \sigma_X = .244$$

When $h(X) = aX + b$, the expected value and variance of h(X) satisfy the same properties as in the discrete case: $E[h(X)] = a\mu + b$ and $V[h(X)] = a^2 \cdot \sigma^2$.

EXERCISES Section 4.2 (11–27)

11. Let X denote the amount of time a book on two-hour reserve is actually checked out, and suppose the cdf is

$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^2}{4} & 0 \le x < 2 \\ 1 & 2 \le x \end{cases}$$

Use the cdf to obtain the following:
a. $P(X \le 1)$
b. $P(.5 \le X \le 1)$
c. $P(X > 1.5)$
d. The median checkout duration $\tilde{\mu}$ [solve $.5 = F(\tilde{\mu})$]
e. $F'(x)$ to obtain the density function f(x)
f. E(X)
g. V(X) and $\sigma_X$
h. If the borrower is charged an amount $h(X) = X^2$ when checkout duration is X, compute the expected charge E[h(X)].

12. The cdf for X (= measurement error) of Exercise 3 is

$$F(x) = \begin{cases} 0 & x < -2 \\ \dfrac{1}{2} + \dfrac{3}{32} \left( 4x - \dfrac{x^3}{3} \right) & -2 \le x < 2 \\ 1 & 2 \le x \end{cases}$$

a. Compute $P(X < 0)$.
b. Compute $P(-1 < X < 1)$.
c. Compute $P(.5 < X)$.
d. Verify that f(x) is as given in Exercise 3 by obtaining $F'(x)$.
e. Verify that $\tilde{\mu} = 0$.

13. Example 4.5 introduced the concept of time headway in traffic flow and proposed a particular distribution for X = the headway between two randomly selected consecutive cars (sec). Suppose that in a different traffic environment, the distribution of time headway has the form


$$f(x) = \begin{cases} \dfrac{k}{x^4} & x > 1 \\ 0 & x \le 1 \end{cases}$$

a. Determine the value of k for which f(x) is a legitimate pdf.
b. Obtain the cumulative distribution function.
c. Use the cdf from (b) to determine the probability that headway exceeds 2 sec and also the probability that headway is between 2 and 3 sec.
d. Obtain the mean value of headway and the standard deviation of headway.
e. What is the probability that headway is within 1 standard deviation of the mean value?

14. The article “Modeling Sediment and Water Column Interactions for Hydrophobic Pollutants” (Water Research, 1984: 1169–1174) suggests the uniform distribution on the interval (7.5, 20) as a model for depth (cm) of the bioturbation layer in sediment in a certain region.
a. What are the mean and variance of depth?
b. What is the cdf of depth?
c. What is the probability that observed depth is at most 10? Between 10 and 15?
d. What is the probability that the observed depth is within 1 standard deviation of the mean value? Within 2 standard deviations?

15. Let X denote the amount of space occupied by an article placed in a 1-ft$^3$ packing container. The pdf of X is

$$f(x) = \begin{cases} 90x^8(1 - x) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$

a. Graph the pdf. Then obtain the cdf of X and graph it.
b. What is $P(X \le .5)$ [i.e., F(.5)]?
c. Using the cdf from (a), what is $P(.25 < X \le .5)$? What is $P(.25 \le X \le .5)$?
d. What is the 75th percentile of the distribution?
e. Compute E(X) and $\sigma_X$.
f. What is the probability that X is more than 1 standard deviation from its mean value?

16. Answer parts (a)–(f) of Exercise 15 with X = lecture time past the hour given in Exercise 5.

17. Let X have a uniform distribution on the interval [A, B].
a. Obtain an expression for the (100p)th percentile.
b. Compute E(X), V(X), and $\sigma_X$.
c. For n, a positive integer, compute $E(X^n)$.

18. Let X denote the voltage at the output of a microphone, and suppose that X has a uniform distribution on the interval from $-1$ to 1. The voltage is processed by a “hard limiter” with cutoff values $-.5$ and .5, so the limiter output is a random variable Y related to X by $Y = X$ if $|X| \le .5$, $Y = .5$ if $X > .5$, and $Y = -.5$ if $X < -.5$.
a. What is $P(Y = .5)$?
b. Obtain the cumulative distribution function of Y and graph it.

19. Let X be a continuous rv with cdf

$$F(x) = \begin{cases} 0 & x \le 0 \\ \dfrac{x}{4} \left[ 1 + \ln \left( \dfrac{4}{x} \right) \right] & 0 < x \le 4 \\ 1 & x > 4 \end{cases}$$

[This type of cdf is suggested in the article “Variability in Measured Bedload-Transport Rates” (Water Resources Bull., 1985: 39–48) as a model for a certain hydrologic variable.] What is
a. $P(X \le 1)$?
b. $P(1 \le X \le 3)$?
c. The pdf of X?

20. Consider the pdf for total waiting time Y for two buses

$$f(y) = \begin{cases} \dfrac{1}{25}\,y & 0 \le y < 5 \\ \dfrac{2}{5} - \dfrac{1}{25}\,y & 5 \le y \le 10 \\ 0 & \text{otherwise} \end{cases}$$

introduced in Exercise 8.
a. Compute and sketch the cdf of Y. [Hint: Consider separately $0 \le y < 5$ and $5 \le y \le 10$ in computing F(y). A graph of the pdf should be helpful.]
b. Obtain an expression for the (100p)th percentile. [Hint: Consider separately $0 < p < .5$ and $.5 < p < 1$.]
c. Compute E(Y) and V(Y). How do these compare with the expected waiting time and variance for a single bus when the time is uniformly distributed on [0, 5]?

21. An ecologist wishes to mark off a circular sampling region having radius 10 m. However, the radius of the resulting region is actually a random variable R with pdf

$$f(r) = \begin{cases} \dfrac{3}{4} [1 - (10 - r)^2] & 9 \le r \le 11 \\ 0 & \text{otherwise} \end{cases}$$

What is the expected area of the resulting circular region?

22. The weekly demand for propane gas (in 1000s of gallons) from a particular facility is an rv X with pdf

$$f(x) = \begin{cases} 2 \left( 1 - \dfrac{1}{x^2} \right) & 1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

a. Compute the cdf of X.
b. Obtain an expression for the (100p)th percentile. What is the value of $\tilde{\mu}$?
c. Compute E(X) and V(X).
d. If 1.5 thousand gallons are in stock at the beginning of the week and no new supply is due in during the week, how much of the 1.5 thousand gallons is expected to be left at the end of the week? [Hint: Let h(x) = amount left when demand = x.]


23. If the temperature at which a certain compound melts is a random variable with mean value 120°C and standard deviation 2°C, what are the mean temperature and standard deviation measured in °F? [Hint: °F = 1.8°C + 32.]

24. Let X have the Pareto pdf

$$f(x; k, \theta) = \begin{cases} \dfrac{k \cdot \theta^k}{x^{k+1}} & x \ge \theta \\ 0 & x < \theta \end{cases}$$

introduced in Exercise 10.
a. If $k > 1$, compute E(X).
b. What can you say about E(X) if $k = 1$?
c. If $k > 2$, show that $V(X) = k\theta^2 (k - 1)^{-2} (k - 2)^{-1}$.
d. If $k = 2$, what can you say about V(X)?
e. What conditions on k are necessary to ensure that $E(X^n)$ is finite?

25. Let X be the temperature in °C at which a certain chemical reaction takes place, and let Y be the temperature in °F (so $Y = 1.8X + 32$).
a. If the median of the X distribution is $\tilde{\mu}$, show that $1.8\tilde{\mu} + 32$ is the median of the Y distribution.
b. How is the 90th percentile of the Y distribution related to the 90th percentile of the X distribution? Verify your conjecture.
c. More generally, if $Y = aX + b$, how is any particular percentile of the Y distribution related to the corresponding percentile of the X distribution?

26. Let X be the total medical expenses (in 1000s of dollars) incurred by a particular individual during a given year. Although X is a discrete random variable, suppose its distribution is quite well approximated by a continuous distribution with pdf $f(x) = k(1 + x/2.5)^{-7}$ for $x \ge 0$.
a. What is the value of k?
b. Graph the pdf of X.
c. What are the expected value and standard deviation of total medical expenses?
d. This individual is covered by an insurance plan that entails a $500 deductible provision (so the first $500 worth of expenses are paid by the individual). Then the plan will pay 80% of any additional expenses exceeding $500, and the maximum payment by the individual (including the deductible amount) is $2500. Let Y denote the amount of this individual’s medical expenses paid by the insurance company. What is the expected value of Y? [Hint: First figure out what value of X corresponds to the maximum out-of-pocket expense of $2500. Then write an expression for Y as a function of X (which involves several different pieces) and calculate the expected value of this function.]

27. When a dart is thrown at a circular target, consider the location of the landing point relative to the bull’s eye. Let X be the angle in degrees measured from the horizontal, and assume that X is uniformly distributed on [0, 360]. Define Y to be the transformed variable $Y = h(X) = (2\pi/360)X - \pi$, so Y is the angle measured in radians and Y is between $-\pi$ and $\pi$. Obtain E(Y) and $\sigma_Y$ by first obtaining E(X) and $\sigma_X$, and then using the fact that h(X) is a linear function of X.

4.3 The Normal Distribution

The normal distribution is the most important one in all of probability and statistics. Many numerical populations have distributions that can be fit very closely by an appropriate normal curve. Examples include heights, weights, and other physical characteristics (the famous 1903 Biometrika article “On the Laws of Inheritance in Man” discussed many examples of this sort), measurement errors in scientific experiments, anthropometric measurements on fossils, reaction times in psychological experiments, measurements of intelligence and aptitude, scores on various tests, and numerous economic measures and indicators. In addition, even when individual variables themselves are not normally distributed, sums and averages of the variables will under suitable conditions have approximately a normal distribution; this is the content of the Central Limit Theorem discussed in the next chapter.

DEFINITION

A continuous rv X is said to have a normal distribution with parameters $\mu$ and $\sigma$ (or $\mu$ and $\sigma^2$), where $-\infty < \mu < \infty$ and $0 < \sigma$, if the pdf of X is

$$f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2/(2\sigma^2)} \qquad -\infty < x < \infty \qquad (4.3)$$


Again e denotes the base of the natural logarithm system and equals approximately 2.71828, and $\pi$ represents the familiar mathematical constant with approximate value 3.14159. The statement that X is normally distributed with parameters $\mu$ and $\sigma^2$ is often abbreviated $X \sim N(\mu, \sigma^2)$.

Clearly $f(x; \mu, \sigma) \ge 0$, but a somewhat complicated calculus argument must be used to verify that $\int_{-\infty}^{\infty} f(x; \mu, \sigma)\,dx = 1$. It can be shown that $E(X) = \mu$ and $V(X) = \sigma^2$, so the parameters are the mean and the standard deviation of X. Figure 4.13 presents graphs of $f(x; \mu, \sigma)$ for several different $(\mu, \sigma)$ pairs. Each density curve is symmetric about $\mu$ and bell-shaped, so the center of the bell (point of symmetry) is both the mean of the distribution and the median. The value of $\sigma$ is the distance from $\mu$ to the inflection points of the curve (the points at which the curve changes from turning downward to turning upward). Large values of $\sigma$ yield graphs that are quite spread out about $\mu$, whereas small values of $\sigma$ yield graphs with a high peak above $\mu$ and most of the area under the graph quite close to $\mu$. Thus a large $\sigma$ implies that a value of X far from $\mu$ may well be observed, whereas such a value is quite unlikely when $\sigma$ is small.

[Figure 4.13 (a) Two different normal density curves ($\mu = 80$, $\sigma = 15$ and $\mu = 100$, $\sigma = 5$) (b) Visualizing $\mu$ and $\sigma$ for a normal distribution]
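The three facts stated for the normal pdf (total area 1, mean $\mu$, variance $\sigma^2$) are hard to verify by hand but easy to check numerically. The sketch below does so for one of the curves in Figure 4.13; the helper names, the midpoint rule, and the choice to truncate the integral at ten standard deviations from the mean are all assumptions made for illustration.

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density from Expression (4.3)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2 * math.pi) * sigma)


def integrate(g, lo: float, hi: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    step = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * step) for i in range(n)) * step


mu, sigma = 80.0, 15.0
# Assumption: the area beyond 10 standard deviations is negligible.
lo, hi = mu - 10 * sigma, mu + 10 * sigma

area = integrate(lambda x: normal_pdf(x, mu, sigma), lo, hi)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), lo, hi)
variance = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), lo, hi)

print(round(area, 6), round(mean, 4), round(variance, 3))
```

The printed values should be very close to 1, 80, and 225 (that is, $15^2$). This is a numerical illustration, not a substitute for the calculus argument mentioned in the text.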

The Standard Normal Distribution

The computation of $P(a \le X \le b)$ when X is a normal rv with parameters $\mu$ and $\sigma$ requires evaluating

$$\int_{a}^{b} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x - \mu)^2/(2\sigma^2)}\,dx \qquad (4.4)$$

None of the standard integration techniques can be used to accomplish this. Instead, for $\mu = 0$ and $\sigma = 1$, Expression (4.4) has been calculated using numerical techniques and tabulated for certain values of a and b. This table can also be used to compute probabilities for any other values of $\mu$ and $\sigma$ under consideration.

DEFINITION

The normal distribution with parameter values $\mu = 0$ and $\sigma = 1$ is called the standard normal distribution. A random variable having a standard normal distribution is called a standard normal random variable and will be denoted by Z. The pdf of Z is

$$f(z; 0, 1) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \qquad -\infty < z < \infty$$


The graph of f(z; 0, 1) is called the standard normal (or z) curve. Its inflection points are at 1 and $-1$. The cdf of Z is $P(Z \le z) = \int_{-\infty}^{z} f(y; 0, 1)\,dy$, which we will denote by $\Phi(z)$.

The standard normal distribution almost never serves as a model for a naturally arising population. Instead, it is a reference distribution from which information about other normal distributions can be obtained. Appendix Table A.3 gives $\Phi(z) = P(Z \le z)$, the area under the standard normal density curve to the left of z, for $z = -3.49, -3.48, \ldots, 3.48, 3.49$. Figure 4.14 illustrates the type of cumulative area (probability) tabulated in Table A.3. From this table, various other probabilities involving Z can be calculated.

[Figure 4.14 Standard normal cumulative areas tabulated in Appendix Table A.3 (shaded area = $\Phi(z)$ under the standard normal (z) curve)]

Example 4.13

Let’s determine the following standard normal probabilities: (a) $P(Z \le 1.25)$, (b) $P(Z > 1.25)$, (c) $P(Z \le -1.25)$, and (d) $P(-.38 \le Z \le 1.25)$.

a. $P(Z \le 1.25) = \Phi(1.25)$, a probability that is tabulated in Appendix Table A.3 at the intersection of the row marked 1.2 and the column marked .05. The number there is .8944, so $P(Z \le 1.25) = .8944$. Figure 4.15(a) illustrates this probability.

b. $P(Z > 1.25) = 1 - P(Z \le 1.25) = 1 - \Phi(1.25)$, the area under the z curve to the right of 1.25 (an upper-tail area). Then $\Phi(1.25) = .8944$ implies that $P(Z > 1.25) = .1056$. Since Z is a continuous rv, $P(Z \ge 1.25) = .1056$. See Figure 4.15(b).

c. $P(Z \le -1.25) = \Phi(-1.25)$, a lower-tail area. Directly from Appendix Table A.3, $\Phi(-1.25) = .1056$. By symmetry of the z curve, this is the same answer as in part (b).

d. $P(-.38 \le Z \le 1.25)$ is the area under the standard normal curve above the interval whose left endpoint is $-.38$ and whose right endpoint is 1.25. From Section 4.2, if X is a continuous rv with cdf F(x), then $P(a \le X \le b) = F(b) - F(a)$. Thus $P(-.38 \le Z \le 1.25) = \Phi(1.25) - \Phi(-.38) = .8944 - .3520 = .5424$. (See Figure 4.16.)

[Figure 4.15 Normal curve areas (probabilities) for Example 4.13: (a) shaded area = $\Phi(1.25)$; (b) upper-tail area to the right of 1.25]
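The table lookups in Example 4.13 can be reproduced in software. The sketch below relies on the standard identity $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$, which is not stated in the text; the function name `phi` is an illustrative choice.

```python
import math


def phi(z: float) -> float:
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


p_a = phi(1.25)                  # (a) P(Z <= 1.25)
p_b = 1.0 - phi(1.25)            # (b) P(Z > 1.25), an upper-tail area
p_c = phi(-1.25)                 # (c) P(Z <= -1.25), a lower-tail area
p_d = phi(1.25) - phi(-0.38)     # (d) P(-.38 <= Z <= 1.25)

for label, value in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d)]:
    print(label, round(value, 4))
```

Rounded to four decimals, the results should agree with the table values used in parts (a) through (d), and parts (b) and (c) coincide because the z curve is symmetric about 0.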


4.3 The Normal Distribution 155

0�.38 1.25 0 1.25 0�.38

z curve

� �

Figure 4.16 as the difference between two cumulative areasP(2.38 # Z # 1.25) ■

Percentiles of the Standard Normal Distribution

For any p between 0 and 1, Appendix Table A.3 can be used to obtain the (100p)th percentile of the standard normal distribution.

Example 4.14 The 99th percentile of the standard normal distribution is that value on the horizontal axis such that the area under the z curve to the left of the value is .9900. Appendix Table A.3 gives for fixed z the area under the standard normal curve to the left of z, whereas here we have the area and want the value of z. This is the “inverse” problem to $P(Z \le z) = ?$, so the table is used in an inverse fashion: Find in the middle of the table .9900; the row and column in which it lies identify the 99th z percentile. Here .9901 lies at the intersection of the row marked 2.3 and column marked .03, so the 99th percentile is (approximately) $z = 2.33$. (See Figure 4.17.) By symmetry, the first percentile is as far below 0 as the 99th is above 0, so equals $-2.33$ (1% lies below the first and also above the 99th). (See Figure 4.18.)

Figure 4.17 Finding the 99th percentile (z curve, shaded area = .9900 to the left of the 99th percentile)

Figure 4.18 The relationship between the 1st and 99th percentiles (z curve, shaded area = .01; $-2.33$ = 1st percentile, $2.33$ = 99th percentile)
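The inverse use of the table in Example 4.14 amounts to solving $\Phi(z) = p$ for z. One way to sketch this numerically is bisection, which works because $\Phi$ is increasing. The helper names `phi` and `z_percentile`, the search interval, and the tolerance are all assumptions made for this illustration.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal cdf Phi(z), computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def z_percentile(p: float, lo: float = -10.0, hi: float = 10.0,
                 tol: float = 1e-10) -> float:
    """Return z with Phi(z) = p (0 < p < 1), found by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if phi(mid) < p:
            lo = mid   # the percentile lies to the right of mid
        else:
            hi = mid   # the percentile lies at or to the left of mid
    return (lo + hi) / 2.0


z99 = z_percentile(0.99)  # 99th percentile
z01 = z_percentile(0.01)  # 1st percentile, the mirror image of the 99th

print(f"99th percentile: {z99:.4f}")
print(f" 1st percentile: {z01:.4f}")
```

Rounded to two decimals, the 99th percentile agrees with the table-based value 2.33, and the first percentile is its negative.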

In general, the (100p)th percentile is identified by the row and column of Appendix Table A.3 in which the entry p is found (e.g., the 67th percentile is obtained by finding .6700 in the body of the table, which gives $z = .44$). If p does not appear, the number closest to it is often used, although linear interpolation gives a more accurate answer. For example, to find the 95th percentile, we look for .9500 inside the table. Although .9500 does not appear, both .9495 and .9505 do, corresponding to $z = 1.64$ and 1.65, respectively. Since .9500 is halfway between the two probabilities that do appear, we will use 1.645 as the 95th percentile and $-1.645$ as the 5th percentile.

$z_\alpha$ Notation for z Critical Values

In statistical inference, we will need the values on the horizontal z axis that capture certain small tail areas under the standard normal curve.

Notation

$z_\alpha$ will denote the value on the z axis for which $\alpha$ of the area under the z curve lies to the right of $z_\alpha$. (See Figure 4.19.)

For example, $z_{.10}$ captures upper-tail area .10, and $z_{.01}$ captures upper-tail area .01.

Figure 4.19 $z_\alpha$ notation illustrated (z curve, shaded area $= P(Z \ge z_\alpha) = \alpha$)

Since $\alpha$ of the area under the z curve lies to the right of $z_\alpha$, $1 - \alpha$ of the area lies to its left. Thus $z_\alpha$ is the $100(1 - \alpha)$th percentile of the standard normal distribution. By symmetry the area under the standard normal curve to the left of $-z_\alpha$ is also $\alpha$. The $z_\alpha$'s are usually referred to as z critical values. Table 4.1 lists the most useful z percentiles and $z_\alpha$ values.

Table 4.1 Standard Normal Percentiles and Critical Values

Percentile                                   90     95     97.5   99     99.5   99.9   99.95
$\alpha$ (tail area)                         .1     .05    .025   .01    .005   .001   .0005
$z_\alpha = 100(1 - \alpha)$th percentile    1.28   1.645  1.96   2.33   2.58   3.08   3.27

Example 4.15 $z_{.05}$ is the $100(1 - .05)$th $=$ 95th percentile of the standard normal distribution, so $z_{.05} = 1.645$. The area under the standard normal curve to the left of $-z_{.05}$ is also .05. (See Figure 4.20.)

Figure 4.20 Finding $z_{.05}$ (z curve, shaded area = .05 to the left of $-z_{.05} = -1.645$ and shaded area = .05 to the right of $z_{.05}$ = 95th percentile = 1.645)
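Because $z_\alpha$ is the $100(1 - \alpha)$th percentile, a critical value can be obtained from any inverse-cdf routine. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library; the function name `z_critical` is introduced here only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_critical(alpha: float) -> float:
    """Return z_alpha, the value with upper-tail area alpha under the z curve."""
    return STD_NORMAL.inv_cdf(1.0 - alpha)


for alpha in (0.1, 0.05, 0.025, 0.01, 0.005, 0.001, 0.0005):
    print(f"alpha = {alpha:<6}  z_alpha = {z_critical(alpha):.3f}")
```

For the larger tail areas the computed values round to the entries of Table 4.1. For the two smallest tail areas software gives roughly 3.09 and 3.29, slightly different from the tabled 3.08 and 3.27, which appear to reflect the limited resolution of a table lookup.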


Nonstandard Normal Distributions

When $X \sim N(\mu, \sigma^2)$, probabilities involving X are computed by “standardizing.” The standardized variable is $(X - \mu)/\sigma$. Subtracting $\mu$ shifts the mean from $\mu$ to zero, and then dividing by $\sigma$ scales the variable so that the standard deviation is 1 rather than $\sigma$.

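PROPOSITION If X has a normal distribution with mean $\mu$ and standard deviation $\sigma$, then

$$Z = \frac{X - \mu}{\sigma}$$

has a standard normal distribution. Thus

$$P(a \le X \le b) = P\left(\frac{a - \mu}{\sigma} \le Z \le \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right)$$

$$P(X \le a) = \Phi\left(\frac{a - \mu}{\sigma}\right) \qquad P(X \ge b) = 1 - \Phi\left(\frac{b - \mu}{\sigma}\right)$$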

The key idea of the proposition is that by standardizing, any probability involving X can be expressed as a probability involving a standard normal rv Z, so that Appendix Table A.3 can be used. This is illustrated in Figure 4.21. The proposition can be proved by writing the cdf of $Z = (X - \mu)/\sigma$ as

$$P(Z \le z) = P(X \le \sigma z + \mu) = \int_{-\infty}^{\sigma z + \mu} f(x; \mu, \sigma)\,dx$$

Using a result from calculus, this integral can be differentiated with respect to z to yield the desired pdf $f(z; 0, 1)$.

Figure 4.21 Equality of nonstandard and standard normal curve areas (area under the $N(\mu, \sigma^2)$ curve to the left of x equals area under the $N(0, 1)$ curve to the left of $(x - \mu)/\sigma$)
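The equality of areas pictured in Figure 4.21 can be checked numerically. The sketch below computes an interval probability two ways: by standardizing and using the standard normal cdf, and directly from a normal distribution with the given parameters. The function name `interval_prob_by_standardizing` and the values $\mu = 100$, $\sigma = 15$, $a = 85$, $b = 130$ are illustrative assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # the standard normal rv Z


def interval_prob_by_standardizing(a: float, b: float,
                                   mu: float, sigma: float) -> float:
    """P(a <= X <= b) for X ~ N(mu, sigma^2), via Z = (X - mu) / sigma."""
    z_low = (a - mu) / sigma
    z_high = (b - mu) / sigma
    return STD_NORMAL.cdf(z_high) - STD_NORMAL.cdf(z_low)


mu, sigma = 100.0, 15.0  # illustrative parameter values
standardized = interval_prob_by_standardizing(85.0, 130.0, mu, sigma)

# Same probability computed directly from the nonstandard normal cdf.
x_dist = NormalDist(mu, sigma)
direct = x_dist.cdf(130.0) - x_dist.cdf(85.0)

print(f"by standardizing: {standardized:.4f}")
print(f"direct:           {direct:.4f}")
```

Both routes give $\Phi(2.00) - \Phi(-1.00)$, which is the point of the proposition: only the number of standard deviations from the mean matters.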

Example 4.16 The time that it takes a driver to react to the brake lights on a decelerating vehicle is critical in helping to avoid rear-end collisions. The article “Fast-Rise Brake Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391–395) suggests that reaction time for an in-traffic response to a brake signal from standard brake lights can be modeled with a normal distribution having mean value 1.25 sec and standard deviation of .46 sec. What is the probability that reaction time is between 1.00 sec and 1.75 sec? If we let X denote reaction time, then standardizing gives

$$1.00 \le X \le 1.75$$

if and only if

$$\frac{1.00 - 1.25}{.46} \le \frac{X - 1.25}{.46} \le \frac{1.75 - 1.25}{.46}$$

Thus

$$P(1.00 \le X \le 1.75) = P\left(\frac{1.00 - 1.25}{.46} \le Z \le \frac{1.75 - 1.25}{.46}\right) = P(-.54 \le Z \le 1.09) = \Phi(1.09) - \Phi(-.54) = .8621 - .2946 = .5675$$

This is illustrated in Figure 4.22. Similarly, if we view 2 sec as a critically long reaction time, the probability that actual reaction time will exceed this value is

$$P(X > 2) = P\left(Z > \frac{2 - 1.25}{.46}\right) = P(Z > 1.63) = 1 - \Phi(1.63) = .0516$$

Figure 4.22 Normal curves for Example 4.16 (normal curve with $\mu = 1.25$, $\sigma = .46$, showing $P(1.00 \le X \le 1.75)$ between 1.00 and 1.75; z curve showing the equal area between $-.54$ and 1.09)

Standardizing amounts to nothing more than calculating a distance from the mean value and then reexpressing the distance as some number of standard deviations. Thus, if $\mu = 100$ and $\sigma = 15$, then $x = 130$ corresponds to $z = (130 - 100)/15 = 30/15 = 2.00$. That is, 130 is 2 standard deviations above (to the right of) the mean value. Similarly, standardizing 85 gives $(85 - 100)/15 = -1.00$, so 85 is 1 standard deviation below the mean. The z table applies to any normal distribution provided that we think in terms of number of standard deviations away from the mean value.

Example 4.17 The breakdown voltage of a randomly chosen diode of a particular type is known to be normally distributed. What is the probability that a diode’s breakdown voltage is within 1 standard deviation of its mean value? This question can be answered without knowing either $\mu$ or $\sigma$, as long as the distribution is known to be normal; the answer is the same for any normal distribution:

$$P(X \text{ is within 1 standard deviation of its mean}) = P(\mu - \sigma \le X \le \mu + \sigma)$$

$$= P\left(\frac{\mu - \sigma - \mu}{\sigma} \le Z \le \frac{\mu + \sigma - \mu}{\sigma}\right)$$

$$= P(-1.00 \le Z \le 1.00)$$

$$= \Phi(1.00) - \Phi(-1.00) = .6826$$

The probability that X is within 2 standard deviations of its mean is $P(-2.00 \le Z \le 2.00) = .9544$ and within 3 standard deviations of the mean is $P(-3.00 \le Z \le 3.00) = .9974$. ■

The results of Example 4.17 are often reported in percentage form and referred to as the empirical rule (because empirical evidence has shown that histograms of real data can very frequently be approximated by normal curves).

Figure 4.23 Distribution of amount dispensed for Example 4.18 (normal curve with $\mu = 64$; shaded area = .995 to the left of c = 99.5th percentile = 66.0)

If the population distribution of a variable is (approximately) normal, then

1. Roughly 68% of the values are within 1 SD of the mean.

2. Roughly 95% of the values are within 2 SDs of the mean.

3. Roughly 99.7% of the values are within 3 SDs of the mean.
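The three percentages in the empirical rule can be reproduced numerically. The sketch below evaluates $\Phi(k) - \Phi(-k)$ for k = 1, 2, 3 and then repeats the calculation for one nonstandard normal distribution to show that the answer does not depend on the parameters. The function names and the values $\mu = 40$, $\sigma = 1.5$ are hypothetical, chosen only for the demonstration.

```python
from statistics import NormalDist


def prob_within_k_sd(k: float) -> float:
    """P(-k <= Z <= k) for a standard normal rv Z."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)


def prob_within_k_sd_general(k: float, mu: float, sigma: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) for X ~ N(mu, sigma^2)."""
    x = NormalDist(mu, sigma)
    return x.cdf(mu + k * sigma) - x.cdf(mu - k * sigma)


for k in (1, 2, 3):
    standard = prob_within_k_sd(k)
    general = prob_within_k_sd_general(k, mu=40.0, sigma=1.5)  # hypothetical parameters
    print(f"within {k} SD: {standard:.4f} (standard), {general:.4f} (mu=40, sigma=1.5)")
```

The computed values (about .6827, .9545, and .9973) differ from table-based figures only in the fourth decimal place, a consequence of rounding in a four-decimal table.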

It is indeed unusual to observe a value from a normal population that is much farther than 2 standard deviations from $\mu$. These results will be important in the development of hypothesis-testing procedures in later chapters.

Percentiles of an Arbitrary Normal Distribution

The (100p)th percentile of a normal distribution with mean $\mu$ and standard deviation $\sigma$ is easily related to the (100p)th percentile of the standard normal distribution.

PROPOSITION

$$(100p)\text{th percentile for normal } (\mu, \sigma) = \mu + \left[(100p)\text{th for standard normal}\right] \cdot \sigma$$

Another way of saying this is that if z is the desired percentile for the standard normal distribution, then the desired percentile for the normal $(\mu, \sigma)$ distribution is z standard deviations from $\mu$.

Example 4.18 The amount of distilled water dispensed by a certain machine is normally distributed with mean value 64 oz and standard deviation .78 oz. What container size c will ensure that overflow occurs only .5% of the time? If X denotes the amount dispensed, the desired condition is that $P(X > c) = .005$, or, equivalently, that $P(X \le c) = .995$. Thus c is the 99.5th percentile of the normal distribution with $\mu = 64$ and $\sigma = .78$. The 99.5th percentile of the standard normal distribution is 2.58, so

$$c = \eta(.995) = 64 + (2.58)(.78) = 64 + 2.0 = 66 \text{ oz}$$

This is illustrated in Figure 4.23.
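The relation "percentile = $\mu$ + (z percentile) $\cdot$ $\sigma$" translates directly into code. The sketch below applies it to the container-size calculation of Example 4.18; the function name `normal_percentile` is an illustrative choice, and the standard library's `NormalDist.inv_cdf` stands in for the inverse table lookup.

```python
from statistics import NormalDist


def normal_percentile(p: float, mu: float, sigma: float) -> float:
    """(100p)th percentile of N(mu, sigma^2): mu plus z standard deviations."""
    z = NormalDist().inv_cdf(p)  # (100p)th percentile of the standard normal
    return mu + z * sigma


# Example 4.18: mean 64 oz, standard deviation .78 oz, overflow only .5% of the time.
c = normal_percentile(0.995, mu=64.0, sigma=0.78)
overflow_prob = 1.0 - NormalDist(64.0, 0.78).cdf(c)

print(f"container size c = {c:.2f} oz")
print(f"P(X > c)         = {overflow_prob:.4f}")
```

With an unrounded z value the result is very close to the 66 oz obtained from the table value 2.58.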

The Normal Distribution and Discrete Populations

The normal distribution is often used as an approximation to the distribution of values in a discrete population. In such situations, extra care should be taken to ensure that probabilities are computed in an accurate manner.

Example 4.19 IQ in a particular population (as measured by a standard test) is known to be approximately normally distributed with $\mu = 100$ and $\sigma = 15$. What is the probability that a randomly selected individual has an IQ of at least 125? Letting X = the IQ of a randomly chosen person, we wish $P(X \ge 125)$. The temptation here is to standardize $X \ge 125$ as in previous examples. However, the IQ population distribution is actually discrete, since IQs are integer-valued. So the normal curve is an approximation to a discrete probability histogram, as pictured in Figure 4.24.

Figure 4.24 A normal approximation to a discrete distribution (histogram rectangles at and above 125 shaded)

The rectangles of the histogram are centered at integers, so IQs of at least 125 correspond to rectangles beginning at 124.5, as shaded in Figure 4.24. Thus we really want the area under the approximating normal curve to the right of 124.5. Standardizing this value gives $P(Z \ge 1.63) = .0516$, whereas standardizing 125 results in $P(Z \ge 1.67) = .0475$. The difference is not great, but the answer .0516 is more accurate. Similarly, $P(X = 125)$ would be approximated by the area between 124.5 and 125.5, since the area under the normal curve above the single value 125 is zero.
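The effect of starting the area at 124.5 rather than 125 in Example 4.19 can be seen in a short calculation. This is a sketch only; the variable names are introduced here, and the cdf is evaluated without rounding z to two decimals, so the results may differ from the table-based values in the third or fourth decimal place.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # approximating normal curve for IQ

# P(X >= 125) for an integer-valued X: start the area at 124.5.
p_at_least_125_corrected = 1.0 - iq.cdf(124.5)

# Standardizing 125 itself, ignoring the discreteness.
p_at_least_125_uncorrected = 1.0 - iq.cdf(125.0)

# P(X = 125): area of the strip between 124.5 and 125.5.
p_equal_125 = iq.cdf(125.5) - iq.cdf(124.5)

print(f"P(X >= 125), area right of 124.5: {p_at_least_125_corrected:.4f}")
print(f"P(X >= 125), area right of 125:   {p_at_least_125_uncorrected:.4f}")
print(f"P(X = 125),  area 124.5 to 125.5: {p_equal_125:.4f}")
```

The gap between the first two numbers is roughly half the width-one strip around 125, which is exactly the area the uncorrected calculation leaves out.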

The correction for discreteness of the underlying distribution in Example 4.19 is often called a continuity correction. It is useful in the following application of the normal distribution to the computation of binomial probabilities.

Approximating the Binomial Distribution

Recall that the mean value and standard deviation of a binomial random variable X are $\mu_X = np$ and $\sigma_X = \sqrt{npq}$, respectively. Figure 4.25 displays a binomial probability histogram for the binomial distribution with $n = 20$, $p = .6$, for which $\mu = 20(.6) = 12$ and $\sigma = \sqrt{20(.6)(.4)} = 2.19$. A normal curve with this $\mu$ and $\sigma$ has been superimposed on the probability histogram. Although the probability histogram is a bit skewed (because $p \ne .5$), the normal curve gives a very good approximation, especially in the middle part of the picture. The area of any rectangle (probability of any particular X value) except those in the extreme tails can be accurately approximated by the corresponding normal curve area. For example, $P(X = 10) = B(10; 20, .6) - B(9; 20, .6) = .117$, whereas the area under the normal curve between 9.5 and 10.5 is $P(-1.14 \le Z \le -.68) = .1212$.

More generally, as long as the binomial probability histogram is not too skewed, binomial probabilities can be well approximated by normal curve areas. It is then customary to say that X has approximately a normal distribution.
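The comparison for $n = 20$, $p = .6$ can be reproduced with a few lines of code. The sketch below computes the exact binomial probability of the value 10 and the normal curve area between 9.5 and 10.5; the variable names are illustrative. Because z is not rounded to two decimals here, the normal area comes out slightly different from the table-based .1212.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.6
q = 1.0 - p
mu = n * p               # binomial mean np
sigma = sqrt(n * p * q)  # binomial standard deviation sqrt(npq)

# Exact probability of the single value 10 (area of one histogram rectangle).
exact_p10 = comb(n, 10) * p**10 * q**(n - 10)

# Area under the superimposed normal curve between 9.5 and 10.5.
curve = NormalDist(mu, sigma)
normal_area = curve.cdf(10.5) - curve.cdf(9.5)

print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"exact P(X = 10)       = {exact_p10:.4f}")
print(f"normal area 9.5..10.5 = {normal_area:.4f}")
```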

Figure 4.25 Binomial probability histogram for $n = 20$, $p = .6$ with normal approximation curve superimposed (normal curve with $\mu = 12$, $\sigma = 2.19$)

PROPOSITION Let X be a binomial rv based on n trials with success probability p. Then if the binomial probability histogram is not too skewed, X has approximately a normal distribution with $\mu = np$ and $\sigma = \sqrt{npq}$. In particular, for x = a possible value of X,

$$P(X \le x) = B(x; n, p) \approx \left(\text{area under the normal curve to the left of } x + .5\right) = \Phi\left(\frac{x + .5 - np}{\sqrt{npq}}\right)$$

In practice, the approximation is adequate provided that both $np \ge 10$ and $nq \ge 10$, since there is then enough symmetry in the underlying binomial distribution.

A direct proof of this result is quite difficult. In the next chapter we’ll see that it is a consequence of a more general result called the Central Limit Theorem. In all honesty, this approximation is not so important for probability calculation as it once was. This is because software can now calculate binomial probabilities exactly for quite large values of n.

Example 4.20 Suppose that 25% of all students at a large public university receive financial aid. Let X be the number of students in a random sample of size 50 who receive financial aid, so that $p = .25$. Then $\mu = 12.5$ and $\sigma = 3.06$. Since $np = 50(.25) = 12.5 \ge 10$ and $nq = 37.5 \ge 10$, the approximation can safely be applied. The probability that at most 10 students receive aid is

$$P(X \le 10) = B(10; 50, .25) \approx \Phi\left(\frac{10 + .5 - 12.5}{3.06}\right) = \Phi(-.65) = .2578$$

Similarly, the probability that between 5 and 15 (inclusive) of the selected students receive aid is

$$P(5 \le X \le 15) = B(15; 50, .25) - B(4; 50, .25) \approx \Phi\left(\frac{15.5 - 12.5}{3.06}\right) - \Phi\left(\frac{4.5 - 12.5}{3.06}\right) = .8320$$

The exact probabilities are .2622 and .8348, respectively, so the approximations are quite good. In the last calculation, the probability $P(5 \le X \le 15)$ is being approximated by the area under the normal curve between 4.5 and 15.5; the continuity correction is used for both the upper and lower limits. ■
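Since software can compute binomial probabilities exactly, the quality of the approximation in Example 4.20 is easy to examine. The following sketch compares exact binomial cdf values with the continuity-corrected normal approximation. The helper names `binom_cdf` and `normal_approx_cdf` are assumptions of this illustration, and z is not rounded, so the approximate values may differ slightly from the table-based .2578 and .8320.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(x: int, n: int, p: float) -> float:
    """Exact B(x; n, p) = P(X <= x) for a binomial rv X."""
    return sum(comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(x + 1))


def normal_approx_cdf(x: int, n: int, p: float) -> float:
    """Normal approximation to B(x; n, p) using the continuity correction x + .5."""
    mu = n * p
    sigma = sqrt(n * p * (1.0 - p))
    return NormalDist(mu, sigma).cdf(x + 0.5)


n, p = 50, 0.25
rule_of_thumb_ok = n * p >= 10 and n * (1.0 - p) >= 10  # np >= 10 and nq >= 10

exact_at_most_10 = binom_cdf(10, n, p)
approx_at_most_10 = normal_approx_cdf(10, n, p)

exact_5_to_15 = binom_cdf(15, n, p) - binom_cdf(4, n, p)
approx_5_to_15 = normal_approx_cdf(15, n, p) - normal_approx_cdf(4, n, p)

print(f"np >= 10 and nq >= 10: {rule_of_thumb_ok}")
print(f"P(X <= 10):      exact {exact_at_most_10:.4f}, approx {approx_at_most_10:.4f}")
print(f"P(5 <= X <= 15): exact {exact_5_to_15:.4f}, approx {approx_5_to_15:.4f}")
```

Subtracting the approximation at 4 from the approximation at 15 is the same as taking the normal area between 4.5 and 15.5, so the correction is applied at both limits.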

When the objective of our investigation is to make an inference about a population proportion p, interest will focus on the sample proportion of successes X/n rather than on X itself. Because this proportion is just X multiplied by the constant 1/n, it will also have approximately a normal distribution (with mean $\mu = p$ and standard deviation $\sigma = \sqrt{pq/n}$) provided that both $np \ge 10$ and $nq \ge 10$. This normal approximation is the basis for several inferential procedures to be discussed in later chapters.

EXERCISES Section 4.3 (28–58)

28. Let Z be a standard normal random variable and calculate the following probabilities, drawing pictures wherever appropriate.
a. $P(0 \le Z \le 2.17)$
b. $P(0 \le Z \le 1)$
c. $P(-2.50 \le Z \le 0)$
d. $P(-2.50 \le Z \le 2.50)$
e. $P(Z \le 1.37)$
f. $P(-1.75 \le Z)$
g. $P(-1.50 \le Z \le 2.00)$
h. $P(1.37 \le Z \le 2.50)$
i. $P(1.50 \le Z)$
j. $P(|Z| \le 2.50)$

29. In each case, determine the value of the constant c that makes the probability statement correct.
a. $\Phi(c) = .9838$
b. $P(0 \le Z \le c) = .291$
c. $P(c \le Z) = .121$
d. $P(-c \le Z \le c) = .668$
e. $P(c \le |Z|) = .016$

30. Find the following percentiles for the standard normal distribution. Interpolate where appropriate.
a. 91st b. 9th c. 75th d. 25th e. 6th

31. Determine $z_\alpha$ for the following:
a. $\alpha = .0055$
b. $\alpha = .09$
c. $\alpha = .663$

32. Suppose the force acting on a column that helps to support a building is a normally distributed random variable X with mean value 15.0 kips and standard deviation 1.25 kips. Compute the following probabilities by standardizing and then using Table A.3.
a. $P(X \le 15)$
b. $P(X \le 17.5)$
c. $P(X \ge 10)$
d. $P(14 \le X \le 18)$
e. $P(|X - 15| \le 3)$

33. Mopeds (small motorcycles with an engine capacity below $50\ \text{cm}^3$) are very popular in Europe because of their mobility, ease of operation, and low cost. The article “Procedure to Verify the Maximum Speed of Automatic Transmission Mopeds in Periodic Motor Vehicle Inspections” (J. of Automobile Engr., 2008: 1615–1623) described a rolling bench test for determining maximum vehicle speed. A normal distribution with mean value 46.8 km/h and standard deviation 1.75 km/h is postulated. Consider randomly selecting a single such moped.
a. What is the probability that maximum speed is at most 50 km/h?
b. What is the probability that maximum speed is at least 48 km/h?
c. What is the probability that maximum speed differs from the mean value by at most 1.5 standard deviations?

34. The article “Reliability of Domestic-Waste Biofilm Reactors” (J. of Envir. Engr., 1995: 785–790) suggests that substrate concentration ($\text{mg/cm}^3$) of influent to a reactor is normally distributed with $\mu = .30$ and $\sigma = .06$.
a. What is the probability that the concentration exceeds .25?
b. What is the probability that the concentration is at most .10?
c. How would you characterize the largest 5% of all concentration values?

35. Suppose the diameter at breast height (in.) of trees of a certain type is normally distributed with $\mu = 8.8$ and $\sigma = 2.8$, as suggested in the article “Simulating a Harvester-Forwarder Softwood Thinning” (Forest Products J., May 1997: 36–41).
a. What is the probability that the diameter of a randomly selected tree will be at least 10 in.? Will exceed 10 in.?
b. What is the probability that the diameter of a randomly selected tree will exceed 20 in.?
c. What is the probability that the diameter of a randomly selected tree will be between 5 and 10 in.?
d. What value c is such that the interval $(8.8 - c, 8.8 + c)$ includes 98% of all diameter values?
e. If four trees are independently selected, what is the probability that at least one has a diameter exceeding 10 in.?

36. Spray drift is a constant concern for pesticide applicators and agricultural producers. The inverse relationship between droplet size and drift potential is well known. The paper “Effects of 2,4-D Formulation and Quinclorac on Spray Droplet Size and Deposition” (Weed Technology, 2005: 1030–1036) investigated the effects of herbicide formulation on spray atomization. A figure in the paper suggested the normal distribution with mean $1050\ \mu\text{m}$ and standard deviation $150\ \mu\text{m}$ was a reasonable model for droplet size for water (the “control treatment”) sprayed through a 760 ml/min nozzle.
a. What is the probability that the size of a single droplet is less than $1500\ \mu\text{m}$? At least $1000\ \mu\text{m}$?
b. What is the probability that the size of a single droplet is between 1000 and $1500\ \mu\text{m}$?
c. How would you characterize the smallest 2% of all droplets?
d. If the sizes of five independently selected droplets are measured, what is the probability that at least one exceeds $1500\ \mu\text{m}$?

37. Suppose that blood chloride concentration (mmol/L) has a normal distribution with mean 104 and standard deviation 5 (information in the article “Mathematical Model of Chloride Concentration in Human Blood,” J. of Med. Engr. and Tech., 2006: 25–30, including a normal probability plot as described in Section 4.6, supports this assumption).
a. What is the probability that chloride concentration equals 105? Is less than 105? Is at most 105?
b. What is the probability that chloride concentration differs from the mean by more than 1 standard deviation? Does this probability depend on the values of $\mu$ and $\sigma$?
c. How would you characterize the most extreme .1% of chloride concentration values?

38. There are two machines available for cutting corks intended for use in wine bottles. The first produces corks with diameters that are normally distributed with mean 3 cm and standard deviation .1 cm. The second machine produces corks with diameters that have a normal distribution with mean 3.04 cm and standard deviation .02 cm. Acceptable corks have diameters between 2.9 cm and 3.1 cm. Which machine is more likely to produce an acceptable cork?

39. a. If a normal distribution has $\mu = 30$ and $\sigma = 5$, what is the 91st percentile of the distribution?
b. What is the 6th percentile of the distribution?
c. The width of a line etched on an integrated circuit chip is normally distributed with mean $3.000\ \mu\text{m}$ and standard deviation .140. What width value separates the widest 10% of all such lines from the other 90%?

40. The article “Monte Carlo Simulation—Tool for Better Understanding of LRFD” (J. of Structural Engr., 1993: 1586–1599) suggests that yield strength (ksi) for A36 grade steel is normally distributed with $\mu = 43$ and $\sigma = 4.5$.
a. What is the probability that yield strength is at most 40? Greater than 60?
b. What yield strength value separates the strongest 75% from the others?

41. The automatic opening device of a military cargo parachute has been designed to open when the parachute is 200 m above the ground. Suppose opening altitude actually has a normal distribution with mean value 200 m and standard deviation 30 m. Equipment damage will occur if the parachute opens at an altitude of less than 100 m. What is the probability that there is equipment damage to the payload of at least one of five independently dropped parachutes?

42. The temperature reading from a thermocouple placed in a constant-temperature medium is normally distributed with mean $\mu$, the actual temperature of the medium, and standard deviation $\sigma$. What would the value of $\sigma$ have to be to ensure that 95% of all readings are within $.1^\circ$ of $\mu$?

43. The distribution of resistance for resistors of a certain type is known to be normal, with 10% of all resistors having a resistance exceeding 10.256 ohms and 5% having a resistance smaller than 9.671 ohms. What are the mean value and standard deviation of the resistance distribution?

44. If bolt thread length is normally distributed, what is the probability that the thread length of a randomly selected bolt is
a. Within 1.5 SDs of its mean value?
b. Farther than 2.5 SDs from its mean value?
c. Between 1 and 2 SDs from its mean value?

45. A machine that produces ball bearings has initially been set so that the true average diameter of the bearings it produces is .500 in. A bearing is acceptable if its diameter is within .004 in. of this target value. Suppose, however, that the setting has changed during the course of production, so that the bearings have normally distributed diameters with mean value .499 in. and standard deviation .002 in. What percentage of the bearings produced will not be acceptable?

46. The Rockwell hardness of a metal is determined by impressing a hardened point into the surface of the metal and then measuring the depth of penetration of the point. Suppose the Rockwell hardness of a particular alloy is normally distributed with mean 70 and standard deviation 3. (Rockwell hardness is measured on a continuous scale.)
a. If a specimen is acceptable only if its hardness is between 67 and 75, what is the probability that a randomly chosen specimen has an acceptable hardness?
b. If the acceptable range of hardness is $(70 - c, 70 + c)$, for what value of c would 95% of all specimens have acceptable hardness?
c. If the acceptable range is as in part (a) and the hardness of each of ten randomly selected specimens is independently determined, what is the expected number of acceptable specimens among the ten?
d. What is the probability that at most eight of ten independently selected specimens have a hardness of less than 73.84? [Hint: Y = the number among the ten specimens with hardness less than 73.84 is a binomial variable; what is p?]

47. The weight distribution of parcels sent in a certain manner is normal with mean value 12 lb and standard deviation 3.5 lb. The parcel service wishes to establish a weight value c beyond which there will be a surcharge. What value of c is such that 99% of all parcels are at least 1 lb under the surcharge weight?

48. Suppose Appendix Table A.3 contained only $\Phi(z)$ for $z \ge 0$. Explain how you could still compute
a. $P(-1.72 \le Z \le -.55)$
b. $P(-1.72 \le Z \le .55)$
Is it necessary to tabulate $\Phi(z)$ for z negative? What property of the standard normal curve justifies your answer?

49. Consider babies born in the “normal” range of 37–43 weeks gestational age. Extensive data supports the assumption that for such babies born in the United States, birth weight is normally distributed with mean 3432 g and standard deviation 482 g. [The article “Are Babies Normal?” (The American Statistician, 1999: 298–302) analyzed data from a particular year; for a sensible choice of class intervals, a histogram did not look at all normal, but after further investigations it was determined that this was due to some hospitals measuring weight in grams and others measuring to the nearest ounce and then converting to grams. A modified choice of class intervals that allowed for this gave a histogram that was well described by a normal distribution.]
a. What is the probability that the birth weight of a randomly selected baby of this type exceeds 4000 g? Is between 3000 and 4000 g?
b. What is the probability that the birth weight of a randomly selected baby of this type is either less than 2000 g or greater than 5000 g?
c. What is the probability that the birth weight of a randomly selected baby of this type exceeds 7 lb?
d. How would you characterize the most extreme .1% of all birth weights?
e. If X is a random variable with a normal distribution and a is a numerical constant $(a \ne 0)$, then $Y = aX$ also has a normal distribution. Use this to determine the distribution of birth weight expressed in pounds (shape, mean, and standard deviation), and then recalculate the probability from part (c). How does this compare to your previous answer?

50. In response to concerns about nutritional contents of fast foods, McDonald’s has announced that it will use a new cooking oil for its french fries that will decrease substantially trans fatty acid levels and increase the amount of more beneficial polyunsaturated fat. The company claims that 97 out of 100 people cannot detect a difference in taste between the new and old oils. Assuming that this figure is correct (as a long-run proportion), what is the approximate probability that in a random sample of 1000 individuals who have purchased fries at McDonald’s,
a. At least 40 can taste the difference between the two oils?
b. At most 5% can taste the difference between the two oils?

51. Chebyshev’s inequality (see Exercise 44, Chapter 3) is valid for continuous as well as discrete distributions. It states that for any number $k$ satisfying $k \ge 1$,

$$P(|X - \mu| \ge k\sigma) \le 1/k^2$$

(see Exercise 44 in Chapter 3 for an interpretation). Obtain this probability in the case of a normal distribution for $k = 1$, 2, and 3, and compare to the upper bound.

52. Let X denote the number of flaws along a 100-m reel of magnetic tape (an integer-valued variable). Suppose X has approximately a normal distribution with $\mu = 25$ and $\sigma = 5$. Use the continuity correction to calculate the probability that the number of flaws is
a. Between 20 and 30, inclusive.
b. At most 30. Less than 30.

53. Let X have a binomial distribution with parameters $n = 25$ and p. Calculate each of the following probabilities using the normal approximation (with the continuity correction) for the cases $p = .5$, .6, and .8 and compare to the exact probabilities calculated from Appendix Table A.1.
a. $P(15 \le X \le 20)$
b. $P(X \le 15)$
c. $P(20 \le X)$

54. Suppose that 10% of all steel shafts produced by a certain process are nonconforming but can be reworked (rather than having to be scrapped). Consider a random sample of 200 shafts, and let X denote the number among these that are nonconforming and can be reworked. What is the (approximate) probability that X is
a. At most 30?
b. Less than 30?
c. Between 15 and 25 (inclusive)?

55. Suppose only 75% of all drivers in a certain state regularly wear a seat belt. A random sample of 500 drivers is selected. What is the probability that
a. Between 360 and 400 (inclusive) of the drivers in the sample regularly wear a seat belt?
b. Fewer than 400 of those in the sample regularly wear a seat belt?

56. Show that the relationship between a general normal percentile and the corresponding z percentile is as stated in this section.

57. a. Show that if X has a normal distribution with parameters $\mu$ and $\sigma$, then $Y = aX + b$ (a linear function of X) also has a normal distribution. What are the parameters of the distribution of Y [i.e., E(Y) and V(Y)]? [Hint: Write the cdf of Y, $P(Y \le y)$, as an integral involving the pdf of X, and then differentiate with respect to y to get the pdf of Y.]


b. If, when measured in °C, temperature is normally distributed with mean 115 and standard deviation 2, what can be said about the distribution of temperature measured in °F?

58. There is no nice formula for the standard normal cdf $\Phi(z)$, but several good approximations have been published in articles. The following is from “Approximations for Hand Calculators Using Small Integer Coefficients” (Mathematics of Computation, 1977: 214–222). For $0 < z \le 5.5$,

$$P(Z \ge z) = 1 - \Phi(z) \approx .5 \exp\left\{-\left[\frac{(83z + 351)z + 562}{703/z + 165}\right]\right\}$$

The relative error of this approximation is less than .042%. Use this to calculate approximations to the following probabilities, and compare whenever possible to the probabilities obtained from Appendix Table A.3.
a. $P(Z \ge 1)$
b. $P(Z < -3)$
c. $P(-4 < Z < 4)$
d. $P(Z > 5)$
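The approximation quoted in Exercise 58 is easy to evaluate by machine. The sketch below is only an illustration: the function names `upper_tail_approx` and `upper_tail_exact` are hypothetical, and the reference value comes from Python's complementary error function rather than from Appendix Table A.3.

```python
import math

def upper_tail_approx(z: float) -> float:
    """Approximate P(Z >= z) with the small-integer-coefficient formula, 0 < z <= 5.5."""
    if not 0 < z <= 5.5:
        raise ValueError("the approximation is stated only for 0 < z <= 5.5")
    return 0.5 * math.exp(-(((83 * z + 351) * z + 562) / (703 / z + 165)))

def upper_tail_exact(z: float) -> float:
    """Reference value of P(Z >= z) computed from the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (1.0, 3.0, 5.0):
    approx, exact = upper_tail_approx(z), upper_tail_exact(z)
    print(f"z = {z}: approx = {approx:.6g}, exact = {exact:.6g}, "
          f"relative error = {(approx - exact) / exact:.4%}")
```

Probabilities for negative z or for two-sided intervals follow from symmetry, for example $P(Z < -3) = P(Z > 3)$.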

4.4 The Exponential and Gamma Distributions
The density curve corresponding to any normal distribution is bell-shaped and therefore symmetric. There are many practical situations in which the variable of interest to an investigator might have a skewed distribution. One family of distributions that has this property is the gamma family. We first consider a special case, the exponential distribution, and then generalize later in the section.

The Exponential Distribution
The family of exponential distributions provides probability models that are very widely used in engineering and science disciplines.

DEFINITION
X is said to have an exponential distribution with parameter $\lambda$ ($\lambda > 0$) if the pdf of X is

$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.5}$$

Some sources write the exponential pdf in the form $(1/\beta)e^{-x/\beta}$, so that $\beta = 1/\lambda$. The expected value of an exponentially distributed random variable X is

$$E(X) = \int_0^\infty x \lambda e^{-\lambda x}\, dx$$

Obtaining this expected value necessitates doing an integration by parts. The variance of X can be computed using the fact that $V(X) = E(X^2) - [E(X)]^2$. The determination of $E(X^2)$ requires integrating by parts twice in succession. The results of these integrations are as follows:

$$\mu = \frac{1}{\lambda} \qquad \sigma^2 = \frac{1}{\lambda^2}$$

Both the mean and standard deviation of the exponential distribution equal $1/\lambda$. Graphs of several exponential pdf’s are illustrated in Figure 4.26.
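The stated mean and variance can be checked numerically without carrying out the integration by parts. The following is a minimal sketch, assuming an illustrative rate of $\lambda = 2$ and a simple midpoint-rule integrator; `exp_pdf` and `moment` are hypothetical helper names, not anything defined in the text.

```python
import math

def exp_pdf(x: float, lam: float) -> float:
    """Exponential pdf (4.5): lam * exp(-lam * x) for x >= 0, else 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def moment(k: int, lam: float, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of E(X^k), truncating the integral at 50/lam."""
    upper = 50.0 / lam          # the tail beyond this point is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** k * exp_pdf(x, lam)
    return total * h

lam = 2.0                        # illustrative value only
mean = moment(1, lam)
variance = moment(2, lam) - mean ** 2
print(mean, 1 / lam)             # both close to 0.5
print(variance, 1 / lam ** 2)    # both close to 0.25
```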


Figure 4.26 Exponential density curves ($f(x; \lambda)$ versus x for $\lambda = 2$, $\lambda = 1$, and $\lambda = .5$)

The exponential pdf is easily integrated to obtain the cdf.

$$F(x; \lambda) = \begin{cases} 0 & x < 0 \\ 1 - e^{-\lambda x} & x \ge 0 \end{cases}$$

Example 4.21
The article “Probabilistic Fatigue Evaluation of Riveted Railway Bridges” (J. of Bridge Engr., 2008: 237–244) suggested the exponential distribution with mean value 6 MPa as a model for the distribution of stress range in certain bridge connections. Let’s assume that this is in fact the true model. Then $E(X) = 1/\lambda = 6$ implies that $\lambda = .1667$. The probability that stress range is at most 10 MPa is

$$P(X \le 10) = F(10; .1667) = 1 - e^{-(.1667)(10)} = 1 - .189 = .811$$

The probability that stress range is between 5 and 10 MPa is

$$P(5 \le X \le 10) = F(10; .1667) - F(5; .1667) = (1 - e^{-1.667}) - (1 - e^{-.8335}) = .246$$

The exponential distribution is frequently used as a model for the distribution of times between the occurrence of successive events, such as customers arriving at a service facility or calls coming in to a switchboard. The reason for this is that the exponential distribution is closely related to the Poisson process discussed in Chapter 3.

PROPOSITION
Suppose that the number of events occurring in any time interval of length t has a Poisson distribution with parameter $\alpha t$ (where $\alpha$, the rate of the event process, is the expected number of events occurring in 1 unit of time) and that numbers of occurrences in nonoverlapping intervals are independent of one another. Then the distribution of elapsed time between the occurrence of two successive events is exponential with parameter $\lambda = \alpha$.

Although a complete proof is beyond the scope of the text, the result is easily verified for the time $X_1$ until the first event occurs:

$$P(X_1 \le t) = 1 - P(X_1 > t) = 1 - P[\text{no events in } (0, t)] = 1 - \frac{e^{-\alpha t} \cdot (\alpha t)^0}{0!} = 1 - e^{-\alpha t}$$

which is exactly the cdf of the exponential distribution.
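As a numerical check on the exponential cdf, the Example 4.21 probabilities, and the first-event argument for a Poisson process, the following sketch may help. The helper names `exp_cdf` and `poisson_pmf` are hypothetical, and the values $\alpha = .5$, $t = 2$ used in the last check are illustrative assumptions.

```python
import math

def exp_cdf(x: float, lam: float) -> float:
    """Exponential cdf: 1 - exp(-lam * x) for x >= 0, else 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def poisson_pmf(k: int, mean: float) -> float:
    """Poisson probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Example 4.21: mean stress range 6 MPa, so lam = 1/6.
lam = 1 / 6
p_at_most_10 = exp_cdf(10, lam)
p_5_to_10 = exp_cdf(10, lam) - exp_cdf(5, lam)
print(round(p_at_most_10, 3), round(p_5_to_10, 3))

# First-event check: P(X1 <= t) = 1 - P(no events in (0, t)).
alpha, t = 0.5, 2.0              # illustrative rate and time
first_event_cdf = 1.0 - poisson_pmf(0, alpha * t)
print(math.isclose(first_event_cdf, exp_cdf(t, alpha)))
```

Using the unrounded $\lambda = 1/6$ rather than .1667 changes the results only beyond the third decimal place.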


Example 4.22
Suppose that calls are received at a 24-hour “suicide hotline” according to a Poisson process with rate $\alpha = .5$ call per day. Then the number of days X between successive calls has an exponential distribution with parameter value .5, so the probability that more than 2 days elapse between calls is

$$P(X > 2) = 1 - P(X \le 2) = 1 - F(2; .5) = e^{-(.5)(2)} = .368$$

The expected time between successive calls is $1/.5 = 2$ days. ■

Another important application of the exponential distribution is to model the distribution of component lifetime. A partial reason for the popularity of such applications is the “memoryless” property of the exponential distribution. Suppose component lifetime is exponentially distributed with parameter $\lambda$. After putting the component into service, we leave for a period of $t_0$ hours and then return to find the component still working; what now is the probability that it lasts at least an additional t hours? In symbols, we wish $P(X \ge t + t_0 \mid X \ge t_0)$. By the definition of conditional probability,

$$P(X \ge t + t_0 \mid X \ge t_0) = \frac{P[(X \ge t + t_0) \cap (X \ge t_0)]}{P(X \ge t_0)}$$

But the event $X \ge t_0$ in the numerator is redundant, since both events can occur if and only if $X \ge t + t_0$. Therefore,

$$P(X \ge t + t_0 \mid X \ge t_0) = \frac{P(X \ge t + t_0)}{P(X \ge t_0)} = \frac{1 - F(t + t_0; \lambda)}{1 - F(t_0; \lambda)} = e^{-\lambda t}$$

This conditional probability is identical to the original probability $P(X \ge t)$ that the component lasted t hours. Thus the distribution of additional lifetime is exactly the same as the original distribution of lifetime, so at each point in time the component shows no effect of wear. In other words, the distribution of remaining lifetime is independent of current age.

Although the memoryless property can be justified at least approximately in many applied problems, in other situations components deteriorate with age or occasionally improve with age (at least up to a certain point). More general lifetime models are then furnished by the gamma, Weibull, and lognormal distributions (the latter two are discussed in the next section).

The Gamma Function
To define the family of gamma distributions, we first need to introduce a function that plays an important role in many branches of mathematics.

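DEFINITION
For $\alpha > 0$, the gamma function $\Gamma(\alpha)$ is defined by

$$\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x}\, dx \tag{4.6}$$

The most important properties of the gamma function are the following:

1. For any $\alpha > 1$, $\Gamma(\alpha) = (\alpha - 1) \cdot \Gamma(\alpha - 1)$ [via integration by parts]

2. For any positive integer, n, $\Gamma(n) = (n - 1)!$

3. $\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}$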
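The three properties can be spot-checked with the gamma function in Python's standard `math` module. This is only a numerical illustration, not a proof; the test value `a = 3.7` is an arbitrary assumption.

```python
import math

a = 3.7  # arbitrary non-integer value with a > 1

# Property 1: Gamma(a) = (a - 1) * Gamma(a - 1)
recursion_ok = math.isclose(math.gamma(a), (a - 1) * math.gamma(a - 1))

# Property 2: Gamma(n) = (n - 1)! for positive integers n
factorial_ok = all(
    math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 11)
)

# Property 3: Gamma(1/2) = sqrt(pi)
half_ok = math.isclose(math.gamma(0.5), math.sqrt(math.pi))

print(recursion_ok, factorial_ok, half_ok)
```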


By Expression (4.6), if we let

$$f(x; \alpha) = \begin{cases} \dfrac{x^{\alpha - 1} e^{-x}}{\Gamma(\alpha)} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.7}$$

then $f(x; \alpha) \ge 0$ and $\int_0^\infty f(x; \alpha)\, dx = \Gamma(\alpha)/\Gamma(\alpha) = 1$, so $f(x; \alpha)$ satisfies the two basic properties of a pdf.

The Gamma Distribution

DEFINITION
A continuous random variable X is said to have a gamma distribution if the pdf of X is

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta^{\alpha} \Gamma(\alpha)} x^{\alpha - 1} e^{-x/\beta} & x \ge 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.8}$$

where the parameters $\alpha$ and $\beta$ satisfy $\alpha > 0$, $\beta > 0$. The standard gamma distribution has $\beta = 1$, so the pdf of a standard gamma rv is given by (4.7).

The exponential distribution results from taking $\alpha = 1$ and $\beta = 1/\lambda$. Figure 4.27(a) illustrates the graphs of the gamma pdf $f(x; \alpha, \beta)$ (4.8) for several $(\alpha, \beta)$ pairs, whereas Figure 4.27(b) presents graphs of the standard gamma pdf. For the standard pdf, when $\alpha \le 1$, $f(x; \alpha)$ is strictly decreasing as x increases from 0; when $\alpha > 1$, $f(x; \alpha)$ rises from 0 at $x = 0$ to a maximum and then decreases. The parameter $\beta$ in (4.8) is called the scale parameter because values other than 1 either stretch or compress the pdf in the x direction.
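A direct transcription of (4.8) makes the special cases easy to see. The sketch below is a minimal illustration under stated assumptions: `gamma_pdf` is a hypothetical helper name, the parameter values are arbitrary, and the single point $x = 0$ is treated as outside the support so that the code never evaluates a negative power of zero.

```python
import math

def gamma_pdf(x: float, alpha: float, beta: float = 1.0) -> float:
    """Gamma pdf (4.8); beta = 1 gives the standard gamma pdf (4.7)."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x <= 0:
        return 0.0  # x = 0 is skipped to avoid 0 ** (negative) when alpha < 1
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

# Special case: alpha = 1, beta = 1/lam reproduces the exponential pdf.
lam, x = 0.5, 3.0
exponential_match = math.isclose(gamma_pdf(x, 1.0, 1 / lam), lam * math.exp(-lam * x))

# Midpoint-rule check that the pdf with alpha = 2, beta = 2 integrates to about 1.
steps, upper = 100_000, 80.0
h = upper / steps
total_area = sum(gamma_pdf((i + 0.5) * h, 2.0, 2.0) for i in range(steps)) * h

print(exponential_match, round(total_area, 6))
```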

Figure 4.27 (a) Gamma density curves $f(x; \alpha, \beta)$ for $(\alpha, \beta) = (2, 1/3)$, $(1, 1)$, $(2, 2)$, and $(2, 1)$; (b) standard gamma density curves $f(x; \alpha)$ for $\alpha = .6$, 1, 2, and 5

The mean and variance of a random variable X having the gamma distribution $f(x; \alpha, \beta)$ are

$$E(X) = \mu = \alpha\beta \qquad V(X) = \sigma^2 = \alpha\beta^2$$

When X is a standard gamma rv, the cdf of X,

$$F(x; \alpha) = \int_0^x \frac{y^{\alpha - 1} e^{-y}}{\Gamma(\alpha)}\, dy \qquad x > 0 \tag{4.9}$$

is called the incomplete gamma function [sometimes the incomplete gamma function refers to Expression (4.9) without the denominator $\Gamma(\alpha)$ in the integrand]. There are extensive tables of $F(x; \alpha)$ available; in Appendix Table A.4, we present a small tabulation for $\alpha = 1, 2, \ldots, 10$ and $x = 1, 2, \ldots, 15$.

Example 4.23
Suppose the reaction time X of a randomly selected individual to a certain stimulus has a standard gamma distribution with $\alpha = 2$. Since

$$P(a \le X \le b) = F(b) - F(a)$$

when X is continuous,

$$P(3 \le X \le 5) = F(5; 2) - F(3; 2) = .960 - .801 = .159$$

The probability that the reaction time is more than 4 sec is

$$P(X > 4) = 1 - P(X \le 4) = 1 - F(4; 2) = 1 - .908 = .092$$

The incomplete gamma function can also be used to compute probabilities involving nonstandard gamma distributions. These probabilities can also be obtained almost instantaneously from various software packages.

PROPOSITION
Let X have a gamma distribution with parameters $\alpha$ and $\beta$. Then for any $x > 0$, the cdf of X is given by

$$P(X \le x) = F(x; \alpha, \beta) = F\left(\frac{x}{\beta}; \alpha\right)$$

where $F(\cdot\,; \alpha)$ is the incomplete gamma function.
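When $\alpha$ is a positive integer, the incomplete gamma function has a closed form as a Poisson sum, $F(x; \alpha) = 1 - \sum_{k=0}^{\alpha - 1} e^{-x} x^k / k!$ (the relationship Exercise 68(c) asks for). The sketch below uses that identity in place of Appendix Table A.4; it is not the text's method, it does not handle non-integer $\alpha$, and the function names are hypothetical. The last computation applies the scaling rule in the proposition with $\alpha = 8$, $\beta = 15$, the parameter values of Example 4.24.

```python
import math

def std_gamma_cdf_int(x: float, alpha: int) -> float:
    """Incomplete gamma function F(x; alpha) for positive INTEGER alpha only."""
    if alpha < 1 or int(alpha) != alpha:
        raise ValueError("this closed form requires a positive integer alpha")
    if x <= 0:
        return 0.0
    poisson_sum = sum(x ** k / math.factorial(k) for k in range(int(alpha)))
    return 1.0 - math.exp(-x) * poisson_sum

def gamma_cdf_int(x: float, alpha: int, beta: float) -> float:
    """Nonstandard gamma cdf via the proposition: F(x; alpha, beta) = F(x / beta; alpha)."""
    return std_gamma_cdf_int(x / beta, alpha)

# Example 4.23: standard gamma with alpha = 2.
p_3_to_5 = std_gamma_cdf_int(5, 2) - std_gamma_cdf_int(3, 2)
p_more_than_4 = 1.0 - std_gamma_cdf_int(4, 2)
print(round(p_3_to_5, 3), round(p_more_than_4, 3))

# Scaling rule with alpha = 8, beta = 15: P(60 <= X <= 120).
p_60_to_120 = gamma_cdf_int(120, 8, 15) - gamma_cdf_int(60, 8, 15)
print(round(p_60_to_120, 3))
```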

Example 4.24
Suppose the survival time X in weeks of a randomly selected male mouse exposed to 240 rads of gamma radiation has a gamma distribution with $\alpha = 8$ and $\beta = 15$. (Data in Survival Distributions: Reliability Applications in the Biomedical Sciences, by A. J. Gross and V. Clark, suggests $\alpha \approx 8.5$ and $\beta \approx 13.3$.) The expected survival time is $E(X) = (8)(15) = 120$ weeks, whereas $V(X) = (8)(15)^2 = 1800$ and $\sigma_X = \sqrt{1800} = 42.43$ weeks. The probability that a mouse survives between 60 and 120 weeks is

$$\begin{aligned} P(60 \le X \le 120) &= P(X \le 120) - P(X \le 60) \\ &= F(120/15; 8) - F(60/15; 8) \\ &= F(8; 8) - F(4; 8) = .547 - .051 = .496 \end{aligned}$$

The probability that a mouse survives at least 30 weeks is

$$\begin{aligned} P(X \ge 30) &= 1 - P(X < 30) = 1 - P(X \le 30) \\ &= 1 - F(30/15; 8) = .999 \end{aligned}$$

The Chi-Squared Distribution
The chi-squared distribution is important because it is the basis for a number of procedures in statistical inference. The central role played by the chi-squared distribution in inference springs from its relationship to normal distributions (see Exercise 71). We’ll discuss this distribution in more detail in later chapters.

DEFINITION
Let $\nu$ be a positive integer. Then a random variable X is said to have a chi-squared distribution with parameter $\nu$ if the pdf of X is the gamma density with $\alpha = \nu/2$ and $\beta = 2$. The pdf of a chi-squared rv is thus

$$f(x; \nu) = \begin{cases} \dfrac{1}{2^{\nu/2} \Gamma(\nu/2)} x^{(\nu/2) - 1} e^{-x/2} & x \ge 0 \\ 0 & x < 0 \end{cases} \tag{4.10}$$

The parameter $\nu$ is called the number of degrees of freedom (df) of X. The symbol $\chi^2$ is often used in place of “chi-squared.”
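That (4.10) is simply the gamma density with $\alpha = \nu/2$ and $\beta = 2$ can be confirmed numerically. The sketch below compares the two formulas at a few arbitrary points; `gamma_pdf` and `chi2_pdf` are hypothetical helper names, and the point $x = 0$ is treated as outside the support to keep the code simple.

```python
import math

def gamma_pdf(x: float, alpha: float, beta: float) -> float:
    """Gamma pdf (4.8), evaluated for x > 0 only."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta ** alpha * math.gamma(alpha))

def chi2_pdf(x: float, nu: int) -> float:
    """Chi-squared pdf (4.10) with nu degrees of freedom, evaluated for x > 0 only."""
    if x <= 0:
        return 0.0
    return x ** (nu / 2 - 1) * math.exp(-x / 2) / (2 ** (nu / 2) * math.gamma(nu / 2))

# Largest discrepancy over a small grid of df values and x values.
max_gap = max(
    abs(chi2_pdf(x, nu) - gamma_pdf(x, nu / 2, 2.0))
    for nu in range(1, 7)
    for x in (0.5, 1.0, 3.0, 10.0)
)
print(max_gap)  # zero up to floating-point rounding
```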

EXERCISES Section 4.4 (59–71)

59. Let X = the time between two successive arrivals at the drive-up window of a local bank. If X has an exponential distribution with $\lambda = 1$ (which is identical to a standard gamma distribution with $\alpha = 1$), compute the following:
a. The expected time between two successive arrivals
b. The standard deviation of the time between successive arrivals
c. $P(X \le 4)$
d. $P(2 \le X \le 5)$

60. Let X denote the distance (m) that an animal moves from its birth site to the first territorial vacancy it encounters. Suppose that for banner-tailed kangaroo rats, X has an exponential distribution with parameter $\lambda = .01386$ (as suggested in the article “Competition and Dispersal from Multiple Nests,” Ecology, 1997: 873–883).
a. What is the probability that the distance is at most 100 m? At most 200 m? Between 100 and 200 m?
b. What is the probability that distance exceeds the mean distance by more than 2 standard deviations?
c. What is the value of the median distance?

61. Data collected at Toronto Pearson International Airport suggests that an exponential distribution with mean value 2.725 hours is a good model for rainfall duration (Urban Stormwater Management Planning with Analytical Probabilistic Models, 2000, p. 69).
a. What is the probability that the duration of a particular rainfall event at this location is at least 2 hours? At most 3 hours? Between 2 and 3 hours?
b. What is the probability that rainfall duration exceeds the mean value by more than 2 standard deviations? What is the probability that it is less than the mean value by more than one standard deviation?

62. The paper “Microwave Observations of Daily Antarctic Sea-Ice Edge Expansion and Contribution Rates” (IEEE Geosci. and Remote Sensing Letters, 2006: 54–58) states that “The distribution of the daily sea-ice advance/retreat from each sensor is similar and is approximately double exponential.” The proposed double exponential distribution has density function $f(x) = .5\lambda e^{-\lambda |x|}$ for $-\infty < x < \infty$. The standard deviation is given as 40.9 km.
a. What is the value of the parameter $\lambda$?
b. What is the probability that the extent of daily sea-ice change is within 1 standard deviation of the mean value?

63. A consumer is trying to decide between two long-distance calling plans. The first one charges a flat rate of 10¢ per minute, whereas the second charges a flat rate of 99¢ for calls up to 20 minutes in duration and then 10¢ for each additional minute exceeding 20 (assume that calls lasting a noninteger number of minutes are charged proportionately to a whole-minute’s charge). Suppose the consumer’s distribution of call duration is exponential with parameter $\lambda$.
a. Explain intuitively how the choice of calling plan should depend on what the expected call duration is.
b. Which plan is better if expected call duration is 10 minutes? 15 minutes? [Hint: Let $h_1(x)$ denote the cost for the first plan when call duration is x minutes and let $h_2(x)$ be the cost function for the second plan. Give expressions for these two cost functions, and then determine the expected cost for each plan.]

64. Evaluate the following:
a. $\Gamma(6)$
b. $\Gamma(5/2)$
c. $F(4; 5)$ (the incomplete gamma function)
d. $F(5; 4)$
e. $F(0; 4)$

65. Let X have a standard gamma distribution with $\alpha = 7$. Evaluate the following:
a. $P(X \le 5)$
b. $P(X < 5)$
c. $P(X > 8)$
d. $P(3 \le X \le 8)$
e. $P(3 < X < 8)$
f. $P(X < 4 \text{ or } X > 6)$

66. Suppose the time spent by a randomly selected student who uses a terminal connected to a local time-sharing computer facility has a gamma distribution with mean 20 min and variance 80 min$^2$.
a. What are the values of $\alpha$ and $\beta$?
b. What is the probability that a student uses the terminal for at most 24 min?
c. What is the probability that a student spends between 20 and 40 min using the terminal?

67. Suppose that when a transistor of a certain type is subjected to an accelerated life test, the lifetime X (in weeks) has a gamma distribution with mean 24 weeks and standard deviation 12 weeks.
a. What is the probability that a transistor will last between 12 and 24 weeks?
b. What is the probability that a transistor will last at most 24 weeks? Is the median of the lifetime distribution less than 24? Why or why not?
c. What is the 99th percentile of the lifetime distribution?
d. Suppose the test will actually be terminated after t weeks. What value of t is such that only .5% of all transistors would still be operating at termination?

68. The special case of the gamma distribution in which $\alpha$ is a positive integer n is called an Erlang distribution. If we replace $\beta$ by $1/\lambda$ in Expression (4.8), the Erlang pdf is

$$f(x; \lambda, n) = \begin{cases} \dfrac{\lambda(\lambda x)^{n - 1} e^{-\lambda x}}{(n - 1)!} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

It can be shown that if the times between successive events are independent, each with an exponential distribution with parameter $\lambda$, then the total time X that elapses before all of the next n events occur has pdf $f(x; \lambda, n)$.
a. What is the expected value of X? If the time (in minutes) between arrivals of successive customers is exponentially distributed with $\lambda = .5$, how much time can be expected to elapse before the tenth customer arrives?
b. If customer interarrival time is exponentially distributed with $\lambda = .5$, what is the probability that the tenth customer (after the one who has just arrived) will arrive within the next 30 min?
c. The event $\{X \le t\}$ occurs iff at least n events occur in the next t units of time. Use the fact that the number of events occurring in an interval of length t has a Poisson distribution with parameter $\lambda t$ to write an expression (involving Poisson probabilities) for the Erlang cdf $F(t; \lambda, n) = P(X \le t)$.

69. A system consists of five identical components connected in series as shown:

[Diagram: components 1, 2, 3, 4, 5 connected in series]

As soon as one component fails, the entire system will fail. Suppose each component has a lifetime that is exponentially distributed with $\lambda = .01$ and that components fail independently of one another. Define events $A_i$ = {ith component lasts at least t hours}, $i = 1, \ldots, 5$, so that the $A_i$s are independent events. Let X = the time at which the system fails, that is, the shortest (minimum) lifetime among the five components.
a. The event $\{X \ge t\}$ is equivalent to what event involving $A_1, \ldots, A_5$?
b. Using the independence of the $A_i$s, compute $P(X \ge t)$. Then obtain $F(t) = P(X \le t)$ and the pdf of X. What type of distribution does X have?
c. Suppose there are n components, each having exponential lifetime with parameter $\lambda$. What type of distribution does X have?

70. If X has an exponential distribution with parameter $\lambda$, derive a general expression for the (100p)th percentile of the distribution. Then specialize to obtain the median.

71. a. The event $\{X^2 \le y\}$ is equivalent to what event involving X itself?
b. If X has a standard normal distribution, use part (a) to write the integral that equals $P(X^2 \le y)$. Then differentiate this with respect to y to obtain the pdf of $X^2$ [the square of a N(0, 1) variable]. Finally, show that $X^2$ has a chi-squared distribution with $\nu = 1$ df [see (4.10)]. [Hint: Use the following identity.]

$$\frac{d}{dy}\left\{\int_{a(y)}^{b(y)} f(x)\, dx\right\} = f[b(y)] \cdot b'(y) - f[a(y)] \cdot a'(y)$$

4.5 Other Continuous Distributions
The normal, gamma (including exponential), and uniform families of distributions provide a wide variety of probability models for continuous variables, but there are many practical situations in which no member of these families fits a set of observed data very well. Statisticians and other investigators have developed other families of distributions that are often appropriate in practice.

The Weibull Distribution
The family of Weibull distributions was introduced by the Swedish physicist Waloddi Weibull in 1939; his 1951 article “A Statistical Distribution Function of Wide Applicability” (J. of Applied Mechanics, vol. 18: 293–297) discusses a number of applications.


DEFINITION
A random variable X is said to have a Weibull distribution with parameters $\alpha$ and $\beta$ ($\alpha > 0$, $\beta > 0$) if the pdf of X is

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}} x^{\alpha - 1} e^{-(x/\beta)^{\alpha}} & x \ge 0 \\ 0 & x < 0 \end{cases} \tag{4.11}$$

Figure 4.28 Weibull density curves (first panel: $\alpha = 1, \beta = 1$ (exponential); $\alpha = 2, \beta = 1$; $\alpha = 2, \beta = .5$; second panel: $\alpha = 10$ with $\beta = .5$, 1, and 2)

In some situations, there are theoretical justifications for the appropriateness of the Weibull distribution, but in many applications $f(x; \alpha, \beta)$ simply provides a good fit to observed data for particular values of $\alpha$ and $\beta$. When $\alpha = 1$, the pdf reduces to the exponential distribution (with $\lambda = 1/\beta$), so the exponential distribution is a special case of both the gamma and Weibull distributions. However, there are gamma distributions that are not Weibull distributions and vice versa, so one family is not a subset of the other. Both $\alpha$ and $\beta$ can be varied to obtain a number of different-looking density curves, as illustrated in Figure 4.28. $\beta$ is called a scale parameter, since different values stretch or compress the graph in the x direction, and $\alpha$ is referred to as a shape parameter.
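The reduction to the exponential distribution when $\alpha = 1$ can be seen directly by coding (4.11). This is a minimal sketch with a hypothetical helper name `weibull_pdf` and arbitrary illustrative parameter values; the point $x = 0$ is treated as outside the support so that a negative power of zero is never evaluated.

```python
import math

def weibull_pdf(x: float, alpha: float, beta: float) -> float:
    """Weibull pdf (4.11) with shape alpha and scale beta, evaluated for x > 0."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x <= 0:
        return 0.0  # x = 0 is skipped to avoid 0 ** (negative) when alpha < 1
    return (alpha / beta ** alpha) * x ** (alpha - 1) * math.exp(-((x / beta) ** alpha))

# With alpha = 1 the Weibull pdf equals the exponential pdf with lam = 1 / beta.
beta = 4.0
lam = 1 / beta
reduces_to_exponential = all(
    math.isclose(weibull_pdf(x, 1.0, beta), lam * math.exp(-lam * x))
    for x in (0.1, 1.0, 5.0, 20.0)
)
print(reduces_to_exponential)
```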


Integrating to obtain E(X) and $E(X^2)$ yields

$$\mu = \beta\,\Gamma\left(1 + \frac{1}{\alpha}\right) \qquad \sigma^2 = \beta^2\left\{\Gamma\left(1 + \frac{2}{\alpha}\right) - \left[\Gamma\left(1 + \frac{1}{\alpha}\right)\right]^2\right\}$$

The computation of $\mu$ and $\sigma^2$ thus necessitates using the gamma function. The integration $\int_0^x f(y; \alpha, \beta)\, dy$ is easily carried out to obtain the cdf of X.

The cdf of a Weibull rv having parameters $\alpha$ and $\beta$ is

$$F(x; \alpha, \beta) = \begin{cases} 0 & x < 0 \\ 1 - e^{-(x/\beta)^{\alpha}} & x \ge 0 \end{cases} \tag{4.12}$$

Example 4.25
In recent years the Weibull distribution has been used to model engine emissions of various pollutants. Let X denote the amount of NOx emission (g/gal) from a randomly selected four-stroke engine of a certain type, and suppose that X has a Weibull distribution with $\alpha = 2$ and $\beta = 10$ (suggested by information in the article “Quantification of Variability and Uncertainty in Lawn and Garden Equipment NOx and Total Hydrocarbon Emission Factors,” J. of the Air and Waste Management Assoc., 2002: 435–448). The corresponding density curve looks exactly like the one in Figure 4.28 for $\alpha = 2$, $\beta = 1$, except that now the values 50 and 100 replace 5 and 10 on the horizontal axis (because $\beta$ is a “scale parameter”). Then

$$P(X \le 10) = F(10; 2, 10) = 1 - e^{-(10/10)^2} = 1 - e^{-1} = .632$$

Similarly, $P(X \le 25) = .998$, so the distribution is almost entirely concentrated on values between 0 and 25. The value c which separates the 5% of all engines having the largest amounts of emissions from the remaining 95% satisfies

$$.95 = 1 - e^{-(c/10)^2}$$

Isolating the exponential term on one side, taking logarithms, and solving the resulting equation gives $c \approx 17.3$ as the 95th percentile of the emission distribution. ■
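The Example 4.25 calculations, including the percentile obtained by solving $p = 1 - e^{-(c/\beta)^{\alpha}}$ for c, can be reproduced in a few lines. The function names below are hypothetical, and the percentile formula $c = \beta[-\ln(1 - p)]^{1/\alpha}$ is just the algebra described in the example written out.

```python
import math

def weibull_cdf(x: float, alpha: float, beta: float) -> float:
    """Weibull cdf (4.12)."""
    return 1.0 - math.exp(-((x / beta) ** alpha)) if x >= 0 else 0.0

def weibull_percentile(p: float, alpha: float, beta: float) -> float:
    """Value c with F(c; alpha, beta) = p, from inverting (4.12); requires 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    return beta * (-math.log(1.0 - p)) ** (1.0 / alpha)

def weibull_mean_var(alpha: float, beta: float) -> tuple[float, float]:
    """Mean and variance from the gamma-function formulas for the Weibull distribution."""
    g1 = math.gamma(1.0 + 1.0 / alpha)
    g2 = math.gamma(1.0 + 2.0 / alpha)
    return beta * g1, beta ** 2 * (g2 - g1 ** 2)

alpha, beta = 2.0, 10.0                      # Example 4.25 parameters
p_at_most_10 = weibull_cdf(10, alpha, beta)
p_at_most_25 = weibull_cdf(25, alpha, beta)
c_95 = weibull_percentile(0.95, alpha, beta)
mean, variance = weibull_mean_var(alpha, beta)
print(round(p_at_most_10, 3), round(p_at_most_25, 3), round(c_95, 1))
print(round(mean, 3), round(variance, 3))
```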

In practical situations, a Weibull model may be reasonable except that the smallest possible X value may be some value $\gamma$ not assumed to be zero (this would also apply to a gamma model). The quantity $\gamma$ can then be regarded as a third (threshold) parameter of the distribution, which is what Weibull did in his original work. For, say, $\gamma = 3$, all curves in Figure 4.28 would be shifted 3 units to the right. This is equivalent to saying that $X - \gamma$ has the pdf (4.11), so that the cdf of X is obtained by replacing x in (4.12) by $x - \gamma$.

Example 4.26
An understanding of the volumetric properties of asphalt is important in designing mixtures which will result in high-durability pavement. The article “Is a Normal Distribution the Most Appropriate Statistical Distribution for Volumetric Properties in Asphalt Mixtures?” (J. of Testing and Evaluation, Sept. 2009: 1–11) used the analysis of some sample data to recommend that for a particular mixture, X = air void volume (%) be modeled with a three-parameter Weibull distribution. Suppose the values of the parameters are $\gamma = 4$, $\alpha = 1.3$, and $\beta = .8$ (quite close to estimates given in the article).

For $x > 4$, the cumulative distribution function is

$$F(x; \alpha, \beta, \gamma) = F(x; 1.3, .8, 4) = 1 - e^{-[(x - 4)/.8]^{1.3}}$$
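The three-parameter cdf is straightforward to evaluate directly. The sketch below, with the hypothetical helper name `weibull3_cdf`, computes the probability that air void volume falls between 5% and 6% under the parameter values assumed in Example 4.26.

```python
import math

def weibull3_cdf(x: float, alpha: float, beta: float, threshold: float) -> float:
    """Three-parameter Weibull cdf: (4.12) with x replaced by x - threshold."""
    if x <= threshold:
        return 0.0
    return 1.0 - math.exp(-(((x - threshold) / beta) ** alpha))

alpha, beta, threshold = 1.3, 0.8, 4.0       # Example 4.26 parameters
p_5_to_6 = weibull3_cdf(6, alpha, beta, threshold) - weibull3_cdf(5, alpha, beta, threshold)
print(round(p_5_to_6, 3))
```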

The probability that the air void volume of a specimen is between 5% and 6% is

$$\begin{aligned} P(5 \le X \le 6) &= F(6; 1.3, .8, 4) - F(5; 1.3, .8, 4) = e^{-[(5 - 4)/.8]^{1.3}} - e^{-[(6 - 4)/.8]^{1.3}} \\ &= .263 - .037 = .226 \end{aligned}$$

Figure 4.29 shows a graph from Minitab of the corresponding Weibull density function in which the shaded area corresponds to the probability just calculated.

Figure 4.29 Weibull density curve with threshold = 4, shape = 1.3, scale = .8 (shaded area .226 between x = 5 and x = 6)

The Lognormal Distribution

DEFINITION
A nonnegative rv X is said to have a lognormal distribution if the rv $Y = \ln(X)$ has a normal distribution. The resulting pdf of a lognormal rv when ln(X) is normally distributed with parameters $\mu$ and $\sigma$ is

$$f(x; \mu, \sigma) = \begin{cases} \dfrac{1}{\sqrt{2\pi}\,\sigma x} e^{-[\ln(x) - \mu]^2/(2\sigma^2)} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

Be careful here; the parameters $\mu$ and $\sigma$ are not the mean and standard deviation of X but of ln(X). It is common to refer to $\mu$ and $\sigma$ as the location and the scale parameters, respectively. The mean and variance of X can be shown to be

$$E(X) = e^{\mu + \sigma^2/2} \qquad V(X) = e^{2\mu + \sigma^2} \cdot (e^{\sigma^2} - 1)$$

In Chapter 5, we will present a theoretical justification for this distribution in connection with the Central Limit Theorem, but as with other distributions, the lognormal can be used as a model even in the absence of such justification. Figure 4.30 illustrates graphs of the lognormal pdf; although a normal curve is symmetric, a lognormal curve has a positive skew.

Because ln(X) has a normal distribution, the cdf of X can be expressed in terms of the cdf $\Phi(z)$ of a standard normal rv Z.

$$\begin{aligned} F(x; \mu, \sigma) = P(X \le x) &= P[\ln(X) \le \ln(x)] \\ &= P\left(Z \le \frac{\ln(x) - \mu}{\sigma}\right) = \Phi\left(\frac{\ln(x) - \mu}{\sigma}\right) \qquad x \ge 0 \end{aligned} \tag{4.13}$$
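Expression (4.13) and the moment formulas translate directly into code using the standard normal cdf from Python's `statistics.NormalDist`. The following is a sketch only: the function names are hypothetical, the percentile function comes from inverting (4.13), and the parameter values $\mu = 1$, $\sigma = 1$ are illustrative (they match one of the curves in Figure 4.30).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def lognormal_cdf(x: float, mu: float, sigma: float) -> float:
    """Lognormal cdf (4.13): Phi((ln(x) - mu) / sigma) for x > 0, else 0."""
    if x <= 0:
        return 0.0
    return STD_NORMAL.cdf((math.log(x) - mu) / sigma)

def lognormal_mean_var(mu: float, sigma: float) -> tuple[float, float]:
    """E(X) and V(X) for a lognormal rv; mu and sigma describe ln(X), not X."""
    mean = math.exp(mu + sigma ** 2 / 2)
    variance = math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)
    return mean, variance

def lognormal_percentile(p: float, mu: float, sigma: float) -> float:
    """Value c with F(c; mu, sigma) = p, obtained by inverting (4.13)."""
    return math.exp(mu + sigma * STD_NORMAL.inv_cdf(p))

mu, sigma = 1.0, 1.0                          # illustrative values
median = lognormal_percentile(0.5, mu, sigma)
mean, variance = lognormal_mean_var(mu, sigma)
print(median, mean, variance)                 # the mean exceeds the median (positive skew)
```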


Figure 4.30 Lognormal density curves (f(x) versus x for μ = 1, σ = 1; μ = 3, σ = 1; and μ = 3, σ = √3)

Example 4.27
According to the article “Predictive Model for Pitting Corrosion in Buried Oil and Gas Pipelines” (Corrosion, 2009: 332–342), the lognormal distribution has been reported as the best option for describing the distribution of maximum pit depth data from cast iron pipes in soil. The authors suggest that a lognormal distribution with $\mu = .353$ and $\sigma = .754$ is appropriate for maximum pit depth (mm) of buried pipelines. For this distribution, the mean value and variance of pit depth are

$$E(X) = e^{.353 + (.754)^2/2} = e^{.6373} = 1.891$$

$$V(X) = e^{2(.353) + (.754)^2} \cdot (e^{(.754)^2} - 1) = (3.57697)(.765645) = 2.7387$$

The probability that maximum pit depth is between 1 and 2 mm is

$$\begin{aligned} P(1 \le X \le 2) &= P(\ln(1) \le \ln(X) \le \ln(2)) = P(0 \le \ln(X) \le .693) \\ &= P\left(\frac{0 - .353}{.754} \le Z \le \frac{.693 - .353}{.754}\right) = \Phi(.45) - \Phi(-.47) = .354 \end{aligned}$$

This probability is illustrated in Figure 4.31 (from Minitab).

Figure 4.31 Lognormal density curve with location = .353 and scale = .754 (density f(x) versus x, with the area .354 between x = 1 and x = 2 marked)
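The lognormal calculations of Example 4.27 can be reproduced in a few lines. This is a sketch under the assumption that Python's standard-library `statistics.NormalDist` is an acceptable stand-in for the standard normal table; the helper names (`lognormal_mean`, `lognormal_cdf`, and so on) are hypothetical. The last helper also gives the 99th percentile asked for at the end of the example.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution (mean 0, sd 1)

def lognormal_mean(mu, sigma):
    return math.exp(mu + sigma ** 2 / 2)

def lognormal_var(mu, sigma):
    return math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)

def lognormal_cdf(x, mu, sigma):
    """P(X <= x), obtained by standardizing ln(x)."""
    if x <= 0:
        return 0.0
    return Z.cdf((math.log(x) - mu) / sigma)

def lognormal_percentile(p, mu, sigma):
    """The value c with P(X <= c) = p."""
    return math.exp(mu + sigma * Z.inv_cdf(p))

mu, sigma = 0.353, 0.754  # parameter values from Example 4.27
mean = lognormal_mean(mu, sigma)
var = lognormal_var(mu, sigma)
prob = lognormal_cdf(2, mu, sigma) - lognormal_cdf(1, mu, sigma)
c99 = lognormal_percentile(0.99, mu, sigma)

print(f"E(X) = {mean:.3f}, V(X) = {var:.4f}")
print(f"P(1 <= X <= 2) = {prob:.3f}")
print(f"99th percentile = {c99:.3f}")
```

Because the software uses an unrounded z critical value (about 2.326) rather than the tabled 2.33, the computed 99th percentile comes out slightly below the value 8.247 obtained by hand.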


What value c is such that only 1% of all specimens have a maximum pit depth exceeding c? The desired value satisfies

$$.99 = P(X \le c) = P\left(Z \le \frac{\ln(c) - .353}{.754}\right)$$

The z critical value 2.33 captures an upper-tail area of .01 ($z_{.01} = 2.33$), and thus a cumulative area of .99. This implies that

$$\frac{\ln(c) - .353}{.754} = 2.33$$

from which ln(c) = 2.1098 and c = 8.247. Thus 8.247 is the 99th percentile of the maximum pit depth distribution. ■

The Beta Distribution

All families of continuous distributions discussed so far except for the uniform distribution have positive density over an infinite interval (though typically the density function decreases rapidly to zero beyond a few standard deviations from the mean). The beta distribution provides positive density only for X in an interval of finite length.

DEFINITION

A random variable X is said to have a beta distribution with parameters $\alpha$, $\beta$ (both positive), A, and B if the pdf of X is

$$f(x; \alpha, \beta, A, B) = \begin{cases} \dfrac{1}{B - A} \cdot \dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \cdot \Gamma(\beta)} \left(\dfrac{x - A}{B - A}\right)^{\alpha - 1} \left(\dfrac{B - x}{B - A}\right)^{\beta - 1} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$

The case A = 0, B = 1 gives the standard beta distribution.

Figure 4.32 illustrates several standard beta pdf’s. Graphs of the general pdf are similar, except they are shifted and then stretched or compressed to fit over [A, B]. Unless $\alpha$ and $\beta$ are integers, integration of the pdf to calculate probabilities is difficult. Either a table of the incomplete beta function or appropriate software should be used. The mean and variance of X are

$$\mu = A + (B - A) \cdot \frac{\alpha}{\alpha + \beta} \qquad \sigma^2 = \frac{(B - A)^2 \alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$$
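The mean and variance formulas translate directly into code. The sketch below uses hypothetical function names and illustrative parameter values that are not taken from the text; it also shows how moving from the standard interval [0, 1] to a general interval [A, B] shifts and rescales the mean while multiplying the variance by $(B - A)^2$.

```python
def beta_mean(a, b, A=0.0, B=1.0):
    """Mean of a beta rv with shape parameters a, b on the interval [A, B]."""
    return A + (B - A) * a / (a + b)

def beta_var(a, b, A=0.0, B=1.0):
    """Variance of a beta rv with shape parameters a, b on [A, B]."""
    return (B - A) ** 2 * a * b / ((a + b) ** 2 * (a + b + 1))

# Illustrative values only (assumed for this sketch): alpha = 2, beta = 3.
m_std, v_std = beta_mean(2, 3), beta_var(2, 3)                       # on [0, 1]
m_gen, v_gen = beta_mean(2, 3, A=10, B=20), beta_var(2, 3, A=10, B=20)  # on [10, 20]

print(m_std, v_std)
print(m_gen, v_gen)
```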

Figure 4.32 Standard beta density curves: f(x; α, β) versus x on [0, 1] for (α = 2, β = .5), (α = 5, β = 2), and (α = β = .5)


Example 4.28 Project managers often use a method labeled PERT (for program evaluation and review technique) to coordinate the various activities making up a large project. (One successful application was in the construction of the Apollo spacecraft.) A standard assumption in PERT analysis is that the time necessary to complete any particular activity once it has been started has a beta distribution with A = the optimistic time (if everything goes well) and B = the pessimistic time (if everything goes badly). Suppose that in constructing a single-family house, the time X (in days) necessary for laying the foundation has a beta distribution with A = 2, B = 5, $\alpha = 2$, and $\beta = 3$. Then $\alpha/(\alpha + \beta) = .4$, so E(X) = 2 + (3)(.4) = 3.2. For these values of $\alpha$ and $\beta$, the pdf of X is a simple polynomial function. The probability that it takes at most 3 days to lay the foundation is

$$P(X \le 3) = \int_2^3 \frac{1}{3} \cdot \frac{4!}{1!\,2!} \left(\frac{x - 2}{3}\right) \left(\frac{5 - x}{3}\right)^2 dx = \frac{4}{27} \int_2^3 (x - 2)(5 - x)^2\, dx = \frac{4}{27} \cdot \frac{11}{4} = \frac{11}{27} = .407$$

The standard beta distribution is commonly used to model variation in the proportion or percentage of a quantity occurring in different samples, such as the proportion of a 24-hour day that an individual is asleep or the proportion of a certain element in a chemical compound.
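The probability in Example 4.28 can also be obtained by integrating the beta pdf numerically, which is the route one would take when $\alpha$ and $\beta$ are not integers. The sketch below is one possible approach, not the method used in the text: it codes the general beta pdf with `math.gamma` and applies composite Simpson's rule. The names `beta_pdf` and `simpson` are hypothetical helpers.

```python
import math

def beta_pdf(x, a, b, A=0.0, B=1.0):
    """General beta pdf with shape parameters a, b on the interval [A, B]."""
    if x < A or x > B:
        return 0.0
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    u = (x - A) / (B - A)          # rescale x to the unit interval
    return const * u ** (a - 1) * (1 - u) ** (b - 1) / (B - A)

def simpson(f, lo, hi, n=200):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

# Values from Example 4.28: A = 2, B = 5, alpha = 2, beta = 3.
A, B, a, b = 2, 5, 2, 3
prob = simpson(lambda x: beta_pdf(x, a, b, A, B), A, 3)
total_area = simpson(lambda x: beta_pdf(x, a, b, A, B), A, B)

print(f"P(X <= 3) is approximately {prob:.4f}")
print(f"total area under the pdf is approximately {total_area:.4f}")
```

For this example the pdf is a cubic polynomial, for which Simpson's rule is exact apart from rounding, so the result should agree with 11/27.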

EXERCISES Section 4.5 (72–86)

72. The lifetime X (in hundreds of hours) of a certain type of vacuum tube has a Weibull distribution with parameters $\alpha = 2$ and $\beta = 3$. Compute the following:
   a. E(X) and V(X)
   b. $P(X \le 6)$
   c. $P(1.5 \le X \le 6)$
   (This Weibull distribution is suggested as a model for time in service in “On the Assessment of Equipment Reliability: Trading Data Collection Costs for Precision,” J. of Engr. Manuf., 1991: 105–109.)

73. The authors of the article “A Probabilistic Insulation Life Model for Combined Thermal-Electrical Stresses” (IEEE Trans. on Elect. Insulation, 1985: 519–522) state that “the Weibull distribution is widely used in statistical problems relating to aging of solid insulating materials subjected to aging and stress.” They propose the use of the distribution as a model for time (in hours) to failure of solid insulating specimens subjected to AC voltage. The values of the parameters depend on the voltage and temperature; suppose $\alpha = 2.5$ and $\beta = 200$ (values suggested by data in the article).
   a. What is the probability that a specimen’s lifetime is at most 250? Less than 250? More than 300?
   b. What is the probability that a specimen’s lifetime is between 100 and 250?
   c. What value is such that exactly 50% of all specimens have lifetimes exceeding that value?

74. Let X = the time (in $10^{-1}$ weeks) from shipment of a defective product until the customer returns the product. Suppose that the minimum return time is $\gamma = 3.5$ and that the excess X − 3.5 over the minimum has a Weibull distribution with parameters $\alpha = 2$ and $\beta = 1.5$ (see “Practical Applications of the Weibull Distribution,” Industrial Quality Control, Aug. 1964: 71–78).
   a. What is the cdf of X?
   b. What are the expected return time and variance of return time? [Hint: First obtain E(X − 3.5) and V(X − 3.5).]
   c. Compute $P(X > 5)$.
   d. Compute $P(5 \le X \le 8)$.

75. Let X have a Weibull distribution with the pdf from Expression (4.11). Verify that $\mu = \beta\Gamma(1 + 1/\alpha)$. [Hint: In the integral for E(X), make the change of variable $y = (x/\beta)^{\alpha}$, so that $x = \beta y^{1/\alpha}$.]

76. a. In Exercise 72, what is the median lifetime of such tubes? [Hint: Use Expression (4.12).]
   b. In Exercise 74, what is the median return time?
   c. If X has a Weibull distribution with the cdf from Expression (4.12), obtain a general expression for the (100p)th percentile of the distribution.
   d. In Exercise 74, the company wants to refuse to accept returns after t weeks. For what value of t will only 10% of all returns be refused?

77. The authors of the paper from which the data in Exercise 1.27 was extracted suggested that a reasonable probability model for drill lifetime was a lognormal distribution with $\mu = 4.5$ and $\sigma = .8$.
   a. What are the mean value and standard deviation of lifetime?
   b. What is the probability that lifetime is at most 100?
   c. What is the probability that lifetime is at least 200? Greater than 200?


78. The article “On Assessing the Accuracy of Offshore Wind Turbine Reliability-Based Design Loads from the Environmental Contour Method” (Intl. J. of Offshore and Polar Engr., 2005: 132–140) proposes the Weibull distribution with $\alpha = 1.817$ and $\beta = .863$ as a model for 1-hour significant wave height (m) at a certain site.
   a. What is the probability that wave height is at most .5 m?
   b. What is the probability that wave height exceeds its mean value by more than one standard deviation?
   c. What is the median of the wave-height distribution?
   d. For $0 < p < 1$, give a general expression for the 100pth percentile of the wave-height distribution.

79. Nonpoint source loads are chemical masses that travel to the main stem of a river and its tributaries in flows that are distributed over relatively long stream reaches, in contrast to those that enter at well-defined and regulated points. The article “Assessing Uncertainty in Mass Balance Calculation of River Nonpoint Source Loads” (J. of Envir. Engr., 2008: 247–258) suggested that for a certain time period and location, X = nonpoint source load of total dissolved solids could be modeled with a lognormal distribution having mean value 10,281 kg/day/km and a coefficient of variation CV = .40 ($CV = \sigma_X/\mu_X$).
   a. What are the mean value and standard deviation of ln(X)?
   b. What is the probability that X is at most 15,000 kg/day/km?
   c. What is the probability that X exceeds its mean value, and why is this probability not .5?
   d. Is 17,000 the 95th percentile of the distribution?

80. a. Use Equation (4.13) to write a formula for the median $\tilde{\mu}$ of the lognormal distribution. What is the median for the load distribution of Exercise 79?
   b. Recalling that $z_{\alpha}$ is our notation for the $100(1 - \alpha)$ percentile of the standard normal distribution, write an expression for the $100(1 - \alpha)$ percentile of the lognormal distribution. In Exercise 79, what value will load exceed only 1% of the time?

81. A theoretical justification based on a certain material failure mechanism underlies the assumption that ductile strength X of a material has a lognormal distribution. Suppose the parameters are $\mu = 5$ and $\sigma = .1$.
   a. Compute E(X) and V(X).
   b. Compute $P(X > 125)$.
   c. Compute $P(110 \le X \le 125)$.
   d. What is the value of median ductile strength?
   e. If ten different samples of an alloy steel of this type were subjected to a strength test, how many would you expect to have strength of at least 125?
   f. If the smallest 5% of strength values were unacceptable, what would the minimum acceptable strength be?

82. The article “The Statistics of Phytotoxic Air Pollutants” (J. of Royal Stat. Soc., 1989: 183–198) suggests the lognormal distribution as a model for $\mathrm{SO}_2$ concentration above a certain forest. Suppose the parameter values are $\mu = 1.9$ and $\sigma = .9$.
   a. What are the mean value and standard deviation of concentration?
   b. What is the probability that concentration is at most 10? Between 5 and 10?

83. What condition on $\alpha$ and $\beta$ is necessary for the standard beta pdf to be symmetric?

84. Suppose the proportion X of surface area in a randomly selected quadrat that is covered by a certain plant has a standard beta distribution with $\alpha = 5$ and $\beta = 2$.
   a. Compute E(X) and V(X).
   b. Compute $P(X \le .2)$.
   c. Compute $P(.2 \le X \le .4)$.
   d. What is the expected proportion of the sampling region not covered by the plant?

85. Let X have a standard beta density with parameters $\alpha$ and $\beta$.
   a. Verify the formula for E(X) given in the section.
   b. Compute $E[(1 - X)^m]$. If X represents the proportion of a substance consisting of a particular ingredient, what is the expected proportion that does not consist of this ingredient?

86. Stress is applied to a 20-in. steel bar that is clamped in a fixed position at each end. Let Y = the distance from the left end at which the bar snaps. Suppose Y/20 has a standard beta distribution with E(Y) = 10 and V(Y) = 100/7.
   a. What are the parameters of the relevant standard beta distribution?
   b. Compute $P(8 \le Y \le 12)$.
   c. Compute the probability that the bar snaps more than 2 in. from where you expect it to.

4.6 Probability Plots

An investigator will often have obtained a numerical sample $x_1, x_2, \ldots, x_n$ and wish to know whether it is plausible that it came from a population distribution of some particular type (e.g., from a normal distribution). For one thing, many formal procedures from statistical inference are based on the assumption that the population distribution is of a specified type. The use of such a procedure is inappropriate if the actual underlying probability distribution differs greatly from the assumed type.


For example, the article “Toothpaste Detergents: A Potential Source of Oral Soft Tissue Damage” (Intl. J. of Dental Hygiene, 2008: 193–198) contains the following statement: “Because the sample number for each experiment (replication) was limited to three wells per treatment type, the data were assumed to be normally distributed.” As justification for this leap of faith, the authors wrote that “Descriptive statistics showed standard deviations that suggested a normal distribution to be highly likely.” Note: This argument is not very persuasive.

Additionally, understanding the underlying distribution can sometimes give insight into the physical mechanisms involved in generating the data. An effective way to check a distributional assumption is to construct what is called a probability plot. The essence of such a plot is that if the distribution on which the plot is based is correct, the points in the plot should fall close to a straight line. If the actual distribution is quite different from the one used to construct the plot, the points will likely depart substantially from a linear pattern.

Sample Percentiles

The details involved in constructing probability plots differ a bit from source to source. The basis for our construction is a comparison between percentiles of the sample data and the corresponding percentiles of the distribution under consideration. Recall that the (100p)th percentile of a continuous distribution with cdf $F(\cdot)$ is the number $\eta(p)$ that satisfies $F(\eta(p)) = p$. That is, $\eta(p)$ is the number on the measurement scale such that the area under the density curve to the left of $\eta(p)$ is p. Thus the 50th percentile $\eta(.5)$ satisfies $F(\eta(.5)) = .5$, and the 90th percentile satisfies $F(\eta(.9)) = .9$. Consider as an example the standard normal distribution, for which we have denoted the cdf by $\Phi(\cdot)$. From Appendix Table A.3, we find the 20th percentile by locating the row and column in which .2000 (or a number as close to it as possible) appears inside the table. Since .2005 appears at the intersection of the −.8 row and the .04 column, the 20th percentile is approximately −.84. Similarly, the 25th percentile of the standard normal distribution is (using linear interpolation) approximately −.675.

Roughly speaking, sample percentiles are defined in the same way that percentiles of a population distribution are defined. The 50th sample percentile should separate the smallest 50% of the sample from the largest 50%, the 90th percentile should be such that 90% of the sample lies below that value and 10% lies above, and so on. Unfortunately, we run into problems when we actually try to compute the sample percentiles for a particular sample of n observations. If, for example, n = 10, we can split off 20% of these values or 30% of the data, but there is no value that will split off exactly 23% of these ten observations. To proceed further, we need an operational definition of sample percentiles (this is one place where different people do slightly different things). Recall that when n is odd, the sample median or 50th sample percentile is the middle value in the ordered list, for example, the sixth-largest value when n = 11. This amounts to regarding the middle observation as being half in the lower half of the data and half in the upper half. Similarly, suppose n = 10. Then if we call the third-smallest value the 25th percentile, we are regarding that value as being half in the lower group (consisting of the two smallest observations) and half in the upper group (the seven largest observations). This leads to the following general definition of sample percentiles.

DEFINITION

Order the n sample observations from smallest to largest. Then the ith smallest observation in the list is taken to be the [100(i − .5)/n]th sample percentile.
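The definition of sample percentiles is easy to express in code. The following sketch (standard library only; the function names and the three-value data set are hypothetical) computes the percentages 100(i − .5)/n attached to the ordered observations, and also uses `statistics.NormalDist` to obtain the standard normal percentiles that were read from Appendix Table A.3 above.

```python
from statistics import NormalDist

def sample_percentages(n):
    """Percentages 100(i - .5)/n assigned to the n ordered observations."""
    return [100 * (i - 0.5) / n for i in range(1, n + 1)]

def sample_percentile_pairs(data):
    """Pair each ordered observation with the percentage it represents."""
    ordered = sorted(data)
    return list(zip(sample_percentages(len(ordered)), ordered))

# With n = 10 the ordered observations are the 5th, 15th, ..., 95th percentiles.
pcts_10 = sample_percentages(10)
print(pcts_10)

# A tiny made-up sample, purely for illustration.
pairs = sample_percentile_pairs([3.1, 1.2, 2.4])
print(pairs)

# Population percentiles eta(p) of the standard normal distribution.
eta_20 = NormalDist().inv_cdf(0.20)
eta_25 = NormalDist().inv_cdf(0.25)
print(f"20th percentile: {eta_20:.3f}, 25th percentile: {eta_25:.3f}")
```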


Once the percentage values $100(i - .5)/n$ $(i = 1, 2, \ldots, n)$ have been calculated, sample percentiles corresponding to intermediate percentages can be obtained by linear interpolation. For example, if n = 10, the percentages corresponding to the ordered sample observations are 100(1 − .5)/10 = 5%, 100(2 − .5)/10 = 15%, 25%, ..., and 100(10 − .5)/10 = 95%. The 10th percentile is then halfway between the 5th percentile (smallest sample observation) and the 15th percentile (second-smallest observation). For our purposes, such interpolation is not necessary because a probability plot will be based only on the percentages $100(i - .5)/n$ corresponding to the n sample observations.

A Probability Plot

Suppose now that for percentages $100(i - .5)/n$ $(i = 1, \ldots, n)$ the percentiles are determined for a specified population distribution whose plausibility is being investigated. If the sample was actually selected from the specified distribution, the sample percentiles (ordered sample observations) should be reasonably close to the corresponding population distribution percentiles. That is, for $i = 1, 2, \ldots, n$ there should be reasonable agreement between the ith smallest sample observation and the [100(i − .5)/n]th percentile for the specified distribution. Let’s consider the (population percentile, sample percentile) pairs, that is, the pairs

([100(i − .5)/n]th percentile of the distribution, ith smallest sample observation)

for $i = 1, \ldots, n$. Each such pair can be plotted as a point on a two-dimensional coordinate system. If the sample percentiles are close to the corresponding population distribution percentiles, the first number in each pair will be roughly equal to the second number. The plotted points will then fall close to a 45° line. Substantial deviations of the plotted points from a 45° line cast doubt on the assumption that the distribution under consideration is the correct one.

Example 4.29 The value of a certain physical constant is known to an experimenter. The experimenter makes n = 10 independent measurements of this value using a particular measurement device and records the resulting measurement errors (error = observed value − true value). These observations appear in the accompanying table.

Percentage             5       15       25       35       45
z percentile        −1.645   −1.037    −.675    −.385    −.126
Sample observation  −1.91    −1.25     −.75     −.53      .20

Percentage            55       65       75       85       95
z percentile          .126     .385     .675    1.037    1.645
Sample observation    .35      .72      .87     1.40     1.56

Is it plausible that the random variable measurement error has a standard normal distribution? The needed standard normal (z) percentiles are also displayed in the table. Thus the points in the probability plot are (−1.645, −1.91), (−1.037, −1.25), ..., and (1.645, 1.56). Figure 4.33 shows the resulting plot. Although the points deviate a bit from the 45° line, the predominant impression is that this line fits the points very well. The plot suggests that the standard normal distribution is a reasonable probability model for measurement error.

Figure 4.33 Plots of pairs (z percentile, observed value) for the data of Example 4.29: first sample (x versus z percentile, with the 45° line drawn)

Figure 4.34 shows a plot of pairs (z percentile, observation) for a second sample of ten observations. The 45° line gives a good fit to the middle part of the sample but not to the extremes. The plot has a well-defined S-shaped appearance. The two smallest sample observations are considerably larger than the corresponding z percentiles (the points on the far left of the plot are well above the 45° line). Similarly, the two largest sample observations are much smaller than the associated z percentiles. This plot indicates that the standard normal distribution would not be a plausible choice for the probability model that gave rise to these observed measurement errors.

Figure 4.34 Plots of pairs (z percentile, observed value) for the data of Example 4.29: second sample (x versus z percentile, with the 45° line and an S-shaped curve through the points)
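The plotted pairs for the first sample of Example 4.29 can be generated as follows. This is a sketch assuming `statistics.NormalDist` in place of the normal table, so the z percentiles may differ from the tabled values in the third decimal place. The largest vertical distance from the 45° line is included only as a crude numerical summary of our own; it is not a formal test and is not part of the example.

```python
from statistics import NormalDist

# Measurement errors from the table in Example 4.29 (first sample).
errors = [-1.91, -1.25, -0.75, -0.53, 0.20, 0.35, 0.72, 0.87, 1.40, 1.56]

def z_percentiles(n):
    """Standard normal percentiles for the percentages 100(i - .5)/n."""
    std = NormalDist()
    return [std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

ordered = sorted(errors)
pairs = list(zip(z_percentiles(len(ordered)), ordered))

# Vertical distance of each plotted point from the 45 degree line x = z.
max_gap = max(abs(x - z) for z, x in pairs)

for z, x in pairs:
    print(f"{z:7.3f}  {x:6.2f}")
print(f"largest distance from the 45 degree line: {max_gap:.3f}")
```

Passing the pairs to any plotting tool, with the line x = z overlaid, gives a picture of the kind shown in Figure 4.33.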


An investigator is typically not interested in knowing just whether a specified probability distribution, such as the standard normal distribution (normal with $\mu = 0$ and $\sigma = 1$) or the exponential distribution with $\lambda = .1$, is a plausible model for the population distribution from which the sample was selected. Instead, the issue is whether some member of a family of probability distributions specifies a plausible model: the family of normal distributions, the family of exponential distributions, the family of Weibull distributions, and so on. The values of the parameters of a distribution are usually not specified at the outset. If the family of Weibull distributions is under consideration as a model for lifetime data, are there any values of the parameters $\alpha$ and $\beta$ for which the corresponding Weibull distribution gives a good fit to the data? Fortunately, it is almost always the case that just one probability plot will suffice for assessing the plausibility of an entire family. If the plot deviates substantially from a straight line, no member of the family is plausible. When the plot is quite straight, further work is necessary to estimate values of the parameters that yield the most reasonable distribution of the specified type.

Let’s focus on a plot for checking normality. Such a plot is useful in applied work because many formal statistical procedures give accurate inferences only when the population distribution is at least approximately normal. These procedures should generally not be used if the normal probability plot shows a very pronounced departure from linearity. The key to constructing an omnibus normal probability plot is the relationship between standard normal (z) percentiles and those for any other normal distribution:

percentile for a normal ($\mu$, $\sigma$) distribution = $\mu + \sigma \cdot$ (corresponding z percentile)

Consider first the case $\mu = 0$. If each observation is exactly equal to the corresponding normal percentile for some value of $\sigma$, the pairs ($\sigma \cdot$ [z percentile], observation) fall on a 45° line, which has slope 1. This then implies that the (z percentile, observation) pairs fall on a line passing through (0, 0) (i.e., one with y-intercept 0) but having slope $\sigma$ rather than 1. The effect of a nonzero value of $\mu$ is simply to change the y-intercept from 0 to $\mu$.

A plot of the n pairs

([100(i − .5)/n]th z percentile, ith smallest observation)

on a two-dimensional coordinate system is called a normal probability plot. If the sample observations are in fact drawn from a normal distribution with mean value $\mu$ and standard deviation $\sigma$, the points should fall close to a straight line with slope $\sigma$ and intercept $\mu$. Thus a plot for which the points fall close to some straight line suggests that the assumption of a normal population distribution is plausible.

Example 4.30 The accompanying sample consisting of n = 20 observations on dielectric breakdown voltage of a piece of epoxy resin appeared in the article “Maximum Likelihood Estimation in the 3-Parameter Weibull Distribution” (IEEE Trans. on Dielectrics and Elec. Insul., 1996: 43–55). The values of (i − .5)/n for which z percentiles are needed are (1 − .5)/20 = .025, (2 − .5)/20 = .075, ..., and .975.

Observation    24.46  25.61  26.25  26.42  26.66  27.15  27.31  27.54  27.74  27.94
z percentile   −1.96  −1.44  −1.15   −.93   −.76   −.60   −.45   −.32   −.19   −.06

Observation    27.98  28.04  28.28  28.49  28.50  28.87  29.11  29.13  29.50  30.88
z percentile     .06    .19    .32    .45    .60    .76    .93   1.15   1.44   1.96


Figure 4.35 shows the resulting normal probability plot. The pattern in the plot is quite straight, indicating it is plausible that the population distribution of dielectric breakdown voltage is normal.

Figure 4.35 Normal probability plot for the dielectric breakdown voltage sample (x versus z percentile) ■

There is an alternative version of a normal probability plot in which the z percentile axis is replaced by a nonlinear probability axis. The scaling on this axis is constructed so that plotted points should again fall close to a line when the sampled distribution is normal. Figure 4.36 shows such a plot from Minitab for the breakdown voltage data of Example 4.30.

Figure 4.36 Normal probability plot of the breakdown voltage data from Minitab (probability scale from .001 to .999 versus voltage)
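The points of the normal probability plot for the breakdown voltage data of Example 4.30 can be computed as shown below. Since points from a normal sample should fall near a line with slope $\sigma$ and intercept $\mu$, the sketch also fits a least-squares line and computes the correlation between the z percentiles and the ordered observations. The line fit and the correlation are our own rough numerical summaries, added as an assumption for illustration; they are not the procedure used in the example, which judges the plot visually.

```python
import math
from statistics import NormalDist, mean

# Dielectric breakdown voltages from Example 4.30.
voltage = [24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
           27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88]

n = len(voltage)
x = sorted(voltage)
z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

zbar, xbar = mean(z), mean(x)
szz = sum((zi - zbar) ** 2 for zi in z)
sxx = sum((xi - xbar) ** 2 for xi in x)
szx = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))

slope = szx / szz                 # rough indication of sigma
intercept = xbar - slope * zbar   # rough indication of mu
r = szx / math.sqrt(szz * sxx)    # closeness of the points to a straight line

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.4f}")
```

A correlation close to 1 is consistent with the visual impression of a quite straight pattern, though how close is close enough depends on the sample size.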

A nonnormal population distribution can often be placed in one of the following three categories:

1. It is symmetric and has “lighter tails” than does a normal distribution; that is, the density curve declines more rapidly out in the tails than does a normal curve.

2. It is symmetric and heavy-tailed compared to a normal distribution.

3. It is skewed.


A uniform distribution is light-tailed, since its density function drops to zero outside a finite interval. The density function $f(x) = 1/[\pi(1 + x^2)]$ for $-\infty < x < \infty$ is heavy-tailed, since $1/(1 + x^2)$ declines much less rapidly than does $e^{-x^2/2}$. Lognormal and Weibull distributions are among those that are skewed. When the points in a normal probability plot do not adhere to a straight line, the pattern will frequently suggest that the population distribution is in a particular one of these three categories.

When the distribution from which the sample is selected is light-tailed, the largest and smallest observations are usually not as extreme as would be expected from a normal random sample. Visualize a straight line drawn through the middle part of the plot; points on the far right tend to be below the line (observed value < z percentile), whereas points on the left end of the plot tend to fall above the straight line (observed value > z percentile). The result is an S-shaped pattern of the type pictured in Figure 4.34.

A sample from a heavy-tailed distribution also tends to produce an S-shaped plot. However, in contrast to the light-tailed case, the left end of the plot curves downward (observed < z percentile), as shown in Figure 4.37(a). If the underlying distribution is positively skewed (a short left tail and a long right tail), the smallest sample observations will be larger than expected from a normal sample and so will the largest observations. In this case, points on both ends of the plot will fall above a straight line through the middle part, yielding a curved pattern, as illustrated in Figure 4.37(b). A sample from a lognormal distribution will usually produce such a pattern. A plot of (z percentile, ln(x)) pairs should then resemble a straight line.

Figure 4.37 Probability plots that suggest a nonnormal distribution (x versus z percentile): (a) a plot consistent with a heavy-tailed distribution; (b) a plot consistent with a positively skewed distribution
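The behavior described for a positively skewed distribution can be illustrated with an idealized lognormal "sample" whose ordered values are set exactly equal to the lognormal percentiles. The sketch below rests on several assumptions of ours: the parameter values $\mu = 0$, $\sigma = 1$ are hypothetical, a real sample would scatter around these values, and the "line through the middle part" is taken to be the line through the two central points. It checks that both ends of the (z percentile, x) plot lie above that line, and that the (z percentile, ln(x)) pairs are perfectly linear.

```python
import math
from statistics import NormalDist

def correlation(u, v):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    suv = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    suu = sum((a - ubar) ** 2 for a in u)
    svv = sum((b - vbar) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

n = 20
z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

mu, sigma = 0.0, 1.0                          # hypothetical lognormal parameters
x = [math.exp(mu + sigma * zi) for zi in z]   # exact lognormal percentiles

# Straight line through the two central points of the (z, x) plot.
lo, hi = n // 2 - 1, n // 2
mid_slope = (x[hi] - x[lo]) / (z[hi] - z[lo])
def middle_line(t):
    return x[lo] + mid_slope * (t - z[lo])

ends_above = x[0] > middle_line(z[0]) and x[-1] > middle_line(z[-1])
r_raw = correlation(z, x)                          # curved pattern
r_log = correlation(z, [math.log(xi) for xi in x])  # straight after taking logs

print(ends_above, round(r_raw, 3), round(r_log, 3))
```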

Even when the population distribution is normal, the sample percentiles will not coincide exactly with the theoretical percentiles because of sampling variability. How much can the points in the probability plot deviate from a straight-line pattern before the assumption of population normality is no longer plausible? This is not an easy question to answer. Generally speaking, a small sample from a normal distribution is more likely to yield a plot with a nonlinear pattern than is a large sample. The book Fitting Equations to Data (see the Chapter 13 bibliography) presents the results of a simulation study in which numerous samples of different sizes were selected from normal distributions. The authors concluded that there is typically greater variation in the appearance of the probability plot for sample sizes smaller than 30, and only for much larger sample sizes does a linear pattern generally predominate. When a plot is based on a small sample size, only a very substantial departure from linearity should be taken as conclusive evidence of nonnormality. A similar comment applies to probability plots for checking the plausibility of other types of distributions.

Beyond Normality

Consider a family of probability distributions involving two parameters, $\theta_1$ and $\theta_2$, and let $F(x; \theta_1, \theta_2)$ denote the corresponding cdf’s. The family of normal distributions is one such family, with $\theta_1 = \mu$, $\theta_2 = \sigma$, and $F(x; \mu, \sigma) = \Phi[(x - \mu)/\sigma]$. Another example is the Weibull family, with $\theta_1 = \alpha$, $\theta_2 = \beta$, and

$$F(x; \alpha, \beta) = 1 - e^{-(x/\beta)^{\alpha}}$$

Still another family of this type is the gamma family, for which the cdf is an integral involving the incomplete gamma function that cannot be expressed in any simpler form.

The parameters $\theta_1$ and $\theta_2$ are said to be location and scale parameters, respectively, if $F(x; \theta_1, \theta_2)$ is a function of $(x - \theta_1)/\theta_2$. The parameters $\mu$ and $\sigma$ of the normal family are location and scale parameters, respectively. Changing $\mu$ shifts the location of the bell-shaped density curve to the right or left, and changing $\sigma$ amounts to stretching or compressing the measurement scale (the scale on the horizontal axis when the density function is graphed). Another example is given by the cdf

$$F(x; \theta_1, \theta_2) = 1 - e^{-e^{(x - \theta_1)/\theta_2}} \qquad -\infty < x < \infty$$

A random variable with this cdf is said to have an extreme value distribution. It is used in applications involving component lifetime and material strength.

Although the form of the extreme value cdf might at first glance suggest that $\theta_1$ is the point of symmetry for the density function, and therefore the mean and median, this is not the case. Instead, $P(X \le \theta_1) = F(\theta_1; \theta_1, \theta_2) = 1 - e^{-1} = .632$, and the density function $f(x; \theta_1, \theta_2) = F'(x; \theta_1, \theta_2)$ is negatively skewed (a long lower tail). Similarly, the scale parameter $\theta_2$ is not the standard deviation ($\mu = \theta_1 - .5772\theta_2$ and $\sigma = 1.283\theta_2$). However, changing the value of $\theta_1$ does change the location of the density curve, whereas a change in $\theta_2$ rescales the measurement axis.

The parameter of the Weibull distribution is a scale parameter, but a is not a location parameter. A similar comment applies to the parameters and of the gamma distribution. In the usual form, the density function for any member of either the gamma or Weibull distribution is positive for and zero otherwise. A loca- tion parameter can be introduced as a third parameter (we did this for the Weibull distribution) to shift the density function so that it is positive if and zero otherwise.

When the family under consideration has only location and scale parameters, the issue of whether any member of the family is a plausible population distribution can be addressed via a single, easily constructed probability plot. One first obtains the percentiles of the standard distribution, the one with and , for per- centages . The n (standardized percentile, observation) pairs give the points in the plot. This is exactly what we did to obtain an omnibus normal probability plot. Somewhat surprisingly, this methodology can be applied to yield an omnibus Weibull probability plot. The key result is that if X has a Weibull distribution with shape parameter and scale parameter , then the transformedba

100(i 2 .5)/n (i 5 1, c, n) u2 5 1u1 5 0

x . g g

x . 0

ba

b

u2

u1s 5 1.283u2m 5 u1 2 .5772u2

u2

f(x; u1, u2) 5 Fr(x; u1, u2) P(X # u1) 5 F(u1; u1, u2) 5 1 2 e

21 5 .632 u1

F(x; u1, u2) 5 1 2 e 2e(x2u1)/u2 2` , x , `

s

m

sm(x 2 u1)/u2F(x; u1, u2) u2u1

F(x; a, b) 5 1 2 e2(x/b)a u1 5 a, u2 5 b

F(x; m, s) 5 �[(x 2 m)/s]u1 5 m, u2 5 s F(x; u1, u2)

u2u1

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Example 4.31

186 CHAPTER 4 Continuous Random Variables and Probability Distributions

variable ln(X) has an extreme value distribution with location parameter and scale parameter . Thus a plot of the (extreme value standardized percentile, ln(x)) pairs showing a strong linear pattern provides support for choosing the Weibull distribution as a population model.

The accompanying observations are on lifetime (in hours) of power apparatus insulation when thermal and electrical stress acceleration were fixed at particular values (“On the Estimation of Life of Power Apparatus Insulation Under Combined Electrical and Thermal Stress,” IEEE Trans. on Electrical Insulation, 1985: 70–78). A Weibull probability plot necessitates first computing the 5th, 15th, . . . , and 95th percentiles of the standard extreme value distribution. The (100p)th percentile $\eta(p)$ satisfies

$$p = F(\eta(p)) = 1 - e^{-e^{\eta(p)}}$$

from which $\eta(p) = \ln[-\ln(1 - p)]$.

Percentile   −2.97   −1.82   −1.25   −.84    −.51
x            282     501     741     851     1072
ln(x)        5.64    6.22    6.61    6.75    6.98

Percentile   −.23    .05     .33     .64     1.10
x            1122    1202    1585    1905    2138
ln(x)        7.02    7.09    7.37    7.55    7.67

The pairs (−2.97, 5.64), (−1.82, 6.22), . . . , (1.10, 7.67) are plotted as points in Figure 4.38. The straightness of the plot argues strongly for using the Weibull distribution as a model for insulation life, a conclusion also reached by the author of the cited article.

Figure 4.38 A Weibull probability plot of the insulation lifetime data (vertical axis: ln(x); horizontal axis: percentile) ■
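The construction in Example 4.31 can be sketched in a few lines of Python. The reason the plot works is that $P(\ln X \le y) = P(X \le e^{y}) = 1 - e^{-(e^{y}/\beta)^{\alpha}} = 1 - e^{-e^{(y - \ln\beta)/(1/\alpha)}}$, so ln(X) has location ln(β) and scale 1/α. The code below recomputes the percentiles $\eta(p) = \ln[-\ln(1 - p)]$ for the insulation data. As an addition not made in the text, it also fits a least-squares line to the points and reads off rough graphical values of α and β from the slope and intercept. Those two numbers are only an illustration (the text defers estimation methods to Chapter 6), and the function names are hypothetical.

```python
import math

# Insulation lifetimes (hours) from Example 4.31.
lifetimes = [282, 501, 741, 851, 1072, 1122, 1202, 1585, 1905, 2138]

def extreme_value_percentile(p):
    """eta(p) = ln[-ln(1 - p)]: (100p)th percentile of the standard extreme value cdf."""
    return math.log(-math.log(1.0 - p))

def weibull_plot_points(data):
    """(extreme value percentile, ln(x)) pairs for the percentages 100(i - .5)/n."""
    xs = sorted(data)
    n = len(xs)
    return [(extreme_value_percentile((i - 0.5) / n), math.log(x))
            for i, x in enumerate(xs, start=1)]

def least_squares_line(points):
    """Slope, intercept, and correlation of a line fitted to the plotted points."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in points)
    suu = sum((u - mean_u) ** 2 for u, _ in points)
    svv = sum((v - mean_v) ** 2 for _, v in points)
    slope = suv / suu
    return slope, mean_v - slope * mean_u, suv / math.sqrt(suu * svv)

points = weibull_plot_points(lifetimes)
slope, intercept, r = least_squares_line(points)

# ln(X) has location ln(beta) and scale 1/alpha, so on the plot
# slope ~ 1/alpha and intercept ~ ln(beta). Rough graphical values only.
alpha_rough = 1.0 / slope
beta_rough = math.exp(intercept)

for u, v in points:
    print(f"{u:6.2f}  {v:5.2f}")
print(f"r = {r:.3f}, rough alpha = {alpha_rough:.2f}, rough beta = {beta_rough:.0f}")
```

A correlation close to 1 is the numerical counterpart of the "straightness" that the example judges by eye. It does not replace looking at the plot.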

The gamma distribution is an example of a family involving a shape parameter for which there is no transformation $h(\cdot)$ such that h(X) has a distribution that depends only on location and scale parameters. Construction of a probability plot necessitates first estimating the shape parameter from sample data (some methods for doing this are described in Chapter 6). Sometimes an investigator wishes to know whether the transformed variable $X^{\theta}$ has a normal distribution for some value of $\theta$ (by convention, $\theta = 0$ is identified with the logarithmic transformation, in which case X has a lognormal distribution). The book Graphical Methods for Data Analysis, listed in the Chapter 1 bibliography, discusses this type of problem as well as other refinements of probability plotting. Fortunately, the wide availability of various probability plots with statistical software packages means that the user can often sidestep technical details.
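One informal way to explore whether some power $X^{\theta}$ looks normal is to build a normal probability plot for each of a few candidate values of θ and compare them. The sketch below does this numerically. It is only an illustration under assumed choices: the data are simulated from a lognormal distribution rather than taken from the text, the candidate values 0, .5, and 1 are arbitrary, and the correlation of the plotted pairs is used as a crude summary of straightness.

```python
import math
import random
from statistics import NormalDist, mean

def power_transform(x, theta):
    """x**theta, with theta = 0 identified with the natural logarithm."""
    return math.log(x) if theta == 0 else x ** theta

def plot_correlation(values):
    """Correlation between the ordered values and the matching z percentiles."""
    n = len(values)
    std = NormalDist()
    z = [std.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    y = sorted(values)
    mz, my = mean(z), mean(y)
    szy = sum((a - mz) * (b - my) for a, b in zip(z, y))
    szz = sum((a - mz) ** 2 for a in z)
    syy = sum((b - my) ** 2 for b in y)
    return szy / math.sqrt(szz * syy)

rng = random.Random(46)  # arbitrary seed
sample = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]  # simulated, not textbook data

thetas = [0, 0.5, 1]  # candidate powers; 0 means ln(x)
scores = {t: plot_correlation([power_transform(x, t) for x in sample]) for t in thetas}
for t in thetas:
    print(f"theta = {t}: r = {scores[t]:.3f}")
```

Because the simulated observations are lognormal, the logarithmic transformation should give the straightest plot here. With real data the comparison is less clear-cut and the plots themselves should be examined.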

EXERCISES Section 4.6 (87–97)

87. The accompanying normal probability plot was constructed from a sample of 30 readings on tension for mesh screens behind the surface of video display tubes used in computer monitors. Does it appear plausible that the tension distribution is normal?

[Normal probability plot for Exercise 87: x on the vertical axis (200 to 350) versus z percentile on the horizontal axis (−2 to 2)]

88. A sample of 15 female collegiate golfers was selected and the clubhead velocity (km/hr) while swinging a driver was determined for each one, resulting in the following data (“Hip Rotational Velocities During the Full Golf Swing,” J. of Sports Science and Medicine, 2009: 296–299):

69.0 69.7 72.7 80.3 81.0 85.0 86.0 86.3 86.7 87.7 89.3 90.7 91.0 92.5 93.0

The corresponding z percentiles are

−1.83 −1.28 −0.97 −0.73 −0.52 −0.34 −0.17 0.0 0.17 0.34 0.52 0.73 0.97 1.28 1.83

Construct a normal probability plot and a dotplot. Is it plausible that the population distribution is normal?

89. Construct a normal probability plot for the following sample of observations on coating thickness for low-viscosity paint (“Achieving a Target Value for a Manufacturing Process: A Case Study,” J. of Quality Technology, 1992: 22–26). Would you feel comfortable estimating population mean thickness using a method that assumed a normal population distribution?

.83 .88 .88 1.04 1.09 1.12 1.29 1.31 1.48 1.49 1.59 1.62 1.65 1.71 1.76 1.83

90. The article “A Probabilistic Model of Fracture in Concrete and Size Effects on Fracture Toughness” (Magazine of Concrete Res., 1996: 311–320) gives arguments for why fracture toughness in concrete specimens should have a Weibull distribution and presents several histograms of data that appear well fit by superimposed Weibull curves. Consider the following sample of size $n = 18$ observations on toughness for high-strength concrete (consistent with one of the histograms); values of $p_i = (i - .5)/18$ are also given.

Observation   .47     .58     .65     .69     .72     .74
p_i           .0278   .0833   .1389   .1944   .2500   .3056

Observation   .77     .79     .80     .81     .82     .84
p_i           .3611   .4167   .4722   .5278   .5833   .6389

Observation   .86     .89     .91     .95     1.01    1.04
p_i           .6944   .7500   .8056   .8611   .9167   .9722

Construct a Weibull probability plot and comment.

91. Construct a normal probability plot for the fatigue-crack propagation data given in Exercise 39 (Chapter 1). Does it appear plausible that propagation life has a normal distribution? Explain.

92. The article “The Load-Life Relationship for M50 Bearings with Silicon Nitride Ceramic Balls” (Lubrication Engr., 1984: 153–159) reports the accompanying data on bearing load life (million revs.) for bearings tested at a 6.45 kN load.

47.1 68.1 68.1 90.8 103.6 106.0 115.0 126.0 146.6 229.0 240.0 240.0 278.0 278.0 289.0 289.0 367.0 385.9 392.0 505.0

a. Construct a normal probability plot. Is normality plausible?
b. Construct a Weibull probability plot. Is the Weibull distribution family plausible?

93. Construct a probability plot that will allow you to assess the plausibility of the lognormal distribution as a model for the rainfall data of Exercise 83 in Chapter 1.

94. The accompanying observations are precipitation values during March over a 30-year period in Minneapolis-St. Paul.

.77 1.20 3.00 1.62 2.81 2.48
1.74 .47 3.09 1.31 1.87 .96
.81 1.43 1.51 .32 1.18 1.89
1.20 3.37 2.10 .59 1.35 .90
1.95 2.20 .52 .81 4.75 2.05

a. Construct and interpret a normal probability plot for this data set.
b. Calculate the square root of each value and then construct a normal probability plot based on this transformed data. Does it seem plausible that the square root of precipitation is normally distributed?
c. Repeat part (b) after transforming by cube roots.

95. Use a statistical software package to construct a normal probability plot of the tensile ultimate-strength data given in Exercise 13 of Chapter 1, and comment.

96. Let the ordered sample observations be denoted by $y_1, y_2, \ldots, y_n$ ($y_1$ being the smallest and $y_n$ the largest). Our suggested check for normality is to plot the $(\Phi^{-1}((i - .5)/n), y_i)$ pairs. Suppose we believe that the observations come from a distribution with mean 0, and let $w_1, \ldots, w_n$ be the ordered absolute values of the $x_i$'s. A half-normal plot is a probability plot of the $w_i$'s. More specifically, since $P(|Z| \le w) = P(-w \le Z \le w) = 2\Phi(w) - 1$, a half-normal plot is a plot of the $(\Phi^{-1}\{[(i - .5)/n + 1]/2\}, w_i)$ pairs. The virtue of this plot is that small or large outliers in the original sample will now appear only at the upper end of the plot rather than at both ends. Construct a half-normal plot for the following sample of measurement errors, and comment:

−3.78, −1.27, 1.44, −.39, 12.38, −43.40, 1.15, −3.96, −2.34, 30.84

97. The following failure time observations (1000s of hours) resulted from accelerated life testing of 16 integrated circuit chips of a certain type:

82.8 11.6 359.5 502.5 307.8 179.7 242.0 26.5
244.8 304.3 379.1 212.6 229.9 558.9 366.7 204.6

Use the corresponding percentiles of the exponential distribution with $\lambda = 1$ to construct a probability plot. Then explain why the plot assesses the plausibility of the sample having been generated from any exponential distribution.

SUPPLEMENTARY EXERCISES (98–128)

98. Let X = the time it takes a read/write head to locate a desired record on a computer disk memory device once the head has been positioned over the correct track. If the disks rotate once every 25 millisec, a reasonable assumption is that X is uniformly distributed on the interval [0, 25].
a. Compute $P(10 \le X \le 20)$.
b. Compute $P(X \ge 10)$.
c. Obtain the cdf F(X).
d. Compute E(X) and $\sigma_X$.

99. A 12-in. bar that is clamped at both ends is to be subjected to an increasing amount of stress until it snaps. Let Y = the distance from the left end at which the break occurs. Suppose Y has pdf

$$f(y) = \begin{cases} \left(\dfrac{1}{24}\right) y \left(1 - \dfrac{y}{12}\right) & 0 \le y \le 12 \\ 0 & \text{otherwise} \end{cases}$$

Compute the following:
a. The cdf of Y, and graph it.
b. $P(Y \le 4)$, $P(Y > 6)$, and $P(4 \le Y \le 6)$
c. E(Y), $E(Y^2)$, and V(Y)
d. The probability that the break point occurs more than 2 in. from the expected break point.
e. The expected length of the shorter segment when the break occurs.

100. Let X denote the time to failure (in years) of a certain hydraulic component. Suppose the pdf of X is $f(x) = 32/(x + 4)^3$ for $x > 0$.
a. Verify that f(x) is a legitimate pdf.
b. Determine the cdf.
c. Use the result of part (b) to calculate the probability that time to failure is between 2 and 5 years.
d. What is the expected time to failure?
e. If the component has a salvage value equal to $100/(4 + x)$ when its time to failure is x, what is the expected salvage value?

101. The completion time X for a certain task has cdf F(x) given by

$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^3}{3} & 0 \le x < 1 \\ 1 - \dfrac{1}{2}\left(\dfrac{7}{3} - x\right)\left(\dfrac{7}{4} - \dfrac{3}{4}x\right) & 1 \le x \le \dfrac{7}{3} \\ 1 & x > \dfrac{7}{3} \end{cases}$$

a. Obtain the pdf f(x) and sketch its graph.
b. Compute $P(.5 \le X \le 2)$.
c. Compute E(X).

102. The breakdown voltage of a randomly chosen diode of a certain type is known to be normally distributed with mean value 40 V and standard deviation 1.5 V.
a. What is the probability that the voltage of a single diode is between 39 and 42?
b. What value is such that only 15% of all diodes have voltages exceeding that value?
c. If four diodes are independently selected, what is the probability that at least one has a voltage exceeding 42?

103. The article “Computer Assisted Net Weight Control” (Quality Progress, 1983: 22–25) suggests a normal distribution with mean 137.2 oz and standard deviation 1.6 oz for the actual contents of jars of a certain type. The stated contents was 135 oz.
a. What is the probability that a single jar contains more than the stated contents?
b. Among ten randomly selected jars, what is the probability that at least eight contain more than the stated contents?
c. Assuming that the mean remains at 137.2, to what value would the standard deviation have to be changed so that 95% of all jars contain more than the stated contents?

104. When circuit boards used in the manufacture of compact disc players are tested, the long-run percentage of defectives is 5%. Suppose that a batch of 250 boards has been received and that the condition of any particular board is independent of that of any other board.
a. What is the approximate probability that at least 10% of the boards in the batch are defective?
b. What is the approximate probability that there are exactly 10 defectives in the batch?

105. The article “Characterization of Room Temperature Damping in Aluminum-Indium Alloys” (Metallurgical Trans., 1993: 1611–1619) suggests that Al matrix grain size (μm) for an alloy consisting of 2% indium could be modeled with a normal distribution with a mean value 96 and standard deviation 14.
a. What is the probability that grain size exceeds 100?
b. What is the probability that grain size is between 50 and 80?
c. What interval (a, b) includes the central 90% of all grain sizes (so that 5% are below a and 5% are above b)?

106. The reaction time (in seconds) to a certain stimulus is a continuous random variable with pdf

$$f(x) = \begin{cases} \dfrac{3}{2} \cdot \dfrac{1}{x^2} & 1 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$

a. Obtain the cdf.
b. What is the probability that reaction time is at most 2.5 sec? Between 1.5 and 2.5 sec?
c. Compute the expected reaction time.
d. Compute the standard deviation of reaction time.
e. If an individual takes more than 1.5 sec to react, a light comes on and stays on either until one further second has elapsed or until the person reacts (whichever happens first). Determine the expected amount of time that the light remains lit. [Hint: Let h(X) = the time that the light is on as a function of reaction time X.]

107. Let X denote the temperature at which a certain chemical reaction takes place. Suppose that X has pdf

$$f(x) = \begin{cases} \dfrac{1}{9}(4 - x^2) & -1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$

a. Sketch the graph of f(x).
b. Determine the cdf and sketch it.
c. Is 0 the median temperature at which the reaction takes place? If not, is the median temperature smaller or larger than 0?
d. Suppose this reaction is independently carried out once in each of ten different labs and that the pdf of reaction time in each lab is as given. Let Y = the number among the ten labs at which the temperature exceeds 1. What kind of distribution does Y have? (Give the names and values of any parameters.)

108. The article “Determination of the MTF of Positive Photoresists Using the Monte Carlo Method” (Photographic Sci. and Engr., 1983: 254–260) proposes the exponential distribution with parameter $\lambda = .93$ as a model for the distribution of a photon’s free path length (μm) under certain circumstances. Suppose this is the correct model.
a. What is the expected path length, and what is the standard deviation of path length?
b. What is the probability that path length exceeds 3.0? What is the probability that path length is between 1.0 and 3.0?
c. What value is exceeded by only 10% of all path lengths?

109. The article “The Prediction of Corrosion by Statistical Analysis of Corrosion Profiles” (Corrosion Science, 1985: 305–315) suggests the following cdf for the depth X of the deepest pit in an experiment involving the exposure of carbon manganese steel to acidified seawater.

$$F(x; \alpha, \beta) = e^{-e^{-(x - \alpha)/\beta}} \qquad -\infty < x < \infty$$

The authors propose the values $\alpha = 150$ and $\beta = 90$. Assume this to be the correct model.
a. What is the probability that the depth of the deepest pit is at most 150? At most 300? Between 150 and 300?
b. Below what value will the depth of the maximum pit be observed in 90% of all such experiments?
c. What is the density function of X?
d. The density function can be shown to be unimodal (a single peak). Above what value on the measurement axis does this peak occur? (This value is the mode.)

e. It can be shown that $E(X) \approx .5772\beta + \alpha$. What is the mean for the given values of $\alpha$ and $\beta$, and how does it compare to the median and mode? Sketch the graph of the density function. [Note: This is called the largest extreme value distribution.]

110. Let t = the amount of sales tax a retailer owes the government for a certain period. The article “Statistical Sampling in Tax Audits” (Statistics and the Law, 2008: 320–343) proposes modeling the uncertainty in t by regarding it as a normally distributed random variable with mean value $\mu$ and standard deviation $\sigma$ (in the article, these two parameters are estimated from the results of a tax audit involving n sampled transactions). If a represents the amount the retailer is assessed, then an under-assessment results if $t > a$ and an over-assessment results if $a > t$. The proposed penalty (i.e., loss) function for over- or under-assessment is $L(a, t) = t - a$ if $t > a$ and $= k(a - t)$ if $t \le a$ ($k > 1$ is suggested to incorporate the idea that over-assessment is more serious than under-assessment).
a. Show that $a^* = \mu + \sigma\Phi^{-1}(1/(k + 1))$ is the value of a that minimizes the expected loss, where $\Phi^{-1}$ is the inverse function of the standard normal cdf.
b. If k = 2 (suggested in the article), $\mu$ = $100,000, and $\sigma$ = $10,000, what is the optimal value of a, and what is the resulting probability of over-assessment?

111. The mode of a continuous distribution is the value x* that maximizes f(x).
a. What is the mode of a normal distribution with parameters $\mu$ and $\sigma$?
b. Does the uniform distribution with parameters A and B have a single mode? Why or why not?
c. What is the mode of an exponential distribution with parameter $\lambda$? (Draw a picture.)
d. If X has a gamma distribution with parameters $\alpha$ and $\beta$, and $\alpha > 1$, find the mode. [Hint: ln[f(x)] will be maximized iff f(x) is, and it may be simpler to take the derivative of ln[f(x)].]
e. What is the mode of a chi-squared distribution having $\nu$ degrees of freedom?

112. The article “Error Distribution in Navigation” (J. of the Institute of Navigation, 1971: 429–442) suggests that the frequency distribution of positive errors (magnitudes of errors) is well approximated by an exponential distribution. Let X = the lateral position error (nautical miles), which can be either negative or positive. Suppose the pdf of X is

$$f(x) = (.1)e^{-.2|x|} \qquad -\infty < x < \infty$$

a. Sketch a graph of f(x) and verify that f(x) is a legitimate pdf (show that it integrates to 1).
b. Obtain the cdf of X and sketch it.
c. Compute $P(X \le 0)$, $P(X \le 2)$, $P(-1 \le X \le 2)$, and the probability that an error of more than 2 miles is made.

113. In some systems, a customer is allocated to one of two service facilities. If the service time for a customer served by facility i has an exponential distribution with parameter $\lambda_i$ $(i = 1, 2)$ and p is the proportion of all customers served by facility 1, then the pdf of X = the service time of a randomly selected customer is

$$f(x; \lambda_1, \lambda_2, p) = \begin{cases} p\lambda_1 e^{-\lambda_1 x} + (1 - p)\lambda_2 e^{-\lambda_2 x} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

This is often called the hyperexponential or mixed exponential distribution. This distribution is also proposed as a model for rainfall amount in “Modeling Monsoon Affected Rainfall of Pakistan by Point Processes” (J. of Water Resources Planning and Mgmnt., 1992: 671–688).
a. Verify that $f(x; \lambda_1, \lambda_2, p)$ is indeed a pdf.
b. What is the cdf $F(x; \lambda_1, \lambda_2, p)$?
c. If X has $f(x; \lambda_1, \lambda_2, p)$ as its pdf, what is E(X)?
d. Using the fact that $E(X^2) = 2/\lambda^2$ when X has an exponential distribution with parameter $\lambda$, compute $E(X^2)$ when X has pdf $f(x; \lambda_1, \lambda_2, p)$. Then compute V(X).
e. The coefficient of variation of a random variable (or distribution) is $CV = \sigma/\mu$. What is CV for an exponential rv? What can you say about the value of CV when X has a hyperexponential distribution?
f. What is CV for an Erlang distribution with parameters $\lambda$ and n as defined in Exercise 68? [Note: In applied work, the sample CV is used to decide which of the three distributions might be appropriate.]

114. Suppose a particular state allows individuals filing tax returns to itemize deductions only if the total of all itemized deductions is at least $5000. Let X (in 1000s of dollars) be the total of itemized deductions on a randomly chosen form. Assume that X has the pdf

$$f(x; \alpha) = \begin{cases} k/x^{\alpha} & x \ge 5 \\ 0 & \text{otherwise} \end{cases}$$

a. Find the value of k. What restriction on $\alpha$ is necessary?
b. What is the cdf of X?
c. What is the expected total deduction on a randomly chosen form? What restriction on $\alpha$ is necessary for E(X) to be finite?
d. Show that ln(X/5) has an exponential distribution with parameter $\alpha - 1$.

115. Let $I_i$ be the input current to a transistor and $I_0$ be the output current. Then the current gain is proportional to $\ln(I_0/I_i)$. Suppose the constant of proportionality is 1 (which amounts to choosing a particular unit of measurement), so that current gain $= X = \ln(I_0/I_i)$. Assume X is normally distributed with $\mu = 1$ and $\sigma = .05$.
a. What type of distribution does the ratio $I_0/I_i$ have?
b. What is the probability that the output current is more than twice the input current?
c. What are the expected value and variance of the ratio of output to input current?

116. The article “Response of SiCf/Si3N4 Composites Under Static and Cyclic Loading—An Experimental and Statistical Analysis” (J. of Engr. Materials and Technology, 1997: 186–193) suggests that tensile strength (MPa) of composites under specified conditions can be modeled by a Weibull distribution with $\alpha = 9$ and $\beta = 180$.
a. Sketch a graph of the density function.
b. What is the probability that the strength of a randomly selected specimen will exceed 175? Will be between 150 and 175?
c. If two randomly selected specimens are chosen and their strengths are independent of one another, what is the probability that at least one has a strength between 150 and 175?
d. What strength value separates the weakest 10% of all specimens from the remaining 90%?

117. Let Z have a standard normal distribution and define a new rv Y by $Y = \sigma Z + \mu$. Show that Y has a normal distribution with parameters $\mu$ and $\sigma$. [Hint: $Y \le y$ iff $Z \le$ ? Use this to find the cdf of Y and then differentiate it with respect to y.]

118. a. Suppose the lifetime X of a component, when measured in hours, has a gamma distribution with parameters $\alpha$ and $\beta$. Let Y = the lifetime measured in minutes. Derive the pdf of Y. [Hint: $Y \le y$ iff $X \le y/60$. Use this to obtain the cdf of Y and then differentiate to obtain the pdf.]
b. If X has a gamma distribution with parameters $\alpha$ and $\beta$, what is the probability distribution of $Y = cX$?

119. In Exercises 117 and 118, as well as many other situations, one has the pdf f(x) of X and wishes to know the pdf of $y = h(X)$. Assume that $h(\cdot)$ is an invertible function, so that $y = h(x)$ can be solved for x to yield $x = k(y)$. Then it can be shown that the pdf of Y is

$$g(y) = f[k(y)] \cdot |k'(y)|$$

a. If X has a uniform distribution with $A = 0$ and $B = 1$, derive the pdf of $Y = -\ln(X)$.
b. Work Exercise 117, using this result.
c. Work Exercise 118(b), using this result.

120. Based on data from a dart-throwing experiment, the article “Shooting Darts” (Chance, Summer 1997, 16–19) proposed that the horizontal and vertical errors from aiming at a point target should be independent of one another, each with a normal distribution having mean 0 and variance $\sigma^2$. It can then be shown that the pdf of the distance V from the target to the landing point is

$$f(v) = \frac{v}{\sigma^2} \cdot e^{-v^2/2\sigma^2} \qquad v > 0$$

a. This pdf is a member of what family introduced in this chapter?
b. If $\sigma = 20$ mm (close to the value suggested in the paper), what is the probability that a dart will land within 25 mm (roughly 1 in.) of the target?

121. The article “Three Sisters Give Birth on the Same Day” (Chance, Spring 2001, 23–25) used the fact that three Utah sisters had all given birth on March 11, 1998 as a basis for posing some interesting questions regarding birth coincidences.
a. Disregarding leap year and assuming that the other 365 days are equally likely, what is the probability that three randomly selected births all occur on March 11? Be sure to indicate what, if any, extra assumptions you are making.
b. With the assumptions used in part (a), what is the probability that three randomly selected births all occur on the same day?
c. The author suggested that, based on extensive data, the length of gestation (time between conception and birth) could be modeled as having a normal distribution with mean value 280 days and standard deviation 19.88 days. The due dates for the three Utah sisters were March 15, April 1, and April 4, respectively. Assuming that all three due dates are at the mean of the distribution, what is the probability that all births occurred on March 11? [Hint: The deviation of birth date from due date is normally distributed with mean 0.]
d. Explain how you would use the information in part (c) to calculate the probability of a common birth date.

122. Let X denote the lifetime of a component, with f(x) and F(x) the pdf and cdf of X. The probability that the component fails in the interval $(x, x + \Delta x)$ is approximately $f(x) \cdot \Delta x$. The conditional probability that it fails in $(x, x + \Delta x)$ given that it has lasted at least x is $f(x) \cdot \Delta x/[1 - F(x)]$. Dividing this by $\Delta x$ produces the failure rate function:

$$r(x) = \frac{f(x)}{1 - F(x)}$$

An increasing failure rate function indicates that older components are increasingly likely to wear out, whereas a decreasing failure rate is evidence of increasing reliability with age. In practice, a “bathtub-shaped” failure is often assumed.
a. If X is exponentially distributed, what is r(x)?
b. If X has a Weibull distribution with parameters $\alpha$ and $\beta$, what is r(x)? For what parameter values will r(x) be increasing? For what parameter values will r(x) decrease with x?
c. Since $r(x) = -(d/dx)\ln[1 - F(x)]$, $\ln[1 - F(x)] = -\int r(x)\,dx$. Suppose

$$r(x) = \begin{cases} \alpha\left(1 - \dfrac{x}{\beta}\right) & 0 \le x \le \beta \\ 0 & \text{otherwise} \end{cases}$$

so that if a component lasts $\beta$ hours, it will last forever (while seemingly unreasonable, this model can be used to study just “initial wearout”). What are the cdf and pdf of X?

123. Let U have a uniform distribution on the interval [0, 1]. Then observed values having this distribution can be obtained from a computer’s random number generator. Let $X = -(1/\lambda)\ln(1 - U)$.

a. Show that X has an exponential distribution with parameter $\lambda$. [Hint: The cdf of X is $F(x) = P(X \le x)$; $X \le x$ is equivalent to $U \le$ ?]
b. How would you use part (a) and a random number generator to obtain observed values from an exponential distribution with parameter $\lambda = 10$?

124. Consider an rv X with mean $\mu$ and standard deviation $\sigma$, and let g(X) be a specified function of X. The first-order Taylor series approximation to g(X) in the neighborhood of $\mu$ is

$$g(X) \approx g(\mu) + g'(\mu) \cdot (X - \mu)$$

The right-hand side of this equation is a linear function of X. If the distribution of X is concentrated in an interval over which $g(\cdot)$ is approximately linear [e.g., $\sqrt{x}$ is approximately linear in (1, 2)], then the equation yields approximations to E(g(X)) and V(g(X)).
a. Give expressions for these approximations. [Hint: Use rules of expected value and variance for a linear function $aX + b$.]
b. If the voltage v across a medium is fixed but current I is random, then resistance will also be a random variable related to I by $R = v/I$. If $\mu_I = 20$ and $\sigma_I = .5$, calculate approximations to $\mu_R$ and $\sigma_R$.

125. A function g(x) is convex if the chord connecting any two points on the function’s graph lies above the graph. When g(x) is differentiable, an equivalent condition is that for every x, the tangent line at x lies entirely on or below the graph. (See the figure below.) How does $g(\mu) = g(E(X))$ compare to E(g(X))? [Hint: The equation of the tangent line at $x = \mu$ is $y = g(\mu) + g'(\mu) \cdot (x - \mu)$. Use the condition of convexity, substitute X for x, and take expected values. Note: Unless g(x) is linear, the resulting inequality (usually called Jensen’s inequality) is strict ($<$ rather than $\le$); it is valid for both continuous and discrete rv’s.]

[Figure for Exercise 125: graph of a convex function with its tangent line at x]

126. Let X have a Weibull distribution with parameters $\alpha = 2$ and $\beta$. Show that $Y = 2X^2/\beta^2$ has a chi-squared distribution with $\nu = 2$. [Hint: The cdf of Y is $P(Y \le y)$; express this probability in the form $P(X \le g(y))$, use the fact that X has a cdf of the form in Expression (4.12), and differentiate with respect to y to obtain the pdf of Y.]

127. An individual’s credit score is a number calculated based on that person’s credit history that helps a lender determine how much he/she should be loaned or what credit limit should be established for a credit card. An article in the Los Angeles Times gave data which suggested that a beta distribution with parameters $A = 150$, $B = 850$, $\alpha = 8$, $\beta = 2$ would provide a reasonable approximation to the distribution of American credit scores. [Note: credit scores are integer-valued.]

a. Let X represent a randomly selected American credit score. What are the mean value and standard deviation of this random variable? What is the probability that X is within 1 standard deviation of its mean value?

b. What is the approximate probability that a randomly selected score will exceed 750 (which lenders consider a very good score)?

128. Let V denote rainfall volume and W denote runoff volume (both in mm). According to the article “Runoff Quality Analysis of Urban Catchments with Analytical Probability Models” (J. of Water Resource Planning and Management, 2006: 4–14), the runoff volume will be 0 if $V \le \nu_d$ and will be $k(V - \nu_d)$ if $V > \nu_d$. Here $\nu_d$ is the volume of depression storage (a constant), and k (also a constant) is the runoff coefficient. The cited article proposes an exponential distribution with parameter $\lambda$ for V.

a. Obtain an expression for the cdf of W. [Note: W is neither purely continuous nor purely discrete; instead it has a “mixed” distribution with a discrete component at 0 and is continuous for values $w > 0$.]

b. What is the pdf of W for $w > 0$? Use this to obtain an expression for the expected value of runoff volume.

Bibliography

Bury, Karl, Statistical Distributions in Engineering, Cambridge Univ. Press, Cambridge, England, 1999. A readable and informative survey of distributions and their properties.

Johnson, Norman, Samuel Kotz, and N. Balakrishnan, Continuous Univariate Distributions, vols. 1–2, Wiley, New York, 1994. These two volumes together present an exhaustive survey of various continuous distributions.

Nelson, Wayne, Applied Life Data Analysis, Wiley, New York, 1982. Gives a comprehensive discussion of distributions and methods that are used in the analysis of lifetime data.

Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability Models and Applications (2nd ed.), Macmillan, New York, 1994. Good coverage of general properties and specific distributions.


5 Joint Probability Distributions and Random Samples

INTRODUCTION

In Chapters 3 and 4, we developed probability models for a single random variable. Many problems in probability and statistics involve several random variables simultaneously. In this chapter, we first discuss probability models for the joint (i.e., simultaneous) behavior of several random variables, putting special emphasis on the case in which the variables are independent of one another. We then study expected values of functions of several random variables, including covariance and correlation as measures of the degree of association between two variables.

The last three sections of the chapter consider functions of n random variables $X_1, X_2, \ldots, X_n$, focusing especially on their average $(X_1 + \cdots + X_n)/n$. We call any such function, itself a random variable, a statistic. Methods from probability are used to obtain information about the distribution of a statistic. The premier result of this type is the Central Limit Theorem (CLT), the basis for many inferential procedures involving large sample sizes.


5.1 Jointly Distributed Random Variables

There are many experimental situations in which more than one random variable (rv) will be of interest to an investigator. We first consider joint probability distributions for two discrete rv’s, then for two continuous variables, and finally for more than two variables.

Two Discrete Random Variables

The probability mass function (pmf) of a single discrete rv X specifies how much probability mass is placed on each possible X value. The joint pmf of two discrete rv’s X and Y describes how much probability mass is placed on each possible pair of values (x, y).

DEFINITION

Let X and Y be two discrete rv’s defined on the sample space $\mathcal{S}$ of an experiment. The joint probability mass function p(x, y) is defined for each pair of numbers (x, y) by

$$p(x, y) = P(X = x \text{ and } Y = y)$$

It must be the case that $p(x, y) \ge 0$ and $\sum_x \sum_y p(x, y) = 1$.

Now let A be any set consisting of pairs of (x, y) values (e.g., $A = \{(x, y): x + y = 5\}$ or $\{(x, y): \max(x, y) \le 3\}$). Then the probability $P[(X, Y) \in A]$ is obtained by summing the joint pmf over pairs in A:

$$P[(X, Y) \in A] = \sum_{(x, y) \in A} p(x, y)$$

Example 5.1

A large insurance agency services a number of customers who have purchased both a homeowner’s policy and an automobile policy from the agency. For each type of policy, a deductible amount must be specified. For an automobile policy, the choices are $100 and $250, whereas for a homeowner’s policy, the choices are 0, $100, and $200. Suppose an individual with both types of policy is selected at random from the agency’s files. Let X = the deductible amount on the auto policy and Y = the deductible amount on the homeowner’s policy. Possible (X, Y) pairs are then (100, 0), (100, 100), (100, 200), (250, 0), (250, 100), and (250, 200); the joint pmf specifies the probability associated with each one of these pairs, with any other pair having probability zero. Suppose the joint pmf is given in the accompanying joint probability table:

                    y
    p(x, y)     0     100    200
    x   100    .20    .10    .20
        250    .05    .15    .30

Then p(100, 100) = P(X = 100 and Y = 100) = P($100 deductible on both policies) = .10. The probability $P(Y \ge 100)$ is computed by summing probabilities of all (x, y) pairs for which $y \ge 100$:

$$P(Y \ge 100) = p(100, 100) + p(250, 100) + p(100, 200) + p(250, 200) = .75$$ ■
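The summation rule used in Example 5.1 can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the helper name `prob` are choices made here, not notation from the text. The table entries are those of the example.

```python
# Joint pmf of Example 5.1, keyed by (x, y) = (auto deductible, homeowner's deductible).
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def prob(event, pmf=joint_pmf):
    """P[(X, Y) in A]: sum p(x, y) over the pairs for which event(x, y) is true."""
    return sum(p for (x, y), p in pmf.items() if event(x, y))

total = prob(lambda x, y: True)                        # legitimacy check: should be 1
p_both_100 = prob(lambda x, y: x == 100 and y == 100)  # p(100, 100)
p_y_ge_100 = prob(lambda x, y: y >= 100)               # P(Y >= 100)
print(round(total, 10), round(p_both_100, 10), round(p_y_ge_100, 10))
```

Describing the set A by a predicate keeps the code close to the definition: any event about (X, Y) is handled by the same sum.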


Once the joint pmf of the two variables X and Y is available, it is in principle straightforward to obtain the distribution of just one of these variables. As an example, let X and Y be the number of statistics and mathematics courses, respectively, currently being taken by a randomly selected statistics major. Suppose that we wish the distribution of X, and that when X = 2, the only possible values of Y are 0, 1, and 2. Then

$$p_X(2) = P(X = 2) = P[(X, Y) = (2, 0) \text{ or } (2, 1) \text{ or } (2, 2)] = p(2, 0) + p(2, 1) + p(2, 2)$$

That is, the joint pmf is summed over all pairs of the form (2, y). More generally, for any possible value x of X, the probability $p_X(x)$ results from holding x fixed and summing the joint pmf p(x, y) over all y for which the pair (x, y) has positive probability mass. The same strategy applies to obtaining the distribution of Y by itself.

DEFINITION

The marginal probability mass function of X, denoted by $p_X(x)$, is given by

$$p_X(x) = \sum_{y:\ p(x, y) > 0} p(x, y) \quad \text{for each possible value } x$$

Similarly, the marginal probability mass function of Y is

$$p_Y(y) = \sum_{x:\ p(x, y) > 0} p(x, y) \quad \text{for each possible value } y$$

The use of the word marginal here is a consequence of the fact that if the joint pmf is displayed in a rectangular table as in Example 5.1, then the row totals give the marginal pmf of X and the column totals give the marginal pmf of Y. Once these marginal pmf’s are available, the probability of any event involving only X or only Y can be calculated.

Example 5.2 (Example 5.1 continued)

The possible X values are x = 100 and x = 250, so computing row totals in the joint probability table yields

$$p_X(100) = p(100, 0) + p(100, 100) + p(100, 200) = .50$$

and

$$p_X(250) = p(250, 0) + p(250, 100) + p(250, 200) = .50$$

The marginal pmf of X is then

$$p_X(x) = \begin{cases} .5 & x = 100, 250 \\ 0 & \text{otherwise} \end{cases}$$

Similarly, the marginal pmf of Y is obtained from column totals as

$$p_Y(y) = \begin{cases} .25 & y = 0, 100 \\ .50 & y = 200 \\ 0 & \text{otherwise} \end{cases}$$

so $P(Y \ge 100) = p_Y(100) + p_Y(200) = .75$ as before. ■

Two Continuous Random Variables

The probability that the observed value of a continuous rv X lies in a one-dimensional set A (such as an interval) is obtained by integrating the pdf f(x) over the set A. Similarly, the probability that the pair (X, Y) of continuous rv’s falls in a two-dimensional set A (such as a rectangle) is obtained by integrating a function called the joint density function.

DEFINITION

Let X and Y be continuous rv’s. A joint probability density function f(x, y) for these two variables is a function satisfying $f(x, y) \ge 0$ and

$$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy = 1$$

Then for any two-dimensional set A

$$P[(X, Y) \in A] = \iint_A f(x, y)\,dx\,dy$$

In particular, if A is the two-dimensional rectangle $\{(x, y): a \le x \le b,\ c \le y \le d\}$, then

$$P[(X, Y) \in A] = P(a \le X \le b,\ c \le Y \le d) = \int_a^b \int_c^d f(x, y)\,dy\,dx$$

We can think of f(x, y) as specifying a surface at height f(x, y) above the point (x, y) in a three-dimensional coordinate system. Then $P[(X, Y) \in A]$ is the volume underneath this surface and above the region A, analogous to the area under a curve in the case of a single rv. This is illustrated in Figure 5.1.

[Figure 5.1: $P[(X, Y) \in A]$ = volume under density surface above A; the sketch shows the surface f(x, y) over the shaded rectangle A]

Example 5.3

A bank operates both a drive-up facility and a walk-up window. On a randomly selected day, let X = the proportion of time that the drive-up facility is in use (at least one customer is being served or waiting to be served) and Y = the proportion of time that the walk-up window is in use. Then the set of possible values for (X, Y) is the rectangle $D = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1\}$. Suppose the joint pdf of (X, Y) is given by

$$f(x, y) = \begin{cases} \dfrac{6}{5}(x + y^2) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

To verify that this is a legitimate pdf, note that $f(x, y) \ge 0$ and

$$\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy &= \int_0^1 \int_0^1 \frac{6}{5}(x + y^2)\,dx\,dy \\
&= \int_0^1 \int_0^1 \frac{6}{5}x\,dx\,dy + \int_0^1 \int_0^1 \frac{6}{5}y^2\,dx\,dy \\
&= \int_0^1 \frac{6}{5}x\,dx + \int_0^1 \frac{6}{5}y^2\,dy = \frac{6}{10} + \frac{6}{15} = 1
\end{aligned}$$

The probability that neither facility is busy more than one-quarter of the time is

$$\begin{aligned}
P\left(0 \le X \le \tfrac{1}{4},\ 0 \le Y \le \tfrac{1}{4}\right) &= \int_0^{1/4} \int_0^{1/4} \frac{6}{5}(x + y^2)\,dx\,dy \\
&= \frac{6}{5} \int_0^{1/4} \int_0^{1/4} x\,dx\,dy + \frac{6}{5} \int_0^{1/4} \int_0^{1/4} y^2\,dx\,dy \\
&= \frac{6}{20} \cdot \frac{x^2}{2}\Big|_{x=0}^{x=1/4} + \frac{6}{20} \cdot \frac{y^3}{3}\Big|_{y=0}^{y=1/4} = \frac{7}{640} = .0109
\end{aligned}$$ ■

The marginal pdf of each variable can be obtained in a manner analogous to what we did in the case of two discrete variables. The marginal pdf of X at the value x results from holding x fixed in the pair (x, y) and integrating the joint pdf over y. Integrating the joint pdf with respect to x gives the marginal pdf of Y.
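As a numerical cross-check of Example 5.3 and of the "hold x fixed, integrate over y" recipe just described, the following sketch approximates the integrals with a simple midpoint rule. The helper names (`integrate`, `rect_prob`, `marginal_x`) and the grid size are assumptions made for this illustration; a numerical library could be used instead.

```python
def f(x, y):
    """Joint pdf of Example 5.3: (6/5)(x + y^2) on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return 1.2 * (x + y * y)
    return 0.0

def integrate(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def rect_prob(a, b, c, d):
    """P(a <= X <= b, c <= Y <= d) as an iterated integral: inner over y, outer over x."""
    return integrate(lambda x: integrate(lambda y: f(x, y), c, d), a, b)

def marginal_x(x):
    """f_X(x): hold x fixed and integrate the joint pdf over y in [0, 1]."""
    return integrate(lambda y: f(x, y), 0.0, 1.0)

total = rect_prob(0.0, 1.0, 0.0, 1.0)        # should be close to 1
p_quarter = rect_prob(0.0, 0.25, 0.0, 0.25)  # should be close to 7/640
fx_half = marginal_x(0.5)                    # integral of 1.2(0.5 + y^2) over [0, 1]
print(total, p_quarter, fx_half)
```

The midpoint rule is exact for the linear term in x and only slightly off for the $y^2$ term, so the approximations agree with the hand calculations to several decimal places.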

DEFINITION

The marginal probability density functions of X and Y, denoted by $f_X(x)$ and $f_Y(y)$, respectively, are given by

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy \quad \text{for } -\infty < x < \infty$$

$$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx \quad \text{for } -\infty < y < \infty$$

Example 5.4 (Example 5.3 continued)

The marginal pdf of X, which gives the probability distribution of busy time for the drive-up facility without reference to the walk-up window, is

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \int_0^1 \frac{6}{5}(x + y^2)\,dy = \frac{6}{5}x + \frac{2}{5}$$

for $0 \le x \le 1$ and 0 otherwise. The marginal pdf of Y is

$$f_Y(y) = \begin{cases} \dfrac{6}{5}y^2 + \dfrac{3}{5} & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

Then

$$P\left(\tfrac{1}{4} \le Y \le \tfrac{3}{4}\right) = \int_{1/4}^{3/4} f_Y(y)\,dy = \frac{37}{80} = .4625$$

In Example 5.3, the region of positive joint density was a rectangle, which made computation of the marginal pdf’s relatively easy. Consider now an example in which the region of positive density is more complicated.

Example 5.5

A nut company markets cans of deluxe mixed nuts containing almonds, cashews, and peanuts. Suppose the net weight of each can is exactly 1 lb, but the weight contribution of each type of nut is random. Because the three weights sum to 1, a joint probability model for any two gives all necessary information about the weight of the third type. Let X = the weight of almonds in a selected can and Y = the weight of cashews. Then the region of positive density is $D = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1\}$, the shaded region pictured in Figure 5.2.


[Figure 5.2: Region of positive density for Example 5.5; the shaded region is bounded by the axes and the line through (0, 1) and (1, 0), and for fixed x the vertical line runs up to the point (x, 1 − x)]

Now let the joint pdf for (X, Y) be

$$f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

For any fixed x, f(x, y) increases with y; for fixed y, f(x, y) increases with x. This is appropriate because the word deluxe implies that most of the can should consist of almonds and cashews rather than peanuts, so that the density function should be large near the upper boundary and small near the origin. The surface determined by f(x, y) slopes upward from zero as (x, y) moves away from either axis.

Clearly, $f(x, y) \ge 0$. To verify the second condition on a joint pdf, recall that a double integral is computed as an iterated integral by holding one variable fixed (such as x as in Figure 5.2), integrating over values of the other variable lying along the straight line passing through the value of the fixed variable, and finally integrating over all possible values of the fixed variable. Thus

$$\begin{aligned}
\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dy\,dx = \iint_D f(x, y)\,dy\,dx &= \int_0^1 \left\{ \int_0^{1-x} 24xy\,dy \right\} dx \\
&= \int_0^1 24x \left\{ \frac{y^2}{2}\Big|_{y=0}^{y=1-x} \right\} dx = \int_0^1 12x(1 - x)^2\,dx = 1
\end{aligned}$$

To compute the probability that the two types of nuts together make up at most 50% of the can, let $A = \{(x, y): 0 \le x \le 1,\ 0 \le y \le 1,\ \text{and } x + y \le .5\}$, as shown in Figure 5.3. Then

$$P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy = \int_0^{.5} \int_0^{.5-x} 24xy\,dy\,dx = .0625$$

[Figure 5.3: Computing $P[(X, Y) \in A]$ for Example 5.5; A is the shaded region below the line y = .5 − x (x + y = .5), inside the region bounded by x + y = 1]
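The iterated-integral recipe for a non-rectangular region can also be checked numerically. The sketch below is an illustration only: the function names and the midpoint-rule grid are assumptions of this example. For each fixed x the inner integral runs over y from 0 to c − x, exactly as in the limits above.

```python
def f(x, y):
    """Joint pdf of Example 5.5: 24xy on the triangle x >= 0, y >= 0, x + y <= 1."""
    if x >= 0.0 and y >= 0.0 and x + y <= 1.0:
        return 24.0 * x * y
    return 0.0

def integrate(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def prob_sum_at_most(c):
    """P(X + Y <= c) for 0 <= c <= 1: outer integral over x in [0, c], inner over y in [0, c - x]."""
    return integrate(lambda x: integrate(lambda y: f(x, y), 0.0, c - x), 0.0, c)

total = prob_sum_at_most(1.0)   # whole region D: should be close to 1
p_half = prob_sum_at_most(0.5)  # region A: should be close to .0625
print(total, p_half)
```

The only change from a rectangular region is that the inner limit depends on the outer variable; the rest of the computation is unchanged.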


The marginal pdf for almonds is obtained by holding X fixed at x and integrating the joint pdf f(x, y) along the vertical line through x:

$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy = \begin{cases} \displaystyle\int_0^{1-x} 24xy\,dy = 12x(1 - x)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

By symmetry of f(x, y) and the region D, the marginal pdf of Y is obtained by replacing x and X in $f_X(x)$ by y and Y, respectively. ■

Independent Random Variables

In many situations, information about the observed value of one of the two variables X and Y gives information about the value of the other variable. In Example 5.1, the marginal probability of X at x = 250 was .5, as was the probability that X = 100. If, however, we are told that the selected individual had Y = 0, then X = 100 is four times as likely as X = 250. Thus there is a dependence between the two variables.

In Chapter 2, we pointed out that one way of defining independence of two events is via the condition $P(A \cap B) = P(A) \cdot P(B)$. Here is an analogous definition for the independence of two rv’s.

DEFINITION

Two random variables X and Y are said to be independent if for every pair of x and y values

$$\begin{aligned}
p(x, y) &= p_X(x) \cdot p_Y(y) && \text{when X and Y are discrete} \\
&\text{or} \\
f(x, y) &= f_X(x) \cdot f_Y(y) && \text{when X and Y are continuous}
\end{aligned} \tag{5.1}$$

If (5.1) is not satisfied for all (x, y), then X and Y are said to be dependent.

The definition says that two variables are independent if their joint pmf or pdf is the product of the two marginal pmf’s or pdf’s. Intuitively, independence says that knowing the value of one of the variables does not provide additional information about what the value of the other variable might be.

Example 5.6

In the insurance situation of Examples 5.1 and 5.2,

$$p(100, 100) = .10 \ne (.5)(.25) = p_X(100) \cdot p_Y(100)$$

so X and Y are not independent. Independence of X and Y requires that every entry in the joint probability table be the product of the corresponding row and column marginal probabilities. ■
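The entry-by-entry check described in Example 5.6 can be sketched as follows. The function names and the tolerance are assumptions made for this illustration; `product_pmf` is a hypothetical table, built here only to show a case in which condition (5.1) does hold.

```python
from itertools import product

# Joint pmf of Example 5.1, keyed by (x, y).
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def marginals(pmf):
    """Return (p_X, p_Y): row totals and column totals of the joint table."""
    p_x, p_y = {}, {}
    for (x, y), p in pmf.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def is_independent(pmf, tol=1e-9):
    """True only if p(x, y) = p_X(x) * p_Y(y) for every pair of x and y values."""
    p_x, p_y = marginals(pmf)
    return all(abs(pmf.get((x, y), 0.0) - p_x[x] * p_y[y]) <= tol
               for x, y in product(p_x, p_y))

p_x, p_y = marginals(joint_pmf)
insurance_independent = is_independent(joint_pmf)

# Hypothetical table with the same marginals whose entries are row total * column total.
product_pmf = {(x, y): p_x[x] * p_y[y] for x, y in product(p_x, p_y)}
product_independent = is_independent(product_pmf)
print(p_x, p_y, insurance_independent, product_independent)
```

A single failing pair, such as (100, 100) in the insurance table, is enough to establish dependence; independence requires every pair to pass.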

Example 5.7 (Example 5.5 continued)

Because f(x, y) has the form of a product, X and Y would appear to be independent. However, although $f_X(\tfrac{3}{4}) = f_Y(\tfrac{3}{4}) = \tfrac{9}{16}$, $f(\tfrac{3}{4}, \tfrac{3}{4}) = 0 \ne \tfrac{9}{16} \cdot \tfrac{9}{16}$, so the variables are not in fact independent. To be independent, f(x, y) must have the form $g(x) \cdot h(y)$ and the region of positive density must be a rectangle whose sides are parallel to the coordinate axes. ■

Independence of two random variables is most useful when the description of the experiment under study suggests that X and Y have no effect on one another.


Then once the marginal pmf’s or pdf’s have been specified, the joint pmf or pdf is simply the product of the two marginal functions. It follows that

$$P(a \le X \le b,\ c \le Y \le d) = P(a \le X \le b) \cdot P(c \le Y \le d)$$

Example 5.8

Suppose that the lifetimes of two components are independent of one another and that the first lifetime, $X_1$, has an exponential distribution with parameter $\lambda_1$, whereas the second, $X_2$, has an exponential distribution with parameter $\lambda_2$. Then the joint pdf is

$$f(x_1, x_2) = f_{X_1}(x_1) \cdot f_{X_2}(x_2) = \begin{cases} \lambda_1 e^{-\lambda_1 x_1} \cdot \lambda_2 e^{-\lambda_2 x_2} = \lambda_1 \lambda_2 e^{-\lambda_1 x_1 - \lambda_2 x_2} & x_1 > 0,\ x_2 > 0 \\ 0 & \text{otherwise} \end{cases}$$

Let $\lambda_1 = 1/1000$ and $\lambda_2 = 1/1200$, so that the expected lifetimes are 1000 hours and 1200 hours, respectively. The probability that both component lifetimes are at least 1500 hours is

$$\begin{aligned}
P(1500 \le X_1,\ 1500 \le X_2) &= P(1500 \le X_1) \cdot P(1500 \le X_2) \\
&= e^{-\lambda_1(1500)} \cdot e^{-\lambda_2(1500)} = (.2231)(.2865) = .0639
\end{aligned}$$ ■

More Than Two Random Variables

To model the joint behavior of more than two random variables, we extend the concept of a joint distribution of two variables.

DEFINITION

If $X_1, X_2, \ldots, X_n$ are all discrete random variables, the joint pmf of the variables is the function

$$p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$$

If the variables are continuous, the joint pdf of $X_1, \ldots, X_n$ is the function $f(x_1, x_2, \ldots, x_n)$ such that for any n intervals $[a_1, b_1], \ldots, [a_n, b_n]$,

$$P(a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \ldots, x_n)\,dx_n \cdots dx_1$$

In a binomial experiment, each trial could result in one of only two possible outcomes. Consider now an experiment consisting of n independent and identical trials, in which each trial can result in any one of r possible outcomes. Let $p_i$ = P(outcome i on any particular trial), and define random variables by $X_i$ = the number of trials resulting in outcome i (i = 1, . . . , r). Such an experiment is called a multinomial experiment, and the joint pmf of $X_1, \ldots, X_r$ is called the multinomial distribution. By using a counting argument analogous to the one used in deriving the binomial distribution, the joint pmf of $X_1, \ldots, X_r$ can be shown to be

$$p(x_1, \ldots, x_r) = \begin{cases} \dfrac{n!}{(x_1!)(x_2!) \cdots (x_r!)}\, p_1^{x_1} \cdots p_r^{x_r} & x_i = 0, 1, 2, \ldots, \text{ with } x_1 + \cdots + x_r = n \\ 0 & \text{otherwise} \end{cases}$$

The case r = 2 gives the binomial distribution, with $X_1$ = number of successes and $X_2 = n - X_1$ = number of failures.
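A direct transcription of the multinomial pmf into Python might look like the sketch below. The function name `multinomial_pmf` is an assumption of this illustration, and n is taken implicitly as the sum of the supplied counts rather than passed separately. The sketch evaluates the pmf at the counts and probabilities used in Example 5.9 and checks the r = 2 reduction to the binomial pmf for an arbitrarily chosen case.

```python
from math import comb, factorial, prod

def multinomial_pmf(counts, probs):
    """n!/(x1! ... xr!) * p1^x1 ... pr^xr, with n taken to be x1 + ... + xr."""
    if len(counts) != len(probs) or any(x < 0 for x in counts):
        return 0.0
    coef = factorial(sum(counts))
    for x in counts:
        coef //= factorial(x)          # exact integer division at every step
    return coef * prod(p ** x for p, x in zip(probs, counts))

# Counts (2, 5, 3) with probabilities (.25, .5, .25), as in Example 5.9.
p_example = multinomial_pmf((2, 5, 3), (0.25, 0.5, 0.25))

# r = 2: the multinomial pmf should agree with the binomial pmf b(3; 10, .3).
p_two_outcomes = multinomial_pmf((3, 7), (0.3, 0.7))
p_binomial = comb(10, 3) * 0.3 ** 3 * 0.7 ** 7

# The pmf should sum to 1 over all (x1, x2, x3) with x1 + x2 + x3 = 10.
total = sum(multinomial_pmf((a, b, 10 - a - b), (0.25, 0.5, 0.25))
            for a in range(11) for b in range(11 - a))
print(p_example, p_two_outcomes, p_binomial, total)
```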


Example 5.9

If the allele of each of ten independently obtained pea sections is determined and $p_1$ = P(AA), $p_2$ = P(Aa), $p_3$ = P(aa), $X_1$ = number of AA’s, $X_2$ = number of Aa’s, and $X_3$ = number of aa’s, then the multinomial pmf for these $X_i$’s is

$$p(x_1, x_2, x_3) = \frac{10!}{(x_1!)(x_2!)(x_3!)}\, p_1^{x_1} p_2^{x_2} p_3^{x_3} \qquad x_i = 0, 1, \ldots \text{ and } x_1 + x_2 + x_3 = 10$$

With $p_1 = p_3 = .25$, $p_2 = .5$,

$$P(X_1 = 2, X_2 = 5, X_3 = 3) = p(2, 5, 3) = \frac{10!}{2!\,5!\,3!}(.25)^2(.5)^5(.25)^3 = .0769$$

Example 5.10

When a certain method is used to collect a fixed volume of rock samples in a region, there are four resulting rock types. Let $X_1$, $X_2$, and $X_3$ denote the proportion by volume of rock types 1, 2, and 3 in a randomly selected sample (the proportion of rock type 4 is $1 - X_1 - X_2 - X_3$, so a variable $X_4$ would be redundant). If the joint pdf of $X_1$, $X_2$, $X_3$ is

$$f(x_1, x_2, x_3) = \begin{cases} k x_1 x_2 (1 - x_3) & 0 \le x_1 \le 1,\ 0 \le x_2 \le 1,\ 0 \le x_3 \le 1,\ x_1 + x_2 + x_3 \le 1 \\ 0 & \text{otherwise} \end{cases}$$

then k is determined by

$$\begin{aligned}
1 &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x_1, x_2, x_3)\,dx_3\,dx_2\,dx_1 \\
&= \int_0^1 \left\{ \int_0^{1-x_1} \left[ \int_0^{1-x_1-x_2} k x_1 x_2 (1 - x_3)\,dx_3 \right] dx_2 \right\} dx_1
\end{aligned}$$

This iterated integral has value k/144, so k = 144. The probability that rocks of types 1 and 2 together account for at most 50% of the sample is

$$\begin{aligned}
P(X_1 + X_2 \le .5) &= \iiint_{\left\{\substack{0 \le x_i \le 1 \text{ for } i = 1, 2, 3 \\ x_1 + x_2 + x_3 \le 1,\ x_1 + x_2 \le .5}\right\}} f(x_1, x_2, x_3)\,dx_3\,dx_2\,dx_1 \\
&= \int_0^{.5} \left\{ \int_0^{.5-x_1} \left[ \int_0^{1-x_1-x_2} 144 x_1 x_2 (1 - x_3)\,dx_3 \right] dx_2 \right\} dx_1 \\
&= \frac{5}{32} \approx .1563
\end{aligned}$$ ■

The notion of independence of more than two random variables is similar to the notion of independence of more than two events.

DEFINITION

The random variables $X_1, X_2, \ldots, X_n$ are said to be independent if for every subset $X_{i_1}, X_{i_2}, \ldots, X_{i_k}$ of the variables (each pair, each triple, and so on), the joint pmf or pdf of the subset is equal to the product of the marginal pmf’s or pdf’s.

Thus if the variables are independent with n = 4, then the joint pmf or pdf of any two variables is the product of the two marginals, and similarly for any three variables and all four variables together. Most importantly, once we are told that n variables are independent, then the joint pmf or pdf is the product of the n marginals.


Example 5.11

If $X_1, \ldots, X_n$ represent the lifetimes of n components, the components operate independently of one another, and each lifetime is exponentially distributed with parameter $\lambda$, then for $x_1 \ge 0, x_2 \ge 0, \ldots, x_n \ge 0$,

$$f(x_1, x_2, \ldots, x_n) = (\lambda e^{-\lambda x_1}) \cdot (\lambda e^{-\lambda x_2}) \cdots (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i}$$

If these n components constitute a system that will fail as soon as a single component fails, then the probability that the system lasts past time t is

$$\begin{aligned}
P(X_1 > t, \ldots, X_n > t) &= \int_t^{\infty} \cdots \int_t^{\infty} f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \\
&= \left( \int_t^{\infty} \lambda e^{-\lambda x_1}\,dx_1 \right) \cdots \left( \int_t^{\infty} \lambda e^{-\lambda x_n}\,dx_n \right) \\
&= (e^{-\lambda t})^n = e^{-n\lambda t}
\end{aligned}$$

Therefore,

$$P(\text{system lifetime} \le t) = 1 - e^{-n\lambda t} \quad \text{for } t \ge 0$$

which shows that system lifetime has an exponential distribution with parameter $n\lambda$; the expected value of system lifetime is $1/(n\lambda)$. ■

In many experimental situations to be considered in this book, independence is a reasonable assumption, so that specifying the joint distribution reduces to deciding on appropriate marginal distributions.

Conditional Distributions

Suppose X = the number of major defects in a randomly selected new automobile and Y = the number of minor defects in that same auto. If we learn that the selected car has one major defect, what now is the probability that the car has at most three minor defects; that is, what is $P(Y \le 3 \mid X = 1)$? Similarly, if X and Y denote the lifetimes of the front and rear tires on a motorcycle, and it happens that X = 10,000 miles, what now is the probability that Y is at most 15,000 miles, and what is the expected lifetime of the rear tire “conditional on” this value of X? Questions of this sort can be answered by studying conditional probability distributions.

DEFINITION

Let X and Y be two continuous rv’s with joint pdf f(x, y) and marginal X pdf $f_X(x)$. Then for any X value x for which $f_X(x) > 0$, the conditional probability density function of Y given that X = x is

$$f_{Y \mid X}(y \mid x) = \frac{f(x, y)}{f_X(x)} \qquad -\infty < y < \infty$$

If X and Y are discrete, replacing pdf’s by pmf’s in this definition gives the conditional probability mass function of Y when X = x.

Notice that the definition of $f_{Y \mid X}(y \mid x)$ parallels that of $P(B \mid A)$, the conditional probability that B will occur, given that A has occurred. Once the conditional pdf or pmf has been determined, questions of the type posed at the outset of this subsection can be answered by integrating or summing over an appropriate set of Y values.
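For the discrete case, a conditional pmf is just one row or column of the joint table rescaled by its total. The sketch below uses the insurance table of Example 5.1 and conditions on Y = 0, so the roles of X and Y are interchanged relative to the definition above. The function name is an assumption of this illustration.

```python
# Joint pmf of Example 5.1, keyed by (x, y).
joint_pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}

def conditional_pmf_x_given_y(pmf, y0):
    """p_{X|Y}(x | y0) = p(x, y0) / p_Y(y0), defined only when p_Y(y0) > 0."""
    p_y0 = sum(p for (x, y), p in pmf.items() if y == y0)
    if p_y0 <= 0:
        raise ValueError("conditioning value has zero marginal probability")
    return {x: p / p_y0 for (x, y), p in pmf.items() if y == y0}

cond = conditional_pmf_x_given_y(joint_pmf, 0)
ratio = cond[100] / cond[250]   # how much more likely X = 100 is than X = 250 given Y = 0
print(cond, ratio)
```

The ratio reproduces the earlier observation that, given Y = 0, X = 100 is four times as likely as X = 250, even though the two values are equally likely marginally.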


Example 5.12

Reconsider the situation of Examples 5.3 and 5.4 involving X = the proportion of time that a bank’s drive-up facility is busy and Y = the analogous proportion for the walk-up window. The conditional pdf of Y given that X = .8 is

$$f_{Y \mid X}(y \mid .8) = \frac{f(.8, y)}{f_X(.8)} = \frac{1.2(.8 + y^2)}{1.2(.8) + .4} = \frac{1}{34}(24 + 30y^2) \qquad 0 < y < 1$$

The probability that the walk-up facility is busy at most half the time given that X = .8 is then

$$P(Y \le .5 \mid X = .8) = \int_{-\infty}^{.5} f_{Y \mid X}(y \mid .8)\,dy = \int_0^{.5} \frac{1}{34}(24 + 30y^2)\,dy = .390$$

Using the marginal pdf of Y gives $P(Y \le .5) = .350$. Also E(Y) = .6, whereas the expected proportion of time that the walk-up facility is busy given that X = .8 (a conditional expectation) is

$$E(Y \mid X = .8) = \int_{-\infty}^{\infty} y \cdot f_{Y \mid X}(y \mid .8)\,dy = \frac{1}{34} \int_0^1 y(24 + 30y^2)\,dy = .574$$

If the two variables are independent, the marginal pmf or pdf in the denominator will cancel the corresponding factor in the numerator. The conditional distribution is then identical to the corresponding marginal distribution.
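The conditional calculations of Example 5.12 can be reproduced numerically along the following lines. This is a sketch under stated assumptions: the midpoint-rule helper and the function names are choices made here, and the joint pdf is the one from Example 5.3.

```python
def f(x, y):
    """Joint pdf of Example 5.3: (6/5)(x + y^2) on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return 1.2 * (x + y * y)
    return 0.0

def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def conditional_pdf_given_x(x):
    """Return the function y -> f(x, y) / f_X(x); requires f_X(x) > 0."""
    fx = integrate(lambda y: f(x, y), 0.0, 1.0)   # marginal pdf of X at x
    if fx <= 0:
        raise ValueError("f_X(x) must be positive")
    return lambda y: f(x, y) / fx

g = conditional_pdf_given_x(0.8)
p_half = integrate(g, 0.0, 0.5)                       # P(Y <= .5 | X = .8)
cond_mean = integrate(lambda y: y * g(y), 0.0, 1.0)   # E(Y | X = .8)
print(p_half, cond_mean)
```

Computing the marginal once and returning a closure avoids re-integrating $f_X(.8)$ for every value of y.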

1. A service station has both self-service and full-service islands. On each island, there is a single regular unleaded pump with two hoses. Let X denote the number of hoses being used on the self-service island at a particular time, and let Y denote the num- ber of hoses on the full-service island in use at that time. The joint pmf of X and Y appears in the accompanying tabulation.

y p(x, y) 0 1 2

0 .10 .04 .02 x 1 .08 .20 .06

2 .06 .14 .30

a. What is P(X � 1 and Y � 1)? b. Compute P(X � 1 and Y � 1). c. Give a word description of the event {X � 0 and Y � 0},

and compute the probability of this event. d. Compute the marginal pmf of X and of Y. Using pX(x),

what is P(X � 1)? e. Are X and Y independent rv’s? Explain.

2. When an automobile is stopped by a roving safety patrol, each tire is checked for tire wear, and each headlight is checked to see whether it is properly aimed. Let X denote the number of headlights that need adjustment, and let Y denote the number of defective tires.
a. If X and Y are independent with pX(0) = .5, pX(1) = .3, pX(2) = .2, and pY(0) = .6, pY(1) = .1, pY(2) = pY(3) = .05, and pY(4) = .2, display the joint pmf of (X, Y) in a joint probability table.
b. Compute P(X ≤ 1 and Y ≤ 1) from the joint probability table, and verify that it equals the product P(X ≤ 1) · P(Y ≤ 1).
c. What is P(X + Y = 0) (the probability of no violations)?
d. Compute P(X + Y ≤ 1).

3. A certain market has both an express checkout line and a superexpress checkout line. Let X1 denote the number of customers in line at the express checkout at a particular time of day, and let X2 denote the number of customers in line at the superexpress checkout at the same time. Suppose the joint pmf of X1 and X2 is as given in the accompanying table.

                        x2
                0      1      2      3
         0     .08    .07    .04    .00
         1     .06    .15    .05    .04
   x1    2     .05    .04    .10    .06
         3     .00    .03    .04    .07
         4     .00    .01    .05    .06

a. What is P(X1 = 1, X2 = 1), that is, the probability that there is exactly one customer in each line?
b. What is P(X1 = X2), that is, the probability that the numbers of customers in the two lines are identical?
c. Let A denote the event that there are at least two more customers in one line than in the other line. Express A in terms of X1 and X2, and calculate the probability of this event.


d. What is the probability that the total number of customers in the two lines is exactly four? At least four?

4. Return to the situation described in Exercise 3.
a. Determine the marginal pmf of X1, and then calculate the expected number of customers in line at the express checkout.
b. Determine the marginal pmf of X2.
c. By inspection of the probabilities P(X1 = 4), P(X2 = 0), and P(X1 = 4, X2 = 0), are X1 and X2 independent random variables? Explain.

5. The number of customers waiting for gift-wrap service at a department store is an rv X with possible values 0, 1, 2, 3, 4 and corresponding probabilities .1, .2, .3, .25, .15. A randomly selected customer will have 1, 2, or 3 packages for wrapping with probabilities .6, .3, and .1, respectively. Let Y = the total number of packages to be wrapped for the customers waiting in line (assume that the number of packages submitted by one customer is independent of the number submitted by any other customer).
a. Determine P(X = 3, Y = 3), i.e., p(3, 3).
b. Determine p(4, 11).

6. Let X denote the number of Canon digital cameras sold during a particular week by a certain store. The pmf of X is

   x        0     1     2     3     4
   pX(x)   .1    .2    .3    .25   .15

Sixty percent of all customers who purchase these cameras also buy an extended warranty. Let Y denote the number of purchasers during this week who buy an extended warranty.
a. What is P(X = 4, Y = 2)? [Hint: This probability equals P(Y = 2 | X = 4) · P(X = 4); now think of the four purchases as four trials of a binomial experiment, with success on a trial corresponding to buying an extended warranty.]
b. Calculate P(X = Y).
c. Determine the joint pmf of X and Y and then the marginal pmf of Y.

7. The joint probability distribution of the number X of cars and the number Y of buses per signal cycle at a proposed left-turn lane is displayed in the accompanying joint probability table.

                       y
   p(x, y)      0       1       2
         0     .025    .015    .010
         1     .050    .030    .020
   x     2     .125    .075    .050
         3     .150    .090    .060
         4     .100    .060    .040
         5     .050    .030    .020

a. What is the probability that there is exactly one car and exactly one bus during a cycle?

b. What is the probability that there is at most one car and at most one bus during a cycle?

c. What is the probability that there is exactly one car during a cycle? Exactly one bus?

d. Suppose the left-turn lane is to have a capacity of five cars, and that one bus is equivalent to three cars. What is the probability of an overflow during a cycle?

e. Are X and Y independent rv’s? Explain.

8. A stockroom currently has 30 components of a certain type, of which 8 were provided by supplier 1, 10 by supplier 2, and 12 by supplier 3. Six of these are to be randomly selected for a particular assembly. Let X = the number of supplier 1’s components selected, Y = the number of supplier 2’s components selected, and p(x, y) denote the joint pmf of X and Y.
a. What is p(3, 2)? [Hint: Each sample of size 6 is equally likely to be selected. Therefore, p(3, 2) = (number of outcomes with X = 3 and Y = 2)/(total number of outcomes). Now use the product rule for counting to obtain the numerator and denominator.]
b. Using the logic of part (a), obtain p(x, y). (This can be thought of as a multivariate hypergeometric distribution: sampling without replacement from a finite population consisting of more than two categories.)

9. Each front tire on a particular type of vehicle is supposed to be filled to a pressure of 26 psi. Suppose the actual air pressure in each tire is a random variable (X for the right tire and Y for the left tire) with joint pdf

$$f(x, y) = \begin{cases} K(x^2 + y^2) & 20 \le x \le 30,\ 20 \le y \le 30 \\ 0 & \text{otherwise} \end{cases}$$

a. What is the value of K?
b. What is the probability that both tires are underfilled?
c. What is the probability that the difference in air pressure between the two tires is at most 2 psi?
d. Determine the (marginal) distribution of air pressure in the right tire alone.
e. Are X and Y independent rv’s?

10. Annie and Alvie have agreed to meet between 5:00 P.M. and 6:00 P.M. for dinner at a local health-food restaurant. Let X = Annie’s arrival time and Y = Alvie’s arrival time. Suppose X and Y are independent with each uniformly distributed on the interval [5, 6].
a. What is the joint pdf of X and Y?
b. What is the probability that they both arrive between 5:15 and 5:45?
c. If the first one to arrive will wait only 10 min before leaving to eat elsewhere, what is the probability that they have dinner at the health-food restaurant? [Hint: The event of interest is A = {(x, y): |x − y| ≤ 1/6}.]

11. Two different professors have just submitted final exams for duplication. Let X denote the number of typographical errors on the first professor’s exam and Y denote the number of such errors on the second exam. Suppose X has a Poisson distribution with parameter μ1, Y has a Poisson distribution with parameter μ2, and X and Y are independent.
a. What is the joint pmf of X and Y?
b. What is the probability that at most one error is made on both exams combined?
c. Obtain a general expression for the probability that the total number of errors in the two exams is m (where m is a nonnegative integer). [Hint: A = {(x, y): x + y = m} = {(m, 0), (m − 1, 1), . . . , (1, m − 1), (0, m)}. Now sum the joint pmf over (x, y) ∈ A and use the binomial theorem, which says that

$$\sum_{k=0}^{m} \binom{m}{k} a^k b^{m-k} = (a + b)^m$$

for any a, b.]

12. Two components of a minicomputer have the following joint pdf for their useful lifetimes X and Y:

$$f(x, y) = \begin{cases} x e^{-x(1+y)} & x \ge 0 \text{ and } y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

a. What is the probability that the lifetime X of the first component exceeds 3?
b. What are the marginal pdf’s of X and Y? Are the two lifetimes independent? Explain.
c. What is the probability that the lifetime of at least one component exceeds 3?

13. You have two lightbulbs for a particular lamp. Let X = the lifetime of the first bulb and Y = the lifetime of the second bulb (both in 1000s of hours). Suppose that X and Y are independent and that each has an exponential distribution with parameter λ = 1.
a. What is the joint pdf of X and Y?
b. What is the probability that each bulb lasts at most 1000 hours (i.e., X ≤ 1 and Y ≤ 1)?
c. What is the probability that the total lifetime of the two bulbs is at most 2? [Hint: Draw a picture of the region A = {(x, y): x ≥ 0, y ≥ 0, x + y ≤ 2} before integrating.]
d. What is the probability that the total lifetime is between 1 and 2?

14. Suppose that you have ten lightbulbs, that the lifetime of each is independent of all the other lifetimes, and that each lifetime has an exponential distribution with parameter λ.
a. What is the probability that all ten bulbs fail before time t?
b. What is the probability that exactly k of the ten bulbs fail before time t?
c. Suppose that nine of the bulbs have lifetimes that are exponentially distributed with parameter λ and that the remaining bulb has a lifetime that is exponentially distributed with parameter θ (it is made by another manufacturer). What is the probability that exactly five of the ten bulbs fail before time t?

15. Consider a system consisting of three components as pictured. The system will continue to function as long as the first component functions and either component 2 or component 3 functions. Let X1, X2, and X3 denote the lifetimes of components 1, 2, and 3, respectively. Suppose the Xi’s are independent of one another and each Xi has an exponential distribution with parameter λ.

[Figure: component 1 connected in series with the parallel pair of components 2 and 3]

a. Let Y denote the system lifetime. Obtain the cumulative distribution function of Y and differentiate to obtain the pdf. [Hint: F(y) = P(Y ≤ y); express the event {Y ≤ y} in terms of unions and/or intersections of the three events {X1 ≤ y}, {X2 ≤ y}, and {X3 ≤ y}.]
b. Compute the expected system lifetime.

16. a. For f(x1, x2, x3) as given in Example 5.10, compute the joint marginal density function of X1 and X3 alone (by integrating over x2).
b. What is the probability that rocks of types 1 and 3 together make up at most 50% of the sample? [Hint: Use the result of part (a).]
c. Compute the marginal pdf of X1 alone. [Hint: Use the result of part (a).]

17. An ecologist wishes to select a point inside a circular sampling region according to a uniform distribution (in practice this could be done by first selecting a direction and then a distance from the center in that direction). Let X = the x coordinate of the point selected and Y = the y coordinate of the point selected. If the circle is centered at (0, 0) and has radius R, then the joint pdf of X and Y is

$$f(x, y) = \begin{cases} \dfrac{1}{\pi R^2} & x^2 + y^2 \le R^2 \\ 0 & \text{otherwise} \end{cases}$$

a. What is the probability that the selected point is within R/2 of the center of the circular region? [Hint: Draw a picture of the region of positive density D. Because f(x, y) is constant on D, computing a probability reduces to computing an area.]
b. What is the probability that both X and Y differ from 0 by at most R/2?
c. Answer part (b) for R/√2 replacing R/2.
d. What is the marginal pdf of X? Of Y? Are X and Y independent?

18. Refer to Exercise 1 and answer the following questions:
a. Given that X = 1, determine the conditional pmf of Y, i.e., pY|X(0 | 1), pY|X(1 | 1), and pY|X(2 | 1).
b. Given that two hoses are in use at the self-service island, what is the conditional pmf of the number of hoses in use on the full-service island?
c. Use the result of part (b) to calculate the conditional probability P(Y ≤ 1 | X = 2).

d. Given that two hoses are in use at the full-service island, what is the conditional pmf of the number in use at the self-service island?

19. The joint pdf of pressures for right and left front tires is given in Exercise 9.
a. Determine the conditional pdf of Y given that X = x and the conditional pdf of X given that Y = y.
b. If the pressure in the right tire is found to be 22 psi, what is the probability that the left tire has a pressure of at least 25 psi? Compare this to P(Y ≥ 25).
c. If the pressure in the right tire is found to be 22 psi, what is the expected pressure in the left tire, and what is the standard deviation of pressure in this tire?

20. Let X1, X2, X3, X4, X5, and X6 denote the numbers of blue, brown, green, orange, red, and yellow M&M candies, respectively, in a sample of size n. Then these Xi’s have a multinomial distribution. According to the M&M Web site, the color proportions are p1 = .24, p2 = .13, p3 = .16, p4 = .20, p5 = .13, and p6 = .14.
a. If n = 12, what is the probability that there are exactly two M&Ms of each color?
b. For n = 20, what is the probability that there are at most five orange candies? [Hint: Think of an orange candy as a success and any other color as a failure.]
c. In a sample of 20 M&Ms, what is the probability that the number of candies that are blue, green, or orange is at least 10?

21. Let X1, X2, and X3 be the lifetimes of components 1, 2, and 3 in a three-component system.
a. How would you define the conditional pdf of X3 given that X1 = x1 and X2 = x2?
b. How would you define the conditional joint pdf of X2 and X3 given that X1 = x1?

5.2 Expected Values, Covariance, and Correlation

Any function h(X) of a single rv X is itself a random variable. However, to compute E[h(X)], it is not necessary to obtain the probability distribution of h(X); instead, E[h(X)] is computed as a weighted average of h(x) values, where the weight function is the pmf p(x) or pdf f(x) of X. A similar result holds for a function h(X, Y) of two jointly distributed random variables.

PROPOSITION

Let X and Y be jointly distributed rv’s with pmf p(x, y) or pdf f(x, y) according to whether the variables are discrete or continuous. Then the expected value of a function h(X, Y), denoted by E[h(X, Y)] or μh(X, Y), is given by

$$E[h(X, Y)] = \begin{cases} \displaystyle\sum_x \sum_y h(x, y) \cdot p(x, y) & \text{if } X \text{ and } Y \text{ are discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\,dx\,dy & \text{if } X \text{ and } Y \text{ are continuous} \end{cases}$$

Example 5.13

Five friends have purchased tickets to a certain concert. If the tickets are for seats 1–5 in a particular row and the tickets are randomly distributed among the five, what is the expected number of seats separating any particular two of the five? Let X and Y denote the seat numbers of the first and second individuals, respectively. Possible (X, Y) pairs are {(1, 2), (1, 3), . . . , (5, 4)}, and the joint pmf of (X, Y) is

$$p(x, y) = \begin{cases} \dfrac{1}{20} & x = 1, \ldots, 5;\ y = 1, \ldots, 5;\ x \ne y \\ 0 & \text{otherwise} \end{cases}$$

The number of seats separating the two individuals is h(X, Y) = |X − Y| − 1. The accompanying table gives h(x, y) for each possible (x, y) pair.

                        x
   h(x, y)     1     2     3     4     5
         1     -     0     1     2     3
         2     0     -     0     1     2
   y     3     1     0     -     0     1
         4     2     1     0     -     0
         5     3     2     1     0     -

Thus

$$E[h(X, Y)] = \sum_{(x, y)} h(x, y) \cdot p(x, y) = \sum_{x=1}^{5} \sum_{\substack{y=1 \\ y \ne x}}^{5} (|x - y| - 1) \cdot \frac{1}{20} = 1$$ ■

Example 5.14

In Example 5.5, the joint pdf of the amount X of almonds and amount Y of cashews in a 1-lb can of nuts was

$$f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

If 1 lb of almonds costs the company $1.00, 1 lb of cashews costs $1.50, and 1 lb of peanuts costs $.50, then the total cost of the contents of a can is

h(X, Y) = (1)X + (1.5)Y + (.5)(1 − X − Y) = .5 + .5X + Y

(since 1 − X − Y of the weight consists of peanuts). The expected total cost is

$$E[h(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} h(x, y) \cdot f(x, y)\,dx\,dy = \int_0^1 \int_0^{1-x} (.5 + .5x + y) \cdot 24xy\,dy\,dx = \$1.10$$ ■

The method of computing the expected value of a function h(X1, . . . , Xn) of n random variables is similar to that for two random variables. If the Xi’s are discrete, E[h(X1, . . . , Xn)] is an n-dimensional sum; if the Xi’s are continuous, it is an n-dimensional integral.

Covariance

When two random variables X and Y are not independent, it is frequently of interest to assess how strongly they are related to one another.

DEFINITION

The covariance between two rv’s X and Y is

$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \begin{cases} \displaystyle\sum_x \sum_y (x - \mu_X)(y - \mu_Y)\,p(x, y) & X, Y \text{ discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\,f(x, y)\,dx\,dy & X, Y \text{ continuous} \end{cases}$$
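The two expected values in Examples 5.13 and 5.14 can be reproduced directly from the proposition. The following is a minimal sketch: the discrete case is an exact sum using `fractions.Fraction`, and the continuous case is approximated with a midpoint-rule double integral over the triangle x + y ≤ 1. The function names and the grid size `n` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

# Example 5.13: every ordered pair of distinct seats (x, y) has probability 1/20.
seat_pmf = {(x, y): Fraction(1, 20) for x, y in permutations(range(1, 6), 2)}


def expected_value(h, pmf):
    """E[h(X, Y)] for a discrete joint pmf given as {(x, y): probability}."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())


seats_between = expected_value(lambda x, y: abs(x - y) - 1, seat_pmf)


def expected_cost(n=400):
    """Example 5.14: midpoint-rule approximation of the double integral of
    (.5 + .5x + y) * 24xy over 0 <= x <= 1, 0 <= y <= 1 - x."""
    dx = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        dy = (1.0 - x) / n  # the inner range of y shrinks as x grows
        inner = sum(
            (0.5 + 0.5 * x + y) * 24.0 * x * y
            for y in ((j + 0.5) * dy for j in range(n))
        )
        total += inner * dy * dx
    return total


print(seats_between)              # exact value for Example 5.13
print(round(expected_cost(), 2))  # approximate value for Example 5.14
```

Neither calculation requires the distribution of h(X, Y) itself; only the joint pmf or pdf is used as the weight function.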

That is, since X − μX and Y − μY are the deviations of the two variables from their respective mean values, the covariance is the expected product of deviations. Note that Cov(X, X) = E[(X − μX)²] = V(X).

The rationale for the definition is as follows. Suppose X and Y have a strong positive relationship to one another, by which we mean that large values of X tend to occur with large values of Y and small values of X with small values of Y. Then most of the probability mass or density will be associated with (x − μX) and (y − μY), either both positive (both X and Y above their respective means) or both negative, so the product (x − μX)(y − μY) will tend to be positive. Thus for a strong positive relationship, Cov(X, Y) should be quite positive. For a strong negative relationship, the signs of (x − μX) and (y − μY) will tend to be opposite, yielding a negative product. Thus for a strong negative relationship, Cov(X, Y) should be quite negative. If X and Y are not strongly related, positive and negative products will tend to cancel one another, yielding a covariance near 0. Figure 5.4 illustrates the different possibilities. The covariance depends on both the set of possible pairs and the probabilities. In Figure 5.4, the probabilities could be changed without altering the set of possible pairs, and this could drastically change the value of Cov(X, Y).

Figure 5.4 p(x, y) = 1/10 for each of ten pairs corresponding to indicated points: (a) positive covariance; (b) negative covariance; (c) covariance near zero

Example 5.15

The joint and marginal pmf’s for X = automobile policy deductible amount and Y = homeowner policy deductible amount in Example 5.1 were

                       y
   p(x, y)      0     100    200
   x    100    .20    .10    .20
        250    .05    .15    .30

   x        100    250            y        0     100    200
   pX(x)    .5     .5             pY(y)   .25    .25    .5

from which μX = Σ x pX(x) = 175 and μY = 125. Therefore,

$$\mathrm{Cov}(X, Y) = \sum_{(x, y)} (x - 175)(y - 125)\,p(x, y) = (100 - 175)(0 - 125)(.20) + \cdots + (250 - 175)(200 - 125)(.30) = 1875$$ ■

The following shortcut formula for Cov(X, Y) simplifies the computations.

PROPOSITION

Cov(X, Y) = E(XY) − μX · μY
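The shortcut formula Cov(X, Y) = E(XY) − μX · μY follows from linearity of expected value. A sketch of the expansion, treating μX and μY as constants:

```latex
\begin{aligned}
\mathrm{Cov}(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] \\
                   &= E[XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y] \\
                   &= E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y \\
                   &= E(XY) - \mu_X \mu_Y - \mu_X \mu_Y + \mu_X \mu_Y \\
                   &= E(XY) - \mu_X \mu_Y
\end{aligned}
```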

According to this formula, no intermediate subtractions are necessary; only at the end of the computation is μX · μY subtracted from E(XY). The proof involves expanding (X − μX)(Y − μY) and then taking the expected value of each term separately.

Example 5.16 (Example 5.5 continued)

The joint and marginal pdf’s of X = amount of almonds and Y = amount of cashews were

$$f(x, y) = \begin{cases} 24xy & 0 \le x \le 1,\ 0 \le y \le 1,\ x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

$$f_X(x) = \begin{cases} 12x(1 - x)^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

with fY(y) obtained by replacing x by y in fX(x). It is easily verified that μX = μY = 2/5, and

$$E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\,f(x, y)\,dx\,dy = \int_0^1 \int_0^{1-x} xy \cdot 24xy\,dy\,dx = 8\int_0^1 x^2(1 - x)^3\,dx = \frac{2}{15}$$

Thus Cov(X, Y) = 2/15 − (2/5)(2/5) = 2/15 − 4/25 = −2/75. A negative covariance is reasonable here because more almonds in the can implies fewer cashews. ■

It might appear that the relationship in the insurance example is quite strong since Cov(X, Y) = 1875, whereas Cov(X, Y) = −2/75 in the nut example would seem to imply quite a weak relationship. Unfortunately, the covariance has a serious defect that makes it impossible to interpret a computed value. In the insurance example, suppose we had expressed the deductible amount in cents rather than in dollars. Then 100X would replace X, 100Y would replace Y, and the resulting covariance would be Cov(100X, 100Y) = (100)(100)Cov(X, Y) = 18,750,000. If, on the other hand, the deductible amount had been expressed in hundreds of dollars, the computed covariance would have been (.01)(.01)(1875) = .1875. The defect of covariance is that its computed value depends critically on the units of measurement. Ideally, the choice of units should have no effect on a measure of strength of relationship. This is achieved by scaling the covariance.

Correlation

DEFINITION

The correlation coefficient of X and Y, denoted by Corr(X, Y), ρX,Y, or just ρ, is defined by

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}$$

Example 5.17

It is easily verified that in the insurance scenario of Example 5.15, E(X²) = 36,250, σX² = 36,250 − (175)² = 5625, σX = 75, E(Y²) = 22,500, σY² = 6875, and σY = 82.92. This gives

$$\rho = \frac{1875}{(75)(82.92)} = .301$$ ■
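The insurance calculations of Examples 5.15 and 5.17, and the effect of changing units, can be checked with a short script. This is a sketch under the joint pmf given in Example 5.15; the helper names (`expect`, `covariance`, `correlation`, `rescale`) are illustrative and not from the text.

```python
from math import sqrt

# Joint pmf of Example 5.15: keys are (x, y) = (auto deductible, homeowner deductible).
pmf = {
    (100, 0): 0.20, (100, 100): 0.10, (100, 200): 0.20,
    (250, 0): 0.05, (250, 100): 0.15, (250, 200): 0.30,
}


def expect(h, pmf):
    """E[h(X, Y)] as a probability-weighted sum over the support."""
    return sum(h(x, y) * p for (x, y), p in pmf.items())


def covariance(pmf):
    """Return Cov(X, Y) computed two ways: by definition and by the shortcut formula."""
    mu_x = expect(lambda x, y: x, pmf)
    mu_y = expect(lambda x, y: y, pmf)
    by_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y), pmf)
    by_shortcut = expect(lambda x, y: x * y, pmf) - mu_x * mu_y
    return by_definition, by_shortcut


def correlation(pmf):
    """Corr(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y)."""
    mu_x = expect(lambda x, y: x, pmf)
    mu_y = expect(lambda x, y: y, pmf)
    var_x = expect(lambda x, y: x * x, pmf) - mu_x ** 2
    var_y = expect(lambda x, y: y * y, pmf) - mu_y ** 2
    return covariance(pmf)[1] / sqrt(var_x * var_y)


def rescale(pmf, a, c):
    """Joint pmf of (aX, cY), e.g. a = c = 100 converts dollars to cents."""
    return {(a * x, c * y): p for (x, y), p in pmf.items()}


cov_def, cov_short = covariance(pmf)
rho = correlation(pmf)
cents = rescale(pmf, 100, 100)
cov_cents = covariance(cents)[1]
rho_cents = correlation(cents)

print(round(cov_def, 6), round(cov_short, 6), round(rho, 3))
print(round(cov_cents, 3), round(rho_cents, 3))
```

Expressing the deductibles in cents multiplies the covariance by 10,000 but leaves the correlation coefficient unchanged, which is the motivation for scaling by the two standard deviations.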

The following proposition shows that ρ remedies the defect of Cov(X, Y) and also suggests how to recognize the existence of a strong (linear) relationship.

PROPOSITION

1. If a and c are either both positive or both negative,

Corr(aX + b, cY + d) = Corr(X, Y)

2. For any two rv’s X and Y, −1 ≤ Corr(X, Y) ≤ 1.

Statement 1 says precisely that the correlation coefficient is not affected by a linear change in the units of measurement (if, say, X = temperature in °C, then 9X/5 + 32 = temperature in °F). According to Statement 2, the strongest possible positive relationship is evidenced by ρ = +1, whereas the strongest possible negative relationship corresponds to ρ = −1. The proof of the first statement is sketched in Exercise 35, and that of the second appears in Supplementary Exercise 87 at the end of the chapter. For descriptive purposes, the relationship will be described as strong if |ρ| ≥ .8, moderate if .5 < |ρ| < .8, and weak if |ρ| ≤ .5.

If we think of p(x, y) or f(x, y) as prescribing a mathematical model for how the two numerical variables X and Y are distributed in some population (height and weight, verbal SAT score and quantitative SAT score, etc.), then ρ is a population characteristic or parameter that measures how strongly X and Y are related in the population. In Chapter 12, we will consider taking a sample of pairs (x1, y1), . . . , (xn, yn) from the population. The sample correlation coefficient r will then be defined and used to make inferences about ρ.

The correlation coefficient ρ is actually not a completely general measure of the strength of a relationship.

PROPOSITION

1. If X and Y are independent, then ρ = 0, but ρ = 0 does not imply independence.

2. ρ = 1 or −1 iff Y = aX + b for some numbers a and b with a ≠ 0.

This proposition says that ρ is a measure of the degree of linear relationship between X and Y, and only when the two variables are perfectly related in a linear manner will ρ be as positive or negative as it can be. A ρ less than 1 in absolute value indicates only that the relationship is not completely linear, but there may still be a very strong nonlinear relation. Also, ρ = 0 does not imply that X and Y are independent, but only that there is a complete absence of a linear relationship. When ρ = 0, X and Y are said to be uncorrelated. Two variables could be uncorrelated yet highly dependent because there is a strong nonlinear relationship, so be careful not to conclude too much from knowing that ρ = 0.

Example 5.18

Let X and Y be discrete rv’s with joint pmf

$$p(x, y) = \begin{cases} \dfrac{1}{4} & (x, y) = (-4, 1),\ (4, -1),\ (2, 2),\ (-2, -2) \\ 0 & \text{otherwise} \end{cases}$$

The points that receive positive probability mass are identified on the (x, y) coordinate system in Figure 5.5. It is evident from the figure that the value of X is completely determined by the value of Y and vice versa, so the two variables are completely dependent. However, by symmetry μX = μY = 0 and E(XY) = (−4)(1/4) + (−4)(1/4) + (4)(1/4) + (4)(1/4) = 0. The covariance is then Cov(X, Y) = E(XY) − μX · μY = 0 and thus ρX,Y = 0. Although there is perfect dependence, there is also complete absence of any linear relationship!

Figure 5.5 The population of pairs for Example 5.18 ■

A value of ρ near 1 does not necessarily imply that increasing the value of X causes Y to increase. It implies only that large X values are associated with large Y values. For example, in the population of children, vocabulary size and number of cavities are quite positively correlated, but it is certainly not true that cavities cause vocabulary to grow. Instead, the values of both these variables tend to increase as the value of age, a third variable, increases. For children of a fixed age, there is probably a low correlation between number of cavities and vocabulary size. In summary, association (a high correlation) is not the same as causation.

EXERCISES Section 5.2 (22–36)

22. An instructor has given a short quiz consisting of two parts. For a randomly selected student, let X = the number of points earned on the first part and Y = the number of points earned on the second part. Suppose that the joint pmf of X and Y is given in the accompanying table.

                         y
   p(x, y)      0      5      10     15
         0     .02    .06    .02    .10
   x     5     .04    .15    .20    .10
        10     .01    .15    .14    .01

a. If the score recorded in the grade book is the total number of points earned on the two parts, what is the expected recorded score E(X + Y)?
b. If the maximum of the two scores is recorded, what is the expected recorded score?

23. The difference between the number of customers in line at the express checkout and the number in line at the superexpress checkout in Exercise 3 is X1 − X2. Calculate the expected difference.

24. Six individuals, including A and B, take seats around a circular table in a completely random fashion. Suppose the seats are numbered 1, . . . , 6. Let X = A’s seat number and Y = B’s seat number. If A sends a written message around the table to B in the direction in which they are closest, how many individuals (including A and B) would you expect to handle the message?

25. A surveyor wishes to lay out a square region with each side having length L. However, because of a measurement error, he instead lays out a rectangle in which the north–south sides both have length X and the east–west sides both have length Y. Suppose that X and Y are independent and that each is uniformly distributed on the interval [L − A, L + A] (where 0 < A < L). What is the expected area of the resulting rectangle?

26. Consider a small ferry that can accommodate cars and buses. The toll for cars is $3, and the toll for buses is $10. Let X and Y denote the number of cars and buses, respectively, carried on a single trip. Suppose the joint distribution of X and Y is as given in the table of Exercise 7. Compute the expected revenue from a single trip.

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Example 5.19

212 CHAPTER 5 Joint Probability Distributions and Random Samples

5.3 Statistics and Their Distributions The observations in a single sample were denoted in Chapter 1 by x1, x2, . . . , xn. Consider selecting two different samples of size n from the same population dis- tribution. The xi’s in the second sample will virtually always differ at least a bit from those in the first sample. For example, a first sample of n � 3 cars of a par- ticular type might result in fuel efficiencies x1 � 30.7, x2 � 29.4, x3 � 31.1, whereas a second sample may give x1 � 28.8, x2 � 30.0, and x3 � 32.5. Before we obtain data, there is uncertainty about the value of each xi. Because of this uncertainty, before the data becomes available we view each observation as a ran- dom variable and denote the sample by X1, X2, . . . , Xn (uppercase letters for random variables).

This variation in observed values in turn implies that the value of any func- tion of the sample observations—such as the sample mean, sample standard devi- ation, or sample fourth spread—also varies from sample to sample. That is, prior to obtaining x1, . . . , xn, there is uncertainty as to the value of , the value of s, and so on.

Suppose that material strength for a randomly selected specimen of a particular type has a Weibull distribution with parameter values a � 2 (shape) and b � 5 (scale). The corresponding density curve is shown in Figure 5.6. Formulas from Section 4.5 give

The mean exceeds the median because of the distribution’s positive skew.

m 5 E(x) 5 4.4311 m| 5 4.1628 s2 5 V(X) 5 5.365 s 5 2.316

x

27. Annie and Alvie have agreed to meet for lunch between noon (0:00 P.M.) and 1:00 P.M. Denote Annie’s arrival time by $X$, Alvie’s by $Y$, and suppose $X$ and $Y$ are independent with pdf’s

$$f_X(x) = \begin{cases} 3x^2 & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases} \qquad f_Y(y) = \begin{cases} 2y & 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases}$$

What is the expected amount of time that the one who arrives first must wait for the other person? [Hint: $h(X, Y) = |X - Y|$.]

28. Show that if $X$ and $Y$ are independent rv’s, then $E(XY) = E(X) \cdot E(Y)$. Then apply this in Exercise 25. [Hint: Consider the continuous case with $f(x, y) = f_X(x) \cdot f_Y(y)$.]

29. Compute the correlation coefficient $\rho$ for $X$ and $Y$ of Example 5.16 (the covariance has already been computed).

30. a. Compute the covariance for $X$ and $Y$ in Exercise 22.
b. Compute $\rho$ for $X$ and $Y$ in the same exercise.

31. a. Compute the covariance between $X$ and $Y$ in Exercise 9.
b. Compute the correlation coefficient $\rho$ for this $X$ and $Y$.

32. Reconsider the minicomputer component lifetimes $X$ and $Y$ as described in Exercise 12. Determine $E(XY)$. What can be said about $\text{Cov}(X, Y)$ and $\rho$?

33. Use the result of Exercise 28 to show that when $X$ and $Y$ are independent, $\text{Cov}(X, Y) = \text{Corr}(X, Y) = 0$.

34. a. Recalling the definition of $\sigma^2$ for a single rv $X$, write a formula that would be appropriate for computing the variance of a function $h(X, Y)$ of two random variables. [Hint: Remember that variance is just a special expected value.]

b. Use this formula to compute the variance of the recorded score $h(X, Y)$ [$= \max(X, Y)$] in part (b) of Exercise 22.

35. a. Use the rules of expected value to show that $\text{Cov}(aX + b, cY + d) = ac\,\text{Cov}(X, Y)$.

b. Use part (a) along with the rules of variance and standard deviation to show that $\text{Corr}(aX + b, cY + d) = \text{Corr}(X, Y)$ when $a$ and $c$ have the same sign.

c. What happens if $a$ and $c$ have opposite signs?

36. Show that if $Y = aX + b$ ($a \ne 0$), then $\text{Corr}(X, Y) = +1$ or $-1$. Under what conditions will $\rho = +1$?


Figure 5.6 The Weibull density curve for Example 5.19 (horizontal axis: $x$, from 0 to 15; vertical axis: $f(x)$, from 0 to .15)

We used statistical software to generate six different samples, each with $n = 10$, from this distribution (material strengths for six different groups of ten specimens each). The results appear in Table 5.1, followed by the values of the sample mean, sample median, and sample standard deviation for each sample. Notice first that the ten observations in any particular sample are all different from those in any other sample. Second, the six values of the sample mean are all different from one another, as are the six values of the sample median and the six values of the sample standard deviation. The same is true of the sample 10% trimmed means, sample fourth spreads, and so on.

Table 5.1 Samples from the Weibull Distribution of Example 5.19

Sample            1          2          3          4          5          6

 1           6.1171    5.07611    3.46710    1.55601    3.12372    8.93795
 2           4.1600    6.79279    2.71938    4.56941    6.09685    3.92487
 3           3.1950    4.43259    5.88129    4.79870    3.41181    8.76202
 4           0.6694    8.55752    5.14915    2.49759    1.65409    7.05569
 5           1.8552    6.82487    4.99635    2.33267    2.29512    2.30932
 6           5.2316    7.39958    5.86887    4.01295    2.12583    5.94195
 7           2.7609    2.14755    6.05918    9.08845    3.20938    6.74166
 8          10.2185    8.50628    1.80119    3.25728    3.23209    1.75468
 9           5.2438    5.49510    4.21994    3.70132    6.84426    4.91827
10           4.5590    4.04525    2.12934    5.50134    4.20694    7.26081

$\bar{x}$     4.401      5.928      4.229      4.132      3.620      5.761
$\tilde{x}$   4.360      6.144      4.608      3.857      3.221      6.342
$s$           2.642      2.062      1.611      2.124      1.678      2.496

Furthermore, the value of the sample mean from any particular sample can be regarded as a point estimate (“point” because it is a single number, corresponding to a single point on the number line) of the population mean $\mu$, whose value is known to be 4.4311. None of the estimates from these six samples is identical to what is being estimated. The estimates from the second and sixth samples are much too large, whereas the fifth sample gives a substantial underestimate. Similarly, the sample standard deviation gives a point estimate of the population standard deviation. All six of the resulting estimates are in error by at least a small amount.

In summary, the values of the individual sample observations vary from sample to sample, and so, in general, will the value of any quantity computed from sample data. Moreover, the value of a sample characteristic used as an estimate of the corresponding population characteristic will virtually never coincide with what is being estimated. ■
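The sample-to-sample variation described in Example 5.19 can be reproduced with a short sketch. It assumes Python's standard `random` module in place of the statistical software used for Table 5.1, so the numbers will not match the table; the function name `weibull_sample_stats` and the seed are illustrative choices, not part of the text.

```python
import random
import statistics

def weibull_sample_stats(n_samples=6, n=10, shape=2.0, scale=5.0, seed=0):
    """Return (mean, median, sd) for each of n_samples Weibull samples of size n."""
    rng = random.Random(seed)  # seed is an arbitrary, hypothetical choice
    results = []
    for _ in range(n_samples):
        # random.weibullvariate takes the scale first, then the shape.
        sample = [rng.weibullvariate(scale, shape) for _ in range(n)]
        results.append((statistics.fmean(sample),
                        statistics.median(sample),
                        statistics.stdev(sample)))  # stdev uses the n - 1 divisor
    return results

if __name__ == "__main__":
    for i, (xbar, med, s) in enumerate(weibull_sample_stats(), start=1):
        print(f"sample {i}: mean={xbar:.3f}  median={med:.3f}  s={s:.3f}")
```

Each run with a different seed gives six different means, medians, and standard deviations, none of which should be expected to equal the population values 4.4311, 4.1628, and 2.316 exactly.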


DEFINITION


A statistic is any quantity whose value can be calculated from sample data. Prior to obtaining data, there is uncertainty as to what value of any particular statistic will result. Therefore, a statistic is a random variable and will be denoted by an uppercase letter; a lowercase letter is used to represent the calculated or observed value of the statistic.

Thus the sample mean, regarded as a statistic (before a sample has been selected or an experiment carried out), is denoted by $\bar{X}$; the calculated value of this statistic is $\bar{x}$. Similarly, $S$ represents the sample standard deviation thought of as a statistic, and its computed value is $s$. If samples of two different types of bricks are selected and the individual compressive strengths are denoted by $X_1, \ldots, X_m$ and $Y_1, \ldots, Y_n$, respectively, then the statistic $\bar{X} - \bar{Y}$, the difference between the two sample mean compressive strengths, is often of great interest.

Any statistic, being a random variable, has a probability distribution. In particular, the sample mean $\bar{X}$ has a probability distribution. Suppose, for example, that $n = 2$ components are randomly selected and the number of breakdowns while under warranty is determined for each one. Possible values for the sample mean number of breakdowns $\bar{X}$ are 0 (if $X_1 = X_2 = 0$), .5 (if either $X_1 = 0$ and $X_2 = 1$ or $X_1 = 1$ and $X_2 = 0$), 1, 1.5, . . . . The probability distribution of $\bar{X}$ specifies $P(\bar{X} = 0)$, $P(\bar{X} = .5)$, and so on, from which other probabilities such as $P(1 \le \bar{X} \le 3)$ and $P(\bar{X} \ge 2.5)$ can be calculated. Similarly, if for a sample of size $n = 2$, the only possible values of the sample variance are 0, 12.5, and 50 (which is the case if $X_1$ and $X_2$ can each take on only the values 40, 45, or 50), then the probability distribution of $S^2$ gives $P(S^2 = 0)$, $P(S^2 = 12.5)$, and $P(S^2 = 50)$. The probability distribution of a statistic is sometimes referred to as its sampling distribution to emphasize that it describes how the statistic varies in value across all samples that might be selected.

Random Samples

The probability distribution of any particular statistic depends not only on the population distribution (normal, uniform, etc.) and the sample size $n$ but also on the method of sampling. Consider selecting a sample of size $n = 2$ from a population consisting of just the three values 1, 5, and 10, and suppose that the statistic of interest is the sample variance. If sampling is done “with replacement,” then $S^2 = 0$ will result if $X_1 = X_2$. However, $S^2$ cannot equal 0 if sampling is “without replacement.” So $P(S^2 = 0) = 0$ for one sampling method, and this probability is positive for the other method. Our next definition describes a sampling method often encountered (at least approximately) in practice.

DEFINITION

The rv’s $X_1, X_2, \ldots, X_n$ are said to form a (simple) random sample of size $n$ if

1. The $X_i$’s are independent rv’s.

2. Every $X_i$ has the same probability distribution.
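The contrast drawn earlier between sampling with and without replacement from the population 1, 5, 10 can be checked by enumeration. The sketch below assumes each ordered sample of size 2 is equally likely under the given scheme (the text does not state the selection probabilities); the name `prob_s2_zero` is illustrative.

```python
from fractions import Fraction
from itertools import permutations, product
from statistics import variance

POPULATION = (1, 5, 10)

def prob_s2_zero(with_replacement):
    """P(S^2 = 0) for n = 2, assuming equally likely ordered samples."""
    if with_replacement:
        samples = list(product(POPULATION, repeat=2))   # 9 ordered pairs
    else:
        samples = list(permutations(POPULATION, 2))     # 6 ordered pairs, no repeats
    zero_count = sum(1 for pair in samples if variance(pair) == 0)
    return Fraction(zero_count, len(samples))

if __name__ == "__main__":
    print("with replacement:   ", prob_s2_zero(True))
    print("without replacement:", prob_s2_zero(False))
```

Under that equal-likelihood assumption the probability is 1/3 with replacement and 0 without, which is the positive-versus-zero contrast the text describes.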

Conditions 1 and 2 can be paraphrased by saying that the $X_i$’s are independent and identically distributed (iid). If sampling is either with replacement or from an infinite (conceptual) population, Conditions 1 and 2 are satisfied exactly. These conditions will be approximately satisfied if sampling is without replacement, yet the sample size $n$ is much smaller than the population size $N$. In practice, if $n/N \le .05$ (at most 5% of the population is sampled), we can proceed as if the $X_i$’s form a random sample. The virtue of this sampling method is that the probability distribution of any statistic can be more easily obtained than for any other sampling method.

There are two general methods for obtaining information about a statistic’s sampling distribution. One method involves calculations based on probability rules, and the other involves carrying out a simulation experiment.

Deriving a Sampling Distribution

Probability rules can be used to obtain the distribution of a statistic provided that it is a “fairly simple” function of the $X_i$’s and either there are relatively few different $X$ values in the population or else the population distribution has a “nice” form. Our next two examples illustrate such situations.

Example 5.20

A certain brand of MP3 player comes in three configurations: a model with 2 GB of memory, costing \$80, a 4 GB model priced at \$100, and an 8 GB version with a price tag of \$120. If 20% of all purchasers choose the 2 GB model, 30% choose the 4 GB model, and 50% choose the 8 GB model, then the probability distribution of the cost $X$ of a single randomly selected MP3 player purchase is given by

$x$        80    100    120
$p(x)$     .2     .3     .5        with $\mu = 106$, $\sigma^2 = 244$        (5.2)

Suppose on a particular day only two MP3 players are sold. Let $X_1$ = the revenue from the first sale and $X_2$ = the revenue from the second. Suppose that $X_1$ and $X_2$ are independent, each with the probability distribution shown in (5.2) [so that $X_1$ and $X_2$ constitute a random sample from the distribution (5.2)]. Table 5.2 lists possible $(x_1, x_2)$ pairs, the probability of each [computed using (5.2) and the assumption of independence], and the resulting $\bar{x}$ and $s^2$ values. [Note that when $n = 2$, $s^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2$.]

Now to obtain the probability distribution of $\bar{X}$, the sample average revenue per sale, we must consider each possible value $\bar{x}$ and compute its probability. For example, $\bar{x} = 100$ occurs three times in the table with probabilities .10, .09, and .10, so

$$p_{\bar{X}}(100) = P(\bar{X} = 100) = .10 + .09 + .10 = .29$$

Similarly,

$$p_{S^2}(800) = P(S^2 = 800) = P(X_1 = 80, X_2 = 120 \text{ or } X_1 = 120, X_2 = 80) = .10 + .10 = .20$$

Table 5.2 Outcomes, Probabilities, and Values of $\bar{x}$ and $s^2$ for Example 5.20

$x_1$    $x_2$    $p(x_1, x_2)$    $\bar{x}$    $s^2$

 80       80        .04              80           0
 80      100        .06              90         200
 80      120        .10             100         800
100       80        .06              90         200
100      100        .09             100           0
100      120        .15             110         200
120       80        .10             100         800
120      100        .15             110         200
120      120        .25             120           0


The complete sampling distributions of $\bar{X}$ and $S^2$ appear in (5.3) and (5.4).

$\bar{x}$                   80     90    100    110    120
$p_{\bar{X}}(\bar{x})$      .04    .12    .29    .30    .25        (5.3)

$s^2$                0    200    800
$p_{S^2}(s^2)$      .38    .42    .20        (5.4)
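The enumeration behind Table 5.2 and the distributions (5.3) and (5.4) can be sketched in a few lines. This is one possible way to organize the calculation, using exact fractions; the names `PMF`, `sampling_distributions`, and `expected` are illustrative and not from the text.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population pmf (5.2): price -> probability
PMF = {80: Fraction(2, 10), 100: Fraction(3, 10), 120: Fraction(5, 10)}

def sampling_distributions(pmf):
    """Exact pmfs of the sample mean and sample variance for a random sample of size 2."""
    p_mean = defaultdict(Fraction)
    p_var = defaultdict(Fraction)
    for (x1, p1), (x2, p2) in product(pmf.items(), repeat=2):
        prob = p1 * p2                                 # independence of X1 and X2
        xbar = Fraction(x1 + x2, 2)
        s2 = (x1 - xbar) ** 2 + (x2 - xbar) ** 2       # divisor n - 1 = 1 when n = 2
        p_mean[xbar] += prob
        p_var[s2] += prob
    return dict(p_mean), dict(p_var)

def expected(pmf):
    """Expected value of a discrete distribution given as value -> probability."""
    return sum(value * prob for value, prob in pmf.items())

if __name__ == "__main__":
    p_mean, p_var = sampling_distributions(PMF)
    for value, prob in sorted(p_mean.items()):
        print(f"xbar = {float(value):5.0f}   p = {float(prob):.2f}")
    for value, prob in sorted(p_var.items()):
        print(f"s^2  = {float(value):5.0f}   p = {float(prob):.2f}")
    print("E(Xbar) =", expected(p_mean), "  E(S^2) =", expected(p_var))
```

Working with `Fraction` avoids floating-point rounding, so the probabilities can be compared exactly with the entries in (5.3) and (5.4).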

Figure 5.7 pictures a probability histogram for both the original distribution (5.2) and the $\bar{X}$ distribution (5.3). The figure suggests first that the mean (expected value) of the $\bar{X}$ distribution is equal to the mean 106 of the original distribution, since both histograms appear to be centered at the same place. From (5.3),

$$\mu_{\bar{X}} = E(\bar{X}) = \sum \bar{x}\, p_{\bar{X}}(\bar{x}) = (80)(.04) + \cdots + (120)(.25) = 106 = \mu$$

Figure 5.7 Probability histograms for the underlying distribution and $\bar{X}$ distribution in Example 5.20 (bar heights .2, .3, .5 at 80, 100, 120 for the underlying distribution; .04, .12, .29, .30, .25 at 80, 90, 100, 110, 120 for $\bar{X}$)

Second, it appears that the $\bar{X}$ distribution has smaller spread (variability) than the original distribution, since probability mass has moved in toward the mean. Again from (5.3),

$$\sigma^2_{\bar{X}} = V(\bar{X}) = \sum \bar{x}^2 \cdot p_{\bar{X}}(\bar{x}) - \mu^2_{\bar{X}} = (80^2)(.04) + \cdots + (120^2)(.25) - (106)^2 = 122 = \frac{244}{2} = \frac{\sigma^2}{2}$$

The variance of $\bar{X}$ is precisely half that of the original variance (because $n = 2$). Using (5.4), the mean value of $S^2$ is

$$\mu_{S^2} = E(S^2) = \sum s^2 \cdot p_{S^2}(s^2) = (0)(.38) + (200)(.42) + (800)(.20) = 244 = \sigma^2$$

That is, the $\bar{X}$ sampling distribution is centered at the population mean $\mu$, and the $S^2$ sampling distribution is centered at the population variance $\sigma^2$.

If there had been four purchases on the day of interest, the sample average revenue $\bar{X}$ would be based on a random sample of four $X_i$’s, each having the distribution (5.2). More calculation eventually yields the pmf of $\bar{X}$ for $n = 4$ as

$\bar{x}$                   80      85      90      95      100     105     110     115     120
$p_{\bar{X}}(\bar{x})$      .0016   .0096   .0376   .0936   .1761   .2340   .2350   .1500   .0625

From this, $\mu_{\bar{X}} = 106 = \mu$ and $\sigma^2_{\bar{X}} = 61 = \sigma^2/4$. Figure 5.8 is a probability histogram of this pmf.

Figure 5.8 Probability histogram for $\bar{X}$ based on $n = 4$ in Example 5.20

Example 5.20 should suggest first of all that the computation of $p_{\bar{X}}(\bar{x})$ and $p_{S^2}(s^2)$ can be tedious. If the original distribution (5.2) had allowed for more than three possible values, then even for $n = 2$ the computations would have been more involved. The example should also suggest, however, that there are some general relationships between $E(\bar{X})$, $V(\bar{X})$, $E(S^2)$, and the mean $\mu$ and variance $\sigma^2$ of the original distribution. These are stated in the next section. Now consider an example in which the random sample is drawn from a continuous distribution.

Example 5.21

Service time for a certain type of bank transaction is a random variable having an exponential distribution with parameter $\lambda$. Suppose $X_1$ and $X_2$ are service times for two different customers, assumed independent of each other. Consider the total service time $T_o = X_1 + X_2$ for the two customers, also a statistic. The cdf of $T_o$ is, for $t \ge 0$,

$$F_{T_o}(t) = P(X_1 + X_2 \le t) = \iint_{\{(x_1, x_2):\, x_1 + x_2 \le t\}} f(x_1, x_2)\, dx_1\, dx_2$$

$$= \int_0^t \int_0^{t - x_1} \lambda e^{-\lambda x_1} \cdot \lambda e^{-\lambda x_2}\, dx_2\, dx_1 = \int_0^t \left[\lambda e^{-\lambda x_1} - \lambda e^{-\lambda t}\right] dx_1$$

$$= 1 - e^{-\lambda t} - \lambda t e^{-\lambda t}$$

The region of integration is pictured in Figure 5.9.

Figure 5.9 Region of integration to obtain cdf of $T_o$ in Example 5.21 (the region in the first quadrant of the $(x_1, x_2)$ plane below the line $x_1 + x_2 = t$, with the point $(x_1, t - x_1)$ marked on the line)

The pdf of $T_o$ is obtained by differentiating $F_{T_o}(t)$:

$$f_{T_o}(t) = \begin{cases} \lambda^2 t e^{-\lambda t} & t \ge 0 \\ 0 & t < 0 \end{cases} \qquad (5.5)$$

This is a gamma pdf ($\alpha = 2$ and $\beta = 1/\lambda$). The pdf of $\bar{X} = T_o/2$ is obtained from the relation $\{\bar{X} \le \bar{x}\}$ iff $\{T_o \le 2\bar{x}\}$ as

$$f_{\bar{X}}(\bar{x}) = \begin{cases} 4\lambda^2 \bar{x} e^{-2\lambda \bar{x}} & \bar{x} \ge 0 \\ 0 & \bar{x} < 0 \end{cases} \qquad (5.6)$$

The mean and variance of the underlying exponential distribution are $\mu = 1/\lambda$ and $\sigma^2 = 1/\lambda^2$. From Expressions (5.5) and (5.6), it can be verified that $E(\bar{X}) = 1/\lambda$, $V(\bar{X}) = 1/(2\lambda^2)$, $E(T_o) = 2/\lambda$, and $V(T_o) = 2/\lambda^2$. These results again suggest some general relationships between means and variances of $\bar{X}$, $T_o$, and the underlying distribution.

Simulation Experiments

The second method of obtaining information about a statistic’s sampling distribution is to perform a simulation experiment. This method is usually used when a derivation via probability rules is too difficult or complicated to be carried out. Such an experiment is virtually always done with the aid of a computer. The following characteristics of an experiment must be specified:

1. The statistic of interest ($\bar{X}$, $S$, a particular trimmed mean, etc.)

2. The population distribution (normal with $\mu = 100$ and $\sigma = 15$, uniform with lower limit $A = 5$ and upper limit $B = 10$, etc.)

3. The sample size $n$ (e.g., $n = 10$ or $n = 50$)

4. The number of replications $k$ (number of samples to be obtained)

Then use appropriate software to obtain $k$ different random samples, each of size $n$, from the designated population distribution. For each sample, calculate the value of the statistic and construct a histogram of the $k$ values. This histogram gives the approximate sampling distribution of the statistic. The larger the value of $k$, the better the approximation will tend to be (the actual sampling distribution emerges as $k \to \infty$). In practice, $k = 500$ or 1000 is usually sufficient if the statistic is “fairly simple.”

Example 5.22

The population distribution for our first simulation study is normal with $\mu = 8.25$ and $\sigma = .75$, as pictured in Figure 5.10. [The article “Platelet Size in Myocardial Infarction” (British Med. J., 1983: 449–451) suggests this distribution for platelet volume in individuals with no history of serious heart problems.]

Figure 5.10 Normal distribution, with $\mu = 8.25$ and $\sigma = .75$ (horizontal axis marked at 6.00, 6.75, 7.50, 8.25, 9.00, 9.75, 10.50)


We actually performed four different experiments, with 500 replications for each one. In the first experiment, 500 samples of $n = 5$ observations each were generated using Minitab, and the sample sizes for the other three were $n = 10$, $n = 20$, and $n = 30$, respectively. The sample mean was calculated for each sample, and the resulting histograms of $\bar{x}$ values appear in Figure 5.11.

Figure 5.11 Sample histograms for $\bar{x}$ based on 500 samples, each consisting of $n$ observations: (a) $n = 5$; (b) $n = 10$; (c) $n = 20$; (d) $n = 30$ (vertical axes: relative frequency)

The first thing to notice about the histograms is their shape. To a reasonable approximation, each of the four looks like a normal curve. The resemblance would be even more striking if each histogram had been based on many more than 500 $\bar{x}$ values. Second, each histogram is centered approximately at 8.25, the mean of the population being sampled. Had the histograms been based on an unending sequence of $\bar{x}$ values, their centers would have been exactly the population mean, 8.25.
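A simulation along the lines of Example 5.22 might be sketched as follows. It assumes Python's `random` module rather than Minitab, so the results will differ in detail from Figure 5.11; the function name `simulate_sample_means` and the seed are hypothetical choices made for illustration.

```python
import random
import statistics

def simulate_sample_means(n, k=500, mu=8.25, sigma=0.75, seed=0):
    """Return k sample means, each computed from n normal(mu, sigma) observations."""
    rng = random.Random(seed)  # fixed seed only so that runs are repeatable
    return [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
            for _ in range(k)]

if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        means = simulate_sample_means(n)
        print(f"n = {n:2d}: average of the means = {statistics.fmean(means):.3f}, "
              f"sd of the means = {statistics.stdev(means):.3f}")
```

Printing the average and standard deviation of the 500 means for each sample size gives a numerical counterpart to the histograms: the averages should sit near 8.25, and the spread should shrink as $n$ grows. A histogram of each list could be drawn with any plotting tool.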

The final aspect of the histograms to note is their spread relative to one another. The larger the value of $n$, the more concentrated is the sampling distribution about the mean value. This is why the histograms for $n = 20$ and $n = 30$ are based on narrower class intervals than those for the two smaller sample sizes. For the larger sample sizes, most of the $\bar{x}$ values are quite close to 8.25. This is the effect of averaging. When $n$ is small, a single unusual $x$ value can result in an $\bar{x}$ value far from the center. With a larger sample size, any unusual $x$ values, when averaged in with the other sample values, still tend to yield an $\bar{x}$ value close to $\mu$. Combining these insights yields a result that should appeal to your intuition: $\bar{X}$ based on a large $n$ tends to be closer to $\mu$ than does $\bar{X}$ based on a small $n$.

Example 5.23

Consider a simulation experiment in which the population distribution is quite skewed. Figure 5.12 shows the density curve for lifetimes of a certain type of electronic control [this is actually a lognormal distribution with $E(\ln(X)) = 3$ and $V(\ln(X)) = .16$]. Again the statistic of interest is the sample mean $\bar{X}$. The experiment utilized 500 replications and considered the same four sample sizes as in Example 5.22. The resulting histograms along with a normal probability plot from Minitab for the 500 $\bar{x}$ values based on $n = 30$ are shown in Figure 5.13.

Figure 5.12 Density curve for the simulation experiment of Example 5.23 [$E(X) = 21.7584$, $V(X) = 82.1449$]

Unlike the normal case, these histograms all differ in shape. In particular, they become progressively less skewed as the sample size $n$ increases. The averages of the 500 $\bar{x}$ values for the four different sample sizes are all quite close to the mean value of the population distribution. If each histogram had been based on an unending sequence of $\bar{x}$ values rather than just 500, all four would have been centered at exactly 21.7584. Thus different values of $n$ change the shape but not the center of the sampling distribution of $\bar{X}$. Comparison of the four histograms in Figure 5.13 also shows that as $n$ increases, the spread of the histograms decreases. Increasing $n$ results in a greater degree of concentration about the population mean value and makes the histogram look more like a normal curve. The histogram of Figure 5.13(d) and the normal probability plot in Figure 5.13(e) provide convincing evidence that a sample size of $n = 30$ is sufficient to overcome the skewness of the population distribution and give an approximately normal $\bar{X}$ sampling distribution.

Figure 5.13 Results of the simulation experiment of Example 5.23: (a) $\bar{x}$ histogram for $n = 5$; (b) $\bar{x}$ histogram for $n = 10$; (c) $\bar{x}$ histogram for $n = 20$; (d) $\bar{x}$ histogram for $n = 30$; (e) normal probability plot for $n = 30$ (from Minitab)
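The lognormal experiment of Example 5.23 can be sketched in the same way. The block below first recovers the population mean and variance quoted in Figure 5.12 from the standard lognormal moment formulas, then simulates sample means. It assumes Python's `random.lognormvariate` in place of Minitab, and the helper names and the skewness summary are illustrative additions, not part of the text.

```python
import math
import random
import statistics

MU_LN, VAR_LN = 3.0, 0.16   # E(ln X) and V(ln X) from Example 5.23

def lognormal_mean_var(mu_ln, var_ln):
    """Mean and variance of X when ln(X) is normal(mu_ln, var_ln)."""
    mean = math.exp(mu_ln + var_ln / 2)
    var = (math.exp(var_ln) - 1) * math.exp(2 * mu_ln + var_ln)
    return mean, var

def simulate_means(n, k=500, seed=0):
    """Return k sample means, each from n lognormal observations."""
    rng = random.Random(seed)   # hypothetical seed, for repeatability only
    sd_ln = math.sqrt(VAR_LN)   # lognormvariate expects the sd of ln(X)
    return [statistics.fmean([rng.lognormvariate(MU_LN, sd_ln) for _ in range(n)])
            for _ in range(k)]

def skewness(values):
    """Sample skewness (average cubed z-score), a rough measure of asymmetry."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

if __name__ == "__main__":
    mean, var = lognormal_mean_var(MU_LN, VAR_LN)
    print(f"E(X) = {mean:.4f}, V(X) = {var:.4f}")
    for n in (5, 10, 20, 30):
        means = simulate_means(n)
        print(f"n = {n:2d}: center = {statistics.fmean(means):.3f}, "
              f"sd = {statistics.stdev(means):.3f}, skewness = {skewness(means):.3f}")
```

The skewness column is only a numerical stand-in for inspecting the histograms; with 500 replications it is itself noisy, so it should be read as a tendency rather than a guarantee that each larger $n$ gives a smaller value.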

EXERCISES Section 5.3 (37–45)

37. A particular brand of dishwasher soap is sold in three sizes: 25 oz, 40 oz, and 65 oz. Twenty percent of all purchasers select a 25-oz box, 50% select a 40-oz box, and the remaining 30% choose a 65-oz box. Let $X_1$ and $X_2$ denote the package sizes selected by two independently selected purchasers.
a. Determine the sampling distribution of $\bar{X}$, calculate $E(\bar{X})$, and compare to $\mu$.
b. Determine the sampling distribution of the sample variance $S^2$, calculate $E(S^2)$, and compare to $\sigma^2$.

38. There are two traffic lights on a commuter’s route to and from work. Let $X_1$ be the number of lights at which the commuter must stop on his way to work, and $X_2$ be the number of lights at which he must stop when returning from work. Suppose these two variables are independent, each with pmf given in the accompanying table (so $X_1, X_2$ is a random sample of size $n = 2$).

$x_1$        0     1     2
$p(x_1)$     .2    .5    .3        $\mu = 1.1$, $\sigma^2 = .49$

a. Determine the pmf of $T_o = X_1 + X_2$.
b. Calculate $\mu_{T_o}$. How does it relate to $\mu$, the population mean?
c. Calculate $\sigma^2_{T_o}$. How does it relate to $\sigma^2$, the population variance?
d. Let $X_3$ and $X_4$ be the number of lights at which a stop is required when driving to and from work on a second day assumed independent of the first day. With $T_o$ = the sum of all four $X_i$’s, what now are the values of $E(T_o)$ and $V(T_o)$?
e. Referring back to (d), what are the values of $P(T_o = 8)$ and $P(T_o \ge 7)$? [Hint: Don’t even think of listing all possible outcomes!]

39. It is known that 80% of all brand A zip drives work in a satisfactory manner throughout the warranty period (are “successes”). Suppose that $n = 10$ drives are randomly selected. Let $X$ = the number of successes in the sample. The statistic $X/n$ is the sample proportion (fraction) of successes. Obtain the sampling distribution of this statistic. [Hint: One possible value of $X/n$ is .3, corresponding to $X = 3$. What is the probability of this value (what kind of random variable is $X$)?]

40. A box contains ten sealed envelopes numbered 1, . . . , 10. The first five contain no money, the next three each contains \$5, and there is a \$10 bill in each of the last two. A sample of size 3 is selected with replacement (so we have a random sample), and you get the largest amount in any of the envelopes selected. If $X_1$, $X_2$, and $X_3$ denote the amounts in the selected envelopes, the statistic of interest is $M$ = the maximum of $X_1$, $X_2$, and $X_3$.
a. Obtain the probability distribution of this statistic.
b. Describe how you would carry out a simulation experiment to compare the distributions of $M$ for various sample sizes. How would you guess the distribution would change as $n$ increases?

41. Let $X$ be the number of packages being mailed by a randomly selected customer at a certain shipping facility. Suppose the distribution of $X$ is as follows:

$x$        1     2     3     4
$p(x)$     .4    .3    .2    .1

a. Consider a random sample of size $n = 2$ (two customers), and let $\bar{X}$ be the sample mean number of packages shipped. Obtain the probability distribution of $\bar{X}$.
b. Refer to part (a) and calculate $P(\bar{X} \le 2.5)$.
c. Again consider a random sample of size $n = 2$, but now focus on the statistic $R$ = the sample range (difference between the largest and smallest values in the sample). Obtain the distribution of $R$. [Hint: Calculate the value of $R$ for each outcome and use the probabilities from part (a).]
d. If a random sample of size $n = 4$ is selected, what is $P(\bar{X} \le 1.5)$? [Hint: You should not have to list all possible outcomes, only those for which $\bar{x} \le 1.5$.]

42. A company maintains three offices in a certain region, each staffed by two employees. Information concerning yearly salaries (1000s of dollars) is as follows:

Office        1       1       2       2       3       3
Employee      1       2       3       4       5       6
Salary     29.7    33.6    30.2    33.6    25.8    29.7

a. Suppose two of these employees are randomly selected from among the six (without replacement). Determine the sampling distribution of the sample mean salary $\bar{X}$.
b. Suppose one of the three offices is randomly selected. Let $X_1$ and $X_2$ denote the salaries of the two employees. Determine the sampling distribution of $\bar{X}$.
c. How does $E(\bar{X})$ from parts (a) and (b) compare to the population mean salary $\mu$?

43. Suppose the amount of liquid dispensed by a certain machine is uniformly distributed with lower limit $A = 8$ oz and upper limit $B = 10$ oz. Describe how you would carry out simulation experiments to compare the sampling distribution of the (sample) fourth spread for sample sizes $n = 5$, 10, 20, and 30.

44. Carry out a simulation experiment using a statistical computer package or other software to study the sampling distribution of $\bar{X}$ when the population distribution is Weibull with $\alpha = 2$ and $\beta = 5$, as in Example 5.19. Consider the four sample sizes $n = 5$, 10, 20, and 30, and in each case use 1000 replications. For which of these sample sizes does the $\bar{X}$ sampling distribution appear to be approximately normal?

45. Carry out a simulation experiment using a statistical computer package or other software to study the sampling distribution of $\bar{X}$ when the population distribution is lognormal with $E(\ln(X)) = 3$ and $V(\ln(X)) = 1$. Consider the four sample sizes $n = 10$, 20, 30, and 50, and in each case use 1000 replications. For which of these sample sizes does the $\bar{X}$ sampling distribution appear to be approximately normal?


5.4 The Distribution of the Sample Mean

The importance of the sample mean springs from its use in drawing conclusions about the population mean $\mu$. Some of the most frequently used inferential procedures are based on properties of the sampling distribution of $\bar{X}$. A preview of these properties appeared in the calculations and simulation experiments of the previous section, where we noted relationships between $E(\bar{X})$ and $\mu$ and also among $V(\bar{X})$, $\sigma^2$, and $n$.

PROPOSITION

Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean value $\mu$ and standard deviation $\sigma$. Then

1. $E(\bar{X}) = \mu_{\bar{X}} = \mu$

2. $V(\bar{X}) = \sigma^2_{\bar{X}} = \sigma^2/n$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$

In addition, with $T_o = X_1 + \cdots + X_n$ (the sample total), $E(T_o) = n\mu$, $V(T_o) = n\sigma^2$, and $\sigma_{T_o} = \sqrt{n}\,\sigma$.
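The formulas in the proposition are simple enough to wrap in a small helper. The sketch below applies them to the MP3 player population of Example 5.20 ($\mu = 106$, $\sigma^2 = 244$), where the values for $n = 2$ and $n = 4$ were already obtained by direct calculation; the function name `mean_and_total_moments` is an illustrative choice.

```python
import math

def mean_and_total_moments(mu, sigma2, n):
    """Moments of the sample mean and sample total for a random sample of size n."""
    return {
        "E_mean": mu,                        # E(Xbar) = mu
        "V_mean": sigma2 / n,                # V(Xbar) = sigma^2 / n
        "sd_mean": math.sqrt(sigma2 / n),    # sigma / sqrt(n)
        "E_total": n * mu,                   # E(To) = n * mu
        "V_total": n * sigma2,               # V(To) = n * sigma^2
        "sd_total": math.sqrt(n * sigma2),   # sqrt(n) * sigma
    }

if __name__ == "__main__":
    for n in (2, 4):
        moments = mean_and_total_moments(mu=106, sigma2=244, n=n)
        print(n, {name: round(value, 3) for name, value in moments.items()})
```

For $n = 2$ and $n = 4$ the helper gives $V(\bar{X}) = 122$ and $61$, agreeing with the values computed from the exact sampling distributions in Example 5.20.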

Proofs of these results are deferred to the next section. According to Result 1, the sampling (i.e., probability) distribution of $\bar{X}$ is centered precisely at the mean of the population from which the sample has been selected. Result 2 shows that the $\bar{X}$ distribution becomes more concentrated about $\mu$ as the sample size $n$ increases. In marked contrast, the distribution of $T_o$ becomes more spread out as $n$ increases. Averaging moves probability in toward the middle, whereas totaling spreads probability out over a wider and wider range of values. The standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ is often called the standard error of the mean; it describes the magnitude of a typical or representative deviation of the sample mean from the population mean.

Example 5.24

In a notched tensile fatigue test on a titanium specimen, the expected number of cycles to first acoustic emission (used to indicate crack initiation) is $\mu = 28{,}000$, and the standard deviation of the number of cycles is $\sigma = 5000$. Let $X_1, X_2, \ldots, X_{25}$ be a random sample of size 25, where each $X_i$ is the number of cycles on a different randomly selected specimen. Then the expected value of the sample mean number of cycles until first emission is $E(\bar{X}) = \mu = 28{,}000$, and the expected total number of cycles for the 25 specimens is $E(T_o) = n\mu = 25(28{,}000) = 700{,}000$. The standard deviations of $\bar{X}$ (standard error of the mean) and of $T_o$ are

$$\sigma_{\bar{X}} = \sigma/\sqrt{n} = \frac{5000}{\sqrt{25}} = 1000 \qquad \sigma_{T_o} = \sqrt{n}\,\sigma = \sqrt{25}\,(5000) = 25{,}000$$

If the sample size increases to $n = 100$, $E(\bar{X})$ is unchanged, but $\sigma_{\bar{X}} = 500$, half of its previous value (the sample size must be quadrupled to halve the standard deviation of $\bar{X}$). ■

The Case of a Normal Population Distribution

The simulation experiment of Example 5.22 indicated that when the population distribution is normal, each histogram of $\bar{x}$ values is well approximated by a normal curve.


PROPOSITION

Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with mean $\mu$ and standard deviation $\sigma$. Then for any $n$, $\bar{X}$ is normally distributed (with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$), as is $T_o$ (with mean $n\mu$ and standard deviation $\sqrt{n}\,\sigma$).*

We know everything there is to know about the $\bar{X}$ and $T_o$ distributions when the population distribution is normal. In particular, probabilities such as $P(a \le \bar{X} \le b)$ and $P(c \le T_o \le d)$ can be obtained simply by standardizing. Figure 5.14 illustrates the proposition.

Figure 5.14 A normal population distribution and $\bar{X}$ sampling distributions (curves shown: population distribution; $\bar{X}$ distribution when $n = 4$; $\bar{X}$ distribution when $n = 10$)

Example 5.25

The time that it takes a randomly selected rat of a certain subspecies to find its way through a maze is a normally distributed rv with $\mu = 1.5$ min and $\sigma = .35$ min. Suppose five rats are selected. Let $X_1, \ldots, X_5$ denote their times in the maze. Assuming the $X_i$’s to be a random sample from this normal distribution, what is the probability that the total time $T_o = X_1 + \cdots + X_5$ for the five is between 6 and 8 min? By the proposition, $T_o$ has a normal distribution with $\mu_{T_o} = n\mu = 5(1.5) = 7.5$ and variance $\sigma_{T_o}^2 = n\sigma^2 = 5(.1225) = .6125$, so $\sigma_{T_o} = .783$. To standardize $T_o$, subtract $\mu_{T_o}$ and divide by $\sigma_{T_o}$:

$P(6 \le T_o \le 8) = P\left(\dfrac{6 - 7.5}{.783} \le Z \le \dfrac{8 - 7.5}{.783}\right) = P(-1.92 \le Z \le .64) = \Phi(.64) - \Phi(-1.92) = .7115$

Determination of the probability that the sample average time $\bar{X}$ (a normally distributed variable) is at most 2.0 min requires $\mu_{\bar{X}} = \mu = 1.5$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} = .35/\sqrt{5} = .1565$. Then

$P(\bar{X} \le 2.0) = P\left(Z \le \dfrac{2.0 - 1.5}{.1565}\right) = P(Z \le 3.19) = \Phi(3.19) = .9993$ ■
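The two probabilities in the maze example can be reproduced without a normal table. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative. Because it does not round the z values to two decimals, its output may differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1.5, 0.35, 5   # maze-time mean and sd (min), number of rats

# Sample total: normal with mean n*mu and standard deviation sqrt(n)*sigma
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)
p_total = total.cdf(8) - total.cdf(6)

# Sample mean: normal with mean mu and standard deviation sigma/sqrt(n)
mean = NormalDist(mu=mu, sigma=sigma / sqrt(n))
p_mean = mean.cdf(2.0)

print(f"P(6 <= T_o <= 8) = {p_total:.4f}")
print(f"P(Xbar <= 2.0)   = {p_mean:.4f}")
```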

* A proof of the result for $T_o$ when $n = 2$ is possible using the method in Example 5.21, but the details are messy. The general result is usually proved using a theoretical tool called a moment generating function. One of the chapter references can be consulted for more information.


The Central Limit Theorem

When the $X_i$’s are normally distributed, so is $\bar{X}$ for every sample size $n$. The derivations in Example 5.20 and simulation experiment of Example 5.23 suggest that even when the population distribution is highly nonnormal, averaging produces a distribution more bell-shaped than the one being sampled. A reasonable conjecture is that if $n$ is large, a suitable normal curve will approximate the actual distribution of $\bar{X}$. The formal statement of this result is the most important theorem of probability.

THEOREM The Central Limit Theorem (CLT)

Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then if $n$ is sufficiently large, $\bar{X}$ has approximately a normal distribution with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}}^2 = \sigma^2/n$, and $T_o$ also has approximately a normal distribution with $\mu_{T_o} = n\mu$, $\sigma_{T_o}^2 = n\sigma^2$. The larger the value of $n$, the better the approximation.

Figure 5.15 illustrates the Central Limit Theorem. According to the CLT, when $n$ is large and we wish to calculate a probability such as $P(a \le \bar{X} \le b)$, we need only “pretend” that $\bar{X}$ is normal, standardize it, and use the normal table. The resulting answer will be approximately correct. The exact answer could be obtained only by first finding the distribution of $\bar{X}$, so the CLT provides a truly impressive shortcut. The proof of the theorem involves much advanced mathematics.

Figure 5.15 The Central Limit Theorem illustrated (curves shown: population distribution; $\bar{X}$ distribution for small to moderate $n$; $\bar{X}$ distribution for large $n$ (approximately normal))

Example 5.26

The amount of a particular impurity in a batch of a certain chemical product is a random variable with mean value 4.0 g and standard deviation 1.5 g. If 50 batches are independently prepared, what is the (approximate) probability that the sample average amount of impurity $\bar{X}$ is between 3.5 and 3.8 g? According to the rule of thumb to be stated shortly, $n = 50$ is large enough for the CLT to be applicable. $\bar{X}$ then has approximately a normal distribution with mean value $\mu_{\bar{X}} = 4.0$ and $\sigma_{\bar{X}} = 1.5/\sqrt{50} = .2121$, so

$P(3.5 \le \bar{X} \le 3.8) \approx P\left(\dfrac{3.5 - 4.0}{.2121} \le Z \le \dfrac{3.8 - 4.0}{.2121}\right) = \Phi(-.94) - \Phi(-2.36) = .1645$ ■
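The "pretend $\bar{X}$ is normal, then standardize" recipe can be wrapped in a small helper. This is a sketch only: the function name `clt_prob_between` is hypothetical, and the result is an approximation whose quality depends on $n$ and on the shape of the population, exactly as the theorem warns. It is applied here to the impurity example.

```python
from math import sqrt
from statistics import NormalDist


def clt_prob_between(mu, sigma, n, a, b):
    """Approximate P(a <= Xbar <= b) by treating Xbar as normal with
    mean mu and standard deviation sigma/sqrt(n), as the CLT suggests."""
    xbar = NormalDist(mu=mu, sigma=sigma / sqrt(n))
    return xbar.cdf(b) - xbar.cdf(a)


# Impurity example: mu = 4.0 g, sigma = 1.5 g, n = 50 batches
p = clt_prob_between(mu=4.0, sigma=1.5, n=50, a=3.5, b=3.8)
print(f"P(3.5 <= Xbar <= 3.8) is approximately {p:.4f}")
```

The printed value can differ slightly from .1645 because the z values are not rounded to two decimals before the normal cdf is evaluated.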

Example 5.27

A certain consumer organization customarily reports the number of major defects for each new automobile that it tests. Suppose the number of such defects for a certain model is a random variable with mean value 3.2 and standard deviation 2.4. Among 100 randomly selected cars of this model, how likely is it that the sample average number of major defects exceeds 4? Let $X_i$ denote the number of major defects for the $i$th car in the random sample. Notice that $X_i$ is a discrete rv, but the CLT is applicable whether the variable of interest is discrete or continuous. Also, although the fact that the standard deviation of this nonnegative variable is quite large relative to the mean value suggests that its distribution is positively skewed, the large sample size implies that $\bar{X}$ does have approximately a normal distribution. Using $\mu_{\bar{X}} = 3.2$ and $\sigma_{\bar{X}} = .24$,

$P(\bar{X} > 4) \approx P\left(Z > \dfrac{4 - 3.2}{.24}\right) = 1 - \Phi(3.33) = .0004$

The CLT provides insight into why many random variables have probability distributions that are approximately normal. For example, the measurement error in a scientific experiment can be thought of as the sum of a number of underlying perturbations and errors of small magnitude.

A practical difficulty in applying the CLT is in knowing when $n$ is sufficiently large. The problem is that the accuracy of the approximation for a particular $n$ depends on the shape of the original underlying distribution being sampled. If the underlying distribution is close to a normal density curve, then the approximation will be good even for a small $n$, whereas if it is far from being normal, then a large $n$ will be required.

Rule of Thumb

If $n > 30$, the Central Limit Theorem can be used.

There are population distributions for which even an $n$ of 40 or 50 does not suffice, but such distributions are rarely encountered in practice. On the other hand, the rule of thumb is often conservative; for many population distributions, an $n$ much less than 30 would suffice. For example, in the case of a uniform population distribution, the CLT gives a good approximation for $n \ge 12$.

Example 5.28

Consider the distribution shown in Figure 5.16 for the amount purchased (rounded to the nearest dollar) by a randomly selected customer at a particular gas station (a similar distribution for purchases in Britain (in £) appeared in the article “Data Mining for Fun and Profit,” Statistical Science, 2000: 111–131; there were big spikes at the values 10, 15, 20, 25, and 30). The distribution is obviously quite non-normal.

Figure 5.16 Probability distribution of $X$ = amount of gasoline purchased ($) (horizontal axis: purchase amount, 5 to 60; vertical axis: probability, 0.00 to 0.16)

We asked Minitab to select 1000 different samples, each consisting of $n = 15$ observations, and calculate the value of the sample mean $\bar{X}$ for each one. Figure 5.17 is a histogram of the resulting 1000 values; this is the approximate sampling distribution of $\bar{X}$ under the specified circumstances. This distribution is clearly approximately normal even though the sample size is actually much smaller than 30, our rule-of-thumb cutoff for invoking the Central Limit Theorem. As further evidence for normality, Figure 5.18 shows a normal probability plot of the 1000 $\bar{x}$ values; the linear pattern is very prominent. It is typically not non-normality in the central part of the population distribution that causes the CLT to fail, but instead very substantial skewness.

Figure 5.17 Approximate sampling distribution of the sample mean amount purchased when $n = 15$ and the population distribution is as shown in Figure 5.16 (horizontal axis: mean, 18 to 36; vertical axis: density, 0.00 to 0.14)
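An experiment of the same kind as the Minitab run can be sketched in a few lines. The actual probabilities behind Figure 5.16 are not given in the text, so the `pmf` dictionary below is a hypothetical stand-in with spikes at round dollar amounts; the seed and variable names are likewise assumptions made for illustration. The sketch draws 1000 samples of size 15 and compares the spread of the simulated sample means with $\sigma/\sqrt{n}$.

```python
import math
import random
import statistics

# Hypothetical purchase-amount distribution (dollars -> probability); the
# spikes at multiples of 5 imitate Figure 5.16 but are NOT read from it.
pmf = {5: 0.04, 10: 0.14, 12: 0.04, 15: 0.12, 18: 0.04, 20: 0.16,
       23: 0.04, 25: 0.10, 30: 0.12, 35: 0.06, 40: 0.06, 45: 0.03,
       50: 0.03, 60: 0.02}

values = list(pmf)
weights = list(pmf.values())
pop_mean = sum(x * p for x, p in pmf.items())
pop_sd = math.sqrt(sum((x - pop_mean) ** 2 * p for x, p in pmf.items()))

# 1000 samples of size n = 15, one sample mean per sample
rng = random.Random(2024)
n, reps = 15, 1000
sample_means = [statistics.fmean(rng.choices(values, weights=weights, k=n))
                for _ in range(reps)]

print("population mean and sd:  ", pop_mean, pop_sd)
print("mean of the sample means:", statistics.fmean(sample_means))
print("sd of the sample means:  ", statistics.stdev(sample_means))
print("sigma / sqrt(n):         ", pop_sd / math.sqrt(n))
```

A histogram or normal probability plot of `sample_means` would play the role of Figures 5.17 and 5.18; plotting is omitted here to keep the sketch dependency-free.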


Other Applications of the Central Limit Theorem

The CLT can be used to justify the normal approximation to the binomial distribution discussed in Chapter 4. Recall that a binomial variable $X$ is the number of successes in a binomial experiment consisting of $n$ independent success/failure trials with $p = P(S)$ for any particular trial. Define a new rv $X_1$ by

$X_1 = \begin{cases} 1 & \text{if the 1st trial results in a success} \\ 0 & \text{if the 1st trial results in a failure} \end{cases}$

and define $X_2, X_3, \ldots, X_n$ analogously for the other $n - 1$ trials. Each $X_i$ indicates whether or not there is a success on the corresponding trial.

Because the trials are independent and $P(S)$ is constant from trial to trial, the $X_i$’s are iid (a random sample from a Bernoulli distribution). The CLT then implies that if $n$ is sufficiently large, both the sum and the average of the $X_i$’s have approximately normal distributions. When the $X_i$’s are summed, a 1 is added for every $S$ that occurs and a 0 for every $F$, so $X_1 + \cdots + X_n = X$. The sample mean of the $X_i$’s is $X/n$, the sample proportion of successes. That is, both $X$ and $X/n$ are approximately normal when $n$ is large. The necessary sample size for this approximation depends on the value of $p$: When $p$ is close to .5, the distribution of each $X_i$ is reasonably symmetric (see Figure 5.19), whereas the distribution is quite skewed when $p$ is near 0 or 1. Using the approximation only if both $np \ge 10$ and $n(1 - p) \ge 10$ ensures that $n$ is large enough to overcome any skewness in the underlying Bernoulli distribution.
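How good the approximation is can be seen by comparing it with the exact binomial probability. The sketch below is an illustration under assumptions: the values $n = 50$, $p = .4$, and the cutoff 20 are chosen here and do not come from the text, the function names are hypothetical, and the continuity correction (adding .5 before standardizing) is the usual refinement for approximating a discrete distribution by a continuous one rather than something stated in this section.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), using mean np, standard
    deviation sqrt(np(1-p)), and a continuity correction of +0.5."""
    return NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(k + 0.5)


n, p, k = 50, 0.4, 20            # illustrative values, not from the text
rule_ok = n * p >= 10 and n * (1 - p) >= 10   # the np and n(1-p) check
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print("rule satisfied:", rule_ok)
print("exact:", round(exact, 4), "normal approximation:", round(approx, 4))
```

Repeating the comparison with $p$ near 0 or 1 (so that the rule fails) is a quick way to see the skewness problem described above.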

Figure 5.18 Normal probability plot from Minitab of the 1000 $\bar{x}$ values based on samples of size $n = 15$ (horizontal axis: mean, 15 to 40; vertical axis: percent; Minitab summary: Mean 26.49, StDev 3.112, N 1000, RJ 0.999, P-Value > 0.100)

Figure 5.19 Two Bernoulli distributions: (a) $p = .4$ (reasonably symmetric); (b) $p = .1$ (very skewed)

Recall from Section 4.5 that $X$ has a lognormal distribution if $\ln(X)$ has a normal distribution.

PROPOSITION

Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution for which only positive values are possible [$P(X_i > 0) = 1$]. Then if $n$ is sufficiently large, the product $Y = X_1 X_2 \cdot \ldots \cdot X_n$ has approximately a lognormal distribution.

To verify this, note that

$\ln(Y) = \ln(X_1) + \ln(X_2) + \cdots + \ln(X_n)$

Since $\ln(Y)$ is a sum of independent and identically distributed rv’s [the $\ln(X_i)$’s], it is approximately normal when $n$ is large, so $Y$ itself has approximately a lognormal distribution. As an example of the applicability of this result, Bury (Statistical Models in Applied Science, Wiley, p. 590) argues that the damage process in plastic flow and crack propagation is a multiplicative process, so that variables such as percentage elongation and rupture strength have approximately lognormal distributions.
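The argument can be written out one step further. In the sketch below, $\mu_{\ln}$ and $\sigma_{\ln}$ denote the mean and standard deviation of $\ln(X_i)$; these symbols are introduced here for illustration (they do not appear in the text), and both are assumed to be finite so that the CLT applies to the $\ln(X_i)$’s.

```latex
\ln(Y) = \sum_{i=1}^{n} \ln(X_i)
\;\overset{\text{approx.}}{\sim}\;
N\!\left(n\mu_{\ln},\; n\sigma_{\ln}^{2}\right)
\quad\Longrightarrow\quad
P(Y \le y) = P\bigl(\ln(Y) \le \ln(y)\bigr)
\approx \Phi\!\left(\frac{\ln(y) - n\mu_{\ln}}{\sqrt{n}\,\sigma_{\ln}}\right),
\qquad y > 0
```

So probabilities for the product $Y$ can be approximated by standardizing $\ln(y)$, in the same way that probabilities for $T_o$ are approximated by standardizing the total.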

EXERCISES Section 5.4 (46–57)

46. The inside diameter of a randomly selected piston ring is a random variable with mean value 12 cm and standard deviation .04 cm.
a. If $\bar{X}$ is the sample mean diameter for a random sample of $n = 16$ rings, where is the sampling distribution of $\bar{X}$ centered, and what is the standard deviation of the $\bar{X}$ distribution?
b. Answer the questions posed in part (a) for a sample size of $n = 64$ rings.
c. For which of the two random samples, the one of part (a) or the one of part (b), is $\bar{X}$ more likely to be within .01 cm of 12 cm? Explain your reasoning.

47. Refer to Exercise 46. Suppose the distribution of diameter is normal.
a. Calculate $P(11.99 \le \bar{X} \le 12.01)$ when $n = 16$.
b. How likely is it that the sample mean diameter exceeds 12.01 when $n = 25$?

48. The National Health Statistics Reports dated Oct. 22, 2008, stated that for a sample size of 277 18-year-old American males, the sample mean waist circumference was 86.3 cm. A somewhat complicated method was used to estimate various population percentiles, resulting in the following values:

Percentile    5th    10th   25th   50th   75th   90th    95th
Value (cm)    69.6   70.9   75.2   81.3   95.4   107.1   116.4

a. Is it plausible that the waist size distribution is at least approximately normal? Explain your reasoning. If your answer is no, conjecture the shape of the population distribution.
b. Suppose that the population mean waist size is 85 cm and that the population standard deviation is 15 cm. How likely is it that a random sample of 277 individuals will result in a sample mean waist size of at least 86.3 cm?
c. Referring back to (b), suppose now that the population mean waist size is 82 cm. Now what is the (approximate) probability that the sample mean will be at least 86.3 cm? In light of this calculation, do you think that 82 cm is a reasonable value for $\mu$?

49. There are 40 students in an elementary statistics class. On the basis of years of experience, the instructor knows that the time needed to grade a randomly chosen first examination paper is a random variable with an expected value of 6 min and a standard deviation of 6 min.
a. If grading times are independent and the instructor begins grading at 6:50 P.M. and grades continuously, what is the (approximate) probability that he is through grading before the 11:00 P.M. TV news begins?
b. If the sports report begins at 11:10, what is the probability that he misses part of the report if he waits until grading is done before turning on the TV?

50. The breaking strength of a rivet has a mean value of 10,000 psi and a standard deviation of 500 psi.
a. What is the probability that the sample mean breaking strength for a random sample of 40 rivets is between 9900 and 10,200?
b. If the sample size had been 15 rather than 40, could the probability requested in part (a) be calculated from the given information?

51. The time taken by a randomly selected applicant for a mortgage to fill out a certain form has a normal distribution with mean value 10 min and standard deviation 2 min. If five individuals fill out a form on one day and six on another, what is the probability that the sample average amount of time taken on each day is at most 11 min?

52. The lifetime of a certain type of battery is normally distributed with mean value 10 hours and standard deviation 1 hour. There are four batteries in a package. What lifetime value is such that the total lifetime of all batteries in a package exceeds that value for only 5% of all packages?

53. Rockwell hardness of pins of a certain type is known to have a mean value of 50 and a standard deviation of 1.2.
a. If the distribution is normal, what is the probability that the sample mean hardness for a random sample of 9 pins is at least 51?
b. Without assuming population normality, what is the (approximate) probability that the sample mean hardness for a random sample of 40 pins is at least 51?

54. Suppose the sediment density (g/cm) of a randomly selected specimen from a certain region is normally distributed with mean 2.65 and standard deviation .85 (suggested in “Modeling Sediment and Water Column Interactions for Hydrophobic Pollutants,” Water Research, 1984: 1169–1174).
a. If a random sample of 25 specimens is selected, what is the probability that the sample average sediment density is at most 3.00? Between 2.65 and 3.00?
b. How large a sample size would be required to ensure that the first probability in part (a) is at least .99?

55. The number of parking tickets issued in a certain city on any given weekday has a Poisson distribution with parameter $\mu = 50$. What is the approximate probability that
a. Between 35 and 70 tickets are given out on a particular day? [Hint: When $\mu$ is large, a Poisson rv has approximately a normal distribution.]
b. The total number of tickets given out during a 5-day week is between 225 and 275?

56. A binary communication channel transmits a sequence of “bits” (0s and 1s). Suppose that for any particular bit transmitted, there is a 10% chance of a transmission error (a 0 becoming a 1 or a 1 becoming a 0). Assume that bit errors occur independently of one another.
a. Consider transmitting 1000 bits. What is the approximate probability that at most 125 transmission errors occur?
b. Suppose the same 1000-bit message is sent two different times independently of one another. What is the approximate probability that the number of errors in the first transmission is within 50 of the number of errors in the second?

57. Suppose the distribution of the time $X$ (in hours) spent by students at a certain university on a particular project is gamma with parameters $\alpha = 50$ and $\beta = 2$. Because $\alpha$ is large, it can be shown that $X$ has approximately a normal distribution. Use this fact to compute the approximate probability that a randomly selected student spends at most 125 hours on the project.

5.5 The Distribution of a Linear Combination

The sample mean $\bar{X}$ and sample total $T_o$ are special cases of a type of random variable that arises very frequently in statistical applications.

DEFINITION

Given a collection of $n$ random variables $X_1, \ldots, X_n$ and $n$ numerical constants $a_1, \ldots, a_n$, the rv

$Y = a_1X_1 + \cdots + a_nX_n = \sum_{i=1}^{n} a_iX_i$    (5.7)

is called a linear combination of the $X_i$’s.

For example, $4X_1 - 5X_2 + 8X_3$ is a linear combination of $X_1$, $X_2$, and $X_3$ with $a_1 = 4$, $a_2 = -5$, and $a_3 = 8$.

Taking $a_1 = a_2 = \cdots = a_n = 1$ gives $Y = X_1 + \cdots + X_n = T_o$, and $a_1 = a_2 = \cdots = a_n = \frac{1}{n}$ yields

$Y = \frac{1}{n}X_1 + \cdots + \frac{1}{n}X_n = \frac{1}{n}(X_1 + \cdots + X_n) = \frac{1}{n}T_o = \bar{X}$

Notice that we are not requiring the $X_i$’s to be independent or identically distributed. All the $X_i$’s could have different distributions and therefore different mean values and variances. We first consider the expected value and variance of a linear combination.

PROPOSITION

Let $X_1, X_2, \ldots, X_n$ have mean values $\mu_1, \ldots, \mu_n$, respectively, and variances $\sigma_1^2, \ldots, \sigma_n^2$, respectively.

1. Whether or not the $X_i$’s are independent,

$E(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = a_1E(X_1) + a_2E(X_2) + \cdots + a_nE(X_n) = a_1\mu_1 + \cdots + a_n\mu_n$    (5.8)

2. If $X_1, \ldots, X_n$ are independent,

$V(a_1X_1 + a_2X_2 + \cdots + a_nX_n) = a_1^2V(X_1) + a_2^2V(X_2) + \cdots + a_n^2V(X_n) = a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2$    (5.9)

and

$\sigma_{a_1X_1 + \cdots + a_nX_n} = \sqrt{a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2}$    (5.10)

3. For any $X_1, \ldots, X_n$,

$V(a_1X_1 + \cdots + a_nX_n) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_ia_j\,\mathrm{Cov}(X_i, X_j)$    (5.11)

Proofs are sketched out at the end of the section. A paraphrase of (5.8) is that the expected value of a linear combination is the same as the linear combination of the expected values; for example, $E(2X_1 + 5X_2) = 2\mu_1 + 5\mu_2$. The result (5.9) in Statement 2 is a special case of (5.11) in Statement 3; when the $X_i$’s are independent, $\mathrm{Cov}(X_i, X_j) = 0$ for $i \ne j$ and $= V(X_i)$ for $i = j$ (this simplification actually occurs when the $X_i$’s are uncorrelated, a weaker condition than independence). Specializing to the case of a random sample ($X_i$’s iid) with $a_i = 1/n$ for every $i$ gives $E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$, as discussed in Section 5.4. A similar comment applies to the rules for $T_o$.
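Rules (5.8) and (5.11) translate directly into code. The sketch below is illustrative: the function names `lincomb_mean` and `lincomb_var` are hypothetical, and the numbers are those of the gasoline-revenue example of this section (Examples 5.29 and 5.31). Independence is encoded by setting every off-diagonal covariance to 0, so that (5.11) reduces to (5.9); the final probability additionally assumes normally distributed $X_i$’s.

```python
from math import sqrt
from statistics import NormalDist


def lincomb_mean(a, mu):
    """E(a1*X1 + ... + an*Xn) by rule (5.8); no independence needed."""
    return sum(ai * mi for ai, mi in zip(a, mu))


def lincomb_var(a, cov):
    """V(a1*X1 + ... + an*Xn) by rule (5.11); cov[i][j] = Cov(Xi, Xj)."""
    n = len(a)
    return sum(a[i] * a[j] * cov[i][j] for i in range(n) for j in range(n))


a = [3.0, 3.2, 3.4]            # prices per gallon
mu = [1000, 500, 300]          # mean gallons sold per day
sd = [100, 80, 50]             # standard deviations of gallons sold
# Independence: off-diagonal covariances are 0, so (5.11) reduces to (5.9).
cov = [[sd[i] ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]

mean_y = lincomb_mean(a, mu)
var_y = lincomb_var(a, cov)
sd_y = sqrt(var_y)
# Extra assumption: normal X_i's, so the revenue Y is itself normal.
p_exceeds = 1 - NormalDist(mean_y, sd_y).cdf(4500)
print(mean_y, var_y, round(sd_y, 2), round(p_exceeds, 4))
```

Replacing the zeros in `cov` with nonzero covariances gives the variance for dependent $X_i$’s with no other change, which is the practical appeal of working from (5.11).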

Example 5.29

A gas station sells three grades of gasoline: regular, extra, and super. These are priced at $3.00, $3.20, and $3.40 per gallon, respectively. Let $X_1$, $X_2$, and $X_3$ denote the amounts of these grades purchased (gallons) on a particular day. Suppose the $X_i$’s are independent with $\mu_1 = 1000$, $\mu_2 = 500$, $\mu_3 = 300$, $\sigma_1 = 100$, $\sigma_2 = 80$, and $\sigma_3 = 50$. The revenue from sales is $Y = 3.0X_1 + 3.2X_2 + 3.4X_3$, and

$E(Y) = 3.0\mu_1 + 3.2\mu_2 + 3.4\mu_3 = \$5620$

$V(Y) = (3.0)^2\sigma_1^2 + (3.2)^2\sigma_2^2 + (3.4)^2\sigma_3^2 = 184{,}436$

$\sigma_Y = \sqrt{184{,}436} = \$429.46$

The Difference Between Two Random Variables

An important special case of a linear combination results from taking $n = 2$, $a_1 = 1$, and $a_2 = -1$:

$Y = a_1X_1 + a_2X_2 = X_1 - X_2$

We then have the following corollary to the proposition.

COROLLARY

$E(X_1 - X_2) = E(X_1) - E(X_2)$ for any two rv’s $X_1$ and $X_2$. $V(X_1 - X_2) = V(X_1) + V(X_2)$ if $X_1$ and $X_2$ are independent rv’s.

The expected value of a difference is the difference of the two expected values, but the variance of a difference between two independent variables is the sum, not the difference, of the two variances. There is just as much variability in $X_1 - X_2$ as in $X_1 + X_2$ [writing $X_1 - X_2 = X_1 + (-1)X_2$, $(-1)X_2$ has the same amount of variability as $X_2$ itself].

Example 5.30

A certain automobile manufacturer equips a particular model with either a six-cylinder engine or a four-cylinder engine. Let $X_1$ and $X_2$ be fuel efficiencies for independently and randomly selected six-cylinder and four-cylinder cars, respectively. With $\mu_1 = 22$, $\mu_2 = 26$, $\sigma_1 = 1.2$, and $\sigma_2 = 1.5$,

$E(X_1 - X_2) = \mu_1 - \mu_2 = 22 - 26 = -4$

$V(X_1 - X_2) = \sigma_1^2 + \sigma_2^2 = (1.2)^2 + (1.5)^2 = 3.69$

$\sigma_{X_1 - X_2} = \sqrt{3.69} = 1.92$

If we relabel so that $X_1$ refers to the four-cylinder car, then $E(X_1 - X_2) = 4$, but the variance of the difference is still 3.69. ■

The Case of Normal Random Variables

When the $X_i$’s form a random sample from a normal distribution, $\bar{X}$ and $T_o$ are both normally distributed. Here is a more general result concerning linear combinations.

PROPOSITION

If $X_1, X_2, \ldots, X_n$ are independent, normally distributed rv’s (with possibly different means and/or variances), then any linear combination of the $X_i$’s also has a normal distribution. In particular, the difference $X_1 - X_2$ between two independent, normally distributed variables is itself normally distributed.

Example 5.31 (Example 5.29 continued)

The total revenue from the sale of the three grades of gasoline on a particular day was $Y = 3.0X_1 + 3.2X_2 + 3.4X_3$, and we calculated $\mu_Y = 5620$ and (assuming independence) $\sigma_Y = 429.46$. If the $X_i$’s are normally distributed, the probability that revenue exceeds 4500 is

$P(Y > 4500) = P\left(Z > \dfrac{4500 - 5620}{429.46}\right) = P(Z > -2.61) = 1 - \Phi(-2.61) = .9955$

The CLT can also be generalized so it applies to certain linear combinations. Roughly speaking, if $n$ is large and no individual term is likely to contribute too much to the overall value, then $Y$ has approximately a normal distribution.

Proofs for the Case n = 2

For the result concerning expected values, suppose that $X_1$ and $X_2$ are continuous with joint pdf $f(x_1, x_2)$. Then

$$
\begin{aligned}
E(a_1X_1 + a_2X_2) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (a_1x_1 + a_2x_2)\,f(x_1, x_2)\,dx_1\,dx_2 \\
&= a_1\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 f(x_1, x_2)\,dx_2\,dx_1 + a_2\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_2 f(x_1, x_2)\,dx_1\,dx_2 \\
&= a_1\int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\,dx_1 + a_2\int_{-\infty}^{\infty} x_2 f_{X_2}(x_2)\,dx_2 \\
&= a_1E(X_1) + a_2E(X_2)
\end{aligned}
$$

Summation replaces integration in the discrete case. The argument for the variance result does not require specifying whether either variable is discrete or continuous. Recalling that $V(Y) = E[(Y - \mu_Y)^2]$,

$$
\begin{aligned}
V(a_1X_1 + a_2X_2) &= E\{[a_1X_1 + a_2X_2 - (a_1\mu_1 + a_2\mu_2)]^2\} \\
&= E\{a_1^2(X_1 - \mu_1)^2 + a_2^2(X_2 - \mu_2)^2 + 2a_1a_2(X_1 - \mu_1)(X_2 - \mu_2)\}
\end{aligned}
$$

The expression inside the braces is a linear combination of the variables $Y_1 = (X_1 - \mu_1)^2$, $Y_2 = (X_2 - \mu_2)^2$, and $Y_3 = (X_1 - \mu_1)(X_2 - \mu_2)$, so carrying the $E$ operation through to the three terms gives $a_1^2V(X_1) + a_2^2V(X_2) + 2a_1a_2\,\mathrm{Cov}(X_1, X_2)$ as required. ■
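The two identities established in the proofs can also be checked numerically in the discrete case, where summation replaces integration. The sketch below uses a small joint pmf and coefficients that are entirely hypothetical (chosen so that $X_1$ and $X_2$ are dependent); it computes the mean and variance of $Y = a_1X_1 + a_2X_2$ directly from the joint pmf and compares them with the formulas.

```python
# Hypothetical joint pmf of (X1, X2); the values are illustrative only.
joint = {(0, 0): 0.2, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.3, (2, 1): 0.3}
a1, a2 = 2.0, -3.0


def expect(g):
    """E[g(X1, X2)] computed directly from the joint pmf."""
    return sum(g(x1, x2) * p for (x1, x2), p in joint.items())


m1, m2 = expect(lambda x1, x2: x1), expect(lambda x1, x2: x2)
v1 = expect(lambda x1, x2: (x1 - m1) ** 2)
v2 = expect(lambda x1, x2: (x2 - m2) ** 2)
cov = expect(lambda x1, x2: (x1 - m1) * (x2 - m2))

# Left-hand sides: moments of Y = a1*X1 + a2*X2 computed from scratch.
mean_y = expect(lambda x1, x2: a1 * x1 + a2 * x2)
var_y = expect(lambda x1, x2: (a1 * x1 + a2 * x2 - mean_y) ** 2)

# Right-hand sides: the formulas established in the proofs.
mean_formula = a1 * m1 + a2 * m2
var_formula = a1**2 * v1 + a2**2 * v2 + 2 * a1 * a2 * cov

print("mean:    ", mean_y, "vs formula", mean_formula)
print("variance:", var_y, "vs formula", var_formula)
```

Because the covariance here is not zero, dropping the $2a_1a_2\,\mathrm{Cov}(X_1, X_2)$ term would give the wrong variance, which illustrates why (5.9) requires independence (or at least uncorrelatedness).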

EXERCISES Section 5.5 (58–74)

58. A shipping company handles containers in three different sizes: (1) 27 ft³ (3 × 3 × 3), (2) 125 ft³, and (3) 512 ft³. Let $X_i$ ($i = 1, 2, 3$) denote the number of type $i$ containers shipped during a given week. With $\mu_i = E(X_i)$ and $\sigma_i^2 = V(X_i)$, suppose that the mean values and standard deviations are as follows:

$\mu_1 = 200$    $\mu_2 = 250$    $\mu_3 = 100$
$\sigma_1 = 10$    $\sigma_2 = 12$    $\sigma_3 = 8$

a. Assuming that $X_1$, $X_2$, $X_3$ are independent, calculate the expected value and variance of the total volume shipped. [Hint: Volume $= 27X_1 + 125X_2 + 512X_3$.]
b. Would your calculations necessarily be correct if the $X_i$’s were not independent? Explain.

59. Let $X_1$, $X_2$, and $X_3$ represent the times necessary to perform three successive repair tasks at a certain service facility. Suppose they are independent, normal rv’s with expected values $\mu_1$, $\mu_2$, and $\mu_3$ and variances $\sigma_1^2$, $\sigma_2^2$, and $\sigma_3^2$, respectively.
a. If $\mu_1 = \mu_2 = \mu_3 = 60$ and $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = 15$, calculate $P(T_o \le 200)$ and $P(150 \le T_o \le 200)$.
b. Using the $\mu_i$’s and $\sigma_i$’s given in part (a), calculate both $P(55 \le \bar{X})$ and $P(58 \le \bar{X} \le 62)$.
c. Using the $\mu_i$’s and $\sigma_i$’s given in part (a), calculate and interpret $P(-10 \le X_1 - .5X_2 - .5X_3 \le 5)$.
d. If $\mu_1 = 40$, $\mu_2 = 50$, $\mu_3 = 60$, $\sigma_1^2 = 10$, $\sigma_2^2 = 12$, and $\sigma_3^2 = 14$, calculate $P(X_1 + X_2 + X_3 \le 160)$ and also $P(X_1 + X_2 \ge 2X_3)$.

60. Five automobiles of the same type are to be driven on a 300-mile trip. The first two will use an economy brand of gasoline, and the other three will use a name brand. Let $X_1$, $X_2$, $X_3$, $X_4$, and $X_5$ be the observed fuel efficiencies (mpg) for the five cars. Suppose these variables are independent and normally distributed with $\mu_1 = \mu_2 = 20$, $\mu_3 = \mu_4 = \mu_5 = 21$, and $\sigma^2 = 4$ for the economy brand and 3.5 for the name brand. Define an rv $Y$ by

$Y = \dfrac{X_1 + X_2}{2} - \dfrac{X_3 + X_4 + X_5}{3}$

so that $Y$ is a measure of the difference in efficiency between economy gas and name-brand gas. Compute $P(0 \le Y)$ and $P(-1 \le Y \le 1)$. [Hint: $Y = a_1X_1 + \cdots + a_5X_5$, with $a_1 = \frac{1}{2}, \ldots, a_5 = -\frac{1}{3}$.]

61. Exercise 26 introduced random variables $X$ and $Y$, the number of cars and buses, respectively, carried by a ferry on a single trip. The joint pmf of $X$ and $Y$ is given in the table in Exercise 7. It is readily verified that $X$ and $Y$ are independent.
a. Compute the expected value, variance, and standard deviation of the total number of vehicles on a single trip.


b. If each car is charged $3 and each bus $10, compute the expected value, variance, and standard deviation of the revenue resulting from a single trip.

62. Manufacture of a certain component requires three different machining operations. Machining time for each operation has a normal distribution, and the three times are independent of one another. The mean values are 15, 30, and 20 min, respectively, and the standard deviations are 1, 2, and 1.5 min, respectively. What is the probability that it takes at most 1 hour of machining time to produce a randomly selected component?

63. Refer to Exercise 3.
a. Calculate the covariance between $X_1$ = the number of customers in the express checkout and $X_2$ = the number of customers in the superexpress checkout.
b. Calculate $V(X_1 + X_2)$. How does this compare to $V(X_1) + V(X_2)$?

64. Suppose your waiting time for a bus in the morning is uniformly distributed on [0, 8], whereas waiting time in the evening is uniformly distributed on [0, 10] independent of morning waiting time.
a. If you take the bus each morning and evening for a week, what is your total expected waiting time? [Hint: Define rv’s $X_1, \ldots, X_{10}$ and use a rule of expected value.]
b. What is the variance of your total waiting time?
c. What are the expected value and variance of the difference between morning and evening waiting times on a given day?
d. What are the expected value and variance of the difference between total morning waiting time and total evening waiting time for a particular week?

65. Suppose that when the pH of a certain chemical compound is 5.00, the pH measured by a randomly selected beginning chemistry student is a random variable with mean 5.00 and standard deviation .2. A large batch of the compound is subdivided and a sample given to each student in a morning lab and each student in an afternoon lab. Let X̄ = the average pH as determined by the morning students and Ȳ = the average pH as determined by the afternoon students.
a. If pH is a normal variable and there are 25 students in each lab, compute P(−.1 ≤ X̄ − Ȳ ≤ .1). [Hint: X̄ − Ȳ is a linear combination of normal variables, so is normally distributed. Compute $\mu_{\bar{X}-\bar{Y}}$ and $\sigma_{\bar{X}-\bar{Y}}$.]
b. If there are 36 students in each lab, but pH determinations are not assumed normal, calculate (approximately) P(−.1 ≤ X̄ − Ȳ ≤ .1).

66. If two loads are applied to a cantilever beam as shown in the accompanying drawing, the bending moment at 0 due to the loads is a1X1 + a2X2.
a. Suppose that X1 and X2 are independent rv’s with means 2 and 4 kips, respectively, and standard deviations .5 and 1.0 kip, respectively. If a1 = 5 ft and a2 = 10 ft, what is the expected bending moment and what is the standard deviation of the bending moment?
b. If X1 and X2 are normally distributed, what is the probability that the bending moment will exceed 75 kip-ft?
c. Suppose the positions of the two loads are random variables. Denoting them by A1 and A2, assume that these variables have means of 5 and 10 ft, respectively, that each has a standard deviation of .5, and that all Ai’s and Xi’s are independent of one another. What is the expected moment now?
d. For the situation of part (c), what is the variance of the bending moment?
e. If the situation is as described in part (a) except that Corr(X1, X2) = .5 (so that the two loads are not independent), what is the variance of the bending moment?

67. One piece of PVC pipe is to be inserted inside another piece. The length of the first piece is normally distributed with mean value 20 in. and standard deviation .5 in. The length of the second piece is a normal rv with mean and standard deviation 15 in. and .4 in., respectively. The amount of overlap is normally distributed with mean value 1 in. and standard deviation .1 in. Assuming that the lengths and amount of overlap are independent of one another, what is the probability that the total length after insertion is between 34.5 in. and 35 in.?

68. Two airplanes are flying in the same direction in adjacent parallel corridors. At time t = 0, the first airplane is 10 km ahead of the second one. Suppose the speed of the first plane (km/hr) is normally distributed with mean 520 and standard deviation 10 and the second plane’s speed is also normally distributed with mean and standard deviation 500 and 10, respectively.
a. What is the probability that after 2 hr of flying, the second plane has not caught up to the first plane?
b. Determine the probability that the planes are separated by at most 10 km after 2 hr.

69. Three different roads feed into a particular freeway entrance. Suppose that during a fixed time period, the number of cars coming from each road onto the freeway is a random variable, with expected value and standard deviation as given in the table.

                      Road 1   Road 2   Road 3
Expected value           800     1000      600
Standard deviation        16       25       18

a. What is the expected total number of cars entering the freeway at this point during the period? [Hint: Let Xi = the number from road i.]
b. What is the variance of the total number of entering cars? Have you made any assumptions about the relationship between the numbers of cars on the different roads?


c. With Xi denoting the number of cars entering from road i during the period, suppose that Cov(X1, X2) = 80, Cov(X1, X3) = 90, and Cov(X2, X3) = 100 (so that the three streams of traffic are not independent). Compute the expected total number of entering cars and the standard deviation of the total.

70. Consider a random sample of size n from a continuous distribution having median 0 so that the probability of any one observation being positive is .5. Disregarding the signs of the observations, rank them from smallest to largest in absolute value, and let W = the sum of the ranks of the observations having positive signs. For example, if the observations are −.3, +.7, +2.1, and −2.5, then the ranks of positive observations are 2 and 3, so W = 5. In Chapter 15, W will be called Wilcoxon’s signed-rank statistic. W can be represented as follows:

\[ W = 1 \cdot Y_1 + 2 \cdot Y_2 + 3 \cdot Y_3 + \cdots + n \cdot Y_n = \sum_{i=1}^{n} i \cdot Y_i \]

where the Yi’s are independent Bernoulli rv’s, each with p = .5 (Yi = 1 corresponds to the observation with rank i being positive).
a. Determine E(Yi) and then E(W) using the equation for W. [Hint: The first n positive integers sum to n(n + 1)/2.]
b. Determine V(Yi) and then V(W). [Hint: The sum of the squares of the first n positive integers can be expressed as n(n + 1)(2n + 1)/6.]
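
The definition of W in Exercise 70 can be made concrete with a short Python sketch. The function name `signed_rank_statistic` is an illustrative choice, and the code assumes no ties and no zero observations, which is what a continuous distribution gives with probability 1. It computes W for the four observations quoted in the exercise, both directly and through the indicator variables Yi.

```python
def signed_rank_statistic(observations):
    """Sum of the ranks (by absolute value) of the positive observations.

    Assumes there are no ties in absolute value and no zeros.
    """
    ordered = sorted(observations, key=abs)  # rank 1 = smallest absolute value
    return sum(rank for rank, x in enumerate(ordered, start=1) if x > 0)


sample = [-0.3, 0.7, 2.1, -2.5]  # the observations given in the exercise
w = signed_rank_statistic(sample)

# Equivalent representation W = sum of i * Y_i, where Y_i = 1 when the
# observation with rank i is positive and Y_i = 0 otherwise.
ordered = sorted(sample, key=abs)
indicators = [1 if x > 0 else 0 for x in ordered]
w_from_indicators = sum(i * y for i, y in enumerate(indicators, start=1))

print(w, indicators, w_from_indicators)
```

The sketch only evaluates W for one sample; the expected value and variance asked for in parts (a) and (b) still follow from the representation in terms of the Yi’s.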

71. In Exercise 66, the weight of the beam itself contributes to the bending moment. Assume that the beam is of uniform thickness and density so that the resulting load is uniformly distributed on the beam. If the weight of the beam is random, the resulting load from the weight is also random; denote this load by W (kip-ft).
a. If the beam is 12 ft long, W has mean 1.5 and standard deviation .25, and the fixed loads are as described in part (a) of Exercise 66, what are the expected value and variance of the bending moment? [Hint: If the load due to the beam were w kip-ft, the contribution to the bending moment would be $w \int_0^{12} x \, dx$.]
b. If all three variables (X1, X2, and W) are normally distributed, what is the probability that the bending moment will be at most 200 kip-ft?

72. I have three errands to take care of in the Administration Building. Let Xi = the time that it takes for the ith errand (i = 1, 2, 3), and let X4 = the total time in minutes that I spend walking to and from the building and between each errand. Suppose the Xi’s are independent, and normally distributed, with the following means and standard deviations: μ1 = 15, σ1 = 4, μ2 = 5, σ2 = 1, μ3 = 8, σ3 = 2, μ4 = 12, σ4 = 3. I plan to leave my office at precisely 10:00 A.M. and wish to post a note on my door that reads, “I will return by t A.M.” What time t should I write down if I want the probability of my arriving after t to be .01?

73. Suppose the expected tensile strength of type-A steel is 105 ksi and the standard deviation of tensile strength is 8 ksi. For type-B steel, suppose the expected tensile strength and standard deviation of tensile strength are 100 ksi and 6 ksi, respectively. Let X̄ = the sample average tensile strength of a random sample of 40 type-A specimens, and let Ȳ = the sample average tensile strength of a random sample of 35 type-B specimens.
a. What is the approximate distribution of X̄? Of Ȳ?
b. What is the approximate distribution of X̄ − Ȳ? Justify your answer.
c. Calculate (approximately) P(−1 ≤ X̄ − Ȳ ≤ 1).
d. Calculate P(X̄ − Ȳ ≥ 10). If you actually observed X̄ − Ȳ ≥ 10, would you doubt that μ1 − μ2 = 5?

74. In an area having sandy soil, 50 small trees of a certain type were planted, and another 50 trees were planted in an area having clay soil. Let X = the number of trees planted in sandy soil that survive 1 year and Y = the number of trees planted in clay soil that survive 1 year. If the probability that a tree planted in sandy soil will survive 1 year is .7 and the probability of 1-year survival in clay soil is .6, compute an approximation to P(−5 ≤ X − Y ≤ 5) (do not bother with the continuity correction).

SUPPLEMENTARY EXERCISES (75–96)

75. A restaurant serves three fixed-price dinners costing $12, $15, and $20. For a randomly selected couple dining at this restaurant, let X = the cost of the man’s dinner and Y = the cost of the woman’s dinner. The joint pmf of X and Y is given in the following table:

                      y
p(x, y)        12     15     20
        12    .05    .05    .10
x       15    .05    .10    .35
        20      0    .20    .10

a. Compute the marginal pmf’s of X and Y.
b. What is the probability that the man’s and the woman’s dinner cost at most $15 each?
c. Are X and Y independent? Justify your answer.
d. What is the expected total cost of the dinner for the two people?
e. Suppose that when a couple opens fortune cookies at the conclusion of the meal, they find the message “You will receive as a refund the difference between the cost of the more expensive and the less expensive meal that you have chosen.” How much would the restaurant expect to refund?


76. In cost estimation, the total cost of a project is the sum of component task costs. Each of these costs is a random variable with a probability distribution. It is customary to obtain information about the total cost distribution by adding together characteristics of the individual component cost distributions; this is called the “roll-up” procedure. For example, E(X1 + . . . + Xn) = E(X1) + . . . + E(Xn), so the roll-up procedure is valid for mean cost. Suppose that there are two component tasks and that X1 and X2 are independent, normally distributed random variables. Is the roll-up procedure valid for the 75th percentile? That is, is the 75th percentile of the distribution of X1 + X2 the same as the sum of the 75th percentiles of the two individual distributions? If not, what is the relationship between the percentile of the sum and the sum of percentiles? For what percentiles is the roll-up procedure valid in this case?

77. A health-food store stocks two different brands of a certain type of grain. Let X = the amount (lb) of brand A on hand and Y = the amount of brand B on hand. Suppose the joint pdf of X and Y is

\[ f(x, y) = \begin{cases} kxy & x \ge 0,\ y \ge 0,\ 20 \le x + y \le 30 \\ 0 & \text{otherwise} \end{cases} \]

a. Draw the region of positive density and determine the value of k.
b. Are X and Y independent? Answer by first deriving the marginal pdf of each variable.
c. Compute P(X + Y ≤ 25).
d. What is the expected total amount of this grain on hand?
e. Compute Cov(X, Y) and Corr(X, Y).
f. What is the variance of the total amount of grain on hand?

78. The article “Stochastic Modeling for Pavement Warranty Cost Estimation” (J. of Constr. Engr. and Mgmnt., 2009: 352–359) proposes the following model for the distribution of Y = time to pavement failure. Let X1 be the time to failure due to rutting, and X2 be the time to failure due to transverse cracking; these two rv’s are assumed independent. Then Y = min(X1, X2). The probability of failure due to either one of these distress modes is assumed to be an increasing function of time t. After making certain distributional assumptions, the following form of the cdf for each mode is obtained:

\[ \Phi\!\left[ \frac{a + bt}{(c + dt + et^2)^{1/2}} \right] \]

where Φ is the standard normal cdf. Values of the five parameters a, b, c, d, and e are −25.49, 1.15, 4.45, −1.78, and .171 for cracking and −21.27, .0325, .972, −.00028, and .00022 for rutting. Determine the probability of pavement failure within t = 5 years and also t = 10 years.

79. Suppose that for a certain individual, calorie intake at breakfast is a random variable with expected value 500 and standard deviation 50, calorie intake at lunch is random with expected value 900 and standard deviation 100, and calorie intake at dinner is a random variable with expected value 2000 and standard deviation 180. Assuming that intakes at different meals are independent of one another, what is the probability that average calorie intake per day over the next (365-day) year is at most 3500? [Hint: Let Xi, Yi, and Zi denote the three calorie intakes on day i. Then total intake is given by Σ(Xi + Yi + Zi).]

80. The mean weight of luggage checked by a randomly selected tourist-class passenger flying between two cities on a certain airline is 40 lb, and the standard deviation is 10 lb. The mean and standard deviation for a business-class passenger are 30 lb and 6 lb, respectively.
a. If there are 12 business-class passengers and 50 tourist-class passengers on a particular flight, what are the expected value of total luggage weight and the standard deviation of total luggage weight?
b. If individual luggage weights are independent, normally distributed rv’s, what is the probability that total luggage weight is at most 2500 lb?

81. We have seen that if E(X1) = E(X2) = . . . = E(Xn) = μ, then E(X1 + . . . + Xn) = nμ. In some applications, the number of Xi’s under consideration is not a fixed number n but instead is an rv N. For example, let N = the number of components that are brought into a repair shop on a particular day, and let Xi denote the repair shop time for the ith component. Then the total repair time is X1 + X2 + . . . + XN, the sum of a random number of random variables. When N is independent of the Xi’s, it can be shown that

E(X1 + . . . + XN) = E(N) · μ

a. If the expected number of components brought in on a particular day is 10 and expected repair time for a randomly submitted component is 40 min, what is the expected total repair time for components submitted on any particular day?
b. Suppose components of a certain type come in for repair according to a Poisson process with a rate of 5 per hour. The expected number of defects per component is 3.5. What is the expected value of the total number of defects on components submitted for repair during a 4-hour period? Be sure to indicate how your answer follows from the general result just given.
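
The general result quoted in Exercise 81, E(X1 + . . . + XN) = E(N) · μ, can be checked by brute force on a small case. The sketch below is only an illustration: the two pmf’s (`n_pmf` for N and `x_pmf` for each Xi) are hypothetical and are not taken from the exercise, and the enumeration additionally assumes the Xi’s are independent of one another with a common distribution.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distributions, chosen only to keep the enumeration small.
n_pmf = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 2)}  # pmf of N
x_pmf = {10: Fraction(1, 2), 30: Fraction(1, 2)}                   # pmf of each Xi

e_n = sum(n * p for n, p in n_pmf.items())  # E(N)
mu = sum(x * p for x, p in x_pmf.items())   # common mean of the Xi's

# E(X1 + ... + XN) computed directly: for each value n of N, enumerate every
# possible (x1, ..., xn) and weight its total by P(N = n) * P(x1) * ... * P(xn).
expected_total = Fraction(0)
for n, p_n in n_pmf.items():
    for xs in product(x_pmf, repeat=n):
        p_xs = Fraction(1)
        for x in xs:
            p_xs *= x_pmf[x]
        expected_total += p_n * p_xs * sum(xs)

print(expected_total, e_n * mu)
```

Exact fractions are used so that the two quantities can be compared for equality rather than approximately.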

82. Suppose the proportion of rural voters in a certain state who favor a particular gubernatorial candidate is .45 and the proportion of suburban and urban voters favoring the candidate is .60. If a sample of 200 rural voters and 300 urban and suburban voters is obtained, what is the approximate probability that at least 250 of these voters favor this candidate?

83. Let μ denote the true pH of a chemical compound. A sequence of n independent sample pH determinations will be made. Suppose each sample pH is a random variable


with expected value μ and standard deviation .1. How many determinations are required if we wish the probability that the sample average is within .02 of the true pH to be at least .95? What theorem justifies your probability calculation?

84. If the amount of soft drink that I consume on any given day is independent of consumption on any other day and is normally distributed with μ = 13 oz and σ = 2 and if I currently have two six-packs of 16-oz bottles, what is the probability that I still have some soft drink left at the end of 2 weeks (14 days)?

85. Refer to Exercise 58, and suppose that the Xi’s are independent with each one having a normal distribution. What is the probability that the total volume shipped is at most 100,000 ft³?

86. A student has a class that is supposed to end at 9:00 A.M. and another that is supposed to begin at 9:10 A.M. Suppose the actual ending time of the 9 A.M. class is a normally distributed rv X1 with mean 9:02 and standard deviation 1.5 min and that the starting time of the next class is also a normally distributed rv X2 with mean 9:10 and standard deviation 1 min. Suppose also that the time necessary to get from one classroom to the other is a normally distributed rv X3 with mean 6 min and standard deviation 1 min. What is the probability that the student makes it to the second class before the lecture starts? (Assume independence of X1, X2, and X3, which is reasonable if the student pays no attention to the finishing time of the first class.)

87. a. Use the general formula for the variance of a linear combination to write an expression for V(aX + Y). Then let a = σY/σX, and show that ρ ≥ −1. [Hint: Variance is always ≥ 0, and Cov(X, Y) = σX · σY · ρ.]
b. By considering V(aX − Y), conclude that ρ ≤ 1.
c. Use the fact that V(W) = 0 only if W is a constant to show that ρ = 1 only if Y = aX + b.

88. Suppose a randomly chosen individual’s verbal score X and quantitative score Y on a nationally administered aptitude examination have a joint pdf

\[ f(x, y) = \begin{cases} \frac{2}{5}(2x + 3y) & 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & \text{otherwise} \end{cases} \]

You are asked to provide a prediction t of the individual’s total score X + Y. The error of prediction is the mean squared error E[(X + Y − t)²]. What value of t minimizes the error of prediction?

89. a. Let X1 have a chi-squared distribution with parameter ν1 (see Section 4.4), and let X2 be independent of X1 and have a chi-squared distribution with parameter ν2. Use the technique of Example 5.21 to show that X1 + X2 has a chi-squared distribution with parameter ν1 + ν2.
b. In Exercise 71 of Chapter 4, you were asked to show that if Z is a standard normal rv, then Z² has a chi-squared distribution with ν = 1. Let Z1, Z2, . . . , Zn be n independent standard normal rv’s. What is the distribution of $Z_1^2 + \cdots + Z_n^2$? Justify your answer.
c. Let X1, . . . , Xn be a random sample from a normal distribution with mean μ and variance σ². What is the distribution of the sum $Y = \sum_{i=1}^{n} [(X_i - \mu)/\sigma]^2$? Justify your answer.

90. a. Show that Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z).
b. Let X1 and X2 be quantitative and verbal scores on one aptitude exam, and let Y1 and Y2 be corresponding scores on another exam. If Cov(X1, Y1) = 5, Cov(X1, Y2) = 1, Cov(X2, Y1) = 2, and Cov(X2, Y2) = 8, what is the covariance between the two total scores X1 + X2 and Y1 + Y2?

91. A rock specimen from a particular area is randomly selected and weighed two different times. Let W denote the actual weight and X1 and X2 the two measured weights. Then X1 = W + E1 and X2 = W + E2, where E1 and E2 are the two measurement errors. Suppose that the Ei’s are independent of one another and of W and that $V(E_1) = V(E_2) = \sigma_E^2$.
a. Express ρ, the correlation coefficient between the two measured weights X1 and X2, in terms of $\sigma_W^2$, the variance of actual weight, and $\sigma_X^2$, the variance of measured weight.
b. Compute ρ when σW = 1 kg and σE = .01 kg.

92. Let A denote the percentage of one constituent in a randomly selected rock specimen, and let B denote the percentage of a second constituent in that same specimen. Suppose D and E are measurement errors in determining the values of A and B so that measured values are X = A + D and Y = B + E, respectively. Assume that measurement errors are independent of one another and of actual values.
a. Show that

\[ \mathrm{Corr}(X, Y) = \mathrm{Corr}(A, B) \cdot \sqrt{\mathrm{Corr}(X_1, X_2)} \cdot \sqrt{\mathrm{Corr}(Y_1, Y_2)} \]

where X1 and X2 are replicate measurements on the value of A, and Y1 and Y2 are defined analogously with respect to B. What effect does the presence of measurement error have on the correlation?
b. What is the maximum value of Corr(X, Y) when Corr(X1, X2) = .8100 and Corr(Y1, Y2) = .9025? Is this disturbing?

93. Let X1, . . . , Xn be independent rv’s with mean values μ1, . . . , μn and variances $\sigma_1^2, \ldots, \sigma_n^2$. Consider a function h(x1, . . . , xn), and use it to define a new rv Y = h(X1, . . . , Xn). Under rather general conditions on the h function, if the σi’s are all small relative to the corresponding μi’s, it can be shown that E(Y) ≈ h(μ1, . . . , μn) and

\[ V(Y) \approx \left( \frac{\partial h}{\partial x_1} \right)^2 \cdot \sigma_1^2 + \cdots + \left( \frac{\partial h}{\partial x_n} \right)^2 \cdot \sigma_n^2 \]

where each partial derivative is evaluated at (x1, . . . , xn) = (μ1, . . . , μn). Suppose three resistors with resistances X1, X2, X3 are connected in parallel across a battery with voltage X4. Then by Ohm’s law, the current is

\[ Y = X_4 \left[ \frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3} \right] \]

Let μ1 = 10 ohms, σ1 = 1.0 ohm, μ2 = 15 ohms, σ2 = 1.0 ohm, μ3 = 20 ohms, σ3 = 1.5 ohms, μ4 = 120 V, σ4 = 4.0 V. Calculate the approximate expected value and standard deviation of the current (suggested by “Random Samplings,” CHEMTECH, 1984: 696–697).

94. A more accurate approximation to E[h(X1, . . . , Xn)] in Exercise 93 is

\[ h(\mu_1, \ldots, \mu_n) + \frac{1}{2} \sigma_1^2 \left( \frac{\partial^2 h}{\partial x_1^2} \right) + \cdots + \frac{1}{2} \sigma_n^2 \left( \frac{\partial^2 h}{\partial x_n^2} \right) \]

Compute this for Y = h(X1, X2, X3, X4) given in Exercise 93, and compare it to the leading term h(μ1, . . . , μn).

95. Let X and Y be independent standard normal random variables, and define a new rv by U = .6X + .8Y.
a. Determine Corr(X, U).
b. How would you alter U to obtain Corr(X, U) = ρ for a specified value of ρ?

96. Let X1, X2, . . . , Xn be random variables denoting n independent bids for an item that is for sale. Suppose each Xi is uniformly distributed on the interval [100, 200]. If the seller sells to the highest bidder, how much can he expect to earn on the sale? [Hint: Let Y = max(X1, X2, . . . , Xn). First find FY(y) by noting that Y ≤ y iff each Xi is ≤ y. Then obtain the pdf and E(Y).]

Bibliography

Devore, Jay, and Kenneth Berk, Modern Mathematical Statistics with Applications, Thomson-Brooks/Cole, Belmont, CA, 2007. A bit more sophisticated exposition of probability topics than in the present book.

Olkin, Ingram, Cyrus Derman, and Leon Gleser, Probability Models and Applications (2nd ed.), Macmillan, New York, 1994. Contains a careful and comprehensive exposition of joint distributions, rules of expectation, and limit theorems.


6 Point Estimation

INTRODUCTION

Given a parameter of interest, such as a population mean μ or population proportion p, the objective of point estimation is to use a sample to compute a number that represents in some sense a good guess for the true value of the parameter. The resulting number is called a point estimate. In Section 6.1, we present some general concepts of point estimation. In Section 6.2, we describe and illustrate two important methods for obtaining point estimates: the method of moments and the method of maximum likelihood.


DEFINITION

Statistical inference is almost always directed toward drawing some type of conclu- sion about one or more parameters (population characteristics). To do so requires that an investigator obtain sample data from each of the populations under study. Conclusions can then be based on the computed values of various sample quantities. For example, let m (a parameter) denote the true average breaking strength of wire connections used in bonding semiconductor wafers. A random sample of n � 10 connections might be made, and the breaking strength of each one determined, resulting in observed strengths x1, x2, . . . , x10. The sample mean breaking strength could then be used to draw a conclusion about the value of . Similarly, if s 2 is the variance of the breaking strength distribution (population variance, another parame- ter), the value of the sample variance s 2 can be used to infer something about s 2.

When discussing general concepts and methods of inference, it is convenient to have a generic symbol for the parameter of interest. We will use the Greek letter for this purpose. The objective of point estimation is to select a single number, based on sample data, that represents a sensible value for . Suppose, for example, that the parameter of interest is m, the true average lifetime of batteries of a certain type. A random sample of n � 3 batteries might yield observed lifetimes (hours) x1 � 5.0, x2 � 6.4, x3 � 5.9. The computed value of the sample mean lifetime is � 5.77, and it is reasonable to regard 5.77 as a very plausible value of —our “best guess” for the value of based on the available sample information.

Suppose we want to estimate a parameter of a single population (e.g., or s) based on a random sample of size n. Recall from the previous chapter that before data is available, the sample observations must be considered random variables (rv’s) X1, X2, . . . , Xn. It follows that any function of the Xi ’s—that is, any statistic—such as the sample mean or sample standard deviation S is also a random variable. The same is true if available data consists of more than one sample. For example, we can represent tensile strengths of m type 1 specimens and n type 2 specimens by X1, . . . , Xm and Y1, . . . , Yn, respectively. The difference between the two sample mean strengths is

� , the natural statistic for making inferences about 1 � 2, the difference between the population mean strengths.

mmYX

X

m

m

m

x

u

u

m

x

A point estimate of a parameter is a single number that can be regarded as a sensible value for . A point estimate is obtained by selecting a suitable sta- tistic and computing its value from the given sample data. The selected statis- tic is called the point estimator of .u

u

u

In the battery example just given, the estimator used to obtain the point estimate of m was , and the point estimate of m was 5.77. If the three observed lifetimes had instead been x1 � 5.6, x2 � 4.5, and x3 � 6.1, use of the estimator would have resulted in the estimate � (5.6 � 4.5 � 6.1)/3 � 5.40. The symbol (“theta hat”) is customarily used to denote both the estimator of and the point estimate resulting from a given sample.* Thus is read as “the point estimator of is the samplemm̂ 5 X

u

ûx X

X

6.1 Some General Concepts of Point Estimation

240 CHAPTER 6 Point Estimation

* Following earlier notation, we could use (an uppercase theta) for the estimator, but this is cumber- some to write.

�̂

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Example 6.1

6.1 Some General Concepts of Point Estimation 241

mean .” The statement “the point estimate of m is 5.77” can be written concisely as . Notice that in writing , there is no indication of how this point

estimate was obtained (what statistic was used). It is recommended that both the esti- mator and the resulting estimate be reported.

An automobile manufacturer has developed a new type of bumper, which is supposed to absorb impacts with less damage than previous bumpers. The manufacturer has used this bumper in a sequence of 25 controlled crashes against a wall, each at 10 mph, using one of its compact car models. Let X = the number of crashes that result in no visible damage to the automobile. The parameter to be estimated is p = the proportion of all such crashes that result in no damage [alternatively, p = P(no damage in a single crash)]. If X is observed to be x = 15, the most reasonable estimator and estimate are

$$\text{estimator } \hat{p} = \frac{X}{n}, \qquad \text{estimate} = \frac{x}{n} = \frac{15}{25} = .60$$

If for each parameter of interest there were only one reasonable point estimator, there would not be much to point estimation. In most problems, though, there will be more than one reasonable estimator.

Example 6.2

Reconsider the accompanying 20 observations on dielectric breakdown voltage for pieces of epoxy resin first introduced in Example 4.30 (Section 4.6).

24.46 25.61 26.25 26.42 26.66 27.15 27.31 27.54 27.74 27.94

27.98 28.04 28.28 28.49 28.50 28.87 29.11 29.13 29.50 30.88

The pattern in the normal probability plot given there is quite straight, so we now assume that the distribution of breakdown voltage is normal with mean value $\mu$. Because normal distributions are symmetric, $\mu$ is also the median of the distribution. The given observations are then assumed to be the result of a random sample $X_1, X_2, \ldots, X_{20}$ from this normal distribution. Consider the following estimators and resulting estimates for $\mu$:

a. Estimator = $\bar{X}$, estimate = $\bar{x} = \sum x_i/n = 555.86/20 = 27.793$

b. Estimator = $\tilde{X}$ (the sample median), estimate = $\tilde{x} = (27.94 + 27.98)/2 = 27.960$

c. Estimator = $[\min(X_i) + \max(X_i)]/2$ = the average of the two extreme observations, estimate = $[\min(x_i) + \max(x_i)]/2 = (24.46 + 30.88)/2 = 27.670$

d. Estimator = $\bar{X}_{tr(10)}$, the 10% trimmed mean (discard the smallest and largest 10% of the sample and then average), estimate = $\bar{x}_{tr(10)} = \dfrac{555.86 - 24.46 - 25.61 - 29.50 - 30.88}{16} = 27.838$

Each one of the estimators (a)–(d) uses a different measure of the center of the sample to estimate $\mu$. Which of the estimates is closest to the true value? We cannot answer this without knowing the true value. A question that can be answered is, “Which estimator, when used on other samples of $X_i$’s, will tend to produce estimates closest to the true value?” We will shortly consider this type of question. ■
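The four estimates in Example 6.2 can be reproduced with a few lines of Python. This is only a sketch: the helper names `midrange` and `trimmed_mean` are illustrative choices, not part of the text, and the trimmed mean assumes that the trimming proportion times n is a whole number, as it is here.

```python
from statistics import mean, median

# Breakdown-voltage observations from Example 6.2 (n = 20).
voltages = [24.46, 25.61, 26.25, 26.42, 26.66, 27.15, 27.31, 27.54, 27.74, 27.94,
            27.98, 28.04, 28.28, 28.49, 28.50, 28.87, 29.11, 29.13, 29.50, 30.88]

def midrange(xs):
    """Average of the smallest and largest observations."""
    return (min(xs) + max(xs)) / 2

def trimmed_mean(xs, proportion):
    """Drop `proportion` of the observations from each end, then average the rest."""
    k = round(len(xs) * proportion)      # number removed from each end
    return mean(sorted(xs)[k:len(xs) - k])

estimates = {
    "mean": mean(voltages),
    "median": median(voltages),
    "midrange": midrange(voltages),
    "trimmed_10": trimmed_mean(voltages, 0.10),
}
for name, value in estimates.items():
    print(f"{name:>10}: {value:.3f}")
```

All four functions are applied to the same sample, which makes the point of the example concrete: the data are fixed, and only the choice of estimator changes the estimate.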


Example 6.3

The article “Is a Normal Distribution the Most Appropriate Statistical Distribution for Volumetric Properties in Asphalt Mixtures?” first cited in Example 4.26, reported the following observations on X = voids filled with asphalt (%) for 52 specimens of a certain type of hot-mix asphalt:

74.33 71.07 73.82 77.42 79.35 82.27 77.75 78.65 77.19 74.69 77.25 74.84 60.90
60.75 74.09 65.36 67.84 69.97 68.83 75.09 62.54 67.47 72.00 66.51 68.21 64.46
64.34 64.93 67.33 66.08 67.31 74.87 69.40 70.83 81.73 82.50 79.87 81.96 79.51
84.12 80.61 79.89 79.70 78.74 77.28 79.97 75.09 74.38 77.67 83.73 80.39 76.90

Let’s estimate the variance $\sigma^2$ of the population distribution. A natural estimator is the sample variance:

$$\hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$

Minitab gave the following output from a request to display descriptive statistics:

Variable  Count  Mean    SE Mean  StDev  Variance  Q1      Median  Q3
VFA(B)    52     73.880  0.889    6.413  41.126    67.933  74.855  79.470

Thus the point estimate of the population variance is

$$\hat{\sigma}^2 = s^2 = \frac{\sum (x_i - \bar{x})^2}{52 - 1} = 41.126$$

[alternatively, the computational formula for the numerator of $s^2$ gives

$$S_{xx} = \sum x_i^2 - \left(\sum x_i\right)^2 / n = 285{,}929.5964 - (3841.78)^2/52 = 2097.4124].$$

A point estimate of the population standard deviation is then

$$\hat{\sigma} = s = \sqrt{41.126} = 6.413.$$

An alternative estimator results from using the divisor n rather than n − 1:

$$\hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n}, \qquad \text{estimate} = \frac{2097.4124}{52} = 40.335$$

We will shortly indicate why many statisticians prefer $S^2$ to this latter estimator.

The cited article considered fitting four different distributions to the data: normal, lognormal, two-parameter Weibull, and three-parameter Weibull. Several different techniques were used to conclude that the two-parameter Weibull provided the best fit (a normal probability plot of the data shows some deviation from a linear pattern). From Section 4.5, the variance of a Weibull random variable is

$$\sigma^2 = \beta^2\left\{\Gamma(1 + 2/\alpha) - [\Gamma(1 + 1/\alpha)]^2\right\}$$

where $\alpha$ and $\beta$ are the shape and scale parameters of the distribution. The authors of the article used the method of maximum likelihood (see Section 6.2) to estimate these parameters. The resulting estimates were $\hat{\alpha} = 11.9731$, $\hat{\beta} = 77.0153$. A sensible estimate of the population variance can now be obtained by substituting the estimates of the two parameters into the expression for $\sigma^2$; the result is $\hat{\sigma}^2 = 56.035$. This latter estimate is obviously quite different from the sample variance. Its validity depends on the population distribution being Weibull, whereas the sample variance is a sensible way to estimate $\sigma^2$ when there is uncertainty as to the specific form of the population distribution. ■
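The arithmetic of Example 6.3 can be retraced from the summary statistics quoted there. The sketch below uses only the two sums and the two maximum likelihood estimates given in the example; the function name `weibull_variance` is an illustrative choice.

```python
import math

# Summary statistics quoted in Example 6.3 for the n = 52 VFA observations.
n = 52
sum_x = 3841.78
sum_x_sq = 285_929.5964

s_xx = sum_x_sq - sum_x**2 / n     # numerator of the sample variance
var_unbiased = s_xx / (n - 1)      # divisor n - 1
var_divisor_n = s_xx / n           # divisor n
sd = math.sqrt(var_unbiased)

def weibull_variance(alpha, beta):
    """Variance of a Weibull rv with shape alpha and scale beta."""
    g1 = math.gamma(1 + 1 / alpha)
    g2 = math.gamma(1 + 2 / alpha)
    return beta**2 * (g2 - g1**2)

# Plug in the maximum likelihood estimates reported in the article.
var_weibull = weibull_variance(11.9731, 77.0153)

print(f"Sxx = {s_xx:.4f}")
print(f"s^2 (divisor n-1) = {var_unbiased:.3f}, s = {sd:.3f}")
print(f"variance estimate with divisor n = {var_divisor_n:.3f}")
print(f"Weibull-based variance estimate = {var_weibull:.3f}")
```

The gap between the distribution-free estimates and the Weibull-based one is the point of the example: the latter is only as trustworthy as the Weibull assumption.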


In the best of all possible worlds, we could find an estimator $\hat{\theta}$ for which $\hat{\theta} = \theta$ always. However, $\hat{\theta}$ is a function of the sample $X_i$’s, so it is a random variable. For some samples, $\hat{\theta}$ will yield a value larger than $\theta$, whereas for other samples $\hat{\theta}$ will underestimate $\theta$. If we write

$$\hat{\theta} = \theta + \text{error of estimation}$$

then an accurate estimator would be one resulting in small estimation errors, so that estimated values will be near the true value.

A sensible way to quantify the idea of $\hat{\theta}$ being close to $\theta$ is to consider the squared error $(\hat{\theta} - \theta)^2$. For some samples, $\hat{\theta}$ will be quite close to $\theta$ and the resulting squared error will be near 0. Other samples may give values of $\hat{\theta}$ far from $\theta$, corresponding to very large squared errors. An omnibus measure of accuracy is the expected or mean square error $\text{MSE} = E[(\hat{\theta} - \theta)^2]$. If a first estimator has smaller MSE than does a second, it is natural to say that the first estimator is the better one. However, MSE will generally depend on the value of $\theta$. What often happens is that one estimator will have a smaller MSE for some values of $\theta$ and a larger MSE for other values. Finding an estimator with the smallest MSE is typically not possible.

One way out of this dilemma is to restrict attention just to estimators that have some specified desirable property and then find the best estimator in this restricted group. A popular property of this sort in the statistical community is unbiasedness.

Unbiased Estimators

Suppose we have two measuring instruments; one instrument has been accurately calibrated, but the other systematically gives readings smaller than the true value being measured. When each instrument is used repeatedly on the same object, because of measurement error, the observed measurements will not be identical. However, the measurements produced by the first instrument will be distributed about the true value in such a way that on average this instrument measures what it purports to measure, so it is called an unbiased instrument. The second instrument yields observations that have a systematic error component or bias.

DEFINITION A point estimator $\hat{\theta}$ is said to be an unbiased estimator of $\theta$ if $E(\hat{\theta}) = \theta$ for every possible value of $\theta$. If $\hat{\theta}$ is not unbiased, the difference $E(\hat{\theta}) - \theta$ is called the bias of $\hat{\theta}$.

That is, $\hat{\theta}$ is unbiased if its probability (i.e., sampling) distribution is always “centered” at the true value of the parameter. Suppose $\hat{\theta}$ is an unbiased estimator; then if $\theta = 100$, the $\hat{\theta}$ sampling distribution is centered at 100; if $\theta = 27.5$, then the $\hat{\theta}$ sampling distribution is centered at 27.5, and so on. Figure 6.1 pictures the distributions of several biased and unbiased estimators. Note that “centered” here means that the expected value, not the median, of the distribution of $\hat{\theta}$ is equal to $\theta$.

Figure 6.1 The pdf’s of a biased estimator $\hat{\theta}_1$ and an unbiased estimator $\hat{\theta}_2$ for a parameter $\theta$ (the bias of $\hat{\theta}_1$ is the distance between the center of its pdf and $\theta$)
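Mean square error and bias are linked by a standard identity that the text does not state explicitly; it is included here as a supplementary worked step. Writing $\hat{\theta} - \theta = [\hat{\theta} - E(\hat{\theta})] + [E(\hat{\theta}) - \theta]$ and squaring, the cross term has expected value 0, which leaves

$$\text{MSE} = E[(\hat{\theta} - \theta)^2] = V(\hat{\theta}) + [E(\hat{\theta}) - \theta]^2 = \text{variance} + (\text{bias})^2$$

For an unbiased estimator the second term vanishes, so its MSE is simply its variance.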


It may seem as though it is necessary to know the value of $\theta$ (in which case estimation is unnecessary) to see whether $\hat{\theta}$ is unbiased. This is not usually the case, though, because unbiasedness is a general property of the estimator’s sampling distribution (where it is centered), which is typically not dependent on any particular parameter value.

In Example 6.1, the sample proportion X/n was used as an estimator of p, where X, the number of sample successes, had a binomial distribution with parameters n and p. Thus

$$E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{1}{n}(np) = p$$

PROPOSITION When X is a binomial rv with parameters n and p, the sample proportion $\hat{p} = X/n$ is an unbiased estimator of p.

No matter what the true value of p is, the distribution of the estimator $\hat{p}$ will be centered at the true value.

Example 6.4

Suppose that X, the reaction time to a certain stimulus, has a uniform distribution on the interval from 0 to an unknown upper limit $\theta$ (so the density function of X is rectangular in shape with height $1/\theta$ for $0 \le x \le \theta$). It is desired to estimate $\theta$ on the basis of a random sample $X_1, X_2, \ldots, X_n$ of reaction times. Since $\theta$ is the largest possible time in the entire population of reaction times, consider as a first estimator the largest sample reaction time: $\hat{\theta}_1 = \max(X_1, \ldots, X_n)$. If n = 5 and $x_1 = 4.2$, $x_2 = 1.7$, $x_3 = 2.4$, $x_4 = 3.9$, and $x_5 = 1.3$, the point estimate of $\theta$ is $\hat{\theta}_1 = \max(4.2, 1.7, 2.4, 3.9, 1.3) = 4.2$.

Unbiasedness implies that some samples will yield estimates that exceed $\theta$ and other samples will yield estimates smaller than $\theta$; otherwise $\theta$ could not possibly be the center (balance point) of $\hat{\theta}_1$’s distribution. However, our proposed estimator will never overestimate $\theta$ (the largest sample value cannot exceed the largest population value) and will underestimate $\theta$ unless the largest sample value equals $\theta$. This intuitive argument shows that $\hat{\theta}_1$ is a biased estimator. More precisely, it can be shown (see Exercise 32) that

$$E(\hat{\theta}_1) = \frac{n}{n + 1} \cdot \theta < \theta \qquad \left(\text{since } \frac{n}{n + 1} < 1\right)$$

The bias of $\hat{\theta}_1$ is given by $n\theta/(n + 1) - \theta = -\theta/(n + 1)$, which approaches 0 as n gets large.

It is easy to modify $\hat{\theta}_1$ to obtain an unbiased estimator of $\theta$. Consider the estimator

$$\hat{\theta}_2 = \frac{n + 1}{n} \cdot \max(X_1, \ldots, X_n)$$

Using this estimator on the data gives the estimate (6/5)(4.2) = 5.04. The fact that $(n + 1)/n > 1$ implies that $\hat{\theta}_2$ will overestimate $\theta$ for some samples and underestimate it for others. The mean value of this estimator is

$$E(\hat{\theta}_2) = E\left[\frac{n + 1}{n}\max(X_1, \ldots, X_n)\right] = \frac{n + 1}{n} \cdot E[\max(X_1, \ldots, X_n)] = \frac{n + 1}{n} \cdot \frac{n}{n + 1}\,\theta = \theta$$

If $\hat{\theta}_2$ is used repeatedly on different samples to estimate $\theta$, some estimates will be too large and others will be too small, but in the long run there will be no systematic tendency to underestimate or overestimate $\theta$. ■
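A small simulation illustrates the long-run behavior described in Example 6.4. This is a sketch under stated assumptions: the true upper limit 5.0, the number of replications, and the seed are arbitrary illustrative choices, and `simulate_uniform_estimators` is a hypothetical helper name.

```python
import random
from statistics import mean

def simulate_uniform_estimators(theta, n, reps, seed=0):
    """Average max(X) and (n + 1)/n * max(X) over many uniform(0, theta) samples."""
    rng = random.Random(seed)
    est1, est2 = [], []
    for _ in range(reps):
        largest = max(rng.uniform(0, theta) for _ in range(n))
        est1.append(largest)                  # theta1_hat
        est2.append((n + 1) / n * largest)    # theta2_hat
    return mean(est1), mean(est2)

theta, n = 5.0, 5                             # hypothetical true limit, sample size
avg1, avg2 = simulate_uniform_estimators(theta, n, reps=20_000)
print(f"average of theta1_hat: {avg1:.3f} (theory {n / (n + 1) * theta:.3f})")
print(f"average of theta2_hat: {avg2:.3f} (theory {theta:.3f})")

# Point estimates for the five observed reaction times in Example 6.4.
times = [4.2, 1.7, 2.4, 3.9, 1.3]
estimate1 = max(times)
estimate2 = (len(times) + 1) / len(times) * max(times)
print(estimate1, round(estimate2, 2))
```

The simulated average of the sample maximum should settle near $n\theta/(n + 1)$, below the true value, while the rescaled estimator averages out near $\theta$ itself.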

Principle of Unbiased Estimation

When choosing among several different estimators of $\theta$, select one that is unbiased.

According to this principle, the unbiased estimator $\hat{\theta}_2$ in Example 6.4 should be preferred to the biased estimator $\hat{\theta}_1$. Consider now the problem of estimating $\sigma^2$.

PROPOSITION Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with mean $\mu$ and variance $\sigma^2$. Then the estimator

$$\hat{\sigma}^2 = S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$

is unbiased for estimating $\sigma^2$.
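The unbiasedness of $S^2$ can be checked exactly, without simulation, for a small discrete distribution by enumerating every possible sample. The sketch below assumes a hypothetical population that puts equal probability on the values 1, 2, and 6; exact fractions avoid any rounding.

```python
from fractions import Fraction
from itertools import product

def exact_expectations(values, n):
    """Enumerate all equally likely samples of size n (independent draws from
    `values`) and return (sigma^2, E[S^2], E[divisor-n estimator]) exactly."""
    pop = [Fraction(v) for v in values]
    mu = sum(pop) / len(pop)
    sigma_sq = sum((v - mu) ** 2 for v in pop) / len(pop)

    total_unbiased = Fraction(0)
    total_divisor_n = Fraction(0)
    count = 0
    for sample in product(pop, repeat=n):
        xbar = sum(sample) / n
        ss = sum((x - xbar) ** 2 for x in sample)   # sum of squared deviations
        total_unbiased += ss / (n - 1)
        total_divisor_n += ss / n
        count += 1
    return sigma_sq, total_unbiased / count, total_divisor_n / count

sigma_sq, e_s2, e_div_n = exact_expectations([1, 2, 6], n=3)
print("sigma^2             =", sigma_sq)
print("E(S^2)              =", e_s2)
print("E(divisor-n version) =", e_div_n)
```

With n = 3 the divisor-n estimator has expected value $(n - 1)\sigma^2/n = 2\sigma^2/3$, so it falls short of $\sigma^2$ on average, whereas $E(S^2)$ matches $\sigma^2$ exactly.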

Proof For any rv Y, $V(Y) = E(Y^2) - [E(Y)]^2$, so $E(Y^2) = V(Y) + [E(Y)]^2$. Applying this to

$$S^2 = \frac{1}{n - 1}\left[\sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}\right]$$

gives

$$\begin{aligned}
E(S^2) &= \frac{1}{n - 1}\left\{\sum E(X_i^2) - \frac{1}{n}E\left[\left(\sum X_i\right)^2\right]\right\} \\
&= \frac{1}{n - 1}\left\{\sum (\sigma^2 + \mu^2) - \frac{1}{n}\left\{V\left(\sum X_i\right) + \left[E\left(\sum X_i\right)\right]^2\right\}\right\} \\
&= \frac{1}{n - 1}\left\{n\sigma^2 + n\mu^2 - \frac{1}{n}n\sigma^2 - \frac{1}{n}(n\mu)^2\right\} \\
&= \frac{1}{n - 1}\left\{n\sigma^2 - \sigma^2\right\} = \sigma^2 \quad \text{(as desired)}
\end{aligned}$$

The estimator that uses divisor n can be expressed as $(n - 1)S^2/n$, so

$$E\left[\frac{(n - 1)S^2}{n}\right] = \frac{n - 1}{n}E(S^2) = \frac{n - 1}{n}\sigma^2$$

This estimator is therefore not unbiased. The bias is $(n - 1)\sigma^2/n - \sigma^2 = -\sigma^2/n$. Because the bias is negative, the estimator with divisor n tends to underestimate $\sigma^2$, and this is why the divisor n − 1 is preferred by many statisticians (though when n is large, the bias is small and there is little difference between the two).

Unfortunately, the fact that $S^2$ is unbiased for estimating $\sigma^2$ does not imply that S is unbiased for estimating $\sigma$. Taking the square root destroys the property of unbiasedness (the expected value of the square root is not the square root of the expected value). Fortunately, the bias of S is small unless n is quite small. There are other good reasons to use S as an estimator, especially when the population distribution is normal. These will become more apparent when we discuss confidence intervals and hypothesis testing in the next several chapters.

In Example 6.2, we proposed several different estimators for the mean $\mu$ of a normal distribution. If there were a unique unbiased estimator for $\mu$, the estimation problem would be resolved by using that estimator. Unfortunately, this is not the case.

PROPOSITION If $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with mean $\mu$, then $\bar{X}$ is an unbiased estimator of $\mu$. If in addition the distribution is continuous and symmetric, then $\tilde{X}$ and any trimmed mean are also unbiased estimators of $\mu$.

The fact that $\bar{X}$ is unbiased is just a restatement of one of our rules of expected value: $E(\bar{X}) = \mu$ for every possible value of $\mu$ (for discrete as well as continuous distributions). The unbiasedness of the other estimators is more difficult to verify.

The next example introduces another situation in which there are several unbiased estimators for a particular parameter.

Example 6.5

Under certain circumstances organic contaminants adhere readily to wafer surfaces and cause deterioration in semiconductor manufacturing devices. The paper “Ceramic Chemical Filter for Removal of Organic Contaminants” (J. of the Institute of Environmental Sciences and Technology, 2003: 59–65) discussed a recently developed alternative to conventional charcoal filters for removing organic airborne molecular contamination in cleanroom applications. One aspect of the investigation of filter performance involved studying how contaminant concentration in air related to concentration on a wafer surface after prolonged exposure. Consider the following representative data on x = DBP concentration in air and y = DBP concentration on a wafer surface after 4-hour exposure (both in μg/m³, where DBP = dibutyl phthalate).

Obs. i:  1    2    3    4    5     6
x:       .8   1.3  1.5  3.0  11.6  26.6
y:       .6   1.1  4.5  3.5  14.4  29.1

The authors comment that “DBP adhesion on the wafer surface was roughly proportional to the DBP concentration in air.” Figure 6.2 shows a plot of y versus x, i.e., of the (x, y) pairs.

Figure 6.2 Plot of the DBP data from Example 6.5 (vertical axis: wafer DBP, 0 to 30; horizontal axis: air DBP, 0 to 30)


If y were exactly proportional to x, we would have $y = \beta x$ for some value $\beta$, which says that the (x, y) points in the plot would lie exactly on a straight line with slope $\beta$ passing through (0, 0). But this is only approximately the case. So we now assume that for any fixed x, wafer DBP is a random variable Y having mean value $\beta x$. That is, we postulate that the mean value of Y is related to x by a line passing through (0, 0) but that the observed value of Y will typically deviate from this line (this is referred to in the statistical literature as “regression through the origin”).

We now wish to estimate the slope parameter $\beta$. Consider the following three estimators:

$$\#1\colon\ \hat{\beta} = \frac{1}{n}\sum \frac{Y_i}{x_i} \qquad \#2\colon\ \hat{\beta} = \frac{\sum Y_i}{\sum x_i} \qquad \#3\colon\ \hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2}$$

The resulting estimates based on the given data are 1.3497, 1.1875, and 1.1222, respectively. So the estimate definitely depends on which estimator is used. If one of these three estimators were unbiased and the other two were biased, there would be a good case for using the unbiased one. But all three are unbiased; the argument relies on the fact that each one is a linear function of the $Y_i$’s (we are assuming here that the $x_i$’s are fixed, not random):

$$E\left(\frac{1}{n}\sum \frac{Y_i}{x_i}\right) = \frac{1}{n}\sum \frac{E(Y_i)}{x_i} = \frac{1}{n}\sum \frac{\beta x_i}{x_i} = \frac{1}{n}\sum \beta = \frac{n\beta}{n} = \beta$$

$$E\left(\frac{\sum Y_i}{\sum x_i}\right) = \frac{1}{\sum x_i}E\left(\sum Y_i\right) = \frac{1}{\sum x_i}\left(\sum \beta x_i\right) = \frac{1}{\sum x_i}\,\beta\left(\sum x_i\right) = \beta$$

$$E\left(\frac{\sum x_i Y_i}{\sum x_i^2}\right) = \frac{1}{\sum x_i^2}E\left(\sum x_i Y_i\right) = \frac{1}{\sum x_i^2}\left(\sum x_i \beta x_i\right) = \frac{1}{\sum x_i^2}\,\beta\left(\sum x_i^2\right) = \beta$$

In both the foregoing example and the situation involving estimating a normal population mean, the principle of unbiasedness (preferring an unbiased estimator to a biased one) cannot be invoked to select an estimator. What we now need is a criterion for choosing among unbiased estimators.

Estimators with Minimum Variance

Suppose $\hat{\theta}_1$ and $\hat{\theta}_2$ are two estimators of $\theta$ that are both unbiased. Then, although the distribution of each estimator is centered at the true value of $\theta$, the spreads of the distributions about the true value may be different.
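As a numerical check on Example 6.5, the three slope estimates for regression through the origin can be computed directly from the six (x, y) pairs. The variable names below are illustrative.

```python
# DBP data from Example 6.5: x = concentration in air, y = concentration on wafer.
x = [0.8, 1.3, 1.5, 3.0, 11.6, 26.6]
y = [0.6, 1.1, 4.5, 3.5, 14.4, 29.1]
n = len(x)

# Estimator 1: average of the individual ratios y_i / x_i.
beta1 = sum(yi / xi for xi, yi in zip(x, y)) / n
# Estimator 2: ratio of the totals.
beta2 = sum(y) / sum(x)
# Estimator 3: sum of x_i * y_i over sum of x_i squared.
beta3 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi**2 for xi in x)

for label, value in (("#1", beta1), ("#2", beta2), ("#3", beta3)):
    print(f"estimator {label}: {value:.4f}")
```

Estimator #1 is pulled upward by the single large ratio 4.5/1.5 = 3, which is one way to see that three unbiased estimators can still behave quite differently on a given sample.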

Principle of Minimum Variance Unbiased Estimation

Among all estimators of $\theta$ that are unbiased, choose the one that has minimum variance. The resulting $\hat{\theta}$ is called the minimum variance unbiased estimator (MVUE) of $\theta$.

Figure 6.3 pictures the pdf’s of two unbiased estimators, with $\hat{\theta}_1$ having smaller variance than $\hat{\theta}_2$. Then $\hat{\theta}_1$ is more likely than $\hat{\theta}_2$ to produce an estimate close to the true $\theta$. The MVUE is, in a certain sense, the most likely among all unbiased estimators to produce an estimate close to the true $\theta$.


Figure 6.3 Graphs of the pdf’s of two different unbiased estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$

In Example 6.5, suppose each $Y_i$ is normally distributed with mean $\beta x_i$ and variance $\sigma^2$ (the assumption of constant variance). Then it can be shown that the third estimator $\hat{\beta} = \sum x_i Y_i / \sum x_i^2$ not only has smaller variance than either of the other two unbiased estimators, but in fact is the MVUE: it has smaller variance than any other unbiased estimator of $\beta$.

Example 6.6

We argued in Example 6.4 that when $X_1, \ldots, X_n$ is a random sample from a uniform distribution on $[0, \theta]$, the estimator

$$\hat{\theta}_1 = \frac{n + 1}{n} \cdot \max(X_1, \ldots, X_n)$$

is unbiased for $\theta$ (we previously denoted this estimator by $\hat{\theta}_2$). This is not the only unbiased estimator of $\theta$. The expected value of a uniformly distributed rv is just the midpoint of the interval of positive density, so $E(X_i) = \theta/2$. This implies that $E(\bar{X}) = \theta/2$, from which $E(2\bar{X}) = \theta$. That is, the estimator $\hat{\theta}_2 = 2\bar{X}$ is unbiased for $\theta$.

If X is uniformly distributed on the interval from A to B, then $V(X) = \sigma^2 = (B - A)^2/12$. Thus, in our situation, $V(X_i) = \theta^2/12$, $V(\bar{X}) = \sigma^2/n = \theta^2/(12n)$, and $V(\hat{\theta}_2) = V(2\bar{X}) = 4V(\bar{X}) = \theta^2/(3n)$. The results of Exercise 32 can be used to show that $V(\hat{\theta}_1) = \theta^2/[n(n + 2)]$. The estimator $\hat{\theta}_1$ has smaller variance than does $\hat{\theta}_2$ if $3n < n(n + 2)$, that is, if $0 < n^2 - n = n(n - 1)$. As long as $n > 1$, $V(\hat{\theta}_1) < V(\hat{\theta}_2)$, so $\hat{\theta}_1$ is a better estimator than $\hat{\theta}_2$. More advanced methods can be used to show that $\hat{\theta}_1$ is the MVUE of $\theta$: every other unbiased estimator of $\theta$ has variance that exceeds $\theta^2/[n(n + 2)]$. ■

One of the triumphs of mathematical statistics has been the development of methodology for identifying the MVUE in a wide variety of situations. The most important result of this type for our purposes concerns estimating the mean $\mu$ of a normal distribution.

THEOREM Let $X_1, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma$. Then the estimator $\hat{\mu} = \bar{X}$ is the MVUE for $\mu$.

Whenever we are convinced that the population being sampled is normal, the theorem says that $\bar{x}$ should be used to estimate $\mu$. In Example 6.2, then, our estimate would be $\bar{x} = 27.793$.

In some situations, it is possible to obtain an estimator with small bias that would be preferred to the best unbiased estimator. This is illustrated in Figure 6.4. However, MVUEs are often easier to obtain than the type of biased estimator whose distribution is pictured.
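The variance comparison in Example 6.6 depends on n only through the two formulas $\theta^2/[n(n + 2)]$ and $\theta^2/(3n)$, so it can be tabulated exactly. In this sketch the variances are expressed in units of $\theta^2$, and the function names are illustrative.

```python
from fractions import Fraction

def var_scaled_max(n):
    """V(theta1_hat) / theta^2, where theta1_hat = (n + 1)/n * max(X_i)."""
    return Fraction(1, n * (n + 2))

def var_twice_mean(n):
    """V(theta2_hat) / theta^2, where theta2_hat = 2 * Xbar."""
    return Fraction(1, 3 * n)

for n in (1, 2, 5, 10, 50):
    ratio = var_twice_mean(n) / var_scaled_max(n)    # simplifies to (n + 2)/3
    print(f"n = {n:>2}: V(theta1_hat) = {var_scaled_max(n)} theta^2, "
          f"V(theta2_hat) = {var_twice_mean(n)} theta^2, ratio = {ratio}")
```

The ratio $(n + 2)/3$ grows without bound, so the advantage of the estimator based on the sample maximum becomes more pronounced as the sample size increases; the two variances coincide only at n = 1.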


Figure 6.4 A biased estimator that is preferable to the MVUE (the pdf of $\hat{\theta}_1$, a biased estimator, is shown together with the pdf of $\hat{\theta}_2$, the MVUE)

Some Complications

The last theorem does not say that in estimating a population mean $\mu$, the estimator $\bar{X}$ should be used irrespective of the distribution being sampled.

Example 6.7

Suppose we wish to estimate the thermal conductivity $\mu$ of a certain material. Using standard measurement techniques, we will obtain a random sample $X_1, \ldots, X_n$ of n thermal conductivity measurements. Let’s assume that the population distribution is a member of one of the following three families:

$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-(x - \mu)^2/(2\sigma^2)} \qquad -\infty < x < \infty \tag{6.1}$$

$$f(x) = \frac{1}{\pi[1 + (x - \mu)^2]} \qquad -\infty < x < \infty \tag{6.2}$$

$$f(x) = \begin{cases} \dfrac{1}{2c} & -c \le x - \mu \le c \\ 0 & \text{otherwise} \end{cases} \tag{6.3}$$

The pdf (6.1) is the normal distribution, (6.2) is called the Cauchy distribution, and (6.3) is a uniform distribution. All three distributions are symmetric about $\mu$, and in fact the Cauchy distribution is bell-shaped but with much heavier tails (more probability farther out) than the normal curve. The uniform distribution has no tails. The four estimators for $\mu$ considered earlier are $\bar{X}$, $\tilde{X}$, $\bar{X}_e$ (the average of the two extreme observations), and $\bar{X}_{tr(10)}$, a trimmed mean.

The very important moral here is that the best estimator for $\mu$ depends crucially on which distribution is being sampled. In particular,

1. If the random sample comes from a normal distribution, then $\bar{X}$ is the best of the four estimators, since it has minimum variance among all unbiased estimators.

2. If the random sample comes from a Cauchy distribution, then $\bar{X}$ and $\bar{X}_e$ are terrible estimators for $\mu$, whereas $\tilde{X}$ is quite good (the MVUE is not known); $\bar{X}$ is bad because it is very sensitive to outlying observations, and the heavy tails of the Cauchy distribution make a few such observations likely to appear in any sample.

3. If the underlying distribution is uniform, the best estimator is $\bar{X}_e$; this estimator is greatly influenced by outlying observations, but the lack of tails makes such observations impossible.

4. The trimmed mean is best in none of these three situations but works reasonably well in all three. That is, $\bar{X}_{tr(10)}$ does not suffer too much in comparison with the best procedure in any of the three situations. ■

More generally, recent research in statistics has established that when estimating a point of symmetry $\mu$ of a continuous probability distribution, a trimmed mean with trimming proportion 10% or 20% (from each end of the sample) produces reasonably behaved estimates over a very wide range of possible models. For this reason, a trimmed mean with small trimming percentage is said to be a robust estimator.
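The moral of Example 6.7 can be explored with a rough Monte Carlo comparison of the four estimators under the three families. This is a sketch only: the sample size n = 20, the 2000 replications, the seed, and the choices $\mu = 0$, $\sigma = 1$, $c = 1$ are all illustrative assumptions, and the helper names are hypothetical.

```python
import math
import random
from statistics import mean, median

def midrange(xs):
    """Average of the two extreme observations."""
    return (min(xs) + max(xs)) / 2

def trimmed_mean(xs, proportion=0.10):
    """Discard `proportion` of the sample from each end, then average."""
    k = round(len(xs) * proportion)
    return mean(sorted(xs)[k:len(xs) - k])

ESTIMATORS = {"mean": mean, "median": median,
              "midrange": midrange, "trimmed": trimmed_mean}

# Each sampler draws one observation from a distribution symmetric about mu = 0.
SAMPLERS = {
    "normal": lambda rng: rng.gauss(0.0, 1.0),
    "cauchy": lambda rng: math.tan(math.pi * (rng.random() - 0.5)),
    "uniform": lambda rng: rng.uniform(-1.0, 1.0),
}

def mean_squared_errors(sampler, n=20, reps=2000, seed=1):
    """Simulated average squared error of each estimator (true center is 0)."""
    rng = random.Random(seed)
    totals = {name: 0.0 for name in ESTIMATORS}
    for _ in range(reps):
        sample = [sampler(rng) for _ in range(n)]
        for name, estimator in ESTIMATORS.items():
            totals[name] += estimator(sample) ** 2
    return {name: total / reps for name, total in totals.items()}

results = {dist: mean_squared_errors(s) for dist, s in SAMPLERS.items()}
for dist, errors in results.items():
    summary = ", ".join(f"{name}={value:.4g}" for name, value in errors.items())
    print(f"{dist:>8}: {summary}")
```

In runs of this kind the sample mean tends to do best for normal data, the midrange for uniform data, and the median far better than the mean for Cauchy data, with the trimmed mean never disastrous. The simulated figures are noisy, especially in the Cauchy case, so they should be read as qualitative support rather than precise values.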

In some situations, the choice is not between two different estimators constructed from the same sample, but instead between estimators based on two different experiments.

Example 6.8

Suppose a certain type of component has a lifetime distribution that is exponential with parameter $\lambda$ so that expected lifetime is $\mu = 1/\lambda$. A sample of n such components is selected, and each is put into operation. If the experiment is continued until all n lifetimes, $X_1, \ldots, X_n$, have been observed, then $\bar{X}$ is an unbiased estimator of $\mu$.

In some experiments, though, the components are left in operation only until the time of the rth failure, where r < n. This procedure is referred to as censoring. Let $Y_1$ denote the time of the first failure (the minimum lifetime among the n components), $Y_2$ denote the time at which the second failure occurs (the second smallest lifetime), and so on. Since the experiment terminates at time $Y_r$, the total accumulated lifetime at termination is

$$T_r = \sum_{i=1}^{r} Y_i + (n - r)Y_r$$

We now demonstrate that $\hat{\mu} = T_r/r$ is an unbiased estimator for $\mu$. To do so, we need two properties of exponential variables:

1. The memoryless property (see Section 4.4), which says that at any time point, remaining lifetime has the same exponential distribution as original lifetime.

2. When $X_1, \ldots, X_k$ are independent, each exponentially distributed with parameter $\lambda$, $\min(X_1, \ldots, X_k)$ is exponential with parameter $k\lambda$.

Since all n components last until $Y_1$, n − 1 last an additional $Y_2 - Y_1$, n − 2 an additional $Y_3 - Y_2$ amount of time, and so on, another expression for $T_r$ is

$$T_r = nY_1 + (n - 1)(Y_2 - Y_1) + (n - 2)(Y_3 - Y_2) + \cdots + (n - r + 1)(Y_r - Y_{r-1})$$

But $Y_1$ is the minimum of n exponential variables, so $E(Y_1) = 1/(n\lambda)$. Similarly, $Y_2 - Y_1$ is the smallest of the n − 1 remaining lifetimes, each exponential with parameter $\lambda$ (by the memoryless property), so $E(Y_2 - Y_1) = 1/[(n - 1)\lambda]$. Continuing, $E(Y_{i+1} - Y_i) = 1/[(n - i)\lambda]$, so

$$\begin{aligned}
E(T_r) &= nE(Y_1) + (n - 1)E(Y_2 - Y_1) + \cdots + (n - r + 1)E(Y_r - Y_{r-1}) \\
&= n \cdot \frac{1}{n\lambda} + (n - 1) \cdot \frac{1}{(n - 1)\lambda} + \cdots + (n - r + 1) \cdot \frac{1}{(n - r + 1)\lambda} \\
&= \frac{r}{\lambda}
\end{aligned}$$

Therefore, $E(T_r/r) = (1/r)E(T_r) = (1/r) \cdot (r/\lambda) = 1/\lambda = \mu$ as claimed.

As an example, suppose 20 components are tested and r = 10. Then if the first ten failure times are 11, 15, 29, 33, 35, 40, 47, 55, 58, and 72, the estimate of $\mu$ is

$$\hat{\mu} = \frac{11 + 15 + \cdots + 72 + (10)(72)}{10} = 111.5$$

The advantage of the experiment with censoring is that it terminates more quickly than the uncensored experiment. However, it can be shown that $V(T_r/r) = 1/(\lambda^2 r)$, which is larger than $1/(\lambda^2 n)$, the variance of $\bar{X}$ in the uncensored experiment. ■
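The censored estimate in Example 6.8 can be computed from either expression for $T_r$. The sketch below evaluates both on the ten observed failure times; the function names are illustrative.

```python
def total_time_on_test(failure_times, n):
    """T_r = sum of the r observed failure times + (n - r) * (time of r-th failure)."""
    ordered = sorted(failure_times)
    r = len(ordered)
    return sum(ordered) + (n - r) * ordered[-1]

def total_time_via_spacings(failure_times, n):
    """Same total, written as n*Y_1 + (n-1)(Y_2 - Y_1) + ... + (n-r+1)(Y_r - Y_{r-1})."""
    ordered = sorted(failure_times)
    total = n * ordered[0]
    for i in range(1, len(ordered)):
        # After the i-th failure, n - i components are still operating.
        total += (n - i) * (ordered[i] - ordered[i - 1])
    return total

failures = [11, 15, 29, 33, 35, 40, 47, 55, 58, 72]   # first r = 10 failures
n = 20                                                # components on test

t_r = total_time_on_test(failures, n)
mu_hat = t_r / len(failures)
print(t_r, total_time_via_spacings(failures, n), mu_hat)
```

Both functions give the same total accumulated lifetime, which is the identity used in the unbiasedness argument.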


Reporting a Point Estimate: The Standard Error Besides reporting the value of a point estimate, some indication of its precision should be given. The usual measure of precision is the standard error of the estimator used.

DEFINITION The standard error of an estimator is its standard deviation . It is the magnitude of a typical or representative deviation between an estimate and the value of . If the standard error itself involves unknown parameters whose values can be estimated, substitution of these estimates into yields the estimated standard error (estimated standard deviation) of the estimator. The estimated standard error can be denoted either by (the over s empha- sizes that is being estimated) or by .sûsû

ˆŝû

sû

u

sû 5 1V(û)û

Assuming that breakdown voltage is normally distributed, is the best estima- tor of . If the value of s is known to be 1.5, the standard error of is

. If, as is usually the case, the value of s is unknown, the estimate is substituted into to obtain the esti- mated standard error . ■

The standard error of � X/n is

Since p and q � 1 � p are unknown (else why estimate?), we substitute � x/np̂

sp̂ 5 2V(X/n) 5 B V(X)

n2 5 B

npq

n2 5 B

pq n

ŝX2 5 sX2 5 s/1n 5 1.462/120 5 .327 sX2ŝ 5 s 5 1.462

sX2 5 s/1n 5 1.5/120 5 .335 Xm

m̂ 5 XExample 6.9 (Example 6.2 continued)

Example 6.10 (Example 6.1 continued)

The form of the estimator may be sufficiently complicated so that standard statistical theory cannot be applied to obtain an expression for . This is true, for sû

and � 1 � x/n into yielding the estimated standard error

. Alternatively, since the largest value of pq is attained when 1(.6)(.4)/25 5 .098

5 1p̂q̂/n 5ŝp̂sp̂,q̂

p � q � .5, an upper bound on the standard error is ■11/(4n) 5 .10.

When the point estimator has approximately a normal distribution, which will often be the case when n is large, then we can be reasonably confident that the true value of lies within approximately 2 standard errors (standard deviations) of . Thus if a sample of n � 36 component lifetimes gives and s � 3.60, then

, so within 2 estimated standard errors, translates to the interval 28.50 (2)(.60) � (27.30, 29.70).

If is not necessarily approximately normal but is unbiased, then it can be shown that the estimate will deviate from by as much as 4 standard errors at most 6% of the time. We would then expect the true value to lie within 4 standard errors of (and this is a very conservative statement, since it applies to any unbiased ). Summarizing, the standard error tells us roughly within what distance of we can expect the true value of to lie.u

ûû

u

m̂s/1n 5 .60 m̂ 5 x 5 28.50

ûu

example, in the case $\theta = \sigma$, $\hat{\theta} = S$; the standard deviation of the statistic $S$, $\sigma_S$, cannot in general be determined. In recent years, a new computer-intensive method called the bootstrap has been introduced to address this problem. Suppose that the population pdf is $f(x; \theta)$, a member of a particular parametric family, and that data $x_1, x_2, \ldots, x_n$ gives $\hat{\theta} = 21.7$. We now use the computer to obtain "bootstrap samples" from the pdf $f(x; 21.7)$, and for each sample we calculate a "bootstrap estimate" $\hat{\theta}^*$:

First bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate $= \hat{\theta}_1^*$
Second bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate $= \hat{\theta}_2^*$
$\vdots$
Bth bootstrap sample: $x_1^*, x_2^*, \ldots, x_n^*$; estimate $= \hat{\theta}_B^*$

$B = 100$ or $200$ is often used. Now let $\bar{\theta}^* = \sum \hat{\theta}_i^*/B$, the sample mean of the bootstrap estimates. The bootstrap estimate of $\hat{\theta}$'s standard error is now just the sample standard deviation of the $\hat{\theta}_i^*$'s:

$$S_{\hat{\theta}} = \sqrt{\frac{1}{B-1}\sum \left(\hat{\theta}_i^* - \bar{\theta}^*\right)^2}$$

(In the bootstrap literature, $B$ is often used in place of $B - 1$; for typical values of $B$, there is usually little difference between the resulting estimates.)

Example 6.11

A theoretical model suggests that $X$, the time to breakdown of an insulating fluid between electrodes at a particular voltage, has $f(x; \lambda) = \lambda e^{-\lambda x}$, an exponential distribution. A random sample of $n = 10$ breakdown times (min) gives the following data:

41.53 18.73 2.99 30.34 12.33 117.52 73.02 223.63 4.00 26.78

Since $E(X) = 1/\lambda$, $E(\bar{X}) = 1/\lambda$, so a reasonable estimate of $\lambda$ is $\hat{\lambda} = 1/\bar{x} = 1/55.087 = .018153$. We then used a statistical computer package to obtain $B = 100$ bootstrap samples, each of size 10, from $f(x; .018153)$. The first such sample was 41.00, 109.70, 16.78, 6.31, 6.76, 5.62, 60.96, 78.81, 192.25, 27.61, from which $\sum x_i^* = 545.8$ and $\hat{\lambda}_1^* = 1/54.58 = .01832$. The average of the 100 bootstrap estimates is $\bar{\lambda}^* = .02153$, and the sample standard deviation of these 100 estimates is $s_{\hat{\lambda}} = .0091$, the bootstrap estimate of $\hat{\lambda}$'s standard error. A histogram of the 100 $\hat{\lambda}_i^*$'s was somewhat positively skewed, suggesting that the sampling distribution of $\hat{\lambda}$ also has this property. ■

Sometimes an investigator wishes to estimate a population characteristic without assuming that the population distribution belongs to a particular parametric family. An instance of this occurred in Example 6.7, where a 10% trimmed mean was proposed for estimating a symmetric population distribution's center $\theta$. The data of Example 6.2 gave $\hat{\theta} = \bar{x}_{tr(10)} = 27.838$, but now there is no assumed $f(x; \theta)$, so how can we obtain a bootstrap sample? The answer is to regard the sample itself as constituting the population (the $n = 20$ observations in Example 6.2) and take $B$ different samples, each of size $n$, with replacement from this population. The book by Bradley Efron and Robert Tibshirani or the one by John Rice listed in the chapter bibliography provides more information.
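The parametric bootstrap of Example 6.11 can be sketched in a few lines of Python. This is a minimal illustration, not the statistical package the text used: the function name, the seed, and the use of the standard-library generator `random.expovariate` are assumptions, so the numbers it produces will differ from the .02153 and .0091 reported in the example.

```python
import random
import statistics

# Breakdown times (min) from Example 6.11.
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]

def bootstrap_se_exponential(data, B=100, seed=0):
    """Parametric bootstrap for lambda-hat = 1/x-bar under an exponential model.

    Returns (lambda_hat, mean of bootstrap estimates, bootstrap standard error).
    The seed is an arbitrary choice made only for reproducibility.
    """
    rng = random.Random(seed)
    n = len(data)
    lam_hat = 1 / statistics.mean(data)
    boot_estimates = []
    for _ in range(B):
        # Draw a bootstrap sample of size n from f(x; lam_hat).
        sample = [rng.expovariate(lam_hat) for _ in range(n)]
        boot_estimates.append(1 / statistics.mean(sample))
    # statistics.stdev divides by B - 1, matching the formula in the text.
    return lam_hat, statistics.mean(boot_estimates), statistics.stdev(boot_estimates)

lam_hat, boot_mean, boot_se = bootstrap_se_exponential(times)
print(f"lambda-hat = {lam_hat:.6f}, bootstrap mean = {boot_mean:.5f}, bootstrap SE = {boot_se:.5f}")
```

For the nonparametric version described in the last paragraph, the line that draws from `expovariate` would be replaced by resampling the data with replacement, for example `rng.choices(data, k=n)`.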

EXERCISES Section 6.1 (1–19)

1. The accompanying data on flexural strength (MPa) for concrete beams of a certain type was introduced in Example 1.2.

5.9 7.2 7.3 6.3 8.1 6.8 7.0
7.6 6.8 6.5 7.0 6.3 7.9 9.0
8.2 8.7 7.8 9.7 7.4 7.7 9.7
7.8 7.7 11.6 11.3 11.8 10.7

a. Calculate a point estimate of the mean value of strength for the conceptual population of all beams manufactured in this fashion, and state which estimator you used. [Hint: $\sum x_i = 219.8$.]

b. Calculate a point estimate of the strength value that separates the weakest 50% of all such beams from the strongest 50%, and state which estimator you used.

c. Calculate and interpret a point estimate of the population standard deviation $\sigma$. Which estimator did you use? [Hint: $\sum x_i^2 = 1860.94$.]

d. Calculate a point estimate of the proportion of all such beams whose flexural strength exceeds 10 MPa. [Hint: Think of an observation as a "success" if it exceeds 10.]

e. Calculate a point estimate of the population coefficient of variation $\sigma/\mu$, and state which estimator you used.

2. A sample of 20 students who had recently taken elementary statistics yielded the following information on the brand of calculator owned (T = Texas Instruments, H = Hewlett Packard, C = Casio, S = Sharp):

T T H T C T T S C H
S S T H C T T T H T

a. Estimate the true proportion of all such students who own a Texas Instruments calculator.

b. Of the 10 students who owned a TI calculator, 4 had graphing calculators. Estimate the proportion of students who do not own a TI graphing calculator.

3. Consider the following sample of observations on coating thickness for low-viscosity paint ("Achieving a Target Value for a Manufacturing Process: A Case Study," J. of Quality Technology, 1992: 22–26):

.83 .88 .88 1.04 1.09 1.12 1.29 1.31
1.48 1.49 1.59 1.62 1.65 1.71 1.76 1.83

Assume that the distribution of coating thickness is normal (a normal probability plot strongly supports this assumption).

a. Calculate a point estimate of the mean value of coating thickness, and state which estimator you used.

b. Calculate a point estimate of the median of the coating thickness distribution, and state which estimator you used.

c. Calculate a point estimate of the value that separates the largest 10% of all values in the thickness distribution from the remaining 90%, and state which estimator you used. [Hint: Express what you are trying to estimate in terms of $\mu$ and $\sigma$.]

d. Estimate $P(X < 1.5)$, i.e., the proportion of all thickness values less than 1.5. [Hint: If you knew the values of $\mu$ and $\sigma$, you could calculate this probability. These values are not available, but they can be estimated.]

e. What is the estimated standard error of the estimator that you used in part (b)?

4. The article from which the data in Exercise 1 was extracted also gave the accompanying strength observations for cylinders:

6.1 5.8 7.8 7.1 7.2 9.2 6.6 8.3 7.0 8.3
7.8 8.1 7.4 8.5 8.9 9.8 9.7 14.1 12.6 11.2

Prior to obtaining data, denote the beam strengths by $X_1, \ldots, X_m$ and the cylinder strengths by $Y_1, \ldots, Y_n$. Suppose that the $X_i$'s constitute a random sample from a distribution with mean $\mu_1$ and standard deviation $\sigma_1$ and that the $Y_i$'s form a random sample (independent of the $X_i$'s) from another distribution with mean $\mu_2$ and standard deviation $\sigma_2$.

a. Use rules of expected value to show that $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$. Calculate the estimate for the given data.

b. Use rules of variance from Chapter 5 to obtain an expression for the variance and standard deviation (standard error) of the estimator in part (a), and then compute the estimated standard error.

c. Calculate a point estimate of the ratio $\sigma_1/\sigma_2$ of the two standard deviations.

d. Suppose a single beam and a single cylinder are randomly selected. Calculate a point estimate of the variance of the difference $X - Y$ between beam strength and cylinder strength.

5. As an example of a situation in which several different statistics could reasonably be used to calculate a point estimate, consider a population of $N$ invoices. Associated with each invoice is its "book value," the recorded amount of that invoice. Let $T$ denote the total book value, a known amount. Some of these book values are erroneous. An audit will be carried out by randomly selecting $n$ invoices and determining the audited (correct) value for each one. Suppose that the sample gives the following results (in dollars).

Invoice          1     2     3     4     5
Book value       300   720   526   200   127
Audited value    300   520   526   200   157
Error            0     200   0     0     −30

Let

$\bar{X}$ = sample mean book value
$\bar{Y}$ = sample mean audited value
$\bar{D}$ = sample mean error

Propose three different statistics for estimating the total audited (i.e., correct) value: one involving just $N$ and $\bar{Y}$, another involving $T$, $N$, and $\bar{D}$, and the last involving $T$ and $\bar{X}/\bar{Y}$. If $N = 5000$ and $T = 1{,}761{,}300$, calculate the three corresponding point estimates. (The article "Statistical Models and Analysis in Auditing," Statistical Science, 1989: 2–33 discusses properties of these estimators.)

6. Consider the accompanying observations on stream flow (1000s of acre-feet) recorded at a station in Colorado for the period April 1–August 31 over a 31-year span (from an article in the 1974 volume of Water Resources Research).

127.96 210.07 203.24 108.91 178.21
285.37 100.85 89.59 185.36 126.94
200.19 66.24 247.11 299.87 109.64
125.86 114.79 109.11 330.33 85.54
117.64 302.74 280.55 145.11 95.36
204.91 311.13 150.58 262.09 477.08
94.33

An appropriate probability plot supports the use of the lognormal distribution (see Section 4.5) as a reasonable model for stream flow.

a. Estimate the parameters of the distribution. [Hint: Remember that $X$ has a lognormal distribution with parameters $\mu$ and $\sigma^2$ if $\ln(X)$ is normally distributed with mean $\mu$ and variance $\sigma^2$.]

b. Use the estimates of part (a) to calculate an estimate of the expected value of stream flow. [Hint: What is $E(X)$?]

7. a. A random sample of 10 houses in a particular area, each of which is heated with natural gas, is selected and the amount of gas (therms) used during the month of January is determined for each house. The resulting observations are 103, 156, 118, 89, 125, 147, 122, 109, 138, 99. Let $\mu$ denote the average gas usage during January by all houses in this area. Compute a point estimate of $\mu$.

b. Suppose there are 10,000 houses in this area that use natural gas for heating. Let $\tau$ denote the total amount of gas used by all of these houses during January. Estimate $\tau$ using the data of part (a). What estimator did you use in computing your estimate?

c. Use the data in part (a) to estimate $p$, the proportion of all houses that used at least 100 therms.

d. Give a point estimate of the population median usage (the middle value in the population of all houses) based on the sample of part (a). What estimator did you use?

8. In a random sample of 80 components of a certain type, 12 are found to be defective.

a. Give a point estimate of the proportion of all such components that are not defective.

b. A system is to be constructed by randomly selecting two of these components and connecting them in series, as shown here.

The series connection implies that the system will function if and only if neither component is defective (i.e., both components work properly). Estimate the proportion of all such systems that work properly. [Hint: If $p$ denotes the probability that a component works properly, how can P(system works) be expressed in terms of $p$?]

9. Each of 150 newly manufactured items is examined and the number of scratches per item is recorded (the items are supposed to be free of scratches), yielding the following data:

Number of scratches per item   0    1    2    3    4    5    6    7
Observed frequency             18   37   42   30   13   7    2    1

Let $X$ = the number of scratches on a randomly chosen item, and assume that $X$ has a Poisson distribution with parameter $\mu$.

a. Find an unbiased estimator of $\mu$ and compute the estimate for the data. [Hint: $E(X) = \mu$ for $X$ Poisson, so $E(\bar{X}) = ?$]

b. What is the standard deviation (standard error) of your estimator? Compute the estimated standard error. [Hint: $\sigma_X^2 = \mu$ for $X$ Poisson.]

10. Using a long rod that has length $\mu$, you are going to lay out a square plot in which the length of each side is $\mu$. Thus the area of the plot will be $\mu^2$. However, you do not know the value of $\mu$, so you decide to make $n$ independent measurements $X_1, X_2, \ldots, X_n$ of the length. Assume that each $X_i$ has mean $\mu$ (unbiased measurements) and variance $\sigma^2$.

a. Show that $\bar{X}^2$ is not an unbiased estimator for $\mu^2$. [Hint: For any rv $Y$, $E(Y^2) = V(Y) + [E(Y)]^2$. Apply this with $Y = \bar{X}$.]

b. For what value of $k$ is the estimator $\bar{X}^2 - kS^2$ unbiased for $\mu^2$? [Hint: Compute $E(\bar{X}^2 - kS^2)$.]

11. Of $n_1$ randomly selected male smokers, $X_1$ smoked filter cigarettes, whereas of $n_2$ randomly selected female smokers, $X_2$ smoked filter cigarettes. Let $p_1$ and $p_2$ denote the probabilities that a randomly selected male and female, respectively, smoke filter cigarettes.

a. Show that $(X_1/n_1) - (X_2/n_2)$ is an unbiased estimator for $p_1 - p_2$. [Hint: $E(X_i) = n_i p_i$ for $i = 1, 2$.]

b. What is the standard error of the estimator in part (a)?

c. How would you use the observed values $x_1$ and $x_2$ to estimate the standard error of your estimator?

d. If $n_1 = n_2 = 200$, $x_1 = 127$, and $x_2 = 176$, use the estimator of part (a) to obtain an estimate of $p_1 - p_2$.

e. Use the result of part (c) and the data of part (d) to estimate the standard error of the estimator.

12. Suppose a certain type of fertilizer has an expected yield per acre of $\mu_1$ with variance $\sigma^2$, whereas the expected yield for a second type of fertilizer is $\mu_2$ with the same variance $\sigma^2$. Let $S_1^2$ and $S_2^2$ denote the sample variances of yields based on sample sizes $n_1$ and $n_2$, respectively, of the two fertilizers. Show that the pooled (combined) estimator

$$\hat{\sigma}^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$

is an unbiased estimator of $\sigma^2$.

13. Consider a random sample $X_1, \ldots, X_n$ from the pdf

$$f(x; \theta) = .5(1 + \theta x) \qquad -1 \le x \le 1$$

where $-1 \le \theta \le 1$ (this distribution arises in particle physics). Show that $\hat{\theta} = 3\bar{X}$ is an unbiased estimator of $\theta$. [Hint: First determine $\mu = E(X) = E(\bar{X})$.]

14. A sample of $n$ captured Pandemonium jet fighters results in serial numbers $x_1, x_2, x_3, \ldots, x_n$. The CIA knows that the aircraft were numbered consecutively at the factory starting with $a$ and ending with $b$, so that the total number of planes manufactured is $b - a + 1$ (e.g., if $a = 17$ and $b = 29$, then $29 - 17 + 1 = 13$ planes having serial numbers 17, 18, 19, . . . , 28, 29 were manufactured). However, the CIA does not know the values of $a$ or $b$. A CIA statistician suggests using the

estimator $\max(X_i) - \min(X_i) + 1$ to estimate the total number of planes manufactured.

a. If $n = 5$, $x_1 = 237$, $x_2 = 375$, $x_3 = 202$, $x_4 = 525$, and $x_5 = 418$, what is the corresponding estimate?

b. Under what conditions on the sample will the value of the estimate be exactly equal to the true total number of planes? Will the estimate ever be larger than the true total? Do you think the estimator is unbiased for estimating $b - a + 1$? Explain in one or two sentences.

15. Let $X_1, X_2, \ldots, X_n$ represent a random sample from a Rayleigh distribution with pdf

$$f(x; \theta) = \frac{x}{\theta}\, e^{-x^2/(2\theta)} \qquad x > 0$$

a. It can be shown that $E(X^2) = 2\theta$. Use this fact to construct an unbiased estimator of $\theta$ based on $\sum X_i^2$ (and use rules of expected value to show that it is unbiased).

b. Estimate $\theta$ from the following $n = 10$ observations on vibratory stress of a turbine blade under specified conditions:

16.88 10.23 4.59 6.66 13.68
14.23 19.87 9.40 6.51 10.95

16. Suppose the true average growth $\mu$ of one type of plant during a 1-year period is identical to that of a second type, but the variance of growth for the first type is $\sigma^2$, whereas for the second type the variance is $4\sigma^2$. Let $X_1, \ldots, X_m$ be $m$ independent growth observations on the first type [so $E(X_i) = \mu$, $V(X_i) = \sigma^2$], and let $Y_1, \ldots, Y_n$ be $n$ independent growth observations on the second type [$E(Y_i) = \mu$, $V(Y_i) = 4\sigma^2$].

a. Show that for any $\delta$ between 0 and 1, the estimator $\hat{\mu} = \delta\bar{X} + (1 - \delta)\bar{Y}$ is unbiased for $\mu$.

b. For fixed $m$ and $n$, compute $V(\hat{\mu})$, and then find the value of $\delta$ that minimizes $V(\hat{\mu})$. [Hint: Differentiate $V(\hat{\mu})$ with respect to $\delta$.]

17. In Chapter 3, we defined a negative binomial rv as the number of failures that occur before the $r$th success in a sequence of independent and identical success/failure trials. The probability mass function (pmf) of $X$ is

$$nb(x; r, p) = \binom{x + r - 1}{x} p^r (1 - p)^x \qquad x = 0, 1, 2, \ldots$$

a. Suppose that $r \ge 2$. Show that

$$\hat{p} = (r - 1)/(X + r - 1)$$

is an unbiased estimator for $p$. [Hint: Write out $E(\hat{p})$ and cancel $x + r - 1$ inside the sum.]

b. A reporter wishing to interview five individuals who support a certain candidate begins asking people whether (S) or not (F) they support the candidate. If the sequence of responses is SFFSFFFSSS, estimate $p$ = the true proportion who support the candidate.

18. Let $X_1, X_2, \ldots, X_n$ be a random sample from a pdf $f(x)$ that is symmetric about $\mu$, so that $\tilde{X}$ is an unbiased estimator of $\mu$. If $n$ is large, it can be shown that $V(\tilde{X}) \approx 1/(4n[f(\mu)]^2)$.

a. Compare $V(\tilde{X})$ to $V(\bar{X})$ when the underlying distribution is normal.

b. When the underlying pdf is Cauchy (see Example 6.7), $V(\bar{X}) = \infty$, so $\bar{X}$ is a terrible estimator. What is $V(\tilde{X})$ in this case when $n$ is large?

19. An investigator wishes to estimate the proportion of students at a certain university who have violated the honor code. Having obtained a random sample of $n$ students, she realizes that asking each, "Have you violated the honor code?" will probably result in some untruthful responses. Consider the following scheme, called a randomized response technique. The investigator makes up a deck of 100 cards, of which 50 are of type I and 50 are of type II.

Type I: Have you violated the honor code (yes or no)?

Type II: Is the last digit of your telephone number a 0, 1, or 2 (yes or no)?

Each student in the random sample is asked to mix the deck, draw a card, and answer the resulting question truthfully. Because of the irrelevant question on type II cards, a yes response no longer stigmatizes the respondent, so we assume that responses are truthful. Let $p$ denote the proportion of honor-code violators (i.e., the probability of a randomly selected student being a violator), and let $\lambda = P(\text{yes response})$. Then $\lambda$ and $p$ are related by $\lambda = .5p + (.5)(.3)$.

a. Let $Y$ denote the number of yes responses, so $Y \sim \text{Bin}(n, \lambda)$. Thus $Y/n$ is an unbiased estimator of $\lambda$. Derive an estimator for $p$ based on $Y$. If $n = 80$ and $y = 20$, what is your estimate? [Hint: Solve $\lambda = .5p + .15$ for $p$ and then substitute $Y/n$ for $\lambda$.]

b. Use the fact that $E(Y/n) = \lambda$ to show that your estimator is unbiased.

c. If there were 70 type I and 30 type II cards, what would be your estimator for $p$?

6.2 Methods of Point Estimation

The definition of unbiasedness does not in general indicate how unbiased estimators can be derived. We now discuss two "constructive" methods for obtaining point estimators: the method of moments and the method of maximum likelihood. By constructive we mean that the general definition of each type of estimator suggests explicitly how to obtain the estimator in any specific problem. Although maximum likelihood estimators are generally preferable to moment estimators because of certain efficiency properties, they often require significantly more computation than do moment estimators. It is sometimes the case that these methods yield unbiased estimators.

The Method of Moments

The basic idea of this method is to equate certain sample characteristics, such as the mean, to the corresponding population expected values. Then solving these equations for unknown parameter values yields the estimators.

DEFINITION
Let $X_1, \ldots, X_n$ be a random sample from a pmf or pdf $f(x)$. For $k = 1, 2, 3, \ldots$, the kth population moment, or kth moment of the distribution $f(x)$, is $E(X^k)$. The kth sample moment is $(1/n)\sum_{i=1}^{n} X_i^k$.

Thus the first population moment is $E(X) = \mu$, and the first sample moment is $\sum X_i/n = \bar{X}$. The second population and sample moments are $E(X^2)$ and $\sum X_i^2/n$, respectively. The population moments will be functions of any unknown parameters $\theta_1, \theta_2, \ldots$.

DEFINITION
Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution with pmf or pdf $f(x; \theta_1, \ldots, \theta_m)$, where $\theta_1, \ldots, \theta_m$ are parameters whose values are unknown. Then the moment estimators $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are obtained by equating the first $m$ sample moments to the corresponding first $m$ population moments and solving for $\theta_1, \ldots, \theta_m$.

If, for example, $m = 2$, $E(X)$ and $E(X^2)$ will be functions of $\theta_1$ and $\theta_2$. Setting $E(X) = (1/n)\sum X_i$ $(= \bar{X})$ and $E(X^2) = (1/n)\sum X_i^2$ gives two equations in $\theta_1$ and $\theta_2$. The solution then defines the estimators.

Example 6.12

Let $X_1, X_2, \ldots, X_n$ represent a random sample of service times of $n$ customers at a certain facility, where the underlying distribution is assumed exponential with parameter $\lambda$. Since there is only one parameter to be estimated, the estimator is obtained by equating $E(X)$ to $\bar{X}$. Since $E(X) = 1/\lambda$ for an exponential distribution, this gives $1/\lambda = \bar{X}$ or $\lambda = 1/\bar{X}$. The moment estimator of $\lambda$ is then $\hat{\lambda} = 1/\bar{X}$. ■

Example 6.13

Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters $\alpha$ and $\beta$. From Section 4.4, $E(X) = \alpha\beta$ and $E(X^2) = \beta^2\Gamma(\alpha + 2)/\Gamma(\alpha) = \beta^2(\alpha + 1)\alpha$. The moment estimators of $\alpha$ and $\beta$ are obtained by solving

$$\bar{X} = \alpha\beta \qquad \frac{1}{n}\sum X_i^2 = \alpha(\alpha + 1)\beta^2$$

Since $\alpha(\alpha + 1)\beta^2 = \alpha^2\beta^2 + \alpha\beta^2$ and the first equation implies $\alpha^2\beta^2 = \bar{X}^2$, the second equation becomes

$$\frac{1}{n}\sum X_i^2 = \bar{X}^2 + \alpha\beta^2$$

Now dividing each side of this second equation by the corresponding side of the first equation and substituting back gives the estimators

$$\hat{\alpha} = \frac{\bar{X}^2}{(1/n)\sum X_i^2 - \bar{X}^2} \qquad \hat{\beta} = \frac{(1/n)\sum X_i^2 - \bar{X}^2}{\bar{X}}$$

To illustrate, the survival-time data mentioned in Example 4.24 is

152 115 109 94 88 137 152 77 160 165
125 40 128 123 136 101 62 153 83 69

with $\bar{x} = 113.5$ and $(1/20)\sum x_i^2 = 14{,}087.8$. The estimates are

$$\hat{\alpha} = \frac{(113.5)^2}{14{,}087.8 - (113.5)^2} = 10.7 \qquad \hat{\beta} = \frac{14{,}087.8 - (113.5)^2}{113.5} = 10.6$$

These estimates of $\alpha$ and $\beta$ differ from the values suggested by Gross and Clark because they used a different estimation technique. ■
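The gamma moment estimators of Example 6.13 reduce to two lines of arithmetic once the first two sample moments are known. The sketch below is an illustration only; the function names are hypothetical and not from the text.

```python
def sample_moments(xs):
    """Return the first and second sample moments, (1/n)*sum(x) and (1/n)*sum(x^2)."""
    n = len(xs)
    return sum(xs) / n, sum(x * x for x in xs) / n

def gamma_moment_estimates(m1, m2):
    """Moment estimates (alpha-hat, beta-hat) for a gamma distribution.

    m1 is the first sample moment (x-bar) and m2 the second sample moment.
    """
    spread = m2 - m1 ** 2          # (1/n) * sum(x_i^2) - x-bar^2
    return m1 ** 2 / spread, spread / m1

# Rounded summary values quoted in Example 6.13.
alpha_hat, beta_hat = gamma_moment_estimates(113.5, 14087.8)
print(round(alpha_hat, 1), round(beta_hat, 1))

# Survival-time data from the example, used without rounding the moments.
survival = [152, 115, 109, 94, 88, 137, 152, 77, 160, 165,
            125, 40, 128, 123, 136, 101, 62, 153, 83, 69]
alpha_raw, beta_raw = gamma_moment_estimates(*sample_moments(survival))
```

With the rounded summaries the function gives the 10.7 and 10.6 of the example. Fed the raw data, the sample mean is 113.45 rather than 113.5, so the estimates shift slightly, to about 10.58 and 10.73.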

Example 6.14

Let $X_1, \ldots, X_n$ be a random sample from a generalized negative binomial distribution with parameters $r$ and $p$ (see Section 3.5). Since $E(X) = r(1 - p)/p$ and $V(X) = r(1 - p)/p^2$, $E(X^2) = V(X) + [E(X)]^2 = r(1 - p)(r - rp + 1)/p^2$. Equating $E(X)$ to $\bar{X}$ and $E(X^2)$ to $(1/n)\sum X_i^2$ eventually gives

$$\hat{p} = \frac{\bar{X}}{(1/n)\sum X_i^2 - \bar{X}^2} \qquad \hat{r} = \frac{\bar{X}^2}{(1/n)\sum X_i^2 - \bar{X}^2 - \bar{X}}$$

As an illustration, Reep, Pollard, and Benjamin ("Skill and Chance in Ball Games," J. of Royal Stat. Soc., 1971: 623–629) consider the negative binomial distribution as a model for the number of goals per game scored by National Hockey League teams. The data for 1966–1967 follows (420 games):

Goals       0    1    2    3    4    5    6    7    8    9    10
Frequency   29   71   82   89   65   45   24   7    4    1    3

Then,

$$\bar{x} = \sum x_i/420 = [(0)(29) + (1)(71) + \cdots + (10)(3)]/420 = 2.98$$

and

$$\sum x_i^2/420 = [(0)^2(29) + (1)^2(71) + \cdots + (10)^2(3)]/420 = 12.40$$

Thus,

$$\hat{p} = \frac{2.98}{12.40 - (2.98)^2} = .85 \qquad \hat{r} = \frac{(2.98)^2}{12.40 - (2.98)^2 - 2.98} = 16.5$$

Although $r$ by definition must be positive, the denominator of $\hat{r}$ could be negative, indicating that the negative binomial distribution is not appropriate (or that the moment estimator is flawed). ■

Maximum Likelihood Estimation

The method of maximum likelihood was first introduced by R. A. Fisher, a geneticist and statistician, in the 1920s. Most statisticians recommend this method, at least when the sample size is large, since the resulting estimators have certain desirable efficiency properties (see the proposition on page 262).


Example 6.15

A sample of ten new bike helmets manufactured by a certain company is obtained. Upon testing, it is found that the first, third, and tenth helmets are flawed, whereas the others are not. Let $p = P(\text{flawed helmet})$, i.e., $p$ is the proportion of all such helmets that are flawed. Define (Bernoulli) random variables $X_1, X_2, \ldots, X_{10}$ by

$$X_1 = \begin{cases} 1 & \text{if 1st helmet is flawed} \\ 0 & \text{if 1st helmet isn't flawed} \end{cases} \quad \cdots \quad X_{10} = \begin{cases} 1 & \text{if 10th helmet is flawed} \\ 0 & \text{if 10th helmet isn't flawed} \end{cases}$$

Then for the obtained sample, $X_1 = X_3 = X_{10} = 1$ and the other seven $X_i$'s are all zero. The probability mass function of any particular $X_i$ is $p^{x_i}(1 - p)^{1 - x_i}$, which becomes $p$ if $x_i = 1$ and $1 - p$ when $x_i = 0$. Now suppose that the conditions of various helmets are independent of one another. This implies that the $X_i$'s are independent, so their joint probability mass function is the product of the individual pmf's. Thus the joint pmf evaluated at the observed $X_i$'s is

$$f(x_1, \ldots, x_{10}; p) = p(1 - p)p \cdots p = p^3(1 - p)^7 \tag{6.4}$$

Suppose that $p = .25$. Then the probability of observing the sample that we actually obtained is $(.25)^3(.75)^7 = .002086$. If instead $p = .50$, then this probability is $(.50)^3(.50)^7 = .000977$. For what value of $p$ is the obtained sample most likely to have occurred? That is, for what value of $p$ is the joint pmf (6.4) as large as it can be? What value of $p$ maximizes (6.4)? Figure 6.5(a) shows a graph of the likelihood (6.4) as a function of $p$. It appears that the graph reaches its peak above $p = .3$ = the proportion of flawed helmets in the sample. Figure 6.5(b) shows a graph of the natural logarithm of (6.4); since $\ln[g(u)]$ is a strictly increasing function of $g(u)$, finding $u$ to maximize the function $g(u)$ is the same as finding $u$ to maximize $\ln[g(u)]$.

Figure 6.5 (a) Graph of the likelihood (joint pmf) (6.4) from Example 6.15, plotted against p (b) Graph of the natural logarithm of the likelihood, plotted against p

We can verify our visual impression by using calculus to find the value of $p$ that maximizes (6.4). Working with the natural log of the joint pmf is often easier than working with the joint pmf itself, since the joint pmf is typically a product so its logarithm will be a sum. Here

$$\ln[f(x_1, \ldots, x_{10}; p)] = \ln[p^3(1 - p)^7] = 3\ln(p) + 7\ln(1 - p) \tag{6.5}$$

Thus

$$\frac{d}{dp}\{\ln[f(x_1, \ldots, x_{10}; p)]\} = \frac{d}{dp}\{3\ln(p) + 7\ln(1 - p)\} = \frac{3}{p} + \frac{7}{1 - p}(-1) = \frac{3}{p} - \frac{7}{1 - p}$$

[the $(-1)$ comes from the chain rule in calculus]. Equating this derivative to 0 and solving for $p$ gives $3(1 - p) = 7p$, from which $3 = 10p$ and so $p = 3/10 = .30$ as conjectured. That is, our point estimate is $\hat{p} = .30$. It is called the maximum likelihood estimate because it is the parameter value that maximizes the likelihood (joint pmf) of the observed sample. In general, the second derivative should be examined to make sure a maximum has been obtained, but here this is obvious from Figure 6.5.

Suppose that rather than being told the condition of every helmet, we had only been informed that three of the ten were flawed. Then we would have the observed value of a binomial random variable $X$ = the number of flawed helmets. The pmf of $X$ is $\binom{10}{x} p^x (1 - p)^{10 - x}$. For $x = 3$, this becomes $\binom{10}{3} p^3 (1 - p)^7$. The binomial coefficient is irrelevant to the maximization, so again $\hat{p} = .30$. ■
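The calculus result for the helmet data can also be checked numerically. The sketch below evaluates the log likelihood $3\ln(p) + 7\ln(1 - p)$ on a grid of $p$ values and picks the maximizer. The grid spacing of .001 and the function name are illustrative choices, not part of the text.

```python
import math

def log_likelihood(p, flawed=3, n=10):
    """ln of the joint pmf p^flawed * (1 - p)^(n - flawed), as in (6.5)."""
    return flawed * math.log(p) + (n - flawed) * math.log(1 - p)

# Evaluate on a grid strictly inside (0, 1), where the logarithms are defined.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_likelihood)

# Likelihood values quoted in the example for p = .25 and p = .50.
lik_25 = math.exp(log_likelihood(0.25))
lik_50 = math.exp(log_likelihood(0.50))
print(p_hat, round(lik_25, 6), round(lik_50, 6))
```

A grid search is only a check here. When a closed form exists, as in this example, solving the derivative equation is exact and cheaper.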

DEFINITION
Let $X_1, X_2, \ldots, X_n$ have joint pmf or pdf

$$f(x_1, x_2, \ldots, x_n; \theta_1, \ldots, \theta_m) \tag{6.6}$$

where the parameters $\theta_1, \ldots, \theta_m$ have unknown values. When $x_1, \ldots, x_n$ are the observed sample values and (6.6) is regarded as a function of $\theta_1, \ldots, \theta_m$, it is called the likelihood function. The maximum likelihood estimates (mle's) $\hat{\theta}_1, \ldots, \hat{\theta}_m$ are those values of the $\theta_i$'s that maximize the likelihood function, so that

$$f(x_1, \ldots, x_n; \hat{\theta}_1, \ldots, \hat{\theta}_m) \ge f(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m) \quad \text{for all } \theta_1, \ldots, \theta_m$$

When the $X_i$'s are substituted in place of the $x_i$'s, the maximum likelihood estimators result.

The likelihood function tells us how likely the observed sample is as a function of the possible parameter values. Maximizing the likelihood gives the parameter values for which the observed sample is most likely to have been generated, that is, the parameter values that "agree most closely" with the observed data.

Example 6.16

Suppose $X_1, X_2, \ldots, X_n$ is a random sample from an exponential distribution with parameter $\lambda$. Because of independence, the likelihood function is a product of the individual pdf's:

$$f(x_1, \ldots, x_n; \lambda) = (\lambda e^{-\lambda x_1}) \cdot \cdots \cdot (\lambda e^{-\lambda x_n}) = \lambda^n e^{-\lambda \sum x_i}$$

The natural logarithm of the likelihood function is

$$\ln[f(x_1, \ldots, x_n; \lambda)] = n\ln(\lambda) - \lambda \sum x_i$$

Equating $(d/d\lambda)[\ln(\text{likelihood})]$ to zero results in $n/\lambda - \sum x_i = 0$, or $\lambda = n/\sum x_i = 1/\bar{x}$. Thus the mle is $\hat{\lambda} = 1/\bar{X}$; it is identical to the method of moments estimator [but it is not an unbiased estimator, since $E(1/\bar{X}) \ne 1/E(\bar{X})$]. ■
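The size of the bias in $\hat{\lambda} = 1/\bar{X}$ can be made explicit. The derivation below is not given in the text; it relies on the standard fact that the sum of $n$ independent exponential($\lambda$) variables has a gamma distribution with shape $n$ and rate $\lambda$, and it assumes $n \ge 2$ so that the expectation is finite.

```latex
T = \sum_{i=1}^{n} X_i, \qquad
E\!\left(\frac{1}{T}\right)
  = \int_0^\infty \frac{1}{t}\,\frac{\lambda^n t^{\,n-1} e^{-\lambda t}}{\Gamma(n)}\,dt
  = \frac{\lambda^n}{\Gamma(n)}\cdot\frac{\Gamma(n-1)}{\lambda^{\,n-1}}
  = \frac{\lambda}{n-1}
\quad\Longrightarrow\quad
E(\hat{\lambda}) = E\!\left(\frac{n}{T}\right) = \frac{n}{n-1}\,\lambda > \lambda
```

Under these assumptions the mle overestimates $\lambda$ on average by the factor $n/(n-1)$, which tends to 1 as $n$ grows.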


Example 6.17

Let $X_1, \ldots, X_n$ be a random sample from a normal distribution. The likelihood function is

$$f(x_1, \ldots, x_n; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_1 - \mu)^2/(2\sigma^2)} \cdot \cdots \cdot \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x_n - \mu)^2/(2\sigma^2)} = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} e^{-\sum (x_i - \mu)^2/(2\sigma^2)}$$

so

$$\ln[f(x_1, \ldots, x_n; \mu, \sigma^2)] = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum (x_i - \mu)^2$$

To find the maximizing values of $\mu$ and $\sigma^2$, we must take the partial derivatives of $\ln(f)$ with respect to $\mu$ and $\sigma^2$, equate them to zero, and solve the resulting two equations. Omitting the details, the resulting mle's are

$$\hat{\mu} = \bar{X} \qquad \hat{\sigma}^2 = \frac{\sum (X_i - \bar{X})^2}{n}$$

The mle of $\sigma^2$ is not the unbiased estimator, so two different principles of estimation (unbiasedness and maximum likelihood) yield two different estimators. ■

Example 6.18

In Chapter 3, we mentioned the use of the Poisson distribution for modeling the number of "events" that occur in a two-dimensional region. Assume that when the region $R$ being sampled has area $a(R)$, the number $X$ of events occurring in $R$ has a Poisson distribution with parameter $\lambda a(R)$ (where $\lambda$ is the expected number of events per unit area) and that nonoverlapping regions yield independent $X$'s.

Suppose an ecologist selects $n$ nonoverlapping regions $R_1, \ldots, R_n$ and counts the number of plants of a certain species found in each region. The joint pmf (likelihood) is then

$$p(x_1, \ldots, x_n; \lambda) = \frac{[\lambda \cdot a(R_1)]^{x_1} e^{-\lambda \cdot a(R_1)}}{x_1!} \cdot \cdots \cdot \frac{[\lambda \cdot a(R_n)]^{x_n} e^{-\lambda \cdot a(R_n)}}{x_n!} = \frac{[a(R_1)]^{x_1} \cdot \cdots \cdot [a(R_n)]^{x_n} \cdot \lambda^{\sum x_i} \cdot e^{-\lambda \sum a(R_i)}}{x_1! \cdot \cdots \cdot x_n!}$$

The ln(likelihood) is

$$\ln[p(x_1, \ldots, x_n; \lambda)] = \sum x_i \cdot \ln[a(R_i)] + \ln(\lambda) \cdot \sum x_i - \lambda \sum a(R_i) - \sum \ln(x_i!)$$

Taking $d/d\lambda\,[\ln(p)]$ and equating it to zero yields

$$\frac{\sum x_i}{\lambda} - \sum a(R_i) = 0$$

so

$$\lambda = \frac{\sum x_i}{\sum a(R_i)}$$

The mle is then $\hat{\lambda} = \sum X_i / \sum a(R_i)$. This is intuitively reasonable because $\lambda$ is the true density (plants per unit area), whereas $\hat{\lambda}$ is the sample density since $\sum a(R_i)$ is just the total area sampled. Because $E(X_i) = \lambda \cdot a(R_i)$, the estimator is unbiased.

Sometimes an alternative sampling procedure is used. Instead of fixing regions to be sampled, the ecologist will select $n$ points in the entire region


of interest and let $y_i$ = the distance from the ith point to the nearest plant. The cumulative distribution function (cdf) of Y = distance to the nearest plant is

$$F_Y(y) = P(Y \le y) = 1 - P(Y > y) = 1 - P(\text{no plants in a circle of radius } y) = 1 - \frac{e^{-\lambda\pi y^2}(\lambda\pi y^2)^0}{0!} = 1 - e^{-\lambda \cdot \pi y^2}$$

Taking the derivative of $F_Y(y)$ with respect to y yields

$$f_Y(y; \lambda) = \begin{cases} 2\pi\lambda y\, e^{-\lambda\pi y^2} & y \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

If we now form the likelihood $f_Y(y_1; \lambda) \cdot \ldots \cdot f_Y(y_n; \lambda)$, differentiate ln(likelihood), and so on, the resulting mle is

$$\hat{\lambda} = \frac{n}{\pi \sum Y_i^2} = \frac{\text{number of plants observed}}{\text{total area sampled}}$$

which is also a sample density. It can be shown that in a sparse environment (small $\lambda$), the distance method is in a certain sense better, whereas in a dense environment the first sampling method is better. ■

Example 6.19

Let $X_1, \ldots, X_n$ be a random sample from a Weibull pdf

$$f(x; \alpha, \beta) = \begin{cases} \dfrac{\alpha}{\beta^{\alpha}} \cdot x^{\alpha-1} \cdot e^{-(x/\beta)^{\alpha}} & x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

Writing the likelihood and ln(likelihood), then setting both $(\partial/\partial\alpha)[\ln(f)] = 0$ and $(\partial/\partial\beta)[\ln(f)] = 0$, yields the equations

$$\alpha = \left[\frac{\sum x_i^{\alpha} \cdot \ln(x_i)}{\sum x_i^{\alpha}} - \frac{\sum \ln(x_i)}{n}\right]^{-1} \qquad \beta = \left(\frac{\sum x_i^{\alpha}}{n}\right)^{1/\alpha}$$

These two equations cannot be solved explicitly to give general formulas for the mle’s $\hat{\alpha}$ and $\hat{\beta}$. Instead, for each sample $x_1, \ldots, x_n$, the equations must be solved using an iterative numerical procedure. Even moment estimators of $\alpha$ and $\beta$ are somewhat complicated (see Exercise 21). ■

Estimating Functions of Parameters

In Example 6.17, we obtained the mle of $\sigma^2$ when the underlying distribution is normal. The mle of $\sigma = \sqrt{\sigma^2}$, as well as many other mle’s, can be easily derived using the following proposition.

The Invariance Principle

PROPOSITION

Let $\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m$ be the mle’s of the parameters $\theta_1, \theta_2, \ldots, \theta_m$. Then the mle of any function $h(\theta_1, \theta_2, \ldots, \theta_m)$ of these parameters is the function $h(\hat{\theta}_1, \hat{\theta}_2, \ldots, \hat{\theta}_m)$ of the mle’s.
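Returning briefly to the Weibull likelihood equations of Example 6.19: since they have no closed-form solution, some numerical routine is needed. The following is a minimal sketch rather than the procedure the text has in mind. It assumes strictly positive observations that are not all equal, solves the $\alpha$ equation by bisection (a simple stand-in for Newton's method), and then obtains $\beta$ from the second equation. The function name `weibull_mle` and the sample values are hypothetical.

```python
import math

def weibull_mle(xs, tol=1e-10):
    """Solve the Weibull likelihood equations for (alpha_hat, beta_hat).

    Illustrative sketch: assumes every x > 0 and the x's are not all equal.
    """
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / n

    def g(a):
        # Rearranged alpha equation, so that g(a) = 0 at the mle:
        # g(a) = sum(x^a * ln x) / sum(x^a) - mean(ln x) - 1/a
        powers = [x ** a for x in xs]
        weighted = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        return weighted - mean_log - 1.0 / a

    # g is increasing in a, negative near 0 and eventually positive,
    # so widen the upper bracket until the sign changes.
    lo, hi = 1e-6, 1.0
    while g(hi) < 0:
        hi *= 2.0

    # Plain bisection on the bracket [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid

    alpha = (lo + hi) / 2.0
    beta = (sum(x ** alpha for x in xs) / n) ** (1.0 / alpha)
    return alpha, beta

# Hypothetical data, used only to exercise the solver.
sample = [12.1, 9.8, 15.3, 11.4, 8.7, 13.9, 10.5, 14.2]
alpha_hat, beta_hat = weibull_mle(sample)
print(alpha_hat, beta_hat)
```

By the invariance principle just stated, the same pair of estimates can then be substituted into any function of $\alpha$ and $\beta$ to obtain the mle of that function.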


Example 6.20 (Example 6.17 continued)

In the normal case, the mle’s of $\mu$ and $\sigma^2$ are $\hat{\mu} = \bar{X}$ and $\hat{\sigma}^2 = \sum (X_i - \bar{X})^2/n$. To obtain the mle of the function $h(\mu, \sigma^2) = \sqrt{\sigma^2} = \sigma$, substitute the mle’s into the function:

$$\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \left[\frac{1}{n}\sum (X_i - \bar{X})^2\right]^{1/2}$$

The mle of $\sigma$ is not the sample standard deviation S, though they are close unless n is quite small. ■

Example 6.21 (Example 6.19 continued)

The mean value of an rv X that has a Weibull distribution is

$$\mu = \beta \cdot \Gamma(1 + 1/\alpha)$$

The mle of $\mu$ is therefore $\hat{\mu} = \hat{\beta}\,\Gamma(1 + 1/\hat{\alpha})$, where $\hat{\alpha}$ and $\hat{\beta}$ are the mle’s of $\alpha$ and $\beta$. In particular, $\bar{X}$ is not the mle of $\mu$, though it is an unbiased estimator. At least for large n, $\hat{\mu}$ is a better estimator than $\bar{X}$.

For the data given in Example 6.3, the mle’s of the Weibull parameters are $\hat{\alpha} = 11.9731$ and $\hat{\beta} = 77.0153$, from which $\hat{\mu} = 73.80$. This estimate is quite close to the sample mean 73.88. ■

Large Sample Behavior of the MLE

Although the principle of maximum likelihood estimation has considerable intuitive appeal, the following proposition provides additional rationale for the use of mle’s.

PROPOSITION

Under very general conditions on the joint distribution of the sample, when the sample size n is large, the maximum likelihood estimator of any parameter $\theta$ is approximately unbiased $[E(\hat{\theta}) \approx \theta]$ and has variance that is either as small as or nearly as small as can be achieved by any estimator. Stated another way, the mle $\hat{\theta}$ is approximately the MVUE of $\theta$.

Because of this result and the fact that calculus-based techniques can usually be used to derive the mle’s (though often numerical methods, such as Newton’s method, are necessary), maximum likelihood estimation is the most widely used estimation technique among statisticians. Many of the estimators used in the remainder of the book are mle’s. Obtaining an mle, however, does require that the underlying distribution be specified.

Some Complications

Sometimes calculus cannot be used to obtain mle’s.

Example 6.22

Suppose my waiting time for a bus is uniformly distributed on $[0, \theta]$ and the results $x_1, \ldots, x_n$ of a random sample from this distribution have been observed. Since $f(x; \theta) = 1/\theta$ for $0 \le x \le \theta$ and 0 otherwise,

$$f(x_1, \ldots, x_n; \theta) = \begin{cases} \dfrac{1}{\theta^n} & 0 \le x_1 \le \theta, \ldots, 0 \le x_n \le \theta \\ 0 & \text{otherwise} \end{cases}$$


As long as $\max(x_i) \le \theta$, the likelihood is $1/\theta^n$, which is positive, but as soon as $\theta < \max(x_i)$, the likelihood drops to 0. This is illustrated in Figure 6.6. Calculus will not work because the maximum of the likelihood occurs at a point of discontinuity, but the figure shows that $\hat{\theta} = \max(X_i)$. Thus if my waiting times are 2.3, 3.7, 1.5, .4, and 3.2, then the mle is $\hat{\theta} = 3.7$. From Example 6.4, the mle is not unbiased. ■

Figure 6.6 The likelihood function for Example 6.22 (likelihood plotted against $\theta$, with $\max(x_i)$ marked on the $\theta$ axis)

Example 6.23

A method that is often used to estimate the size of a wildlife population involves performing a capture/recapture experiment. In this experiment, an initial sample of M animals is captured, each of these animals is tagged, and the animals are then returned to the population. After allowing enough time for the tagged individuals to mix into the population, another sample of size n is captured. With X = the number of tagged animals in the second sample, the objective is to use the observed x to estimate the population size N.

The parameter of interest is $\theta = N$, which can assume only integer values, so even after determining the likelihood function (pmf of X here), using calculus to obtain N would present difficulties. If we think of a success as a previously tagged animal being recaptured, then sampling is without replacement from a population containing M successes and N − M failures, so that X is a hypergeometric rv and the likelihood function is

$$p(x; N) = h(x; n, M, N) = \frac{\dbinom{M}{x} \cdot \dbinom{N-M}{n-x}}{\dbinom{N}{n}}$$

The integer-valued nature of N notwithstanding, it would be difficult to take the derivative of p(x; N). However, if we consider the ratio of p(x; N) to p(x; N − 1), we have

$$\frac{p(x; N)}{p(x; N-1)} = \frac{(N - M) \cdot (N - n)}{N(N - M - n + x)}$$

This ratio is larger than 1 if and only if (iff) $N < Mn/x$. The value of N for which p(x; N) is maximized is therefore the largest integer less than Mn/x. If we use standard mathematical notation [r] for the largest integer less than or equal to r, the mle of N is $\hat{N} = [Mn/x]$. As an illustration, if M = 200 fish are taken from a lake and tagged, and subsequently n = 100 fish are recaptured, and among the 100 there are x = 11 tagged fish, then $\hat{N} = [(200)(100)/11] = [1818.18] = 1818$. The estimate is actually rather intuitive; x/n is the proportion of the recaptured sample that is tagged, whereas M/N is the proportion of the entire population that is tagged. The estimate is obtained by equating these two proportions (estimating a population proportion by a sample proportion). ■
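The closed form $[Mn/x]$ for the capture/recapture mle can be checked by maximizing the hypergeometric likelihood directly over integer values of N. The sketch below does this by brute force with exact rational arithmetic; the function names and the search bound `search_limit` are assumptions made for illustration, not part of the text.

```python
from fractions import Fraction
from math import comb

def hypergeom_likelihood(x, n, M, N):
    """Exact p(x; N) = C(M, x) * C(N - M, n - x) / C(N, n)."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

def capture_recapture_mle(x, n, M, search_limit):
    """Brute-force maximizer of the likelihood over integer N.

    N must be at least M + (n - x): the M tagged animals plus the
    n - x untagged animals seen in the second sample.
    search_limit is an assumed upper bound for the search.
    """
    candidates = range(M + n - x, search_limit + 1)
    return max(candidates, key=lambda N: hypergeom_likelihood(x, n, M, N))

M, n, x = 200, 100, 11           # values from the fish illustration
N_hat = capture_recapture_mle(x, n, M, search_limit=5000)
print(N_hat, (M * n) // x)       # brute force vs. closed form [Mn/x]
```

For the fish illustration, both the brute-force search and the closed form give 1818, in agreement with the value computed in the example.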


Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a pdf $f(x; \theta)$ that is symmetric about $\theta$ but that the investigator is unsure of the form of the f function. It is then desirable to use an estimator that is robust, that is, one that performs well for a wide variety of underlying pdf’s. One such estimator is a trimmed mean. In recent years, statisticians have proposed another type of estimator, called an M-estimator, based on a generalization of maximum likelihood estimation. Instead of maximizing the log likelihood $\sum \ln[f(x_i; \theta)]$ for a specified f, one maximizes $\sum \rho(x_i; \theta)$. The “objective function” $\rho$ is selected to yield an estimator with good robustness properties. The book by David Hoaglin et al. (see the bibliography) contains a good exposition on this subject.
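To make the idea of maximizing $\sum \rho(x_i; \theta)$ concrete, here is a small sketch under stated assumptions. The two objective functions are illustrative choices, not ones prescribed by the text: $\rho(x; \theta) = -(x - \theta)^2$, whose maximizer is the sample mean, and $\rho(x; \theta) = -|x - \theta|$, whose maximizer is the sample median. The helper `m_estimate` is hypothetical and uses a ternary search, which is valid here only because both summed objectives are concave in $\theta$. The five waiting times from Example 6.22 are reused purely as convenient numbers.

```python
from statistics import mean, median

def m_estimate(xs, rho, lo, hi, tol=1e-6):
    """Maximize sum(rho(x, theta)) over theta in [lo, hi] by ternary search.

    Assumes the summed objective is concave in theta, which holds for
    the two illustrative choices of rho used below.
    """
    def objective(theta):
        return sum(rho(x, theta) for x in xs)

    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1      # the maximizer lies to the right of m1
        else:
            hi = m2      # the maximizer lies to the left of m2
    return (lo + hi) / 2

def squared(x, theta):
    return -(x - theta) ** 2     # illustrative rho: leads to the mean

def absolute(x, theta):
    return -abs(x - theta)       # illustrative rho: leads to the median

data = [2.3, 3.7, 1.5, 0.4, 3.2]
est_sq = m_estimate(data, squared, min(data), max(data))
est_abs = m_estimate(data, absolute, min(data), max(data))
print(est_sq, mean(data))
print(est_abs, median(data))
```

The second choice is far less sensitive to a single extreme observation than the first, which is the kind of robustness property the choice of $\rho$ is meant to control.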

EXERCISES Section 6.2 (20–30)

20. A diagnostic test for a certain disease is applied to n individuals known to not have the disease. Let X = the number among the n test results that are positive (indicating presence of the disease, so X is the number of false positives) and p = the probability that a disease-free individual’s test result is positive (i.e., p is the true proportion of test results from disease-free individuals that are positive). Assume that only X is available rather than the actual sequence of test results.
a. Derive the maximum likelihood estimator of p. If n = 20 and x = 3, what is the estimate?
b. Is the estimator of part (a) unbiased?
c. If n = 20 and x = 3, what is the mle of the probability $(1 - p)^5$ that none of the next five tests done on disease-free individuals are positive?

21. Let X have a Weibull distribution with parameters $\alpha$ and $\beta$, so

$$E(X) = \beta \cdot \Gamma(1 + 1/\alpha)$$

$$V(X) = \beta^2\{\Gamma(1 + 2/\alpha) - [\Gamma(1 + 1/\alpha)]^2\}$$

a. Based on a random sample $X_1, \ldots, X_n$, write equations for the method of moments estimators of $\beta$ and $\alpha$. Show that, once the estimate of $\alpha$ has been obtained, the estimate of $\beta$ can be found from a table of the gamma function and that the estimate of $\alpha$ is the solution to a complicated equation involving the gamma function.
b. If n = 20, $\bar{x} = 28.0$, and $\sum x_i^2 = 16{,}500$, compute the estimates. [Hint: $[\Gamma(1.2)]^2/\Gamma(1.4) = .95$.]

22. Let X denote the proportion of allotted time that a randomly selected student spends working on a certain aptitude test. Suppose the pdf of X is

$$f(x; \theta) = \begin{cases} (\theta + 1)x^{\theta} & 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$$

where $-1 < \theta$. A random sample of ten students yields data $x_1 = .92$, $x_2 = .79$, $x_3 = .90$, $x_4 = .65$, $x_5 = .86$, $x_6 = .47$, $x_7 = .73$, $x_8 = .97$, $x_9 = .94$, $x_{10} = .77$.
a. Use the method of moments to obtain an estimator of $\theta$, and then compute the estimate for this data.
b. Obtain the maximum likelihood estimator of $\theta$, and then compute the estimate for the given data.

23. Two different computer systems are monitored for a total of n weeks. Let $X_i$ denote the number of breakdowns of the first system during the ith week, and suppose the $X_i$’s are independent and drawn from a Poisson distribution with parameter $\mu_1$. Similarly, let $Y_i$ denote the number of breakdowns of the second system during the ith week, and assume independence with each $Y_i$ Poisson with parameter $\mu_2$. Derive the mle’s of $\mu_1$, $\mu_2$, and $\mu_1 - \mu_2$. [Hint: Using independence, write the joint pmf (likelihood) of the $X_i$’s and $Y_i$’s together.]

24. A vehicle with a particular defect in its emission control system is taken to a succession of randomly selected mechanics until r = 3 of them have correctly diagnosed the problem. Suppose that this requires diagnoses by 20 different mechanics (so there were 17 incorrect diagnoses). Let p = P(correct diagnosis), so p is the proportion of all mechanics who would correctly diagnose the problem. What is the mle of p? Is it the same as the mle if a random sample of 20 mechanics results in 3 correct diagnoses? Explain. How does the mle compare to the estimate resulting from the use of the unbiased estimator given in Exercise 17?

25. The shear strength of each of ten test spot welds is determined, yielding the following data (psi):

392 376 401 367 389 362 409 415 358 375

a. Assuming that shear strength is normally distributed, estimate the true average shear strength and standard deviation of shear strength using the method of maximum likelihood.
b. Again assuming a normal distribution, estimate the strength value below which 95% of all welds will have their strengths. [Hint: What is the 95th percentile in terms of $\mu$ and $\sigma$? Now use the invariance principle.]


26. Refer to Exercise 25. Suppose we decide to examine another test spot weld. Let X = shear strength of the weld. Use the given data to obtain the mle of $P(X \le 400)$. [Hint: $P(X \le 400) = \Phi((400 - \mu)/\sigma)$.]

27. Let $X_1, \ldots, X_n$ be a random sample from a gamma distribution with parameters $\alpha$ and $\beta$.
a. Derive the equations whose solutions yield the maximum likelihood estimators of $\alpha$ and $\beta$. Do you think they can be solved explicitly?
b. Show that the mle of $\mu = \alpha\beta$ is $\hat{\mu} = \bar{X}$.

28. Let $X_1, X_2, \ldots, X_n$ represent a random sample from the Rayleigh distribution with density function given in Exercise 15. Determine
a. The maximum likelihood estimator of $\theta$, and then calculate the estimate for the vibratory stress data given in that exercise. Is this estimator the same as the unbiased estimator suggested in Exercise 15?
b. The mle of the median of the vibratory stress distribution. [Hint: First express the median in terms of $\theta$.]

29. Consider a random sample $X_1, X_2, \ldots, X_n$ from the shifted exponential pdf

$$f(x; \lambda, \theta) = \begin{cases} \lambda e^{-\lambda(x - \theta)} & x \ge \theta \\ 0 & \text{otherwise} \end{cases}$$

Taking $\theta = 0$ gives the pdf of the exponential distribution considered previously (with positive density to the right of zero). An example of the shifted exponential distribution appeared in Example 4.5, in which the variable of interest was time headway in traffic flow and $\theta = .5$ was the minimum possible time headway.
a. Obtain the maximum likelihood estimators of $\theta$ and $\lambda$.
b. If n = 10 time headway observations are made, resulting in the values 3.11, .64, 2.55, 2.20, 5.44, 3.42, 10.39, 8.93, 17.82, and 1.30, calculate the estimates of $\theta$ and $\lambda$.

30. At time t = 0, 20 identical components are tested. The lifetime distribution of each is exponential with parameter $\lambda$. The experimenter then leaves the test facility unmonitored. On his return 24 hours later, the experimenter immediately terminates the test after noticing that y = 15 of the 20 components are still in operation (so 5 have failed). Derive the mle of $\lambda$. [Hint: Let Y = the number that survive 24 hours. Then $Y \sim \text{Bin}(n, p)$. What is the mle of p? Now notice that $p = P(X_i \ge 24)$, where $X_i$ is exponentially distributed. This relates $\lambda$ to p, so the former can be estimated once the latter has been.]

SUPPLEMENTARY EXERCISES (31–38)

31. An estimator $\hat{\theta}$ is said to be consistent if for any $\epsilon > 0$, $P(|\hat{\theta} - \theta| \ge \epsilon) \to 0$ as $n \to \infty$. That is, $\hat{\theta}$ is consistent if, as the sample size gets larger, it is less and less likely that $\hat{\theta}$ will be further than $\epsilon$ from the true value of $\theta$. Show that $\bar{X}$ is a consistent estimator of $\mu$ when $\sigma^2 < \infty$ by using Chebyshev’s inequality from Exercise 44 of Chapter 3. [Hint: The inequality can be rewritten in the form

$$P(|Y - \mu_Y| \ge \epsilon) \le \sigma_Y^2/\epsilon^2$$

Now identify Y with $\bar{X}$.]

32. a. Let $X_1, \ldots, X_n$ be a random sample from a uniform distribution on $[0, \theta]$. Then the mle of $\theta$ is $\hat{\theta} = Y = \max(X_i)$. Use the fact that $Y \le y$ iff each $X_i \le y$ to derive the cdf of Y. Then show that the pdf of $Y = \max(X_i)$ is

$$f_Y(y) = \begin{cases} \dfrac{n y^{n-1}}{\theta^n} & 0 \le y \le \theta \\ 0 & \text{otherwise} \end{cases}$$

b. Use the result of part (a) to show that the mle is biased but that $(n + 1)\max(X_i)/n$ is unbiased.

33. At time t = 0, there is one individual alive in a certain population. A pure birth process then unfolds as follows. The time until the first birth is exponentially distributed with parameter $\lambda$. After the first birth, there are two individuals alive. The time until the first gives birth again is exponential with parameter $\lambda$, and similarly for the second individual. Therefore, the time until the next birth is the minimum of two exponential ($\lambda$) variables, which is exponential with parameter $2\lambda$. Similarly, once the second birth has occurred, there are three individuals alive, so the time until the next birth is an exponential rv with parameter $3\lambda$, and so on (the memoryless property of the exponential distribution is being used here). Suppose the process is observed until the sixth birth has occurred and the successive birth times are 25.2, 41.7, 51.2, 55.5, 59.5, 61.8 (from which you should calculate the times between successive births). Derive the mle of $\lambda$. [Hint: The likelihood is a product of exponential terms.]

34. The mean squared error of an estimator $\hat{\theta}$ is $MSE(\hat{\theta}) = E(\hat{\theta} - \theta)^2$. If $\hat{\theta}$ is unbiased, then $MSE(\hat{\theta}) = V(\hat{\theta})$, but in general $MSE(\hat{\theta}) = V(\hat{\theta}) + (\text{bias})^2$. Consider the estimator $\hat{\sigma}^2 = KS^2$, where $S^2$ = sample variance. What value of


K minimizes the mean squared error of this estimator when the population distribution is normal? [Hint: It can be shown that

$$E[(S^2)^2] = (n + 1)\sigma^4/(n - 1)$$

In general, it is difficult to find $\hat{\theta}$ to minimize $MSE(\hat{\theta})$, which is why we look only at unbiased estimators and minimize $V(\hat{\theta})$.]

35. Let $X_1, \ldots, X_n$ be a random sample from a pdf that is symmetric about $\mu$. An estimator for $\mu$ that has been found to perform well for a variety of underlying distributions is the Hodges–Lehmann estimator. To define it, first compute for each $i \le j$ and each $j = 1, 2, \ldots, n$ the pairwise average $\bar{X}_{i,j} = (X_i + X_j)/2$. Then the estimator is $\hat{\mu}$ = the median of the $\bar{X}_{i,j}$’s. Compute the value of this estimate using the data of Exercise 44 of Chapter 1. [Hint: Construct a square table with the $x_i$’s listed on the left margin and on top. Then compute averages on and above the diagonal.]

36. When the population distribution is normal, the statistic median$\{|X_1 - \tilde{X}|, \ldots, |X_n - \tilde{X}|\}/.6745$ can be used to estimate $\sigma$. This estimator is more resistant to the effects of outliers (observations far from the bulk of the data) than is the sample standard deviation. Compute both the corresponding point estimate and s for the data of Example 6.2.

37. When the sample standard deviation S is based on a random sample from a normal population distribution, it can be shown that

$$E(S) = \sqrt{2/(n - 1)}\;\Gamma(n/2)\,\sigma/\Gamma((n - 1)/2)$$

Use this to obtain an unbiased estimator for $\sigma$ of the form cS. What is c when n = 20?

38. Each of n specimens is to be weighed twice on the same scale. Let $X_i$ and $Y_i$ denote the two observed weights for the ith specimen. Suppose $X_i$ and $Y_i$ are independent of one another, each normally distributed with mean value $\mu_i$ (the true weight of specimen i) and variance $\sigma^2$.
a. Show that the maximum likelihood estimator of $\sigma^2$ is $\hat{\sigma}^2 = \sum (X_i - Y_i)^2/(4n)$. [Hint: If $\bar{z} = (z_1 + z_2)/2$, then $\sum (z_i - \bar{z})^2 = (z_1 - z_2)^2/2$.]
b. Is the mle $\hat{\sigma}^2$ an unbiased estimator of $\sigma^2$? Find an unbiased estimator of $\sigma^2$. [Hint: For any rv Z, $E(Z^2) = V(Z) + [E(Z)]^2$. Apply this to $Z = X_i - Y_i$.]

Bibliography

DeGroot, Morris, and Mark Schervish, Probability and Statistics (3rd ed.), Addison-Wesley, Boston, MA, 2002. Includes an excellent discussion of both general properties and methods of point estimation; of particular interest are examples showing how general principles and methods can yield unsatisfactory estimators in particular situations.

Devore, Jay, and Kenneth Berk, Modern Mathematical Statistics with Applications, Thomson-Brooks/Cole, Belmont, CA, 2007. The exposition is a bit more comprehensive and sophisticated than that of the current book.

Efron, Bradley, and Robert Tibshirani, An Introduction to the Bootstrap, Chapman and Hall, New York, 1993. The bible of the bootstrap.

Hoaglin, David, Frederick Mosteller, and John Tukey, Understanding Robust and Exploratory Data Analysis, Wiley, New York, 1983. Contains several good chapters on robust point estimation, including one on M-estimation.

Rice, John, Mathematical Statistics and Data Analysis (3rd ed.), Thomson-Brooks/Cole, Belmont, CA, 2007. A nice blending of statistical theory and data.


7 Statistical Intervals Based on a Single Sample

INTRODUCTION

A point estimate, because it is a single number, by itself provides no information about the precision and reliability of estimation. Consider, for example, using the statistic $\bar{X}$ to calculate a point estimate for the true average breaking strength (g) of paper towels of a certain brand, and suppose that $\bar{x} = 9322.7$. Because of sampling variability, it is virtually never the case that $\bar{x} = \mu$. The point estimate says nothing about how close it might be to $\mu$. An alternative to reporting a single sensible value for the parameter being estimated is to calculate and report an entire interval of plausible values: an interval estimate or confidence interval (CI). A confidence interval is always calculated by first selecting a confidence level, which is a measure of the degree of reliability of the interval. A confidence interval with a 95% confidence level for the true average breaking strength might have a lower limit of 9162.5 and an upper limit of 9482.9. Then at the 95% confidence level, any value of $\mu$ between 9162.5 and 9482.9 is plausible. A confidence level of 95% implies that 95% of all samples would give an interval that includes $\mu$, or whatever other parameter is being estimated, and only 5% of all samples would yield an erroneous interval. The most frequently used confidence levels are 95%, 99%, and 90%. The higher the confidence level, the more strongly we believe that the value of the parameter being estimated lies within the interval (an interpretation of any particular confidence level will be given shortly).

Information about the precision of an interval estimate is conveyed by the width of the interval. If the confidence level is high and the resulting interval is quite narrow, our knowledge of the value of the parameter is reasonably precise. A very wide confidence interval, however, gives the message that there is


a great deal of uncertainty concerning the value of what we are estimating. Figure 7.1 shows 95% confidence intervals for true average breaking strengths of two different brands of paper towels. One of these intervals suggests precise knowledge about $\mu$, whereas the other suggests a very wide range of plausible values.

Figure 7.1 CIs indicating precise (brand 1) and imprecise (brand 2) information about $\mu$ (one interval per brand, each drawn on a strength axis)

7.1 Basic Properties of Confidence Intervals

The basic concepts and properties of confidence intervals (CIs) are most easily introduced by first focusing on a simple, albeit somewhat unrealistic, problem situation. Suppose that the parameter of interest is a population mean $\mu$ and that

1. The population distribution is normal

2. The value of the population standard deviation $\sigma$ is known

Normality of the population distribution is often a reasonable assumption. However, if the value of $\mu$ is unknown, it is typically implausible that the value of $\sigma$ would be available (knowledge of a population’s center typically precedes information concerning spread). We’ll develop methods based on less restrictive assumptions in Sections 7.2 and 7.3.

Example 7.1

Industrial engineers who specialize in ergonomics are concerned with designing workspace and worker-operated devices so as to achieve high productivity and comfort. The article “Studies on Ergonomically Designed Alphanumeric Keyboards” (Human Factors, 1985: 175–187) reports on a study of preferred height for an experimental keyboard with large forearm–wrist support. A sample of $n = 31$ trained typists was selected, and the preferred keyboard height was determined for each typist. The resulting sample average preferred height was $\bar{x} = 80.0$ cm. Assuming that the preferred height is normally distributed with $\sigma = 2.0$ cm (a value suggested by data in the article), obtain a CI for $\mu$, the true average preferred height for the population of all experienced typists. ■

The actual sample observations $x_1, x_2, \ldots, x_n$ are assumed to be the result of a random sample $X_1, \ldots, X_n$ from a normal distribution with mean value $\mu$ and standard deviation $\sigma$. The results described in Chapter 5 then imply that, irrespective of the sample size n, the sample mean $\bar{X}$ is normally distributed with expected value $\mu$ and standard deviation $\sigma/\sqrt{n}$. Standardizing $\bar{X}$ by first subtracting its expected value and then dividing by its standard deviation yields the standard normal variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \qquad (7.1)$$


Because the area under the standard normal curve between −1.96 and 1.96 is .95,

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = .95 \qquad (7.2)$$

Now let’s manipulate the inequalities inside the parentheses in (7.2) so that they appear in the equivalent form $l < \mu < u$, where the endpoints l and u involve $\bar{X}$ and $\sigma/\sqrt{n}$. This is achieved through the following sequence of operations, each yielding inequalities equivalent to the original ones.

1. Multiply through by $\sigma/\sqrt{n}$:

$$-1.96 \cdot \frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

2. Subtract $\bar{X}$ from each term:

$$-\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

3. Multiply through by −1 to eliminate the minus sign in front of $\mu$ (which reverses the direction of each inequality):

$$\bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

that is,

$$\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}$$

The equivalence of each set of inequalities to the original set implies that

$$P\left(\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) = .95 \qquad (7.3)$$

The event inside the parentheses in (7.3) has a somewhat unfamiliar appearance; previously, the random quantity has appeared in the middle with constants on both ends, as in $a \le Y \le b$. In (7.3) the random quantity appears on the two ends, whereas the unknown constant $\mu$ appears in the middle. To interpret (7.3), think of a random interval having left endpoint $\bar{X} - 1.96 \cdot \sigma/\sqrt{n}$ and right endpoint $\bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. In interval notation, this becomes

$$\left(\bar{X} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\; \bar{X} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) \qquad (7.4)$$

The interval (7.4) is random because the two endpoints of the interval involve a random variable. It is centered at the sample mean $\bar{X}$ and extends $1.96\sigma/\sqrt{n}$ to each side of $\bar{X}$. Thus the interval’s width is $2 \cdot (1.96) \cdot \sigma/\sqrt{n}$, which is not random; only the location of the interval (its midpoint $\bar{X}$) is random (Figure 7.2). Now (7.3) can be paraphrased as “the probability is .95 that the random interval (7.4) includes or covers the true value of $\mu$.” Before any experiment is performed and any data is gathered, it is quite likely that $\mu$ will lie inside the interval (7.4).

Figure 7.2 The random interval (7.4) centered at $\bar{X}$ (endpoints $\bar{X} - 1.96\sigma/\sqrt{n}$ and $\bar{X} + 1.96\sigma/\sqrt{n}$, each a distance $1.96\sigma/\sqrt{n}$ from $\bar{X}$)
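The probability statement (7.3) can be explored by simulation: draw many samples from a normal population with known $\sigma$, form the interval (7.4) for each, and record how often the interval covers $\mu$. The sketch below is illustrative only. The function names, the seed, the number of replications, and the choice $\mu = 80.0$ as the "true" mean are assumptions made for the simulation; $\sigma = 2.0$ and $n = 31$ echo the keyboard-height example.

```python
import math
import random

def z_interval(xbar, sigma, n, z=1.96):
    """Endpoints of the interval (7.4) for a given sample mean."""
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def coverage(mu, sigma, n, reps, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # One simulated sample mean from a normal(mu, sigma) population.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        lower, upper = z_interval(xbar, sigma, n)
        hits += lower < mu < upper
    return hits / reps

# mu = 80.0 is an assumed "true" value chosen only for the simulation.
rate = coverage(mu=80.0, sigma=2.0, n=31, reps=20000)
print(rate)  # should land close to 0.95
```

The half-width returned by `z_interval` depends only on $\sigma$ and n, which mirrors the observation above that the width of (7.4) is not random while its location is.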


DEFINITION

If, after observing $X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n$, we compute the observed sample mean $\bar{x}$ and then substitute $\bar{x}$ into (7.4) in place of $\bar{X}$, the resulting fixed interval is called a 95% confidence interval for $\mu$. This CI can be expressed either as

$$\left(\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}},\; \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}}\right) \text{ is a 95\% CI for } \mu$$

or as

$$\bar{x} - 1.96 \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \cdot \frac{\sigma}{\sqrt{n}} \quad \text{with 95\% confidence}$$

A concise expression for the interval is $\bar{x} \pm 1.96 \cdot \sigma/\sqrt{n}$, where − gives the left endpoint (lower limit) and + gives the right endpoint (upper limit).

Example 7.2 (Example 7.1 continued)

The quantities needed for computation of the 95% CI for true average preferred height are $\sigma = 2.0$, $n = 31$, and $\bar{x} = 80.0$. The resulting interval is

$$\bar{x} \pm 1.96 \cdot \frac{\sigma}{\sqrt{n}} = 80.0 \pm (1.96)\frac{2.0}{\sqrt{31}} = 80.0 \pm .7 = (79.3, 80.7)$$

That is, we can be highly confident, at the 95% confidence level, that $79.3 < \mu < 80.7$. This interval is relatively narrow, indicating that $\mu$ has been rather precisely estimated. ■

Interpreting a Confidence Level

The confidence level 95% for the interval just defined was inherited from the probability .95 for the random interval (7.4). Intervals having other levels of confidence will be introduced shortly. For now, though, consider how 95% confidence can be interpreted.

Because we started with an event whose probability was .95 (that the random interval (7.4) would capture the true value of $\mu$) and then used the data in Example 7.1 to compute the CI (79.3, 80.7), it is tempting to conclude that $\mu$ is within this fixed interval with probability .95. But by substituting $\bar{x} = 80.0$ for $\bar{X}$, all randomness disappears; the interval (79.3, 80.7) is not a random interval, and $\mu$ is a constant (unfortunately unknown to us). It is therefore incorrect to write the statement $P(\mu \text{ lies in } (79.3, 80.7)) = .95$.

A correct interpretation of “95% confidence” relies on the long-run relative frequency interpretation of probability: To say that an event A has probability .95 is to say that if the experiment on which A is defined is performed over and over again, in the long run A will occur 95% of the time. Suppose we obtain another sample of typists’ preferred heights and compute another 95% interval. Then we consider repeating this for a third sample, a fourth sample, a fifth sample, and so on. Let A be the event that $\bar{X} - 1.96 \cdot \sigma/\sqrt{n} < \mu < \bar{X} + 1.96 \cdot \sigma/\sqrt{n}$. Since $P(A) = .95$, in the long run 95% of our computed CIs will contain $\mu$. This is illustrated in Figure 7.3, where the vertical line cuts the measurement axis at the true (but unknown) value of $\mu$. Notice that 7 of the 100 intervals shown fail to contain $\mu$. In the long run, only 5% of the intervals so constructed would fail to contain $\mu$.

According to this interpretation, the confidence level 95% is not so much a statement about any particular interval such as (79.3, 80.7). Instead it pertains to what would happen if a very large number of like intervals were to be constructed

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

7.1 Basic Properties of Confidence Intervals 271

using the same CI formula. Although this may seem unsatisfactory, the root of the difficulty lies with our interpretation of probability—it applies to a long sequence of replications of an experiment rather than just a single replication. There is another approach to the construction and interpretation of CIs that uses the notion of sub- jective probability and Bayes’ theorem, but the technical details are beyond the scope of this text; the book by DeGroot, et al. (see the Chapter 6 bibliography) is a good source. The interval presented here (as well as each interval presented subsequently) is called a “classical” CI because its interpretation rests on the classical notion of probability.
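The long-run interpretation can be checked with a small simulation. The sketch below is only an illustration, not part of the text's development: it assumes a normal population with $\mu = 80.0$ and $\sigma = 2.0$ (values chosen here to echo Example 7.2), draws many samples of size 31, and counts how often the interval $\bar{x} \pm 1.96\sigma/\sqrt{n}$ captures $\mu$. The function name, the seed, and the number of replications are arbitrary choices.

```python
import random
from math import sqrt


def coverage_simulation(mu=80.0, sigma=2.0, n=31, reps=10_000, seed=1):
    """Return the fraction of simulated 95% z intervals that contain mu.

    mu and sigma are assumed values for illustration; in practice mu is unknown.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sqrt(n)  # same for every sample because sigma is known
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Check whether this particular fixed interval captures mu.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / reps


coverage = coverage_simulation()
print(f"Fraction of intervals containing mu: {coverage:.3f}")
```

Any single interval either contains $\mu$ or it does not; only the fraction over many replications should settle near .95.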

Other Levels of Confidence

The confidence level of 95% was inherited from the probability .95 for the initial inequalities in (7.2). If a confidence level of 99% is desired, the initial probability of .95 must be replaced by .99, which necessitates changing the z critical value from 1.96 to 2.58. A 99% CI then results from using 2.58 in place of 1.96 in the formula for the 95% CI.

In fact, any desired level of confidence can be achieved by replacing 1.96 or 2.58 with the appropriate standard normal critical value. As Figure 7.4 shows, a probability of $1 - \alpha$ is achieved by using $z_{\alpha/2}$ in place of 1.96.

[Figure 7.3 One hundred 95% CIs (asterisks identify intervals that do not include $\mu$).]

[Figure 7.4 $P(-z_{\alpha/2} \le Z < z_{\alpha/2}) = 1 - \alpha$: the z curve with central area $1 - \alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and shaded area $\alpha/2$ in each tail]

DEFINITION

A $100(1 - \alpha)\%$ confidence interval for the mean $\mu$ of a normal population when the value of $\sigma$ is known is given by

$$\left(\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\right) \qquad (7.5)$$

or, equivalently, by $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$.
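As a rough sketch of how (7.5) might be computed, the following uses `statistics.NormalDist` from the Python standard library to obtain $z_{\alpha/2}$. The function names `z_critical` and `z_interval` are hypothetical helpers introduced here, and the numbers passed in are those of Example 7.2.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Standard normal critical value z_{alpha/2} for a two-sided interval."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def z_interval(xbar, sigma, n, confidence=0.95):
    """Interval (7.5): normal population, sigma known."""
    half_width = z_critical(confidence) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


for level in (0.90, 0.95, 0.99):
    print(level, round(z_critical(level), 3))

# Example 7.2 inputs: xbar = 80.0, sigma = 2.0, n = 31
lower, upper = z_interval(80.0, 2.0, 31, 0.95)
print(round(lower, 1), round(upper, 1))
```

The computed 99% critical value is about 2.576, which the text rounds to 2.58.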

The formula (7.5) for the CI can also be expressed in words as

point estimate of $\mu$ $\pm$ (z critical value)(standard error of the mean)

Example 7.3

The production process for engine control housing units of a particular type has recently been modified. Prior to this modification, historical data had suggested that the distribution of hole diameters for bushings on the housings was normal with a standard deviation of .100 mm. It is believed that the modification has not affected the shape of the distribution or the standard deviation, but that the value of the mean diameter may have changed. A sample of 40 housing units is selected and hole diameter is determined for each one, resulting in a sample mean diameter of 5.426 mm. Let’s calculate a confidence interval for true average hole diameter using a confidence level of 90%. This requires that $100(1 - \alpha) = 90$, from which $\alpha = .10$ and $z_{\alpha/2} = z_{.05} = 1.645$ (corresponding to a cumulative z-curve area of .9500). The desired interval is then

$$5.426 \pm (1.645)\frac{.100}{\sqrt{40}} = 5.426 \pm .026 = (5.400, 5.452)$$

With a reasonably high degree of confidence, we can say that $5.400 < \mu < 5.452$. This interval is rather narrow because of the small amount of variability in hole diameter $(\sigma = .100)$. ■

Confidence Level, Precision, and Sample Size

Why settle for a confidence level of 95% when a level of 99% is achievable? Because the price paid for the higher confidence level is a wider interval. Since the 95% interval extends $1.96 \cdot \sigma/\sqrt{n}$ to each side of $\bar{x}$, the width of the interval is $2(1.96) \cdot \sigma/\sqrt{n} = 3.92 \cdot \sigma/\sqrt{n}$. Similarly, the width of the 99% interval is $2(2.58) \cdot \sigma/\sqrt{n} = 5.16 \cdot \sigma/\sqrt{n}$. That is, we have more confidence in the 99% interval precisely because it is wider. The higher the desired degree of confidence, the wider the resulting interval will be.

If we think of the width of the interval as specifying its precision or accuracy, then the confidence level (or reliability) of the interval is inversely related to its precision. A highly reliable interval estimate may be imprecise in that the endpoints of the interval may be far apart, whereas a precise interval may entail relatively low reliability. Thus it cannot be said unequivocally that a 99% interval is to be preferred to a 95% interval; the gain in reliability entails a loss in precision.

An appealing strategy is to specify both the desired confidence level and interval width and then determine the necessary sample size.

Example 7.4

Extensive monitoring of a computer time-sharing system has suggested that response time to a particular editing command is normally distributed with standard deviation 25 millisec. A new operating system has been installed, and we wish to estimate the true average response time $\mu$ for the new environment. Assuming that response times are still normally distributed with $\sigma = 25$, what sample size is necessary to ensure that the resulting 95% CI has a width of (at most) 10? The sample size n must satisfy

$$10 = 2 \cdot (1.96)(25/\sqrt{n})$$

Rearranging this equation gives

$$\sqrt{n} = 2 \cdot (1.96)(25)/10 = 9.80$$

so

$$n = (9.80)^2 = 96.04$$

Since n must be an integer, a sample size of 97 is required. ■

A general formula for the sample size n necessary to ensure an interval width w is obtained from equating w to $2 \cdot z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and solving for n.

The sample size necessary for the CI (7.5) to have a width w is

$$n = \left(2 z_{\alpha/2} \cdot \frac{\sigma}{w}\right)^2$$

The smaller the desired width w, the larger n must be. In addition, n is an increasing function of $\sigma$ (more population variability necessitates a larger sample size) and of the confidence level $100(1 - \alpha)$ (as $\alpha$ decreases, $z_{\alpha/2}$ increases).

The half-width $1.96\sigma/\sqrt{n}$ of the 95% CI is sometimes called the bound on the error of estimation associated with a 95% confidence level. That is, with 95% confidence, the point estimate $\bar{x}$ will be no farther than this from $\mu$. Before obtaining data, an investigator may wish to determine a sample size for which a particular value of the bound is achieved. For example, with $\mu$ representing the average fuel efficiency (mpg) for all cars of a certain type, the objective of an investigation may be to estimate $\mu$ to within 1 mpg with 95% confidence. More generally, if we wish to estimate $\mu$ to within an amount B (the specified bound on the error of estimation) with $100(1 - \alpha)\%$ confidence, the necessary sample size results from replacing 2/w by 1/B in the formula in the preceding box.
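The sample-size formula lends itself to a short calculation. The sketch below assumes the known-$\sigma$ setting of this section; the helper names `sample_size_for_width` and `sample_size_for_bound` are hypothetical, and the result is rounded up because n must be an integer. The inputs are those of Example 7.4.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_width(sigma, width, confidence=0.95):
    """Smallest n for which the CI (7.5) has width at most `width`."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil((2.0 * z * sigma / width) ** 2)


def sample_size_for_bound(sigma, bound, confidence=0.95):
    """Smallest n for estimating mu to within `bound` (replace 2/w by 1/B)."""
    return sample_size_for_width(sigma, 2.0 * bound, confidence)


# Example 7.4: sigma = 25, desired 95% CI width of at most 10
n_example = sample_size_for_width(25, 10, 0.95)
print(n_example)
```

A bound of B = 5 corresponds to the same width of 10, so `sample_size_for_bound(25, 5)` gives the same sample size.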

Deriving a Confidence Interval

Let $X_1, X_2, \ldots, X_n$ denote the sample on which the CI for a parameter $\theta$ is to be based. Suppose a random variable satisfying the following two properties can be found:

1. The variable depends functionally on both $X_1, \ldots, X_n$ and $\theta$.

2. The probability distribution of the variable does not depend on $\theta$ or on any other unknown parameters.

Let $h(X_1, X_2, \ldots, X_n; \theta)$ denote this random variable. For example, if the population distribution is normal with known $\sigma$ and $\theta = \mu$, the variable $h(X_1, \ldots, X_n; \mu) = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ satisfies both properties; it clearly depends functionally on $\mu$, yet has the standard normal probability distribution, which does not depend on $\mu$. In general, the form of the h function is usually suggested by examining the distribution of an appropriate estimator $\hat{\theta}$.

For any $\alpha$ between 0 and 1, constants a and b can be found to satisfy

$$P(a < h(X_1, \ldots, X_n; \theta) < b) = 1 - \alpha \qquad (7.6)$$

Because of the second property, a and b do not depend on $\theta$. In the normal example, $a = -z_{\alpha/2}$ and $b = z_{\alpha/2}$. Now suppose that the inequalities in (7.6) can be manipulated to isolate $\theta$, giving the equivalent probability statement

$$P(l(X_1, X_2, \ldots, X_n) < \theta < u(X_1, X_2, \ldots, X_n)) = 1 - \alpha$$

Then $l(x_1, x_2, \ldots, x_n)$ and $u(x_1, \ldots, x_n)$ are the lower and upper confidence limits, respectively, for a $100(1 - \alpha)\%$ CI. In the normal example, we saw that $l(X_1, \ldots, X_n) = \bar{X} - z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $u(X_1, \ldots, X_n) = \bar{X} + z_{\alpha/2} \cdot \sigma/\sqrt{n}$.

Example 7.5

A theoretical model suggests that the time to breakdown of an insulating fluid between electrodes at a particular voltage has an exponential distribution with parameter $\lambda$ (see Section 4.4). A random sample of $n = 10$ breakdown times yields the following sample data (in min): $x_1 = 41.53$, $x_2 = 18.73$, $x_3 = 2.99$, $x_4 = 30.34$, $x_5 = 12.33$, $x_6 = 117.52$, $x_7 = 73.02$, $x_8 = 223.63$, $x_9 = 4.00$, $x_{10} = 26.78$. A 95% CI for $\lambda$ and for the true average breakdown time are desired.

Let $h(X_1, X_2, \ldots, X_n; \lambda) = 2\lambda \sum X_i$. It can be shown that this random variable has a probability distribution called a chi-squared distribution with 2n degrees of freedom (df) ($\nu = 2n$, where $\nu$ is the parameter of a chi-squared distribution as mentioned in Section 4.4). Appendix Table A.7 pictures a typical chi-squared density curve and tabulates critical values that capture specified tail areas. The relevant number of df here is $2(10) = 20$. The $\nu = 20$ row of the table shows that 34.170 captures upper-tail area .025 and 9.591 captures lower-tail area .025 (upper-tail area .975). Thus for $n = 10$,

$$P\left(9.591 < 2\lambda \sum X_i < 34.170\right) = .95$$

Division by $2\sum X_i$ isolates $\lambda$, yielding

$$P\left(9.591/\left(2\sum X_i\right) < \lambda < 34.170/\left(2\sum X_i\right)\right) = .95$$

The lower limit of the 95% CI for $\lambda$ is $9.591/(2\sum x_i)$, and the upper limit is $34.170/(2\sum x_i)$. For the given data, $\sum x_i = 550.87$, giving the interval (.00871, .03101).

The expected value of an exponential rv is $\mu = 1/\lambda$. Since

$$P\left(2\sum X_i/34.170 < 1/\lambda < 2\sum X_i/9.591\right) = .95$$

the 95% CI for true average breakdown time is $(2\sum x_i/34.170,\ 2\sum x_i/9.591) = (32.24, 114.87)$. This interval is obviously quite wide, reflecting substantial variability in breakdown times and a small sample size. ■
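The arithmetic of Example 7.5 can be reproduced in a few lines. This sketch hard-codes the two chi-squared critical values quoted in the example (9.591 and 34.170 for 20 df) rather than computing them, since the Python standard library has no chi-squared quantile function; the variable names are illustrative only.

```python
# Breakdown times (min) from Example 7.5
times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]

# Chi-squared critical values for 2n = 20 df, copied from the example's table lookup
CHI2_LOWER = 9.591   # captures lower-tail area .025
CHI2_UPPER = 34.170  # captures upper-tail area .025

total = sum(times)

# 95% CI for the exponential parameter lambda
lam_lower = CHI2_LOWER / (2 * total)
lam_upper = CHI2_UPPER / (2 * total)

# 95% CI for the mean 1/lambda: invert the endpoints and swap their order
mean_lower = 2 * total / CHI2_UPPER
mean_upper = 2 * total / CHI2_LOWER

print(f"sum of times = {total:.2f}")
print(f"CI for lambda: ({lam_lower:.5f}, {lam_upper:.5f})")
print(f"CI for mean breakdown time: ({mean_lower:.2f}, {mean_upper:.2f})")
```

Note that the interval for the mean is not symmetric about the sample mean of about 55.09, in line with the remark that these limits are not of the form point estimate plus or minus a constant.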

In general, the upper and lower confidence limits result from replacing each $<$ in (7.6) by $=$ and solving for $\theta$. In the insulating fluid example just considered, $2\lambda \sum x_i = 34.170$ gives $\lambda = 34.170/(2\sum x_i)$ as the upper confidence limit, and the lower limit is obtained from the other equation. Notice that the two interval limits are not equidistant from the point estimate, since the interval is not of the form $\hat{\theta} \pm c$.

Bootstrap Confidence Intervals

The bootstrap technique was introduced in Chapter 6 as a way of estimating $\sigma_{\hat{\theta}}$. It can also be applied to obtain a CI for $\theta$. Consider again estimating the mean $\mu$ of a normal distribution when $\sigma$ is known. Let’s replace $\mu$ by $\theta$ and use $\hat{\theta} = \bar{X}$ as the point estimator. Notice that $1.96\sigma/\sqrt{n}$ is the 97.5th percentile of the distribution of $\hat{\theta} - \theta$ [that is, $P(\bar{X} - \mu < 1.96\sigma/\sqrt{n}) = P(Z < 1.96) = .9750$]. Similarly, $-1.96\sigma/\sqrt{n}$ is the 2.5th percentile, so

$$.95 = P(\text{2.5th percentile} < \hat{\theta} - \theta < \text{97.5th percentile})$$

$$= P(\hat{\theta} - \text{2.5th percentile} > \theta > \hat{\theta} - \text{97.5th percentile})$$

That is, with

$$l = \hat{\theta} - \text{97.5th percentile of } \hat{\theta} - \theta$$

$$u = \hat{\theta} - \text{2.5th percentile of } \hat{\theta} - \theta \qquad (7.7)$$

the CI for $\theta$ is $(l, u)$. In many cases, the percentiles in (7.7) cannot be calculated, but they can be estimated from bootstrap samples. Suppose we obtain $B = 1000$ bootstrap samples and calculate $\hat{\theta}^*_1, \ldots, \hat{\theta}^*_{1000}$, and $\bar{\theta}^*$ followed by the 1000 differences $\hat{\theta}^*_1 - \bar{\theta}^*, \ldots, \hat{\theta}^*_{1000} - \bar{\theta}^*$. The 25th largest and 25th smallest of these differences are estimates of the unknown percentiles in (7.7). Consult the Devore and Berk or Efron books cited in Chapter 6 for more information.
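The bootstrap recipe just described can be sketched as follows. This is only one possible reading of the procedure, under assumptions made here: the estimator is the sample mean, bootstrap samples are drawn with replacement from the observed data, and the data are the breakdown times of Example 7.5, reused purely for illustration. The function name and seed are arbitrary.

```python
import random
from statistics import mean


def bootstrap_ci_mean(data, B=1000, seed=0):
    """Approximate 95% CI (l, u) of (7.7) for a mean, using B bootstrap samples."""
    rng = random.Random(seed)
    n = len(data)
    theta_hat = mean(data)
    # theta-hat-star values: the estimator recomputed on each resample
    boot = [mean(rng.choices(data, k=n)) for _ in range(B)]
    theta_bar_star = mean(boot)
    # Differences between each bootstrap estimate and their average, sorted ascending
    diffs = sorted(t - theta_bar_star for t in boot)
    k = round(0.025 * B)          # 25 when B = 1000
    pct_2_5 = diffs[k - 1]        # 25th smallest difference
    pct_97_5 = diffs[-k]          # 25th largest difference
    return theta_hat - pct_97_5, theta_hat - pct_2_5


times = [41.53, 18.73, 2.99, 30.34, 12.33, 117.52, 73.02, 223.63, 4.00, 26.78]
l, u = bootstrap_ci_mean(times)
print(f"bootstrap 95% CI for the mean: ({l:.2f}, {u:.2f})")
```

Because the result depends on the random resamples, a different seed or value of B gives somewhat different limits.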

EXERCISES Section 7.1 (1–11)

1. Consider a normal population distribution with the value of $\sigma$ known.
a. What is the confidence level for the interval $\bar{x} \pm 2.81\sigma/\sqrt{n}$?
b. What is the confidence level for the interval $\bar{x} \pm 1.44\sigma/\sqrt{n}$?
c. What value of $z_{\alpha/2}$ in the CI formula (7.5) results in a confidence level of 99.7%?
d. Answer the question posed in part (c) for a confidence level of 75%.

2. Each of the following is a confidence interval for $\mu$ = true average (i.e., population mean) resonance frequency (Hz) for all tennis rackets of a certain type:

(114.4, 115.6) (114.1, 115.9)

a. What is the value of the sample mean resonance frequency?
b. Both intervals were calculated from the same sample data. The confidence level for one of these intervals is 90% and for the other is 99%. Which of the intervals has the 90% confidence level, and why?

3. Suppose that a random sample of 50 bottles of a particular brand of cough syrup is selected and the alcohol content of each bottle is determined. Let $\mu$ denote the average alcohol content for the population of all bottles of the brand under study. Suppose that the resulting 95% confidence interval is (7.8, 9.4).
a. Would a 90% confidence interval calculated from this same sample have been narrower or wider than the given interval? Explain your reasoning.
b. Consider the following statement: There is a 95% chance that $\mu$ is between 7.8 and 9.4. Is this statement correct? Why or why not?
c. Consider the following statement: We can be highly confident that 95% of all bottles of this type of cough syrup have an alcohol content that is between 7.8 and 9.4. Is this statement correct? Why or why not?
d. Consider the following statement: If the process of selecting a sample of size 50 and then computing the corresponding 95% interval is repeated 100 times, 95 of the resulting intervals will include $\mu$. Is this statement correct? Why or why not?

4. A CI is desired for the true average stray-load loss $\mu$ (watts) for a certain type of induction motor when the line current is held at 10 amps for a speed of 1500 rpm. Assume that stray-load loss is normally distributed with $\sigma = 3.0$.
a. Compute a 95% CI for $\mu$ when $n = 25$ and $\bar{x} = 58.3$.
b. Compute a 95% CI for $\mu$ when $n = 100$ and $\bar{x} = 58.3$.
c. Compute a 99% CI for $\mu$ when $n = 100$ and $\bar{x} = 58.3$.
d. Compute an 82% CI for $\mu$ when $n = 100$ and $\bar{x} = 58.3$.
e. How large must n be if the width of the 99% interval for $\mu$ is to be 1.0?

5. Assume that the helium porosity (in percentage) of coal samples taken from any particular seam is normally distributed with true standard deviation .75.
a. Compute a 95% CI for the true average porosity of a certain seam if the average porosity for 20 specimens from the seam was 4.85.
b. Compute a 98% CI for true average porosity of another seam based on 16 specimens with a sample average porosity of 4.56.
c. How large a sample size is necessary if the width of the 95% interval is to be .40?
d. What sample size is necessary to estimate true average porosity to within .2 with 99% confidence?

6. On the basis of extensive tests, the yield point of a particular type of mild steel-reinforcing bar is known to be normally distributed with $\sigma = 100$. The composition of bars has been slightly modified, but the modification is not believed to have affected either the normality or the value of $\sigma$.
a. Assuming this to be the case, if a sample of 25 modified bars resulted in a sample average yield point of 8439 lb, compute a 90% CI for the true average yield point of the modified bar.
b. How would you modify the interval in part (a) to obtain a confidence level of 92%?

7. By how much must the sample size n be increased if the width of the CI (7.5) is to be halved? If the sample size is increased by a factor of 25, what effect will this have on the width of the interval? Justify your assertions.

8. Let $\alpha_1 > 0$, $\alpha_2 > 0$, with $\alpha_1 + \alpha_2 = \alpha$. Then

$$P\left(-z_{\alpha_1} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha_2}\right) = 1 - \alpha$$

a. Use this equation to derive a more general expression for a $100(1 - \alpha)\%$ CI for $\mu$ of which the interval (7.5) is a special case.
b. Let $\alpha = .05$ and $\alpha_1 = \alpha/4$, $\alpha_2 = 3\alpha/4$. Does this result in a narrower or wider interval than the interval (7.5)?

9. a. Under the same conditions as those leading to the interval (7.5), $P[(\bar{X} - \mu)/(\sigma/\sqrt{n}) < 1.645] = .95$. Use this to derive a one-sided interval for $\mu$ that has infinite width and provides a lower confidence bound on $\mu$. What is this interval for the data in Exercise 5(a)?
b. Generalize the result of part (a) to obtain a lower bound with confidence level $100(1 - \alpha)\%$.
c. What is an analogous interval to that of part (b) that provides an upper bound on $\mu$? Compute this 99% interval for the data of Exercise 4(a).

10. A random sample of $n = 15$ heat pumps of a certain type yielded the following observations on lifetime (in years):

2.0 1.3 6.0 1.9 5.1 .4 1.0 5.3

15.7 .7 4.8 .9 12.2 5.3 .6

a. Assume that the lifetime distribution is exponential and use an argument parallel to that of Example 7.5 to obtain a 95% CI for expected (true average) lifetime.
b. How should the interval of part (a) be altered to achieve a confidence level of 99%?
c. What is a 95% CI for the standard deviation of the lifetime distribution? [Hint: What is the standard deviation of an exponential random variable?]

11. Consider the next 1000 95% CIs for $\mu$ that a statistical consultant will obtain for various clients. Suppose the data sets on which the intervals are based are selected independently of one another. How many of these 1000 intervals do you expect to capture the corresponding value of $\mu$? What is the probability that between 940 and 960 of these intervals contain the corresponding value of $\mu$? [Hint: Let Y = the number among the 1000 intervals that contain $\mu$. What kind of random variable is Y?]

7.2 Large-Sample Confidence Intervals for a Population Mean and Proportion

The CI for $\mu$ given in the previous section assumed that the population distribution is normal with the value of $\sigma$ known. We now present a large-sample CI whose validity does not require these assumptions. After showing how the argument leading to this interval generalizes to yield other large-sample intervals, we focus on an interval for a population proportion p.

A Large-Sample Interval for $\mu$

Let $X_1, X_2, \ldots, X_n$ be a random sample from a population having a mean $\mu$ and standard deviation $\sigma$. Provided that n is large, the Central Limit Theorem (CLT) implies that $\bar{X}$ has approximately a normal distribution whatever the nature of the population distribution. It then follows that $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ has approximately a standard normal distribution, so that

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

An argument parallel to that given in Section 7.1 yields $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ as a large-sample CI for $\mu$ with a confidence level of approximately $100(1 - \alpha)\%$. That is, when n is large, the CI for $\mu$ given previously remains valid whatever the population distribution, provided that the qualifier “approximately” is inserted in front of the confidence level.

A practical difficulty with this development is that computation of the CI requires the value of $\sigma$, which will rarely be known. Consider the standardized variable $(\bar{X} - \mu)/(S/\sqrt{n})$, in which the sample standard deviation S has replaced $\sigma$. Previously, there was randomness only in the numerator of Z by virtue of $\bar{X}$. In the new standardized variable, both $\bar{X}$ and S vary in value from one sample to another. So it might seem that the distribution of the new variable should be more spread out than the z curve to reflect the extra variation in the denominator. This is indeed true when n is small. However, for large n the substitution of S for $\sigma$ adds little extra variability, so this variable also has approximately a standard normal distribution. Manipulation of the variable in a probability statement, as in the case of known $\sigma$, gives a general large-sample CI for $\mu$.

PROPOSITION

If n is sufficiently large, the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. This implies that

$$\bar{x} \pm z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} \qquad (7.8)$$

is a large-sample confidence interval for $\mu$ with confidence level approximately $100(1 - \alpha)\%$. This formula is valid regardless of the shape of the population distribution.

In words, the CI (7.8) is

point estimate of $\mu$ $\pm$ (z critical value)(estimated standard error of the mean).

Generally speaking, $n > 40$ will be sufficient to justify the use of this interval. This is somewhat more conservative than the rule of thumb for the CLT because of the additional variability introduced by using S in place of $\sigma$.
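To see what "approximately" means in practice, one can simulate the interval (7.8) on a clearly non-normal population. The sketch below is an illustration under assumptions chosen here, not something taken from the text: the population is exponential with mean 1, the sample size is 50, and the function name, seed, and replication count are arbitrary.

```python
import random
from math import sqrt


def large_sample_coverage(n=50, reps=5000, z=1.96, seed=2):
    """Fraction of intervals xbar +/- z*s/sqrt(n) that capture the true mean.

    The population is exponential with rate 1 (true mean 1.0), an assumed
    skewed example used only to probe the approximation.
    """
    rng = random.Random(seed)
    true_mean = 1.0
    hits = 0
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        xbar = sum(sample) / n
        # Sample standard deviation with divisor n - 1
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        half_width = z * s / sqrt(n)
        if xbar - half_width < true_mean < xbar + half_width:
            hits += 1
    return hits / reps


coverage_n50 = large_sample_coverage()
print(f"Estimated coverage with n = 50: {coverage_n50:.3f}")
```

For a strongly skewed population the observed fraction typically falls somewhat short of the nominal .95, which is why the confidence level is described as approximate.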

Example 7.6

Haven’t you always wanted to own a Porsche? The author thought maybe he could afford a Boxster, the cheapest model. So he went to www.cars.com on Nov. 18, 2009, and found a total of 1113 such cars listed. Asking prices ranged from $3499 to $130,000 (the latter price was one of only two exceeding $70,000). The prices depressed him, so he focused instead on odometer readings (miles). Here are reported readings for a sample of 50 of these Boxsters:

2948 2996 7197 8338 8500 8759 12710 12925

15767 20000 23247 24863 26000 26210 30552 30600

35700 36466 40316 40596 41021 41234 43000 44607

45000 45027 45442 46963 47978 49518 52000 53334

54208 56062 57000 57365 60020 60265 60803 62851

64404 72140 74594 79308 79500 80000 80000 84000

113000 118634

A boxplot of the data (Figure 7.5) shows that, except for the two outliers at the upper end, the distribution of values is reasonably symmetric (in fact, a normal probability plot exhibits a reasonably linear pattern, though the points corresponding to the two smallest and two largest observations are somewhat removed from a line fit through the remaining points).

[Figure 7.5 A boxplot of the odometer reading data from Example 7.6 (horizontal axis: Mileage, 0 to 120,000)]

Summary quantities include $n = 50$, $\bar{x} = 45{,}679.4$, $\tilde{x} = 45{,}013.5$, $s = 26{,}641.675$, $f_s = 34{,}265$. The mean and median are reasonably close (if the two largest values were each reduced by 30,000, the mean would fall to 44,479.4, while the median would be unaffected). The boxplot and the magnitudes of s and $f_s$ relative to the mean and median both indicate a substantial amount of variability. A confidence level of about 95% requires $z_{.025} = 1.96$, and the interval is

$$45{,}679.4 \pm (1.96)\left(\frac{26{,}641.675}{\sqrt{50}}\right) = 45{,}679.4 \pm 7384.7 = (38{,}294.7,\ 53{,}064.1)$$

That is, $38{,}294.7 < \mu < 53{,}064.1$ with 95% confidence. This interval is rather wide because a sample size of 50, even though large by our rule of thumb, is not large enough to overcome the substantial variability in the sample. We do not have a very precise estimate of the population mean odometer reading.

Is the interval we’ve calculated one of the 95% that in the long run includes the parameter being estimated, or is it one of the “bad” 5% that does not do so? Without knowing the value of $\mu$, we cannot tell. Remember that the confidence level refers to the long run capture percentage when the formula is used repeatedly on various samples; it cannot be interpreted for a single sample and the resulting interval. ■

Unfortunately, the choice of sample size to yield a desired interval width is not as straightforward here as it was for the case of known $\sigma$. This is because the width of (7.8) is $2z_{\alpha/2}s/\sqrt{n}$. Since the value of s is not available before the data has been gathered, the width of the interval cannot be determined solely by the choice of n. The only option for an investigator who wishes to specify a desired width is to make an educated guess as to what the value of s might be. By being conservative and guessing a larger value of s, an n larger than necessary will be chosen. The investigator may be able to specify a reasonably accurate value of the population range (the difference between the largest and smallest values). Then if the population distribution is not too skewed, dividing the range by 4 gives a ballpark value of what s might be.
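The summary quantities and interval of Example 7.6 can be recomputed directly from the 50 odometer readings. The sketch below uses the `statistics` module from the Python standard library and the rounded critical value 1.96 quoted in the example; variable names are illustrative.

```python
from math import sqrt
from statistics import mean, median, stdev

# Odometer readings (miles) for the 50 Boxsters in Example 7.6
readings = [
    2948, 2996, 7197, 8338, 8500, 8759, 12710, 12925,
    15767, 20000, 23247, 24863, 26000, 26210, 30552, 30600,
    35700, 36466, 40316, 40596, 41021, 41234, 43000, 44607,
    45000, 45027, 45442, 46963, 47978, 49518, 52000, 53334,
    54208, 56062, 57000, 57365, 60020, 60265, 60803, 62851,
    64404, 72140, 74594, 79308, 79500, 80000, 80000, 84000,
    113000, 118634,
]

n = len(readings)
xbar = mean(readings)
x_med = median(readings)
s = stdev(readings)            # sample standard deviation (divisor n - 1)

half_width = 1.96 * s / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width

print(f"n = {n}, mean = {xbar:.1f}, median = {x_med:.1f}, s = {s:.3f}")
print(f"approximate 95% CI: ({lower:.1f}, {upper:.1f})")
```

Halving the width of this interval would require roughly four times as many readings, since the width is proportional to $1/\sqrt{n}$.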

Example 7.7

The charge-to-tap time (min) for carbon steel in one type of open hearth furnace is to be determined for each heat in a sample of size n. If the investigator believes that almost all times in the distribution are between 320 and 440, what sample size would be appropriate for estimating the true average time to within 5 min. with a confidence level of 95%?

A reasonable value for s is $(440 - 320)/4 = 30$. Thus

$$n = \left[\frac{(1.96)(30)}{5}\right]^2 = 138.3$$

Since the sample size must be an integer, $n = 139$ should be used. Note that estimating to within 5 min. with the specified confidence level is equivalent to a CI width of 10 min. ■

A General Large-Sample Confidence Interval

The large-sample intervals $\bar{x} \pm z_{\alpha/2} \cdot \sigma/\sqrt{n}$ and $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ are special cases of a general large-sample CI for a parameter $\theta$. Suppose that $\hat{\theta}$ is an estimator satisfying the following properties: (1) It has approximately a normal distribution; (2) it is (at least approximately) unbiased; and (3) an expression for $\sigma_{\hat{\theta}}$, the standard deviation of $\hat{\theta}$, is available. For example, in the case $\theta = \mu$, $\hat{\mu} = \bar{X}$ is an unbiased estimator whose distribution is approximately normal when n is large and $\sigma_{\hat{\mu}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$. Standardizing $\hat{\theta}$ yields the rv $Z = (\hat{\theta} - \theta)/\sigma_{\hat{\theta}}$, which has approximately a standard normal distribution. This justifies the probability statement

$$P\left(-z_{\alpha/2} < \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}} < z_{\alpha/2}\right) \approx 1 - \alpha \qquad (7.9)$$

Suppose first that $\sigma_{\hat{\theta}}$ does not involve any unknown parameters (e.g., known $\sigma$ in the case $\theta = \mu$). Then replacing each $<$ in (7.9) by $=$ results in $\theta = \hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$, so the lower and upper confidence limits are $\hat{\theta} - z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$ and $\hat{\theta} + z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$, respectively. Now suppose that $\sigma_{\hat{\theta}}$ does not involve $\theta$ but does involve at least one other unknown parameter. Let $s_{\hat{\theta}}$ be the estimate of $\sigma_{\hat{\theta}}$ obtained by using estimates in place of the unknown parameters (e.g., $s/\sqrt{n}$ estimates $\sigma/\sqrt{n}$). Under general conditions (essentially that $s_{\hat{\theta}}$ be close to $\sigma_{\hat{\theta}}$ for most samples), a valid CI is $\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$. The large-sample interval $\bar{x} \pm z_{\alpha/2} \cdot s/\sqrt{n}$ is an example.

Finally, suppose that $\sigma_{\hat{\theta}}$ does involve the unknown $\theta$. This is the case, for example, when $\theta = p$, a population proportion. Then $(\hat{\theta} - \theta)/\sigma_{\hat{\theta}} = z_{\alpha/2}$ can be difficult to solve. An approximate solution can often be obtained by replacing $\theta$ in $\sigma_{\hat{\theta}}$ by its estimate $\hat{\theta}$. This results in an estimated standard deviation $s_{\hat{\theta}}$, and the corresponding interval is again $\hat{\theta} \pm z_{\alpha/2} \cdot s_{\hat{\theta}}$.

In words, this CI is a

point estimate of $\theta$ $\pm$ (z critical value)(estimated standard error of the estimator)

280 CHAPTER 7 Statistical Intervals Based on a Single Sample

A Confidence Interval for a Population Proportion
Let p denote the proportion of “successes” in a population, where success identifies an individual or object that has a specified property (e.g., individuals who graduated from college, computers that do not need warranty service, etc.). A random sample of n individuals is to be selected, and X is the number of successes in the sample. Provided that n is small compared to the population size, X can be regarded as a binomial rv with $E(X) = np$ and $\sigma_X = \sqrt{np(1 - p)}$. Furthermore, if both $np \ge 10$ and $nq \ge 10$ $(q = 1 - p)$, X has approximately a normal distribution.

The natural estimator of p is $\hat{p} = X/n$, the sample fraction of successes. Since $\hat{p}$ is just X multiplied by the constant $1/n$, $\hat{p}$ also has approximately a normal distribution. As shown in Section 6.1, $E(\hat{p}) = p$ (unbiasedness) and $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$. The standard deviation $\sigma_{\hat{p}}$ involves the unknown parameter p. Standardizing $\hat{p}$ by subtracting p and dividing by $\sigma_{\hat{p}}$ then implies that

$$P\left(-z_{\alpha/2} < \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} < z_{\alpha/2}\right) \approx 1 - \alpha$$

Proceeding as suggested in the subsection “Deriving a Confidence Interval” (Section 7.1), the confidence limits result from replacing each $<$ by $=$ and solving the resulting quadratic equation for p. This gives the two roots

$$p = \frac{\hat{p} + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n} \pm z_{\alpha/2}\frac{\sqrt{\hat{p}(1 - \hat{p})/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n} = \tilde{p} \pm z_{\alpha/2}\frac{\sqrt{\hat{p}(1 - \hat{p})/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n}$$

PROPOSITION
Let $\tilde{p} = \dfrac{\hat{p} + z_{\alpha/2}^2/2n}{1 + z_{\alpha/2}^2/n}$. Then a confidence interval for a population proportion p with confidence level approximately $100(1 - \alpha)\%$ is

$$\tilde{p} \pm z_{\alpha/2}\frac{\sqrt{\hat{p}\hat{q}/n + z_{\alpha/2}^2/4n^2}}{1 + z_{\alpha/2}^2/n} \tag{7.10}$$

where $\hat{q} = 1 - \hat{p}$ and, as before, the $-$ in (7.10) corresponds to the lower confidence limit and the $+$ to the upper confidence limit.

This is often referred to as the score CI for p.
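The step "solving the resulting quadratic equation for p" can be written out as follows. This is a sketch of the algebra, writing $z$ for $z_{\alpha/2}$; it uses only the quantities already defined in this subsection.

$$
\begin{aligned}
\frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} = \pm z
&\;\Longrightarrow\; (\hat{p} - p)^2 = \frac{z^2}{n}\,p(1 - p) \\
&\;\Longrightarrow\; \left(1 + \frac{z^2}{n}\right)p^2 - \left(2\hat{p} + \frac{z^2}{n}\right)p + \hat{p}^2 = 0 .
\end{aligned}
$$

The discriminant of this quadratic is

$$
\left(2\hat{p} + \frac{z^2}{n}\right)^2 - 4\left(1 + \frac{z^2}{n}\right)\hat{p}^2
= \frac{4z^2}{n}\,\hat{p}(1 - \hat{p}) + \frac{z^4}{n^2}
= 4z^2\left[\frac{\hat{p}(1 - \hat{p})}{n} + \frac{z^2}{4n^2}\right],
$$

so the quadratic formula gives

$$
p = \frac{\left(2\hat{p} + z^2/n\right) \pm 2z\sqrt{\hat{p}(1 - \hat{p})/n + z^2/4n^2}}{2\left(1 + z^2/n\right)}
= \frac{\hat{p} + z^2/2n}{1 + z^2/n} \pm z\,\frac{\sqrt{\hat{p}(1 - \hat{p})/n + z^2/4n^2}}{1 + z^2/n},
$$

which matches the two roots stated above.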

If the sample size n is very large, then $z^2/2n$ is generally quite negligible (small) compared to $\hat{p}$ and $z^2/n$ is quite negligible compared to 1, from which $\tilde{p} \approx \hat{p}$. In this case $z^2/4n^2$ is also negligible compared to $\hat{p}\hat{q}/n$ ($n^2$ is a much larger divisor than is n); as a result, the dominant term in the $\pm$ expression is $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$ and the score interval is approximately

$$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} \tag{7.11}$$

This latter interval has the general form $\hat{\theta} \pm z_{\alpha/2}\hat{\sigma}_{\hat{\theta}}$ of a large-sample interval suggested in the last subsection. The approximate CI (7.11) is the one that for decades has appeared in introductory statistics textbooks. It clearly has a much simpler and more appealing form than the score CI. So why bother with the latter?

First of all, suppose we use $z_{.025} = 1.96$ in the traditional formula (7.11). Then our nominal confidence level (the one we think we’re buying by using that z critical value) is approximately 95%. So before a sample is selected, the probability that the random interval includes the actual value of p (i.e., the coverage probability) should be about .95. But as Figure 7.6 shows for the case $n = 100$, the actual coverage probability for this interval can differ considerably from the nominal probability .95, particularly when p is not close to .5 (the graph of coverage probability versus p is very jagged because the underlying binomial probability distribution is discrete rather than continuous). This is generally speaking a deficiency of the traditional interval: the actual confidence level can be quite different from the nominal level even for reasonably large sample sizes. Recent research has shown that the score interval rectifies this behavior; for virtually all sample sizes and values of p, its actual confidence level will be quite close to the nominal level specified by the choice of $z_{\alpha/2}$. This is due largely to the fact that the score interval is shifted a bit toward .5 compared to the traditional interval. In particular, the midpoint $\tilde{p}$ of the score interval is always a bit closer to .5 than is the midpoint $\hat{p}$ of the traditional interval. This is especially important when p is close to 0 or 1.

Figure 7.6 Actual coverage probability for the interval (7.11) for varying values of p when $n = 100$ (horizontal axis: p, from 0 to 1; vertical axis: coverage probability, from 0.86 to 0.96)
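The coverage probabilities discussed here can be computed exactly for a given n and p, because X is binomial: add up the binomial probabilities of every outcome x whose interval contains p. The following is a minimal sketch of that calculation for both the traditional interval (7.11) and the score interval (7.10); the function names are hypothetical, and the values of p in the loop are chosen only for illustration.

```python
from math import comb, sqrt

Z = 1.96  # z_{.025}, the critical value used in the text


def traditional_interval(x, n, z=Z):
    """Traditional large-sample interval (7.11)."""
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def score_interval(x, n, z=Z):
    """Score interval (7.10)."""
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def coverage(interval, p, n):
    """Exact coverage: total binomial probability of the x values whose interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n)
        if lo <= p <= hi:
            total += comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


n = 100
for p in (0.05, 0.10, 0.25, 0.50):
    print(f"p = {p:.2f}: traditional {coverage(traditional_interval, p, n):.3f}, "
          f"score {coverage(score_interval, p, n):.3f}")
```

Evaluating `coverage` on a fine grid of p values and plotting the result would give a jagged curve of the kind the figure caption describes.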

In addition, the score interval can be used with nearly all sample sizes and parameter values. It is thus not necessary to check the conditions $n\hat{p} \ge 10$ and $n(1 - \hat{p}) \ge 10$ that would be required were the traditional interval employed. So rather than asking when n is large enough for (7.11) to yield a good approximation to (7.10), our recommendation is that the score CI should always be used. The slight additional tediousness of the computation is outweighed by the desirable properties of the interval.

Example 7.8
The article “Repeatability and Reproducibility for Pass/Fail Data” (J. of Testing and Eval., 1997: 151–153) reported that in $n = 48$ trials in a particular laboratory, 16 resulted in ignition of a particular type of substrate by a lighted cigarette. Let p denote the long-run proportion of all such trials that would result in ignition. A point estimate for p is $\hat{p} = 16/48 = .333$. A confidence interval for p with a confidence level of approximately 95% is

$$\frac{.333 + (1.96)^2/96}{1 + (1.96)^2/48} \pm (1.96)\frac{\sqrt{(.333)(.667)/48 + (1.96)^2/9216}}{1 + (1.96)^2/48} = .345 \pm .129 = (.216, .474)$$

This interval is quite wide because a sample size of 48 is not at all large when estimating a proportion.

The traditional interval is

$$.333 \pm 1.96\sqrt{(.333)(.667)/48} = .333 \pm .133 = (.200, .466)$$

These two intervals would be in much closer agreement were the sample size substantially larger. ■

Equating the width of the CI for p to a prespecified width w gives a quadratic equation for the sample size n necessary to give an interval with a desired degree of precision. Suppressing the subscript in $z_{\alpha/2}$, the solution is

$$n = \frac{2z^2\hat{p}\hat{q} - z^2w^2 \pm \sqrt{4z^4\hat{p}\hat{q}(\hat{p}\hat{q} - w^2) + w^2z^4}}{w^2} \tag{7.12}$$

Neglecting the terms in the numerator involving $w^2$ gives

$$n \approx \frac{4z^2\hat{p}\hat{q}}{w^2}$$

This latter expression is what results from equating the width of the traditional interval to w.

These formulas unfortunately involve the unknown $\hat{p}$. The most conservative approach is to take advantage of the fact that $\hat{p}\hat{q}$ $[= \hat{p}(1 - \hat{p})]$ is a maximum when $\hat{p} = .5$. Thus if $\hat{p} = \hat{q} = .5$ is used in (7.12), the width will be at most w regardless of what value of $\hat{p}$ results from the sample. Alternatively, if the investigator believes strongly, based on prior information, that $p \le p_0 \le .5$, then $p_0$ can be used in place of $\hat{p}$. A similar comment applies when $p \ge p_0 \ge .5$.

Example 7.9
The width of the 95% CI in Example 7.8 is .258. The value of n necessary to ensure a width of .10 irrespective of the value of $\hat{p}$ is

$$n = \frac{2(1.96)^2(.25) - (1.96)^2(.01) \pm \sqrt{4(1.96)^4(.25)(.25 - .01) + (.01)(1.96)^4}}{.01} = 380.3$$

Thus a sample size of 381 should be used. The expression for n based on the traditional CI gives a slightly larger value of 385. ■
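The calculations in Examples 7.8 and 7.9 can be reproduced with a short script. The sketch below uses hypothetical function names and carries $\hat{p} = 16/48$ at full precision instead of rounding to .333, so the third decimal place can differ slightly from the hand calculation. For the sample size it takes the $+$ root in (7.12), which is the root that yields 380.3 in Example 7.9.

```python
from math import ceil, sqrt


def score_ci(x, n, z):
    """Score CI (7.10) for a proportion."""
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def traditional_ci(x, n, z):
    """Traditional CI (7.11) for a proportion."""
    p_hat = x / n
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def sample_size_score(w, z, p_hat=0.5):
    """Sample size from (7.12), taking the + root; p_hat = .5 is the conservative choice."""
    pq = p_hat * (1 - p_hat)
    root = sqrt(4 * z**4 * pq * (pq - w**2) + w**2 * z**4)
    return (2 * z**2 * pq - z**2 * w**2 + root) / w**2


def sample_size_traditional(w, z, p_hat=0.5):
    """Approximate sample size 4 z^2 p q / w^2 based on the traditional interval."""
    return 4 * z**2 * p_hat * (1 - p_hat) / w**2


z = 1.96
print("score CI:      ", score_ci(16, 48, z))
print("traditional CI:", traditional_ci(16, 48, z))
print("n (score):      ", ceil(sample_size_score(0.10, z)))
print("n (traditional):", ceil(sample_size_traditional(0.10, z)))
```

Rounding up with `ceil` reflects the fact that the sample size must be an integer at least as large as the computed value.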

One-Sided Confidence Intervals (Confidence Bounds)
The confidence intervals discussed thus far give both a lower confidence bound and an upper confidence bound for the parameter being estimated. In some circumstances, an investigator will want only one of these two types of bounds. For example, a psychologist may wish to calculate a 95% upper confidence bound for true average reaction time to a particular stimulus, or a reliability engineer may want only a lower confidence bound for true average lifetime of components of a certain type. Because the cumulative area under the standard normal curve to the left of 1.645 is .95,

$$P\left(\frac{\bar{X} - \mu}{S/\sqrt{n}} < 1.645\right) \approx .95$$

Manipulating the inequality inside the parentheses to isolate $\mu$ on one side and replacing rv’s by calculated values gives the inequality $\mu > \bar{x} - 1.645s/\sqrt{n}$; the expression on the right is the desired lower confidence bound. Starting with $P(-1.645 < Z) \approx .95$ and manipulating the inequality results in the upper confidence bound. A similar argument gives a one-sided bound associated with any other confidence level.

PROPOSITION
A large-sample upper confidence bound for $\mu$ is

$$\mu < \bar{x} + z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$

and a large-sample lower confidence bound for $\mu$ is

$$\mu > \bar{x} - z_{\alpha} \cdot \frac{s}{\sqrt{n}}$$

A one-sided confidence bound for p results from replacing $z_{\alpha/2}$ by $z_{\alpha}$ and $\pm$ by either $+$ or $-$ in the CI formula (7.10) for p. In all cases the confidence level is approximately $100(1 - \alpha)\%$.

Example 7.10
The slant shear test is the most widely accepted procedure for assessing the quality of a bond between a repair material and its concrete substrate. The article “Testing the Bond Between Repair Materials and Concrete Substrate” (ACI Materials J., 1996: 553–558) reported that in one particular investigation, a sample of 48 shear strength observations gave a sample mean strength of 17.17 N/mm² and a sample standard deviation of 3.28 N/mm². A lower confidence bound for true average shear strength $\mu$ with confidence level 95% is

$$17.17 - (1.645)\frac{(3.28)}{\sqrt{48}} = 17.17 - .78 = 16.39$$

That is, with a confidence level of 95%, we can say that $\mu > 16.39$. ■
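As a check on Example 7.10, the large-sample one-sided bounds can be sketched in Python. The function names are hypothetical; the critical value $z_{\alpha}$ is obtained from the standard normal quantile function rather than from a table, so it is 1.645 only to three decimal places.

```python
from math import sqrt
from statistics import NormalDist


def lower_confidence_bound(xbar, s, n, conf_level=0.95):
    """Large-sample lower bound: xbar - z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf_level)  # all of alpha goes into one tail
    return xbar - z_alpha * s / sqrt(n)


def upper_confidence_bound(xbar, s, n, conf_level=0.95):
    """Large-sample upper bound: xbar + z_alpha * s / sqrt(n)."""
    z_alpha = NormalDist().inv_cdf(conf_level)
    return xbar + z_alpha * s / sqrt(n)


# Summary statistics from Example 7.10 (shear strength, N/mm^2).
bound = lower_confidence_bound(xbar=17.17, s=3.28, n=48, conf_level=0.95)
print(f"95% lower confidence bound for mu: {bound:.2f}")
```

Note that a one-sided bound uses $z_{\alpha}$, not $z_{\alpha/2}$, which is why the 95% bound uses 1.645 where a two-sided 95% interval uses 1.96.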

EXERCISES Section 7.2 (12–27)

12. A random sample of 110 lightning flashes in a certain region resulted in a sample average radar echo duration of .81 sec and a sample standard deviation of .34 sec (“Lightning Strikes to an Airplane in a Thunderstorm,” J. of Aircraft, 1984: 607–611). Calculate a 99% (two-sided) confidence interval for the true average echo duration $\mu$, and interpret the resulting interval.

13. The article “Gas Cooking, Kitchen Ventilation, and Exposure to Combustion Products” (Indoor Air, 2006: 65–73) reported that for a sample of 50 kitchens with gas cooking appliances monitored during a one-week period, the sample mean CO2 level (ppm) was 654.16, and the sample standard deviation was 164.43.
a. Calculate and interpret a 95% (two-sided) confidence interval for true average CO2 level in the population of all homes from which the sample was selected.
b. Suppose the investigators had made a rough guess of 175 for the value of s before collecting data. What sample size would be necessary to obtain an interval width of 50 ppm for a confidence level of 95%?

14. The article “Evaluating Tunnel Kiln Performance” (Amer. Ceramic Soc. Bull., Aug. 1997: 59–63) gave the following summary information for fracture strengths (MPa) of $n = 169$ ceramic bars fired in a particular kiln: $\bar{x} = 89.10$, $s = 3.73$.
a. Calculate a (two-sided) confidence interval for true average fracture strength using a confidence level of 95%. Does it appear that true average fracture strength has been precisely estimated?
b. Suppose the investigators had believed a priori that the population standard deviation was about 4 MPa. Based on this supposition, how large a sample would have been required to estimate $\mu$ to within .5 MPa with 95% confidence?

15. Determine the confidence level for each of the following large-sample one-sided confidence bounds:
a. Upper bound: $\bar{x} + .84s/\sqrt{n}$
b. Lower bound: $\bar{x} - 2.05s/\sqrt{n}$
c. Upper bound: $\bar{x} + .67s/\sqrt{n}$

16. The alternating current (AC) breakdown voltage of an insulating liquid indicates its dielectric strength. The article “Testing Practices for the AC Breakdown Voltage Testing of Insulation Liquids” (IEEE Electrical Insulation Magazine, 1995: 21–26) gave the accompanying sample observations on breakdown voltage (kV) of a particular circuit under certain conditions.

62 50 53 57 41 53 55 61 59 64 50 53 64 62 50 68
54 55 57 50 55 50 56 55 46 55 53 54 52 47 47 55
57 48 63 57 57 55 53 59 53 52 50 55 60 50 56 58

a. Construct a boxplot of the data and comment on interesting features.
b. Calculate and interpret a 95% CI for true average breakdown voltage $\mu$. Does it appear that $\mu$ has been precisely estimated? Explain.
c. Suppose the investigator believes that virtually all values of breakdown voltage are between 40 and 70. What sample size would be appropriate for the 95% CI to have a width of 2 kV (so that $\mu$ is estimated to within 1 kV with 95% confidence)?

17. Exercise 1.13 gave a sample of ultimate tensile strength observations (ksi). Use the accompanying descriptive statistics output from Minitab to calculate a 99% lower confidence bound for true average ultimate tensile strength, and interpret the result.

N      Mean     Median   TrMean   StDev   SE Mean
153    135.39   135.40   135.41   4.59    0.37

Minimum   Maximum   Q1       Q3
122.20    147.70    132.95   138.25

18. The article “Ultimate Load Capacities of Expansion Anchor Bolts” (J. of Energy Engr., 1993: 139–158) gave the following summary data on shear strength (kip) for a sample of 3/8-in. anchor bolts: $n = 78$, $\bar{x} = 4.25$, $s = 1.30$. Calculate a lower confidence bound using a confidence level of 90% for true average shear strength.

19. The article “Limited Yield Estimation for Visual Defect Sources” (IEEE Trans. on Semiconductor Manuf., 1997: 17–23) reported that, in a study of a particular wafer inspection process, 356 dies were examined by an inspection probe and 201 of these passed the probe. Assuming a stable process, calculate a 95% (two-sided) confidence interval for the proportion of all dies that pass the probe.

20. The Associated Press (October 9, 2002) reported that in a survey of 4722 American youngsters aged 6 to 19, 15% were seriously overweight (a body mass index of at least 30; this index is a measure of weight relative to height). Calculate and interpret a confidence interval using a 99% confidence level for the proportion of all American youngsters who are seriously overweight.

21. In a sample of 1000 randomly selected consumers who had opportunities to send in a rebate claim form after purchasing a product, 250 of these people said they never did so (“Rebates: Get What You Deserve,” Consumer Reports, May 2009: 7). Reasons cited for their behavior included too many steps in the process, amount too small, missed deadline, fear of being placed on a mailing list, lost receipt, and doubts about receiving the money. Calculate an upper confidence bound at the 95% confidence level for the true proportion of such consumers who never apply for a rebate. Based on this bound, is there compelling evidence that the true proportion of such consumers is smaller than 1/3? Explain your reasoning.

22. The technology underlying hip replacements has changed as these operations have become more popular (over 250,000 in the United States in 2008). Starting in 2003, highly durable ceramic hips were marketed. Unfortunately, for too many patients the increased durability has been counterbalanced by an increased incidence of squeaking. The May 11, 2008, issue of the New York Times reported that in one study of 143 individuals who received ceramic hips between 2003 and 2005, 10 of the hips developed squeaking.
a. Calculate a lower confidence bound at the 95% confidence level for the true proportion of such hips that develop squeaking.
b. Interpret the 95% confidence level used in (a).

23. The Pew Forum on Religion and Public Life reported on Dec. 9, 2009, that in a survey of 2003 American adults, 25% said they believed in astrology.
a. Calculate and interpret a confidence interval at the 99% confidence level for the proportion of all adult Americans who believe in astrology.
b. What sample size would be required for the width of a 99% CI to be at most .05 irrespective of the value of $\hat{p}$?

24. A sample of 56 research cotton samples resulted in a sample average percentage elongation of 8.17 and a sample standard deviation of 1.42 (“An Apparent Relation Between the Spiral Angle $\phi$, the Percent Elongation E1, and the Dimensions of the Cotton Fiber,” Textile Research J., 1978: 407–410). Calculate a 95% large-sample CI for the true average percentage elongation $\mu$. What assumptions are you making about the distribution of percentage elongation?

25. A state legislator wishes to survey residents of her district to see what proportion of the electorate is aware of her position on using state funds to pay for abortions.
a. What sample size is necessary if the 95% CI for p is to have a width of at most .10 irrespective of p?
b. If the legislator has strong reason to believe that at least 2/3 of the electorate know of her position, how large a sample size would you recommend?

26. The superintendent of a large school district, having once had a course in probability and statistics, believes that the number of teachers absent on any given day has a Poisson distribution with parameter $\mu$. Use the accompanying data on absences for 50 days to obtain a large-sample CI for $\mu$. [Hint: The mean and variance of a Poisson variable both equal $\mu$, so

$$Z = \frac{\bar{X} - \mu}{\sqrt{\mu/n}}$$

has approximately a standard normal distribution. Now proceed as in the derivation of the interval for p by making a probability statement (with probability $1 - \alpha$) and solving the resulting inequalities for $\mu$; see the argument just after (7.10).]

Number of absences   0   1   2   3   4   5   6   7   8   9   10
Frequency            1   4   8   10  8   7   5   3   2   1   1

27. Reconsider the CI (7.10) for p, and focus on a confidence level of 95%. Show that the confidence limits agree quite well with those of the traditional interval (7.11) once two successes and two failures have been appended to the sample [i.e., (7.11) based on $x + 2$ S’s in $n + 4$ trials]. [Hint: $1.96 \approx 2$. Note: Agresti and Coull showed that this adjustment of the traditional interval also has an actual confidence level close to the nominal level.]

7.3 Intervals Based on a Normal Population Distribution

The CI for $\mu$ presented in Section 7.2 is valid provided that n is large. The resulting interval can be used whatever the nature of the population distribution. The CLT cannot be invoked, however, when n is small. In this case, one way to proceed is to make a specific assumption about the form of the population distribution and then derive a CI tailored to that assumption. For example, we could develop a CI for $\mu$ when the population is described by a gamma distribution, another interval for the case of a Weibull distribution, and so on. Statisticians have indeed carried out this program for a number of different distributional families. Because the normal distribution is more frequently appropriate as a population model than is any other type of distribution, we will focus here on a CI for this situation.

ASSUMPTION
The population of interest is normal, so that $X_1, \ldots, X_n$ constitutes a random sample from a normal distribution with both $\mu$ and $\sigma$ unknown.

The key result underlying the interval in Section 7.2 was that for large n, the rv $Z = (\bar{X} - \mu)/(S/\sqrt{n})$ has approximately a standard normal distribution. When n is small, S is no longer likely to be close to $\sigma$, so the variability in the distribution of Z arises from randomness in both the numerator and the denominator. This implies that the probability distribution of $(\bar{X} - \mu)/(S/\sqrt{n})$ will be more spread out than the standard normal distribution. The result on which inferences are based introduces a new family of probability distributions called t distributions.


THEOREM
When $\bar{X}$ is the mean of a random sample of size n from a normal distribution with mean $\mu$, the rv

$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{7.13}$$

has a probability distribution called a t distribution with $n - 1$ degrees of freedom (df).
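The claim that $(\bar{X} - \mu)/(S/\sqrt{n})$ is more spread out than a standard normal variable when n is small can be checked by simulation. The sketch below is illustrative only: the population parameters, number of trials, and seed are arbitrary choices, and the value 2.776 is the standard table value of $t_{.025,4}$, which is not quoted in this section. It estimates how often $\bar{x} \pm c \cdot s/\sqrt{n}$ captures $\mu$ for samples of size $n = 5$ from a normal population, first with $c = 1.96$ and then with the t critical value.

```python
import random
from math import sqrt


def coverage(n, critical_value, trials=20000, mu=10.0, sigma=2.0, seed=1):
    """Fraction of simulated normal samples for which |xbar - mu| < c * s / sqrt(n)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        if abs(xbar - mu) < critical_value * s / sqrt(n):
            hits += 1
    return hits / trials


z_cov = coverage(n=5, critical_value=1.96)   # z critical value, as in Section 7.2
t_cov = coverage(n=5, critical_value=2.776)  # t critical value with 4 df (table value)
print(f"n = 5, z critical value: estimated coverage {z_cov:.3f}")
print(f"n = 5, t critical value: estimated coverage {t_cov:.3f}")
```

With the z critical value the estimated coverage falls noticeably short of .95, while the t critical value brings it back close to the nominal level, which is the practical content of the theorem.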

Properties of t Distributions
Before applying this theorem, a discussion of properties of t distributions is in order. Although the variable of interest is still $(\bar{X} - \mu)/(S/\sqrt{n})$, we now denote it by T to emphasize that it does not have a standard normal distribution when n is small. Recall that a normal distribution is governed by two parameters; each different choice of $\mu$ in combination with $\sigma$ gives a particular normal distribution. Any particular t distribution results from specifying the value of a single parameter, called the number of degrees of freedom, abbreviated df. We’ll denote this parameter by the Greek letter $\nu$. Possible values of $\nu$ are the positive integers 1, 2, 3, . . . . So there is a t distribution with 1 df, another with 2 df, yet another with 3 df, and so on.

For any fixed value of $\nu$, the density function that specifies the associated t curve is even more complicated than the normal density function. Fortunately, we need concern ourselves only with several of the more important features of these curves.

Properties of t Distributions

Let $t_{\nu}$ denote the t distribution with $\nu$ df.

1. Each $t_{\nu}$ curve is bell-shaped and centered at 0.

2. Each $t_{\nu}$ curve is more spread out than the standard normal (z) curve.

3. As $\nu$ increases, the spread of the corresponding $t_{\nu}$ curve decreases.

4. As $\nu \to \infty$, the sequence of $t_{\nu}$ curves approaches the standard normal curve (so the z curve is often called the t curve with df $= \infty$).

Figure 7.7 illustrates several of these properties for selected values of $\nu$.

Figure 7.7 $t_{\nu}$ and z curves (curves shown: the z curve, the $t_{25}$ curve, and the $t_5$ curve, all centered at 0)


The number of df for T in (7.13) is $n - 1$ because, although S is based on the n deviations $X_1 - \bar{X}, \ldots, X_n - \bar{X}$, $\sum (X_i - \bar{X}) = 0$ implies that only $n - 1$ of these are “freely determined.” The number of df for a t variable is the number of freely determined deviations on which the estimated standard deviation in the denominator of T is based.

The use of the t distribution in making inferences requires notation for capturing t-curve tail areas analogous to $z_{\alpha}$ for the z curve. You might think that $t_{\alpha}$ would do the trick. However, the desired value depends not only on the tail area captured but also on df.

NOTATION
Let $t_{\alpha,\nu}$ = the number on the measurement axis for which the area under the t curve with $\nu$ df to the right of $t_{\alpha,\nu}$ is $\alpha$; $t_{\alpha,\nu}$ is called a t critical value.

For example, $t_{.05,6}$ is the t critical value that captures an upper-tail area of .05 under the t curve with 6 df. The general notation is illustrated in Figure 7.8. Because t curves are symmetric about zero, $-t_{\alpha,\nu}$ captures lower-tail area $\alpha$. Appendix Table A.5 gives $t_{\alpha,\nu}$ for selected values of $\alpha$ and $\nu$. This table also appears inside the back cover. The columns of the table correspond to different values of $\alpha$. To obtain $t_{.05,15}$, go to the $\alpha = .05$ column, look down to the $\nu = 15$ row, and read $t_{.05,15} = 1.753$. Similarly, $t_{.05,22} = 1.717$ (.05 column, $\nu = 22$ row), and $t_{.01,22} = 2.508$.

Figure 7.8 Illustration of a t critical value ($t_{\nu}$ curve centered at 0, with shaded area $= \alpha$ to the right of $t_{\alpha,\nu}$)

The values of $t_{\alpha,\nu}$ exhibit regular behavior as we move across a row or down a column. For fixed $\nu$, $t_{\alpha,\nu}$ increases as $\alpha$ decreases, since we must move farther to the right of zero to capture area $\alpha$ in the tail. For fixed $\alpha$, as $\nu$ is increased (i.e., as we look down any particular column of the t table) the value of $t_{\alpha,\nu}$ decreases. This is because a larger value of $\nu$ implies a t distribution with smaller spread, so it is not necessary to go so far from zero to capture tail area $\alpha$. Furthermore, $t_{\alpha,\nu}$ decreases more slowly as $\nu$ increases. Consequently, the table values are shown in increments of 2 between 30 df and 40 df and then jump to $\nu = 50, 60, 120$, and finally $\infty$. Because $t_{\infty}$ is the standard normal curve, the familiar $z_{\alpha}$ values appear in the last row of the table. The rule of thumb suggested earlier for use of the large-sample CI (if $n > 40$) comes from the approximate equality of the standard normal and t distributions for $\nu \ge 40$.

The One-Sample t Confidence Interval
The standardized variable T has a t distribution with $n - 1$ df, and the area under the corresponding t density curve between $-t_{\alpha/2,n-1}$ and $t_{\alpha/2,n-1}$ is $1 - \alpha$ (area $\alpha/2$ lies in each tail), so

$$P(-t_{\alpha/2,n-1} < T < t_{\alpha/2,n-1}) = 1 - \alpha \tag{7.14}$$

Expression (7.14) differs from expressions in previous sections in that T and $t_{\alpha/2,n-1}$ are used in place of Z and $z_{\alpha/2}$, but it can be manipulated in the same manner to obtain a confidence interval for $\mu$.
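The behavior of t critical values described in this subsection can be explored numerically without a table. The sketch below is one possible approach under stated assumptions, not the method used to build Table A.5: it integrates the standard t density with Simpson's rule to get the upper-tail area and then solves for the critical value by bisection. The function names and the numerical settings (`steps`, number of bisection iterations) are arbitrary choices.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def upper_tail_area(t, df, steps=1000):
    """Area to the right of t >= 0: 0.5 minus the Simpson's-rule area over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df):
    """Solve upper_tail_area(t, df) = alpha for t by bisection (requires alpha < 0.5)."""
    lo, hi = 0.0, 1.0
    while upper_tail_area(hi, df) > alpha:  # widen until the root is bracketed
        hi *= 2
    for _ in range(50):
        mid = (lo + hi) / 2
        if upper_tail_area(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


alpha = 0.05
for df in (6, 15, 22, 40, 120):
    print(f"t_(.05,{df}) = {t_critical(alpha, df):.3f}")
print(f"z_.05 = {NormalDist().inv_cdf(1 - alpha):.3f}")
```

For fixed $\alpha$ the printed values decrease as df grows and approach the z critical value, in line with the description of the table above.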


PROPOSITION
Let $\bar{x}$ and s be the sample mean and sample standard deviation computed from the results of a random sample from a normal population with mean $\mu$. Then a $100(1 - \alpha)\%$ confidence interval for $\mu$ is

$$\left(\bar{x} - t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}},\ \bar{x} + t_{\alpha/2,n-1} \cdot \frac{s}{\sqrt{n}}\right) \tag{7.15}$$

or, more compactly, $\bar{x} \pm t_{\alpha/2,n-1} \cdot s/\sqrt{n}$.

An upper confidence bound for $\mu$ is

$$\bar{x} + t_{\alpha,n-1} \cdot \frac{s}{\sqrt{n}}$$

and replacing $+$ by $-$ in this latter expression gives a lower confidence bound for $\mu$, both with confidence level $100(1 - \alpha)\%$.

Example 7.11
Even as traditional markets for sweetgum lumber have declined, large section solid timbers traditionally used for construction bridges and mats have become increasingly scarce. The article “Development of Novel Industrial Laminated Planks from Sweetgum Lumber” (J. of Bridge Engr., 2008: 64–66) described the manufacturing and testing of composite beams designed to add value to low-grade sweetgum lumber. Here is data on the modulus of rupture (psi; the article contained summary data expressed in MPa):

6807.99   7637.06   6663.28   6165.03   6991.41   6992.23
6981.46   7569.75   7437.88   6872.39   7663.18   6032.28
6906.04   6617.17   6984.12   7093.71   7659.50   7378.61
7295.54   6702.76   7440.17   8053.26   8284.75   7347.95
7422.69   7886.87   6316.67   7713.65   7503.33   7674.99

Figure 7.9 shows a normal probability plot from the R software. The straightness of the pattern in the plot provides strong support for assuming that the population distribution of MOR is at least approximately normal.

Figure 7.9 A normal probability plot of the modulus of rupture data (plot title: Normal Probability of MOR; horizontal axis: Theoretical Quantiles, from –2 to 2; vertical axis: Sample Quantiles, from 6000 to 8000)


The sample mean and sample standard deviation are 7203.191 and 543.5400, respectively (for anyone bent on doing hand calculation, the computational burden is eased a bit by subtracting 6000 from each $x$ value to obtain $y_i = x_i - 6000$; then $\sum y_i = 36{,}095.72$ and $\sum y_i^2 = 51{,}997{,}668.77$, from which $\bar{y} = 1203.191$ and $s_y = s_x$ as given).

Let’s now calculate a confidence interval for true average MOR using a confidence level of 95%. The CI is based on $n - 1 = 29$ degrees of freedom, so the necessary t critical value is $t_{.025,29} = 2.045$. The interval estimate is now

$$\bar{x} \pm t_{.025,29} \cdot \frac{s}{\sqrt{n}} = 7203.191 \pm (2.045) \cdot \frac{543.5400}{\sqrt{30}} = 7203.191 \pm 202.938 = (7000.253,\ 7406.129)$$

We estimate that $7000.253 < \mu < 7406.129$ with 95% confidence. If we use the same formula on sample after sample, in the long run 95% of the calculated intervals will contain $\mu$. Since the value of $\mu$ is not available, we don’t know whether the calculated interval is one of the “good” 95% or the “bad” 5%. Even with the moderately large sample size, our interval is rather wide. This is a consequence of the substantial amount of sample variability in MOR values.

A lower 95% confidence bound would result from retaining only the lower confidence limit (the one with $-$) and replacing 2.045 with $t_{.05,29} = 1.699$. ■

Unfortunately, it is not easy to select $n$ to control the width of the t interval. This is because the width involves the unknown (before the data is collected) $s$ and because $n$ enters not only through $1/\sqrt{n}$ but also through $t_{\alpha/2,n-1}$. As a result, an appropriate $n$ can be obtained only by trial and error.

In Chapter 15, we will discuss a small-sample CI for $\mu$ that is valid provided only that the population distribution is symmetric, a weaker assumption than normality. However, when the population distribution is normal, the t interval tends to be shorter than would be any other interval with the same confidence level.

A Prediction Interval for a Single Future Value

In many applications, the objective is to predict a single value of a variable to be observed at some future time, rather than to estimate the mean value of that variable.

Example 7.12

Consider the following sample of fat content (in percentage) of $n = 10$ randomly selected hot dogs (“Sensory and Mechanical Assessment of the Quality of Frankfurters,” J. of Texture Studies, 1990: 395–409):

25.2 21.3 22.8 17.0 29.8 21.0 25.5 16.0 20.9 19.5

Assuming that these were selected from a normal population distribution, a 95% CI for (interval estimate of) the population mean fat content is

$$\bar{x} \pm t_{.025,9} \cdot \frac{s}{\sqrt{n}} = 21.90 \pm 2.262 \cdot \frac{4.134}{\sqrt{10}} = 21.90 \pm 2.96 = (18.94,\ 24.86)$$

Suppose, however, you are going to eat a single hot dog of this type and want a prediction for the resulting fat content. A point prediction, analogous to a point estimate, is just $\bar{x} = 21.90$. This prediction unfortunately gives no information about reliability or precision. ■
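The arithmetic behind the two t intervals for the MOR data and the hot dog data can be checked with a short Python sketch. The function name `t_interval` is illustrative rather than anything from the text, and the critical values 2.045 and 2.262 are typed in from the examples instead of being computed, so only the standard library is needed.

```python
import math
import statistics

def t_interval(xbar, s, n, t_crit):
    """Two-sided t CI for the mean: xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# MOR example: summary statistics from the text, t_{.025,29} = 2.045
mor_lo, mor_hi = t_interval(7203.191, 543.5400, 30, 2.045)

# Hot dog fat content (Example 7.12): raw data, t_{.025,9} = 2.262
fat = [25.2, 21.3, 22.8, 17.0, 29.8, 21.0, 25.5, 16.0, 20.9, 19.5]
fat_mean = statistics.mean(fat)
fat_sd = statistics.stdev(fat)   # sample standard deviation (n - 1 divisor)
fat_lo, fat_hi = t_interval(fat_mean, fat_sd, len(fat), 2.262)

print(f"MOR: ({mor_lo:.3f}, {mor_hi:.3f})")
print(f"fat: mean={fat_mean:.2f}, s={fat_sd:.3f}, CI=({fat_lo:.2f}, {fat_hi:.2f})")
```

Passing the critical value in as an argument keeps the sketch independent of any particular table or library; a different confidence level only requires a different `t_crit`.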


The general setup is as follows: We have available a random sample $X_1, X_2, \ldots, X_n$ from a normal population distribution, and wish to predict the value of $X_{n+1}$, a single future observation (e.g., the lifetime of a single lightbulb to be purchased or the fuel efficiency of a single vehicle to be rented). A point predictor is $\bar{X}$, and the resulting prediction error is $\bar{X} - X_{n+1}$. The expected value of the prediction error is

$$E(\bar{X} - X_{n+1}) = E(\bar{X}) - E(X_{n+1}) = \mu - \mu = 0$$

Since $X_{n+1}$ is independent of $X_1, \ldots, X_n$, it is independent of $\bar{X}$, so the variance of the prediction error is

$$V(\bar{X} - X_{n+1}) = V(\bar{X}) + V(X_{n+1}) = \frac{\sigma^2}{n} + \sigma^2 = \sigma^2\left(1 + \frac{1}{n}\right)$$

The prediction error is a linear combination of independent, normally distributed rv’s, so itself is normally distributed. Thus

$$Z = \frac{(\bar{X} - X_{n+1}) - 0}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}} = \frac{\bar{X} - X_{n+1}}{\sqrt{\sigma^2\left(1 + \frac{1}{n}\right)}}$$

has a standard normal distribution. It can be shown that replacing $\sigma$ by the sample standard deviation $S$ (of $X_1, \ldots, X_n$) results in

$$T = \frac{\bar{X} - X_{n+1}}{S\sqrt{1 + \frac{1}{n}}} \sim t \text{ distribution with } n - 1 \text{ df}$$

Manipulating this $T$ variable as $T = (\bar{X} - \mu)/(S/\sqrt{n})$ was manipulated in the development of a CI gives the following result.

PROPOSITION

A prediction interval (PI) for a single observation to be selected from a normal population distribution is

$$\bar{x} \pm t_{\alpha/2,n-1} \cdot s\sqrt{1 + \frac{1}{n}} \qquad (7.16)$$

The prediction level is $100(1 - \alpha)\%$. A lower prediction bound results from replacing $t_{\alpha/2}$ by $t_{\alpha}$ and discarding the $+$ part of (7.16); a similar modification gives an upper prediction bound.

The interpretation of a 95% prediction level is similar to that of a 95% confidence level; if the interval (7.16) is calculated for sample after sample, in the long run 95% of these intervals will include the corresponding future values of $X$.

Example 7.13 (Example 7.12 continued)

With $n = 10$, $\bar{x} = 21.90$, $s = 4.134$, and $t_{.025,9} = 2.262$, a 95% PI for the fat content of a single hot dog is

$$21.90 \pm (2.262)(4.134)\sqrt{1 + \frac{1}{10}} = 21.90 \pm 9.81 = (12.09,\ 31.71)$$

This interval is quite wide, indicating substantial uncertainty about fat content. Notice that the width of the PI is more than three times that of the CI. ■
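As a rough check of Example 7.13, the sketch below evaluates formula (7.16) and compares the PI width with the CI width from Example 7.12. The function name `prediction_interval` is an illustrative choice, and the critical value is typed in from the example rather than computed. For a common critical value, the ratio of the two widths works out to $\sqrt{n + 1}$, which is why the PI here is more than three times as wide.

```python
import math

def prediction_interval(xbar, s, n, t_crit):
    """PI (7.16) for one future observation: xbar +/- t_crit * s * sqrt(1 + 1/n)."""
    half_width = t_crit * s * math.sqrt(1 + 1 / n)
    return xbar - half_width, xbar + half_width

n, xbar, s, t_crit = 10, 21.90, 4.134, 2.262   # values from Example 7.13

pi_lo, pi_hi = prediction_interval(xbar, s, n, t_crit)
ci_half = t_crit * s / math.sqrt(n)            # CI half-width from Example 7.12
width_ratio = (pi_hi - pi_lo) / (2 * ci_half)  # algebraically sqrt(n + 1)

print(f"95% PI: ({pi_lo:.2f}, {pi_hi:.2f})")
print(f"PI width / CI width: {width_ratio:.3f}")
```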


The error of prediction is $\bar{X} - X_{n+1}$, a difference between two random variables, whereas the estimation error is $\bar{X} - \mu$, the difference between a random variable and a fixed (but unknown) value. The PI is wider than the CI because there is more variability in the prediction error (due to $X_{n+1}$) than in the estimation error. In fact, as $n$ gets arbitrarily large, the CI shrinks to the single value $\mu$, and the PI approaches $\mu \pm z_{\alpha/2} \cdot \sigma$. There is uncertainty about a single $X$ value even when there is no need to estimate.

Tolerance Intervals

Consider a population of automobiles of a certain type, and suppose that under specified conditions, fuel efficiency (mpg) has a normal distribution with $\mu = 30$ and $\sigma = 2$. Then since the interval from $-1.645$ to $1.645$ captures 90% of the area under the z curve, 90% of all these automobiles will have fuel efficiency values between $\mu - 1.645\sigma = 26.71$ and $\mu + 1.645\sigma = 33.29$. But what if the values of $\mu$ and $\sigma$ are not known? We can take a sample of size $n$, determine the fuel efficiencies, $\bar{x}$, and $s$, and form the interval whose lower limit is $\bar{x} - 1.645s$ and whose upper limit is $\bar{x} + 1.645s$. However, because of sampling variability in the estimates of $\mu$ and $\sigma$, there is a good chance that the resulting interval will include less than 90% of the population values. Intuitively, to have an a priori 95% chance of the resulting interval including at least 90% of the population values, when $\bar{x}$ and $s$ are used in place of $\mu$ and $\sigma$ we should also replace 1.645 by some larger number. For example, when $n = 20$, the value 2.310 is such that we can be 95% confident that the interval $\bar{x} \pm 2.310s$ will include at least 90% of the fuel efficiency values in the population.

Let $k$ be a number between 0 and 100. A tolerance interval for capturing at least $k\%$ of the values in a normal population distribution with a confidence level 95% has the form

$$\bar{x} \pm (\text{tolerance critical value}) \cdot s$$

Tolerance critical values for $k = 90$, 95, and 99 in combination with various sample sizes are given in Appendix Table A.6. This table also includes critical values for a confidence level of 99% (these values are larger than the corresponding 95% values). Replacing $\pm$ by $+$ gives an upper tolerance bound, and using $-$ in place of $\pm$ results in a lower tolerance bound. Critical values for obtaining these one-sided bounds also appear in Appendix Table A.6.
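The text takes tolerance critical values from Appendix Table A.6 without deriving them. One commonly used closed-form approximation for the two-sided factor, usually attributed to Howe, is $\sqrt{(n-1)(1 + 1/n)\,z^2 / \chi^2_{.95,n-1}}$ for 95% confidence, where $z$ is the standard normal critical value for the desired coverage and $\chi^2_{.95,n-1}$ is the chi-squared value with area .95 to its right. The sketch below is an illustration under those assumptions and is not claimed to be how the table was built: the function name `howe_factor` is invented here, and the chi-squared values (10.117 for 19 df, 7.261 for 15 df) are typed in from a standard chi-squared table so that only the standard library is required.

```python
import math
from statistics import NormalDist

def howe_factor(n, coverage, chi2_lower):
    """Approximate two-sided normal tolerance factor (Howe's formula).

    coverage   -- proportion of the population to capture (e.g. 0.90)
    chi2_lower -- chi-squared value with n - 1 df having area
                  (1 - confidence) to its LEFT, supplied from a table
    """
    z = NormalDist().inv_cdf((1 + coverage) / 2)   # two-sided z critical value
    return math.sqrt((n - 1) * (1 + 1 / n) * z**2 / chi2_lower)

# 95% confidence; assumed table values: 10.117 (19 df) and 7.261 (15 df)
factor_n20 = howe_factor(20, 0.90, 10.117)   # text quotes 2.310 for n = 20
factor_n16 = howe_factor(16, 0.95, 7.261)    # compare with tabled values for n = 16

print(f"n = 20, 90% coverage: {factor_n20:.3f}")
print(f"n = 16, 95% coverage: {factor_n16:.3f}")
```

Both results exceed the corresponding z values (1.645 and 1.96), which matches the intuition in the text: estimating $\mu$ and $\sigma$ from a sample requires a larger multiplier than the known-parameter case.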

Example 7.14

As part of a larger project to study the behavior of stressed-skin panels, a structural component being used extensively in North America, the article “Time-Dependent Bending Properties of Lumber” (J. of Testing and Eval., 1996: 187–193) reported on various mechanical properties of Scotch pine lumber specimens. Consider the following observations on modulus of elasticity (MPa) obtained 1 minute after loading in a certain configuration:

10,490 16,620 17,300 15,480 12,970 17,260 13,400 13,900

13,630 13,260 14,370 11,700 15,470 17,840 14,070 14,760

There is a pronounced linear pattern in a normal probability plot of the data. Relevant summary quantities are $n = 16$, $\bar{x} = 14{,}532.5$, $s = 2055.67$. For a confidence level of 95%, a two-sided tolerance interval for capturing at least 95% of the modulus of elasticity values for specimens of lumber in the population sampled uses the tolerance critical value of 2.903. The resulting interval is

$$14{,}532.5 \pm (2.903)(2055.67) = 14{,}532.5 \pm 5967.6 = (8{,}564.9,\ 20{,}500.1)$$


We can be highly confident that at least 95% of all lumber specimens have modulus of elasticity values between 8,564.9 and 20,500.1.

The 95% CI for $\mu$ is (13,437.3, 15,627.7), and the 95% prediction interval for the modulus of elasticity of a single lumber specimen is (10,017.0, 19,048.0). Both the prediction interval and the tolerance interval are substantially wider than the confidence interval. ■
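The three intervals quoted for the lumber data can be reproduced side by side, which makes the ordering of their widths easy to see. In this sketch the tolerance critical value 2.903 comes from the example, while $t_{.025,15} = 2.131$ is an assumed value read from a standard t table, since the example does not quote it; the variable names are illustrative.

```python
import math
import statistics

# Modulus of elasticity (MPa) from Example 7.14
moe = [10490, 16620, 17300, 15480, 12970, 17260, 13400, 13900,
       13630, 13260, 14370, 11700, 15470, 17840, 14070, 14760]

n = len(moe)
xbar = statistics.mean(moe)
s = statistics.stdev(moe)      # sample standard deviation (n - 1 divisor)

T_CRIT = 2.131      # assumed t_{.025,15} from a t table (not quoted in the text)
TOL_CRIT = 2.903    # tolerance critical value quoted in the example

ci_half = T_CRIT * s / math.sqrt(n)            # estimates the population mean
pi_half = T_CRIT * s * math.sqrt(1 + 1 / n)    # predicts one future value
tol_half = TOL_CRIT * s                        # captures at least 95% of the population

for label, half in [("CI", ci_half), ("PI", pi_half), ("tolerance", tol_half)]:
    print(f"{label:>9}: ({xbar - half:.1f}, {xbar + half:.1f})")
```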

Intervals Based on Nonnormal Population Distributions

The one-sample t CI for $\mu$ is robust to small or even moderate departures from normality unless $n$ is quite small. By this we mean that if a critical value for 95% confidence, for example, is used in calculating the interval, the actual confidence level will be reasonably close to the nominal 95% level. If, however, $n$ is small and the population distribution is highly nonnormal, then the actual confidence level may be considerably different from the one you think you are using when you obtain a particular critical value from the t table. It would certainly be distressing to believe that your confidence level is about 95% when in fact it was really more like 88%! The bootstrap technique, introduced in Section 7.1, has been found to be quite successful at estimating parameters in a wide variety of nonnormal situations.

In contrast to the confidence interval, the validity of the prediction and tolerance intervals described in this section is closely tied to the normality assumption. These latter intervals should not be used in the absence of compelling evidence for normality. The excellent reference Statistical Intervals, cited in the bibliography at the end of this chapter, discusses alternative procedures of this sort for various other situations.

EXERCISES Section 7.3 (28–41)

28. Determine the values of the following quantities:

a. $t_{.1,15}$

b. $t_{.05,15}$

c. $t_{.05,25}$

d. $t_{.05,40}$

e. $t_{.005,40}$

29. Determine the t critical value(s) that will capture the desired t-curve area in each of the following cases:

a. Central area = .95, df = 10

b. Central area = .95, df = 20

c. Central area = .99, df = 20

d. Central area = .99, df = 50

e. Upper-tail area = .01, df = 25

f. Lower-tail area = .025, df = 5

30. Determine the t critical value for a two-sided confidence interval in each of the following situations:

a. Confidence level = 95%, df = 10

b. Confidence level = 95%, df = 15

c. Confidence level = 99%, df = 15

d. Confidence level = 99%, n = 5

e. Confidence level = 98%, df = 24

f. Confidence level = 99%, n = 38

31. Determine the t critical value for a lower or an upper confidence bound for each of the situations described in Exercise 30.

32. According to the article “Fatigue Testing of Condoms” (Polymer Testing, 2009: 567–571), “tests currently used for condoms are surrogates for the challenges they face in use,” including a test for holes, an inflation test, a package seal test, and tests of dimensions and lubricant quality (all fertile territory for the use of statistical methodology!). The investigators developed a new test that adds cyclic strain to a level well below breakage and determines the number of cycles to break. A sample of 20 condoms of one particular type resulted in a sample mean number of 1584 and a sample standard deviation of 607. Calculate and interpret a confidence interval at the 99% confidence level for the true average number of cycles to break. [Note: The article presented the results of hypothesis tests based on the t distribution; the validity of these depends on assuming normal population distributions.]

33. The article “Measuring and Understanding the Aging of Kraft Insulating Paper in Power Transformers” (IEEE Electrical Insul. Mag., 1996: 28–34) contained the following observations on degree of polymerization for paper specimens for which viscosity times concentration fell in a certain middle range:

418 421 421 422 425 427 431

434 437 439 446 447 448 453

454 463 465


a. Construct a boxplot of the data and comment on any interesting features.

b. Is it plausible that the given sample observations were selected from a normal distribution?

c. Calculate a two-sided 95% confidence interval for true average degree of polymerization (as did the authors of the article). Does the interval suggest that 440 is a plausible value for true average degree of polymerization? What about 450?

34. A sample of 14 joint specimens of a particular type gave a sample mean proportional limit stress of 8.48 MPa and a sample standard deviation of .79 MPa (“Characterization of Bearing Strength Factors in Pegged Timber Connections,” J. of Structural Engr., 1997: 326–332).

a. Calculate and interpret a 95% lower confidence bound for the true average proportional limit stress of all such joints. What, if any, assumptions did you make about the distribution of proportional limit stress?

b. Calculate and interpret a 95% lower prediction bound for the proportional limit stress of a single joint of this type.

35. Silicone implant augmentation rhinoplasty is used to correct congenital nose deformities. The success of the procedure depends on various biomechanical properties of the human nasal periosteum and fascia. The article “Biomechanics in Augmentation Rhinoplasty” (J. of Med. Engr. and Tech., 2005: 14–17) reported that for a sample of 15 (newly deceased) adults, the mean failure strain (%) was 25.0, and the standard deviation was 3.5.

a. Assuming a normal distribution for failure strain, estimate true average strain in a way that conveys information about precision and reliability.

b. Predict the strain for a single adult in a way that conveys information about precision and reliability. How does the prediction compare to the estimate calculated in part (a)?

36. The $n = 26$ observations on escape time given in Exercise 36 of Chapter 1 give a sample mean and sample standard deviation of 370.69 and 24.36, respectively.

a. Calculate an upper confidence bound for population mean escape time using a confidence level of 95%.

b. Calculate an upper prediction bound for the escape time of a single additional worker using a prediction level of 95%. How does this bound compare with the confidence bound of part (a)?

c. Suppose that two additional workers will be chosen to participate in the simulated escape exercise. Denote their escape times by $X_{27}$ and $X_{28}$, and let $\bar{X}_{\text{new}}$ denote the average of these two values. Modify the formula for a PI for a single $x$ value to obtain a PI for $\bar{X}_{\text{new}}$, and calculate a 95% two-sided interval based on the given escape data.

37. A study of the ability of individuals to walk in a straight line (“Can We Really Walk Straight?” Amer. J. of Physical Anthro., 1992: 19–27) reported the accompanying data on cadence (strides per second) for a sample of $n = 20$ randomly selected healthy men.

.95 .85 .92 .95 .93 .86 1.00 .92 .85 .81

.78 .93 .93 1.05 .93 1.06 1.06 .96 .81 .96

A normal probability plot gives substantial support to the assumption that the population distribution of cadence is approximately normal. A descriptive summary of the data from Minitab follows:

Variable   N    Mean     Median   TrMean   StDev    SEMean
cadence    20   0.9255   0.9300   0.9261   0.0809   0.0181

Variable   Min      Max      Q1       Q3
cadence    0.7800   1.0600   0.8525   0.9600

a. Calculate and interpret a 95% confidence interval for population mean cadence.

b. Calculate and interpret a 95% prediction interval for the cadence of a single individual randomly selected from this population.

c. Calculate an interval that includes at least 99% of the cadences in the population distribution using a confidence level of 95%.

38. A sample of 25 pieces of laminate used in the manufacture of circuit boards was selected, and the amount of warpage (in.) under particular conditions was determined for each piece, resulting in a sample mean warpage of .0635 and a sample standard deviation of .0065.

a. Calculate a prediction for the amount of warpage of a single piece of laminate in a way that provides information about precision and reliability.

b. Calculate an interval for which you can have a high degree of confidence that at least 95% of all pieces of laminate result in amounts of warpage that are between the two limits of the interval.

39. Exercise 72 of Chapter 1 gave the following observations on a receptor binding measure (adjusted distribution volume) for a sample of 13 healthy individuals: 23, 39, 40, 41, 43, 47, 51, 58, 63, 66, 67, 69, 72.

a. Is it plausible that the population distribution from which this sample was selected is normal?

b. Calculate an interval for which you can be 95% confident that at least 95% of all healthy individuals in the population have adjusted distribution volumes lying between the limits of the interval.

c. Predict the adjusted distribution volume of a single healthy individual by calculating a 95% prediction interval. How does this interval’s width compare to the width of the interval calculated in part (b)?

40. Exercise 13 of Chapter 1 presented a sample of $n = 153$ observations on ultimate tensile strength, and Exercise 17 of the previous section gave summary quantities and requested a large-sample confidence interval. Because the sample size is large, no assumptions about the population distribution are required for the validity of the CI.

a. Is any assumption about the tensile-strength distribution required prior to calculating a lower prediction bound for the tensile strength of the next specimen selected using the method described in this section? Explain.

b. Use a statistical software package to investigate the plausibility of a normal population distribution.

c. Calculate a lower prediction bound with a prediction level of 95% for the ultimate tensile strength of the next specimen selected.

41. A more extensive tabulation of t critical values than what appears in this book shows that for the t distribution with 20 df, the areas to the right of the values .687, .860, and 1.064 are .25, .20, and .15, respectively. What is the confidence level for each of the following three confidence intervals for the mean $\mu$ of a normal population distribution? Which of the three intervals would you recommend be used, and why?

a. $(\bar{x} - .687s/\sqrt{21},\ \bar{x} + 1.725s/\sqrt{21})$

b. $(\bar{x} - .860s/\sqrt{21},\ \bar{x} + 1.325s/\sqrt{21})$

c. $(\bar{x} - 1.064s/\sqrt{21},\ \bar{x} + 1.064s/\sqrt{21})$

7.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population

Although inferences concerning a population variance $\sigma^2$ or standard deviation $\sigma$ are usually of less interest than those about a mean or proportion, there are occasions when such procedures are needed. In the case of a normal population distribution, inferences are based on the following result concerning the sample variance $S^2$.

THEOREM

Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$. Then the rv

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum (X_i - \bar{X})^2}{\sigma^2}$$

has a chi-squared ($\chi^2$) probability distribution with $n - 1$ df.

As discussed in Sections 4.4 and 7.1, the chi-squared distribution is a continuous probability distribution with a single parameter $\nu$, called the number of degrees of freedom, with possible values 1, 2, 3, . . . . The graphs of several $\chi^2$ probability density functions (pdf’s) are illustrated in Figure 7.10. Each pdf $f(x; \nu)$ is positive only for $x > 0$, and each has a positive skew (long upper tail), though the distribution moves rightward and becomes more symmetric as $\nu$ increases. To specify inferential procedures that use the chi-squared distribution, we need notation analogous to that for a t critical value $t_{\alpha,\nu}$.

[Figure 7.10 Graphs of chi-squared density functions $f(x; \nu)$ for $\nu = 8$, 12, and 20]

NOTATION

Let $\chi^2_{\alpha,\nu}$, called a chi-squared critical value, denote the number on the horizontal axis such that $\alpha$ of the area under the chi-squared curve with $\nu$ df lies to the right of $\chi^2_{\alpha,\nu}$.


Symmetry of t distributions made it necessary to tabulate only upper-tailed t critical values ($t_{\alpha,\nu}$ for small values of $\alpha$). The chi-squared distribution is not symmetric, so Appendix Table A.7 contains values of $\chi^2_{\alpha,\nu}$ both for $\alpha$ near 0 and near 1, as illustrated in Figure 7.11(b). For example, $\chi^2_{.025,14} = 26.119$, and $\chi^2_{.95,20}$ (the 5th percentile) $= 10.851$.

[Figure 7.11 $\chi^2_{\alpha,\nu}$ notation illustrated: (a) $\chi^2_\nu$ density curve with shaded area $= \alpha$ to the right of $\chi^2_{\alpha,\nu}$; (b) each shaded area $= .01$, to the left of $\chi^2_{.99,\nu}$ and to the right of $\chi^2_{.01,\nu}$]

The rv $(n-1)S^2/\sigma^2$ satisfies the two properties on which the general method for obtaining a CI is based: It is a function of the parameter of interest $\sigma^2$, yet its probability distribution (chi-squared) does not depend on this parameter. The area under a chi-squared curve with $\nu$ df to the right of $\chi^2_{\alpha/2,\nu}$ is $\alpha/2$, as is the area to the left of $\chi^2_{1-\alpha/2,\nu}$. Thus the area captured between these two critical values is $1 - \alpha$. As a consequence of this and the theorem just stated,

$$P\left(\chi^2_{1-\alpha/2,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha \qquad (7.17)$$

The inequalities in (7.17) are equivalent to

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}}$$

Substituting the computed value $s^2$ into the limits gives a CI for $\sigma^2$, and taking square roots gives an interval for $\sigma$.

A $100(1 - \alpha)\%$ confidence interval for the variance $\sigma^2$ of a normal population has lower limit

$$(n-1)s^2/\chi^2_{\alpha/2,n-1}$$

and upper limit

$$(n-1)s^2/\chi^2_{1-\alpha/2,n-1}$$

A confidence interval for $\sigma$ has lower and upper limits that are the square roots of the corresponding limits in the interval for $\sigma^2$. An upper or a lower confidence bound results from replacing $\alpha/2$ with $\alpha$ in the corresponding limit of the CI.
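The step from the probability statement (7.17) to an interval for $\sigma^2$ is short, and it may help to see it written out. The following is a worked sketch that uses only the quantities already defined in this section. All three parts of the inequality are positive, so taking reciprocals reverses the inequalities, and multiplying through by $(n-1)S^2$ then isolates $\sigma^2$:

$$\chi^2_{1-\alpha/2,n-1} < \frac{(n-1)S^2}{\sigma^2} < \chi^2_{\alpha/2,n-1}
\iff \frac{1}{\chi^2_{\alpha/2,n-1}} < \frac{\sigma^2}{(n-1)S^2} < \frac{1}{\chi^2_{1-\alpha/2,n-1}}
\iff \frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} < \sigma^2 < \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}}$$

The reversal explains why the larger critical value $\chi^2_{\alpha/2,n-1}$ appears in the lower limit and the smaller critical value $\chi^2_{1-\alpha/2,n-1}$ appears in the upper limit.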


Example 7.15

The accompanying data on breakdown voltage of electrically stressed circuits was read from a normal probability plot that appeared in the article “Damage of Flexible Printed Wiring Boards Associated with Lightning-Induced Voltage Surges” (IEEE Transactions on Components, Hybrids, and Manuf. Tech., 1985: 214–220). The straightness of the plot gave strong support to the assumption that breakdown voltage is approximately normally distributed.

1470 1510 1690 1740 1900 2000 2030 2100 2190

2200 2290 2380 2390 2480 2500 2580 2700

Let $\sigma^2$ denote the variance of the breakdown voltage distribution. The computed value of the sample variance is $s^2 = 137{,}324.3$, the point estimate of $\sigma^2$. With $\text{df} = n - 1 = 16$, a 95% CI requires $\chi^2_{.975,16} = 6.908$ and $\chi^2_{.025,16} = 28.845$. The interval is

$$\left(\frac{16(137{,}324.3)}{28.845},\ \frac{16(137{,}324.3)}{6.908}\right) = (76{,}172.3,\ 318{,}064.4)$$

Taking the square root of each endpoint yields (276.0, 564.0) as the 95% CI for $\sigma$. These intervals are quite wide, reflecting substantial variability in breakdown voltage in combination with a small sample size. ■

CIs for $\sigma^2$ and $\sigma$ when the population distribution is not normal can be difficult to obtain. For such cases, consult a knowledgeable statistician.
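The computation in Example 7.15 can be sketched in a few lines of Python. The function name `variance_interval` and its parameter names are illustrative, and the two chi-squared critical values are typed in from the example rather than computed, so only the standard library is used.

```python
import math
import statistics

def variance_interval(s2, n, chi2_upper, chi2_lower):
    """CI for sigma^2: ((n-1)s^2 / chi2_{a/2,n-1}, (n-1)s^2 / chi2_{1-a/2,n-1})."""
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower

# Breakdown voltage data from Example 7.15
voltage = [1470, 1510, 1690, 1740, 1900, 2000, 2030, 2100, 2190,
           2200, 2290, 2380, 2390, 2480, 2500, 2580, 2700]

n = len(voltage)
s2 = statistics.variance(voltage)   # sample variance (n - 1 divisor)

# Critical values quoted in the example for 16 df
var_lo, var_hi = variance_interval(s2, n, chi2_upper=28.845, chi2_lower=6.908)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"s^2 = {s2:.1f}")
print(f"95% CI for sigma^2: ({var_lo:.1f}, {var_hi:.1f})")
print(f"95% CI for sigma:   ({sd_lo:.1f}, {sd_hi:.1f})")
```

Because the sketch works from the unrounded sample variance, the endpoints for $\sigma^2$ can differ from the printed ones in the last digit; the interval for $\sigma$ agrees to one decimal place.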

EXERCISES Section 7.4 (42–46)

42. Determine the values of the following quantities:

a. $\chi^2_{.1,15}$

b. $\chi^2_{.1,25}$

c. $\chi^2_{.01,25}$

d. $\chi^2_{.005,25}$

e. $\chi^2_{.99,25}$

f. $\chi^2_{.995,25}$

43. Determine the following:

a. The 95th percentile of the chi-squared distribution with $\nu = 10$

b. The 5th percentile of the chi-squared distribution with $\nu = 10$

c. $P(10.98 \le \chi^2 \le 36.78)$, where $\chi^2$ is a chi-squared rv with $\nu = 22$

d. $P(\chi^2 < 14.611 \text{ or } \chi^2 > 37.652)$, where $\chi^2$ is a chi-squared rv with $\nu = 25$

44. The amount of lateral expansion (mils) was determined for a sample of $n = 9$ pulsed-power gas metal arc welds used in LNG ship containment tanks. The resulting sample standard deviation was $s = 2.81$ mils. Assuming normality, derive a 95% CI for $\sigma^2$ and for $\sigma$.

45. The following observations were made on fracture toughness of a base plate of 18% nickel maraging steel [“Fracture Testing of Weldments,” ASTM Special Publ. No. 381, 1965: 328–356 (in ksi$\sqrt{\text{in.}}$, given in increasing order)]:

69.5 71.9 72.6 73.1 73.3 73.5 75.5 75.7

75.8 76.1 76.2 76.2 77.0 77.9 78.1 79.6

79.7 79.9 80.1 82.2 83.7 93.7

Calculate a 99% CI for the standard deviation of the fracture-toughness distribution. Is this interval valid whatever the nature of the distribution? Explain.

46. The article “Concrete Pressure on Formwork” (Mag. of Concrete Res., 2009: 407–417) gave the following observations on maximum concrete pressure (kN/m$^2$):

33.2 41.8 37.3 40.2 36.7 39.1 36.2 41.8

36.0 35.2 36.7 38.9 35.8 35.2 40.1

a. Is it plausible that this sample was selected from a normal population distribution?

b. Calculate an upper confidence bound with confidence level 95% for the population standard deviation of maximum pressure.

SUPPLEMENTARY EXERCISES (47–62)

47. Example 1.11 introduced the accompanying observations on bond strength.

11.5 12.1 9.9 9.3 7.8 6.2 6.6 7.0 13.4 17.1 9.3 5.6 5.7 5.4 5.2 5.1 4.9 10.7 15.2 8.5 4.2 4.0 3.9 3.8 3.6 3.4 20.6 25.5 13.8 12.6 13.1 8.9 8.2 10.7 14.2 7.6 5.2 5.5 5.1 5.0 5.2 4.8 4.1 3.8 3.7 3.6 3.6 3.6

a. Estimate true average bond strength in a way that conveys information about precision and reliability. [Hint: $\sum x_i = 387.8$ and $\sum x_i^2 = 4247.08$.]

b. Calculate a 95% CI for the proportion of all such bonds whose strength values would exceed 10.

48. A triathlon consisting of swimming, cycling, and running is one of the more strenuous amateur sporting events. The article “Cardiovascular and Thermal Response of Triathlon Performance” (Medicine and Science in Sports and Exercise, 1988: 385–389) reports on a research study involving nine male triathletes. Maximum heart rate (beats/min) was recorded during performance of each of the three events. For swimming, the sample mean and sample standard deviation were 188.0 and 7.2, respectively. Assuming that the heart-rate distribution is (approximately) normal, construct a 98% CI for true mean heart rate of triathletes while swimming.

49. For each of 18 preserved cores from oil-wet carbonate reservoirs, the amount of residual gas saturation after a solvent injection was measured at water flood-out. Observations, in percentage of pore volume, were

23.5 31.5 34.0 46.7 45.6 32.5

41.4 37.2 42.5 46.9 51.5 36.4

44.5 35.7 33.5 39.3 22.0 51.2

(See “Relative Permeability Studies of Gas-Water Flow Following Solvent Injection in Carbonate Rocks,” Soc. of Petroleum Engineers J., 1976: 23–30.)

a. Construct a boxplot of this data, and comment on any interesting features.

b. Is it plausible that the sample was selected from a normal population distribution?

c. Calculate a 98% CI for the true average amount of residual gas saturation.

50. A journal article reports that a sample of size 5 was used as a basis for calculating a 95% CI for the true average natural frequency (Hz) of delaminated beams of a certain type. The resulting interval was (229.764, 233.504). You decide that a confidence level of 99% is more appropriate than the 95% level used. What are the limits of the 99% interval? [Hint: Use the center of the interval and its width to determine $\bar{x}$ and $s$.]

51. An April 2009 survey of 2253 American adults conducted by the Pew Research Center’s Internet & American Life Project revealed that 1262 of the respondents had at some point used wireless means for online access.

a. Calculate and interpret a 95% CI for the proportion of all American adults who at the time of the survey had used wireless means for online access.

b. What sample size is required if the desired width of the 95% CI is to be at most .04, irrespective of the sample results?

c. Does the upper limit of the interval in (a) specify a 95% upper confidence bound for the proportion being estimated? Explain.

52. High concentration of the toxic element arsenic is all too common in groundwater. The article “Evaluation of Treatment Systems for the Removal of Arsenic from Groundwater” (Practice Periodical of Hazardous, Toxic, and Radioactive Waste Mgmt., 2005: 152–157) reported that for a sample of $n = 5$ water specimens selected for treatment by coagulation, the sample mean arsenic concentration was 24.3 μg/L, and the sample standard deviation was 4.1. The authors of the cited article used t-based methods to analyze their data, so hopefully had reason to believe that the distribution of arsenic concentration was normal.

a. Calculate and interpret a 95% CI for true average arsenic concentration in all such water specimens.

b. Calculate a 90% upper confidence bound for the standard deviation of the arsenic concentration distribution.

c. Predict the arsenic concentration for a single water specimen in a way that conveys information about precision and reliability.

53. Aphid infestation of fruit trees can be controlled either by spraying with pesticide or by inundation with ladybugs. In a particular area, four different groves of fruit trees are selected for experimentation. The first three groves are sprayed with pesticides 1, 2, and 3, respectively, and the fourth is treated with ladybugs, with the following results on yield:

Treatment   $n_i$ = Number of Trees   $\bar{x}_i$ (Bushels/Tree)   $s_i$
1           100                       10.5                         1.5
2           90                        10.0                         1.3
3           100                       10.1                         1.8
4           120                       10.7                         1.6


Let $\mu_i$ = the true average yield (bushels/tree) after receiving the $i$th treatment. Then

$$\theta = \frac{1}{3}(\mu_1 + \mu_2 + \mu_3) - \mu_4$$

measures the difference in true average yields between treatment with pesticides and treatment with ladybugs. When $n_1$, $n_2$, $n_3$, and $n_4$ are all large, the estimator $\hat{\theta}$ obtained by replacing each $\mu_i$ by $\bar{X}_i$ is approximately normal. Use this to derive a large-sample $100(1 - \alpha)\%$ CI for $\theta$, and compute the 95% interval for the given data.

54. It is important that face masks used by firefighters be able to withstand high temperatures because firefighters commonly work in temperatures of 200–500°F. In a test of one type of mask, 11 of 55 masks had lenses pop out at 250°. Construct a 90% CI for the true proportion of masks of this type whose lenses would pop out at 250°.

55. A manufacturer of college textbooks is interested in estimating the strength of the bindings produced by a particular binding machine. Strength can be measured by recording the force required to pull the pages from the binding. If this force is measured in pounds, how many books should be tested to estimate the average force required to break the binding to within .1 lb with 95% confidence? Assume that $\sigma$ is known to be .8.

56. Chronic exposure to asbestos fiber is a well-known health hazard. The article “The Acute Effects of Chrysotile Asbestos Exposure on Lung Function” (Environ. Research, 1978: 360–372) reports results of a study based on a sample of construction workers who had been exposed to asbestos over a prolonged period. Among the data given in the article were the following (ordered) values of pulmonary compliance (cm$^3$/cm H$_2$O) for each of 16 subjects 8 months after the exposure period (pulmonary compliance is a measure of lung elasticity, or how effectively the lungs are able to inhale and exhale):

167.9 180.8 184.8 189.8 194.8 200.2

201.9 206.9 207.2 208.4 226.3 227.7

228.5 232.4 239.8 258.6

a. Is it plausible that the population distribution is normal?

b. Compute a 95% CI for the true average pulmonary compliance after such exposure.

c. Calculate an interval that, with a confidence level of 95%, includes at least 95% of the pulmonary compliance values in the population distribution.

57. In Example 6.8, we introduced the concept of a censored experiment in which $n$ components are put on test and the experiment terminates as soon as $r$ of the components have failed. Suppose component lifetimes are independent, each having an exponential distribution with parameter $\lambda$. Let $Y_1$ denote the time at which the first failure occurs, $Y_2$ the time at which the second failure occurs, and so on, so that

$$T_r = Y_1 + \cdots + Y_r + (n - r)Y_r$$

is the total accumulated lifetime at termination. Then it can be shown that $2\lambda T_r$ has a chi-squared distribution with $2r$ df. Use this fact to develop a $100(1 - \alpha)\%$ CI formula for true average lifetime $1/\lambda$. Compute a 95% CI from the data in Example 6.8.

58. Let $X_1, X_2, \ldots, X_n$ be a random sample from a continuous probability distribution having median $\tilde{\mu}$ (so that $P(X_i \le \tilde{\mu}) = P(X_i \ge \tilde{\mu}) = .5$).

a. Show that

$$P(\min(X_i) < \tilde{\mu} < \max(X_i)) = 1 - \left(\frac{1}{2}\right)^{n-1}$$

so that $(\min(x_i), \max(x_i))$ is a $100(1 - \alpha)\%$ confidence interval for $\tilde{\mu}$ with $\alpha = \left(\frac{1}{2}\right)^{n-1}$. [Hint: The complement of the event $\{\min(X_i) < \tilde{\mu} < \max(X_i)\}$ is $\{\max(X_i) \le \tilde{\mu}\} \cup \{\min(X_i) \ge \tilde{\mu}\}$. But $\max(X_i) \le \tilde{\mu}$ iff $X_i \le \tilde{\mu}$ for all $i$.]

b. For each of six normal male infants, the amount of the amino acid alanine (mg/100 mL) was determined while the infants were on an isoleucine-free diet, resulting in the following data:

2.84 3.54 2.80 1.44 2.94 2.70

Compute a 97% CI for the true median amount of alanine for infants on such a diet (“The Essential Amino Acid Requirements of Infants,” Amer. J. of Nutrition, 1964: 322–330).

c. Let $x_{(2)}$ denote the second smallest of the $x_i$’s and $x_{(n-1)}$ denote the second largest of the $x_i$’s. What is the confidence level of the interval $(x_{(2)}, x_{(n-1)})$ for $\tilde{\mu}$?

59. Let $X_1, X_2, \ldots, X_n$ be a random sample from a uniform distribution on the interval $[0, \theta]$, so that

$$f(x) = \begin{cases} \dfrac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$

Then if $Y = \max(X_i)$, it can be shown that the rv $U = Y/\theta$ has density function

$$f_U(u) = \begin{cases} n u^{n-1} & 0 \le u \le 1 \\ 0 & \text{otherwise} \end{cases}$$

a. Use $f_U(u)$ to verify that

$$P\left((\alpha/2)^{1/n} < \frac{Y}{\theta} \le (1 - \alpha/2)^{1/n}\right) = 1 - \alpha$$

and use this to derive a $100(1 - \alpha)\%$ CI for $\theta$.

b. Verify that $P(\alpha^{1/n} \le Y/\theta \le 1) = 1 - \alpha$, and derive a $100(1 - \alpha)\%$ CI for $\theta$ based on this probability statement.


c. Which of the two intervals derived previously is shorter? If my waiting time for a morning bus is uniformly distributed and observed waiting times are $x_1 = 4.2$, $x_2 = 3.5$, $x_3 = 1.7$, $x_4 = 1.2$, and $x_5 = 2.4$, derive a 95% CI for $\theta$ by using the shorter of the two intervals.

60. Let $0 \le \gamma \le \alpha$. Then a $100(1 - \alpha)\%$ CI for $\mu$ when $n$ is large is

$$\left(\bar{x} - z_{\gamma} \cdot \frac{s}{\sqrt{n}},\ \bar{x} + z_{\alpha - \gamma} \cdot \frac{s}{\sqrt{n}}\right)$$

The choice $\gamma = \alpha/2$ yields the usual interval derived in Section 7.2; if $\gamma \ne \alpha/2$, this interval is not symmetric about $\bar{x}$. The width of this interval is $w = s(z_{\gamma} + z_{\alpha - \gamma})/\sqrt{n}$. Show that $w$ is minimized for the choice $\gamma = \alpha/2$, so that the symmetric interval is the shortest. [Hints: (a) By definition of $z_{\alpha}$, $\Phi(z_{\alpha}) = 1 - \alpha$, so that $z_{\alpha} = \Phi^{-1}(1 - \alpha)$; (b) the relationship between the derivative of a function $y = f(x)$ and the inverse function $x = f^{-1}(y)$ is $(d/dy) f^{-1}(y) = 1/f'(x)$.]

61. Suppose $x_1, x_2, \ldots, x_n$ are observed values resulting from a random sample from a symmetric but possibly heavy-tailed distribution. Let $\tilde{x}$ and $f_s$ denote the sample median and fourth spread, respectively. Chapter 11 of Understanding Robust and Exploratory Data Analysis (see the bibliography in Chapter 6) suggests the following robust 95% CI for the population mean (point of symmetry):

$$\tilde{x} \pm \left(\frac{\text{conservative } t \text{ critical value}}{1.075}\right) \cdot \frac{f_s}{\sqrt{n}}$$

The value of the quantity in parentheses is 2.10 for $n = 10$, 1.94 for $n = 20$, and 1.91 for $n = 30$. Compute this CI for the data of Exercise 45, and compare to the t CI appropriate for a normal population distribution.

62. a. Use the results of Example 7.5 to obtain a 95% lower confidence bound for the parameter $\lambda$ of an exponential distribution, and calculate the bound based on the data given in the example.

b. If lifetime $X$ has an exponential distribution, the probability that lifetime exceeds $t$ is $P(X > t) = e^{-\lambda t}$. Use the result of part (a) to obtain a 95% lower confidence bound for the probability that breakdown time exceeds 100 min.

Bibliography

DeGroot, Morris, and Mark Schervish, Probability and Statistics (3rd ed.), Addison-Wesley, Reading, MA, 2002. A very good exposition of the general principles of statistical inference.

Devore, Jay, and Kenneth Berk, Modern Mathematical Statistics with Applications, Cengage, Belmont, CA, 2007. The exposition is a bit more comprehensive and sophisticated than that of the current book, and includes more material on bootstrapping.

Hahn, Gerald, and William Meeker, Statistical Intervals, Wiley, New York, 1991. Almost everything you ever wanted to know about statistical intervals (confidence, prediction, tolerance, and others).


8 Tests of Hypotheses Based on a Single Sample

INTRODUCTION

A parameter can be estimated from sample data either by a single number (a point estimate) or an entire interval of plausible values (a confidence interval). Frequently, however, the objective of an investigation is not to estimate a parameter but to decide which of two contradictory claims about the parameter is correct. Methods for accomplishing this comprise the part of statistical inference called hypothesis testing. In this chapter, we first discuss some of the basic concepts and terminology in hypothesis testing and then develop decision procedures for the most frequently encountered testing problems based on a sample from a single population.

8.1 Hypotheses and Test Procedures

A statistical hypothesis, or just hypothesis, is a claim or assertion either about the value of a single parameter (population characteristic or characteristic of a probability distribution), about the values of several parameters, or about the form of an entire probability distribution. One example of a hypothesis is the claim $\mu = .75$, where $\mu$ is the true average inside diameter of a certain type of PVC pipe. Another example is the statement $p < .10$, where $p$ is the proportion of defective circuit boards among all circuit boards produced by a certain manufacturer. If $\mu_1$ and $\mu_2$ denote the true average breaking strengths of two different types of twine, one hypothesis is the assertion that $\mu_1 - \mu_2 = 0$, and another is the statement $\mu_1 - \mu_2 > 5$. Yet another example of a hypothesis is the assertion that the stopping distance under particular conditions has a normal distribution. Hypotheses of this latter sort will be considered in Chapter 14. In this and the next several chapters, we concentrate on hypotheses about parameters.

In any hypothesis-testing problem, there are two contradictory hypotheses under consideration. One hypothesis might be the claim $\mu = .75$ and the other $\mu \ne .75$, or the two contradictory statements might be $p \ge .10$ and $p < .10$. The objective is to decide, based on sample information, which of the two hypotheses is correct. There is a familiar analogy to this in a criminal trial. One claim is the assertion that the accused individual is innocent. In the U.S. judicial system, this is the claim that is initially believed to be true. Only in the face of strong evidence to the contrary should the jury reject this claim in favor of the alternative assertion that the accused is guilty. In this sense, the claim of innocence is the favored or protected hypothesis, and the burden of proof is placed on those who believe in the alternative claim.

Similarly, in testing statistical hypotheses, the problem will be formulated so that one of the claims is initially favored. This initially favored claim will not be rejected in favor of the alternative claim unless sample evidence contradicts it and provides strong support for the alternative assertion.

DEFINITION

The null hypothesis, denoted by H0, is the claim that is initially assumed to be true (the “prior belief” claim). The alternative hypothesis, denoted by Ha, is the assertion that is contradictory to H0.

The null hypothesis will be rejected in favor of the alternative hypothesis only if sample evidence suggests that H0 is false. If the sample does not strongly contradict H0, we will continue to believe in the plausibility of the null hypothesis. The two possible conclusions from a hypothesis-testing analysis are then reject H0 or fail to reject H0.

A test of hypotheses is a method for using sample data to decide whether the null hypothesis should be rejected. Thus we might test H0: $\mu = .75$ against the alternative Ha: $\mu \ne .75$. Only if sample data strongly suggests that $\mu$ is something other than .75 should the null hypothesis be rejected. In the absence of such evidence, H0 should not be rejected, since it is still quite plausible.

Sometimes an investigator does not want to accept a particular assertion unless and until data can provide strong support for the assertion. As an example, suppose a company is considering putting a new type of coating on bearings that it produces. The true average wear life with the current coating is known to be 1000 hours. With $\mu$ denoting the true average life for the new coating, the company would not want to make a change unless evidence strongly suggested that $\mu$ exceeds 1000. An appropriate problem formulation would involve testing H0: $\mu = 1000$ against Ha: $\mu > 1000$. The conclusion that a change is justified is identified with Ha, and it would take conclusive evidence to justify rejecting H0 and switching to the new coating.

Scientific research often involves trying to decide whether a current theory should be replaced by a more plausible and satisfactory explanation of the phenomenon under investigation. A conservative approach is to identify the current theory with H0 and the researcher’s alternative explanation with Ha. Rejection of the current theory will then occur only when evidence is much more consistent with the new theory. In many situations, Ha is referred to as the “researcher’s hypothesis,” since it is the claim that the researcher would really like to validate. The word null means “of no value, effect, or consequence,” which suggests that H0 should be identified with the hypothesis of no change (from current opinion), no difference, no improvement, and so on. Suppose, for example, that 10% of all circuit boards produced by a certain manufacturer during a recent period were defective. An engineer has suggested a change in the production process in the belief that it will result in a reduced defective rate. Let $p$ denote the true proportion of defective boards resulting from the changed process. Then the research hypothesis, on which the burden of proof is placed, is the assertion that $p < .10$. Thus the alternative hypothesis is Ha: $p < .10$.

In our treatment of hypothesis testing, H0 will generally be stated as an equality claim. If $\theta$ denotes the parameter of interest, the null hypothesis will have the form H0: $\theta = \theta_0$, where $\theta_0$ is a specified number called the null value of the parameter (value claimed for $\theta$ by the null hypothesis). As an example, consider the circuit board situation just discussed. The suggested alternative hypothesis was Ha: $p < .10$, the claim that the defective rate is reduced by the process modification. A natural choice of H0 in this situation is the claim that $p \ge .10$, according to which the new process is either no better or worse than the one currently used. We will instead consider H0: $p = .10$ versus Ha: $p < .10$. The rationale for using this simplified null hypothesis is that any reasonable decision procedure for deciding between H0: $p = .10$ and Ha: $p < .10$ will also be reasonable for deciding between the claim that $p \ge .10$ and Ha. The use of a simplified H0 is preferred because it has certain technical benefits, which will be apparent shortly.

The alternative to the null hypothesis H0: $\theta = \theta_0$ will look like one of the following three assertions:

1. Ha: $\theta > \theta_0$ (in which case the implicit null hypothesis is $\theta \le \theta_0$),

2. Ha: $\theta < \theta_0$ (in which case the implicit null hypothesis is $\theta \ge \theta_0$), or

3. Ha: $\theta \ne \theta_0$

For example, let $\sigma$ denote the standard deviation of the distribution of inside diameters (inches) for a certain type of metal sleeve. If the decision was made to use the sleeve unless sample evidence conclusively demonstrated that $\sigma > .001$, the appropriate hypotheses would be H0: $\sigma = .001$ versus Ha: $\sigma > .001$. The number $\theta_0$ that appears in both H0 and Ha (separates the alternative from the null) is called the null value.

Test Procedures

A test procedure is a rule, based on sample data, for deciding whether to reject H0. A test of H0: $p = .10$ versus Ha: $p < .10$ in the circuit board problem might be based on examining a random sample of $n = 200$ boards. Let $X$ denote the number of defective boards in the sample, a binomial random variable; $x$ represents the observed value of $X$. If H0 is true, $E(X) = np = 200(.10) = 20$, whereas we can expect fewer than 20 defective boards if Ha is true. A value $x$ just a bit below 20 does not strongly contradict H0, so it is reasonable to reject H0 only if $x$ is substantially less than 20. One such test procedure is to reject H0 if $x \le 15$ and not reject H0 otherwise. This procedure has two constituents: (1) a test statistic, or function of the sample data used to make a decision, and (2) a rejection region consisting of those $x$ values for which H0 will be rejected in favor of Ha. For the rule just suggested, the rejection region consists of $x = 0, 1, 2, \ldots,$ and 15. H0 will not be rejected if $x = 16, 17, \ldots, 199,$ or 200.

A test procedure is specified by the following:

1. A test statistic, a function of the sample data on which the decision (reject H0 or do not reject H0) is to be based

2. A rejection region, the set of all test statistic values for which H0 will be rejected

The null hypothesis will then be rejected if and only if the observed or computed test statistic value falls in the rejection region.

As another example, suppose a cigarette manufacturer claims that the average nicotine content $\mu$ of brand B cigarettes is (at most) 1.5 mg. It would be unwise to reject the manufacturer’s claim without strong contradictory evidence, so an appropriate problem formulation is to test H0: $\mu = 1.5$ versus Ha: $\mu > 1.5$. Consider a decision rule based on analyzing a random sample of 32 cigarettes. Let $\bar{X}$ denote the sample average nicotine content. If H0 is true, $E(\bar{X}) = \mu = 1.5$, whereas if H0 is false, we expect $\bar{X}$ to exceed 1.5. Strong evidence against H0 is provided by a value $\bar{x}$ that considerably exceeds 1.5. Thus we might use $\bar{X}$ as a test statistic along with the rejection region $\bar{x} \ge 1.6$.

In both the circuit board and nicotine examples, the choice of test statistic and form of the rejection region make sense intuitively. However, the choice of cutoff value used to specify the rejection region is somewhat arbitrary. Instead of rejecting H0: $p = .10$ in favor of Ha: $p < .10$ when $x \le 15$, we could use the rejection region $x \le 14$. For this region, H0 would not be rejected if 15 defective boards are observed, whereas this occurrence would lead to rejection of H0 if the initially suggested region is employed. Similarly, the rejection region $\bar{x} \ge 1.55$ might be used in the nicotine problem in place of the region $\bar{x} \ge 1.60$.

Errors in Hypothesis Testing

The basis for choosing a particular rejection region lies in consideration of the errors that one might be faced with in drawing a conclusion. Consider the rejection region $x \le 15$ in the circuit board problem. Even when H0: $p = .10$ is true, it might happen that an unusual sample results in $x = 13$, so that H0 is erroneously rejected. On the other hand, even when Ha: $p < .10$ is true, an unusual sample might yield $x = 20$, in which case H0 would not be rejected, again an incorrect conclusion. Thus it is possible that H0 may be rejected when it is true or that H0 may not be rejected when it is false. These possible errors are not consequences of a foolishly chosen rejection region. Either error might result when the region $x \le 14$ is employed, or indeed when any other sensible region is used.

DEFINITION

A type I error consists of rejecting the null hypothesis H0 when it is true. A type II error involves not rejecting H0 when H0 is false.

In the nicotine scenario, a type I error consists of rejecting the manufacturer’s claim that $\mu = 1.5$ when it is actually true. If the rejection region $\bar{x} \ge 1.6$ is employed, it might happen that $\bar{x} = 1.63$ even when $\mu = 1.5$, resulting in a type I error. Alternatively, it may be that H0 is false and yet $\bar{x} = 1.52$ is observed, leading to H0 not being rejected (a type II error).

In the best of all possible worlds, test procedures for which neither type of error is possible could be developed. However, this ideal can be achieved only by basing a decision on an examination of the entire population. The difficulty with using a procedure based on sample data is that because of sampling variability, an unrepresentative sample may result, e.g., a value of $\bar{X}$ that is far from $\mu$ or a value of $\hat{p}$ that differs considerably from $p$.

Instead of demanding error-free procedures, we must seek procedures for which either type of error is unlikely to occur. That is, a good procedure is one for which the probability of making either type of error is small. The choice of a particular rejection region cutoff value fixes the probabilities of type I and type II errors. These error probabilities are traditionally denoted by $\alpha$ and $\beta$, respectively. Because H0 specifies a unique value of the parameter, there is a single value of $\alpha$. However, there is a different value of $\beta$ for each value of the parameter consistent with Ha.

Example 8.1

A certain type of automobile is known to sustain no visible damage 25% of the time in 10-mph crash tests. A modified bumper design has been proposed in an effort to increase this percentage. Let $p$ denote the proportion of all 10-mph crashes with this new bumper that result in no visible damage. The hypotheses to be tested are H0: $p = .25$ (no improvement) versus Ha: $p > .25$. The test will be based on an experiment involving $n = 20$ independent crashes with prototypes of the new design. Intuitively, H0 should be rejected if a substantial number of the crashes show no damage. Consider the following test procedure:

Test statistic: $X$ = the number of crashes with no visible damage

Rejection region: $R_8 = \{8, 9, 10, \ldots, 19, 20\}$; that is, reject H0 if $x \ge 8$, where $x$ is the observed value of the test statistic.

This rejection region is called upper-tailed because it consists only of large values of the test statistic.

When H0 is true, $X$ has a binomial probability distribution with $n = 20$ and $p = .25$. Then

$$\begin{aligned} \alpha &= P(\text{type I error}) = P(H_0 \text{ is rejected when it is true}) \\ &= P(X \ge 8 \text{ when } X \sim \text{Bin}(20, .25)) = 1 - B(7; 20, .25) \\ &= 1 - .898 = .102 \end{aligned}$$

That is, when H0 is actually true, roughly 10% of all experiments consisting of 20 crashes would result in H0 being incorrectly rejected (a type I error).

In contrast to $\alpha$, there is not a single $\beta$. Instead, there is a different $\beta$ for each different $p$ that exceeds .25. Thus there is a value of $\beta$ for $p = .3$ (in which case $X \sim \text{Bin}(20, .3)$), another value of $\beta$ for $p = .5$, and so on. For example,

$$\begin{aligned} \beta(.3) &= P(\text{type II error when } p = .3) \\ &= P(H_0 \text{ is not rejected when it is false because } p = .3) \\ &= P(X \le 7 \text{ when } X \sim \text{Bin}(20, .3)) = B(7; 20, .3) = .772 \end{aligned}$$
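The two binomial calculations above can be checked numerically. The sketch below assumes nothing beyond the Python standard library; the helper `binom_cdf` is a hypothetical stand-in for the cumulative binomial table $B(x; n, p)$, not a function from any particular package.

```python
from math import comb

def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p), summed term by term."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n = 20

# Type I error probability for R8 = {8, ..., 20}: P(X >= 8) when p = .25.
alpha = 1 - binom_cdf(7, n, 0.25)

# Type II error probability at p = .3: P(X <= 7) when p = .3.
beta_03 = binom_cdf(7, n, 0.30)

print(f"alpha    = {alpha:.3f}")
print(f"beta(.3) = {beta_03:.3f}")
```

The printed values should agree with the tabled figures .102 and .772 up to rounding.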

When $p$ is actually .3 rather than .25 (a “small” departure from H0), roughly 77% of all experiments of this type would result in H0 being incorrectly not rejected!

The accompanying table displays $\beta$ for selected values of $p$ (each calculated for the rejection region $R_8$). Clearly, $\beta$ decreases as the value of $p$ moves farther to the right of the null value .25. Intuitively, the greater the departure from H0, the less likely it is that such a departure will not be detected.

$p$          .3     .4     .5     .6     .7     .8
$\beta(p)$   .772   .416   .132   .021   .001   .000

The proposed test procedure is still reasonable for testing the more realistic null hypothesis that $p \le .25$. In this case, there is no longer a single $\alpha$, but instead there is an $\alpha$ for each $p$ that is at most .25: $\alpha(.25)$, $\alpha(.23)$, $\alpha(.20)$, $\alpha(.15)$, and so on. It is easily verified, though, that $\alpha(p) < \alpha(.25) = .102$ if $p < .25$. That is, the largest value of $\alpha$ occurs for the boundary value .25 between H0 and Ha. Thus if $\alpha$ is small for the simplified null hypothesis, it will also be as small as or smaller for the more realistic H0. ■

Example 8.2

The drying time of a certain type of paint under specified test conditions is known to be normally distributed with mean value 75 min and standard deviation 9 min. Chemists have proposed a new additive designed to decrease average drying time. It is believed that drying times with this additive will remain normally distributed with $\sigma = 9$. Because of the expense associated with the additive, evidence should strongly suggest an improvement in average drying time before such a conclusion is adopted. Let $\mu$ denote the true average drying time when the additive is used. The appropriate hypotheses are H0: $\mu = 75$ versus Ha: $\mu < 75$. Only if H0 can be rejected will the additive be declared successful and then be used.

Experimental data is to consist of drying times from $n = 25$ test specimens. Let $X_1, \ldots, X_{25}$ denote the 25 drying times, a random sample of size 25 from a normal distribution with mean value $\mu$ and standard deviation $\sigma = 9$. The sample mean drying time $\bar{X}$ then has a normal distribution with expected value $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 9/\sqrt{25} = 1.80$. When H0 is true, $\mu_{\bar{X}} = 75$, so only an $\bar{x}$ value substantially less than 75 would strongly contradict H0. A reasonable rejection region has the form $\bar{x} \le c$, where the cutoff value $c$ is suitably chosen. Consider the choice $c = 70.8$, so that the test procedure consists of test statistic $\bar{X}$ and rejection region $\bar{x} \le 70.8$. Because the rejection region consists only of small values of the test statistic, the test is said to be lower-tailed. Calculation of $\alpha$ and $\beta$ now involves a routine standardization of $\bar{X}$ followed by reference to the standard normal probabilities of Appendix Table A.3:

$$\begin{aligned} \alpha &= P(\text{type I error}) = P(H_0 \text{ is rejected when it is true}) \\ &= P(\bar{X} \le 70.8 \text{ when } \bar{X} \sim \text{normal with } \mu_{\bar{X}} = 75,\ \sigma_{\bar{X}} = 1.8) \\ &= \Phi\left(\frac{70.8 - 75}{1.8}\right) = \Phi(-2.33) = .01 \end{aligned}$$

$$\begin{aligned} \beta(72) &= P(\text{type II error when } \mu = 72) \\ &= P(H_0 \text{ is not rejected when it is false because } \mu = 72) \\ &= P(\bar{X} > 70.8 \text{ when } \bar{X} \sim \text{normal with } \mu_{\bar{X}} = 72 \text{ and } \sigma_{\bar{X}} = 1.8) \\ &= 1 - \Phi\left(\frac{70.8 - 72}{1.8}\right) = 1 - \Phi(-.67) = 1 - .2514 = .7486 \end{aligned}$$

$$\beta(70) = 1 - \Phi\left(\frac{70.8 - 70}{1.8}\right) = .3300 \qquad \beta(67) = .0174$$
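The standardizations above can be reproduced with the standard library's `statistics.NormalDist`. This is a sketch for checking the arithmetic, not part of the original example; the function name `prob_reject` is hypothetical. Because the text rounds each z value to two decimals before using the table, the computed values can differ from the tabled ones in the third decimal place.

```python
from statistics import NormalDist

sigma, n, cutoff = 9.0, 25, 70.8
se = sigma / n ** 0.5          # standard deviation of the sample mean, 1.8

def prob_reject(mu):
    """P(sample mean <= cutoff) when the true mean drying time is mu."""
    return NormalDist(mu, se).cdf(cutoff)

# Type I error probability: rejecting H0 when mu = 75.
alpha = prob_reject(75)

# Type II error probabilities: not rejecting H0 for several mu < 75.
beta = {mu: 1 - prob_reject(mu) for mu in (72, 70, 67)}

print(f"alpha = {alpha:.4f}")
for mu, b in beta.items():
    print(f"beta({mu}) = {b:.4f}")
```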

For the specified test procedure, only 1% of all experiments carried out as described will result in H0 being rejected when it is actually true. However, the chance of a type II error is very large when $\mu = 72$ (only a small departure from H0), somewhat less when $\mu = 70$, and quite small when $\mu = 67$ (a very substantial departure from H0). These error probabilities are illustrated in Figure 8.1. Notice that $\alpha$ is computed using the probability distribution of the test statistic when H0 is true, whereas determination of $\beta$ requires knowing the test statistic’s distribution when H0 is false.

Figure 8.1 $\alpha$ and $\beta$ illustrated for Example 8.2: (a) the distribution of $\bar{X}$ when $\mu = 75$ (H0 true), shaded area = $\alpha$ = .01; (b) the distribution of $\bar{X}$ when $\mu = 72$ (H0 false), shaded area = $\beta(72)$; (c) the distribution of $\bar{X}$ when $\mu = 70$ (H0 false), shaded area = $\beta(70)$. The cutoff 70.8 is marked in each panel.

As in Example 8.1, if the more realistic null hypothesis $\mu \ge 75$ is considered, there is an $\alpha$ for each parameter value for which H0 is true: $\alpha(75)$, $\alpha(75.8)$, $\alpha(76.5)$, and so on. It is easily verified, though, that $\alpha(75)$ is the largest of all these type I error probabilities. Focusing on the boundary value amounts to working explicitly with the “worst case.” ■

The specification of a cutoff value for the rejection region in the examples just considered was somewhat arbitrary. Use of $R_8 = \{8, 9, \ldots, 20\}$ in Example 8.1 gave $\alpha = .102$, $\beta(.3) = .772$, and $\beta(.5) = .132$. Many would think these error probabilities intolerably large. Perhaps they can be decreased by changing the cutoff value.

Example 8.3 (Example 8.1 continued)

Let us use the same experiment and test statistic $X$ as previously described in the automobile bumper problem but now consider the rejection region $R_9 = \{9, 10, \ldots, 20\}$. Since $X$ still has a binomial distribution with parameters $n = 20$ and $p$,

$$\begin{aligned} \alpha &= P(H_0 \text{ is rejected when } p = .25) \\ &= P(X \ge 9 \text{ when } X \sim \text{Bin}(20, .25)) = 1 - B(8; 20, .25) = .041 \end{aligned}$$

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

8.1 Hypotheses and Test Procedures 307

PROPOSITION

Example 8.4 (Example 8.2 continued)

The type I error probability has been decreased by using the new rejection region. However, a price has been paid for this decrease:

Both these b’s are larger than the corresponding error probabilities .772 and .132 for the region R8. In retrospect, this is not surprising; a is computed by summing over probabilities of test statistic values in the rejection region, whereas b is the proba- bility that X falls in the complement of the rejection region. Making the rejection region smaller must therefore decrease a while increasing b for any . ■

The use of cutoff value in the paint-drying example resulted in a very small value of a (.01) but rather large b’s. Consider the same experiment and test statistic

with the new rejection region . Because is still normally distributed with mean value and ,

The change in cutoff value has made the rejection region larger (it includes more values), resulting in a decrease in b for each fixed m less than 75. However, a for this new region has increased from the previous value .01 to approximately .05. If a type I error probability this large can be tolerated, though, the second region is preferable to the first because of the smaller b’s. ■

The results of these examples can be generalized in the following manner.

(c 5 70.8) (c 5 72)

x

b(70) 5 1 2 �a 72 2 70 1.8

b 5 .1335 b(67) 5 .0027 5 1 2 �a 72 2 72

1.8 b 5 1 2 �(0) 5 .5

5 P(X . 72 when X is a normal rv with mean 72 and standard deviation 1.8)

b(72) 5 P(H0 is not rejected when m 5 72)

5 �a 72 2 75 1.8

b 5 �(21.67) 5 .0475 < .05 5 P(X # 72 when X , N(75, 1.82))

a 5 P(H0 is rejected when it is true)

sX 5 1.8mX 5 m Xx # 72X

c 5 70.8

p . .25

b(.5) 5 B(8; 20, .5) 5 .252

5 P(X # 8 when X , Bin(20, .3)) 5 B(8; 20, .3) 5 .887 b(.3) 5 P(H0 is not rejected when p 5 .3)

Suppose an experiment and a sample size are fixed and a test statistic is chosen. Then decreasing the size of the rejection region to obtain a smaller value of a results in a larger value of b for any particular parameter value consistent with Ha.
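The trade-off stated in the proposition can be checked numerically. The following Python sketch recomputes the error probabilities of Examples 8.3 and 8.4 using only the standard library. The function names (`binom_cdf`, `binomial_errors`, `normal_errors`) are illustrative and do not come from the text, and the normal-curve results differ slightly in the fourth decimal from the values above because the text rounds z to two decimals before using Table A.3.

```python
from math import comb
from statistics import NormalDist


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def binomial_errors(cutoff, n=20, p0=0.25, alternatives=(0.3, 0.5)):
    """Alpha and beta(p) for the upper-tailed region {cutoff, ..., n}."""
    alpha = 1 - binom_cdf(cutoff - 1, n, p0)          # P(X >= cutoff | p0)
    betas = {p: binom_cdf(cutoff - 1, n, p) for p in alternatives}
    return alpha, betas


def normal_errors(c, mu0=75.0, se=1.8, alternatives=(72.0, 70.0, 67.0)):
    """Alpha and beta(mu) for the lower-tailed region xbar <= c."""
    alpha = NormalDist(mu0, se).cdf(c)                # P(Xbar <= c | mu0)
    betas = {mu: 1 - NormalDist(mu, se).cdf(c) for mu in alternatives}
    return alpha, betas


# Binomial regions R8 and R9 (Example 8.3)
for cutoff in (8, 9):
    alpha, betas = binomial_errors(cutoff)
    print(f"R{cutoff}: alpha={alpha:.3f}",
          {p: round(b, 3) for p, b in betas.items()})

# Normal cutoffs c = 70.8 and c = 72 (Example 8.4)
for c in (70.8, 72.0):
    alpha, betas = normal_errors(c)
    print(f"c={c}: alpha={alpha:.4f}",
          {mu: round(b, 4) for mu, b in betas.items()})
```

Shrinking the binomial region from $R_8$ to $R_9$ lowers $\alpha$ and raises every $\beta$; enlarging the normal region from $c = 70.8$ to $c = 72$ does the opposite.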

This proposition says that once the test statistic and $n$ are fixed, there is no rejection region that will simultaneously make both $\alpha$ and all $\beta$’s small. A region must be chosen to effect a compromise between $\alpha$ and $\beta$.

Because of the suggested guidelines for specifying $H_0$ and $H_a$, a type I error is usually more serious than a type II error (this can always be achieved by proper choice of the hypotheses). The approach adhered to by most statistical practitioners is then to specify the largest value of $\alpha$ that can be tolerated and find a rejection region having that value of $\alpha$ rather than anything smaller. This makes $\beta$ as small as possible subject to the bound on $\alpha$. The resulting value of $\alpha$ is often referred to as the significance level of the test. Traditional levels of significance are .10, .05, and .01, though the level in any particular problem will depend on the seriousness of a type I error: the more serious this error, the smaller should be the significance level. The corresponding test procedure is called a level $\alpha$ test (e.g., a level .05 test or a level .01 test). A test with significance level $\alpha$ is one for which the type I error probability is controlled at the specified level.

Example 8.5

Again let $\mu$ denote the true average nicotine content of brand B cigarettes. The objective is to test $H_0: \mu = 1.5$ versus $H_a: \mu > 1.5$ based on a random sample $X_1, X_2, \ldots, X_{32}$ of nicotine content. Suppose the distribution of nicotine content is known to be normal with $\sigma = .20$. Then $\bar{X}$ is normally distributed with mean value $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = .20/\sqrt{32} = .0354$. Rather than use $\bar{X}$ itself as the test statistic, let’s standardize $\bar{X}$, assuming that $H_0$ is true.

$$\text{Test statistic: } Z = \frac{\bar{X} - 1.5}{\sigma/\sqrt{n}} = \frac{\bar{X} - 1.5}{.0354}$$

$Z$ expresses the distance between $\bar{X}$ and its expected value when $H_0$ is true as some number of standard deviations. For example, $z = 3$ results from an $\bar{x}$ that is 3 standard deviations larger than we would have expected it to be were $H_0$ true.

Rejecting $H_0$ when $\bar{x}$ “considerably” exceeds 1.5 is equivalent to rejecting $H_0$ when $z$ “considerably” exceeds 0. That is, the form of the rejection region is $z \ge c$. Let’s now determine $c$ so that $\alpha = .05$. When $H_0$ is true, $Z$ has a standard normal distribution. Thus

$$\alpha = P(\text{type I error}) = P(\text{rejecting } H_0 \text{ when } H_0 \text{ is true}) = P(Z \ge c \text{ when } Z \sim N(0, 1))$$

The value $c$ must capture upper-tail area .05 under the $z$ curve. Either from Section 4.3 or directly from Appendix Table A.3, $c = z_{.05} = 1.645$.

Notice that $z \ge 1.645$ is equivalent to $\bar{x} - 1.5 \ge (.0354)(1.645)$, that is, $\bar{x} \ge 1.56$. Then $\beta$ involves the probability that $\bar{X} < 1.56$ and can be calculated for any $\mu$ greater than 1.5. ■
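The arithmetic of Example 8.5 can be reproduced in a few lines. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names and the helper `beta` are illustrative assumptions, not notation from the text.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 1.5, 0.20, 32, 0.05

se = sigma / sqrt(n)                    # standard deviation of Xbar, about .0354
c = NormalDist().inv_cdf(1 - alpha)     # z critical value z_.05, about 1.645
xbar_cutoff = mu0 + c * se              # z >= c  <=>  xbar >= mu0 + c * se


def beta(mu_alt):
    """P(type II error) = P(Xbar < cutoff) when the true mean is mu_alt."""
    return NormalDist(mu_alt, se).cdf(xbar_cutoff)


print(f"se = {se:.4f}, c = {c:.3f}, cutoff on xbar = {xbar_cutoff:.2f}")
for mu_alt in (1.52, 1.56, 1.60):       # hypothetical alternative values
    print(f"beta({mu_alt}) = {beta(mu_alt):.4f}")
```

The alternative values 1.52, 1.56, and 1.60 are chosen only for illustration; any $\mu$ greater than 1.5 can be used.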

EXERCISES Section 8.1 (1–14)

1. For each of the following assertions, state whether it is a legitimate statistical hypothesis and why:
a. $H: \sigma > 100$
b. $H: \tilde{x} = 45$
c. $H: s \le .20$
d. $H: \sigma_1/\sigma_2 < 1$
e. $H: \bar{X} - \bar{Y} = 5$
f. $H: \lambda \le .01$, where $\lambda$ is the parameter of an exponential distribution used to model component lifetime

2. For the following pairs of assertions, indicate which do not comply with our rules for setting up hypotheses and why (the subscripts 1 and 2 differentiate between quantities for two different populations or samples):
a. $H_0: \mu = 100$, $H_a: \mu > 100$
b. $H_0: \sigma = 20$, $H_a: \sigma \le 20$
c. $H_0: p \ne .25$, $H_a: p = .25$
d. $H_0: \mu_1 - \mu_2 = 25$, $H_a: \mu_1 - \mu_2 > 100$
e. $H_0: S_1^2 = S_2^2$, $H_a: S_1^2 \ne S_2^2$
f. $H_0: \mu = 120$, $H_a: \mu = 150$
g. $H_0: \sigma_1/\sigma_2 = 1$, $H_a: \sigma_1/\sigma_2 \ne 1$
h. $H_0: p_1 - p_2 = -.1$, $H_a: p_1 - p_2 < -.1$

3. To determine whether the pipe welds in a nuclear power plant meet specifications, a random sample of welds is selected, and tests are conducted on each weld in the sample. Weld strength is measured as the force required to break the weld. Suppose the specifications state that mean strength of welds should exceed 100 lb/in²; the inspection team decides to test $H_0: \mu = 100$ versus $H_a: \mu > 100$. Explain why it might be preferable to use this $H_a$ rather than $\mu < 100$.

4. Let $\mu$ denote the true average radioactivity level (picocuries per liter). The value 5 pCi/L is considered the dividing line between safe and unsafe water. Would you recommend testing $H_0: \mu = 5$ versus $H_a: \mu > 5$ or $H_0: \mu = 5$ versus $H_a: \mu < 5$? Explain your reasoning. [Hint: Think about the consequences of a type I and type II error for each possibility.]

5. Before agreeing to purchase a large order of polyethylene sheaths for a particular type of high-pressure oil-filled submarine power cable, a company wants to see conclusive evidence that the true standard deviation of sheath thickness is less than .05 mm. What hypotheses should be tested, and why? In this context, what are the type I and type II errors?

6. Many older homes have electrical systems that use fuses rather than circuit breakers. A manufacturer of 40-amp fuses wants to make sure that the mean amperage at which its fuses burn out is in fact 40. If the mean amperage is lower than 40, customers will complain because the fuses require replacement too often. If the mean amperage is higher than 40, the manufacturer might be liable for damage to an electrical system due to fuse malfunction. To verify the amperage of the fuses, a sample of fuses is to be selected and inspected. If a hypothesis test were to be performed on the resulting data, what null and alternative hypotheses would be of interest to the manufacturer? Describe type I and type II errors in the context of this problem situation.

7. Water samples are taken from water used for cooling as it is being discharged from a power plant into a river. It has been determined that as long as the mean temperature of the discharged water is at most 150°F, there will be no negative effects on the river’s ecosystem. To investigate whether the plant is in compliance with regulations that prohibit a mean discharge water temperature above 150°, 50 water samples will be taken at randomly selected times and the temperature of each sample recorded. The resulting data will be used to test the hypotheses $H_0: \mu = 150°$ versus $H_a: \mu > 150°$. In the context of this situation, describe type I and type II errors. Which type of error would you consider more serious? Explain.

8. A regular type of laminate is currently being used by a manufacturer of circuit boards. A special laminate has been developed to reduce warpage. The regular laminate will be used on one sample of specimens and the special laminate on another sample, and the amount of warpage will then be determined for each specimen. The manufacturer will then switch to the special laminate only if it can be demonstrated that the true average amount of warpage for that laminate is less than for the regular laminate. State the relevant hypotheses, and describe the type I and type II errors in the context of this situation.

9. Two different companies have applied to provide cable television service in a certain region. Let $p$ denote the proportion of all potential subscribers who favor the first company over the second. Consider testing $H_0: p = .5$ versus $H_a: p \ne .5$ based on a random sample of 25 individuals. Let $X$ denote the number in the sample who favor the first company and $x$ represent the observed value of $X$.
a. Which of the following rejection regions is most appropriate and why? $R_1 = \{x: x \le 7 \text{ or } x \ge 18\}$, $R_2 = \{x: x \le 8\}$, $R_3 = \{x: x \ge 17\}$
b. In the context of this problem situation, describe what the type I and type II errors are.
c. What is the probability distribution of the test statistic $X$ when $H_0$ is true? Use it to compute the probability of a type I error.
d. Compute the probability of a type II error for the selected region when $p = .3$, again when $p = .4$, and also for both $p = .6$ and $p = .7$.
e. Using the selected region, what would you conclude if 6 of the 25 queried favored company 1?

10. A mixture of pulverized fuel ash and Portland cement to be used for grouting should have a compressive strength of more than 1300 KN/m². The mixture will not be used unless experimental evidence indicates conclusively that the strength specification has been met. Suppose compressive strength for specimens of this mixture is normally distributed with $\sigma = 60$. Let $\mu$ denote the true average compressive strength.
a. What are the appropriate null and alternative hypotheses?
b. Let $\bar{X}$ denote the sample average compressive strength for $n = 10$ randomly selected specimens. Consider the test procedure with test statistic $\bar{X}$ and rejection region $\bar{x} \ge 1331.26$. What is the probability distribution of the test statistic when $H_0$ is true? What is the probability of a type I error for the test procedure?
c. What is the probability distribution of the test statistic when $\mu = 1350$? Using the test procedure of part (b), what is the probability that the mixture will be judged unsatisfactory when in fact $\mu = 1350$ (a type II error)?
d. How would you change the test procedure of part (b) to obtain a test with significance level .05? What impact would this change have on the error probability of part (c)?
e. Consider the standardized test statistic $Z = (\bar{X} - 1300)/(\sigma/\sqrt{n}) = (\bar{X} - 1300)/13.42$. What are the values of $Z$ corresponding to the rejection region of part (b)?

11. The calibration of a scale is to be checked by weighing a 10-kg test specimen 25 times. Suppose that the results of different weighings are independent of one another and that the weight on each trial is normally distributed with $\sigma = .200$ kg. Let $\mu$ denote the true average weight reading on the scale.
a. What hypotheses should be tested?
b. Suppose the scale is to be recalibrated if either $\bar{x} \ge 10.1032$ or $\bar{x} \le 9.8968$. What is the probability that recalibration is carried out when it is actually unnecessary?
c. What is the probability that recalibration is judged unnecessary when in fact $\mu = 10.1$? When $\mu = 9.8$?
d. Let $z = (\bar{x} - 10)/(\sigma/\sqrt{n})$. For what value $c$ is the rejection region of part (b) equivalent to the “two-tailed” region of either $z \ge c$ or $z \le -c$?
e. If the sample size were only 10 rather than 25, how should the procedure of part (d) be altered so that $\alpha = .05$?
f. Using the test of part (e), what would you conclude from the following sample data?

9.981 10.006 9.857 10.107 9.888 9.728 10.439 10.214 10.190 9.793

g. Reexpress the test procedure of part (b) in terms of the standardized test statistic $Z = (\bar{X} - 10)/(\sigma/\sqrt{n})$.

12. A new design for the braking system on a certain type of car has been proposed. For the current system, the true average braking distance at 40 mph under specified conditions is known to be 120 ft. It is proposed that the new design be implemented only if sample data strongly indicates a reduction in true average braking distance for the new design.
a. Define the parameter of interest and state the relevant hypotheses.
b. Suppose braking distance for the new system is normally distributed with $\sigma = 10$. Let $\bar{X}$ denote the sample average braking distance for a random sample of 36 observations. Which of the following three rejection regions is appropriate: $R_1 = \{\bar{x}: \bar{x} \ge 124.80\}$, $R_2 = \{\bar{x}: \bar{x} \le 115.20\}$, $R_3 = \{\bar{x}: \text{either } \bar{x} \ge 125.13 \text{ or } \bar{x} \le 114.87\}$?
c. What is the significance level for the appropriate region of part (b)? How would you change the region to obtain a test with $\alpha = .001$?
d. What is the probability that the new design is not implemented when its true average braking distance is actually 115 ft and the appropriate region from part (b) is used?
e. Let $Z = (\bar{X} - 120)/(\sigma/\sqrt{n})$. What is the significance level for the rejection region $\{z: z \le -2.33\}$? For the region $\{z: z \le -2.88\}$?

13. Let $X_1, \ldots, X_n$ denote a random sample from a normal population distribution with a known value of $\sigma$.
a. For testing the hypotheses $H_0: \mu = \mu_0$ versus $H_a: \mu > \mu_0$ (where $\mu_0$ is a fixed number), show that the test with test statistic $\bar{X}$ and rejection region $\bar{x} \ge \mu_0 + 2.33\sigma/\sqrt{n}$ has significance level .01.
b. Suppose the procedure of part (a) is used to test $H_0: \mu \le \mu_0$ versus $H_a: \mu > \mu_0$. If $\mu_0 = 100$, $n = 25$, and $\sigma = 5$, what is the probability of committing a type I error when $\mu = 99$? When $\mu = 98$? In general, what can be said about the probability of a type I error when the actual value of $\mu$ is less than $\mu_0$? Verify your assertion.

14. Reconsider the situation of Exercise 11 and suppose the rejection region is $\{\bar{x}: \bar{x} \ge 10.1004 \text{ or } \bar{x} \le 9.8940\} = \{z: z \ge 2.51 \text{ or } z \le -2.65\}$.
a. What is $\alpha$ for this procedure?
b. What is $\beta$ when $\mu = 10.1$? When $\mu = 9.9$? Is this desirable?

8.2 Tests About a Population Mean

The general discussion in Chapter 7 of confidence intervals for a population mean $\mu$ focused on three different cases. We now develop test procedures for these cases.

Case I: A Normal Population with Known $\sigma$

Although the assumption that the value of $\sigma$ is known is rarely met in practice, this case provides a good starting point because of the ease with which general procedures and their properties can be developed. The null hypothesis in all three cases will state that $\mu$ has a particular numerical value, the null value, which we will denote by $\mu_0$. Let $X_1, \ldots, X_n$ represent a random sample of size $n$ from the normal population. Then the sample mean $\bar{X}$ has a normal distribution with expected value $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$. When $H_0$ is true, $\mu_{\bar{X}} = \mu_0$. Consider now the statistic $Z$ obtained by standardizing $\bar{X}$ under the assumption that $H_0$ is true:

$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$$

Substitution of the computed sample mean $\bar{x}$ gives $z$, the distance between $\bar{x}$ and $\mu_0$ expressed in “standard deviation units.” For example, if the null hypothesis is $H_0: \mu = 100$, $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 10/\sqrt{25} = 2.0$, and $\bar{x} = 103$, then the test statistic value is $z = (103 - 100)/2.0 = 1.5$. That is, the observed value of $\bar{x}$ is 1.5 standard deviations (of $\bar{X}$) larger than what we expect it to be when $H_0$ is true. The statistic $Z$ is a natural measure of the distance between $\bar{X}$, the estimator of $\mu$, and its expected value when $H_0$ is true. If this distance is too great in a direction consistent with $H_a$, the null hypothesis should be rejected.

Suppose first that the alternative hypothesis has the form $H_a: \mu > \mu_0$. Then an $\bar{x}$ value less than $\mu_0$ certainly does not provide support for $H_a$. Such an $\bar{x}$ corresponds to a negative value of $z$ (since $\bar{x} - \mu_0$ is negative and the divisor $\sigma/\sqrt{n}$ is positive). Similarly, an $\bar{x}$ value that exceeds $\mu_0$ by only a small amount (corresponding to $z$, which is positive but small) does not suggest that $H_0$ should be rejected in favor of $H_a$. The rejection of $H_0$ is appropriate only when $\bar{x}$ considerably exceeds $\mu_0$, that is, when the $z$ value is positive and large. In summary, the appropriate rejection region, based on the test statistic $Z$ rather than $\bar{X}$, has the form $z \ge c$.

As discussed in Section 8.1, the cutoff value $c$ should be chosen to control the probability of a type I error at the desired level $\alpha$. This is easily accomplished because the distribution of the test statistic $Z$ when $H_0$ is true is the standard normal distribution (that’s why $\mu_0$ was subtracted in standardizing). The required cutoff $c$ is the $z$ critical value that captures upper-tail area $\alpha$ under the $z$ curve. As an example, let $c = 1.645$, the value that captures tail area .05 ($z_{.05} = 1.645$). Then,

$$\alpha = P(\text{type I error}) = P(H_0 \text{ is rejected when } H_0 \text{ is true}) = P(Z \ge 1.645 \text{ when } Z \sim N(0, 1)) = 1 - \Phi(1.645) = .05$$

More generally, the rejection region $z \ge z_\alpha$ has type I error probability $\alpha$. The test procedure is upper-tailed because the rejection region consists only of large values of the test statistic.

Analogous reasoning for the alternative hypothesis $H_a: \mu < \mu_0$ suggests a rejection region of the form $z \le c$, where $c$ is a suitably chosen negative number ($\bar{x}$ is far below $\mu_0$ if and only if $z$ is quite negative). Because $Z$ has a standard normal distribution when $H_0$ is true, taking $c = -z_\alpha$ yields $P(\text{type I error}) = \alpha$. This is a lower-tailed test. For example, $z_{.10} = 1.28$ implies that the rejection region $z \le -1.28$ specifies a test with significance level .10.

Finally, when the alternative hypothesis is $H_a: \mu \ne \mu_0$, $H_0$ should be rejected if $\bar{x}$ is too far to either side of $\mu_0$. This is equivalent to rejecting $H_0$ either if $z \ge c$ or if $z \le -c$. Suppose we desire $\alpha = .05$. Then,

$$.05 = P(Z \ge c \text{ or } Z \le -c \text{ when } Z \text{ has a standard normal distribution}) = \Phi(-c) + 1 - \Phi(c) = 2[1 - \Phi(c)]$$

Thus $c$ is such that $1 - \Phi(c)$, the area under the $z$ curve to the right of $c$, is .025 (and not .05!). From Section 4.3 or Appendix Table A.3, $c = 1.96$, and the rejection region is $z \ge 1.96$ or $z \le -1.96$. For any $\alpha$, the two-tailed rejection region $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ has type I error probability $\alpha$ (since area $\alpha/2$ is captured under each of the two tails of the $z$ curve). Again, the key reason for using the standardized test statistic $Z$ is that because $Z$ has a known distribution when $H_0$ is true (standard normal), a rejection region with desired type I error probability is easily obtained by using an appropriate critical value.

The test procedure for case I is summarized in the accompanying box, and the corresponding rejection regions are illustrated in Figure 8.2.

Null hypothesis: $H_0: \mu = \mu_0$

Test statistic value: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Alternative Hypothesis        Rejection Region for Level $\alpha$ Test
$H_a: \mu > \mu_0$            $z \ge z_\alpha$ (upper-tailed test)
$H_a: \mu < \mu_0$            $z \le -z_\alpha$ (lower-tailed test)
$H_a: \mu \ne \mu_0$          either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed test)
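The case I procedure can be written as one small function. The sketch below is an illustration only: the names `z_critical` and `z_test` and the string labels for the three alternatives are assumptions made here, not notation from the text. It uses the illustration given earlier in this section ($\mu_0 = 100$, $\sigma = 10$, $n = 25$, $\bar{x} = 103$).

```python
from math import sqrt
from statistics import NormalDist


def z_critical(tail_area):
    """z critical value capturing the given upper-tail area under the z curve."""
    return NormalDist().inv_cdf(1 - tail_area)


def z_test(xbar, mu0, sigma, n, alpha, alternative):
    """Case I z test (normal population, known sigma).

    alternative is "greater" (Ha: mu > mu0), "less" (Ha: mu < mu0),
    or "two-sided" (Ha: mu != mu0). Returns (z, reject_H0).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    if alternative == "greater":
        reject = z >= z_critical(alpha)
    elif alternative == "less":
        reject = z <= -z_critical(alpha)
    elif alternative == "two-sided":
        reject = abs(z) >= z_critical(alpha / 2)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, reject


# Illustration from the text: H0: mu = 100, sigma/sqrt(n) = 10/sqrt(25) = 2.0
z, reject = z_test(xbar=103, mu0=100, sigma=10, n=25,
                   alpha=0.05, alternative="greater")
print(f"z = {z:.2f}, reject H0 at level .05: {reject}")
```

With these inputs the statistic is 1.5, which is below $z_{.05} = 1.645$, so the upper-tailed level .05 test does not reject $H_0$.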

[Figure 8.2: three panels, each showing the z curve (probability distribution of test statistic Z when H0 is true). (a) Rejection region $z \ge z_\alpha$; shaded area = $\alpha$ = P(type I error). (b) Rejection region $z \le -z_\alpha$; shaded area = $\alpha$ = P(type I error). (c) Rejection region either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$; shaded area $\alpha/2$ in each tail, total shaded area = $\alpha$ = P(type I error).]

Figure 8.2 Rejection regions for z tests: (a) upper-tailed test; (b) lower-tailed test; (c) two-tailed test

Use of the following sequence of steps is recommended when testing hypotheses about a parameter.

1. Identify the parameter of interest and describe it in the context of the problem situation.

2. Determine the null value and state the null hypothesis.

3. State the appropriate alternative hypothesis.

4. Give the formula for the computed value of the test statistic (substituting the null value and the known values of any other parameters, but not those of any sample-based quantities).

5. State the rejection region for the selected significance level $\alpha$.

6. Compute any necessary sample quantities, substitute into the formula for the test statistic value, and compute that value.

7. Decide whether $H_0$ should be rejected, and state this conclusion in the problem context.

The formulation of hypotheses (Steps 2 and 3) should be done before examining the data.

Example 8.6

A manufacturer of sprinkler systems used for fire protection in office buildings claims that the true average system-activation temperature is 130°F. A sample of $n = 9$ systems, when tested, yields a sample average activation temperature of 131.08°F. If the distribution of activation temperatures is normal with standard deviation 1.5°F, does the data contradict the manufacturer’s claim at significance level $\alpha = .01$?

1. Parameter of interest: $\mu$ = true average activation temperature.

2. Null hypothesis: $H_0: \mu = 130$ (null value $= \mu_0 = 130$).

3. Alternative hypothesis: $H_a: \mu \ne 130$ (a departure from the claimed value in either direction is of concern).

4. Test statistic value:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{\bar{x} - 130}{1.5/\sqrt{n}}$$

5. Rejection region: The form of $H_a$ implies use of a two-tailed test with rejection region either $z \ge z_{.005}$ or $z \le -z_{.005}$. From Section 4.3 or Appendix Table A.3, $z_{.005} = 2.58$, so we reject $H_0$ if either $z \ge 2.58$ or $z \le -2.58$.

6. Substituting $n = 9$ and $\bar{x} = 131.08$,

$$z = \frac{131.08 - 130}{1.5/\sqrt{9}} = \frac{1.08}{.5} = 2.16$$

That is, the observed sample mean is a bit more than 2 standard deviations above what would have been expected were $H_0$ true.

7. The computed value $z = 2.16$ does not fall in the rejection region ($-2.58 < 2.16 < 2.58$), so $H_0$ cannot be rejected at significance level .01. The data does not give strong support to the claim that the true average differs from the design value of 130. ■

$\beta$ and Sample Size Determination

The z tests for case I are among the few in statistics for which there are simple formulas available for $\beta$, the probability of a type II error. Consider first the upper-tailed test with rejection region $z \ge z_\alpha$. This is equivalent to $\bar{x} \ge \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$, so $H_0$ will not be rejected if $\bar{x} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$. Now let $\mu'$ denote a particular value of $\mu$ that exceeds the null value $\mu_0$. Then,

$$\begin{aligned}
\beta(\mu') &= P(H_0 \text{ is not rejected when } \mu = \mu') \\
&= P(\bar{X} < \mu_0 + z_\alpha \cdot \sigma/\sqrt{n} \text{ when } \mu = \mu') \\
&= P\left(\frac{\bar{X} - \mu'}{\sigma/\sqrt{n}} < z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}} \text{ when } \mu = \mu'\right) \\
&= \Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)
\end{aligned}$$

As $\mu'$ increases, $\mu_0 - \mu'$ becomes more negative, so $\beta(\mu')$ will be small when $\mu'$ greatly exceeds $\mu_0$ (because the value at which $\Phi$ is evaluated will then be quite negative). Error probabilities for the lower-tailed and two-tailed tests are derived in an analogous manner.

If $\sigma$ is large, the probability of a type II error can be large at an alternative value $\mu'$ that is of particular concern to an investigator. Suppose we fix $\alpha$ and also specify $\beta$ for such an alternative value. In the sprinkler example, company officials might view $\mu' = 132$ as a very substantial departure from $H_0: \mu = 130$ and therefore wish $\beta(132) = .10$ in addition to $\alpha = .01$. More generally, consider the two restrictions $P(\text{type I error}) = \alpha$ and $\beta(\mu') = \beta$ for specified $\alpha$, $\mu'$, and $\beta$. Then for an upper-tailed test, the sample size $n$ should be chosen to satisfy

$$\Phi\left(z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) = \beta$$

This implies that

$$-z_\beta = \left(\begin{array}{c} z \text{ critical value that} \\ \text{captures lower-tail area } \beta \end{array}\right) = z_\alpha + \frac{\mu_0 - \mu'}{\sigma/\sqrt{n}}$$

It is easy to solve this equation for the desired $n$. A parallel argument yields the necessary sample size for lower- and two-tailed tests as summarized in the next box.
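The upper-tailed formula for $\beta(\mu')$ derived above, and the sample-size equation it leads to, can be sketched in Python as follows. The function names are illustrative assumptions, and the numbers are those of the tire tread-life example (Example 8.7). Because the code uses unrounded critical values (for instance $z_{.01} \approx 2.326$ rather than the tabled 2.33), intermediate results differ slightly from hand calculations.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_crit(tail_area):
    """z critical value capturing the given upper-tail area."""
    return STD_NORMAL.inv_cdf(1 - tail_area)


def beta_upper(mu_alt, mu0, sigma, n, alpha):
    """beta(mu') = Phi(z_alpha + (mu0 - mu') / (sigma / sqrt(n))), for Ha: mu > mu0."""
    return STD_NORMAL.cdf(z_crit(alpha) + (mu0 - mu_alt) / (sigma / sqrt(n)))


def sample_size_one_tailed(mu_alt, mu0, sigma, alpha, beta):
    """Smallest integer n with n >= [sigma (z_alpha + z_beta) / (mu0 - mu')]^2."""
    n_exact = (sigma * (z_crit(alpha) + z_crit(beta)) / (mu0 - mu_alt)) ** 2
    return ceil(n_exact)


# Tire tread-life numbers: H0: mu = 30,000 vs Ha: mu > 30,000, sigma = 1500
b = beta_upper(mu_alt=31_000, mu0=30_000, sigma=1500, n=16, alpha=0.01)
n_needed = sample_size_one_tailed(mu_alt=31_000, mu0=30_000, sigma=1500,
                                  alpha=0.01, beta=0.10)
print(f"beta(31,000) with n = 16: {b:.4f}")
print(f"n needed for beta(31,000) = .10: {n_needed}")
```

Rounding the exact solution up, rather than to the nearest integer, guarantees that the achieved $\beta$ does not exceed the target.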

Alternative Hypothesis        Type II Error Probability $\beta(\mu')$ for a Level $\alpha$ Test
$H_a: \mu > \mu_0$            $\Phi\left(z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$
$H_a: \mu < \mu_0$            $1 - \Phi\left(-z_\alpha + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$
$H_a: \mu \ne \mu_0$          $\Phi\left(z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right) - \Phi\left(-z_{\alpha/2} + \dfrac{\mu_0 - \mu'}{\sigma/\sqrt{n}}\right)$

where $\Phi(z)$ = the standard normal cdf.

The sample size $n$ for which a level $\alpha$ test also has $\beta(\mu') = \beta$ at the alternative value $\mu'$ is

$$n = \begin{cases} \left[\dfrac{\sigma(z_\alpha + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a one-tailed (upper or lower) test} \\[2ex] \left[\dfrac{\sigma(z_{\alpha/2} + z_\beta)}{\mu_0 - \mu'}\right]^2 & \text{for a two-tailed test (an approximate solution)} \end{cases}$$

Example 8.7

Let $\mu$ denote the true average tread life of a certain type of tire. Consider testing $H_0: \mu = 30{,}000$ versus $H_a: \mu > 30{,}000$ based on a sample of size $n = 16$ from a normal population distribution with $\sigma = 1500$. A test with $\alpha = .01$ requires $z_\alpha = z_{.01} = 2.33$. The probability of making a type II error when $\mu = 31{,}000$ is

$$\beta(31{,}000) = \Phi\left(2.33 + \frac{30{,}000 - 31{,}000}{1500/\sqrt{16}}\right) = \Phi(-.34) = .3669$$

Since $z_{.1} = 1.28$, the requirement that the level .01 test also have $\beta(31{,}000) = .1$ necessitates

$$n = \left[\frac{1500(2.33 + 1.28)}{30{,}000 - 31{,}000}\right]^2 = (-5.42)^2 = 29.32$$

The sample size must be an integer, so $n = 30$ tires should be used. ■

Case II: Large-Sample Tests

When the sample size is large, the z tests for case I are easily modified to yield valid test procedures without requiring either a normal population distribution or known $\sigma$. The key result was used in Chapter 7 to justify large-sample confidence intervals: A large $n$ implies that the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. Substitution of the null value $\mu_0$ in place of $\mu$ yields the test statistic

$$Z = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

which has approximately a standard normal distribution when $H_0$ is true. The use of rejection regions given previously for case I (e.g., $z \ge z_\alpha$ when the alternative hypothesis is $H_a: \mu > \mu_0$) then results in test procedures for which the significance level is approximately (rather than exactly) $\alpha$. The rule of thumb $n > 40$ will again be used to characterize a large sample size.

Example 8.8

A dynamic cone penetrometer (DCP) is used for measuring material resistance to penetration (mm/blow) as a cone is driven into pavement or subgrade. Suppose that for a particular application it is required that the true average DCP value for a certain type of pavement be less than 30. The pavement will not be used unless there is conclusive evidence that the specification has been met. Let’s state and test the appropriate hypotheses using the following data (“Probabilistic Model for the Analysis of Dynamic Cone Penetrometer Test Values in Pavement Structure Evaluation,” J. of Testing and Evaluation, 1999: 7–14):

14.1 14.5 15.5 16.0 16.0 16.7 16.9 17.1 17.5 17.8 17.8 18.1 18.2
18.3 18.3 19.0 19.2 19.4 20.0 20.0 20.8 20.8 21.0 21.5 23.5 27.5
27.5 28.0 28.3 30.0 30.0 31.6 31.7 31.7 32.5 33.5 33.9 35.0 35.0
35.0 36.7 40.0 40.0 41.3 41.7 47.5 50.0 51.0 51.8 54.4 55.0 57.0

Figure 8.3 shows a descriptive summary obtained from Minitab. The sample mean DCP is less than 30. However, there is a substantial amount of variation in the data (sample coefficient of variation $= s/\bar{x} = .4265$), so the fact that the mean is less than the design specification cutoff may be a consequence just of sampling variability. Notice that the histogram does not resemble at all a normal curve (and a normal probability plot does not exhibit a linear pattern), but the large-sample z tests do not require a normal population distribution.

Descriptive Statistics
Variable: DCP

Anderson-Darling Normality Test
  A-Squared:      1.902
  P-Value:        0.000

  Mean            28.7615
  StDev           12.2647
  Variance        150.423
  Skewness        0.808264
  Kurtosis        -3.9E-01
  N               52

  Minimum         14.1000
  1st Quartile    18.2250
  Median          27.5000
  3rd Quartile    35.0000
  Maximum         57.0000

  95% Confidence Interval for Mu        25.3470   32.1761
  95% Confidence Interval for Sigma     10.2784   15.2098
  95% Confidence Interval for Median    20.0000   31.7000

Figure 8.3 Minitab descriptive summary for the DCP data of Example 8.8

1. $\mu$ = true average DCP value

2. $H_0: \mu = 30$

3. $H_a: \mu < 30$ (so the pavement will not be used unless the null hypothesis is rejected)

4. $z = \dfrac{\bar{x} - 30}{s/\sqrt{n}}$

5. A test with significance level .05 rejects $H_0$ when $z \le -1.645$ (a lower-tailed test).

6. With $n = 52$, $\bar{x} = 28.76$, and $s = 12.2647$,

$$z = \frac{28.76 - 30}{12.2647/\sqrt{52}} = \frac{-1.24}{1.701} = -.73$$

7. Since $-.73 > -1.645$, $H_0$ cannot be rejected. We do not have compelling evidence for concluding that $\mu < 30$; use of the pavement is not justified. ■

Determination of b and the necessary sample size for these large-sample tests can be based either on specifying a plausible value of s and using the case I formu- las (even though s is used in the test) or on using the methodology to be introduced shortly in connection with case III.

Case III: A Normal Population Distribution When n is small, the Central Limit Theorem (CLT) can no longer be invoked to jus- tify the use of a large-sample test. We faced this same difficulty in obtaining a small- sample confidence interval (CI) for m in Chapter 7. Our approach here will be the same one used there: We will assume that the population distribution is at least approximately normal and describe test procedures whose validity rests on this assumption. If an investigator has good reason to believe that the population distri- bution is quite nonnormal, a distribution-free test from Chapter 15 can be used. Alternatively, a statistician can be consulted regarding procedures valid for specific families of population distributions other than the normal family. Or a bootstrap pro- cedure can be developed.

The key result on which tests for a normal population mean are based was used in Chapter 7 to derive the one-sample t CI: If is a random sample from a normal distribution, the standardized variable

has a t distribution with degrees of freedom (df). Consider testing against by using the test statistic . That is, the test statistic results from standardizing under the assumption that H0 is true (using

the estimated standard deviation of , rather than ). When H0 is true, the test statistic has a t distribution with df. Knowledge of the test statistic’s dis- tribution when H0 is true (the “null distribution”) allows us to construct a rejection region for which the type I error probability is controlled at the desired level. In par- ticular, use of the upper-tail t critical value to specify the rejection region

implies that

The test statistic is really the same here as in the large-sample case but is la- beled T to emphasize that its null distribution is a t distribution with df rathern 2 1

5 a

5 P(T $ ta,n21 when T has a t distribution with n 2 1 df)

P(type I error) 5 P(H0 is rejected when it is true)

t $ ta,n21

ta,n21

n 2 1 s/1nXS/1n,

X T 5 (X 2 m0)/(S/1n)Ha: m . m0

H0: m 5 m0n 2 1

T 5 X 2 m

S/1n

X1, X2, c, Xn

m , 30 2.73 . 21.645

z 5 28.76 2 30

12.2647/152 5

21.24

1.701 5 2.73

s 5 12.2647n 5 52, x 5 28.76

z # 21.645

z 5 x 2 30

s/1n

Ha: m , 30

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

8.2 Tests About a Population Mean 317

Example 8.9

than the standard normal (z) distribution. The rejection region for the t test differs from that for the z test only in that a t critical value replaces the z critical value za. Similar comments apply to alternatives for which a lower-tailed or two-tailed test is appropriate.

ta,n21

The One-Sample t Test

Null hypothesis:

Test statistic value:

Alternative Hypothesis Rejection Region for a Level a Test

(upper-tailed) (lower-tailed)

either or (two-tailed)t # 2ta/2,n21t $ ta/2,n21Ha: m 2 m0 t # 2ta, n21Ha: m , m0 t $ ta, n21Ha: m . m0

t 5 x 2 m0 s/1n

H0: m 5 m0

Glycerol is a major by-product of ethanol fermentation in wine production and con- tributes to the sweetness, body, and fullness of wines. The article “A Rapid and Simple Method for Simultaneous Determination of Glycerol, Fructose, and Glucose in Wine” (American J. of Enology and Viticulture, 2007: 279–283) includes the following observations on glycerol concentration (mg/mL) for samples of standard-quality (uncertified) white wines: 2.67, 4.62, 4.14, 3.81, 3.83. Suppose the desired concentration value is 4. Does the sample data suggest that true average concentration is something other than the desired value? The accompanying normal probability plot from Minitab provides strong support for assuming that the popu- lation distribution of glycerol concentration is normal. Let’s carry out a test of appropriate hypotheses using the one-sample t test with a significance level of .05.

[Figure: Minitab normal probability plot. Horizontal axis: Glycerol conc (2.0 to 5.5); vertical axis: Percent (1 to 99). Legend: Mean 3.814, StDev 0.7185, N 5, RJ 0.947, P-Value >0.100]

Figure 8.4 Normal probability plot for the data of Example 8.9
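The hand calculation for this example can be cross-checked from the five glycerol observations. The following is a minimal sketch, assuming Python's standard library rather than Minitab. Since the standard library has no t distribution, the critical value $t_{.025,4} = 2.776$ is entered by hand from a t table, so no P-value is computed.

```python
from math import sqrt
from statistics import mean, stdev

glycerol = [2.67, 4.62, 4.14, 3.81, 3.83]  # observations from Example 8.9 (mg/mL)
mu0 = 4.0                                  # null value
t_crit = 2.776                             # t_{.025,4}, taken from a t table

n = len(glycerol)
xbar = mean(glycerol)
s = stdev(glycerol)            # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)               # estimated standard error of the mean
t = (xbar - mu0) / se          # one-sample t statistic
reject = abs(t) >= t_crit      # two-tailed rejection region

print(f"xbar = {xbar:.3f}, s = {s:.3f}, se = {se:.3f}, t = {t:.2f}, reject H0: {reject}")
```

The printed values match the summary statistics shown in the probability plot legend, and the statistic falls well inside the interval from $-2.776$ to $2.776$.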

1. $\mu =$ true average glycerol concentration

2. $H_0: \mu = 4$

3. $H_a: \mu \ne 4$

4. $t = \dfrac{\bar{x} - 4}{s/\sqrt{n}}$

5. The inequality in $H_a$ implies that a two-tailed test is appropriate, which requires $t_{\alpha/2,n-1} = t_{.025,4} = 2.776$. Thus $H_0$ will be rejected if either $t \ge 2.776$ or $t \le -2.776$.

6. $\sum x_i = 19.07$ and $\sum x_i^2 = 74.7979$, from which $\bar{x} = 3.814$, $s = .718$, and the estimated standard error of the mean is $s/\sqrt{n} = .321$. The test statistic value is then $t = (3.814 - 4)/.321 = -.58$.

7. Clearly $t = -.58$ does not lie in the rejection region for a significance level of .05. It is still plausible that $\mu = 4$. The deviation of the sample mean 3.814 from its expected value 4 when $H_0$ is true can be attributed just to sampling variability rather than to $H_0$ being false.

The accompanying Minitab output from a request to perform a two-tailed one-sample t test shows identical calculated values to those just obtained. The fact that the last number on output, the “P-value,” exceeds .05 (and any other reasonable significance level) implies that the null hypothesis can’t be rejected. This is discussed in detail in Section 8.4.

Test of mu = 4 vs not = 4

Variable    N   Mean    StDev   SE Mean   95% CI            T       P
glyc conc   5   3.814   0.718   0.321     (2.922, 4.706)    -0.58   0.594

■

$\beta$ and Sample Size Determination

The calculation of $\beta$ at the alternative value $\mu'$ in case I was carried out by expressing the rejection region in terms of $\bar{x}$ (e.g., $\bar{x} \ge \mu_0 + z_\alpha \cdot \sigma/\sqrt{n}$) and then subtracting $\mu'$ to standardize correctly. An equivalent approach involves noting that when $\mu = \mu'$, the test statistic $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$ still has a normal distribution with variance 1, but now the mean value of Z is given by $(\mu' - \mu_0)/(\sigma/\sqrt{n})$. That is, when $\mu = \mu'$, the test statistic still has a normal distribution though not the standard normal distribution. Because of this, $\beta(\mu')$ is an area under the normal curve corresponding to mean value $(\mu' - \mu_0)/(\sigma/\sqrt{n})$ and variance 1. Both $\alpha$ and $\beta$ involve working with normally distributed variables.

The calculation of $\beta(\mu')$ for the t test is much less straightforward. This is because the distribution of the test statistic $T = (\bar{X} - \mu_0)/(S/\sqrt{n})$ is quite complicated when $H_0$ is false and $H_a$ is true. Thus, for an upper-tailed test, determining

$$\beta(\mu') = P(T < t_{\alpha,n-1} \text{ when } \mu = \mu' \text{ rather than } \mu_0)$$

involves integrating a very unpleasant density function. This must be done numerically. The results are summarized in graphs of $\beta$ that appear in Appendix Table A.17. There are four sets of graphs, corresponding to one-tailed tests at level .05 and level .01 and two-tailed tests at the same levels.

To understand how these graphs are used, note first that both $\beta$ and the necessary sample size n in case I are functions not just of the absolute difference $|\mu_0 - \mu'|$ but of $d = |\mu_0 - \mu'|/\sigma$. Suppose, for example, that $|\mu_0 - \mu'| = 10$. This departure from $H_0$ will be much easier to detect (smaller $\beta$) when $\sigma = 2$, in which case $\mu_0$ and $\mu'$ are 5 population standard deviations apart, than when $\sigma = 10$. The fact that $\beta$ for the t test depends on d rather than just $|\mu_0 - \mu'|$ is unfortunate, since to use the graphs one must have some idea of the true value of $\sigma$. A conservative (large) guess for $\sigma$ will yield a conservative (large) value of $\beta(\mu')$ and a conservative estimate of the sample size necessary for prescribed $\alpha$ and $\beta(\mu')$.
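The case I description of $\beta$ as an area under a normal curve with shifted mean and variance 1 can be illustrated in a few lines. The following is a minimal sketch, assuming Python's standard library; the function name and the values $\mu_0 = 100$, $\mu' = 110$, and $n = 4$ are hypothetical and chosen only so that the difference equals 10, as in the discussion of $\sigma = 2$ versus $\sigma = 10$.

```python
from math import sqrt
from statistics import NormalDist


def beta_upper_tailed_z(mu0, mu_alt, sigma, n, alpha):
    """Type II error probability of the upper-tailed z test when mu = mu_alt."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)    # upper-tail critical value
    shift = (mu_alt - mu0) / (sigma / sqrt(n))   # mean of Z when mu = mu_alt
    # Z is normal with mean `shift` and variance 1; H0 is not rejected when Z < z_alpha.
    return NormalDist(mu=shift, sigma=1).cdf(z_alpha)


# Hypothetical inputs: the same difference of 10 with two population standard deviations.
beta_small_sigma = beta_upper_tailed_z(100, 110, 2, 4, 0.05)
beta_large_sigma = beta_upper_tailed_z(100, 110, 10, 4, 0.05)
print(f"sigma = 2:  beta = {beta_small_sigma:.4f}")
print(f"sigma = 10: beta = {beta_large_sigma:.4f}")
```

With these inputs the type II error probability is essentially zero when $\sigma = 2$ and a little over one third when $\sigma = 10$, which is the sense in which the same absolute difference is much easier to detect when $\sigma$ is small.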


Once the alternative $\mu'$ and value of $\sigma$ are selected, d is calculated and its value located on the horizontal axis of the relevant set of curves. The value of $\beta$ is the height of the $n - 1$ df curve above the value of d (visual interpolation is necessary if $n - 1$ is not a value for which the corresponding curve appears), as illustrated in Figure 8.5.

[Figure: a $\beta$ curve for $n - 1$ df, with vertical axis running from 0 to 1 and horizontal axis d. The value of d corresponding to the specified alternative $\mu'$ is marked on the horizontal axis, and the height of the curve above it is $\beta$ when $\mu = \mu'$]

Figure 8.5 A typical $\beta$ curve for the t test

Rather than fixing n (i.e., $n - 1$, and thus the particular curve from which $\beta$ is read), one might prescribe both $\alpha$ (.05 or .01 here) and a value of $\beta$ for the chosen $\mu'$ and $\sigma$. After computing d, the point $(d, \beta)$ is located on the relevant set of graphs. The curve below and closest to this point gives $n - 1$ and thus n (again, interpolation is often necessary).

Example 8.10

The true average voltage drop from collector to emitter of insulated gate bipolar transistors of a certain type is supposed to be at most 2.5 volts. An investigator selects a sample of $n = 10$ such transistors and uses the resulting voltages as a basis for testing $H_0: \mu = 2.5$ versus $H_a: \mu > 2.5$ using a t test with significance level $\alpha = .05$. If the standard deviation of the voltage distribution is $\sigma = .100$, how likely is it that $H_0$ will not be rejected when in fact $\mu = 2.6$? With $d = |2.5 - 2.6|/.100 = 1.0$, the point on the $\beta$ curve at 9 df for a one-tailed test with $\alpha = .05$ above 1.0 has a height of approximately .1, so $\beta \approx .1$. The investigator might think that this is too large a value of $\beta$ for such a substantial departure from $H_0$ and may wish to have $\beta = .05$ for this alternative value of $\mu$. Since $d = 1.0$, the point $(d, \beta) = (1.0, .05)$ must be located. This point is very close to the 14 df curve, so using $n = 15$ will give both $\alpha = .05$ and $\beta = .05$ when the value of $\mu$ is 2.6 and $\sigma = .10$. A larger value of $\sigma$ would give a larger $\beta$ for this alternative, and an alternative value of $\mu$ closer to 2.5 would also result in an increased value of $\beta$. ■

Most of the widely used statistical software packages are capable of calculating type II error probabilities. They generally work in terms of power, which is simply $1 - \beta$. A small value of $\beta$ (close to 0) is equivalent to large power (near 1). A powerful test is one that has high power and therefore good ability to detect when the null hypothesis is false.

As an example, we asked Minitab to determine the power of the upper-tailed test in Example 8.10 for the three sample sizes 5, 10, and 15 when $\alpha = .05$, $\sigma = .10$, and the value of $\mu$ is actually 2.6 rather than the null value 2.5, a “difference” of $2.6 - 2.5 = .1$. We also asked the software to determine the necessary sample size for a power of .9 ($\beta = .1$) and also .95. Here is the resulting output:

Power and Sample Size

Testing mean = null (versus > null)
Calculating power for mean = null + difference
Alpha = 0.05  Assumed standard deviation = 0.1

              Sample
Difference    Size     Power
0.1            5       0.579737
0.1           10       0.897517
0.1           15       0.978916

              Sample   Target   Actual
Difference    Size     Power    Power
0.1           11       0.90     0.924489
0.1           13       0.95     0.959703

The power for the sample size $n = 10$ is a bit smaller than .9. So if we insist that the power be at least .9, a sample size of 11 is required and the actual power for that n is roughly .92. The software says that for a target power of .95, a sample size of $n = 13$ is required, whereas eyeballing our $\beta$ curves gave 15. When available, this type of software is more reliable than the curves. Finally, Minitab now also provides power curves for the specified sample sizes, as shown in Figure 8.6. Such curves show how the power increases for each sample size as the actual value of $\mu$ moves further and further away from the null value.

[Figure: “Power Curves for 1-Sample t Test.” Horizontal axis: Difference (0.00 to 0.20); vertical axis: Power (0.0 to 1.0); one curve for each sample size 5, 10, and 15. Assumptions: Alpha 0.05, StDev 0.1, Alternative >]

Figure 8.6 Power curves from Minitab for the t test of Example 8.10
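The power values for Example 8.10 can also be approximated without specialized software. The following is a minimal sketch, assuming Python's standard library; it is not Minitab's algorithm. It writes the test statistic under the alternative as $T = (Z + \delta)/\sqrt{V/\nu}$, where Z is standard normal, V is chi-squared with $\nu = n - 1$ df, and $\delta = (\mu' - \mu_0)\sqrt{n}/\sigma$, and then integrates over V with the midpoint rule. The function name is hypothetical, and the critical values $t_{.05,n-1}$ are entered by hand from a t table.

```python
from math import exp, lgamma, log, sqrt
from statistics import NormalDist


def t_test_power_upper(n, diff, sigma, t_crit, steps=20000):
    """Approximate P(T >= t_crit) for the upper-tailed one-sample t test
    when the true mean exceeds the null value by `diff` (intended for n >= 3)."""
    nu = n - 1
    delta = diff * sqrt(n) / sigma             # noncentrality parameter
    std_normal = NormalDist()
    v_max = nu + 20 * sqrt(2 * nu) + 20        # chi-squared mass beyond this is negligible
    h = v_max / steps
    log_norm = -(nu / 2) * log(2) - lgamma(nu / 2)
    total = 0.0
    for i in range(steps):
        v = (i + 0.5) * h                      # midpoint of the i-th subinterval
        density = exp((nu / 2 - 1) * log(v) - v / 2 + log_norm)
        # Given V = v, H0 is rejected when Z >= t_crit * sqrt(v / nu) - delta.
        reject_prob = 1 - std_normal.cdf(t_crit * sqrt(v / nu) - delta)
        total += reject_prob * density * h
    return total


# Upper-tail critical values t_{.05, n-1}, taken from a t table.
critical_values = {5: 2.132, 10: 1.833, 15: 1.761}
powers = {n: t_test_power_upper(n, 0.1, 0.1, c) for n, c in critical_values.items()}
for n, power in powers.items():
    print(f"n = {n:2d}: power = {power:.3f}")
```

The results should agree with the Minitab power column to about three decimal places; small discrepancies come from the rounded table critical values and the numerical integration.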

EXERCISES Section 8.2 (15–36)

15. Let the test statistic Z have a standard normal distribution when $H_0$ is true. Give the significance level for each of the following situations:
a. $H_a: \mu > \mu_0$, rejection region $z \ge 1.88$
b. $H_a: \mu < \mu_0$, rejection region $z \le -2.75$
c. $H_a: \mu \ne \mu_0$, rejection region $z \ge 2.88$ or $z \le -2.88$


16. Let the test statistic T have a t distribution when $H_0$ is true. Give the significance level for each of the following situations:
a. $H_a: \mu > \mu_0$, df $= 15$, rejection region $t \ge 3.733$
b. $H_a: \mu < \mu_0$, $n = 24$, rejection region $t \le -2.500$
c. $H_a: \mu \ne \mu_0$, $n = 31$, rejection region $t \ge 1.697$ or $t \le -1.697$

17. Answer the following questions for the tire problem in Example 8.7.
a. If $\bar{x} = 30{,}960$ and a level $\alpha = .01$ test is used, what is the decision?
b. If a level .01 test is used, what is $\beta(30{,}500)$?
c. If a level .01 test is used and it is also required that $\beta(30{,}500) = .05$, what sample size n is necessary?
d. If $\bar{x} = 30{,}960$, what is the smallest $\alpha$ at which $H_0$ can be rejected (based on $n = 16$)?

18. Reconsider the paint-drying situation of Example 8.2, in which drying time for a test specimen is normally distributed with $\sigma = 9$. The hypotheses $H_0: \mu = 75$ versus $H_a: \mu < 75$ are to be tested using a random sample of $n = 25$ observations.
a. How many standard deviations (of $\bar{X}$) below the null value is $\bar{x} = 72.3$?
b. If $\bar{x} = 72.3$, what is the conclusion using $\alpha = .01$?
c. What is $\alpha$ for the test procedure that rejects $H_0$ when $z \le -2.88$?
d. For the test procedure of part (c), what is $\beta(70)$?
e. If the test procedure of part (c) is used, what n is necessary to ensure that $\beta(70) = .01$?
f. If a level .01 test is used with $n = 100$, what is the probability of a type I error when $\mu = 76$?

19. The melting point of each of 16 samples of a certain brand of hydrogenated vegetable oil was determined, resulting in $\bar{x} = 94.32$. Assume that the distribution of the melting point is normal with $\sigma = 1.20$.
a. Test $H_0: \mu = 95$ versus $H_a: \mu \ne 95$ using a two-tailed level .01 test.
b. If a level .01 test is used, what is $\beta(94)$, the probability of a type II error when $\mu = 94$?
c. What value of n is necessary to ensure that $\beta(94) = .1$ when $\alpha = .01$?

20. Lightbulbs of a certain type are advertised as having an average lifetime of 750 hours. The price of these bulbs is very favorable, so a potential customer has decided to go ahead with a purchase arrangement unless it can be conclusively demonstrated that the true average lifetime is smaller than what is advertised. A random sample of 50 bulbs was selected, the lifetime of each bulb determined, and the appropriate hypotheses were tested using Minitab, resulting in the accompanying output.

Variable    N    Mean     StDev   SEMean   Z       P-Value
lifetime    50   738.44   38.20   5.40     -2.14   0.016

What conclusion would be appropriate for a significance level of .05? A significance level of .01? What significance level and conclusion would you recommend?

21. The true average diameter of ball bearings of a certain type is supposed to be .5 in. A one-sample t test will be carried out to see whether this is the case. What conclusion is appropriate in each of the following situations?
a. $n = 13$, $t = 1.6$, $\alpha = .05$
b. $n = 13$, $t = -1.6$, $\alpha = .05$
c. $n = 25$, $t = -2.6$, $\alpha = .01$
d. $n = 25$, $t = -3.9$

22. The article “The Foreman’s View of Quality Control” (Quality Engr., 1990: 257–280) described an investigation into the coating weights for large pipes resulting from a galvanized coating process. Production standards call for a true average weight of 200 lb per pipe. The accompanying descriptive summary and boxplot are from Minitab.

Variable   N    Mean     Median   TrMean   StDev   SEMean
ctg wt     30   206.73   206.00   206.81   6.35    1.16

Variable   Min      Max      Q1       Q3
ctg wt     193.00   218.00   202.75   212.00

[Boxplot of coating weight; horizontal axis from 190 to 220]

a. What does the boxplot suggest about the status of the specification for true average coating weight?
b. A normal probability plot of the data was quite straight. Use the descriptive output to test the appropriate hypotheses.

23. Exercise 36 in Chapter 1 gave $n = 26$ observations on escape time (sec) for oil workers in a simulated exercise, from which the sample mean and sample standard deviation are 370.69 and 24.36, respectively. Suppose the investigators had believed a priori that true average escape time would be at most 6 min. Does the data contradict this prior belief? Assuming normality, test the appropriate hypotheses using a significance level of .05.

24. Reconsider the sample observations on stabilized viscosity of asphalt specimens introduced in Exercise 46 in Chapter 1 (2781, 2900, 3013, 2856, and 2888). Suppose that for a particular application it is required that true average viscosity be 3000. Does this requirement appear to have been satisfied? State and test the appropriate hypotheses.

25. The desired percentage of SiO2 in a certain type of aluminous cement is 5.5. To test whether the true average percentage is 5.5 for a particular production facility, 16 independently obtained samples are analyzed. Suppose that the percentage of SiO2 in a sample is normally distributed with $\sigma = .3$ and that $\bar{x} = 5.25$.
a. Does this indicate conclusively that the true average percentage differs from 5.5? Carry out the analysis using the sequence of steps suggested in the text.
b. If the true average percentage is $\mu = 5.6$ and a level $\alpha = .01$ test based on $n = 16$ is used, what is the probability of detecting this departure from $H_0$?
c. What value of n is required to satisfy $\alpha = .01$ and $\beta(5.6) = .01$?


26. To obtain information on the corrosion-resistance properties of a certain type of steel conduit, 45 specimens are buried in soil for a 2-year period. The maximum penetration (in mils) for each specimen is then measured, yielding a sample average penetration of $\bar{x} = 52.7$ and a sample standard deviation of $s = 4.8$. The conduits were manufactured with the specification that true average penetration be at most 50 mils. They will be used unless it can be demonstrated conclusively that the specification has not been met. What would you conclude?

27. Automatic identification of the boundaries of significant structures within a medical image is an area of ongoing research. The paper “Automatic Segmentation of Medical Images Using Image Registration: Diagnostic and Simulation Applications” (J. of Medical Engr. and Tech., 2005: 53–63) discussed a new technique for such identification. A measure of the accuracy of the automatic region is the average linear displacement (ALD). The paper gave the following ALD observations for a sample of 49 kidneys (units of pixel dimensions).

1.38 0.44 1.09 0.75 0.66 1.28 0.51 0.39 0.70 0.46 0.54 0.83 0.58 0.64 1.30 0.57 0.43 0.62 1.00 1.05 0.82 1.10 0.65 0.99 0.56 0.56 0.64 0.45 0.82 1.06 0.41 0.58 0.66 0.54 0.83 0.59 0.51 1.04 0.85 0.45 0.52 0.58 1.11 0.34 1.25 0.38 1.44 1.28 0.51

a. Summarize/describe the data.
b. Is it plausible that ALD is at least approximately normally distributed? Must normality be assumed prior to calculating a CI for true average ALD or testing hypotheses about true average ALD? Explain.
c. The authors commented that in most cases the ALD is better than or of the order of 1.0. Does the data in fact provide strong evidence for concluding that true average ALD under these circumstances is less than 1.0? Carry out an appropriate test of hypotheses.
d. Calculate an upper confidence bound for true average ALD using a confidence level of 95%, and interpret this bound.

28. Minor surgery on horses under field conditions requires a reliable short-term anesthetic producing good muscle relaxation, minimal cardiovascular and respiratory changes, and a quick, smooth recovery with minimal aftereffects so that horses can be left unattended. The article “A Field Trial of Ketamine Anesthesia in the Horse” (Equine Vet. J., 1984: 176–179) reports that for a sample of $n = 73$ horses to which ketamine was administered under certain conditions, the sample average lateral recumbency (lying-down) time was 18.86 min and the standard deviation was 8.6 min. Does this data suggest that true average lateral recumbency time under these conditions is less than 20 min? Test the appropriate hypotheses at level of significance .10.

29. The article “Uncertainty Estimation in Railway Track Life-Cycle Cost” (J. of Rail and Rapid Transit, 2009) presented the following data on time to repair (min) a rail break in the high rail on a curved track of a certain railway line.

159 120 480 149 270 547 340 43 228 202 240 218

A normal probability plot of the data shows a reasonably linear pattern, so it is plausible that the population distribution of repair time is at least approximately normal. The sample mean and standard deviation are 249.7 and 145.1, respectively.
a. Is there compelling evidence for concluding that true average repair time exceeds 200 min? Carry out a test of hypotheses using a significance level of .05.
b. Using $\sigma = 150$, what is the type II error probability of the test used in (a) when true average repair time is actually 300 min? That is, what is $\beta(300)$?

30. Have you ever been frustrated because you could not get a container of some sort to release the last bit of its contents? The article “Shake, Rattle, and Squeeze: How Much Is Left in That Container?” (Consumer Reports, May 2009: 8) reported on an investigation of this issue for various consumer products. Suppose five 6.0 oz tubes of toothpaste of a particular brand are randomly selected and squeezed until no more toothpaste will come out. Then each tube is cut open and the amount remaining is weighed, resulting in the following data (consistent with what the cited article reported): .53, .65, .46, .50, .37. Does it appear that the true average amount left is less than 10% of the advertised net contents?
a. Check the validity of any assumptions necessary for testing the appropriate hypotheses.
b. Carry out a test of the appropriate hypotheses using a significance level of .05. Would your conclusion change if a significance level of .01 had been used?
c. Describe in context type I and II errors, and say which error might have been made in reaching a conclusion.

31. A well-designed and safe workplace can contribute greatly to increased productivity. It is especially important that workers not be asked to perform tasks, such as lifting, that exceed their capabilities. The accompanying data on maximum weight of lift (MAWL, in kg) for a frequency of four lifts/min was reported in the article “The Effects of Speed, Frequency, and Load on Measured Hand Forces for a Floor-to-Knuckle Lifting Task” (Ergonomics, 1992: 833–843); subjects were randomly selected from the population of healthy males ages 18–30. Assuming that MAWL is normally distributed, does the data suggest that the population mean MAWL exceeds 25? Carry out a test using a significance level of .05.

25.8 36.6 26.3 21.8 27.2

32. The recommended daily dietary allowance for zinc among males older than age 50 years is 15 mg/day. The article “Nutrient Intakes and Dietary Patterns of Older Americans: A National Study” (J. of Gerontology, 1992: M145–150) reports the following summary data on intake for a sample of males age 65–74 years: $n = 115$, $\bar{x} = 11.3$, and $s = 6.43$. Does this data indicate that average daily zinc intake in the population of all males ages 65–74 falls below the recommended allowance?

33. Reconsider the accompanying sample data on expense ratio (%) for large-cap growth mutual funds first introduced in Exercise 1.53.

0.52 1.06 1.26 2.17 1.55 0.99 1.10 1.07 1.81 2.05 0.91 0.79 1.39 0.62 1.52 1.02 1.10 1.78 1.01 1.15

A normal probability plot shows a reasonably linear pattern.
a. Is there compelling evidence for concluding that the population mean expense ratio exceeds 1%? Carry out a test of the relevant hypotheses using a significance level of .01.
b. Referring back to (a), describe in context type I and II errors and say which error you might have made in reaching your conclusion. The source from which the data was obtained reported that $\mu = 1.33$ for the population of all 762 such funds. So did you actually commit an error in reaching your conclusion?
c. Supposing that $\sigma = .5$, determine and interpret the power of the test in (a) for the actual value of $\mu$ stated in (b).

34. A sample of 12 radon detectors of a certain type was selected, and each was exposed to 100 pCi/L of radon. The resulting readings were as follows:

105.6 90.9 91.2 96.9 96.5 91.3 100.1 105.0 99.6 107.7 103.3 92.4

a. Does this data suggest that the population mean reading under these conditions differs from 100? State and test the appropriate hypotheses using $\alpha = .05$.
b. Suppose that prior to the experiment a value of $\sigma = 7.5$ had been assumed. How many determinations would then have been appropriate to obtain $\beta = .10$ for the alternative $\mu = 95$?

35. Show that for any $\Delta > 0$, when the population distribution is normal and $\sigma$ is known, the two-tailed test satisfies $\beta(\mu_0 - \Delta) = \beta(\mu_0 + \Delta)$, so that $\beta(\mu')$ is symmetric about $\mu_0$.

36. For a fixed alternative value $\mu'$, show that $\beta(\mu') \to 0$ as $n \to \infty$ for either a one-tailed or a two-tailed z test in the case of a normal population distribution with known $\sigma$.

8.3 Tests Concerning a Population Proportion

Let p denote the proportion of individuals or objects in a population who possess a specified property (e.g., cars with manual transmissions or smokers who smoke a filter cigarette). If an individual or object with the property is labeled a success (S), then p is the population proportion of successes. Tests concerning p will be based on a random sample of size n from the population. Provided that n is small relative to the population size, X (the number of S’s in the sample) has (approximately) a binomial distribution. Furthermore, if n itself is large [$np \ge 10$ and $n(1 - p) \ge 10$], both X and the estimator $\hat{p} = X/n$ are approximately normally distributed. We first consider large-sample tests based on this latter fact and then turn to the small-sample case that directly uses the binomial distribution.

Large-Sample Tests

Large-sample tests concerning p are a special case of the more general large-sample procedures for a parameter $\theta$. Let $\hat{\theta}$ be an estimator of $\theta$ that is (at least approximately) unbiased and has approximately a normal distribution. The null hypothesis has the form $H_0: \theta = \theta_0$, where $\theta_0$ denotes a number (the null value) appropriate to the problem context. Suppose that when $H_0$ is true, the standard deviation of $\hat{\theta}$, $\sigma_{\hat{\theta}}$, involves no unknown parameters. For example, if $\theta = \mu$ and $\hat{\theta} = \bar{X}$, $\sigma_{\hat{\theta}} = \sigma_{\bar{X}} = \sigma/\sqrt{n}$, which involves no unknown parameters only if the value of $\sigma$ is known. A large-sample test statistic results from standardizing $\hat{\theta}$ under the assumption that $H_0$ is true (so that $E(\hat{\theta}) = \theta_0$):

Test statistic: $Z = \dfrac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$

If the alternative hypothesis is $H_a: \theta > \theta_0$, an upper-tailed test whose significance level is approximately $\alpha$ is specified by the rejection region $z \ge z_\alpha$. The other two alternatives, $H_a: \theta < \theta_0$ and $H_a: \theta \ne \theta_0$, are tested using a lower-tailed z test and a two-tailed z test, respectively.


In the case $\theta = p$, $\sigma_{\hat{\theta}}$ will not involve any unknown parameters when $H_0$ is true, but this is atypical. When $\sigma_{\hat{\theta}}$ does involve unknown parameters, it is often possible to use an estimated standard deviation $S_{\hat{\theta}}$ in place of $\sigma_{\hat{\theta}}$ and still have Z approximately normally distributed when $H_0$ is true (because when n is large, $s_{\hat{\theta}} \approx \sigma_{\hat{\theta}}$ for most samples). The large-sample test of the previous section furnishes an example of this: Because $\sigma$ is usually unknown, we use $s_{\hat{\theta}} = s_{\bar{X}} = s/\sqrt{n}$ in place of $\sigma/\sqrt{n}$ in the denominator of z.

The estimator $\hat{p} = X/n$ is unbiased ($E(\hat{p}) = p$), has approximately a normal distribution, and its standard deviation is $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$. These facts were used in Section 7.2 to obtain a confidence interval for p. When $H_0$ is true, $E(\hat{p}) = p_0$ and $\sigma_{\hat{p}} = \sqrt{p_0(1 - p_0)/n}$, so $\sigma_{\hat{p}}$ does not involve any unknown parameters. It then follows that when n is large and $H_0$ is true, the test statistic

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$

has approximately a standard normal distribution. If the alternative hypothesis is $H_a: p > p_0$ and the upper-tailed rejection region $z \ge z_\alpha$ is used, then

$$P(\text{type I error}) = P(H_0 \text{ is rejected when it is true}) = P(Z \ge z_\alpha \text{ when } Z \text{ has approximately a standard normal distribution}) \approx \alpha$$

Thus the desired level of significance $\alpha$ is attained by using the critical value that captures area $\alpha$ in the upper tail of the z curve. Rejection regions for the other two alternative hypotheses, lower-tailed for $H_a: p < p_0$ and two-tailed for $H_a: p \ne p_0$, are justified in an analogous manner.

Null hypothesis:

Test statistic value:

Alternative Hypothesis Rejection Region

(upper-tailed) (lower-tailed)

either or (two-tailed)

These test procedures are valid provided that and .n(1 2 p0) $ 10np0 $ 10

z # 2za/2z $ za/2Ha: p 2 p0 z # 2zaHa: p , p0 z $ zaHa: p . p0

z 5 p̂ 2 p0

#p0(1 2 p0)/n H0: p 5 p0
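The boxed procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_proportion_z_test` and its `tail` argument are hypothetical and do not come from the text, and critical values are taken from the standard library's `statistics.NormalDist` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0, alpha, tail="upper"):
    """Large-sample z test of H0: p = p0 (hypothetical helper, not from the text).

    x is the number of successes in a sample of size n; tail is "upper",
    "lower", or "two". Returns (z, critical value, reject H0?).
    """
    # Validity condition stated in the box: n*p0 >= 10 and n*(1 - p0) >= 10.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("large-sample test requires n*p0 >= 10 and n*(1 - p0) >= 10")

    std_normal = NormalDist()
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    if tail == "upper":
        crit = std_normal.inv_cdf(1 - alpha)        # z_alpha
        reject = z >= crit
    elif tail == "lower":
        crit = std_normal.inv_cdf(1 - alpha)        # z_alpha; region is z <= -z_alpha
        reject = z <= -crit
    elif tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)    # z_{alpha/2}
        reject = abs(z) >= crit
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, crit, reject


# 16 successes in 91 trials, H0: p = .15 versus Ha: p > .15 at level .10
z, crit, reject = one_proportion_z_test(x=16, n=91, p0=0.15, alpha=0.10)
print(round(z, 2), round(crit, 2), reject)
```

With 16 successes in 91 trials and $p_0 = .15$, the computed z is about .69 and the critical value about 1.28, so the sketch does not reject $H_0$ at level .10.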

Example 8.11

Natural cork in wine bottles is subject to deterioration, and as a result wine in such bottles may experience contamination. The article “Effects of Bottle Closure Type on Consumer Perceptions of Wine Quality” (Amer. J. of Enology and Viticulture, 2007: 182–191) reported that, in a tasting of commercial chardonnays, 16 of 91 bottles were considered spoiled to some extent by cork-associated characteristics. Does this data provide strong evidence for concluding that more than 15% of all such bottles are contaminated in this way? Let’s carry out a test of hypotheses using a significance level of .10.

1. $p$ = the true proportion of all commercial chardonnay bottles considered spoiled to some extent by cork-associated characteristics.

2. The null hypothesis is $H_0: p = .15$.

3. The alternative hypothesis is $H_a: p > .15$, the assertion that the population percentage exceeds 15%.

4. Since $np_0 = 91(.15) = 13.65 > 10$ and $nq_0 = 91(.85) = 77.35 > 10$, the large-sample z test can be used. The test statistic value is $z = (\hat p - .15)/\sqrt{(.15)(.85)/n}$.

5. The form of $H_a$ implies that an upper-tailed test is appropriate: Reject $H_0$ if $z \ge z_{.10} = 1.28$.

6. $\hat p = 16/91 = .1758$, from which

\[ z = \frac{.1758 - .15}{\sqrt{(.15)(.85)/91}} = \frac{.0258}{.0374} = .69 \]

7. Since .69 < 1.28, z is not in the rejection region. At significance level .10, the null hypothesis cannot be rejected. Although the percentage of contaminated bottles in the sample somewhat exceeds 15%, the sample percentage is not large enough to conclude that the population percentage exceeds 15%. The difference between the sample proportion .1758 and the null value .15 can adequately be explained by sampling variability. ■

β and Sample Size Determination

When $H_0$ is true, the test statistic Z has approximately a standard normal distribution. Now suppose that $H_0$ is not true and that $p = p'$. Then Z still has approximately a normal distribution (because it is a linear function of $\hat p$), but its mean value and variance are no longer 0 and 1, respectively. Instead,

\[ E(Z) = \frac{p' - p_0}{\sqrt{p_0(1 - p_0)/n}} \qquad V(Z) = \frac{p'(1 - p')/n}{p_0(1 - p_0)/n} \]

The probability of a type II error for an upper-tailed test is $\beta(p') = P(Z < z_\alpha \text{ when } p = p')$. This can be computed by using the given mean and variance to standardize and then referring to the standard normal cdf. In addition, if it is desired that the level $\alpha$ test also have $\beta(p') = \beta$ for a specified value of $\beta$, this equation can be solved for the necessary n as in Section 8.2. General expressions for $\beta(p')$ and n are given in the accompanying box.

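Alternative Hypothesis      $\beta(p')$

$H_a: p > p_0$
\[ \Phi\left[ \frac{p_0 - p' + z_\alpha \sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}} \right] \]

$H_a: p < p_0$
\[ 1 - \Phi\left[ \frac{p_0 - p' - z_\alpha \sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}} \right] \]

$H_a: p \ne p_0$
\[ \Phi\left[ \frac{p_0 - p' + z_{\alpha/2} \sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}} \right] - \Phi\left[ \frac{p_0 - p' - z_{\alpha/2} \sqrt{p_0(1 - p_0)/n}}{\sqrt{p'(1 - p')/n}} \right] \]

The sample size n for which the level $\alpha$ test also satisfies $\beta(p') = \beta$ is

\[ n = \left[ \frac{z_\alpha \sqrt{p_0(1 - p_0)} + z_\beta \sqrt{p'(1 - p')}}{p' - p_0} \right]^2 \quad \text{one-tailed test} \]

\[ n = \left[ \frac{z_{\alpha/2} \sqrt{p_0(1 - p_0)} + z_\beta \sqrt{p'(1 - p')}}{p' - p_0} \right]^2 \quad \text{two-tailed test (an approximate solution)} \]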
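The expressions in the box translate directly into code. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` for $\Phi$ and for the critical values; the function names `beta_proportion` and `sample_size_proportion` are hypothetical labels chosen for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution: cdf is Phi, inv_cdf gives critical values


def beta_proportion(p0, p_alt, n, alpha, tail):
    """Approximate type II error probability beta(p') for the large-sample z test.

    p_alt plays the role of p' in the box; tail is "upper", "lower", or "two".
    """
    sd0 = sqrt(p0 * (1 - p0) / n)            # standard deviation of p-hat under H0
    sd_alt = sqrt(p_alt * (1 - p_alt) / n)   # standard deviation of p-hat when p = p'
    if tail == "upper":
        z_a = _STD.inv_cdf(1 - alpha)
        return _STD.cdf((p0 - p_alt + z_a * sd0) / sd_alt)
    if tail == "lower":
        z_a = _STD.inv_cdf(1 - alpha)
        return 1 - _STD.cdf((p0 - p_alt - z_a * sd0) / sd_alt)
    if tail == "two":
        z_half = _STD.inv_cdf(1 - alpha / 2)
        upper = _STD.cdf((p0 - p_alt + z_half * sd0) / sd_alt)
        lower = _STD.cdf((p0 - p_alt - z_half * sd0) / sd_alt)
        return upper - lower
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


def sample_size_proportion(p0, p_alt, alpha, beta, two_tailed=False):
    """Sample size from the boxed formula, rounded up to the next integer."""
    z_a = _STD.inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_b = _STD.inv_cdf(1 - beta)
    root = (z_a * sqrt(p0 * (1 - p0)) + z_b * sqrt(p_alt * (1 - p_alt))) / (p_alt - p0)
    return ceil(root ** 2)


# Illustrative values: H0: p = .9 versus Ha: p < .9, alpha = .01, p' = .8, n = 225
b = beta_proportion(p0=0.9, p_alt=0.8, n=225, alpha=0.01, tail="lower")
n_needed = sample_size_proportion(p0=0.9, p_alt=0.8, alpha=0.01, beta=0.01)
print(round(b, 4), n_needed)
```

Because the sketch uses unrounded critical values instead of the tabled 2.33, its $\beta$ differs slightly in the third or fourth decimal place from a hand calculation based on the table.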


Example 8.12

A package-delivery service advertises that at least 90% of all packages brought to its office by 9 A.M. for delivery in the same city are delivered by noon that day. Let p denote the true proportion of such packages that are delivered as advertised and consider the hypotheses $H_0: p = .9$ versus $H_a: p < .9$. If only 80% of the packages are delivered as advertised, how likely is it that a level .01 test based on $n = 225$ packages will detect such a departure from $H_0$? What should the sample size be to ensure that $\beta(.8) = .01$? With $\alpha = .01$, $p_0 = .9$, $p' = .8$, and $n = 225$,

\[ \beta(.8) = 1 - \Phi\left( \frac{.9 - .8 - 2.33\sqrt{(.9)(.1)/225}}{\sqrt{(.8)(.2)/225}} \right) = 1 - \Phi(2.00) = .0228 \]

Thus the probability that $H_0$ will be rejected using the test when $p = .8$ is .9772; roughly 98% of all samples will result in correct rejection of $H_0$.

Using $z_\alpha = z_\beta = 2.33$ in the sample size formula yields

\[ n = \left[ \frac{2.33\sqrt{(.9)(.1)} + 2.33\sqrt{(.8)(.2)}}{.8 - .9} \right]^2 \approx 266 \]
■

Small-Sample Tests

Test procedures when the sample size n is small are based directly on the binomial distribution rather than the normal approximation. Consider the alternative hypothesis $H_a: p > p_0$ and again let X be the number of successes in the sample. Then X is the test statistic, and the upper-tailed rejection region has the form $x \ge c$. When $H_0$ is true, X has a binomial distribution with parameters n and $p_0$, so

\[ P(\text{type I error}) = P(H_0 \text{ is rejected when it is true}) \]
\[ = P(X \ge c \text{ when } X \sim \text{Bin}(n, p_0)) \]
\[ = 1 - P(X \le c - 1 \text{ when } X \sim \text{Bin}(n, p_0)) \]
\[ = 1 - B(c - 1; n, p_0) \]

As the critical value c decreases, more x values are included in the rejection region and P(type I error) increases. Because X has a discrete probability distribution, it is usually not possible to find a value of c for which P(type I error) is exactly the desired significance level $\alpha$ (e.g., .05 or .01). Instead, the largest rejection region of the form $\{c, c + 1, \ldots, n\}$ satisfying $1 - B(c - 1; n, p_0) \le \alpha$ is used.

Let $p'$ denote an alternative value of p ($p' > p_0$). When $p = p'$, $X \sim \text{Bin}(n, p')$, so

\[ \beta(p') = P(\text{type II error when } p = p') = P(X < c \text{ when } X \sim \text{Bin}(n, p')) = B(c - 1; n, p') \]

That is, $\beta(p')$ is the result of a straightforward binomial probability calculation. The sample size n necessary to ensure that a level $\alpha$ test also has specified $\beta$ at a particular alternative value $p'$ must be determined by trial and error using the binomial cdf.

Test procedures for $H_a: p < p_0$ and for $H_a: p \ne p_0$ are constructed in a similar manner. In the former case, the appropriate rejection region has the form $x \le c$ (a lower-tailed test). The critical value c is the largest number satisfying $B(c; n, p_0) \le \alpha$. The rejection region when the alternative hypothesis is $H_a: p \ne p_0$ consists of both large and small x values.


Example 8.13

A plastics manufacturer has developed a new type of plastic trash can and proposes to sell them with an unconditional 6-year warranty. To see whether this is economically feasible, 20 prototype cans are subjected to an accelerated life test to simulate 6 years of use. The proposed warranty will be modified only if the sample data strongly suggests that fewer than 90% of such cans would survive the 6-year period. Let p denote the proportion of all cans that survive the accelerated test. The relevant hypotheses are $H_0: p = .9$ versus $H_a: p < .9$. A decision will be based on the test statistic X, the number among the 20 that survive. If the desired significance level is $\alpha = .05$, c must satisfy $B(c; 20, .9) \le .05$. From Appendix Table A.1, $B(15; 20, .9) = .043$, whereas $B(16; 20, .9) = .133$. The appropriate rejection region is therefore $x \le 15$. If the accelerated test results in $x = 14$, $H_0$ would be rejected in favor of $H_a$, necessitating a modification of the proposed warranty. The probability of a type II error for the alternative value $p' = .8$ is

\[ \beta(.8) = P(H_0 \text{ is not rejected when } X \sim \text{Bin}(20, .8)) \]
\[ = P(X \ge 16 \text{ when } X \sim \text{Bin}(20, .8)) \]
\[ = 1 - B(15; 20, .8) = 1 - .370 = .630 \]

That is, when $p = .8$, 63% of all samples consisting of $n = 20$ cans would result in $H_0$ being incorrectly not rejected. This error probability is high because 20 is a small sample size and $p' = .8$ is close to the null value $p_0 = .9$. ■
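The table lookups in a small-sample test can also be done by direct computation. The sketch below, assuming only the Python standard library, evaluates the binomial cdf $B(x; n, p)$ by summation and searches for the lower-tailed critical value; the names `binom_cdf` and `lower_tail_critical_value` are hypothetical helpers, not notation from the text.

```python
from math import comb


def binom_cdf(x, n, p):
    """B(x; n, p) = P(X <= x) for X ~ Bin(n, p), computed by direct summation."""
    if x < 0:
        return 0.0
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(min(x, n) + 1))


def lower_tail_critical_value(n, p0, alpha):
    """Largest c with B(c; n, p0) <= alpha, or None if no such c exists."""
    c = None
    for candidate in range(n + 1):
        if binom_cdf(candidate, n, p0) <= alpha:
            c = candidate      # the cdf is nondecreasing, so keep the latest success
        else:
            break
    return c


# Trash-can warranty setting: n = 20, H0: p = .9 versus Ha: p < .9, alpha = .05
c = lower_tail_critical_value(n=20, p0=0.9, alpha=0.05)
actual_alpha = binom_cdf(c, 20, 0.9)     # actual type I error probability
beta = 1 - binom_cdf(c, 20, 0.8)         # P(X >= c + 1 when p = .8)
print(c, round(actual_alpha, 3), round(beta, 3))
```

The search returns the rejection region $x \le 15$, with an actual significance level of about .043 and $\beta(.8)$ of about .630, in agreement with the table-based calculation.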

EXERCISES Section 8.3 (37–46)

37. A common characterization of obese individuals is that their body mass index is at least 30 [$\text{BMI} = \text{weight}/(\text{height})^2$, where height is in meters and weight is in kilograms]. The article “The Impact of Obesity on Illness Absence and Productivity in an Industrial Population of Petrochemical Workers” (Annals of Epidemiology, 2008: 8–14) reported that in a sample of female workers, 262 had BMIs of less than 25, 159 had BMIs that were at least 25 but less than 30, and 120 had BMIs exceeding 30. Is there compelling evidence for concluding that more than 20% of the individuals in the sampled population are obese?
a. State and test appropriate hypotheses using the rejection region approach with a significance level of .05.
b. Explain in the context of this scenario what constitutes type I and II errors.
c. What is the probability of not concluding that more than 20% of the population is obese when the actual percentage of obese individuals is 25%?

38. A manufacturer of nickel-hydrogen batteries randomly selects 100 nickel plates for test cells, cycles them a specified number of times, and determines that 14 of the plates have blistered.
a. Does this provide compelling evidence for concluding that more than 10% of all plates blister under such circumstances? State and test the appropriate hypotheses using a significance level of .05. In reaching your conclusion, what type of error might you have committed?
b. If it is really the case that 15% of all plates blister under these circumstances and a sample size of 100 is used, how likely is it that the null hypothesis of part (a) will not be rejected by the level .05 test? Answer this question for a sample size of 200.
c. How many plates would have to be tested to have $\beta(.15) = .10$ for the test of part (a)?

39. A random sample of 150 recent donations at a certain blood bank reveals that 82 were type A blood. Does this suggest that the actual percentage of type A donations differs from 40%, the percentage of the population having type A blood? Carry out a test of the appropriate hypotheses using a significance level of .01. Would your conclusion have been different if a significance level of .05 had been used?

40. It is known that roughly 2/3 of all human beings have a dominant right foot or eye. Is there also right-sided dominance in kissing behavior? The article “Human Behavior: Adult Persistence of Head-Turning Asymmetry” (Nature, 2003: 771) reported that in a random sample of 124 kissing couples, both people in 80 of the couples tended to lean more to the right than to the left.
a. If 2/3 of all kissing couples exhibit this right-leaning behavior, what is the probability that the number in a sample of 124 who do so differs from the expected value by at least as much as what was actually observed?
b. Does the result of the experiment suggest that the 2/3 figure is implausible for kissing behavior? State and test the appropriate hypotheses.

41. The article referenced in Example 8.11 also reported that in a sample of 106 wine consumers, 22 (20.8%) thought that screw tops were an acceptable substitute for natural corks. Suppose a particular winery decided to use screw tops for one of its wines unless there was strong evidence to suggest that fewer than 25% of wine consumers found this acceptable.
a. Using a significance level of .10, what would you recommend to the winery?
b. For the hypotheses tested in (a), describe in context what the type I and II errors would be, and say which type of error might have been committed.

42. With domestic sources of building supplies running low several years ago, roughly 60,000 homes were built with imported Chinese drywall. According to the article “Report Links Chinese Drywall to Home Problems” (New York Times, Nov. 24, 2009), federal investigators identified a strong association between chemicals in the drywall and electrical problems, and there is also strong evidence of respiratory difficulties due to the emission of hydrogen sulfide gas. An extensive examination of 51 homes found that 41 had such problems. Suppose these 51 were randomly sampled from the population of all homes having Chinese drywall.
a. Does the data provide strong evidence for concluding that more than 50% of all homes with Chinese drywall have electrical/environmental problems? Carry out a test of hypotheses using $\alpha = .01$.
b. Calculate a lower confidence bound using a confidence level of 99% for the percentage of all such homes that have electrical/environmental problems.
c. If it is actually the case that 80% of all such homes have problems, how likely is it that the test of (a) would not conclude that more than 50% do?

43. A plan for an executive travelers’ club has been developed by an airline on the premise that 5% of its current customers would qualify for membership. A random sample of 500 customers yielded 40 who would qualify.
a. Using this data, test at level .01 the null hypothesis that the company’s premise is correct against the alternative that it is not correct.
b. What is the probability that when the test of part (a) is used, the company’s premise will be judged correct when in fact 10% of all current customers qualify?

44. Each of a group of 20 intermediate tennis players is given two rackets, one having nylon strings and the other synthetic gut strings. After several weeks of playing with the two rackets, each player will be asked to state a preference for one of the two types of strings. Let p denote the proportion of all such players who would prefer gut to nylon, and let X be the number of players in the sample who prefer gut. Because gut strings are more expensive, consider the null hypothesis that at most 50% of all such players prefer gut. We simplify this to $H_0: p = .5$, planning to reject $H_0$ only if sample evidence strongly favors gut strings.
a. Which of the rejection regions {15, 16, 17, 18, 19, 20}, {0, 1, 2, 3, 4, 5}, or {0, 1, 2, 3, 17, 18, 19, 20} is most appropriate, and why are the other two not appropriate?
b. What is the probability of a type I error for the chosen region of part (a)? Does the region specify a level .05 test? Is it the best level .05 test?
c. If 60% of all enthusiasts prefer gut, calculate the probability of a type II error using the appropriate region from part (a). Repeat if 80% of all enthusiasts prefer gut.
d. If 13 out of the 20 players prefer gut, should $H_0$ be rejected using a significance level of .10?

45. A manufacturer of plumbing fixtures has developed a new type of washerless faucet. Let $p = P$(a randomly selected faucet of this type will develop a leak within 2 years under normal use). The manufacturer has decided to proceed with production unless it can be determined that p is too large; the borderline acceptable value of p is specified as .10. The manufacturer decides to subject n of these faucets to accelerated testing (approximating 2 years of normal use). With $X$ = the number among the n faucets that leak before the test concludes, production will commence unless the observed X is too large. It is decided that if $p = .10$, the probability of not proceeding should be at most .10, whereas if $p = .30$ the probability of proceeding should be at most .10. Can $n = 10$ be used? $n = 20$? $n = 25$? What is the appropriate rejection region for the chosen n, and what are the actual error probabilities when this region is used?

46. Scientists think that robots will play a crucial role in factories in the next several decades. Suppose that in an experiment to determine whether the use of robots to weave computer cables is feasible, a robot was used to assemble 500 cables. The cables were examined and there were 15 defectives. If human assemblers have a defect rate of .035 (3.5%), does this data support the hypothesis that the proportion of defectives is lower for robots than for humans? Use a .01 significance level.

8.4 P-Values

Using the rejection region method to test hypotheses entails first selecting a significance level $\alpha$. Then after computing the value of the test statistic, the null hypothesis $H_0$ is rejected if the value falls in the rejection region and is otherwise not rejected. We now consider another way of reaching a conclusion in a hypothesis testing analysis. This alternative approach is based on calculation of a certain probability called a P-value. One advantage is that the P-value provides an intuitive measure of the strength of evidence in the data against $H_0$.


DEFINITION

The P-value is the probability, calculated assuming that the null hypothesis is true, of obtaining a value of the test statistic at least as contradictory to $H_0$ as the value calculated from the available sample.

This definition is quite a mouthful. Here are some key points:

• The P-value is a probability.

• This probability is calculated assuming that the null hypothesis is true.

• Beware: The P-value is not the probability that $H_0$ is true, nor is it an error probability!

• To determine the P-value, we must first decide which values of the test statistic are at least as contradictory to $H_0$ as the value obtained from our sample.

Example 8.14

Urban storm water can be contaminated by many sources, including discarded batteries. When ruptured, these batteries release metals of environmental significance. The article “Urban Battery Litter” (J. of Environ. Engr., 2009: 46–57) presented summary data for characteristics of a variety of batteries found in urban areas around Cleveland. A sample of 51 Panasonic AAA batteries gave a sample mean zinc mass of 2.06 g and a sample standard deviation of .141 g. Does this data provide compelling evidence for concluding that the population mean zinc mass exceeds 2.0 g?

With $\mu$ denoting the true average zinc mass for such batteries, the relevant hypotheses are $H_0: \mu = 2.0$ versus $H_a: \mu > 2.0$. The sample size is large enough so that a z test can be used without making any specific assumption about the shape of the population distribution. The test statistic value is

\[ z = \frac{\bar x - 2.0}{s/\sqrt{n}} = \frac{2.06 - 2.0}{.141/\sqrt{51}} = 3.04 \]

Now we must decide which values of z are at least as contradictory to $H_0$. Let’s first consider an easier task: Which values of $\bar x$ are at least as contradictory to the null hypothesis as 2.06, the mean of the observations in our sample? Because > appears in $H_a$, it should be clear that 2.10 is at least as contradictory to $H_0$ as is 2.06, and so in fact is any $\bar x$ value that exceeds 2.06. But an $\bar x$ value that exceeds 2.06 corresponds to a value of z that exceeds 3.04. Thus the P-value is

\[ \text{P-value} = P(Z \ge 3.04 \text{ when } \mu = 2.0) \]

Since the test statistic Z was created by subtracting the null value 2.0 in the numerator, when $\mu = 2.0$ (i.e., when $H_0$ is true) Z has approximately a standard normal distribution. As a consequence,

\[ \text{P-value} = P(Z \ge 3.04 \text{ when } \mu = 2.0) \approx \text{area under the z curve to the right of } 3.04 = 1 - \Phi(3.04) = .0012 \]
■

We will shortly illustrate how to determine the P-value for any z or t test, i.e., any test where the reference distribution is the standard normal distribution (and z curve) or some t distribution (and corresponding t curve). For the moment, though, let’s focus on reaching a conclusion once the P-value is available. Because it is a probability, the P-value must be between 0 and 1. What kinds of P-values provide evidence against the null hypothesis? Consider two specific instances:


• P-value = .250: In this case, fully 25% of all possible test statistic values are at least as contradictory to $H_0$ as the one that came out of our sample. So our data is not all that contradictory to the null hypothesis.

• P-value = .0018: Here, only .18% (much less than 1%) of all possible test statistic values are at least as contradictory to $H_0$ as what we obtained. Thus the sample appears to be highly contradictory to the null hypothesis.

More generally, the smaller the P-value, the more evidence there is in the sample data against the null hypothesis and for the alternative hypothesis. That is, $H_0$ should be rejected in favor of $H_a$ when the P-value is sufficiently small. So what constitutes “sufficiently small”?

Decision rule based on the P-value

Select a significance level $\alpha$ (as before, the desired type I error probability). Then

reject $H_0$ if P-value $\le \alpha$

do not reject $H_0$ if P-value $> \alpha$

Thus if the P-value exceeds the chosen significance level, the null hypothesis cannot be rejected at that level. But if the P-value is equal to or less than $\alpha$, then there is enough evidence to justify rejecting $H_0$. In Example 8.14, we calculated P-value = .0012. Then using a significance level of .01, we would reject the null hypothesis in favor of the alternative hypothesis because $.0012 \le .01$. However, suppose we select a significance level of only .001, which requires more substantial evidence from the data before $H_0$ can be rejected. In this case we would not reject $H_0$ because $.0012 > .001$.

How does the decision rule based on the P-value compare to the decision rule employed in the rejection region approach? The two procedures (the rejection region method and the P-value method) are in fact identical. Whatever the conclusion reached by employing the rejection region approach with a particular $\alpha$, the same conclusion will be reached via the P-value approach using that same $\alpha$.

Example 8.15

The nicotine content problem discussed in Example 8.5 involved testing $H_0: \mu = 1.5$ versus $H_a: \mu > 1.5$ using a z test (i.e., a test which utilizes the z curve as the reference distribution). The inequality in $H_a$ implies that the upper-tailed rejection region $z \ge z_\alpha$ is appropriate. Suppose $z = 2.10$. Then using exactly the same reasoning as in Example 8.14 gives P-value $= 1 - \Phi(2.10) = .0179$. Consider now testing with several different significance levels:

$\alpha = .10 \Rightarrow z_\alpha = z_{.10} = 1.28 \Rightarrow 2.10 \ge 1.28 \Rightarrow$ reject $H_0$

$\alpha = .05 \Rightarrow z_\alpha = z_{.05} = 1.645 \Rightarrow 2.10 \ge 1.645 \Rightarrow$ reject $H_0$

$\alpha = .01 \Rightarrow z_\alpha = z_{.01} = 2.33 \Rightarrow 2.10 < 2.33 \Rightarrow$ do not reject $H_0$

Because P-value $= .0179 \le .10$ and also $.0179 \le .05$, using the P-value approach results in rejection of $H_0$ for the first two significance levels. However, for $\alpha = .01$, 2.10 is not in the rejection region and .0179 is larger than .01. More generally, whenever $\alpha$ is smaller than the P-value .0179, the critical value $z_\alpha$ will be beyond the calculated value of z and $H_0$ cannot be rejected by either method. This is illustrated in Figure 8.7.


[Figure: three standard normal (z) curves. (a) Shaded area = .0179 to the right of the computed z = 2.10. (b) Shaded area = $\alpha$ to the right of $z_\alpha$, with $z_\alpha$ to the left of 2.10. (c) Shaded area = $\alpha$ to the right of $z_\alpha$, with $z_\alpha$ to the right of 2.10.]

Figure 8.7 Relationship between $\alpha$ and tail area captured by computed z: (a) tail area captured by computed z; (b) when $\alpha > .0179$, $z_\alpha < 2.10$ and $H_0$ is rejected; (c) when $\alpha < .0179$, $z_\alpha > 2.10$ and $H_0$ is not rejected ■
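The agreement between the two decision rules can be checked numerically. The short sketch below is an illustration only, assuming Python's standard-library `statistics.NormalDist`; the variable names are arbitrary. It applies both rules to a computed z of 2.10 from an upper-tailed test at the three significance levels considered above.

```python
from statistics import NormalDist

std_normal = NormalDist()

z = 2.10                           # computed test statistic, upper-tailed test
p_value = 1 - std_normal.cdf(z)    # area under the z curve to the right of z

decisions = []
for alpha in (0.10, 0.05, 0.01):
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value capturing upper-tail area alpha
    reject_by_region = z >= z_alpha           # rejection region rule
    reject_by_p_value = p_value <= alpha      # P-value rule
    decisions.append((alpha, reject_by_region, reject_by_p_value))
    print(alpha, round(z_alpha, 3), reject_by_region, reject_by_p_value)
```

At each level the two Boolean results coincide: both rules reject at .10 and .05 and neither rejects at .01.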

Let’s reconsider the P-value .0012 in Example 8.14 once again. $H_0$ can be rejected only if $.0012 \le \alpha$. Thus the null hypothesis can be rejected if $\alpha = .05$ or .01 or .005 or .0015 or .00125. What is the smallest significance level $\alpha$ here for which $H_0$ can be rejected? It is the P-value .0012.

PROPOSITION

The P-value is the smallest significance level $\alpha$ at which the null hypothesis can be rejected. Because of this, the P-value is alternatively referred to as the observed significance level (OSL) for the data.

It is customary to call the data significant when $H_0$ is rejected and not significant otherwise. The P-value is then the smallest level at which the data is significant. An easy way to visualize the comparison of the P-value with the chosen $\alpha$ is to draw a picture like that of Figure 8.8. The calculation of the P-value depends on whether the test is upper-, lower-, or two-tailed. However, once it has been calculated, the comparison with $\alpha$ does not depend on which type of test was used.

[Figure: number line from 0 to 1 with the P-value marked as the smallest level at which $H_0$ can be rejected; interval (b) lies between 0 and the P-value, interval (a) between the P-value and 1.]

Figure 8.8 Comparing $\alpha$ and the P-value: (a) reject $H_0$ when $\alpha$ lies here; (b) do not reject $H_0$ when $\alpha$ lies here

Example 8.16

The true average time to initial relief of pain for a best-selling pain reliever is known to be 10 min. Let $\mu$ denote the true average time to relief for a company’s newly developed reliever. Suppose that when data from an experiment involving the new pain reliever is analyzed, the P-value for testing $H_0: \mu = 10$ versus $H_a: \mu < 10$ is calculated as .0384. Since $\alpha = .05$ is larger than the P-value [.05 lies in the interval (a) of Figure 8.8], $H_0$ would be rejected by anyone carrying out the test at level .05. However, at level .01, $H_0$ would not be rejected because .01 is smaller than the smallest level (.0384) at which $H_0$ can be rejected. ■

The most widely used statistical computer packages automatically include a P-value when a hypothesis-testing analysis is performed. A conclusion can then be drawn directly from the output, without reference to a table of critical values. With the P-value in hand, an investigator can see at a quick glance for which significance levels $H_0$ would or would not be rejected. Also, each individual can then select his or her own significance level. In addition, knowing the P-value allows a decision maker to distinguish between a close call (e.g., $\alpha = .05$, P-value = .0498) and a very clear-cut conclusion (e.g., $\alpha = .05$, P-value = .0003), something that would not be possible just from the statement “$H_0$ can be rejected at significance level .05.”

P-Values for z Tests

The P-value for a z test (one based on a test statistic whose distribution when $H_0$ is true is at least approximately standard normal) is easily determined from the information in Appendix Table A.3. Consider an upper-tailed test and let z denote the computed value of the test statistic Z. The null hypothesis is rejected if $z \ge z_\alpha$, and the P-value is the smallest $\alpha$ for which this is the case. Since $z_\alpha$ increases as $\alpha$ decreases, the P-value is the value of $\alpha$ for which $z = z_\alpha$. That is, the P-value is just the area captured by the computed value z in the upper tail of the standard normal curve. The corresponding cumulative area is $\Phi(z)$, so in this case P-value $= 1 - \Phi(z)$.

An analogous argument for a lower-tailed test shows that the P-value is the area captured by the computed value z in the lower tail of the standard normal curve. More care must be exercised in the case of a two-tailed test. Suppose first that z is positive. Then the P-value is the value of $\alpha$ satisfying $z = z_{\alpha/2}$ (i.e., computed z = upper-tail critical value). This says that the area captured in the upper tail is half the P-value, so that P-value $= 2[1 - \Phi(z)]$. If z is negative, the P-value is the $\alpha$ for which $z = -z_{\alpha/2}$, or, equivalently, $-z = z_{\alpha/2}$, so P-value $= 2[1 - \Phi(-z)]$. Since $-z = |z|$ when z is negative, P-value $= 2[1 - \Phi(|z|)]$ for either positive or negative z.

\[ \text{P-value: } P = \begin{cases} 1 - \Phi(z) & \text{for an upper-tailed z test} \\ \Phi(z) & \text{for a lower-tailed z test} \\ 2[1 - \Phi(|z|)] & \text{for a two-tailed z test} \end{cases} \]

Each of these is the probability of getting a value at least as extreme as what was obtained (assuming $H_0$ true). The three cases are illustrated in Figure 8.9.
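The three cases can be collected into one small function. This is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for Appendix Table A.3; the function name `z_test_p_value` and its `tail` argument are hypothetical.

```python
from statistics import NormalDist


def z_test_p_value(z, tail):
    """P-value for a z test; tail is "upper", "lower", or "two"."""
    phi = NormalDist().cdf                 # standard normal cdf, Phi
    if tail == "upper":
        return 1 - phi(z)                  # area to the right of z
    if tail == "lower":
        return phi(z)                      # area to the left of z
    if tail == "two":
        return 2 * (1 - phi(abs(z)))       # sum of the two tail areas
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# Upper-tailed test with computed z = 3.04, as in Example 8.14
p_upper = z_test_p_value(3.04, "upper")
# A two-tailed test gives the same P-value for z and -z
p_two = z_test_p_value(-2.32, "two")
print(round(p_upper, 4), round(p_two, 4))
```

For $z = 3.04$ in an upper-tailed test the result rounds to .0012, matching Example 8.14. The two-tailed case depends only on $|z|$, so the sign of the computed statistic does not matter.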

The next example illustrates the use of the P-value approach to hypothesis testing by means of a sequence of steps modified from our previously recommended sequence.


Example 8.17 The target thickness for silicon wafers used in a certain type of integrated circuit is 245 $\mu$m. A sample of 50 wafers is obtained and the thickness of each one is determined, resulting in a sample mean thickness of 246.18 $\mu$m and a sample standard deviation of 3.60 $\mu$m. Does this data suggest that true average wafer thickness is something other than the target value?

1. Parameter of interest: $\mu$ = true average wafer thickness

2. Null hypothesis: $H_0\colon \mu = 245$

3. Alternative hypothesis: $H_a\colon \mu \neq 245$

4. Formula for test statistic value: $z = \dfrac{\bar{x} - 245}{s/\sqrt{n}}$

5. Calculation of test statistic value: $z = \dfrac{246.18 - 245}{3.60/\sqrt{50}} = 2.32$

6. Determination of P-value: Because the test is two-tailed, $\text{P-value} = 2(1 - \Phi(2.32)) = .0204$

7. Conclusion: Using a significance level of .01, $H_0$ would not be rejected since $.0204 > .01$. At this significance level, there is insufficient evidence to conclude that true average thickness differs from the target value. ■

P-Values for t Tests

Just as the P-value for a z test is a z curve area, the P-value for a t test will be a t-curve area. Figure 8.10 illustrates the three different cases. The number of df for the one-sample t test is $n - 1$.

Figure 8.9 Determination of the P-value for a z test
1. Upper-tailed test ($H_a$ contains the inequality >): P-value = area in upper tail of the z curve = $1 - \Phi(z)$
2. Lower-tailed test ($H_a$ contains the inequality <): P-value = area in lower tail of the z curve = $\Phi(z)$
3. Two-tailed test ($H_a$ contains the inequality ≠): P-value = sum of area in two tails of the z curve = $2[1 - \Phi(|z|)]$


The table of t critical values used previously for confidence and prediction intervals doesn’t contain enough information about any particular t distribution to allow for accurate determination of desired areas. So we have included another t table in Appendix Table A.8, one that contains a tabulation of upper-tail t-curve areas. Each different column of the table is for a different number of df, and the rows are for calculated values of the test statistic t ranging from 0.0 to 4.0 in increments of .1. For example, the number .074 appears at the intersection of the 1.6 row and the 8 df column, so the area under the 8 df curve to the right of 1.6 (an upper-tail area) is .074. Because t curves are symmetric, .074 is also the area under the 8 df curve to the left of $-1.6$ (a lower-tail area).

Suppose, for example, that a test of $H_0\colon \mu = 100$ versus $H_a\colon \mu > 100$ is based on the 8 df t distribution. If the calculated value of the test statistic is $t = 1.6$, then the P-value for this upper-tailed test is .074. Because .074 exceeds .05, we would not be able to reject $H_0$ at a significance level of .05. If the alternative hypothesis is $H_a\colon \mu < 100$ and a test based on 20 df yields $t = -3.2$, then Appendix Table A.8 shows that the P-value is the captured lower-tail area .002. The null hypothesis can be rejected at either level .05 or .01. Consider testing $H_0\colon \mu_1 - \mu_2 = 0$ versus $H_a\colon \mu_1 - \mu_2 \neq 0$; the null hypothesis states that the means of the two populations are identical, whereas the alternative hypothesis states that they are different without specifying a direction of departure from $H_0$. If a t test is based on 20 df and $t = 3.2$, then the P-value for this two-tailed test is $2(.002) = .004$. This would also be the P-value for $t = -3.2$. The tail area is doubled because values both larger than 3.2 and smaller than $-3.2$ are more contradictory to $H_0$ than what was calculated (values farther out in either tail of the t curve).
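The t-curve areas quoted from Appendix Table A.8 can be approximated without the table. The sketch below is an illustration under stated assumptions, not the method used to build the table: the function names are hypothetical, and the area is obtained by integrating the t density numerically with composite Simpson's rule (2000 subintervals, an arbitrary choice).

```python
import math


def t_density(u, df):
    """Density of the t distribution with df degrees of freedom, evaluated at u."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(u * u / df))


def t_upper_tail(t, df, steps=2000):
    """Area under the t curve to the right of t (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    area_0_to_a = total * h / 3          # Simpson estimate of the area between 0 and |t|
    upper = 0.5 - area_0_to_a            # area to the right of |t|
    return upper if t >= 0 else 1.0 - upper


def t_test_p_value(t, df, tail):
    """P-value for a t test; `tail` is 'upper', 'lower', or 'two' (hypothetical labels)."""
    if tail == "upper":
        return t_upper_tail(t, df)
    if tail == "lower":
        return t_upper_tail(-t, df)      # symmetry of the t curve
    if tail == "two":
        return 2.0 * t_upper_tail(abs(t), df)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


if __name__ == "__main__":
    print(round(t_test_p_value(1.6, 8, "upper"), 3))
    print(round(t_test_p_value(-3.2, 20, "lower"), 3))
```

Because the table rounds t to the tenths digit and areas to three decimals, values computed this way should agree with the table entries only to about that accuracy.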

Figure 8.10 P-values for t tests
1. Upper-tailed test ($H_a$ contains the inequality >): P-value = area in upper tail of the t curve for relevant df
2. Lower-tailed test ($H_a$ contains the inequality <): P-value = area in lower tail of the t curve for relevant df
3. Two-tailed test ($H_a$ contains the inequality ≠): P-value = sum of area in two tails of the t curve for relevant df


Example 8.18 In Example 8.9 we considered a test of $H_0\colon \mu = 4$ versus $H_a\colon \mu \neq 4$ based on a sample of $n = 5$ observations from a normal population distribution. The test statistic value was $-.594 \approx -.6$. Looking to the 4 ($= 5 - 1$) df column of Appendix Table A.8 and then down to the .6 row, the entry is .290. Because the test is two-tailed, this upper-tail area must be doubled to obtain the P-value. The result is $\text{P-value} \approx .580$. This P-value is clearly larger than any reasonable significance level $\alpha$ (.01, .05, and even .10), so there is no reason to reject the null hypothesis. The Minitab output included in Example 8.9 has $\text{P-value} = .594$. P-values from software packages will be more accurate than what results from Appendix Table A.8 since values of t in our table are accurate only to the tenths digit. ■

More on Interpreting P-values

The P-value resulting from carrying out a test on a selected sample is not the probability that $H_0$ is true, nor is it the probability of rejecting the null hypothesis. Once again, it is the probability, calculated assuming that $H_0$ is true, of obtaining a test statistic value at least as contradictory to the null hypothesis as the value that actually resulted. For example, consider testing $H_0\colon \mu = 50$ against $H_a\colon \mu < 50$ using a lower-tailed z test. If the calculated value of the test statistic is $z = -2.00$, then

$\text{P-value} = P(Z < -2.00 \text{ when } \mu = 50) = \text{area under the z curve to the left of } -2.00 = .0228$

But if a second sample is selected, the resulting value of z will almost surely be different from $-2.00$, so the corresponding P-value will also likely differ from .0228. Because the test statistic value itself varies from one sample to another, the P-value will also vary from one sample to another. That is, the test statistic is a random variable, and so the P-value will also be a random variable. A first sample may give a P-value of .0228, a second sample may result in a P-value of .1175, a third may yield .0606 as the P-value, and so on.

If $H_0$ is false, we hope the P-value will be close to 0 so that the null hypothesis can be rejected. On the other hand, when $H_0$ is true, we’d like the P-value to exceed the selected significance level so that the correct decision to not reject $H_0$ is made. The next example presents simulations to show how the P-value behaves both when the null hypothesis is true and when it is false.

Example 8.19 The fuel efficiency (mpg) of any particular new vehicle under specified driving conditions may not be identical to the EPA figure that appears on the vehicle’s sticker. Suppose that four different vehicles of a particular type are to be selected and driven over a certain course, after which the fuel efficiency of each one is to be determined. Let $\mu$ denote the true average fuel efficiency under these conditions. Consider testing $H_0\colon \mu = 20$ versus $H_a\colon \mu > 20$ using the one-sample t test based on the resulting sample. Since the test is based on $n - 1 = 3$ degrees of freedom, the P-value for an upper-tailed test is the area under the t curve with 3 df to the right of the calculated t.

Let’s first suppose that the null hypothesis is true. We asked Minitab to generate 10,000 different samples, each containing 4 observations, from a normal population distribution with mean value $\mu = 20$ and standard deviation $\sigma = 2$. The first sample and resulting summary quantities were


$x_1 = 20.830,\quad x_2 = 22.232,\quad x_3 = 20.276,\quad x_4 = 17.718$

$\bar{x} = 20.264 \qquad s = 1.8864 \qquad t = \dfrac{20.264 - 20}{1.8864/\sqrt{4}} = .2799$

The P-value is the area under the 3-df t curve to the right of .2799, which according to Minitab is .3989. Using a significance level of .05, the null hypothesis would of course not be rejected. The values of t for the next four samples were $-1.7591$, $.6082$, $-.7020$, and $3.1053$, with corresponding P-values .912, .293, .733, and .0265.

Figure 8.11(a) shows a histogram of the 10,000 P-values from this simulation experiment. About 4.5% of these P-values are in the first class interval from 0 to .05. Thus when using a significance level of .05, the null hypothesis is rejected in roughly 4.5% of these 10,000 tests. If we continued to generate samples and carry out the test for each sample at significance level .05, in the long run 5% of the P-values would be in the first class interval. This is because when $H_0$ is true and a test with significance level .05 is used, by definition the probability of rejecting $H_0$ is .05.

Looking at the histogram, it appears that the distribution of P-values is relatively flat. In fact, it can be shown that when $H_0$ is true, the probability distribution of the P-value is a uniform distribution on the interval from 0 to 1. That is, the density curve is completely flat on this interval, and thus must have a height of 1 if the total area under the curve is to be 1. Since the area under such a curve to the left of .05 is $(.05)(1) = .05$, we again have that the probability of rejecting $H_0$ when it is true is .05, the chosen significance level.

Now consider what happens when $H_0$ is false because $\mu = 21$. We again had Minitab generate 10,000 different samples of size 4 (each from a normal distribution with $\mu = 21$ and $\sigma = 2$), calculate $t = (\bar{x} - 20)/(s/\sqrt{4})$ for each one, and then determine the P-value. The first such sample resulted in $\bar{x} = 20.6411$, $s = .49637$, $t = 2.5832$, $\text{P-value} = .0408$. Figure 8.11(b) gives a histogram of the resulting P-values. The shape of this histogram is quite different from that of Figure 8.11(a); there is a much greater tendency for the P-value to be small (closer to 0) when $\mu = 21$ than when $\mu = 20$. Again $H_0$ is rejected at significance level .05 whenever the P-value is at most .05 (in the first class interval). Unfortunately, this is the case for only about 19% of the P-values. So only about 19% of the 10,000 tests correctly reject the null hypothesis; for the other 81%, a type II error is committed. The difficulty is that the sample size is quite small and 21 is not very different from the value asserted by the null hypothesis.

Figure 8.11(c) illustrates what happens to the P-value when $H_0$ is false because $\mu = 22$ (still with $n = 4$ and $\sigma = 2$). The histogram is even more concentrated toward values close to 0 than was the case when $\mu = 21$. In general, as $\mu$ moves further to the right of the null value 20, the distribution of the P-value will become more and more concentrated on values close to 0. Even here a bit fewer than 50% of the P-values are smaller than .05. So it is still slightly more likely than not that the null hypothesis is incorrectly not rejected. Only for values of $\mu$ much larger than 20 (e.g., at least 24 or 25) is it highly likely that the P-value will be smaller than .05 and thus give the correct conclusion.

The big idea of this example is that because the value of any test statistic is random, the P-value will also be a random variable and thus have a distribution. The further the actual value of the parameter is from the value specified by the null hypothesis, the more the distribution of the P-value will be concentrated on values close to 0 and the greater the chance that the test will correctly reject $H_0$ (corresponding to smaller $\beta$).
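A simulation in the spirit of Example 8.19 can be sketched in Python. This is not the Minitab run described in the example: the random number generator, the seed, and the function names are all assumptions, so the simulated percentages will differ somewhat from those quoted. The upper-tail area uses a closed-form expression for the t cdf that holds only for 3 df.

```python
import math
import random
import statistics


def upper_tail_t3(t):
    """Area to the right of t under the t curve with 3 df (closed form, 3 df only)."""
    u = t / math.sqrt(3.0)
    cdf = 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi
    return 1.0 - cdf


def simulate_p_values(mu, sigma=2.0, mu0=20.0, reps=10_000, seed=1):
    """P-values of the upper-tailed one-sample t test for `reps` normal samples of size 4."""
    rng = random.Random(seed)            # seed chosen arbitrarily for reproducibility
    n = 4                                # fixed so that df = 3 matches upper_tail_t3
    p_values = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s = statistics.stdev(sample)     # sample standard deviation (divisor n - 1)
        t = (xbar - mu0) / (s / math.sqrt(n))
        p_values.append(upper_tail_t3(t))
    return p_values


def rejection_rate(p_values, alpha=0.05):
    """Fraction of P-values that are at most alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)


if __name__ == "__main__":
    for mu in (20.0, 21.0, 22.0):
        rate = rejection_rate(simulate_p_values(mu))
        print(f"mu = {mu}: fraction of P-values <= .05 is {rate:.3f}")
```

Sorting the returned P-values into class intervals of width .05 would give histograms comparable to the three panels of Figure 8.11.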


Figure 8.11 P-value simulation results for Example 8.19: histograms of the simulated P-values (horizontal axis: P-value; vertical axis: percent) for (a) $\mu = 20$, (b) $\mu = 21$, (c) $\mu = 22$

EXERCISES Section 8.4 (47–62)

47. For which of the given P-values would the null hypothesis be rejected when performing a level .05 test?
a. .001 b. .021 c. .078 d. .047 e. .148

48. Pairs of P-values and significance levels, $\alpha$, are given. For each pair, state whether the observed P-value would lead to rejection of $H_0$ at the given significance level.
a. P-value = .084, $\alpha = .05$
b. P-value = .003, $\alpha = .001$
c. P-value = .498, $\alpha = .05$
d. P-value = .084, $\alpha = .10$
e. P-value = .039, $\alpha = .01$
f. P-value = .218, $\alpha = .10$

49. Let $\mu$ denote the mean reaction time to a certain stimulus. For a large-sample z test of $H_0\colon \mu = 5$ versus $H_a\colon \mu > 5$, find the P-value associated with each of the given values of the z test statistic.
a. 1.42 b. .90 c. 1.96 d. 2.48 e. $-.11$

50. Newly purchased tires of a certain type are supposed to be filled to a pressure of 30 lb/in$^2$. Let $\mu$ denote the true average pressure. Find the P-value associated with each given z statistic value for testing $H_0\colon \mu = 30$ versus $H_a\colon \mu \neq 30$.
a. 2.10 b. $-1.75$ c. $-.55$ d. 1.41 e. $-5.3$

51. Give as much information as you can about the P-value of a t test in each of the following situations:
a. Upper-tailed test, df = 8, $t = 2.0$
b. Lower-tailed test, df = 11, $t = -2.4$
c. Two-tailed test, df = 15, $t = -1.6$
d. Upper-tailed test, df = 19, $t = -.4$
e. Upper-tailed test, df = 5, $t = 5.0$
f. Two-tailed test, df = 40, $t = -4.8$

52. The paint used to make lines on roads must reflect enough light to be clearly visible at night. Let $\mu$ denote the true average reflectometer reading for a new type of paint under consideration. A test of $H_0\colon \mu = 20$ versus $H_a\colon \mu > 20$ will be based on a random sample of size n from a normal population distribution. What conclusion is appropriate in each of the following situations?
a. $n = 15$, $t = 3.2$, $\alpha = .05$
b. $n = 9$, $t = 1.8$, $\alpha = .01$
c. $n = 24$, $t = -.2$

53. Let $\mu$ denote true average serum receptor concentration for all pregnant women. The average for all women is known to be 5.63. The article “Serum Transferrin Receptor for the Detection of Iron Deficiency in Pregnancy” (Amer. J. of Clinical Nutr., 1991: 1077–1081) reports that P-value > .10 for a test of $H_0\colon \mu = 5.63$ versus $H_a\colon \mu \neq 5.63$ based on $n = 176$ pregnant women. Using a significance level of .01, what would you conclude?

54. The article “Analysis of Reserve and Regular Bottlings: Why Pay for a Difference Only the Critics Claim to Notice?” (Chance, Summer 2005, pp. 9–15) reported on an experiment to investigate whether wine tasters could distinguish between more expensive reserve wines and their regular counterparts. Wine was presented to tasters in four containers labeled A, B, C, and D, with two of these containing the reserve wine and the other two the regular wine. Each taster randomly selected three of the containers, tasted the selected wines, and indicated which of the three he/she believed was different from the other two. Of the $n = 855$ tasting trials, 346 resulted in correct distinctions (either the one reserve that differed from the two regular wines or the one regular wine that differed from the two reserves). Does this provide compelling evidence for concluding that tasters of this type have some ability to distinguish between reserve and regular wines? State and test the relevant hypotheses using the P-value approach. Are you particularly impressed with the ability of tasters to distinguish between the two types of wine?

55. An aspirin manufacturer fills bottles by weight rather than by count. Since each bottle should contain 100 tablets, the average weight per tablet should be 5 grains. Each of 100 tablets taken from a very large lot is weighed, resulting in a sample average weight per tablet of 4.87 grains and a sample standard deviation of .35 grain. Does this information provide strong evidence for concluding that the company is not filling its bottles as advertised? Test the appropriate hypotheses using $\alpha = .01$ by first computing the P-value and then comparing it to the specified significance level.

56. Because of variability in the manufacturing process, the actual yielding point of a sample of mild steel subjected to increasing stress will usually differ from the theoretical yielding point. Let p denote the true proportion of samples that yield before their theoretical yielding point. If on the basis of a sample it can be concluded that more than 20% of all specimens yield before the theoretical point, the production process will have to be modified.
a. If 15 of 60 specimens yield before the theoretical point, what is the P-value when the appropriate test is used, and what would you advise the company to do?
b. If the true percentage of “early yields” is actually 50% (so that the theoretical point is the median of the yield distribution) and a level .01 test is used, what is the probability that the company concludes a modification of the process is necessary?

57. The article “Heavy Drinking and Polydrug Use Among College Students” (J. of Drug Issues, 2008: 445–466) stated that 51 of the 462 college students in a sample had a lifetime abstinence from alcohol. Does this provide strong evidence for concluding that more than 10% of the population sampled had completely abstained from alcohol use? Test the appropriate hypotheses using the P-value method. [Note: The article used more advanced statistical methods to study the use of various drugs among students characterized as light, moderate, and heavy drinkers.]

58. A random sample of soil specimens was obtained, and the amount of organic matter (%) in the soil was determined for each specimen, resulting in the accompanying data (from “Engineering Properties of Soil,” Soil Science, 1998: 93–102).

1.10 5.09 0.97 1.59 4.60 0.32 0.55 1.45 0.14 4.47
1.20 3.50 5.02 4.67 5.22 2.69 3.98 3.17 3.03 2.21
0.69 4.47 3.31 1.17 0.76 1.17 1.57 2.62 1.66 2.05

The values of the sample mean, sample standard deviation, and (estimated) standard error of the mean are 2.481, 1.616, and .295, respectively. Does this data suggest that the true average percentage of organic matter in such soil is something other than 3%? Carry out a test of the appropriate hypotheses at significance level .10 by first determining the P-value. Would your conclusion be different if $\alpha = .05$ had been used? [Note: A normal probability plot of the data shows an acceptable pattern in light of the reasonably large sample size.]

59. The accompanying data on cube compressive strength (MPa) of concrete specimens appeared in the article “Experimental Study of Recycled Rubber-Filled High-Strength Concrete” (Magazine of Concrete Res., 2009: 549–556):

112.3 97.0 92.7 86.0 102.0 99.2 95.8 103.5 89.0 86.7

a. Is it plausible that the compressive strength for this type of concrete is normally distributed?
b. Suppose the concrete will be used for a particular application unless there is strong evidence that true average strength is less than 100 MPa. Should the concrete be used? Carry out a test of appropriate hypotheses using the P-value method.

60. A certain pen has been designed so that true average writing lifetime under controlled conditions (involving the use of a writing machine) is at least 10 hours. A random sample of 18 pens is selected, the writing lifetime of each is determined, and a normal probability plot of the resulting data supports the use of a one-sample t test.


a. What hypotheses should be tested if the investigators believe a priori that the design specification has been satisfied?

b. What conclusion is appropriate if the hypotheses of part (a) are tested, $t = -2.3$, and $\alpha = .05$?

c. What conclusion is appropriate if the hypotheses of part (a) are tested, $t = -1.8$, and $\alpha = .01$?

d. What should be concluded if the hypotheses of part (a) are tested and $t = -3.6$?

61. A spectrophotometer used for measuring CO concentration [ppm (parts per million) by volume] is checked for accuracy by taking readings on a manufactured gas (called span gas) in which the CO concentration is very precisely controlled at 70 ppm. If the readings suggest that the spectrophotometer is not working properly, it will have to be recalibrated. Assume that if it is properly calibrated, measured concentration for span gas samples is normally distributed. On the basis of the six readings (85, 77, 82, 68, 72, and 69), is recalibration necessary? Carry out a test of the relevant hypotheses using the P-value approach with $\alpha = .05$.

62. The relative conductivity of a semiconductor device is determined by the amount of impurity “doped” into the device during its manufacture. A silicon diode to be used for a specific purpose requires an average cut-on voltage of .60 V, and if this is not achieved, the amount of impurity must be adjusted. A sample of diodes was selected and the cut-on voltage was determined. The accompanying SAS output resulted from a request to test the appropriate hypotheses.

N     Mean        Std Dev     T           Prob>|T|
15    0.0453333   0.0899100   1.9527887   0.0711

[Note: SAS explicitly tests $H_0\colon \mu = 0$, so to test $H_0\colon \mu = .60$, the null value .60 must be subtracted from each $x_i$; the reported mean is then the average of the $(x_i - .60)$ values. Also, SAS’s P-value is always for a two-tailed test.] What would be concluded for a significance level of .01? .05? .10?

8.5 Some Comments on Selecting a Test

Once the experimenter has decided on the question of interest and the method for gathering data (the design of the experiment), construction of an appropriate test consists of three distinct steps:

1. Specify a test statistic (the function of the observed values that will serve as the decision maker).

2. Decide on the general form of the rejection region (typically reject $H_0$ for suitably large values of the test statistic, reject for suitably small values, or reject for either small or large values).

3. Select the specific numerical critical value or values that will separate the rejection region from the acceptance region (by obtaining the distribution of the test statistic when $H_0$ is true, and then selecting a level of significance).

In the examples thus far, both Steps 1 and 2 were carried out in an ad hoc manner through intuition. For example, when the underlying population was assumed normal with mean $\mu$ and known $\sigma$, we were led from $\bar{X}$ to the standardized test statistic

$Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

For testing $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu > \mu_0$, intuition then suggested rejecting $H_0$ when z was large. Finally, the critical value was determined by specifying the level of significance $\alpha$ and using the fact that Z has a standard normal distribution when $H_0$ is true. The reliability of the test in reaching a correct decision can be assessed by studying type II error probabilities.

Issues to be considered in carrying out Steps 1–3 encompass the following questions:

1. What are the practical implications and consequences of choosing a particular level of significance once the other aspects of a test have been determined?

2. Does there exist a general principle, not dependent just on intuition, that can be used to obtain best or good test procedures?


3. When two or more tests are appropriate in a given situation, how can the tests be compared to decide which should be used?

4. If a test is derived under specific assumptions about the distribution or population being sampled, how will the test perform when the assumptions are violated?
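The three construction steps listed at the start of this section can be sketched in Python for the known-$\sigma$, upper-tailed z test. This is only an illustration: the function name and the numbers in the usage lines are hypothetical, and the critical value comes from the standard library's `statistics.NormalDist` rather than from a normal table.

```python
import math
from statistics import NormalDist


def upper_tailed_z_test(xbar, mu0, sigma, n, alpha):
    """Return (z, critical value, reject?) for H0: mu = mu0 versus Ha: mu > mu0."""
    # Step 1: the test statistic is the standardized sample mean.
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Step 2: the rejection region has the form z >= c (reject for large z).
    # Step 3: c is the upper-tail critical value z_alpha of the standard normal curve.
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return z, z_crit, z >= z_crit


if __name__ == "__main__":
    # Hypothetical inputs chosen only to exercise the function.
    print(upper_tailed_z_test(xbar=101.0, mu0=100.0, sigma=10.0, n=400, alpha=0.01))
    print(upper_tailed_z_test(xbar=101.0, mu0=100.0, sigma=10.0, n=900, alpha=0.01))
```

With these inputs the same observed mean leads to different decisions at the two sample sizes, since z grows with $\sqrt{n}$ while the critical value stays fixed.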

Statistical Versus Practical Significance

Although the process of reaching a decision by using the methodology of classical hypothesis testing involves selecting a level of significance and then rejecting or not rejecting $H_0$ at that level $\alpha$, simply reporting the $\alpha$ used and the decision reached conveys little of the information contained in the sample data. Especially when the results of an experiment are to be communicated to a large audience, rejection of $H_0$ at level .05 will be much more convincing if the observed value of the test statistic greatly exceeds the 5% critical value than if it barely exceeds that value. This is precisely what led to the notion of P-value as a way of reporting significance without imposing a particular $\alpha$ on others who might wish to draw their own conclusions.

Even if a P-value is included in a summary of results, however, there may be difficulty in interpreting this value and in making a decision. This is because a small P-value, which would ordinarily indicate statistical significance in that it would strongly suggest rejection of $H_0$ in favor of $H_a$, may be the result of a large sample size in combination with a departure from $H_0$ that has little practical significance. In many experimental situations, only departures from $H_0$ of large magnitude would be worthy of detection, whereas a small departure from $H_0$ would have little practical significance.

Consider as an example testing $H_0\colon \mu = 100$ versus $H_a\colon \mu > 100$, where $\mu$ is the mean of a normal population with $\sigma = 10$. Suppose a true value of $\mu = 101$ would not represent a serious departure from $H_0$ in the sense that not rejecting $H_0$ when $\mu = 101$ would be a relatively inexpensive error. For a reasonably large sample size n, this $\mu$ would lead to an $\bar{x}$ value near 101, so we would not want this sample evidence to argue strongly for rejection of $H_0$ when $\bar{x} = 101$ is observed. For various sample sizes, Table 8.1 records both the P-value when $\bar{x} = 101$ and also the probability of not rejecting $H_0$ at level .01 when $\mu = 101$.

The second column in Table 8.1 shows that even for moderately large sample sizes, the P-value of $\bar{x} = 101$ argues very strongly for rejection of $H_0$, whereas the observed $\bar{x}$ itself suggests that in practical terms the true value of $\mu$ differs little from the null value $\mu_0 = 100$. The third column points out that even when there is little practical difference between the true $\mu$ and the null value, for a fixed level of significance a large sample size will almost always lead to rejection of the null hypothesis at that level. To summarize, one must be especially careful in interpreting evidence when the sample size is large, since any small departure from $H_0$ will almost surely be detected by a test, yet such a departure may have little practical significance.

Table 8.1 An Illustration of the Effect of Sample Size on P-values and $\beta$

n         P-Value When $\bar{x} = 101$    $\beta(101)$ for Level .01 Test
25        .3085                           .9664
100       .1587                           .9082
400       .0228                           .6293
900       .0013                           .2514
1600      .0000335                        .0475
2500      .000000297                      .0038
10,000    $7.69 \times 10^{-24}$          .0000
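The entries of Table 8.1 can be approximated with a few lines of Python. The sketch below is not the computation used for the printed table: the function names are hypothetical, the critical value $z_{.01}$ is taken as the table-rounded 2.33 (an assumption), and $\beta$ uses the standard formula for an upper-tailed z test, $\beta(\mu') = \Phi\left(z_\alpha + (\mu_0 - \mu')/(\sigma/\sqrt{n})\right)$.

```python
import math

MU0, SIGMA = 100.0, 10.0   # null value and known sigma from the example
Z_01 = 2.33                # z_.01 rounded as in a normal table (assumption)


def upper_tail(z):
    """1 - Phi(z), computed with erfc so that far-tail areas do not round to zero."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def phi(z):
    """Standard normal cdf Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def p_value_when_xbar_is(xbar, n):
    """Upper-tailed P-value when the observed sample mean is xbar."""
    z = (xbar - MU0) / (SIGMA / math.sqrt(n))
    return upper_tail(z)


def beta(mu_alt, n):
    """Probability of not rejecting H0 at level .01 when the true mean is mu_alt."""
    return phi(Z_01 + (MU0 - mu_alt) / (SIGMA / math.sqrt(n)))


if __name__ == "__main__":
    for n in (25, 100, 400, 900, 1600, 2500, 10_000):
        print(f"{n:>6}  {p_value_when_xbar_is(101.0, n):.3g}  {beta(101.0, n):.4f}")
```

The first rows should match the table to four decimals; the extreme-tail P-values for the largest sample sizes may not agree with the printed entries in every digit.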


The Likelihood Ratio Principle Let be the observations in a random sample of size n from a probabil- ity distribution . The joint distribution evaluated at these sample values is the product . As in the discussion of maximum likeli- hood estimation, the likelihood function is this joint distribution, regarded as a function of u. Consider testing is in �0 versus is in �a, where �0 and �a are disjoint (for example, versus ). The likelihood ratio principle for test construction proceeds as follows:

1. Find the largest value of the likelihood for any u in �0 (by finding the maximum likelihood estimate within �0 and substituting back into the likelihood function).

2. Find the largest value of the likelihood for any u in �a.

3. Form the ratio

$$\lambda(x_1, \dots, x_n) = \frac{\text{maximum likelihood for } \theta \text{ in } \Omega_0}{\text{maximum likelihood for } \theta \text{ in } \Omega_a}$$

The ratio $\lambda(x_1, \dots, x_n)$ is called the likelihood ratio statistic value. The test procedure consists of rejecting $H_0$ when this ratio is small. That is, a constant $k$ is chosen, and $H_0$ is rejected if $\lambda(x_1, \dots, x_n) \le k$. Thus $H_0$ is rejected when the denominator of $\lambda$ greatly exceeds the numerator, indicating that the data is much more consistent with $H_a$ than with $H_0$.

The constant $k$ is selected to yield the desired type I error probability. Often the inequality $\lambda \le k$ can be manipulated to yield a simpler equivalent condition. For example, for testing $H_0\colon \mu \le \mu_0$ versus $H_a\colon \mu > \mu_0$ in the case of normality, $\lambda \le k$ is equivalent to $t \ge c$. Thus, with $c = t_{\alpha, n-1}$, the likelihood ratio test is the one-sample t test.

The likelihood ratio principle can also be applied when the $X_i$'s have different distributions and even when they are dependent, though the likelihood function can be complicated in such cases. Many of the test procedures to be presented in subsequent chapters are obtained from the likelihood ratio principle. These tests often turn out to minimize $\beta$ among all tests that have the desired $\alpha$, so are truly best tests. For more details and some worked examples, refer to one of the references listed in the Chapter 6 bibliography.

A practical limitation on the use of the likelihood ratio principle is that, to construct the likelihood ratio test statistic, the form of the probability distribution from which the sample comes must be specified. To derive the t test from the likelihood ratio principle, the investigator must assume a normal pdf. If an investigator is willing to assume that the distribution is symmetric but does not want to be specific about its exact form (such as normal, uniform, or Cauchy), then the principle fails because there is no way to write a joint pdf simultaneously valid for all symmetric distributions. In Chapter 15, we will present several distribution-free test procedures, so called because the probability of a type I error is controlled simultaneously for many different underlying distributions. These procedures are useful when the investigator has limited knowledge of the underlying distribution. We shall also say more about issues 3 and 4 listed at the outset of this section.
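The equivalence between $\lambda \le k$ and $t \ge c$ in the normal case can be illustrated numerically. The sketch below is not from the text: the sample values and $\mu_0 = 10$ are hypothetical, and the closed form $\lambda = (1 + t^2/(n-1))^{-n/2}$ for $\bar{x} > \mu_0$ is a standard result that is assumed here rather than derived. Because that expression decreases as $t$ increases, a small $\lambda$ corresponds to a large $t$.

```python
import numpy as np

def normal_max_loglik(x, mu):
    """Normal log-likelihood maximized over sigma, with the mean fixed at mu."""
    n = len(x)
    sigma2_hat = np.mean((x - mu) ** 2)  # MLE of the variance for this mean
    return -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1)

def likelihood_ratio(x, mu0):
    """lambda for H0: mu <= mu0 versus Ha: mu > mu0 (normal data, sigma unknown)."""
    xbar = np.mean(x)
    # Best mean in Omega_0 is xbar if xbar <= mu0, otherwise the boundary mu0.
    num = normal_max_loglik(x, min(xbar, mu0))
    # Best mean in Omega_a is xbar if xbar > mu0; otherwise the supremum
    # is approached at the boundary mu0.
    den = normal_max_loglik(x, max(xbar, mu0))
    return np.exp(num - den)

def t_statistic(x, mu0):
    """One-sample t statistic."""
    n = len(x)
    return (np.mean(x) - mu0) / (np.std(x, ddof=1) / np.sqrt(n))

# Hypothetical data, chosen only for illustration.
x = np.array([10.2, 9.8, 11.1, 10.6, 10.9, 10.4])
mu0 = 10.0
n = len(x)

lam = likelihood_ratio(x, mu0)
t = t_statistic(x, mu0)
closed_form = (1 + t**2 / (n - 1)) ** (-n / 2)

print(lam, closed_form, t)
```

The two printed values of $\lambda$ agree up to floating-point error, so rejecting for small $\lambda$ is the same as rejecting for large $t$ whenever $\bar{x} > \mu_0$.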

Ha: u . 100H0: u # 100 Ha: uH0: u

f (x1; u) # f (x2; u) # c # f (xn; u) f (x; u)

x1, x2, c, xn

EXERCISES Section 8.5 (63–64)

63. Reconsider the paint-drying problem discussed in Example 8.2. The hypotheses were $H_0\colon \mu = 75$ versus $H_a\colon \mu < 75$, with $\sigma$ assumed to have value 9.0. Consider the alternative value $\mu = 74$, which in the context of the problem would presumably not be a practically significant departure from $H_0$.


SUPPLEMENTARY EXERCISES (65–87)

65. A sample of 50 lenses used in eyeglasses yields a sample mean thickness of 3.05 mm and a sample standard deviation of .34 mm. The desired true average thickness of such lenses is 3.20 mm. Does the data strongly suggest that the true average thickness of such lenses is something other than what is desired? Test using $\alpha = .05$.

66. In Exercise 65, suppose the experimenter had believed before collecting the data that the value of $\sigma$ was approximately .30. If the experimenter wished the probability of a type II error to be .05 when $\mu = 3.00$, was a sample size 50 unnecessarily large?

67. It is specified that a certain type of iron should contain .85 g of silicon per 100 g of iron (.85%). The silicon content of each of 25 randomly selected iron specimens was determined, and the accompanying Minitab output resulted from a test of the appropriate hypotheses.

Variable    N     Mean    StDev   SE Mean     T      P
sil cont   25   0.8880   0.1807    0.0361  1.05   0.30

a. What hypotheses were tested?
b. What conclusion would be reached for a significance level of .05, and why? Answer the same question for a significance level of .10.

68. One method for straightening wire before coiling it to make a spring is called “roller straightening.” The article “The Effect of Roller and Spinner Wire Straightening on Coiling Performance and Wire Properties” (Springs, 1987: 27–28) reports on the tensile properties of wire. Suppose a sample of 16 wires is selected and each is tested to determine tensile strength (N/mm²). The resulting sample mean and standard deviation are 2160 and 30, respectively.
a. The mean tensile strength for springs made using spinner straightening is 2150 N/mm². What hypotheses should be tested to determine whether the mean tensile strength for the roller method exceeds 2150?
b. Assuming that the tensile strength distribution is approximately normal, what test statistic would you use to test the hypotheses in part (a)?
c. What is the value of the test statistic for this data?
d. What is the P-value for the value of the test statistic computed in part (c)?
e. For a level .05 test, what conclusion would you reach?

69. Contamination of mine soils in China is a serious environmental problem. The article “Heavy Metal Contamination in Soils and Phytoaccumulation in a Manganese Mine Wasteland, South China” (Air, Soil, and Water Res., 2008: 31–41) reported that, for a sample of 3 soil specimens from a certain restored mining area, the sample mean concentration of Total Cu was 45.31 mg/kg with a corresponding (estimated) standard error of the mean of 5.26. It was also stated that the China background value for this concentration was 20. The results of various statistical tests described in the article were predicated on assuming normality.
a. Does the data provide strong evidence for concluding that the true average concentration in the sampled region exceeds the stated background value? Carry out a test at significance level .01 using the P-value method. Does the result surprise you? Explain.
b. Referring back to the test of (a), how likely is it that the P-value would be at least .01 when the true average concentration is 50 and the true standard deviation of concentration is 10?

70. The article “Orchard Floor Management Utilizing Soil-Applied Coal Dust for Frost Protection” (Agri. and Forest Meteorology, 1988: 71–82) reports the following values for soil heat flux of eight plots covered with coal dust.

34.7 35.4 34.7 37.7 32.5 28.0 18.4 24.9

The mean soil heat flux for plots covered only with grass is 29.0. Assuming that the heat-flux distribution is approximately normal, does the data suggest that the coal dust is effective in increasing the mean heat flux over that for grass? Test the appropriate hypotheses using $\alpha = .05$.

71. The article “Caffeine Knowledge, Attitudes, and Consumption in Adult Women” (J. of Nutrition Educ., 1992: 179–184) reports the following summary data on daily caffeine consumption for a sample of adult women: $n = 47$, $\bar{x} = 215$ mg, $s = 235$ mg, and range = 5–1176.
a. Does it appear plausible that the population distribution of daily caffeine consumption is normal? Is it necessary to assume a normal population distribution to test hypotheses about the value of the population mean consumption? Explain your reasoning.

63. (continued)
a. For a level .01 test, compute $\beta$ at this alternative for sample sizes $n = 100$, 900, and 2500.
b. If the observed value of $\bar{X}$ is $\bar{x} = 74$, what can you say about the resulting P-value when $n = 2500$? Is the data statistically significant at any of the standard values of $\alpha$?
c. Would you really want to use a sample size of 2500 along with a level .01 test (disregarding the cost of such an experiment)? Explain.

64. Consider the large-sample level .01 test in Section 8.3 for testing $H_0\colon p = .2$ against $H_a\colon p > .2$.
a. For the alternative value $p = .21$, compute $\beta(.21)$ for sample sizes $n = 100$, 2500, 10,000, 40,000, and 90,000.
b. For $\hat{p} = x/n = .21$, compute the P-value when $n = 100$, 2500, 10,000, and 40,000.
c. In most situations, would it be reasonable to use a level .01 test in conjunction with a sample size of 40,000? Why or why not?


b. Suppose it had previously been believed that mean consumption was at most 200 mg. Does the given data contradict this prior belief? Test the appropriate hypotheses at significance level .10 and include a P-value in your analysis.

72. Annual holdings turnover for a mutual fund is the percentage of a fund’s assets that are sold during a particular year. Generally speaking, a fund with a low value of turnover is more stable and risk averse, whereas a high value of turnover indicates a substantial amount of buying and selling in an attempt to take advantage of short-term market fluctuations. Here are values of turnover for a sample of 20 large-cap blended funds (refer to Exercise 1.53 for a bit more information) extracted from Morningstar.com:

1.03 1.23 1.10 1.64 1.30 1.27 1.25 0.78 1.05 0.64
0.94 2.86 1.05 0.75 0.09 0.79 1.61 1.26 0.93 0.84

a. Would you use the one-sample t test to decide whether there is compelling evidence for concluding that the population mean turnover is less than 100%? Explain.
b. A normal probability plot of the 20 ln(turnover) values shows a very pronounced linear pattern, suggesting it is reasonable to assume that the turnover distribution is lognormal. Recall that $X$ has a lognormal distribution if $\ln(X)$ is normally distributed with mean value $\mu$ and variance $\sigma^2$. Because $\mu$ is also the median of the $\ln(X)$ distribution, $e^{\mu}$ is the median of the $X$ distribution. Use this information to decide whether there is compelling evidence for concluding that the median of the turnover population distribution is less than 100%.

73. The true average breaking strength of ceramic insulators of a certain type is supposed to be at least 10 psi. They will be used for a particular application unless sample data indicates conclusively that this specification has not been met. A test of hypotheses using $\alpha = .01$ is to be based on a random sample of ten insulators. Assume that the breaking-strength distribution is normal with unknown standard deviation.
a. If the true standard deviation is .80, how likely is it that insulators will be judged satisfactory when true average breaking strength is actually only 9.5? Only 9.0?
b. What sample size would be necessary to have a 75% chance of detecting that the true average breaking strength is 9.5 when the true standard deviation is .80?

74. The accompanying observations on residual flame time (sec) for strips of treated children’s nightwear were given in the article “An Introduction to Some Precision and Accuracy of Measurement Problems” (J. of Testing and Eval., 1982: 132–140). Suppose a true average flame time of at most 9.75 had been mandated. Does the data suggest that this condition has not been met? Carry out an appropriate test after first investigating the plausibility of assumptions that underlie your method of inference.

9.85 9.93 9.75 9.77 9.67 9.87 9.67
9.94 9.85 9.75 9.83 9.92 9.74 9.99
9.88 9.95 9.95 9.93 9.92 9.89

75. The incidence of a certain type of chromosome defect in the U.S. adult male population is believed to be 1 in 75. A random sample of 800 individuals in U.S. penal institutions reveals 16 who have such defects. Can it be concluded that the incidence rate of this defect among prisoners differs from the presumed rate for the entire adult male population?
a. State and test the relevant hypotheses using $\alpha = .05$. What type of error might you have made in reaching a conclusion?
b. What P-value is associated with this test? Based on this P-value, could $H_0$ be rejected at significance level .20?

76. In an investigation of the toxin produced by a certain poisonous snake, a researcher prepared 26 different vials, each containing 1 g of the toxin, and then determined the amount of antitoxin needed to neutralize the toxin. The sample average amount of antitoxin necessary was found to be 1.89 mg, and the sample standard deviation was .42. Previous research had indicated that the true average neutralizing amount was 1.75 mg/g of toxin. Does the new data contradict the value suggested by prior research? Test the relevant hypotheses using the P-value approach. Does the validity of your analysis depend on any assumptions about the population distribution of neutralizing amount? Explain.

77. The sample average unrestrained compressive strength for 45 specimens of a particular type of brick was computed to be 3107 psi, and the sample standard deviation was 188. The distribution of unrestrained compressive strength may be somewhat skewed. Does the data strongly indicate that the true average unrestrained compressive strength is less than the design value of 3200? Test using $\alpha = .001$.

78. The Dec. 30, 2009, New York Times reported that in a survey of 948 American adults who said they were at least somewhat interested in college football, 597 said the current Bowl Championship System should be replaced by a playoff similar to that used in college basketball. Does this provide compelling evidence for concluding that a majority of all such individuals favor replacing the B.C.S. with a playoff? Test the appropriate hypotheses using the P-value method.

79. When $X_1, X_2, \dots, X_n$ are independent Poisson variables, each with parameter $\mu$, and $n$ is large, the sample mean $\bar{X}$ has approximately a normal distribution with $\mu = E(\bar{X})$ and $V(\bar{X}) = \mu/n$. This implies that

$$Z = \frac{\bar{X} - \mu}{\sqrt{\mu/n}}$$

has approximately a standard normal distribution. For testing $H_0\colon \mu = \mu_0$, we can replace $\mu$ by $\mu_0$ in the equation for $Z$ to obtain a test statistic. This statistic is actually preferred to the large-sample statistic with denominator $S/\sqrt{n}$ (when the $X_i$’s are Poisson) because it is tailored explicitly to the Poisson assumption. If the number of requests for consulting received by a certain statistician during a 5-day work week has a Poisson distribution and the total number of consulting requests during a 36-week period is 160, does this suggest that the true average number of weekly requests exceeds 4.0? Test using $\alpha = .02$.


80. An article in the Nov. 11, 2005, issue of the San Luis Obispo Tribune reported that researchers making random purchases at California Wal-Mart stores found scanners coming up with the wrong price 8.3% of the time. Suppose this was based on 200 purchases. The National Institute of Standards and Technology says that in the long run at most two out of every 100 items should have incorrectly scanned prices.
a. Develop a test procedure with a significance level of (approximately) .05, and then carry out the test to decide whether the NIST benchmark is not satisfied.
b. For the test procedure you employed in (a), what is the probability of deciding that the NIST benchmark has been satisfied when in fact the mistake rate is 5%?

81. A hot-tub manufacturer advertises that with its heating equipment, a temperature of 100°F can be achieved in at most 15 min. A random sample of 42 tubs is selected, and the time necessary to achieve a 100°F temperature is determined for each tub. The sample average time and sample standard deviation are 16.5 min and 2.2 min, respectively. Does this data cast doubt on the company’s claim? Compute the P-value and use it to reach a conclusion at level .05.

82. Chapter 7 presented a CI for the variance $\sigma^2$ of a normal population distribution. The key result there was that the rv $\chi^2 = (n - 1)S^2/\sigma^2$ has a chi-squared distribution with $n - 1$ df. Consider the null hypothesis $H_0\colon \sigma^2 = \sigma_0^2$ (equivalently, $\sigma = \sigma_0$). Then when $H_0$ is true, the test statistic $\chi^2 = (n - 1)S^2/\sigma_0^2$ has a chi-squared distribution with $n - 1$ df. If the relevant alternative is $H_a\colon \sigma^2 > \sigma_0^2$, rejecting $H_0$ if $(n - 1)s^2/\sigma_0^2 \ge \chi^2_{\alpha, n-1}$ gives a test with significance level $\alpha$.

To ensure reasonably uniform characteristics for a particular application, it is desired that the true standard deviation of the softening point of a certain type of petroleum pitch be at most .50°C. The softening points of ten different specimens were determined, yielding a sample standard deviation of .58°C. Does this strongly contradict the uniformity specification? Test the appropriate hypotheses using $\alpha = .01$.

83. Referring to Exercise 82, suppose an investigator wishes to test $H_0\colon \sigma^2 = .04$ versus $H_a\colon \sigma^2 < .04$ based on a sample of 21 observations. The computed value of $20s^2/.04$ is 8.58. Place bounds on the P-value and then reach a conclusion at level .01.

84. When the population distribution is normal and $n$ is large, the sample standard deviation $S$ has approximately a normal distribution with $E(S) \approx \sigma$ and $V(S) \approx \sigma^2/(2n)$. We already know that in this case, for any $n$, $\bar{X}$ is normal with $E(\bar{X}) = \mu$ and $V(\bar{X}) = \sigma^2/n$.
a. Assuming that the underlying distribution is normal, what is an approximately unbiased estimator of the 99th percentile $\theta = \mu + 2.33\sigma$?
b. When the $X_i$’s are normal, it can be shown that $\bar{X}$ and $S$ are independent rv’s (one measures location whereas the other measures spread). Use this to compute $V(\hat{\theta})$ and $\sigma_{\hat{\theta}}$ for the estimator of part (a). What is the estimated standard error $\hat{\sigma}_{\hat{\theta}}$?
c. Write a test statistic for testing $H_0\colon \theta = \theta_0$ that has approximately a standard normal distribution when $H_0$ is true. If soil pH is normally distributed in a certain region and 64 soil samples yield $\bar{x} = 6.33$, $s = .16$, does this provide strong evidence for concluding that at most 99% of all possible samples would have a pH of less than 6.75? Test using $\alpha = .01$.

85. Let $X_1, X_2, \dots, X_n$ be a random sample from an exponential distribution with parameter $\lambda$. Then it can be shown that $2\lambda \sum X_i$ has a chi-squared distribution with $\nu = 2n$ (by first showing that $2\lambda X_i$ has a chi-squared distribution with $\nu = 2$).
a. Use this fact to obtain a test statistic and rejection region that together specify a level $\alpha$ test for $H_0\colon \mu = \mu_0$ versus each of the three commonly encountered alternatives. [Hint: $E(X_i) = \mu = 1/\lambda$, so $\mu = \mu_0$ is equivalent to $\lambda = 1/\mu_0$.]
b. Suppose that ten identical components, each having exponentially distributed time until failure, are tested. The resulting failure times are

95 16 11 3 42 71 225 64 87 123

Use the test procedure of part (a) to decide whether the data strongly suggests that the true average lifetime is less than the previously claimed value of 75.

86. Suppose the population distribution is normal with known $\sigma$. Let $\gamma$ be such that $0 < \gamma < \alpha$. For testing $H_0\colon \mu = \mu_0$ versus $H_a\colon \mu \ne \mu_0$, consider the test that rejects $H_0$ if either $z \ge z_{\gamma}$ or $z \le -z_{\alpha - \gamma}$, where the test statistic is $Z = (\bar{X} - \mu_0)/(\sigma/\sqrt{n})$.
a. Show that $P(\text{type I error}) = \alpha$.
b. Derive an expression for $\beta(\mu')$. [Hint: Express the test in the form “reject $H_0$ if either $\bar{x} \ge c_1$ or $\le c_2$.”]
c. Let $\Delta > 0$. For what values of $\gamma$ (relative to $\alpha$) will $\beta(\mu_0 + \Delta) < \beta(\mu_0 - \Delta)$?

87. After a period of apprenticeship, an organization gives an exam that must be passed to be eligible for membership. Let $p = P(\text{randomly chosen apprentice passes})$. The organization wishes an exam that most but not all should be able to pass, so it decides that $p = .90$ is desirable. For a particular exam, the relevant hypotheses are $H_0\colon p = .90$ versus the alternative $H_a\colon p \ne .90$. Suppose ten people take the exam, and let $X =$ the number who pass.
a. Does the lower-tailed region $\{0, 1, \dots, 5\}$ specify a level .01 test?
b. Show that even though $H_a$ is two-sided, no two-tailed test is a level .01 test.
c. Sketch a graph of $\beta(p')$ as a function of $p'$ for this test. Is this desirable?

Bibliography

See the bibliographies at the end of Chapter 6 and Chapter 7.


9 Inferences Based on Two Samples

INTRODUCTION

Chapters 7 and 8 presented confidence intervals (CIs) and hypothesis-testing procedures for a single mean $\mu$, single proportion $p$, and a single variance $\sigma^2$. Here we extend these methods to situations involving the means, proportions, and variances of two different population distributions. For example, let $\mu_1$ denote true average Rockwell hardness for heat-treated steel specimens and $\mu_2$ denote true average hardness for cold-rolled specimens. Then an investigator might wish to use samples of hardness observations from each type of steel as a basis for calculating an interval estimate of $\mu_1 - \mu_2$, the difference between the two true average hardnesses. As another example, let $p_1$ denote the true proportion of nickel-cadmium cells produced under current operating conditions that are defective because of internal shorts, and let $p_2$ represent the true proportion of cells with internal shorts produced under modified operating conditions. If the rationale for the modified conditions is to reduce the proportion of defective cells, a quality engineer would want to use sample information to test the null hypothesis $H_0\colon p_1 - p_2 = 0$ (i.e., $p_1 = p_2$) versus the alternative hypothesis $H_a\colon p_1 - p_2 > 0$ (i.e., $p_1 > p_2$).


9.1 z Tests and Confidence Intervals for a Difference Between Two Population Means

The inferences discussed in this section concern a difference $\mu_1 - \mu_2$ between the means of two different population distributions. An investigator might, for example, wish to test hypotheses about the difference between true average breaking strengths of two different types of corrugated fiberboard. One such hypothesis would state that $\mu_1 - \mu_2 = 0$, that is, that $\mu_1 = \mu_2$. Alternatively, it may be appropriate to estimate $\mu_1 - \mu_2$ by computing a 95% CI. Such inferences necessitate obtaining a sample of strength observations for each type of fiberboard.

Basic Assumptions

1. $X_1, X_2, \dots, X_m$ is a random sample from a distribution with mean $\mu_1$ and variance $\sigma_1^2$.

2. $Y_1, Y_2, \dots, Y_n$ is a random sample from a distribution with mean $\mu_2$ and variance $\sigma_2^2$.

3. The X and Y samples are independent of one another.

The use of $m$ for the number of observations in the first sample and $n$ for the number of observations in the second sample allows for the two sample sizes to be different. Sometimes this is because it is more difficult or expensive to sample one population than another. In other situations, equal sample sizes may initially be specified, but for reasons beyond the scope of the experiment, the actual sample sizes may differ. For example, the abstract of the article “A Randomized Controlled Trial Assessing the Effectiveness of Professional Oral Care by Dental Hygienists” (Intl. J. of Dental Hygiene, 2008: 63–67) states that “Forty patients were randomly assigned to either the POC group ($m = 20$) or the control group ($n = 20$). One patient in the POC group and three in the control group dropped out because of exacerbation of underlying disease or death.” The data analysis was then based on $m = 19$ and $n = 16$.

The natural estimator of $\mu_1 - \mu_2$ is $\bar{X} - \bar{Y}$, the difference between the corresponding sample means. Inferential procedures are based on standardizing this estimator, so we need expressions for the expected value and standard deviation of $\bar{X} - \bar{Y}$.

PROPOSITION

The expected value of $\bar{X} - \bar{Y}$ is $\mu_1 - \mu_2$, so $\bar{X} - \bar{Y}$ is an unbiased estimator of $\mu_1 - \mu_2$. The standard deviation of $\bar{X} - \bar{Y}$ is

$$\sigma_{\bar{X} - \bar{Y}} = \sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$

Proof Both these results depend on the rules of expected value and variance presented in Chapter 5. Since the expected value of a difference is the difference of expected values,

$$E(\bar{X} - \bar{Y}) = E(\bar{X}) - E(\bar{Y}) = \mu_1 - \mu_2$$


Because the X and Y samples are independent, $\bar{X}$ and $\bar{Y}$ are independent quantities, so the variance of the difference is the sum of $V(\bar{X})$ and $V(\bar{Y})$:

$$V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y}) = \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}$$

The standard deviation of $\bar{X} - \bar{Y}$ is the square root of this expression. ■

If we regard $\mu_1 - \mu_2$ as a parameter $\theta$, then its estimator is $\hat{\theta} = \bar{X} - \bar{Y}$ with standard deviation $\sigma_{\hat{\theta}}$ given by the proposition. When $\sigma_1^2$ and $\sigma_2^2$ both have known values, the value of this standard deviation can be calculated. The sample variances must be used to estimate $\sigma_{\hat{\theta}}$ when $\sigma_1^2$ and $\sigma_2^2$ are unknown.

Test Procedures for Normal Populations with Known Variances

In Chapters 7 and 8, the first CI and test procedure for a population mean $\mu$ were based on the assumption that the population distribution was normal with the value of the population variance $\sigma^2$ known to the investigator. Similarly, we first assume here that both population distributions are normal and that the values of both $\sigma_1^2$ and $\sigma_2^2$ are known. Situations in which one or both of these assumptions can be dispensed with will be presented shortly.

Because the population distributions are normal, both $\bar{X}$ and $\bar{Y}$ have normal distributions. Furthermore, independence of the two samples implies that the two sample means are independent of one another. Thus the difference $\bar{X} - \bar{Y}$ is normally distributed, with expected value $\mu_1 - \mu_2$ and standard deviation $\sigma_{\bar{X} - \bar{Y}}$ given in the foregoing proposition. Standardizing $\bar{X} - \bar{Y}$ gives the standard normal variable

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} \qquad (9.1)$$

In a hypothesis-testing problem, the null hypothesis will state that $\mu_1 - \mu_2$ has a specified value. Denoting this null value by $\Delta_0$, we have $H_0\colon \mu_1 - \mu_2 = \Delta_0$. Often $\Delta_0 = 0$, in which case $H_0$ says that $\mu_1 = \mu_2$. A test statistic results from replacing $\mu_1 - \mu_2$ in Expression (9.1) by the null value $\Delta_0$. The test statistic $Z$ is obtained by standardizing $\bar{X} - \bar{Y}$ under the assumption that $H_0$ is true, so it has a standard normal distribution in this case. This test statistic can be written as $(\hat{\theta} - \text{null value})/\sigma_{\hat{\theta}}$, which is of the same form as several test statistics in Chapter 8.

Consider the alternative hypothesis $H_a\colon \mu_1 - \mu_2 > \Delta_0$. A value $\bar{x} - \bar{y}$ that considerably exceeds $\Delta_0$ (the expected value of $\bar{X} - \bar{Y}$ when $H_0$ is true) provides evidence against $H_0$ and for $H_a$. Such a value of $\bar{x} - \bar{y}$ corresponds to a positive and large value of $z$. Thus $H_0$ should be rejected in favor of $H_a$ if $z$ is greater than or equal to an appropriately chosen critical value. Because the test statistic $Z$ has a standard normal distribution when $H_0$ is true, the upper-tailed rejection region $z \ge z_{\alpha}$ gives a test with significance level (type I error probability) $\alpha$. Rejection regions for $H_a\colon \mu_1 - \mu_2 < \Delta_0$ and $H_a\colon \mu_1 - \mu_2 \ne \Delta_0$ that yield tests with desired significance level $\alpha$ are lower-tailed and two-tailed, respectively.
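The proposition's formula for the standard deviation of $\bar{X} - \bar{Y}$, and the claim that the standardized variable in Expression (9.1) is standard normal, can be checked with a short simulation. This is a minimal sketch, not part of the text: the population means, standard deviations, sample sizes, replication count, and random seed below are all assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

# Hypothetical normal populations and sample sizes.
mu1, sigma1, m = 29.8, 4.0, 20
mu2, sigma2, n = 34.7, 5.0, 25
reps = 20_000  # number of simulated pairs of samples

# Each row is one sample; take the sample mean of every row.
xbars = rng.normal(mu1, sigma1, size=(reps, m)).mean(axis=1)
ybars = rng.normal(mu2, sigma2, size=(reps, n)).mean(axis=1)
diffs = xbars - ybars

# Standard deviation of Xbar - Ybar according to the proposition.
theoretical_sd = np.sqrt(sigma1**2 / m + sigma2**2 / n)

# Standardize as in Expression (9.1).
z = (diffs - (mu1 - mu2)) / theoretical_sd

print("mean of differences:", diffs.mean(), "vs", mu1 - mu2)
print("sd of differences:  ", diffs.std(ddof=1), "vs", theoretical_sd)
print("mean and sd of z:   ", z.mean(), z.std(ddof=1))
```

With this many replications, the simulated mean and standard deviation of the differences should land close to $\mu_1 - \mu_2$ and $\sqrt{\sigma_1^2/m + \sigma_2^2/n}$, and the standardized values should have mean near 0 and standard deviation near 1.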


Null hypothesis: $H_0\colon \mu_1 - \mu_2 = \Delta_0$

Test statistic value:

$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

Alternative Hypothesis                      Rejection Region for Level $\alpha$ Test
$H_a\colon \mu_1 - \mu_2 > \Delta_0$        $z \ge z_{\alpha}$ (upper-tailed)
$H_a\colon \mu_1 - \mu_2 < \Delta_0$        $z \le -z_{\alpha}$ (lower-tailed)
$H_a\colon \mu_1 - \mu_2 \ne \Delta_0$      either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$ (two-tailed)

Because these are z tests, a P-value is computed as it was for the z tests in Chapter 8 [e.g., P-value $= 1 - \Phi(z)$ for an upper-tailed test].

Example 9.1

Analysis of a random sample consisting of $m = 20$ specimens of cold-rolled steel to determine yield strengths resulted in a sample average strength of $\bar{x} = 29.8$ ksi. A second random sample of $n = 25$ two-sided galvanized steel specimens gave a sample average strength of $\bar{y} = 34.7$ ksi. Assuming that the two yield-strength distributions are normal with $\sigma_1 = 4.0$ and $\sigma_2 = 5.0$ (suggested by a graph in the article “Zinc-Coated Sheet Steel: An Overview,” Automotive Engr., Dec. 1984: 39–43), does the data indicate that the corresponding true average yield strengths $\mu_1$ and $\mu_2$ are different? Let’s carry out a test at significance level $\alpha = .01$.

1. The parameter of interest is $\mu_1 - \mu_2$, the difference between the true average strengths for the two types of steel.

2. The null hypothesis is $H_0\colon \mu_1 - \mu_2 = 0$.

3. The alternative hypothesis is $H_a\colon \mu_1 - \mu_2 \ne 0$; if $H_a$ is true, then $\mu_1$ and $\mu_2$ are different.

4. With $\Delta_0 = 0$, the test statistic value is

$$z = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

5. The inequality in $H_a$ implies that the test is two-tailed. For $\alpha = .01$, $\alpha/2 = .005$, and $z_{\alpha/2} = z_{.005} = 2.58$, $H_0$ will be rejected if $z \ge 2.58$ or if $z \le -2.58$.

6. Substituting $m = 20$, $\bar{x} = 29.8$, $\sigma_1^2 = 16.0$, $n = 25$, $\bar{y} = 34.7$, and $\sigma_2^2 = 25.0$ into the formula for $z$ yields

$$z = \frac{29.8 - 34.7}{\sqrt{\dfrac{16.0}{20} + \dfrac{25.0}{25}}} = \frac{-4.90}{1.34} = -3.66$$

That is, the observed value of $\bar{x} - \bar{y}$ is more than 3 standard deviations below what would be expected were $H_0$ true.

7. Since $-3.66 < -2.58$, $z$ does fall in the lower tail of the rejection region. $H_0$ is therefore rejected at level .01 in favor of the conclusion that $\mu_1 \ne \mu_2$. The sample data strongly suggests that the true average yield strength for cold-rolled steel differs from that for galvanized steel. The P-value for this two-tailed test is $2(1 - \Phi(3.66)) \approx 2(1 - 1) = 0$, so $H_0$ should be rejected at any reasonable significance level. ■
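The test procedure and the arithmetic of Example 9.1 can be sketched in a few lines of Python. The function name `two_sample_z_test` and its parameter names are hypothetical, introduced here only for illustration; the standard normal cdf comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(xbar, ybar, sigma1, sigma2, m, n,
                      delta0=0.0, alternative="two-sided"):
    """Two-sample z test with known population standard deviations.

    Returns the test statistic value z and the P-value.
    """
    se = sqrt(sigma1**2 / m + sigma2**2 / n)  # sd of Xbar - Ybar
    z = (xbar - ybar - delta0) / se
    phi = NormalDist().cdf                    # standard normal cdf
    if alternative == "greater":              # Ha: mu1 - mu2 > delta0
        p_value = 1 - phi(z)
    elif alternative == "less":               # Ha: mu1 - mu2 < delta0
        p_value = phi(z)
    elif alternative == "two-sided":          # Ha: mu1 - mu2 != delta0
        p_value = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Summary data from Example 9.1.
z, p_value = two_sample_z_test(29.8, 34.7, 4.0, 5.0, 20, 25)

alpha = 0.01
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject = abs(z) >= z_crit

print(z, p_value, z_crit, reject)
```

Without intermediate rounding the statistic is about $-3.65$; the value $-3.66$ in the example comes from rounding the denominator to 1.34 before dividing. The conclusion is unchanged either way, since both values fall well below $-2.58$.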


Using a Comparison to Identify Causality Investigators are often interested in comparing either the effects of two different treat- ments on a response or the response after treatment with the response after no treat- ment (treatment vs. control). If the individuals or objects to be used in the comparison are not assigned by the investigators to the two different conditions, the study is said to be observational. The difficulty with drawing conclusions based on an observa- tional study is that although statistical analysis may indicate a significant difference in response between the two groups, the difference may be due to some underlying factors that had not been controlled rather than to any difference in treatments.

Example 9.2

A letter in the Journal of the American Medical Association (May 19, 1978) reported that of 215 male physicians who were Harvard graduates and died between November 1974 and October 1977, the 125 in full-time practice lived an average of 48.9 years beyond graduation, whereas the 90 with academic affiliations lived an average of 43.2 years beyond graduation. Does the data suggest that the mean lifetime after graduation for doctors in full-time practice exceeds the mean lifetime for those who have an academic affiliation? (If so, those medical students who say that they are “dying to obtain an academic affiliation” may be closer to the truth than they realize; in other words, is “publish or perish” really “publish and perish”?)

Let $\mu_1$ denote the true average number of years lived beyond graduation for physicians in full-time practice, and let $\mu_2$ denote the same quantity for physicians with academic affiliations. Assume the 125 and 90 physicians to be random samples from populations 1 and 2, respectively (which may not be reasonable if there is reason to believe that Harvard graduates have special characteristics that differentiate them from all other physicians; in this case inferences would be restricted just to the “Harvard populations”). The letter from which the data was taken gave no information about variances, so for illustration assume that $\sigma_1 = 14.6$ and $\sigma_2 = 14.4$. The hypotheses are $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$, so $\Delta_0$ is zero. The computed value of the test statistic is

$$z = \frac{48.9 - 43.2}{\sqrt{\dfrac{(14.6)^2}{125} + \dfrac{(14.4)^2}{90}}} = \frac{5.70}{\sqrt{1.70 + 2.30}} = 2.85$$

The P-value for an upper-tailed test is $1 - \Phi(2.85) = .0022$. At significance level .01, $H_0$ is rejected (because $\alpha >$ P-value) in favor of the conclusion that $\mu_1 - \mu_2 > 0$ ($\mu_1 > \mu_2$). This is consistent with the information reported in the letter.

This data resulted from a retrospective observational study; the investigator did not start out by selecting a sample of doctors and assigning some to the “academic affiliation” treatment and the others to the “full-time practice” treatment, but instead identified members of the two groups by looking backward in time (through obituaries!) to past records. Can the statistically significant result here really be attributed to a difference in the type of medical practice after graduation, or is there some other underlying factor (e.g., age at graduation, exercise regimens, etc.) that might also furnish a plausible explanation for the difference? Observational studies have been used to argue for a causal link between smoking and lung cancer. There are many studies that show that the incidence of lung cancer is significantly higher among smokers than among nonsmokers. However, individuals had decided whether to become smokers long before investigators arrived on the scene, and factors in making this decision may have played a causal role in the contraction of lung cancer. ■
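The arithmetic of Example 9.2 can be checked numerically. The following is a minimal sketch, not part of the original example: the helper name `two_sample_z_upper` is hypothetical, only Python's standard library is used, and the standard deviations 14.6 and 14.4 are the illustrative values assumed in the example rather than reported data.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_upper(xbar, ybar, sigma1, sigma2, m, n, delta0=0.0):
    """Return (z, P-value) for the upper-tailed test of H0: mu1 - mu2 = delta0.

    Hypothetical helper; assumes known population standard deviations.
    """
    se = sqrt(sigma1**2 / m + sigma2**2 / n)   # standard deviation of Xbar - Ybar
    z = (xbar - ybar - delta0) / se
    p_value = 1.0 - NormalDist().cdf(z)        # area under the z curve to the right of z
    return z, p_value


# Summary values from Example 9.2 (the sigmas are the example's assumed values)
z, p = two_sample_z_upper(48.9, 43.2, 14.6, 14.4, m=125, n=90)
print(f"z = {z:.2f}, P-value = {p:.4f}")
```

Because the P-value is an area under the standard normal curve, no table lookup is needed; `NormalDist().cdf` plays the role of $\Phi$.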


A randomized controlled experiment results when investigators assign subjects to the two treatments in a random fashion. When statistical significance is observed in such an experiment, the investigator and other interested parties will have more confidence in the conclusion that the difference in response has been caused by a difference in treatments. A very famous example of this type of experiment and conclusion is the Salk polio vaccine experiment described in Section 9.4. These issues are discussed at greater length in the (nonmathematical) books by Moore and by Freedman et al., listed in the Chapter 1 references.

$\beta$ and the Choice of Sample Size

The probability of a type II error is easily calculated when both population distributions are normal with known values of $\sigma_1$ and $\sigma_2$. Consider the case in which the alternative hypothesis is $H_a: \mu_1 - \mu_2 > \Delta_0$. Let $\Delta'$ denote a value of $\mu_1 - \mu_2$ that exceeds $\Delta_0$ (a value for which $H_0$ is false). The upper-tailed rejection region $z \ge z_\alpha$ can be reexpressed in the form $\bar{x} - \bar{y} \ge \Delta_0 + z_\alpha \sigma_{\bar{X}-\bar{Y}}$. Thus

$$\beta(\Delta') = P(\text{not rejecting } H_0 \text{ when } \mu_1 - \mu_2 = \Delta') = P(\bar{X} - \bar{Y} < \Delta_0 + z_\alpha \sigma_{\bar{X}-\bar{Y}} \text{ when } \mu_1 - \mu_2 = \Delta')$$

When $\mu_1 - \mu_2 = \Delta'$, $\bar{X} - \bar{Y}$ is normally distributed with mean value $\Delta'$ and standard deviation $\sigma_{\bar{X}-\bar{Y}}$ (the same standard deviation as when $H_0$ is true); using these values to standardize the inequality in parentheses gives the desired probability.

Alternative Hypothesis               $\beta(\Delta') = P(\text{type II error when } \mu_1 - \mu_2 = \Delta')$

$H_a: \mu_1 - \mu_2 > \Delta_0$        $\Phi\left(z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$
$H_a: \mu_1 - \mu_2 < \Delta_0$        $1 - \Phi\left(-z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$      $\Phi\left(z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

where $\sigma = \sigma_{\bar{X}-\bar{Y}} = \sqrt{(\sigma_1^2/m) + (\sigma_2^2/n)}$

Example 9.3 (Example 9.1 continued)

Suppose that when $\mu_1$ and $\mu_2$ (the true average yield strengths for the two types of steel) differ by as much as 5, the probability of detecting such a departure from $H_0$ (the power of the test) should be .90. Does a level .01 test with sample sizes $m = 20$ and $n = 25$ satisfy this condition? The value of $\sigma$ for these sample sizes (the denominator of z) was previously calculated as 1.34. The probability of a type II error for the two-tailed level .01 test when $\mu_1 - \mu_2 = \Delta' = 5$ is

$$\beta(5) = \Phi\left(2.58 - \frac{5 - 0}{1.34}\right) - \Phi\left(-2.58 - \frac{5 - 0}{1.34}\right) = \Phi(-1.15) - \Phi(-6.31) = .1251$$

It is easy to verify that $\beta(-5) = .1251$ also (because the rejection region is symmetric). Thus the power is $1 - \beta(5) = .8749$. Because this is somewhat less than .9, slightly larger sample sizes should be used. ■
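The two-tailed type II error calculation of Example 9.3 can be sketched in code as follows. This is an illustration under stated assumptions: the function name `beta_two_tailed` is hypothetical, and the critical value is computed exactly rather than rounded to 2.58, so the result is close to, but not identical with, the table-based value .1251.

```python
from statistics import NormalDist


def beta_two_tailed(delta_prime, delta0, sigma_diff, alpha):
    """Type II error probability of the two-tailed z test when mu1 - mu2 = delta_prime.

    sigma_diff is the standard deviation of Xbar - Ybar,
    i.e. sqrt(sigma1**2 / m + sigma2**2 / n).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)    # z_{alpha/2}
    shift = (delta_prime - delta0) / sigma_diff       # (Delta' - Delta0) / sigma
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


# Values from Example 9.3: Delta' = 5, Delta0 = 0, sigma = 1.34, level .01
beta = beta_two_tailed(delta_prime=5.0, delta0=0.0, sigma_diff=1.34, alpha=0.01)
power = 1.0 - beta
print(f"beta(5) = {beta:.4f}, power = {power:.4f}")
```

Calling the function with `delta_prime=-5.0` gives the same probability, which reflects the symmetry of the two-tailed rejection region.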


As in Chapter 8, sample sizes m and n can be determined that will satisfy both P(type I error) = a specified $\alpha$ and P(type II error when $\mu_1 - \mu_2 = \Delta'$) = a specified $\beta$. For an upper-tailed test, equating the previous expression for $\beta(\Delta')$ to the specified value of $\beta$ gives

$$\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n} = \frac{(\Delta' - \Delta_0)^2}{(z_\alpha + z_\beta)^2}$$

When the two sample sizes are equal, this equation yields

$$m = n = \frac{(\sigma_1^2 + \sigma_2^2)(z_\alpha + z_\beta)^2}{(\Delta' - \Delta_0)^2}$$

These expressions are also correct for a lower-tailed test, whereas $\alpha$ is replaced by $\alpha/2$ for a two-tailed test.

Large-Sample Tests

The assumptions of normal population distributions and known values of $\sigma_1$ and $\sigma_2$ are fortunately unnecessary when both sample sizes are sufficiently large. In this case, the Central Limit Theorem guarantees that $\bar{X} - \bar{Y}$ has approximately a normal distribution regardless of the underlying population distributions. Furthermore, using $S_1^2$ and $S_2^2$ in place of $\sigma_1^2$ and $\sigma_2^2$ in Expression (9.1) gives a variable whose distribution is approximately standard normal:

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}}$$

A large-sample test statistic results from replacing $\mu_1 - \mu_2$ by $\Delta_0$, the expected value of $\bar{X} - \bar{Y}$ when $H_0$ is true. This statistic Z then has approximately a standard normal distribution when $H_0$ is true. Tests with a desired significance level are obtained by using z critical values exactly as before.

Use of the test statistic value

$$z = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

along with the previously stated upper-, lower-, and two-tailed rejection regions based on z critical values gives large-sample tests whose significance levels are approximately $\alpha$. These tests are usually appropriate if both $m > 40$ and $n > 40$. A P-value is computed exactly as it was for our earlier z tests.
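The equal-sample-size formula from the discussion of $\beta$ above lends itself to a short computation. The sketch below is only an illustration: the function name `equal_sample_size` and the numerical inputs are hypothetical (they are not taken from any example in the text), and the `two_tailed` flag simply applies the stated replacement of $\alpha$ by $\alpha/2$.

```python
from math import ceil
from statistics import NormalDist


def equal_sample_size(sigma1, sigma2, alpha, beta, delta_prime, delta0=0.0,
                      two_tailed=False):
    """Common sample size m = n giving type II error beta when mu1 - mu2 = delta_prime."""
    std_normal = NormalDist()
    tail = alpha / 2.0 if two_tailed else alpha       # alpha/2 for a two-tailed test
    z_alpha = std_normal.inv_cdf(1.0 - tail)          # upper-tail critical value
    z_beta = std_normal.inv_cdf(1.0 - beta)
    n = (sigma1**2 + sigma2**2) * (z_alpha + z_beta) ** 2 / (delta_prime - delta0) ** 2
    return ceil(n)                                    # round up to an integer


# Hypothetical inputs chosen only for illustration
n_one = equal_sample_size(2.0, 2.0, alpha=0.05, beta=0.10, delta_prime=1.0)
n_two = equal_sample_size(2.0, 2.0, alpha=0.05, beta=0.10, delta_prime=1.0,
                          two_tailed=True)
print(n_one, n_two)
```

As expected, the two-tailed test requires the larger sample size, because its critical value $z_{\alpha/2}$ exceeds $z_\alpha$.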

Example 9.4

What impact does fast-food consumption have on various dietary and health characteristics? The article “Effects of Fast-Food Consumption on Energy Intake and Diet Quality Among Children in a National Household Study” (Pediatrics, 2004: 112–118) reported the accompanying summary data on daily calorie intake both for a sample of teens who said they did not typically eat fast food and another sample of teens who said they did usually eat fast food.

Eat Fast Food    Sample Size    Sample Mean    Sample SD
No               663            2258           1519
Yes              413            2637           1138

Does this data provide strong evidence for concluding that true average calorie intake for teens who typically eat fast food exceeds by more than 200 calories per day the true average intake for those who don’t typically eat fast food? Let’s investigate by carrying out a test of hypotheses at a significance level of approximately .05.

The parameter of interest is $\mu_1 - \mu_2$, where $\mu_1$ is the true average calorie intake for teens who don’t typically eat fast food and $\mu_2$ is the true average intake for teens who do typically eat fast food. The hypotheses of interest are

$$H_0: \mu_1 - \mu_2 = -200 \quad \text{versus} \quad H_a: \mu_1 - \mu_2 < -200$$

The alternative hypothesis asserts that true average daily intake for those who typically eat fast food exceeds that for those who don’t by more than 200 calories. The test statistic value is

$$z = \frac{\bar{x} - \bar{y} - (-200)}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

The inequality in $H_a$ implies that the test is lower-tailed; $H_0$ should be rejected if $z \le -z_{.05} = -1.645$. The calculated test statistic value is

$$z = \frac{2258 - 2637 + 200}{\sqrt{\dfrac{(1519)^2}{663} + \dfrac{(1138)^2}{413}}} = \frac{-179}{81.34} = -2.20$$

Since $-2.20 \le -1.645$, the null hypothesis is rejected. At a significance level of .05, it does appear that true average daily calorie intake for teens who typically eat fast food exceeds by more than 200 the true average intake for those who don’t typically eat such food.

The P-value for the test is

P-value = area under the z curve to the left of $-2.20 = \Phi(-2.20) = .0139$

Because $.0139 \le .05$, we again reject the null hypothesis at significance level .05. However, the P-value is not small enough to justify rejecting $H_0$ at significance level .01.

Notice that if the label 1 had instead been used for the fast-food condition and 2 had been used for the no-fast-food condition, then 200 would have replaced $-200$ in both hypotheses and $H_a$ would have contained the inequality $>$, implying an upper-tailed test. The resulting test statistic value would have been 2.20, giving the same P-value as before. ■

Confidence Intervals for $\mu_1 - \mu_2$

When both population distributions are normal, standardizing $\bar{X} - \bar{Y}$ gives a random variable Z with a standard normal distribution. Since the area under the z curve between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is $1 - \alpha$, it follows that


$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}} < z_{\alpha/2}\right) = 1 - \alpha$$

Manipulation of the inequalities inside the parentheses to isolate $\mu_1 - \mu_2$ yields the equivalent probability statement

$$P\left(\bar{X} - \bar{Y} - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}} < \mu_1 - \mu_2 < \bar{X} - \bar{Y} + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}\right) = 1 - \alpha$$

This implies that a $100(1 - \alpha)\%$ CI for $\mu_1 - \mu_2$ has lower limit $\bar{x} - \bar{y} - z_{\alpha/2} \cdot \sigma_{\bar{X}-\bar{Y}}$ and upper limit $\bar{x} - \bar{y} + z_{\alpha/2} \cdot \sigma_{\bar{X}-\bar{Y}}$, where $\sigma_{\bar{X}-\bar{Y}}$ is the square-root expression. This interval is a special case of the general formula $\hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\theta}}$. If both m and n are large, the CLT implies that this interval is valid even without the assumption of normal populations; in this case, the confidence level is approximately $100(1 - \alpha)\%$. Furthermore, use of the sample variances $S_1^2$ and $S_2^2$ in the standardized variable Z yields a valid interval in which $s_1^2$ and $s_2^2$ replace $\sigma_1^2$ and $\sigma_2^2$.

Provided that m and n are both large, a CI for $\mu_1 - \mu_2$ with a confidence level of approximately $100(1 - \alpha)\%$ is

$$\bar{x} - \bar{y} \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

where $-$ gives the lower limit and $+$ the upper limit of the interval. An upper or a lower confidence bound can also be calculated by retaining the appropriate sign ($+$ or $-$) and replacing $z_{\alpha/2}$ by $z_\alpha$.

Our standard rule of thumb for characterizing sample sizes as large is $m > 40$ and $n > 40$.

Example 9.5

An experiment carried out to study various characteristics of anchor bolts resulted in 78 observations on shear strength (kip) of 3/8-in. diameter bolts and 88 observations on the strength of 1/2-in. diameter bolts. Summary quantities from Minitab follow, and a comparative boxplot is presented in Figure 9.1. The sample sizes, sample means, and sample standard deviations agree with values given in the article “Ultimate Load Capacities of Expansion Anchor Bolts” (J. of Energy Engr., 1993: 139–158). The summaries suggest that the main difference between the two samples is in where they are centered.

Variable    N     Mean     Median   TrMean   StDev    SEMean
diam 3/8    78    4.250    4.230    4.238    1.300    0.147

Variable    Min      Max      Q1       Q3
diam 3/8    1.634    7.327    3.389    5.075

Variable    N     Mean     Median   TrMean   StDev    SEMean
diam 1/2    88    7.140    7.113    7.150    1.680    0.179

Variable    Min      Max       Q1       Q3
diam 1/2    2.450    11.343    5.965    8.447
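As a sketch of how the large-sample interval might be evaluated directly from summary quantities such as the Minitab output above, consider the following. The function name `large_sample_ci` is hypothetical, and the critical value is obtained from the standard library's `NormalDist` rather than from a z table, so the limits agree with a hand calculation only up to rounding.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_ci(xbar, ybar, s1, s2, m, n, confidence=0.95):
    """Approximate CI for mu1 - mu2; intended for m > 40 and n > 40."""
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # z_{alpha/2}
    half_width = z_crit * sqrt(s1**2 / m + s2**2 / n)
    center = xbar - ybar
    return center - half_width, center + half_width


# Minitab summary values: 3/8-in. bolts as sample 1, 1/2-in. bolts as sample 2
lower, upper = large_sample_ci(4.250, 7.140, 1.300, 1.680, m=78, n=88)
print(f"({lower:.2f}, {upper:.2f})")
```

Swapping the roles of the two samples in the call changes only the signs of the limits, not the width of the interval.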


Figure 9.1 A comparative boxplot of the shear strength data [figure not reproduced; boxes for Type 1 and Type 2, horizontal axis: Strength]

Let’s now calculate a confidence interval for the difference between true average shear strength for 3/8-in. bolts ($\mu_1$) and true average shear strength for 1/2-in. bolts ($\mu_2$) using a confidence level of 95%:

$$4.25 - 7.14 \pm (1.96)\sqrt{\frac{(1.30)^2}{78} + \frac{(1.68)^2}{88}} = -2.89 \pm (1.96)(.2318) = -2.89 \pm .45 = (-3.34, -2.44)$$

That is, with 95% confidence, $-3.34 < \mu_1 - \mu_2 < -2.44$. We can therefore be highly confident that the true average shear strength for the 1/2-in. bolts exceeds that for the 3/8-in. bolts by between 2.44 kip and 3.34 kip. Notice that if we relabel so that $\mu_1$ refers to 1/2-in. bolts and $\mu_2$ to 3/8-in. bolts, the confidence interval is now centered at 2.89 and the value .45 is still subtracted and added to obtain the confidence limits. The resulting interval is (2.44, 3.34), and the interpretation is identical to that for the interval previously calculated. ■

If the variances $\sigma_1^2$ and $\sigma_2^2$ are at least approximately known and the investigator uses equal sample sizes, then the common sample size n that yields a $100(1 - \alpha)\%$ interval of width w is

$$n = \frac{4 z_{\alpha/2}^2 (\sigma_1^2 + \sigma_2^2)}{w^2}$$

which will generally have to be rounded up to an integer.

EXERCISES Section 9.1 (1–16)

1. An article in the November 1983 Consumer Reports compared various types of batteries. The average lifetimes of Duracell Alkaline AA batteries and Eveready Energizer Alkaline AA batteries were given as 4.1 hours and 4.5 hours, respectively. Suppose these are the population average lifetimes.
a. Let $\bar{X}$ be the sample average lifetime of 100 Duracell batteries and $\bar{Y}$ be the sample average lifetime of 100 Eveready batteries. What is the mean value of $\bar{X} - \bar{Y}$ (i.e., where is the distribution of $\bar{X} - \bar{Y}$ centered)? How does your answer depend on the specified sample sizes?
b. Suppose the population standard deviations of lifetime are 1.8 hours for Duracell batteries and 2.0 hours for Eveready batteries. With the sample sizes given in part (a), what is the variance of the statistic $\bar{X} - \bar{Y}$, and what is its standard deviation?
c. For the sample sizes given in part (a), draw a picture of the approximate distribution curve of $\bar{X} - \bar{Y}$ (include a measurement scale on the horizontal axis). Would the shape of the curve necessarily be the same for sample sizes of 10 batteries of each type? Explain.


2. The National Health Statistics Reports dated Oct. 22, 2008, included the following information on the heights (in.) for non-Hispanic white females:

Age             Sample Size    Sample Mean    Std. Error Mean
20–39           866            64.9           .09
60 and older    934            63.1           .11

a. Calculate and interpret a confidence interval at confidence level approximately 95% for the difference between population mean height for the younger women and that for the older women.
b. Let $\mu_1$ denote the population mean height for those aged 20–39 and $\mu_2$ denote the population mean height for those aged 60 and older. Interpret the hypotheses $H_0: \mu_1 - \mu_2 = 1$ and $H_a: \mu_1 - \mu_2 > 1$, and then carry out a test of these hypotheses at significance level .001 using the rejection region approach.
c. What is the P-value for the test you carried out in (b)? Based on this P-value, would you reject the null hypothesis at any reasonable significance level? Explain.
d. What hypotheses would be appropriate if $\mu_1$ referred to the older age group, $\mu_2$ to the younger age group, and you wanted to see if there was compelling evidence for concluding that the population mean height for younger women exceeded that for older women by more than 1 in.?

3. Let $\mu_1$ denote true average tread life for a premium brand of P205/65R15 radial tire, and let $\mu_2$ denote the true average tread life for an economy brand of the same size. Test $H_0: \mu_1 - \mu_2 = 5000$ versus $H_a: \mu_1 - \mu_2 > 5000$ at level .01, using the following data: $m = 45$, $\bar{x} = 42{,}500$, $s_1 = 2200$, $n = 45$, $\bar{y} = 36{,}800$, and $s_2 = 1500$.

4. a. Use the data of Example 9.4 to compute a 95% CI for $\mu_1 - \mu_2$. Does the resulting interval suggest that $\mu_1 - \mu_2$ has been precisely estimated?
b. Use the data of Exercise 3 to compute a 95% upper confidence bound for $\mu_1 - \mu_2$.

5. Persons having Raynaud’s syndrome are apt to suffer a sudden impairment of blood circulation in fingers and toes. In an experiment to study the extent of this impairment, each subject immersed a forefinger in water and the resulting heat output (cal/cm²/min) was measured. For $m = 10$ subjects with the syndrome, the average heat output was $\bar{x} = .64$, and for $n = 10$ nonsufferers, the average output was 2.05. Let $\mu_1$ and $\mu_2$ denote the true average heat outputs for the two types of subjects. Assume that the two distributions of heat output are normal with $\sigma_1 = .2$ and $\sigma_2 = .4$.
a. Consider testing $H_0: \mu_1 - \mu_2 = -1.0$ versus $H_a: \mu_1 - \mu_2 < -1.0$ at level .01. Describe in words what $H_a$ says, and then carry out the test.
b. Compute the P-value for the value of Z obtained in part (a).
c. What is the probability of a type II error when the actual difference between $\mu_1$ and $\mu_2$ is $\mu_1 - \mu_2 = -1.2$?
d. Assuming that $m = n$, what sample sizes are required to ensure that $\beta = .1$ when $\mu_1 - \mu_2 = -1.2$?

6. An experiment to compare the tension bond strength of polymer latex modified mortar (Portland cement mortar to which polymer latex emulsions have been added during mixing) to that of unmodified mortar resulted in $\bar{x} = 18.12$ kgf/cm² for the modified mortar ($m = 40$) and $\bar{y} = 16.87$ kgf/cm² for the unmodified mortar ($n = 32$). Let $\mu_1$ and $\mu_2$ be the true average tension bond strengths for the modified and unmodified mortars, respectively. Assume that the bond strength distributions are both normal.
a. Assuming that $\sigma_1 = 1.6$ and $\sigma_2 = 1.4$, test $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$ at level .01.
b. Compute the probability of a type II error for the test of part (a) when $\mu_1 - \mu_2 = 1$.
c. Suppose the investigator decided to use a level .05 test and wished $\beta = .10$ when $\mu_1 - \mu_2 = 1$. If $m = 40$, what value of n is necessary?
d. How would the analysis and conclusion of part (a) change if $\sigma_1$ and $\sigma_2$ were unknown but $s_1 = 1.6$ and $s_2 = 1.4$?

7. Is there any systematic tendency for part-time college faculty to hold their students to different standards than do full-time faculty? The article “Are There Instructional Differences Between Full-Time and Part-Time Faculty?” (College Teaching, 2009: 23–26) reported that for a sample of 125 courses taught by full-time faculty, the mean course GPA was 2.7186 and the standard deviation was .63342, whereas for a sample of 88 courses taught by part-timers, the mean and standard deviation were 2.8639 and .49241, respectively. Does it appear that true average course GPA for part-time faculty differs from that for faculty teaching full-time? Test the appropriate hypotheses at significance level .01 by first obtaining a P-value.

8. Tensile-strength tests were carried out on two different grades of wire rod (“Fluidized Bed Patenting of Wire Rods,” Wire J., June 1977: 56–61), resulting in the accompanying data.

Grade        Sample Size    Sample Mean (kg/mm²)    Sample SD
AISI 1064    $m = 129$      $\bar{x} = 107.6$       $s_1 = 1.3$
AISI 1078    $n = 129$      $\bar{y} = 123.6$       $s_2 = 2.0$

a. Does the data provide compelling evidence for concluding that true average strength for the 1078 grade exceeds that for the 1064 grade by more than 10 kg/mm²? Test the appropriate hypotheses using the P-value approach.
b. Estimate the difference between true average strengths for the two grades in a way that provides information about precision and reliability.

9. The article “Evaluation of a Ventilation Strategy to Prevent Barotrauma in Patients at High Risk for Acute Respiratory Distress Syndrome” (New Engl. J. of Med., 1998: 355–358) reported on an experiment in which 120 patients with similar clinical features were randomly divided into a control group


and a treatment group, each consisting of 60 patients. The sample mean ICU stay (days) and sample standard deviation for the treatment group were 19.9 and 39.1, respectively, whereas these values for the control group were 13.7 and 15.8.
a. Calculate a point estimate for the difference between true average ICU stay for the treatment and control groups. Does this estimate suggest that there is a significant difference between true average stays under the two conditions?
b. Answer the question posed in part (a) by carrying out a formal test of hypotheses. Is the result different from what you conjectured in part (a)?
c. Does it appear that ICU stay for patients given the ventilation treatment is normally distributed? Explain your reasoning.
d. Estimate true average length of stay for patients given the ventilation treatment in a way that conveys information about precision and reliability.

10. An experiment was performed to compare the fracture toughness of high-purity 18 Ni maraging steel with commercial-purity steel of the same type (Corrosion Science, 1971: 723–736). For $m = 32$ specimens, the sample average toughness was $\bar{x} = 65.6$ for the high-purity steel, whereas for $n = 38$ specimens of commercial steel $\bar{y} = 59.8$. Because the high-purity steel is more expensive, its use for a certain application can be justified only if its fracture toughness exceeds that of commercial-purity steel by more than 5. Suppose that both toughness distributions are normal.
a. Assuming that $\sigma_1 = 1.2$ and $\sigma_2 = 1.1$, test the relevant hypotheses using $\alpha = .001$.
b. Compute $\beta$ for the test conducted in part (a) when $\mu_1 - \mu_2 = 6$.

11. The level of lead in the blood was determined for a sample of 152 male hazardous-waste workers ages 20–30 and also for a sample of 86 female workers, resulting in a mean ± standard error of 5.5 ± 0.3 for the men and 3.8 ± 0.2 for the women (“Temporal Changes in Blood Lead Levels of Hazardous Waste Workers in New Jersey, 1984–1987,” Environ. Monitoring and Assessment, 1993: 99–107). Calculate an estimate of the difference between true average blood lead levels for male and female workers in a way that provides information about reliability and precision.

12. The accompanying table gives summary data on cube compressive strength (N/mm²) for concrete specimens made with a pulverized fuel-ash mix (“A Study of Twenty-Five-Year-Old Pulverized Fuel Ash Concrete Used in Foundation Structures,” Proc. Inst. Civ. Engrs., Mar. 1985: 149–165):

Age (days)    Sample Size    Sample Mean    Sample SD
7             68             26.99          4.89
28            74             35.76          6.43

Calculate and interpret a 99% CI for the difference between true average 7-day strength and true average 28-day strength.

13. A mechanical engineer wishes to compare strength properties of steel beams with similar beams made with a particular alloy. The same number of beams, n, of each type will be tested. Each beam will be set in a horizontal position with a support on each end, a force of 2500 lb will be applied at the center, and the deflection will be measured. From past experience with such beams, the engineer is willing to assume that the true standard deviation of deflection for both types of beam is .05 in. Because the alloy is more expensive, the engineer wishes to test at level .01 whether it has smaller average deflection than the steel beam. What value of n is appropriate if the desired type II error probability is .05 when the difference in true average deflection favors the alloy by .04 in.?

14. The level of monoamine oxidase (MAO) activity in blood platelets (nm/mg protein/h) was determined for each individual in a sample of 43 chronic schizophrenics, resulting in $\bar{x} = 2.69$ and $s_1 = 2.30$, as well as for 45 normal subjects, resulting in $\bar{y} = 6.35$ and $s_2 = 4.03$. Does this data strongly suggest that true average MAO activity for normal subjects is more than twice the activity level for schizophrenics? Derive a test procedure and carry out the test using $\alpha = .01$. [Hint: $H_0$ and $H_a$ here have a different form from the three standard cases. Let $\mu_1$ and $\mu_2$ refer to true average MAO activity for schizophrenics and normal subjects, respectively, and consider the parameter $\theta = 2\mu_1 - \mu_2$. Write $H_0$ and $H_a$ in terms of $\theta$, estimate $\theta$, and derive $\hat{\sigma}_{\hat{\theta}}$ (“Reduced Monoamine Oxidase Activity in Blood Platelets from Schizophrenic Patients,” Nature, July 28, 1972: 225–226).]

15. a. Show for the upper-tailed test with $\sigma_1$ and $\sigma_2$ known that as either m or n increases, $\beta$ decreases when $\mu_1 - \mu_2 > \Delta_0$.
b. For the case of equal sample sizes ($m = n$) and fixed $\alpha$, what happens to the necessary sample size n as $\beta$ is decreased, where $\beta$ is the desired type II error probability at a fixed alternative?

16. To decide whether two different types of steel have the same true average fracture toughness values, n specimens of each type are tested, yielding the following results:

Type    Sample Average    Sample SD
1       60.1              1.0
2       59.9              1.0

Calculate the P-value for the appropriate two-sample z test, assuming that the data was based on $n = 100$. Then repeat the calculation for $n = 400$. Is the small P-value for $n = 400$ indicative of a difference that has practical significance? Would you have been satisfied with just a report of the P-value? Comment briefly.


9.2 The Two-Sample t Test and Confidence Interval

Values of the population variances will usually not be known to an investigator. In the previous section, we illustrated for large sample sizes the use of a z test and CI in which the sample variances were used in place of the population variances. In fact, for large samples, the CLT allows us to use these methods even when the two populations of interest are not normal.

In practice, though, it will often happen that at least one sample size is small and the population variances have unknown values. Without the CLT at our disposal, we proceed by making specific assumptions about the underlying population distributions. The use of inferential procedures that follow from these assumptions is then restricted to situations in which the assumptions are at least approximately satisfied. We could, for example, assume that both population distributions are members of the Weibull family or that they are both Poisson distributions. It shouldn’t surprise you to learn that normality is typically the most reasonable assumption.

ASSUMPTIONS

Both population distributions are normal, so that $X_1, X_2, \ldots, X_m$ is a random sample from a normal distribution and so is $Y_1, \ldots, Y_n$ (with the X’s and Y’s independent of one another). The plausibility of these assumptions can be judged by constructing a normal probability plot of the $x_i$’s and another of the $y_i$’s.

The test statistic and confidence interval formula are based on the same standardized variable developed in Section 9.1, but the relevant distribution is now t rather than z.

THEOREM

When the population distributions are both normal, the standardized variable

$$T = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}}} \qquad (9.2)$$

has approximately a t distribution with df $\nu$ estimated from the data by

$$\nu = \frac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^2}{\dfrac{(s_1^2/m)^2}{m - 1} + \dfrac{(s_2^2/n)^2}{n - 1}} = \frac{\left[(se_1)^2 + (se_2)^2\right]^2}{\dfrac{(se_1)^4}{m - 1} + \dfrac{(se_2)^4}{n - 1}}$$

where

$$se_1 = \frac{s_1}{\sqrt{m}}, \qquad se_2 = \frac{s_2}{\sqrt{n}}$$

(round $\nu$ down to the nearest integer).

Manipulating T in a probability statement to isolate $\mu_1 - \mu_2$ gives a CI, whereas a test statistic results from replacing $\mu_1 - \mu_2$ by the null value $\Delta_0$.
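The df formula in the theorem is tedious by hand but short in code. The sketch below is an illustration only: the function name `estimated_df` is hypothetical, and the inputs are the summary values that appear in Example 9.6 ($s_1 = .79$, $s_2 = 3.59$, $m = n = 10$).

```python
from math import floor


def estimated_df(s1, m, s2, n):
    """Return (nu, nu rounded down) for the two-sample t variable of the theorem."""
    se1_sq = s1**2 / m          # (se1)^2 = s1^2 / m
    se2_sq = s2**2 / n          # (se2)^2 = s2^2 / n
    nu = (se1_sq + se2_sq) ** 2 / (se1_sq**2 / (m - 1) + se2_sq**2 / (n - 1))
    return nu, floor(nu)        # the text rounds nu down to the nearest integer


nu, df = estimated_df(s1=0.79, m=10, s2=3.59, n=10)
print(f"nu = {nu:.2f}, df used = {df}")
```

Rounding down rather than to the nearest integer gives a slightly larger t critical value, which errs on the conservative side.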


The two-sample t confidence interval for $\mu_1 - \mu_2$ with confidence level $100(1 - \alpha)\%$ is then

$$\bar{x} - \bar{y} \pm t_{\alpha/2,\nu}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

A one-sided confidence bound can be calculated as described earlier. The two-sample t test for testing $H_0: \mu_1 - \mu_2 = \Delta_0$ is as follows:

$$\text{Test statistic value: } t = \frac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

Alternative Hypothesis               Rejection Region for Approximate Level $\alpha$ Test

$H_a: \mu_1 - \mu_2 > \Delta_0$        $t \ge t_{\alpha,\nu}$ (upper-tailed)
$H_a: \mu_1 - \mu_2 < \Delta_0$        $t \le -t_{\alpha,\nu}$ (lower-tailed)
$H_a: \mu_1 - \mu_2 \ne \Delta_0$      either $t \ge t_{\alpha/2,\nu}$ or $t \le -t_{\alpha/2,\nu}$ (two-tailed)

A P-value can be computed as described in Section 8.4 for the one-sample t test.

Example 9.6

The void volume within a textile fabric affects comfort, flammability, and insulation properties. Permeability of a fabric refers to the accessibility of void space to the flow of a gas or liquid. The article “The Relationship Between Porosity and Air Permeability of Woven Textile Fabrics” (J. of Testing and Eval., 1997: 108–114) gave summary information on air permeability (cm³/cm²/sec) for a number of different fabric types. Consider the following data on two different types of plain-weave fabric:

Fabric Type    Sample Size    Sample Mean    Sample Standard Deviation
Cotton         10             51.71          .79
Triacetate     10             136.14         3.59

Assuming that the porosity distributions for both types of fabric are normal, let’s calculate a confidence interval for the difference between true average porosity for the cotton fabric and that for the triacetate fabric, using a 95% confidence level. Before the appropriate t critical value can be selected, df must be determined:

$$\text{df} = \frac{\left(\dfrac{.6241}{10} + \dfrac{12.8881}{10}\right)^2}{\dfrac{(.6241/10)^2}{9} + \dfrac{(12.8881/10)^2}{9}} = \frac{1.8258}{.1850} = 9.87$$

Thus we use $\nu = 9$; Appendix Table A.5 gives $t_{.025,9} = 2.262$. The resulting interval is

$$51.71 - 136.14 \pm (2.262)\sqrt{\frac{.6241}{10} + \frac{12.8881}{10}} = -84.43 \pm 2.63 = (-87.06, -81.80)$$

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

9.2 The Two-Sample t Test and Confidence Interval 359

With a high degree of confidence, we can say that true average porosity for triacetate fabric specimens exceeds that for cotton specimens by between 81.80 and 87.06 cm3/cm2/sec. ■
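The arithmetic of Example 9.6 can be reproduced with a short script. The following is a minimal Python sketch, not part of the original text: the function names `welch_df` and `welch_ci` are illustrative, and the critical value 2.262 is copied from the example (Appendix Table A.5) rather than computed.

```python
import math


def welch_df(s1, m, s2, n):
    """Estimated df for the two-sample t procedures (before rounding down)."""
    v1, v2 = s1**2 / m, s2**2 / n  # squared standard errors of the two sample means
    return (v1 + v2) ** 2 / (v1**2 / (m - 1) + v2**2 / (n - 1))


def welch_ci(xbar, s1, m, ybar, s2, n, t_crit):
    """Two-sided CI for mu1 - mu2; the caller supplies the t critical value."""
    half_width = t_crit * math.sqrt(s1**2 / m + s2**2 / n)
    diff = xbar - ybar
    return diff - half_width, diff + half_width


# Summary data from Example 9.6 (cotton is sample 1, triacetate is sample 2)
df = welch_df(0.79, 10, 3.59, 10)
nu = math.floor(df)  # round the estimated df down to an integer
T_CRIT = 2.262       # t_{.025,9}, taken from the example, not computed here
lower, upper = welch_ci(51.71, 0.79, 10, 136.14, 3.59, 10, T_CRIT)

print(f"df = {df:.2f}, nu = {nu}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Run as written, the script reports df of about 9.87 (so $\nu = 9$) and an interval of about (−87.06, −81.80), in agreement with the hand calculation.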

Example 9.7

The deterioration of many municipal pipeline networks across the country is a growing concern. One technology proposed for pipeline rehabilitation uses a flexible liner threaded through existing pipe. The article “Effect of Welding on a High-Density Polyethylene Liner” (J. of Materials in Civil Engr., 1996: 94–100) reported the following data on tensile strength (psi) of liner specimens both when a certain fusion process was used and when this process was not used.

No fusion    2748 2700 2655 2822 2511 3149 3257 3213 3220 2753
             $m = 10$, $\bar{x} = 2902.8$, $s_1 = 277.3$

Fused        3027 3356 3359 3297 3125 2910 2889 2902
             $n = 8$, $\bar{y} = 3108.1$, $s_2 = 205.9$

Figure 9.2 shows normal probability plots from Minitab. The linear pattern in each plot supports the assumption that the tensile strength distributions under the two conditions are both normal.

Figure 9.2 Normal probability plots from Minitab for the tensile strength data

The authors of the article stated that the fusion process increased the average tensile strength. The message from the comparative boxplot of Figure 9.3 is not all that clear. Let’s carry out a test of hypotheses to see whether the data supports this conclusion.

Figure 9.3 A comparative boxplot of the tensile-strength data (Type 1 and Type 2; horizontal axis: Strength, 2500 to 3400)


1. Let $\mu_1$ be the true average tensile strength of specimens when the no-fusion treatment is used and $\mu_2$ denote the true average tensile strength when the fusion treatment is used.

2. $H_0: \mu_1 - \mu_2 = 0$ (no difference in the true average tensile strengths for the two treatments)

3. $H_a: \mu_1 - \mu_2 < 0$ (true average tensile strength for the no-fusion treatment is less than that for the fusion treatment, so that the investigators’ conclusion is correct)

4. The null value is $\Delta_0 = 0$, so the test statistic value is

$$t = \frac{\bar{x} - \bar{y}}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

5. We now compute both the test statistic value and the df for the test:

$$t = \frac{2902.8 - 3108.1}{\sqrt{\dfrac{(277.3)^2}{10} + \dfrac{(205.9)^2}{8}}} = \frac{-205.3}{113.97} = -1.8$$

Using $s_1^2/m = 7689.529$ and $s_2^2/n = 5299.351$,

$$\nu = \frac{(7689.529 + 5299.351)^2}{(7689.529)^2/9 + (5299.351)^2/7} = \frac{168{,}711{,}003.7}{10{,}581{,}747.35} = 15.94$$

so the test will be based on 15 df.

6. Appendix Table A.8 shows that the area under the 15 df t curve to the right of 1.8 is .046, so the P-value for a lower-tailed test is also .046. The following Minitab output summarizes all the computations:

   Two-sample T for nofusion vs fused

                N    Mean    StDev    SE Mean
   not fused    10   2903    277      88
   fused        8    3108    206      73

   95% C.I. for mu nofusion − mu fused: (−448, 38)
   t-Test mu not fused = mu fused (vs <): T = −1.80  P = 0.046  DF = 15

7. Using a significance level of .05, we can barely reject the null hypothesis in favor of the alternative hypothesis, confirming the conclusion stated in the article. However, someone demanding more compelling evidence might select $\alpha = .01$, a level for which $H_0$ cannot be rejected.

If the question posed had been whether fusing increased true average strength by more than 100 psi, then the relevant hypotheses would have been $H_0: \mu_1 - \mu_2 = -100$ versus $H_a: \mu_1 - \mu_2 < -100$; that is, the null value would have been $\Delta_0 = -100$. ■

Pooled t Procedures

Alternatives to the two-sample t procedures just described result from assuming not only that the two population distributions are normal but also that they have equal variances ($\sigma_1^2 = \sigma_2^2$). That is, the two population distribution curves are assumed normal with equal spreads, the only possible difference between them being where they are centered.

Let $\sigma^2$ denote the common population variance. Then standardizing $\bar{X} - \bar{Y}$ gives

$$Z = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma^2}{m} + \dfrac{\sigma^2}{n}}} = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{\sigma^2\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$$

which has a standard normal distribution. Before this variable can be used as a basis for making inferences about $\mu_1 - \mu_2$, the common variance must be estimated from sample data. One estimator of $\sigma^2$ is $S_1^2$, the variance of the m observations in the first sample, and another is $S_2^2$, the variance of the second sample. Intuitively, a better estimator than either individual sample variance results from combining the two sample variances. A first thought might be to use $(S_1^2 + S_2^2)/2$. However, if $m > n$, then the first sample contains more information about $\sigma^2$ than does the second sample, and an analogous comment applies if $m < n$. The following weighted average of the two sample variances, called the pooled (i.e., combined) estimator of $\sigma^2$, adjusts for any difference between the two sample sizes:

$$S_p^2 = \frac{m - 1}{m + n - 2} \cdot S_1^2 + \frac{n - 1}{m + n - 2} \cdot S_2^2$$

The first sample contributes $m - 1$ degrees of freedom to the estimate of $\sigma^2$, and the second sample contributes $n - 1$ df, for a total of $m + n - 2$ df. Statistical theory says that if $S_p^2$ replaces $\sigma^2$ in the expression for Z, the resulting standardized variable has a t distribution based on $m + n - 2$ df. In the same way that earlier standardized variables were used as a basis for deriving confidence intervals and test procedures, this t variable immediately leads to the pooled t CI for estimating $\mu_1 - \mu_2$ and the pooled t test for testing hypotheses about a difference between means.

In the past, many statisticians recommended these pooled t procedures over the two-sample t procedures. The pooled t test, for example, can be derived from the likelihood ratio principle, whereas the two-sample t test is not a likelihood ratio test. Furthermore, the significance level for the pooled t test is exact, whereas it is only approximate for the two-sample t test. However, recent research has shown that although the pooled t test does outperform the two-sample t test by a bit (smaller $\beta$’s for the same $\alpha$) when $\sigma_1^2 = \sigma_2^2$, the former test can easily lead to erroneous conclusions if applied when the variances are different. Analogous comments apply to the behavior of the two confidence intervals. That is, the pooled t procedures are not robust to violations of the equal variance assumption.

It has been suggested that one could carry out a preliminary test of $H_0: \sigma_1^2 = \sigma_2^2$ and use a pooled t procedure if this null hypothesis is not rejected. Unfortunately, the usual “F test” of equal variances (Section 9.5) is quite sensitive to the assumption of normal population distributions, much more so than t procedures. We therefore recommend the conservative approach of using two-sample t procedures unless there is really compelling evidence for doing otherwise, particularly when the two sample sizes are different.

Type II Error Probabilities

Determining type II error probabilities (or equivalently, power $= 1 - \beta$) for the two-sample t test is complicated. There does not appear to be any simple way to use the $\beta$ curves of Appendix Table A.17. The most recent version of Minitab (Version 16) will calculate power for the pooled t test but not for the two-sample t test.

However, the UCLA Statistics Department homepage (http://www.stat.ucla.edu) permits access to a power calculator that will do this. For example, we specified $m = 10$, $n = 8$, $\sigma_1 = 300$, $\sigma_2 = 225$ (these are the sample sizes for Example 9.7, whose sample standard deviations are somewhat smaller than these values of $\sigma_1$ and $\sigma_2$) and asked for the power of a two-tailed level .05 test of $H_0: \mu_1 - \mu_2 = 0$ when $\mu_1 - \mu_2 = 100$, 250, and 500. The resulting values of the power were .1089, .4609, and .9635 (corresponding to $\beta = .89$, .54, and .04), respectively. In general, $\beta$ will decrease as the sample sizes increase, as $\alpha$ increases, and as $\mu_1 - \mu_2$ moves farther from 0. The software will also calculate sample sizes necessary to obtain a specified value of power for a particular value of $\mu_1 - \mu_2$.
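As a numerical cross-check on Example 9.7 and on the pooled alternative described in this section, the sketch below computes both test statistics from the summary data. It is a minimal Python illustration under stated assumptions, not part of the original text: the helper names (`t_pdf`, `t_cdf`, `two_sample_t`, `pooled_t`) are hypothetical, and the area under the t curve is obtained by Simpson's-rule numerical integration rather than from Appendix Table A.8.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def two_sample_t(xbar, s1, m, ybar, s2, n, delta0=0.0):
    """Two-sample t statistic with the estimated df rounded down."""
    v1, v2 = s1**2 / m, s2**2 / n
    t = (xbar - ybar - delta0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (m - 1) + v2**2 / (n - 1))
    return t, math.floor(df)


def pooled_t(xbar, s1, m, ybar, s2, n, delta0=0.0):
    """Pooled t statistic; assumes equal population variances."""
    df = m + n - 2
    sp2 = ((m - 1) * s1**2 + (n - 1) * s2**2) / df  # pooled variance estimate
    t = (xbar - ybar - delta0) / math.sqrt(sp2 * (1 / m + 1 / n))
    return t, df


# Summary data from Example 9.7 (no fusion is sample 1, fused is sample 2)
data = dict(xbar=2902.8, s1=277.3, m=10, ybar=3108.1, s2=205.9, n=8)
t_w, df_w = two_sample_t(**data)
t_p, df_p = pooled_t(**data)
p_w = t_cdf(t_w, df_w)  # lower-tailed P-value for Ha: mu1 - mu2 < 0
p_p = t_cdf(t_p, df_p)

print(f"two-sample t: t = {t_w:.2f}, df = {df_w}, P = {p_w:.3f}")
print(f"pooled t:     t = {t_p:.2f}, df = {df_p}, P = {p_p:.3f}")
```

With these inputs the two-sample statistic is about −1.80 on 15 df with a lower-tailed P-value near .046, matching Example 9.7, while the pooled statistic is about −1.74 on 16 df. The two procedures give similar answers here because the sample standard deviations and sample sizes are not very different.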

EXERCISES Section 9.2 (17–35)

17. Determine the number of degrees of freedom for the two-sample t test or CI in each of the following situations:
a. $m = 10$, $n = 10$, $s_1 = 5.0$, $s_2 = 6.0$
b. $m = 10$, $n = 15$, $s_1 = 5.0$, $s_2 = 6.0$
c. $m = 10$, $n = 15$, $s_1 = 2.0$, $s_2 = 6.0$
d. $m = 12$, $n = 24$, $s_1 = 5.0$, $s_2 = 6.0$

18. Let $\mu_1$ and $\mu_2$ denote true average densities for two different types of brick. Assuming normality of the two density distributions, test $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 \ne 0$ using the following data: $m = 6$, $\bar{x} = 22.73$, $s_1 = .164$, $n = 5$, $\bar{y} = 21.95$, and $s_2 = .240$.

19. Suppose $\mu_1$ and $\mu_2$ are true mean stopping distances at 50 mph for cars of a certain type equipped with two different types of braking systems. Use the two-sample t test at significance level .01 to test $H_0: \mu_1 - \mu_2 = -10$ versus $H_a: \mu_1 - \mu_2 < -10$ for the following data: $m = 6$, $\bar{x} = 115.7$, $s_1 = 5.03$, $n = 6$, $\bar{y} = 129.3$, and $s_2 = 5.38$.

20. Use the data of Exercise 19 to calculate a 95% CI for the difference between true average stopping distance for cars equipped with system 1 and cars equipped with system 2. Does the interval suggest that precise information about the value of this difference is available?

21. Quantitative noninvasive techniques are needed for routinely assessing symptoms of peripheral neuropathies, such as carpal tunnel syndrome (CTS). The article “A Gap Detection Tactility Test for Sensory Deficits Associated with Carpal Tunnel Syndrome” (Ergonomics, 1995: 2588–2601) reported on a test that involved sensing a tiny gap in an otherwise smooth surface by probing with a finger; this functionally resembles many work-related tactile activities, such as detecting scratches or surface defects. When finger probing was not allowed, the sample average gap detection threshold for $m = 8$ normal subjects was 1.71 mm, and the sample standard deviation was .53; for $n = 10$ CTS subjects, the sample mean and sample standard deviation were 2.53 and .87, respectively. Does this data suggest that the true average gap detection threshold for CTS subjects exceeds that for normal subjects? State and test the relevant hypotheses using a significance level of .01.

22. The slant shear test is widely accepted for evaluating the bond of resinous repair materials to concrete; it utilizes cylinder specimens made of two identical halves bonded at 30°. The article “Testing the Bond Between Repair Materials and Concrete Substrate” (ACI Materials J., 1996: 553–558) reported that for 12 specimens prepared using wire-brushing, the sample mean shear strength (N/mm²) and sample standard deviation were 19.20 and 1.58, respectively, whereas for 12 hand-chiseled specimens, the corresponding values were 23.13 and 4.01. Does the true average strength appear to be different for the two methods of surface preparation? State and test the relevant hypotheses using a significance level of .05. What are you assuming about the shear strength distributions?

23. Fusible interlinings are being used with increasing frequency to support outer fabrics and improve the shape and drape of various pieces of clothing. The article “Compatibility of Outer and Fusible Interlining Fabrics in Tailored Garments” (Textile Res. J., 1997: 137–142) gave the accompanying data on extensibility (%) at 100 gm/cm for both high-quality (H) fabric and poor-quality (P) fabric specimens.

H   1.2  .9  .7 1.0 1.7 1.7 1.1  .9 1.7 1.9 1.3 2.1
    1.6 1.8 1.4 1.3 1.9 1.6  .8 2.0 1.7 1.6 2.3 2.0

P   1.6 1.5 1.1 2.1 1.5 1.3 1.0 2.6

a. Construct normal probability plots to verify the plausibility of both samples having been selected from normal population distributions.
b. Construct a comparative boxplot. Does it suggest that there is a difference between true average extensibility for high-quality fabric specimens and that for poor-quality specimens?
c. The sample mean and standard deviation for the high-quality sample are 1.508 and .444, respectively, and those for the poor-quality sample are 1.588 and .530. Use the two-sample t test to decide whether true average extensibility differs for the two types of fabric.

24. Damage to grapes from bird predation is a serious problem for grape growers. The article “Experimental Method to Investigate and Monitor Bird Behavior and Damage to Vineyards” (Amer. J. of Enology and Viticulture, 2004: 288–291) reported on an experiment involving a bird-feeder table, time-lapse video, and artificial foods. Information was collected for two different bird species at both the experimental location and at a natural vineyard setting. Consider the following data on time (sec) spent on a single visit to the location.

Species       Location    n     $\bar{x}$    SE mean
Blackbirds    Exptl       65    13.4         2.05
Blackbirds    Natural     50    9.7          1.76
Silvereyes    Exptl       34    49.4         4.78
Silvereyes    Natural     46    38.4         5.06

a. Calculate an upper confidence bound for the true average time that blackbirds spend on a single visit at the experimental location.
b. Does it appear that true average time spent by blackbirds at the experimental location exceeds the true average time birds of this type spend at the natural location? Carry out a test of appropriate hypotheses.
c. Estimate the difference between the true average time blackbirds spend at the natural location and true average time that silvereyes spend at the natural location, and do so in a way that conveys information about reliability and precision.

[Note: The sample medians reported in the article all seemed significantly smaller than the means, suggesting substantial population distribution skewness. The authors actually used the distribution-free test procedure presented in Section 2 of Chapter 15.]

25. Low-back pain (LBP) is a serious health problem in many industrial settings. The article “Isodynamic Evaluation of Trunk Muscles and Low-Back Pain Among Workers in a Steel Factory” (Ergonomics, 1995: 2107–2117) reported the accompanying summary data on lateral range of motion (degrees) for a sample of workers without a history of LBP and another sample with a history of this malady.

Condition    Sample Size    Sample Mean    Sample SD
No LBP       28             91.5           5.5
LBP          31             88.3           7.8

Calculate a 90% confidence interval for the difference between population mean extent of lateral motion for the two conditions. Does the interval suggest that population mean lateral motion differs for the two conditions? Is the message different if a confidence level of 95% is used?

26. The article “The Influence of Corrosion Inhibitor and Surface Abrasion on the Failure of Aluminum-Wired Twist-On Connections” (IEEE Trans. on Components, Hybrids, and Manuf. Tech., 1984: 20–25) reported data on potential drop measurements for one sample of connectors wired with alloy aluminum and another sample wired with EC aluminum. Does the accompanying SAS output suggest that the true average potential drop for alloy connections (type 1) is higher than that for EC connections (as stated in the article)? Carry out the appropriate test using a significance level of .01. In reaching your conclusion, what type of error might you have committed? [Note: SAS reports the P-value for a two-tailed test.]

Type    N     Mean           Std Dev       Std Error
1       20    17.49900000    0.55012821    0.12301241
2       20    16.90000000    0.48998389    0.10956373

Variances    T         DF      Prob>|T|
Unequal      3.6362    37.5    0.0008
Equal        3.6362    38.0    0.0008

27. Anorexia Nervosa (AN) is a psychiatric condition leading to substantial weight loss among women who are fearful of becoming fat. The article “Adipose Tissue Distribution After Weight Restoration and Weight Maintenance in Women with Anorexia Nervosa” (Amer. J. of Clinical Nutr., 2009: 1132–1137) used whole-body magnetic resonance imagery to determine various tissue characteristics for both an AN sample of individuals who had undergone acute weight restoration and maintained their weight for a year and a comparable (at the outset of the study) control sample. Here is summary data on intermuscular adipose tissue (IAT; kg).

Condition    Sample Size    Sample Mean    Sample SD
AN           16             .52            .26
Control      8              .35            .15

Assume that both samples were selected from normal distributions.
a. Calculate an estimate for true average IAT under the described AN protocol, and do so in a way that conveys information about the reliability and precision of the estimation.
b. Calculate an estimate for the difference between true average AN IAT and true average control IAT, and do so in a way that conveys information about the reliability and precision of the estimation. What does your estimate suggest about true average AN IAT relative to true average control IAT?

28. As the population ages, there is increasing concern about accident-related injuries to the elderly. The article “Age and Gender Differences in Single-Step Recovery from a Forward Fall” (J. of Gerontology, 1999: M44–M50) reported on an experiment in which the maximum lean angle (the furthest a subject is able to lean and still recover in one step) was determined for both a sample of younger females (21–29 years) and a sample of older females (67–81 years). The following observations are consistent with summary data given in the article:

YF: 29, 34, 33, 27, 28, 32, 31, 34, 32, 27
OF: 18, 15, 23, 13, 12

Does the data suggest that true average maximum lean angle for older females is more than 10 degrees smaller than it is for younger females? State and test the relevant hypotheses at significance level .10 by obtaining a P-value.

29. The article “Effect of Internal Gas Pressure on the Compression Strength of Beverage Cans and Plastic Bottles” (J. of Testing and Evaluation, 1993: 129–131) includes the accompanying data on compression strength (lb) for a sample of 12-oz aluminum cans filled with strawberry drink and another sample filled with cola. Does the data suggest that the extra carbonation of cola results in a higher average compression strength? Base your answer on a P-value. What assumptions are necessary for your analysis?

Beverage            Sample Size    Sample Mean    Sample SD
Strawberry drink    15             540            21
Cola                15             554            15

30. The article “Flexure of Concrete Beams Reinforced with Advanced Composite Orthogrids” (J. of Aerospace Engr., 1997: 7–15) gave the accompanying data on ultimate load (kN) for two different types of beams.

Type                      Sample Size    Sample Mean    Sample SD
Fiberglass grid           26             33.4           2.2
Commercial carbon grid    26             42.8           4.3

a. Assuming that the underlying distributions are normal, calculate and interpret a 99% CI for the difference between true average load for the fiberglass beams and that for the carbon beams.
b. Does the upper limit of the interval you calculated in part (a) give a 99% upper confidence bound for the difference between the two $\mu$’s? If not, calculate such a bound. Does it strongly suggest that true average load for the carbon beams is more than that for the fiberglass beams? Explain.

31. Refer to Exercise 33 in Section 7.3. The cited article also gave the following observations on degree of polymerization for specimens having viscosity times concentration in a higher range:

429 430 430 431 436 437 440 441 445 446 447

a. Construct a comparative boxplot for the two samples, and comment on any interesting features.
b. Calculate a 95% confidence interval for the difference between true average degree of polymerization for the middle range and that for the high range. Does the interval suggest that $\mu_1$ and $\mu_2$ may in fact be different? Explain your reasoning.

32. The degenerative disease osteoarthritis most frequently affects weight-bearing joints such as the knee. The article “Evidence of Mechanical Load Redistribution at the Knee Joint in the Elderly when Ascending Stairs and Ramps” (Annals of Biomed. Engr., 2008: 467–476) presented the following summary data on stance duration (ms) for samples of both older and younger adults.

Age        Sample Size    Sample Mean    Sample SD
Older      28             801            117
Younger    16             780            72

Assume that both stance duration distributions are normal.
a. Calculate and interpret a 99% CI for true average stance duration among elderly individuals.
b. Carry out a test of hypotheses at significance level .05 to decide whether true average stance duration is larger among elderly individuals than among younger individuals.

33. The article “The Effects of a Low-Fat, Plant-Based Dietary Intervention on Body Weight, Metabolism, and Insulin Sensitivity in Postmenopausal Women” (Amer. J. of Med., 2005: 991–997) reported on the results of an experiment in which half of the individuals in a group of 64 postmenopausal overweight women were randomly assigned to a particular vegan diet, and the other half received a diet based on National Cholesterol Education Program guidelines. The sample mean decrease in body weight for those on the vegan diet was 5.8 kg, and the sample SD was 3.2, whereas for those on the control diet, the sample mean weight loss and standard deviation were 3.8 and 2.8, respectively. Does it appear the true average weight loss for the vegan diet exceeds that for the control diet by more than 1 kg? Carry out an appropriate test of hypotheses at significance level .05 based on calculating a P-value.

34. Consider the pooled t variable

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{\dfrac{1}{m} + \dfrac{1}{n}}}$$

which has a t distribution with $m + n - 2$ df when both population distributions are normal with $\sigma_1 = \sigma_2$ (see the Pooled t Procedures subsection for a description of $S_p$).
a. Use this t variable to obtain a pooled t confidence interval formula for $\mu_1 - \mu_2$.
b. A sample of ultrasonic humidifiers of one particular brand was selected for which the observations on maximum output of moisture (oz) in a controlled chamber were 14.0, 14.3, 12.2, and 15.1. A sample of the second brand gave output values 12.1, 13.6, 11.9, and 11.2 (“Multiple Comparisons of Means Using Simultaneous Confidence Intervals,” J. of Quality Technology, 1989: 232–241). Use the pooled t formula from part (a) to estimate the difference between true average outputs for the two brands with a 95% confidence interval.
c. Estimate the difference between the two $\mu$’s using the two-sample t interval discussed in this section, and compare it to the interval of part (b).

35. Refer to Exercise 34. Describe the pooled t test for testing $H_0: \mu_1 - \mu_2 = \Delta_0$ when both population distributions are normal with $\sigma_1 = \sigma_2$. Then use this test procedure to test the hypotheses suggested in Exercise 33.

9.3 Analysis of Paired Data

In Sections 9.1 and 9.2, we considered making an inference about a difference between two means $\mu_1$ and $\mu_2$. This was done by utilizing the results of a random sample $X_1, X_2, \ldots, X_m$ from the distribution with mean $\mu_1$ and a completely independent (of the X’s) sample $Y_1, \ldots, Y_n$ from the distribution with mean $\mu_2$. That is, either m individuals were selected from population 1 and n different individuals from population 2, or m individuals (or experimental objects) were given one treatment and another set of n individuals were given the other treatment. In contrast, there are a number of experimental situations in which there is only one set of n individuals or experimental objects; making two observations on each one results in a natural pairing of values.

Example 9.8

Trace metals in drinking water affect the flavor, and unusually high concentrations can pose a health hazard. The article “Trace Metals of South Indian River” (Envir. Studies, 1982: 62–66) reports on a study in which six river locations were selected (six experimental objects) and the zinc concentration (mg/L) determined for both surface water and bottom water at each location. The six pairs of observations are displayed in the accompanying table. Does the data suggest that true average concentration in bottom water exceeds that of surface water?

                                           Location
                                           1       2       3       4       5       6
Zinc concentration in bottom water (x)     .430    .266    .567    .531    .707    .716
Zinc concentration in surface water (y)    .415    .238    .390    .410    .605    .609
Difference (x − y)                         .015    .028    .177    .121    .102    .107

Figure 9.4(a) displays a plot of this data. At first glance, there appears to be little difference between the x and y samples. From location to location, there is a great deal of variability in each sample, and it looks as though any differences between the samples can be attributed to this variability. However, when the observations are identified by location, as in Figure 9.4(b), a different view emerges. At each location, bottom concentration exceeds surface concentration. This is confirmed by the fact that all $x - y$ differences displayed in the bottom row of the data table are positive. A correct analysis of this data focuses on these differences.

Figure 9.4 Plot of paired data from Example 9.8: (a) observations not identified by location; (b) observations identified by location ■

ASSUMPTIONS

The data consists of n independently selected pairs $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$, with $E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$. Let $D_1 = X_1 - Y_1$, $D_2 = X_2 - Y_2, \ldots, D_n = X_n - Y_n$, so the $D_i$’s are the differences within pairs. Then the $D_i$’s are assumed to be normally distributed with mean value $\mu_D$ and variance $\sigma_D^2$ (this is usually a consequence of the $X_i$’s and $Y_i$’s themselves being normally distributed).

We are again interested in making an inference about the difference $\mu_1 - \mu_2$. The two-sample t confidence interval and test statistic were obtained by assuming independent samples and applying the rule $V(\bar{X} - \bar{Y}) = V(\bar{X}) + V(\bar{Y})$. However, with paired data, the X and Y observations within each pair are often not independent, so $\bar{X}$ and $\bar{Y}$ are not independent of one another. We must therefore abandon the two-sample t procedures and look for an alternative method of analysis.

The Paired t Test

Because different pairs are independent, the $D_i$’s are independent of one another. Let $D = X - Y$, where X and Y are the first and second observations, respectively, within an arbitrary pair. Then the expected difference is

$$\mu_D = E(X - Y) = E(X) - E(Y) = \mu_1 - \mu_2$$

(the rule of expected values used here is valid even when X and Y are dependent). Thus any hypothesis about $\mu_1 - \mu_2$ can be phrased as a hypothesis about the mean difference $\mu_D$. But since the $D_i$’s constitute a normal random sample (of differences) with mean $\mu_D$, hypotheses about $\mu_D$ can be tested using a one-sample t test. That is, to test hypotheses about $\mu_1 - \mu_2$ when data is paired, form the differences $D_1, D_2, \ldots, D_n$ and carry out a one-sample t test (based on $n - 1$ df) on these differences.

The Paired t Test

Null hypothesis: $H_0: \mu_D = \Delta_0$ (where $D = X - Y$ is the difference between the first and second observations within a pair, and $\mu_D = \mu_1 - \mu_2$)

Test statistic value: $t = \dfrac{\bar{d} - \Delta_0}{s_D/\sqrt{n}}$ (where $\bar{d}$ and $s_D$ are the sample mean and standard deviation, respectively, of the $d_i$’s)

Alternative Hypothesis         Rejection Region for Level $\alpha$ Test
$H_a: \mu_D > \Delta_0$        $t \ge t_{\alpha,n-1}$
$H_a: \mu_D < \Delta_0$        $t \le -t_{\alpha,n-1}$
$H_a: \mu_D \ne \Delta_0$      either $t \ge t_{\alpha/2,n-1}$ or $t \le -t_{\alpha/2,n-1}$

A P-value can be calculated as was done for earlier t tests.
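To see the paired t procedure in action, the sketch below applies it to the zinc data of Example 9.8, testing $H_0: \mu_D = 0$ against $H_a: \mu_D > 0$ with differences taken as bottom minus surface. This is an illustrative Python sketch, not part of the original text: the helper names (`t_pdf`, `t_cdf`, `paired_t`) are hypothetical, and the upper-tail area is found by Simpson's-rule integration of the t density instead of Appendix Table A.8.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x): Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def paired_t(x, y, delta0=0.0):
    """One-sample t statistic computed on the within-pair differences."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar, s_d = mean(d), stdev(d)  # stdev uses the n - 1 divisor
    t = (d_bar - delta0) / (s_d / math.sqrt(n))
    return t, n - 1, d_bar, s_d


# Zinc concentrations (mg/L) at the six locations of Example 9.8
bottom = [.430, .266, .567, .531, .707, .716]
surface = [.415, .238, .390, .410, .605, .609]

t, df, d_bar, s_d = paired_t(bottom, surface)
p_upper = 1 - t_cdf(t, df)  # upper-tailed P-value for Ha: mu_D > 0

print(f"d_bar = {d_bar:.4f}, s_D = {s_d:.4f}")
print(f"t = {t:.2f}, df = {df}, upper-tailed P = {p_upper:.4f}")
```

The differences have mean about .0917 and standard deviation about .0607, giving t of roughly 3.70 on 5 df. Because 3.70 lies between the tabled critical values $t_{.01,5} = 3.365$ and $t_{.005,5} = 4.032$, the upper-tailed P-value falls between .005 and .01, which supports the impression from Figure 9.4(b) that bottom-water concentration is higher on average.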

Musculoskeletal neck-and-shoulder disorders are all too common among office staff who perform repetitive tasks using visual display units. The article “Upper-Arm Elevation During Office Work” (Ergonomics, 1996: 1221–1230) reported on a study to determine whether more varied work conditions would have any impact on arm movement. The accompanying data was obtained from a sample of $n = 16$ subjects. Each observation is the amount of time, expressed as a proportion of total time observed, during which arm elevation was below 30°. The two measurements from each subject were obtained 18 months apart. During this period, work conditions were changed, and subjects were allowed to engage in a wider variety of work tasks. Does the data suggest that true average time during which elevation is below 30° differs after the change from what it was before the change?

Subject       1    2    3    4    5    6    7    8
Before       81   87   86   82   90   86   96   73
After        78   91   78   78   84   67   92   70
Difference    3   −4    8    4    6   19    4    3

Subject       9   10   11   12   13   14   15   16
Before       74   75   72   80   66   72   56   82
After        58   62   70   58   66   60   65   73
Difference   16   13    2   22    0   12   −9    9

Figure 9.5 shows a normal probability plot of the 16 differences; the pattern in the plot is quite straight, supporting the normality assumption. A boxplot of these differences appears in Figure 9.6; the boxplot is located considerably to the right of zero, suggesting that perhaps $\mu_D > 0$ (note also that 13 of the 16 differences are positive and only two are negative).

Figure 9.5 A normal probability plot from Minitab of the differences in Example 9.9 (Average: 6.75, Std Dev: 8.23408, N: 16; W-test for Normality: R = 0.9916, approximate P-value > 0.1000)


Figure 9.6 A boxplot of the differences in Example 9.9 (horizontal axis: Difference, from −10 to 20)

Let’s now test the appropriate hypotheses.

1. Let $\mu_D$ denote the true average difference between elevation time before the change in work conditions and time after the change.

2. $H_0\colon \mu_D = 0$ (there is no difference between true average time before the change and true average time after the change)

3. $H_a\colon \mu_D \ne 0$

4. $t = \dfrac{\bar d - 0}{s_D/\sqrt{n}} = \dfrac{\bar d}{s_D/\sqrt{n}}$

5. $n = 16$, $\sum d_i = 108$, and $\sum d_i^2 = 1746$, from which $\bar d = 6.75$, $s_D = 8.234$, and

   $t = \dfrac{6.75}{8.234/\sqrt{16}} = 3.28 \approx 3.3$

6. Appendix Table A.8 shows that the area to the right of 3.3 under the t curve with 15 df is .002. The inequality in $H_a$ implies that a two-tailed test is appropriate, so the P-value is approximately $2(.002) = .004$ (Minitab gives .0051).

7. Since $.004 < .01$, the null hypothesis can be rejected at either significance level .05 or .01. It does appear that the true average difference between times is something other than zero; that is, true average time after the change is different from that before the change. ■

When the number of pairs is large, the assumption of a normal difference distribution is not necessary. The CLT validates the resulting z test.

The Paired t Confidence Interval

In the same way that the t CI for a single population mean $\mu$ is based on the t variable $T = (\bar X - \mu)/(S/\sqrt{n})$, a t confidence interval for $\mu_D\ (= \mu_1 - \mu_2)$ is based on the fact that

$$T = \frac{\bar D - \mu_D}{S_D/\sqrt{n}}$$

has a t distribution with $n - 1$ df. Manipulation of this t variable, as in previous derivations of CIs, yields the following $100(1 - \alpha)\%$ CI:

The paired t CI for $\mu_D$ is

$$\bar d \pm t_{\alpha/2,n-1} \cdot s_D/\sqrt{n}$$

A one-sided confidence bound results from retaining the relevant sign and replacing $t_{\alpha/2}$ by $t_\alpha$.
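The arithmetic of the paired t test in Example 9.9 can be checked with a short script. The following is only a sketch: the function names (`t_pdf`, `t_upper_tail`, `paired_t`) are illustrative choices, not anything from the text, and the tail area is obtained by numerically integrating the t density with Simpson's rule rather than by consulting Appendix Table A.8.

```python
import math
import statistics

# Data from Example 9.9 (time with arm elevation below 30 degrees).
before = [81, 87, 86, 82, 90, 86, 96, 73, 74, 75, 72, 80, 66, 72, 56, 82]
after = [78, 91, 78, 78, 84, 67, 92, 70, 58, 62, 70, 58, 66, 60, 65, 73]


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t >= 0 under the t curve.

    Uses Simpson's rule on [0, t] (steps must be even) and the symmetry
    of the t density about zero.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def paired_t(x, y, delta0=0.0):
    """Return (d_bar, s_D, t, two-tailed P-value) for paired samples x, y."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)  # sample standard deviation of the differences
    t = (d_bar - delta0) / (s_d / math.sqrt(n))
    p_two_tailed = 2 * t_upper_tail(abs(t), n - 1)
    return d_bar, s_d, t, p_two_tailed


d_bar, s_d, t_stat, p_value = paired_t(before, after)
print(f"d_bar = {d_bar:.2f}, s_D = {s_d:.3f}, t = {t_stat:.2f}, P-value = {p_value:.4f}")
```

Because the P-value here comes from the unrounded t statistic rather than from the table entry for 3.3, it should land near the Minitab figure quoted in the example rather than at exactly .004.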

When n is small, the validity of this interval requires that the distribution of differences be at least approximately normal. For large n, the CLT ensures that the resulting z interval is valid without any restrictions on the distribution of differences.

Example 9.10

Adding computerized medical images to a database promises to provide great resources for physicians. However, there are other methods of obtaining such information, so the issue of efficiency of access needs to be investigated. The article “The Comparative Effectiveness of Conventional and Digital Image Libraries” (J. of Audiovisual Media in Medicine, 2001: 8–15) reported on an experiment in which 13 computer-proficient medical professionals were timed both while retrieving an image from a library of slides and while retrieving the same image from a computer database with a Web front end.

Subject       1   2   3   4   5   6   7   8   9  10  11  12  13
Slide        30  35  40  25  20  30  35  62  40  51  25  42  33
Digital      25  16  15  15  10  20   7  16  15  13  11  19  19
Difference    5  19  25  10  10  10  28  46  25  38  14  23  14

Let $\mu_D$ denote the true mean difference between slide retrieval time (sec) and digital retrieval time. Using the paired t confidence interval to estimate $\mu_D$ requires that the difference distribution be at least approximately normal. The linear pattern of points in the normal probability plot from Minitab (Figure 9.7) validates the normality assumption. (Only 9 points appear because of ties in the differences.)

Figure 9.7 Normal probability plot of the differences in Example 9.10 (Average: 20.5385, StDev: 11.9625, N: 13; W-test for Normality: R = 0.9724, approximate P-value > 0.1000)

Relevant summary quantities are $\sum d_i = 267$, $\sum d_i^2 = 7201$, from which $\bar d = 20.5$, $s_D = 11.96$. The t critical value required for a 95% confidence level is $t_{.025,12} = 2.179$, and the 95% CI is

$$\bar d \pm t_{\alpha/2,n-1} \cdot \frac{s_D}{\sqrt{n}} = 20.5 \pm (2.179) \cdot \frac{11.96}{\sqrt{13}} = 20.5 \pm 7.2 = (13.3, 27.7)$$

We can be highly confident (at the 95% confidence level) that $13.3 < \mu_D < 27.7$. This interval is rather wide, a consequence of the sample standard deviation being large relative to the sample mean. A sample size much larger than 13 would be required to estimate with substantially more precision. Notice, however, that 0 lies well outside the interval, suggesting that $\mu_D > 0$; this is confirmed by a formal test of hypotheses. ■
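The interval in Example 9.10 can be recomputed directly from the raw retrieval times. This is a minimal sketch; `paired_t_ci` is a hypothetical helper name, and the critical value 2.179 is passed in by hand as the tabled $t_{.025,12}$ rather than computed.

```python
import math
import statistics

# Retrieval times (sec) from Example 9.10.
slide = [30, 35, 40, 25, 20, 30, 35, 62, 40, 51, 25, 42, 33]
digital = [25, 16, 15, 15, 10, 20, 7, 16, 15, 13, 11, 19, 19]


def paired_t_ci(x, y, t_crit):
    """Two-sided paired t interval: d_bar +/- t_crit * s_D / sqrt(n).

    t_crit is the critical value t_{alpha/2, n-1}, supplied by the caller.
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = statistics.mean(d)
    half_width = t_crit * statistics.stdev(d) / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width


lower, upper = paired_t_ci(slide, digital, t_crit=2.179)
print(f"95% CI for mu_D: ({lower:.2f}, {upper:.2f})")
```

Working from unrounded summary values, the upper limit comes out slightly above the 27.7 shown in the example, which rounds $\bar d$ to 20.5 and the half-width to 7.2 before adding.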

Paired Data and Two-Sample t Procedures

Consider using the two-sample t test on paired data. The numerators of the two test statistics are identical, since $\bar d = \sum d_i/n = [\sum (x_i - y_i)]/n = (\sum x_i)/n - (\sum y_i)/n = \bar x - \bar y$. The difference between the statistics is due entirely to the denominators. Each test statistic is obtained by standardizing $\bar X - \bar Y\ (= \bar D)$. But in the presence of dependence the two-sample t standardization is incorrect. To see this, recall from Section 5.5 that

$$V(X \pm Y) = V(X) + V(Y) \pm 2\,\mathrm{Cov}(X, Y)$$

The correlation between X and Y is

$$\rho = \mathrm{Corr}(X, Y) = \mathrm{Cov}(X, Y)/[\sqrt{V(X)} \cdot \sqrt{V(Y)}]$$

It follows that

$$V(X - Y) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$$

Applying this to $\bar X - \bar Y$ yields

$$V(\bar X - \bar Y) = V(\bar D) = V\left(\frac{1}{n}\sum D_i\right) = \frac{V(D_i)}{n} = \frac{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}{n}$$

The two-sample t test is based on the assumption of independence, in which case $\rho = 0$. But in many paired experiments, there will be a strong positive dependence between X and Y (large X associated with large Y), so that $\rho$ will be positive and the variance of $\bar X - \bar Y$ will be smaller than $\sigma_1^2/n + \sigma_2^2/n$. Thus whenever there is positive dependence within pairs, the denominator for the paired t statistic should be smaller than for t of the independent-samples test. Often two-sample t will be much closer to zero than paired t, considerably understating the significance of the data.

Similarly, when data is paired, the paired t CI will usually be narrower than the (incorrect) two-sample t CI. This is because there is typically much less variability in the differences than in the x and y values.

Paired Versus Unpaired Experiments

In our examples, paired data resulted from two observations on the same subject (Example 9.9) or experimental object (location in Example 9.8). Even when this cannot be done, paired data with dependence within pairs can be obtained by matching individuals or objects on one or more characteristics thought to influence responses. For example, in a medical experiment to compare the efficacy of two drugs for lowering blood pressure, the experimenter’s budget might allow for the treatment of 20 patients. If 10 patients are randomly selected for treatment with the first drug and another 10 independently selected for treatment with the second drug, an independent-samples experiment results.

However, the experimenter, knowing that blood pressure is influenced by age and weight, might decide to create pairs of patients so that within each of the resulting 10 pairs, age and weight were approximately equal (though there might be sizable differences between pairs). Then each drug would be given to a different patient within each pair for a total of 10 observations on each drug.

Without this matching (or “blocking”), one drug might appear to outperform the other just because patients in one sample were lighter and younger and thus more susceptible to a decrease in blood pressure than the heavier and older patients in the second sample. However, there is a price to be paid for pairing—a smaller number of degrees of freedom for the paired analysis—so we must ask when one type of experiment should be preferred to the other.

There is no straightforward and precise answer to this question, but there are some useful guidelines. If we have a choice between two t tests that are both valid (and carried out at the same level of significance $\alpha$), we should prefer the test that has the larger number of degrees of freedom. The reason for this is that a larger number of degrees of freedom means smaller $\beta$ for any fixed alternative value of the parameter or parameters. That is, for a fixed type I error probability, the probability of a type II error is decreased by increasing degrees of freedom.

However, if the experimental units are quite heterogeneous in their responses, it will be difficult to detect small but significant differences between two treatments. This is essentially what happened in the data set in Example 9.8; for both “treatments” (bottom water and surface water), there is great between-location variability, which tends to mask differences in treatments within locations. If there is a high positive correlation within experimental units or subjects, the variance of $\bar D = \bar X - \bar Y$ will be much smaller than the unpaired variance. Because of this reduced variance, it will be easier to detect a difference with paired samples than with independent samples. The pros and cons of pairing can now be summarized as follows.

1. If there is great heterogeneity between experimental units and a large correlation within experimental units (large positive $\rho$), then the loss in degrees of freedom will be compensated for by the increased precision associated with pairing, so a paired experiment is preferable to an independent-samples experiment.

2. If the experimental units are relatively homogeneous and the correlation within pairs is not large, the gain in precision due to pairing will be outweighed by the decrease in degrees of freedom, so an independent-samples experiment should be used.

Of course, values of $\sigma_1^2$, $\sigma_2^2$, and $\rho$ will not usually be known very precisely, so an investigator will be required to make an educated guess as to whether Situation 1 or 2 obtains. In general, if the number of observations that can be obtained is large, then a loss in degrees of freedom (e.g., from 40 to 20) will not be serious; but if the number is small, then the loss (say, from 16 to 8) because of pairing may be serious if not compensated for by increased precision. Similar considerations apply when choosing between the two types of experiments to estimate $\mu_1 - \mu_2$ with a confidence interval.
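The effect of within-pair correlation on the two standardizations can be seen numerically with the data of Example 9.9. The sketch below uses sample analogues ($s_1$, $s_2$, and the sample correlation $r$) in place of the population quantities $\sigma_1$, $\sigma_2$, $\rho$; the helper `sample_cov` and the variable names are illustrative assumptions, not notation from the text. It also ignores the difference in degrees of freedom and compares only the two t ratios.

```python
import math
import statistics

# Paired observations from Example 9.9.
before = [81, 87, 86, 82, 90, 86, 96, 73, 74, 75, 72, 80, 66, 72, 56, 82]
after = [78, 91, 78, 78, 84, 67, 92, 70, 58, 62, 70, 58, 66, 60, 65, 73]


def sample_cov(x, y):
    """Sample covariance with divisor n - 1."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (len(x) - 1)


n = len(before)
s1 = statistics.stdev(before)
s2 = statistics.stdev(after)
r = sample_cov(before, after) / (s1 * s2)  # sample correlation within pairs

d = [x - y for x, y in zip(before, after)]
d_bar = statistics.mean(d)

# Sample version of V(X - Y) = sigma1^2 + sigma2^2 - 2*rho*sigma1*sigma2.
var_d_from_r = s1**2 + s2**2 - 2 * r * s1 * s2
var_d_direct = statistics.variance(d)

# Same numerator, different denominators.
t_paired = d_bar / math.sqrt(var_d_direct / n)
t_two_sample = d_bar / math.sqrt(s1**2 / n + s2**2 / n)

print(f"r = {r:.3f}")
print(f"variance of differences: {var_d_direct:.2f} (identity gives {var_d_from_r:.2f})")
print(f"paired t = {t_paired:.2f}, two-sample t = {t_two_sample:.2f}")
```

With a clearly positive $r$, the (incorrect) two-sample ratio is noticeably closer to zero than the paired ratio, which is the understatement of significance described in the discussion of paired data and two-sample t procedures.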

EXERCISES Section 9.3 (36–48)

36. Consider the accompanying data on breaking load (kg/25 mm width) for various fabrics in both an unabraded condition and an abraded condition (“The Effect of Wet Abrasive Wear on the Tensile Properties of Cotton and Polyester-Cotton Fabrics,” J. Testing and Evaluation, 1993: 84–93). Use the paired t test, as did the authors of the cited article, to test $H_0\colon \mu_D = 0$ versus $H_a\colon \mu_D > 0$ at significance level .01.

Fabric     1     2     3     4     5     6     7     8
U       36.4  55.0  51.5  38.7  43.2  48.8  25.6  49.8
A       28.5  20.0  46.0  34.5  36.5  52.5  26.5  46.5

37. Hexavalent chromium has been identified as an inhalation carcinogen and an air toxin of concern in a number of different locales. The article “Airborne Hexavalent Chromium in Southwestern Ontario” (J. of Air and Waste Mgmnt. Assoc., 1997: 905–910) gave the accompanying data on both indoor and outdoor concentration (nanograms/m$^3$) for a sample of houses selected from a certain region.

House       1     2     3     4     5     6     7     8     9
Indoor    .07   .08   .09   .12   .12   .12   .13   .14   .15
Outdoor   .29   .68   .47   .54   .97   .35   .49   .84   .86

House      10    11    12    13    14    15    16    17
Indoor    .15   .17   .17   .18   .18   .18   .18   .19
Outdoor   .28   .32   .32  1.55   .66   .29   .21  1.02

House      18    19    20    21    22    23    24    25
Indoor    .20   .22   .22   .23   .23   .25   .26   .28
Outdoor  1.59   .90   .52   .12   .54   .88   .49  1.24

House      26    27    28    29    30    31    32    33
Indoor    .28   .29   .34   .39   .40   .45   .54   .62
Outdoor   .48   .27   .37  1.26   .70   .76   .99   .36

a. Calculate a confidence interval for the population mean difference between indoor and outdoor concentrations using a confidence level of 95%, and interpret the resulting interval.

b. If a 34th house were to be randomly selected from the population, between what values would you predict the difference in concentrations to lie?

38. Concrete specimens with varying height-to-diameter ratios cut from various positions on the original cylinder were obtained both from a normal-strength concrete mix and from a high-strength mix. The peak stress (MPa) was determined for each mix, resulting in the following data (“Effect of Length on Compressive Strain Softening of Concrete,” J. of Engr. Mechanics, 1997: 25–35):

Test condition     1     2     3     4     5
Normal          42.8  55.6  49.0  48.7  44.1
High            90.9  93.1  86.3  90.3  88.5

Test condition     6     7     8     9    10
Normal          55.4  50.1  45.7  51.4  43.1
High            88.1  93.2  90.8  90.1  92.6

Test condition    11    12    13    14    15
Normal          46.8  46.7  47.7  45.8  45.4
High            88.2  88.6  91.0  90.0  90.1

a. Construct a comparative boxplot of peak stresses for the two types of concrete, and comment on any interesting features.

b. Estimate the difference between true average peak stresses for the two types of concrete in a way that conveys information about precision and reliability. Be sure to check the plausibility of any assumptions needed in your analysis. Does it appear plausible that the true average peak stresses for the two types of concrete are identical? Why or why not?

39. Scientists and engineers frequently wish to compare two different techniques for measuring or determining the value of a variable. In such situations, interest centers on testing whether the mean difference in measurements is zero. The article “Evaluation of the Deuterium Dilution Technique Against the Test Weighing Procedure for the Determination of Breast Milk Intake” (Amer. J. of Clinical Nutr., 1983: 996–1003) reports the accompanying data on amount of milk ingested by each of 14 randomly selected infants.

Infant          1     2     3     4     5
DD method    1509  1418  1561  1556  2169
TW method    1498  1254  1336  1565  2000
Difference     11   164   225    −9   169

Infant          6     7     8     9    10
DD method    1760  1098  1198  1479  1281
TW method    1318  1410  1129  1342  1124
Difference    442  −312    69   137   157

Infant         11    12    13    14
DD method    1414  1954  2174  2058
TW method    1468  1604  1722  1518
Difference    −54   350   452   540

a. Is it plausible that the population distribution of differences is normal?

b. Does it appear that the true average difference between intake values measured by the two methods is something other than zero? Determine the P-value of the test, and use it to reach a conclusion at significance level .05.

40. Lactation promotes a temporary loss of bone mass to provide adequate amounts of calcium for milk production. The paper “Bone Mass Is Recovered from Lactation to Postweaning in Adolescent Mothers with Low Calcium Intakes” (Amer. J. of Clinical Nutr., 2004: 1322–1326) gave the following data on total body bone mineral content (TBBMC) (g) for a sample both during lactation (L) and in the postweaning period (P).

Subject     1     2     3     4     5     6     7     8     9    10
L        1928  2549  2825  1924  1628  2175  2114  2621  1843  2541
P        2126  2885  2895  1942  1750  2184  2164  2626  2006  2627

a. Does the data suggest that true average total body bone mineral content during postweaning exceeds that during lactation by more than 25 g? State and test the appropriate hypotheses using a significance level of .05. [Note: The appropriate normal probability plot shows some curvature but not enough to cast substantial doubt on a normality assumption.]

b. Calculate an upper confidence bound using a 95% confidence level for the true average difference between TBBMC during postweaning and during lactation.

c. Does the (incorrect) use of the two-sample t test to test the hypotheses suggested in (a) lead to the same conclusion that you obtained there? Explain.

41. Antipsychotic drugs are widely prescribed for conditions such as schizophrenia and bipolar disease. The article “Cardiometabolic Risk of Second-Generation Antipsychotic Medications During First-Time Use in Children and Adolescents” (J. of the Amer. Med. Assoc., 2009) reported on body composition and metabolic changes for individuals who had taken various antipsychotic drugs for short periods of time.

a. The sample of 41 individuals who had taken aripiprazole had a mean change in total cholesterol (mg/dL) of 3.75, and the estimated standard error $s_D/\sqrt{n}$ was 3.878. Calculate a confidence interval with confidence level approximately 95% for the true average increase in total cholesterol under these circumstances (the cited article included this CI).

b. The article also reported that for a sample of 36 individuals who had taken quetiapine, the sample mean cholesterol level change and estimated standard error were 9.05 and 4.256, respectively. Making any necessary assumptions about the distribution of change in cholesterol level, does the choice of significance level impact your conclusion as to whether true average cholesterol level increases? Explain. [Note: The article included a P-value.]

c. For the sample of 45 individuals who had taken olanzapine, the article reported (7.38, 9.69) as a 95% CI for true average weight gain (kg). What is a 99% CI?

42. It has been estimated that between 1945 and 1971, as many as 2 million children were born to mothers treated with diethylstilbestrol (DES), a nonsteroidal estrogen recommended for pregnancy maintenance. The FDA banned this drug in 1971 because research indicated a link with the incidence of cervical cancer. The article “Effects of Prenatal Exposure to Diethylstilbestrol (DES) on Hemispheric Laterality and Spatial Ability in Human Males” (Hormones and Behavior, 1992: 62–75) discussed a study in which 10 males exposed to DES, and their unexposed brothers, underwent various tests. This is the summary data on the results of a spatial ability test: $\bar x = 12.6$ (exposed), $\bar y = 13.7$, and standard error of mean difference $= .5$. Test at level .05 to see whether exposure is associated with reduced spatial ability by obtaining the P-value.

43. Cushing’s disease is characterized by muscular weakness due to adrenal or pituitary dysfunction. To provide effective treatment, it is important to detect childhood Cushing’s disease as early as possible. Age at onset of symptoms and age at diagnosis (months) for 15 children suffering from the disease were given in the article “Treatment of Cushing’s Disease in Childhood and Adolescence by Transphenoidal Microadenomectomy” (New Engl. J. of Med., 1984: 889). Here are the values of the differences between age at onset of symptoms and age at diagnosis:

−24  −12  −55  −15  −30  −60  −14  −21  −48  −12  −25  −53  −61  −69  −80

a. Does the accompanying normal probability plot cast strong doubt on the approximate normality of the population distribution of differences?

[Normal probability plot: Difference (vertical axis, −80 to −10) versus z percentile (horizontal axis, −1.5 to 1.5)]

b. Calculate a lower 95% confidence bound for the population mean difference, and interpret the resulting bound.

c. Suppose the (age at diagnosis) – (age at onset) differences had been calculated. What would be a 95% upper confidence bound for the corresponding population mean difference?


44. Refer back to the previous exercise.

a. By far the most frequently tested null hypothesis when data is paired is $H_0\colon \mu_D = 0$. Is that a sensible hypothesis in this context? Explain.

b. Carry out a test of hypotheses to decide whether there is compelling evidence for concluding that on average diagnosis occurs more than 25 months after the onset of symptoms.

45. Torsion during hip external rotation (ER) and extension may be responsible for certain kinds of injuries in golfers and other athletes. The article “Hip Rotational Velocities During the Full Golf Swing” (J. of Sports Science and Medicine, 2009: 296–299) reported on a study in which peak ER velocity and peak IR (internal rotation) velocity (both in deg·sec$^{-1}$) were determined for a sample of 15 female collegiate golfers during their swings. The following data was supplied by the article’s authors.

Golfer       ER       IR     diff   z perc
 1       −130.6    −98.9    −31.7    −1.28
 2       −125.1   −115.9     −9.2    −0.97
 3        −51.7   −161.6    109.9     0.34
 4       −179.7   −196.9     17.2    −0.73
 5       −130.5   −170.7     40.2    −0.34
 6       −101.0   −274.9    173.9     0.97
 7        −24.4   −275.0    250.6     1.83
 8       −231.1   −275.7     44.6    −0.17
 9       −186.8   −214.6     27.8    −0.52
10        −58.5   −117.8     59.3     0.00
11       −219.3   −326.7    107.4     0.17
12       −113.1   −272.9    159.8     0.73
13       −244.3   −429.1    184.8     1.28
14       −184.4   −140.6    −43.8    −1.83
15       −199.2   −345.6    146.4     0.52

a. Is it plausible that the differences came from a normally distributed population?

b. The article reported that Mean (± SD) $= -145.3\,(68.0)$ for ER velocity and $= -227.8\,(96.6)$ for IR velocity. Based just on this information, could a test of hypotheses about the difference between true average IR velocity and true average ER velocity be carried out? Explain.

c. The article stated that “The lead hip peak IR velocity was significantly greater than the trail hip ER velocity ($p = 0.003$, t value $= 3.65$).” (The phrasing suggests that an upper-tailed test was used.) Is that in fact the case? [Note: “$p = .033$” in Table 2 of the article is erroneous.]

46. Example 7.11 gave data on the modulus of elasticity obtained 1 minute after loading in a certain configuration. The cited article also gave the values of modulus of elasticity obtained 4 weeks after loading for the same lumber specimens. The data is presented here.

Observation    1 min    4 weeks   Difference
 1            10,490      9,110      1380
 2            16,620     13,250      3370
 3            17,300     14,720      2580
 4            15,480     12,740      2740
 5            12,970     10,120      2850
 6            17,260     14,570      2690
 7            13,400     11,220      2180
 8            13,900     11,100      2800
 9            13,630     11,420      2210
10            13,260     10,910      2350
11            14,370     12,110      2260
12            11,700      8,620      3080
13            15,470     12,590      2880
14            17,840     15,090      2750
15            14,070     10,550      3520
16            14,760     12,230      2530

Calculate and interpret an upper confidence bound for the true average difference between 1-minute modulus and 4-week modulus; first check the plausibility of any necessary assumptions.

47. The paper “Slender High-Strength RC Columns Under Eccentric Compression” (Magazine of Concrete Res., 2005: 361–370) gave the accompanying data on cylinder strength (MPa) for various types of columns cured under both moist conditions and laboratory drying conditions.

Type      1     2     3     4     5     6
M:     82.6  87.1  89.5  88.8  94.3  80.0
LD:    86.9  87.3  92.0  89.3  91.4  85.9

Type      7     8     9    10    11    12
M:     86.7  92.5  97.8  90.4  94.6  91.6
LD:    89.4  91.8  94.3  92.0  93.1  91.3

a. Estimate the difference in true average strength under the two drying conditions in a way that conveys information about reliability and precision, and interpret the estimate. What does the estimate suggest about how true average strength under moist drying conditions compares to that under laboratory drying conditions?

b. Check the plausibility of any assumptions that underlie your analysis of (a).

48. Construct a paired data set for which $t = \infty$, so that the data is highly significant when the correct analysis is used, yet t for the two-sample t test is quite near zero, so the incorrect analysis yields an insignificant result.


9.4 Inferences Concerning a Difference Between Population Proportions

Having presented methods for comparing the means of two different populations, we now turn attention to the comparison of two population proportions. Regard an individual or object as a success S if he/she/it possesses some characteristic of interest (someone who graduated from college, a refrigerator with an icemaker, etc.). Let

$p_1$ = the proportion of S’s in population #1
$p_2$ = the proportion of S’s in population #2

Alternatively, $p_1$ ($p_2$) can be regarded as the probability that a randomly selected individual or object from the first (second) population is a success.

Suppose that a sample of size m is selected from the first population and independently a sample of size n is selected from the second one. Let X denote the number of S’s in the first sample and Y be the number of S’s in the second. Independence of the two samples implies that X and Y are independent. Provided that the two sample sizes are much smaller than the corresponding population sizes, X and Y can be regarded as having binomial distributions. The natural estimator for $p_1 - p_2$, the difference in population proportions, is the corresponding difference in sample proportions $X/m - Y/n$.

PROPOSITION

Let $\hat p_1 = X/m$ and $\hat p_2 = Y/n$, where $X \sim \mathrm{Bin}(m, p_1)$ and $Y \sim \mathrm{Bin}(n, p_2)$ with X and Y independent variables. Then

$$E(\hat p_1 - \hat p_2) = p_1 - p_2$$

so $\hat p_1 - \hat p_2$ is an unbiased estimator of $p_1 - p_2$, and

$$V(\hat p_1 - \hat p_2) = \frac{p_1 q_1}{m} + \frac{p_2 q_2}{n} \quad (\text{where } q_i = 1 - p_i) \tag{9.3}$$

Proof Since $E(X) = mp_1$ and $E(Y) = np_2$,

$$E\left(\frac{X}{m} - \frac{Y}{n}\right) = \frac{1}{m}E(X) - \frac{1}{n}E(Y) = \frac{1}{m}mp_1 - \frac{1}{n}np_2 = p_1 - p_2$$

Since $V(X) = mp_1q_1$, $V(Y) = np_2q_2$, and X and Y are independent,

$$V\left(\frac{X}{m} - \frac{Y}{n}\right) = V\left(\frac{X}{m}\right) + V\left(\frac{Y}{n}\right) = \frac{1}{m^2}V(X) + \frac{1}{n^2}V(Y) = \frac{p_1q_1}{m} + \frac{p_2q_2}{n}$$

We will focus first on situations in which both m and n are large. Then because $\hat p_1$ and $\hat p_2$ individually have approximately normal distributions, the estimator $\hat p_1 - \hat p_2$ also has approximately a normal distribution. Standardizing $\hat p_1 - \hat p_2$ yields a variable Z whose distribution is approximately standard normal:

$$Z = \frac{\hat p_1 - \hat p_2 - (p_1 - p_2)}{\sqrt{\dfrac{p_1q_1}{m} + \dfrac{p_2q_2}{n}}}$$
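The mean and variance stated in the proposition can be verified numerically for small sample sizes by summing over the joint distribution of two independent binomial counts. This is only an illustrative check: the function names and the particular values of m, n, $p_1$, and $p_2$ are arbitrary assumptions chosen to keep the enumeration small.

```python
from math import comb


def binom_pmf(k, size, p):
    """P(K = k) for K ~ Bin(size, p)."""
    return comb(size, k) * p**k * (1 - p) ** (size - k)


def diff_moments(m, p1, n, p2):
    """Exact mean and variance of X/m - Y/n for independent
    X ~ Bin(m, p1) and Y ~ Bin(n, p2), by direct enumeration."""
    mean = 0.0
    second_moment = 0.0
    for x in range(m + 1):
        px = binom_pmf(x, m, p1)
        for y in range(n + 1):
            weight = px * binom_pmf(y, n, p2)  # independence: joint pmf factors
            diff = x / m - y / n
            mean += weight * diff
            second_moment += weight * diff * diff
    return mean, second_moment - mean**2


# Hypothetical illustration values (not from the text).
m, p1, n, p2 = 12, 0.3, 15, 0.5
mean, var = diff_moments(m, p1, n, p2)
formula_var = p1 * (1 - p1) / m + p2 * (1 - p2) / n  # right side of (9.3)

print(f"E = {mean:.6f} vs p1 - p2 = {p1 - p2:.6f}")
print(f"V = {var:.6f} vs formula (9.3) = {formula_var:.6f}")
```

The enumeration relies on the same independence assumption as the proof; if X and Y were dependent, the joint probabilities would not factor and the variance formula would need a covariance term.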


A Large-Sample Test Procedure The most general null hypothesis an investigator might consider would be of the form . Although for population means the case presented no difficulties, for population proportions and must be considered separately. Since the vast majority of actual problems of this sort involve (i.e., the null hypothesis ), we’ll concentrate on this case. When

is true, let p denote the common value of p1 and p2 (and similarly for q). Then the standardized variable

(9.4)

has approximately a standard normal distribution when H0 is true. However, this Z cannot serve as a test statistic because the value of p is unknown—H0 asserts only that there is a common value of p, but does not say what that value is. A test statis- tic results from replacing p and q in (9.4) by appropriate estimators.

Assuming that , instead of separate samples of size m and n from two different populations (two different binomial distributions), we really have a sin- gle sample of size from one population with proportion p. The total number of individuals in this combined sample having the characteristic of interest is The natural estimator of p is then

(9.5)

The second expression for shows that it is actually a weighted average of estima- tors and obtained from the two samples. Using and in place of p and q in (9.4) gives a test statistic having approximately a standard normal distribu- tion when H0 is true.

q̂ 5 1 2 p̂p̂p̂2p̂1

p̂ 5 X 1 Y

m 1 n 5

m

m 1 n # p̂1 1 nm 1 n

# p̂2

X 1 Y. m 1 n

p1 5 p2 5 p

Z 5 p̂1 2 p̂2 2 0

B pqa1

m 1

1 n b

H0: p1 2 p2 5 0 p1 5 p2

�0 5 0 �0 2 0�0 5 0

�0 2 0H0: p1 2 p2 5 �0

Null hypothesis: $H_0: p_1 - p_2 = 0$

Test statistic value (large samples): $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$

Alternative Hypothesis: Rejection Region for Approximate Level $\alpha$ Test

$H_a: p_1 - p_2 > 0$: $z \geq z_\alpha$

$H_a: p_1 - p_2 < 0$: $z \leq -z_\alpha$

$H_a: p_1 - p_2 \neq 0$: either $z \geq z_{\alpha/2}$ or $z \leq -z_{\alpha/2}$

A P-value is calculated in the same way as for previous z tests. The test can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at least 10.
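The test procedure in the box can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `two_proportion_z_test` and its argument names are assumptions made here, and the standard normal cdf comes from the standard library's `statistics.NormalDist`. The usage lines apply it to the aspirin counts of Example 9.11.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x, m, y, n, alternative="two-sided"):
    """Large-sample z test of H0: p1 - p2 = 0 (hypothetical helper).

    x successes out of m in sample 1, y successes out of n in sample 2.
    Returns (z, p_value, counts_ok).
    """
    p1_hat, p2_hat = x / m, y / n
    p_hat = (x + y) / (m + n)            # pooled estimate, as in (9.5)
    q_hat = 1 - p_hat
    z = (p1_hat - p2_hat) / sqrt(p_hat * q_hat * (1 / m + 1 / n))

    phi = NormalDist().cdf               # standard normal cdf
    if alternative == "greater":         # Ha: p1 - p2 > 0
        p_value = 1 - phi(z)
    elif alternative == "less":          # Ha: p1 - p2 < 0
        p_value = phi(z)
    elif alternative == "two-sided":     # Ha: p1 - p2 != 0
        p_value = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

    # Rule of thumb from the box: m*p1_hat, m*q1_hat, n*p2_hat, n*q2_hat >= 10,
    # which is the same as requiring every success/failure count to be >= 10.
    counts_ok = min(x, m - x, y, n - y) >= 10
    return z, p_value, counts_ok


# Aspirin data of Example 9.11: 81 deaths of 549 users, 141 of 730 non-users.
z, p_value, counts_ok = two_proportion_z_test(81, 549, 141, 730, alternative="less")
print(round(z, 2), round(p_value, 4), counts_ok)
```

Because the function carries full precision rather than the rounded intermediate values used in the hand calculation, its z value is about -2.13, slightly different from the -2.14 reported in the example; the conclusion at level .05 is unchanged.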

Example 9.11

The article “Aspirin Use and Survival After Diagnosis of Colorectal Cancer” (J. of the Amer. Med. Assoc., 2009: 649–658) reported that of 549 study participants who regularly used aspirin after being diagnosed with colorectal cancer, there were 81 colorectal cancer-specific deaths, whereas among 730 similarly diagnosed individuals who did not subsequently use aspirin, there were 141 colorectal cancer-specific deaths. Does this data suggest that the regular use of aspirin after diagnosis will decrease the incidence rate of colorectal cancer-specific deaths? Let’s test the appropriate hypotheses using a significance level of .05.

The parameter of interest is the difference $p_1 - p_2$, where $p_1$ is the true proportion of deaths for those who regularly used aspirin and $p_2$ is the true proportion of deaths for those who did not use aspirin. The use of aspirin is beneficial if $p_1 < p_2$, which corresponds to a negative difference between the two proportions. The relevant hypotheses are therefore

$$H_0: p_1 - p_2 = 0 \quad \text{versus} \quad H_a: p_1 - p_2 < 0$$

Parameter estimates are $\hat{p}_1 = 81/549 = .1475$, $\hat{p}_2 = 141/730 = .1932$, and $\hat{p} = (81 + 141)/(549 + 730) = .1736$. A z test is appropriate here because all of $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are at least 10. The resulting test statistic value is

$$z = \frac{.1475 - .1932}{\sqrt{(.1736)(.8264)\left(\dfrac{1}{549} + \dfrac{1}{730}\right)}} = \frac{-.0457}{.021397} = -2.14$$

The corresponding P-value for a lower-tailed z test is $\Phi(-2.14) = .0162$. Because $.0162 \leq .05$, the null hypothesis can be rejected at significance level .05. So anyone adopting this significance level would be convinced that the use of aspirin in these circumstances is beneficial. However, someone looking for more compelling evidence might select a significance level .01 and then not be persuaded. ■

Type II Error Probabilities and Sample Sizes

Here the determination of $\beta$ is a bit more cumbersome than it was for other large-sample tests. The reason is that the denominator of Z is an estimate of the standard deviation of $\hat{p}_1 - \hat{p}_2$, assuming that $p_1 = p_2 = p$. When $H_0$ is false, $\hat{p}_1 - \hat{p}_2$ must be restandardized using

$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1 q_1}{m} + \frac{p_2 q_2}{n}} \qquad (9.6)$$

The form of $\sigma$ implies that $\beta$ is not a function of just $p_1 - p_2$, so we denote it by $\beta(p_1, p_2)$.

Alternative Hypothesis: $\beta(p_1, p_2)$

$H_a: p_1 - p_2 > 0$: $\Phi\left[\dfrac{z_\alpha\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$

$H_a: p_1 - p_2 < 0$: $1 - \Phi\left[\dfrac{-z_\alpha\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$

$H_a: p_1 - p_2 \neq 0$: $\Phi\left[\dfrac{z_{\alpha/2}\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right] - \Phi\left[\dfrac{-z_{\alpha/2}\sqrt{\bar{p}\,\bar{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$

where $\bar{p} = (mp_1 + np_2)/(m + n)$, $\bar{q} = (mq_1 + nq_2)/(m + n)$, and $\sigma$ is given by (9.6).


Proof For the upper-tailed test ($H_a: p_1 - p_2 > 0$),

$$\beta(p_1, p_2) = P\left[\hat{p}_1 - \hat{p}_2 < z_\alpha\sqrt{\hat{p}\hat{q}\left(\frac{1}{m} + \frac{1}{n}\right)}\right] = P\left[\frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sigma} < \frac{z_\alpha\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)} - (p_1 - p_2)}{\sigma}\right]$$

When m and n are both large,

$$\hat{p} = (m\hat{p}_1 + n\hat{p}_2)/(m + n) \approx (mp_1 + np_2)/(m + n) = \bar{p}$$

and $\hat{q} \approx \bar{q}$, which yields the previous (approximate) expression for $\beta(p_1, p_2)$. ■

Alternatively, for specified $p_1$, $p_2$ with $p_1 - p_2 = d$, the sample sizes necessary to achieve $\beta(p_1, p_2) = \beta$ can be determined. For example, for the upper-tailed test, we equate $-z_\beta$ to the argument of $\Phi(\cdot)$ (i.e., what’s inside the parentheses) in the foregoing box. If $m = n$, there is a simple expression for the common value.

For the case $m = n$, the level $\alpha$ test has type II error probability $\beta$ at the alternative values $p_1$, $p_2$ with $p_1 - p_2 = d$ when

$$n = \frac{\left[z_\alpha\sqrt{(p_1 + p_2)(q_1 + q_2)/2} + z_\beta\sqrt{p_1 q_1 + p_2 q_2}\right]^2}{d^2} \qquad (9.7)$$

for an upper- or lower-tailed test, with $\alpha/2$ replacing $\alpha$ for a two-tailed test.
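The type II error probability for the upper-tailed test and the equal-sample-size formula (9.7) can be checked against each other numerically. The sketch below is illustrative only: the function names `beta_upper_tailed` and `equal_sample_size` are assumptions introduced here, and the inputs are the values $p_1 = .0003$, $p_2 = .00015$, $\alpha = .05$, $\beta = .1$ that appear in Example 9.12.

```python
from math import sqrt, ceil
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def beta_upper_tailed(p1, p2, m, n, alpha):
    """Approximate beta(p1, p2) for Ha: p1 - p2 > 0 (hypothetical helper)."""
    q1, q2 = 1 - p1, 1 - p2
    p_bar = (m * p1 + n * p2) / (m + n)
    q_bar = (m * q1 + n * q2) / (m + n)
    sigma = sqrt(p1 * q1 / m + p2 * q2 / n)            # (9.6)
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha)
    arg = (z_alpha * sqrt(p_bar * q_bar * (1 / m + 1 / n)) - (p1 - p2)) / sigma
    return _STD_NORMAL.cdf(arg)


def equal_sample_size(p1, p2, alpha, beta, two_tailed=False):
    """Common sample size n = m from (9.7), before rounding up."""
    q1, q2 = 1 - p1, 1 - p2
    d = p1 - p2
    z_a = _STD_NORMAL.inv_cdf(1 - (alpha / 2 if two_tailed else alpha))
    z_b = _STD_NORMAL.inv_cdf(1 - beta)
    numerator = z_a * sqrt((p1 + p2) * (q1 + q2) / 2) + z_b * sqrt(p1 * q1 + p2 * q2)
    return numerator ** 2 / d ** 2


n_required = equal_sample_size(0.0003, 0.00015, alpha=0.05, beta=0.10)
n_int = ceil(n_required)
achieved_beta = beta_upper_tailed(0.0003, 0.00015, n_int, n_int, alpha=0.05)
print(n_int, round(achieved_beta, 4))
```

Plugging the rounded-up n back into the expression for $\beta(p_1, p_2)$ should return a value very close to the target $\beta$, which is a quick consistency check on both formulas.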

Example 9.12

One of the truly impressive applications of statistics occurred in connection with the design of the 1954 Salk polio-vaccine experiment and analysis of the resulting data. Part of the experiment focused on the efficacy of the vaccine in combating paralytic polio. Because it was thought that without a control group of children, there would be no sound basis for assessment of the vaccine, it was decided to administer the vaccine to one group and a placebo injection (visually indistinguishable from the vaccine but known to have no effect) to a control group. For ethical reasons and also because it was thought that the knowledge of vaccine administration might have an effect on treatment and diagnosis, the experiment was conducted in a double-blind manner. That is, neither the individuals receiving injections nor those administering them actually knew who was receiving vaccine and who was receiving the placebo (samples were numerically coded). (Remember: at that point it was not at all clear whether the vaccine was beneficial.)

Let $p_1$ and $p_2$ be the probabilities of a child getting paralytic polio for the control and treatment conditions, respectively. The objective was to test $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 > 0$ (the alternative states that a vaccinated child is less likely to contract polio than an unvaccinated child). Supposing the true value of $p_1$ is .0003 (an incidence rate of 30 per 100,000), the vaccine would be a significant improvement if the incidence rate was halved, that is, $p_2 = .00015$. Using a level $\alpha = .05$ test, it would then be reasonable to ask for sample sizes for which $\beta = .1$ when $p_1 = .0003$ and $p_2 = .00015$. Assuming equal sample sizes, the required n is obtained from (9.7) as

$$n = \frac{\left[1.645\sqrt{(.5)(.00045)(1.99955)} + 1.28\sqrt{(.00015)(.99985) + (.0003)(.9997)}\right]^2}{(.0003 - .00015)^2} = [(.0349 + .0271)/.00015]^2 \approx 171{,}000$$

The actual data for this experiment follows.

Placebo: m = 201,229, x = number of cases of paralytic polio = 110

Vaccine: n = 200,745, y = 33

Sample sizes of approximately 200,000 were used. The reader can easily verify that $z = 6.43$, a highly significant value. The vaccine was judged a resounding success!

A Large-Sample Confidence Interval

As with means, many two-sample problems involve the objective of comparison through hypothesis testing, but sometimes an interval estimate for $p_1 - p_2$ is appropriate. Both $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$ have approximate normal distributions when m and n are both large. If we identify $\theta$ with $p_1 - p_2$, then $\hat{\theta} = \hat{p}_1 - \hat{p}_2$ satisfies the conditions necessary for obtaining a large-sample CI. In particular, the estimated standard deviation of $\hat{\theta}$ is $\sqrt{(\hat{p}_1\hat{q}_1/m) + (\hat{p}_2\hat{q}_2/n)}$. The general $100(1 - \alpha)\%$ interval $\hat{\theta} \pm z_{\alpha/2}\cdot\hat{\sigma}_{\hat{\theta}}$ then takes the following form.

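A CI for $p_1 - p_2$ with confidence level approximately $100(1 - \alpha)\%$ is

$$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{m} + \frac{\hat{p}_2\hat{q}_2}{n}}$$

This interval can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all at least 10.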
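The interval in the box translates directly into code. The following is a minimal sketch, with `two_proportion_ci` and its `adjust` flag being names assumed here for illustration. Setting `adjust=True` applies the modification discussed in this section of adding one success and one failure to each sample before computing the interval. The usage lines use the survival counts of Example 9.13 (76 of 154 and 98 of 164) at the 99% level.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x, m, y, n, conf_level=0.95, adjust=False):
    """Large-sample CI for p1 - p2 (hypothetical helper).

    With adjust=True, one success and one failure are added to each
    sample, so p1 is estimated by (x + 1)/(m + 2), and similarly for p2.
    """
    if adjust:
        x, m, y, n = x + 1, m + 2, y + 1, n + 2
    p1_hat, p2_hat = x / m, y / n
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)   # z_{alpha/2}
    half_width = z_crit * sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
    diff = p1_hat - p2_hat
    return diff - half_width, diff + half_width


lo, hi = two_proportion_ci(76, 154, 98, 164, conf_level=0.99)
lo_adj, hi_adj = two_proportion_ci(76, 154, 98, 164, conf_level=0.99, adjust=True)
print(round(lo, 3), round(hi, 3))
print(round(lo_adj, 3), round(hi_adj, 3))
```

The function uses the exact normal quantile rather than the rounded 2.58, so its limits may differ from a hand calculation in the third decimal place.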

Notice that the estimated standard deviation of $\hat{p}_1 - \hat{p}_2$ (the square-root expression) is different here from what it was for hypothesis testing when $\Delta_0 = 0$.

Recent research has shown that the actual confidence level for the traditional CI just given can sometimes deviate substantially from the nominal level (the level you think you are getting when you use a particular z critical value, e.g., 95% when $z_{\alpha/2} = 1.96$). The suggested improvement is to add one success and one failure to each of the two samples and then replace the $\hat{p}$’s and $\hat{q}$’s in the foregoing formula by $\tilde{p}$’s and $\tilde{q}$’s, where $\tilde{p}_1 = (x + 1)/(m + 2)$, etc. This modified interval can also be used when sample sizes are quite small.

Example 9.13

The authors of the article “Adjuvant Radiotherapy and Chemotherapy in Node-Positive Premenopausal Women with Breast Cancer” (New Engl. J. of Med., 1997: 956–962) reported on the results of an experiment designed to compare treating cancer patients with chemotherapy only to treatment with a combination of chemotherapy and radiation. Of the 154 individuals who received the chemotherapy-only treatment, 76 survived at least 15 years, whereas 98 of the 164 patients who received the hybrid treatment survived at least that long. With $p_1$ denoting the proportion of all such women who, when treated with just chemotherapy, survive at least 15 years and $p_2$ denoting the analogous proportion for the hybrid treatment, $\hat{p}_1 = 76/154 = .494$ and $\hat{p}_2 = 98/164 = .598$. A confidence interval for the difference between proportions based on the traditional formula with a confidence level of approximately 99% is

$$.494 - .598 \pm (2.58)\sqrt{\frac{(.494)(.506)}{154} + \frac{(.598)(.402)}{164}} = -.104 \pm .143 = (-.247, .039)$$

At the 99% confidence level, it is plausible that $-.247 < p_1 - p_2 < .039$. This interval is reasonably wide, a reflection of the fact that the sample sizes are not terribly large for this type of interval. Notice that 0 is one of the plausible values of $p_1 - p_2$, suggesting that neither treatment can be judged superior to the other. Using $\tilde{p}_1 = 77/156 = .494$, $\tilde{q}_1 = 79/156 = .506$, $\tilde{p}_2 = .596$, $\tilde{q}_2 = .404$ based on sample sizes of 156 and 166, respectively, the “improved” interval here is identical to the earlier interval. ■

Small-Sample Inferences

On occasion an inference concerning $p_1 - p_2$ may have to be based on samples for which at least one sample size is small. Appropriate methods for such situations are not as straightforward as those for large samples, and there is more controversy among statisticians as to recommended procedures. One frequently used test, called the Fisher–Irwin test, is based on the hypergeometric distribution. Your friendly neighborhood statistician can be consulted for more information.

EXERCISES Section 9.4 (49–58)

49. Is someone who switches brands because of a financial inducement less likely to remain loyal than someone who switches without inducement? Let $p_1$ and $p_2$ denote the true proportions of switchers to a certain brand with and without inducement, respectively, who subsequently make a repeat purchase. Test $H_0: p_1 - p_2 = 0$ versus $H_a: p_1 - p_2 < 0$ using $\alpha = .01$ and the following data:

m = 200, number of success = 30
n = 600, number of success = 180

(Similar data is given in “Impact of Deals and Deal Retraction on Brand Switching,” J. of Marketing, 1980: 62–70.)

50. Recent incidents of food contamination have caused great concern among consumers. The article “How Safe Is That Chicken?” (Consumer Reports, Jan. 2010: 19–23) reported that 35 of 80 randomly selected Perdue brand broilers tested positively for either campylobacter or salmonella (or both), the leading bacterial causes of food-borne disease, whereas 66 of 80 Tyson brand broilers tested positive.


a. Does it appear that the true proportion of non-contaminated Perdue broilers differs from that for the Tyson brand? Carry out a test of hypotheses using a significance level .01 by obtaining a P-value.

b. If the true proportions of non-contaminated chickens for the Perdue and Tyson brands are .50 and .25, respectively, how likely is it that the null hypothesis of equal proportions will be rejected when a .01 significance level is used and the sample sizes are both 80?

51. It is thought that the front cover and the nature of the first question on mail surveys influence the response rate. The article “The Impact of Cover Design and First Questions on Response Rates for a Mail Survey of Skydivers” (Leisure Sciences, 1991: 67–76) tested this theory by experimenting with different cover designs. One cover was plain; the other used a picture of a skydiver. The researchers speculated that the return rate would be lower for the plain cover.

Cover       Number Sent   Number Returned
Plain       207           104
Skydiver    213           109

Does this data support the researchers’ hypothesis? Test the relevant hypotheses using $\alpha = .10$ by first calculating a P-value.

52. Do teachers find their work rewarding and satisfying? The article “Work-Related Attitudes” (Psychological Reports, 1991: 443–450) reports the results of a survey of 395 elementary school teachers and 266 high school teachers. Of the elementary school teachers, 224 said they were very satisfied with their jobs, whereas 126 of the high school teachers were very satisfied with their work. Estimate the difference between the proportion of all elementary school teachers who are very satisfied and all high school teachers who are very satisfied by calculating and interpreting a CI.

53. Olestra is a fat substitute approved by the FDA for use in snack foods. Because there have been anecdotal reports of gastrointestinal problems associated with olestra consumption, a randomized, double-blind, placebo-controlled experiment was carried out to compare olestra potato chips to regular potato chips with respect to GI symptoms (“Gastrointestinal Symptoms Following Consumption of Olestra or Regular Triglyceride Potato Chips,” J. of the Amer. Med. Assoc., 1998: 150–152). Among 529 individuals in the TG control group, 17.6% experienced an adverse GI event, whereas among the 563 individuals in the olestra treatment group, 15.8% experienced such an event.

a. Carry out a test of hypotheses at the 5% significance level to decide whether the incidence rate of GI problems for those who consume olestra chips according to the experimental regimen differs from the incidence rate for the TG control treatment.

b. If the true percentages for the two treatments were 15% and 20%, respectively, what sample sizes ($m = n$) would be necessary to detect such a difference with probability .90?

54. Teen Court is a juvenile diversion program designed to circumvent the formal processing of first-time juvenile offenders within the juvenile justice system. The article “An Experimental Evaluation of Teen Courts” (J. of Experimental Criminology, 2008: 137–163) reported on a study in which offenders were randomly assigned either to Teen Court or to the traditional Department of Juvenile Services method of processing. Of the 56 TC individuals, 18 subsequently recidivated (look it up!) during the 18-month follow-up period, whereas 12 of the 51 DJS individuals did so. Does the data suggest that the true proportion of TC individuals who recidivate during the specified follow-up period differs from the proportion of DJS individuals who do so? State and test the relevant hypotheses by obtaining a P-value and then using a significance level of .10.

55. In medical investigations, the ratio $\theta = p_1/p_2$ is often of more interest than the difference $p_1 - p_2$ (e.g., individuals given treatment 1 are how many times as likely to recover as those given treatment 2?). Let $\hat{\theta} = \hat{p}_1/\hat{p}_2$. When m and n are both large, the statistic $\ln(\hat{\theta})$ has approximately a normal distribution with approximate mean value $\ln(\theta)$ and approximate standard deviation $[(m - x)/(mx) + (n - y)/(ny)]^{1/2}$.

a. Use these facts to obtain a large-sample 95% CI formula for estimating $\ln(\theta)$, and then a CI for $\theta$ itself.

b. Return to the heart-attack data of Example 1.3, and calculate an interval of plausible values for $\theta$ at the 95% confidence level. What does this interval suggest about the efficacy of the aspirin treatment?

56. Sometimes experiments involving success or failure responses are run in a paired or before/after manner. Suppose that before a major policy speech by a political candidate, n individuals are selected and asked whether (S) or not (F) they favor the candidate. Then after the speech the same n people are asked the same question. The responses can be entered in a table as follows:

                 After
                 S      F
Before    S      x1     x2
          F      x3     x4

where $x_1 + x_2 + x_3 + x_4 = n$. Let $p_1$, $p_2$, $p_3$, and $p_4$ denote the four cell probabilities, so that $p_1 = P$(S before and S after), and so on. We wish to test the hypothesis that the true proportion of supporters (S) after the speech has not increased against the alternative that it has increased.

a. State the two hypotheses of interest in terms of $p_1$, $p_2$, $p_3$, and $p_4$.

b. Construct an estimator for the after/before difference in success probabilities.

c. When n is large, it can be shown that the rv $(X_i - X_j)/n$ has approximately a normal distribution with variance given by $[p_i + p_j - (p_i - p_j)^2]/n$. Use this to construct a test statistic with approximately a standard normal distribution when $H_0$ is true (the result is called McNemar’s test).

d. If $x_1 = 350$, $x_2 = 150$, $x_3 = 200$, and $x_4 = 300$, what do you conclude?


57. Two different types of alloy, A and B, have been used to manufacture experimental specimens of a small tension link to be used in a certain engineering application. The ultimate strength (ksi) of each specimen was determined, and the results are summarized in the accompanying frequency distribution.

              A         B
26 – <30      6         4
30 – <34      12        9
34 – <38      15        19
38 – <42      7         10
              m = 40    n = 42

Compute a 95% CI for the difference between the true proportions of all specimens of alloys A and B that have an ultimate strength of at least 34 ksi.

58. Using the traditional formula, a 95% CI for $p_1 - p_2$ is to be constructed based on equal sample sizes from the two populations. For what value of n (= m) will the resulting interval have a width at most of .1, irrespective of the results of the sampling?

9.5 Inferences Concerning Two Population Variances

Methods for comparing two population variances (or standard deviations) are occasionally needed, though such problems arise much less frequently than those involving means or proportions. For the case in which the populations under investigation are normal, the procedures are based on a new family of probability distributions.

The F Distribution

The F probability distribution has two parameters, denoted by $\nu_1$ and $\nu_2$. The parameter $\nu_1$ is called the number of numerator degrees of freedom, and $\nu_2$ is the number of denominator degrees of freedom; here $\nu_1$ and $\nu_2$ are positive integers. A random variable that has an F distribution cannot assume a negative value. Since the density function is complicated and will not be used explicitly, we omit the formula. There is an important connection between an F variable and chi-squared variables. If $X_1$ and $X_2$ are independent chi-squared rv’s with $\nu_1$ and $\nu_2$ df, respectively, then the rv

$$F = \frac{X_1/\nu_1}{X_2/\nu_2} \qquad (9.8)$$

(the ratio of the two chi-squared variables divided by their respective degrees of freedom), can be shown to have an F distribution.

Figure 9.8 illustrates the graph of a typical F density function. Analogous to the notation $t_{\alpha,\nu}$ and $\chi^2_{\alpha,\nu}$, we use $F_{\alpha,\nu_1,\nu_2}$ for the value on the horizontal axis that captures $\alpha$ of the area under the F density curve with $\nu_1$ and $\nu_2$ df in the upper tail. The density curve is not symmetric, so it would seem that both upper- and lower-tail critical values must be tabulated. This is not necessary, though, because of the fact that $F_{1-\alpha,\nu_1,\nu_2} = 1/F_{\alpha,\nu_2,\nu_1}$.

[Figure 9.8 An F density curve and critical value: F density curve with $\nu_1$ and $\nu_2$ df; the shaded upper-tail area to the right of $F_{\alpha,\nu_1,\nu_2}$ equals $\alpha$]

Appendix Table A.9 gives $F_{\alpha,\nu_1,\nu_2}$ for $\alpha = .10$, .05, .01, and .001, and various values of $\nu_1$ (in different columns of the table) and $\nu_2$ (in different groups of rows of the table). For example, $F_{.05,6,10} = 3.22$ and $F_{.05,10,6} = 4.06$. The critical value $F_{.95,6,10}$, which captures .95 of the area to its right (and thus .05 to the left) under the F curve with $\nu_1 = 6$ and $\nu_2 = 10$, is $F_{.95,6,10} = 1/F_{.05,10,6} = 1/4.06 = .246$.

The F Test for Equality of Variances

A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$ is based on the following result.

THEOREM Let $X_1, \ldots, X_m$ be a random sample from a normal distribution with variance $\sigma_1^2$, let $Y_1, \ldots, Y_n$ be another random sample (independent of the $X_i$’s) from a normal distribution with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the two sample variances. Then the rv

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (9.9)$$

has an F distribution with $\nu_1 = m - 1$ and $\nu_2 = n - 1$.

This theorem results from combining (9.8) with the fact that the variables $(m - 1)S_1^2/\sigma_1^2$ and $(n - 1)S_2^2/\sigma_2^2$ each have a chi-squared distribution with $m - 1$ and $n - 1$ df, respectively (see Section 7.4). Because F involves a ratio rather than a difference, the test statistic is the ratio of sample variances. The claim that $\sigma_1^2 = \sigma_2^2$ is then rejected if the ratio differs by too much from 1.

Null hypothesis: $H_0: \sigma_1^2 = \sigma_2^2$

Test statistic value: $f = s_1^2/s_2^2$

Alternative Hypothesis: Rejection Region for a Level $\alpha$ Test

$H_a: \sigma_1^2 > \sigma_2^2$: $f \geq F_{\alpha,m-1,n-1}$

$H_a: \sigma_1^2 < \sigma_2^2$: $f \leq F_{1-\alpha,m-1,n-1}$

$H_a: \sigma_1^2 \neq \sigma_2^2$: either $f \geq F_{\alpha/2,m-1,n-1}$ or $f \leq F_{1-\alpha/2,m-1,n-1}$

Since critical values are tabled only for $\alpha = .10$, .05, .01, and .001, the two-tailed test can be performed only at levels .20, .10, .02, and .002. Other F critical values can be obtained from statistical software.
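The one-tailed versions of the F test, together with the reciprocal relation for lower-tail critical values, can be sketched as follows. The function names `lower_tail_critical` and `f_test` are assumptions made for this illustration, and the critical value is supplied by the caller from an F table, since the Python standard library has no F distribution. The usage lines use the serum ferritin summary values of Example 9.14 ($s_1 = 52.6$, $s_2 = 84.2$, tabled $F_{.01,25,27} = 2.54$).

```python
def lower_tail_critical(upper_critical_swapped_df):
    """F_{1-alpha, v1, v2} = 1 / F_{alpha, v2, v1}.

    The argument is the tabled upper-tail critical value with the
    numerator and denominator df interchanged.
    """
    return 1.0 / upper_critical_swapped_df


def f_test(s1, s2, critical_value, alternative):
    """One-tailed F test of H0: sigma1^2 = sigma2^2 (hypothetical helper).

    s1, s2 are sample standard deviations; critical_value comes from a table.
    Returns (f, reject).
    """
    f = s1 ** 2 / s2 ** 2
    if alternative == "greater":       # Ha: sigma1^2 > sigma2^2
        reject = f >= critical_value
    elif alternative == "less":        # Ha: sigma1^2 < sigma2^2
        reject = f <= critical_value
    else:
        raise ValueError("alternative must be 'greater' or 'less'")
    return f, reject


# Reciprocal relation with the tabled value F_{.05,10,6} = 4.06
print(round(lower_tail_critical(4.06), 3))

# Example 9.14: lower-tailed test at level .01, F_{.99,27,25} = 1/F_{.01,25,27}
crit = lower_tail_critical(2.54)
f, reject = f_test(52.6, 84.2, crit, alternative="less")
print(round(f, 3), round(crit, 3), reject)
```

A two-tailed test would apply both comparisons with $\alpha/2$ in place of $\alpha$; it is omitted here to keep the sketch short.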

Example 9.14

On the basis of data reported in the article “Serum Ferritin in an Elderly Population” (J. of Gerontology, 1979: 521–524), the authors concluded that the ferritin distribution in the elderly had a smaller variance than in the younger adults. (Serum ferritin is used in diagnosing iron deficiency.) For a sample of 28 elderly men, the sample standard deviation of serum ferritin (mg/L) was $s_1 = 52.6$; for 26 young men, the sample standard deviation was $s_2 = 84.2$. Does this data support the conclusion as applied to men?

Let $\sigma_1^2$ and $\sigma_2^2$ denote the variance of the serum ferritin distributions for elderly men and young men, respectively. The hypotheses of interest are $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 < \sigma_2^2$. At level .01, $H_0$ will be rejected if $f \leq F_{.99,27,25}$. To obtain the critical value, we need $F_{.01,25,27}$. From Appendix Table A.9, $F_{.01,25,27} = 2.54$, so $F_{.99,27,25} = 1/2.54 = .394$. The computed value of F is $(52.6)^2/(84.2)^2 = .390$. Since $.390 \leq .394$, $H_0$ is rejected at level .01 in favor of $H_a$, so variability does appear to be greater in young men than in elderly men. ■

P-Values for F Tests

Recall that the P-value for an upper-tailed t test is the area under the relevant t curve (the one with appropriate df) to the right of the calculated t. In the same way, the P-value for an upper-tailed F test is the area under the F curve with appropriate numerator and denominator df to the right of the calculated f. Figure 9.9 illustrates this for a test based on $\nu_1 = 4$ and $\nu_2 = 6$.

[Figure 9.9 A P-value for an upper-tailed F test: F density curve for $\nu_1 = 4$, $\nu_2 = 6$; the shaded area to the right of f = 6.23 is the P-value = .025]

Tabulation of F-curve upper-tail areas is much more cumbersome than for t curves because two df’s are involved. For each combination of $\nu_1$ and $\nu_2$, our F table gives only the four critical values that capture areas .10, .05, .01, and .001. Figure 9.10 shows what can be said about the P-value depending on where f falls relative to the four critical values.

[Figure 9.10 Obtaining P-value information from the F table for an upper-tailed F test. Table entries for $\nu_1 = 4$, $\nu_2 = 6$: critical values 3.18, 4.53, 9.15, and 21.92 for $\alpha$ = .10, .05, .01, and .001. P-value > .10 if f < 3.18; .05 < P-value < .10 if 3.18 < f < 4.53; .01 < P-value < .05 if 4.53 < f < 9.15; .001 < P-value < .01 if 9.15 < f < 21.92; P-value < .001 if f > 21.92]
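The bracketing logic summarized in Figure 9.10 can be expressed as a small lookup. The sketch below is an illustration under stated assumptions: the function name `bracket_p_value` is hypothetical, and the dictionary holds the four tabled critical values for $\nu_1 = 4$, $\nu_2 = 6$ shown in the figure.

```python
def bracket_p_value(f, critical_values):
    """Bound the P-value of an upper-tailed F test from tabled critical values.

    critical_values maps each tabled alpha to F_{alpha, v1, v2}.
    Returns (lower, upper) with lower < P-value <= upper.
    """
    lower, upper = 0.0, 1.0
    for alpha, crit in critical_values.items():
        if f >= crit:
            # f is at or beyond this critical value, so the area to its right
            # is at most alpha
            upper = min(upper, alpha)
        else:
            # f falls short of this critical value, so the area exceeds alpha
            lower = max(lower, alpha)
    return lower, upper


# Tabled values for v1 = 4, v2 = 6 (from Figure 9.10)
table_4_6 = {0.10: 3.18, 0.05: 4.53, 0.01: 9.15, 0.001: 21.92}

for f_value in (5.70, 2.16, 25.03):
    print(f_value, bracket_p_value(f_value, table_4_6))
```

When f coincides with a tabled value, the upper bound returned is the exact P-value. For a two-tailed test, both bounds from the one-tailed lookup would be doubled.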


For example, for a test with $\nu_1 = 4$ and $\nu_2 = 6$,

$f = 5.70 \Rightarrow .01 <$ P-value $< .05$
$f = 2.16 \Rightarrow$ P-value $> .10$
$f = 25.03 \Rightarrow$ P-value $< .001$

Only if f equals a tabulated value do we obtain an exact P-value (e.g., if $f = 4.53$, then P-value = .05). Once we know that .01 < P-value < .05, $H_0$ would be rejected at a significance level of .05 but not at a level of .01. When P-value < .001, $H_0$ should be rejected at any reasonable significance level.

The F tests discussed in succeeding chapters will all be upper-tailed. If, however, a lower-tailed F test is appropriate, then lower-tailed critical values should be obtained as described earlier so that a bound or bounds on the P-value can be established. In the case of a two-tailed test, the bound or bounds from a one-tailed test should be multiplied by 2. For example, if $f = 5.82$ when $\nu_1 = 4$ and $\nu_2 = 6$, then since 5.82 falls between the .05 and .01 critical values, 2(.01) < P-value < 2(.05), giving .02 < P-value < .10. $H_0$ would then be rejected if $\alpha = .10$ but not if $\alpha = .01$. In this case, we cannot say from our table what conclusion is appropriate when $\alpha = .05$ (since we don’t know whether the P-value is smaller or larger than this). However, statistical software shows that the area to the right of 5.82 under this F curve is .029, so the P-value is .058 and the null hypothesis should therefore not be rejected at level .05 (.058 is the smallest $\alpha$ for which $H_0$ can be rejected and our chosen $\alpha$ is smaller than this). Various statistical software packages will, of course, provide an exact P-value for any F test.

A Confidence Interval for $\sigma_1/\sigma_2$

The CI for $\sigma_1^2/\sigma_2^2$ is based on replacing F in the probability statement

$$P(F_{1-\alpha/2,\nu_1,\nu_2} < F < F_{\alpha/2,\nu_1,\nu_2}) = 1 - \alpha$$

by the F variable (9.9) and manipulating the inequalities to isolate $\sigma_1^2/\sigma_2^2$. An interval for $\sigma_1/\sigma_2$ results from taking the square root of each limit. The details are left for an exercise.
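One way the manipulation might come out is sketched below; readers who want to work the exercise first should skip it. Substituting (9.9) and solving the two inequalities for $\sigma_1^2/\sigma_2^2$ gives limits $(s_1^2/s_2^2)/F_{\alpha/2,m-1,n-1}$ and $(s_1^2/s_2^2)/F_{1-\alpha/2,m-1,n-1}$, and the reciprocal relation turns the second into $(s_1^2/s_2^2)\cdot F_{\alpha/2,n-1,m-1}$. The function name `variance_ratio_ci` is an assumption, the critical values are passed in from an F table, and the sample standard deviations in the usage lines are hypothetical numbers chosen only to exercise the code with the tabled values $F_{.05,6,10} = 3.22$ and $F_{.05,10,6} = 4.06$.

```python
from math import sqrt


def variance_ratio_ci(s1, s2, f_upper, f_upper_swapped):
    """CI for sigma1^2/sigma2^2 from tabled critical values (hypothetical helper).

    f_upper         = F_{alpha/2, m-1, n-1}
    f_upper_swapped = F_{alpha/2, n-1, m-1}  (df interchanged)
    """
    ratio = s1 ** 2 / s2 ** 2
    lower = ratio / f_upper             # divide by the upper-tail critical value
    upper = ratio * f_upper_swapped     # dividing by F_{1-alpha/2} = multiplying by its reciprocal
    return lower, upper


# Hypothetical data: m = 7, n = 11, s1 = 2.0, s2 = 1.0, 90% confidence
var_lo, var_hi = variance_ratio_ci(2.0, 1.0, f_upper=3.22, f_upper_swapped=4.06)
sd_lo, sd_hi = sqrt(var_lo), sqrt(var_hi)   # interval for sigma1/sigma2
print(round(var_lo, 3), round(var_hi, 2))
print(round(sd_lo, 3), round(sd_hi, 3))
```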

EXERCISES Section 9.5 (59–66)

59. Obtain or compute the following quantities:

a. $F_{.05,5,8}$

b. $F_{.05,8,5}$

c. $F_{.95,5,8}$

d. $F_{.95,8,5}$

e. The 99th percentile of the F distribution with $\nu_1 = 10$, $\nu_2 = 12$

f. The 1st percentile of the F distribution with $\nu_1 = 10$, $\nu_2 = 12$

g. $P(F \leq 6.16)$ for $\nu_1 = 6$, $\nu_2 = 4$

h. $P(.177 \leq F \leq 4.74)$ for $\nu_1 = 10$, $\nu_2 = 5$

60. Give as much information as you can about the P-value of the F test in each of the following situations:

a. $\nu_1 = 5$, $\nu_2 = 10$, upper-tailed test, $f = 4.75$

b. $\nu_1 = 5$, $\nu_2 = 10$, upper-tailed test, $f = 2.00$

c. $\nu_1 = 5$, $\nu_2 = 10$, two-tailed test, $f = 5.64$

d. $\nu_1 = 5$, $\nu_2 = 10$, lower-tailed test, $f = .200$

e. $\nu_1 = 35$, $\nu_2 = 20$, upper-tailed test, $f = 3.24$

61. Return to the data on maximum lean angle given in Exercise 28 of this chapter. Carry out a test at significance level .10 to see whether the population standard deviations for the two age groups are different (normal probability plots support the necessary normality assumption).

62. Refer to Example 9.7. Does the data suggest that the standard deviation of the strength distribution for fused specimens is smaller than that for not-fused specimens? Carry out a test at significance level .01 by obtaining as much information as you can about the P-value.


63. Toxaphene is an insecticide that has been identified as a pollutant in the Great Lakes ecosystem. To investigate the effect of toxaphene exposure on animals, groups of rats were given toxaphene in their diet. The article “Reproduction Study of Toxaphene in the Rat” (J. of Environ. Sci. Health, 1988: 101–126) reports weight gains (in grams) for rats given a low dose (4 ppm) and for control rats whose diet did not include the insecticide. The sample standard deviation for 23 female control rats was 32 g and for 20 female low-dose rats was 54 g. Does this data suggest that there is more variability in low-dose weight gains than in control weight gains? Assuming normality, carry out a test of hypotheses at significance level .05.

64. In a study of copper deficiency in cattle, the copper values (mg Cu/100 mL blood) were determined both for cattle grazing in an area known to have well-defined molybdenum anomalies (metal values in excess of the normal range of regional variation) and for cattle grazing in a nonanomalous area (“An Investigation into Copper Deficiency in Cattle in the Southern Pennines,” J. Agricultural Soc. Cambridge, 1972: 157–163), resulting in $s_1 = 21.5$ ($m = 48$) for the anomalous condition and $s_2 = 19.45$ ($n = 45$) for the nonanomalous condition. Test for the equality versus inequality of population variances at significance level .10 by using the P-value approach.

65. The article “Enhancement of Compressive Properties of Failed Concrete Cylinders with Polymer Impregnation” (J. of Testing and Evaluation, 1977: 333–337) reports the following data on impregnated compressive modulus ($\text{psi} \times 10^6$) when two different polymers were used to repair cracks in failed concrete.

Epoxy             1.75   2.12   2.05   1.97
MMA prepolymer    1.77   1.59   1.70   1.69

Obtain a 90% CI for the ratio of variances by first using the method suggested in the text to obtain a general confidence interval formula.

66. Reconsider the data of Example 9.6, and calculate a 95% upper confidence bound for the ratio of the standard deviation of the triacetate porosity distribution to that of the cotton porosity distribution.

SUPPLEMENTARY EXERCISES (67–95)

67. The accompanying summary data on compression strength (lb) for $12 \times 10 \times 8$ in. boxes appeared in the article “Compression of Single-Wall Corrugated Shipping Containers Using Fixed and Floating Test Platens” (J. Testing and Evaluation, 1992: 318–320). The authors stated that “the difference between the compression strength using fixed and floating platen method was found to be small compared to normal variation in compression strength between identical boxes.” Do you agree? Is your analysis predicated on any assumptions?

Method     Sample Size   Sample Mean   Sample SD
Fixed          10            807           27
Floating       10            757           41

68. The authors of the article “Dynamics of Canopy Structure and Light Interception in Pinus elliotti, North Florida” (Ecological Monographs, 1991: 33–51) planned an experiment to determine the effect of fertilizer on a measure of leaf area. A number of plots were available for the study, and half were selected at random to be fertilized. To ensure that the plots to receive the fertilizer and the control plots were similar, before beginning the experiment tree density (the number of trees per hectare) was recorded for eight plots to be fertilized and eight control plots, resulting in the given data. Minitab output follows.

Fertilizer plots   1024  1216  1312  1280  1216  1312   992  1120
Control plots      1104  1072  1088  1328  1376  1280  1120  1200

Two sample T for fertilizer vs control

            N   Mean   StDev   SE Mean
fertilize   8   1184     126        44
control     8   1196     118        42

95% CI for mu fertilize - mu control: (-144, 120)

a. Construct a comparative boxplot and comment on any interesting features.

b. Would you conclude that there is a significant difference in the mean tree density for fertilizer and control plots? Use $\alpha = .05$.

c. Interpret the given confidence interval.

69. Is the response rate for questionnaires affected by including some sort of incentive to respond along with the questionnaire? In one experiment, 110 questionnaires with no incentive resulted in 75 being returned, whereas 98 questionnaires that included a chance to win a lottery yielded 66 responses (“Charities, No; Lotteries, No; Cash, Yes,” Public Opinion Quarterly, 1996: 542–562). Does this data suggest that including an incentive increases the likelihood of a response? State and test the relevant hypotheses at significance level .10 by using the P-value method.

70. The accompanying data was obtained in a study to evaluate the liquefaction potential at a proposed nuclear power station


(“Cyclic Strengths Compared for Two Sampling Techniques,” J. of the Geotechnical Division, Am. Soc. Civil Engrs. Proceedings, 1981: 563–576). Before cyclic strength testing, soil samples were gathered using both a pitcher tube method and a block method, resulting in the following observed values of dry density (lb/ft$^3$):

Pitcher sampling
101.1  111.1  107.6   98.1   99.5   98.7  103.3  108.9
109.1  104.1  110.0   98.4  105.1  104.5  105.7  103.3
100.3  102.6  101.7  105.4   99.6  103.3  102.1  104.3

Block sampling
107.1  105.0   98.0   97.9  103.3  104.6  100.1   98.2
 97.9  103.2   96.9

Calculate and interpret a 95% CI for the difference between true average dry densities for the two sampling methods.

71. The article “Quantitative MRI and Electrophysiology of Preoperative Carpal Tunnel Syndrome in a Female Population” (Ergonomics, 1997: 642–649) reported that $(-473.3,\ 1691.9)$ was a large-sample 95% confidence interval for the difference between true average thenar muscle volume (mm$^3$) for sufferers of carpal tunnel syndrome and true average volume for nonsufferers. Calculate a 90% confidence interval for this difference.

72. The following summary data on bending strength (lb-in/in) of joints is taken from the article “Bending Strength of Corner Joints Constructed with Injection Molded Splines” (Forest Products J., April, 1997: 89–92).

Type                    Sample Size   Sample Mean   Sample SD
Without side coating        10           80.95         9.59
With side coating           10           63.23         5.96

a. Calculate a 95% lower confidence bound for true average strength of joints with a side coating.

b. Calculate a 95% lower prediction bound for the strength of a single joint with a side coating.

c. Calculate an interval that, with 95% confidence, includes the strength values for at least 95% of the population of all joints with side coatings.

d. Calculate a 95% confidence interval for the difference between true average strengths for the two types of joints.

73. The article “Urban Battery Litter” cited in Example 8.14 gave the following summary data on zinc mass (g) for two different brands of size D batteries:

Brand       Sample Size   Sample Mean   Sample SD
Duracell        15          138.52         7.76
Energizer       20          149.07         1.52

Assuming that both zinc mass distributions are at least approximately normal, carry out a test at significance level .05 using the P-value approach to decide whether true average zinc mass is different for the two types of batteries.

74. The derailment of a freight train due to the catastrophic failure of a traction motor armature bearing provided the impetus for a study reported in the article “Locomotive Traction Motor Armature Bearing Life Study” (Lubrication Engr., Aug. 1997: 12–19). A sample of 17 high-mileage traction motors was selected, and the amount of cone penetration (mm/10) was determined both for the pinion bearing and for the commutator armature bearing, resulting in the following data:

Motor          1    2    3    4    5    6
Commutator   211  273  305  258  270  209
Pinion       226  278  259  244  273  236

Motor          7    8    9   10   11   12
Commutator   223  288  296  233  262  291
Pinion       290  287  315  242  288  242

Motor         13   14   15   16   17
Commutator   278  275  210  272  264
Pinion       278  208  281  274  268

Calculate an estimate of the population mean difference between penetration for the commutator armature bearing and penetration for the pinion bearing, and do so in a way that conveys information about the reliability and precision of the estimate. [Note: A normal probability plot validates the necessary normality assumption.] Would you say that the population mean difference has been precisely estimated? Does it look as though population mean penetration differs for the two types of bearings? Explain.

75. Headability is the ability of a cylindrical piece of material to be shaped into the head of a bolt, screw, or other cold-formed part without cracking. The article “New Methods for Assessing Cold Heading Quality” (Wire J. Intl., Oct. 1996: 66–72) described the result of a headability impact test applied to 30 specimens of aluminum killed steel and 30 specimens of silicon killed steel. The sample mean headability rating number for the steel specimens was 6.43, and the sample mean for aluminum specimens was 7.09. Suppose that the sample standard deviations were 1.08 and 1.19, respectively. Do you agree with the article’s authors that the difference in headability ratings is significant at the 5% level (assuming that the two headability distributions are normal)?

76. The article “Fatigue Testing of Condoms” cited in Exercise 7.32 reported that for a sample of 20 natural latex condoms of a certain type, the sample mean and sample standard deviation of the number of cycles to break were 4358 and 2218, respectively, whereas a sample of 20 polyisoprene condoms gave a sample mean and sample standard deviation of 5805 and 3990, respectively. Is there strong evidence for concluding that true average number of cycles to break for the polyisoprene condom exceeds that for the natural latex condom by more than 1000 cycles? Carry out a test using a significance level of .01. [Note: The cited paper reported P-values of t tests for comparing means of the various types considered.]

77. Information about hand posture and forces generated by the fingers during manipulation of various daily objects is needed for designing high-tech hand prosthetic devices. The article “Grip Posture and Forces During Holding Cylindrical Objects with Circular Grips” (Ergonomics, 1996: 1163–1176) reported that for a sample of 11 females, the sample mean four-finger pinch strength (N) was 98.1 and the sample standard deviation was 14.2. For a sample of 15 males, the sample mean and sample standard deviation were 129.2 and 39.1, respectively.

a. A test carried out to see whether true average strengths for the two genders were different resulted in $t = 2.51$ and P-value $= .019$. Does the appropriate test procedure described in this chapter yield this value of t and the stated P-value?

b. Is there substantial evidence for concluding that true average strength for males exceeds that for females by more than 25 N? State and test the relevant hypotheses.

78. The article “Pine Needles as Sensors of Atmospheric Pollution” (Environ. Monitoring, 1982: 273–286) reported on the use of neutron-activity analysis to determine pollutant concentration in pine needles. According to the article’s authors, “These observations strongly indicated that for those elements which are determined well by the analytical procedures, the distribution of concentration is lognormal. Accordingly, in tests of significance the logarithms of concentrations will be used.” The given data refers to bromine concentration in needles taken from a site near an oil-fired steam plant and from a relatively clean site. The summary values are means and standard deviations of the log-transformed observations.

Site          Sample Size   Mean Log Concentration   SD of Log Concentration
Steam plant        8                18.0                       4.9
Clean              9                11.0                       4.6

Let $\mu_1^*$ be the true average log concentration at the first site, and define $\mu_2^*$ analogously for the second site.

a. Use the pooled t test (based on assuming normality and equal standard deviations) to decide at significance level .05 whether the two concentration distribution means are equal.

b. If $\sigma_1^*$ and $\sigma_2^*$ (the standard deviations of the two log concentration distributions) are not equal, would $\mu_1$ and $\mu_2$ (the means of the concentration distributions) be the same if $\mu_1^* = \mu_2^*$? Explain your reasoning.

79. The article “The Accuracy of Stated Energy Contents of Reduced-Energy, Commercially Prepared Foods” (J. of the Amer. Dietetic Assoc., 2010: 116–123) presented the accompanying data on vendor-stated gross energy and measured value (both in kcal) for 10 different supermarket convenience meals:

Meal:        1    2    3    4    5    6    7    8    9   10
Stated:    180  220  190  230  200  370  250  240   80  180
Measured:  212  319  231  306  211  431  288  265  145  228

Carry out a test of hypotheses based on a P-value to decide whether the true average % difference from that stated differs from zero. [Note: The article stated “Although formal statistical methods do not apply to convenience samples, standard statistical tests were employed to summarize the data for exploratory purposes and to suggest directions for future studies.”]

80. Arsenic is a known carcinogen and poison. The standard laboratory procedures for measuring arsenic concentration (mg/L) in water are expensive. Consider the accompanying summary data and Minitab output for comparing a laboratory method to a new relatively quick and inexpensive field method (from the article “Evaluation of a New Field Measurement Method for Arsenic in Drinking Water Samples,” J. of Envir. Engr., 2008: 382–388).

Two-Sample T-Test and CI

Sample   N    Mean   StDev   SE Mean
1        3   19.70    1.10      0.64
2        3   10.90    0.60      0.35

Estimate for difference: 8.800
95% CI for difference: (6.498, 11.102)
T-Test of difference = 0 (vs not =): T-Value = 12.16  P-Value = 0.001  DF = 3

What conclusion do you draw about the two methods, and why? Interpret the given confidence interval. [Note: One of the article’s authors indicated in private communication that they were unsure why the two methods disagreed.]

81. The accompanying data on response time appeared in the article “The Extinguishment of Fires Using Low-Flow Water Hose Streams—Part II” (Fire Technology, 1991: 291–320).

Good visibility    .43  1.17   .37   .47   .68   .58   .50  2.75
Poor visibility   1.47   .80  1.58  1.53  4.33  4.23  3.25  3.22

The authors analyzed the data with the pooled t test. Does the use of this test appear justified? [Hint: Check for normality. The z percentiles for $n = 8$ are $-1.53$, $-.89$, $-.49$, $-.15$, .15, .49, .89, and 1.53.]

82. Acrylic bone cement is commonly used in total joint arthroplasty as a grout that allows for the smooth transfer of loads from a metal prosthesis to bone structure. The paper “Validation of the Small-Punch Test as a Technique for Characterizing the Mechanical Properties of Acrylic Bone Cement” (J. of Engr. in Med., 2006: 11–21) gave the following data on breaking force (N):

Temp   Medium   n   $\bar{x}$     s
22°    Dry      6   170.60    39.08
37°    Dry      6   325.73    34.97
22°    Wet      6   366.36    34.82
37°    Wet      6   306.09    41.97

Assume that all population distributions are normal.

a. Estimate true average breaking force in a dry medium at 37° in a way that conveys information about reliability and precision, and interpret your estimate.

b. Estimate the difference between true average breaking force in a dry medium at 37° and true average force at the same temperature in a wet medium, and do so in a way that conveys information about precision and reliability. Then interpret your estimate.

c. Is there strong evidence for concluding that true average force in a dry medium at the higher temperature exceeds that at the lower temperature by more than 100 N?

83. In an experiment to compare bearing strengths of pegs inserted in two different types of mounts, a sample of 14 observations on stress limit for red oak mounts resulted in a sample mean and sample standard deviation of 8.48 MPa and .79 MPa, respectively, whereas a sample of 12 observations when Douglas fir mounts were used gave a mean of 9.36 and a standard deviation of 1.52 (“Bearing Strength of White Oak Pegs in Red Oak and Douglas Fir Timbers,” J. of Testing and Evaluation, 1998, 109–114). Consider testing whether or not true average stress limits are identical for the two types of mounts. Compare df’s and P-values for the unpooled and pooled t tests.

84. How does energy intake compare to energy expenditure? One aspect of this issue was considered in the article “Measurement of Total Energy Expenditure by the Doubly Labelled Water Method in Professional Soccer Players” (J. of Sports Sciences, 2002: 391–397), which contained the accompanying data (MJ/day).

Player          1     2     3     4     5     6     7
Expenditure   14.4  12.1  14.3  14.2  15.2  15.5  17.8
Intake        14.6   9.2  11.8  11.6  12.7  15.0  16.3

Test to see whether there is a significant difference between intake and expenditure. Does the conclusion depend on whether a significance level of .05, .01, or .001 is used?

85. An experimenter wishes to obtain a CI for the difference between true average breaking strength for cables manufactured by company I and by company II. Suppose breaking strength is normally distributed for both types of cable with $\sigma_1 = 30$ psi and $\sigma_2 = 20$ psi.

a. If costs dictate that the sample size for the type I cable should be three times the sample size for the type II cable, how many observations are required if the 99% CI is to be no wider than 20 psi?

b. Suppose a total of 400 observations is to be made. How many of the observations should be made on type I cable samples if the width of the resulting interval is to be a minimum?

86. An experiment to determine the effects of temperature on the survival of insect eggs was described in the article “Development Rates and a Temperature-Dependent Model of Pales Weevil” (Environ. Entomology, 1987: 956–962). At 11°C, 73 of 91 eggs survived to the next stage of development. At 30°C, 102 of 110 eggs survived. Do the results of this experiment suggest that the survival rate (proportion surviving) differs for the two temperatures? Calculate the P-value and use it to test the appropriate hypotheses.

87. Wait staff at restaurants have employed various strategies to increase tips. An article in the Sept. 5, 2005, New Yorker reported that “In one study a waitress received 50% more in tips when she introduced herself by name than when she didn’t.” Consider the following (fictitious) data on tip amount as a percentage of the bill:

Introduction:      $m = 50$   $\bar{x} = 22.63$   $s_1 = 7.82$
No introduction:   $n = 50$   $\bar{y} = 14.15$   $s_2 = 6.10$

Does this data suggest that an introduction increases tips on average by more than 50%? State and test the relevant hypotheses. [Hint: Consider the parameter $\theta = \mu_1 - 1.5\mu_2$.]

88. The paper “Quantitative Assessment of Glenohumeral Translation in Baseball Players” (The Amer. J. of Sports Med., 2004: 1711–1715) considered various aspects of shoulder motion for a sample of pitchers and another sample of position players [glenohumeral refers to the articulation between the humerus (ball) and the glenoid (socket)]. The authors kindly supplied the following data on anteroposterior translation (mm), a measure of the extent of anterior and posterior motion, both for the dominant arm and the nondominant arm.

        Pos Dom Tr   Pos ND Tr   Pit Dom Tr   Pit ND Tr
 1        30.31        32.54       27.63        24.33
 2        44.86        40.95       30.57        26.36
 3        22.09        23.48       32.62        30.62
 4        31.26        31.11       39.79        33.74
 5        28.07        28.75       28.50        29.84
 6        31.93        29.32       26.70        26.71
 7        34.68        34.79       30.34        26.45
 8        29.10        28.87       28.69        21.49
 9        25.51        27.59       31.19        20.82
10        22.49        21.01       36.00        21.75
11        28.74        30.31       31.58        28.32
12        27.89        27.92       32.55        27.22
13        28.48        27.85       29.56        28.86
14        25.60        24.95       28.64        28.58
15        20.21        21.59       28.58        27.15
16        33.77        32.48       31.99        29.46
17        32.59        32.48       27.16        21.26
18        32.60        31.61
19        29.30        27.46
mean      29.4463      29.2137     30.7112      26.6447
sd         5.4655       4.7013      3.3310       3.6679

a. Estimate the true average difference in translation between dominant and nondominant arms for pitchers in a way that conveys information about reliability and precision, and interpret the resulting estimate.

b. Repeat (a) for position players.

c. The authors asserted that “pitchers have greater difference in side-to-side anteroposterior translation of their shoulders compared with position players.” Do you agree? Explain.

89. Suppose a level .05 test of $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$ is to be performed, assuming $\sigma_1 = \sigma_2 = 10$ and normality of both distributions, using equal sample sizes ($m = n$). Evaluate the probability of a type II error when $\mu_1 - \mu_2 = 1$ and $n = 25$, 100, 2500, and 10,000. Can you think of real problems in which the difference $\mu_1 - \mu_2 = 1$ has little practical significance? Would sample sizes of $n = 10{,}000$ be desirable in such problems?

90. The following data refers to airborne bacteria count (number of colonies/ft$^3$) both for carpeted hospital rooms and for uncarpeted rooms (“Microbial Air Sampling in a Carpeted Hospital,” J. of Environmental Health, 1968: 405). Does there appear to be a difference in true average bacteria count between carpeted and uncarpeted rooms?

Carpeted     11.8   8.2   7.1  13.0  10.8  10.1  14.6  14.0   ($m = 8$)
Uncarpeted   12.1   8.3   3.8   7.2  12.0  11.1  10.1  13.7   ($n = 8$)

Suppose you later learned that all carpeted rooms were in a veterans’ hospital, whereas all uncarpeted rooms were in a children’s hospital. Would you be able to assess the effect of carpeting? Comment.

91. Researchers sent 5000 resumes in response to job ads that appeared in the Boston Globe and Chicago Tribune. The resumes were identical except that 2500 of them had “white sounding” first names, such as Brett and Emily, whereas the other 2500 had “black sounding” names such as Tamika and Rasheed. The resumes of the first type elicited 250 responses and the resumes of the second type only 167 responses (these numbers are very consistent with information that appeared in a Jan. 15, 2003, report by the Associated Press). Does this data strongly suggest that a resume with a “black” name is less likely to result in a response than is a resume with a “white” name?

92. McNemar’s test, developed in Exercise 54, can also be used when individuals are paired (matched) to yield n pairs and then one member of each pair is given treatment 1 and the other is given treatment 2. Then $X_1$ is the number of pairs in which both treatments were successful, and similarly for $X_2$, $X_3$, and $X_4$. The test statistic for testing equal efficacy of the two treatments is given by $(X_2 - X_3)/\sqrt{X_2 + X_3}$, which has approximately a standard normal distribution when $H_0$ is true. Use this to test whether the drug ergotamine is effective in the treatment of migraine headaches.

                Ergotamine
                 S     F
Placebo   S     44    34
          F     46    30

The data is fictitious, but the conclusion agrees with that in the article “Controlled Clinical Trial of Ergotamine Tartrate” (British Med. J., 1970: 325–327).
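The statistic described in Exercise 92 is simple enough to sketch in code. The following Python fragment is only an illustration under stated assumptions: the function name `mcnemar_z` is hypothetical, the counts passed to it are made up for the demonstration (they are not the exercise data), and the identification of $X_2$ and $X_3$ with the two kinds of pairs in which exactly one treatment succeeded follows Exercise 54, which is not reproduced here. A two-sided P-value from the standard normal distribution is returned; a one-sided test would use a single tail instead.

```python
import math

def mcnemar_z(x2, x3):
    """Return (z, two-sided P-value) for z = (x2 - x3) / sqrt(x2 + x3).

    x2 and x3 are assumed to be the counts of pairs in which exactly one
    of the two treatments was successful.
    """
    if x2 + x3 == 0:
        raise ValueError("the statistic is undefined when x2 + x3 = 0")
    z = (x2 - x3) / math.sqrt(x2 + x3)
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_sided

# Hypothetical counts, chosen only to show the call.
z, p = mcnemar_z(25, 11)
print(f"z = {z:.3f}, two-sided P-value = {p:.4f}")
```

Only the two counts of pairs with differing outcomes enter the statistic; pairs in which both treatments succeeded or both failed carry no information about which treatment is better.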

93. The article “Evaluating Variability in Filling Operations” (Food Tech., 1984: 51–55) describes two different filling operations used in a ground-beef packing plant. Both filling operations were set to fill packages with 1400 g of ground beef. In a random sample of size 30 taken from each filling operation, the resulting means and standard deviations were 1402.24 g and 10.97 g for operation 1 and 1419.63 g and 9.96 g for operation 2.

a. Using a .05 significance level, is there sufficient evidence to indicate that the true mean weight of the packages differs for the two operations?

b. Does the data from operation 1 suggest that the true mean weight of packages produced by operation 1 is higher than 1400 g? Use a .05 significance level.

94. Let $X_1, \ldots, X_m$ be a random sample from a Poisson distribution with parameter $\mu_1$, and let $Y_1, \ldots, Y_n$ be a random sample from another Poisson distribution with parameter $\mu_2$. We wish to test $H_0: \mu_1 - \mu_2 = 0$ against one of the three standard alternatives. When m and n are large, the large-sample z test of Section 9.1 can be used. However, the fact that $V(\bar{X}) = \mu/n$ suggests that a different denominator should be used in standardizing $\bar{X} - \bar{Y}$. Develop a large-sample test procedure appropriate to this problem, and then apply it to the following data to test whether the plant densities for a particular species are equal in two different regions (where each observation is the number of plants found in a randomly located square sampling quadrate having area 1 m$^2$, so for region 1 there were 40 quadrates in which one plant was observed, etc.):

                       Frequency
             0    1    2    3    4    5    6    7
Region 1    28   40   28   17    8    2    1    1    $m = 125$
Region 2    14   25   30   18   49    2    1    1    $n = 140$

95. Referring to Exercise 94, develop a large-sample confidence interval formula for $\mu_1 - \mu_2$. Calculate the interval for the data given there using a confidence level of 95%.

Bibliography See the bibliography at the end of Chapter 7.


10 The Analysis of Variance

INTRODUCTION

In studying methods for the analysis of quantitative data, we first focused on problems involving a single sample of numbers and then turned to a comparative analysis of two such different samples. In one-sample problems, the data consisted of observations on or responses from individuals or experimental objects randomly selected from a single population. In two-sample problems, either the two samples were drawn from two different populations and the parameters of interest were the population means, or else two different treatments were applied to experimental units (individuals or objects) selected from a single population; in this latter case, the parameters of interest are referred to as true treatment means.

The analysis of variance, or more briefly ANOVA, refers broadly to a collection of experimental situations and statistical procedures for the analysis of quantitative responses from experimental units. The simplest ANOVA problem is referred to variously as a single-factor, single-classification, or one-way ANOVA. It involves the analysis either of data sampled from more than two numerical populations (distributions) or of data from experiments in which more than two treatments have been used. The characteristic that differentiates the treatments or populations from one another is called the factor under study, and the different treatments or populations are referred to as the levels of the factor. Examples of such situations include the following:

1. An experiment to study the effects of five different brands of gasoline on automobile engine operating efficiency (mpg)

2. An experiment to study the effects of the presence of four different sugar solutions (glucose, sucrose, fructose, and a mixture of the three) on bacterial growth

3. An experiment to investigate whether hardwood concentration in pulp (%) has an effect on tensile strength of bags made from the pulp

4. An experiment to decide whether the color density of fabric specimens depends on the amount of dye used

In (1) the factor of interest is gasoline brand, and there are five different levels of the factor. In (2) the factor is sugar, with four levels (or five, if a control solution containing no sugar is used). In both (1) and (2), the factor is qualitative in nature, and the levels correspond to possible categories of the factor. In (3) and (4), the factors are concentration of hardwood and amount of dye, respectively; both these factors are quantitative in nature, so the levels identify different settings of the factor. When the factor of interest is quantitative, statistical techniques from regression analysis (discussed in Chapters 12 and 13) can also be used to analyze the data.

This chapter focuses on single-factor ANOVA. Section 10.1 presents the F test for testing the null hypothesis that the population or treatment means are identical. Section 10.2 considers further analysis of the data when $H_0$ has been rejected. Section 10.3 covers some other aspects of single-factor ANOVA. Chapter 11 introduces ANOVA experiments involving more than a single factor.

10.1 Single-Factor ANOVA

Single-factor ANOVA focuses on a comparison of more than two population or treatment means. Let

$I$ = the number of populations or treatments being compared

$\mu_1$ = the mean of population 1 or the true average response when treatment 1 is applied

$\vdots$

$\mu_I$ = the mean of population I or the true average response when treatment I is applied

The relevant hypotheses are

$H_0: \mu_1 = \mu_2 = \cdots = \mu_I$

versus

$H_a$: at least two of the $\mu_i$’s are different

If $I = 4$, $H_0$ is true only if all four $\mu_i$’s are identical. $H_a$ would be true, for example, if $\mu_1 = \mu_2 \neq \mu_3 = \mu_4$, if $\mu_1 = \mu_3 = \mu_4 \neq \mu_2$, or if all four $\mu_i$’s differ from one another.

A test of these hypotheses requires that we have available a random sample from each population or treatment.

Example 10.1

The article “Compression of Single-Wall Corrugated Shipping Containers Using Fixed and Floating Test Platens” (J. Testing and Evaluation, 1992: 318–320) describes an experiment in which several different types of boxes were compared with respect to compression strength (lb). Table 10.1 presents the results of a single-factor ANOVA experiment involving $I = 4$ types of boxes (the sample means and standard deviations are in good agreement with values given in the article).

Table 10.1 The Data and Summary Quantities for Example 10.1

Type of Box   Compression Strength (lb)                    Sample Mean   Sample SD
1             655.5  788.3  734.3  721.4  679.1  699.4       713.00        46.55
2             789.2  772.5  786.9  686.1  732.1  774.8       756.93        40.34
3             737.1  639.0  696.3  671.7  717.2  727.1       698.07        37.20
4             535.1  628.7  542.4  559.0  586.9  520.0       562.02        39.87

Grand mean = 682.50

Figure 10.1 Boxplots for Example 10.1: (a) original data; (b) altered data

With $\mu_i$ denoting the true average compression strength for boxes of type i ($i = 1$, 2, 3, 4), the null hypothesis is $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Figure 10.1(a) shows a comparative boxplot for the four samples. There is a substantial amount of overlap among observations on the first three types of boxes, but compression strengths for the fourth type appear considerably smaller than for the other types. This suggests that $H_0$ is not true. The comparative boxplot in Figure 10.1(b) is based on adding 120 to each observation in the fourth sample (giving mean 682.02 and the same standard deviation) and leaving the other observations unaltered. It is no longer obvious whether $H_0$ is true or false. In situations such as this, we need a formal test procedure.


Notation and Assumptions

The letters X and Y were used in two-sample problems to differentiate the observations in one sample from those in the other. Because this is cumbersome for three or more samples, it is customary to use a single letter with two subscripts. The first subscript identifies the sample number, corresponding to the population or treatment being sampled, and the second subscript denotes the position of the observation within that sample. Let

$X_{i,j}$ = the random variable (rv) that denotes the jth measurement taken from the ith population, or the measurement taken on the jth experimental unit that receives the ith treatment

$x_{i,j}$ = the observed value of $X_{i,j}$ when the experiment is performed

The observed data is usually displayed in a rectangular table, such as Table 10.1. There, samples from the different populations appear in different rows of the table, and $x_{i,j}$ is the jth number in the ith row. For example, $x_{2,3} = 786.9$ (the third observation from the second population), and $x_{4,1} = 535.1$. When there is no ambiguity, we will write $x_{ij}$ rather than $x_{i,j}$ (e.g., if there were 15 observations on each of 12 treatments, $x_{112}$ could mean $x_{1,12}$ or $x_{11,2}$). It is assumed that the $X_{ij}$'s within any particular sample are independent (a random sample from the ith population or treatment distribution) and that different samples are independent of one another.

In some experiments, different samples contain different numbers of observations. Here we’ll focus on the case of equal sample sizes; the generalization to unequal sample sizes appears in Section 10.3. Let J denote the number of observations in each sample ($J = 6$ in Example 10.1). The data set consists of IJ observations. The individual sample means will be denoted by $\bar{X}_{1\cdot}, \bar{X}_{2\cdot}, \ldots, \bar{X}_{I\cdot}$. That is,

$$\bar{X}_{i\cdot} = \frac{\sum_{j=1}^{J} X_{ij}}{J} \qquad i = 1, 2, \ldots, I$$

The dot in place of the second subscript signifies that we have added over all values of that subscript while holding the other subscript value fixed, and the horizontal bar indicates division by J to obtain an average. Similarly, the average of all IJ observations, called the grand mean, is

$$\bar{X}_{\cdot\cdot} = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} X_{ij}}{IJ}$$

For the data in Table 10.1, $\bar{x}_{1\cdot} = 713.00$, $\bar{x}_{2\cdot} = 756.93$, $\bar{x}_{3\cdot} = 698.07$, $\bar{x}_{4\cdot} = 562.02$, and $\bar{x}_{\cdot\cdot} = 682.50$. Additionally, let $S_1^2, S_2^2, \ldots, S_I^2$ denote the sample variances:

$$S_i^2 = \frac{\sum_{j=1}^{J} (X_{ij} - \bar{X}_{i\cdot})^2}{J - 1} \qquad i = 1, 2, \ldots, I$$

From Example 10.1, $s_1 = 46.55$, $s_1^2 = 2166.90$, and so on.
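The notation above can be checked numerically. The following is a minimal sketch, assuming the Table 10.1 observations are typed in as a list of rows; the variable names (`strength`, `sample_means`, and so on) are illustrative choices, not part of the text.

```python
from statistics import mean, stdev, variance

# Compression strengths (lb) from Table 10.1; row i holds the J = 6
# observations x_i1, ..., x_iJ for box type i.
strength = [
    [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
]

I = len(strength)       # number of samples (treatments)
J = len(strength[0])    # common sample size

# Sample means x-bar_i. (average over the second subscript).
sample_means = [mean(row) for row in strength]

# Grand mean x-bar_.. (average of all IJ observations).
grand_mean = sum(sum(row) for row in strength) / (I * J)

# Sample variances s_i^2 (divisor J - 1) and standard deviations s_i.
sample_vars = [variance(row) for row in strength]
sample_sds = [stdev(row) for row in strength]

for i in range(I):
    print(f"type {i + 1}: mean = {sample_means[i]:.2f}, sd = {sample_sds[i]:.2f}")
print(f"grand mean = {grand_mean:.2f}")
```

The printed values should agree with the summary columns of Table 10.1 up to rounding.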

ASSUMPTIONS The I population or treatment distributions are all normal with the same variance $\sigma^2$. That is, each $X_{ij}$ is normally distributed with

$$E(X_{ij}) = \mu_i \qquad V(X_{ij}) = \sigma^2$$


The I sample standard deviations will generally differ somewhat even when the corresponding $\sigma$'s are identical. In Example 10.1, the largest among $s_1$, $s_2$, $s_3$, and $s_4$ is about 1.25 times the smallest. A rough rule of thumb is that if the largest s is not much more than two times the smallest, it is reasonable to assume equal $\sigma^2$'s.
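The rule of thumb is a one-line computation. The sketch below uses the four sample standard deviations from Table 10.1; the cutoff of 2.0 is an assumed reading of "not much more than two times" and is a rough guide, not a formal test.

```python
# Sample standard deviations reported in Table 10.1 (Example 10.1).
sample_sds = [46.55, 40.34, 37.20, 39.87]

# Ratio of the largest to the smallest sample standard deviation.
ratio = max(sample_sds) / min(sample_sds)

# ASSUMPTION: 2.0 is used as the cutoff for the informal rule of thumb.
equal_variance_plausible = ratio <= 2.0

print(f"largest s / smallest s = {ratio:.2f}")
print("equal variances plausible:", equal_variance_plausible)
```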

In previous chapters, a normal probability plot was suggested for checking normality. The individual sample sizes in ANOVA are typically too small for I separate plots to be informative. A single plot can be constructed by subtracting $\bar{x}_{1\cdot}$ from each observation in the first sample, $\bar{x}_{2\cdot}$ from each observation in the second, and so on, and then plotting these IJ deviations against the z percentiles. Figure 10.2 gives such a plot for the data of Example 10.1. The straightness of the pattern gives strong support to the normality assumption.

Figure 10.2 A normal probability plot based on the data of Example 10.1 (vertical axis: deviation; horizontal axis: z percentile)
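The coordinates of such a pooled plot can be sketched as follows. The plotting positions $(k - .5)/n$ used for the z percentiles are an assumption here (other conventions exist), and the code only produces the points; drawing them is left to whatever plotting tool is at hand.

```python
from statistics import NormalDist, mean

# Compression strengths (lb) from Table 10.1, one row per box type.
strength = [
    [655.5, 788.3, 734.3, 721.4, 679.1, 699.4],
    [789.2, 772.5, 786.9, 686.1, 732.1, 774.8],
    [737.1, 639.0, 696.3, 671.7, 717.2, 727.1],
    [535.1, 628.7, 542.4, 559.0, 586.9, 520.0],
]

# Deviation of each observation from its own sample mean, pooled over
# all I samples and sorted (IJ = 24 values in total).
deviations = sorted(x - mean(row) for row in strength for x in row)

n = len(deviations)

# ASSUMPTION: the kth smallest deviation is paired with the standard
# normal percentile at (k - 0.5) / n.
z_percentiles = [NormalDist().inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]

# (z percentile, deviation) pairs; a roughly linear pattern supports normality.
points = list(zip(z_percentiles, deviations))
for z, d in points:
    print(f"{z:6.2f}  {d:8.2f}")
```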

If either the normality assumption or the assumption of equal variances is judged implausible, a method of analysis other than the usual F test must be employed. Please seek expert advice in such situations (one possibility, a data transformation, is suggested in Section 10.3, and another alternative is developed in Section 15.4).

The Test Statistic

If $H_0$ is true, the J observations in each sample come from a normal population distribution with the same mean value $\mu$, in which case the sample means $\bar{x}_{1\cdot}, \ldots, \bar{x}_{I\cdot}$ should be reasonably close to one another. The test procedure is based on comparing a measure of differences among the $\bar{x}_{i\cdot}$'s (“between-samples” variation) to a measure of variation calculated from within each of the samples.

DEFINITION Mean square for treatments is given by

$$\text{MSTr} = \frac{J}{I - 1}\left[(\bar{X}_{1\cdot} - \bar{X}_{\cdot\cdot})^2 + (\bar{X}_{2\cdot} - \bar{X}_{\cdot\cdot})^2 + \cdots + (\bar{X}_{I\cdot} - \bar{X}_{\cdot\cdot})^2\right] = \frac{J}{I - 1} \sum_{i} (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2$$

and mean square for error is

$$\text{MSE} = \frac{S_1^2 + S_2^2 + \cdots + S_I^2}{I}$$

The test statistic for single-factor ANOVA is $F = \text{MSTr}/\text{MSE}$.
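The definition translates directly into code. The sketch below assumes equal sample sizes, so the grand mean is the plain average of the sample means; the function name `f_statistic` is hypothetical, and the inputs are the rounded summary values from Table 10.1.

```python
def f_statistic(sample_means, sample_sds, J):
    """Return (MSTr, MSE, f) for I samples that each contain J observations."""
    I = len(sample_means)
    # With equal sample sizes the grand mean is the average of the sample means.
    grand_mean = sum(sample_means) / I
    # Between-samples variation: J/(I - 1) times the sum of squared deviations
    # of the sample means about the grand mean.
    mstr = J / (I - 1) * sum((m - grand_mean) ** 2 for m in sample_means)
    # Within-samples variation: the average of the I sample variances.
    mse = sum(s ** 2 for s in sample_sds) / I
    return mstr, mse, mstr / mse


# Summary quantities from Table 10.1 (I = 4 box types, J = 6 boxes each).
mstr, mse, f = f_statistic(
    sample_means=[713.00, 756.93, 698.07, 562.02],
    sample_sds=[46.55, 40.34, 37.20, 39.87],
    J=6,
)
print(f"MSTr = {mstr:.2f}, MSE = {mse:.2f}, f = {f:.2f}")
```

Because the inputs are rounded summaries, the results can differ slightly from values computed on the raw observations.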


The terminology “mean square” will be explained shortly. Notice that uppercase $\bar{X}$'s and $S^2$'s are used, so MSTr and MSE are defined as statistics. We will follow tradition and also use MSTr and MSE (rather than mstr and mse) to denote the calculated values of these statistics. Each $S_i^2$ assesses variation within a particular sample, so MSE is a measure of within-samples variation.

What kind of value of F provides evidence for or against $H_0$? If $H_0$ is true (all $\mu_i$'s are equal), the values of the individual sample means should be close to one another and therefore close to the grand mean, resulting in a relatively small value of MSTr. However, if the $\mu_i$'s are quite different, some $\bar{x}_{i\cdot}$'s should differ quite a bit from $\bar{x}_{\cdot\cdot}$. So the value of MSTr is affected by the status of $H_0$ (true or false). This is not the case with MSE, because the $s_i^2$'s depend only on the underlying value of $\sigma^2$ and not on where the various distributions are centered. The following box presents an important property of E(MSTr) and E(MSE), the expected values of these two statistics.

PROPOSITION When $H_0$ is true,

$$E(\text{MSTr}) = E(\text{MSE}) = \sigma^2$$

whereas when $H_0$ is false,

$$E(\text{MSTr}) > E(\text{MSE}) = \sigma^2$$

That is, both statistics are unbiased for estimating the common population variance $\sigma^2$ when $H_0$ is true, but MSTr tends to overestimate $\sigma^2$ when $H_0$ is false.
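A small simulation can make the proposition concrete. This is only an illustrative sketch: the population means, $\sigma = 1$, the number of replications, and the seed are all arbitrary assumptions, and the averages approximate the expected values only up to simulation error.

```python
import random


def simulate_mean_squares(mus, sigma, J, reps, seed=0):
    """Average MSTr and MSE over `reps` simulated normal data sets."""
    rng = random.Random(seed)
    I = len(mus)
    total_mstr = 0.0
    total_mse = 0.0
    for _ in range(reps):
        # One data set: J normal observations from each of the I populations.
        samples = [[rng.gauss(mu, sigma) for _ in range(J)] for mu in mus]
        means = [sum(s) / J for s in samples]
        grand = sum(means) / I
        mstr = J / (I - 1) * sum((m - grand) ** 2 for m in means)
        variances = [
            sum((x - m) ** 2 for x in s) / (J - 1) for s, m in zip(samples, means)
        ]
        mse = sum(variances) / I
        total_mstr += mstr
        total_mse += mse
    return total_mstr / reps, total_mse / reps


# H0 true: all population means equal, sigma^2 = 1.
null_mstr, null_mse = simulate_mean_squares([0, 0, 0, 0], 1.0, J=6, reps=4000)

# H0 false: one mean differs. By the formula in Exercise 10(d),
# E(MSTr) = 1 + (6/3)(3) = 7 here, while E(MSE) stays at 1.
alt_mstr, alt_mse = simulate_mean_squares([0, 0, 0, 2], 1.0, J=6, reps=4000)

print(f"H0 true : average MSTr = {null_mstr:.2f}, average MSE = {null_mse:.2f}")
print(f"H0 false: average MSTr = {alt_mstr:.2f}, average MSE = {alt_mse:.2f}")
```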

The unbiasedness of MSE is a consequence of $E(S_i^2) = \sigma^2$ whether $H_0$ is true or false. When $H_0$ is true, each $\bar{X}_{i\cdot}$ has the same mean value $\mu$ and variance $\sigma^2/J$, so $\sum (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2/(I - 1)$, the “sample variance” of the $\bar{X}_{i\cdot}$'s, estimates $\sigma^2/J$ unbiasedly; multiplying this by J gives MSTr as an unbiased estimator of $\sigma^2$ itself. The $\bar{X}_{i\cdot}$'s tend to spread out more when $H_0$ is false than when it is true, tending to inflate the value of MSTr in this case. Thus a value of F that greatly exceeds 1, corresponding to an MSTr much larger than MSE, casts considerable doubt on $H_0$. The appropriate form of the rejection region is therefore $f \geq c$. The cutoff c should be chosen to give $P(F \geq c \text{ where } H_0 \text{ is true}) = \alpha$, the desired significance level. This necessitates knowing the distribution of F when $H_0$ is true.

F Distributions and the F Test

In Chapter 9, we introduced a family of probability distributions called F distributions. An F distribution arises in connection with a ratio in which there is one number of degrees of freedom (df) associated with the numerator and another number of degrees of freedom associated with the denominator. Let $\nu_1$ and $\nu_2$ denote the number of numerator and denominator degrees of freedom, respectively, for a variable with an F distribution. Both $\nu_1$ and $\nu_2$ are positive integers. Figure 10.3 pictures an F density curve and the corresponding upper-tail critical value $F_{\alpha,\nu_1,\nu_2}$. Appendix Table A.9 gives these critical values for $\alpha = .10, .05, .01$, and $.001$. Values of $\nu_1$ are identified with different columns of the table, and the rows are labeled with various values of $\nu_2$. For example, the F critical value that captures upper-tail area .05 under the F curve with $\nu_1 = 4$ and $\nu_2 = 6$ is $F_{.05,4,6} = 4.53$, whereas $F_{.05,6,4} = 6.16$. The key theoretical result is that the test statistic F has an F distribution when $H_0$ is true.
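Upper-tail areas under an F curve can also be computed without a table. The sketch below is one possible approach using only the Python standard library: it relies on the standard identity $P(F \geq f) = I_x(\nu_2/2, \nu_1/2)$ with $x = \nu_2/(\nu_2 + \nu_1 f)$, where $I_x$ is the regularized incomplete beta function, evaluated here by a continued fraction. The function names are hypothetical, and a statistical package would normally be used instead.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use whichever form of the continued fraction converges quickly.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_upper_tail(f, nu1, nu2):
    """Area under the F(nu1, nu2) density curve to the right of f."""
    if f <= 0.0:
        return 1.0
    x = nu2 / (nu2 + nu1 * f)
    return regularized_incomplete_beta(nu2 / 2.0, nu1 / 2.0, x)


# The two tabled critical values quoted above should each cut off
# an upper-tail area of about .05.
print(f_upper_tail(4.53, 4, 6))
print(f_upper_tail(6.16, 6, 4))
```

The same function gives a P-value directly when it is evaluated at a computed f with $\nu_1 = I - 1$ and $\nu_2 = I(J - 1)$.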


THEOREM Let $F = \text{MSTr}/\text{MSE}$ be the test statistic in a single-factor ANOVA problem involving I populations or treatments with a random sample of J observations from each one. When $H_0$ is true and the basic assumptions of this section are satisfied, F has an F distribution with $\nu_1 = I - 1$ and $\nu_2 = I(J - 1)$. With f denoting the computed value of F, the rejection region $f \geq F_{\alpha,I-1,I(J-1)}$ then specifies a test with significance level $\alpha$. Refer to Section 9.5 to see how P-value information for F tests is obtained.

Figure 10.3 An F density curve and critical value $F_{\alpha,\nu_1,\nu_2}$ (the shaded upper-tail area is $\alpha$)

The rationale for $\nu_1 = I - 1$ is that although MSTr is based on the I deviations $\bar{X}_{1\cdot} - \bar{X}_{\cdot\cdot}, \ldots, \bar{X}_{I\cdot} - \bar{X}_{\cdot\cdot}$, $\sum (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot}) = 0$, so only $I - 1$ of these are freely determined. Because each sample contributes $J - 1$ df to MSE and these samples are independent, $\nu_2 = (J - 1) + \cdots + (J - 1) = I(J - 1)$.

Example 10.2 (Example 10.1 continued)

The values of I and J for the strength data are 4 and 6, respectively, so numerator df $= I - 1 = 3$ and denominator df $= I(J - 1) = 20$. At significance level .05, $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ will be rejected in favor of the conclusion that at least two $\mu_i$'s are different if $f \geq F_{.05,3,20} = 3.10$. The grand mean is $\bar{x}_{\cdot\cdot} = \sum\sum x_{ij}/(IJ) = 682.50$,

$$\text{MSTr} = \frac{6}{4 - 1}\left[(713.00 - 682.50)^2 + (756.93 - 682.50)^2 + (698.07 - 682.50)^2 + (562.02 - 682.50)^2\right] = 42{,}455.86$$

$$\text{MSE} = \frac{1}{4}\left[(46.55)^2 + (40.34)^2 + (37.20)^2 + (39.87)^2\right] = 1691.92$$

$$f = \text{MSTr}/\text{MSE} = 42{,}455.86/1691.92 = 25.09$$

Since $25.09 \geq 3.10$, $H_0$ is resoundingly rejected at significance level .05. True average compression strength does appear to depend on box type. In fact, P-value = area under the F curve to the right of 25.09 = .000. $H_0$ would be rejected at any reasonable significance level. ■

Example 10.3

The article “Influence of Contamination and Cleaning on Bond Strength to Modified Zirconia” (Dental Materials, 2009: 1541–1550) reported on an experiment in which 50 zirconium-oxide disks were divided into five groups of 10 each. Then a different contamination/cleaning protocol was used for each group. The following summary data on shear bond strength (MPa) appeared in the article:

Treatment:     1      2      3      4      5
Sample mean    10.5   14.8   15.7   16.0   21.6     Grand mean = 15.7
Sample sd      4.5    6.8    6.5    6.7    6.0

Let $\mu_i$ denote the true average bond strength for protocol i ($i = 1, 2, 3, 4, 5$). The null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ asserts that true average strength is the same for all protocols (doesn’t depend upon which protocol is used). The alternative hypothesis $H_a$ states that at least two of the treatment $\mu$'s are different (the negation of the null hypothesis). The authors of the cited article used the F test, so hopefully they examined a normal probability plot of the deviations (or a separate plot for each sample, since each sample size is 10) to check the plausibility of assuming normal treatment-response distributions. The five sample standard deviations are certainly close enough to one another to support the assumption of equal $\sigma$'s.

Numerator and denominator df for the test are $I - 1 = 4$ and $I(J - 1) = 5(9) = 45$, respectively. The F critical value for a test with significance level .01 is $F_{.01,4,45} = 3.77$ (our F table doesn’t have a group of rows for 45 denominator df, but the .01 entry for 40 df is 3.83 and for 50 df is 3.72). So $H_0$ will be rejected if $f \geq 3.77$.

The mean squares are

$$\text{MSTr} = \frac{10}{5 - 1}\left[(10.5 - 15.7)^2 + (14.8 - 15.7)^2 + (15.7 - 15.7)^2 + (16.0 - 15.7)^2 + (21.6 - 15.7)^2\right] = 156.875$$

$$\text{MSE} = \left[(4.5)^2 + (6.8)^2 + (6.5)^2 + (6.7)^2 + (6.0)^2\right]/5 = 37.926$$

Thus the test statistic value is $f = 156.875/37.926 = 4.14$. This value falls in the rejection region ($4.14 \geq 3.77$). At significance level .01, we are able to conclude that true average strength does appear to depend on which protocol is used. Statistical software gives the P-value as .0061. ■

When the null hypothesis is rejected by the F test, as happened in both Examples 10.2 and 10.3, the experimenter will often be interested in further analysis of the data to decide which $\mu_i$'s differ from which others. Methods for doing this are called multiple comparison procedures; that is the topic of Section 10.2. The article cited in Example 10.3 summarizes the results of such an analysis.

Sums of Squares

The introduction of sums of squares facilitates developing an intuitive appreciation for the rationale underlying single-factor and multifactor ANOVAs. Let $x_{i\cdot}$ represent the sum (not the average, since there is no bar) of the $x_{ij}$'s for i fixed (sum of the numbers in the ith row of the table) and $x_{\cdot\cdot}$ denote the sum of all the $x_{ij}$'s (the grand total).

DEFINITION The total sum of squares (SST), treatment sum of squares (SSTr), and error sum of squares (SSE) are given by

$$\text{SST} = \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - \bar{x}_{\cdot\cdot})^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} x_{ij}^2 - \frac{1}{IJ} x_{\cdot\cdot}^2$$

$$\text{SSTr} = \sum_{i=1}^{I} \sum_{j=1}^{J} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2 = \frac{1}{J} \sum_{i=1}^{I} x_{i\cdot}^2 - \frac{1}{IJ} x_{\cdot\cdot}^2$$

$$\text{SSE} = \sum_{i=1}^{I} \sum_{j=1}^{J} (x_{ij} - \bar{x}_{i\cdot})^2 \qquad \text{where } x_{i\cdot} = \sum_{j=1}^{J} x_{ij}, \quad x_{\cdot\cdot} = \sum_{i=1}^{I} \sum_{j=1}^{J} x_{ij}$$


The sum of squares SSTr appears in the numerator of F, and SSE appears in the denominator of F; the reason for defining SST will be apparent shortly.

The expressions on the far right-hand side of SST and SSTr are convenient if ANOVA calculations will be done by hand, although the wide availability of statistical software makes this unnecessary. Both SST and SSTr involve $x_{\cdot\cdot}^2/(IJ)$ (the square of the grand total divided by IJ), which is usually called the correction factor for the mean (CF). After the correction factor is computed, SST is obtained by squaring each number in the data table, adding these squares together, and subtracting the correction factor. SSTr results from squaring each row total, summing them, dividing by J, and subtracting the correction factor. SSE is then easily obtained by virtue of the following relationship.

Fundamental Identity

$$\text{SST} = \text{SSTr} + \text{SSE} \qquad (10.1)$$

Thus if any two of the sums of squares are computed, the third can be obtained through (10.1); SST and SSTr are easiest to compute, and then $\text{SSE} = \text{SST} - \text{SSTr}$. The proof follows from squaring both sides of the relationship

$$x_{ij} - \bar{x}_{\cdot\cdot} = (x_{ij} - \bar{x}_{i\cdot}) + (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}) \qquad (10.2)$$

and summing over all i and j. This gives SST on the left and SSTr and SSE as the two extreme terms on the right. The cross-product term is easily seen to be zero.

The interpretation of the fundamental identity is an important aid to an understanding of ANOVA. SST is a measure of the total variation in the data: the sum of all squared deviations about the grand mean. The identity says that this total variation can be partitioned into two pieces. SSE measures variation that would be present (within rows) whether $H_0$ is true or false, and is thus the part of total variation that is unexplained by the status of $H_0$. SSTr is the amount of variation (between rows) that can be explained by possible differences in the $\mu_i$'s. $H_0$ is rejected if the explained variation is large relative to unexplained variation.

Once SSTr and SSE are computed, each is divided by its associated df to obtain a mean square (mean in the sense of average). Then F is the ratio of the two mean squares.

$$\text{MSTr} = \frac{\text{SSTr}}{I - 1} \qquad \text{MSE} = \frac{\text{SSE}}{I(J - 1)} \qquad F = \frac{\text{MSTr}}{\text{MSE}} \qquad (10.3)$$
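The step in the proof of the fundamental identity (10.1) that the text calls easy can be written out. The following is a sketch of that step only, using the notation already defined; squaring (10.2) and summing over i and j gives

$$\sum_{i=1}^{I}\sum_{j=1}^{J} (x_{ij} - \bar{x}_{\cdot\cdot})^2 = \sum_{i=1}^{I}\sum_{j=1}^{J} (x_{ij} - \bar{x}_{i\cdot})^2 + 2\sum_{i=1}^{I}\sum_{j=1}^{J} (x_{ij} - \bar{x}_{i\cdot})(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}) + \sum_{i=1}^{I}\sum_{j=1}^{J} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})^2$$

In the middle term the factor $\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}$ does not depend on j, so it can be pulled outside the inner sum:

$$2\sum_{i=1}^{I} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}) \sum_{j=1}^{J} (x_{ij} - \bar{x}_{i\cdot}) = 2\sum_{i=1}^{I} (\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot})\,(J\bar{x}_{i\cdot} - J\bar{x}_{i\cdot}) = 0$$

The first term on the right is SSE and the last is SSTr, which leaves $\text{SST} = \text{SSTr} + \text{SSE}$.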

The computations are often summarized in a tabular format, called an ANOVA table, as displayed in Table 10.2. Tables produced by statistical software customarily include a P-value column to the right of f.

Table 10.2 An ANOVA Table

Source of Variation   df          Sum of Squares   Mean Square               f
Treatments            I − 1       SSTr             MSTr = SSTr/(I − 1)       MSTr/MSE
Error                 I(J − 1)    SSE              MSE = SSE/[I(J − 1)]
Total                 IJ − 1      SST


Example 10.4

The accompanying data resulted from an experiment comparing the degree of soiling for fabric copolymerized with three different mixtures of methacrylic acid (similar data appeared in the article “Chemical Factors Affecting Soiling and Soil Release from Cotton DP Fabric,” American Dyestuff Reporter, 1983: 25–30).

                                                $x_{i\cdot}$    $\bar{x}_{i\cdot}$
Mixture 1    .56    1.12    .90    1.07    .94    4.59    .918
Mixture 2    .72    .69     .87    .78     .91    3.97    .794
Mixture 3    .62    1.08    1.07   .99     .93    4.69    .938
                                                $x_{\cdot\cdot} = 13.25$

Let $\mu_i$ denote the true average degree of soiling when mixture i is used ($i = 1, 2, 3$). The null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$ states that the true average degree of soiling is identical for the three mixtures. Let’s carry out a test at significance level .01 to see whether $H_0$ should be rejected in favor of the assertion that true average degree of soiling is not the same for all mixtures. Since $I - 1 = 2$ and $I(J - 1) = 12$, $H_0$ will be rejected if $f \geq F_{.01,2,12} = 6.93$. Squaring each of the 15 observations and summing gives $\sum\sum x_{ij}^2 = (.56)^2 + (1.12)^2 + \cdots + (.93)^2 = 12.1351$. The values of the three sums of squares are

$$\text{SST} = 12.1351 - (13.25)^2/15 = 12.1351 - 11.7042 = .4309$$

$$\text{SSTr} = \frac{1}{5}\left[(4.59)^2 + (3.97)^2 + (4.69)^2\right] - 11.7042 = 11.7650 - 11.7042 = .0608$$

$$\text{SSE} = .4309 - .0608 = .3701$$

The computations are summarized in the accompanying ANOVA table. Because $f = .99 < 6.93$, $H_0$ is not rejected at significance level .01. The mixtures appear to be indistinguishable with respect to degree of soiling ($F_{.10,2,12} = 2.81 \Rightarrow$ P-value $> .10$).

Source of Variation   df    Sum of Squares   Mean Square   f
Treatments            2     .0608            .0304         .99
Error                 12    .3701            .0308
Total                 14    .4309
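The hand-computation route through the correction factor can be sketched in a few lines. The data are the soiling measurements of Example 10.4; the variable names are illustrative, and small differences from the tabled values arise because the text rounds intermediate results.

```python
# Degree-of-soiling data from Example 10.4: I = 3 mixtures, J = 5 each.
soiling = [
    [0.56, 1.12, 0.90, 1.07, 0.94],
    [0.72, 0.69, 0.87, 0.78, 0.91],
    [0.62, 1.08, 1.07, 0.99, 0.93],
]

I = len(soiling)
J = len(soiling[0])

row_totals = [sum(row) for row in soiling]   # x_i.
grand_total = sum(row_totals)                # x_..

# Correction factor for the mean: (grand total)^2 / (IJ).
cf = grand_total ** 2 / (I * J)

# Computing formulas for the sums of squares.
sst = sum(x * x for row in soiling for x in row) - cf
sstr = sum(t * t for t in row_totals) / J - cf
sse = sst - sstr                             # fundamental identity (10.1)

# Mean squares and the F ratio, as in (10.3).
mstr = sstr / (I - 1)
mse = sse / (I * (J - 1))
f = mstr / mse

print(f"SST = {sst:.4f}, SSTr = {sstr:.4f}, SSE = {sse:.4f}")
print(f"MSTr = {mstr:.4f}, MSE = {mse:.4f}, f = {f:.2f}")
```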

EXERCISES Section 10.1 (1–10)

1. In an experiment to compare the tensile strengths of $I = 5$ different types of copper wire, $J = 4$ samples of each type were used. The between-samples and within-samples estimates of $\sigma^2$ were computed as MSTr = 2673.3 and MSE = 1094.2, respectively.
a. Use the F test at level .05 to test $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ versus $H_a$: at least two $\mu_i$'s are unequal.
b. What can be said about the P-value for the test?

2. Suppose that the compression-strength observations on the fourth type of box in Example 10.1 had been 655.1, 748.7, 662.4, 679.0, 706.9, and 640.0 (obtained by adding 120 to each previous $x_{4j}$). Assuming no change in the remaining observations, carry out an F test with $\alpha = .05$.

3. The lumen output was determined for each of $I = 3$ different brands of 60-watt soft-white lightbulbs, with $J = 8$ bulbs of each brand tested. The sums of squares were computed as SSE = 4773.3 and SSTr = 591.2. State the hypotheses of interest (including word definitions of parameters), and use the F test of ANOVA ($\alpha = .05$) to decide whether there are any differences in true average lumen outputs among the three brands for this type of bulb by obtaining as much information as possible about the P-value.

4. It is common practice in many countries to destroy (shred) refrigerators at the end of their useful lives. In this process material from insulating foam may be released into the atmosphere. The article “Release of Fluorocarbons from Insulation Foam in Home Appliances during Shredding” (J. of the Air and Waste Mgmt. Assoc., 2007: 1452–1460) gave the following data on foam density (g/L) for each of two refrigerators produced by four different manufacturers:

1. 30.4, 29.2
2. 27.7, 27.1
3. 27.1, 24.8
4. 25.5, 28.8

Does it appear that true average foam density is not the same for all these manufacturers? Carry out an appropriate test of hypotheses by obtaining as much P-value information as possible, and summarize your analysis in an ANOVA table.

5. Consider the following summary data on the modulus of elasticity ($\times 10^6$ psi) for lumber of three different grades [in close agreement with values in the article “Bending Strength and Stiffness of Second-Growth Douglas-Fir Dimension Lumber” (Forest Products J., 1991: 35–43), except that the sample sizes there were larger]:

Grade   J    $\bar{x}_{i\cdot}$   $s_i$
1       10   1.63                 .27
2       10   1.56                 .24
3       10   1.42                 .26

Use this data and a significance level of .01 to test the null hypothesis of no difference in mean modulus of elasticity for the three grades.

6. The article “Origin of Precambrian Iron Formations” (Econ. Geology, 1964: 1025–1057) reports the following data on total Fe for four types of iron formation (1 = carbonate, 2 = silicate, 3 = magnetite, 4 = hematite).

1: 20.5 28.1 27.8 27.0 28.0 25.2 25.3 27.1 20.5 31.3
2: 26.3 24.0 26.2 20.2 23.7 34.0 17.1 26.8 23.7 24.9
3: 29.5 34.0 27.5 29.4 27.9 26.2 29.9 29.5 30.0 35.6
4: 36.5 44.2 34.1 30.3 31.4 33.1 34.1 32.9 36.3 25.5

Carry out an analysis of variance F test at significance level .01, and summarize the results in an ANOVA table.

7. An experiment was carried out to compare electrical resistivity for six different low-permeability concrete bridge deck mixtures. There were 26 measurements on concrete cylinders for each mixture; these were obtained 28 days after casting. The entries in the accompanying ANOVA table are based on information in the article “In-Place Resistivity of Bridge Deck Concrete Mixtures” (ACI Materials J., 2009: 114–122). Fill in the remaining entries and test appropriate hypotheses.

Source     df    Sum of Squares   Mean Square   f
Mixture
Error                             13.929
Total            5664.415

8. A study of the properties of metal plate-connected trusses used for roof support (“Modeling Joints Made with Light-Gauge Metal Connector Plates,” Forest Products J., 1979: 39–44) yielded the following observations on axial-stiffness index (kips/in.) for plate lengths 4, 6, 8, 10, and 12 in:

4: 309.2 409.5 311.0 326.5 316.8 349.8 309.7
6: 402.1 347.2 361.0 404.5 331.0 348.9 381.7
8: 392.4 366.2 351.0 357.1 409.9 367.3 382.0
10: 346.7 452.9 461.4 433.1 410.6 384.2 362.6
12: 407.4 441.8 419.9 410.7 473.4 441.2 465.8

Does variation in plate length have any effect on true average axial stiffness? State and test the relevant hypotheses using analysis of variance with $\alpha = .01$. Display your results in an ANOVA table. [Hint: $\sum\sum x_{ij}^2 = 5{,}241{,}420.79$.]

9. Six samples of each of four types of cereal grain grown in a certain region were analyzed to determine thiamin content, resulting in the following data (μg/g):

Wheat    5.2 4.5 6.0 6.1 6.7 5.8
Barley   6.5 8.0 6.1 7.5 5.9 5.6
Maize    5.8 4.7 6.4 4.9 6.0 5.2
Oats     8.3 6.1 7.8 7.0 5.5 7.2

Does this data suggest that at least two of the grains differ with respect to true average thiamin content? Use a level $\alpha = .05$ test based on the P-value method.

10. In single-factor ANOVA with I treatments and J observations per treatment, let $\mu = (1/I)\sum \mu_i$.
a. Express $E(\bar{X}_{\cdot\cdot})$ in terms of $\mu$. [Hint: $\bar{X}_{\cdot\cdot} = (1/I)\sum \bar{X}_{i\cdot}$]
b. Determine $E(\bar{X}_{i\cdot}^2)$. [Hint: For any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$.]
c. Determine $E(\bar{X}_{\cdot\cdot}^2)$.
d. Determine E(SSTr) and then show that

$$E(\text{MSTr}) = \sigma^2 + \frac{J}{I - 1} \sum (\mu_i - \mu)^2$$

e. Using the result of part (d), what is E(MSTr) when $H_0$ is true? When $H_0$ is false, how does E(MSTr) compare to $\sigma^2$?


10.2 Multiple Comparisons in ANOVA

When the computed value of the F statistic in single-factor ANOVA is not significant, the analysis is terminated because no differences among the $\mu_i$'s have been identified. But when $H_0$ is rejected, the investigator will usually want to know which of the $\mu_i$'s are different from one another. A method for carrying out this further analysis is called a multiple comparisons procedure.

Several of the most frequently used procedures are based on the following central idea. First calculate a confidence interval for each pairwise difference $\mu_i - \mu_j$ with $i < j$. Thus if $I = 4$, the six required CIs would be for $\mu_1 - \mu_2$ (but not also for $\mu_2 - \mu_1$), $\mu_1 - \mu_3$, $\mu_1 - \mu_4$, $\mu_2 - \mu_3$, $\mu_2 - \mu_4$, and $\mu_3 - \mu_4$. Then if the interval for $\mu_1 - \mu_2$ does not include 0, conclude that $\mu_1$ and $\mu_2$ differ significantly from one another; if the interval does include 0, the two $\mu$'s are judged not significantly different. Following the same line of reasoning for each of the other intervals, we end up being able to judge for each pair of $\mu$'s whether or not they differ significantly from one another.

The procedures based on this idea differ in how the various CIs are calculated. Here we present a popular method that controls the simultaneous confidence level for all $I(I - 1)/2$ intervals.

Tukey’s Procedure (the T Method)

Tukey’s procedure involves the use of another probability distribution called the Studentized range distribution. The distribution depends on two parameters: a numerator df m and a denominator df $\nu$. Let $Q_{\alpha,m,\nu}$ denote the upper-tail $\alpha$ critical value of the Studentized range distribution with m numerator df and $\nu$ denominator df (analogous to $F_{\alpha,\nu_1,\nu_2}$). Values of $Q_{\alpha,m,\nu}$ are given in Appendix Table A.10.

PROPOSITION With probability $1 - \alpha$,

$$\bar{X}_{i\cdot} - \bar{X}_{j\cdot} - Q_{\alpha,I,I(J-1)} \sqrt{\text{MSE}/J} \;\leq\; \mu_i - \mu_j \;\leq\; \bar{X}_{i\cdot} - \bar{X}_{j\cdot} + Q_{\alpha,I,I(J-1)} \sqrt{\text{MSE}/J} \qquad (10.4)$$

for every i and j ($i = 1, \ldots, I$ and $j = 1, \ldots, I$) with $i < j$.

Notice that numerator df for the appropriate $Q_\alpha$ critical value is I, the number of population or treatment means being compared, and not $I - 1$ as in the F test. When the computed $\bar{x}_{i\cdot}$, $\bar{x}_{j\cdot}$ and MSE are substituted into (10.4), the result is a collection of confidence intervals with simultaneous confidence level $100(1 - \alpha)\%$ for all pairwise differences of the form $\mu_i - \mu_j$ with $i < j$. Each interval that does not include 0 yields the conclusion that the corresponding values of $\mu_i$ and $\mu_j$ differ significantly from one another.

Since we are not really interested in the lower and upper limits of the various intervals but only in which include 0 and which do not, much of the arithmetic associated with (10.4) can be avoided. The following box gives details and describes how differences can be identified visually using an “underscoring pattern.”


The T Method for Identifying Significantly Different mi’s

Select a, extract from Appendix Table A.10, and calculate

. Then list the sample means in increasing order and underline those pairs that differ by less than w. Any pair of sample means not underscored by the same line corresponds to a pair of population or treatment means that are judged significantly different.

Qa,I,I(J21) # 2MSE/J w 5Qa,I,I(J21)

Suppose, for example, that $I = 5$ and that

$$\bar{x}_{2\cdot} < \bar{x}_{5\cdot} < \bar{x}_{4\cdot} < \bar{x}_{1\cdot} < \bar{x}_{3\cdot}$$

Then

1. Consider first the smallest mean $\bar{x}_{2\cdot}$. If $\bar{x}_{5\cdot} - \bar{x}_{2\cdot} \ge w$, proceed to Step 2. However, if $\bar{x}_{5\cdot} - \bar{x}_{2\cdot} < w$, connect these first two means with a line segment. Then if possible extend this line segment even further to the right to the largest $\bar{x}_{i\cdot}$ that differs from $\bar{x}_{2\cdot}$ by less than w (so the line may connect two, three, or even more means).

2. Now move to $\bar{x}_{5\cdot}$ and again extend a line segment to the largest $\bar{x}_{i\cdot}$ to its right that differs from $\bar{x}_{5\cdot}$ by less than w (it may not be possible to draw this line, or alternatively it may underscore just two means, or three, or even all four remaining means).

3. Continue by moving to $\bar{x}_{4\cdot}$ and repeating, and then finally move to $\bar{x}_{1\cdot}$.

To summarize, starting from each mean in the ordered list, a line segment is extended as far to the right as possible as long as the difference between the means is smaller than w. It is easily verified that a particular interval of the form (10.4) will contain 0 if and only if the corresponding pair of sample means is underscored by the same line segment.

Example 10.5

An experiment was carried out to compare five different brands of automobile oil filters with respect to their ability to capture foreign material. Let $\mu_i$ denote the true average amount of material captured by brand i filters $(i = 1, \ldots, 5)$ under controlled conditions. A sample of nine filters of each brand was used, resulting in the following sample mean amounts: $\bar{x}_{1\cdot} = 14.5$, $\bar{x}_{2\cdot} = 13.8$, $\bar{x}_{3\cdot} = 13.3$, $\bar{x}_{4\cdot} = 14.3$, and $\bar{x}_{5\cdot} = 13.1$. Table 10.3 is the ANOVA table summarizing the first part of the analysis.

Table 10.3 ANOVA Table for Example 10.5

Source of Variation    df   Sum of Squares   Mean Square   f
Treatments (brands)     4       13.32           3.33       37.84
Error                  40        3.53           .088
Total                  44       16.85

Since $F_{.05,4,40} = 2.61$, $H_0$ is rejected (decisively) at level .05. We now use Tukey's procedure to look for significant differences among the $\mu_i$'s. From Appendix Table A.10, $Q_{.05,5,40} = 4.04$ (the second subscript on Q is I and not $I - 1$ as in F), so $w = 4.04\sqrt{.088/9} = .4$. After arranging the five sample means in increasing order, the two smallest can be connected by a line segment because they differ by less than .4.


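However, this segment cannot be extended further to the right since $13.8 - 13.1 = .7 \ge .4$. Moving one mean to the right, the pair $\bar{x}_{3\cdot}$ and $\bar{x}_{2\cdot}$ cannot be underscored because these means differ by more than .4. Again moving to the right, the next mean, 13.8, cannot be connected to any mean further to the right. The last two means can be underscored with the same line segment.

    $\bar{x}_{5\cdot}$   $\bar{x}_{3\cdot}$   $\bar{x}_{2\cdot}$   $\bar{x}_{4\cdot}$   $\bar{x}_{1\cdot}$
    13.1   13.3   13.8   14.3   14.5
    -----------          -----------

Thus brands 1 and 4 are not significantly different from one another, but are significantly higher than the other three brands in their true average contents. Brand 2 is significantly better than 3 and 5 but worse than 1 and 4, and brands 3 and 5 do not differ significantly.

If $\bar{x}_{2\cdot} = 14.15$ rather than 13.8 with the same computed w, then the configuration of underscored means would be

    $\bar{x}_{5\cdot}$   $\bar{x}_{3\cdot}$   $\bar{x}_{2\cdot}$   $\bar{x}_{4\cdot}$   $\bar{x}_{1\cdot}$
    13.1   13.3   14.15  14.3   14.5
    -----------   ------------------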
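The underscoring steps of the T method are mechanical enough to sketch in code. The following is a minimal illustration, not a reference implementation: the function names `tukey_w`, `underscore_segments`, and `judged_different` are invented here, and the critical value Q must still be looked up in Appendix Table A.10 and passed in by hand. It is run on the means, MSE, and J of Example 10.5.

```python
from math import sqrt


def tukey_w(q_crit, mse, j):
    """w = Q * sqrt(MSE / J); q_crit comes from a studentized range table."""
    return q_crit * sqrt(mse / j)


def underscore_segments(means, w):
    """Return the underscoring line segments as lists of treatment labels.

    `means` maps a treatment label to its sample mean.
    """
    ordered = sorted(means.items(), key=lambda item: item[1])
    segments = []
    furthest = -1  # right-most position reached from any earlier starting mean
    for start in range(len(ordered)):
        end = start
        # Extend right while the next mean differs from the starting mean by < w.
        while end + 1 < len(ordered) and ordered[end + 1][1] - ordered[start][1] < w:
            end += 1
        # Keep a segment only if it joins at least two means and is not
        # already contained in a segment drawn from an earlier start.
        if end > start and end > furthest:
            segments.append([label for label, _ in ordered[start:end + 1]])
        furthest = max(furthest, end)
    return segments


def judged_different(a, b, segments):
    """Two treatments are judged different if no single segment covers both."""
    return not any(a in seg and b in seg for seg in segments)


# Values from Example 10.5: Q_{.05,5,40} = 4.04, MSE = .088, J = 9.
w = tukey_w(q_crit=4.04, mse=0.088, j=9)
means = {1: 14.5, 2: 13.8, 3: 13.3, 4: 14.3, 5: 13.1}
segments = underscore_segments(means, w)
print(round(w, 2), segments)

# The modified scenario in which the brand 2 mean is 14.15.
modified = dict(means)
modified[2] = 14.15
print(underscore_segments(modified, w))
```

Each returned list is one underscoring line, read from the smallest mean to the largest. The first call gives one line under brands 5 and 3 and another under brands 4 and 1; the second call extends the upper line to cover brands 2, 4, and 1.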

Example 10.6

A biologist wished to study the effects of ethanol on sleep time. A sample of 20 rats, matched for age and other characteristics, was selected, and each rat was given an oral injection having a particular concentration of ethanol per body weight. The rapid eye movement (REM) sleep time for each rat was then recorded for a 24-hour period, with the following results:

Treatment (concentration of ethanol)    REM sleep times                   $x_{i\cdot}$   $\bar{x}_{i\cdot}$
0 (control)                             88.6  73.2  91.4  68.0  75.2     396.4    79.28
1 g/kg                                  63.0  53.9  69.2  50.1  71.5     307.7    61.54
2 g/kg                                  44.9  59.5  40.2  56.3  38.7     239.6    47.92
4 g/kg                                  31.0  39.6  45.3  25.2  22.7     163.8    32.76

$x_{\cdot\cdot} = 1107.5$, $\bar{x}_{\cdot\cdot} = 55.375$

Do the data indicate that the true average REM sleep time depends on the concentration of ethanol? (This example is based on an experiment reported in “Relationship of Ethanol Blood Level to REM and Non-REM Sleep Time and Distribution in the Rat,” Life Sciences, 1978: 839–846.)

The $\bar{x}_{i\cdot}$'s differ rather substantially from one another, but there is also a great deal of variability within each sample, so to answer the question precisely we must carry out the ANOVA. With $\sum\sum x_{ij}^2 = 68{,}697.6$ and correction factor $x_{\cdot\cdot}^2/(IJ) = (1107.5)^2/20 = 61{,}327.8$, the computing formulas yield

$$SST = 68{,}697.6 - 61{,}327.8 = 7369.8$$

$$SSTr = \tfrac{1}{5}\left[(396.40)^2 + (307.70)^2 + (239.60)^2 + (163.80)^2\right] - 61{,}327.8 = 67{,}210.2 - 61{,}327.8 = 5882.4$$

$$SSE = 7369.8 - 5882.4 = 1487.4$$

Table 10.4 is a SAS ANOVA table. The last column gives the P-value as .0001. Using a significance level of .05, we reject the null hypothesis $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, since P-value $= .0001 < .05 = \alpha$. True average REM sleep time does appear to depend on concentration level.
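The computing formulas used in Example 10.6 can be checked directly from the raw observations. The sketch below assumes equal sample sizes; the function name `one_way_anova` and the returned dictionary keys are illustrative choices, not the notation of any particular package.

```python
def one_way_anova(samples):
    """Balanced single-factor ANOVA via the computing formulas.

    `samples` is a list of I lists, each holding J observations.
    """
    i = len(samples)
    j = len(samples[0])
    if any(len(s) != j for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")
    grand_total = sum(sum(s) for s in samples)
    cf = grand_total ** 2 / (i * j)                        # correction factor
    sst = sum(x * x for s in samples for x in s) - cf
    sstr = sum(sum(s) ** 2 for s in samples) / j - cf
    sse = sst - sstr
    mstr = sstr / (i - 1)
    mse = sse / (i * (j - 1))
    return {"SST": sst, "SSTr": sstr, "SSE": sse,
            "MSTr": mstr, "MSE": mse, "f": mstr / mse}


# REM sleep times from Example 10.6, one list per ethanol concentration.
rem = [
    [88.6, 73.2, 91.4, 68.0, 75.2],   # 0 (control)
    [63.0, 53.9, 69.2, 50.1, 71.5],   # 1 g/kg
    [44.9, 59.5, 40.2, 56.3, 38.7],   # 2 g/kg
    [31.0, 39.6, 45.3, 25.2, 22.7],   # 4 g/kg
]
table = one_way_anova(rem)
for name, value in table.items():
    print(f"{name:5s} {value:12.4f}")
```

The printed sums of squares and the f ratio should agree, up to rounding, with the hand computations above and with the SAS output in Table 10.4.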

Table 10.4 SAS ANOVA Table

Analysis of Variance Procedure
Dependent Variable: TIME

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3      5882.35750      1960.78583      21.09     0.0001
Error              16      1487.40000        92.96250
Corrected Total    19      7369.75750

There are $I = 4$ treatments and 16 df for error, from which $Q_{.05,4,16} = 4.05$ and $w = 4.05\sqrt{93.0/5} = 17.47$. Ordering the means and underscoring yields

    $\bar{x}_{4\cdot}$    $\bar{x}_{3\cdot}$    $\bar{x}_{2\cdot}$    $\bar{x}_{1\cdot}$
    32.76   47.92   61.54   79.28
    -------------
            -------------

The interpretation of this underscoring must be done with care, since we seem to have concluded that treatments 2 and 3 do not differ, 3 and 4 do not differ, yet 2 and 4 do differ. The suggested way of expressing this is to say that although evidence allows us to conclude that treatments 2 and 4 differ from one another, neither has been shown to be significantly different from 3. Treatment 1 has a significantly higher true average REM sleep time than any of the other treatments.

Figure 10.4 shows SAS output from the application of Tukey's procedure.

Alpha = 0.05   df = 16   MSE = 92.9625
Critical Value of Studentized Range = 4.046
Minimum Significant Difference = 17.446
Means with the same letter are not significantly different.

Tukey Grouping    Mean      N    TREATMENT
     A           79.280     5    0(control)
     B           61.540     5    1 gm/kg
  C  B           47.920     5    2 gm/kg
  C              32.760     5    4 gm/kg

Figure 10.4 Tukey’s method using SAS ■

The Interpretation of $\alpha$ in Tukey’s Method

We stated previously that the simultaneous confidence level is controlled by Tukey’s method. So what does “simultaneous” mean here? Consider calculating a 95% CI for a population mean $\mu$ based on a sample from that population and then a 95% CI for a population proportion p based on another sample selected independently of the first one. Prior to obtaining data, the probability that the first interval will include $\mu$ is .95, and this is also the probability that the second interval will include p. Because the two samples are selected independently of one another, the probability that both intervals will include the values of the respective parameters is $(.95)(.95) = (.95)^2 \approx .90$. Thus the simultaneous or joint confidence level for the two intervals is roughly


90%: if pairs of intervals are calculated over and over again from independent samples, in the long run roughly 90% of the time the first interval will capture $\mu$ and the second will include p. Similarly, if three CIs are calculated based on independent samples, the simultaneous confidence level will be $100(.95)^3\% \approx 86\%$. Clearly, as the number of intervals increases, the simultaneous confidence level that all intervals capture their respective parameters will decrease.

Now suppose that we want to maintain the simultaneous confidence level at 95%. Then for two independent samples, the individual confidence level for each would have to be $100\sqrt{.95}\% \approx 97.5\%$. The larger the number of intervals, the higher the individual confidence level would have to be to maintain the 95% simultaneous level.

The tricky thing about the Tukey intervals is that they are not based on independent samples: MSE appears in every one, and various intervals share the same $\bar{x}_{i\cdot}$'s (e.g., in the case $I = 4$, three different intervals all use $\bar{x}_{1\cdot}$). This implies that there is no straightforward probability argument for ascertaining the simultaneous confidence level from the individual confidence levels. Nevertheless, it can be shown that if $Q_{.05}$ is used, the simultaneous confidence level is controlled at 95%, whereas using $Q_{.01}$ gives a simultaneous 99% level. To obtain a 95% simultaneous level, the individual level for each interval must be considerably larger than 95%. Said in a slightly different way, to obtain a 5% experimentwise or family error rate, the individual or per-comparison error rate for each interval must be considerably smaller than .05. Minitab asks the user to specify the family error rate (e.g., 5%) and then includes on output the individual error rate (see Exercise 16).
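The arithmetic for independent intervals is easy to tabulate. The short sketch below (function names are illustrative) applies only to intervals based on independent samples; as just noted, it does not give the simultaneous level of the Tukey intervals, which share MSE and sample means.

```python
def joint_level(individual, k):
    """Simultaneous level of k independent intervals, each at `individual`."""
    return individual ** k


def individual_level_needed(joint, k):
    """Individual level required so that k independent intervals reach `joint`."""
    return joint ** (1 / k)


for k in (2, 3, 6, 10):
    print(k,
          round(joint_level(0.95, k), 4),
          round(individual_level_needed(0.95, k), 4))
```

The k = 2 and k = 3 rows correspond to the roughly 90%, 86%, and 97.5% figures quoted in the text; the larger values of k show how quickly the required individual level approaches 100%.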

Confidence Intervals for Other Parametric Functions

In some situations, a CI is desired for a function of the $\mu_i$'s more complicated than a difference $\mu_i - \mu_j$. Let $\theta = \sum c_i \mu_i$, where the $c_i$'s are constants. One such function is $\tfrac{1}{2}(\mu_1 + \mu_2) - \tfrac{1}{3}(\mu_3 + \mu_4 + \mu_5)$, which in the context of Example 10.5 measures the difference between the group consisting of the first two brands and that of the last three brands. Because the $X_{ij}$'s are normally distributed with $E(X_{ij}) = \mu_i$ and $V(X_{ij}) = \sigma^2$, $\hat{\theta} = \sum c_i \bar{X}_{i\cdot}$ is normally distributed, unbiased for $\theta$, and

$$V(\hat{\theta}) = V\Big(\sum_i c_i \bar{X}_{i\cdot}\Big) = \sum_i c_i^2 V(\bar{X}_{i\cdot}) = \frac{\sigma^2}{J} \sum_i c_i^2$$

Estimating $\sigma^2$ by MSE and forming $\hat{\sigma}_{\hat{\theta}}$ results in a t variable $(\hat{\theta} - \theta)/\hat{\sigma}_{\hat{\theta}}$, which can be manipulated to obtain the following $100(1 - \alpha)\%$ confidence interval for $\sum c_i \mu_i$,

$$\sum c_i \bar{x}_{i\cdot} \pm t_{\alpha/2,\,I(J-1)} \sqrt{\frac{MSE \sum c_i^2}{J}} \qquad (10.5)$$

Example 10.7

The parametric function for comparing the first two (store) brands of oil filter with the last three (national) brands is $\theta = \tfrac{1}{2}(\mu_1 + \mu_2) - \tfrac{1}{3}(\mu_3 + \mu_4 + \mu_5)$, from which

$$\sum c_i^2 = \left(\tfrac{1}{2}\right)^2 + \left(\tfrac{1}{2}\right)^2 + \left(-\tfrac{1}{3}\right)^2 + \left(-\tfrac{1}{3}\right)^2 + \left(-\tfrac{1}{3}\right)^2 = \tfrac{5}{6}$$


With $\hat{\theta} = \tfrac{1}{2}(\bar{x}_{1\cdot} + \bar{x}_{2\cdot}) - \tfrac{1}{3}(\bar{x}_{3\cdot} + \bar{x}_{4\cdot} + \bar{x}_{5\cdot}) = .583$ and $MSE = .088$, a 95% interval is

$$.583 \pm 2.021\sqrt{5(.088)/[(6)(9)]} = .583 \pm .182 = (.401, .765)$$

Sometimes an experiment is carried out to compare each of several “new” treatments to a control treatment. In such situations, a multiple comparisons technique called Dunnett’s method is appropriate.
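The interval (10.5) used in Example 10.7 can be reproduced with a few lines of code. This is a sketch under the assumption of equal sample sizes; the name `contrast_ci` is invented for illustration, and the t critical value ($t_{.025,40} = 2.021$, as used in the example) is supplied by hand rather than computed.

```python
from math import sqrt


def contrast_ci(coeffs, means, mse, j, t_crit):
    """CI (10.5) for theta = sum(c_i * mu_i) with J observations per treatment."""
    estimate = sum(c * m for c, m in zip(coeffs, means))
    half_width = t_crit * sqrt(mse * sum(c * c for c in coeffs) / j)
    return estimate, estimate - half_width, estimate + half_width


# Example 10.7: store brands (1, 2) versus national brands (3, 4, 5).
coeffs = [1 / 2, 1 / 2, -1 / 3, -1 / 3, -1 / 3]
means = [14.5, 13.8, 13.3, 14.3, 13.1]            # sample means from Example 10.5
estimate, lower, upper = contrast_ci(coeffs, means, mse=0.088, j=9, t_crit=2.021)
print(round(estimate, 3), round(lower, 3), round(upper, 3))
```

Because the code carries more decimal places than the hand calculation, the endpoints may differ from (.401, .765) in the third decimal place.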

EXERCISES Section 10.2 (11–21)

11. An experiment to compare the spreading rates of five different brands of yellow interior latex paint available in a particular area used 4 gallons $(J = 4)$ of each paint. The sample average spreading rates (ft²/gal) for the five brands were $\bar{x}_{1\cdot} = 462.0$, $\bar{x}_{2\cdot} = 512.8$, $\bar{x}_{3\cdot} = 437.5$, $\bar{x}_{4\cdot} = 469.3$, and $\bar{x}_{5\cdot} = 532.1$. The computed value of F was found to be significant at level $\alpha = .05$. With $MSE = 272.8$, use Tukey’s procedure to investigate significant differences in the true average spreading rates between brands.

12. In Exercise 11, suppose $\bar{x}_{3\cdot} = 427.5$. Now which true average spreading rates differ significantly from one another? Be sure to use the method of underscoring to illustrate your conclusions, and write a paragraph summarizing your results.

13. Repeat Exercise 12 supposing that $\bar{x}_{2\cdot} = 502.8$ in addition to $\bar{x}_{3\cdot} = 427.5$.

14. Use Tukey’s procedure on the data in Example 10.3 to identify differences in true average bond strengths among the five protocols.

15. Exercise 10.7 described an experiment in which 26 resistivity observations were made on each of six different concrete mixtures. The article cited there gave the following sample means: 14.18, 17.94, 18.00, 18.00, 25.74, 27.67. Apply Tukey’s method with a simultaneous confidence level of 95% to identify significant differences, and describe your findings (use $MSE = 13.929$).

16. Reconsider the axial stiffness data given in Exercise 8. ANOVA output from Minitab follows:

Analysis of Variance for Stiffness
Source    DF      SS       MS       F        P
Length     4    43993    10998    10.48    0.000
Error     30    31475     1049
Total     34    75468

Level    N     Mean     StDev
 4       7    333.21    36.59
 6       7    368.06    28.57
 8       7    375.13    20.83
10       7    407.36    44.51
12       7    437.17    26.00

Pooled StDev = 32.39

Tukey’s pairwise comparisons

Family error rate = 0.0500
Individual error rate = 0.00693

Critical value = 4.10

Intervals for (column level mean) – (row level mean), shown as (lower, upper)

         4                   6                   8                   10
 6   (-85.0, 15.4)
 8   (-92.1, 8.3)       (-57.3, 43.1)
10   (-124.3, -23.9)    (-89.5, 10.9)       (-82.4, 18.0)
12   (-154.2, -53.8)    (-119.3, -18.9)     (-112.2, -11.8)     (-80.0, 20.4)

a. Is it plausible that the variances of the five axial stiffness index distributions are identical? Explain.

b. Use the output (without reference to our F table) to test the relevant hypotheses.

c. Use the Tukey intervals given in the output to determine which means differ, and construct the corresponding underscoring pattern.

17. Refer to Exercise 5. Compute a 95% t CI for $\theta = \tfrac{1}{2}(\mu_1 + \mu_2) - \mu_3$.

18. Consider the accompanying data on plant growth after the application of five different types of growth hormone.

1:   13   17    7   14
2:   21   13   20   17
3:   18   15   20   17
4:    7   11   18   10
5:    6   11   15    8

a. Perform an F test at level $\alpha = .05$.

b. What happens when Tukey’s procedure is applied?

19. Consider a single-factor ANOVA experiment in which $I = 3$, $J = 5$, $\bar{x}_{1\cdot} = 10$, $\bar{x}_{2\cdot} = 12$, and $\bar{x}_{3\cdot} = 20$. Find a value of SSE for which $f > F_{.05,2,12}$, so that $H_0$: $\mu_1 = \mu_2 = \mu_3$ is rejected, yet when Tukey’s procedure is applied none of the $\mu_i$'s can be said to differ significantly from one another.

20. Refer to Exercise 19 and suppose $\bar{x}_{1\cdot} = 10$, $\bar{x}_{2\cdot} = 15$, and $\bar{x}_{3\cdot} = 20$. Can you now find a value of SSE that produces such a contradiction between the F test and Tukey’s procedure?

21. The article “The Effect of Enzyme Inducing Agents on the Survival Times of Rats Exposed to Lethal Levels of Nitrogen Dioxide” (Toxicology and Applied Pharmacology, 1978: 169–174) reports the following data on survival times for rats exposed to nitrogen dioxide (70 ppm) via different injection regimens. There were $J = 14$ rats in each group.

Regimen                        $\bar{x}_{i\cdot}$ (min)    $s_i$
1. Control                        166       32
2. 3-Methylcholanthrene           303       53
3. Allylisopropylacetamide        266       54
4. Phenobarbital                  212       35
5. Chlorpromazine                 202       34
6. p-Aminobenzoic Acid            184       31

a. Test the null hypothesis that true average survival time does not depend on an injection regimen against the alternative that there is some dependence on an injection regimen using $\alpha = .01$.

b. Suppose that $100(1 - \alpha)\%$ CIs for k different parametric functions are computed from the same ANOVA data set. Then it is easily verified that the simultaneous confidence level is at least $100(1 - k\alpha)\%$. Compute CIs with a simultaneous confidence level of at least 98% for

$$\mu_1 - \tfrac{1}{5}(\mu_2 + \mu_3 + \mu_4 + \mu_5 + \mu_6) \quad \text{and} \quad \tfrac{1}{4}(\mu_2 + \mu_3 + \mu_4 + \mu_5) - \mu_6$$

10.3 More on Single-Factor ANOVA

We now briefly consider some additional issues relating to single-factor ANOVA. These include an alternative description of the model parameters, $\beta$ for the F test, the relationship of the test to procedures previously considered, data transformation, a random effects model, and formulas for the case of unequal sample sizes.

The ANOVA Model

The assumptions of single-factor ANOVA can be described succinctly by means of the “model equation”

$$X_{ij} = \mu_i + \epsilon_{ij}$$

where $\epsilon_{ij}$ represents a random deviation from the population or true treatment mean $\mu_i$. The $\epsilon_{ij}$'s are assumed to be independent, normally distributed rv’s (implying that the $X_{ij}$'s are also) with $E(\epsilon_{ij}) = 0$ [so that $E(X_{ij}) = \mu_i$] and $V(\epsilon_{ij}) = \sigma^2$ [from which $V(X_{ij}) = \sigma^2$ for every i and j]. An alternative description of single-factor ANOVA will give added insight and suggest appropriate generalizations to models involving more than one factor. Define a parameter $\mu$ by

$$\mu = \frac{1}{I} \sum_{i=1}^{I} \mu_i$$

and the parameters $\alpha_1, \ldots, \alpha_I$ by

$$\alpha_i = \mu_i - \mu \quad (i = 1, \ldots, I)$$

Then the treatment mean $\mu_i$ can be written as $\mu + \alpha_i$, where $\mu$ represents the true average overall response in the experiment, and $\alpha_i$ is the effect, measured as a departure from $\mu$, due to the ith treatment. Whereas we initially had I parameters, we now have $I + 1$ $(\mu, \alpha_1, \ldots, \alpha_I)$. However, because $\sum \alpha_i = 0$ (the average departure from the overall mean response is zero), only I of these new parameters are independently determined, so there are as many independent parameters as there were before. In terms of $\mu$ and the $\alpha_i$'s, the model becomes


$$X_{ij} = \mu + \alpha_i + \epsilon_{ij} \quad (i = 1, \ldots, I;\ j = 1, \ldots, J)$$

In Chapter 11, we will develop analogous models for multifactor ANOVA. The claim that the $\mu_i$'s are identical is equivalent to the equality of the $\alpha_i$'s, and because $\sum \alpha_i = 0$, the null hypothesis becomes

$$H_0: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$$

Recall that MSTr is an unbiased estimator of $\sigma^2$ when $H_0$ is true but otherwise tends to overestimate $\sigma^2$. Here is a more precise result:

$$E(MSTr) = \sigma^2 + \frac{J}{I - 1} \sum \alpha_i^2$$

When $H_0$ is true, $\sum \alpha_i^2 = 0$ so $E(MSTr) = \sigma^2$ (MSE is unbiased whether or not $H_0$ is true). If $\sum \alpha_i^2$ is used as a measure of the extent to which $H_0$ is false, then a larger value of $\sum \alpha_i^2$ will result in a greater tendency for MSTr to overestimate $\sigma^2$. In the next chapter, formulas for expected mean squares for multifactor models will be used to suggest how to form F ratios to test various hypotheses.

Proof of the Formula for E(MSTr)

For any rv Y, $E(Y^2) = V(Y) + [E(Y)]^2$, so (with $X_{i\cdot}$ and $X_{\cdot\cdot}$ denoting totals)

$$E(SSTr) = E\Big(\frac{1}{J} \sum_i X_{i\cdot}^2 - \frac{1}{IJ} X_{\cdot\cdot}^2\Big) = \frac{1}{J} \sum_i E(X_{i\cdot}^2) - \frac{1}{IJ} E(X_{\cdot\cdot}^2)$$

$$= \frac{1}{J} \sum_i \{V(X_{i\cdot}) + [E(X_{i\cdot})]^2\} - \frac{1}{IJ} \{V(X_{\cdot\cdot}) + [E(X_{\cdot\cdot})]^2\}$$

$$= \frac{1}{J} \sum_i \{J\sigma^2 + [J(\mu + \alpha_i)]^2\} - \frac{1}{IJ} [IJ\sigma^2 + (IJ\mu)^2]$$

$$= I\sigma^2 + IJ\mu^2 + 2\mu J \sum_i \alpha_i + J \sum_i \alpha_i^2 - \sigma^2 - IJ\mu^2$$

$$= (I - 1)\sigma^2 + J \sum_i \alpha_i^2 \quad (\text{since } \textstyle\sum \alpha_i = 0)$$

The result then follows from the relationship $MSTr = SSTr/(I - 1)$. ■

$\beta$ for the F Test

Consider a set of parameter values $\alpha_1, \alpha_2, \ldots, \alpha_I$ for which $H_0$ is not true. The probability of a type II error, $\beta$, is the probability that $H_0$ is not rejected when that set is the set of true values. One might think that $\beta$ would have to be determined separately for each different configuration of $\alpha_i$'s. Fortunately, since $\beta$ for the F test depends on the $\alpha_i$'s and $\sigma^2$ only through $\sum \alpha_i^2/\sigma^2$, it can be simultaneously evaluated for many different alternatives. For example, $\sum \alpha_i^2 = 4$ for each of the following sets of $\alpha_i$'s for which $H_0$ is false, so $\beta$ is identical for all three alternatives:

1. $\alpha_1 = -1,\ \alpha_2 = -1,\ \alpha_3 = 1,\ \alpha_4 = 1$

2. $\alpha_1 = -\sqrt{2},\ \alpha_2 = \sqrt{2},\ \alpha_3 = 0,\ \alpha_4 = 0$

3. $\alpha_1 = -\sqrt{3},\ \alpha_2 = \sqrt{1/3},\ \alpha_3 = \sqrt{1/3},\ \alpha_4 = \sqrt{1/3}$

The quantity $J \sum \alpha_i^2/\sigma^2$ is called the noncentrality parameter for one-way ANOVA (because when $H_0$ is false the test statistic has a noncentral F distribution with this as one of its parameters), and $\beta$ is a decreasing function of the value of this parameter. Thus, for fixed values of $\sigma^2$ and J, the null hypothesis is more likely to be rejected for alternatives far from $H_0$ (large $\sum \alpha_i^2$) than for alternatives close to $H_0$. For a fixed value of $\sum \alpha_i^2$, $\beta$ decreases as the sample size J on each treatment increases, and it increases as the variance $\sigma^2$ increases (since greater underlying variability makes it more difficult to detect any given departure from $H_0$).

Because hand computation of $\beta$ and sample size determination for the F test are quite difficult (as in the case of t tests), statisticians have constructed sets of curves from which $\beta$ can be obtained. Sets of curves for numerator df $\nu_1 = 3$ and $\nu_1 = 4$ are displayed in Figure 10.5* and Figure 10.6*, respectively. After the values of $\sigma^2$ and the $\alpha_i$'s for which $\beta$ is desired are specified, these are used to compute the value of $\phi$, where $\phi^2 = (J/I) \sum \alpha_i^2/\sigma^2$. We then enter the appropriate set of curves at the value of $\phi$ on the horizontal axis, move up to the curve associated with error df $\nu_2$, and move over to the value of power on the vertical axis. Finally, $\beta = 1 - \text{power}$.
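The quantities that drive $\beta$ are simple to compute. The sketch below (function names are illustrative) confirms that the three alternatives listed above share the same $\sum \alpha_i^2$ and evaluates $\phi$ for the configuration treated in Example 10.8 ($J = 8$, $\sigma = 1$, one treatment mean 1 unit above the other three). It does not compute power itself, which requires the noncentral F distribution or the power curves.

```python
from math import sqrt


def sum_sq(alphas):
    """Sum of squared treatment effects."""
    return sum(a * a for a in alphas)


def noncentrality(alphas, j, sigma):
    """Noncentrality parameter J * sum(alpha_i^2) / sigma^2."""
    return j * sum_sq(alphas) / sigma ** 2


def phi(alphas, j, sigma):
    """phi, where phi^2 = (J / I) * sum(alpha_i^2) / sigma^2."""
    return sqrt((j / len(alphas)) * sum_sq(alphas) / sigma ** 2)


alternatives = [
    (-1.0, -1.0, 1.0, 1.0),
    (-sqrt(2), sqrt(2), 0.0, 0.0),
    (-sqrt(3), sqrt(1 / 3), sqrt(1 / 3), sqrt(1 / 3)),
]
for alphas in alternatives:
    # Each set sums to zero and has the same sum of squares.
    print(round(sum(alphas), 10), round(sum_sq(alphas), 10))

# Configuration of Example 10.8: effects -1/4, -1/4, -1/4, 3/4.
example_10_8 = (-0.25, -0.25, -0.25, 0.75)
print(round(phi(example_10_8, j=8, sigma=1.0), 2))
```

Because $\beta$ depends on the effects only through this single number, one lookup in the power curves serves every configuration with the same $\sum \alpha_i^2/\sigma^2$.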

[Figure 10.5 Power curves for the ANOVA F test ($\nu_1 = 3$). Vertical axis: power $= 1 - \beta$, from .10 to .99. Horizontal axis: $\phi$, with separate scales for $\alpha = .05$ and $\alpha = .01$. One curve for each error df $\nu_2 = 6, 7, 8, 9, 10, 12, 15, 20, 30, 60$.]

[Figure 10.6 Power curves for the ANOVA F test ($\nu_1 = 4$). Same axes and layout as Figure 10.5.]

* From E. S. Pearson and H. O. Hartley, “Charts of the Power Function for Analysis of Variance Tests, Derived from the Non-central F Distribution,” Biometrika, vol. 38, 1951: 112, by permission of Biometrika Trustees.


Example 10.8

The effects of four different heat treatments on yield point (tons/in²) of steel ingots are to be investigated. A total of eight ingots will be cast using each treatment. Suppose the true standard deviation of yield point for any of the four treatments is $\sigma = 1$. How likely is it that $H_0$ will not be rejected at level .05 if three of the treatments have the same expected yield point and the other treatment has an expected yield point that is 1 ton/in² greater than the common value of the other three (i.e., the fourth yield is on average 1 standard deviation above those for the first three treatments)?

Suppose that $\mu_1 = \mu_2 = \mu_3$ and $\mu_4 = \mu_1 + 1$, $\mu = (\sum \mu_i)/4 = \mu_1 + \tfrac{1}{4}$. Then $\alpha_1 = \mu_1 - \mu = -\tfrac{1}{4}$, $\alpha_2 = -\tfrac{1}{4}$, $\alpha_3 = -\tfrac{1}{4}$, $\alpha_4 = \tfrac{3}{4}$, so

$$\phi^2 = \frac{8}{4}\left[\left(-\tfrac{1}{4}\right)^2 + \left(-\tfrac{1}{4}\right)^2 + \left(-\tfrac{1}{4}\right)^2 + \left(\tfrac{3}{4}\right)^2\right] = \frac{3}{2}$$

and $\phi = 1.22$. Degrees of freedom for the F test are $\nu_1 = I - 1 = 3$ and $\nu_2 = I(J - 1) = 28$, so interpolating visually between $\nu_2 = 20$ and $\nu_2 = 30$ gives power $\approx .47$ and $\beta \approx .53$. This $\beta$ is rather large, so we might decide to increase the value of J. How many ingots of each type would be required to yield $\beta \approx .05$ for the alternative under consideration? By trying different values of J, it can be verified that $J = 24$ will meet the requirement, but any smaller J will not. ■

As an alternative to the use of power curves, the SAS statistical software package has a function that calculates the cumulative area under a noncentral F curve (inputs $F_\alpha$, numerator df, denominator df, and $\phi^2$), and this area is $\beta$. Minitab does this and also something rather different. The user is asked to specify the maximum difference between $\mu_i$'s rather than the individual means. For example, we might wish to calculate the power of the test when $I = 4$, $\mu_1 = 100$, $\mu_2 = 101$, $\mu_3 = 102$, and $\mu_4 = 106$. Then the maximum difference is $106 - 100 = 6$. However, the power depends not only on this maximum difference but on the values of all the $\mu_i$'s. In this situation Minitab calculates the smallest possible value of power subject to $\mu_1 = 100$ and $\mu_4 = 106$, which occurs when the two other $\mu$'s are both halfway between 100 and 106. If this power is .85, then we can say that the power is at least .85 and $\beta$ is at most .15 when the two most extreme $\mu$'s are separated by 6 (the common sample size, $\alpha$, and $\sigma$ must also be specified). The software will also determine the necessary common sample size if maximum difference and minimum power are specified.

Relationship of the F Test to the t Test

When the number of treatments or populations is $I = 2$, all formulas and results connected with the F test still make sense, so ANOVA can be used to test $H_0$: $\mu_1 = \mu_2$ versus $H_a$: $\mu_1 \ne \mu_2$. In this case, a two-tailed, two-sample t test can also be used. In Section 9.3, we mentioned the pooled t test, which requires equal variances, as an alternative to the two-sample t procedure. It can be shown that the single-factor ANOVA F test and the two-tailed pooled t test are equivalent; for any given data set, the P-values for the two tests will be identical, so the same conclusion will be reached by either test.

The two-sample t test is more flexible than the F test when $I = 2$ for two reasons. First, it is valid without the assumption that $\sigma_1 = \sigma_2$; second, it can be used to test $H_a$: $\mu_1 > \mu_2$ (an upper-tailed t test) or $H_a$: $\mu_1 < \mu_2$ as well as $H_a$: $\mu_1 \ne \mu_2$. In the case of $I \ge 3$, there is unfortunately no general test procedure known to have good properties without assuming equal variances.
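The equivalence of the F test and the two-tailed pooled t test when $I = 2$ rests on the identity $f = t^2$, which can be checked numerically. In the sketch below the two samples are hypothetical numbers chosen only for illustration (they do not come from the text), and the function names are invented.

```python
from math import sqrt
from statistics import mean


def pooled_t(x, y):
    """Pooled two-sample t statistic for H0: mu1 = mu2."""
    m, n = len(x), len(y)
    ss_x = sum((v - mean(x)) ** 2 for v in x)
    ss_y = sum((v - mean(y)) ** 2 for v in y)
    pooled_var = (ss_x + ss_y) / (m + n - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / m + 1 / n))


def anova_f(groups):
    """Single-factor ANOVA f ratio; sample sizes may differ."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    sstr = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (sstr / (len(groups) - 1)) / (sse / (n - len(groups)))


# Hypothetical data, for illustration only.
x = [5.1, 4.8, 6.0, 5.5]
y = [6.2, 6.9, 5.8, 7.1, 6.5]

t = pooled_t(x, y)
f = anova_f([x, y])
print(round(t * t, 6), round(f, 6))
```

Since the square of a t variable with $\nu$ df has an F distribution with 1 and $\nu$ df, equal statistics imply equal P-values for the two-tailed pooled t test and the F test.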


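Unequal Sample Sizes

When the sample sizes from each population or treatment are not equal, let $J_1, J_2, \ldots, J_I$ denote the I sample sizes, and let $n = \sum_i J_i$ denote the total number of observations. The accompanying box gives ANOVA formulas and the test procedure.

$$SST = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{I} \sum_{j=1}^{J_i} X_{ij}^2 - \frac{1}{n} X_{\cdot\cdot}^2 \qquad df = n - 1$$

$$SSTr = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = \sum_{i=1}^{I} \frac{1}{J_i} X_{i\cdot}^2 - \frac{1}{n} X_{\cdot\cdot}^2 \qquad df = I - 1$$

$$SSE = \sum_{i=1}^{I} \sum_{j=1}^{J_i} (X_{ij} - \bar{X}_{i\cdot})^2 = SST - SSTr \qquad df = \sum (J_i - 1) = n - I$$

Test statistic value:

$$f = \frac{MSTr}{MSE} \quad \text{where} \quad MSTr = \frac{SSTr}{I - 1}, \quad MSE = \frac{SSE}{n - I}$$

Rejection region: $f \ge F_{\alpha,\,I-1,\,n-I}$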

Example 10.9

The article “On the Development of a New Approach for the Determination of Yield Strength in Mg-based Alloys” (Light Metal Age, Oct. 1998: 51–53) presented the following data on elastic modulus (GPa) obtained by a new ultrasonic method for specimens of a certain alloy produced using three different casting processes.

Process              Observations                                        $J_i$    $x_{i\cdot}$    $\bar{x}_{i\cdot}$
Permanent molding    45.5  45.3  45.4  44.4  44.6  43.9  44.6  44.0       8    357.7    44.71
Die casting          44.2  43.9  44.7  44.2  44.0  43.8  44.6  43.1       8    352.5    44.06
Plaster molding      46.0  45.9  44.8  46.2  45.1  45.5                   6    273.5    45.58
                                                                         22    983.7

Let $\mu_1$, $\mu_2$, and $\mu_3$ denote the true average elastic moduli for the three different processes under the given circumstances. The relevant hypotheses are $H_0$: $\mu_1 = \mu_2 = \mu_3$ versus $H_a$: at least two of the $\mu_i$'s are different. The test statistic is, of course, $F = MSTr/MSE$, based on $I - 1 = 2$ numerator df and $n - I = 22 - 3 = 19$ denominator df. Relevant quantities include

$$\sum\sum x_{ij}^2 = 43{,}998.73 \qquad CF = \frac{983.7^2}{22} = 43{,}984.80$$

$$SST = 43{,}998.73 - 43{,}984.80 = 13.93$$

$$SSTr = \frac{357.7^2}{8} + \frac{352.5^2}{8} + \frac{273.5^2}{6} - 43{,}984.80 = 7.93$$

$$SSE = 13.93 - 7.93 = 6.00$$

The remaining computations are displayed in the accompanying ANOVA table. Since $F_{.001,2,19} = 10.16 < 12.56 = f$, the P-value is smaller than .001. Thus the null hypothesis should be rejected at any reasonable significance level; there is compelling evidence for concluding that a true average elastic modulus somehow depends on which casting process is used.

Source of Variation    df    Sum of Squares    Mean Square    f
Treatments              2         7.93            3.965       12.56
Error                  19         6.00            .3158
Total                  21        13.93

There is more controversy among statisticians regarding which multiple comparisons procedure to use when sample sizes are unequal than there is in the case of equal sample sizes. The procedure that we present here is recommended in the excellent book Beyond ANOVA: Basics of Applied Statistics (see the chapter bibliography) for use when the I sample sizes $J_1, J_2, \ldots, J_I$ are reasonably close to one another (“mild imbalance”). It modifies Tukey’s method by using averages of pairs of $1/J_i$'s in place of $1/J$.

Let

$$w_{ij} = Q_{\alpha,I,n-I} \cdot \sqrt{\frac{MSE}{2}\left(\frac{1}{J_i} + \frac{1}{J_j}\right)}$$

Then the probability is approximately $1 - \alpha$ that

$$\bar{X}_{i\cdot} - \bar{X}_{j\cdot} - w_{ij} \le \mu_i - \mu_j \le \bar{X}_{i\cdot} - \bar{X}_{j\cdot} + w_{ij}$$

for every i and j ($i = 1, \ldots, I$ and $j = 1, \ldots, I$) with $i \ne j$.

The simultaneous confidence level $100(1 - \alpha)\%$ is only approximate rather than exact as it is with equal sample sizes. Underscoring can still be used, but now the $w_{ij}$ factor used to decide whether $\bar{x}_{i\cdot}$ and $\bar{x}_{j\cdot}$ can be connected will depend on $J_i$ and $J_j$.
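Both the unequal-sample-size ANOVA formulas and the modified Tukey factor $w_{ij}$ can be sketched in a few lines. The code below uses the elastic modulus data of Example 10.9; the function names are illustrative, and the critical value $Q_{.05,3,19} = 3.59$ is taken from the table lookup reported in Example 10.10 rather than computed.

```python
from math import sqrt


def anova_unequal(samples):
    """Single-factor ANOVA computing formulas allowing unequal sample sizes."""
    i = len(samples)
    n = sum(len(s) for s in samples)
    grand_total = sum(sum(s) for s in samples)
    cf = grand_total ** 2 / n                               # correction factor
    sst = sum(x * x for s in samples for x in s) - cf
    sstr = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = sst - sstr
    mstr = sstr / (i - 1)
    mse = sse / (n - i)
    return {"SST": sst, "SSTr": sstr, "SSE": sse,
            "MSTr": mstr, "MSE": mse, "f": mstr / mse}


def w_ij(q_crit, mse, j_i, j_j):
    """Modified Tukey yardstick for comparing treatments i and j."""
    return q_crit * sqrt(mse / 2 * (1 / j_i + 1 / j_j))


# Elastic modulus data (GPa) from Example 10.9.
permanent = [45.5, 45.3, 45.4, 44.4, 44.6, 43.9, 44.6, 44.0]
die = [44.2, 43.9, 44.7, 44.2, 44.0, 43.8, 44.6, 43.1]
plaster = [46.0, 45.9, 44.8, 46.2, 45.1, 45.5]

result = anova_unequal([permanent, die, plaster])
print({name: round(value, 3) for name, value in result.items()})

q = 3.59                                   # Q_{.05,3,19} from the table lookup
w12 = w_ij(q, result["MSE"], len(permanent), len(die))
w13 = w_ij(q, result["MSE"], len(permanent), len(plaster))
w23 = w_ij(q, result["MSE"], len(die), len(plaster))
print(round(w12, 3), round(w13, 3), round(w23, 3))
```

The code carries the unrounded MSE, so the $w_{ij}$ values can differ from hand calculations based on MSE = .316 in the third decimal place.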

The sample sizes for the elastic modulus data were , and . A simultaneous confidence level of approxi-

mately 95% requires , from which

Since and m2 are judged not signifi- cantly different. The accompanying underscoring scheme shows that m1 and m3 appear to differ significantly, as do m2 and m3.

Data Transformation The use of ANOVA methods can be invalidated by substantial differences in the vari- ances (which until now have been assumed equal with common value s2). It sometimes happens that , a known function of mi (so that when H0 is false, the variances are not equal). For example, if has a Poisson distributionXij

V(Xij) 5 si 2 5 g(mi)

s1 2, c, sI

2

2. Die 1. Permanent

44.06 44.71

3. Plaster

45.58

x1 # 2 x2 # 5 44.71 2 44.06 5 .65 , w12, m1

w12 5 3.59 B .316

2 a 1

8 1

1

8 b 5 .713, w13 5 .771 w23 5 .771

Q.05,3,19 5 3.59 I 5 3, n 2 I 5 19, MSE 5 .316

J1 5 8, J2 5 8, J3 5 6

xj #xi.

100(1 2 a)%

Example 10.10 (Example 10.9 continued)

10.3 More on Single-Factor ANOVA 413

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

with parameter $\lambda_i$ (approximately normal if $\lambda_i \ge 10$), then $\mu_i = \lambda_i$ and $\sigma_i^2 = \lambda_i$, so $g(\mu_i) = \mu_i$ is the known function. In such cases, one can often transform the $X_{ij}$’s to $h(X_{ij})$ so that they will have approximately equal variances (while leaving the transformed variables approximately normal), and then the F test can be used on the transformed observations. The key idea in choosing $h(\cdot)$ is that often $V[h(X_{ij})] \approx V(X_{ij}) \cdot [h'(\mu_i)]^2 = g(\mu_i) \cdot [h'(\mu_i)]^2$. We now wish to find the function $h(\cdot)$ for which $g(\mu_i) \cdot [h'(\mu_i)]^2 = c$ (a constant) for every i.

PROPOSITION
If $V(X_{ij}) = g(\mu_i)$, a known function of $\mu_i$, then a transformation $h(X_{ij})$ that “stabilizes the variance” so that $V[h(X_{ij})]$ is approximately the same for each i is given by $h(x) \propto \int [g(x)]^{-1/2}\,dx$.

In the Poisson case, $g(x) = x$, so $h(x)$ should be proportional to $\int x^{-1/2}\,dx = 2x^{1/2}$. Thus Poisson data should be transformed to $h(x_{ij}) = \sqrt{x_{ij}}$ before the analysis.

A Random Effects Model
The single-factor problems considered so far have all been assumed to be examples of a fixed effects ANOVA model. By this we mean that the chosen levels of the factor under study are the only ones considered relevant by the experimenter. The single-factor fixed effects model is

$$X_{ij} = \mu + \alpha_i + \epsilon_{ij} \qquad \sum \alpha_i = 0 \qquad (10.6)$$

where the $\epsilon_{ij}$’s are random and both $\mu$ and the $\alpha_i$’s are fixed parameters.

In some single-factor problems, the particular levels studied by the experimenter are chosen, either by design or through sampling, from a large population of levels. For example, to study the effects on task performance time of using different operators on a particular machine, a sample of five operators might be chosen from a large pool of operators. Similarly, the effect of soil pH on the yield of maize plants might be studied by using soils with four specific pH values chosen from among the many possible pH levels. When the levels used are selected at random from a larger population of possible levels, the factor is said to be random rather than fixed, and the fixed effects model (10.6) is no longer appropriate. An analogous random effects model is obtained by replacing the fixed $\alpha_i$’s in (10.6) by random variables.

$$X_{ij} = \mu + A_i + \epsilon_{ij} \quad \text{with} \quad E(A_i) = E(\epsilon_{ij}) = 0, \quad V(\epsilon_{ij}) = \sigma^2, \quad V(A_i) = \sigma_A^2 \qquad (10.7)$$

all $A_i$’s and $\epsilon_{ij}$’s normally distributed and independent of one another.

The condition $E(A_i) = 0$ in (10.7) is similar to the condition $\sum \alpha_i = 0$ in (10.6); it states that the expected or average effect of the ith level measured as a departure from $\mu$ is zero.

For the random effects model (10.7), the hypothesis of no effects due to different levels is $H_0\colon \sigma_A^2 = 0$, which says that different levels of the factor contribute nothing to variability of the response. Although the hypotheses in the single-factor

fixed and random effects models are different, they are tested in exactly the same way, by forming $F = \mathrm{MSTr}/\mathrm{MSE}$ and rejecting $H_0$ if $f \ge F_{\alpha, I-1, n-I}$. This can be justified intuitively by noting that $E(\mathrm{MSE}) = \sigma^2$ (as for fixed effects), whereas

$$E(\mathrm{MSTr}) = \sigma^2 + \frac{1}{I-1}\left(n - \frac{\sum J_i^2}{n}\right)\sigma_A^2 \qquad (10.8)$$

where $J_1, J_2, \ldots, J_I$ are the sample sizes and $n = \sum J_i$. The factor in parentheses on the right side of (10.8) is nonnegative, so again $E(\mathrm{MSTr}) = \sigma^2$ if $H_0$ is true and $E(\mathrm{MSTr}) > \sigma^2$ if $H_0$ is false.

Example 10.11

The study of nondestructive forces and stresses in materials furnishes important information for efficient engineering design. The article “Zero-Force Travel-Time Parameters for Ultrasonic Head-Waves in Railroad Rail” (Materials Evaluation, 1985: 854–858) reports on a study of travel time for a certain type of wave that results from longitudinal stress of rails used for railroad track. Three measurements were made on each of six rails randomly selected from a population of rails. The investigators used random effects ANOVA to decide whether some variation in travel time could be attributed to “between-rail variability.” The data is given in the accompanying table (each value, in nanoseconds, resulted from subtracting 36.1 $\mu$s from the original observation) along with the derived ANOVA table. The value f is highly significant, so $H_0\colon \sigma_A^2 = 0$ is rejected in favor of the conclusion that differences between rails are a source of travel-time variability.

Rail   Observations      $x_{i\cdot}$
1:     55    53    54    162
2:     26    37    32     95
3:     78    91    85    254
4:     92   100    96    288
5:     49    51    50    150
6:     80    85    83    248
                         $x_{\cdot\cdot} = 1197$

Source of Variation   df   Sum of Squares   Mean Square   f
Treatments             5   9310.5           1862.1        115.2
Error                 12    194.0             16.17
Total                 17   9504.5
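The ANOVA table of Example 10.11 can be checked numerically. The following is a minimal sketch in plain Python, not the investigators' own computation; the function name `one_way_anova` and its return layout are illustrative choices. It uses the usual computing formulas with the correction factor $x_{\cdot\cdot}^2/n$, and the same arithmetic serves the fixed and random effects models because both form $F = \mathrm{MSTr}/\mathrm{MSE}$.

```python
# Coded travel times from Example 10.11: three measurements on each of six rails.
rails = [
    [55, 53, 54],
    [26, 37, 32],
    [78, 91, 85],
    [92, 100, 96],
    [49, 51, 50],
    [80, 85, 83],
]


def one_way_anova(samples):
    """Return (SSTr, SSE, MSTr, MSE, f) for a single-factor layout.

    `samples` is a list of samples, one per factor level; the sample
    sizes J_i are allowed to differ.
    """
    num_levels = len(samples)                      # I
    n = sum(len(s) for s in samples)               # n = sum of the J_i
    grand_total = sum(sum(s) for s in samples)     # x..
    cf = grand_total ** 2 / n                      # correction factor
    sst = sum(x * x for s in samples for x in s) - cf
    sstr = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = sst - sstr
    mstr = sstr / (num_levels - 1)
    mse = sse / (n - num_levels)
    return sstr, sse, mstr, mse, mstr / mse


sstr, sse, mstr, mse, f = one_way_anova(rails)
print(f"SSTr = {sstr:.1f}, SSE = {sse:.1f}")
print(f"MSTr = {mstr:.1f}, MSE = {mse:.2f}, f = {f:.1f}")
```

The computed f is then compared with the critical value $F_{\alpha, I-1, n-I}$ from an F table, here with 5 and 12 degrees of freedom.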

EXERCISES Section 10.3 (22–34)

22. The following data refers to yield of tomatoes (kg/plot) for four different levels of salinity. Salinity level here refers to electrical conductivity (EC), where the chosen levels were EC = 1.6, 3.8, 6.0, and 10.2 nmhos/cm.

1.6: 59.5 53.3 56.8 63.1 58.7

3.8: 55.2 59.1 52.8 54.5

6.0: 51.7 48.8 53.9 49.0

10.2: 44.6 48.5 41.0 47.3 46.1

Use the F test at level $\alpha = .05$ to test for any differences in true average yield due to the different salinity levels.

23. Apply the modified Tukey’s method to the data in Exercise 22 to identify significant differences among the $\mu_i$’s.

24. The accompanying summary data on skeletal-muscle CS activity (nmol/min/mg) appeared in the article “Impact of Lifelong Sedentary Behavior on Mitochondrial Function of Mice Skeletal Muscle” (J. of Gerontology, 2009: 927–939):

               Young    Old Sedentary    Old Active
Sample size    10       8                10
Sample mean    46.68    47.71            58.24
Sample sd       7.16     5.59             8.43

Carry out a test to decide whether true average activity differs for the three groups. If appropriate, investigate differences amongst the means with a multiple comparisons method.


25. Lipids provide much of the dietary energy in the bodies of infants and young children. There is a growing interest in the quality of the dietary lipid supply during infancy as a major determinant of growth, visual and neural development, and long-term health. The article “Essential Fat Requirements of Preterm Infants” (Amer. J. of Clinical Nutrition, 2000: 245S–250S) reported the following data on total polyunsaturated fats (%) for infants who were randomized to four different feeding regimens: breast milk, corn-oil-based formula, soy-oil-based formula, or soy-and-marine-oil-based formula:

Regimen        Sample Size   Sample Mean   Sample SD
Breast milk     8            43.0          1.5
CO             13            42.4          1.3
SO             17            43.1          1.2
SMO            14            43.5          1.2

a. What assumptions must be made about the four total polyunsaturated fat distributions before carrying out a single-factor ANOVA to decide whether there are any differences in true average fat content?

b. Carry out the test suggested in part (a). What can be said about the P-value?

26. Samples of six different brands of diet/imitation margarine were analyzed to determine the level of physiologically active polyunsaturated fatty acids (PAPFUA, in percentages), resulting in the following data:

Imperial         14.1 13.6 14.4 14.3
Parkay           12.8 12.5 13.4 13.0 12.3
Blue Bonnet      13.5 13.4 14.1 14.3
Chiffon          13.2 12.7 12.6 13.9
Mazola           16.8 17.2 16.4 17.3 18.0
Fleischmann’s    18.1 17.2 18.7 18.4

(The preceding numbers are fictitious, but the sample means agree with data reported in the January 1975 issue of Consumer Reports.)

a. Use ANOVA to test for differences among the true average PAPFUA percentages for the different brands.

b. Compute CIs for all $(\mu_i - \mu_j)$’s.

c. Mazola and Fleischmann’s are corn-based, whereas the others are soybean-based. Compute a CI for

$$\frac{\mu_1 + \mu_2 + \mu_3 + \mu_4}{4} - \frac{\mu_5 + \mu_6}{2}$$

[Hint: Modify the expression for $V(\hat{\theta})$ that led to (10.5) in the previous section.]

27. Although tea is the world’s most widely consumed beverage after water, little is known about its nutritional value. Folacin is the only B vitamin present in any significant amount in tea, and recent advances in assay methods have made accurate determination of folacin content feasible. Consider the accompanying data on folacin content for randomly selected specimens of the four leading brands of green tea.

1: 7.9 6.2 6.6 8.6 8.9 10.1 9.6
2: 5.7 7.5 9.8 6.1 8.4
3: 6.8 7.5 5.0 7.4 5.3 6.1
4: 6.4 7.1 7.9 4.5 5.0 4.0

(Data is based on “Folacin Content of Tea,” J. of the Amer. Dietetic Assoc., 1983: 627–632.) Does this data suggest that true average folacin content is the same for all brands?

a. Carry out a test using $\alpha = .05$ via the P-value method.

b. Assess the plausibility of any assumptions required for your analysis in part (a).

c. Perform a multiple comparisons analysis to identify significant differences among brands.

28. For a single-factor ANOVA with sample sizes $J_i$ $(i = 1, 2, \ldots, I)$, show that

$$\mathrm{SSTr} = \sum J_i(\bar{X}_{i\cdot} - \bar{X}_{\cdot\cdot})^2 = \sum J_i \bar{X}_{i\cdot}^2 - n\bar{X}_{\cdot\cdot}^2$$

where $n = \sum J_i$.

29. When sample sizes are equal $(J_i = J)$, the parameters $\alpha_1, \alpha_2, \ldots, \alpha_I$ of the alternative parameterization are restricted by $\sum \alpha_i = 0$. For unequal sample sizes, the most natural restriction is $\sum J_i\alpha_i = 0$. Use this to show that

$$E(\mathrm{MSTr}) = \sigma^2 + \frac{1}{I-1}\sum J_i\alpha_i^2$$

What is E(MSTr) when $H_0$ is true? [This expectation is correct if $\sum J_i\alpha_i = 0$ is replaced by the restriction $\sum \alpha_i = 0$ (or any other single linear restriction on the $\alpha_i$’s used to reduce the model to I independent parameters), but $\sum J_i\alpha_i = 0$ simplifies the algebra and yields natural estimates for the model parameters (in particular, $\hat{\alpha}_i = \bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}$).]

30. Reconsider Example 10.8 involving an investigation of the effects of different heat treatments on the yield point of steel ingots.

a. If $J = 8$ and $\sigma = 1$, what is $\beta$ for a level .05 F test when $\mu_1 = \mu_2$, $\mu_3 = \mu_1 - 1$, and $\mu_4 = \mu_1 + 1$?

b. For the alternative of part (a), what value of J is necessary to obtain $\beta = .05$?

c. If there are $I = 5$ heat treatments, $J = 10$, and $\sigma = 1$, what is $\beta$ for the level .05 F test when four of the $\mu_i$’s are equal and the fifth differs by 1 from the other four?

31. When sample sizes are not equal, the noncentrality parameter is $\sum J_i\alpha_i^2/\sigma^2$ and $\phi^2 = (1/I)\sum J_i\alpha_i^2/\sigma^2$. Referring to Exercise 22, what is the power of the test when $\mu_2 = \mu_3$, $\mu_1 = \mu_2 - \sigma$, and $\mu_4 = \mu_2 + \sigma$?

32. In an experiment to compare the quality of four different brands of magnetic recording tape, five 2400-ft reels of each brand (A–D) were selected and the number of flaws in each reel was determined.

A: 10 5 12 14 8

B: 14 12 17 9 8

C: 13 18 10 15 18

D: 17 16 12 22 14

It is believed that the number of flaws has approximately a Poisson distribution for each brand. Analyze the data at level .01 to see whether the expected number of flaws per reel is the same for each brand.

33. Suppose that $X_{ij}$ is a binomial variable with parameters n and $p_i$ (so approximately normal when $np_i \ge 10$ and $nq_i \ge 10$). Then since $\mu_i = np_i$, $V(X_{ij}) = \sigma_i^2 = np_i(1 - p_i) = \mu_i(1 - \mu_i/n)$. How should the $X_{ij}$’s be transformed so as to stabilize the variance? [Hint: $g(\mu_i) = \mu_i(1 - \mu_i/n)$.]

34. Simplify E(MSTr) for the random effects model when $J_1 = J_2 = \cdots = J_I = J$.

SUPPLEMENTARY EXERCISES (35–46)

35. An experiment was carried out to compare flow rates for four different types of nozzle.

a. Sample sizes were 5, 6, 7, and 6, respectively, and calculations gave $f = 3.68$. State and test the relevant hypotheses using $\alpha = .01$.

b. Analysis of the data using a statistical computer package yielded P-value = .029. At level .01, what would you conclude, and why?

36. The article “Computer-Assisted Instruction Augmented with Planned Teacher/Student Contacts” (J. of Exp. Educ., Winter, 1980–1981: 120–126) compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion (L/D), programmed textbook instruction (R), programmed text with lectures (R/L), computer instruction (C), and computer instruction with lectures (C/L). Forty-five students were randomly assigned, 9 to each method. After completing the course, the students took a 1-hour exam. In addition, a 10-minute retention test was administered 6 weeks later. Summary quantities are given.

          Exam                           Retention Test
Method    $\bar{x}_{i\cdot}$    $s_i$    $\bar{x}_{i\cdot}$    $s_i$
L/D       29.3                  4.99     30.20                 3.82
R         28.0                  5.33     28.80                 5.26
R/L       30.2                  3.33     26.20                 4.66
C         32.4                  2.94     31.10                 4.91
C/L       34.2                  2.74     30.20                 3.53

The grand mean for the exam was 30.82, and the grand mean for the retention test was 29.30.

a. Does the data suggest that there is a difference among the five teaching methods with respect to true mean exam score? Use $\alpha = .05$.

b. Using a .05 significance level, test the null hypothesis of no difference among the true mean retention test scores for the five different teaching methods.

37. Numerous factors contribute to the smooth running of an electric motor (“Increasing Market Share Through Improved Product and Process Design: An Experimental Approach,” Quality Engineering, 1991: 361–369). In particular, it is desirable to keep motor noise and vibration to a minimum. To study the effect that the brand of bearing has on motor vibration, five different motor bearing brands were examined by installing each type of bearing on different random samples of six motors. The amount of motor vibration (measured in microns) was recorded when each of the 30 motors was running. The data for this study follows. State and test the relevant hypotheses at significance level .05, and then carry out a multiple comparisons analysis if appropriate.

                                          Mean
1: 13.1 15.0 14.0 14.4 14.0 11.6         13.68
2: 16.3 15.7 17.2 14.9 14.4 17.2         15.95
3: 13.7 13.9 12.4 13.8 14.9 13.3         13.67
4: 15.7 13.7 14.4 16.0 13.9 14.7         14.73
5: 13.5 13.4 13.2 12.7 13.4 12.3         13.08

38. An article in the British scientific journal Nature (“Sucrose Induction of Hepatic Hyperplasia in the Rat,” August 25, 1972: 461) reports on an experiment in which each of five groups consisting of six rats was put on a diet with a different carbohydrate. At the conclusion of the experiment, the DNA content of the liver of each rat was determined (mg/g liver), with the following results:

Carbohydrate    $\bar{x}_{i\cdot}$
Starch          2.58
Sucrose         2.63
Fructose        2.13
Glucose         2.41
Maltose         2.49

Assuming also that $\sum\sum x_{ij}^2 = 183.4$, does the data indicate that true average DNA content is affected by the type of carbohydrate in the diet? Construct an ANOVA table and use a .05 level of significance.

39. Referring to Exercise 38, construct a t CI for

$$\theta = \mu_1 - (\mu_2 + \mu_3 + \mu_4 + \mu_5)/4$$

which measures the difference between the average DNA content for the starch diet and the combined average for the four other diets. Does the resulting interval include zero?

40. Refer to Exercise 38. What is $\beta$ for the test when true average DNA content is identical for three of the diets and falls below this common value by 1 standard deviation ($\sigma$) for the other two diets?

41. Four laboratories (1–4) are randomly selected from a large population, and each is asked to make three determinations of the percentage of methyl alcohol in specimens of a compound taken from a single batch. Based on the accompanying data, are differences among laboratories a source of variation in the percentage of methyl alcohol? State and test the relevant hypotheses using significance level .05.

1: 85.06 85.25 84.87

2: 84.99 84.28 84.88

3: 84.48 84.72 85.10

4: 84.10 84.55 84.05

42. The critical flicker frequency (cff) is the highest frequency (in cycles/sec) at which a person can detect the flicker in a flickering light source. At frequencies above the cff, the light source appears to be continuous even though it is actually flickering. An investigation carried out to see whether true average cff depends on iris color yielded the following data (based on the article “The Effects of Iris Color on Critical Flicker Frequency,” J. of General Psych., 1973: 91–95):

                      Iris Color
                      1. Brown    2. Green    3. Blue
                      26.8        26.4        25.7
                      27.9        24.2        27.2
                      23.7        28.0        29.9
                      25.0        26.9        28.5
                      26.3        29.1        29.4
                      24.8                    28.3
                      25.7
                      24.5
$J_i$                 8           5           6
$x_{i\cdot}$          204.7       134.6       169.0
$\bar{x}_{i\cdot}$    25.59       26.92       28.17

$n = 19$, $x_{\cdot\cdot} = 508.3$

a. State and test the relevant hypotheses at significance level .05 by using the F table to obtain an upper and/or lower bound on the P-value. [Hint: $\sum\sum x_{ij}^2 = 13{,}659.67$ and $CF = 13{,}598.36$.]

b. Investigate differences between iris colors with respect to mean cff.

43. Let $c_1, c_2, \ldots, c_I$ be numbers satisfying $\sum c_i = 0$. Then $\sum c_i\mu_i = c_1\mu_1 + \cdots + c_I\mu_I$ is called a contrast in the $\mu_i$’s. Notice that with $c_1 = 1$, $c_2 = -1$, $c_3 = \cdots = c_I = 0$, $\sum c_i\mu_i = \mu_1 - \mu_2$, which implies that every pairwise difference between $\mu_i$’s is a contrast (so is, e.g., $\mu_1 - .5\mu_2 - .5\mu_3$). A method attributed to Scheffé gives simultaneous CIs with simultaneous confidence level $100(1 - \alpha)\%$ for all possible contrasts (an infinite number of them!). The interval for $\sum c_i\mu_i$ is

$$\sum c_i\bar{x}_{i\cdot} \pm \left(\sum c_i^2/J_i\right)^{1/2} \cdot \left[(I - 1) \cdot \mathrm{MSE} \cdot F_{\alpha, I-1, n-I}\right]^{1/2}$$

Using the critical flicker frequency data of Exercise 42, calculate the Scheffé intervals for the contrasts $\mu_1 - \mu_2$, $\mu_1 - \mu_3$, $\mu_2 - \mu_3$, and $.5\mu_1 + .5\mu_2 - \mu_3$ (this last contrast compares blue to the average of brown and green). Which contrasts appear to differ significantly from 0, and why?

44. Four types of mortars, namely ordinary cement mortar (OCM), polymer impregnated mortar (PIM), resin mortar (RM), and polymer cement mortar (PCM), were subjected to a compression test to measure strength (MPa). Three strength observations for each mortar type are given in the article “Polymer Mortar Composite Matrices for Maintenance-Free Highly Durable Ferrocement” (J. of Ferrocement, 1984: 337–345) and are reproduced here. Construct an ANOVA table. Using a .05 significance level, determine whether the data suggests that the true mean strength is not the same for all four mortar types. If you determine that the true mean strengths are not all equal, use Tukey’s method to identify the significant differences.

OCM     32.15    35.53    34.20
PIM    126.32   126.80   134.79
RM     117.91   115.02   114.58
PCM     29.09    30.87    29.80

45. Suppose the $x_{ij}$’s are “coded” by $y_{ij} = cx_{ij} + d$. How does the value of the F statistic computed from the $y_{ij}$’s compare to the value computed from the $x_{ij}$’s? Justify your assertion.

46. In Example 10.11, subtract $\bar{x}_{i\cdot}$ from each observation in the ith sample $(i = 1, \ldots, 6)$ to obtain a set of 18 residuals. Then construct a normal probability plot and comment on the plausibility of the normality assumption.

Bibliography

Miller, Rupert, Beyond ANOVA: The Basics of Applied Statistics, Wiley, New York, 1986. An excellent source of information about assumption checking and alternative methods of analysis.

Montgomery, Douglas, Design and Analysis of Experiments (7th ed.), Wiley, New York, 2009. A very up-to-date presentation of ANOVA models and methodology.

Neter, John, William Wasserman, and Michael Kutner, Applied Linear Statistical Models (5th ed.), Irwin, Homewood, IL, 2004. The second half of this book contains a very well-presented survey of ANOVA; the level is comparable to that of the present text, but the discussion is more comprehensive, making the book an excellent reference.

Ott, R. Lyman and Michael Longnecker, An Introduction to Statistical Methods and Data Analysis (6th ed.), Duxbury Press, Belmont, CA, 2010. Includes several chapters on ANOVA methodology that can profitably be read by students desiring a very nonmathematical exposition; there is a good chapter on various multiple comparison methods.


11 Multifactor Analysis of Variance

INTRODUCTION

In the previous chapter, we used the analysis of variance (ANOVA) to test for equality of either I different population means or the true average responses associated with I different levels of a single factor (alternatively referred to as I different treatments). In many experimental situations, there are two or more factors that are of simultaneous interest. This chapter extends the methods of Chapter 10 to investigate such multifactor situations.

In the first two sections, we concentrate on the case of two factors. We will use I to denote the number of levels of the first factor (A) and J to denote the number of levels of the second factor (B). Then there are IJ possible combinations consisting of one level of factor A and one of factor B. Each such combination is called a treatment, so there are IJ different treatments. The number of observations made on treatment (i, j) will be denoted by $K_{ij}$. In Section 11.1, we consider $K_{ij} = 1$. An important special case of this type is a randomized block design, in which a single factor A is of primary interest but another factor, “blocks,” is created to control for extraneous variability in experimental units or subjects. Section 11.2 focuses on the case $K_{ij} = K > 1$, with brief mention of the difficulties associated with unequal $K_{ij}$’s.

Section 11.3 considers experiments involving more than two factors. When the number of factors is large, an experiment consisting of at least one observation for each treatment would be expensive and time consuming. One frequently encountered situation, which we discuss in Section 11.4, is that in which there are p factors, each of which has two levels. There are then $2^p$ different treatments. We consider both the case in which observations are made on all these treatments (a complete design) and the case in which observations are made for only a selected subset of treatments (an incomplete design).

11.1 Two-Factor ANOVA with $K_{ij} = 1$

When factor A consists of I levels and factor B consists of J levels, there are IJ different combinations (pairs) of levels of the two factors, each called a treatment. With $K_{ij}$ = the number of observations on the treatment consisting of factor A at level i and factor B at level j, we restrict attention in this section to the case $K_{ij} = 1$, so that the data consists of IJ observations. Our focus is on the fixed effects model, in which the only levels of interest for the two factors are those actually represented in the experiment. Situations in which at least one factor is random are discussed briefly at the end of the section.

Example 11.1

Is it really as easy to remove marks on fabrics from erasable pens as the word erasable might imply? Consider the following data from an experiment to compare three different brands of pens and four different wash treatments with respect to their ability to remove marks on a particular type of fabric (based on “An Assessment of the Effects of Treatment, Time, and Heat on the Removal of Erasable Pen Marks from Cotton and Cotton/Polyester Blend Fabrics,” J. of Testing and Evaluation, 1991: 394–397). The response variable is a quantitative indicator of overall specimen color change; the lower this value, the more marks were removed.

                     Washing Treatment
                     1       2       3       4       Total    Average
Brand of Pen   1     .97     .48     .48     .46     2.39     .598
               2     .77     .14     .22     .25     1.38     .345
               3     .67     .39     .57     .19     1.82     .455
Total                2.41    1.01    1.27    .90     5.59
Average              .803    .337    .423    .300             .466

Is there any difference in the true average amount of color change due either to the different brands of pens or to the different washing treatments? ■

As in single-factor ANOVA, double subscripts are used to identify random variables and observed values. Let

$X_{ij}$ = the random variable (rv) denoting the measurement when factor A is held at level i and factor B is held at level j

$x_{ij}$ = the observed value of $X_{ij}$

The $x_{ij}$’s are usually presented in a rectangular table in which the various rows are identified with the levels of factor A and the various columns with the levels of factor B. In the erasable-pen experiment of Example 11.1, the number of levels of factor A is $I = 3$, the number of levels of factor B is $J = 4$, $x_{13} = .48$, $x_{22} = .14$, and so on.
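The margins of the Example 11.1 table can be recomputed directly from the twelve observations. The sketch below stores the data as a nested list with rows for brands of pen (factor A) and columns for washing treatments (factor B); the variable names are illustrative and not taken from the text.

```python
# Color-change observations x_ij from Example 11.1.
# Rows: brand of pen (factor A); columns: washing treatment (factor B).
x = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]

I = len(x)       # number of levels of factor A
J = len(x[0])    # number of levels of factor B

# Totals over each row, over each column, and over the whole table.
row_totals = [sum(row) for row in x]
col_totals = [sum(x[i][j] for i in range(I)) for j in range(J)]
grand_total = sum(row_totals)

# Averages: a row has J entries, a column has I entries,
# and the table has I * J entries (one observation per treatment).
row_means = [t / J for t in row_totals]
col_means = [t / I for t in col_totals]
grand_mean = grand_total / (I * J)

print("row totals:   ", [f"{t:.2f}" for t in row_totals])
print("column totals:", [f"{t:.2f}" for t in col_totals])
print("row means:    ", [f"{m:.4f}" for m in row_means])
print("column means: ", [f"{m:.4f}" for m in col_means])
print(f"grand mean:    {grand_mean:.4f}")
```

Python indexes from 0, so the observation written $x_{13}$ in the text is `x[0][2]` here.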

Whereas in single-factor ANOVA we were interested only in row means and the grand mean, now we are interested also in column means. Let

$$\bar{X}_{i\cdot} = \text{the average of measurements obtained when factor A is held at level } i = \frac{\sum_{j=1}^{J} X_{ij}}{J}$$

$$\bar{X}_{\cdot j} = \text{the average of measurements obtained when factor B is held at level } j = \frac{\sum_{i=1}^{I} X_{ij}}{I}$$

$$\bar{X}_{\cdot\cdot} = \text{the grand mean} = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij}}{IJ}$$

with observed values $\bar{x}_{i\cdot}$, $\bar{x}_{\cdot j}$, and $\bar{x}_{\cdot\cdot}$. Totals rather than averages are denoted by omitting the horizontal bar (so $x_{\cdot j} = \sum_i x_{ij}$, etc.). Intuitively, to see whether there is any effect due to the levels of factor A, we should compare the observed $\bar{x}_{i\cdot}$’s with one another. Information about the different levels of factor B should come from the $\bar{x}_{\cdot j}$’s.

The Fixed Effects Model
Proceeding by analogy to single-factor ANOVA, one’s first inclination in specifying a model is to let $\mu_{ij}$ = the true average response when factor A is at level i and factor B at level j. This results in IJ mean parameters. Then let

$$X_{ij} = \mu_{ij} + \epsilon_{ij}$$

where $\epsilon_{ij}$ is the random amount by which the observed value differs from its expectation. The $\epsilon_{ij}$’s are assumed normal and independent with common variance $\sigma^2$. Unfortunately, there is no valid test procedure for this choice of parameters. This is because there are $IJ + 1$ parameters (the $\mu_{ij}$’s and $\sigma^2$) but only IJ observations, so after using each $x_{ij}$ as an estimate of $\mu_{ij}$, there is no way to estimate $\sigma^2$.

The following alternative model is realistic yet involves relatively few parameters.

Assume the existence of I parameters $\alpha_1, \alpha_2, \ldots, \alpha_I$ and J parameters $\beta_1, \beta_2, \ldots, \beta_J$, such that

$$X_{ij} = \alpha_i + \beta_j + \epsilon_{ij} \quad (i = 1, \ldots, I,\ j = 1, \ldots, J) \qquad (11.1)$$

so that

$$\mu_{ij} = \alpha_i + \beta_j \qquad (11.2)$$

Including $\sigma^2$, there are now $I + J + 1$ model parameters, so if $I \ge 3$ and $J \ge 3$, then there will be fewer parameters than observations (in fact, we will shortly modify (11.2) so that even $I = 2$ and/or $J = 2$ will be accommodated).

The model specified in (11.1) and (11.2) is called an additive model because each mean response $\mu_{ij}$ is the sum of an effect due to factor A at level i ($\alpha_i$) and an effect due to factor B at level j ($\beta_j$). The difference between mean responses for factor A at level i and level $i'$ when B is held at level j is $\mu_{ij} - \mu_{i'j}$. When the model is additive,

$$\mu_{ij} - \mu_{i'j} = (\alpha_i + \beta_j) - (\alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$$

which is independent of the level j of the second factor. A similar result holds for $\mu_{ij} - \mu_{ij'}$. Thus additivity means that the difference in mean responses for two levels of one of the factors is the same for all levels of the other factor. Figure 11.1(a) shows a set of mean responses that satisfy the condition of additivity. A nonadditive configuration is illustrated in Figure 11.1(b).

[Figure 11.1 Mean responses for two types of model: (a) additive; (b) nonadditive. Axis and legend labels: mean response, levels of A, levels of B.]

Example 11.2 (Example 11.1 continued)

Plotting the observed $x_{ij}$’s in a manner analogous to that of Figure 11.1 results in Figure 11.2. Although there is some “crossing over” in the observed $x_{ij}$’s, the pattern is reasonably representative of what would be expected under additivity with just one observation per treatment.

[Figure 11.2 Plot of data from Example 11.1: color change versus washing treatment, with one line for each of Brands 1, 2, and 3.]
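The additivity property, and the informal check made in Example 11.2, can be illustrated numerically. In this sketch the parameter values in `alpha` and `beta` are hypothetical numbers chosen only for illustration (they are not estimates from any data), and `row_differences` is an illustrative helper name. The observed differences use the Example 11.1 data.

```python
# Hypothetical parameter values for an additive model (illustration only).
alpha = [1.0, 2.0]         # effects of the levels of factor A
beta = [1.0, 4.0, 2.5]     # effects of the levels of factor B

# Additive mean responses: mu_ij = alpha_i + beta_j.
mu = [[a + b for b in beta] for a in alpha]


def row_differences(table, i, i_prime):
    """Return table[i][j] - table[i_prime][j] for every column j."""
    return [table[i][j] - table[i_prime][j] for j in range(len(table[i]))]


# Under additivity the difference equals alpha_i - alpha_i' in every column.
exact = row_differences(mu, 0, 1)
print("additive model, level 1 minus level 2 of A:", exact)

# Observed color-change data from Example 11.1 (rows: brands, columns: washes).
x = [
    [0.97, 0.48, 0.48, 0.46],
    [0.77, 0.14, 0.22, 0.25],
    [0.67, 0.39, 0.57, 0.19],
]

# With one noisy observation per treatment, the differences are only
# roughly constant across washing treatments.
observed = row_differences(x, 0, 1)
print("observed, brand 1 minus brand 2:", [f"{d:.2f}" for d in observed])
```

Constant differences in the first list are what parallel lines in a plot like Figure 11.1(a) express; the second list shows how far one set of observed values departs from that ideal.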

11.1 Two-Factor ANOVA with 423Kij 5 1

Expression (11.2) is not quite the final model description because the $\alpha_i$'s and $\beta_j$'s are not uniquely determined. Here are two different configurations of the $\alpha_i$'s and $\beta_j$'s that yield the same additive $\mu_{ij}$'s:

                   $\beta_1 = 1$     $\beta_2 = 4$
  $\alpha_1 = 1$   $\mu_{11} = 2$    $\mu_{12} = 5$
  $\alpha_2 = 2$   $\mu_{21} = 3$    $\mu_{22} = 6$

                   $\beta_1 = 2$     $\beta_2 = 5$
  $\alpha_1 = 0$   $\mu_{11} = 2$    $\mu_{12} = 5$
  $\alpha_2 = 1$   $\mu_{21} = 3$    $\mu_{22} = 6$

By subtracting any constant $c$ from all $\alpha_i$'s and adding $c$ to all $\beta_j$'s, other configurations corresponding to the same additive model are obtained. This nonuniqueness is eliminated by use of the following model.

\[ X_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij} \qquad (11.3) \]

where $\sum_{i=1}^{I} \alpha_i = 0$, $\sum_{j=1}^{J} \beta_j = 0$, and the $\epsilon_{ij}$'s are assumed independent, normally distributed, with mean 0 and common variance $\sigma^2$.

This is analogous to the alternative choice of parameters for single-factor ANOVA discussed in Section 10.3. It is not difficult to verify that (11.3) is an additive model in which the parameters are uniquely determined (for example, for the $\mu_{ij}$'s mentioned previously: $\mu = 4$, $\alpha_1 = -.5$, $\alpha_2 = .5$, $\beta_1 = -1.5$, and $\beta_2 = 1.5$). Notice that there are only $I - 1$ independently determined $\alpha_i$'s and $J - 1$ independently determined $\beta_j$'s. Including $\mu$, (11.3) specifies $I + J - 1$ mean parameters.

The interpretation of the parameters in (11.3) is straightforward: $\mu$ is the true grand mean (mean response averaged over all levels of both factors), $\alpha_i$ is the effect of factor A at level $i$ (measured as a deviation from $\mu$), and $\beta_j$ is the effect of factor B at level $j$. Unbiased (and maximum likelihood) estimators for these parameters are

\[ \hat\mu = \bar X_{\cdot\cdot} \qquad \hat\alpha_i = \bar X_{i\cdot} - \bar X_{\cdot\cdot} \qquad \hat\beta_j = \bar X_{\cdot j} - \bar X_{\cdot\cdot} \]

There are two different null hypotheses of interest in a two-factor experiment with $K_{ij} = 1$. The first, denoted by $H_{0A}$, states that the different levels of factor A have no effect on true average response. The second, denoted by $H_{0B}$, asserts that there is no factor B effect.

\[ H_{0A}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0 \quad \text{versus} \quad H_{aA}: \text{at least one } \alpha_i \ne 0 \]
\[ H_{0B}: \beta_1 = \beta_2 = \cdots = \beta_J = 0 \quad \text{versus} \quad H_{aB}: \text{at least one } \beta_j \ne 0 \qquad (11.4) \]

(No factor A effect implies that all $\alpha_i$'s are equal, so they must all be 0 since they sum to 0, and similarly for the $\beta_j$'s.)

Test Procedures
The description and analysis follow closely that for single-factor ANOVA. There are now four sums of squares, each with an associated number of df:

DEFINITION

\[ SST = \sum_{i=1}^{I} \sum_{j=1}^{J} (X_{ij} - \bar X_{\cdot\cdot})^2 \qquad df = IJ - 1 \]
\[ SSA = \sum_{i=1}^{I} \sum_{j=1}^{J} (\bar X_{i\cdot} - \bar X_{\cdot\cdot})^2 = J \sum_{i=1}^{I} (\bar X_{i\cdot} - \bar X_{\cdot\cdot})^2 \qquad df = I - 1 \]
\[ SSB = \sum_{i=1}^{I} \sum_{j=1}^{J} (\bar X_{\cdot j} - \bar X_{\cdot\cdot})^2 = I \sum_{j=1}^{J} (\bar X_{\cdot j} - \bar X_{\cdot\cdot})^2 \qquad df = J - 1 \qquad (11.5) \]
\[ SSE = \sum_{i=1}^{I} \sum_{j=1}^{J} (X_{ij} - \bar X_{i\cdot} - \bar X_{\cdot j} + \bar X_{\cdot\cdot})^2 \qquad df = (I - 1)(J - 1) \]

The fundamental identity is

\[ SST = SSA + SSB + SSE \qquad (11.6) \]

There are computing formulas for SST, SSA, and SSB analogous to those given in Chapter 10 for single-factor ANOVA. But the wide availability of statistical software has rendered these formulas almost obsolete.

The expression for SSE results from replacing $\mu$, $\alpha_i$, and $\beta_j$ by their estimators in $\sum [X_{ij} - (\mu + \alpha_i + \beta_j)]^2$. Error df is $IJ$ minus the number of mean parameters estimated, that is, $IJ - [1 + (I - 1) + (J - 1)] = (I - 1)(J - 1)$. Total variation is split into a part (SSE) that is not explained by either the truth or the falsity of $H_{0A}$ or $H_{0B}$ and two parts that can be explained by possible falsity of the two null hypotheses.

Statistical theory now says that if we form F ratios as in single-factor ANOVA, when $H_{0A}$ ($H_{0B}$) is true, the corresponding F ratio has an F distribution with numerator df $= I - 1$ ($J - 1$) and denominator df $= (I - 1)(J - 1)$.

Hypotheses                     Test Statistic Value     Rejection Region
$H_{0A}$ versus $H_{aA}$       $f_A = MSA/MSE$          $f_A \ge F_{\alpha, I-1, (I-1)(J-1)}$
$H_{0B}$ versus $H_{aB}$       $f_B = MSB/MSE$          $f_B \ge F_{\alpha, J-1, (I-1)(J-1)}$

Example 11.3 (Example 11.2 continued)

The $\bar x_{i\cdot}$'s and $\bar x_{\cdot j}$'s for the color-change data are displayed along the margins of the data table given previously. Table 11.1 summarizes the calculations.

Table 11.1 ANOVA Table for Example 11.3

Source of Variation          df                     Sum of Squares    Mean Square     f
Factor A (brand)             I - 1 = 2              SSA = .1282       MSA = .0641     fA = 4.43
Factor B (wash treatment)    J - 1 = 3              SSB = .4797       MSB = .1599     fB = 11.05
Error                        (I - 1)(J - 1) = 6     SSE = .0868       MSE = .01447
Total                        IJ - 1 = 11            SST = .6947


The critical value for testing $H_{0A}$ at level of significance .05 is $F_{.05,2,6} = 5.14$. Since $4.43 < 5.14$, $H_{0A}$ cannot be rejected at significance level .05. True average color change does not appear to depend on the brand of pen. Because $F_{.05,3,6} = 4.76$ and $11.05 \ge 4.76$, $H_{0B}$ is rejected at significance level .05 in favor of the assertion that color change varies with washing treatment. A statistical computer package gives P-values of .066 and .007 for these two tests. ■

Plausibility of the normality and constant variance assumptions can be investigated graphically. Define predicted values (also called fitted values)

\[ \hat x_{ij} = \hat\mu + \hat\alpha_i + \hat\beta_j = \bar x_{\cdot\cdot} + (\bar x_{i\cdot} - \bar x_{\cdot\cdot}) + (\bar x_{\cdot j} - \bar x_{\cdot\cdot}) = \bar x_{i\cdot} + \bar x_{\cdot j} - \bar x_{\cdot\cdot} \]

and the residuals (the differences between the observations and predicted values) $x_{ij} - \hat x_{ij} = x_{ij} - \bar x_{i\cdot} - \bar x_{\cdot j} + \bar x_{\cdot\cdot}$. We can check the normality assumption with a normal probability plot of the residuals, and the constant variance assumption with a plot of the residuals against the fitted values. Figure 11.3 shows these plots for the data of Example 11.3.

Figure 11.3 Diagnostic plots from Minitab for Example 11.3: (a) normal probability plot of the residuals (percent against residual); (b) residuals versus the fitted values

The normal probability plot is reasonably straight, so there is no reason to question normality for this data set. On the plot of the residuals against the fitted values, look for substantial variation in vertical spread when moving from left to right. For example, a narrow range for small fitted values and a wide range for high fitted values would suggest that the variance is higher for larger responses (this happens often, and it can sometimes be cured by replacing each observation by its logarithm). Figure 11.3(b) shows no evidence against the constant variance assumption.
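The fitted values and residuals defined above are easy to compute directly. The sketch below is illustrative only: the data array is hypothetical (it is not the color-change data), the function name is an assumption, and `scipy.stats.probplot` is used merely to obtain the coordinates of a normal probability plot rather than to reproduce the Minitab output.

```python
import numpy as np
from scipy import stats

def fitted_and_residuals(x):
    """Fitted values xbar_i. + xbar_.j - xbar_.. and residuals for an I x J table."""
    x = np.asarray(x, dtype=float)
    row = x.mean(axis=1, keepdims=True)   # xbar_i.
    col = x.mean(axis=0, keepdims=True)   # xbar_.j
    fitted = row + col - x.mean()
    return fitted, x - fitted

# Hypothetical 3 x 4 data set with one observation per cell.
x = np.array([[10.2, 12.1, 11.4, 13.0],
              [ 9.1, 11.5, 10.2, 12.4],
              [11.0, 13.2, 12.9, 14.1]])
fitted, resid = fitted_and_residuals(x)

# The residuals sum to zero within every row and every column.
print(np.round(resid.sum(axis=0), 10), np.round(resid.sum(axis=1), 10))

# Coordinates for a normal probability plot (theoretical quantiles, ordered
# residuals) and the correlation r of the fitted straight line.
(osm, osr), (slope, intercept, r) = stats.probplot(resid.ravel(), dist="norm")
print(round(r, 3))
```

Plotting `resid.ravel()` against `fitted.ravel()` then gives the analogue of Figure 11.3(b). If the spread grows with the fitted values, the same computation can be repeated on `np.log(x)`.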

Expected Mean Squares
The plausibility of using the F tests just described is demonstrated by computing the expected mean squares. For the additive model,

\[ E(MSE) = \sigma^2 \]
\[ E(MSA) = \sigma^2 + \frac{J}{I - 1} \sum_{i=1}^{I} \alpha_i^2 \]
\[ E(MSB) = \sigma^2 + \frac{I}{J - 1} \sum_{j=1}^{J} \beta_j^2 \]
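These expected mean squares can be checked by simulation. The sketch below is a rough Monte Carlo illustration under assumed, hypothetical parameter values ($I = 3$, $J = 4$, $\sigma = 1$, and the particular $\alpha_i$'s and $\beta_j$'s shown); the function name is also an assumption. It generates many data sets from the additive model and compares the average mean squares with the three formulas.

```python
import numpy as np

def mean_squares(x):
    """x has shape (reps, I, J); return arrays MSA, MSB, MSE, one value per replicate."""
    reps, I, J = x.shape
    grand = x.mean(axis=(1, 2))            # shape (reps,)
    row = x.mean(axis=2)                   # shape (reps, I)
    col = x.mean(axis=1)                   # shape (reps, J)
    msa = J * ((row - grand[:, None]) ** 2).sum(axis=1) / (I - 1)
    msb = I * ((col - grand[:, None]) ** 2).sum(axis=1) / (J - 1)
    resid = x - row[:, :, None] - col[:, None, :] + grand[:, None, None]
    mse = (resid ** 2).sum(axis=(1, 2)) / ((I - 1) * (J - 1))
    return msa, msb, mse

rng = np.random.default_rng(0)
mu, sigma, reps = 10.0, 1.0, 20000         # hypothetical values
alpha = np.array([1.0, 0.0, -1.0])         # sums to 0
beta = np.array([0.5, -0.5, 1.0, -1.0])    # sums to 0
I, J = len(alpha), len(beta)

eps = rng.normal(0.0, sigma, size=(reps, I, J))
x = mu + alpha[None, :, None] + beta[None, None, :] + eps
msa, msb, mse = mean_squares(x)

expected_msa = sigma**2 + J / (I - 1) * (alpha**2).sum()   # 1 + 2(2)   = 5.0
expected_msb = sigma**2 + I / (J - 1) * (beta**2).sum()    # 1 + 1(2.5) = 3.5
print(msa.mean(), expected_msa)
print(msb.mean(), expected_msb)
print(mse.mean(), sigma**2)
```

Setting all entries of `alpha` to zero makes the simulated average of MSA settle near $\sigma^2$ as well, which is the situation in which the ratio MSA/MSE compares two estimators of the same quantity.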

If $H_{0A}$ is true, MSA is an unbiased estimator of $\sigma^2$, so $F_A$ is a ratio of two unbiased estimators of $\sigma^2$. When $H_{0A}$ is false, MSA tends to overestimate $\sigma^2$. Thus $H_{0A}$ should be rejected when the ratio $F_A$ is too large. Similar comments apply to MSB and $H_{0B}$.

Multiple Comparisons
After rejecting either $H_{0A}$ or $H_{0B}$, Tukey's procedure can be used to identify significant differences between the levels of the factor under investigation.

1. For comparing levels of factor A, obtain $Q_{\alpha, I, (I-1)(J-1)}$.
   For comparing levels of factor B, obtain $Q_{\alpha, J, (I-1)(J-1)}$.

2. Compute

   \[ w = Q \cdot (\text{estimated standard deviation of the sample means being compared}) = \begin{cases} Q_{\alpha, I, (I-1)(J-1)} \cdot \sqrt{MSE/J} & \text{for factor A comparisons} \\ Q_{\alpha, J, (I-1)(J-1)} \cdot \sqrt{MSE/I} & \text{for factor B comparisons} \end{cases} \]

   (because, e.g., the standard deviation of $\bar X_{i\cdot}$ is $\sigma/\sqrt{J}$).

3. Arrange the sample means in increasing order, underscore those pairs differing by less than $w$, and identify pairs not underscored by the same line as corresponding to significantly different levels of the given factor.

Example 11.4 (Example 11.3 continued)

Identification of significant differences among the four washing treatments requires $Q_{.05,4,6} = 4.90$ and $w = 4.90\sqrt{(.01447)/3} = .340$. The four factor B sample means (column averages) are now listed in increasing order, and any pair differing by less than .340 is underscored by a line segment:

   $\bar x_{\cdot 4}$    $\bar x_{\cdot 2}$    $\bar x_{\cdot 3}$    $\bar x_{\cdot 1}$
   .300     .337     .423     .803
   ----------------------

Washing treatment 1 appears to differ significantly from the other three treatments, but no other significant differences are identified. In particular, it is not apparent which among treatments 2, 3, and 4 is best at removing marks. ■

Randomized Block Experiments
In using single-factor ANOVA to test for the presence of effects due to the $I$ different treatments under study, once the $IJ$ subjects or experimental units have been chosen, treatments should be allocated in a completely random fashion. That is, $J$ subjects should be chosen at random for the first treatment, then another sample of $J$ chosen at random from the remaining $IJ - J$ subjects for the second treatment, and so on.

It frequently happens, though, that subjects or experimental units exhibit heterogeneity with respect to other characteristics that may affect the observed responses. Then, the presence or absence of a significant F value may be due to this extraneous variation rather than to the presence or absence of factor effects. This is why paired experiments were introduced in Chapter 9. The analogy to a paired experiment when $I > 2$ is called a randomized block experiment. An extraneous factor, "blocks," is constructed by dividing the $IJ$ units into $J$ groups with $I$ units in each group. This grouping or blocking should be done so that within each block, the $I$ units are homogeneous with respect to other factors thought to affect the responses. Then within each homogeneous block, the $I$ treatments are randomly assigned to the $I$ units or subjects.

Example 11.5

A consumer product-testing organization wished to compare the annual power consumption for five different brands of dehumidifier. Because power consumption depends on the prevailing humidity level, it was decided to monitor each brand at four different levels ranging from moderate to heavy humidity (thus blocking on humidity level). Within each level, brands were randomly assigned to the five selected locations. The resulting observations (annual kWh) appear in Table 11.2, and the ANOVA calculations are summarized in Table 11.3.

Table 11.2 Power Consumption Data for Example 11.5

Treatments            Blocks (humidity level)
(brands)          1        2        3        4        $x_{i\cdot}$     $\bar x_{i\cdot}$
    1            685      792      838      875       3190     797.50
    2            722      806      893      953       3374     843.50
    3            733      802      880      941       3356     839.00
    4            811      888      952     1005       3656     914.00
    5            828      920      978     1023       3749     937.25
$x_{\cdot j}$          3779     4208     4541     4797     17,325
$\bar x_{\cdot j}$        755.80   841.60   908.20   959.40               866.25

Table 11.3 ANOVA Table for Example 11.5

Source of Variation      df    Sum of Squares    Mean Square    f
Treatments (brands)       4       53,231.00       13,307.75     fA = 95.57
Blocks                    3      116,217.75       38,739.25     fB = 278.20
Error                    12        1671.00          139.25
Total                    19      171,119.75

Since $F_{.05,4,12} = 3.26$ and $f_A = 95.57 \ge 3.26$, $H_0$ is rejected in favor of $H_a$. Power consumption appears to depend on the brand of dehumidifier. To identify significantly different brands, we use Tukey's procedure. $Q_{.05,5,12} = 4.51$ and $w = 4.51\sqrt{139.25/4} = 26.6$.

   $\bar x_{1\cdot}$      $\bar x_{3\cdot}$      $\bar x_{2\cdot}$      $\bar x_{4\cdot}$      $\bar x_{5\cdot}$
   797.50    839.00    843.50    914.00    937.25
             ----------------    ----------------

The underscoring indicates that the brands can be divided into three groups with respect to power consumption.

Because the block factor is of secondary interest, $F_{.05,3,12}$ is not needed, though the computed value of $F_B$ is clearly highly significant. Figure 11.4 shows SAS output for this data. At the top of the ANOVA table, the sums of squares (SSs) for treatments (brands) and blocks (humidity levels) are combined into a single "model" SS.
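The entries of Table 11.3, and the combined "model" line that SAS reports, can be reproduced from the Table 11.2 observations with a few lines of array arithmetic. This is a sketch using NumPy and SciPy, not the computation the authors or SAS actually performed; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Table 11.2: rows are brands (treatments), columns are humidity levels (blocks).
x = np.array([[685, 792, 838,  875],
              [722, 806, 893,  953],
              [733, 802, 880,  941],
              [811, 888, 952, 1005],
              [828, 920, 978, 1023]], dtype=float)
I, J = x.shape
grand = x.mean()

# Sums of squares from the definitions (11.5) and the identity (11.6).
ssa = J * ((x.mean(axis=1) - grand) ** 2).sum()   # treatments
ssb = I * ((x.mean(axis=0) - grand) ** 2).sum()   # blocks
sst = ((x - grand) ** 2).sum()
sse = sst - ssa - ssb

df_a, df_b, df_e = I - 1, J - 1, (I - 1) * (J - 1)
mse = sse / df_e
f_a = (ssa / df_a) / mse
f_b = (ssb / df_b) / mse
crit_a = stats.f.ppf(0.95, df_a, df_e)            # F_{.05,4,12}

# The "model" line pools treatments and blocks.
ss_model = ssa + ssb
f_model = (ss_model / (df_a + df_b)) / mse

print(ssa, ssb, sse, sst)
print(round(f_a, 2), round(f_b, 2), round(crit_a, 2), round(f_model, 2))
```

The printed F ratios should agree with Table 11.3 and with the Model row of the SAS output (95.57, 278.20, and 173.84), and `crit_a` with the tabled value 3.26.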

                      Analysis of Variance Procedure

    Dependent Variable: POWERUSE
                                   Sum of            Mean
    Source              DF        Squares          Square    F Value    Pr > F
    Model                7     169448.750       24206.964     173.84    0.0001
    Error               12       1671.000         139.250
    Corrected Total     19     171119.750

        R-Square        C.V.      Root MSE     POWERUSE Mean
        0.990235    1.362242       11.8004         866.25000

    Source              DF       Anova SS     Mean Square    F Value    PR > F
    BRAND                4      53231.000       13307.750      95.57    0.0001
    HUMIDITY             3     116217.750       38739.250     278.20    0.0001

    Alpha = 0.05   df = 12   MSE = 139.25
    Critical Value of Studentized Range = 4.508
    Minimum Significant Difference = 26.597
    Means with the same letter are not significantly different.

    Tukey Grouping         Mean     N    BRAND
                 A      937.250     4    5
                 A
                 A      914.000     4    4
                 B      843.500     4    2
                 B
                 B      839.000     4    3
                 C      797.500     4    1

Figure 11.4 SAS output for power consumption data ■

In many experimental situations in which treatments are to be applied to subjects, a single subject can receive all $I$ of the treatments. Blocking is then often done on the subjects themselves to control for variability between subjects; each subject is then said to act as its own control. Social scientists sometimes refer to such experiments as repeated-measures designs. The "units" within a block are then the different "instances" of treatment application. Similarly, blocks are often taken as different time periods, locations, or observers.

Example 11.6

How does string tension in tennis rackets affect the speed of the ball coming off the racket? The article "Elite Tennis Player Sensitivity to Changes in String Tension and the Effect on Resulting Ball Dynamics" (Sports Engr., 2008: 31–36) described an experiment in which four different string tensions (N) were used, and balls projected from a machine were hit by 18 different players. The rebound speed (km/h) was then determined for each tension-player combination. Consider the following data in Table 11.4 from a similar experiment involving just six players (the resulting ANOVA is in good agreement with what was reported in the article).

Table 11.4 Rebound Speed Data for Example 11.6

                                 Player
Tension       1        2        3        4        5        6       $\bar x_{i\cdot}$
  210       105.7    116.6    106.6    113.9    119.4    123.5    114.28
  235       113.3    119.9    120.5    119.3    122.5    124.0    119.92
  260       117.2    124.4    122.3    120.0    115.1    127.9    121.15
  285       110.0    106.8    110.0    115.3    122.6    128.3    115.50
$\bar x_{\cdot j}$      111.55   116.93   114.85   117.13   119.90   125.93

The ANOVA calculations are summarized in Table 11.5. The P-value for testing to see whether true average rebound speed depends on string tension is .049. Thus $H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = 0$ is barely rejected at significance level .05 in favor of the conclusion that true average speed does vary with tension ($F_{.05,3,15} = 3.29$). Application of Tukey's procedure to identify significant differences among tensions requires $Q_{.05,4,15} = 4.08$. Then $w = 7.464$. The difference between the largest and smallest of the four tension sample means is 6.87. So although the F test is significant, Tukey's method does not identify any significant differences. This occasionally happens when the null hypothesis is just barely rejected. The configuration of sample means in the cited article is similar to ours. The authors commented that the results were contrary to previous laboratory-based tests, where higher rebound speeds are typically associated with low string tension. ■

Table 11.5 ANOVA Table for Example 11.6

Source      df       SS          MS         f        P
Tension      3     199.975     66.6582     3.32    0.049
Player       5     477.464     95.4928     4.76    0.008
Error       15     301.188     20.0792
Total       23     978.626
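The Tukey computation in Example 11.6 can be sketched as follows. The sample means and MSE are taken from Tables 11.4 and 11.5; the use of `scipy.stats.studentized_range` (available in SciPy 1.7 and later) in place of a printed table of $Q$ critical values is an assumption of this sketch, so the critical value may differ from the tabled 4.08 in the third decimal place.

```python
import math
from itertools import combinations
from scipy.stats import studentized_range

# Sample mean rebound speed for each string tension (Table 11.4).
means = {210: 114.28, 235: 119.92, 260: 121.15, 285: 115.50}
mse = 20.0792        # error mean square from Table 11.5
n_per_mean = 6       # each tension mean averages J = 6 players
k = len(means)       # I = 4 means being compared
df_error = 15        # (I - 1)(J - 1)

q = studentized_range.ppf(0.95, k, df_error)   # Q_{.05,4,15}
w = q * math.sqrt(mse / n_per_mean)

# A pair is declared significantly different when its means differ by at least w.
significant = [(a, b) for a, b in combinations(means, 2)
               if abs(means[a] - means[b]) >= w]

print(round(q, 2), round(w, 2))
print(significant)   # [] : no pair differs by as much as w
```

The largest difference, $121.15 - 114.28 = 6.87$, falls short of $w$, which is why the list of significant pairs is empty even though the F test rejects at level .05.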

In most randomized block experiments in which subjects serve as blocks, the subjects actually participating in the experiment are selected from a large population. The subjects then contribute random rather than fixed effects. This does not affect the procedure for comparing treatments when $K_{ij} = 1$ (one observation per "cell," as in this section), but the procedure is altered if $K_{ij} = K > 1$. We will shortly consider two-factor models in which effects are random.

More on Blocking
When $I = 2$, either the F test or the paired differences t test can be used to analyze the data. The resulting conclusion will not depend on which procedure is used, since $T^2 = F$ and $t^2_{\alpha/2,\nu} = F_{\alpha,1,\nu}$.

Just as with pairing, blocking entails both a potential gain and a potential loss in precision. If there is a great deal of heterogeneity in experimental units, the value of the variance parameter $\sigma^2$ in the one-way model will be large. The effect of blocking is to filter out the variation represented by $\sigma^2$ in the two-way model appropriate for a randomized block experiment. Other things being equal, a smaller value of $\sigma^2$ results in a test that is more likely to detect departures from $H_0$ (i.e., a test with greater power).

However, other things are not equal here, since the single-factor F test is based on $I(J - 1)$ degrees of freedom (df) for error, whereas the two-factor F test is based on $(I - 1)(J - 1)$ df for error. Fewer error df results in a decrease in power, essentially because the denominator estimator of $\sigma^2$ is not as precise. This loss in df can be especially serious if the experimenter can afford only a small number of observations. Nevertheless, if it appears that blocking will significantly reduce variability, the sacrifice of error df is sensible.
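The equivalence $T^2 = F$ for $I = 2$ can be illustrated numerically. The data below are hypothetical (two treatments measured on each of six blocks), and the comparison relies on SciPy's `ttest_rel` for the paired t test; this is a sketch of the identity, not a derivation of it.

```python
import numpy as np
from scipy import stats

# Hypothetical data: I = 2 treatments (rows) observed on J = 6 blocks (columns).
x = np.array([[12.1, 14.3, 11.8, 15.2, 13.4, 12.9],
              [13.0, 15.1, 12.9, 15.8, 14.9, 13.2]])
I, J = x.shape
grand = x.mean()

# Randomized block F statistic for the treatment factor.
ssa = J * ((x.mean(axis=1) - grand) ** 2).sum()
resid = x - x.mean(axis=1, keepdims=True) - x.mean(axis=0, keepdims=True) + grand
df_error = (I - 1) * (J - 1)
mse = (resid ** 2).sum() / df_error
f_a = (ssa / (I - 1)) / mse
p_f = stats.f.sf(f_a, I - 1, df_error)

# Paired t test on the same data (two-sided).
t_stat, p_t = stats.ttest_rel(x[0], x[1])

print(t_stat ** 2, f_a)   # the two values agree up to rounding error
print(p_t, p_f)           # so do the P-values

# Error df with and without blocking for this layout.
print(I * (J - 1), df_error)   # 10 5
```

The last line shows the price of blocking in this small layout: the error df drops from $I(J - 1) = 10$ to $(I - 1)(J - 1) = 5$.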


Models with Random and Mixed Effects
In many experiments, the actual levels of a factor used in the experiment, rather than being the only ones of interest to the experimenter, have been selected from a much larger population of possible levels of the factor. If this is true for both factors in a two-factor experiment, a random effects model is appropriate. The case in which the levels of one factor are the only ones of interest and the levels of the other factor are selected from a population of levels leads to a mixed effects model. The two-factor random effects model when $K_{ij} = 1$ is

\[ X_{ij} = \mu + A_i + B_j + \epsilon_{ij} \qquad (i = 1, \ldots, I, \; j = 1, \ldots, J) \]

The $A_i$'s, $B_j$'s, and $\epsilon_{ij}$'s are all independent, normally distributed rv's with mean 0 and variances $\sigma_A^2$, $\sigma_B^2$, and $\sigma^2$, respectively. The hypotheses of interest are then $H_{0A}: \sigma_A^2 = 0$ (level of factor A does not contribute to variation in the response) versus $H_{aA}: \sigma_A^2 > 0$ and $H_{0B}: \sigma_B^2 = 0$ versus $H_{aB}: \sigma_B^2 > 0$. Whereas $E(MSE) = \sigma^2$ as before, the expected mean squares for factors A and B are now

\[ E(MSA) = \sigma^2 + J\sigma_A^2 \qquad E(MSB) = \sigma^2 + I\sigma_B^2 \]

Thus when $H_{0A}$ ($H_{0B}$) is true, $F_A$ ($F_B$) is still a ratio of two unbiased estimators of $\sigma^2$. It can be shown that a level $\alpha$ test for $H_{0A}$ versus $H_{aA}$ still rejects $H_{0A}$ if $f_A \ge F_{\alpha, I-1, (I-1)(J-1)}$, and, similarly, the same procedure as before is used to decide between $H_{0B}$ and $H_{aB}$.

If factor A is fixed and factor B is random, the mixed model is

\[ X_{ij} = \mu + \alpha_i + B_j + \epsilon_{ij} \qquad (i = 1, \ldots, I, \; j = 1, \ldots, J) \]

where $\sum \alpha_i = 0$ and the $B_j$'s and $\epsilon_{ij}$'s are normally distributed with mean 0 and variances $\sigma_B^2$ and $\sigma^2$, respectively. Now the two null hypotheses are

\[ H_{0A}: \alpha_1 = \cdots = \alpha_I = 0 \quad \text{and} \quad H_{0B}: \sigma_B^2 = 0 \]

with expected mean squares

\[ E(MSE) = \sigma^2 \qquad E(MSA) = \sigma^2 + \frac{J}{I - 1} \sum \alpha_i^2 \qquad E(MSB) = \sigma^2 + I\sigma_B^2 \]

The test procedures for $H_{0A}$ versus $H_{aA}$ and $H_{0B}$ versus $H_{aB}$ are exactly as before. For example, in the analysis of the color-change data in Example 11.1, if the four wash treatments were randomly selected, then because $f_B = 11.05$ and $F_{.05,3,6} = 4.76$, $H_{0B}: \sigma_B^2 = 0$ is rejected in favor of $H_{aB}: \sigma_B^2 > 0$. An estimate of the "variance component" $\sigma_B^2$ is then given by $(MSB - MSE)/I = .0485$.

Summarizing, when $K_{ij} = 1$, although the hypotheses and expected mean squares differ from the case of both effects fixed, the test procedures are identical.
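The variance component estimate quoted above follows from equating MSB and MSE to their expected values, $E(MSB) = \sigma^2 + I\sigma_B^2$ and $E(MSE) = \sigma^2$, and solving for $\sigma_B^2$. A minimal sketch, with a hypothetical function name and the mean squares taken from Table 11.1:

```python
def variance_component_b(msb, mse, n_levels_a):
    """Estimate sigma_B^2 by solving E(MSB) = sigma^2 + I * sigma_B^2 with MSE for sigma^2."""
    return (msb - mse) / n_levels_a

# Color-change data: MSB = .1599, MSE = .01447, I = 3 brands.
estimate = variance_component_b(msb=0.1599, mse=0.01447, n_levels_a=3)
print(round(estimate, 4))   # 0.0485
```

Because MSB can fall below MSE by chance, this estimate can come out negative; a common convention, not discussed in the text, is to report 0 in that case.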

EXERCISES Section 11.1 (1–15)

1. The number of miles of useful tread wear (in 1000s) was determined for tires of each of five different makes of subcompact car (factor A, with $I = 5$) in combination with each of four different brands of radial tires (factor B, with $J = 4$), resulting in $IJ = 20$ observations. The values $SSA = 30.6$, $SSB = 44.1$, and $SSE = 59.2$ were then computed. Assume that an additive model is appropriate.

a. Test $H_0: \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = 0$ (no differences in true average tire lifetime due to makes of cars) versus $H_a$: at least one $\alpha_i \ne 0$ using a level .05 test.

b. Test $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ (no differences in true average tire lifetime due to brands of tires) versus $H_a$: at least one $\beta_j \ne 0$ using a level .05 test.


2. Four different coatings are being considered for corrosion protection of metal pipe. The pipe will be buried in three different types of soil. To investigate whether the amount of corrosion depends either on the coating or on the type of soil, 12 pieces of pipe are selected. Each piece is coated with one of the four coatings and buried in one of the three types of soil for a fixed time, after which the amount of corrosion (depth of maximum pits, in .0001 in.) is determined. The data appears in the table.

                      Soil Type (B)
                     1      2      3
               1    64     49     50
Coating (A)    2    53     51     48
               3    47     45     50
               4    51     43     52

a. Assuming the validity of the additive model, carry out the ANOVA analysis using an ANOVA table to see whether the amount of corrosion depends on either the type of coating used or the type of soil. Use $\alpha = .05$.

b. Compute $\hat\mu, \hat\alpha_1, \hat\alpha_2, \hat\alpha_3, \hat\alpha_4, \hat\beta_1, \hat\beta_2$, and $\hat\beta_3$.

3. The article "Adiabatic Humidification of Air with Water in a Packed Tower" (Chem. Eng. Prog., 1952: 362–370) reports data on gas film heat transfer coefficient (Btu/hr ft$^2$ on °F) as a function of gas rate (factor A) and liquid rate (factor B).

                             B
                1(190)   2(250)   3(300)   4(400)
     1(200)      200      226      240      261
A    2(400)      278      312      330      381
     3(700)      369      416      462      517
     4(1100)     500      575      645      733

a. After constructing an ANOVA table, test at level .01 both the hypothesis of no gas-rate effect against the appropriate alternative and the hypothesis of no liquid-rate effect against the appropriate alternative.

b. Use Tukey's procedure to investigate differences in expected heat transfer coefficient due to different gas rates.

c. Repeat part (b) for liquid rates.

4. In an experiment to see whether the amount of coverage of light-blue interior latex paint depends either on the brand of paint or on the brand of roller used, one gallon of each of four brands of paint was applied using each of three brands of roller, resulting in the following data (number of square feet covered).

                     Roller Brand
                    1      2      3
              1    454    446    451
Paint Brand   2    446    444    447
              3    439    442    444
              4    444    437    443

a. Construct the ANOVA table. [Hint: The computations can be expedited by subtracting 400 (or any other convenient number) from each observation. This does not affect the final results.]

b. State and test hypotheses appropriate for deciding whether paint brand has any effect on coverage. Use $\alpha = .05$.

c. Repeat part (b) for brand of roller.

d. Use Tukey's method to identify significant differences among brands. Is there one brand that seems clearly preferable to the others?

5. In an experiment to assess the effect of the angle of pull on the force required to cause separation in electrical connectors, four different angles (factor A) were used, and each of a sample of five connectors (factor B) was pulled once at each angle ("A Mixed Model Factorial Experiment in Testing Electrical Connectors," Industrial Quality Control, 1960: 12–16). The data appears in the accompanying table.

                          B
            1       2       3       4       5
     0°    45.3    42.2    39.6    36.8    45.8
A    2°    44.1    44.1    38.4    38.0    47.2
     4°    42.7    42.7    42.6    42.2    48.9
     6°    43.5    45.8    47.9    37.9    56.4

Does the data suggest that true average separation force is affected by the angle of pull? State and test the appropriate hypotheses at level .01 by first constructing an ANOVA table ($SST = 396.13$, $SSA = 58.16$, and $SSB = 246.97$).

6. A particular county employs three assessors who are responsible for determining the value of residential property in the county. To see whether these assessors differ systematically in their assessments, 5 houses are selected, and each assessor is asked to determine the market value of each house. With factor A denoting assessors ($I = 3$) and factor B denoting houses ($J = 5$), suppose $SSA = 11.7$, $SSB = 113.5$, and $SSE = 25.6$.

a. Test $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$ at level .05. ($H_0$ states that there are no systematic differences among assessors.)

b. Explain why a randomized block experiment with only 5 houses was used rather than a one-way ANOVA experiment involving a total of 15 different houses, with each assessor asked to assess 5 different houses (a different group of 5 for each assessor).

7. The article "Rate of Stuttering Adaptation Under Two Electro-Shock Conditions" (Behavior Research Therapy, 1967: 49–54) gives adaptation scores for three different treatments: (1) no shock, (2) shock following each stuttered word, and (3) shock during each moment of stuttering. These treatments were used on each of 18 stutterers, resulting in $SST = 3476.00$, $SSTr = 28.78$, and $SSBl = 2977.67$.

a. Construct the ANOVA table and test at level .05 to see whether the true average adaptation score depends on the treatment given.


b. Judging from the F ratio for subjects (factor B), do you think that blocking on subjects was effective in this experiment? Explain.

8. The paper "Exercise Thermoregulation and Hyperprolactinaemia" (Ergonomics, 2005: 1547–1557) discussed how various aspects of exercise capacity might depend on the temperature of the environment. The accompanying data on body mass loss (kg) after exercising on a semi-recumbent cycle ergometer in three different ambient temperatures (6°C, 18°C, and 30°C) was provided by the paper's authors.

                Cold    Neutral    Hot
           1     .4       1.2      1.6
           2     .4       1.5      1.9
           3    1.4        .8      1.0
           4     .2        .4       .7
Subject    5    1.1       1.8      2.4
           6    1.2       1.0      1.6
           7     .7       1.0      1.4
           8     .7       1.5      1.3
           9     .8        .8      1.1

a. Does temperature affect true average body mass loss? Carry out a test using a significance level of .01 (as did the authors of the cited paper).

b. Investigate significant differences among the temperatures.

c. The residuals are .20, .30, -.40, -.07, .30, .00, .03, -.20, -.14, .13, .23, -.27, -.04, .03, -.27, -.04, .33, -.10, -.33, -.53, .67, .11, -.33, .27, .01, -.13, .24. Use these as a basis for investigating the plausibility of the assumptions that underlie your analysis in (a).

9. The article "The Effects of a Pneumatic Stool and a One-Legged Stool on Lower Limb Joint Load and Muscular Activity During Sitting and Rising" (Ergonomics, 1993: 519–535) gives the accompanying data on the effort required of a subject to arise from four different types of stools (Borg scale). Perform an analysis of variance using $\alpha = .05$, and follow this with a multiple comparisons analysis if appropriate.

                                  Subject
                  1    2    3    4    5    6    7    8    9     $\bar x_{i\cdot}$
             1   12   10    7    7    8    9    8    7    9     8.56
Type of      2   15   14   14   11   11   11   12   11   13    12.44
Stool        3   12   13   13   10    8   11   12    8   10    10.78
             4   10   12    9    9    7   10   11    7    8     9.22

10. The strength of concrete used in commercial construction tends to vary from one batch to another. Consequently, small test cylinders of concrete sampled from a batch are "cured" for periods up to about 28 days in temperature- and moisture-controlled environments before strength measurements are made. Concrete is then "bought and sold on the basis of strength test cylinders" (ASTM C 31 Standard Test Method for Making and Curing Concrete Test Specimens in the Field). The accompanying data resulted from an experiment carried out to compare three different curing methods with respect to compressive strength (MPa). Analyze this data.

Batch    Method A    Method B    Method C
  1        30.7        33.7        30.5
  2        29.1        30.6        32.6
  3        30.0        32.2        30.5
  4        31.9        34.6        33.5
  5        30.5        33.0        32.4
  6        26.9        29.3        27.8
  7        28.2        28.4        30.7
  8        32.4        32.4        33.6
  9        26.6        29.5        29.2
 10        28.6        29.4        33.2

11. For the data of Example 11.5, check the plausibility of assumptions by constructing a normal probability plot of the residuals and a plot of the residuals versus the predicted values, and comment on what you learn.

12. Suppose that in the experiment described in Exercise 6 the five houses had actually been selected at random from among those of a certain age and size, so that factor B is random rather than fixed. Test $H_0: \sigma_B^2 = 0$ versus $H_a: \sigma_B^2 > 0$ using a level .01 test.

13. a. Show that a constant $d$ can be added to (or subtracted from) each $x_{ij}$ without affecting any of the ANOVA sums of squares.

b. Suppose that each $x_{ij}$ is multiplied by a nonzero constant $c$. How does this affect the ANOVA sums of squares? How does this affect the values of the F statistics $F_A$ and $F_B$? What effect does "coding" the data by $y_{ij} = cx_{ij} + d$ have on the conclusions resulting from the ANOVA procedures?

14. Use the fact that $E(X_{ij}) = \mu + \alpha_i + \beta_j$ with $\sum \alpha_i = \sum \beta_j = 0$ to show that $E(\bar X_{i\cdot} - \bar X_{\cdot\cdot}) = \alpha_i$, so that $\hat\alpha_i = \bar X_{i\cdot} - \bar X_{\cdot\cdot}$ is an unbiased estimator for $\alpha_i$.

15. The power curves of Figures 10.5 and 10.6 can be used to obtain $\beta = P(\text{type II error})$ for the F test in two-factor ANOVA. For fixed values of $\alpha_1, \alpha_2, \ldots, \alpha_I$, the quantity $\phi^2 = (J/I)\sum \alpha_i^2/\sigma^2$ is computed. Then the figure corresponding to $\nu_1 = I - 1$ is entered on the horizontal axis at the value $\phi$, the power is read on the vertical axis from the curve labeled $\nu_2 = (I - 1)(J - 1)$, and $\beta = 1 - \text{power}$.

a. For the corrosion experiment described in Exercise 2, find $\beta$ when $\alpha_1 = 4$, $\alpha_2 = 0$, $\alpha_3 = \alpha_4 = -2$, and $\sigma = 4$. Repeat for $\alpha_1 = 6$, $\alpha_2 = 0$, $\alpha_3 = \alpha_4 = -3$, and $\sigma = 4$.

b. By symmetry, what is $\beta$ for the test of $H_{0B}$ versus $H_{aB}$ in Example 11.1 when $\beta_1 = .3$, $\beta_2 = \beta_3 = \beta_4 = -.1$, and $\sigma = .3$?


11.2 Two-Factor ANOVA with $K_{ij} > 1$

In Section 11.1, we analyzed data from a two-factor experiment in which there was one observation for each of the IJ combinations of factor levels. The $\mu_{ij}$'s were assumed to have an additive structure with $\mu_{ij} = \mu + \alpha_i + \beta_j$, $\sum\alpha_i = \sum\beta_j = 0$. Additivity means that the difference in true average responses for any two levels of one of the factors is the same for each level of the other factor. For example,

$$\mu_{ij} - \mu_{i'j} = (\mu + \alpha_i + \beta_j) - (\mu + \alpha_{i'} + \beta_j) = \alpha_i - \alpha_{i'}$$

independent of the level j of the second factor. This is shown in Figure 11.1(a), in which the lines connecting true average responses are parallel.

Figure 11.1(b) depicts a set of true average responses that does not have additive structure. The lines connecting these are not parallel, which means that the difference in true average responses for different levels of one factor does depend on the level of the other factor. When additivity does not hold, we say that there is interaction between the different levels of the factors. The assumption of additivity in Section 11.1 allowed us to obtain an estimator of the random error variance $\sigma^2$ (MSE) that was unbiased whether or not either null hypothesis of interest was true. When $K_{ij} > 1$ for at least one (i, j) pair, a valid estimator of $\sigma^2$ can be obtained without assuming additivity. Our focus here will be on the case $K_{ij} = K > 1$, so the number of observations per “cell” (for each combination of levels) is constant.

Fixed Effects Parameters and Hypotheses

Rather than use the $\mu_{ij}$'s themselves as model parameters, it is customary to use an equivalent set that reveals more clearly the role of interaction.

NOTATION

$$\mu = \frac{1}{IJ}\sum_i\sum_j \mu_{ij} \qquad \mu_{i\cdot} = \frac{1}{J}\sum_j \mu_{ij} \qquad \mu_{\cdot j} = \frac{1}{I}\sum_i \mu_{ij} \tag{11.7}$$

Thus $\mu$ is the expected response averaged over all levels of both factors (the true grand mean), $\mu_{i\cdot}$ is the expected response averaged over levels of the second factor when the first factor A is held at level i, and similarly for $\mu_{\cdot j}$.

DEFINITION

$$\begin{aligned}
\alpha_i &= \mu_{i\cdot} - \mu = \text{the effect of factor A at level } i \\
\beta_j &= \mu_{\cdot j} - \mu = \text{the effect of factor B at level } j \\
\gamma_{ij} &= \mu_{ij} - (\mu + \alpha_i + \beta_j) = \text{interaction between factor A at level } i \text{ and factor B at level } j
\end{aligned} \tag{11.8}$$

from which

$$\mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij} \tag{11.9}$$

The model is additive if and only if all $\gamma_{ij}$'s $= 0$. The $\gamma_{ij}$'s are referred to as the interaction parameters. The $\alpha_i$'s are called the main effects for factor A, and the $\beta_j$'s are the main effects for factor B. Although there are I $\alpha_i$'s, J $\beta_j$'s, and IJ $\gamma_{ij}$'s in addition to $\mu$, the conditions $\sum\alpha_i = 0$, $\sum\beta_j = 0$, $\sum_j\gamma_{ij} = 0$ for any i, and $\sum_i\gamma_{ij} = 0$ for any j [all by virtue of (11.7) and (11.8)] imply that only IJ of these new parameters are independently determined: $\mu$, $I - 1$ of the $\alpha_i$'s, $J - 1$ of the $\beta_j$'s, and $(I - 1)(J - 1)$ of the $\gamma_{ij}$'s.
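The definitions in (11.7) and (11.8) can be traced numerically. The sketch below is illustrative only: the 2 × 3 grid of true cell means is hypothetical (not taken from the text), and the function name `decompose` is an assumption made for this example. It computes the grand mean, main effects, and interaction parameters and lets the sum-to-zero conditions be checked directly.

```python
# Hypothetical true cell means mu_ij for I = 2 levels of A and J = 3 levels of B.
mu_cells = [
    [10.0, 12.0, 17.0],
    [14.0, 13.0, 12.0],
]

def decompose(mu_cells):
    """Return (mu, alpha, beta, gamma) following (11.7) and (11.8)."""
    I, J = len(mu_cells), len(mu_cells[0])
    mu = sum(sum(row) for row in mu_cells) / (I * J)            # grand mean
    mu_i = [sum(row) / J for row in mu_cells]                   # mu_i. (row averages)
    mu_j = [sum(mu_cells[i][j] for i in range(I)) / I for j in range(J)]  # mu_.j
    alpha = [m - mu for m in mu_i]                              # main effects of A
    beta = [m - mu for m in mu_j]                               # main effects of B
    gamma = [[mu_cells[i][j] - (mu + alpha[i] + beta[j]) for j in range(J)]
             for i in range(I)]                                 # interactions
    return mu, alpha, beta, gamma

mu, alpha, beta, gamma = decompose(mu_cells)
print("mu    =", mu)
print("alpha =", alpha)
print("beta  =", beta)
print("gamma =", gamma)
```

In this made-up grid both row averages coincide, so the main effects of A are zero even though the interaction parameters are not. That is the kind of situation in which main effects alone give a misleading picture.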


There are now three sets of hypotheses to be considered:

$H_{0AB}: \gamma_{ij} = 0$ for all i, j   versus   $H_{aAB}$: at least one $\gamma_{ij} \ne 0$
$H_{0A}: \alpha_1 = \cdots = \alpha_I = 0$   versus   $H_{aA}$: at least one $\alpha_i \ne 0$
$H_{0B}: \beta_1 = \cdots = \beta_J = 0$   versus   $H_{aB}$: at least one $\beta_j \ne 0$

The no-interaction hypothesis $H_{0AB}$ is usually tested first. If $H_{0AB}$ is not rejected, then the other two hypotheses can be tested to see whether the main effects are significant. If $H_{0AB}$ is rejected and $H_{0A}$ is then tested and not rejected, the resulting model $\mu_{ij} = \mu + \beta_j + \gamma_{ij}$ does not lend itself to straightforward interpretation. In such a case, it is best to construct a picture similar to that of Figure 11.1(b) to try to visualize the way in which the factors interact.

The Model and Test Procedures

We now use triple subscripts for both random variables and observed values, with $X_{ijk}$ and $x_{ijk}$ referring to the kth observation (replication) when factor A is at level i and factor B is at level j.

The fixed effects model is

$$X_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} \qquad i = 1, \ldots, I,\ j = 1, \ldots, J,\ k = 1, \ldots, K \tag{11.10}$$

where the $\epsilon_{ijk}$'s are independent and normally distributed, each with mean 0 and variance $\sigma^2$.

Again, a dot in place of a subscript denotes summation over all values of that subscript, and a horizontal bar indicates averaging. Thus $X_{ij\cdot}$ is the total of all K observations made for factor A at level i and factor B at level j [all observations in the (i, j)th cell], and $\bar{X}_{ij\cdot}$ is the average of these K observations. Test procedures are based on the following sums of squares:

DEFINITION

$$\begin{aligned}
SST &= \sum_i\sum_j\sum_k (X_{ijk} - \bar{X}_{\cdot\cdot\cdot})^2 & df &= IJK - 1 \\
SSE &= \sum_i\sum_j\sum_k (X_{ijk} - \bar{X}_{ij\cdot})^2 & df &= IJ(K - 1) \\
SSA &= \sum_i\sum_j\sum_k (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 & df &= I - 1 \\
SSB &= \sum_i\sum_j\sum_k (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 & df &= J - 1 \\
SSAB &= \sum_i\sum_j\sum_k (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot})^2 & df &= (I - 1)(J - 1)
\end{aligned}$$

The fundamental identity is

$$SST = SSA + SSB + SSAB + SSE$$

SSAB is referred to as interaction sum of squares.

Total variation is thus partitioned into four pieces: unexplained (SSE, which would be present whether or not any of the three null hypotheses was true) and three pieces that may be attributed to the truth or falsity of the three $H_0$'s. Each of four mean squares is defined by $MS = SS/df$. The expected mean squares suggest that each set of hypotheses should be tested using the appropriate ratio of mean squares with MSE in the denominator:

$$\begin{aligned}
E(MSE) &= \sigma^2 \\
E(MSA) &= \sigma^2 + \frac{JK}{I - 1}\sum_{i=1}^{I}\alpha_i^2 \\
E(MSB) &= \sigma^2 + \frac{IK}{J - 1}\sum_{j=1}^{J}\beta_j^2 \\
E(MSAB) &= \sigma^2 + \frac{K}{(I - 1)(J - 1)}\sum_{i=1}^{I}\sum_{j=1}^{J}\gamma_{ij}^2
\end{aligned}$$

Each of the three mean square ratios can be shown to have an F distribution when the associated $H_0$ is true, which yields the following level $\alpha$ test procedures.

Hypotheses                      Test Statistic Value      Rejection Region
$H_{0A}$ versus $H_{aA}$        $f_A = MSA/MSE$           $f_A \ge F_{\alpha, I-1, IJ(K-1)}$
$H_{0B}$ versus $H_{aB}$        $f_B = MSB/MSE$           $f_B \ge F_{\alpha, J-1, IJ(K-1)}$
$H_{0AB}$ versus $H_{aAB}$      $f_{AB} = MSAB/MSE$       $f_{AB} \ge F_{\alpha, (I-1)(J-1), IJ(K-1)}$
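The fundamental identity is stated without proof. The following worked step, a sketch using lowercase $x$ for observed values, indicates where it comes from. Each deviation from the grand mean splits into four pieces that match the four sums of squares:

```latex
\begin{align*}
x_{ijk} - \bar{x}_{\cdot\cdot\cdot}
  &= (\bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot})
   + (\bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot\cdot})
   + (\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} + \bar{x}_{\cdot\cdot\cdot})
   + (x_{ijk} - \bar{x}_{ij\cdot}) \\[4pt]
\sum_i\sum_j\sum_k (x_{ijk} - \bar{x}_{\cdot\cdot\cdot})^2
  &= SSA + SSB + SSAB + SSE
  \qquad \text{(all cross-product sums vanish)} \\[4pt]
(I-1) + (J-1) + (I-1)(J-1) + IJ(K-1) &= IJK - 1
\end{align*}
```

The first line is an algebraic identity (the right-hand side telescopes). Squaring and summing over i, j, and k gives the second line, because each cross-product term sums to zero when the number of observations per cell is constant. The third line confirms that the degrees of freedom add up in the same way as the sums of squares.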

Example 11.7

Lightweight aggregate asphalt mix has been found to have lower thermal conductivity than a conventional mix, which is desirable. The article “Influence of Selected Mix Design Factors on the Thermal Behavior of Lightweight Aggregate Asphalt Mixes” (J. of Testing and Eval., 2008: 1–8) reported on an experiment in which various thermal properties of mixes were determined. Three different binder grades were used in combination with three different coarse aggregate contents (%), with two observations made for each such combination, resulting in the conductivity data (W/m·°K) that appears in Table 11.6.

Table 11.6 Conductivity Data for Example 11.7

                              Coarse Aggregate Content (%)
Asphalt Binder Grade      38            41            44           $\bar{x}_{i\cdot\cdot}$
PG58                  .835, .845    .822, .826    .785, .795       .8180
PG64                  .855, .865    .832, .836    .790, .800       .8297
PG70                  .815, .825    .800, .820    .770, .790       .8033
$\bar{x}_{\cdot j\cdot}$                 .8400         .8227         .7883

Here $I = J = 3$ and $K = 2$ for a total of $IJK = 18$ observations. The results of the analysis are summarized in the ANOVA table which appears as Table 11.7 (a table with additional information appeared in the cited paper).


[Figure 11.5: two interaction plots of mean response against aggregate content (Agg Cont: 38, 41, 44), with one set of line segments for each asphalt grade (Asph Gr: PG58, PG64, PG70).]

Figure 11.5 Interaction Plots for the Asphalt Data of Example 11.7. (a) Response variable is conductivity. (b) Response variable is diffusivity

Table 11.7 ANOVA Table for Example 11.7

Source         DF      SS          MS          f       P
AsphGr          2   .0020893   .0010447    14.12   0.002
AggCont         2   .0082973   .0041487    56.06   0.000
Interaction     4   .0003253   .0000813     1.10   0.414
Error           9   .0006660   .0000740
Total          17   .0113780
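The entries of Table 11.7 can be reproduced from the data in Table 11.6 with a few lines of standard-library Python. This is a minimal sketch, not the software used in the cited paper: the function name `two_way_anova` and the nested-list layout are assumptions made for illustration, and the critical values 4.26 and 2.69 are simply those quoted in the example rather than computed.

```python
# Conductivity data from Table 11.6: rows are binder grades (PG58, PG64, PG70),
# columns are coarse aggregate contents (38, 41, 44), K = 2 observations per cell.
data = [
    [[.835, .845], [.822, .826], [.785, .795]],
    [[.855, .865], [.832, .836], [.790, .800]],
    [[.815, .825], [.800, .820], [.770, .790]],
]

def two_way_anova(x):
    """x[i][j] holds the K observations for level i of A and level j of B."""
    I, J, K = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[i][j]) / K for j in range(J)] for i in range(I)]   # cell means
    row = [sum(cell[i]) / J for i in range(I)]                        # factor A means
    col = [sum(cell[i][j] for i in range(I)) / I for j in range(J)]   # factor B means
    grand = sum(row) / I
    ss = {
        "A": J * K * sum((r - grand) ** 2 for r in row),
        "B": I * K * sum((c - grand) ** 2 for c in col),
        "AB": K * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                      for i in range(I) for j in range(J)),
        "Error": sum((obs - cell[i][j]) ** 2
                     for i in range(I) for j in range(J) for obs in x[i][j]),
    }
    df = {"A": I - 1, "B": J - 1, "AB": (I - 1) * (J - 1), "Error": I * J * (K - 1)}
    ms = {s: ss[s] / df[s] for s in ss}
    f = {s: ms[s] / ms["Error"] for s in ("A", "B", "AB")}   # fixed effects: MSE denominator
    return df, ss, ms, f

df, ss, ms, f = two_way_anova(data)
for source in ("A", "B", "AB", "Error"):
    ratio = f"{f[source]:.2f}" if source in f else ""
    print(f"{source:<6}{df[source]:>3}{ss[source]:>12.7f}{ms[source]:>12.7f}{ratio:>8}")

# Critical values quoted in Example 11.7 (not computed here).
F_05_2_9, F_10_4_9 = 4.26, 2.69
interaction_significant = f["AB"] >= F_10_4_9
main_effects_significant = f["A"] >= F_05_2_9 and f["B"] >= F_05_2_9
```

Here factor A is binder grade and factor B is aggregate content, so the rows labeled A, B, and AB correspond to AsphGr, AggCont, and Interaction in the table.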

The P-value for testing for the presence of interaction effects is .414, which is clearly larger than any reasonable significance level. Alternatively, $f_{AB} = 1.10 < F_{.10,4,9} = 2.69$, so the interaction null hypothesis cannot be rejected even at the largest significance level that would be used in practice. Thus it appears that there is no interaction between the two factors. However, both main effects are significant at the 5% significance level ($.002 \le .05$ and $.000 \le .05$; alternatively both corresponding F ratios greatly exceed $F_{.05,2,9} = 4.26$). So it appears that true average conductivity depends on which grade is used and also on the level of coarse-aggregate content.

Figure 11.5(a) shows an interaction plot for the conductivity data. Notice the nearly parallel sets of line segments for the three different asphalt grades, in agreement with the F test that shows no significant interaction effects. True average conductivity appears to decrease as aggregate content increases. Figure 11.5(b) shows an interaction plot for the response variable thermal diffusivity, values of which appear in the cited article. The bottom two sets of line segments are close to parallel, but differ markedly from those for PG64; in fact, the F ratio for interaction effects is highly significant here.

Plausibility of the normality and constant variance assumptions can be assessed by constructing plots similar to those of Section 11.1. Define the predicted (i.e., fitted) values to be the cell means: $\hat{x}_{ijk} = \bar{x}_{ij\cdot}$. For example, the predicted value for grade PG58 and aggregate content 38 is $\hat{x}_{11k} = (.835 + .845)/2 = .840$ for $k = 1, 2$. The residuals are the differences between the observations and corresponding predicted values: $x_{ijk} - \bar{x}_{ij\cdot}$. A normal probability plot of the residuals is shown in Figure 11.6(a).


The pattern is sufficiently linear that there should be no concern about lack of normality. The plot of residuals against predicted values in Figure 11.6(b) shows a bit less spread on the right than on the left, but not enough of a differential to be worrisome; constant variance seems to be a reasonable assumption.

[Figure 11.6: (a) normal probability plot of the residuals (Percent versus Residual); (b) plot of the residuals against the fitted values (Residual versus Fitted Value).]

Figure 11.6 Plots for Checking Normality and Constant Variance Assumptions in Example 11.7 ■
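The fitted values and residuals behind plots like those in Figure 11.6 are simple to compute. The sketch below uses only the Python standard library and stops short of drawing anything. The plotting positions $(i - .5)/n$ used for the normal scores are one common convention and are an assumption here; the software that produced Figure 11.6 may use a slightly different one.

```python
from statistics import NormalDist

# Conductivity data from Table 11.6 (grades PG58, PG64, PG70 by contents 38, 41, 44).
data = [
    [[.835, .845], [.822, .826], [.785, .795]],
    [[.855, .865], [.832, .836], [.790, .800]],
    [[.815, .825], [.800, .820], [.770, .790]],
]

# Fitted value for every observation in a cell is that cell's sample mean.
fitted = [[sum(cell) / len(cell) for cell in row] for row in data]

# Residual = observation minus fitted value.
residuals = [[[obs - fitted[i][j] for obs in cell] for j, cell in enumerate(row)]
             for i, row in enumerate(data)]

# Pairs (fitted value, residual) for a residuals-versus-fitted plot.
fit_vs_resid = [(fitted[i][j], r)
                for i in range(3) for j in range(3) for r in residuals[i][j]]

# Pairs (normal score, ordered residual) for a normal probability plot,
# using the assumed plotting positions (i - .5)/n.
ordered = sorted(r for _, r in fit_vs_resid)
n = len(ordered)
normal_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
prob_plot_points = list(zip(normal_scores, ordered))

sse = sum(r ** 2 for r in ordered)   # equals the error sum of squares
print(f"fitted value for PG58 at content 38: {fitted[0][0]:.3f}")
print(f"sum of squared residuals: {sse:.6f}")
```

If the points in `prob_plot_points` fall close to a straight line when plotted, the normality assumption is plausible; a roughly even vertical spread in `fit_vs_resid` supports constant variance.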

Multiple Comparisons

When the no-interaction hypothesis $H_{0AB}$ is not rejected and at least one of the two main effect null hypotheses is rejected, Tukey’s method can be used to identify significant differences in levels. For identifying differences among the $\alpha_i$’s when $H_{0A}$ is rejected,

1. Obtain $Q_{\alpha, I, IJ(K-1)}$, where the second subscript I identifies the number of levels being compared and the third subscript refers to the number of degrees of freedom for error.

2. Compute $w = Q\sqrt{MSE/(JK)}$, where JK is the number of observations averaged to obtain each of the $\bar{x}_{i\cdot\cdot}$’s compared in Step 3.

3. Order the $\bar{x}_{i\cdot\cdot}$’s from smallest to largest and, as before, underscore all pairs that differ by less than w. Pairs not underscored correspond to significantly different levels of factor A.

To identify different levels of factor B when $H_{0B}$ is rejected, replace the second subscript in Q by J, replace JK by IK in w, and replace $\bar{x}_{i\cdot\cdot}$ by $\bar{x}_{\cdot j\cdot}$.

Example 11.8 (Example 11.7 continued)

$I = J = 3$ for both factor A (grade) and factor B (aggregate content). With $\alpha = .05$ and error $df = IJ(K - 1) = 9$, $Q_{.05,3,9} = 3.95$. The yardstick for identifying significant differences is then $w = 3.95\sqrt{.0000740/6} = .0139$. The grade sample means in increasing order are .8033, .8180, and .8297. Only the difference between the two largest means is smaller than w. This gives the underscoring pattern

PG70     PG58     PG64
         _____________

Grades PG58 and PG64 do not appear to differ significantly from one another in effect on true average conductivity, but both differ from the PG70 grade.

The ordered means for factor B are .7883, .8227, and .8400. All three pairs of means differ by more than .0139, so there are no underscoring lines. True average conductivity appears to be different for all three levels of aggregate content. ■
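The Tukey steps for this example can be sketched in a few lines. The studentized range value 3.95 is taken as given from the example rather than computed, and the helper name `not_separated` is an assumption introduced only for illustration.

```python
import math
from itertools import combinations

Q = 3.95            # Q_{.05,3,9}, quoted in Example 11.8
mse = 0.0000740     # error mean square from Table 11.7
n_per_mean = 6      # JK = IK = 3 * 2 observations behind each level mean

# Step 2: the yardstick w = Q * sqrt(MSE / (number of observations per mean)).
w = Q * math.sqrt(mse / n_per_mean)

def not_separated(means, w):
    """Return the pairs of levels whose sample means differ by less than w."""
    ordered = sorted(means, key=means.get)   # Step 3: order from smallest to largest
    return [(a, b) for a, b in combinations(ordered, 2)
            if abs(means[a] - means[b]) < w]

grade_means = {"PG58": .8180, "PG64": .8297, "PG70": .8033}
content_means = {"38": .8400, "41": .8227, "44": .7883}

print(f"w = {w:.4f}")
print("grades not separated:  ", not_separated(grade_means, w))
print("contents not separated:", not_separated(content_means, w))
```

Pairs returned by `not_separated` are the ones that would be underscored; every other pair of levels is judged significantly different. With these inputs the yardstick evaluates to roughly .0139.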


Models with Mixed and Random Effects

In some problems, the levels of either factor may have been chosen from a large population of possible levels, so that the effects contributed by the factor are random rather than fixed. As in Section 11.1, if both factors contribute random effects, the model is referred to as a random effects model, whereas if one factor is fixed and the other is random, a mixed effects model results. We will now consider the analysis for a mixed effects model in which factor A (rows) is the fixed factor and factor B (columns) is the random factor. The case in which both factors are random is dealt with in Exercise 26.

DEFINITION

The mixed effects model when factor A is fixed and factor B is random is

$$X_{ijk} = \mu + \alpha_i + B_j + G_{ij} + \epsilon_{ijk} \qquad i = 1, \ldots, I,\ j = 1, \ldots, J,\ k = 1, \ldots, K$$

Here $\mu$ and $\alpha_i$’s are constants with $\sum\alpha_i = 0$, and the $B_j$’s, $G_{ij}$’s, and $\epsilon_{ijk}$’s are independent, normally distributed random variables with expected value 0 and variances $\sigma_B^2$, $\sigma_G^2$, and $\sigma^2$, respectively.* The relevant hypotheses here are somewhat different from those for the fixed effects model.

$H_{0A}: \alpha_1 = \alpha_2 = \cdots = \alpha_I = 0$   versus   $H_{aA}$: at least one $\alpha_i \ne 0$
$H_{0B}: \sigma_B^2 = 0$   versus   $H_{aB}: \sigma_B^2 > 0$
$H_{0G}: \sigma_G^2 = 0$   versus   $H_{aG}: \sigma_G^2 > 0$

It is customary to test $H_{0A}$ and $H_{0B}$ only if the no-interaction hypothesis $H_{0G}$ cannot be rejected.

Sums of squares and mean squares needed for the test procedures are defined and computed exactly as in the fixed effects case. The expected mean squares for the mixed model are

$$\begin{aligned}
E(MSE) &= \sigma^2 \\
E(MSA) &= \sigma^2 + K\sigma_G^2 + \frac{JK}{I - 1}\sum\alpha_i^2 \\
E(MSB) &= \sigma^2 + K\sigma_G^2 + IK\sigma_B^2 \\
E(MSAB) &= \sigma^2 + K\sigma_G^2
\end{aligned}$$

The ratio $f_{AB} = MSAB/MSE$ is again appropriate for testing the no-interaction hypothesis, with $H_{0G}$ rejected if $f_{AB} \ge F_{\alpha, (I-1)(J-1), IJ(K-1)}$. However, for testing $H_{0A}$ versus $H_{aA}$, the expected mean squares suggest that although the numerator of the F ratio should still be MSA, the denominator should be MSAB rather than MSE. MSAB is also the denominator of the F ratio for testing $H_{0B}$.

* This is referred to as an “unrestricted” model. An alternative “restricted” model requires that $\sum_i G_{ij} = 0$ for each j (so the $G_{ij}$’s are no longer independent). Expected mean squares and F ratios appropriate for testing certain hypotheses depend on the choice of model. Minitab’s default option gives output for the unrestricted model.


For testing $H_{0A}$ versus $H_{aA}$ (A fixed, B random), the test statistic value is $f_A = MSA/MSAB$, and the rejection region is $f_A \ge F_{\alpha, I-1, (I-1)(J-1)}$. The test of $H_{0B}$ versus $H_{aB}$ utilizes $f_B = MSB/MSAB$, with rejection region $f_B \ge F_{\alpha, J-1, (I-1)(J-1)}$.

Example 11.9

A process engineer has identified two potential causes of electric motor vibration, the material used for the motor casing (factor A) and the supply source of bearings used in the motor (factor B). The accompanying data on the amount of vibration (microns) resulted from an experiment in which motors with casings made of steel, aluminum, and plastic were constructed using bearings supplied by five randomly selected sources.

                                    Supply Source
Material        1            2            3            4            5
Steel       13.1  13.2   16.3  15.8   13.7  14.3   15.7  15.8   13.5  12.5
Aluminum    15.0  14.8   15.7  16.4   13.9  14.3   13.7  14.2   13.4  13.8
Plastic     14.0  14.3   17.2  16.7   12.4  12.3   14.4  13.9   13.2  13.1

Only the three casing materials used in the experiment are under consideration for use in production, so factor A is fixed. However, the five supply sources were randomly selected from a much larger population, so factor B is random. The relevant null hypotheses are

$$H_{0A}: \alpha_1 = \alpha_2 = \alpha_3 = 0 \qquad H_{0B}: \sigma_B^2 = 0 \qquad H_{0AB}: \sigma_G^2 = 0$$

Minitab output appears in Figure 11.7. The P-value column in the ANOVA table indicates that the latter two null hypotheses should be rejected at significance level .05. Different casing materials by themselves do not appear to affect vibration, but interaction between material and supplier is a significant source of variation in vibration.

Factor     Type     Levels   Values
casmater   fixed         3   1 2 3
source     random        5   1 2 3 4 5

Source              DF        SS       MS       F       P
casmater             2    0.7047   0.3523    0.24   0.790
source               4   36.6747   9.1687    6.32   0.013
casmater*source      8   11.6053   1.4507   13.03   0.000
Error               15    1.6700   0.1113
Total               29   50.6547

Source               Variance    Error   Expected Mean Square for Each Term
                     component   term    (using unrestricted model)
1 casmater                       3       (4) + 2(3) + Q[1]
2 source             1.2863      3       (4) + 2(3) + 6(2)
3 casmater*source    0.6697      4       (4) + 2(3)
4 Error              0.1113              (4)

Figure 11.7 Output from Minitab’s balanced ANOVA option for the data of Example 11.9 ■

When at least two of the $K_{ij}$’s are unequal, the ANOVA computations are much more complex than for the case $K_{ij} = K$. In addition, there is controversy as to which test procedures should be used. One of the chapter references can be consulted for more information.
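The mixed-model F ratios and the variance components shown in Figure 11.7 follow directly from the mean squares. The sketch below takes the mean squares from the Minitab output as given. The variance component estimates are obtained by equating observed mean squares to the expected mean squares of the unrestricted model and solving; that method-of-moments step is an assumption about how the output column is produced, though it matches the printed values.

```python
I, J, K = 3, 5, 2   # casing materials, supply sources, replications per cell

# Mean squares from the ANOVA table in Figure 11.7.
ms = {"A": 0.3523, "B": 9.1687, "AB": 1.4507, "E": 0.1113}

# Mixed model (A fixed, B random): MSAB is the denominator for both main effects,
# while the interaction is still tested against MSE.
f_A = ms["A"] / ms["AB"]
f_B = ms["B"] / ms["AB"]
f_AB = ms["AB"] / ms["E"]

# Solve E(MSAB) = sigma^2 + K*sigma_G^2 and E(MSB) = sigma^2 + K*sigma_G^2 + I*K*sigma_B^2.
var_error = ms["E"]
var_G = (ms["AB"] - ms["E"]) / K
var_B = (ms["B"] - ms["AB"]) / (I * K)

print(f"f_A = {f_A:.2f}, f_B = {f_B:.2f}, f_AB = {f_AB:.2f}")
print(f"variance components: source = {var_B:.4f}, "
      f"interaction = {var_G:.4f}, error = {var_error:.4f}")
```

Had the fixed effects denominator MSE been used for the casing material test, the ratio would have been `ms["A"] / ms["E"]`, which is much larger than `f_A`; the choice of denominator therefore matters for the conclusion.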


EXERCISES Section 11.2 (16–26)

16. In an experiment to assess the effects of curing time (factor A) and type of mix (factor B) on the compressive strength of hardened cement cubes, three different curing times were used in combination with four different mixes, with three observations obtained for each of the 12 curing time–mix combinations. The resulting sums of squares were computed to be $SSA = 30{,}763.0$, $SSB = 34{,}185.6$, $SSE = 97{,}436.8$, and $SST = 205{,}966.6$.

a. Construct an ANOVA table.

b. Test at level .05 the null hypothesis $H_{0AB}$: all $\gamma_{ij}$’s $= 0$ (no interaction of factors) against $H_{aAB}$: at least one $\gamma_{ij} \ne 0$.

c. Test at level .05 the null hypothesis $H_{0A}: \alpha_1 = \alpha_2 = \alpha_3 = 0$ (factor A main effects are absent) against $H_{aA}$: at least one $\alpha_i \ne 0$.

d. Test $H_{0B}: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$ versus $H_{aB}$: at least one $\beta_j \ne 0$ using a level .05 test.

e. The values of the $\bar{x}_{i\cdot\cdot}$’s were $\bar{x}_{1\cdot\cdot} = 4010.88$, $\bar{x}_{2\cdot\cdot} = 4029.10$, and $\bar{x}_{3\cdot\cdot} = 3960.02$. Use Tukey’s procedure to investigate significant differences among the three curing times.

17. The article “Towards Improving the Properties of Plaster Moulds and Castings” (J. Engr. Manuf., 1991: 265–269) describes several ANOVAs carried out to study how the amount of carbon fiber and sand additions affect various characteristics of the molding process. Here we give data on casting hardness and on wet-mold strength.

Sand           Carbon Fiber     Casting     Wet-Mold
Addition (%)   Addition (%)     Hardness    Strength
     0              0             61.0        34.0
     0              0             63.0        16.0
    15              0             67.0        36.0
    15              0             69.0        19.0
    30              0             65.0        28.0
    30              0             74.0        17.0
     0             .25            69.0        49.0
     0             .25            69.0        48.0
    15             .25            69.0        43.0
    15             .25            74.0        29.0
    30             .25            74.0        31.0
    30             .25            72.0        24.0
     0             .50            67.0        55.0
     0             .50            69.0        60.0
    15             .50            69.0        45.0
    15             .50            74.0        43.0
    30             .50            74.0        22.0
    30             .50            74.0        48.0

a. An ANOVA for wet-mold strength gives $SSSand = 705$, $SSFiber = 1278$, $SSE = 843$, and $SST = 3105$. Test for the presence of any effects using $\alpha = .05$.

b. Carry out an ANOVA on the casting hardness observations using $\alpha = .05$.

c. Plot sample mean hardness against sand percentage for different levels of carbon fiber. Is the plot consistent with your analysis in part (b)?

18. The accompanying data resulted from an experiment to investigate whether yield from a certain chemical process depended either on the formulation of a particular input or on mixer speed.

                        Speed
Formulation      60       70       80
     1         189.7    185.1    189.0
               188.6    179.4    193.0
               190.1    177.3    191.1
     2         165.1    161.7    163.3
               165.9    159.8    166.6
               167.6    161.6    170.3

A statistical computer package gave $SS(Form) = 2253.44$, $SS(Speed) = 230.81$, $SS(Form*Speed) = 18.58$, and $SSE = 71.87$.

a. Does there appear to be interaction between the factors?

b. Does yield appear to depend on either formulation or speed?

c. Calculate estimates of the main effects.

d. The fitted values are $\hat{x}_{ijk} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij}$, and the residuals are $x_{ijk} - \hat{x}_{ijk}$. Verify that the residuals are .23, −.87, .63, 4.50, −1.20, −3.30, −2.03, 1.97, .07, −1.10, −.30, 1.40, .67, −1.23, .57, −3.43, −.13, and 3.57.

e. Construct a normal probability plot from the residuals given in part (d). Do the $\epsilon_{ijk}$’s appear to be normally distributed?

19. The accompanying data table gives observations on total acidity of coal samples of three different types, with determinations made using three different concentrations of ethanolic NaOH (“Chemistry of Brown Coals,” Australian J. Applied Science, 1958: 375–379).

                           Type of Coal
NaOH Conc.     Morwell       Yallourn      Maddingley
  .404N      8.27, 8.17    8.66, 8.61    8.14, 7.96
  .626N      8.03, 8.21    8.42, 8.58    8.02, 7.89
  .786N      8.60, 8.20    8.61, 8.76    8.13, 8.07

a. Assuming both effects to be fixed, construct an ANOVA table, test for the presence of interaction, and then test for the presence of main effects for each factor (all using level .01).


b. Use Tukey’s procedure to identify significant differences among the types of coal.

20. The article “Fatigue Limits of Enamel Bonds with Moist and Dry Techniques” (Dental Materials, 2009: 1527–1531) described an experiment to investigate the ability of adhesive systems to bond to mineralized tooth structures. The response variable is shear bond strength (MPa), and two different adhesives (Adper Single Bond Plus and OptiBond Solo Plus) were used in combination with two different surface conditions. The accompanying data was supplied by the authors of the article. The first 12 observations came from the SBP-dry treatment, the next 12 from the SBP-moist treatment, the next 12 from the OBP-dry treatment, and the last 12 from the OBP-moist treatment.

56.7 57.4 53.4 54.0 49.9 49.9

56.2 51.9 49.6 45.7 56.8 54.1

49.2 47.4 53.7 50.6 62.7 48.8

41.0 57.4 51.4 53.4 55.2 38.9

38.8 46.0 38.0 47.0 46.2 39.8

25.9 37.8 43.4 40.2 35.4 40.3

40.6 35.5 58.7 50.4 43.1 61.7

33.3 38.7 45.4 47.2 53.3 44.9

a. Construct a comparative boxplot of the data on the four different treatments and comment.

b. Carry out an appropriate analysis of variance and state your conclusions (use a significance level of .01 for any tests). Include any graphs that provide insight.

c. If a significance level of .05 is used for the two-way ANOVA, the interaction effect is significant (just as in general different glues work better with some materials than with others). So now it makes sense to carry out a one-way ANOVA on the four treatments SBP-D, SBP-M, OBP-D, and OBP-M. Do this, and identify significant differences among the treatments.

21. In an experiment to investigate the effect of “cement factor” (number of sacks of cement per cubic yard) on flexural strength of the resulting concrete (“Studies of Flexural Strength of Concrete. Part 3: Effects of Variation in Testing Procedure,” Proceedings, ASTM, 1957: 1127–1139), $I = 3$ different factor values were used, $J = 5$ different batches of cement were selected, and $K = 2$ beams were cast from each cement factor/batch combination. Sums of squares include $SSA = 22{,}941.80$, $SSB = 22{,}765.53$, $SSE = 15{,}253.50$, and $SST = 64{,}954.70$. Construct the ANOVA table. Then, assuming a mixed model with cement factor (A) fixed and batches (B) random, test the three pairs of hypotheses of interest at level .05.

22. A study was carried out to compare the writing lifetimes of four premium brands of pens. It was thought that the writing surface might affect lifetime, so three different surfaces were randomly selected. A writing machine was used to ensure that conditions were otherwise homogeneous (e.g., constant pressure and a fixed angle). The accompanying table shows the two lifetimes (min) obtained for each brand–surface combination.

                         Writing Surface
Brand of Pen       1           2           3         $x_{i\cdot\cdot}$
     1          709, 659    713, 726    660, 645      4112
     2          668, 685    722, 740    692, 720      4227
     3          659, 685    666, 684    678, 750      4122
     4          698, 650    704, 666    686, 733      4137
$x_{\cdot j\cdot}$            5413        5621        5564       16,598

Carry out an appropriate ANOVA, and state your conclusions.

23. The accompanying data was obtained in an experiment to investigate whether compressive strength of concrete cylinders depends on the type of capping material used or variability in different batches (“The Effect of Type of Capping Material on the Compressive Strength of Concrete Cylinders,” Proceedings ASTM, 1958: 1166–1186). Each number is a cell total ($x_{ij\cdot}$) based on $K = 3$ observations.

                               Batch
Capping Material      1       2       3       4       5
       1            1847    1942    1935    1891    1795
       2            1779    1850    1795    1785    1626
       3            1806    1892    1889    1891    1756

In addition, $\sum\sum\sum x_{ijk}^2 = 16{,}815{,}853$ and $\sum\sum x_{ij\cdot}^2 = 50{,}443{,}409$. Obtain the ANOVA table and then test at level .01 the hypotheses $H_{0G}$ versus $H_{aG}$, $H_{0A}$ versus $H_{aA}$, and $H_{0B}$ versus $H_{aB}$, assuming that capping is a fixed effect and batches is a random effect.

24. a. Show that $E(\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot}) = \alpha_i$, so that $\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot}$ is an unbiased estimator for $\alpha_i$ (in the fixed effects model).

b. With $\hat{\gamma}_{ij} = \bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot}$, show that $\hat{\gamma}_{ij}$ is an unbiased estimator for $\gamma_{ij}$ (in the fixed effects model).

25. Show how a $100(1 - \alpha)\%$ t CI for $\alpha_i - \alpha_{i'}$ can be obtained. Then compute a 95% interval for $\alpha_2 - \alpha_3$ using the data from Exercise 19. [Hint: With $\theta = \alpha_2 - \alpha_3$, the result of Exercise 24(a) indicates how to obtain $\hat{\theta}$. Then compute $V(\hat{\theta})$ and $\sigma_{\hat{\theta}}$, and obtain an estimate of $\sigma_{\hat{\theta}}$ by using $\sqrt{MSE}$ to estimate $\sigma$ (which identifies the appropriate number of df).]

26. When both factors are random in a two-way ANOVA experiment with K replications per combination of factor levels, the expected mean squares are $E(MSE) = \sigma^2$, $E(MSA) = \sigma^2 + K\sigma_G^2 + JK\sigma_A^2$, $E(MSB) = \sigma^2 + K\sigma_G^2 + IK\sigma_B^2$, and $E(MSAB) = \sigma^2 + K\sigma_G^2$.

a. What F ratio is appropriate for testing $H_{0G}: \sigma_G^2 = 0$ versus $H_{aG}: \sigma_G^2 > 0$?

b. Answer part (a) for testing $H_{0A}: \sigma_A^2 = 0$ versus $H_{aA}: \sigma_A^2 > 0$ and $H_{0B}: \sigma_B^2 = 0$ versus $H_{aB}: \sigma_B^2 > 0$.


11.3 Three-Factor ANOVA

To indicate the nature of models and analyses when ANOVA experiments involve more than two factors, we will focus here on the case of three fixed factors: A, B, and C. The numbers of levels of these factors will be denoted by I, J, and K, respectively, and $L_{ijk}$ = the number of observations made with factor A at level i, factor B at level j, and factor C at level k. The analysis is quite complicated when the $L_{ijk}$’s are not all equal, so we further specialize to $L_{ijk} = L$. Then $X_{ijkl}$ and $x_{ijkl}$ denote the observed value, before and after the experiment is performed, of the lth replication $(l = 1, 2, \ldots, L)$ when the three factors are fixed at levels i, j, and k.

To understand the parameters that will appear in the three-factor ANOVA model, first recall that in two-factor ANOVA with replications, $E(X_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$, where the restrictions $\sum_i\alpha_i = \sum_j\beta_j = 0$, $\sum_i\gamma_{ij} = 0$ for every j, and $\sum_j\gamma_{ij} = 0$ for every i were necessary to obtain a unique set of parameters. If we use dot subscripts on the $\mu_{ij}$’s to denote averaging (rather than summation), then

$$\mu_{i\cdot} - \mu_{\cdot\cdot} = \frac{1}{J}\sum_j \mu_{ij} - \frac{1}{IJ}\sum_i\sum_j \mu_{ij} = \alpha_i$$

is the effect of factor A at level i averaged over levels of factor B, whereas

$$\mu_{ij} - \mu_{\cdot j} = \mu_{ij} - \frac{1}{I}\sum_i \mu_{ij} = \alpha_i + \gamma_{ij}$$

is the effect of factor A at level i specific to factor B at level j. When the effect of A at level i depends on the level of B, there is interaction between the factors, and the $\gamma_{ij}$’s are not all zero. In particular,

$$\mu_{ij} - \mu_{\cdot j} - \mu_{i\cdot} + \mu_{\cdot\cdot} = \gamma_{ij} \tag{11.11}$$

The Fixed Effects Model and Test Procedures

The fixed effects model for three-factor ANOVA with is

(11.12)

where the are normally distributed with mean 0 and variance s2, and

(11.13)mijk 5 m 1 ai 1 bi 1 dk 1 gij AB 1 gik

AC 1 gjk BC 1 gijk

Pijkl’s

k 5 1, c, K, l 5 1, c, L

Xijkl 5 mijk 1 Pijkl i 5 1, c, I, j 5 1, c, J

Lijk 5 L

The restrictions necessary to obtain uniquely defined parameters are that the sum over any subscript of any parameter on the right-hand side of (11.13) equal 0.

The parameters $\gamma_{ij}^{AB}$, $\gamma_{ik}^{AC}$, and $\gamma_{jk}^{BC}$ are called two-factor interactions, and $\gamma_{ijk}$ is called a three-factor interaction; the $\alpha_i$'s, $\beta_j$'s, and $\delta_k$'s are the main effects parameters. For any fixed level k of the third factor, analogous to (11.11),

\[ \mu_{ijk} - \mu_{i\cdot k} - \mu_{\cdot jk} + \mu_{\cdot\cdot k} = \gamma_{ij}^{AB} + \gamma_{ijk} \]

is the interaction of the ith level of A with the jth level of B specific to the kth level of C, whereas

\[ \mu_{ij\cdot} - \mu_{i\cdot\cdot} - \mu_{\cdot j\cdot} + \mu_{\cdot\cdot\cdot} = \gamma_{ij}^{AB} \]


is the interaction between A at level i and B at level j averaged over levels of C. If the interaction of A at level i and B at level j does not depend on k, then all $\gamma_{ijk}$'s equal 0. Thus nonzero $\gamma_{ijk}$'s represent nonadditivity of the two-factor $\gamma_{ij}^{AB}$'s over the various levels of the third factor C. If the experiment included more than three factors, there would be corresponding higher-order interaction terms with analogous interpretations. Note that in the previous argument, if we had considered fixing the level of either A or B (rather than C, as was done) and examining the $\gamma_{ijk}$'s, their interpretation would be the same; if any of the interactions of two factors depend on the level of the third factor, then there are nonzero $\gamma_{ijk}$'s.

When $L > 1$, there is a sum of squares for each main effect, each two-factor interaction, and the three-factor interaction. To write these in a way that indicates how sums of squares are defined when there are more than three factors, note that any of the model parameters in (11.13) can be estimated unbiasedly by averaging $X_{ijkl}$ over appropriate subscripts and taking differences. Thus

\[ \hat{\mu} = \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\alpha}_i = \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\gamma}_{ij}^{AB} = \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot} \]

\[ \hat{\gamma}_{ijk} = \bar{X}_{ijk\cdot} - \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot k\cdot} - \bar{X}_{\cdot jk\cdot} + \bar{X}_{i\cdot\cdot\cdot} + \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot k\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot} \]

with other main effects and interaction estimators obtained by symmetry.

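DEFINITION Relevant sums of squares are

\[ SST = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = IJKL - 1 \]

\[ SSA = \sum_i \sum_j \sum_k \sum_l \hat{\alpha}_i^2 = JKL \sum_i (\bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = I - 1 \]

\[ SSAB = \sum_i \sum_j \sum_k \sum_l (\hat{\gamma}_{ij}^{AB})^2 = KL \sum_i \sum_j (\bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = (I - 1)(J - 1) \]

\[ SSABC = \sum_i \sum_j \sum_k \sum_l \hat{\gamma}_{ijk}^2 = L \sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2 \qquad df = (I - 1)(J - 1)(K - 1) \]

\[ SSE = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{ijk\cdot})^2 \qquad df = IJK(L - 1) \]

with the remaining main effect and two-factor interaction sums of squares obtained by symmetry. SST is the sum of the other eight SSs.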

Each sum of squares (excepting SST) when divided by its df gives a mean square. Expected mean squares are

\[ E(MSE) = \sigma^2 \]

\[ E(MSA) = \sigma^2 + \frac{JKL}{I - 1} \sum_i \alpha_i^2 \]

\[ E(MSAB) = \sigma^2 + \frac{KL}{(I - 1)(J - 1)} \sum_i \sum_j (\gamma_{ij}^{AB})^2 \]

\[ E(MSABC) = \sigma^2 + \frac{L}{(I - 1)(J - 1)(K - 1)} \sum_i \sum_j \sum_k (\gamma_{ijk})^2 \]

with similar expressions for the other expected mean squares. Main effect and interaction hypotheses are tested by forming F ratios with MSE in each denominator.


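Null Hypothesis                                   Test Statistic Value         Rejection Region

$H_{0A}$: all $\alpha_i$'s $= 0$                  $f_A = MSA/MSE$              $f_A \ge F_{\alpha,\, I-1,\, IJK(L-1)}$

$H_{0AB}$: all $\gamma_{ij}^{AB}$'s $= 0$         $f_{AB} = MSAB/MSE$          $f_{AB} \ge F_{\alpha,\, (I-1)(J-1),\, IJK(L-1)}$

$H_{0ABC}$: all $\gamma_{ijk}$'s $= 0$            $f_{ABC} = MSABC/MSE$        $f_{ABC} \ge F_{\alpha,\, (I-1)(J-1)(K-1),\, IJK(L-1)}$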

Usually the main effect hypotheses are tested only if all interactions are judged not significant.

This analysis assumes that $L_{ijk} = L > 1$. If $L = 1$, then as in the two-factor case, the highest-order interactions must be assumed absent to obtain an MSE that estimates $\sigma^2$. Setting $L = 1$ and disregarding the fourth subscript summation over l, the foregoing formulas for sums of squares are still valid, and error sum of squares is $SSE = \sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2$ with $\bar{X}_{ijk\cdot} = X_{ijk}$ in the expression for $\hat{\gamma}_{ijk}$.

Example 11.10 The following observations (body temperature − 100°F) were reported in an experiment to study heat tolerance of cattle (“The Significance of the Coat in Heat Tolerance of Cattle,” Australian J. Agric. Res., 1959: 744–748). Measurements were made at four different periods (factor A, with $I = 4$) on two different strains of cattle (factor B, with $J = 2$) having four different types of coat (factor C, with $K = 4$); $L = 3$ observations were made for each of the $4 \times 2 \times 4 = 32$ combinations of levels of the three factors.

                 B1                          B2
        C1    C2    C3    C4        C1    C2    C3    C4

A1      3.6   3.4   2.9   2.5       4.2   4.4   3.6   3.0
        3.8   3.7   2.8   2.4       4.0   3.9   3.7   2.8
        3.9   3.9   2.7   2.2       3.9   4.2   3.4   2.9

A2      3.8   3.8   2.9   2.4       4.4   4.2   3.8   2.0
        3.6   3.9   2.9   2.2       4.4   4.3   3.7   2.9
        4.0   3.9   2.8   2.2       4.6   4.7   3.4   2.8

A3      3.7   3.8   2.9   2.1       4.2   4.0   4.0   2.0
        3.9   4.0   2.7   2.0       4.4   4.6   3.8   2.4
        4.2   3.9   2.8   1.8       4.5   4.5   3.3   2.0

A4      3.6   3.6   2.6   2.0       4.0   4.0   3.8   2.0
        3.5   3.7   2.9   2.0       4.1   4.4   3.7   2.2
        3.8   3.9   2.9   1.9       4.2   4.2   3.5   2.3

The table of cell totals ($x_{ijk\cdot}$'s) for all combinations of the three factors is

$x_{ijk\cdot}$
                 B1                          B2
        C1    C2    C3    C4        C1    C2    C3    C4

A1     11.3  11.0   8.4   7.1      12.1  12.5  10.7   8.7
A2     11.4  11.6   8.6   6.8      13.4  13.2  10.9   7.7
A3     11.8  11.7   8.4   5.9      13.1  13.1  11.1   6.4
A4     10.9  11.2   8.4   5.9      12.3  12.6  11.0   6.5


Figure 11.8 displays plots of the corresponding cell means $\bar{x}_{ijk\cdot} = x_{ijk\cdot}/3$. We will return to these plots after considering tests of various hypotheses. The basis for these tests is the ANOVA table given in Table 11.8.

[Figure 11.8 Plots of $\bar{x}_{ijk\cdot}$ for Example 11.10: four panels, one for each of C1, C2, C3, and C4, each plotting the cell means against B1 and B2 on a vertical axis running from 1.5 to 4.5]

Table 11.8 ANOVA Table for Example 11.10

Source   df                                 Sum of Squares   Mean Square   f

A        $I - 1 = 3$                        .49              .163          4.13
B        $J - 1 = 1$                        6.45             6.45          163.29
C        $K - 1 = 3$                        48.93            16.31         412.91
AB       $(I - 1)(J - 1) = 3$               .02              .0067         .170
AC       $(I - 1)(K - 1) = 9$               1.61             .179          4.53
BC       $(J - 1)(K - 1) = 3$               .88              .293          7.42
ABC      $(I - 1)(J - 1)(K - 1) = 9$        .25              .0278         .704
Error    $IJK(L - 1) = 64$                  2.53             .0395
Total    $IJKL - 1 = 95$                    61.16

Since $F_{.01,9,64} \approx 2.70$ and $f_{ABC} = MSABC/MSE = .704$ does not exceed 2.70, we conclude that three-factor interactions are not significant. However, although the AB interactions are also not significant, both AC and BC interactions as well as all main effects seem to be necessary in the model. When there are no ABC or AB interactions, a plot of the $\bar{x}_{ijk\cdot}$'s ($= \hat{\mu}_{ijk}$) separately for each level of C should reveal no substantial interactions (if only the ABC interactions are zero, plots are more difficult to interpret; see the article “Two-Dimensional Plots for Interpreting Interactions in the Three-Factor Analysis of Variance Model,” Amer. Statistician, May 1979: 63–69). ■

Diagnostic plots for checking the normality and constant variance assumptions can be constructed as described in previous sections. Tukey’s procedure can be used in three-factor (or more) ANOVA. The second subscript on Q is the number of sample means being compared, and the third is degrees of freedom for error.

Models with random and mixed effects are also sometimes appropriate. Sums of squares and degrees of freedom are identical to the fixed effects case, but expected mean squares are, of course, different for the random main effects or interactions. A good reference is the book by Douglas Montgomery listed in the chapter bibliography.
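The sums of squares in Table 11.8 can be recomputed directly from the Example 11.10 observations by forming the estimators $\hat{\alpha}_i$, $\hat{\gamma}_{ij}^{AB}$, $\hat{\gamma}_{ijk}$, and so on, as differences of averages. The following is a minimal sketch in plain Python, not the text's own computation; the helper `m`, the dictionary `x`, and the layout of `ROWS` are illustrative names introduced here. Critical values such as $F_{.01,9,64}$ still have to be looked up in a table.

```python
from itertools import product

I, J, K, L = 4, 2, 4, 3  # periods (A), strains (B), coat types (C), replicates

# Example 11.10 observations. Each row lists B1:C1..C4 and then B2:C1..C4;
# rows 0-2 are the three replicates for A1, rows 3-5 for A2, and so on.
ROWS = [
    [3.6, 3.4, 2.9, 2.5, 4.2, 4.4, 3.6, 3.0],
    [3.8, 3.7, 2.8, 2.4, 4.0, 3.9, 3.7, 2.8],
    [3.9, 3.9, 2.7, 2.2, 3.9, 4.2, 3.4, 2.9],
    [3.8, 3.8, 2.9, 2.4, 4.4, 4.2, 3.8, 2.0],
    [3.6, 3.9, 2.9, 2.2, 4.4, 4.3, 3.7, 2.9],
    [4.0, 3.9, 2.8, 2.2, 4.6, 4.7, 3.4, 2.8],
    [3.7, 3.8, 2.9, 2.1, 4.2, 4.0, 4.0, 2.0],
    [3.9, 4.0, 2.7, 2.0, 4.4, 4.6, 3.8, 2.4],
    [4.2, 3.9, 2.8, 1.8, 4.5, 4.5, 3.3, 2.0],
    [3.6, 3.6, 2.6, 2.0, 4.0, 4.0, 3.8, 2.0],
    [3.5, 3.7, 2.9, 2.0, 4.1, 4.4, 3.7, 2.2],
    [3.8, 3.9, 2.9, 1.9, 4.2, 4.2, 3.5, 2.3],
]

# x[(i, j, k, l)] with zero-based indices for A, B, C, and replicate
x = {(r // L, c // K, c % K, r % L): v
     for r, row in enumerate(ROWS) for c, v in enumerate(row)}


def m(i=None, j=None, k=None):
    """Average of the observations; a subscript left as None is averaged over."""
    vals = [v for (a, b, c, _), v in x.items()
            if i in (None, a) and j in (None, b) and k in (None, c)]
    return sum(vals) / len(vals)


g = m()  # grand mean
rI, rJ, rK = range(I), range(J), range(K)
ss = {
    "A": J * K * L * sum((m(i=i) - g) ** 2 for i in rI),
    "B": I * K * L * sum((m(j=j) - g) ** 2 for j in rJ),
    "C": I * J * L * sum((m(k=k) - g) ** 2 for k in rK),
    "AB": K * L * sum((m(i=i, j=j) - m(i=i) - m(j=j) + g) ** 2
                      for i, j in product(rI, rJ)),
    "AC": J * L * sum((m(i=i, k=k) - m(i=i) - m(k=k) + g) ** 2
                      for i, k in product(rI, rK)),
    "BC": I * L * sum((m(j=j, k=k) - m(j=j) - m(k=k) + g) ** 2
                      for j, k in product(rJ, rK)),
    "ABC": L * sum((m(i, j, k) - m(i=i, j=j) - m(i=i, k=k) - m(j=j, k=k)
                    + m(i=i) + m(j=j) + m(k=k) - g) ** 2
                   for i, j, k in product(rI, rJ, rK)),
    "Error": sum((v - m(i, j, k)) ** 2 for (i, j, k, _), v in x.items()),
}
df = {"A": I - 1, "B": J - 1, "C": K - 1,
      "AB": (I - 1) * (J - 1), "AC": (I - 1) * (K - 1), "BC": (J - 1) * (K - 1),
      "ABC": (I - 1) * (J - 1) * (K - 1), "Error": I * J * K * (L - 1)}
sst = sum((v - g) ** 2 for v in x.values())

mse = ss["Error"] / df["Error"]
for name in ss:
    ms = ss[name] / df[name]
    f = "" if name == "Error" else f"{ms / mse:9.2f}"
    print(f"{name:6s}{df[name]:4d}{ss[name]:9.3f}{ms:9.4f}{f}")
print(f"{'Total':6s}{sum(df.values()):4d}{sst:9.3f}")
```

Because the design is balanced, the eight component sums of squares add up to SST, which gives a quick check on the arithmetic. Small differences from Table 11.8 in the last digit can arise because the table works with rounded intermediate values.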


Latin Square Designs

When several factors are to be studied simultaneously, an experiment in which there is at least one observation for every possible combination of levels is referred to as a complete layout. If the factors are A, B, and C with I, J, and K levels, respectively, a complete layout requires at least IJK observations. Frequently an experiment of this size is either impracticable because of cost, time, or space constraints or literally impossible. For example, if the response variable is sales of a certain product and the factors are different display configurations, different stores, and different time periods, then only one display configuration can realistically be used in a given store during a given time period.

A three-factor experiment in which fewer than IJK observations are made is called an incomplete layout. There are some incomplete layouts in which the pattern of combinations of factors is such that the analysis is straightforward. One such three-factor design is called a Latin square. It is appropriate when $I = J = K$ (e.g., four display configurations, four stores, and four time periods) and all two- and three-factor interaction effects are assumed absent. If the levels of factor A are identified with the rows of a two-way table and the levels of B with the columns of the table, then the defining characteristic of a Latin square design is that every level of factor C appears exactly once in each row and exactly once in each column. Figure 11.9 shows examples of $3 \times 3$, $4 \times 4$, and $5 \times 5$ Latin squares. There are 12 different $3 \times 3$ Latin squares, and the number of different $N \times N$ Latin squares increases rapidly with the number of levels (e.g., every permutation of rows of a given Latin square yields a Latin square, and similarly for column permutations). It is recommended that the square used in an actual experiment be chosen at random from the set of all possible squares of the desired dimension; for further details, consult one of the chapter references.

3 × 3
1 2 3
2 3 1
3 1 2

4 × 4
3 4 2 1
4 2 1 3
2 1 3 4
1 3 4 2

5 × 5
4 3 1 5 2
3 1 5 2 4
5 4 2 1 3
2 5 3 4 1
1 2 4 3 5

Figure 11.9 Examples of Latin squares (rows correspond to levels of A, columns to levels of B, and each entry is the level of C)
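The defining property of a Latin square is easy to check mechanically, and the remark that row and column permutations of a Latin square are again Latin squares suggests a simple way to randomize a starting square. The sketch below is illustrative only: the function names are invented here, and shuffling the rows and columns of one fixed square is an assumed simplification that does not, in general, sample uniformly from all squares of that size as the text recommends.

```python
import random


def is_latin_square(square):
    """True if each symbol 1..N appears exactly once in every row and column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    if any(len(row) != n or set(row) != symbols for row in square):
        return False
    return all({row[c] for row in square} == symbols for c in range(n))


def permute_rows_and_columns(square, rng):
    """Return a new square with the rows and columns of `square` shuffled."""
    rows = list(square)
    rng.shuffle(rows)
    cols = list(range(len(square)))
    rng.shuffle(cols)
    return [[row[c] for c in cols] for row in rows]


# The 5 x 5 square of Figure 11.9 (entries are levels of factor C)
FIVE_BY_FIVE = [
    [4, 3, 1, 5, 2],
    [3, 1, 5, 2, 4],
    [5, 4, 2, 1, 3],
    [2, 5, 3, 4, 1],
    [1, 2, 4, 3, 5],
]

rng = random.Random(0)  # fixed seed so the shuffle is reproducible
candidate = permute_rows_and_columns(FIVE_BY_FIVE, rng)
print(is_latin_square(FIVE_BY_FIVE), is_latin_square(candidate))
```

Both checks print `True`, since permuting rows or columns cannot place a symbol twice in any row or column.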

The letter N will denote the common value of I, J, and K. Then a complete layout with one observation per combination would require $N^3$ observations, whereas a Latin square requires only $N^2$ observations. Once a particular square has been chosen, the value of k (the level of factor C) is completely determined by the values of i and j. To emphasize this, we use $x_{ij(k)}$ to denote the observed value when the three factors are at levels i, j, and k, respectively, with k taking on only one value for each i, j pair.

The model equation for a Latin square design is

\[ X_{ij(k)} = \mu + \alpha_i + \beta_j + \delta_k + \varepsilon_{ij(k)} \qquad i, j, k = 1, \ldots, N \]

where $\sum \alpha_i = \sum \beta_j = \sum \delta_k = 0$ and the $\varepsilon_{ij(k)}$'s are independent and normally distributed with mean 0 and variance $\sigma^2$.


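We employ the following notation for totals and averages:

\[ X_{i\cdot\cdot} = \sum_j X_{ij(k)} \qquad X_{\cdot j\cdot} = \sum_i X_{ij(k)} \qquad X_{\cdot\cdot k} = \sum_{i,j} X_{ij(k)} \qquad X_{\cdot\cdot\cdot} = \sum_i \sum_j X_{ij(k)} \]

\[ \bar{X}_{i\cdot\cdot} = \frac{X_{i\cdot\cdot}}{N} \qquad \bar{X}_{\cdot j\cdot} = \frac{X_{\cdot j\cdot}}{N} \qquad \bar{X}_{\cdot\cdot k} = \frac{X_{\cdot\cdot k}}{N} \qquad \bar{X}_{\cdot\cdot\cdot} = \frac{X_{\cdot\cdot\cdot}}{N^2} \]

Note that although $X_{i\cdot\cdot}$ previously suggested a double summation, now it corresponds to a single sum over all j (and the associated values of k).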

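DEFINITION Sums of squares for a Latin square experiment are

\[ SST = \sum_i \sum_j (X_{ij(k)} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N^2 - 1 \]

\[ SSA = \sum_i \sum_j (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1 \]

\[ SSB = \sum_i \sum_j (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1 \]

\[ SSC = \sum_i \sum_j (\bar{X}_{\cdot\cdot k} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1 \]

\[ SSE = \sum_i \sum_j [X_{ij(k)} - (\hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\delta}_k)]^2 = \sum_i \sum_j (X_{ij(k)} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot k} + 2\bar{X}_{\cdot\cdot\cdot})^2 \qquad df = (N - 1)(N - 2) \]

\[ SST = SSA + SSB + SSC + SSE \]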

Each mean square is, of course, the ratio SS/df. For testing $H_{0C}$: $\delta_1 = \delta_2 = \cdots = \delta_N = 0$, the test statistic value is $f_C = MSC/MSE$, with $H_{0C}$ rejected if $f_C \ge F_{\alpha,\, N-1,\, (N-1)(N-2)}$. The other two main effect null hypotheses are also rejected if the corresponding F ratio is at least $F_{\alpha,\, N-1,\, (N-1)(N-2)}$.

If any of the null hypotheses is rejected, significant differences can be identified by using Tukey’s procedure. After computing $w = Q_{\alpha,\, N,\, (N-1)(N-2)} \cdot \sqrt{MSE/N}$, pairs of sample means (the $\bar{x}_{i\cdot\cdot}$'s, $\bar{x}_{\cdot j\cdot}$'s, or $\bar{x}_{\cdot\cdot k}$'s) differing by more than w correspond to significant differences between associated factor effects (the $\alpha_i$'s, $\beta_j$'s, or $\delta_k$'s).

The hypothesis $H_{0C}$ is frequently the one of central interest. A Latin square design is used to control for extraneous variation in the A and B factors, as was done by a randomized block design for the case of a single extraneous factor. Thus in the product sales example mentioned previously, variation due to both stores and time periods is controlled by a Latin square design, enabling an investigator to test for the presence of effects due to different product-display configurations.
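The Latin square sums of squares, the F ratio for $H_{0C}$, and Tukey's w can be computed in a few lines. The sketch below is a hedged illustration rather than the text's own procedure; it uses the $6 \times 6$ abrasion-resistance data of Example 11.11 (rows A, columns B, relative humidity level as factor C), takes the tabled value $Q_{.05,6,20} = 4.45$ quoted in that example as a given constant, and uses variable names invented here.

```python
from math import sqrt

N = 6

# CELLS[i][j] = (humidity level k, observed value x_ij(k)) for row i, column j
CELLS = [
    [(3, 7.38), (4, 5.39), (6, 5.03), (2, 5.50), (5, 5.01), (1, 6.79)],
    [(2, 7.15), (1, 8.16), (5, 4.96), (4, 5.78), (3, 6.24), (6, 5.06)],
    [(4, 6.75), (6, 5.64), (3, 6.34), (5, 5.31), (1, 7.81), (2, 8.05)],
    [(1, 8.05), (3, 6.45), (2, 6.31), (6, 5.46), (4, 6.05), (5, 5.51)],
    [(6, 5.65), (5, 5.44), (1, 7.27), (3, 6.54), (2, 7.03), (4, 5.96)],
    [(5, 6.00), (2, 6.55), (4, 5.93), (1, 8.02), (6, 5.80), (3, 6.61)],
]

# Row, column, and treatment means; each is an average of N observations
row_mean = [sum(x for _, x in row) / N for row in CELLS]
col_mean = [sum(CELLS[i][j][1] for i in range(N)) / N for j in range(N)]
trt_mean = {k: sum(x for row in CELLS for kk, x in row if kk == k) / N
            for k in range(1, N + 1)}
grand = sum(row_mean) / N

ssa = N * sum((m - grand) ** 2 for m in row_mean)
ssb = N * sum((m - grand) ** 2 for m in col_mean)
ssc = N * sum((m - grand) ** 2 for m in trt_mean.values())
sse = sum((x - row_mean[i] - col_mean[j] - trt_mean[k] + 2 * grand) ** 2
          for i, row in enumerate(CELLS) for j, (k, x) in enumerate(row))
sst = sum((x - grand) ** 2 for row in CELLS for _, x in row)

mse = sse / ((N - 1) * (N - 2))
f_c = (ssc / (N - 1)) / mse

Q = 4.45  # Q_.05,6,20 as quoted in Example 11.11 (table lookup, not computed)
w = Q * sqrt(mse / N)

print(f"SSA={ssa:.2f} SSB={ssb:.2f} SSC={ssc:.2f} SSE={sse:.2f} SST={sst:.2f}")
print(f"f_C={f_c:.2f}  w={w:.2f}")
```

The sums of squares agree with Table 11.9 to two decimals. The F ratio computed from unrounded mean squares comes out near 27, slightly above the 26.89 in the table, which is the ratio of the rounded values 4.706 and .175.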

Example 11.11 In an experiment to investigate the effect of relative humidity on abrasion resistance of leather cut from a rectangular pattern (“The Abrasion of Leather,” J. Inter. Soc. Leather Trades’ Chemists, 1946: 287), a $6 \times 6$ Latin square was used to control for possible variability due to row and column position in the pattern. The six levels of relative humidity studied were 1 = 25%, 2 = 37%, 3 = 50%, 4 = 62%, 5 = 75%, and 6 = 87%, with the following results (humidity level in parentheses before each observation):

                                        B (columns)
              1          2          3          4          5          6         $x_{i\cdot\cdot}$

          1   (3) 7.38   (4) 5.39   (6) 5.03   (2) 5.50   (5) 5.01   (1) 6.79   35.10
          2   (2) 7.15   (1) 8.16   (5) 4.96   (4) 5.78   (3) 6.24   (6) 5.06   37.35
A (rows)  3   (4) 6.75   (6) 5.64   (3) 6.34   (5) 5.31   (1) 7.81   (2) 8.05   39.90
          4   (1) 8.05   (3) 6.45   (2) 6.31   (6) 5.46   (4) 6.05   (5) 5.51   37.83
          5   (6) 5.65   (5) 5.44   (1) 7.27   (3) 6.54   (2) 7.03   (4) 5.96   37.89
          6   (5) 6.00   (2) 6.55   (4) 5.93   (1) 8.02   (6) 5.80   (3) 6.61   38.91

$x_{\cdot j\cdot}$   40.98      37.63      35.84      36.61      37.94      37.98

Also, $x_{\cdot\cdot 1} = 46.10$, $x_{\cdot\cdot 2} = 40.59$, $x_{\cdot\cdot 3} = 39.56$, $x_{\cdot\cdot 4} = 35.86$, $x_{\cdot\cdot 5} = 32.23$, $x_{\cdot\cdot 6} = 32.64$, and $x_{\cdot\cdot\cdot} = 226.98$. Further computations are summarized in Table 11.9.

Table 11.9 ANOVA Table for Example 11.11

Source of Variation   df   Sum of Squares   Mean Square   f

A (rows)              5    2.19             .438          2.50
B (columns)           5    2.57             .514          2.94
C (treatments)        5    23.53            4.706         26.89
Error                 20   3.49             .175
Total                 35   31.78

Since $F_{.05,5,20} = 2.71$ and $26.89 \ge 2.71$, $H_{0C}$ is rejected in favor of the hypothesis that relative humidity does on average affect abrasion resistance.

To apply Tukey’s procedure, $w = Q_{.05,6,20} \cdot \sqrt{MSE/6} = 4.45\sqrt{.175/6} = .76$. Ordering the $\bar{x}_{\cdot\cdot k}$'s and underscoring yields

75%    87%    62%    50%    37%    25%
5.37   5.44   5.98   6.59   6.77   7.68

Sample means differing by less than w, and therefore joined by a common underscore, are 75%, 87%, and 62%; 62% and 50%; and 50% and 37%. In particular, the lowest relative humidity appears to result in a true average abrasion resistance significantly higher than for any other relative humidity studied. ■

EXERCISES Section 11.3 (27–37)

27. The output of a continuous extruding machine that coats steel pipe with plastic was studied as a function of the thermostat temperature profile (A, at three levels), the type of plastic (B, at three levels), and the speed of the rotating screw that forces the plastic through a tube-forming die (C, at three levels). There were two replications $(L = 2)$ at each combination of levels of the factors, yielding a total of 54 observations on output. The sums of squares were $SSA = 14{,}144.44$, $SSB = 5511.27$, $SSC = 244{,}696.39$, $SSAB = 1069.62$, $SSAC = 62.67$, $SSBC = 331.67$, $SSE = 3127.50$, and $SST = 270{,}024.33$.
a. Construct the ANOVA table.
b. Use appropriate F tests to show that none of the F ratios for two- or three-factor interactions is significant at level .05.
c. Which main effects appear significant?
d. With $x_{\cdot\cdot 1\cdot} = 8242$, $x_{\cdot\cdot 2\cdot} = 9732$, and $x_{\cdot\cdot 3\cdot} = 11{,}210$, use Tukey’s procedure to identify significant differences among the levels of factor C.

28. To see whether thrust force in drilling is affected by drilling speed (A), feed rate (B), or material used (C), an experiment using four speeds, three rates, and two materials was performed, with two samples $(L = 2)$ drilled at each combination of levels of the three factors. Sums of squares were calculated as follows: $SSA = 19{,}149.73$, $SSB = 2{,}589{,}047.62$, $SSC = 157{,}437.52$, $SSAB = 53{,}238.21$, $SSAC = 9033.73$, $SSBC = 91{,}880.04$, $SSE = 56{,}819.50$, and $SST = 2{,}983{,}164.81$. Construct the ANOVA


table and identify significant interactions using $\alpha = .01$. Is there any single factor that appears to have no effect on thrust force? (In other words, does any factor appear nonsignificant in every effect in which it appears?)

29. The article “An Analysis of Variance Applied to Screw Machines” (Industrial Quality Control, 1956: 8–9) describes an experiment to investigate how the length of steel bars was affected by time of day (A), heat treatment applied (B), and screw machine used (C). The three times were 8:00 A.M., 11:00 A.M., and 3:00 P.M., and there were two treatments and four machines (a $3 \times 2 \times 4$ factorial experiment), resulting in the accompanying data [coded as 1000(length − 4.380), which does not affect the analysis].

B1

C1 C2 C3 C4

A1 6, 9, 7, 9, 1, 2, 6, 6, 1, 3 5, 5 0, 4 7, 3

A2 6, 3, 8, 7, 3, 2, 7, 9, 1, 1 4, 8 1, 0 11, 6

A3 5, 4, 10, 11, 1, 2, 10, 5, 9, 6 6, 4 6, 1 4, 8

2

2

B2

C1 C2 C3 C4

A1 4, 6, 6, 5, , 0, 4, 5, 0, 1 3, 4 0, 1 5, 4

A2 3, 1, 6, 4, 2, 0, 9, 4, 1, 1, 3 , 1 6, 3

A3 6, 0, 8, 7, 0, , 4, 3, 3, 7 10, 0 4, 7, 024

22

2122

21

Sums of squares include $SSAB = 1.646$, $SSAC = 71.021$, $SSBC = 1.542$, $SSE = 447.500$, and $SST = 1037.833$.
a. Construct the ANOVA table for this data.
b. Test to see whether any of the interaction effects are significant at level .05.
c. Test to see whether any of the main effects are significant at level .05 (i.e., $H_{0A}$ versus $H_{aA}$, etc.).
d. Use Tukey’s procedure to investigate significant differences among the four machines.

30. The following summary quantities were computed from an experiment involving four levels of nitrogen (A), two times of planting (B), and two levels of potassium (C) (“Use and Misuse of Multiple Comparison Procedures,” Agronomy J., 1977: 205–208). Only one observation (N content, in percentage, of corn grain) was made for each of the 16 combinations of levels.

$SSA = .22625$   $SSB = .000025$   $SSC = .0036$   $SSAB = .004325$   $SSAC = .00065$   $SSBC = .000625$   $SST = .2384$

a. Construct the ANOVA table.
b. Assume that there are no three-way interaction effects, so that MSABC is a valid estimate of $\sigma^2$, and test at level .05 for interaction and main effects.
c. The nitrogen averages are $\bar{x}_{1\cdot\cdot} = 1.1200$, $\bar{x}_{2\cdot\cdot} = 1.3025$, $\bar{x}_{3\cdot\cdot} = 1.3875$, and $\bar{x}_{4\cdot\cdot} = 1.4300$. Use Tukey’s method to examine differences in percentage N among the nitrogen levels $(Q_{.05,4,3} = 6.82)$.

31. The article “Kolbe–Schmitt Carbonation of 2-Naphthol” (Industrial and Eng. Chemistry: Process and Design Development, 1969: 165–173) presented the accompanying data on percentage yield of BON acid as a function of reaction time (1, 2, and 3 hours), temperature (30, 70, and 100°C), and pressure (30, 70, and 100 psi). Assuming that there is no three-factor interaction, so that $SSE = SSABC$ provides an estimate of $\sigma^2$, Minitab gave the accompanying ANOVA table. Carry out all appropriate tests.

            B1                    B2                    B3
       C1    C2    C3        C1    C2    C3        C1    C2    C3

A1    68.5  73.0  68.7      72.8  80.1  72.0      72.5  72.5  73.1
A2    74.5  75.0  74.6      72.0  81.5  76.0      75.5  70.0  76.0
A3    70.5  72.5  74.7      69.5  84.5  76.0      65.0  66.5  70.5

Analysis of Variance for Yield

Source        DF        SS        MS        F        P

time           2    42.112    21.056     8.76    0.010
temp           2   110.732    55.366    23.04    0.000
press          2    68.136    34.068    14.18    0.002
time*temp      4    67.761    16.940     7.05    0.010
time*press     4    35.184     8.796     3.66    0.056
temp*press     4   136.437    34.109    14.20    0.001
Error          8    19.223     2.403
Total         26   479.585


32. When factors A and B are fixed but factor C is random and the restricted model is used (see the footnote on page 438; there is a technical complication with the unrestricted model here),

\[ E(MSE) = \sigma^2 \]

\[ E(MSA) = \sigma^2 + JL\sigma_{AC}^2 + \frac{JKL}{I - 1} \sum \alpha_i^2 \]

\[ E(MSB) = \sigma^2 + IL\sigma_{BC}^2 + \frac{IKL}{J - 1} \sum \beta_j^2 \]

\[ E(MSC) = \sigma^2 + IJL\sigma_C^2 \]

\[ E(MSAB) = \sigma^2 + L\sigma_{ABC}^2 + \frac{KL}{(I - 1)(J - 1)} \sum_i \sum_j (\gamma_{ij}^{AB})^2 \]

\[ E(MSAC) = \sigma^2 + JL\sigma_{AC}^2 \]

\[ E(MSBC) = \sigma^2 + IL\sigma_{BC}^2 \]

\[ E(MSABC) = \sigma^2 + L\sigma_{ABC}^2 \]

a. Based on these expected mean squares, what F ratios would you use to test $H_0$: $\sigma_{ABC}^2 = 0$; $H_0$: $\sigma_C^2 = 0$; $H_0$: $\gamma_{ij}^{AB} = 0$ for all i, j; and $H_0$: $\alpha_1 = \cdots = \alpha_I = 0$?
b. In an experiment to assess the effects of age, type of soil, and day of production on compressive strength of cement/soil mixtures, two ages (A), four types of soil (B), and 3 days (C, assumed random) were used, with $L = 2$ observations made for each combination of factor levels. The resulting sums of squares were $SSA = 14{,}318.24$, $SSB = 9656.40$, $SSC = 2270.22$, $SSAB = 3408.93$, $SSAC = 1442.58$, $SSBC = 3096.21$, $SSABC = 2832.72$, and $SSE = 8655.60$. Obtain the ANOVA table and carry out all tests using level .01.

33. Because of potential variability in aging due to different castings and segments on the castings, a Latin square design with $N = 7$ was used to investigate the effect of heat treatment on aging. With A = castings, B = segments, and C = heat treatments, summary statistics include $x_{\cdot\cdot\cdot} = 3815.8$, $\sum x_{i\cdot\cdot}^2 = 297{,}216.90$, $\sum x_{\cdot j\cdot}^2 = 297{,}200.64$, $\sum x_{\cdot\cdot k}^2 = 297{,}155.01$, and $\sum\sum x_{ij(k)}^2 = 297{,}317.65$. Obtain the ANOVA table and test at level .05 the hypothesis that heat treatment has no effect on aging.

34. The article “The Responsiveness of Food Sales to Shelf Space Requirements” (J. Marketing Research, 1964: 63–67) reports the use of a Latin square design to investigate the effect of shelf space on food sales. The experiment was carried out over a 6-week period using six different stores, resulting in the following data on sales of powdered coffee cream (with shelf space index in parentheses):

                                    Week
            1         2         3         4         5         6

        1   27 (5)    14 (4)    18 (3)    35 (1)    28 (6)    22 (2)
        2   34 (6)    31 (5)    34 (4)    46 (3)    37 (2)    23 (1)
Store   3   39 (2)    67 (6)    31 (5)    49 (4)    38 (1)    48 (3)
        4   40 (3)    57 (1)    39 (2)    70 (6)    37 (4)    50 (5)
        5   15 (4)    15 (3)    11 (1)     9 (2)    18 (5)    17 (6)
        6   16 (1)    15 (2)    14 (6)    12 (5)    19 (3)    22 (4)

Construct the ANOVA table, and state and test at level .01 the hypothesis that shelf space does not affect sales against the appropriate alternative.

35. The article “Variation in Moisture and Ascorbic Acid Content from Leaf to Leaf and Plant to Plant in Turnip Greens” (Southern Cooperative Services Bull., 1951: 13–17) uses a Latin square design in which factor A is plant, factor B is leaf size (smallest to largest), factor C (in parentheses) is time of weighing, and the response variable is moisture content.

                                  Leaf Size (B)
                1           2           3           4           5

            1   6.67 (5)    7.15 (4)    8.29 (1)    8.95 (3)    9.62 (2)
            2   5.40 (2)    4.77 (5)    5.40 (4)    7.54 (1)    6.93 (3)
Plant (A)   3   7.32 (3)    8.53 (2)    8.50 (5)    9.99 (4)    9.68 (1)
            4   4.92 (1)    5.00 (3)    7.29 (2)    7.85 (5)    7.08 (4)
            5   4.88 (4)    6.16 (1)    7.83 (3)    5.83 (2)    8.51 (5)

When all three factors are random, the expected mean squares are $E(MSA) = \sigma^2 + N\sigma_A^2$, $E(MSB) = \sigma^2 + N\sigma_B^2$, $E(MSC) = \sigma^2 + N\sigma_C^2$, and $E(MSE) = \sigma^2$. This implies that the F ratios for testing $H_{0A}$: $\sigma_A^2 = 0$, $H_{0B}$: $\sigma_B^2 = 0$, and $H_{0C}$: $\sigma_C^2 = 0$ are identical to those for fixed effects. Obtain the ANOVA table and test at level .05 to see whether there is any variation in moisture content due to the factors.

36. The article “An Assessment of the Effects of Treatment, Time, and Heat on the Removal of Erasable Pen Marks from Cotton and Cotton/Polyester Blend Fabrics” (J. of Testing and Eval., 1991: 394–397) reports the following sums of squares for the response variable degree of removal of marks: $SSA = 39.171$, $SSB = .665$, $SSC = 21.508$, $SSAB = 1.432$, $SSAC = 15.953$, $SSBC = 1.382$, $SSABC = 9.016$, and $SSE = 115.820$. Four different laundry treatments, three different types of pen, and six different fabrics were used in the experiment, and there were three observations for each treatment-pen-fabric combination. Perform an analysis of variance using $\alpha = .01$ for each test, and state your conclusions (assume fixed effects for all three factors).

37. A four-factor ANOVA experiment was carried out to investigate the effects of fabric (A), type of exposure (B), level of exposure (C), and fabric direction (D) on extent of color change in exposed fabric as measured by a spectrocolorimeter. Two observations were made for each of the three fabrics, two types, three levels, and two directions, resulting in $MSA = 2207.329$, $MSB = 47.255$, $MSC = 491.783$, $MSD = .044$, $MSAB = 15.303$, $MSAC = 275.446$, $MSAD = .470$, $MSBC = 2.141$, $MSBD = .273$, $MSCD = .247$, $MSABC = 3.714$, $MSABD = 4.072$, $MSACD = .767$, $MSBCD = .280$, $MSE = .977$, and $MST = 93.621$ (“Accelerated Weathering of Marine Fabrics,” J. Testing and Eval., 1992: 139–143). Assuming fixed effects for all factors, carry out an analysis of variance using $\alpha = .01$ for all tests and summarize your conclusions.


11.4 2p Factorial Experiments If an experimenter wishes to study simultaneously the effect of p different factors on a response variable and the factors have levels, respectively, then a com- plete experiment requires at least observations. In such situations, the experimenter can often perform a “screening experiment” with each factor at only two levels to obtain preliminary information about factor effects. An experiment in which there are p factors, each at two levels, is referred to as a 2p factorial experiment.

23 Experiments As in Section 11.3, we let and refer to the observation from the lth repli- cation, with factors A, B, and C at levels i, j, and k, respectively. The model for this situation is

(11.14)

for . The ’s are assumed independ- ent, normally distributed, with mean 0 and variance s2. Because there are only two lev- els of each factor, the side conditions on the parameters of (11.14) that uniquely specify the model are simply stated:

, and the like. Here the model is

$$X_{ijkl} = \mu + \alpha_i + \beta_j + \delta_k + \gamma_{ij}^{AB} + \gamma_{ik}^{AC} + \gamma_{jk}^{BC} + \gamma_{ijk} + \epsilon_{ijkl}, \qquad i = 1, 2;\ j = 1, 2;\ k = 1, 2;\ l = 1, \ldots, n$$

and the side conditions in question are

$$\alpha_1 + \alpha_2 = 0,\ \ldots,\ \gamma_{11}^{AB} + \gamma_{21}^{AB} = 0,\quad \gamma_{12}^{AB} + \gamma_{22}^{AB} = 0,\quad \gamma_{11}^{AB} + \gamma_{12}^{AB} = 0,\quad \gamma_{21}^{AB} + \gamma_{22}^{AB} = 0$$

These conditions imply that there is only one functionally independent parameter of each type (for each main effect and interaction). For example, $\alpha_2 = -\alpha_1$, whereas $\gamma_{12}^{AB} = -\gamma_{11}^{AB}$, $\gamma_{21}^{AB} = -\gamma_{11}^{AB}$, and $\gamma_{22}^{AB} = \gamma_{11}^{AB}$. Because of this, each sum of squares in the analysis will have 1 df.

The parameters of the model can be estimated by taking averages over various subscripts of the $X_{ijkl}$'s and then forming appropriate linear combinations of the averages. For example,

$$\hat\alpha_1 = \bar X_{1\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot} = \frac{X_{111\cdot} + X_{121\cdot} + X_{112\cdot} + X_{122\cdot} - X_{211\cdot} - X_{212\cdot} - X_{221\cdot} - X_{222\cdot}}{8n}$$

and

$$\hat\gamma_{11}^{AB} = \bar X_{11\cdot\cdot} - \bar X_{1\cdot\cdot\cdot} - \bar X_{\cdot 1\cdot\cdot} + \bar X_{\cdot\cdot\cdot\cdot} = \frac{X_{111\cdot} - X_{121\cdot} - X_{211\cdot} + X_{221\cdot} + X_{112\cdot} - X_{122\cdot} - X_{212\cdot} + X_{222\cdot}}{8n}$$

Each estimator is, except for the factor $1/(8n)$, a linear function of the cell totals ($X_{ijk\cdot}$'s) in which each coefficient is $+1$ or $-1$, with an equal number of each; such functions are called contrasts in the $X_{ijk\cdot}$'s. Furthermore, the estimators satisfy the same side conditions satisfied by the parameters themselves. For example,

$$\hat\alpha_1 + \hat\alpha_2 = \bar X_{1\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot} + \bar X_{2\cdot\cdot\cdot} - \bar X_{\cdot\cdot\cdot\cdot} = \bar X_{1\cdot\cdot\cdot} + \bar X_{2\cdot\cdot\cdot} - 2\bar X_{\cdot\cdot\cdot\cdot}$$

$$= \frac{1}{4n}X_{1\cdot\cdot\cdot} + \frac{1}{4n}X_{2\cdot\cdot\cdot} - \frac{2}{8n}X_{\cdot\cdot\cdot\cdot} = \frac{1}{4n}X_{\cdot\cdot\cdot\cdot} - \frac{1}{4n}X_{\cdot\cdot\cdot\cdot} = 0$$

Example 11.12

In an experiment to investigate the compressive strength properties of cement–soil mixtures, two different aging periods were used in combination with two different temperatures and two different soils. Two replications were made for each combination of levels of the three factors, resulting in the following data:

                           Soil
Age   Temperature      1            2
 1         1        471, 413     385, 434
 1         2        485, 552     530, 593
 2         1        712, 637     770, 705
 2         2        712, 789     741, 806

The computed cell totals are $x_{111\cdot} = 884$, $x_{211\cdot} = 1349$, $x_{121\cdot} = 1037$, $x_{221\cdot} = 1501$, $x_{112\cdot} = 819$, $x_{212\cdot} = 1475$, $x_{122\cdot} = 1123$, and $x_{222\cdot} = 1547$, so $x_{\cdot\cdot\cdot\cdot} = 9735$. Then

$$\hat\alpha_1 = (884 - 1349 + 1037 - 1501 + 819 - 1475 + 1123 - 1547)/16 = -125.5625 = -\hat\alpha_2$$

$$\hat\gamma_{11}^{AB} = (884 - 1349 - 1037 + 1501 + 819 - 1475 - 1123 + 1547)/16 = -14.5625 = -\hat\gamma_{12}^{AB} = -\hat\gamma_{21}^{AB} = \hat\gamma_{22}^{AB}$$

The other parameter estimates can be computed in the same manner. ■

Analysis of a $2^3$ Experiment

Sums of squares for the various effects are easily obtained from the parameter estimates. For example,

$$\text{SSA} = \sum_i \sum_j \sum_k \sum_l \hat\alpha_i^2 = 4n\sum_{i=1}^{2} \hat\alpha_i^2 = 4n[\hat\alpha_1^2 + (-\hat\alpha_1)^2] = 8n\hat\alpha_1^2$$

and

$$\text{SSAB} = \sum_i \sum_j \sum_k \sum_l (\hat\gamma_{ij}^{AB})^2 = 2n\sum_{i=1}^{2}\sum_{j=1}^{2} (\hat\gamma_{ij}^{AB})^2 = 2n[(\hat\gamma_{11}^{AB})^2 + (-\hat\gamma_{11}^{AB})^2 + (-\hat\gamma_{11}^{AB})^2 + (\hat\gamma_{11}^{AB})^2] = 8n(\hat\gamma_{11}^{AB})^2$$

Since each estimate is a contrast in the cell totals multiplied by $1/(8n)$, each sum of squares has the form $(\text{contrast})^2/(8n)$. Thus to compute the various sums of squares, we need to know the coefficients ($+1$ or $-1$) of the appropriate contrasts. The signs ($+$ or $-$) on each $x_{ijk\cdot}$ in each effect contrast are most conveniently displayed in a table. We will use the notation (1) for the experimental condition $i = 1, j = 1, k = 1$, a for $i = 2, j = 1, k = 1$, ab for $i = 2, j = 2, k = 1$, and so on. If level 1 is thought of as “low” and level 2 as “high,” any letter that appears denotes a high level of the associated factor. Each column in Table 11.10 gives the signs for a particular effect contrast in the $x_{ijk\cdot}$'s associated with the different experimental conditions.
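The estimates in Example 11.12 and the identities SSA = 8n·α̂₁² and SSAB = 8n·(γ̂₁₁^AB)² can be checked numerically. The following is a minimal Python sketch, not part of the original text; the helper name `mean_over` and the dictionary layout are illustrative assumptions.

```python
# Raw data of Example 11.12, keyed by (i, j, k) = (age, temperature, soil) levels.
data = {
    (1, 1, 1): [471, 413], (1, 1, 2): [385, 434],
    (1, 2, 1): [485, 552], (1, 2, 2): [530, 593],
    (2, 1, 1): [712, 637], (2, 1, 2): [770, 705],
    (2, 2, 1): [712, 789], (2, 2, 2): [741, 806],
}
n = 2  # replications per cell


def mean_over(i=None, j=None):
    """Average of all observations with A at level i and B at level j.

    Passing None for a level means "average over that subscript".
    (Hypothetical helper, named here only for illustration.)
    """
    values = [x for (a, b, _), obs in data.items()
              if (i is None or a == i) and (j is None or b == j)
              for x in obs]
    return sum(values) / len(values)


grand = mean_over()

# alpha-hat_i = Xbar_i... - Xbar....
alpha_hat = {i: mean_over(i=i) - grand for i in (1, 2)}

# gamma-hat_ij^AB = Xbar_ij.. - Xbar_i... - Xbar_.j.. + Xbar....
gamma_ab_hat = {(i, j): mean_over(i=i, j=j) - mean_over(i=i) - mean_over(j=j) + grand
                for i in (1, 2) for j in (1, 2)}

# Each sum of squares is 8n times the squared estimate.
ss_a = 8 * n * alpha_hat[1] ** 2
ss_ab = 8 * n * gamma_ab_hat[(1, 1)] ** 2

print(alpha_hat[1], gamma_ab_hat[(1, 1)], ss_a, ss_ab)
```

Working from the averages rather than from the cell totals makes the side conditions visible directly: `alpha_hat[1] + alpha_hat[2]` and each row or column sum of `gamma_ab_hat` come out as zero.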


Table 11.10 Signs for Computing Effect Contrasts

Experimental   Cell                  Factorial Effect
Condition      Total      A    B    C    AB   AC   BC   ABC
(1)            x111·      -    -    -    +    +    +    -
a              x211·      +    -    -    -    -    +    +
b              x121·      -    +    -    -    +    -    +
ab             x221·      +    +    -    +    -    -    -
c              x112·      -    -    +    +    -    -    +
ac             x212·      +    -    +    -    +    -    -
bc             x122·      -    +    +    -    -    +    -
abc            x222·      +    +    +    +    +    +    +

In each of the first three columns, the sign is $+$ if the corresponding factor is at the high level and $-$ if it is at the low level. Every sign in the AB column is then the “product” of the signs in the A and B columns, with $(+)(+) = (-)(-) = +$ and $(+)(-) = (-)(+) = -$, and similarly for the AC and BC columns. Finally, the signs in the ABC column are the products of AB with C (or B with AC or A with BC). Thus, for example,

$$\text{AC contrast} = + x_{111\cdot} - x_{211\cdot} + x_{121\cdot} - x_{221\cdot} - x_{112\cdot} + x_{212\cdot} - x_{122\cdot} + x_{222\cdot}$$

Once the seven effect contrasts are computed,

$$\text{SS(effect)} = \frac{(\text{effect contrast})^2}{8n}$$

Software for doing the calculations required to analyze data from factorial experiments is widely available (e.g., Minitab). Alternatively, here is an efficient method for hand computation due to Yates. Write in a column the eight cell totals in the standard order, as given in the table of signs, and establish three additional columns. In each of these three columns, the first four entries are the sums of entries 1 and 2, 3 and 4, 5 and 6, and 7 and 8 of the previous column. The last four entries are the differences between entries 2 and 1, 4 and 3, 6 and 5, and 8 and 7 of the previous column. The last column then contains $x_{\cdot\cdot\cdot\cdot}$ and the seven effect contrasts in standard order. Squaring each contrast and dividing by $8n$ then gives the seven sums of squares.
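The sign rules behind Table 11.10 can be sketched in a few lines of Python. This is an illustrative sketch rather than anything from the text: the function names `label` and `sign` are assumptions, and the cell totals are those of Example 11.12.

```python
from itertools import product
from math import prod

factors = "ABC"

# Standard order: (1), a, b, ab, c, ac, bc, abc (the first factor changes fastest).
# Each condition is a tuple of 0/1 flags, one per factor, with 1 meaning "high".
conditions = [levels[::-1] for levels in product((0, 1), repeat=len(factors))]


def label(levels):
    """Lowercase-letter name of a condition, e.g. (1, 0, 1) -> 'ac'."""
    name = "".join(f.lower() for f, high in zip(factors, levels) if high)
    return name or "(1)"


def sign(effect, levels):
    """Sign of a condition in an effect contrast: the product of the
    +1/-1 signs of the factors named in the effect."""
    return prod(1 if levels[factors.index(f)] else -1 for f in effect)


effects = ["A", "B", "C", "AB", "AC", "BC", "ABC"]
cell_totals = [884, 1349, 1037, 1501, 819, 1475, 1123, 1547]  # Example 11.12
totals = dict(zip(map(label, conditions), cell_totals))
n = 2  # replications per cell

contrast = {e: sum(sign(e, lv) * totals[label(lv)] for lv in conditions)
            for e in effects}
ss = {e: c ** 2 / (8 * n) for e, c in contrast.items()}

for e in effects:
    row = " ".join("+" if sign(e, lv) > 0 else "-" for lv in conditions)
    print(f"{e:>3}: {row}   contrast = {contrast[e]:6d}   SS = {ss[e]:12.4f}")
```

Each printed row is one column of Table 11.10 read from top to bottom. Note that the A contrast computed this way is "high minus low", so its sign is opposite to the numerator of α̂₁ in Example 11.12; the sum of squares is unaffected.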

Example 11.13 (Example 11.12 continued)

Since $n = 2$, $8n = 16$. Yates’s method is illustrated in Table 11.11.

Table 11.11 Yates’s Method of Computation

Treatment
Condition        x_ijk·       1       2    Effect Contrast    SS = (contrast)²/16
(1) = x111·        884     2233    4771         9735
a   = x211·       1349     2538    4964         2009              252,255.06
b   = x121·       1037     2294     929          681               28,985.06
ab  = x221·       1501     2670    1080         -233                3,393.06
c   = x112·        819      465     305          193                2,328.06
ac  = x212·       1475      464     376          151                1,425.06
bc  = x122·       1123      656      -1           71                  315.06
abc = x222·       1547      424    -232         -231                3,335.06
                                                                  292,036.42

From the original data, $\sum_i \sum_j \sum_k \sum_l x_{ijkl}^2 = 6{,}232{,}289$, and

$$\frac{x_{\cdot\cdot\cdot\cdot}^2}{16} = 5{,}923{,}139.06$$

so

$$\text{SST} = 6{,}232{,}289 - 5{,}923{,}139.06 = 309{,}149.94$$

$$\text{SSE} = \text{SST} - [\text{SSA} + \cdots + \text{SSABC}] = 309{,}149.94 - 292{,}036.42 = 17{,}113.52$$

The ANOVA calculations are summarized in Table 11.12.

Table 11.12 ANOVA Table for Example 11.13

Source of Variation    df    Sum of Squares    Mean Square        f
A                       1      252,255.06       252,255.06     117.92
B                       1       28,985.06        28,985.06      13.55
C                       1        2,328.06         2,328.06       1.09
AB                      1        3,393.06         3,393.06       1.59
AC                      1        1,425.06         1,425.06        .67
BC                      1          315.06           315.06        .15
ABC                     1        3,335.06         3,335.06       1.56
Error                   8       17,113.52         2,139.19
Total                  15      309,149.94

Figure 11.10 shows SAS output for this example. Only the P-values for age (A) and temperature (B) are less than .01, so only these effects are judged significant.

Analysis of Variance Procedure
Dependent Variable: STRENGTH

                               Sum of           Mean
Source             DF         Squares         Square    F Value    Pr > F
Model               7     292036.4375     41719.4911      19.50    0.0002
Error               8      17113.5000      2139.1875
Corrected Total    15     309149.9375

R-Square         C.V.       Root MSE     POWERUSE Mean
0.944643     7.601660       46.25135        608.437500

Source             DF        Anova SS    Mean Square    F Value    Pr > F
AGE                 1     252255.0625    252255.0625     117.92    0.0001
TEMP                1      28985.0625     28985.0625      13.55    0.0062
AGE*TEMP            1       3393.0625      3393.0625       1.59    0.2434
SOIL                1       2328.0625      2328.0625       1.09    0.3273
AGE*SOIL            1       1425.0625      1425.0625       0.67    0.4380
TEMP*SOIL           1        315.0625       315.0625       0.15    0.7111
AGE*TEMP*SOIL       1       3335.0625      3335.0625       1.56    0.2471

Figure 11.10 SAS output for strength data of Example 11.13 ■

$2^p$ Experiments for $p > 3$

The analysis of data from a $2^p$ experiment with $p > 3$ parallels that of the three-factor case. For example, if there are four factors A, B, C, and D, there are 16 different experimental conditions. The first 8 in standard order are exactly those already listed for a three-factor experiment. The second 8 are obtained by placing the letter d beside each condition in the first group. Yates’s method is then initiated by computing totals across replications, listing these totals in standard order, and proceeding as before; with p factors, the pth column to the right of the treatment totals will give the effect contrasts.
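Yates’s method as described above is straightforward to code for any number of factors. The sketch below is one possible Python rendering, with function names (`yates`, `effect_names`) chosen here for illustration; it is checked against the replicated $2^3$ data of Examples 11.12 and 11.13.

```python
def yates(totals):
    """Yates's method: return [grand total, contrasts in standard order].

    `totals` must hold the 2**p treatment totals in standard order.
    """
    p = len(totals).bit_length() - 1
    if len(totals) != 2 ** p:
        raise ValueError("the number of totals must be a power of 2")
    col = list(totals)
    half = len(col) // 2
    for _ in range(p):  # one new column per factor
        sums = [col[2 * m] + col[2 * m + 1] for m in range(half)]
        diffs = [col[2 * m + 1] - col[2 * m] for m in range(half)]
        col = sums + diffs
    return col


def effect_names(factors):
    """Effects in standard order, e.g. 'ABC' -> A, B, AB, C, AC, BC, ABC."""
    names = [""]
    for f in factors:
        names += [name + f for name in names]
    return names[1:]


# Observations of Example 11.12 in standard order: (1), a, b, ab, c, ac, bc, abc.
observations = [[471, 413], [712, 637], [485, 552], [712, 789],
                [385, 434], [770, 705], [530, 593], [741, 806]]
n = 2
totals = [sum(cell) for cell in observations]
N = n * len(totals)  # total number of observations (8n when p = 3)

grand, *contrasts = yates(totals)
ss = {name: c ** 2 / N for name, c in zip(effect_names("ABC"), contrasts)}

sst = sum(x ** 2 for cell in observations for x in cell) - grand ** 2 / N
sse = sst - sum(ss.values())
mse = sse / (N - len(totals))  # error df = 2**p * (n - 1)
f_ratio = {name: value / mse for name, value in ss.items()}

for name in ss:
    print(f"{name:>3}  SS = {ss[name]:12.4f}  f = {f_ratio[name]:7.2f}")
print(f"SSE = {sse:.4f}, SST = {sst:.4f}")
```

The unrounded values agree with the SAS output in Figure 11.10; Table 11.12 differs in the second decimal of SSE only because the hand calculation sums rounded sums of squares.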

For $p > 3$, there will often be no replications of the experiment (so only one complete replicate is available). One possible way to test hypotheses is to assume that certain higher-order effects are absent and then add the corresponding sums of squares to obtain an SSE. Such an assumption can, however, be misleading in the absence of prior knowledge (see the book by Montgomery listed in the chapter bibliography). An alternative approach involves working directly with the effect contrasts. Each contrast has a normal distribution with the same variance. When a particular effect is absent, the expected value of the corresponding contrast is 0, but this is not so when the effect is present. The suggested method of analysis is to construct a normal probability plot of the effect contrasts (or, equivalently, the effect parameter estimates, since estimate $= \text{contrast}/2^p$ when $n = 1$). Points corresponding to absent effects will tend to fall close to a straight line, whereas points associated with substantial effects will typically be far from this line.

Example 11.14

The accompanying data is from the article “Quick and Easy Analysis of Unreplicated Factorials” (Technometrics, 1989: 469–473). The four factors are A = acid strength, B = time, C = amount of acid, and D = temperature, and the response variable is the yield of isatin. The observations, in standard order, are .08, .04, .53, .43, .31, .09, .12, .36, .79, .68, .73, .08, .77, .38, .49, and .23. Table 11.13 displays the effect estimates as given in the article (which uses contrast/8 rather than contrast/16).

Table 11.13 Effect Estimates for Example 11.14

Effect      A       B       AB      C       AC      BC      ABC     D
Estimate   -.191   -.021   -.001   -.076    .034   -.066    .149    .274

Effect      AD      BD      ABD     CD      ACD     BCD     ABCD
Estimate   -.161   -.251   -.101   -.026   -.066    .124    .019

Figure 11.11 is a normal probability plot of the effect estimates. All points in the plot fall close to the same straight line, suggesting the complete absence of any effects (we will shortly give an example in which this is not the case).

[Figure 11.11: plot of effect estimate (vertical axis, -0.3 to 0.3) against z percentile (horizontal axis, -2 to 2)]

Figure 11.11 A normal probability plot of effect estimates from Example 11.14 ■
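The coordinates of a plot like Figure 11.11 can be produced from the listed observations. The Python sketch below is an illustration under stated assumptions: the plotting positions $(i - .5)/m$ for the $m = 15$ ordered estimates are a common convention assumed here, not something Example 11.14 specifies, and the function names are hypothetical.

```python
from statistics import NormalDist


def yates(values):
    """Yates's method: [grand total, contrasts in standard order]."""
    col = list(values)
    p = len(col).bit_length() - 1
    half = len(col) // 2
    for _ in range(p):
        col = ([col[2 * m] + col[2 * m + 1] for m in range(half)] +
               [col[2 * m + 1] - col[2 * m] for m in range(half)])
    return col


def effect_names(factors):
    """Effects in standard order: A, B, AB, C, AC, ..."""
    names = [""]
    for f in factors:
        names += [name + f for name in names]
    return names[1:]


# Observations of Example 11.14 in standard order.
y = [.08, .04, .53, .43, .31, .09, .12, .36,
     .79, .68, .73, .08, .77, .38, .49, .23]

_, *contrasts = yates(y)
# The article's convention: estimate = contrast/8.
estimates = {name: c / 8 for name, c in zip(effect_names("ABCD"), contrasts)}

# Pair each ordered estimate with a standard normal quantile.
# ASSUMPTION: plotting positions (i - 0.5)/m.
ranked = sorted(estimates.items(), key=lambda item: item[1])
m = len(ranked)
points = [(NormalDist().inv_cdf((rank - 0.5) / m), name, est)
          for rank, (name, est) in enumerate(ranked, start=1)]

for z, name, est in points:
    print(f"{name:>4}  z = {z:6.3f}  estimate = {est:7.4f}")
```

For the observations as transcribed here, most estimates agree with Table 11.13 to three decimals; any entry that does not (the ACD value appears to be one) is worth checking against the article before drawing conclusions.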


Visual judgments of deviation from straightness in a normal probability plot are rather subjective. The article cited in Example 11.14 describes a more objective technique for identifying significant effects in an unreplicated experiment.

Confounding

It is often not possible to carry out all $2^p$ experimental conditions of a $2^p$ factorial experiment in a homogeneous experimental environment. In such situations, it may be possible to separate the experimental conditions into $2^r$ homogeneous blocks ($r < p$), so that there are $2^{p-r}$ experimental conditions in each block. The blocks may, for example, correspond to different laboratories, different time periods, or different operators or work crews. In the simplest case, $p = 3$ and $r = 1$, so that there are two blocks, with each block consisting of four of the eight experimental conditions.

As always, blocking is effective in reducing variation associated with extraneous sources. However, when the $2^p$ experimental conditions are placed in $2^r$ blocks, the price paid for this blocking is that $2^r - 1$ of the factor effects cannot be estimated. This is because $2^r - 1$ factor effects (main effects and/or interactions) are mixed up, or confounded, with the block effects. The allocation of experimental conditions to blocks is then usually done so that only higher-level interactions are confounded, whereas main effects and low-order interactions remain estimable and hypotheses can be tested.

To see how allocation to blocks is accomplished, consider first a $2^3$ experiment with two blocks ($r = 1$) and four treatments per block. Suppose we select ABC as the effect to be confounded with blocks. Then any experimental condition having an odd number of letters in common with ABC, such as b (one letter) or abc (three letters), is placed in one block, whereas any condition having an even number of letters in common with ABC (where 0 is even) goes in the other block. Figure 11.12 shows this allocation of treatments to the two blocks.

Block 1:  (1), ab, ac, bc
Block 2:  a, b, c, abc

Figure 11.12 Confounding ABC in a $2^3$ experiment

In the absence of replications, the data from such an experiment would usually be analyzed by assuming that there were no two-factor interactions (additivity) and using $\text{SSE} = \text{SSAB} + \text{SSAC} + \text{SSBC}$ with 3 df to test for the presence of main effects. Alternatively, a normal probability plot of effect contrasts or effect parameter estimates could be examined. Most frequently, though, there are replications when just three factors are being studied. Suppose there are u replicates, resulting in a total of $2^r \cdot u$ blocks in the experiment. Then after subtracting from SST all sums of squares associated with effects not confounded with blocks (computed using Yates’s method), the block sum of squares is computed using the $2^r \cdot u$ block totals and then subtracted to yield SSE (so there are $2^r \cdot u - 1$ df for blocks).
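The odd/even rule for allocating conditions to two blocks can be sketched as follows. This is a minimal Python illustration; the function names `conditions` and `block_of` are assumptions introduced here, with the empty string standing for condition (1).

```python
def conditions(factors):
    """All 2**p treatment conditions in standard order, as lowercase-letter
    strings; the empty string represents condition (1)."""
    names = [""]
    for f in factors.lower():
        names += [name + f for name in names]
    return names


def block_of(condition, confounded):
    """0 if the condition shares an even number of letters with the
    confounded effect (0 counts as even), 1 if it shares an odd number."""
    return len(set(condition) & set(confounded.lower())) % 2


blocks = {0: [], 1: []}
for cond in conditions("ABC"):
    blocks[block_of(cond, "ABC")].append(cond or "(1)")

print("even block:", blocks[0])
print("odd block: ", blocks[1])
```

Replacing `"ABC"` in the call to `block_of` with another effect, say `"AB"`, shows which conditions would share a block if that effect were sacrificed instead.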

Example 11.15

The article “Factorial Experiments in Pilot Plant Studies” (Industrial and Eng. Chemistry, 1951: 1300–1306) reports the results of an experiment to assess the effects of reactor temperature (A), gas throughput (B), and concentration of active constituent (C) on the strength of the product solution (measured in arbitrary units) in a recirculation unit. Two blocks were used, with the ABC effect confounded with blocks, and there were two replications, resulting in the data in Figure 11.13. The four block × replication totals are 288, 212, 88, and 220, with a grand total of 808, so

$$\text{SSBl} = \frac{(288)^2 + (212)^2 + (88)^2 + (220)^2}{4} - \frac{(808)^2}{16} = 5204.00$$

The other sums of squares are computed by Yates’s method using the eight experimental condition totals, resulting in the ANOVA table given as Table 11.14. By comparison with $F_{.05,1,6} = 5.99$, we conclude that only the main effects for A and C differ significantly from zero.

          Replication 1                    Replication 2
    Block 1       Block 2            Block 1       Block 2
    (1)   99      a     18           (1)   46      a     18
    ab    52      b     51           ab   -47      b     62
    ac    42      c    108           ac    22      c    104
    bc    95      abc   35           bc    67      abc   36

Figure 11.13 Data for Example 11.15

Table 11.14 ANOVA Table for Example 11.15

Source of Variation    df    Sum of Squares    Mean Square       f
A                       1       12,996           12,996        39.82
B                       1          702.25           702.25      2.15
C                       1        2,756.25         2,756.25      8.45
AB                      1          210.25           210.25       .64
AC                      1           30.25            30.25       .093
BC                      1           25               25          .077
Blocks                  3        5,204            1,734.67      5.32
Error                   6        1,958              326.33
Total                  15       23,882
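The calculations behind Table 11.14 can be retraced from the data in Figure 11.13. The Python sketch below is illustrative only; the dictionary layout and the `yates` helper are assumptions made for this example.

```python
def yates(values):
    """Yates's method: [grand total, contrasts in standard order]."""
    col = list(values)
    p = len(col).bit_length() - 1
    half = len(col) // 2
    for _ in range(p):
        col = ([col[2 * m] + col[2 * m + 1] for m in range(half)] +
               [col[2 * m + 1] - col[2 * m] for m in range(half)])
    return col


# (replication, block) -> {condition: observation}, from Figure 11.13.
data = {
    (1, 1): {"(1)": 99, "ab": 52, "ac": 42, "bc": 95},
    (1, 2): {"a": 18, "b": 51, "c": 108, "abc": 35},
    (2, 1): {"(1)": 46, "ab": -47, "ac": 22, "bc": 67},
    (2, 2): {"a": 18, "b": 62, "c": 104, "abc": 36},
}
standard_order = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
effect_order = ["A", "B", "AB", "C", "AC", "BC", "ABC"]

all_obs = [x for block in data.values() for x in block.values()]
N = len(all_obs)
correction = sum(all_obs) ** 2 / N

sst = sum(x * x for x in all_obs) - correction

# Block sum of squares from the 2**r * u = 4 block totals.
ss_blocks = sum(sum(b.values()) ** 2 / len(b) for b in data.values()) - correction

# Condition totals across replications, then Yates's method.
totals = [sum(block.get(c, 0) for block in data.values()) for c in standard_order]
_, *contrasts = yates(totals)

# ABC is confounded with blocks, so it gets no sum of squares of its own.
ss = {name: c ** 2 / N for name, c in zip(effect_order, contrasts) if name != "ABC"}

sse = sst - ss_blocks - sum(ss.values())
error_df = (N - 1) - (len(data) - 1) - len(ss)
mse = sse / error_df
f_ratio = {name: value / mse for name, value in ss.items()}

print(f"SSBl = {ss_blocks}, SSE = {sse}, error df = {error_df}")
for name in ss:
    print(f"{name:>3}  SS = {ss[name]:10.2f}  f = {f_ratio[name]:6.2f}")
```

Only A and C produce f ratios above the critical value 5.99 quoted in the example, matching the conclusion drawn from Table 11.14.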

Confounding Using More than Two Blocks

In the case $r = 2$ (four blocks), three effects are confounded with blocks. The experimenter first chooses two defining effects to be confounded. For example, in a five-factor experiment (A, B, C, D, and E), the two three-factor interactions BCD and CDE might be chosen for confounding. The third effect confounded is then the generalized interaction of the two, obtained by writing the two chosen effects side by side and then cancelling any letters common to both: $(BCD)(CDE) = BE$. Notice that if ABC and CDE are chosen for confounding, their generalized interaction is $(ABC)(CDE) = ABDE$, so that no main effects or two-factor interactions are confounded.

Once the two defining effects have been selected for confounding, one block consists of all treatment conditions having an even number of letters in common with both defining effects. The second block consists of all conditions having an even number of letters in common with the first defining contrast and an odd number of letters in common with the second contrast, and the third and fourth blocks consist of the “odd/even” and “odd/odd” conditions, respectively. In a five-factor experiment with defining effects ABC and CDE, this results in the allocation to blocks as shown in Figure 11.14 (with the number of letters in common with each defining contrast appearing beside each experimental condition).

Block 1          Block 2          Block 3          Block 4
(1)    (0, 0)    d      (0, 1)    a      (1, 0)    c      (1, 1)
ab     (2, 0)    e      (0, 1)    b      (1, 0)    ad     (1, 1)
de     (0, 2)    ac     (2, 1)    cd     (1, 2)    ae     (1, 1)
acd    (2, 2)    bc     (2, 1)    ce     (1, 2)    bd     (1, 1)
ace    (2, 2)    abd    (2, 1)    ade    (1, 2)    be     (1, 1)
bcd    (2, 2)    abe    (2, 1)    bde    (1, 2)    abc    (3, 1)
bce    (2, 2)    acde   (2, 3)    abcd   (3, 2)    cde    (1, 3)
abde   (2, 2)    bcde   (2, 3)    abce   (3, 2)    abcde  (3, 3)

Figure 11.14 Four blocks in a $2^5$ factorial experiment with defining effects ABC and CDE

The block containing (1) is called the principal block. Once it has been constructed, a second block can be obtained by selecting any experimental condition not in the principal block and obtaining its generalized interaction with every condition in the principal block. The other blocks are then constructed in the same way by first selecting a condition not in a block already constructed and finding generalized interactions with the principal block.
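The principal-block construction can be sketched in Python using set symmetric difference for the generalized interaction. The function names below are illustrative assumptions, and conditions are written as lowercase strings with the empty string standing for (1).

```python
def generalized_interaction(u, v):
    """Letters appearing in exactly one of u and v, in alphabetical order
    (write the two side by side and cancel the common letters)."""
    return "".join(sorted(set(u) ^ set(v)))


def conditions(factors):
    """All 2**p conditions in standard order; '' stands for (1)."""
    names = [""]
    for f in factors:
        names += [name + f for name in names]
    return names


defining = ("abc", "cde")                   # defining effects ABC and CDE
third = generalized_interaction(*defining)  # the third confounded effect

all_conditions = conditions("abcde")

# Principal block: an even number of letters in common with both defining effects.
principal = [c for c in all_conditions
             if all(len(set(c) & set(d)) % 2 == 0 for d in defining)]

# Each further block: pick an unused condition and take its generalized
# interaction with every member of the principal block.
blocks = [principal]
assigned = set(principal)
for c in all_conditions:
    if c not in assigned:
        new_block = sorted((generalized_interaction(c, q) for q in principal),
                           key=lambda s: (len(s), s))
        blocks.append(new_block)
        assigned.update(new_block)

print("third confounded effect:", third.upper())
for number, block in enumerate(blocks, start=1):
    print(number, [c or "(1)" for c in block])
```

The four blocks contain the same conditions as those in Figure 11.14, although the order in which the non-principal blocks are generated depends on which unused condition is picked first, so the numbering may differ from the figure.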

For experimental situations with $p > 3$, there is often no replication, so sums of squares associated with nonconfounded higher-order interactions are usually pooled to obtain an error sum of squares that can be used in the denominators of the various F statistics. All computations can again be carried out using Yates’s technique, with SSBl being the sum of sums of squares associated with confounded effects.

When $r > 2$, one first selects r defining effects to be confounded with blocks, making sure that no one of the effects chosen is the generalized interaction of any other two selected. The additional $2^r - r - 1$ effects confounded with the blocks are then the generalized interactions of all effects in the defining set (including not only generalized interactions of pairs of effects but also of sets of three, four, and so on).

Fractional Replication

When the number of factors p is large, even a single replicate of a $2^p$ experiment can be expensive and time consuming. For example, one replicate of a $2^6$ factorial experiment involves an observation for each of the 64 different experimental conditions. An appealing strategy in such situations is to make observations for only a fraction of the $2^p$ conditions. Provided that care is exercised in the choice of conditions to be observed, much information about factor effects can still be obtained.

Suppose we decide to include only $2^{p-1}$ (half) of the $2^p$ possible conditions in our experiment; this is usually called a half-replicate. The price paid for this economy is twofold. First, information about a single effect (determined by the $2^{p-1}$ conditions selected for observation) is completely lost to the experimenter in the sense that no reasonable estimate of the effect is possible. Second, the remaining $2^p - 2$ main effects and interactions are paired up so that any one effect in a particular pair is confounded with the other effect in the same pair. For example, one such pair may be {A, BCD}, so that separate estimates of the A main effect and BCD interaction are not possible. It is desirable, then, to select a half-replicate for which main effects and low-order interactions are paired off (confounded) only with higher-order interactions rather than with one another.


The first step in specifying a half-replicate is to select a defining effect as the nonestimable effect. Suppose that in a five-factor experiment, ABCDE is chosen as the defining effect. Now the $2^5 = 32$ possible treatment conditions are divided into two groups with 16 conditions each, one group consisting of all conditions having an odd number of letters in common with ABCDE and the other containing an even number of letters in common with the defining contrast. Then either group of 16 conditions is used as the half-replicate. The “odd” group is

a, b, c, d, e, abc, abd, abe, acd, ace, ade, bcd, bce, bde, cde, abcde

Each main effect and interaction other than ABCDE is then confounded with (aliased with) its generalized interaction with ABCDE. Thus $(AB)(ABCDE) = CDE$, so the AB interaction and CDE interaction are confounded with each other. The resulting alias pairs are

{A, BCDE}   {B, ACDE}   {C, ABDE}   {D, ABCE}   {E, ABCD}
{AB, CDE}   {AC, BDE}   {AD, BCE}   {AE, BCD}   {BC, ADE}
{BD, ACE}   {BE, ACD}   {CD, ABE}   {CE, ABD}   {DE, ABC}

Note in particular that every main effect is aliased with a four-factor interaction. Assuming these interactions to be negligible allows us to test for the presence of main effects.

To specify a quarter-replicate of a $2^p$ factorial experiment ($2^{p-2}$ of the $2^p$ possible treatment conditions), two defining effects must be selected. These two and their generalized interaction become the nonestimable effects. Instead of alias pairs as in the half-replicate, each remaining effect is now confounded with three other effects, each being its generalized interaction with one of the three nonestimable effects.

Example 11.16

The article “More on Planning Experiments to Increase Research Efficiency” (Industrial and Eng. Chemistry, 1970: 60–65) reports on the results of a quarter-replicate of a $2^5$ experiment in which the five factors were A = condensation temperature, B = amount of material B, C = solvent volume, D = condensation time, and E = amount of material E. The response variable was the yield of the chemical process. The chosen defining contrasts were ACE and BDE, with generalized interaction $(ACE)(BDE) = ABCD$. The remaining 28 main effects and interactions can now be partitioned into seven groups of four effects each, such that the effects within a group cannot be assessed separately. For example, the generalized interactions of A with the nonestimable effects are $(A)(ACE) = CE$, $(A)(BDE) = ABDE$, and $(A)(ABCD) = BCD$, so one alias group is {A, CE, ABDE, BCD}. The complete set of alias groups is

{A, CE, ABDE, BCD}    {B, ABCE, DE, ACD}    {C, AE, BCDE, ABD}    {D, ACDE, BE, ABC}
{E, AC, BD, ABCDE}    {AB, BCE, ADE, CD}    {AD, CDE, ABE, BC}
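Both the alias pairs of the half-replicate and the alias groups of Example 11.16 follow mechanically from the defining effects. The Python sketch below shows one way to enumerate them; the function names (`gi`, `nonestimable`, `alias_groups`) are assumptions made for this illustration.

```python
from itertools import combinations


def gi(u, v):
    """Generalized interaction: letters in exactly one of u and v."""
    return "".join(sorted(set(u) ^ set(v)))


def nonestimable(defining):
    """The defining effects together with all their generalized interactions."""
    found = {""}
    for d in defining:
        found |= {gi(d, f) for f in found}
    found.discard("")
    return found


def alias_groups(factors, defining):
    """Partition the estimable effects into alias groups."""
    lost = nonestimable(defining)
    effects = ["".join(c) for size in range(1, len(factors) + 1)
               for c in combinations(factors, size)]
    groups, seen = [], set(lost)
    for e in effects:
        if e not in seen:
            group = [e] + sorted((gi(e, d) for d in lost),
                                 key=lambda s: (len(s), s))
            groups.append(group)
            seen.update(group)
    return groups


half = alias_groups("ABCDE", ["ABCDE"])          # half-replicate
quarter = alias_groups("ABCDE", ["ACE", "BDE"])  # Example 11.16

print(sorted(nonestimable(["ACE", "BDE"])))
for group in quarter:
    print("{" + ", ".join(group) + "}")
```

Within each group the effects are listed shortest first, so the order inside a group may differ from the text even though the groups themselves are the same.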

Once the defining contrasts have been chosen for a quarter-replicate, they are used as in the discussion of confounding to divide the $2^p$ treatment conditions into four groups of $2^{p-2}$ conditions each. Then any one of the four groups is selected as the set of conditions for which data will be collected. Similar comments apply to a $1/2^r$ replicate of a $2^p$ factorial experiment.

Having made observations for the selected treatment combinations, a table of signs similar to Table 11.10 is constructed. The table contains a row only for each of the treatment combinations actually observed rather than the full $2^p$ rows, and there is a single column for each alias group (since each effect in the group would have the same set of signs for the treatment conditions selected for observation). The signs in each column indicate as usual how contrasts for the various sums of squares are computed. Yates’s method can also be used, but the rule for arranging observed conditions in standard order must be modified.

The difficult part of a fractional replication analysis typically involves deciding what to use for error sum of squares. Since there will usually be no replication (though one could observe, e.g., two replicates of a quarter-replicate), some effect sums of squares must be pooled to obtain an error sum of squares. In a half-replicate of a $2^8$ experiment, for example, an alias structure can be chosen so that the eight main effects and 28 two-factor interactions are each confounded only with higher-order interactions and that there are an additional 27 alias groups involving only higher-order interactions. Assuming the absence of higher-order interaction effects, the resulting 27 sums of squares can then be added to yield an error sum of squares, allowing 1 df tests for all main effects and two-factor interactions. However, in many cases tests for main effects can be obtained only by pooling some or all of the sums of squares associated with alias groups involving two-factor interactions, and the corresponding two-factor interactions cannot be investigated.

Example 11.17 (Example 11.16 continued)

The set of treatment conditions chosen and resulting yields for the quarter-replicate of the $2^5$ experiment were

e      ab     ad     bc     cd     ace    bde    abcde
23.2   15.5   16.9   16.2   23.8   23.4   16.8   18.1

The abbreviated table of signs is displayed in Table 11.15. With SSA denoting the sum of squares for effects in the alias group {A, CE, ABDE, BCD},

$$\text{SSA} = \frac{(-23.2 + 15.5 + 16.9 - 16.2 - 23.8 + 23.4 - 16.8 + 18.1)^2}{8} = 4.65$$

Table 11.15 Table of Signs for Example 11.17

Condition    A    B    C    D    E    AB   AD
e            -    -    -    -    +    +    +
ab           +    +    -    -    -    +    -
ad           +    -    -    +    -    -    +
bc           -    +    +    -    -    -    +
cd           -    -    +    +    -    +    -
ace          +    -    +    -    +    -    -
bde          -    +    -    +    +    -    -
abcde        +    +    +    +    +    +    +

Similarly, SSB = 53.56, SSC = 10.35, SSD = .91, SSE′ = 10.35 (the ′ differentiates this quantity from error sum of squares SSE), SSAB = 6.66, and SSAD = 3.25, giving SST = 4.65 + 53.56 + ⋯ + 3.25 = 89.73. To test for main effects, we use SSE = SSAB + SSAD = 9.91 with 2 df. The ANOVA table is in Table 11.16. Since $F_{.05,1,2} = 18.51$, none of the five main effects can be judged significant.

Table 11.16 ANOVA Table for Example 11.17

Source    df    Sum of Squares    Mean Square       f
A          1         4.65             4.65         .94
B          1        53.56            53.56       10.80
C          1        10.35            10.35        2.09
D          1          .91              .91         .18
E          1        10.35            10.35        2.09
Error      2         9.91             4.96
Total      7        89.73

Of course, with only 2 df for error, the test is not very powerful (i.e., it is quite likely to fail to detect the presence of effects). The article from Industrial and Engineering Chemistry from which the data came actually has an independent estimate of the standard error of the treatment effects based on prior experience, so it used a somewhat different analysis. Our analysis was done here only for illustrative purposes, since one would ordinarily want many more than 2 df for error. ■

As an alternative to F tests based on pooling sums of squares to obtain SSE, a normal probability plot of effect contrasts can be examined.

Example 11.18

An experiment was carried out to investigate shrinkage in the plastic casing material used for speedometer cables (“An Explanation and Critique of Taguchi’s Contribution to Quality Engineering,” Quality and Reliability Engr. Intl., 1988: 123–131). The engineers started with 15 factors: liner outside diameter, liner die, liner material, liner line speed, wire braid type, braiding tension, wire diameter, liner tension, liner temperature, coating material, coating die type, melt temperature, screen pack, cooling method, and line speed. It was suspected that only a few of these factors were important, so a screening experiment in the form of a $2^{15-11}$ factorial (a $1/2^{11}$ fraction of a $2^{15}$ factorial experiment) was carried out. The resulting alias structure is quite complicated; in particular, every main effect is confounded with two-factor interactions. The response variable was the percentage of shrinkage for a cable specimen produced at designated levels of the factors.

Figure 11.15 displays a normal probability plot of the effect contrasts. All but two of the points fall quite close to a straight line. The discrepant points correspond to effects E = wire braid type and G = wire diameter, suggesting that these two factors are the only ones that affect the amount of shrinkage.

[Figure 11.15: plot of contrast (vertical axis) against z percentile (horizontal axis), with the two points off the line labeled G = wire diameter and E = wire-braid type]

Figure 11.15 Normal probability plot of contrasts from Example 11.18
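Returning to the quarter-replicate of Example 11.17, the sums of squares in Table 11.16 can be reproduced from the abbreviated table of signs. The Python sketch below is illustrative; the `sign` helper and the variable names are assumptions, and each column is labeled by the first effect in its alias group.

```python
from math import prod

# Observed conditions and yields from Example 11.17.
yields = {"e": 23.2, "ab": 15.5, "ad": 16.9, "bc": 16.2,
          "cd": 23.8, "ace": 23.4, "bde": 16.8, "abcde": 18.1}
columns = ["A", "B", "C", "D", "E", "AB", "AD"]  # one per alias group


def sign(effect, condition):
    """+1 or -1: the product of the signs of the factors named in the effect."""
    return prod(1 if f.lower() in condition else -1 for f in effect)


contrast = {col: sum(sign(col, c) * y for c, y in yields.items())
            for col in columns}

# With 8 observations in total, SS = (contrast)^2 / 8.
ss = {col: value ** 2 / len(yields) for col, value in contrast.items()}

# Pool the two interaction alias groups to form the error sum of squares (2 df).
sse = ss["AB"] + ss["AD"]
mse = sse / 2
f_ratio = {col: ss[col] / mse for col in "ABCDE"}

for col in columns:
    print(f"{col:>2}  contrast = {contrast[col]:6.1f}  SS = {ss[col]:6.2f}")
print(f"SSE = {sse:.2f}, largest f = {max(f_ratio.values()):.2f}")
```

Because this sketch carries unrounded sums of squares, its f ratios and total can differ from Table 11.16 in the last printed digit; the conclusion that no f ratio exceeds 18.51 is unchanged.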

The subjects of factorial experimentation, confounding, and fractional replication encompass many models and techniques we have not discussed. Please consult the chapter references for more information.


EXERCISES Section 11.4 (38–49)

38. The accompanying data resulted from an experiment to study the nature of dependence of welding current on three factors: welding voltage, wire feed speed, and tip-to-workpiece distance. There were two levels of each factor (a $2^3$ experiment) with two replications per combination of levels (the averages across replications agree with values given in the article “A Study on Prediction of Welding Current in Gas Metal Arc Welding,” J. Engr. Manuf., 1991: 64–69). The first two given numbers are for the treatment (1), the next two for a, and so on in standard order: 200.0, 204.2, 215.5, 219.5, 272.7, 276.9, 299.5, 302.7, 166.6, 172.6, 186.4, 192.0, 232.6, 240.8, 253.4, 261.6.
a. Verify that the sums of squares are as given in the accompanying ANOVA table from Minitab.
b. Which effects appear to be important, and why?

Analysis of Variance for current
Source             DF         SS         MS          F        P
Volt                1     1685.1     1685.1     102.38    0.000
Speed               1    21272.2    21272.2    1292.37    0.000
Dist                1     5076.6     5076.6     308.42    0.000
Volt*speed          1       36.6       36.6       2.22    0.174
Volt*dist           1        0.4        0.4       0.03    0.877
Speed*dist          1      109.2      109.2       6.63    0.033
Volt*speed*dist     1       23.5       23.5       1.43    0.266
Error               8      131.7       16.5
Total              15    28335.3

39. The accompanying data resulted from a $2^3$ experiment with three replications per combination of treatments designed to study the effects of concentration of detergent (A), concentration of sodium carbonate (B), and concentration of sodium carboxymethyl cellulose (C) on the cleaning ability of a solution in washing tests (a larger number indicates better cleaning ability than a smaller number).

Factor Levels
A    B    C    Condition    Observations
1    1    1    (1)          106,  93, 116
2    1    1    a            198, 200, 214
1    2    1    b            197, 202, 185
2    2    1    ab           329, 331, 307
1    1    2    c            149, 169, 135
2    1    2    ac           243, 247, 220
1    2    2    bc           255, 230, 252
2    2    2    abc          383, 360, 364

a. After obtaining cell totals $x_{ijk\cdot}$, compute estimates of $\beta_1$, $\gamma_{11}^{AC}$, and $\gamma_{21}^{AC}$.

b. Use the cell totals along with Yates’s method to compute the effect contrasts and sums of squares. Then construct an ANOVA table and test all appropriate hypotheses using $\alpha = .05$.

40. In a study of processes used to remove impurities from cellulose goods (“Optimization of Rope-Range Bleaching of Cellulosic Fabrics,” Textile Research J., 1976: 493–496), the following data resulted from a $2^4$ experiment involving the desizing process. The four factors were enzyme concentration (A), pH (B), temperature (C), and time (D).

                                                Starch % by Weight
Treatment   Enzyme (g/L)   pH    Temp. (°C)   Time (hr)   1st Repl.   2nd Repl.
(1)         .50            6.0   60.0         6            9.72       13.50
a           .75            6.0   60.0         6            9.80       14.04
b           .50            7.0   60.0         6           10.13       11.27
ab          .75            7.0   60.0         6           11.80       11.30
c           .50            6.0   70.0         6           12.70       11.37
ac          .75            6.0   70.0         6           11.96       12.05
bc          .50            7.0   70.0         6           11.38        9.92
abc         .75            7.0   70.0         6           11.80       11.10
d           .50            6.0   60.0         8           13.15       13.00
ad          .75            6.0   60.0         8           10.60       12.37
bd          .50            7.0   60.0         8           10.37       12.00
abd         .75            7.0   60.0         8           11.30       11.64
cd          .50            6.0   70.0         8           13.05       14.55
acd         .75            6.0   70.0         8           11.15       15.00
bcd         .50            7.0   70.0         8           12.70       14.10
abcd        .75            7.0   70.0         8           13.20       16.12

a. Use Yates’s algorithm to obtain sums of squares and the ANOVA table.

b. Do there appear to be any second-, third-, or fourth-order interaction effects present? Explain your reasoning. Which main effects appear to be significant?

41. In Exercise 39, suppose a low water temperature has been used to obtain the data. The entire experiment is then repeated with a higher water temperature to obtain the following data. Use Yates’s algorithm on the entire set of 48 observations to obtain the sums of squares and ANOVA table, and then test appropriate hypotheses at level .05.

Condition    Observations
d            144, 154, 158
ad           239, 227, 244
bd           232, 242, 246
abd          364, 362, 346
cd           194, 162, 203
acd          284, 295, 291
bcd          291, 287, 297
abcd         411, 406, 395

42. The following data on power consumption in electric-furnace heats (kW consumed per ton of melted product) resulted from a $2^4$ factorial experiment with three replicates (“Studies on a 10-cwt Arc Furnace,” J. of the Iron and Steel Institute, 1956: 22). The factors were nature of roof (A, low/high), power setting (B, low/high), scrap used (C, tube/plate), and charge (D, 700 lb/1000 lb).

Treatment   $x_{ijklm}$          Treatment   $x_{ijklm}$
(1)          866, 862, 800      d            988, 808, 650
a            946, 800, 840      ad           966, 976, 876
b            774, 834, 746      bd           702, 658, 650
ab           709, 789, 646      abd          784, 700, 596
c           1017, 990, 954      cd           922, 808, 868
ac          1028, 906, 977      acd         1056, 870, 908
bc           817, 783, 771      bcd          798, 726, 700
abc          829, 806, 691      abcd         752, 714, 714

Construct the ANOVA table, and test all hypotheses of interest using $\alpha = .01$.

43. The article “Statistical Design and Analysis of Qualification Test Program for a Small Rocket Engine” (Industrial Quality Control, 1964: 14–18) presents data from an experiment to assess the effects of vibration (A), temperature cycling (B), altitude cycling (C), and temperature for altitude cycling and firing (D) on thrust duration. A subset of the data is given here. (In the article, there were four levels of D rather than just two.) Use the Yates method to obtain sums of squares and the ANOVA table. Then assume that three- and four-factor interactions are absent, pool the corresponding sums of squares to obtain an estimate of $\sigma^2$, and test all appropriate hypotheses at level .05.

                   D1                 D2
               C1       C2        C1       C2
A1    B1     21.60    21.60     11.54    11.50
      B2     21.09    22.17     11.14    11.32
A2    B1     21.60    21.86     11.75     9.82
      B2     19.57    21.85     11.69    11.18

44. a. In a $2^4$ experiment, suppose two blocks are to be used, and it is decided to confound the ABCD interaction with the block effect. Which treatments should be carried out in the first block [the one containing the treatment (1)], and which treatments are allocated to the second block?

b. In an experiment to investigate niacin retention in vegetables as a function of cooking temperature (A), sieve size (B), type of processing (C), and cooking time (D), each factor was held at two levels. Two blocks were used, with the allocation of blocks as given in part (a) to confound only the ABCD interaction with blocks. Use Yates’s procedure to obtain the ANOVA table for the accompanying data.

Treatment   $x_{ijkl}$     Treatment   $x_{ijkl}$
(1)         91            d           72
a           85            ad          78
b           92            bd          68
ab          94            abd         79
c           86            cd          69
ac          83            acd         75
bc          85            bcd         72
abc         90            abcd        71

c. Assume that all three-way interaction effects are absent, so that the associated sums of squares can be combined to yield an estimate of $\sigma^2$, and carry out all appropriate tests at level .05.
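Block allocations of the kind asked for in Exercise 44(a) can be enumerated mechanically. The sketch below assumes the usual even/odd rule: a treatment is assigned to a block according to whether it shares an even or odd number of letters with each defining effect. The helper names `treatments`, `block_key`, and `blocks` are hypothetical and introduced only for this illustration.

```python
from itertools import combinations


def treatments(factors):
    """All 2^p treatment labels for the given factor letters, e.g. '(1)', 'a', 'ab'."""
    labels = []
    for r in range(len(factors) + 1):
        for combo in combinations(factors, r):
            labels.append("".join(combo) or "(1)")
    return labels


def block_key(treatment, defining_effects):
    """Parity (0 = even, 1 = odd) of letters shared with each defining effect."""
    letters = set() if treatment == "(1)" else set(treatment)
    return tuple(len(letters & set(effect.lower())) % 2
                 for effect in defining_effects)


def blocks(factors, defining_effects):
    """Group treatments into blocks keyed by their parity pattern."""
    groups = {}
    for t in treatments(factors):
        groups.setdefault(block_key(t, defining_effects), []).append(t)
    return groups


# One defining effect gives two blocks; the key (0,) marks the block with (1).
two_blocks = blocks("abcd", ["ABCD"])
for key, members in sorted(two_blocks.items()):
    print(key, members)
```

Passing two defining effects, for example `blocks("abcd", ["AB", "CD"])`, yields four parity patterns and therefore four blocks of four treatments each.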

45. a. An experiment was carried out to investigate the effects on audio sensitivity of varying resistance (A), two capacitances (B, C), and inductance of a coil (D) in part of a television circuit. If four blocks were used with four treatments per block and the defining effects for confounding were AB and CD, which treatments appeared in each block?

b. Suppose two replications of the experiment described in part (a) were performed, resulting in the accompanying data. Obtain the ANOVA table, and test all relevant hypotheses at level .01.

Treatment   $x_{ijkl1}$   $x_{ijkl2}$     Treatment   $x_{ijkl1}$   $x_{ijkl2}$
(1)         618       598           d           598       585
a           583       560           ad          587       541
b           477       525           bd          480       508
ab          421       462           abd         462       449
c           601       595           cd          603       577
ac          550       589           acd         571       552
bc          505       484           bcd         502       508
abc         452       451           abcd        449       455

46. In an experiment involving four factors (A, B, C, and D) and four blocks, show that at least one main effect or two-factor interaction effect must be confounded with the block effect.

47. a. In a seven-factor experiment $(A, \ldots, G)$, suppose a quarter-replicate is actually carried out. If the defining effects are ABCDE and CDEFG, what is the third nonestimable effect, and what treatments are in the group containing (1)? What are the alias groups of the seven main effects?

b. If the quarter-replicate is to be carried out using four blocks (with eight treatments per block), what are the blocks if the chosen confounding effects are ACF and BDG?

48. The article “Applying Design of Experiments to Improve a Laser Welding Process” (J. of Engr. Manufacture, 2008: 1035–1042) included the results of a half replicate of a $2^4$ experiment. The four factors were: A. Power (2900 W, 3300 W), B. Current (2400 mV, 3600 mV), C. Laterals cleaning (No, Yes), and D. Roof cleaning (No, Yes).

a. If the effect ABCD is chosen as the defining effect for the replicate and the group of eight treatments for which data is obtained includes treatment (1), what other treatments are in the observed group, and what are the alias pairs?

b. The cited article presented data on two different response variables, the percentage of defective joints for both the right laser welding cord and the left welding cord. Here we consider just the latter response. Observations are listed here in standard order after deleting the half not observed. Assuming that two- and three-factor interactions are negligible, test at level .05 for the presence of main effects. Also construct a normal probability plot.

8.936 9.130 4.314 7.692 .415 6.061 1.984 3.830

49. A half-replicate of a $2^5$ experiment to investigate the effects of heating time (A), quenching time (B), drawing time (C), position of heating coils (D), and measurement position (E) on the hardness of steel castings resulted in the accompanying data. Construct the ANOVA table, and (assuming second and higher-order interactions to be negligible) test at level .01 for the presence of main effects. Also construct a normal probability plot.

Treatment   Observation     Treatment   Observation
a           70.4            acd         66.6
b           72.1            ace         67.5
c           70.4            ade         64.0
d           67.4            bcd         66.8
e           68.0            bce         70.3
abc         73.8            bde         67.9
abd         67.0            cde         65.9
abe         67.8            abcde       68.0

SUPPLEMENTARY EXERCISES (50–61)

50. The results of a study on the effectiveness of line drying on the smoothness of fabric were summarized in the article “Line-Dried vs. Machine-Dried Fabrics: Comparison of Appearance, Hand, and Consumer Acceptance” (Home Econ. Research J., 1984: 27–35). Smoothness scores were given for nine different types of fabric and five different drying methods: (1) machine dry, (2) line dry, (3) line dry followed by 15-min tumble, (4) line dry with softener, and (5) line dry with air movement. Regarding the different types of fabric as blocks, construct an ANOVA table. Using a .05 significance level, test to see whether there is a difference in the true mean smoothness score for the drying methods.

                        Drying Method
Fabric            1      2      3      4      5
Crepe            3.3    2.5    2.8    2.5    1.9
Double knit      3.6    2.0    3.6    2.4    2.3
Twill            4.2    3.4    3.8    3.1    3.1
Twill mix        3.4    2.4    2.9    1.6    1.7
Terry            3.8    1.3    2.8    2.0    1.6
Broadcloth       2.2    1.5    2.7    1.5    1.9
Sheeting         3.5    2.1    2.8    2.1    2.2
Corduroy         3.6    1.3    2.8    1.7    1.8
Denim            2.6    1.4    2.4    1.3    1.6

51. The water absorption of two types of mortar used to repair damaged cement was discussed in the article “Polymer Mortar Composite Matrices for Maintenance-Free, Highly Durable Ferrocement” (J. of Ferrocement, 1984: 337–345). Specimens of ordinary cement mortar (OCM) and polymer cement mortar (PCM) were submerged for varying lengths of time (5, 9, 24, or 48 hours) and water absorption (% by weight) was recorded. With mortar type as factor A (with two levels) and submersion period as factor B (with four levels), three observations were made for each factor level combination. Data included in the article was used to compute the sums of squares, which were SSA = 322.667, SSB = 35.623, SSAB = 8.557, and SST = 372.113. Use this information to construct an ANOVA table. Test the appropriate hypotheses at a .05 significance level.
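When only sums of squares are reported, as in Exercise 51, the rest of a two-factor ANOVA table follows from arithmetic: SSE is obtained by subtraction, degrees of freedom from the numbers of levels and replicates, and each F ratio by dividing a mean square by MSE. The sketch below assumes the fixed-effects model with equal replication; the function name `two_factor_anova` and its return layout are illustrative choices, not part of the text.

```python
def two_factor_anova(ssa, ssb, ssab, sst, levels_a, levels_b, reps):
    """Build rows (df, SS, MS, F) of a two-factor fixed-effects ANOVA table.

    Assumes equal replication in every cell; F is None for the error row.
    (Hypothetical helper written for this illustration.)
    """
    sse = sst - ssa - ssb - ssab                      # error SS by subtraction
    df = {
        "A": levels_a - 1,
        "B": levels_b - 1,
        "AB": (levels_a - 1) * (levels_b - 1),
        "Error": levels_a * levels_b * (reps - 1),
    }
    ss = {"A": ssa, "B": ssb, "AB": ssab, "Error": sse}
    mse = sse / df["Error"]
    rows = {}
    for source in ("A", "B", "AB"):
        ms = ss[source] / df[source]
        rows[source] = (df[source], ss[source], ms, ms / mse)
    rows["Error"] = (df["Error"], sse, mse, None)
    return rows


# Exercise 51: mortar type (2 levels), submersion period (4 levels), 3 replicates.
table = two_factor_anova(322.667, 35.623, 8.557, 372.113,
                         levels_a=2, levels_b=4, reps=3)
for source, (dof, s, ms, f) in table.items():
    f_text = "" if f is None else f"{f:10.2f}"
    print(f"{source:6s}{dof:4d}{s:12.3f}{ms:12.3f}{f_text}")
```

Each computed F ratio would then be compared with the appropriate F critical value for its numerator df and the error df.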

52. Four plots were available for an experiment to compare clover accumulation for four different sowing rates (“Performance of Overdrilled Red Clover with Different Sowing Rates and Initial Grazing Managements,” N. Zeal. J. of Exp. Ag., 1984: 71–81). Since the four plots had been grazed differently prior to the experiment and it was thought that this might affect clover accumulation, a randomized block experiment was used with all four sowing rates tried on a section of each plot. Use the given data to test the null hypothesis of no difference in true mean clover accumulation (kg DM/ha) for the different sowing rates.

               Sowing Rate (kg/ha)
Plot       3.6      6.6     10.2     13.5
1         1155     2255     3505     4632
2          123      406      564      416
3           68      416      662      379
4           62       75      362      564

53. In an automated chemical coating process, the speed with which objects on a conveyor belt are passed through a chemical spray (belt speed), the amount of chemical sprayed (spray volume), and the brand of chemical used (brand) are factors that may affect the uniformity of the coating applied. A replicated $2^3$ experiment was conducted in an effort to increase the coating uniformity. In the following table, higher values of the response variable are associated with higher surface uniformity:


Surface Uniformity

Repli- Repli- Spray Belt cation cation

Run Volume Speed Brand 1 2

1 40 36 2 25 28 3 30 32 4 50 48 5 45 43 6 25 30 7 30 29 8 52 49

Analyze this data and state your conclusions.

54. Coal-fired power plants used in the electrical industry have gained increased public attention because of the environmental problems associated with solid wastes generated by large-scale combustion (“Fly Ash Binders in Stabilization of FGD Wastes,” J. of Environmental Engineering, 1998: 43–49). A study was conducted to analyze the influence of three factors that affect certain leaching characteristics of solid wastes from combustion: binder type (A), amount of water (B), and land disposal scenario (C). Each factor was studied at two levels. An unreplicated $2^3$ experiment was run, and a response value EC50 (the effective concentration, in mg/L, that decreases 50% of the light in a luminescence bioassay) was measured for each combination of factor levels. The experimental data is given in the following table:

Factor Response Run A B C EC50

1 23,100 2 1 43,000 3 1 71,400 4 1 1 76,000 5 1 37,000 6 1 1 33,200 7 1 1 17,000 8 1 1 1 16,500

Carry out an appropriate ANOVA, and state your conclusions.

55. Impurities in the form of iron oxides lower the economic value and usefulness of industrial minerals, such as kaolins, to ceramic and paper-processing industries. A $2^4$ experiment was conducted to assess the effects of four factors on the percentage of iron removed from kaolin samples (“Factorial Experiments in the Development of a Kaolin Bleaching Process Using Thiourea in Sulphuric Acid Solutions,” Hydrometallurgy, 1997: 181–197). The factors and their levels are listed in the following table:

21 21 2121

21 2121 2121 212121

111 112 121 122 211 212 221 222

Factor   Description    Units   Low Level   High Level
A        H2SO4          M       .10         .25
B        Thiourea       g/L     0.0         5.0
C        Temperature    °C      70          90
D        Time           min     30          150

The data from an unreplicated $2^4$ experiment is listed in the next table.

Test Run   Iron Extraction (%)     Test Run   Iron Extraction (%)
(1)         7                      d          28
a          11                      ad         51
b           7                      bd         33
ab         12                      abd        57
c          21                      cd         70
ac         41                      acd        95
bc         27                      bcd        77
abc        48                      abcd       99

a. Calculate estimates of all main effects and two-factor interaction effects for this experiment.

b. Create a probability plot of the effects. Which effects appear to be important?

56. Factorial designs have been used in forestry to assess the effects of various factors on the growth behavior of trees. In one such experiment, researchers thought that healthy spruce seedlings should bud sooner than diseased spruce seedlings (“Practical Analysis of Factorial Experiments in Forestry,” Canadian J. of Forestry, 1995: 446–461). In addition, before planting, seedlings were also exposed to three levels of pH to see whether this factor has an effect on virus uptake into the root system. The following table shows data from a $2 \times 3$ experiment to study both factors:

pH

3 5.5 7

Diseased 1.2, 1.4, .8, .6, 1.0, 1.0, 1.0, 1.2, .8, 1.0, 1.2, 1.4, 1.4 .8 1.2

Health Healthy 1.4, 1.6, 1.0, 1.2, 1.2, 1.4, 1.6, 1.6, 1.2, 1.4, 1.2, 1.2, 1.4 1.4 1.4

The response variable is an average rating of five buds from a seedling. The ratings are 0 (bud not broken), 1 (bud partially expanded), and 2 (bud fully expanded). Analyze this data.

57. One property of automobile air bags that contributes to their ability to absorb energy is the permeability (ft$^3$/ft$^2$/min) of the woven material used to construct the air bags. Understanding how permeability is influenced by various factors is important for increasing the effectiveness of air bags. In one study, the effects of three factors, each at three levels, were studied (“Analysis of Fabrics Used in Passive Restraint Systems—Airbags,” J. of the Textile Institute, 1996: 554–571):

A (Temperature): 8°C, 50°C, 75°C

B (Fabric denier): 420-D, 630-D, 840-D

C (Air pressure): 17.2 kPa, 34.4 kPa, 103.4 kPa

Temperature 8°
                 Pressure
Denier      17.2     34.4    103.4
420-D         73      157      332
              80      155      322
630-D         35       91      288
             433       98      271
840-D        125      234      477
             111      233      464

Temperature 50°
                 Pressure
Denier      17.2     34.4    103.4
420-D         52      125      281
              51      118      264
630-D         16       72      169
              12       78      173
840-D         96      149      338
             100      155      350

Temperature 75°
                 Pressure
Denier      17.2     34.4    103.4
420-D         37       95      276
              31      106      281
630-D         30       91      213
              41      100      211
840-D        102      170      307
              98      160      311

Analyze this data and state your conclusions (assume that all factors are fixed).

58. A chemical engineer has carried out an experiment to study the effects of the fixed factors of vat pressure (A), cooking time of pulp (B), and hardwood concentration (C) on the strength of paper. The experiment involved two pressures, four cooking times, three concentrations, and two observations at each combination of these levels. Calculated sums of squares are SSA = 6.94, SSB = 5.61, SSC = 12.33, SSAB = 4.05, SSAC = 7.32, SSBC = 15.80, SSE = 14.40, and SST = 70.82. Construct the ANOVA table, and carry out appropriate tests at significance level .05.

59. The bond strength when mounting an integrated circuit on a metalized glass substrate was studied as a function of factor A = adhesive type, factor B = cure time, and factor C = conductor material (copper and nickel). The data follows, along with an ANOVA table from Minitab. What conclusions can you draw from the data?

Copper
                     Cure Time
Adhesive       1        2        3
1            72.7     74.6     80.0
             80.0     77.5     82.7
2            77.8     78.5     84.6
             75.3     81.1     78.3
3            77.3     80.9     83.9
             76.5     82.6     85.0

Nickel
                     Cure Time
Adhesive       1        2        3
1            74.7     75.7     77.2
             77.4     78.2     74.6
2            79.3     78.8     83.0
             77.8     75.4     83.9
3            77.2     84.5     89.4
             78.4     77.5     81.2

Analysis of Variance for strength
Source            DF        SS        MS       F       P
Adhesive           2   101.317    50.659    6.54   0.007
Curetime           2   151.317    75.659    9.76   0.001
Conmater           1     0.722     0.722    0.09   0.764
Adhes*curet        4    30.526     7.632    0.98   0.441
Adhes*conm         2     8.015     4.008    0.52   0.605
Curet*conm         2     5.952     2.976    0.38   0.687
Adh*curet*conm     4    33.298     8.325    1.07   0.398
Error             18   139.515     7.751
Total             35   470.663

60. The article “Effect of Cutting Conditions on Tool Performance in CBN Hard Turning” (J. of Manuf. Processes, 2005: 10–17) reported the accompanying data on cutting speed (m/s), feed (mm/rev), depth of cut (mm), and tool life (min). Carry out a three-factor ANOVA on tool life, assuming the absence of any factor interactions (as did the authors of the article).

Obs   Cut spd   Feed    Cut dpth   life
1     1.21      0.061   0.102      27.5
2     1.21      0.168   0.102      26.5
3     1.21      0.061   0.203      27.0
4     1.21      0.168   0.203      25.0
5     3.05      0.061   0.102       8.0
6     3.05      0.168   0.102       5.0
7     3.05      0.061   0.203       7.0
8     3.05      0.168   0.203       3.5

61. Analogous to a Latin square, a Greco-Latin square design can be used when it is suspected that three extraneous factors may affect the response variable and all four factors (the three extraneous ones and the one of interest) have the same number of levels. In a Latin square, each level of the factor of interest (C) appears once in each row (with each level of A) and once in each column (with each level of B). In a Greco-Latin square, each level of factor D appears once in each row, in each column, and also with each level of the third extraneous factor C. Alternatively, the design can be used when the four factors are all of equal interest, the number of levels of each is N, and resources are available only for $N^2$ observations. A $5 \times 5$ square is pictured in (a), with (k, l) in each cell denoting the kth level of C and lth level of D. In (b) we present data on weight loss in silicon bars used for semiconductor material as a function of volume of etch (A), color of nitric acid in the etch solution (B), size of bars (C), and time in the etch solution (D) (from “Applications of Analytic Techniques to the Semiconductor Industry,” Fourteenth Midwest Quality Control Conference, 1959).

Let $x_{ij(kl)}$ denote the observed weight loss when factor A is at level i, B is at level j, C is at level k, and D is at level l. Assuming no interaction between factors, the total sum of squares SST (with $N^2 - 1$ df) can be partitioned into SSA, SSB, SSC, SSD, and SSE. Give expressions for these sums of squares, including computing formulas, obtain the ANOVA table for the given data, and test each of the four main effect hypotheses using $\alpha = .05$.

                            B
(C, D)      1         2         3         4         5
     1    (1, 1)    (2, 3)    (3, 5)    (4, 2)    (5, 4)
     2    (2, 2)    (3, 4)    (4, 1)    (5, 3)    (1, 5)
A    3    (3, 3)    (4, 5)    (5, 2)    (1, 4)    (2, 1)
     4    (4, 4)    (5, 1)    (1, 3)    (2, 5)    (3, 2)
     5    (5, 5)    (1, 2)    (2, 4)    (3, 1)    (4, 3)

(a)

 65     82    108    101    126
 84    109     73     97     83
105    129     89     89     52
119     72     76    117     84
 97     59     94     78    106

(b)
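The defining properties of a Greco-Latin square stated in Exercise 61 can be checked directly for the arrangement in (a). The sketch below is a small verification script; the names `is_latin` and `is_greco_latin` are hypothetical helpers, and the check covers only the design layout, not the ANOVA the exercise asks for.

```python
# Square (a): rows are levels of A, columns are levels of B,
# and each cell holds (level of C, level of D).
square = [
    [(1, 1), (2, 3), (3, 5), (4, 2), (5, 4)],
    [(2, 2), (3, 4), (4, 1), (5, 3), (1, 5)],
    [(3, 3), (4, 5), (5, 2), (1, 4), (2, 1)],
    [(4, 4), (5, 1), (1, 3), (2, 5), (3, 2)],
    [(5, 5), (1, 2), (2, 4), (3, 1), (4, 3)],
]


def is_latin(grid):
    """True if every level 1..N appears exactly once in each row and column."""
    n = len(grid)
    full = set(range(1, n + 1))
    rows_ok = all(set(row) == full for row in grid)
    cols_ok = all({grid[i][j] for i in range(n)} == full for j in range(n))
    return rows_ok and cols_ok


def is_greco_latin(sq):
    """True if the C and D layouts are both Latin and every (C, D) pair occurs once."""
    n = len(sq)
    c_levels = [[cell[0] for cell in row] for row in sq]
    d_levels = [[cell[1] for cell in row] for row in sq]
    distinct_pairs = {cell for row in sq for cell in row}
    return (is_latin(c_levels) and is_latin(d_levels)
            and len(distinct_pairs) == n * n)


print(is_greco_latin(square))
```

The last condition, that all $N^2$ pairs (k, l) are distinct, is what guarantees that each level of D appears exactly once with each level of C.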

Bibliography

Box, George, William Hunter, and Stuart Hunter, Statistics for Experimenters (2nd ed.), Wiley, New York, 2006. Contains a wealth of suggestions and insights on data analysis based on the authors’ extensive consulting experience.

DeVor, R., T. Chang, and J. W. Sutherland, Statistical Quality Design and Control (2nd ed.), Prentice-Hall, Englewood Cliffs, NJ, 2006. Includes a modern survey of factorial and fractional factorial experimentation with a minimum of mathematics.

Hocking, Ronald, Methods and Applications of Linear Models (2nd ed.), Wiley, New York, 2003. A very general treatment of analysis of variance written by one of the foremost authorities in this field.

Kleinbaum, David, Lawrence Kupper, Keith Muller, and Azhar Nizam, Applied Regression Analysis and Other Multivariable Methods (4th ed.), Duxbury Press, Boston, 2007. Contains an especially good discussion of problems associated with analysis of “unbalanced data” (that is, unequal $K_{ij}$’s).

Kuehl, Robert O., Design of Experiments: Statistical Principles of Research Design and Analysis (2nd ed.), Duxbury Press, Boston, 1999. A comprehensive treatment of designed experiments and analysis of the resulting data.

Montgomery, Douglas, Design and Analysis of Experiments (7th ed.), Wiley, New York, 2009. See the Chapter 10 bibliography.

Neter, John, William Wasserman, and Michael Kutner, Applied Linear Statistical Models (5th ed.), Irwin, Homewood, IL, 2004. See the Chapter 10 bibliography.

Vardeman, Stephen, Statistics for Engineering Problem Solving, PWS, Boston, 1994. A general introduction for engineers, with much descriptive and inferential methodology for data from designed experiments.


12 Simple Linear Regression and Correlation

INTRODUCTION

In the two-sample problems discussed in Chapter 9, we were interested in comparing values of parameters for the x distribution and the y distribution. Even when observations were paired, we did not try to use information about one of the variables in studying the other variable. This is precisely the objective of regression analysis: to exploit the relationship between two (or more) variables so that we can gain information about one of them through knowing values of the other(s).

Much of mathematics is devoted to studying variables that are deterministically related. Saying that x and y are related in this manner means that once we are told the value of x, the value of y is completely specified. For example, consider renting a van for a day, and suppose that the rental cost is \$25.00 plus \$.30 per mile driven. Letting x = the number of miles driven and y = the rental charge, then $y = 25 + .3x$. If the van is driven 100 miles ($x = 100$), then $y = 25 + .3(100) = 55$. As another example, if the initial velocity of a particle is $v_0$ and it undergoes constant acceleration a, then distance traveled $= y = v_0 x + \frac{1}{2} a x^2$, where x = time.

There are many variables x and y that would appear to be related to one another, but not in a deterministic fashion. A familiar example is given by variables x = high school grade point average (GPA) and y = college GPA. The value of y cannot be determined just from knowledge of x, and two different individuals could have the same x value but have very different y values. Yet there is a tendency for those who have high (low) high school GPAs also to have high (low) college GPAs. Knowledge of a student’s high school GPA should be quite helpful in enabling us to predict how that person will do in college.


Other examples of variables related in a nondeterministic fashion include x = age of a child and y = size of that child’s vocabulary, x = size of an engine (cm$^3$) and y = fuel efficiency for an automobile equipped with that engine, and x = applied tensile force and y = amount of elongation in a metal strip.

Regression analysis is the part of statistics that investigates the relationship between two or more variables related in a nondeterministic fashion. In this chapter, we generalize the deterministic linear relation $y = \beta_0 + \beta_1 x$ to a linear probabilistic relationship, develop procedures for making various inferences based on the model, and obtain a quantitative measure (the correlation coefficient) of the extent to which the two variables are related. In Chapter 13, we will consider techniques for validating a particular model and investigate nonlinear relationships and relationships involving more than two variables.

12.1 The Simple Linear Regression Model

The simplest deterministic mathematical relationship between two variables x and y is a linear relationship $y = \beta_0 + \beta_1 x$. The set of pairs (x, y) for which $y = \beta_0 + \beta_1 x$ determines a straight line with slope $\beta_1$ and y-intercept $\beta_0$.* The objective of this section is to develop a linear probabilistic model.

If the two variables are not deterministically related, then for a fixed value of x, there is uncertainty in the value of the second variable. For example, if we are investigating the relationship between age of child and size of vocabulary and decide to select a child of age $x = 5.0$ years, then before the selection is made, vocabulary size is a random variable Y. After a particular 5-year-old child has been selected and tested, a vocabulary of 2000 words may result. We would then say that the observed value of Y associated with fixing $x = 5.0$ was $y = 2000$.

More generally, the variable whose value is fixed by the experimenter will be denoted by x and will be called the independent, predictor, or explanatory variable. For fixed x, the second variable will be random; we denote this random variable and its observed value by Y and y, respectively, and refer to it as the dependent or response variable.

Usually observations will be made for a number of settings of the independent variable. Let $x_1, x_2, \ldots, x_n$ denote values of the independent variable for which observations are made, and let $Y_i$ and $y_i$, respectively, denote the random variable and observed value associated with $x_i$. The available bivariate data then consists of the n pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. A picture of this data called a scatter plot gives preliminary impressions about the nature of any relationship. In such a plot, each $(x_i, y_i)$ is represented as a point plotted on a two-dimensional coordinate system.

* The slope of a line is the change in y for a 1-unit increase in x. For example, if $y = -3x + 10$, then y decreases by 3 when x increases by 1, so the slope is $-3$. The y-intercept is the height at which the line crosses the vertical axis and is obtained by setting $x = 0$ in the equation.


Example 12.1


Visual and musculoskeletal problems associated with the use of visual display terminals (VDTs) have become rather common in recent years. Some researchers have focused on vertical gaze direction as a source of eye strain and irritation. This direction is known to be closely related to ocular surface area (OSA), so a method of measuring OSA is needed. The accompanying representative data on y = OSA (cm$^2$) and x = width of the palpebral fissure (i.e., the horizontal width of the eye opening, in cm) is from the article “Analysis of Ocular Surface Area for Comfortable VDT Workstation Layout” (Ergonomics, 1996: 877–884). The order in which observations were obtained was not given, so for convenience they are listed in increasing order of x values.

i 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

xi .40 .42 .48 .51 .57 .60 .70 .75 .75 .78 .84 .95 .99 1.03 1.12

yi 1.02 1.21 .88 .98 1.52 1.83 1.50 1.80 1.74 1.63 2.00 2.80 2.48 2.47 3.05

i 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30

xi 1.15 1.20 1.25 1.25 1.28 1.30 1.34 1.37 1.40 1.43 1.46 1.49 1.55 1.58 1.60

yi 3.18 3.76 3.68 3.82 3.21 4.27 3.12 3.99 3.75 4.10 4.18 3.77 4.34 4.21 4.92

Figure 12.1 Scatter plot from Minitab for the data from Example 12.1, along with dotplots of x and y values

Thus $(x_1, y_1) = (.40, 1.02)$, $(x_5, y_5) = (.57, 1.52)$, and so on. A Minitab scatter plot is shown in Figure 12.1; we used an option that produced a dotplot of both the x values and y values individually along the right and top margins of the plot, which makes it easier to visualize the distributions of the individual variables (histograms or boxplots are alternative options). Here are some things to notice about the data and plot:

• Several observations have identical x values yet different y values (e.g., $x_8 = x_9 = .75$, but $y_8 = 1.80$ and $y_9 = 1.74$). Thus the value of y is not determined solely by x but also by various other factors.

• There is a strong tendency for y to increase as x increases. That is, larger values of OSA tend to be associated with larger values of fissure width, a positive relationship between the variables.

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

12.1 The Simple Linear Regression Model 471

Example 12.2

• It appears that the value of y could be predicted from x by finding a line that is rea- sonably close to the points in the plot (the authors of the cited article superimposed such a line on their plot). In other words, there is evidence of a substantial (though not perfect) linear relationship between the two variables. ■

The horizontal and vertical axes in the scatter plot of Figure 12.1 intersect at the point (0, 0). In many data sets, the values of x or y or the values of both variables differ considerably from zero relative to the range(s) of the values. For example, a study of how air conditioner efficiency is related to maximum daily outdoor tem- perature might involve observations for temperatures ranging from 80°F to 100°F. When this is the case, a more informative plot would show the appropriately labeled axes intersecting at some point other than (0, 0).

Arsenic is found in many ground-waters and some surface waters. Recent health effects research has prompted the Environmental Protection Agency to reduce allow- able arsenic levels in drinking water so that many water systems are no longer com- pliant with standards. This has spurred interest in the development of methods to remove arsenic. The accompanying data on and arsenic removed (%) by a particular process was read from a scatter plot in the article “Optimizing Arsenic Removal During Iron Removal: Theoretical and Practical Considerations” (J. of Water Supply Res. and Tech., 2005: 545–560).

y 5x 5 pH

x   7.01  7.11  7.12  7.24  7.94  7.94  8.04  8.05  8.07
y     60    67    66    52    50    45    52    48    40

x   8.90  8.94  8.95  8.97  8.98  9.85  9.86  9.86  9.87
y     23    20    40    31    26     9    22    13     7

Figure 12.2 Minitab scatter plots of data in Example 12.2 (% removal versus pH): (a) axis scales selected by the software; (b) axes intersecting at roughly (0, 0)

Figure 12.2 shows two Minitab scatter plots of this data. In Figure 12.2(a), the software selected the scale for both axes. We obtained Figure 12.2(b) by specifying scaling for the axes so that they would intersect at roughly the point (0, 0). The second plot is much more crowded than the first one; such crowding can make it difficult to ascertain the general nature of any relationship. For example, curvature can be overlooked in a crowded plot.

Large values of arsenic removal tend to be associated with low pH, a negative or inverse relationship. Furthermore, the two variables appear to be at least approximately linearly related, although the points in the plot would spread out somewhat about any superimposed straight line (such a line appeared in the plot in the cited article). ■

A Linear Probabilistic Model

For the deterministic model $y = \beta_0 + \beta_1 x$, the actual observed value of y is a linear function of x. The appropriate generalization of this to a probabilistic model assumes that the expected value of Y is a linear function of x, but that for fixed x the variable Y differs from its expected value by a random amount.

DEFINITION

The Simple Linear Regression Model

There are parameters $\beta_0$, $\beta_1$, and $\sigma^2$, such that for any fixed value of the independent variable x, the dependent variable is a random variable related to x through the model equation

$$Y = \beta_0 + \beta_1 x + \varepsilon \qquad (12.1)$$

The quantity $\varepsilon$ in the model equation is a random variable, assumed to be normally distributed with $E(\varepsilon) = 0$ and $V(\varepsilon) = \sigma^2$.

The variable $\varepsilon$ is usually referred to as the random deviation or random error term in the model. Without $\varepsilon$, any observed pair (x, y) would correspond to a point falling exactly on the line $y = \beta_0 + \beta_1 x$, called the true (or population) regression line. The inclusion of the random error term allows (x, y) to fall either above the true regression line (when $\varepsilon > 0$) or below the line (when $\varepsilon < 0$). The points $(x_1, y_1), \ldots, (x_n, y_n)$ resulting from n independent observations will then be scattered about the true regression line, as illustrated in Figure 12.3. On occasion, the appropriateness of the simple linear regression model may be suggested by theoretical considerations (e.g., there is an exact linear relationship between the two variables, with $\varepsilon$ representing measurement error). Much more frequently, though, the reasonableness of the model is indicated by a scatter plot exhibiting a substantial linear pattern (as in Figures 12.1 and 12.2).

Figure 12.3 Points corresponding to observations from the simple linear regression model
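As a rough illustration of model equation (12.1), the following Python sketch generates observations that scatter about a true regression line. The function name `simulate_slr` and the parameter values ($\beta_0 = 2$, $\beta_1 = .5$, $\sigma = 1$) are assumptions chosen only for illustration; they do not come from any data set in the text.

```python
import random


def simulate_slr(xs, beta0, beta1, sigma, seed=None):
    """Simulate Y = beta0 + beta1*x + eps with eps ~ N(0, sigma^2), one Y per x."""
    rng = random.Random(seed)
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]


# Hypothetical parameter values and x values, for illustration only.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0
xs = [1, 2, 3, 4, 5, 6]
ys = simulate_slr(xs, BETA0, BETA1, SIGMA, seed=1)

for x, y in zip(xs, ys):
    height = BETA0 + BETA1 * x   # height of the true regression line at x
    eps = y - height             # realized random deviation
    position = "above" if eps > 0 else "below"
    print(f"x = {x}: line = {height:.2f}, y = {y:.2f}, eps = {eps:+.2f} ({position} the line)")
```

Setting `sigma` to 0 removes the random deviation, in which case every simulated point falls exactly on the line.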


Implications of the model equation (12.1) can best be understood with the aid of the following notation. Let x* denote a particular value of the independent variable x and

$\mu_{Y \cdot x^*}$ = the expected (or mean) value of Y when x has value x*
$\sigma^2_{Y \cdot x^*}$ = the variance of Y when x has value x*

Alternative notation is $E(Y \mid x^*)$ and $V(Y \mid x^*)$. For example, if $x$ = applied stress (kg/mm$^2$) and $y$ = time-to-fracture (hr), then $\mu_{Y \cdot 20}$ would denote the expected value of time-to-fracture when applied stress is 20 kg/mm$^2$. If we think of an entire population of (x, y) pairs, then $\mu_{Y \cdot x^*}$ is the mean of all y values for which $x = x^*$, and $\sigma^2_{Y \cdot x^*}$ is a measure of how much these values of y spread out about the mean value. If, for example, $x$ = age of a child and $y$ = vocabulary size, then $\mu_{Y \cdot 5}$ is the average vocabulary size for all 5-year-old children in the population, and $\sigma^2_{Y \cdot 5}$ describes the amount of variability in vocabulary size for this part of the population. Once x is fixed, the only randomness on the right-hand side of the model equation (12.1) is in the random error $\varepsilon$, and its mean value and variance are 0 and $\sigma^2$, respectively, whatever the value of x. This implies that

$$\mu_{Y \cdot x^*} = E(\beta_0 + \beta_1 x^* + \varepsilon) = \beta_0 + \beta_1 x^* + E(\varepsilon) = \beta_0 + \beta_1 x^*$$
$$\sigma^2_{Y \cdot x^*} = V(\beta_0 + \beta_1 x^* + \varepsilon) = V(\beta_0 + \beta_1 x^*) + V(\varepsilon) = 0 + \sigma^2 = \sigma^2$$

Replacing x* in $\mu_{Y \cdot x^*}$ by x gives the relation $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$, which says that the mean value of Y, rather than Y itself, is a linear function of x. The true regression line $y = \beta_0 + \beta_1 x$ is thus the line of mean values; its height above any particular x value is the expected value of Y for that value of x. The slope $\beta_1$ of the true regression line is interpreted as the expected change in Y associated with a 1-unit increase in the value of x. The second relation states that the amount of variability in the distribution of Y values is the same at each different value of x (homogeneity of variance). In the example involving age of a child and vocabulary size, the model implies that average vocabulary size changes linearly with age (hopefully $\beta_1$ is positive) and that the amount of variability in vocabulary size at any particular age is the same as at any other age. Finally, for fixed x, Y is the sum of a constant $\beta_0 + \beta_1 x$ and a normally distributed rv $\varepsilon$ so itself has a normal distribution. These properties are illustrated in Figure 12.4. The variance parameter $\sigma^2$ determines the extent to which each normal curve spreads out about its mean value (the height of the line). When $\sigma^2$ is small, an observed point (x, y) will almost always fall quite close to the true regression line, whereas observations may deviate considerably from their expected values (corresponding to points far from the line) when $\sigma^2$ is large.

Figure 12.4 (a) Distribution of $\varepsilon$ (normal, mean 0, standard deviation $\sigma$); (b) distribution of Y for different values of x

Example 12.3

Suppose the relationship between applied stress x and time-to-failure y is described by the simple linear regression model with true regression line $y = 65 - 1.2x$ and $\sigma = 8$. Then for any fixed value x* of stress, time-to-failure has a normal distribution with mean value $65 - 1.2x^*$ and standard deviation 8. Roughly speaking, in the population consisting of all (x, y) points, the magnitude of a typical deviation from the true regression line is about 8. For $x = 20$, Y has mean value $\mu_{Y \cdot 20} = 65 - 1.2(20) = 41$, so

$$P(Y > 50 \text{ when } x = 20) = P\left(Z > \frac{50 - 41}{8}\right) = 1 - \Phi(1.13) = .1292$$

The probability that time-to-failure exceeds 50 when applied stress is 25 is, because $\mu_{Y \cdot 25} = 35$,

$$P(Y > 50 \text{ when } x = 25) = P\left(Z > \frac{50 - 35}{8}\right) = 1 - \Phi(1.88) = .0301$$

These probabilities are illustrated as the shaded areas in Figure 12.5.

Figure 12.5 Probabilities based on the simple linear regression model
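The two shaded probabilities in Figure 12.5 can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `prob_exceeds` and the `table_style` option (which rounds z half-up to two decimals, as one would when reading a printed normal table) are assumptions introduced here for illustration.

```python
import math
from statistics import NormalDist


def prob_exceeds(threshold, x, beta0, beta1, sigma, table_style=True):
    """P(Y > threshold) when Y is normal with mean beta0 + beta1*x and sd sigma."""
    mean = beta0 + beta1 * x
    z = (threshold - mean) / sigma
    if table_style:
        # Round z half-up to two decimals to mimic a printed z table.
        z = math.floor(z * 100 + 0.5) / 100
    return 1.0 - NormalDist().cdf(z)


# True regression line y = 65 - 1.2x with sigma = 8, as in Example 12.3.
p20 = prob_exceeds(50, 20, beta0=65.0, beta1=-1.2, sigma=8.0)
p25 = prob_exceeds(50, 25, beta0=65.0, beta1=-1.2, sigma=8.0)
print(f"P(Y > 50 when x = 20) = {p20:.4f}")
print(f"P(Y > 50 when x = 25) = {p25:.4f}")
```

Passing `table_style=False` uses the unrounded z values (1.125 and 1.875), so the results differ slightly from the table-based figures in the example.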

Suppose that $Y_1$ denotes an observation on time-to-failure made with $x = 25$ and $Y_2$ denotes an independent observation made with $x = 24$. Then $Y_1 - Y_2$ is normally distributed with mean value $E(Y_1 - Y_2) = \beta_1 = -1.2$, variance $V(Y_1 - Y_2) = \sigma^2 + \sigma^2 = 128$, and standard deviation $\sqrt{128} = 11.314$. The probability that $Y_1$ exceeds $Y_2$ is

$$P(Y_1 - Y_2 > 0) = P\left(Z > \frac{0 - (-1.2)}{11.314}\right) = P(Z > .11) = .4562$$

That is, even though we expected Y to decrease when x increases by 1 unit, it is not unlikely that the observed Y at $x + 1$ will be larger than the observed Y at x. ■

EXERCISES Section 12.1 (1–11)

1. The efficiency ratio for a steel specimen immersed in a phosphating tank is the weight of the phosphate coating divided by the metal loss (both in mg/ft$^2$). The article “Statistical Process Control of a Phosphate Coating Line” (Wire J. Intl., May 1997: 78–81) gave the accompanying data on tank temperature (x) and efficiency ratio (y).

Temp.   170   172   173   174   174   175   176
Ratio   .84  1.31  1.42  1.03  1.07  1.08  1.04

Temp.   177   180   180   180   180   180   181
Ratio  1.80  1.45  1.60  1.61  2.13  2.15   .84

Temp.   181   182   182   182   182   184   184
Ratio  1.43   .90  1.81  1.94  2.68  1.49  2.52

Temp.   185   186   188
Ratio  3.00  1.87  3.08

a. Construct stem-and-leaf displays of both temperature and efficiency ratio, and comment on interesting features.

b. Is the value of efficiency ratio completely and uniquely determined by tank temperature? Explain your reasoning.

c. Construct a scatter plot of the data. Does it appear that efficiency ratio could be very well predicted by the value of temperature? Explain your reasoning.

2. The article “Exhaust Emissions from Four-Stroke Lawn Mower Engines” (J. of the Air and Water Mgmnt. Assoc., 1997: 945–952) reported data from a study in which both a baseline gasoline mixture and a reformulated gasoline were used. Consider the following observations on age (yr) and NOx emissions (g/kWh):

Engine            1     2     3     4     5
Age               0     0     2    11     7
Baseline       1.72  4.38  4.06  1.26  5.31
Reformulated   1.88  5.93  5.54  2.67  6.53

Engine            6     7     8     9    10
Age              16     9     0    12     4
Baseline        .57  3.37  3.44   .74  1.24
Reformulated    .74  4.94  4.89   .69  1.42

Construct scatter plots of NOx emissions versus age. What appears to be the nature of the relationship between these two variables? [Note: The authors of the cited article commented on the relationship.]

3. Bivariate data often arises from the use of two different techniques to measure the same quantity. As an example, the accompanying observations on $x$ = hydrogen concentration (ppm) using a gas chromatography method and $y$ = concentration using a new sensor method were read from a graph in the article “A New Method to Measure the Diffusible Hydrogen Content in Steel Weldments Using a Polymer Electrolyte-Based Hydrogen Sensor” (Welding Res., July 1997: 251s–256s).

x    47   62   65   70   70   78   95  100  114  118
y    38   62   53   67   84   79   93  106  117  116

x   124  127  140  140  140  150  152  164  198  221
y   127  114  134  139  142  170  149  154  200  215

Construct a scatter plot. Does there appear to be a very strong relationship between the two types of concentration measurements? Do the two methods appear to be measuring roughly the same quantity? Explain your reasoning.

4. A study to assess the capability of subsurface flow wetland systems to remove biochemical oxygen demand (BOD) and various other chemical constituents resulted in the accompanying data on $x$ = BOD mass loading (kg/ha/d) and $y$ = BOD mass removal (kg/ha/d) (“Subsurface Flow Wetlands—A Performance Evaluation,” Water Envir. Res., 1995: 244–247).

x   3   8  10  11  13  16  27  30  35  37  38  44  103  142
y   4   7   8   8  10  11  16  26  21   9  31  30   75   90

a. Construct boxplots of both mass loading and mass removal, and comment on any interesting features.

b. Construct a scatter plot of the data, and comment on any interesting features.

5. The article “Objective Measurement of the Stretchability of Mozzarella Cheese” (J. of Texture Studies, 1992: 185–194) reported on an experiment to investigate how the behavior of mozzarella cheese varied with temperature. Consider the accompanying data on $x$ = temperature and $y$ = elongation (%) at failure of the cheese. [Note: The researchers were Italian and used real mozzarella cheese, not the poor cousin widely available in the United States.]

x    59   63   68   72   74   78   83
y   118  182  247  208  197  135  132

a. Construct a scatter plot in which the axes intersect at (0, 0). Mark 0, 20, 40, 60, 80, and 100 on the horizontal axis and 0, 50, 100, 150, 200, and 250 on the vertical axis.

b. Construct a scatter plot in which the axes intersect at (55, 100), as was done in the cited article. Does this plot seem preferable to the one in part (a)? Explain your reasoning.

c. What do the plots of parts (a) and (b) suggest about the nature of the relationship between the two variables?

6. One factor in the development of tennis elbow, a malady that strikes fear in the hearts of all serious tennis players, is the impact-induced vibration of the racket-and-arm system at ball contact. It is well known that the likelihood of getting tennis elbow depends on various properties of the racket used. Consider the scatter plot of $x$ = racket resonance frequency (Hz) and $y$ = sum of peak-to-peak acceleration (a characteristic of arm vibration, in m/sec/sec) for $n = 23$ different rackets (“Transfer of Tennis Racket Vibrations into the Human Forearm,” Medicine and Science in Sports and Exercise, 1992: 1134–1140). Discuss interesting features of the data and scatter plot.

[Scatter plot for Exercise 6: y (axis marked from 22 to 38) versus x (axis marked from 100 to 190)]

7. The article “Some Field Experience in the Use of an Accelerated Method in Estimating 28-Day Strength of Concrete” (J. of Amer. Concrete Institute, 1969: 895) considered regressing $y$ = 28-day standard-cured strength (psi) against $x$ = accelerated strength (psi). Suppose the equation of the true regression line is $y = 1800 + 1.3x$.

a. What is the expected value of 28-day strength when accelerated strength = 2500?

b. By how much can we expect 28-day strength to change when accelerated strength increases by 1 psi?

c. Answer part (b) for an increase of 100 psi.

d. Answer part (b) for a decrease of 100 psi.

8. Referring to Exercise 7, suppose that the standard deviation of the random deviation $\varepsilon$ is 350 psi.

a. What is the probability that the observed value of 28-day strength will exceed 5000 psi when the value of accelerated strength is 2000?

b. Repeat part (a) with 2500 in place of 2000.

c. Consider making two independent observations on 28-day strength, the first for an accelerated strength of 2000 and the second for $x = 2500$. What is the probability that the second observation will exceed the first by more than 1000 psi?

d. Let $Y_1$ and $Y_2$ denote observations on 28-day strength when $x = x_1$ and $x = x_2$, respectively. By how much would $x_2$ have to exceed $x_1$ in order that $P(Y_2 > Y_1) = .95$?

9. The flow rate y (m$^3$/min) in a device used for air-quality measurement depends on the pressure drop x (in. of water) across the device’s filter. Suppose that for x values between 5 and 20, the two variables are related according to the simple linear regression model with true regression line $y = -.12 + .095x$.

a. What is the expected change in flow rate associated with a 1-in. increase in pressure drop? Explain.

b. What change in flow rate can be expected when pressure drop decreases by 5 in.?

c. What is the expected flow rate for a pressure drop of 10 in.? A drop of 15 in.?

d. Suppose $\sigma = .025$ and consider a pressure drop of 10 in. What is the probability that the observed value of flow rate will exceed .835? That observed flow rate will exceed .840?

e. What is the probability that an observation on flow rate when pressure drop is 10 in. will exceed an observation on flow rate made when pressure drop is 11 in.?

10. Suppose the expected cost of a production run is related to the size of the run by the equation $y = 4000 + 10x$. Let Y denote an observation on the cost of a run. If the variables size and cost are related according to the simple linear regression model, could it be the case that $P(Y > 5500 \text{ when } x = 100) = .05$ and $P(Y > 6500 \text{ when } x = 200) = .10$? Explain.

11. Suppose that in a certain chemical process the reaction time y (hr) is related to the temperature (°F) in the chamber in which the reaction takes place according to the simple linear regression model with equation $y = 5.00 - .01x$ and $\sigma = .075$.

a. What is the expected change in reaction time for a 1°F increase in temperature? For a 10°F increase in temperature?

b. What is the expected reaction time when temperature is 200°F? When temperature is 250°F?

c. Suppose five observations are made independently on reaction time, each one for a temperature of 250°F. What is the probability that all five times are between 2.4 and 2.6 hr?

d. What is the probability that two independently observed reaction times for temperatures 1° apart are such that the time at the higher temperature exceeds the time at the lower temperature?

12.2 Estimating Model Parameters

We will assume in this and the next several sections that the variables x and y are related according to the simple linear regression model. The values of $\beta_0$, $\beta_1$, and $\sigma^2$ will almost never be known to an investigator. Instead, sample data consisting of n observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ will be available, from which the model parameters and the true regression line itself can be estimated. These observations are assumed to have been obtained independently of one another. That is, $y_i$ is the observed value of $Y_i$, where $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ and the n deviations $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent rv’s. Independence of $Y_1, Y_2, \ldots, Y_n$ follows from independence of the $\varepsilon_i$’s.

According to the model, the observed points will be distributed about the true regression line in a random manner. Figure 12.6 shows a typical plot of observed pairs along with two candidates for the estimated regression line. Intuitively, the line $y = a_0 + a_1 x$ is not a reasonable estimate of the true line $y = \beta_0 + \beta_1 x$ because, if $y = a_0 + a_1 x$ were the true line, the observed points would almost surely have been closer to this line. The line $y = b_0 + b_1 x$ is a more plausible estimate because the observed points are scattered rather closely about this line.

Figure 12.6 Two different estimates of the true regression line

Figure 12.6 and the foregoing discussion suggest that our estimate of $y = \beta_0 + \beta_1 x$ should be a line that provides in some sense a best fit to the observed data points. This is what motivates the principle of least squares, which can be traced back to the German mathematician Gauss (1777–1855). According to this principle, a line provides a good fit to the data if the vertical distances (deviations) from the observed points to the line are small (see Figure 12.7). The measure of the goodness of fit is the sum of the squares of these deviations. The best-fit line is then the one having the smallest possible sum of squared deviations.
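To make the goodness-of-fit measure concrete, the following sketch computes the sum of squared vertical deviations for two candidate lines. The five data points and both candidate lines are hypothetical, invented only for illustration; the function name `sum_squared_deviations` is likewise an assumption.

```python
def sum_squared_deviations(xs, ys, b0, b1):
    """Sum over all points of (height of point - height of line y = b0 + b1*x)^2."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Two hypothetical candidate lines, given as (intercept, slope).
close_line = (0.0, 2.0)
poor_line = (4.0, 0.5)

sse_close = sum_squared_deviations(xs, ys, *close_line)
sse_poor = sum_squared_deviations(xs, ys, *poor_line)
print(f"candidate y = 0 + 2x:   sum of squared deviations = {sse_close:.2f}")
print(f"candidate y = 4 + 0.5x: sum of squared deviations = {sse_poor:.2f}")
```

Under the least squares criterion, the candidate with the smaller sum is the better fit of the two, though neither is necessarily the best possible line.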

Figure 12.7 Deviations of observed data from line $y = b_0 + b_1 x$ (time to failure (hr) versus applied stress (kg/mm$^2$))

Principle of Least Squares

The vertical deviation of the point $(x_i, y_i)$ from the line $y = b_0 + b_1 x$ is

height of point − height of line = $y_i - (b_0 + b_1 x_i)$

The sum of squared vertical deviations from the points $(x_1, y_1), \ldots, (x_n, y_n)$ to the line is then

$$f(b_0, b_1) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i)]^2$$

The point estimates of $\beta_0$ and $\beta_1$, denoted by $\hat\beta_0$ and $\hat\beta_1$ and called the least squares estimates, are those values that minimize $f(b_0, b_1)$. That is, $\hat\beta_0$ and $\hat\beta_1$ are such that $f(\hat\beta_0, \hat\beta_1) \le f(b_0, b_1)$ for any $b_0$ and $b_1$. The estimated regression line or least squares line is then the line whose equation is $y = \hat\beta_0 + \hat\beta_1 x$.

The minimizing values of $b_0$ and $b_1$ are found by taking partial derivatives of $f(b_0, b_1)$ with respect to both $b_0$ and $b_1$, equating them both to zero [analogously to $f'(b) = 0$ in univariate calculus], and solving the equations

$$\frac{\partial f(b_0, b_1)}{\partial b_0} = \sum 2(y_i - b_0 - b_1 x_i)(-1) = 0$$
$$\frac{\partial f(b_0, b_1)}{\partial b_1} = \sum 2(y_i - b_0 - b_1 x_i)(-x_i) = 0$$

Cancellation of the $-2$ factor and rearrangement gives the following system of equations, called the normal equations:

$$n b_0 + \left(\sum x_i\right) b_1 = \sum y_i$$
$$\left(\sum x_i\right) b_0 + \left(\sum x_i^2\right) b_1 = \sum x_i y_i$$

These equations are linear in the two unknowns $b_0$ and $b_1$. Provided that not all $x_i$’s are identical, the least squares estimates are the unique solution to this system.
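Because the normal equations form a 2 × 2 linear system, they can be solved directly. The sketch below does so with Cramer's rule, which is one convenient choice rather than a method prescribed by the text; the function name `solve_normal_equations` and the five data points are hypothetical. The determinant $n \sum x_i^2 - (\sum x_i)^2$ is zero exactly when all $x_i$ are identical, which is the case excluded above.

```python
def solve_normal_equations(xs, ys):
    """Solve  n*b0 + (sum x)*b1 = sum y,  (sum x)*b0 + (sum x^2)*b1 = sum xy."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    det = n * sum_xx - sum_x * sum_x   # determinant of the coefficient matrix
    if det == 0:
        raise ValueError("all x values are identical; no unique solution")

    b0 = (sum_y * sum_xx - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0_hat, b1_hat = solve_normal_equations(xs, ys)

# Both partial derivatives of f should vanish at the solution.
resid = [y - (b0_hat + b1_hat * x) for x, y in zip(xs, ys)]
print(f"b0 = {b0_hat:.4f}, b1 = {b1_hat:.4f}")
print(f"sum of deviations = {sum(resid):.2e}")
print(f"sum of x * deviation = {sum(x * r for x, r in zip(xs, resid)):.2e}")
```

The last two printed sums correspond to the two partial-derivative equations (up to the factor $-2$) and should be zero apart from floating-point rounding.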


The least squares estimate of the slope coefficient $\beta_1$ of the true regression line is

$$b_1 = \hat\beta_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}} \qquad (12.2)$$

Computing formulas for the numerator and denominator of $\hat\beta_1$ are

$$S_{xy} = \sum x_i y_i - \left(\sum x_i\right)\left(\sum y_i\right)/n \qquad S_{xx} = \sum x_i^2 - \left(\sum x_i\right)^2/n$$

The least squares estimate of the intercept $\beta_0$ of the true regression line is

$$b_0 = \hat\beta_0 = \frac{\sum y_i - \hat\beta_1 \sum x_i}{n} = \bar{y} - \hat\beta_1 \bar{x} \qquad (12.3)$$

The computational formulas for $S_{xy}$ and $S_{xx}$ require only the summary statistics $\sum x_i$, $\sum y_i$, $\sum x_i^2$, and $\sum x_i y_i$ ($\sum y_i^2$ will be needed shortly). In computing $\hat\beta_0$, use extra digits in $\hat\beta_1$ because, if $\bar{x}$ is large in magnitude, rounding will affect the final answer. In practice, the use of a statistical software package is preferable to hand calculation and hand-drawn plots. Once again, be sure that the scatter plot shows a linear pattern with relatively homogeneous variation before fitting the simple linear regression model.

Example 12.4

The cetane number is a critical property in specifying the ignition quality of a fuel used in a diesel engine. Determination of this number for a biodiesel fuel is expensive and time-consuming. The article “Relating the Cetane Number of Biodiesel Fuels to Their Fatty Acid Composition: A Critical Study” (J. of Automobile Engr., 2009: 565–583) included the following data on $x$ = iodine value (g) and $y$ = cetane number for a sample of 14 biofuels. The iodine value is the amount of iodine necessary to saturate a sample of 100 g of oil. The article’s authors fit the simple linear regression model to this data, so let’s follow their lead.

x   132.0  129.0  120.0  113.2  105.0   92.0   84.0   83.2   88.4   59.0   80.0   81.5   71.0   69.2
y    46.0   48.0   51.0   52.1   54.0   52.0   59.0   58.7   61.6   64.0   61.4   54.6   58.8   58.0

The necessary summary quantities for hand calculation can be obtained by placing the x values in a column and the y values in another column and then creating columns for $x^2$, $xy$, and $y^2$ (these latter values are not needed at the moment but will be used shortly). Calculating the column sums gives $\sum x_i = 1307.5$, $\sum y_i = 779.2$, $\sum x_i^2 = 128{,}913.93$, $\sum x_i y_i = 71{,}347.30$, $\sum y_i^2 = 43{,}745.22$, from which

$$S_{xx} = 128{,}913.93 - (1307.5)^2/14 = 6802.7693$$
$$S_{xy} = 71{,}347.30 - (1307.5)(779.2)/14 = -1424.41429$$

The estimated slope of the true regression line (i.e., the slope of the least squares line) is

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{-1424.41429}{6802.7693} = -.20938742$$


We estimate that the expected change in true average cetane number associated with a 1 g increase in iodine value is $-.209$, i.e., a decrease of .209. Since $\bar{x} = 93.392857$ and $\bar{y} = 55.657143$, the estimated intercept of the true regression line (i.e., the intercept of the least squares line) is

$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} = 55.657143 - (-.20938742)(93.392857) = 75.212432$$

The equation of the estimated regression line (least squares line) is $y = 75.212 - .2094x$, exactly that reported in the cited article. Figure 12.8 displays a scatter plot of the data with the least squares line superimposed. This line provides a very good summary of the relationship between the two variables.

Figure 12.8 Scatter plot for Example 12.4 with least squares line superimposed, from Minitab (cet num versus iod val; fitted line: cet num = 75.21 − 0.2094 iod val) ■

The estimated regression line can immediately be used for two different purposes. For a fixed x value x*, $\hat\beta_0 + \hat\beta_1 x^*$ (the height of the line above x*) gives either (1) a point estimate of the expected value of Y when $x = x^*$ or (2) a point prediction of the Y value that will result from a single new observation made at $x = x^*$.

Example 12.5

Refer back to the iodine value–cetane number scenario described in the previous example. The estimated regression equation was $y = 75.212 - .2094x$. A point estimate of true average cetane number for all biofuels whose iodine value is 100 is

$$\hat\mu_{Y \cdot 100} = \hat\beta_0 + \hat\beta_1(100) = 75.212 - .2094(100) = 54.27$$

If a single biofuel sample whose iodine value is 100 is to be selected, 54.27 is also a point prediction for the resulting cetane number. ■

The least squares line should not be used to make a prediction for an x value much beyond the range of the data, such as $x = 40$ or $x = 150$ in Example 12.4. The danger of extrapolation is that the fitted relationship (a line here) may not be valid for such x values.
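The hand calculations of Examples 12.4 and 12.5 can be reproduced in a few lines. The sketch below applies the computing formulas for $S_{xx}$, $S_{xy}$, (12.2), and (12.3) to the iodine value and cetane number data; the variable names and the helper `predict` are assumptions made for illustration, and a statistical package would normally be used instead.

```python
# Data from Example 12.4: x = iodine value (g), y = cetane number.
x = [132.0, 129.0, 120.0, 113.2, 105.0, 92.0, 84.0,
     83.2, 88.4, 59.0, 80.0, 81.5, 71.0, 69.2]
y = [46.0, 48.0, 51.0, 52.1, 54.0, 52.0, 59.0,
     58.7, 61.6, 64.0, 61.4, 54.6, 58.8, 58.0]

n = len(x)
sum_x, sum_y = sum(x), sum(y)

# Computing formulas for S_xx and S_xy.
s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n

beta1_hat = s_xy / s_xx                            # slope, equation (12.2)
beta0_hat = sum_y / n - beta1_hat * (sum_x / n)    # intercept, equation (12.3)


def predict(x_star):
    """Height of the least squares line above x_star."""
    return beta0_hat + beta1_hat * x_star


print(f"S_xx = {s_xx:.4f}, S_xy = {s_xy:.5f}")
print(f"slope = {beta1_hat:.8f}, intercept = {beta0_hat:.6f}")
print(f"estimate at iodine value 100: {predict(100):.2f}")
```

Note that `predict` carries no safeguard against extrapolation; whether an x value lies sensibly within the range of the data (59.0 to 132.0 here) remains a judgment for the user.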


DEFINITION

y � Elongation

x � Tensile force

y � Product sales

x � Advertising expenditure

(a)

0 � 1x

0 � 1x

(b)

� �

� �

Figure 12.9 Typical sample for �2: (a) small; (b) large

The fitted (or predicted) values are obtained by successively substituting into the equation of the estimated regression line:

. The residuals are the differences between the observed and fitted y values.

y1 2 ŷ1, y2 2 ŷ2, c, yn 2 ŷn

ŷ1 5 b̂0 1 b̂1x 1, ŷ2 5 b̂0 1 b̂1x 2, c, ŷn 5 b̂0 1 b̂1xn

x 1, c, xn

ŷ1, ŷ2, c, ŷn

In words, the predicted value $\hat{y}_i$ is the value of y that we would predict or expect when using the estimated regression line with $x = x_i$; $\hat{y}_i$ is the height of the estimated regression line above the value $x_i$ for which the ith observation was made. The residual $y_i - \hat{y}_i$ is the vertical deviation between the point $(x_i, y_i)$ and the least squares line: a positive number if the point lies above the line and a negative number if it lies below the line. If the residuals are all small in magnitude, then much of the variability in observed y values appears to be due to the linear relationship between x and y, whereas many large residuals suggest quite a bit of inherent variability in y relative to the amount due to the linear relation. Assuming that the line in Figure 12.7 is the least squares line, the residuals are identified by the vertical line segments from the observed points to the line. When the estimated regression line is obtained via the principle of least squares, the sum of the residuals should in theory be zero. In practice, the sum may deviate a bit from zero due to rounding.
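As a short sketch of why the residuals sum to zero, the following uses only the least squares relation $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ (the same relation used to compute $\hat{\beta}_0$ in the examples of this section), together with $\sum x_i = n\bar{x}$ and $\sum y_i = n\bar{y}$:

$$
\begin{aligned}
\sum_{i=1}^{n} (y_i - \hat{y}_i)
  &= \sum_{i=1}^{n} y_i - n\hat{\beta}_0 - \hat{\beta}_1 \sum_{i=1}^{n} x_i \\
  &= n\bar{y} - n(\bar{y} - \hat{\beta}_1 \bar{x}) - n\hat{\beta}_1 \bar{x} \\
  &= 0
\end{aligned}
$$

Any nonzero sum obtained in a hand calculation therefore reflects rounding of $\hat{\beta}_0$, $\hat{\beta}_1$, or the fitted values.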

Estimating $\sigma^2$ and $\sigma$

The parameter $\sigma^2$ determines the amount of variability inherent in the regression model. A large value of $\sigma^2$ will lead to observed $(x_i, y_i)$s that are quite spread out about the true regression line, whereas when $\sigma^2$ is small the observed points will tend to fall very close to the true line (see Figure 12.9). An estimate of $\sigma^2$ will be used in confidence interval (CI) formulas and hypothesis-testing procedures presented in the next two sections. Because the equation of the true line is unknown, the estimate is based on the extent to which the sample observations deviate from the estimated line. Many large deviations (residuals) suggest a large value of $\sigma^2$, whereas deviations all of which are small in magnitude suggest that $\sigma^2$ is small.
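The effect of $\sigma^2$ on the spread of points about the true line can be illustrated with a small simulation. This is only a sketch: the values of `beta0`, `beta1`, `n`, the range of x, and the seed are arbitrary illustrative assumptions, and the true line is known here only because the data are simulated (in practice it is unknown, which is why the estimate must be based on the estimated line).

```python
import random

def mean_sq_deviation_about_true_line(sigma, n=50, beta0=10.0, beta1=2.0, seed=1):
    """Simulate n points from Y = beta0 + beta1*x + eps with eps ~ N(0, sigma^2)
    and return the average squared vertical deviation from the TRUE line.

    beta0, beta1, n, and seed are arbitrary illustrative choices.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys)) / n

small = mean_sq_deviation_about_true_line(sigma=0.5)  # true sigma^2 = 0.25
large = mean_sq_deviation_about_true_line(sigma=5.0)  # true sigma^2 = 25
print(f"sigma = 0.5: average squared deviation = {small:.3f}")
print(f"sigma = 5.0: average squared deviation = {large:.3f}")
```

The second average should be far larger than the first, mirroring the contrast between panels (a) and (b) of Figure 12.9.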

Example 12.6 Japan’s high population density has resulted in a multitude of resource-usage problems. One especially serious difficulty concerns waste removal. The article “Innovative Sludge Handling Through Pelletization Thickening” (Water Research, 1999: 3245–3252) reported the development of a new compression machine for processing sewage sludge. An important part of the investigation involved relating the moisture content of compressed pellets (y, in %) to the machine’s filtration rate (x, in kg-DS/m/hr). The following data was read from a graph in the article:

x 125.3 98.2 201.4 147.3 145.9 124.7 112.2 120.2 161.2 178.9

y 77.9 76.8 81.5 79.8 78.2 78.3 77.5 77.0 80.1 80.2

x 159.5 145.8 75.1 151.4 144.2 125.0 198.8 132.5 159.6 110.7

y 79.9 79.0 76.7 78.2 79.5 78.1 81.5 77.0 79.0 78.6

Relevant summary quantities (summary statistics) are $\sum x_i = 2817.9$, $\sum y_i = 1574.8$, $\sum x_i^2 = 415{,}949.85$, $\sum x_i y_i = 222{,}657.88$, and $\sum y_i^2 = 124{,}039.58$, from which $\bar{x} = 140.895$, $\bar{y} = 78.74$, $S_{xx} = 18{,}921.8295$, and $S_{xy} = 776.434$. Thus

$$\hat{\beta}_1 = \frac{776.434}{18{,}921.8295} = .04103377 \approx .041$$

$$\hat{\beta}_0 = 78.74 - (.04103377)(140.895) = 72.958547 \approx 72.96$$

from which the equation of the least squares line is $y = 72.96 + .041x$. For numerical accuracy, the fitted values are calculated from $\hat{y}_i = 72.958547 + .04103377x_i$:

$$\hat{y}_1 = 72.958547 + .04103377(125.3) \approx 78.100, \quad y_1 - \hat{y}_1 \approx -.200, \quad \text{etc.}$$

Nine of the 20 residuals are negative, so the corresponding nine points in a scatter plot of the data lie below the estimated regression line. All predicted values (fits) and residuals appear in the accompanying table.

Obs  Filtrate  Moistcon     Fit  Residual
  1     125.3      77.9  78.100    -0.200
  2      98.2      76.8  76.988    -0.188
  3     201.4      81.5  81.223     0.277
  4     147.3      79.8  79.003     0.797
  5     145.9      78.2  78.945    -0.745
  6     124.7      78.3  78.075     0.225
  7     112.2      77.5  77.563    -0.063
  8     120.2      77.0  77.891    -0.891
  9     161.2      80.1  79.573     0.527
 10     178.9      80.2  80.299    -0.099
 11     159.5      79.9  79.503     0.397
 12     145.8      79.0  78.941     0.059
 13      75.1      76.7  76.040     0.660
 14     151.4      78.2  79.171    -0.971
 15     144.2      79.5  78.876     0.624
 16     125.0      78.1  78.088     0.012
 17     198.8      81.5  81.116     0.384
 18     132.5      77.0  78.396    -1.396
 19     159.6      79.0  79.508    -0.508
 20     110.7      78.6  77.501     1.099
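The calculations of Example 12.6 can be reproduced with a few lines of Python. The following is a minimal sketch using only the standard library; the function name `least_squares_line` is an illustrative choice, not something from the text.

```python
# Filtration rate (x) and moisture content (y) from Example 12.6
x = [125.3, 98.2, 201.4, 147.3, 145.9, 124.7, 112.2, 120.2, 161.2, 178.9,
     159.5, 145.8, 75.1, 151.4, 144.2, 125.0, 198.8, 132.5, 159.6, 110.7]
y = [77.9, 76.8, 81.5, 79.8, 78.2, 78.3, 77.5, 77.0, 80.1, 80.2,
     79.9, 79.0, 76.7, 78.2, 79.5, 78.1, 81.5, 77.0, 79.0, 78.6]

def least_squares_line(x, y):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares_line(x, y)
fits = [b0 + b1 * xi for xi in x]                  # fitted (predicted) values
residuals = [yi - fi for yi, fi in zip(y, fits)]   # observed minus fitted
num_negative = sum(1 for r in residuals if r < 0)

print(f"b0 = {b0:.6f}, b1 = {b1:.8f}")
print(f"negative residuals: {num_negative} of {len(residuals)}")
print(f"sum of residuals: {sum(residuals):.2e}")
```

Because the coefficients are carried at full floating-point precision, the sum of the residuals is zero up to floating-point error.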


In much the same way that the deviations from the mean in a one-sample situation were combined to obtain the estimate $s^2 = \sum (x_i - \bar{x})^2/(n - 1)$, the estimate of $\sigma^2$ in regression analysis is based on squaring and summing the residuals. We will continue to use the symbol $s^2$ for this estimated variance, so don’t confuse it with our previous $s^2$.

DEFINITION The error sum of squares (equivalently, residual sum of squares), denoted by SSE, is

$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum [y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2$$

and the estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = s^2 = \frac{\text{SSE}}{n - 2} = \frac{\sum (y_i - \hat{y}_i)^2}{n - 2}$$

The divisor $n - 2$ in $s^2$ is the number of degrees of freedom (df) associated with SSE and the estimate $s^2$. This is because to obtain $s^2$, the two parameters $\beta_0$ and $\beta_1$ must first be estimated, which results in a loss of 2 df (just as $\mu$ had to be estimated in one-sample problems, resulting in an estimated variance based on $n - 1$ df). Replacing each $y_i$ in the formula for $s^2$ by the rv $Y_i$ gives the estimator $S^2$. It can be shown that $S^2$ is an unbiased estimator for $\sigma^2$ (though the estimator $S$ is not unbiased for $\sigma$). An interpretation of $s$ here is similar to what we suggested earlier for the sample standard deviation: Very roughly, it is the size of a typical vertical deviation within the sample from the estimated regression line.

Example 12.7 The residuals for the filtration rate–moisture content data were calculated previously. The corresponding error sum of squares is

$$\text{SSE} = (-.200)^2 + (-.188)^2 + \cdots + (1.099)^2 = 7.968$$

The estimate of $\sigma^2$ is then $\hat{\sigma}^2 = s^2 = 7.968/(20 - 2) = .4427$, and the estimated standard deviation is $\hat{\sigma} = s = \sqrt{.4427} = .665$. Roughly speaking, .665 is the magnitude of a typical deviation from the estimated regression line; some points are closer to the line than this and others are further away. ■

Computation of SSE from the defining formula involves much tedious arithmetic, because both the predicted values and residuals must first be calculated. Use of the following computational formula does not require these quantities.

$$\text{SSE} = \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i$$

This expression results from substituting $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ into $\sum (y_i - \hat{y}_i)^2$, squaring the summand, carrying through the sum to the resulting three terms, and simplifying. This computational formula is especially sensitive to the effects of rounding in $\hat{\beta}_0$ and $\hat{\beta}_1$, so carrying as many digits as possible in intermediate computations will protect against round-off error.
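The two routes to SSE, and the sensitivity of the computational formula to rounding, can be checked numerically. The sketch below reuses the filtration rate–moisture content data of Example 12.6; the variable names and the choice of three decimal places for the rounded coefficients are illustrative assumptions.

```python
from math import sqrt

# Filtration rate (x) and moisture content (y) from Example 12.6
x = [125.3, 98.2, 201.4, 147.3, 145.9, 124.7, 112.2, 120.2, 161.2, 178.9,
     159.5, 145.8, 75.1, 151.4, 144.2, 125.0, 198.8, 132.5, 159.6, 110.7]
y = [77.9, 76.8, 81.5, 79.8, 78.2, 78.3, 77.5, 77.0, 80.1, 80.2,
     79.9, 79.0, 76.7, 78.2, 79.5, 78.1, 81.5, 77.0, 79.0, 78.6]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_yy = sum(yi * yi for yi in y)

s_xx = sum_xx - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
b1 = s_xy / s_xx
b0 = sum_y / n - b1 * sum_x / n

# Defining formula: square and sum the residuals
sse_def = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

# Computational formula with full-precision coefficients
sse_comp = sum_yy - b0 * sum_y - b1 * sum_xy

# Same formula after rounding the coefficients to three decimal places
b0_r, b1_r = round(b0, 3), round(b1, 3)
sse_rounded = sum_yy - b0_r * sum_y - b1_r * sum_xy

s2 = sse_def / (n - 2)   # estimate of sigma^2, based on n - 2 df
s = sqrt(s2)             # estimate of sigma

print(f"SSE (defining formula)      = {sse_def:.3f}")
print(f"SSE (computational formula) = {sse_comp:.3f}")
print(f"SSE (rounded coefficients)  = {sse_rounded:.3f}")
print(f"s^2 = {s2:.4f}, s = {s:.3f}")
```

With full-precision coefficients the two formulas agree; with the rounded coefficients the computational formula gives a noticeably different value, which is the round-off effect the text warns about.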


Example 12.8 The article “Promising Quantitative Nondestructive Evaluation Techniques for Composite Materials” (Materials Evaluation, 1985: 561–565) reports on a study to investigate how the propagation of an ultrasonic stress wave through a substance depends on the properties of the substance. The accompanying data on fracture strength (x, as a percentage of ultimate tensile strength) and attenuation (y, in neper/cm, the decrease in amplitude of the stress wave) in fiberglass-reinforced polyester composites was read from a graph that appeared in the article. The simple linear regression model is suggested by the substantial linear pattern in the scatter plot.

x 12 30 36 40 45 57 62 67 71 78 93 94 100 105

y 3.3 3.2 3.4 3.0 2.8 2.9 2.7 2.6 2.5 2.6 2.2 2.0 2.3 2.1

The necessary summary quantities are $n = 14$, $\sum x_i = 890$, $\sum x_i^2 = 67{,}182$, $\sum y_i = 37.6$, $\sum y_i^2 = 103.54$, and $\sum x_i y_i = 2234.30$, from which $S_{xx} = 10{,}603.4285714$, $S_{xy} = -155.98571429$, $\hat{\beta}_1 = -.0147109$, and $\hat{\beta}_0 = 3.6209072$. Then

$$\text{SSE} = 103.54 - (3.6209072)(37.6) - (-.0147109)(2234.30) = .2624532$$

so $s^2 = .2624532/12 = .0218711$ and $s = .1479$. When $\hat{\beta}_0$ and $\hat{\beta}_1$ are rounded to three decimal places in the computational formula for SSE, the result is

$$\text{SSE} = 103.54 - (3.621)(37.6) - (-.015)(2234.30) = .905$$

which is more than three times the correct value. ■

The Coefficient of Determination

Figure 12.10 shows three different scatter plots of bivariate data. In all three plots, the heights of the different points vary substantially, indicating that there is much variability in observed y values. The points in the first plot all fall exactly on a straight line. In this case, all (100%) of the sample variation in y can be attributed to the fact that x and y are linearly related in combination with variation in x. The points in Figure 12.10(b) do not fall exactly on a line, but compared to overall y variability, the deviations from the least squares line are small. It is reasonable to conclude in this case that much of the observed y variation can be attributed to the approximate linear relationship between the variables postulated by the simple linear regression model. When the scatter plot looks like that of Figure 12.10(c), there is substantial variation about the least squares line relative to overall y variation, so the simple linear regression model fails to explain variation in y by relating y to x.

[Figure 12.10 Using the model to explain y variation: (a) data for which all variation is explained; (b) data for which most variation is explained; (c) data for which little variation is explained. Each panel is a scatter plot of y against x.]


The error sum of squares SSE can be interpreted as a measure of how much variation in y is left unexplained by the model; that is, how much cannot be attributed to a linear relationship. In Figure 12.10(a), SSE = 0, and there is no unexplained variation, whereas unexplained variation is small for the data of Figure 12.10(b) and much larger in Figure 12.10(c). A quantitative measure of the total amount of variation in observed y values is given by the total sum of squares

$$\text{SST} = S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \left(\sum y_i\right)^2/n$$

Total sum of squares is the sum of squared deviations about the sample mean of the observed y values. Thus the same number $\bar{y}$ is subtracted from each $y_i$ in SST, whereas SSE involves subtracting each different predicted value $\hat{y}_i$ from the corresponding observed $y_i$. Just as SSE is the sum of squared deviations about the least squares line $y = \hat{\beta}_0 + \hat{\beta}_1 x$, SST is the sum of squared deviations about the horizontal line at height $\bar{y}$ (since then vertical deviations are $y_i - \bar{y}$), as pictured in Figure 12.11. Furthermore, because the sum of squared deviations about the least squares line is smaller than the sum of squared deviations about any other line, SSE < SST unless the horizontal line itself is the least squares line. The ratio SSE/SST is the proportion of total variation that cannot be explained by the simple linear regression model, and $1 - \text{SSE}/\text{SST}$ (a number between 0 and 1) is the proportion of observed y variation explained by the model.

[Figure 12.11 Sums of squares illustrated: (a) SSE = sum of squared deviations about the least squares line; (b) SST = sum of squared deviations about the horizontal line at height $\bar{y}$]

DEFINITION The coefficient of determination, denoted by $r^2$, is given by

$$r^2 = 1 - \frac{\text{SSE}}{\text{SST}}$$

It is interpreted as the proportion of observed y variation that can be explained by the simple linear regression model (attributed to an approximate linear relationship between y and x).

The higher the value of $r^2$, the more successful is the simple linear regression model in explaining y variation. When regression analysis is done by a statistical computer package, either $r^2$ or $100r^2$ (the percentage of variation explained by the model) is a prominent part of the output. If $r^2$ is small, an analyst will usually want to search for an alternative model (either a nonlinear model or a multiple regression model that involves more than a single independent variable) that can more effectively explain y variation.
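As a minimal sketch of how SSE, SST, and $r^2$ fit together, the following computes all three for the fracture strength–attenuation data of Example 12.8. The function name `coefficient_of_determination` and its return convention are illustrative assumptions, not part of the text.

```python
def coefficient_of_determination(x, y):
    """Return (r2, sse, sst) for the least squares line fitted to (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    # SSE: squared deviations about the least squares line
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # SST: squared deviations about the horizontal line at height y_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst, sse, sst

# Fracture strength (x) and attenuation (y) from Example 12.8
x = [12, 30, 36, 40, 45, 57, 62, 67, 71, 78, 93, 94, 100, 105]
y = [3.3, 3.2, 3.4, 3.0, 2.8, 2.9, 2.7, 2.6, 2.5, 2.6, 2.2, 2.0, 2.3, 2.1]

r2, sse, sst = coefficient_of_determination(x, y)
print(f"SSE = {sse:.7f}, SST = {sst:.6f}, r^2 = {r2:.3f}")
```

The SSE printed here should agree with the full-precision value obtained in Example 12.8, and $r^2$ necessarily falls between 0 and 1 because SSE cannot exceed SST.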


Example 12.9 The scatter plot of the iodine value–cetane number data in Figure 12.8 portends a reasonably high $r^2$ value. With

$$\hat{\beta}_0 = 75.212432 \quad \hat{\beta}_1 = -.20938742 \quad \sum y_i = 779.2 \quad \sum x_i y_i = 71{,}347.30 \quad \sum y_i^2 = 43{,}745.22$$

we have

$$\text{SST} = 43{,}745.22 - (779.2)^2/14 = 377.174$$

$$\text{SSE} = 43{,}745.22 - (75.212432)(779.2) - (-.20938742)(71{,}347.30) = 78.920$$

The coefficient of determination is then

$$r^2 = 1 - \text{SSE}/\text{SST} = 1 - (78.920)/(377.174) = .791$$

That is, 79.1% of the observed variation in cetane number is attributable to (can be explained by) the simple linear regression relationship between cetane number and iodine value ($r^2$ values are even higher than this in many scientific contexts, but social scientists would typically be ecstatic at a value anywhere near this large!).

Figure 12.12 shows partial Minitab output from the regression of cetane number on iodine value. The software will also provide predicted values, residuals, and other information upon request. The formats used by other packages differ slightly from that of Minitab, but the information content is very similar. Regression sum of squares will be introduced shortly. Other quantities in Figure 12.12 that have not yet been discussed will surface in Section 12.3 [excepting R-Sq(adj), which comes into play in Chapter 13 when multiple regression models are introduced].

The regression equation is
cet num = 75.2 - 0.209 iod val

Predictor      Coef  SE Coef      T      P
Constant     75.212    2.984  25.21  0.000
iod val    -0.20939  0.03109  -6.73  0.000

s = 2.56450   R-sq = 79.1%   R-sq(adj) = 77.3%

Analysis of Variance

SOURCE      DF      SS      MS      F      P
Regression   1  298.25  298.25  45.35  0.000
Error       12   78.92    6.58
Total       13  377.17

(In the figure, the two Coef entries are labeled $\hat{\beta}_0$ and $\hat{\beta}_1$, R-sq is labeled $100r^2$, the Error SS is labeled SSE, and the Total SS is labeled SST.)

Figure 12.12 Minitab output for the regression of Examples 12.4 and 12.9 ■

The coefficient of determination can be written in a slightly different way by introducing a third sum of squares, the regression sum of squares SSR, given by $\text{SSR} = \sum (\hat{y}_i - \bar{y})^2 = \text{SST} - \text{SSE}$. Regression sum of squares is interpreted as the amount of total variation that is explained by the model. Then we have

$$r^2 = 1 - \text{SSE}/\text{SST} = (\text{SST} - \text{SSE})/\text{SST} = \text{SSR}/\text{SST}$$

the ratio of explained variation to total variation. The ANOVA table in Figure 12.12 shows that SSR = 298.25, from which $r^2 = 298.25/377.17 = .791$ as before.

Terminology and Scope of Regression Analysis

The term regression analysis was first used by Francis Galton in the late nineteenth century in connection with his work on the relationship between father’s height x and son’s height y. After collecting a number of pairs $(x_i, y_i)$, Galton used the principle of least squares to obtain the equation of the estimated regression line, with the objective of using it to predict son’s height from father’s height. In using the derived line, Galton found that if a father was above average in height, the son would also be expected to be above average in height, but not by as much as the father was. Similarly, the son of a shorter-than-average father would also be expected to be shorter than average, but not by as much as the father. Thus the predicted height of a son was “pulled back in” toward the mean; because regression means a coming or going back, Galton adopted the terminology regression line. This phenomenon of being pulled back in toward the mean has been observed in many other situations (e.g., batting averages from year to year in baseball) and is called the regression effect.

Our discussion thus far has presumed that the independent variable is under the control of the investigator, so that only the dependent variable Y is random. This was not, however, the case with Galton’s experiment; fathers’ heights were not preselected, but instead both X and Y were random. Methods and conclusions of regression analysis can be applied both when the values of the independent variable are fixed in advance and when they are random, but because the derivations and interpretations are more straightforward in the former case, we will continue to work explicitly with it. For more commentary, see the excellent book by John Neter et al. listed in the chapter bibliography.

EXERCISES Section 12.2 (12–29)

12. Exercise 4 gave data on $x$ = BOD mass loading and $y$ = BOD mass removal. Values of relevant summary quantities are

$$n = 14 \quad \sum x_i = 517 \quad \sum y_i = 346 \quad \sum x_i^2 = 39{,}095 \quad \sum y_i^2 = 17{,}454 \quad \sum x_i y_i = 25{,}825$$

a. Obtain the equation of the least squares line.
b. Predict the value of BOD mass removal for a single observation made when BOD mass loading is 35, and calculate the value of the corresponding residual.
c. Calculate SSE and then a point estimate of $\sigma$.
d. What proportion of observed variation in removal can be explained by the approximate linear relationship between the two variables?
e. The last two x values, 103 and 142, are much larger than the others. How are the equation of the least squares line and the value of $r^2$ affected by deletion of the two corresponding observations from the sample? Adjust the given values of the summary quantities, and use the fact that the new value of SSE is 311.79.

13. The accompanying data on $x$ = current density (mA/cm$^2$) and $y$ = rate of deposition ($\mu$m/min) appeared in the article “Plating of 60/40 Tin/Lead Solder for Head Termination Metallurgy” (Plating and Surface Finishing, Jan. 1997: 38–40). Do you agree with the claim by the article’s author that “a linear relationship was obtained from the tin-lead rate of deposition as a function of current density”? Explain your reasoning.

x 20 40 60 80
y .24 1.20 1.71 2.22

14. Refer to the tank temperature–efficiency ratio data given in Exercise 1.
a. Determine the equation of the estimated regression line.
b. Calculate a point estimate for true average efficiency ratio when tank temperature is 182.
c. Calculate the values of the residuals from the least squares line for the four observations for which temperature is 182. Why do they not all have the same sign?
d. What proportion of the observed variation in efficiency ratio can be attributed to the simple linear regression relationship between the two variables?

15. Values of modulus of elasticity (MOE, the ratio of stress, i.e., force per unit area, to strain, i.e., deformation per unit length, in GPa) and flexural strength (a measure of the ability to resist failure in bending, in MPa) were determined for a sample of concrete beams of a certain type, resulting in the following data (read from a graph in the article “Effects of Aggregates and Microfillers on the Flexural Properties of Concrete,” Magazine of Concrete Research, 1997: 81–98):

MOE      29.8 33.2 33.7 35.3 35.5 36.1 36.2
Strength 5.9 7.2 7.3 6.3 8.1 6.8 7.0

MOE      36.3 37.5 37.7 38.7 38.8 39.6 41.0
Strength 7.6 6.8 6.5 7.0 6.3 7.9 9.0

MOE      42.8 42.8 43.5 45.6 46.0 46.9 48.0
Strength 8.2 8.7 7.8 9.7 7.4 7.7 9.7

MOE      49.3 51.7 62.6 69.8 79.5 80.0
Strength 7.8 7.7 11.6 11.3 11.8 10.7

a. Construct a stem-and-leaf display of the MOE values, and comment on any interesting features.
b. Is the value of strength completely and uniquely determined by the value of MOE? Explain.
c. Use the accompanying Minitab output to obtain the equation of the least squares line for predicting strength from modulus of elasticity, and then predict strength for a beam whose modulus of elasticity is 40. Would you feel comfortable using the least squares line to predict strength when modulus of elasticity is 100? Explain.

Predictor     Coef    Stdev  t-ratio      P
Constant    3.2925   0.6008     5.48  0.000
mod elas   0.10748  0.01280     8.40  0.000

s = 0.8657   R-sq = 73.8%   R-sq(adj) = 72.8%

Analysis of Variance

SOURCE      DF      SS      MS      F      P
Regression   1  52.870  52.870  70.55  0.000
Error       25  18.736   0.749
Total       26  71.605

d. What are the values of SSE, SST, and the coefficient of determination? Do these values suggest that the simple linear regression model effectively describes the relationship between the two variables? Explain.

16. The article “Characterization of Highway Runoff in Austin, Texas, Area” (J. of Envir. Engr., 1998: 131–137) gave a scatter plot, along with the least squares line, of $x$ = rainfall volume (m$^3$) and $y$ = runoff volume (m$^3$) for a particular location. The accompanying values were read from the plot.

x 5 12 14 17 23 30 40 47
y 4 10 13 15 15 25 27 46

x 55 67 72 81 96 112 127
y 38 46 53 70 82 99 100

a. Does a scatter plot of the data support the use of the simple linear regression model?
b. Calculate point estimates of the slope and intercept of the population regression line.
c. Calculate a point estimate of the true average runoff volume when rainfall volume is 50.
d. Calculate a point estimate of the standard deviation $\sigma$.
e. What proportion of the observed variation in runoff volume can be attributed to the simple linear regression relationship between runoff and rainfall?

17. No-fines concrete, made from a uniformly graded coarse aggregate and a cement-water paste, is beneficial in areas prone to excessive rainfall because of its excellent drainage properties. The article “Pavement Thickness Design for No-Fines Concrete Parking Lots” (J. of Trans. Engr., 1995: 476–484) employed a least squares analysis in studying how $y$ = porosity (%) is related to $x$ = unit weight (pcf) in concrete specimens. Consider the following representative data:

x 99.0 101.1 102.7 103.0 105.4 107.0 108.7 110.8
y 28.8 27.9 27.0 25.2 22.8 21.5 20.9 19.6

x 112.1 112.4 113.6 113.8 115.1 115.4 120.0
y 17.1 18.9 16.0 16.7 13.0 13.6 10.8

Relevant summary quantities are $\sum x_i = 1640.1$, $\sum y_i = 299.8$, $\sum x_i^2 = 179{,}849.73$, $\sum x_i y_i = 32{,}308.59$, $\sum y_i^2 = 6430.06$.

a. Obtain the equation of the estimated regression line. Then create a scatter plot of the data and graph the estimated line. Does it appear that the model relationship will explain a great deal of the observed variation in y?
b. Interpret the slope of the least squares line.
c. What happens if the estimated line is used to predict porosity when unit weight is 135? Why is this not a good idea?
d. Calculate the residuals corresponding to the first two observations.
e. Calculate and interpret a point estimate of $\sigma$.
f. What proportion of observed variation in porosity can be attributed to the approximate linear relationship between unit weight and porosity?

18. For the past decade, rubber powder has been used in asphalt cement to improve performance. The article “Experimental Study of Recycled Rubber-Filled High-Strength Concrete” (Magazine of Concrete Res., 2009: 549–556) includes a regression of $y$ = axial strength (MPa) on $x$ = cube strength (MPa) based on the following sample data:

x 112.3 97.0 92.7 86.0 102.0 99.2 95.8 103.5 89.0 86.7
y 75.0 71.0 57.7 48.7 74.3 73.3 68.0 59.3 57.8 48.5

a. Obtain the equation of the least squares line, and interpret its slope.
b. Calculate and interpret the coefficient of determination.
c. Calculate and interpret an estimate of the error standard deviation $\sigma$ in the simple linear regression model.

19. The following data is representative of that reported in the article “An Experimental Correlation of Oxides of Nitrogen Emissions from Power Boilers Based on Field Data” (J. of Engr. for Power, July 1973: 165–170), with $x$ = burner-area liberation rate (MBtu/hr-ft$^2$) and $y$ = NO$_x$ emission rate (ppm):


x 100 125 125 150 150 200 200

y 150 140 180 210 190 320 280

x 250 250 300 300 350 400 400

y 400 430 440 390 600 610 670

a. Assuming that the simple linear regression model is valid, obtain the least squares estimate of the true regression line.

b. What is the estimate of expected NOx emission rate when burner area liberation rate equals 225?

c. Estimate the amount by which you expect NOx emission rate to change when burner area liberation rate is decreased by 50.

d. Would you use the estimated regression line to predict emission rate for a liberation rate of 500? Why or why not?

20. A number of studies have shown lichens (certain plants composed of an alga and a fungus) to be excellent bioindicators of air pollution. The article “The Epiphytic Lichen Hypogymnia Physodes as a Biomonitor of Atmospheric Nitrogen and Sulphur Deposition in Norway” (Envir. Monitoring and Assessment, 1993: 27–47) gives the following data (read from a graph) on $x$ = NO$_3^-$ wet deposition (g N/m$^2$) and $y$ = lichen N (% dry weight):

x .05 .10 .11 .12 .31 .37 .42
y .48 .55 .48 .50 .58 .52 1.02

x .58 .68 .68 .73 .85 .92
y .86 .86 1.00 .88 1.04 1.70

The author used simple linear regression to analyze the data. Use the accompanying Minitab output to answer the following questions:
a. What are the least squares estimates of $\beta_0$ and $\beta_1$?
b. Predict lichen N for an NO$_3^-$ deposition value of .5.
c. What is the estimate of $\sigma$?
d. What is the value of total variation, and how much of it can be explained by the model relationship?

The regression equation is
lichen N = 0.365 + 0.967 no3 depo

Predictor     Coef    Stdev  t-ratio      P
Constant   0.36510  0.09904     3.69  0.004
no3 depo    0.9668   0.1829     5.29  0.000

s = 0.1932   R-sq = 71.7%   R-sq(adj) = 69.2%

Analysis of Variance

SOURCE      DF      SS      MS      F      P
Regression   1  1.0427  1.0427  27.94  0.000
Error       11  0.4106  0.0373
Total       12  1.4533

21. Wrinkle recovery angle and tensile strength are the two most important characteristics for evaluating the performance of crosslinked cotton fabric. An increase in the degree of crosslinking, as determined by ester carboxyl band absorbance, improves the wrinkle resistance of the fabric (at the expense of reducing mechanical strength). The accompanying data on $x$ = absorbance and $y$ = wrinkle resistance angle was read from a graph in the paper “Predicting the Performance of Durable Press Finished Cotton Fabric with Infrared Spectroscopy” (Textile Res. J., 1999: 145–151).

x .115 .126 .183 .246 .282 .344 .355 .452 .491 .554 .651
y 334 342 355 363 365 372 381 392 400 412 420

Here is regression output from Minitab:

Predictor     Coef  SE Coef       T      P
Constant   321.878    2.483  129.64  0.000
absorb     156.711    6.464   24.24  0.000

S = 3.60498   R-Sq = 98.5%   R-Sq(adj) = 98.3%

Source          DF      SS      MS       F      P
Regression       1  7639.0  7639.0  587.81  0.000
Residual Error   9   117.0    13.0
Total           10  7756.0

a. Does the simple linear regression model appear to be appropriate? Explain.
b. What wrinkle resistance angle would you predict for a fabric specimen having an absorbance of .300?
c. What would be the estimate of expected wrinkle resistance angle when absorbance is .300?

22. Calcium phosphate cement is gaining increasing attention for use in bone repair applications. The article “Short-Fibre Reinforcement of Calcium Phosphate Bone Cement” (J. of Engr. in Med., 2007: 203–211) reported on a study in which polypropylene fibers were used in an attempt to improve fracture behavior. The following data on $x$ = fiber weight (%) and $y$ = compressive strength (MPa) was provided by the article’s authors.

x 0.00 0.00 0.00 0.00 0.00 1.25 1.25 1.25 1.25
y 9.94 11.67 11.00 13.44 9.20 9.92 9.79 10.99 11.32

x 2.50 2.50 2.50 2.50 2.50 5.00 5.00 5.00 5.00
y 12.29 8.69 9.91 10.45 10.25 7.89 7.61 8.07 9.04

x 7.50 7.50 7.50 7.50 10.00 10.00 10.00 10.00
y 6.63 6.43 7.03 7.63 7.35 6.94 7.02 7.67

a. Fit the simple linear regression model to this data. Then determine the proportion of observed variation in strength that can be attributed to the model relationship between strength and fiber weight. Finally, obtain a point estimate of the standard deviation of $\epsilon$, the random deviation in the model equation.
b. The average strength values for the six different levels of fiber weight are 11.05, 10.51, 10.32, 8.15, 6.93, and 7.24, respectively. The cited paper included a figure in which the average strength was regressed against fiber weight. Obtain the equation of this regression line and calculate the corresponding coefficient of determination. Explain the difference between the $r^2$ value for this regression and the $r^2$ value obtained in (a).

23. a. Obtain SSE for the data in Exercise 19 from the defining formula [$\text{SSE} = \sum (y_i - \hat{y}_i)^2$], and compare to the value calculated from the computational formula.
b. Calculate the value of total sum of squares. Does the simple linear regression model appear to do an effective job of explaining variation in emission rate? Justify your assertion.

24. The accompanying data was read from a graph that appeared in the article “Reactions on Painted Steel Under the Influence of Sodium Chloride, and Combinations Thereof” (Ind. Engr. Chem. Prod. Res. Dev., 1985: 375–378). The independent variable is SO$_2$ deposition rate (mg/m$^2$/d), and the dependent variable is steel weight loss (g/m$^2$).

for which the true regression line passes through (0, 0). The appropriate model is $Y = \beta_1 x + \epsilon$. Assume that $(x_1, y_1), \ldots, (x_n, y_n)$ are observed pairs generated from this model, and derive the least squares estimator of $\beta_1$. [Hint: Write the sum of squared deviations as a function of $b_1$, a trial value, and use calculus to find the minimizing value of $b_1$.]

28. a. Consider the data in Exercise 20. Suppose that instead of the least squares line passing through the points $(x_1, y_1), \ldots, (x_n, y_n)$, we wish the least squares line passing through $(x_1 - \bar{x}, y_1), \ldots, (x_n - \bar{x}, y_n)$. Construct a scatter plot of the $(x_i, y_i)$ points and then of the $(x_i - \bar{x}, y_i)$ points. Use the plots to explain intuitively how the two least squares lines are related to one another.
b. Suppose that instead of the model $Y_i = \beta_0 + \beta_1 x_i + \epsilon_i$ $(i = 1, \ldots, n)$, we wish to fit a model of the form $Y_i = \beta_0^* + \beta_1^*(x_i - \bar{x}) + \epsilon_i$ $(i = 1, \ldots, n)$. What are the least squares estimators of $\beta_0^*$ and $\beta_1^*$, and how do they relate to $\hat{\beta}_0$ and $\hat{\beta}_1$?

29. Consider the following three data sets, in which the variables of interest are $x$ = commuting distance and $y$ = commuting time. Based on a scatter plot and the values of $s$ and $r^2$, in which situation would simple linear regression be most (least) effective, and why?

Data Set               1                2                3
                   x     y          x     y          x     y
                  15    42          5    16          5     8
                  16    35         10    32         10    16
                  17    45         15    44         15    22
                  18    42         20    45         20    23
                  19    49         25    63         25    31
                  20    46         50   115         50    60

$S_{xx}$               17.50        1270.8333        1270.8333
$S_{xy}$               29.50        2722.5           1431.6667
$\hat{\beta}_1$     1.685714         2.142295         1.126557
$\hat{\beta}_0$    13.666672         7.868852         3.196729
SST                   114.83        5897.5           1627.33
SSE                    65.10          65.10            14.48

12.3 Inferences About the Slope Parameter b1 In virtually all of our inferential work thus far, the notion of sampling variability has been pervasive. In particular, properties of sampling distributions of various statistics have been the basis for developing confidence interval formulas and hypothesis-testing methods. The key idea here is that the value of any quantity calculated from sample data—the value of any statistic—will vary from one sample to another.

x 14 18 40 43 45 112

y 280 350 470 500 560 1200

a. Construct a scatter plot. Does the simple linear regres- sion model appear to be reasonable in this situation?

b. Calculate the equation of the estimated regression line. c. What percentage of observed variation in steel weight

loss can be attributed to the model relationship in com- bination with variation in deposition rate?

d. Because the largest x value in the sample greatly exceeds the others, this observation may have been very influential in determining the equation of the estimated line. Delete this observation and recalculate the equa- tion. Does the new equation appear to differ substan- tially from the original one (you might consider predicted values)?

25. Show that b1 and b0 of expressions (12.2) and (12.3) satisfy the normal equations.

26. Show that the “point of averages” lies on the estimated regression line.

27. Suppose an investigator has data on the amount of shelf space x devoted to display of a particular product and sales revenue y for that product. The investigator may wish to fit a model

(x, y)


Example 12.10 Reconsider the data on $x$ = burner area liberation rate and $y$ = $\mathrm{NO}_x$ emission rate from Exercise 12.19 in the previous section. There are 14 observations, made at the x values 100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, and 400, respectively. Suppose that the slope and intercept of the true regression line are $\beta_1 = 1.70$ and $\beta_0 = -50$, with $\sigma = 35$ (consistent with the values $\hat\beta_1 = 1.7114$, $\hat\beta_0 = -45.55$, $s = 36.75$). We proceeded to generate a sample of random deviations $\tilde\varepsilon_1, \ldots, \tilde\varepsilon_{14}$ from a normal distribution with mean 0 and standard deviation 35 and then added $\tilde\varepsilon_i$ to $\beta_0 + \beta_1 x_i$ to obtain 14 corresponding y values. Regression calculations were then carried out to obtain the estimated slope, intercept, and standard deviation. This process was repeated a total of 20 times, resulting in the values given in Table 12.1.

Table 12.1 Simulation Results for Example 12.10

       b1-hat    b0-hat    s               b1-hat    b0-hat    s
  1.   1.7559    -60.62    43.23     11.   1.7843    -67.36    41.80
  2.   1.6400    -49.40    30.69     12.   1.5822    -28.64    32.46
  3.   1.4699     -4.80    36.26     13.   1.8194    -83.99    40.80
  4.   1.6944    -41.95    22.89     14.   1.6469    -32.03    28.11
  5.   1.4497      5.80    36.84     15.   1.7712    -52.66    33.04
  6.   1.7309    -70.01    39.56     16.   1.7004    -58.06    43.44
  7.   1.8890    -95.01    42.37     17.   1.6103    -27.89    25.60
  8.   1.6471    -40.30    43.71     18.   1.6396    -24.89    40.78
  9.   1.7216    -42.68    23.68     19.   1.7857    -77.31    32.38
 10.   1.7058    -63.31    31.58     20.   1.6342    -17.00    30.93

There is clearly variation in values of the estimated slope and estimated intercept, as well as the estimated standard deviation. The equation of the least squares line thus varies from one sample to the next. Figure 12.13 on page 492 shows a dotplot of the estimated slopes as well as graphs of the true regression line and the 20 sample regression lines. ■
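The simulation in Example 12.10 can be imitated with a short script. The following is only a sketch under stated assumptions: it uses Python's standard random number generator rather than whatever software produced Table 12.1, the seed is arbitrary, and the helper names `fit_line` and `simulate` are illustrative. Its output will therefore not reproduce the table's numbers, only the same kind of sample-to-sample variation.

```python
import random
import statistics

# x values and "true" parameters as stated in Example 12.10
X = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
BETA0, BETA1, SIGMA = -50.0, 1.70, 35.0


def fit_line(x, y):
    """Return (b0_hat, b1_hat, s) for the least squares line through (x, y)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = (sse / (n - 2)) ** 0.5
    return b0, b1, s


def simulate(n_samples=20, seed=1):
    """Generate n_samples data sets from the true line and fit each one."""
    rng = random.Random(seed)  # arbitrary seed, chosen only for repeatability
    results = []
    for _ in range(n_samples):
        y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in X]
        results.append(fit_line(X, y))
    return results


results = simulate()
slopes = [b1 for _, b1, _ in results]
print("smallest slope:", round(min(slopes), 4))
print("largest slope: ", round(max(slopes), 4))
print("mean of slopes:", round(statistics.mean(slopes), 4))
```

Increasing `n_samples` gives a fuller picture of the sampling distribution of the estimated slope, which is the object studied in the rest of this section.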

The slope $\beta_1$ of the population regression line is the true average change in the dependent variable y associated with a 1-unit increase in the independent variable x. The slope of the least squares line, $\hat\beta_1$, gives a point estimate of $\beta_1$. In the same way that a confidence interval for $\mu$ and procedures for testing hypotheses about $\mu$ were based on properties of the sampling distribution of $\bar X$, further inferences about $\beta_1$ are based on thinking of $\hat\beta_1$ as a statistic and investigating its sampling distribution.

The values of the $x_i$’s are assumed to be chosen before the experiment is performed, so only the $Y_i$’s are random. The estimators (statistics, and thus random variables) for $\beta_0$ and $\beta_1$ are obtained by replacing $y_i$ by $Y_i$ in (12.2) and (12.3):

$$\hat\beta_1 = \frac{\sum (x_i - \bar x)(Y_i - \bar Y)}{\sum (x_i - \bar x)^2} \qquad \hat\beta_0 = \frac{\sum Y_i - \hat\beta_1 \sum x_i}{n}$$

Similarly, the estimator for $\sigma^2$ results from replacing each $y_i$ in the formula for $s^2$ by the rv $Y_i$:

$$\hat\sigma^2 = S^2 = \frac{\sum Y_i^2 - \hat\beta_0 \sum Y_i - \hat\beta_1 \sum x_i Y_i}{n - 2}$$


The denominator of $\hat\beta_1$, $S_{xx} = \sum (x_i - \bar x)^2$, depends only on the $x_i$’s and not on the $Y_i$’s, so it is a constant. Then because $\sum (x_i - \bar x)\bar Y = \bar Y \sum (x_i - \bar x) = \bar Y \cdot 0 = 0$, the slope estimator can be written as

$$\hat\beta_1 = \frac{\sum (x_i - \bar x) Y_i}{S_{xx}} = \sum c_i Y_i \quad \text{where } c_i = (x_i - \bar x)/S_{xx}$$

That is, $\hat\beta_1$ is a linear function of the independent rv’s $Y_1, Y_2, \ldots, Y_n$, each of which is normally distributed. Invoking properties of a linear function of random variables discussed in Section 5.5 leads to the following results.

[Figure 12.13: panel (a) shows the estimated slopes on an axis running from 1.5 to 1.9; panel (b) plots Y (100 to 600) against X (100 to 400), showing the true regression line and the simulated least squares lines.]

Figure 12.13 Simulation results from Example 12.10: (a) dotplot of estimated slopes; (b) graphs of the true regression line and 20 least squares lines (from S-Plus)

PROPOSITION

1. The mean value of $\hat\beta_1$ is $E(\hat\beta_1) = \mu_{\hat\beta_1} = \beta_1$, so $\hat\beta_1$ is an unbiased estimator of $\beta_1$ (the distribution of $\hat\beta_1$ is always centered at the value of $\beta_1$).

2. The variance and standard deviation of $\hat\beta_1$ are

$$V(\hat\beta_1) = \sigma^2_{\hat\beta_1} = \frac{\sigma^2}{S_{xx}} \qquad \sigma_{\hat\beta_1} = \frac{\sigma}{\sqrt{S_{xx}}} \qquad (12.4)$$

where $S_{xx} = \sum (x_i - \bar x)^2 = \sum x_i^2 - (\sum x_i)^2/n$. Replacing $\sigma$ by its estimate $s$ gives an estimate for $\sigma_{\hat\beta_1}$ (the estimated standard deviation, i.e., estimated standard error, of $\hat\beta_1$):

$$s_{\hat\beta_1} = \frac{s}{\sqrt{S_{xx}}}$$

(This estimate can also be denoted by $\hat\sigma_{\hat\beta_1}$.)

3. The estimator $\hat\beta_1$ has a normal distribution (because it is a linear function of independent normal rv’s).

According to (12.4), the variance of $\hat\beta_1$ equals the variance $\sigma^2$ of the random error term (or, equivalently, of any $Y_i$) divided by $\sum (x_i - \bar x)^2$. This denominator is a measure of how spread out the $x_i$’s are about $\bar x$. We conclude that making observations at $x_i$ values that are quite spread out results in a more precise estimator of the slope parameter (smaller variance of $\hat\beta_1$), whereas values of $x_i$ all close to one another imply a highly variable estimator. Of course, if the $x_i$’s are spread out too far, a linear model may not be appropriate throughout the range of observation.

Many inferential procedures discussed previously were based on standardizing an estimator by first subtracting its mean value and then dividing by its estimated standard deviation. In particular, test procedures and a CI for the mean $\mu$ of a normal population utilized the fact that the standardized variable $(\bar X - \mu)/(S/\sqrt{n})$, that is, $(\bar X - \mu)/S_{\hat\mu}$, had a t distribution with $n - 1$ df. A similar result here provides the key to further inferences concerning $\beta_1$.

THEOREM The assumptions of the simple linear regression model imply that the standardized variable

$$T = \frac{\hat\beta_1 - \beta_1}{S/\sqrt{S_{xx}}} = \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}}$$

has a t distribution with $n - 2$ df.

A Confidence Interval for $\beta_1$

As in the derivation of previous CIs, we begin with a probability statement:

$$P\left(-t_{\alpha/2,n-2} < \frac{\hat\beta_1 - \beta_1}{S_{\hat\beta_1}} < t_{\alpha/2,n-2}\right) = 1 - \alpha$$

Manipulation of the inequalities inside the parentheses to isolate $\beta_1$ and substitution of estimates in place of the estimators gives the CI formula.


A $100(1 - \alpha)\%$ CI for the slope $\beta_1$ of the true regression line is

$$\hat\beta_1 \pm t_{\alpha/2,n-2} \cdot s_{\hat\beta_1}$$

This interval has the same general form as did many of our previous intervals. It is centered at the point estimate of the parameter, and the amount it extends out to each side depends on the desired confidence level (through the t critical value) and on the amount of variability in the estimator $\hat\beta_1$ (through $s_{\hat\beta_1}$, which will tend to be small when there is little variability in the distribution of $\hat\beta_1$ and large otherwise).
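The way the standard deviation of the slope estimator in (12.4) depends on the spread of the x values can be seen numerically. The sketch below is illustrative only: the two designs and the value of sigma are hypothetical numbers invented for the comparison, and `slope_sd` is a helper name introduced here, not notation from the text.

```python
import math


def slope_sd(x_values, sigma):
    """Standard deviation of the slope estimator: sigma / sqrt(Sxx), as in (12.4)."""
    x_bar = sum(x_values) / len(x_values)
    sxx = sum((x - x_bar) ** 2 for x in x_values)
    return sigma / math.sqrt(sxx)


# Two hypothetical designs with the same number of observations
narrow = [9, 10, 10, 11, 11, 12]   # x values bunched together
wide = [0, 4, 8, 12, 16, 20]       # x values spread out
sigma = 2.0                        # assumed error standard deviation

print("narrow design:", round(slope_sd(narrow, sigma), 4))
print("wide design:  ", round(slope_sd(wide, sigma), 4))
```

With the same sigma and sample size, the spread-out design gives the smaller standard deviation, and therefore a narrower confidence interval for the slope at any given confidence level.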

Example 12.11 Variations in clay brick masonry weight have implications not only for structural and acoustical design but also for design of heating, ventilating, and air conditioning systems. The article “Clay Brick Masonry Weight Variation” (J. of Architectural Engr., 1996: 135–137) gave a scatter plot of $y$ = mortar dry density (lb/ft$^3$) versus $x$ = mortar air content (%) for a sample of mortar specimens, from which the following representative data was read:

x 5.7 6.8 9.6 10.0 10.7 12.6 14.4 15.0 15.3

y 119.0 121.3 118.2 124.0 112.3 114.1 112.2 115.1 111.3

x 16.2 17.8 18.7 19.7 20.6 25.0

y 107.2 108.9 107.8 111.0 106.2 105.0

The scatter plot of this data in Figure 12.14 certainly suggests the appropriateness of the simple linear regression model; there appears to be a substantial negative linear relationship between air content and density, one in which density tends to decrease as air content increases.

[Figure 12.14: density (105 to 125) plotted against air content (5 to 25).]

Figure 12.14 Scatter plot of the data from Example 12.11

The values of the summary statistics required for calculation of the least squares estimates are

$$\sum x_i = 218.1 \quad \sum y_i = 1693.6 \quad \sum x_i^2 = 3577.01 \quad \sum x_i y_i = 24{,}252.54 \quad \sum y_i^2 = 191{,}672.90$$

from which $S_{xy} = -372.404$, $S_{xx} = 405.836$, $\hat\beta_1 = -.917622$, $\hat\beta_0 = 126.248889$, $\mathrm{SST} = 454.1693$, $\mathrm{SSE} = 112.4432$, and $r^2 = 1 - 112.4432/454.1693 = .752$.


Roughly 75% of the observed variation in density can be attributed to the simple linear regression model relationship between density and air content. Error df is $15 - 2 = 13$, giving $s^2 = 112.4432/13 = 8.6495$ and $s = 2.941$. The estimated standard deviation of $\hat\beta_1$ is

$$s_{\hat\beta_1} = \frac{s}{\sqrt{S_{xx}}} = \frac{2.941}{\sqrt{405.836}} = .1460$$

A confidence level of 95% requires $t_{.025,13} = 2.160$. The CI is

$$-.918 \pm (2.160)(.1460) = -.918 \pm .315 = (-1.233, -.603)$$

With a high degree of confidence, we estimate that an average decrease in density of between .603 lb/ft$^3$ and 1.233 lb/ft$^3$ is associated with a 1% increase in air content (at least for air content values between roughly 5% and 25%, corresponding to the x values in our sample). The interval is reasonably narrow, indicating that the slope of the population line has been precisely estimated. Notice that the interval includes only negative values, so we can be quite confident of the tendency for density to decrease as air content increases.

Looking at the SAS output of Figure 12.15, we find the value of $s_{\hat\beta_1}$ under Parameter Estimates as the second number in the Standard Error column. All of the widely used statistical packages include this estimated standard error in output. There is also an estimated standard error for the statistic $\hat\beta_0$ from which a CI for the intercept $\beta_0$ of the population regression line can be calculated.
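The arithmetic of Example 12.11 can be checked directly from the raw data. The following is a minimal sketch, not the SAS procedure itself: the function name `slope_ci` is illustrative, and the critical value 2.160 is taken from the text rather than computed, since the Python standard library has no t quantile function.

```python
import math

# Mortar data from Example 12.11
x = [5.7, 6.8, 9.6, 10.0, 10.7, 12.6, 14.4, 15.0, 15.3,
     16.2, 17.8, 18.7, 19.7, 20.6, 25.0]
y = [119.0, 121.3, 118.2, 124.0, 112.3, 114.1, 112.2, 115.1, 111.3,
     107.2, 108.9, 107.8, 111.0, 106.2, 105.0]


def slope_ci(x, y, t_crit):
    """Return (b1_hat, se_b1, (lower, upper)) for the slope of the fitted line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))          # estimate of sigma
    se_b1 = s / math.sqrt(sxx)            # estimated standard error of the slope
    half_width = t_crit * se_b1
    return b1, se_b1, (b1 - half_width, b1 + half_width)


T_CRIT = 2.160  # t critical value for 95% confidence and 13 df, as given in the text
b1, se_b1, (lower, upper) = slope_ci(x, y, T_CRIT)
print("slope estimate:", round(b1, 6))
print("standard error:", round(se_b1, 4))
print("95% CI:", (round(lower, 3), round(upper, 3)))
```

The printed values can be compared with the hand calculation above and with the Parameter Estimates portion of Figure 12.15.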

Dependent Variable: DENSITY

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1   341.72606        341.72606     39.508    0.0001
Error      13   112.44327          8.64948
C Total    14   454.16933

Root MSE     2.94100    R-square   0.7524
Dep Mean   112.90667    Adj R-sq   0.7334
C.V.         2.60481

Parameter Estimates

Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter=0   Prob > |T|
INTERCEP    1   126.248889           2.25441683        56.001                 0.0001
AIRCONT     1    -0.917622           0.14598888        -6.286                 0.0001

Obs   Dep Var DENSITY   Predict Value   Residual
  1   119.0             121.0           -2.0184
  2   121.3             120.0            1.2909
  3   118.2             117.4            0.7603
  4   124.0             117.1            6.9273
  5   112.3             116.4           -4.1303
  6   114.1             114.7           -0.5869
  7   112.2             113.0           -0.8351
  8   115.1             112.5            2.6154
  9   111.3             112.2           -0.9093
 10   107.2             111.4           -4.1834
 11   108.9             109.9           -1.0152
 12   107.8             109.1           -1.2894
 13   111.0             108.2            2.8283
 14   106.2             107.3           -1.1459
 15   105.0             103.3            1.6917

Sum of Residuals                    0
Sum of Squared Residuals          112.4433
Predicted Resid SS (Press)        146.4144

Figure 12.15 SAS output for the data of Example 12.11 ■


Hypothesis-Testing Procedures

As before, the null hypothesis in a test about $\beta_1$ will be an equality statement. The null value (value of $\beta_1$ claimed true by the null hypothesis) is denoted by $\beta_{10}$ (read “beta one nought,” not “beta ten”). The test statistic results from replacing $\beta_1$ by the null value $\beta_{10}$ in the standardized variable T, that is, from standardizing the estimator of $\beta_1$ under the assumption that $H_0$ is true. The test statistic thus has a t distribution with $n - 2$ df when $H_0$ is true, so the type I error probability is controlled at the desired level $\alpha$ by using an appropriate t critical value.

The most commonly encountered pair of hypotheses about $\beta_1$ is $H_0\colon \beta_1 = 0$ versus $H_a\colon \beta_1 \ne 0$. When this null hypothesis is true, $\mu_{Y \cdot x} = \beta_0$ independent of x. Then knowledge of x gives no information about the value of the dependent variable. A test of these two hypotheses is often referred to as the model utility test in simple linear regression. Unless n is quite small, $H_0$ will be rejected and the utility of the model confirmed precisely when $r^2$ is reasonably large. The simple linear regression model should not be used for further inferences (estimates of mean value or predictions of future values) unless the model utility test results in rejection of $H_0$ for a suitably small $\alpha$.

Null hypothesis: $H_0\colon \beta_1 = \beta_{10}$

Test statistic value: $t = \dfrac{\hat\beta_1 - \beta_{10}}{s_{\hat\beta_1}}$

Alternative Hypothesis          Rejection Region for Level $\alpha$ Test

$H_a\colon \beta_1 > \beta_{10}$          $t \ge t_{\alpha,n-2}$
$H_a\colon \beta_1 < \beta_{10}$          $t \le -t_{\alpha,n-2}$
$H_a\colon \beta_1 \ne \beta_{10}$          either $t \ge t_{\alpha/2,n-2}$ or $t \le -t_{\alpha/2,n-2}$

A P-value based on $n - 2$ df can be calculated just as was done previously for t tests in Chapters 8 and 9.

The model utility test is the test of $H_0\colon \beta_1 = 0$ versus $H_a\colon \beta_1 \ne 0$, in which case the test statistic value is the t ratio $t = \hat\beta_1/s_{\hat\beta_1}$.
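The rejection-region rules for tests about the slope can be written as a small function. This is a sketch under stated assumptions: the name `slope_t_test` and the `alternative` labels are illustrative choices, and the caller is assumed to supply the appropriate critical value from a t table (the one-tailed value for a one-sided alternative, the two-tailed value for a two-sided alternative). The numbers in the usage lines are the slope estimate, standard error, and critical value reported for the mortar data of Example 12.11.

```python
def slope_t_test(b1_hat, se_b1, b10, t_crit, alternative="two-sided"):
    """Return (t, reject) for H0: beta1 = b10 using the rejection-region approach.

    t_crit must be the critical value matching the alternative:
    t_{alpha, n-2} for "greater" or "less", t_{alpha/2, n-2} for "two-sided".
    """
    t = (b1_hat - b10) / se_b1
    if alternative == "greater":
        reject = t >= t_crit
    elif alternative == "less":
        reject = t <= -t_crit
    elif alternative == "two-sided":
        reject = abs(t) >= t_crit
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t, reject


# Model utility test (b10 = 0) for the mortar data, level .05, 13 df
t_value, reject_h0 = slope_t_test(-0.917622, 0.14598888, 0.0, 2.160)
print("t =", round(t_value, 3), "reject H0:", reject_h0)
```

Setting `b10` to a nonzero value handles hypotheses such as those in the exercises, where the null value is a previously believed slope rather than zero.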

Example 12.12 Mopeds are very popular in Europe because of cost and ease of operation. However, they can be dangerous if performance characteristics are modified. One of the features commonly manipulated is the maximum speed. The article “Procedure to Verify the Maximum Speed of Automatic Transmission Mopeds in Periodic Motor Vehicle Inspections” (J. of Automotive Engr., 2008: 1615–1623) included a simple linear regression analysis of the variables $x$ = test track speed (km/h) and $y$ = rolling test speed. Here is data read from a graph in the article:

x 42.2 42.6 43.3 43.5 43.7 44.1 44.9 45.3 45.7

y 44 44 44 45 45 46 46 46 47

x 45.7 45.9 46.0 46.2 46.2 46.8 46.8 47.1 47.2

y 48 48 48 47 48 48 49 49 49

A scatter plot of the data shows a substantial linear pattern. The Minitab output in Figure 12.16 gives the coefficient of determination as $r^2 = .923$, which certainly portends a useful linear relationship. Let’s carry out the model utility test at a significance level $\alpha = .01$.


The parameter of interest is $\beta_1$, the expected change in rolling test speed associated with a 1 km/h increase in test track speed. The null hypothesis $H_0\colon \beta_1 = 0$ will be rejected in favor of the alternative $H_a\colon \beta_1 \ne 0$ if the t ratio $t = \hat\beta_1/s_{\hat\beta_1}$ satisfies either $t \ge t_{\alpha/2,n-2} = t_{.005,16} = 2.921$ or $t \le -2.921$. From Figure 12.16, $\hat\beta_1 = 1.08342$, $s_{\hat\beta_1} = .07806$, and

$$t = \frac{1.08342}{.07806} = 13.88 \quad \text{(also on output)}$$

Clearly this t ratio falls well into the upper tail of the two-tailed rejection region, so $H_0$ is resoundingly rejected. Alternatively, the P-value is twice the area captured under the 16 df t curve to the right of 13.88. Minitab gives P-value = .000. Thus the null hypothesis of no useful linear relationship can be rejected at any reasonable significance level. This confirms the utility of the model, and gives us license to calculate various estimates and predictions as described in Section 12.4. ■

The regression equation is
roll spd = -2.22 + 1.08 trk spd

Predictor   Coef       SE Coef    T       P
Constant    -2.224     3.528      -0.63   0.537
trk spd     1.08342    0.07806    13.88   0.000

S = 0.506890   R-Sq = 92.3%   R-Sq(adj) = 91.9%

Analysis of Variance

Source           DF   SS       MS       F        P
Regression        1   49.500   49.500   192.65   0.000
Residual Error   16    4.111    0.257
Total            17   53.611

(In the trk spd row, SE Coef is $s_{\hat\beta_1}$, T is $t = \hat\beta_1/s_{\hat\beta_1}$, and P is the P-value for the model utility test.)

Figure 12.16 Minitab output for the moped data of Example 12.12

Regression and ANOVA

The decomposition of the total sum of squares $\sum (y_i - \bar y)^2$ into a part SSE, which measures unexplained variation, and a part SSR, which measures variation explained by the linear relationship, is strongly reminiscent of one-way ANOVA. In fact, the null hypothesis $H_0\colon \beta_1 = 0$ can be tested against $H_a\colon \beta_1 \ne 0$ by constructing an ANOVA table (Table 12.2) and rejecting $H_0$ if $f \ge F_{\alpha,1,n-2}$.

Table 12.2 ANOVA Table for Simple Linear Regression

Source of Variation   df        Sum of Squares   Mean Square             f
Regression            1         SSR              SSR                     SSR / [SSE/(n - 2)]
Error                 n - 2     SSE              s^2 = SSE/(n - 2)
Total                 n - 1     SST

The F test gives exactly the same result as the model utility t test because $t^2 = f$ and $t^2_{\alpha/2,n-2} = F_{\alpha,1,n-2}$. Virtually all computer packages that have regression options include such an ANOVA table in the output. For example, Figure 12.15 shows SAS output for the mortar data of Example 12.11. The ANOVA table at the top of the output has $f = 39.508$ with a P-value of .0001 for the model utility test. The table of parameter estimates gives $t = -6.286$, again with $P = .0001$ and $(-6.286)^2 = 39.51$.
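The equivalence of the model utility t test and the ANOVA F test can be checked numerically on the moped data of Example 12.12. The sketch below is illustrative: `model_utility` is a name introduced here, and the computation follows the defining formulas in this section rather than any particular package's internals.

```python
import math

# Moped data from Example 12.12 (x = test track speed, y = rolling test speed)
x = [42.2, 42.6, 43.3, 43.5, 43.7, 44.1, 44.9, 45.3, 45.7,
     45.7, 45.9, 46.0, 46.2, 46.2, 46.8, 46.8, 47.1, 47.2]
y = [44, 44, 44, 45, 45, 46, 46, 46, 47,
     48, 48, 48, 47, 48, 48, 49, 49, 49]


def model_utility(x, y):
    """Return (t, f): the t ratio for H0: beta1 = 0 and the ANOVA F ratio."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = sst - sse                      # variation explained by the line
    mse = sse / (n - 2)                  # s^2
    t = b1 / math.sqrt(mse / sxx)        # b1_hat / s_{b1_hat}
    f = ssr / mse                        # SSR / [SSE/(n - 2)]
    return t, f


t, f = model_utility(x, y)
print("t =", round(t, 2))
print("f =", round(f, 2))
print("t squared =", round(t ** 2, 2))
```

The printed t and f can be compared with the T and F entries of Figure 12.16; the last line shows the squared t ratio next to f.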


EXERCISES Section 12.3 (30–43)

30. Reconsider the situation described in Exercise 7, in which $x$ = accelerated strength of concrete and $y$ = 28-day cured strength. Suppose the simple linear regression model is valid for x between 1000 and 4000 and that $\beta_1 = 1.25$ and $\sigma = 350$. Consider an experiment in which $n = 7$, and the x values at which observations are made are $x_1 = 1000$, $x_2 = 1500$, $x_3 = 2000$, $x_4 = 2500$, $x_5 = 3000$, $x_6 = 3500$, and $x_7 = 4000$.

a. Calculate $\sigma_{\hat\beta_1}$, the standard deviation of $\hat\beta_1$.

b. What is the probability that the estimated slope based on such observations will be between 1.00 and 1.50?

c. Suppose it is also possible to make a single observation at each of the $n = 11$ values $x_1 = 2000$, $x_2 = 2100, \ldots, x_{11} = 3000$. If a major objective is to estimate $\beta_1$ as accurately as possible, would the experiment with $n = 11$ be preferable to the one with $n = 7$?

31. During oil drilling operations, components of the drilling assembly may suffer from sulfide stress cracking. The article “Composition Optimization of High-Strength Steels for Sulfide Cracking Resistance Improvement” (Corrosion Science, 2009: 2878–2884) reported on a study in which the composition of a standard grade of steel was analyzed. The following data on $y$ = threshold stress (% SMYS) and $x$ = yield strength (MPa) was read from a graph in the article (which also included the equation of the least squares line).

x 635 644 711 708 836 820 810 870 856 923 878 937 948

y 100 93 88 84 77 75 74 63 57 55 47 43 38

$\sum x_i = 10{,}576$, $\sum y_i = 894$, $\sum x_i^2 = 8{,}741{,}264$, $\sum y_i^2 = 66{,}224$, $\sum x_i y_i = 703{,}192$

a. What proportion of observed variation in stress can be attributed to the approximate linear relationship between the two variables?

b. Compute the estimated standard deviation $s_{\hat\beta_1}$.

c. Calculate a confidence interval using confidence level 95% for the expected change in stress associated with a 1 MPa increase in strength. Does it appear that this true average change has been precisely estimated?

32. Exercise 16 of Section 12.2 gave data on $x$ = rainfall volume and $y$ = runoff volume (both in m$^3$). Use the accompanying Minitab output to decide whether there is a useful linear relationship between rainfall and runoff, and then calculate a confidence interval for the true average change in runoff volume associated with a 1 m$^3$ increase in rainfall volume.

The regression equation is
runoff = -1.13 + 0.827 rainfall

Predictor   Coef       Stdev      t-ratio   P
Constant    -1.128     2.368      -0.48     0.642
rainfall    0.82697    0.03652    22.64     0.000

s = 5.240   R-sq = 97.5%   R-sq(adj) = 97.3%

33. Exercise 15 of Section 12.2 included Minitab output for a regression of flexural strength of concrete beams on modulus of elasticity.

a. Use the output to calculate a confidence interval with a confidence level of 95% for the slope $\beta_1$ of the population regression line, and interpret the resulting interval.

b. Suppose it had previously been believed that when modulus of elasticity increased by 1 GPa, the associated true average change in flexural strength would be at most .1 MPa. Does the sample data contradict this belief? State and test the relevant hypotheses.

34. Refer to the Minitab output of Exercise 20, in which $x$ = $\mathrm{NO}_3^-$ wet deposition and $y$ = lichen N (%).

a. Carry out the model utility test at level .01, using the rejection region approach.

b. Repeat part (a) using the P-value approach.

c. Suppose it had previously been believed that when $\mathrm{NO}_3^-$ wet deposition increases by .1 g N/m$^2$, the associated change in expected lichen N is at least .15%. Carry out a test of hypotheses at level .01 to decide whether the data contradicts this prior belief.

35. How does lateral acceleration (side forces experienced in turns that are largely under driver control) affect nausea as perceived by bus passengers? The article “Motion Sickness in Public Road Transport: The Effect of Driver, Route, and Vehicle” (Ergonomics, 1999: 1646–1664) reported data on $x$ = motion sickness dose (calculated in accordance with a British standard for evaluating similar motion at sea) and $y$ = reported nausea (%). Relevant summary quantities are

$n = 17$, $\sum x_i = 222.1$, $\sum y_i = 193$, $\sum x_i^2 = 3056.69$, $\sum x_i y_i = 2759.6$, $\sum y_i^2 = 2975$

Values of dose in the sample ranged from 6.0 to 17.6.

a. Assuming that the simple linear regression model is valid for relating these two variables (this is supported by the raw data), calculate and interpret an estimate of the slope parameter that conveys information about the precision and reliability of estimation.

b. Does it appear that there is a useful linear relationship between these two variables? Answer the question by employing the P-value approach.

c. Would it be sensible to use the simple linear regression model as a basis for predicting % nausea when dose = 5.0? Explain your reasoning.

d. When Minitab was used to fit the simple linear regression model to the raw data, the observation (6.0, 2.50) was flagged as possibly having a substantial impact on the fit. Eliminate this observation from the sample and recalculate the estimate of part (a). Based on this, does the observation appear to be exerting an undue influence?

36. Mist (airborne droplets or aerosols) is generated when metal-removing fluids are used in machining operations to cool and lubricate the tool and workpiece. Mist generation is a concern to OSHA, which has recently lowered substantially the workplace standard. The article “Variables Affecting Mist Generation from Metal Removal Fluids” (Lubrication Engr., 2002: 10–17) gave the accompanying data on $x$ = fluid-flow velocity for a 5% soluble oil (cm/sec) and $y$ = the extent of mist droplets having diameters smaller than 10 μm (mg/m$^3$):

x 89 177 189 354 362 442 965

y .40 .60 .48 .66 .61 .69 .99

a. The investigators performed a simple linear regression analysis to relate the two variables. Does a scatter plot of the data support this strategy?

b. What proportion of observed variation in mist can be attributed to the simple linear regression relationship between velocity and mist?

c. The investigators were particularly interested in the impact on mist of increasing velocity from 100 to 1000 (a factor of 10 corresponding to the difference between the smallest and largest x values in the sample). When x increases in this way, is there substantial evidence that the true average increase in y is less than .6?

d. Estimate the true average change in mist associated with a 1 cm/sec increase in velocity, and do so in a way that conveys information about precision and reliability.

37. Magnetic resonance imaging (MRI) is well established as a tool for measuring blood velocities and volume flows. The article “Correlation Analysis of Stenotic Aortic Valve Flow Patterns Using Phase Contrast MRI,” referenced in Exercise 1.67, proposed using this methodology for determination of valve area in patients with aortic stenosis. The accompanying data on peak velocity (m/s) from scans of 23 patients in two different planes was read from a graph in the cited paper.

Level-:  .60 .82 .85 .89 .95 1.01 1.01 1.05
Level--: .50 .68 .76 .64 .68 .86 .79 1.03

Level-:  1.08 1.11 1.18 1.17 1.22 1.29 1.28 1.32
Level--: .75 .90 .79 .86 .99 .80 1.10 1.15

Level-:  1.37 1.53 1.55 1.85 1.93 1.93 2.14
Level--: 1.04 1.16 1.28 1.39 1.57 1.39 1.32

a. Does there appear to be a difference between true average velocity in the two different planes? Carry out an appropriate test of hypotheses (as did the authors of the article).

b. The authors of the article also regressed level-- velocity against level- velocity. The resulting estimated intercept and slope are .14701 and .65393, with corresponding estimated standard errors .07877 and .05947, coefficient of determination .852, and $s = .110673$. The article included a comment that this regression showed evidence of a strong linear relationship but a regression slope well below 1. Do you agree?

38. Refer to the data on $x$ = liberation rate and $y$ = $\mathrm{NO}_x$ emission rate given in Exercise 19.

a. Does the simple linear regression model specify a useful relationship between the two rates? Use the appropriate test procedure to obtain information about the P-value, and then reach a conclusion at significance level .01.

b. Compute a 95% CI for the expected change in emission rate associated with a 10 MBtu/hr-ft$^2$ increase in liberation rate.

39. Carry out the model utility test using the ANOVA approach for the filtration rate–moisture content data of Example 12.6. Verify that it gives a result equivalent to that of the t test.

40. Use the rules of expected value to show that $\hat\beta_0$ is an unbiased estimator for $\beta_0$ (assuming that $\hat\beta_1$ is unbiased for $\beta_1$).

41. a. Verify that $E(\hat\beta_1) = \beta_1$ by using the rules of expected value from Chapter 5.

b. Use the rules of variance from Chapter 5 to verify the expression for $V(\hat\beta_1)$ given in this section.

42. Verify that if each $x_i$ is multiplied by a positive constant c and each $y_i$ is multiplied by another positive constant d, the t statistic for testing $H_0\colon \beta_1 = 0$ versus $H_a\colon \beta_1 \ne 0$ is unchanged in value (the value of $\hat\beta_1$ will change, which shows that the magnitude of $\hat\beta_1$ is not by itself indicative of model utility).

43. The probability of a type II error for the t test for $H_0\colon \beta_1 = \beta_{10}$ can be computed in the same manner as it was computed for the t tests of Chapter 8. If the alternative value of $\beta_1$ is denoted by $\beta_1'$, the value of

$$d = \frac{|\beta_{10} - \beta_1'|}{\sigma \sqrt{\dfrac{n - 1}{\sum x_i^2 - (\sum x_i)^2/n}}}$$

is first calculated, then the appropriate set of curves in Appendix Table A.17 is entered on the horizontal axis at the value of d, and $\beta$ is read from the curve for $n - 2$ df. An article in the Journal of Public Health Engineering reports the results of a regression analysis based on $n = 15$ observations in which $x$ = filter application temperature (°C) and $y$ = % efficiency of BOD removal. Calculated quantities include $\sum x_i = 402$, $\sum x_i^2 = 11{,}098$, $s = 3.725$, and $\hat\beta_1 = 1.7035$. Consider testing at level .01 $H_0\colon \beta_1 = 1$, which states that the expected increase in % BOD removal is 1 when filter application temperature increases by 1°C, against the alternative $H_a\colon \beta_1 > 1$. Determine P(type II error) when $\beta_1' = 2$, $\sigma = 4$.

12.4 Inferences Concerning $\mu_{Y \cdot x^*}$ and the Prediction of Future Y Values

Let x* denote a specified value of the independent variable x. Once the estimates $\hat\beta_0$ and $\hat\beta_1$ have been calculated, $\hat\beta_0 + \hat\beta_1 x^*$ can be regarded either as a point estimate of $\mu_{Y \cdot x^*}$ (the expected or true average value of Y when $x = x^*$) or as a prediction of the Y value that will result from a single observation made when $x = x^*$. The point estimate or prediction by itself gives no information concerning how precisely $\mu_{Y \cdot x^*}$ has been estimated or Y has been predicted. This can be remedied by developing a CI for $\mu_{Y \cdot x^*}$ and a prediction interval (PI) for a single Y value.

Before we obtain sample data, both $\hat\beta_0$ and $\hat\beta_1$ are subject to sampling variability; that is, they are both statistics whose values will vary from sample to sample. Suppose, for example, that $\beta_0 = 50$ and $\beta_1 = 2$. Then a first sample of (x, y) pairs might give $\hat\beta_0 = 52.35$, $\hat\beta_1 = 1.895$; a second sample might result in $\hat\beta_0 = 46.52$, $\hat\beta_1 = 2.056$; and so on. It follows that $\hat Y = \hat\beta_0 + \hat\beta_1 x^*$ itself varies in value from sample to sample, so it is a statistic. If the intercept and slope of the population line are the aforementioned values 50 and 2, respectively, and $x^* = 10$, then this statistic is trying to estimate the value $50 + 2(10) = 70$. The estimate from a first sample might be $52.35 + 1.895(10) = 71.30$, from a second sample might be $46.52 + 2.056(10) = 67.08$, and so on.

This variation in the value of $\hat\beta_0 + \hat\beta_1 x^*$ can be visualized by returning to Figure 12.13 on page 492. Consider the value $x^* = 300$. The heights of the 20 pictured estimated regression lines above this value are all somewhat different from one another. The same is true of the heights of the lines above the value $x^* = 350$. In fact, there appears to be more variation in the value of $\hat\beta_0 + \hat\beta_1(350)$ than in the value of $\hat\beta_0 + \hat\beta_1(300)$. We shall see shortly that this is because 350 is further from $\bar x = 235.71$ (the “center of the data”) than is 300.

Methods for making inferences about $\beta_1$ were based on properties of the sampling distribution of the statistic $\hat\beta_1$. In the same way, inferences about the mean Y value $\beta_0 + \beta_1 x^*$ are based on properties of the sampling distribution of the statistic $\hat\beta_0 + \hat\beta_1 x^*$. Substitution of the expressions for $\hat\beta_0$ and $\hat\beta_1$ into $\hat\beta_0 + \hat\beta_1 x^*$ followed by some algebraic manipulation leads to the representation of $\hat\beta_0 + \hat\beta_1 x^*$ as a linear function of the $Y_i$’s:

$$\hat\beta_0 + \hat\beta_1 x^* = \sum_{i=1}^{n}\left[\frac{1}{n} + \frac{(x^* - \bar x)(x_i - \bar x)}{\sum (x_i - \bar x)^2}\right] Y_i = \sum_{i=1}^{n} d_i Y_i$$

The coefficients $d_1, d_2, \ldots, d_n$ in this linear function involve the $x_i$’s and x*, all of which are fixed. Application of the rules of Section 5.5 to this linear function gives the following properties.

PROPOSITION

Let $\hat Y = \hat\beta_0 + \hat\beta_1 x^*$, where x* is some fixed value of x. Then

1. The mean value of $\hat Y$ is

$$E(\hat Y) = E(\hat\beta_0 + \hat\beta_1 x^*) = \mu_{\hat\beta_0 + \hat\beta_1 x^*} = \beta_0 + \beta_1 x^*$$

Thus $\hat\beta_0 + \hat\beta_1 x^*$ is an unbiased estimator for $\beta_0 + \beta_1 x^*$ (i.e., for $\mu_{Y \cdot x^*}$).

2. The variance of $\hat Y$ is

$$V(\hat Y) = \sigma_{\hat Y}^2 = \sigma^2\left[\frac{1}{n} + \frac{(x^* - \bar x)^2}{\sum x_i^2 - (\sum x_i)^2/n}\right] = \sigma^2\left[\frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}\right]$$

and the standard deviation $\sigma_{\hat Y}$ is the square root of this expression. The estimated standard deviation of $\hat\beta_0 + \hat\beta_1 x^*$, denoted by $s_{\hat Y}$ or $s_{\hat\beta_0 + \hat\beta_1 x^*}$, results from replacing $\sigma$ by its estimate s:

$$s_{\hat Y} = s_{\hat\beta_0 + \hat\beta_1 x^*} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}}$$

3. $\hat Y$ has a normal distribution.
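The mean and variance in the proposition both come from writing the fitted value as a weighted sum of the observations. The following Python sketch checks that numerically on a small made-up data set (the x and y values and the choice x* = 6 are hypothetical, chosen only for illustration): the weights $d_i$ reproduce $\hat\beta_0 + \hat\beta_1 x^*$, and $\sum d_i^2$ equals the bracketed factor $1/n + (x^* - \bar x)^2/S_{xx}$ that multiplies $\sigma^2$.

```python
def linear_weights(xs, x_star):
    """Weights d_i such that b0_hat + b1_hat * x_star == sum(d_i * y_i)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return [1 / n + (x_star - x_bar) * (x - x_bar) / sxx for x in xs]


def fit_line(xs, ys):
    """Least squares intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical data, used only to illustrate the identity.
xs = [1.0, 2.0, 4.0, 5.0, 8.0]
ys = [3.1, 4.9, 9.2, 10.8, 17.1]
x_star = 6.0

d = linear_weights(xs, x_star)
b0, b1 = fit_line(xs, ys)

fitted_value = b0 + b1 * x_star
weighted_sum = sum(di * yi for di, yi in zip(d, ys))

# Factor multiplying sigma^2 in V(Y_hat), computed two ways.
n = len(xs)
x_bar = sum(xs) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
variance_factor_from_weights = sum(di ** 2 for di in d)
variance_factor_closed_form = 1 / n + (x_star - x_bar) ** 2 / sxx
```

Because the $Y_i$’s are independent with common variance $\sigma^2$, the variance of the weighted sum is $\sigma^2 \sum d_i^2$, which is why the two variance factors agree.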

The variance of $\hat\beta_0 + \hat\beta_1 x^*$ is smallest when $x^* = \bar x$ and increases as x* moves away from $\bar x$ in either direction. Thus the estimator of $\mu_{Y \cdot x^*}$ is more precise when x* is near the center of the $x_i$’s than when it is far from the x values at which observations have been made. This will imply that both the CI and PI are narrower for an x* near $\bar x$ than for an x* far from $\bar x$. Most statistical computer packages will provide both $\hat\beta_0 + \hat\beta_1 x^*$ and $s_{\hat\beta_0 + \hat\beta_1 x^*}$ for any specified x* upon request.

Inferences Concerning $\mu_{Y \cdot x^*}$

Just as inferential procedures for $\beta_1$ were based on the t variable obtained by standardizing $\hat\beta_1$, a t variable obtained by standardizing $\hat\beta_0 + \hat\beta_1 x^*$ leads to a CI and test procedures here.

THEOREM

The variable

$$T = \frac{\hat\beta_0 + \hat\beta_1 x^* - (\beta_0 + \beta_1 x^*)}{S_{\hat\beta_0 + \hat\beta_1 x^*}} = \frac{\hat Y - (\beta_0 + \beta_1 x^*)}{S_{\hat Y}} \qquad (12.5)$$

has a t distribution with $n - 2$ df.

A probability statement involving this standardized variable can now be manipulated to yield a confidence interval for $\mu_{Y \cdot x^*}$.

A $100(1 - \alpha)\%$ CI for $\mu_{Y \cdot x^*}$, the expected value of Y when $x = x^*$, is

$$\hat\beta_0 + \hat\beta_1 x^* \pm t_{\alpha/2,\,n-2} \cdot s_{\hat\beta_0 + \hat\beta_1 x^*} = \hat y \pm t_{\alpha/2,\,n-2} \cdot s_{\hat Y} \qquad (12.6)$$

This CI is centered at the point estimate for $\mu_{Y \cdot x^*}$ and extends out to each side by an amount that depends on the confidence level and on the extent of variability in the estimator on which the point estimate is based.

Example 12.13 Corrosion of steel reinforcing bars is the most important durability problem for reinforced concrete structures. Carbonation of concrete results from a chemical reaction that lowers the pH value by enough to initiate corrosion of the rebar. Representative data on x = carbonation depth (mm) and y = strength (MPa) for a sample of core specimens taken from a particular building follows (read from a plot in the article “The Carbonation of Concrete Structures in the Tropical Environment of Singapore,” Magazine of Concrete Res., 1996: 293–300).


x 8.0 15.0 16.5 20.0 20.0 27.5 30.0 30.0 35.0

y 22.8 27.2 23.7 17.1 21.5 18.6 16.1 23.4 13.4

x 38.0 40.0 45.0 50.0 50.0 55.0 55.0 59.0 65.0

y 19.5 12.4 13.2 11.4 10.3 14.1 9.7 12.0 6.8

Figure 12.17 Minitab scatter plot with confidence intervals and prediction intervals for the data of Example 12.13

A scatter plot of the data (see Figure 12.17) gives strong support for use of the simple linear regression model. Relevant quantities are as follows:

$\sum x_i = 659.0$  $\sum x_i^2 = 28{,}967.50$  $\bar x = 36.6111$  $S_{xx} = 4840.7778$
$\sum y_i = 293.2$  $\sum x_i y_i = 9293.95$  $\sum y_i^2 = 5335.76$
$\hat\beta_1 = -.297561$  $\hat\beta_0 = 27.182936$  SSE = 131.2402
$r^2 = .766$  $s = 2.8640$

Let’s now calculate a confidence interval, using a 95% confidence level, for the mean strength for all core specimens having a carbonation depth of 45 mm, that is, a confidence interval for $\beta_0 + \beta_1(45)$. The interval is centered at

$$\hat y = \hat\beta_0 + \hat\beta_1(45) = 27.18 - .2976(45) = 13.79$$

The estimated standard deviation of the statistic $\hat Y$ is

$$s_{\hat Y} = 2.8640\sqrt{\frac{1}{18} + \frac{(45 - 36.6111)^2}{4840.7778}} = .7582$$


The 16 df t critical value for a 95% confidence level is 2.120, from which we determine the desired interval to be

$$13.79 \pm (2.120)(.7582) = 13.79 \pm 1.61 = (12.18, 15.40)$$

The narrowness of this interval suggests that we have reasonably precise information about the mean value being estimated. Remember that if we recalculated this interval for sample after sample, in the long run about 95% of the calculated intervals would include $\beta_0 + \beta_1(45)$. We can only hope that this mean value lies in the single interval that we have calculated.

Figure 12.18 shows Minitab output resulting from a request to fit the simple linear regression model and calculate confidence intervals for the mean value of strength at depths of 45 mm and 35 mm. The intervals are at the bottom of the output; note that the second interval is narrower than the first, because 35 is much closer to $\bar x$ than is 45. Figure 12.17 shows (1) curves corresponding to the confidence limits for each different x value and (2) prediction limits, to be discussed shortly. Notice how the curves get farther and farther apart as x moves away from $\bar x$.

The regression equation is
strength = 27.2 - 0.298 depth

Predictor   Coef       Stdev     t-ratio   P
Constant    27.183     1.651     16.46     0.000
depth       -0.29756   0.04116   -7.23     0.000

s = 2.864   R-sq = 76.6%   R-sq(adj) = 75.1%

Analysis of Variance

SOURCE       DF   SS       MS       F       P
Regression   1    428.62   428.62   52.25   0.000
Error        16   131.24   8.20
Total        17   559.86

Fit      Stdev.Fit   95.0% C.I.          95.0% P.I.
13.793   0.758       (12.185, 15.401)    (7.510, 20.075)

Fit      Stdev.Fit   95.0% C.I.          95.0% P.I.
16.768   0.678       (15.330, 18.207)    (10.527, 23.009)

Figure 12.18 Minitab regression output for the data of Example 12.13 ■
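The calculation in Example 12.13 can be retraced directly from the 18 data pairs. The sketch below is one possible way to do so in plain Python; the function name `mean_response_ci` is an illustrative choice, and the t critical value 2.120 for 16 df is taken from the example rather than computed. It evaluates the interval (12.6) at depths of 45 mm and 35 mm.

```python
import math

# Carbonation depth (mm) and strength (MPa) from Example 12.13.
depth = [8.0, 15.0, 16.5, 20.0, 20.0, 27.5, 30.0, 30.0, 35.0,
         38.0, 40.0, 45.0, 50.0, 50.0, 55.0, 55.0, 59.0, 65.0]
strength = [22.8, 27.2, 23.7, 17.1, 21.5, 18.6, 16.1, 23.4, 13.4,
            19.5, 12.4, 13.2, 11.4, 10.3, 14.1, 9.7, 12.0, 6.8]


def mean_response_ci(xs, ys, x_star, t_crit):
    """Return (y_hat, s_fit, (lower, upper)) for the mean of Y at x_star."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                      # estimated slope
    b0 = y_bar - b1 * x_bar             # estimated intercept
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))        # estimate of sigma
    y_hat = b0 + b1 * x_star
    s_fit = s * math.sqrt(1 / n + (x_star - x_bar) ** 2 / sxx)
    return y_hat, s_fit, (y_hat - t_crit * s_fit, y_hat + t_crit * s_fit)


T_CRIT_16_DF = 2.120  # t critical value quoted in the example (95%, 16 df)

y_hat_45, s_fit_45, ci_45 = mean_response_ci(depth, strength, 45.0, T_CRIT_16_DF)
y_hat_35, s_fit_35, ci_35 = mean_response_ci(depth, strength, 35.0, T_CRIT_16_DF)
```

The estimated standard deviation at 35 mm comes out smaller than at 45 mm, which matches the narrower second interval in the Minitab output.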

In some situations, a CI is desired not just for a single x value but for two or more x values. Suppose an investigator wishes a CI both for $\mu_{Y \cdot v}$ and for $\mu_{Y \cdot w}$, where v and w are two different values of the independent variable. It is tempting to compute the interval (12.6) first for $x = v$ and then for $x = w$. Suppose we use $\alpha = .05$ in each computation to get two 95% intervals. Then if the variables involved in computing the two intervals were independent of one another, the joint confidence coefficient would be $(.95) \cdot (.95) \approx .90$.

However, the intervals are not independent because the same $\hat\beta_0$, $\hat\beta_1$, and S are used in each. We therefore cannot assert that the joint confidence level for the two intervals is exactly 90%. It can be shown, though, that if the $100(1 - \alpha)\%$ CI (12.6) is computed both for $x = v$ and $x = w$ to obtain joint CIs for $\mu_{Y \cdot v}$ and $\mu_{Y \cdot w}$, then the joint confidence level on the resulting pair of intervals is at least $100(1 - 2\alpha)\%$. In particular, using $\alpha = .05$ results in a joint confidence level of at least 90%, whereas using $\alpha = .01$ results in at least 98% confidence. For example, in Example 12.13 a 95% CI for $\mu_{Y \cdot 45}$ was (12.185, 15.401) and a 95% CI for $\mu_{Y \cdot 35}$ was (15.330, 18.207). The simultaneous or joint confidence level for the two statements $12.185 < \mu_{Y \cdot 45} < 15.401$ and $15.330 < \mu_{Y \cdot 35} < 18.207$ is at least 90%.
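The lower bound on the joint confidence level is simple arithmetic, and a short Python sketch makes the comparison with the "independent intervals" product explicit. The function names are illustrative, and the argument k (the number of intervals, 2 in the discussion above) is included so the same helper covers more than two intervals.

```python
def bonferroni_joint_level(alpha, k):
    """Guaranteed lower bound on the joint confidence level of k intervals,
    each computed at individual level 100(1 - alpha)%."""
    return max(0.0, 1.0 - k * alpha)


def joint_level_if_independent(alpha, k):
    """Joint level that would hold if the k intervals were independent."""
    return (1.0 - alpha) ** k


def per_interval_alpha(joint_level, k):
    """Individual alpha that guarantees at least the requested joint level."""
    return (1.0 - joint_level) / k


two_at_95 = bonferroni_joint_level(0.05, 2)       # at least 0.90
two_at_99 = bonferroni_joint_level(0.01, 2)       # at least 0.98
product_95 = joint_level_if_independent(0.05, 2)  # 0.9025 under independence
```

The bound is conservative: the true joint level of the dependent intervals is not known exactly, only guaranteed not to fall below it.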


The validity of these joint or simultaneous CIs rests on a probability result called the Bonferroni inequality, so the joint CIs are referred to as Bonferroni intervals. The method is easily generalized to yield joint intervals for k different $\mu_{Y \cdot x}$’s. Using the interval (12.6) separately first for $x = x_1^*$, then for $x = x_2^*, \ldots$, and finally for $x = x_k^*$ yields a set of k CIs for which the joint or simultaneous confidence level is guaranteed to be at least $100(1 - k\alpha)\%$.

Tests of hypotheses about $\beta_0 + \beta_1 x^*$ are based on the test statistic T obtained by replacing $\beta_0 + \beta_1 x^*$ in the numerator of (12.5) by the null value $\mu_0$. For example, $H_0: \beta_0 + \beta_1(45) = 15$ in Example 12.13 says that when carbonation depth is 45, expected (i.e., true average) strength is 15. The test statistic value is then $t = [\hat\beta_0 + \hat\beta_1(45) - 15]/s_{\hat\beta_0 + \hat\beta_1(45)}$, and the test is upper-, lower-, or two-tailed according to the inequality in $H_a$.

A Prediction Interval for a Future Value of Y

Rather than calculate an interval estimate for $\mu_{Y \cdot x^*}$, an investigator may wish to obtain an interval of plausible values for the value of Y associated with some future observation when the independent variable has value x*. Consider, for example, relating vocabulary size y to age of a child x. The CI (12.6) with $x^* = 6$ would provide an estimate of true average vocabulary size for all 6-year-old children. Alternatively, we might wish an interval of plausible values for the vocabulary size of a particular 6-year-old child.

A CI refers to a parameter, or population characteristic, whose value is fixed but unknown to us. In contrast, a future value of Y is not a parameter but instead a random variable; for this reason we refer to an interval of plausible values for a future Y as a prediction interval rather than a confidence interval. The error of estimation is $\beta_0 + \beta_1 x^* - (\hat\beta_0 + \hat\beta_1 x^*)$, a difference between a fixed (but unknown) quantity and a random variable. The error of prediction is $Y - (\hat\beta_0 + \hat\beta_1 x^*)$, a difference between two random variables. There is thus more uncertainty in prediction than in estimation, so a PI will be wider than a CI. Because the future value Y is independent of the observed $Y_i$’s,

$$\begin{aligned}
V[Y - (\hat\beta_0 + \hat\beta_1 x^*)] &= \text{variance of prediction error} \\
&= V(Y) + V(\hat\beta_0 + \hat\beta_1 x^*) \\
&= \sigma^2 + \sigma^2\left[\frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}\right] \\
&= \sigma^2\left[1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}\right]
\end{aligned}$$

Furthermore, because $E(Y) = \beta_0 + \beta_1 x^*$ and $E(\hat\beta_0 + \hat\beta_1 x^*) = \beta_0 + \beta_1 x^*$, the expected value of the prediction error is $E(Y - (\hat\beta_0 + \hat\beta_1 x^*)) = 0$. It can then be shown that the standardized variable

$$T = \frac{Y - (\hat\beta_0 + \hat\beta_1 x^*)}{S\sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar x)^2}{S_{xx}}}}$$

has a t distribution with $n - 2$ df. Substituting this T into the probability statement $P(-t_{\alpha/2,\,n-2} < T < t_{\alpha/2,\,n-2}) = 1 - \alpha$ and manipulating to isolate Y between the two inequalities yields the following interval.


A $100(1 - \alpha)\%$ PI for a future Y observation to be made when $x = x^*$ is

$$\begin{aligned}
\hat\beta_0 + \hat\beta_1 x^* &\pm t_{\alpha/2,\,n-2} \cdot s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar x)^2}{S_{xx}}} \\
= \hat\beta_0 + \hat\beta_1 x^* &\pm t_{\alpha/2,\,n-2} \cdot \sqrt{s^2 + s_{\hat\beta_0 + \hat\beta_1 x^*}^2} \\
= \hat y &\pm t_{\alpha/2,\,n-2} \cdot \sqrt{s^2 + s_{\hat Y}^2}
\end{aligned} \qquad (12.7)$$

The interpretation of the prediction level $100(1 - \alpha)\%$ is analogous to that of previous confidence levels: if (12.7) is used repeatedly, in the long run the resulting intervals will actually contain the observed y values $100(1 - \alpha)\%$ of the time. Notice that the 1 underneath the initial square root symbol makes the PI (12.7) wider than the CI (12.6), though the intervals are both centered at $\hat\beta_0 + \hat\beta_1 x^*$. Also, as $n \to \infty$, the width of the CI approaches 0, whereas the width of the PI does not (because even with perfect knowledge of $\beta_0$ and $\beta_1$, there will still be uncertainty in prediction).
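The long-run interpretation of the prediction level can be illustrated with a small simulation. The Python sketch below is only an illustration under stated assumptions: the true line ($\beta_0 = 27$, $\beta_1 = -0.3$), the error standard deviation ($\sigma = 3$), the seed, and the number of replications are hypothetical choices, while the 18 x values and the t critical value 2.120 for 16 df are borrowed from Example 12.13. Each replication generates a fresh sample, computes the 95% PI (12.7) at x* = 45, and checks whether a new, independent Y at x* = 45 falls inside it.

```python
import math
import random


def prediction_interval_from_data(xs, ys, x_star, t_crit):
    """Fit the least squares line and return the PI (12.7) at x_star."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    y_hat = b0 + b1 * x_star
    half = t_crit * s * math.sqrt(1 + 1 / n + (x_star - x_bar) ** 2 / sxx)
    return y_hat - half, y_hat + half


def simulate_pi_coverage(reps=5000, seed=1):
    """Fraction of replications in which the PI captures a new Y."""
    rng = random.Random(seed)
    xs = [8.0, 15.0, 16.5, 20.0, 20.0, 27.5, 30.0, 30.0, 35.0,
          38.0, 40.0, 45.0, 50.0, 50.0, 55.0, 55.0, 59.0, 65.0]
    beta0, beta1, sigma = 27.0, -0.3, 3.0   # hypothetical true model
    x_star, t_crit = 45.0, 2.120            # 95% level, 16 df
    hits = 0
    for _ in range(reps):
        ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
        lower, upper = prediction_interval_from_data(xs, ys, x_star, t_crit)
        y_new = beta0 + beta1 * x_star + rng.gauss(0.0, sigma)
        if lower < y_new < upper:
            hits += 1
    return hits / reps


coverage = simulate_pi_coverage()
```

With normally distributed errors the observed fraction should land close to .95, up to simulation noise.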

Example 12.14 Let’s return to the carbonation depth-strength data of Example 12.13 and calculate a 95% PI for a strength value that would result from selecting a single core specimen whose depth is 45 mm. Relevant quantities from that example are

$\hat y = 13.79$  $s_{\hat Y} = .7582$  $s = 2.8640$

For a prediction level of 95% based on $n - 2 = 16$ df, the t critical value is 2.120, exactly what we previously used for a 95% confidence level. The prediction interval is then

$$13.79 \pm (2.120)\sqrt{(2.8640)^2 + (.7582)^2} = 13.79 \pm (2.120)(2.963) = 13.79 \pm 6.28 = (7.51, 20.07)$$

Plausible values for a single observation on strength when depth is 45 mm are (at the 95% prediction level) between 7.51 MPa and 20.07 MPa. The 95% confidence interval for mean strength when depth is 45 was (12.18, 15.40). The prediction interval is much wider than this because of the extra $(2.8640)^2$ under the square root. Figure 12.18, the Minitab output in Example 12.13, shows this interval as well as the confidence interval. ■

The Bonferroni technique can be employed as in the case of confidence intervals. If a $100(1 - \alpha)\%$ PI is calculated for each of k different values of x, the simultaneous or joint prediction level for all k intervals is at least $100(1 - k\alpha)\%$.
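The arithmetic of Example 12.14 can be reproduced from the three summary quantities alone. The following Python sketch (function names are illustrative) computes the PI (12.7) and, for comparison, the CI (12.6) at a depth of 45 mm, using the rounded values and the t critical value quoted in the example.

```python
import math


def confidence_interval(y_hat, s_fit, t_crit):
    """CI (12.6) for the mean of Y at x*."""
    half = t_crit * s_fit
    return y_hat - half, y_hat + half


def prediction_interval(y_hat, s, s_fit, t_crit):
    """PI (12.7) for a single future Y at x*."""
    half = t_crit * math.sqrt(s ** 2 + s_fit ** 2)
    return y_hat - half, y_hat + half


# Summary quantities from Examples 12.13 and 12.14 (depth = 45 mm).
y_hat, s, s_fit, t_crit = 13.79, 2.8640, 0.7582, 2.120

ci = confidence_interval(y_hat, s_fit, t_crit)
pi = prediction_interval(y_hat, s, s_fit, t_crit)

ci_width = ci[1] - ci[0]
pi_width = pi[1] - pi[0]
```

The PI width exceeds the CI width because of the additional $s^2$ term under the square root.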

EXERCISES Section 12.4 (44–56)

44. Fitting the simple linear regression model to the $n = 27$ observations on x = modulus of elasticity and y = flexural strength given in Exercise 15 of Section 12.2 resulted in $\hat y = 7.592$, $s_{\hat Y} = .179$ when $x = 40$ and $\hat y = 9.741$, $s_{\hat Y} = .253$ for $x = 60$.

a. Explain why $s_{\hat Y}$ is larger when $x = 60$ than when $x = 40$.

b. Calculate a confidence interval with a confidence level of 95% for the true average strength of all beams whose modulus of elasticity is 40.

c. Calculate a prediction interval with a prediction level of 95% for the strength of a single beam whose modulus of elasticity is 40.

d. If a 95% CI is calculated for true average strength when modulus of elasticity is 60, what will be the simultaneous confidence level for both this interval and the interval calculated in part (b)?

45. Reconsider the filtration rate–moisture content data introduced in Example 12.6 (see also Example 12.7).

a. Compute a 90% CI for $\beta_0 + 125\beta_1$, true average moisture content when the filtration rate is 125.

b. Predict the value of moisture content for a single experimental run in which the filtration rate is 125 using a 90% prediction level. How does this interval compare to the interval of part (a)? Why is this the case?

c. How would the intervals of parts (a) and (b) compare to a CI and PI when filtration rate is 115? Answer without actually calculating these new intervals.

d. Interpret the hypotheses $H_0: \beta_0 + 125\beta_1 = 80$ and $H_a: \beta_0 + 125\beta_1 < 80$, and then carry out a test at significance level .01.

46. Astringency is the quality in a wine that makes the wine drinker’s mouth feel slightly rough, dry, and puckery. The paper “Analysis of Tannins in Red Wine Using Multiple Methods: Correlation with Perceived Astringency” (Amer. J. of Enol. and Vitic., 2006: 481–485) reported on an investigation to assess the relationship between perceived astringency and tannin concentration using various analytic methods. Here is data provided by the authors on x = tannin concentration by protein precipitation and y = perceived astringency as determined by a panel of tasters.

x   .718   .808   .924   1.000   .667   .529   .514   .559
y   .428   .480   .493   .978    .318   .298   −.224  .198

x   .766   .470   .726   .762   .666   .562   .378   .779
y   .326   −.336  .765   .190   .066   −.221  −.898  .836

x   .674   .858   .406   .927   .311   .319   .518   .687
y   .126   .305   −.577  .779   −.707  −.610  −.648  −.145

x   .907    .638   .234    .781   .326    .433   .319   .238
y   1.007   −.090  −1.132  .538   −1.098  −.581  −.862  −.551

Relevant summary quantities are as follows:

$\sum x_i = 19.404$, $\sum y_i = -.549$, $\sum x_i^2 = 13.248032$, $\sum y_i^2 = 11.835795$, $\sum x_i y_i = 3.497811$

$S_{xx} = 13.248032 - (19.404)^2/32 = 1.48193150$, $S_{yy} = 11.82637622$

$S_{xy} = 3.497811 - (19.404)(-.549)/32 = 3.83071088$

a. Fit the simple linear regression model to this data. Then determine the proportion of observed variation in astringency that can be attributed to the model relationship between astringency and tannin concentration.

b. Calculate and interpret a confidence interval for the slope of the true regression line.

c. Estimate true average astringency when tannin concentration is .6, and do so in a way that conveys information about reliability and precision.

d. Predict astringency for a single wine sample whose tannin concentration is .6, and do so in a way that conveys information about reliability and precision.

e. Does it appear that true average astringency for a tannin concentration of .7 is something other than 0? State and test the appropriate hypotheses.

47. The simple linear regression model provides a very good fit to the data on rainfall and runoff volume given in Exercise 16 of Section 12.2. The equation of the least squares line is $y = -1.128 + .82697x$, $r^2 = .975$, and $s = 5.24$.

a. Use the fact that $s_{\hat Y} = 1.44$ when rainfall volume is 40 m³ to predict runoff in a way that conveys information about reliability and precision. Does the resulting interval suggest that precise information about the value of runoff for this future observation is available? Explain your reasoning.

b. Calculate a PI for runoff when rainfall is 50 using the same prediction level as in part (a). What can be said about the simultaneous prediction level for the two intervals you have calculated?

48. The catch basin in a storm-sewer system is the interface between surface runoff and the sewer. The catch-basin insert is a device for retrofitting catch basins to improve pollutant-removal properties. The article “An Evaluation of the Urban Stormwater Pollutant Removal Efficiency of Catch Basin Inserts” (Water Envir. Res., 2005: 500–510) reported on tests of various inserts under controlled conditions for which inflow is close to what can be expected in the field. Consider the following data, read from a graph in the article, for one particular type of insert on x = amount filtered (1000s of liters) and y = % total suspended solids removed.

x   23     45     68     91     114    136   159    182    205   228
y   53.3   26.9   54.8   33.8   29.9   8.2   17.2   12.2   3.2   11.1

Summary quantities are

$\sum x_i = 1251$, $\sum x_i^2 = 199{,}365$, $\sum y_i = 250.6$, $\sum y_i^2 = 9249.36$, $\sum x_i y_i = 21{,}904.4$

a. Does a scatter plot support the choice of the simple linear regression model? Explain.

b. Obtain the equation of the least squares line.

c. What proportion of observed variation in % removed can be attributed to the model relationship?

d. Does the simple linear regression model specify a useful relationship? Carry out an appropriate test of hypotheses using a significance level of .05.

e. Is there strong evidence for concluding that there is at least a 2% decrease in true average suspended solid removal associated with a 10,000 liter increase in the amount filtered? Test appropriate hypotheses using $\alpha = .05$.

f. Calculate and interpret a 95% CI for true average % removed when amount filtered is 100,000 liters. How does this interval compare in width to a CI when amount filtered is 200,000 liters?

g. Calculate and interpret a 95% PI for % removed when amount filtered is 100,000 liters. How does this interval compare in width to the CI calculated in (f) and to a PI when amount filtered is 200,000 liters?

49. You are told that a 95% CI for expected lead content when traffic flow is 15, based on a sample of $n = 10$ observations, is (462.1, 597.7). Calculate a CI with confidence level 99% for expected lead content when traffic flow is 15.

50. Silicon-germanium alloys have been used in certain types of solar cells. The paper “Silicon-Germanium Films Deposited by Low-Frequency Plasma-Enhanced Chemical Vapor Deposition” (J. of Material Res., 2006: 88–104) reported on a study of various structural and electrical properties. Consider the accompanying data on x = Ge concentration in solid phase (ranging from 0 to 1) and y = Fermi level position (eV):

x   0     .42   .23   .33   .62   .60   .45   .87   .90   .79   1     1     1
y   .62   .53   .61   .59   .50   .55   .59   .31   .43   .46   .23   .22   .19

A scatter plot shows a substantial linear relationship. Here is Minitab output from a least squares fit. [Note: There are several inconsistencies between the data given in the paper, the plot that appears there, and the summary information about a regression analysis.]

The regression equation is
Fermi pos = 0.7217 - 0.4327 Ge conc

S = 0.0737573   R-Sq = 80.2%   R-Sq(adj) = 78.4%

Analysis of Variance

Source       DF   SS         MS         F       P
Regression   1    0.241728   0.241728   44.43   0.000
Error        11   0.059842   0.005440
Total        12   0.301569

a. Obtain an interval estimate of the expected change in Fermi-level position associated with an increase of .1 in Ge concentration, and interpret your estimate.

b. Obtain an interval estimate for mean Fermi-level position when concentration is .50, and interpret your estimate.

c. Obtain an interval of plausible values for position resulting from a single observation to be made when concentration is .50, interpret your interval, and compare to the interval of (b).

d. Obtain simultaneous CIs for expected position when concentration is .3, .5, and .7; the joint confidence level should be at least 97%.

51. Refer to Example 12.12 in which x = test track speed and y = rolling test speed.

a. Minitab gave $s_{\hat\beta_0 + \hat\beta_1(45)} = .120$ and $s_{\hat\beta_0 + \hat\beta_1(47)} = .186$. Why is the former estimated standard deviation smaller than the latter one?

b. Use the Minitab output from the example to calculate a 95% CI for expected rolling speed when test speed = 45.

c. Use the Minitab output to calculate a 95% PI for a single value of rolling speed when test speed = 47.

52. Plasma etching is essential to the fine-line pattern transfer in current semiconductor processes. The article “Ion Beam-Assisted Etching of Aluminum with Chlorine” (J. of the Electrochem. Soc., 1985: 2010–2012) gives the accompanying data (read from a graph) on chlorine flow (x, in SCCM) through a nozzle used in the etching mechanism and etch rate (y, in 100 A/min).

x   1.5    1.5    2.0    2.5    2.5    3.0    3.5    3.5    4.0
y   23.0   24.5   25.0   30.0   33.5   40.0   40.5   47.0   49.0

The summary statistics are $\sum x_i = 24.0$, $\sum y_i = 312.5$, $\sum x_i^2 = 70.50$, $\sum x_i y_i = 902.25$, $\sum y_i^2 = 11{,}626.75$, $\hat\beta_0 = 6.448718$, $\hat\beta_1 = 10.602564$.

a. Does the simple linear regression model specify a useful relationship between chlorine flow and etch rate?

b. Estimate the true average change in etch rate associated with a 1-SCCM increase in flow rate using a 95% confidence interval, and interpret the interval.

c. Calculate a 95% CI for $\mu_{Y \cdot 3.0}$, the true average etch rate when flow = 3.0. Has this average been precisely estimated?

d. Calculate a 95% PI for a single future observation on etch rate to be made when flow = 3.0. Is the prediction likely to be accurate?

e. Would the 95% CI and PI when flow = 2.5 be wider or narrower than the corresponding intervals of parts (c) and (d)? Answer without actually computing the intervals.

f. Would you recommend calculating a 95% PI for a flow of 6.0? Explain.

53. Consider the following four intervals based on the data of Exercise 12.17 (Section 12.2):

a. A 95% CI for mean porosity when unit weight is 110
b. A 95% PI for porosity when unit weight is 110
c. A 95% CI for mean porosity when unit weight is 115
d. A 95% PI for porosity when unit weight is 115

Without computing any of these intervals, what can be said about their widths relative to one another?

54. The decline of water supplies in certain areas of the United States has created the need for increased understanding of relationships between economic factors such as crop yield and hydrologic and soil factors. The article “Variability of Soil Water Properties and Crop Yield in a Sloped Watershed” (Water Resources Bull., 1988: 281–288) gives data on grain sorghum yield (y, in g/m-row) and distance upslope (x, in m) on a sloping watershed. Selected observations are given in the accompanying table.

x   0     10    20    30    45    50    70
y   500   590   410   470   450   480   510

x   80    100   120   140   160   170   190
y   450   360   400   300   410   280   350

a. Construct a scatter plot. Does the simple linear regression model appear to be plausible?

b. Carry out a test of model utility.

c. Estimate true average yield when distance upslope is 75 by giving an interval of plausible values.

55. Verify that $V(\hat\beta_0 + \hat\beta_1 x)$ is indeed given by the expression in the text. [Hint: $V(\sum d_i Y_i) = \sum d_i^2 \cdot V(Y_i)$.]

56. The article “Bone Density and Insertion Torque as Predictors of Anterior Cruciate Ligament Graft Fixation Strength” (The Amer. J. of Sports Med., 2004: 1421–1429) gave the accompanying data on maximum insertion torque (N · m) and yield load (N), the latter being one measure of graft strength, for 15 different specimens.

Torque   1.8   2.2   1.9   1.3   2.1   2.2   1.6   2.1
Load     491   477   598   361   605   671   466   431

Torque   1.2   1.8   2.6   2.5   2.5   1.7   1.6
Load     384   422   554   577   642   348   446

a. Is it plausible that yield load is normally distributed?

b. Estimate true average yield load by calculating a confidence interval with a confidence level of 95%, and interpret the interval.

c. Here is output from Minitab for the regression of yield load on torque. Does the simple linear regression model specify a useful relationship between the variables?

Predictor   Coef     SE Coef   T      P
Constant    152.44   91.17     1.67   0.118
Torque      178.23   45.97     3.88   0.002

S = 73.2141   R-Sq = 53.6%   R-Sq(adj) = 50.0%

Source           DF   SS       MS      F       P
Regression       1    80554    80554   15.03   0.002
Residual Error   13   69684    5360
Total            14   150238

d. The authors of the cited paper state, “Consequently, we cannot but conclude that simple regression analysis-based methods are not clinically sufficient to predict individual fixation strength.” Do you agree? [Hint: Consider predicting yield load when torque is 2.0.]

12.5 Correlation

There are many situations in which the objective in studying the joint behavior of two variables is to see whether they are related, rather than to use one to predict the value of the other. In this section, we first develop the sample correlation coefficient r as a measure of how strongly related two variables x and y are in a sample and then relate r to the correlation coefficient $\rho$ defined in Chapter 5.

The Sample Correlation Coefficient r

Given n numerical pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, it is natural to speak of x and y as having a positive relationship if large x’s are paired with large y’s and small x’s with small y’s. Similarly, if large x’s are paired with small y’s and small x’s with large y’s, then a negative relationship between the variables is implied. Consider the quantity

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$

Then if the relationship is strongly positive, an $x_i$ above the mean $\bar{x}$ will tend to be paired with a $y_i$ above the mean $\bar{y}$, so that $(x_i - \bar{x})(y_i - \bar{y}) > 0$, and this product will also be positive whenever both $x_i$ and $y_i$ are below their respective means. Thus a positive relationship implies that $S_{xy}$ will be positive. An analogous argument shows that when the relationship is negative, $S_{xy}$ will be negative, since most of the products $(x_i - \bar{x})(y_i - \bar{y})$ will be negative. This is illustrated in Figure 12.19.
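The equality of the defining form and the computational form of $S_{xy}$ is stated without proof. A short derivation, using only $\bar{x} = \sum x_i / n$ and $\bar{y} = \sum y_i / n$, might run as follows:

```latex
\begin{align*}
S_{xy} &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
       &= \sum x_i y_i - \bar{y} \sum x_i - \bar{x} \sum y_i + n\bar{x}\bar{y} \\
       &= \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
                       - \frac{(\sum x_i)(\sum y_i)}{n}
                       + \frac{(\sum x_i)(\sum y_i)}{n} \\
       &= \sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}
\end{align*}
```

The same expansion with $y_i$ replaced by $x_i$ gives $S_{xx} = \sum x_i^2 - (\sum x_i)^2 / n$, and similarly for $S_{yy}$.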


Although $S_{xy}$ seems a plausible measure of the strength of a relationship, we do not yet have any idea of how positive or negative it can be. Unfortunately, $S_{xy}$ has a serious defect: By changing the unit of measurement for either x or y, $S_{xy}$ can be made either arbitrarily large in magnitude or arbitrarily close to zero. For example, if $S_{xy} = 25$ when x is measured in meters, then $S_{xy} = 25{,}000$ when x is measured in millimeters and .025 when x is expressed in kilometers. A reasonable condition to impose on any measure of how strongly x and y are related is that the calculated measure should not depend on the particular units used to measure them. This condition is achieved by modifying $S_{xy}$ to obtain the sample correlation coefficient.

Figure 12.19 (a) Scatter plot with $S_{xy}$ positive; (b) scatter plot with $S_{xy}$ negative [+ means $(x_i - \bar{x})(y_i - \bar{y}) > 0$, and − means $(x_i - \bar{x})(y_i - \bar{y}) < 0$]

DEFINITION

The sample correlation coefficient for the n pairs $(x_1, y_1), \ldots, (x_n, y_n)$ is

$$r = \frac{S_{xy}}{\sqrt{\sum (x_i - \bar{x})^2}\,\sqrt{\sum (y_i - \bar{y})^2}} = \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}} \tag{12.8}$$

Example 12.15

An accurate assessment of soil productivity is critical to rational land-use planning. Unfortunately, as the author of the article “Productivity Ratings Based on Soil Series” (Prof. Geographer, 1980: 158–163) argues, an acceptable soil productivity index is not so easy to come by. One difficulty is that productivity is determined partly by which crop is planted, and the relationship between the yield of two different crops planted in the same soil may not be very strong. To illustrate, the article presents the accompanying data on corn yield x and peanut yield y (mT/Ha) for eight different types of soil.

x    2.4   3.4   4.6   3.7   2.2   3.3   4.0   2.1
y   1.33  2.12  1.80  1.65  2.00  1.76  2.11  1.63

With $\sum x_i = 25.7$, $\sum y_i = 14.40$, $\sum x_i^2 = 88.31$, $\sum x_i y_i = 46.856$, and $\sum y_i^2 = 26.4324$,

$$S_{xx} = 88.31 - \frac{(25.7)^2}{8} = 5.75 \qquad S_{yy} = 26.4324 - \frac{(14.40)^2}{8} = .5124$$

$$S_{xy} = 46.856 - \frac{(25.7)(14.40)}{8} = .5960$$

from which

$$r = \frac{.5960}{\sqrt{5.75}\,\sqrt{.5124}} = .347$$ ■
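The arithmetic in Example 12.15 can be checked with a few lines of Python. The sketch below applies the computational formulas for $S_{xx}$, $S_{yy}$, and $S_{xy}$ to the corn and peanut data; the function name `sample_correlation` is illustrative and is not taken from the text.

```python
import math

def sample_correlation(x, y):
    """Return (Sxx, Syy, Sxy, r) using the computational (shortcut) formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sxx = sum(v * v for v in x) - sum_x ** 2 / n
    syy = sum(v * v for v in y) - sum_y ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    r = sxy / (math.sqrt(sxx) * math.sqrt(syy))
    return sxx, syy, sxy, r

# Data from Example 12.15: corn yield x and peanut yield y (mT/Ha)
corn = [2.4, 3.4, 4.6, 3.7, 2.2, 3.3, 4.0, 2.1]
peanut = [1.33, 2.12, 1.80, 1.65, 2.00, 1.76, 2.11, 1.63]

sxx, syy, sxy, r = sample_correlation(corn, peanut)
print(round(sxx, 2), round(syy, 4), round(sxy, 4), round(r, 3))
```

The printed values should agree with the example's $S_{xx}$, $S_{yy}$, $S_{xy}$, and r up to rounding.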


Properties of r

The most important properties of r are as follows:

1. The value of r does not depend on which of the two variables under study is labeled x and which is labeled y.

2. The value of r is independent of the units in which x and y are measured.

3. $-1 \le r \le 1$

4. $r = 1$ if and only if (iff) all $(x_i, y_i)$ pairs lie on a straight line with positive slope, and $r = -1$ iff all $(x_i, y_i)$ pairs lie on a straight line with negative slope.

5. The square of the sample correlation coefficient gives the value of the coefficient of determination that would result from fitting the simple linear regression model; in symbols, $(r)^2 = r^2$.

Property 1 stands in marked contrast to what happens in regression analysis, where virtually all quantities of interest (the estimated slope, estimated y-intercept, $s^2$, etc.) depend on which of the two variables is treated as the dependent variable. However, Property 5 shows that the proportion of variation in the dependent variable explained by fitting the simple linear regression model does not depend on which variable plays this role.

Property 2 is equivalent to saying that r is unchanged if each $x_i$ is replaced by $cx_i$ and if each $y_i$ is replaced by $dy_i$, with c and d positive constants (a change in the scale of measurement), as well as if each $x_i$ is replaced by $x_i - a$ and $y_i$ by $y_i - b$ (which changes the location of zero on the measurement axis). This implies, for example, that r is the same whether temperature is measured in °F or °C.

Property 3 tells us that the maximum value of r, corresponding to the largest possible degree of positive relationship, is $r = 1$, whereas the most negative relationship is identified with $r = -1$. According to Property 4, the largest positive and largest negative correlations are achieved only when all points lie along a straight line. Any other configuration of points, even if the configuration suggests a deterministic relationship between variables, will yield an r value less than 1 in absolute magnitude. Thus r measures the degree of linear relationship among variables. A value of r near 0 is not evidence of the lack of a strong relationship, but only the absence of a linear relation, so that such a value of r must be interpreted with caution. Figure 12.20 illustrates several configurations of points associated with different values of r.
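Properties 1, 2, 4, and 5 can be checked numerically. The sketch below reuses the corn and peanut data of Example 12.15; the scale and shift constants (1000, 5, 2.5, 32) and the two straight lines are arbitrary values chosen only for illustration, and the helper names are not from the text.

```python
import math

def summary(x, y):
    """Return (xbar, ybar, Sxx, Syy, Sxy) from the defining sums."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    syy = sum((b - ybar) ** 2 for b in y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    return xbar, ybar, sxx, syy, sxy

def corr(x, y):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    _, _, sxx, syy, sxy = summary(x, y)
    return sxy / math.sqrt(sxx * syy)

# Corn and peanut yields from Example 12.15
x = [2.4, 3.4, 4.6, 3.7, 2.2, 3.3, 4.0, 2.1]
y = [1.33, 2.12, 1.80, 1.65, 2.00, 1.76, 2.11, 1.63]
r = corr(x, y)

# Property 1: swapping the roles of x and y leaves r unchanged
r_swapped = corr(y, x)

# Property 2: positive rescaling and shifting (arbitrary constants) leave r unchanged
r_rescaled = corr([1000 * a - 5 for a in x], [2.5 * b + 32 for b in y])

# Property 4: points exactly on a line give r = +1 or r = -1
r_line_up = corr(x, [3 + 2 * a for a in x])
r_line_down = corr(x, [3 - 2 * a for a in x])

# Property 5: r squared equals the coefficient of determination 1 - SSE/SST
xbar, ybar, sxx, syy, sxy = summary(x, y)
b1 = sxy / sxx                # least squares slope
b0 = ybar - b1 * xbar         # least squares intercept
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
r_squared_fit = 1 - sse / syy

print(r, r_swapped, r_rescaled, r_line_up, r_line_down, r ** 2, r_squared_fit)
```

Agreement holds only up to floating-point rounding, so the comparisons are best made with a small tolerance rather than exact equality.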

Figure 12.20 Data plots for different values of r: (a) r near +1; (b) r near −1; (c) r near 0, no apparent relationship; (d) r near 0, nonlinear relationship


A frequently asked question is, “When can it be said that there is a strong correlation between the variables, and when is the correlation weak?” Here is an informal rule of thumb for characterizing the value of r:

Weak       $-.5 \le r \le .5$
Moderate   either $-.8 < r < -.5$ or $.5 < r < .8$
Strong     either $r \ge .8$ or $r \le -.8$

It may surprise you that an r as substantial as .5 or −.5 goes in the weak category. The rationale is that if $r = .5$ or $-.5$, then $r^2 = .25$ in a regression with either variable playing the role of y. A regression model that explains at most 25% of observed variation is not in fact very impressive. In Example 12.15, the correlation between corn yield and peanut yield would be described as weak.
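The rule of thumb translates directly into a small classifier. This is only a sketch of the informal rule stated above; the function name `describe_correlation` is hypothetical.

```python
def describe_correlation(r):
    """Label r as weak, moderate, or strong using the informal rule of thumb."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    size = abs(r)
    if size <= 0.5:
        return "weak"        # -.5 <= r <= .5
    if size < 0.8:
        return "moderate"    # .5 < |r| < .8
    return "strong"          # |r| >= .8

# r = .347 is the corn/peanut value from Example 12.15
for value in (0.347, 0.5, -0.65, 0.8):
    print(value, describe_correlation(value), "r^2 =", round(value ** 2, 3))
```

Printing $r^2$ alongside the label makes the rationale visible: at the weak/moderate boundary, only 25% of observed variation would be explained by a linear fit.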

Inferences About the Population Correlation Coefficient

The correlation coefficient r is a measure of how strongly related x and y are in the observed sample. We can think of the pairs $(x_i, y_i)$ as having been drawn from a bivariate population of pairs, with $(X_i, Y_i)$ having some joint pmf or pdf. In Chapter 5, we defined the correlation coefficient $\rho(X, Y)$ by

$$\rho = \rho(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \cdot \sigma_Y}$$

where

$$\mathrm{Cov}(X, Y) = \begin{cases} \displaystyle\sum_x \sum_y (x - \mu_X)(y - \mu_Y)\, p(x, y) & (X, Y) \text{ discrete} \\[2ex] \displaystyle\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y)\, f(x, y)\, dx\, dy & (X, Y) \text{ continuous} \end{cases}$$

If we think of $p(x, y)$ or $f(x, y)$ as describing the distribution of pairs of values within the entire population, $\rho$ becomes a measure of how strongly related x and y are in that population. Properties of $\rho$ analogous to those for r were given in Chapter 5.

The population correlation coefficient $\rho$ is a parameter or population characteristic, just as $\mu_X$, $\mu_Y$, $\sigma_X$, and $\sigma_Y$ are, so we can use the sample correlation coefficient to make various inferences about $\rho$. In particular, r is a point estimate for $\rho$, and the corresponding estimator is

$$\hat{\rho} = R = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2}\,\sqrt{\sum (Y_i - \bar{Y})^2}}$$

Example 12.16

In some locations, there is a strong association between concentrations of two different pollutants. The article “The Carbon Component of the Los Angeles Aerosol: Source Apportionment and Contributions to the Visibility Budget” (J. of Air Pollution Control Fed., 1984: 643–650) reports the accompanying data on ozone concentration x (ppm) and secondary carbon concentration y (μg/m³).

x   .066  .088  .120  .050  .162  .186  .057  .100
y    4.6  11.6   9.5   6.3  13.8  15.4   2.5  11.8

x   .112  .055  .154  .074  .111  .140  .071  .110
y    8.0   7.0  20.6  16.6   9.2  17.9   2.8  13.0

The summary quantities are $n = 16$, $\sum x_i = 1.656$, $\sum y_i = 170.6$, $\sum x_i^2 = .196912$, $\sum x_i y_i = 20.0397$, and $\sum y_i^2 = 2253.56$, from which

$$r = \frac{20.0397 - (1.656)(170.6)/16}{\sqrt{.196912 - (1.656)^2/16}\,\sqrt{2253.56 - (170.6)^2/16}} = \frac{2.3826}{(.1597)(20.8456)} = .716$$

The point estimate of the population correlation coefficient $\rho$ between ozone concentration and secondary carbon concentration is $\hat{\rho} = r = .716$. ■

The small-sample intervals and test procedures presented in Chapters 7–9 were based on an assumption of population normality. To test hypotheses about $\rho$, an analogous assumption about the distribution of pairs of (x, y) values in the population is required. We are now assuming that both X and Y are random, whereas much of our regression work focused on x fixed by the experimenter.

ASSUMPTION

The joint probability distribution of (X, Y) is specified by

$$f(x, y) = \frac{1}{2\pi \sigma_1 \sigma_2 \sqrt{1 - \rho^2}} \exp\left\{ -\frac{\left(\dfrac{x - \mu_1}{\sigma_1}\right)^2 - 2\rho\,\dfrac{(x - \mu_1)(y - \mu_2)}{\sigma_1 \sigma_2} + \left(\dfrac{y - \mu_2}{\sigma_2}\right)^2}{2(1 - \rho^2)} \right\} \qquad -\infty < x < \infty,\ -\infty < y < \infty \tag{12.9}$$

where $\mu_1$ and $\sigma_1$ are the mean and standard deviation of X, and $\mu_2$ and $\sigma_2$ are the mean and standard deviation of Y; f(x, y) is called the bivariate normal probability distribution.

The bivariate normal distribution is obviously rather complicated, but for our purposes we need only a passing acquaintance with several of its properties. The surface determined by f(x, y) lies entirely above the x, y plane $[f(x, y) \ge 0]$ and has a three-dimensional bell- or mound-shaped appearance, as illustrated in Figure 12.21. If we slice through the surface with any plane perpendicular to the x, y plane and look at the shape of the curve sketched out on the “slicing plane,” the result is a normal curve. More precisely, if $X = x$, it can be shown that the (conditional) distribution of Y is normal with mean $\mu_{Y \cdot x} = \mu_2 - \rho\mu_1\sigma_2/\sigma_1 + \rho\sigma_2 x/\sigma_1$ and variance $(1 - \rho^2)\sigma_2^2$. This is exactly the model used in simple linear regression with $\beta_0 = \mu_2 - \rho\mu_1\sigma_2/\sigma_1$, $\beta_1 = \rho\sigma_2/\sigma_1$, and $\sigma^2 = (1 - \rho^2)\sigma_2^2$ independent of x. The implication is that if the observed pairs $(x_i, y_i)$ are actually drawn from a bivariate normal distribution, then the simple linear regression model is an appropriate way of studying the behavior of Y for fixed x. If $\rho = 0$, then $\mu_{Y \cdot x} = \mu_2$ independent of x; in fact, when $\rho = 0$, the joint probability density function $f(x, y)$ of (12.9) can be factored as $f_1(x) f_2(y)$, which implies that X and Y are independent variables.
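The density (12.9) and the two facts just stated about it (the normal conditional distribution of Y given X = x, and the factorization when $\rho = 0$) can be explored numerically. In the sketch below the parameter values and the evaluation point are hypothetical, chosen only for illustration, and the function names are not from the text.

```python
import math

def bivariate_normal_pdf(x, y, mu1, mu2, sigma1, sigma2, rho):
    """Density (12.9); requires sigma1 > 0, sigma2 > 0, and -1 < rho < 1."""
    zx = (x - mu1) / sigma1
    zy = (y - mu2) / sigma2
    quad = zx ** 2 - 2 * rho * zx * zy + zy ** 2
    const = 2 * math.pi * sigma1 * sigma2 * math.sqrt(1 - rho ** 2)
    return math.exp(-quad / (2 * (1 - rho ** 2))) / const

def normal_pdf(v, mu, sigma):
    """Univariate normal density."""
    z = (v - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def conditional_mean_sd(x, mu1, mu2, sigma1, sigma2, rho):
    """Mean and standard deviation of Y given X = x."""
    beta1 = rho * sigma2 / sigma1
    beta0 = mu2 - rho * mu1 * sigma2 / sigma1
    return beta0 + beta1 * x, math.sqrt(1 - rho ** 2) * sigma2

# Hypothetical parameter values and evaluation point
mu1, mu2, sigma1, sigma2, rho = 10.0, 50.0, 2.0, 5.0, 0.6
x0, y0 = 11.0, 47.0

# The joint density equals (marginal of X) * (conditional of Y given X = x0)
joint = bivariate_normal_pdf(x0, y0, mu1, mu2, sigma1, sigma2, rho)
cond_mean, cond_sd = conditional_mean_sd(x0, mu1, mu2, sigma1, sigma2, rho)
product = normal_pdf(x0, mu1, sigma1) * normal_pdf(y0, cond_mean, cond_sd)

# With rho = 0 the joint density factors into the two marginal densities
joint_indep = bivariate_normal_pdf(x0, y0, mu1, mu2, sigma1, sigma2, 0.0)
marginals = normal_pdf(x0, mu1, sigma1) * normal_pdf(y0, mu2, sigma2)

print(joint, product, joint_indep, marginals)
```

The first pair of printed numbers should agree, as should the second pair, up to floating-point rounding.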


Assuming that the pairs are drawn from a bivariate normal distribution allows us to test hypotheses about $\rho$ and to construct a CI. There is no completely satisfactory way to check the plausibility of the bivariate normality assumption. A partial check involves constructing two separate normal probability plots, one for the sample $x_i$’s and another for the sample $y_i$’s, since bivariate normality implies that the marginal distributions of both X and Y are normal. If either plot deviates substantially from a straight-line pattern, the following inferential procedures should not be used for small n.

Figure 12.21 A graph of the bivariate normal pdf (axes: x, y, and f(x, y))

Testing for the Absence of Correlation

When $H_0\colon \rho = 0$ is true, the test statistic

$$T = \frac{R\sqrt{n - 2}}{\sqrt{1 - R^2}}$$

has a t distribution with $n - 2$ df.

Alternative Hypothesis    Rejection Region for Level $\alpha$ Test
$H_a\colon \rho > 0$      $t \ge t_{\alpha, n-2}$
$H_a\colon \rho < 0$      $t \le -t_{\alpha, n-2}$
$H_a\colon \rho \ne 0$    either $t \ge t_{\alpha/2, n-2}$ or $t \le -t_{\alpha/2, n-2}$

A P-value based on $n - 2$ df can be calculated as described previously.

Example 12.17

Neurotoxic effects of manganese are well known and are usually caused by high occupational exposure over long periods of time. In the fields of occupational hygiene and environmental hygiene, the relationship between lipid peroxidation (which is responsible for deterioration of foods and damage to live tissue) and occupational exposure has not been previously reported. The article “Lipid Peroxidation in Workers Exposed to Manganese” (Scand. J. of Work and Environ. Health, 1996: 381–386) gives data on x = manganese concentration in blood (ppb) and y = concentration (mmol/L) of malondialdehyde, which is a stable product of lipid peroxidation, both for a sample of 22 workers exposed to manganese and for a control sample of 45 individuals. The value of r for the control sample is .29, from which

$$t = \frac{(.29)\sqrt{45 - 2}}{\sqrt{1 - (.29)^2}} \approx 2.0$$

The corresponding P-value for a two-tailed t test based on 43 df is roughly .052 (the cited article reported only that P-value > .05). We would not want to reject the assertion that $\rho = 0$ at either significance level .01 or .05. For the sample of exposed workers, $r = .83$ and $t \approx 6.7$, clear evidence that there is a linear relationship in the entire population of exposed workers from which the sample was selected. ■

Because $\rho$ measures the extent to which there is a linear relationship between the two variables in the population, the null hypothesis $H_0\colon \rho = 0$ states that there is no such population relationship. In Section 12.3, we used the t ratio $\hat{\beta}_1 / s_{\hat{\beta}_1}$ to test for a linear relationship between the two variables in the context of regression analysis. It turns out that the two test procedures are completely equivalent because $r\sqrt{n - 2}/\sqrt{1 - r^2} = \hat{\beta}_1 / s_{\hat{\beta}_1}$. When interest lies only in assessing the strength of any linear relationship rather than in fitting a model and using it to estimate or predict, the test statistic formula just presented requires fewer computations than does the t-ratio.

Other Inferences Concerning $\rho$

The procedure for testing $H_0\colon \rho = \rho_0$ when $\rho_0 \ne 0$ is not equivalent to any procedure from regression analysis. The test statistic is based on a transformation of R called the Fisher transformation.

PROPOSITION

When $(X_1, Y_1), \ldots, (X_n, Y_n)$ is a sample from a bivariate normal distribution, the rv

$$V = \frac{1}{2} \ln\left(\frac{1 + R}{1 - R}\right) \tag{12.10}$$

has approximately a normal distribution with mean and variance

$$\mu_V = \frac{1}{2} \ln\left(\frac{1 + \rho}{1 - \rho}\right) \qquad \sigma_V^2 = \frac{1}{n - 3}$$

The rationale for the transformation is to obtain a function of R that has a variance independent of $\rho$; this would not be the case with R itself. Also, the transformation should not be used if n is quite small, since the approximation will not be valid.

The test statistic for testing $H_0\colon \rho = \rho_0$ is

$$Z = \frac{V - \frac{1}{2}\ln[(1 + \rho_0)/(1 - \rho_0)]}{1/\sqrt{n - 3}}$$

Alternative Hypothesis       Rejection Region for Level $\alpha$ Test
$H_a\colon \rho > \rho_0$    $z \ge z_\alpha$
$H_a\colon \rho < \rho_0$    $z \le -z_\alpha$
$H_a\colon \rho \ne \rho_0$  either $z \ge z_{\alpha/2}$ or $z \le -z_{\alpha/2}$

A P-value can be calculated in the same manner as for previous z tests.
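Both test statistics of this section, the t statistic for $H_0\colon \rho = 0$ and the Fisher-transformation z statistic for $H_0\colon \rho = \rho_0$, are simple to compute. The sketch below uses only the Python standard library; the function names are illustrative. A P-value is returned only for the z test, since the standard library has no t distribution (a t table or a statistics package would be needed for that). The inputs are the values quoted in Examples 12.17 and 12.18.

```python
import math
from statistics import NormalDist

def t_statistic(r, n):
    """T = R*sqrt(n-2)/sqrt(1-R^2) for testing H0: rho = 0 (n - 2 df)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def fisher_v(r):
    """Fisher transformation (12.10)."""
    return 0.5 * math.log((1 + r) / (1 - r))

def fisher_z_test(r, n, rho0, alternative="greater"):
    """Return (z, P-value) for H0: rho = rho0 via the normal approximation."""
    z = (fisher_v(r) - fisher_v(rho0)) * math.sqrt(n - 3)
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "less":
        p_value = cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Example 12.17: control sample (r = .29, n = 45), exposed workers (r = .83, n = 22)
t_control = t_statistic(0.29, 45)
t_exposed = t_statistic(0.83, 22)

# Example 12.18: r = .761, n = 19, H0: rho = .5 versus Ha: rho > .5
z, p_value = fisher_z_test(0.761, 19, 0.5)

print(round(t_control, 2), round(t_exposed, 2), round(z, 2), round(p_value, 4))
```

Results may differ slightly in the last digit from the worked examples, which round v to three decimals before computing z.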


Example 12.18

The article “Size Effect in Shear Strength of Large Beams—Behavior and Finite Element Modelling” (Mag. of Concrete Res., 2005: 497–509) reported on a study of various characteristics of large reinforced concrete deep and shallow beams tested until failure. Consider the following data on x = cube strength and y = cylinder strength (both in MPa):

x   55.10  44.83  46.32  51.10  49.89  45.20  48.18  46.70  54.31  41.50
y   49.10  31.20  32.80  42.60  42.50  32.70  36.21  40.40  37.42  30.80

x   47.50  52.00  52.25  50.86  51.66  54.77  57.06  57.84  55.22
y   35.34  44.80  41.75  39.35  44.07  43.40  45.30  39.08  41.89

Then $S_{xx} = 367.74$, $S_{yy} = 488.54$, and $S_{xy} = 322.37$, from which $r = .761$. Does this provide strong evidence for concluding that the two measures of strength are at least moderately positively correlated?

Our previous interpretation of moderate positive correlation was $.5 < \rho < .8$, so we wish to test $H_0\colon \rho = .5$ versus $H_a\colon \rho > .5$. The computed value of V is then

$$v = .5 \ln\left(\frac{1 + .761}{1 - .761}\right) = .999 \quad \text{and} \quad .5 \ln\left(\frac{1 + .5}{1 - .5}\right) = .549$$

Thus $z = (.999 - .549)\sqrt{19 - 3} = 1.80$. The P-value for an upper-tailed test is .0359. The null hypothesis can therefore be rejected at significance level .05 but not at level .01. This latter result is somewhat surprising in light of the magnitude of r, but when n is small, a reasonably large r may result even when $\rho$ is not all that substantial. At significance level .01, the evidence for a moderately positive correlation is not compelling. ■

To obtain a CI for $\rho$, we first derive an interval for $\mu_V = \frac{1}{2}\ln[(1 + \rho)/(1 - \rho)]$. Standardizing V, writing a probability statement, and manipulating the resulting inequalities yields

$$\left(v - \frac{z_{\alpha/2}}{\sqrt{n - 3}},\ v + \frac{z_{\alpha/2}}{\sqrt{n - 3}}\right) \tag{12.11}$$

as a $100(1 - \alpha)\%$ interval for $\mu_V$, where $v = \frac{1}{2}\ln[(1 + r)/(1 - r)]$. This interval can then be manipulated to yield a CI for $\rho$.

A $100(1 - \alpha)\%$ confidence interval for $\rho$ is

$$\left(\frac{e^{2c_1} - 1}{e^{2c_1} + 1},\ \frac{e^{2c_2} - 1}{e^{2c_2} + 1}\right)$$

where $c_1$ and $c_2$ are the left and right endpoints, respectively, of the interval (12.11).


Example 12.19

The article “A Study of a Partial Nutrient Removal System for Wastewater Treatment Plants” (Water Research, 1972: 1389–1397) reports on a method of nitrogen removal that involves the treatment of the supernatant from an aerobic digester. Both the influent total nitrogen x (mg/L) and the percentage y of nitrogen removed were recorded for 20 days, with resulting summary statistics $\sum x_i = 285.90$, $\sum x_i^2 = 4409.55$, $\sum y_i = 690.30$, $\sum y_i^2 = 29{,}040.29$, and $\sum x_i y_i = 10{,}818.56$. The sample correlation coefficient between influent nitrogen and percentage nitrogen removed is $r = .733$, giving $v = .935$. With $n = 20$, a 95% confidence interval for $\mu_V$ is $(.935 - 1.96/\sqrt{17},\ .935 + 1.96/\sqrt{17}) = (.460, 1.410) = (c_1, c_2)$. The 95% interval for $\rho$ is

$$\left[\frac{e^{2(.46)} - 1}{e^{2(.46)} + 1},\ \frac{e^{2(1.41)} - 1}{e^{2(1.41)} + 1}\right] = (.43, .89)$$ ■

In Chapter 5, we cautioned that a large value of the correlation coefficient (near 1 or −1) implies only association and not causation. This applies to both $\rho$ and r.
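The interval calculation of Example 12.19 can be sketched in a few lines. The code relies on two identities: $\frac{1}{2}\ln[(1 + r)/(1 - r)]$ is the inverse hyperbolic tangent of r, and $(e^{2c} - 1)/(e^{2c} + 1)$ is $\tanh(c)$. The function name `correlation_ci` is illustrative, and bivariate normality with n not too small is assumed, as in the text.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, confidence=0.95):
    """Approximate CI for rho based on the Fisher transformation."""
    v = math.atanh(r)                          # v = .5 * ln((1 + r)/(1 - r))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    c1, c2 = v - half_width, v + half_width    # interval (12.11) for mu_V
    return math.tanh(c1), math.tanh(c2)        # back-transform to the rho scale

# Example 12.19: r = .733 based on n = 20 days
lower, upper = correlation_ci(0.733, 20)
print(round(lower, 2), round(upper, 2))
```

Because the back-transformation is nonlinear, the resulting interval is not symmetric about r.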

EXERCISES Section 12.5 (57–67)

57. The article “Behavioural Effects of Mobile Telephone Use During Simulated Driving” (Ergonomics, 1995: 2536–2562) reported that for a sample of 20 experimental subjects, the sample correlation coefficient for x = age and y = time since the subject had acquired a driving license (yr) was .97. Why do you think the value of r is so close to 1? (The article’s authors give an explanation.)

58. The Turbine Oil Oxidation Test (TOST) and the Rotating Bomb Oxidation Test (RBOT) are two different procedures for evaluating the oxidation stability of steam turbine oils. The article “Dependence of Oxidation Stability of Steam Turbine Oil on Base Oil Composition” (J. of the Society of Tribologists and Lubrication Engrs., Oct. 1997: 19–24) reported the accompanying observations on x � TOST time (hr) and y � RBOT time (min) for 12 oil specimens.

“Post-Harvest Glyphosphate Application Reduces Tough- ening, Fiber Content, and Lignification of Stored Asparagus Spears” (J. of the Amer. Soc. of Hort. Science, 1988: 569–572). The article reported the accompanying data (read from a graph) on x � shear force (kg) and y � percent fiber dry weight.

TOST 4200 3600 3750 3675 4050 2770 RBOT 370 340 375 310 350 200

TOST 4870 4500 3450 2700 3750 3300 RBOT 400 375 285 225 345 285

a. Calculate and interpret the value of the sample correla- tion coefficient (as do the article’s authors).

b. How would the value of r be affected if we had let x � RBOT time and y � TOST time?

c. How would the value of r be affected if RBOT time were expressed in hours?

d. Construct normal probability plots and comment. e. Carry out a test of hypotheses to decide whether RBOT

time and TOST time are linearly related.

59. Toughness and fibrousness of asparagus are major determi- nants of quality. This was the focus of a study reported in

x 46 48 55 57 60 72 81 85 94

y 2.18 2.10 2.13 2.28 2.34 2.53 2.28 2.62 2.63

x 109 121 132 137 148 149 184 185 187

y 2.50 2.66 2.79 2.80 3.01 2.98 3.34 3.49 3.26

a. Calculate the value of the sample correlation coefficient. Based on this value, how would you describe the nature of the relationship between the two variables?

b. If a first specimen has a larger value of shear force than does a second specimen, what tends to be true of percent dry fiber weight for the two specimens?

c. If shear force is expressed in pounds, what happens to the value of r? Why?

d. If the simple linear regression model were fit to this data, what proportion of observed variation in percent fiber dry weight could be explained by the model relationship?

e. Carry out a test at significance level .01 to decide whether there is a positive linear association between the two variables.

60. Head movement evaluations are important because individu- als, especially those who are disabled, may be able to operate communications aids in this manner. The article “Constancy

gyi 5 47.92, gyi 2 5 130.6074, gx iyi 5 5530.92

n 5 18, gx i 5 1950, gx i 2 5 251,970,

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

12.5 Correlation 517

of Head Turning Recorded in Healthy Young Humans” (J. of Biomed. Engr., 2008: 428–436) reported data on ranges in maximum inclination angles of the head in the clockwise anterior, posterior, right, and left directions for 14 randomly selected subjects. Consider the accompanying data on aver- age anterior maximum inclination angle (AMIA) both in the clockwise direction and in the counterclockwise direction.

Reduced Pressure Test as a Measuring Tool in the Evaluation of Porosity/Hydrogen Content in A1–7 Wt Pct Si-10 Vol Pct SiC(p) Metal Matrix Composite” (Metallurgical Trans., 1993: 1857–1868) gives the accompanying data on x � con- tent and y � gas porosity for one particular measurement technique.

x 400 750 770 800 850 1025 1200

y 3.80 4.00 4.90 5.20 4.00 3.50 6.30

x 1250 1300 1400 1475 1480 1505 2200

y 6.88 7.55 4.95 7.80 4.45 6.60 8.90

. A scatter plot shows a linear pattern. a. Test to see whether there is a positive correlation be-

tween maximal lactate level and muscular endurance in the population from which this data was selected.

b. If a regression analysis were to be carried out to predict endurance from lactate level, what proportion of ob- served variation in endurance could be attributed to the approximate linear relationship? Answer the analogous question if regression is used to predict lactate level from endurance—and answer both questions without doing any regression calculations.

62. Hydrogen content is conjectured to be an important factor in porosity of aluminum alloy castings. The article “The

Sxx 5 36.9839, Syy 5 2,628,930.357, Sxy 5 7377.704

x .18 .20 .21 .21 .21 .22 .23

y .46 .70 .41 .45 .55 .44 .24

x .23 .24 .24 .25 .28 .30 .37

y .47 .22 .80 .88 .70 .72 .75

Minitab gives the following output in response to a Correlation command: Correlation of Hydrcon and

a. Test at level .05 to see whether the population correlation coefficient differs from 0.

b. If a simple linear regression analysis had been carried out, what percentage of observed variation in porosity could be attributed to the model relationship?

63. Physical properties of six flame-retardant fabric samples were investigated in the article “Sensory and Physical Properties of Inherently Flame-Retardant Fabrics” (Textile Research, 1984: 61–68). Use the accompanying data and a .05 significance level to determine whether a linear relationship exists between stiffness x (mg-cm) and thickness y (mm). Is the result of the test surprising in light of the value of r?

Porosity 5 0.449

x 7.98 24.52 12.47 6.92 24.11 35.71

y .28 .65 .32 .27 .81 .57

64. The article “Increases in Steroid Binding Globulins Induced by Tamoxifen in Patients with Carcinoma of the Breast” (J. of Endocrinology, 1978: 219–226) reports data on the effects of the drug tamoxifen on change in the level of cor- tisol-binding globulin (CBG) of patients during treatment. With and , summary values are n � 26,

, and . a. Compute a 90% CI for the true correlation coefficient r. b. Test versus at level .05. c. In a regression analysis of y on x, what proportion of

variation in change of cortisol-binding globulin level could be explained by variation in patient age within the sample?

d. If you decide to perform a regression analysis with age as the dependent variable, what proportion of variation in age is explainable by variation in �CBG?

65. Torsion during hip external rotation and extension may explain why acetabular labral tears occur in professional ath- letes. The article “Hip Rotational Velocities During the Full Golf Swing” (J. of Sports Science and Med., 2009: 296–299)

Ha: r , 2.5H0: r 5 2.5

gx iyi 5 16,731g (yi 2 y) 2 5 465.34

gx i 5 1613, g (x i 2 x) 2 5 3756.96, gyi 5 281.9,

�CBG 5 yage 5 x

Subj: 1 2 3 4 5 6 7 Cl: 57.9 35.7 54.5 56.8 51.1 70.8 77.3 Co: 44.2 52.1 60.2 52.7 47.2 65.6 71.4

Subj: 8 9 10 11 12 13 14 Cl: 51.6 54.7 63.6 59.2 59.2 55.8 38.5 Co: 48.8 53.1 66.3 59.8 47.5 64.5 34.5

a. Calculate a point estimate of the population correlation coefficient between Cl AMIA and Co AMIA

. b. Assuming bivariate normality (normal probability plots

of the Cl and Co samples are reasonably straight), carry out a test at significance level .01 to decide whether there is a linear association between the two variables in the population (as do the authors of the cited paper). Would the conclusion have been the same if a significance level of .001 had been used?

61. The authors of the paper “Objective Effects of a Six Months’ Endurance and Strength Training Program in Outpatients with Congestive Heart Failure” (Medicine and Science in Sports and Exercise, 1999: 1102–1107) pre- sented a correlation analysis to investigate the relationship between maximal lactate level x and muscular endurance y. The accompanying data was read from a plot in the paper.

43,478.07, gClCo 5 44,187.87) 786.7, gCo 5 767.9, gCl2 5 45,727.31, gCo2 5

(gCl 5

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

reported on an investigation in which lead hip internal peak rotational velocity (x) and trailing hip peak external rotational velocity (y) were determined for a sample of 15 golfers. Data provided by the article’s authors was used to calculate the following summary quantities:

$\sum (x_i - \bar{x})^2 = 64{,}732.83$, $\sum (y_i - \bar{y})^2 = 130{,}566.96$, $\sum (x_i - \bar{x})(y_i - \bar{y}) = 44{,}185.87$

Separate normal probability plots showed very substantial linear patterns.
a. Calculate a point estimate for the population correlation coefficient.
b. Carry out a test at significance level .01 to decide whether there is a linear relationship between the two velocities in the sampled population; your conclusion should be based on a P-value.
c. Would the conclusion of (b) have changed if you had tested appropriate hypotheses to decide whether there is a positive linear association in the population? What if a significance level of .05 rather than .01 had been used?

66. Consider a time series (that is, a sequence of observations $X_1, X_2, \ldots$ obtained over time) with observed values $x_1, x_2, \ldots, x_n$. Suppose that the series shows no upward or downward trend over time. An investigator will frequently want to know just how strongly values in the series separated by a specified number of time units are related. The lag-one sample autocorrelation coefficient $r_1$ is just the value of the sample correlation coefficient r for the pairs $(x_1, x_2), (x_2, x_3), \ldots, (x_{n-1}, x_n)$, that is, pairs of values separated by one time unit. Similarly, the lag-two sample autocorrelation coefficient $r_2$ is r for the $n - 2$ pairs $(x_1, x_3), (x_2, x_4), \ldots, (x_{n-2}, x_n)$.
a. Calculate the values of $r_1$, $r_2$, and $r_3$ for the temperature data from Exercise 82 of Chapter 1, and comment.
b. Analogous to the population correlation coefficient $\rho$, let $\rho_1, \rho_2, \ldots$ denote the theoretical or long-run autocorrelation coefficients at the various lags. If all these $\rho$’s are 0, there is no (linear) relationship at any lag. In this case, if n is large, each $R_i$ has approximately a normal distribution with mean 0 and standard deviation $1/\sqrt{n}$, and different $R_i$’s are almost independent. Thus $H_0\colon \rho_i = 0$ can be rejected at a significance level of approximately .05 if either $r_i \ge 2/\sqrt{n}$ or $r_i \le -2/\sqrt{n}$. If $n = 100$ and $r_1 = .16$, $r_2 = -.09$, and $r_3 = -.15$, is there any evidence of theoretical autocorrelation at the first three lags?
c. If you are simultaneously testing the null hypothesis in part (b) for more than one lag, why might you want to increase the cutoff constant 2 in the rejection region?

67. A sample of $n = 500$ (x, y) pairs was collected and a test of $H_0\colon \rho = 0$ versus $H_a\colon \rho \ne 0$ was carried out. The resulting P-value was computed to be .00032.
a. What conclusion would be appropriate at level of significance .001?
b. Does this small P-value indicate that there is a very strong linear relationship between x and y (a value of $\rho$ that differs considerably from 0)? Explain.
c. Now suppose a sample of $n = 10{,}000$ (x, y) pairs resulted in $r = .022$. Test $H_0\colon \rho = 0$ versus $H_a\colon \rho \ne 0$ at level .05. Is the result statistically significant? Comment on the practical significance of your analysis.
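The lag-k sample autocorrelation described in Exercise 66 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names `pearson_r`, `lag_autocorrelation`, and `exceeds_cutoff`, and the short series at the bottom, are hypothetical and not taken from the text; the cutoff rule is the approximate $\pm 2/\sqrt{n}$ rule stated in part (b).

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def lag_autocorrelation(series, k):
    """r_k: the sample correlation of the n - k pairs (x_i, x_{i+k}), for k >= 1."""
    return pearson_r(series[:-k], series[k:])

def exceeds_cutoff(r_k, n, c=2.0):
    """Approximate rule from part (b): True when |r_k| >= c / sqrt(n)."""
    return abs(r_k) >= c / math.sqrt(n)

# Hypothetical series, used only to show the call pattern.
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1_trend = lag_autocorrelation(trend, 1)
r1_alternating = lag_autocorrelation(alternating, 1)
```

Raising the constant `c` above 2 is one way to express the more conservative cutoff that part (c) asks about.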


SUPPLEMENTARY EXERCISES (68–87)

68. The appraisal of a warehouse can appear straightforward compared to other appraisal assignments. A warehouse appraisal involves comparing a building that is primarily an open shell to other such buildings. However, there are still a number of warehouse attributes that are plausibly related to appraised value. The article “Challenges in Appraising ‘Simple’ Warehouse Properties” (Donald Sonneman, The Appraisal Journal, April 2001, 174–178) gives the accompanying data on truss height (ft), which determines how high stored goods can be stacked, and sale price ($) per square foot.

Truss height:  12     14     14     15     15     16     18     22     22     24
Sale price:    35.53  37.82  36.90  40.00  38.00  37.50  41.00  48.50  47.00  47.50

Truss height:  24     26     26     27     28     30     30     33     36
Sale price:    46.20  50.35  49.13  48.07  50.90  54.78  54.32  57.17  57.45

a. Is it the case that truss height and sale price are “deterministically” related (i.e., that sale price is determined completely and uniquely by truss height)? [Hint: Look at the data.]
b. Construct a scatterplot of the data. What does it suggest?
c. Determine the equation of the least squares line.
d. Give a point prediction of price when truss height is 27 ft, and calculate the corresponding residual.
e. What percentage of observed variation in sale price can be attributed to the approximate linear relationship between truss height and price?

69. Refer to the previous exercise, which gives data on truss heights for a sample of warehouses and the corresponding sale prices.
a. Estimate the true average change in sale price associated with a one-foot increase in truss height, and do so in a way that conveys information about the precision of estimation.
b. Estimate the true average sale price for all warehouses having a truss height of 25 ft, and do so in a way that conveys information about the precision of estimation.


c. Predict the sale price for a single warehouse whose truss height is 25 ft, and do so in a way that conveys information about the precision of prediction. How does this prediction compare to the estimate of (b)?
d. Without calculating any intervals, how would the width of a 95% prediction interval for sale price when truss height is 25 ft compare to the width of a 95% interval when height is 30 ft? Explain your reasoning.
e. Calculate and interpret the sample correlation coefficient.

70. Forensic scientists are often interested in making a measurement of some sort on a body (alive or dead) and then using that as a basis for inferring something about the age of the body. Consider the accompanying data on age (yr) and % D-aspartic acid (hereafter %DAA) from a particular tooth (“An Improved Method for Age at Death Determination from the Measurements of D-Aspartic Acid in Dental Collagen,” Archaeometry, 1990: 61–70).

Age:   9     10    11    12    13    14    33    39    52    65    69
%DAA:  1.13  1.10  1.11  1.10  1.24  1.31  2.25  2.54  2.93  3.40  4.55

Suppose a tooth from another individual has 2.01%DAA. Might it be the case that the individual is younger than 22? This question was relevant to whether or not the individual could receive a life sentence for murder.

A seemingly sensible strategy is to regress age on %DAA and then compute a PI for age when %DAA = 2.01. However, it is more natural here to regard age as the independent variable x and %DAA as the dependent variable y, so the regression model is $\%\mathrm{DAA} = \beta_0 + \beta_1 x + \epsilon$. After estimating the regression coefficients, we can substitute $y^* = 2.01$ into the estimated equation and then solve for a prediction of age $\hat{x}$. This “inverse” use of the regression line is called “calibration.” A PI for age with prediction level approximately $100(1 - \alpha)\%$ is $\hat{x} \pm t_{\alpha/2,\,n-2} \cdot SE$, where

$$SE = \frac{s}{\hat{\beta}_1} \left\{ 1 + \frac{1}{n} + \frac{(\hat{x} - \bar{x})^2}{S_{xx}} \right\}^{1/2}$$

Calculate this PI for $y^* = 2.01$ and then address the question posed earlier.
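The calibration interval of Exercise 70 can be written as a small function. The sketch below is illustrative only: the name `calibration_interval`, its argument names, and the small data set in the usage lines are hypothetical, and the t critical value is passed in by the caller rather than computed. Taking the absolute value of the slope (so that SE stays positive when the fitted slope is negative) is an assumption added here, not part of the formula as stated.

```python
import math

def calibration_interval(x, y, y_star, t_crit):
    """Inverse prediction of x from an observed response y_star.

    Fits y = b0 + b1*x by least squares, solves y_star = b0 + b1*x_hat,
    and returns (x_hat, lower, upper) with
    SE = (s / |b1|) * sqrt(1 + 1/n + (x_hat - x_bar)^2 / Sxx).
    t_crit is the critical value t_{alpha/2, n-2}, supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    x_hat = (y_star - b0) / b1
    se = (s / abs(b1)) * math.sqrt(1 + 1 / n + (x_hat - x_bar) ** 2 / sxx)
    return x_hat, x_hat - t_crit * se, x_hat + t_crit * se

# Hypothetical data (not the tooth data); 3.182 is t_{.025, 3} for n = 5.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.0]
x_hat, lower, upper = calibration_interval(x_demo, y_demo, 5.0, 3.182)
```

Supplying `t_crit` from a t table keeps the sketch free of third-party dependencies.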

71. The accompanying data on x = diesel oil consumption rate measured by the drain–weigh method and y = rate measured by the CI-trace method, both in g/hr, was read from a graph in the article “A New Measurement Method of Diesel Engine Oil Consumption Rate” (J. of Soc. of Auto Engr., 1985: 28–33).

x   4   5   8  11  12  16  17  20  22  28  30  31  39
y   5   7  10  10  14  15  13  25  20  24  31  28  39

a. Assuming that x and y are related by the simple linear regression model, carry out a test to decide whether it is plausible that on average the change in the rate measured by the CI-trace method is identical to the change in the rate measured by the drain–weigh method.
b. Calculate and interpret the value of the sample correlation coefficient.

72. The accompanying SAS output is based on data from the article “Evidence for and the Rate of Denitrification in the Arabian Sea” (Deep Sea Research, 1978: 431–435). The variables under study are x = salinity level (%) and y = nitrate level (μM/L).
a. What is the sample size n? [Hint: Look for degrees of freedom for SSE.]
b. Calculate a point estimate of expected nitrate level when salinity level is 35.5.
c. Does there appear to be a useful linear relationship between the two variables?
d. What is the value of the sample correlation coefficient?
e. Would you use the simple linear regression model to draw conclusions when the salinity level is 40?

SAS output for Exercise 72

Dependent Variable: NITRLVL

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1         64.49622      64.49622    63.309     0.0002
Error       6          6.11253       1.01875
C Total     7         70.60875

Root MSE    1.00933    R-square   0.9134
Dep Mean   26.91250    Adj R-sq   0.8990
C.V.        3.75043

Parameter Estimates

Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter = 0   Prob > |T|
INTERCEP    1           326.976038      37.71380243                     8.670       0.0001
SALINITY    1            -8.403964       1.05621381                    -7.957       0.0002

73. The presence of hard alloy carbides in high chromium white iron alloys results in excellent abrasion resistance, making them suitable for materials handling in the mining and materials processing industries. The accompanying data on x = retained austenite content (%) and y = abrasive wear loss (mm3) in pin wear tests with garnet as the abrasive was read from a plot in the article “Microstructure-Property Relationships in High Chromium White Iron Alloys” (Intl. Materials Reviews, 1996: 59–82).


x    4.6   17.0  17.4  18.0  18.5  22.4  26.5  30.0  34.0
y    .66   .92   1.45  1.03  .70   .73   1.20  .80   .91

x    38.8  48.2  63.5  65.8  73.9  77.2  79.8  84.0
y    1.19  1.15  1.12  1.37  1.45  1.50  1.36  1.29

Use the data and the SAS output below to answer the following questions.
a. What proportion of observed variation in wear loss can be attributed to the simple linear regression model relationship?
b. What is the value of the sample correlation coefficient?
c. Test the utility of the simple linear regression model using $\alpha = .01$.
d. Estimate the true average wear loss when content is 50% and do so in a way that conveys information about reliability and precision.
e. What value of wear loss would you predict when content is 30%, and what is the value of the corresponding residual?

SAS output for Exercise 73

Dependent Variable: ABRLOSS

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob > F
Model       1          0.63690       0.63690    15.444     0.0013
Error      15          0.61860       0.04124
C Total    16          1.25551

Root MSE    0.20308    R-square   0.5073
Dep Mean    1.10765    Adj R-sq   0.4744
C.V.       18.33410

Parameter Estimates

Variable   DF   Parameter Estimate   Standard Error   T for H0: Parameter = 0   Prob > |T|
INTERCEP    1             0.787218       0.09525879                     8.264       0.0001
AUSTCONT    1             0.007570       0.00192626                     3.930       0.0013

74. The accompanying data was read from a scatter plot in the article “Urban Emissions Measured with Aircraft” (J. of the Air and Waste Mgmt. Assoc., 1998: 16–25). The response variable is ΔNOy, and the explanatory variable is ΔCO.

ΔCO    50   60   95   108  135  210  214  315   720
ΔNOy   2.3  4.5  4.0  3.7  8.2  5.4  7.2  13.8  32.1

a. Fit an appropriate model to the data and judge the utility of the model.
b. Predict the value of ΔNOy that would result from making one more observation when ΔCO is 400, and do so in a way that conveys information about precision and reliability. Does it appear that ΔNOy can be accurately predicted? Explain.
c. The largest value of ΔCO is much greater than the other values. Does this observation appear to have had a substantial impact on the fitted equation?

75. An investigation was carried out to study the relationship between speed (ft/sec) and stride rate (number of steps taken/sec) among female marathon runners. Resulting summary quantities included $n = 11$, $\sum(\text{speed}) = 205.4$, $\sum(\text{speed})^2 = 3880.08$, $\sum(\text{rate}) = 35.16$, $\sum(\text{rate})^2 = 112.681$, and $\sum(\text{speed})(\text{rate}) = 660.130$.
a. Calculate the equation of the least squares line that you would use to predict stride rate from speed.
b. Calculate the equation of the least squares line that you would use to predict speed from stride rate.
c. Calculate the coefficient of determination for the regression of stride rate on speed of part (a) and for the regression of speed on stride rate of part (b). How are these related?

76. “Mode-mixity” refers to how much of crack propagation is attributable to the three conventional fracture modes of opening, sliding, and tearing. For plane problems, only the first two modes are present, and the mode-mixity angle is a measure of the extent to which propagation is due to sliding as opposed to opening. The article “Increasing Allowable Flight Loads by Improved Structural Modeling” (AIAA J., 2006: 376–381) gives the following data on x = mode-mixity angle (degrees) and y = fracture toughness (N/m) for sandwich panels used in aircraft construction.

x   16.52  17.53  18.05  18.50  22.39  23.89  25.50  24.89
y   609.4  443.1  577.9  628.7  565.7  711.0  863.4  956.2

x   23.48  24.98  25.55  25.90  22.65  23.69  24.15  24.54
y   679.5  707.5  767.1  817.8  702.3  903.7  964.9  1047.3

a. Obtain the equation of the estimated regression line, and discuss the extent to which the simple linear regression model is a reasonable way to relate fracture toughness to mode-mixity angle.


b. Does the data suggest that the average change in fracture toughness associated with a one-degree increase in mode-mixity angle exceeds 50 N/m? Carry out an appropriate test of hypotheses.
c. For purposes of precisely estimating the slope of the population regression line, would it have been preferable to make observations at the angles 16, 16, 18, 18, 20, 20, 20, 20, 22, 22, 22, 22, 24, 24, 26, and 26 (again a sample size of 16)? Explain your reasoning.
d. Calculate an estimate of true average fracture toughness and also a prediction of fracture toughness both for an angle of 18 degrees and for an angle of 22 degrees, do so in a manner that conveys information about reliability and precision, and then interpret and compare the estimates and predictions.

77. The article “Photocharge Effects in Dye Sensitized Ag[Br,I] Emulsions at Millisecond Range Exposures” (Photographic Sci. and Engr., 1981: 138–144) gives the accompanying data on x = % light absorption at 5800 Å and y = peak photovoltage.

x   4.0   8.7   12.7  19.1  21.4  24.6  28.9  29.8  30.5
y   .12   .28   .55   .68   .85   1.02  1.15  1.34  1.29

a. Construct a scatter plot of this data. What does it suggest?
b. Assuming that the simple linear regression model is appropriate, obtain the equation of the estimated regression line.
c. What proportion of the observed variation in peak photovoltage can be explained by the model relationship?
d. Predict peak photovoltage when % absorption is 19.1, and compute the value of the corresponding residual.
e. The article’s authors claim that there is a useful linear relationship between % absorption and peak photovoltage. Do you agree? Carry out a formal test.
f. Give an estimate of the change in expected peak photovoltage associated with a 1% increase in light absorption. Your estimate should convey information about the precision of estimation.
g. Repeat part (f) for the expected value of peak photovoltage when % light absorption is 20.

78. In Section 12.4, we presented a formula for $V(\hat{\beta}_0 + \hat{\beta}_1 x^*)$ and a CI for $\beta_0 + \beta_1 x^*$. Taking $x^* = 0$ gives $\sigma^2_{\hat{\beta}_0}$ and a CI for $\beta_0$. Use the data of Example 12.11 to calculate the estimated standard deviation of $\hat{\beta}_0$ and a 95% CI for the y-intercept of the true regression line.

79. Show that $SSE = S_{yy} - \hat{\beta}_1 S_{xy}$, which gives an alternative computational formula for SSE.

80. Suppose that x and y are positive variables and that a sample of n pairs results in $r \approx 1$. If the sample correlation coefficient is computed for the $(x, y^2)$ pairs, will the resulting value also be approximately 1? Explain.

81. Let $s_x$ and $s_y$ denote the sample standard deviations of the observed x’s and y’s, respectively [that is, $s_x^2 = \sum (x_i - \bar{x})^2/(n - 1)$ and similarly for $s_y^2$].
a. Show that an alternative expression for the estimated regression line $y = \hat{\beta}_0 + \hat{\beta}_1 x$ is

$$y = \bar{y} + r \cdot \frac{s_y}{s_x}(x - \bar{x})$$

b. This expression for the regression line can be interpreted as follows. Suppose $r = .5$. What then is the predicted y for an x that lies 1 SD ($s_x$ units) above the mean of the $x_i$’s? If r were 1, the prediction would be for y to lie 1 SD above its mean $\bar{y}$, but since $r = .5$, we predict a y that is only .5 SD ($.5s_y$ unit) above $\bar{y}$. Using the data in Exercise 64 for a patient whose age is 1 SD below the average age in the sample, by how many standard deviations is the patient’s predicted ΔCBG above or below the average ΔCBG for the sample?

82. Verify that the t statistic for testing $H_0\colon \beta_1 = 0$ in Section 12.3 is identical to the t statistic in Section 12.5 for testing $H_0\colon \rho = 0$.

83. Use the formula for computing SSE to verify that $r^2 = 1 - SSE/SST$.

84. In biofiltration of wastewater, air discharged from a treatment facility is passed through a damp porous membrane that causes contaminants to dissolve in water and be transformed into harmless products. The accompanying data on x = inlet temperature (°C) and y = removal efficiency (%) was the basis for a scatter plot that appeared in the article “Treatment of Mixed Hydrogen Sulfide and Organic Vapors in a Rock Medium Biofilter” (Water Environment Research, 2001: 426–435).

Obs   Temp    Removal %      Obs   Temp    Removal %
  1    7.68     98.09         17    8.55     98.27
  2    6.51     98.25         18    7.57     98.00
  3    6.43     97.82         19    6.94     98.09
  4    5.48     97.82         20    8.32     98.25
  5    6.57     97.82         21   10.50     98.41
  6   10.22     97.93         22   16.02     98.51
  7   15.69     98.38         23   17.83     98.71
  8   16.77     98.89         24   17.03     98.79
  9   17.13     98.96         25   16.18     98.87
 10   17.63     98.90         26   16.26     98.76
 11   16.72     98.68         27   14.44     98.58
 12   15.45     98.69         28   12.78     98.73
 13   12.06     98.51         29   12.25     98.45
 14   11.44     98.09         30   11.69     98.37
 15   10.17     98.25         31   11.34     98.36
 16    9.64     98.36         32   10.97     98.45

Calculated summary quantities are $\sum x_i = 384.26$, $\sum y_i = 3149.04$, $\sum x_i^2 = 5099.2412$, $\sum x_i y_i = 37{,}850.7762$, and $\sum y_i^2 = 309{,}892.6548$.


a. Does a scatter plot of the data suggest appropriateness of the simple linear regression model?
b. Fit the simple linear regression model, obtain a point prediction of removal efficiency when temperature = 10.50, and calculate the value of the corresponding residual.
c. Roughly what is the size of a typical deviation of points in the scatter plot from the least squares line?
d. What proportion of observed variation in removal efficiency can be attributed to the model relationship?
e. Estimate the slope coefficient in a way that conveys information about reliability and precision, and interpret your estimate.
f. Personal communication with the authors of the article revealed that there was one additional observation that was not included in their scatter plot: (6.53, 96.55). What impact does this additional observation have on the equation of the least squares line and the values of s and $r^2$?

85. Normal hatchery processes in aquaculture inevitably produce stress in fish, which may negatively impact growth, reproduction, flesh quality, and susceptibility to disease. Such stress manifests itself in elevated and sustained corticosteroid levels. The article “Evaluation of Simple Instruments for the Measurement of Blood Glucose and Lactate, and Plasma Protein as Stress Indicators in Fish” (J. of the World Aquaculture Society, 1999: 276–284) described an experiment in which fish were subjected to a stress protocol and then removed and tested at various times after the protocol had been applied. The accompanying data on x = time (min) and y = blood glucose level (mmol/L) was read from a plot.

x   2    2    5    7    12   13   17   18   23   24   26   28
y   4.0  3.6  3.7  4.0  3.8  4.0  5.1  3.9  4.4  4.3  4.3  4.4

x   29   30   34   36   40   41   44   56   56   57   60   60
y   5.8  4.3  5.5  5.6  5.1  5.7  6.1  5.1  5.9  6.8  4.9  5.7

Use the methods developed in this chapter to analyze the data, and write a brief report summarizing your conclusions (assume that the investigators are particularly interested in glucose level 30 min after stress).

86. The article “Evaluating the BOD POD for Assessing Body Fat in Collegiate Football Players” (Medicine and Science in Sports and Exercise, 1999: 1350–1356) reports on a new air displacement device for measuring body fat. The customary procedure utilizes the hydrostatic weighing device, which measures the percentage of body fat by means of water displacement. Here is representative data read from a graph in the paper.

BOD   2.5   4.0   4.1   6.2   7.1   7.0   8.3   9.2   9.3   12.0  12.2
HW    8.0   6.2   9.2   6.4   8.6   12.2  7.2   12.0  14.9  12.1  15.3

BOD   12.6  14.2  14.4  15.1  15.2  16.3  17.1  17.9  17.9
HW    14.8  14.3  16.3  17.9  19.5  17.5  14.3  18.3  16.2

a. Use various methods to decide whether it is plausible that the two techniques measure on average the same amount of fat.
b. Use the data to develop a way of predicting an HW measurement from a BOD POD measurement, and investigate the effectiveness of such predictions.

87. Reconsider the situation of Exercise 73, in which x = retained austenite content using a garnet abrasive and y = abrasive wear loss were related via the simple linear regression model $Y = \beta_0 + \beta_1 x + \epsilon$. Suppose that for a second type of abrasive, these variables are also related via the simple linear regression model $Y = \gamma_0 + \gamma_1 x + \epsilon$ and that $V(\epsilon) = \sigma^2$ for both types of abrasive. If the data set consists of $n_1$ observations on the first abrasive and $n_2$ on the second and if $SSE_1$ and $SSE_2$ denote the two error sums of squares, then a pooled estimate of $\sigma^2$ is $\hat{\sigma}^2 = (SSE_1 + SSE_2)/(n_1 + n_2 - 4)$. Let $SS_{x1}$ and $SS_{x2}$ denote $\sum (x_i - \bar{x})^2$ for the data on the first and second abrasives, respectively. A test of $H_0\colon \beta_1 - \gamma_1 = 0$ (equal slopes) is based on the statistic

$$T = \frac{\hat{\beta}_1 - \hat{\gamma}_1}{\hat{\sigma}\sqrt{\dfrac{1}{SS_{x1}} + \dfrac{1}{SS_{x2}}}}$$

When $H_0$ is true, T has a t distribution with $n_1 + n_2 - 4$ df. Suppose the 15 observations using the alternative abrasive give $SS_{x2} = 7152.5578$, $\hat{\gamma}_1 = .006845$, and $SSE_2 = .51350$. Using this along with the data of Exercise 73, carry out a test at level .05 to see whether expected change in wear loss associated with a 1% increase in austenite content is identical for the two types of abrasive.

Bibliography

Draper, Norman, and Harry Smith, Applied Regression Analysis (3rd ed.), Wiley, New York, 1999. The most comprehensive and authoritative book on regression analysis currently in print.

Neter, John, Michael Kutner, Christopher Nachtsheim, and William Wasserman, Applied Linear Statistical Models (5th ed.), Irwin, Homewood, IL, 2005. The first 14 chapters constitute an extremely readable and informative survey of regression analysis.


13 Nonlinear and Multiple Regression

INTRODUCTION

The probabilistic model studied in Chapter 12 specified that the observed value of the dependent variable Y deviated from the linear regression function $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ by a random amount. Here we consider two ways of generalizing the simple linear regression model. The first way is to replace $\beta_0 + \beta_1 x$ by a nonlinear function of x, and the second is to use a regression function involving more than a single independent variable. After fitting a regression function of the chosen form to the given data, it is of course important to have methods available for making inferences about the parameters of the chosen model. Before these methods are used, though, the data analyst should first assess the adequacy of the chosen model. In Section 13.1, we discuss methods, based primarily on a graphical analysis of the residuals (observed minus predicted y’s), for checking model adequacy.

In Section 13.2, we consider nonlinear regression functions of a single independent variable x that are “intrinsically linear.” By this we mean that it is possible to transform one or both of the variables so that the relationship between the resulting variables is linear. An alternative class of nonlinear relations is obtained by using polynomial regression functions of the form $\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k$; these polynomial models are the subject of Section 13.3. Multiple regression analysis involves building models for relating y to two or more independent variables. The focus in Section 13.4 is on interpretation of various multiple regression models and on understanding and using the regression output from various statistical computer packages. The last section of the chapter surveys some extensions and pitfalls of multiple regression modeling.


13.1 Assessing Model Adequacy

A plot of the observed pairs $(x_i, y_i)$ is a necessary first step in deciding on the form of a mathematical relationship between x and y. It is possible to fit many functions other than a linear one $(y = b_0 + b_1 x)$ to the data, using either the principle of least squares or another fitting method. Once a function of the chosen form has been fitted, it is important to check the fit of the model to see whether it is in fact appropriate. One way to study the fit is to superimpose a graph of the best-fit function on the scatter plot of the data. However, any tilt or curvature of the best-fit function may obscure some aspects of the fit that should be investigated. Furthermore, the scale on the vertical axis may make it difficult to assess the extent to which observed values deviate from the best-fit function.

Residuals and Standardized Residuals

A more effective approach to assessment of model adequacy is to compute the fitted or predicted values $\hat{y}_i$ and the residuals $e_i = y_i - \hat{y}_i$ and then plot various functions of these computed quantities. We then examine the plots either to confirm our choice of model or for indications that the model is not appropriate. Suppose the simple linear regression model is correct, and let $y = \hat{\beta}_0 + \hat{\beta}_1 x$ be the equation of the estimated regression line. Then the ith residual is $e_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$. To derive properties of the residuals, let $e_i = Y_i - \hat{Y}_i$ represent the ith residual as a random variable (rv) before observations are actually made. Then

$$E(Y_i - \hat{Y}_i) = E(Y_i) - E(\hat{\beta}_0 + \hat{\beta}_1 x_i) = \beta_0 + \beta_1 x_i - (\beta_0 + \beta_1 x_i) = 0 \qquad (13.1)$$

Because $\hat{Y}_i\ (= \hat{\beta}_0 + \hat{\beta}_1 x_i)$ is a linear function of the $Y_j$’s, so is $Y_i - \hat{Y}_i$ (the coefficients depend on the $x_j$’s). Thus the normality of the $Y_j$’s implies that each residual is normally distributed. It can also be shown that

$$V(Y_i - \hat{Y}_i) = \sigma^2 \cdot \left[ 1 - \frac{1}{n} - \frac{(x_i - \bar{x})^2}{S_{xx}} \right] \qquad (13.2)$$

Replacing $\sigma^2$ by $s^2$ and taking the square root of Equation (13.2) gives the estimated standard deviation of a residual.

Let’s now standardize each residual by subtracting the mean value (zero) and then dividing by the estimated standard deviation.

The standardized residuals are given by

$$e_i^* = \frac{y_i - \hat{y}_i}{s\sqrt{1 - \dfrac{1}{n} - \dfrac{(x_i - \bar{x})^2}{S_{xx}}}} \qquad i = 1, \ldots, n \qquad (13.3)$$

If, for example, a particular standardized residual is 1.5, then the residual itself is 1.5 (estimated) standard deviations larger than what would be expected from fitting the correct model. Notice that the variances of the residuals differ from one another. In fact, because there is a − sign in front of $(x_i - \bar{x})^2$, the variance of a residual decreases as $x_i$ moves further away from the center of the data $\bar{x}$. Intuitively, this is because the least squares line is pulled toward an observation whose $x_i$ value lies far to the right or left of other observations in the sample. Computation of the $e_i^*$’s can be tedious, but the most widely used statistical computer packages will provide these values and construct various plots involving them.

Example 13.1

Exercise 19 in Chapter 12 presented data on x = burner area liberation rate and y = NOx emissions. Here we reproduce the data and give the fitted values, residuals, and standardized residuals. The estimated regression line is $y = -45.55 + 1.71x$, and $r^2 = .961$. The standardized residuals are not a constant multiple of the residuals because the residual variances differ somewhat from one another.

 x_i    y_i    ŷ_i      e_i      e_i*
 100    150    125.6    24.4     .75
 125    140    168.4   −28.4    −.84
 125    180    168.4    11.6     .35
 150    210    211.1    −1.1    −.03
 150    190    211.1   −21.1    −.62
 200    320    296.7    23.3     .66
 200    280    296.7   −16.7    −.47
 250    400    382.3    17.7     .50
 250    430    382.3    47.7    1.35
 300    440    467.9   −27.9    −.80
 300    390    467.9   −77.9   −2.24
 350    600    553.4    46.6    1.39
 400    610    639.0   −29.0    −.92
 400    670    639.0    31.0     .99
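The quantities tabulated in Example 13.1 can be reproduced directly from Equation (13.3). The following Python sketch is one possible way to do so, using only the standard library; the function name `standardized_residuals` and its return layout are hypothetical choices made for this illustration, while the data are the 14 (x, y) pairs listed in the example.

```python
import math

def standardized_residuals(x, y):
    """Least squares fit plus residuals and standardized residuals (Eq. 13.3)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # estimated slope
    b0 = y_bar - b1 * x_bar             # estimated intercept
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    s = math.sqrt(sum(e * e for e in resid) / (n - 2))   # estimate of sigma
    std = [
        e / (s * math.sqrt(1 - 1 / n - (xi - x_bar) ** 2 / sxx))
        for e, xi in zip(resid, x)
    ]
    return b0, b1, fitted, resid, std

# Data of Example 13.1: burner area liberation rate (x) and NOx emissions (y).
x = [100, 125, 125, 150, 150, 200, 200, 250, 250, 300, 300, 350, 400, 400]
y = [150, 140, 180, 210, 190, 320, 280, 400, 430, 440, 390, 600, 610, 670]

b0, b1, fitted, resid, std = standardized_residuals(x, y)
for xi, yi, fi, e, es in zip(x, y, fitted, resid, std):
    print(f"{xi:4d} {yi:4d} {fi:7.1f} {e:7.1f} {es:6.2f}")
```

Because the divisor in Equation (13.3) depends on $x_i$, the ratio of each standardized residual to its residual changes from row to row, which is the point made at the end of the example.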

Diagnostic Plots

The basic plots that many statisticians recommend for an assessment of model validity and usefulness are the following:

1. $e_i^*$ (or $e_i$) on the vertical axis versus $x_i$ on the horizontal axis

2. $e_i^*$ (or $e_i$) on the vertical axis versus $\hat{y}_i$ on the horizontal axis

3. $\hat{y}_i$ on the vertical axis versus $y_i$ on the horizontal axis

4. A normal probability plot of the standardized residuals

Plots 1 and 2 are called residual plots (against the independent variable and fitted values, respectively), whereas Plot 3 is fitted against observed values.

If Plot 3 yields points close to the 45° line [slope +1 through (0, 0)], then the estimated regression function gives accurate predictions of the values actually observed. Thus Plot 3 provides a visual assessment of model effectiveness in making predictions. Provided that the model is correct, neither residual plot should exhibit distinct patterns. The residuals should be randomly distributed about 0 according to a normal distribution, so all but a very few standardized residuals should lie between −2 and +2 (i.e., all but a few residuals within 2 standard deviations of their expected value 0). The plot of standardized residuals versus $\hat{y}$ is really a combination of the two other plots, showing implicitly both how residuals vary with x and how fitted values compare with observed values. This latter plot is the single one most often recommended for multiple regression analysis. Plot 4 allows the analyst to assess the plausibility of the assumption that $\epsilon$ has a normal distribution.
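The four recommended plots differ only in which quantities are paired on the two axes, so their coordinates can be assembled before any plotting library is involved. The sketch below is a hypothetical helper (the name `diagnostic_plot_points`, the dictionary keys, and the small input lists are all assumptions made for illustration). For the normal probability plot it pairs the sorted standardized residuals with standard normal percentiles at $(i - .5)/n$, which is one common convention and is assumed here rather than taken from this section.

```python
from statistics import NormalDist

def diagnostic_plot_points(x, y, fitted, std_resid):
    """Return (horizontal, vertical) coordinate pairs for the four plots."""
    n = len(x)
    # Standard normal percentiles at (i - .5)/n, i = 1, ..., n (assumed convention).
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return {
        "std_resid_vs_x": list(zip(x, std_resid)),            # Plot 1
        "std_resid_vs_fitted": list(zip(fitted, std_resid)),  # Plot 2
        "fitted_vs_observed": list(zip(y, fitted)),           # Plot 3
        "normal_probability": list(zip(z, sorted(std_resid))),  # Plot 4
    }

# Hypothetical values, used only to show the call pattern.
x_demo = [1.0, 2.0, 3.0, 4.0]
y_demo = [2.2, 3.8, 6.3, 7.7]
fitted_demo = [2.1, 4.0, 6.0, 7.9]
std_demo = [0.4, -0.9, 1.2, -0.7]
points = diagnostic_plot_points(x_demo, y_demo, fitted_demo, std_demo)
```

Each list of pairs can then be passed to whatever plotting tool is at hand; points near the 45° line in `fitted_vs_observed` and a roughly straight `normal_probability` pattern correspond to the favorable cases described above.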


Example 13.2 (Example 13.1 continued)

Figure 13.1 presents a scatter plot of the data and the four plots just recommended. The plot of $\hat{y}$ versus y confirms the impression given by $r^2$ that x is effective in predicting y and also indicates that there is no observed y for which the predicted value is terribly far off the mark. Both residual plots show no unusual pattern or discrepant values. There is one standardized residual slightly outside the interval (-2, 2), but this is not surprising in a sample of size 14. The normal probability plot of the standardized residuals is reasonably straight. In summary, the plots leave us with no qualms about either the appropriateness of a simple linear relationship or the fit to the given data.

[Figure 13.1 panels: y vs. x with the fitted line $\hat{y} = -45.55 + 1.71x$; $\hat{y}$ vs. y; standardized residuals $e^*$ vs. x; standardized residuals $e^*$ vs. $\hat{y}$; normal probability plot of $e^*$ against z percentile]

Figure 13.1 Plots for the data from Example 13.1 ■

Difficulties and Remedies

Although we hope that our analysis will yield plots like those of Figure 13.1, quite frequently the plots will suggest one or more of the following difficulties:

1. A nonlinear probabilistic relationship between x and y is appropriate.

2. The variance of $\epsilon$ (and of Y) is not a constant $\sigma^2$ but depends on x.

3. The selected model fits the data well except for a very few discrepant or outlying data values, which may have greatly influenced the choice of the best-fit function.

4. The error term $\epsilon$ does not have a normal distribution.

5. When the subscript i indicates the time order of the observations, the $\epsilon_i$'s exhibit dependence over time.

6. One or more relevant independent variables have been omitted from the model.

Figure 13.2 presents residual plots corresponding to items 1–3, 5, and 6. In Chapter 4, we discussed patterns in normal probability plots that cast doubt on the assumption of an underlying normal distribution. Notice that the residuals from the data in Figure 13.2(d) with the circled point included would not by themselves necessarily suggest further analysis, yet when a new line is fit with that point deleted, the new line differs considerably from the original line. This type of behavior is more difficult to identify in multiple regression. It is most likely to arise when there is a single (or very few) data point(s) with independent variable value(s) far removed from the remainder of the data.

We now indicate briefly what remedies are available for the types of difficulties. For a more comprehensive discussion, one or more of the references on regression analysis should be consulted. If the residual plot looks something like that of Figure 13.2(a), exhibiting a curved pattern, then a nonlinear function of x may be fit.

[Six panels: standardized residuals $e^*$ against x in (a)-(c), y against x in (d), $e^*$ against time order of observation in (e), and $e^*$ against the omitted independent variable in (f)]

Figure 13.2 Plots that indicate abnormality in data: (a) nonlinear relationship; (b) nonconstant variance; (c) discrepant observation; (d) observation with large influence; (e) dependence in errors; (f) variable omitted

EXERCISES Section 13.1 (1–14)

1. Suppose the variables x = commuting distance and y = commuting time are related according to the simple linear regression model with $\sigma = 10$.

a. If n = 5 observations are made at the x values $x_1 = 5$, $x_2 = 10$, $x_3 = 15$, $x_4 = 20$, and $x_5 = 25$, calculate the standard deviations of the five corresponding residuals.

b. Repeat part (a) for $x_1 = 5$, $x_2 = 10$, $x_3 = 15$, $x_4 = 20$, and $x_5 = 50$.

The residual plot of Figure 13.2(b) suggests that, although a straight-line relationship may be reasonable, the assumption that $V(Y_i) = \sigma^2$ for each i is of doubtful validity. When the assumptions of Chapter 12 are valid, it can be shown that among all unbiased estimators of $\beta_0$ and $\beta_1$, the ordinary least squares estimators have minimum variance. These estimators give equal weight to each $(x_i, Y_i)$. If the variance of Y increases with x, then $Y_i$'s for large $x_i$ should be given less weight than those with small $x_i$. This suggests that $\beta_0$ and $\beta_1$ should be estimated by minimizing

\[ f_w(b_0, b_1) = \sum w_i [y_i - (b_0 + b_1 x_i)]^2 \qquad (13.4) \]

where the $w_i$'s are weights that decrease with increasing $x_i$. Minimization of Expression (13.4) yields weighted least squares estimates. For example, if the standard deviation of Y is proportional to x (for $x > 0$), that is, $V(Y) = kx^2$, then it can be shown that the weights $w_i = 1/x_i^2$ yield best estimators of $\beta_0$ and $\beta_1$. The books by John Neter et al. and by S. Chatterjee and Bertram Price contain more detail (see the chapter bibliography). Weighted least squares is used quite frequently by econometricians (economists who use statistical methods) to estimate parameters.

When plots or other evidence suggest that the data set contains outliers or points having large influence on the resulting fit, one possible approach is to omit these outlying points and recompute the estimated regression equation. This would certainly be correct if it were found that the outliers resulted from errors in recording data values or experimental errors. If no assignable cause can be found for the outliers, it is still desirable to report the estimated equation both with and without outliers omitted. Yet another approach is to retain possible outliers but to use an estimation principle that puts relatively less weight on outlying values than does the principle of least squares. One such principle is MAD (minimize absolute deviations), which selects $\hat{\beta}_0$ and $\hat{\beta}_1$ to minimize $\sum |y_i - (b_0 + b_1 x_i)|$. Unlike the estimates of least squares, there are no nice formulas for the MAD estimates; their values must be found by using an iterative computational procedure. Such procedures are also used when it is suspected that the $\epsilon_i$'s have a distribution that is not normal but instead has "heavy tails" (making it much more likely than for the normal distribution that discrepant values will enter the sample); robust regression procedures are those that produce reliable estimates for a wide variety of underlying error distributions. Least squares estimators are not robust in the same way that the sample mean $\bar{X}$ is not a robust estimator for $\mu$.

When a plot suggests time dependence in the error terms, an appropriate analysis may involve a transformation of the y's or else a model explicitly including a time variable. Lastly, a plot such as that of Figure 13.2(f), which shows a pattern in the residuals when plotted against an omitted variable, suggests that a multiple regression model that includes the previously omitted variable should be considered.
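Two of the remedies above, weighted least squares via Expression (13.4) and the MAD principle, can be sketched in plain Python. This is an illustration under stated assumptions rather than a recommended implementation: the data are hypothetical, the function names are invented here, and the MAD fit uses iteratively reweighted least squares, which is only one of several iterative schemes (the text does not name a specific procedure).

```python
def weighted_least_squares(x, y, w):
    """Return (b0, b1) minimizing sum of w_i * [y_i - (b0 + b1 * x_i)]**2."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def mad_fit(x, y, iterations=200, floor=1e-6):
    """Approximate the MAD line by iteratively reweighted least squares.

    Each pass reweights observation i by 1 / |current residual|, so the
    weighted squared residual behaves like an absolute residual. The floor
    (an assumed tuning constant) avoids division by zero.
    """
    b0, b1 = weighted_least_squares(x, y, [1.0] * len(x))  # start from OLS
    for _ in range(iterations):
        w = [1.0 / max(abs(yi - (b0 + b1 * xi)), floor) for xi, yi in zip(x, y)]
        b0, b1 = weighted_least_squares(x, y, w)
    return b0, b1


# Hypothetical data: the exact line y = 2 + 3x, plus one discrepant value.
x = [float(v) for v in range(1, 11)]
y_clean = [2.0 + 3.0 * xi for xi in x]
y_outlier = y_clean[:-1] + [y_clean[-1] + 50.0]

# Ordinary least squares is the special case of equal weights.
ols = weighted_least_squares(x, y_outlier, [1.0] * len(x))
# Weights w_i = 1 / x_i**2, as suggested when sd(Y) is proportional to x.
wls = weighted_least_squares(x, y_clean, [1.0 / xi ** 2 for xi in x])
# MAD puts less weight on the discrepant point than least squares does.
mad = mad_fit(x, y_outlier)
```

With these data the least squares slope in `ols` is pulled well away from 3 by the single outlier, whereas the slope in `mad` stays close to 3.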

c. What do the results of parts (a) and (b) imply about the deviation of the estimated line from the observation made at the largest sampled x value?

2. The x values and standardized residuals for the chlorine flow/etch rate data of Exercise 52 (Section 12.4) are displayed in the accompanying table. Construct a standardized residual plot and comment on its appearance.

x     1.50   1.50   2.00   2.50   2.50   3.00   3.50   3.50   4.00

e*     .31   1.02  -1.15  -1.23    .23    .73  -1.36   1.53    .07

3. Example 12.6 presented the residuals from a simple linear regression of moisture content y on filtration rate x.

a. Plot the residuals against x. Does the resulting plot suggest that a straight-line regression function is a reasonable choice of model? Explain your reasoning.

b. Using $s = .665$, compute the values of the standardized residuals. Is $e_i^* \approx e_i/s$ for $i = 1, \ldots, n$, or are the $e_i^*$'s not close to being proportional to the $e_i$'s?

c. Plot the standardized residuals against x. Does the plot differ significantly in general appearance from the plot of part (a)?

4. Wear resistance of certain nuclear reactor components made of Zircaloy-2 is partly determined by properties of the oxide layer. The following data appears in an article that proposed a new nondestructive testing method to monitor thickness of the layer (“Monitoring of Oxide Layer Thickness on Zircaloy-2 by the Eddy Current Test Method,” J. of Testing and Eval., 1987: 333–336). The variables are x = oxide-layer thickness (μm) and y = eddy-current response (arbitrary units).

x 0 7 17 114 133

y 20.3 19.8 19.5 15.9 15.1

x 142 190 218 237 285

y 14.7 11.9 11.5 8.3 6.6

a. The authors summarized the relationship by giving the equation of the least squares line as $y = 20.6 - .047x$. Calculate and plot the residuals against x and then comment on the appropriateness of the simple linear regression model.

b. Use $s = .7921$ to calculate the standardized residuals from a simple linear regression. Construct a standardized residual plot and comment. Also construct a normal probability plot and comment.

5. As the air temperature drops, river water becomes supercooled and ice crystals form. Such ice can significantly affect the hydraulics of a river. The article “Laboratory Study of Anchor Ice Growth” (J. of Cold Regions Engr., 2001: 60–66) described an experiment in which ice thickness (mm) was studied as a function of elapsed time (hr) under specified conditions. The following data was read from a graph in the article: n = 33; x = .17, .33, .50, .67, ..., 5.50; y = .50, 1.25, 1.50, 2.75, 3.50, 4.75, 5.75, 5.60, 7.00, 8.00, 8.25, 9.50, 10.50, 11.00, 10.75, 12.50, 12.25, 13.25, 15.50, 15.00, 15.25, 16.25, 17.25, 18.00, 18.25, 18.15, 20.25, 19.50, 20.00, 20.50, 20.60, 20.50, 19.80.

a. The r2 value resulting from a least squares fit is .977. Interpret this value and comment on the appropriateness of assuming an approximate linear relationship.

b. The residuals, listed in the same order as the x values, are

-1.03 -0.92 -1.35 -0.78 -0.68 -0.11 0.21 -0.59 0.13 0.45 0.06 0.62 0.94 0.80 -0.14 0.93 0.04 0.36 1.92 0.78 0.35 0.67 1.02 1.09 0.66 -0.09 1.33 -0.10 -0.24 -0.43 -1.01 -1.75 -3.14

Plot the residuals against elapsed time. What does the plot suggest?

6. The accompanying scatter plot is based on data provided by authors of the article “Spurious Correlation in the USEPA Rating Curve Method for Estimating Pollutant Loads” (J. of Envir. Engr., 2008: 610–618); here discharge is in ft³/s as opposed to m³/s used in the article. The point on the far right of the plot corresponds to the observation (140, 1529.35). The resulting standardized residual is 3.10. Minitab flags the observation with an R for large residual and an X for potentially influential observation. Here is some information on the estimated slope:

                        Full sample    (140, 1529.35) deleted

$\hat{\beta}_1$         9.9050         8.8241

$s_{\hat{\beta}_1}$     .3806          .4734

Does this observation appear to have had a substantial impact on the estimated slope? Explain.

[Scatter plot for Exercise 6: Load (Kg N/day) versus Discharge (cfs), with fitted line Load = -13.58 + 9.905 Discharge; S = 69.0107, R-Sq = 92.5%, R-Sq(adj) = 92.4%]

7. Composite honeycomb sandwich panels are widely used in various aerospace structural applications such as ribs, flaps, and rudders. The article “Core Crush Problem in Manufacturing of Composite Sandwich Structures: Mechanisms and Solutions” (Amer. Inst. of Aeronautics and Astronautics J., 2006: 901–907) fit a line to the following data on x = prepreg thickness (mm) and y = core crush (%):

x   .246  .250  .251  .251  .254  .262  .264  .270  .272  .277  .281  .289  .290  .292  .293

y   16.0  11.0  15.0  10.5  13.5   7.5   6.1   1.7   3.6   0.7   0.9   1.0   0.7   3.0   3.1

a. Fit the simple linear regression model. What proportion of the observed variation in core crush can be attributed to the model relationship?

b. Construct a scatter plot. Does the plot suggest that a linear probabilistic relationship is appropriate?

c. Obtain the residuals and standardized residuals, and then construct residual plots. What do these plots suggest? What type of function should provide a better fit to the data than does a straight line?

8. Continuous recording of heart rate can be used to obtain information about the level of exercise intensity or physical strain during sports participation, work, or other daily activities. The article “The Relationship Between Heart Rate and Oxygen Uptake During Non-Steady State Exercise” (Ergonomics, 2000: 1578–1592) reported on a study to investigate using heart rate response (x, as a percentage of the maximum rate) to predict oxygen uptake (y, as a percentage of maximum uptake) during exercise. The accompanying data was read from a graph in the article.

HR    43.5  44.0  44.0  44.5  44.0  45.0  48.0  49.0  49.5  51.0  54.5  57.5  57.7  61.0  63.0  72.0

VO2   22.0  21.0  22.0  21.5  25.5  24.5  30.0  28.0  32.0  29.0  38.5  30.5  57.0  40.0  58.0  72.0

Use a statistical software package to perform a simple linear regression analysis, paying particular attention to the presence of any unusual or influential observations.

9. Consider the following four (x, y) data sets; the first three have the same x values, so these values are listed only once (Frank Anscombe, “Graphs in Statistical Analysis,” Amer. Statistician, 1973: 17–21):

Data Set    1–3      1       2       3       4       4

Variable     x       y       y       y       x       y

            10.0    8.04    9.14    7.46     8.0    6.58
             8.0    6.95    8.14    6.77     8.0    5.76
            13.0    7.58    8.74   12.74     8.0    7.71
             9.0    8.81    8.77    7.11     8.0    8.84
            11.0    8.33    9.26    7.81     8.0    8.47
            14.0    9.96    8.10    8.84     8.0    7.04
             6.0    7.24    6.13    6.08     8.0    5.25
             4.0    4.26    3.10    5.39    19.0   12.50
            12.0   10.84    9.13    8.15     8.0    5.56
             7.0    4.82    7.26    6.42     8.0    7.91
             5.0    5.68    4.74    5.73     8.0    6.89

For each of these four data sets, the values of the summary statistics $\sum x_i$, $\sum x_i^2$, $\sum y_i$, $\sum y_i^2$, and $\sum x_i y_i$ are virtually identical, so all quantities computed from these five will be essentially identical for the four sets: the least squares line $(y = 3 + .5x)$, SSE, $s^2$, $r^2$, t intervals, t statistics, and so on. The summary statistics provide no way of distinguishing among the four data sets. Based on a scatter plot and a residual plot for each set, comment on the appropriateness or inappropriateness of fitting a straight-line model; include in your comments any specific suggestions for how a “straight-line analysis” might be modified or qualified.

10. a. Show that $\sum_{i=1}^{n} e_i = 0$ when the $e_i$'s are the residuals from a simple linear regression.

b. Are the residuals from a simple linear regression independent of one another, positively correlated, or negatively correlated? Explain.

c. Show that $\sum_{i=1}^{n} x_i e_i = 0$ for the residuals from a simple linear regression. (This result along with part (a) shows that there are two linear restrictions on the $e_i$'s, resulting in a loss of 2 df when the squared residuals are used to estimate $\sigma^2$.)

d. Is it true that $\sum_{i=1}^{n} e_i^* = 0$? Give a proof or a counterexample.

11. a. Express the ith residual $Y_i - \hat{Y}_i$ (where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$) in the form $\sum c_j Y_j$, a linear function of the $Y_j$'s. Then use rules of variance to verify that $V(Y_i - \hat{Y}_i)$ is given by Expression (13.2).

b. It can be shown that $\hat{Y}_i$ and $Y_i - \hat{Y}_i$ (the ith predicted value and residual) are independent of one another. Use this fact, the relation $Y_i = \hat{Y}_i + (Y_i - \hat{Y}_i)$, and the expression for $V(\hat{Y})$ from Section 12.4 to again verify Expression (13.2).

c. As $x_i$ moves farther away from $\bar{x}$, what happens to $V(\hat{Y}_i)$ and to $V(Y_i - \hat{Y}_i)$?

12. a. Could a linear regression result in residuals 23, -27, 5, 17, -8, 9, and 15? Why or why not?

b. Could a linear regression result in residuals 23, -27, 5, 17, -8, -12, and 2 corresponding to x values 3, -4, 8, 12, -14, -20, and 25? Why or why not? [Hint: See Exercise 10.]

13. Recall that $\hat{\beta}_0 + \hat{\beta}_1 x$ has a normal distribution with expected value $\beta_0 + \beta_1 x$ and variance

\[ \sigma^2 \left( \frac{1}{n} + \frac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right) \]

so that

\[ Z = \frac{\hat{\beta}_0 + \hat{\beta}_1 x - (\beta_0 + \beta_1 x)}{\sigma \left( \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right)^{1/2}} \]

has a standard normal distribution. If $S = \sqrt{SSE/(n - 2)}$ is substituted for $\sigma$, the resulting variable has a t distribution with $n - 2$ df. By analogy, what is the distribution of any particular standardized residual? If $n = 25$, what is the probability that a particular standardized residual falls outside the interval (-2.50, 2.50)?

14. If there is at least one x value at which more than one observation has been made, there is a formal test procedure for testing

$H_0$: $\mu_{Y \cdot x} = \beta_0 + \beta_1 x$ for some values $\beta_0$, $\beta_1$ (the true regression function is linear)

versus

$H_a$: $H_0$ is not true (the true regression function is not linear)

Suppose observations are made at $x_1, x_2, \ldots, x_c$. Let $Y_{11}, Y_{12}, \ldots, Y_{1n_1}$ denote the $n_1$ observations when $x = x_1$; ...; $Y_{c1}, Y_{c2}, \ldots, Y_{cn_c}$ denote the $n_c$ observations when $x = x_c$. With $n = \sum n_i$ (the total number of observations), SSE has $n - 2$ df. We break SSE into two pieces, SSPE (pure error) and SSLF (lack of fit), as follows:

\[ SSPE = \sum_i \sum_j (Y_{ij} - \bar{Y}_{i\cdot})^2 = \sum \sum Y_{ij}^2 - \sum n_i \bar{Y}_{i\cdot}^2 \]

\[ SSLF = SSE - SSPE \]

The $n_i$ observations at $x_i$ contribute $n_i - 1$ df to SSPE, so the number of degrees of freedom for SSPE is $\sum_i (n_i - 1) = n - c$, and the degrees of freedom for SSLF is $n - 2 - (n - c) = c - 2$. Let $MSPE = SSPE/(n - c)$ and $MSLF = SSLF/(c - 2)$. Then it can be shown that whereas $E(MSPE) = \sigma^2$ whether or not $H_0$ is true, $E(MSLF) = \sigma^2$ if $H_0$ is true and $E(MSLF) > \sigma^2$ if $H_0$ is false.

Test statistic: $F = \dfrac{MSLF}{MSPE}$

Rejection region: $f \geq F_{\alpha, c-2, n-c}$

The following data comes from the article “Changes in Growth Hormone Status Related to Body Weight of Growing Cattle” (Growth, 1977: 241–247), with x = body weight and y = metabolic clearance rate/body weight.

x   110  110  110  230  230  230  360  360  360  360  505  505  505  505

y   235  198  173  174  149  124  115  130  102   95  122  112   98   96

(So $c = 4$, $n_1 = n_2 = 3$, $n_3 = n_4 = 4$.)

a. Test $H_0$ versus $H_a$ at level .05 using the lack-of-fit test just described.

b. Does a scatter plot of the data suggest that the relationship between x and y is linear? How does this compare with the result of part (a)? (A nonlinear regression function was used in the article.)

13.2 Regression with Transformed Variables

The necessity for an alternative to the linear model $Y = \beta_0 + \beta_1 x + \epsilon$ may be suggested either by a theoretical argument or else by examining diagnostic plots from a linear regression analysis. In either case, settling on a model whose parameters can be easily estimated is desirable. An important class of such models is specified by means of functions that are “intrinsically linear.”

DEFINITION

A function relating y to x is intrinsically linear if, by means of a transformation on x and/or y, the function can be expressed as $y' = \beta_0 + \beta_1 x'$, where $x' =$ the transformed independent variable and $y' =$ the transformed dependent variable.

Four of the most useful intrinsically linear functions are given in Table 13.1. In each case, the appropriate transformation is either a log transformation (either base 10 or natural logarithm, base e) or a reciprocal transformation. Representative graphs of the four functions appear in Figure 13.3.

Table 13.1 Useful Intrinsically Linear Functions*

Function                                        Transformation(s) to Linearize      Linear Form

a. Exponential: $y = \alpha e^{\beta x}$        $y' = \ln(y)$                       $y' = \ln(\alpha) + \beta x$
b. Power: $y = \alpha x^{\beta}$                $y' = \log(y)$, $x' = \log(x)$      $y' = \log(\alpha) + \beta x'$
c. $y = \alpha + \beta \cdot \log(x)$           $x' = \log(x)$                      $y = \alpha + \beta x'$
d. Reciprocal: $y = \alpha + \beta \cdot (1/x)$ $x' = 1/x$                          $y = \alpha + \beta x'$

*When $\log(\cdot)$ appears, either a base 10 or a base e logarithm can be used.

For an exponential function relationship, only y is transformed to achieve linearity, whereas for a power function relationship, both x and y are transformed. Because the variable x is in the exponent in an exponential relationship, y increases (if $\beta > 0$) or decreases (if $\beta < 0$) much more rapidly as x increases than is the case for the power function, though over a short interval of x values it can be difficult to differentiate between the two functions. Examples of functions that are not intrinsically linear are $y = \alpha + \gamma e^{\beta x}$ and $y = \alpha + \gamma x^{\beta}$.

Figure 13.3 Graphs of the intrinsically linear functions given in Table 13.1 [panels (a) through (d), each plotting y against x]

Intrinsically linear functions lead directly to probabilistic models that, though not linear in x as a function, have parameters whose values are easily estimated using ordinary least squares.

DEFINITION

A probabilistic model relating Y to x is intrinsically linear if, by means of a transformation on Y and/or x, it can be reduced to a linear probabilistic model $Y' = \beta_0 + \beta_1 x' + \epsilon'$.

The intrinsically linear probabilistic models that correspond to the four functions of Table 13.1 are as follows:

a. $Y = \alpha e^{\beta x} \cdot \epsilon$, a multiplicative exponential model, from which $\ln(Y) = Y' = \beta_0 + \beta_1 x' + \epsilon'$ with $x' = x$, $\beta_0 = \ln(\alpha)$, $\beta_1 = \beta$, and $\epsilon' = \ln(\epsilon)$.

b. $Y = \alpha x^{\beta} \cdot \epsilon$, a multiplicative power model, so that $\log(Y) = Y' = \beta_0 + \beta_1 x' + \epsilon'$ with $x' = \log(x)$, $\beta_0 = \log(\alpha)$, $\beta_1 = \beta$, and $\epsilon' = \log(\epsilon)$.

c. $Y = \alpha + \beta \log(x) + \epsilon$, so that $x' = \log(x)$ immediately linearizes the model.

d. $Y = \alpha + \beta \cdot 1/x + \epsilon$, so that $x' = 1/x$ yields a linear model.

The additive exponential and power models, $Y = \alpha e^{\beta x} + \epsilon$ and $Y = \alpha x^{\beta} + \epsilon$, are not intrinsically linear. Notice that both (a) and (b) require a transformation on Y and, as a result, a transformation on the error variable $\epsilon$. In fact, if $\epsilon$ has a lognormal distribution (see Chapter 4) with $E(\epsilon) = e^{\sigma^2/2}$ and $V(\epsilon) = \tau^2$ independent of x, then the transformed models for both (a) and (b) will satisfy all the assumptions of Chapter 12 regarding the linear probabilistic model; this in turn implies that all inferences for the parameters of the transformed model based on these assumptions will be valid. If $\sigma^2$ is small, $\mu_{Y \cdot x} \approx \alpha e^{\beta x}$ in (a) or $\alpha x^{\beta}$ in (b).

The major advantage of an intrinsically linear model is that the parameters $\beta_0$ and $\beta_1$ of the transformed model can be immediately estimated using the principle of least squares simply by substituting $x'$ and $y'$ into the estimating formulas:

\[ \hat{\beta}_1 = \frac{\sum x_i' y_i' - \sum x_i' \sum y_i' / n}{\sum (x_i')^2 - (\sum x_i')^2 / n} \qquad \hat{\beta}_0 = \frac{\sum y_i' - \hat{\beta}_1 \sum x_i'}{n} = \bar{y}' - \hat{\beta}_1 \bar{x}' \qquad (13.5) \]

Parameters of the original nonlinear model can then be estimated by transforming back $\hat{\beta}_0$ and/or $\hat{\beta}_1$ if necessary. Once a prediction interval for $y'$ when $x' = x'^*$ has been calculated, reversing the transformation gives a PI for y itself. In cases (a) and (b), when $\sigma^2$ is small, an approximate CI for $\mu_{Y \cdot x^*}$ results from taking antilogs of the limits in the CI for $\beta_0 + \beta_1 x'^*$ (strictly speaking, taking antilogs gives a CI for the median of the Y distribution, i.e., for $\tilde{\mu}_{Y \cdot x^*}$. Because the lognormal distribution is positively skewed, $\mu > \tilde{\mu}$; the two are approximately equal if $\sigma^2$ is close to 0.)

Example 13.3

Taylor's equation for tool life y as a function of cutting time x states that $xy^c = k$ or, equivalently, that $y = \alpha x^{\beta}$. The article “The Effect of Experimental Error on the Determination of Optimum Metal Cutting Conditions” (J. of Engr. for Industry, 1967: 315–322) observes that the relationship is not exact (deterministic) and that the parameters $\alpha$ and $\beta$ must be estimated from data. Thus an appropriate model is the multiplicative power model $Y = \alpha \cdot x^{\beta} \cdot \epsilon$, which the author fit to the accompanying data consisting of 12 carbide tool life observations (Table 13.2). In addition to the x, y, $x'$, and $y'$ values, the predicted transformed values $(\hat{y}')$ and the predicted values on the original scale ($\hat{y}$, after transforming back) are given.

The summary statistics for fitting a straight line to the transformed data are $\sum x_i' = 74.41200$, $\sum y_i' = 26.22601$, $\sum x_i'^2 = 461.75874$, $\sum y_i'^2 = 67.74609$, and $\sum x_i' y_i' = 160.84601$, so

\[ \hat{\beta}_1 = \frac{160.84601 - (74.41200)(26.22601)/12}{461.75874 - (74.41200)^2/12} = -5.3996 \]

\[ \hat{\beta}_0 = \frac{26.22601 - (-5.3996)(74.41200)}{12} = 35.6684 \]

The estimated values of $\alpha$ and $\beta$, the parameters of the power function model, are $\hat{\beta} = \hat{\beta}_1 = -5.3996$ and $\hat{\alpha} = e^{\hat{\beta}_0} = 3.094491530 \cdot 10^{15}$. Thus the estimated regression function is $\hat{\mu}_{Y \cdot x} \approx 3.094491530 \cdot 10^{15} \cdot x^{-5.3996}$. To recapture Taylor's (estimated) equation, set $y = 3.094491530 \cdot 10^{15} \cdot x^{-5.3996}$, whence $xy^{.185} = 740$.

Figure 13.4(a) gives a plot of the standardized residuals from the linear regression using transformed variables (for which $r^2 = .922$); there is no apparent pattern in the plot, though one standardized residual is a bit large, and the residuals look as they should for a simple linear regression. Figure 13.4(b) pictures a plot of $\hat{y}$ versus y, which indicates satisfactory predictions on the original scale.

To obtain a confidence interval for median tool life when cutting time is 500, we transform $x = 500$ to $x' = 6.21461$. Then $\hat{\beta}_0 + \hat{\beta}_1 x' = 2.1120$, and a 95% CI for $\beta_0 + \beta_1(6.21461)$ is (from Section 12.4) $2.1120 \pm (2.228)(.0824) = (1.928, 2.296)$. The 95% CI for $\tilde{\mu}_{Y \cdot 500}$ is then obtained by taking antilogs: $(e^{1.928}, e^{2.296}) = (6.876, 9.930)$. It is easily checked that for the transformed data $s^2 = \hat{\sigma}^2 \approx .081$. Because this is quite small, (6.876, 9.930) is an approximate interval for $\mu_{Y \cdot 500}$.

Figure 13.4 (a) Standardized residuals versus $x'$ from Example 13.3; (b) $\hat{y}$ versus y from Example 13.3 ■
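The arithmetic of Example 13.3 can be retraced in a short script. This is a sketch for checking the numbers, not the article's own computation: it uses the 12 (x, y) pairs of Table 13.2, takes the t critical value 2.228 directly from the example rather than computing it, and the variable names are illustrative.

```python
import math

# Table 13.2: cutting time x and tool life y for 12 carbide tools.
x = [600] * 4 + [500] * 4 + [400] * 4
y = [2.35, 2.65, 3.00, 3.60, 6.40, 7.80, 9.80, 16.50,
     21.50, 24.50, 26.00, 33.00]

xp = [math.log(v) for v in x]   # x' = ln(x)
yp = [math.log(v) for v in y]   # y' = ln(y)
n = len(xp)

# Least squares on the transformed scale, equivalent to (13.5).
xp_bar = sum(xp) / n
yp_bar = sum(yp) / n
sxx = sum((u - xp_bar) ** 2 for u in xp)
sxy = sum((u - xp_bar) * (v - yp_bar) for u, v in zip(xp, yp))
b1 = sxy / sxx
b0 = yp_bar - b1 * xp_bar

# Error variance estimate for the transformed model.
sse = sum((v - (b0 + b1 * u)) ** 2 for u, v in zip(xp, yp))
s = math.sqrt(sse / (n - 2))

# Back-transformed parameters of the power model y = alpha * x**beta.
alpha_hat = math.exp(b0)
beta_hat = b1

# 95% CI for the median tool life at x = 500.
x_star = math.log(500)
center = b0 + b1 * x_star
se = s * math.sqrt(1 / n + (x_star - xp_bar) ** 2 / sxx)
t_crit = 2.228  # quoted in the example (n - 2 = 10 df)
ci_log = (center - t_crit * se, center + t_crit * se)
ci_median = (math.exp(ci_log[0]), math.exp(ci_log[1]))
```

Up to rounding, `b1`, `b0`, `s ** 2`, and `ci_median` should agree with the values -5.3996, 35.6684, .081, and (6.876, 9.930) reported in the example.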

In the article “Ethylene Synthesis in Lettuce Seeds: Its Physiological Significance” (Plant Physiology, 1972: 719–722), ethylene content of lettuce seeds (y, in nL/g dry wt) was studied as a function of exposure time (x, in min) to

Example 13.4

Table 13.2 Data for Example 13.3

x y

1 600 2.35 6.39693 .85442 1.12754 3.0881 2 600 2.65 6.39693 .97456 1.12754 3.0881 3 600 3.00 6.39693 1.09861 1.12754 3.0881 4 600 3.60 6.39693 1.28093 1.12754 3.0881 5 500 6.40 6.21461 1.85630 2.11203 8.2650 6 500 7.80 6.21461 2.05412 2.11203 8.2650 7 500 9.80 6.21461 2.28238 2.11203 8.2650 8 500 16.50 6.21461 2.80336 2.11203 8.2650 9 400 21.50 5.99146 3.06805 3.31694 27.5760

10 400 24.50 5.99146 3.19867 3.31694 27.5760 11 400 26.00 5.99146 3.25810 3.31694 27.5760 12 400 33.00 5.99146 3.49651 3.31694 27.5760

ŷ 5 e ŷ rŷryr 5 ln(y)xr 5 ln(x)

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

13.2 Regression with Transformed Variables 535

0.0 20 40 60 80 100

0

100

200

300

400

y

x

(a)

0.0 20 40 60 80 100

e*

x

(b)

2.0

1.0

0.0

1.0

2.0

3.0

Figure 13.5 (a) Scatter plot; (b) residual plot from linear regression for the data in Example 13.4

an ethylene absorbent. Figure 13.5 presents both a scatter plot of the data and a plot of the residuals generated from a linear regression of y on x. Both plots show a strong curved pattern, suggesting that a transformation to achieve linearity is appropriate. In addition, a linear regression gives negative predictions for and .x 5 100

x 5 90

Table 13.3 Data for Example 13.4

x y

2 408 6.01 5.876 353.32 10 274 5.61 5.617 275.12 20 196 5.28 5.294 199.12 30 137 4.92 4.971 144.18 40 90 4.50 4.647 104.31 50 78 4.36 4.324 75.50 60 51 3.93 4.001 54.64 70 40 3.69 3.677 39.55 80 30 3.40 3.354 28.62 90 22 3.09 3.031 20.72

100 15 2.71 2.708 15.00

ŷ� e ŷ rŷryr� ln(y)

The author did not give any argument for a theoretical model, but his plot of $y' = \ln(y)$ versus x shows a strong linear relationship, suggesting that an exponential function will provide a good fit to the data. Table 13.3 shows the data values and other information from a linear regression of $y'$ on x. The estimates of parameters of the linear model are $\hat{\beta}_1 = -.0323$ and $\hat{\beta}_0 = 5.941$, with $r^2 = .995$. The estimated regression function for the exponential model is

$$\hat{\mu}_{Y \cdot x} \approx e^{\hat{\beta}_0} \cdot e^{\hat{\beta}_1 x} = 380.32e^{-.0323x}$$

The predicted values $\hat{y}_i$ can then be obtained by substitution of $x_i$ $(i = 1, \ldots, n)$ into $\hat{\mu}_{Y \cdot x}$ or else by computing $\hat{y}_i = e^{\hat{y}'_i}$, where the $\hat{y}'_i$s are the predictions from the transformed straight-line model. Figure 13.6 presents both a plot of $e'^*$ versus x (the standardized residuals from a linear regression) and a plot of $\hat{y}$ versus y. These plots support the choice of an exponential model.
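The transformed fit in Example 13.4 can be retraced with a few lines of Python. This is only a sketch under stated assumptions: the helper name `simple_linear_fit` is hypothetical, the data are the x and y columns of Table 13.3 as printed, and natural logarithms are used throughout.

```python
import math

# x and y columns of Table 13.3 (Example 13.4), as printed in the text.
x = [2, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
y = [408, 274, 196, 137, 90, 78, 51, 40, 30, 22, 15]

def simple_linear_fit(xs, ys):
    """Ordinary least squares for ys = b0 + b1*xs; returns (b0, b1, r2)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxx = sum((xi - xbar) ** 2 for xi in xs)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(xs, ys))
    syy = sum((yi - ybar) ** 2 for yi in ys)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    r2 = sxy ** 2 / (sxx * syy)
    return b0, b1, r2

# Regress y' = ln(y) on x, then transform back to the exponential scale.
y_log = [math.log(yi) for yi in y]
b0, b1, r2 = simple_linear_fit(x, y_log)
alpha_hat = math.exp(b0)                          # back-transformed multiplier
y_hat = [alpha_hat * math.exp(b1 * xi) for xi in x]

print(f"b0 = {b0:.3f}, b1 = {b1:.4f}, r^2 = {r2:.3f}")
print(f"estimated regression function: {alpha_hat:.2f} * exp({b1:.4f} x)")
```

The printed estimates should agree with the values quoted in the example up to rounding; plotting `y_hat` against `y` gives the kind of display described for Figure 13.6(b).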


In analyzing transformed data, one should keep in mind the following points:

1. Estimating $\beta_1$ and $\beta_0$ as in (13.5) and then transforming back to obtain estimates of the original parameters is not equivalent to using the principle of least squares directly on the original model. Thus, for the exponential model, we could estimate $\alpha$ and $\beta$ by minimizing $\sum (y_i - \alpha e^{\beta x_i})^2$. Iterative computation would be necessary. In general, $\hat{\alpha} \neq e^{\hat{\beta}_0}$ and $\hat{\beta} \neq \hat{\beta}_1$.

2. If the chosen model is not intrinsically linear, the approach summarized in (13.5) cannot be used. Instead, least squares (or some other fitting procedure) would have to be applied to the untransformed model. Thus, for the additive exponential model $Y = \alpha e^{\beta x} + \epsilon$, least squares would involve minimizing $\sum (y_i - \alpha e^{\beta x_i})^2$. Taking partial derivatives with respect to $\alpha$ and $\beta$ results in two nonlinear normal equations in $\alpha$ and $\beta$; these equations must then be solved using an iterative procedure.

3. When the transformed linear model satisfies all the assumptions listed in Chapter 12, the method of least squares yields best estimates of the transformed parameters. However, estimates of the original parameters may not be best in any sense, though they will be reasonable. For example, in the exponential model, the estimator $\hat{\alpha} = e^{\hat{\beta}_0}$ will not be unbiased, though it will be the maximum likelihood estimator of $\alpha$ if the error variable $\epsilon'$ is normally distributed. Using least squares directly (without transforming) could yield better estimates.

4. If a transformation on y has been made and one wishes to use the standard formulas to test hypotheses or construct CIs, $\epsilon'$ should be at least approximately normally distributed. To check this, the residuals from the transformed regression should be examined.

5. When y is transformed, the $r^2$ value from the resulting regression refers to variation in the $y'_i$'s, explained by the transformed regression model. Although a high value of $r^2$ here indicates a good fit of the estimated original nonlinear model to the observed $y_i$'s, $r^2$ does not refer to these original observations. Perhaps the best way to assess the quality of the fit is to compute the predicted values $\hat{y}'_i$ using the transformed model, transform them back to the original y scale to obtain $\hat{y}_i$, and then plot $\hat{y}$ versus y. A good fit is then evidenced by points close to the 45° line. One could compute $SSE = \sum (y_i - \hat{y}_i)^2$ as a numerical measure of the goodness of fit. When the model was linear, we compared this to $SST = \sum (y_i - \bar{y})^2$, the total variation about the horizontal line at height $\bar{y}$; this led to $r^2$. In the nonlinear case, though, it is not necessarily informative to measure total variation in this way, so an $r^2$ value is not as useful as in the linear case.
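Points 1 and 2 say that least squares on the original scale needs iterative computation. The sketch below shows one way this could be done for the data of Example 13.4, using Gauss-Newton steps with step halving. The choice of Gauss-Newton, the function names, and the use of the transformed-fit estimates 380.32 and -.0323 as starting values are all assumptions made for illustration; the text does not prescribe a particular iterative procedure.

```python
import math

# x and y columns of Table 13.3 (Example 13.4).
x = [2, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
y = [408, 274, 196, 137, 90, 78, 51, 40, 30, 22, 15]

def sse(a, b):
    """Error sum of squares on the original y scale for y = a*exp(b*x)."""
    return sum((yi - a * math.exp(b * xi)) ** 2 for xi, yi in zip(x, y))

def gauss_newton(a, b, iterations=50):
    """Reduce sse(a, b) by Gauss-Newton steps, halving any step that raises it."""
    for _ in range(iterations):
        # Residuals and the two columns of the Jacobian of the mean function.
        r = [yi - a * math.exp(b * xi) for xi, yi in zip(x, y)]
        ja = [math.exp(b * xi) for xi in x]             # d(mean)/da
        jb = [a * xi * math.exp(b * xi) for xi in x]    # d(mean)/db
        # Solve the 2x2 linearized normal equations (J'J) d = J'r by Cramer's rule.
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * ri for u, ri in zip(ja, r))
        gb = sum(v * ri for v, ri in zip(jb, r))
        det = saa * sbb - sab * sab
        if det == 0:
            break
        da = (ga * sbb - gb * sab) / det
        db = (gb * saa - ga * sab) / det
        # Halve the step until the SSE does not increase.
        step, current = 1.0, sse(a, b)
        while step > 1e-10 and sse(a + step * da, b + step * db) > current:
            step /= 2
        if step <= 1e-10:
            break
        a, b = a + step * da, b + step * db
    return a, b

# Start from the back-transformed estimates reported in Example 13.4.
a0, b0 = 380.32, -0.0323
a_ls, b_ls = gauss_newton(a0, b0)
print(f"transformed fit: a = {a0}, b = {b0}, SSE = {sse(a0, b0):.1f}")
print(f"direct fit:      a = {a_ls:.2f}, b = {b_ls:.4f}, SSE = {sse(a_ls, b_ls):.1f}")
```

Because each accepted step can only lower the error sum of squares, the direct fit ends with an SSE no larger than that of the back-transformed fit, which illustrates why the two sets of estimates generally differ.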

Figure 13.6 Plot of (a) standardized residuals (after transforming) versus x; (b) $\hat{y}$ versus y for data in Example 13.4 ■


More General Regression Methods

Thus far we have assumed that either $Y = f(x) + \epsilon$ (an additive model) or that $Y = f(x) \cdot \epsilon$ (a multiplicative model). In the case of an additive model, $\mu_{Y \cdot x} = f(x)$, so estimating the regression function f(x) amounts to estimating the curve of mean y values. On occasion, a scatter plot of the data suggests that there is no simple mathematical expression for f(x). Statisticians have recently developed some more flexible methods that permit a wide variety of patterns to be modeled using the same fitting procedure. One such method is LOWESS (or LOESS), short for locally weighted scatter plot smoother. Let $(x^*, y^*)$ denote a particular one of the n (x, y) pairs in the sample. The $\hat{y}$ value corresponding to $(x^*, y^*)$ is obtained by fitting a straight line using only a specified percentage of the data (e.g., 25%) whose x values are closest to $x^*$. Furthermore, rather than use “ordinary” least squares, which gives equal weight to all points, those with x values closer to $x^*$ are more heavily weighted than those whose x values are farther away. The height of the resulting line above $x^*$ is the fitted value $\hat{y}^*$. This process is repeated for each of the n points, so n different lines are fit (you surely wouldn’t want to do all this by hand). Finally, the fitted points are connected to produce a LOWESS curve.
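The LOWESS procedure just described can be sketched in plain Python. Several details here are assumptions rather than anything stated in the text: the tricube weight function (a common choice, but the text only says nearer points get more weight), the function name `lowess`, and the small made-up data set, which is not the bear data.

```python
def lowess(xs, ys, span=0.5):
    """Return LOWESS fitted values, one weighted local line per sample point.

    span is the fraction of the sample used in each local fit. The tricube
    weight function is an assumption; the text does not name one.
    """
    n = len(xs)
    k = max(2, int(round(span * n)))              # points used in each local fit
    fitted = []
    for x_star in xs:
        # Keep the k points whose x values are closest to x_star.
        nearest = sorted(range(n), key=lambda i: abs(xs[i] - x_star))[:k]
        d_max = max(abs(xs[i] - x_star) for i in nearest) or 1.0
        # Tricube weights: closer points count more. d_max is widened slightly
        # so the farthest point in the window keeps a small positive weight.
        w = {i: (1 - (abs(xs[i] - x_star) / (1.001 * d_max)) ** 3) ** 3
             for i in nearest}
        sw = sum(w.values())
        xbar = sum(w[i] * xs[i] for i in nearest) / sw
        ybar = sum(w[i] * ys[i] for i in nearest) / sw
        sxx = sum(w[i] * (xs[i] - xbar) ** 2 for i in nearest)
        sxy = sum(w[i] * (xs[i] - xbar) * (ys[i] - ybar) for i in nearest)
        slope = sxy / sxx if sxx > 0 else 0.0
        # Height of the weighted least squares line above x_star.
        fitted.append(ybar + slope * (x_star - xbar))
    return fitted

# Hypothetical data with a change of slope near x = 6 (for illustration only).
x_demo = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_demo = [2.1, 2.9, 4.2, 5.0, 5.8, 7.1, 10.2, 12.9, 16.1, 19.0]
fitted_demo = lowess(x_demo, y_demo, span=0.5)
for xv, fv in zip(x_demo, fitted_demo):
    print(xv, round(fv, 2))
```

Connecting the printed (x, fitted) pairs in order of x gives the LOWESS curve. A smaller span follows local wiggles more closely, while a larger one gives a smoother curve.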

Example 13.5 Weighing large deceased animals found in wilderness areas is usually not feasible, so it is desirable to have a method for estimating weight from various characteristics of an animal that can be easily determined. Minitab has a stored data set consisting of various characteristics for a sample of $n = 143$ wild bears. Figure 13.7(a) displays a scatter plot of $y = \text{weight}$ versus $x = \text{distance}$ around the chest (chest girth). At first glance, it looks as though a single line obtained from ordinary least squares would effectively summarize the pattern. Figure 13.7(b) shows the LOWESS curve produced by Minitab using a span of 50% [the fit at $(x^*, y^*)$ is determined by the closest 50% of the sample]. The curve appears to consist of two straight line segments joined together above approximately $x = 38$. The steeper line is to the right of 38, indicating that weight tends to increase more rapidly as girth does for girths exceeding 38 in.

Figure 13.7 (a) A Minitab scatter plot for the bear weight data (Weight versus Chest.G)


It is complicated to make other inferences (e.g., obtain a CI for a mean y value) based on this general type of regression model. The bootstrap technique mentioned earlier can be used for this purpose.

Logistic Regression

The simple linear regression model is appropriate for relating a quantitative response variable to a quantitative predictor x. Consider now a dichotomous response variable with possible values 1 and 0 corresponding to success and failure. Let $p = P(S) = P(Y = 1)$. Frequently, the value of p will depend on the value of some quantitative variable x. For example, the probability that a car needs warranty service of a certain kind might well depend on the car’s mileage, or the probability of avoiding an infection of a certain type might depend on the dosage in an inoculation. Instead of using just the symbol p for the success probability, we now use p(x) to emphasize the dependence of this probability on the value of x. The simple linear regression equation $Y = \beta_0 + \beta_1 x + \epsilon$ is no longer appropriate, for taking the mean value on each side of the equation gives

$$\mu_{Y \cdot x} = 1 \cdot p(x) + 0 \cdot (1 - p(x)) = p(x) = \beta_0 + \beta_1 x$$

Whereas p(x) is a probability and therefore must be between 0 and 1, $\beta_0 + \beta_1 x$ need not be in this range.

Instead of letting the mean value of Y be a linear function of x, we now consider a model in which some function of the mean value of Y is a linear function of x. In other words, we allow p(x) to be a function of $\beta_0 + \beta_1 x$ rather than $\beta_0 + \beta_1 x$ itself. A function that has been found quite useful in many applications is the logit function

$$p(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

Figure 13.8 shows a graph of p(x) for particular values of $\beta_0$ and $\beta_1$ with $\beta_1 > 0$. As x increases, the probability of success increases. For $\beta_1$ negative, the success probability would be a decreasing function of x.

Figure 13.7 (b) A Minitab LOWESS curve for the bear weight data ■


Figure 13.8 A graph of a logit function (p(x) versus x)

Logistic regression means assuming that p(x) is related to x by the logit function. Straightforward algebra shows that

$$\frac{p(x)}{1 - p(x)} = e^{\beta_0 + \beta_1 x}$$

The expression on the left-hand side is called the odds. If, for example, $\frac{p(60)}{1 - p(60)} = 3$, then when $x = 60$ a success is three times as likely as a failure. We now see that the logarithm of the odds is a linear function of the predictor. In particular, the slope parameter $\beta_1$ is the change in the log odds associated with a one-unit increase in x. This implies that the odds itself changes by the multiplicative factor $e^{\beta_1}$ when x increases by 1 unit.

Fitting the logistic regression to sample data requires that the parameters $\beta_0$ and $\beta_1$ be estimated. This is usually done using the maximum likelihood technique described in Chapter 6. The details are quite involved, but fortunately the most popular statistical computer packages will do this on request and provide quantitative and pictorial indications of how well the model fits.
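For readers who want the "straightforward algebra" behind the odds expression spelled out, the steps can be written as follows. This is a worked restatement using only the logit function defined in this section, not additional material from the text.

```latex
\begin{align*}
1 - p(x) &= 1 - \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}
          = \frac{1}{1 + e^{\beta_0 + \beta_1 x}} \\[4pt]
\frac{p(x)}{1 - p(x)} &= e^{\beta_0 + \beta_1 x} \\[4pt]
\ln\!\left(\frac{p(x)}{1 - p(x)}\right) &= \beta_0 + \beta_1 x \\[4pt]
\frac{p(x+1)/[1 - p(x+1)]}{p(x)/[1 - p(x)]}
  &= \frac{e^{\beta_0 + \beta_1 (x+1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1}
\end{align*}
```

The third line is the statement that the log odds is linear in x, and the last line is the multiplicative factor $e^{\beta_1}$ by which the odds change per unit increase in x.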

Example 13.6 Here is data, in the form of a comparative stem-and-leaf display, on launch temperature and the incidence of failure of O-rings in 23 space shuttle launches prior to the Challenger disaster of 1986 (Y = yes, failed; N = no, did not fail). Observations on the left side of the display tend to be smaller than those on the right side.

     Y         N
   873 | 5 |
     3 | 6 | 677789         Stem: Tens digit
   500 | 7 | 002356689      Leaf: Ones digit
       | 8 | 1

Figure 13.9 shows Minitab output for a logistic regression analysis and a graph of the estimated logit function from the R software. We have chosen to let p denote the probability of failure. The graph of $\hat{p}$ decreases as temperature increases because failures tended to occur at lower temperatures than did successes. The estimate of $\beta_1$ and its estimated standard deviation are $\hat{\beta}_1 = -.232$ and $s_{\hat{\beta}_1} = .1082$, respectively.


We assume that the sample size n is large enough here so that $\hat{\beta}_1$ has approximately a normal distribution. If $\beta_1 = 0$ (i.e., temperature does not affect the likelihood of O-ring failure), the test statistic $Z = \hat{\beta}_1 / s_{\hat{\beta}_1}$ has approximately a standard normal distribution. The reported value of this ratio is $z = -2.14$, with a corresponding two-tailed P-value of .032 (some packages report a chi-square value which is just $z^2$, with the same P-value). At significance level .05, we reject the null hypothesis of no temperature effect.

Figure 13.9 (a) Logistic regression output from Minitab for Example 13.6; (b) graph of estimated logistic function and classification probabilities from R (Failure versus Temperature, showing the predicted probability of failure with Y = failure and N = no failure markers)

The estimated odds of failure for any particular temperature value x is

$$\frac{p(x)}{1 - p(x)} = e^{15.0429 - .232163x}$$

This implies that the odds ratio, the odds of failure at a temperature of $x + 1$ divided by the odds of failure at a temperature of x, is

$$\frac{p(x + 1)/[1 - p(x + 1)]}{p(x)/[1 - p(x)]} = e^{-.232163} = .7928$$

The interpretation is that for each additional degree of temperature, we estimate that the odds of failure will be multiplied by a factor of .79 (a 21% decrease). A 95% CI for the true odds ratio also appears on output. In addition, Minitab provides three different ways of assessing model lack-of-fit: the Pearson, deviance, and Hosmer-Lemeshow tests. Large P-values are consistent with a good model. These tests are useful in multiple logistic regression, where there is more than one predictor in the model relationship so there is no single graph like that of Figure 13.9(b). Various diagnostic plots are also available.

Binary Logistic Regression: failure versus temp

Logistic Regression Table
                                                    Odds      95% CI
Predictor        Coef    SE Coef       Z       P   Ratio   Lower   Upper
Constant      15.0429    7.37862    2.04   0.041
temp        -0.232163   0.108236   -2.14   0.032    0.79    0.64    0.98

Goodness-of-Fit Tests
Method             Chi-Square   DF       P
Pearson               11.1303   14   0.676
Deviance              11.9974   14   0.607
Hosmer-Lemeshow        9.7119    8   0.286

Classification Summary

Y           0           1
0   1.0000000   0.0000000
1   0.4285714   0.5714286


The R output provides information based on classifying an observation as a failure if the estimated p(x) is at least .5 and as a non-failure otherwise. Since $p(x) = .5$ when $x = 64.80$, three of the seven failures (Ys in the graph) would be misclassified as non-failures (a misclassification proportion of .429), whereas none of the non-failure observations would be misclassified. A better way to assess the likelihood of misclassification is to use cross-validation: Remove the first observation from the sample, estimate the relationship, then classify the first observation based on this estimated relationship, and repeat this process with each of the other sample observations (so a sample observation does not affect its own classification).

The launch temperature for the Challenger mission was only 31°F. This temperature is much lower than any value in the sample, so it is dangerous to extrapolate the estimated relationship. Nevertheless, it appears that O-ring failure is virtually a sure thing for a temperature this low. ■
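The numerical statements in Example 13.6 can be re-derived from the reported Minitab coefficients alone. The sketch below does not refit the model; it simply plugs the printed estimates into the formulas of this section. The function and variable names are hypothetical, and the P-value uses the standard normal approximation described in the example.

```python
import math

# Estimates and standard error as reported in the Minitab output for
# Example 13.6 (p denotes the probability of O-ring failure).
b0_hat = 15.0429
b1_hat = -0.232163
se_b1 = 0.108236

def p_hat(temp):
    """Estimated failure probability from the fitted logit function."""
    t = b0_hat + b1_hat * temp
    return math.exp(t) / (1 + math.exp(t))

def odds(temp):
    """Estimated odds of failure, p(x) / (1 - p(x))."""
    return p_hat(temp) / (1 - p_hat(temp))

z = b1_hat / se_b1                              # test statistic for H0: beta1 = 0
p_value = math.erfc(abs(z) / math.sqrt(2))      # two-tailed standard normal P-value
odds_ratio = math.exp(b1_hat)                   # odds multiplier per extra degree
temp_half = -b0_hat / b1_hat                    # temperature at which p_hat = .5

print(f"z = {z:.2f}, two-tailed P-value = {p_value:.3f}")
print(f"odds ratio = {odds_ratio:.4f}")
print(f"p_hat = .5 at temperature {temp_half:.2f}")
print(f"p_hat(31) = {p_hat(31):.4f}")
```

The value of `p_hat(31)` is an extrapolation far below the observed temperatures, so it should be read with the same caution the example expresses.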

EXERCISES Section 13.2 (15–25)

15. No tortilla chip aficionado likes soggy chips, so it is important to find characteristics of the production process that produce chips with an appealing texture. The following data on $x = \text{frying}$ time (sec) and $y = \text{moisture}$ content (%) appeared in the article “Thermal and Physical Properties of Tortilla Chips as a Function of Frying Time” (J. of Food Processing and Preservation, 1995: 175–189).

x 5 10 15 20 25 30 45 60

y 16.3 9.7 8.1 4.2 3.4 2.9 1.9 1.3

a. Construct a scatter plot of y versus x and comment.

b. Construct a scatter plot of the (ln(x), ln(y)) pairs and comment.

c. What probabilistic relationship between x and y is suggested by the linear pattern in the plot of part (b)?

d. Predict the value of moisture content when frying time is 20, in a way that conveys information about reliability and precision.

e. Analyze the residuals from fitting the simple linear regression model to the transformed data and comment.

16. Polyester fiber ropes are increasingly being used as components of mooring lines for offshore structures in deep water. The authors of the paper “Quantifying the Residual Creep Life of Polyester Mooring Ropes” (Intl. J. of Offshore and Polar Exploration, 2005: 223–228) used the accompanying data as a basis for studying how time to failure (hr) depended on load (% of breaking load):

x 77.7 77.8 77.9 77.8 85.5 85.5

y 5.067 552.056 127.809 7.611 .124 .077

x 89.2 89.3 73.1 85.5 89.2 85.5

y .008 .013 49.439 .503 .362 9.930

x 89.2 85.5 89.2 82.3 82.0 82.3

y .677 5.322 .289 53.079 7.625 155.299

A linear regression of log(time) versus load was fit. The investigators were particularly interested in estimating the slope of the true regression line relating these variables. Investigate the quality of the fit, estimate the slope, and predict time to failure when load is 80, in a way that conveys information about reliability and precision.

17. The following data on mass rate of burning x and flame length y is representative of that which appeared in the article “Some Burning Characteristics of Filter Paper” (Combustion Science and Technology, 1971: 103–120):

x 1.7 2.2 2.3 2.6 2.7 3.0 3.2

y 1.3 1.8 1.6 2.0 2.1 2.2 3.0

x 3.3 4.1 4.3 4.6 5.7 6.1

y 2.6 4.1 3.7 5.0 5.8 5.3

a. Estimate the parameters of a power function model.

b. Construct diagnostic plots to check whether a power function is an appropriate model choice.

c. Test $H_0\colon \beta = 4/3$ versus $H_a\colon \beta < 4/3$, using a level .05 test.

d. Test the null hypothesis that states that the median flame length when burning rate is 5.0 is twice the median flame length when burning rate is 2.5 against the alternative that this is not the case.

18. Failures in aircraft gas turbine engines due to high cycle fatigue are a pervasive problem. The article “Effect of Crystal Orientation on Fatigue Failure of Single Crystal Nickel Base Turbine Blade Superalloys” (J. of Engineering for Gas Turbines and Power, 2002: 161–176) gave the accompanying data and fit a nonlinear regression model in order to predict strain amplitude from cycles to failure. Fit an appropriate model, investigate the quality of the fit, and predict amplitude when cycles to failure = 5000.


Obs   Cycfail   Strampl      Obs   Cycfail   Strampl
 1      1326    .01495       11      7356    .00576
 2      1593    .01470       12      7904    .00580
 3      4414    .01100       13        79    .01212
 4      5673    .01190       14      4175    .00782
 5     29516    .00873       15     34676    .00596
 6        26    .01819       16    114789    .00600
 7       843    .00810       17      2672    .00880
 8      1016    .00801       18      7532    .00883
 9      3410    .00600       19     30220    .00676
10      7101    .00575

19. Thermal endurance tests were performed to study the relationship between temperature and lifetime of polyester enameled wire (“Thermal Endurance of Polyester Enameled Wires Using Twisted Wire Specimens,” IEEE Trans. Insulation, 1965: 38–44), resulting in the following data.

Temp. 200 200 200 200 200 200

Lifetime 5933 5404 4947 4963 3358 3878

Temp. 220 220 220 220 220 220

Lifetime 1561 1494 747 768 609 777

Temp. 240 240 240 240 240 240

Lifetime 258 299 209 144 180 184

a. Does a scatter plot of the data suggest a linear probabilistic relationship between lifetime and temperature?

b. What model is implied by a linear relationship between expected ln(lifetime) and 1/temperature? Does a scatter plot of the transformed data appear consistent with this relationship?

c. Estimate the parameters of the model suggested in part (b). What lifetime would you predict for a temperature of 220?

d. Because there are multiple observations at each x value, the method in Exercise 14 can be used to test the null hypothesis that states that the model suggested in part (b) is correct. Carry out the test at level .01.

20. Exercise 14 presented data on body weight x and metabolic clearance rate/body weight y. Consider the following intrinsically linear functions for specifying the relationship between the two variables: (a) ln(y) versus x, (b) ln(y) versus ln(x), (c) y versus ln(x), (d) y versus 1/x, and (e) ln(y) versus 1/x. Use any appropriate diagnostic plots and analyses to decide which of these functions you would select to specify a probabilistic model. Explain your reasoning.

21. A plot in the article “Thermal Conductivity of Polyethylene: The Effects of Crystal Size, Density, and Orientation on the Thermal Conductivity” (Polymer Engr. and Science, 1972: 204–208) suggests that the expected value of thermal conductivity y is a linear function of $10^4 \cdot 1/x$, where x is lamellar thickness.

x 240 410 460 490 520 590 745 8300

y 12.0 14.7 14.7 15.2 15.2 15.6 16.0 18.1

a. Estimate the parameters of the regression function and the regression function itself.

b. Predict the value of thermal conductivity when lamellar thickness is 500 Å.

22. In each of the following cases, decide whether the given function is intrinsically linear. If so, identify $x'$ and $y'$, and then explain how a random error term $\epsilon$ can be introduced to yield an intrinsically linear probabilistic model.

a. $y = 1/(\alpha + \beta x)$

b. $y = 1/(1 + e^{\alpha + \beta x})$

c. $y = e^{e^{\alpha + \beta x}}$ (a Gompertz curve)

d. $y = \alpha + \beta e^{\lambda x}$

23. Suppose x and y are related according to a probabilistic exponential model $Y = \alpha e^{\beta x} \cdot \epsilon$, with $V(\epsilon)$ a constant independent of x (as was the case in the simple linear model $Y = \beta_0 + \beta_1 x + \epsilon$). Is V(Y) a constant independent of x [as was the case for $Y = \beta_0 + \beta_1 x + \epsilon$, where $V(Y) = \sigma^2$]? Explain your reasoning. Draw a picture of a prototype scatter plot resulting from this model. Answer the same questions for the power model $Y = \alpha x^{\beta} \cdot \epsilon$.

24. Kyphosis refers to severe forward flexion of the spine following corrective spinal surgery. A study carried out to determine risk factors for kyphosis reported the accompanying ages (months) for 40 subjects at the time of the operation; the first 18 subjects did have kyphosis and the remaining 22 did not.

Kyphosis 12 15 42 52 59 73 82 91 96 105 114 120 121 128 130 139 139 157

No kyphosis 1 1 2 8 11 18 22 31 37 61 72 81 97 112 118 127 131 140 151 159 177 206

Use the Minitab logistic regression output on page 543 to decide whether age appears to have a significant impact on the presence of kyphosis.

25. The article “Acceptable Noise Levels for Construction Site Offices” (Building Serv. Engr. Res. Tech., 2009: 87–94) analyzed responses from a sample of 77 individuals, each of whom was asked to say whether a particular noise level (dBA) to which he/she had been exposed was acceptable or unacceptable. Here is data provided by the article’s authors:


Acceptable:

55.3 55.3 55.3 55.9 55.9 55.9 55.9 56.1 56.1 56.1 56.1 56.1 56.1 56.8 56.8 57.0 57.0 57.0 57.8 57.8 57.8 57.9 57.9 57.9 58.8 58.8 58.8 59.8 59.8 59.8 62.2 62.2 65.3 65.3 65.3 65.3 68.7 69.0 73.0 73.0

Unacceptable:

63.8 63.8 63.8 63.9 63.9 63.9 64.7 64.7 64.7 65.1 65.1 65.1 67.4 67.4 67.4 67.4 68.7 68.7 68.7 70.4 70.4 71.2 71.2 73.1 73.1 74.6 74.6 74.6 74.6 79.3 79.3 79.3 79.3 79.3 83.0 83.0 83.0

Interpret the accompanying Minitab logistic regression output, and sketch a graph of the estimated probability of a noise level being acceptable as a function of the level.

Logistic regression table for Exercise 24
                                                                95% CI
Predictor         Coef      StDev       Z       P   Odds Ratio   Lower   Upper
Constant       -0.5727     0.6024   -0.95   0.342
age           0.004296   0.005849    0.73   0.463         1.00    0.99    1.02

Logistic regression table for Exercise 25
                                                                  95% CI
Predictor          Coef     SE Coef       Z       P   Odds Ratio   Lower   Upper
Constant        23.2124     5.05095    4.60   0.000
noise level   -0.359441   0.0785031   -4.58   0.000         0.70    0.60    0.81

13.3 Polynomial Regression

The nonlinear yet intrinsically linear models of Section 13.2 involved functions of the independent variable x that were either strictly increasing or strictly decreasing. In many situations, either theoretical reasoning or else a scatter plot of the data suggests that the true regression function $\mu_{Y \cdot x}$ has one or more peaks or valleys, that is, at least one relative minimum or maximum. In such cases, a polynomial function $y = \beta_0 + \beta_1 x + \cdots + \beta_k x^k$ may provide a satisfactory approximation to the true regression function.

DEFINITION The kth-degree polynomial regression model equation is

$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k + \epsilon \qquad (13.6)$$

where $\epsilon$ is a normally distributed random variable with

$$\mu_{\epsilon} = 0 \qquad \sigma_{\epsilon}^2 = \sigma^2 \qquad (13.7)$$

From (13.6) and (13.7), it follows immediately that

$$\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \cdots + \beta_k x^k \qquad \sigma_{Y \cdot x}^2 = \sigma^2 \qquad (13.8)$$

In words, the expected value of Y is a kth-degree polynomial function of x, whereas the variance of Y, which controls the spread of observed values about the regression function, is the same for each value of x. The observed pairs $(x_1, y_1), \ldots, (x_n, y_n)$ are assumed to have been generated independently from the model (13.6). Figure 13.10 illustrates both a quadratic and cubic model; very rarely in practice is it necessary to go beyond $k = 3$.

Figure 13.10 (a) Quadratic regression model; (b) cubic regression model

Estimating Parameters

To estimate the $\beta$s, consider a trial regression function $y = b_0 + b_1 x + \cdots + b_k x^k$. Then the goodness of fit of this function to the observed data can be assessed by computing the sum of squared deviations

$$f(b_0, b_1, \ldots, b_k) = \sum_{i=1}^{n} [y_i - (b_0 + b_1 x_i + b_2 x_i^2 + \cdots + b_k x_i^k)]^2 \qquad (13.9)$$

According to the principle of least squares, the estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are those values of $b_0, b_1, \ldots, b_k$ that minimize Expression (13.9). It should be noted that when $x_1, x_2, \ldots, x_n$ are all different, there is a polynomial of degree $n - 1$ that fits the data perfectly, so that the minimizing value of (13.9) is 0 when $k = n - 1$. However, in virtually all applications, the polynomial model (13.6) with large k is quite unrealistic.

To find the minimizing values in (13.9), take the $k + 1$ partial derivatives $\partial f/\partial b_0, \partial f/\partial b_1, \ldots, \partial f/\partial b_k$ and equate them to 0. This gives a system of normal equations for the estimates. Because the trial function $b_0 + b_1 x + \cdots + b_k x^k$ is linear in $b_0, \ldots, b_k$ (though not in x), the $k + 1$ normal equations are linear in these unknowns:

$$b_0 n + b_1 \sum x_i + b_2 \sum x_i^2 + \cdots + b_k \sum x_i^k = \sum y_i$$
$$b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i^3 + \cdots + b_k \sum x_i^{k+1} = \sum x_i y_i$$
$$\vdots$$
$$b_0 \sum x_i^k + b_1 \sum x_i^{k+1} + \cdots + b_k \sum x_i^{2k} = \sum x_i^k y_i \qquad (13.10)$$

All standard statistical computer packages will automatically solve the equations in (13.10) and provide the estimates as well as much other information.*

* We will see in Section 13.4 that polynomial regression is a special case of multiple regression, so a command appropriate for this latter task is generally used.

Example 13.7 The article “Residual Stresses and Adhesion of Thermal Spray Coatings” (Surface Engineering, 2005: 35–40) considered the relationship between the thickness (mm) of NiCrAl coatings deposited on stainless steel substrate and corresponding bond strength (MPa). The following data was read from a plot in the paper:

Thickness 220 220 220 220 370 370 370 370 440 440

Strength 24.0 22.0 19.1 15.5 26.3 24.6 23.1 21.2 25.2 24.0

Thickness 440 440 680 680 680 680 860 860 860 860

Strength 21.7 19.2 17.0 14.9 13.0 11.8 12.2 11.2 6.6 2.8


The scatter plot in Figure 13.11(a) supports the choice of the quadratic regression model. Figure 13.11(b) contains Minitab output from a fit of this model. The estimated regression coefficients are

$$\hat{\beta}_0 = 14.521 \qquad \hat{\beta}_1 = .04323 \qquad \hat{\beta}_2 = -.00006001$$

from which the estimated regression function is

$$y = 14.521 + .04323x - .00006001x^2$$

Substitution of the successive x values 220, 220, . . . , 860, and 860 into this function gives the predicted values $\hat{y}_1 = 21.128, \ldots, \hat{y}_{20} = 7.321$, and the residuals $y_1 - \hat{y}_1 = 2.872, \ldots, y_{20} - \hat{y}_{20} = -4.521$ result from subtraction. Figure 13.12 shows a plot of the standardized residuals versus $\hat{y}$ and also a normal probability plot of the standardized residuals, both of which validate the quadratic model.
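To make the normal equations (13.10) concrete, the following sketch builds and solves them for the quadratic fit of Example 13.7 using only the Python standard library. The function name `polynomial_fit` and the use of Gaussian elimination with partial pivoting are illustrative assumptions; statistical packages typically rely on more numerically careful methods.

```python
# Thickness (x) and bond strength (y) data from Example 13.7.
x = [220] * 4 + [370] * 4 + [440] * 4 + [680] * 4 + [860] * 4
y = [24.0, 22.0, 19.1, 15.5, 26.3, 24.6, 23.1, 21.2, 25.2, 24.0,
     21.7, 19.2, 17.0, 14.9, 13.0, 11.8, 12.2, 11.2, 6.6, 2.8]

def polynomial_fit(xs, ys, k):
    """Least squares estimates for a kth-degree polynomial via (13.10)."""
    # Power sums: s[m] = sum of x_i^m, t[m] = sum of x_i^m * y_i.
    s = [sum(xi ** m for xi in xs) for m in range(2 * k + 1)]
    t = [sum(xi ** m * yi for xi, yi in zip(xs, ys)) for m in range(k + 1)]
    # Augmented matrix of the k + 1 normal equations; row j reads
    # b0*s[j] + b1*s[j+1] + ... + bk*s[j+k] = t[j].
    a = [[float(s[j + m]) for m in range(k + 1)] + [t[j]] for j in range(k + 1)]
    # Gaussian elimination with partial pivoting.
    for col in range(k + 1):
        pivot = max(range(col, k + 1), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k + 1):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 2):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    b = [0.0] * (k + 1)
    for j in range(k, -1, -1):
        known = sum(a[j][m] * b[m] for m in range(j + 1, k + 1))
        b[j] = (a[j][k + 1] - known) / a[j][j]
    return b

b = polynomial_fit(x, y, 2)
y_hat = [b[0] + b[1] * xi + b[2] * xi ** 2 for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
print("estimates:", [f"{bj:.6g}" for bj in b])
print(f"SSE = {sse:.2f}")
```

The estimates and error sum of squares should match the Minitab values quoted in the example up to rounding. Raw powers of x as large as 860 make the system poorly scaled, which is one reason rescaling or centering x is often advisable for higher-degree fits.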

Figure 13.11 Scatter plot of data from Example 13.7 (Strength versus Thickness) and Minitab output from fit of quadratic model

The regression equation is
strength = 14.5 + 0.0432 thickness - 0.000060 thicksqd

Predictor           Coef      SE Coef       T       P
Constant          14.521        4.754    3.05   0.007
thickness        0.04323      0.01981    2.18   0.043
thicksqd     -0.00006001   0.00001786   -3.36   0.004

S = 3.26937   R-Sq = 78.0%   R-Sq(adj) = 75.4%

Analysis of Variance

Source           DF       SS       MS       F       P
Regression        2   643.29   321.65   30.09   0.000
Residual Error   17   181.71    10.69
Total            19   825.00

Predicted Values for New Observations

New Obs      Fit   SE Fit   95% CI             95% PI
1         21.136    1.167   (18.674, 23.598)   (13.812, 28.460)
2         10.704    1.189   ( 8.195, 13.212)   ( 3.364, 18.043)

Values of Predictors for New Observations

New Obs   thickness   thicksqd
1               500     250000
2               800     640000


[Figure 13.12 panels: normal probability plot of the standardized residuals (percent versus standardized residual) and plot of the standardized residuals versus the fitted values]

Figure 13.12 Diagnostic plots for quadratic model fit to data of Example 13.7

$\hat{\sigma}^2$ and $R^2$

To make further inferences, the error variance $\sigma^2$ must be estimated. With $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i + \cdots + \hat{\beta}_k x_i^k$, the ith residual is $y_i - \hat{y}_i$, and the sum of squared residuals (error sum of squares) is $SSE = \sum (y_i - \hat{y}_i)^2$. The estimate of $\sigma^2$ is then

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k + 1)} = MSE \tag{13.11}$$

where the denominator $n - (k + 1)$ is used because $k + 1$ df are lost in estimating $\beta_0, \beta_1, \ldots, \beta_k$.

If we again let $SST = \sum (y_i - \bar{y})^2$, then SSE/SST is the proportion of the total variation in the observed $y_i$'s that is not explained by the polynomial model. The quantity $1 - SSE/SST$, the proportion of variation explained by the model, is called the coefficient of multiple determination and is denoted by $R^2$.

Consider fitting a cubic model to the data in Example 13.7. Because this model includes the quadratic as a special case, the fit will be at least as good as the fit to a quadratic. More generally, with $SSE_k$ = the error sum of squares from a kth-degree polynomial, $SSE_{k'} \le SSE_k$ and $R^2_{k'} \ge R^2_k$ whenever $k' > k$. Because the objective of regression analysis is to find a model that is both simple (relatively few parameters) and provides a good fit to the data, a higher-degree polynomial may not specify a better model than a lower-degree model despite its higher $R^2$ value. To balance the cost of using more parameters against the gain in $R^2$, many statisticians use the adjusted coefficient of multiple determination

$$\text{adjusted } R^2 = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{SSE}{SST} = \frac{(n - 1)R^2 - k}{n - 1 - k} \tag{13.12}$$

Adjusted $R^2$ adjusts the proportion of unexplained variation upward [since the ratio $(n - 1)/(n - k - 1)$ exceeds 1], which results in adjusted $R^2 < R^2$. For example, if $R_2^2 = .66$, $R_3^2 = .70$, and $n = 10$, then

$$\text{adjusted } R_2^2 = \frac{9(.66) - 2}{10 - 3} = .563 \qquad \text{adjusted } R_3^2 = \frac{9(.70) - 3}{10 - 4} = .550$$

Thus the small gain in $R^2$ in going from a quadratic to a cubic model is not enough to offset the cost of adding an extra parameter to the model.
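The quantities in (13.11) and (13.12) are simple enough to compute directly. The following Python sketch (the function names are hypothetical, chosen only for illustration) reproduces the quadratic-versus-cubic comparison above and then applies the same formulas to the ANOVA sums of squares shown in Figure 13.11(b).

```python
def error_variance(sse, n, k):
    """s^2 = SSE / (n - (k + 1)) for a kth-degree polynomial fit, as in (13.11)."""
    return sse / (n - (k + 1))


def r_squared(sse, sst):
    """Coefficient of multiple determination, 1 - SSE/SST."""
    return 1 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 in the form ((n - 1)R^2 - k) / (n - 1 - k), as in (13.12)."""
    return ((n - 1) * r2 - k) / (n - 1 - k)


# Quadratic (k = 2) versus cubic (k = 3) with n = 10 observations.
adj_quadratic = adjusted_r_squared(0.66, n=10, k=2)
adj_cubic = adjusted_r_squared(0.70, n=10, k=3)
print(round(adj_quadratic, 3), round(adj_cubic, 3))

# Bond strength data: SSE = 181.71, SST = 825.00, n = 20, k = 2.
r2_bond = r_squared(181.71, 825.00)
adj_bond = adjusted_r_squared(r2_bond, n=20, k=2)
s2_bond = error_variance(181.71, n=20, k=2)
print(round(r2_bond, 3), round(adj_bond, 3), round(s2_bond, 2))
```

The cubic model has the larger $R^2$ but the smaller adjusted $R^2$, which is the comparison the adjustment is designed to expose.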

Example 13.8 (Example 13.7 continued)

SSE and SST are typically found on computer output in an ANOVA table. Figure 13.11(b) gives SSE = 181.71 and SST = 825.00 for the bond strength data, from which $R^2 = 1 - 181.71/825.00 = .780$ (alternatively, $R^2 = SSR/SST = 643.29/825.00 = .780$). Thus 78.0% of the observed variation in bond strength can be attributed to the model relationship. Adjusted $R^2 = .754$, only a small downward change in $R^2$. The estimates of $\sigma^2$ and $\sigma$ are

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n - (k + 1)} = \frac{181.71}{20 - (2 + 1)} = 10.69 \qquad \hat{\sigma} = s = 3.27$$

Besides computing $R^2$ and adjusted $R^2$, one should examine the usual diagnostic plots to determine whether model assumptions are valid or whether modification may be appropriate (see Figure 13.12). There is also a formal test of model utility, an F test based on the ANOVA sums of squares. Since polynomial regression is a special case of multiple regression, we defer discussion of this test to the next section.

Statistical Intervals and Test Procedures

Because the $y_i$'s appear in the normal equations (13.10) only on the right-hand side and in a linear fashion, the resulting estimates $\hat{\beta}_0, \ldots, \hat{\beta}_k$ are themselves linear functions of the $y_i$'s. Thus the estimators are linear functions of the $Y_i$'s, so each $\hat{\beta}_i$ has a normal distribution. It can also be shown that each $\hat{\beta}_i$ is an unbiased estimator of $\beta_i$.

Let $\sigma_{\hat{\beta}_i}$ denote the standard deviation of the estimator $\hat{\beta}_i$. This standard deviation has the form

$$\sigma_{\hat{\beta}_i} = \sigma \cdot \{\text{a complicated expression involving all } x_j\text{'s}, x_j^2\text{'s}, \ldots, \text{and } x_j^k\text{'s}\}$$

Fortunately, the expression in braces has been programmed into all of the most frequently used statistical software packages. The estimated standard deviation of $\hat{\beta}_i$ results from substituting s in place of $\sigma$ in the expression for $\sigma_{\hat{\beta}_i}$. These estimated standard deviations $s_{\hat{\beta}_0}, s_{\hat{\beta}_1}, \ldots$, and $s_{\hat{\beta}_k}$ appear on output from all the aforementioned statistical packages. Let $S_{\hat{\beta}_i}$ denote the estimator of $\sigma_{\hat{\beta}_i}$, that is, the random variable whose observed value is $s_{\hat{\beta}_i}$. Then it can be shown that the standardized variable

$$T = \frac{\hat{\beta}_i - \beta_i}{S_{\hat{\beta}_i}} \tag{13.13}$$

has a t distribution based on $n - (k + 1)$ df. This leads to the following inferential procedures.

A $100(1 - \alpha)\%$ CI for $\beta_i$, the coefficient of $x^i$ in the polynomial regression function, is

$$\hat{\beta}_i \pm t_{\alpha/2,\,n-(k+1)} \cdot s_{\hat{\beta}_i}$$

A test of $H_0\colon \beta_i = \beta_{i0}$ is based on the t statistic value

$$t = \frac{\hat{\beta}_i - \beta_{i0}}{s_{\hat{\beta}_i}}$$

The test is based on $n - (k + 1)$ df and is upper-, lower-, or two-tailed according to whether the inequality in $H_a$ is $>$, $<$, or $\ne$.
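As a rough illustration of these two procedures, the sketch below applies them to the quadratic coefficient reported in Figure 13.11(b). The function names are hypothetical, and the t critical value $t_{.025,17} = 2.110$ is supplied by hand from a t table rather than computed, so that the example needs nothing beyond the Python standard library.

```python
def coefficient_t_statistic(estimate, std_error, null_value=0.0):
    """t = (estimate - null value) / estimated standard deviation of the estimator."""
    return (estimate - null_value) / std_error


def coefficient_ci(estimate, std_error, t_crit):
    """Two-sided CI: estimate +/- t critical value * estimated standard deviation."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


b2_hat = -0.00006001   # Coef for thicksqd in Figure 13.11(b)
se_b2 = 0.00001786     # SE Coef for thicksqd
t_crit = 2.110         # t_{.025,17}, read from a t table (n - (k + 1) = 17 df)

t_value = coefficient_t_statistic(b2_hat, se_b2)
reject_two_tailed = abs(t_value) >= t_crit
lower, upper = coefficient_ci(b2_hat, se_b2, t_crit)
print(round(t_value, 2), reject_two_tailed, lower, upper)
```

Because the resulting 95% interval lies entirely below zero, the interval and the two-tailed level .05 test lead to the same conclusion about the quadratic predictor.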

A point estimate of $\mu_{Y \cdot x}$, that is, of $\beta_0 + \beta_1 x + \cdots + \beta_k x^k$, is $\hat{\mu}_{Y \cdot x} = \hat{\beta}_0 + \hat{\beta}_1 x + \cdots + \hat{\beta}_k x^k$. The estimated standard deviation of the corresponding estimator is rather complicated. Many computer packages will give this estimated standard deviation for any x value upon request. This, along with an appropriate standardized t variable, can be used to justify the following procedures.

Let x* denote a specified value of x. A $100(1 - \alpha)\%$ CI for $\mu_{Y \cdot x^*}$ is

$$\hat{\mu}_{Y \cdot x^*} \pm t_{\alpha/2,\,n-(k+1)} \cdot \{\text{estimated SD of } \hat{\mu}_{Y \cdot x^*}\}$$

With $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x^* + \cdots + \hat{\beta}_k (x^*)^k$, $\hat{y}$ denoting the calculated value of $\hat{Y}$ for the given data, and $s_{\hat{Y}}$ denoting the estimated standard deviation of the statistic $\hat{Y}$, the formula for the CI is much like the one in the case of simple linear regression:

$$\hat{y} \pm t_{\alpha/2,\,n-(k+1)} \cdot s_{\hat{Y}}$$

A $100(1 - \alpha)\%$ PI for a future y value to be observed when $x = x^*$ is

$$\hat{\mu}_{Y \cdot x^*} \pm t_{\alpha/2,\,n-(k+1)} \cdot \left\{ s^2 + (\text{estimated SD of } \hat{\mu}_{Y \cdot x^*})^2 \right\}^{1/2} = \hat{y} \pm t_{\alpha/2,\,n-(k+1)} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$$
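The two interval formulas differ only in the quantity under the square root, which a short Python sketch makes concrete. The function names are hypothetical; the inputs are the Fit, SE Fit, and S values that Minitab reports in Figure 13.11(b) for thickness = 500, and the critical value $t_{.025,17} = 2.110$ is taken from a t table.

```python
import math


def mean_response_ci(y_hat, se_fit, t_crit):
    """CI for the mean response: y_hat +/- t * s_Yhat."""
    half_width = t_crit * se_fit
    return y_hat - half_width, y_hat + half_width


def prediction_interval(y_hat, se_fit, s, t_crit):
    """PI for a single future y: y_hat +/- t * sqrt(s^2 + s_Yhat^2)."""
    half_width = t_crit * math.sqrt(s ** 2 + se_fit ** 2)
    return y_hat - half_width, y_hat + half_width


y_hat, se_fit, s, t_crit = 21.136, 1.167, 3.26937, 2.110

ci = mean_response_ci(y_hat, se_fit, t_crit)
pi = prediction_interval(y_hat, se_fit, s, t_crit)
print([round(v, 2) for v in ci], [round(v, 2) for v in pi])
```

The extra $s^2$ term accounts for the variability of a single observation about its mean, so the PI is always wider than the CI at the same x*.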

Example 13.9 (Example 13.8 continued)

Figure 13.11(b) shows that $\hat{\beta}_2 = -.00006001$ and $s_{\hat{\beta}_2} = .00001786$ (from the SE Coef column at the top of the output). The null hypothesis $H_0\colon \beta_2 = 0$ says that as long as the linear predictor x is retained in the model, the quadratic predictor $x^2$ provides no additional useful information. The relevant alternative is $H_a\colon \beta_2 \ne 0$, and the test statistic is $T = \hat{\beta}_2 / S_{\hat{\beta}_2}$, with computed value $-3.36$. The test is based on $n - (k + 1) = 17$ df. At significance level .05, the null hypothesis is rejected because $-3.36 \le -2.110 = -t_{.025,17}$. Inclusion of the quadratic predictor is justified. The same conclusion results from comparing the reported P-value .004 to the chosen significance level .05.

The output in Figure 13.11(b) also contains estimation and prediction information both for $x = 500$ and for $x = 800$. In particular, for $x = 500$,

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1(500) + \hat{\beta}_2(500)^2 = \text{Fit} = 21.136$$

$$s_{\hat{Y}} = \text{estimated SD of } \hat{Y} = \text{SE Fit} = 1.167$$

from which a 95% CI for mean strength when thickness = 500 is $21.136 \pm (2.110)(1.167) = (18.67, 23.60)$. A 95% PI for the strength resulting from a single bond when thickness = 500 is $21.136 \pm (2.110)[(3.27)^2 + (1.167)^2]^{1/2} = (13.81, 28.46)$. As before, the PI is substantially wider than the CI because s is large compared to SE Fit. ■

Centering x Values

For the quadratic model with regression function $\mu_{Y \cdot x} = \beta_0 + \beta_1 x + \beta_2 x^2$, the parameters $\beta_0$, $\beta_1$, and $\beta_2$ characterize the behavior of the function near $x = 0$. For example, $\beta_0$ is the height at which the regression function crosses the vertical axis $x = 0$, whereas $\beta_1$ is the first derivative of the function at $x = 0$ (instantaneous rate of change of $\mu_{Y \cdot x}$ at $x = 0$). If the $x_i$'s all lie far from 0, we may not have precise information about the values of these parameters. Let $\bar{x}$ = the average of the $x_i$'s for which observations are to be taken, and consider the model

$$Y = \beta_0^* + \beta_1^*(x - \bar{x}) + \beta_2^*(x - \bar{x})^2 + \epsilon \tag{13.14}$$


In the model (13.14), $\mu_{Y \cdot x} = \beta_0^* + \beta_1^*(x - \bar{x}) + \beta_2^*(x - \bar{x})^2$, and the parameters now describe the behavior of the regression function near the center $\bar{x}$ of the data.

To estimate the parameters of (13.14), we simply subtract $\bar{x}$ from each $x_i$ to obtain $x_i' = x_i - \bar{x}$ and then use the $x_i'$'s in place of the $x_i$'s. An important benefit of this is that the coefficients of $\beta_0, \ldots, \beta_k$ in the normal equations (13.10) will be of much smaller magnitude than would be the case were the original $x_i$'s used. When the system is solved by computer, this centering protects against any round-off error that may result.
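To see how much centering shrinks the numbers involved, recall that for a polynomial fit the coefficients multiplying the unknown parameters in the normal equations are sums of powers of the $x_i$'s (up to the fourth power for a quadratic). The sketch below compares those sums for raw and centered x values; it borrows the eight cure temperatures of Example 13.10 purely for illustration, and the helper name `power_sums` is hypothetical.

```python
def power_sums(xs, max_power):
    """Return [sum of x^0, sum of x^1, ..., sum of x^max_power]."""
    return [sum(x ** p for x in xs) for p in range(max_power + 1)]


# Cure temperatures (deg F) from Example 13.10.
x = [280, 284, 292, 295, 298, 305, 308, 315]
x_bar = sum(x) / len(x)
x_centered = [xi - x_bar for xi in x]

raw_sums = power_sums(x, 4)            # entries used by the uncentered quadratic fit
centered_sums = power_sums(x_centered, 4)

for p, (raw, cen) in enumerate(zip(raw_sums, centered_sums)):
    print(f"sum of x^{p}: raw = {raw:.4g}, centered = {cen:.4g}")
```

With the raw temperatures the largest entry is on the order of $10^{10}$, whereas after centering it is on the order of $10^5$, so the linear system is far less sensitive to round-off.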

Example 13.10

The article “A Method for Improving the Accuracy of Polynomial Regression Analysis” (J. of Quality Tech., 1971: 149–155) reports the following data on x = cure temperature (°F) and y = ultimate shear strength of a rubber compound (psi), with $\bar{x} = 297.13$:

x     280      284      292     295     298    305    308     315
x'    -17.13   -13.13   -5.13   -2.13   .87    7.87   10.87   17.87
y     770      800      840     810     735    640    590     560

A computer analysis yielded the results shown in Table 13.4.

Table 13.4 Estimated Coefficients and Standard Deviations for Example 13.10

Parameter   Estimate      Estimated SD     Parameter   Estimate   Estimated SD
β0          -26,219.64    11,912.78        β0*         759.36     23.20
β1          189.21        80.25            β1*         -7.61      1.43
β2          -.3312        .1350            β2*         -.3312     .1350

The estimated regression function using the original model is $y = -26{,}219.64 + 189.21x - .3312x^2$, whereas for the centered model the function is $y = 759.36 - 7.61(x - 297.13) - .3312(x - 297.13)^2$. These estimated functions are identical; the only difference is that different parameters have been estimated for the two models. The estimated standard deviations indicate clearly that $\beta_0^*$ and $\beta_1^*$ have been more accurately estimated than $\beta_0$ and $\beta_1$. The quadratic parameters are identical $(\beta_2 = \beta_2^*)$, as can be seen by comparing the $x^2$ term in (13.14) with the original model. We emphasize again that a major benefit of centering is the gain in computational accuracy, not only in quadratic but also in higher-degree models. ■

The book by Neter et al., listed in the chapter bibliography, is a good source for more information about polynomial regression.
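The claim in Example 13.10 that the centered and uncentered estimated functions are identical can be checked by expanding the centered quadratic, which gives $\beta_2 = \beta_2^*$, $\beta_1 = \beta_1^* - 2\beta_2^*\bar{x}$, and $\beta_0 = \beta_0^* - \beta_1^*\bar{x} + \beta_2^*\bar{x}^2$. The sketch below applies this expansion to the centered estimates of Table 13.4; the function name `uncenter_quadratic` is hypothetical, and because the tabled estimates and $\bar{x}$ are rounded, the recovered intercept matches the tabled value only approximately.

```python
def uncenter_quadratic(b0_star, b1_star, b2_star, x_bar):
    """Expand b0* + b1*(x - x_bar) + b2*(x - x_bar)^2 into b0 + b1*x + b2*x^2."""
    b2 = b2_star
    b1 = b1_star - 2 * b2_star * x_bar
    b0 = b0_star - b1_star * x_bar + b2_star * x_bar ** 2
    return b0, b1, b2


x_bar = 297.13
b0, b1, b2 = uncenter_quadratic(759.36, -7.61, -0.3312, x_bar)
print(round(b0, 2), round(b1, 2), b2)

# Both forms give the same predicted shear strength at any temperature, e.g. x = 300.
x = 300
centered_value = 759.36 - 7.61 * (x - x_bar) - 0.3312 * (x - x_bar) ** 2
uncentered_value = b0 + b1 * x + b2 * x ** 2
print(round(centered_value, 2), round(uncentered_value, 2))
```

The intercept of the expanded form is a difference of terms in the tens of thousands, which is one way to see why $\beta_0$ is estimated so much less precisely than $\beta_0^*$.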

EXERCISES Section 13.3 (26–35)

26. The article “Physical Properties of Cumin Seed” (J. of Agric. Engr. Res., 1996: 93–98) considered a quadratic regression of y = bulk density on x = moisture content. Data from a graph in the article follows, along with Minitab output from the quadratic fit.

The regression equation is
bulkdens = 403 + 16.2 moiscont - 0.706 contsqd

Predictor    Coef      StDev    T       P
Constant     403.24    36.45    11.06   0.002
moiscont     16.164    5.451    2.97    0.059
contsqd      -0.7063   0.1852   -3.81   0.032

S = 10.15   R-Sq = 93.8%   R-Sq(adj) = 89.6%

Analysis of Variance

Source            DF   SS       MS       F       P
Regression        2    4637.7   2318.9   22.51   0.016
Residual Error    3    309.1    103.0
Total             5    4946.8

Obs   moiscont   bulkdens   Fit      StDev Fit   Residual   St Resid
1     7.0        479.00     481.78   9.35        -2.78      -0.70
2     10.3       503.00     494.79   5.78        8.21       0.98
3     13.7       487.00     492.12   6.49        -5.12      -0.66
4     16.6       470.00     476.93   6.10        -6.93      -0.85
5     19.8       458.00     446.39   5.69        11.61      1.38
6     22.0       412.00     416.99   8.75        -4.99      -0.97

Fit      StDev Fit   95.0% CI            95.0% PI
491.10   6.52        (470.36, 511.83)    (452.71, 529.48)

a. Does a scatter plot of the data appear consistent with the quadratic regression model?

b. What proportion of observed variation in density can be attributed to the model relationship?

c. Calculate a 95% CI for true average density when moisture content is 13.7.

d. The last line of output is from a request for estimation and prediction information when moisture content is 14. Calculate a 99% PI for density when moisture content is 14.

e. Does the quadratic predictor appear to provide useful information? Test the appropriate hypotheses at significance level .05.

27. The following data on y = glucose concentration (g/L) and x = fermentation time (days) for a particular blend of malt liquor was read from a scatter plot in the article “Improving Fermentation Productivity with Reverse Osmosis” (Food Tech., 1984: 92–96):

x   1    2    3    4    5    6    7    8
y   74   54   52   51   52   53   58   71

a. Verify that a scatter plot of the data is consistent with the choice of a quadratic regression model.

b. The estimated quadratic regression equation is $y = 84.482 - 15.875x + 1.7679x^2$. Predict the value of glucose concentration for a fermentation time of 6 days, and compute the corresponding residual.

c. Using SSE = 61.77, what proportion of observed variation can be attributed to the quadratic regression relationship?

d. The n = 8 standardized residuals based on the quadratic model are 1.91, −1.95, −.25, .58, .90, .04, −.66, and .20. Construct a plot of the standardized residuals versus x and a normal probability plot. Do the plots exhibit any troublesome features?

e. The estimated standard deviation of $\hat{\mu}_{Y \cdot 6}$, that is, of $\hat{\beta}_0 + \hat{\beta}_1(6) + \hat{\beta}_2(36)$, is 1.69. Compute a 95% CI for $\mu_{Y \cdot 6}$.

f. Compute a 95% PI for a glucose concentration observation made after 6 days of fermentation time.

28. The viscosity (y) of an oil was measured by a cone and plate viscometer at six different cone speeds (x). It was assumed that a quadratic regression model was appropriate, and the estimated regression function resulting from the n = 6 observations was

$$y = -113.0937 + 3.3684x - .01780x^2$$

a. Estimate $\mu_{Y \cdot 75}$, the expected viscosity when speed is 75 rpm.

b. What viscosity would you predict for a cone speed of 60 rpm?

c. If $\sum y_i^2 = 8386.43$, $\sum y_i = 210.70$, $\sum x_i y_i = 17{,}002.00$, and $\sum x_i^2 y_i = 1{,}419{,}780$, compute SSE $[= \sum y_i^2 - \hat{\beta}_0 \sum y_i - \hat{\beta}_1 \sum x_i y_i - \hat{\beta}_2 \sum x_i^2 y_i]$ and s.

d. From part (c), $SST = 8386.43 - (210.70)^2/6 = 987.35$. Using SSE computed in part (c), what is the computed value of $R^2$?

e. If the estimated standard deviation of $\hat{\beta}_2$ is $s_{\hat{\beta}_2} = .00226$, test $H_0\colon \beta_2 = 0$ versus $H_a\colon \beta_2 \ne 0$ at level .01, and interpret the result.

29. High-alumina refractory castables have been extensively investigated in recent years because of their significant advantages over other refractory brick of the same class: lower production and application costs, versatility, and performance at high temperatures. The accompanying data on x = viscosity (MPa · s) and y = free-flow (%) was read from a graph in the article “Processing of Zero-Cement Self-Flow Alumina Castables” (The Amer. Ceramic Soc. Bull., 1998: 60–66):

x   351   367   373   400   402   456   484
y   81    83    79    75    70    43    22

The authors of the cited paper related these two variables using a quadratic regression model. The estimated regression function is $y = -295.96 + 2.1885x - .0031662x^2$.

a. Compute the predicted values and residuals, and then SSE and $s^2$.

b. Compute and interpret the coefficient of multiple determination.

c. The estimated SD of $\hat{\beta}_2$ is $s_{\hat{\beta}_2} = .0004835$. Does the quadratic predictor belong in the regression model?

d. The estimated SD of $\hat{\beta}_1$ is .4050. Use this and the information in (c) to obtain joint CIs for the linear and quadratic regression coefficients with a joint confidence level of (at least) 95%.

e. The estimated SD of $\hat{\mu}_{Y \cdot 400}$ is 1.198. Calculate a 95% CI for true average free-flow when viscosity = 400 and also a 95% PI for free-flow resulting from a single observation made when viscosity = 400, and compare the intervals.

30. The accompanying data was extracted from the article “Effects of Cold and Warm Temperatures on Springback of Aluminum-Magnesium Alloy 5083-H111” (J. of Engr. Manuf., 2009: 427–431). The response variable is yield strength (MPa), and the predictor is temperature (°C).

x   -50    25      100     200     300
y   91.0   120.5   136.0   133.1   120.8

Here is Minitab output from fitting the quadratic regression model (a graph in the cited paper suggests that the authors did this):

Predictor   Coef         SE Coef     T       P
Constant    111.277      2.100       52.98   0.000
temp        0.32845      0.03303     9.94    0.010
tempsqd     -0.0010050   0.0001213   -8.29   0.014

S = 3.44398   R-Sq = 98.1%   R-Sq(adj) = 96.3%

Analysis of Variance

Source            DF   SS        MS       F       P
Regression        2    1245.39   622.69   52.50   0.019
Residual Error    2    23.72     11.86
Total             4    1269.11

a. What proportion of observed variation in strength can be attributed to the model relationship?

b. Carry out a test of hypotheses at significance level .05 to decide if the quadratic predictor provides useful information over and above that provided by the linear predictor.

c. For a temperature value of 100, $\hat{y} = 134.07$, $s_{\hat{Y}} = 2.38$. Estimate true average strength when temperature is 100, in a way that conveys information about precision and reliability.

d. Use the information in (c) to predict strength for a single observation to be made when temperature is 100, and do so in a way that conveys information about precision and reliability. Then compare this prediction to the estimate obtained in (c).

31. The accompanying data on y = energy output (W) and x = temperature difference (°K) was provided by the authors of the article “Comparison of Energy and Exergy Efficiency for Solar Box and Parabolic Cookers” (J. of Energy Engr., 2007: 53–62).

x   23.20   23.50   23.52   24.30   25.10   26.20   27.40   28.10   29.30   30.60   31.50   32.01
y   3.78    4.12    4.24    5.35    5.87    6.02    6.12    6.41    6.62    6.43    6.13    5.92

x   32.63   33.23   33.62   34.18   35.43   35.62   36.16   36.23   36.89   37.90   39.10   41.66
y   5.64    5.45    5.21    4.98    4.65    4.50    4.34    4.03    3.92    3.65    3.02    2.89

The article’s authors fit a cubic regression model to the data. Here is Minitab output from such a fit.

The regression equation is
y = -134 + 12.7 x - 0.377 x**2 + 0.00359 x**3

Predictor   Coef        SE Coef     T        P
Constant    -133.787    8.048       -16.62   0.000
x           12.7423     0.7750      16.44    0.000
x**2        -0.37652    0.02444     -15.41   0.000
x**3        0.0035861   0.0002529   14.18    0.000

S = 0.168354   R-Sq = 98.0%   R-Sq(adj) = 97.7%

Analysis of Variance

Source            DF   SS        MS       F        P
Regression        3    27.9744   9.3248   329.00   0.000
Residual Error    20   0.5669    0.0283
Total             23   28.5413

a. What proportion of observed variation in energy output can be attributed to the model relationship?

b. Fitting a quadratic model to the data results in $R^2 = .780$. Calculate adjusted $R^2$ for this model and compare to adjusted $R^2$ for the cubic model.

c. Does the cubic predictor appear to provide useful information about y over and above that provided by the linear and quadratic predictors? State and test the appropriate hypotheses.

d. When $x = 30$, $s_{\hat{Y}} = .0611$. Obtain a 95% CI for true average energy output in this case, and also a 95% PI for a single energy output to be observed when temperature difference is 30. Hint: $s_{\hat{Y}} = .0611$.

e. Interpret the hypotheses $H_0\colon \mu_{Y \cdot 35} = 5$ versus $H_a\colon \mu_{Y \cdot 35} \ne 5$, and then carry out a test at significance level .05 using the fact that when $x = 35$, $s_{\hat{Y}} = .0523$.

32. The following data is a subset of data obtained in an experiment to study the relationship between x = soil pH and y = A1. Concentration/EC (“Root Responses of Three Gramineae Species to Soil Acidity in an Oxisol and an Ultisol,” Soil Science, 1973: 295–302):

x   4.01   4.07   4.08   4.10   4.18
y   1.20   .78    .83    .98    .65

x   4.20   4.23   4.27   4.30   4.41
y   .76    .40    .45    .39    .30

x   4.45   4.50   4.58   4.68   4.70   4.77
y   .20    .24    .10    .13    .07    .04

A cubic model was proposed in the article, but the version of Minitab used by the author of the present text refused to include the $x^3$ term in the model, stating that “$x^3$ is highly correlated with other predictor variables.” To remedy this, $\bar{x} = 4.3456$ was subtracted from each x value to yield $x' = x - \bar{x}$. A cubic regression was then requested to fit the model having regression function

$$y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2 + \beta_3^* (x')^3$$

The following computer output resulted:

Parameter   Estimate   Estimated SD
β0*         .3463      .0366
β1*         -1.2933    .2535
β2*         2.3964     .5699
β3*         -2.3968    2.4590

a. What is the estimated regression function for the “centered” model?

b. What is the estimated value of the coefficient $\beta_3$ in the “uncentered” model with regression function $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$? What is the estimate of $\beta_2$?

c. Using the cubic model, what value of y would you predict when soil pH is 4.5?

d. Carry out a test to decide whether the cubic term should be retained in the model.

33. In many polynomial regression problems, rather than fitting a “centered” regression function using $x' = x - \bar{x}$, computational accuracy can be improved by using a function of the standardized independent variable $x' = (x - \bar{x})/s_x$, where $s_x$ is the standard deviation of the $x_i$'s. Consider fitting the cubic regression function $y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2 + \beta_3^* (x')^3$ to the following data resulting from a study of the relation between thrust efficiency y of supersonic propelling rockets and the half-divergence angle x of the rocket nozzle (“More on Correlating Data,” CHEMTECH, 1976: 266–270):

x   5      10     15     20     25     30     35
y   .985   .996   .988   .962   .940   .915   .878

Parameter   Estimate   Estimated SD
β0*         .9671      .0026
β1*         -.0502     .0051
β2*         -.0176     .0023
β3*         .0062      .0031

a. What value of y would you predict when the half-divergence angle is 20? When $x = 25$?

b. What is the estimated regression function $\hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2 + \hat{\beta}_3 x^3$ for the “unstandardized” model?

c. Use a level .05 test to decide whether the cubic term should be deleted from the model.

d. What can you say about the relationship between SSEs and $R^2$'s for the standardized and unstandardized models? Explain.

e. SSE for the cubic model is .00006300, whereas for a quadratic model SSE is .00014367. Compute $R^2$ for each model. Does the difference between the two suggest that the cubic term can be deleted?

34. The following data resulted from an experiment to assess the potential of unburnt colliery spoil as a medium for plant growth. The variables are x = acid extractable cations and y = exchangeable acidity/total cation exchange capacity (“Exchangeable Acidity in Unburnt Colliery Spoil,” Nature, 1969: 161):

x   -23    -5     16     26     30    38    52
y   1.50   1.46   1.32   1.17   .96   .78   .77

x   58    67    81    96    100   113
y   .91   .78   .69   .52   .48   .55

Standardizing the independent variable x to obtain $x' = (x - \bar{x})/s_x$ and fitting the regression function $y = \beta_0^* + \beta_1^* x' + \beta_2^* (x')^2$ yielded the accompanying computer output.

Parameter   Estimate   Estimated SD
β0*         .8733      .0421
β1*         -.3255     .0316
β2*         .0448      .0319

a. Estimate $\mu_{Y \cdot 50}$.

b. Compute the value of the coefficient of multiple determination. (See Exercise 28(c).)

c. What is the estimated regression function $\hat{\beta}_0 + \hat{\beta}_1 x + \hat{\beta}_2 x^2$ using the unstandardized variable x?

d. What is the estimated standard deviation of $\hat{\beta}_2$ computed in part (c)?

e. Carry out a test using the standardized estimates to decide whether the quadratic term should be retained in the model. Repeat using the unstandardized estimates. Do your conclusions differ?

35. The article “The Respiration in Air and in Water of the Limpets Patella caerulea and Patella lusitanica” (Comp. Biochemistry and Physiology, 1975: 407–411) proposed a simple power model for the relationship between respiration rate y and temperature x for P. caerulea in air. However, a plot of ln(y) versus x exhibits a curved pattern. Fit the quadratic power model $Y = \alpha e^{\beta x + \gamma x^2} \cdot \epsilon$ to the accompanying data.

x   10     15     20      25      30
y   37.1   70.1   109.7   177.2   222.6


13.4 Multiple Regression Analysis In multiple regression, the objective is to build a probabilistic model that relates a dependent variable y to more than one independent or predictor variable. Let k repre- sent the number of predictor variables and denote these predictors by

. For example, in attempting to predict the selling price of a house, we might have with , and .x 3 5 number of roomsx 1 5 size (ft

2), x 2 5 age (years)k 5 3 x 1, x 2, c, xk

(k $ 2)

DEFINITION The general additive multiple regression model equation is

(13.15)

where and . In addition, for purposes of testing hypotheses and calculating CIs or PIs, it is assumed that is normally distributed.P

V(P) 5 s2E(P) 5 0

Y 5 b0 1 b1x 1 1 b2x 2 1 c 1 bkx k 1 P

Let be particular values of . Then (13.15) implies that

(13.16)

Thus just as describes the mean Y value as a function of x in simple linear regression, the true (or population) regression function gives the expected value of Y as a function of . The bi’s are the true (or population) regression coefficients. The regression coefficient b1 is interpreted as the expected change in Y associated with a 1-unit increase in x1 while are held fixed. Analogous interpretations hold for .

Models with Interaction and Quadratic Predictors If an investigator has obtained observations on y, x1, and x2, one possible model is

. However, other models can be constructed by forming predictors that are mathematical functions of x1 and/or x2. For example, with and , the model

has the general form of (13.15). In general, it is not only permissible for some pre- dictors to be mathematical functions of others but also often highly desirable in the sense that the resulting model may be much more successful in explaining variation in y than any model without such predictors. This discussion also shows that polyno- mial regression is indeed a special case of multiple regression. For example, the quad- ratic model has the form of (13.15) with , and .

For the case of two independent variables, x1 and x2, consider the following four derived models.

1. The first-order model:

2. The second-order no-interaction model:

Y 5 b0 1 b1x 1 1 b2x 2 1 b3x 1 2 1 b4x 2

2 1 P

Y 5 b0 1 b1x 1 1 b2x 2 1 P

x 2 5 x 2

k 5 2, x 1 5 xY 5 b0 1 b1x 1 b2x 2 1 P

Y 5 b0 1 b1x 1 1 b2x 2 1 b3x 3 1 b4x 4 1 P

x 4 5 x 1x 2

x 3 5 x 1 2

Y 5 b0 1 b1x 1 1 b2x 2 1 P

b2, c, bk

x 2, c, xk

x 1, c, xk

b0 1 b1x 1 1 c 1 bkx k

b0 1 b1x

mY #x1*,c,xk* 5 b0 1 b1x 1 * 1 c 1 bkx k*

x 1, c, xkx*1, x 2*, c, xk*

Copyright 2010 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s).

Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

554 CHAPTER 13 Nonlinear and Multiple Regression

3. The model with first-order predictors and interaction:

4. The complete second-order or full quadratic model:

Understanding the differences among these models is an important first step in building realistic regression models from the independent variables under study.

The first-order model is the most straightforward generalization of simple linear regression. It states that for a fixed value of either variable, the expected value of Y is a linear function of the other variable and that the expected change in Y associated with a unit increase in $x_1$ ($x_2$) is $\beta_1$ ($\beta_2$), independent of the level of $x_2$ ($x_1$). Thus if we graph the regression function as a function of $x_1$ for several different values of $x_2$, we obtain as contours of the regression function a collection of parallel lines, as pictured in Figure 13.13(a). The function $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2$ specifies a plane in three-dimensional space; the first-order model says that each observed value of the dependent variable corresponds to a point which deviates vertically from this plane by a random amount $\epsilon$.

According to the second-order no-interaction model, if $x_2$ is fixed, the expected change in Y for a 1-unit increase in $x_1$ is

$$\beta_0 + \beta_1(x_1 + 1) + \beta_2 x_2 + \beta_3(x_1 + 1)^2 + \beta_4 x_2^2 - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2) = \beta_1 + \beta_3 + 2\beta_3 x_1$$

Because this expected change does not depend on $x_2$, the contours of the regression function for different values of $x_2$ are still parallel to one another. However, the dependence of the expected change on the value of $x_1$ means that the contours are now curves rather than straight lines. This is pictured in Figure 13.13(b). In this case, the regression surface is no longer a plane in three-dimensional space but is instead a curved surface.

The contours of the regression function for the first-order interaction model, $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$, are nonparallel straight lines. This is because the expected change in Y when $x_1$ is increased by 1 is

$$\beta_0 + \beta_1(x_1 + 1) + \beta_2 x_2 + \beta_3(x_1 + 1)x_2 - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2) = \beta_1 + \beta_3 x_2$$

This expected change depends on the value of $x_2$, so each contour line must have a different slope, as in Figure 13.13(c). The word interaction reflects the fact that an expected change in Y when one variable increases in value depends on the value of the other variable.

Finally, for the complete second-order model, $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_2^2 + \beta_5 x_1 x_2 + \epsilon$, the expected change in Y when $x_2$ is held fixed while $x_1$ is increased by 1 unit is $\beta_1 + \beta_3 + 2\beta_3 x_1 + \beta_5 x_2$, which is a function of both $x_1$ and $x_2$. This implies that the contours of the regression function are both curved and not parallel to one another, as illustrated in Figure 13.13(d).

Similar considerations apply to models constructed from more than two independent variables. In general, the presence of interaction terms in the model implies that the expected change in Y depends not only on the variable being increased or decreased but also on the values of some of the fixed variables. As in ANOVA, it is possible to have higher-way interaction terms (e.g., $x_1 x_2 x_3$), making model interpretation more difficult.
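The four expected-change expressions can be checked numerically. The following is a minimal sketch in Python; the coefficient values are hypothetical, chosen only for illustration, and the function names are not from the text. It evaluates each regression function at $x_1 + 1$ and at $x_1$ and takes the difference.

```python
# Hypothetical coefficients beta_0 ... beta_5, chosen only for illustration.
B0, B1, B2, B3, B4, B5 = 1.0, 0.5, -1.0, 0.25, 0.5, 1.0

def first_order(x1, x2):
    return B0 + B1 * x1 + B2 * x2

def second_order_no_interaction(x1, x2):
    return B0 + B1 * x1 + B2 * x2 + B3 * x1**2 + B4 * x2**2

def first_order_interaction(x1, x2):
    # In this model B3 is the coefficient of the product x1 * x2.
    return B0 + B1 * x1 + B2 * x2 + B3 * x1 * x2

def complete_second_order(x1, x2):
    return (B0 + B1 * x1 + B2 * x2 + B3 * x1**2 + B4 * x2**2
            + B5 * x1 * x2)

def unit_change(model, x1, x2):
    """Change in E(Y) when x1 increases by 1 with x2 held fixed."""
    return model(x1 + 1, x2) - model(x1, x2)

for model in (first_order, second_order_no_interaction,
              first_order_interaction, complete_second_order):
    changes = [unit_change(model, x1, x2) for x1 in (2, 5) for x2 in (1, 3)]
    print(model.__name__, changes)
```

For the first-order model every change equals $\beta_1$; for the no-interaction quadratic model the change varies with $x_1$ only; for the interaction model it varies with $x_2$ only; and for the complete second-order model it varies with both.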


Figure 13.13 Contours of four different regression functions, each showing E(Y) plotted against $x_1$ for $x_2 = 1$, 2, and 3: (a) first-order model; (b) second-order no-interaction model; (c) first-order model with interaction; (d) complete second-order model

Note that if the model contains interaction or quadratic predictors, the generic interpretation of a $\beta_i$ given previously will not usually apply. This is because it is not then possible to increase $x_i$ by 1 unit and hold the values of all other predictors fixed.

Models with Predictors for Categorical Variables

Thus far we have explicitly considered the inclusion of only quantitative (numerical) predictor variables in a multiple regression model. Using simple numerical coding, qualitative (categorical) variables, such as bearing material (aluminum or copper/lead) or type of wood (pine, oak, or walnut), can also be incorporated into a model. Let’s first focus on the case of a dichotomous variable, one with just two possible categories: male or female, U.S. or foreign manufacture, and so on. With any such variable, we associate a dummy or indicator variable x whose possible values 0 and 1 indicate which category is relevant for any particular observation.

Example 13.11 The article “Estimating Urban Travel Times: A Comparative Study” (Trans. Res., 1980: 173–175) described a study relating the dependent variable $y$ = travel time between locations in a certain city and the independent variable $x_2$ = distance between locations. Two types of vehicles, passenger cars and trucks, were used in the study. Let

$$x_1 = \begin{cases} 1 & \text{if the vehicle is a truck} \\ 0 & \text{if the vehicle is a passenger car} \end{cases}$$


One possible multiple regression model is

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

The mean value of travel time depends on whether a vehicle is a car or a truck:

$$\text{mean time} = \beta_0 + \beta_2 x_2 \quad \text{when } x_1 = 0 \text{ (cars)}$$

$$\text{mean time} = \beta_0 + \beta_1 + \beta_2 x_2 \quad \text{when } x_1 = 1 \text{ (trucks)}$$

The coefficient $\beta_1$ is the difference in mean times between trucks and cars with distance held fixed; if $\beta_1 > 0$, on average it will take trucks longer to traverse any particular distance than it will for cars. A second possibility is a model with an interaction predictor:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \epsilon$$

Now the mean times for the two types of vehicles are

$$\text{mean time} = \beta_0 + \beta_2 x_2 \quad \text{when } x_1 = 0$$

$$\text{mean time} = \beta_0 + \beta_1 + (\beta_2 + \beta_3) x_2 \quad \text{when } x_1 = 1$$

For each model, the graph of the mean time versus distance is a straight line for either type of vehicle, as illustrated in Figure 13.14. The two lines are parallel for the first (no-interaction) model, but in general they will have different slopes when the second model is correct. For this latter model, the change in mean travel time associated with a 1-mile increase in distance depends on which type of vehicle is involved; that is, the two predictors “vehicle type” and “distance” interact. Indeed, data collected by the authors of the cited article suggested the presence of interaction.

Figure 13.14 Regression functions (mean y versus $x_2$) for models with one dummy variable ($x_1$) and one quantitative variable $x_2$: (a) no interaction, with parallel lines $\beta_0 + \beta_2 x_2$ ($x_1 = 0$) and $\beta_0 + \beta_1 + \beta_2 x_2$ ($x_1 = 1$); (b) interaction, with nonparallel lines $\beta_0 + \beta_2 x_2$ ($x_1 = 0$) and $\beta_0 + \beta_1 + (\beta_2 + \beta_3) x_2$ ($x_1 = 1$)
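The two models of Example 13.11 can be compared with a short sketch. The coefficient values below are hypothetical (the text does not report fitted values), and the function name `mean_time` is an illustrative choice. Setting the interaction coefficient to zero gives the parallel-line model.

```python
def mean_time(distance, is_truck, b0, b1, b2, b3=0.0):
    """Mean travel time under Y = b0 + b1*x1 + b2*x2 + b3*x1*x2 + error,
    where x1 is the truck indicator and x2 is distance."""
    x1 = 1 if is_truck else 0
    return b0 + b1 * x1 + b2 * distance + b3 * x1 * distance

# Hypothetical coefficients, for illustration only.
b0, b1, b2, b3 = 2.0, 1.5, 3.0, 0.5

def slope(is_truck, interaction):
    """Change in mean time for a 1-mile increase in distance."""
    coef = b3 if interaction else 0.0
    return (mean_time(11, is_truck, b0, b1, b2, coef)
            - mean_time(10, is_truck, b0, b1, b2, coef))

print("no interaction:", slope(False, False), slope(True, False))
print("interaction:   ", slope(False, True), slope(True, True))
```

Without the interaction predictor both vehicle types share the slope $\beta_2$ and differ only by the vertical shift $\beta_1$; with it, the truck line has slope $\beta_2 + \beta_3$.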

You might think that the way to handle a three-category situation is to define a single numerical variable with coded values such as 0, 1, and 2 corresponding to the three categories. This is incorrect, because it imposes an ordering on the categories that is not necessarily implied by the problem context. The correct approach to incorporating three categories is to define two different dummy variables. Suppose, for example, that y is the lifetime of a certain cutting tool, $x_1$ is cutting speed, and that there are three brands of tool being investigated. Then let

$$x_2 = \begin{cases} 1 & \text{if a brand A tool is used} \\ 0 & \text{otherwise} \end{cases} \qquad x_3 = \begin{cases} 1 & \text{if a brand B tool is used} \\ 0 & \text{otherwise} \end{cases}$$

When an observation on a brand A tool is made, $x_2 = 1$ and $x_3 = 0$, whereas for a brand B tool, $x_2 = 0$ and $x_3 = 1$. An observation made on a brand C tool has $x_2 = x_3 = 0$, and it is not possible that $x_2 = x_3 = 1$ because a tool cannot simultaneously be both brand A and brand B. The no-interaction model would have only the predictors $x_1$, $x_2$, and $x_3$. The following interaction model allows the mean change in lifetime associated with a 1-unit increase in speed to depend on the brand of tool:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3 + \epsilon$$

Construction of a picture like Figure 13.14 with a graph for each of the three possible $(x_2, x_3)$ pairs gives three nonparallel lines (unless $\beta_4 = \beta_5 = 0$).

More generally, incorporating a categorical variable with c possible categories into a multiple regression model requires the use of $c - 1$ indicator variables (e.g., five brands of tools would necessitate using four indicator variables). Thus even one categorical variable can add many predictors to a model.

Estimating Parameters

The data in simple linear regression consists of n pairs $(x_1, y_1), \ldots, (x_n, y_n)$. Suppose that a multiple regression model contains two predictor variables, $x_1$ and $x_2$. Then the data set will consist of n triples $(x_{11}, x_{21}, y_1), (x_{12}, x_{22}, y_2), \ldots, (x_{1n}, x_{2n}, y_n)$. Here the first subscript on x refers to the predictor and the second to the observation number. More generally, with k predictors, the data consists of n $(k + 1)$-tuples $(x_{11}, x_{21}, \ldots, x_{k1}, y_1), (x_{12}, x_{22}, \ldots, x_{k2}, y_2), \ldots, (x_{1n}, x_{2n}, \ldots, x_{kn}, y_n)$, where $x_{ij}$ is the value of the ith predictor $x_i$ associated with the observed value $y_j$. The observations are assumed to have been obtained independently of one another according to the model (13.15). To estimate the parameters $\beta_0, \beta_1, \ldots, \beta_k$ using the principle of least squares, form the sum of squared deviations of the observed $y_j$’s from a trial function $y = b_0 + b_1 x_1 + \cdots + b_k x_k$:

$$f(b_0, b_1, \ldots, b_k) = \sum_j [y_j - (b_0 + b_1 x_{1j} + b_2 x_{2j} + \cdots + b_k x_{kj})]^2 \qquad (13.17)$$

The least squares estimates are those values of the $b_i$’s that minimize $f(b_0, \ldots, b_k)$. Taking the partial derivative of f with respect to each $b_i$ $(i = 0, 1, \ldots, k)$ and equating all partials to zero yields the following system of normal equations:

$$\begin{aligned}
b_0 n + b_1 \sum x_{1j} + b_2 \sum x_{2j} + \cdots + b_k \sum x_{kj} &= \sum y_j \\
b_0 \sum x_{1j} + b_1 \sum x_{1j}^2 + b_2 \sum x_{1j} x_{2j} + \cdots + b_k \sum x_{1j} x_{kj} &= \sum x_{1j} y_j \\
&\;\;\vdots \\
b_0 \sum x_{kj} + b_1 \sum x_{1j} x_{kj} + \cdots + b_{k-1} \sum x_{k-1,j} x_{kj} + b_k \sum x_{kj}^2 &= \sum x_{kj} y_j
\end{aligned} \qquad (13.18)$$

These equations are linear in the unknowns $b_0, b_1, \ldots, b_k$. Solving (13.18) yields the least squares estimates $\hat\beta_0, \hat\beta_1, \ldots, \hat\beta_k$. This is best done by utilizing a statistical software package.
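As a rough illustration of what such software does, the sketch below builds the normal equations (13.18) in matrix form and solves them with NumPy. The six observations are made up for the purpose of the example, and writing the system as $(X^T X) b = X^T y$, with a leading column of 1s in X for the intercept, is a standard matrix restatement rather than notation used in this section.

```python
import numpy as np

# Hypothetical data: n = 6 observations on k = 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Design matrix: a column of 1s (for b0) followed by the predictors.
X = np.column_stack([np.ones_like(x1), x1, x2])

# Normal equations: the entries of A are n, sum(x1j), sum(x1j*x2j), ...
# and the entries of c are sum(yj), sum(x1j*yj), sum(x2j*yj).
A = X.T @ X
c = X.T @ y
b_hat = np.linalg.solve(A, c)

# Cross-check against NumPy's least squares routine.
b_check, *_ = np.linalg.lstsq(X, y, rcond=None)

print("normal-equation solution:", b_hat)
print("lstsq solution:          ", b_check)
```

Solving the normal equations directly is fine for small, well-conditioned problems like this one; numerical libraries generally prefer factorization-based routines such as `lstsq` when predictors are nearly collinear.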


Example 13.12 The article “How to Optimize and Control the Wire Bonding Process: Part II” (Solid State Technology, Jan. 1991: 67–72) described an experiment carried out to assess the impact of the variables $x_1$ = force (gm), $x_2$ = power (mW), $x_3$ = temperature (°C), and $x_4$ = time (msec) on $y$ = ball bond shear strength (gm). The following data* was generated to be consistent with the information given in the article:

Observation   Force   Power   Temperature   Time   Strength
     1          30      60       175         15      26.2
     2          40      60       175         15      26.3
     3          30      90       175         15      39.8
     4          40      90       175         15      39.7
     5          30      60       225         15      38.6
     6          40      60       225         15      35.5
     7          30      90       225         15      48.8
     8          40      90       225         15      37.8
     9          30      60       175         25      26.6
    10          40      60       175         25      23.4
    11          30      90       175         25      38.6
    12          40      90       175         25      52.1
    13          30      60       225         25      39.5
    14          40      60       225         25      32.3
    15          30      90       225         25      43.0
    16          40      90       225         25      56.0
    17          25      75       200         20      35.2
    18          45      75       200         20      46.9
    19          35      45       200         20      22.7
    20          35     105       200         20      58.7
    21          35      75       150         20      34.5
    22          35      75       250         20      44.0
    23          35      75       200         10      35.7
    24          35      75       200         30      41.8
    25          35      75       200         20      36.5
    26          35      75       200         20      37.6
    27          35      75       200         20      40.3
    28          35      75       200         20      46.0
    29          35      75       200         20      27.8
    30          35      75       200         20      40.3

* From the book Statistics for Engineering Problem Solving by Stephen Vardeman, an excellent exposition of the territory covered by our book, albeit at a somewhat higher level.

A statistical computer package gave the following least squares estimates:

$$\hat\beta_0 = -37.48 \quad \hat\beta_1 = .2117 \quad \hat\beta_2 = .4983 \quad \hat\beta_3 = .1297 \quad \hat\beta_4 = .2583$$

Thus we estimate that .1297 gm is the average change in strength associated with a 1-degree increase in temperature when the other three predictors are held fixed; the other estimated coefficients are interpreted in a similar manner.

The estimated regression equation is

$$y = -37.48 + .2117x_1 + .4983x_2 + .1297x_3 + .2583x_4$$

A point prediction of strength resulting from a force of 35 gm, power of 75 mW, temperature of 200°C, and time of 20 msec is

$$\hat{y} = -37.48 + (.2117)(35) + (.4983)(75) + (.1297)(200) + (.2583)(20) = 38.41 \text{ gm}$$


This is also a point estimate of the mean value of strength for the specified values of force, power, temperature, and time. ■
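The estimates in Example 13.12 can be reproduced from the tabulated data. The sketch below uses NumPy's general-purpose least squares routine, which is an assumption on our part: the text does not say which package produced the reported values, and the variable names are illustrative.

```python
import numpy as np

# Columns: force (gm), power (mW), temperature (deg C), time (msec), strength (gm)
data = np.array([
    [30, 60, 175, 15, 26.2], [40, 60, 175, 15, 26.3],
    [30, 90, 175, 15, 39.8], [40, 90, 175, 15, 39.7],
    [30, 60, 225, 15, 38.6], [40, 60, 225, 15, 35.5],
    [30, 90, 225, 15, 48.8], [40, 90, 225, 15, 37.8],
    [30, 60, 175, 25, 26.6], [40, 60, 175, 25, 23.4],
    [30, 90, 175, 25, 38.6], [40, 90, 175, 25, 52.1],
    [30, 60, 225, 25, 39.5], [40, 60, 225, 25, 32.3],
    [30, 90, 225, 25, 43.0], [40, 90, 225, 25, 56.0],
    [25, 75, 200, 20, 35.2], [45, 75, 200, 20, 46.9],
    [35, 45, 200, 20, 22.7], [35, 105, 200, 20, 58.7],
    [35, 75, 150, 20, 34.5], [35, 75, 250, 20, 44.0],
    [35, 75, 200, 10, 35.7], [35, 75, 200, 30, 41.8],
    [35, 75, 200, 20, 36.5], [35, 75, 200, 20, 37.6],
    [35, 75, 200, 20, 40.3], [35, 75, 200, 20, 46.0],
    [35, 75, 200, 20, 27.8], [35, 75, 200, 20, 40.3],
])

predictors, strength = data[:, :4], data[:, 4]
X = np.column_stack([np.ones(len(strength)), predictors])

# Least squares estimates of beta_0, ..., beta_4.
beta_hat, *_ = np.linalg.lstsq(X, strength, rcond=None)

# Point prediction at force 35, power 75, temperature 200, time 20.
x_new = np.array([1.0, 35.0, 75.0, 200.0, 20.0])
y_hat = float(x_new @ beta_hat)

print("estimates:", np.round(beta_hat, 4))
print("prediction:", round(y_hat, 2))
```

The printed estimates should agree with the reported values up to rounding.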

$R^2$ and $\hat\sigma^2$

Predicted or fitted values, residuals, and the various sums of squares are calculated as in simple linear and polynomial regression. The predicted value $\hat{y}_1$ results from substituting the values of the various predictors from the first observation into the estimated regression function:

$$\hat{y}_1 = \hat\beta_0 + \hat\beta_1 x_{11} + \hat\beta_2 x_{21} + \cdots + \hat\beta_k x_{k1}$$

The remaining predicted values $\hat{y}_2, \ldots, \hat{y}_n$ come from substituting values of the predictors from the 2nd, 3rd, ..., and finally nth observations into the estimated function. For example, the values of the 4 predictors for the last observation in Example 13.12 are $x_{1,30} = 35$, $x_{2,30} = 75$, $x_{3,30} = 200$, and $x_{4,30} = 20$, so

$$\hat{y}_{30} = -37.48 + .2117(35) + .4983(75) + .1297(200) + .2583(20) = 38.41$$

The residuals $y_1 - \hat{y}_1, \ldots, y_n - \hat{y}_n$ are the differences between the observed and predicted values. The last residual in Example 13.12 is $40.3 - 38.41 = 1.89$. The closer the residuals are to 0, the better the job our estimated regression function is doing in making predictions corresponding to observations in the sample.

Error or residual sum of squares is $SSE = \sum (y_i - \hat{y}_i)^2$. It is again interpreted as a measure of how much variation in the observed y values is not explained by (not attributed to) the model relationship. The number of df associated with SSE is $n - (k + 1)$ because $k + 1$ df are lost in estimating the $k + 1$ $\beta$ coefficients. Total sum of squares, a measure of total variation in the observed y values, is $SST = \sum (y_i - \bar{y})^2$. Regression sum of squares $SSR = \sum (\hat{y}_i - \bar{y})^2 = SST - SSE$ is a measure of explained variation. Then the coefficient of multiple determination $R^2$ is

$$R^2 = 1 - SSE/SST = SSR/SST$$

It is interpreted as the proportion of observed y variation that can be explained by the multiple regression model fit to the data.

Because there is no preliminary picture of multiple regression data analogous to a scatter plot for bivariate data, the coefficient of multiple determination is our first indication of whether the chosen model is successful in explaining y variation. Unfortunately, there is a problem with $R^2$: Its value can be inflated by adding lots of predictors into the model even if most of these predictors are rather frivolous. For example, suppose y is the sale price of a house. Then sensible predictors include $x_1$ = the interior size of the house, $x_2$ = the size of the lot on which the house sits, $x_3$ = the number of bedrooms, $x_4$ = the number of bathrooms, and $x_5$ = the house’s age. Now suppose we add in $x_6$ = the diameter of the doorknob on the coat closet, $x_7$ = the thickness of the cutting board in the kitchen, $x_8$ = the thickness of the patio slab, and so on. Unless we are very unlucky in our choice of predictors, using $n - 1$ predictors (one fewer than the sample size) will yield $R^2 = 1$. So the objective in multiple regression is not simply to explain most of the observed y variation, but to do so using a model with relatively few predictors that are easily interpreted. It is thus desirable to adjust $R^2$, as was done in polynomial regression, to take account of the size of the model:

$$R_a^2 = 1 - \frac{SSE/(n - (k + 1))}{SST/(n - 1)} = 1 - \frac{n - 1}{n - (k + 1)} \cdot \frac{SSE}{SST}$$


Because the ratio in front of SSE/SST exceeds 1, $R_a^2$ is smaller than $R^2$. Furthermore, the larger the number of predictors k relative to the sample size n, the smaller $R_a^2$ will be relative to $R^2$. Adjusted $R^2$ can even be negative, whereas $R^2$ itself must be between 0 and 1. A value of $R_a^2$ that is substantially smaller than $R^2$ itself is a warning that the model may contain too many predictors.

The positive square root of $R^2$ is called the multiple correlation coefficient and is denoted by R. It can be shown that R is the sample correlation coefficient calculated from the $(\hat{y}_i, y_i)$ pairs (that is, use $\hat{y}_i$ in place of $x_i$ in the formula for r from Section 12.5).

SSE is also the basis for estimating the remaining model parameter:

$$\hat\sigma^2 = s^2 = \frac{SSE}{n - (k + 1)} = MSE$$
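The inflation of $R^2$ by frivolous predictors is easy to see in a small simulation. The sketch below is illustrative only: the response and all predictors are independent random numbers (so no predictor is genuinely useful), the seed and sample size are arbitrary, and the helper name `r2_and_adjusted` is hypothetical.

```python
import numpy as np

def r2_and_adjusted(predictors, y):
    """Fit by least squares (intercept added here) and return
    (R^2, adjusted R^2, SSE). `predictors` is an n-by-k array."""
    n, k = predictors.shape
    design = np.column_stack([np.ones(n), predictors])
    beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta_hat
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    r2 = 1 - sse / sst
    if n - (k + 1) > 0:
        adjusted = 1 - (n - 1) / (n - (k + 1)) * sse / sst
    else:
        adjusted = float("nan")  # no error df remain when k = n - 1
    return r2, adjusted, sse

rng = np.random.default_rng(0)          # arbitrary seed
n = 10
y = rng.normal(size=n)                  # response unrelated to anything
junk = rng.normal(size=(n, n - 1))      # n - 1 frivolous predictors

results = {k: r2_and_adjusted(junk[:, :k], y) for k in (1, 3, 6, 9)}
for k, (r2, adjusted, _) in results.items():
    print(f"k = {k}: R^2 = {r2:.3f}, adjusted R^2 = {adjusted:.3f}")
```

$R^2$ can only increase as predictors are added and reaches 1 at $k = n - 1$, while adjusted $R^2$ carries a penalty for model size and is undefined once no error df remain.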

Example 13.13 Investigators carried out a study to see how various characteristics of concrete are influenced by $x_1$ = % limestone powder and $x_2$ = water–cement ratio, resulting in the accompanying data (“Durability of Concrete with Addition of Limestone Powder,” Magazine of Concrete Research, 1996: 131–137).

 x1     x2     x1x2    28-day Comp Str. (MPa)   Adsorbability (%)
 21    .65    13.65           33.55                   8.42
 21    .55    11.55           47.55                   6.26
  7    .65     4.55           35.00                   6.74
  7    .55     3.85           35.90                   6.59
 28    .60    16.80           40.90                   7.28
  0    .60     0.00           39.10                   6.90
 14    .70     9.80           31.55                  10.80
 14    .50     7.00           48.00                   5.63
 14    .60     8.40           42.30                   7.43

                       ȳ = 39.317, SST = 278.52   ȳ = 7.339, SST = 18.356

Consider first compressive strength as the dependent variable y. Fitting the first-order model results in

$$y = 84.82 + .1643x_1 - 79.67x_2, \quad SSE = 72.52 \ (df = 6), \quad R^2 = .741, \quad R_a^2 = .654$$

whereas including an interaction predictor gives

$$y = 6.22 + 5.779x_1 + 51.33x_2 - 9.357x_1x_2$$

$$SSE = 29.35 \ (df = 5) \quad R^2 = .895 \quad R_a^2 = .831$$

Based on this latter fit, a prediction for compressive strength when % limestone = 14 and water–cement ratio = .60 is

$$\hat{y} = 6.22 + 5.779(14) + 51.33(.60) - 9.357(8.4) = 39.32$$

Fitting the full quadratic relationship results in virtually no change in the $R^2$ value. However, when the dependent variable is adsorbability, the following results are obtained: $R^2 = .747$ when just two predictors are used, .802 when the interaction predictor is added, and .889 when the five predictors for the full quadratic relationship are used. ■

In general, $\hat\beta_i$ can be interpreted as an estimate of the average change in Y associated with a 1-unit increase in $x_i$ while values of all other predictors are held fixed. Sometimes, though, it is difficult or even impossible to increase the value of one predictor while holding all others fixed. In such situations, there is an alternative interpretation of the estimated regression coefficients. For concreteness, suppose that $k = 2$, and let $\hat\beta_1$ denote the estimate of $\beta_1$ in the regression of y on the two predictors $x_1$ and $x_2$. Then

1. Regress y against just $x_2$ (a simple linear regression) and denote the resulting residuals by $g_1, g_2, \ldots, g_n$. These residuals represent variation in y after removing or adjusting for the effects of $x_2$.

2. Regress $x_1$ against $x_2$ (that is, regard $x_1$ as the dependent variable and $x_2$ as the independent variable in this simple linear regression), and denote the residuals by $f_1, \ldots, f_n$. These residuals represent variation in $x_1$ after removing or adjusting for the effects of $x_2$.

Now consider plotting the residuals from the first regression against those from the second; that is, plot the pairs $(f_1, g_1), \ldots, (f_n, g_n)$. The result is called a partial residual plot or adjusted residual plot. If a regression line is fit to the points in this plot, the slope turns out to be exactly $\hat\beta_1$ (furthermore, the residuals from this line are exactly the residuals $e_1, \ldots, e_n$ from the multiple regression of y on $x_1$ and $x_2$). Thus $\hat\beta_1$ can be interpreted as the estimated change in y associated with a 1-unit increase in $x_1$ after removing or adjusting for the effects of any other model predictors. The same interpretation holds for other estimated coefficients regardless of the number of predictors in the model (there is nothing special about $k = 2$; the foregoing argument remains valid if y is regressed against all predictors other than $x_1$ in Step 1 and $x_1$ is regressed against the other $k - 1$ predictors in Step 2).

As an example, suppose that y is the sale price of an apartment building and that the predictors are number of apartments, age, lot size, number of parking spaces, and gross building area (ft²). It may not be reasonable to increase the number of apartments without also increasing gross area. However, if $\hat\beta_5 = 16.00$, then we estimate that a $16 increase in sale price is associated with each extra square foot of gross area after adjusting for the effects of the other four predictors.

A Model Utility Test

With multivariate data, there is no picture analogous to a scatter plot to indicate whether a particular multiple regression model will successfully explain observed y variation. The value of $R^2$ certainly communicates a preliminary message, but this value is sometimes deceptive because it can be greatly inflated by using a large number of predictors relative to the sample size. For this reason, it is important to have a formal test for model utility.

The model utility test in simple linear regression involved the null hypothesis $H_0$: $\beta_1 = 0$, according to which there is no useful relation between y and the single predictor x. Here we consider the assertion that $\beta_1 = 0, \beta_2 = 0, \ldots, \beta_k = 0$, which says that there is no useful relationship between y and any of the k predictors. If at least one of these $\beta$’s is not 0, the corresponding predictor(s) is (are) useful. The test is based on a statistic that has a particular F distribution when $H_0$ is true.

Null hypothesis: $H_0$: $\beta_1 = \beta_2 = \cdots = \beta_k = 0$

Alternative hypothesis: $H_a$: at least one $\beta_i \neq 0$ $(i = 1, \ldots, k)$

Test statistic value:

$$f = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{SSR/k}{SSE/[n - (k + 1)]} = \frac{MSR}{MSE} \qquad (13.19)$$

where SSR = regression sum of squares = SST − SSE

Rejection region for a level $\alpha$ test: $f \geq F_{\alpha, k, n - (k + 1)}$

Except for a constant multiple, the test statistic here is $R^2/(1 - R^2)$, the ratio of explained to unexplained variation. If the proportion of explained variation is high relative to unexplained, we would naturally want to reject $H_0$ and confirm the utility of the model. However, if k is large relative to n, the factor $[(n - (k + 1))/k]$ will decrease f considerably.
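The two forms of the statistic in (13.19) can be compared with a short calculation. The sketch below uses the first-order compressive strength fit from Example 13.13 (SSE = 72.52, SST = 278.52, n = 9, k = 2); the function names are illustrative, and $R^2$ is recomputed from SSE and SST so that both forms rest on the same unrounded value.

```python
def f_from_r2(r2, n, k):
    """Model utility F statistic written in terms of R^2."""
    return (r2 / k) / ((1 - r2) / (n - (k + 1)))

def f_from_sums_of_squares(sse, sst, n, k):
    """The same statistic written as MSR / MSE."""
    ssr = sst - sse
    msr = ssr / k
    mse = sse / (n - (k + 1))
    return msr / mse

# First-order fit of compressive strength in Example 13.13.
sse, sst, n, k = 72.52, 278.52, 9, 2
r2 = 1 - sse / sst

f1 = f_from_r2(r2, n, k)
f2 = f_from_sums_of_squares(sse, sst, n, k)
print(round(r2, 4), round(f1, 2), round(f2, 2))
```

The resulting value would then be compared with the tabled critical value $F_{\alpha, 2, 6}$ to decide whether to reject $H_0$.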

Example 13.14 Returning to the bond shear strength data of Example 13.12, a model with $k = 4$ predictors was fit, so the relevant hypotheses are

$H_0$: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$

$H_a$: at least one of these four $\beta$’s is not 0

Figure 13.15 shows output from the JMP statistical package. The values of s (Root Mean Square Error), $R^2$, and adjusted $R^2$ certainly suggest a useful model. The value of the model utility F ratio is

$$f = \frac{R^2/k}{(1 - R^2)/[n - (k + 1)]} = \frac{.713959/4}{.286041/(30 - 5)} = 15.60$$

Figure 13.15 Multiple regression output from JMP for the data of Example 13.14

This value also appears in the F Ratio column of the ANOVA table in Figure 13.15. The largest F critical value for 4 numerator and 25 denominator df in Appendix Table A.9 is 6.49, which captures an upper-tail area of .001. Thus P-value < .001. The ANOVA table in the JMP output shows that P-value < .0001. This is a highly significant result. The null hypothesis should be rejected at any reasonable significance level. We conclude that there is a useful linear relationship between y and at least one of the four predictors in the model. This does not mean that all four predictors are useful; we will say more about this subsequently. ■

Inferences in Multiple Regression

Before testing hypotheses, constructing CIs, and making predictions, the adequacy of the model should be assessed and the impact of any unusual observations investigated. Methods for doing this are described at the end of the present section and in the next section.

Because each $\hat\beta_i$ is a linear function of the $y_i$’s, the standard deviation of each $\hat\beta_i$ is the product of $\sigma$ and a function of the $x_{ij}$’s. An estimate $s_{\hat\beta_i}$ of this SD is obtained by substituting s for $\sigma$. The function of the $x_{ij}$’s is quite complicated, but all standard statistical software packages compute and show the $s_{\hat\beta_i}$’s. Inferences concerning a single $\beta_i$ are based on the standardized variable

$$T = \frac{\hat\beta_i - \beta_i}{S_{\hat\beta_i}}$$

which has a t distribution with $n - (k + 1)$ df.

The point estimate of $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$, the expected value of Y when $x_1 = x_1^*, \ldots, x_k = x_k^*$, is $\hat\mu_{Y \cdot x_1^*, \ldots, x_k^*} = \hat\beta_0 + \hat\beta_1 x_1^* + \cdots + \hat\beta_k x_k^*$. The estimated standard deviation of the corresponding estimator is again a complicated expression involving the sample $x_{ij}$’s. However, appropriate software will calculate it on request. Inferences about $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$ are based on standardizing its estimator to obtain a t variable having $n - (k + 1)$ df.

1. A $100(1 - \alpha)\%$ CI for $\beta_i$, the coefficient of $x_i$ in the regression function, is

$$\hat\beta_i \pm t_{\alpha/2, n - (k + 1)} \cdot s_{\hat\beta_i}$$

2. A test for $H_0$: $\beta_i = \beta_{i0}$ uses the t statistic value $t = (\hat\beta_i - \beta_{i0})/s_{\hat\beta_i}$ based on $n - (k + 1)$ df. The test is upper-, lower-, or two-tailed according to whether $H_a$ contains the inequality $>$, $<$, or $\neq$.

3. A $100(1 - \alpha)\%$ CI for $\mu_{Y \cdot x_1^*, \ldots, x_k^*}$ is

$$\hat\mu_{Y \cdot x_1^*, \ldots, x_k^*} \pm t_{\alpha/2, n - (k + 1)} \cdot \{\text{estimated SD of } \hat\mu_{Y \cdot x_1^*, \ldots, x_k^*}\} = \hat{y} \pm t_{\alpha/2, n - (k + 1)} \cdot s_{\hat{Y}}$$

where $\hat{Y}$ is the statistic $\hat\beta_0 + \hat\beta_1 x_1^* + \cdots + \hat\beta_k x_k^*$ and $\hat{y}$ is the calculated value of $\hat{Y}$.

4. A $100(1 - \alpha)\%$ PI for a future y value is

$$\hat\mu_{Y \cdot x_1^*, \ldots, x_k^*} \pm t_{\alpha/2, n - (k + 1)} \cdot \{s^2 + (\text{estimated SD of } \hat\mu_{Y \cdot x_1^*, \ldots, x_k^*})^2\}^{1/2} = \hat{y} \pm t_{\alpha/2, n - (k + 1)} \cdot \sqrt{s^2 + s_{\hat{Y}}^2}$$

Simultaneous intervals for which the simultaneous confidence or prediction level is controlled can be obtained by applying the Bonferroni technique.


Example 13.15 Soil and sediment adsorption, the extent to which chemicals collect in a condensed form on the surface, is an important characteristic influencing the effectiveness of pesticides and various agricultural chemicals. The article “Adsorption of Phosphate, Arsenate, Methanearsonate, and Cacodylate by Lake and Stream Sediments: Comparisons with Soils” (J. of Environ. Qual., 1984: 499–504) gives the accompanying data (Table 13.5) on $y$ = phosphate adsorption index, $x_1$ = amount of extractable iron, and $x_2$ = amount of extractable aluminum.

Table 13.5 Data for Example 13.15

Observation   x1 = Extractable Iron   x2 = Extractable Aluminum   y = Adsorption Index
     1                 61                        13                         4
     2                175                        21                        18
     3                111                        24                        14
     4                124                        23                        18
     5                130                        64                        26
     6                173                        38                        26
     7                169                        33                        21
     8                169                        61                        30
     9                160                        39                        28
    10                244                        71                        36
    11                257                       112                        65
    12                333                        88                        62
    13                199                        54                        40

The article proposed the model

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$

A computer analysis yielded the following information:

Parameter $\beta_i$   Estimate $\hat\beta_i$   Estimated SD $s_{\hat\beta_i}$
     $\beta_0$               −7.351                    3.485
     $\beta_1$               .11273                    .02969
     $\beta_2$               .34900                    .07131

$$R^2 = .948 \quad \text{adjusted } R^2 = .938 \quad s = 4.379$$

$$\hat\mu_{Y \cdot 160, 39} = \hat{y} = -7.351 + (.11273)(160) + (.34900)(39) = 24.30$$

$$\text{estimated SD of } \hat\mu_{Y \cdot 160, 39} = s_{\hat{Y}} = 1.30$$

A 99% CI for $\beta_1$, the change in expected adsorption associated with a 1-unit increase in extractable iron while extractable aluminum is held fixed, requires $t_{.005, 13 - (2 + 1)} = t_{.005, 10} = 3.169$. The CI is

$$.11273 \pm (3.169)(.02969) = .11273 \pm .09409 \approx (.019, .207)$$

Similarly, a 99% interval for $\beta_2$ is

$$.34900 \pm (3.169)(.07131) = .34900 \pm .22598 \approx (.123, .575)$$

The Bonferroni technique implies that the simultaneous confidence level for both intervals is at least 98%.
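The interval arithmetic for the two coefficients can be sketched as follows. The estimates, standard errors, and critical value are taken from the example; the helper name `t_interval` is an illustrative choice, and the critical value is hard-coded from the text rather than computed.

```python
def t_interval(estimate, std_error, t_crit):
    """Two-sided interval: estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

T_005_10 = 3.169  # t critical value for 10 df, as quoted in the example

ci_b1 = t_interval(0.11273, 0.02969, T_005_10)
ci_b2 = t_interval(0.34900, 0.07131, T_005_10)

# Bonferroni: m intervals, each at level 1 - alpha, have simultaneous
# confidence level at least 1 - m * alpha.
m, alpha = 2, 0.01
simultaneous_lower_bound = 1 - m * alpha

print([round(v, 3) for v in ci_b1])
print([round(v, 3) for v in ci_b2])
print(simultaneous_lower_bound)
```

Neither interval includes 0, which is consistent with both predictors carrying useful information in the presence of the other.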


A 95% CI for $\mu_{Y \cdot 160, 39}$, expected adsorption when extractable iron = 160 and extractable aluminum = 39, is

$$24.30 \pm (2.228)(1.30) = 24.30 \pm 2.90 = (21.40, 27.20)$$

A 95% PI for a future value of adsorption to be observed when $x_1 = 160$ and $x_2 = 39$ is

$$24.30 \pm (2.228)\{(4.379)^2 + (1.30)^2\}^{1/2} = 24.30 \pm 10.18 = (14.12, 34.48)$$

Frequently, the hypothesis of interest has the form $H_0$: $\beta_i = 0$ for a particular i. For example, after fitting the four-predictor model in Example 13.12, the investigator might wish to test $H_0$: $\beta_4 = 0$. According to $H_0$, as long as the predictors $x_1$, $x_2$, and $x_3$ remain in the model, $x_4$ contains no useful information about y. The test statistic value is the t ratio $\hat\beta_i / s_{\hat\beta_i}$. Many statistical computer packages report the t ratio and corresponding P-value for each predictor included in the model. For example, Figure 13.15 shows that as long as power, temperature, and time are retained in the model, the predictor $x_1$ = force can be deleted.

An F Test for a Group of Predictors

The model utility F test was appropriate for testing whether there is useful information about the dependent variable in any of the k predictors (i.e., whether $\beta_1 = \cdots = \beta_k = 0$). In many situations, one first builds a model containing k predictors and then wishes to know whether any of the predictors in a particular subset provide useful information about Y. For example, a model to be used to predict students’ test scores might include a group of background variables such as family income and education levels and also some school characteristic variables such as class size and spending per pupil. One interesting hypothesis is that the school characteristic predictors can be dropped from the model.

Let’s label the predictors as $x_1, x_2, \ldots, x_l, x_{l+1}, \ldots, x_k$, so that it is the last $k - l$ that we are considering deleting. The relevant hypotheses are as follows:

$H_0\colon \beta_{l+1} = \beta_{l+2} = \cdots = \beta_k = 0$
(so the "reduced" model $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_l x_l + \epsilon$ is correct)

versus

$H_a\colon$ at least one among $\beta_{l+1}, \ldots, \beta_k$ is not 0
(so in the "full" model $Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$, at least one of the last $k - l$ predictors provides useful information)

The test is carried out by fitting both the full and reduced models. Because the full model contains not only the predictors of the reduced model but also some extra predictors, it should fit the data at least as well as the reduced model. That is, if we let $SSE_k$ be the sum of squared residuals for the full model and $SSE_l$ be the corresponding sum for the reduced model, then $SSE_k \le SSE_l$. Intuitively, if $SSE_k$ is a great deal smaller than $SSE_l$, the full model provides a much better fit than the reduced model; the appropriate test statistic should then depend on the reduction $SSE_l - SSE_k$ in unexplained variation.

$SSE_k$ = unexplained variation for the full model

$SSE_l$ = unexplained variation for the reduced model

Test statistic value:

$$f = \frac{(SSE_l - SSE_k)/(k - l)}{SSE_k/[n - (k + 1)]} \tag{13.20}$$

Rejection region: $f \ge F_{\alpha,\, k-l,\, n-(k+1)}$
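The computation in (13.20) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `group_f_statistic` and its argument names are assumptions made here, not part of the text. The numbers in the usage lines are the SSE values, k, l, n, and tabulated critical value from Example 13.16, which follows.

```python
def group_f_statistic(sse_reduced, sse_full, k, l, n):
    """Test statistic (13.20) for H0: beta_{l+1} = ... = beta_k = 0.

    sse_reduced -- SSE_l, unexplained variation for the reduced model
    sse_full    -- SSE_k, unexplained variation for the full model
    k, l        -- number of predictors in the full and reduced models
    n           -- number of observations
    Returns (f, numerator df, denominator df).
    """
    if not 0 <= l < k:
        raise ValueError("need 0 <= l < k")
    df_num = k - l               # number of predictors being tested
    df_den = n - (k + 1)         # error df for the full model
    if df_den <= 0:
        raise ValueError("need n > k + 1")
    if sse_full > sse_reduced:
        raise ValueError("the full model cannot have the larger SSE")
    f = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f, df_num, df_den


# Values taken from Example 13.16 (k = 14, l = 4, n = 30).
f, df_num, df_den = group_f_statistic(17.4951, 4.4782, k=14, l=4, n=30)
critical_value = 3.80            # tabulated F_{.01,10,15}, supplied by hand
print(round(f, 2), df_num, df_den, f >= critical_value)
```

The critical value is passed in by hand from an F table so that the sketch needs no statistical library; with software that provides F quantiles, it could be computed instead.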


Example 13.16

The data in Table 13.6 was taken from the article "Applying Stepwise Multiple Regression Analysis to the Reaction of Formaldehyde with Cotton Cellulose" (Textile Research J., 1984: 157–165). The dependent variable y is durable press rating, a quantitative measure of wrinkle resistance. The four independent variables used in the model building process are $x_1$ = HCHO (formaldehyde) concentration, $x_2$ = catalyst ratio, $x_3$ = curing temperature, and $x_4$ = curing time.

Table 13.6 Data for Example 13.16

Observation  x1  x2   x3  x4    y      Observation  x1  x2   x3  x4    y
     1        8   4  100   1  1.4          16        4  10  160   5  4.6
     2        2   4  180   7  2.2          17        4  13  100   7  4.3
     3        7   4  180   1  4.6          18       10  10  120   7  4.9
     4       10   7  120   5  4.9          19        5   4  100   1  1.7
     5        7   4  180   5  4.6          20        8  13  140   1  4.6
     6        7   7  180   1  4.7          21       10   1  180   1  2.6
     7        7  13  140   1  4.6          22        2  13  140   1  3.1
     8        5   4  160   7  4.5          23        6  13  180   7  4.7
     9        4   7  140   3  4.8          24        7   1  120   7  2.5
    10        5   1  100   7  1.4          25        5  13  140   1  4.5
    11        8  10  140   3  4.7          26        8   1  160   7  2.1
    12        2   4  100   3  1.6          27        4   1  180   7  1.8
    13        4  10  180   3  4.5          28        6   1  160   1  1.5
    14        6   7  120   7  4.7          29        4   1  100   1  1.3
    15       10  13  180   3  4.8          30        7  10  100   7  4.6

Consider the full model consisting of $k = 14$ predictors: $x_1, x_2, x_3, x_4, x_5 = x_1^2, \ldots, x_8 = x_4^2, x_9 = x_1x_2, \ldots, x_{14} = x_3x_4$ (all first- and second-order predictors). Is the inclusion of the second-order predictors justified? That is, should the reduced model consisting of just the predictors $x_1, x_2, x_3$, and $x_4$ ($l = 4$) be used? Output resulting from fitting the two models follows:

Parameter      Estimate for Reduced Model    Estimate for Full Model
$\beta_0$           -.9122                        -8.807
$\beta_1$            .16073                         .1768
$\beta_2$            .21978                         .7580
$\beta_3$            .011226                        .10400
$\beta_4$            .10197                         .5052
$\beta_5$            —                             -.04393
$\beta_6$            —                             -.035887
$\beta_7$            —                             -.00003271
$\beta_8$            —                             -.01646
$\beta_9$            —                              .00588
$\beta_{10}$         —                              .002702
$\beta_{11}$         —                              .01178
$\beta_{12}$         —                             -.0006547
$\beta_{13}$         —                              .00242
$\beta_{14}$         —                              .002526
$R^2$                .692                           .921
SSE                17.4951                        4.4782


The hypotheses to be tested are

$H_0\colon \beta_5 = \beta_6 = \cdots = \beta_{14} = 0$

versus

$H_a\colon$ at least one among $\beta_5, \ldots, \beta_{14}$ is not 0

With $k = 14$ and $l = 4$, the F critical value for a test with $\alpha = .01$ is $F_{.01,10,15} = 3.80$. The test statistic value is

$$f = \frac{(17.4951 - 4.4782)/10}{4.4782/15} = \frac{1.3017}{.2985} = 4.36$$

Since $4.36 \ge 3.80$, $H_0$ is rejected. We conclude that the appropriate model should include at least one of the second-order predictors. ■

Assessing Model Adequacy

The standardized residuals in multiple regression result from dividing each residual by its estimated standard deviation; the formula for these standard deviations is substantially more complicated than in the case of simple linear regression. We recommend a normal probability plot of the standardized residuals as a basis for validating the normality assumption. Plots of the standardized residuals versus each predictor and versus $\hat y$ should show no discernible pattern. Adjusted residual plots can also be helpful in this endeavor. The book by Neter et al. is an extremely useful reference.
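As background for the remark that these standard deviations are more complicated than in simple linear regression, one common way to write the standardized residual is sketched below. It is a standard result from regression theory rather than something stated in this section, and the symbols $X$ (the $n \times (k+1)$ matrix whose $i$th row is $(1, x_{i1}, \ldots, x_{ik})$), $H$, and $h_{ii}$ are notation assumed only for this sketch.

```latex
H = X (X^{\mathsf T} X)^{-1} X^{\mathsf T}, \qquad
s^2 = \frac{\mathrm{SSE}}{n - (k + 1)}, \qquad
e_i^{*} = \frac{y_i - \hat{y}_i}{s \sqrt{1 - h_{ii}}}
```

Here $h_{ii}$ is the $i$th diagonal entry of $H$. Because it depends on the predictor values of every observation, it is normally obtained from statistical software rather than by hand.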

Example 13.17

Figure 13.16 shows a normal probability plot of the standardized residuals for the adsorption data and fitted model given in Example 13.15. The straightness of the plot casts little doubt on the assumption that the random deviation $\epsilon$ is normally distributed.

Figure 13.16 A normal probability plot of the standardized residuals for the data and model of Example 13.15 (axes: standardized residual versus z percentile)

Figure 13.17 shows the other suggested plots for the adsorption data. Given that there are only 13 observations in the data set, there is not much evidence of a pattern in any of the first three plots other than randomness. The point at the bottom of each of these three plots corresponds to the observation with the large residual. We will say more about such observations subsequently. For the moment, there is no compelling reason for remedial action.


Figure 13.17 Diagnostic plots for the adsorption data: (a) standardized residual versus $x_1$ (iron); (b) standardized residual versus $x_2$ (aluminum); (c) standardized residual versus $\hat y$ (predicted y); (d) $\hat y$ versus $y$ (adsorption) ■

EXERCISES Section 13.4 (36–54)

36. Cardiorespiratory fitness is widely recognized as a major component of overall physical well-being. Direct measurement of maximal oxygen uptake (VO2max) is the single best measure of such fitness, but direct measurement is time-consuming and expensive. It is therefore desirable to have a prediction equation for VO2max in terms of easily obtained quantities. Consider the variables

$y$ = VO2max (L/min)
$x_1$ = weight (kg)
$x_2$ = age (yr)
$x_3$ = time necessary to walk 1 mile (min)
$x_4$ = heart rate at the end of the walk (beats/min)

Here is one possible model, for male students, consistent with the information given in the article "Validation of the Rockport Fitness Walking Test in College Males and Females" (Research Quarterly for Exercise and Sport, 1994: 152–158):

$$Y = 5.0 + .01x_1 - .05x_2 - .13x_3 - .01x_4 + \epsilon \qquad \sigma = .4$$

a. Interpret $\beta_1$ and $\beta_3$.

b. What is the expected value of VO2max when weight is 76 kg, age is 20 yr, walk time is 12 min, and heart rate is 140 b/m?

c. What is the probability that VO2max will be between 1.00 and 2.60 for a single observation made when the values of the predictors are as stated in part (b)?

37. A trucking company considered a multiple regression model for relating the dependent variable $y$ = total daily travel time for one of its drivers (hours) to the predictors $x_1$ = distance traveled (miles) and $x_2$ = the number of deliveries made. Suppose that the model equation is

$$Y = -.800 + .060x_1 + .900x_2 + \epsilon$$

a. What is the mean value of travel time when distance traveled is 50 miles and three deliveries are made?

b. How would you interpret $\beta_1 = .060$, the coefficient of the predictor $x_1$? What is the interpretation of $\beta_2 = .900$?

c. If $\sigma = .5$ hour, what is the probability that travel time will be at most 6 hours when three deliveries are made and the distance traveled is 50 miles?


38. Let $y$ = wear life of a bearing, $x_1$ = oil viscosity, and $x_2$ = load. Suppose that the multiple regression model relating life to viscosity and load is

$$Y = 125.0 + 7.75x_1 + .0950x_2 - .0090x_1x_2 + \epsilon$$

a. What is the mean value of life when viscosity is 40 and load is 1100?

b. When viscosity is 30, what is the change in mean life associated with an increase of 1 in load? When viscosity is 40, what is the change in mean life associated with an increase of 1 in load?

39. Let $y$ = sales at a fast-food outlet (1000s of dollars), $x_1$ = number of competing outlets within a 1-mile radius, $x_2$ = population within a 1-mile radius (1000s of people), and $x_3$ be an indicator variable that equals 1 if the outlet has a drive-up window and 0 otherwise. Suppose that the true regression model is

$$Y = 10.00 - 1.2x_1 + 6.8x_2 + 15.3x_3 + \epsilon$$

a. What is the mean value of sales when the number of competing outlets is 2, there are 8000 people within a 1-mile radius, and the outlet has a drive-up window?

b. What is the mean value of sales for an outlet without a drive-up window that has three competing outlets and 5000 people within a 1-mile radius?

c. Interpret $\beta_3$.

40. The article "Readability of Liquid Crystal Displays: A Response Surface" (Human Factors, 1983: 185–190) used a multiple regression model with four independent variables to study accuracy in reading liquid crystal displays. The variables were

$y$ = error percentage for subjects reading a four-digit liquid crystal display
$x_1$ = level of backlight (ranging from 0 to 122 cd/m²)
$x_2$ = character subtense (ranging from .025° to 1.34°)
$x_3$ = viewing angle (ranging from 0° to 60°)
$x_4$ = level of ambient light (ranging from 20 to 1500 lux)

The model fit to data was $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3 + \beta_4x_4 + \epsilon$. The resulting estimated coefficients were $\hat\beta_0 = 1.52$, $\hat\beta_1 = .02$, $\hat\beta_2 = -1.40$, $\hat\beta_3 = .02$, and $\hat\beta_4 = -.0006$.

a. Calculate an estimate of expected error percentage when $x_1 = 10$, $x_2 = .5$, $x_3 = 50$, and $x_4 = 100$.

b. Estimate the mean error percentage associated with a backlight level of 20, character subtense of .5, viewing angle of 10, and ambient light level of 30.

c. What is the estimated expected change in error percentage when the level of ambient light is increased by 1 unit while all other variables are fixed at the values given in part (a)? Answer for a 100-unit increase in ambient light level.

d. Explain why the answers in part (c) do not depend on the fixed values of $x_1$, $x_2$, and $x_3$. Under what conditions would there be such a dependence?

e. The estimated model was based on $n = 30$ observations, with $SST = 39.2$ and $SSE = 20.0$. Calculate and interpret the coefficient of multiple determination, and then carry out the model utility test using $\alpha = .05$.

41. The ability of ecologists to identify regions of greatest species richness could have an impact on the preservation of genetic diversity, a major objective of the World Conservation Strategy. The article "Prediction of Rarities from Habitat Variables: Coastal Plain Plants on Nova Scotian Lakeshores" (Ecology, 1992: 1852–1859) used a sample of $n = 37$ lakes to obtain the estimated regression equation

$$y = 3.89 + .033x_1 + .024x_2 + .023x_3 - .0080x_4 - .13x_5 - .72x_6$$

where $y$ = species richness, $x_1$ = watershed area, $x_2$ = shore width, $x_3$ = poor drainage (%), $x_4$ = water color (total color units), $x_5$ = sand (%), and $x_6$ = alkalinity. The coefficient of multiple determination was reported as $R^2 = .83$. Carry out a test of model utility.

42. An investigation of a die-casting process resulted in the accompanying data on $x_1$ = furnace temperature, $x_2$ = die close time, and $y$ = temperature difference on the die surface ("A Multiple-Objective Decision-Making Approach for Assessing Simultaneous Improvement in Die Life and Casting Quality in a Die Casting Process," Quality Engineering, 1994: 371–383).

x1   1250   1300   1350   1250   1300
x2      6      7      6      7      6
y      80     95    101     85     92

x1   1250   1300   1350   1350
x2      8      8      7      8
y      87     96    106    108

Minitab output from fitting the multiple regression model with predictors $x_1$ and $x_2$ is given here.

The regression equation is
tempdiff = -200 + 0.210 furntemp + 3.00 clostime

Predictor       Coef      Stdev   t-ratio       p
Constant     -199.56      11.64    -17.14   0.000
furntemp    0.210000   0.008642     24.30   0.000
clostime      3.0000     0.4321      6.94   0.000

s = 1.058    R-sq = 99.1%    R-sq(adj) = 98.8%

Analysis of Variance

SOURCE        DF       SS       MS        F       p
Regression     2   715.50   357.75   319.31   0.000
Error          6     6.72     1.12
Total          8   722.22

a. Carry out the model utility test.

b. Calculate and interpret a 95% confidence interval for $\beta_2$, the population regression coefficient of $x_2$.


c. When $x_1 = 1300$ and $x_2 = 7$, the estimated standard deviation of $\hat Y$ is $s_{\hat Y} = .353$. Calculate a 95% confidence interval for true average temperature difference when furnace temperature is 1300 and die close time is 7.

d. Calculate a 95% prediction interval for the temperature difference resulting from a single experimental run with a furnace temperature of 1300 and a die close time of 7.

43. An experiment carried out to study the effect of the mole contents of cobalt ($x_1$) and the calcination temperature ($x_2$) on the surface area of an iron-cobalt hydroxide catalyst ($y$) resulted in the accompanying data ("Structural Changes and Surface Properties of $\mathrm{Co}_x\mathrm{Fe}_{3-x}\mathrm{O}_4$ Spinels," J. of Chemical Tech. and Biotech., 1994: 161–170). A request to the SAS package to fit $\beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_3$, where $x_3 = x_1x_2$ (an interaction predictor) yielded the output below.

x1     .6     .6     .6     .6     .6    1.0    1.0
x2    200    250    400    500    600    200    250
y    90.6   82.7   58.7   43.2   25.0  127.1  112.3

x1    1.0    1.0    1.0    2.6    2.6    2.6    2.6
x2    400    500    600    200    250    400    500
y    19.6   17.8    9.1   53.1   52.0   43.4   42.4

x1    2.6    2.8    2.8    2.8    2.8    2.8
x2    600    200    250    400    500    600
y    31.6   40.9   37.9   27.5   27.3   19.0

SAS output for Exercise 43

Dependent Variable: SURFAREA

Analysis of Variance

Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model       3     15223.52829    5074.50943    18.924    0.0001
Error      16      4290.53971     268.15873
C Total    19     19514.06800

Root MSE   16.37555    R-square   0.7801
Dep Mean   48.06000    Adj R-sq   0.7389
C.V.       34.07314

Parameter Estimates

                  Parameter      Standard      T for H0:
Variable   DF      Estimate         Error    Parameter = 0   Prob > |T|
INTERCEP    1    185.485740   21.19747682          8.750       0.0001
COBCON      1    -45.969466   10.61201173         -4.332       0.0005
TEMP        1     -0.301503    0.05074421         -5.942       0.0001
CONTEMP     1      0.088801    0.02540388          3.496       0.0030

a. Predict the value of surface area when cobalt content is 2.6 and temperature is 250, and calculate the value of the corresponding residual.

b. Since $\hat\beta_1 = -46.0$, is it legitimate to conclude that if cobalt content increases by 1 unit while the values of the other predictors remain fixed, surface area can be expected to decrease by roughly 46 units? Explain your reasoning.

c. Does there appear to be a useful linear relationship between y and the predictors?

d. Given that mole contents and calcination temperature remain in the model, does the interaction predictor $x_3$ provide useful information about y? State and test the appropriate hypotheses using a significance level of .01.

e. The estimated standard deviation of $\hat Y$ when mole contents is 2.0 and calcination temperature is 500 is $s_{\hat Y} = 4.69$. Calculate a 95% confidence interval for the mean value of surface area under these circumstances.

44. The accompanying Minitab regression output is based on data that appeared in the article "Application of Design of Experiments for Modeling Surface Roughness in Ultrasonic Vibration Turning" (J. of Engr. Manuf., 2009: 641–652). The response variable is surface roughness (mm), and the independent variables are vibration amplitude (mm), depth of cut (mm), feed rate (mm/rev), and cutting speed (m/min), respectively.

a. How many observations were there in the data set?

b. Interpret the coefficient of multiple determination.

c. Carry out a test of hypotheses to decide if the model specifies a useful relationship between the response variable and at least one of the predictors.

d. Interpret the number 18.2602 that appears in the Coef column.


e. At significance level .10, can any single one of the predictors be eliminated from the model provided that all of the other predictors are retained?

f. The estimated SD of $\hat Y$ when the values of the four predictors are 10, .5, .25, and 50, respectively, is .1178. Calculate both a CI for true average roughness and a PI for the roughness of a single specimen, and compare these two intervals.

The regression equation is
Ra = -0.972 - 0.0312 a + 0.557 d + 18.3 f + 0.00282 v

Predictor       Coef    SE Coef       T       P
Constant     -0.9723     0.3923   -2.48   0.015
a           -0.03117    0.01864   -1.67   0.099
d             0.5568     0.3185    1.75   0.084
f            18.2602     0.7536   24.23   0.000
v           0.002822   0.003977    0.71   0.480

S = 0.822059    R-Sq = 88.6%    R-Sq(adj) = 88.0%

Source            DF       SS       MS        F       P
Regression         4   401.02   100.25   148.35   0.000
Residual Error    76    51.36     0.68
Total             80   452.38

45. The article "Analysis of the Modeling Methodologies for Predicting the Strength of Air-Jet Spun Yarns" (Textile Res. J., 1997: 39–44) reported on a study carried out to relate yarn tenacity ($y$, in g/tex) to yarn count ($x_1$, in tex), percentage polyester ($x_2$), first nozzle pressure ($x_3$, in kg/cm²), and second nozzle pressure ($x_4$, in kg/cm²). The estimate of the constant term in the corresponding multiple regression equation was 6.121. The estimated coefficients for the four predictors were $-.082$, $.113$, $.256$, and $-.219$, respectively, and the coefficient of multiple determination was .946.

a. Assuming that the sample size was $n = 25$, state and test the appropriate hypotheses to decide whether the fitted model specifies a useful linear relationship between the dependent variable and at least one of the four model predictors.

b. Again using $n = 25$, calculate the value of adjusted $R^2$.

c. Calculate a 99% confidence interval for true mean yarn tenacity when yarn count is 16.5, yarn contains 50% polyester, first nozzle pressure is 3, and second nozzle pressure is 5 if the estimated standard deviation of predicted tenacity under these circumstances is .350.

46. A regression analysis carried out to relate $y$ = repair time for a water filtration system (hr) to $x_1$ = elapsed time since the previous service (months) and $x_2$ = type of repair (1 if electrical and 0 if mechanical) yielded the following model based on $n = 12$ observations: $y = .950 + .400x_1 + 1.250x_2$. In addition, $SST = 12.72$, $SSE = 2.09$, and $s_{\hat\beta_2} = .312$.

a. Does there appear to be a useful linear relationship between repair time and the two model predictors? Carry out a test of the appropriate hypotheses using a significance level of .05.

b. Given that elapsed time since the last service remains in the model, does type of repair provide useful information about repair time? State and test the appropriate hypotheses using a significance level of .01.

c. Calculate and interpret a 95% CI for $\beta_2$.

d. The estimated standard deviation of a prediction for repair time when elapsed time is 6 months and the repair is electrical is .192. Predict repair time under these circumstances by calculating a 99% prediction interval. Does the interval suggest that the estimated model will give an accurate prediction? Why or why not?

47. Efficient design of certain types of municipal waste incinerators requires that information about energy content of the waste be available. The authors of the article "Modeling the Energy Content of Municipal Solid Waste Using Multiple Regression Analysis" (J. of the Air and Waste Mgmnt. Assoc., 1996: 650–656) kindly provided us with the accompanying data on $y$ = energy content (kcal/kg), the three physical composition variables $x_1$ = % plastics by weight, $x_2$ = % paper by weight, and $x_3$ = % garbage by weight, and the proximate analysis variable $x_4$ = % moisture by weight for waste specimens obtained from a certain region.

                                            Energy
Obs   Plastics   Paper   Garbage   Water   Content
  1     18.69    15.65    45.01    58.21      947
  2     19.43    23.51    39.69    46.31     1407
  3     19.24    24.23    43.16    46.63     1452
  4     22.64    22.20    35.76    45.85     1553
  5     16.54    23.56    41.20    55.14      989
  6     21.44    23.65    35.56    54.24     1162
  7     19.53    24.45    40.18    47.20     1466
  8     23.97    19.39    44.11    43.82     1656
  9     21.45    23.84    35.41    51.01     1254
 10     20.34    26.50    34.21    49.06     1336
 11     17.03    23.46    32.45    53.23     1097
 12     21.03    26.99    38.19    51.78     1266
 13     20.49    19.87    41.35    46.69     1401
 14     20.45    23.03    43.59    53.57     1223
 15     18.81    22.62    42.20    52.98     1216
 16     18.28    21.87    41.50    47.44     1334
 17     21.41    20.47    41.20    54.68     1155
 18     25.11    22.59    37.02    48.74     1453
 19     21.04    26.27    38.66    53.22     1278
 20     17.99    28.22    44.18    53.37     1153
 21     18.73    29.39    34.77    51.06     1225
 22     18.49    26.58    37.55    50.66     1237
 23     22.08    24.88    37.07    50.72     1327
 24     14.28    26.27    35.80    48.24     1229
 25     17.74    23.61    37.36    49.92     1205
 26     20.54    26.58    35.40    53.58     1221
 27     18.25    13.77    51.32    51.38     1138
 28     19.09    25.62    39.54    50.13     1295
 29     21.25    20.63    40.72    48.67     1391
 30     21.62    22.71    36.22    48.19     1372

Using Minitab to fit a multiple regression model with the four aforementioned variables as predictors of energy content resulted in the following output:


The regression equation is
enercont = 2245 + 28.9 plastics + 7.64 paper + 4.30 garbage - 37.4 water

Predictor      Coef    StDev        T       P
Constant     2244.9    177.9    12.62   0.000
plastics     28.925    2.824    10.24   0.000
paper         7.644    2.314     3.30   0.003
garbage       4.297    1.916     2.24   0.034
water       -37.354    1.834   -20.36   0.000

s = 31.48    R-Sq = 96.4%    R-Sq(adj) = 95.8%

Analysis of Variance

Source        DF       SS       MS        F       P
Regression     4   664931   166233   167.71   0.000
Error         25    24779      991
Total         29   689710

a. Interpret the values of the estimated regression coefficients $\hat\beta_1$ and $\hat\beta_4$.

b. State and test the appropriate hypotheses to decide whether the model fit to the data specifies a useful linear relationship between energy content and at least one of the four predictors.

c. Given that % plastics, % paper, and % water remain in the model, does % garbage provide useful information about energy content? State and test the appropriate hypotheses using a significance level of .05.

d. Use the fact that $s_{\hat Y} = 7.46$ when $x_1 = 20$, $x_2 = 25$, $x_3 = 40$, and $x_4 = 45$ to calculate a 95% confidence interval for true average energy content under these circumstances. Does the resulting interval suggest that mean energy content has been precisely estimated?

e. Use the information given in part (d) to predict energy content for a waste sample having the specified characteristics, in a way that conveys information about precision and reliability.

48. An experiment to investigate the effects of a new technique for degumming of silk yarn was described in the article "Some Studies in Degumming of Silk with Organic Acids" (J. Society of Dyers and Colourists, 1992: 79–86). One response variable of interest was $y$ = weight loss (%). The experimenters made observations on weight loss for various values of three independent variables: $x_1$ = temperature (°C) = 90, 100, 110; $x_2$ = time of treatment (min) = 30, 75, 120; $x_3$ = tartaric acid concentration (g/L) = 0, 8, 16. In the regression analyses, the three values of each variable were coded as $-1$, 0, and 1, respectively, giving the accompanying data (the value $y_8 = 19.3$ was reported, but our value $y_8 = 20.3$ results in regression output identical to that appearing in the article).

Obs      1      2      3      4      5      6      7      8
x1      -1     -1      1      1     -1     -1      1      1
x2      -1      1     -1      1      0      0      0      0
x3       0      0      0      0     -1      1     -1      1
y     18.3   22.2   23.0   23.0    3.3   19.3   19.3   20.3

Obs      9     10     11     12     13     14     15
x1       0      0      0      0      0      0      0
x2      -1     -1      1      1      0      0      0
x3      -1      1     -1      1      0      0      0
y     13.1   23.0   20.9   21.5   22.0   21.3   22.6

A multiple regression model with $k = 9$ predictors ($x_1$, $x_2$, $x_3$, $x_4 = x_1^2$, $x_5 = x_2^2$, $x_6 = x_3^2$, $x_7 = x_1x_2$, $x_8 = x_1x_3$, and $x_9 = x_2x_3$) was fit to the data, resulting in $\hat\beta_0 = 21.967$, $\hat\beta_1 = 2.8125$, $\hat\beta_2 = 1.2750$, $\hat\beta_3 = 3.4375$, $\hat\beta_4 = -2.208$, $\hat\beta_5 = 1.867$, $\hat\beta_6 = -4.208$, $\hat\beta_7 = -.975$, $\hat\beta_8 = -3.750$, $\hat\beta_9 = -2.325$, $SSE = 23.379$, and $R^2 = .938$.

a. Does this model specify a useful relationship? State and test the appropriate hypotheses using a significance level of .01.

b. The estimated standard deviation of $\hat\mu_Y$ when $x_1 = \cdots = x_9 = 0$ (i.e., when temperature = 100, time = 75, and concentration = 8) is 1.248. Calculate a 95% CI for expected weight loss when temperature, time, and concentration have the specified values.

c. Calculate a 95% PI for a single weight-loss value to be observed when temperature, time, and concentration have values 100, 75, and 8, respectively.

d. Fitting the model with only $x_1$, $x_2$, and $x_3$ as predictors gave $R^2 = .456$ and $SSE = 203.82$. Does at least one of the second-order predictors provide additional useful information? State and test the appropriate hypotheses.

49. The article "The Influence of Temperature and Sunshine on the Alpha-Acid Contents of Hops" (Agric. Meteor. 1974: 375–382) reports the following data on yield ($y$), mean temperature over the period between date of coming into hops and date of picking ($x_1$), and mean percentage of sunshine during the same period ($x_2$) for the Fuggle variety of hop:

x1   16.7   17.4   18.4   16.8   18.9   17.1
x2     30     42     47     47     43     41
y     210    110    103    103     91     76

x1   17.3   18.2   21.3   21.2   20.7   18.5
x2     48     44     43     50     56     60
y      73     70     68     53     45     31

Here is partial Minitab output from fitting the first-order model $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon$ used in the article:

Predictor      Coef   Stdev   t-ratio       P
Constant     415.11   82.52      5.03   0.000
Temp         -6.593   4.859     -1.36   0.208
Sunshine     -4.504   1.071     -4.20   0.002

s = 24.45    R-sq = 76.8%    R-sq(adj) = 71.6%

a. What is $\hat\mu_{Y \cdot 18.9,43}$, and what is the corresponding residual?

b. Test $H_0\colon \beta_1 = \beta_2 = 0$ versus $H_a\colon$ either $\beta_1$ or $\beta_2 \ne 0$ at level .05.


c. The estimated standard deviation of $\hat\beta_0 + \hat\beta_1x_1 + \hat\beta_2x_2$ when $x_1 = 18.9$ and $x_2 = 43$ is 8.20. Use this to obtain a 95% CI for $\mu_{Y \cdot 18.9,43}$.

d. Use the information in part (c) to obtain a 95% PI for yield in a future experiment when $x_1 = 18.9$ and $x_2 = 43$.

e. Minitab reported that a 95% PI for yield when $x_1 = 18$ and $x_2 = 45$ is (35.94, 151.63). What is a 90% PI in this situation?

f. Given that $x_2$ is in the model, would you retain $x_1$?

g. When the model $Y = \beta_0 + \beta_2x_2 + \epsilon$ is fit, the resulting value of $R^2$ is .721. Verify that the F statistic for testing $H_0\colon Y = \beta_0 + \beta_2x_2 + \epsilon$ versus the alternative hypothesis $H_a\colon Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \epsilon$ satisfies $t^2 = f$, where t is the value of the t statistic from part (f).

50. a. When the model $Y = \beta_0 + \beta_1x_1 + \beta_2x_2 + \beta_3x_1^2 + \beta_4x_2^2 + \beta_5x_1x_2 + \epsilon$ is fit to the hops data of Exercise 49, the estimate of $\beta_5$ is $\hat\beta_5 = .557$ with estimated standard deviation $s_{\hat\beta_5} = .94$. Test $H_0\colon \beta_5 = 0$ versus $H_a\colon \beta_5 \ne 0$.

b. Each t ratio $\hat\beta_i / s_{\hat\beta_i}$ ($i = 1, 2, 3, 4, 5$) for the model of part (a) is less than 2 in absolute value, yet $R^2 = .861$ for this model. Would it be correct to drop each term from the model because of its small t-ratio? Explain.

c. Using $R^2 = .861$ for the model of part (a), test $H_0\colon \beta_3 = \beta_4 = \beta_5 = 0$ (which says that all second-order terms can be deleted).

51. The article "The Undrained Strength of Some Thawed Permafrost Soils" (Canadian Geotechnical J., 1979: 420–427) contains the following data on undrained shear strength of sandy soil ($y$, in kPa), depth ($x_1$, in m), and water content ($x_2$, in %).

Obs      y     x1     x2   $\hat y$   $y - \hat y$     e*
  1   14.7    8.9   31.5   23.35    -8.65   -1.50
  2   48.0   36.6   27.0   46.38     1.62     .54
  3   25.6   36.8   25.9   27.13    -1.53    -.53
  4   10.0    6.1   39.1   10.99     -.99    -.17
  5   16.0    6.9   39.2   14.10     1.90     .33
  6   16.8    6.9   38.3   16.54      .26     .04
  7   20.7    7.3   33.9   23.34    -2.64    -.42
  8   38.8    8.4   33.8   25.43    13.37    2.17
  9   16.9    6.5   27.9   15.63     1.27     .23
 10   27.0    8.0   33.1   24.29     2.71     .44
 11   16.0    4.5   26.3   15.36      .64     .20
 12   24.9    9.9   37.8   29.61    -4.71    -.91
 13    7.3    2.9   34.6   15.38    -8.08   -1.53
 14   12.8    2.0   36.4    7.96     4.84    1.02

The predicted values and residuals were computed by fitting a full quadratic model, which resulted in the estimated regression function

$$y = -151.36 - 16.22x_1 + 13.48x_2 + .094x_1^2 - .253x_2^2 + .492x_1x_2$$


a. Do plots of e* versus $x_1$, e* versus $x_2$, and e* versus $\hat y$ suggest that the full quadratic model should be modified? Explain your answer.

b. The value of $R^2$ for the full quadratic model is .759. Test at level .05 the null hypothesis stating that there is no linear relationship between the dependent variable and any of the five predictors.

c. It can be shown that $V(Y) = \sigma^2 = V(\hat Y) + V(Y - \hat Y)$. The estimate of $\sigma$ is $\hat\sigma = s = 6.99$ (from the full quadratic model). First obtain the estimated standard deviation of $Y - \hat Y$, and then estimate the standard deviation of $\hat Y$ (i.e., $\hat\beta_0 + \hat\beta_1x_1 + \hat\beta_2x_2 + \hat\beta_3x_1^2 + \hat\beta_4x_2^2 + \hat\beta_5x_1x_2$) when $x_1 = 8.0$ and $x_2 = 33.1$. Finally, compute a 95% CI for mean strength. [Hint: What is $(y - \hat y)/e^*$?]

d. Fitting the first-order model with regression function $\mu_{Y \cdot x_1, x_2} = \beta_0 + \beta_1x_1 + \beta_2x_2$ results in $SSE = 894.95$. Test at level .05 the null hypothesis that states that all quadratic terms can be deleted from the model.

52. Utilization of sucrose as a carbon source for the production of chemicals is uneconomical. Beet molasses is a readily available and low-priced substitute. The article "Optimization of the Production of β-Carotene from Molasses by Blakeslea Trispora" (J. of Chem. Tech. and Biotech. 2002: 933–943) carried out a multiple regression analysis to relate the dependent variable $y$ = amount of β-carotene (g/dm³) to the three predictors amount of linoleic acid, amount of kerosene, and amount of antioxidant (all g/dm³).

Obs   Linoleic   Kerosene   Antiox   Betacaro
  1     30.00      30.00     10.00     0.7000
  2     30.00      30.00     10.00     0.6300
  3     30.00      30.00     18.41     0.0130
  4     40.00      40.00      5.00     0.0490
  5     30.00      30.00     10.00     0.7000
  6     13.18      30.00     10.00     0.1000
  7     20.00      40.00      5.00     0.0400
  8     20.00      40.00     15.00     0.0065
  9     40.00      20.00      5.00     0.2020
 10     30.00      30.00     10.00     0.6300
 11     30.00      30.00      1.59     0.0400
 12     40.00      20.00     15.00     0.1320
 13     40.00      40.00     15.00     0.1500
 14     30.00      30.00     10.00     0.7000
 15     30.00      46.82     10.00     0.3460
 16     30.00      30.00     10.00     0.6300
 17     30.00      13.18     10.00     0.3970
 18     20.00      20.00      5.00     0.2690
 19     20.00      20.00     15.00     0.0054
 20     46.82      30.00     10.00     0.0640

a. Fitting the complete second-order model in the three predictors resulted in $R^2 = .987$ and adjusted $R^2 = .974$, whereas fitting the first-order model gave $R^2 = .016$. What would you conclude about the two models?


b. For $x_1 = x_2 = 30$, $x_3 = 10$, a statistical software package reported that $\hat{y} = .66573$, $s_{\hat{Y}} = .01785$, based on the complete second-order model. Predict the amount of β-carotene that would result from a single experimental run with the designated values of the independent variables, and do so in a way that conveys information about precision and reliability.

53. Snowpacks contain a wide spectrum of pollutants that may represent environmental hazards. The article “Atmospheric PAH Deposition: Deposition Velocities and Washout Ratios” (J. of Environmental Engineering, 2002: 186–195) focused on the deposition of polyaromatic hydrocarbons. The authors proposed a multiple regression model for relating deposition over a specified time period ($y$, in mg/m²) to two rather complicated predictors $x_1$ (mg-sec/m³) and $x_2$ (mg/m²), defined in terms of PAH air concentrations for various species, total time, and total amount of precipitation. Here is data on the species fluoranthene and corresponding Minitab output:

obs      x1         x2     flth
  1   92017   .0026900   278.78
  2   51830   .0030000   124.53
  3   17236   .0000196    22.65
  4   15776   .0000360    28.68
  5   33462   .0004960    32.66
  6  243500   .0038900   604.70
  7   67793   .0011200    27.69
  8   23471   .0006400    14.18
  9   13948   .0004850    20.64
 10    8824   .0003660    20.60
 11    7699   .0002290    16.61
 12   15791   .0014100    15.08
 13   10239   .0004100    18.05
 14   43835   .0000960    99.71
 15   49793   .0000896    58.97
 16   40656   .0026000   172.58
 17   50774   .0009530    44.25

The regression equation is

flth = -33.5 + 0.00205 x1 + 29836 x2

Predictor       Coef    SE Coef      T      P
Constant      -33.46      14.90  -2.25  0.041
x1         0.0020548  0.0002945   6.98  0.000
x2             29836      13654   2.19  0.046

S = 44.28   R-Sq = 92.3%   R-Sq(adj) = 91.2%

Analysis of Variance

Source          DF      SS      MS      F      P
Regression       2  330989  165495  84.39  0.000
Residual Error  14   27454    1961
Total           16  358443

Formulate questions and perform appropriate analyses to draw conclusions.


54. The use of high-strength steels (HSS) rather than aluminum and magnesium alloys in automotive body structures reduces vehicle weight. However, HSS use is still problematic because of difficulties with limited formability, increased springback, difficulties in joining, and reduced die life. The article “Experimental Investigation of Springback Variation in Forming of High Strength Steels” (J. of Manuf. Sci. and Engr., 2008: 1–9) included data on $y$ = springback from the wall opening angle and $x_1$ = blank holder pressure (BHP). Three different material suppliers and three different lubrication regimens (no lubrication, lubricant #1, and lubricant #2) were also utilized.

a. What predictors would you use in a model to incorporate supplier and lubrication information in addition to BHP?

b. The accompanying Minitab output resulted from fitting the model of (a) (the article’s authors also used Minitab; amusingly, they employed a significance level of .06 in various tests of hypotheses). Does there appear to be a useful relationship between the response variable and at least one of the predictors? Carry out a formal test of hypotheses.

c. When BHP is 1000, material is from supplier 1, and no lubrication is used, $s_{\hat{Y}} = .524$. Calculate a 95% PI for the springback that would result from making an additional observation under these conditions.

d. From the output, it appears that lubrication regimen may not be providing useful information. A regression with the corresponding predictors removed resulted in $\text{SSE} = 48.426$. What is the coefficient of multiple determination for this model, and what would you conclude about the importance of the lubrication regimen?

e. A model with predictors for BHP, supplier, and lubrication regimen, as well as predictors for interactions between BHP and both supplier and lubrication regimen, resulted in $\text{SSE} = 28.216$ and $R^2 = .849$. Does this model appear to improve on the model with just BHP and predictors for supplier?

Predictor        Coef    SE Coef      T      P
Constant      21.5322     0.6782  31.75  0.000
BHP        -0.0033680  0.0003919  -8.59  0.000
Suppl_1       -1.7181     0.5977  -2.87  0.007
Suppl_2       -1.4840     0.6010  -2.47  0.019
Lub_1         -0.3036     0.5754  -0.53  0.602
Lub_2          0.8931     0.5779   1.55  0.133

S = 1.18413   R-Sq = 77.5%   R-Sq(adj) = 73.8%

Source          DF       SS      MS      F      P
Regression       5  144.915  28.983  20.67  0.000
Residual Error  30   42.065   1.402
Total           35  186.980

13.5 Other Issues in Multiple Regression

In this section, we touch upon a number of issues that may arise when a multiple regression analysis is carried out. Consult the chapter references for a more extensive treatment of any particular topic.


Transformations

Sometimes, theoretical considerations suggest a nonlinear relation between a dependent variable and two or more independent variables, whereas on other occasions diagnostic plots indicate that some type of nonlinear function should be used. Frequently a transformation will linearize the model.

Example 13.18 An article in Lubrication Engr. (“Accelerated Testing of Solid Film Lubricants,” 1972: 365–372) reports on an investigation of wear life for solid film lubricant. Three sets of journal bearing tests were run on a Mil-L-8937-type film at each combination of three loads (3000, 6000, and 10,000 psi) and three speeds (20, 60, and 100 rpm), and the wear life (hours) was recorded for each run, as shown in Table 13.7.

Table 13.7 Wear-Life Data for Example 13.18

  s   l (1000s)      w        s   l (1000s)      w
 20       3      300.2       60       6       65.9
 20       3      310.8       60      10       10.7
 20       3      333.0       60      10       34.1
 20       6       99.6       60      10       39.1
 20       6      136.2      100       3       26.5
 20       6      142.4      100       3       22.3
 20      10       20.2      100       3       34.8
 20      10       28.2      100       6       32.8
 20      10      102.7      100       6       25.6
 60       3       67.3      100       6       32.7
 60       3       77.9      100      10        2.3
 60       3       93.9      100      10        4.4
 60       6       43.0      100      10        5.8
 60       6       44.5

The article contains the comment that a lognormal distribution is appropriate for W, since ln(W) is known to follow a normal law (recall from Chapter 4 that this is what defines a lognormal distribution). The model that appears is $W = (c / s^a l^b) \cdot \epsilon$, from which $\ln(W) = \ln(c) - a\ln(s) - b\ln(l) + \ln(\epsilon)$; so with $Y = \ln(W)$, $x_1 = \ln(s)$, $x_2 = \ln(l)$, $\beta_0 = \ln(c)$, $\beta_1 = -a$, and $\beta_2 = -b$, we have a multiple linear regression model. After computing $\ln(w_i)$, $\ln(s_i)$, and $\ln(l_i)$ for the data, a first-order model in the transformed variables yielded the results shown in Table 13.8.

Table 13.8 Estimated Coefficients and t Ratios for Example 13.18

Parameter   Estimate   Estimated SD   t ratio
beta_i      beta^_i    s_beta^_i      t = beta^_i / s_beta^_i
beta_0      10.8719    .7871          13.81
beta_1      -1.2054    .1710          -7.05
beta_2      -1.3979    .2327          -6.01

The coefficient of multiple determination (for the transformed fit) has value $R^2 = .781$. The estimated regression function for the transformed variables is

$$\ln(w) = 10.87 - 1.21\ln(s) - 1.40\ln(l)$$
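The transformed fit of Example 13.18 can be reproduced with a short sketch. It takes logarithms of the Table 13.7 values, solves the normal equations for a first-order model in ln(s) and ln(l), and back-transforms a prediction. The helper names (`solve`, `fit_ols`, `predict_wear_life`) are illustrative assumptions, and the plain Gauss-Jordan solver stands in for whatever software the article's authors used.

```python
import math

# Wear-life data from Table 13.7: (speed s in rpm, load l in 1000s of psi, wear life w in hours)
DATA = [
    (20, 3, 300.2), (20, 3, 310.8), (20, 3, 333.0),
    (20, 6, 99.6), (20, 6, 136.2), (20, 6, 142.4),
    (20, 10, 20.2), (20, 10, 28.2), (20, 10, 102.7),
    (60, 3, 67.3), (60, 3, 77.9), (60, 3, 93.9),
    (60, 6, 43.0), (60, 6, 44.5), (60, 6, 65.9),
    (60, 10, 10.7), (60, 10, 34.1), (60, 10, 39.1),
    (100, 3, 26.5), (100, 3, 22.3), (100, 3, 34.8),
    (100, 6, 32.8), (100, 6, 25.6), (100, 6, 32.7),
    (100, 10, 2.3), (100, 10, 4.4), (100, 10, 5.8),
]


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least squares estimates from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Design matrix in the transformed variables: intercept, ln(s), ln(l)
X = [[1.0, math.log(s), math.log(l)] for s, l, _ in DATA]
Y = [math.log(w) for _, _, w in DATA]

b0, b1, b2 = fit_ols(X, Y)
fitted = [b0 + b1 * x[1] + b2 * x[2] for x in X]
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))
ybar = sum(Y) / len(Y)
sst = sum((y - ybar) ** 2 for y in Y)
r_squared = 1 - sse / sst


def predict_wear_life(s, l):
    """Back-transform the fit: w = exp(b0) * s**b1 * l**b2, with l in 1000s of psi."""
    return math.exp(b0) * s ** b1 * l ** b2


print(f"b0={b0:.4f}  b1={b1:.4f}  b2={b2:.4f}  R^2={r_squared:.3f}")
```

The printed estimates should agree with Table 13.8 up to rounding. Note that the load enters in thousands of psi, as in Table 13.7; rescaling the load would change the intercept but not the two slopes.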


Exponentiating both sides of the transformed fit, the original regression function is estimated as

$$w = e^{10.87} \cdot s^{-1.21} \cdot l^{-1.40}$$

The Bonferroni approach can be used to obtain simultaneous CIs for $\beta_1$ and $\beta_2$, and because $\beta_1 = -a$ and $\beta_2 = -b$, intervals for a and b are then immediately available. ■

The logistic regression model was introduced in Section 13.2 to relate a dichotomous variable y to a single predictor. This model can be extended in an obvious way to incorporate more than one predictor. The probability of success p is now a function of the predictors $x_1, x_2, \ldots, x_k$:

$$p(x_1, \ldots, x_k) = \frac{e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k}}$$

Statistical software must be used to estimate parameters, calculate relevant standard deviations, and provide other inferential information.

Example 13.19 Data was obtained from 189 women who gave birth during a particular period at the Bayside Medical Center in Springfield, MA, in order to identify factors associated with low birth weight. The accompanying Minitab output resulted from a logistic regression in which the dependent variable indicated whether (1) or not (0) a child had low birth weight (< 2500 g), and predictors were weight of the mother at her last menstrual period, age of the mother, and an indicator variable for whether (1) or not (0) the mother had smoked during pregnancy.

Logistic Regression Table

                                                Odds      95% CI
Predictor      Coef  SE Coef      Z      P     Ratio  Lower  Upper
Constant    2.06239  1.09516   1.88  0.060
Wt         -0.01701  0.00686  -2.48  0.013      0.98   0.97   1.00
Age        -0.04478  0.03391  -1.32  0.187      0.96   0.89   1.02
Smoke       0.65480  0.33297   1.97  0.049      1.92   1.00   3.70

It appears that age is not an important predictor of LBW, provided that the two other predictors are retained. The other two predictors do appear to be informative. The point estimate of the odds ratio associated with smoking status is 1.92 [ratio of the odds of LBW for a smoker to the odds for a nonsmoker, where $\text{odds} = P(Y = 1)/P(Y = 0)$]; at the 95% confidence level, the odds of a low-birth-weight child could be as much as 3.7 times as high for a smoker as for a nonsmoker. ■

Please see one of the chapter references for more information on logistic regression, including methods for assessing model effectiveness and adequacy.

Standardizing Variables

In Section 13.3, we considered transforming x to $x' = x - \bar{x}$ before fitting a polynomial. For multiple regression, especially when values of variables are large in magnitude, it is advantageous to carry this coding one step further. Let $\bar{x}_i$ and $s_i$ be the sample average and sample standard deviation of the $x_{ij}$'s $(j = 1, \ldots, n)$. Now code each variable $x_i$ by $x_i' = (x_i - \bar{x}_i)/s_i$. The coded variable $x_i'$ simply reexpresses any $x_i$ value in units of standard deviation above or below the mean. Thus if $\bar{x}_i = 100$ and $s_i = 20$, $x_i = 130$ becomes $x_i' = 1.5$, because 130 is 1.5 standard deviations above the mean of the values of $x_i$. For example, the coded full second-order model with two independent variables has regression function


$$E(Y) = \beta_0 + \beta_1\left(\frac{x_1 - \bar{x}_1}{s_1}\right) + \beta_2\left(\frac{x_2 - \bar{x}_2}{s_2}\right) + \beta_3\left(\frac{x_1 - \bar{x}_1}{s_1}\right)^2 + \beta_4\left(\frac{x_2 - \bar{x}_2}{s_2}\right)^2 + \beta_5\left(\frac{x_1 - \bar{x}_1}{s_1}\right)\left(\frac{x_2 - \bar{x}_2}{s_2}\right)$$

$$\phantom{E(Y)} = \beta_0 + \beta_1 x_1' + \beta_2 x_2' + \beta_3 x_3' + \beta_4 x_4' + \beta_5 x_5'$$

The benefits of coding are (1) increased numerical accuracy in all computations and (2) more accurate estimation than for the parameters of the uncoded model, because the individual parameters of the coded model characterize the behavior of the regression function near the center of the data rather than near the origin.
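As a rough illustration of the coding step, the sketch below standardizes a value and then evaluates a coded full second-order model. The means, standard deviations, and estimated coefficients are the ones reported for the tar-content data in Example 13.20; the function names `code` and `predict_tar` are hypothetical.

```python
# Summary statistics and fitted coefficients as reported in Example 13.20
X1_MEAN, X1_SD = 2991.13, 387.81   # rotor speed (rpm)
X2_MEAN, X2_SD = 58.468, 6.944     # gas inlet temperature (deg F)
BETA_HAT = [40.2660, -13.4041, 10.2553, 2.3313, -2.3405, 2.5978]


def code(x, mean, sd):
    """Re-express x in standard deviations above (+) or below (-) the mean."""
    return (x - mean) / sd


def predict_tar(x1, x2):
    """Evaluate the coded full second-order model at uncoded (x1, x2)."""
    c1 = code(x1, X1_MEAN, X1_SD)
    c2 = code(x2, X2_MEAN, X2_SD)
    # Terms in the order 1, x1', x2', x3' = x1'^2, x4' = x2'^2, x5' = x1' x2'
    terms = [1.0, c1, c2, c1 ** 2, c2 ** 2, c1 * c2]
    return sum(b * t for b, t in zip(BETA_HAT, terms))


print(code(130, 100, 20))                      # the x_i = 130 illustration from the text
print(round(code(3200, X1_MEAN, X1_SD), 3),    # coded rotor speed
      round(code(57.0, X2_MEAN, X2_SD), 3))    # coded inlet temperature
print(round(predict_tar(3200, 57.0), 2))       # predicted tar content
```

Because the sketch uses the unrounded coefficients, its prediction can differ in the second decimal place from a hand calculation based on coefficients rounded to two decimals.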


Example 13.20 The article “The Value and the Limitations of High-Speed Turbo-Exhausters for the Removal of Tar-Fog from Carburetted Water-Gas” (Soc. Chemical Industry J., 1946: 166–168) presents the data (in Table 13.9) on $y$ = tar content (grains/100 ft³) of a gas stream as a function of $x_1$ = rotor speed (rpm) and $x_2$ = gas inlet temperature (°F). The data is also considered in the article “Some Aspects of Nonorthogonal Data Analysis” (J. of Quality Tech. 1973: 67–79), which suggests using the coded model described previously.

Table 13.9 Data for Example 13.20

Run     y     x1     x2       x1'       x2'
  1  60.0   2400   54.5  -1.52428   -.57145
  2  61.0   2450   56.0  -1.39535   -.35543
  3  65.0   2450   58.5  -1.39535    .00461
  4  30.5   2500   43.0  -1.26642  -2.22763
  5  63.5   2500   58.0  -1.26642   -.06740
  6  65.0   2500   59.0  -1.26642    .07662
  7  44.0   2700   52.5   -.75070   -.85948
  8  52.0   2700   65.5   -.75070   1.01272
  9  54.5   2700   68.0   -.75070   1.37276
 10  30.0   2750   45.0   -.62177  -1.93960
 11  26.0   2775   45.5   -.55731  -1.86759
 12  23.0   2800   48.0   -.49284  -1.50755
 13  54.0   2800   63.0   -.49284    .65268
 14  36.0   2900   58.5   -.23499    .00461
 15  53.5   2900   64.5   -.23499    .86870
 16  57.0   3000   66.0    .02287   1.08472
 17  33.5   3075   57.0    .21627   -.21141
 18  34.0   3100   57.5    .28073   -.13941
 19  44.0   3150   64.0    .40966    .79669
 20  33.0   3200   57.0    .53859   -.21141
 21  39.0   3200   64.0    .53859    .79669
 22  53.0   3200   69.0    .53859   1.51677
 23  38.5   3225   68.0    .60305   1.37276
 24  39.5   3250   62.0    .66752    .50866
 25  36.0   3250   64.5    .66752    .86870
 26   8.5   3250   48.0    .66752  -1.50755
 27  30.0   3500   60.0   1.31216    .22063
 28  29.0   3500   59.0   1.31216    .07662
 29  26.5   3500   58.0   1.31216   -.06740
 30  24.5   3600   58.0   1.57002   -.06740
 31  26.5   3900   61.0   2.34360    .36465


The means and standard deviations are $\bar{x}_1 = 2991.13$, $s_1 = 387.81$, $\bar{x}_2 = 58.468$, and $s_2 = 6.944$, so $x_1' = (x_1 - 2991.13)/387.81$ and $x_2' = (x_2 - 58.468)/6.944$. With $x_3' = (x_1')^2$, $x_4' = (x_2')^2$, $x_5' = x_1' \cdot x_2'$, fitting the full second-order model yielded $\hat{\beta}_0 = 40.2660$, $\hat{\beta}_1 = -13.4041$, $\hat{\beta}_2 = 10.2553$, $\hat{\beta}_3 = 2.3313$, $\hat{\beta}_4 = -2.3405$, and $\hat{\beta}_5 = 2.5978$. The estimated regression equation is then

$$\hat{y} = 40.27 - 13.40x_1' + 10.26x_2' + 2.33x_3' - 2.34x_4' + 2.60x_5'$$

Thus if $x_1 = 3200$ and $x_2 = 57.0$, $x_1' = .539$, $x_2' = -.211$, $x_3' = (.539)^2 = .2901$, $x_4' = (-.211)^2 = .0447$, and $x_5' = (.539)(-.211) = -.1139$, so

$$\hat{y} = 40.27 - (13.40)(.539) + (10.26)(-.211) + (2.33)(.2901) - (2.34)(.0447) + (2.60)(-.1139) = 31.16$$

Variable Selection

Suppose an experimenter has obtained data on a response variable y as well as on p candidate predictors $x_1, \ldots, x_p$. How can a best (in some sense) model involving a subset of these predictors be selected? Recall that as predictors are added one by one into a model, SSE cannot increase (a larger model cannot explain less variation than a smaller one) and will usually decrease, albeit perhaps by a small amount. So there is no mystery as to which model gives the largest $R^2$ value: it must be the one containing all p predictors. What we’d really like is a model involving relatively few predictors that is easy to interpret and use yet explains a relatively large amount of observed y variation.

For any fixed number of predictors (e.g., 5), it is reasonable to identify the best model of that size as the one with the largest $R^2$ value or, equivalently, the smallest value of SSE. The more difficult issue concerns selection of a criterion that will allow for comparison of models of different sizes. Let’s use a subscript k to denote a quantity computed from a model containing k predictors (e.g., $\text{SSE}_k$). Three different criteria, each one a simple function of $\text{SSE}_k$, are widely used.

1. $R_k^2$, the coefficient of multiple determination for a k-predictor model. Because $R_k^2$ will virtually always increase as k does (and can never decrease), we are not interested in the k that maximizes $R_k^2$. Instead, we wish to identify a small k for which $R_k^2$ is nearly as large as $R^2$ for all predictors in the model.

2. $\text{MSE}_k = \text{SSE}_k/(n - k - 1)$, the mean squared error for a k-predictor model. This is often used in place of $R_k^2$, because although $R_k^2$ never decreases with increasing k, a small decrease in $\text{SSE}_k$ obtained with one extra predictor can be more than offset by a decrease of 1 in the denominator of $\text{MSE}_k$. The objective is then to find the model having minimum $\text{MSE}_k$. Since adjusted $R_k^2 = 1 - \text{MSE}_k/\text{MST}$, where $\text{MST} = \text{SST}/(n - 1)$ is constant in k, examination of adjusted $R_k^2$ is equivalent to consideration of $\text{MSE}_k$.

3. The rationale for the third criterion, $C_k$, is more difficult to understand, but the criterion is widely used by data analysts. Suppose the true regression model is specified by m predictors, that is,

$$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m + \epsilon \qquad V(\epsilon) = \sigma^2$$

so that

$$E(Y) = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m$$


Consider fitting a model by using a subset of k of these m predictors; for simplicity, suppose we use $x_1, x_2, \ldots, x_k$. Then by solving the system of normal equations, estimates $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ are obtained (but not, of course, estimates of any $\beta$'s corresponding to predictors not in the fitted model). The true expected value E(Y) can then be estimated by $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k$. Now consider the normalized expected total error of estimation

$$\Gamma_k = \frac{E\left(\sum_{i=1}^{n} [\hat{Y}_i - E(Y_i)]^2\right)}{\sigma^2} = \frac{E(\text{SSE}_k)}{\sigma^2} + 2(k + 1) - n \qquad (13.21)$$

The second equality in (13.21) must be taken on faith because it requires a tricky expected-value argument. A particular subset is then appealing if its $\Gamma_k$ value is small. Unfortunately, though, $E(\text{SSE}_k)$ and $\sigma^2$ are not known. To remedy this, let $s^2$ denote the estimate of $\sigma^2$ based on the model that includes all predictors for which data is available, and define

$$C_k = \frac{\text{SSE}_k}{s^2} + 2(k + 1) - n$$

A desirable model is then specified by a subset of predictors for which $C_k$ is small.

The total number of models that can be created from predictors in the candidate pool is $2^p$ (because each predictor can be included in or left out of any particular model; one of these is the model that contains no predictors). If $p \le 5$, then it would not be too tedious to examine all possible regression models involving these predictors using any good statistical software package. But the computational effort required to fit all possible models becomes prohibitive as the size of the candidate pool increases. Several software packages have incorporated algorithms which will sift through models of various sizes in order to identify the best one or more models of each particular size. Minitab, for example, will do this for $p \le 31$ and allows the user to specify the number of models of each size (1, 2, 3, 4, or 5) that will be identified as having best criterion values. You might wonder why we’d want to go beyond the best single model of each size. The answer is that the 2nd or 3rd best model may be easier to interpret and use than would be the best model, or may be more satisfactory from a model-adequacy perspective. For example, suppose the candidate pool includes all predictors from a full quadratic model based on five independent variables. Then the best 3-predictor model might have predictors $x_2$, $x_4^2$, and $x_3 x_5$, whereas the second-best such model could be the one with predictors $x_2$, $x_3$, and $x_2 x_3$.


Example 13.21 The review article by Ron Hocking listed in the chapter bibliography reports on an analysis of data taken from the 1974 issues of Motor Trend magazine. The dependent variable y was gas mileage, there were $n = 32$ observations, and the predictors for which data was obtained were $x_1$ = engine shape (1 = straight and 0 = V), $x_2$ = number of cylinders, $x_3$ = transmission type (1 = manual and 0 = auto), $x_4$ = number of transmission speeds, $x_5$ = engine size, $x_6$ = horsepower, $x_7$ = number of carburetor barrels, $x_8$ = final drive ratio, $x_9$ = weight, and $x_{10}$ = quarter-mile time. In Table 13.10, we present summary information from the analysis. The table describes for each k the subset having minimum $\text{SSE}_k$; reading down the variables column indicates which variable is added in going from k to $k + 1$ (going from $k = 2$ to $k = 3$, both $x_3$ and $x_{10}$ are added, and $x_2$ is deleted). Figure 13.18 contains plots of $R_k^2$, adjusted $R_k^2$, and $C_k$ against k; these plots are an important visual aid in selecting a subset. The estimate of $\sigma^2$ is $s^2 = 6.24$, which is $\text{MSE}_{10}$. A simple model that rates highly according to all criteria is the one containing predictors $x_3$, $x_9$, and $x_{10}$.

Table 13.10 Best Subsets for Gas Mileage Data of Example 13.21

k = Number of
Predictors     Variables    SSE_k   R_k^2   Adjusted R_k^2    C_k
     1         9            247.2   .756        .748         11.6
     2         2            169.7   .833        .821          1.2
     3         3, 10, -2    150.4   .852        .836           .1
     4         6            142.3   .860        .839           .8
     5         5            136.2   .866        .840          1.8
     6         8            133.3   .869        .837          3.4
     7         4            132.0   .870        .832          5.2
     8         7            131.3   .871        .826          7.1
     9         1            131.1   .871        .818          9.0
    10         2            131.0   .871        .809         11.0

Figure 13.18 $R_k^2$ and $C_k$ plots for the gas mileage data (three panels: $R_k^2$, adjusted $R_k^2$, and $C_k$, each plotted against k) ■
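The criteria columns of Table 13.10 can be recomputed from $\text{SSE}_k$ and $R_k^2$ alone, which makes the definitions concrete. In the sketch below, adjusted $R_k^2$ is obtained from $R_k^2$ through the algebraically equivalent form $1 - (1 - R_k^2)(n - 1)/(n - k - 1)$, since SST is not listed in the table. The function and variable names are illustrative assumptions.

```python
N = 32            # number of observations in Example 13.21
S2_FULL = 6.24    # s^2 = MSE_10, from the model with all 10 predictors

# k -> (SSE_k, R_k^2), as listed in Table 13.10
TABLE = {
    1: (247.2, .756), 2: (169.7, .833), 3: (150.4, .852), 4: (142.3, .860),
    5: (136.2, .866), 6: (133.3, .869), 7: (132.0, .870), 8: (131.3, .871),
    9: (131.1, .871), 10: (131.0, .871),
}


def mse(sse_k, k, n=N):
    """MSE_k = SSE_k / (n - k - 1)."""
    return sse_k / (n - k - 1)


def adjusted_r2(r2_k, k, n=N):
    """Adjusted R_k^2 = 1 - MSE_k/MST, rewritten in terms of R_k^2 (SST cancels)."""
    return 1 - (1 - r2_k) * (n - 1) / (n - k - 1)


def c_k(sse_k, k, s2=S2_FULL, n=N):
    """C_k = SSE_k / s^2 + 2(k + 1) - n."""
    return sse_k / s2 + 2 * (k + 1) - n


criteria = {
    k: (mse(sse, k), adjusted_r2(r2, k), c_k(sse, k))
    for k, (sse, r2) in TABLE.items()
}
for k, (m, adj, c) in criteria.items():
    print(f"k={k:2d}  MSE_k={m:6.3f}  adj R_k^2={adj:.3f}  C_k={c:5.1f}")

best_by_adjusted_r2 = max(criteria, key=lambda k: criteria[k][1])
best_by_c = min(criteria, key=lambda k: criteria[k][2])
print(best_by_adjusted_r2, best_by_c)
```

Since the inputs are rounded table values, a recomputed entry can differ from the printed table in the last digit.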

Generally speaking, when a subset of k predictors $(k < m)$ is used to fit a model, the estimators $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$ will be biased for $\beta_0, \beta_1, \ldots, \beta_k$ and $\hat{Y}$ will also be a biased estimator for the true E(Y) (all this because $m - k$ predictors are missing from the fitted model). However, as measured by the total normalized expected error $\Gamma_k$, estimates based on a subset can provide more precision than would be obtained using all possible predictors; essentially, this greater precision is obtained at the price of introducing a bias in the estimators. A value of k for which $C_k \approx k + 1$ indicates that the bias associated with this k-predictor model would be small.

Example 13.22 The bond shear strength data introduced in Example 13.12 contains values of four different independent variables $x_1$–$x_4$. We found that the model with only these four variables as predictors was useful, and there is no compelling reason to consider the inclusion of second-order predictors. Figure 13.19 is the Minitab output that results from a request to identify the two best models of each given size.

Response is strength

Predictor columns: force, power, temp, time

Vars  R-sq  Adj. R-sq   C-p       s   Predictors
   1  57.7       56.2  11.0  5.9289   X
   1  10.8        7.7  51.9  8.6045   X
   2  68.5       66.2   3.5  5.2070   X X
   2  59.4       56.4  11.5  5.9136   X X
   3  70.2       66.8   4.0  5.1590   X X X
   3  69.7       66.2   4.5  5.2078   X X X
   4  71.4       66.8   5.0  5.1580   X X X X

Figure 13.19 Output from Minitab’s Best Subsets option

The best two-predictor model, with predictors power and temperature, seems to be a very good choice on all counts: $R^2$ is significantly higher than for models with fewer predictors yet almost as large as for any larger models, adjusted $R^2$ is almost at its maximum for this data, and $C_2$ is small and close to $2 + 1 = 3$. ■

Stepwise Regression

When the number of predictors is too large to allow for explicit or implicit examination of all possible subsets, several alternative selection procedures will generally identify good models. The simplest such procedure is the backward elimination (BE) method. This method starts with the model in which all predictors under consideration are used. Let the set of all such predictors be $x_1, \ldots, x_m$. Then each t ratio $\hat{\beta}_i/s_{\hat{\beta}_i}$ $(i = 1, \ldots, m)$ appropriate for testing $H_0: \beta_i = 0$ versus $H_a: \beta_i \ne 0$ is examined. If the t ratio with the smallest absolute value is less than a prespecified constant $t_{out}$, that is, if

$$\min_{i = 1, \ldots, m} \left| \frac{\hat{\beta}_i}{s_{\hat{\beta}_i}} \right| < t_{out}$$

then the predictor corresponding to the smallest ratio is eliminated from the model. The reduced model is now fit, the $m - 1$ t ratios are again examined, and another predictor is eliminated if it corresponds to the smallest absolute t ratio smaller than $t_{out}$. In this way, the algorithm continues until, at some stage, all absolute t ratios are at least $t_{out}$. The model used is the one containing all predictors that were not eliminated. The value $t_{out} = 2$ is often recommended since most $t_{.05}$ values are near 2. Some computer packages focus on P-values rather than t ratios.
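The backward elimination loop described above can be sketched in a few lines. This is a minimal illustration rather than any package's implementation: the data are simulated (a hypothetical response that depends on x1 and x2 but not x3), the names `invert`, `t_ratios`, and `backward_elimination` are assumptions, and the least squares fit uses a plain Gauss-Jordan inverse of X'X for self-containment.

```python
import random


def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n:] for row in m]


def t_ratios(columns, y):
    """Fit y on an intercept plus the given predictor columns; return each slope's t ratio."""
    n, k = len(y), len(columns)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k + 1)] for i in range(k + 1)]
    xtx_inv = invert(xtx)
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k + 1)]
    beta = [sum(xtx_inv[i][j] * xty[j] for j in range(k + 1)) for i in range(k + 1)]
    sse = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))
    s2 = sse / (n - k - 1)
    # Estimated SD of beta-hat_j is sqrt(s^2 * [(X'X)^-1]_jj)
    return [beta[j] / (s2 * xtx_inv[j][j]) ** 0.5 for j in range(1, k + 1)]


def backward_elimination(predictors, y, t_out=2.0):
    """predictors: dict mapping name -> list of values. Returns the surviving names."""
    kept = dict(predictors)
    while kept:
        names = list(kept)
        ratios = t_ratios([kept[name] for name in names], y)
        weakest = min(range(len(names)), key=lambda j: abs(ratios[j]))
        if abs(ratios[weakest]) >= t_out:
            break                      # every remaining |t| is at least t_out
        del kept[names[weakest]]       # drop the weakest predictor and refit
    return list(kept)


# Hypothetical data: y depends on x1 and x2 only; x3 is unrelated noise.
random.seed(1)
n_obs = 40
x1 = [random.gauss(0, 1) for _ in range(n_obs)]
x2 = [random.gauss(0, 1) for _ in range(n_obs)]
x3 = [random.gauss(0, 1) for _ in range(n_obs)]
y = [3 + 2 * a - 1.5 * b + random.gauss(0, 0.5) for a, b in zip(x1, x2)]

selected = backward_elimination({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected)
```

Only one predictor is removed per pass and the model is refit before the next decision, which matters because t ratios of the remaining predictors generally change after a deletion.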

Example 13.23 (Example 13.20 continued)

For the coded full quadratic model in which $y$ = tar content, the five potential predictors are $x_1'$, $x_2'$, $x_3' = x_1'^2$, $x_4' = x_2'^2$, and $x_5' = x_1' x_2'$ (so $m = 5$). Without specifying $t_{out}$, the predictor with the smallest absolute t ratio (asterisked) was eliminated at each stage, resulting in the sequence of models shown in Table 13.11.

Table 13.11 Backward Elimination Results for the Data of Example 13.20

                               |t ratio|
Step  Predictors         1      2      3      4      5
  1   1, 2, 3, 4, 5   16.0   10.8    2.9    2.8    1.8*
  2   1, 2, 3, 4      15.4   10.2    3.7    2.0*    —
  3   1, 2, 3         14.5   12.2    4.3*    —      —
  4   1, 2            10.9    9.1*    —      —      —
  5   1                4.4*    —      —      —      —

Using $t_{out} = 2$, the resulting model would be based on $x_1'$, $x_2'$, and $x_3'$, since at Step 3 no predictor could be eliminated. It can be verified that each subset is actually the best subset of its size, though this is by no means always the case. ■


An alternative to the BE procedure is forward selection (FS). FS starts with no predictors in the model and considers fitting in turn the model with only $x_1$, only $x_2, \ldots$, and finally only $x_m$. The variable that, when fit, yields the largest absolute t ratio enters the model provided that the ratio exceeds the specified constant $t_{in}$. Suppose $x_1$ enters the model. Then models with $(x_1, x_2), (x_1, x_3), \ldots, (x_1, x_m)$ are considered in turn. The largest $|\hat{\beta}_j/s_{\hat{\beta}_j}|$ $(j = 2, \ldots, m)$ then specifies the entering predictor provided that this maximum also exceeds $t_{in}$. This continues until at some step no absolute t ratios exceed $t_{in}$. The entered predictors then specify the model. The value $t_{in} = 2$ is often used for the same reason that $t_{out} = 2$ is used in BE. For the tar-content data, FS resulted in the sequence of models given in Steps $5, 4, \ldots, 1$ in Table 13.11 and thus is in agreement with BE. This will not always be the case.

The stepwise procedure most widely used is a combination of FS and BE, denoted by FB. This procedure starts as does forward selection, by adding variables to the model, but after each addition it examines those variables previously entered to see whether any is a candidate for elimination. For example, if there are eight predictors under consideration and the current set consists of $x_2$, $x_3$, $x_5$, and $x_6$ with $x_5$ having just been added, the t ratios $\hat{\beta}_2/s_{\hat{\beta}_2}$, $\hat{\beta}_3/s_{\hat{\beta}_3}$, and $\hat{\beta}_6/s_{\hat{\beta}_6}$ are examined. If the smallest absolute ratio is less than $t_{out}$, then the corresponding variable is eliminated from the model (some software packages base decisions on $f = t^2$). The idea behind FB is that, with forward selection, a single variable may be more strongly related to y than is either of two or more other variables individually, but the combination of these variables may make the single variable subsequently redundant. This actually happened with the gas-mileage data discussed in Example 13.21, with $x_2$ entering and subsequently leaving the model.

Although in most situations these automatic selection procedures will identify a good model, there is no guarantee that the best or even a nearly best model will result. Close scrutiny should be given to data sets for which there appear to be strong relationships among some of the potential predictors; we will say more about this shortly.

Identification of Influential Observations

In simple linear regression, it is easy to spot an observation whose x value is much larger or much smaller than other x values in the sample. Such an observation may have a great impact on the estimated regression equation (whether it actually does depends on how far the point (x, y) falls from the line determined by the other points in the scatter plot). In multiple regression, it is also desirable to know whether the values of the predictors for a particular observation are such that it has the potential for exerting great influence on the estimated equation. One method for identifying potentially influential observations relies on the fact that because each $\hat{\beta}_i$ is a linear function of $y_1, y_2, \ldots, y_n$, each predicted y value of the form $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k$ is also a linear function of the $y_j$'s. In particular, the predicted values corresponding to sample observations can be written as follows:

$$\hat{y}_1 = h_{11}y_1 + h_{12}y_2 + \cdots + h_{1n}y_n$$

$$\hat{y}_2 = h_{21}y_1 + h_{22}y_2 + \cdots + h_{2n}y_n$$

$$\vdots$$

$$\hat{y}_n = h_{n1}y_1 + h_{n2}y_2 + \cdots + h_{nn}y_n$$

Each coefficient $h_{ij}$ is a function only of the $x_{ij}$'s in the sample and not of the $y_j$'s. It can be shown that $h_{ij} = h_{ji}$ and that $0 \le h_{jj} \le 1$.


Let's focus on the "diagonal" coefficients $h_{11}, h_{22}, \ldots, h_{nn}$. The coefficient $h_{jj}$ is the weight given to $y_j$ in computing the corresponding predicted value $\hat{y}_j$. This quantity can also be expressed as a measure of the distance between the point $(x_{1j}, \ldots, x_{kj})$ in k-dimensional space and the center of the data $(\bar{x}_{1\cdot}, \ldots, \bar{x}_{k\cdot})$. It is therefore natural to characterize an observation whose $h_{jj}$ is relatively large as one that has potentially large influence. Unless there is a perfect linear relationship among the k predictors, $\sum_{j=1}^{n} h_{jj} = k + 1$, so the average of the $h_{jj}$'s is $(k + 1)/n$. Some statisticians suggest that if $h_{jj} > 2(k + 1)/n$, the jth observation be cited as being potentially influential; others use $3(k + 1)/n$ as the dividing line.
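The properties of the $h_{ij}$'s stated above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses NumPy, takes the predictor values from the beam data of Example 13.24, and computes the coefficients from the standard matrix expression $H = X(X'X)^{-1}X'$, which this section does not state explicitly. The variable names (`X`, `H`, `cutoff`, `flagged`) are chosen here for illustration.

```python
import numpy as np

# Predictor values for the 10 beams of Example 13.24.
x1 = np.array([.499, .558, .604, .441, .550, .528, .418, .480, .406, .467])  # specific gravity
x2 = np.array([11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7])       # moisture content

n, k = len(x1), 2
X = np.column_stack([np.ones(n), x1, x2])   # design matrix with an intercept column

# Assumed matrix form: H = X (X'X)^{-1} X'. solve() avoids an explicit inverse.
H = X @ np.linalg.solve(X.T @ X, X.T)
h = np.diag(H)                              # the diagonal coefficients h_jj

cutoff = 2 * (k + 1) / n                    # the 2(k + 1)/n dividing line
flagged = np.flatnonzero(h > cutoff) + 1    # 1-based observation numbers

print("h_jj:", np.round(h, 3))
print("sum of h_jj:", round(float(h.sum()), 6))   # should equal k + 1 = 3
print("potentially influential:", flagged)
```

With these data only observation 4 exceeds the $2(k + 1)/n = .6$ line. The stricter $3(k + 1)/n$ rule would flag nothing here, since that cutoff is .9.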


Example 13.24 The accompanying data appeared in the article “Testing for the Inclusion of Variables in Linear Regression by a Randomization Technique” (Technometrics, 1966: 695–699) and was reanalyzed in Hoaglin and Welsch, “The Hat Matrix in Regression and ANOVA” (Amer. Statistician, 1978: 17–23). The hij’s (with elements below the diagonal omitted by symmetry) follow the data.

Beam Number   Specific Gravity (x1)   Moisture Content (x2)   Strength (y)

     1                .499                    11.1               11.14
     2                .558                     8.9               12.74
     3                .604                     8.8               13.13
     4                .441                     8.9               11.51
     5                .550                     8.8               12.38
     6                .528                     9.9               12.60
     7                .418                    10.7               11.13
     8                .480                    10.5               11.70
     9                .406                    10.5               11.02
    10                .467                    10.7               11.41

         1      2      3      4      5      6      7      8      9     10

 1    .418  -.002   .079  -.274  -.046   .181   .128   .222   .050   .242
 2           .242   .292   .136   .243   .128  -.041   .033  -.035   .004
 3                  .417  -.019   .273   .187  -.126   .044  -.153   .004
 4                         .604   .197  -.038   .168  -.022   .275  -.028
 5                                .252   .111  -.030   .019  -.010  -.010
 6                                       .148   .042   .117   .012   .111
 7                                              .262   .145   .277   .174
 8                                                     .154   .120   .168
 9                                                            .315   .148
10                                                                   .187

Here $k = 2$, so $(k + 1)/n = 3/10 = .3$; since $h_{44} = .604 > 2(.3)$, the fourth data point is identified as potentially influential. ■

Another technique for assessing the influence of the jth observation that takes into account $y_j$ as well as the predictor values involves deleting the jth observation from the data set and performing a regression based on the remaining observations. If the estimated coefficients from the "deleted observation" regression differ greatly from the estimates based on the full data, the jth observation has clearly had a substantial impact on the fit. One way to judge whether estimated coefficients change greatly is to express each change relative to the estimated standard deviation of the coefficient:

$$
\frac{(\hat{\beta}_i \text{ before deletion}) - (\hat{\beta}_i \text{ after deletion})}{s_{\hat{\beta}_i}} = \frac{\text{change in } \hat{\beta}_i}{s_{\hat{\beta}_i}}
$$

There exist efficient computational formulas that allow all this information to be obtained from the "no-deletion" regression, so that the additional n regressions are unnecessary.
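The deletion comparison described above can be sketched by brute force, at the cost of exactly the extra regressions that the efficient formulas avoid. This is an illustrative sketch rather than the authors' computation: it assumes NumPy, uses the beam data of Example 13.24, and the helper names `ols` and `deletion_change` are hypothetical.

```python
import numpy as np

# Beam data of Example 13.24.
x1 = np.array([.499, .558, .604, .441, .550, .528, .418, .480, .406, .467])
x2 = np.array([11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7])
y = np.array([11.14, 12.74, 13.13, 11.51, 12.38, 12.60, 11.13, 11.70, 11.02, 11.41])
X = np.column_stack([np.ones(len(y)), x1, x2])

def ols(X, y):
    """Least squares estimates and their estimated standard deviations."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (X.shape[0] - X.shape[1])       # SSE / [n - (k + 1)]
    se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
    return beta, se

beta_full, se_full = ols(X, y)

def deletion_change(obs):
    """(estimate before deletion) - (estimate after deletion), obs numbered from 1."""
    keep = np.arange(len(y)) != obs - 1
    beta_deleted, _ = ols(X[keep], y[keep])
    return beta_full - beta_deleted

for obs in (1, 4, 6):
    change = deletion_change(obs)
    print(f"j = {obs}: change = {np.round(change, 4)}, "
          f"change / SD = {np.round(change / se_full, 2)}")
```

Each change is divided by the estimated standard deviation from the full-data fit, matching the ratio displayed above. For observation 4 the changes come out near -2.11, 1.70, and .124, in line with the j = 4 column of Table 13.12.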

Example 13.25 (Example 13.24 continued) Consider separately deleting observations 1 and 6, whose residuals are the largest, and observation 4, where $h_{jj}$ is large. Table 13.12 contains the relevant information.

Table 13.12 Changes in Estimated Coefficients for Example 13.25

                                                 Change When Point j Is Deleted

Parameter   No-Deletions Estimate   Estimated SD     j = 1      j = 4      j = 6

β0                 10.302               1.896         2.710     -2.109      -.642
β1                  8.495               1.784        -1.772      1.695       .748
β2                 -.2663                .1273       -.1932      .1242      .0329

                                            ej:       -3.25       -.96       2.20
                                            hjj:       .418       .604       .148

For deletion of both point 1 and point 4, the change in each estimate is in the range 1–1.5 standard deviations, which is reasonably substantial (this does not tell us what would happen if both points were simultaneously omitted). For point 6, however, the change is roughly .25 standard deviation. Thus points 1 and 4, but not 6, might well be omitted in calculating a regression equation. ■

Multicollinearity

In many multiple regression data sets, the predictors $x_1, x_2, \ldots, x_k$ are highly interdependent. Consider the usual model

$$
Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon
$$

with data $(x_{1j}, \ldots, x_{kj}, y_j)$ $(j = 1, \ldots, n)$ available for fitting. Suppose the principle of least squares is used to regress $x_i$ on the other predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$, resulting in

$$
\hat{x}_i = a_0 + a_1 x_1 + \cdots + a_{i-1} x_{i-1} + a_{i+1} x_{i+1} + \cdots + a_k x_k
$$

It can then be shown that

$$
V(\hat{\beta}_i) = \frac{\sigma^2}{\sum_{j=1}^{n} (x_{ij} - \hat{x}_{ij})^2} \tag{13.22}
$$

When the sample $x_i$ values can be predicted very well from the other predictor values, the denominator of (13.22) will be small, so $V(\hat{\beta}_i)$ will be quite large. If this is the case for at least one predictor, the data is said to exhibit multicollinearity. Multicollinearity is often suggested by a regression computer output in which $R^2$ is large but some of the t ratios $\hat{\beta}_i / s_{\hat{\beta}_i}$ are small for predictors that, based on prior information and intuition, seem important. Another clue to the presence of multicollinearity lies in a $\hat{\beta}_i$ value that has the opposite sign from that which intuition would suggest, indicating that another predictor or collection of predictors is serving as a "proxy" for $x_i$.

An assessment of the extent of multicollinearity can be obtained by regressing each predictor in turn on the remaining $k - 1$ predictors. Let $R_i^2$ denote the value of $R^2$ in the regression with dependent variable $x_i$ and predictors $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_k$. It has been suggested that severe multicollinearity is present if $R_i^2 > .9$ for any i. Some statistical software packages will refuse to include a predictor in the model when its $R_i^2$ value is quite close to 1.

There is no consensus among statisticians as to what remedies are appropriate when severe multicollinearity is present. One possibility involves continuing to use a model that includes all the predictors but estimating parameters by using something other than least squares. Consult a chapter reference for more details.
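The $R_i^2$ assessment and expression (13.22) can be illustrated with a short computation. The following is a sketch under stated assumptions, not a prescribed procedure: it assumes NumPy, reuses the beam data of Example 13.24 purely as a convenient two-predictor data set, substitutes the estimate $s^2$ for $\sigma^2$ in (13.22), and the helper name `aux_regression` is hypothetical.

```python
import numpy as np

# Beam data of Example 13.24, used here only as a small two-predictor illustration.
x1 = np.array([.499, .558, .604, .441, .550, .528, .418, .480, .406, .467])
x2 = np.array([11.1, 8.9, 8.8, 8.9, 8.8, 9.9, 10.7, 10.5, 10.5, 10.7])
y = np.array([11.14, 12.74, 13.13, 11.51, 12.38, 12.60, 11.13, 11.70, 11.02, 11.41])
X = np.column_stack([np.ones(len(y)), x1, x2])
n, p = X.shape                                   # p = k + 1

def aux_regression(X, i):
    """Regress predictor column i on the other columns (intercept included).

    Returns the residual sum of squares, i.e. the denominator of (13.22),
    and the corresponding R_i^2.
    """
    target = X[:, i]
    others = np.delete(X, i, axis=1)
    coef, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ coef
    sse = resid @ resid
    sst = np.sum((target - target.mean()) ** 2)
    return sse, 1 - sse / sst

# Full-model fit: s^2 and the usual variance estimates from s^2 (X'X)^{-1}.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
s2 = resid @ resid / (n - p)
var_usual = s2 * np.diag(np.linalg.inv(X.T @ X))

results = {}
for i in (1, 2):                                 # column 0 is the intercept
    sse_i, r2_i = aux_regression(X, i)
    results[i] = (r2_i, s2 / sse_i)              # (R_i^2, estimate via (13.22))
    print(f"x{i}: R_i^2 = {r2_i:.3f}, "
          f"variance via (13.22) = {s2 / sse_i:.5f}, usual estimate = {var_usual[i]:.5f}")
```

Both routes to the estimated variance should agree. With only two predictors the two $R_i^2$ values coincide (each is the squared sample correlation between $x_1$ and $x_2$), and here they fall well below the suggested .9 threshold.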

EXERCISES Section 13.5 (55–64)

55. The article "Bank Full Discharge of Rivers" (Water Resources J., 1978: 1141–1154) reports data on discharge amount (q, in m^3/sec), flow area (a, in m^2), and slope of the water surface (b, in m/m) obtained at a number of floodplain stations. A subset of the data follows. The article proposed a multiplicative power model $Q = \alpha a^{\beta} b^{\gamma} \epsilon$.

q    17.6    23.8     5.7     3.0     7.5    89.2    60.9    27.5    13.2    12.2
a     8.4    31.6     5.7     1.0     3.3    41.1    26.2    16.4     6.7     9.7
b   .0048   .0073   .0037   .0412   .0416   .0063   .0061   .0036   .0039   .0025

a. Use an appropriate transformation to make the model linear, and then estimate the regression parameters for the transformed model. Finally, estimate $\alpha$, $\beta$, and $\gamma$ (the parameters of the original model). What would be your prediction of discharge amount when flow area is 10 and slope is .01?

b. Without actually doing any analysis, how would you fit a multiplicative exponential model $Q = \alpha e^{\beta a} e^{\gamma b} \epsilon$?

c. After the transformation to linearity in part (a), a 95% CI for the value of the transformed regression function when $a = 3.3$ and $b = .0046$ was obtained from computer output as (.217, 1.755). Obtain a 95% CI for $\alpha a^{\beta} b^{\gamma}$ when $a = 3.3$ and $b = .0046$.

56. In an experiment to study factors influencing wood specific gravity ("Anatomical Factors Influencing Wood Specific Gravity of Slash Pines and the Implications for the Development of a High-Quality Pulpwood," TAPPI, 1964: 401–404), a sample of 20 mature wood samples was obtained, and measurements were taken on the number of fibers/mm^2 in springwood (x1), number of fibers/mm^2 in summerwood (x2), % springwood (x3), light absorption in springwood (x4), and light absorption in summerwood (x5).

a. Fitting the regression function $\mu_{Y \cdot x_1, x_2, x_3, x_4, x_5} = \beta_0 + \beta_1 x_1 + \cdots + \beta_5 x_5$ resulted in $R^2 = .769$. Does the data indicate that there is a linear relationship between specific gravity and at least one of the predictors? Test using $\alpha = .01$.

b. When x2 is dropped from the model, the value of $R^2$ remains at .769. Compute adjusted $R^2$ for both the full model and the model with x2 deleted.

c. When x1, x2, and x4 are all deleted, the resulting value of $R^2$ is .654. The total sum of squares is $SST = .0196610$. Does the data suggest that all of x1, x2, and x4 have zero coefficients in the true regression model? Test the relevant hypotheses at level .05.

d. The mean and standard deviation of x3 were 52.540 and 5.4447, respectively, whereas those of x5 were 89.195 and 3.6660, respectively. When the model involving these two standardized variables was fit, the estimated regression equation was $y = .5255 - .0236x_3' + .0097x_5'$. What value of specific gravity would you predict for a wood sample with % springwood = 50 and % light absorption in summerwood = 90?

e. The estimated standard deviation of the estimated coefficient $\hat{\beta}_3$ of $x_3'$ (i.e., for $\hat{\beta}_3$ of the standardized model) was .0046. Obtain a 95% CI for $\beta_3$.

f. Using the information in parts (d) and (e), what is the estimated coefficient of x3 in the unstandardized model (using only predictors x3 and x5), and what is the estimated standard deviation of the coefficient estimator (i.e., $s_{\hat{\beta}_3}$ for $\hat{\beta}_3$ in the unstandardized model)?

g. The estimate of $\sigma$ for the two-predictor model is $s = .02001$, whereas the estimated standard deviation of $\hat{\beta}_0 + \hat{\beta}_3 x_3' + \hat{\beta}_5 x_5'$ when $x_3' = -.3747$ and $x_5' = -.2769$ (i.e., when $x_3 = 50.5$ and $x_5 = 88.9$) is .00482. Compute a 95% PI for specific gravity when % springwood = 50.5 and % light absorption in summerwood = 88.9.


57. In the accompanying table, we give the smallest SSE for each number of predictors k (k = 1, 2, 3, 4) for a regression problem in which y = cumulative heat of hardening in cement, x1 = % tricalcium aluminate, x2 = % tricalcium silicate, x3 = % aluminum ferrate, and x4 = % dicalcium silicate.

Number of Predictors k   Predictor(s)         SSE

         1               x4                880.85
         2               x1, x2             58.01
         3               x1, x2, x3         49.20
         4               x1, x2, x3, x4     47.86

In addition, n = 13 and SST = 2715.76.

a. Use the criteria discussed in the text to recommend the use of a particular regression model.

b. Would forward selection result in the best two-predictor model? Explain.

58. The article "Response Surface Methodology for Protein Extraction Optimization of Red Pepper Seed" (Food Sci. and Tech., 2010: 226–231) gave data on the response variable y = protein yield (%) and the independent variables x1 = temperature (°C), x2 = pH, x3 = extraction time (min), and x4 = solvent/meal ratio.

a. Fitting the model with the four xi's as predictors gave the following output:

Predictor       Coef   SE Coef       T       P
Constant      -4.586     2.542   -1.80   0.084
x1           0.01317   0.02707    0.49   0.631
x2            1.6350    0.2707    6.04   0.000
x3           0.02883   0.01353    2.13   0.044
x4           0.05400   0.02707    1.99   0.058

Source            DF        SS       MS       F       P
Regression         4   19.8882   4.9721   11.31   0.000
Residual Error    24   10.5513   0.4396
Total             28   30.4395

Calculate and interpret the values of $R^2$ and adjusted $R^2$. Does the model appear to be useful?

b. Fitting the complete second-order model gave the following results:

Predictor        Coef     SE Coef       T       P
Constant      -119.49       18.53   -6.45   0.000
x1            -0.1047      0.2839   -0.37   0.718
x2             28.678       3.625    7.91   0.000
x3             0.4074      0.1303    3.13   0.007
x4             0.2711      0.2606    1.04   0.316
x1sqd       -0.000752    0.002110   -0.36   0.727
x2sqd         -1.6452      0.2110   -7.80   0.000
x3sqd       0.0002121   0.0005275    0.40   0.694
x4sqd       -0.015152    0.002110   -7.18   0.000
x1x2          0.02150     0.02687    0.80   0.437
x1x3         0.000550    0.001344    0.41   0.688
x1x4        -0.000800    0.002687   -0.30   0.770
x2x3         -0.05900     0.01344   -4.39   0.001
x2x4          0.03900     0.02687    1.45   0.169
x3x4         0.002725    0.001344    2.03   0.062

S = 0.268703   R-Sq = 96.7%   R-Sq(adj) = 93.4%

Source            DF        SS       MS       F       P
Regression        14   29.4287   2.1020   29.11   0.000
Residual Error    14    1.0108   0.0722
Total             28   30.4395

Does at least one of the second-order predictors appear to be useful? Carry out an appropriate test of hypotheses.

c. From the output in (b), a reasonable conjecture is that none of the predictors involving x1 are providing useful information. When these predictors are eliminated, the value of SSE for the reduced regression model is 1.1887. Does this support the conjecture?

d. Here is output from Minitab's best subsets option, with just the single best subset of each size identified. Which model(s) would you consider using (subject to checking model adequacy)?

Predictor columns, left to right: x1, x2, x3, x4, x1sqd, x2sqd, x3sqd, x4sqd, x1x2, x1x3, x1x4, x2x3, x2x4, x3x4

                          Mallows
Vars   R-Sq   R-Sq(adj)      Cp         S   Predictors included
   1   52.7        50.9   174.4   0.73030   X
   2   67.9        65.4   112.5   0.61349   X X
   3   77.7        75.0    73.1   0.52124   X X X
   4   83.4        80.7    50.8   0.45835   X X X X
   5   90.9        88.9    21.4   0.34731   X X X X X
   6   94.6        93.1     7.9   0.27422   X X X X X X
   7   95.8        94.4     4.7   0.24683   X X X X X X X
   8   96.2        94.6     5.1   0.24137   X X X X X X X X
   9   96.4        94.7     6.1   0.23962   X X X X X X X X X
  10   96.6        94.6     7.5   0.24132   X X X X X X X X X X
  11   96.6        94.4     9.4   0.24716   X X X X X X X X X X X
  12   96.6        94.1    11.2   0.25328   X X X X X X X X X X X X
  13   96.7        93.8    13.1   0.26041   X X X X X X X X X X X X X
  14   96.7        93.4    15.0   0.26870   X X X X X X X X X X X X X X