Statistics
60 CHAPTER 2 Organizing and Visualizing Data
KEY EQUATIONS

Determining the Class Interval Width

$$\text{Interval width} = \frac{\text{highest value} - \text{lowest value}}{\text{number of classes}} \qquad (2.1)$$

Computing the Proportion or Relative Frequency

$$\text{Proportion} = \text{relative frequency} = \frac{\text{frequency in each class}}{\text{total number of values}} \qquad (2.2)$$

KEY TERMS

Analyze 16; bar chart 32; categorical variable 16; cell 22; chartjunk 57; class boundaries 26; class interval 26; class interval width 26; class midpoint 27; class 26; Collect 16; contingency table 22; continuous variable 17; cumulative percentage distribution 29; cumulative percentage polygon (ogive) 43; data collection 20; DCOVA 16
Define 16; discrete variable 16; drill down 54; frequency distribution 26; histogram 41; interval scale 18; multidimensional data 52; nominal scale 17; numerical variable 16; ogive (cumulative percentage polygon) 43; ordered array 25; ordinal scale 17; Organize 16; Pareto chart 35; Pareto principle 35; percentage distribution 28; percentage polygon 42
pie chart 34; PivotTable 52; primary data source 20; proportion 28; qualitative variable 16; quantitative variable 16; ratio scale 18; relative frequency 28; relative frequency distribution 28; scatter plot 48; secondary data source 20; side-by-side bar chart 37; stem-and-leaf display 40; summary table 21; time-series plot 50; Visualize 16
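Equations (2.1) and (2.2) can be checked by hand on a small data set. The sketch below is only an illustration: the sample values, the choice of four classes, and the class boundaries are hypothetical and do not come from any file used in this chapter. Each class is assumed to include its lower boundary and exclude its upper boundary.

```python
def interval_width(values, number_of_classes):
    """Equation (2.1): (highest value - lowest value) / number of classes."""
    return (max(values) - min(values)) / number_of_classes


def relative_frequencies(values, boundaries):
    """Equation (2.2): frequency in each class / total number of values.

    Class i covers boundaries[i] <= value < boundaries[i + 1].
    """
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        for i in range(len(counts)):
            if boundaries[i] <= v < boundaries[i + 1]:
                counts[i] += 1
                break
    total = len(values)
    return [c / total for c in counts]


# Hypothetical sample of 10 values (not from the textbook data files).
sample = [12, 15, 21, 24, 27, 33, 38, 41, 45, 52]

width = interval_width(sample, 4)  # (52 - 12) / 4
proportions = relative_frequencies(sample, [10, 20, 30, 40, 50, 60])
percentages = [100 * p for p in proportions]

print("interval width:", width)
print("proportions:", proportions)
```

Multiplying each proportion by 100 gives the corresponding percentage distribution.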
CHAPTER REVIEW PROBLEMS
CHECKING YOUR UNDERSTANDING
2.83 How do histograms and polygons differ in their construction and use?
2.84 Why would you construct a summary table?
2.85 What are the advantages and disadvantages of using a bar chart, a pie chart, or a Pareto chart?
2.86 Compare and contrast the bar chart for categorical data with the histogram for numerical data.
2.87 What is the difference between a time-series plot and a scatter plot?
2.88 Why is it said that the main feature of a Pareto chart is its ability to separate the "vital few" from the "trivial many"?
2.89 What are the three different ways to break down the percentages in a contingency table?
2.90 What is the difference between a PivotTable and a contingency table?
2.91 What insights can you gain from a three-way table that are not available in a two-way table?
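As a computational companion to Problem 2.89, the following sketch computes percentages of a two-way table based on row totals, column totals, and the overall total. It is a minimal illustration, not a prescribed method: the function name, the row and column labels, and the counts are all hypothetical.

```python
def contingency_percentages(table):
    """Return (row %, column %, total %) for a two-way table of counts.

    `table` maps each row label to a dict of {column label: count}.
    """
    row_totals = {r: sum(cols.values()) for r, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())

    row_pct = {r: {c: 100 * n / row_totals[r] for c, n in cols.items()}
               for r, cols in table.items()}
    col_pct = {r: {c: 100 * n / col_totals[c] for c, n in cols.items()}
               for r, cols in table.items()}
    total_pct = {r: {c: 100 * n / grand_total for c, n in cols.items()}
                 for r, cols in table.items()}
    return row_pct, col_pct, total_pct


# Hypothetical counts: rows "Yes"/"No", columns "A"/"B".
counts = {"Yes": {"A": 30, "B": 20}, "No": {"A": 10, "B": 40}}
row_pct, col_pct, total_pct = contingency_percentages(counts)

print(row_pct["Yes"]["A"], col_pct["Yes"]["A"], total_pct["Yes"]["A"])
```

The same cell yields three different percentages because each one uses a different base, so the choice of base should follow the question being asked.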
APPLYING THE CONCEPTS
2.92 The following summary table presents the breakdown of the price of a new college textbook:

Revenue Category                      Percentage (%)
Publisher                                  64.8
  Manufacturing costs                      32.3
  Marketing and promotion                  15.4
  Administrative costs and taxes           10.0
  After-tax profit                          7.1
Bookstore                                  22.4
  Employee salaries and benefits           11.3
  Operations                                6.6
  Pretax profit                             4.5
Author                                     11.6
Freight                                     1.2
Source: Data extracted from T. Lewin, "When Books Break the Bank," The New York Times, September 16, 2003, pp. B1, B4.

a. Using the four categories, publisher, bookstore, author, and freight, construct a bar chart, a pie chart, and a Pareto chart.
b. Using the four subcategories of publisher and three subcategories of bookstore, along with the author and freight categories, construct a Pareto chart.
c. Based on the results of (a) and (b), what conclusions can you reach concerning who gets the revenue from the sales of new college textbooks? Do any of these results surprise you? Explain.

2.93 The following table represents the estimated green power sales by renewable energy source in 2008:

Source                          Percentage (%)
Geothermal                            2.8
Hydro                                11.3
Landfill mass and biomass            28.1
Solar                                 0.2
Unreported                            2.5
Wind                                 55.1
Source: National Renewable Energy Laboratory, 2008.

a. Construct a bar chart, a pie chart, and a Pareto chart.
b. What conclusions can you reach about the sources of green power?

2.94 People conduct hundreds of millions of search queries every day. In response, businesses are estimated to spend almost $20 billion annually on online ad spending. The following represents the categories of online ad spending and the results of a Yahoo! keyword tool for searches related to "sneakers":

Spending
Type                    Spending ($billions)
Classified                      3.32
Display ads                     3.90
Paid search                     8.29
Rich media/video                2.15
Other                           1.85
Total                          19.51
Source: Data extracted from K. J. Delaney, "The New Benefits of Web-Search Queries," The Wall Street Journal, February 6, 2007, p. B3.

Results of a Yahoo! Keyword Tool for Searches Related to "Sneakers"
Search Result: Jordan sneaker, Nike sneaker, Puma sneaker, Sneaker, Sneaker pimps*
Number of Occurrences: 13,240  8,139  6,768  [illegible]
* Sneaker pimps is a British electropop band.
Source: Data extracted from K. J. Delaney, "The New Benefits of Web-Search Queries," The Wall Street Journal, February 6, 2007, p. B3.

a. For categories of online ad spending, construct a bar chart, a pie chart, and a Pareto chart.
b. Which graphical method do you think is best for portraying these data?
c. For the results of "sneakers" searches, construct a bar chart, a pie chart, and a Pareto chart.
d. Which graphical method do you think is best for portraying these data?
e. What conclusions can you reach concerning online ad spending and the results of "sneakers" searches?

2.95 The owner of a restaurant serving Continental-style entrées has the business objective of learning more about the patterns of patron demand during the Friday-to-Sunday weekend time period. Data were collected from 630 customers on the type of entrée ordered and organized in the following table:

Type of Entrée      Number Served
Beef                     187
Chicken                  103
Mixed                     30
Duck                      25
Fish                     122
Pasta                     63
Shellfish                 74
Veal                      26
Total                    630

a. Construct a percentage summary table for the types of entrées ordered.
b. Construct a bar chart, a pie chart, and a Pareto chart for the types of entrées ordered.
c. Do you prefer using a Pareto chart or a pie chart for these data? Why?
d. What conclusions can the restaurant owner reach concerning demand for different types of entrées?

2.96 Suppose that the owner of the restaurant in Problem 2.95 also wanted to study the demand for dessert during the same time period. She decided that in addition to studying whether a dessert was ordered, she would also study the gender of the individual and whether a beef entrée was ordered.
Data were collected from 600 customers and organized in the following contingency tables:

                        GENDER
DESSERT ORDERED     Male    Female    Total
Yes                   96        40      136
No                   224       240      464
Total                320       280      600

                        BEEF ENTRÉE
DESSERT ORDERED      Yes        No    Total
Yes                   71        65      136
No                   116       348      464
Total                187       413      600

a. For each of the two contingency tables, construct contingency tables of row percentages, column percentages, and total percentages.
b. Which type of percentage (row, column, or total) do you think is most informative for each gender? For beef entrée? Explain.
c. What conclusions concerning the pattern of dessert ordering can the restaurant owner reach?

2.97 The following data represent the method for recording votes in the November 2006 election, broken down by percentage of counties in the United States using each method, and the number of counties using each method in 2000 and 2006.

Method                              Percentage of Counties Using Method in 2006 (%)
Electronic                                36.6
Hand-counted paper ballots                 1.8
Lever                                      2.0
Mixed                                      3.0
Optically scanned paper ballots           56.2
Punch card                                 0.4
Source: Data extracted from R. Wolf, "Paper-Trail Voting Gets Organized Opposition," USA Today, April 24, 2007, p. 2A.

                                    Number of Counties
Method                                2000      2006
Electronic                             309     1,142
Hand-counted paper ballots             370        57
Lever                                  434        62
Mixed                                  149        92
Optically scanned paper ballots      1,279     1,752
Punch card                             572        13
Source: Data extracted from R. Wolf, "Paper-Trail Voting Gets Organized Opposition," USA Today, April 24, 2007, p. 2A.

a. Construct a pie chart and a Pareto chart for the percentage of counties using the various methods.
b. What conclusions can you reach concerning the type of voting method used in November 2006?
c. What differences are there between the methods used in 2000 and 2006?

2.98 In summer 2000, a growing number of warranty claims on Firestone tires sold on Ford SUVs prompted Firestone and Ford to issue a major recall. An analysis of warranty claims data helped identify which models to recall. A breakdown of 2,504 warranty claims based on tire size is given in the following table:

Tire Size       Number of Warranty Claims
23575R15               2,030
311050R15                137
30950R15                  82
23570R16                  81
331150R15                 58
25570R16                  54
Others                    62
Source: Data extracted from Robert L. Simison, "Ford Steps Up Recall Without Firestone," The Wall Street Journal, August 14, 2000, p. A3.

The 2,030 warranty claims for the 23575R15 tires can be categorized into ATX models and Wilderness models. The type of incident leading to a warranty claim, by model type, is summarized in the following table:

Incident Type        ATX Model Warranty Claims    Wilderness Warranty Claims
Tread separation              1,365                          59
Blowout                          77                          41
Other/unknown                   422                          66
Total                         1,864                         166
Source: Data extracted from Robert L. Simison, "Ford Steps Up Recall Without Firestone," The Wall Street Journal, August 14, 2000, p. A3.

a. Construct a Pareto chart for the number of warranty claims by tire size. What tire size accounts for most of the claims?
b. Construct a pie chart to display the percentage of the total number of warranty claims for the 23575R15 tires that come from the ATX model and Wilderness model. Interpret the chart.
c. Construct a Pareto chart for the type of incident causing the warranty claim for the ATX model. Does a certain type of incident account for most of the claims?
d. Construct a Pareto chart for the type of incident causing the warranty claim for the Wilderness model. Does a certain type of incident account for most of the claims?

2.99 One of the major measures of the quality of service provided by an organization is the speed with which the
organization responds to customer complaints. A large family-held department store selling furniture and flooring, including carpet, had undergone a major expansion in the last several years. In particular, the flooring department had expanded from 2 installation crews to an installation supervisor, a measurer, and 15 installation crews. A business objective of the company was to reduce the time between when the complaint is received and when it is resolved. During a recent year, the company received 50 complaints concerning carpet installation. The data from the 50 complaints, organized in [file name illegible], represent the number of days between the receipt of the complaint and the resolution of the complaint:

54   5  35 137  31  27 152   2 123  81  74  27
11  19 126 110 110  29  61  35  94  31  26   5
12   4 165  32  29  28  29  26  25   1  14  13
13  10   5  27   4  52  30  22  36  26  20  23
33  68

a. Construct a frequency distribution and a percentage distribution.
b. Construct a histogram and a percentage polygon.
c. Construct a cumulative percentage distribution and plot a cumulative percentage polygon (ogive).
d. On the basis of the results of (a) through (c), if you had to tell the president of the company how long a customer should expect to wait to have a complaint resolved, what would you say? Explain.

2.100 Data concerning 128 of the best-selling domestic beers in the United States are contained in [file name illegible]. The values for three variables are included: percentage alcohol, number of calories per 12 ounces, and number of carbohydrates (in grams) per 12 ounces.
Source: Data extracted from www.Beer100.com, June 15, 2009.
a. Construct a percentage histogram for each of the three variables.
b. Construct three scatter plots: percentage alcohol versus calories, percentage alcohol versus carbohydrates, and calories versus carbohydrates.
c. Discuss what you learn from studying the graphs in (a) and (b).

2.101 The file [file name illegible] contains the state cigarette tax for each state as of April 1, 2009.
a. Construct an ordered array.
b. Plot a percentage histogram.
c. What conclusions can you reach about the differences in the state cigarette tax between the states?

2.102 The file [file name illegible] contains the yields for a money market account, a one-year certificate of deposit (CD), and a five-year CD, for 23 banks in the metropolitan New York area, as of May 28, 2009.
Source: Data extracted from www.Bankrate.com, May 28, 2009.
a. Construct a stem-and-leaf display for each of the three variables.
b. Construct three scatter plots: money market account versus one-year CD, money market account versus five-year CD, and one-year CD versus five-year CD.
c. Discuss what you learn from studying the graphs in (a) and (b).

2.103 The file [file name illegible] includes the total compensation (in $) of CEOs of large public companies in 2008.
Source: Data extracted from D. Jones and B. Hansen, "CEO Pay Dives in a Rough 2008," www.usatoday.com, May 1, 2009.
a. Construct a frequency distribution and a percentage distribution.
b. Construct a histogram and a percentage polygon.
c. Construct a cumulative percentage distribution and plot a cumulative percentage polygon (ogive).
d. Based on (a) through (c), what conclusions can you reach concerning CEO compensation in 2008?

2.104 Studies conducted by a manufacturer of "Boston" and "Vermont" asphalt shingles have shown product weight to be a major factor in customers' perception of quality. Moreover, the weight represents the amount of raw materials being used and is therefore very important to the company from a cost standpoint. The last stage of the assembly line packages the shingles before the packages are placed on wooden pallets. The variable of interest is the weight in pounds of the pallet, which for most brands holds 16 squares of shingles. The company expects pallets of its "Boston" brand-name shingles to weigh at least 3,050 pounds but less than 3,260 pounds. For the company's "Vermont" brand-name shingles, pallets should weigh at least 3,600 pounds but less than 3,800. Data are collected from a sample of 368 pallets of "Boston" shingles and 330 pallets of "Vermont" shingles and stored in [file name illegible].
a. For the "Boston" shingles, construct a frequency distribution and a percentage distribution having eight class intervals, using 3,015, 3,050, 3,085, 3,120, 3,155, 3,190, 3,225, 3,260, and 3,295 as the class boundaries.
b. For the "Vermont" shingles, construct a frequency distribution and a percentage distribution having seven class intervals, using 3,550, 3,600, 3,650, 3,700, 3,750, 3,800, 3,850, and 3,900 as the class boundaries.
c. Construct percentage histograms for the "Boston" shingles and for the "Vermont" shingles.
d. Comment on the distribution of pallet weights for the "Boston" and "Vermont" shingles. Be sure to identify the percentage of pallets that are underweight and overweight.

2.105 The file [file name illegible] includes the overall cost index, the monthly rent for a two-bedroom apartment, the cost of a cup of coffee with service, the cost of a fast-food hamburger meal, the cost of dry-cleaning a men's blazer, the cost of toothpaste, and the cost of movie tickets in 10 different cities.
a. Construct six separate scatter plots. For each, use the overall cost index as the Y axis. Use the monthly rent for a two-bedroom apartment, the costs of a cup of coffee with service, a fast-food hamburger meal, dry-cleaning a men's blazer, toothpaste, and movie tickets as the X axis.
b. What conclusions can you reach about the relationship of the overall cost index to these six variables?

2.106 The file [file name illegible] contains calorie and cholesterol information concerning popular protein foods (fresh red meats, poultry, and fish).
Source: U.S. Department of Agriculture.
a. Construct a percentage histogram for the number of calories.
b. Construct a percentage histogram for the amount of cholesterol.
c. What conclusions can you reach from your analyses in (a) and (b)?

2.107 The file [file name illegible] contains the weekly average price of gasoline in the United States from January 1, 2007, to January 12, 2009. Prices are in dollars per gallon.
Source: U.S. Department of Energy, www.eia.doe.gov, January 14, 2009.
a. Construct a time-series plot.
b. What pattern, if any, is present in the data?

2.108 The file [file name illegible] contains data for the amount of soft drink filled in a sample of 50 consecutive 2-liter bottles. The results are listed horizontally in the order of being filled:

2.109 2.086 2.066 2.075 2.065 2.057 2.052 2.044 2.036 2.038
2.031 2.029 2.025 2.029 2.023 2.020 2.015 2.014 2.013 2.014
2.012 2.012 2.012 2.010 2.005 2.003 1.999 1.996 1.997 1.992
1.994 1.986 1.984 1.981 1.973 1.975 1.971 1.969 1.966 1.967
1.963 1.957 1.951 1.951 1.947 1.941 1.941 1.938 1.908 1.894

a. Construct a time-series plot for the amount of soft drink on the Y axis and the bottle number (going consecutively from 1 to 50) on the X axis.
b. What pattern, if any, is present in these data?
c. If you had to make a prediction about the amount of soft drink filled in the next bottle, what would you predict?
d. Based on the results of (a) through (c), explain why it is important to construct a time-series plot and not just a histogram, as was done in Problem 2.59 on page 48.

2.109 The S&P 500 Index tracks the overall movement of the stock market by considering the stock prices of 500 large corporations. The file [file name illegible] contains weekly data for this index as well as the daily closing stock prices for three companies from January 2, 2008, to January 12, 2009. The following variables are included:
WEEK: Week ending on date given
S&P: Weekly closing value for the S&P 500 Index
GE: Weekly closing stock price for General Electric
IBM: Weekly closing stock price for IBM
AAPL: Weekly closing stock price for Apple
Source: Data extracted from finance.yahoo.com, January 13, 2009.
a. Construct a time-series plot for the weekly closing values of the S&P 500 Index, General Electric, IBM, and Apple.
b. Explain any patterns present in the plots.
c. Write a short summary of your findings.

2.110 (Class Project) Have each student in the class respond to the question "Which carbonated soft drink do you most prefer?" so that the teacher can tally the results into a summary table.
a. Convert the data to percentages and construct a Pareto chart.
b. Analyze the findings.

2.111 (Class Project) Let each student in the class be cross-classified on the basis of gender (male, female) and current employment status (yes, no) so that the teacher can tally the results.
a. Construct a table with either row or column percentages, depending on which you think is more informative.
b. What would you conclude from this study?
c. What other variables would you want to know regarding employment in order to enhance your findings?

REPORT WRITING EXERCISES
2.112 Referring to the results from Problem 2.104 on page 63 concerning the weight of "Boston" and "Vermont" shingles, write a report that evaluates whether the weights of the pallets of the two types of shingles are what the company expects. Be sure to incorporate tables and charts into the report.

2.113 Referring to the results from Problem 2.98 on page 62 concerning the warranty claims on Firestone tires, write a report that evaluates warranty claims on Firestone tires sold on Ford SUVs. Be sure to incorporate tables and charts into the report.

TEAM PROJECT
The file [file name illegible] contains information regarding nine variables from a sample of 180 mutual funds:
Fund number: Identification number for each bond fund
Type: Bond fund type (intermediate government or short-term corporate)
Assets: In millions of dollars
Fees: Sales charges (no or yes)
Expense ratio: Ratio of expenses to net assets in percentage
Return 2008: Twelve-month return in 2008
Three-year return: Annualized return, 2006-2008
Five-year return: Annualized return, 2004-2008
Risk: Risk-of-loss factor of the mutual fund (below average, average, or above average)

2.114 For this problem, consider the expense ratio.
a. Construct a percentage histogram.