Cross-Tabulation & Z-test <Statistics>
LEARNING OUTCOMES
Know what descriptive statistics are and why they are used
Create and interpret tabulation tables
Use cross-tabulations to display relationships
Perform basic data transformations
Understand the basics of testing hypotheses using inferential statistics
Perform a Z-test
The Nature of Descriptive Analysis
- Descriptive Analysis
The elementary transformation of raw data in a way that describes the basic characteristics such as central tendency, distribution, and variability.
- Histogram
A graphical way of showing a frequency distribution in which the height of a bar corresponds to the observed frequency of the category.
EXHIBIT 14.1 Levels of Scale Measurement and Suggested Descriptive Statistics
Cross-Tabulation
- Cross-Tabulation
Addresses research questions involving relationships among multiple less-than interval variables.
Results in a combined frequency table displaying one variable in rows and another variable in columns.
- Contingency Table
A data matrix that displays the frequency of some combination of responses to multiple variables.
- Marginals
Row and column totals in a contingency table, which are shown in its margins.
Cross-Tabulation Table
- Did you watch the movie Into The Woods? Yes No
- What’s your gender? Male Female
(Observed distribution)
Gender | No | Yes | Total |
Male | 14 | 3 | 17 |
Female | 15 | 17 | 32 |
Total | 29 | 20 | 49 |
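The marginals of the movie table can be recomputed from its four cell counts. The following is a minimal Python sketch of that arithmetic; the dictionary layout and variable names are illustrative assumptions, not part of the original slides.

```python
# Observed counts from the Into The Woods table: counts[gender][watched]
counts = {
    "Male": {"No": 14, "Yes": 3},
    "Female": {"No": 15, "Yes": 17},
}

# Row marginals: total respondents of each gender
row_totals = {gender: sum(cells.values()) for gender, cells in counts.items()}

# Column marginals: total respondents giving each answer
col_totals = {
    answer: sum(counts[gender][answer] for gender in counts)
    for answer in ("No", "Yes")
}

# Grand total: every respondent in the table
grand_total = sum(row_totals.values())

print(row_totals)   # {'Male': 17, 'Female': 32}
print(col_totals)   # {'No': 29, 'Yes': 20}
print(grand_total)  # 49
```

The row totals and the column totals must each add up to the same grand total, which is a quick check that the table was tallied correctly.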
Cross-tab: Project Assignment
- Thirty respondents were asked whether they have access to the 4G network and whether they have used mobile banking services.
- The results showed that 11 people do not have access to 4G and have not used mobile banking, 4 people have access to 4G but have not used mobile banking, 12 people have access to 4G and have used mobile banking, and 3 people do not have access to 4G but have used mobile banking (using a friend's smartphone).
- Present the results in a cross-tabulation table in the Project Assignment.
Cross-Tabulation Table
- Convert frequency table to percentage table.
Statistical base – the number of respondents or observations (in a row or column) used as a basis for computing percentages.
- What was the percentage of males who watched the movie?
- What was the percentage of moviegoers who were male?
Cross-Tabulation Table
- % of males watched the movie.
- % of moviegoers were male.
Gender | No | Yes | Total (base) |
Male | | 3/17 = 17.6% | |
Female | | | |
Total | | | |
Gender | No | Yes | Total |
Male | | 3/20 = 15% | |
Female | | | |
Total (base) | | | |
Compare these two tables. Which one does a better job of displaying the relationship between gender and moviegoing, i.e., whether a person's gender affects whether they watched the movie?
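The two percentages differ only in the statistical base used as the denominator. The sketch below, in Python, computes both versions from the observed counts; the function names and the dictionary layout are assumptions made for illustration.

```python
# Observed counts from the Into The Woods table: counts[gender][watched]
counts = {
    "Male": {"No": 14, "Yes": 3},
    "Female": {"No": 15, "Yes": 17},
}

def row_percentages(table):
    """Percentages with each row total (here, each gender) as the base."""
    result = {}
    for row, cells in table.items():
        base = sum(cells.values())
        result[row] = {col: 100 * n / base for col, n in cells.items()}
    return result

def column_percentages(table):
    """Percentages with each column total (here, each answer) as the base."""
    columns = next(iter(table.values())).keys()
    bases = {col: sum(cells[col] for cells in table.values()) for col in columns}
    return {
        row: {col: 100 * n / bases[col] for col, n in cells.items()}
        for row, cells in table.items()
    }

by_gender = row_percentages(counts)
by_answer = column_percentages(counts)

print(f"{by_gender['Male']['Yes']:.1f}% of males watched the movie")  # 17.6%
print(f"{by_answer['Male']['Yes']:.1f}% of moviegoers were male")     # 15.0%
```

The same cell count (3 males who watched) yields 17.6% or 15% depending on whether the base is the 17 males or the 20 moviegoers, so the choice of base should match the question being asked.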
Cross-Tabulation Table
- Percentages are computed in the direction of the “independent” variable, e.g., gender.
Gender | No | Yes | Total |
Male | 14/29=48% | 3/20=15% | |
Female | 15/29=52% | 17/20=85% | |
Total | 29/29=100% | 20/20=100% | |
Note: as you learned in Ch. 9 (Experiments) and the project assignment, gender is NOT a true independent variable, because we cannot alter or manipulate participants' gender in a marketing experiment. In data analysis, however, we treat gender as an independent variable in the sense that one's gender may have an effect on the DV. In this movie example, it makes sense to say that a person's gender influences whether that person decides to watch the movie. It does NOT work the other way: watching a movie cannot affect one's gender.
- What would be appropriate “independent” variable and dependent variable?
- Convert the 4G x Mobile Banking cross-tab into a percentage table.
Cross-tab: Project Assignment
Now, are you ready to convert the 4G x Mobile Banking table to a percentage table? Follow the instructions above.
Data Transformation
- Data Transformation
Process of changing the data from their original form to a format suitable for performing a data analysis addressing research objectives.
Recoding
Creating summated scales
Collapsing adjacent categories
Creating index numbers, e.g., consumer price index (CPI)
CPI: a measure of changes in the prices paid by urban consumers for a representative basket of goods and services.
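The four kinds of data transformation listed on this slide can each be illustrated in a few lines. The following Python sketch uses made-up survey responses and basket prices; every value, field name, and function name in it is hypothetical and serves only to show the mechanics.

```python
# Hypothetical responses to three 5-point items; item q3 is negatively worded.
respondents = [
    {"q1": 4, "q2": 5, "q3": 1},
    {"q1": 2, "q2": 3, "q3": 4},
    {"q1": 5, "q2": 4, "q3": 2},
]

def reverse_code(score, scale_max=5):
    """Recoding: flip a negatively worded item so high always means favorable."""
    return scale_max + 1 - score

def collapse(score):
    """Collapsing adjacent categories: turn five scale points into three groups."""
    if score <= 2:
        return "low"
    if score == 3:
        return "neutral"
    return "high"

for r in respondents:
    r["q3_recoded"] = reverse_code(r["q3"])
    # Summated scale: add the item scores after recoding
    r["attitude"] = r["q1"] + r["q2"] + r["q3_recoded"]
    r["q1_group"] = collapse(r["q1"])

# Index number: express each year's price relative to a base year (= 100)
prices = {2020: 50.0, 2021: 52.0, 2022: 56.0}  # hypothetical basket prices
base_year = 2020
index = {year: 100 * p / prices[base_year] for year, p in prices.items()}

print([r["attitude"] for r in respondents])  # [14, 7, 13]
print(index)
```

Recoding has to happen before the items are summed; otherwise the negatively worded item would pull the summated score in the wrong direction.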
Computer Programs for Analysis
- Statistical Packages
Spreadsheets
Excel
Statistical software:
SPSS (Statistical Package for the Social Sciences)
PASW (Predictive Analytics Software)
SAS
MINITAB
Hypothesis Testing Using Basic Statistics
- Univariate Statistical Analysis
Tests of hypotheses involving only one variable.
- Bivariate Statistical Analysis
Tests of hypotheses involving two variables.
E.g., t-test, ANOVA, correlation
- Multivariate Statistical Analysis
Statistical analysis involving three or more variables or sets of variables.
E.g., Multiple regression
Hypothesis Testing Procedure
- The specifically stated hypothesis is derived from the research objectives.
- A sample is obtained and the relevant variable is measured.
- The measured sample value is compared to the value either stated explicitly or implied in the hypothesis.
If the value is consistent with the hypothesis, the hypothesis is supported.
If the value is not consistent with the hypothesis, the hypothesis is not supported.
All these activities center on the hypothesis.
Null Hypothesis vs. Alternative Hypothesis
- Null hypothesis (H0): A statement about a status quo (asserting that any change from what has been thought to be true will be due entirely to random sampling errors).
E.g., H0: µ = 100
- Alternative hypothesis (H1): A statement indicating the opposite of the null hypothesis.
E.g., H1: µ ≠ 100
Hypothesis Testing (HT)
- The purpose of HT is to determine which of the two hypotheses is correct.
- Significance level: The critical probability in choosing between the null and alternative hypotheses.
- α (Greek letter alpha) = .05
- The probability level that is too low to warrant support of the null hypothesis.
Hypothesis Testing (HT)
- p-value
Probability value, or the observed or computed significance level.
p-values are compared to significance levels to test hypotheses.
If p < .05, the null hypothesis is rejected and the alternative hypothesis is supported.
Univariate Hypothesis Testing
- Is the sample mean significantly different from the hypothesized population mean?
Is the sample a part of the population?
- Population mean IQ: µ=100
- Sample mean (e.g., SJSU) IQ: X̄ = 105
- Is IQ score 105 statistically significantly different from IQ score 100?
“Well, Are They Satisfied or Not?”
- Suppose Best Buy wants to know whether its customers were satisfied with their “Black Friday” shopping in Best Buy stores.
- Unsatisfied 1 2 3 4 5 Satisfied
- The average score of 225 shoppers is 3.3.
- Is a satisfaction score of 3.3 good or bad?
- Need to compare with other scores.
Step 1*: Stating Hypotheses
- H0: µ=3.0 (customers were neither unsatisfied nor satisfied.)
- H1: µ≠3.0 (customers were satisfied with their Black Friday shopping.)
Step 2: Deciding on Region of Rejection
[Figure: Z-distribution centered at 0, with the axis labeled in critical Z-scores (-1.96 and 1.96) and in raw scores. The darkly shaded areas show the region of rejection, with α/2 = .025 in each tail.]
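The critical Z-scores of -1.96 and 1.96 follow from the significance level. The short Python sketch below derives them with the standard library's `statistics.NormalDist`, assuming a two-tailed test with α = .05; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05                    # significance level for a two-tailed test
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Each tail holds alpha / 2 = .025, so the upper critical value cuts off
# the top 2.5% of the Z-distribution.
upper_critical = standard_normal.inv_cdf(1 - alpha / 2)
lower_critical = -upper_critical

print(round(lower_critical, 2), round(upper_critical, 2))  # -1.96 1.96
```

Changing `alpha` to 0.01 moves the critical values further out (to roughly ±2.58), which makes the null hypothesis harder to reject.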
HT: Best Buy Example
- Sample size n=225
- Sample mean X̄ = 3.3
- Sample standard deviation S=1.5
Step 3*: Calculating z-statistic
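z-statistic: $Z_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}} = \frac{3.3 - 3.0}{0.1} = 3.0$
Standard error of the mean: $S_{\bar{X}} = \frac{S}{\sqrt{n}} = \frac{1.5}{\sqrt{225}} = 0.1$
The standard deviation of the sampling distribution.
(obs = observed, as opposed to the expected critical values)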
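Step 4: Comparing Z-Statistic to Critical Value
[Figure: Z-distribution with critical Z-scores -1.96 and 1.96 around 0. The observed z-statistic of 3.0 lies beyond 1.96, in the region of rejection.]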
Step 5*: Making a Decision
- Zobs=3.0 > Z.05=1.96
- Therefore, p<.05. This means that if µ=3.0 were true, the chance of observing a sample mean this far from 3.0 would be less than 5%.
- Reject H0
- This suggests that Best Buy customers were satisfied with their “Black Friday” shopping.
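The Best Buy calculation can be reproduced in a few lines of Python. This is a sketch that assumes the usual definitions: standard error = S / √n and z = (sample mean - hypothesized mean) / standard error. The two-tailed p-value is obtained from the standard library's `math.erfc`, and the variable names are illustrative.

```python
import math

n = 225            # sample size
sample_mean = 3.3  # average satisfaction score
s = 1.5            # sample standard deviation
mu_0 = 3.0         # value of the mean under H0

standard_error = s / math.sqrt(n)  # 1.5 / 15 = 0.1
z_obs = (sample_mean - mu_0) / standard_error

# Two-tailed p-value: probability of a z at least this extreme if H0 is true
p_value = math.erfc(abs(z_obs) / math.sqrt(2))

print(f"z = {z_obs:.2f}, p = {p_value:.4f}")  # z = 3.00, p = 0.0027
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

Both routes lead to the same decision: z = 3.0 exceeds the critical value of 1.96, and equivalently the p-value of about .0027 is below .05.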
Hypothesis Testing Procedure: Z Test
State hypotheses.
Null: H0: µ = (hypothesized value)
Alternative: H1: µ ≠ (hypothesized value)
Decide on the region of rejection, i.e., find the critical value(s) for the significance level α = .05.
Z-distribution: -1.96 and 1.96
Calculate the z-statistic.
Compare the z-statistic with the critical values.
Make a decision.
If the z-statistic falls within [-1.96, 1.96], fail to reject H0.
If the z-statistic falls outside [-1.96, 1.96], reject H0.
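The procedure above can be wrapped in one small function. The Python sketch below is one possible arrangement, not a standard library routine: the function name `z_test` and its parameters are assumptions, and it takes the standard error of the mean as an input rather than computing it.

```python
from statistics import NormalDist

def z_test(sample_mean, mu_0, standard_error, alpha=0.05):
    """Two-tailed z-test; returns (z statistic, critical value, decision)."""
    z_obs = (sample_mean - mu_0) / standard_error
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Reject only when z falls outside [-critical, critical]
    if abs(z_obs) > critical:
        decision = "reject H0"
    else:
        decision = "fail to reject H0"
    return z_obs, critical, decision

# Best Buy example: mean 3.3, H0 mean 3.0, standard error 1.5 / sqrt(225) = 0.1
z, crit, decision = z_test(3.3, 3.0, 0.1)
print(f"z = {z:.2f}, critical = ±{crit:.2f}: {decision}")
```

Passing a different `alpha` changes only the critical value; the z-statistic itself depends on the data alone.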
Z-test: Project Assignment
From the past 5 years of experience, a professor knows that the average number of hours his students spend on the final project is 15 (standard error of the mean = 0.9). To see whether the time his students spend on the project has decreased this semester, he randomly sampled 50 of his students and calculated the average as 14 hours.
State an appropriate null hypothesis and alternative hypothesis.
Find the critical values at significance level α = .05.
Calculate the z-statistic.
Compare the z-statistic with the critical values.
Make a decision.
Standard error of the mean: $S_{\bar{X}} = \frac{S}{\sqrt{n}}$
z-statistic: $Z_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}}$