# week 1 reflection

Focus on what you learned that made an impression, what may have surprised you, and what you found particularly beneficial and why. Specifically:

• What did you find that was really useful, or that challenged your thinking?
• What are you still mulling over?
• Was there anything that you may take back to your classroom?
• Is there anything you would like to have clarified?
• Welcome to Statistics – the study of data! In this module we explore the GAISE report and what is involved in the process of statistical inquiry. We will look at study design and sampling. We will review the measures of center (mean, median and mode) in the context of sampling design and inquiry. By the end of this week’s investigations, you will be prepared to begin thinking about your own comparative study for the final project in the course.

• ### Weekly Goals

• Become acquainted and reacquainted with colleagues
• Review basic statistics – measures of center
• Look over the GAISE report and the process of statistics
• Distinguish between Categorical and Quantitative Data and their representations
• For Quantitative Data distinguish between continuous and discrete data and their representations
• Distinguish between an observational study and an experiment
• Identify design characteristics of a good study
• Identify different types of sampling and the strengths and weaknesses of each type
• Identify sources of bias in a study
• Define and identify lurking variables
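
As a quick refresher on the measures of center named in the goals above, here is a minimal sketch using Python's standard `statistics` module. The data set is made up for illustration (it is not course data), and it includes one unusually large value to show how the mean and the median can differ.

```python
from statistics import mean, median, multimode

# Hypothetical data: number of siblings reported by 9 students (invented for illustration).
siblings = [0, 1, 1, 2, 2, 2, 3, 4, 12]

center_mean = mean(siblings)        # arithmetic average; pulled upward by the value 12
center_median = median(siblings)    # middle value of the sorted data
center_modes = multimode(siblings)  # most frequent value(s), returned as a list

print("mean:", center_mean)
print("median:", center_median)
print("mode(s):", center_modes)
```

In this made-up sample the mean is larger than the median because a single large value pulls the average up, which is one reason to look at more than one measure of center.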

• ### Ice-Breaker and Personal Reflection [Wiki]

Part One: Myers-Briggs

Take the Myers-Briggs quiz.

We will get to know each other by creating a personal Wiki profile page.

Post your Myers-Briggs type, along with a photo of yourself, to your Wiki page.

Respond to the following questions:

• Do you agree or disagree with your Myers-Briggs profile? Why?

• How does your profile correctly reflect your role when you work with others or in a group?

Part Two: Personal Reflection

In the Personal Reflection section of your wiki profile, reflect on what statistics means to you. Some questions to consider (you need not address them all nor limit yourself to these):

• What do you think of when you hear the word “statistics”?
• How do you use or interact with statistics in your personal life?
• How do you approach or incorporate statistics (if at all) into your classes?
• What do your students understand about statistics?
• How do you use statistics (if at all) to inform your teaching?
• What do you know about the Statistics and Probability section of the Common Core State Standards for Mathematics for your grade band? How is this affecting your classes?

Please complete this reflection BEFORE completing any other investigations for this week. It should be a snapshot of you “before this course”; something we may return to at the end of this course to see how your thoughts and understandings may have changed.

• ### DoW #1 aka Data of the Week

DoW stands for "Data of the Week".

Each week, you will be presented with data that relates to the week’s investigations. The form of the data will vary from week to week. Sometimes, it will involve gathering data yourself; other times the data will be presented to you in one form or another. It is important for you to become familiar with the Data of the Week at the start of each week; this data will be referred to during the investigations & will be central to the activities and discussions.

“Become familiar with the data” has many meanings. Some things you should do:

• View the data in ways that are meaningful to you. You might organize it, graph it, or use some technology to help you do these things.
• In your journal, write down interesting things you see in the data
• In your journal, write down questions that you have about the data

Spending 15 minutes with the DoW at the start of the week will prepare you to engage with the data during the week’s investigations.

This Week’s DoW examines the question: Are the conclusions about the health effects of passive smoking associated with the affiliation of the author(s) of the article?

Bias in Smoking Article for Week One

A study¹ of 106 reviews about the health effects of passive smoke looked at the association between the conclusion of the review and the affiliation of the author(s). A summary of the study is provided below:

Why Review Articles on the Health Effects of Passive Smoking Reach Different Conclusions
Deborah E. Barnes, MPH; Lisa A. Bero, PhD

Objective: To determine whether the conclusions of review articles on the health effects of passive smoking are associated with article quality, the affiliations of their authors, or other article characteristics.

Data Sources: Review articles published from 1980 to 1995 were identified through electronic searches of MEDLINE and EMBASE and from a database of symposium proceedings on passive smoking.

Article Selection: An article was included if its stated or implied purpose was to review the scientific evidence that passive smoking is associated with 1 or more health outcomes. Articles were excluded if they did not focus specifically on the health effects of passive smoking or if they were not written in English.

Data Extraction: Review article quality was evaluated by 2 independent assessors who were trained, followed a written protocol, had no disclosed conflicts of interest, and were blinded to all study hypotheses and identifying characteristics of articles. Article conclusions were categorized by the 2 assessors and by one of the authors. Author affiliation was classified as either tobacco industry affiliated or not, based on whether the authors were known to have received funding from or participated in activities sponsored by the tobacco industry. Other article characteristics were classified by one of the authors using predefined criteria.

Data Synthesis: A total of 106 reviews were identified. Overall, 37% (39/106) of reviews concluded that passive smoking is not harmful to health; 74% (29/39) of these were written by authors with tobacco industry affiliations. In multiple logistic regression analyses controlling for article quality, peer review status, article topic, and year of publication, the only factor associated with concluding that passive smoking is not harmful was whether an author was affiliated with the tobacco industry (odds ratio, 88.4; 95% confidence interval, 16.4-476.5; P < .001).

Conclusions: The conclusions of review articles are strongly associated with the affiliations of their authors. Authors of review articles should disclose potential financial conflicts of interest, and readers of review articles should consider authors’ affiliations when deciding how to judge an article’s conclusions. JAMA. 1998;279:1566-1570

¹ Barnes, Deborah E., and Lisa A. Bero. 1998. Why review articles on the health effects of passive smoking reach different conclusions. JAMA. 279(19): 1566-1570.
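
The two percentages in the Data Synthesis paragraph can be re-derived from the counts given there. The short sketch below does only that arithmetic; the variable names are ours, and the last two counts are obtained by subtraction rather than quoted from the abstract.

```python
# Counts taken from the Data Synthesis paragraph of the abstract.
total_reviews = 106
not_harmful = 39             # reviews concluding passive smoking is not harmful
not_harmful_affiliated = 29  # of those 39, written by tobacco-affiliated authors

# Share of ALL reviews that concluded "not harmful".
pct_not_harmful = 100 * not_harmful / total_reviews

# Share of the "not harmful" reviews (not of all reviews) with affiliated authors.
pct_affiliated_given_not_harmful = 100 * not_harmful_affiliated / not_harmful

# Counts implied by subtraction (not stated directly in the abstract).
remaining_reviews = total_reviews - not_harmful                  # 67
not_harmful_unaffiliated = not_harmful - not_harmful_affiliated  # 10

print(round(pct_not_harmful), round(pct_affiliated_given_not_harmful))
```

Note that the two percentages have different denominators: 37% is out of all 106 reviews, while 74% is out of only the 39 reviews that concluded passive smoking is not harmful.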

The data from the study is summarized in the table & graphs below:

• ### Investigation 1: The Process of Statistics

One way to think about statistics is as a problem-solving tool. The following four-step approach to statistics is taken from the Annenberg Learning Math program, Data Analysis, Statistics, and Probability. We will refer to sections of this program throughout the course. (It will be referred to as “The Annenberg Series”.)

Four components make a problem statistical: the way in which you ask the question, the role and nature of the data, the particular ways in which you examine the data, and the types of interpretations you make from the investigation. The table below summarizes the four steps of statistical problem solving:

Four Steps of Statistical Problem Solving:

Similarly, the GAISE Report (A Curriculum Framework for PreK-12 Statistics Education. 2005. Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., Scheaffer, R.) says:

The Investigative Process includes:

1. Formulate the Question

• Teachers help pose questions (Questions in contexts of interest to the student)
• Students distinguish between statistical solution and fixed answer

2. Collect Data to Answer the Question

• Students conduct a census of the Classroom
• Students understand individual-to-individual variability
• Students conduct simple experiments with non-random assignment of treatment
• Students understand variability attributable to an experimental condition
• Students should understand what constitutes good practice in conducting a sample survey
• Students should understand what constitutes good practice in conducting an experiment
• Students should understand what constitutes good practice in conducting an observational study
• Students should be able to design and implement a data collection plan for statistical studies, including observational studies, sample surveys, and simple comparative experiments

3. Analyze the Data

• Students should be able to summarize numerical and categorical data using tables, graphical displays, and numerical summary statistics such as the mean and standard deviation
• Students should understand how sampling distributions (developed through simulation) are used to describe sample-to-sample variability
• Students should be able to recognize association between two categorical variables
• Students should be able to describe relationships between two numerical variables using linear regression and the correlation coefficient
• Students observe association between two variables
• Students use tools for exploring distributions and association, including:
• Bar Graph
• Dotplot
• Stem and Leaf Plot
• Scatterplot
• Tables (using counts)
• Mean, Median, …, Range
• Modal Category
• Histograms
• The IQR (Interquartile Range)
• MAD (Mean Absolute Deviation)
• Five-Number Summaries and Boxplots
• Students acknowledge sampling error
• Students quantify the strength of association between two variables, develop simple models for association between two numerical variables, and use expanded tools for exploring association including:
• Contingency Tables for two categorical variables
• Time Series Plots
• Simple lines for modeling association between two numerical variables
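
To make a few of the tools listed above concrete (the five-number summary, the IQR, and the MAD), here is a minimal sketch on made-up data. It assumes the "median of each half" convention for quartiles that is common in school texts; software packages may use other conventions and give slightly different quartiles.

```python
from statistics import mean, median

# Hypothetical data set (invented for illustration), sorted for the quartile step.
data = sorted([6, 1, 13, 4, 7, 3, 8, 6])
n = len(data)

# Quartiles by the "median of each half" method (the middle value is
# excluded from both halves when n is odd).
lower_half = data[: n // 2]
upper_half = data[(n + 1) // 2:]
q1, q2, q3 = median(lower_half), median(data), median(upper_half)

five_number = (data[0], q1, q2, q3, data[-1])  # min, Q1, median, Q3, max
iqr = q3 - q1                                  # interquartile range

# Mean Absolute Deviation: average distance of each value from the mean.
center = mean(data)
mad = mean([abs(x - center) for x in data])

print("five-number summary:", five_number)
print("IQR:", iqr)
print("MAD:", mad)
```

The five-number summary is exactly what a boxplot draws, so the same numbers can be used to sketch one by hand.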

4. Interpret Results

• Students describe differences between two or more groups with respect to center, spread, and shape
• Students acknowledge that a sample may not be representative of a larger population
• Students understand basic interpretations of measures of association
• Students begin to distinguish between an observational study and a designed experiment
• Students begin to distinguish between “association” and “cause and effect”
• Students recognize sampling variability in summary measures such as the sample mean and the sample proportion
• Students should understand the meaning of statistical significance and the difference between statistical significance and practical significance
• Students should understand the role of P-values in determining statistical significance
• Students should be able to interpret the margin of error associated with an estimate of a population characteristic

You may download the entire report from the ASA website.

• ### Inv. I, Activity B: Sampling, Bias, and the Lurking Variable

A solid statistical study requires quality data to ensure a meaningful interpretation. This may sound simple, but it is no simple task to ensure that the data gathered represents what you believe it represents. Each decision you make, from the initial framing of the question to the selection of the sample to the actual collection of the data, can influence the quality of the data. Bias refers to systematic issues in the design of a study that affect the quality of the data.

In this activity, we look at sampling and bias.

• View the presentation on Bias Week One - Understanding Bias. This presentation provides an overview of the issue of bias in statistics. It reviews some points brought up in Activity A, and previews some of the information in the following reading.
• Read this chapter from CK-12 on Study Design. As you read, let the following questions guide you. Reflect on them in your journal.

• What is the difference between a survey and a census?
• What is a sample? Why do we use them?
• What is a representative sample? What is a random sample?
• Why is a random sampling method desirable?
• What is Random Sampling? Stratified Sampling?
• Sampling Biases:
• Undercoverage (incorrect frame)
• Convenience Sampling
• Size Bias
• Non-Response Bias
• How can you reduce sample bias?
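
As a companion to the reading questions above on random and stratified sampling, the following sketch draws both kinds of sample from a made-up sampling frame. The frame, the grade labels, and the 10% sampling fraction are all assumptions chosen for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 300 students, each tagged with a grade level.
grades = ["9th"] * 150 + ["10th"] * 90 + ["11th"] * 60
frame = [(f"student_{i}", grade) for i, grade in enumerate(grades)]

# Simple random sample: every student has the same chance of being chosen,
# so the grade mix of the sample varies from draw to draw.
srs = random.sample(frame, 30)

# Stratified sample: split the frame by grade, then draw 10% at random
# within each grade, so the grade mix matches the frame exactly.
strata = {}
for student in frame:
    strata.setdefault(student[1], []).append(student)

stratified = []
for grade, members in strata.items():
    stratified.extend(random.sample(members, len(members) // 10))

counts = {g: sum(1 for _, grade in stratified if grade == g) for g in strata}
print("stratified sample sizes by grade:", counts)
```

Both samples have 30 students, but only the stratified one guarantees that each grade is represented in proportion to its size in the frame.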

A Lurking Variable is a variable that directly affects both variables under consideration, making them appear to be related when they may not actually be related. The presence of possible Lurking Variables can influence the quality of a study. The YouTube video, The Lurking Variable, Part 1 discusses a hypothetical study on the relationship between depression and cancer diagnoses, highlighting the relevance of considering lurking variables in your study design and interpretation.

After watching the video, reflect on the following in your journal:

• What is the lurking variable in the study?
• What makes it a lurking variable?
• How does the presence of this lurking variable influence the interpretation of the data?

Optional: You can watch the second part of this video, The Lurking Variable, Part 2.
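
The idea of a lurking variable can also be explored with a small simulation. In the sketch below, which is a generic illustration and not a model of the study in the video, a hidden variable `z` drives both `x` and `y`, while `x` and `y` have no direct effect on each other. All names and numbers are assumptions.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# z is the lurking variable; x and y each depend on z plus independent noise.
z = [random.gauss(0, 1) for _ in range(2000)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]

# x and y appear associated even though neither influences the other.
r_xy = pearson(x, y)

# Remove the part of x and y explained by z (possible here only because we
# built the data and know the relationship); the association disappears.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
r_resid = pearson(x_resid, y_resid)

print("correlation of x and y:", round(r_xy, 2))
print("correlation after accounting for z:", round(r_resid, 2))
```

In a real study the lurking variable is usually not measured, so it cannot simply be subtracted out; that is why it has to be considered at the design stage.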

• ### Inv. I, Activity C: The Process in DoW #1

In this activity, we will look at the Process of Statistics in DoW #1. Complete the table below for DoW #1 in your journal, by addressing the questions in each step.

Post your answers to your Discussion Board group thread, The Process of DoW #1, by Thursday, 12 PM EST. Review the posts of your group. Engage in a meaningful discussion about your interpretation of the data and/or similarities and differences in your responses. Make at least two follow-up posts by Saturday, 12 PM EST.

Ask a Question
For DoW #1, determine:
• The variable(s) present and the type of variable (categorical or quantitative)
• The Observational Units
• The population
• The Sample

Collect Data
For DoW #1:
• How was the data gathered?
• Are there any potential biases in the data? If so, what are they?

Analyze the Data
Describe the analysis tools used.
• Tables?
• Graphs?
• Calculations?

Interpret the Data
• Make two statements interpreting the data from this study.
• Discuss any strengths and/or limitations you see.

I agree with you on that point; permission slips had to be signed.

nerlande

Hi Nitha,

There is a possibility that the number of articles could be a variable. I think that they are trying to categorize the information, not determine the quantity of it.

Nerlande

Hi Nitha and Clayton,

I agree with both of you. I found that 74% of the reviews concluding that passive smoking is not harmful were by writers with tobacco industry affiliations. That hardly seems fair; the ending is known before it is started.

Nerlande

DoW #1 – Articles on Passive Smoking Health Outcomes.

Ask a Question
For DoW #1, determine:
• The variable(s):
1) 106 reviews (quantitative)
2) Author affiliation (categorical)
3) Time line (categorical)
4) Quality of review (categorical)
5) 1 or more health outcomes (categorical)
• Observational Units: The review articles from 1980 – 1995 that reach conclusions about the health risks of passive smoking.
• Population: Review articles from 1980 – 1995.
• Sample: Articles located electronically on two different database providers, selected by a blind, two-person review staff based on whether the article stated 1 or more health outcomes.

Collect Data
For DoW #1:
• How was the data gathered? From three different database sources: symposium proceedings on passive smoking, MEDLINE, and EMBASE. Selection was based on the implied purpose with regard to evidence that passive smoke is associated with 1 or more health outcomes.
• Potential biases in the data? The dates used to meet the criteria. The articles must talk about health outcomes. Author affiliation. Limited and objective sources of articles. Population bias, as there were more authors selected without affiliation to tobacco than with. Objectivity of sample selection.

Analyze the Data
Tables:
1) Criteria and satisfaction rating (with checklist)
2) Descriptive characteristics
3) Relationship between conclusions and author affiliations
4) Factors associated with concluding that passive smoke is not harmful to health
Calculations: Averages, percentages, and odds ratios.

Interpret the Data
The data presented indicates that there is quite possibly a correlation between the affiliation of the author and the conclusions reached. Use of a checklist to judge whether an article is “Partially” or “Completely” satisfying a criterion is somewhat subject to the opinion of the assessors.
(Table 1) Articles that stated the evidence was inconclusive were categorized as “Not harmful” articles.

Study 3:

The variables

− Do you smoke or not? The variable is “The subject is a smoker” (sometimes abbreviated in some fashion as 'smoker' or 'smoking', or S). The response to this question determines the type of variable it is. For this question I should assume the answer is of the form “yes” or “no”, and the type of the variable is then categorical.

− How many cigarettes do you smoke per day? The variable is the “number of cigarettes smoked per day”. The response to this question should be a number (or interpretable as a number): 0, 1, 2, ..., 100. The type then is quantitative. (Also, an answer such as 'half a pack' can be translated into such a number, 10.)

− How many packs of cigarettes per day? The type is again quantitative; the answers could be in the form 0, 1, 2, 3, ... or 0, 0.5, 1, 1.5, ...

− Do you smoke more at the beginning of the week than at the end of the week? The response to this question is a comparison (more, less, the same). The type of the variable is then ordinal.

− How much pleasure do you get from smoking? (On a scale of 0 to 5.) Although we do not really know how to measure this very well, the type of the variable is then quantitative.

− Do your parents know if you are smoking? The answer could be: yes, no, don't know. The type of this variable is then categorical.

The Observational Units are the objects described by a set of data, the type of subject being sampled. In this case it is high school students at three high schools.

Bias: While the students were initially selected randomly, they could only participate in the study on two conditions, which leads to a selection that is less representative of the set of students.

This is called sampling bias.

The first condition: the students needed to get a permission slip from their parents. The second: the parents needed to sign a confidentiality agreement (that they would not have access to the file on their own child). There may be quite a few parents who do not want to be bothered with signing permission slips, which introduces one level of selection/attrition bias (loss of participants). There may also be quite a few parents who do not want to sign the particular confidentiality agreement, which introduces a second level of attrition bias.

The effect of these biases is that the students who were 'used' in the study are less representative of the general student population. Also, I suspect that parents who do not need to know what their own kids have been up to may, in general, allow them to take more risks (and smoking is one such risk). It is quite likely that the results from the study overstate the prevalence of smoking among students.

Study 4:

Dr. Nandi used interviews to collect her data from a population of high school students. Students were from different community types, randomly selected, and assured of their confidentiality.

Her results are reliable since there was no bias in the selection and students could be honest without fear of parental input.

The results of 53% having tried smoking and 25% being daily smokers are believable.

