III Problem Solving


PUH 5302, Applied Biostatistics 1

Course Learning Outcomes for Unit III

Upon completion of this unit, students should be able to:

4. Recommend solutions to public health problems using biostatistical methods.
   4.1 Compute and interpret probability for biostatistical analysis.
   4.2 Draw conclusions about public health problems based on biostatistical methods.

5. Analyze public health information to interpret results of biostatistical analysis.
   5.1 Analyze literature related to biostatistical analysis in the public health field.
   5.2 Prepare an annotated bibliography that explores a topic related to public health issues.

Course/Unit Learning Outcomes    Learning Activity
4.1                              Unit Lesson; Chapter 5; Unit III Problem Solving
4.2                              Unit Lesson; Chapter 5; Unit III Problem Solving
5.1                              Chapter 5; Unit III Annotated Bibliography
5.2                              Chapter 5; Unit III Annotated Bibliography

Reading Assignment

Chapter 5: The Role of Probability

Unit Lesson

Welcome to Unit III. In previous units, we discussed some fundamentals of biostatistics and their application to solving public health problems. In Unit III, we will compute, interpret, and apply probability, especially in relation to different populations.

Computing and Interpreting Probabilities

Probability uses a number between 0 and 1 to express how likely an event is to occur. For example, if a fair coin is tossed, the probability of getting heads is one out of two, that is, ½; the same is true for tails. Researchers have used probability to predict weather and other events, with some success. Public health professionals have used statistical methods to predict the chances of health-related events, thereby providing arguments in favor of taking precautionary measures and warning the general public about important health issues.

In biostatistics, we use both descriptive statistics and inferential statistics to address public health issues within a population. In most cases, researchers are not able to study the entire population; they try to get a sample from the population from which they can generalize their findings.

Descriptive Statistics

Aside from probability sampling methods, there are other methods used for the computation and interpretation of data; these are generally known as descriptive statistics. With descriptive statistics, we normally compute the mean, mode, median, variance, and standard deviation. Information obtained using such computation methods is used for descriptive purposes, as opposed to information obtained from inferential statistics. Let’s examine this example using the numbers 5, 10, 2, 4, 6, 10, 2, 3, and 2.

The mean is the sum of all the numbers ÷ the number of cases

= 44 ÷ 9 = 4.89

The median is the middle number after the numbers have been arranged in an ascending or descending order

= 4

The mode is the most frequently occurring number

= 2

You can calculate the variance and standard deviation from the definitions given here:

• Variance is the average of the squared differences from the mean.

• Standard deviation is the square root of the variance (√variance).
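
As a minimal sketch, the descriptive statistics for the nine numbers in the example above can be computed with Python's standard `statistics` module. The variable names are illustrative. The definition given here (the average of the squared differences) corresponds to the population variance, which divides by n; the sample variance, which divides by n - 1, is shown only for comparison and is an addition to the lesson's definition.

```python
import statistics

# Data set from the example above
values = [5, 10, 2, 4, 6, 10, 2, 3, 2]

mean = statistics.mean(values)      # sum of the values / number of values
median = statistics.median(values)  # middle value after sorting
mode = statistics.mode(values)      # most frequently occurring value

# Population variance: average of the squared differences from the mean
variance = statistics.pvariance(values)
# Population standard deviation: square root of the population variance
std_dev = statistics.pstdev(values)

# Sample variance divides by n - 1 instead of n (shown for comparison only)
sample_variance = statistics.variance(values)

print(f"mean = {mean:.2f}, median = {median}, mode = {mode}")
print(f"variance = {variance:.2f}, standard deviation = {std_dev:.2f}")
print(f"sample variance = {sample_variance:.2f}")
```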

Population and Sample

A sample is just a smaller subgroup of a larger population. Researchers can take a sample from the population for a specific study. It is also possible to get different samples from a population. The most important thing here is that the sample characteristics must reflect the population because the results of the findings are generalized to the entire population. In this case, the sampling method is very important. There are different types of sampling methods, including:

• simple random sampling: every item has an equal and independent chance of being selected;

• systematic sampling: every nth item is selected from the population list;

• stratified sampling: the population is divided into subgroups, and simple random sampling is used within each subgroup;

• cluster sampling: the population is divided into several areas, and randomly selected areas are assessed;

• quota sampling: the number of people to sample is chosen, and then any method is used to reach the desired sample size;

• selective sampling: a specific set of people is deliberately chosen;

• convenience sampling: available research subjects are used;

• snowball sampling: recommendations from subjects meeting similar research characteristics are used; and

• theoretical sampling: sampling is for the purpose of testing a theory (Changing Minds, n.d.).

Computing and interpreting probability for biostatistical analysis of health-related issues, such as disease prevalence and incidence in a population, is vital. The method of data collection and the type of analytical methods used are critical. Public health researchers may use two kinds of sampling methods: probability sampling and non-probability sampling.

Probability sampling is the sampling method where every unit in the population has a chance of being selected in the sample (Sullivan, 2018). Probability sampling is widely used in quantitative research, making it possible for the selected sample to be unbiased. Unbiased reports from public health researchers are important for the veracity of research studies. See the example below:


Assume that out of a population of 1,500 persons, 1,000 are males and 500 are females. Of these, 500 males and 300 females were exposed to HIV. Representing the data in probability terms may help significantly for comparison and reporting purposes. This information could be reported thus:

• P(male) = 1000 / 1500 = 0.67

• P(female) = 500 / 1500 = 0.33

• P(male and exposed to HIV) = 500 / 1500 = 0.33

• P(female and exposed to HIV) = 300 / 1500 = 0.20

• P(exposed to HIV) = 800 / 1500 = 0.53

Non-probability sampling is any sampling method where some of the population has no chance of selection (Sullivan, 2018). It is widely used in qualitative research. Non-probability sampling makes it possible for public health researchers to insert descriptive comments regarding a sample, and it is cost effective and less time consuming. It also makes it possible to conduct sampling where probability sampling is impractical. There are also some disadvantages associated with this sampling method. Some aspects of the population may not be represented. In addition, the results may generalize poorly, and possible bias may be difficult to identify. This has big implications for public health researchers who want to report accurate results or findings. At any rate, such reports have also helped in providing vital information regarding diseases and other ailments in the country.

Proposing Solutions for Biostatistical Problems

In proposing solutions for biostatistical problems, public health researchers, in addition to the methods discussed above, use sensitivity and specificity testing, especially for investigations involving the presence of a disease (Sullivan, 2018). Let’s define these terms and examine some applications.

Sensitivity and Specificity

Sensitivity and specificity are used to evaluate screening tests, which help identify individuals who have the specific disease for which the screening is done. Sensitivity is the probability that the test is positive when the disease is present, while specificity is the probability that the test is negative when the disease is absent. In order to understand this concept, let’s examine the table below.

Screening    Disease       No Disease     Total
Positive     A = 25        B = 50         A + B = 75
Negative     C = 5         D = 120        C + D = 125
Total        A + C = 30    B + D = 170    N = 200

Using the table above, the ability of the test to classify people correctly is evaluated by

• Sensitivity = P(Screen positive | Disease) = A / (A + C) x 100

• Specificity = P(Screen negative | Disease free) = D / (B + D) x 100

The two kinds of error the test can make are evaluated by

• False positive fraction = P(Screen positive | Disease free) = B / (B + D) x 100

• False negative fraction = P(Screen negative | Disease) = C / (A + C) x 100

Let’s see how these measures are applied in public health situations. Assume that 200 people are screened for a particular disease, as in the table above; 30 people have the disease, and 170 people do not have the disease. Calculate the following (the answers are below).

1. Prevalence
2. Sensitivity
3. Specificity
4. Positive predictive value
5. Negative predictive value

Answers:

1. Prevalence = Total disease / Total x 100 = 30 / 200 x 100 = 15%

2. Sensitivity = A / (A + C) x 100 = 25 / 30 x 100 = 83.33%

3. Specificity = D / (B + D) x 100 = 120 / 170 x 100 = 70.59%

4. Positive Predictive Value = P(Disease | Screen Positive) = Positive for Disease (A) / Total Positive (A + B) x 100 = 25 / 75 x 100 = 33.33%

5. Negative Predictive Value = P(Disease Free | Screen Negative) = Negative for No Disease (D) / Total Negative (C + D) x 100 = 120 / 125 x 100 = 96%

Using this method, public health officials are able to compare and contrast and report diseases and other public health issues effectively within a population.
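
The five measures can also be computed directly from the four cell counts of the screening table. The following Python sketch is one way to do it; the function name `screening_measures` and the dictionary keys are illustrative choices and do not come from the textbook.

```python
def screening_measures(a, b, c, d):
    """Return screening measures (as percentages) for a 2 x 2 table.

    a: screen positive, disease       b: screen positive, no disease
    c: screen negative, disease       d: screen negative, no disease
    """
    n = a + b + c + d
    return {
        "prevalence": (a + c) / n * 100,          # total disease / total
        "sensitivity": a / (a + c) * 100,         # P(screen positive | disease)
        "specificity": d / (b + d) * 100,         # P(screen negative | disease free)
        "positive predictive value": a / (a + b) * 100,  # P(disease | screen positive)
        "negative predictive value": d / (c + d) * 100,  # P(disease free | screen negative)
    }


# Cell counts from the table above: A = 25, B = 50, C = 5, D = 120
results = screening_measures(a=25, b=50, c=5, d=120)
for name, value in results.items():
    print(f"{name}: {value:.2f}%")
```

Keeping the four cells as separate arguments makes it easy to see which row or column total each measure is divided by: the predictive values divide by row totals, while sensitivity and specificity divide by column totals.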

In another example, a screening test for Down Syndrome was conducted which yielded the following results:

Screening Test Result    Affected Fetus    Unaffected Fetus    Total
Positive                 17                251                 268
Negative                 5                 449                 454
Total                    22                700                 722

Thus, the performance characteristics of the test are:

• Sensitivity = P(Screen Positive | Affected Fetus) = 17 / 22 = 0.773

• Specificity = P(Screen Negative | Unaffected Fetus) = 449 / 700 = 0.641

• False Positive Fraction = P(Screen Positive | Unaffected Fetus) = 251 / 700 = 0.359

• False Negative Fraction = P(Screen Negative | Affected Fetus) = 5 / 22 = 0.227

Interpretation of results:

• If a woman is carrying an affected fetus, there is a 77.3% probability that the screening test will be positive.

• If a woman is carrying an unaffected fetus, there is a 64.1% probability that the screening test will be negative.

However, the false positive and false negative fractions are a concern with this test.

• If a woman is carrying an unaffected fetus, there is a 35.9% probability that the screening test will be positive.

• In addition, if a woman is carrying an affected fetus, there is a 22.7% probability that the test will be negative.
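
The four fractions for this screening test come in complementary pairs, because each pair is computed within the same column of the table. The short Python sketch below, with illustrative variable names, recomputes them from the counts and checks that each pair adds up to 1.

```python
# Counts from the Down syndrome screening table above
affected_pos, affected_neg = 17, 5
unaffected_pos, unaffected_neg = 251, 449

affected_total = affected_pos + affected_neg        # 22 affected fetuses
unaffected_total = unaffected_pos + unaffected_neg  # 700 unaffected fetuses

# Fractions among affected fetuses
sensitivity = affected_pos / affected_total
false_negative_fraction = affected_neg / affected_total

# Fractions among unaffected fetuses
specificity = unaffected_neg / unaffected_total
false_positive_fraction = unaffected_pos / unaffected_total

print(f"sensitivity = {sensitivity:.3f}")
print(f"false negative fraction = {false_negative_fraction:.3f}")
print(f"specificity = {specificity:.3f}")
print(f"false positive fraction = {false_positive_fraction:.3f}")

# Each pair describes the same group of fetuses, so it sums to 1
print(f"sensitivity + false negative fraction = {sensitivity + false_negative_fraction:.3f}")
print(f"specificity + false positive fraction = {specificity + false_positive_fraction:.3f}")
```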

Sample Size Determination

In this section, we will only briefly discuss sample size and sample size determination because there is a whole lesson on this later in the course. For any research involving subjects (participants) within a population, it is important to get an appropriate sample size because sample sizes that are too small or too large are detrimental to the study. The wrong sample size may yield poor results, creating validity problems. To overcome the problem of sample size in research, experts have come up with various methods of sample size determination. Some of these include:

1. Various free sample size calculators on the Internet: if you search “sample size calculator” on your favorite search engine, there are many readily available.

2. SurveyMonkey provides a sample size calculator on its website (SurveyMonkey, n.d.).

3. The National Statistical Service provides a Sample Size Calculator on its website.

4. Manual calculations are another option. Before using calculations, we need to know a few things, namely:
   a. population size,
   b. margin of error or confidence interval (this determines how much higher or lower than the population mean the sample mean can fall; a margin of error of ± 5% is common),
   c. confidence level (the most commonly used confidence level is 95%), and
   d. standard deviation (many researchers safely use .5).

We will also need the z-score that corresponds to the chosen confidence level. The standard values for the most common confidence levels are:

• 90% z-score = 1.645

• 95% z-score = 1.96

• 99% z-score = 2.576

The next step is to insert the z-score, standard deviation, and margin of error into this equation:

Necessary sample size = (z-score)² x StdDev x (1 - StdDev) / (margin of error)²
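
The equation above can be sketched in Python as follows. The function name `required_sample_size` and its parameter names are illustrative, and rounding the result up to the next whole subject is an assumption that matches common practice.

```python
import math


def required_sample_size(z_score, std_dev, margin_of_error):
    """Apply the sample size equation and round up to a whole subject.

    z_score: z-score for the chosen confidence level (e.g., 1.96 for 95%)
    std_dev: the StdDev term of the equation (e.g., 0.5)
    margin_of_error: margin of error as a proportion (e.g., 0.05 for 5%)
    """
    raw = (z_score ** 2) * std_dev * (1 - std_dev) / (margin_of_error ** 2)
    # A fraction of a subject cannot be recruited, so round up
    return math.ceil(raw)


# z-scores for the three common confidence levels listed above
for level, z in [("90%", 1.645), ("95%", 1.96), ("99%", 2.576)]:
    n = required_sample_size(z, std_dev=0.5, margin_of_error=0.05)
    print(f"{level} confidence level: {n} subjects")
```

Running the loop shows how the required number of subjects grows as the confidence level increases while the other two inputs stay fixed.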

Let’s see an example here assuming a 95% confidence level, a standard deviation of .5, and a margin of error of ± 5%.

Sample size = ((1.96)² x .5(1 - .5)) / (.05)²
= (3.8416 x .25) / .0025
= .9604 / .0025
= 384.16

Subjects needed = 385 (the sample size, rounded up to the next whole number)

The number of subjects needed will vary if a different z-score, standard deviation, or margin of error is used.

Note: This equation is used when we do not know the population size.

In summary, this lesson introduced you to the importance of probability and how it can be used in the public health field. There are many different types of sampling methods, and it is important to be familiar with each one. In addition, it is crucial to be able to effectively determine a sufficient sample size for a study. This topic will be explored more in depth in a later unit.

References

Changing Minds. (n.d.). Choosing a sampling method. Retrieved from http://changingminds.org/explanations/research/sampling/choosing_sampling.htm

Sullivan, L. M. (2018). Essentials of biostatistics in public health (3rd ed.). Burlington, MA: Jones & Bartlett Learning.

SurveyMonkey. (n.d.). Sample size calculator. Retrieved from https://www.surveymonkey.com/mp/sample-size-calculator/


Learning Activities (Nongraded)

Nongraded Learning Activities are provided to aid students in their course of study. You do not have to submit them. If you have questions, contact your instructor for further guidance and information.

For extra practice with probability concepts, complete the following Chapter 5 practice problems on pages 99–100 in your textbook: 12, 13, 14, 15, and 18. Be sure to show all of your work.