Austin Meyer, PhD

MS4

**Basic Statistics**

- Accuracy versus Precision - 2 minutes
- Statistical inference - 3 minutes
- Distributions - 5 minutes
- Hypothesis testing - 10 minutes

**Epidemiology**

- Prevention and Outbreaks - 5 minutes
- Disease metrics - 5 minutes
- Measures of risk - 10 minutes
- Clinical test characteristics - 15 minutes
- Minimal Bayesian statistics - 10 minutes
- Study bias - Mostly for independent
- Types of studies - 10 minutes
- Clinical application - 15 minutes

Generalizability

- How applicable is a finding to the general population

P-value

- Probability of finding a value at least this extreme by random chance, assuming the null hypothesis is true

Confidence Interval

- Interval over which population value is found with a specified probability (e.g. 95%)

Efficacy

- Performance of treatment under ideal circumstances

Effectiveness

- Performance of treatment under real world circumstances

Investigators are studying prostate specific antigen (PSA) as a predictor for prostate cancer. To make the statistics easier, they are going to assume that PSA is a normally distributed population variable. Which of the following is correct under their assumption?

- Mode is greater than median
- Median is greater than mode
- 95% CI depends on degrees of freedom
- *Median is equal to mean*
- Mean is equal to standard deviation

The normal distribution is unimodal and symmetric.

The important invariant properties (for you) of normal distributions are the following:

- Mean = Median = Mode
- Unimodal
- Symmetric
- Area under curve is 1
- Constant relationship between standard deviation and percentiles
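
The constant relationship between standard deviation and percentiles can be checked directly in R. This is a minimal sketch using the base `pnorm()` and `qnorm()` functions; the helper name `within_k_sd` is made up for illustration.

```r
# Fraction of a normal distribution within k standard deviations of the mean.
# pnorm() is the cumulative distribution function of the standard normal.
within_k_sd <- function(k) pnorm(k) - pnorm(-k)

coverage <- sapply(1:3, within_k_sd)
print(round(coverage, 4))  # approximately 0.6827 0.9545 0.9973

# Number of standard deviations that bound the central 95% of the distribution
z_95 <- qnorm(0.975)
print(round(z_95, 2))      # approximately 1.96
```

These fractions are the same for every normal distribution, whatever its mean and standard deviation.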

Which of the following corresponds to the measures of central tendency on the graph from **left to right**?

- mean, median, mode
- mode, mean, median
- median, mode, mean
- *mode, median, mean*
- mean, mode, median

Mode is most common, median is middle, mean is average value.

Always remember that the y-axis on these plots is count or frequency. Therefore, whichever line is closest to the peak is the mode. The median is **always** in the middle. The mean is the most susceptible to outliers, so in a skewed distribution it will **always** be farthest out on the tail.
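
A quick way to see the ordering is to compute all three measures on a small right-skewed sample. The values below are invented for illustration, and `stat_mode` is a hypothetical helper (base R has no built-in statistical mode).

```r
# Hypothetical right-skewed sample (illustrative values only)
x <- c(1, 2, 2, 2, 3, 3, 4, 5, 9, 15)

# Tabulate the values and return the most frequent one
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[which.max(counts)])
}

centers <- c(mode = stat_mode(x), median = median(x), mean = mean(x))
print(centers)
```

With a right tail, the result should order as mode, then median, then mean; a left-skewed sample would reverse the order.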

- Assume:
- There are two or more groups being compared, or
- One group is being compared to zero, or
- One group is being compared to expectation.

- For Step 1, probably safe to assume null is always rejected with \(p < 0.05\).
- For ratios (e.g. Relative Risk, Odds Ratio), a 95% CI **not** overlapping 1 is significant.
- For two sample tests, it is less straightforward how the CI relates to the p-value so don't worry about it.

- Once \(H_0\) is rejected, we accept the alternative hypothesis \(H_A\).

- One sample test
- \(H_0\) = There is no difference between group mean and zero

- Two sample test
- \(H_0\) = There is no difference between the disease and no disease groups

- Paired test
- \(H_0\) = The difference of a measured variable between two time points on the same individuals is zero
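
The three null hypotheses map onto three calls to R's `t.test()`. This is a sketch on made-up measurements (the `before` and `after` vectors are hypothetical values for the same eight individuals), not data from any study.

```r
# Hypothetical measurements on the same 8 individuals at two time points
before <- c(140, 152, 138, 147, 160, 155, 149, 151)
after  <- c(132, 145, 135, 140, 151, 150, 141, 146)

# One sample test: is the mean of the differences different from zero?
one_sample <- t.test(after - before, mu = 0)

# Paired test: the same question, stated on the paired measurements
paired <- t.test(after, before, paired = TRUE)

# Two sample test: treats the two vectors as independent groups
two_sample <- t.test(after, before)

print(c(one_sample = one_sample$p.value,
        paired     = paired$p.value,
        two_sample = two_sample$p.value))
```

The paired test is the one sample test applied to the within-person differences, so the first two p-values match. Ignoring the pairing (the two sample call) throws away information and gives a much larger p-value on these numbers.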

Will the plot be significant?

Two sample: \(H_0\) = There is no difference between the disease and no disease groups

Run the T-test (in this case, in R language)

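```r
# Simulate 5000 values per group; means differ by 0.5 with a common sd of 1.2
norm1 <- rnorm(5000, mean = 4.75, sd = 1.2)
norm2 <- rnorm(5000, mean = 5.25, sd = 1.2)
# Run the two sample T-test and extract only the p-value
t.test(norm1, norm2)$p.value
```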

`## [1] 4.687206e-88`

Have we rejected the null hypothesis?

Yes, we have accepted \(H_A\). There is a difference between disease and no disease groups.

- Two common tests
- Goodness-of-fit
- Test of independence

- Goodness-of-fit
- \(H_0\): The number of cases occurring in a subgroup is consistent with random expectation
- \(H_A\): The number of cases occurring in a subgroup is not consistent with random expectation

- Test of independence
- \(H_0\): Categorical variable A and categorical variable B are independent
- \(H_A\): Categorical variable A and categorical variable B are not independent
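
A goodness-of-fit test can be sketched with R's `chisq.test()`. The counts here are hypothetical: 100 cases spread over four subgroups, each assumed to hold a quarter of the cases under random expectation.

```r
# Hypothetical case counts in four subgroups (illustrative only)
observed <- c(18, 22, 20, 40)

# Under H0, each subgroup is expected to hold 25% of the cases
expected_prop <- rep(0.25, 4)

gof <- chisq.test(observed, p = expected_prop)
print(gof$statistic)  # chi-squared statistic
print(gof$p.value)    # a small p-value argues against random expectation
```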

| | Healthy | Disease | Total |
|---|---|---|---|
| Exposed | 40 | 60 | 100 |
| Not Exposed | 500 | 400 | 900 |
| Total | 540 | 460 | 1000 |

Table 1: A 2x2 contingency table
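
As a sketch, the test of independence on Table 1 can be run with `chisq.test()`. Only the four inner cells are entered; the totals are derived by the test. Note that R applies Yates' continuity correction to 2x2 tables by default.

```r
# Inner cells of Table 1 (rows: exposure, columns: outcome)
tab <- matrix(c(40, 60,
                500, 400),
              nrow = 2, byrow = TRUE,
              dimnames = list(Exposure = c("Exposed", "Not Exposed"),
                              Outcome  = c("Healthy", "Disease")))

result <- chisq.test(tab)
print(result$expected)  # counts expected if exposure and outcome were independent
print(result$p.value)   # small p-value: reject independence
```

The expected count for each cell is (row total x column total) / grand total, e.g. 100 x 540 / 1000 = 54 for Exposed and Healthy.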

| Exposure Status | Never Sick | Sometimes Sick | Mostly Sick | Total |
|---|---|---|---|---|
| High | 10 | 20 | 180 | 210 |
| Medium | 20 | 100 | 20 | 140 |
| Low | 100 | 40 | 10 | 150 |
| Total | 130 | 160 | 210 | 500 |

Table 2: A 3x3 contingency table

| Exposure Status | Never Sick | Infrequently Sick | Sometimes Sick | Mostly Sick | Always Sick | Total |
|---|---|---|---|---|---|---|
| Super High | 10 | 90 | 34 | 12 | 12 | 158 |
| Very High | 30 | 345 | 54 | 43 | 21 | 493 |
| High | 70 | 57 | 67 | 65 | 32 | 291 |
| Medium | 200 | 33 | 87 | 25 | 42 | 387 |
| Low | 130 | 89 | 58 | 45 | 56 | 378 |
| Very Low | 100 | 54 | 36 | 23 | 78 | 291 |
| Super Low | 90 | 23 | 36 | 63 | 8 | 220 |
| Total | 630 | 691 | 372 | 276 | 249 | 2218 |

Table 3: A 7x5 contingency table

**Spearman correlation compares ranked values of two variables**

- Must be quantitative data
- **Not count data**

\(r =\) correlation between variables

\(r^2 = \) proportion of variance in y that is explained by x

- p-value is still used for significance
- For Step 1, most likely significant at \(p < 0.05\)
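
The quantities above can be computed in R with `cor()` and `cor.test()`. The paired values below are invented for illustration only.

```r
# Hypothetical paired quantitative measurements (illustrative only)
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1)

# Pearson correlation on the raw values, with its p-value
pearson <- cor.test(x, y, method = "pearson")
r <- unname(pearson$estimate)
r_squared <- r^2  # proportion of variance in y explained by x

# Spearman correlation uses the ranks of the values instead
spearman_rho <- cor(x, y, method = "spearman")

print(c(r = r, r_squared = r_squared, spearman = spearman_rho,
        p_value = pearson$p.value))
```

Because `y` rises every time `x` rises, the ranks agree perfectly and the Spearman correlation is exactly 1 even though the Pearson correlation is slightly below 1.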

| | Clot | No Clot | Total |
|---|---|---|---|
| OCP Use | 500 | 400 | 900 |
| No OCP Use | 80 | 20 | 100 |
| Total | 580 | 420 | 1000 |

A study was conducted on OCPs and blood clots, and the data is shown. Which of the following is the best method to assess the association between OCP use and blood clots?

- Two sample T-test
- Analysis of variance
- Pearson correlation
- *Chi-square test*
- Spearman correlation

What kind of data is this?

Of the tests listed, only the Chi-square test uses categorical data. All of the other tests require at least rank or quantitative data.

To test a new biomarker, investigators plan a cross-sectional study comprised of two groups. In one group, the researchers will include men with confirmed prostate cancer. In the other group, researchers will include men with no evidence of prostate cancer. The investigators will assume their biomarker is normally distributed. What is the best test to investigate whether the biomarker can distinguish the two groups?

- Two sample Mann-Whitney U-test
- Pearson correlation
- *Two sample T-test*
- Chi-squared test
- Analysis of variance

The number of groups and the distribution are all that matter

The two sample T-test is the appropriate test in this case. The two sample Mann-Whitney U-test could work as well, but is slightly less efficient for normally distributed data than the T-test. The Pearson correlation requires two measured variables on the same sample. A chi-squared test requires categorical (i.e. count) data. An analysis of variance is typically used to measure the difference in means of three or more groups.

- Correct - Reject a false \(H_0\)
- Probability of success is called "power"
- Power depends on sample size
- bigger sample = bigger power

- Correct - Fail to reject a true \(H_0\)
- Probability determined by \(\alpha\) as \(1-\alpha\)

- Type 1 - Incorrect rejection of a true \(H_0\)
- False Positive

- Type 2 - Failure to reject a false \(H_0\)
- False Negative
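
The link between sample size and power can be sketched with base R's `power.t.test()`. The effect size (`delta = 0.5`) and standard deviation (`sd = 1.2`) are assumptions chosen to mirror the simulated T-test earlier; `n` is the number of subjects per group.

```r
# Per-group sample sizes to compare
sample_sizes <- c(10, 20, 50, 100)

# Power of a two sample T-test to detect a 0.5 difference in means
power_by_n <- sapply(sample_sizes, function(n) {
  power.t.test(n = n, delta = 0.5, sd = 1.2, sig.level = 0.05)$power
})
names(power_by_n) <- sample_sizes

print(round(power_by_n, 3))
```

Power rises with every increase in sample size, while the Type 1 error rate stays fixed at \(\alpha = 0.05\).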

- **P**rimary - **P**revention
- An action taken to prevent development of disease in a person who is well

- **S**econdary - **S**creening
- Identifying people in whom disease has begun but who do not have signs or symptoms

- **T**ertiary - **T**reatment
- Preventing complications in those who have developed signs and symptoms and have been diagnosed

- **Q**uaternary - **Q**uit overtesting and overtreating
- Recent effort to minimize excessive healthcare interventions in disease process

Attack rate

- Typically used **during epidemics or pandemics**
- Number of people who get disease / Number of people at risk

Incidence

- Given a **defined period of time**
- Number of people who get disease / Number of people at risk

Prevalence

- **No time course** (i.e. measured at a single point in time)
- Number of people with disease / Number of people at risk
- Simple diseases (e.g. SIR infections): Prevalence = Incidence x Average Disease Duration
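
A small arithmetic sketch of these metrics in R. All numbers are hypothetical and chosen only to make the calculation easy to follow.

```r
# Hypothetical outbreak: 30 of 200 people at risk get sick
attack_rate <- 30 / 200

# Hypothetical steady-state disease
incidence_per_year <- 0.02  # new cases per person at risk per year
avg_duration_years <- 0.5   # average time a case stays sick

# Prevalence = Incidence x Average Disease Duration
prevalence <- incidence_per_year * avg_duration_years

print(c(attack_rate = attack_rate, prevalence = prevalence))
```

Doubling the average duration doubles prevalence even if incidence is unchanged, which is why chronic diseases can have high prevalence despite low incidence.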

- Odds
- Probability that someone with an exposure will get disease divided by the probability that they will not

- Odds ratio (OR)
- Excess odds of exposure of one population relative to another

- Risk
- **Must know disease prevalence**
- **Probability** that someone with an exposure will get a disease

- Risk Ratio (Relative Risk or RR)
- Excess risk of one population relative to another

- Both significant if CI does not include 1

Investigators are studying the association between mesothelioma and asbestos exposure. Due to the relative rarity of the disease, they design a very large case-control study. In the end, they find an \(OR = 20\ (19.54;20.52,\ p < 0.001)\). After assuming that the OR is a good approximation of risk, the authors conclude that the risk of mesothelioma is 20 times higher in those exposed to asbestos compared to control. Why is their assumption reasonable?

- *The incidence of mesothelioma in the population is low*
- The sample size of this study is very large
- The result is highly significant
- The OR is always a good approximation of outcome risk
- The 95% CI is very narrow around the OR of 20

Think about the denominators for odds and risks.

The odds ratio is (A / B) / (C / D) and the risk ratio is (A / (A + B)) / (C / (C + D)). In the case where the number of people with the disease is small, the numbers A and C become very small. In that case, B is a good approximation of A + B and D is a good approximation of C + D. Thus, the RR ~ (A / B) / (C / D).

**If true infections are low, denominator \(A+B \approx B\) and \(C+D \approx D\)**
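
The approximation can be checked numerically. This sketch uses two invented 2x2 tables, one for a rare disease and one for a common disease, with cells labeled A, B, C, D as in the explanation above.

```r
# A = exposed with disease, B = exposed without disease,
# C = unexposed with disease, D = unexposed without disease
odds_ratio <- function(A, B, C, D) (A / B) / (C / D)
risk_ratio <- function(A, B, C, D) (A / (A + B)) / (C / (C + D))

# Hypothetical counts (illustrative only)
rare   <- list(A = 20,  B = 9980, C = 1,   D = 9999)
common <- list(A = 600, B = 400,  C = 300, D = 700)

comparison <- rbind(
  rare   = c(OR = do.call(odds_ratio, rare),   RR = do.call(risk_ratio, rare)),
  common = c(OR = do.call(odds_ratio, common), RR = do.call(risk_ratio, common))
)
print(round(comparison, 2))
```

For the rare disease the OR and RR nearly coincide; for the common disease the OR overstates the RR.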

Two studies were conducted on different samples from the same population to assess the relationship between oral contraceptive use and the risk of deep venous thrombosis (DVT). Study A showed an increased risk of DVT among oral contraceptive users, with a relative risk of 2.0 and a 95% CI of 1.2-2.8. Study B showed a relative risk of 2.05 and a 95% CI of 0.8-3.1. Which of the following statements is most likely to be true regarding these 2 studies?

- The p-value in study B is likely to be < 0.05
- The result in study A is not accurate
- The result in study A is not statistically significant
- The result in study B is likely biased
*The sample size is likely smaller in study B than study A*

What gives a narrower confidence interval?

- Incorrect - The CI in study B overlaps 1 so it is not significant
- Incorrect - It is hard to judge accuracy without knowing the objective Truth
- Incorrect - The CI in study A does not include 1 so it is statistically significant
- Incorrect - There is no reason to believe B is biased
- Correct - A bigger sample gives a narrower confidence interval and an improved ability to reject a false null hypothesis (i.e. more power)
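
To see why sample size drives the width of the interval, here is a sketch that computes a 95% CI for a relative risk with the standard log-scale (Wald) method. The function name `rr_ci` and the counts are hypothetical; both invented studies have the same relative risk of 2 and differ only in size.

```r
# 95% CI for a relative risk using the normal approximation on log(RR)
rr_ci <- function(events_exposed, n_exposed, events_unexposed, n_unexposed,
                  level = 0.95) {
  rr <- (events_exposed / n_exposed) / (events_unexposed / n_unexposed)
  se_log <- sqrt(1 / events_exposed - 1 / n_exposed +
                 1 / events_unexposed - 1 / n_unexposed)
  z <- qnorm(1 - (1 - level) / 2)
  c(RR = rr,
    lower = exp(log(rr) - z * se_log),
    upper = exp(log(rr) + z * se_log))
}

# Same relative risk, ten-fold difference in sample size (hypothetical counts)
small_study <- rr_ci(20, 100, 10, 100)
large_study <- rr_ci(200, 1000, 100, 1000)

print(round(rbind(small_study, large_study), 2))
```

The small study's interval crosses 1 (not significant) while the large study's interval does not, even though the point estimates are identical.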

- Reminder
- Exposed: \(Risk = \frac{A}{A + B}\)
- Unexposed: \(Risk = \frac{C}{C + D}\)

- \(AR = Risk_{Exposed} - Risk_{Unexposed}\)
- \(ARR = Risk_{Control} - Risk_{Treatment}\)
- Number needed to treat
- Number of patients treated for **ONE** patient benefited
- \(NNT = \frac{1}{ARR}\)
- \(NNH = \frac{1}{AR}\)

**These are the defining characteristics of any clinical finding (e.g. history, physical, test, image)**.

- They do not depend on anything... they are intrinsic to the exam/test

- If someone quotes the positive or negative predictive value of a test, they are wrong.

- Therefore, if you do not know the sensitivity or specificity of a test, you are missing information
- Without sensitivity and specificity, you cannot make a ROC curve
- Never trust a paper, poster, or company presentation that does not include a ROC curve

**Sensitivity**

- \(Sensitivity = \frac{TruePositives}{AllRealPositives}\)
- \(AllRealPositives = TP + FN\)

**Specificity**

- \(Specificity = \frac{TrueNegatives}{AllRealNegatives}\)
- \(AllRealNegatives = TN + FP\)

- Positive Predictive Value
- Chance that person has the disease after a positive test result
- \(PPV = \frac{TP}{TP + FP}\)

- Negative Predictive Value
- Chance that person does not have disease after a negative test result
- \(NPV = \frac{TN}{TN + FN}\)

**Both depend on how prevalent the disease is in the population**
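
The dependence on prevalence is easy to demonstrate. This sketch fixes a hypothetical test at 90% sensitivity and 90% specificity (assumed values, not from any real assay) and varies only the prevalence; `predictive_values` is an illustrative helper name.

```r
# PPV and NPV from sensitivity, specificity, and prevalence
predictive_values <- function(sensitivity, specificity, prevalence) {
  tp <- sensitivity * prevalence              # true positives (as a fraction)
  fn <- (1 - sensitivity) * prevalence        # false negatives
  tn <- specificity * (1 - prevalence)        # true negatives
  fp <- (1 - specificity) * (1 - prevalence)  # false positives
  c(PPV = tp / (tp + fp), NPV = tn / (tn + fn))
}

prevalences <- c(0.01, 0.10, 0.50)
results <- sapply(prevalences, function(p) predictive_values(0.90, 0.90, p))
colnames(results) <- prevalences

print(round(results, 3))
```

The same test gives a PPV below 10% at 1% prevalence and 90% at 50% prevalence, while the NPV moves in the opposite direction.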

Does PPV depend more on sensitivity or specificity?

This is the real prevalence of HIV... Where would you put the cutoff?

Assume a steady-state population that is not changing in any way. Which of the following statements is true for people who test positive regarding moving the cutoff for a positive test from the solid to the dotted line?

- Decrease in test specificity
- Increase in test sensitivity
- *Increase in PPV*
- Increase in NPV
- Decrease in NPV

Question prefaces a positive test result

- Incorrect - Moving the line to the right increases the specificity because it captures more true negatives as a proportion of all truly negative individuals
- Incorrect - Moving the line to the right decreases the sensitivity because it captures fewer true positives as a proportion of all truly positive individuals
- Correct - Moving the line to the right increases positive predictive value: it drives up the proportion of true positives among all positive tests by reducing the number of false positives
- Incorrect - The question is concerned with positive tests, which do not factor into negative predictive value
- Incorrect - The question is concerned with positive tests, which do not factor into negative predictive value

- If the curve approximates the **diagonal**, it is a **bad test**
- \(AUC = 0.5\) for a bad test

- If the curve goes up the y-axis and then turns right along the top of the plot, it is a perfect test
- \(AUC = 1\) for a perfect test

**The highest yield cutoff is the x-value that maximizes the distance from the diagonal to the curve**

**Again, the best cutoff is the x-value that maximizes the distance from the diagonal to the curve**
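
One way to make this rule concrete: on a ROC plot the x-value is \(1 - specificity\) and the diagonal sits at \(y = x\), so the vertical distance from the diagonal to the curve is \(sensitivity + specificity - 1\) (often called Youden's J statistic). The sketch below applies that to a made-up set of cutoffs; none of the numbers come from a real test.

```r
# Hypothetical sensitivity and specificity at five candidate cutoffs
roc <- data.frame(
  cutoff      = c(1, 2, 3, 4, 5),
  sensitivity = c(0.99, 0.95, 0.85, 0.60, 0.30),
  specificity = c(0.20, 0.55, 0.85, 0.95, 0.99)
)

# Vertical distance from the diagonal to the curve at each cutoff
roc$distance <- roc$sensitivity + roc$specificity - 1

# Cutoff with the largest distance from the diagonal
best <- roc[which.max(roc$distance), ]
print(roc)
print(best)
```

This rule weighs false positives and false negatives equally; a different cutoff may be preferable when one kind of error is much more costly than the other.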

- The value lies in the intuitive approach

- Frequentist: goal is to approximate objective truth through repeated trials
- The important metric is the probability that our estimate does **not** match reality

- Bayesian: goal is to approximate objective truth by updating prior probability with new evidence
- The important metric is the probability that our subjective experience matches reality

- Example: given a clinical test, which do you care more about?
- If your patient has disease, there is a 2% chance of getting a test result this extreme by chance.
- If your patient has a positive test, there is a 75% chance of having disease.

- Start with pre-test probability
- This data is available all over the place (e.g. any prevalence data)
- Example: Region 6 prevalence of influenza right now is **\(\approx 4\%\)**

- Do something (e.g. rapid flu test)
- Absolute best Rapid Flu test: **\(Sensitivity \approx 70\%\)** and **\(Specificity \approx 95\%\)**
- If positive, use likelihood ratio positive (a.k.a. Bayes factor positive)
- **\(LR+ = \frac{sensitivity}{1 - specificity} = \frac{0.7}{1 - 0.95} = 14\)**
- If negative, use likelihood ratio negative (a.k.a. Bayes factor negative)
- **\(LR- = \frac{1 - sensitivity}{specificity} = \frac{1 - 0.7}{0.95} = 0.32\)**

- Pre-test Probability \(\rightarrow\) Pre-test Odds \(\rightarrow\) Pre-test Odds x LR \(\rightarrow\) Post-test Odds \(\rightarrow\) Post-test Probability

- Probability \(\rightarrow\) Odds: \(O = \frac{P}{1 - P}\)
- Odds \(\rightarrow\) Probability: \(P = \frac{O}{1 + O}\)

- Positive test: \(0.04 \rightarrow 0.04/0.96 = 0.042 \rightarrow 0.042 * 14 = 0.583 \rightarrow 0.583/1.583 = 0.37\)
- Post-test probability following **positive** test: **\(37\%\)**

- Negative test: \(0.04 \rightarrow 0.04/0.96 = 0.042 \rightarrow 0.042 * 0.32 = 0.0134 \rightarrow 0.0134/1.0134 = 0.0132\)
- Post-test probability following **negative** test: **\(1.32\%\)**
- So this time of year a negative test is basically useless
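
The probability to odds to probability chain can be wrapped in a small R function. This is a sketch using the rapid flu test numbers above (4% pre-test probability, 70% sensitivity, 95% specificity); the function name `post_test_probability` is made up for illustration.

```r
# Pre-test probability -> pre-test odds -> post-test odds -> post-test probability
post_test_probability <- function(pre_test_prob, likelihood_ratio) {
  pre_odds  <- pre_test_prob / (1 - pre_test_prob)
  post_odds <- pre_odds * likelihood_ratio
  post_odds / (1 + post_odds)
}

sensitivity <- 0.70
specificity <- 0.95
prevalence  <- 0.04  # used as the pre-test probability

lr_pos <- sensitivity / (1 - specificity)  # likelihood ratio positive
lr_neg <- (1 - sensitivity) / specificity  # likelihood ratio negative

p_after_positive <- post_test_probability(prevalence, lr_pos)
p_after_negative <- post_test_probability(prevalence, lr_neg)

print(round(c(positive = p_after_positive, negative = p_after_negative), 3))
```

The results are about 37% after a positive test and about 1.3% after a negative test. Small differences from the hand calculation come from rounding the intermediate values.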

- Selection bias
- Non-random partitioning of individuals into groups

- Observer-expectancy
- Observer is unblinded and expects a particular outcome

- Effect modification bias
- Magnitude of effect varies by third variable
- **Can** be eliminated by stratification

- Confounding
- Unseen third variable is an underlying cause for correlation of two other variables
- **Cannot** be eliminated by stratification

- Recall bias
- Subjects with disease can recall exposures better than healthy subjects

- Procedure bias
- Experimenters vary systematically in the way they do work
- e.g. Experimenters don't follow the specified procedure

- Instrument bias
- Instrument is broken
- Instruments can also be things like surveys or **clerkship evaluations**
- Just means instrument is not reliable

- Lead-time bias
- New test detects disease earlier
- Survival appears improved with new test

- Attrition bias
- Subjects systematically withdraw
- Could be things like side effects or lack of improvement

- Loss-to-follow up
- Subjects randomly do not report for scheduled followup

**Closer to the top means better evidence**

Experimental Trials

- This is widely considered the gold standard for clinical evidence

- Question: **Primary** purpose of randomization?
- Answer: To eliminate **selection bias**
- Selection bias is eliminated if randomization is technically correct

- Question: Secondary goal of randomization?
- Answer: To control confounders
- Confounders are not necessarily eliminated even with perfect technical execution

- Can use relative risk because investigator knows prevalence of disease and prior exposures

This post hoc analysis is overly simplified for real life

This understanding is sufficient for step 1

Confounders reduced because a patient can serve as their own control

Observational Studies

- Can use relative risk because investigator knows prevalence of exposure and disease
- Subjects vary by exposure status
- Can calculate incidence

**Selection bias** is the biggest problem

- Investigator has infinite control over inclusion

- Other biases
- Attrition, loss-to-follow up, confounding, Hawthorne

- Retrospective
- Information bias

- Must use odds ratio because investigator does not know prevalence of disease
- Subjects grouped by cases and controls
- Measure **odds of exposure** in case and control groups
- Significantly improved power and decreased resource requirements compared to cohorts
- Due to cases being selected at the outset

**Selection and Recall biases** are the biggest problem

- Selecting appropriate controls is **highly** non-trivial
- Sick people remember exposures (e.g. Melanoma patients stew about their sunburns)
- Also common
- Information biases

**Cannot calculate incidence or prevalence**

**Quick, cheap, and easy**

- Typically this is a starting point

- Can establish prevalence of disease
- Must use chi-squared or correlation for statistical test
- Subjects can be grouped by exposure and disease into the 2x2 contingency table

**Cannot establish causation**

- Cannot calculate risk metrics

A study was conducted to evaluate the efficacy of a new antiviral drug for the treatment of the common cold in young children. The study population consisted of 100 children between the ages of 2 and 8 years. These children were diagnosed with rhinovirus infection and subsequently given the particular antiviral drug. One week later, it was observed that 92 of the 100 patients were asymptomatic. Which of the following is the true conclusion of this study?

- The drug is highly effective as the effectiveness is 90%
- The drug is moderately effective as the efficacy is 90%
- *An exact conclusion cannot be drawn from the study*
- The drug is not effective as the sample size is very small
- No conclusion can be made, as compliance is generally very low in small children

A treatment is tested without a control

- Incorrect - We can't compare to a real-world control.
- Incorrect - We can't compare to an ideal control.
- Correct - Most people will recover from a cold in a week or so.
- Incorrect - The sample size may be adequate. There are no statistical tests to evaluate this statement.
- Incorrect - Compliance would not be an issue in this case.

Researchers are studying the relationship between mutations in HMG-CoA reductase and CAD. The study population is selected at random. Tissue samples are obtained for genotyping and stress echos are performed to assess CHD. In the subsequent paper, the authors conclude that there is an association between mutations in HMG-CoA reductase and CHD. Which of the following study designs did the authors utilize?

- Retrospective cohort study
- *Cross-sectional study*
- Randomized clinical trial
- Prospective cohort study
- Case-control trial

What does the timeline look like?

- Incorrect - A retrospective cohort starts at some point in the past. There is no indication of a past time or chart review in this study.
- Correct - A cross-sectional study is a "snap shot". It simultaneously determines both risk factors and disease. It can establish an association, but it cannot say much about causation because the timeline is unknown.
- Incorrect - Although patients are randomly selected, a randomized clinical trial requires a control group and some treatment under investigation.
- Incorrect - A prospective cohort starts in the present and follows a group into the future. There is no indication of time or following patients or recording exposure.
- Incorrect - A case-control trial requires identifying cases with disease and controls without disease, then identifying exposures, and calculating the odds of exposure given disease. There is no indication of that here.

A study was conducted to evaluate the efficacy of a new antiviral drug. The study population consisted of 100 rhinovirus-infected children. The treatment arm was given an antiviral drug and the control arm was given a placebo. One week later, researchers found that 42 out of 50 treatment patients were asymptomatic and 30 of 50 control patients were asymptomatic. On average, how many people need to be treated with this drug to cure **one** infected person?

- 25/12
- *50/12*
- 50/15
- 24/7
- 50/8

\(ARR = Risk_{Control} - Risk_{Treatment}\)

\(NNT = \frac{1}{ARR}\)

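Here risk is the chance of still being symptomatic at one week: \(\frac{20}{50}\) with placebo and \(\frac{8}{50}\) with treatment.

\(ARR = \frac{20}{50} - \frac{8}{50} = \frac{12}{50}\)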

\(NNT = \frac{1}{12/50} = \frac{50}{12}\)
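
The same calculation as a short R sketch, taking risk to mean the chance of still being symptomatic at one week (8 of 50 in the treatment arm, 20 of 50 in the control arm, from the counts in the question).

```r
# Patients still symptomatic at one week in each arm of 50
symptomatic_control   <- 50 - 30
symptomatic_treatment <- 50 - 42

risk_control   <- symptomatic_control / 50
risk_treatment <- symptomatic_treatment / 50

arr <- risk_control - risk_treatment  # absolute risk reduction
nnt <- 1 / arr                        # number needed to treat

print(c(ARR = arr, NNT = nnt))
```

The NNT of about 4.2 matches \(\frac{50}{12}\); in practice it is usually rounded up to the next whole patient.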

Assuming that mortality is simply the incidence of death per 100 patients, after controlling for physician characteristics, what is the relative risk of death within 30 days of discharge for patients with a male physician versus a female physician?

- 11.49
- 11.07
- *1.04*
- 0.96
- 1.07

\(Incidence_{Male} = \frac{a}{a + b}\)

\(Incidence_{Female} = \frac{c}{c + d}\)

Thus, mortality is the **risk** of death.

Mortality is the **risk** of death. Then, the relative risk is:

\(Risk_{Male} = 11.49\)

\(Risk_{Female} = 11.07\)

\(RR = \frac{11.49}{11.07}\)

Controlling for physician characteristics as before, what is the absolute risk reduction of having a female physician?

- *0.0042*
- 0.0142
- 0.0049
- 0.0064
- 0.0342

Mortality is the **risk** of death.

\(ARR = Risk_{Male} - Risk_{Female}\)

Mortality is the **risk** of death. Since this is subtraction it is important to have the units correct.

\(ARR = 0.1149 - 0.1107 = 0.0042\)

Given the information from the previous slide, on average how many patients would need to be treated by a female physician to save a life?

- 2.4
- 8.7
- 87.1
- *238.1*
- 871.4

Mortality is the **risk** of death.

\(ARR = Risk_{Male} - Risk_{Female}\)

\(NNT = \frac{1}{ARR}\)

Thus, mortality is the **risk** of death. Since this is subtraction it is important to have the units correct.

\(ARR = 0.1149 - 0.1107 = 0.0042\)

\(NNT = \frac{1}{ARR} = \frac{1}{0.0042} = 238.1\)

Given the information from the previous slide, if we no longer allowed men to treat general medicine patients approximately how long on average would it take for a female physician to save a patient that otherwise would have died under the previous treatment system?

- 2 months
- 8 months
- *1.5 years*
- 3.5 years
- 5 years

Approximately how many individual study patients are being seen each year by the physicians in this study?

In this study, between 130 and 180 patients are being seen annually by female and male physicians respectively. Thus, the time to save a patient in years is:

\(\frac{238.1}{131.9} = 1.81\)

\(\frac{238.1}{180.5} = 1.32\)
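
The whole chain for the physician example (RR, ARR, NNT, and time to one life saved) can be reproduced in a few lines of R. This sketch uses only the figures quoted above: 30-day mortality of 11.49% and 11.07%, and roughly 131.9 and 180.5 patients seen per year.

```r
# Mortality is given per 100 patients, so convert to proportions first
risk_male   <- 11.49 / 100
risk_female <- 11.07 / 100

rr  <- risk_male / risk_female  # relative risk, male versus female physician
arr <- risk_male - risk_female  # absolute risk reduction with a female physician
nnt <- 1 / arr                  # patients treated to save one life

# Years needed to see that many patients at each annual volume
years_at_132_per_year <- nnt / 131.9
years_at_181_per_year <- nnt / 180.5

print(round(c(RR = rr, ARR = arr, NNT = nnt,
              years_low_volume = years_at_132_per_year,
              years_high_volume = years_at_181_per_year), 4))
```

The ratio (RR) is unchanged whether mortality is entered as a percentage or a proportion, but the difference (ARR) and therefore the NNT are not, which is why the units matter for subtraction.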

The End