Confounding Variable
Variable related to both the treatment and the outcome
Conditional Probability
Probability of an event occurring given another event has already occurred
Chi-Square Goodness of Fit Test
Tests whether the observed frequencies in a sample fit a hypothesized population distribution
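Worked sketch of the goodness-of-fit statistic, the sum of (observed - expected)^2 / expected over all categories. The die-roll counts are hypothetical and the critical value is taken from a standard chi-square table; neither comes from the cards above.

```python
# Hypothetical counts from 60 rolls of a die (illustrative data only).
observed = [8, 12, 10, 9, 11, 10]

# Under the null hypothesis of a fair die, each face is expected equally often.
expected = [sum(observed) / len(observed)] * len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

# Table critical value for df = 5 at the 0.05 significance level.
critical_value = 11.070
reject_null = chi_square > critical_value

print(chi_square, degrees_of_freedom, reject_null)
```

A statistic well below the critical value, as here, gives no evidence that the die is unfair.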
ANOVA (Analysis of Variance)
Tests whether the means of three or more groups are equal by comparing variation between groups with variation within groups
Confounding vs Lurking Variable
Confounding: related to both the explanatory variable and the outcome; Lurking: affects the outcome but is not included in the analysis
Bimodal Distribution
Distribution with two different modes
Cluster Sampling
Population divided into clusters; randomly select clusters, then sample all in cluster
Bayes' Theorem
Updates the probability of an event using new evidence: P(A|B) = P(B|A) P(A) / P(B)
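Worked sketch of Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B), for a screening test. The prevalence, sensitivity, and false-positive rate are made-up numbers chosen for illustration.

```python
from fractions import Fraction

# Hypothetical inputs (assumptions for illustration only).
prior = Fraction(1, 100)            # P(condition)
sensitivity = Fraction(95, 100)     # P(positive | condition)
false_positive = Fraction(5, 100)   # P(positive | no condition)

# Law of total probability gives P(positive).
p_positive = sensitivity * prior + false_positive * (1 - prior)

# Bayes' theorem: P(condition | positive).
posterior = sensitivity * prior / p_positive

print(posterior, float(posterior))
```

Even with an accurate test, the posterior stays low here because the condition is rare to begin with.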
Chi-Square Test for Independence
Tests if two categorical variables are independent
Control Group
Group in an experiment that does not receive treatment
Addition Rule for Probability
P(A or B) = P(A) + P(B) - P(A and B)
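A quick check of the addition rule on one roll of a fair die, with A = "even" and B = "greater than 3". The events are chosen only for illustration.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die.
outcomes = set(range(1, 7))
A = {x for x in outcomes if x % 2 == 0}  # even numbers
B = {x for x in outcomes if x > 3}       # numbers greater than 3


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


# Addition rule: subtract the overlap so it is not counted twice.
p_union_rule = prob(A) + prob(B) - prob(A & B)

# Direct count of outcomes in A or B, for comparison.
p_union_direct = prob(A | B)

print(p_union_rule, p_union_direct)
```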
Blinding
Participants unaware of whether they are receiving treatment or placebo
Box Plot
Displays five-number summary (minimum, Q1, median, Q3, maximum)
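A minimal sketch of the five-number summary using Python's standard library. The data are hypothetical. Quartile conventions differ between textbooks and software; this uses the "inclusive" method of `statistics.quantiles`, which is one choice among several.

```python
import statistics

# Hypothetical data set (illustrative only).
data = [7, 1, 9, 3, 5, 2, 8, 4, 6]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


summary = five_number_summary(data)
print(summary)
```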
Correlation Coefficient
Measures strength and direction of linear relationship between two variables
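A minimal sketch of the Pearson correlation coefficient computed from its definition. The function name `pearson_r` and the data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: a perfectly increasing and a perfectly decreasing line.
hours = [1, 2, 3, 4, 5]
r_up = pearson_r(hours, [2, 4, 6, 8, 10])
r_down = pearson_r(hours, [10, 8, 6, 4, 2])

print(r_up, r_down)
```

The sign gives the direction of the relationship and the magnitude (between 0 and 1) gives its strength.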
Coefficient of Determination
R-squared; proportion of variance in the dependent variable predictable from the independent variable
Alternative Hypothesis
Claim the researcher seeks evidence for; supported when the data lead to rejecting the null hypothesis
Conditional Probability Formula
P(A|B) = P(A and B)/P(B)
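A worked instance of the formula with made-up counts: out of 100 students, 40 play a sport (B) and 25 both play a sport and are on the honor roll (A and B).

```python
from fractions import Fraction

# Hypothetical counts (illustrative only).
total_students = 100
plays_sport = 40             # event B
sport_and_honor_roll = 25    # event A and B

p_b = Fraction(plays_sport, total_students)
p_a_and_b = Fraction(sport_and_honor_roll, total_students)

# P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b

print(p_a_given_b)
```

Conditioning on B restricts attention to the 40 students who play a sport, so the result matches 25/40.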
Cumulative Frequency
Sum of the frequencies for a given category and all categories before it
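A short sketch of cumulative frequencies as a running total. The category labels and counts are hypothetical.

```python
from itertools import accumulate

# Hypothetical frequency table for ordered categories (illustrative only).
categories = ["0-9", "10-19", "20-29", "30-39"]
frequencies = [3, 5, 2, 4]

# Each entry adds the current frequency to the total of all earlier ones.
cumulative = list(accumulate(frequencies))

for label, count in zip(categories, cumulative):
    print(label, count)
```

The last cumulative frequency always equals the total number of observations.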
Central Limit Theorem
Sampling distribution of the sample mean approximates a normal distribution as sample size increases
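A small simulation sketch of the theorem. It draws samples from a skewed exponential population (mean 1, standard deviation 1) and compares the spread of sample means for two sample sizes. The population, seed, and sample sizes are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Mean of n draws from an exponential population with mean 1."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n


# 2000 sample means for a small and a larger sample size.
means_small = [sample_mean(2) for _ in range(2000)]
means_large = [sample_mean(50) for _ in range(2000)]

center_large = statistics.mean(means_large)
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

# Theory predicts a spread near sigma / sqrt(n) around the population mean.
print(center_large, spread_small, spread_large)
```

With the larger sample size, the sample means cluster more tightly around the population mean and their histogram looks closer to a bell curve.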
Categorical Data Analysis: Chi-square Test
Assesses relationships between categorical variables
Convenience Sampling
Sampling the individuals who are easiest to reach; prone to bias
Biased vs Unbiased Estimator
Biased: expected value differs systematically from the true parameter; Unbiased: expected value equals the true parameter
Combination
Selection of objects without regard to order
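A short sketch contrasting combinations (order ignored) with permutations (order counted), choosing 2 items from 5. The item labels are placeholders.

```python
import math
from itertools import combinations

items = ["A", "B", "C", "D", "E"]  # placeholder labels

# Number of ways to choose 2 of 5 when order does not matter.
n_combinations = math.comb(len(items), 2)

# Number of ways when order does matter, for comparison.
n_permutations = math.perm(len(items), 2)

# Listing the combinations: ("A", "B") appears, ("B", "A") does not.
pairs = list(combinations(items, 2))

print(n_combinations, n_permutations, pairs)
```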
Binomial Distribution
Probability distribution of the number of successes in a fixed number of independent trials, each with the same probability of success
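A minimal sketch of the binomial probability P(X = k) = C(n, k) p^k (1 - p)^(n - k), applied to 10 flips of a fair coin. The function name `binomial_pmf` and the coin example are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


# Probability of exactly 5 heads in 10 flips of a fair coin.
p_five_heads = binomial_pmf(5, 10, 0.5)

# The probabilities over all possible counts (0 through 10) sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(p_five_heads, total)
```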
Confidence Interval
Range of values, computed from sample data, estimated to contain the population parameter at a stated level of confidence
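A rough sketch of a 95% confidence interval for a mean, using the normal approximation: mean ± z × (standard deviation / sqrt(n)). The measurements are hypothetical. For a sample this small, a t critical value would usually be preferred over z; z is used here only to keep the sketch within the standard library.

```python
import math
import statistics

# Hypothetical measurements (illustrative only).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]
confidence = 0.95

n = len(sample)
mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# z critical value that leaves (1 - confidence) / 2 in each tail.
z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * standard_error
lower, upper = mean - margin, mean + margin

print(lower, upper)
```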