Basic Statistics - (08) Sampling Distribution
1. Inferential Statistics
Methods for drawing conclusions about a population based on data from samples.
-
Sampling methods
-
Simple Random Sample
Each subject has the same chance of being chosen.
- Unknown sampling frame
-
Stratified Random Sample
- More representative in every stratum
- Must know sampling frame
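The difference between drawing subjects purely at random and drawing within each stratum can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population, the stratum labels "A" and "B", the 5% sampling fraction, and the helper names `simple_random_sample` and `stratified_sample` are all hypothetical and do not come from the notes.

```python
import random
from collections import defaultdict

def simple_random_sample(population, k, rng):
    """Draw k subjects so that every subject has the same chance of being chosen."""
    return rng.sample(population, k)

def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction from every stratum (proportional allocation)."""
    strata = defaultdict(list)
    for subject in population:
        strata[stratum_of(subject)].append(subject)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 900 subjects in stratum "A", 100 in stratum "B".
population = [("A", i) for i in range(900)] + [("B", i) for i in range(100)]

srs = simple_random_sample(population, 50, rng)
strat = stratified_sample(population, lambda s: s[0], 0.05, rng)

count_b_srs = sum(1 for s in srs if s[0] == "B")      # varies from draw to draw
count_b_strat = sum(1 for s in strat if s[0] == "B")  # fixed at 5 by construction
print(len(srs), count_b_srs, len(strat), count_b_strat)
```

The stratified draw needs the stratum of every subject in advance, which is why the sampling frame must be known; in exchange, the small stratum is always represented in proportion.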
-
2. Sampling Distribution and Central Limit Theorem
-
Sampling Distribution
With an infinite number of samples, the distribution of sample means is bell-shaped, with a mean equal to the population mean.
-
Central Limit Theorem
-
The sampling distribution of the sample mean \(\overline{x}\) is approximately normal, provided that n is sufficiently large.
-
This holds even if the variable is not normally distributed in the population (rule of thumb: n > 30).
-
-
Mean of the sampling distribution
- \[\mu_{\overline{x}} = \mu\]
-
Standard deviation of the sampling distribution (standard error)
- \[\sigma_{\overline{x}} = \frac{\sigma} {\sqrt{n}}\]
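- Population variation and the variation of sample means are positively related: the larger the variation in the population, the larger the variation in samples.
- Sample size and the variation of sample means are negatively related: the larger n, the lower the variation in samples (Central Limit Theorem!).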
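Both formulas can be checked with a small simulation. The sketch below assumes a population that is clearly not normal (an exponential distribution with rate 1, whose mean and standard deviation are both 1) and uses 5000 samples as a stand-in for "infinitely many"; these choices are illustrative assumptions, not part of the notes.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed for reproducibility

# Assumed population: exponential with rate 1 (strongly right-skewed).
# Its mean and standard deviation are both 1.
mu, sigma = 1.0, 1.0
n = 50              # size of each sample
num_samples = 5000  # stand-in for an infinite number of samples

# One sample mean per repeated sample of size n.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theory_sd = sigma / n ** 0.5

print(f"mean of sample means: {mean_of_means:.3f} (theory: {mu:.3f})")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {theory_sd:.3f})")
```

Rerunning with a larger `n` should shrink the spread of `sample_means` roughly in proportion to \(1/\sqrt{n}\).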
3. Population distribution & Sample distribution & Sampling distribution
-
Differences of 3 distributions
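- Sample distribution: the distribution of the data in one single sample.
- Sampling distribution (theoretical distribution): the distribution of a statistic over an infinite number of samples. It follows the Central Limit Theorem, and its mean converges to the population mean.
- Population distribution: the distribution of the variable across the whole population.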
-
Z-score for sample mean
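This type of question asks for the probability that the sample mean falls within a given interval, so it should be answered using the sampling distribution.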
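As a sketch of such a calculation, the z-score of a sample mean is \(z = (\overline{x} - \mu) / (\sigma / \sqrt{n})\), using the standard deviation of the sampling distribution rather than the population standard deviation. The helper name `prob_sample_mean_between` and the numbers below (population mean 100, standard deviation 15, n = 36) are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_between(low, high, mu, sigma, n):
    """P(low < x_bar < high) based on the sampling distribution of the mean."""
    standard_error = sigma / sqrt(n)  # sigma_x_bar = sigma / sqrt(n)
    z_low = (low - mu) / standard_error
    z_high = (high - mu) / standard_error
    std_normal = NormalDist()         # mean 0, standard deviation 1
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Hypothetical numbers: the standard error is 15 / 6 = 2.5, so the z-scores are -2 and +2.
p = prob_sample_mean_between(95, 105, mu=100, sigma=15, n=36)
print(round(p, 4))
```

With these numbers the interval spans two standard errors on each side of the mean, so the probability is about 0.954. Using \(\sigma\) instead of \(\sigma / \sqrt{n}\) would answer a different question, namely where a single observation falls.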
4. Binomial Sampling Distribution
-
Sampling Distribution of Sample Proportion
-
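Sample size condition
- \[n\pi \geq 15\]
-
\[n(1-\pi) \geq 15\]
Here n is the number of observations in each sample. If each sample is small, the normal approximation of the sampling distribution will be inaccurate.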
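The two conditions can be wrapped in a small check. The function name `normal_approximation_ok` and the example values are hypothetical; the threshold of 15 is the one used in the conditions above.

```python
def normal_approximation_ok(n, pi, threshold=15):
    """Check that both n*pi and n*(1 - pi) reach the threshold."""
    return n * pi >= threshold and n * (1 - pi) >= threshold

print(normal_approximation_ok(100, 0.5))  # 50 expected successes, 50 expected failures
print(normal_approximation_ok(100, 0.1))  # only 10 expected successes
print(normal_approximation_ok(200, 0.1))  # 20 expected successes, 180 expected failures
```

The second call fails the check even though n = 100 looks large, which shows that the required sample size depends on how far \(\pi\) is from 0.5.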
-
-
Mean
- \[\mu_p = \pi\]
\(\pi\) is the probability of success for the event.
-
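Deviation
- \[\sigma_{p}^2 = \frac{\pi(1-\pi)}{n}, \qquad \sigma_{p} = \sqrt{\frac{\pi(1-\pi)}{n}}\]
- Note: the population variance of a single success/failure outcome is \(\sigma^2 = \pi(1-\pi)\), so dividing by n parallels \(\sigma_{\overline{x}} = \sigma / \sqrt{n}\).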
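A simulation can illustrate that sample proportions center on \(\pi\) with a standard deviation close to \(\sqrt{\pi(1-\pi)/n}\). The values \(\pi = 0.3\), n = 100, and 5000 repeated samples are assumptions made for this sketch only.

```python
import random
import statistics
from math import sqrt

rng = random.Random(7)  # fixed seed for reproducibility

pi = 0.3            # assumed probability of success
n = 100             # observations per sample: n*pi = 30, n*(1 - pi) = 70
num_samples = 5000  # stand-in for an infinite number of samples

# Each sample yields one proportion: successes divided by sample size.
proportions = [
    sum(rng.random() < pi for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_p = statistics.fmean(proportions)
sd_p = statistics.stdev(proportions)
theory_sd = sqrt(pi * (1 - pi) / n)

print(f"mean of sample proportions: {mean_p:.4f} (theory: {pi:.4f})")
print(f"sd of sample proportions:   {sd_p:.4f} (theory: {theory_sd:.4f})")
```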
5. Example
6. Conclusion
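Whether the population is binomial or normal:
- The mean over an infinite number of samples (the sampling mean) converges to the population mean.
- By the Central Limit Theorem, sample means cluster around the population mean, so the variance of the sampling distribution is the population variance divided by the sample size n.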