Biased Estimators and Sample Statistics

Variance in Terms of Expected Differences

Variance is equal to the difference between the expected square value and the expected value squared:

$$\mathrm{Var}(X) = E[X^2] - (E[X])^2$$
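As a quick check of the identity $\mathrm{Var}(X) = E[X^2] - (E[X])^2$, the sketch below computes the variance of a fair six-sided die both ways. The die and the helper names (`expectation`, `variance_by_definition`, `variance_by_identity`) are assumptions chosen for illustration; exact fractions are used so that the comparison is not blurred by floating-point rounding.

```python
from fractions import Fraction


def expectation(values):
    """Expected value of equally likely outcomes, kept exact with Fraction."""
    values = [Fraction(v) for v in values]
    return sum(values) / len(values)


def variance_by_definition(values):
    """E[(X - E[X])^2], the mean squared deviation from the mean."""
    mu = expectation(values)
    return expectation([(Fraction(v) - mu) ** 2 for v in values])


def variance_by_identity(values):
    """E[X^2] - (E[X])^2, the form used throughout this section."""
    return expectation([Fraction(v) ** 2 for v in values]) - expectation(values) ** 2


# Illustrative distribution: a fair six-sided die.
die = [1, 2, 3, 4, 5, 6]
print(variance_by_definition(die), variance_by_identity(die))  # 35/12 35/12
```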

Summing Variances

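If $X$ and $Y$ are independent, the variance of their sum is the sum of their variances:

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$$

Note, however, that if $X$ and $Y$ are not independent, a covariance term appears:

$$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$$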
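The rule $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$ can be checked on a small joint distribution. The sketch below is illustrative only: the dependent pairs are made-up values, each treated as equally likely, and the independent case is built from two fair dice, where the covariance term vanishes.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact mean of equally likely outcomes."""
    values = [Fraction(v) for v in values]
    return sum(values) / len(values)


def var(values):
    """Population variance of equally likely outcomes."""
    m = mean(values)
    return mean([(Fraction(v) - m) ** 2 for v in values])


def cov(xs, ys):
    """Population covariance of paired, equally likely outcomes."""
    mx, my = mean(xs), mean(ys)
    return mean([(Fraction(x) - mx) * (Fraction(y) - my) for x, y in zip(xs, ys)])


def check_sum_rule(pairs):
    """Return (Var(X+Y), Var(X) + Var(Y) + 2 Cov(X, Y), Cov(X, Y))."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    lhs = var([x + y for x, y in pairs])
    rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
    return lhs, rhs, cov(xs, ys)


# Dependent case: hypothetical joint outcomes, each equally likely.
dependent = [(1, 2), (2, 3), (3, 5), (4, 4)]
dep_lhs, dep_rhs, dep_cov = check_sum_rule(dependent)

# Independent case: every combination of two fair dice is equally likely.
independent = list(product(range(1, 7), repeat=2))
ind_lhs, ind_rhs, ind_cov = check_sum_rule(independent)

print(dep_lhs == dep_rhs, dep_cov)
print(ind_lhs == ind_rhs, ind_cov)
```

In the dependent case the covariance is non-zero, so dropping it would understate or overstate the variance of the sum; in the independent case it is exactly zero and the simpler rule applies.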

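Multiplying Variance by a Constant

Multiplying a random variable by a constant $a$ scales its variance by $a^2$:

$$\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$$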
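The scaling rule $\mathrm{Var}(aX) = a^2\,\mathrm{Var}(X)$ can be confirmed with the standard library. This is a minimal sketch: the data (a fair die) and the constant `a = 3` are arbitrary choices, and `statistics.pvariance` is used because it divides by $n$, matching the population variance.

```python
from fractions import Fraction
from statistics import pvariance

# Illustrative data: the faces of a fair die, as exact fractions.
data = [Fraction(v) for v in (1, 2, 3, 4, 5, 6)]
a = 3  # arbitrary constant

scaled = [a * v for v in data]

var_scaled = pvariance(scaled)         # Var(aX)
var_expected = a ** 2 * pvariance(data)  # a^2 Var(X)

print(var_scaled, var_expected)  # 105/4 105/4
```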

Variance of a Sample Mean

Note

You may recall this:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

from the Central Limit Theorem [1]. We develop this idea further in the section on t-distributions.

[1] https://en.wikipedia.org/wiki/Central_limit_theorem

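Consider a sample of size $n$. The population has a variance $\sigma^2$ and a mean $\mu$; the sample has a variance $s^2$ and a mean $\bar{x}$. Many different samples could be taken, so there is a distribution of $\bar{x}$ values that could occur. The variance of the sample means is $\sigma^2 / n$. This result accompanies the CLT and, for independent observations, follows from the summing and scaling rules above:

$$\mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(x_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$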
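To see that the variance of the sample mean is $\sigma^2 / n$, one option is to enumerate every possible sample rather than simulate. The sketch below assumes a tiny made-up population and independent draws with replacement, so every ordered sample of size `n` is equally likely; under those assumptions the result is exact.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance

# Hypothetical population; each value is equally likely on every draw.
population = [Fraction(v) for v in (2, 4, 4, 7, 9)]
n = 3  # sample size

sigma2 = pvariance(population)  # population variance

# Every ordered sample of size n (drawing with replacement).
samples = list(product(population, repeat=n))

# The distribution of the sample mean, and its variance.
sample_means = [mean(s) for s in samples]
var_of_means = pvariance(sample_means)

print(var_of_means == sigma2 / n)
```

Enumeration is only practical for small populations and sample sizes (here $5^3 = 125$ samples); for anything larger, random sampling would give an approximate version of the same check.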

Bessel's Correction

Suppose that the sample variance was given by the typical formula:

$$s_n^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

If its expected value were $\sigma^2$, it would in that sense be an unbiased estimator of the population variance.

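The key here is to recognise that each $x_i$ comes from a sample; by introducing the sample mean $\bar{x}$ we can solve for the expected value of the formula in terms of expected values and the sample mean. The sample introduces the bias, so it is necessary to use the sample mean:

$$E\left[\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right] = E\left[\frac{1}{n}\sum_{i=1}^{n}x_i^2 - \bar{x}^2\right] = E[x_i^2] - E[\bar{x}^2]$$

Note in particular that $E[\bar{x}^2]$ is now a term of the expression. By the CLT this will give a $\sigma^2 / n$, and this is the key insight that pulls everything together: it is that very $\sigma^2 / n$ that accounts for the $n - 1$.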

Expected Value Squared

The expected value of $\bar{x}$ is the mean value:

$$E[\bar{x}] = \mu \quad\Rightarrow\quad (E[\bar{x}])^2 = \mu^2$$

Expected Square Value

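Recall the definition of variance from earlier, rearranged for the expected square value:

$$\mathrm{Var}(X) = E[X^2] - (E[X])^2 \quad\Rightarrow\quad E[X^2] = \mathrm{Var}(X) + (E[X])^2$$

Applying this to the sample mean:

$$E[\bar{x}^2] = \mathrm{Var}(\bar{x}) + (E[\bar{x}])^2 = \frac{\sigma^2}{n} + \mu^2$$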

This step provides the key insight. If the same rule is applied to a single observation from the population, there is no $n$ in the denominator:

$$E[x_i^2] = \sigma^2 + \mu^2$$

Because the variance of a sample mean is different to the variance of a population statistic, the expected value of a squared sample mean is different to the expected value of a squared observation. The $n$ in $\sigma^2 / n$ is the same one that leads to the $n - 1$. The expected value of a squared sample mean is less than the expected value of a squared observation, because the sample mean is going to be more central.
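The two expected squares can be compared directly: $E[x_i^2] = \sigma^2 + \mu^2$ for a single observation and $E[\bar{x}^2] = \sigma^2 / n + \mu^2$ for a sample mean. The sketch below is an illustration under assumptions: a small made-up population, independent draws with replacement, and exact enumeration of every ordered sample.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance

# Hypothetical population; each value is equally likely on every draw.
population = [Fraction(v) for v in (1, 3, 8)]
n = 2  # sample size

mu = mean(population)
sigma2 = pvariance(population)

# E[x^2] for a single observation.
e_obs_sq = mean([x ** 2 for x in population])

# E[xbar^2] over all equally likely ordered samples of size n.
e_mean_sq = mean([mean(s) ** 2 for s in product(population, repeat=n)])

# The gap between the two is sigma^2 - sigma^2/n = sigma^2 (n - 1) / n.
gap = e_obs_sq - e_mean_sq

print(e_obs_sq == sigma2 + mu ** 2)
print(e_mean_sq == sigma2 / n + mu ** 2)
print(gap == sigma2 * (n - 1) / n)
```

The gap between the two quantities is exactly the expected value of the uncorrected sample variance, which is where the factor $(n - 1)/n$ comes from.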

Solving The Expected Sample Variance

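Writing $s_n^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$ for the uncorrected formula and substituting the two expected squares:

$$E[s_n^2] = E[x_i^2] - E[\bar{x}^2] = \left(\sigma^2 + \mu^2\right) - \left(\frac{\sigma^2}{n} + \mu^2\right) = \frac{n - 1}{n}\,\sigma^2$$

This shows that the variance formula, applied to a sample, is biased. In order to correct that bias, define $s^2$ like so:

$$s^2 = \frac{n}{n - 1}\,s_n^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$

The expected value of this is:

$$E[s^2] = \frac{n}{n - 1}\cdot\frac{n - 1}{n}\,\sigma^2 = \sigma^2$$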
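The bias and its correction can be verified exactly on a small example. The sketch below assumes a made-up population and independent draws with replacement, enumerates every ordered sample of size `n`, and averages both estimators over them. It relies on `statistics.pvariance` dividing by $n$ and `statistics.variance` dividing by $n - 1$.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population; each value is equally likely on every draw.
population = [Fraction(v) for v in (2, 4, 4, 7, 9)]
n = 3  # sample size (must be at least 2 for the corrected estimator)

sigma2 = pvariance(population)  # true population variance

# Every ordered sample of size n, each equally likely.
samples = list(product(population, repeat=n))

# Expected value of the uncorrected estimator (divides by n).
expected_biased = mean([pvariance(s) for s in samples])

# Expected value of the Bessel-corrected estimator (divides by n - 1).
expected_corrected = mean([variance(s) for s in samples])

print(expected_biased == Fraction(n - 1, n) * sigma2)
print(expected_corrected == sigma2)
```

The uncorrected estimator falls short of the population variance by the factor $(n - 1)/n$, while the corrected one matches it on average. Note that this says nothing about any single sample, only about the average over all samples.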