~In this lesson we compare the population mean & population
standard deviation with the mean & standard deviation of the sample means
(for samples of different sizes n taken from a population of size N).
~Note: If the population size (N) is small, a correction is
needed when the samples are taken (usually, when n is more than
5% of N).
(The correction will be specified in this lesson.)
~First, we need to look at the sampling distribution for all the samples of size n taken from N.
~Note: The following example is not realistic, since the population is very small. It is for illustration purposes only.
~Ex: The heights (in inches) of starting players on a
given basketball team are: 76, 78, 79, 81, 86. We will consider this to
be the entire population. So, N = 5.
~The mean of this population is 80 and its standard deviation is 3.41 (either by formula or the TI-83)
~Let’s say we are interested in samples of size 2 taken from this population (n = 2)
~The number of samples of size 2 that can be taken is 5C2 = 10
They are: (76,78), (76,79), (76,81), (76,86), (78,79), (78,81), (78,86), (79,81), (79,86), (81,86).
Their means are: 77.0, 77.5, 78.5, 81.0, 78.5, 79.5, 82.0, 80.0, 82.5, 83.5, respectively.
~The mean of these sample means = 80 & the standard deviation of these means = 2.09
~Repeating this procedure for all possible samples of size n = 4, we would have 5C4 = 5 samples.
~The mean of the sample means in this case would be 80 (same as above), but the standard deviation = 0.85
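~Note: The two enumerations above can be checked with a short Python sketch. Python is not part of this lesson, and the function and variable names are only illustrative choices. The sketch lists every possible sample of size n, then takes the mean and the standard deviation of the resulting sample means.

```python
from itertools import combinations
from statistics import mean, pstdev

heights = [76, 78, 79, 81, 86]  # the entire population, N = 5

def sample_mean_summary(population, n):
    """Return (number of samples, mean of sample means, SD of sample means)."""
    # Every possible sample of size n, taken without replacement
    sample_means = [mean(s) for s in combinations(population, n)]
    # pstdev divides by the count (population formula), because we have ALL samples
    return len(sample_means), mean(sample_means), pstdev(sample_means)

for n in (2, 4):
    count, mean_of_means, sd_of_means = sample_mean_summary(heights, n)
    print(f"n = {n}: {count} samples, "
          f"mean of means = {mean_of_means:.2f}, SD of means = {sd_of_means:.2f}")
```

Note that pstdev (divide by the number of values) is used rather than stdev (divide by one less), since the list holds every possible sample mean and not just a sample of them.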
~Note: In both cases, the mean of the sample means equals the population mean. This will be true regardless of the size of our sample. However, the standard deviation of the sample means gets smaller as the sample size increases. Is there a formula that ties these together? Yes!
Standard deviation of the sample means = sqrt[(N-n)/(N-1)] times (the
standard deviation of the population divided by the square root of n).
Ugh!
~Note: The quantity sqrt[(N-n)/(N-1)] is known as the "finite population correction" & must be used if n > 5% of N
~Note: We will mainly use the formula without this factor,
since our samples will be small compared to N. So, we will be
using the population standard deviation divided by the square
root of n to compute the standard deviations of our sample means.
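~Note: As a check on the formula, here is a minimal Python sketch. The function name sd_of_sample_means and its arguments are hypothetical, chosen only for illustration. It applies the finite population correction only when a population size N is supplied and n is more than 5% of N. Otherwise it uses the population standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import pstdev

def sd_of_sample_means(sigma, n, N=None):
    """Standard deviation of the sample means.

    sigma: population standard deviation
    n:     sample size
    N:     population size; if None, the population is treated as large
           and the finite population correction is skipped
    """
    sd = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        # Finite population correction: sqrt[(N-n)/(N-1)]
        sd *= sqrt((N - n) / (N - 1))
    return sd

heights = [76, 78, 79, 81, 86]   # the basketball population, N = 5
sigma = pstdev(heights)          # population standard deviation, about 3.41

print(round(sd_of_sample_means(sigma, 2, N=5), 2))  # 2.09
print(round(sd_of_sample_means(sigma, 4, N=5), 2))  # 0.85
```

Both results agree with the standard deviations found by listing all the samples in the basketball example.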
Note: Also see my link, "Comparing Means and Standard Deviations"
~Simply stated, the Central Limit Theorem says that the larger we take the
sample size n, the closer the sampling distribution of the sample means gets
to a normal distribution, REGARDLESS of the distribution of the original population.
~Recall that the mean of the sample means equals the population mean
& the standard deviation of the sample means equals the population
standard deviation divided by the square root of the sample size n.
~Important points:
A) If the original population is normally distributed, then so is the
sampling distribution of the means, for all sample sizes. Its mean
& standard deviation are found by the previous formulas.
B) If the original population is not normally distributed, the sampling
distribution of the means will approach a normal distribution as the
sample size n increases. (Usually n of 30 or greater is needed.)
~Note: So, the techniques for finding probabilities (areas) using
the standard normal curve can be applied to the sample means, provided
n is sufficiently large.
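~Note: Point B can be illustrated with a small simulation. This is only a sketch under assumptions of my own: the population is taken to be exponential (mean 1, standard deviation 1, strongly skewed to the right), and the helper names, the seed, and the number of repetitions are arbitrary choices. Skewness near 0 indicates a symmetric, normal-like shape.

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(1)  # fixed seed so the run is repeatable

def skewness(values):
    """Skewness of a list of values: about 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])

def simulate_sample_means(n, repetitions=5000):
    """Means of `repetitions` random samples of size n from the assumed
    exponential population (mean 1, SD 1, not normal)."""
    return [mean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(repetitions)]

for n in (2, 30):
    means = simulate_sample_means(n)
    print(f"n = {n:2d}: mean of means = {mean(means):.3f}, "
          f"SD of means = {pstdev(means):.3f} "
          f"(sigma/sqrt(n) = {1 / sqrt(n):.3f}), "
          f"skewness = {skewness(means):.2f}")
```

For n = 30 the mean of the sample means should come out close to 1, their standard deviation close to 1/sqrt(30), and the skewness much closer to 0 than for n = 2.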
~Example: The monthly mortgage payment for home
buyers has a mean of $732 and a standard deviation of $421. A random
sample of 125 home buyers is selected. Find the probability that their
mean mortgage payment exceeds the true mean by more than $50
(i.e., the sample mean exceeds $782).
The mean of the sample means = population mean = $732, regardless of the sample size.
The standard deviation of the sample means = population standard deviation divided by sqrt(125) = 421/sqrt(125) = 37.66 (there is a round off error, see the end of this lesson)
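~Note: We do not use the finite population correction here, since a sample of 125 is small compared to the population of all home buyers (n is less than 5% of N). Separately, n is greater than 30, which is what allows us to use the Central Limit Theorem.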
~Note: By the Central Limit Theorem, the sampling distribution
of the means is approximately Normal, so, we will use the standard
normal curve to compute this probability.
Converting 782 to a z score: z = (782 - 732)/37.66 = 1.33 (rounded to 2 decimal places). The area to the right of z = 1.33 gives a probability of .0918 (less than a 10% chance).
To do this on the TI-83, press 2nd VARS (DISTR), choose item 2 (normalcdf), insert (1.33, 10000), then press ENTER.
~Note: You can avoid finding the z score by entering
(782, 10000, 732, 421/sqrt(125)) into the same menu to get .0921.
This is more accurate since you are not rounding off the
z-score. I expect everyone to use this method in all of my testing
situations.
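~Note: For anyone without a TI-83, the same two calculations can be sketched in Python with the standard library's NormalDist. This is an alternative I am offering for checking work, not the calculator method used in this course, and the variable names are illustrative. The upper bound of 10000 on the calculator stands in for infinity, so here the area to the right is found as 1 minus the area to the left.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 732, 421, 125
sd_xbar = sigma / sqrt(n)  # standard deviation of the sample means

# Method 1: round the z-score to 2 decimal places first
z = round((782 - mu) / sd_xbar, 2)
p_rounded = 1 - NormalDist().cdf(z)

# Method 2: no z-score rounding; use the mean and SD of the sample means directly
p_direct = 1 - NormalDist(mu, sd_xbar).cdf(782)

print(f"SD of sample means = {sd_xbar:.2f}")           # 37.66
print(f"z = {z}, probability = {p_rounded:.4f}")       # z = 1.33, probability = 0.0918
print(f"probability without rounding = {p_direct:.4f}")  # 0.0921
```

The small difference between .0918 and .0921 comes entirely from rounding the z-score in the first method.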