~Class meeting #14

~We compare the population mean & population standard deviation with the mean & standard deviation of the sample means (for samples of different sizes n taken from a population of size N).

~Note: If the population size (N) is small relative to the sample size, a correction is needed when the samples are taken (usually, if n > 5% of N).
(The correction is specified later in this lesson.)

~First, we need to look at the sampling distribution for all the samples of size n taken from N.

~Note: The following example is not realistic, since the population is very small; it’s for illustration purposes only.

~Ex: The heights (in inches) of starting players on a given basketball team are: 76, 78, 79, 81, 86. We will consider this to be the entire population. So, N = 5.

~The mean of this population is 80 and its standard deviation is 3.41 (found either by the population formula, which divides by N, or on the TI-83).

~Let’s say we are interested in samples of size 2 taken from this population (n = 2)

~The number of samples of size 2 that can be taken is 5C2 = 10
They are: (76,78), (76,79), (76,81), (76,86), (78,79), (78,81), (78,86), (79,81), (79,86), (81,86).

Their means are: 77.0, 77.5, 78.5, 81.0, 78.5, 79.5, 82.0, 80.0, 82.5, 83.5, respectively.

~The mean of these sample means = 80 & the standard deviation of these means = 2.09

~Repeating this procedure for all possible samples of size n = 4, we would have 5C4 = 5 samples.

~The mean of the sample means in this case would be 80 (same as above), but the standard deviation = 0.85
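
The enumeration above can be checked with a short Python sketch. Python is not part of this lesson (we use the TI-83), and the function name sampling_distribution is only an illustrative choice. It lists every sample of size n taken without replacement, then reports how many samples there are, the mean of the sample means, and their standard deviation.

```python
from itertools import combinations
from statistics import mean, pstdev

# Population from the example above (heights in inches).
population = [76, 78, 79, 81, 86]

def sampling_distribution(pop, n):
    """Return the means of all possible samples of size n (without replacement)."""
    return [mean(sample) for sample in combinations(pop, n)]

for n in (2, 4):
    means = sampling_distribution(population, n)
    # pstdev divides by the number of values, treating the list of
    # sample means as a complete population (which it is here).
    print(n, len(means), mean(means), round(pstdev(means), 2))
```

For n = 2 and n = 4 this should reproduce the standard deviations 2.09 and 0.85 quoted above, with the mean of the sample means equal to 80 both times.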

~Note: In both cases, the mean of the sample means equals the population mean. This will be true regardless of the size of our sample. However, the standard deviation of the sample means gets smaller as the sample size increases. Is there a formula that ties these together? Yes!

Standard deviation of the sample means = sqrt[(N-n)/(N-1)] times
(standard deviation of the population / sqrt(n)), where sqrt means square root.
Ugh!

~Note: The quantity sqrt[(N-n)/(N-1)] (the square root of (N-n)/(N-1)) is known as the "finite population correction" & must be used if n > 5% of N
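
As a rough check, the formula can be written as a small Python function and applied to the basketball example. This is only a sketch: the name sd_of_sample_means and the optional N argument are my own choices, and the 5% cutoff is the rule of thumb stated above.

```python
from math import sqrt

def sd_of_sample_means(sigma, n, N=None):
    """Standard deviation of the sample means.

    sigma: population standard deviation
    n:     sample size
    N:     population size; if None, the population is treated as very
           large and no finite population correction is applied.
    """
    sd = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        # Finite population correction, used when n > 5% of N.
        sd *= sqrt((N - n) / (N - 1))
    return sd

sigma = sqrt(58 / 5)  # population SD of 76, 78, 79, 81, 86 (about 3.41)
print(round(sd_of_sample_means(sigma, 2, N=5), 2))  # samples of size 2
print(round(sd_of_sample_means(sigma, 4, N=5), 2))  # samples of size 4
print(round(sd_of_sample_means(sigma, 2), 2))       # same n, no correction
```

With N = 5 the two corrected values should agree with the 2.09 and 0.85 found by listing every sample; leaving the correction out for such a small population gives a noticeably larger value.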

~Note: We will mainly use the formula without this factor, since our samples will be small compared to N. So, we will be using the population standard deviation divided by the square root of n to compute the standard deviations of our sample means.

Note: Also see my link, "Comparing Means and Standard Deviations"

~The Central Limit Theorem (see Topics of Interest)

~Simply stated, the theorem says that the larger we take the sample size n, the closer the sampling distribution of the sample means gets to a normal distribution, REGARDLESS of the distribution of the original population.

~Recall that the mean of the sample means equals the population mean & the standard deviation of the sample means equals the population standard deviation divided by the square root of the sample size n.

~Important points:

A) If the original population is normally distributed, then so is the sampling distribution of the means, for all sample sizes. Its mean & standard deviation are found by the previous formulas.

B) If the original population is not normally distributed, the sampling distribution of the means will approach a normal distribution as the sample size n increases. (We usually need n of 30 or greater.)

~Note: So, the techniques for finding probabilities (areas) using the standard normal curve can be applied to the sample means, provided n is sufficiently large.
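
Point B can be illustrated with a small simulation, sketched here in Python. Everything in it is an assumption made for illustration and is not from the lesson: the skewed population (an exponential distribution with mean 1 and standard deviation 1), the 5000 trials, the seed, and the function name summarize. For each sample size it reports the mean and standard deviation of the simulated sample means, the value sigma/sqrt(n) for comparison, and the share of sample means within one standard deviation of their mean (about 68% for a normal curve).

```python
import random
from math import sqrt
from statistics import mean, pstdev

random.seed(14)  # fixed seed so repeated runs give the same numbers

def summarize(n, trials=5000):
    """Draw `trials` samples of size n from a skewed population and
    summarize the resulting sample means."""
    # expovariate(1.0): right-skewed population with mean 1 and SD 1.
    means = [sum(random.expovariate(1.0) for _ in range(n)) / n
             for _ in range(trials)]
    m, s = mean(means), pstdev(means)
    # Share of sample means within one SD of their mean;
    # a normal curve puts about 68% there.
    within = sum(abs(x - m) <= s for x in means) / trials
    return m, s, within

for n in (2, 30):
    m, s, within = summarize(n)
    print(f"n={n}: mean={m:.3f}  sd={s:.3f}  "
          f"sigma/sqrt(n)={1 / sqrt(n):.3f}  within 1 SD={within:.3f}")
```

The simulated standard deviations should land close to sigma/sqrt(n), and the "within 1 SD" share should sit nearer to 68% for n = 30 than for n = 2, which is the sense in which the sampling distribution approaches a normal curve.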

~Example: Monthly mortgage payments for home buyers have a mean of $732 and a standard deviation of $421. A random sample of 125 home buyers is selected. Find the probability that the sample's mean mortgage payment exceeds the true mean by more than $50
(i.e., exceeds $782).

The mean of the sample means = population mean = $732, regardless of the sample size.

The standard deviation of the sample means = population standard deviation divided by sqrt(125) = 421/sqrt(125) = 37.66 (there is a round-off error; see the end of this lesson)

~Note: Since the sample of 125 is small compared to the population of all home buyers (n is well under 5% of N), we do not use the finite population correction.

~Note: Since n = 125 is greater than 30, the Central Limit Theorem tells us the sampling distribution of the means is approximately normal, so we will use the standard normal curve to compute this probability.

Converting 782 to a z score, z = (782 - 732)/37.66 = 1.33 (rounded to 2 decimal places), we get a probability of .0918 for the area to the right of z (less than a 10% chance)

To do this on the TI-83, press 2nd VARS (DISTR), go to menu item 2 (normalcdf), insert (1.33, 10000), then press ENTER.

~Note: You can avoid finding the z score by entering
(782, 10000, 732, 421/sqrt(125)) into the same menu to get .0921. This is more accurate since you are not rounding off the z score. I expect everyone to use this method in all of my testing situations.
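
For anyone who wants to check the two calculator answers off the TI-83, here is a sketch using Python's standard statistics.NormalDist. The variable names are my own, and this is a cross-check, not a replacement for the calculator method expected on tests.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 732, 421, 125
se = sigma / sqrt(n)  # standard deviation of the sample means

# Method 1: round the z score to 2 decimal places first, then use
# the standard normal curve (area to the right of z).
z = (782 - mu) / se
p_rounded_z = 1 - NormalDist().cdf(round(z, 2))

# Method 2: skip the z score and use the normal curve with
# mean 732 and standard deviation 421/sqrt(125) directly.
p_direct = 1 - NormalDist(mu, se).cdf(782)

print(round(se, 2), round(z, 2), round(p_rounded_z, 4), round(p_direct, 4))
```

The small gap between the two probabilities comes only from rounding z before looking up the area, which is why the direct entry is preferred.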