~Class meeting #3------Descriptive Statistics
~Descriptive Statistics deals with gathering & describing data (deals with a sample of data).
~Inferential Statistics deals with drawing conclusions about a population based on sampling.
~Measures of center (central tendency)
~The following is a list of measures ranked by reliability.
~1) Mean (average)(m): Just add the data point values then divide by the number of data points.
m = ∑x/n Ex: 70, 75, 80, 80, 90, 95.........m = 490/6 = 81.66...= 81.7
~If no directions are given, we usually round off to one more decimal place than the original values.
~2) Median (middle value)(M*): May or may not be in the original data set.
A) odd # of data: M* will be in the data set
B) even # of data: M* is the average of the two center values (may or may not be in the data set)
Ex: For the above data points, median = (80+80)/2 = 80
~3) Mode (M)(most repeated data point): (highest frequency)
A) If there is no value repeated, then there is NO mode.
B) If two values have the same highest frequency, then both are modes (Bimodal data)
C) If more than two have the same highest frequency, then the data is multimodal
Ex: For the above data points, M = 80
~4) Midrange (a poor choice of words): We will call this the Extreme Value Mean (EVM)
EVM = (highest score + lowest score)/2
Ex: For the above data points, EVM = (95+70)/2 = 82.5
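~Python sketch (not part of the handout): the four measures of center above, checked on the same six scores. The helper name modes is made up for illustration; it returns an empty list when no value repeats, matching rule 3A, and it assumes the data list is not empty.

```python
from collections import Counter
from statistics import mean, median

scores = [70, 75, 80, 80, 90, 95]

def modes(data):
    """Return every value tied for the highest frequency (hypothetical helper)."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []  # no value repeats, so there is no mode (rule 3A)
    return sorted(v for v, c in counts.items() if c == top)

m = mean(scores)                         # sum of the values / number of values
med = median(scores)                     # even n: average of the two center values
evm = (max(scores) + min(scores)) / 2    # extreme value mean (midrange)

print(round(m, 1), med, modes(scores), evm)
```

A list with two values tied for the highest frequency, such as [1, 1, 2, 2, 3], gives two modes (bimodal).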
~Other measures that some use:
A) Weighted mean: Certain data points are counted more than once (each value gets a weight). The weighted values
are added, then divided by the total number of scores, counting the repeats (the sum of the weights).
B) Trimmed mean: A certain percentage of the lowest and/or highest scores is eliminated. Then the rest are averaged.
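~Python sketch (not part of the handout): one way to compute the two measures above. The function names, the weights 1, 2, 3, and the choice to trim the same count k from each end are all assumptions made for illustration; the notes do not fix a trimming rule.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def trimmed_mean(data, k):
    """Drop the k lowest and k highest values, then average the rest."""
    kept = sorted(data)[k:len(data) - k]
    return sum(kept) / len(kept)

# Hypothetical weights: 70 counted once, 80 twice, 90 three times.
wm = weighted_mean([70, 80, 90], [1, 2, 3])        # (70 + 160 + 270) / 6

# Trim one score from each end of the six scores used earlier.
tm = trimmed_mean([70, 75, 80, 80, 90, 95], 1)     # (75 + 80 + 80 + 90) / 4

print(round(wm, 1), tm)
```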
~Finding Statistics from a Frequency Distribution:
Mean (m) = [∑(f)(x*)]/n, where x* is the class mark (midpoint of the interval, or the single value), f is the frequency, and n = ∑f.
Let’s do it for data points on first page.
  x      f     (f)(x)
 70      1       70
 75      1       75
 80      2      160
 90      1       90
 95      1       95
        ---     ----
 Sum:    6      490
so,
m = 490/6 = 81.7
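~Python sketch (not part of the handout): the same frequency-table mean. Storing the table as a dictionary of value: frequency pairs is just one convenient choice.

```python
# Frequency table: each value x mapped to its frequency f.
freq = {70: 1, 75: 1, 80: 2, 90: 1, 95: 1}

n = sum(freq.values())                         # n = sum of f
total = sum(f * x for x, f in freq.items())    # sum of (f)(x)
m = total / n

print(n, total, round(m, 1))
```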
Let’s do it on the TI-83...(see handout on TI-83, first page, bottom)
~The Shape of a Distribution:
A) Symmetric (bell-shaped): mean = median = mode....draw...
B) Skewness ("Squash"): the data is bunched toward one side, and the mean is pulled toward the longer tail.
~Skewed left (or negative): order (from left to right) is mean, median, mode.
~Skewed right (or positive): order (from left to right) is mode, median, mean.
~See textbook for real good diagrams
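~Python sketch (not part of the handout): two small made-up data sets that show the orderings above. The ordering is a rule of thumb for typical one-peaked data, not a guarantee for every data set.

```python
from statistics import mean, median, mode

# Made-up data with a long tail to the right (one large value).
right = [1, 2, 2, 3, 4, 5, 18]
# Made-up data with a long tail to the left (one small value).
left = [2, 15, 16, 17, 18, 18, 19]

r = (mode(right), median(right), mean(right))   # mode, median, mean
l = (mean(left), median(left), mode(left))      # mean, median, mode

print("skewed right:", r)
print("skewed left: ", l)
```

In both cases the three values come out in increasing order, with the mean sitting closest to the long tail.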
~Kurtosis ("peakedness"): The Normal curve (will cover later) has Skewness = 0 (it is symmetric about the mean, so mean = median = mode) and Kurtosis = 3.
~Measures of Variation (dispersion) (spread)
1) Range: Highest value minus lowest value. For our data points, 95 - 70 = 25
2) Sample Standard Deviation: (Sx)
Sx = sqr{[∑(x - m)^2] / (n-1)} = sqr{[n∑x^2 - (∑x)^2] / [n(n-1)]}, see handout for the proof...
3) Sample variance: (Sx)^2. This is just the square of the standard deviation (no square root)
Ex: show the long way for x: 67, 72, 76, 76, 84. m = 375/5 = 75
(x - m): -8, -3, 1, 1, 9 ... (x - m)^2: 64, 9, 1, 1, 81
∑(x - m)^2 = 156, Sx = sqr{156/(5-1)} = sqr{39} = 6.2 ...Show how to use TI-83...
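~Python sketch (not part of the handout): the long way and the shortcut formula for the sample standard deviation, on the five values from the example. The variable names are made up; statistics.stdev from the standard library is used only as a cross-check.

```python
from math import sqrt
from statistics import stdev

x = [67, 72, 76, 76, 84]
n = len(x)
m = sum(x) / n                               # 375 / 5

# Long way: add up the squared deviations from the mean.
ss = sum((xi - m) ** 2 for xi in x)
variance = ss / (n - 1)                      # sample variance, (Sx)^2
sx_long = sqrt(variance)

# Shortcut: needs only the sum of x and the sum of x squared.
sum_x = sum(x)
sum_x2 = sum(xi ** 2 for xi in x)
sx_short = sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

data_range = max(x) - min(x)                 # highest minus lowest

print(round(sx_long, 1), round(sx_short, 1), variance, data_range)
```

Both formulas give the same value, which is the point of the proof in the handout.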
~Note: For frequency distributions:
(for the TI-83, just enter frequencies in L2)
use sqr{∑[f(x - m)^2] / [(∑f) - 1]} = sqr{[(∑f)(∑fx^2) - (∑fx)^2] / [(∑f)((∑f) - 1)]}, ugh!
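~Python sketch (not part of the handout): both frequency-distribution formulas, applied to the six scores from the measures-of-center example (70, 75, 80, 80, 90, 95) written as a frequency table. The dictionary layout and variable names are assumptions for illustration.

```python
from math import sqrt

# Frequency table: each value x mapped to its frequency f.
freq = {70: 1, 75: 1, 80: 2, 90: 1, 95: 1}

n = sum(freq.values())                               # sum of f
sum_fx = sum(f * x for x, f in freq.items())         # sum of f*x
sum_fx2 = sum(f * x ** 2 for x, f in freq.items())   # sum of f*x^2
m = sum_fx / n

# Definition form: weighted squared deviations from the mean.
sx_def = sqrt(sum(f * (x - m) ** 2 for x, f in freq.items()) / (n - 1))

# Shortcut form: no deviations needed.
sx_short = sqrt((n * sum_fx2 - sum_fx ** 2) / (n * (n - 1)))

print(round(sx_def, 1), round(sx_short, 1))
```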
~The 68-95-99.7 Rule for a Symmetric Distribution (bell-shaped or Normal)
About 68% of the scores will be within one standard deviation of the mean (center)
About 95% of the scores will be within two standard deviations of the mean
About 99.7% of the scores will be within three standard deviations of the mean
~See textbook for a real nice diagram
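~Python sketch (not part of the handout): where the three percentages come from, using the standard Normal curve in Python's statistics module (NormalDist, available in Python 3.8 and later). The percentages are exact only for a Normal curve; for real bell-shaped data they are approximations.

```python
from statistics import NormalDist

z = NormalDist()   # standard Normal curve: mean 0, standard deviation 1

# Share of the curve within k standard deviations of the mean.
shares = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

The three shares come out close to 68%, 95%, and 99.7%, which is how the rule gets its name.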
~Symbols for a Population mean, standard deviation, & size are: μ, σ, & N