My views on Skewness of a distribution



1) Skewness is not a well-defined concept in Statistics

2) Skewness, in many cases, depends on how you are looking at the distribution

3) Skewness from boxplots can be difficult to ascertain, depending on your point of view

4) Some Stat teachers compare the relative position of the Median & Mean to determine skewness; however, the Mean can be distorted by outliers (extreme values). So, this is not a good way.

5) Other instructors compare the position of the Median with that of the midrange. This is better, but it can give different results, depending on the distribution. So, this is not a good way.

6) Still other instructors compare the distance from Q1 to Q2 (2nd Quartile) with the distance from Q2 to Q3 (3rd Quartile). Again, that can give mixed results.

7) Still others remove the outliers, then look. Well, in my opinion, the outliers are what cause the skewness in the first place. So, this is not a good way.
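
To see how the comparisons in items 4 through 7 can disagree, here is a minimal Python sketch that applies each one to the same data. The data set, the function names, and the choice of the "inclusive" quartile method are my own assumptions for illustration; calculators and textbooks often compute quartiles slightly differently, so the quartile reading in particular may change with the method.

```python
from statistics import mean, median, quantiles

def sign_label(diff):
    """Turn a signed difference into a skew direction."""
    if diff > 0:
        return "right"
    if diff < 0:
        return "left"
    return "none"

def heuristic_readings(data):
    """Apply three classroom comparisons to one data set."""
    # Quartiles by the 'inclusive' method (an assumption; other methods exist).
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    midrange = (min(data) + max(data)) / 2
    return {
        "mean vs median": sign_label(mean(data) - median(data)),
        "midrange vs median": sign_label(midrange - median(data)),
        "quartile distances": sign_label((q3 - q2) - (q2 - q1)),
    }

# Made-up data: nine evenly spaced values plus one extreme value.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

readings = heuristic_readings(scores)      # all ten values
trimmed = heuristic_readings(scores[:-1])  # same data with the outlier removed

for name, direction in readings.items():
    print(f"{name:20s} with outlier: {direction:6s} without: {trimmed[name]}")
```

With this particular data, the mean and midrange comparisons both point right, the quartile comparison sees no skew at all, and once the single outlier is removed every comparison reports no skew.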

So, Skewness is the worst concept in Stat. It is imprecise and creates much ambiguity. This can never happen in a pure math course like Calculus. Yes, some teachers do occasionally disagree on the interpretation of certain definitions; however, the profs with the most knowledge usually have the correct analysis.

So, how do I expect my students to view Skewness? Let’s keep it simple.

Since Skewness is the tendency of some of the data points (could be just a few, including outliers) to deviate to the right or left of where most of the data is located, we look at how far the tails extend.

If the tail extends away from the mass of data towards the right, the distribution will be skewed to the right or positively skewed.

If the tail extends away from the mass of data towards the left, the distribution will be skewed to the left or negatively skewed.

Note: There is a complication, however. A shorter "fat" tail (further away from the horizontal axis) and a longer "thin" tail (closer to the horizontal axis) could balance each other out and give a false reading for skewness.
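
One way to read this complication is sketched below in Python, using a made-up data set of my own: three values sit close in on the left (the short, fat tail) and a single value sits far out on the right (the long, thin tail). The variable names are illustrative only.

```python
from statistics import mean, median

# Made-up data: a short, heavy left tail and a long, light right tail.
balanced = [-2, -2, -2, 0, 0, 0, 0, 0, 6]

center = median(balanced)
mean_minus_median = mean(balanced) - center  # the two tails cancel in the mean
left_reach = center - min(balanced)          # how far the left tail extends
right_reach = max(balanced) - center         # how far the right tail extends

print("mean - median:", mean_minus_median)
print("left reach:", left_reach, "right reach:", right_reach)
```

Here the Mean and the Median coincide, which would suggest no skew, even though the right tail reaches three times as far from the center as the left tail.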

Note: QQ plots (Normal Quantile Plots), which are based on standard z-scores, are often used to see if a data set is somewhat close to Normal. They can be confusing to beginning students; however, most advanced calculators will do them for you. All you need to do is enter the data set in a list, then go to the appropriate menu location where this plot will be seen. The graph is a scatter plot of the z-scores of the data versus the z-scores expected under a Normal distribution. A linear pattern usually indicates a distribution close to Normal.
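
For students curious about what the calculator is doing, here is a rough Python sketch of how the points of such a plot could be computed. The function name, the sample data, and the plotting position (i - 0.5)/n are my assumptions; calculators and software packages may use a slightly different formula for the expected z-scores.

```python
from statistics import NormalDist, mean, stdev

def normal_quantile_points(data):
    """Return (expected z, observed z) pairs for a Normal quantile plot."""
    ordered = sorted(data)
    n = len(ordered)
    center, spread = mean(ordered), stdev(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, x in enumerate(ordered, start=1):
        # Assumed plotting position: the i-th smallest value is matched
        # with the (i - 0.5)/n quantile of the standard Normal curve.
        expected_z = standard_normal.inv_cdf((i - 0.5) / n)
        observed_z = (x - center) / spread  # z-score of the data value
        points.append((expected_z, observed_z))
    return points

# Made-up sample for illustration.
sample = [4.1, 4.8, 5.0, 5.3, 5.9, 6.2, 7.4]
points = normal_quantile_points(sample)

for expected_z, observed_z in points:
    print(f"expected {expected_z:6.2f}   observed {observed_z:6.2f}")
```

Plotting the observed z-scores against the expected ones gives the scatter plot described above; the closer the points fall to a straight line, the closer the data is to Normal.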

For boxplots, look at how far H (the high end) is from most of the data (use the Median or Q2) and compare that distance to how far L (the low end) is from the Median or Q2.

For boxplots drawn horizontally, H is right and L is left. Positive is in the direction of H, negative in the direction of L.

For boxplots drawn vertically, L is low and H is high. Positive is in the direction of H, negative in the direction of L.
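
The boxplot rule above can be written as a short Python sketch. I am assuming here that H and L are the largest and smallest values in the data; the function name and the sample data sets are made up for illustration.

```python
from statistics import median

def boxplot_skew(data):
    """Compare the distance from the Median to H with the distance to L."""
    low, high, mid = min(data), max(data), median(data)
    upper_reach = high - mid  # Median (Q2) to H
    lower_reach = mid - low   # L to Median (Q2)
    if upper_reach > lower_reach:
        return "positive"     # tail extends toward H
    if upper_reach < lower_reach:
        return "negative"     # tail extends toward L
    return "none"

print(boxplot_skew([2, 3, 4, 5, 20]))      # H is far from the Median
print(boxplot_skew([1, 16, 17, 18, 19]))   # L is far from the Median
print(boxplot_skew([1, 2, 3]))             # both ends equally far
```

The orientation of the boxplot does not matter to the calculation: whether H is drawn to the right or at the top, a longer reach toward H reads as positive skew.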