Statistical Design Institute Blog: standard deviation

Tuesday, June 21, 2011

Using Mean, Standard Deviation, Skewness, and Kurtosis

Previous posts have discussed the properties of the first four moments that can be computed from a data set. The next step is to use these easily computed statistics in everyday applications. When presented with a set of data, it is important to understand what information may be hidden in the “sea of numbers”. We know that the Mean gives us the central tendency of the data, the Standard Deviation explains the dispersion about the Mean, the Skewness represents the symmetry or asymmetry of the data, and the Kurtosis is related to the shape or peakedness characteristics. In essence, we are using these numerical quantities to explain the properties of the underlying distribution or probability density function (PDF). These statistics can be used to qualitatively perform distribution fitting for your data.

Since a set of data can have any Mean and Standard Deviation, these two statistics tell us only the location and relative dispersion of the data. Using the Skewness and Kurtosis, we can learn much more about the shape of the distribution, as shown in the table below:

Skewness	Kurtosis*	Classical Distribution
0	1.8	Uniform PDF
Any negative number	2.4	Left-skewed Triangular PDF
0	2.4	Symmetric Triangular PDF
Any positive number	2.4	Right-skewed Triangular PDF
0	3	Normal PDF
0.63	3.26	Rayleigh PDF
2	9	Exponential PDF

Using this information, you can make like-for-like comparisons of your data to the properties of some of the known classical distributions. While you may not be able to conclude that the data set is from a population with a particular distribution, you will be able to infer that “based on the data, the uncertainty is representative of that of a _______ distribution”.

*Note that these quantities are for Kurtosis where 3 = Normal PDF. If quantities for Excess Kurtosis, where 0 = Normal PDF, are desired, then subtract 3 from the values shown.
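The comparison described above can be sketched in a few lines of Python. This is only an illustration, not a formal goodness-of-fit test: the function names `moments` and `closest_reference` are made up for this example, the moments use the population (divide by n) form, and "closest" is assumed to mean the smallest straight-line distance in the (Skewness, Kurtosis) plane. The skewed triangular rows are left out because the table gives only the sign of their Skewness.

```python
import math

# (Skewness, Kurtosis) pairs taken from the table, with Kurtosis = 3 for Normal.
REFERENCE = {
    "Uniform": (0.0, 1.8),
    "Symmetric Triangular": (0.0, 2.4),
    "Normal": (0.0, 3.0),
    "Rayleigh": (0.63, 3.26),
    "Exponential": (2.0, 9.0),
}


def moments(data):
    """Return (mean, standard deviation, skewness, kurtosis) of the data.

    Uses population-style moments (divide by n); kurtosis is NOT excess.
    """
    data = list(data)
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    if m2 == 0:
        raise ValueError("all data points are identical")
    return mean, math.sqrt(m2), m3 / m2 ** 1.5, m4 / m2 ** 2


def closest_reference(skew, kurt):
    """Name of the table row nearest to (skew, kurt). Assumed distance rule."""
    return min(
        REFERENCE,
        key=lambda name: math.hypot(skew - REFERENCE[name][0],
                                    kurt - REFERENCE[name][1]),
    )


# Evenly spaced values between 0 and 1 behave like a Uniform PDF.
evenly_spaced = [i / 1000 for i in range(1001)]
mean, sd, skew, kurt = moments(evenly_spaced)
print(mean, sd, skew, kurt, closest_reference(skew, kurt))
```

With small samples the Skewness and Kurtosis estimates are noisy, so the nearest row should be read as a qualitative hint only.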

Sunday, May 15, 2011

Mean and Standard Deviation

A random variable is defined by a distribution that has one or more parameters that describe location, shape, and scale. (The term distribution is used in Six Sigma to denote a probability density function.) Practically, a distribution can be described by:

mean
variance or standard deviation
skewness
kurtosis

Once mean, standard deviation, skewness, and kurtosis are calculated or assumed, the relevant location, shape, and scale parameters can be computed.
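As a rough illustration of that last step, the sketch below backs out distribution parameters from the moments for three classical distributions using their standard moment formulas. The function names are hypothetical, and the choice of these three distributions is an assumption made for the example.

```python
import math


def uniform_from_moments(mean, sd):
    """Endpoints (a, b) of a Uniform PDF.

    Uses mean = (a + b) / 2 and sd = (b - a) / sqrt(12).
    """
    half_width = math.sqrt(3.0) * sd
    return mean - half_width, mean + half_width


def exponential_from_mean(mean):
    """Rate parameter of an Exponential PDF starting at zero (mean = 1 / rate)."""
    return 1.0 / mean


def rayleigh_from_mean(mean):
    """Scale parameter of a Rayleigh PDF (mean = scale * sqrt(pi / 2))."""
    return mean / math.sqrt(math.pi / 2.0)


# A Uniform PDF with mean 10 and standard deviation 2.
a, b = uniform_from_moments(10.0, 2.0)
print(a, b)
```

The Skewness and Kurtosis do not appear in these formulas because they are fixed for each of these distributions; they help decide which distribution to assume, and the Mean and Standard Deviation then set its location and scale.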

Mean
The expected (or average) value of a random variable is its mean. It can be estimated from sample data as the sample mean x_bar, calculated as follows:

\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

where x_i are the values of the n data points. The mean is also called the first moment of a distribution about zero (this is the same thing as a centroid, with the moment taken about zero).

Means (red lines) for different distributions.
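The mean calculation and its centroid interpretation can be checked with a short Python sketch. The function name `sample_mean` and the data values are made up for illustration.

```python
def sample_mean(values):
    """Arithmetic mean: the sum of the data points divided by their count."""
    values = list(values)
    if not values:
        raise ValueError("need at least one data point")
    return sum(values) / len(values)


data = [2.0, 4.0, 4.0, 5.0, 10.0]
x_bar = sample_mean(data)  # 25 / 5 = 5.0

# Centroid property: deviations from the mean balance out to zero.
balance = sum(x - x_bar for x in data)
print(x_bar, balance)
```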

Variance and Standard Deviation
The measure of spread of a random variable is called variance σ^2, and the square root of variance is called standard deviation σ. The variance is equivalent to the second moment of a distribution about its mean.

Distributions with increasing standard deviation (a) to (c). Red lines are means.
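The variance as a second moment about the mean can be sketched in Python as below. The function names and the `sample` flag are hypothetical; the post does not say whether to divide by n or by n - 1, so the sketch offers both, with n as the default to match the moment definition.

```python
import math


def variance(values, sample=False):
    """Second moment about the mean.

    Divides by n by default; with sample=True divides by n - 1
    (the usual unbiased estimate from sample data).
    """
    values = list(values)
    n = len(values)
    if n < (2 if sample else 1):
        raise ValueError("not enough data points")
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def standard_deviation(values, sample=False):
    """Square root of the variance, in the same units as the data."""
    return math.sqrt(variance(values, sample))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance(data), standard_deviation(data))  # 4.0 2.0
```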