Normal Curve Standard Deviation Percentile

The Mahalanobis distance is not as scary an idea as it might at first seem. Although many people have not come across the name, they have almost certainly come across the concept. Recall the F-distribution and the chi-squared distribution. Both are based on the normal distribution, but they become useful when there is more than one type of measurement per sample: for example, instead of measuring only the height of a series of people, we measure both their weight and their height.

Standard Deviation of Heights and Weights

  • Consider first the situation where just one type of measurement has been recorded, for example, the height of 100 men.
  • Assume we find the average height is 175 cm, and the standard deviation is 7 cm.
  • If we assume the heights follow a normal distribution, we expect around 16% of men to be taller than 182 cm (one standard deviation above the mean).
  • Try typing =1-NORMDIST(182,175,7,1) into Excel to check this; it returns about 0.159.
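The same check can be sketched in Python with the standard library. This is only an illustration of the tail calculation, using the mean and standard deviation quoted above; the variable names are our own.

```python
from statistics import NormalDist

# Heights assumed normal with mean 175 cm and standard deviation 7 cm.
heights = NormalDist(mu=175, sigma=7)

# Fraction of men expected to be taller than 182 cm,
# i.e. more than one standard deviation above the mean.
p_taller = 1 - heights.cdf(182)

print(round(p_taller, 4))  # 0.1587
```

The upper tail beyond one standard deviation is the same for any normal distribution, so the mean and standard deviation only determine where that cut-off falls.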

  • Consider now recording their weights.
  • Assume we find the average weight is 75 kg, with a standard deviation of 10 kg.
  • Assuming a normal distribution, we expect around 16% of men to weigh more than 85 kg, which is one standard deviation above the mean.

Combining Heights and Weights

  • We may be interested in the probability that a person is both heavier than 85 kg and taller than 182 cm.
  • First, look at the graph of weight against height (see the figure).
  • It is not unexpected that there is some relationship, or correlation, between the two types of measurement.
  • That is, on the whole, a tall man is also heavy. Of course, this is not an exact relationship.
  • Refresh yourself on correlation if you need a reminder.
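To see why the correlation matters for the combined probability, here is a rough simulation sketch. The correlation value RHO = 0.5 is purely an assumption for illustration (the text gives no figure), as are the sample size and the seed.

```python
import math
import random

RHO = 0.5      # assumed correlation between height and weight (not from the text)
N = 200_000    # number of simulated men
rng = random.Random(0)  # fixed seed so the run is repeatable

both = 0
for _ in range(N):
    # Draw a pair of standard normal values with correlation RHO.
    z_height = rng.gauss(0, 1)
    z_weight = RHO * z_height + math.sqrt(1 - RHO ** 2) * rng.gauss(0, 1)
    # Rescale to the means and standard deviations used above.
    height = 175 + 7 * z_height
    weight = 75 + 10 * z_weight
    if height > 182 and weight > 85:
        both += 1

p_joint = both / N
p_if_independent = 0.1587 ** 2  # what we would expect with no correlation

print(p_joint, p_if_independent)
```

With a positive correlation, the simulated fraction of men who are both tall and heavy comes out larger than the product of the two separate probabilities, which is why the two measurements cannot simply be treated independently.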

  • We want to ask how far a man who weighs 85 kg and is 182 cm tall lies from the "average man".
  • Look at the second figure.
  • We have now added contours, labelled 1, 2, 3, etc.
  • The contours are of the Mahalanobis distance.
  • A contour labelled 1 means that every datapoint on it is a Mahalanobis distance of 1 away from the centre.
  • These contours are not circular; they follow the structure in the data.
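As a minimal sketch of how such a distance could be computed for two measurements, the function below builds the covariance matrix from the two standard deviations and a correlation, then measures the point against it. The function name mahalanobis_2d and the correlation RHO = 0.5 are assumptions made for illustration; the text does not state the correlation in the data.

```python
import math

def mahalanobis_2d(point, mean, sd, rho):
    """Mahalanobis distance of a 2-D point from the mean, given the two
    standard deviations and the correlation between the measurements."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # Covariance matrix [[a, b], [b, c]] built from sd and rho.
    a = sd[0] ** 2
    c = sd[1] ** 2
    b = rho * sd[0] * sd[1]
    det = a * c - b * b
    # d^2 = diff^T * inverse(cov) * diff, with the 2x2 inverse written out.
    d_squared = (c * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return math.sqrt(d_squared)

RHO = 0.5  # assumed correlation; not given in the text

# A man of 182 cm and 85 kg, measured against the "average man" (175 cm, 75 kg).
d = mahalanobis_2d((182, 85), (175, 75), (7, 10), RHO)
print(round(d, 3))  # 1.155
```

Under this assumed correlation the man sits just outside the contour labelled 1. With zero correlation the same point would be about 1.414 away, which illustrates how the contours stretch along the direction in which height and weight vary together.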