RSD, standard deviation score (= Z-score) and percentile
The scatter that remains after allowing for the independent variables is the residual scatter, or residual standard deviation (RSD). Some observations are larger than the predicted value, some are smaller. The average of all deviations (observed – predicted value) is zero; at a deviation of zero the observed and predicted values are identical.
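As an illustration of the paragraph above, the following minimal sketch fits a straight line by ordinary least squares and derives the residuals and the RSD. It assumes a single independent variable and uses the usual definition RSD = sqrt(sum of squared residuals / (n – number of fitted parameters)); the function names and the data are made up for the example and are not taken from any reference equation.

```python
from math import sqrt

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def residual_sd(x, y):
    """Return the residuals (observed - predicted) and the RSD."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    n_params = 2  # intercept and slope were estimated from the data
    rsd = sqrt(sum(r * r for r in residuals) / (len(x) - n_params))
    return residuals, rsd

# Hypothetical data, for illustration only
heights = [150, 160, 165, 170, 175, 180, 190]
values = [2.9, 3.4, 3.5, 4.1, 4.0, 4.6, 5.0]

residuals, rsd = residual_sd(heights, values)
mean_residual = sum(residuals) / len(residuals)
print(f"mean residual = {mean_residual:.2e}, RSD = {rsd:.3f}")
```

With a least-squares fit that includes an intercept, the mean residual is zero apart from rounding error, which is the property described in the text.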
The yellow bell-shaped curve depicts how frequently observations exceed or fall below the value predicted from the regression equation; it is called a frequency distribution or probability density function. If the deviations are normally distributed, as in the figure, 50% of all observations are below the predicted value and 50% are above it. This point divides the population into two equal halves and is therefore called the median. Because mean and median coincide in a normal distribution, the 50th percentile corresponds to 0 RSD.
It is convenient to express the difference between the observed and predicted value as the standard deviation score, i.e. as a number of RSDs: (observed – predicted) / RSD. It is also called the Z-score. In 95% of individuals (observed – predicted) < 1.64·RSD; this therefore marks the 95th percentile. In only 5% of cases (observed – predicted) < –1.64·RSD; this marks the 5th percentile. The area between –1.64·RSD and +1.64·RSD in a normal distribution therefore comprises 90% of the population, and hence delineates the 90% confidence interval.
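The conversion from an observation to a Z-score and a percentile can be sketched as follows. This is a minimal example that assumes normally distributed residuals; the helper names `z_score` and `percentile` and the numbers in the usage lines are hypothetical. It relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

def z_score(observed, predicted, rsd):
    """Standard deviation score: the deviation expressed in RSD units."""
    return (observed - predicted) / rsd

def percentile(z):
    """Percentage of a standard normal population lying below z."""
    return 100.0 * NormalDist().cdf(z)

# Hypothetical example: observed 3.2, predicted 4.0, RSD 0.5
z = z_score(3.2, 4.0, 0.5)
print(f"Z-score = {z:.2f}, percentile = {percentile(z):.1f}")
```

A Z-score of 0 maps to the 50th percentile, and Z-scores of about –1.64 and +1.64 map to the 5th and 95th percentiles respectively.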
The 95% confidence interval in a normal distribution lies between –1.96·RSD and +1.96·RSD, i.e. 2.5% of all observations are smaller than (predicted – 1.96·RSD), and 2.5% are larger than (predicted + 1.96·RSD); –1.96·RSD marks the 2.5th percentile.
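The limits described above can be computed from a predicted value and an RSD, as in this minimal sketch. It assumes normally distributed residuals; the function name `normal_range` and the example numbers are hypothetical. The multiplier is obtained from the inverse of the standard normal distribution rather than hard-coded, and comes out at about 1.96 for 95% coverage and about 1.64 for 90% coverage.

```python
from statistics import NormalDist

def normal_range(predicted, rsd, coverage=0.95):
    """Return (lower, upper) limits enclosing `coverage` of a normal population."""
    tail = (1.0 - coverage) / 2.0          # fraction cut off on each side
    z = NormalDist().inv_cdf(1.0 - tail)   # e.g. about 1.96 for 95% coverage
    return predicted - z * rsd, predicted + z * rsd

# Hypothetical example: predicted 4.0, RSD 0.5
lower95, upper95 = normal_range(4.0, 0.5)
lower90, upper90 = normal_range(4.0, 0.5, coverage=0.90)
print(f"95%: {lower95:.2f} to {upper95:.2f}")
print(f"90%: {lower90:.2f} to {upper90:.2f}")
```

The lower 95% limit corresponds to the 2.5th percentile and the lower 90% limit to the 5th percentile.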