The name errance, symbol E, has been suggested; I like it better than
atomicity. But perhaps something better will come up. Here is a new
beginning for the next draft. I hope Tim will find it more
satisfactory. The rest will be unchanged except that all s1 and s2 will
be replaced by E1 and E2.
Any comments?
Ludwik
1) Using mean values and standard deviations
A physical quantity, x, is usually measured several times. Each result
is slightly different and we calculate the mean value, <x>. Then we say
that the true value, μ, is probably equal to <x>. But that is only an
approximation; <x> is an unbiased estimate of μ. The true value might
be 17.32821 but <x>, from a particular sample, might be 16.92. Another
sample might yield <x>=17.8, etc. Clearly, we are dealing with
distributions of mean values. We want to establish the margin of error
on each side of <x>. Graphically, margins of error are represented by
error bars. Suppose we established, after making 10 measurements of x,
that <x>=16.2. How large should the error bar be? We usually assume
that the distribution of <x> is Gaussian and that the margin of error,
on each side, is equal to s/sqrt(n), where s is the standard deviation
obtained from n measurements and sqrt(n) is the square root of n.
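The recipe above can be sketched in a few lines of Python. The ten
measurement values below are invented for illustration (they are chosen
so that <x> comes out to 16.2, as in the running example):

```python
import math
import statistics

# Hypothetical sample of n = 10 repeated measurements of x
# (values invented for illustration).
data = [16.0, 16.4, 15.8, 16.5, 16.1, 16.3, 15.9, 16.6, 16.2, 16.2]

n = len(data)
mean_x = statistics.mean(data)   # <x>, the unbiased estimate of mu
s = statistics.stdev(data)       # sample standard deviation, s
E = s / math.sqrt(n)             # margin of error on each side, s/sqrt(n)

print(f"<x> = {mean_x:.2f}, s = {s:.3f}, E = {E:.3f}")
```

Note that `statistics.stdev` uses the n-1 divisor, which is the usual
convention for a standard deviation estimated from a sample.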
It is possible, at least in principle, to collect many samples of size
n and to establish the distribution of <x> experimentally. That would
give us the mean of many <x> and the mean of many s. That is what one
would do to experimentally confirm the theoretical prediction that
E=s/sqrt(n), where s is the standard deviation obtained from the first
sample. In what follows the quantity E, calculated from s and n, will
be called errorness. I do not want to call it “standard deviation of
the mean,” as it is often called in textbooks, because students might
abbreviate the long name and then confuse E with s, which is quite
common. And I do not suggest that students verify that E=s/sqrt(n) is
indeed in agreement with the experimentally established distribution of
<x>. If one sample is sufficiently large for our purpose, to infer μ
from <x>, then we can simply use the predicted E, as shown below.
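The many-samples experiment described above can be simulated rather
than performed. The sketch below (assumed values of μ, σ, and sample
size n are invented) draws many samples of size n from a Gaussian,
computes the spread of the resulting mean values, and compares it with
the prediction σ/sqrt(n); the s obtained from any single sample is an
estimate of σ:

```python
import math
import random
import statistics

# Simulated version of the "many samples" experiment: draw many
# samples of size n from a Gaussian with assumed true value mu and
# spread sigma, then compare the spread of <x> with sigma/sqrt(n).
random.seed(1)
mu, sigma, n = 17.0, 0.9, 10

means = []
for _ in range(5000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))

observed_spread = statistics.stdev(means)  # spread of <x> over many samples
predicted_E = sigma / math.sqrt(n)         # theoretical prediction

print(f"observed spread of <x>: {observed_spread:.3f}")
print(f"predicted sigma/sqrt(n): {predicted_E:.3f}")
```

With 5000 simulated samples the observed spread agrees with the
prediction to within about one percent.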
It is useful to be aware that distributions resulting from random
errors are bell-shaped. For large n they are practically Gaussian; for
small n they are nearly Gaussian, as illustrated at the end of this
essay.
Suppose that <x>=16.2 and E=0.9. How large is the true value, μ? An
experimental scientist cannot answer such a question. The best one can
do is to determine a range of values of x that is likely to contain μ.
In the above example, one might say that the true value is somewhere
between 15.3 and 17.1. Or one might say that μ is between 14.4 and
18.0. Note that we assume that μ is in the middle of each selected
range of possibilities. Statisticians refer to such ranges as
“confidence intervals.” Graphically, confidence intervals are
represented by two line segments called margins of error, or error
bars, Δx.
Unless the chosen interval is infinitely large, we are never sure that
the true value is indeed inside. It is a matter of probability. The
smaller the interval we choose, the less certain we are that the
unknown true value is inside. The size of the assigned margin of error
depends on how certain one wants to be that the true value is inside
the chosen interval. Assuming the distribution of <x> is a Gaussian
bell-shaped curve, we know, from mathematical considerations, that an
error bar equal to the spread of one E corresponds to a level of
confidence of about 68%. It means we are expected to be correct (by
saying that μ is inside the chosen interval) only 68% of the time.
Likewise, the error bar of 2*E corresponds to a level of confidence of
about 95%, while the error bar of 3*E corresponds to a level of
confidence of about 99.7%.
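The intervals for the running example (<x>=16.2, E=0.9) can be
tabulated with a short sketch, using the standard Gaussian confidence
percentages for 1, 2, and 3 multiples of E:

```python
# Confidence intervals for the example above: <x> = 16.2 and E = 0.9.
# Each interval is centered on <x>; its half-width is a multiple of E.
mean_x, E = 16.2, 0.9

for k, level in [(1, "about 68%"), (2, "about 95%"), (3, "about 99.7%")]:
    low, high = mean_x - k * E, mean_x + k * E
    print(f"{k}*E error bar: {low:.1f} to {high:.1f} ({level} confidence)")
```

The k=1 and k=2 rows reproduce the two intervals quoted earlier, 15.3
to 17.1 and 14.4 to 18.0.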
Actually, levels of confidence, C, are fractions representing
percentages. They are positive numbers between 0 and 1. Instead of
saying C=0.95, for example, we say that the level of confidence is 95%.
Most often error bars are equal to E, but one is free to use other
values, for example, 1.5*E or 2*E, provided the value of C associated
with the chosen error bars is clearly stated.
2) Finding C for a specified error bar: Using the standard normal
distribution
How does one determine C for an arbitrarily chosen margin of error?
Suppose that <x>=16.2 and the arbitrarily chosen margin of error is
Δx=1.5. What is the value of C? To answer this question one should be
familiar with the “standard normal distribution.” To introduce this
concept let me say that any value of x can be associated with a unique
value z, defined as:
z=(x-<x>)/E
provided <x> and E have been determined. Any Gaussian distribution (no
matter how large or small <x> and E are) transforms into a unique z
distribution. The distribution of z is called the standard normal
distribution. The definition of z implies that the mean value of z is
zero and that its spread is one. The distribution of z is a Gaussian
curve centered on zero. This is not surprising because z=0 when x=<x>.
Also not surprising is the fact that the E of the standard normal
distribution is equal to 1. How could it be otherwise; z is 1 when
x-<x>=E. It is thus clear that z is nothing else but the difference
between x and the mean value of x, expressed in units of E. Also note
that z is a dimensionless quantity, even when x is associated with
units, such as volts, grams, meters, etc. The bell-shaped standard
normal distribution is shown below.
FIGURE 1
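For the question posed above, the z transformation gives a numerical
answer. For a Gaussian, the probability that |z| is smaller than some
z0 equals erf(z0/sqrt(2)), so C for Δx=1.5 and E=0.9 can be computed
directly (a sketch using Python's standard error function):

```python
import math

# Level of confidence C for an arbitrarily chosen margin of error,
# using the standard normal distribution. Values <x> = 16.2, E = 0.9,
# and Delta_x = 1.5 are taken from the example in the text.
mean_x, E, margin = 16.2, 0.9, 1.5

z0 = margin / E                   # margin expressed in units of E
C = math.erf(z0 / math.sqrt(2))   # probability that |z| < z0

print(f"z0 = {z0:.3f}, C = {C:.3f}")  # C is about 0.90, i.e. ~90%
```

So the arbitrarily chosen error bar of 1.5, which is about 1.67*E,
corresponds to a level of confidence of roughly 90%.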
Ludwik Kowalski
Let the perfect not be the enemy of the good.