
Re: Order being born from disorder?



Ludwik Kowalski writes:

1) I am sorry the word "entropy" was introduced into my message. The
terms "order" is "disorder" would certainly be sufficient (see below).

2) I suppose that entropy was discussed on phys-L when I was away.

Yep. The most recent extended discussion ran in late November 1996 under the
subject titles "Entropy" and "entropy". That thread covered entropy,
disorder, and order and their interrelationships at length, as well as how
much of the topic should be discussed at a high school level. I suggest
you check the PHYS-L archives for this discussion.

....
3) I do remember reading that cryptologists can assign "entropy" to coded
messages. The messages are treated as sets of immaterial particles
(words and sentences). These do not interact with forces and are not
influenced by temperature (except in hot debates). In what units do
they express entropy?
Ludwik Kowalski

The entropy you mention is the Shannon entropy, which Claude Shannon
introduced back in the 1940s. Shannon's work helped found Information Theory
as we know it. The Shannon entropy of a communication system and the
thermodynamic entropy of a macroscopic physical system are two special cases of the
generic entropy concept. In general, entropy is a property of a probability
distribution. Each (typically discrete) distribution has its own entropy.
The entropy of a distribution objectively measures (nonparametrically) how
"random" or "uncertain" the outcomes are (on the average) for samples that
are drawn from the distribution. Equivalently, the entropy of a distribution
is the average (minimal) amount of information necessary to exactly
determine which outcome obtains when a sample is drawn from the distribution,
given only the information contained in the specification of the
distribution itself.

Let {P_r} represent a probability distribution where
P_r means the probability that the r_th outcome obtains when a sample is
drawn. Here r is a label that runs over each distinct (disjoint) possible
outcome and: SUM_r{P_r} = 1. Here SUM_r{...} means sum the quantity ...
over all values of the parameter r. The entropy of a distribution {P_r} is:
S = SUM_r{P_r * log(1/P_r)}. It is the expectation over the set of outcomes
of the logarithm of the reciprocal of the probability of each outcome. In
the special case that there are N outcomes and each outcome is equally
likely (i.e. P_r = 1/N for r = 1, 2, ..., N), then S = log(N).

The base to which the logarithm is taken in the definition of S determines the units
that S is measured in. If the log is to the base 2 then S is measured in
bits. (In this case S is just the number of bits needed to encode all
of the N possibilities.) If the log is to the base 256 then S is in bytes.
If the log is to the base 10 then S is in decimal digits. If the base is
the Napierian base e then the unit of S is the so-called "nat". Typically
the base b is the number of symbols in a symbol set and the entropy S is the
length of a sequence of those symbols needed to uniquely label each outcome
*on the average* for outcomes of samples drawn from the distribution. If
the distribution is uniform then N = b^S, where N is the number of possible
outcomes. Those who work in the field of Information Theory prefer to
measure entropy in bits. Those who work in theoretical Statistical
Mechanics prefer to measure entropy in "nats". Those who work in
experimental thermodynamics prefer to measure entropy in joules/kelvin.
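
For concreteness, here is a small Python sketch (just an illustration; the
function name and the example distributions are arbitrary choices) that
evaluates S = SUM_r{P_r * log(1/P_r)} in a chosen base:

import math

def shannon_entropy(probs, base=2.0):
    # S = SUM_r{P_r * log_base(1/P_r)}; terms with P_r = 0 contribute
    # nothing since p*log(1/p) -> 0 as p -> 0.
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

# Uniform distribution over N outcomes gives S = log_base(N).
N = 52
uniform = [1.0 / N] * N
print(shannon_entropy(uniform, 2))         # ~5.70 bits
print(shannon_entropy(uniform, math.e))    # ~3.95 nats
print(shannon_entropy(uniform, 256))       # ~0.71 bytes

# A biased coin is less uncertain than a fair one (1 bit).
print(shannon_entropy([0.9, 0.1], 2))      # ~0.47 bits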

In the case of stat mech/thermo the relevant distribution whose entropy
represents the *thermodynamic entropy* of a macroscopic thermodynamic system
is the distribution of possible microscopic states consistent with the
system's macroscopic description. Thus, for an isolated system in
equilibrium (whose possible microstates are all equally likely) the entropy
of the system is just the logarithm of the number of microstates that give
the same given macrostate description. In stat mech it is found that
Boltzmann's constant k = 1.380658 x 10^(-23) J/K is really a conversion
factor which converts temperature measured in kelvins to temperature
measured in joules (when the entropy is measured in "nats"). (1 K =
1.380658 x 10^(-23) J). Since entropy is fundamentally a measure of
information (the number of symbols needed to label a set of possibilities),
entropy is intrinsically dimensionless (just as the radian is dimensionless),
and different entropy units are just different dimensionless measures of this
one concept (just as the radian, cycle, degree, and grad are different
measures of the dimensionless concept of angle). The reason this works out
for the J/K unit of thermodynamic entropy
is that fundamentally temperature is an (intensive) energy-like concept
which has dimensions of energy. If thermodynamicists measured thermodynamic
entropy in "nats" and energy in joules then temperature would have the
dimensions of joule/nat. (Since a nat is really as dimensionless as a
radian is, we can shorten the temperature unit to joules *if* we agree to
always measure entropy consistently in nats. This is
analogous to measuring angular frequencies in s^(-1) as a short-hand for
measuring them in radians/sec. The rad unit can be dropped as long as it is
agreed ahead of time to always measure all angles in radians and never
in cycles, degrees, etc.)
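
A short Python sketch of this unit bookkeeping (just an illustration using
the numbers above):

k_B = 1.380658e-23          # Boltzmann's constant, in J/K

# Room temperature expressed as an energy:
T_kelvin = 298.0
T_joules = k_B * T_kelvin   # ~4.11e-21 J

# Thermodynamic entropy in J/K is entropy in nats rescaled by k_B:
S_in_JK = 1.0               # one J/K of entropy
S_in_nats = S_in_JK / k_B   # ~7.24e22 nats

print(T_joules, S_in_nats)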

The conversion factors relating different entropy units are given as:
1 J/K = 1/(1.380658*10^(-23)) nat = 1/(ln(2)*1.380658*10^(-23)) bit =
1/(2^(33)*ln(2)*1.380658*10^(-23)) gigabyte (taking 1 gigabyte = 2^30 bytes
= 2^33 bits). Notice that each J/K of
entropy of a thermodynamic system corresponds to a huge number of bits of
information. This is because a macroscopically slight increase (in J/K) of
the system's entropy requires a tremendously large increase in the amount of
extra information needed to specify the microstate in bits. For instance,
suppose a thermodynamic system (quasistatically) absorbs 1 erg of heat at a
room temperature of 298 K. That extra thermal energy has increased the
uncertainty as to which of its possible microstates it can be in (by making
many more microstates possible than before the absorption took place) so
much so that after the absorption it requires an extra 3.5*10^13 bits (about
4.1*10^3 gigabytes) of information to specify the microstate afterward (this
being equivalent to an entropy increase of 3.356*10^(-10) J/K).
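
The arithmetic for that example, as a short Python sketch (an illustrative
check, taking 1 gigabyte = 2^33 bits):

import math

k_B = 1.380658e-23           # J/K
BITS_PER_GIGABYTE = 2**33    # 2^30 bytes x 8 bits per byte

Q = 1.0e-7                   # 1 erg of heat, in joules
T = 298.0                    # room temperature, in kelvins

dS_JK   = Q / T                        # ~3.356e-10 J/K
dS_nats = dS_JK / k_B                  # ~2.43e13 nats
dS_bits = dS_nats / math.log(2.0)      # ~3.51e13 bits
dS_GB   = dS_bits / BITS_PER_GIGABYTE  # ~4.1e3 gigabytes

print(dS_JK, dS_bits, dS_GB)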

When Leigh says that thermodynamic entropy is not related to order/disorder
what he means (I think) is that any increase in the order in a
*macroscopic* pattern seen in a system (such as in the arrangement of
floating whisker fragments in a toilet bowl) is not related to the system's
*thermodynamic* entropy. The thermodynamic entropy measures uncertainty (or
disorder if you will) at the level of the individual (atomic) microstates
for the system, *not* at the level of any possible macroscopic patterns.

On occasion it may be useful to make a distinction between the disorder
seen in a system (at the *same* level of description as is used in some
generic entropy measure on some generic probability distribution) and
the entropy for the distribution of outcomes for that (generic) system. For
such a generic system the entropy is the average (minimal) information
needed to exactly specify which outcome occurs when a sample is drawn from
the distribution; whereas the disorder of a given outcome can be defined as
the minimal information necessary to uniquely characterize or describe that
outcome. With this distinction, the entropy is a property of the
distribution, and the disorder is a property of the individual realizations
(outcomes). The distribution of possible orderings of a randomly shuffled
deck of playing cards has an entropy of log_2(52!), about 226 bits. The disorder
of a given shuffle of cards is the minimal information needed to
characterize that shuffle. For instance, suppose that the deck is sorted by
color and then by suits with each suit having a straight sequence of
numbers. In this case the disorder of this arrangement is about 4 bits. We
need 1 bit to determine which order the
colors come in (red or black first); we need 1 bit to determine the order of
the black suits (spades or clubs first); we need 1 bit to determine the
order of the red suits (hearts or diamonds first); and we need 1 bit to
determine the sequence order (do the cards count up or do they count down in
sequence?). If the specification of the card arrangement was less detailed
and the actual arrangement was nearly completely patternless, then it would
take more information to specify which arrangement of cards we had. It may
take upwards of 226 bits to nail down the correct sequence. In this latter
case we can say that the randomly shuffled (patternless) arrangement is more
disordered than the sorted arrangement by some 222 bits. Since the vast
majority (nearly all) of the arrangements of cards tend to be remarkably
patternless, the average of the disorder is just about 226 bits. Thus we see
that the entropy of the distribution can be roughly thought of as the
average over the distribution of the disorders of the individual realizations.
We can consider the disorder of a given arrangement as the Chaitin-
Kolmogorov complexity of that arrangement and we can consider the entropy of
the ensemble as the Gibbs-Shannon-Jaynes entropy of the distribution
characterizing the ensemble.
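
The card-deck numbers can be checked with a few lines of Python (an
illustrative sketch; the 4-bit figure is just the count of binary choices
listed above):

import math

# Entropy of the uniform distribution over all 52! orderings of the deck:
deck_entropy_bits = math.log2(math.factorial(52))   # ~225.6, about 226 bits

# Disorder of the fully sorted arrangement: color order + black-suit order
# + red-suit order + counting direction, about 4 bits.
sorted_disorder_bits = 4

print(deck_entropy_bits)                           # ~226
print(deck_entropy_bits - sorted_disorder_bits)    # ~222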

David Bowman
dbowman@gtc.georgetown.ky.us