Chronology Current Month Current Thread Current Date
[Year List] [Month List (current year)] [Date Index] [Thread Index] [Thread Prev] [Thread Next] [Date Prev] [Date Next]

Re: [Phys-L] numerical methods



In the context of:

The textbook formula

(exp(-a*t1) - exp(-a*t2))/a [1]

There are *multiple* things wrong with this.
On 10/5/21 3:04 AM, David Bowman hit all the main points:

1) For starters:

divides by zero when a is zero

Yes. That means the formula [1] is not even mathematically correct.
Note that the case where a=0 is perfectly reasonable; it corresponds
to something that doesn't decay at all, like the background count
rate. The question is perfectly well defined for a=0, with a nontrivial
result, and doing the integration correctly is super-simple.

I am surprised and disappointed that fancy mathematical systems
like wolframalpha and maxima screw this up.

2) Even after we fix it to make it mathematically correct, it's
still terrible from a numerical-methods standpoint.

would run into significant precision subtraction error when |a*(t2 -
t1)| gets small compared to 1,

Yes. The take-home messages are:

2a) Beware of what we call *small differences between large numbers*.
For small x, exp(x) and exp(-x) not large in absolute terms (they're
close to 1), but they are large compared to the difference. This is a
recipe for disastrous roundoff errors.

2b) More generally, if something makes a big mistake at zero, check
for the possibility that it makes slightly smaller mistakes *near*
zero.

2c) Even more generally: Floating point numbers are not the same as
mathematical real numbers. They have roundoff errors.



3) Factoring the difference of exponentials into sinh(x) times exp(x)
is definitely smart. Let's be clear: in the computer's math library,
sinh(s) is not evaluated as the difference of exponentials; actually
it is evaluated from scratch, using the known mathematical properties
of sinh(x). The result is reasonably accurate near x=0.

In contrast, the difference of exponentials is a disaster.

If I was trying really hard to be miserly with the clock cycles I
might try something else that didn't require the evaluation of 2
different transcendental functions.

I disagree, because exp() and sinh() are both transcendental functions.
One exp() and one sinh() is no worse than two exp()s.

4) Calculating sinh(x)/x is probably good enough for practical purposes,
even when x is small, but we can do better:

if abs(x) < threshold
Set compsinc = 1st few initial terms of Taylor series for sinh(x)/x
else
Set compsinc = sinh(x)/x
endif

Yes. And in fact "few" terms means two terms: the constant term and the
first nontrivial term; in other words, 1 + x²/6.

The threshold is 10¯⁴. Above that, sinh(x)/x works fine. Below that,
the two-term series works fine, since the next term in the series is
less than 10¯¹⁸, which is small compared to the machine epsilon.

BTW FWIW I prefer the name sinhc(x) = sinh(x)/x
which is perhaps not perfect, but it is memorable,
in analogy to the established name sinc(x) = sin(x)/x.
https://mathworld.wolfram.com/SinhcFunction.html

5) Details:

the possibility a greatly negative argument of exp() causing an underflow

I wouldn't worry about that. If (-a t) is hugely negative, the integral
is zero for all practical purposes. Normally the exp() function won't
throw an underflow exception; it will just return 0. No problemo.

possibility of values of the arguments of the intrinsic exp() and
sinh() functions getting too big & cause an overflow

I reckon that's unlikely to occur in a reasonable physics situation.
It could occur as a result of a program bug.
Normally exp(1000) just returns "inf", which seems reasonable.
OTOH yeah, you could explicitly check for that if you want.
On the third hand, it might be easier to turn on the FPU overflow
trap and let the hardware do the checking. It might throw exceptions
from places you never thought to check.

treats both endpoints t1 & t2 on a similar footing without unduly
biasing one end of the interval relative to the other.
Agreed. That's a useful sanity check.

==============================

All in all:

// numerically accurate sinh(arg)/arg
template<class T>
inline T sinhc(T arg) {
if (abs(arg) < 1e-4) return 1 + arg*arg/6;
return sinh(arg)/arg;
}


double mid((t2 + t1) / 2.);
double tau(t2 - t1);
rslt += exp(-rate*mid) * tau * sinhc(rate*tau/2.);

==============

Note: This applies to data in bins of size t2-t1.
We still hate bins, but sometimes we have no choice.

For example, suppose we want to construct models of the current public
health emergency. The state and county have the timestamped data, but
they won't release it. Instead they grudgingly release data in bins of
size one day, or sometimes two days or three days, depending on their
whims. It would be easier for them and everybody else if they would
just release the underlying data, but they won't.

We have to deal with the data that's available, but we don't have to
like it.