Re: [Phys-l] Another uncertainties question...



On 01/25/2008 01:45 PM, Jason Alferness wrote:

The report simply of "Less than (value)" seems to be quite common...
I've seen it in contamination testing, medical reports, blood tests and
a number of other things. So I don't think it's a quirk of one
particularly stupid lab. I'm not asking really how it _should_ have
been reported, I guess, but rather if this is what one has (and it seems
this is what one is likely to get in many places) and is forced to work
with it, what's the most intelligent thing to do with it, especially
with respect to crunching an "average" out of some stack of these
numbers (and in some cases lack of numbers). I could work much better
with the above asymmetric and plain English descriptions, but it seems
fairly unlikely that we'll get them.

Ah ... I didn't realize the question concerned /using/ the
numbers (as opposed to generating them). Sorry.

I guess the best thing about my previous note is that I made
my false assumptions explicit, rather than leaving them implicit.
So now we can move forward.

The question of what to do with the numbers is highly dependent
on the purpose to which the numbers will be put. Remember:
-- Uncertainty of measurement depends on where the numbers
came from.
-- Significance has to do with what the numbers will be used
for.

Let's denote the detection level by D. And let's consider
detection by nose, rather than by fancy instruments.

Scenario 1: Mercaptan. Methyl mercaptan. It's used as an
odorant in natural gas supplies. The mercaptan itself is
not good for you, but it doesn't really matter, because the
detectable level is a gazillion times lower than the danger
level. So a "no detect" level implies a "no danger" level,
and that's all you need to know for this purpose. Averaging
is not required. Averaging will not change the result.

Scenario 2: Carbon monoxide. This is the opposite scenario,
in the sense that it is colorless and odorless, even in
concentrations that are acutely dangerous. So determining
a "no detect" level is worthless. Averaging doesn't help.
Averaging will not change the result.

Moral of the story: It all depends on what *you* want to do
with the data. You have to start with what you care about
and work backward to see what level of precision in the raw
data is needed to tell you what you need to know. The
available "no detect" data may or may not tell you anything
worth knowing. It all depends.

Possibly constructive suggestion: The aforementioned "working
backwards" is easier than it sounds. This situation is tailor-
made for an iterative "what if" analysis, sometimes called
"crank three times".
http://www.av8n.com/physics/uncertainty.htm#sec-crank3
You know the number behind the "no detect" report is at least
zero and at most D, so do the calculation twice: Calculate
whatever *you* care about, once with zero and once with D.
If you're lucky, D is small enough that both calculations
produce an acceptable result. Otherwise it is provably
impossible to solve the problem using only the given data.
You will have to go upstream and get better data.
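
Here is a minimal sketch of that two-pass calculation, in Python.
The readings, the value of D, and the helper name crank_twice are
all made up for illustration; None stands for a "no detect" report.
The quantity being computed defaults to a plain mean, but you
should plug in whatever *you* care about. Caveat: using only the
two endpoints assumes the quantity rises or falls steadily as the
hidden values go from zero to D. If it doesn't, you need to try
intermediate values too.

```python
from statistics import mean

def crank_twice(readings, detection_level, quantity=mean):
    """Evaluate `quantity` twice: once with every no-detect (None)
    replaced by zero, and once with it replaced by the detection level."""
    with_zero = [0.0 if r is None else r for r in readings]
    with_d = [detection_level if r is None else r for r in readings]
    return quantity(with_zero), quantity(with_d)

# Hypothetical data: three "no detect" reports and two real numbers.
readings = [None, 0.8, None, 1.2, None]
D = 0.5

low_result, high_result = crank_twice(readings, D)
print("with zero:", low_result)
print("with D:   ", high_result)
```

If both printed values lead you to the same decision, you are done.
If they straddle whatever limit you care about, the given data
cannot settle the question.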

Tangential remark: In the absence of additional information,
averaging isn't going to help. The "no detect" reports could
correspond to a level of zero, or they could correspond to
a level that is 95% of the detection threshold every time.
The latter case is not particularly far-fetched; in an
industrial/regulatory situation, it makes economic sense to
engineer the process to the point where it is just barely
within tolerances, with a small margin of safety.
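
To make this concrete, here is a small Python sketch. The report()
function is my own stand-in for what the lab does (an assumption,
not anybody's actual procedure): it passes along the number if it
is at or above D, and otherwise says "no detect", represented here
as None. The two sets of true levels are invented.

```python
def report(true_level, detection_level):
    """Hypothetical lab report: the value if detectable, else None."""
    return true_level if true_level >= detection_level else None

D = 1.0
truly_clean = [0.0] * 5            # nothing there at all
barely_passing = [0.95 * D] * 5    # just under the threshold every time

reports_clean = [report(x, D) for x in truly_clean]
reports_marginal = [report(x, D) for x in barely_passing]

# The two stacks of reports are identical, so no amount of
# averaging over the reports can tell the two cases apart.
print(reports_clean == reports_marginal)
```

The true averages differ by 0.95 D, yet the reports you actually
receive are the same in both cases.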

The problem with a "no detect" result can be understood as
a form of roundoff error. That is, anything less than the
detection threshold is being rounded off. If the raw data
(before rounding) was more precise than this, the data is
degraded by rounding. Rounding prevents you from doing things
with the data that you otherwise could have done. (Sometimes
this is intentional, for instance in a regulatory situation
where the producers never wanted to give up the number in the
first place; they will give up the least-useful number they
can get away with.)