
Re: [Phys-l] experimentation 101



I agree with almost everything John Denker has said. Graphical presentation (scatter plot), definitely. "Boo" to 2 or 3 sig figs; an "intro to statistics" text (Triola) would say to use 5 (one more than the data). Analysis of the variation would be an essential part of reporting this.
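A minimal sketch of the kind of variation summary I mean, in Python, using the six values quoted in the question below (the particular statistics chosen here are only illustrative):

    # The six hallway measurements (in cm) from the question.
    data = [440.2, 421.7, 434.5, 492.5, 437.2, 428.9]

    n = len(data)
    mean = sum(data) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = (sum((x - mean) ** 2 for x in data) / (n - 1)) ** 0.5

    print(f"n     = {n}")
    print(f"mean  = {mean:.1f} cm")
    print(f"s.d.  = {sd:.1f} cm")
    print(f"range = {max(data) - min(data):.1f} cm")

Reporting the spread alongside the mean already makes plain that the variation is far larger than meter-stick reading error.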

My question comes with "The rule is simple: Data is data." Not really. Data, as John presents it, is an interpreted measurement. The simplest rule is "the system is what it is." Every measurement I (or you) make is based on use of the measuring instrument plus an interpretation of the extent of the system. The "data" that has been given here is really six different interpretations of the hall. Are there 6 students, each doing 1; are there 2 students sharing the work; or are there 10 students, allowing one to take 6 measurements? That is critical, and that is an interpretation that affects the measurements. These measurements are not without bias, and measurements taken incorrectly don't accurately describe the system. If you know some facts about the system (the length of the hall does not change with time), the large variation definitely points to poor technique or poor instrumentation. If one didn't have any a priori information about the system, throwing away outliers would definitely be a bad thing to do.

JD goes on to reinforce what I've said by giving his examples: the data have resulted from circumstances, interpretations, and measurements that require further investigation. If the investigation results in finding that the measurement instrument was incorrectly used (including miscalibration), the "data" should be discarded (all of it) and the experiment done again (which I believe should have been done here, with both collections being reported).

My example: I was involved in a building project in which we were building roof trusses on-site. We measured the width of the building in several places with consistent results. The pattern pieces were made, the pieces cut, the trusses built (yeah...stupid). When we set the first one, it was short (didn't hang over the edge enough). We measured the truss, remeasured the width, and got the same data. We decided to use a different measuring device and got 1 foot wider. What?! We put the devices side by side and then noticed that the first one had a 1-foot leader section before its "zero" started. The crew chief then got some snips and cut the 1-foot leader off. Lesson: always learn where "zero" is. Thankfully, we were able to nail extensions to the trusses and use them. We should have thrown out all the data, because the interpretation of the system was wrong.

JD, excuse me if what I've said sounds like nit-picking, but data is not fact; data is an interpretation of a system.

Thanks for the thoughtful question.
________________________________________
From: phys-l-bounces@carnot.physics.buffalo.edu [phys-l-bounces@carnot.physics.buffalo.edu] On Behalf Of John Denker [jsd@av8n.com]
Sent: Wednesday, July 14, 2010 3:27 PM
To: Forum for Physics Educators
Subject: Re: [Phys-l] experimentation 101

On 07/14/2010 11:48 AM, I wrote:
Here is an interesting question (not original with me):

A group of students are told to use a meter stick to find the length of a hallway.
They take 6 independent measurements (in cm) as follows: 440.2, 421.7, 434.5,
492.5, 437.2, 428.9. What result should they report? Explain your answer.

So, the actual questions for today are:

a) What result would you expect your students to report?
What explanation(s) would they give?

b) What result would *you* report in this situation?
What explanation would you give?


People often say I have a keen grasp of the obvious. I take it
as a compliment, even when it is not intended as such. As the
saying goes, grasping the obvious is preferable to fumbling the
obvious.

In this case the obvious thing to do is .... report the data!
Also report whatever is known about the circumstances of the
measurement.

The data could be presented in tabular numerical form, and/or
in graphical form -- preferably both.
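A minimal sketch of such a graphical presentation (Python with matplotlib; the plotting details are only illustrative):

    import matplotlib.pyplot as plt

    # The six measurements (in cm) as reported.
    data = [440.2, 421.7, 434.5, 492.5, 437.2, 428.9]

    # Plot the raw observations against trial number -- no fit,
    # no summary statistics, just the data.
    plt.scatter(range(1, len(data) + 1), data)
    plt.xlabel("trial")
    plt.ylabel("measured hallway length (cm)")
    plt.title("Hallway length: 6 independent measurements")
    plt.show()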

Anything that goes beyond this is not data, but rather /interpretation/
of the data. The rule is simple: data is data. Interpretation is
interpretation. Do not confuse data with interpretation.

According to reference /1/, here's the procedure that is "most
correct according to expert opinion":

a) The students were supposed to notice that 5 of the 6 measurements
are clustered "near" 432.5 while the remaining measurement is an
"outlier".
b) They were supposed to discard the outlier.
c) They were supposed to take the average of the 5 remaining numbers.
d) They were supposed to express the average using 2 or 3 sig figs.
(Don't ask me why the "experts" could not agree whether 2 or 3
sig figs were required; I would have thought that sig figs were
vague enough already, with no need to increase the slack....)
e) Preferably they would also report the uncertainty explicitly.

I agree with item (a) as far as it goes. It is always good to notice
things. Graphing the data makes it easier to notice clusters and
outliers.

Meanwhile ... I must object to item (b).

The rule is: never discard outliers just because they are outliers.
You need a darn good reason for discarding any data. If you do
discard anything, you *must* do a detailed sophisticated analysis
to see what effect that has on your results. Such an analysis is
well beyond the scope of typical undergraduate science and/or statistics
courses.
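As the crudest possible first look at that effect (a sketch only, nowhere near the detailed analysis meant above), one can at least compare the summary statistics with and without the suspect point:

    from statistics import mean, stdev

    data = [440.2, 421.7, 434.5, 492.5, 437.2, 428.9]
    kept = [x for x in data if x != 492.5]   # the point the "experts" would discard

    for label, d in (("all 6 points     ", data), ("outlier discarded", kept)):
        print(f"{label}: mean = {mean(d):.1f} cm, s.d. = {stdev(d):.1f} cm")

The mean moves by 10 cm and the quoted scatter shrinks by more than a factor of three -- exactly the kind of change that must be reported, not hidden.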

Example #1: Suppose a gold miner hires you to survey the surface
of the earth. You measure what percentage of gold there is at various
locations, and discard the outliers. You have just completely defeated
the purpose of the survey! People do not mine for gold in "average"
locations! They mine for gold in the locations that have millions of
times more gold than average!

Example #2: Once upon a time each of the students in a high-school
chemistry class was assigned to do the classic experiment of putting
a candle under a beaker and measuring the time it takes before the
candle goes out, and plotting the time as a function of the volume
of the beaker. Most of the students got data that looked about the
same, with a lot of scatter ... but another student got data that
followed a straight line with much less scatter, and with a slope
markedly different from what the other students got. The latter
student got an F, since it was obvious that he had faked the data.

Example #3: Once upon a time, the grad students at a certain university
got to thinking about the stipends they were getting, and wanted to
compare that to the cost of living. So they surveyed their fellow
students to see how much they were spending for food, lodging, books,
transportation, et cetera. They found that most of the results were
rather tightly clustered ... but there was one outlier. They discarded
this outlier, on the grounds that nobody could possibly be spending that
much. Obviously this data point was unreliable. Apparently the student
in question was too stupid to answer the question correctly.

Examples #4 through 492,761: The history of science is pockmarked
by many examples where people "improved" their data by discarding
observations that they didn't understand.

===

Note: In example #2, the outlier was me. I had figured out that hot
air rises, and it seemed obvious to me that there was going to be a
lot of stratification inside the beaker, so that the length of the
candle was going to be a major uncontrolled variable unless I did
something about it. So I used a knife to make the candle as short
as it could possibly be.

I explained this to the teacher, and he changed the grade from F to A.

Note: In example #3, I found out about this long after it was too
late to do anything about it. I explained to the committee who had
done the survey that it was outrageously unscientific to discard data.
The fact that some of the data was beyond their understanding did
*not* give them license to discard it. I told them the gold-mining
story.

And, in case you were wondering, the outlier was me. I had spent
several years working in industry before starting grad school, so
I did not need to rely on my fellowship stipend as my sole source
of money. I still lived reasonably frugally, but I did not need
to subsist on peanut butter or sleep in a garret.

It was ironic that this committee was trying to argue for higher
stipends. They had cut their own throats by discarding the outlying
data. In fact that one data point was in some ways the best data
they had, showing how much a reasonably frugal life would cost if it
were not artificially constrained by the existing stipend structure.

==========

Returning to the hallway length example: For several reasons, I
insist that reporting all the raw data is the only viable option
(unless you are going to throw out *all* the data and start over
using better methodology).

1a) You cannot hide behind the idea that you were asked to report
a "value". The actual question asked you to report a "result".
A result is not necessarily a single value.

1b) What's more, even if the question /had/ instructed you to report
a single value, scientific integrity would demand that you not
follow such an instruction.

2) Again, remember that data is data, and interpretation is interpretation.
You can /consider/ the option of summarizing the data by reporting
the mean and standard deviation, but you have to ask whether this
is a _good_ way of summarizing the data.

It could be hypothesized that the 5 clustered data points can be
described by a nominal "average" value plus some random noise, whereas
the full set of 6 data points cannot be described in that way.

2a) Even if that were true, it would not give you a license to discard
the outlying data point. It is not your job to find the laziest
way to represent the data ... or in this case the laziest way to
misrepresent the data.

2b) I do not believe this hypothesis at all. Even within the "cluster"
the data is spread over an interval more than 18 cm wide. There is
no way that laying a meter stick end-to-end 5 times can accumulate
that much "noise". Therefore we must conclude that there is something
desperately wrong with the measurement process, something that affects
most (perhaps all) of the 6 measurements. It is not even remotely
safe to describe any subset of the observations in terms of nominal
"average" value plus random fluctuations.

2c) By way of presentation and representation, I would present
the data as a scatter plot (in addition to the tabular numerical
representation). Then the readers can see for themselves that 5
of the data points are clustered and the 6th one is an outlier.

By way of interpretation and summary, I might go so far as to point
out the obvious features of the scatter plot, but this is about as
far as I would dare to go. By way of non-interpretation I would
point out that the scatter does not look like random noise, as
discussed above.

3) Lastly, the idea of rounding off the result(s) to "2 or 3 sig figs"
is an outrage. I won't spell out the details here. If you're
interested, see:
http://www.av8n.com/physics/uncertainty.htm

===================================

Where did they find the "experts" who said that it was "correct" to
discard outlying data and to amputate the data to 2 or 3 sig figs?

I don't know whether to laugh or cry when I read things like that.

=====

I am pleased to note that the responses posted to this list so far
have been quite sensible. For example:

On 07/14/2010 12:57 PM, marx@phy.ilstu.edu wrote in part:
(b) Given the range of results, I should think that the students need more instruction on the proper use
of a meter stick for making such measurements. We also need to define where the hallway begins
and ends. Then, we try again. I expect the measurements should be much closer together for this type
of measurement.

Quite so.