
[Phys-L] Re: Should Randomized Control Trials Be the Gold Standard of Educational Research?



One certainly needs to control the variables, but randomized control
trials in medicine are based on the idea that neither the experimenter nor
the subject knows which treatment is being given. This is absolutely
impossible in education. The only case I know of where a fair degree of
blinding on the part of the subjects was achieved is the study done by
Mehl. The control and experimental groups spoke different languages, but
were given the same exams, just translated into the appropriate
language. He also looked at the history of the two groups and found that they
tended to score the same in traditional courses. The experimental group
turned a 50% failure rate into a 100% passing rate.

As far as experimental and control groups go, the usual approach in
education is to establish the sort of gain that is normally achieved,
then change the conditions and see what gain the experimental group
achieves. Shayer & Adey signed up schools for their experiment
and established a baseline of output vs intake using those schools. Then
they measured the same parameters for the experimental group. One of the
things they found is that when the baseline results were plotted, all
schools fell on the same line: output could essentially be predicted from
input. After their program was installed, the experimental schools had output
that was much higher than the output-vs-intake baseline predicted.
Significant in this case means not just statistically significant, but
comparable to the sort of large gains seen in PER.
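
A minimal sketch of that kind of baseline comparison, using entirely
hypothetical numbers rather than Shayer & Adey's actual data, might look
like this in Python:

import numpy as np

# Hypothetical intake (pre) and output (post) scores for the baseline
# schools -- illustrative numbers only, not the Shayer & Adey data.
baseline_intake = np.array([42.0, 48.0, 55.0, 61.0, 67.0, 73.0])
baseline_output = np.array([45.0, 52.0, 58.0, 65.0, 70.0, 77.0])

# Fit the output-vs-intake baseline: a straight line through the baseline
# schools, reflecting the finding that output could be predicted from input.
slope, intercept = np.polyfit(baseline_intake, baseline_output, 1)

# Experimental schools (again, made-up numbers for illustration).
exp_intake = np.array([50.0, 60.0, 70.0])
exp_output = np.array([62.0, 74.0, 84.0])

# How far above the baseline prediction did the experimental schools land?
predicted = slope * exp_intake + intercept
excess = exp_output - predicted
print("Excess over baseline prediction:", excess)

The question is then whether that excess is both statistically significant
and large compared to the spread of the baseline schools about their line.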

Unfortunately, it is not possible to single out just one variable in a
unique fashion in education. That is much more feasible for short-term,
limited studies. As a result there have been very few valid studies of
long-term interventions. And there have been no studies of textbooks as such,
so you do not know whether a given text actually works better. The big problem
here is that it is not possible to separate the effect of a text from the
actions of the teachers. There have been studies of things like RealTime
Physics labs, which do have a good effect even in the hands of inexperienced
instructors.

Then of course there is the big problem that one can achieve short-term
increases in test scores through large amounts of drill and practice, but in
the end you sacrifice long-term gains in thinking ability. In addition, the
students can become totally turned off to the subject.

Finally, the type of studies being contemplated need an evaluation
process that reveals what sort of gains have been achieved. This means that
the tests will be made by someone who may have a bias toward a particular
type of result, such as recall of facts vs the ability to solve problems that
have not already been presented to the students. The current crop of high
stakes tests essentially has a bias in this direction. Constructing the tests
correctly requires extensive interview protocols to validate them.
Generally the current state tests are just made up by "experts"
in the field. Actually many of these are classroom teachers, and as a
result you get totally stupid and wrong physics questions on these tests.
The last Texas test is a case in point. Also, the current state tests are
fully released to the public, so new tests must be written every year, and
the expense of doing interview protocols each year prevents proper test
construction.

Meanwhile, the education industry really does not want its products tested.
It wishes to convince teachers that its products are better on the
basis of pictures, ancillary materials ...

To get back to the medical model: such trials essentially test just one thing,
such as drug A vs drug B or vs a placebo. The outcome is very easy to determine
by checking whether the patient gets better, or the progress of the disease
is arrested. In education, by contrast, the books are constantly being
revised, but if you look at the changes they are not that great from one
edition to another, and the number of errors never seems to go down. The
number of variables in the educational setting is much larger than in
medical research, and medical research has many more variables than most
physics research. Remember that in medical research a certain error rate
(mortality) is tolerated; it is never zero. In education a comparably
low error rate is impossible.

Now let us consider how to control variables in education. Teacher behavior
can only be controlled by constant monitoring and training, which is
prohibitively expensive in a long-term, large-scale study. It can
be done well by the researchers at their own institutions, but once the
material gets out it can be eviscerated by teachers who "know better" how to
use it. Then of course there is the effect of interfering parents. If they
think their children are being improperly taught, they will come in and tell
the teacher that he or she is wrong. They will hire tutors of dubious
competence, and in the end they can sabotage a study.

What about methods that have no immediate effect, but show a large
delayed effect? This is precisely the difficulty with convincing schools to
implement "Thinking Science". It has no immediate effect on test scores
during the two years (ages 10+, or middle school) in which it is implemented.
However, it has a large delayed effect that shows up near the end of high
school. Studies generally only look at a one-year time frame and never look
for delayed effects. Medical studies also generally take a short time
frame, except for certain diseases such as cancer, ALS ...

Incidentally, the advocates of randomized control studies have never done
such studies themselves in education, so it is really a case of "do as I say"
rather than "do as I do". In addition, they have never looked at the work
of Shayer & Adey, Feuerstein, Mehl, or any of the PER results, either
because those studies were not in the journals they looked at, or
because PER is not in the ERIC database.

In the end, yes, you do control variables: you control as many as
possible, but you still end up changing a whole bunch of them at once. The
researchers rely on constant testing and on comparing how various changes
influence test results, and they do this over many years of study. A single
"gold standard" study will hardly do what is needed. We know many things
that work, and many things that are useless. Despite that, the relevant
research is seldom used in teaching practice, because teachers teach as
they were taught. When the going gets tough they resort to traditional
practices whether those work well or not. In addition, parents and
administrators demand traditional practices and sabotage attempts to use
more effective ones.

John M. Clement
Houston, TX


I don't understand. If you can't control the variables, how can you get
statistically "valid" results?

My experimentalist friends (in particle physics) now divide uncertainties
into a "statistical" part and a "systematic" part. These two parts must
be combined somehow (how to combine them is still IMO an unsolved problem) in
order to arrive at a meaningful statement of uncertainty. When several
experiments measure the same quantity, a comparison of the quoted
uncertainties gives one an intuitive feeling for the uncertainty of our
knowledge of the quantity being measured.
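
One common, though debatable, convention is to treat the two parts as
independent and add them in quadrature; a minimal sketch, with purely
illustrative numbers:

import math

def combine_in_quadrature(stat: float, syst: float) -> float:
    """Combine statistical and systematic uncertainties in quadrature --
    one common convention that assumes the two parts are independent;
    as noted above, whether this is the right way is itself debatable."""
    return math.sqrt(stat**2 + syst**2)

# Illustrative numbers only: a quantity quoted as x +/- 0.8 (stat) +/- 0.5 (syst).
print(combine_in_quadrature(0.8, 0.5))  # ~0.94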

I have seen nothing in the field of measuring teaching techniques that
tells me we have much insight into how to make "statistically valid"
comparisons.
Regards,
Jack
_______________________________________________
Phys-L mailing list
Phys-L@electron.physics.buffalo.edu
https://www.physics.buffalo.edu/mailman/listinfo/phys-l