I have a question about data analysis. To provide context, I'll briefly describe a simple lab.
With my high school students, I measured the slit spacing (d) of a certain grating using a laser of known wavelength. After taking the required measurements, we performed the data analysis by calculating d separately from each measurement and then averaging the results. We also computed a rough estimate of the uncertainty in d, using simply (max - min)/2.
Then we repeated the analysis by linearizing the principal-maxima equation (d sin(theta) = k * lambda) to produce a straight line and calculating d from the slope. This was done using the least-squares fit (LSF) method. The results of the two methods agreed within the uncertainty limits.
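To make the comparison concrete, here is a small sketch of the two analyses. The wavelength, orders, and angles below are hypothetical illustrative values (not my students' actual data), chosen to resemble a 633 nm He-Ne laser and a grating with d near 5 micrometres:

```python
import numpy as np

# Hypothetical measurements (assumed values, for illustration only):
lam = 633e-9                                     # laser wavelength in metres
k = np.array([1, 2, 3, 4])                       # principal-maximum order
theta = np.array([0.127, 0.256, 0.389, 0.529])   # measured angles in radians

# Method 1: compute d separately for each measurement, then average.
d_each = k * lam / np.sin(theta)
d_avg = d_each.mean()
d_unc = (d_each.max() - d_each.min()) / 2        # rough (max - min)/2 estimate

# Method 2: linearize d*sin(theta) = k*lambda as sin(theta) = (lambda/d)*k,
# a straight line through the origin with slope lambda/d, and fit the
# slope by least squares (for y = m*x, the LSF slope is sum(xy)/sum(x^2)).
x, y = k, np.sin(theta)
slope = (x * y).sum() / (x * x).sum()
d_fit = lam / slope

print(f"averaging: d = {d_avg:.4e} +/- {d_unc:.1e} m")
print(f"LSF slope: d = {d_fit:.4e} m")
```

With data like these, the two estimates agree well within the (max - min)/2 uncertainty, mirroring what we observed in the lab.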
My question is the following: why is the LSF method considered to provide a better estimate than averaging, both in this particular case and in general? (Assuming, of course, that the data fulfill the assumptions underlying the LSF method.) I know how the LSF and weighted LSF methods are derived, but I can't seem to find an answer to this question that would be appropriate for my high school students. Could you help?