Thursday, November 19, 2009

Propagate Errors, not Bullshit!

Thank you, Eric.  I never fully understood the significance, or lack of significance, of p-values in clinical drug trials before your two posts.  It makes many expensive drugs much less impressive, now that I can see how marginal an effect they offer for $$,$$$ per year per patient.

I have to admit that I also read the chapter about p-values in the back of "the best detective story ever"*, Statistics by David Freedman, Robert Pisani, and Roger Purves. (You can pick up used older editions at very reasonable prices.  It is a fantastic introduction to how to apply statistics intelligently.)

I think that science education should include more rigorous statistical training.  The only training I ever received in statistics was in Honors Freshman Chemistry at Berkeley.  We were instructed to read the first chapter (36 pages) in our laboratory textbook, Chemical Separations and Measurements: Theory and Practice of Analytical Chemistry by Dennis G. Peters, John M. Hayes, and Gary M. Hieftje.  Then we did a problem set to make sure that we understood the normal distribution, how to propagate errors, and how to report our average values and 95% confidence intervals. Meager as that was, it was infinitely more than the nonexistent instruction that I received from the Physics department.
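For readers who never got even that much, here is a minimal sketch of the two skills that problem set covered: reporting an average with a 95% confidence interval, and propagating errors through a product. It is an illustration, not the textbook's own worked example. The readings, the function names, and the choice of a product as the propagated quantity are all made up for this sketch; the critical value 2.776 is the standard two-sided 95% Student's t value for 4 degrees of freedom (5 readings), and the propagation formula assumes independent, roughly normal errors.

```python
import math
import statistics


def mean_with_ci95(values, t_crit):
    """Return (mean, half-width of the 95% confidence interval).

    t_crit is the two-sided 95% Student's t value for len(values) - 1
    degrees of freedom, looked up from a table by the caller.
    """
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)   # sample standard deviation (divides by n - 1)
    sem = s / math.sqrt(n)         # standard error of the mean
    return mean, t_crit * sem


def propagate_product(a, sigma_a, b, sigma_b):
    """Return (q, sigma_q) for q = a * b with independent errors.

    For a product, the relative uncertainties add in quadrature.
    """
    q = a * b
    sigma_q = abs(q) * math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return q, sigma_q


# Hypothetical readings, invented for this example.
readings = [10.1, 9.9, 10.0, 10.2, 9.8]
mean, half_width = mean_with_ci95(readings, t_crit=2.776)
print(f"average = {mean:.2f} +/- {half_width:.2f} (95% CI, n = {len(readings)})")

# Hypothetical measured quantities with their standard uncertainties.
q, sigma_q = propagate_product(2.0, 0.1, 3.0, 0.3)
print(f"product = {q:.2f} +/- {sigma_q:.2f}")
```

The point of the exercise is the habit: every reported number carries its uncertainty with it, and the uncertainty is computed rather than guessed.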

We could have used more training earlier in our careers.

When I taught physical chemistry lab, I read An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements by John R. Taylor, one of the assigned textbooks.  That's another highly-recommended classic book.

* Actual quip from the back cover of the 3rd edition.  Isn't that the most lovely quip you can imagine for a statistics book?

How did you learn statistics?  This is a question for everyone, not just trained scientists.


  1. Ok, so it's not just me then (with the almost nonexistent statistics education). Somehow, you'd think a Ph.D. in physics would mean I would have a basic knowledge of the subject, but other than a cursory understanding of different distributions and their basic properties, I know nada... Very embarrassing ;) I guess I did learn a few things about errors and how to propagate them in the grad physics lab we were all required to take at Cornell, but that's it.

  2. I had a one-quarter class in undergrad, statistics for engineers or something like that. And then I had a two-week course on Six Sigma through a previous employer.

  3. Is Six Sigma a discipline or is it a religion? My sister says it has elements of both.

  4. hmm...I actually had a different experience.

    I had "error propagation" (rudimentary stats for Gaussian distributions) in undergrad (physics) and then pretty hardcore stats in grad school. Stats is actually something I got a lot of, as opposed to fluid dynamics (completely absent in my undergrad physics curriculum), numerical methods, and differential equations (only touched upon briefly in my undergrad. I know... right?).

  5. Marie-Christine, 01:29

    Oh, I took a class way back when in grad school. But where I really learned statistics was from the classic How to Lie with Statistics by Darrell Huff, from 1954 if you can believe it, and still totally current.

  6. I had one 9-week graduate class. I muddled through it, and retained nothing.

    I am learning again now as I go along. It helps very much to have actual problems to be concerned about.

  7. I only have an undergrad Botany degree. In my second-to-last quarter I took an amazing field work class that introduced a lot of statistics. I loved the subject and the teacher so much that I dropped the fluff I had signed up to take in my last quarter and took a grad-level stats class. Fun stuff.

  8. What statistics I know, I learned in undergraduate biology classes and in statistical thermodynamics in grad school. I have also picked up some from problems encountered as I work and from a recent training course in the use of JMP software. It looked like it might be helpful for me to learn some more, so I bought a book called PDQ Statistics (by Norman and Streiner). But then the project that was making me want to know more stats got canceled, and, while more stats knowledge would still be nice, it has fallen down on the priority list.

    I certainly don't consider myself an expert on statistics. Luckily, one of the contractors we do a lot of work with is an expert, so I can get an expert's opinion anytime I'm willing to pay for it!

  9. Six Sigma gets applied in lots of places that it shouldn't, but the class materials I got during my Six Sigma training are my best reference for applied statistics.

  10. I had some stats as part of grad work for an MA in TESOL. It helps whenever I see stats in the news. I think it was four weeks of a 10-week quarter on understanding experiment design and outcomes in applied linguistics. I think the class was also required for Ph.D. candidates in foreign language depts.

  11. My stats came from an undergrad physics lab and a graduate course in thermodynamics.

    I'll second the recommendation for How to Lie with Statistics given earlier, but since I read it while in elementary school I'm not sure that I would recommend it to an adult with a scientific education.

    My experience with Six Sigma is that it is a mostly beneficial philosophy, but it has become somewhat jargonistic:

    Contractor: "That's part of our Six Sigma process."
    Me: "But you only produce tens of widgets per year."

  12. I remember reading How to Lie with Statistics as a teenager, and remember liking it, but I don't remember much of the content. I never had a class in statistics, except of course Statistical Mechanics. Most of what I know about the statistical analysis of data I learned hanging on the street corner in a bad part of town, listening to older scientists, the women chewing gum loudly, the men with a pack of cigs rolled up in their t-shirt sleeve, all of them bragging crudely about their data analysis exploits.