Monday, November 30, 2009

Leaf Yoke and Vogue 1358 Rerun

I finished the Leaf Yoke sweater shortly after my last post about it.

I made a few alterations, adding i-cord at both the top and bottom of the lace yoke, changing the armhole shaping, and using plain garter stitch at the hem and armhole edges. Of course, I changed the gauge slightly, too.
Not only did I finish the sweater, but I sewed a skirt to match.
I have to stop using the skirt from Vogue 1358 (blogged earlier here) and use some of the many others in my collection. But this skirt is so easy to make and graceful to wear.  I made a few alterations to this as well.  I used a stretch velvet, which meant that I could use an elastic pull-on waistband. 

Because my last incarnation of this skirt turned out to be too heavy, I bought only 1.25 yards (60" wide), vowing that the skirt would be only as long as would fit in that 1.25 yards = 1 pound = $5.99/lb (from SAS Fabrics in Hawthorne, CA).   I ran the nap upward to make the velvet appear darker than the sweater.  If I ran it the other way, the bottom would have looked lighter than the sweater.

Then I added a 3" rectangular band of doubled stretch illusion to the hem (cut 6.5" and the width of the skirt, then folded and sewn w/ a 1/4" seam allowance).  It might have flowed more gracefully if I had used a curved band, but that would have added a seam at the hem. I made the choice to make it easy on myself and use the rectangular band. That's the great thing about making stuff yourself.  You can make alterations and experiment.

Time in a Box and Statistics

Kathleen's post, Time in a Box, about the oldest sewing patterns in her collection got me searching through mine to look for my oldies but goodies.

This is one of the oldest patterns that I bought new and kept. In my teens, I didn't buy many patterns. This is before patterns were sold as loss-leaders. They were pricey then.  (But, perhaps, better drafted and more thoroughly tested?)

I made the shorts several times and the pants once. The pants were a wadder, more due to my poor fabric choice than the pattern. But check out version one of the shorts!

I still remember that I paid $12/yard (1.5 yards) for the Hoffman cotton sateen, bought at Kaufman's Fabrics in downtown Berkeley.  I had to work 4 hours to pay for the fabric, notions and pattern at my job in the Chemistry lab, one of the highest-paying student jobs on campus.  I did gulp at the expenditure, but they became my favorite volleyball shorts.  I wore those giraffe shorts at least once a week for nearly a decade.

The giraffes were printed in Japan on American basecloth.  Hoffman is still selling cotton sateens for about the same price.  I wonder where they are made today?

Kaufman's closed years ago.  Willi Smith, the pattern designer, died of AIDS shortly after I made the shorts.

But I recently came back into contact with these friends/coworkers from the Chemistry lab through Facebook.  I joined FB at my neighbor's suggestion.  I didn't know what to make of it--I still am not sure what it is about.

But, a week after I signed up, someone contacted me and asked me if I was the person he went to HS and college with.  I checked his pix and profile and recognized his friends as our friends.  So I friended him.  Then our former boss and the organizer of the Berkeley Chemistry Demo Lab group on FB found me because I was a friend of a friend.  I joined the group and he sent me a link to this photo from 1987.  I love the 1980s fashion vibe in the photo.

This photo then reminded me of the book, Statistics by David Freedman, Robert Pisani, and Roger Purves.  In the introduction, they discussed how statistics can be used to draw the wrong conclusions.  For example, the acceptance rate for women for graduate school at Berkeley was much lower than for men.  Did that mean bias?  A committee looked at acceptance rates for a half dozen major departments, broken down by gender.  It turned out that acceptance rates varied by department, between ~30-80%.  In most departments, males and females were accepted at roughly comparable rates.  If anything, the data showed that women were more likely to be admitted on a department-by-department basis.  The overall higher rejection rate for women reflected that they were more likely to apply to the most competitive graduate programs at Berkeley.
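This reversal between per-department and overall rates is usually called Simpson's paradox.  Here is a minimal sketch of how it can happen.  The department names and applicant counts below are made up for illustration; they are NOT the real Berkeley figures, which I don't have at hand.

```python
# Hypothetical (applied, admitted) counts, invented for illustration only.
# These are NOT the real Berkeley admissions figures.
applications = {
    "less competitive dept": {"men": (800, 480), "women": (100, 65)},
    "more competitive dept": {"men": (200, 40), "women": (400, 100)},
}


def rate(applied, admitted):
    """Fraction of applicants who were admitted."""
    return admitted / applied


def overall_rate(group):
    """Admission rate for one group, pooled across all departments."""
    applied = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return rate(applied, admitted)


for name, dept in applications.items():
    rates = {group: f"{rate(*dept[group]):.0%}" for group in ("men", "women")}
    print(name, rates)

print("overall", {group: f"{overall_rate(group):.0%}" for group in ("men", "women")})
```

In this toy data the women do slightly better than the men inside each department, yet worse overall, simply because most of them applied to the department that rejects most applicants.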

Why did that picture remind me of the example in Statistics? Because everyone in the photo worked at the Chemistry Demo Lab (and stockroom) and there are two females and one male in the photo.  That was also the gender ratio of the workers during my tenure at the lab.  We were jokingly called Lonnie's angels by the male graduate students who used to hang out in the lab with us and get in our way.

Lonnie was called to task for hiring so many females.  If he drew his employees from students who had completed honors Freshman Chemistry and Quantitative Analysis,  and the population of that class was about 80-90% male, then why were the majority of his employees females?

Lonnie countered that he interviewed students that landed in the top third in theoretical grades and the top fifth in laboratory grades.  We made the stock solutions that the students used.  The students' quantitative analysis accuracy depended upon our accuracy.  It was very important to find the most careful workers.  Secondly, he wanted workers that understood what they were doing and why.

After the grade cutoff, he went by interview impressions.  We were interviewed by returning workers as well as Lonnie.  Basically, we had to be the type of person they wanted to work with.

So how did he hire 2/3 females from a class that was 7/8 male?  Because the female students who sign up for honors science classes are different than the males.  Women tend to underestimate their abilities and men tend to overestimate theirs.  The average female student who signed up for that class was stronger than the average male classmate.  Males and females were roughly evenly represented in the top third in theoretical scores, but the females had higher laboratory scores.  There was no bias.  Lonnie beat the rap.

No one told us, but we were smart enough to realize that, if we wore short skirts while performing the demos, the students were more likely to stay awake and pay attention.  And, thanks to FB, I learned that nearly all of my former coworkers went on to earn MDs and/or PhDs.  Lonnie certainly knew how to pick them.

Thursday, November 19, 2009

Propagate Errors, not Bullshit!

Thank you, Eric.  I never fully understood the significance, or lack of significance, of p-values in clinical drug trials before your two posts.  It makes many expensive drugs much less impressive, now that I can see how marginal an effect they offer for $$,$$$ per year per patient.

I have to admit that I also read the chapter about p-values in the back of "the best detective story ever"*, Statistics by David Freedman, Robert Pisani, and Roger Purves. (You can pick up used older editions at very reasonable prices.  It is a fantastic introduction to how to apply statistics intelligently.)

I think that science education should include more rigorous statistical training.  The only training I ever received in statistics was in Honors Freshman Chemistry at Berkeley.  We were instructed to read the first chapter (36 pages) in our laboratory textbook, Chemical Separations and Measurements: Theory and Practice of Analytical Chemistry by Dennis G. Peters, John M. Hayes and Gary M. Hieftje.  Then we did a problem set to make sure that we understood the normal distribution, how to propagate errors and how to report our average values and 95% confidence intervals. As meager as that was, that was infinitely more than the nonexistent instruction that I received from the Physics department.

We could have used more training earlier in our careers.

When I taught physical chemistry lab, I read An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements by John R. Taylor, one of the assigned textbooks.  That's another highly-recommended classic book.

* Actual quip from the back cover of the 3rd edition.  Isn't that the most lovely quip you can imagine for a statistics book?

How did you learn statistics?  This is a question for everyone, not just trained scientists.

Tuesday, November 17, 2009

What is the P-Value of Bullshit? Part Two

Another guest post from Eric

Recall in part one of "What is the P-Value of Bullshit?" we did a thought experiment, called "Study #3" in which we encountered a measurement, four heads in a row, with p-value as low as .06, which was nonetheless almost certainly due to random statistics.

Study #4: In Which the Naive Interpretation of P-value Is Partially Redeemed

Surely there must be at least some scenarios in which we can interpret the p-value as "probability of our result being bullshit"? Well, yes, and here's one: Suppose our jar of 1000 pennies now contains 500 two-headed pennies and 500 ordinary pennies. So now a two-headed penny becomes a mundane hypothesis, no longer a "way out there" sort of thing. Suppose we pick a penny at random, and do our four flips and get four heads, a p=0.06 result.

Q4: Now what's the probability that we are not holding a two-headed penny?
       The arithmetic goes [(1/2)/16]/[(1/2+(1/2)/16)] = 0.0588

A4: Almost 6%, about the same number as our p-value!

So in Study #4, in which we test a "mundane" hypothesis, your old thesis adviser's naive interpretation of p-value works pretty well.

As a general rule, in order to believe a measurement is real, you should look for a p-value that is small compared to how "out there" the result is you're trying to confirm. If you're testing a mundane hypothesis that is as likely to be true as not, then p=0.05 is probably good enough for you. But if you are trying to confirm that you have hit some one-in-a-thousand two-headed jackpot, then you'd better wait until you get a p-value of safely less than 1/1000 (e.g., better flip 13 heads in a row, not just four!). Incidentally, the philosophy of this post, and of this approach to hypothesis confirmation, is based on Bayesian statistics.
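To see the sliding criterion in numbers, here is a minimal sketch that applies Bayes' rule to the penny jar. The function name and the particular run lengths (4, 10, 13) are my own choices for illustration; the two priors are the 50/50 jar from Study #4 and the one-in-a-thousand jar from part one.

```python
def prob_not_two_headed(prior_two_headed, n_heads):
    """P(ordinary penny | n_heads heads in a row), by Bayes' rule.

    prior_two_headed: fraction of two-headed pennies in the jar.
    A two-headed penny always shows heads; an ordinary penny shows
    n_heads heads in a row with probability (1/2)**n_heads, which is
    the p-value of the run.
    """
    p_value = 0.5 ** n_heads
    ordinary = (1 - prior_two_headed) * p_value
    two_headed = prior_two_headed * 1.0
    return ordinary / (ordinary + two_headed)


for prior in (0.5, 0.001):
    for n_heads in (4, 10, 13):
        p_value = 0.5 ** n_heads
        doubt = prob_not_two_headed(prior, n_heads)
        print(f"prior={prior:<6} heads={n_heads:<3} "
              f"p-value={p_value:.5f}  P(ordinary)={doubt:.4f}")
```

With the 50/50 jar, the chance of an ordinary penny tracks the p-value closely. With the one-in-a-thousand jar, four heads leaves you almost certainly holding an ordinary penny, and the doubt only shrinks to a modest level once the p-value is well below 1/1000.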

You might complain that a sliding criterion for adequate p-value makes the believability of a statistical measurement a matter of subjective judgment. After all, usually we don't know ahead of time that we are fishing for a precisely one-in-a-thousand payoff, and we can only estimate how far "out there" our original hypothesis is.

My response to your complaint: tough tootsie roll. No one ever said doing science was going to be easy. You can blindly apply p-value analysis, and be a hack, or you can bring some careful thought to a problem, and be a real scientist. And speaking of hacks...

For a real-life example, let's go back to that NYT story, which had to do with two candidate AIDS vaccines. Each vaccine had been previously tested and shown quite decisively to be ineffective. The US Army and the NIH jointly decided to sponsor a placebo-controlled, Phase III human subjects trial on the use of the two vaccines in combination. Has the idea of using two vaccines in combination, when each is shown to be ineffective on its own, ever worked?

Dozens of AIDS scientists protested that this hypothesis was such a long shot that testing it amounted to a huge waste of AIDS-battle resources. Was it a one-in-a-thousand long shot, like our two-headed penny hypothesis? Who knows? But in any case it was surely one-in-25 or worse. The NIH and the Army pushed ahead, lined up 16,000 volunteers and spent $100 million, and in the end published a p = 0.04 result* claiming that the combined vaccine worked a wee little bit, providing immunity to only one in every three who got the full combined dose.

Does p = 0.04 mean that the probability that the published result is due to statistical noise is only 4%? The scientists interviewed in the NYT seemed to think so, but alas the study is most likely 100 million dollars' worth of statistical noise: bullshit.

At this very moment, somewhere in the world a scientist is testing a long-shot hypothesis: does eating a diet of only artichokes cure breast cancer? Are red-headed children more responsive to acetaminophen? There are thousands of such investigations going on. They are long shots, but every once in a while a seemingly bizarre hypothesis turns out to be true, so what the hell, no harm in checking it out?

Problem is, with many thousands of long-shot studies going on at any one time, by random chance you will get hundreds of "p = .04" results supporting hypotheses that are in fact incorrect. If you're from the naive school of p-value interpretation, you'll celebrate your p = 0.04 result by publishing a paper, or better, holding a press conference!
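A rough sketch of that bookkeeping follows. Every input is an assumption chosen for illustration, not a measured figure: 5000 studies, one hypothesis in a thousand actually true, a significance threshold of 0.04, and an 80% chance that a real effect reaches significance. It also assumes that a wrong hypothesis dips under the threshold with probability equal to the threshold itself.

```python
def significant_results(n_studies, share_true, threshold, power):
    """Expected counts of (true, false) 'significant' results.

    n_studies:  number of long-shot studies under way (assumed)
    share_true: fraction of hypotheses that are actually correct (assumed)
    threshold:  p-value cutoff that counts as 'significant'
    power:      chance a correct hypothesis reaches significance (assumed)
    """
    n_true = n_studies * share_true
    n_false = n_studies - n_true
    true_hits = n_true * power        # real effects that get detected
    false_hits = n_false * threshold  # noise that slips under the cutoff
    return true_hits, false_hits


true_hits, false_hits = significant_results(
    n_studies=5000, share_true=0.001, threshold=0.04, power=0.8
)
share_noise = false_hits / (true_hits + false_hits)

print(f"significant results backed by a real effect: {true_hits:.1f}")
print(f"significant results that are pure noise:     {false_hits:.1f}")
print(f"share of significant results that are noise: {share_noise:.1%}")
```

Under these assumed inputs, the handful of real discoveries is swamped by a couple of hundred false alarms, each with a perfectly respectable-looking p-value.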

And if you are a stats-challenged science journalist, you'll write the bullshit up for the New York Times.

*The "p=0.04" number actually comes from a fairly "aggressive" analysis. Playing more strictly by the rules, the study's authors got a still-less-impressive p=0.15.

**Thanks to Jonathan Golub of Slog for providing the point-of-departure for this two-part post. Always lively, readable and informative, Golub is, along with BMGM herself, one of my favorite bloggers on science and science policy. Like all prolific science writers, he has on rare occasions oversimplified and on very rare occasions, totally screwed up.

Monday, November 16, 2009

What is the P-Value of Bullshit? Part One

Eric here, sporadic guest poster on Bad Mom, Good Mom. I am a laboratory scientist working in Colorado.

Last week, BMGM aired one of her pet peeves, confusing correlation with causality.

My own statistical pet peeve? The oft-abused concept of p-value. Probably a majority of practicing laboratory scientists routinely misinterpret p-values. I'm not talking mere bit players either: last month the NYT reported on a Phase III human-subjects AIDS vaccine trial, run in Thailand by the US Army and the NIH. Naive thinking about statistics led to the publicized conclusions of the study being almost surely crap.

We'll come back to how we know the conclusions are crap, but first let's do a thought experiment. Imagine taking one two-headed penny and mixing it in with a jar of 999 ordinary pennies. Shake the jar up and pull out one penny. Don't look at it yet! Let's do some scientific studies.

Study #1: In Which We Collect No Data At All

Q1: Before you've looked at the penny you took out, what is the probability that the coin you are holding has two heads?
A1: You got it -- one in a thousand, or 0.1%.

Study #2: In Which We Collect Deterministic Data

Now look at both sides of the penny. Suppose you notice that on both sides, there is a head!
Q2: Now, what is the probability you are holding a two-headed penny?
A2: Yep -- unity, or 100%. OK, we're ready for something more difficult!

Study #3: In Which We Collect Some Odd-Seeming Statistical Data

Throw your two-headed coin back in, shake the jar, and again reach in and grab a single penny. Don't look at it yet! Now suppose you flip the coin four times and get a slightly unusual result: four heads in a row.
Q3: OK, given you flipped four heads in a row, now what would you say is the probability that your penny has two heads?

Well, if we were doing biomedical research, the first thing we do when we encounter statistical data is calculate the p-value, right? Turns out that if you take an ordinary (not two-headed) penny and flip it four times, then the probability you will get heads four times in a row is one in 16, or about 6%. So now we can (correctly!) define p-value by example: four heads in a row is a p-value 0.06 measurement.

Can we turn this idea on its head and say, "If we flip four heads in a row, then there is only a 6% chance the coin is not a two-headed coin"? Many practicing scientists would say "yes", but the correct answer is no, NO, Goddammit, NOOOOOO!

In our Study #3, picking a two-headed coin out of the jar is a very rare thing to do, one in 1000, whereas picking an ordinary coin out and flipping four heads in a row is only a slightly odd thing to do, (999/1000)(1/16), or about one in 16. Thus we get:

A3: the probability you are holding a two-headed coin is very small, (0.001)/(0.001+(999/1000)/16), or about 16 over 1000, only 1.6%. You are 98% likely not to be holding a two-headed coin!
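If the arithmetic in A3 feels slippery, you can check it two ways: exactly, with fractions, and by brute force, by drawing a penny from the jar many times over. This is a minimal sketch; the function names, the trial count and the random seed are arbitrary choices of mine.

```python
import random
from fractions import Fraction


def exact_posterior_two_headed(n_pennies=1000, n_two_headed=1, n_flips=4):
    """Exact P(two-headed | all heads) from Bayes' rule."""
    prior = Fraction(n_two_headed, n_pennies)
    p_all_heads_if_ordinary = Fraction(1, 2) ** n_flips
    return prior / (prior + (1 - prior) * p_all_heads_if_ordinary)


def simulated_posterior_two_headed(trials=400_000, seed=0):
    """Draw a penny, flip it four times, and repeat.

    Among the draws that gave four heads, return the fraction in
    which the penny really was the two-headed one.
    """
    rng = random.Random(seed)
    four_heads_count = 0
    two_headed_count = 0
    for _ in range(trials):
        two_headed = rng.randrange(1000) == 0  # 1 penny in 1000
        # Four fair flips as four random bits: all ones means four heads.
        four_heads = two_headed or rng.getrandbits(4) == 0b1111
        if four_heads:
            four_heads_count += 1
            two_headed_count += two_headed
    return two_headed_count / four_heads_count


exact = exact_posterior_two_headed()
simulated = simulated_posterior_two_headed()
print(f"exact:     {exact} = {float(exact):.4f}")
print(f"simulated: {simulated:.4f}")
```

The exact answer works out to 16/1015, which is the "about 16 over 1000" quoted above, and the simulation should land close to it.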

Bottom line: your seemingly significant, p = 0.06 measurement of four heads in a row was not strong evidence of a two-headed coin, and saying otherwise would be, in the technical jargon of the trained laboratory scientist, "bullshit".

Perhaps your research adviser told you the p-value meant "probability your measurement is merely random noise". Is s/he always wrong about that?

Nah, the old geezer got it right once in a while, if only by accident. To find out about that exception, stay tuned for part two of this post!

**Thanks to Jonathan Golub of Slog for providing the point-of-departure for this two-part post. Always lively, readable and informative, Golub is, along with BMGM herself, one of my favorite bloggers on science and science policy. Like all prolific science writers, he has on rare occasions oversimplified and on very rare occasions, totally screwed up.

Friday, November 13, 2009

Leaf Yoke Detour

Because I will do anything to avoid seaming up Shadow October Frost.  Including knitting another sweater.  Angela Hahn's Leaf Yoke Top [Ravelry link] in Knit Picks Comfy worsted, a super-soft pima cotton and acrylic blend.  The catalog called the color "planetarium".  I was expecting an inky navy-black.  It is really a dark Prussian blue.  Now why don't they just say that in the catalog instead of giving it an artsy-fartsy name?

Proof that cell phone use impairs driving

I was driving east on crowded Manhattan Beach Boulevard, performing my soccer mom duties, when this humongous SUV kept veering into my lane.  I looked over to see if the driver was drunk or perhaps having a stroke.  Nope, he was driving while talking on his hand-held cell phone, which BTW is illegal in California.

I told Iris to quickly get out the camera and catch him in flagrante delicto. He looked at our car and the camera and put down the phone. A few hundred yards later, he picked up the phone again. Iris managed to get some good photos.  Notice the height differential between our Prius and his Porsche Cayenne S.  His bumpers are aimed squarely at our head and shoulder height.  He's not paying attention to where he is aiming that 5000 pound weapon.

If you see this car, steer clear. He's a menace to society.

How to become a home cook

I've been thinking, reading and writing a great deal about food lately.  It ramped up after I read Animal, Vegetable, Miracle.  I meant to post something about Michael Pollan's screed, Out of the Kitchen, Onto the Couch, but there was plenty of ink and pixels spilled across the internet without my contribution.

It's easy to judge people for watching others cook, rather than getting into the kitchen and cooking themselves.  But, what if someone doesn't know how to cook?  Where do they start?   How does someone who doesn't know a clove of garlic from a head of garlic* get started?

Cookbooks by celebrity chefs that are familiar from TV may not be the best place to start.  Professional chefs cook on restaurant scale on professional equipment (50,000 BTUs?  No problem!) with rare ingredients.  Years ago, a NYT article claimed that ~20% of cookbook recipes don't even work when tested in a home kitchen.  Recipes from celebrity restaurant chefs were heavily over-represented in the bad recipes.

I am a huge fan of Marion Cunningham because she tests each recipe in a home kitchen, using basic home equipment.  Then she has others test the recipes in their home kitchens.  I respect that attention to detail.

Learning to Cook with Marion Cunningham is the best book I have ever seen for learning how to cook. She assumes no prior knowledge, explains every term and shows every step.  She developed the book while teaching rank beginners how to cook.  If you want to learn how to become a home cook, this is the place to start.

I've compiled a list of my most useful cookbooks (plus one example of the type of cookbook I hate). 

A bilingual compilation of Taiwanese recipes doesn't have an ISBN # and doesn't show up on that list.  But it's also highly recommended. It was put together by the Northern California Chapter of the North America Taiwanese Women's Association.  My mom might still have more.  Email me if you want a copy.

* Don't laugh, but I once had a housemate who borrowed one of my cookbooks and made a garlic pasta with three heads of garlic.  He thought that was a lot of garlic, but the recipe said 3 cloves of garlic.  He didn't know what a clove was, but he assumed it was a unit of garlic.  The entire garlic bulb looked like the basic unit of garlic to that novice.  So do not assume prior knowledge.  Newbies are not necessarily dumb, but they don't know the jargon yet.

Tuesday, November 10, 2009

The catchy name school of science

Donald Knuth wrote in The Art of Computer Programming something to the effect of "The bubble sort has nothing to recommend it except for a catchy name."

What's catchier than "dandelion kids" and "orchid hypothesis"?  Read the Science of Success in the December issue of Atlantic.  Then read Genetic 'breakthroughs' in medicine are often nothing of the sort:
Don't believe everything you read about genes and disease in prestigious journals like Science and Nature, say Marcus Munafò and Jonathan Flint. A lot of it is simply wrong.
I don't have time for a longer post. I have looming deadlines at work and at home. But my money is on Enrico Fermi, Marcus Munafò and Jonathan Flint.    ;-)

Discuss among yourselves.

Sunday, November 08, 2009

Clothespin Extinction

I spent a week in Boulder, attending to IT issues at NCAR and a satellite data users' workshop. I had a few free hours* one afternoon and browsed the aisles of McGuckin Hardware.  I was looking for the clothespins I use, shown in How to Use a Clothesline.

I learned that Penley, the maker of my old sturdy wooden clothespins, has discontinued domestic production. They now make their clothespins in China. I have no idea if the quality is the same. McGuckin Hardware sells both the Penley wooden ones made in China, and a plastic variety.  Has anyone used them?  Do they hold wet, heavy laundry?

I have no use for the high-style ones mentioned in this story:
Nowadays plastic clothespins are available in endless variations, including a new one that has gone into widespread production, Zebra’s “sweet clip,” made with both hard and soft plastics, using a dual-injection manufacturing process. The hard plastic is in the long handles, while two softer cushions sit where the pin grips the clothes. Zebra developed a dual-plastic toothbrush 15 years ago, applied the principle to clothespins in Europe in the late 1990s, obtained a worldwide patent, and captured 8 percent of the global clothespin market. The pin is sold in North America under the name Urbana.
     “We love to target stupid products,” says Xavier Gibert of Zebra. “When you walk into a megastore, most of the time you see stupid products, boring products. You buy them because you need them. We target basic products to make them come alive, able to talk to people.” And what does the Urbana clothespin say? Something along the lines of “I’ll be gentle.”
     “The key of this peg is not to be able to hold very heavy clothes,” says Gibert. “It’s much more dedicated to sensitive clothes.” Response to the pin has been enthusiastic. “People were attracted by the design. They said, ‘Wow, we love the shape.’”
If we want to save carbon and achieve energy independence, we will need clothespins that can securely hold heavy and wet laundry.

* Ok, I squeezed in visits to Elfriede's Fine Fabrics and Shuttles, Spindles and Skeins.  There was stash enhancement.  Photos after I come up for air.  This is the first weekend in a month in which both Bad Dad and Bad Mom were home at the same time.  Iris celebrated by coming down sick.  She was a trooper at soccer playoffs this weekend.  Her team was short three players, and she played two quarters, even though she was very weak and tired.  She says I revived her for the fourth quarter with my magic zucchini-chocolate chip muffins.

Monday, November 02, 2009

TSA Story

I was running late to catch a flight two Sundays ago.  When I went through the security checkpoint, everything came out except for my laptop.  What was wrong?

I saw two TSA employees in discussion and looking at my laptop. They signaled a third one to come over and take a look. Yipes! I asked them if there was something wrong.

One TSA employee said to another, "Why don't we ask her?"

So she turned to me and asked, "Did you make that case?"

Gulp.  "Yes."

She smiled and turned to the other one, "I told you that you can't buy a case like that."

Remember my not so minimalist laptop case?

I survived the heaviest October snowstorm in Colorado since 1997 and got home to LAX safely.  I will be in Pasadena (Caltech) later this week.  In two weeks, I will be in Boston.  In four weeks, I will be in DC.  It looks like that laptop case will get around. 

Sadly, I lost the minimalist camera case in Twain Harte, CA; the one I made to replace it was lost inside Hoover Dam.  I am sad about the second one.  I used a vintage button and the last bits of two fabrics.  I didn't even have time to blog about it and Iris dropped it during the Hoover Dam tour.  She did take some nice photos, though.  Perhaps she can help me make the next case.