In 1945, a Cleveland newspaper held a contest to find the woman whose measurements were closest to average. This average was based on a study of 15,000 women. I want to point out that women were more homogeneous in 1945 Cleveland than they are today. I would guess that the contestants were mainly young women of European descent.
Out of 3,864 contestants, no one was average on all nine factors, and fewer than 40 were close to average on five factors.
Then Cook uses a normal distribution to simulate 3,864 women with 9 independent measurements to illustrate the point I made in Meeting Shams.
In real life, body measurements are correlated rather than independent; that's why RTW (ready-to-wear) pants can be clustered as "curvy", "straight" or "favorite" fits. Still, it's an interesting exercise, and Cook provides his Python code so you can play around with your own simulations.
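If you want a starting point, here is a minimal sketch of that kind of simulation. It is not Cook's code. The function name, the `tol` cutoff for "close to average" (in standard deviations) and the single shared correlation `rho` are my own assumptions for illustration; the contest's actual definition of "close" is not given here.

```python
import numpy as np

def count_near_average(n_people=3864, n_factors=9, tol=0.3, rho=0.0, seed=0):
    """Count simulated people who are near average on all factors
    and on at least five factors.

    tol: hypothetical cutoff, in standard deviations, for "close to average".
    rho: hypothetical common correlation between every pair of measurements
         (0.0 reproduces the independent case).
    """
    rng = np.random.default_rng(seed)

    # Covariance matrix with 1 on the diagonal and rho everywhere else.
    cov = np.full((n_factors, n_factors), rho)
    np.fill_diagonal(cov, 1.0)

    # Each row is one person; each column is one standardized measurement.
    z = rng.multivariate_normal(np.zeros(n_factors), cov, size=n_people)

    # Number of factors on which each person falls within tol of average (0).
    hits = (np.abs(z) < tol).sum(axis=1)

    all_factors = int((hits == n_factors).sum())
    five_or_more = int((hits >= 5).sum())
    return all_factors, five_or_more

if __name__ == "__main__":
    for rho in (0.0, 0.7):
        all_nine, five_plus = count_near_average(rho=rho)
        print(f"rho={rho}: all nine = {all_nine}, five or more = {five_plus}")
```

Try raising `rho` toward 1 and watch how the counts change. That is the correlated case that the independent simulation leaves out.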
BTW, I'll be giving a talk, Data Thinking Before Data Crunching, at the CISL 2016 Software Engineering Assembly on April 5, 2016. The following day, Mary Haley* and I will be co-teaching an all-day hands-on workshop for analyzing and visualizing spatial and atmospheric datasets with Python and NCL.
If you are in (or can get to) Boulder April 4th-8th, 2016, we'd love to host you at NCAR. This year's theme is Data Science.
See the program.
Apply for a student scholarship to attend.
* Mary is the lead software engineer for the visualization group and I am the education and outreach lead for the data support group.