Tuesday, August 09, 2016

More statistical nonsense

Let's talk about something that makes me mad, but only mildly so.  This also gives me a chance to jump up and down on my soapbox about bad data crunching.

Exhibit A, this piece of click bait from the NY Times with a tone of schadenfreude toward engineering majors:
I clicked and read these counter-intuitive numbers.
This doesn't jibe with my personal experience. The physical scientists I know are very, very civically engaged. How could we be such slackers about getting to the voting booth when I see "I voted" stickers on everyone in lab on election day?

Do I know a very atypical set of physical scientists?  I had a hunch that the explanation might be that (outside of school and student jobs) I have always worked in national labs that require US citizenship.

I did a little research.

First, I went to The National Study of Learning, Voting, and Engagement (NSLVE) website and read about the project. There appears to be a database accessible from that website. Because I'm not a participant in the research, I lack access to it.

There also appear to be some scholarly articles, which might have the summary data cited by the NY Times.  Again, I lack access to the articles.  (I'm not going to pay $41 for 24 hours of access to an article that may or may not have the data I seek.)

So I searched for "The National Study of Learning, Voting, and Engagement report". I was able to find several reports, including ones for Columbia and Long Beach Community College students.

In each report, I saw that the figures for the % of eligible students voting were calculated using IPEDS, and that the same percentage was applied to all majors at a school, based on the school's overall demographics.
"This is based on the percentage of non-resident aliens reported by your institution to the Integrated Postsecondary Education Data System (IPEDS), and is more reliable than the demographic data campuses provide to the Clearinghouse at this time."
Do you see the statistical flaw? The reports gave the numbers with this caveat at the top:
"Your students broken down by field of study. Please note that we are not able to adjust these voting rates by removing non-resident aliens."
The NY Times' poorly researched and poorly reported listicle did not include any methodology or context.

OK, now let's read what the National Science Foundation has to say about Higher Ed in Science and Engineering.
  • About 60% of all foreign graduate students in the United States in 2010 were enrolled in S&E fields, compared with 32% at the undergraduate level.
  • Foreign students earned 57% of all engineering doctorates, 54% of all computer science degrees, and 51% of physics doctoral degrees. Their overall share of S&E degrees was one-third.
  • In 2009, temporary visa students earned 27% of S&E master's degrees, receiving 46% of those in computer sciences, 43% of those in engineering, and 36% of those in physics.
Moreover, physical science and math students are vastly outnumbered by business and other students; the US graduated 19 business majors for every math or statistics major in 2011.

Let's list what we know:
  • Statistics tying individual students' majors to their voting behavior are difficult to obtain for privacy reasons.
  • They had to make estimates based upon school-wide statistics.
  • Each school reported the % of their students that were not on temporary visas.
  • NSLVE then applied the same % to all majors, even though they know this is inaccurate. They reported that this is a source of error.
  • They also removed students that were younger than 18 and not eligible to vote.
  • The % of students studying STEM is quite low compared to other majors, particularly business.   That gives larger error bars to STEM voting numbers, even without the eligibility estimation.
  • STEM students as a whole make up ~20% of the total undergraduate (UG) population, but 30% of the foreign UG student population; their voting participation is underestimated by the NSLVE methodology.
  • This means non-STEM students are more likely to be US natives; their voting participation is overestimated by the NSLVE methodology.
  • Foreign-born permanent residents are a wild card.  They do not need a temporary visa.  Yet, they cannot vote.  They are also disproportionately likely to be studying STEM.
  • Foreign students make up a disproportionate share of STEM students at every level, but particularly so at the graduate level.  They dominate in many STEM fields.  Thus, the voting participation of eligible STEM students is VASTLY underestimated by the NSLVE methodology.  (That 40% of all physical science students could very well be 90% of eligible students.)
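To make the arithmetic in that last parenthetical concrete, here is a minimal sketch in Python. Everything in it is my own illustration: the function name and all three input numbers are hypothetical, not figures from NSLVE, IPEDS, or NSF. It only shows how much the "rate among eligible students" moves depending on which ineligible share you divide out.

```python
def turnout_among_eligible(raw_rate, ineligible_share):
    """Convert a voting rate over ALL students into a rate over ELIGIBLE students.

    raw_rate:         voters / all enrolled students in the group
    ineligible_share: fraction of the group that cannot vote (e.g. non-citizens)
    """
    if not 0 <= ineligible_share < 1:
        raise ValueError("ineligible_share must be in [0, 1)")
    return raw_rate / (1.0 - ineligible_share)


# Hypothetical inputs, chosen only to illustrate the effect.
raw_rate = 0.40           # 40% of all students in a major voted
school_wide_share = 0.10  # ineligible share assumed for the whole school
major_share = 0.55        # hypothetical ineligible share within one STEM major

# Same raw rate, two different eligibility adjustments.
uniform_estimate = turnout_among_eligible(raw_rate, school_wide_share)
major_estimate = turnout_among_eligible(raw_rate, major_share)

print(f"school-wide adjustment:    {uniform_estimate:.0%}")
print(f"major-specific adjustment: {major_estimate:.0%}")
```

With these made-up inputs, the school-wide adjustment gives roughly 44% while the major-specific one gives roughly 89%. The raw count of voters is identical in both cases; only the denominator changes.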
I found all sorts of interesting information, especially at the National Center for Education Statistics:
Anyway, after examining the data, I think it is very, very likely that physical science students who are eligible to vote do so at higher rates than journalism students.  I'm sure the average physical science student is better with data than the average journalism student.  We might even be better than the average NY Times journalist.

Another piece of bullshit debunked.


