One of the professors teaching MITx: 6.00x Introduction to Computer Science and Programming stated that his goal was for us (the students in the class) to learn enough to break down problems and write python programs to solve them.
In the introductory chapter of my PhD thesis, I wrote that one of the reasons for modeling is to help explain the results of physical experiments. Sometimes, modern experiments (not to mention, nature!) are so complex, it can be hard to interpret the data. Through modeling, we can break problems down into simpler components and then add effects in until we match the observed data. Then we can explain what actually likely happened.
Anyway, I wrote a python program to clarify the friendship paradox because it also explains why I, with a BMI of 22.5, am often the fattest person in Pilates or spin class. Steven Strogatz explained:
For example, imagine going to the gym. When you look around, does it seem that just about everybody there is in better shape than you are? Well, you’re probably right. But that’s inevitable and nothing to feel ashamed of. If you’re an average gym member, that’s exactly what you should expect to see, because the people sweating and grunting around you are not average. They’re the types who spend time at the gym, which is why you’re seeing them there in the first place. The couch potatoes are snoozing at home where you can’t count them. In other words, your sample of the gym’s membership is not representative. It’s biased toward gym rats.So back to the friends problem. I started by thinking about the shape of the distribution of the number of friends one might have.
- The distribution should be self-similar; the shape of the distribution of the friends of friends should resemble the shape of distributions of friends of friends of friends.
- The distribution must be positive; you can't have less than 0 friends.
- The distribution should have a long right tail
paper about the self-similarity of the Euler Gamma Function. Gamma distributions are useful in modeling computer networks--why not people networks?
I sorted people by number of friends, from hermits on the left to social butterflies on the right. The number of friends for each person is in blue. The average number of friends of friends for each person is in red. Note that, for the first 400 or so of the 500 people, their friends have more friends than they do.
I had a heck of a time assigning friends to the social butterflies in a computationally efficient manner. If you sort people by numFriends (taken from the Gamma distribution) from hermits to butterflies, than go up the sequential list, assigning them more friends from those remaining until they reach numFriends and having those people friend them back, you will eventually reach a problem.
There are simply not enough people leftover to fill up the friend lists for those social butterflies. In that case, I had them friend everyone to their right and then randomly select friends from those on the left that had not initially selected them. I assumed that those on the left friended them back out of politeness, which added slightly to the total friend count.
The distribution of number of friends for a sample population of 500 people:
The commands to run the program:
numf = pickNumFriendsRecur(500, shape=3, scale=5 stretch=8)
flist = assignFriends(numf)
fof = howMany(flist)
max(numf) > nsubj: 501 500
trying reducing scale to: 4.9
avg numf after iteration: 115.732
Target average friends: 115.732
Adjusted average assigned friends: 121.724
Average friends of friends: 151.410319998
I encourage you (and your kids!) to view and download the sample code from pastebin and play around with it.