I began by taking a sample of about ten million pairs of friends from Apache Hive, our data warehouse. I combined that data with each user’s current city and counted the friendships between each pair of cities. Then I merged the result with the longitude and latitude of each city.
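To make the aggregation step concrete, here is a minimal sketch in Python. It is not the Hive query that was actually run. The function names, the tiny sample data, and the choice to treat a friendship as undirected (so London–Paris and Paris–London count as the same pair) are all assumptions made for illustration.

```python
from collections import Counter


def count_city_pairs(friend_pairs, user_city):
    """Count friendships between each unordered pair of cities.

    friend_pairs: iterable of (user_id, user_id) tuples (hypothetical input).
    user_city: dict mapping user_id -> current city name (hypothetical input).
    """
    counts = Counter()
    for a, b in friend_pairs:
        city_a, city_b = user_city.get(a), user_city.get(b)
        if city_a is None or city_b is None:
            continue  # skip users with no known current city
        # Assumption: friendship is undirected, so normalise the pair order.
        key = tuple(sorted((city_a, city_b)))
        counts[key] += 1
    return counts


def attach_coordinates(counts, city_coords):
    """Join each city pair with (latitude, longitude) for both endpoints."""
    rows = []
    for (c1, c2), n in counts.items():
        if c1 in city_coords and c2 in city_coords:
            rows.append((city_coords[c1], city_coords[c2], n))
    return rows


# Made-up sample data, for illustration only.
friend_pairs = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 1)]
user_city = {1: "London", 2: "Paris", 3: "London", 4: "Paris"}
city_coords = {"London": (51.5074, -0.1278), "Paris": (48.8566, 2.3522)}

counts = count_city_pairs(friend_pairs, user_city)
rows = attach_coordinates(counts, city_coords)
```

Each row in `rows` holds the coordinates of two cities and the number of friendships between them, which is the shape of data needed to draw one weighted line per city pair.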
At that point, I began exploring it in R, an open-source statistics environment. As a sanity check, I plotted points at some of the latitude and longitude coordinates. To my relief, what I saw was roughly an outline of the world. Next I erased the dots and plotted lines between the points. After a few minutes of rendering, a big white blob appeared in the center of the map. Some of the outer edges of the blob vaguely resembled the continents, but it was clear that I had too much data to get interesting results just by drawing lines. I thought that making the lines semi-transparent would do the trick, but I quickly realized that my graphing environment couldn’t handle enough shades of color for it to work the way I wanted.
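One plausible reading of the "not enough shades" problem, which the text does not confirm, is colour quantisation: if each pixel stores only a fixed number of brightness levels, a very faint line rounds away to nothing, while a stronger one saturates after a handful of overlaps. The sketch below simulates this for a single pixel. The function name, the 256-level assumption, and the alpha values are hypothetical and are not taken from the original R session.

```python
def stack_lines(alpha, n_lines, levels=256):
    """Composite n_lines white lines of opacity alpha over one black pixel.

    The pixel is rounded to an integer brightness level after every line,
    mimicking a canvas that stores a limited number of shades (assumption).
    """
    top = levels - 1
    value = 0
    for _ in range(n_lines):
        blended = value + alpha * (top - value)  # standard "over" blending
        value = int(blended + 0.5)               # quantise to the nearest level
    return value


# A very faint line never registers, no matter how many overlap.
faint = stack_lines(alpha=0.001, n_lines=10_000)

# A stronger line reaches near-white after only a few overlaps.
strong = stack_lines(alpha=0.5, n_lines=20)
```

Under these assumptions, no single alpha value both keeps sparse connections visible and leaves headroom to distinguish the densest ones.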